Archive for the ‘Storage’ Category

OpenSolaris – ZFS recovery after kernel panic

October 19th, 2009 Daz No comments

Recently I hit what I thought was a huge disaster with my ZFS array: I was unable to import my zpool without causing the kernel to panic and reboot. I'm still unsure of the exact cause, but it didn't seem to be a hardware fault (zpool import showed all disks as ONLINE).

When I tried to import with zpool import -f tank, the machine would lock up and reboot (panic).

The key line from the kernel panic:

> genunix: [ID 361072 kern.notice] zfs: freeing free segment (offset=3540185931776 size=22528)

Nothing I could do would fix it. I tried both of these options in the system file (/etc/system) with no success:

set zfs:zfs_recover=1
set aok=1

After a quick email from a Sun engineer (kudos to Victor), this is the zdb command line that fixed it:

zdb -e -bcsvL <poolname>

zdb is a read-only diagnostic tool, but it seemed to read through the sectors that held the corrupt data and fix things (I'm not sure how a read-only tool does that). The run took well over 15 hours.

Updated: 20/10/2009

Apparently, if you have set zfs:zfs_recover=1 in your system file, the zdb command operates differently and fixes the issues it encounters.

Remember to run zpool scrub <poolname> if you are lucky enough to get the pool back online.
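The sequence above can be sketched as a small script. This is only a rough outline under some assumptions: the order (zdb, then a forced import, then a scrub) is a reading of the steps in this post, the POOL and DRY_RUN variables and the run helper are made-up names for illustration, and it defaults to printing the commands rather than executing them.

```shell
#!/bin/sh
# Sketch of the recovery sequence from this post.
# POOL, DRY_RUN and run() are illustrative names, not part of ZFS.
POOL="${POOL:-tank}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands (the default)

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

recover_pool() {
    # 1. Walk the un-imported pool with zdb (this took well over 15 hours here).
    run zdb -e -bcsvL "$POOL" || return 1
    # 2. Retry the forced import that previously caused the panic.
    run zpool import -f "$POOL" || return 1
    # 3. Scrub the pool once it is back online.
    run zpool scrub "$POOL" || return 1
    # 4. Check scrub progress and any reported errors.
    run zpool status -v "$POOL"
}

recover_pool
```

Run it as-is to see what it would do; set DRY_RUN=0 only once you are happy with the commands and the pool name.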

This thread has some additional info:

http://opensolaris.org/jive/message.jspa?messageID=479553

Categories: OpenSolaris, Storage

zfs compression and latency

August 19th, 2009 Daz No comments

Since I'm using ZFS as storage via NFS for some of my VMware environments, I need to ensure that disk latency is reduced wherever possible.

There is a lot of talk about ZFS compression being “faster” than a non-compressed pool because less physical data is pulled off the drives. This of course depends on the system powering ZFS, but I wanted to run some tests specifically on latency. Throughput is fine in some situations, but latency is a killer when it comes to lots of small reads and writes (as is the case when hosting virtual machines).

I recently completed some basic tests focusing on the difference in latency with ZFS compression (lzjb) enabled and disabled. IOMeter was my tool of choice, and I hit my ZFS box via a mapped drive.

I'm not concerned with the actual figures, only with the difference between them.

I ran the test multiple times (to eliminate caching as a factor) and can confirm that compression (on my system anyhow) increases latency.

Basic results from an “All in one” test suite (similar results across all my tests):

ZFS uncompressed:

IOps : 2376.68
Read MBps : 15.14
Write MBps : 15.36
Average Response Time : 0.42
Average Read Response Time : 0.42
Average Write Response Time : 0.43
Average Transaction Time : 0.42

ZFS compressed: (lzjb)

IOps : 1901.82
Read MBps : 12.09
Write MBps : 12.28
Average Response Time : 0.53
Average Read Response Time : 0.44
Average Write Response Time : 0.61
Average Transaction Time : 0.53

As you can see from the results, the average write response time in particular is much higher with compression enabled (0.61 vs 0.43). I wouldn't recommend using ZFS compression where latency is a major factor, such as when hosting virtual machines.
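Since it's the difference that matters rather than the raw figures, here is a quick way to turn the numbers above into percentage changes. It's a minimal sketch: pct_change is a made-up helper name, and it assumes an awk that supports -v (on OpenSolaris that may mean using nawk instead).

```shell
#!/bin/sh
# pct_change BASE NEW: percentage change from BASE to NEW, one decimal place.
# (Helper name is illustrative.)
pct_change() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - base) / base * 100 }'
}

# Figures taken from the uncompressed vs compressed (lzjb) results above.
echo "IOps:                $(pct_change 2376.68 1901.82)%"
echo "Avg response time:   $(pct_change 0.42 0.53)%"
echo "Avg read response:   $(pct_change 0.42 0.44)%"
echo "Avg write response:  $(pct_change 0.43 0.61)%"
```

On these figures that works out to roughly 20% fewer IOps and about 42% higher average write response time, while the read response time moves by under 5%.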

Note: During all the tests the CPU (dual core) on the ZFS box never reached 100%, which eliminates it as a bottleneck.

Categories: Networking, Storage, Virtual

opensolaris – smbd issues?

July 27th, 2009 Daz 2 comments

Hmm… I've been having problems with CIFS since the 2009.06 (snv_111b) update.

Can't pin it down exactly, as it could be “load” related.

Found this: http://opensolaris.org/jive/thread.jspa?threadID=107681
This may also be a clue: http://opensolaris.org/jive/thread.jspa?threadID=92472&tstart=75

imapd? I might have to go back to 2008.11.

You might get better performance if you enable oplocks. There are known issues with it, but you can enable it just to see if it makes any difference:

svccfg -s smb/server setprop smbd/oplock_enable=boolean: true

So far, running the above command has fixed things for me. I'll update if the problem returns.

Updated: 27/07/2009

The problem came back, so I'm updating to 117 as per the comments below.

OpenSolaris: Citrix XenServer / ESX – Hooking into ZFS

July 22nd, 2009 Daz No comments

To share a ZFS filesystem via NFS (in a way that works with Citrix XenServer / ESX) to a host called “esxhost”:

zfs set sharenfs=rw,nosuid,root=esxhost tank/nfs

Note: The host name MUST be resolvable from the OpenSolaris box, i.e. you should be able to ping it by name. I tried with IPs only and it failed. I edited the /etc/hosts file to include the following entry (the last line) for my config:

# Copyright 2007 Sun Microsystems, Inc. All rights reserved.
# Use is subject to license terms.
#
# ident "%Z%%M% %I% %E% SMI"
#
# Internet host table
#
192.168.9.120 esxhost
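Before setting sharenfs it may be worth confirming that the entry really is in place. A minimal sketch, assuming a made-up helper called has_host_entry and an awk that supports -v (possibly nawk on OpenSolaris); it only looks at the hosts file and does not test DNS:

```shell
#!/bin/sh
# has_host_entry NAME [FILE]: succeed if NAME appears as a host name or alias
# on a non-comment line of FILE (default /etc/hosts).
# The function name is illustrative, not a standard tool.
has_host_entry() {
    awk -v name="$1" '
        /^[ \t]*#/ { next }              # skip whole-line comments
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break     # stop at a trailing comment
                if ($i == name) found = 1
            }
        }
        END { exit(found ? 0 : 1) }
    ' "${2:-/etc/hosts}"
}

if has_host_entry esxhost; then
    echo "esxhost is listed in /etc/hosts"
else
    echo "esxhost is missing from /etc/hosts" >&2
fi
```

If the name is missing, add the line shown above before running the zfs set sharenfs command.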

This also requires that you are using both DNS and files in your /etc/nsswitch.conf file. You should have lines like these:

# You must also set up the /etc/resolv.conf file for DNS name
# server lookup. See resolv.conf(4). For lookup via mdns
# svc:/network/dns/multicast:default must also be enabled. See mdnsd(1M)
hosts: files dns mdns

# Note that IPv4 addresses are searched for in all of the ipnodes databases
# before searching the hosts databases.
ipnodes: files dns mdns

I also ran this beforehand (to allow full access):

chmod -R 777 /tank/nfs

Update: check this guide: http://blog.laspina.ca/ubiquitous/running-zfs-over-nfs-as-a-vmware-store

opensolaris / zfs – whitebox build

July 19th, 2009 Daz No comments

I've built a little server for home use, but it pales in comparison to this beast. This type of setup would be perfect for a lab / test environment that requires lots of fast and reliable disk. SCSI drives are fading out, and SATA can perform if it's set up right. When you look at the price of the entire build, you wonder why corporations continue to spend the big bucks on the big storage names.

Check out this build (a very nice, clear guide): http://www.stringliterals.com/?p=77

Awesome piece of work.

Categories: OpenSolaris, Storage