zfs – now has dedup!

Cool. As of zpool version 21, ZFS has deduplication built in. And that’s the good dedup – synchronous dedup, i.e. data is deduped on the fly as it is written!

How easy is it to turn on? – very!

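Once you have upgraded your zpool to version 21 or above, you can run the following command on the pool’s top-level dataset, and deduplication will apply to all data written from that point onwards (data already on disk is not deduplicated retroactively). Note that dedup is a dataset property, so it is set with zfs rather than zpool.

zfs set dedup=on tank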

Done

Note: Watch your performance. It will drop like a rock if you do not have enough RAM to hold your dedup tables. Do some tests after enabling this feature.
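For a rough feel of how much RAM the dedup tables might want, here is a small shell sketch. It assumes the commonly quoted rule of thumb of roughly 320 bytes of table per unique block, which is an approximation and not a guarantee. The function name ddt_ram_mib is made up for this example.

```shell
# Rough estimate of dedup table (DDT) size in MiB.
# ASSUMPTION: ~320 bytes of DDT per unique block (a rule of thumb only).
# ddt_ram_mib is a hypothetical helper, not part of ZFS.
ddt_ram_mib() {
    data_gib=$1      # amount of unique data, in GiB
    block_kib=$2     # average block size, in KiB (128 is the default recordsize)
    blocks=$(( data_gib * 1024 * 1024 / block_kib ))
    echo $(( blocks * 320 / 1024 / 1024 ))
}

# 1 TiB of unique data stored in 128 KiB blocks
ddt_ram_mib 1024 128   # prints 2560
```

Smaller blocks mean more table entries for the same amount of data, so the estimate grows quickly. On a real pool, zdb -S tank (where your build has it) simulates dedup and prints a table histogram, which is a better guide than any rule of thumb.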

http://hub.opensolaris.org/bin/view/Community+Group+zfs/dedup

HDD short stroking – is it worth it?

I’ve got some old 250GB drives that are starting to show their age. I’ve currently got them set up in a 3x RAID 0 config, which presents about 750GB of space.

I’ve got everything on a single partition (meh, I’m lazy). I’ve done various speed tests in the current setup (with all space allocated), but I thought I’d re-image onto a short-stroked partition.

I only use about 150GB of space on my main machine (most of my data is on another box), so I’m going to try creating a 200GB partition to test whether this provides any kind of performance boost.

So, reducing my RAID 0 from 750GB to 214GB, here are the results…

Before, with all 750GB presented…

Same disks, but short stroked to 214GB…

Conclusion: Yip, seems like it’s worth it if you have the spare space. Average throughput is up by 10MB/s, and seek time has improved by almost a third, losing 4ms.

You will get even more of an improvement if you can use a smaller percentage of each drive’s capacity and/or more drives in your stripe.
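To put a number on the share of each drive that is actually in use, a quick bit of shell arithmetic helps. This sketch assumes the stripe spreads the partition evenly across the drives; stroke_percent is just a name picked for the example.

```shell
# Percentage of each drive's capacity used by a short-stroked stripe.
# stroke_percent is a hypothetical helper; sizes are in GB, integer maths.
stroke_percent() {
    partition_gb=$1   # size of the striped partition
    drives=$2         # number of drives in the RAID 0
    drive_gb=$3       # capacity of a single drive
    echo $(( partition_gb * 100 / (drives * drive_gb) ))
}

stroke_percent 214 3 250   # the short-stroked setup above: prints 28
stroke_percent 750 3 250   # the full stripe: prints 100
```

So the test above keeps the heads within roughly the first 28% of each disk.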

Updated : 07/02/2010

BTW – the above was without write-back cache enabled. With that turned on, I got the following…

OpenSolaris – iSCSI

Want iSCSI in opensolaris?

Grab SUNWiscsitgt via package manager.

enable the server via svcadm;

svcadm enable iscsitgt

create a zfs volume (zvol) to back the iSCSI target;  (this command creates a volume 500GB in size, which caps the iSCSI drive at that size)

zfs create -V 500G tank/iscsi

turn iSCSI sharing on via the zfs command;

zfs set shareiscsi=on tank/iscsi

check that target is up and running;

iscsitadm list target -v

Done. You should be able to connect via IP from another machine. I have not covered CHAP or any client-side configuration, as I have assumed an isolated LAN.
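If you would rather keep the steps above in one place, something like the following sketch could work. It only prints the commands by default. DRY_RUN, POOL_VOL, SIZE, run and setup_iscsi are all names made up for this example, and the commands themselves are the same ones shown above.

```shell
# Sketch only: the iSCSI steps above, in order, stopping at the first failure.
# DRY_RUN, POOL_VOL, SIZE, run() and setup_iscsi() are hypothetical names.
DRY_RUN=${DRY_RUN:-1}               # 1 = print commands, anything else = run them
POOL_VOL=${POOL_VOL:-tank/iscsi}    # volume that backs the target
SIZE=${SIZE:-500G}                  # size of that volume

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

setup_iscsi() {
    run svcadm enable iscsitgt &&
    run zfs create -V "$SIZE" "$POOL_VOL" &&
    run zfs set shareiscsi=on "$POOL_VOL" &&
    run iscsitadm list target -v
}

setup_iscsi
```

Set DRY_RUN=0 once the printed commands look right for your pool.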

(Image: HD Tune benchmark of the "SUN SOLARIS" iSCSI disk)

zfs – checking your zpool throughput

This is quite a good diagnostic for checking your disk throughput. Try copying data to and from your zpool while you’re running this command on the host…

zpool iostat -v unprotected 2

                capacity     operations    bandwidth
pool          used  avail   read  write   read  write
-----------  -----  -----  -----  -----  -----  -----
unprotected  1.39T   668G     18      7  1.35M   161K
  c7d0        696G   403M      1      2  55.1K  21.3K
  c9d0        584G   112G      8      2   631K  69.3K
  c7d1        141G   555G      8      2   697K  70.0K
-----------  -----  -----  -----  -----  -----  -----

The above command will keep printing this output every 2 seconds (each report is the average over that interval). I’ve used it a few times to check that all disks are being used (for write operations) where needed. Of course, read ops may not be spread across all disks, as that depends on where the data lives…

As you can see in the output from my “unprotected” zpool, my disk “c7d0” is nearly full, so fewer write operations will land on this disk. In my scenario most of my reads also come from this disk. This is due to me copying most of the data into this zpool when it contained only this single disk.
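Spotting a nearly full disk by eye gets tedious with more devices, so here is a rough awk sketch that flags rows with little free space. The helper name low_space_vdevs and the 10 GiB threshold are assumptions for this example, and the sample rows are copied from the output above.

```shell
# Flag rows of `zpool iostat -v` output whose "avail" column is below a threshold.
# low_space_vdevs is a hypothetical helper; its argument is the threshold in GiB.
low_space_vdevs() {
    awk -v min_gib="$1" '
        # Convert a size such as 403M or 1.39T to KiB.
        function to_kib(s,    n, u) {
            n = s + 0                      # leading number, e.g. 403
            u = substr(s, length(s), 1)    # unit suffix, e.g. M
            if (u == "K") return n
            if (u == "M") return n * 1024
            if (u == "G") return n * 1024 * 1024
            if (u == "T") return n * 1024 * 1024 * 1024
            return n / 1024                # no suffix: treat as plain bytes
        }
        # Data rows have 7 fields and a size in the third ("avail") column.
        NF == 7 && $3 ~ /^[0-9.]+[KMGT]?$/ {
            if (to_kib($3) < min_gib * 1024 * 1024) print $1, $3
        }
    '
}

sample='unprotected  1.39T   668G     18      7  1.35M   161K
c7d0       696G   403M      1      2  55.1K  21.3K
c9d0       584G   112G      8      2   631K  69.3K
c7d1       141G   555G      8      2   697K  70.0K'

printf '%s\n' "$sample" | low_space_vdevs 10   # prints: c7d0 403M
```

On a live system the same filter could be fed from zpool iostat -v unprotected (with no interval, so it prints a single report).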

I’ve heard rumours of a future zfs feature that will re-balance the data across all the disks (unsure if it will run live or on a set schedule).

Another way to show some disk throughput figures is to run the iostat command like so…

iostat -exn 10

                 extended device statistics
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b
cmdk17    1.0    0.0   71.5    0.0  0.0  0.0   10.9   0   1
cmdk18    0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
cmdk19    0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
cmdk20    0.8    0.0   33.5    0.0  0.0  0.0   13.5   0   1
cmdk21    0.4    0.0    0.5    0.0  0.0  0.0   15.5   0   1
cmdk22    0.8    0.0   66.3    0.0  0.0  0.0    9.0   0   1
cmdk23    0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
cmdk24    0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0

                            extended device statistics       ---- errors ---
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b s/w h/w trn tot device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0   0   0   0   0 c11d1
    0.0    7.7    0.0   25.8  0.0  0.0    2.3    4.9   0   3   0   0   0   0 c8d0
    0.0   17.6    0.0  238.0  0.0  0.0    0.0    0.3   0   0   0   0   0   0 c9d0
    0.0    1.0    0.0    0.8  0.0  0.0    0.0    0.3   0   0   0   0   0   0 c7t0d0
    0.0    1.0    0.0    0.8  0.0  0.0    0.0    0.2   0   0   0   0   0   0 c7t2d0
    0.0    1.0    0.0    0.8  0.0  0.0    0.0    0.3   0   0   0   0   0   0 c7t3d0
    0.7   21.1   29.9  315.0  0.0  0.0    0.0    1.1   0   1   0   0   0   0 c7t4d0
    0.7   20.9   29.8  314.9  0.0  0.0    0.0    1.7   0   2   0   0   0   0 c7t5d0
    0.8   21.0   34.1  315.0  0.0  0.0    0.0    1.2   0   1   0   0   0   0 c7t6d0
    0.5   20.8   21.3  314.8  0.0  0.0    0.0    1.1   0   1   0   0   0   0 c7t7d0

This should show you all your disks and update on a 10 second interval (the 10 at the end of the command). Copying data back and forth to your drives will show various stats.
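If the point is to check that every disk is doing some work, a small filter can pick out the ones that did nothing in a report. This is only a sketch: idle_disks is a made-up helper name, and the sample rows are copied from the iostat -exn output above.

```shell
# List devices that did no reads or writes in an `iostat -exn` style report.
# idle_disks is a hypothetical helper name for this sketch.
idle_disks() {
    # Data rows start with a number (r/s); the device name is the last field.
    awk '$1 ~ /^[0-9.]+$/ && NF >= 11 && ($1 + $2) == 0 { print $NF }'
}

sample='    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b s/w h/w trn tot device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0   0   0   0   0 c11d1
    0.0    7.7    0.0   25.8  0.0  0.0    2.3    4.9   0   3   0   0   0   0 c8d0
    0.7   21.1   29.9  315.0  0.0  0.0    0.0    1.1   0   1   0   0   0   0 c7t4d0'

printf '%s\n' "$sample" | idle_disks   # prints: c11d1
```

Bear in mind that the first report iostat prints covers the time since boot, so it is the later reports that reflect your copy test.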

Linux – Hard Drive Performance

Or lack of.

You must ensure that your drives are using the SATA driver (if they are SATA disks). The quickest way to check is the device names: they should be called sda, sdb, etc., and not hda, hdb. The hdX names mean the default IDE driver is in use, which will slow performance down majorly.

How to check performance;

hdparm -tT /dev/hda

Modern drives should be getting over 50MB/s easily. If you’re getting about 5MB/s, you have the problem.
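The name check can be scripted too. A minimal sketch, assuming a Linux box that exposes /sys/block; driver_hint is a made-up helper that looks only at the device name, nothing cleverer.

```shell
# Guess which driver family a disk is on from its device name.
# driver_hint is a hypothetical helper; it only looks at the name.
driver_hint() {
    case "$1" in
        hd*) echo "$1: legacy IDE driver (expect slow transfers)" ;;
        sd*) echo "$1: SCSI/libata driver" ;;
        *)   echo "$1: neither hdX nor sdX" ;;
    esac
}

# Check every block device the kernel knows about, if /sys/block exists.
if [ -d /sys/block ]; then
    for dev in /sys/block/*; do
        [ -e "$dev" ] || continue
        driver_hint "$(basename "$dev")"
    done
fi
```

Any disk reported as hdX is a candidate for the hdparm test above.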

How to Fix;

I had to change the SATA mode to Enhanced in the BIOS and disable all of the on-board IDE controllers. After rebooting, all your device names will change, and you will need to edit the /etc/fstab file as appropriate.
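Before editing /etc/fstab for real, it can help to preview the change. The sketch below assumes hda simply becomes sda, which you should confirm on your own machine after the reboot. preview_fstab_rename is a made-up helper, and it only prints the result without writing anything back.

```shell
# Preview an fstab with device names swapped, without touching the file.
# ASSUMPTION: hda simply becomes sda; check your own mapping after the reboot.
# preview_fstab_rename is a hypothetical helper name.
preview_fstab_rename() {
    old=$1
    new=$2
    file=${3:-/etc/fstab}
    sed "s|/dev/${old}|/dev/${new}|g" "$file"
}

# Try it on a throwaway file first.
tmp=$(mktemp)
printf '%s\n' '/dev/hda1 / ext3 defaults 1 1' \
              '/dev/hda2 swap swap defaults 0 0' > "$tmp"
preview_fstab_rename hda sda "$tmp"
rm -f "$tmp"
```

Entries that mount by LABEL= or UUID= are not tied to the device name, so they do not need changing.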

Apparently this is due to a conflict between the drivers, which causes your SATA drives to be mistaken for standard IDE disks.