Slow disk reads... bummer

I originally posted this in a reply, but figured I’d start a new thread regarding the latest firmware disk-read issues (hope that doesn’t irritate). Here’s some data from my own troubleshooting efforts.

I just bought the 4TB unit last weekend and the first thing I did was upgrade the firmware.  I was extremely disappointed when I tried to stream a 1080p movie to my WD TV Live Plus, only to get stuttering and pausing… not even watchable.  I have spent a lot of time troubleshooting and tweaking, and have now hit a wall after determining that there’s nothing I can do to fix it.

I knew something had to be wrong with the storage device because I could stream the same movies off an old laptop with a 100 Mbps NIC, reading the film from a stupid USB 2.0 external hard drive with no problems!  LAME.

I systematically tested my home network, end-to-end cabling and switching.  I even configured static IP addresses and plugged the WD TV Live Plus player directly into the storage array with a pass-through cable.  Same frickin behavior.  BS.

I loaded the Optware repositories so I could run some troubleshooting tools from the ShareSpace CLI.  Comparing the device against the previously mentioned laptop turned up some troubling results.

The ‘dstat’ utility reported that while a movie is streaming, data is read from the drives more slowly than it is being sent out on the network, and in a very bursty fashion.  Instead of continuing to read ahead from disk to keep the read cache full, the device just doesn’t seem to be keeping up.  It looks like the following.  Notice the disk read column… and the net send column.

----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read writ| recv send| in out | int csw 
  4 2 94 0 0 0| 64k 96k| 28k 1011k| 0 0 | 476 122 
  2 1 96 0 0 1| 0 0 | 54k 2030k| 0 0 | 851 135 
  3 9 86 0 0 2|4100k 0 | 62k 2354k| 0 0 | 924 150 
  3 1 93 0 0 3| 0 0 | 21k 778k| 0 0 | 358 69 
  2 6 87 0 0 5|4100k 0 | 81k 2982k| 0 0 |1198 187 
  4 1 93 0 0 2| 32k 64k| 43k 1594k| 0 0 | 639 146 
  2 7 88 0 0 3|4100k 0 | 64k 2393k| 0 0 | 979 160 
  3 8 84 0 0 5| 0 0 | 57k 2138k| 0 0 | 822 138 
  2 4 92 0 0 2| 0 0 | 30k 1106k| 0 0 | 464 96 
  2 14 81 0 0 3|4100k 0 | 26k 976k| 0 0 | 482 101 
  2 1 93 1 0 3| 64k 96k| 46k 1732k| 0 0 | 743 159 
  2 3 91 0 0 4| 0 0 | 44k 1636k| 0 0 | 663 105 
  2 8 86 1 0 3|4100k 0 | 46k 1722k| 0 0 | 738 124 
  2 2 94 0 0 2| 0 0 | 38k 1415k| 0 0 | 603 104 
  4 7 87 1 0 1|4100k 0 | 52k 1977k| 0 0 | 865 142 
  3 1 91 4 0 1| 32k 64k| 22k 825k| 0 0 | 370 108 
  2 1 92 0 0 5| 0 0 | 39k 1498k| 0 0 | 640 113 
  2 8 86 0 0 4|4100k 0 | 64k 2334k| 0 0 | 922 147 
  4 3 88 0 0 5| 0 0 | 43k 1628k| 0 0 | 675 114 
  2 2 96 0 0 0| 16k 48k| 22k 818k| 0 0 | 402 100

Now compare this to an old crusty laptop running Linux and Samba, accessing the movie off of a USB HDD.  See below:

----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read writ| recv send| in out | int csw 
  7 5 0 84 0 4|4096k 0 | 180k 3989k| 0 0 |1969 1244 
  6 7 0 83 0 4|4096k 0 | 201k 4443k| 0 0 |2001 1299 
  4 5 59 28 0 4|2304k 4096B| 144k 3165k| 0 0 |1186 767 
  7 4 89 0 0 0|1920k 0 | 95k 2069k| 0 0 |1002 712 
  6 5 86 0 0 3|2432k 36k| 104k 2285k| 0 0 |1248 819 
  6 6 87 0 0 1|2304k 0 | 110k 2419k| 0 0 |1194 815 
  8 5 86 0 0 1|2816k 0 | 125k 2752k| 0 0 |1429 906 
  7 6 86 0 0 1|2688k 4096B| 143k 3162k| 0 0 |1364 902 
  5 6 87 0 0 2|2048k 0 | 96k 2109k| 0 0 |1062 703 
  9 6 83 0 0 2|2944k 0 | 128k 2799k| 0 0 |1486 974 
  7 5 88 0 0 0|2560k 20k| 134k 2963k| 0 0 |1299 856 
  6 6 87 0 0 1|2432k 0 | 113k 2478k| 0 0 |1257 840 
  8 6 84 0 0 2|2816k 0 | 129k 2823k| 0 0 |1423 906 
  6 9 84 0 0 1|3328k 0 | 146k 3226k| 0 0 |1652 1064 
  7 5 88 0 0 0|2688k 0 | 146k 3226k| 0 0 |1343 847 
  7 5 86 0 0 2|2688k 0 | 122k 2688k| 0 0 |1352 905 
  6 6 85 0 0 3|3200k 20k| 158k 3495k| 0 0 |1596 1001 
  9 3 63 22 0 3|2816k 60k| 122k 2688k| 0 0 |1426 939

Notice the difference between disk reads and network sends?  I don’t necessarily believe it’s a network issue with the firmware, but rather a disk read-ahead problem.  Please note, disk writes are a heck of a lot faster than the reads, so there seems to be an imbalance there. (Write results not shown for brevity.)
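For anyone who wants to put numbers on that gap rather than eyeball it, here is a rough Python sketch that averages the disk read and net send columns from pasted dstat output like the above. The function names (`to_bytes`, `summarize`) are made up for illustration, and it assumes the default five-group dstat layout shown in this post, with bare numbers treated as bytes.

```python
import re

# dstat prints sizes with 1024-based suffixes; a bare number is treated as bytes here.
UNITS = {"": 1, "B": 1, "k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(token):
    """Convert a dstat size token such as '4100k', '4096B' or '0' to bytes."""
    m = re.fullmatch(r"(\d+(?:\.\d+)?)([BkMG]?)", token)
    if m is None:
        raise ValueError(f"unrecognised size token: {token!r}")
    return float(m.group(1)) * UNITS[m.group(2)]


def summarize(dstat_text):
    """Average the disk-read and net-send columns over all data rows."""
    reads, sends = [], []
    for line in dstat_text.splitlines():
        fields = line.split("|")
        # Data rows have five '|'-separated groups: cpu, disk, net, paging, system.
        if len(fields) != 5:
            continue
        cpu = fields[0].split()
        # Skip the column-name header row, whose first token is not a number.
        if not cpu or not cpu[0].isdigit():
            continue
        reads.append(to_bytes(fields[1].split()[0]))  # disk read
        sends.append(to_bytes(fields[2].split()[1]))  # net send
    if not reads:
        raise ValueError("no dstat data rows found")
    n = len(reads)
    return {
        "samples": n,
        "avg_read_kib_s": sum(reads) / n / 1024,
        "avg_send_kib_s": sum(sends) / n / 1024,
        # Fraction of one-second samples in which nothing was read from disk.
        "idle_read_fraction": sum(1 for r in reads if r == 0) / n,
    }


if __name__ == "__main__":
    # First three data rows of the ShareSpace capture above.
    sample = """\
usr sys idl wai hiq siq| read writ| recv send| in out | int csw
  4 2 94 0 0 0| 64k 96k| 28k 1011k| 0 0 | 476 122
  2 1 96 0 0 1| 0 0 | 54k 2030k| 0 0 | 851 135
  3 9 86 0 0 2|4100k 0 | 62k 2354k| 0 0 | 924 150
"""
    print(summarize(sample))
```

Feeding it each full capture gives an average read rate, an average send rate, and the share of samples with no disk reads at all, which is one way to describe the burstiness. Keep in mind the net column includes protocol overhead, so a small gap between read and send is expected even on a healthy box.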

hdparm results are also pretty pathetic. I mean, holy ■■■■… this is abysmal.

~ $ hdparm -tT /dev/sda

/dev/sda:
 Timing buffer-cache reads: 128 MB in 0.83 seconds =154.22 MB/sec
 Timing buffered disk reads: 64 MB in 1.28 seconds = 50.00 MB/sec
~ $ 
~ $ hdparm -tT /dev/sdb

/dev/sdb:
 Timing buffer-cache reads: 128 MB in 0.78 seconds =164.10 MB/sec
 Timing buffered disk reads: 64 MB in 1.16 seconds = 55.17 MB/sec
~ $ 
~ $ hdparm -tT /dev/sdc

/dev/sdc:
 Timing buffer-cache reads: 128 MB in 0.80 seconds =160.00 MB/sec
 Timing buffered disk reads: 64 MB in 1.17 seconds = 54.70 MB/sec
~ $ 
~ $ hdparm -tT /dev/sdd

/dev/sdd:
 Timing buffer-cache reads: 128 MB in 0.87 seconds =147.13 MB/sec
 Timing buffered disk reads: 64 MB in 1.20 seconds = 53.33 MB/sec
~ $
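Since my suspicion is read-ahead, one more thing that could be checked is the kernel’s read-ahead setting for each drive. The sketch below is untested on the ShareSpace itself: it assumes the firmware’s kernel exposes the standard Linux `/sys/block/<dev>/queue/read_ahead_kb` files and that Python is installed (for example from Optware). The helper name `read_ahead_kb` is just something I picked for illustration.

```python
from pathlib import Path


def read_ahead_kb(device, sysfs_root="/sys/block"):
    """Return the read-ahead size in KiB for a block device, or None if unavailable."""
    path = Path(sysfs_root) / device / "queue" / "read_ahead_kb"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # File missing, unreadable, or not a plain integer: report as unknown.
        return None


if __name__ == "__main__":
    # The four drives tested with hdparm above.
    for dev in ("sda", "sdb", "sdc", "sdd"):
        value = read_ahead_kb(dev)
        print(dev, "n/a" if value is None else f"{value} KiB")
```

If the drives sit behind a software RAID device, the read-ahead on that device probably matters more than the per-disk values, so treat this as a starting point only.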

I have tried every Samba configuration tweak I can think of, and I still get the same behavior.

socket options = IPTOS_LOWDELAY TCP_NODELAY SO_RCVBUF=65536 SO_SNDBUF=65536

I’ll have a go at calling WD support, but I don’t want to push my Dell return window much further.

FYI… called support, got through to a really nice guy in Level 2 support.  He was pretty straight up and mentioned that engineering has been looking into a few bugs in the latest firmware related to CIFS / Samba performance.  They have no ETA for a fix.  Bummer.

Just initiated a return with Dell.  WD Level 2 support confirmed engineering is looking into a performance problem and has no ETA for a firmware release.  Bummer.  Going to end up buying a QNAP 419p or comparable Synology device.  I would downgrade to 2.2.28… but apparently that doesn’t work anyway.