WD30EZRX shows SMART logging not supported

I recently bought three WD30EZRX drives (before doing enough reading to realise they might not be ideal for a RAID array :-} But this isn’t about that).

At first, although I only have my memory to show for it, they all showed full SMART data. But at some point since I first checked, one of them has stopped showing SMART log data, claiming that logging is not supported. The other two drives are still reporting fine.

The affected drive still appears otherwise to be *working*, but I’m wondering if this is one I ought to be sending back, or if it just needs a poke of some form to enable the SMART logging again.

Yes, it does persist through reboots and power-cycles. :slight_smile:

This is what I get right now:

rachel@twilight:~$ sudo smartctl -a /dev/sdh
[sudo] password for rachel: 
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.5.0-19-generic] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Caviar Green (Adv. Format)
Device Model: WDC WD30EZRX-00MMMB0
Serial Number: WD-WMAWZ0209664
Firmware Version: 0958
User Capacity: 3,000,592,982,016 bytes [3.00 TB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Fri Nov 30 11:42:26 2012 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Total time to complete Offline 
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x00) Offline data collection not supported.
SMART capabilities: (0x0000)	Automatic saving of SMART data is not implemented.
Error logging capability: (0x00)	Error logging NOT supported.
					No General Purpose Logging support.

SMART Error Log not supported
SMART Self-test Log not supported
Device does not support Selective Self Tests/Logging

For comparison, this is what one of the other drives gives me; the third drive is similar to the second:

rachel@twilight:~$ sudo smartctl -a /dev/sdi
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.5.0-19-generic] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Caviar Green (Adv. Format)
Device Model: WDC WD30EZRX-00MMMB0
Serial Number: WD-WMAWZ0213423
LU WWN Device Id: 5 0014ee 0adaf0dbd
Firmware Version: 80.00A80
User Capacity: 3,000,592,982,016 bytes [3.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Fri Nov 30 11:42:29 2012 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x80)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: (50700) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities: (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability: (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 487) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x3035)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
  3 Spin_Up_Time 0x0027 141 140 021 Pre-fail Always - 9933
  4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 23
  5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
  7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
  9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 260
 10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0
 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 22
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 17
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 68
194 Temperature_Celsius 0x0022 102 094 000 Old_age Always - 50
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 100 253 000 Old_age Offline - 0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
    1 0 0 Not_testing
    2 0 0 Not_testing
    3 0 0 Not_testing
    4 0 0 Not_testing
    5 0 0 Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

The drives are in bays 1-3 of a Hornettek Enterprise 4x eSATA enclosure (the affected drive is in bay 1), connected via a StarTech port-multiplier eSATA interface, on an Ubuntu Quantal system. I don’t think these ought to be relevant, given that only one of the drives is exhibiting this.

What may *possibly* be more relevant, though I feel superstitious for saying so, is this:

After initially putting these three drives into the enclosure and using them for a while, I put a drive into bay 4: an older Caviar Green model, a WD10EACS, that wasn’t currently in a machine (though I couldn’t remember why I had a 1TB drive not in use). After an hour or so of use, the drive in bay 1 (the drive I’m talking about) dropped out of the RAID, with sense errors logged in syslog, causing a huge headache as you can imagine.

I removed the old 1TB drive, and since then the bay 1 drive has been fine: no more sudden errors, disappearances, sense faults etc., even after later populating bay 4 with another, even older drive (a WD5000AAKS, as if that’s relevant). It’s been working fine, except that I *think* the loss of SMART logging dates from then.

For now I’m considering the old 1TB Caviar Green drive to be Poison, although, as I say, I feel a bit superstitious for doing so, not knowing how such a “poison” might work, but I have a suspicion it killed my Sharkoon drive dock as well. Correlation may not be causation, but I don’t fancy putting it back into a system any time soon, and this may after all be why it was taken out of one before.

As an aside, according to those two drives, I don’t seem to be getting the rapidly-escalating LCC (Load_Cycle_Count) figures others are seeing. But the array has only just come off about a fortnight of reshaping and then repopulating, and although that finished a few hours ago, *something* still seems active whenever the RAID is up (even though /proc/mdstat shows nothing going on and the drive activity lights don’t flicker; the only way I can tell is from the sound). So the drives are probably avoiding the LCC problem by virtue of never getting a chance to idle. I suspect this may bed in later.
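
For anyone wanting to keep an eye on the LCC figure over time, here is a minimal Python sketch that pulls one attribute's raw value out of saved `smartctl -A` text. The function name `attribute_raw_value` and the sample string are made up for illustration (the two sample rows are copied from the /dev/sdi output above), and it assumes the raw value is a plain integer in the tenth column, which is not true of every attribute on every drive.

```python
def attribute_raw_value(smartctl_text, attr_id):
    """Return the RAW_VALUE column (as int) for a SMART attribute ID, or None.

    Assumes the usual 10-column attribute table layout printed by smartctl -A.
    """
    for line in smartctl_text.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0] == str(attr_id):
            # Trailing notes such as "(Min/Max 32/33)" land in later fields.
            return int(fields[9])
    return None


# Two rows copied from the /dev/sdi output earlier in the thread.
SAMPLE = """\
  9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 260
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 68
"""

lcc = attribute_raw_value(SAMPLE, 193)
hours = attribute_raw_value(SAMPLE, 9)
print(f"Load_Cycle_Count={lcc} over {hours} power-on hours")
```

Logging that pair every so often would show whether the load cycle count starts climbing once the array finally goes idle.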

I don’t have an answer for you but I had a WD Green 1TB drive recently that I returned. It was running on an nvidia 680i motherboard based sata 2 chipset, and if I used the default windows driver for the sata controller the drive was limited in what it would report to a variety of different programs; if I installed the official nvidia sata controller driver I could then get all data monitored. SO I say this as a possible cause of why you’re having issues… drivers.

you said,

“After initially putting these three drives into the enclosure…”

did the problem pop up after that ? see where I’m going with this ?

so i dunno maybe i gave ya some idea to work with ? better than people ignoring your posts lol

and good luck :slight_smile:

well, this is linux; it’s using the well-supported sata_sil24 driver for this interface - and remember the other two drives of exactly the same model are showing all their SMART data just fine, as did - IIRC - this one at first.

So I don’t think it’s drivers. :slight_smile:

I noticed it was linux before i posted hence why i made no reference to windows…

linux or not i still wouldn’t rule out the possibility of drivers… the concept is the same.

I installed the windows version of the program that gave you the logs before, but never used it, so I uninstalled it and I don’t know much about it. I’m curious what results you will get if you change the program…

ya gotta eliminate all the variables.

1 - The motherboard

2 - Sata Controller drivers

3 - Software data source

If you’re sure you eliminated everything but the drive then I would suspect it’s a drive problem…

Have you tried any testing Tools like DLG Diag for DOS ?

You can get a bootable CD iso at WD for that. I like to use the free boot cd “Hiren’s Boot CD”; it has the same program and many more that are similar, so if one isn’t working you can use other tools.

If you have other computers to test with maybe try that too ? Windows machine you can test on too ?

Maybe post what distro you’re using too. You never know, that may lead to getting some help too ?

I’ve used Linux so infrequently that by the time I go to use it again I’ve forgotten most of the console commands lol

Normally I use BackTrack; that’s what I ran the last several times I have used Linux.

Anyway yeah i would suspect the drive has issues if you ruled out everything and still have a problem.

But that is just my 2 cents. Of course that can be a tricky thing though: some boards had known issues with SATA drives that were fixed in later firmware updates, like my EVGA 680i mobo.

Maybe someone else can jump in and offer you better insight / help.

suggesting downloading vendor’s own drivers implies windows. :stuck_out_tongue:

(and yeah, in case of an amazing exception, I checked. It’s *so* windows-only they don’t even bother to say that their firmware download file is windows-only, IYSWIM.)

however, drive firmware *might* be an issue. I note the affected drive’s firmware version is “0958”, whereas the other two both report “80.00A80” despite all of them having the device model: “WDC WD30EZRX-00MMMB0”.

So upgrading the drive firmware, if possible, may be worth trying. (Interesting the firmware versions are of a completely different *format*, even.)
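
To make that comparison less manual across several drives, here is a rough Python sketch that groups saved `smartctl -i` output by device model and flags any model reporting more than one firmware string. The helper names (`info_field`, `mismatched_models`) and the trimmed sample strings are illustrative only; the values are taken from the two outputs posted above. It reads text you have already captured and does not talk to the drives itself.

```python
import re
from collections import defaultdict


def info_field(smartctl_text, name):
    """Return the value of a 'Name: value' line from smartctl -i output, or None."""
    pattern = rf"^{re.escape(name)}:[ \t]*(.+?)[ \t]*$"
    match = re.search(pattern, smartctl_text, re.MULTILINE)
    return match.group(1) if match else None


def mismatched_models(outputs):
    """Given {device: smartctl text}, return models that report >1 firmware string."""
    by_model = defaultdict(lambda: defaultdict(list))
    for device, text in outputs.items():
        model = info_field(text, "Device Model")
        firmware = info_field(text, "Firmware Version")
        by_model[model][firmware].append(device)
    return {m: dict(fw) for m, fw in by_model.items() if len(fw) > 1}


# Trimmed from the two outputs earlier in the thread.
OUTPUTS = {
    "/dev/sdh": "Device Model: WDC WD30EZRX-00MMMB0\nFirmware Version: 0958\n",
    "/dev/sdi": "Device Model: WDC WD30EZRX-00MMMB0\nFirmware Version: 80.00A80\n",
}

for model, firmwares in mismatched_models(OUTPUTS).items():
    print(model, "reports differing firmware:", firmwares)
```

A mismatch found this way only says the reported strings differ; it does not by itself say whether the drive or something in front of the drive is producing the odd one.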

And it is possible my memory of SMART working on this drive when I first got it is false. It was nearly a fortnight ago when I *thought* I did a basic check on all of them, on receipt and first installing them.

weird issue it would bug me too

i’d like to know if you got it figured out

i take it you currently have three Green drives and 1 of 3 is not providing SMART data ?

Very odd indeed :frowning:

i don’t have a wd disk, but came across this discussion as i was looking for clues regarding a quite similar experience i’m having. in my case, i have four disks, all in a 4 bay disk enclosure [ http://www.hornettek.com/hdd-enclosure/3-5-quad-bay-jbod.html ]. it’s an esata/usb enclosure, and in my case it is connected via usb to a computer running ubuntu 12.10.

i see a very similar set of output from smartctl, even though my disks are completely different from yours:

smartctl -a /dev/sdb
smartctl 5.43 2012-06-30 r3573 [i686-linux-3.5.0-17-generic] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.11
Device Model: ST3750330AS
Serial Number: 5QK042GB
Firmware Version: 0958
User Capacity: 750,156,374,016 bytes [750 GB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Thu Jan 10 22:14:01 2013 EST

==> WARNING: There are known problems with these drives,
see the following Seagate web pages:
http://knowledge.seagate.com/articles/en_US/FAQ/207931en
http://knowledge.seagate.com/articles/en_US/FAQ/207951en
http://knowledge.seagate.com/articles/en_US/FAQ/207957en

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x00) Offline data collection not supported.
SMART capabilities: (0x0000) Automatic saving of SMART data is not implemented.
Error logging capability: (0x00) Error logging NOT supported.
No General Purpose Logging support.

SMART Error Log not supported
SMART Self-test Log not supported
Device does not support Selective Self Tests/Logging

note “Firmware Version: 0958”, “Auto Offline Data Collection: Disabled”, and the truncated/incomplete output

what’s interesting is that this appears to be related to the order in which the disks are enumerated.  even though there are 3 different models of disk in this group of four, whichever disk is enumerated first suffers from this inaccurate smartctl output.  in my particular case, they enumerate from top to bottom in the enclosure, so whichever disk is the topmost disk suffers from this - even if i leave the top bay empty, and include only three disks in the enclosure.  

having experimented with different ordering, and with another single usb -> sata adapter [vantec], i’m fairly confident that this smartctl output is erroneous in some capacity.  each of the four disks will correctly report their various info, including the proper/confirmed firmware version, auto offline data collection, and complete smartctl output - as long as the disk is not first in enumeration, or is connected to the vantec single usb -> sata adapter.  additionally, having just updated the firmware on all of these disks, i know with certainty what firmware revision is on each - none of which are “0958”.

here is that same seagate disk, now in position two:

smartctl -a /dev/sdc
smartctl 5.43 2012-06-30 r3573 [i686-linux-3.5.0-17-generic] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.11
Device Model: ST3750330AS
Serial Number: 5QK042GB
LU WWN Device Id: 5 000c50 00838a648
Firmware Version: SD1A
User Capacity: 750,156,374,016 bytes [750 GB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Thu Jan 10 22:25:07 2013 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 634) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 174) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x103b) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 112 099 006 Pre-fail Always - 67235
3 Spin_Up_Time 0x0003 093 093 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 62
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 065 055 030 Pre-fail Always - 12896954537
9 Power_On_Hours 0x0032 095 095 000 Old_age Always - 5164
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 62
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 099 000 Old_age Always - 8590065667
189 High_Fly_Writes 0x003a 099 099 000 Old_age Always - 1
190 Airflow_Temperature_Cel 0x0022 067 046 045 Old_age Always - 33 (Min/Max 32/33)
194 Temperature_Celsius 0x0022 033 054 000 Old_age Always - 33 (0 19 0 0 0)
195 Hardware_ECC_Recovered 0x001a 022 015 000 Old_age Always - 67235
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

and here is another disk [hitachi], now in slot one, where the seagate was previously:

smartctl -a /dev/sdb
smartctl 5.43 2012-06-30 r3573 [i686-linux-3.5.0-17-generic] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family: Hitachi Ultrastar A7K1000
Device Model: Hitachi HUA721075KLA330
Serial Number: GTE200P8GBGHZE
Firmware Version: 0958
User Capacity: 750,156,374,016 bytes [750 GB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Thu Jan 10 22:26:00 2013 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x00) Offline data collection not supported.
SMART capabilities: (0x0000) Automatic saving of SMART data is not implemented.
Error logging capability: (0x00) Error logging NOT supported.
No General Purpose Logging support.

SMART Error Log not supported
SMART Self-test Log not supported
Device does not support Selective Self Tests/Logging

same peculiar firmware and accompanying characteristics.
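
a quick way to spot the cut-down output without eyeballing every dump is to look for the tell-tale lines. the python sketch below is only a heuristic built from the outputs pasted in this thread, not anything smartmontools itself defines; the function name `degraded_signs` and the trimmed samples are made up for illustration.

```python
def degraded_signs(smartctl_text):
    """Return reasons why saved smartctl -a text looks like the cut-down variant."""
    # (marker text, True if its presence is the bad sign, human-readable reason)
    checks = [
        ("Error logging NOT supported", True, "error logging reported as unsupported"),
        ("SMART Error Log not supported", True, "error log reported as unsupported"),
        ("Vendor Specific SMART Attributes", False, "attribute table missing"),
    ]
    signs = []
    for marker, bad_if_present, reason in checks:
        if (marker in smartctl_text) == bad_if_present:
            signs.append(reason)
    return signs


# Trimmed from the outputs above: the disk in slot one vs. the disk in slot two.
SLOT_ONE = """\
Firmware Version: 0958
Error logging capability: (0x00) Error logging NOT supported.
SMART Error Log not supported
"""
SLOT_TWO = """\
Firmware Version: SD1A
Error logging capability: (0x01) Error logging supported.
Vendor Specific SMART Attributes with Thresholds:
SMART Error Log Version: 1
"""

for label, text in (("slot one", SLOT_ONE), ("slot two", SLOT_TWO)):
    print(label, degraded_signs(text) or "looks complete")
```

a genuinely old or limited disk could trip the same checks, so this only narrows down which dumps deserve a second look.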

additionally, attempting to enable auto offline data collection fails for the affected disk:

smartctl --offlineauto=on /dev/sdb
smartctl 5.43 2012-06-30 r3573 [i686-linux-3.5.0-17-generic] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
Warning: device does not support SMART Automatic Timers.

Error SMART Enable Automatic Offline failed: scsi error medium or hardware error (serious)
Smartctl: SMART Enable Automatic Offline Failed.

smartctl --offlineauto=on /dev/sdc
smartctl 5.43 2012-06-30 r3573 [i686-linux-3.5.0-17-generic] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Automatic Offline Testing Enabled every four hours.

this may not be topical enough for discussion here, but if it is, i’m wondering what else might be done to reveal some additional clues about why this is happening.

well, i really should have read the discussion more thoroughly - having now done that, i see that we share the same enclosure, albeit in use via usb vs esata. i am also experiencing raid stability issues. initially, i had a seemingly functioning raid 5 across four disks, but at some undefined point it failed [it is used in a backup system that is not necessarily attended to every day]. i’ve not yet rebuilt it, wanting to sort out this behavior first.

well now that *is* interesting. Same enclosure, different make/model of drive, but the drive in bay 1 shows same firmware version number and same symptoms! I think we can call that a smoking gun. :slight_smile: It’s put my mind at rest then - a bit annoying full smart functionality on the drive in that bay doesn’t work, but no reason to think the drive itself is problematic.

I thought via USB you don’t get full smart functionality anyway, but it looks like you do. I had mine briefly connected to usb3 but it seemed too slow - the initial raid reshape probably wouldn’t have finished *yet* - so I switched to eSATA. I was only using usb3 initially because i wasn’t confident of the hotpluggability of the esata stuff and didn’t want to do a restart right then.

Possibly a driver issue: it looked like it was only doing usb2 speeds, and it was the first usb3 device I’d ever plugged into the usb3 sockets on that pc, so any such issues wouldn’t have affected me before.

yes, things seem to otherwise be functioning just fine - i’m not certain that my initial problems weren’t operator error.  i’ve only just rebuilt the array and begun to store data on it, so time will tell if the same thing will happen again.  has your array been functioning without issue [aside from the problem you mentioned initially with the WD10EACS]?  what raid level do you use?

i guess it varies from product to product, but the smartmontools folks have been pretty good at coercing smart output from disks on the other side of usb bridges and port multipliers. the consensus in this case is that it’s perhaps a firmware issue of some sort in one of the components used within the enclosure. for details, see this thread:

http://sourceforge.net/mailarchive/message.php?msg_id=30343704

i’ve attempted to contact both hornettek and jmicron, hoping they might express some interest in this.  i’m not a regular here, but if anything productive results, i’ll try to remember to revisit this thread.

Yes, everything’s been fine since. I initially used raid5 but reshaped to raid6 later (that took a ridiculously long time!). Some failures were occurring which I think I narrowed down to one of the older drives on a different interface - it hasn’t recurred since taking that out, anyway. NB: The eSATA interface the hornettek is connected to is a silicon image, not a jmicron, so it still looks like the common factor is the hornettek case.

I think the most useful thing I did during the whole process was finally make a way to map the devices to their physical locations; via the drive serial number in the symlinks in /dev/disk/by-id. So when one drive was throwing up errors in syslog I was able to find it exactly. Actually that happened twice, but once it was sense errors pointing to a loose connector, next time it really was a bad drive. All that reshaping is a bit of a stress-test i think!
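
For anyone wanting to do the same mapping, here is a minimal Python sketch of the idea. It assumes the by-id symlinks are named like `ata-<model>_<serial>` (so the serial is whatever follows the last underscore), which is what udev normally produces for SATA disks but may differ behind USB bridges. The function name `serial_to_device` and the `BAY_BY_SERIAL` table are made up for illustration; the one entry is the bay 1 drive from this thread, and the rest would be filled in by hand from the drive labels.

```python
import os


def serial_to_device(by_id_dir="/dev/disk/by-id"):
    """Map drive serial number -> resolved device node, using ata-* symlinks."""
    mapping = {}
    for name in sorted(os.listdir(by_id_dir)):
        # Skip non-ATA entries (wwn-*, usb-*, ...) and per-partition links.
        if not name.startswith("ata-") or "-part" in name:
            continue
        # Assumed naming: ata-<model>_<serial>
        serial = name.rsplit("_", 1)[-1]
        mapping[serial] = os.path.realpath(os.path.join(by_id_dir, name))
    return mapping


# Hand-maintained table (hypothetical): serial -> physical location.
BAY_BY_SERIAL = {
    "WD-WMAWZ0209664": "enclosure bay 1",
}

if os.path.isdir("/dev/disk/by-id"):
    for serial, device in serial_to_device().items():
        where = BAY_BY_SERIAL.get(serial, "location not recorded")
        print(f"{device}  {serial}  {where}")
```

With that table filled in, a device name from a syslog error can be turned into a physical bay without pulling drives to read their labels.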