2 TB Red showing a high "Raw Read Error Rate"

Hi, I have a RAID 1 setup with two WD20EFRX drives, and one of the disks shows a high “Raw Read Error Rate”.

The value increases nearly every day:

11/09/2013 : 14687

12/09/2013 : 14687

13/09/2013 : 14883

14/09/2013 : 14883

15/09/2013 : 15686

Also, my RAID has turned into virying state 2 times since 10/09/2013, and maybe 4 times since I set up the RAID 1.

Should I RMA the “bad” disk?

I would contact Support about an RMA. Report the symptoms and see what they say.

Do you have the “uncut” version of your answer? :neutral_face:

It’s a bit short, isn’t it?

It will be up to Support to judge the situation. An increasing error report is troubling. You can get more information by running Data Lifeguard (WinDLG). Having had a difficult experience with a drive that was showing signs of problems, I tend toward immediate replacement, if possible. I take it that the RAID has rebuilt itself more than once? I’m not sure what “turned into virying state” means. It’s a good thing that this is a RAID 1. Rebuilding with a new drive should be straightforward.

“virying” = verifying, sorry…

And thanks for this long version answer.

That was also my point of view.

I was going to purchase two more 2 TB Reds to make a RAID 10, but I will wait for the RMA result now…

Another question arises: I bought this drive from Amazon (France). Should I first ask Amazon for an RMA, or go directly to WD?

Thanks again.

It seems unlikely that a drive will be replaced based on raw read error rate.

The first thing to understand is that raw read error rate is not a counter: the number you see is not a direct count of read errors or anything like that. I just had a quick look at a few Seagates, and the raw read error rate value was 56 million. But if you take that and convert it into the proper formatted value (starting at 100 and counting down to 0, with a value of 6 or below meaning failed), it’s at 65. And these drives all jump down to 65 as soon as you turn them on.

Each manufacturer has their own algorithms behind raw read error rate, and in fact manufacturers implement this differently on different models of their drives.

So, for one, you need to look at the formatted value, because the raw value on its own is meaningless. The issue here is that drives have ECC, meaning they can detect and correct read errors on the fly. So when you actually get a “bad sector”, it is a sector so damaged that ECC can’t correct it. Now, if you take a WD Black in good condition, ECC will be correcting errors all of the time, but the Black will report a raw read error rate of 0, i.e. it gives no information at all about what’s going on or what overall state the drive is in. Pick up a Green and it could be in better condition than the Black, yet it might report a raw read error rate of, say, 55 million, which is 68/100, which is, say, 18 higher than typical for those drives, meaning the drive is better than most other Greens… Those numbers for the Greens are made up, but I was trying to illustrate that you need to be careful when interpreting certain SMART values for HDDs. Raw read error rate is not something most people should pay much attention to.
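To make the raw vs. formatted distinction concrete, here is a minimal Python sketch. It assumes the common SMART convention that an attribute counts as failed once its formatted (normalized) value drops to or below the vendor threshold, and that a threshold of 0 can never fail. The function names and the sample numbers are made up for illustration.

```python
def attribute_status(value: int, threshold: int) -> str:
    """Classify one SMART attribute from its normalized value and threshold.

    Assumption: value <= threshold means the attribute has failed,
    and a threshold of 0 means the attribute is always passing.
    """
    if threshold == 0:
        return "always passing"
    if value <= threshold:
        return "failed"
    return "ok"


def headroom(value: int, threshold: int) -> int:
    """Normalized points left before the threshold is reached."""
    return value - threshold


# Illustrative numbers only: a raw count in the millions can sit next to a
# normalized value that is nowhere near its threshold.
print(attribute_status(value=65, threshold=6), headroom(65, 6))
```

The point of the sketch is only that the pass/fail decision uses the formatted value and the threshold; the raw number never enters into it.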

If you are worried about the disk (and I can’t tell without the formatted value), do a read test on both of your drives, one of the tests that reports the read time for each sector. If drive A reads most sectors in, say, 4 ms with a few at 12 ms, but drive B has a ton that are much higher, then it’s time to look into an RMA. But if they give the same results, you are wasting your time trying to RMA the drive…
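The comparison described above could be sketched roughly as follows in Python. This is only an illustration: the 50 ms cut-off for a "slow" block, the function names, and the sample timings are all assumptions, not values from any tool.

```python
from statistics import median


def slow_fraction(read_times_ms, slow_ms=50.0):
    """Fraction of blocks whose read time exceeds slow_ms (assumed cut-off)."""
    if not read_times_ms:
        raise ValueError("no read times given")
    slow = sum(1 for t in read_times_ms if t > slow_ms)
    return slow / len(read_times_ms)


def compare_drives(times_a, times_b, slow_ms=50.0):
    """Summarise two drives' per-block read times side by side."""
    return {
        "a": {"median_ms": median(times_a), "slow": slow_fraction(times_a, slow_ms)},
        "b": {"median_ms": median(times_b), "slow": slow_fraction(times_b, slow_ms)},
    }


# Made-up sample data: drive A is mostly ~4 ms with a few blocks at 12 ms,
# drive B has a cluster of very slow blocks.
drive_a = [4.0] * 95 + [12.0] * 5
drive_b = [4.0] * 80 + [150.0] * 20
summary = compare_drives(drive_a, drive_b)
print(summary)
```

Note that the median is the same for both sample drives; it is the share of slow blocks that separates them, which is why the per-sector map matters more than an average.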

- What do you call the “formatted value”?

- No word on my RAID needing frequent verifications?

- “do a read test on both of your drives, one of the test that reports read time for each sector”: how can I do that?

It might be easier if you give a bit of information about what software/tools you are already running. Almost all software that reports SMART will take the raw data (the value you supplied) and display it as the formatted value, or alongside the formatted value.

Whether the RAID needs verification really depends on the hardware/software running the RAID. Again, more info is needed from you on your system. The thing is, the raw read error rate alone will NOT EVER lead to a RAID verification. If there is a fault with the drive that leads to a verification of your array, it will present itself in other SMART values, which are much more prominent and easy to recognise as a problem. If you post the entire set of SMART parameters I can take a look.

I use a DOS tool that you probably don’t want to go near. HD Tune Pro has a trial period and will do the sort of test you want; HDDScan is free and does it too. Google HDDScan, run a verification scan on your drives, and compare the map tabs.

P.S. HDD Scan will give formatted SMART readouts too.
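For anyone curious what such a read test does under the hood, here is a rough Python sketch that reads a file block by block and records how long each read takes. It is not what HDDScan or HD Tune actually run; the function names are made up, the operating system's cache can hide real disk latency, and on a RAID volume you are timing the array rather than one member disk.

```python
import os
import tempfile
import time


def timed_read(path, block_size=1024 * 1024, max_blocks=None):
    """Read `path` sequentially and return the time in ms taken per block."""
    timings_ms = []
    with open(path, "rb", buffering=0) as f:
        while max_blocks is None or len(timings_ms) < max_blocks:
            start = time.perf_counter()
            data = f.read(block_size)
            elapsed = time.perf_counter() - start
            if not data:  # end of file
                break
            timings_ms.append(elapsed * 1000.0)
    return timings_ms


def slowest_blocks(timings_ms, count=5):
    """Return (block_index, ms) pairs for the slowest blocks."""
    ranked = sorted(enumerate(timings_ms), key=lambda item: item[1], reverse=True)
    return ranked[:count]


# Self-check on a small temporary file. For a real test, point timed_read()
# at a large file on the volume you want to check.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(10 * 1024))
demo = timed_read(tmp.name, block_size=4096)
os.remove(tmp.name)
print(len(demo), "blocks timed; slowest:", slowest_blocks(demo, count=1))
```

Blocks that take far longer than their neighbours are the ones a verification map would colour as slow.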

You are filling in lots of gaps for me, Zatick. Thanks!

@Zatick :

I have downloaded HDDScan, but it can’t read SMART info when the HDs are in RAID mode (Intel RST).

HD Sentinel can:

HD1

| No. | Attribute | Threshold | Value | Worst | Data | Status | Flags |
|---|---|---|---|---|---|---|---|
| 1 | Raw Read Error Rate | 51 | 199 | 199 | 15686 | OK | Self Preserving, Error-Rate, Performance, Statistical, Critical |
| 3 | Spin Up Time | 21 | 197 | 172 | 3150 | OK | Self Preserving, Performance, Statistical, Critical |
| 4 | Start/Stop Count | 0 | 100 | 100 | 249 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 5 | Reallocated Sectors Count | 140 | 200 | 200 | 0 | OK | Self Preserving, Event Count, Statistical, Critical |
| 7 | Seek Error Rate | 0 | 200 | 200 | 0 | OK (Always passing) | Self Preserving, Error-Rate, Performance, Statistical |
| 9 | Power On Time Count | 0 | 93 | 93 | 5188 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 10 | Spin Retry Count | 0 | 100 | 100 | 0 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 11 | Drive Calibration Retry Count | 0 | 100 | 100 | 0 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 12 | Drive Power Cycle Count | 0 | 100 | 100 | 207 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 192 | Power off Retract Cycle Count | 0 | 200 | 200 | 96 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 193 | Load/Unload Cycle Count | 0 | 200 | 200 | 152 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 194 | Disk Temperature | 0 | 111 | 106 | 36 | OK (Always passing) | Self Preserving, Statistical |
| 196 | Reallocation Event Count | 0 | 200 | 200 | 0 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 197 | Current Pending Sector Count | 0 | 200 | 200 | 0 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 198 | Off-Line Uncorrectable Sector Count | 0 | 100 | 253 | 0 | OK (Always passing) | Self Preserving, Event Count |
| 199 | Ultra ATA CRC Error Count | 0 | 200 | 200 | 0 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 200 | Write Error Rate | 0 | 100 | 253 | 0 | OK (Always passing) | Error-Rate |

HD2

| No. | Attribute | Threshold | Value | Worst | Data | Status | Flags |
|---|---|---|---|---|---|---|---|
| 1 | Raw Read Error Rate | 51 | 200 | 200 | 0 | OK | Self Preserving, Error-Rate, Performance, Statistical, Critical |
| 3 | Spin Up Time | 21 | 212 | 175 | 4375 | OK | Self Preserving, Performance, Statistical, Critical |
| 4 | Start/Stop Count | 0 | 100 | 100 | 260 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 5 | Reallocated Sectors Count | 140 | 200 | 200 | 0 | OK | Self Preserving, Event Count, Statistical, Critical |
| 7 | Seek Error Rate | 0 | 200 | 200 | 0 | OK (Always passing) | Self Preserving, Error-Rate, Performance, Statistical |
| 9 | Power On Time Count | 0 | 93 | 93 | 5195 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 10 | Spin Retry Count | 0 | 100 | 100 | 0 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 11 | Drive Calibration Retry Count | 0 | 100 | 100 | 0 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 12 | Drive Power Cycle Count | 0 | 100 | 100 | 217 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 192 | Power off Retract Cycle Count | 0 | 200 | 200 | 102 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 193 | Load/Unload Cycle Count | 0 | 200 | 200 | 157 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 194 | Disk Temperature | 0 | 117 | 111 | 33 | OK (Always passing) | Self Preserving, Statistical |
| 196 | Reallocation Event Count | 0 | 200 | 200 | 0 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 197 | Current Pending Sector Count | 0 | 200 | 200 | 0 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 198 | Off-Line Uncorrectable Sector Count | 0 | 100 | 253 | 0 | OK (Always passing) | Self Preserving, Event Count |
| 199 | Ultra ATA CRC Error Count | 0 | 200 | 200 | 0 | OK (Always passing) | Self Preserving, Event Count, Statistical |
| 200 | Write Error Rate | 0 | 100 | 253 | 0 | OK (Always passing) | Error-Rate |
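As a quick cross-check of the two readouts above, the Python sketch below looks only at the raw "Data" column for the attributes that count bad or suspect sectors, then lists which attributes differ between the drives. The raw values are copied from the HD Sentinel tables; which attributes to treat as sector-health indicators is an assumption, and the function names are made up.

```python
# Attributes assumed here to indicate sector trouble when their raw count
# is non-zero.
SECTOR_HEALTH_IDS = {
    5: "Reallocated Sectors Count",
    196: "Reallocation Event Count",
    197: "Current Pending Sector Count",
    198: "Off-Line Uncorrectable Sector Count",
}

# Raw "Data" values taken from the HD1 and HD2 tables.
hd1 = {1: 15686, 5: 0, 196: 0, 197: 0, 198: 0, 199: 0}
hd2 = {1: 0, 5: 0, 196: 0, 197: 0, 198: 0, 199: 0}


def sector_problems(raw):
    """Return the sector-health attributes whose raw count is non-zero."""
    return {
        SECTOR_HEALTH_IDS[attr_id]: raw[attr_id]
        for attr_id in SECTOR_HEALTH_IDS
        if raw.get(attr_id, 0) > 0
    }


def differing(raw_a, raw_b):
    """Attribute IDs whose raw values differ between the two drives."""
    return sorted(k for k in raw_a.keys() & raw_b.keys() if raw_a[k] != raw_b[k])


print("HD1 sector problems:", sector_problems(hd1))
print("HD2 sector problems:", sector_problems(hd2))
print("Differing attributes:", differing(hd1, hd2))
```

With these readouts, neither drive reports any sector problems and only attribute 1 differs; a non-zero pending or reallocated count would show up in the first dictionary.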

I just launched a read test in HDDScan, but it gives no feedback on the test progress, so I am letting it run.

It gives feedback if you double click the test being run in the window at the bottom.

I can say though, that there doesn’t appear to be anything wrong with either drive and I would be extremely surprised if WD would even consider replacing the drive based purely off that SMART readout.

Do you have the info about your RAID?

[Attached image: chart.gif]

But I really wonder why I am doing this test…

One of those disks keeps breaking my RAID 1, it is that simple!

That picture is blocked, did it contain the serial number?

I mentioned it before: raw read error rate alone will NOT break the RAID. If there are read errors, they will ALWAYS present themselves in other parameters that are a lot easier to interpret. The raw read error rate you showed does not necessarily mean there is a problem.

Most likely you are using Intel ICH RAID, and if you elaborate on your RAID controller I might be able to help you with that.

I’m trying to help you here, if you don’t want it you can try to RMA the disk. But given the data you have presented you could very well find they just send your disk back and charge you for postage…

Regarding the picture, I think it must be validated by WD before you can see it. What it shows is a slowly decreasing curve, BUT with a huge gap at 3/4 of the disk array.

Yes I use the Intel sata port on a z77mx-d3h mobo with RST 12.6.0.1028 drivers and 12.6.0.1867 OROM.

Okay, that would indicate there may be a problem with the disk and it should show up as pending sectors or reallocated sectors. Do you have counts in either of these fields on either disk?

Also, check offline uncorrectable. Again, it would be easier if you could just post the full set of SMART details.

I have already posted the full SMART details for both disks.

OK, that’s enough: 3 months later, the system now simply hangs when I try to copy a 20 GB movie.

  • Raw read error rate: keeps growing, it’s now 18323

  • And today, some good news: 2 in Pending Sector Count… :smirk_cat:

I was planning to buy four 4 TB WD Reds next week; if WD refuses to RMA this dying disk, I will simply go to Seagate.