Installed eight WD3000FYYZ drives on QNAP TS-EC879U-RP (8-bay Linux-based NAS box) in RAID6. All brand new. During array initialization, several messages popped up:
[Harddisk 7] medium error. Please run bad block scan on this drive or replace the drive if the error persists.
… plus about 200 “read error corrected” messages and four each of the I/O and “medium” errors. Since I ran the initialization three times, each run may have produced its own instance of the I/O and “medium” errors on the same block. All four I/O error entries carry the same numbers - does that mean the error happens on the same physical block / sector of the hard drive?
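One way to check whether all the I/O errors really name the same sector is to pull the sector numbers out of the kernel log and compare them. The log lines below are a made-up illustration of the usual `end_request` format (device name and sector number are hypothetical - substitute your actual dmesg output):

```python
import re

# Hypothetical kernel log excerpt; the device (sdg) and sector number are
# invented for illustration, not taken from the actual NAS.
log = """\
end_request: I/O error, dev sdg, sector 123456789
end_request: I/O error, dev sdg, sector 123456789
end_request: I/O error, dev sdg, sector 123456789
end_request: I/O error, dev sdg, sector 123456789
"""

# Extract every sector number mentioned in an I/O error line.
sectors = [int(m.group(1)) for m in re.finditer(r"sector (\d+)", log)]

# If the set collapses to one value, every error hit the same physical spot.
print(len(set(sectors)) == 1)
```

If this prints True for your real log, the errors are all on one sector, which points at a single weak spot on the platter rather than a widespread surface problem.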
According to some research, the I/O error above is RAID management reporting a URE but with failed re-allocation, “Unrecovered Read Error, Auto Reallocation Failed”. If I am getting it right, that means the drive (or the RAID brain) couldn’t write the data to a spare sector. Yet the initialization continued and eventually ended successfully. So I am a little puzzled as to what it all means.
Ran a “short” (5 minutes) SMART test - nothing. “Raw_Read_Error_Rate” - 0.
Running a “bad block scan”. This will take about 7 hours…
Questions:
What does this all mean? Is the drive bad or is it not? If it is bad, why does “Raw_Read_Error_Rate” show zero?
Are these errors enough to want / need to replace the disk? Will they be enough for WDC to issue an RMA?
If not, what tests do I need to run to determine that the drive is bad, that would be sufficient for WDC purposes?
I am running a “complete” SMART test on the drive on QNAP (so far no error messages), 70% done after four hours. If NPF (no problem found), I will also run WinDlg on it. If still NPF, what’s next? Where did those error messages come from?
Also, what’s the purpose of running the “extended test” if a bad block scan and “Quick Test” found nothing? What does the extended test do, exactly, that the first two don’t? The extended test will take about seven hours - I really don’t have the time for it, and if I have to run it, I’d like to have a better idea what it’s for.
Unfortunately, I have had issues with certain RAID controllers (HP) on Linux before, sometimes even resulting in kernel crashes. If the drivers for these RAID controllers aren’t stable and properly tested, that might actually be the cause of the issues you are experiencing.
So you would need to thoroughly test this with a non-Linux operating system such as Windows to confirm the HDDs are working properly. You might also check first (via Google) whether others have similar issues on Linux with that SATA/RAID controller. It might save you trouble and time!
Thanks David, that’s the standard operating procedure, already went through it (and then some) with no good results.