WD5000ABYS valid SMART but NOT Readable?!

uruiamme · December 20, 2011, 8:47am

This is at least my third 500GB RE2 drive to fail. :mansad: This one is weird, though. It failed in Windows while doing an automatic backup, and in my confusion after the failure and not realizing I had a dead drive, my backup file got overwritten by a backup without the data I want.

Nothing can read the WD5000ABYS-01TNA0 drive.

BIOS will usually (with some help, depending on the PC) detect it.
When I try a recovery program (booting to DOS/Linux CD) that can read MBR and Track 0, the drive always gives read errors and not a single byte seems to be accessible. If the utility uses FreeDOS, it is very slow to boot.
The SMART data is readable from a LiveCD Linux environment, but Gparted does not see the drive. It appears that the Linux environment does not mount the drive for Gparted to see.
Spinrite 6.0 cannot read it. FreeDOS spits out errors, so this is slow to boot, but Spinrite exclaims in red text that something is wrong. I have used Spinrite a lot, and I haven’t seen this before.
WDC diagnostics give an error 0003 I believe. It doesn’t get beyond drive model and serial number and a quick test that fails.
When I add it to a spare PC, Windows boots slowly.
When I go to Disk Management, Windows will ask to initialize it and says it is unpartitioned.
Sounds: Yes, it spins, and yes, it makes seek sounds, for example when I run Testdisk for hours as it tries to read every sector looking for an MBR.

I have, in the past, fixed dead hard drives with an embedded controller PCB swap. But I don’t know if this is a candidate for this.

Now for the interesting part. It was partitioned (dynamic disk) with a 30 GB C: drive mirror (RAID 1) and a 270 GB partition that was half of a RAID 0, and then a 200 GB regular partition. So although my mirrored C: drive is working, my RAID 0 is the drive I am concerned about. I would like to get some of the data off my RAID 0 partition if possible, but I cannot even make an image of the dead drive. My other drive is (of course) working fine.

Event logs:

Prior to death, a lot of timeouts were logged by atapi (event id 9) during ntbackup, and then a BSOD. I rebooted, with no clue as to what the problem was because the machine was not being manned. More of the same errors and another blue screen a few days later during an automatic backup. Manual reboot (still no clue) and then the inevitable “uncorrectable read error” from dmio (event id 35). There were also write errors and delayed write errors. After about 3 hours of these errors, another BSOD occurred. The status code of 0xc0000185 generally means to check the cable or termination. I know the cable is not the culprit here.

Question: Anyone know how I might get some data off this drive?

Awopero · December 21, 2011, 2:06pm

If the drive is not being seen by the computer, you will need to contact a professional data recovery company; they could dismantle the unit and try to recover the data.

Wayne · December 21, 2011, 8:17pm

did you at least replace the cable to make sure that it isn’t the cable? the other thing is that data recovery for raid 0 is expensive, so you never use raid 0 unless you have a backup. also, if you did any change to the drive, you won’t be able to re-establish it in the raid 0 environment it was in. you’ll have lost your data, because it will be doubtful that a data recovery company will be able to do so either.

fzabkar · December 22, 2011, 9:37pm

If your board has an 88i6745-TFJ1 Marvell MCU, then be aware that this component often suffers from a problem that mimics a head or media fault. Unfortunately this chip stores unique, drive specific “adaptive” data in its internal flash memory (I’m assuming that location U12 is vacant), so a straight board swap rarely works. You will need to transfer these data to your donor PCB.

Some board suppliers offer a firmware or “ROM” transfer service for $10, at least for those cases where there is a discrete flash memory chip at U12. Otherwise there are several DIY possibilities involving cheap and not-so-cheap software tools.

See http://forum.hddguru.com/help-wd1600bevt-after-overheating-t21373.html

uruiamme · December 24, 2011, 10:24pm

Yes, I have almost the exact same board as displayed in that forum. It looks like:

http://pcb-hdd.com/images/WD3200AAKS-75SBA0%202060-701444-003%20REV%20A%20PCB.jpg
(which I found at:
http://forum.hddguru.com/wd3200aaks-00l9a0-where-u12-chip-t21124.html )
My board revision seems to be “3307” and has a large “4” where the picture shows “2707” with a large “2”. My board also has no U12 chip.

My next idea is to read that data using MHDD. I already have a boot disk that contains MHDD 4.6, although someone says to use version 4.5 for some reason.

Quote from the link above, “Your drive is a ROYL, and you will need MHDD ver 4.5.”

There is a PCB board model number (printed on the PCB) 2060-701477-002 REV A plus what I guess is a firmware sticker, 2061-701477-900 AB followed by a long code beginning with XC.

Can anyone tell me now what I can do with mhdd? It appears that I need to pull off the firmware.

Can someone explain what is a ROYL drive controller ROM or what difference it makes?

Does anyone have a link to tools in English about how to grab the ROM or firmware?

I found a potential supplier of the PCB that I need. See http://www.drivestar.biz/wd-701477-p-226.html I suppose that if the drive platters and read system are okay, I can merely capture the ROM from the dying controller, switch PCBs, and then update the new PCB with the old ROM, and then the drive will work as before?

Incidentally, it is in warranty, but I don’t know if WDC has any refurbished RE2 drives available for replacement, although I assume that they would have something. What has happened previously when I sent them one of these drives is that WDC will send me a similar drive. For example, I have 2 computers with one each of WD5000YS and WD5000ABYS. It just happened that way by chance as the older drives failed and WDC would send me replacements. At least one time I had a drive that was failing (but still readable) and I was able to get the replacement drive shipped to me first (with a credit card as a backup or as a prepay, I forget) and I was able to have both the old and refurb drive available for copying.

I almost forgot: when I examined the PCB, I noticed two possible issues.

the Marvell 88i6745-TFJ1 appeared to have turned the foam pad a little brown (heat?)

One of the C37 capacitors may have leaked? These are small SMD caps, so I don’t think they “leak” like electrolytic caps, but I assume they fail in a catastrophic way, too. I measured both while still in the circuit and they read the same 38 microfarads.

fzabkar · December 25, 2011, 9:08pm

WD uses Vendor Specific Commands (VSC) for low level diagnostic access to the hard drive’s ROM and firmware area. Standard ATA commands do not have this functionality. In order to read the flash memory contents (“ROM”), you need to run the appropriate MHDD script. MHDD version 4.5 supports scripting whereas version 4.6 does not.

In your case you would use the script entitled “WD Marvell Royl serial flash 192 read”:
http://yura.projektas.lt/files/wd/mhdd/index.html

Before you run the script, you will need to create a 512-byte file named rom192.bin containing the following data:
http://yura.projektas.lt/files/wd/mhdd/romread192.jpg

After executing the script, you would need to examine the 3 resulting files (3rom0/1/2.bin) to confirm that they look right, and then pack them into a single file. This file would then need to be written to your donor’s flash memory.

If the above is too much for you, then a clickable alternative is WDR-UDMA, but it isn’t freeware (~US$200).

BTW, “ROYL” refers to the architecture of your drive. If you examine the firmware modules (as in the links in the HDD Guru thread), you will see a “ROYL” header. ROYL drives use different VSCs than “Marvell” drives (even though both have Marvell MCUs).

C37 and the other parallel connected capacitors would be the filters for the Vcore supply for the Marvell MCU. I suspect the voltage would be around +1.35V or less.

BTW, “2707” and “3307” are WWYY (Week / Year) date codes. This means that the PCBs were manufactured during weeks 27 and 33 of 2007, respectively. You will see similar YYWW date codes on each of the chips.
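
As a small illustration of the WWYY reading described above, here is a Python sketch that splits such a code into week and year. The function name is made up for this example, and it assumes the two-digit year belongs to the 2000s.

```python
def decode_wwyy(code: str) -> tuple:
    """Split a 4-digit WWYY date code into (week, year).

    Assumption: the two-digit year is in the 2000s (e.g. "07" -> 2007).
    """
    if len(code) != 4 or not code.isdigit():
        raise ValueError(f"expected 4 digits, got {code!r}")
    week = int(code[:2])
    year = 2000 + int(code[2:])
    if not 1 <= week <= 53:
        raise ValueError(f"week out of range: {week}")
    return week, year


# The two PCB codes mentioned in this thread.
for code in ("2707", "3307"):
    week, year = decode_wwyy(code)
    print(f"{code}: week {week} of {year}")
```

For the YYWW codes on the chips, the two halves would simply be swapped.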

uruiamme · December 26, 2011, 5:18am

THANKS! My Lithuanian is a bit rusty.

Ok, I had already long prepared for this over the last few days. I am using an old 40 GB hard drive that boots to DOS 7 (Win98SE) and I am running Hiren’s Boot CD to be able to copy the files more easily (although I have to reboot to the CD in between copying a file). Even though my fixing computer has a floppy drive, except for formatting the C: drive I haven’t needed it. With Hiren’s, I can use the network or a USB stick to transfer the files.

So far, so good. I captured the 3 files and I even concatenated them while in DOS. (One thing about DOS is it is easier to combine binary files, or “pack them into a single file” as you call it.) Your instructions and the manual are pretty sketchy, so I will briefly explain it for others: I copied the rom192.bin file that I created with WinHex into the MHDD directory. I placed my script file in the script subdirectory and removed its file extension. When it came time to run MHDD, I selected the drive, ran id and eid to ensure it was alive, then I typed a period followed by the script name. This became:

id

eid

.royl-192

This last part was rather fast, and I think it scrolled past some info. In other words, my drive was able to read all of the stuff properly imho.

So you think that the 3 little files are all that represent this hard drive’s parameters, and everything else on the Marvell chip is firmware? And I guess we are hoping that any further calibration and parameters are safe and sound in negative LBA cylinders on the physical drive? I hope this works.

By the way, I can zip the 3 files and let you see them. I personally think they look okay, especially since there is data in most all of the files, except a lot of FF stuff towards the end of the third file. Of course I saw the ROYL code and the drive’s firmware revision code near the end of 3ROM2.BIN. I guess everything else is encoded or encrypted. So much for ASCII view!
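
For anyone who wants to repeat this kind of eyeball check without a hex editor, here is a rough Python sketch. It only reports the two things described above: where the ASCII string ROYL first appears, and how many 0xFF bytes pad the end of the image. The function name is invented for the example, and a clean result does not prove the dump is good.

```python
def summarize_rom(data: bytes) -> dict:
    """Report a few quick sanity indicators for a dumped ROM image."""
    stripped = data.rstrip(b"\xff")
    return {
        "size": len(data),
        # Offset of the first ASCII "ROYL" marker, or -1 if it is absent.
        "royl_offset": data.find(b"ROYL"),
        # Number of 0xFF padding bytes at the very end of the image.
        "trailing_ff": len(data) - len(stripped),
        # True if the dump is nothing but 0xFF (likely a failed read).
        "all_ff": len(stripped) == 0,
    }


# Synthetic stand-in for a dump; real data would come from a file.
sample = b"\x00" * 16 + b"ROYL" + b"\x01" * 4 + b"\xff" * 8
print(summarize_rom(sample))
```

To run it on a real dump, read the file in binary mode first, for example `summarize_rom(open("3ROM2.BIN", "rb").read())`.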

Now for my Ace in the Hole…

I actually have an identical drive that already has the same firmware revision I think. I will be verifying this when I put it into the test PC, but according to Wintune, the firmware revision matches what MHDD reported for my bad drive. My only qualm is that I don’t know if I will be able to use the donor drive’s PCB, get the bad drive working, and then revert the donor drive back to a functional drive. It will be interesting to try! My donor drive is currently in a mirrored Win Server, so I can break the mirror while I attempt this. I guess the other way is to see if WD would send me an equivalent drive upon request so that I didn’t have to borrow one of my own drives.

fzabkar · December 26, 2011, 9:38pm

Nice work! Most people throw up their hands and run away screaming when they see a command line.

Just to be sure, when you concatenate the three files, you need to use the /b switch for a binary mode copy, as follows:

copy /b file1 + file2 + file3 file123
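
The reason /b matters is that COPY in ASCII mode treats a Ctrl-Z byte (0x1A) as end of file, which would truncate a binary image. If you are working outside DOS, the same byte-for-byte join can be sketched in Python as below. The function name is invented for the example; this is only an equivalent of the COPY command, not part of the MHDD procedure.

```python
import shutil
from pathlib import Path


def concat_binary(parts, dest):
    """Append each file in `parts`, byte for byte, into a new file `dest`.

    Files are opened in binary mode, so no byte (including 0x1A or
    CR/LF pairs) is interpreted or translated. Returns the size of dest.
    """
    with open(dest, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(dest).stat().st_size
```

A call would look like `concat_binary(["3rom0.bin", "3rom1.bin", "3rom2.bin"], "rom.bin")`, where the output name rom.bin is just a placeholder. The returned size should equal the sum of the three input sizes.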

Also, you can compare your ROM image against other ROYL models here:
http://www.datadonor.net/index.php?folder=SEQgV2VzdGVybiBEaWdpdGFsL1JveWwgU2VyaWVz

AIUI, the ROM contains a small amount of boot firmware plus the unique adaptive data. The bulk of the runtime firmware is fetched from the System Area on the platters.

BTW, a freeware alternative to Winhex is HxD:
http://mh-nexus.de/en/hxd

Still another way to create the 512-byte file is to use the DOS Debug command.

D:\Junk>debug
-f 100 2ff 0
-e 100
12EB:0100 00.24 00.0 00.1 00.0 00.0 00.0 00.0 00.0
12EB:0108 00.0 00.0 00.3
-d 100 10f
12EB:0100 24 00 01 00 00 00 00 00-00 00 03 00 00 00 00 00 $...............
-rcx
CX 0000
:200
-n rom192.bin
-w 100
Writing 00200 bytes
-q

D:\Junk>dir rom192.bin

 Volume in drive D is DATA
 Volume Serial Number is 11EA-0F64
 Directory of D:\Junk

ROM192 BIN 512 12-26-11 5:59p ROM192.BIN
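
If neither a hex editor nor Debug is handy, the same 512-byte file can be produced with a few lines of Python. This is only a sketch: the byte values are copied from the Debug dump above (24 at offset 0, 01 at offset 2, 03 at offset 10, everything else zero), and the function name is made up.

```python
def build_vsc_sector() -> bytes:
    """Build the 512-byte block shown in the Debug session above."""
    sector = bytearray(512)   # starts out as 512 zero bytes
    sector[0x00] = 0x24
    sector[0x02] = 0x01
    sector[0x0A] = 0x03
    return bytes(sector)


sector = build_vsc_sector()
# Show the first 16 bytes for comparison with the Debug "d 100 10f" line.
print(" ".join(f"{b:02X}" for b in sector[:16]))
print(len(sector), "bytes")
```

Writing it out is then a matter of `open("rom192.bin", "wb").write(build_vsc_sector())`.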

Now comes the sad part. I don’t know how to WRITE the image file back to the ROM. There is obviously a VSC that does this, but I don’t have a list of these commands.

If you can find a list of WD VSCs, then we could write an MHDD script to finish the job. Otherwise you could try to find a board supplier that could do it for you. Drivestar, for example, also sell HDD data recovery tools, so they should be able to take your ROM image and transfer it to a donor PCB.

uruiamme · December 27, 2011, 4:05am

ok,

ROM file downloads from bad drive, done
copy to a combined ROM file, done
compare my ROM file with similar ones, done (I found a WDC WD5000AAKS-00TMA0 ROM file that was remarkably similar)

When I compared my file, it was the same size and looked very similar to the file located in ROM/ROM.bin in

To do:

obtain a PCB
copy ROM to new PCB
everything will work

Now, my questions:

Could I or should I plug in a PCB without having it connected to the drive heads? Like should I put electrical tape over the connectors that go to the drive coils, boot up the drive, and copying the ROM like that?
Do I have to have the same firmware revision on the donor drive? By this I mean all of the MOD files that contain the modules… aren’t they supposed to be the same? Are all of those MOD files from the Service area of the chip, or the disk, or what?
Is there a way to extract or upload all of those mod files without one of those “tools”? There are hundreds of MOD files, and there is a “0” directory and a “1” directory in the RAR file that I got from datadonor.net, each with a ton of those MOD files.
What is a VSC? (Oh wait, a vendor specific ATA command)
Drivestar has a compatible board for cheap ($6), but they didn’t offer to send me one with a specific firmware.
Since you know so much about DOS debug (I will admit I have only used it when I followed a detailed procedure, and then only rarely), all you have to do is debug a WD firmware flash utility (like the one I used for my WD5000YS drives; search for that) or see this link to VSC for WDC: http://idle3-tools.sourceforge.net/ in which someone has documented how (and why) a person can prevent a WD drive from spinning down by sending some VSC to the drive.

At http://yura.projektas.lt/files/wd/mhdd/index.html we got the procedure to grab the ROM. Did you not see that there is a write command in one of the scripts? He pulls some data out with one of those “sectorsto = rtra01.bin || regs = $d5 $00 $bf $4f $c2 $a0 $b0” commands. He then has a procedure to write back some data with “sectorsfrom = rtra01.bin || regs = $d6 $01 $be $4f $c2 $a0 $b0” – I think this must be the VSC we need. It relates back to the draft written by Western Digital on “Log Page Command Transport” using a SMART command similar to retrieving logs. Instead of logs, Western Digital uses their Super On command to tell the drive to get ready to transfer service area (SA) data. The Super On command seems to be “regs = $45 $0b $00 $44 $57 $a0 $80” ($80 is vendor specific) and the $b0 is supposed to be SMART.

In the T13/e05109r4 PDF file, we clearly see that $d6 is the SCT Write Log subcommand while $d5 is Read. On the Lithuanian website, I see the scripts using both, and this is for the ROYL firmware drive, too. Can we write a script that does the read and write? It looks complicated with MHDD, for obvious reasons.
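
To make the register lines quoted above easier to read, here is a rough Python sketch that names each value. It assumes the seven numbers after regs are in the usual ATA task-file order (features, sector count, LBA low, LBA mid, LBA high, device, command); that ordering is my assumption about the script format, not something confirmed in this thread. The SMART command code $b0, the $4f/$c2 signature, and the $d5/$d6 read/write log subcommands are standard ATA values. The function names are invented, and this only decodes text; it sends nothing to a drive.

```python
FIELDS = ("features", "sector_count", "lba_low",
          "lba_mid", "lba_high", "device", "command")

SMART_SUBCOMMANDS = {0xD5: "SMART READ LOG", 0xD6: "SMART WRITE LOG"}


def parse_regs(line: str) -> dict:
    """Turn '... regs = $d5 $00 $bf $4f $c2 $a0 $b0' into named values."""
    _, found, tail = line.partition("regs")
    _, eq, values = tail.partition("=")
    tokens = values.split()
    if not (found and eq) or len(tokens) != len(FIELDS):
        raise ValueError(f"not a 7-register regs line: {line!r}")
    return {name: int(tok.lstrip("$"), 16)
            for name, tok in zip(FIELDS, tokens)}


def describe(regs: dict) -> str:
    """Give a one-line reading of the parsed registers."""
    is_smart = (regs["command"] == 0xB0
                and (regs["lba_mid"], regs["lba_high"]) == (0x4F, 0xC2))
    if is_smart:
        sub = SMART_SUBCOMMANDS.get(regs["features"], "other SMART subcommand")
        return f"{sub}, log address 0x{regs['lba_low']:02X}"
    return f"command 0x{regs['command']:02X}"


for line in ("sectorsto = rtra01.bin || regs = $d5 $00 $bf $4f $c2 $a0 $b0",
             "sectorsfrom = rtra01.bin || regs = $d6 $01 $be $4f $c2 $a0 $b0",
             "regs = $45 $0b $00 $44 $57 $a0 $80"):
    print(describe(parse_regs(line)))
```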

fzabkar · December 31, 2011, 9:10pm

AIUI, you should be able to copy the “ROM” with the board disconnected from the drive. To prevent the drive trying to spin up, you could install the PM2 (Power Up In Standby) jumper.

All you need to do is to physically match the donor PCB. You shouldn’t need to match the MOD files. The latter are the modules in the Service Area on the platters. The original contents of the flash memory will be overwritten by the ROM image you have just captured. They will not be affected by the MOD data.

The cheapest commercial tool for working on the Service Area (and ROM) of WD drives appears to be WD HD Pro. The OP of the following thread is selling it for US$80:
http://forum.hddguru.com/call-forum-administrator-t21565-20.html

Otherwise you could painstakingly create your own MHDD scripts to do the same thing.

For example, the script entitled “WD Marvell Royl SA rom copy read” could be your template for a more comprehensive SA MOD dump. Notice that the module ID appears in the associated 512-byte VSC file in little endian format (byte reversed).
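
To show what "little endian (byte reversed)" means in practice, here is a short Python sketch. It assumes the module ID is a 16-bit value, which is not stated in this thread, and the ID 0x0102 is purely a made-up example rather than a real module number. Where the ID sits inside the 512-byte VSC file is also not covered here.

```python
import struct


def module_id_to_bytes(module_id: int) -> bytes:
    """Encode a module ID as 2 bytes, least significant byte first.

    Assumption: the ID is an unsigned 16-bit value.
    """
    return struct.pack("<H", module_id)


def module_id_from_bytes(raw: bytes) -> int:
    """Decode the first 2 bytes of `raw` as a little-endian module ID."""
    return struct.unpack("<H", raw[:2])[0]


example_id = 0x0102  # hypothetical ID, for illustration only
encoded = module_id_to_bytes(example_id)
print(encoded.hex(" "))                    # low byte comes first
print(hex(module_id_from_bytes(encoded)))  # round-trips to the same ID
```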

I only know enough about DOS Debug to perform small jobs. What you are suggesting (code disassembly) requires an understanding of assembler.

The script that refers to “rtra01.bin” reads entire tracks, not individual modules. I’m not familiar with these structures.

You are right about the SCT Read and Write Log commands. WD’s VSCs tunnel through to the drive encapsulated as SMART log file data.