How To: Recover files from a WD Sharespace RAID 5 when the DataVolume is missing and the LVM data is lost

I successfully rescued my 39 GB of family photos and financials from my unresponsive 4TB WD Sharespace (WDSS) configured as Raid 5 without losing a file.

 

My method involved many of the steps posted earlier, but there were some important differences and specifics I would like to pass along. I hope that by posting my process here it may help those for whom the earlier process did not work. This write-up condenses about 40 hours of trial and error into a process that should take a handful of commands and a couple of hours.

 

As I want this to be useful for the inexperienced (as I was coming in), I’m including more detail for the interim steps near the end of the article for those in need of the extra help. This extra detail should also allow the process to be applied to other Linux-based NAS units, though extra effort will be necessary to confirm the appropriate RAID parameters for other hardware.

I am going to post this as a series of 7 subsequent messages due to the character limit of the forum.

 

I hope all find this useful.


This is #2 of a multipart solution. Please read the entire document for the full solution.

Background

 

My 4 TB WD Sharespace worked flawlessly for about 18 months. While on a family vacation I began to receive email notices from the unit that it was rebooting, coincident with storms moving through the region. Yes, the UPS had failed. I returned to find all the drive lights amber and the shares down. Initially I was able to SSH in remotely with PuTTY from a networked PC, but even this access was lost after an attempt to power cycle the unit. No, I did not have backups of this unit. This should sound very familiar from what I’m reading on this forum.

 

What Worked

 

As supported by other posts on this forum, I found that each hard drive had four partitions, three of which were flagged as RAID components. (Note that my lettering begins with “b” rather than “a” because I chose to install Ubuntu rather than use the live CD; my Ubuntu system drive becomes disk “a”.) These partitions are listed below as an example for the first hard drive (sdb). The “b” corresponds to the hard drive on port 1 of my rescue system and in position 1 of the WD Sharespace. The partitions on the other hard drives look exactly the same, with “b” replaced by “c”, “d”, and “e”.

 

sdb1  214 MB

sdb2  1.1 GB

sdb3  213 MB

sdb4  999 GB

 

My data was on the fourth and largest partition.

 

The important multi-disk (prefixed with “md”) RAID units are: md0 (RAID 1 comprised of sd[e-b]1, used as a system partition); md1 (RAID 1 comprised of sd[e-b]2, of uncertain use); and md2 (should be RAID 5 comprised of sd[e-b]4, containing our data).

 

IMPORTANT:  The WDSS uses an extra layer of abstraction called LVM. The file system lives within this layer rather than directly on the RAID, md2. To recover your files intact you must restore all layers to their original state in the listed order: 1. the RAID correctly configured; 2. the LVM layer correctly applied (physical volume > volume group > logical volume, each with its previous unique identifier, aka UUID); 3. the file system repaired within the LVM layer.
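For orientation, here is a quick way to inspect each of these layers once it exists. This is only a sketch; the device names (md2, vg0, lv0) match my unit and may differ on yours.

      cat /proc/mdstat               # the raid layer (md0, md1, md2)
      sudo mdadm --detail /dev/md2   # details of the data raid once assembled
      sudo pvdisplay                 # lvm physical volumes
      sudo vgdisplay                 # lvm volume groups
      sudo lvdisplay                 # lvm logical volumes
      sudo file -sL /dev/vg0/lv0     # reports the file system living inside the logical volume

Early on most of these show nothing for md2; they come back to life one by one as each layer is restored.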

 

I spent several hundred unfortunate dollars on file recovery software to prove this point about layering. The best these packages could do was return broken fragments of small text files and images. It seems LVM embeds information blocks WITHIN your files, which effectively corrupts a file when it is read by another system. For example, 96 sectors into a camera JPG, LVM injected 8 sectors of presumed directory information. The picture information was still there but shifted down by these 8 sectors, which has the effect of a serious phase-shift in the middle of the resulting image. Conclusion: none of the automated packages could handle the LVM layer abstraction or even determine the RAID disk order. While neither Disk Internals nor Zar was helpful, a shout-out is warranted for R-Studio’s demo version. R-Studio could not resolve the logical layer either, but its tools were indispensable for determining the RAID configuration.

 

So, let’s detail the process to recover your files in a bit more depth.

 

First, reassemble the RAID 5. The biggest surprise here, proven by a sector-by-sector analysis of the component hard drives, was that the drives were ordered backwards, as mentioned earlier by Cyberblitz. When the create command is used with the correct parameters and order you will not need to “force” it. Required parameters for me included a 64 KB chunk size, the default parity layout (left-symmetric, which mdadm uses when --parity is not specified), and metadata=0.90.
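Before recreating anything you can also check what the old superblocks themselves report (this is the same mdadm --examine check referred to again in the details; sde4 reflects my drive lettering and may differ on yours):

      sudo mdadm --examine /dev/sde4

The output includes the RAID level, chunk size, layout, device order and metadata version recorded in that member’s superblock.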

 

Second, you MUST have the LVM configuration file to get the correct UUIDs for this layer. It seems the whole file system is keyed to the presence of the correct UUID on each of the physical volume, volume group and logical volume. I will go over how this file might be recovered and applied. Specifically, for my unit there should be: a physical volume for md2 which corresponds to pv0; a volume group “vg0”; and a logical volume “lv0”. These names were pulled from the LVM configuration file.

 

Third, if the first two steps are accomplished then the fsck.ext3 command may be able to recover the file system.

 

Finally, mount the new assembly and copy resulting files/folders to a large USB drive.

Please continue reading the next posting for details.


This is #3 of a multipart solution. Please read the entire document for the full solution.

The Details

 

I installed Ubuntu on an old PC, an ASUS A7N8X-E with 3 GB RAM, and added a FastTrak Tx4310 PCI card for extra serial ATA ports. Note: the install went well but would not proceed with a PS/2 keyboard; a USB keyboard worked fine.

 

Carefully label the hard drives 1-4 consistent with the WDSS case numbering and remove them from the case. (Here I copied all four WD Sharespace drives to four nearly identical 1 TB drives so I could work with copies. It was too scary to work directly with my data. Your call on this.) As I found I could not boot the computer with the WD drives in place, I connected them in a hot-swap fashion to the SATA card after boot-up, without difficulty. Avoid disconnecting a drive which is actually “mounted” in Linux; unmount it first.
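Before touching anything it is worth confirming which /dev/sd letter Linux has assigned to each physical drive. This is just a quick sanity check with standard tools, not something specific to this recovery:

      sudo fdisk -l            # lists every drive and its partition table (sdb1-4, sdc1-4, and so on)
      ls -l /dev/disk/by-id/   # the symlink names include each drive's model and serial number

Matching the serial numbers against the labels you wrote on the drives removes any doubt about which disk is which.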

 

Now we are ready to check the status of a few things. I’m going to show the results of a few queries of my initial state so you can compare it to your situation. I have removed the information arising from my Ubuntu system disk as irrelevant. I found that the md2 RAID DataVolume did not exist and the necessary LVM pieces were completely missing. If elements of these are still present on your system then you may not need to perform every part of the following restoration.

 

++++++++++++++++++++++++++++++++++++++++++++++

 

1. Check the existing RAID configurations. Notably absent is md2. (md0 is missing a disk, which might explain why the WDSS would not boot, but it is not important for bringing back my data.)

 

sudo cat /proc/mdstat

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]

md0 : active raid1 sdd1[1] sdc1[2] sdb1[3]

      208768 blocks [4/3] [_UUU]

     

md1 : active raid1 sde2[0] sdd2[1] sdc2[2] sdb2[3]

      1044160 blocks [4/4] [UUUU]

 

2. Check for LVM physical volumes. Absent is any physical volume for md2.

 

scott@ubuntu:~$ sudo pvdisplay

[sudo] password for scott:

   

  “/dev/md0” is a new physical volume of “203.69 MiB”

  — NEW Physical volume —

  PV Name               /dev/md0

  VG Name              

  PV Size               203.69 MiB

  Allocatable           NO

  PE Size               0  

  Total PE              0

  Free PE               0

  Allocated PE          0

  PV UUID               J6oLG5-u3zv-8qH0-H5x8-WOQO-2ih3-g8pfg5

  

  “/dev/md1” is a new physical volume of “1019.69 MiB”

  — NEW Physical volume —

  PV Name               /dev/md1

  VG Name              

  PV Size               1019.69 MiB

  Allocatable           NO

  PE Size               0  

  Total PE              0

  Free PE               0

  Allocated PE          0

  PV UUID               kVCFaA-9n8o-o5nJ-37G7-9eEx-80f0-ziAShZ

 

3. Check for available volume groups. Absent is vg0.

sudo vgdisplay

… only displayed ubuntu system groups. None relating to md0, md1, or md2.

 

4. Check for available logical volumes. Absent is lv0.

sudo lvdisplay

… only displayed ubuntu system volumes. None relating to md0, md1, or md2.

 

++++++++++++++++++++++++++++++++++++++++++++++

 

 

 Please continue to the next posting for additional detail.


This is #4 of a multipart solution. Please read the entire document for the full solution.

 

Recovery

 

1. Secure the LVM configuration data. You are looking for a file which contains references to the correct physical volume UUID (pv0), volume group UUID (vg0) and logical volume UUID (lv0). In a functioning WDSS this file is NOT present on your data partition but rather in the system partition, with backup copies generated by each LVM create event. I’m not sure what folder the primary copy would have occupied, though a backup copy is kept in /etc/lvm/backup and archived copies in /etc/lvm/archive. You will find your old LVM config file in md0, not md2. Archived copies can be helpful as they include the creation date and the event which produced them, which helps sort out which version is really your target. Unfortunately, the directory structure of md0 is likely just as out of reach as that of your DataVolume, so to recover the config file we need to search md0 for a characteristic string from the file. Your final config file will look something like this:

 

************************************************************

 

      # Generated by LVM2 version 2.02.66(2) (2010-05-20): Sat Sep 29 14:38:46 2012

      contents = "Text Format Volume Group"
      version = 1

      description = "Created before executing 'lvremove /dev/vg0/lv0'"

      creation_host = "ubuntu"      # Linux ubuntu 3.2.0-31-generic-pae #50-Ubuntu SMP Fri Sep 7 16:39:45 UTC 2012 i686
      creation_time = 1348947526    # Sat Sep 29 14:38:46 2012

      vg0 {
            id = "hgntSv-jM5G-LnY8-GDn9-2Lkh-C9kH-cSv2k2"
            seqno = 3
            status = ["RESIZEABLE", "READ", "WRITE"]
            flags =
            extent_size = 8192            # 4 Megabytes
            max_lv = 255
            max_pv = 255

            physical_volumes {

                  pv0 {
                        id = "Tz6B09-FDMG-5N7N-2Xjg-tIcM-ceUq-GnqB75"
                        device = "/dev/md2"     # Hint only

                        status = ["ALLOCATABLE"]
                        flags =
                        dev_size = 5851788288   # 2.72495 Terabytes
                        pe_start = 384
                        pe_count = 714182 # 2.72439 Terabytes
                  }
            }

            logical_volumes {

                  lv0 {
                        id = "OMmCHE-IMw3-nyw7-NplM-w5ht-FIsq-raWAkX"
                        status = ["READ", "WRITE", "VISIBLE"]
                        flags =
                        segment_count = 1

                        segment1 {
                              start_extent = 0
                              extent_count = 714182   # 2.72439 Terabytes

                              type = "striped"
                              stripe_count = 1  # linear

                              stripes = [
                                    "pv0", 0
                              ]
                        }
                  }
            }
      }

 

************************************************************

     

I do not know the significance, but I found several copies on my system which were otherwise what I needed but corrupted, and for which I had to repair/replace the five trailing closing brackets.

 

We can find this file with a Linux hex editor such as hexedit. Install it with

 

      sudo apt-get install hexedit

 

Start Hexedit with the following parameter which targets the desired partition, md0:

     

      sudo hexedit /dev/md0

 

If md0 is not intact then just search the first partition on any disk, as they should be mirrors (e.g. sde1).

 

To use hexedit: hit <Tab> to toggle to the ASCII column, then <ctrl-s> to enter search mode. The string to search on is “physical_volumes {”. (Note the underscore between physical and volumes and the space character before the curly bracket.) Once found, start selecting text with <ctrl-spacebar> and extend the selection with the arrow keys. <esc-w> or F7 copies to a buffer (not to the clipboard!). Paste into a specified file with <esc-y>. The buffer must be fairly small; if the resulting target file is empty, select a smaller segment. Repeat this process until you have all the copies of the config file found in md0. I recommend choosing file names that reflect the instance found, i.e. lvm1, lvm2, etc. Use <ctrl-c> to exit hexedit without saving. It is very easy to get confused and edit your data inadvertently.
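If you are comfortable on the command line, an alternative sketch (not part of my original workflow) is to let grep report the byte offsets of the marker string and then pull a window of text out with dd. OFFSET below is a placeholder for one of the numbers grep prints, backed off by a few thousand bytes so the header lines are included:

      sudo grep -a -b -o "physical_volumes {" /dev/md0
      sudo dd if=/dev/md0 of=lvm_candidate.txt bs=1 skip=OFFSET count=16384

Each extracted lvm_candidate file can then be reviewed in a normal text editor just like the hexedit copies.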

 

Now review the saved files with an editor like mousepad (sudo apt-get install mousepad). You are looking for the one with the right creation date or the right event trigger that created it. It needs to have all three relevant UUIDs. Repeating again: I found the LVM file I needed was corrupted, with inserted trash at the end in place of five missing terminal brackets. I edited out the trash and replaced the terminal brackets. If you need to do this but find you cannot save the file, it is likely a permissions issue; Save As to your desktop to get around this.

 

Once you have your best candidate, save the file taking care to include the contents and version lines as above, as these are required elements. Your file should look very similar to my example above, though the exact indentation is not likely critical. You will need to save a copy to your CURRENT system’s protected directory /etc/lvm/backup with the SAME file name as your volume group, e.g. for me “vg0”. The file name is important: if it is not named exactly the same as the volume group it will not be used by the restore command later. Therefore, I used the following to copy from my desktop to my CURRENT system directory:

 

      sudo cp /home/scott/Desktop/yourFileName /etc/lvm/backup/vg0

 

You may confirm it’s there with:

 

      sudo cat /etc/lvm/backup/vg0

So, summarizing this series of steps: you find the LVM config file in the OLD system partition on md0, then you copy and rename it into the CURRENT Ubuntu system’s LVM backup directory.

Please continue to the next posting for additional detail.


This is #5 of a multipart solution. Please read the entire document for the full solution.

2. Recreate the md2 RAID if it is non-existent. Take care to use the --assume-clean parameter so the system will not try to sync your parity data in case you have not entered the parameters correctly. The following command is typed on one line, though it may wrap here.

 

sudo mdadm --create /dev/md2 --chunk=64 --raid-devices=4 --level=5 --assume-clean --metadata=0.9 /dev/sde4 /dev/sdd4 /dev/sdc4 /dev/sdb4

 

Create assembles a new multi-disk device from the listed components. Every critical parameter is specified, including:

 

      --chunk=64    aka the “stride”. The amount of data in KB (1 KB = 1024 bytes) written to a single disk in each row. As each sector is 512 bytes, this means 128 sectors are written to each disk per pass; for a 4-disk set that gives a full row (“stripe”) of 256 KB (512 sectors), of which 192 KB is data and 64 KB is parity.

 

      /dev/md2    Name for the new RAID device.

      --raid-devices=4    Number of disks in the configuration. This must be exactly the same as the original set you are restoring.

      --level=5    Specifies the RAID type, e.g. RAID 5.

      --assume-clean    Very important. This tells the system not to try to correct your devices’ parity information. If you were off on any of your settings, that attempt to fix the parity would overwrite your data.

      --metadata=0.9    I found that the existing superblocks were using version 0.90. This can be seen by running mdadm --examine /dev/sde4 (for example). If not specified, the system would default to Ubuntu’s current 1.2 version with uncertain consequences.

      /dev/sde4 /dev/sdd4 /dev/sdc4 /dev/sdb4    The critical point here is the order and the correct partition number. I am instructing the system to use the 4th partition on each hard drive on my SATA card in the port order 4,3,2,1. If I had lost a hard drive I would substitute the word “missing” at the position of the missing partition (see the example after this list). (Once again, for me my system drive was sda; your letters may be different.)

      --parity=    Sets the RAID 5 parity algorithm. Options are: left-asymmetric, left-symmetric, right-asymmetric, right-symmetric, la, ra, ls, rs. By omitting this parameter I invoked mdadm’s default, which is left-symmetric (this matches the “algorithm 2” shown by /proc/mdstat below). I’ve included this parameter here because another layout might be necessary on other NAS systems.
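For illustration, here is a hypothetical variant of the same create command for the case where the drive supplying sdd4 had died; substitute the word missing at its position and leave everything else unchanged:

      sudo mdadm --create /dev/md2 --chunk=64 --raid-devices=4 --level=5 --assume-clean --metadata=0.9 /dev/sde4 missing /dev/sdc4 /dev/sdb4

RAID 5 can rebuild the missing member’s data from parity, so the degraded array is still readable for recovery purposes.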

 

 

Here is the command and response:

scott@ubuntu:~$ sudo mdadm --create /dev/md2 --chunk=64 --raid-devices=4 --level=5 --assume-clean --metadata=0.9 /dev/sde4 /dev/sdd4 /dev/sdc4 /dev/sdb4

mdadm: /dev/sde4 appears to be part of a raid array:

    level=raid5 devices=4 ctime=Sat Dec 12 02:49:24 2009

mdadm: /dev/sdd4 appears to be part of a raid array:

    level=raid5 devices=4 ctime=Sat Dec 12 02:49:24 2009

mdadm: /dev/sdc4 appears to be part of a raid array:

    level=raid5 devices=4 ctime=Sat Dec 12 02:49:24 2009

mdadm: /dev/sdb4 appears to be part of a raid array:

    level=raid5 devices=4 ctime=Sat Dec 12 02:49:24 2009

Continue creating array? y

mdadm: array /dev/md2 started.

 

 

And just for verification. Note sde4 is in the first, or [0] position:

 

scott@ubuntu:~$ cat /proc/mdstat

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]

md2 : active raid5 sdb4[3] sdc4[2] sdd4[1] sde4[0]

      2925894144 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]

     

md0 : active raid1 sdd1[1] sdc1[2] sdb1[3]

      208768 blocks [4/3] [_UUU]

     

md1 : active raid1 sde2[0] sdd2[1] sdc2[2] sdb2[3]

      1044160 blocks [4/4] [UUUU]

 

This completes restarting the RAID itself. The RAID device is not directly mountable; any attempt to mount it at this point would be frustrated by an inability to identify a proper file system.
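As an extra, optional check that is not part of the original sequence, mdadm can report back the chunk size, layout and device order the new array is actually using:

      sudo mdadm --detail /dev/md2

Confirm that what it reports matches the parameters you intended before moving on to the LVM layer.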

Please continue to the next posting for iinstructions applying the LVM layer.


This is #6 of a multipart solution. Please read the entire document for the full solution.

3. Re-establish the LVM. This is a two-step process. First, declare the appropriate LVM physical volume with the correct UUID. Second, use the restore command, vgcfgrestore, which will leverage the LVM file we saved earlier to do the rest of the work for us.

 

Use pvcreate with the UUID for the physical volume you found in the LVM configuration file. Looking back at the earlier config file, we find this line under the pv0 properties:

 

id = "Tz6B09-FDMG-5N7N-2Xjg-tIcM-ceUq-GnqB75"

 

This is your LVM physical volume UUID. Therefore, your command and response are (the command should be entered on one line, though it may wrap here):

 

sudo pvcreate --uuid "Tz6B09-FDMG-5N7N-2Xjg-tIcM-ceUq-GnqB75" --restorefile /etc/lvm/backup/vg0 /dev/md2

  Couldn't find device with uuid Tz6B09-FDMG-5N7N-2Xjg-tIcM-ceUq-GnqB75.

  Physical volume "/dev/md2" successfully created

 

 

Results can be verified. Note md2 with its correct uuid:

 

scott@ubuntu:~$ sudo pvdisplay

  “/dev/md0” is a new physical volume of “203.69 MiB”

  — NEW Physical volume —

  PV Name               /dev/md0

  VG Name               

  PV Size               203.69 MiB

  Allocatable           NO

  PE Size               0  

  Total PE              0

  Free PE               0

  Allocated PE          0

  PV UUID               J6oLG5-u3zv-8qH0-H5x8-WOQO-2ih3-g8pfg5

  

  “/dev/md1” is a new physical volume of “1019.69 MiB”

  — NEW Physical volume —

  PV Name               /dev/md1

  VG Name              

  PV Size               1019.69 MiB

  Allocatable           NO

  PE Size               0  

  Total PE              0

  Free PE               0

  Allocated PE          0

  PV UUID               kVCFaA-9n8o-o5nJ-37G7-9eEx-80f0-ziAShZ

  

  “/dev/md2” is a new physical volume of “2.72 TiB”

  — NEW Physical volume —

  PV Name               /dev/md2

  VG Name              

  PV Size               2.72 TiB

  Allocatable           NO

  PE Size               0  

  Total PE              0

  Free PE               0

  Allocated PE          0

  PV UUID               Tz6B09-FDMG-5N7N-2Xjg-tIcM-ceUq-GnqB75

 

 

Next, run the restore command. (This command looks for the configuration file we saved earlier in /etc/lvm/backup; the config file must have the same name as the specified volume group, e.g. vg0.)

 

scott@ubuntu:~$ sudo vgcfgrestore vg0

  Restored volume group vg0

 

That was easy. Let’s verify the result. Note the output of vgdisplay and lvdisplay, particularly the associated UUIDs. In this one step you have correctly reconfigured both the volume group and the logical volume with their correct properties and UUIDs. This saves you from having to run separate vgcreate and lvcreate commands, or from trying to work out the “extents” listed under Total PE.

 

scott@ubuntu:~$ sudo vgdisplay

  — Volume group —

  VG Name               vg0

  System ID            

  Format                lvm2

  Metadata Areas        1

  Metadata Sequence No  3

  VG Access             read/write

  VG Status             resizable

  MAX LV                255

  Cur LV                1

  Open LV               0

  Max PV                255

  Cur PV                1

  Act PV                1

  VG Size               2.72 TiB

  PE Size               4.00 MiB

  Total PE              714182

  Alloc PE / Size       714182 / 2.72 TiB

  Free  PE / Size       0 / 0  

  VG UUID               hgntSv-jM5G-LnY8-GDn9-2Lkh-C9kH-cSv2k2

 

scott@ubuntu:~$ sudo lvdisplay

   — Logical volume —

  LV Name                /dev/vg0/lv0

  VG Name                vg0

  LV UUID                OMmCHE-IMw3-nyw7-NplM-w5ht-FIsq-raWAkX

  LV Write Access        read/write

  LV Status              available

  # open                 0

  LV Size                2.72 TiB

  Current LE             714182

  Segments               1

  Allocation             inherit

  Read ahead sectors     auto

  - currently set to     768

  Block device           252:2

 

This concludes restoring the LVM logical layer. We are now ready to repair the file/folder structure.

 

4. Repair the directory structure. This is done using fsck.ext3 with the VOLUME GROUP/LOGICAL VOLUME AND NO SPECIFIED FILE TYPE. So for my instance I will reference /dev/vg0/lv0.

 

It is a really good sign if you see “NASRAID: recovering journal”. The volume on the WDSS is labeled NASRAID, and recovering the journal implies the system has found directory information to build on. Pass 1 could take more than an hour; go get a cup of coffee.

The parameter “-y” tells the command to assume yes to all confirmations. You almost certainly want to do this; if you do not, you will be standing there with your finger on the y key through a thousand confirmations!
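Optionally, and not something I did originally, you can run a read-only trial pass first so fsck reports problems without changing anything:

      sudo fsck.ext3 -n /dev/vg0/lv0

The -n switch opens the file system read-only and answers no to every prompt; the real repair run with -y follows below.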

 

scott@ubuntu:~$ sudo fsck.ext3 -y /dev/vg0/lv0

e2fsck 1.42 (29-Nov-2011)

NASRAID: recovering journal

NASRAID has gone 1027 days without being checked, check forced.

Pass 1: Checking inodes, blocks, and sizes

Pass 2: Checking directory structure

Pass 3: Checking directory connectivity

/lost+found not found.  Create? yes

 

Pass 4: Checking reference counts

Pass 5: Checking group summary information

 

NASRAID: FILE SYSTEM WAS MODIFIED

NASRAID: 33539/365674496 files (8.8% non-contiguous), 113033843/731322368 blocks

 

This concludes repair of the file system. You are ready to mount the drive.

 

5. Mount the drive. First, create a mount point, a directory to which the new file system will be linked. I’m using /mnt and adding an arbitrarily-named subdirectory, “recovered”.

 

Make the new directory.

scott@ubuntu:~$ sudo mkdir /mnt/recovered

 

Now, mount your recovered drive using the lvm device to the new directory.

scott@ubuntu:~$ sudo mount /dev/vg0/lv0 /mnt/recovered

No real acknowledgement…

 

You should now be able to view and explore your restored files under the /mnt/recovered directory.
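To get the files onto a large USB drive (the final step mentioned back in post #2), something like the following will do; /dev/sdf1 and /mnt/usb are placeholders for whatever your USB drive and chosen mount point actually are:

      sudo mkdir /mnt/usb
      sudo mount /dev/sdf1 /mnt/usb
      sudo rsync -av /mnt/recovered/ /mnt/usb/

rsync can be re-run safely if the copy is interrupted; cp -a works just as well if you prefer it.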

 

This concludes the primary recovery of a WD Sharespace as an example of a Linux-based NAS. The following discussion is really an appendix covering related matters.

Please continue to the next posting for additional helpful topics:

      Cloning healthy and unhealthy drives     

      Determining Raid Parameters when not known

      Correcting mistakes

     


This is #7 of a multipart solution. Please read the entire document for the full solution.

Copying Your Hard Drives

 

It should be considered a best practice not to work directly with the hard drives containing the data you are trying to preserve. The better workflow is to make a sector by sector copy or image.

 

I found the Bytecc T-203 HDD/SSD stand-alone duplicator worked fine for healthy source drives; a 1 TB drive took 4 hours to copy. This does not work if your source has damaged sectors.

 

One of my source drives had 4 damaged sectors. This was best copied with the Linux dd command.

 

sudo dd if=/dev/sdc of=/dev/sde bs=4096 conv=noerror,sync

 

Use dd carefully so you keep your input and output straight. In the command above, hard drive sdc will be copied to hard drive sde. The “bs” option sets a reasonable block size: too small a block size makes for longer processing time, while too large a block size loses more data adjacent to damaged areas. conv=noerror allows processing to continue even if errors are encountered, for example when bad sectors are found; “sync” pads output blocks when necessary. This process took about 6 hours to clone a 1 TB hard drive, and it does not give interim updates.
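One small addition: newer versions of GNU dd (coreutils 8.24 and later, so newer than what I was running) accept a status=progress option, which addresses the lack of interim updates:

      sudo dd if=/dev/sdc of=/dev/sde bs=4096 conv=noerror,sync status=progress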

 

 

How to Determine Raid Parameters When LVM Present

 

To determine your RAID configuration manually you must first read up on RAID 5 parity arrangements. You need to understand the differences between left-asymmetric, left-symmetric, right-asymmetric, and right-symmetric. Here a picture is worth a thousand words, so check out the illustrations on sites like Wikipedia. These concepts are not as difficult as you might first think. BTW: here R-Studio shined above all other disk tools.

 

Your data is saved as a stream of information poured from one disk to the next in a predetermined, consistent measure called a “stride”. In a four-disk RAID 5 your data will be written to a maximum of 3 disks at the same offset, or distance from the beginning. The fourth write at that offset will be a parity of the other three disks, which means the 4th stride of data will be written on the next row, or offset. You are trying to determine how long the stride is and in what order the stride proceeds through the hard drives. The order of that procession determines not only the physical disk order in the RAID but also the parity arrangement named above.

 

You really need a known file for comparison purposes. Easiest is to use a JPG file that you know is on the broken RAID and is also still on your laptop or a CD.

 

The workflow I found easiest was to use R-Studio in demo mode on a Windows computer. On opening you will see all the partitions of the unmounted hard drives. Select to create a new “block raid” drive. Drag the data partitions over one by one, putting them in the order you feel most promising. Set the block size to 128 sectors (64 KB) or 256 sectors (128 KB), presuming 512 bytes per sector. Set the RAID type to RAID 5. Select the new virtual RAID in the left panel and choose to scan with the save-to-file option. This process will take about 36 hours for the 4 TB WDSS. BTW, loading from the saved file is also a very long process, taking about 12 hours.

 

This first scan will allow you to determine the offset where most of your data can be found. Knowing the offset of the bulk of your files will allow you to rescan subsequent trials over much more confined regions of 1 GB or so.

 

With results found, start looking through the JPG files for any which can be opened. Distorted but recognizable pictures were most often found in files of roughly 8 KB to 20 KB in size; larger files were too corrupt to open and smaller ones were meaningless. Find a picture you recognize and for which you have an original. (I do not recommend opening text files much: a bug would occasionally crash R-Studio when opening them, which can cost you a lot of scanning/loading time.)

 

Once you have found a known image you are ready for the tedious detective work. Put a copy of the known picture on your system drive and open it with R-Studio’s view/edit hexadecimal reader in one window. Open the found image on the virtual RAID in a separate view/edit window. The RAID version will show not only the byte content but also which hard drive each sector is pulled from. Select a lead byte and a right-mouse option will take you to the view/edit window for that source hard drive. March through the source hard drive comparing it to your known image. Remember LVM embeds sectors in your file which can shift the expected content down. Use bookmarking liberally and the search function often. If my file stopped unexpectedly, I might put the next 4 or 5 expected bytes into the search and let R-Studio find the next occurrence.

 

Hints:

 

1.  While your file might begin in the middle of a stride, the next section should fill the next stride from top to bottom. If it does not, then the starting offset of the whole virtual RAID may be off. My system had an offset of 512 sectors for the virtual array in R-Studio. (This offset was not used on the Linux side when creating the RAID array.)

 

2. You should quickly find a consistent length written to each disk each turn. This length is your stride.

 

3.  The stripe-width is the length of data written in one row. It is the stride x (number of disks - 1). Changing the offset of the whole virtual array by multiples of the stripe-width has the effect of changing which hard drive’s sector is assumed to hold the parity value.

 

You are done when you can specify the stride length in sectors, the order of the disks, and the parity layout (for my WDSS the mdadm default, left-symmetric, turned out to be correct).
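To make the arithmetic concrete with the WDSS values used above:

      stride       = 64 KB = 128 sectors (at 512 bytes per sector)
      data per row = 64 KB x (4 disks - 1) = 192 KB
      full row     = 192 KB of data + 64 KB of parity = 256 KB = 512 sectors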

 

If LVM was not used in your system then you may be able to pay for R-Studio and recover your files with it. If you know LVM is involved, don’t bother; follow the instructions above instead.

 

 

What If You Make a Mistake?

 

Make mistakes on copies of your data rather than your real data.

 

All the LVM pieces can be removed using the associated remove commands, in the reverse order they were applied. Be aware that creating volume groups and logical volumes does overwrite existing LVM metadata, so if you have inadvertently overwritten important metadata, remember to check the current system’s /etc/lvm/archive for old copies (see the listing sketch after the commands below). These copies are often created before and after create commands.

 

sudo umount /dev/vg0/lv0

sudo lvremove /dev/vg0/lv0

sudo vgremove /dev/vg0

sudo pvremove /dev/md2
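To see which archived copies of the LVM metadata the CURRENT system is holding (a handy check; vg0 assumes the same volume group name as above):

      sudo ls -l /etc/lvm/archive
      sudo vgcfgrestore --list vg0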

 

You can tear down the raid with

 

sudo mdadm --stop /dev/md2

 

The biggest problems occur when writing to your data disks. Do not use mdadm --create without --assume-clean, as the system immediately starts to “fix” parity, which means writing to the disks. Do not use fdisk, which was not covered here, without a clear understanding of its parameters.

