DataVolume doesn't exist! message

Well, thanks Nathan_H for the instructions on how to completely clean the drives so that I could rebuild a new array. This worked and it's now back up and running fine (for now).

I will NOT make the mistake again of confusing a RAID 5 for a backup. I now have my data stored on two completely separate items of hardware, and am still hurting from the cost of data recovery. If only WD could create something more than Diagnostics, like a proper recovery system or even a NAS that didn't cause so many headaches… then I might recommend it to others. Overall, it's a thumbs down to WD from me.

Everyone,

I've suffered too much from the WD ShareSpace running RAID 5: it failed on me almost once every two months and required a rebuild or a very time-consuming recovery. I now use disk striping and have not had an issue for almost a year. My issue was that a power cut would immediately break the RAID 5, and an automatic rebuild of the drive would then occur, taking almost a day. With disk striping, I assume there is no inconsistency to resolve when the power is interrupted mid-write, so it just continues from there.

The probability of a power cycle is much higher than the probability of a drive failure, for me at least.

I still back up to other media just in case.

Well, isn’t this an interesting thread!

Can't say I've had many problems with our 8TB device on RAID 5, except that every month or two it would tell me one of the drives was missing, even though SMART always showed it as good. Clean it, pop it back in and it would rebuild, no problem.

However, after it last happened and a discussion with WDC support, we decided to RMA the drive. The new 2TB disk duly arrived, I popped it in and it started to rebuild. THEN, shock horror, after the 60% complete message it all went to poop, and now I find myself with the "doesn't exist" issue - albeit under slightly different circumstances to what others are seeing here.

After reading this and other threads carefully, I started down this path:

~ $ mdadm --assemble -f /dev/md2 /dev/sd[abcd]4
mdadm: forcing event count in /dev/sdd4(3) from 4283834 upto 4283840
mdadm: clearing FAULTY flag for device 3 in /dev/md2 for /dev/sdd4
mdadm: /dev/md2 has been started with 3 drives (out of 4) and 1 spare.
~ $ pvcreate /dev/md2
  No physical volume label read from /dev/md2
  Physical volume “/dev/md2” successfully created
~ $ vgcreate lvmr /dev/md2
  Incorrect metadata area header checksum
  Incorrect metadata area header checksum
  Incorrect metadata area header checksum
  Volume group “lvmr” successfully created
~ $ lvcreate -l 714329 lvmr -n lvm0
  Incorrect metadata area header checksum
  Logical volume “lvm0” created
~ $ fsck.ext3 /dev/lvmr/lvmr0
e2fsck 1.38 (30-Jun-2005)
fsck.ext3: while trying to open /dev/lvmr/lvmr0

The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>

fsck.ext3:
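(A side note on that e2fsck -b suggestion, since 8193 only applies to filesystems with 1k blocks: these are read-only ways of finding where the backup superblocks actually are. Just a sketch, assuming the lvmr/lvm0 naming used above:)

dumpe2fs /dev/lvmr/lvm0 | grep -i superblock   # lists backup superblock locations, if the primary metadata is readable
mke2fs -n /dev/lvmr/lvm0                       # -n is a dry run: prints where backups would live, writes nothing
e2fsck -b 32768 /dev/lvmr/lvm0                 # then retry fsck against one of the reported backups (32768 is typical for 4k blocks)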

Despite it not looking good, it seems to be “recovering” something according to the mdadm detail output:

~ $ mdadm -D /dev/md2
/dev/md2:
        Version : 00.90.01
  Creation Time : Fri Apr  3 13:54:17 2009
     Raid Level : raid5
     Array Size : 5854981248 (5583.75 GiB 5995.50 GB)
    Device Size : 1951660416 (1861.25 GiB 1998.50 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Wed Apr 27 11:41:01 2011
          State : clean, degraded, recovering
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

 Rebuild Status : 13% complete

           UUID : b2c561ae:9acfcff0:2594d787:fb4eb047
         Events : 0.4283848

    Number   Major   Minor   RaidDevice State
       0       8        4        0      active sync   /dev/sda4
       4       8       20        1      spare rebuilding   /dev/sdb4
       2       8       36        2      active sync   /dev/sdc4
       3       8       52        3      active sync   /dev/sdd4

I guess I’ll wait and see what happens, but not very optimistic right now.

If anyone has anything to contribute to my situation, please feel free to do so! :neutral_face:

Thanks in advance.


I’m no expert at all (I ended up forking out for a professional recovery), but your situation looks OK to me…

Perhaps you made a typo at:

$ fsck.ext3 /dev/lvmr/lvmr0

Perhaps it should have been:

$ fsck.ext3 /dev/lvmr/lvm0
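(And if in doubt about the exact device name, a quick way to double-check it - just a sketch, assuming the volume group really is called lvmr:)

ls -l /dev/lvmr/    # the logical volumes appear as entries under the volume group's directory
lvdisplay           # or list every logical volume with its full path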

Good luck!


You are giving me hope, guys (I lost 8 TB of data, RAID 5).

I have created the partitions according to your useful feedback.

I am starting the file check, but it seems to be taking forever (and it's only in Pass 1). Is this normal?

After:

fsck.ext3 /dev/lvmr/lvm0

e2fsck 1.38 (30-Jun-2005)

Couldn’t find ext2 superblocks, trying backup blocks…

ext3 recovery flag is clear but journal has data.

Recovery flag not set in backup superblock, so running journal anyway.

NASRAID: recovering journal

Pass 1: Checking inodes, blocks and sizes

… and it seems not to be moving anymore (30 minutes already). Is this normal?

Regards,
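(A side note for anyone in the same position: e2fsck can be asked to display progress, which makes "is it stuck?" much easier to answer. A sketch, assuming the same device path:)

fsck.ext3 -C 0 /dev/lvmr/lvm0    # -C 0 prints a completion/progress bar as the check runs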

Well, it’s not looking good.

Twice now it’s failed during the rebuild process, leaving me with:

~ $ mdadm -D /dev/md2
/dev/md2:
        Version : 00.90.01
  Creation Time : Fri Apr  3 13:54:17 2009
     Raid Level : raid5
     Array Size : 5854981248 (5583.75 GiB 5995.50 GB)
    Device Size : 1951660416 (1861.25 GiB 1998.50 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Fri Apr 29 14:17:52 2011
          State : clean, degraded
 Active Devices : 2
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : b2c561ae:9acfcff0:2594d787:fb4eb047
         Events : 0.4283852

    Number   Major   Minor   RaidDevice State
       0       8        4        0      active sync   /dev/sda4
       1       0        0        1      removed
       2       8       36        2      active sync   /dev/sdc4
       3       0        0        3      removed

       4       8       20        -      spare   /dev/sdb4
       5       8       52        -      faulty spare   /dev/sdd4

Any suggestions? (And thanks for the typo pickup earlier…)

…just as an update, I keep trying to re-assemble and it keeps failing, although it made it to over 80% complete last time.

Considering updating the firmware as there appears to be a raid build fix in the newer versions, but still don’t want to risk damaging my data as part of the process.

So, I'm sitting here right now waiting for my ShareSpace to recover. After working flawlessly for about a year it started running very slowly, but I kept postponing the investigation of why. Finally I decided to shut the ShareSpace down. After waiting 12+ hours I just removed the power to shut it down; holding the off button, no matter how long, wouldn't stop it, and neither would the web interface. BIG MISTAKE!!!

When it came back online all the lights were amber, i.e. all drives failed. After a desperate search and much trial and error, and especially with the help in this forum, I was able to get three of the drives to come online as a degraded RAID 5 and then add the 4th drive, which had been kicked out of the RAID due to corruption. Now everything is recovering, but the ShareSpace tells me that my 90%-full volume will take about 10 days(!) to recover.

If this ends up being successful I'll post more complete information about what I did (if it's not, you'll be able to read an article in the SF Chronicle about a WD customer who jumped off the Golden Gate Bridge after losing terabytes of data, including a digitized music collection of thousands of irreplaceable albums and CDs for which the originals are no longer available).

But as others have mentioned, the ShareSpace is built using a very basic version of Linux (BusyBox) and the Linux software-RAID tool mdadm. This is actually a very sensible, and powerful, way to build a turnkey RAID system. Unfortunately WD's documentation of the actual underlying system is terrible and their web admin tools are atrocious. However, if you read up on mdadm and know a bit of Linux shell scripting, you should be able to recover from almost any situation, so long as you can log in to the box via SSH and you don't inadvertently 'restore' your system to factory (wiped) settings.

As mentioned elsewhere,

cat /proc/mdstat
mdadm
dmesg
logs in /tmp

are your friends.
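Roughly how they get used (a sketch; /dev/md2 and /dev/sda4 follow the naming used elsewhere in this thread and may differ on your unit):

cat /proc/mdstat      # one-line status of every md array, including any rebuild progress
mdadm -D /dev/md2     # detailed state of the data array: member disks, event counts, sync %
mdadm -E /dev/sda4    # examine the RAID superblock on an individual member partition
dmesg | tail -n 50    # recent kernel messages - disk errors tend to show up here first
ls /tmp               # the ShareSpace keeps some of its logs under /tmp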

@whydididothat - don’t suppose you know how to get mdadm to rebuild / resync as much of an array as it can??!!

It seems that in my case, the reason it was failing to re-assemble the array was ANOTHER failing drive!! Now, when I force the rebuild, it won't even get to 1% sync before it quits.

This leaves me with two drives from the array that are OK, and a new (previously RMA'd) drive that had reached >80% resync before the other drive stopped it - i.e. two good drives and one new drive with roughly 80% sync.

I'd taken each drive out, attached it through eSATA to my PC and run the WD Data Lifeguard tools - all showed SMART OK, but even the most basic test reports a problem. An extended test reveals bad sectors, and a "repair" didn't help.
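(For anyone checking drives from a Linux live CD instead, smartmontools gives roughly the same information. A sketch; replace sdX with the actual drive letter:)

smartctl -a /dev/sdX          # full SMART attributes - watch Reallocated_Sector_Ct and Current_Pending_Sector
smartctl -t long /dev/sdX     # start the drive's extended (long) self-test
smartctl -l selftest /dev/sdX # read the self-test log once it has finished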

I'm preparing myself for the fact that the data is now entirely lost. Luckily for me it's not the end of the world, as there are no personal photos etc. - just work data.

Cheers,

I should probably reiterate that not only am I not an expert, but I may well know less than I think I do, i.e. I won't know whether I've actually recovered my system for another 10 days.

However, if two of your four hard drives are truly, physically bad, then you're probably out of luck, at least without the specialized tools of a data recovery expert. But having two hard drives fail at the same time, on a device used for a home office or personal entertainment, is extremely unlikely. It's normal for drives to have bad sectors, so just because you noticed that on a low-level scan doesn't mean the drive is bad. So, the first question is: how did you determine that your hard drives were bad? And what happens when mdadm tries to assemble your RAID using the original drives?

The WD Data Lifeguard tools say it's bad… It fails the basic test, and the low-level "extended test" shows a lot more errors.

Perhaps the first drive wasn't faulty - but every month or so the WD ShareSpace would tell me HDD3 was "absent", which obviously degraded the array. If I "cleaned" the disk and rebooted, the array would rebuild. After consulting with WDC support, they suggested I RMA the drive, which I did. When the new drive turned up I popped it in, it started to re-sync, and all was well 'til somewhere over 80% complete. THEN (I won't go into the specifics now) I ended up with the "doesn't exist" message, tried to force the rebuild, and it fails at some point, showing this in mdadm -D:


    Number   Major   Minor   RaidDevice State
       0       8        4        0      active sync   /dev/sda4
       1       0        0        1      removed
       2       8       36        2      active sync   /dev/sdc4
       3       0        0        3      removed

       4       8       20        -      spare   /dev/sdb4
       5       8       52        -      faulty spare   /dev/sdd4

Where sdd4 is HDD1. I've seen it reach as high as 80% on the rebuild again, but now it's consistently failing before even 1% - consistent with a failed drive.

It's possible both drive failures have been caused by overheating - but who knows…

When the drives get rebuilt again (when the RMA replacement arrives), it's unlikely to be a RAID 5 config!!

@Footleg

mdadm is indeed showing the drive as faulty (which, as you've stated, you already confirmed using a diagnostic tool).

But you should be fine; mdadm is showing that you have three good drives (two active and a spare). You could check whether everything is OK with them by doing the following:

mdadm --stop /dev/md2
mdadm --assemble --update=resync /dev/md2 /dev/sda4 /dev/sdb4 /dev/sdc4

But that shouldn’t be needed; once you get the new drive and replace /dev/sdd4 with it, mdadm should automatically resync everything.
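(For completeness, the manual way to swap the failed member once the replacement is in would look roughly like this - just a sketch, assuming the new disk comes up as sdd with the same partition layout; the web UI should achieve the same thing:)

mdadm /dev/md2 --remove /dev/sdd4    # drop the faulty member from the array
# ...physically swap in the new, partitioned drive, then:
mdadm /dev/md2 --add /dev/sdd4       # add it back; the resync should start on its own
cat /proc/mdstat                     # keep an eye on the rebuild progress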

@whydididothat

Thanks for the response - I'll check those options out, but my concern is that I think the "spare" drive is the previous RMA replacement that had not fully re-synced before the other one packed up - i.e. I think two are good, one has quite a large amount of sync (>60%, possibly >80%) but is not complete, and the other is faulty.

Will see what happens.  Thanks again.

Thanks to everyone in this thread for sharing their experience. My situation is as follows:

WD ShareSpace with firmware ver. 2.2.90 and MioNet 4.3.0.8.

It has four disks of 1 TB each with RAID 5 configured and only one volume. I was unable to access the data on it. When I tried accessing it over the network, it showed all the shares but would not let me into them.

During a data copy process I suppose it reached full space utilization, and then "DataVolume" became unavailable. We tried restarting it, but with no success. Currently it is showing the alert "[Volume Status] Data Volume 'DataVolume' doesn't exist!", while all four disks show a "Failed" status and the volume shows as Unassigned.

I feel lucky that this thread helped me a lot; a combination of things worked for me. Mainly I followed @Footleg's instructions:

~ $ mdadm --assemble -f /dev/md2 /dev/sd[abcd]4
mdadm: forcing event count in /dev/sdd4(3) from 4283834 upto 4283840
mdadm: clearing FAULTY flag for device 3 in /dev/md2 for /dev/sdd4
mdadm: /dev/md2 has been started with 3 drives (out of 4) and 1 spare.
~ $ pvcreate /dev/md2
  No physical volume label read from /dev/md2
  Physical volume “/dev/md2” successfully created
~ $ vgcreate lvmr /dev/md2
  Incorrect metadata area header checksum
  Incorrect metadata area header checksum
  Incorrect metadata area header checksum
  Volume group “lvmr” successfully created
~ $ lvcreate -l 714329 lvmr -n lvm0
  Incorrect metadata area header checksum
  Logical volume “lvm0” created
~ $ fsck.ext3 /dev/lvmr/lvm0

Some of the output from these commands differed from @Footleg's output above. Sadly, during the whole process I did not save the output for you lovely people, but there were only small differences from what is shown above. The last command, fsck.ext3, did give a couple of lines saying it had initiated something, then my cursor just kept blinking. I waited for more than 50 minutes and, to check the status, cancelled it with Ctrl+C. When I then checked the status it showed three drives healthy and the first disk as dead (which seemed much better than four dead). When I ran mdadm -D /dev/md2 it showed the same three fine and one dead drive, but no percentage against Rebuild Status (in fact that line was not there at all) and a State of "clean, degraded" with no mention of rebuilding. I think that was down to my cancelling the process.

I did try running the fsck.ext3 command again, but it said something was busy. I then tried changing lvmr to lvmr1 as suggested in another reply to this thread, but it did not work.

Finally I took all the drives out, numbering them 1 to 4 from bottom to top, and placed them in a PC, connecting the SATA cables from SATA 0 to SATA 3 in order. I downloaded Fedora Desktop 14 Live, burnt it to CD and booted the system from it. When it reached the desktop it gave an error in the top-right bar saying I had one disk with serious bad sectors. I went through this message, which showed the disk problem status; under multi-disk devices I clicked and selected to mount the volume (it was named lvmr, as tried earlier through SSH, so I believe that played a positive part as well). I have copied some 30 GB of data from it and it seems fine. Further copying is in progress now. Thank you all for bringing back the dead data that I thought was lost.
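(For anyone who prefers a terminal to the desktop prompts, the steps the live system performs are roughly these - a sketch only, assuming the lvmr/lvm0 names used earlier in the thread:)

mdadm --assemble --scan                       # find and start any md arrays present on the disks
cat /proc/mdstat                              # confirm the big RAID 5 array came up
vgscan && vgchange -ay                        # detect and activate the LVM volume group (lvmr)
mkdir -p /mnt/datavolume
mount -o ro /dev/lvmr/lvm0 /mnt/datavolume    # mount read-only and start copying data off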

On WD's part, they cannot be serious about this type of support. Believe me, problems happen with even the most stable products, but there has to be a good response and some resolution. I have no knowledge of Linux, yet you people threw in options which I tried and succeeded with; I would hope WD could do this little thing as well. After all, it is their product, and I'm not sure whether the company's directors or owners have read through the community posts here. Officially I am waiting for their reply to the email I submitted yesterday.

Hmm, well, I've finally got my new RMA drive (I was delayed sending the old one back) and plugged it in, but the mdadm --assemble -f command says there's no such device sdd4, and no superblock.

Any ideas what I need to do to initialise the new drive for mdadm?

It shows as “new” in the web admin tools.

The minimum amount of information anyone would need in order to give personalized advice was in the <blah> part.  :slight_smile:

If the drive shows as “new” in the web admin tools, then it should rebuild automatically. I’d make a backup copy of your data on the ShareSpace and then use the web admin tool to add the disk back to the array.

Thanks Nathan,

However, my post was in the context of my previous posts in this thread - where you’ll find all relevant pieces and understand why it won’t start the rebuild automatically. Apologies for not re-inserting…

But, for the record, the <blah> part is:

~ $ mdadm --assemble -f /dev/md2 /dev/sd[abcd]4
mdadm: cannot open device /dev/sdd4: No such device or address
mdadm: /dev/sdd4 has no superblock - assembly aborted
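(For anyone else in the same spot, a quick way to see whether the kernel even has partitions for the new disk - just a sketch:)

cat /proc/partitions     # a blank replacement disk will show sdd but no sdd1..sdd4 entries
fdisk -l /dev/sdd        # list (or confirm the absence of) a partition table on the new disk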


For additional info:

~ $ pvdisplay
  Incorrect metadata area header checksum
  /dev/sdd3: open failed: No such device or address
  /dev/sdd4: open failed: No such device or address
  Incorrect metadata area header checksum
  /dev/sdd3: open failed: No such device or address
  /dev/sdd4: open failed: No such device or address
  Incorrect metadata area header checksum
  Incorrect metadata area header checksum
  Incorrect metadata area header checksum
  /dev/sdd3: open failed: No such device or address
  /dev/sdd4: open failed: No such device or address
  — NEW Physical volume —
  PV Name               /dev/sda4
  VG Name
  PV Size               5.45 TB
  Allocatable           NO
  PE Size (KByte)       0
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               yFd455-sK3y-Zvcl-oBhn-eGAI-FsmA-3ScJD4

  — NEW Physical volume —
  PV Name               /dev/sdb4
  VG Name
  PV Size               5.45 TB
  Allocatable           NO
  PE Size (KByte)       0
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               QnU0sF-Slnc-eBAH-TBHT-GwXp-zPM8-puJCwO

Just by way of an update: I had to use fdisk to create a partition structure on the new disk to match the others. Then the superblock was still missing for sdd4, so I found out you can use mdadm --create to create a new RAID, but it's allegedly smart enough to realise there's an existing RAID configured, and thus the data *may* not be overwritten.

sdd should look like this (8TB WS):

Disk /dev/sdd: 2000.3 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sdd1               1          26      208813+  fd  Linux raid autodetect
/dev/sdd2              27         156     1044225   fd  Linux raid autodetect
/dev/sdd3             157         182      208845   fd  Linux raid autodetect
/dev/sdd4             183 18446744073709527469 18446744073514118086+  fd  Linux raid autodetect

Then I tried:

mdadm --assemble -f /dev/md2 /dev/sd[abcd]4
mdadm: no RAID superblock on /dev/sdd4
mdadm: /dev/sdd4 has no superblock - assembly aborted

Then:

mdadm --create /dev/md2 --verbose --level=5 --raid-devices=4 --spare-devices=0 /dev/sd[abcd]4
mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 64K
mdadm: /dev/sda4 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Fri Apr  3 13:54:17 2009
mdadm: /dev/sdb4 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Fri Apr  3 13:54:17 2009
mdadm: /dev/sdc4 appears to be part of a raid array:
    level=raid5 devices=4 ctime=Fri Apr  3 13:54:17 2009
mdadm: size set to 1952050048K
Continue creating array? y
mdadm: array /dev/md2 started.

Then:

/ $ cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5]
md1 : active raid1 sdc2[2] sdb2[1] sda2[0]
      1044160 blocks [4/3] [UUU_]

md2 : active raid5 sdd4[4] sdc4[2] sdb4[1] sda4[0]
      5856150144 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
      [>…]  recovery =  4.1% (80921856/1952050048) finish=1865.2min speed=16718K/sec
md0 : active raid1 sdc1[2] sdb1[1] sda1[0]
      208768 blocks [4/3] [UUU_]

unused devices: <none>

…wonder if there will be any data when the "recovery" has finished. I somehow doubt it, but I'll let you know whether my stumbling around ultimately yields a positive outcome.

Thanks for that - sometimes these details end up changing, and if you are successful the extra info might be helpful to the next person who comes along.

The ShareSpace should be able to add the new drive back into the array and rebuild if it is completely blank (all zeros - a quick erase doesn't count, for some reason). The mdadm method does need the disk to be partitioned properly, and I'd usually do this with another command that saves the partition layout from one of the good disks and then applies it to the new one - I can't recall which command just yet.
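(One likely candidate is sfdisk - just a sketch, not necessarily the command I was thinking of, and double-check the device letters before running it, since getting them backwards would overwrite a good disk:)

sfdisk -d /dev/sda > sda-table.txt    # dump the partition layout of a known-good member disk
sfdisk /dev/sdd < sda-table.txt       # write the same layout onto the blank replacement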

I’ll be crossing my fingers for you, although if the array status shows three disks as “up”, it looks as though it may have kept the existing configuration.  My favorite command for this sort of thing is ‘watch’, as in

watch -n 60 cat /proc/mdstat

 which will show the RAID array status and update it every 60 seconds (until you press Ctrl-C).

Well, by way of an update, there's actually some good news!

After the re-sync completed (successfully for the first time!!), and after following some more steps and a little more mucking around (as per @macwolf on page 4 of this thread), I was able to mount my DataVolume and recover SOME of my data.

/DataVolume was back (although not according to the web admin pages - I could map it!). However, one of my shares did appear, but its data was NOT actually there.

I'd skipped the fsck step because it was erroring, so after a reboot (and a repeat of @macwolf's steps) I tried to run fsck, with the following results:

/dev $ fsck.ext3 /dev/lvmr/lvm0
e2fsck 1.38 (30-Jun-2005)
The filesystem size (according to the superblock) is 1463744512 blocks
The physical size of the device is 731472896 blocks
Either the superblock or the partition table is likely to be corrupt!
Abort<y>?

I said "N" to this, and it started scanning the inodes quite happily until many of these forced me to quit:

Error reading block 731513067 (Invalid argument) while doing inode scan.  Ignore error<y>?
yes

Force rewrite<y>?
yes

Any ideas on the following?

  1. How do I resolve this superblock/size mismatch issue? (See the sketch after this list for one way to investigate.)

  2. If I can solve it, will running fsck possibly bring my other data back?

  3. Should I just quit now, "clean" the drives and completely re-initialise the RAID through the admin pages?!
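On question 1 (a sketch only - I haven't verified this on a ShareSpace): the fsck message says the filesystem thinks it is roughly twice the size of the device it now sits on, which is what you would expect if the re-created logical volume came out smaller than the original - for instance if the manual vgcreate/lvcreate earlier in the thread used a smaller default extent size than the factory setup did. Comparing the sizes costs nothing:

mdadm -D /dev/md2 | grep "Array Size"   # usable size of the RAID 5 array
vgdisplay lvmr                          # total and free physical extents (PE) in the volume group
lvdisplay /dev/lvmr/lvm0                # current size of the logical volume
# If the LV is smaller than the VG allows, growing it is non-destructive, e.g.
# add however many free extents vgdisplay reported:
lvextend -l +<free_PE_count> /dev/lvmr/lvm0

Whether that would then let fsck find a sane superblock, I honestly can't say.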

I guess the good news is I seem to have got a large amount of my data back and copied off the drive…

Couldn’t have done it without this thread and others!