Bricked after upgrade, restored manually, very slow, lost shares, etc

Marooned · May 7, 2015, 8:14pm

Hi there

I have had a MyCloud 3TB for a year and have upgraded it only once, to 03.04.01-230. It was working without any problems. Via ssh I added some packages (mc, transmission, nullmailer - so nothing fancy).

Yesterday I decided to do another upgrade, to the newest 04.01.03-421. It went ok, so I went to ssh to reinstall a few extra packages that got lost during the upgrade.

vim and mc finished correctly

transmission and nullmailer failed and suggested running apt-get -f install - and that was probably my mistake (as I later found on the forum), as it caused massive damage, mostly by installing incompatible libraries. Almost every command raised this error:

ls: relocation error: /lib/arm-linux-gnueabihf/libc.so.6: symbol _dl_find_dso_for_object, version GLIBC_PRIVATE not defined in file ld-linux-armhf.so.3 with link time reference
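
For anyone landing here with the same prompt: a minimal sketch of previewing what apt-get -f install would do before letting it change anything. It assumes the standard -s (simulate) option of apt-get and its Inst/Remv/Conf output lines; the function name and the short list of "core" packages are illustrative choices, not something taken from the firmware.

```shell
# Summarize a simulated apt-get run read from stdin.
# Counts "Inst" (install/upgrade) and "Remv" (remove) lines and lists
# any touched package from a small, hand-picked set of core packages.
summarize_apt_simulation() {
    awk '
        $1 == "Inst" {
            inst++
            if ($2 ~ /^(libc6|libc-bin|perl-base|dpkg)$/) core = core " " $2
        }
        $1 == "Remv" { remv++ }
        END {
            printf "install/upgrade: %d, remove: %d, core packages touched:%s\n",
                   inst + 0, remv + 0, (core == "" ? " none" : core)
        }
    '
}

# Usage on the device (simulation only, nothing gets installed):
#   apt-get -s -f install | summarize_apt_simulation
```

If libc6 shows up in the summary, the run would replace the C library under a vendor firmware, which is the kind of change that produced the relocation error above.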

Fortunately, I had an ssh session open (no new connection was possible) and samba was working (I was able to copy some files to /shares). After a few tries I got a working busybox so I could run simple commands. I removed the libc.so.6 link, pointed it to an older version and finally got the command line working. Still, no new ssh connection was allowed.
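
Repointing the link can be sketched roughly as below. The helper name and the file name in the usage comment are hypothetical (the post does not say which older libc file was used), and on a system this broken the ln call would probably have to go through busybox.

```shell
# repoint_link LINK TARGET
# Make the symlink LINK point at TARGET, where TARGET is a file name in the
# same directory as LINK. Refuses to act if TARGET does not exist there.
repoint_link() {
    link=$1
    target=$2
    dir=$(dirname "$link")
    if [ ! -e "$dir/$target" ]; then
        echo "missing: $dir/$target" >&2
        return 1
    fi
    # -s symbolic, -f replace an existing link, -n do not follow LINK itself
    ln -sfn "$target" "$link"
}

# Usage (hypothetical file name, use whichever older libc is still on disk):
#   repoint_link /lib/arm-linux-gnueabihf/libc.so.6 libc-X.Y.so
```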

I decided to run the upgrade one more time to undo what apt-get -f install had broken. I tried from the web/UI and via

updateFirmwareFromFile.sh sq-040103-421-20150217.deb

which took a long time, only to fail in the end:

...
Unpacking..
Unpacking replacement sq ...
dpkg-deb: error: subprocess <decompress> was killed by signal (Segmentation fault)
dpkg: error processing sq-040103-421-20150217.deb (--install):
 subprocess dpkg-deb --fsys-tarfile returned error exit status 2
sq postinst
Processing triggers for wd-nas ...
[wd-nas.postinst] 05/06/15 23:50:29: triggered project-install-trigger context=triggered
[wd-nas.postinst] 05/06/15 23:50:29: done.
Processing triggers for smb-file ...
[smb-file.postinst] 05/06/15 23:50:30: triggered project-install-trigger context=triggered
[smb-file.postinst] 05/06/15 23:50:30: done.
Processing triggers for alerts ...
[alerts.postinst] 05/06/15 23:50:30: triggered project-install-trigger context=triggered
[alerts.postinst] 05/06/15 23:50:30: done.
Processing triggers for itunes ...
[itunes.postinst] 05/06/15 23:50:31: triggered project-install-trigger context=triggered
[itunes.postinst] 05/06/15 23:50:31: done.
Errors were encountered while processing:
 sq-040103-421-20150217.deb
Unpack timeout occurred
failed 202 "upgrade download failure"
logger: disable lazy init
stopping duplicate md device /dev/md0
Restore raid device: /dev/sda1
Restore raid device: /dev/sda2
umount2: No such file or directory
umount: /nfs/TimeMachineBackup: not found
umount2: No such file or directory
umount: /nfs/SmartWare: not found

Unfortunately, shortly after this my ssh session broke and I lost the ability to run any command.

I tried the “40s reset button” way but it did not do anything. Power off/on didn’t help either, of course.

It was time to take the device apart: I took the HDD out, connected it to my PC, booted SystemRescueCD and wrote the 2 raid partitions using rootfs.img (taken from sq-040103-421-20150217.deb).

That took me some time (I’m not good at Linux and my guru-friend was already gone, so it took me the whole night till 6am) as the “disks” utility mentioned in the tutorial on this forum was not there. I failed to find and install it, and “gparted” was unable to alter the partitions. Finally, I did it using

dd if=./rootfs.img of=/dev/sda1
dd if=./rootfs.img of=/dev/sda2
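
One way to double-check that dd wrote the image completely is to compare the first bytes of each partition with the image file. A sketch, with a hypothetical helper name; it only reads from the device and assumes the partition is at least as large as the image.

```shell
# verify_written_image IMAGE DEVICE
# Succeeds if the first size-of-IMAGE bytes of DEVICE are identical to IMAGE.
verify_written_image() {
    img=$1
    dev=$2
    # Arithmetic expansion strips any padding some wc versions print.
    size=$(( $(wc -c < "$img") ))
    head -c "$size" "$dev" | cmp -s - "$img"
}

# Usage after the two dd commands above:
#   verify_written_image ./rootfs.img /dev/sda1 && echo "sda1 matches"
#   verify_written_image ./rootfs.img /dev/sda2 && echo "sda2 matches"
```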

and it was time for another try. After quite a long time MyCloud finally appeared on the LAN and I was able to reach the web/UI in a browser. Unfortunately, all settings were gone: all users, all shares. I enabled ssh access, logged in and confirmed that my data were untouched.

So, this sounds like the end of my problems… well, not really. Here are my current issues that made me start this thread.

1) All my shares are gone [I had two: Marooned (private) and Public (well, public)] - I was able to create Marooned (in the web/UI), but trying to create Public gives an error: “Share name Public is reserved”

→ I’ve edited /etc/samba/overall_share to add

## BEGIN ## sharename = Public #
[Public]
  path = /shares/Public
  comment = Public Share
  public = yes
  browseable = yes
  writable = yes
  guest ok = yes
  map read only = no
## END ##

and restarted samba - it didn’t help. The share is not visible in the web/UI and Windows asks for credentials when going to \\192.168.1.125

Trying to create another test share via the web/UI gives me “Share function failed. (400099)” (but the folder in /shares was created, and the entry in overall_share too - it just does not appear in the web/UI - looks like a common issue without a solution?)
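
To see which share sections a hand-edited file really defines, something like the sketch below can help. The function name is made up for illustration. Samba's own testparm -s is the more thorough check, but whether it picks up overall_share depends on how smb.conf includes that file, which is not confirmed here.

```shell
# list_share_sections FILE
# Print the name of every [section] header found in a Samba-style config file.
list_share_sections() {
    sed -n 's/^[[:space:]]*\[\([^]]*\)\][[:space:]]*$/\1/p' "$1"
}

# Usage on the device:
#   list_share_sections /etc/samba/overall_share
#   testparm -s        # Samba's own syntax check and dump of the effective config
```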

2) On the initial power-up after writing the fresh partitions, disk IO was very high - if I remember correctly, top showed setfacl as the top process - and now the shares look like:

WDMyCloud:~# ls -l /shares/
total 16
drwxrwx---+ 10 root share 4096 May 6 16:22 Marooned
drwxrwxrwx+ 15 root share 4096 May 6 13:08 Public

WDMyCloud:~# getfacl /shares/Marooned
getfacl: Removing leading '/' from absolute path names
# file: shares/Marooned
# owner: root
# group: share
user::rwx
user:www-data:rwx
user:marooned:rwx
group::---
mask::rwx
other::---
default:user::rwx
default:user:www-data:rwx
default:user:marooned:rwx
default:group::---
default:mask::rwx
default:other::---

WDMyCloud:~# getfacl /shares/Public
getfacl: Removing leading '/' from absolute path names
# file: shares/Public
# owner: root
# group: share
user::rwx
user:www-data:rwx
group::rwx
mask::rwx
other::rwx
default:user::rwx
default:user:www-data:rwx
default:group::rwx
default:mask::rwx
default:other::rwx

I’ve stopped (just for now) a few daemons (wdnotifierd, twonky) that are responsible for thumbnails (well, more or less - they just scan the hdd for files, so that can wait)

→ looks like setfacl is rechecking permissions for all files on the hdd? It can take a few days then… still going
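
The getfacl output above can be boiled down to the named users that actually hold access on a share. A small sketch (the function name is hypothetical) that reads getfacl output on stdin and ignores the default: entries:

```shell
# named_acl_users
# From getfacl output on stdin, print "user permissions" for every named
# user entry (user:NAME:PERMS), skipping the owner entry and default ACLs.
named_acl_users() {
    awk -F: '$1 == "user" && $2 != "" { print $2 " " $3 }'
}

# Usage on the device:
#   getfacl /shares/Marooned | named_acl_users
```

For the Marooned share this should list www-data and marooned, which matches what a private share for that user would be expected to carry.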

3) Overall performance is very low (due to point 2?) - loading the web/UI takes up to a minute and doesn’t work all the time (see the issue in point 1)

4) SMART issue? The web/UI shows a red icon with an error:

Drive SMART failure

The drive self-check failed. Please contact customer service.

Wednesday, May 06, 2015 8:00:08 PM Code 0003

so on ssh:

smartctl -A /dev/sda

SMART Disabled. Use option -s with argument 'on' to enable it.

So I enabled it and got these results:

WDMyCloud:~# smartctl -A /dev/sda
smartctl 5.41 2011-06-09 r3365 [armv7l-linux-3.2.26] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 78
  3 Spin_Up_Time 0x0027 179 178 021 Pre-fail Always - 6050
  4 Start_Stop_Count 0x0032 085 085 000 Old_age Always - 15451
  5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
  7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
  9 Power_On_Hours 0x0032 088 088 000 Old_age Always - 9433
 10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0
 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 16
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 6
193 Load_Cycle_Count 0x0032 193 193 000 Old_age Always - 22770
194 Temperature_Celsius 0x0022 108 092 000 Old_age Always - 42
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 100 253 000 Old_age Offline - 0

------
WDMyCloud:~# smartctl -Hc /dev/sda
smartctl 5.41 2011-06-09 r3365 [armv7l-linux-3.2.26] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection: (41460) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 255) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x703d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

So it looks like SMART is ok? The error in the web/UI is still there. Should I be worried about my data and disk health?
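
A rough way to read the attribute table above without going line by line: flag any attribute whose normalized VALUE has dropped to its THRESH, and print the raw counts of the sector-related attributes (IDs 5, 197, 198). This is only a sketch with a made-up function name. It assumes the 10-column layout shown above and is no substitute for a real self-test (smartctl -t short /dev/sda, then smartctl -l selftest /dev/sda).

```shell
# check_smart_attrs
# Reads "smartctl -A" output on stdin.
# - reports attributes whose VALUE is at or below a non-zero THRESH
# - prints the RAW_VALUE of reallocated / pending / uncorrectable sectors
check_smart_attrs() {
    awk '
        $1 ~ /^[0-9]+$/ && NF >= 10 {
            if ($6 + 0 > 0 && $4 + 0 <= $6 + 0) {
                bad++
                print "AT OR BELOW THRESHOLD: " $2
            }
            if ($1 == 5 || $1 == 197 || $1 == 198) print $2 " raw=" $10
        }
        END { print "attributes at or below threshold: " (bad + 0) }
    '
}

# Usage on the device:
#   smartctl -A /dev/sda | check_smart_attrs
```

With the table from this post, the three sector counters come out as raw=0 and no attribute is at its threshold, which fits the PASSED health result rather than the red icon in the web/UI.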

5) Last but not least, how to install transmission:

WDMyCloud:~# apt-get -y install transmission-cli transmission-common transmission-daemon
Reading package lists... Done
Building dependency tree
Reading state information... Done
You might want to run 'apt-get -f install' to correct these:
The following packages have unmet dependencies:
 libdbd-sqlite3-perl : Depends: perl (>= 5.20.1-1) but 5.14.2-21 is to be installed
                       Depends: perlapi-5.20.1
 libdbi-perl : Depends: perl (>= 5.20.0-4) but 5.14.2-21 is to be installed
               Depends: perlapi-5.20.0
 libtext-charwidth-perl : Depends: perl-base (>= 5.20.0-4) but 5.14.2-21 is to be installed
                          Depends: perlapi-5.20.0
 php5-apcu : Depends: phpapi-20121212+lfs but it is not installable
 transmission-cli : Depends: libcurl3-gnutls (>= 7.18.0) but it is not going to be installed
                    Depends: libminiupnpc10 (>= 1.9.20140610) but it is not going to be installed
                    Depends: libnatpmp1 but it is not going to be installed
 transmission-daemon : Depends: init-system-helpers (>= 1.18~) but it is not going to be installed
                       Depends: libcurl3-gnutls (>= 7.18.0) but it is not going to be installed
                       Depends: libminiupnpc10 (>= 1.9.20140610) but it is not going to be installed
                       Depends: libnatpmp1 but it is not going to be installed
                       Depends: libsystemd0 but it is not going to be installed
E: Unmet dependencies. Try 'apt-get -f install' with no packages (or specify a solution).

Keeping in mind that

apt-get -f install

caused this whole mess last time, it is not the way to go.
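
The unmet dependencies above ask for perl 5.20 while 5.14 is what would be installed. Those look like the perl versions of two different Debian releases (jessie and wheezy), so the package lists may be mixing suites; that is a guess from the version numbers, not something confirmed on this box. A sketch for listing which suites the APT sources reference (the function name is hypothetical):

```shell
# list_apt_suites FILE...
# Print the distinct suite names used by deb / deb-src lines,
# allowing for an optional [options] block before the URI.
list_apt_suites() {
    awk '
        $1 == "deb" || $1 == "deb-src" {
            i = 2
            if ($i ~ /^\[/) {
                while ($i !~ /\]$/ && i < NF) i++
                i++
            }
            print $(i + 1)
        }
    ' "$@" | sort -u
}

# Usage on the device:
#   list_apt_suites /etc/apt/sources.list
```

More than one suite in the output would be a reason to avoid apt-get -f install until the sources are sorted out.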

Uhh… I understand it’s quite a long post but I wanted to give a detailed background of all those issues. At the moment MyCloud is far from being usable.

Thanks in advance for any help to restore normal behavior of the device.

alirz1 · May 7, 2015, 9:08pm

If your data is gone or if you are not concerned with losing it, perform a full factory restore from the dashboard. That should get the whole partition structure back properly; a damaged one could be causing the various issues.

You can always create all the partitions manually. Another member, “Fox_exe”, has a very nice tutorial on manually de-bricking a drive from total scratch. I’ve performed that several times and the drive works as good as new.

His instructions are at: https://drive.google.com/folderview?id=0B_6OlQ_H0PxVRXF4aFpYS2dzMEE&usp=drive_web

As for installing transmission, keep in mind that firmware v4.x cannot use the standard APT sources that come with the WD firmware. I guess you’ve figured that out by now.

Use the “transmission V4 install” thread by member “Nazar78” to install transmission, or you can use Fox_exe’s V4-compatible repo.

Marooned · May 7, 2015, 9:16pm

I know that post was long so you might have missed it: “and confirmed that my data were untouched.”.

The data are ok and there is too much of it for me to store elsewhere while performing a full wipe.

So the main problem currently is restoring users/shares so the data can again be accessible via samba/dlna/etc.

Thanks for the info about transmission - I’ll read the threads you mentioned.

Marooned · May 8, 2015, 11:03am

Hm, right now I’m not even able to reach the web/UI due to this error:

PHP Fatal error: Call to undefined function Core\\apc_fetch() in /var/www/rest-api/api/Core/ClassAutoLoader.php on line 57

So did WD forget to add the APC extension to PHP in the new firmware, or is something still broken in my fresh install? That one should be easy to fix by adding the extension (hopefully; not tested yet), but it’s issue after issue. A never-ending story.

Marooned · May 10, 2015, 7:35pm

So after 3 days (well, nights actually) of fighting I gave up and installed the last 3.x firmware. MyCloud started to work as it should, although my shares were not listed in the web/UI or in samba, and I could not create one via the UI as the “public” share name was not allowed. The UI check was easy to bypass (just change the class name of the input to avoid share name validation) but the server still answered with an error.

Manual editing was necessary.

For samba I just edited /etc/samba/overall_share

For the web/UI I had to manually insert data into the SQLite DB at /usr/local/nas/orion/orion.db

Now there are still some problems:
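
Since both files are edited by hand here, keeping a timestamped copy before each change makes it easy to roll back. A minimal sketch; the helper name and the .bak naming scheme are illustrative only.

```shell
# backup_before_edit FILE
# Copy FILE to FILE.bak.<timestamp>, preserving mode and times,
# and print the path of the backup on success.
backup_before_edit() {
    src=$1
    dst="$src.bak.$(date +%Y%m%d-%H%M%S)"
    cp -p "$src" "$dst" && echo "$dst"
}

# Usage on the device, before touching either file:
#   backup_before_edit /etc/samba/overall_share
#   backup_before_edit /usr/local/nas/orion/orion.db
```

For the SQLite file it is safer to take the copy while nothing is writing to the database.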

Can’t access the password-protected share - it does not accept my password. The Windows popup says it uses user “\marooned” when I just provided “marooned” - I’m not sure whether that is what breaks access? Shouldn’t it rather be MyCloud\marooned?
The public share is not password protected, and prior to the upgrade it was working with anonymous login - now access requires providing “public” as the username (with a blank password) - how do I change it back to anonymous access?
The web/UI shows that I have 0kB of free space left (and I have ~50% free)