My Cloud OS5 Firmware 5.27.157 Problems

Many users have reported various unusual problems since updating to My Cloud OS5 firmware version 5.27.157.

  • No configured RAID Volumes
  • File Access Issues
  • Missing Files
  • Network Dropouts

The missing RAID volumes problem can generally be solved by recreating a missing /dev/md1 device, restarting the RAID array to verify that it’s ok, then rebooting one or more times until it sticks.

RAID 1:

  • mknod /dev/md1 b 9 1;
  • mdadm --assemble --run /dev/md1 /dev/sda2 /dev/sdb2;

RAID 5:

  • mknod /dev/md1 b 9 1;
  • mdadm --assemble --run /dev/md1 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2;
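A more cautious variant of the first step is sketched below, assuming a POSIX shell over SSH. The helper name plan_md_node is hypothetical; it only prints the mknod command when the node is missing, so nothing is changed until you run the printed command yourself.

```shell
# Sketch only: print the mknod command if the block device node is missing.
# plan_md_node is a hypothetical helper name, not part of My Cloud OS5.
plan_md_node() {
    node="${1:-/dev/md1}"
    if [ -b "$node" ]; then
        # Node already exists as a block device; nothing to do.
        return 0
    fi
    # Block major 9 is the md driver; minor 1 matches md1, as in the commands above.
    echo "mknod $node b 9 1"
}

plan_md_node /dev/md1
```

If the function prints nothing, /dev/md1 is already present and only the mdadm step is relevant.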

At first, rebooting appeared to have been incidental, but further research suggests that it’s actually an important clue, and that the actual source of the problem may be the Linux kernel itself. This would also explain many of the strange problems users have reported.

My Cloud OS5 has always had an outdated Linux kernel (currently 4.14.22, latest 4.14.329), which didn’t seem to cause any major problems. However, when WD updated the base distribution packages from Debian 10 (Buster) to Debian 11 (Bullseye), it may have simply been a bridge too far.

  • 4.14.xxx - My Cloud OS5 Kernel Branch
  • 4.19.xxx - Debian 10 Kernel Branch
  • 5.10.xxx - Debian 11 Kernel Branch

Linux relies on udev, a userspace device manager that reacts to kernel events, to create device nodes under /dev for any hardware that’s detected. After updating to My Cloud OS5 firmware version 5.27.157, it occasionally fails to properly create the required /dev/md1 RAID device, which then starts a chain reaction of failures.
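One way to spot this failure after a boot is to check whether the array shows up as active in /proc/mdstat. This is a minimal sketch; md_is_active is a hypothetical helper name, and the optional second argument exists only so the check can be pointed at a saved copy of the file.

```shell
# Sketch only: report whether an md array shows as active in an mdstat-style file.
# md_is_active is a hypothetical helper; the file defaults to /proc/mdstat.
md_is_active() {
    name="$1"
    file="${2:-/proc/mdstat}"
    # /proc/mdstat lists arrays on lines such as "md1 : active raid1 sdb2[1] sda2[0]".
    grep -q "^${name} : active" "$file" 2>/dev/null
}

if md_is_active md1; then
    echo "md1 is active"
else
    echo "md1 is missing or inactive"
fi
```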

The solution appears to be simple: WD needs to update the Linux kernel ASAP.

My Cloud OS5 Release Notes

Yet another My Cloud OS5 firmware version 5.27.157 problem has been discovered. After running a system test via the dashboard, the memory always indicates a problem… on three separate PR4100s.

Not to mention the RSYNC problem as well!
Thanks for the detailed report

@fpetit I’ve created an app to solve the rsync problem (caused by WD code changes), and it’s undergoing final testing now. Would you like to test it?

I’m not very skilled with Linux… so far I’ve downgraded to the previous firmware and reported the issue to WD, which I guess must be aware of the mess they’ve created.

I’m having a problem with an EX2 Ultra in RAID 1. Disk 1 was replaced and shows as degraded. Disk 2 is good, and the EX2 Ultra is trying to rebuild the new Disk 1. It goes through the rebuild, then starts the rebuild all over again.

Should I do the following, and how do I send the commands? Step by step, please.

Thank you in advance.

RAID 1:

  • mknod /dev/md1 b 9 1;
  • mdadm --assemble --run /dev/md1 /dev/sda2 /dev/sdb2;

The first thing you should do is check the hard drives for errors.

  • smartctl -a /dev/sda;
  • smartctl -a /dev/sdb;
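For a quick first look, the overall verdict can be pulled out of the smartctl output, as in this rough sketch. The helper name smart_health is hypothetical, and a PASSED verdict alone is not conclusive, so the full attribute table is still worth reading.

```shell
# Sketch only: pull the overall health verdict out of smartctl output on stdin.
# smart_health is a hypothetical helper name.
smart_health() {
    sed -n 's/^SMART overall-health self-assessment test result: *//p'
}

# Usage on the NAS (drive names as in the commands above):
# for d in /dev/sda /dev/sdb; do
#     printf '%s: ' "$d"; smartctl -H "$d" | smart_health
# done
```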

How to Access WD My Cloud Using SSH (Secure Shell)

New Drive replacement for Bad drive

root@MyCloudEX2Ultra ~ # smartctl -a /dev/sda;

smartctl 7.2 2020-12-30 r5155 [armv7l-linux-4.14.22-armada-18.09.3] (local build)

Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===

Model Family: Western Digital Red (SMR)

Device Model: WDC WD60EFAX-68JH4N1

Serial Number: WD-WXG2D53KA02H

LU WWN Device Id: 5 0014ee 2c0856aa2

Firmware Version: 83.00A83

User Capacity: 6,001,175,126,016 bytes [6.00 TB]

Sector Sizes: 512 bytes logical, 4096 bytes physical

Rotation Rate: 5400 rpm

Form Factor: 3.5 inches

TRIM Command: Available, deterministic, zeroed

Device is: In smartctl database [for details use: -P show]

ATA Version is: ACS-3 T13/2161-D revision 5

SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)

Local Time is: Wed Nov 15 15:14:02 2023 EST

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

General SMART Values:

Offline data collection status: (0x00) Offline data collection activity

was never started.

Auto Offline Data Collection: Disabled.

Self-test execution status: ( 0) The previous self-test routine completed

without error or no self-test has ever

been run.

Total time to complete Offline

data collection: (32084) seconds.

Offline data collection

capabilities: (0x7b) SMART execute Offline immediate.

Auto Offline data collection on/off support.

Suspend Offline collection upon new

command.

Offline surface scan supported.

Self-test supported.

Conveyance Self-test supported.

Selective Self-test supported.

SMART capabilities: (0x0003) Saves SMART data before entering

power-saving mode.

Supports SMART auto save timer.

Error logging capability: (0x01) Error logging supported.

General Purpose Logging supported.

Short self-test routine

recommended polling time: ( 2) minutes.

Extended self-test routine

recommended polling time: ( 739) minutes.

Conveyance self-test routine

recommended polling time: ( 2) minutes.

SCT capabilities: (0x3039) SCT Status supported.

SCT Error Recovery Control supported.

SCT Feature Control supported.

SCT Data Table supported.

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE

1 Raw_Read_Error_Rate 0x002f 100 253 051 Pre-fail Always - 0

3 Spin_Up_Time 0x0027 230 230 021 Pre-fail Always - 3500

4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 7

5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0

7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0

9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 44

10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0

11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0

12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 6

192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 4

193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 2

194 Temperature_Celsius 0x0022 100 100 000 Old_age Always - 50

196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0

197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0

198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0

199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0

200 Multi_Zone_Error_Rate 0x0008 100 253 000 Old_age Offline - 0

SMART Error Log Version: 1

No Errors Logged

SMART Self-test log structure revision number 1

Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error

# 1 Short offline Completed without error 00% 39 -

# 2 Short offline Completed without error 00% 26 -

# 3 Short offline Completed without error 00% 14 -

SMART Selective self-test log data structure revision number 1

SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS

1 0 0 Not_testing

2 0 0 Not_testing

3 0 0 Not_testing

4 0 0 Not_testing

5 0 0 Not_testing

Selective self-test flags (0x0):

After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

Old second drive, trying to rebuild drive 1 (RAID 1)

root@MyCloudEX2Ultra ~ # smartctl -a /dev/sdb;

smartctl 7.2 2020-12-30 r5155 [armv7l-linux-4.14.22-armada-18.09.3] (local build)

Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===

Model Family: Western Digital Red

Device Model: WDC WD60EFRX-68L0BN1

Serial Number: WD-WX31D47CED0N

LU WWN Device Id: 5 0014ee 264252099

Firmware Version: 82.00A82

User Capacity: 6,001,175,126,016 bytes [6.00 TB]

Sector Sizes: 512 bytes logical, 4096 bytes physical

Rotation Rate: 5700 rpm

Device is: In smartctl database [for details use: -P show]

ATA Version is: ACS-2, ACS-3 T13/2161-D revision 3b

SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)

Local Time is: Wed Nov 15 14:34:17 2023 EST

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

General SMART Values:

Offline data collection status: (0x00) Offline data collection activity

was never started.

Auto Offline Data Collection: Disabled.

Self-test execution status: ( 17) The self-test routine was aborted by

the host.

Total time to complete Offline

data collection: ( 2384) seconds.

Offline data collection

capabilities: (0x7b) SMART execute Offline immediate.

Auto Offline data collection on/off support.

Suspend Offline collection upon new

command.

Offline surface scan supported.

Self-test supported.

Conveyance Self-test supported.

Selective Self-test supported.

SMART capabilities: (0x0003) Saves SMART data before entering

power-saving mode.

Supports SMART auto save timer.

Error logging capability: (0x01) Error logging supported.

General Purpose Logging supported.

Short self-test routine

recommended polling time: ( 2) minutes.

Extended self-test routine

recommended polling time: ( 678) minutes.

Conveyance self-test routine

recommended polling time: ( 5) minutes.

SCT capabilities: (0x303d) SCT Status supported.

SCT Error Recovery Control supported.

SCT Feature Control supported.

SCT Data Table supported.

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE

1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0

3 Spin_Up_Time 0x0027 200 195 021 Pre-fail Always - 8975

4 Start_Stop_Count 0x0032 047 047 000 Old_age Always - 53917

5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0

7 Seek_Error_Rate 0x002e 100 253 000 Old_age Always - 0

9 Power_On_Hours 0x0032 051 051 000 Old_age Always - 36271

10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0

11 Calibration_Retry_Count 0x0032 100 100 000 Old_age Always - 0

12 Power_Cycle_Count 0x0032 098 098 000 Old_age Always - 2005

192 Power-Off_Retract_Count 0x0032 198 198 000 Old_age Always - 1981

193 Load_Cycle_Count 0x0032 183 183 000 Old_age Always - 52019

194 Temperature_Celsius 0x0022 100 097 000 Old_age Always - 52

196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0

197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 5

198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0

199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0

200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 4

SMART Error Log Version: 1

No Errors Logged

SMART Self-test log structure revision number 1

Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error

# 1 Short offline Aborted by host 10% 36268 -

# 2 Short offline Aborted by host 10% 36243 -

# 3 Extended offline Aborted by host 50% 36228 -

# 4 Short offline Completed without error 00% 36211 -

# 5 Short offline Completed without error 00% 36207 -

# 6 Extended offline Completed: read failure 10% 36205 3076528808

# 7 Short offline Completed without error 00% 36186 -

# 8 Short offline Completed without error 00% 36169 -

# 9 Short offline Completed without error 00% 36169 -

#10 Short offline Aborted by host 90% 31922 -

#11 Short offline Completed without error 00% 30508 -

#12 Short offline Aborted by host 50% 30420 -

#13 Short offline Completed without error 00% 27009 -

#14 Extended offline Aborted by host 90% 27009 -

#15 Short offline Completed without error 00% 26421 -

#16 Short offline Completed without error 00% 26318 -

#17 Short offline Completed without error 00% 25419 -

#18 Extended offline Completed without error 00% 25411 -

#19 Extended offline Aborted by host 90% 25399 -

#20 Short offline Completed without error 00% 25271 -

#21 Short offline Completed without error 00% 24112 -

SMART Selective self-test log data structure revision number 1

SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS

1 0 0 Not_testing

2 0 0 Not_testing

3 0 0 Not_testing

4 0 0 Not_testing

5 0 0 Not_testing

Selective self-test flags (0x0):

After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

carlos.demelo@outlook.com

The /dev/sda hard drive uses SMR technology, which is NOT suitable for RAID use. Consider replacing it if you value your data.

The /dev/sdb hard drive has issues: the following attributes should both be zero.

  • Current_Pending_Sector - 5
  • Multi_Zone_Error_Rate - 4
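The same check can be done without reading the whole table by eye. The sketch below filters smartctl attribute rows for a short watch list and prints any with a non-zero raw value. The helper name flag_nonzero is hypothetical, and including Reallocated_Sector_Ct and Offline_Uncorrectable in the list is my own assumption rather than something stated above.

```shell
# Sketch only: print watched SMART attributes whose raw value is not zero.
# Reads "smartctl -A" style attribute rows on stdin; flag_nonzero is a hypothetical name.
flag_nonzero() {
    awk '
        $2 == "Current_Pending_Sector" ||
        $2 == "Multi_Zone_Error_Rate"  ||
        $2 == "Reallocated_Sector_Ct"  ||
        $2 == "Offline_Uncorrectable" {
            # The raw value is the last column of each attribute row.
            if ($NF + 0 != 0) print $2 " - " $NF
        }
    '
}

# Usage on the NAS:
# smartctl -A /dev/sdb | flag_nonzero
```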

Try running a “Scan Disk” from the “Settings” section of the dashboard.

Much to my dismay, I discovered yet another My Cloud OS5 firmware version 5.27.157 problem that’s potentially catastrophic. However, I’m not certain if this is a new problem, or one that’s been quietly lurking in the background, waiting to be discovered… the hard way.

My PR4100 development box normally has two drives, both formatted as JBOD, with a single share (DEV_1 and DEV_2) for each drive. About a month ago I had removed the secondary hard drive to test another issue on the EX2 Ultra, but never bothered moving it back to the PR4100 until now.

After inserting the secondary drive and powering on the PR4100, a “RAID Roaming” prompt appeared as expected, so I proceeded to click “OK” to integrate the drive, which seemed to happen without any problems. Then, I noticed that two new unwanted “TimeMachineBackup” and “Public_2” shares had appeared.

The unwanted "TimeMachineBackup" share was deleted without any problems, but when an attempt was made to delete the "Public_2" share, it actually DELETED my primary "DEV_1" share, and the "Public_2" share remains persistent, despite multiple attempts to get rid of it.

  • /shares/DEV_1 -> /mnt/HD/HD_a2/DEV_1
  • /shares/DEV_2 -> /mnt/HD/HD_b2/DEV_2
  • /shares/Public -> /mnt/HD/HD_a2/Public
  • /shares/Public_2 -> /mnt/HD/HD_b2/Public
  • /shares/TimeMachineBackup -> /mnt/HD/HD_a2/TimeMachineBackup
  • /shares/Volume_1 -> /mnt/HD/HD_a2
  • /shares/Volume_2 -> /mnt/HD/HD_b2
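A listing like the one above can be produced over SSH before deleting anything from the dashboard, so it’s clear which volume each share really points to. This is a minimal sketch and list_links is a hypothetical helper name.

```shell
# Sketch only: print each symlink in a directory with its target.
# list_links is a hypothetical helper; /shares is the directory shown above.
list_links() {
    dir="${1:-/shares}"
    for entry in "$dir"/*; do
        # Skip anything that is not a symlink (also covers an empty directory).
        [ -L "$entry" ] || continue
        printf '%s -> %s\n' "$entry" "$(readlink "$entry")"
    done
}

list_links /shares
```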

Fortunately, this PR4100 is just a development box and most of the data on the primary "DEV_1" share was backed up, but it will take a lot of time to restore everything, and I may have lost a few files that weren’t backed up. Furthermore, the so-called “Recycle Bin” feature was enabled for all shares, yet utterly worthless in this situation.

Western Digital, you should either get your sh*t together, or get the hell out of the NAS business, because the sheer incompetence of your so-called developers never ceases to amaze me.

Enough is enough!

Uh… TLDR… Don’t update to this firmware version?

Thanks to the power of BACKUPS, my PR4100 development box is back in business and fully operational once again. Let this be a lesson to RAID users who don’t have BACKUPS, because I can guarantee you that WD Support won’t pull your ass out of the fire when you need it most.

Sometimes,
sh*t happens,
someone has to deal with it,
and who ya gonna call?

The only way I found to finally get rid of the unwanted "Public_2" share was to manually delete the folder and symlink, then reboot the PR4100.

  • rm -rf "/mnt/HD/HD_b2/Public";
  • rm "/shares/Public_2";

But that stupid "TimeMachineBackup" share keeps coming back after running a "System Test", etc. I’m absolutely NOT a Mac user, and don’t want Mac turds cluttering up my pristine environment.

Additional problems have been found in My Cloud OS5 firmware version 5.27.157.

Problem 1:

There’s a typo in the dosfsck symlink.

Incorrect:

  • /sbin/dosfsck -> sbin/fsck.fat

Correct:

  • /sbin/dosfsck -> /sbin/fsck.fat

The Fix:

  • ln -sf "/sbin/fsck.fat" "/sbin/dosfsck";

Problem 2:

The destination path of the zoneinfo symlink does not exist.

Broken Symlink:

  • /usr/sbin/zoneinfo -> /usr/local/modules/zoneinfo

Problem 3:

An asterisk symlink exists in the /bin directory, likely as a result of a script typo.

Broken Symlink:

  • /bin/* -> /usr/local/modules/bin/*
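Broken symlinks like these can be found with a short loop, sketched here under the assumption of a POSIX shell. The helper name broken_links is hypothetical, and the loop deliberately avoids find options that may not exist in a trimmed-down firmware build.

```shell
# Sketch only: list symlinks in a directory whose target does not exist.
# broken_links is a hypothetical helper name.
broken_links() {
    for entry in "$1"/*; do
        # -L is true for a symlink; -e follows it and fails if the target is missing.
        if [ -L "$entry" ] && [ ! -e "$entry" ]; then
            printf '%s -> %s\n' "$entry" "$(readlink "$entry")"
        fi
    done
}

# Directories taken from the three problems above.
for d in /sbin /usr/sbin /bin; do
    broken_links "$d"
done
```

A relative target such as sbin/fsck.fat is resolved from the directory that holds the link, which is why the dosfsck symlink shows up as broken.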

Further investigation revealed that the My Cloud OS5 memory test is a fraud.

The dashboard calls a CGI compiled binary program named “smart.cgi”, which then calls another compiled binary program named “sys_diag”, and this is where things get interesting.

  • diagnostics.html > diagnosticsDiag.js > smart.cgi > sys_diag

The WD “sys_diag” compiled binary program does NOT actually test the memory at all; it merely calls the “dmidecode” Linux program and gets the size of all installed memory modules.

  • dmidecode -t 17 | grep "Size: "

The problem definitely exists within the WD “sys_diag” compiled binary program, because its XML output indicates a memory error, yet the “dmidecode” Linux program output is fine.

  • # cat "/var/www/xml/sys_diag.xml"
<config>
<sys_diag>
        <rtc>passed</rtc>
        <usb1>not exist</usb1>
        <usb2>not exist</usb2>
        <usb3>not exist</usb3>
        <hdd1>passed</hdd1>
        <hdd2>passed</hdd2>
        <hdd3>not exist</hdd3>
        <hdd4>not exist</hdd4>
        <memory>failed</memory>
        <temperature>passed</temperature>
        <fan>passed</fan>
</sys_diag>
</config>
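To pick out only the failing entries from that file, something like the following rough sketch should do. The helper name failed_tests is hypothetical, and it assumes one element per line, as in the output above.

```shell
# Sketch only: print the names of sys_diag entries marked "failed".
# failed_tests is a hypothetical helper; the default path is the XML file shown above.
failed_tests() {
    file="${1:-/var/www/xml/sys_diag.xml}"
    # Match lines like "<memory>failed</memory>" and keep only the tag name.
    sed -n 's/^[[:space:]]*<\([^>]*\)>failed<\/.*/\1/p' "$file"
}

# Usage on the NAS:
# failed_tests
```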

The debug output of the WD “sys_diag” compiled binary program seems to think that 2 + 2 = 0, which seals the deal. WD and their so-called developers borked it yet again.

  • # /usr/sbin/sys_diag -d
Size:  [2]  [GB]
Size:  [2]  [GB]
total size = 0 MB
memory size= 0M
memory status: failed

Lastly, here’s the “dmidecode” Linux program output when executed manually from the command line.

  • # dmidecode -t 17 | grep "Size: "
        Size: 2 GB
        Size: 2 GB
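For comparison, the total can be worked out from those same lines with a little awk. This is only a sketch of the arithmetic; total_memory_mb is a hypothetical helper and says nothing about how WD’s sys_diag does it internally.

```shell
# Sketch only: add up "Size:" lines from dmidecode output on stdin, in MB.
# total_memory_mb is a hypothetical helper, not WD's sys_diag logic.
total_memory_mb() {
    awk '
        $1 == "Size:" && $3 == "GB" { total += $2 * 1024 }
        $1 == "Size:" && $3 == "MB" { total += $2 }
        END { print total + 0 }
    '
}

# Usage on the NAS:
# dmidecode -t 17 | grep "Size: " | total_memory_mb
```

With the two 2 GB modules shown above, this adds up to 4096 MB rather than 0.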

For comparison, here’s the full output of the “dmidecode” Linux program, which clearly shows that the memory is fine, despite not being properly tested.

  • # dmidecode -t 17
# dmidecode 3.3
Getting SMBIOS data from sysfs.
SMBIOS 3.0.0 present.

Handle 0x000D, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x000B
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 2 GB
        Form Factor: DIMM
        Set: None
        Locator: A1_DIMM0
        Bank Locator: A1_BANK0
        Type: DDR3
        Type Detail: Unknown
        Speed: 1600 MT/s
        Manufacturer: InnoDisk
        Serial Number: XXXXXXXX
        Asset Tag: A1_AssetTagNum0
        Part Number: M3SW-2GSJCL0C-QDM
        Rank: 1
        Configured Memory Speed: 1066 MT/s
        Minimum Voltage: 1.35 V
        Maximum Voltage: 1.5 V
        Configured Voltage: 1.35 V

Handle 0x000F, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x000B
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 2 GB
        Form Factor: DIMM
        Set: None
        Locator: A1_DIMM1
        Bank Locator: A1_BANK1
        Type: DDR3
        Type Detail: Unknown
        Speed: 1600 MT/s
        Manufacturer: InnoDisk
        Serial Number: XXXXXXXX
        Asset Tag: A1_AssetTagNum1
        Part Number: M3SW-2GSJCL0C-QDM
        Rank: 1
        Configured Memory Speed: 1066 MT/s
        Minimum Voltage: 1.35 V
        Maximum Voltage: 1.5 V
        Configured Voltage: 1.35 V

The App Store app information boxes aren’t tall enough to properly display the text, which makes the WD programmers look like rank amateurs. The fix is to add 15px to the CSS height of the .appList class and .app_desc class.

Old CSS:

.appList {
    height: 240px;
    width: 220px;
    border: 1px solid #515151;
    border-radius: 5px;
    float: left;
    margin-right: 8px;
    margin-top: 10px;
}

.appList .app_desc {
    float: none;
    padding: 10px;
    height: 135px;
    font-size: 13px;
    overflow: hidden;
    text-overflow: ellipsis;
}

New CSS:

.appList {
    height: 255px;
    width: 220px;
    border: 1px solid #515151;
    border-radius: 5px;
    float: left;
    margin-right: 8px;
    margin-top: 10px;
}

.appList .app_desc {
    float: none;
    padding: 10px;
    height: 150px;
    font-size: 13px;
    overflow: hidden;
    text-overflow: ellipsis;
}

The kernel ring buffer (dmesg) of my primary PR4100 box was filled with USB port 1 errors and the power button stopped working. Rebooting required a forced poweroff, performed by pressing and holding the power button.

My suspicion is that it was caused by the PR4100 PMC chip timeout event (TEC) problem that was never fully resolved. This may or may not be related to My Cloud OS5 Firmware version 5.27.157, but it’s worth mentioning.

By the way, the so-called “Hibernate” function of the dashboard is stupid, especially when the dashboard says “Device is hibernating”, but the LCD display says “Device is shutting down”.

It’s stupid because many of the WD NAS boxes don’t have a power button.

Once you shut these down, you need to cycle the plug power to get them to boot.

The bad thing about this is that if you get a power dip (and don’t have a UPS), the unit will crash and immediately reboot. There should be a button and an option for the unit to NOT automatically reboot upon restoration of power.

Of course, it is appropriate to have a UPS for the unit, but in my case I generally have the units online for only a few days a month and I have good power reliability.

The WD Support website is a mess, and their staff couldn’t figure out how to operate a paperclip.

My Cloud OS 5 Third Party Apps GPL Codes

Amazon S3, Camera Backups, ClamAV, Dropbox, FTP Downloads, Internal Backups, iTunes, Joomla, phpBB, phpMyAdmin, Remote Backups, Transmission, USB Backups, WordPress

File Size: 1.1 GB

Release Date: 11/1/2021

I simply wanted to download the latest My Cloud OS5 apps GPL source code, but no download link is available on their website, which is a clear violation of GPL licensing terms. A link to an older version (saved previously) still works, but there is no link for the latest release.

https://downloads.wdc.com/gpl/WDMyCloud_NAS_Apps_GPL_20201112.tar.gz

Therefore, I contacted WD Support, who were less than clueless. First they gave me a link to a dead forum post from many years ago, then they wanted to transfer me to a “senior level technician”.

Still waiting for a direct answer to a very simple question…

I would love to test the app, if it’s not too late.
What can I do?

The rsync app can be downloaded from the link below.
