I have a WD My Cloud EX4 with four WD 2TB Red drives, configured as a three-drive RAID 5 with a hot standby drive. This config has worked without problems for several years.
A couple of days ago I noticed a solid red light over disk 1. The web console reported that the RAID volume was OK, but that disk 1 had “failed” or “faulted”. When I first bought the array, I had also purchased an identical spare disk, which was sitting in its unopened box next to the array. I replaced the bad drive with it, and the unit automatically started building the new drive into the array. Everything seemed to be working as expected.
But the red light didn’t go away.
The RAID volume reports healthy. Both disk tests, Quick and Full, report all drives good. I can access all volumes and files on the array. It appears healthy every way I look at it. Unless I look at the front of the unit, because there’s a big ol’ red light gleaming on the front of it.
Rebooting, and shutting down cold and restarting, both give the same result: a perfectly healthy array with a red light on the front.
I have run multiple quick and full disk tests, and restarted it multiple times. That first disk light does start blue on restart, but very quickly turns red, apparently while the array is initializing.
So, 2 questions…
- Is there a way to really confirm that my drives are good? (red light makes me question the health)
- Is there a way to reset the red light to blue if it truly is a healthy drive?
Log in via SSH and run this command:
cat /proc/mdstat
Then post the output.
root@NetShare root # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid5 sdd2[3] sdc2[2] sdb2[1]
3898637952 blocks super 1.0 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
bitmap: 0/1 pages [0KB], 131072KB chunk
md0 : active raid1 sdd1[3] sdc1[2] sdb1[1] sda1[0]
2097088 blocks [4/4] [UUUU]
bitmap: 0/16 pages [0KB], 8KB chunk
unused devices: <none>
root@NetShare root #
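For anyone comparing output like this by eye, here is a rough sketch that boils /proc/mdstat down to one line per array with its member partitions. The function name md_members is made up for illustration, and it assumes awk is available on the NAS shell, which I have not verified for the EX4 firmware. In the output posted above, md1 lists sdd2, sdc2 and sdb2 but no sda partition, while md0 lists all four disks; this output alone does not settle whether that is related to the red light.

```shell
#!/bin/sh
# md_members: hypothetical helper, not part of mdadm or the WD firmware.
# Reads /proc/mdstat-style text on stdin and prints "mdX: dev dev ..." per array.
md_members() {
    awk '/^md[0-9]+ :/ {
        line = $1 ":"
        for (i = 3; i <= NF; i++) {
            # Member entries look like "sdb2[1]", optionally followed by a flag.
            if ($i ~ /\[[0-9]+\]/) {
                dev = $i
                sub(/\[.*/, "", dev)   # drop the "[N]" role number and anything after it
                line = line " " dev
            }
        }
        print line
    }'
}

# Demo using the two array lines from the output posted above.
md_members <<'EOF'
md1 : active raid5 sdd2[3] sdc2[2] sdb2[1]
md0 : active raid1 sdd1[3] sdc1[2] sdb1[1] sda1[0]
EOF
```

On the box itself, the equivalent would be `md_members < /proc/mdstat`.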
I also ran mdadm -D on /dev/md0 and /dev/md1. Looks like md1 is my RAID5 volume.
root@NetShare root # mdadm -D /dev/md0
/dev/md0:
Version : 00.90.03
Creation Time : Mon May 27 16:50:02 2019
Raid Level : raid1
Array Size : 2097088 (2048.28 MiB 2147.42 MB)
Device Size : 2097088 (2048.28 MiB 2147.42 MB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Wed May 29 23:03:41 2019
State : active
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
UUID : eb72e1eb:e0c55c87:3dd9f759:efbf585f
Events : 0.1
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
2 8 33 2 active sync /dev/sdc1
3 8 49 3 active sync /dev/sdd1
root@NetShare root # mdadm -D /dev/md1
/dev/md1:
Version : 01.00.03
Creation Time : Mon Sep 5 18:50:04 2016
Raid Level : raid5
Array Size : 3898637952 (3718.03 GiB 3992.21 GB)
Device Size : 3898637952 (1859.02 GiB 1996.10 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 1
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Tue May 28 22:10:00 2019
State : active
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
Name : 1
UUID : 4c18177a:2d21a63e:2abbbad3:0a113d62
Events : 7246
Number Major Minor RaidDevice State
3 8 50 0 active sync /dev/sdd2
1 8 18 1 active sync /dev/sdb2
2 8 34 2 active sync /dev/sdc2
root@NetShare root # smartctl -A /dev/sda
smartctl version 5.38 [arm-mv5sft-linux-gnueabi] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 186 174 021 Pre-fail Always - 3675
4 Start_Stop_Count 0x0032 093 093 000 Old_age Always - 7289
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 083 083 000 Old_age Always - 13080
10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 25
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 8
193 Load_Cycle_Count 0x0032 198 198 000 Old_age Always - 7286
194 Temperature_Celsius 0x0022 117 106 000 Old_age Always - 30
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0
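On the first question (confirming the drives are good), one rough way to read a table like the one above is to flag any attribute whose normalized VALUE has dropped to or below its THRESH, plus a nonzero raw count on the commonly watched attributes 5, 197 and 198. The sketch below does that with awk. The name smart_flags is invented for illustration, the choice of watched attributes is my assumption rather than anything WD documents, and the column positions assume the `smartctl -A` layout shown above.

```shell
#!/bin/sh
# smart_flags: hypothetical helper, not part of smartmontools.
# Reads `smartctl -A` text on stdin and prints one line per suspicious attribute.
# Prints nothing when no attribute trips either check.
smart_flags() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 {
        id = $1 + 0; value = $4 + 0; thresh = $6 + 0; raw = $10 + 0
        # A threshold of 0 means the attribute can never be marked as failing.
        if (thresh > 0 && value <= thresh)
            print $2 ": normalized value " value " at or below threshold " thresh
        # Reallocated, pending and offline-uncorrectable sector counts (assumed watch list).
        if ((id == 5 || id == 197 || id == 198) && raw > 0)
            print $2 ": raw count " raw
    }'
}
```

It would be run once per disk, for example `smartctl -A /dev/sda | smart_flags`, and the same for sdb, sdc and sdd. Fed the sda table above it prints nothing, since every VALUE is above its THRESH and the three raw counts are 0. `smartctl -H /dev/sda` gives the drive's own overall pass/fail verdict as a cross-check.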