I’ve got a 4TB WD My Passport (model WDC WD40NDZW-11MR8S1) with a GPT partition layout and several partitions, one specifically for file storage using Btrfs.
Now, I was writing files from my home directory to it to move stuff to another computer, since sneakernet only costs gas money and I don’t have to break the bank transferring a few dozen gigabytes over the internet (it’s unreliable, expensive and not available on one end).
So I’d written some gigabytes of data when cp -drvn ~/* /media/usb
crashed, and now I can no longer mount any partitions. dmesg comes up with a handful of error scenarios like operation timeout, unreadable partition layout and some others, but this is what always appears, save for the last two or so lines:
[88018.088763] usb 2-1.1: new high-speed USB device number 21 using ehci-pci
[88018.244814] usb 2-1.1: New USB device found, idVendor=1058, idProduct=2627, bcdDevice=40.08
[88018.244826] usb 2-1.1: New USB device strings: Mfr=2, Product=3, SerialNumber=1
[88018.244831] usb 2-1.1: Product: My Passport 2627
[88018.244834] usb 2-1.1: Manufacturer: Western Digital
[88018.244838] usb 2-1.1: SerialNumber: 575835324435314446463443
[88018.245319] usb-storage 2-1.1:1.0: USB Mass Storage device detected
[88018.245735] scsi host7: usb-storage 2-1.1:1.0
[88019.268518] scsi 7:0:0:0: Direct-Access WD My Passport 2627 4008 PQ: 0 ANSI: 6
[88019.269532] scsi 7:0:0:1: Enclosure WD SES Device 4008 PQ: 0 ANSI: 6
[88019.276098] ses 7:0:0:1: Attached Enclosure device
[88019.279553] sd 7:0:0:0: [sdc] Spinning up disk...
[88019.280879] ses 7:0:0:1: Wrong diagnostic page; asked for 1 got 8
[88019.280882] ses 7:0:0:1: Failed to get diagnostic page 0x1
[88019.280885] ses 7:0:0:1: Failed to bind enclosure -19
[88020.305372] ......ready
[88025.373290] sd 7:0:0:0: [sdc] Very big device. Trying to use READ CAPACITY(16).
[88025.373840] sd 7:0:0:0: [sdc] 7813969920 512-byte logical blocks: (4.00 TB/3.64 TiB)
[88025.373847] sd 7:0:0:0: [sdc] 4096-byte physical blocks
[88025.374930] sd 7:0:0:0: [sdc] Write Protect is off
[88025.374937] sd 7:0:0:0: [sdc] Mode Sense: 47 00 10 08
[88025.376120] sd 7:0:0:0: [sdc] No Caching mode page found
[88025.376128] sd 7:0:0:0: [sdc] Assuming drive cache: write through
[88025.553963] sdc: sdc1 sdc2 sdc3 sdc4 sdc5
[88025.557389] sd 7:0:0:0: [sdc] Attached SCSI disk
At best I can mount one of the smaller filesystems for a short while and maybe even read/write, though not the largest one, as it takes too long and the process crashes.
The partition table is roughly 1: 512MB, 2: 3.6TB, 3: 32GB, 4: 16GB, 5: 16GB.
The purpose of this layout was: the first partition is an EFI boot partition with a chainloader, two partitions are for a 32-bit and a 64-bit system, one is a shared home folder, and one is a generic storage partition.
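A rough way to relate a sector number from an error report to this layout is sketched below in Python. It only uses the 512-byte logical block size and the block count from dmesg plus the approximate sizes listed above. Everything else is an assumption made for illustration: that the first partition starts at sector 2048, that the sizes are binary units (MiB/GiB), that sdc1 to sdc5 follow each other in numeric order, and that the large second partition fills whatever the others leave over. The real start and end sectors would have to come from the GPT itself. The LBA used in the example is the one the short self-test reports in the smartctl output further down.

```python
SECTOR_BYTES = 512          # logical block size reported by dmesg
TOTAL_SECTORS = 7813969920  # logical block count reported by dmesg
FIRST_START = 2048          # ASSUMPTION: usual 1 MiB alignment for partition 1

MIB = 1024 ** 2
GIB = 1024 ** 3


def sectors(nbytes):
    """Convert a byte count to a number of 512-byte sectors."""
    return nbytes // SECTOR_BYTES


def build_layout():
    """Return [(name, first_sector, last_sector)] for the ASSUMED layout."""
    sizes = {
        "sdc1": sectors(512 * MIB),
        "sdc3": sectors(32 * GIB),
        "sdc4": sectors(16 * GIB),
        "sdc5": sectors(16 * GIB),
    }
    # ASSUMPTION: sdc2 takes whatever the other partitions leave over.
    sizes["sdc2"] = TOTAL_SECTORS - FIRST_START - sum(sizes.values())

    layout, start = [], FIRST_START
    for name in ("sdc1", "sdc2", "sdc3", "sdc4", "sdc5"):
        layout.append((name, start, start + sizes[name] - 1))
        start += sizes[name]
    return layout


def locate(lba, layout):
    """Return the name of the partition containing lba, or None."""
    for name, first, last in layout:
        if first <= lba <= last:
            return name
    return None


if __name__ == "__main__":
    lba = 469280  # LBA_of_first_error from the short self-test
    print(f"offset: {lba * SECTOR_BYTES / MIB:.2f} MiB")
    print(f"partition under the assumed layout: {locate(lba, build_layout())}")
```

Under these assumptions the failing sector sits about 229 MiB into the disk, which would put it inside the first (EFI) partition rather than the large Btrfs one.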
I also ran a short and a long SMART test with smartmontools’ smartctl, with this output:
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-5.15.59-0-lts] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Elements / My Passport (USB, AF)
Device Model: WDC WD40NDZW-11MR8S1
Serial Number: WD-WX52D51DFF4C
LU WWN Device Id: 5 0014ee 269a6cd26
Firmware Version: 02.01A02
User Capacity: 4,000,753,475,584 bytes [4.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Form Factor: 2.5 inches
TRIM Command: Available, deterministic
Device is: In smartctl database 7.3/5319
ATA Version is: ACS-3 (minor revision not indicated)
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Wed Sep 28 17:22:22 2022 EET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM level is: 128 (minimum power consumption without standby)
Rd look-ahead is: Enabled
Write cache is: Enabled
DSN feature is: Unavailable
ATA Security is: Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 105) The previous self-test completed having
the servo (and/or seek) element of the
test failed.
Total time to complete Offline
data collection: (10320) seconds.
Offline data collection
capabilities: (0x1b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
No Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 140) minutes.
SCT capabilities: (0x30b5) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate POSR-K 198 198 051 - 46
3 Spin_Up_Time POS--K 253 253 021 - 4783
4 Start_Stop_Count -O--CK 100 100 000 - 178
5 Reallocated_Sector_Ct PO--CK 200 200 140 - 0
7 Seek_Error_Rate -OSR-K 184 184 000 - 1103
9 Power_On_Hours -O--CK 100 100 000 - 689
10 Spin_Retry_Count -O--CK 100 100 000 - 0
11 Calibration_Retry_Count -O--CK 100 253 000 - 0
12 Power_Cycle_Count -O--CK 100 100 000 - 75
192 Power-Off_Retract_Count -O--CK 200 200 000 - 30
193 Load_Cycle_Count -O--CK 200 200 000 - 1648
194 Temperature_Celsius -O---K 111 106 000 - 41
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector -O--CK 200 200 000 - 16
198 Offline_Uncorrectable ----CK 100 253 000 - 0
199 UDMA_CRC_Error_Count -O--CK 200 200 000 - 0
200 Multi_Zone_Error_Rate ---R-- 100 253 000 - 0
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
ATA_READ_LOG_EXT (addr=0x00:0x00, page=0, n=1) failed: scsi error medium or hardware error (serious)
Read GP Log Directory failed
SMART Log Directory Version 1 [multi-sector log support]
Address Access R/W Size Description
0x00 SL R/O 1 Log Directory
0x01 SL R/O 1 Summary SMART error log
0x02 SL R/O 5 Comprehensive SMART error log
0x06 SL R/O 1 SMART self-test log
0x09 SL R/W 1 Selective self-test log
0x30 SL R/O 9 IDENTIFY DEVICE data log
0x80-0x9f SL R/W 16 Host vendor specific log
0xa0-0xa7 SL VS 16 Device vendor specific log
0xa8-0xb6 SL VS 1 Device vendor specific log
0xb7 SL VS 82 Device vendor specific log
0xb9 SL VS 4 Device vendor specific log
0xbd SL VS 1 Device vendor specific log
0xc0 SL VS 1 Device vendor specific log
0xe0 SL R/W 1 SCT Command/Status
0xe1 SL R/W 1 SCT Data Transfer
SMART Extended Comprehensive Error Log (GP Log 0x03) not supported
SMART Error Log Version: 1
ATA Error Count: 4
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 4 occurred at disk power-on lifetime: 688 hours (28 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 01 e0 00 00 00
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
e1 d4 01 e0 00 00 00 00 01:55:46.213 IDLE IMMEDIATE
e0 d4 01 e0 00 00 00 00 01:12:46.439 STANDBY IMMEDIATE
e7 d4 01 e0 00 00 00 00 01:12:46.439 FLUSH CACHE
2f d4 01 e0 00 00 00 00 01:11:59.485 READ LOG EXT
2f d4 01 e0 00 00 00 00 01:07:59.720 READ LOG EXT
Error 3 occurred at disk power-on lifetime: 688 hours (28 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 01 e0 00 00 00
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
e0 d4 01 e0 00 00 00 00 01:12:46.439 STANDBY IMMEDIATE
e7 d4 01 e0 00 00 00 00 01:12:46.439 FLUSH CACHE
2f d4 01 e0 00 00 00 00 01:11:59.485 READ LOG EXT
2f d4 01 e0 00 00 00 00 01:07:59.720 READ LOG EXT
2f d4 01 e0 00 00 00 00 01:03:59.955 READ LOG EXT
Error 2 occurred at disk power-on lifetime: 688 hours (28 days + 16 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 61 01 e0 00 00 00
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
e7 d4 01 e0 00 00 00 00 01:12:46.439 FLUSH CACHE
2f d4 01 e0 00 00 00 00 01:11:59.485 READ LOG EXT
2f d4 01 e0 00 00 00 00 01:07:59.720 READ LOG EXT
2f d4 01 e0 00 00 00 00 01:03:59.955 READ LOG EXT
2f d4 01 e0 00 00 00 00 01:00:00.190 READ LOG EXT
Error 1 occurred at disk power-on lifetime: 686 hours (28 days + 14 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
04 51 01 ff ff ff 0f
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
e7 d6 01 98 4f c2 00 00 00:03:27.702 FLUSH CACHE
b0 d6 01 98 4f c2 00 00 00:03:02.248 SMART WRITE LOG
b0 d5 01 80 4f c2 00 00 00:02:10.316 SMART READ LOG
ec 90 01 00 00 00 00 00 00:02:10.315 IDENTIFY DEVICE
2f 90 01 00 00 00 40 00 00:02:10.314 READ LOG EXT
SMART Extended Self-test Log (GP Log 0x07) not supported
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: servo/seek failure 90% 687 -
# 2 Short offline Completed: read failure 50% 687 469280
Selective Self-tests/Logging not supported
SCT Status Version: 3
SCT Version (vendor specific): 258 (0x0102)
Device State: Active (0)
Current Temperature: 41 Celsius
Power Cycle Min/Max Temperature: 33/46 Celsius
Lifetime Min/Max Temperature: 22/46 Celsius
Specified Max Operating Temperature: 38 Celsius
Under/Over Temperature Limit Count: 0/0
Vendor specific:
03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
SCT Temperature History Version: 2
Temperature Sampling Period: 1 minute
Temperature Logging Interval: 1 minute
Min/Max recommended Temperature: 0/60 Celsius
Min/Max Temperature Limit: -41/85 Celsius
Temperature History Size (Index): 128 (72)
Index Estimated Time Temperature Celsius
73 2022-09-28 15:18 42 ***********************
74 2022-09-28 15:19 42 ***********************
75 2022-09-28 15:20 42 ***********************
76 2022-09-28 15:21 ? -
77 2022-09-28 15:22 28 *********
78 2022-09-28 15:23 ? -
79 2022-09-28 15:24 33 **************
80 2022-09-28 15:25 34 ***************
81 2022-09-28 15:26 35 ****************
82 2022-09-28 15:27 35 ****************
83 2022-09-28 15:28 36 *****************
84 2022-09-28 15:29 36 *****************
85 2022-09-28 15:30 37 ******************
86 2022-09-28 15:31 37 ******************
87 2022-09-28 15:32 38 *******************
88 2022-09-28 15:33 38 *******************
89 2022-09-28 15:34 38 *******************
90 2022-09-28 15:35 39 ********************
91 2022-09-28 15:36 39 ********************
92 2022-09-28 15:37 39 ********************
93 2022-09-28 15:38 40 *********************
... ..( 4 skipped). .. *********************
98 2022-09-28 15:43 40 *********************
99 2022-09-28 15:44 41 **********************
... ..( 2 skipped). .. **********************
102 2022-09-28 15:47 41 **********************
103 2022-09-28 15:48 42 ***********************
... ..( 8 skipped). .. ***********************
112 2022-09-28 15:57 42 ***********************
113 2022-09-28 15:58 43 ************************
... ..( 15 skipped). .. ************************
1 2022-09-28 16:14 43 ************************
2 2022-09-28 16:15 44 *************************
... ..( 5 skipped). .. *************************
8 2022-09-28 16:21 44 *************************
9 2022-09-28 16:22 45 **************************
... ..( 4 skipped). .. **************************
14 2022-09-28 16:27 45 **************************
15 2022-09-28 16:28 46 ***************************
16 2022-09-28 16:29 45 **************************
... ..( 4 skipped). .. **************************
21 2022-09-28 16:34 45 **************************
22 2022-09-28 16:35 44 *************************
... ..( 2 skipped). .. *************************
25 2022-09-28 16:38 44 *************************
26 2022-09-28 16:39 43 ************************
... ..( 5 skipped). .. ************************
32 2022-09-28 16:45 43 ************************
33 2022-09-28 16:46 42 ***********************
... ..( 10 skipped). .. ***********************
44 2022-09-28 16:57 42 ***********************
45 2022-09-28 16:58 41 **********************
... ..( 22 skipped). .. **********************
68 2022-09-28 17:21 41 **********************
69 2022-09-28 17:22 40 *********************
70 2022-09-28 17:23 41 **********************
71 2022-09-28 17:24 41 **********************
72 2022-09-28 17:25 41 **********************
SCT Error Recovery Control command not supported
Device Statistics (GP/SMART Log 0x04) not supported
Pending Defects log (GP Log 0x0c) not supported
ATA_READ_LOG_EXT (addr=0x11:0x00, page=0, n=1) failed: scsi error medium or hardware error (serious)
Read SATA Phy Event Counters failed
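To pull the relevant counters out of a dump like the one above, here is a small Python sketch that scans the attribute table for the sector-health attributes (5, 197, 198) and lists any with a non-zero raw value. The choice of those three IDs is a common rule of thumb, not something smartctl itself flags, and the parser assumes the eight-column brief table format shown in this paste (ID, name, flags, value, worst, thresh, fail, raw). The sample rows are copied from the output above.

```python
# Attributes whose raw value is commonly watched for media trouble.
# ASSUMPTION: this selection is a rule of thumb, not a smartctl feature.
WATCHED = (5, 197, 198)

SAMPLE = """\
1 Raw_Read_Error_Rate POSR-K 198 198 051 - 46
5 Reallocated_Sector_Ct PO--CK 200 200 140 - 0
7 Seek_Error_Rate -OSR-K 184 184 000 - 1103
197 Current_Pending_Sector -O--CK 200 200 000 - 16
198 Offline_Uncorrectable ----CK 100 253 000 - 0
"""


def parse_attributes(text):
    """Map attribute ID -> (name, raw value) for rows in the brief format."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        # Expect: ID NAME FLAGS VALUE WORST THRESH FAIL RAW
        if len(parts) < 8 or not parts[0].isdigit():
            continue
        try:
            raw = int(parts[7])
        except ValueError:
            continue  # skip rows whose raw field is not a plain integer
        rows[int(parts[0])] = (parts[1], raw)
    return rows


def flagged(rows):
    """Return watched attributes that have a non-zero raw value."""
    return [
        (attr_id, rows[attr_id][0], rows[attr_id][1])
        for attr_id in WATCHED
        if attr_id in rows and rows[attr_id][1] > 0
    ]


if __name__ == "__main__":
    for attr_id, name, raw in flagged(parse_attributes(SAMPLE)):
        print(f"{attr_id} {name}: raw={raw}")
```

On the rows above, the only watched attribute with a non-zero raw value is Current_Pending_Sector at 16, which fits the read failure logged by the short self-test.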
I suspect the drive has experienced a shock bad enough to mess with the heads in some way.
Also, I’ve had refurbished, decade-old, abused, rack-mountable (but never rack-mounted) WD hard drives that lasted longer than this thing has: 99% of its hours it has spent chilling on top of my tower, idling, waiting for more data dumps. What do?
P.S. My system is running Alpine 3.16 with Linux 5.15.59-LTS, and the other PC runs Artix with Linux 5.19-Zen.