Mysteries of Gen2 volume management


#1

I recently debricked my Gen2, which had been sitting around basically gutted after its HDD developed some bad sectors (my darling kitties knocked it off the shelf while I was at work). It is currently running on a WD Green drive I had kicking around, but I have specific plans for an alternative; see below.

During the debrick process, I first wanted to use a gimmicky SD card to SATA adapter (it combines 4 SD cards into a single SATA volume using RAID0), but I determined that while the unit will indeed boot from the thing, the mycloud will not allow you to set it up as a volume, because the SMART capabilities of the adapter are… Ahem… Not quite up to spec. I want to use it for a few basic reasons. 1) I already have it, and it is taking up space. 2) If my darling kitties knock it over again, big whoop. 3) The unit will be literally silent after modification. 4) I could theoretically pack it up for a trip if I want, and not have to worry too much.

What do I mean by “set it up as a volume”?

Well, before a debricked mycloud will use the second partition as user storage, it wants to jiggery-pokery it with its JBOD implementation first. Before the system’s GUI will let you do that, it needs to perform a SMART drive self test to assure the drive is healthy. (For a poor man’s ssd implementation, like that gimmicky adapter, this poses a significant problem.)

I know that this is all controlled by some XML files scattered all over the place on this thing. Has anyone researched which of them I would need to doctor to get this working?

Alternatively-- Can somebody with CSS/Web savvy give me a magic doodad to disable the SMART test phase of the setup process? (EG, always return success message, even when it fails?)

I still want to use this adapter. I don’t care that real SSDs are cheaper, faster, more reliable, and have real SMART implementations. That would require me to buy something, and I have SD cards and this adapter lying around taking up space.


#2

Why do you think that the sda2 partition is JBOD? Is there a command that would show this status? I know that the Gen2 sets up sda1 as a one disk raid. This is the section of the boot script that sets up the raid.
md0 state is ▒
mdadm: /dev/sda1 appears to be part of a raid array:
level=raid1 devices=2 ctime=Wed Jun 6 16:09:09 2018
mdadm: size set to 2097088K
mdadm: creation continuing despite oddities due to --run
md: bind
md/raid1:md0: active with 1 out of 2 mirrors
md0: bitmap file is out of date (0 < 1) -- forcing full recovery
created bitmap (16 pages) for device md0
md0: bitmap file is out of date, doing full recovery
md0: bitmap initialized from disk: read 2 pages, set 262136 of 262136 bits
md0: detected capacity change from 0 to 2147418112
mdadm: array /dev/md0 started.
md0: unknown partition table
Setting up swapspace version 1, size = 2147385344 bytes
UUID=c8e46763-dd38-40a6-9b84-b45cfa3ab35f
Adding 2097056k swap on /dev/md0. Priority:-1 extents:1 across:2097056k
mdadm: '1' is an unusual number of drives for an array, so it is probably
a mistake. If you really mean it you will need to specify --force before
setting the number of drives.

I don’t like the Gen2 busybox setup. Very difficult but not impossible to compile a missing program. Trying to restart samba is convoluted. They don’t use any scripts. They have special programs named smbcmd smbcom smbac smbgp smbcv smbif smbwddb with no documentation on any of them. They are also not in the GPL source.


#3

Tell me about it. >.<

The undocumented commandlet involved in this dirty process is “diskmgr”. You can get some (really bad) documentation by calling it with -6 (--show-example), but I don’t know exactly what the CGI call is feeding the ■■■■■■■ thing.

As for why I assert it is JBOD-- When you do a clean rebuild of the device, the first thing it wants to do after you push the firmware over is jump you straight to the RAID manager, and have you set up the volume as JBOD with a single disk.

It does some magic behind the scenes, and adds the volume to its XML whack-stack then reformats the volume.

I am thinking this is what I am gonna do:

  1. Rebuild the mycloud with the 3TB volume again.
  2. Before doing the JBOD thing in the RAID manager, use tar to create a complete dump of the entire file system of the device to a remote store.
  3. Do the JBOD thing.
  4. Do another tar dump of the file system to a new remote store.
  5. Run a diff against the two.
  6. Evaluate what the RAID manager actually DID to register the volume.
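The snapshot-and-diff plan above could be sketched roughly like this. It is only an illustration: the function names, the excluded pseudo-filesystems and all paths are assumptions, and whether the tar build on the device accepts `--exclude` has not been checked in this thread.

```shell
#!/bin/sh
# Sketch only: helper names and paths are hypothetical, not taken from the device.

# Archive a directory tree into a tarball, skipping pseudo-filesystems.
snapshot() {
    # $1 = root of the tree to archive, $2 = tarball to write (outside the tree)
    tar -C "$1" --exclude=./proc --exclude=./sys --exclude=./dev -cf "$2" .
}

# Unpack two snapshots side by side and list files that differ
# or that exist in only one of them.
compare_snapshots() {
    # $1 = "before" tarball, $2 = "after" tarball
    work=$(mktemp -d) || return 1
    mkdir "$work/before" "$work/after"
    tar -C "$work/before" -xf "$1"
    tar -C "$work/after" -xf "$2"
    # diff exits with 1 when it finds differences, which is the interesting case
    (cd "$work" && diff -rq before after)
    status=$?
    rm -rf "$work"
    return "$status"
}
```

Run `snapshot` once before and once after the JBOD step, copy both tarballs to another machine, and call `compare_snapshots before.tar after.tar` there. Volatile files (logs, caches) will show up too, so expect some noise in the list.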

#4

system_init calls diskmgr --genlink /usr/sbin/. --genlink does not show in the usage, but strings diskmgr|grep --genlink does find --genlink. The other place that calls diskmgr is raid_init.sh, but I can’t find what calls that script, or from where.
Also, diskmgr plays with a file called dm_volume_info.xml, but this file does not seem to exist. It is supposed to be in /var/www/xml.
Do you know of a way to tell if a disk is in a JBOD configuration?
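One generic place to look, assuming the volume is handled by Linux md at all (nothing in this thread establishes that for sda2), is /proc/mdstat. The sketch below lists each array with its level and member devices; the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch only: md_members is a hypothetical helper name.
# Print "array level members..." for every md array in an mdstat-format file.
# $1 = file to read (defaults to /proc/mdstat)
md_members() {
    awk '/^md[0-9]+ *:/ {
        name = $1; level = "unknown"; members = ""
        for (i = 3; i <= NF; i++) {
            if ($i ~ /^(raid[0-9]+|linear)$/)
                level = $i                      # personality, e.g. raid1 or linear
            else if ($i ~ /\[[0-9]+\]/)
                members = (members == "" ? $i : members " " $i)
        }
        print name, level, members
    }' "${1:-/proc/mdstat}"
}
```

In md terms a JBOD-style concatenation would normally appear as level `linear`. If a partition is not listed as a member of any array, that would suggest it is being mounted as a plain partition rather than through md.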


#5

Nope. This is some black-box bologna.

After work I will do the needful to investigate exactly what changes the RAID manager does with its XML crazy.


#6

OK, I think I have a clever means of figuring out what the cgi/bin is passing to diskmgr when it does the volume thing.

Rename diskmgr to something else
Make a shell script named diskmgr (in the appropriate place) containing this:

#!/bin/bash

# Append every argument of each call, so repeated calls are not overwritten.
echo "$@" >> /home/diskmgrcall.txt

Make the script executable.

Do the thing with the raid manager and see what gets caught. I will do this shortly.
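A variation on the same trick, sketched below, logs each call and then hands it on to the renamed binary so the caller still gets a real result. The name `diskmgr.real` and the helper `make_wrapper` are assumptions for illustration, as is /bin/sh being suitable on the device.

```shell
#!/bin/sh
# Sketch only: make_wrapper and the ".real" naming are hypothetical.
# Write a logging pass-through wrapper.
# $1 = path where the wrapper is created (the original command name)
# $2 = path of the renamed real binary
# $3 = log file
make_wrapper() {
    cat > "$1" <<EOF
#!/bin/sh
# Record a timestamp, the command name and every argument, one call per line.
echo "\$(date '+%F %T') \$0 \$*" >> "$3"
# Hand over to the real binary so the caller still sees its normal behaviour.
exec "$2" "\$@"
EOF
    chmod +x "$1"
}
```

For example, after renaming the binary: `make_wrapper /usr/sbin/diskmgr /usr/sbin/diskmgr.real /home/diskmgrcall.txt`. The same wrapper would work for any other command the web UI shells out to.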

I have already caught what the command it sends for disk self test is:

smartctl -s on -l selftest /dev/sda

If you make a shell surrogate for smartctl that contains this:

#!/bin/bash
echo "smartctl version 5.38 [arm-marvell-linux-gnueabi] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Enabled.

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 8808 -
# 2 Short offline Completed without error 00% 0 -"

then it effectively skips disk checking and lets you jump straight to the JBOD stuff. However, it jammed up hard as soon as I tried to do the JBOD thing. I suspect that this is because I am testing on a tiny 4GB SD card, with nonstandard partition sizes for everything, and the sizes for the other partitions are hard-coded into diskmgr.

(edit)
Confirmed. Booted Fox-exe’s recovery USB jobber, and this is the new partition table…

Model: ATA FC-1307 SD to CF (scsi)
Disk /dev/sda: 3964MB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system     Name                  Flags
 7      17.4kB  359kB   342kB   ext2            Linux filesystem
 3      359kB   1049kB  690kB   ext2            Linux filesystem
 1      1049kB  2149MB  2147MB  linux-swap(v1)  Linux swap
 4      2149MB  3222MB  1074MB  ext4            Microsoft basic data
 5      3222MB  3964MB  741MB                   Linux filesystem

Seems you cannot use smaller partitions, since the commandlet will nuke the whole partition table and rebuild everything! (The build failed because the total size of the partitions it wants to make is larger than the media! That’s why partitions 2 and 6 are missing!!)

I am still going to try catching what the ■■■■ thing is being fed when jbod config is called.


#7

Where is JBOD config located?


#8

Under the RAID manager. It only shows up on a fresh debrick, or with the right options poked and wdcrack enabled.

If you mean the XML files… There are some under /usr/local/config and some more in /var/www. I have not dug that deeply yet.

current best guess:

diskmgr produces the volatile files in /var/www on boot, compares the data with the persistent config files in /usr/local/config, then does the needful to mount /dev/sda2 and /dev/sda4. Does this every boot.


#9

I dug through my drawers today, and found a suitable 64GB card for testing.

After doing the smartctl surrogate fakeout trick, I was able to get the raid manager to succeed. It was definitely an issue with the disk’s size. Seems to be working peachy now.

I reformatted the /dev/sda2 partition manually so that suitable stride and stripe-width values are in place for the card (I don’t want it to burn up). I am going to create a filesystem on /dev/md0, so that the system does not use swap. Swap on an SD card is bad juju.
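For anyone repeating this, the usual ext4 arithmetic is stride = chunk size / filesystem block size, and stripe width = stride × number of data-bearing devices. The sketch below just does that sum and prints a command line. The 64 KiB chunk size is an assumption for illustration (the adapter's real chunk size is not known here), and the helper name is made up; the actual values used on this card are not given above.

```shell
#!/bin/sh
# Sketch only: ext4_stripe_opts is a hypothetical helper; the numbers are assumed.
# $1 = RAID chunk size in KiB, $2 = filesystem block size in KiB,
# $3 = number of data-bearing devices (4 SD cards in RAID0 here)
ext4_stripe_opts() {
    stride=$(( $1 / $2 ))
    width=$(( stride * $3 ))
    echo "stride=${stride},stripe_width=${width}"
}

# Print the command rather than run it, since mkfs destroys existing data.
echo "mkfs.ext4 -b 4096 -E $(ext4_stripe_opts 64 4 4) /dev/sda2"
```

With the assumed 64 KiB chunk, 4 KiB blocks and 4 cards this prints `mkfs.ext4 -b 4096 -E stride=16,stripe_width=64 /dev/sda2`. Swap in the real chunk or erase-block size for the cards and adapter before using it.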

Initial use testing shows I get acceptable write speeds over a wireless AC connection (21mb/sec). Real-world read speed over the same link is also ~21mb/sec. This tells me the adapter is fast enough for service, and that the network is the bottleneck. YAY.

Looks like this plan to make a kitty-proof version with old junk is gonna pan out. Still gonna do research on volume stuff, but later. Sometime tomorrow.