What WD really needs to do is implement something similar to robots.txt in their firmware.
Basically, when the media scanning daemon starts crawling the device, it looks for a marker file (analogous to robots.txt) in each folder; if it finds one, it knows not to index that folder or its contents. This would allow much finer control over the silly thing, since it would not waste hundreds of hours of disk access crawling every single file in every single folder and indexing things you never want showing up in your media archive. (Like button images and the like.)
Even better would be an IndexMe.txt magic file: if the daemon finds that, it indexes the folder, and otherwise it just passes over the folder and its subfolders.
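To make the idea concrete, here is a rough sketch of what a marker-file-aware crawl could look like. This is purely illustrative: the ".noindex" marker name and the index_tree function are made up for this example, and the echo stands in for whatever the real daemon does per file.

```shell
#!/bin/sh
# Hypothetical sketch of opt-out crawling via a marker file.
# ".noindex" is an invented name; WD's daemon does not support this today.

index_tree() {
    dir="$1"
    # Opt-out marker: prune this folder and everything under it.
    [ -e "$dir/.noindex" ] && return 0
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue       # skip if the glob matched nothing
        if [ -d "$entry" ]; then
            index_tree "$entry"           # recurse into subfolders
        else
            echo "indexing: $entry"       # stand-in for the real index call
        fi
    done
}

# Usage: index_tree /shares/Public
```

The opt-in IndexMe.txt variant would just invert the test: return unless the marker is present.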
There is a significant problem with the indexing service: it pretends it has infinite memory available, and pretends its scans will complete quickly. Both assumptions are wrong. The MyCloud devices are memory-constrained embedded systems with either 256 MB (gen1) or 512 MB (gen2) of RAM installed. The MyCloud caches the index in memory (and apparently never writes it to disk, unless you count swap operations…), so it has to rebuild the index all the darn time. If a very large and complex filesystem is attached, crawling it will take AAAAGGGEEES. If the daemon finds lots and lots of files to index, the index will swell to absurd sizes, compete with valuable system daemons for RAM, and cause I/O contention as the SATA controller gets hammered between shuttling data in and out of swap like a crack monkey and continuing to pound the filesystem to keep indexing.
Here is my suggestion to you, WD:
Step 1) Enable zsmalloc and zram in the kernel (CONFIG_ZSMALLOC and CONFIG_ZRAM).
Step 2) Enable zram as a swap backend device with a higher priority than the internal HDD’s swap partition.
Step 3) Change the indexing service’s behavior to check for magic files that control indexing behavior.
Step 4) Change the default memory cap in your daemon’s invocation to something sensible for your hardware, like 100 MB instead of 300 MB.
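At runtime, steps 1, 2, and 4 boil down to something like the following. This is a sketch, not WD's actual init scripts: it assumes a kernel built with CONFIG_ZRAM/CONFIG_ZSMALLOC, must run as root, and the 128 MB size, priority 100, and ulimit value are illustrative numbers.

```shell
# Sketch only: assumes zram is compiled in; sizes/priorities are examples.
modprobe zram num_devices=1
echo 128M > /sys/block/zram0/disksize   # compressed swap pool held in RAM
mkswap /dev/zram0
swapon -p 100 /dev/zram0                # outranks the HDD swap partition,
                                        # so zram is exhausted first

# Step 4 could then be enforced in the daemon's init script, e.g.:
# ulimit -v 102400   # cap the indexer at ~100 MB of virtual memory
```

Because swapon priorities are honored highest-first, the kernel will only spill to the on-disk swap partition once the zram device is full.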
This will have the following effects for your hardware:
- The device will try to use compressed swap first, until it exhausts it. Compressed swap comes at a CPU cost, but the MyClouds are dual-core SoCs. The performance hit of compressed RAM-backed swap is significantly lower than hitting external storage, and it does not incur the same I/O penalties.
- Because the compressed RAM-backed swap is not stored on the HDD, the occasional blip on it will not wake the drive from sleep.
- The “magic files” control method will permit the indexing service to look only in places the user specifies, greatly improving the completion speed of the indexing pass.
- Limiting the amount of memory your daemon can use to something sensible will ensure that it does not saturate the compressed RAM-backed swap, and will allow the system to quiesce more reliably. (At the possible expense of a smaller realistic index size.)
I have pulled the sources for your 2.xxx firmwares, and the zram sources are present. You just need to enable the necessary kernel options and push a new firmware with zram enabled. Granted, zram is a staging driver in the ancient source tree you are using-- but it was promoted out of staging just a few minor revisions later. The version you forked has it as a fairly reliable staging driver that should be usable in production.
Make these changes, and your issue will be greatly alleviated.