Reining in the Gen2

OK, that is odd. The gen1 and gen2 use the same kernel source tree, exactly the same version (at least as of 4.04.113). This suggests that I only need to compile the zram modules once, since both systems use ARMv7 CPUs, and both kernels use the same page size on the same kernel source tree (barring some special quirks introduced by the build tools).
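If you want to double-check that assumption on both boxes before copying a module across, the kernel release and page size are easy to query. A minimal sketch; the 64k figure in the comment is what this thread reports for these boxes, not a universal value:

```shell
# Sanity check before reusing one module binary on both gens:
# the kernel release and page size should match on gen1 and gen2.
uname -r            # kernel release the module was built against
getconf PAGESIZE    # reportedly 65536 (64k) on these boxes; 4096 on most PCs
```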

However, I still cannot build a working zram+pals, for levels of “working” greater than 1 page of memory. Something is clearly broken when zsmalloc is built as a module. (I can write to a 40k zram0 just fine, for instance, but cannot write to a 400k one without breaking the system. IIRC, this kernel uses a 64k page size.) This is with a slightly modified zsmalloc that works around the non-exported [unmap_kernel_range] symbol from vmalloc by using the exported symbol [unmap_kernel_range_noflush] instead, flushing manually in exactly the same way that vmalloc flushes in [unmap_kernel_range]. (An ugly hack, implemented because we cannot change the vmalloc built into the running kernel.)

For clarity: our kernel’s vmalloc has this function, but does not export the symbol for it. I looked at the function, saw how it does its flushing, and wrapped the surrogate function in the appropriate calls so that it is functionally identical.
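Pieced together from that description, the wrapper would look something like this. A sketch only: the function name `zs_unmap_kernel_range` is mine, and it assumes `flush_cache_vunmap` and `flush_tlb_kernel_range` are usable from module context on this ARM build, since upstream `unmap_kernel_range` is essentially just those two flushes around the noflush variant:

```c
#include <linux/vmalloc.h>
#include <asm/cacheflush.h>
#include <asm/tlbflush.h>

/*
 * Surrogate for the unexported unmap_kernel_range(): the exported
 * _noflush variant tears down the kernel page tables, and we
 * reproduce the cache/TLB flushes the original performs around it.
 */
static void zs_unmap_kernel_range(unsigned long addr, unsigned long size)
{
	unsigned long end = addr + size;

	flush_cache_vunmap(addr, end);           /* flush before unmapping   */
	unmap_kernel_range_noflush(addr, size);  /* exported symbol          */
	flush_tlb_kernel_range(addr, end);       /* drop stale TLB entries   */
}
```

If the corruption described later in this thread is real, this is exactly the spot to scrutinize: get either flush wrong on a 64k-page ARM box and you can poison the cache or TLB for neighboring mappings.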

Without getting to know zsmalloc very, very intimately, I am not sure how to address this. At this point, I am wishing that the maintainer of the zram+pals code had accepted the proposed patches to allow zram to use a zpool allocator backed by zbud; instead, he staunchly refused to integrate them. zbud gives significantly less compression, but it is a much less complicated allocator. It is intended for zswap, which tries to hold onto pages that would normally get swapped to disk by compressing them and keeping them in RAM until pressure forces a disk write, but zswap cannot be built as a module. (It would probably be more ideal for our NAS boxes than zram-backed swap, since it would protect us from hitting the disk without all the overhead, but sadly it can only be a builtin, so it requires a custom kernel.) The maintainer seems very strict on enforcing separation there, insisting that zsmalloc is for zram, zbud is for zswap, and ne’er shall the two meet. That is his prerogative; it is his project. But I would rather deal with the simple allocator at this point. (zbud only allocates pages in pairs, whereas zsmalloc allocates byzantine combinations of pages, does memory compaction, and a bunch of other things that complicate trying to understand why multipage allocation is failing.)

Our kernel tree does not know anything about zbud or zswap, because those both come from the Red Hat backport I manually merged into the tree. Support for them exists only in my modified local working copy.

It is worth noting that the “cannot allocate more than 1 page” problem on zram also exists with the unmodified staging version present in the base source tree.

This suggests that this system does memory allocations in some way that the maintainers of zram and pals do not know about.

It is true that both gen1 and gen2 use a 64k page size. But a program compiled for the gen1 will not necessarily run on a gen2. If you compile the program with -static, however, it will run on either system.

Kernel modules do not reference libraries the way user-mode code does. Rather, they all live inside the kernel, but may expose interfaces to each other. This is accomplished with exported functions and symbols.
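As a tiny illustration of that mechanism (a kernel-module sketch; the names are hypothetical): one module publishes a function with EXPORT_SYMBOL, and any module loaded afterwards can link against it, with the symbol resolved at insmod time.

```c
#include <linux/module.h>

/* Provider module: make demo_hook() visible to other modules. */
int demo_hook(int x)
{
	return x * 2;
}
EXPORT_SYMBOL(demo_hook);  /* EXPORT_SYMBOL_GPL() limits it to GPL modules */

/*
 * Consumer module (built separately) just declares and calls it:
 *     extern int demo_hook(int);
 * If the running kernel never exported the symbol, the consumer
 * fails to load -- which is exactly the unmap_kernel_range problem
 * the zsmalloc hack above works around.
 */
```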

Thanks, I understand that. Just thought I would let you know about the problem of running gen1 programs on a gen2.

It is entirely possible that I did not wrap the doctored function correctly, that I am corrupting the cache table when this code gets invoked, and that in the unpatched version the unexported function leads to similar problems.

The CORRECT way to fix this is to recompile the kernel with zsmalloc incorporated as a builtin. But WD decided it needed to make this difficult.
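For reference, a rebuilt kernel would carry something like the following in its .config (option names as in mainline of that era; verify against this particular tree, since zsmalloc spent time in staging and the symbols moved around):

```
CONFIG_ZSMALLOC=y
CONFIG_ZRAM=m
```

With zsmalloc builtin, it links against vmalloc internals directly and the whole exported-symbol workaround becomes unnecessary.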

One other thing. Both the gen1 and gen2 kernels are compiled from the same source code, but there is a difference in the compiled kernels: the gen1 will allow blktrace to run, while on the gen2 it gets an error.

I’ve just soldered a connection according to this photo. I tried 115200,8,N,1 but got no output. The WD My Cloud is not fully booting either. It is hard to judge the color of the LED, but it is somewhere between red and blue…

After unplugging the USB-to-UART adapter from the computer and rebooting the WD My Cloud, the same thing happens. Only after disconnecting the adapter does the WD My Cloud boot normally. Phew. For a moment I thought I had made a terrible mistake :slightly_smiling_face:.

Have you solved it? Or does someone else have a suggestion? It is a gen2 (according to the memory, CPU, and other indications), although the serial number on the box suggests a gen1. I bought it a few weeks ago…

I’ve solved it myself :grinning:. The photo is wrong! I found the correct pinout through images on Google, and further searching turned up this file. The GND is the second pin (the square pad is pin 1, labeled on the board with an arrow, and the pin at the edge/bottom is labeled 5), not the third as in the photo. I confirmed this with a multimeter. After switching, I got the complete boot log. It seems as if it then stops, but from other embedded devices I learned that hitting Enter usually reveals the login prompt. As is the case for the WD My Cloud.

I’ll post the boot log on my lab journal website. But first, something to eat :fries:


As some info to add back to this thread: the DSM 5.2 build that Fox_exe created unexpectedly includes a working zram module. It isn’t advertised or loaded automatically, but it can be loaded successfully and works on the My Cloud gen1. That now provides one workaround of sorts to leverage zram swap on these especially RAM-limited WD devices. I think his build of OMV allows the same, and his Clean Debian more likely still. It’s a real shame WD has never enabled it in one of their own firmware updates, which are still needed if you are using your My Cloud for WDSync.