Warning: Firmware Release 1.02.36 - High CPU

If you’re considering installing the new 1.02.36 firmware, be careful.

After doing the firmware upgrade, the UI/Cloud access is VERY slow…

The php-fpm process is consistently taking up 80-100% of the CPU all the time, even after a fresh reboot (and with no access other than SSH).

It takes 3-5 minutes to log in…
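
For anyone who wants to check whether they are seeing the same symptom, here is a minimal sketch of how the php-fpm usage can be pulled out of top over SSH. It assumes the top on this firmware accepts batch mode (-b -n 1), which may not hold for every build, and `filter_proc` is just a helper name made up for this example.

```shell
# Hypothetical helper (not part of the firmware): keep only the lines
# of `top` batch output on stdin that mention the given process name,
# and drop the grep process itself if top happened to list it.
filter_proc() {
    grep -- "$1" | grep -v 'grep'
}

# Usage over SSH, assuming this top build supports -b (batch) and -n (iterations):
#   top -b -n 1 | filter_proc php-fpm
```

Running it a few times in a row, with the dashboard closed, shows whether the load is constant or only appears during logins.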

Hi  ckc123,

Does this issue exist after a reboot?  If the symptom still exists, is the same process the culprit?

Thanks,

Yes, it does, even after a factory reset.

It’s always the same process taking up all the CPU.

I’m online with a WD technician right now who is looking into it. He’s collecting info to send to the engineers for a detailed look.

Hi  ckc123,

Oh, yes I figured that out… (I head up that effort).  Thanks for taking our call by the way…

WD daytrdr and ckc123:

I posted some info on this -> http://community.wd.com/t5/WD-My-Cloud-EX4/EX4-Firmware-Release-1-02-36/m-p/722394#M1565

Of course daytrdr already knows about this info since his team is developing this stuff.

These are not “idle” processes that are not being killed…  they are actively using up as much CPU as is available on the machine…

As soon as the machine starts up with NO client connections attempted (confirmed with SSH), it’s starting up and using all 5 of the maximum “Servers”.

Yes… they are using about 11-12% CPU each for the 3 processes on my machine… BUT they are idle nonetheless. Unless you are actively using the dashboard, they go idle. Idle doesn’t mean they are not going to use CPU cycles.

They are not idle, as you can see from the impact: it takes over 30 seconds to log in and to use the menus once you do…
You can see it continually using up as much CPU as it can (via the top command in SSH).

Ckc123:

Let’s try something that doesn’t make sense. :)

Log in via SSH and issue the command

netstat -a | grep EST

The only thing listed should be your SSH connection. Is that the case?
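
A slightly stricter variant of that check, as a sketch: `netstat -a` resolves host names, and `grep EST` matches any line that happens to contain those letters, so the version below uses numeric output and tests the state column exactly. It assumes the device's netstat accepts -t and -n and that awk is available; `established_only` is a made-up helper name.

```shell
# Hypothetical helper: from `netstat -tn` output on stdin, print only
# the TCP lines whose state (6th column) is exactly ESTABLISHED.
established_only() {
    awk '$1 ~ /^tcp/ && $6 == "ESTABLISHED"'
}

# Usage over SSH (assumes netstat supports -t for TCP and -n for numeric addresses):
#   netstat -tn | established_only
```

If only the SSH session is connected, a single line with the SSH port as the local address should come back.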

Yes, and as I have CONTINUOUSLY said…  there is high CPU usage (75%-100%) without a single connection to the server…

When you do try to log in, it takes about 30 seconds…

The only connection was my SSH session…

Does anyone know how to get changes to the /etc/php/php-frm.conf file to persist after a reboot?  I assume it’s loaded from an OS image into memory and run from there…

That, or which parent process starts the lighthttp server (since there is no script in init.d to start/stop it)?

I can modify the conf file (enabling debug) and restart, but I’m not sure which process I need to restart.
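
One way to approach the parent-process question, as a rough sketch: on Linux every process records its parent PID in /proc/<pid>/status. The example assumes the web server process is actually named lighttpd and that pidof exists on the device; both are guesses, and `ppid_from_status` is a hypothetical helper.

```shell
# Hypothetical helper: print the parent PID stored in a
# /proc/<pid>/status file (the line that starts with "PPid:").
ppid_from_status() {
    awk '/^PPid:/ { print $2 }' "$1"
}

# Usage over SSH (process name and pidof availability are assumptions):
#   pid=$(pidof lighttpd | awk '{ print $1 }')
#   ppid=$(ppid_from_status "/proc/$pid/status")
#   tr '\0' ' ' < "/proc/$ppid/cmdline"; echo
```

The last usage line prints the parent's command line, which should hint at what launches the web server even without an init.d script.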

Hi ckc123,

Can you please power off your device and remove your drives (when convenient)?  Then, power the device back on without the drives in (remember to label or organize your drives in a way that you will remember which drive goes into which drive bay).  Does the problem persist?

daytrdr wrote:

Can you please power off your device and remove your drives (when convenient)?  Then, power the device back on without the drives in ( remember to label or organize your drives in a way that you will remember which drive goes into which drive bay ).  Does the problem persist?

Daytrdr, does this bold part really matter or are you just being diligent?

The reason I ask is that I am contemplating swapping out my unit, and I have asked your staff if I can just pull the drives out of one unit and put them in another and it will recognise them.  Your/WD T2 staff says it “should” work, but adds the disclaimer “back up your data prior to trying in case there are problems”…  (basically telling me they don’t know)

Do you know for a fact?

All drives are JBOD. (In case you were wondering)

So, two new things which seem to have helped.

I’ve disconnected 2 of the drives.

System has:

1   4TB WD Red
2   4TB WD Red
3   2TB WD Green (From 4 TB MyBook World)
4   2TB WD Green (From 4 TB MyBook World)

I’ve disconnected both of the WD Greens, and left the NAS running for 2 days at high CPU without a reboot…

It now seems to be operating “normally” with a good response…  I’m going to try a complete power off to see if it’s the removal of the drives, or just leaving it running a long time without a reboot…

EX4Shot wrote:

daytrdr wrote:

Can you please power off your device and remove your drives (when convenient)?  Then, power the device back on without the drives in ( remember to label or organize your drives in a way that you will remember which drive goes into which drive bay ).  Does the problem persist?

Daytrdr, does this bold part really matter or are you just being diligent?

The reason I ask is that I am contemplating swapping out my unit, and I have asked your staff if I can just pull the drives out of one unit and put them in another and it will recognise them.  Your/WD T2 staff says it “should” work, but adds the disclaimer “back up your data prior to trying in case there are problems”…  (basically telling me they don’t know)

Do you know for a fact?

All drives are JBOD. (In case you were wondering)

Yes, it kind of does matter.  It depends on your RAID level too.  For example, if you’re using RAID 1 or 5 it probably won’t matter as much as with RAID 10. If you’re using RAID 10, then I would suggest marking your drives, inserting the first mirrored set into the unit, and waiting a few minutes for the blue LEDs to stop blinking before inserting the next mirrored set and completing the RAID 10 stripe.  Because these units allow for RAID roaming, technically you could just pop ’em into the device in any random order and let it attempt to rebuild your volume, but you’re just asking for trouble that way.

For me the difference was this: the first time I replaced my EX4, I put each drive into the unit one at a time like WD suggested, only inserting another drive once the system LED stopped blinking.  After this process I had to let it rebuild my volume, which took over a full day.  In my opinion this is the wrong way, even though it’s the suggested one.  I tried something else today instead.  Again, mine is configured as RAID 10.  I inserted mirrored drives 1 and 2 into bays 1 and 2 and waited until it read the drives as a set and the system LED stopped blinking (took about 3-4 minutes). Then I inserted drives 3 and 4 into bays 3 and 4 and let it read the second mirrored set and complete the stripe.  This method allowed the device to immediately read the drives properly and required no rebuild.

So using this theory and results, and depending on your level of RAID, you want to insert any mirrored set of drives at the same time so it can read the mirror and not just half a mirror. No need to power down at any point since it’s hot swappable. 

The new firmware release 1.02.36 resolved my high CPU usage and CPU is running very low now.  Hope it stays this way.

Well, it looks like I spoke too soon…  the slowness is back again… (with only the 2 Red drives)

Anyone know how to restart the lighthttp server on the command line?

Hi,

My EX4 also had a high CPU U%, and the UI was a little slow to respond, but not 4-5 mins, more like 30 seconds or so.

I didn’t understand what the processes were doing, but had an idea that they had to do with scanning and converting media, so I turned off all media streaming everywhere I could find a setting for it. Now my CPU is only around 5% U, and the UI generally responds fairly fast. Using a laptop in a different domain is still fast, but slower than my PC in the ‘workgroup’. I have over 1000 pictures and about 850GB of video, so I will enable media streaming (DLNA) a little later and see what happens.

ckc123 wrote:

Well, it looks like I spoke too soon…  the slowness is back again… (with only the 2 Red drives)

Anyone know how to restart the lighthttp server on the command line?

Hello,

Do you have any of the third-party apps installed?  If so, which ones?

Also, does the issue exist when there are no drives in the product?