Question regarding the FAN used by WD My Cloud EX2


#142

Not monitoring, but posts on the topic are still being forwarded to me.

My main concern with long cycle times is that one unexpected event that could be disastrous if not responded to. I agree, it shouldn’t happen, but…

I wrote OS level software for HP for about 12-13 years, and spent that long in tech support as well, often supporting the stuff I’d written. I have a “healthy paranoia” about these things :slight_smile:

but it appears that’s WD’s own code doing things in the background

All of WD’s NAS devices seem to perform some periodic task that almost everyone seems to wish they didn’t. Google around a bit; I found useful solutions for my older WD My Book NAS drives, but for the EX, not so much.

I’ve modified my nohup to do the logging to the same /usr/local/bin location as the script

/usr/local/bin is really meant for executables, not data files. It sounds like you’re saying it’s in NVRAM or RAM or something, since it doesn’t hit the boot (shared) drive. Could you clarify?

/var/log might be a better choice, though still on the HD. Maybe create /var/log/program_fan/. In any case, I’m not sure what gets nuked when you install an update from WD (lotsa stuff), so you might lose logs anyway, but I’d still recommend against /usr/local/bin unless it has special magic.


#143

Not monitoring, but posts on the topic are still being forwarded to me.

Thanks. Yeah, when I said monitoring I meant getting emails; half the time people no longer have the same address they registered with, it seems. I would’ve questioned you if you were literally monitoring the thread for years, haha! Anyway, very glad we can have this discussion. :smile:

I have a “healthy paranoia” about these things

I can appreciate your “healthy paranoia” for sure. I’ve done a lot of programming using about a dozen different languages for my occupation (C, C++, Java, VHDL, Matlab, Perl, etc. and yes, a small amount of Python here and there). If I can get the code working to skip drive temperature reads when the drives are sleeping (it looks like I’m just about there), then a solution that addresses your concern as well may be to monitor drive temperature only every 15 minutes and monitor system temperature at shorter intervals. That seems less elegant, but it ensures a drive can still sleep after 10 minutes. I’m certainly open to ideas on how to best implement both capabilities.
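
A minimal sketch of that two-interval idea, assuming a short tick for the system temperature and a longer interval for the drives. The names SYS_INTERVAL, DRIVE_INTERVAL, and drive_check_due are made up for illustration and are not taken from program_fan.

```shell
#!/bin/sh
# Hypothetical sketch, not part of program_fan: decide on each short tick
# whether the long drive-temperature interval has elapsed.

SYS_INTERVAL=60      # assumed: seconds between system temperature checks
DRIVE_INTERVAL=900   # assumed: seconds between drive temperature checks

# drive_check_due NOW LAST_CHECK -> exit status 0 if a drive check is due
drive_check_due() {
    [ $(( $1 - $2 )) -ge "$DRIVE_INTERVAL" ]
}

# Example loop (commented out so the snippet can be sourced safely):
# last=0
# while true; do
#     now=$(date +%s)
#     # read the system temperature here on every tick
#     if drive_check_due "$now" "$last"; then
#         # read the drive temperatures here
#         last=$now
#     fi
#     sleep "$SYS_INTERVAL"
# done
```

Keeping the drive interval above the 10-minute sleep timeout is what lets the drives spin down between checks; the short tick only touches the system sensor.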

By the way, a nice side effect of my current code edits is that I also log the drive status (Active vs. Standby) so if someone was concerned about whether or not their drives were sleeping, and for how often, this script now accomplishes that as well.

/usr/local/bin is really meant for executables, not data files. It sounds like you’re saying it’s in NVRAM or RAM or something, since it doesn’t hit the boot (shared) drive. Could you clarify?

Admittedly, while I’ve done a lot of programming in Linux in the past, I’m not well-versed enough on where the “best” location should be for the script or the log. I agree /var/log is probably better and will likely have the log go there eventually, which should still avoid writing to the installed drives I believe. What I mean by not hitting the drive is this: My current understanding is that the whole directory structure outside of the /shares area is not on the installed hard drives, but on some smaller storage area that’s integral to EX2’s system board. I could be mistaken, but the drives do sleep even while I write this log file to a location outside the /shares directory. Maybe it’s RAM or NVRAM but I just assumed the board had a small drive of its own. Do you have additional insight here?

I’ll also note, when I did a reboot of the EX2, the program_fan script disappeared from /usr/local/bin and I needed to copy it back over. I’d like to also determine where the best location is for it, and ideally not have it obliterated on a reboot. Ideally, I’d also have a permanent cron job (that the EX2 also wouldn’t obliterate) to check for program_fan running and to start it if it is not.
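
A rough watchdog sketch for the "check and restart" part, under two assumptions: that ps on the NAS lists all processes (as BusyBox ps does), and that a cron entry can be made to survive, which this thread has not confirmed. The function names and the watchdog path are hypothetical; the script path, flags, and log path are the ones used elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: restart program_fan if it is not running.

SCRIPT=/usr/local/bin/program_fan
LOG=/var/log/program_fan.log

# Reads a process listing on stdin; succeeds if a line mentions program_fan.
# The [p] keeps this grep's own command line from matching.
fan_script_listed() {
    grep -q '[p]rogram_fan'
}

# Starts the script only when no matching process is found.
ensure_fan_script() {
    if ! ps | fan_script_listed; then
        nohup "$SCRIPT" -t 900 -L >> "$LOG" 2>&1 &
    fi
}

# A crontab entry along these lines would run the check every 5 minutes,
# assuming this file ends with a call to ensure_fan_script and is saved at
# a (hypothetical) path of your choosing:
# */5 * * * * /path/to/fan_watchdog
```

If the watchdog file lives on the data drives, cron reading it may itself wake them, so its location involves the same tradeoff as the log.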


#144

So the script updates I made are working now, to include monitoring of drive state and to not measure drive temperatures when they’re sleeping, which prevents them from waking back up. I’ve set my interval at 900, or 15 minutes. If it’s set to 10 minutes or less, you risk having the script keep the drives awake by checking their temperatures, since WD sleeps them after 10 minutes of inactivity. Some additional coding could remove this limitation.
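
The skip-when-asleep logic can be sketched roughly as below. This is not the code from program_fan; it assumes hdparm is available on the NAS and that "hdparm -C" reports the power state without spinning the drive up. The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: the actual method used by program_fan is not shown here.

# Reads "hdparm -C" output on stdin and prints Standby, Active, or Unknown.
parse_drive_state() {
    state=$(sed -n 's/.*drive state is: *//p')
    case "$state" in
        standby|sleeping) echo Standby ;;
        active/idle)      echo Active ;;
        *)                echo Unknown ;;
    esac
}

# temp_or_na STATE COMMAND...: runs COMMAND (your temperature readout) only
# when the drive is awake, and prints NA for a sleeping drive.
temp_or_na() {
    state=$1
    shift
    if [ "$state" = "Standby" ]; then
        echo NA
    else
        "$@"
    fi
}

# Example on the NAS (device name is an assumption):
# st=$(hdparm -C /dev/sda | parse_drive_state)
# temp_or_na "$st" your_temperature_command /dev/sda
```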

See the logging below. When the program had started, the drives had been sleeping for several hours before I copied the latest version of the program over and ran it. So with a 10-20 minute interval of the drives on, the system temperature increased by 3C, and then began cooling down again when the drives went back to sleep. It’s clear that, with the drives sleeping, the system temperature does stay a lot lower, as shown here. So for people that are concerned about temperatures, enabling the drives to sleep may help. The other quirk to get them to sleep successfully is to change the Twonky rescan interval for enabled media servers, as the default is to scan continuously (sigh).

I also like this latest version because I can see in the logging if my drives are sleeping, which could help people to troubleshoot items that may be keeping them awake (like a persistent Twonky scan).

2017-07-26 16:27:14  program_fan - STATUS: Temperatures... Sys: 46  Hd1: 34  Hd2: 34  - Drive Status... Hd1: Active   Hd2: Active   - Fan... Index: 0 Speed: 0 RPM
2017-07-26 16:42:20  program_fan - STATUS: Temperatures... Sys: 49  Hd1: NA  Hd2: NA  - Drive Status... Hd1: Standby  Hd2: Standby  - Fan... Index: 0 Speed: 0 RPM
2017-07-26 16:57:25  program_fan - STATUS: Temperatures... Sys: 48  Hd1: NA  Hd2: NA  - Drive Status... Hd1: Standby  Hd2: Standby  - Fan... Index: 0 Speed: 0 RPM
2017-07-26 17:12:30  program_fan - STATUS: Temperatures... Sys: 48  Hd1: NA  Hd2: NA  - Drive Status... Hd1: Standby  Hd2: Standby  - Fan... Index: 0 Speed: 0 RPM
2017-07-26 17:27:36  program_fan - STATUS: Temperatures... Sys: 47  Hd1: NA  Hd2: NA  - Drive Status... Hd1: Standby  Hd2: Standby  - Fan... Index: 0 Speed: 0 RPM
2017-07-26 17:42:41  program_fan - STATUS: Temperatures... Sys: 47  Hd1: NA  Hd2: NA  - Drive Status... Hd1: Standby  Hd2: Standby  - Fan... Index: 0 Speed: 0 RPM
2017-07-26 17:57:46  program_fan - STATUS: Temperatures... Sys: 47  Hd1: NA  Hd2: NA  - Drive Status... Hd1: Standby  Hd2: Standby  - Fan... Index: 0 Speed: 0 RPM
2017-07-26 18:12:51  program_fan - STATUS: Temperatures... Sys: 47  Hd1: NA  Hd2: NA  - Drive Status... Hd1: Standby  Hd2: Standby  - Fan... Index: 0 Speed: 0 RPM
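
For anyone who wants a quick sleep summary from a log like this, here is a small awk sketch. It assumes the exact line layout shown above, where the Hd1 drive status is the 17th whitespace-separated field; summarize_sleep is a made-up name.

```shell
#!/bin/sh
# summarize_sleep LOGFILE: prints how many logged samples show Hd1 asleep.
# Assumes the STATUS line layout shown in this post.
summarize_sleep() {
    awk '/Drive Status/ {
             total++
             if ($17 == "Standby") asleep++
         }
         END { printf "Hd1 standby in %d of %d samples\n", asleep, total }' "$1"
}

# Example: summarize_sleep /var/log/program_fan.log
```

Run against the eight lines above, this would report Hd1 in standby for 7 of 8 samples.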

#145

By the way, a nice side effect of my current code edits is that I also log the drive status (Active vs. Standby) so if someone was concerned about whether or not their drives were sleeping, and for how often, this script now accomplishes that as well.

This one is worth the price of admission! I’d love to know the wakefulness of my drives.

My current understanding is that the whole directory structure outside of the /shares area is not on the installed hard drives, but on some smaller storage area that’s integral to EX2’s system board. I could be mistaken, but the drives do sleep even while I write this log file to a location outside the /shares directory. Do you have additional insight here?

None whatsoever :frowning:

Yet…

ex6TB: du -sh /*
26K /CacheVolume
563K /bin
32K /dev
4.5M /etc
6.0K /home
3.6M /lib
0 /linuxrc
12K /lost+found
3.7T /mnt
0 /nfs
1.0K /opt

[/proc info elided]

0 /proc
1.0K /root
1.0K /sbin
1.0K /shares
0 /sys
2.0K /system
3.6M /tmp
679M /usr
21M /var

There isn’t an awful lot outside of /mnt (everything in /shares is a symlink into /mnt; run 'ls -l /shares | grep -v mnt' to see it quickly), maybe 700 MB all up. I could see WD easily fitting that into a small, cheap flash chip; thumb-drive-class flash would suffice, with no SSD durability-extension strategies needed. It would also allow easy swapping in of new drives; after all, the EX makes this simple, and you don’t seem to have to do anything special to get an OS on the new drive, though I’ve never done it so can’t be sure. So the OS lives elsewhere.

I think you’re right, only /shares stuff lives on the HDs! Your logs, in relation to the amount stored in /usr, would be tiny, so putting them in /var/log/ forever without cleanup is probably fine.

I’ll also note, when I did a reboot of the EX2, the program_fan script disappeared from /usr/local/bin and I needed to copy it back over. I’d like to also determine where the best location is for it, and ideally not have it obliterated on a reboot. Ideally, I’d also have a permanent cron job (that the EX2 also wouldn’t obliterate) to check for program_fan running and to start it if it is not.

I keep a bash_profile in /shares/Public//, and it seems to survive reboots, though it’s been a while since I last rebooted. Surviving firmware updates is a similar problem, I believe. Store these guys somewhere in /shares; you’ll only access them once in a while, maybe only once after boot, or when actively working on the drive with ssh, so the keep-awake aspect won’t be a problem. And since they’re user data, WD won’t dare to erase or replace them.

I’ve always put program_fan where you see below, not in /usr/local/bin:

nohup /shares/Public/bin/program_fan -t 600 -S > /shares/Public/fanLog.txt 2>&1 &

I made the fanLog public so I could see it easily from the Finder in macOS, ditto Windows/Linux. And now, to keep the drives from waking at each log(), I guess I’ll move it.

But I realize the scripts etc shouldn’t be in Public, or thumb-fingered users might mess with them. Looking at /shares/Public:

lrwxrwxrwx 1 root root 20 May 18 23:48 Public -> /mnt/HD/HD_a2/Public

Maybe it would be best to create /mnt/HD/HD_a[12]/mystuff/ and stick everything custom under there, except stuff running under cron and the like, and churning items like logs, which would wake the drive. I would hope, but can’t say for sure, that /mnt/HD/HD_a*/ are as sacrosanct as any other user data and that WD wouldn’t touch them. Then they’d be private except to an ssh login. Putting everything on both drives protects against one of them failing, though I’d also keep a copy on my computer.

This is kinda fun!


#146

So I haven’t forgotten about this and cleaning up/uploading my code changes, but I’m on a bit of a tangent. I’m having an FTP issue where a new IP camera is automatically banned from FTP, and deleting it from the ban list just gets it re-banned. My attention is on that for the moment. :slight_smile:


#147

Hi everybody,

I don’t know if my problem is related to the fan issues, but I’m seeing serious disk temperature read errors on a My Cloud EX2 Ultra :disappointed:

I have 2x2TB WD RED disks in Spanning RAID mode. All of a sudden the device shows a System Under Temperature notification, but of course the room temperature doesn’t suddenly drop to 0°C. It alternates between “The system temperature is below the specific minimum temperature. Move the system to a warmer location” and “The system temperature is within the normal specified temperature range” 9 or 10 times in a row. At the same time, diagnostics read Disk1 temp: 0°C and Disk2 temp: 42°C. The fan runs at a normal speed, around 3500-4000 RPM. When I check the logs I find a “HWlib unable read HD1 temperature” record. When I run a system test, at first it reports only a system temperature fault, but after a few minutes it also reports a drive 1 fault.

Rebooting, hibernating and turning back on, or a system restore returns everything to normal with no disk or temperature error, but only until the next time. Each time this happens, after 10 minutes or so I lose all connection to the dashboard and the system becomes unreachable.

I strongly feel this has something to do with the temperature sensors. I do not want to lose my archive on the disks, so I haven’t tried changing the RAID mode or formatting the disks. Has anybody ever experienced anything like this before and found a solution? I would really appreciate your comments, thanks.


#148

Hi mayanoglu,

When you get the temperature read issues, have you tried reading the temperatures directly through an ssh terminal using the command below?
fan_control -g 0

It may be that your hard drive is experiencing a problem with its sensor and may need a replacement. I have not personally heard of anything specific to WD’s (sometimes poor) implementation that could cause the behavior you’re seeing.


#149

Below is a link to the updated program_fan code that I’ve been using (I’ve bumped it to version 0.5 from 0.3). As a reminder, the changes allow the installed drives to successfully stay asleep when they are configured to do so through your own MyCloud configuration settings. When the MyCloud isn’t accessed much throughout the day, sleeping the drives is the best way to keep them cool. The temperatures are substantially cooler when the drives sleep.

Preventing the script itself from waking the drives back up is accomplished by avoiding temperature readouts of the drives when they are asleep. I also added some additional logging details when logging is enabled.

https://drive.google.com/open?id=0B6N81h28rcZGQnFsWmhTR0xmT2s

Recommended usage:

  • Copy into /usr/local/bin and then do the following commands:
  • Change permissions so program is executable:
    chmod 500 /usr/local/bin/program_fan
  • Start script independent of ssh session.
    nohup /usr/local/bin/program_fan -t 900 -L > /var/log/program_fan.log 2>&1 &
    -t 900 sets the script loop to run every 15 minutes
    -L enables logging
    The remaining portion of the command redirects all logging into the log file/path specified, i.e. /var/log/program_fan.log. As an aside, the location referenced in this example is separate from the installed drives, so writing the log here will not wake up the drives if they’re sleeping.

Note that if you restart your MyCloud or there’s a power outage, the script gets wiped and you need to perform the above steps again. I’m considering a setup that will automatically re-copy the script and execute it on startup, but it’s a lower priority for me.
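
A sketch of the re-copy step, assuming a saved copy of the script is kept somewhere that survives reboots (a /shares location was suggested earlier in the thread). restore_script is a made-up helper, and how to trigger it automatically at startup on the MyCloud is not covered here.

```shell
#!/bin/sh
# Hypothetical restore sketch: copy a saved script back and make it executable.
# restore_script SOURCE DEST_DIR
restore_script() {
    src=$1
    dest_dir=$2
    [ -f "$src" ] || { echo "missing $src" >&2; return 1; }
    cp "$src" "$dest_dir/" || return 1
    chmod 500 "$dest_dir/$(basename "$src")"
}

# Example, using paths mentioned in this thread (adjust to your own layout):
# restore_script /shares/Public/bin/program_fan /usr/local/bin &&
#     nohup /usr/local/bin/program_fan -t 900 -L > /var/log/program_fan.log 2>&1 &
```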

Feedback welcome!


#150

Hi everybody!

I’m the owner of a WD My Cloud Mirror Gen2 8 TB, and my drives (2 WD Reds) were constantly running at around 52º C, so I was among those concerned about the high operating temperature of WD My Cloud drives.
I’ve read this thread top to bottom and put together a solution that works, at least for me. Basically, My Cloud OS already has a temperature-control program (daemon); we only need to make it do its job somewhat more aggressively. The temperature thresholds are stored in four .xml files in the /etc/wd/ folder:

BVBZ-thermal.xml  BWAZ-thermal.xml  BWVZ-thermal.xml  BWZE-thermal.xml

Each file is meant for a particular WD NAS model.
To determine which file is used for your device, log in to your WD NAS via ssh and run the following command:

 ps | grep wdtms  

The output will be something like this:

 5434 root     33984 S    /opt/wd/bin/wdtms -config=/etc/wd/BWVZ-thermal.xml
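
If you would rather not pick the path out by eye, a small filter can extract it. This is only a sketch: thermal_config_path is a made-up name, and it assumes the daemon's command line looks like the one above.

```shell
#!/bin/sh
# thermal_config_path: reads "ps" output on stdin and prints the path that
# follows "wdtms -config=". Lines without that text produce no output.
thermal_config_path() {
    sed -n 's/.*wdtms -config=\([^ ]*\).*/\1/p'
}

# Usage on the NAS:
# ps | thermal_config_path
```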

So, the My Cloud Mirror uses BWVZ-thermal.xml and yours might be different. Anyway, this is a fairly simple xml config file, and by changing its contents we can force the fan to run faster. Any configuration changes are forgotten upon reboot, so we need to reapply these settings after every restart. This is how I do it now:

Change current directory to Public:

cd /mnt/HD/HD_a2/Public

Create two files in Public folder. First wdmc_lower_drive_temperature with following content:

#!/bin/sh
sed -i -f wdmclowtemp.sed /etc/wd/BWVZ-thermal.xml
/etc/init.d/wdtmsd restart

Of course you need to substitute BWVZ with what you’ve got from your ps | grep wdtms output

Second wdmclowtemp.sed containing this:

s/"[0-9][0-9].0" interval="300" goto="set_drv_extreme"/"65.0" interval="300" goto="set_drv_extreme"/
s/"[0-9][0-9].0" interval="300" goto="set_drv_pending"/"61.0" interval="300" goto="set_drv_pending"/
s/"[0-9][0-9].0" interval="300" goto="set_drv_danger"/"57.0" interval="300" goto="set_drv_danger"/
s/"[0-9][0-9].0" interval="300" goto="set_drv_hot"/"52.0" interval="300" goto="set_drv_hot"/
s/"[0-9][0-9].0" interval="300" goto="set_drv_warm"/"44.0" interval="300" goto="set_drv_warm"/
s/"[0-9][0-9].0" interval="300" goto="set_drv_content"/"42.0" interval="300" goto="set_drv_content"/
s/"[0-9][0-9].0" interval="300" goto="set_drv_cool"/"40.0" interval="300" goto="set_drv_cool"/

The values after the second slash in each line are the new temperature thresholds. I arrived at the values above after a little experimentation. They brought my drive temperatures down from around 52º C to around 43º C, with fan speed floating between 4500 rpm and 8000 rpm depending on load.

Your mileage may vary, so you can experiment with those values on your own.
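
Before touching the real file, it can help to try a rule on a test string. The sketch below applies the set_drv_hot rule from wdmclowtemp.sed to a made-up sample that only imitates the attribute layout the rules expect; the real contents of the thermal XML are not shown in this thread, and apply_hot_rule is a name invented here.

```shell
#!/bin/sh
# Dry run of one rule from wdmclowtemp.sed against a made-up sample line.

apply_hot_rule() {
    sed 's/"[0-9][0-9].0" interval="300" goto="set_drv_hot"/"52.0" interval="300" goto="set_drv_hot"/'
}

sample='value="60.0" interval="300" goto="set_drv_hot"'
echo "$sample" | apply_hot_rule
# prints: value="52.0" interval="300" goto="set_drv_hot"
```

Running sed with -f wdmclowtemp.sed but without -i prints the edited file to the terminal instead of changing it, which gives a preview of all the rules at once.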

Set correct permissions (you only need to do it once):

chmod 700 wdmc_lower_drive_temperature

and execute script:

./wdmc_lower_drive_temperature

Just to be sure, you can check that wdtms is indeed running with

ps | grep wdtms

If not, type

/etc/init.d/wdtmsd start

and hit enter.

On next reboot you need to login via ssh again, cd to Public directory and execute script.

That’s it! You’ve got a much cooler and quite a bit noisier NAS beside you. I’m okay with the noise, since it’s in my office, not at home. More importantly, I hope it will now last a lot longer. I understand WD’s desire to keep it quiet for us, but I do not accept the risk of reducing MTBF for that. All in all, I think it is a good example of bad thermal design.

Big thanx to twins and Ben_W for their findings!
And remember, if you do this, do it at your own risk. I’ve merely provided an example that worked for me. If you f##k this up, please, do not hold me responsible. This is a makeshift solution, so maybe someone will come up with something more elegant. Kudos!


#151

How do you enter these commands?
I cannot see where you are able to do this.
thanks


#152

You need to ssh to your device.


Follow the red arrow and configure ssh access. Then connect to your My Cloud device via Terminal or PuTTY.
Use “root” for user name, e.g. ssh root@your.mycloud.ip.address


#153

thank you, I have now gained access to ssh and have played with the fan control commands, proving that the fan does work and that cranking it up will reduce the drive temp.
I like the idea of your solution, however I did not get very far with it. I am using a My Cloud EX2 and have tried entering cd /etc/wd - but there is no such directory. I entered the command “ps | grep wdtms” and got the reply:

12262 root 2880 S grep wdtms

Which makes me wonder if there is any kind of fan control setup at all!

Any ideas?


#154

Sorry for late reply. I’m not a frequent visitor of this forum.
Dunno why you can’t cd to /etc/wd … You can try cd / as root, ls all the existing directories, and go from there.
Maybe your firmware uses another way to control the fan and temperature, but that would be really strange, because they are not very different devices.
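
Note that the single line returned in the previous post is the grep command itself, so no wdtms process was found on that device. A common way to keep grep from matching its own command line, plus a search for thermal config files outside /etc/wd, is sketched below. wdtms_lines is a made-up name, and the directories searched are a guess.

```shell
#!/bin/sh
# wdtms_lines: reads "ps" output on stdin and prints only lines containing
# "wdtms". Written as [w]dtms, the pattern does not match the text of this
# grep's own command line when it shows up in the process list.
wdtms_lines() {
    grep '[w]dtms'
}

# On the NAS:
# ps | wdtms_lines
# find /etc /opt /usr -name '*thermal*' 2>/dev/null
```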


#155

Hi, I want to get your thoughts on CPU temp control. I want to do the same thing that you did, but considering the CPU temperature.

   <!-- set component temperature state and continue -->
    <step name="start"             action="compare_temperature"    source="dts" index="*" comparison="less_than"    value="1.0"  interval="10" goto="set_cpu_extreme" next="check_cpu_danger" />
    <step name="check_cpu_danger"  action="compare_temperature"    source="dts" index="*" comparison="less_than"    value="3.0" interval="10" goto="set_cpu_danger"  next="check_cpu_warm" />
    <step name="check_cpu_warm"    action="compare_temperature"    source="dts" index="*" comparison="less_than"    value="10.0" interval="10" goto="set_cpu_warm"    next="check_cpu_content" />
    <step name="check_cpu_content" action="compare_temperature"    source="dts" index="*" comparison="less_than"    value="17.0" interval="10" goto="set_cpu_content" next="check_cpu_under" />
    <step name="check_cpu_under"   action="compare_temperature"    source="dts" index="*" comparison="greater_than" value="89.0" interval="10" goto="set_cpu_under"   next="check_cpu_cool" />
    <step name="check_cpu_cool"    action="compare_temperature"    source="dts" index="*" comparison="greater_than" value="16.0" interval="10" goto="set_cpu_cool"    next="check_mem_extreme" />

These numbers don’t seem correct to me. Do you have any idea whether your method will work on CPU temp?
Thanks,
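
To compare thresholds side by side, the step lines can be boiled down to name, comparison, value, and target. This sketch makes no claim about what the dts values mean; it only reformats lines laid out like the ones above, and summarize_steps is a made-up name.

```shell
#!/bin/sh
# summarize_steps: reads thermal XML on stdin and prints, for each <step>
# line, its name, comparison, value, and goto target separated by spaces.
summarize_steps() {
    sed -n 's/.*<step name="\([^"]*\)".*comparison="\([^"]*\)" *value="\([^"]*\)".*goto="\([^"]*\)".*/\1 \2 \3 \4/p'
}

# Usage on the NAS (substitute the config file your device actually uses):
# summarize_steps < /etc/wd/BWVZ-thermal.xml
```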


#156

mkilicar, tbh I don’t. It was all trial and error for me. Then again, why would you want to control CPU temp? Its temperature is not even shown in the web interface, and I think the installed ARM CPU is simple enough not to be temperature-dependent. Do you run so many third-party applications on your box that you’re concerned about CPU load? Anyway, cooling the hard drives effectively cools the CPU as well, so why bother?


#157

Thanks for the reply. Actually I just realized the group is EX2, but I have PR4100.
My main HDDs are mostly idle, so mostly very cold. I am using an SSD for fast file access for my applications (nextcloud, plex, etc.). But the problem is the CPU thermal settings. Those numbers don’t look right to me.
I tested with high CPU load: the CPU temperature rises constantly without any increase in fan speed. What I’m not sure about is whether it is a bug or intentionally implemented that way.
Can you check what you have for the CPU settings? Mine is:

<step name="start"             action="compare_temperature"    source="dts" index="*" comparison="less_than"    value="1.0"  interval="10" goto="set_cpu_extreme" next="check_cpu_danger" />
<step name="check_cpu_danger"  action="compare_temperature"    source="dts" index="*" comparison="less_than"    value="3.0" interval="10" goto="set_cpu_danger"  next="check_cpu_warm" />
<step name="check_cpu_warm"    action="compare_temperature"    source="dts" index="*" comparison="less_than"    value="10.0" interval="10" goto="set_cpu_warm"    next="check_cpu_content" />
<step name="check_cpu_content" action="compare_temperature"    source="dts" index="*" comparison="less_than"    value="17.0" interval="10" goto="set_cpu_content" next="check_cpu_under" />
<step name="check_cpu_under"   action="compare_temperature"    source="dts" index="*" comparison="greater_than" value="89.0" interval="10" goto="set_cpu_under"   next="check_cpu_cool" />
<step name="check_cpu_cool"    action="compare_temperature"    source="dts" index="*" comparison="greater_than" value="16.0" interval="10" goto="set_cpu_cool"    next="check_mem_extreme" />

#158

Well, I have read a lot on various blogs but couldn’t find a satisfactory solution. I live in QLD, Australia, where the outside temperature is currently 35 deg C. The drive, being used in Span Mode, shut down on me recently, and when I looked at the dashboard it said it was in the mid 70s deg C. So I built a 30 mm high wooden box with two rectangular holes and one round hole in it. I mounted my 2 NAS drives on the box over the rectangular holes, so each drive’s weight seals it against the box, and over the round hole in the middle I mounted a 150 mm computer fan running at 1000 rpm. The noise is extremely low, but it manages to keep the temperatures of my EX2 NAS drive under 40 deg C. My older LIVEDUO has never had a temperature excess problem.