2.10.302 Hoses NAS to USB Backups

Hmmm… Comparing the figures in the other thread, the 32,767 upper limit for a signed 16-bit integer is looking suspicious LOL… Pure speculation of course, so I’m re-running a backup of the share it always stalls on for me to see exactly how many files it copies.

EDIT: Yep pure speculation. It still stalls at 113GB, but has only copied 8,225 files and 677 folders.

Hey Stealth57!

I’ve had some interesting developments on the NAS to USB backup front this evening.

I logged it as a support case with WD Support (one of several including the stupid LED behaviour and DropBox app) and have had quite an exchange of emails, test results (many many tests) etc.

A couple of days ago they asked about the possibility of doing a remote control session to my PC so they could try a test backup and then monitor it via an SSH session to the EX2. Well, after some discussion due to me saying “not a chance”, they told me what they wanted me to do and what command to type into the SSH session to monitor the job.

So I did as asked and the job stalled (in “copy” mode as usual) with the usual symptoms - Dashboard progress bar not moving and all HD activity stopped (usually the EX2 and external USB drive go to sleep, bored probably LOL). Anyway, this happened tonight as expected except the SSH session showed that the backup job had clearly bombed with “segmentation fault”!

So I’ve packaged up the log file, some screen shots etc and fired them back to WD support. I’ve also added another serious flaw/bug to the already outstanding list - the fact the backup had actually bombed, not just stalled, and Dashboard was utterly clueless about it. That needs fixing urgently!

If you want to try the same thing (if you haven’t already) let me know and I’ll post the details for you.

George

I have done some similar things and provided logs. However, to close the loop with my level 2 support, can you PM me your case number? I can do the same so that they can compare notes. I did try using the Sync vs. Copy command and Sync does work. However, I cannot get a definitive statement from WD about the difference in these two approaches to backup. The simple statement in Dashboard leaves a bit of room for interpretation and I don’t want to run a second backup using sync unless I really know what it is doing.

PM sent… :wink:

My level 2 tech just sent me the following. I am a bit confused as he seemed to indicate that “%20” was somehow in the actual file name of the file that was causing the crash. I don’t believe that as I never use special characters in my file names. On the other hand, that looks like html for a blank space substitution.

============
OpSec%20Security%20Limited

This is the file causing the issue it is causing the copy
process to crash. It is the %20 in the name. I just had another customer today
with the same issue.

Hi Stealth57,

It could be that because he has seen just the last few lines of my log file he is jumping to a half-correct conclusion, for the wrong reasons. Half-right in that the last file in the log, immediately above the “segmentation fault”, may well have triggered the error but only in as far as it may be the proverbial “straw that broke the camel’s back”.

If he said/implied that the “%20”, which as you correctly surmised is just the standard URL percent-encoding of a space, is the cause then he is wrong for three reasons:

First the full log file is 9,041 lines long. That file appears on line 9,039. There are other examples of file names which include “%20” on other lines which did not cause a problem such as these beginning at line 8257. It is a bit difficult to see where each line starts, but the beginning is “lib”.

(I’ll send the full details in a PM since the paths may contain sensitive information)

Secondly the source file is already on the EX2 and got there without a problem. It got there originally from an NTFS share. The “%20” are actually literal characters in the file names. That sometimes happens when they arrive as attachments on emails - the spaces get encapsulated but never get converted back at the receiving end. The destination USB drive is formatted as NTFS so again should not cause a problem otherwise it would have a long time ago. None of these file names include “illegal characters”.

Finally, as we’ve both seen, it works fine if the backup mode is set to “Synchronize”.

George

Hi,

Guess what - I’ve just had the same explanation back! So I’ve just sent a reply giving a bullet point demolition of their logic using the above arguments.

LOL!

George

The issue with copy is the combination of the % sign and the next alpha character. They know that %n and %s crash the copy program (sync uses a completely different Linux program, rsync). What is interesting is that if you have numbers or spaces between the % and the alpha, they are ignored. So %20 s kills the copy routine. I am running putty now to find all the combinations that are killing my 5 shares. One share had a music file with 99.9% Sure… Removing the % sign from the name allowed copy to work fine.
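
For anyone who wants to hunt for candidate names before running Copy, here is a rough Python sketch of the pattern described above. It is only an illustration under assumptions: the regular expression encodes "% followed by optional digits or spaces and then a letter", which is deliberately broader than the %n, %s and %S cases actually reported, and the names `is_risky` and `find_risky` are made up for the example. It says nothing about how the copy program itself parses names.

```python
import os
import re

# Assumption: digits or spaces between '%' and the next letter are ignored,
# as described in the post above. This flags candidates, not confirmed crashers.
RISKY = re.compile(r"%[0-9 ]*[A-Za-z]")


def is_risky(name):
    """True if name contains '%', optional digits/spaces, then a letter."""
    return RISKY.search(name) is not None


def find_risky(root):
    """Return full paths of files and folders under root whose name looks risky."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if is_risky(name):
                hits.append(os.path.join(dirpath, name))
    return hits
```

Calling `find_risky` on the top folder of a share gives a list of names to look at first. A name such as "100%.txt" is not flagged because no letter follows the % sign.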

Interesting… Very interesting…

I’m waiting now for their response to my bullet point arguments as to why “%20” should not make a difference. It could well be that the copy program they are using balks at certain, otherwise perfectly legal, filenames. If they come back with that response I’ll be asking for supporting evidence. However I would argue that if it’s legal it should work and they need to fix it.

WD simply saying “avoid certain characters/sequences in filenames” is unacceptable - but then I’m preaching to the converted here LOL!!!

I was told that they do recognize this as two bugs 1) the % sign and 2) Dashboard not updating and issuing a failure code.

I found an interesting factoid. The first share I tried did have a few files with names of “…99.9% Sure…” and those definitely crashed the copy backup. I removed the % sign from those files and the copy ran just fine. However, what is really interesting, is that the second share I tried and which failed numerous times previously, ran to completion without any errors. I am now testing the other 3 shares which previously wouldn’t run either. Right now I have the question of why changing file names in one share impacted the copy routine in the second share???

I did use the same name for the copy routine “Test” during my testing. I am wondering if once a copy fails, all subsequent copies fail regardless of the share. Maybe something gets hung up in the code? That’s the only thing I can think of, as I most likely would have run the share with the % signs first as it is the smallest share. Hmmm…

I found one other very curious issue when reviewing the putty log. Most of my media files (mpg, jpg) are being duplicated during the copy process (original.mp4, original_1.mp4). The log shows a normal copy and then it follows up with a “get new name” line and copies the file again with the _1 appended. This did not happen using the Sync command.

Hmmmm…

Those are both very interesting observations.

I have the sneaking suspicion that the “%” in the filename issue is (possibly) causing some form of exception during the copy process, which isn’t handled properly, and results in a memory leak. I’m only guessing though. I do CentOS admin as part of the day job, but am no expert on it by a long way.

On the duplicating files issue, were you backing up to an empty destination folder on the external USB drive? I vaguely remember reading somewhere that “Copy” mode does not overwrite existing files. I’ll have to check that.

Brand new folder as these were new tests.

Just thought I’d verify to be sure that I had a clear picture of your tests. I wonder what explanation WD will give for that behaviour?

Some foibles of NAS behaviour I can (just about) understand, such as point-blank refusing to accept files being copied to it with certain Linux/Unix system file names (.bashrc is the one I’ve found with both my Seagate Central and the EX2). Even with that though I’d argue that it shouldn’t balk at it. It is a plain text file and the name is only significant in certain contexts. While acting as a file repository for non-Linux/Unix machines, such files should be handled as being outside of that context.

Simple tests. With putty running (and logging), run the Copy process for each share one at a time. If putty shows an error, look at the file that crashed and rename it. Run again until there are no crashes. The log file was 164MB. The only name combination that crashed me was %S.
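
With a log that size, a small script can pull out the suspects instead of scrolling. A minimal Python sketch, assuming the log is plain text, that a crash shows up as a line containing "segmentation fault", and that the last non-blank line before it names the file being copied (which matches what was seen in this thread, but the exact log layout is an assumption). The function name `lines_before_fault` is made up for the example.

```python
import sys


def lines_before_fault(log_lines, marker="segmentation fault"):
    """Return the last non-blank line seen before each line containing marker."""
    suspects = []
    previous = None
    for line in log_lines:
        text = line.strip()
        if marker in text.lower():
            if previous is not None:
                suspects.append(previous)
            previous = None  # start fresh after each crash
        elif text:
            previous = text
    return suspects


if __name__ == "__main__" and len(sys.argv) > 1:
    # The path to a saved putty log is given on the command line.
    with open(sys.argv[1], errors="replace") as log:
        for suspect in lines_before_fault(log):
            print(suspect)
```

Each printed line is a file to rename before the next run.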

Sadly I think it is going to become a moot point for me. The reply I had yesterday gave the distinct impression they’d rather give me a refund to go away than fix any of the (various) problems.

So I’ve asked if my support contact can try to at least:

  1. Obtain a possible timescale for a fix for the backup
  2. Official confirmation if WD intend to fix the stupid LED behaviour

The answers to those two questions will, most likely in the next few days, decide if I persevere or get rid of it and put the refund towards a proper NAS.

Don’t hold your breath on the LED issue. I was told months ago that the LED behavior is perfectly logical and you “can’t please everyone.” This was from the person that is the main interface between level 2 support and software engineering.

On the other hand, that same person told me a few days ago that the copy issue is a high priority to fix, but since they use someone else’s code to do the copy, it isn’t such an easy fix.

Well I’ve had my responses - sadly they were what I was expecting.

So now to back up my EX2 to a USB drive (using sync mode), then copy everything directly from the EX2 back to my PowerEdge 2800 RAID 5 array (too bloody noisy), then wipe the EX2 so it can go back.

Then I think I’ll buy a proper NAS made by QNAP.

Hi,

Well I’ve had enough of WD NAS units (their HDs are a different matter). I’ve got my data off the EX2 and am doing a Full Factory Restore. I’ve also sent the email to Eric saying I’ve had enough and want a full refund, and provided the required details.

To move my stuff off the EX2 I set up my old Dell PowerEdge (PE) 2800 with two arrays on it running SBS 2003. The system drive (C:) is on array A, consisting of two mirrored 146GB U320 SCSI drives. The data drive (E:) is on array B, consisting of 4 x 300GB U320 SCSI drives in full RAID 5. While setting that up I did a final backup to USB of the EX2 using the internal backup in Synchronize mode.

Once that backup was done I copied everything across to the new “NAS Data” share on my server’s RAID 5 array, which appeared to go fine. At the end I did my usual “sanity” check when copying/moving important data - I browsed to the top level folder in each of the EX2 shares and did a “properties” on it in Windows Explorer. I then did the same on the same top level folder in “NAS Data” and compared the bytes, files and folders counts to check they were identical.
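
The same sanity check can be scripted rather than done by hand. A minimal Python sketch, assuming both trees are reachable as ordinary paths from the machine running it; the names `tree_totals` and `same_totals` are made up for the example, and matching totals only show that the counts agree, not that every file's contents are identical.

```python
import os


def tree_totals(root):
    """Return (total_bytes, file_count, folder_count) for everything under root."""
    total_bytes = files = folders = 0
    for dirpath, dirnames, filenames in os.walk(root):
        folders += len(dirnames)  # the top level folder itself is not counted
        for name in filenames:
            files += 1
            total_bytes += os.path.getsize(os.path.join(dirpath, name))
    return total_bytes, files, folders


def same_totals(source, copy):
    """True if both trees have identical byte, file and folder totals."""
    return tree_totals(source) == tree_totals(copy)
```

If the totals differ, comparing the numbers one sub-folder at a time narrows down where the copy went wrong.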

I did hit an issue with the above which even showed up in the four historical backups of the EX2 I have on the external USB drive. However I’m not certain at the moment that it wasn’t caused by the 260 character limit imposed by the MAX_PATH constant in the Windows API - stupid feature of Windows that, have NTFS support huge path lengths and then cripple it with that. Anyway that caused a glitch copying one sub-folder across to my PE2800 which had some deep folder nesting. I also found it in the EX2 backups (!!!) but I’m not sure if that again could be just the API issue as I was checking it through Windows.
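
One way to tell a genuine backup problem from the path length limit is to list the paths that are too long. A rough Python sketch, assuming the classic limit of 260 characters including the terminating NUL (so 259 usable characters); the function name `long_paths` is made up for the example. The length is measured on the path as seen by the machine running the script, so a different drive letter or share prefix on the Windows side will shift the result, and a tool that is itself bound by the limit may not be able to walk the deepest folders at all.

```python
import os

MAX_PATH = 260  # classic Windows API limit, counting the terminating NUL


def long_paths(root, limit=MAX_PATH):
    """Yield full paths under root whose length is at or above limit."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if len(full) >= limit:
                yield full
```

If the sub-folder that glitched shows up in this list, the limit is the more likely culprit.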

Anyway I’ve emailed off all the requested details for the refund (“not fit for purpose” for the backup, LED, fan and other issues - while copying my data off, the EX2’s drives were too hot to touch and STILL no fan operation) and am now totally wiping the EX2 and doing a full factory restore.