Large cache drives for Gamers

Hello,

Yesterday I was unzipping about 300MB of Javadocs to my HDD. I noticed that it unzipped at about 30-40MB/sec for the first two seconds, then dropped to 4-5MB/sec. Out of interest I then unzipped the same files to my SSD. It unzipped at a constant 30-40MB/sec. (probably CPU limited)

What I was seeing was probably the effect of write caching to the drive’s 64MB cache. If I pause the unzipping I still hear my HDD chattering away - but then if I resume unzipping after the chattering stops, it again unzips at about 30-40MB/sec (for 2 seconds), confirming my suspicions.

I like this caching. Lots of gamers would too. Every once in a while coders do silly things like dump 120MB debug files to the HDD in a single frame, causing massive lag. A few TF2 updates resulted in 1-3 second jolts for laptop users, but on my 64MB cache drive the jolts were only a fraction of a second. I think even larger caches would eliminate a lot of the stutters that HDD users face when gaming, and would have a very positive effect on how fast a drive feels.

A while back I experimented with RAM caching, using trials of products like SuperCache and FancyCache. Unfortunately I had to discontinue their use - RAM caching plays havoc if you suffer a BSOD… And that is where drive-level caching matters. If my OS crashes, then my HDD still happily chatters away, completing its tasks.

I suppose my question is, does WD plan to release any drives with large amounts of cache? I mean 512MB+. If it only increases the price by $20-60, people will buy it. I feel there’s a market here… made up of gamers. (and perhaps power users who own UPSes)

Obviously making optimal use of larger amounts of cache is not as easy as flipping a figurative switch. But if WD were able to incorporate write-combining logic similar to what FancyCache and SuperCache have implemented, it would vastly speed up many tasks. Some tasks involve writing to the same location multiple times, or writing many smaller chunks serially - these could all be completed instantly (as far as the OS is concerned) and then combined into larger sequential writes to the platter surface. Current drives already do this to some degree, but with a much larger cache it could be taken much further, and would benefit quite a few more advanced tasks. It’s got to have a very positive effect on performance.
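Here’s roughly what I mean, as a toy sketch - the structure names and sizes are made up, not anything WD or FancyCache actually uses - just to show how overlapping or adjacent writes sitting in a big cache can be merged before they ever touch the platters:

```c
/* Toy sketch of write combining in a drive's cache (hypothetical
 * structures -- not WD firmware, just the idea).  Writes land in the
 * cache immediately; overlapping or adjacent writes are merged so
 * only one larger sequential write ever reaches the platters.        */
#include <stdio.h>

#define MAX_PENDING 1024

typedef struct {
    unsigned long lba;      /* first sector of the buffered write */
    unsigned long sectors;  /* length of the write, in sectors    */
} pending_write;

static pending_write pending[MAX_PENDING];
static int npending = 0;

/* Accept a write into the cache, merging it with an existing pending
 * write whenever the two ranges touch or overlap.                    */
static void cache_write(unsigned long lba, unsigned long sectors)
{
    for (int i = 0; i < npending; i++) {
        unsigned long a_end = pending[i].lba + pending[i].sectors;
        unsigned long b_end = lba + sectors;
        if (lba <= a_end && pending[i].lba <= b_end) {  /* touching or overlapping */
            unsigned long start = lba < pending[i].lba ? lba : pending[i].lba;
            unsigned long end   = a_end > b_end ? a_end : b_end;
            pending[i].lba     = start;
            pending[i].sectors = end - start;
            return;                                     /* combined: no new entry */
        }
    }
    if (npending < MAX_PENDING) {                       /* cache full? would flush here */
        pending[npending].lba     = lba;
        pending[npending].sectors = sectors;
        npending++;
    }
}

int main(void)
{
    /* An app rewriting the same small region over and over, plus a run
     * of small serial writes -- the classic stutter-causing pattern.   */
    for (int i = 0; i < 100; i++)
        cache_write(1000, 8);          /* 100 rewrites of one 4KB block */
    for (int i = 0; i < 50; i++)
        cache_write(5000 + i * 8, 8);  /* 50 back-to-back 4KB writes    */

    printf("150 writes from the host became %d writes to the platters\n", npending);
    return 0;
}
```

Run that and the 150 host writes collapse into 2 platter writes. Real firmware obviously has to worry about ordering and power loss too, but that’s the basic trick.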

It’d also be nice if such a drive prioritized reads over writes, as that would virtually eliminate stutters in games (and reduce usefulness in enterprise scenarios :wink: Can’t cut in on that market, can we?  :P  ) Going back to that poorly coded game and 120MB debug file - the drive absorbs it and continues on, and thanks to read prioritization it does not get jammed up if the game requires reading a texture or model the very next frame. Who needs an SSD when you can get a 2TB drive with very few of a regular HDD’s weaknesses?
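Something like this toy dispatcher is all I’m picturing (made-up code, obviously nothing like real firmware) - two queues, and any pending read always goes to the heads before any pending write:

```c
/* Toy sketch of "reads jump the queue": a dispatcher with two queues
 * that always services pending reads before pending writes.
 * (Just the idea -- not how any real drive firmware is written.)     */
#include <stdio.h>

typedef struct { unsigned long lba; int is_read; } req;

#define QLEN 64
static req read_q[QLEN], write_q[QLEN];
static int nreads = 0,   nwrites = 0;

static void submit(req r)
{
    if (r.is_read) read_q[nreads++]   = r;
    else           write_q[nwrites++] = r;
}

/* Pick the next request to send to the heads: any read wins over any
 * write, so a game's texture fetch never waits behind a 120MB dump.  */
static int dispatch(req *out)
{
    if (nreads)  { *out = read_q[--nreads];   return 1; }  /* LIFO pop for brevity */
    if (nwrites) { *out = write_q[--nwrites]; return 1; }
    return 0;
}

int main(void)
{
    submit((req){ .lba = 9000, .is_read = 0 });  /* big background write */
    submit((req){ .lba = 1234, .is_read = 1 });  /* game wants a texture */

    req r;
    while (dispatch(&r))
        printf("%s LBA %lu\n", r.is_read ? "READ " : "WRITE", r.lba);
    return 0;
}
```

The read that was submitted second still gets serviced first. A real scheduler would also have to age the writes so they don’t starve forever, but that’s a solved problem.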

More cache could be very fun for us power users… depending on what the drive’s firmware is coded to do. I wonder if Western Digital has considered boot preloading like the Momentus XT has? If a hard drive recorded which blocks are needed during booting, it could preload many of them into that nice big 512MB cache before your OS even starts loading. (I’m thinking of desktops here, with long POST times - often 25-35 seconds on high-end gamer boards… that’s at least a 10 second head start after the drive(s) initialize.) An HDD with SSD-speed boots would certainly make headlines, and gamers wouldn’t think twice about dropping an extra $60+ on that.
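Very roughly, the idea is something like this (I have no clue how the Momentus XT firmware actually does it - this is just a sketch with made-up names): log which blocks got read during the last boot, then replay that list into the cache at the next power-up while the board is still POSTing.

```c
/* Sketch of the boot-preload idea (hypothetical -- not how the
 * Momentus XT or any real drive actually implements it): remember
 * which blocks the host asked for during the last boot, then start
 * reading them into cache at the next power-up, during POST.        */
#include <stdio.h>

#define BOOT_LOG_MAX 4096

static unsigned long boot_log[BOOT_LOG_MAX];  /* LBAs read during last boot */
static int boot_log_len = 0;

/* Called for every read while the "booting" window is open. */
static void record_boot_read(unsigned long lba)
{
    if (boot_log_len < BOOT_LOG_MAX)
        boot_log[boot_log_len++] = lba;
}

/* Called at power-up, before the host sends its first real command.
 * A real drive would copy the blocks into its on-board cache; here
 * we just print the order they would be fetched in.                 */
static void preload_boot_blocks(void)
{
    for (int i = 0; i < boot_log_len; i++)
        printf("prefetch LBA %lu into cache\n", boot_log[i]);
}

int main(void)
{
    /* Pretend last boot touched the MBR, the bootloader and the kernel. */
    record_boot_read(0);
    record_boot_read(2048);
    record_boot_read(1048576);

    preload_boot_blocks();  /* next power-up: 25-35s of POST to hide this in */
    return 0;
}
```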

Anyway, thank you for your time. I hope someone reads this and considers it. :slight_smile:

-Kramy

Drive caches are pointless these days since all operating systems released in the last 10+ years have done their own caching with main system RAM. There is no point in the drive caching things that the OS already has cached (and thus won’t ask for again). The slowdown you describe sounds like normal Windows stupidity causing heavy fragmentation.

Games generally read whatever data they will need when loading a level or similar; they don’t write a bunch of data to disk and then try to read other data that they need immediately and thus cannot continue to play smoothly. Game designers figured out long ago how to do seamless zone transitions by reading new map data when you get close to a seamless border so it is available when you cross the border and the data is actually needed. Operating systems also try to flush writes to the disk slowly in the background so that reads remain responsive.
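The pattern is roughly this (a toy sketch with made-up names, not any particular engine): once the player gets within some distance of a border, start streaming the neighbouring zone so it is already in memory by the time the border is crossed.

```c
/* Toy sketch of seamless zone prefetch (made-up names, not from any
 * particular engine): once the player is close enough to a border,
 * start loading the next zone so crossing it never touches the disk. */
#include <stdio.h>
#include <stdbool.h>

#define PREFETCH_DISTANCE 100.0f  /* world units from the border */

typedef struct { float player_x; float border_x; bool next_zone_loaded; } world;

static void load_zone_async(const char *name)
{
    /* A real engine would hand this to a background streaming thread. */
    printf("streaming in %s...\n", name);
}

static void update(world *w)
{
    float dist = w->border_x - w->player_x;
    if (dist < PREFETCH_DISTANCE && !w->next_zone_loaded) {
        load_zone_async("zone_east");
        w->next_zone_loaded = true;  /* ready before the player arrives */
    }
}

int main(void)
{
    world w = { .player_x = 0.0f, .border_x = 500.0f, .next_zone_loaded = false };
    for (int frame = 0; frame < 500; frame++) {
        w.player_x += 1.0f;          /* player walking toward the border */
        update(&w);
    }
    return 0;
}
```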

Hi psusi,

You’ve made a lot of generalizations there, and most are incorrect.

Drive caches are definitely not pointless. As I explained, all it takes is a BSOD or kernel panic and caching writes to system RAM can be a catastrophe. You say that all operating systems released in the past 10+ years have done their own caching, but it’s easily provable that Windows XP had no such caching enabled. (Or at least it didn’t work properly)

Here’s an example (I don’t like to generalize - I prefer specifics that other people can check) - the game League of Legends loads tons of level data before the map/round starts. On Windows XP load times are quite long. On Windows 7 the first load is long, and then the data has been cached to RAM and subsequent loads are super fast. If you dump LoL onto a RAMdisk, cached load times match RAMdisk load times, indicating the caching is working properly. On Windows XP load times stay slow. (Much to my annoyance, and prompting me to upgrade)

“There is no point in the drive caching things that the OS already has cached (and thus won’t ask for again).”

The extra cache will not be wasted. It can be used for writes, to eliminate stutters and jolts.

"Games generally read whatever data they will need when loading a level or similar; they don’t write a bunch of data to disk and then try to read other data that they need immediately and thus, can not continue to play smothly. "

Another generalization (easy to prove false) - you’re also lumping every game genre and engine type together, which is another bad move. Some do streaming, some have levels and loading screens, some (like MMOs) let you load zones as fast as possible to get somewhere quicker. Not all benefit from more cache, but some do. Also, as mentioned above, Team Fortress 2 has introduced stutters numerous times, then later fixed them with patches. The drives with the most cache fared the best when the developers made these screwups. (And believe me, devs everywhere are making such screwups)

The small/indie game Sanctum (very popular - I think it sold over a million copies?) had stutters from exactly the same cause - writing something, locking up the I/O, then wanting to read something immediately after.

"Game designers figured out long ago how to do seamless zone transitions by reading new map data when you get close to a seamless border so it is availible when you cross the border and the data is actually needed. "

Yes, some have. But most definitely have not. Probably 80% of developers are clueless about that sort of thing. Luckily most just build their games off engines like UE3, which has most of it figured out. But those that build their own, or even mangle it in some way (Sanctum - built on UE3) can still introduce jolts. Unfortunately there’s a lot of them.

How difficult is it to add more cache? Not very. It’s not very expensive either. It’s more a question of how “dangerous” it is, and whether it’ll make desktop drives popular in servers. (Undercutting another market)

I’d also like to bring up MMOs - most people have said that MMOs benefit immensely from SSDs due to all the streaming that they do (levels, models, textures, etc.); HDDs have trouble keeping up, and you obviously don’t have enough RAM for your operating system to cache a 20-40GB MMO… HDDs with extra cache won’t be able to fix it, but at least they’ll absorb jolts caused by writes and would be an improvement.

“Operating systems also try to flush writes to the disk slowly in the background so that reads remain responsive.”

Correct. But that may actually hurt performance.

Games generally issue read requests serially (one after the other) - if a write requires a seek, it’s going to take 5-20ms to complete, and knock the head out of position. Then the drive has to seek back. Doing a read, then a write, then a read, then a write (issued from different programs on the same drive) actually harms performance far more than you’d think.

Your operating system may see some reads coming from a game, issue them to the drive, then also issue some writes since it’s been a few milliseconds and it wants those writes to happen soon. More reads could come in at any time, but it won’t wait to see. Extra cache would let the drive put those writes on the back burner while important reads are dealt with. Currently, once you run out of drive cache (for example, when unzipping a file) the drive has to deal with the writes immediately. This can be quite detrimental to the performance of other software, and is one of the reasons people consider HDDs so slow.
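A quick back-of-the-envelope shows how badly that ping-ponging hurts. The numbers here are assumptions (a typical ~10ms seek, 64KB requests, ~100MB/s sequential), not measurements of any particular drive:

```c
/* Back-of-the-envelope for the read/write ping-pong described above.
 * All three numbers are assumptions, not measurements of any drive.  */
#include <stdio.h>

int main(void)
{
    const double seek_ms        = 10.0;   /* average seek + rotational latency */
    const double request_kb     = 64.0;   /* size of each interleaved request  */
    const double sequential_mbs = 100.0;  /* what the drive does without seeks */

    /* Alternating reads and writes: every request pays a full seek. */
    double requests_per_sec = 1000.0 / seek_ms;
    double interleaved_mbs  = requests_per_sec * request_kb / 1024.0;

    printf("interleaved: ~%.0f requests/s = ~%.2f MB/s\n",
           requests_per_sec, interleaved_mbs);
    printf("sequential : ~%.0f MB/s (about %.0fx faster)\n",
           sequential_mbs, sequential_mbs / interleaved_mbs);
    return 0;
}
```

With those assumed numbers the mixed workload crawls along at roughly 6MB/s - a ~16x penalty purely from the heads seeking back and forth.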

After experimenting with huge write caches using the FancyCache software, I can tell you that hard drives feel way, way, way quicker when writes can be put on the back burner almost indefinitely. (In practice the writes drain to disk as soon as possible - but it would keep stockpiling them if it needed to.)

Here’s an example you can probably wrap your head around. If I’m launching the game Borderlands, it takes about 25 seconds to start. If I’m launching it while unzipping something in the background, it takes about 3-4 minutes to start. (The effect of having to seek around to write, then seek back to read, etc.) If, however, I have a large FancyCache write cache, then it takes about 25 seconds to start. (again, while unzipping something in the background) However, I can hear my drive clicking away for another ~20 seconds afterwards, presumably writing all the unzipped stuff. That’s the effect of more write cache. That’s what I want, but without the danger of a BSOD or kernel panic nuking huge amounts of data (and my filesystem) - more drive cache improves performance safely. It’s a good thing. I want it.

-Kramy

All Windows versions since NT 3.1 (on which Win2k and all subsequent releases have been based) have absolutely used available free RAM to cache disk I/O. Journaling filesystems like NTFS use a journal precisely so that power loss does not cause catastrophe. Such writes bypass any cache on the disk, since the disk cache is also lost in the event of a power failure.

Disk write cache does not help since games do not write much data in the first place. Any data they do write can certainly fit in the kernel filesystem cache, which generally has access to far more memory than the on-disk cache; thus, adding more disk cache will not help (unless you add a LOT of disk cache).

Yes, it is a generalization that games are smart enough to preload data they will need before they need it. It also happens to be largely true, and adding disk cache will not help games that do not, since cache mostly keeps around data that has been recently accessed, not data that will soon be needed but has not been accessed lately; thus, a badly written game engine will not benefit from more disk cache.
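To illustrate the point, here is a toy recency-based cache (not any real drive’s replacement policy, just the idea): it only ever pays off the second time a block is requested, so a game that reads each asset exactly once gets nothing out of it.

```c
/* Toy recency (LRU-style) cache: a block only hits on its *second*
 * access, so one-pass reads get zero benefit.  (A toy model, not any
 * real drive's replacement policy.)                                  */
#include <stdio.h>

#define SLOTS 4

static long cache[SLOTS];
static long stamp[SLOTS];
static long tick = 0;

static int lookup(long lba)
{
    tick++;
    for (int i = 0; i < SLOTS; i++)
        if (cache[i] == lba) { stamp[i] = tick; return 1; }  /* hit */

    int victim = 0;                         /* miss: evict least recently used */
    for (int i = 1; i < SLOTS; i++)
        if (stamp[i] < stamp[victim]) victim = i;
    cache[victim] = lba;
    stamp[victim] = tick;
    return 0;
}

int main(void)
{
    int hits = 0;
    for (long lba = 1; lba <= 100; lba++)   /* 100 blocks, each read once */
        hits += lookup(lba);
    printf("one pass over 100 blocks: %d hits\n", hits);

    hits = 0;
    for (int pass = 0; pass < 2; pass++)    /* re-read a small working set */
        for (long lba = 1; lba <= SLOTS; lba++)
            hits += lookup(lba);
    printf("two passes over %d blocks: %d hits\n", SLOTS, hits);
    return 0;
}
```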

If you think it is easy to add cache to a drive, it is even easier to add RAM to the system and let the OS use it for cache.

Extra cache on the drive would not allow the drive to delay background writes any more than the OS could, given the same amount of additional RAM; hence, it is better to add the RAM to the OS than to the drive. If unzipping in one process slows down another process that is just reading that much, then that points to a fault in the OS, not the drive; the OS should be prioritizing the reads over the writes (and Linux does so).

In terms of data safety, any application that cares about safety bypasses the OS and disk caches and makes sure its important data has hit the physical medium before proceeding anyhow. Moving cache from the OS to the disk could theoretically allow applications to only require that the data hit the disk cache before proceeding, but applications are not written that way; if they care about the data, they make sure it is actually on the disk, not just in the disk cache, because they care about power failures, not just kernel crashes.
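For example, on a POSIX system the “make sure it is actually on the disk” step is simply fsync() after the write; a minimal sketch (the filename is made up):

```c
/* Minimal sketch of an application making sure its data is really on
 * the medium before proceeding (POSIX; the filename is made up).
 * fsync() pushes the data through the OS cache and, on a correctly
 * configured storage stack, flushes the drive's volatile write cache
 * as well before it returns.                                          */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const char *msg = "committed game save\n";

    int fd = open("savegame.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    if (write(fd, msg, strlen(msg)) < 0) { perror("write"); return 1; }

    /* The data may still be sitting in OS and drive caches here; only
     * after fsync() succeeds is it safe to act as if it were durable. */
    if (fsync(fd) < 0) { perror("fsync"); return 1; }

    close(fd);
    return 0;
}
```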