[Tfug] Small-ish (capacity + size) disk alternatives

Yan zardus at gmail.com
Thu Jan 31 14:57:57 MST 2013


> No!  My problem is *laptop* HDD's wear out -- not HDD's in general!
> In the 30 years I've owned computers, I've had exactly three HDD's
> wear out -- all laptop drives despite the fact that I rarely *use*
> a laptop (one drive in a laptop, two more in this 365/24/7 situation).

You got quite lucky. I haven't owned computers for anywhere near 30
years (although for most of my computer-owning time, my gear's been
bought used), but I've had at least a dozen hard drive failures. Here
in the lab, we have somewhere around 100 computers with probably
upwards of 250 hard drives. Those were all bought new, and most of
them are enterprise-grade. We see something like one failure a month,
and that's probably beating the statistics on these things.

> No!  But, neither does every "block worth" (which might be as much as
> 500KB!) contain 100% "new data".  That's the point here -- that the
> work you are asking the disk to do can be much less than the work
> it actually ends up *doing*.
>
> E.g., if I update the "hours worked this week" for each employee
> in a dataset and each employee's "record" resides in a different
> FLASH block, then an entire block is erased for each employee in
> the organization -- even if that's only 4 *actual* bytes per employee!

Admittedly, this is a problem for SSDs (although I'd argue it doesn't
preclude them from being used for the application you describe). There's
some research being done in this area. Samsung and a South Korean
university published a paper proposing a new filesystem optimized for
(among other things) reducing erase cycles. The paper is at
http://static.usenix.org/events/fast12/tech/full_papers/Min.pdf and
the presentation is at
https://www.usenix.org/conference/fast12/sfs-random-write-considered-harmful-solid-state-drives.
It doesn't look like they released the implementation itself, though,
so this is more of an argument for the future.
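
Just to put rough numbers on the scenario you describe, here's a quick
back-of-the-envelope sketch in Python. The 512 KB erase-block size and
the 1000-employee headcount are assumptions I'm making purely for
illustration; only the "4 actual bytes per employee" comes from your
example.

    # Worst case described above: every employee's record lands in a
    # different erase block, so each 4-byte update forces a whole
    # block to be rewritten.
    ERASE_BLOCK = 512 * 1024   # bytes per erase block (assumed)
    NEW_DATA    = 4            # bytes actually changed per employee
    EMPLOYEES   = 1000         # hypothetical headcount

    useful    = EMPLOYEES * NEW_DATA
    rewritten = EMPLOYEES * ERASE_BLOCK
    print("useful data written: %d bytes" % useful)
    print("flash rewritten:     %d bytes" % rewritten)
    print("write amplification: %.0fx" % (rewritten / float(useful)))

Grouping the frequently-updated fields so they share erase blocks (or
using a filesystem like the SFS one above) is what knocks that ratio
back down.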

> But it isn't.  Error frequencies go up which means more (hidden)
> update cycles are incurred, etc.  Notice how "enterprise" SSDs
> tend to stick to SLC technology -- trading capacity/speed for
> endurance/reliability.

I'm not sure that's still the case. Most SSDs I've seen billed as
enterprise are moving to MLC nowadays. That might be a temporary
trend, though, for all I know.

>> So what. This is what they do. If you have stuff that is never being
>> written (only read) the controller will move it to a frequently written
>> cell. It will do this before it thinks that there is only one write left
>> on that target cell. Even if the cell fails, so what. Cells fail. Those
>> cells are marked as bad, and the drive uses other cells. ALL modern SSDs
>> have spare area. They can (and do) handle cell failures.
>
>
> This is reflected to the interface (i.e., user) as indeterminism.
> The application never *knows* that the data that the drive has
> previously claimed to have "written" has actually been written.
> It's an exaggeration of the write caching problem.  But, it is
> brought about by the inherent "endurance" limitations of the
> media.  As if you had a HDD that was inherently failure prone.

It isn't much different from HD caching, or OS-level caching, or
anything like that. If you want to complain about the disk
equivalent of bufferbloat, that might be valid, but pretending that
this only occurs with SSDs is very inaccurate. In fact, it's probably
better on an SSD, because the time it takes to remap a block is
(probably) shorter than the time data sits in an HD write cache
waiting for the platter to spin to the right place.

> Imagine a HDD that was designed to *randomly* pick a block of
> otherwise stable data, copy it to a new location, verify that
> the copy succeeded and then erase (not just overwrite) the original
> all so that some NEW piece of data could be written in its place
> (and, at the same time, updating the behind the scenes bookkeeping
> that keeps track of all that shuffling around -- using media
> with the same "characteristics" that it is trying to workaround)

On an HD, that would be insanely slow, because HDs are inherently
slow. An SSD would be faster still if it didn't do this, but it's
quite fast even while doing it.

>> I fail to see the problem. SSD controller have a complicated job to do,
>> and they do it.
>
>
> They *try* to do it.  I saw a recent survey claiming 17% of respondents
> had an SSD fail in the first *6* months!  (of course, a survey in which
> respondents self-select will tend to skew the results -- people are
> more likely to bitch about their experiences than praise them!)

On top of the statistical issues with that poll (which had "600+"
respondents), I would guess that those types of failures ("my SSD
completely stopped working") are more likely caused by the disk
controller giving out than by the underlying medium. That's a guess,
and it doesn't make things better for the end user, but it might take
some of the heat off of the wear-leveling debate.

It does shed light on another issue, which is that SSDs are just
barely leaving the early-adoption phase. I hopped on about a year
ago and haven't had any issues. People around the lab are getting
SSDs more and more, and I haven't heard of any failing either, so at
least my anecdotal evidence suggests that the manufacturers are
getting the hang of things.

> So, you are suggesting I simply say, "Buy this particular SSD otherwise
> the system won't work"?  Would you build a MythTV box if you were told
> you had to use this disk (endurance), this motherboard (performance),
> this fan (sound level), etc.?  Or, would you cut some corner and then
> complain later to anyone who will listen?

"Buy this particular SSD otherwise the system won't work" should
really be "Buy this particular SSD otherwise the system will fail
faster".

If you're going to build a system, the quality of the components you
choose is going to affect its performance and reliability. You can't
get away from that. If you go with HDs, you'll have to say "Buy a
non-laptop drive of this caliber or it'll fail faster." It's the same
thing. Going with a high-quality SSD doesn't mean the box won't work
with a low-quality one; it'll just run slower or less reliably. The
same goes for every other component.

> Passive cooling.  The inside of the enclosure has never been above 35C.
> No one wants to listen to fans 24/7/365!

Here's another plus for SSDs. They pump out a lot less heat, so
your machine might work in less friendly environments (e.g., locked in
an entertainment center or something).

> When was the last time you replaced your thermostat?  Irrigation
> controller?  Garage door opener?  Washer/dryer?  Doorbell?  TV?
> Security camera(s)?  DVR?  "HiFi"?  Hot water heater?  Weatherstation?
>
> Then, ask yourself *why* you replaced it:  because you were tired
> of "last year's model"?  Because it wasn't performing as well as
> it should?  Because it *broke*?
>
> Chances are, most of these things did their job until they broke (or
> were outpaced by other technological issues) and *then* were replaced.

In the last year, I've replaced an irrigation controller (and
irrigation pumps), had to repair a dryer, had to repair an AC unit,
and had to replace a hard drive (not a laptop one, either) in a
security camera system. That's not including regular filter changes
and so forth, either. Components fail. There's not going to be a
magical HD that'll prevent that.

>> You are avoiding one limitation (SSD finite erase/program cycle) but
>> with HDDs you still suffer mechanical wear and tear. As you noted in
>> your original email the HDDs you've been using "die pretty easily".
>
>
> But those have all been laptop HDD's!  E.g., I suspect moving to
> a "real" disk drive will give me the same sorts of reliability
> that I've seen in my other machines (though at a higher power
> budget and cooling requirements).

I think the common wisdom is that a good HD will have a longer
lifespan. The argument is a) that this lifespan isn't indefinite and
b) that an SSD's lifespan won't be unreasonably short. But it's your
system; you'll ultimately have to decide whether it's worth upping the
specs for it.

>> If you don't fully understand your data access/update patterns it
>> doesn't seem like you can say whether or not they will overly burden an
>
>
> I can look at data write *rates* (sector counters) and make conclusions
> based solely on that!  The SSD won't give me any better guarantees than
> total number of rewrites.  I.e., it doesn't care if I am writing
> "AAAAAAAAX" or "AAAXAAAAAA" in place of "BBBBBBBBB" -- as long as either
> write is a "sector".
>
> Knowing the access patterns (at the application level) IN DETAIL lets
> me restructure tables so that the data that are often updated "as a
> group" tend to be grouped in the same memory allocation units.
>
> E.g., if you're running a payroll application, then wages and taxes
> are the hot items that see lots of use.  OTOH, if you are running
> an application that tracks attendance (timeclock), then wage
> information probably sees *less* activity than "hours worked"
> (which would have to be updated daily).  In either case, employee
> *name* is probably RARELY updated!

If the data is in a database, ensuring this at the hardware level
might be harder than you'd think, given filesystem abstraction,
separate storage for indices, etc. If the data is in a DB-backed LDAP
directory or something, you can probably forget about it. Of course,
if you're using a specialized filesystem and so forth, it's possible,
but that's a lot of effort for questionable gain.
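
For what it's worth, here's a minimal sketch (Python's sqlite3; the
table and column names are made up) of the hot/cold split you're
describing. It shows the restructuring at the schema level, but it
also shows my point: nothing in it controls where those rows
physically land on the flash.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- "cold" data: rarely updated
        CREATE TABLE employee (
            id   INTEGER PRIMARY KEY,
            name TEXT,
            wage REAL
        );
        -- "hot" data: touched every day by the timeclock
        CREATE TABLE hours_worked (
            employee_id INTEGER REFERENCES employee(id),
            day         TEXT,
            hours       REAL
        );
    """)
    conn.execute("INSERT INTO employee VALUES (1, 'Some Employee', 10.0)")

    # The daily update only touches the small "hot" table...
    conn.execute("INSERT INTO hours_worked VALUES (1, '2013-01-31', 8.0)")
    conn.commit()
    # ...but whether that maps onto fewer erase blocks depends on the
    # journal, the page allocator, the filesystem, and the SSD's own
    # remapping -- none of which you control from up here.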

> It won't be *me* that's rolling the dice!  :>  Rather, it will be
> someone who tries to build and configure a similar system and
> wonders why *his* choice of storage media proved "less than ideal".
> Or, why the *identical* system ("as seen on TV") performs so
> differently after he's made some "trivial" changes to the code.
>
> "Gee, I just changed the payroll program to update the wages on
> a daily basis -- each time the timeclock recorded additional
> hours for the employee.  Now, I'm seeing problems with the wage
> data's reliability..."

Given the abstraction involved in these things, I think the actual
effect will be "Gee, I just changed the payroll program to update the
wages on a daily basis -- each time the timeclock recorded additional
hours for the employee. Now the disk failed a day sooner." They'll
eventually see data issues ("bad sectors" and so forth) from the wear,
but the wear-leveling will spread that wear out rather than
concentrating it in the frequently-updated wage data.

- Yan



