[Tfug] ECC (was Re: Using a Laptop as a server)

Bexley Hall bexley401 at yahoo.com
Fri Mar 15 22:09:13 MST 2013


Hi Keith,

On 3/15/2013 8:56 PM, keith smith wrote:
> Wow what an interesting thread.

> I had not thought about ECC in years.  Maybe over 10 years.  Not sure if
> I will in the future.  Checked with the data center where the servers are
> that I do my work on.  The data center owner said drives and power supplies
> were his main issue, not ECC.  He said ECC can be an issue; however, in
> his experience, ECC memory is not worth the gain realized.

IMnsHO, <something> that tells you if your memory is working/failing
has merit.  The problem is determining the "best" way of getting
that information -- and, then knowing how to *interpret* the results.

For a COTS solution, nowadays, the easy way to get that information
is with a MB that supports ECC memory.  There, all you have to do is
spend the money for ECC DIMMs instead of "non-ECC" and you've got the
*mechanism* in place to report on memory "reliability".

(for a *custom* solution, I contend that "genuine" ECC may not be
worth the total cost to implement)

> What made me think was the question "What do you do with the info?".  I
> have no answer.

Exactly.  Having data but no way to *apply* it is silly.  How many
errors are "too many"?  How do you know if this is a soft error or
a hard error?  (hard errors being more significant as they tell
you that your ECC *capabilities* are now impaired)

If you canvass the available literature, you'll get all sorts of
different theoretical and observed error rates.  So, how do you
set a policy/threshold by which you can use that data to:
- convince yourself that your system is operating reliably
- alert you to likely impending failure(s)
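To make that concrete, here's a minimal sketch of turning raw counts into
a policy decision.  It assumes a Linux box, where the kernel's EDAC
subsystem exposes per-controller counters; the threshold value is a
placeholder you'd have to tune against your own observed baseline:

```python
def classify_ecc(ce_count, ue_count, ce_threshold=10):
    """Turn raw ECC counters into a go/no-go policy decision.

    ce_count:     correctable ("soft") errors -- ECC fixed them.
    ue_count:     uncorrectable ("hard") errors -- data was lost.
    ce_threshold: correctable errors tolerated before worrying;
                  an assumed value, not a universal constant.
    """
    if ue_count > 0:
        return "FAIL"   # correction capability exceeded: replace the DIMM
    if ce_count > ce_threshold:
        return "WARN"   # still being corrected, but trending the wrong way
    return "OK"

# On Linux, the raw counters live in sysfs, e.g.:
#   /sys/devices/system/edac/mc/mc0/ce_count   (correctable)
#   /sys/devices/system/edac/mc/mc0/ue_count   (uncorrectable)
print(classify_ecc(3, 0))    # OK
print(classify_ecc(50, 0))   # WARN
print(classify_ecc(0, 1))    # FAIL
```

The hard part isn't the code -- it's picking a defensible threshold.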

> My question would be, would it be wise to run SSDs on a web server?  I'm
> sure that would make for a much faster server and much less heat, however
> is that a viable solution at this time?

IMO, SSDs are a win when you have a read-intensive application.
I.e., serving up const pages (and keeping a log elsewhere, etc.).

At the other extreme, an application that *writes* a lot to the
SSD is going to cause it to wear out (eventually).  E.g., an
RDBMS server using SSDs for its primary tablespace *and* temporary
tablespaces (in addition to general "swap space") is going to abuse
the SSD with lots of "write once; read once; then discard" cycles.
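Back-of-envelope, the wear argument reduces to arithmetic against the
vendor's TBW (terabytes-written) endurance rating.  The workload numbers
below are made-up illustrations, not measurements:

```python
def ssd_lifetime_years(tbw_rating, gb_written_per_day):
    """Estimate wear-out time from a vendor TBW endurance rating."""
    tb_per_day = gb_written_per_day / 1000.0
    return tbw_rating / tb_per_day / 365.0

# Hypothetical 150 TBW drive:
#   read-mostly web server logging ~5 GB/day  -> decades of life
#   RDBMS churning temp tablespaces, 500 GB/day -> under a year
print(round(ssd_lifetime_years(150, 5), 1))    # 82.2
print(round(ssd_lifetime_years(150, 500), 1))  # 0.8
```

Same drive, two orders of magnitude difference in lifetime -- which is
why the read/write mix matters more than the SSD itself.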

You can build a "poor man's" SSD-based server using a thumb drive
or PCMCIA FLASH card as the "primary storage medium" -- depending on
how fast you need to pull things off the medium.  It might be an
easy test vehicle for you to experiment with.  E.g., use a
live CD with <whatever> you need and mount a thumb drive as
whatever writable file system your application needs (perhaps
with a symlink farm to tie it in where needed).
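A sketch of the symlink-farm idea, with temporary directories standing in
for the mounted thumb drive and the live-CD's document root (all paths
and names here are placeholders -- in real use you'd mount the stick,
e.g. at /mnt/flash, and point the links at your actual docroot):

```python
import os
import tempfile

# Stand-ins: "flash" plays the mounted thumb drive, "docroot" the
# read-only live-CD tree where the app expects a writable directory.
flash = tempfile.mkdtemp()
docroot = tempfile.mkdtemp()

os.makedirs(os.path.join(flash, "log"))

# The symlink farm: the app writes to docroot/log, the bytes land
# on the (writable) flash side.
os.symlink(os.path.join(flash, "log"), os.path.join(docroot, "log"))

with open(os.path.join(docroot, "log", "access.log"), "a") as f:
    f.write("hit\n")

with open(os.path.join(flash, "log", "access.log")) as f:
    print(f.read().strip())   # hit
```

The same trick works for /tmp, spool directories, etc. -- anything the
application insists on writing to.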

> Thank you all for a very interesting thread!!

More information about the tfug mailing list