[Tfug] "Downgrading" ("underclocking?") processors

Bexley Hall bexley401 at yahoo.com
Thu Feb 20 03:09:40 MST 2014


Hi John,

Sheesh!  I'll be juicing oranges for a *month*!  :<

On 2/19/2014 9:44 PM, John Hubbard wrote:
> On 02/19/2014 09:29 PM, Bexley Hall wrote:
>> Hi John,
>>
>> My goal is to have a design that doesn't crap out because something
>> blocked an intake (fur balls/dust bunnies clogging an input filter)
>> or a fan gave up the ghost, etc. (come home to find CPU has *melted*
>> and everything it was expected to do in your absence didn't get done!)
>
> Modern systems will throttle performance when things get too hot.

What *exactly* are you claiming?  "System" is a vague term.  :>
Can I take a box off the shelf, write ANY SOFTWARE I WANT to run
on that BARE METAL and be assured that the machine will protect
itself *and* guarantee a specific level of performance (if so,
what EXACTLY is that?) regardless of temperature?

IME, things like disks *may* spin down -- but won't automagically
spin back *up* (i.e., "system software" needs to be aware of this
characteristic of this drive and *know* how and when to try to
spin it up).

This should be easy to test!  Get a sacrificial system.  Write some
code to do something that ensures a steady workload (write to
pseudorandom memory addresses to keep the cache cold; push pseudorandom
buffers of data onto disk; send a sequence number out a serial port;
repeat -- forever).  Then, tape the airholes closed and let it sit for:
- 10 hours (a normal "work day" while you're away from home)
- 48 hours (a "weekend away")
- 72 hours (a three-day weekend)
- 168 hours (a one week vacation)
- 336 hours (two weeks away)
and see:
- *if* it is still running at the end of that interval
- if it has continued working at the same "rate" over that interval
- if cycling power allows it to recover
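
FWIW, a rough sketch of what that workload might look like.  Python
only to keep it short; on bare metal it would be C or assembly, and an
interpreter won't defeat the cache the way a tight native loop would.
All of the names (workload_pass, run, report) and sizes are made up
for illustration; report() is just a stand-in for the serial port.

```python
import os
import random
import time

def workload_pass(mem, disk_file, seq, rng, writes=10000, block=4096):
    """One pass: scattered memory writes, one disk block, next sequence number."""
    size = len(mem)
    for _ in range(writes):
        # pseudorandom address, pseudorandom value
        mem[rng.randrange(size)] = rng.randrange(256)
    # pseudorandom buffer pushed all the way out to the disk
    disk_file.write(bytes(rng.getrandbits(8) for _ in range(block)))
    disk_file.flush()
    os.fsync(disk_file.fileno())
    return seq + 1

def run(passes, path, report=print, mem_bytes=1 << 20, seed=1):
    """Run a fixed number of passes; return (last sequence number, passes/sec list)."""
    rng = random.Random(seed)
    mem = bytearray(mem_bytes)
    seq = 0
    rates = []
    with open(path, "wb") as f:
        for _ in range(passes):
            t0 = time.monotonic()
            seq = workload_pass(mem, f, seq, rng)
            dt = time.monotonic() - t0
            rates.append(1.0 / dt if dt > 0 else float("inf"))
            report(seq, rates[-1])   # assumption: stands in for the serial port
    return seq, rates
```

Capture the sequence numbers and rates on a *second* machine: a gap
tells you when it stopped, a falling rate tells you it throttled.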

Hmmm... I think UofA auction was yesterday.  So, I'll have to wait
two weeks before I'll have a chance to find something "disposable"
to experiment on.  Or, maybe I'll try WC to see if they have a
couple of "scrap" machines that I can toast.

See what happens when the machine:
- sits idle with no ventilation
- runs a full workload on bare metal
- runs a "modern OS" (Windows/Linux/*BSD) with the same full workload
Then repeat the same experiments with the fans unplugged (the system
should be able to *sense* this BEFORE it ever starts to heat up!  What
will it do to protect itself?)

Then, figure out what constraints this imposes on the choice of
components that can be stuffed *in* the case.  It may be that
the server-side of this project is just not well suited to
an "open" solution.  Maybe just let folks design their own
"motes" and "applets" and keep the server's design more "controlled".

> Something would have to screw up pretty badly for the machine to melt or
> even damage itself. Generally you'd just see performance go down.

So, what level does it fall to?  Where do you "look up" that detail?
If you can only count on 80% of "normal", then why not set the CPU to
run at 80% of normal and design the entire system to operate under
those conditions -- because it *has* to guarantee that all the
intended work actually gets done!
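
The arithmetic, as a sketch (the task names and percentages are
invented purely for illustration; the 80 is the hypothetical
"throttled" figure from above):

```python
def fits(tasks, capacity_pct):
    """tasks maps name -> percent of a full-speed CPU that the task needs.
    Returns (True if everything fits in capacity_pct, total load)."""
    load = sum(tasks.values())
    return load <= capacity_pct, load

# Hypothetical numbers, not measurements
tasks = {"intruder_watch": 35, "hvac": 20, "irrigation": 15}
ok, load = fits(tasks, 80)
```

If the design only fits at 100, it is relying on cooling that it
can't guarantee.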

Or, do you come up with some scheme for prioritizing which activities
can be shed?

"Hmmm... maybe I shouldn't worry about monitoring for intruders as it's
probably more important to ensure the temperature inside the building
stays comfortable for the pets/plants/etc?  Or, maybe skip watering
the yard in the hope that it rains while I concentrate on watching
for burglars?  Or, ..."

This seems like a harder problem to solve!  (esp given that each user
would have his own ideas as to what's important -- and, his own
expectations of how likely these events are to occur!)  :>
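
One naive way to do the shedding, just to show where the user's
opinions end up: a priority number per task.  This is only a sketch;
the greedy rule, the names and the loads are all assumptions on my
part, not anything a real system is known to do.

```python
def shed(tasks, capacity_pct):
    """tasks: list of (name, priority, load_pct); lower priority number
    means more important.  Walk the tasks in priority order, keep each one
    that still fits in the remaining capacity and drop the rest.
    Returns (kept, dropped) as lists of names."""
    kept, dropped, used = [], [], 0
    for name, _prio, load in sorted(tasks, key=lambda t: t[1]):
        if used + load <= capacity_pct:
            kept.append(name)
            used += load
        else:
            dropped.append(name)
    return kept, dropped
```

Note that the same three tasks give different answers for two users
who rank them differently, which is exactly the problem.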

> When summer rolls around my nvidia GPU starts thermally throttling at 90
> or 95 degrees Celsius. I'll get you the exact number in a couple of
> months. :( My house is swamp cooled, and generally about the time my
> machine starts outputting air almost hot enough to boil water I need to
> find a beer and a cool place to drink it :)

What if the machine *can't* exhaust that hot air?  Is the machine smart
enough to throttle back to a point where the interior temperature
DOES NOT CLIMB?  Is it still able to do work?  How much?  Is it
smart enough to remove *power* when it senses that its workload
throttling isn't managing temperature successfully?  *Regardless*
of what "stuff" you've installed in that enclosure?
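
I.e., the behavior I'd *want* is something like the following.  This
is a sketch of the decision logic only; the thresholds and the notion
of integer "performance levels" are assumptions, not how any
particular CPU, BIOS or OS actually does it.

```python
def supervise(temp_c, prev_temp_c, level,
              throttle_at=85.0, cutoff_at=100.0, min_level=1):
    """Decide what to do after one temperature sample.
    level is an integer performance level (higher = faster).
    Returns (new_level, power_on).  Thresholds are hypothetical."""
    if temp_c >= cutoff_at:
        return level, False           # too hot no matter what: remove power
    if temp_c >= throttle_at:
        if level > min_level:
            return level - 1, True    # step performance down one notch
        if temp_c > prev_temp_c:
            return level, False       # fully throttled and STILL climbing
    return level, True                # cool enough, or hot but holding
```

The interesting case is the third one: throttling has run out of
room and the temperature is still rising, so the only safe move left
is to shut off.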

Making an open, general-purpose system that *can* do lots of things is
relatively easy.  Making one that *will* do a specific set of things
is considerably harder.



