[Tfug] good enough is good enough (long)

Bexley Hall bexley401 at yahoo.com
Fri Jul 12 12:49:25 MST 2013


Hi Robert,

On 7/12/2013 8:45 AM, Robert Hunter wrote:
> An interesting talk by Alex Martelli at this year's PyCon.  Might be
> especially relevant to some folks from Bexley Hall.

Naw, the folks there would just *pray* for guidance!

> http://www.youtube.com/watch?v=yo4Uqq7NXQc

This is an *old* argument.  It is predicated on:
- developing an update is "free" (low cost)
- distributing an update is "free" (low cost)
- bugs cost the user nothing (very little)
- user inconvenience (relearning) bears no cost/consequence
- there is no cost to re-running the application
- developers don't understand the problem domain
- the "product" will never be "finished"
- time to market (even a FOSS market!!) is paramount
- etc.

If the market you are addressing is The Desktop -- internet-connected
machines that don't interact with anything of *value* (i.e., just
users and their actions, which are of no importance) -- then you can
get away with this sort of thinking.  (Until your users get tired of
your buggy products and start looking elsewhere.)

But, most (by a LARGE margin) of the software deployed in the
world (!) does not fit these expectations.

[Speaking solely from first-hand experience (i.e., not conjuring
"what-ifs")]

I was tasked with designing an electronic KWHr meter (i.e., the
glass bubble that sits on the outside of your house and figures
how much electricity you've used).  Design had to operate OUTDOORS
in the cold winters of South Dakota along with "Death Valley" summers.
Accuracy at 1% of reading (*not* "full-scale") for a nominal 200A
residential service.  Had to cost $30 (DM+DL), have a life expectancy
of over 30 years, and be produced in 30 million units.  (this was
*such* an easy spec to remember!  :> )
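
To see why "1% of reading" is a much tougher spec than "1% of
full-scale", here's a rough sketch (the numbers and names below are
mine, just to illustrate the arithmetic):

# Hypothetical illustration: allowed error under the two accuracy specs.
FULL_SCALE_AMPS = 200.0          # nominal 200A residential service

def max_error_of_reading(load_amps, pct=0.01):
    """1% *of reading*: the error budget shrinks with the load."""
    return pct * load_amps

def max_error_of_full_scale(load_amps, pct=0.01):
    """1% *of full-scale*: a fixed 2A budget, no matter the load."""
    return pct * FULL_SCALE_AMPS

for load in (1.0, 10.0, 100.0, 200.0):
    print(f"{load:6.1f} A load:  of-reading allows "
          f"{max_error_of_reading(load):5.2f} A,  of-full-scale allows "
          f"{max_error_of_full_scale(load):5.2f} A")

At a 1A load, "of reading" allows 0.01A of error; "of full-scale"
would have allowed 2A -- i.e., 200% of the actual consumption.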

A bug costs the user or the utility real money -- either overbilling
or underbilling.  And, it's unlikely you'll be able to effectively
correct for such an error "at the billing end" since it may not
be a nice, linear correction factor (e.g., for time-of-use metering,
*when* the electricity was consumed determines its price/value;
if the device misrecords consumption, you have no audit trail
by which to go back and "adjust" the bill).
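
A toy example of why a flat, after-the-fact correction doesn't cut
it for time-of-use billing (the rates, usage and error below are all
made up; it's just the shape of the problem):

# Hypothetical time-of-use billing.  The same kWh error costs a
# different amount depending on *when* it happened -- and the meter
# only reports totals, so there's no audit trail to apportion a fix.

RATE = {"peak": 0.30, "off_peak": 0.08}              # $/kWh (made up)

actual   = {"peak": 300.0, "off_peak": 700.0}        # what was really used
recorded = {"peak": 300.0 * 1.05, "off_peak": 700.0} # bug: peak over-read 5%

def bill(usage):
    return sum(RATE[t] * kwh for t, kwh in usage.items())

true_bill  = bill(actual)     # $146.00
wrong_bill = bill(recorded)   # $150.50

# Even if the utility later learns the *total* kWh was over-recorded,
# the best it can do at the billing end is a flat scale factor...
scale = sum(actual.values()) / sum(recorded.values())
print(wrong_bill * scale)     # ~$148.28 -- still not $146.00

The misrecorded peak consumption can only be untangled if you know
*when* it was misrecorded -- which is exactly the data the broken
meter failed to keep.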

Developing an update is costly.  Not just the time coding it
but, also, running it through regulatory agencies to prove that
it is "correct"/fair ("Gee, didn't you tell us the *last* version
was correct?")

Distributing the update is costly as you now need to send a licensed
(union?) electrician to the residence to remove the old unit, install
the new unit (which temporarily interrupts power to the residence:
"Ooops!  Sorry, all your clocks will need to be reset -- and, BTW,
your PC wouldn't have crashed if you'd invested in a UPS!!  But,
hey, that's just a minor inconvenience, right?  It's not like we're
talking about MILLIONS of customers!  Oh, wait..."), record the serial
number of the new unit and return the old unit for a "final
reading/disposition".

And, this has to be done pretty quickly as each month that goes by
means someone is getting cheated out of $$$.

You would *think* that distributing a software update (we'll assume
HARDWARE designers are perfect and it's only the SOFTWARE folks who
make mistakes!  At least we don't hear hardware folks uttering
platitudes like "Well, you EXPECT a certain number of hardware
bugs...") would be a piece of cake -- after all, every meter is
attached to a "wired network"!  Ah, but there are lots of reactive
devices (transformers and phase shaping capacitor banks) in that
network.  So, you'd need to install "bridges" across each transformer
(e.g., typically, only four residences share a single "transformer
output"... and, those transformers are fed by a still larger
transformer in the neighborhood, etc.)  So, how much of that $30
do you want to spend adding hardware/software to the meter *and*
it's "share" of the distribution network to accommodate "software
updates"?

[Note that the *initial* deployment can happen at a much slower
pace -- "as is convenient" -- because the existing OLD meters
are still operational at that time]

====

I was involved in the design of a control system for a tablet press
(machine that makes "pills").  Among other things, the system was
to ensure the proper "weight" of the tablets being produced by the
press.  Presses will produce up to 1.1M tablets per *hour*.  I.e.,
that's three "bottles of 100" every second!  (though they usually
are operated at much lower rates -- esp for more exotic products
where you want to be damn sure every tablet you produce is saleable).

A tablet, once produced, is either good -- or garbage!  You can't
install an update and re-make the tablets.  (well, you can make some
*new* tablets but those that you already made are "loss")

Again, an update costs a lot to distribute:  regulatory agencies
want a say in "validating" the "product"; the equivalent "department"
in the customer's organization wants a say (each customer wanting
their *own* say!); the deployment has to be scheduled into the
production schedule -- a machine that is being "updated" isn't
producing saleable product; employees have to be trained on the
new/repaired features; some initial portion of production may require
extra *manual* inspection to reassure the customer that all is well;
etc.

If you connect the press to some sort of network to ease software
distribution, you then have the risk that some outside agency can
interfere with production ("Prove to me that the tablet press's
operation can't be compromised by ANY sort of network activity...")

====

I designed a "maritime autopilot" many years ago.  The device obtained
real-time position data from a LORAN-C receiver (predates GPS by a
few decades) and, based on knowledge of where the vessel *is*,
would adjust the rudder to drive the vessel to a specific point
on the globe.

[Up to this point, autopilots kept the vessel pointed in a fixed
*direction/heading* but could not accommodate "drift" (cross currents)
that were effectively altering the course-over-ground to something
other than the intended heading.  I.e., I *measured* this drift
and adjusted the rudder to compensate.  If you wanted to get to
a specific lobster pot deployed off the tip of Cape Cod, you *got*
to that lobster pot -- not "somewhere nearby, depending on the currents
that day"]

But, the ocean doesn't follow prescribed rules.  There's nothing
that *guarantees* that the currents will be of a particular
magnitude.  Or, that the autopilot will be deployed on a 20 ton
displacement vessel vs. a 50 ton.  Or, that the engines will be
able to fight a given current, etc.

On which revision of the software do you introduce the code that
prevents the vessel from crashing into the rocks off the coast?
*How* did you "discover" the need for this "update"?  ("Ooops!
Sorry about your boat, Charlie...")

How do you explain the need for a particular update to a fisherman?
Does he understand the concept of geometric dilution of precision?
Does he understand the risk of the ambiguity that operating in the
region of the baseline extension poses to the system's ability to
determine that "you are here" -- vs. "you are 800 yards NNW of here"?
(Meanwhile, we spent days researching whether the keypad should be
laid out "like a telephone" or "like a calculator".  And, more days
deciding how many pounds of force the user could be expected to
exert on the keypad -- since it had to be designed to tolerate fish
guts being strewn across it!)

====

"Programmers" (folks who work in isolated, safe little desktop
domains) are spoiled thinking there are no consequences to their
mistakes.  What does it matter if we swap the functions assigned to
the mouse buttons -- then, swap them back in the next release?

"So, your paycheck is off by $300 this week.  We'll fix it NEXT
week.  Or, the week after..." ("But my creditors are expecting
that $300 *this* week!")

But, other industries/professions aren't as coddled.

We have a friend who underwent a hysterectomy a few years back.
At the time, she told the surgeon to remove her ovaries -- she was
done bearing children and they were more of a liability than an
asset at that point in her life.

Apparently, it wasn't "convenient" for the surgeon to get to the
ovaries (let's assume the surgeon wasn't lazy but had a bona fide
reason -- risk/reward -- for not doing so).

She now has stage 4 ovarian cancer.  Had the surgeon removed them,
this might have been prevented.  Or, if the cancer had already
developed, she would at least have had a couple years head start
on a treatment regimen.

Were the surgeon's actions "good enough"?  How many corners can
I cut in a product's development -- regardless of their consequences
to the user -- and still consider it "good enough"?  Perhaps version
*2* of that hysterectomy will be better??  (though I don't think
*she* will benefit from it!)

Would you be happy if the Da Vinci robot cut off your testicles
instead of the prostatectomy that it was trying to perform?
("We'll fix that later -- for free!")

Would you be happy if your 3GHz PC spent 99% of its time in a loop
counting from 0 to infinity, so that it acted like a 30MHz PC?
("We'll fix that later -- for free!")

Would you be happy if the HDMI input in your new TV didn't work?
("We'll fix that later -- for free!")

Would you be happy if the active restraint system in your car
didn't work?  ("We'll fix that later -- for free!")

"Release early.  Release often.  You *EXPECT* a certain number
of bugs!  Don't sweat the details..."

Nothankyouverymuch.

YMMV, of course.

--don
