[Tfug] Smoothing curves

Bexley Hall bexley401 at yahoo.com
Wed Dec 3 02:31:03 MST 2008


Hi, Rich,

> > Ideally, such a filter would be configurable -- to
> > allow its effect to be tailored to the environment
> > (i.e., user + input device) without impacting the
> > recognizer itself.
> 
> > In canvassing the literature, it seems that many
> > recognizers use such a feature.  But, their design
> > seems to be somewhat ad hoc/arbitrary -- there
> > is no *reasoned* explanation for why a particular
> > implementation is chosen over some other (better?)
> > implementation.  Indeed, even the coefficients used
> > in these filters seem somewhat arbitrary!  It's as
> > if the implementers just "tried something" and
> > didn't even prove to themselves that their particular
> > approach had merit -- let alone being ideal!  :<
> 
> So you're saying there's no research to back up
> these proprietary solutions.

Correct.  It "feels" like they've just "had a problem"
with the data and "massaged it" a bit to lessen it.
I.e., none of the implementations give more than a
cursory mention of the "filtering" applied to the data.
Nor do they justify their particular *choice* of filter!

Note that (IMO) naively averaging data points in this way
can degrade the data set as it throws away detail (e.g.,
points of intentional high curvature).  This seems like it
would be a real issue as most of these recognizers are
applied to on-line character recognition (and many glyphs
exhibit striking discontinuities in the "path" which are
key to resolving ambiguities between characters -- like
'5' vs. 'S').

In my case, I'm not trying to match "characters" per se...
but, any filtering would obviously have an impact on the
choice of templates that I support (for the above reason).

> To state the obvious, maybe you
> should look for research papers on the subject.

<grin>  That's what I have done.  I probably have 70-80
different papers on just this subject.  It is disheartening
to see how this aspect seems to have been universally
ignored in all of them (i.e., lots of hand-waving but no
substance to back it up).

> This is a trivial motion capture problem. There should be
> something in Siggraph or the ACM Transactions on Graphics.
> 
> Another area to research would be digital image
> stabilizing. Drop the inputs -- the feature recognition part
> -- and plunder the raw smoothing algorithm.

I'll do some hunting.  Thanks.
 
> > Some, for example, simply "average" each point's
> > coordinates with its immediate neighbors.  I.e.,
> >   X = (X[i-1] + X[i] + X[i+1]) / 3
> >   Y = (Y[i-1] + Y[i] + Y[i+1]) / 3
> That seems reasonable for testing. I'd have this as a
> control in my research.

But, it's just arbitrary.  Why look at just the immediate
neighbors?  Why give them each equal weight?  Etc.  I'd
like to see an explanation/justification as to why it is
*needed* and why this is the "right" way to approach it.
They offer *neither* :<
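
For reference, that naive scheme amounts to something like the
sketch below.  (The pass-through handling of the endpoints is my
own arbitrary choice -- which rather proves the point about
unexplained design decisions.)

```python
# A minimal sketch of the naive 3-point averaging filter quoted
# above.  Endpoints are passed through unchanged -- one of several
# arbitrary choices an implementer has to make without stating why.
def smooth_3pt(points):
    """points: list of (x, y) tuples; returns a new, smoothed list."""
    if len(points) < 3:
        return list(points)
    out = [points[0]]                       # endpoint: passed through
    for i in range(1, len(points) - 1):
        x = (points[i-1][0] + points[i][0] + points[i+1][0]) / 3.0
        y = (points[i-1][1] + points[i][1] + points[i+1][1]) / 3.0
        out.append((x, y))
    out.append(points[-1])                  # endpoint: passed through
    return out
```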

> > Others claim to be trying to fit a Gaussian to each
> > by adjusting the weights of neighbors accordingly:
> >   X = (X[i-1])/5 + 3*(X[i])/5 + (X[i+1])/5
> >   Y = (Y[i-1])/5 + 3*(Y[i])/5 + (Y[i+1])/5
> > This latter approach, of course, ignores pesky little
> > details like the role of "sigma", etc.
> Evidently they've seen the other implementation and
> decided to "improve" on it, without knowing why.

Exactly.
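
For what it's worth, a real 3-tap Gaussian kernel *is* fully
determined by sigma, as the sketch below shows.  Working backwards,
the (1/5, 3/5, 1/5) weights quoted above do fall out of a Gaussian
with sigma of roughly 0.67 sample intervals -- but none of the
papers state that value, let alone justify it.

```python
import math

# Sketch: derive normalized 3-tap weights from an actual Gaussian
# for a chosen sigma.  A center-to-neighbor ratio of 3 (as in the
# ad hoc 1/5, 3/5, 1/5 weights) implies exp(1/(2*sigma^2)) = 3,
# i.e. sigma ~= 0.67 -- a value the implementers never mention.
def gaussian_3tap(sigma):
    w = [math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in (-1, 0, 1)]
    total = sum(w)
    return [wi / total for wi in w]
```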

I see two (at least?) issues that could suggest the need for
some sort of "pre-conditioning" of the data:
- issues related to the capture device (e.g., genuine electrical
  "noise" being induced into the system)
- issues related to the user (e.g., tremor, biomechanical
  aspects, etc.)
I (personally) would fold the first class of problems into a
device driver for the particular input device.  I.e., let the
"virtual device" handle presenting its data in a "noise free"
manner.

The latter, I believe, should be handled in the recognizer as it
represents an aspect of the input *methodology* that must be
accommodated (as opposed to the input *device*).  For example,
dehooking pen-based gestures, etc.

> > Almost universally, each approach applies the
> > filter in the time domain instead of in space.
> Hm.
> 
> > I.e.,
> > P[i-1] and P[i] are separated by a fixed amount of *time*
> > as are P[i] and P[i+1], etc.  I haven't been able to
> > convince myself that this is correct or incorrect.
> > <frown>  But, neither have the implementers!
>
> You're talking about real-time data.

Yes.  That's the world I work in.  :>

> Temporal smoothing is the simplest implementation.

Yes.  I can see someone copping out and taking the easy
approach -- but, then, *admit* this is what/why you are
doing to give it *some* justification.  But, the
implementations seem to gloss over all of this -- almost
as if they were unaware of the consequences.

E.g., I can see using "distance-squared" as a metric for
how well a curve fits a template -- it's easier (cheaper)
to compute this than it would be to compute "distance".
But, then this reason is explicitly acknowledged.
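
The distance-squared shortcut just described is easy to state
precisely -- squared distance is monotonic in true distance, so it
preserves the *ordering* of candidates while skipping the sqrt.  A
sketch (the template representation here is a stand-in of my own):

```python
# Sketch of the acknowledged shortcut: comparing squared distances
# picks the same winner as comparing true distances, since x**2 is
# monotonic for x >= 0 -- so the sqrt can be skipped when only the
# *ranking* of candidates matters.
def dist_sq(p, q):
    dx, dy = p[0] - q[0], p[1] - q[1]
    return dx * dx + dy * dy

def nearest_template(point, templates):
    # min() over squared distance returns the same template as it
    # would over true distance -- monotonicity being the stated
    # rationale, exactly what the filtering literature lacks.
    return min(templates, key=lambda t: dist_sq(point, t))
```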

The fact that no such acknowledgement is made of the
"filtering" approaches leads me to worry that the
implementers just didn't *understand* the Why behind
their design choice but, rather, just did it because it
*seemed* to make things run better...  :<

> But for truly accurate data, you'd do spatial smoothing.

That's what I would *think*... yet, I can argue myself
into a similar rationalization *for* temporal application.
:<

> > An alternative (more rational?) approach is to apply
> > the filter in *space* -- so that points *farther*
> > away have considerably less influence on the given
> > point regardless of their proximity in time.  E.g.,
> > if P[i+1] is "quite far" from P[i], then its weight on
> > the average should NOT be the same as that of P[i-1]
> > (which may have been physically *closer*) despite
> > the fact that each is separated in *time* from P[i]
> > by the exact same interval.
> 
> I'd go for something between the two. With lots of
> input data, it would be purely spatial, but as input slowed
> down, I'd start aborting the data logger for each
> spatial group.

I'm trying to bury this filtering in the "front end" of
the acquisition routines.  I work in a resource-starved
environment so I don't let algorithms carry much state.
Ideally, I would like to implement a small sliding window
centered on the "current input datum" and, from that, feed
a "filtered datum" to the recognizer.  So, the recognizer
is unaware of the filtering *and* the filter's cost can
be low.
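
Something like the sketch below is what I have in mind: a 3-point
window where the neighbors' weights fall off with *spatial*
distance from the current datum, so one sample of lag and almost no
state.  The inverse-distance kernel here is just my assumption --
any monotonically decreasing weighting would do.

```python
import math

# Sketch of a spatially-weighted 3-point window: a neighbor that is
# physically far from the current datum (fast pen motion) pulls less
# on the average than a close one, even though both are exactly one
# sample interval away in *time*.  State is just the window itself.
def spatial_smooth(prev, cur, nxt, eps=1e-6):
    """prev/cur/nxt: (x, y) tuples; returns smoothed cur."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    # 1/d weighting is an assumption on my part; eps avoids division
    # by zero when consecutive samples repeat.
    wp = 1.0 / (dist(prev, cur) + eps)
    wn = 1.0 / (dist(nxt, cur) + eps)
    wc = max(wp, wn)   # keep the center at least as heavy as a neighbor
    total = wp + wc + wn
    x = (wp * prev[0] + wc * cur[0] + wn * nxt[0]) / total
    y = (wp * prev[1] + wc * cur[1] + wn * nxt[1]) / total
    return (x, y)
```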

Note that the more state the filter has to track, the greater
the lag in the delivery of data to the actual recognizer
and, thus, the less responsive the algorithm becomes  :<
The more I can trend towards "on-line" recognition (i.e.,
little state) vs. "off-line" algorithms (i.e., take a
snapshot of the "finished" gesture and work from that!),
the less horsepower I will need to attain a given level of
responsiveness  :<

> For the spatial algorithm, I'd look at fitting a bezier
> curve to averaged control points. Maybe a cubic curve. But
> we're talking about lots of calculation to find control
> points that lie off the actual path.

<grin>  That's what the recognizer *proper* does.  I just
want to apply some preprocessing to the data to ensure that
the recognizer sees what the user "intended" the data to be
(irregardless -- my favorite non-word! -- of what the
hardware device and user's physiology imposed on the process)
 
> > Lastly, does treating each coordinate independently
> > of the other really make sense?  Or, should any
> > filter model the location (in 2-space) of those
> > other points in their impact on the point in question?
> To fit mathematical curves to the data, you should consider
> all dimensions at once. You're looking for vectors, and
> they're two dimensional.
> 
> > Having independent controls for each axis is a big
> > win for many devices.  For example, the horizontal
> > characteristics of pen-on-tablet motions are very
> > different from the vertical ones.  And, their
> > effects *swap* when applied to things like mice...
> > 
> > But, it seems (intuitively) like this process should
> > be modeled as a "flexible stylus" moving through a
> > "viscous fluid".  I.e., the stylus' stiffness and
> > liquid's viscosity are the parameters being tweaked
> > to affect the path that the *tip* of the stylus
> > actually takes.  [But, I have *no* experience with
> > the behaviour of fluids so I can't draw on anything
> > besides intuition to clarify that to myself  :< ]
> 
> That seems to be an excessive solution -- modelling
> something computationally *very* expensive to achieve a
> desired result which would really be a simple mathematical
> curve.

Yes, I was simply trying to come up with an intuitive
analog for what I am trying to do to act as an inspiration
for alternative solutions.
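
To Rich's point, the intuition does collapse to a simple function:
to first order, a tip dragged through a viscous fluid is just a
single-pole low-pass filter, with one parameter playing the
viscosity role.  A sketch (the mapping to a first-order filter is
my own simplification of the analogy):

```python
# Sketch: to first order, a tip dragged through a viscous fluid is a
# single-pole low-pass filter -- the tip moves a fixed fraction of
# the way toward the new pen position each sample.  `alpha` plays
# the viscosity role: small alpha = thick fluid = heavy smoothing.
# One point of state, no lag buffer.
def viscous_filter(points, alpha=0.5):
    """points: iterable of (x, y); yields smoothed points."""
    tip = None
    for p in points:
        if tip is None:
            tip = p
        else:
            tip = (tip[0] + alpha * (p[0] - tip[0]),
                   tip[1] + alpha * (p[1] - tip[1]))
        yield tip
```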
 
> *Every* solution you come up with could be represented by a
> mathematical function -- you just have to find the right
> one.

Of course!  The trick is coming up with a model (in your head)
for what is happening so you can design the right "compensation".

> Have fun!

*Always*!  :>

Thx,
--don

More information about the tfug mailing list