
on genetic programming



Friends,
 
Michael Pazzani (in his reply to my series of links on speed and genetic
programming) raises some very pertinent points.  As it is Christmas Day, I will
take a little time to answer them!
 
>The great thing about genetic programming, is its generality. it can solve a
>wide variety of problems, from learning classification rules, curve fitting,
>robot control, game playing etc. But I don't think you really care about this
>generality.
 
Actually, I do, and I have been exploring the specifics too.  I am currently
using the GP approach on three different problems -
 
1.  Optimising a 7-tap FIR filter to apply to noisy data derived from a laser
Doppler rheometer.  I have already carried out about 1 month of conventional GA
optimisation on this problem as well, and am comparing the GA and GP
approaches.
 
2.  Optimising a window 7 stack filter for the same purpose.  I have already
carried out a heuristic search for the same solution, and will be comparing the
results.
 
3.  Fitting a curve to what has been conventionally accepted (but never
actually quantitated as such) as a multiple exponential decay process in what
is termed 'reactive hyperaemia' (this is the increase in blood flow following
cessation of occlusion of the arterial supply to a peripheral part of the
body).  I will challenge our statistician to pursue more conventional
approaches to fitting curves to the same data (knowing that a single
exponential does not fit well, and that trying to regress multiple exponentials
is rather testing!).
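For concreteness, the fitness evaluation in (1) can be sketched roughly as follows: a candidate is a vector of seven tap weights, scored by mean squared error between the filtered noisy signal and the clean target. The function names and the choice of MSE are illustrative assumptions, not a definitive account of my implementation.

```python
# Sketch of the fitness evaluation for problem (1): a candidate is a
# vector of 7 FIR tap weights, scored by mean squared error between
# the filtered noisy signal and the hand-cleaned target.

def fir_filter(taps, signal):
    """Apply an odd-length FIR filter; edges are passed through unfiltered."""
    n = len(taps)
    half = n // 2
    out = list(signal)
    for i in range(half, len(signal) - half):
        out[i] = sum(taps[j] * signal[i - half + j] for j in range(n))
    return out

def fitness(taps, noisy, clean):
    """Lower is better: mean squared error against the clean target."""
    filtered = fir_filter(taps, noisy)
    return sum((f - c) ** 2 for f, c in zip(filtered, clean)) / len(clean)
```

The same scoring serves for both the GA (taps as the genome) and the GP (taps read off the evolved S-expression), which is what makes the two approaches directly comparable.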
 
In all three cases, a priori knowledge is so limited - and conventional approaches
so difficult - that GP is a realistic option. It has already improved on the GA
solution to (1), and looks as if it will also improve on my heuristics in (2).
As far as I am aware from the literature, stack filter optimisation has been
tackled by conventional GA and by neural network (to date, only perceptron)
approaches, and for the sort of
situation that I am in, the problems (in terms of applicability, requirement
for exemplary data, and speed) are very similar.  I have yet to see an approach
via simulated annealing (or, if we are to believe Lester Ingber, very fast
simulated reannealing), but I must admit to finding it very hard to grasp
exactly how to apply it to this case (NN, GA and GP I can conceptualise easily;
SA and VFSR leave me struggling to work out how to change the temperature!).
Stack filtering is notoriously slow in software, as it is designed for custom
IC implementation, but it so happens that a median filter easily outperforms
the best 7-tap FIR on my sample data, so it is an avenue which has to be
explored.
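That median yardstick is simple enough to sketch: a window-7 median filter is the most elementary stack filter, shown here in its plain software form rather than the threshold-decomposition form intended for custom ICs. Details are illustrative.

```python
# A window-7 median filter - the simplest stack filter, and the
# benchmark the FIR solutions have to beat on my sample data.
# Plain software form; edges are passed through unchanged.

def median7(signal):
    out = list(signal)
    for i in range(3, len(signal) - 3):
        # middle element of the sorted 7-sample window
        out[i] = sorted(signal[i - 3:i + 4])[3]
    return out
```

Sorting each window makes clear why software stack filtering is slow: the hardware form evaluates all thresholds in parallel, whereas here every output sample costs a sort.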
 
Although I appreciate the problems with unseen data, I have applied the GA and
GP optima to real data and eyeballed the results, and they *are* as good as on
the exemplary data. Note, though, that my exemplary data are really fictitious
anyway - there is no such thing as 'clean' data, only what I imagine to be
clean! What I therefore did was to hand-clean one 600-point dataset and use the
result as the target, then add artificially generated noise, statistically
similar to that apparent in real data, to produce the dirty input data.
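The manufacture of that test pair can be sketched as below. I have assumed Gaussian noise here purely for illustration; the real noise model should of course be matched to whatever statistics the actual recordings show.

```python
import random

# Sketch of manufacturing the exemplary pair: the hand-cleaned
# 600-point trace is the target, and the 'dirty' input is that target
# plus artificial noise.  Gaussian noise with a sigma estimated from
# the real recordings is an assumption, not the definitive model.

def make_dirty(clean, sigma, seed=None):
    rng = random.Random(seed)
    return [c + rng.gauss(0.0, sigma) for c in clean]
```

Seeding the generator makes the dirty dataset reproducible, so GA and GP runs can be compared on identical inputs.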
 
The other aspect of this work is that up to now, it has proved intractable, or
solutions adopted have appeared greatly inferior to these GA and GP ones.  For
instance, until now, the instrument in question has only ever had RC
single-pole filters used on its data, and the fitness values from such filters
(and eyeball results) are greatly inferior to even the roughest GA or GP
solution - in other words, GA and GP are already easily out-performing all
previous solutions.  In the case of (2), stack filter fitting is (otherwise)
pretty well guesswork, and generally restricted to simple forms such as
medians, and in the case of (3), no-one has yet fitted an equation to the data
(although the form of the curve has been well described for about 70 years!).
 
I do not believe that simply taking the best S-expression out of a few GP runs
is a valid result - but the proposal to take complexity into account in fitness
measurement raises many other questions (e.g. in a curve fit, you must perform
careful simplification first, and eliminate terms which contribute little - GP
is notorious for including lots of redundant or insignificant terms).  However,
I think you will agree that that is not what I intend to do.
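To show where the difficulty lies, one common way of folding complexity into fitness is a parsimony term that penalises S-expression size. Everything here - the nested-tuple representation and the penalty weight - is an illustrative assumption, and choosing that weight is exactly one of the questions the proposal raises: too large and useful terms are pruned, too small and the redundant ones survive.

```python
# Parsimony-adjusted fitness sketch: raw error plus a penalty on
# S-expression size.  S-expressions are nested tuples, e.g.
# ('+', 'x', ('*', 'x', 'x')); all details are illustrative.

def count_nodes(expr):
    """Count every function and terminal node in the expression tree."""
    if isinstance(expr, tuple):
        return 1 + sum(count_nodes(e) for e in expr[1:])
    return 1

def penalised_fitness(raw_error, expr, weight=0.01):
    """Lower is better: raw error plus a size penalty."""
    return raw_error + weight * count_nodes(expr)
```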
 
In conclusion, I agree that Koza's work is a significant research contribution;
indeed, for a large number of previously intractable problems, it is a very
strong hope for major steps forward.  However, it is very important that
*others* gain experience with GP as well, and I am sure that he and we have
plenty to learn about it yet.  Do you think that I will convince our financiers
to buy the farm of Quadras, then?!
 
Happy Christmas,
Howard.
Howard Oakley,                      * Howard@quercus.demon.co.uk
EHN & DIJ Oakley, Brooklands Lodge, * AppleLink UK0392
Park View Close, Wroxall, Ventnor,  * CompuServe 70734,120
Isle of Wight UK PO38 3EQ           * Tel +44 983 853605, fax 853253