Its Computer Science Time: July 2010

Wednesday, July 28, 2010

Symbolic Regression Success!

I am so proud of myself. I successfully applied PGEP to two different symbolic regression problems (the one in the original GEP paper and the one in the original PGEP paper). The solutions are so accurate I'm certain they are equal to the correct ones, as I'm getting minuscule numbers for the residual error.
This is especially awesome because I've had problems with duplicating other peoples results, but this time it seems like I'm able to find the correct functions. Especially when adding constants to the terminal set, which seems to have improve the quality of the solutions by several thousand times with only 1, 2, 3, 4, and 7.
Now if only I can replicate those results (or something similar) with BPGEP I will be able to actually start collecting data and all that.

Friday, July 23, 2010

Independent Research Again

I am much further along in my attempt to write a Java program with some among of pure functional programming mixed in. I will be mutating the occasional state, but only because I don't currently have the time or interest in building enough infrastructure to avoid it. Right now I am using almost no new types. Populations are lists, individuals are lists, loci are symbols (which is a class I defined). Almost every part of the system is a higher order function, and no complex type hierarchies are getting in my way. I am really enjoying it, even if it is just Java, and I still have to fight the language and accept messy constructs for some things. Soon enough I will have a binary prefix gene expression program doing some simple symbolic regression using the fitness function I found in a paper on constant creation techniques in gene expression programming.

Saturday, July 17, 2010

Independant Research Update

I've got some work done on the evolutionary algorithm framework, which I'll informally call fea (functional evolutionary algorithms) (informally because I don't feel the need to give names to my projects).
The use of functional programming concepts such as composition and the basic data structures (lists and functions) the framework is much nicer, cleaner, and more expressive than it was before. Making an experiment consists of composing the right units into the right sequence, not filling out a lot of subclasses. I am spending a lot of time fighting java's generics and its syntax, as well as wrapping code in what are essentially lambdas (this would be so much nicer if java would just add closures).
I am just now starting to express gep, and I am glad to find that even in this infant form, my program can express them pretty much like a ga (which is the sample case I used to build the framework for). The only change is exactly where it should be, which is fitness evaluation.
Ideally I would have fitness evaluation consist of providing a translator from list to list and then an evaluator someobject->double. This would isolate the parts of fitness evaluation. There is a great deal of generalizing and abstracting work of this sort to do.

On another note, while programming this, I discovered and discarded self-bounded types. They seem interesting and potentially useful, but they can't seem to introduce generics in subclasses without breaking (I could write the classes but not instantiate them). Maybe I will post some code explaining them, cause they seem to have some cool properties and may be useful someday.

Friday, July 9, 2010

Starting Out

I would like to post about other things, but right now I'm starting to rewrite my evolutionary algorithm framework from scratch using the function java library. I would like to build it sort of "bottom up" style so that I can avoid some of the problem with my current framework.
The main thing is that the in the current one you fill in holes in a large and static structure (mostly) and are therefore constrained by what types of things fit in the holes. I would like to make it mostly about function composition and reuse of small parts, with a trivial little driver loop to get it going.

The representation of the whole algorithm and its state will be (Population, Environment, NextGenFunction, TerminationFunction). Populations are structures that hold structures of items, the env binds names to objects and can evaluate individuals (I hate to mix those ideas, but thats what it is for now), the nextgen function takes a population and returns a new one, and the termination function is from population to boolean and determines if the algorithm is done.
population is S I a, env is O->O next gen is Pop -> Pop, and term is Pop->Bool.