Sunday, January 29, 2012

Evolutionary Algorithm Framework

I've been trying to come up with some satisfying framework/library for doing evolutionary algorithms and possibly other types of stochastic optimization for a while now. Part of the problem is that the space of possible algorithms is huge and diverse, and its easy to see how any design decision restricts the ability to express certain types of EAs. Another problem is that I've not been very organized about this. To combat the second problem I've decided to record some of my ideas before I forget them.

First off is the structure of the program. Should it be a library, like a set of combinators with which one constructs the desired algorithm? This would be nice, but I think ultimately it would be a bunch of individual libraries of functions to build different types of algorithms. The problem is that there is so much diversity in possible structures that one might use that to be truly general we can't say much at all about the stochastic operators and what structure they require of the population or individuals.

This leads to the second possibility, which is some definition of EAs in particular as a series of constraints. This would mean a class of some sort describing operations necessary to describe an EA. Part of the problem there is that it is hard to see what these really are, and how they are used to build an EA. This is true for two reasons that I've seen, the main one being state. Its hard to say what state will be necessary, and every algorithm might need slightly more or less than any other. The other reason is that even within GAs its hard to exactly specify the general concept of mutation, for example. It can mean several different things in different techniques, and the more general the definition the less it really says. This line of thinking always leads me to ideas that don't really help the user at all. They say the more abstract you go the more work you need to do to get something concrete, and I agree.

The other idea is to be more explicit about what I want, rather then getting caught up in generality. This would mean defining what I believe an EA really is, and then working on that problem instead, where hopefully the extra structure gives some inspiration. So what does an EA consist of? I'm my ponderings I have often thought that it is important to separate the parts out as much as possible to prevent unnecessary couplings between them. For example, the individuals should not "know" how to mutate themselves, in the sense that it shouldn't be part of their structure. One problem with that sort of coupling is when you go from, say, a regular GA to a GEP, where there are several mutation operators. Perhaps you could say mutation is the composition of each of the several operators, but this still doesn't help much when you want the same structure to mutate in different ways in different algorithms or different times in an algorithm.

Starting with the notion of a population, what sort of thing do we have? It would be easy to say that a population is a *set* of *lists* (or fixed length vectors) of *symbols*, but there are situations in which all of these things do not necessarily hold. It would be nice to include linear or grid populations, individuals with multiple "domains", and things like number or arbitrary structures rather then just symbols. Instead we might say that populations and individuals are combinatorial species, which is very generic, that they satisfy some predicate (are members of some classes) that describe "enough" structure to do EAs, or that they are members of some specific data type which should be defined to have as many possible structures as possible.

The last idea requires some single type, recursively defined, that covers enough ground to express the usual sort of EAs (I don't expect to cover all possibilities- I'm not sure thats really possible in a single library without generalizing to the point of uselessness). This would be something like "data Structure = R Double | Integer Int | Str String | Linear [Structure] | Pair Structure Structure | None", or something like that.

The idea of doing this with species is always intriguing, but I don't know enough about them to know if this is useful or approachable. This gets at the main problem here, which is that EAs are so general that we really need a general theory of container types to work within. The reason that this is a problem is that such a theory is a subject of study in its own right, and I don't know much about it.

Okay, I guess thats enough of that for now. Sorry about the rambling and aimless post, but I want this to go somewhere eventually, and I need to start researching and organizing my thoughts.

No comments:

Post a Comment