Wednesday, October 13, 2010

Abstract Structure of Evolutionary Algorithms

Very often abstraction comes at a cost. I've often heard that we should avoid abstraction for abstraction's sake- if we put in the effort to abstract, it should serve a purpose. The more abstract something is, the more work is needed to get concrete results out of it. I find Category Theory really fascinating, but dauntingly abstract. This comes up in programming all the time. If you create some library, what is the domain of problems it covers? Is it the most general possible solution, which then needs to be specialized by the user before any real work is done, or does it provide just one solution to a problem, so that users may have to write their own library if they need to express something it doesn't cover? This is an interesting trade-off, especially when the subject of the program is something complex and interesting. Evolutionary algorithms vary enormously, and it is not obvious that they can be characterized. In fact, I've noticed that there is almost nothing I can pin down and say "this is always the case for all EAs I can think of." Part of the problem is just that the space of possible EAs is so large. I've been thinking a lot about this, as I would like my Haskell library to be as general as possible while still being usable. This comes up especially in research, as I want to be able to try different variations of the methods I'm studying without rewriting the whole framework. If there were some way to capture the structure of EAs in general, or even just of linear, GA-like programs, it might lead to a useful library and a deeper understanding of EAs. At least it would deepen *my* understanding of EAs.

One of the nice things about Haskell is that it forces you to be explicit about things that are otherwise implicit. This means that when we look at the type signatures in a library for some theory or method, they can tell us something about the nature of that theory or method. For example, in EAs everything requires some generation of random numbers. This is not just about GEP or GA or GP or something like that- AI is stochastic. I am not content with just saying that if we don't need randomness then we don't use it (adding randomness generalizes not having randomness, in that we can get back to not having it by not using it), because that justification allows anything at all to be added as long as we don't have to use it. But randomness stays- it really does describe something fundamental in EAs and AI in general.
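To make that concrete, here is a minimal sketch (assuming the MonadRandom package; the operator and its name are just made up for illustration) of how a type signature forces the randomness out into the open:

```haskell
import Control.Monad.Random (MonadRandom, getRandomR)

-- A hypothetical point mutation on a bit-string individual. The
-- MonadRandom constraint in the signature makes the dependence on a
-- source of random numbers explicit- the operator cannot even be
-- written without it.
pointMutate :: MonadRandom m => [Bool] -> m [Bool]
pointMutate genes = do
  i <- getRandomR (0, length genes - 1)   -- pick one locus uniformly
  return [ if j == i then not g else g    -- flip just that locus
         | (j, g) <- zip [0 ..] genes ]
```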

The next thing is the population- every evolutionary computation that I am aware of has some population. This is not as easy to model as we might hope, as the population can have many different structures, though often it is effectively unstructured (a bag). I don't think it is enough to say that the population is a functor. All that tells us is that we can act on the individuals. While this is important, there appear to be at least two types of operators in the methods I'm familiar with- operators that act on one individual (point mutation, rotation, IS, RIS) and operators that act on groups (often producing groups of the same size), like the various crossovers. I think selection can be put in this second type- tournament selection makes potentially small groups, while roulette always uses the largest possible group. The ability to describe operations this way may turn out to be useful. The problem appears to be in the grouping mechanism- Do we need random groups (normal crossover)? Groups influenced by the structure of the population (crossover in a CGA)? With replacement (tournament selection)? Without replacement (potentially crossover, though it could be done with replacement)? And even then we have problems- in a CGA a crossover can occur with any cell around an individual, so do we get all the cells and produce only one individual? What about pairwise crossover, where both individuals (the children) must be added back to the population? Or when the parents survive crossover, adding their children without dying off themselves? If we could characterize all this, maybe the general structure of populations could be defined, but basically the structure of a population is, as far as I can tell, just "some structure." There is not much else we can say, really. Also, we shouldn't force a structure to have only one grouping mechanism, in case we want to do multiple types of grouping, such as for different operators.
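Just as a sketch of what I mean by the two kinds of operators (these type synonyms are my own invention, not any settled design), the point/group distinction and the troublesome grouping mechanism might look like this:

```haskell
-- Operators that touch a single individual (point mutation, rotation,
-- IS, RIS) versus operators that touch a group (crossovers, and
-- arguably selection). Both may need randomness, hence the monad m.
type PointOp m a = a -> m a
type GroupOp m a = [a] -> m [a]

-- A grouping mechanism pulls groups out of some population structure f,
-- applies a group operator, and puts the results back. How the groups
-- are formed (randomly, by neighborhood in a CGA, with or without
-- replacement) is exactly the part that is hard to pin down.
type Grouping m f a = GroupOp m a -> f a -> m (f a)
```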

Moving on from structure and operations to the individuals that fill up the structure. Again, all we can say is that they have "some structure." In a way individuals may be even worse- at least above I was assuming that populations were homogeneous (with the assumption that multiple groups, as in co-evolution, constitute multiple populations). Individuals can be very complex- fixed-length linear lists over a fixed alphabet at best, but possibly a tree, or a list containing multiple types of things (bits as well as real numbers as well as symbols, etc). Perhaps this means that individuals are some algebraic datatype, to be deconstructed as needed. This is nice, but very general. Some encodings are quite complex and may include a variety of types of loci next to each other. It would be nice to say "a list of loci" rather than "a real number followed by 8 bits followed by a real number followed by 8 bits..." and so on. So do we have some "evolutionary component"? What properties does it have? Often this idea is used in the more complex GAs so we can say something like "mutate all loci" and have each locus know what that means for itself. Then we can do crossover over a list of things without worrying about what they are, as long as they are in a list. Crossover of course has a different meaning depending on what structure we are talking about. And there is no use forcing individuals to be structures that have *a* mutation and *a* crossover operation, because we can have multiple crossovers or mutations for the same structure in the same EA.
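As a rough sketch of the "evolutionary component" idea (the names and the particular loci here are made up; a real encoding would differ), an individual could be a list of loci where each locus knows what mutation means for itself:

```haskell
import Control.Monad.Random (MonadRandom, getRandomR)

-- A made-up "evolutionary component": each locus carries its own
-- notion of mutation, so an individual is just a list of loci.
data Locus = BitLocus Bool
           | RealLocus Double
           | SymbolLocus Char
  deriving Show

type Individual = [Locus]

mutateLocus :: MonadRandom m => Locus -> m Locus
mutateLocus (BitLocus b)    = return (BitLocus (not b))
mutateLocus (RealLocus x)   = do d <- getRandomR (-0.1, 0.1)   -- small perturbation
                                 return (RealLocus (x + d))
mutateLocus (SymbolLocus _) = do c <- getRandomR ('a', 'z')    -- resample the symbol
                                 return (SymbolLocus c)

-- "Mutate all loci" without caring which kind each locus is.
mutateAll :: MonadRandom m => Individual -> m Individual
mutateAll = mapM mutateLocus
```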

Back to the Haskell part: in addition to randomness, all EAs keep track of some changing state. This means we have the random generator as state, as well as some problem-specific (method-specific) environment. This can include rates (mutation rate, crossover rate, etc), the current population, and anything the operators need that can't be fixed before the start of the EA. This by itself is a problem, as any time we need to keep track of additional information we need to change the structure we carry around as the state of the EA. Even if we just say it is some environment, this already means a two-layer state monad. Should we add logging, so we can run experiments and write things out? This would give us a pretty complex structure, but maybe that is the best we can do- I really don't know.
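One plausible way to layer this (a sketch only, using the mtl and MonadRandom packages; the Env record and its field names are just placeholders) would be a monad stack with state, randomness, and a log:

```haskell
import Control.Monad.Random (RandT)
import Control.Monad.State  (StateT)
import Control.Monad.Writer (Writer)
import System.Random        (StdGen)

-- A hypothetical environment for one particular GA-like method. Any
-- new piece of bookkeeping means changing this record- which is the
-- problem described above.
data Env pop = Env
  { mutationRate  :: Double
  , crossoverRate :: Double
  , population    :: pop
  }

-- Mutable environment, random generator, and a log of experiment
-- output, layered as monad transformers. This is one arrangement,
-- not the only one.
type EA pop a = StateT (Env pop) (RandT StdGen (Writer [String])) a
```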

And forget about just the data structures- what about the evolutionary process? It goes through generations (though I once saw a CGA that worked not in the usual all-at-once style, but incrementally). Each generation applies some operations that make changes to the population, resulting in the next population. I don't have much more to say about that at the moment.
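Still, even that much suggests a shape (purely a sketch; the monad here is whatever stack ends up carrying the state and randomness): a generation is a pipeline of population transformations, and a run iterates it.

```haskell
import Control.Monad ((>=>), foldM)

-- A generation applies its operators (selection, crossover, mutation,
-- ...) in sequence, each taking a population to a new population.
generation :: Monad m => [pop -> m pop] -> pop -> m pop
generation ops = foldr (>=>) return ops

-- A run is n generations in a row; this only captures the usual
-- all-at-once generational style, not the incremental CGA mentioned above.
run :: Monad m => Int -> (pop -> m pop) -> pop -> m pop
run n gen p0 = foldM (\p _ -> gen p) p0 [1 .. n]
```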

Sorry for the rant here, but I needed to get down the ideas I've been playing with while trying to come up with something we can generalize and turn into a nice, usable EA library. I may just have to pick some concrete behaviors and say, "this is the type of EA this library supports," and hope that it will be usable as well as modifiable, abstract as well as concrete.

1 comment:

  1. I was looking over this post, and I felt the need to mention that I'm aware that not all AI techniques are stochastic- especially some of the symbolic AI techniques. I'm more interested in sub-symbolic AI, and in techniques that fall under the classification "stochastic meta-heuristics."
