This post describes a fairly efficient way to do Genetic Algorithms in an imperative language. It's much like the pure approach I talked about in an earlier post (using Haskell's Sequence and Vector types), except that it should be straightforward to implement with destructive updates. It's a little harder to say exactly what the complexity is in practice, as it depends on how close the population is to convergence. However, even in the worst case it should be a good bit faster than a naive implementation. Note that this assumes the usual genetic operators and a population of linear individuals.

The population is represented as a collection (an array, for example) of complete 2-trees: binary trees in which every level is full except possibly the last, and the nodes on the last level are as far left as possible. The internal nodes simply contain pointers to their children, and only the leaves contain values (the elements of the individual). The shape of a tree never changes, so it can be allocated as an array of pairs of pointers to tree nodes. The genetic material itself can live in a separate array or be held directly in the leaves, as long as the leaves are easy to access as an array. We will see why we need each piece of this setup soon.
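As a concrete sketch of this layout (the class and function names here are my own, not from any particular implementation), each internal node can store a precomputed range of the leaf indices beneath it, which crossover will use later:

```python
# A sketch of one individual's tree: internal nodes hold only child
# pointers plus a precomputed leaf-index range; leaves index into a
# flat gene array held elsewhere.
class Node:
    __slots__ = ("lo", "hi", "left", "right")

    def __init__(self, lo, hi, left=None, right=None):
        self.lo, self.hi = lo, hi        # leaf indices [lo, hi) under this node
        self.left, self.right = left, right

def build(lo, hi):
    """Build a complete 2-tree over leaf indices lo..hi-1.
    A leaf is a node with hi - lo == 1; its lo indexes the gene array."""
    if hi - lo == 1:
        return Node(lo, hi)
    mid = (lo + hi + 1) // 2             # left-heavy split keeps the tree complete
    return Node(lo, hi, build(lo, mid), build(mid, hi))
```

The left-heavy split puts any nodes on the last level as far left as possible, matching the "complete" shape described above.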

Mutation is easy, as long as the locations can be accessed as a single array (or 2D array, or anything with constant-time access). As I've described before, rather than testing each location, simply calculate the number of failures before the next success, skip that many elements moving left to right through the population, and mutate the location you land on. This calculation requires sampling from the geometric distribution describing failed mutation attempts before the next successful one. The only small trick is to use an efficient sampling algorithm (Vose's alias algorithm would be my choice) and to lump the probabilities of all indices beyond the end of the population into one. For example, if there are 10 indices in total, then 0-9 are given by the geometric distribution, and the 10th gets the sum of all probabilities from 10 out to infinity. If you generate something past the range of locations, you are done with mutation; this turns the infinite geometric distribution into a finite one.
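A minimal sketch of the skip-ahead idea (names are my own; for brevity this samples the geometric distribution by inversion rather than the alias method, and assumes a mutation rate strictly between 0 and 1):

```python
import math
import random

def mutate_population(genes, p, flip):
    """Skip-ahead mutation over a flat gene array.

    Instead of testing every position against rate p, sample how many
    positions fail before the next success (a geometric distribution)
    and jump straight there.  Landing past the end means we are done,
    which is exactly the lumped "past the population" outcome.
    Assumes 0 < p < 1."""
    log_q = math.log(1.0 - p)
    i = -1
    while True:
        u = 1.0 - random.random()             # uniform in (0, 1]
        i += int(math.log(u) / log_q) + 1     # failures before next success
        if i >= len(genes):
            return                            # jumped past the end: finished
        genes[i] = flip(genes[i])
```

With a low mutation rate this touches only the mutated positions, so the expected work per generation is proportional to the number of mutations rather than the number of genes.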

Crossover is where we make use of the tree structure. This works for essentially the same reason the Sequence type in Haskell works here, although its implementation is simpler. It is slightly tricky to explain, but it isn't really very complex. First off, note that we can usually treat individuals that are adjacent in the population as our random pairs, since selection usually picks individuals out at random, so they are already shuffled. If selection does not shuffle them, we have to do it ourselves, which only takes O(n) time (shuffle a list of 0..n-1 and pair up adjacent locations in this list).
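The pairing step, as a quick sketch (function name is mine):

```python
import random

def crossover_pairs(n):
    """Pair individuals for crossover: shuffle the indices 0..n-1 and
    take adjacent entries as pairs.  O(n) overall; with an odd n the
    last individual simply sits this round out."""
    idx = list(range(n))
    random.shuffle(idx)
    return [(idx[i], idx[i + 1]) for i in range(0, n - 1, 2)]
```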

Now we want to actually do crossover: we have our pair of individuals and a crossover point chosen. I'm only going to describe 1-point crossover, but 2-point or n-point would be essentially the same. What we want is to swap child pointers in the two trees representing our individuals such that the ranges of indices from 0 to c (the cross point) and from c to m (the end of the individuals) are exchanged. This proceeds as follows: traverse the two trees in lockstep from their roots, checking whether the cross point c is less than or greater than the midpoint of the range of leaf indices under the current node; if less, go to the left child, and if greater, go to the right child. (This range information can be precalculated and stored at the internal nodes.) When moving down a level to the left, simply follow the left pointer/reference. When moving to the right, first swap the left-child pointers of the current nodes between the two trees, then recurse on the right children. In other words, we swap all pointers on the left as we move down. If you reach a leaf node, swap the pointers to that node in the internal nodes that point to it.

This process is again hard to describe, but it is genuinely easy, especially if you look at something like http://www.haskell.org/haskellwiki/Zipper. The nice thing here is that crossover becomes a handful of pointer updates, requiring only logarithmic time in the size of the individuals.
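Here is a self-contained sketch of the whole thing (class, field, and function names are my own). Leaves carry a pointer into whichever gene array their value lives in, so after a swap a tree's leaves may point into the other individual's storage; reading an individual is an in-order walk:

```python
# Sketch of 1-point crossover as pointer surgery on two same-shaped trees.
class Node:
    __slots__ = ("lo", "hi", "left", "right", "store", "offset")

    def __init__(self, lo, hi, left=None, right=None, store=None, offset=None):
        self.lo, self.hi = lo, hi                # logical index range [lo, hi)
        self.left, self.right = left, right
        self.store, self.offset = store, offset  # leaf only: where the gene lives

def build(genes, lo=0, hi=None):
    """Complete 2-tree whose leaves point into the gene array `genes`."""
    if hi is None:
        hi = len(genes)
    if hi - lo == 1:
        return Node(lo, hi, store=genes, offset=lo)
    mid = (lo + hi + 1) // 2                     # left-heavy split keeps it complete
    return Node(lo, hi, left=build(genes, lo, mid), right=build(genes, mid, hi))

def crossover(a, b, c):
    """Exchange logical gene positions [0, c) between trees a and b,
    assuming identical shape and 0 < c < length.  O(log m) pointer swaps."""
    while a.left is not None:
        mid = a.left.hi                          # midpoint stored at the children
        if c == mid:                             # prefix is exactly the left subtree
            a.left, b.left = b.left, a.left
            return
        if c < mid:                              # cross point in the left half
            a, b = a.left, b.left
        else:                                    # left half lies fully in the prefix
            a.left, b.left = b.left, a.left
            a, b = a.right, b.right

def read(node, out):
    """In-order leaf walk: linearize an individual for evaluation."""
    if node.left is None:
        out.append(node.store[node.offset])
    else:
        read(node.left, out)
        read(node.right, out)
    return out
```

For example, crossing `"abcde"` with `"ABCDE"` at point 2 leaves the first tree reading `A B c d e` and the second `a b C D E`, with only the pointers changed, not the gene arrays.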

Finally we get to selection. Selection isn't really any different than usual here: we can do tournament selection and roulette wheel selection efficiently, as I've discussed in previous posts. There is, however, the problem of indexing into an individual. While for mutation we could treat the population as one large array, regardless of which individual a location belongs to, when inspecting a particular individual we need to traverse its tree to find the actual location of an index, as the locations have been mixed up through crossover. Accessing a single location takes logarithmic time, but accessing all locations one after the next takes only linear time in total, so we may want to copy the individual off somewhere for evaluation.
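The copy-off step is just an in-order walk over the leaves, for example (a sketch with made-up names; any node with `left`/`right` child pointers and leaf values works):

```python
class Node:
    """Minimal tree node: internal nodes have children, leaves have a value."""
    def __init__(self, left=None, right=None, value=None):
        self.left, self.right, self.value = left, right, value

def linearize(root):
    """Collect an individual's genes left to right in O(m) total,
    even though a single random access from the root costs O(log m)."""
    out, stack = [], [root]
    while stack:
        node = stack.pop()
        if node.left is None:
            out.append(node.value)      # leaf: emit its gene
        else:
            stack.append(node.right)    # push right first so left pops first
            stack.append(node.left)
    return out
```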

The only other optimization I've considered is to do the replacement of winners over losers in place, leaving winners where they are in the population. We might do this as follows: assume we have the individuals in an array, and we know for each individual how many times it is to be represented in the next generation. We move two pointers through the population, one keeping track of the current individual with more than one representative, and the other pointing at the next individual with zero representatives in the next generation. Simply copy the winning individual over the losing one and decrement the representation count. This is repeated until the winning individual has a count of one, and we then move on to the next repeated winner, until we reach the end of the population. This doesn't save a huge amount of copying, except perhaps in a mostly converged population using roulette wheel selection, where many individuals will have some representation in the next generation.
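The two-pointer replacement might look like this (a sketch under the assumption that `counts` sums to the population size; names are mine):

```python
def replace_in_place(pop, counts):
    """In-place replacement: counts[i] is how many copies individual i
    gets in the next generation.  Each extra copy of a winner overwrites
    an individual with a count of zero; winners never move."""
    loser = 0                             # next slot with zero representatives
    for winner in range(len(pop)):
        while counts[winner] > 1:         # winner appears more than once
            while counts[loser] != 0:     # advance to the next empty slot
                loser += 1
            pop[loser] = pop[winner][:]   # copy the winner's genes over
            counts[loser] = 1
            counts[winner] -= 1
    return pop
```

The `loser` pointer only ever moves forward, since an overwritten slot immediately gets a count of one, so the whole pass is linear plus the cost of the copies themselves.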