My Haskell implementation of RGEP (Robust Gene Expression Programming) is going really well! It's called Heal, the Haskell Evolutionary Algorithm Library, and it's the latest iteration in my quest for a research library where I can change anything I want at any time to test new algorithms, genetic operators, and so on.
I've decided to change everything to vectors, hoping to get some performance improvements. While I don't have any direct comparisons, I just ran a population of 100 individuals, each with 128 codons of length 3. That is a largish dataset, and if I remember correctly my Java implementation took about 24 hours to do similar work (note that that was a pretty poor implementation on my part). My new library is about a third the size, supports PGEP, RGEP, and GAs with an easy way to add more, and ran in 6 minutes (giving me a pretty poor answer to the symbolic regression problem I'm testing with, but that's not Haskell's fault). Wow! I also had the profiler running, and a full 87% of the time was spent garbage collecting, leaving only about 40 seconds for my actual code. The program allocated a total of 21 gigabytes over the entire run.
Now, I obviously must do something about this, because that is an absurd amount of time in GC. The problem is that I don't really know how to do deeper profiling in Haskell, and I'm having trouble with the most direct approach (running the executable with +RTS -p). So far the "shotgun" approach of adding strictness annotations is not working. Since every individual needs to be fully evaluated every generation, laziness only helps in a few places, and I don't really need it for the main population. I don't think throwing in bangs will hurt, but so far it hasn't helped at all. In fact, with only 8 megabytes for the stack the program overflows it, so I gave it 15 and it runs fine.
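For anyone wondering what "throwing in bangs" looks like, here is a minimal sketch. It is not Heal's actual code: the names `Fitness`, `Acc`, and `meanFitness` are made up for illustration, and I'm assuming a step that averages fitness over the population. The idea is that a strict accumulator plus a strict left fold never builds a chain of unevaluated thunks, which is the usual cause of the kind of stack overflow described above.

```haskell
module Main where

import Data.List (foldl')

-- Hypothetical stand-in for an individual's fitness (not Heal's real type).
type Fitness = Double

-- Accumulator with strict fields: the bangs force the running sum and
-- count whenever an Acc is constructed, so no thunks pile up inside it.
data Acc = Acc !Double !Int

-- Mean fitness via a strict left fold. foldl' evaluates the accumulator
-- to weak head normal form at each step, and the strict fields then
-- force the sum and the count as well.
meanFitness :: [Fitness] -> Double
meanFitness xs =
  case foldl' step (Acc 0 0) xs of
    Acc _ 0 -> 0                      -- empty population: define the mean as 0
    Acc s n -> s / fromIntegral n
  where
    step (Acc s n) x = Acc (s + x) (n + 1)

main :: IO ()
main = print (meanFitness [1, 2, 3, 4])
```

With a lazy `foldl` and a plain tuple as the accumulator, the same computation would defer all the additions until the very end. Whether this pattern is what my code actually needs is exactly what the profiler should tell me.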
In addition to being much faster, this library is much nicer looking (though I need to clean it up pretty badly), and I have delighted in using the various features of Haskell. I'm not even talking about the advanced features; the normal everyday ones are amazing.
On to profiling! I really want to see where all the time is being taken up. I feel like there is a lot of opportunity for improvement in time, space, and in just general style and organization.