Thursday, December 29, 2011

The Sizes of Data Types

I noticed today that when considering the sizes of data types (in C in this case) we get a morphism from the algebra of types to a positive semiring of sizes.

The tropical semiring is given by (R U {*}, min, *, +, 0), with the underlying set the reals extended with infinity (written here as *): the semiring "addition" is min, with identity *, and the "multiplication" is ordinary +, with identity 0. If we take max instead of min, and take negative infinity as its identity, we get a max-plus algebra. In this case I'm going to restrict us to the regular old natural numbers, as all sizes of things are whole, and take 0 instead of negative infinity (so it is not much like a tropical semiring at all, but this was the source of my thought process). The morphism from algebraic types into this algebraic structure is straightforward: for every primitive type we have some natural number of bytes, or words, or bits, or whatever we want to measure (holes?), and for types M and N, size(NxM) = size(N) + size(M) and size(N+M) = max(size(N), size(M)), just as products (structs) and sums (unions) are represented in C. Assuming a Void type, size(Void) = 0, which works okay, and for a Unit type size(Unit) = 1.
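As a quick sanity check, C's structs realize the product (sum-of-sizes) rule and its untagged unions the sum (max-of-sizes) rule. This is only a sketch: real compilers may insert padding for alignment, so the equalities hold for these particular field choices rather than in general.

```c
#include <stddef.h>

/* A product type: its size is the sum of the field sizes
 * (no padding is needed here since both fields are chars). */
struct pair { char a; char b; };

/* A sum type without a tag: its size is the max of the alternatives
 * (rounded up to the strictest member alignment, which max already is here). */
union either { char a; int b; };
```

On a typical platform `sizeof(struct pair)` is 2 and `sizeof(union either)` is `sizeof(int)`, matching size(NxM) = size(N) + size(M) and size(N+M) = max(size(N), size(M)).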

Pointers in C are all represented by the same size, so there doesn't seem to be a need for a star (pointer) operator in this structure. Instead, I suppose we can just embed all pointer types as primitives, or specify that the star operator in the type system acts specially with respect to the size morphism.

This is pretty trivial (and probably wrong), but I wanted to record it as I like to think about the shapes of things and the properties of shapes.

Tuesday, December 27, 2011

Efficient Mutation for Genetic Algorithms

In the next couple of posts I'm going to investigate some ways of implementing mutation in a Genetic Algorithm (or really, any Evolutionary Algorithm). Let's start off with some well known and widely used ones, and leave the more interesting ones (which are probably common knowledge, but which I've not seen used) for later posts.

The naive and most general way to implement mutation is to generate a number in [0, 1] for each location in each individual in the population, and mutate the location if the number is less than some preset parameter (the probability of mutation). This is general in the sense that no matter the content of each location, this will tell you whether or not to perform some form of mutation operation on it. Notice that while this will certainly work on a bit vector, it may be very slow, as for a population of size m with individuals having n bits we need m*n random numbers.
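A minimal sketch of this naive scheme in C. The function name is my own, `rand` is used purely for simplicity, and each bit is stored as one byte for clarity rather than efficiency:

```c
#include <stdlib.h>

/* Naive mutation: draw one uniform number in [0, 1) per location and
 * flip the bit when the draw falls below the mutation rate.  For a
 * population of m individuals of n bits this costs m*n RNG calls. */
void mutate_naive(unsigned char *bits, size_t n, double rate) {
    for (size_t i = 0; i < n; i++) {
        double r = (double)rand() / ((double)RAND_MAX + 1.0);
        if (r < rate)
            bits[i] ^= 1;   /* flip this location */
    }
}
```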

Another technique I've seen is to simply find the expected number of mutations, and change that many locations. This involves generating exactly that number of random locations (probability of mutation times number of locations) and mutating only those locations. This is also general in the same sense, but it does not seem morally correct: with the first method there is a chance that no mutations will occur, and a chance that every single location will mutate. The chances are small, but the point of mutation is small random change, and I just don't feel right making it so deterministic, despite the increased performance (assuming a small mutation rate and a large number of locations). To be honest I was a little appalled the first time I saw this used. You can make it slightly better by spreading the changes over the whole population, as each individual will then experience something closer to the intended probability of mutation. For example, for a population of size 1000 with individuals of length 100 and probability of mutation 0.001 (0.1 percent), we get 1000*100*0.001 = 100 mutations spread over the whole population, meaning we need only generate 100 locations to mutate by generating numbers between 0 and n*m. We could also randomly choose individuals and randomly choose locations within those individuals, in which case we need 200 random numbers.
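The expected-count variant can be sketched over a population stored as one flat array (names are mine; one byte per bit again, and `rand() % (m*n)` plays the role of the single number between 0 and n*m mentioned above):

```c
#include <stdlib.h>

/* Expected-count mutation: compute the expected number of mutations
 * over the whole population (rate * m * n) and flip exactly that many
 * randomly chosen locations.  pop holds m individuals of n bits each,
 * laid out contiguously.  Note a location may be drawn twice, in
 * which case the two flips cancel. */
void mutate_expected(unsigned char *pop, size_t m, size_t n, double rate) {
    size_t flips = (size_t)(rate * (double)(m * n));
    for (size_t i = 0; i < flips; i++) {
        size_t loc = (size_t)rand() % (m * n);  /* one number per mutation */
        pop[loc] ^= 1;
    }
}
```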

Okay, those were the two most common, simplest ways to do mutation. I can think of only two other ways, and I will go over those soon.

Wednesday, December 21, 2011

Systems Programming

I've come to a realization about programming recently which makes me appreciate my job all the more. What I've realized is that the type of programming we do there isn't about abstractions (which are helpful nonetheless) but rather about hardware. This is programming not for an abstract machine with a clear denotational semantics, but for the semantics of the actual hardware sitting in front of you: the execution on one particular machine, and maybe a small class of similar machines. Some parts of the program may be so specialized that they will only ever work on your particular project because of unique hardware or some special case that you didn't expect. In this context, C is actually kind of nice. It isn't safe at all, and maybe another language could do better, but the ability to be unrestricted in what you do is a necessity. It would be nice to have more safety here, and I'm interested in that concept and reading some papers on type-safe systems programming, but this kind of programming has a clarity of purpose that I appreciate.

This has led me to realize that my problem with Object Oriented Programming isn't so much the idea, which is mostly about a class of data structures and their relationships with each other, but rather the practice, where I see abstractions that have no foundation in math, that leak, that are poorly conceived and ambiguous, and whose authors don't have access to the powerful tools of mathematics and computer science. I'm over-generalizing of course, and I'm not saying that math solves everything, but it is no doubt powerful, and it at least gives you a leg to stand on when you program. Starting with something that is well understood and composes well lets you build something understandable that may compose, but starting with something that is not well understood and does not connect or compose at the primitive level does not give much hope of building reusable software at the abstract level.
So, that's my rant for today!

Tuesday, December 20, 2011

Haskell News Quotes

Some quotes from the quote of the week on #haskell, posted on Contemplating Code.

hpc: and when you want to get really ridiculous, agda has universe polymorphism, which is when you don't know how high up the tower of turtles you have managed to climb
RaptorRarr: Analogy puzzles are to intelligence as __________ is to elephants
[djahandarie] Group projects are stupid [shachaf] Try a semigroup project sometime. You need to lose your identity.
heatsink: Maybe, (), and Bool go to the Lone Star Bar. The bouncer stops Maybe and says, "we don't serve your kind here."
Jafet: Can oleg create a term so complicated that even he could not type-check it?
elliott: i'm here to prove theorems and compile code and I'm all out of code
kmc: my algorithm is O(1), but the constant factor grows quadratically as the problem size increases
luqui: Haskell is a DSL for writing monad tutorials.
spoolio: Haskell programmers know the cost of nothing, the type of everything, and might know the value of a few things later.
James Cook: I think Haskell questions on SO tend to the opposite extreme; no matter how poorly thought-out the question, the Haskell community will descend on it like a swarm of helpful piranhas.
djahandarie: Category theorists are morphisms for turning cofree theorems into free theorems.
monochrom: the road to haskell is paved with good abstractions
ddarius: Attempting to join #not-not-math sent me to #math. Freakin' Boole.
monochrom: @let germanize = concat . words

Genetic Algorithms Through Structures

I've noticed that it is possible to describe the genetic operators of Genetic Algorithms (and other evolutionary algorithms) through simple data structures. There does not seem to be any advantage in speed or even simplicity, but I like to think about GAs in different ways and try to decouple all the parts from each other, and this is a way to decouple the randomness from the action of a genetic operator.

First off: mutation. There are many ways to do mutation, but the simplest is to test each symbol of each individual, and with a certain probability replace that symbol with a random one from the symbol set; in particular you often will simply flip a bit. The knowledge needed to change a symbol is just whether or not you need to change it, so a boolean contains enough information to make this decision. For an individual of some structure, that same structure containing boolean values is enough to perform mutation. So given, say, a vector of bits (the individual) and another vector of bits (the mutations), we can deterministically flip the bits in the individual that correspond to the 1s in the mutation vector (by xoring them, in this case).
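The xor trick can be sketched in C (names are mine; one byte per bit for clarity):

```c
#include <stddef.h>

/* Mutation decoupled from randomness: the mask has the same shape as
 * the individual, and a 1 marks a location to flip.  Applying it is a
 * deterministic xor; how the mask was generated, and with what
 * probability, is entirely someone else's problem. */
void apply_mutation(unsigned char *individual, const unsigned char *mask, size_t n) {
    for (size_t i = 0; i < n; i++)
        individual[i] ^= mask[i];
}
```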

For crossover all we need is a single number (for one-point crossover) to indicate the cut point. To cross over a whole population we need a structure containing numbers and a structure containing pairs of individuals, and we can create a structure containing paired individuals that are crossed. We can then unpair them and get a population. We could also create some selection structure, like pairs of indices into the original population indicating who to cross. For uniform crossover, or crossover with multiple individuals, we could create a list of individuals and a list of numbers, where each number indicates which individual each position should be filled from. In other words, each index's symbol is chosen from that index in the individual indicated by the number in our crossover number list.
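One-point crossover taking the cut point as plain data might look like this (a sketch with hypothetical names):

```c
#include <stddef.h>

/* One-point crossover decoupled from randomness: the cut point is just
 * a number handed in from outside.  Symbols before the cut stay put;
 * symbols from the cut onward are swapped between the two parents. */
void crossover_one_point(unsigned char *a, unsigned char *b, size_t n, size_t cut) {
    for (size_t i = cut; i < n; i++) {
        unsigned char t = a[i];
        a[i] = b[i];
        b[i] = t;
    }
}
```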

Selection is no harder than any other mechanism. We can do tournament selection with a structure of pairs of numbers: the indices into the population from which to pick the individuals for the tournament. Notice that this means that, as with all the other operators, the genetic operator structure can be determined without looking at the population at all. These pairs would have to be tagged with a boolean indicating whether the more or less fit individual should be chosen when the selection structure is applied.
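A tournament as plain data could be sketched like so (hypothetical names; the structure is generated up front, and applying it against the fitness values is deterministic):

```c
#include <stddef.h>

/* A tournament is a pair of indices into the population plus a flag
 * saying whether the fitter or the less fit competitor wins. */
struct tournament {
    size_t i, j;   /* competitors, as indices into the population */
    int fitter;    /* nonzero: the fitter one wins; zero: the less fit one */
};

/* Apply one tournament, returning the index of the winner. */
size_t run_tournament(const struct tournament *t, const double *fitness) {
    int i_wins = fitness[t->i] >= fitness[t->j];
    if (!t->fitter)
        i_wins = !i_wins;
    return i_wins ? t->i : t->j;
}
```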

In RGEP and PGEP there is a genetic operator called rotation, which is basically a shift with wraparound. Clearly this can be indicated by the number of positions to shift by, which is a single natural number.
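As data, rotation is just that one shift amount; a sketch (with a fixed scratch buffer for brevity, and the usual modulo so any natural number works):

```c
#include <stddef.h>
#include <string.h>

/* Rotation: a left shift with wraparound, fully described by a single
 * natural number k.  Assumes n <= 256 for this sketch's scratch space. */
void rotate_left(unsigned char *ind, size_t n, size_t k) {
    unsigned char tmp[256];
    k %= n;                        /* shifting by n is the identity */
    memcpy(tmp, ind + k, n - k);   /* the tail wraps around to the front */
    memcpy(tmp + (n - k), ind, k); /* the prefix moves to the back */
    memcpy(ind, tmp, n);
}
```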

So those are the ones I can think of right now, but it does seem like any genetic operator can be decomposed into a random part, which generates some data, and a deterministic part. Given the structure generated and the population, you can perform the operator without generating any random numbers.

Sunday, December 18, 2011

Forth Interpreters and the C Stack

I haven't been working on my Forth project for a while, but I'm coming back around to it recently, and there are some things I want to change. One of these things is to start using a variation on a technique I saw in the paper "Implementing lazy functional languages on stock hardware: the Spineless Tagless G-machine". In this paper they give a single-line interpreter for their machine: "while (TRUE) { cont = (*cont)(); }", which simply calls the function pointer cont and expects the function to return a new function pointer. This has the advantage of cleaning up the C stack after every function call, preventing it from overflowing or from having to be thrown away occasionally.
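A tiny concrete version of that loop, with hypothetical names (the cast between a function pointer and void* is not strictly portable C, but it is the usual workaround for a function type that would otherwise have to return its own type; here the loop is also given a stop condition, where the paper's runs forever):

```c
#include <stddef.h>

/* Each "instruction" is a function returning the next function to run.
 * The trampoline just keeps calling; the C stack is popped after every
 * call, so it never grows no matter how long the program runs. */
typedef void *(*cont_fn)(void);

int steps = 0;                              /* just to observe the loop */

void *halt(void) { return NULL; }           /* NULL ends the loop */
void *step(void) { steps++; return (void *)halt; }

void run(cont_fn cont) {
    while (cont != NULL)
        cont = (cont_fn)(*cont)();
}
```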

The plan as of right now is that instead of each word returning the next thing to call, the looping interpreter will simply do the action of "next": it will copy the current IP (Instruction Pointer) onto the return stack, copy W into IP, increment IP, and call the function W points to. All functions will return nothing and the threading will be taken care of for them; they do not each have to end with the code of "next" as I've done in the past.
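A simplified sketch of such an inner loop, for primitives only (hypothetical names; the return-stack bookkeeping for entering colon definitions is left out to keep the shape visible):

```c
#include <stddef.h>

/* A thread is an array of pointers to primitives.  The loop itself
 * performs "next": fetch the next word through IP, advance IP, and
 * call it.  Primitives return nothing and never re-enter the loop. */
typedef void (*prim)(void);

prim *ip;          /* instruction pointer into the current thread */
int running;
int ticks = 0;     /* a counter, just so a primitive has an effect */

void tick(void) { ticks++; }
void bye(void)  { running = 0; }   /* primitive that stops the machine */

void interpret(prim *thread) {
    ip = thread;
    running = 1;
    while (running) {
        prim w = *ip++;   /* W = *IP; IP++ */
        w();              /* call the code W points to */
    }
}
```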

I'm also going to change how the original dictionary is compiled. Before, I was doing some pretty tedious manipulation of an array of cells (integers) for some reason. Instead of this, I will use a word struct of fixed length (the name will be a pointer to a character string rather than the string itself, which makes finding the xt of a word easier). I don't know why I didn't do this before, but it should be simpler and cleaner now.
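Such a fixed-length word header might look like the following (a sketch; the field names are my guesses, not the project's actual layout). Because the name is a pointer rather than inline characters, every header is the same size and lookup is a plain pointer walk:

```c
#include <stddef.h>
#include <string.h>

/* A fixed-size dictionary entry: name out-of-line, link to the
 * previous word, and the execution token (xt) at a fixed offset. */
struct word {
    const char *name;       /* pointer to the name, not the characters themselves */
    struct word *link;      /* previous word in the dictionary */
    void (*code)(void);     /* the execution token (xt) */
};

void noop(void) {}          /* a placeholder code field for the sketch */

/* Finding a word's xt is just a walk down the links and a strcmp. */
void (*find_xt(struct word *latest, const char *name))(void) {
    for (struct word *w = latest; w != NULL; w = w->link)
        if (strcmp(w->name, name) == 0)
            return w->code;
    return NULL;
}
```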

I've been thinking about what I'm going to do for structures in this Forth. I don't particularly want to use objects, so I'm going to be thinking about how to do reasonable algebraic datatypes in Forth. It will probably involve runtime checking of types. I've devised a scheme for doing classes in the Haskell sense, which I will post about, but it's in the planning stage and I may abandon the idea at any moment.

I'm looking forward to cleaning up the implementation and getting it actually running so I can start exposing hardware functionality and adding threading, interrupts, etc.