Saturday, January 29, 2011
A Guide to High-Speed Mathematics
Two of my friends who know I love math gave me an old (1959) book on high-speed mathematics. Some of it I was taught in school and some I want to start using, but one technique in particular struck me as odd. They call it the digit-sum check, and it verifies your work for addition, subtraction, multiplication, and division: take the digit-sum of each number and do the problem with those as well as with the original numbers- the digit-sum of the answer should equal the answer computed from the digit-sums (for the mathematically inclined this may tip you off about where I'm going with this- it sounds an awful lot like a homomorphism). To take the digit-sum of a natural number, add its digits, and if the result ever has more than one digit, apply the digit-sum to it again before proceeding. Another way to say this is that the digit-sum is the sum of the digits modulo 9. Notice that there are only nine possible values, 0 through 8 (9 acts exactly like 0), so there is a 1/9 chance (all else being equal) that a mistake will go undetected (more on this later).
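To make the check concrete, here is a small Haskell sketch (my own, not from the book) that computes digit-sums and uses them to check a multiplication:

    -- Digit-sum: add the digits, repeating until a single digit remains.
    digitSum :: Integer -> Integer
    digitSum n
      | n < 10    = n
      | otherwise = digitSum (sumDigits n)
      where sumDigits 0 = 0
            sumDigits m = m `mod` 10 + sumDigits (m `div` 10)

    -- The digit-sum of the claimed answer should equal the digit-sum of the
    -- product of the digit-sums.
    checkMul :: Integer -> Integer -> Integer -> Bool
    checkMul a b claimed = digitSum (digitSum a * digitSum b) == digitSum claimed

    -- checkMul 123 456 56088  == True   (123 * 456 really is 56088)
    -- checkMul 123 456 56098  == False  (the error is caught)
    -- checkMul 123 456 56178  == True   (an undetected error: same digit-sum)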
Monday, January 24, 2011
Reverse Polish Gene Expression Programming
I've recently realized that Robust Gene Expression Programming could just as well be called Reverse Polish Gene Expression Programming. When I read the original PGEP paper I immediately wanted to define my own GEP using postfix notation, but I found that there were problems that made it messier than I would like. I gave up on that idea and instead started designing a GEP that was more to my liking, ending up with RGEP. During this process I came up with an editing scheme to correct invalid individuals so that they could be evaluated in prefix notation even if they did not encode an expression tree. At some point I realized that the most direct way of performing this edit is to cut off the non-coding region (if it exists) and evaluate the individual as a postfix expression, ignoring any operator that would cause a stack underflow.
By ignoring such operators we are sure to end up in one of three cases: an empty stack (very unlikely, but possible), a singleton stack, or a stack of depth greater than 1. The last case can only happen if at least one operator takes more than 2 arguments, which is fairly uncommon. In that case we simply take the top of the stack as the result.
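A minimal sketch of this evaluation in Haskell (the symbol type and the names are my own, purely for illustration): operators whose arguments are not all on the stack are skipped, and the top of whatever stack remains is the result.

    -- A symbol is either a terminal (a value) or an operator with an arity.
    data Sym a = Term a | Op Int ([a] -> a)

    -- Postfix evaluation that skips any operator that would underflow the stack.
    evalPostfix :: [Sym a] -> Maybe a
    evalPostfix = top . foldl step []
      where
        step stack (Term x) = x : stack
        step stack (Op arity f)
          | length stack >= arity = f (take arity stack) : drop arity stack
          | otherwise             = stack       -- would underflow: ignore it
        top []      = Nothing                   -- empty stack (rare but possible)
        top (x : _) = Just x

    -- "3 4 + 5 *" gives Just 35; "+ 3 4 +" skips the leading '+' and gives Just 7.
    -- evalPostfix [Term 3, Term 4, Op 2 sum, Term 5, Op 2 product] == Just 35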
I happen to like postfix notation from my experience with Forth, and it is a niche not explored (as far as I know) in GEP literature, so I'm glad to have found a reasonable (and even well justified) way to perform a postfix evaluation in GEP.
In my thesis I suggest that a GEP which uses this notation but not the rest of RGEP be called RPNGEP. I think this is reasonable, as I'm making a lot of changes and some people might like the notation but not the rest. I would love it if someone took me up on this and used it in a paper.
Tuesday, January 11, 2011
Friday, January 7, 2011
PICs Are Cool
Low level computing can be a lot of fun. One of my favorite personal projects was a Forth interpreter in Z80 assembly for my TI-83 calculator. It was 2000 lines of assembly and had a working interpreter, so you could program directly on the calculator, as well as a way of loading text programs into the calculator and interpreting them. I wrote several programs for displaying cellular automata, like Langton's ant and elementary rules such as 184 and 90. I've wanted to get back to it, either to make it more stable and polish it up, or to write a compiler from Forth to Z80 assembly, so I could just load programs and run them without an interpreter (which was more fun than useful).
Instead of going back to that, my friend who is an electrical engineer and does cool micro-controller projects has inspired me to buy a PICkit 2 and a little micro-controller to play around with. It comes with a little test board, so I can do cool things like light up LEDs.
I'm not terribly good with physical things, though I would love to learn more, so one of my main interests is in programming the device. I would like to write a Forth compiler in Haskell that produces assembly which can be assembled and loaded onto the device. These things are so resource constrained (37 instructions total, 4KB of flash memory, 256 bytes of RAM partially reserved for special-purpose registers- not a lot to work with) that I would be providing a layer above assembly rather than a real threaded, interpreted Forth system. This could be really cool if I pull it off with any success.
I hope to post more about my experience as I learn about this little device. Maybe my friend will even learn a little Forth and use it for his projects! It is certainly nicer than C or assembly!
Wednesday, January 5, 2011
Duality in Computer Science
In category theory every construction has a dual which is just as valid a structure as the original. The dual object is constructed by reversing the arrows in the diagram describing the original construction. This is very rigorous and well defined, and sometimes cuts the work in half as every time you go through the effort to describe something you have also described its dual. It is also nice as it makes explicit the relationship between certain concepts- like the difference between Cartesian products and disjoint unions in set theory.
I've seen some interesting dualities in certain programming concepts. First, a disclaimer- I don't think these dualities are always valid or rigorous, and I don't like the idea of oversimplifying what are otherwise rich and complex concepts by lumping them into opposing sides. With that said, some of these are instructive and fun.
The first is OOP and FP. This video mentions the duality I'm talking about: http://channel9.msdn.com/Shows/Going+Deep/C9-Lectures-Dr-Ralf-Laemmel-Advanced-Functional-Programming-The-Expression-Problem. In FP (if we consider algebraic data types an FP-specific thing) it is very easy to add operations that manipulate the data but hard to add new data types, while in OOP it is easy to add new data (by subclassing) but hard to add new operations. Both sides can solve their problem with more advanced techniques, but it is interesting that, at least naively, they are opposites.
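To make the naive version concrete, here is a tiny Haskell sketch (my own example, not from the lecture): with an algebraic data type, adding a new operation is just another function, but adding a new variant means touching every existing function.

    -- The data type fixes the set of variants up front.
    data Shape = Circle Double | Rect Double Double

    -- Adding a new operation is easy: write another function over Shape.
    area :: Shape -> Double
    area (Circle r) = pi * r * r
    area (Rect w h) = w * h

    perimeter :: Shape -> Double
    perimeter (Circle r) = 2 * pi * r
    perimeter (Rect w h) = 2 * (w + h)

    -- Adding a new variant (say, Triangle) would mean editing Shape and every
    -- function that matches on it, whereas in OOP a new subclass brings its own
    -- area and perimeter but a new operation touches every existing class.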
Another OOP vs FP duality is that of abstraction and encapsulation (using certain definitions of the two), as found here: http://lukepalmer.wordpress.com/2010/11/23/encapsulation-considered-harmful/, where abstraction is considered more FP-like and encapsulation more OOP-like (at least in practice).
In OOP the data (objects) are complex and able to perform tasks (they are smart), while the control flow is typically just the usual structures like conditionals, iteration, etc. In FP the data is not so smart- it is simple and easy to describe, build up, and take apart, but it has no idea how to manipulate itself. The control flow is of course where the power is, and there are all sorts of inexpensive abstractions available for manipulating the data and creating arbitrarily complex control structures.
Another is Forth vs Lisp. Both are homoiconic languages that easily allow EDSLs (embedded domain specific languages), and both are fundamentally very simple. The differences are pretty interesting though- in the video found here http://archive.forthworks.com/ots/ots-01.mpg?torrent the speaker says that in Lisp "Code is Data", which is certainly true as we are able to manipulate code as symbol lists, giving rise to macros (made possible because in Lisp we program the abstract syntax tree directly). On the other hand, he says that in Forth "Data is Code", in that we can write config files and data files that are actually executable code, and Forth is an untyped language, so there is really no difference between the data and the code. Another, kind of silly, duality here is that in Lisp everything is in parentheses and semicolons are comments, while in Forth things in parens are ignored as comments and semicolons end the definition of words (and so appear all the time).
There is a paper called "Call-by-Value is Dual to Call-by-Name" by Philip Wadler which gives a more rigorous duality between evaluation orders in the lambda calculus through an application of De Morgan's laws (using the dual nature of 'and' and 'or').
Dynamic typing and static typing have a complex relationship that is not so much about duality, but the (very good) article found here http://cdsmith.wordpress.com/2011/01/09/an-old-article-i-wrote/ does mention one way in which they are opposites- static typing excludes program behavior through proofs (which will exclude certain "correct" programs), while dynamic typing (in practice) excludes program behavior by testing (which may let "incorrect" programs through).
Idle Thoughts
I've posted before about the connection between some data structures and certain algebraic structures. The other day I noticed something odd, which is probably not very useful but may turn out to be interesting. An algebra for a monad can be considered a model of some algebraic structure. This means we could use monads to describe data structures as well as to model computational contexts! Given an operator with the right properties, the monad could describe lists, sets, multi-sets, etc. Note that we could use the list monad to describe lists, so naively there may be a chicken-and-egg problem, but that's not really the point. The really interesting thing here is that if we can give the operator the correct axioms, we could describe a set structure that is not describable as a "polynomial type" using the sum and product from the theory of combinatorial species. This is because sets are a non-regular species and are taken as the primitive species E.
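As a tiny illustration of the monad-algebra view (my own sketch, not a full treatment): an algebra for the list monad that satisfies the algebra laws is exactly a monoid on the carrier, so the choice of operator is what picks out which structure you are describing.

    -- An algebra for a functor (or monad) f is a way of collapsing an f-structure.
    type Algebra f a = f a -> a

    -- An algebra for the list monad amounts to a monoid: the fold is the operation
    -- and the fold of [] is the unit.
    sumAlg :: Algebra [] Int
    sumAlg = sum          -- the monoid (Int, +, 0)

    concatAlg :: Algebra [] [a]
    concatAlg = concat    -- the free monoid: this algebra describes lists themselves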
Perhaps other structures could be described this way? I really like rigorous descriptions of data structures like this because they give deep meaning to things that are commonplace for programmers, and a way to reason about and manipulate the structures in a less ad-hoc way. Probably this is either well known or not useful, but I have no real way of knowing which.
Monadic Burritos
There is an article about a certain type of tutorial called "Abstraction, intuition, and the “monad tutorial fallacy”" which mentions a fictional tutorial that tries to explain monads by saying that they are like burritos. Part of the problem with monads in particular is the mystique that surrounds them because they are an "advanced" concept, although like many things they are much simpler than they are made out to be. I have to admit that it *was* hard to learn enough category theory to really understand them, but the concept as it is used in functional programming is very understandable.
Strangely, the joke that monads are like burritos is closer to the truth than anyone realized at first- the post on The Universe of Discourse titled "Monads are like burritos" explains that burritos do in fact form a monad, which the author of the original article never intended. What a delicious coincidence.
Tuesday, January 4, 2011
Applicative Functors and Universal Computation
I've been reading the functional pearl "Applicative programming with effects" by Conor McBride. His papers are always interesting, and this one is no exception. I really like how functional programmers bring in mathematical concepts to ground their abstractions and give them a clear definition. In this case we get applicative-style programming based on "applicative" functors (which, as the paper describes, are actually strong lax monoidal functors). I have a basic understanding of what it means to be a strong lax monoidal functor, but I would like to look into it more.
The reason these functors are especially interesting to me is something McBride mentions along with the Applicative instance for functions. (On an unrelated note, the fact that -> (as in a function a->b) is a type constructor is amazing to me.) When giving the definition of pure, he mentions that it is the K combinator, and when giving the definition of the application operator (<*>), he mentions that it is the S combinator. It is known that the combinator I can be constructed out of S and K, meaning that there is some connection between applicative functors and computation (as SKI is a system of universal computation). I don't know enough category theory to work out the connection, but somehow there is a situation in which computation can be described by some functor and its composition. Maybe there is even some functor that corresponds to the untyped lambda calculus (and maybe others for other lambda calculi?). I know of work describing the lambda calculus in category theory, but normally it describes a category, not a functor. I really don't know where this leads, but the paper references "Premonoidal categories and notions of computation", which hopefully investigates this more.
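For reference, here is what that instance looks like, written out as standalone Haskell definitions (the real instance for functions already lives in the standard library; this is just to make the K and S shapes visible):

    -- pure for functions ignores its environment: this is the K combinator (const).
    pureF :: a -> (r -> a)
    pureF x = \_ -> x

    -- Application for functions feeds the environment to both sides: the S combinator.
    apF :: (r -> a -> b) -> (r -> a) -> (r -> b)
    apF f g = \r -> f r (g r)

    -- The I combinator is S K K, and the same construction works here:
    identityF :: r -> r
    identityF = apF pureF pureF   -- behaves exactly like id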