Wednesday, September 28, 2011

Forth Progress Update

My standalone Forth for the x86 is coming along nicely. I make little bits of progress every time I work on it. Right now it can compile new words, variables, and constants, as well as do simple arithmetic. The command line interface needs work, but it already displays the system state (compile = 0, interpret = 1) in the bottom right corner of the screen, and shows the top 5 items on the stack on the bottom left corner (with the top of the stack on the right).

The next things I want to tackle is to factor out the screen information, the command line information (to which I want to add a command history so I can press down and up to cycle through commands), and so attach to each key a forth word call-back. I want to talk a little bit about that in this post.

I want different keys to do different things, and I would like to be able (when the system is better developed) to attach an arbitrary forth word to a key. When I say attached I mean that the each key gives you a fixed integer value, which I use as the index into an array of characters to get the character value (the keyboard doesn't just give an ASCII value). In addition to this use of the key as an index, I want to also index the integer given into an array of execution tokens (which are basically forth's equivalent of function pointers). The interpreter would read in a single character from the keyboard and call the word (function) in the character's index. This would allow me to customize the keyboard easily, and may help when I (waaay in the future) add a text editor. Most of the time keys will be attached to a word that calls putch (put the character on the top of the stack to the screen and move the cursor one position over), but keys like enter, the arrows, backspace, and possibly other will need special handling that is done as part of the outer interpreter right now and can't be configured easily.

One interesting use of this is to have many threads, each with a piece of memory for a screen and their own command line information, and to tie one to each of the function keys (F1-F12) to get virtual terminals. This would be a fun feature even while this is still a single threaded system. Way in the future I will read the APIC documentation (possibly HPET also?) to get a preemptive round robin going, but I would like at least a simple memory manager going before I allocate much fixed stuff on the heap (which is everything above the 3 MB mark right now, with the kernel at 1 MB and the system dictionary at 2 MB).

Well, that is enough for today- I have lots of fun things to work on! I love the response you get from adding features to a system like this with quick turnaround and useful results.

Monday, September 26, 2011

Types and Abstractions

I was talking to a friend the other day about what computation is. I mentioned how computation is basically "what a Turing Machine does", that it tries to formalize the intuitive notion of performing a calculation or proof (or any sequence of actions), and how there are a lot of systems that give rise to this phenomenon. In fact, computation is surprisingly easy to come by- we can get it out of a lot of substrates like machines, proof simplification (finding normal forms in certain logics), patterns in Conways game of life, rewriting strings of symbols in the various rewriting systems like universal grammars, and many more.

It is interesting to me that the substrate effect the way we can construct computations and the nature of the structures we construct, and yet there are underlying principals that unite the various systems of universal computation (systems equivalent to Turing Machines). There is always a way to have a computation blow up- nonterminal or the lack of a normal form for example- and there is never a way to tell when this will happen in the general case (if there is then there is no way to compute all things- the system give up being universal). This is not always such a bad thing, really, as it is interesting to work within more restricted systems with nice properties (strong normalization/guaranteed termination). This is analogous to what we do when we introduce layers of abstraction in programming- we introduce a layer of restrictions on the actions of the lower layer, simplifying its use but often removing some things that are useful. The point I want to make here is that type systems are essentially restrictions on untyped universes of structures (bits, lambda terms, machines, strings of symbols, etc) that describe computations (I'm only interested in computation here) in a way that is similar to abstracting over a lower level interface. We can impose rules on the structures and ensure certain properties, while at the same time losing some information and power from the lower layer.

This is a pretty unformed thought, but the point stands- type systems and abstraction seem essentially the same. It seems like the desire to expose all possible power of the lower layer is only possible if we break an abstraction, which is analogous to doing things like adding Y to a type system to get universal computation- we have to give up our clean abstraction and deal with the substrate directly with all its power and complexity. Resist the desire and lose power, give in and lose any nice property you could have gained (this is a simplification, but it is often true).

Friday, September 23, 2011

Inner Interpreter Works!

Progress update on my standalone forth- the inner interpreter works! I can compile primitive words and the threaded interpreter will run them to completion. If I start out with QUIT on the return stack, with some word's execution token (the location of its array of function pointers) in the IP register of the virtual machine (meaning the IP variable of a struct) and call next then the word will run and the program will block waiting for user input. This is because QUIT clears the return stack and starts the outer interpreter. The outer interpreter needs work, and I'm trying to get word lookup, compilation, and interpretation working.

Right now a word is arranged in memory like so- last_word, flags, length, characters..., function pointers..., exit.

The last_word is a pointer to the previous word in the dictionary of words. This means that each definition is linked to the last, up to the very first, which is linked to NULL. The characters are actually held in a cell length (4 bytes) for convenience. I'm willing to throw away memory for this system because I expect to use less than a single percent of my 2 gigs, so space is not a problem. If I packed them tighter than I would want to align the function pointers to 4 bytes, which would mean calculation the length of the word's name mod 4 and adding that many unused bytes at the end of the characters part of the definition. This is simply more work to save some bytes.

I'm working on getting numbers and words to be recognized and for words to be looked up and compiled or run. After that I will be able to extend the dictionary beyond a couple of single words and get this project on its feet. I'm hoping to also look into doing a preemptive round robin threading mechanism, as well as multiple terminals, a block file system, and a text editor. Big plans!

Wednesday, September 14, 2011

Forth Indirect Threading

Every time I want to remember how the inner interpreter works in Forth, I have to read the first part of Moving Forth again, so I thought I would record my thoughts this time for future reference. This is going to be in the context of my current project- YMMV.

The Forth inner interpreter consists of: next, exit, and docol (aka enter). There are two types of words- primitives and higher words. Primitives are machine code (C for my project) and higher order words are lists of other words (possibly also higher order). In ITC all words start with the location of some primitive functions. For primitives this is the code to execute the word, which must end in a call to next. For higher order words the first pointer is to the function docol, and the last a pointer to exit. The forth vm has an instruction pointer, IP, and a working register W (I plan on also having registers A and B, and possibly X and Y for indexing). When the outer interpreter has a word it needs to execute (entered by the user and looked up in the dictionary) it places the quit word on the return stack, sets IP to the execution token of the looked up word, and calls next.

Next copies what IP points to into W, and moves IP to the next cell over. It then calls the function that W now contains a pointer too. If this is a primitive then that function is the primitive itself, which executes some function. It must end with a call to next, which starts the process again. If all words where primitives, this would move the IP along the array of pointers until hitting exit. If a word is higher order, however, then the first function is docol. This function pushes IP unto the return stack, copies W to IP (as W contains the pointer to the docol pointer) and moves IP over one. It then calls next, which starts executing the current word just as before.

When exit is reached, the top of the return stack is popped into IP, and next is called to continue executing the next word up in the call tree.

If the return stack reaches the very bottom, where the outer interpreter places the pointer to quit, then the last exit will call the function quit. This function will clear the return stack (in case it is called in some nested context) and will clear the C frame stack (in my case) with some inline assembly. It will then call the outer interpreter to start up again, which will wait for user input.

Wednesday, September 7, 2011

Forth Threading Models

In the last post I gave one possible motivation for threading. Now I would like to go over a couple of techniques that can be used to achieve this affect. For more detail, see "Moving Forth" at http://www.bradrodriguez.com/papers/index.html, which is an amazing reference.

The first model is Direct Threaded Code (DTC). In this model, each subroutine (word) contains machine code. If the word is a primitive, this code is just the implementation as an assembly subroutine. If this is a high level word, and therefore only contains calls to other words, then the machine code is a push of the next instruction and a jmp to a subroutine that will start calling each address, one after the next. This would look like:

call DOCOL; a; b; c; NEXT;

Where DOCOL is the subroutine that calls each of a, b, and c (addresses of subroutines) in turn. NEXT is the address of the instruction that keeps this whole process moving- if we have entered a word from another word it will leave the current word and move one step outwards in the nesting. I will be explaining the various stack in a later post, so just know that NEXT handles cleaning up at the end of a word.

Another model is Indirect Threaded Code (ITC). In this model, a word must always contain a list of addresses of other words. This means that even the most primitive words must contain a pointer. Such a word might like (assuming the word starts at address 0x8000 with 32 bit cell sizes):

0x8004; add r1, r0, r1; call NEXT

This word simply jumps to the next instruction, as its require to jump somewhere. The actual action is to add two registers, and then call NEXT.

Another technique is Subroutine Threaded Code (STC). This one is a little bit ironic, as it "threads" the address by simply making each one a call. The disadvantage is that it may take up more space then other techniques as each address is actually the call instruction. Remember, however, that the original motivation here was to remove these redundant calls. While at one time it made sense to save that little bit of memory, now it may make more sense to increase speed by removing this extra level of indirection and just make all jumps subroutines calls. Another advantage here is that primitives can just be machine code- we always jump to a word like DTC. The word NEXT may literally be just "ret" or return as we are just doing the usual jumping to an address while storing the instruction pointer call we can do the usual return to the last subroutine by popping the return stack to the instruction pointer.

Yet another technique is token threading. In this technique a word is an array of values that aren't themselves addresses to other words, but are rather indices into another array of addresses of words. The only real advantage to this technique is that it can be used to get very very small compiled words- a table of only 256 words can fit each "token" (index) in a single byte. The problem is we have yet another level of indirection to wade through, and the number of words is limited to the size of the table. I don't like this technique very much as it also makes the structure of the dictionary particularly fixed (we will get to the dictionary later) and that makes some interesting things much harder.

While there are more forms of threading, I'm only going to discuss one more- switch threading. This technique is basically what a lot of bytecode interpreters do- they have a massive switch statement that determines what do with a compiled bytecode.

Well, thats it for now. I believe I will be doing a ITC interpreter for my new Forth, simply because it is easy to place pointers to functions in C. I'm considering playing around with the C stack a bit, but for now I'm looking at each word as an array of pointers to void functions with no arguments.

Forth Threading Models, motivation

I will be posting a lot more about Forth nowadays, as I work on my standalone Forth for the x86. First off I would like to talk about threading models, starting with some motivation.

Many languages have some abstract machine model that gives them some sort of semantics (sometimes rigorously, as an the operational semantics of Haskell in the STG Machine, and sometimes informally, as in Java's JVM). Forth has such a machine, which gives a basic idea of the things that a Forth system should do without specifying implementation details. This post, and hopefully many more in the future, is about that abstract machine as I understand it.

When writing a program in assembly, we may write a large number of subroutines to do trivial tasks, and to reuse these simple subroutines to define higher level routines. The higher level routines will consist largely of calls to other routines. Imagine we have a pretty resource constrained machine, and we are looking at a program that looks like this:

call a

call b

call c

ret

Which uses some "call" instruction, or if there is no "call" then it must push the instruction pointer (so we can return to it later), and then jump to the subroutine. We are imagining that a major concern is memory usage, so we immediately notice that there are a lot of calls. Why keep all of the instructions around if we are just going to call each address anyway? Instead of having these explicit calls, we could just have

call run

a

b

c

end

This requires a little explanation, especially as I've just made up "run" and "end". What this code is suppose to suggest is that we call a subroutine called "run", which is written once and used many times, which will look at the instruction pointer on the stack (pushed by the call instruction) and run that code. When that code returns, run will call b, than c, than end. The subroutine "end" is another one that is defined only once, and it will do any clean up necessary, which may just be a ret like a normal subroutine.

So we added a little wrapping around the bunch of calls, with the call to a special function to start and the address of a special function to end it off. If your program consists mostly of such lists of calls, then there is some savings of memory here, but that turns out to be a side note today. Now we have plenty of memory, often even in embedded systems. This technique turns out to have other advantages, however, and gives rise to the basic notions of the Forth language.

Okay! Thats enough for now I guess. I was hoping to get into threading techniques, but maybe next time!

Monday, September 5, 2011

Binary Operators

There is something that has bothered me for a long time in mathematics: what is up with the number 2? Why do we have "and" and "or" (conjunction and disjunction) in logic and type theory, meet and join in lattice theory, unions and intersections in topology (and set theory), Cartisian products and disjoint unions as polynomial data types (and set theory), products and sums in abstract algebra (sometimes we even have two "layers" of two operators which take two arguments, like in an algebra), and categorical products and coproducts in category theory. There are two things going on here- why are these binary operators (in the sense that they act on two objects) and why are there always two of them (why are they paired up)?

The reason this interests me is that we can easily define unary operators, or ternary, or what-ever-ary we want, so why is binary so common? Also, certainly not all structures have binary operators, and just having one doesn't always imply having the other, in general. For some, its because of their relationship with another pair of operators, like how lattice theory is historically a model theoretic view of truth-valued logic. However, the point is that there are often the two binary operators, not four ternary operators or one 5-ary operator or anything else. I don't know why this is the case, but I will take a stab at it.

One of the reasons is that its often really just one operator and its dual. This is particularly clear in category theory, where they are dual constructs, but similar duality conditions seem to hold for the others (perhaps not exactly? I don't know). This covers why two operators, but why are they binary?

The answer seems to be that they aren't, really. They can be given as binary operators, but as long as they are associative there is no ambiguity when defining them over finite lists of objects. Some can even be defined over infinite lists/sets. Giving them as binary operators is easy to define, but we can just as well define X V Y (with V the join) as the join of X and Y as we can define V{X, Y} as the join of all elements of a set (in this case X and Y).

So why two binary operators? It seems like we are really just looking at some operation, its dual, and the special case of defining it over two things instead of groups of things, which is just for convenience.

First Ever Successful Boot!

I'm restarting a project that I tried a few years ago- to write a stand-alone forth operating system. The complexity of the x86 processor has stopped me in the past, but this time I am more prepared. Another thing that has stopped me is getting a kernel to actually boot on a real computer. I've gotten them to boot on emulators, but for some reason I could never do it on my desktop. Well, this has finally changed! I got grub2 to boot a kernel image using tutorials from osdev, where the kernel simply prints NOAH to the screen (the hello world of hobby os development seems to be to print your own name to the screen).

This is great news! All the starting up stuff has been keeping me back for a while, and I still have a lot to get worked out (interrupts, memory model, memory protection, and keyboard controlling) but now I can start actually working on it and not have that nagging worry that I won't be able to boot it even if it works in an emulator.

I am trying to keep everything simple here- there will be no restrictions to memory, it will be mono-tasking at first (forth multitasking isn't super hard, but I need something to work with first before I go there), and I don't intend to support anything but a keyboard. One major difficultly is disk access- I would like to support a block file system, possible with largish block (4KB rather than the usual 80*25 to store a screen) or maybe not, I don't know. The problem is accessing the hard disk- a possibly complex operation that I don't know anything about. I would be okay with a partition that is organized into blocks, or a file on another file system that is loaded as a full block files system- which would be really neat because I could edit it in Linux as a text file, and load it as part of the operating system. The problem of course is I don't know how to do this either.

So in conclusion- YAY! I finally started this project! Its been forever! I'm planning on using it mostly for fun, but also as a way of getting more comfortable with C, which I will be using a lot in the future. I was going to just use assembly, to take complete control of the system and make the Forth registers the actual machine registers and all that like in my previous forth system for the z80 of my ti-83 calculator, but I decided it would be easier, and possibly more instructive for me to use C to learn about using it as a systems programming language.

Closed Categories, Properties, Philosophy of Mathematics

Warning- this post is just my recent thoughts about math. I don't know how to make them rigorous, and they may be completely wrong.

I've been thinking about how the subfields of math each center around the theory of some object. The object is normally given as a set theoretic construction satisfying some axioms, and the field of study is concerned with the properties of these objects, what objects in other fields satisfy the axioms (and are therefore part of the field), and with transformations between the chosen objects. Category theory in some sense captures this idea by making the objects of study this situation- a category can be thought of as containing the objects of some field of study, the morphisms are the transformation between the objects. Categories can just be thought of as another object, defined like any other field's chosen object, but they can be considered the embodiment of a field of mathematics. This makes it a little less surprising (though not much, this is pretty surprising!) that categories are so intimately related to theories (the object "a theory" consisting of types, function symbols, axioms, and relation symbols). In fact, the "internal language" of a category is a theory, and the model of a theory is a category. This is an amazing thing that I'm just beginning to grasp, really.

In most theories of some object, there are important properties that we want to show that a particular object has. In group theory, for example, we want a set to have, among other things, an identity element for it to be a group. The group homomorphism from the one element group to a given group points out that identity element, and therefore the one element group embodies the property of identity. The proof that a group has an identity is then embodied in the existence of a homomorphism between the group that embodies the identity and the chosen group.

What I'm wondering is- how general is this idea? Are all properties of objects embodied in some object, and all proofs embodied in the existence of morphisms from that object? In closed categories, where the morphism are objects of the category, this means that the theory of the objects of the category contain their own proofs. This seems like Topos theory, which requires a couple more properties then just being closes, which studies categories that have enough structure to contain systems of mathematics (as I understand it). Not all categories contain their morphisms in this way, and so not all theories contain their proofs, but it looks like Topos and their associated theories do this.

Interestingly being closed appears to be fundamental to categorical notions of computation. The categories whose theories are type systems over the lambda calculus are Cartisian closed categories, although I think that can be weakened in some ways. This is a basic fact of computation it seems- Turing machines have their universal Turing machines which run encoding of other machines, lambda terms are defined over other lambda terms, and actual machines can't distinguish (in general) between data and code. The whole Howard-Curry isomorphism and proofs as propositions interpretation gives us the fact that programs encode proofs of a proposition in some system of logic (in the case of the lambda calculus, lambda terms can be read as programs to be reduced, or as proofs. It just depends on how you read them). It seems to me that in a constructive mathematics all proofs would indeed be some form of transformation which can be encoded in an object (functors, homomorphism groups, etc) meaning that objects and their proofs are the same thing. This puts a perspective on things like naive set theory- if you can describe objects that don't make sense (contradictory sets) then you have to expect contradictory proofs. It may be that the safest thing to found a system of mathematics on is something where we can construct all objects, where the transformations between objects are contained in the theory, and therefore we can construct the object embodying all proofs (if we couldn't construct it, it wouldn't be true).

I normally ignore the philosophy of constructive mathematics as my interest in it is really just as a computer scientist. I think of math as being a human activity, so classical logic isn't really "wrong" and people are welcome to do it if they want (it is certainly powerful, even if its a bit odd at times). On the other hand, it does seem that constructive math, as I understand it, is on a good foundation where it safely contains its own proofs and its lack of freedom in defining objects (as opposed to classical math where we can define some very strange things) prevents the weirdness of classical math. Type theory started as a way to prevent set theoretic paradoxes, and intuitionistic type theory may be the way to go for a system of math that constrains its objects and proofs enough to keep math down to earth.

Okay, enough ranting. Sorry for the half formed thoughts, but I wanted to record them somewhere.

New Job = New Posts?

I have my first real, out-of-school, full-time type job at SSAI working on embedded systems for a NASA instrument for the International Space Station. I am enjoying it a lot so far, but I've been pretty busy and haven't posted for a while.

But that may be about to change! I'm learning a lot about real-time operating systems, embedded systems programming, network protocols using in space, radiation hardened devices, and programming in C (which I will be doing a lot of). This gives me lots of new material to talk about. I also have some side projects I'm starting up that might make for some good posts.

For right now I just wanted to comment on my experience with C. In my internship this summer I started to really like Perl, despite all my programming language snobbery with Haskell. I would definitely use it for text processing and as a replacement for shell scripts. Similarly, I've been discovering the good things about C. The attitude that I have right now, which I heard somewhere and seems to be true, is that C is a very nice assembly language. What I mean is that its abstraction (its abstract machine) is a fairly thin layer above the actual machine.

While the Haskellian in me doesn't like unions types and arbitrary casting, I'm starting to see how they are useful in systems programming. Bitfields, while possibly dangerous (compiler specific behavior has already come up in the last two weeks), and helpful with network programming and other densely packet structures.

So- the future. I hope to post about embedded systems, C, space, and my (second attempt at a) project to write a standalone Forth operating system.