Its Computer Science Time: Eden- Distributed Computing with Haskell

I'm doing a project for a Distributed Operation Systems class on the programming language Eden. I wanted to post a little bit about my understanding of the system, so it will be a lot of rambling thoughts many of which may be false.
Eden is an extension of Haskell built of off parallel, concurrent, and distributed Haskell extensions and using part of the GdH (distributed Haskell) runtime system GUM. It adds a small number of primitives to Haskell yet allows arbitrary topologies for a distributed system.
The language has a published operational semantics based on the STGM (spineless tag-less graph reduction machine) and denotational semantics (based on a simply typed lambda calculus).
The core concepts are- processes, threads, channels, processing elements, and a non-deterministic merge. Processes are first order abstractions and are instantiated partially eagerly. They also create their output eagerly in the sense that they do not wait for the creator process to ask for output before sending it. This is necessary to avoid what is called sequential parallelism where a potentially parallel process ends up executing sequentially over many processors.
The dynamic channel construct allows communication channels to be represented as Algebraic Data Types and sent over existing channels. A dynamic channel can only be used once and creates a channel from the process that creates the channel to the process that created the channel object (using object in the more general sense of a manipulatable item).
The merge primitive is interesting as it allows Eden programs to be reactive. It is amazing how much this primitive seems to add to the language. The dynamic channel construct technically adds no more expressiveness to the language in the sense that each dynamic channel could be removed and turned into a series of transfers over static channels. The merge on the other hand allows a series of streams (events, data from a sensor, user actions, etc) to be merged into a single stream with no assurance on ordering. If for application specific reasons one needed a deterministic merge it would have to be built on top of the merge primitive.
The merging of streams means that Eden can be used to create any of the normal sort of distributed system. The underlying run-time system has concepts of remote references (they call it remote data) and location specific resources (sticky objects) that could be files or processes MVars, or anything else.
It is hard to get a good picture of the Eden system as a whole, as many of the papers are at a high level of abstraction and leave details to the underlying RTS. Unfortunately there isn't much about the details of the RTS, which although apparently similar to GUM does not make use of all of its concepts such as the global heap. I have a lot of questions that I can't seem to find answers for about the little details of Eden.
On thing that we have been talking about in my class is transactions, and Haskell certainly supports STM. I'm not sure exactly how this interacts with Eden.
The virtual machine that runs the compiled Eden language PEARL (Parallel Eden Abstraction Reduction Language) is called DREAM (DistRibuted Eden Abstract Machine). Instances of this machine must be on every device in the system. A great deal of the machinery here is taken care of in a way that is abstraction so I don't know how instances add themselves to the system or negotiate or send data or marshal and un-marshal or anything like that.

There is quite a lot more to this than I've mentioned. There is failure detection and correction, node failure and network partitioning, inter-process and inter-thread communication, support for algorithmic skeleton's, and more.

Its Computer Science Time

Monday, March 14, 2011

Eden- Distributed Computing with Haskell

2 comments: