Its Computer Science Time: Learning Category Theory

Continuing my series of introductory posts on awesome thing that seem very difficult at first but turn out to be very fun and understandable, I'm going to start posting about category theory. This is a huge subject, and despite about a year of teaching myself about it I don't even understand all of what sometimes called elementary category theory (I may say CT at times for short). The assumed knowledge will be higher for this series then the others, however, so be warned.

I think we will start off defining a category without going into their use or importance. I found it confusing to think about the sheer size and amount of content of some categories, and the abstraction still gets to me at times. For this reason, we will talk about a category as an object like any other object- we have a series of axioms and we are interested in the objects that satisfy those axioms. It will be a little different from an algebraic structure as the binary operation is not closed. Its a little like a graph, but often we don't allow graphs to have multiple edges between the same source and target, where in a category we do.

I'm going to ignore a whole series of issues in the definition for simplicity, so just take this post with a grain (or bucket) of salt.

A category, C, is given by a set of objects Obj(C) and a set of morphisms Mor(C). We can specify this as a 4-tuple (O, M, S, T) where O is a set of objects, M a set of morphisms (also called arrows, but they are just some kind of object) and two functions S, T : M -> O taking a morphism to its source (S) or its target (T). One way to think of this is as a set of objects with arrows between them. The source of an arrow is the object (or point if we draw them as a diagram) that the arrow starts at and the target the object it points to. A little ascii art:


*--->*
|
V
*

where the "*" are objects and the little ascii art arrows are the arrows/morphisms.

A nicer way to specify a category, C, is as a set of objects which we write either Obj(C) or, in an abuse of notation, as just C, and, for every pair of objects, a set of morphisms Mor(A, B) where A, B are objects in C. We can also write C(A, B), which I prefer. You may see it written hom(A, B) which is referring specifically to the set of morphisms when its important that it is a set.

So, we have our objects and our arrows. Now for the axioms- for A, B, D in C if we have an arrow from A to B, written A -> B, and an arrow B -> D, then there must be an arrow A -> D that is the same as combining A -> B and B -> D. This combination is called composition, and will be function composition in some cases. If f:A->B, g:B->D, h:D->E for some A, B, D, E in C, then gof:A->D (g composed with f, with "o" as the symbol for the composition operator) must exists, and ho(gof) = (hog)of, so the composition operator is associative.

Also, for every object A in C, there is a morphism id in C(A, A) (the set of arrows that start and end at the same object A) such that for any morphisms f:A->B and g:D->A, f o id = f and id o g = g, so id is the identity in the sense that it has no effect on composition.

Notice the difference between composition and other algebraic operators- normally we can just say that there is some underlying set, S, and the operator is a function (using additive notation) +:SxS->S. If we have an identity element we say e is in S and for all b in S b+e = e+b = b. Composition can not operate on arbitrary elements like the operator + can, it can only act on composable elements. Being composable means that the target of the first is the source of the second.

Each of the elements in this definition can be interpreted in many ways- thats part of the fun of category theory. On way to think of morphisms is as transformations that morph one thing into another. Composition then means to do one transformation then another. In the category of sets the objects are sets, and the morphisms are functions. In this case we can think of a function turning one set into another by acting on each of its elements. In general, however, the objects can be anything and they may not have elements inside them (if they are numbers for example).

I'm hoping that this series will record what I've learned about CT and will provide many interpretations of different parts of the theory. If I can far enough I would like to talk about categorical functional type theory and topos theory, which I'm currently struggling desperately to learn.

Its Computer Science Time

Tuesday, April 26, 2011

Learning Category Theory

No comments:

Post a Comment