Category Theory and the category of Haskell programs : Part 1

"Category theory" is an expression that is generally frightening people. But, if you have attempted to read some research papers in Computer Science, Mathematics, Physics or even Philosophy, you've surely remarked that Category theory is used a lot and you probably asked yourself : what is Category theory ? Why is it so useful ? Is it so difficult ? What the link(s) with computer science ? Answering to those questions in an easy way that can be understood by most people is really challenging but I am going to try and I'll use the category of Haskell programs to illustrate some points. I hope the experts will forgive me some slight inaccuracies and/or an unconventional presentation.

1. What is a category ?

Category theory is used a lot because it is a theory of processes in the most general meaning of the word. Indeed, what do all processes have in common ? Answer : they can be composed and the composition is associative. So, a category is just a bunch of objects A,B,C ... and some processes f,g,h ... I'll note a process f applied to A to get B with an Haskell syntax:

f :: A -> B

A is the domain of f and B the codomain. It is wrong to say that the image of f is included in B. f is not a function.

The processes and objects are a category if they can be composed and the composition is associative which means:

f :: A -> B
g :: B -> C
h :: C -> D


(h . g) . f = h . (g . f)

And we need a last axiom : each object must be provided with a "do nothing process" called the identity :

 id :: A -> A
 id x = x

There is in fact one different identity process for each object A. Category theory is not polymorphic.

The processes in category theory are often named : arrows, or morphisms, or functions. But, they are much more than functions. Arrow is thus a better name. But, I don't think it is intuitive enough so for this presentation I chose process.

Note that all definitions are so far symmetric ones : if you change the direction of all arrows you get the same axioms. So, generally for each definition in category theory you have a dual one that you can get if you change the direction of all arrows : the dual definition is often named with a prefix co.

So, why is it useful ?

2. Why is category theory useful ?

It will be difficult to explain in a short paragraph why it is useful. You really need to see lots of examples. But let's try.

In a category you have some objects but you don't know what they are made of. An object is opaque and you don't have access to its implementation. You just have access to the processes, the arrows. In short : the interface, the API ! What is important is not the identity of the objects but their behavior. In fact, in category theory definitions are generally given up to isomorphism. Two objects are considered the same ones if they behave the same and if you can't distinguish them with the API you have.

The consequence is that most definitions are very abstract because you can only define things using an interface and you never have access to the implementation.

For instance, how could you define a subset if you don't have access to the elements of a set and if you're not even allowed to speak of the elements ? It is nevertheless possible ! The definition will be abstract but the advantage is that the definition can then be used in other contexts. Once you have defined the concept of subobject then you can not only talk about subsets but you can talk about subgroups and maybe other subthings according to the category in which you are.

So, thanks to these abstract definitions you begin seeing order where there was some chaos. But it is even better than that. Remember I said objects were defined up to isomorphism. So you never say that objects A and B are equal but that they are isomorphic. Which is in fact a way to say they are similar for a given definition of similarity. Category theory is a powerful tool for studying similarities and in fact relaxing and generalizing the definition of similarity. Seeing analogies between objects and analogies between analogies is again a way to introduce some order where there was an apparent chaos.

But, to be really useful we cannot stop at the definition of a category which is too general for being of any use. We need to study specific categories, specific processes, discover properties which can be used in many different categories etc...

So, for this first post, we will just try to define a subobject and first we need to define some specific kind of processes.

3. monic, epic and iso processes

Let's try to generalize the set definitions for one-to-one and onto functions to category theory. We know that for a one-to-one function f we have

f . g = f . h => g = h

Intuitively it means that f is not erasing nor adding information and so if f . g = f . h we can conclude that g = h It will be the definition of a monic process (monomorphism). If for each g and h we have

f . g = f . h => g = h

Then f is monic.

Now, let's try to do something similar for onto. If a function is onto then its image is big enough and can be used to distinguish two functions. So, we may try : if for each g and h we have

g . f = h . f => g = h

then f is epic (epimorphism). And in fact we will use that definition but WARNING. epic is not equivalent to onto. Indeed, we want the result of our process f to be big enough. In Set, big enough is the full set which containing the image of f. But, in other categories big enough is dependent on the processes we have. Don't forget that objects are considered as equivalent if they cannot be distinguished by the processes you have. We don't need more. I don't have any other example than topological ones. So, I won't give an example (but if you know topology just think about continuous functions for the processes. Dense subsets cannot be distinguished from the full set with just continuous functions).

The advantage of previous definitions is that they are not refering to elements so you can use them in any category. Another way to remember the definition is:

  • A monic process can be canceled out on the left
  • An epic process can be canceled out on the right

What is an iso process (isomorphism) ? It is NOT an epic and monic process. f is iso if there exists g such that

f :: A -> B
g :: B -> A
g . f = id -- identity of A
f . g = id -- identity of B

Now we are ready to define a subobject

4. Subobject

In set, you get a subset with a one-to-one function that will be used to embed a set in another one. So, a subobject could just be defined with a monic process:

f :: A -> B

If f is monic then f is a subobject of B (and we don't say the image of f. f is a process not a function). And we really say f is a subobject.

5. Initial and terminal objects

5.1. Terminal object

Some additional terminology. A terminal object T is an object such that every object has exactly one arrow to T. It is playing a similar role to the singleton set. Indeed, to define a function from a Set to a singelton set you have no choice : the image must be the one element of your singleton. So there is only one function possible.

In Haskell, a terminal object may be a phantom type:

data T

since T is containing only the element undefined. So, if you try to define a function

f:: a -> T

it can only be:

f _ = undefined

Such kind of object is called terminal because if you draw a picture you will see lots of arrows converging to it. And of course, such object is defined only up to isomorphism. Any other phantom type could be used.

data U

And we have an iso which is a typecast:

iso :: T -> U
iso _ = undefined

iso' :: U -> T
iso' _ = undefined

The type of iso . iso' is U -> U and there is only one possible function from U to U which is id.

Indeed, if you try:

instance Show U where
 show _ = "U"

 (iso . iso') undefined

you get U !

5.2. Initial object

By duality, an initial object I is one from which there is one arrow from I to any other object. That kind of object is playing the role of the empty set. So, there is no initial object in the category of Haskell programs since any type is inhabited by undefined.

A object which is initial and terminal is called the Zero object.

6. End for today

We still have a very long way before being able to explain types like [a], the famous result "theorem for free", monads etc...

I'll continue only if I have some readers since I am not sure it will interest lot of people :-) This first post is a test.

(This post was imported from my old blog. The date is the date of the import. The comments where not imported.)

blog comments powered by Disqus