Four weeks ago I started learning Clojure on my own. I took the bold move of building something awesome without even knowing the syntax, which I hope will eventually learn by doing an Amazon web spider and Project Euler problems.
So far it’s being both good and hard at the same time, I kind of want to push off and be done with some problems at my work at Lifebooker. This and all hackNY events, hackathons, and heck just trying to make the most of my time at NYC (at which I’m sort of failing) don’t really leave much room for experiments. Tonight I feel like I should rest and work in the morning rather than trying to do and accomplish nothing on my Amazon pet project. So instead I’ll be writing a blogpost about corecursion and learn from explaining it. Let’s get started by defining what recursion is.
My dearest friend Wikipedia says recursion is the process of repeating items in a self-similar way. Pretty accurate, but for our purposes and for anyone not familiar with the mathematical concept of self-similarity, let’s understand recursion as whenever a function, be it a mathematical function f(x) or a function in a programming language function(x), are expressed in terms of themselves.
In maths we understand a function defined in terms of itself as, for instance, f(x) = f(x-1) + 2. A simple function defined in terms of itself in java-like pseudocode would be something like,
Recursive functions usually have at least a base case, so they stop at a certain point and stop calling to themselves. The simplest example I can make up off the top of my head is this, given an integer argument,
This would just return a number after some calls, provided that you call it with an integer argument less than 10. Now that we have brushed up on our CS101, let’s bring up the real thing. Corecursion is essentially the same idea but applied to data structures instead of functions. If I had to put a wording for the problem, it would be something like:
Given a data structure, define a computation on that structure. Execute it, then repeat on result of this computation, and so on. It is similar to recursion, but it produces a sequence of data, with calculations taking place on demand when this data is dereferenced. It’s oriented on data, not the operations.
In order to understand corecursion, I found vital to understand codata. If you really wanna get into it, I would recommend reading this paper which teaches the basics of F-algebras, F-coalgebras, and duality (F meaning functor in this context). For the sake of simplicity, let us use two really basic data types, based on natural numbers, Nat and CoNat.
Nat (the natural numbers) is defined as:
At the end of the day they are created out of S and 0. Think of S as 1 and you could create all the numbers with this data type.
CoNat is almost identical to Nat but with an extra element.
An interesting corollary to this is that the latest element of the CoNat set doesn’t contain c0. Plus, it’s truly infinite, as opposed to the “latest” Nat which is just applying S and 0 over and over. If we were to define Nat, we could say something like
But when it comes to infinity, we cannot do infinity = Successor(infinity). Thus, infinity (ω) is NOT a member of this set. If it was, now consider the set N that has all the same elements as Nat except it does not have ω. Clearly Zero is in N, and if y is in N, Succ(y) is in N, but N is smaller than Nat which is a contradiction. Therefore ω is not in Nat. Data is always finite. Codata can be infinite.
Taken word by word from reddit:
All types in functional programming have two “tools” for using them
- an introduction principle, and
- an elimination principle
The introduction principle lets us turn objects of some other type into objects of this type. The elimination principle lets us turn objects of this type into objects of some other type.
This should be your personal mantra. Examples of this are:
- Function types – Lambda abstraction & Function application
- Sum types – inl and inr & case expressions
- Product types – the function
- Unit – the unit constructor
case () of () ->
- Bool –
False& if-then-else statements
- Empty – (no introduction rule) & reductio ad absurdum, the ability to promote
_|_to any other type.
A requirement for an introduction principle to be such, is that the introduction principle has to be able to turn some type into Nat. We do have a introduction principle for Nat. It is just our two old friends 0 and S (the constructors).
CoNat does NOT have an introduction principle as simple as its constructors. It’s introduction principle is
coNatIntro f includes
a -> CoNat . And that’s enough to ‘turn some type into CoNat’, so it’s an introduction principle.
Elimination principles work just the opposite way, they turn elements of a type into elements of some other type, so in our example they would have the form
Nat -> something, CoNat -> something else.
The elimination principle for our simple Nat type is:
Wait. Isn’t Nat -> a the opposite of a -> CoNat? Yup, so it is even clearer than they are elimination and introduction principles respectively. So it is proven that CoNat is codata, hence infinite but we can only see a finite amount of CoNat. This has some consequences, that can help understand corecursion and codata.
- You can’t define total subtraction. (infinity – infinity will always loop).
- You can’t write total equality. (it loops when comparing infinity == infinity).
- You can’t write a total
elemfunction that works on a CoList (it loops when the list is infinity long and the element isn’t in the list).
Corecursive data structures are defined in terms of itself, corecursive algorithms are defined in terms of its output rather than on its input.
This and lazy evaluation allows “infinitely” large objects without allocating this memory and obviously running out of it prematurely.
Corecursive binary trees
A complex yet easy to understand corecursive structure would be a binary tree. I won’t go in the details, but you can see how this would be defined in terms of a corecursive structure. Here is an example on how they take advantage of corecursion to traverse the tree.