Preventing irreversible updates from GET with types

If you're a functional programmer or an Haskell coder then you won't learn anything new from this post. But, if you're learning functional programming or if you're just wondering why static typing is useful then you can continue reading. People from the dynamically typed language world generally claim that you do not need static typing. It is too annoying and you can rely on unitary testing for similar results. I'd just like to show on a very simple example that static typing is not only very useful but also very convenient. Static typing and unitary testing have both their use.

I am just assuming that the reader knows how to read Haskell code and typecheck it in ghci.

The problem we want to solve is : we don't want to have to write unitary tests for every HTTP handler we have in our code just to check that no irreversible update can be run in reaction to an HTTP GET.

First, we need an environment where we can write database actions : a database monad. It is not a hack. It is not breaking the functional paradigm. It is just a controlled environment.

data DB s a

I am not writing the definition of the database monad. So there is no data constructor. No = after the type definition. It is a phantom type. In a real implementation it would have to be a concrete type.

The meaning of the type is easy : we have a program that will do some database accesses and return a value of type a when it is executed. But, while it is not executed we can manipulate this program like any other value : concatenate programs to build more complex ones etc... Just the advantage of having a pure lazy functional language.

The type s is used to track the kind of the database query : one which is irreversible or one which is idempotent (you can run it twice and you'll get the same result since there is no side effects and no change of the database). So, let's define that s is a type which has a property : the property of defining the kind of the database access

class QueryType a

And let's define two kind of queries :

data Irreversible
data Idempotent
instance QueryType Irreversible
instance QueryType Idempotent

Let's define two new additional types just to be able to write some database actions:

data Request
data Row

In a real code, you would have to define Row and Request but not Idempotent and Irreversible. Idempotent and Irreversible are phantom types. There are used to enforce some constraints at compile time only. You will never have to use a value of type Irreversible at runtime so no need to provide a data constructor.

Now, we can imagine how may be written a get function:

get :: QueryType s => Request -> DB s Row
get = undefined

This get function is not imposing anything to the s type variable except the fact that it must be used to describe the type of the query.

On the contrary, a post request is an irreversible access and it is tracked with the s type variable which is replaced by the specific value Irreversible.

post :: Request -> DB Irreversible Row
post = undefined

So, now, if you write a database program like:

bigquery = do
 a <- get myRequest
 let b = process a
 post b

Then, bigquery will have the type DB Irreversible Row.

So, if any post function is used anywhere in the query, and whatever the complexity of the query, then its type will contain Irreversible. If you use only get queries then the type will not be constrained.

Now, we impose that to run a database program (and really execute the database query) we need an idempotent query ! It looks stupid because we also need to change the content of the database from time to time. But wait a little for the answer.

runDB :: DB Idempotent a -> a
runDB = undefined

What we need is a way to say : I have checked that the query is executed in reaction to an HTTP POST so you can forget the real value of the type constraint (irreversible or idempotent) and just assume it is idempotent since it does not matter anymore.

Such a function could be:

checkPost :: (MonadCGI m, MonadError String m, QueryType s) 
  => DB s a -> m (DB Idempotent a)
checkPost = undefined

The type is a bit complex.

First, we see that the value of the type variable s does not matter anymore and is replaced by Idempotent. We also see that the result is in a monad m. There are two reasons for this. First, we need to access to the HTTP header to know if the request is a post or get. So, m is a MonadCGI. Second, if the access is not a POST then the function should fail and the database action should not be executed. So, m is also a monad error (where the error is just a String).

The checkPost above is not perfect : you can apply it to a get action and it will fail if it is called in the context of a GET since checkPost is not checking the value of s. We could use a class to change the behavior of checkPost according to the value of the type variable s.

So, with such a system and assumed that the implementation of DB s a is hidden in a module and manipulated only through the exported API, it is possible to prevent the use of any irreversible update from an HTTP GET whatever the complexity of the HTTP handler ! The code will just not compile ! And you're sure you will never forget any case ! The code will just not compile if you forget some.

It is not difficult to write such kind of code once you have understood classes and phantom types and it is much more convenient than having to write unit tests just to check a condition that is better left to the typechecker.

(Irreversible was probably not the right word to use for post queries ... also the code in this post will typecheck but not run since I have not provided any implementation for the functions which are all undefined. To typecheck you will have to import Network.CGI and Control.Monad.Error).

(This post was imported from my old blog. The date is the date of the import. The comments where not imported.)

blog comments powered by Disqus