« November LAN 2009Good things about FF12's battle system »

Imperative programming in Haskell

16/11/09

Permalink 10:00:44 am, 1175 words
Categories: Code

Imperative programming in Haskell

There’s really no such thing as a pure AND practical programming language. Even Haskell gives you ways to read and write from files, and it even allows you to update variables and swap array elements. It does mark the impure parts of your program with a special type, but once you sit down to do imperative programming, it doesn’t stop you. Of course, you’re going right back to the land of off-by-one errors, but sometimes there is no functional algorithm that provides the same performance characteristics as an imperative one. So you have to.

Disclaimer: One of my colleagues points out, “or you have to write it in C, then link to it using the FFI". Which is a very good point. I’ll give an example algorithm below and let you decide what you think of imperative programming in Haskell. It’s not particularly pretty.

The algorithm in I’ll use as an example is insertion sort, the first algorithm in Introduction to Algorithms. First I’m going to give an implementation in Python, which is almost exactly the same as the pseudocode in the book except that it’s executable, so I can test that it works on real inputs.

def insertion(l):
    for j in range(1, len(l)):
        key = l[j]
        i = j - 1
        while i > -1 and l[i] > key:
            l[i+1] = l[i]
            i -= 1
        l[i+1] = key
    return l

It’s notable that I’ve never written an initial version of this particular algorithm without some kind of off-by-one bug; Introduction to Algorithms uses 1-based array bounds (which are a mistake), so I can’t just copy the code directly. Like I said, off-by-one errors come with the territory, so it shouldn’t be surprising to run into them.

Anyway, the Python version doesn’t look very different even from C, except for the iterator-based for and distinct lack of braces/semi-colons. Let’s look at Haskell next.

Well, wait. First we have to write while and for. Hoogle doesn’t show a while or numeric for, so I have to write them myself. This is possible because of lazy evaluation of actions, but it certainly brings back shades of Lisp macros. In other words, I’m certainly making mistakes in my implementation, so feel free to point out how I could improve my for and while. (succ is equivalent to (1 +), except it works on any Enum type, eg Int, Char, ADTs…)

forM start stop _ | start >= stop = return ()
forM start stop action = do
  action start
  forM (succ start) stop action

whileM condition action = do
  condition >>= test
  where test False = return ()
        test True = do action
                       whileM condition action

Actually, I might write whileM like so if I were worried about offending the sensibilities of non-Haskell programmers:

whileM condition action = do
  res <- condition
  if res 
    then do 
      action
      whileM condition action
    else
      return ()

But that’s a lot of newlines for about the same look…Anyway, let me show you the actual insertion sort.

insertion :: [Word32] -> UArray Int Word32
insertion l = runSTUArray $ do
  a <- newArray (0,length l - 1) 0
  mapM_ (uncurry (writeArray a)) (zip [0..] l)

  forM 1 (length l) (\ j -> do
    key <- readArray a j
    i <- newSTRef (pred j)
    whileM (do _i <- readSTRef i
               if _i < 0
                 then return False
                 else do a_i <- readArray a _i
                         return $ a_i > key)
           (do _i <- readSTRef i
               writeArray a (succ _i) =<< readArray a _i
               i `modifySTRef` pred)
    _i <- readSTRef i
    writeArray a (succ _i) key)
  return a

The first line wraps the whole function in runSTUArray, which lets you use imperative, strict code on an array, after which it freezes it and returns it to the pure, lazy world of Haskell.

The next two lines convert a linked list to an array. The tuple (0,length l - 1) describes the bounds of the array, so if you want to make the 1-based mistake, you can. I didn’t, so I started at 0. The last argument to newArray fills the array with 0, so on the next line I have to fill in the array with the list’s contents.

Then inside the for, the first thing I do is read the array. Unlike the nice syntax you get with C et al, a[j], you have to call a function: readArray a j. Again, this reminds me a lot of Lisp. Next I create an STRef for i. This is a lot like a pointer in C, or maybe a smart pointer in C++. Except, again, no nice *i syntax; you have to say readSTRef i. (pred is the opposite of succ: (+ (-1)))

The while is the most annoying part of the code. Remember when I said that runSTUArray is strict? Well, that means that the equivalent to the C i > -1 && a[i] > key no longer works, because (&&) isn’t lazy here. Both sides of the (&&) get evaluated, which throws an array index error when i = -1 because a[i] is invalid. So I have to do this Pascal-esque workaround with if. Also, getting the value that i refers to has to go on its own line. The upshot is that one fairly compact expression bloats out to 5 lines. The strictness of (&&) alone is enough to deflate the Haskellite’s joking claim that it’s the best imperative language in addition to being the best functional one.

The while body is pretty straightforward. The line writeArray a (succ _i) =<< readArray a _i looks more like C than it deserves; it could easily have been readArray a _i >>= writeArray a (succ _i), which reads completely backwards for an imperative program.

Finally, you may have noticed that type annotation at the top. I never figured out how to make this function work for any type a. It could be partly that I return a UArray: the U stands for unboxed, which may not play well with the type inferencer since you probably can’t slot an arbitrary a in place of Word32. I’m not sure if there’s a typeclass for unboxed types or what. Regardless, I ended up with an insertion sort over unboxed 32-bit integers. That’s got to be faster than the Python code, but it’s not as useful either.

For completeness, here are my imports. I’m not sure I used all of them.

import Control.Monad (mapM_, liftM2)
import Data.Array.MArray (getBounds, newArray, readArray, writeArray)
import Data.STRef (newSTRef, readSTRef, writeSTRef, modifySTRef)
import Data.Array.ST (runSTUArray)
import Data.Word (Word32)
import Data.Array.Base (UArray,MArray)

In case you’re curious, here is a functional insertion sort. It allocates more and the stack might overflow for longer lists. Well, you shouldn’t be sorting big lists with insertion sort, so stack overflow is less of a concern. However, in addition to being O(n**2) in time, it is O(n**2) in space. But it is much prettier, and it worked the first time, unlike the imperative version.

insertion' [] = []
insertion' (x:xs) = insert x (insertion' xs)
insert x [] = [x]
insert x (y:ys) | x < y = x : y : ys
                | otherwise = y : insert x ys

2 comments

Comment from: jtra [Visitor] Email · http://jtra.cz
My solution is here:
http://jtra.cz/download/haskell/imperative.hs.html (remove .html to download non-html source)
17/11/09 @ 16:40
Comment from: sandersn [Member]
Thanks very much. This is great, and has a lot of useful utilities that I didn't factor out of my solution.

I see you even figured out my type problem.
17/11/09 @ 16:52

Leave a comment


Your email address will not be revealed on this site.

Your URL will be displayed.
(Line breaks become <br />)
(Name, email & website)
(Allow users to contact you through a message form (your email will not be revealed.)
powered by b2evolution free blog software