| « November LAN 2009 | Good things about FF12's battle system » |
There’s really no such thing as a pure AND practical programming language. Even Haskell gives you ways to read and write from files, and it even allows you to update variables and swap array elements. It does mark the impure parts of your program with a special type, but once you sit down to do imperative programming, it doesn’t stop you. Of course, you’re going right back to the land of off-by-one errors, but sometimes there is no functional algorithm that provides the same performance characteristics as an imperative one. So you have to.
Disclaimer: One of my colleagues points out, “or you have to write it in C, then link to it using the FFI". Which is a very good point. I’ll give an example algorithm below and let you decide what you think of imperative programming in Haskell. It’s not particularly pretty.
The algorithm in I’ll use as an example is insertion sort, the first algorithm in Introduction to Algorithms. First I’m going to give an implementation in Python, which is almost exactly the same as the pseudocode in the book except that it’s executable, so I can test that it works on real inputs.
def insertion(l):
for j in range(1, len(l)):
key = l[j]
i = j - 1
while i > -1 and l[i] > key:
l[i+1] = l[i]
i -= 1
l[i+1] = key
return l
It’s notable that I’ve never written an initial version of this particular algorithm without some kind of off-by-one bug; Introduction to Algorithms uses 1-based array bounds (which are a mistake), so I can’t just copy the code directly. Like I said, off-by-one errors come with the territory, so it shouldn’t be surprising to run into them.
Anyway, the Python version doesn’t look very different even from C, except for the iterator-based for and distinct lack of braces/semi-colons. Let’s look at Haskell next.
Well, wait. First we have to write while and for. Hoogle doesn’t show a while or numeric for, so I have to write them myself. This is possible because of lazy evaluation of actions, but it certainly brings back shades of Lisp macros. In other words, I’m certainly making mistakes in my implementation, so feel free to point out how I could improve my for and while. (succ is equivalent to (1 +), except it works on any Enum type, eg Int, Char, ADTs…)
forM start stop _ | start >= stop = return ()
forM start stop action = do
action start
forM (succ start) stop action
whileM condition action = do
condition >>= test
where test False = return ()
test True = do action
whileM condition action
Actually, I might write whileM like so if I were worried about offending the sensibilities of non-Haskell programmers:
whileM condition action = do
res <- condition
if res
then do
action
whileM condition action
else
return ()
But that’s a lot of newlines for about the same look…Anyway, let me show you the actual insertion sort.
insertion :: [Word32] -> UArray Int Word32
insertion l = runSTUArray $ do
a <- newArray (0,length l - 1) 0
mapM_ (uncurry (writeArray a)) (zip [0..] l)
forM 1 (length l) (\ j -> do
key <- readArray a j
i <- newSTRef (pred j)
whileM (do _i <- readSTRef i
if _i < 0
then return False
else do a_i <- readArray a _i
return $ a_i > key)
(do _i <- readSTRef i
writeArray a (succ _i) =<< readArray a _i
i `modifySTRef` pred)
_i <- readSTRef i
writeArray a (succ _i) key)
return a
The first line wraps the whole function in runSTUArray, which lets you use imperative, strict code on an array, after which it freezes it and returns it to the pure, lazy world of Haskell.
The next two lines convert a linked list to an array. The tuple (0,length l - 1) describes the bounds of the array, so if you want to make the 1-based mistake, you can. I didn’t, so I started at 0. The last argument to newArray fills the array with 0, so on the next line I have to fill in the array with the list’s contents.
Then inside the for, the first thing I do is read the array. Unlike the nice syntax you get with C et al, a[j], you have to call a function: readArray a j. Again, this reminds me a lot of Lisp. Next I create an STRef for i. This is a lot like a pointer in C, or maybe a smart pointer in C++. Except, again, no nice *i syntax; you have to say readSTRef i. (pred is the opposite of succ: (+ (-1)))
The while is the most annoying part of the code. Remember when I said that runSTUArray is strict? Well, that means that the equivalent to the C i > -1 && a[i] > key no longer works, because (&&) isn’t lazy here. Both sides of the (&&) get evaluated, which throws an array index error when i = -1 because a[i] is invalid. So I have to do this Pascal-esque workaround with if. Also, getting the value that i refers to has to go on its own line. The upshot is that one fairly compact expression bloats out to 5 lines. The strictness of (&&) alone is enough to deflate the Haskellite’s joking claim that it’s the best imperative language in addition to being the best functional one.
The while body is pretty straightforward. The line writeArray a (succ _i) =<< readArray a _i looks more like C than it deserves; it could easily have been readArray a _i >>= writeArray a (succ _i), which reads completely backwards for an imperative program.
Finally, you may have noticed that type annotation at the top. I never figured out how to make this function work for any type a. It could be partly that I return a UArray: the U stands for unboxed, which may not play well with the type inferencer since you probably can’t slot an arbitrary a in place of Word32. I’m not sure if there’s a typeclass for unboxed types or what. Regardless, I ended up with an insertion sort over unboxed 32-bit integers. That’s got to be faster than the Python code, but it’s not as useful either.
For completeness, here are my imports. I’m not sure I used all of them.
import Control.Monad (mapM_, liftM2) import Data.Array.MArray (getBounds, newArray, readArray, writeArray) import Data.STRef (newSTRef, readSTRef, writeSTRef, modifySTRef) import Data.Array.ST (runSTUArray) import Data.Word (Word32) import Data.Array.Base (UArray,MArray)
In case you’re curious, here is a functional insertion sort. It allocates more and the stack might overflow for longer lists. Well, you shouldn’t be sorting big lists with insertion sort, so stack overflow is less of a concern. However, in addition to being O(n**2) in time, it is O(n**2) in space. But it is much prettier, and it worked the first time, unlike the imperative version.
insertion' [] = []
insertion' (x:xs) = insert x (insertion' xs)
insert x [] = [x]
insert x (y:ys) | x < y = x : y : ys
| otherwise = y : insert x ys