« Type-directed programmingTwo weeks of Haskell »

Solving Cryptograms, Part 2

29/09/08

Permalink 09:24:45 am, 180 words
Categories: Code

Solving Cryptograms, Part 2

Well, I promised that this would be a weekly series, so here goes. This week I don’t have much code to show, but at least it’s a start.

My next step beyond the brute-force solution is to implement a depth-first search sorted by goodness. To do this I break the permutation logic into two pieces. The first function generates all single-character changes to a substitution. The second function sorts these substitutions by score, then investigates those substitutions in order until (at least) one wins. This solution is not quite right, and although I hope that it interacts well with Haskell’s laziness, I rather doubt that it does. I need to step through an execution to see.

Just as a reminder, here are the types I’m using:

type Substitution = [Char]
type Cryptogram = String

The first function has to provide one step of the original permute and enough information for it to be resumed later. In other words, it has to capture the call stack. It’s a bit like call/cc done manually I guess.

permute [] = [[]]
permute l = [x:ys | (x,xs) <- extractions l, ys <- permute xs]

becomes

stepPermute (prefix,l) = [(x:prefix, xs) | (x,xs) <- extractions l]

The prefix argument splits the “string already shuffled” from the “string left to shuffle". Notice that you have to provide the part of string that has already been shuffled beforehand.

Unfortunately, the scaffolding required to use stepPermute is too extensive for my taste. I’m probably missing some nice way to express it. Here is search.

search :: Set String -> Cryptogram -> (String,String) -> [Substitution]
search dict crypt pair = step . sortBy (comparing (lift eval)) $ subs
    where permutations = stepPermute pair
          subs = zip (map (uncurry (++)) permutations) permutations
          lift f = f dict crypt . fst
          step ss | any (lift done) ss = map fst . filter (lift done) $ ss
          step ss = concatMap (search dict crypt) (map snd ss)

I’m not sure of the best order of presentation either, so I threw everything into a where clause. First I get the next set of permutations, then zip the resulting substitutions with the prefix/suffix pairs. This results in such an unwieldy data structure that I define lift to make it easier to use functions I’ve already defined, like eval and done.

The rest of the code should be fairly simple once you get past the bad formatting. I sort the list of 1-permuted substitutions by score (which may be a bad idea, I’m not sure at this point), then inside step I check to see if any are done. If so, then I return the finished substitutions. Otherwise, I recursively search the factorials, paired with their prefixes.

As a reminder, here are eval, done and score again.

eval dict crypt sub = score dict (subst crypt sub)
subst crypt sub = map (\ c -> Map.findWithDefault c c dsub) crypt
    where dsub = Map.fromAscList (zip ['a'..'z'] sub)
score dict text = sum (map (lookup dict) (words text))
    where lookup dict w = if member w dict then length w else 0
done dict crypt sub = eval dict crypt sub == bestScore crypt

Finally, here is a test function using the same data as last time. Notice that search now takes care of all the generation and ordering of states. All the messiness is now in search.

test2 = search dict crypt initial
    where dict = fromAscList ["a", "ad", "bad", "cab", "cad"]
          crypt = "d bda cdb"
          initial = ("", "abcd")

Here is the complete code. It’s getting kind of disorganised, I’ll admit. Cleanup will come later; I’m busy this week. Also, I should test with a bigger alphabet to see if I’ve actually improved on the brute-force solver. I suspect that I haven’t. Plus I get the correct substitution repeated three times, so I think I am exploring parts of the permutation tree more than once, which is not supposed to happen. As always, if you see bug fixes or improvements, let me know.

No feedback yet

Leave a comment


Your email address will not be revealed on this site.

Your URL will be displayed.
(Line breaks become <br />)
(Name, email & website)
(Allow users to contact you through a message form (your email will not be revealed.)
powered by free blog software