Here’s some runtime type checking for Python. It lets you define the point at which Python throws a type error. Instead of happening deep inside a library function, it will be at the point at which the annotated function is called:
@typecheck(str, str, int) # takes two strings and returns an int
def f(s1,s2):
    return int(s1) + (int(s2) if s2 else 0)
And here’s the actual code:
from util import fnc

def _check(x, t, errmsg):
    xtype = type(x)
    ttype = type(t) # used for list/dict/tuple syntax
    if t is True: return
    if t is None and x is None:
        return
    elif ttype is list and xtype is list:
        _check(x[0], t[0], errmsg)
    elif ttype is dict and xtype is dict:
        xk, xv = x.iteritems().next()
        tk, tv = t.iteritems().next()
        _check(xk, tk, errmsg)
        _check(xv, tv, errmsg)
    elif ttype is tuple and xtype is tuple:
        for y,tt in zip(x,t):
            _check(y,tt, errmsg)
    else:
        try:
            assert isinstance(x, t)
        except (AssertionError, TypeError), e:
            raise TypeError(errmsg)

def typecheck(*types, **kwtypes):
    """[TypeRef] -> (* -> *) -> TypeChecked (* -> *)"""
    def deco(f):
        name = fnc.fn_name(f)
        def checker(*args, **kwargs):
            for i,(a,t) in enumerate(zip(args, types)):
                _check(a,t,"Argument %s, %r, is not %s" % (i,a,typerepr(t)))
            try:
                for kw,v in kwargs.items():
                    _check(v, kwtypes[kw],
                           "Keyword argument %r, %r, is not %s" %
                           (kw,v,kwtypes[kw]))
            except KeyError, e:
                raise TypeError("%s got an unexpected keyword argument %r" %
                                (name,e.args[0]))
            res = f(*args, **kwargs)
            _check(res, types[-1],
                   "Return value %r is not %s" % (res, typerepr(types[-1])))
            return res
        return fnc.named(name, checker)
    return deco
Typechecking is so un-Pythonic that I need to explain myself. But first, an analysis of the code. The first function, _check, does most of the work. It consists of three sections: pre-work, recursion on type syntax, and the actual isinstance check.
    # pre-work
    xtype = type(x)
    ttype = type(t) # used for list/dict/tuple syntax
    if t is True: return
    if t is None and x is None:
        return
You can see that passing True makes the checker pass without checking. A type of None means that the value *must* be None. That’s meant to be used for functions that return nothing. The types of x and t are also stored for future comparisons. Even though t is meant to be a type, because of type syntax like [{str:int}], there will be lists and dicts mixed in with the real types.
    elif ttype is list and xtype is list:
        _check(x[0], t[0], errmsg)
    elif ttype is dict and xtype is dict:
        xk, xv = x.iteritems().next()
        tk, tv = t.iteritems().next()
        _check(xk, tk, errmsg)
        _check(xv, tv, errmsg)
    elif ttype is tuple and xtype is tuple:
        for y,tt in zip(x,t):
            _check(y,tt, errmsg)
Like I said, t is sometimes a value rather than a type, and these three clauses check for list values, dict values and tuple values, recursing on _check to make sure the types they contain are correct. Notice that lists and dicts are assumed to be homogeneous; only the first item is checked. Also notice that because of Python’s semantics and this setup, None is not the subclass of everything and cannot be passed for, say, [int]. Use of None for a failure value is kind of a bad idea anyway, and if you’re up for type checking surely you won’t balk at *this* requirement.
    else:
        try:
            assert isinstance(x, t)
        except (AssertionError, TypeError), e:
            raise TypeError(errmsg)
The actual type check is surrounded by a try/except because it can go wrong in two ways. First, the type check can fail normally, throwing an AssertionError. For example, isinstance(1.0, int) returns False, so the assert fails. Second, there can be a mismatch in the number of recursions necessary to arrive at a simple type, throwing a TypeError. Notice that the recursion conditions in the second section are very strict: both x and t must be the same complex type. So if x is 12 but t is [int], then isinstance(x, [int]) will raise a TypeError because [int] is not a type. Either way, it’s the same error: type checking fails.
typecheck itself is annoying and nested, but that’s just the way you have to write Python decorators with arguments. If you just look at the code inside the thrice (!) nested function checker, there are three sections: check the numbered arguments, check the keyword arguments, then run the function and check its results.
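If the triple nesting is hard to picture, here is a stripped-down Python 3 sketch of the same shape. It is not the decorator above: flat_typecheck is a made-up name, it only does flat isinstance checks on positional arguments and the return value, and functools.wraps stands in for fnc.named on the assumption that fnc.named simply gives the wrapper the original function's name.

```python
import functools


def flat_typecheck(*types):
    """Level 1: receives the annotation arguments (last one is the return type)."""
    def deco(f):
        """Level 2: receives the function being decorated."""
        @functools.wraps(f)  # assumed stand-in for the post's fnc.named helper
        def checker(*args, **kwargs):
            """Level 3: runs on every call to the decorated function."""
            # zip stops at the shorter sequence, so the return type is not
            # compared against an argument when the counts line up.
            for i, (a, t) in enumerate(zip(args, types)):
                if not isinstance(a, t):
                    raise TypeError("Argument %s, %r, is not %s"
                                    % (i, a, t.__name__))
            res = f(*args, **kwargs)
            if not isinstance(res, types[-1]):
                raise TypeError("Return value %r is not %s"
                                % (res, types[-1].__name__))
            return res
        return checker
    return deco


@flat_typecheck(str, str, int)  # takes two strings and returns an int
def f(s1, s2):
    return int(s1) + (int(s2) if s2 else 0)


print(f("1", "2"))   # 3
print(f.__name__)    # f
```

Calling f(1, "2") raises the TypeError at the call site, before int() ever sees the argument, which is the whole point of the annotation.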
So, as for an explanation and such. Well…this is what I’m going to use for the second stage of prototyping complex ad-hoc code, where I’ve written something that works and I need to come back several weeks later and remember what my data structure is. Let me show you a relatively simple example, a cleaner for program output that returns a graph of strings :: {str:[str]}.
@typecheck(str, int, [(str,str,str)])
def clean(outname, iterations):
    return [(src,dst,sig)
            for (src,dst,d_avg,dots,sig) in group(list(open(outname)), 5)]

@typecheck([(str,str,str)], int, {str:[str]})
def graph(edges, iterations):
    return dct.collapse_pairs((src,dst) for (src,dst,sig) in edges
                              if float(sig) / iterations >= 0.95)

@typecheck(str, int, {str:[str]})
def process(outname, iterations):
    return graph(clean(outname, iterations), iterations)
I’ve been using comments for this, but they have all the problems of comments: no machine checks them, so they are wrong in the first place or fall out of date. Other programmers will probably ignore them (though they may ignore the annotations too). However, the above example is really too simple. The only real place I miss type annotations is when I come up with a complex representation for something and then forget exactly what it was later. My favourite example is a tree that is a pair of (node, [children]), where each child is a pair of (node, [children]), where each child is a pair of (node, [children]). Oops…that type is infinite. But it’s the default way to represent lists in Lisp, so I have a soft spot for it. Here is a tree parser that uses it. Notice that I stop the infinite type at the first level with [object], so it’s not that descriptive.
@typecheck([str], {str:[[(str, [object])]]})
def read_ice(lines):
    @typecheck(int, [str], [(str, [object])])
    def loop(n, lines):
        @typecheck([str], (str, [object]))
        def clean_heading(lines):
            line,childs = carcdr(lines)
            return (clean(line),loop(n+1,childs)) if childs else clean(line)
        return map(clean_heading,
                   splitby(lambda line: n==indent(line), lines, True))
    return dct.collapse(splitby(lambda line: '<sent>' in line,
                                lines, first=True),
                        compose(speaker_code, car, cdr),
                        lambda sent:loop(0, filter(useful, cdr(cdr(sent)))))
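Since the (node, [children]) type can't be written out with this syntax, one way around it is a recursive predicate instead of an annotation. This is only a sketch of the idea in Python 3, not something the decorator supports: is_tree is a hypothetical name, and I'm assuming string nodes and that a leaf is written as (node, []).

```python
def is_tree(x):
    """True if x is a (str, [tree, ...]) pair, checked all the way down."""
    return (isinstance(x, tuple)
            and len(x) == 2
            and isinstance(x[0], str)    # the node label
            and isinstance(x[1], list)   # the children
            and all(is_tree(child) for child in x[1]))


tree = ("root", [("left", []), ("right", [("leaf", [])])])
print(is_tree(tree))             # True
print(is_tree(("root", ["x"])))  # False: a child must itself be a pair
```

Unlike the [object] cutoff, this walks every level and every child, so it costs time proportional to the size of the tree on each call.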
So here are some disclaimers. Don’t use this if you need a real type system with (parametric) polymorphism/type classes/type variables. Don’t use this for simple code. def useful(line): return 'ignore' not in line and line[0] != "[" is obviously str -> bool, so don’t annotate it. Don’t use this for code that gets called a million times; the annotations make your code slower, not faster. Don’t use this to annoy your fellow programmers. The only real purpose for this code is to get a few of the benefits of type annotations, commenting mainly.
If you are serious about using this, you probably want the real typecheck library, which provides typeclasses and polymorphism and a possibly more Pythonic interface (two annotations, @accepts and @returns). I did some research and there are basically three approaches to Python type checking: little toys like mine, bizarre OO monstrosities (also toys but very configurable), and the typecheck library.
And now for something completely different.
If you’re planning to rotate your monitor to portrait (and I recommend that you do), be prepared for lower performance. I recently switched my secondary monitor back to portrait and iTunes’ album art view is pretty stuttery on it. As soon as I moved iTunes back to my main monitor, everything was silky smooth again. (I couldn’t verify the effect before this year because my old laptop is so slow that both monitors stuttered.) It’s worth it, though, for the usability benefits. Really tall documents.