« The Anti-Mac Interfaceautojump »

Call by value in Python (and C and Scheme)

19/02/09

Permalink 08:21:59 pm, 1624 words
Categories: Code, Linguistics

Call by value in Python (and C and Scheme)

This is an edit of an e-mail conversation I had with my colleague Josh Herring about the term ‘call-by-value’ as applied to the Scheme-like languages Python, Ruby, Java, Scheme, Javascript, etc. Thanks to Josh for his discussion and help in bringing the argument to some conclusions. Also, note that I am not a programming language theorist, just an enthusiast. This is a complex and fiddly point of semantics, so please provide any needed corrections in the comments.

Conclusions

If you aren’t interested in the complete discussion, here are the conclusions:

  1. Programming language (PL) theorists choose some obtuse names for concepts: ‘call-by-value’ is not very descriptive (without some work to understand semantics that may differ from common usage).
  2. The Python community, mirroring its BDFL, is short on PL theory but very good at usefully describing real-world behaviour.
  3. Wikipedia articles are full of information on any subject you want to know about. Usually the problem is too much information, not too little.

The Python community uses the term ‘call-by-sharing’ to describe Python’s function call semantics, using ‘call-by-value’ for C-style semantics and ‘call-by-reference’ for Visual Basic-style semantics. ‘call-by-sharing’ is a useful term but not strictly accurate—it combines the two concepts of call-by-value and references to mutable objects.

Wikipedia

Let’s start with the Wikipedia article on Evaluation strategies, starting in the Strict section. The three relevant subsections are call-by-value, call-by-reference and call-by-sharing. Call-by-value is defined as “the argument expression is evaluated, and the resulting value is bound to the corresponding variable in the function (frequently by copying the value into a new memory region).”

As the third paragraph says, the definition of ‘value’ is problematic; it is not necessarily the same between C/Pascal and Java/Visual Basic, even though Java claims to be call-by-value. This is precisely right, and the point that is most confusing. Let’s look at C first, then Scheme and finally Java/VB/Python.

C

C provides only call-by-value. However, it provides pointers to allow you to mimic call-by-reference yourself if you want. (Pointers allow you to mimic a large number of other things which are beyond the scope of almost any article, especially this one.) The mimicry, however, is not transparent. Shared objects must be marked at the call site, the receiving site and the change site:

void f(int* i) {
  int j = 12;
  *i = j;
}
int main() {
  int k = 10;
  f(&k);
  return 0;
}

If you print k before and after running f, you’ll see 10, then 12. In contrast, this C++ program, with call-by-reference, does the same thing marking only the receiving site:

void f(int& i) {
  int j = 12;
  i = j;
}
int main() {
  int k = 10;
  f(k);
  return 0;
}

Crucially, in C, you cannot do two things. You cannot, as in C++, change the shared object transparently: you must use the machinery of pointers to mimic call-by-reference. And you cannot changed the passed pointer because it is itself a value that is copied when the function is called. This is still call-by-value; pointers point to shared objects, but the pointers themselves are passed by value. So if you try to alter the pointer passed to a function, nothing happens:

void f(int* i) {
  int j = 12;
  i = &j;
}
int main() {
  int k = 10;
  f(&k);
  return 0;
}

Here, I want to use f to alter k’s address instead of the value that k points to. But this is impossible because k is copied to i inside f and changing i itself has no effect on k. (Although, hmm, this isn’t possible in C++ either because its call-by-reference mechanism manages dereferencing for you and does not let you do it yourself.)

Scheme

On the surface, Scheme doesn’t behave much like C, at least C without pointers. It acts more like Python. Just like Python, actually, and all those other languages that were created 1988-1996 or so:

> (define x (make-string 3 #\o))
> (define (f y)
    (string-set! y 0 #\b))
> x
"ooo"
> (f x)
> x
"boo"

I’ve switched from numbers to mutable strings because of the way the R5RS defines complex objects. In fact, let’s look at the R5RS, the language definition for Scheme. Preview summary: all Scheme values are referred to by variables, which are C pointers except that they support fewer operations.

Storage Model

Variables and objects such as pairs, vectors, and strings implicitly denote locations or sequences of locations. …
Every location is marked to show whether it is in use. No variable or object ever refers to a location that is not in use. Whenever this report speaks of storage being allocated for a variable or object, what is meant is that an appropriate number of locations are chosen from the set of locations that are not in use, and the chosen locations are marked to indicate that they are now in use before the variable or object is made to denote them.

Procedures

When the procedure is later called with some actual arguments, the environment in which the lambda expression was evaluated will be extended by binding the variables in the formal argument list to fresh locations, the corresponding actual argument values will be stored in those locations, and the expressions in the body of the lambda expression will be evaluated sequentially in the extended environment. (emphasis mine)

So it looks like the Scheme’s model is actually the same as C’s: storage is a set of locations that have objects stored in them. Then pointers/variables, stored in different locations, refer to the objects’ locations. Calling a function creates fresh copies of the pointers, which still point to the original location. In other words, Scheme always passes pointers (calling them variables); these are the things that are copied to fresh locations, not the objects themselves.

There is one other important difference with C: variables only have two operations: reference (C’s *i) and (set! i j) (C’s i = &j;). There is no equivalent to C’s *i = 12;, which is how C++’s i = 12 works (where i is passed by reference).

However, there a number of built-in procedures that modify the object that your variable points to, like set-car! and string-set!. (string-set! s 0 #\c) does the equivalent of C’s *i = ‘c’; but is a black box that cannot be written in Scheme itself. That’s why I switched to strings—there are no procedures that alter numbers. This is precisely Python’s model. In fact, trying to fake string-set! with set! on immutable strings does not work:

> (define x "ooo")
> (define (f y)
    (set! y "boo"))
> x
"ooo"
> (f x)
> x
"ooo"

Python

And All The Rest

Before starting, to re-assure you that the semantics of Python and Scheme are the same, here’s nearly the same code in Python (I use lists since strings are never mutable in Python):

x = list('ooo')
def f(y):
  y[0] = 'b'
print x
f(x)
print x
=>
['o', 'o', 'o']
['b', 'o', 'o']

But

x = list('ooo')
def f(y):
  y = list('boo')
print x
f(x)
print x
=>
['o', 'o', 'o']
['o', 'o', 'o']

So it’s clear that the languages behave the same way. Is there any difference at all? Let’s look at the Python language definition (version 3.0):

Naming and Binding

Names refer to objects. Names are introduced by name binding operations. Each occurrence of a name in the program text refers to the binding of that name established in the innermost function block containing the use.

The actual text is a lot longer than the Scheme definition because the Python definition has a lot more work to do, being a non-minimal language with classes, etc. It is also is a lot less specific. A more abstract definition is fine, and Python isn’t renowned for its precise definition in the first place. It leaves name-binding open to interpretation by the implementer, but one valid interpretation is that names are implemented with pointers and copied with binding.

Conclusions

For Real This Time

Let’s recap.

  1. C is call-by-value, but you can mimic call-by-reference with pointers. Not transparently, but who cares? This is C.
  2. Scheme is call-by-value. It behaves very differently from C, in fact, identically to Python, Java, Ruby, Javascript, etc. But it does this by defining variables as values too, ones that point to actual objects. In other words, all Scheme variables are pointers. You don’t get to directly access objects.
  3. Python behaves the same way as Scheme. Its language definition can be interpreted as the same as Scheme’s: names are pointers and binding copies pointers to a new namespace.
  4. Therefore, Python is call-by-value.

So why use the term ‘call-by-sharing’? Well, because it’s a useful concept now. Today. Most modern languages combine call-by-value with pointers to mutable objects in precisely the way Python does. This results in a practical difference from old languages like C, which mimic it manually when needed.

However, in 100 years, the programming language world will surely be different, and call-by-value versus call-by-sharing may be a distinction that is no longer relevant. By that time most languages will be non-strict (lazy) and call-by-value will be something that only old bare-metal eager languages do, or current languages provide as an escape hatch for efficiency. In that space, Python and related languages will be squeezed from below by C and from above by some lazy language like Lazy#.NET.2.0; most will probably be long gone. But programming languages will still have theorists (and, unfortunately, Microsoft and C). The theorists will still need the call-by-value/call-by-reference distinction for eager languages, just not as often.

Of course this is fine. Theory is more concerned with being correct forever than useful in context. And human languages should be free to adapt to changing environments. The successors of the Python community will probably provide a lot of useful vocabulary just like they do today.

3 comments

Comment from: Aravind [Visitor]
Excellently written article! One of the better ones on the entire call-by-value/refernce/sharing toss-up.
24/12/09 @ 17:57
Comment from: Pilipinas Win na Win [Visitor] · http://www.facebook.com/group.php?gid=106501072737625
Might you be interested on a website link swap? Generously mail myself a message.
28/07/10 @ 12:20
Comment from: Pilipinas Win na Win [Visitor] · http://www.facebook.com/group.php?gid=106501072737625
I just stopped at your website from Bebo. Do you provide certain type of donation box where I could possibly render monetary gift in PayPal? I would desire to reward you for your website content. :)
28/07/10 @ 23:17

Leave a comment


Your email address will not be revealed on this site.

Your URL will be displayed.
(Line breaks become <br />)
(Name, email & website)
(Allow users to contact you through a message form (your email will not be revealed.)
powered by free blog software