Monday, February 26, 2007

locals() and free variables

I've been struggling with some odd corner cases of free variables and eval/exec in Python.

There is a long-standing bug report that you can't access free variables when you created a nested code block with eval or exec--for example, putting a lambda in an expression passed to eval. It works if the variable happens to be free in the text of the function containing the eval, but only then. The bug report claimed that Scheme worked differently, but I don't think Scheme provides a way to capture an arbitrary environment to pass to eval; it only provides access to top-level environments.

The name locals() is misleading. It returns free variables as well as local variables. The name suggests that this behavior is wrong, but it is necessary to make exec and eval work with free variables. The function returns the names visible in the current scope.

The discussion here is actually about much more than locals(). The same basic issues arise when you use exec or import * in a block or when you use the debugger or some other trace function installed via sys.settrace(). In all these cases, we extract variables in a dictionary to support introspection.

There is still a problem with locals(), which results from the use of locals() in debugging. The implementation makes sure that changes made to the locals dict are reflected in the running program in many cases. In old versions of Python, local variables were stored in a dictionary and locals() returned the actual dictionary. This feature is useful in the debugger. When the implementation changed to use a simple C array instead of a dict, the interpreter arranged to copy variables back and forth between the dictionary and the array when the debugger was used. (This behavior is also needed to make features like exec and import * work.)

In many cases, it is fine to copy free variables into the dictionary returned by locals. If changes are made to those variables, they can be written back to the appropriate part of the closure rather than creating local variables that shadow the free variables.

Class namespaces pose a serious problem in the current CPython implementation. The class stores local variables in a dictionary. When the body of the class finishes execution, this dictionary is passed to the class constructor. Keys in the dictionary become attributes of the class. If you call locals(), free variables could be copied into the dictionary for access in the debugger or introspection. But if they are copied, they will become attributes of the class, which was not intended. It's a messy problem, because it is possible, though inscrutable, for
the same name to be used for a free variable in a method and a class attribute; they have the same name, but they refer to different bindings.

For now, we are fixing this in Python 2.x by omitting free variables from the dictionary returned by locals() when locals() is called in a class block. This makes introspection more difficult but prevents locals() from polluting the class namespace.

What are some other solutions for the class problem? It might be possible to return a copy of the class dictionary with the free variables added. Changes to this dictionary are written back to the real dictionary or free variables when you run in the debugger. It wouldn't be possible to reflect all changes to the dictionary; it would only work in contexts like debugging or using exec. I'm not sure if that would be too confusing.

We could also make two different functions, where locals() and vars() seem like reasonable names. The locals() could return the actual dictionary object for classes, without adding the free variables. The vars() could return a dictionary will all variables, but it would be a copy. Then client code could get whatever they wanted.

1 comment:

Anonymous said...

You write very well.