Jeremy Hylton: Inconceivable: An example code transformation

I wrote a simple transformer that works for a trivial case. It demonstrates the basic structure of a transformer and highlights some hard issues still to resolve. It is much too hard to generate the AST nodes (and corresponding tokens) for the newly generated code.

Let's start with the easy part. The HasKeyTransformer looks has all calls of a has_key attribute. (There should be better type analysis to try to cope with cases where there is a non-dict with a has_key() method.) It starts with an in-order walk of the tree looking for has_key() calls. The search() method performs the search, defering to the is_has_key_call() predicate to test each node. The predicate looks for a Call node, where the function being called is an attribute of another object; i.e. the call is of the form x.y(...). If the y attribute is has_key, then the predicate returns true.

import concrete

class HasKeyTransformer:

  def search(self, tree):
      for node in concrete.TreeIter(tree):
          if self.is_has_key_call(node):
              yield node

  def is_has_key_call(self, node):
      if not isinstance(node, _ast.Call):
          return False
      callee = node.func
      if not isinstance(callee, _ast.Attribute):
          return False
      return callee.attr == "has_key"

The next step is to replace the Call node with a Compare node, using the object with the has_key method as the right-hand side of an "in" comparison. The new Compare node can re-use some of the old tokens and AST nodes but needs to generate its own "in" token. In some cases, it may need to add parentheses, too. The code turns ugly here.

    def replace(self, node):
       # Replace the Call node with a Compare node.  The
       # base of the callee becomes the RHS and one argument
       # in the call becomes the LHS.

       the_dict = node.func.value
       the_key = node.args[0]
       strip_positions(the_key)

       new = _ast.Compare()
       new.left = the_key
       the_in = _ast.In()

       # We need to synthesize a full token for "in".  Yuck!
       the_in.concrete = [(3, "in")]
       new.ops = [the_in]
       new.comparators = [the_dict]
       new.concrete = [new.left] + new.ops + new.comparators
       return new

The strip_positions() function removes the old position information from all the tokens representing the key. We are moving the key to the left of "in," so none of those positions are correct. We'll have to count on untokenize to do something reasonable. We ought to be able to preserve relative spacing, but that's a detail. It's hard to hook up the new node, because we need to fill in the concrete list and all the other node-specific attributes. It should be possible to generate helper functions that take the concrete tokens and nodes and hook up all the attributes correctly. The code will look better then. We could also have token helper functions so that you can avoid writing tuples.

    def replace(self, node):
        # Replace the Call node with a Compare node.  The
        # base of the callee becomes the RHS and one argument
        # in the call becomes the LHS.

        the_dict = node.func.value
        the_key = node.args[0]
        strip_positions(the_key)

        return concrete.Compare(the_key, [token.STRING("in")], [the_dict])

Jeremy Hylton: Inconceivable

Sunday, August 27, 2006

An example code transformation

No comments:

Search This Blog

Followers

Blog Archive

About Me