Sunday, February 26, 2006

Rationale for Conditional Expression

Guido explained part of his rationale for adding a conditional expression to Python 2.5. Many users have been attracted to the subtly flawed "a and b or c" form. (It returns c if a is True and b evaluates to False.) Adding a direct form will prevent bugs in their code.

I still don't like it. The proposed form "val if test else val" is too confusing. The test does not belong in the middle. The right thing is "if test then value else val," which Guido is not doing because of the cost of introducing a "then" keyword. (It is also excruciating to get the parser to recognize the difference between a bare "if ... then ..." expression and an "if ...:" statement.)

Exposing AST to Python Programs

Yesterday we decided to expose the AST as a set of Python objects, one class for each constructor in the AST with abstract base classes for each type. Martin von Löwis checked it in today! The code recursively copies the internal C representation to Python objects.

Python Compiler Sessions

We had a birds of a feather (BOF) session about the new Python bytecode compiler. We had a large crowd -- as many as 50 people. The discussion was open-ended and we didn't make many specific decisions.

One result was to come up with a list of tasks that should be completed to make the AST useful for Python programmers:
  • We need generic API to child nodes in AST. This ai would have a function to get all the child nodes. The current API exposes only named attributes for specific children.
  • It would be great if the AST could be used in conjunction with tools that do source-to-source transformations. This use requires an association between the AST and raw tokens (and comments, too). John Ehresman suggested adding a token range as an annotation on the AST.
  • Users who tried the old compiler package found that tree walking pure Python was too slow. We should write a generic walker in C that can be extended with Python functions to call for specific nodes, presumably with pre-order, post-order, and in-order traversals. Performance is important in practice.
We collected several concrete use cases for the AST. People who attended the BOF wanted to:
  • Restrict the scope of Python programs, along the lines of the compiler for Zope's "Python Scripts"
  • Parsing source to generate documentation.
  • Generate SQL, Verilog statements from Python code. Generally, I think, things along the lines of LINQ.
  • Optimizers
  • Branch coverage analysis
The compiler package has always been hard to maintain. In principle, I like having a pure Python compiler. In practice, it is tiresome to maintain two different compilers when one is used infrequently. Several key problems for the 2.5 release are:
  • The AST is uses is different than the builtin AST.
  • There are bugs in the 2.4 implementation, particularly in namespace resolution. (Why didn't someone assign the bug to me?)
  • The compiler package doesn't have a good test suite.
There was some question about whether to label the compiler package as experimental or deprecated, but we did not decide.

Earlier in the morning, I gave a talk on the new bytecode compiler -- a whirlwind tour intended to help new developers get up to speed. I hadn't prepared adequately, so I went through some of the material too quickly. I got a number of good questoins. Brett and Steve suggest it went pretty well.

Saturday, February 25, 2006

Exquisite Timing

I got up early this morning to write my slides for the bytecode talk. I didn't get down to the conference this morning until Guido's keynote was well underway. When I walked into the ballroom, the slide about the AST code was on the projector and Guido was saying something about the speed of the new compiler. The very next moment Guido asked me whether it was faster -- as if it were scripted.

The compiler actually is faster, but not by much. We made no effort to make the compiler faster. I just wanted the code to be simple and easy to maintain. There are some obvious optimizations available. Changing the ast to use a real arena implementation should speed up memory allocation. Doing peephole optimizations before the assembler runs should speed it up. Maybe using the arena for the parser would help, too. Changing pgen to generate the AST directly and avoid the hand-coded transformations in ast.c should be a win, but won't get done for Python 2.5.

Friday, February 24, 2006

Python Ain't Got No Wrong Notes

Django is named after Django Reinhardt. It aspires to be sublime. There is a CMS for newspapers built on top of Django called Ellington. I think the next project should be called Thelonius. (I'd say Monk, but it doesn't have the same mystique). The perfect tag line would be "Python ain't got no wrong notes."

Source: I heard a great radio interview with Phil Schaap where he recounted a story about Monk. I found it repeated on the WKCR web site:
March of '76 was Thelonious Monk. There was a guy on the air doing that standard gibberish about Monk: "and Monk, playing the wrong notes on the piano, is able to create this kind of music....". Anyway, Monk called the Columbia switchboard, and the Columbia switchboard got in touch with me and said that Thelonious Monk had called to say that we should tell the guy on the air, "The piano ain't got no wrong notes."

Thursday, February 23, 2006

Tracking PyCon in the Blogosphere

I expect lots of bloggers will write about PyCon. Googlers will be posting to the PyCon blog again, and Planet Python should have plenty of material. I also subscribed to a pycon query in Google BlogSearch with my feed reader (modest plug).

Dallas Travel Tips

NY Times advice on Going to Dallas. Most of the advice is about locations around the downtown area, a good 10-12 miles from Addison and PyCon. No car for me, so it's unlikely I'll see much.

Wednesday, February 22, 2006

Generators in JavaScript

My recent experience with JavaScript has been mostly positive. The tools I used for development were primitive, but the language itself has some nice features. Brendan Eich just posted about his recent work to incorporate some features from Python into JavaScript. JS is getting iterators and generators via Python!

It was an interesting opportunity to look back at the history of the generators idea. The version implemented for Python came by way of PEP 255 and an implementation by Tim Peters and Neil Schemenauer. The original idea is much older. Iterators were a feature of CLU and generators were fundamental to Icon. The CLU version was inspired by Alphard, a research language from the 1970s. There's more background on Icon's generators in Griswold's 1982 paper (which I haven't read).

Amusing historical notes:
  • When we were debating whether to use suspend or yield for the generator keyword, Tim said he'd continue to use suspend when talking to people familiar with Icon. We didn't expect that Python would become the popular point of reference for this construct.
  • Steve Majewski made the first post on generators in Python that I can find. Guido responded by saying that anything you can do with generators you can do with classes. You see the same argument todaying about allowing functions to rebind names in enclosing scopes. I find this argument a little tedious. I find the Church-Turing thesis as appealing as the next person, but I wouldn't want to write code for a Turing machine.
  • Later in that first generators thread, Tim says there are two things he doesn't want to see:

    1) Changing Python at a deep level to support coroutines/closures.

    2) Any scheme that shares with Icon that the _resumption_ of such beasts
    requires special syntax.
    He did fine with #2, but I'm not so sure about #1. Maybe you could argue that Python didn't change at a deep level, but that would be a fine argument.

Wednesday, February 15, 2006

Impressed with the Democrats?

Not so much. I got a fundraising call from the Democratic Congressional Campaign Committee today. (We're on the do not call list, but I presume they called because we're registered Democrats.) The caller said they target specific races where they have a good chance of tipping the race in favor of the Democrat. It's a reasonable strategy, and I hope they succeed in getting a majority in the house.

I said I might be willing to make a contribution if they were involved in some local races in Pennsylvania. I'd rather make a local contributor where I hope to understand the issues better. The caller just wanted the money, so he evaded the question for a little. When I persisted, he asked what district I was in and put me on hold. He came back and said there was a good chance they would get involved in that race. I said I'd think about it.

I live in the 15th district in Pennsylvania, and I don't have any idea who is challenging Charlie Dent. The answer: Justin Behrens and Bob Dodge are running. The news from the Express Time today:
U.S. Rep. Charlie Dent, a freshman Republican from the Lehigh Valley, might worry about a re-election fight from Democrats now that Sgt. Justin Valera Behrens, an Iraq war veteran, has entered the race in the 15th congressional district.

But the Democratic Congressional Campaign Committee is not prepared to make Dent's ouster a priority in 2006. The House Democratic campaign does not list Dent, among the vulnerable Pennsylvania incumbents in Congress.

I'm not so impressed with the caller. I think we both knew he was lying. Funny, though, that I get confirmation in an article published just today.

I wish Behrens well at any rate, but he needs to get a better web presence. His site has a picture of his family, a phone number, and a couple of email addresses. No content at all.

Things are looking better for Bob Casey, the one candidate I have supported this year.

Monday, February 13, 2006

On-the-fly cycle collection

I did some Javascript programming recently and was surprised to learn that its garbage collector did not collect cycles. Brendan Eich's post today discussed the same problem for XPCom. Python had this problem but we (well, Neil mostly) fixed it years ago. It's basically impossible to require programmers to avoid cycles. Collecting cycles is the right thing.

The Python garbage collection algorithm adds a separate cycle collection phase in addition to reference counting. Periodically the garbage collector does a pass over the objects that subtracts references from tracked objects; the objects left with non-zero refcounts have external roots and a regularly mark-and-sweep phase proceeds from there. I believe the basic algorithm was introduced by Lins, but I can't recall the reference.

Eich's post reminded me of an alternate scheme by David Bacon and others. See the April 2005 paper by Paz et al. The key idea is that garbage cycles are only created when an object has its reference count decremented to a value greater than zero. If you have a cycle, it only becomes collectible when the last reference outside the cycle goes away; thus, a decref is necessary to collect the cycle. The key idea is to limit the garbage collection to a set of candidate objects that have recently been decrefed. It seems like a fairly straightfoward addition to Python.

Friday, February 10, 2006

Guido speaking at Google NY event

Google NY is starting a series of technical talks open to the general public. Guido van Rossum-- a Noogler or new Googler-- will be giving the first talk on Feb. 22. I think he plans to give a version of the keynote talk he'll give at PyCon a few days later.

Let me know if you'd like to come to the talk. It's a good chance to meet other Python users in the New York area and to find out more about Google NY.