Sunday, October 10, 2004

Python Standard Library

Andrew Kuchling argues that we should stop hacking on the Python interpreter and start improving the standard library. That is exactly the opposite of the argument Mitch Kapor made in his PyCon keynote, where the priorities were security, performance, and developer tools.

I'm sympathetic to Andrew's line of reasoning. In the spirit of worse is better, I've often said that Python isn't faster because its users are better served by other things, like better libraries or fewer interpreter crashes or better language features. The resources available for developing Python are quite limited, and security and performance take a lot of effort.

There is little question that parts of the standard library need improvement; cgi comes to mind first as a module that should be rewritten from scratch. I had hoped that the Web SIG would improve web programming in Python, but it is making headway in only a narrowly focused area.

What else needs work? We need better XML tools than what we find in the standard library. There are good ideas and interfaces in ElementTree and xmltramp -- and even more interesting work in languages like Comega.

I don't agree that we need to retarget Python development, though. There has always been a healthy mix of work on libraries and work on the core language. Python 2.4 has library improvements: new decimal arithmetic, improved doctest, new email package. Python 2.3 had even more: sets, itertools, logging, new pickle protocol, csv files, and BerkeleyDB support.
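To make that list a little more concrete, here is a minimal sketch that touches three of the modules named above: decimal, csv, and itertools. It is written for a current Python 3 interpreter rather than 2.3 or 2.4, and the sample CSV data is made up purely for illustration.

```python
import csv
import io
import itertools
from decimal import Decimal

# Binary floats cannot represent 0.1 exactly, so the sum drifts slightly.
float_total = 0.1 + 0.1 + 0.1

# decimal (added in 2.4) does exact decimal arithmetic instead.
decimal_total = Decimal("0.1") + Decimal("0.1") + Decimal("0.1")

# csv (added in 2.3) handles quoting for us; this sample text is hypothetical.
sample = io.StringIO('module,release\n"decimal",2.4\n"itertools",2.3\n')
rows = list(csv.reader(sample))

# itertools (added in 2.3): skip the header row without copying the list.
modules = [row[0] for row in itertools.islice(rows, 1, None)]

print(float_total == 0.3)
print(decimal_total == Decimal("0.3"))
print(modules)
```

None of this needs a change to the interpreter, which is the sort of library work that has been going on alongside core development all along.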

Moreover, Python developers choose what they want to do. There are certainly bugs to fix and patches to review, but new libraries and substantial revisions are usually the result of one or a few people who want to do the work.

Andrew posted a list of six issues that ought to be resolved.

  1. Organization: I don't find this issue very interesting. I'm not sure who benefits from renaming or hierarchy. Is it a question of discovery -- putting like things together in a hierarchy to make them easier to find -- or of name clashes or something else?
  2. Process for adding new modules or packages: They need to be useful to many users, and they need to be maintainable. These two conditions imply that the code is well-designed, correct, tested, and documented. My strawman is that we accept everything in this category. What would go wrong?
  3. PyPI: I think I'd like PyPI to do things it doesn't do, but I don't feel like I've suffered for its lack of features. I think a release that contains core plus libraries is still the best model.
  4. Nomination process: Maybe a popularity contest is sufficient. Put up a Wiki page and let people add their names.
  5. Modules that need improvement: If you pick almost any module, you could find ways to improve it. A Wiki page with users' library gripes might be a good start.
  6. Look to applications for ideas about what libraries are needed. Ok. I'll bet you'll find applications where language features or better implementation would make a difference, too.
