Wednesday, April 21, 2010

Easton School District Budget

The Easton school district is in a financial mess.  The school board is considering terrible options like a tax increase of more than 10 percent or layoffs totaling more than 10 percent of the teaching staff.  This post collects a few sources of information about the budget.  (Let me know if you have more links.)
The Morning Call wrote a good summary of the last school board meeting: Ax falls on Easton schools.

The school district public finance page includes the proposed budget for 2010-2011 and the budgets from the last two years.  The auditor's report from 2009 is also posted there.  (All documents there are PDFs.)  I'm not sure how useful the auditor's report is.  There is this gem from the financial highlights on page 4:
The District's overall financial position as of June 30, 2009 continued to be very strong.
As of June 30, 2010, not so much.

The salaries of all Pennsylvania public school employees in 2009 are available.  I took a quick look at the numbers.  It seems like Easton and Nazareth have similar salaries.  Parkland clearly pays better.

School Matters (Easton) has some basic data about the schools, like enrollment and students per teacher.  Overall, the numbers seem to add up to 9,160 students.  Students per teacher ranges from 12.6 at March Elementary to 20.6 at Tracy Elementary.  Unfortunately, the data here is from 2006.  Its district financial summary says the district's instructional expenses are a bit under the state average, but capital expenditures are more than five times the state average.  (Where can we find more recent numbers?)

Tuesday, March 31, 2009

Python strings and bytes

At the PyCon sprints, we looked into a lot of bugs in the standard library caused by interactions between strings and bytes.  (A string holds a sequence of characters.  A bytes object holds a sequence of bytes, i.e. integers from 0 to 255.)  I help maintain httplib and urllib, which read raw bytes from a socket and often convert them into strings.  The details of those conversions are sometimes tricky.  The rules for strings and bytes changed drastically in Python 3.0.  Most of the standard library was converted from old to new automatically (by 2to3), and many of those conversions were incorrect.

A harmless example comes from httplib, where an if / elif statement had one test for strings and another for unicode strings.  The conversion tool turned both into tests for strings.  The code looked like this:

    if isinstance(buf, str):  # regular strings
        # do something
    elif isinstance(buf, str):  # unicode strings
        # do something else

In this case, the second branch could be deleted.  In other cases, the effects were harmful.  If you passed a bytes object as the body argument in an HTTP request--passing form params for a POST request is a common case--the bytes object would be converted via str() to a string.

    >>> body = b"key=value"
    >>> str(body)
    "b'key=value'"

That is, str() uses repr() to convert bytes to a string.  That's simply incorrect.
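Here is a minimal sketch of the safer pattern.  The helper name body_to_bytes and the UTF-8 default are my own assumptions for illustration, not the fix that went into httplib: bytes pass through untouched, and strings are encoded explicitly instead of going through str().

```python
def body_to_bytes(body, encoding="utf-8"):
    """Hypothetical helper: return a request body as bytes.

    The default encoding is an assumption made for this sketch.
    """
    if isinstance(body, bytes):
        return body                   # already wire-ready, leave it alone
    if isinstance(body, str):
        return body.encode(encoding)  # explicit encoding, never str()/repr()
    raise TypeError("body must be str or bytes, not %s" % type(body).__name__)


wire_body = body_to_bytes(b"key=value")   # bytes stay bytes
text_body = body_to_bytes("key=value")    # str is encoded to the same bytes
round_trip = wire_body.decode("ascii")    # the correct bytes -> str conversion
```

Going the other way, bytes.decode() with a named encoding gives back "key=value" rather than the repr.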

It will take a long time to sort out all of these problems.  We don't have a lot of experience from application developers who are using Python 3.0, so we have to invent solutions as we go along.  We're likely to make mistakes or at least make sub-optimal API decisions.

I can think of two things that would help us make progress.

First, we ought to organize a systematic effort to review the standard library.  How many of the libraries have plausible tests that exercise strings and bytes?  For example, the json library was carefully tested with strings and unicode in Python 2.x.  Those have all been converted to strings, so now we have a thorough set of tests for strings and none at all for bytes.
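As a rough sketch of what such a test could look like, here is one that runs the same check over a string input and a bytes input.  It assumes a recent Python 3 in which json.loads accepts both types (that was not the case in 3.0), and the test class name is made up for the example.

```python
import json
import unittest


class StrAndBytesTest(unittest.TestCase):
    """Hypothetical test case: run each check once per input type."""

    CASES = ['{"key": "value"}', b'{"key": "value"}']

    def test_loads_accepts_str_and_bytes(self):
        for doc in self.CASES:
            with self.subTest(input_type=type(doc).__name__):
                self.assertEqual(json.loads(doc), {"key": "value"})

    def test_dumps_returns_str(self):
        # Serialization produces text; encoding it is the caller's job.
        self.assertIsInstance(json.dumps({"key": "value"}), str)


def run_suite():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(StrAndBytesTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The point is only the shape of the test: every case that was once written for "strings and unicode" gets a matching case for bytes.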

Second, we need to collect a set of best practices for writing libraries that support bytes and unicode.  A typical pattern is that bytes get sent on the wire.  (Wires, almost by definition, send bytes.)  The applications that use the wire usually want to deal with strings, which means they need some way to specify the encoding to use when sending to or reading from the wire.  We could start by collecting all the patches and bug fixes that have gone into Python 3.1 to fix string and bytes problems with 3.0.
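One way to picture that pattern is a thin wrapper that keeps bytes on the wire and hands strings to the application, with the encoding chosen by the caller.  This is only a sketch: the TextWire class is hypothetical, and an io.BytesIO object stands in for a real socket.  (The standard library's io.TextIOWrapper does a similar job for file-like objects.)

```python
import io


class TextWire:
    """Hypothetical wrapper: bytes on the wire, strings for the application."""

    def __init__(self, raw, encoding="utf-8", errors="strict"):
        self.raw = raw            # any binary file-like object
        self.encoding = encoding  # chosen by the caller, never guessed
        self.errors = errors

    def send(self, text):
        data = text.encode(self.encoding, self.errors)
        self.raw.write(data)
        return len(data)          # number of bytes written, not characters

    def read_line(self):
        return self.raw.readline().decode(self.encoding, self.errors)


wire = io.BytesIO()                        # stand-in for a socket
conn = TextWire(wire, encoding="iso-8859-1")
sent = conn.send("caf\u00e9\n")            # encoded on the way out
wire.seek(0)
line = conn.read_line()                    # decoded on the way in
```

All the encoding decisions live in one place at the boundary, so the rest of the application never has to see a bytes object.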

Monday, March 30, 2009

Coroutines in Python

David Beazley gave a tutorial on coroutines at PyCon 2009.  The slides and code are available for download.  I took them home with me on the plane.  I had a fun time reading the slides and studying the code.  It's a remarkably clear explanation of how generators can be used as coroutines, starting with Python 2.5.  He runs through a good collection of examples, winding up with a simple OS-style task scheduler for cooperative multi-tasking coroutines.

One of his concluding points was really helpful for me.  I found it hard to pay a lot of attention to the evolution of PEP 342, which added coroutine support for generators.  I found it confusing that yield / generators were being extended to handle different use cases.  David clarifies it in a helpful way:
There are three main uses of yield
  • Iteration (a producer of data)
  • Receiving messages (a consumer)
  • A trap (cooperative multitasking)
Do NOT write generator functions that try to do more than one of these at once
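
To make the three uses concrete, here is a small sketch with one generator per use.  This is my own toy code, not code from David's slides, and it uses Python 3 spelling (next(gen) rather than gen.next()); all of the function names are made up for the example.

```python
from collections import deque


def countdown(n):
    """Iteration: the generator produces data."""
    while n > 0:
        yield n
        n -= 1


def collector(results):
    """Receiving messages: the generator consumes values passed to send()."""
    while True:
        item = yield
        results.append(item)


def task(name, steps, log):
    """A trap: each bare yield hands control back to the scheduler."""
    for i in range(steps):
        log.append((name, i))
        yield


def run(tasks):
    """Toy round-robin scheduler: resume each task until all have finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)
        except StopIteration:
            continue              # task is done, drop it
        ready.append(current)     # task yielded, put it back in line


produced = list(countdown(3))

received = []
sink = collector(received)
next(sink)                        # prime the coroutine up to its first yield
sink.send("a")
sink.send("b")
sink.close()

log = []
run([task("A", 2, log), task("B", 1, log)])
```

Each generator does exactly one of the three jobs, which is what keeps them easy to read.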

Tuesday, March 03, 2009

Pressing the Police

David Simon, of Homicide, The Wire, and long ago the Baltimore Sun, wrote about the increasing secrecy of the Baltimore police in Sunday's Washington Post: In Baltimore, No One Left to Press the Police. He was making a larger point about the role of newspapers.
"In an American city, a police officer with the authority to take human life can now do so in the shadows, while his higher-ups can claim that this is necessary not to avoid public accountability, but to mitigate against a nonexistent wave of threats. And the last remaining daily newspaper in town no longer has the manpower, the expertise or the institutional memory to challenge any of it."
Simon argues that there aren't any bloggers fighting to keep the city government honest.  The laws provide access to many police records, but an individual is left with little practical recourse if the police don't obey them.  (They hassle photographers taking pictures on bridges, too.)  I'm sympathetic to Simon's argument, but the primary problem is public accountability, not the lack of a newspaper to provide it.

It reminded me of my own experience getting access to campus police records in college.  (Neither the crimes nor the institutions were as threatening as they are in Baltimore.)  The M.I.T. campus police refused to show reporters its police log, a simple record of incidents and arrests.  It took us a year or more of effort to get access to it.

What was involved in getting access to those records?  The Student Press Law Center provided resources for student journalists.  I knew the basic outlines of the law, that other papers had similar problems, and that many of them prevailed in the end.  The issues were clear cut at public universities, and Massachusetts law seemed fairly clear for police at private colleges as well.  We asked to see the records several times--just walked into the police station and asked to see them.  We also did this a few times with the local Cambridge police, who never gave us a hard time.

I also had the help of a lawyer, a former editor-in-chief of The Tech, who helped us negotiate with the Institute.  I recall a letter he wrote to Thomas Henneberry, M.I.T.'s lawyer, requesting access to the police log: "We believe that this is an appropriate matter for injunctive relief and hope seeking such relief does not become necessary."  I delivered the letter in person to various administrators--the president, the chairman of the corporation, etc.

I certainly had institutional support from my fellow students at the newspaper.  I think it's easier to feel confident in pressing the case when you have an organization behind you, but it's hard to quantify the effect.

Henneberry wrote a letter back to us, explaining that everything we argued was nonsense.  I don't recall exactly how we responded, but several months later the police log was opened.  We did not get any official response, but the log was available to reporters.  The police log is still published regularly.

What's the lesson for bloggers in this?  You probably need an umbrella organization to help coordinate a campaign for open records and a lawyer willing to help you in specific cases.  You need to make a sustained effort to get access to records.  The first few times you show up, you'll simply be turned away.  It's not inconceivable that bloggers could achieve as much as the local paper in this regard.

Wednesday, December 03, 2008

Python 3000

Python 3000 is ready! The official release may not come until tomorrow, but Barry has tagged the source and is preparing the release. We've been waiting for this release for almost nine years. The earliest reference I can find is a message from Guido to python-dev in Jan. 2000.

In the previous century, we had been thinking about Python 2 as the Python version that would break backwards compatibility to make real language improvements. Python 2 ended up being a big deal, but without too many compatibility issues. All the major changes were deferred to "Python 3000," a mythical creature. I'm eager to see what programmers make of this creature.

Sunday, November 23, 2008

John Dingell

I remember John Dingell from my days writing for The Tech, the campus newspaper at M.I.T. He had a reputation as a bully who liked to grandstand by holding hearings and conducting investigations into fraud and waste in science. The two I remember best ended up being busts as far as actually uncovering waste or fraud--audits of funding by research universities and a fraud case involving a researcher in David Baltimore's lab where all the charges were eventually dismissed.

The Tech, Feb. 7, 1992
The government -- particularly Rep. John D. Dingell (D-Mich.), chairman of the Oversight and Investigations Subcommittee -- tried to publicly embarrass MIT and other universities for alleged misuses of funds, even before formal evidence was presented before the committee.

The Tech, June 6, 1997
What had originally been a matter of Imanishi-Kari's questionable research data quickly swelled into a thorny and divisive debate over the validity of scientific research.

Baltimore derided the controversy as a witch hunt and believed that some people, like U.S. Representative John Dingell (D-Mich.), were using it unreasonably to call into question government money spent on funding research.