Wednesday, June 14, 2006

Using indentation to represent program structure

I've heard Guido mention that Donald Knuth made an early suggestion that future programming languages would use indentation as a structuring mechanism. I decided to dig into that history a little bit today. Knuth quotes D. V. Schorre, who writes:
Since the summer of 1960, I have been writing programs in outline form, using conventions of indentation to indicate the flow of control.
Knuth quoted this in the context of his long paper "Structured Programming with go to Statements" (p. 295), where he wrote:
It seems clear that languages somewhat different from those in existence today would enhance the preparation of structured programs. We will perhaps eventually be writing only small modules which are identified by name as they are used to build larger ones, so that devices like indentation, rather than delimiters, might become feasible for expressing local structure in the source language.
Knuth also mentions Peter Landin's ISWIM family of languages. In "The Next 700 Programming Languages," Landin discusses four levels of abstraction in programming languages--physical, logical, abstract, and applicative expressions. The physical and logical languages involve syntactic issues like grammatical rules for grouping textual elements. Landin also mentions that ALGOL "sought to avoid any commitment to any particular sets of characters or type faces."

Landin's preferred representation for ISWIM used indentation. The paper has a section that lists differences between ISWIM and LISP, including the "textual appearance of ISWIM."
(c) Indentation, used to indicate program structure. A physical ISWIM can be defined in terms of an unspecified parameter: a subset of phrase categories, instances of which are restricted in layout by the following rule called "the offside rule." The southeast quadrant that just contains the phrase's first symbol must contain the entire expression, except possibly for bracketed subsegments.... It is based on vertical alignment, not character width, and hence is equally appropriate in handwritten, typeset or typed texts.
In the next sentence, he observes that indentation is not mandatory. It can be freely mixed with more conventional punctuation.
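
To make the offside rule concrete, here is a minimal sketch of how I read it. It is not Landin's definition and not how any particular compiler implements layout. The function name `offside_violations` and its parameters are my own, and I am assuming space-only indentation and ignoring the "bracketed subsegments" exception. Given the position of a phrase's first symbol, it reports any later line of the phrase that starts to the left of that column, that is, outside the southeast quadrant.

```python
def offside_violations(lines, first_row, first_col, last_row):
    """Return (row, col) pairs for lines that start left of the phrase's first symbol.

    Hypothetical helper: the phrase is assumed to begin at
    lines[first_row][first_col] and to end on line last_row.
    Assumes spaces only (no tabs) and ignores bracketed subsegments.
    """
    violations = []
    for row in range(first_row + 1, last_row + 1):
        line = lines[row]
        stripped = line.lstrip(" ")
        if not stripped:
            continue  # blank lines carry no symbols
        col = len(line) - len(stripped)  # column of the first symbol on this line
        if col < first_col:
            # Checking the first symbol is enough: the rest of the line is further right.
            violations.append((row, col))
    return violations


# The phrase under test starts at "let", column 6 of the first line.
well_laid_out = [
    "f x = let y = x + 1",
    " " * 10 + "z = y * 2",
    " " * 6 + "in z",
]
offside = [
    "f x = let y = x + 1",
    " " * 10 + "z = y * 2",
    "  in z",
]

print(offside_violations(well_laid_out, 0, 6, 2))
print(offside_violations(offside, 0, 6, 2))
```

The first call finds nothing to complain about. The second flags the last line, whose first symbol sits in column 2, to the left of the `let` that opens the phrase.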

The printed discussion, presumably from the conference, opens with indentation. Peter Naur raises an immediate objection, one that is raised today about Python programs.
Regarding indentation, in many ways I am in sympathy with this, but I believe that if it came about that this notation were used for very wide communication and also publication, you would regret it because of this kind of rearrangement of manuscripts done in printing.
Robert Floyd also objected:
It works on the micro-scale--that is, one page is all right--when dealing with an extensive program, turning from one page to the next there is no obvious way of indicating how far the indentation stretches because there is no printing at all to indicate how far you have indented.
References

Donald E. Knuth
Structured Programming with go to Statements
ACM Computing Surveys, Vol. 6, No. 4, December 1974
pp. 261-301

D.V. Schorre
Improved organization for procedural languages
Technical memo TM 3086/002/00, Systems Development Corp., Santa Monica, Ca.
Sept. 8, 1966
(I haven't seen the original. This citation is just a copy of Knuth's.)

P. J. Landin
The Next 700 Programming Languages
Communications of the ACM, Vol. 9, No. 3, March 1966
pages 157-166

The paper was presented at the ACM Programming Languages and Pragmatics Conference, Aug. 8-12, 1965 in San Dimas, Ca. The printed version of the paper includes some discussion, presumably from that conference.

8 comments:

Florian said...

That must've been the silliest objection against indentation for structure I've ever heard.

"We should not make a nice computer language because the printing press is not suited for it."

hmright...whatever dude, uh Robert Floyd

Wherever people write source they widely use indentation to signify structure, regardless of the fact that their environment is oblivious to the indentation. For Christ's sake, can we finally arrive at common sense?

Jeremy Hylton said...

I don't think you're being charitable enough to Floyd and Naur. First, their comments were made 40 years ago, when readability concerns were different. I imagine it was much more common to print source code for reading and revision.

I think the comment still captures a tricky issue. If you're trying to re-indent some code at the end of a long indented block, it can be hard to tell exactly how far back to move a block and where to line it up.

Matt said...

Following indentation level in long code blocks, both on the screen and in a printout, can easily be fixed with an editor that has the ability to signal depth level (with background color, dots, or numbers, etc.).

It's true that in a simple text editor the reader may still be lost, but that argument is similar to folks complaining about editing some fictitious application's XML config file with a text editor instead of using the application's GUI configuration tool.

Anonymous said...

Again, this was 40 years ago. A simple text editor had to do; there weren't nice GUIs to edit XML settings with. Low resolution monochrome displays were the best they had.

MWM

Anonymous said...

What was that about monochrome displays!? Back in the early sixties you wrote your code on paper, handed it to the keypunch operator to be punched on cards, then waited. When the deck came back with a printout of the contents, you checked it for typographical errors, and if you were lucky, punched up a few corrected cards yourself. Then it was down to the card reader, or more likely, to the computer operator who put your cards in something called the queue. Once again you waited. Finally, the results of your test run were returned along with a code printout which probably spanned many, many pages. If you wrote your code in Algol you would likely get a nice code listing with vertical bars tying together the indented lines. There was a distinction between a code listing produced by the compiler and a code printout which was a simple dump of your punchcards.

By the way, I was not a programmer until 1972 in high school. My second full-time programming job in 1979 is where I experienced the punchcard environment that I described.

Jeremy Hylton said...

I don't find it unusual that a modern text would use indentation. The book xamdam mentions is Probability and Statistics for Computer Science by James L. Johnson, which was written in 2006.

Anonymous said...

Cards in 1979? I started doing FORTRAN with cards in 1965 as a student, and Assembler professionally in 1970. My employer was so cheap we didn't have interpreting punches. We had to read the cards by holding the holes up to the light.

I still remember 'A' was a 12-1 punch, 'J' 11-1, and 'S' was 0-2.

16K memory, maximum 4-character variables. Ahh... those were the days.

Anonymous said...

Indentation seems pretty prevalent in papers and books, at least for pseudocode...

Re: a mixture of indentation and symbols: Haskell implements this (c.f. layout specificati