“premature optimization”

Here is an excerpt from Donald Knuth’s 1974 paper “Structured Programming with go to Statements”.  There is a well-known quote at the end of the second paragraph :

The improvement in speed from Example 2 to Example 2a is only about 12%, and many people would pronounce that insignificant.  The conventional wisdom shared by many of today’s software engineers calls for ignoring efficiency in the small; but I believe this is simply an overreaction to the abuses they see being practiced by penny-wise-and-pound-foolish programmers, who can’t debug or maintain their “optimized” programs. In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal; and I believe the same viewpoint should prevail in software engineering. Of course I wouldn’t bother making such optimizations on a one-shot job, but when it’s a question of preparing quality programs, I don’t want to restrict myself to tools that deny me such efficiencies.

There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified. It is often a mistake to make a priori judgments about what parts of a program are really critical, since the universal experience of programmers who have been using measurement tools has been that their intuitive guesses fail. After working with such tools for seven years, I’ve become convinced that all compilers written from now on should be designed to provide all programmers with feedback indicating what parts of their programs are costing the most; indeed, this feedback should be supplied automatically unless it has been specifically turned off.

The quote to which I am referring – that “premature optimization is the root of all evil” – is, from time to time, used as a fait accompli to assert that any time spent optimising is time wasted – something that is clearly not the case when the quote is considered in context.
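(As an aside, the feedback Knuth wished for is readily available today. The little sketch below is entirely my own illustration – the function names are made up – but it shows the kind of lopsided program where a profiler such as gprof will point straight at the one routine worth optimising, before any a priori guessing.)

/* A deliberately lopsided program: one routine dominates the runtime.
 * Assuming GCC and gprof are available, something like
 *   gcc -pg profile_demo.c -o profile_demo
 *   ./profile_demo
 *   gprof profile_demo gmon.out
 * produces a flat profile that points at hot_path() as the code
 * actually worth looking at.
 */
#include <stdio.h>

static double hot_path(void)
{
    double sum = 0.0;
    for (long i = 1; i < 50000000L; i++)
        sum += 1.0 / (double)i;     /* this loop is the critical 3% */
    return sum;
}

static double cold_path(void)
{
    double sum = 0.0;
    for (long i = 1; i < 50000L; i++)
        sum += 1.0 / (double)i;     /* noncritical: not worth tuning */
    return sum;
}

int main(void)
{
    printf("%f %f\n", hot_path(), cold_path());
    return 0;
}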

There are a lot of other good ideas in Knuth’s article, many still quite relevant to software engineering. The full article (which can be obtained via citeseerx) is, primarily, a contribution to a discussion of the time about the use of go to statements – another topic often reduced to a simple, extreme position, attributed to Edsger Dijkstra.  Knuth’s paper makes it clear that Dijkstra did not hold so tightly to that position himself.

This is the sort of paper that I find to be a fascinating read, with both historical and technical value.

The final page of the article contains the following quote :

A disturbingly large percentage of the students ran into situations that require go to’s, and sure enough, it was often because while didn’t work well to their plan, but almost invariably because their plan was poorly thought out.

Which brought to mind another quote that I had seen recently, this one from George Orwell :

A man may take to drink because he feels himself to be a failure, and then fail all the more completely because he drinks. It is rather the same thing that is happening to the English language. It becomes ugly and inaccurate because our thoughts are foolish, but the slovenliness of our language makes it easier for us to have foolish thoughts.

It seems quite valid to say that there is a connection between our language and our expression (in terms of ideas, spoken word, programs, and more).

When trying to find a source for Orwell’s words (which you can find here), I was heartened to find that these sentences follow those quoted above :

The point is that the process is reversible. Modern English, especially written English, is full of bad habits which spread by imitation and which can be avoided if one is willing to take the necessary trouble.

Which I believe is also applicable to the expression and optimisation of computer programs.

Computer memory architecture foretold

Ideally one would desire an indefinitely large memory capacity such that any particular … word would be immediately available. … We are … forced to recognize the possibility of constructing a hierarchy of memories, each of which has greater capacity than the preceding but which is less quickly accessible.

A. W. Burks, H. H. Goldstine and J. von Neumann
Preliminary Discussion of the Logical Design of an Electronic Computing Instrument (1946)

(Quote found in Computer Architecture : A Quantitative Approach 4th Ed. by Hennessy & Patterson)

The Soul of a New Machine

I’m reading The Soul of a New Machine, which I discovered by accident when looking for some other books.  It’s a fantastic book!  It details the development of the MV/8000 series of superminicomputers by Data General in the 70s/80s – their response to DEC’s VAX.

This book ties in with my own recent work and reading in several ways, which is probably why I’m enjoying it so much.

For my PhD research, I’ve been reading various academic papers going back to the 60s on the topics of automatic memory caching and virtual memory, and the variety of ways that those problems have been addressed. This book goes into surprising amounts of detail about the design and implementation of computer architecture, and to read Steve Wallach’s memory segmentation “Golden moment” explained in this narrative history was a delight for me.  The motivation and concepts are quite different, but it brought to mind the explanations that can be found in Neal Stephenson’s Cryptonomicon (and even Cory Doctorow’s Little Brother) – an attempt to teach some computing concepts within the scope of a story.

My reading for research has brought to my attention the Burroughs large systems, computers produced by Burroughs Corporation in the 1960s – particularly the B5500.  I have been fascinated to read how the company’s technology has persisted over time through mergers and renamings and many generations, to the extent that the MCP (Master Control Program) developed by Burroughs back in 1961 still has some (vague) support available in hideously expensive mainframes made by Unisys (including some kind of object-level compatibility, although I’m not sure how far back that goes).

Another reason that this book has been of interest is that it is a story of intra-corporate competition (and conflict) that seems to resonate with the stories found in The Innovator’s Dilemma – companies that grew quickly based on a good product, but that didn’t change as the market required.  Data General spent a lot of time and money fighting to win in (what would prove to be) a superseded product domain, and while they still sold a lot of machines (from what I read, billions of dollars worth of the MV/8000 series), they perhaps changed too slowly and circumstances were against them.  See more on the Wikipedia page.

(I should say that I didn’t finish The Innovator’s Dilemma, and I haven’t yet finished The Soul of a New Machine – I’m aware that the first references the idea attributed to Tom West that the internal design of the VAX reflected the internal corporate structure of DEC, but I do not recall reading that part of The Innovator’s Dilemma… I guess I should request it again.)

At times I wish I knew more about computer history, particularly in terms of how hardware design has changed over time – there is a lot to learn from the past, I think, when it comes to implementing computer solutions now.  For example, algorithms that made sense for managing or controlling dataflow between memory and disk are now often applicable between the processor and the memory, so much has the relative speed of each changed.   This is something that I have found to be useful in my research, having implemented a caching system that has more in common with RAM-to-disk virtual memory than traditional cache-to-RAM.
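(To make that overlap concrete, here is a minimal sketch – my own illustration, not code from the book or from my research – of the classic “clock” / second-chance replacement policy from the virtual-memory literature, applied to a small software-managed cache.  All of the names are made up.)

/* A minimal sketch of "clock" (second-chance) replacement, applied to a
 * small software-managed cache of fixed slots.  Illustrative only. */
#include <stdbool.h>
#include <stdio.h>

#define NSLOTS 4

struct slot {
    int  key;          /* which object occupies the slot (-1 = empty) */
    bool referenced;   /* set on access, cleared as the clock hand sweeps */
};

static struct slot cache[NSLOTS];
static int hand;       /* the "clock hand" */

static void cache_init(void)
{
    for (int i = 0; i < NSLOTS; i++)
        cache[i].key = -1;
}

/* Access key: on a hit, mark it referenced; on a miss, evict with the
 * clock sweep and install it.  Returns true on a hit. */
static bool access_key(int key)
{
    for (int i = 0; i < NSLOTS; i++) {
        if (cache[i].key == key) {
            cache[i].referenced = true;   /* second chance on the next sweep */
            return true;
        }
    }
    /* Miss: sweep until an unreferenced slot is found. */
    for (;;) {
        if (!cache[hand].referenced) {
            cache[hand].key = key;        /* victim found: replace it */
            cache[hand].referenced = true;
            hand = (hand + 1) % NSLOTS;
            return false;
        }
        cache[hand].referenced = false;   /* spare it once, keep sweeping */
        hand = (hand + 1) % NSLOTS;
    }
}

int main(void)
{
    int refs[] = { 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 };
    cache_init();
    for (int i = 0; i < (int)(sizeof refs / sizeof refs[0]); i++)
        printf("access %d: %s\n", refs[i], access_key(refs[i]) ? "hit" : "miss");
    return 0;
}

The same sweep, with “page” swapped for whatever object is being cached, is essentially what the older paging papers describe for moving data between RAM and disk.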

In fact I’d like to see taught (by which I mean “I’d like to teach”) a full-year program-your-way-through-computer-history unit at university :)

With regard to my reading about this book, I found this page, where it was fascinating to me to see a 1987 usenet post by Michael Meissner describing the backwards compatibility of the MV/8000 – an engineer working for Data General at that time, and a name I recognise as currently working for IBM, making contributions to GCC in the areas of __ea support for the Cell BE SPU, as well as Power7 support.  I also note that DJ Delorie, another name I recognise from the GCC mailing lists, worked at Data General (as mentioned on the DG Wikipedia page).  I am continually excited to follow threads connecting computing companies and people to different places and times. (Going back to Burroughs, this story of Don Knuth writing an ALGOL-58 compiler over a summer is quite an entertaining read.  Also, I’d not noticed this interview before.)

For further reading, I’m hoping to get a copy of The Race for a New Game Machine – about the development of the PS3 & Xbox 360 processors – as well as Racing the Beam, on programming for the Atari VCS.