hbr

programming

“Layers are Leaking”

by jonathan on Aug.12, 2010, under general, programming

Andrew Richards (@codeandrew), the CEO of Codeplay posted an article “Layers are Leaking”, which inspired some thoughts – tangential and perpendicular – beyond what I could reasonably fit into a stream of tweets (not that that stopped me trying).

From the article:

If you look at any computer system architecture diagram (and there are many) you will almost always see a series of “layers” on top of each other. Each layer is based on a layer underneath. It is a very common pattern in computers, and I believe that it is fundamental to the success of the development of computer technology. What layers allow you to do, is let individual people concentrate on only one part of a problem.

This brought to mind the first chapter of Bell & Newell’s Computer Structures book from 1971, which illustrates this kind of layering idea.  Additionally, I was reminded of Jeannette Wing’s excellent Computational Thinking article which (amongst other things) makes the case that this type of thinking, associated with computer science, is something that is worth teaching and applying more widely.

But, recently, the layers have started leaking.

I don’t really agree that this is a recent thing (layers leak, abstractions change – some leaks are more obvious than others), but the article refers to the significant changes that have come about in the eternal quest for faster computation, and the difficulties involved in replacing established abstractions. Renegotiating the interface between two layers is difficult when the people responsible for each layer don’t speak the same technical language. What’s worse is trying to change the interface to one layer without adequately considering the consequences to other layers in the system. I think time-consuming discussion and debate is inevitable, and it is not inherently a reason to avoid changing an otherwise broken abstraction.

A number of the problems (or – dare I cliché it – opportunities) that are appearing with regard to increasingly parallel computation are not new. Consider Peter Denning’s recent article. I find it interesting that a number of the suggested solutions involve changes to deeply ingrained abstractions (like programming languages) for which there would be (is) significant resistance to change.

I read an interview with Fred Brooks recently, and the following quote stood out:

Software is not the exception; hardware is the exception. No technology in history has had the kind of rapid cost/performance gains that computer hardware has enjoyed. Progress in software is more like progress in automobiles or airplanes: We see steady gains, but they’re incremental.

Which echoes something that Andrew said in another article (“Why Intel Larrabee Really Stumbled” – google cache is here)

Infinite Truth: Software takes longer to develop than Hardware

A lot of time has been spent on steady, incremental gains for software that attempts to work around the limitations of many common layer abstractions – examples that spring to mind include compiling general purpose programs for small register sets, converting scalar programs to utilise vector instructions, and trying to make parallelism out of inherently single threaded code. We can’t necessarily redesign our systems to make these problems go away, but I think we can strive to address these problems in the best way possible – which will likely involve reshuffling some layers in the process. It’s likely to be messy, time consuming, and require a lot of hard work.

Sounds like fun :)

My thanks to Andrew for his thought-provoking article.

Additional:
Stephen Hill (@self_shadow) tweeted links to wikipedia on Leaky abstractions, and a related article by Joel Spolsky.

But, recently, the layers have started leaking.
2 Comments more...

spu_µopt[0] // shifting and arrays

by jonathan on May.07, 2010, under general, programming, spu

There are a couple of mostly-obvious SPU micro-optimisation tricks that I’ve learned. Here’s one.

qword a[128];
qword f(int i) {
    return a[i>>4];
}

Simple enough code. spu-gcc -O3 produces

f:                                                                                                                                                                                                                                                                                                                          
    ila     $2,a                                                                                                                                                                                                                                                                                                        
    andi    $3,$3,-16                                                                                                                                                                                                                                                                                                   
    lqx     $3,$2,$3                                                                                                                                                                                                                                                                                                    
    bi      $lr

And it’s worse if the shift is not 4 – here’s the result of using 5:

f:
    rotmai  $4,$3,-5
    ila     $2,a
    shli    $3,$4,4
    lqx     $3,$2,$3
    bi      $lr

Which is pretty ugly.

The compiler is doing what it has been asked to do, but doesn’t appear to know that SPU load and store instructions force the four rightmost bits of the calculated address to zero.  Which means the andi masking can be removed from the first case, and in the second the shift right and shift left instructions could be replaced with a single shift right – without changing the result.

How to avoid it?  I’ve used manual pointer arithmetic to get around this compiler feature, e.g.

return *(qword*)((uint)a+i)

Which is pretty ugly.

I don’t know what that sort of thing does for the compiler’s alias analysis. Probably nothing good (judicious use of restrict might help…).  si_lqx() would also work, but my  experience – without in-depth examination – is that measured performance gets worse when using load and store intrinsics. I’m not sure if it’s something to do with scheduling, aliasing or programmer competence.

When would this ever matter? I’ve run into this in some very hot lookup table access. Two or four less cycles can make a big difference. Sometimes. (If it really matters, rewriting the whole thing in assembly can work, too)

It’s a micro-optimisation – measure often, use with care.

Leave a Comment : more...

“premature optimization”

by jonathan on Nov.25, 2009, under general, history, programming

Here is an except from Donald Knuth’s 1974 paper “Structured Programming with go to Statements”.  There is a well known quote at the end of the second paragraph :

The improvement in speed from Example 2 to Example 2a is only about 12%, and many people would pronounce that insignificant.  The conventional wisdom shared by many of today’s software engineers calls for ignoring efficiency in the small; but I believe this is simply an overreaction to the abuses they see being practiced by penny-wise-and-pound-foolish programmers, who can’t debug or maintain their “optimized” programs. In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal; and I believe the same viewpoint should prevail in software engineering. Of course I wouldn’t bother making such optimizations on a one-shot job, but when it’s a question of preparing quality programs, I don’t want to restrict myself to tools that deny me such efficiencies.

There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

Yet we should no pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified. It is often a mistake to make a priori judgments about what parts of a program are really critical, since the universal experience of programmers who have been using measurement tools has been that their intuitive guesses fail. After working with such tools for seven years, I’ve become convinced that all compilers written from now on should be designed to provide all programmers with feedback indicating what parts of their programs are costing the most; indeed, this feedback should be supplied automatically unless it has been specifically turned off.

The quote to which I am referring – that “premature optimization is the root of all evil” – is, from time to time, used fait accompli to assert that any time spent optimising is time wasted – something that is clearly not the case when considering the quote in context.

There are a lot of other good ideas in Knuth’s article, many still quite relevant to software engineering. The full article (which can be obtained via citeseerx) is, primarily, a contribution to a  discussion of the time on the use of go to statements – another topic often reduced to a simple, extreme position, attributed to Edsger Dijkstra.  Knuth’s paper makes it clear that Dijkstra did not hold so tightly to that position himself.

This is the sort of paper that I find to be a fascinating read, with both historical and technical value.

The final page of the article contains the following quote :

A disturbingly large percentage of the students ran into situations that require go to’s, and sure enough, it was often because while didn’t work well to their plan, but almost invariably because their plan was poorly thought out.

Which brought to mind another quote that I had seen recently, this one from George Orwell :

A man may take to drink because he feels himself to be a failure, and then fail all the more completely because he drinks. It is rather the same thing that is happening to the English language. It becomes ugly and inaccurate because our thoughts are foolish, but the slovenliness of our language makes it easier for us to have foolish thoughts.

It seems quite valid to say that there is a connection between our language and our expression (in terms of ideas, spoken word, programs, and more).

When trying to find a source for Orwell’s words (which you can find here), I was heartened to find these sentences follow those quoted above :

The point is that the process is reversible.   Modern English, especially written English, is full of bad habits which spread by imitation and which can be avoided if one is willing to take the necessary trouble.

Which I believe is also applicable to the expression and optimisation of computer programs.

2 Comments :, , , more...

Looking for something?