For my next TUCS tech talk I’ll be continuing on from the Mandelbrot rendering in the last one (which can be seen here) to something a little more complex.
The Buddhabrot is conceptually not any more complex than the Mandelbrot in terms of its generation – rather than colouring points based on the number of iterations before they ‘escape’, we apply colour to each point reached while iterating escaping starting points. This has consequences for the drawing of the Buddhabrot – rather than generating one point at a time independently of all other points in the output, iterating a single input point may effect thousands of different output points. This makes it all trickier when implementing this on the Cell BE – parallel writes by SPEs to shared locations will need some form of synchronisation. That could be messy, and the process of load/modify/store when expressed in terms of SPU DMA can be quite clumsy.
Rather than try to implement a complex locking/synchronisation system, I have tried to apply some ideas from a set of post-it notes by Mike Acton (you can see them here). This isn’t identical to Mike’s solution, because it’s not the same problem.
To explain – each SPE thread iterates various points on the screen, and generates a list of points to be written. This list of points is sent via DMA to a buffer for the SPE’s use the PPE, which proceeds through the list plotting the points to the framebuffer. The advantage of this approach is that there is only one writer to the framebuffer (the PPE), and that each SPE has it’s own buffers to write its data into. The only synchronisation that is necessary is between each SPE and the PPE to ensure that all data in a buffer is consumed before writing more into it. This is achieved through the use of interrupt mailboxes (SPE tells PPE that there is data), a fenced DMA to act as sentinel (the PPE spins on the arrival of the sentinel data to ensure that DMA of a buffer has completed – this doesn’t feel like the right way to solve this particular problem, though), and the SPE signal register in OR mode to inform the SPE that a particular buffer has been finished with. Interrupt mailbox events are aggregated through libspe2’s spe_event_*() functions.
It’s not an especially complex piece of code – the motivation in its writing is for my own interest and to use for the tech talk. I think it will do nicely for explaining some of the complexities and curiosities of the Cell BE architecture, and the programming of it with the IBM SDK.
There are a few extra features that I’d like to add – particularly better colouring (including saturation which is unfortunately apparent in its absence), and a number of optimisations to the render_fractal() function that I need to lift from my earlier Mandelbrot efforts.
The program includes code by Jeremy Kerr (See hackfest items at http://ozlabs.org/~jk/diary/tech/cell/) and Mike Acton (framebuffer utilities, from http://cellperformance.beyond3d.com/articles/2007/03/handy-ps3-linux-framebuffer-utilities.html). My thanks to Jeremy and Mike, and to all those that have offered comments & feedback via twitter.
[edit: Oh, and it includes cheriff’s fine VNC code ;)]
Read the code: fractal.c and spe-fractal.c, or grab a tarball. Comments & suggestions most welcome.
Addition: I added pixel value saturation and experimented with some alternative approaches to colouring…