ioctl() and the spu — a tale of many magics

I’d achieved a working implementation of the small spu project I’ve been messing around with and wanted to move on to start pushing particular pixels (rather than debugging in text mode). My program, up to this point, has been a simple spu-only piece of code (launched via the elfspe loader under Linux), and I was reluctant to tack on a some kind of ppu manager just to set up the framebuffer — so, I though, why not do it from the spu itself? (Why not? Because it’s a stupid idea and so on.)

What’s needed to “set up the framebuffer”?  Magic.  Fortunately, the magic has been well documented by Mike Acton and you can get the source here. The trick becomes merely to get those files compiled for the spu, and the biggest barrier to doing that is the ioctl() system call.

There was no existing way to call ioctl() directly from the spu and several possible approaches were apparent.

Callback handler in libspe

libspe provides a number of callback functions for spu programs, allowing them some degree of access to and control of the wider system.  These are built on the __send_to_ppe() function found in newlib. __send_to_ppe() packs up a function’s arguments and stops the spu, signaling to libspe which of several handlers should be utilised.

My first partially-working attempt to access ioctl() used this approach, and once I’d worked out how to debug my way into libspe and back to the spu the solution was quite straightforward.  Particularly, libspe mmaps the spu’s local store, making it possible to pass a pointer to the spu’s local store directly to ioctl().

The third argument to ioctl() can be a pointer or a value and so some decoding of the ‘request’ argument would be required to handle particular cases correctly. While that’s probably not too difficult (probably — I’m not clear on the details), it’s beyond what I was interested in pursuing. For syscalls, though, there is a ‘better’ way…

Stopcode 0x2104

With __send_to_ppe(), newlib provides a similar __linux_syscall() function.  I couldn’t make much sense of it when I first looked at it as the 0x2104 stopcode it uses is not intercepted by libspe. Instead, it is magic and is intercepted directly by the kernel for the express purpose of handling syscalls. Hooray?

Almost. If the third argument to ioctl() is a pointer, it needs to contain a valid ea. Unfortunately, there appears to be no sane way for a spu to mmap it’s own local store, meaning an lsa cannot be converted to a valid ea without some external assistance, and so we must consider other options…

__ea

Named address spaces are an extension to the C language that provide a way to access eas from a spu in a somewhat transparent fashion.  On the spu, these work by qualifying pointers with __ea to indicate that they contain an ea, and the complier handles the intervening magic (gcc-4.5 and the various versions of gcc and xlc that have shipped with the IBM SDK have support for this extension).

While this method might work, I didn’t want to give up a chunk of my local store for the software cache that the implementation in gcc uses — I have a hunch that I’ll will almost run out of space when I add some extra DMA buffers to this program. (And I’m not sure what’s needed to get  __ea storage working correctly without a separate ppu context).

A more convoluted method

In the absence of a neater approach, here’s how I got it working:

  1. mmap some scratch space in main memory (using mmap_eaddr() from newlib)
  2. call ioctl() with the address of that scratch space
  3. DMA the data from the scratch space to local store

Something like:

void* scratch = (void*)(uint32_t)mmap_eaddr(0ULL, size, PROT_READ|PROT_WRITE,
                                            MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
ioctl(fb_fd, FBIOGET_VBLANK, scratch);
struct fb_vblank vblank;
spu_mfcdma32(&vblank, (uint32_t)scratch, ROUNDUP16(sizeof(vblank)),
             0, MFC_GET_CMD);
mfc_write_tag_mask(1);
spu_mfcstat(MFC_TAG_UPDATE_ALL);

And so, ugly as it is, I can now call ioctl() from spu code.  Lather, rinse & repeat as desired.

Of course, there were a few other problems to solve: because I was compiling cp_{fb,vt}.c for the spu but using the (ppu) system headers to get the fb defns, the compiler sees the prototypes for the system mmap, munmap, printf and snprintf functions. Extra wrappers are needed for each of those (and so I now know about the v*printf family).

Once all that is done, it works. I can push pixels to the screen from the spu.

Flipping and vsync is an interesting, and slightly awkward, case as they are passed the address of a value, not the value itself (for no readily apparent reason), so the values are allocated early on and referenced as needed.

The implementation can be found here. It’s functional. :)  There are many opportunities to tidy it up and simplify it. Right now it’s a fairly direct and naive port from the original.

My thanks to Mike Acton for the original setup code, and to Ken Werner for helping me get my head around the problem and its possible solutions.

Building gdb for Cell/B.E.

Trying to debug a bus error on my PS3, I realised that the version of GDB I have installed doesn’t support debugging of SPU programs. There doesn’t seem to be a Debian packaged version available that does, so I built my own.

Because I found no obvious google result, I share this with the zero other people that I expect may one day be interested : the key option for configure appears to be

--enable-targets=spu

This information was brought to you via the gdb.spec file, and a post to the gcc-testresults mailing list.

Some Linux on PS3 updates

I’ve updated the Linux distros page with links to articles about installing Ubuntu 9.10 [Joep Cremers Weblog] and Fedora 12 [Geoff Levand, Ken Werner] on the Playstation 3.

In mosty unrelated news,

  • IBM will not be releasing a previously planned successor to the PowerXCell 8i [heise online, Playstation University]
    (neither article conveys particularly clear information to me – not least because I have only a machine translation of the first one)
  • Playstation 3 firmware version 3.10 seems to have added ABC iView and BBC iPlayer support to the XMB (the presence of which *may* be related to your PSN region, geographical restrictions to accessing video content still seem to apply). If you can’t see it in the TV column of your XMB, try rebooting…

PS3, Fedora 11 and OpenCL

Ken Werner has written a nice guide to installing Fedora 11 on the PS3.  (And I’ve added the link to the Linux distro page).

Inside Ken’s page is some details on installing and running demos from the recent “OpenCL Development Kit for Linux on Power”, which is something I’m now even more keen to try :)

(It seems a little odd that the only place that OpenCL SPE acceleration is mentioned is in one of the FAQs…)

A summary of Linux distros for the PS3

I’ve been asked about selecting a Linux distro for use on the Playstation 3 a few times recently, so I’ve put together a page summarising some of the options which has mentions of pdaXrom, Debian, Ubuntu, Yellow Dog, Fedora, RHEL and Gentoo.

It’s all based on my own knowledge and experience – if there’s something you think is worth adding (or correcting, or improving) let me know.

[Update 20091021: Expanded the entry for Gentoo based on suggestions from unsolo & cheriff, and added a link to Windows-Hosted Cell SDK which I saw mentioned by @domipheus]