These are my notes for where I can see SPU varying from ia32, as presented in the video Part 4 — Hello World.
I’ve written about syscalls on SPU before, here. System calls can be performed using appropriately packed data and stopcode 0x2104, which is intercepted by the kernel.
The JustExit.s example in the video uses the exit syscall, which is explicitly excluded from being called by the SPU. For the sake of the example, we can use the time syscall instead, and so a simple syscall looks something like this:
    .data
syscall_time:
    .quad 13                # syscall number and return value
    .quad 0                 # parameters
    .quad 0
    .quad 0
    .quad 0
    .quad 0
    .quad 0

    .text
    .globl _start
_start:
    stop 0x2104
    .int syscall_time
syscall_time is the structure used by the syscall (see struct spu_syscall_block and __linux_syscall()) with 13 in the first unsigned long long. (“Obviously”, .quad is 8 bytes :\). If there were arguments to pass to the syscall, they would be placed in the six following .quads.
The address of the syscall block must follow directly after the stop instruction. (I did wonder if there would be some trick to mixing the address with the program code — as you can see, no trick needed)
The syscall’s return value is placed in the first 8 bytes of the syscall block.
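To make the layout concrete, here is a minimal Python sketch that packs the same seven 8-byte fields and reads the return value back out of the first one. It assumes the SPU's big-endian byte order; the helper names pack_syscall_block and unpack_return_value are my own inventions for illustration, and nothing here talks to a real SPU or kernel.

```python
import struct

# Seven big-endian unsigned 64-bit fields: syscall number / return value,
# followed by six parameter slots (mirrors the seven .quads in the listing).
BLOCK_FORMAT = ">7Q"


def pack_syscall_block(number, *params):
    """Hypothetical helper: build the 56-byte block for a syscall."""
    if len(params) > 6:
        raise ValueError("at most six parameters fit in the block")
    padded = list(params) + [0] * (6 - len(params))
    return struct.pack(BLOCK_FORMAT, number, *padded)


def unpack_return_value(block):
    """Hypothetical helper: read the first 8 bytes, where the result lands."""
    return struct.unpack_from(">Q", block, 0)[0]


if __name__ == "__main__":
    block = pack_syscall_block(13)   # time, no arguments
    print(len(block))                # 7 fields of 8 bytes each
    print(block[:8].hex())           # the syscall number, big-endian
```

Until the kernel overwrites it, the first field still holds the syscall number, so unpack_return_value(pack_syscall_block(13)) simply gives back 13.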
While it’s possible to use the write syscall, it’s rather painful, as it requires a valid char* effective address (ea) for the data to be written, and that is not readily accessible from the SPU. The alternative is to use the __send_to_ppe() function: write() is one of the POSIX1 functions handled by newlib+libspe. Of course, it has a slightly different calling mechanism to __linux_syscall(), and uses the JSRE_POSIX1 stopcode of 0x2101. This works:
    .data
HelloWorldString:
    .ascii "Hello world\n"
send_to_ppe_write:
    .int 1                  # stdout
    .int 0                  # pad
    .int 0                  # pad
    .int 0                  # pad
    .int HelloWorldString   # char* in local store
    .int 0                  # pad
    .int 0                  # pad
    .int 0                  # pad
    .int 12                 # length
    .int 0                  # pad
    .int 0                  # pad
    .int 0                  # pad

    .text
    .globl _start
_start:
    stop 0x2101
    .int send_to_ppe_write+0x1b000000
Although I’m sure it can be expressed more elegantly.
The magic number 0x1b000000 added to the address of send_to_ppe_write is derived from the “combined” variable from __send_to_ppe() with the values from jsre.h.
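As a rough check of that arithmetic, the Python sketch below rebuilds the word that follows the stop instruction and the padded argument block. The opcode value 0x1b is simply the top byte of the constant used in the listing; the 24-bit address mask is my assumption about how the two halves are split, and the helper names combine and pack_args, along with the example addresses, are hypothetical.

```python
import struct

OPCODE_WRITE = 0x1B        # top byte of the 0x1b000000 constant in the listing
OPCODE_SHIFT = 24          # assumption: the opcode occupies bits 24-31
ADDRESS_MASK = 0x00FFFFFF  # assumption: the low 24 bits carry the local-store address


def combine(opcode, ls_address):
    """Hypothetical helper: build the word placed after `stop 0x2101`."""
    return ((opcode << OPCODE_SHIFT) & 0xFF000000) | (ls_address & ADDRESS_MASK)


def pack_args(*values):
    """Put each 32-bit argument at the start of its own 16-byte slot,
    matching the value-plus-three-pads pattern in the listing."""
    return b"".join(struct.pack(">4I", value, 0, 0, 0) for value in values)


if __name__ == "__main__":
    args_address = 0x80  # hypothetical local-store address of send_to_ppe_write
    print(hex(combine(OPCODE_WRITE, args_address)))
    print(len(pack_args(1, 0x90, 12)))  # fd, hypothetical string address, length
```

For any address that fits in the low 24 bits, the OR in combine gives the same result as the plain addition the assembler performs in the listing.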
Alternatively…
I just realised that this works: if /proc/sys/kernel/randomize_va_space is zero, the address of the mapped SPU LS (from /proc/$PID/maps, as seen here) of 0xf7f70000 can be used as an offset to anything in local store, so the syscall will work with offset pointers:
    .data
HelloWorldString:
    .ascii "Hello World\n"
syscall_write:
    .quad 4                 # write
    .quad 1                 # stdout
    .int 0                  # gas doesn't seem to like doing the arithmetic in a .quad
    .int 0xf7f70000 + HelloWorldString  # mapped address of string
    .quad 12                # length
    .quad 0
    .quad 0
    .quad 0

    .text
    .globl _start
_start:
    stop 0x2104
    .int syscall_write
Hideous :)
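For what it's worth, the .int pair in that listing is just a big-endian 64-bit value written as two halves. A small Python sketch of the arithmetic, assuming the 0xf7f70000 mapping from above (which will differ between processes and systems) and a made-up local-store offset; ls_to_ea and quad_as_two_ints are names I've invented here:

```python
import struct

LS_BASE = 0xF7F70000  # mapped local-store address quoted above; not portable


def ls_to_ea(ls_offset, base=LS_BASE):
    """Hypothetical helper: local-store offset to PPE-side effective address."""
    return base + ls_offset


def quad_as_two_ints(value):
    """Split a 64-bit value into its high and low 32-bit words."""
    return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF


if __name__ == "__main__":
    ea = ls_to_ea(0x80)  # 0x80 is a hypothetical offset of HelloWorldString
    high, low = quad_as_two_ints(ea)
    print(hex(high), hex(low))
    # Two big-endian .ints laid end to end are the same bytes as one .quad.
    print(struct.pack(">II", high, low) == struct.pack(">Q", ea))
```

The high word comes out as zero, which is why the listing gets away with a literal .int 0 in front of the computed address.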
Previous assembly primer notes…
Part 1 — System Organization — PPC — SPU
Part 2 — Memory Organisation — SPU
Part 3 — GDB Usage Primer — PPC & SPU