On to Assembly Primer — Part 4. This is where we start writing a small assembly program for the platform. In this case, I don’t know the language and I don’t know the ABI. Learning these from scratch ranges from interesting to tedious :)
Regarding the language (available instructions, mnemonics and assembly syntax): I’m using the ARM Architecture Reference Manual as my reference for the architecture (odd, I know). It’s very long and the documentation for each instruction is extensive — which is good because there are a lot of instructions, and many of them do a lot of things at once.
Regarding the ABI (particularly things like argument passing, return values and system calls): there’s the Procedure Call Standard for the ARM Architecture, and there are a few other references I’ve found, such as the Debian ARM EABI Port wiki page.
“EABI is the new “Embedded” ABI by ARM ltd. EABI is actually a family of ABI’s and one of the “subABIs” is GNU EABI, for Linux.”
– from Debian ARM EABI Port
To perform a system call using the GNU EABI:
- put the system call number in r7
- put the arguments in r0-r6 (a 64-bit argument must start at an even-numbered register, i.e. it goes in r0+r1, r2+r3 or r4+r5)
- issue the Supervisor Call instruction with a zero operand — svc #0
(Supervisor Call was previously named Software Interrupt — swi)
Based on the above, it’s not difficult to reimplement JustExit.s (original) for ARM.
.text
.globl _start
_start:
    mov r7, #1    @ system call number for exit
    mov r0, #0    @ exit status
    svc #0        @ hand control to the kernel
mov here is Move (Immediate) which puts the #-prefixed literal into the named register.
Likewise, the conversion of HelloWorldProgram.s (original) is not difficult:
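.data
HelloWorldString:
    .ascii "Hello World\n"

.text
.globl _start
_start:
    # Load all the arguments for write()
    mov r7, #4                  @ system call number for write
    mov r0, #1                  @ file descriptor 1 (stdout)
    ldr r1,=HelloWorldString    @ address of the string
    mov r2, #12                 @ length of the string in bytes
    svc #0

    # Need to exit the program
    mov r7, #1                  @ system call number for exit
    mov r0, #0                  @ exit status
    svc #0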
This uses the load register pseudo-instruction, ldr with an = operand: the assembler stores the address of HelloWorldString in the literal pool, a portion of memory located in the program text, and the 32-bit address is loaded from the literal pool at run time (more details).
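To make the pseudo-instruction a bit less magical, here is a rough sketch of the same program with the literal pool entry written out by hand. This is my approximation of what the assembler arranges, not its literal output: the label AddrOfHelloWorldString is a name made up for illustration, and the real pool is placed by the assembler itself (by default at the end of the section, or wherever an .ltorg directive appears).

```asm
.data
HelloWorldString:
    .ascii "Hello World\n"

.text
.globl _start
_start:
    mov r7, #4                      @ system call number for write
    mov r0, #1                      @ file descriptor 1 (stdout)
    ldr r1, AddrOfHelloWorldString  @ PC-relative load of the word stored below
    mov r2, #12                     @ length of the string in bytes
    svc #0

    mov r7, #1                      @ system call number for exit
    mov r0, #0                      @ exit status
    svc #0

@ Hand-written stand-in for the literal pool (hypothetical label):
@ one 32-bit word in the program text holding the string's address.
AddrOfHelloWorldString:
    .word HelloWorldString
```

The ldr with a plain label assembles to a load relative to the program counter, so the only thing encoded in the instruction itself is the distance to the pool entry, not the address of the string.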
When compiling a similar C program with -mcpu=cortex-a8, I notice that the compiler instead generates a Move (immediate) and Move Top pair (movw and movt), which builds the 32-bit address from two 16-bit immediates encoded directly in the instruction stream. That is presumably more efficient on that core, since it avoids the extra load from the literal pool.
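For comparison, this is roughly what the movw/movt form of HelloWorldProgram.s looks like when written by hand. It is a sketch, not the compiler's exact output: the :lower16: and :upper16: operators are GNU assembler syntax for taking the two halves of a symbol's address, and the .arch armv7-a directive is my assumption, added so the assembler accepts the two instructions (they do not exist before ARMv6T2).

```asm
.arch armv7-a                   @ assumption: target that has movw/movt

.data
HelloWorldString:
    .ascii "Hello World\n"

.text
.globl _start
_start:
    mov  r7, #4                             @ system call number for write
    mov  r0, #1                             @ file descriptor 1 (stdout)
    movw r1, #:lower16:HelloWorldString     @ low 16 bits; top half of r1 is cleared
    movt r1, #:upper16:HelloWorldString     @ high 16 bits; low half is preserved
    mov  r2, #12                            @ length of the string in bytes
    svc  #0

    mov  r7, #1                             @ system call number for exit
    mov  r0, #0                             @ exit status
    svc  #0
```

The order matters: movw zeroes the upper half of the register, so it has to come before movt.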