#asmtut  1: true

In an effort to educate the dunces +Knut Arild Erstad and +Jon Packer, who apparently never programmed in assembly, I am going to post step-by-step instructions for making a snake game in x86-64/amd64 assembly on a modern operating system. Here. As if this were some kind of a blog. I am learning x86-64 assembly as I go, which makes this even more of a blog-like experience.

The modern operating system I chose is OS X 10.6. The same assembly might work directly in BSD and should work with only minor changes in Linux. The build rules will require additional changes :)

The goal is to make an interactive realtime snake game in the terminal. Graphics were a lot easier in the old days, when you would just write directly to video memory, but that is unavailable now, and I would rather make this contemporary than make it more graphical but limited to DOSBox.

For programming assembly we need an assembler. GCC comes with one, but it uses AT&T syntax for no good reason, which is harder to read and write than Intel syntax. Therefore, we choose NASM (http://www.nasm.us/pub/nasm/releasebuilds/2.10.05/macosx/nasm-2.10.05-macosx.zip), which implements Intel syntax. Install the nasm binary somewhere in your PATH. You might already have a /usr/bin/nasm from Xcode which is too old to support x86-64, so be sure not to confuse that one with the one you just installed. Running "which nasm" and "nasm -v" shows which one you are actually getting.

Step 1: Assemble, link and execute

In this installment, we are implementing the "true" command line utility. It executes and returns 0, indicating success. An equivalent C program is "int main() { return 0; }". Simple, but a good place to start to check that all the tools are working and that everything is in place.

In reality, unlike the overly abstract world of C, the entry to a process does not act exactly like a function call, and the exit is not like a function return. C programs usually get linked with a bit of startup code, a trampoline that calls main and passes its return value on to exit, to make all of this more convenient. We will skip that and go straight for calling "exit" with 0 as the argument.

Before we get to that, though, we need to supply an entry point to our program and export this symbol to the linker. We do the exporting first, by declaring "global main" at the top of our new file "true.asm". Go ahead, it is safe. Now, immediately below it we write "main:" on a line by itself. "main:" is a label, and referring to this label anywhere will give us its address. This is what the linker needs. (true.asm should now look like this: http://pastebin.com/6ngg0ej5)
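In case the pastebin link is no longer reachable, here is a minimal sketch of what true.asm presumably looks like at this point, assuming it contains nothing beyond the two lines just described. The comments are mine.

```nasm
; true.asm, first version: an entry point with nothing behind it
global main     ; export the symbol "main" so the linker can find it

main:           ; label marking the (still empty) entry point
```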

This is actually all we need to assemble and link.

Assembling (.asm -> .o): nasm -f macho64 -o true.o true.asm

"-f macho64" tells NASM to produce an object file in the 64-bit Mach-O format. This should be "elf64" for Linux, for example. "nasm -hf" gives you a list of the formats NASM supports.

Linking (.o -> executable): ld -macosx_version_min 10.6 -o true -e main true.o

"-e main" tells the linker that "main" is the label of our entry point, and the linker can find it because we have "global main" in our .asm file.

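It should now be possible to execute "./true", and it will probably cause the nondescript error message "Bus error: 10" to appear. That is expected: there are no instructions after our label, so the processor most likely runs off into whatever bytes happen to follow it, and nothing ever tells the operating system that we are done.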

Step 2: exit(0)

We are not going to call the "exit" function in the C runtime library, but rather the "exit" system call via the OS's syscall functionality. There is some information on this in /usr/include/sys/syscall.h, and in it we can see that "SYS_exit" has identification number 1. Nice.

Thanks to http://thexploit.com/secdev/mac-os-x-64-bit-assembly-system-calls/ I also found out that since "exit" is classified as a Unix call, it gets to have an identifier of [whatever's in syscall.h] + 0x02000000. That is, 0x02000001. Great. We now know how to identify the system call "exit".
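The addition can also be left to the assembler. A small sketch, assuming NASM's "equ" directive; the constant names UNIX_CLASS, SYS_EXIT and EXIT_ID are made up for illustration and do not come from any header:

```nasm
UNIX_CLASS  equ 0x02000000              ; offset for calls in the Unix class (hypothetical name)
SYS_EXIT    equ 1                       ; value of SYS_exit in /usr/include/sys/syscall.h

EXIT_ID     equ UNIX_CLASS + SYS_EXIT   ; evaluated at assembly time: 0x02000001
```

Constants defined with "equ" take up no space in the output; they are just names for numbers.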

We also know that exit takes an argument, the value to return.

According to the ABI (http://www.x86-64.org/documentation/abi.pdf) we should put the syscall number in the register "rax" and the first argument in the register "rdi". Think of registers as (global-ish) variables that you don't get to name. So, in pseudocode, we want something like:

    rax := 0x02000001; // Put the ID for SYS_exit into rax
    rdi := 0; // Put the desired exit status value into rdi
    performTheSyscall;

This is quite easy to express in assembly:

    mov rax, 0x02000001
    mov rdi, 0
    syscall

Now, true.asm should look something like this: http://pastebin.com/F0SqLDzq

"syscall" is actually a dedicated machine instruction, and it is the standard way to call into the operating system in 64-bit mode. It exists to make such calls more snappy than the older software-interrupt route.
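Putting the pieces together, here is a sketch of what the complete true.asm presumably looks like, assuming it is just the Step 1 skeleton with the three instructions added under the label. The build commands in the comments are the ones from Step 1.

```nasm
; true.asm: exit immediately with status 0
;
; Assemble: nasm -f macho64 -o true.o true.asm
; Link:     ld -macosx_version_min 10.6 -o true -e main true.o

global main                 ; export the entry point for the linker

main:
    mov rax, 0x02000001     ; SYS_exit (1) + 0x02000000 for the Unix syscall class
    mov rdi, 0              ; first argument: the exit status
    syscall                 ; hand control to the kernel; exit does not return
```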

Now you should be able to assemble, link and run this proper implementation, and it should act exactly like the "true" built-in in bash: no output, and "echo $?" right afterwards prints 0.

Exercise for the reader: Modify this to implement "false" ;)

OBTW TIL: Google+ doesn't offer formatting for code.

Next lesson: https://plus.google.com/u/0/111794994501300143213/posts/TardBAAtq4e