In the last post we wrote a simple assembly program. All it did was exit with status code zero. Our code was
.section .data
.section .text
.globl _start
_start:
movq $60, %rax
movq $0, %rdi
syscall
How does this work? The first line can be ignored for now; it just marks where we would store any user-defined data, if we had any. The second line
.section .text
denotes where our actual code begins. The third and fourth lines
.globl _start
_start:
define a special label called _start. A label is a convenient, human-readable name for a particular memory address. Labels allow us, the programmers, to reference memory addresses without having to use actual numeric addresses. Labels are obviously less error prone, but they also mean that when we change the memory layout of our code we don’t have to recalculate a lot of addresses. The CPU does not understand labels; when the program is assembled and linked, every use of a label is replaced with the actual address it refers to.
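To make the idea of a label concrete, here is a minimal sketch that adds a second label and jumps to it. The label name finish and the use of the jmp instruction are assumptions made for illustration; neither appears in the program above.

```asm
.section .text
.globl _start
_start:
    movq $0, %rdi       # exit code 0
    jmp finish          # "finish" is replaced by the address of the label below
    movq $1, %rdi       # never executed, because the jump skips over it
finish:                 # hypothetical label: a name for the address of the next instruction
    movq $60, %rax      # 60 is the number of the exit system call on 64-bit Linux
    syscall             # hand control to the kernel
```

If we later add or remove instructions between the jump and finish, the address changes but the source does not; the label is resolved again the next time we assemble.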
As we said, _start is a special label that defines the entry point of our program, much like the main function in a C program. The line
.globl _start
just makes this label visible outside of the file itself, so that the linker can find it. If we had left this line out, our program would still assemble, link and run successfully, because the linker would warn that it cannot find _start and fall back to a default entry point. In general we don’t want to rely on this, as in more complex programs the default entry point might not be the entry point we want.
To understand this program there are two things we need to understand, the first of which is the system call. The kernel is the core part of the operating system. It handles all I/O, looks after memory at a low level, reads and writes files, and plenty of other things. When we want to perform any of these tasks we transfer control to the kernel; this is called a system call. We perform a system call with the instruction
syscall
This will immediately transfer control over to the kernel. However, we also need to tell the kernel what we would like it to do for us. To do this we move certain special values into specific registers. (Remember, the registers are small, very fast pieces of memory inside the CPU.) When the kernel takes over, it reads these registers to find out what we are asking of it.
In the above program we use two registers, rax and rdi. These are 64-bit general purpose registers. With the instruction
movq $60, %rax
we move the 64-bit value 60 into the register rax. With the instruction
movq $0, %rdi
we move the 64-bit value 0 into the register rdi. When we perform a system call, the value in the rax register tells the kernel which operation we would like it to perform. In this case, the value 60 tells it that we would like to exit. When exiting, the value in the rdi register is used as the exit code; in this case, we are exiting with code 0.
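A quick way to convince ourselves that rdi really carries the exit code is to change it. The sketch below is the same program with a different value in rdi; the value 7 is an arbitrary choice for illustration.

```asm
.section .text
.globl _start
_start:
    movq $60, %rax      # rax selects the operation: 60 means exit on 64-bit Linux
    movq $7, %rdi       # rdi holds the exit code (7 is arbitrary)
    syscall             # transfer control to the kernel, which ends the program
```

Assuming the file is saved as exit7.s (a hypothetical name) on a 64-bit Linux machine with the GNU tools installed, it can be assembled with as exit7.s -o exit7.o, linked with ld exit7.o -o exit7 and run with ./exit7. Typing echo $? in the shell straight afterwards should then show the exit code we chose.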