---
layout: post
title: "System calls"
subtitle: "Bend the stack to your will"
tags: [osdev]
---

System calls are the way user processes communicate with the kernel. Look at the following program, for example.

    #include <stdio.h>

    int main(int argc, char **argv)
    {
      printf("Hello, world!");
      return 0;
    }
{: .lang-c}

When you run the program, the shell makes a couple of system calls, such as `fork()` and `exec()`, before it even starts. The program itself then makes several more system calls before reaching the `write()` and `exit()` calls that the two lines in `main()` represent.

System calls can be performed in several ways, but one of the most common is through a special software interrupt with the `int` instruction. For example, Linux (on 32-bit x86) and most unix-like hobby kernels I've studied use `int 0x80`. That's also what I chose to use in my kernel.

Next is the problem of passing data. The simplest way is using registers, and that's what most projects seem to use. I chose a combination of a single register and the process's own stack.

###Sample system call

Let's look at how `read()` would be implemented. I've not actually implemented it in my kernel yet, but here's how it would work.

####User side

First the definition in the C library:

    int read(int file, char *ptr, int len)
    {
      return _syscall_read(file, ptr, len);
    }

It is simply a wrapper for an assembly function:

    [global _syscall_read]
    _syscall_read:
      mov eax, SYSCALL_READ     ; system call identifier
      int 0x80                  ; trap into the kernel
      mov [syscall_error], edx  ; error code comes back in edx
      ret
{: .lang-nasm}

This function puts an identifier for the system call in the `eax` register and then executes the system call interrupt. The arguments are never touched; they stay on the stack where the caller pushed them.

_Note:_ Here I return the error code through register `edx`. In the actual code at this point, I used the register `ebx`, which `cdecl` expects a function to preserve. I should have looked up [Calling Conventions](http://wiki.osdev.org/Calling_Conventions) more carefully.
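To make the "single register plus the stack" idea concrete, here is a small user-space model of what the stack looks like at the moment the stub reaches `int 0x80`. It is only a sketch and assumes `cdecl`: arguments pushed right to left, then the return address pushed by `call`. The names `fake_stack`, `sim_push`, `sim_call_syscall_read` and `sim_syscall_args` are made up for this illustration and are not part of the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical stand-in for a user stack. Like the x86 stack it grows
 * downward, so a push decrements the pointer before storing. */
static uintptr_t fake_stack[STACK_WORDS];
static uintptr_t *fake_esp = fake_stack + STACK_WORDS;

static void sim_push(uintptr_t value)
{
    assert(fake_esp > fake_stack); /* guard against overflowing the model */
    *--fake_esp = value;
}

/* Model a cdecl call to _syscall_read(file, ptr, len): the caller pushes
 * the arguments right to left, then `call` pushes the return address.
 * Returns the stack pointer as the stub would see it at `int 0x80`. */
static uintptr_t *sim_call_syscall_read(int file, char *ptr, int len)
{
    fake_esp = fake_stack + STACK_WORDS;  /* start from an empty stack */
    sim_push((uintptr_t)len);
    sim_push((uintptr_t)ptr);
    sim_push((uintptr_t)file);
    sim_push((uintptr_t)0xdeadbeefu);     /* placeholder return address */
    return fake_esp;
}

/* The arguments start one word above the return address
 * (that is esp + 4 on 32-bit x86). */
static uintptr_t *sim_syscall_args(uintptr_t *esp)
{
    return esp + 1;
}
```

Calling `sim_call_syscall_read(3, buf, 32)` and passing the result to `sim_syscall_args()` gives an array where index 0 is the file, index 1 the buffer and index 2 the length. Nothing had to be copied into registers for that to work.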
Of course, the assembly stub can be simplified with a macro:

    [global _syscall_read]
    DEF_SYSCALL(read, SYSCALL_READ)
{: .lang-nasm}

####Kernel side

In the kernel, the system call is caught by the following function:

    registers_t *syscall_handler(registers_t *r)
    {
      /* r->eax comes straight from user space; a real handler
         should bounds-check it before indexing the table. */
      if(syscall_handlers[r->eax])
        r = syscall_handlers[r->eax](r);
      else
        r->edx = ERR_NOSYSCALL;
      return r;
    }

If the system call is registered correctly in the kernel (through the macro `KREG_SYSCALL(read, SYSCALL_READ)`), this will pass everything on to the following function:

    KDEF_SYSCALL(read, r)
    {
      process_stack stack = init_pstack();
      r->eax = read((int)stack[0], (char *)stack[1], (int)stack[2]);
      r->edx = errno;
      return r;
    }

The `init_pstack()` macro expands to `(uintptr_t *)(r->useresp + 0x4)`. Skipping the first four bytes steps over the return address left by the call to `_syscall_read`, so `stack[0]` is the first argument, right where the caller pushed it.

The kernel's `read()` function then has the same definition as the library version.

    int read(int file, char *ptr, int len)
    {
      ...
    }

_Spoiler alert:_ Keeping a version of `read()` (and in fact every syscall function) inside the kernel will turn out to have some really cool advantages...

This works for C compiled with the `cdecl` calling convention. For other languages or calling conventions, the asm functions will have to be adjusted.

###Git

The methods described in this post have been implemented in git commit [8a26e26163](https://github.com/thomasloven/os5/tree/8a26e26163c15c9d9854554dce9d4fc5ad8baee5).
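If you want to poke at the kernel-side dispatch without booting anything, the following is a rough user-space sketch of it. It is not the code from the commit. The numeric values of `SYSCALL_READ` and `ERR_NOSYSCALL`, the `NUM_SYSCALLS` limit, the cut-down `registers_t`, and the helpers `register_syscall`, `sys_read` and `fake_read` are all assumptions made for the example. It also adds the bounds check on `eax`, and it skips one `uintptr_t` instead of a literal `0x4` so that it also runs on a 64-bit host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values; the post does not give the real numbers. */
#define SYSCALL_READ   3
#define NUM_SYSCALLS   8
#define ERR_NOSYSCALL  1

/* Cut-down model of the saved register frame: only the fields used here. */
typedef struct registers {
    uintptr_t eax;      /* syscall number in, return value out */
    uintptr_t edx;      /* error code out */
    uintptr_t useresp;  /* user stack pointer at the time of int 0x80 */
} registers_t;

typedef registers_t *(*syscall_fn)(registers_t *);

static syscall_fn syscall_handlers[NUM_SYSCALLS];

/* Hypothetical registration helper, standing in for KREG_SYSCALL. */
static void register_syscall(size_t num, syscall_fn fn)
{
    assert(num < NUM_SYSCALLS);
    syscall_handlers[num] = fn;
}

/* Stand-in for the kernel's read(): fills the buffer with 'x'. */
static int fake_read(int file, char *ptr, int len)
{
    (void)file;
    for (int i = 0; i < len; i++)
        ptr[i] = 'x';
    return len;
}

/* Handler in the style of KDEF_SYSCALL(read, r). */
static registers_t *sys_read(registers_t *r)
{
    /* Step over the return address: one word (0x4 on 32-bit x86). */
    uintptr_t *stack = (uintptr_t *)(r->useresp + sizeof(uintptr_t));
    r->eax = (uintptr_t)fake_read((int)stack[0], (char *)stack[1],
                                  (int)stack[2]);
    r->edx = 0; /* the real handler stores errno here */
    return r;
}

/* Dispatcher: look up the handler by number, or report an error. */
static registers_t *syscall_handler(registers_t *r)
{
    if (r->eax < NUM_SYSCALLS && syscall_handlers[r->eax])
        return syscall_handlers[r->eax](r);
    r->edx = ERR_NOSYSCALL;
    return r;
}
```

To try it, build a fake user stack as an array holding a dummy return address followed by the three arguments, point `useresp` at it, set `eax` to `SYSCALL_READ`, and call `syscall_handler()` after registering `sys_read`. An unregistered or out-of-range number comes back with `edx` set to `ERR_NOSYSCALL`.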