From 0b3648de7cf10823e26eef9bb5348ad0cf81f3be Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Thomas=20Lov=C3=A9n?=
Date: Thu, 21 Dec 2017 16:21:58 +0100
Subject: [PATCH] Chapter 8: Exceptions - COMPLETE

---
 doc/8_Exceptions.md | 294 ++++++++++++++++++++++++++++++++++++++++++++
 doc/README.md       |   1 +
 2 files changed, 295 insertions(+)
 create mode 100644 doc/8_Exceptions.md

diff --git a/doc/8_Exceptions.md b/doc/8_Exceptions.md
new file mode 100644
index 0000000..f10b8b3
--- /dev/null
+++ b/doc/8_Exceptions.md
@@ -0,0 +1,294 @@
+# Exceptions and Interrupts
+
+Sometimes, things go wrong. When they do, we want to fail gracefully - or even
+recover. That's the point of exceptions.
+
+## Interrupt Service Routines
+
+The x86 interrupt handling method is, for historical reasons I assume, messy.
+The x86\_64 architecture saw a slight improvement in that the stack pointer and
+segment are always pushed, even if the CPU was running in ring 0 when the
+interrupt happened. Still, though, some exceptions push an error code, and
+others do not. And no data is provided to determine which interrupt occurred,
+besides which interrupt service routine was called.
+
+If all interrupts pushed a dummy error code and an identifying number, a single
+ISR would be enough, and the rest could be done in software.
+
+Anyway. Let's play with the cards we're dealt.
+
+The most common way of solving this discrepancy is by having a number of short
+ISRs in the form
+
+```asm
+isr1:
+  push 0 //; Dummy error code
+  push 1 //; Interrupt number
+  jmp isr_common //; The rest is the same for all interrupts
+```
+
+You may want up to 256 ISRs, so let's do some finger warmup exercises!
+
+Or better yet, let's generate the ISRs automatically. With Python!
+
+`src/kernel/cpu/isr.S.py`
+```python
+#!/usr/bin/env python2
+# -*- coding: utf-8 -*-
+
+from __future__ import print_function
+
+num_isr = 256
+pushes_error = [8, 10, 11, 12, 13, 14, 17]
+
+print('''
+.intel_syntax noprefix
+.extern isr_common
+''')
+
+
+print('// Interrupt Service Routines')
+for i in range(num_isr):
+    print('''isr{0}:
+    cli
+    {1}
+    push {0}
+    jmp isr_common '''.format(i,
+    'push 0' if i not in pushes_error else 'nop'))
+
+print('')
+print('''
+// Vector table
+
+.section .data
+.global isr_table
+isr_table:''')
+
+for i in range(num_isr):
+    print('    .quad isr{}'.format(i))
+```
+
+This outputs an assembly file with 256 ISRs like the one above, except numbers
+8, 10, 11, 12, 13, 14 and 17, which have a `nop` instruction instead of pushing
+a bogus error code.
+
+It's written for Python 2 because that's what's included in the Alpine version
+the build Docker image is based on - despite it being 2018. The encoding is
+utf-8, and I import the print function from \_\_future\_\_, because it's 2018.
+
+It also makes a table with pointers to each ISR, which makes it easy to set up
+the Interrupt Descriptor Table later:
+
+`src/kernel/cpu/interrupts.c`
+```c
+...
+struct idt
+{
+  uint16_t base_l;
+  uint16_t cs;
+  uint8_t ist;
+  uint8_t flags;
+  uint16_t base_m;
+  uint32_t base_h;
+  uint32_t _;
+}__attribute__((packed)) idt[NUM_INTERRUPTS];
+
+extern uintptr_t isr_table[];
+
+void interrupt_init()
+{
+  memset(idt, 0, sizeof(idt));
+  for(int i=0; i < NUM_INTERRUPTS; i++)
+  {
+    idt[i].base_l = isr_table[i] & 0xFFFF;
+    idt[i].base_m = (isr_table[i] >> 16) & 0xFFFF;
+    idt[i].base_h = (isr_table[i] >> 32) & 0xFFFFFFFF;
+    idt[i].cs = 0x8;
+    idt[i].ist = 0;
+    idt[i].flags = IDT_PRESENT | IDT_DPL0 | IDT_INTERRUPT;
+  }
+...
+```
+
+`isr_common` pushes all registers to the stack (one by one, there's no `pusha`
+instruction in x86\_64) and passes control to a C interrupt handler. Note that
+for x86\_64 the arguments to a function are not primarily passed on the stack,
+but in registers. So the last thing it does before calling the C function is
+move the stack pointer into `rdi`. If the handler returns, `isr_common`
+restores the stack pointer from `rax` (the function's return value),
+pops all values again, and performs an `iretq` instruction, which is pretty
+much a backwards interrupt.
+
+`src/kernel/cpu/isr_common.S`
+```asm
+...
+isr_common:
+  push r15
+  push r14
+...
+  push rbx
+  push rax
+  mov rdi, rsp
+  call int_handler
+
+  mov rdi, rax
+isr_return:
+  mov rsp, rdi
+  pop rax
+  pop rbx
+...
+  pop r14
+  pop r15
+  add rsp, 0x10
+  iretq
+```
+
+But what's the deal with passing `rax` to `rsp` via `rdi`? Doing it this way
+will allow us to call `isr_return` as a function, with a faked interrupt stack.
+We'll use this later to get into user mode.
+
+## Building isr.S.py
+
+But back to the ISRs. In order to build this, we need some changes in the
+kernel makefile.
+First of all, the lines
+
+```make
+SRC := $(wildcard **/*.[cS])
+OBJ := $(patsubst %, %.o, $(basename $(SRC)))
+```
+
+need to be updated to allow more file extensions:
+
+```make
+SRC := $(wildcard **/*.[cS]*)
+OBJ := $(patsubst %, %.o, $(basename $(basename $(SRC))))
+```
+
+We also need a special rule to generate .o files from .S.py:
+
+`src/kernel/Makefile`
+```make
+%.o: %.S.py
+	python $^ | $(COMPILE.S) $(DEPFLAGS) -x assembler-with-cpp - -o $@
+```
+
+In theory, a rule of the following form should be enough:
+
+```make
+%.S: %.S.py
+	python $^ > $@
+```
+
+However, this generates the dependency tree .o <- .s <- .S <- .py rather than
+.o <- .S <- .py, which uses `as` to compile, and causes some other trouble as
+well with intermediate files that are removed once, but not if you run make
+again, and stuff...
+
+Some of this can be solved with an `.INTERMEDIATE:` rule, but that's not very
+elegant. The big problem's probably with me rather than make.
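+
+For reference, here is a rough sketch of what that `.INTERMEDIATE:` variant
+might look like. It is not the rule the kernel makefile ends up using, and the
+path `cpu/isr.S` is an assumption based on where `isr.S.py` lives:
+
+```make
+# Two-step variant: write the generated assembly to disk first.
+%.S: %.S.py
+	python $^ > $@
+
+# Mark the generated file as intermediate (path is an assumption), so make
+# deletes it after the build and does not recreate it just because it is
+# missing while the object file is still up to date.
+.INTERMEDIATE: cpu/isr.S
+```
+
+This only tidies up the leftover generated file. It does not by itself change
+which chain of implicit rules make picks to get from .S to .o.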
+
+
+## The Interrupt Handler
+
+The C interrupt handler routine is a simple thing. Its default modus operandi
+is to print an error message and hang.
+
+However, before doing this, it checks a table of other interrupt handlers, and
+if one exists for the current interrupt, it passes execution over to that.
+
+`src/kernel/cpu/interrupts.c`
+```c
+registers *int_handler(registers *r)
+{
+  if(int_handlers[r->int_no])
+    return int_handlers[r->int_no](r);
+
+  debug("Unhandled interrupt occurred\n");
+  debug("Interrupt number: %d Error code: %d\n", r->int_no, r->err_code);
+  debug_print_registers(r);
+
+  PANIC("Unhandled interrupt occurred");
+  for(;;);
+}
+```
+
+## Final Note
+
+For tidiness' sake, I wrapped the call to `interrupt_init` inside a function
+called `cpu_init`, which in turn is called from `kmain`. For now, that's all it
+is, but it will soon grow more important.
+
+## Bonus: Debugging Interrupts
+
+There's a small problem with the way interrupts are handled by the processor;
+they don't follow the calling convention.
+
+This means that when an interrupt occurs, and the debugger breaks in the
+`PANIC` macro, it has lost all context, and we can't see what happened.
+
+But wait. The entire context is saved. It was pushed to the stack and passed to
+the interrupt handler. And by using gdb's ability to set the value of registers
+in qemu, we can bring it back into scope.
+
+I put the following function in `toolchain/gdbinit`
+
+```gdb
+define restore_env
+set $name = $arg0
+python
+
+registers = {r: gdb.parse_and_eval('$name->' + r) for r in
+['rax', 'rbx', 'rcx', 'rdx', 'rsi', 'rdi', 'rbp', 'rsp', 'r8', 'r9', 'r10',
+'r11', 'r12', 'r13', 'r14', 'r15', 'rip']}
+
+for r in registers.items():
+  gdb.parse_and_eval('$%s=%s' % r)
+gdb.execute('frame 0')
+end
+end
+```
+
+And it's used like this:
+
+```
+(gdb) c
+Continuing.
+
+Thread 1 hit Breakpoint 2, int_handler (r=0xffffff8000019f10) at cpu/interrupts.c:74
+74        PANIC("Unhandled interrupt occurred");
+(gdb) restore_env r
+#0  0xffffff8000010caa in divide_two_numbers (divisor=0, dividend=0) at boot/kmain.c:18
+18        return dividend/divisor;
+(gdb) bt
+#0  0xffffff8000010caa in divide_two_numbers (divisor=0, dividend=0) at boot/kmain.c:18
+#1  0xffffff8000010dbd in kmain (multiboot_magic=920085129, multiboot_data=0x105fa0) at boot/kmain.c:33
+#2  0xffffff8000010efd in .reload_cs () at boot/boot.S:96
+#3  0x0000000000000007 in ?? ()
+#4  0x0000000000000730 in ?? ()
+#5  0x0000000000000000 in ?? ()
+(gdb) list
+13        for(;;);
+14      }
+15
+16      int divide_two_numbers(int divisor, int dividend)
+17      {
+18        return dividend/divisor;
+19      }
+20
+21      void kmain(uint64_t multiboot_magic, void *multiboot_data)
+22      {
+(gdb) p divisor
+$1 = 0
+(gdb) p dividend
+$2 = 5
+(gdb) frame 1
+#1  0xffffff8000010dbd in kmain (multiboot_magic=920085129, multiboot_data=0x105fa0) at boot/kmain.c:33
+33        divide_two_numbers(0,5); // Calculate 0/5 and discard the results
+(gdb)
+```
+
+By restoring the processor to the state stored in `r`, we can debug from where
+the interrupt occurred as normal. By backtracing and inspecting variables we
+find that whoever wrote line 33 in `kmain.c` got the divisor and dividend mixed
+up, which resulted in a divide by zero exception.
diff --git a/doc/README.md b/doc/README.md
index 8c85053..a41c206 100644
--- a/doc/README.md
+++ b/doc/README.md
@@ -10,4 +10,5 @@
 [Chapter 5: Unit Testing Framework](5_Unit_Testing.md)
[Chapter 6: Debug output](6_Debug_Output.md)
[Chapter 7: Multiboot Data](7_Multiboot_Data.md)
+[Chapter 8: Exceptions and Interrupts](8_Exceptions.md)