Chapter 4: Higher Half Kernel - COMPLETE
This commit is contained in:
parent
823560d3ae
commit
a908284dc6
334
doc/4_Higher_Half_Kernel.md
Normal file
@ -0,0 +1,334 @@
# Chapter 4 - "Higher Half" Kernel

In this chapter we'll make our kernel run in the top of memory - well out of
the way of user programs and memory mapped devices.
|
||||
|
||||
## What is a higher half kernel

Some arguments for a higher half kernel can be found at [the osdev
wiki](http://wiki.osdev.org/Higher_Half_Kernel). There are arguments against
as well, such as it being pointless with modern memory management routines. My
main argument for using it is that it makes things simpler.

I chose to put the split at `0xFFFFFF8000000000`, which corresponds to the last
entry in P4.
|
||||
|
||||
> A note about the address. If you go through the calculations, you'll find
> that the addresses mapped from the last entry in P4 actually start at
> `0xFF8000000000` or `0x0000FF8000000000`. However, with 48 bit virtual
> addresses, the x86-64 architecture requires the 16 most significant bits of
> an address to be copies of bit 47; thus `0xFFFFFF8000000000`. This address
> format is called 'canonical'.
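The calculation in the note can be sketched in a few lines of c. This is only an illustration meant to be compiled on the development machine, not kernel code, and the function names `canonical` and `p4_entry_base` are made up for this example. It assumes 48 bit virtual addresses and 512 entries per table.

```c
#include <stdint.h>

/* Sign-extend a 48 bit virtual address: copy bit 47 into bits 48-63. */
static uint64_t canonical(uint64_t addr)
{
  const uint64_t low48 = addr & 0x0000FFFFFFFFFFFFULL;
  if (low48 & (1ULL << 47))
    return low48 | 0xFFFF000000000000ULL;
  return low48;
}

/* First address translated through P4 entry `index` (0-511).
 * Each P4 entry covers 2^39 bytes (512 GiB). */
static uint64_t p4_entry_base(unsigned index)
{
  return canonical((uint64_t)index << 39);
}
```

For the last entry, `511 << 39` gives `0x0000FF8000000000`, and sign extension turns that into `0xFFFFFF8000000000`.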
|
||||
|
||||
## Higher half linking

I want to have the kernel running at address `0xFFFFFF8000000000` and above.
However, when we're booted from GRUB, paging is not enabled, so
we're limited to physical RAM.

The solution to this problem is to tell GRUB to load the kernel into a low
memory position, but make the kernel think it's loaded at a high position. This
can be done with a linker script trick.

We change the linker script from Chapter 2 to something like this:
|
||||
|
||||
`src/kernel/Link.ld`
```
ENTRY(_start)

KERNEL_OFFSET = 0xFFFFFF8000000000;
KERNEL_START = 0x10000;

SECTIONS
{
  . = KERNEL_START + KERNEL_OFFSET;
  .text : AT(ADDR(.text) - KERNEL_OFFSET)
  {
    *(.multiboot)
    *(.text)
  }
}
```
|
||||
|
||||
What this does is tell the linker to assume the code starts at
`0xFFFFFF8000010000` when calculating addresses for things like function calls
and jumps and such, but to generate an ELF file with headers saying the `.text`
section should be *loaded* at an address `0xFFFFFF8000000000` below that, i.e.
`0x10000` which is within the physical RAM limits.

This also means that GRUB will jump to address `0x10000` after loading the
kernel, and from there we can set up paging and jump to above
`0xFFFFFF8000000000`. We just need to take care with all memory references,
since we can't trust the linker to sort them out before paging is set up.

Oh, and you will also want to do the same with the other default ELF sections
in the linker file, such as `.rodata`, `.data` and `.bss`.
|
||||
|
||||
## Fixing memory references

In order to make sure all memory references are correct, we'll define some
helpful macros.
|
||||
|
||||
`src/kernel/include/memory.h`
```c
#define KERNEL_OFFSET 0xFFFFFF8000000000

#ifdef __ASSEMBLER__
#define V2P(a) ((a) - KERNEL_OFFSET)
#define P2V(a) ((a) + KERNEL_OFFSET)
#else
#include <stdint.h>
#define V2P(a) ((uintptr_t)(a) & ~KERNEL_OFFSET)
#define P2V(a) ((void *)((uintptr_t)(a) | KERNEL_OFFSET))
#endif
...
```
|
||||
|
||||
I define two versions of the macros, one for use in assembly and one for c.
`__ASSEMBLER__` is set by gcc when compiling a `.S` file. The c version uses
bit operations, which means you can apply a macro twice, e.g.
`V2P(V2P(address))`, without any problems due to the format of
`KERNEL_OFFSET`. The proof of this is left as an exercise to the reader.
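Without spoiling the exercise, the difference between the two versions can be checked on the development machine. The sketch below assumes a 64 bit host and a `P2V` that casts its result to a pointer. `V2P_SUB` is a made-up name for the subtraction variant used in the assembler branch, included only for comparison.

```c
#include <stdint.h>

#define KERNEL_OFFSET 0xFFFFFF8000000000

/* Bit operation variants (the c branch of memory.h). */
#define V2P(a) ((uintptr_t)(a) & ~KERNEL_OFFSET)
#define P2V(a) ((void *)((uintptr_t)(a) | KERNEL_OFFSET))

/* Hypothetical subtraction variant, mirroring the assembler branch. */
#define V2P_SUB(a) ((uintptr_t)(a) - KERNEL_OFFSET)

/* Applying the bit operation macro twice gives the same result as once. */
static uintptr_t v2p_twice(uintptr_t address)
{
  return V2P(V2P(address));
}

/* Applying the subtraction twice subtracts the offset twice. */
static uintptr_t v2p_sub_twice(uintptr_t address)
{
  return V2P_SUB(V2P_SUB(address));
}
```

With the bit operations, a caller doesn't need to know whether an address has already been converted. With the subtraction, converting an already physical address wraps around to a bogus value.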
|
||||
|
||||
Note that although we don't have access to a standard c library at this point,
`stdint.h` (which defines `uintptr_t`) can still be used since it's one of the
freestanding headers that ship with gcc, which we built and installed in the
docker image.

Then we need to go through our code and make sure all memory references are
corrected.
|
||||
|
||||
`src/kernel/boot/boot.S`
```asm
#include <memory.h>
.intel_syntax noprefix

...
_start:
  cli
  mov esp, offset V2P(BootStack)

  ...

  mov eax, offset V2P(BootP4)
  mov cr3, eax

  ...

  lgdt [V2P(BootGDTp)]

  ...

  jmp 0x8:V2P(long_mode_start)
...
```
|
||||
|
||||
Note that `call` instructions don't have to be modified, since `call` uses
relative addressing.

And don't forget about the memory references in the page tables:

`src/kernel/boot/boot_PT.S`
```asm
...
BootP4:
  .quad offset V2P(BootP3) + (PAGE_PRESENT | PAGE_WRITE)
...
```
|
||||
|
||||
Note also that the GDT pointer does not need to be redirected to `V2P(BootGDT)` as one would assume. But why is that? Because of pure luck and coincidence.

Before starting long mode the `lgdt` instruction expects a 32 bit GDT base address. If there's any more data, the top bits are simply ignored (the processor is little-endian, so the low 32 bits come first in memory). As luck would have it, the only difference between `BootGDT` and `V2P(BootGDT)` lies in the top 32 bits. This also means that when it's time to load a 64 bit GDT, we can use the same pointer. Neat!
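The coincidence can be demonstrated on the development machine. The sketch below assumes a little-endian host (like the x86 target) and a GDT pointer laid out as a 16 bit limit followed by a 64 bit base. The struct and function names are made up for this example, and so is any address you pass in.

```c
#include <stdint.h>
#include <string.h>

#define KERNEL_OFFSET 0xFFFFFF8000000000ULL

/* 64 bit GDT pointer: 16 bit limit followed by a 64 bit base address. */
struct gdt_pointer {
  uint16_t limit;
  uint64_t base;
} __attribute__((packed));

/* The base a 32 bit `lgdt` would use: only the 4 bytes right after the limit.
 * Reading them like this assumes a little-endian host. */
static uint32_t base_seen_in_32bit_mode(const struct gdt_pointer *p)
{
  uint32_t base32;
  memcpy(&base32, (const unsigned char *)p + sizeof(uint16_t), sizeof base32);
  return base32;
}
```

For a pointer whose base is a higher half address such as `0xFFFFFF8000011000`, the function returns `0x11000`, which is exactly what subtracting `KERNEL_OFFSET` would have given.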
|
||||
|
||||
At this point, it would be a good idea to check that the kernel still boots.
However, gdb won't be able to tell you anything about the code since we're
running outside of the linked addresses. You can still use it to inspect
registers and such, though. You can also set breakpoints by modifying the
address manually: `(gdb) break *(long_mode_start - 0xFFFFFF8000000000)`.
|
||||
|
||||
## Jumping to higher half

The final piece of setup we need to do before we can start running in the
higher half is to update the page table.

We'll do this by adding a pointer to the same P3 we set up earlier at the end
of `BootP4`.

`src/kernel/boot/boot_PT.S`
```asm
...
BootP4:
  .quad offset V2P(BootP3) + (PAGE_PRESENT | PAGE_WRITE)
  .rept ENTRIES_PER_PT - 2
    .quad 0
  .endr
  .quad offset V2P(BootP3) + (PAGE_PRESENT | PAGE_WRITE)
...
```
|
||||
|
||||
If you start up the emulator now you can check that the higher half is mapped:

```
(gdb) mmap
0000000000000000-0000000040000000 0000000040000000 -rw
0000ff8000000000-0000ff8040000000 0000000040000000 -rw
```

Note that qemu doesn't report the addresses in canonical form.
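To connect the listing to the page table, here is a small sketch of how a virtual address picks its P4 entry, and what an address looks like with the sign extension stripped. It is meant for the development machine only. The function names are made up, and 48 bit virtual addresses with 512 entries per table are assumed.

```c
#include <stdint.h>

#define ENTRIES_PER_PT 512

/* Index into P4 for a virtual address: bits 39-47. */
static unsigned p4_index(uint64_t vaddr)
{
  return (unsigned)((vaddr >> 39) & (ENTRIES_PER_PT - 1));
}

/* Drop the sign extension in bits 48-63, leaving the 48 bit form
 * that shows up in the listing above. */
static uint64_t strip_sign_extension(uint64_t vaddr)
{
  return vaddr & 0x0000FFFFFFFFFFFFULL;
}
```

`p4_index(0xFFFFFF8000000000)` is 511, the last entry, and `strip_sign_extension(0xFFFFFF8000000000)` is `0x0000FF8000000000`, the start of the second range in the listing.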
|
||||
|
||||
Anyway. It should now be safe to jump to higher half code:

`src/kernel/boot/boot.S`
```asm
...
.code64
long_mode_start:
  mov eax, 0x0
  mov ss, eax
  mov ds, eax
  mov es, eax
  mov fs, eax
  mov gs, eax

  movabs rax, offset upper_memory
  jmp rax

upper_memory:

  jmp $
```
|
||||
|
||||
By loading the address of `upper_memory` into a register and jumping to it
we force the assembler to make a non-relative jump.

If you run this, you'll find that gdb will be able to track where in the code
you are again (after passing `upper_memory:`), or you could check the `RIP`
register.

```
upper_memory () at boot/boot.S:116
116 jmp $
(gdb) reg RIP
RIP=ffffff800001019f
```
|
||||
|
||||
Great! Now we can do some cleanup.

Move the stack pointer to higher half memory:
```asm
...
upper_memory:
  mov rax, KERNEL_OFFSET
  add rsp, rax
...
```
|
||||
|
||||
and unmap the identity mapping of the first gigabyte and reload the page table:

```asm
...
  mov rax, 0
  movabs [BootP4], rax

  mov rax, cr3
  mov cr3, rax
...
```

Run it all again, and check that the low memory is unmapped:

```
(gdb) mmap
0000ff8000000000-0000ff8040000000 0000000040000000 -rw
```
|
||||
|
||||
Finally, we also need to reload the GDT. The GDT register holds a linear
(virtual) address, and it still points to the identity mapped low address of
the GDT, which we just unmapped...

So we need to

- reload the GDT and update the data selectors:
```asm
...
  lgdt [rax]
  mov rax, 0x0
  mov ss, rax
  mov ds, rax
  mov es, rax
...
```
- and reload the code selector. A far jump with an immediate
`selector:offset` operand is not available in long mode, so instead we'll use
the `retfq` instruction, which pops a return address and code segment selector
off the stack:
```asm
...
  movabs rax, offset .reload_cs
  pushq 0x8
  push rax
  retfq
.reload_cs:
```
|
||||
|
||||
## Running c code

Now that the instruction pointer is safely within our linked memory, we can
trust c code to run.

Calling a c function is simple enough:

`src/kernel/boot/boot.S`
```asm
...
.reload_cs:

  .extern kmain
  movabs rax, offset kmain
  call rax

  hlt
  jmp $
```
|
||||
|
||||
And the c source file:
`src/kernel/boot/kmain.c`
```c
#include <memory.h>

void clear_screen()
{
  unsigned char *vidmem = P2V(0xB8000);
  // The text mode screen is 80 columns by 25 rows, two bytes per character
  for(int i=0; i < 80*25*2; i++)
    *vidmem++ = 0;
}

void print_string(char *str)
{
  unsigned char *vidmem = P2V(0xB8000);
  while(*str)
  {
    *vidmem++ = *str++;
    *vidmem++ = 0x7;
  }
}

void kmain()
{
  clear_screen();
  print_string("Hello from c, world!");
  for(;;);
}
```
|
||||
|
||||
... which will clear the screen and print "Hello from c, world!".
Things are so much simpler in c...

But what now? This doesn't build!

You'll probably get a linker error about "relocation truncated to fit:
R_X86_64_32 against \`.rodata\`"

This is because gcc by default assumes that all code and data will be linked
into the lowest 2 GB of the address space, and generates 32 bit references
accordingly. The solution is to tell gcc to make no assumptions about
addresses by adding the switch `-mcmodel=large` to `CFLAGS` in your makefile.
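To see why the relocation doesn't fit, it can help to check the numbers on the development machine. The sketch below is an illustration under assumptions, not part of the kernel. The function names are made up. It takes `R_X86_64_32` to require that an address survives zero extension from 32 bits, and `R_X86_64_32S` to require that it survives sign extension.

```c
#include <stdbool.h>
#include <stdint.h>

/* R_X86_64_32: the address must be unchanged by zero extension from 32 bits. */
static bool fits_zero_extended_32(uint64_t addr)
{
  return addr <= UINT32_MAX;
}

/* R_X86_64_32S: the address must be unchanged by sign extension from 32 bits. */
static bool fits_sign_extended_32(uint64_t addr)
{
  const int64_t s = (int64_t)addr;
  return s >= INT32_MIN && s <= INT32_MAX;
}
```

An address like `0xFFFFFF8000010000` fails both checks, while the load address `0x10000` passes both. As far as I can tell, this is also why `-mcmodel=kernel`, which expects the kernel in the topmost 2 GB (from `0xFFFFFFFF80000000`), wouldn't work with the split chosen in this chapter.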
|
@ -6,4 +6,5 @@
|
||||
[Chapter 1: Toolchain](1_Toolchain.md)<br>
[Chapter 2: Booting a Kernel](2_A_Bootable_Kernel.md)<br>
[Chapter 3: Activate Long Mode](3_Activate_Long_Mode.md)<br>
[Chapter 4: "Higher Half" Kernel](4_Higher_Half_Kernel.md)<br>