Chapter 4 - "Higher Half" Kernel
In this chapter we'll make our kernel run in the top of memory - well out of the way of user programs and memory mapped devices.
What is a higher half kernel
Some arguments for a higher half kernel can be found at the osdev wiki. There are arguments against as well, such as it being pointless with modern memory management routines. My main argument for using it is that it makes things simpler.
I chose to put the split at 0xFFFFFF8000000000, which corresponds to the last entry in P4.
A note about the address. If you go through the calculations, you'll find that the addresses mapped from the last entry in P4 actually start at 0xFF8000000000 or 0x0000FF8000000000. However, due to limitations in the hardware, the x86-64 architecture requires bits 48 through 63 of an address to be copies of bit 47, so the 17 most significant bits are all equal; thus 0xFFFFFF8000000000. This address format is called 'canonical'.
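The calculation can be sketched in C. This is only an illustration that assumes 48-bit virtual addresses (4-level paging); the names p4_entry_base and canonical are made up for this example and are not part of the kernel source.

```c
#include <stdint.h>

/* Assumption: 4-level paging with 48 implemented virtual address bits. */
#define VADDR_BITS 48

/* First address covered by P4 entry `index`, before sign extension.
 * Each P4 entry spans 2^39 bytes (512 GiB). */
static inline uint64_t p4_entry_base(unsigned index)
{
    return (uint64_t)index << 39;
}

/* Copy bit 47 into bits 48..63 to get the canonical form of an address. */
static inline uint64_t canonical(uint64_t addr)
{
    const uint64_t low_mask = (1ULL << VADDR_BITS) - 1;
    const uint64_t sign_bit = 1ULL << (VADDR_BITS - 1);

    addr &= low_mask;
    return (addr & sign_bit) ? (addr | ~low_mask) : addr;
}
```

With these helpers, canonical(p4_entry_base(511)) gives 0xFFFFFF8000000000, while low addresses such as 0x10000 come back unchanged.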
Higher half linking
I want to have the kernel running at address 0xFFFFFF8000000000 and above.
However, when we're booted from GRUB, paging is not enabled, so
we're limited to physical RAM.
The solution to this problem is to tell GRUB to load the kernel into a low memory position, but make the kernel think it's loaded at a high position. This can be done with a linker script trick.
We change the linker script from Chapter 2 to something like this:
src/kernel/Link.ld
ENTRY(_start)
KERNEL_OFFSET = 0xFFFFFF8000000000;
KERNEL_START = 0x10000;
SECTIONS
{
  . = KERNEL_START + KERNEL_OFFSET;
  .text : AT(ADDR(.text) - KERNEL_OFFSET)
  {
    *(.multiboot)
    *(.text)
  }
}
What this does is tell the linker to assume the code starts at 0xFFFFFF8000010000 when calculating addresses for things like function calls and jumps and such, but to generate an ELF file with headers saying the .text section should be loaded at an address 0xFFFFFF8000000000 below that, i.e. 0x10000, which is within the physical RAM limits.
This also means that GRUB will jump to address 0x10000 after loading the kernel, and from there we can set up paging and jump to above 0xFFFFFF8000000000. We just need to take care with all memory references, since we can't trust the linker to sort them out before paging is set up.
Oh, and you will also want to do the same with the other default ELF sections in the linker file, such as .rodata, .data and .bss.
Fixing memory references
In order to make sure all memory references are correct, we'll define some helpful macros.
src/kernel/include/memory.h
#define KERNEL_OFFSET 0xFFFFFF8000000000
#ifdef __ASSEMBLER__
#define V2P(a) ((a) - KERNEL_OFFSET)
#define P2V(a) ((a) + KERNEL_OFFSET)
#else
#include <stdint.h>
#define V2P(a) ((uintptr_t)(a) & ~KERNEL_OFFSET)
#define P2V(a) ((uintptr_t)(a) | KERNEL_OFFSET)
#endif
...
I define two versions of the macros, one for use in assembly and one for c. __ASSEMBLER__ is set by gcc when compiling a .S file. The c version uses bit operations, which means you can run V2P(V2P(address)) without any problems due to the format of KERNEL_OFFSET. The proof of this is left as an exercise to the reader.
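For the impatient, here is a small sketch of that property. It redefines the macros with uint64_t so it can be compiled on a host machine, and the _ARITH variants and the helper function are names invented for this example. Masking or setting the same bits twice changes nothing, whereas subtracting the offset twice does. The bitwise and arithmetic versions agree as long as the physical address has no bits in common with KERNEL_OFFSET, i.e. stays below 2^39.

```c
#include <stdint.h>

#define KERNEL_OFFSET 0xFFFFFF8000000000ULL

/* Bitwise versions, mirroring the c branch of memory.h. */
#define V2P(a) ((uint64_t)(a) & ~KERNEL_OFFSET)
#define P2V(a) ((uint64_t)(a) | KERNEL_OFFSET)

/* Arithmetic versions, mirroring the assembler branch (hypothetical names). */
#define V2P_ARITH(a) ((uint64_t)(a) - KERNEL_OFFSET)
#define P2V_ARITH(a) ((uint64_t)(a) + KERNEL_OFFSET)

/* Returns 1 if applying either bitwise macro twice gives the same result
 * as applying it once. Clearing or setting bits is idempotent, so this
 * holds for any address. */
static inline int translation_is_idempotent(uint64_t addr)
{
    return V2P(V2P(addr)) == V2P(addr) && P2V(P2V(addr)) == P2V(addr);
}
```

For example, V2P(V2P(0xFFFFFF8000010000)) is still 0x10000, while V2P_ARITH applied twice wraps around to an unrelated address.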
Note that although we don't have access to a standard c library at this point, stdint.h (which defines uintptr_t) can still be used since it's one of the freestanding headers that ship with gcc itself, which we built and installed in the docker image.
Then we need to go through our code and make sure all memory references are corrected.
src/kernel/boot/boot.S
#include <memory.h>
.intel_syntax noprefix
...
_start:
cli
mov esp, offset V2P(BootStack)
...
mov eax, offset V2P(BootP4)
mov cr3, eax
...
lgdt [V2P(BootGDTp)]
...
jmp 0x8:V2P(long_mode_start)
...
Note that call instructions don't have to be modified, since call uses relative addressing.
And don't forget about the memory references in the page tables:
src/kernel/boot/boot_PT.S
...
BootP4:
.quad offset V2P(BootP3) + (PAGE_PRESENT | PAGE_WRITE)
...
Note also that the address stored in the GDT pointer does not need to be changed to V2P(BootGDT) as one would assume. But why is that? Because of pure luck and coincidence.
Before starting long mode the lgdt instruction will expect a 32 bit gdt pointer. If there's any more data, the top bits will just be ignored (due to the little-endian nature of the processor). As luck would have it, the only difference between BootGDT and V2P(BootGDT) lies in the top 32 bits. This also means that when it's time to load a 64 bit GDT, we can use the same pointer. Neat!
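To see why this works, here is a host-side sketch of what a 32 bit lgdt reads from a pointer that stores a 64 bit base. The struct and function names are invented for the example, the base address used below is made up, and the code assumes a little-endian machine (as x86 is) and gcc's packed attribute.

```c
#include <stdint.h>
#include <string.h>

#define KERNEL_OFFSET 0xFFFFFF8000000000ULL

/* Operand of lgdt: a 16 bit limit directly followed by the base address. */
struct __attribute__((packed)) gdt_pointer {
    uint16_t limit;
    uint64_t base;
};

/* Outside long mode, lgdt only reads the 4 bytes that follow the limit.
 * On a little-endian machine those are the low 32 bits of the base. */
static inline uint32_t base_seen_by_32bit_lgdt(const struct gdt_pointer *p)
{
    uint32_t base32;

    memcpy(&base32, (const unsigned char *)p + sizeof(p->limit), sizeof(base32));
    return base32;
}
```

If base holds a higher half address such as 0xFFFFFF8000010100, the function returns 0x00010100, which is exactly the physical address V2P would have produced.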
At this point, it would be a good idea to check that the kernel still boots.
However, gdb won't be able to tell you anything about the code since we're
running outside of the linked addresses. You can still use it to inspect
registers and such, though. You can also set breakpoints by modifying the address manually: (gdb) break *(long_mode_start - 0xFFFFFF8000000000).
Jumping to higher half
The final piece of setup we need to do before we can start running in the higher half is update the page table.
We'll do this by adding a pointer to the same P3 we set up earlier as the last entry of BootP4.
src/kernel/boot/boot_PT.S
...
BootP4:
.quad offset V2P(BootP3) + (PAGE_PRESENT | PAGE_WRITE)
.rept ENTRIES_PER_PT - 2
.quad 0
.endr
.quad offset V2P(BootP3) + (PAGE_PRESENT | PAGE_WRITE)
...
If you start up the emulator now you can check that the higher half is mapped:
(gdb) mmap
0000000000000000-0000000040000000 0000000040000000 -rw
0000ff8000000000-0000ff8040000000 0000000040000000 -rw
Note that qemu doesn't report the addresses in canonical form.
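The reason both ranges show up is that address 0 selects the first P4 entry and 0xFFFFFF8000000000 selects the last one, and both entries point to the same P3. A small sketch of the index calculation, assuming the usual 4-level layout with 9 index bits per level (the helper names are made up for this example):

```c
#include <stdint.h>

/* Assumption: 4-level paging, 512 entries (9 bits) per table. */
#define INDEX_MASK 0x1FFu

/* Index into P4: bits 39..47 of the virtual address. */
static inline unsigned p4_index(uint64_t vaddr)
{
    return (unsigned)((vaddr >> 39) & INDEX_MASK);
}

/* Index into P3: bits 30..38 of the virtual address. */
static inline unsigned p3_index(uint64_t vaddr)
{
    return (unsigned)((vaddr >> 30) & INDEX_MASK);
}
```

Bits 48 and up are ignored by the calculation, which is also why the canonical and non-canonical spellings of the address end up in the same P4 entry.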
Anyway. It should now be safe to jump to higher half code:
src/kernel/boot/boot.S
...
.code64
long_mode_start:
mov eax, 0x0
mov ss, eax
mov ds, eax
mov es, eax
mov fs, eax
mov gs, eax
movabs rax, offset upper_memory
jmp rax
upper_memory:
jmp $
By loading the address of upper_memory into a register and jumping to it, we force the assembler to make an absolute (non-relative) jump.
If you run this, you'll find that gdb will be able to track where in the code you are again (after passing upper_memory:), or you could check the RIP register.
upper_memory () at boot/boot.S:116
116 jmp $
(gdb) reg RIP
RIP=ffffff800001019f
Great! Now we can do some cleanup.
Move the stack pointer to higher half memory:
...
upper_memory:
mov rax, KERNEL_OFFSET
add rsp, rax
...
and unmap the identity mapping of the first gigabyte and reload the page table:
...
mov rax, 0
movabs [BootP4], rax
mov rax, cr3
mov cr3, rax
...
Run it all again, and check that the low memory is unmapped:
(gdb) mmap
0000ff8000000000-0000ff8040000000 0000000040000000 -rw
Finally, we also need to reload the GDT. The GDT register still holds the low, identity-mapped address of the GDT, and we just unmapped that...
So we need to
- reload the GDT and update the data selectors:
...
lgdt [rax]
mov rax, 0x0
mov ss, rax
mov ds, rax
mov es, rax
...
- and reload the code selector. There are no direct far jumps in long mode, so instead we'll use the retfq instruction, which pops a return address and code segment selector off the stack:
...
movabs rax, offset .reload_cs
pushq 0x8
push rax
retfq
.reload_cs:
Running c code
Now that the instruction pointer is safely within our linked memory, we can trust c code to run.
Calling a c function is simple enough:
src/kernel/boot/boot.S
...
.reload_cs:
.extern kmain
movabs rax, offset kmain
call rax
hlt
jmp $
And the c source file:
src/kernel/boot/kmain.c
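#include <memory.h>

void clear_screen(void)
{
    /* P2V() yields an integer, so cast it to a pointer explicitly.
       volatile keeps the compiler from optimizing the writes away. */
    volatile unsigned char *vidmem = (volatile unsigned char *)P2V(0xB8000);
    /* 80 columns x 25 rows, 2 bytes (character + attribute) per cell */
    for(int i=0; i < 80*25*2; i++)
        *vidmem++ = 0;
}

void print_string(const char *str)
{
    volatile unsigned char *vidmem = (volatile unsigned char *)P2V(0xB8000);
    while(*str)
    {
        *vidmem++ = *str++;  /* character */
        *vidmem++ = 0x7;     /* attribute: light grey on black */
    }
}

void kmain(void)
{
    clear_screen();
    print_string("Hello from c, world!");
    for(;;);
}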
... which will clear the screen and print "Hello from c, world!". Things are so much simpler in c...
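If you want to print somewhere other than the top left corner, the cell arithmetic could look something like this. It's only a sketch: vga_cell_offset and put_string_at are hypothetical helpers that are not part of the kernel source, and they assume the standard 80x25 text mode with two bytes per cell.

```c
#include <stddef.h>

/* Assumption: standard VGA text mode, 80 columns by 25 rows. */
#define VGA_COLS 80
#define VGA_ROWS 25

/* Byte offset of the cell at (row, col). Each cell is two bytes:
 * the character followed by its attribute. */
static inline size_t vga_cell_offset(unsigned row, unsigned col)
{
    return ((size_t)row * VGA_COLS + col) * 2;
}

/* Write str starting at (row, col), stopping at the end of the buffer. */
static inline void put_string_at(volatile unsigned char *vidmem,
                                 unsigned row, unsigned col,
                                 const char *str, unsigned char attr)
{
    size_t off = vga_cell_offset(row, col);
    const size_t end = vga_cell_offset(VGA_ROWS, 0);

    while (*str && off < end) {
        vidmem[off++] = (unsigned char)*str++;
        vidmem[off++] = attr;
    }
}
```

In the kernel, the first argument would be (volatile unsigned char *)P2V(0xB8000); on a host machine you can pass an ordinary 80*25*2 byte array to try it out.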
But what now? This doesn't compile!
You'll probably get an error about "relocation truncated to fit: R_X86_64_32 against `.rodata`".
This is because gcc's default code model (-mcmodel=small) assumes your code and data will end up in the low 2 GiB of the address space, and emits 32 bit address relocations accordingly. The solution is to tell gcc to make no assumption about addresses by adding the switch -mcmodel=large to CFLAGS in your makefile.
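The error message becomes easier to read once you check which addresses fit in a 32 bit relocation. The two helpers below are a sketch with made-up names: R_X86_64_32 holds an address zero-extended from 32 bits, and R_X86_64_32S (which -mcmodel=kernel relies on) holds one sign-extended from 32 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* R_X86_64_32: the address must be representable as an unsigned 32 bit value. */
static inline bool fits_r_x86_64_32(uint64_t addr)
{
    return addr <= UINT32_MAX;
}

/* R_X86_64_32S: the address must be representable as a signed 32 bit value,
 * i.e. lie in the lowest or the highest 2 GiB of the address space. */
static inline bool fits_r_x86_64_32s(uint64_t addr)
{
    const int64_t s = (int64_t)addr;

    return s >= INT32_MIN && s <= INT32_MAX;
}
```

An address like 0xFFFFFF8000010000 fits neither, which is why -mcmodel=large is needed with this choice of KERNEL_OFFSET. A kernel linked at 0xFFFFFFFF80000000 or above would fit the sign-extended form.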