Chapter 9: Memory Management - COMPLETE
This commit is contained in:
parent
fc6c7f3bc7
commit
63ae2a0b9d
238
doc/9_Memory_Management.md
Normal file
@ -0,0 +1,238 @@
|
||||
# Memory Management
|
||||
|
||||
Normally, I think of the memory management of the kernel as four parts.
|
||||
|
||||
- Physical Memory Manager
- Virtual Memory Manager
- Kernel Heap
- Process Memory Manager
|
||||
|
||||
We'll go through them in order, but first a radical design decision.
|
||||
|
||||
## Mapping the entire physical memory into kernel space
|
||||
|
||||
A 32-bit processor can address 4 GB of memory. Near the end of the 32-bit
era, even home desktop computers were running into this limit.
|
||||
|
||||
In chapter 4, I mentioned that the top 16 address bits must be the same (copies
of bit 47) due to hardware limitations, in what's called canonical addressing.
This leaves 48 bits, which lets us address 256 TB.
|
||||
|
||||
I also mentioned that I reserve a 512th of this for the kernel (virtual
addresses above 0xFFFFFF8000000000), which is 512 GB.
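The arithmetic above can be double-checked with a few lines of C. This is only a sketch: the names `VIRT_BITS`, `KERNEL_BASE`, `kernel_space` and `is_canonical()` are made up for illustration and are not taken from the kernel source.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
#define VIRT_BITS   48                      /* usable virtual address bits */
#define KERNEL_BASE 0xFFFFFF8000000000ULL   /* start of the kernel's share */

/* Bytes from KERNEL_BASE to the top of the address space: 2^39 = 512 GB.
   Unsigned arithmetic wraps around, so 0 - KERNEL_BASE is that distance. */
static const uint64_t kernel_space = 0ULL - KERNEL_BASE;

/* An address is canonical when bits 63..47 are all zeros or all ones. */
static int is_canonical(uint64_t addr)
{
  uint64_t top = addr >> (VIRT_BITS - 1);   /* the 17 topmost bits */
  return top == 0 || top == 0x1FFFF;
}
```

Dividing the full 48-bit space (2^48 bytes) by `kernel_space` (2^39 bytes) gives the 512 mentioned above.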
|
||||
|
||||
512 GB - to paraphrase something Bill Gates probably never actually said -
ought to be enough for anybody.
|
||||
|
||||
This means we could - for any foreseeable future, and definitely longer than the
lifespan of my hobby kernel designed for running in an emulated environment
with a couple of MB of RAM - map the entirety of RAM right into the kernel
address space.
|
||||
|
||||
The main advantage of this is that we will never have to map anything into
kernel space temporarily, say to modify the page tables of a process. It also
means that we can use unused pages for temporary storage (more on this in a
minute).
|
||||
|
||||
The main disadvantage is that the entirety of memory is mapped into the kernel's
virtual memory at all times. At the time of writing this, the Spectre and
Meltdown hardware bugs affecting almost all modern Intel CPUs have recently
shown that anything mapped into an address space can potentially be leaked
through speculative execution, even if it's not accessible from user mode.
|
||||
|
||||
There are ways to remedy this, but they come at a performance cost, and -
honestly - I don't feel like implementing them right now. Perhaps I will later,
or perhaps I'll never have to because Intel decides that maybe they should fix
the issue.
|
||||
|
||||
Anyway. The entirety of physical RAM will be mapped into kernel space. That
means a physical page with address `addr` can be accessed at `P2V(addr)`, where
`P2V` is one of the two macros defined in Chapter 4. This explains their names:
P2V - Physical to Virtual, and V2P - Virtual to Physical. We'll use them a lot.
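For readers who don't have Chapter 4 at hand, the two macros could look roughly like the following. This is a sketch, not the definition from Chapter 4: it assumes that physical address 0 is mapped at the very start of the kernel region, and `KERNEL_OFFSET` is a name chosen here for illustration.

```c
#include <stdint.h>

/* Assumed: physical address 0 appears at this virtual address. */
#define KERNEL_OFFSET 0xFFFFFF8000000000ULL

#define V2P(a) ((uint64_t)(a) - KERNEL_OFFSET)   /* virtual  -> physical */
#define P2V(a) ((uint64_t)(a) + KERNEL_OFFSET)   /* physical -> virtual  */
```

With these definitions, the physical page at `0x1000` would be reachable from kernel code at `0xFFFFFF8000001000`.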
|
||||
|
||||
## The physical memory manager - PMM
|
||||
|
||||
The PMM keeps track of available physical memory and hands out free pages when
requested.
|
||||
|
||||
Thanks to virtual address translation, we almost never need to care where a
physical page is located, and therefore the PMM can be made very simple.
|
||||
|
||||
In reality, we sometimes do need a number of contiguous physical pages - for
example when reading from disk using Direct Memory Access - but we'll save that
for another day.
|
||||
|
||||
The entire PMM has only a single state variable: a pointer to the last freed
page. This page, in turn, holds a pointer to the one that was freed before it,
and so on. So when we free a page, we write the current free pointer into it,
and then update the free pointer to point to the page:
|
||||
|
||||
`src/kernel/memory/pmm.c`
|
||||
```c
...
uint64_t next = 0; // Virtual address of the last freed page, 0 if none

void pmm_free(uint64_t page)
{
  // Store the old list head in the first 8 bytes of the freed page ...
  *(uint64_t *)P2V(page) = next;
  // ... and make the freed page the new head
  next = (uint64_t)P2V(page);
}
...
```
|
||||
|
||||
That's all. Allocating a page is equally simple:
|
||||
|
||||
```c
...
uint64_t pmm_alloc()
{
  if(!next) return 0; // Out of free pages
  uint64_t page = next;
  next = *(uint64_t *)page; // Pop the head off the free list
  return V2P(page);
}
...
```
|
||||
|
||||
Feeling comfortable about dereferencing a pointer cast will help you when
debugging your VMM in gdb, by the way.
|
||||
|
||||
I also define a `pmm_calloc()` function, which allocates and zeros a page.
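To try the free list outside the kernel, the following sketch replaces physical memory with a static array. Everything specific to the simulation (`sim_ram` and the array-based `P2V`/`V2P`) is an assumption made for the example, and the body of `pmm_calloc()` is a guess at the obvious implementation rather than the actual source.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 0x1000

/* Simulated RAM (an assumption for this sketch):
   "physical address" p lives at sim_ram + p. */
static uint64_t sim_ram[8 * PAGE_SIZE / sizeof(uint64_t)];
#define P2V(p) ((uintptr_t)sim_ram + (uintptr_t)(p))
#define V2P(v) ((uint64_t)((uintptr_t)(v) - (uintptr_t)sim_ram))

static uint64_t next = 0;          /* virtual address of the last freed page */

void pmm_free(uint64_t page)
{
  *(uint64_t *)P2V(page) = next;   /* store the old list head in the page */
  next = (uint64_t)P2V(page);      /* the page becomes the new head */
}

uint64_t pmm_alloc(void)
{
  if(!next) return 0;              /* out of free pages */
  uint64_t page = next;
  next = *(uint64_t *)page;        /* pop the head off the list */
  return V2P(page);
}

/* Guessed implementation: allocate a page and zero it through its
   virtual address. */
uint64_t pmm_calloc(void)
{
  uint64_t page = pmm_alloc();
  if(page)
    memset((void *)P2V(page), 0, PAGE_SIZE);
  return page;
}
```

Pages come back in last-in, first-out order. Note that physical page 0 can never be handed out by this scheme, since 0 doubles as the out-of-memory return value.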
|
||||
|
||||
## The virtual memory manager - VMM
|
||||
|
||||
The VMM handles setting up page tables, and separating user and kernel space.
|
||||
|
||||
As mentioned earlier, x86\_64 uses four levels of page tables, each containing
512 64-bit entries. The 52 most significant bits of each entry are a pointer to
the next level of page table, or to the page itself in the case of the last page
table (P1 in my unofficial nomenclature). This pointer is obviously always page
aligned. (Strictly speaking, only bits 12 to 51 hold the address; the topmost
bits carry further flags such as no-execute, which this chapter leaves unset.)
|
||||
|
||||
The 12 least significant bits of each entry are for various flags. In order to
find the next level, they need to be masked out. So let's start with some
macros:
|
||||
|
||||
`src/kernel/memory/vmm.c`
|
||||
```c
...
#define FLAGS_MASK (PAGE_SIZE - 1)
#define MASK_FLAGS(addr) ((uint64_t)(addr) & ~FLAGS_MASK)
...
```
|
||||
|
||||
Now it's easy to go from a page table entry to something we can parse:
|
||||
|
||||
```c
...
#define PT(ptr) ((uint64_t *)P2V(MASK_FLAGS(ptr)))
...
```
|
||||
|
||||
Page table entries hold physical addresses, so we need to go through `P2V` to
access the tables they point to. To access a certain page table entry, you just
do e.g. `PT(P4)[num]`. I chose to define some macros to access the entries for a
certain address in each page table (P4 through P1). They look like this:
|
||||
|
||||
```c
...
#define P4E (PT(P4)[P4_OFFSET(addr)])
#define P3E (PT(P4E)[P3_OFFSET(addr)])
#define P2E (PT(P3E)[P2_OFFSET(addr)])
#define P1E (PT(P2E)[P1_OFFSET(addr)])
...
```
|
||||
|
||||
This assumes that there's a variable named `addr` which
contains the virtual address whose page table entries you
want, and a variable named `P4` which points to the top
level page directory. The `OFFSET` macros find the
correct entry for an address for each page table level
(`#define P4_OFFSET(a) (((a)>>39) & 0x1FF)` and so on).
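Spelled out for all four levels, the offset macros might look as follows. Only `P4_OFFSET` is quoted in the text; the other three are filled in here on the assumption that they follow the same pattern, with each level consuming 9 bits (512 entries) above the 12-bit page offset.

```c
#include <stdint.h>

/* Each table has 512 = 2^9 entries, so each level indexes with 9 bits. */
#define P4_OFFSET(a) (((a) >> 39) & 0x1FF)   /* bits 47..39 */
#define P3_OFFSET(a) (((a) >> 30) & 0x1FF)   /* bits 38..30 */
#define P2_OFFSET(a) (((a) >> 21) & 0x1FF)   /* bits 29..21 */
#define P1_OFFSET(a) (((a) >> 12) & 0x1FF)   /* bits 20..12 */
```

For the kernel base address `0xFFFFFF8000000000`, this gives entry 511 in P4 and entry 0 in the three lower tables, which matches the kernel owning the last 512th of the address space.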
|
||||
|
||||
Now getting the physical page for a virtual address is very easy:
|
||||
|
||||
```c
...
uint64_t vmm_get_page(uint64_t P4, uint64_t addr)
{
  // Walk down the tables, stopping if any level is missing
  if(P4 && PRESENT(P4E) && PRESENT(P3E) && PRESENT(P2E))
    return P1E; // The P1 entry, flags included
  return -1; // No mapping
}
...
```
|
||||
|
||||
Where the `PRESENT` macro just checks for the `PAGE_PRESENT` bit being set.
|
||||
|
||||
Setting the physical page for a virtual address is also very easy:
|
||||
|
||||
```c
...
int vmm_set_page(uint64_t P4, uint64_t addr, uint64_t page, uint16_t flags)
{
  ...

  if(!PRESENT(P4E) && !(P4E = pmm_calloc()))
    return -1;
  P4E |= flags | PAGE_PRESENT;

  // Do the same thing for P3E and P2E
  ...

  P1E = page | flags;
  return 0;
}
...
```
|
||||
|
||||
The first three lines check whether the P4 entry is present and, if not,
allocate a P3 and set the entry to point to it. If the allocation fails,
`vmm_set_page` fails as well. The same is then done with P2 and P1, and finally
the correct P1 entry is set with the page address and flags.
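To see the whole walk in one piece, here is a sketch that runs in user space against a simulated block of RAM. The parts elided from the listings are filled in by assumption: the `!P4` check, the P3 and P2 steps (copied from the P4 pattern), the `OFFSET` macros, and the `PRESENT` macro. `sim_ram`, `sim_next` and the bump-allocating `pmm_calloc()` exist only for this example. The `PAGE_PRESENT` and `PAGE_WRITE` values are the standard x86\_64 bit positions.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE    0x1000
#define PAGE_PRESENT 0x1   /* bit 0 of a page table entry */
#define PAGE_WRITE   0x2   /* bit 1 of a page table entry */

/* Simulated RAM (an assumption for this sketch):
   "physical address" p lives at sim_ram + p. */
static uint64_t sim_ram[16 * PAGE_SIZE / sizeof(uint64_t)];
#define P2V(p) ((uintptr_t)sim_ram + (uintptr_t)(p))

/* Stand-in for the real pmm_calloc(): hands out zeroed pages in order.
   Page 0 is skipped so that 0 can signal failure. */
static uint64_t sim_next = PAGE_SIZE;
static uint64_t pmm_calloc(void)
{
  if(sim_next >= sizeof(sim_ram)) return 0;
  uint64_t page = sim_next;
  sim_next += PAGE_SIZE;
  memset((void *)P2V(page), 0, PAGE_SIZE);
  return page;
}

#define FLAGS_MASK ((uint64_t)PAGE_SIZE - 1)
#define MASK_FLAGS(addr) ((uint64_t)(addr) & ~FLAGS_MASK)
#define PRESENT(p) ((p) & PAGE_PRESENT)
#define PT(ptr) ((uint64_t *)P2V(MASK_FLAGS(ptr)))

#define P4_OFFSET(a) (((a) >> 39) & 0x1FF)
#define P3_OFFSET(a) (((a) >> 30) & 0x1FF)
#define P2_OFFSET(a) (((a) >> 21) & 0x1FF)
#define P1_OFFSET(a) (((a) >> 12) & 0x1FF)

#define P4E (PT(P4)[P4_OFFSET(addr)])
#define P3E (PT(P4E)[P3_OFFSET(addr)])
#define P2E (PT(P3E)[P2_OFFSET(addr)])
#define P1E (PT(P2E)[P1_OFFSET(addr)])

uint64_t vmm_get_page(uint64_t P4, uint64_t addr)
{
  if(P4 && PRESENT(P4E) && PRESENT(P3E) && PRESENT(P2E))
    return P1E;
  return -1;
}

int vmm_set_page(uint64_t P4, uint64_t addr, uint64_t page, uint16_t flags)
{
  if(!P4) return -1;   /* assumed sanity check */

  /* At each level: allocate the next table if it is missing, then mark
     the entry present. */
  if(!PRESENT(P4E) && !(P4E = pmm_calloc()))
    return -1;
  P4E |= flags | PAGE_PRESENT;

  if(!PRESENT(P3E) && !(P3E = pmm_calloc()))
    return -1;
  P3E |= flags | PAGE_PRESENT;

  if(!PRESENT(P2E) && !(P2E = pmm_calloc()))
    return -1;
  P2E |= flags | PAGE_PRESENT;

  P1E = page | flags;
  return 0;
}
```

Mapping a first address in an empty address space allocates three tables (P3, P2 and P1). Mapping a neighbouring page afterwards reuses all three and only writes one more P1 entry.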
|
||||
|
||||
I also wrote a function `void vmm_clear_page(uint64_t P4, uint64_t addr, int
free)` which zeros the P1 entry, and - if `free` is true - frees P1, P2 and P3
if they are empty. This is left as an exercise to the reader.
|
||||
|
||||
## The kernel heap
|
||||
|
||||
The kernel heap keeps track of and hands out small chunks of memory for
temporary storage. This corresponds to the `malloc()`-family of functions.
|
||||
|
||||
In this case, I actually chose to forgo the heap, and try to make do with
hard-coded global variables and structures using entire pages.
|
||||
|
||||
For example, when setting up a new process, I might normally do something like
|
||||
|
||||
```c
struct process *new_process = kmalloc(sizeof(struct process));
setup_process(new_process);
```
|
||||
|
||||
Now, I'll instead do:
|
||||
|
||||
```c
// pmm_alloc() returns a physical address, so go through P2V to use it
struct process *new_process = (void *)P2V(pmm_alloc());
setup_process(new_process);
```
|
||||
|
||||
This will require me to think a bit more carefully about how I define my various
data structures in order to keep wasted space to a minimum. It'll be an
interesting experiment, but we'll see - perhaps I'll end up implementing a heap
later...
|
||||
|
||||
## The process memory manager - PROCMM
|
||||
|
||||
The final memory related part of the kernel - the procmm - handles user space
memory. Setting up and cloning process memory spaces, replacing them with new
executables, and handling the user stack and `brk()` calls are some of its
tasks.
|
||||
|
||||
Since there are no processes yet, having a process memory manager doesn't really make sense, so I'll save this for later...
|
||||
|
||||
## A remark about the git history
|
||||
|
||||
For this chapter, I went a bit crazy with the TDD and made one git commit every
time I wrote and passed a test. Perhaps that would make sense if I had a
finished API to conform to. Now, it just got a bit messy... If you explore
the git history - I'm sorry.
|
@ -11,4 +11,5 @@
|
||||
[Chapter 6: Debug output](6_Debug_Output.md)<br>
|
||||
[Chapter 7: Multiboot Data](7_Multiboot_Data.md)<br>
|
||||
[Chapter 8: Exceptions and Interrupts](8_Exceptions.md)<br>
|
||||
[Chapter 9: Memory Management](9_Memory_Management.md)<br>
|
||||
|
||||
|