Memory Management

Normally, I think of the memory management of the kernel as four parts.

  • Physical Memory Manager
  • Virtual Memory Manager
  • Kernel Heap
  • Process Memory Manager

We'll go through them in order, but first a radical design decision.

Mapping the entire physical memory into kernel space

A 32-bit processor can address 4 GiB of memory. Near the end of the 32-bit era, even home desktop computers were actually running into this limit.

In chapter 4, I mentioned that the top 16 address bits must be the same due to hardware limitations, in what's called canonical addressing. This leaves 48 bits, which lets us address 256 TiB.

I also mentioned that I reserve a 512th of this for the kernel (virtual addresses above 0xFFFFFF8000000000), which comes out to 512 GiB.

512 GiB - to paraphrase something Bill Gates probably never actually said - ought to be enough for anybody.

This means we could - for any foreseeable future, and definitely longer than the lifespan of my hobby kernel designed for running in an emulated environment with a couple of MiB of RAM - map the entirety of RAM right into the kernel address space.

The main advantage of this is that we will never have to map anything into kernel space temporarily, say to modify the page tables of a process. It also means that we can use unused pages for temporary storage (more on this in a minute).

The main disadvantage is that the entirety of memory is mapped into the kernel's virtual memory at all times. At the time of writing this, the Spectre and Meltdown hardware bugs affecting almost all modern Intel CPUs have recently shown that anything mapped into the kernel's address space can potentially be leaked to user space.

There are ways to remedy this, but they come at a performance cost, and - honestly - I don't feel like implementing them right now. Perhaps I will later, or perhaps I'll never have to because Intel decides that maybe they should fix the issue.

Anyway. The entirety of physical RAM will be mapped into kernel space. That means a physical page with address addr can be accessed at P2V(addr), where P2V is one of the two macros defined in Chapter 4. This explains their names: P2V - Physical to Virtual - and V2P - Virtual to Physical. We'll use them a lot.
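
Roughly speaking, the macros just add or subtract the kernel base (the real definitions live in Chapter 4 and may differ slightly in their casts; KERNEL_OFFSET here is only a stand-in name for the 0xFFFFFF8000000000 base mentioned above):

#define KERNEL_OFFSET 0xFFFFFF8000000000
#define P2V(addr) ((uint64_t)(addr) + KERNEL_OFFSET)  // physical -> kernel virtual
#define V2P(addr) ((uint64_t)(addr) - KERNEL_OFFSET)  // kernel virtual -> physical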

The physical memory manager - PMM

The PMM keeps track of available physical memory and hands out free pages when requested.

Thanks to virtual address translation, we almost never need to care where a physical page is located, and therefore the PMM can be made very simple.

In reality, we sometimes do need a number of contiguous physical pages - for example when reading from disk using Direct Memory Access - but we'll save that for another day.

The entire PMM has a single piece of state: a pointer to the most recently freed page. That page, in turn, holds a pointer to the one freed before it, and so on - a simple linked list of free pages. So when we free a page, we write the current head of the list into it and make the page the new head:

src/kernel/memory/pmm.c

...
// Head of the free list (a kernel-space virtual address), 0 when the list is empty
uint64_t next = 0;

// page is the physical address of the page being freed
void pmm_free(uint64_t page)
{
  *(uint64_t *)P2V(page) = next;  // store the old head inside the freed page
  next = (uint64_t)P2V(page);     // the freed page becomes the new head
}
...

That's all. Allocating a page is equally simple:

...
// Returns the physical address of a free page, or 0 if we're out of memory
uint64_t pmm_alloc()
{
  if(!next) return 0;
  uint64_t page = next;
  next = *(uint64_t *)page;  // pop the head; its successor is stored in the page itself
  return V2P(page);
}
...

Getting comfortable with dereferencing a cast pointer like that will, by the way, also help you when debugging your VMM in gdb.

I also define a pmm_calloc() function, which allocates and zeros a page.
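
A minimal sketch of what that might look like, assuming a kernel memset() is available (the page is zeroed through its kernel-space mapping):

uint64_t pmm_calloc()
{
  uint64_t page = pmm_alloc();
  if(page)
    memset((void *)P2V(page), 0, PAGE_SIZE);  // zero the page via its P2V mapping
  return page;
}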

The virtual memory manager - VMM

The VMM handles setting up page tables, and separating user and kernel space.

As mentioned earlier, x86_64 uses four levels of page tables, each containing 512 64-bit entries. The upper bits of each entry hold the physical address of the next level of page table, or of the page itself in the case of the last page table (P1 in my unofficial nomenclature). This address is obviously always page aligned.

The 12 least significant bits of each entry are for various flags. In order to find the next level, they need to be masked out. So let's start with some macros:

src/kernel/memory/vmm.c

...
#define FLAGS_MASK (PAGE_SIZE - 1)
#define MASK_FLAGS(addr) ((uint64_t)addr & ~FLAGS_MASK)
...

Now it's easy to go from a page table entry to something we can parse:

...
#define PT(ptr) ((uint64_t *)P2V(MASK_FLAGS(ptr)))
...

Page table entries hold physical addresses, so we need to go through P2V to access what they point to. To access a certain page table entry, you just do e.g. PT(P4)[num]. I chose to define some macros to access the entries for a certain address in each page table (P4 through P1). They look like this:

...
#define P4E (PT(P4)[P4_OFFSET(addr)])
#define P3E (PT(P4E)[P3_OFFSET(addr)])
#define P2E (PT(P3E)[P2_OFFSET(addr)])
#define P1E (PT(P2E)[P1_OFFSET(addr)])
...

This assumes that there's a variable named addr which contains the virtual address whose page table entries you want, and a variable named P4 which points to the top level page directory. The OFFSET macros find the correct entry index for an address at each page table level (#define P4_OFFSET(a) (((a) >> 39) & 0x1FF) and so on, as sketched below).
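
For reference, all four index macros follow the same pattern - nine bits per level, starting at bits 39, 30, 21 and 12 respectively:

#define P4_OFFSET(a) (((a) >> 39) & 0x1FF)
#define P3_OFFSET(a) (((a) >> 30) & 0x1FF)
#define P2_OFFSET(a) (((a) >> 21) & 0x1FF)
#define P1_OFFSET(a) (((a) >> 12) & 0x1FF)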

Now getting the physical page for a virtual address is very easy:

...
uint64_t vmm_get_page(uint64_t P4, uint64_t addr)
{
  if(P4 && PRESENT(P4E) && PRESENT(P3E) && PRESENT(P2E))
    return P1E;
  return -1;
}
...

Where the PRESENT macro just checks for the PAGE_PRESENT bit being set.
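
That could be as simple as this, PAGE_PRESENT being bit 0 of an entry:

#define PRESENT(x) ((x) & PAGE_PRESENT)

Note that what vmm_get_page() returns is the raw P1 entry, so it still carries the flag bits in its low 12 bits - run it through MASK_FLAGS() if all you want is the physical address of the backing page.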

Setting the physical page for a virtual address is also very easy:

...
int vmm_set_page(uint64_t P4, uint64_t addr, uint64_t page, uint16_t flags)
{
  ...

  if(!PRESENT(P4E) && !(P4E = pmm_calloc()))
    return -1;
  P4E |= flags | PAGE_PRESENT;

  // Do the same thing for P3E and P2E
  ...

  P1E = page | flags;
  return 0;
}
...

The first three lines check if the P4 entry is present and, if not, allocate a new P3 table and set the entry to point to it. If the allocation fails, vmm_set_page fails as well. The same is then done for the P3 and P2 entries (allocating a P2 and a P1 table respectively), and finally the correct P1 entry is set to the page address and flags.
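
Written out in full, with the elided middle filled in using the same pattern, the function might look roughly like this (a sketch, not necessarily the exact code in the repository; the check that P4 is non-zero is my guess at what the first ellipsis contains):

int vmm_set_page(uint64_t P4, uint64_t addr, uint64_t page, uint16_t flags)
{
  if(!P4) return -1;

  // Make sure a P3 table exists for this address
  if(!PRESENT(P4E) && !(P4E = pmm_calloc()))
    return -1;
  P4E |= flags | PAGE_PRESENT;

  // Make sure a P2 table exists
  if(!PRESENT(P3E) && !(P3E = pmm_calloc()))
    return -1;
  P3E |= flags | PAGE_PRESENT;

  // Make sure a P1 table exists
  if(!PRESENT(P2E) && !(P2E = pmm_calloc()))
    return -1;
  P2E |= flags | PAGE_PRESENT;

  // Finally, point the P1 entry at the page itself
  P1E = page | flags;
  return 0;
}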

I also wrote a function void vmm_clear_page(uint64_t P4, uint64_t addr, int free) which zeros the P1 entry and - if free is true - frees the P1, P2 and P3 tables if they end up empty. This is left as an exercise for the reader.

The kernel heap

The kernel heap keeps track of and hands out small chunks of memory for temporary storage. This corresponds to the malloc()-family of functions.

In this case, I actually chose to forgo the heap and try to make do with hard-coded global variables and structures that take up entire pages.

For example, when setting up a new process, I might normally do something like

struct process *new_process = kmalloc(sizeof(struct process));
setup_process(new_process);

Now, I'll instead do:

struct process *new_process = (void *)P2V(pmm_alloc());
setup_process(new_process);

This will require me to think a bit more carefully about how I define my various data structures in order to keep wasted space to a minimum. It'll be an interesting experiment, but we'll see - perhaps I'll end up implementing a heap later...
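
One cheap way to enforce that - just an idea, not something the kernel currently relies on - is a compile-time check that a structure actually fits within one page:

_Static_assert(sizeof(struct process) <= PAGE_SIZE,
    "struct process must fit in a single page");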

The process memory manager - PROCMM

The final memory related part of the kernel - the procmm - handles user space memory. Setting up and cloning process memory spaces, replacing them with new executables, and handling the user stack and brk() calls are some of its tasks.

Since there are no processes yet, having a process memory manager doesn't really make sense, so I'll save this for later...

A remark about the git history

For this chapter, I went a bit crazy with the TDD and made one git commit every time I wrote and passed a test. Perhaps that would make sense if I had a finished API to conform against. Now, it just got a bit messy... If you explore the git history - I'm sorry.