thomasloven.com/pages/2012-07-30-Privilege-Levels.md

---
layout: post
title: "Privilege Levels"
subtitle: "Lots of abbreviations ending in PL"
tags: [osdev]
---

###Processor privilege level in Segmentation
The Intel x86 processor architecture has a number of features implemented to
protect the system from malicious code. One of those features is
__Privilege Levels__.

The privilege levels are a remnant of the times when memory segmentation was
popular. With segmentation, the physical memory is divided into segments that
work as a kind of translation table. In Protected mode, if you call an address
like

    :::nasm
    jmp CS:AX

the processor looks into the currently loaded __Local__ or __Global Descriptor
Table__ (__LDT__/__GDT__) for the entry pointed to by _CS_. This entry (or
__Segment Descriptor__) describes the beginning of a segment, which is combined
with the offset in _AX_ to get the physical address:

    :::c
    physical_address = segment_descriptor_from_index(CS).base + AX;

The segment descriptor also has a limit, which in our example is the maximum
value _AX_ is allowed to take. If it's higher, the processor raises a fault
(on x86 a __General Protection Fault__; this kind of violation is where the
name __Segmentation Fault__, or segfault for short, comes from). Now you can
start to see how this system makes for a working memory protection scheme. By
switching out the LDT, you can change what part of physical memory is addressed
by any Selector:Offset pair and thus give each task or process its own address
space.
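
The base-plus-offset lookup and the limit check can be sketched in plain C. This is only a model under simplifying assumptions: the `segment_descriptor` struct, the `table` contents and the `translate` helper are made up for illustration, and a real descriptor packs these fields into 8 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified segment descriptor: only the fields used in the text. */
typedef struct {
    uint32_t base;   /* start of the segment */
    uint32_t limit;  /* highest valid offset within the segment */
} segment_descriptor;

/* Hypothetical descriptor table with made-up contents. */
static const segment_descriptor table[] = {
    { 0x00000000, 0x00000000 },  /* index 0: null descriptor */
    { 0x00100000, 0x0000FFFF },  /* index 1: 64 KiB starting at 1 MiB */
};

/* Returns true and stores the address if the offset is within the limit;
 * returns false where the processor would raise a fault instead. */
static bool translate(uint16_t selector, uint32_t offset, uint32_t *address)
{
    size_t index = selector >> 3;  /* drop the table bit and the two low bits */
    assert(index < sizeof table / sizeof table[0]);
    if (offset > table[index].limit)
        return false;
    *address = table[index].base + offset;
    return true;
}
```

With this table, selector `0x08` and offset `0x1234` translate to `0x00101234`, while offset `0x10000` is past the limit and is rejected.
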
The segmentation scheme is now deprecated in favor of paging which offers more
fine-grained control and a greater level of transparency.

So, what about the privilege levels?
Well, the user program can switch its own segment selector values. However,
each segment has a protection level, given by the __Descriptor Privilege
Level__ (__DPL__) in the segment descriptor. The processor has a __Current
Privilege Level__ (__CPL__) which is given by the lowest two bits of _CS_. If
the program tries to switch a selector to a descriptor with a DPL that is
numerically lower (that is, more privileged) than the CPL, the processor throws
a __General Protection Fault__.
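
As a rough sketch of that check for data segments, here is how a selector splits into its parts and how the comparison goes. The helper names are my own, and the model also takes the selector's own two low bits (the requested privilege level) into account, which the processor does as well when loading a data segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A selector is a 13-bit table index, a table indicator bit and a 2-bit RPL. */
static unsigned selector_index(uint16_t sel) { return sel >> 3; }
static bool selector_uses_ldt(uint16_t sel) { return (sel >> 2) & 1; }
static unsigned selector_rpl(uint16_t sel) { return sel & 0x3; }

/* Data-segment load check: the descriptor's DPL must be numerically at least
 * as large as both the CPL and the selector's RPL. */
static bool data_load_allowed(unsigned cpl, uint16_t sel, unsigned dpl)
{
    assert(cpl <= 3 && dpl <= 3);
    unsigned rpl = selector_rpl(sel);
    unsigned effective = cpl > rpl ? cpl : rpl;
    return dpl >= effective;
}
```

For example, code at CPL 3 loading a selector whose descriptor has DPL 0 fails the check, which is the case where the processor would raise the General Protection Fault.
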
###Processor privilege levels today
I mentioned that segmentation is deprecated in favor of paging, so why would
I care about it for a modern state-of-the-art operating system such as mine?
Firstly, the x86 architecture requires segmentation to access the entire
address space - most hobby OSes I've studied just keep two segments (one for
code and one for data - processor requirement) for this, with base 0x0 and
a limit of 4 GiB (in other words, they each cover the entire virtual address
space).
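
To make the flat setup concrete, here is a sketch of packing such descriptors into the 8-byte format the processor expects. The `gdt_entry` helper is hypothetical (it is not taken from any particular kernel); the access bytes `0x9A`/`0x92` and the flags nibble `0xC` are the usual values for 32-bit ring 0 code and data segments with 4 KiB granularity.

```c
#include <assert.h>
#include <stdint.h>

/* Pack base, 20-bit limit, access byte and 4-bit flags into a descriptor. */
static uint64_t gdt_entry(uint32_t base, uint32_t limit,
                          uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFF && flags <= 0xF);
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);             /* limit bits 0-15  */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;      /* base bits 0-23   */
    d |= (uint64_t)access << 40;                 /* P, DPL, S, type  */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;  /* limit bits 16-19 */
    d |= (uint64_t)(flags & 0xF) << 52;          /* G, D/B, L, AVL   */
    d |= (uint64_t)((base >> 24) & 0xFF) << 56;  /* base bits 24-31  */
    return d;
}
```

With granularity set, a limit of `0xFFFFF` counts 4 KiB pages, so `gdt_entry(0, 0xFFFFF, 0x9A, 0xC)` describes a code segment spanning the full 4 GiB. Changing the access byte to `0xFA` or `0xF2` gives the DPL 3 versions of the code and data segments.
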
Secondly, there are other ways than segmentation where the CPL comes into play.
For example, in paging, if the user/supervisor bit of a page table entry is
clear, the address can only be accessed if the processor is in a supervisor
level, i.e. CPL 0 to 2 (CPL 0 is sometimes called __ring 0__). Code running at
CPL 3 gets a page fault instead.
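
A small model of that paging check, assuming the standard 32-bit page table entry layout (bit 0 present, bit 2 user/supervisor). The `page_reachable` helper is made up for illustration and ignores the page directory entry and write protection, which a real processor also consults.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT  0x1u  /* bit 0: page is mapped */
#define PTE_WRITABLE 0x2u  /* bit 1: writes allowed */
#define PTE_USER     0x4u  /* bit 2: set = CPL 3 may access, clear = supervisor only */

/* Returns true if code at the given CPL may touch the page. */
static bool page_reachable(uint32_t pte, unsigned cpl)
{
    assert(cpl <= 3);
    if (!(pte & PTE_PRESENT))
        return false;
    if (cpl == 3 && !(pte & PTE_USER))
        return false;  /* user code hitting a supervisor-only page */
    return true;
}
```

An entry with flags `0x3` (present, writable, supervisor only) is reachable from CPL 0 but not from CPL 3, while `0x7` is reachable from both.
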
The privilege levels are also used to determine whether certain instructions
may be run, like _sti_, _lgdt_, _hlt_ and such.
Finally, the privilege levels determine which interrupts may be called with the
_int_ instruction (each interrupt descriptor in the IDT has an assigned DPL).
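
The DPL of an interrupt descriptor lives in its type/attribute byte, so the check for a software `int` can be sketched like this. The helper names are hypothetical; `0x8E` and `0xEE` are the commonly used attribute bytes for a present 32-bit interrupt gate with DPL 0 and DPL 3 respectively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Type/attribute byte: bit 7 present, bits 5-6 DPL, bits 0-3 gate type. */
static unsigned gate_dpl(uint8_t type_attr) { return (type_attr >> 5) & 0x3; }
static bool gate_present(uint8_t type_attr) { return (type_attr >> 7) & 1; }

/* A software int is allowed only if CPL is numerically <= the gate's DPL.
 * Hardware interrupts and exceptions skip this check. */
static bool software_int_allowed(uint8_t type_attr, unsigned cpl)
{
    assert(cpl <= 3);
    return gate_present(type_attr) && cpl <= gate_dpl(type_attr);
}
```

This is why a system call vector is typically given an attribute byte like `0xEE`, while the rest of the table stays at `0x8E` so that user code cannot trigger those handlers with `int`.
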
So there's still a point to keep privilege levels around for your hobby OS,
despite the problems they cause with segmentation and TSS and stuff.
###Changing the privilege level
Changing the CPL is actually two different problems.
- Increasing CPL
- Decreasing CPL
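Increasing the CPL is relatively easy. A plain far jump cannot do it (the
processor refuses to jump to a non-conforming code segment whose DPL differs
from the CPL), but a far return can

    :::nasm
    ; Assumes 0x18 is a DPL 3 code segment and 0x20 a DPL 3 data segment
    push 0x23       ; SS: user data selector, 0x20 | 0x3
    push user_esp   ; ESP to use in ring 3 (hypothetical symbol)
    push 0x1B       ; CS: user code selector, 0x18 | 0x3
    push label      ; EIP
    retf
    label:
    ; The CS selector is now 0x18 | 0x3
    ; i.e. it points to segment no 3 (3*0x8) and CPL is set to 0x3

or it can be done through the `IRET` instruction.
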
###The IRET instruction
Let's change the topic for a minute and think about interrupts.
Say the processor is running in __Kernel Mode__ (Ring 0, CPL=0) and an
interrupt happens. What the processor does then is:
- Push SS and ESP to stack (only if the interrupt causes a change of privilege level, e.g. when it arrives in User Mode)
- Push EFLAGS to stack
- Push CS and EIP to stack
- Load CS and EIP from the IDT
and from there the interrupt handling routine takes over.
The interrupt handling routine does its thing and then runs the `IRET`
instruction. `IRET` makes the processor do the same thing as when an interrupt
happens, but _backwards_, i.e.:
- Pop CS and EIP from stack
- Pop EFLAGS from stack
- Pop SS and ESP from stack
- Do stack stuff
- Far jump to CS:EIP
Notice that extra thing there? The "Do stack stuff"?
At that point, the processor checks the value of CS that was just popped. It
compares the __Requested Privilege Level__ (__RPL__, last one - promise - I'm
not making these up, you know) in the bottom two bits of this to the CPL and, if
the RPL is numerically higher, it changes SS and ESP to the recently popped
values. This is really useful for software task switching.
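
The RPL comparison can be modelled in a few lines of C. This is a simulation sketch, not processor documentation: `cpu_state` and `model_iret` are invented names, the frame is read lowest address first (EIP, CS, EFLAGS, then ESP and SS), and the model skips the faults a real `IRET` raises for invalid selectors or for an attempt to return to a more privileged level.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t eip, cs, eflags, esp, ss;
    unsigned cpl;
} cpu_state;

/* `stack` points at the lowest address of the frame, as ESP would. */
static void model_iret(cpu_state *cpu, const uint32_t *stack)
{
    unsigned old_cpl = cpu->cpl;
    assert(old_cpl <= 3);

    cpu->eip = *stack++;
    cpu->cs = *stack++ & 0xFFFF;
    cpu->eflags = *stack++;

    unsigned rpl = cpu->cs & 0x3;
    if (rpl > old_cpl) {
        /* Returning to a less privileged level: switch stacks as well. */
        cpu->esp = *stack++;
        cpu->ss = *stack++ & 0xFFFF;
    }
    /* Same level: SS is untouched. A real CPU would just advance ESP past
     * the three popped values, which this model does not track. */
    cpu->cpl = rpl;
}
```

Feeding it a frame whose CS is `0x1B` while the model is at CPL 0 loads the new SS and ESP and leaves it at CPL 3. A frame whose CS is `0x08` only restores EIP, CS and EFLAGS.
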
So, you could easily get into a higher CPL (that is, a less privileged level)
by intercepting a handled interrupt and changing the value of CS on the stack.
If you set the bottom two bits to 0x3, you will soon be in User Mode.
Another (better in my opinion) option is to create a fake copy of the stack
frame an interrupt would have pushed and run `IRET` on it.

    :::c
    // C code
    // Field order matches what IRET pops, lowest address first
    struct
    {
      uint32_t eip;
      uint32_t cs;
      uint32_t eflags;
      uint32_t esp;
      uint32_t ss;
    } fake_stack;

    fake_stack.eip = (uint32_t)&usermode_function;
    fake_stack.cs = user_code_segment | 0x3;
    fake_stack.eflags = 0; // interrupt flag clear
    fake_stack.esp = usermode_stack_top;
    fake_stack.ss = user_data_segment | 0x3;

    set_all_segments(user_data_segment | 0x3);
    run_iret(&fake_stack);

    :::nasm
    ; Assembler code
    run_iret:
      mov esp, [esp+4]  ; point ESP at the fake frame passed as argument
      iret

###Going back to ring0
I was going to continue this blog post with talking about how to switch from
a higher CPL to a lower, but it is growing way longer than I thought it would.
Therefore I will cut it off here, and continue in a new post.
###Application
The methods described in this post are used in Git commit
[52a0c84739](https://github.com/thomasloven/os5/tree/52a0c84739e04f3d9dd7410cdf0b378118a946b4).