# Using SSE

Streaming SIMD Extensions (SSE) is a set of processor instructions that help with floating point operations. They have been around for over 20 years, but are not enabled by default. The reason for this is that the SSE instructions make use of some special CPU registers, and the OS needs to be aware of them so that they can be saved and restored on context switches.

TODO: Three versions of SSE instructions and registers?

## Preserving registers

Saving and restoring the SSE registers is actually very easy, and there are even special instructions that do it:

```asm
.global sse_save
sse_save:
  fxsave [rdi]
  ret

.global sse_restore
sse_restore:
  fxrstor [rdi]
  ret
```

Those functions save the SSE registers to, or restore them from, the 512 byte buffer pointed to by the first argument (passed in `rdi`).

A good time to do this is to restore the registers right before switching to a new thread, and to save them again right after the switch returns:

```c
process *next = scheduler_next();
...
sse_restore(next->sse);   // load the thread's SSE state before it runs
switch_stack(scheduler->stack_ptr, next->stack_ptr);
sse_save(next->sse);      // store its SSE state once the switch returns
...
```

TODO: Can SSE usage be checked through exceptions?

Now, before moving on to how to actually enable SSE, there's one pitfall to look out for. The memory location for storing the SSE registers MUST BE 16 BYTE ALIGNED. This is most easily done by allocating 16 bytes extra and rounding the pointer up to the next 16 byte boundary:

TODO: Make sure this is freed correctly!

```c
struct process *new_process()
{
  ...
  // Over-allocate so that a 16 byte aligned 512 byte region always fits
  void *sse = malloc(512+16);
  // Round up to the next multiple of 16.
  // Note: free() needs the original pointer `sse`, not the aligned one.
  new->sse = (void *) (((uintptr_t)sse + 0xF) & ~(uintptr_t)0xF);
```

## Enabling SSE

Ok, so finally - enabling SSE. This is simply done by setting or clearing four control register bits.

```asm
sse_init:
  mov rax, cr4
  or rax, 1<<9      //; Enable Operating System FXSAVE/FXRSTOR Support bit (OSFXSR)
  or rax, 1<<10     //; Enable Operating System Unmasked Exception Support bit (OSXMMEXCPT)
  mov cr4, rax

  mov rax, cr0
  or rax, 1<<1      //; Enable Monitor Coprocessor bit (MP)
  and rax, ~(1<<2)  //; Disable Emulate Coprocessor bit (EM)
  mov cr0, rax
```

And that's it. We can now use floating point math!

TODO: Find out and explain what the flags actually do and why they are all needed.

- OSFXSR: Enable use of legacy SSE instructions and indicate that the OS saves the 64 and 128 bit registers on task switching
- OSXMMEXCPT: Indicate that the SIMD floating point exception (#XF) is handled by the OS
- MP: CHECK IF THIS IS REQUIRED HERE OR ALREADY SET
- EM: Must be clear for SSE, since SSE instructions raise #UD while it is set. It should only be set to 1 if the processor has no x87 FPU - check if this is clear already

Register values before `sse_init`:

- CR0 = 0x80000011 (EM already unset, MP not set)
- CR4 = 0x00000020

Register values after `sse_init`:

- CR0 = 0x80000013
- CR4 = 0x00000620