layout: post
title: "Loading elf"
subtitle: "there's DWARF in my ELF."
tags: [osdev]

### ELF header format

ELF files all start with a header which identifies the file and explains
where to find everything else. It has the following structure. The
[ELF specification](http://www.skyfree.org/linux/references/ELF_Format.pdf)
gives an excellent description of the meaning and use of each field.

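    :::c
    typedef struct
    {
        uint8_t identity[16];   // Magic number and file class info
        uint16_t type;          // Object file type (2 = executable)
        uint16_t machine;       // Target architecture
        uint32_t version;       // ELF version
        uint32_t entry;         // Virtual address of the entry point
        uint32_t ph_offset;     // File offset of the program header table
        uint32_t sh_offset;     // File offset of the section header table
        uint32_t flags;         // Processor-specific flags
        uint16_t header_size;   // Size of this header
        uint16_t ph_size;       // Size of one program header
        uint16_t ph_num;        // Number of program headers
        uint16_t sh_size;       // Size of one section header
        uint16_t sh_num;        // Number of section headers
        uint16_t strtab_index;  // Section index of the section name table
    }__attribute__((packed)) elf_header;
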
The first thing we should do is check whether we actually got an
executable ELF file. (In the following code, I'll assume the entire ELF
file is located somewhere in memory and that this location is passed to
the `load_elf()` function.)

To check if the file is an ELF executable we can look at the
`identity` field. The first four bytes of this field should always be
`0x7F`, `'E'`, `'L'`, `'F'`. If that's correct, we can look at the `type`
field. For a standalone executable program, this should be `2`.

    :::c
    int load_elf(uint8_t *data)
    {
        elf_header *elf = (elf_header *)data;
        if(is_elf(elf) != ELF_TYPE_EXECUTABLE)
            return -1;
    ...

`is_elf` looks as follows. Note the use of `strncmp` which I can do
because I link [newlib into my kernel](/blog/2013/08/Catching-Up/).

    :::c
    // Returns the ELF type field, or -1 if this is not an ELF file
    int is_elf(elf_header *elf)
    {
        int iself = -1;

        // The magic number is 0x7F followed by the letters "ELF"
        if((elf->identity[0] == 0x7f) &&
            !strncmp((char *)&elf->identity[1], "ELF", 3))
        {
            iself = 0;
        }

        if(iself != -1)
            iself = elf->type;

        return iself;
    }

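Beyond the magic number, the rest of the `identity` bytes say what kind
of ELF file this is. As a sketch, and assuming a 32-bit little-endian
target (the structs here use 32-bit fields, but the exact target is my
assumption), a stricter check could look something like this. The `EI_*`
indices and values are the ones defined in the ELF specification;
`elf_ident_ok` is a hypothetical helper and not part of the kernel.

```c
#include <stdint.h>
#include <string.h>

/* Indices into identity[] and the values expected for a 32-bit,
   little-endian file, as defined by the ELF specification. */
#define EI_CLASS     4
#define EI_DATA      5
#define EI_VERSION   6
#define ELFCLASS32   1
#define ELFDATA2LSB  1
#define EV_CURRENT   1

/* Hypothetical helper: returns 1 if the identity bytes describe a
   32-bit little-endian ELF file, 0 otherwise. */
int elf_ident_ok(const uint8_t *identity)
{
    static const uint8_t magic[4] = { 0x7f, 'E', 'L', 'F' };

    if(memcmp(identity, magic, sizeof(magic)) != 0)
        return 0;
    if(identity[EI_CLASS] != ELFCLASS32)
        return 0;
    if(identity[EI_DATA] != ELFDATA2LSB)
        return 0;
    if(identity[EI_VERSION] != EV_CURRENT)
        return 0;
    return 1;
}
```

Using `memcmp` instead of `strncmp` here is only a matter of taste; it
avoids the cast to `char *`.
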
Should be pretty straightforward. Let's continue.

For just loading a simple ELF program, we only need to look at the
program headers, which are located in a table at offset `ph_offset` in
the file.

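    :::c
    typedef struct
    {
        uint32_t type;              // Segment type (e.g. loadable)
        uint32_t offset;            // Offset of the segment in the file
        uint32_t virtual_address;   // Where to place it in memory
        uint32_t physical_address;  // Physical address, where relevant
        uint32_t file_size;         // Bytes of the segment in the file
        uint32_t mem_size;          // Bytes of the segment in memory
        uint32_t flags;             // Read/write/execute permissions
        uint32_t align;             // Required alignment
    }__attribute__((packed)) elf_phead;
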
The program headers each tell us about one segment of the file, and we
use them to find out what parts of the ELF image should be loaded where
in memory. So, the next step would be to go through all program headers
looking for loadable segments and load them into memory.

    :::c
        ...
        // The program header table is an array of ph_num entries
        elf_phead *phead = (elf_phead *)&data[elf->ph_offset];
        uint32_t i;
        for(i = 0; i < elf->ph_num; i++)
        {
            if(phead[i].type == ELF_PT_LOAD)
            {
                load_elf_segment(data, &phead[i]);
            }
        }
        return 0;
    }

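The loop above trusts `ph_offset` and `ph_num` completely. If the image
could come from somewhere you don't control, it may be worth checking
that the whole table lies inside the image first. A minimal sketch,
assuming the caller also knows the size of the image (the `size`
parameter and the name `ph_table_in_bounds` are my own inventions for
this example, not something from the kernel):

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if a table of ph_num entries of
   ph_size bytes each, starting at ph_offset, fits inside an image of
   `size` bytes, and 0 otherwise. The sums are done in 64 bits so that
   they cannot wrap around. */
int ph_table_in_bounds(uint32_t ph_offset, uint16_t ph_size,
    uint16_t ph_num, size_t size)
{
    uint64_t table_bytes = (uint64_t)ph_size * ph_num;
    uint64_t table_end = (uint64_t)ph_offset + table_bytes;

    return table_end <= (uint64_t)size;
}
```

In `load_elf()` this would be called with `elf->ph_offset`,
`elf->ph_size` and `elf->ph_num` before entering the loop.
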
This would also be a good time to update the memory manager's information
about the executable. You might want to keep track of the start and end
of code and data, for example.

Anyway, `load_elf_segment()` looks like this:

    :::c
    void load_elf_segment(uint8_t *data, elf_phead *phead)
    {

        uint32_t memsize = phead->mem_size; // Size in memory
        uint32_t filesize = phead->file_size; // Size in file
        uint32_t mempos = phead->virtual_address; // Offset in memory
        uint32_t filepos = phead->offset; // Offset in file

        uint32_t flags = MM_FLAG_READ;
        if(phead->flags & ELF_PT_W) flags |= MM_FLAG_WRITE;

        new_area(current->proc, mempos, mempos + memsize,
            flags, MM_TYPE_DATA);

        if(memsize == 0) return;

        // mempos is an address stored as an integer, so cast it to a
        // pointer before handing it to memcpy and memset
        memcpy((void *)mempos, &data[filepos], filesize);
        memset((void *)(mempos + filesize), 0, memsize - filesize);
    }

Let's go through it.

First we define some helper variables.

Next we check if the segment we're loading should be writable.

Then we request a new memory area from the [process memory
manager](/blog/2013/06/Even-More-Memory/).

Finally, we copy as much data as is provided in the file and fill the
rest of the new area with zeros.

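The copy-and-zero-fill step can be tried out on its own, outside the
kernel. The sketch below is a hypothetical host-side version where
`dest` stands in for the freshly mapped area; `copy_segment` is a name
made up for this example. It also guards against a header that claims
more file data than memory, since `memsize - filesize` would otherwise
wrap around (the ELF specification says the file size may not be larger
than the memory size, but a broken file could still say so).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the last two lines of load_elf_segment().
   `dest` must have room for mem_size bytes.
   Returns 0 on success, -1 if file_size is larger than mem_size. */
int copy_segment(uint8_t *dest, const uint8_t *src,
    uint32_t file_size, uint32_t mem_size)
{
    if(file_size > mem_size)
        return -1;

    /* Bytes that are present in the file */
    memcpy(dest, src, file_size);
    /* The remainder exists only in memory and starts out as zeros */
    memset(dest + file_size, 0, mem_size - file_size);
    return 0;
}
```

The zero-filled tail is what gives a program its zero-initialised data
without those zeros taking up space in the file.
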
And that's really all you need to do to load an ELF executable.
The only thing left is to jump to `elf->entry` and you're off.

### Improvements
Of course, in the normal case the entire executable image won't already
be sitting in memory, although it might be for e.g. an `init` program or
similar that your bootloader loads as a module alongside your kernel.
Instead, you should read the parts you want through your filesystem as
you go along.

Or maybe you shouldn't. It doesn't make sense to load a huge program
into memory all at once. What if it encounters an error and exits with
99% of the code unexecuted?

Perhaps the process memory manager could be told where to find certain
parts of the program, and load them only when needed?

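To make that last idea a bit more concrete, here is a rough sketch of
the arithmetic such a scheme would need: given a record of where a
segment lives, work out which file bytes belong in one particular page.
Everything in it is hypothetical (`lazy_segment`, `lazy_page_bytes` and
the 4096 byte page size are assumptions for the example), and it assumes
that the segment starts on a page boundary and that the page lies inside
the segment.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Hypothetical record of where one loadable segment lives. */
typedef struct
{
    uint32_t offset;          /* Offset of the segment in the file */
    uint32_t virtual_address; /* Where the segment starts in memory */
    uint32_t file_size;       /* Bytes backed by the file */
    uint32_t mem_size;        /* Bytes occupied in memory */
} lazy_segment;

/* For the page starting at address `page`, store in *file_offset where
   in the file the page's data begins, and return how many bytes should
   be read from there. The rest of the page should be zero-filled. */
uint32_t lazy_page_bytes(const lazy_segment *seg, uint32_t page,
    uint32_t *file_offset)
{
    uint32_t into_segment = page - seg->virtual_address;
    uint32_t left;

    *file_offset = seg->offset + into_segment;

    /* Past the end of the file data: the page is all zeros */
    if(into_segment >= seg->file_size)
        return 0;

    left = seg->file_size - into_segment;
    return left < PAGE_SIZE ? left : PAGE_SIZE;
}
```

A page fault handler could then read that many bytes through the
filesystem and zero the remainder, one page at a time.
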
### Git
The methods described in this post have been implemented in git commit
[a4ca835d1d](https://github.com/thomasloven/os5/tree/a4ca835d1db61faf214b4b617d38a335ef05d142).