layout: post
title: "Loading elf"
subtitle: "there's DWARF in my ELF."
tags: [osdev]

### ELF header format

ELF files all start with a header which identifies the file and explains
where to find everything. It has the following structure. The
[ELF specification](http://www.skyfree.org/linux/references/ELF_Format.pdf)
gives an excellent description of the meaning and use of each field.

    :::c
    typedef struct
    {
      uint8_t identity[16];   // Magic number, class, byte order, ...
      uint16_t type;          // Object file type (2 = executable)
      uint16_t machine;       // Target architecture
      uint32_t version;
      uint32_t entry;         // Virtual address of the entry point
      uint32_t ph_offset;     // File offset of the program header table
      uint32_t sh_offset;     // File offset of the section header table
      uint32_t flags;
      uint16_t header_size;   // Size of this header
      uint16_t ph_size;       // Size of one program header
      uint16_t ph_num;        // Number of program headers
      uint16_t sh_size;       // Size of one section header
      uint16_t sh_num;        // Number of section headers
      uint16_t strtab_index;  // Section header index of the section name string table
    }__attribute__((packed)) elf_header;
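
The struct above is the 32-bit variant of the header, which the specification lays out as 52 bytes. A typo in a field width silently shifts every field after it, so it can be worth letting the compiler check the layout. The following is only a sketch, assuming a C11 compiler with `_Static_assert` and the GCC-style `packed` attribute; the offsets are the ones given in the specification.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct
{
  uint8_t identity[16];
  uint16_t type;
  uint16_t machine;
  uint32_t version;
  uint32_t entry;
  uint32_t ph_offset;
  uint32_t sh_offset;
  uint32_t flags;
  uint16_t header_size;
  uint16_t ph_size;
  uint16_t ph_num;
  uint16_t sh_size;
  uint16_t sh_num;
  uint16_t strtab_index;
} __attribute__((packed)) elf_header;

// Compilation fails if the struct does not match the 32-bit ELF layout
_Static_assert(sizeof(elf_header) == 52, "ELF32 header is 52 bytes");
_Static_assert(offsetof(elf_header, type) == 16, "type follows identity");
_Static_assert(offsetof(elf_header, entry) == 24, "entry at offset 24");
_Static_assert(offsetof(elf_header, ph_offset) == 28, "ph_offset at offset 28");
_Static_assert(offsetof(elf_header, ph_num) == 44, "ph_num at offset 44");
```

These checks cost nothing at run time, since they are evaluated entirely by the compiler.
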

The first thing we should do is check whether we actually got an
executable ELF file. (In the following code, I'll assume the entire ELF
file is located somewhere in memory and that this location is passed to
the `load_elf()` function.)

To check if the file is an ELF executable we can look at the
`identity` field. The first four bytes of this field should always be
`0x7F`, `'E'`, `'L'`, `'F'`. If that's correct, we can look at the `type`
field. For a standalone executable program, this should be `2`.

    :::c
    int load_elf(uint8_t *data)
    {
      elf_header *elf = (elf_header *)data;
      if(is_elf(elf) != ELF_TYPE_EXECUTABLE)
        return -1;
      ...

`is_elf` looks as follows. Note the use of `strncmp`, which I can do
because I link [newlib into my kernel](/blog/2013/08/Catching-Up/).

    :::c
    int is_elf(elf_header *elf)
    {
      int iself = -1;

      // Magic number: 0x7F followed by "ELF"
      if((elf->identity[0] == 0x7f) &&
        !strncmp((char *)&elf->identity[1], "ELF", 3))
      {
        iself = 0;
      }

      // For a valid ELF file, return its type instead
      if(iself != -1)
        iself = elf->type;

      return iself;
    }
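
The structs in this post only describe 32-bit files, and the fields are read directly, so the file's byte order has to match the machine's. The `identity` array also records the file class (byte 4) and the byte order (byte 5), so a stricter check is possible. Below is a minimal sketch of such a check; `is_elf32_le` is a hypothetical helper that is not part of the original loader, and it uses `memcmp` on the assumption that the C library provides it.

```c
#include <stdint.h>
#include <string.h>

// Indices into the identity array and the values for a 32-bit
// little-endian file, as given in the ELF specification
#define EI_CLASS    4
#define EI_DATA     5
#define ELFCLASS32  1
#define ELFDATA2LSB 1

// Hypothetical helper, not part of the original loader.
// Returns 1 if the 16 identity bytes describe a 32-bit
// little-endian ELF file, 0 otherwise.
int is_elf32_le(const uint8_t *identity)
{
  static const uint8_t magic[4] = {0x7f, 'E', 'L', 'F'};

  if(memcmp(identity, magic, sizeof(magic)) != 0)
    return 0;
  if(identity[EI_CLASS] != ELFCLASS32)
    return 0;
  if(identity[EI_DATA] != ELFDATA2LSB)
    return 0;
  return 1;
}
```

A 64-bit file passes the magic number test but has a different header layout, so without a check like this its fields would be misread.
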

That should be pretty straightforward. Let's continue.

For just loading a simple ELF program, we only need to look at the
program headers, which are located in a table at offset `ph_offset` in
the file.

    :::c
    typedef struct
    {
      uint32_t type;              // Segment type
      uint32_t offset;            // Offset of the segment in the file
      uint32_t virtual_address;   // Where the segment goes in memory
      uint32_t physical_address;
      uint32_t file_size;         // Number of bytes in the file
      uint32_t mem_size;          // Number of bytes in memory
      uint32_t flags;             // Read/write/execute permissions
      uint32_t align;
    }__attribute__((packed)) elf_phead;

The program headers each tell us about one segment of the file, and we
use them to find out what parts of the ELF image should be loaded where
in memory. So, the next step is to go through all program headers
looking for loadable segments and load them into memory.

    :::c
      ...
      elf_phead *phead = (elf_phead *)&data[elf->ph_offset];
      uint32_t i;
      for(i = 0; i < elf->ph_num; i++)
      {
        if(phead[i].type == ELF_PT_LOAD)
        {
          load_elf_segment(data, &phead[i]);
        }
      }
      return 0;
    }
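
The loop trusts `ph_offset` and `ph_num` completely. If the size of the file image is known, it may be worth checking that the whole table lies inside it before walking it. The sketch below assumes a size is available, which `load_elf()` as written does not receive; `ph_table_in_bounds` and its `file_size` parameter are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper, not part of the original loader.
// Returns 1 if a table of ph_num entries of ph_size bytes each,
// starting at ph_offset, fits inside a file of file_size bytes.
int ph_table_in_bounds(uint32_t ph_offset, uint16_t ph_size,
                       uint16_t ph_num, size_t file_size)
{
  // Use 64-bit arithmetic so the sum cannot wrap around
  uint64_t table_end = (uint64_t)ph_offset + (uint64_t)ph_size * ph_num;
  return table_end <= file_size;
}
```

With the structs in this post, a call could look like `ph_table_in_bounds(elf->ph_offset, elf->ph_size, elf->ph_num, size)`, where `size` is whatever the caller knows about the image.
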

This would also be a good time to update the memory manager's information
about the executable. You might want to keep track of the start and end
of code and data, for example.
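
One way to collect that information is to walk the program headers once more and record the lowest and highest addresses used by loadable segments. This is only a sketch: `image_extent` is a hypothetical helper, and `ELF_PT_LOAD` is assumed to have the value `1` that the specification gives for `PT_LOAD`.

```c
#include <stdint.h>

#define ELF_PT_LOAD 1  // Assumed: value of PT_LOAD in the ELF specification

// Same field order as elf_phead in the post. Eight uint32_t fields
// have no padding, so the packed attribute is left out here.
typedef struct
{
  uint32_t type, offset, virtual_address, physical_address;
  uint32_t file_size, mem_size, flags, align;
} elf_phead;

// Hypothetical helper, not part of the original loader.
// Stores the lowest start and highest end address of all loadable
// segments and returns how many loadable segments were found.
// *start and *end are left untouched if there are none.
int image_extent(const elf_phead *phead, uint32_t num,
                 uint32_t *start, uint32_t *end)
{
  int found = 0;
  uint32_t i;
  for(i = 0; i < num; i++)
  {
    if(phead[i].type != ELF_PT_LOAD)
      continue;

    uint32_t seg_start = phead[i].virtual_address;
    uint32_t seg_end = seg_start + phead[i].mem_size;

    if(!found || seg_start < *start) *start = seg_start;
    if(!found || seg_end > *end) *end = seg_end;
    found++;
  }
  return found;
}
```

The end address could, for example, serve as a starting point for the process heap.
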

Anyway, `load_elf_segment()` looks like this:

    :::c
    void load_elf_segment(uint8_t *data, elf_phead *phead)
    {

      uint32_t memsize = phead->mem_size; // Size in memory
      uint32_t filesize = phead->file_size; // Size in file
      uint32_t mempos = phead->virtual_address; // Offset in memory
      uint32_t filepos = phead->offset; // Offset in file

      uint32_t flags = MM_FLAG_READ;
      if(phead->flags & ELF_PT_W) flags |= MM_FLAG_WRITE;

      new_area(current->proc, mempos, mempos + memsize,
        flags, MM_TYPE_DATA);

      if(memsize == 0) return;

      // The addresses are plain integers, so cast them to pointers
      memcpy((void *)mempos, &data[filepos], filesize);
      memset((void *)(mempos + filesize), 0, memsize - filesize);
    }

Let's go through it.

First we define some helper variables.

Next we check if the segment we're loading should be writable.

Then we request a new memory area from the [process memory
manager](/blog/2013/06/Even-More-Memory/).

Finally, we copy as much data as is provided in the file and fill the
rest of the new area with zeros. That zero-filled remainder is where
uninitialized data such as `.bss` ends up.
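
The copy step can be tried out in user space by letting an ordinary buffer play the role of the new memory area. The sketch below does that and adds two sanity checks that `load_elf_segment()` does not perform; `copy_segment` and its `file_len` parameter are hypothetical.

```c
#include <stdint.h>
#include <string.h>

// Hypothetical user-space stand-in for the copy step. `dest` plays the
// role of the freshly mapped memory area and must hold memsize bytes.
// Returns 0 on success, -1 if the sizes are inconsistent.
int copy_segment(uint8_t *dest, const uint8_t *file, uint32_t file_len,
                 uint32_t filepos, uint32_t filesize, uint32_t memsize)
{
  // A segment cannot have more file data than memory
  if(filesize > memsize)
    return -1;
  // The file data must lie inside the file image
  if(filepos > file_len || filesize > file_len - filepos)
    return -1;

  memcpy(dest, &file[filepos], filesize);          // Data from the file
  memset(dest + filesize, 0, memsize - filesize);  // Zero-fill the rest
  return 0;
}
```

Without the first check, a corrupt header with `file_size` larger than `mem_size` would make `memsize - filesize` wrap around to a huge unsigned value.
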

And that's really all you need to do to load an ELF executable.
The only thing left is to jump to `elf->entry` and you're off.

### Improvements
Of course, the entire executable image won't normally be sitting in
memory already, though it might be for e.g. an `init` program or similar
that your bootloader loads as a module to your kernel. Instead, you
should read the parts you want through your filesystem as you go along.

Or maybe you shouldn't. It doesn't make sense to load a huge program
into memory all at once. What if it encounters an error and exits with
99% of the code unexecuted?

Perhaps the process memory manager could be told where to find certain
parts of the program, and load them only when needed?
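
To make that last idea a bit more concrete, here is one possible shape for it, simulated in user space. Everything in it is an assumption for illustration: the `lazy_segment` record, the `fill_page` helper, the 4096-byte page size, and the requirement that the segment starts on a page boundary. A plain buffer stands in for reads through the filesystem.

```c
#include <stdint.h>
#include <string.h>

#define LAZY_PAGE_SIZE 4096u  // Assumed page size

// Hypothetical record the memory manager could keep for each segment
// instead of copying it up front. `start` is assumed page-aligned.
typedef struct
{
  uint32_t start;      // First virtual address of the segment
  uint32_t mem_size;   // Size in memory
  uint32_t file_pos;   // Where the segment's data starts in the file
  uint32_t file_size;  // How many bytes are backed by file data
} lazy_segment;

// Produce the contents of the page containing `addr`, as a page fault
// handler might. Returns 0 on success, -1 if addr is outside the segment.
int fill_page(const lazy_segment *seg, const uint8_t *file,
              uint32_t addr, uint8_t *page)
{
  if(addr < seg->start || addr - seg->start >= seg->mem_size)
    return -1;

  // Offset of the page within the segment
  uint32_t off = (addr - seg->start) & ~(LAZY_PAGE_SIZE - 1);

  // Number of bytes of this page that come from the file
  uint32_t n = 0;
  if(off < seg->file_size)
  {
    n = seg->file_size - off;
    if(n > LAZY_PAGE_SIZE) n = LAZY_PAGE_SIZE;
  }

  memset(page, 0, LAZY_PAGE_SIZE);                 // Zeros by default
  memcpy(page, &file[seg->file_pos + off], n);     // File data on top
  return 0;
}
```

In a kernel, the page fault handler would look up the faulting address among such records, map a fresh page, and fill it this way, so pages that are never touched are never read.
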

### Git
The methods described in this post have been implemented in git commit
[a4ca835d1d](https://github.com/thomasloven/os5/tree/a4ca835d1db61faf214b4b617d38a335ef05d142).