thomasloven.com/pages/2014-04-15-Dito-Framework.md

layout: post
title: "DITo - Framework"
subtitle: "the Disk Image TOols"
tags: [osdev]

In my osdeving, I was starting to reach the point where a disk driver
seemed like the obvious next step. This was pretty much entirely unknown
territory for me. In fact, my only experience with disks and filesystems
was from when I got started in osdeving and found a tutorial in PDF
form that described how to write a bootloader in asm that read a kernel
from a FAT12 floppy disk.

Since then, whenever I needed a disk image for testing, I'd go through
a painful process of finding an image with GRUB preinstalled, mounting
it using a discontinued third-party application, copying stuff to it,
hoping I would be able to unmount it without the entire computer freezing
up, and finally praying that it worked when I started the emulator. In
short, trying to manage a disk image from the command line in OS X sucks.

That's when I realized I could kill two birds with one stone. By writing
a tool for managing files in a disk image without mounting it, I could
gain understanding and experience of working with filesystems. If I
wrote it well, I would probably be able to reuse much of the code for
my kernel as well. At the time I had just finished my master's thesis and
had all but signed the contract for my current employment, so I had some
free time on my hands while the paperwork went through.

The result was [DITo - Disk Image
Tools](https://github.com/thomasloven/dito), a C library and set of
applications for creating and handling disk images from the command
line.

Recently, I actually did copy some of the code from DITo into my kernel.
Imagine my surprise when it worked like a charm after changing
only a few function calls.

I've since realized a couple of mistakes though, and decided to rewrite
some parts from scratch. Let's go!

###Drive operations

The basic operations of DITo are reading from or writing to image files
or disk drives. Each drive type has a driver:

	struct drive_t;	/* forward declaration for the function pointers below */
	typedef struct drive_driver
	{
		int (*open)(struct drive_t *d, int flags);
		int (*close)(struct drive_t *d, int flags);
		int (*read)(struct drive_t *d, void *buffer, size_t length, off_t offset);
		int (*write)(struct drive_t *d, void *buffer, size_t length, off_t offset);
	} drive_driver_t;

The drive type contains a pointer to the driver and a pointer to some
arbitrary data used by the driver.

	typedef struct drive_t
	{
		struct drive_driver *d;
		void *data;
	} drive_t;

Then there are some wrapper functions for performing the required
operations:

	int drive_open(struct drive_t *d, int flags)
	{
		if(d->d->open)
			return d->d->open(d, flags);
		else
			return 0;
	}

and similarly for `drive_close`, `drive_read` and `drive_write`.
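
As a rough illustration of how the pieces above fit together, here is a
minimal sketch of a driver that keeps its "disk" in a memory buffer, along
with `drive_read` and `drive_write` wrappers written by analogy with
`drive_open`. The `ram_*` names and the `ram_data_t` struct are made up for
this example and are not part of DITo. The sketch also assumes that `read`
and `write` return the number of bytes transferred.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

struct drive_t;

typedef struct drive_driver
{
	int (*open)(struct drive_t *d, int flags);
	int (*close)(struct drive_t *d, int flags);
	int (*read)(struct drive_t *d, void *buffer, size_t length, off_t offset);
	int (*write)(struct drive_t *d, void *buffer, size_t length, off_t offset);
} drive_driver_t;

typedef struct drive_t
{
	struct drive_driver *d;
	void *data;
} drive_t;

/* Hypothetical driver state: a memory buffer standing in for a disk */
typedef struct ram_data
{
	unsigned char *bytes;
	size_t size;
} ram_data_t;

/* Number of bytes of a request that fit inside the buffer */
static size_t ram_clamp(ram_data_t *r, size_t length, off_t offset)
{
	if(offset < 0 || (size_t)offset >= r->size)
		return 0;
	if(length > r->size - (size_t)offset)
		length = r->size - (size_t)offset;
	return length;
}

static int ram_read(struct drive_t *d, void *buffer, size_t length, off_t offset)
{
	ram_data_t *r = d->data;
	length = ram_clamp(r, length, offset);
	if(length)
		memcpy(buffer, r->bytes + offset, length);
	return (int)length;
}

static int ram_write(struct drive_t *d, void *buffer, size_t length, off_t offset)
{
	ram_data_t *r = d->data;
	length = ram_clamp(r, length, offset);
	if(length)
		memcpy(r->bytes + offset, buffer, length);
	return (int)length;
}

/* open and close are left as NULL, so a wrapper would skip them */
static drive_driver_t ram_driver = { NULL, NULL, ram_read, ram_write };

/* Wrappers following the same pattern as drive_open */
int drive_read(struct drive_t *d, void *buffer, size_t length, off_t offset)
{
	if(d->d->read)
		return d->d->read(d, buffer, length, offset);
	else
		return 0;
}

int drive_write(struct drive_t *d, void *buffer, size_t length, off_t offset)
{
	if(d->d->write)
		return d->d->write(d, buffer, length, offset);
	else
		return 0;
}
```

The driver's private state travels in `data`, so the wrappers never need to
know what kind of drive they are talking to. With a `ram_data_t` wrapping a
16 byte array, a 3 byte write at offset 14 stores only the two bytes that
fit and returns 2.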

###Filesystem operations

The next important part of DITo is the filesystem handling. The
primitive functions I could think of that all file operations need are
collected in a filesystem driver struct:

	struct fs_driver
	{
		INODE (*open)(struct fs_t *fs, const char *path, int flags);
		int (*close)(struct fs_t *fs, INODE ino);
		int (*read)(struct fs_t *fs, INODE ino, void *buffer, size_t length, off_t offset);
		int (*write)(struct fs_t *fs, INODE ino, void *buffer, size_t length, off_t offset);
		int (*truncate)(struct fs_t *fs, INODE ino, off_t length);
		int (*stat)(struct fs_t *fs, INODE ino, struct stat *st);

		int (*touch)(struct fs_t *fs, const char *path, struct stat *st);
		int (*link)(struct fs_t *fs, const char *path1, const char *path2);
		int (*unlink)(struct fs_t *fs, const char *path);
		dirent_t *(*readdir)(struct fs_t *fs, INODE dir, unsigned int num);
	};

The `fs_t` type contains a pointer to the driver, a pointer to the drive
and a general data pointer.

	typedef struct fs_t
	{
		struct fs_driver *driver;
		drive_t *d;
		void *data;
	} fs_t;

The wrapper functions `fs_open`, `fs_close` and so on work the same way
as the `drive_*` functions.

The `INODE` type is a pointer to a struct containing a pointer to the
filesystem, a unique inode number and a pointer to arbitrary data.

	struct ino_st
	{
		fs_t *fs;
		unsigned int ino;
		void *data;
	};

	typedef struct ino_st * INODE;

And that's the basic framework. As you have probably noticed, the same
`fs_t` pointer is passed to most functions twice: once as `fs` and once
as `ino->fs`. I decided to keep it this way to keep the function
interface consistent, and also to allow the sanity check `fs == ino->fs`.
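
That sanity check could live in the wrapper functions. The following is
only a sketch of how `fs_read` might look with it. Returning -1 on a
mismatch is an assumption made for this example, the driver struct is cut
down to a single member, and the `dummy_*` driver exists purely to make the
example self-contained.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

struct fs_t;
struct ino_st;
typedef struct ino_st * INODE;
typedef struct drive_t drive_t;	/* only used through a pointer here */

/* Only the read primitive is shown; the full struct has more members */
struct fs_driver
{
	int (*read)(struct fs_t *fs, INODE ino, void *buffer, size_t length, off_t offset);
};

typedef struct fs_t
{
	struct fs_driver *driver;
	drive_t *d;
	void *data;
} fs_t;

struct ino_st
{
	fs_t *fs;
	unsigned int ino;
	void *data;
};

int fs_read(struct fs_t *fs, INODE ino, void *buffer, size_t length, off_t offset)
{
	/* Refuse inodes that belong to a different filesystem */
	if(!fs || !ino || fs != ino->fs)
		return -1;
	if(fs->driver->read)
		return fs->driver->read(fs, ino, buffer, length, offset);
	else
		return 0;
}

/* Hypothetical driver that fills the buffer with zeros */
static int dummy_read(struct fs_t *fs, INODE ino, void *buffer, size_t length, off_t offset)
{
	(void)fs; (void)ino; (void)offset;
	memset(buffer, 0, length);
	return (int)length;
}

static struct fs_driver dummy_driver = { dummy_read };
```

Handing an inode opened on one filesystem to `fs_read` together with a
different `fs_t` then fails early instead of reaching the wrong driver.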

The idea behind the framework is that the same functions should be
usable for all kinds of filesystems on all kinds of drives. For example,
if I have an image of a FAT floppy disk with a file I want copied to
the ext2-formatted second partition of a hard drive image, I could do
something like this:

	drive_t *fat_disk = image_drive("floppy.img");
	drive_open(fat_disk, READ_FLAG);
	drive_t *ext2_disk = image_drive("harddrive.img");
	drive_open(ext2_disk, READ_WRITE_FLAG);
	drive_t *ext2_partition = mbr_drive(ext2_disk, 2);
	drive_open(ext2_partition, READ_WRITE_FLAG);

	fs_t *fat = fat_fs(fat_disk);
	fs_t *ext2 = ext2_fs(ext2_partition);

	INODE source = fs_open(fat, "/path/to/file", READ_FLAG);
	struct stat *st = malloc(sizeof(struct stat));
	fs_stat(fat, source, st);

	fs_touch(ext2, "/new/path", st);
	INODE destination = fs_open(ext2, "/new/path", WRITE_FLAG);

	void *buffer = malloc(BUFFER_SIZE);
	off_t offset = 0;
	off_t add = 0;
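	/* Copy until fs_read reports no more data (or an error) */
	while((add = fs_read(fat, source, buffer, BUFFER_SIZE, offset)) > 0)
	{
		/* Write only the bytes that were actually read */
		fs_write(ext2, destination, buffer, add, offset);
		offset += add;
	}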

	fs_close(ext2, destination);
	fs_close(fat, source);

	drive_close(ext2_partition);
	drive_close(fat_disk);

This will of course eventually become its own tool, so that all the
end user has to do is:

	$ dito-cp floppy.img:/path/to/file harddrive.img:2:/new/path
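
How such an argument gets split is not settled here. As a sketch only,
assuming the form `image[:partition]:path`, where the partition is a
decimal number and the path starts with `/`, a parser might look like this.
The `image_spec_t` type and the `parse_spec` function are hypothetical
names, and image file names containing a `:` are not handled.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result of parsing "image[:partition]:path" */
typedef struct image_spec
{
	char image[256];
	int partition;	/* 0 means no partition was given */
	char path[256];
} image_spec_t;

/* Returns 0 on success and -1 if the argument is malformed */
int parse_spec(const char *arg, image_spec_t *spec)
{
	/* Everything before the first ':' is the image file name */
	const char *colon = strchr(arg, ':');
	if(!colon || colon == arg)
		return -1;
	size_t len = (size_t)(colon - arg);
	if(len >= sizeof(spec->image))
		return -1;
	memcpy(spec->image, arg, len);
	spec->image[len] = '\0';

	const char *rest = colon + 1;
	spec->partition = 0;
	if(*rest != '/')
	{
		/* A partition number comes before the path */
		char *end;
		long n = strtol(rest, &end, 10);
		if(end == rest || *end != ':' || n < 1 || n > INT_MAX)
			return -1;
		spec->partition = (int)n;
		rest = end + 1;
	}

	/* What remains must be an absolute path that fits the buffer */
	if(*rest != '/' || strlen(rest) >= sizeof(spec->path))
		return -1;
	strcpy(spec->path, rest);
	return 0;
}
```

Parsing `harddrive.img:2:/new/path` this way gives the image
`harddrive.img`, partition 2 and the path `/new/path`, while
`floppy.img:/path/to/file` leaves the partition at 0.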