---
layout: post
title: "DITo - Framework"
subtitle: "the Disk Image TOols"
tags: [osdev]
---

In my osdeving, I was reaching the point where a disk driver seemed like the obvious next step.

This was almost entirely unknown territory for me. In fact, my only experience with disks and filesystems came from when I got started in osdeving and found a tutorial in PDF form that described how to write a bootloader in asm which read a kernel from a FAT12 floppy disk.

Since then, whenever I needed a disk image for testing, I'd go through a painful process: finding an image with GRUB preinstalled, mounting it with a discontinued third-party application, copying stuff to it, hoping I would be able to unmount it without the entire computer freezing up, and finally praying that it worked when I started the emulator. In short, trying to manage a disk image from the command line in OSX sucks.

That's when I realized I could kill two birds with one stone. By writing a tool for managing files in a disk image without mounting it, I could gain understanding and experience of working with filesystems. If I wrote it well, I would probably be able to reuse much of the code in my kernel as well.

At the time I had just finished my master's thesis and had all but signed the contract for my current employment, so I had some free time on my hands while the paperwork went through.

The result was [DITo - Disk Image Tools](https://github.com/thomasloven/dito), a C library and set of applications for creating and handling disk images from the command line.

Recently, I actually did copy some of the code from DITo into my kernel. Imagine my surprise when it worked like a charm after changing only a few function calls. I've since realized a couple of mistakes though, and decided to rewrite some parts from scratch.

Let's go!

###Drive operations

The basic operations of DITo are reading from or writing to image files or disk drives.
Each drive type has a driver:

    struct drive_t; /* forward declaration, defined below */

    typedef struct drive_driver
    {
      int (*open)(struct drive_t *d, int flags);
      int (*close)(struct drive_t *d, int flags);
      int (*read)(struct drive_t *d, void *buffer, size_t length, off_t offset);
      int (*write)(struct drive_t *d, void *buffer, size_t length, off_t offset);
    } drive_driver_t;

The drive type contains a pointer to the driver and a pointer to some arbitrary data used by the driver.

    typedef struct drive_t
    {
      struct drive_driver *d;
      void *data;
    } drive_t;

Then there are some wrapper functions for performing the required operations:

    int drive_open(struct drive_t *d, int flags)
    {
      /* A driver may leave a callback unset; treat that as a no-op. */
      if(d->d->open)
        return d->d->open(d, flags);
      else
        return 0;
    }

and similarly for `drive_close`, `drive_read` and `drive_write`.

###Filesystem operations

The next important part of DITo is the filesystem handling. After some thought, I collected the primitive functions needed by every file operation I could think of into a filesystem driver struct:

    struct fs_t; /* forward declaration; fs_t and INODE are defined below */

    struct fs_driver
    {
      INODE (*open)(struct fs_t *fs, const char *path, int flags);
      int (*close)(struct fs_t *fs, INODE ino);
      int (*read)(struct fs_t *fs, INODE ino, void *buffer, size_t length, off_t offset);
      int (*write)(struct fs_t *fs, INODE ino, void *buffer, size_t length, off_t offset);
      int (*truncate)(struct fs_t *fs, INODE ino, off_t length);
      int (*stat)(struct fs_t *fs, INODE ino, struct stat *st);
      int (*touch)(struct fs_t *fs, const char *path, struct stat *st);
      int (*link)(struct fs_t *fs, const char *path1, const char *path2);
      int (*unlink)(struct fs_t *fs, const char *path);
      dirent_t *(*readdir)(struct fs_t *fs, INODE dir, unsigned int num);
    };

The `fs_t` type contains a pointer to the driver, a pointer to the drive and a general data pointer.

    typedef struct fs_t
    {
      struct fs_driver *driver;
      drive_t *d;
      void *data;
    } fs_t;

The wrapper functions `fs_open`, `fs_close` and so on work the same way as the `drive_*` functions.
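
Since the `drive_*` and `fs_*` wrappers all follow one dispatch pattern, a small self-contained sketch may make it more concrete. It repeats the drive types from above so that it compiles on its own, writes `drive_read` and `drive_write` in the same style as `drive_open`, and plugs in a RAM-backed driver. The names `mem_data_t`, `mem_read`, `mem_write` and `mem_drive_init` are hypothetical, and returning the number of bytes transferred is an assumption made for the illustration, not necessarily what DITo does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h> /* off_t */

struct drive_t; /* forward declaration so the callbacks can refer to it */

typedef struct drive_driver
{
  int (*open)(struct drive_t *d, int flags);
  int (*close)(struct drive_t *d, int flags);
  int (*read)(struct drive_t *d, void *buffer, size_t length, off_t offset);
  int (*write)(struct drive_t *d, void *buffer, size_t length, off_t offset);
} drive_driver_t;

typedef struct drive_t
{
  struct drive_driver *d;
  void *data;
} drive_t;

/* Wrappers: call the driver if it implements the operation, else return 0. */
int drive_read(drive_t *d, void *buffer, size_t length, off_t offset)
{
  assert(d && d->d);
  return d->d->read ? d->d->read(d, buffer, length, offset) : 0;
}

int drive_write(drive_t *d, void *buffer, size_t length, off_t offset)
{
  assert(d && d->d);
  return d->d->write ? d->d->write(d, buffer, length, offset) : 0;
}

/* Hypothetical driver data: a fixed-size block of memory acting as a disk. */
typedef struct
{
  unsigned char *bytes;
  size_t size;
} mem_data_t;

/* Number of bytes that can be transferred at offset without leaving the block. */
static size_t mem_clamp(const mem_data_t *m, size_t length, off_t offset)
{
  if(offset < 0 || (size_t)offset >= m->size)
    return 0;
  size_t left = m->size - (size_t)offset;
  return length < left ? length : left;
}

static int mem_read(drive_t *d, void *buffer, size_t length, off_t offset)
{
  mem_data_t *m = d->data;
  size_t n = mem_clamp(m, length, offset);
  if(n)
    memcpy(buffer, m->bytes + offset, n);
  return (int)n; /* assumption: return the number of bytes read */
}

static int mem_write(drive_t *d, void *buffer, size_t length, off_t offset)
{
  mem_data_t *m = d->data;
  size_t n = mem_clamp(m, length, offset);
  if(n)
    memcpy(m->bytes + offset, buffer, n);
  return (int)n; /* assumption: return the number of bytes written */
}

/* open and close are left unset, so the wrappers would treat them as no-ops. */
static drive_driver_t mem_driver = { NULL, NULL, mem_read, mem_write };

void mem_drive_init(drive_t *d, mem_data_t *m)
{
  d->d = &mem_driver;
  d->data = m;
}
```

Code that calls `drive_read` or `drive_write` never needs to know whether the bytes live in RAM, in an image file or in a partition of another drive. Only the driver struct behind the `drive_t` changes.
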

The `INODE` type is a pointer to a struct containing a pointer to the filesystem, a unique inode number and a pointer to arbitrary data.

    struct ino_st
    {
      fs_t *fs;
      unsigned int ino;
      void *data;
    };
    typedef struct ino_st * INODE;

And that's the basic framework.

As you have probably noticed, the same `fs_t` pointer is passed to most functions twice: once as `fs` and once as `ino->fs`. I decided to keep it this way to keep the function interface consistent, and also to allow the sanity check `fs == ino->fs`.

The idea behind the framework is that the same functions should be usable for all kinds of filesystems on all kinds of drives. For example, if I have an image of a FAT floppy disk with a file I want copied to the ext2-formatted second partition of a hard drive image, I could do something like this:

    drive_t *fat_disk = image_drive("floppy.img");
    drive_open(fat_disk, READ_FLAG);

    drive_t *ext2_disk = image_drive("harddrive.img");
    drive_open(ext2_disk, READ_WRITE_FLAG);
    drive_t *ext2_partition = mbr_drive(ext2_disk, 2);
    drive_open(ext2_partition, READ_WRITE_FLAG);

    fs_t *fat = fat_fs(fat_disk);
    fs_t *ext2 = ext2_fs(ext2_partition);

    INODE source = fs_open(fat, "/path/to/file", READ_FLAG);
    struct stat st;
    fs_stat(fat, source, &st);
    fs_touch(ext2, "/new/path", &st);
    INODE destination = fs_open(ext2, "/new/path", WRITE_FLAG);

    void *buffer = malloc(BUFFER_SIZE);
    off_t offset = 0;
    off_t add = 0;
    /* Write only as many bytes as the last read returned. */
    while((add = fs_read(fat, source, buffer, BUFFER_SIZE, offset)) > 0)
    {
      fs_write(ext2, destination, buffer, add, offset);
      offset += add;
    }
    free(buffer);

    fs_close(ext2, destination);
    fs_close(fat, source);
    drive_close(ext2_partition, 0);
    drive_close(fat_disk, 0);

This will of course eventually become its own tool, so that all the end user has to do is:

    $ dito-cp floppy.img:/path/to/file harddrive.img:2:/new/path
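
To give an idea of what such a tool has to do before it can call `image_drive` and `mbr_drive`, here is a rough sketch of splitting an argument of the form `image[:partition]:path`. The format is only inferred from the two arguments in the command above, and `target_t` and `parse_target` are hypothetical names. This is not how DITo parses its arguments, and image file names containing a colon are not handled.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: which image, which partition, which path. */
typedef struct
{
  char image[256];
  int partition; /* 0 means no partition was given: use the whole image */
  char path[256];
} target_t;

/* Split "image:path" or "image:N:path". Returns 0 on success, -1 on error. */
int parse_target(const char *spec, target_t *out)
{
  const char *colon = strchr(spec, ':');
  if(!colon || colon == spec)
    return -1; /* no separator, or empty image name */

  size_t image_len = (size_t)(colon - spec);
  if(image_len >= sizeof(out->image))
    return -1;
  memcpy(out->image, spec, image_len);
  out->image[image_len] = '\0';

  const char *rest = colon + 1;
  out->partition = 0;

  /* An all-digit field followed by a second colon is a partition number. */
  const char *second = strchr(rest, ':');
  if(second && second != rest)
  {
    const char *p = rest;
    while(p < second && isdigit((unsigned char)*p))
      p++;
    if(p == second)
    {
      if(second - rest > 4)
        return -1; /* refuse absurdly long partition numbers */
      out->partition = atoi(rest); /* atoi stops at the colon */
      if(out->partition == 0)
        return -1; /* partitions are counted from 1 in this sketch */
      rest = second + 1;
    }
  }

  size_t path_len = strlen(rest);
  if(path_len == 0 || path_len >= sizeof(out->path))
    return -1;
  memcpy(out->path, rest, path_len + 1); /* include the terminating NUL */
  return 0;
}
```

With something like this in place, `dito-cp` would parse both of its arguments, open the drives and filesystems the way the longer example does, and then run the same copy loop.
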