KFS-4
This is the fourth release of KFS. Here's what we've been working on:
Interrupts
The main focus of this release has been enabling interrupts in KFS, and writing interrupt and exception handlers.
The first step to do that is to set up an Interrupt Descriptor Table, basically a table of pointers to irq/exception handler functions.
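To make "a table of pointers" a bit more concrete, here is a minimal sketch of what one entry of a 32-bit IDT looks like. The type and constant names are ours for illustration and are not necessarily the ones used in KFS; the layout and the type/attribute bytes are the standard x86 ones.

```rust
/// Type/attribute bytes for present, ring 0 gates (standard x86 encodings).
pub const GATE_INTERRUPT_32: u8 = 0x8E;
pub const GATE_TRAP_32: u8 = 0x8F;
pub const GATE_TASK: u8 = 0x85;

/// One 8-byte IDT entry. `IdtEntry` is a hypothetical name for this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(C)]
pub struct IdtEntry {
    /// Bits 0..16 of the handler address (unused for a task gate).
    pub offset_low: u16,
    /// Code segment selector (or TSS selector for a task gate).
    pub selector: u16,
    /// Reserved, must be zero.
    pub zero: u8,
    /// Present bit, privilege level and gate type.
    pub type_attr: u8,
    /// Bits 16..32 of the handler address (unused for a task gate).
    pub offset_high: u16,
}

impl IdtEntry {
    /// Splits the 32-bit handler address into the two halves the cpu expects.
    pub const fn new(handler: u32, selector: u16, type_attr: u8) -> Self {
        IdtEntry {
            offset_low: handler as u16,
            selector,
            zero: 0,
            type_attr,
            offset_high: (handler >> 16) as u16,
        }
    }

    /// Reassembles the handler address, mostly useful for debugging.
    pub const fn handler_address(&self) -> u32 {
        ((self.offset_high as u32) << 16) | (self.offset_low as u32)
    }
}
```

The IDT itself is then just an array of 256 such entries, whose address and size are handed to the cpu with the lidt instruction.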
Exceptions
In the interrupts module we define functions to handle all of the x86 cpu exceptions. We put pointers to these handlers in the IDT, so that they are called when an exception occurs.
Almost all of them cause a kernel panic for now, sometimes with more helpful context (e.g. the page fault handler reads cr2 to get the address that caused the page fault).
We use llvm's x86-interrupt calling convention to back up and restore registers for us, and to handle the optional error code pushed by the cpu.
Double faulting
Stack overflowing causes a page fault exception, thanks to the page guard at the end of every KernelStack. The page fault handler is invoked, but it also runs on the same invalid stack, and this causes a page fault again, which is translated to a Double Fault. We don't want to do that a third time, otherwise we would triple fault and reset the cpu.
Because of that, the double fault handler uses a Task Gate (instead of the usual Trap Gate for every other exception), so that the cpu can switch the stack before invoking the Double Fault exception handler. This is the only place in the kernel where we use TSSs.
The double fault handler just kernel panics, because clearly something went really wrong in the kernel.
PIC
Onto interrupts now.
First of all, we wrote a Programmable Interrupt Controller driver. We configure it in cascade mode, with IRQ line 2 of the master cascading to a second 8259. When an interrupt occurs, we acknowledge it in the irq handler.
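For reference, the classic initialization sequence for a pair of cascaded 8259s can be sketched as follows. This is not the KFS driver: port writes go through a hypothetical PortWriter trait so the sequence can be checked without hardware, and the vector offsets are parameters we made up. The port numbers and command words are the standard PC ones.

```rust
/// Hypothetical abstraction over the `out` instruction.
pub trait PortWriter {
    fn write_u8(&mut self, port: u16, value: u8);
}

const MASTER_CMD: u16 = 0x20;
const MASTER_DATA: u16 = 0x21;
const SLAVE_CMD: u16 = 0xA0;
const SLAVE_DATA: u16 = 0xA1;
const END_OF_INTERRUPT: u8 = 0x20;

/// Initializes both PICs in cascade mode and remaps their vectors.
pub fn init_pics<P: PortWriter>(io: &mut P, master_offset: u8, slave_offset: u8) {
    // ICW1: start initialization, cascade mode, ICW4 will follow.
    io.write_u8(MASTER_CMD, 0x11);
    io.write_u8(SLAVE_CMD, 0x11);
    // ICW2: first IDT vector used by each PIC.
    io.write_u8(MASTER_DATA, master_offset);
    io.write_u8(SLAVE_DATA, slave_offset);
    // ICW3: the master has a slave on line 2 (bitmask), the slave's identity is 2.
    io.write_u8(MASTER_DATA, 0x04);
    io.write_u8(SLAVE_DATA, 0x02);
    // ICW4: 8086 mode.
    io.write_u8(MASTER_DATA, 0x01);
    io.write_u8(SLAVE_DATA, 0x01);
}

/// Acknowledges an irq. Lines 8..16 come through the slave, so both PICs need an EOI.
pub fn acknowledge<P: PortWriter>(io: &mut P, irq: u8) {
    if irq >= 8 {
        io.write_u8(SLAVE_CMD, END_OF_INTERRUPT);
    }
    io.write_u8(MASTER_CMD, END_OF_INTERRUPT);
}

/// Test double that records every write instead of touching hardware.
#[derive(Default)]
pub struct RecordingPorts {
    pub writes: Vec<(u16, u8)>,
}

impl PortWriter for RecordingPorts {
    fn write_u8(&mut self, port: u16, value: u8) {
        self.writes.push((port, value));
    }
}
```

On real hardware you may also need a short delay between writes on old machines, which this sketch leaves out.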
PIT
Now that we have interrupts, we set up the Programmable Interval Timer in periodic mode, so it delivers an interrupt every millisecond. We acknowledge the interrupt, and use it to track time when processes are sleeping for an arbitrary amount of time.
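The "every millisecond" part boils down to a division: the PIT counts down from a reload value at its fixed input frequency of about 1.193182 MHz. A small sketch of that computation, assuming channel 0 in mode 2 (rate generator); the function and constant names are ours:

```rust
/// Input clock of the PIT, in Hz.
pub const PIT_BASE_HZ: u32 = 1_193_182;

/// Command byte: channel 0, lobyte/hibyte access, mode 2 (rate generator), binary.
pub const PIT_CHANNEL0_RATE_GENERATOR: u8 = 0b0011_0100;

/// Reload value to program for a given interrupt frequency.
/// Returns None when the frequency cannot be expressed with a 16-bit divisor.
pub fn pit_divisor(target_hz: u32) -> Option<u16> {
    if target_hz == 0 {
        return None;
    }
    let divisor = PIT_BASE_HZ / target_hz;
    // Mode 2 does not accept a reload value of 1, and we only have 16 bits.
    if divisor < 2 || divisor > u16::MAX as u32 {
        None
    } else {
        Some(divisor as u16)
    }
}
```

For 1000 Hz this gives a divisor of 1193, which is slightly faster than one interrupt per millisecond because of the integer division.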
Events
We're a micro-kernel, so irq handlers live in userspace. Let's consider the case of an HDD driver: the driver makes a request to read a given number of disk sectors via DMA, and then waits for an IRQ from the disk signaling completion. The kernel unschedules the driver for now, and runs some other process while the IRQ hasn't been received. Eventually, the disk will perform the request and signal an interrupt. At this point we're still running some other process. The cpu will enter kernel mode, and call our irq handler. The kernel irq handler simply increments a monotonic counter for the given irq line, notices that a process was waiting on it, wakes it up by adding it to the schedule queue, and returns to the process it was running. When that process eventually yields, the driver will finally be scheduled, and its irq handler code will be able to run.
From the userspace's point of view, this is all represented by the IRQEvent handle type. The driver registers its interest in IRQs on a given IRQ line with the kernel by waiting on the handle, and the kernel will only wake it up after an IRQ has been signaled on the given line.
The counters are here to prevent dropping irqs: for every line, the kernel keeps a monotonic counter of all the irqs that ever happened on this line. An IRQEvent handle holds the number of irqs it has acknowledged on its line. If you register for an irq, sleep, and 3 interrupts are triggered before you are actually scheduled, the 2 other interrupts are not lost. When you wait on the handle again you will be woken up immediately, incrementing the ack count on your handle. Waiting on an IRQEvent effectively consumes one interrupt event. You are expected to consume them one by one.
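The two counters can be modeled in a few lines. This is only a sketch of the idea, not the kernel's code: IrqLine, IrqEvent and try_consume are names we picked, and we assume a fresh handle starts at the line's current count, so it only sees irqs triggered after it was created.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Kernel side: how many irqs were ever triggered on one line.
pub struct IrqLine {
    triggered: AtomicUsize,
}

impl IrqLine {
    pub const fn new() -> Self {
        IrqLine { triggered: AtomicUsize::new(0) }
    }

    /// What the kernel irq handler does for this line.
    pub fn signal(&self) {
        self.triggered.fetch_add(1, Ordering::SeqCst);
    }
}

/// Handle side: how many irqs this handle has acknowledged so far.
pub struct IrqEvent<'a> {
    line: &'a IrqLine,
    acked: usize,
}

impl<'a> IrqEvent<'a> {
    /// Assumption: a new handle ignores irqs that happened before its creation.
    pub fn new(line: &'a IrqLine) -> Self {
        IrqEvent { line, acked: line.triggered.load(Ordering::SeqCst) }
    }

    /// Consumes one pending irq if there is one.
    /// `true` is the case where a waiter would be woken up immediately,
    /// `false` is the case where it would be put to sleep.
    pub fn try_consume(&mut self) -> bool {
        if self.acked < self.line.triggered.load(Ordering::SeqCst) {
            self.acked += 1;
            true
        } else {
            false
        }
    }
}
```

With three calls to signal before the driver runs, try_consume succeeds three times in a row and then reports that nothing is pending, which is the "consume them one by one" behaviour described above.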
This is all in theory, as we don't have a scheduler yet. But we implemented it anyway, so we don't have to do it in the future.
Syscall
Finally, we implemented a Proof of Concept syscall handler. The ABI is linuxy: you pass the syscall number in eax, up to six other arguments in the other registers, and then do a
int 0x80
This will enter kernel mode and call our syscall handler function, which will unpack the arguments, dispatch to the appropriate syscall based on eax, repack the return values into the registers, and iret.
For clients, we provide the
pub unsafe fn syscall(syscall_nr: u32, arg1: u32, arg2: u32, arg3: u32, arg4: u32, arg5: u32, arg6: u32) -> u32 { ... }
function that performs the argument packing, int 0x80, and result unpacking for them.
Once again, we don't have a userspace yet, but this is all already testable in the kernel for now.
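The kernel side of that dance can be sketched without any assembly by treating the saved registers as a plain struct. Everything below is illustrative: the argument order (ebx, ecx, edx, esi, edi, ebp) is the Linux i386 one, which we assume since the ABI is "linuxy", and the two syscalls and the error value are made up for the example.

```rust
/// Registers saved on syscall entry (hypothetical struct for this sketch).
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct Registers {
    pub eax: u32,
    pub ebx: u32,
    pub ecx: u32,
    pub edx: u32,
    pub esi: u32,
    pub edi: u32,
    pub ebp: u32,
}

// Made-up syscall numbers, only here to have something to dispatch to.
pub const SYS_NOP: u32 = 0;
pub const SYS_ADD: u32 = 1;
/// Made-up error value returned for unknown syscall numbers.
pub const ERR_UNKNOWN_SYSCALL: u32 = u32::MAX;

/// Unpacks the arguments, dispatches on eax, and stores the result back in eax.
pub fn syscall_handler(regs: &mut Registers) {
    let args = [regs.ebx, regs.ecx, regs.edx, regs.esi, regs.edi, regs.ebp];
    let result = match regs.eax {
        SYS_NOP => 0,
        SYS_ADD => args[0].wrapping_add(args[1]),
        _ => ERR_UNKNOWN_SYSCALL,
    };
    // The interrupt return path restores eax from this struct,
    // which is how the caller gets its return value.
    regs.eax = result;
}
```

Because the handler only sees a Registers value, it can be exercised from inside the kernel by building one by hand, without going through int 0x80.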
SpinLockIRQ
Because we have interrupts now, we might be interrupted at undesired moments. To prevent that, we provide the SpinLockIRQ type, which wraps and behaves like a spin::Mutex, but also disables interrupts while it is held, ensuring one's critical section won't be interrupted.
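The interesting part of such a lock is the ordering: interrupts go off before taking the lock, and come back on only after it has been released. Here is a hosted sketch of that idea, not the kernel's implementation: an AtomicBool stands in for the cpu interrupt flag, std::sync::Mutex stands in for spin::Mutex, and all names are ours.

```rust
use std::ops::{Deref, DerefMut};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Mutex, MutexGuard};

/// Stand-in for the cpu interrupt flag (a real kernel would use cli/sti).
static INTERRUPTS_ENABLED: AtomicBool = AtomicBool::new(true);

pub fn interrupts_enabled() -> bool {
    INTERRUPTS_ENABLED.load(Ordering::SeqCst)
}

/// Re-enables interrupts on drop, but only if they were enabled before.
struct RestoreIrq {
    were_enabled: bool,
}

impl Drop for RestoreIrq {
    fn drop(&mut self) {
        if self.were_enabled {
            INTERRUPTS_ENABLED.store(true, Ordering::SeqCst);
        }
    }
}

pub struct SpinLockIrq<T> {
    inner: Mutex<T>,
}

/// Fields are dropped in declaration order: the lock is released first,
/// and only then are interrupts restored.
pub struct SpinLockIrqGuard<'a, T> {
    guard: MutexGuard<'a, T>,
    _restore: RestoreIrq,
}

impl<T> SpinLockIrq<T> {
    pub fn new(value: T) -> Self {
        SpinLockIrq { inner: Mutex::new(value) }
    }

    pub fn lock(&self) -> SpinLockIrqGuard<'_, T> {
        // Disable interrupts first, remembering the previous state.
        let were_enabled = INTERRUPTS_ENABLED.swap(false, Ordering::SeqCst);
        let guard = self.inner.lock().expect("lock poisoned");
        SpinLockIrqGuard { guard, _restore: RestoreIrq { were_enabled } }
    }
}

impl<T> Deref for SpinLockIrqGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.guard
    }
}

impl<T> DerefMut for SpinLockIrqGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.guard
    }
}
```

Remembering the previous state matters when locks are nested: dropping the inner guard must not re-enable interrupts while the outer one is still held.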
PS/2 keyboard driver
We have interrupts, we can do I/O! We implemented a PS/2 keyboard driver, that waits for interrupts on line 1, reads the scan code(s), and maps them to ascii letters or control keys (Ctrl, Shift, CapsLock, F12, Pause, www Search (yes that's a real key), Email, ...). It handles capitalization if the shift key is pressed or caps lock is enabled.
It also exposes a get_next_line function to read a full line, blocking while doing so.
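The scan code to letter mapping is a small state machine. The sketch below handles only three letters plus shift and caps lock, using scan code set 1 values (a release is the press code with the high bit set). The KeyboardState name is ours, and having shift and caps lock cancel each other out is an assumption about the desired behaviour, not something taken from the driver.

```rust
/// Modifier state tracked between scan codes (hypothetical type).
#[derive(Default)]
pub struct KeyboardState {
    shift: bool,
    caps_lock: bool,
}

impl KeyboardState {
    /// Feeds one scan code (set 1) and returns the character it produces, if any.
    pub fn feed(&mut self, scancode: u8) -> Option<char> {
        match scancode {
            // Left or right shift pressed.
            0x2A | 0x36 => {
                self.shift = true;
                None
            }
            // Left or right shift released.
            0xAA | 0xB6 => {
                self.shift = false;
                None
            }
            // Caps lock pressed: toggle.
            0x3A => {
                self.caps_lock = !self.caps_lock;
                None
            }
            // Any other key release produces nothing.
            code if code & 0x80 != 0 => None,
            code => {
                let letter = match code {
                    0x1E => 'a',
                    0x30 => 'b',
                    0x2E => 'c',
                    _ => return None,
                };
                // Assumption: shift and caps lock cancel each other for letters.
                if self.shift ^ self.caps_lock {
                    Some(letter.to_ascii_uppercase())
                } else {
                    Some(letter)
                }
            }
        }
    }
}
```

A get_next_line on top of this would keep feeding scan codes and pushing the resulting characters into a buffer until it sees the Enter key.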
Font rendering
Displaying gifs is fun, but sooner or later, we'll want to display text to the screen. As always, sooner it is.
We could just take a bitmap font like GNU Unifont and blit it to the screen, but where's the fun in that ? Plus, Unifont is really ugly, and I really like the monaco typeface.
So let's do font rendering in the kernel! We use the font-rs crate to parse the monaco.ttf file, and do the font rendering.
Rendering the 5 char gives us something like that:
Great, let's blit it to the screen !
Oh well ... Turns out font-rs doesn't handle baselines and horizontal metrics. Unless you want to look like a serial killer blackmailing their victims, it's not acceptable. Let's fork it, export the hhea table, do some dubious geometric arithmetic, and we have a vga text logger. Using the generic logger we wrote in kfs3, we can multiplex all logs to both the serial port and the screen.
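The "dubious arithmetic" is roughly this: the hhea table gives the ascender, descender and line gap in font units, and you scale them to pixels to know where the baseline sits inside a line of text. A sketch with integer math, where the struct and function names are ours and the numbers used with it are invented, not Monaco's real metrics:

```rust
/// The three `hhea` values we care about, in font units (hypothetical struct).
pub struct HorizontalHeader {
    pub ascender: i32,
    pub descender: i32, // negative for most fonts
    pub line_gap: i32,
}

/// Line layout in pixels.
#[derive(Debug, PartialEq, Eq)]
pub struct LineMetrics {
    /// Distance from the top of the line to the baseline.
    pub baseline: i32,
    /// Distance between the tops of two consecutive lines.
    pub line_height: i32,
}

/// Converts font units to pixels, rounding up.
fn to_pixels(units: i32, px_size: i32, units_per_em: i32) -> i32 {
    (units * px_size + units_per_em - 1).div_euclid(units_per_em)
}

pub fn line_metrics(hhea: &HorizontalHeader, units_per_em: i32, px_size: i32) -> LineMetrics {
    let ascent = to_pixels(hhea.ascender, px_size, units_per_em);
    let descent = to_pixels(-hhea.descender, px_size, units_per_em);
    let gap = to_pixels(hhea.line_gap, px_size, units_per_em);
    LineMetrics { baseline: ascent, line_height: ascent + descent + gap }
}

/// Screen y of the top row of a glyph bitmap, given how far the glyph
/// rises above the baseline (in pixels).
pub fn glyph_top_y(line_top: i32, metrics: &LineMetrics, glyph_rise: i32) -> i32 {
    line_top + metrics.baseline - glyph_rise
}
```

With this, a short glyph like a comma and a tall one like 5 end up sitting on the same baseline instead of all being aligned on their top edge.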
Shell
We have an input method, the keyboard, and an output method, the vga text logger. Let's do a shell!
Well ... let's do a read-eval loop that recognizes four hardcoded commands.
The commands are:
- help: display the list of known commands
- stackdump: display a stack dump
- gif3: display the KFS-3 gif meme
- gif4: display the KFS-4 meme
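The "eval" half of such a loop is little more than a match on the line that was read. A minimal sketch, where the Command enum and parse_command are illustrative names rather than the kernel's:

```rust
/// The hardcoded commands, plus a catch-all for anything else.
#[derive(Debug, PartialEq, Eq)]
pub enum Command<'a> {
    Help,
    StackDump,
    Gif3,
    Gif4,
    Unknown(&'a str),
}

/// Parses one line as typed by the user. Returns None for an empty line.
pub fn parse_command(line: &str) -> Option<Command<'_>> {
    let name = line.trim();
    if name.is_empty() {
        return None;
    }
    Some(match name {
        "help" => Command::Help,
        "stackdump" => Command::StackDump,
        "gif3" => Command::Gif3,
        "gif4" => Command::Gif4,
        other => Command::Unknown(other),
    })
}
```

The loop then reads a line from the keyboard, calls parse_command, and runs whatever the resulting variant stands for.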
Stack dump
The stackdump command dumps the stack in an xxd-like fashion, going down frame by frame like (gdb) backtrace does, using the pushed ebp and eip at the start of each frame to know where the next one starts.
$ stackdump
---------- Dumping stack ---------
# Stack start: 0xe01c9000, Stack end: 0xe01ccffc
> Frame #0 - eip: 0xc0004575 - esp: 0xe01cce98 - ebp: 0xe01ccf20
0xe01cce98: 0100 0000 0100 0000 f4ce 1ce0 d0dc 01c0 ................
0xe01ccea8: f0ce 1ce0 10df 01c0 e0ce 1ce0 10df 01c0 ................
0xe01cceb8: c4ce 1ce0 10df 01c0 0900 0000 20cf 1ce0 ................
0xe01ccec8: 4c4b 02c0 0500 0000 744b 02c0 0400 0000 LK......tK......
0xe01cced8: a0ce 1ce0 0400 0000 98ce 1ce0 fccf 1ce0 ................
0xe01ccee8: 98ce 1ce0 0090 1ce0 7545 00c0 0000 0000 ........uE......
0xe01ccef8: 20cf 1ce0 7545 00c0 e8ce 1ce0 2011 02c0 ....uE..........
0xe01ccf08: 0300 0000 40cf 1ce0 28cf 1ce0 40cf 1ce0 ....@...(...@...
0xe01ccf18: 28cf 1ce0 3cd7 1bc0 (...<...
Saved ebp: 0xe01ccf60 @ 0xe01ccf20 (ebp) - Saved eip: 0xc000c165 @ 0xe01ccf24 (ebp + 4)
> Frame #1 - eip: 0xc000c165 - esp: 0xe01ccf20 - ebp: 0xe01ccf60
0xe01ccf20: 60cf 1ce0 65c1 00c0 0100 0000 1c74 02c0 `...e........t..
0xe01ccf30: 0a00 0000 2674 02c0 1200 0000 0100 0000 ....&t..........
0xe01ccf40: 3cd7 1bc0 1000 0000 0900 0000 0500 0000 <...............
0xe01ccf50: 0100 0000 0400 0000 0500 0000 0100 0000 ................
Saved ebp: 0x00000006 @ 0xe01ccf60 (ebp) - Saved eip: 0xc000c013 @ 0xe01ccf64 (ebp + 4)
> Frame #2 - eip: 0xc000c013 - esp: 0xe01ccf60 - ebp: 0x00000006
Invalid ebp
-------- End of stack dump --------
The output is really clear, it only lacks symbol resolution to display the name of the function of each frame. This will be fixed in the next release.
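The frame walking itself is a short loop: the saved ebp lives at [ebp] and the return address at [ebp + 4], so each frame tells you where the previous one is. Here is a sketch that walks a copy of the stack held in a slice instead of raw memory. StackView, Frame and walk_frames are our names, and the max_frames bound is our addition to guard against corrupted chains.

```rust
/// A read-only view of some stack memory, as 32-bit words starting at `base`.
pub struct StackView<'a> {
    pub base: u32,
    pub words: &'a [u32],
}

impl<'a> StackView<'a> {
    /// Reads the word at `addr`, or None if it is unaligned or outside the view.
    fn read(&self, addr: u32) -> Option<u32> {
        let offset = addr.checked_sub(self.base)?;
        if offset % 4 != 0 {
            return None;
        }
        self.words.get((offset / 4) as usize).copied()
    }
}

#[derive(Debug, PartialEq, Eq)]
pub struct Frame {
    pub eip: u32,
    pub ebp: u32,
}

/// Follows the chain of saved ebp values, starting from the current eip/ebp.
/// Stops at the first frame whose ebp does not point inside the stack.
pub fn walk_frames(stack: &StackView<'_>, eip: u32, ebp: u32, max_frames: usize) -> Vec<Frame> {
    let mut frames = Vec::new();
    let (mut eip, mut ebp) = (eip, ebp);
    while frames.len() < max_frames {
        frames.push(Frame { eip, ebp });
        // Saved ebp is at [ebp], the return address right above it at [ebp + 4].
        let saved_ebp = stack.read(ebp);
        let saved_eip = ebp.checked_add(4).and_then(|addr| stack.read(addr));
        match (saved_ebp, saved_eip) {
            (Some(next_ebp), Some(next_eip)) => {
                ebp = next_ebp;
                eip = next_eip;
            }
            // Invalid ebp: this was the last frame we can describe.
            _ => break,
        }
    }
    frames
}
```

Fed with the saved values from the dump above (ebp 0xe01ccf20 pointing to 0xe01ccf60, which holds 0x00000006), it yields the same three frames and stops on the invalid ebp.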
Bootstrap
We want the kernel to be mapped in the high addresses, like real biggies. However, when Grub passes control to us, paging is not enabled, and we are loaded in low addresses. From there, we can accomplish what we want in two ways:
- Have the kernel be a PIE loaded in lower addresses, which sets-up the MMU when it starts, both identity maps itself in lower, and also in higher addresses, enables the paging, relocates all its symbols in higher addresses (.dynamic ?), jumps to code in higher addresses, and finally unmaps itself from the lower addresses.
- Split the kernel into two stages. A bootstrap stage loaded by Grub, which is in charge of enabling the paging, loading stage 2's ELF in higher addresses, and finally passing control to it.
We chose to go down the second road: it is easier for now, since we don't want to handle ELF relocations just yet. Plus, it lets us split the kernel code into a really arch-specific bootstrap stage, and a more architecture-agnostic kernel stage.
ELF
The kernel is a simple static ELF whose linker script puts the .text, .data, and .bss in high addresses.
The bootstrap implements a really simple ELF loader to load the kernel's ELF. We use the xmas-elf crate to parse the ELF file format. We only need to get the program headers, find the PT_LOAD segments and load them. This is done in only 2 functions, and this is really beautiful.
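Loading a PT_LOAD segment means copying file_size bytes from the file to the segment's virtual address, and zeroing the rest up to mem_size (that tail is where .bss ends up). The sketch below shows just that step on plain slices. It does not use xmas-elf: LoadSegment is a hypothetical struct holding the fields you would read from a program header, and `memory` stands in for the freshly mapped high addresses.

```rust
/// The program header fields needed to load one PT_LOAD segment (hypothetical struct).
pub struct LoadSegment {
    pub file_offset: usize,
    pub file_size: usize,
    pub virt_addr: usize,
    pub mem_size: usize,
}

/// Copies one segment from `elf` into `memory`, which represents the
/// address range starting at `memory_base`.
pub fn load_segment(
    elf: &[u8],
    seg: &LoadSegment,
    memory: &mut [u8],
    memory_base: usize,
) -> Result<(), &'static str> {
    if seg.file_size > seg.mem_size {
        return Err("file_size is larger than mem_size");
    }
    let src_end = seg
        .file_offset
        .checked_add(seg.file_size)
        .ok_or("segment offset overflows")?;
    let src = elf
        .get(seg.file_offset..src_end)
        .ok_or("segment lies outside of the ELF file")?;
    let start = seg
        .virt_addr
        .checked_sub(memory_base)
        .ok_or("segment is below the mapped region")?;
    let end = start
        .checked_add(seg.mem_size)
        .ok_or("segment address overflows")?;
    let dst = memory
        .get_mut(start..end)
        .ok_or("segment lies outside of the mapped region")?;
    // Bytes present in the file are copied as-is...
    dst[..seg.file_size].copy_from_slice(src);
    // ...and whatever is left up to mem_size is zero-filled.
    dst[seg.file_size..].fill(0);
    Ok(())
}
```

In the bootstrap, the destination pages have to be mapped before the copy; that part is left out here.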
Lands
This means that the user/kernel split on i386 is now the following:
0x00000000..0xbfffffff, 3 GiB: User memory
0xc0000000..0xffffffff, 1 GiB: Kernel memory
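In code, the split is a single comparison. A tiny sketch, with names of our own choosing:

```rust
/// First address of kernel memory on i386.
pub const KERNEL_LAND_START: u32 = 0xc000_0000;

#[derive(Debug, PartialEq, Eq)]
pub enum Land {
    User,
    Kernel,
}

/// Tells which side of the split an address falls on.
pub const fn land_of(addr: u32) -> Land {
    if addr >= KERNEL_LAND_START {
        Land::Kernel
    } else {
        Land::User
    }
}
```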
How to build
This release is pretty outdated, so we provide a roadmap on how to build it.
First, make sure to check out the kfs4 tag.
git switch --detach kfs4
cargo-xbuild
You need to downgrade your version of cargo-xbuild to v0.4.9. You must do this outside of your repo as the rust-toolchain file would try to compile it with a non-working rustc.
cd ~
cargo install cargo-xbuild --force --version 0.4.9
cd -
Fix the linker scripts
The OUTPUT_FORMAT somewhat changed, use elf32-i386 instead.
sed -i 's/OUTPUT_FORMAT(elf-i386)/OUTPUT_FORMAT(elf32-i386)/' linker-scripts/kernel.ld linker-scripts/bootstrap.ld
grub-mkrescue
Because sunrise developers were lazy-ass monkeys, busy being high on smoking rolled-up intel-specs pretending it's some kind of psychotrope, they used to let all the hard work of creating an iso image be done by grub-mkrescue, while taking all the credit.
This means that depending on your distro, you might need to install:
- grub
- mtools
This went on until Thog joined them and created mkisofs-rs, and effectively rescued them, before they fell victim to 8042 withdrawal, and OD'd trying to snort an A20 line while attempting to relieve the craving.
Building
Now you should finally be able to build and create the boot iso:
cargo make iso-release