Dev'ing an OS

by Shikhin Sethi

As times have progressed, people have shifted from assembly languages to higher level languages; from magazine code listings to the Internet; and from systems to application programming.  With this progression of time, the difference between the two - "systems" and "application" development - has broadened, making the journey to learning systems programming even more difficult.

Nowadays, systems programming isn't even taught in colleges and courses.  Children are made to learn Java as well as other languages with garbage collection and other "features."  This article aims at bringing a programmer versed in C on the path to becoming an experienced systems developer.

Prerequisites

*  Knowledge of C.  Perhaps the language most common in systems development, and the one everyone learns (or used to learn) as a beginner is C.  Knowledge of this is absolutely necessary since this article delves into things like pointers without even a single thought that the reader doesn't know what they are.

*  Knowledge of UNIX.  You should know how to use the command line in UNIX, compile simple files using GCC (at the command line), and other necessary stuff.

*  A little knowledge of assembly.  While not absolutely necessary, you should have some knowledge of assembly.  If you don't, though, don't worry, since I will also be teaching basics of assembly along the way.

*  It's almost surprising that some people who know the above don't have any basic knowledge of hexadecimal numbers.  Thus, be sure that you go through hexadecimal before reading on.

*  Most importantly, you must have good Googling skills, i.e., you must always query Google whenever in doubt.

*  And of course, you must have an Internet connection and a computer!

*  Oh, and the computer must (preferably) have Linux installed on it.  If you're using Windows and don't want to install Linux on the machine, you can always use a virtual machine.

Scope of This Article

This article attempts to give the reader a basic understanding of systems development.  The basic structure that it follows is:

A basic review of the boot process.  This should tell you how the computer actually starts, and what all is happening under the hood.  Following this is a basic explanation of Real Mode - the 16-bit initial mode that the BIOS leaves the computer in.

A review of x86 assembly follows for those who are unfamiliar with it.

We start by explaining how to install your choice of assembler.  Then, a bit about registers is explained.  That is followed by how to address, declare, and access memory.  A bit on the x86 stack follows.  The review then gives a reference where you can go through all the basic instructions.  In the end, the useful link to the manual of the assembler is given.

Since interrupts are about the only way to communicate with the BIOS, an explanation of them is given.  After all the theory, we start writing our very basic bootloader.  This section mostly contains assembly source code, with explanations in the form of comments and build instructions.  Since the article is rather short, instructions on how to proceed from here are given.

Review of the Boot Process

As soon as you click the power button on your computer (or laptop), surprisingly, it whirs to life.  The first thing to happen is that the motherboard starts up and initializes the memory controller, the chipset among other such things.  It then initializes the processor(s).

(Tidbit:  You might be wondering what happens when there are several processors in the system.  In such a case, a processor is dynamically chosen to run the BIOS as well as continue the initialization.  This processor is known as the BootStrap Processor, or the BSP.  The other processors are known as the Application Processors, and are halted until the Operating System wants to initialize them.)

The processor then starts executing the Basic Input/Output System, a.k.a. the BIOS.  The BIOS - the firmware - starts by doing the Power-On Self-Test (POST - funny acronyms, eh?), which looks for and initializes peripherals in the system.

As soon as all of the peripherals have been identified and initialized, the BIOS starts looking for the first stage of an Operating System - the bootloader.  The BIOS loads the bootloader to the memory address 0x7C00, where the bootloader performs its functions.  For now, just know that the bootloader's job is to load the Operating System from the disk and "jump" to it.  We'll be going on to the bootloader in more detail in just a few seconds!

We could perhaps go into more details related to the boot process, but, for the moment, it's better to just leave it at that.

Real Mode

The BIOS leaves the processor in a 16-bit initial mode, known as the Real Mode.  This mode has no hardware-based memory protection, and, thus, any program can execute anything.  The default operand length is 16-bit, and only about 1 MiB of memory can be accessed.

While this mode has been superseded by (32-bit) Protected Mode, to maintain compatibility with legacy operating systems it is still present.  Moreover, it is the only practical mode via which you can access the BIOS functions - useful for gathering a memory map, reading the disk, among other functions required during boot.

Review of x86 Assembly

Every microprocessor has its own set of commands that it understands, each encoded as a series of highs and lows - 1s and 0s (binary).  These encoded commands are what the machine can understand, and are known as machine instructions.

Since it's very difficult to remember these complex binary numbers, people implement programs known as assemblers which try to abstract away the machine instructions by taking in more understandable statements (in English) and translating them to machine instructions.

Since the syntax of the assembly languages is easy enough, and there is no standardized way to represent the instructions, people make their own dialects.  As of now, there are two major dialects for x86 assembly - Intel and AT&T.  While we will be using the Intel style of x86 assembly throughout this article, the difference is minimal, and you can switch to AT&T if you want to.

Installing the Assembler

For those who have chosen the Intel dialect, one of the best assemblers I have found is NASM.  For the AT&T pickers, GNU Assembler (GAS) is a good assembler.

Installing either assembler through your package manager is easy.  For Debian users:
$ sudo apt-get install nasm
or
$ sudo apt-get install binutils
should install the respective assemblers (GAS ships as part of the GNU binutils package, as the "as" command).

For Fedora users:
$ sudo yum install nasm
or
$ sudo yum install binutils
should install the assemblers.  In case your package manager does not contain the above packages, the source of the assemblers can be downloaded from their sites (www.nasm.us, www.gnu.org/software/binutils), and compiled by hand.

Registers!

Just as you use temporary variables in higher-level languages, the x86 provides you with a set of eight 32-bit general purpose registers: EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP - whose names are mainly historical.  The main difference between these registers and memory variables is the fact that the registers are located on the CPU, and can be accessed faster than the memory (and the cache).

The EAX register (or eax - NASM is only case sensitive about symbols) was mainly used as the accumulator register (for arithmetic purposes), EBX as the base register (for addresses), ECX as the count register (for counters in loops), EDX as the data register, ESI to point to the source address (in string instructions), EDI to point to the destination address (in string instructions), and ESP and EBP for managing the stack (more on the stack later).  However, except for ESP and EBP, it isn't necessary to use the rest of the registers for their destined purpose.

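The EAX, EBX, ECX, and EDX registers are split up into smaller 16-bit registers, and eventually 8-bit registers.  The other four also have 16-bit lower halves (SI, DI, SP, and BP), which is what you will mostly use in Real Mode.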

*  EAX is split up into AX as the lower 16 bits.  AX is also split up into AH (upper 8 bits) and AL (lower 8 bits).

*  EBX is split up into BX as the lower 16 bits.  BX is also split up into BH (upper 8 bits) and BL (lower 8 bits).

*  ECX is split up into CX as the lower 16 bits.  CX is also split up into CH (upper 8 bits) and CL (lower 8 bits).

*  EDX is split up into DX as the lower 16 bits.  DX is also split up into DH (upper 8 bits) and DL (lower 8 bits).
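
If the splitting feels abstract, it can be modelled in C with shifts and masks.  This is only a sketch: low16, high8, low8, and write_low8 are names I made up for illustration, and a real CPU does all of this in hardware rather than with arithmetic.

```c
#include <stdint.h>

/* Lower 16 bits of a 32-bit register value (EAX -> AX). */
uint16_t low16(uint32_t reg32) {
    return (uint16_t)(reg32 & 0xFFFFu);
}

/* Upper 8 bits of a 16-bit register value (AX -> AH). */
uint8_t high8(uint16_t reg16) {
    return (uint8_t)(reg16 >> 8);
}

/* Lower 8 bits of a 16-bit register value (AX -> AL). */
uint8_t low8(uint16_t reg16) {
    return (uint8_t)(reg16 & 0xFFu);
}

/* Writing AL replaces only the lowest byte; the rest of EAX is untouched. */
uint32_t write_low8(uint32_t reg32, uint8_t value) {
    return (reg32 & 0xFFFFFF00u) | value;
}
```

For an EAX of 0x12345678, low16 gives AX = 0x5678, and high8 and low8 of that give AH = 0x56 and AL = 0x78.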

Memory!  Memory!  Memory!

Addressing

Memory in Real Mode can be accessed via segmentation, in which any physical memory address can be accessed in the form Segment:Offset.

The Segment and Offset are both 16-bit, and the pair represents the physical address: (Segment * 16) + Offset

The mathematician might have noticed that a physical address can thus be accessed via several different Segment:Offset pairs.  For example, the physical address 0x0FF0 can be written as any of:
0000:0FF0
00F0:00F0
00FF:0000

While the general purpose registers can be used to store the offset, storing the segment requires special registers.  For this purpose, the following segment registers are present:

CS or Code Segment  This is the segment register for all the code.

DS or Data Segment  This is the segment register for all the data.

ES or Extra Segment  This is the extra segment register, for other uses.

FS  This is another extra segment register ("F" comes after "E").

GS  Another extra segment register ("G" comes after "F").

SS or Stack Segment  This is the segment register for the stack.
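
To make the Segment:Offset arithmetic concrete, here is a small sketch in C.  The function name physical_address is my own invention; it only restates the (Segment * 16) + Offset formula from above.

```c
#include <stdint.h>

/* Physical address for a Real Mode Segment:Offset pair.
 * Shifting left by 4 is the same as multiplying by 16. */
uint32_t physical_address(uint16_t segment, uint16_t offset) {
    return ((uint32_t)segment << 4) + offset;
}
```

Calling it with 0x0000:0x0FF0, 0x00F0:0x00F0, and 0x00FF:0x0000 gives 0x0FF0 each time.  The largest pair, 0xFFFF:0xFFFF, gives 0x10FFEF, which is just over 1 MiB - hence the "about" 1 MiB mentioned in the Real Mode section.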

Declaring

In NASM, symbols can be defined by writing SymbolName: (a name followed by a colon) at the start of a line.  Analogous to the variables in higher-level languages, "variables" in NASM can be defined by having a symbol followed by "declaring a data region."

The way to declare these data regions is by using:

DB or Declare Byte  This declares a byte (8 bits).  Example usage: DB 0x12

DW or Declare Word  This declares two bytes (16 bits).  Example usage: DW 0x1234

DD or Declare Double  This declares four bytes (32 bits).  Example usage: DD 0x12345678

DQ or Declare Quadruple  This declares eight bytes (64 bits).  Example usage: DQ 0x1234567812345678

Unlike in higher level languages, adjacent declarations are laid out in memory one right after another, exactly in the order you wrote them; no padding, reordering, or other optimization takes place.
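
One detail the list above doesn't spell out is the byte order: the x86 is little-endian, so multi-byte values are stored with the least significant byte first.  The following C sketch imitates what an assembler does for DB, DW, DD, and DQ; the function name emit and the image buffer are hypothetical, made up just for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Append `width` bytes of `value` to `image` at position `pos`,
 * least significant byte first, the way an x86 assembler lays out
 * DB (width 1), DW (2), DD (4), and DQ (8).
 * Returns the position right after the bytes just written. */
size_t emit(uint8_t *image, size_t pos, uint64_t value, size_t width) {
    for (size_t i = 0; i < width; i++) {
        image[pos + i] = (uint8_t)(value >> (8 * i));
    }
    return pos + width;
}
```

Emitting DB 0x12, DW 0x1234, and DD 0x12345678 back to back this way produces the seven bytes 12 34 12 78 56 34 12, with nothing in between.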

Accessing

For accessing memory, keeping the following in mind can help:

The address of a symbol is accessed by its name, with SymbolName translating to the address of that symbol.

The contents of a symbol are accessed by its name in [  ], with [SymbolName] translating to the contents at that symbol.

Since the assembler never knows how many bytes you want to access, you have to use size directives to make it clear to the assembler.

BYTE (1), WORD (2), DWORD (4), and QWORD (8) are used as size directives.

For example, word [SymbolName] indicates that you want to access the contents of the word at SymbolName.

The contents of the address pointed to by a register are accessed by [RegisterName].  With 16-bit addressing, only BX, BP, SI, and DI can be used this way; AX, CX, DX, and SP can't be used to address memory.

The same directives as above can be used to access memory contents via registers.

If you want to override the segment used to access the address (symbol or register), the following syntax can be used: [es:RegisterName] or [es:SymbolName]

Direct memory addresses can also be used as the offset.  For example, with ES set to 0x00F0, [es:0x00F0] accesses the contents at 0x0FF0.

Stack

(The concept of the stack should be clear to any programmer reading this article, and the writer assumes so.)

The x86 has the concept of a stack, which is used to store parameters, local data, and return addresses.  However, the x86 stack grows downwards (towards lower addresses), which can seem backwards at first.

The SP register points at the top of the stack, and when something is pushed onto the stack, SP is decremented and the value pushed is stored on to the new top.  SS is used for the segment for the stack.
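
The push and pop behaviour just described can be imitated in C.  This is a toy model under my own assumptions: struct toy_stack, toy_push, and toy_pop are invented names, SS is left out, and the memory array stands in for a single 64 KiB segment.

```c
#include <stdint.h>

/* A toy model of the Real Mode stack: one 64 KiB segment and a 16-bit SP. */
struct toy_stack {
    uint8_t memory[0x10000];
    uint16_t sp;
};

/* push: SP is decremented by 2, then the word is stored at the new top. */
void toy_push(struct toy_stack *s, uint16_t value) {
    s->sp = (uint16_t)(s->sp - 2);
    s->memory[s->sp] = (uint8_t)(value & 0xFF);               /* low byte  */
    s->memory[(uint16_t)(s->sp + 1)] = (uint8_t)(value >> 8); /* high byte */
}

/* pop: the word at the top is read, then SP is incremented by 2. */
uint16_t toy_pop(struct toy_stack *s) {
    uint16_t value = (uint16_t)(s->memory[s->sp] |
                                (s->memory[(uint16_t)(s->sp + 1)] << 8));
    s->sp = (uint16_t)(s->sp + 2);
    return value;
}
```

Starting with SP at 0x7C00, two pushes leave SP at 0x7BFC, and the pops return the values in the opposite order in which they were pushed.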

To store the above data without needing to "clean up at the end," the stack is divided into stack frames.  The address of the current stack frame is stored in the BP register.

To better understand how stack frames are used, look at the following example of the C calling convention (known as CDECL calling convention):

Caller

*  Caller pushes the arguments onto the stack in reverse order.

*  Caller calls the callee.

*  Caller pops the pushed arguments to clear the stack.

*  Caller takes the value in EAX as the return value.

Callee

*  Callee saves the caller's EBP by pushing it onto the stack.

*  Callee places the current ESP in EBP, thus creating a new stack frame.

*  Callee makes some space on the stack for local data.

*  Callee executes code.

*  Callee replaces the ESP with EBP, effectively popping the local data.

*  Callee pops the caller's EBP.

*  Callee places the return code into EAX, and returns.

Note that in assembly, the CDECL calling convention isn't usually used (unless you're intermixing with C code), in which case EBP is a spare register.

Basic Instructions

The x86 Instruction Set Architecture is one of the most complex ISAs, and has many instructions.  Instead of trying to review all of the basic ones here, I recommend reading the following guide: www.cs.virginia.edu/~evans/cs216/guides/x86.html

At this point, you should probably delve straight into the manual of your assembler.  For NASM, www.nasm.us/doc goes through all of the options and the syntax, and would help a lot.

Interrupting the Interrupt

Just before we delve into our bootloader, the concept of interrupts needs to be explained.

Imagine yourself sleeping in the morning.  However, your arch-enemy, the alarm clock, wakes you up.  The question is "how?"  It interrupts you by ringing a bell.

Similarly, in Real Mode, to indicate that you want to get the BIOS' attention, you interrupt it.  In x86, the int instruction is used to do a software interrupt.

The way interrupts work is by having a vector table - 256 vectors - where each vector corresponds to an interrupt.  The BIOS then fills this table with the address of the functions that you need to call.

Thus, when you do int 0x1, the CPU jumps to whatever address is at the second (int 0x00 corresponds to the first) entry in the vector table.
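
The lookup can be sketched in C.  Two details here come from the x86 architecture rather than from the text above: in Real Mode the vector table starts at physical address 0, and each entry is four bytes (a 16-bit offset followed by a 16-bit segment).  The names ivt_entry_address and ivt_handler_address are mine, chosen for illustration only.

```c
#include <stdint.h>

enum { IVT_ENTRY_SIZE = 4 };

/* Physical address of the table entry for a given interrupt number,
 * assuming the Real Mode table starts at physical address 0. */
uint32_t ivt_entry_address(uint8_t vector) {
    return (uint32_t)vector * IVT_ENTRY_SIZE;
}

/* Decode one 4-byte entry (offset word, then segment word, both
 * little-endian) into the physical address of the handler. */
uint32_t ivt_handler_address(const uint8_t entry[4]) {
    uint16_t offset  = (uint16_t)(entry[0] | (entry[1] << 8));
    uint16_t segment = (uint16_t)(entry[2] | (entry[3] << 8));
    return ((uint32_t)segment << 4) + offset;
}
```

So int 0x1 makes the CPU read the entry at physical address 4, and an entry holding the bytes 53 FF 00 F0 would send it to F000:FF53.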

Some devices also use interrupts to inform the CPU that they are ready to perform some special function.  For example, a disk device might interrupt the CPU to inform that it has read something, and is now ready to read another sector.

These interrupts can be masked by cli so that the CPU isn't interrupted, and can be unmasked by sti.  For now, you should probably enable these maskable interrupts so that the BIOS can use them.

The Bootloader

Now that all the theory is complete, it's time to begin with the basic bootloader, so as not to bore my readers!  Please note that this section contains no theory at all - it just presents the source with enough comments to help you understand what is going on.

The build instructions follow each source file.

Barebone Bootloader

; main.asm
; This is a barebone bootloader to boot from the CD.

; BITS 16 tells NASM to output for 16 bit mode.
BITS 16

; ORG 0x7C00 ensures that all the data references are w.r.t. 0x7C00.
ORG 0x7C00

; This is our entry point, where the BIOS leaves us.
; The BIOS ensures that:
; a) DL contains the boot drive number. 
; This is a number to identify what device we booted from, so that we can 
; read/write from/to it later.
; b) CS:IP points to 0x7C00. Note that this doesn't mean IP (instruction 
; pointer) is 0x7C00.
Main:
; This is known as a far jump, and is the simplest way to reload the CS
; segment register. The other segment registers can be changed via a simple
; mov instruction.
jmp 0x0000:Startup

; We save the Boot Drive number here.
BootDrive:
db 0

; Now, we are assured that CS is 0x0000, so IP holds the real offset of
; Startup (just past 0x7C00).
Startup:
; We stop all maskable interrupts until we have set up a stack, since the
; interrupts require a stack.

cli

; We require all segment registers to be 0x0000. All except CS can be set 
; via a mov instruction.
xor ax, ax
mov ds, ax
mov es, ax
mov fs, ax
mov gs, ax

; Set the stack to 0x7C00. Since the stack grows downwards on x86, this 
; means that unless we do extra pops, we never cross into the bootloader's area.
mov ss, ax
mov sp, 0x7C00

; Now that we have set up the stack, we enable maskable interrupts.
sti

; Though disk reading is out of the scope of this tutorial, you should 
; save the drive number if you want to use it later on.
mov [BootDrive], dl

; All the code that we introduce later on should be put here.

; Here, $ is a special NASM symbol, which points to the address of the
; current instruction. Thus, it keeps on jumping to the current instruction,
; effectively hanging the CPU in an endless loop.
jmp $

Build Instructions

Make a directory named Article. Save the above file as main.asm in the Article directory, and assemble it with NASM.

You can do that with the following command, run from the command line in the Article directory:
$ nasm main.asm -fbin -o Article
This tells NASM to assemble the main.asm file and output a flat binary (i.e., without any file format). The -o flag tells it that the output file should be named Article.

Make an ISO using mkisofs (install it if it isn't installed). Execute the following command from the command line in the Article directory:
$ mkisofs -b Article -quiet -input-charset ascii -boot-load-size 4 -no-emul-boot -o Article.iso ./.
The -b flag tells mkisofs that the bootloader file is named Article. The -no-emul-boot flag asks for a "no emulation" boot image, and -boot-load-size 4 sets how many 512-byte sectors of it get loaded; both can be taken as given for now. If you're curious enough, a full explanation can be found in the manual of mkisofs.

How to Continue?

At this point, my article is almost finished. You must be wondering how to proceed. So here, I am giving you my list of references:

osdev.org is an excellent site, with wiki.osdev.org/Getting_Started and wiki.osdev.org/Tutorials being the recommended pages.

www.brokenthorn.com/Resources/OSDevIndex.html is also an excellent tutorial.

At this point, I leave you to explore the magically wonderful world of OS development.

Thanks!
