11. Heap Memory Allocator

Hux is now running safe & sound with virtual memory support. However, we do not yet know how to manage the large chunk of kernel heap memory. Recall that in the previous chapter, we just maintained a dumb kheap_curr pointer - a temporary solution for allocating the structures needed to set up paging, which does not allow free'ing any allocated memory.

The next thing to do is thus to develop a heap memory allocation algorithm that (efficiently) provides a kalloc() + kfree() interface to other parts of the kernel. This interface will allow us to easily allocate and recycle bytes on the kernel heap whenever we introduce a new kernel data structure. Unlike paging, we cannot assume fixed-sized slots here and must be able to react to dynamically-sized requests.

Main References of This Chapter

Scan through them before going forth:

  • Free-Space Management chapter of the OSTEP book: splitting & coalescing, free-list embedding, next-fit policy ✭

Heap Allocation Algorithm

Let's first go through the heap memory allocation problem conceptually. Heap allocation is an instance of the free-space management problem. The heap allocator, at initialization, knows the range & size of free memory it can use to allocate objects, in our case [kheap_curr, KMEM_MAX). Its job is to maintain a data structure of free memory regions, called a free-list.

A free-list can be any suitable data structure: linked list, doubly-linked list, binary tree, hash map, etc. We will use the simplest form of a linked list as shown in the OSTEP book examples.

Mechanism: Splitting & Coalescing

We use a node to denote a contiguous chunk of free memory. Each node holds the following information:

  • addr: start address of the free chunk
  • size: size of the free chunk
  • next: pointer to the next node in list
  • (optional) magic: for detecting memory corruptions/buffer overruns

Our free-list is simply a linked list of such nodes, sorted in ascending order of addresses. It starts with one big free chunk covering the entire heap memory range. When an allocation request comes in, the allocator chooses a chunk that is large enough for the requested size. Two general mechanisms are used in virtually any memory allocator to improve space efficiency ✭:

  • Splitting: when a chosen node is larger than the allocation size, split it into two - return the fitting chunk to the caller and keep the rest as a smaller free node (see the figures in the linked OSTEP chapter by Remzi & Andrea)
  • Coalescing: splitting alone leads to the external fragmentation problem - the list soon fills up with lots of small chunks, none of which can serve a relatively large request. So we also do coalescing (or merging): when a chunk is freed and returns to the free-list, if it and its neighbor(s) form a contiguous region, merge them into one node (see the sketch after this list)
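
To make the split & coalesce arithmetic concrete, here is a tiny standalone sketch with made-up addresses (the per-chunk header overhead, introduced later in this chapter, is ignored for simplicity):

#include <stdint.h>
#include <stdbool.h>

int main(void)
{
    /** Splitting: a 0x100-byte request carves the front off a free chunk. */
    uint32_t free_addr = 0x8000, free_size = 0x1000;    /** 4KiB chunk.    */
    uint32_t req_size  = 0x100;                         /** 256B request.  */

    uint32_t object    = free_addr;                 /** Given to caller.   */
    uint32_t rest_addr = free_addr + req_size;      /** == 0x8100.         */
    uint32_t rest_size = free_size - req_size;      /** == 0xF00, stays    */
                                                    /**   on the list.     */

    /** Coalescing: when the object is later freed, it is contiguous with */
    /** the remaining free chunk, so the two merge back into one node.    */
    bool contiguous = (object + req_size == rest_addr);     /** true.      */
    uint32_t merged = req_size + rest_size;                 /** == 0x1000. */

    return (contiguous && merged == free_size) ? 0 : 1;    /** Sanity.    */
}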

Policy: Next-Fit Allocation

With the splitting & coalescing mechanism, the next thing to do is to decide the policy of "which free node to choose".

It is better to separate mechanisms from policies, as suggested in OSTEP.

We will adopt the Next-Fit allocation policy. See the relevant sections in the linked OSTEP chapter for how it works and how it compares to First-Fit. Basically, we maintain a last-search pointer (last_search_header in the code below) which remembers where the last request was served. When a new request comes in, we start traversing the free-list from that node until we find one large enough for the request.
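
As a minimal illustration of the policy (a sketch with a simplified node type, not the actual Hux header structure defined below):

#include <stddef.h>

/** Simplified free-list node, for illustration only. */
typedef struct node {
    size_t size;
    struct node *next;          /** Circular singly-linked list. */
} node_t;

static node_t *last_search;     /** Where the previous search stopped. */

/**
 * Next-fit: resume scanning from the node after `last_search`, returning
 * the first node large enough (or NULL after one full round trip).
 */
static node_t *
next_fit_find(node_t *list_head, size_t size)
{
    node_t *begin = (last_search != NULL) ? last_search->next : list_head;
    node_t *curr = begin;
    do {
        if (curr->size >= size) {
            last_search = curr;     /** The next search resumes here. */
            return curr;
        }
        curr = curr->next;          /** Wraps around the circular list. */
    } while (curr != begin);
    return NULL;
}

Compared to First-Fit, which always scans from the head of the list and tends to pile small leftover chunks near the front, Next-Fit spreads allocations more evenly across the list.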

Embedding the Free-List

The free-list itself is a data structure and it must live somewhere in memory. Deciding where to put the free-list can be tricky. A common technique used by many memory allocators is to embed the free-list along with the objects ✭: every free chunk, and every object we return to callers, is prefixed with a header structure.

  • The headers of free chunks are the nodes of the linked list. (In other words, we are creating a linked list in-place.)
  • It's easy to translate back & forth between the header structure and the address of the memory chunk it represents, just by adding/subtracting the size of the header struct. Hence, we don't need to store an addr field in the headers.

Define the header structure @ src/memory/kheap.h:

#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>


/** Random magic number to protect against memory overruns. */
#define KHEAP_MAGIC 0xFBCA0739


/**
 * The free list is embedded inside the heap. Every allocated object (i.e.,
 * memory chunk returned to caller of `kalloc()`) is prefixed with a free-list
 * header structure.
 *
 * See https://pages.cs.wisc.edu/~remzi/OSTEP/vm-freespace.pdf, figure 17.3
 * ~ figure 17.7. The only difference is that our `magic` field is separate
 * from the `next` field and the magic number has weaker protection against
 * buffer overruns.
 */
struct free_list_node_header {
    size_t size;
    bool free;
    struct free_list_node_header *next;
    uint32_t magic;
};
typedef struct free_list_node_header fl_header_t;


/**
 * Pointer arithmetic helper macros. Note that these operate on integer
 * addresses (e.g., `uint32_t` values), not typed pointers - callers cast
 * to an integer type first.
 */
#define HEADER_TO_OBJECT(header) ((header) + sizeof(fl_header_t))
#define OBJECT_TO_HEADER(object) ((object) - sizeof(fl_header_t))


void kheap_init(void);

uint32_t kalloc(size_t size);
void kfree(void *addr);

A demonstration of such embedding can be found in figures 17.3 - 17.7 of OSTEP.
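
In place of those figures, here is a rough ASCII picture of a heap holding one allocated object followed by one free chunk (field layout abbreviated, sizes not to scale):

 kheap_btm                                                      kheap_top
 |                                                                      |
 v                                                                      v
 +----------+====================+----------+==========================+
 | header   |  allocated object  | header   |  free chunk, covered by  |
 | free = 0 |  (address returned | free = 1 |  the node's `size` field |
 | size ... |   by `kalloc()`)   | size ... |  (this header is a node  |
 | next ... |                    | next ... |   of the free-list)      |
 +----------+====================+----------+==========================+
            ^
            object address = header address + sizeof(fl_header_t)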

Implementing the Algorithm

Allocator Initialization

We first initialize the allocator with these things @ src/memory/kheap.c:

static uint32_t kheap_btm;
static uint32_t kheap_top = KMEM_MAX;


/** Global state of the free-list. */
static fl_header_t *bottom_most_header;     /** Address-wise smallest node. */
static fl_header_t *last_search_header;     /** Where the last search ends. */
static size_t free_list_length = 1;


/** Initialize the kernel heap allocator. */
void
kheap_init(void)
{
    /**
     * Initially, everything from `kheap_curr` (page rounded-up) up to the
     * 8MiB mark (`KMEM_MAX`) is free heap memory available to use.
     * Initialize it as one big free chunk.
     */
    kheap_btm = kheap_curr;
    kheap_top = KMEM_MAX;

    fl_header_t *header = (fl_header_t *) kheap_btm;
    uint32_t size = (kheap_top - kheap_btm) - sizeof(fl_header_t);
    memset((char *) (HEADER_TO_OBJECT(kheap_btm)), 0, size);

    header->size = size;
    header->free = true;
    header->next = header;      /** Point to self initially. */
    header->magic = KHEAP_MAGIC;

    bottom_most_header = header;
    last_search_header = header;
    free_list_length = 1;
}

Allocation with kalloc()

Code @ src/memory/kheap.c:

/**
 * Allocate a piece of heap memory (an object) of given size. Adopts
 * the "next-fit" allocation policy.
 */
uint32_t
kalloc(size_t size)
{
    if (free_list_length == 0) {
        warn("kalloc: kernel flexible heap all used up");
        return 0;
    }

    fl_header_t *header_last = last_search_header;
    fl_header_t *header_curr = (fl_header_t *) header_last->next;
    fl_header_t *header_begin = header_curr;

    do {
        /** If this node is not large enough, skip. */
        if (header_curr->size < size) {
            header_last = header_curr;
            header_curr = (fl_header_t *) header_curr->next;
            continue;
        }

        /**
         * Else, split this node if it is larger than `size` + meta bytes
         * (after split, the rest of it can be made a smaller free chunk).
         */
        if (header_curr->size > size + sizeof(fl_header_t)) {
            /** Split out a new node. */
            fl_header_t *header_new = (fl_header_t *)
                ((uint32_t) header_curr + size + sizeof(fl_header_t));
            header_new->size = (header_curr->size - size) - sizeof(fl_header_t);
            header_new->magic = KHEAP_MAGIC;
            header_new->free = true;

            /** Don't forget to update the current node's size. */
            header_curr->size = size;

            /**
             * Now, update the links between nodes. The special case of list
             * only having one node needs to be taken care of.
             * 
             * If only one node in list, then `header_last` == `header_curr`,
             * so `last_search_header` should be the new node, not the current
             * node (that is about to be returned as an object).
             */
            if (free_list_length == 1) {
                header_new->next = header_new;
                last_search_header = header_new;
            } else {
                header_new->next = header_curr->next;
                header_last->next = header_new;
                last_search_header = header_last;
            }

            /** Update smallest-address node. */
            if (header_curr == bottom_most_header)
                bottom_most_header = header_new;
        
        } else {
            /** Not splitting, then just take this node off the list. */
            header_last->next = header_curr->next;
            free_list_length--;

            /** Update smallest-address node. */
            if (header_curr == bottom_most_header)
                bottom_most_header = header_curr->next;
        }

        /** `header_curr` is now the chunk to return. */
        header_curr->next = NULL;
        header_curr->free = false;
        uint32_t object = HEADER_TO_OBJECT((uint32_t) header_curr);

        _print_free_list_state();
        return object;

    } while (header_curr != header_begin);

    /** No free chunk is large enough; warn and return failure. */
    warn("kalloc: no free chunk large enough for size %d\n", size);
    return 0;
}

Real operating systems often maintain segregated lists (slab allocators) for faster allocation of fixed-size kernel objects like page tables, inodes, locks, etc. These allocations often also require alignment, e.g., page tables must be page-aligned. Our allocator for now does not support alignment options, because free-list embedding interacts poorly with alignment (see the sketch below). We will make dedicated allocators for those structures in later chapters when we meet them.
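
To see why, consider the common workaround of over-allocating and rounding the result up to the boundary - a hypothetical kalloc_aligned() wrapper (NOT part of Hux; the name is made up, and align must be a power of two):

/**
 * Hypothetical sketch only -- not part of Hux. Over-allocate by `align`
 * bytes, then round the object address up to the requested boundary.
 */
uint32_t
kalloc_aligned(size_t size, size_t align)
{
    uint32_t raw = kalloc(size + align);    /** Wastes up to `align` bytes. */
    if (raw == 0)
        return 0;
    return (raw + align - 1) & ~((uint32_t) align - 1);
}

Besides the wasted padding, the rounded address no longer sits right above its embedded header, so it cannot be passed back to kfree() as-is. Dedicated fixed-size allocators avoid both problems.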

Deallocation with kfree()

Code @ src/memory/kheap.c:

/**
 * Free a previously allocated object address, merging it back into the
 * free-list and coalescing with any neighboring free chunk.
 */
void
kfree(void *addr)
{
    fl_header_t *header = (fl_header_t *) OBJECT_TO_HEADER((uint32_t) addr);

    if ((uint32_t) addr < kheap_btm || (uint32_t) addr >= kheap_top) {
        error("kfree: object %p is out of heap range", addr);
        return;
    }

    if (header->magic != KHEAP_MAGIC) {
        error("kfree: object %p corrupted its header magic", addr);
        return;
    }

    /** Fill with zero bytes to catch dangling pointers use. */
    header->free = true;
    memset((char *) addr, 0, header->size);

    /**
     * Special case of an empty free-list (all bytes were exactly allocated
     * before this `kfree()` call): just add this free'd object as a node.
     */
    if (free_list_length == 0) {
        header->next = header;
        bottom_most_header = header;
        last_search_header = header;
        free_list_length++;

    /**
     * Else, traverse the free-list starting from the bottom-most node to
     * find the proper place to insert this free'd node, then check whether
     * either of its up- & down-neighbors is free. If so, coalesce with
     * them into a bigger free chunk.
     *
     * This linked-list traversal is not quite efficient and there are tons
     * of ways to optimize the free-list structure, e.g., using a balanced
     * binary search tree. Left as future work.
     */
    } else {
        /**
         * Locate down-neighbor. Will get the top-most node if the free'd
         * object is located below the current bottom-most node.
         */
        fl_header_t *dn_header = bottom_most_header;
        while (dn_header->next != bottom_most_header) {
            if (dn_header < header && dn_header->next > header)
                break;
            dn_header = dn_header->next;
        }
        bool dn_coalesable = dn_header > header ? false
            : HEADER_TO_OBJECT((uint32_t) dn_header) + dn_header->size == (uint32_t) header;

        /**
         * Locate up-neighbor. Will get the bottom-most node if the free'd
         * object is located above all the free nodes.
         */
        fl_header_t *up_header = dn_header > header ? bottom_most_header
            : dn_header->next;
        bool up_coalesable = up_header < header ? false
            : HEADER_TO_OBJECT((uint32_t) header) + header->size == (uint32_t) up_header;

        /** Case #1: Both coalesable. */
        if (dn_coalesable && up_coalesable) {
            /** Remove up-neighbor from list. */
            dn_header->next = up_header->next;
            free_list_length--;
            /** Extend down-neighbor to cover the whole region. */
            dn_header->size +=
                header->size + up_header->size + 2 * sizeof(fl_header_t);
            /** Update last search node. */
            if (last_search_header == up_header)
                last_search_header = dn_header;

        /** Case #2: Only with down-neighbor. */
        } else if (dn_coalesable) {
            /** Extend down-neighbor to cover me. */
            dn_header->size += header->size + sizeof(fl_header_t);

        /** Case #3: Only with up-neighbor. */
        } else if (up_coalesable) {
            /** Update down-neighbor to point to me. */
            dn_header->next = header;
            /** Extend myself to cover up-neighbor. */
            header->size += up_header->size + sizeof(fl_header_t);
            header->next = up_header->next;
            /** Update bottom-most node. */
            if (bottom_most_header > header)
                bottom_most_header = header;
            /** Update last search node. */
            if (last_search_header == up_header)
                last_search_header = header;

        /** Case #4: With neither. */
        } else {
            /** Link myself in the list. */
            dn_header->next = header;
            header->next = up_header;
            free_list_length++;
            /** Update bottom-most node. */
            if (bottom_most_header > header)
                bottom_most_header = header;
        }
    }

    _print_free_list_state();
}
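
As a quick illustration of what the magic field buys us, here is a hypothetical overrun snippet (made up for demonstration - do not put this in the kernel): writing past the end of one object tramples the embedded header of the chunk right above it, which a later kfree() detects:

/** Hypothetical demo of magic-based corruption detection (sketch only). */
static void
magic_check_demo(void)
{
    char *a = (char *) kalloc(16);
    char *b = (char *) kalloc(16);  /** Likely carved right above `a`.    */

    /** Overrun `a` by 16 bytes, trampling the header embedded above it. */
    memset(a, 0xFF, 32);

    kfree(b);   /** `b`'s header magic no longer matches KHEAP_MAGIC, so */
                /** this reports "corrupted its header magic" and bails. */
}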

State Printing Helper

It is also nice to have a debug printing helper function that shows us the internal state of the free-list. Code @ src/memory/kheap.c:

/** For debug printing the state of the free-list. */
__attribute__((unused))
static void
_print_free_list_state(void)
{
    fl_header_t *header_curr = bottom_most_header;

    info("Kheap free-list length = %d, last_search = %p, nodes:",\
         free_list_length, last_search_header);
    do {
        printf("  node header %p { size: %d, next: %p }\n",
               header_curr, header_curr->size, header_curr->next);

        header_curr = (fl_header_t *) header_curr->next;
    } while (header_curr != bottom_most_header);
}

Add two temporary printing hooks right before kalloc() and kfree() return, so the free-list state gets printed whenever it changes. The unused attribute suppresses the unused-function warning for this helper when the hooks are not in place.

Progress So Far

Let's write a few kalloc()'s and kfree()'s in our kernel main function and see how our kernel heap allocator reacts to these requests! Add these to the main function @ src/kernel.c:

#include "memory/kheap.h"


/** The main function that `boot.s` jumps to. */
void
kernel_main(unsigned long magic, unsigned long addr)
{
    ... // All the previous initializations...

    /** Initialize the kernel heap allocator. */
    _init_message("initializing kernel heap memory allocator");
    kheap_init();
    _init_message_ok();
    info("kernel free heap starts at %p", kheap_curr);

    /** Executes `sti`, CPU starts taking in interrupts. */
    asm volatile ( "sti" );

    printf("\nKallocing arr1 - 128 bytes...\n");
    char *arr1 = (char *) kalloc(128 * sizeof(char));
    strncpy(arr1, "hello\n", 127);

    printf("\nKallocing arr2 - 23 bytes...\n");
    char *arr2 = (char *) kalloc(23 * sizeof(char));
    strncpy(arr2, "hello\n", 22);

    printf("\nKallocing arr3 - 437 bytes...\n");
    char *arr3 = (char *) kalloc(437 * sizeof(char));
    strncpy(arr3, "hello\n", 436);

    printf("\nKfreeing arr3, should coalesce with the big chunk...\n");
    kfree(arr3);

    printf("\nKfreeing arr1, should have no coalescing...\n");
    kfree(arr1);

    printf("\nKallocing arr4 - 54 bytes, should reuse the first chunk...\n");
    char *arr4 = (char *) kalloc(54 * sizeof(char));
    strncpy(arr4, "hello\n", 53);

    printf("\nKfreeing arr2, should coalesce with both neighbors...\n");
    kfree(arr2);

    printf("\nKallocing arr5 - 3971 bytes...\n");
    char *arr5 = (char *) kalloc(3971 * sizeof(char));
    strncpy(arr5, "hello\n", 3970);

    while (1)   // CPU idles with a `hlt` loop.
        asm volatile ( "hlt" );
}

Booting it up, the terminal should print the free-list state after each of these operations.
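
For instance, the first kalloc() should split the single big chunk and produce output shaped roughly like this (the 0x... addresses and sizes are placeholders, not actual values):

Kallocing arr1 - 128 bytes...
Kheap free-list length = 1, last_search = 0x..., nodes:
  node header 0x... { size: ..., next: 0x... }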

This allocator implementation could later be reused in building a user-level memory management library for user processes to do malloc()'s. Our kernel, however, will more often use slab allocators for fixed-size objects like page tables, process kernel stacks, inodes, etc. We will implement slab allocators in the next chapter.

Current repo structure:

hux-kernel
├── Makefile
├── scripts
│   ├── gdb_init
│   ├── grub.cfg
│   └── kernel.ld
├── src
│   ├── boot
│   │   ├── boot.s
│   │   ├── elf.h
│   │   └── multiboot.h
│   ├── common
│   │   ├── debug.c
│   │   ├── debug.h
│   │   ├── port.c
│   │   ├── port.h
│   │   ├── printf.c
│   │   ├── printf.h
│   │   ├── string.c
│   │   ├── string.h
│   │   ├── types.c
│   │   └── types.h
│   ├── device
│   │   ├── keyboard.c
│   │   ├── keyboard.h
│   │   ├── timer.c
│   │   └── timer.h
│   ├── display
│   │   ├── terminal.c
│   │   ├── terminal.h
│   │   └── vga.h
│   ├── interrupt
│   │   ├── idt-load.s
│   │   ├── idt.c
│   │   ├── idt.h
│   │   ├── isr-stub.s
│   │   ├── isr.c
│   │   └── isr.h
│   ├── memory
│   │   ├── gdt-load.s
│   │   ├── gdt.c
│   │   ├── gdt.h
│   │   ├── kheap.c
│   │   ├── kheap.h
│   │   ├── paging.c
│   │   └── paging.h
│   └── kernel.c