
Memory Layout & the Linker Script: Teaching the Linker What 'Correct' Looks Like

Master ELF sections (.text, .rodata, .data, .bss) and fix alignment bugs in your AArch64 linker script. Learn to verify memory maps using readelf for bare-metal ARM development.

Diagram showing a kernel's address space divided into .text, .rodata, .data, .bss, stack, and heap regions

In the last post, we gave our terminal the ability to speak and made it print “Hello, Kernel!” by writing bytes directly to a memory-mapped UART register. It was genuinely satisfying. But the kernel was also lying to itself a little. The address 0x40000000 isn’t some magic number we decided on; it’s from our linker script. The UART address simply came from QEMU’s documentation. The stack worked because we got lucky with alignment. We haven’t actually designed the address space; we’ve just been squatting in it.

This post walks through the kernel’s memory layout region by region. We will fix the problems in our current linker script and set up the address space properly for everything that follows. By the time we implement the MMU and enable virtual memory, you’ll need to know exactly what you’re mapping and why.


What We’re Building Today

The end result of this post is a properly structured address space. We’ll add a .rodata section, fix a real alignment bug that can corrupt your BSS zeroing, expand the stack from 4 KB to 16 KB, and reserve a heap region for the heap allocator we’ll build later in this series. By the end of this post, readelf will confirm every region is where we said it would be.

Here’s what the kernel’s physical address space will look like after this post:

bios@confessions ~/memory-map · Physical address space after this post — full kernel layout
PL011 UART (MMIO) 0x0009000000 → 0x0009001000 (4 KB)
Kernel .text 0x0040000000 → 0x0040010000 (64 KB)
Kernel .rodata 0x0040010000 → 0x0040018000 (32 KB)
Kernel .data 0x0040018000 → 0x0040020000 (32 KB)
Kernel .bss 0x0040020000 → 0x0040028000 (32 KB)
Stack (16 KB) 0x0040028000 → 0x004002C000 (16 KB)
Heap (1 MB) 0x004002C000 → 0x004012C000 (1 MB)

The sizes shown here are illustrative. The real ones depend on how much code and data we add over time. But the ordering is fixed by the linker script, and the UART is permanently at 0x09000000. Everything else lives at or above 0x40000000.


What’s Actually in the Binary Right Now?

Before we change anything, let’s see what we currently have. Run readelf -S kernel.elf on the binary from the previous post. The -S flag dumps the section header table, which contains the linker’s record of what’s inside the binary and where it lives.

Looking at the result, a few things immediately jump out. There’s no .rodata section at all, so string literals like "Hello, Kernel!\n" are being folded into .text. .data and .bss are both zero bytes in size, which is correct for now, but means the linker can place them anywhere. There is also no stack or heap section: those regions exist only as gaps created by advancing the location counter in the linker script, so they never appear as named sections.

Now run nm --defined-only kernel.elf | grep -E 'bss|stack|heap' to see the exported symbols:

__bss_start and __bss_end are at the same address. That means BSS has zero size. boot.S’s zeroing loop runs for zero iterations, which is correct right now. The moment we add a global variable, though, the loop will start issuing real stores, and that is when the alignment of __bss_start begins to matter.

More worrying: __bss_start is at 0x40000124, which is not a multiple of 8. Our zeroing loop uses str xzr, [x1], #8, which is an 8-byte store. If the address isn’t 8-byte aligned, that instruction will raise a data alignment fault on real hardware. We already mentioned this bug in post 3 and promised we would fix it. Well, this is the post where that fix is delivered.
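The alignment check is plain arithmetic on the address. A minimal C sketch shows why 0x40000124 fails the test and where an ALIGN(8) would move it. The helper names is_aligned and align_up are illustrative and not part of the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when addr is a multiple of align. align must be a power of two. */
bool is_aligned(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Round addr up to the next multiple of align, which is what the linker's
 * ALIGN(n) does to the location counter. align must be a power of two. */
uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

is_aligned(0x40000124, 8) is false because 0x124 leaves a remainder of 4 when divided by 8, and align_up(0x40000124, 8) yields 0x40000128, the next address an 8-byte store can safely target.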


ELF Sections: The Four Rooms of Every Program

When your compiler turns C into machine code, it doesn’t produce a flat blob of bytes. It produces an ELF file with distinct sections, each with a defined purpose. Understanding what goes where is the foundation of everything we’re about to do.

Think of ELF sections as four rooms in a house, each with a different function and access policy. Let’s see what it’s like to walk through each room:

  • .text - The workshop. This room contains your executable instructions. Think of functions, loops, conditionals or basically everything the CPU fetches and executes. It’s read-only at runtime (you don’t want code that modifies itself), and the MMU will later mark it executable. Our boot stub .text.boot lives here, which is why we need it first: QEMU jumps to 0x40000000, and _start must be there.

  • .rodata - The reference library. This room contains read-only data. For example, string literals like "Hello, Kernel!\n" are not stored in .text even though they appear in C code. The compiler emits them into .rodata. Constant lookup tables, const char[] arrays, and anything the compiler can prove will never be written go here. The MMU marks .rodata as readable but not executable or writable, enforcing the “read-only” part.

  • .data - The whiteboard. This room holds initialised mutable globals. A declaration like int counter = 42; at file scope ends up in .data. The actual value 42 is stored in the ELF binary and loaded into RAM at startup. On ROM-based embedded systems, a startup routine copies .data from flash to RAM before main runs; on our QEMU setup, the binary is loaded directly into RAM, so .data is already there. .data takes up real space in the ELF file because every initialised byte needs to be stored somewhere.

  • .bss - The blank notebook. The last room holds zero-initialised globals. int counter; or static char buf[4096]; ends up here. The C standard guarantees they start at zero, but storing zeros in the ELF file would be wasteful. So .bss records only its size. No actual bytes are stored in the binary. At startup, boot.S zeros the entire BSS region, and that’s the only reason the C guarantee holds on bare metal. If you removed the BSS zeroing loop, globals would contain power-on RAM noise.

Here’s a quick cheat sheet showing where different C declarations land:

C Declaration | Section | Why
void fn(void) { } | .text | Executable code
const char *msg = "hi"; | .rodata | "hi" is a string literal; the pointer itself is in .data
int counter = 42; | .data | Initialized, mutable, non-zero
int x = 0; | .bss or .data | Compiler decides; often .bss when the value is zero
int uninit; | .bss | No initializer → guaranteed zero at startup
static char buf[4096]; | .bss | Large zero-init buffer

This separation of regions lets the OS loader (or our boot stub) apply the right memory protection to each region, and it saves binary size by not storing zeros for .bss.
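To try the cheat sheet yourself, here is a small C file with one declaration per row. The section comments assume GCC or Clang defaults (zero initialisers placed in .bss, and -fno-common as in GCC 10 and later). The identifiers, including buf_ptr, are illustrative only.

```c
/* sections_demo.c: one declaration per cheat-sheet row (names are illustrative) */

void fn(void) { }            /* .text: executable code */

const char *msg = "hi";      /* "hi" -> .rodata; the pointer msg -> .data */

int counter = 42;            /* .data: initialised, mutable, non-zero */

int x = 0;                   /* usually .bss: explicit zero initialiser */

int uninit;                  /* .bss: no initialiser, zero at startup */

static char buf[4096];       /* .bss: large zero-initialised buffer */

/* Reference buf so the compiler cannot discard the unused static. */
char *buf_ptr(void) { return buf; }
```

Compiling this with -c and running readelf -S or nm on the resulting object file is one way to confirm where each symbol lands with your toolchain.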


The Linker Script, Line by Line

The linker script is the only thing that decides where all of this actually lands in memory. On a regular system, the OS handles this for every process. However, now that we are the OS, we need to handle it ourselves. Below is the linker script as it will look by the end of this post, so you can see every directive in context before we go through what changes and why.

bios@confessions ~/linker · link.ld — annotated
ENTRY(_start)

SECTIONS {
    . = 0x40000000;

    /* .text.boot first → _start lands at 0x40000000 */
    .text : {
        KEEP(*(.text.boot))
        *(.text .text.*)
    }

    . = ALIGN(8);
    .rodata : {
        *(.rodata .rodata.*)
    }

    . = ALIGN(8);
    .data : {
        *(.data .data.*)
    }

    . = ALIGN(8);
    .bss : {
        __bss_start = .;
        *(.bss .bss.*)
        *(COMMON)
        . = ALIGN(8);
        __bss_end = .;
    }

    /* Stack: 16 KB, growing downward */
    . = ALIGN(16);
    __stack_bottom = .;
    . += 0x4000; /* 16 KB */
    _stack_top = .;

    /* Heap: 1 MB placeholder for Post 8 */
    . = ALIGN(8);
    __heap_start = .;
    . += 0x100000; /* 1 MB */
    __heap_end = .;
}

The key insight to understanding this linker script is that it’s not just configuration. It’s a description of the physical address space. Every symbol = . line exports an address that C and assembly code can reference. Every ALIGN(n) line adds padding to ensure the hardware’s alignment requirements are met. The script is the contract between the compiler, the linker, and the CPU.
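One way to internalise the location counter is to replay the tail of the script by hand. The sketch below is a hosted C model rather than kernel code: struct layout, align_dot, and layout_after_bss are hypothetical names, and the input is whatever address the end of .bss happens to have in a given build.

```c
#include <stdint.h>

#define STACK_SIZE 0x4000ULL     /* `. += 0x4000;`    16 KB */
#define HEAP_SIZE  0x100000ULL   /* `. += 0x100000;`   1 MB */

struct layout {
    uint64_t stack_bottom, stack_top, heap_start, heap_end;
};

/* Mirrors `. = ALIGN(n);` for a power-of-two n. */
uint64_t align_dot(uint64_t dot, uint64_t n)
{
    return (dot + n - 1) & ~(n - 1);
}

/* Replays the statements that follow .bss in link.ld. */
struct layout layout_after_bss(uint64_t bss_end)
{
    struct layout l;
    uint64_t dot = bss_end;

    dot = align_dot(dot, 16);    /* . = ALIGN(16);      */
    l.stack_bottom = dot;        /* __stack_bottom = .; */
    dot += STACK_SIZE;           /* . += 0x4000;        */
    l.stack_top = dot;           /* _stack_top = .;     */

    dot = align_dot(dot, 8);     /* . = ALIGN(8);       */
    l.heap_start = dot;          /* __heap_start = .;   */
    dot += HEAP_SIZE;            /* . += 0x100000;      */
    l.heap_end = dot;            /* __heap_end = .;     */
    return l;
}
```

For a made-up bss_end of 0x40000238, the model places the stack at 0x40000240 to 0x40004240 and the heap at 0x40004240 to 0x40104240. Each symbol is simply the value of the location counter at the moment its assignment runs.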


Problems With Our Current link.ld

The linker script we’ve been using since Post 2 has served us well, but it has four real issues that will cause problems as the kernel grows:

  1. No .rodata section. String literals get absorbed into .text. When we enable the MMU and mark .text as execute-only, any attempt to read a string literal will fault. More immediately, string data mixed with executable code makes section-level permissions impossible.

  2. BSS is not 8-byte aligned. The current script has .bss starting wherever .data ends, with no alignment guarantee. Our boot.S zeroing loop uses 8-byte stores. If BSS starts at an address that isn’t a multiple of 8, every zeroing write can trigger an alignment fault on real AArch64 hardware. We mentioned ALIGN(8) as a note in Post 3, but we never actually added it to the file.

  3. The stack is too small. 4 KB sounded fine when the kernel did nothing. Exception handlers push many registers. Deeply nested kernel functions eat stack frames. Future features, such as context switching, save an entire register file per task. 16 KB is more appropriate.

  4. No heap symbols. When we write the bump allocator in a later post, it will need to know where free memory begins. Right now, there’s nothing. We can plant the symbols now so we can use them in later posts.


The Improved link.ld

Here’s the full rewrite of our link.ld. This is the file that needs to replace the current link.ld in the root path of our OS.

ENTRY(_start)

SECTIONS {
    . = 0x40000000;

    /*
     * .text.boot must come first — _start must sit at exactly 0x40000000.
     * QEMU loads the ELF and jumps to ENTRY(), which is _start.
     * KEEP prevents the linker from discarding .text.boot even if it
     * looks unreferenced (it's referenced only by the ENTRY directive,
     * not by any C symbol).
     */
    .text : {
        KEEP(*(.text.boot))
        *(.text .text.*)
    }

    /*
     * .rodata: read-only data. String literals and const globals.
     * Separate from .text so the MMU can mark it readable but not
     * executable when we enable virtual memory in a later Post.
     */
    . = ALIGN(8);
    .rodata : {
        *(.rodata .rodata.*)
    }

    /*
     * .data: initialized read-write globals.
     * int counter = 42 → here. The values are stored in the ELF binary
     * and are already in RAM when QEMU loads us.
     */
    . = ALIGN(8);
    .data : {
        *(.data .data.*)
    }

    /*
     * .bss: zero-initialized globals.
     * C standard guarantees these are zero at program start. On bare
     * metal, boot.S must zero this region explicitly before calling
     * kernel_main. ALIGN(8) here is not optional — our zeroing loop
     * uses 8-byte stores, which will alignment-fault on real hardware
     * if the start address isn't 8-byte aligned.
     *
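     * *(COMMON) catches tentative definitions (e.g. int x; with no
     * initializer, when compiled with -fcommon, the GCC default before
     * version 10). Without it, they become orphan input that the linker
     * may place outside __bss_start..__bss_end, where nothing zeroes them.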
     */
    . = ALIGN(8);
    .bss : {
        __bss_start = .;
        *(.bss .bss.*)
        *(COMMON)
        . = ALIGN(8);   /* guarantee 8-byte aligned end too */
        __bss_end = .;
    }

    /*
     * Stack: 16 KB, growing downward.
     * AArch64 uses a full-descending stack where SP points to the last
     * written word and decrements before each push. boot.S sets:
     *   ldr x0, =_stack_top
     *   mov sp, x0
     * so the stack starts at _stack_top (the high address) and grows
     * toward __stack_bottom. We don't put this in a named section,
     * it's just reserved address space with exported symbols.
     */
    . = ALIGN(16);          /* 16-byte stack alignment required by AAPCS64 */
    __stack_bottom = .;
    . += 0x4000;            /* 16 KB */
    _stack_top = .;

    /*
     * Heap: 1 MB placeholder for Post 8 (Heap Allocator).
     * The bump allocator will use __heap_start and __heap_end to
     * initialize its memory pool. By defining them here instead of
     * in C code, we keep the allocator decoupled from the memory layout.
     */
    . = ALIGN(8);
    __heap_start = .;
    . += 0x100000;          /* 1 MB */
    __heap_end = .;
}

The structural change that matters most here is the ALIGN(8) directive before .rodata, .data, and .bss, plus the one inside .bss just ahead of __bss_end. Together, those lines eliminate an entire class of alignment faults. The addition of the .rodata section means string literals now have their own home. The heap symbols are just two lines, but they eliminate an entire design problem we would face later.
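To make the value of the heap symbols concrete, here is a rough sketch of how a bump allocator might consume a start and end address. It is not the allocator this series will build: struct bump, bump_init, and bump_alloc are hypothetical names, and the 16-byte allocation alignment is an assumption chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical allocator state. */
struct bump {
    uintptr_t next;  /* next free address */
    uintptr_t end;   /* one past the last usable byte */
};

/* In the kernel, the bounds would come from the linker symbols, e.g.
 *   extern char __heap_start[], __heap_end[];
 *   bump_init(&heap, __heap_start, __heap_end);
 */
void bump_init(struct bump *b, void *start, void *end)
{
    b->next = (uintptr_t)start;
    b->end  = (uintptr_t)end;
}

/* Returns size bytes aligned to 16, or NULL when the pool is exhausted. */
void *bump_alloc(struct bump *b, size_t size)
{
    uintptr_t p = (b->next + 15) & ~(uintptr_t)15;

    if (p < b->next || p > b->end || size > b->end - p)
        return NULL;
    b->next = p + size;
    return (void *)p;
}
```

Because the allocator only ever sees two addresses, resizing or relocating the heap becomes a one-line change in link.ld, with no C code to touch.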


Verifying With readelf

Once you replace link.ld in the project, rebuild it using:

make clean && make

Now let’s also read the sections again:

The AX flags on .text mean Allocated and Executable. The A on .rodata means Allocated (but not executable, not writable). .data and .bss are WA (Writable and Allocated). These flags are exactly what we’d ask for. The MMU in the next Post will use them as the blueprint for page table permission bits.
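The letters readelf prints are a rendering of three bits in each section header’s sh_flags field. The sketch below decodes them the same way and derives a simple permission triple from them. The bit values are the standard ELF ones (SHF_WRITE, SHF_ALLOC, SHF_EXECINSTR), but the enum names, struct perms, and both functions are hypothetical and only show how the flags could feed a page-table builder.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    SEC_WRITE = 0x1,  /* SHF_WRITE: writable at run time          */
    SEC_ALLOC = 0x2,  /* SHF_ALLOC: occupies memory when loaded   */
    SEC_EXEC  = 0x4   /* SHF_EXECINSTR: contains instructions     */
};

/* Writes the readelf-style letters for flags into out (at least 4 bytes). */
void flags_to_letters(uint64_t flags, char out[4])
{
    int n = 0;
    if (flags & SEC_WRITE) out[n++] = 'W';
    if (flags & SEC_ALLOC) out[n++] = 'A';
    if (flags & SEC_EXEC)  out[n++] = 'X';
    out[n] = '\0';
}

/* Hypothetical permission triple a page-table builder might derive. */
struct perms { bool read, write, exec; };

struct perms perms_from_flags(uint64_t flags)
{
    struct perms p = {
        .read  = (flags & SEC_ALLOC) != 0,
        .write = (flags & SEC_WRITE) != 0,
        .exec  = (flags & SEC_EXEC)  != 0,
    };
    return p;
}
```

With these definitions, a flags value of 0x6 renders as AX (.text), 0x2 as A (.rodata), and 0x3 as WA (.data and .bss).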

Now also check the symbols again:

__bss_start and __bss_end are both at 0x40000170, which is 8-byte aligned (it ends in 0 or 8). That address is also 16-byte aligned, so the stack begins right there and occupies 0x40000170 to 0x40004170. This is exactly 16 KB (0x4000 bytes). The heap starts immediately after the stack top and extends 1 MB to 0x40104170. Everything lines up.

One thing to notice: __bss_start == __bss_end is still true because we have no global variables yet. As soon as you add one, the gap will open. We can quickly verify this by updating our main.c:

/* Add to kernel/main.c temporarily — remove it after you've checked */
int test_global = 0;

void kernel_main(void) {
    uart_init();
    kprint("Hello, Kernel!\n");
    while (1);
}

Rebuild and rerun the earlier nm command, and the new variable itself is nowhere to be seen, because that grep pattern only matches bss, stack, and heap. We need to change our grep and look more carefully:

$ aarch64-elf-nm --defined-only kernel.elf | grep -E 'bss|test'
0000000040000170 B __bss_start
0000000040000178 B __bss_end
0000000040000170 B test_global

Changing the grep shows that test_global sits at __bss_start, and __bss_end moved 8 bytes forward: 4 bytes for the int, plus 4 bytes of padding from the ALIGN(8) that precedes __bss_end. Remove test_global before you continue, since we don’t need it. It was only there to prove the plumbing works.


What boot.S Reads From the Linker Script

The boot stub hasn’t changed, but it’s worth spelling out exactly which symbols it reads and what they mean after our linker script update:

bios@confessions ~/registers · Linker-exported symbols read by boot.S
__bss_start 0x0000000040000170
__bss_end 0x0000000040000170
_stack_top 0x0000000040004170
__heap_start 0x0000000040004170
__heap_end 0x0000000040104170

These are not variables — they are linker symbols. Their addresses are resolved at link time and embedded into the binary. boot.S uses =__bss_start (the address, not the contents) to set up register x1 for the BSS zeroing loop.

The boot stub has always relied on _stack_top, __bss_start, and __bss_end. Nothing changes in boot.S itself. The improvement is that these symbols are now properly aligned and reliably defined. The heap symbols don’t affect boot at all.

Here’s the BSS zeroing loop from boot.S, annotated with what the new symbols mean:

bios@confessions ~/boot.S (excerpt) — BSS zeroing with new symbol values
  ldr  x1, =__bss_start   // x1 = 0x40000170 (8-byte aligned, ALIGN(8) guarantees it)
  ldr  x2, =__bss_end     // x2 = 0x40000170 (same — no globals yet)
.zero_bss:
  cmp  x1, x2             // if x1 >= x2, we're done
  bge  .call_kernel
  str  xzr, [x1], #8      // *x1 = 0, x1 += 8  (safe because x1 is 8-byte aligned)
  b    .zero_bss
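
For readers more comfortable in C, the same loop can be written as follows. This is a hosted sketch for illustration only, and zero_words is a hypothetical name. The kernel keeps doing this in assembly because it has to run before any C code can rely on .bss.

```c
#include <assert.h>
#include <stdint.h>

/* Zero [start, end) eight bytes at a time, like `str xzr, [x1], #8`.
 * Both bounds must be 8-byte aligned, which is what ALIGN(8) guarantees. */
void zero_words(uint64_t *start, uint64_t *end)
{
    assert(((uintptr_t)start & 7) == 0 && ((uintptr_t)end & 7) == 0);
    while (start < end)       /* cmp x1, x2 ; bge .call_kernel */
        *start++ = 0;         /* str xzr, [x1], #8             */
}
```

When start equals end, as it does while the kernel has no globals, the loop body never runs, which matches the zero-iteration behaviour of the assembly version.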

Why ALIGN Matters More Than You Think

The alignment fault that the old link.ld enabled deserves its own explanation. On AArch64, when alignment checking is in force, a 64-bit store (str x0, [addr]) requires the address to be 8-byte aligned, and a 128-bit SIMD store requires 16-byte alignment. Violating this generates a Data Abort exception, reported with an EC (Exception Class) value of 0x25 in the ESR_EL1 register when the fault is taken from EL1.
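Once an exception handler exists, recognising this fault comes down to picking two fields out of ESR_EL1: the Exception Class in bits [31:26] and, for Data Aborts, the fault status code in bits [5:0], where 0x21 denotes an alignment fault. A minimal decoding sketch follows. The macro and function names are hypothetical, and the sample value 0x96000021 used afterwards is constructed by hand (EC 0x25, IL set, DFSC 0x21) rather than captured from hardware.

```c
#include <stdbool.h>
#include <stdint.h>

#define ESR_EC_SHIFT        26
#define ESR_EC_MASK         0x3Fu
#define ESR_EC_DABT_SAME_EL 0x25u  /* Data Abort, no change in exception level */
#define ESR_DFSC_MASK       0x3Fu
#define ESR_DFSC_ALIGNMENT  0x21u  /* fault status code for an alignment fault */

/* Exception Class: ESR_EL1 bits [31:26]. */
uint32_t esr_ec(uint64_t esr)
{
    return (uint32_t)(esr >> ESR_EC_SHIFT) & ESR_EC_MASK;
}

/* Data Fault Status Code: ESR_EL1 bits [5:0], valid for Data Aborts. */
uint32_t esr_dfsc(uint64_t esr)
{
    return (uint32_t)esr & ESR_DFSC_MASK;
}

/* True when esr describes an alignment fault raised by code running at EL1. */
bool esr_is_el1_alignment_fault(uint64_t esr)
{
    return esr_ec(esr) == ESR_EC_DABT_SAME_EL &&
           esr_dfsc(esr) == ESR_DFSC_ALIGNMENT;
}
```

Passing 0x96000021 to esr_is_el1_alignment_fault returns true, while a value with the same EC but a different status code, such as 0x96000004, does not match.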

What makes alignment faults especially treacherous on bare metal is that we don’t yet have a fault handler. We haven’t set up the exception vector table yet; we will do so later. When an alignment fault fires without a handler, the CPU jumps to an unmapped address, and either hard-faults into a secondary fault or silently enters an undefined state. The boot loop appears to hang. No error message, no diagnostic, just silence.

The Arm Architecture Reference Manual ties alignment faults for non-SIMD loads and stores to the SCTLR_EL1.A bit. When the A bit is 1 (strict alignment checking), misaligned accesses always fault. When it’s 0 (the default reset value in QEMU), misaligned accesses may be handled transparently in hardware, depending on the exact instruction and the memory region type. The memory type matters for us: while the MMU is off, data accesses are treated as Device memory, and unaligned accesses to Device memory fault regardless of the A bit. QEMU is forgiving. Real silicon is not. Since we’re building for real hardware too, we write alignment-safe code.

Luckily, one ALIGN(8) in the linker script eliminates the entire problem.


What Broke (And Why)

The real thing I want to tell you about here isn’t a build failure. It’s about the BSS not being 8-byte aligned. This has been present since Post 2, but can cause issues when running our kernel on a physical Raspberry Pi 4 instead of QEMU.

On QEMU, the alignment-checking bit SCTLR_EL1.A defaults to 0. Misaligned 8-byte stores just work. QEMU’s memory model is forgiving in ways that real silicon isn’t. Every GDB session, every readelf, every make run looks perfect.

On the Pi, the first time the BSS zeroing loop hit a misaligned address, the CPU generated an alignment fault. With no exception vector table installed, the CPU fetched a handler from address 0x0, which was still unmapped. It faulted again on the next fetch, and then again and again. The Pi’s status LED blinked in a pattern that, after much documentation hunting, turned out to mean “synchronous abort from EL1 with no recovery path.” The kernel appeared to “just not boot.”

The fix is one line. Adding ALIGN(8) to the script prevents this problem on real hardware.


What’s Next

The address space is clean. Sections are named, aligned, and sized correctly. Symbols are exported and verified. This is the last time we’ll treat the linker script as its own topic. From here, it’s just a file that works in the background.

The next post in the series is probably also the hardest: virtual memory and the MMU. We’ll cover AArch64’s four-level page table structure, enabling the MMU, and the moment when physical addresses stop being the only addresses. After the next post, our kernel will finally have address isolation.

But the linker script we wrote today is the map the MMU will use. Without it, we’d be mapping a random pile of bytes. With it, we know exactly what’s at every address.


Sources