Kernel Crafting: Building, Running, and Debugging Your Custom Linux Kernel with Busybox and QEMU

In this step-by-step tutorial, we’ll walk through the entire process of building a custom Linux kernel, creating a minimal filesystem using Busybox, running it on QEMU, and debugging the kernel. Finally, we’ll wrap up by learning how to compile and add custom Linux kernel modules to enhance our kernel. I’m using a Linux system for this demonstration, specifically Ubuntu 16.04.7 LTS with kernel version 4.15.0-142-generic. However, the steps should be similar for other Linux distributions. Let’s dive in!

The code and other resources are available in my GitHub repository:


Before we begin, ensure you have all the necessary tools and libraries installed on your system. This includes development tools, compilers, and libraries essential for building the kernel.

$ sudo apt update
$ sudo apt install build-essential libncurses5-dev bison flex libssl-dev libelf-dev qemu qemu-kvm 

Downloading the Custom Kernel

First, let’s download the Linux kernel source code. We’ll choose version 3.16.49 for this example:

$ wget  https://cdn.kernel.org/pub/linux/kernel/v3.x/linux-3.16.49.tar.xz
$ tar xvf linux-3.16.49.tar.xz 
$ cd linux-3.16.49/

Configuring and Compiling the Kernel

Next, we’ll configure the kernel. For simplicity, we’ll use the default configuration:

$ make defconfig

When configuring the Linux kernel, you might want to start from the configuration file of your current Linux distribution. This helps ensure that the kernel configuration matches the settings and modules already in use on your system. To do this, copy one of the existing configuration files (/boot/config-xxx) into the Linux kernel source root directory and name it .config.
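
As a minimal sketch of that copy step, the helper below takes the running kernel's config from /boot and places it in the current directory. The function name seed_config is made up for this example, and the default path assumes a distribution that installs /boot/config-<release>, as Ubuntu does.

```sh
#!/bin/sh
# seed_config [SRC] [DEST]: copy a distribution config into the kernel tree.
# SRC defaults to the config of the running kernel, DEST to ./.config.
# (seed_config is a hypothetical helper name, not a kernel tool.)
seed_config() {
    src=${1:-/boot/config-$(uname -r)}
    dest=${2:-.config}
    if [ -r "$src" ]; then
        cp "$src" "$dest" && echo "copied $src to $dest"
    else
        echo "no readable config at $src" >&2
        return 1
    fi
}

# Typical use, from the kernel source root:
#   seed_config
#   make olddefconfig   # fill in defaults for options the copied file lacks
```

Because a distribution config usually belongs to a different kernel version than the tree being built, running make olddefconfig afterwards lets the kernel's own tooling reconcile the two.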

The following command provides a text-based menu interface that allows us to configure various kernel options, including enabling or disabling features, selecting specific device drivers, and more.

$ make menuconfig

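Before we compile the kernel, we need to enable debug symbols and a few other useful features, and disable KASLR so that kernel addresses stay the same on every boot. Open the .config file in a text editor and ensure these options are set. Keep in mind that options the chosen source tree does not define are dropped when the build regenerates the configuration; KASAN and KCOV, for example, were added in kernel releases newer than 3.16.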

CONFIG_KCOV=y
CONFIG_DEBUG_INFO=y
CONFIG_KASAN=y
CONFIG_KASAN_INLINE=y
CONFIG_CONFIGFS_FS=y
CONFIG_SECURITYFS=y
# CONFIG_RANDOMIZE_BASE is not set
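
Since the build can rewrite .config, it may be worth checking which of these options are actually present afterwards. The following is only a sketch: check_config is a hypothetical helper name, and it does nothing more than grep the file for the options listed above.

```sh
#!/bin/sh
# check_config [FILE]: report which of the options from this section are set.
# (check_config is a hypothetical helper, not part of the kernel tree.)
check_config() {
    cfg=${1:-.config}
    for opt in CONFIG_KCOV CONFIG_DEBUG_INFO CONFIG_KASAN CONFIG_KASAN_INLINE \
               CONFIG_CONFIGFS_FS CONFIG_SECURITYFS; do
        # Anchor both ends so CONFIG_KASAN does not match CONFIG_KASAN_INLINE.
        if grep -q "^$opt=y\$" "$cfg"; then
            echo "ok      $opt"
        else
            echo "missing $opt"
        fi
    done
    # KASLR should stay off so that symbol addresses are stable between boots.
    if grep -q '^CONFIG_RANDOMIZE_BASE=y$' "$cfg"; then
        echo "warning CONFIG_RANDOMIZE_BASE is enabled"
    else
        echo "ok      CONFIG_RANDOMIZE_BASE is off"
    fi
}

# Typical use, from the kernel source root:
#   check_config .config
```

Any option reported as missing after the build was either never set or is not known to that kernel version.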

Now, let’s compile the kernel. This process may take some time:

$ make -j$(nproc)

The -j flag is used to specify the number of jobs (or threads) to run simultaneously during compilation. In this case, $(nproc) is a command substitution that dynamically inserts the number of processing units available on your system.

nproc: This command prints the number of processing units (or CPU cores) available on your system. It’s a handy way to utilize all available cores for faster compilation.
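
To see what the command substitution expands to before starting a long build, you can print it first. This small sketch assumes a Linux host; the getconf fallback is an addition for systems where nproc is not installed.

```sh
#!/bin/sh
# Number of jobs to pass to make: nproc if available, otherwise ask getconf.
JOBS=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

# Show the command that "make -j$(nproc)" turns into on this machine.
echo "make -j$JOBS"
```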

After running the above command, the kernel image (bzImage) will be created. This file represents the compressed Linux kernel image that will be used to boot our system. This bzImage file is located in the arch/x86/boot/ directory within the Linux kernel source tree.

Creating a Minimal Filesystem with Busybox

Now, let’s work on creating a minimal filesystem with Busybox. To do this, we’ll first need to obtain the Busybox source code and extract it.

Download and Configure Busybox

$ wget https://busybox.net/downloads/busybox-1.31.1.tar.bz2
$ tar xvf busybox-1.31.1.tar.bz2
$ cd busybox-1.31.1/
$ make defconfig  # Use the default configuration

We’ll need to use busybox’s menuconfig interface to enable static linking:

$ make menuconfig
-> Busybox Settings
  -> Build Options
[*] Build static binary (no shared libs)  # Press y to select

After selecting the "Build static binary (no shared libs)" option in the make menuconfig interface, exit the menu by selecting “Exit” or pressing ‘Esc’ repeatedly until prompted to save changes. Then, proceed to build the Busybox filesystem:

Build Busybox

Build Busybox and install it to a temporary directory:

$ make -j$(nproc)  # Ignore "Trying libraries: crypt m resolv" error
$ file busybox # check if the compiled file is fine
busybox: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, for GNU/Linux 2.6.32, BuildID[sha1]=0fb9e344357bedd1287e715dc20b71511ebff5ce, stripped

The above output shows that the binary busybox is compiled as a statically linked executable, meaning all necessary libraries are included within the executable itself. This ensures that Busybox will run independently without relying on external libraries.

Finally, we’ll create the filesystem that includes Busybox. Running make install will generate a directory named _install, which mirrors a basic Linux filesystem structure.

$ make install
...
  ./_install//usr/sbin/udhcpd -> ../../bin/busybox
$ tree -d _install/
├── bin
├── sbin
└── usr
    ├── bin
    └── sbin

5 directories

Create the Minimal Filesystem Structure

Create a basic filesystem structure:

$ cd _install/
$ mkdir dev proc sys

Create Initialization Script

Now create a file called init in the _install directory and open it with a text editor. Paste the following content into it:

#!/bin/sh
mount -t devtmpfs none /dev
mount -t proc none /proc
mount -t sysfs none /sys

# clear the screen
clear

# Banner
echo " __________"
echo "< kernw0lf >"
echo " ----------"
echo "        \   ^__^"
echo "         \  (oo)\_______"
echo "            (__)\       )\/\\"
echo "                ||----w |"
echo "                ||     ||"
echo ""

# Display boot time
echo -e "\nBoot took $(cut -d' ' -f1 /proc/uptime) seconds\n"

# Welcome message
echo "H4ppy K3rnel H4cking!"

# Start the shell
exec /bin/sh

NOTE: I have also added a banner to the script. It is optional.

Make it executable:

$ chmod +x init
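
If you rebuild the filesystem often, the manual steps above can be scripted. The sketch below is one way to do it, with the banner left out; make_skeleton is a hypothetical helper name, and it expects the path of the _install directory as its argument.

```sh
#!/bin/sh
# make_skeleton ROOT: add the mount points and a minimal init to a Busybox tree.
# (make_skeleton is a hypothetical helper name used only in this example.)
make_skeleton() {
    root=$1
    # Empty directories that init will mount devtmpfs, proc and sysfs on.
    mkdir -p "$root/dev" "$root/proc" "$root/sys"
    # Quoted heredoc delimiter: the script body is written out verbatim.
    cat > "$root/init" <<'EOF'
#!/bin/sh
mount -t devtmpfs none /dev
mount -t proc none /proc
mount -t sysfs none /sys
exec /bin/sh
EOF
    chmod +x "$root/init"
}

# Typical use, from the Busybox source directory:
#   make_skeleton _install
```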

We’ve completed the setup for our custom Linux system using Busybox and a custom initialization script (init). Let’s summarize the steps we’ve taken:

  • Busybox Compilation: We compiled Busybox, which provides a single executable capable of providing various Linux utilities such as sh, echo, vi, and more.

  • Filesystem Creation: After compiling Busybox, we used make install to create a filesystem hierarchy (_install directory) containing these utilities as links to the Busybox executable. This filesystem structure resembles a basic Linux filesystem.

  • Custom Initialization Script: We created a shell script named init. The kernel runs it as the first user-space process once it has unpacked the initramfs.

  • Mounting Essential Directories: In the init script, we mounted essential special directories such as /dev, /proc, and /sys. These directories provide access to kernel information and system devices.

Create the Initramfs

To create the filesystem (initramfs) containing our custom Linux system, we’ll run the following commands inside the _install directory:

$ find . -print0 | cpio --null -ov --format=newc | gzip -9 > ../initramfs.cpio.gz
$ cd ..
$ file initramfs.cpio.gz
initramfs.cpio.gz: gzip compressed data

initramfs

The initramfs (initial RAM filesystem) contains the files needed for the Linux kernel to mount the root filesystem and start the system. It’s used during the early stages of the boot process.

After running the command, the initramfs.cpio.gz file will be created in the parent directory. This file contains the entire filesystem structure that we created using Busybox and the init script.
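
Before booting, a quick sanity check of the archive can save a debugging round trip. The sketch below verifies the gzip stream and looks for the ASCII magic 070701 that begins every newc cpio header; check_initramfs is a hypothetical helper name.

```sh
#!/bin/sh
# check_initramfs FILE: verify the gzip stream and the newc cpio magic.
# (check_initramfs is a hypothetical helper name used only in this example.)
check_initramfs() {
    img=$1
    # gzip -t tests the compressed data without writing anything out.
    if ! gzip -t "$img" 2>/dev/null; then
        echo "not a valid gzip file: $img"
        return 1
    fi
    # The first six bytes of a newc archive are the ASCII digits 070701.
    magic=$(gzip -dc "$img" | head -c 6)
    if [ "$magic" = "070701" ]; then
        echo "newc archive: $img"
    else
        echo "unexpected magic '$magic' in $img"
        return 1
    fi
}

# Typical use:
#   check_initramfs initramfs.cpio.gz
```

To see the file names stored in the archive, zcat initramfs.cpio.gz | cpio -t lists them without extracting.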

We’re now ready to boot our custom Linux system using QEMU or another virtualization platform.

Booting the Custom Kernel with QEMU

Before we can boot our custom kernel with QEMU, we need to make sure QEMU is installed. If you haven’t installed it yet, you can do so on Ubuntu or Debian-based systems with:

$ sudo apt-get update
$ sudo apt-get install qemu-system-x86

Now that we have QEMU installed, let’s proceed to boot our custom kernel with the minimal filesystem using QEMU:

$ qemu-system-x86_64 -kernel ../linux-3.16.49/arch/x86/boot/bzImage -initrd initramfs.cpio.gz -append "root=/dev/ram rw console=ttyS0" -nographic

  • -kernel: Path to your custom kernel image (bzImage).
  • -initrd: Path to the initramfs.cpio.gz file.
  • -append: Specifies kernel command-line parameters. Here, we specify:
      root=/dev/ram: Tells the kernel to use the RAM disk as the root filesystem.
      rw: Mount the root filesystem as read-write.
      console=ttyS0: Redirect kernel console output to the first serial port (ttyS0).
  • -nographic: Ensures that the output is displayed in the terminal.

If you don’t use the -nographic option, QEMU will open a graphical window to display the boot process of the kernel. We will use the terminal instead, because I ran into problems while debugging in the QEMU graphical window.

Debugging the Kernel with GDB

To enable debugging, we run QEMU with the -s option, which starts a GDB server on TCP port 1234. We’ll also add the -S option to freeze the CPU at startup until GDB tells it to continue:

$ qemu-system-x86_64 -kernel ../linux-3.16.49/arch/x86/boot/bzImage -initrd initramfs.cpio.gz -append "root=/dev/ram rw console=ttyS0" -nographic -s -S
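
Typing the full command for every boot gets tedious, so a small wrapper can help. The sketch below is one possible shape, not a fixed recipe: run_qemu and the KERNEL, INITRD, DEBUG and DRY_RUN variables are names invented for this example, and the default paths are the ones used in this tutorial.

```sh
#!/bin/sh
# Paths as used in this tutorial; override them through the environment.
KERNEL=${KERNEL:-../linux-3.16.49/arch/x86/boot/bzImage}
INITRD=${INITRD:-initramfs.cpio.gz}

# run_qemu: boot the custom kernel (hypothetical helper for this example).
#   DEBUG=1    adds -s -S so that GDB can attach before the kernel starts.
#   DRY_RUN=1  prints the command (without shell quoting) instead of running it.
run_qemu() {
    set -- qemu-system-x86_64 \
        -kernel "$KERNEL" \
        -initrd "$INITRD" \
        -append "root=/dev/ram rw console=ttyS0" \
        -nographic
    if [ "${DEBUG:-0}" = "1" ]; then
        set -- "$@" -s -S
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Typical use:
#   run_qemu            # normal boot
#   DEBUG=1 run_qemu    # wait for GDB on port 1234
```

With -nographic, pressing Ctrl+A followed by X terminates QEMU and returns you to the host shell.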

GDB

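In another terminal, start GDB with the uncompressed vmlinux image from the kernel source root, which carries the kernel symbols:

$ gdb ../linux-3.16.49/vmlinux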
(gdb) target remote localhost:1234
(gdb) c #Continue execution

Now you can set breakpoints, inspect memory, and step through code in GDB.

To stop the execution press Ctrl+C in the gdb window.

NOTE:

If GDB replies with something like the following when you press Ctrl+C, see the Solution below:

(gdb) c
Continuing.
^CRemote 'g' packet reply is too long: 000000000000000030c4ec81ffffffff0000000000000001000000000000000000000000000000000000000000000000c03ee081ffffffffb03ee081ffffffff000000000000000000000000000000003b2b3d000000000058065b8c03000000000000000000000000000000000000000000e081ffffffffedffffff000000009cc30081ffffffff4602000010000000180000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007f030000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000ffffffff000000ffffffffff0000000000000000ff000000ff000000000000000000000000ff0000000000ff000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000801f0000
Remote 'g' packet reply is too long: 000000000000000030c4ec81ffffffff0000000000000001000000000000000000000000000000000000000000000000c03ee081ffffffffb03ee081ffffffff000000000000000000000000000000003b2b3d000000000058065b8c03000000000000000000000000000000000000000000e081ffffffffedffffff000000009cc30081ffffffff4602000010000000180000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007f030000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000ffffffff000000ffffffffff0000000000000000ff000000ff000000000000000000000000ff0000000000ff000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000801f0000
Remote 'g' packet reply is too long: 000000000000000030c4ec81ffffffff0000000000000001000000000000000000000000000000000000000000000000c03ee081ffffffffb03ee081ffffffff000000000000000000000000000000003b2b3d000000000058065b8c03000000000000000000000000000000000000000000e081ffffffffedffffff000000009cc30081ffffffff4602000010000000180000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000007f030000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000ffffffff000000ffffffffff0000000000000000ff000000ff000000000000000000000000ff0000000000ff000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000801f0000

Solution

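This error usually means that GDB's idea of the register layout no longer matches what QEMU sends, which can happen with older GDB builds once the guest has switched to 64-bit mode. I ran into it on the Ubuntu machine, so I transferred vmlinux, bzImage, and initramfs.cpio.gz to my Kali machine (2024.1), which has a newer GDB.

Use the following commands: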

On Terminal 1:

$ qemu-system-x86_64 -kernel bzImage -initrd initramfs.cpio.gz -append "root=/dev/ram rw console=ttyS0" -nographic -s -S

On Terminal 2:

$ gdb ./vmlinux
(gdb) target remote :1234
(gdb) continue
Continuing.
# Hit Ctrl+C to stop execution
...
# To verify that we are able to read correct kernel symbols, we will print two important functions: prepare_kernel_cred and commit_creds
(gdb) print prepare_kernel_cred 
$1 = {<text variable, no debug info>} 0xffffffff81071e40 <prepare_kernel_cred>
(gdb) print commit_creds
$2 = {<text variable, no debug info>} 0xffffffff81071a30 <commit_creds>

(gdb) c # Now view the kallsyms on Terminal 1

On Terminal 1:

# cat /proc/kallsyms | grep prepare_kernel_cred
ffffffff81071e40 T prepare_kernel_cred
# cat /proc/kallsyms | grep commit_creds
ffffffff81071a30 T commit_creds

So far, we’ve printed the addresses of prepare_kernel_cred and commit_creds in GDB and verified that they match the entries in /proc/kallsyms.
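
Note that grep matches substrings, so it also prints any symbol whose name merely contains the search term. A small awk filter on the third column gives an exact match instead. This is a sketch; sym_addr is a hypothetical helper name, and it can be typed into the guest shell, since Busybox includes awk in its default configuration.

```sh
#!/bin/sh
# sym_addr SYMBOL [FILE]: print the address of the symbol with exactly this name.
# FILE defaults to /proc/kallsyms; each line there is "address type name".
# (sym_addr is a hypothetical helper name used only in this example.)
sym_addr() {
    awk -v sym="$1" '$3 == sym { print $1 }' "${2:-/proc/kallsyms}"
}

# Typical use inside the guest:
#   sym_addr prepare_kernel_cred
#   sym_addr commit_creds
```

The printed address should be the same value that GDB shows for print prepare_kernel_cred, as long as KASLR is disabled.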

Next, we will set breakpoints on kernel functions and trigger them by making system calls. Here are several commonly used kernel functions where breakpoints are useful during debugging:

  • start_kernel: This function is the entry point of the Linux kernel.
  • do_sys_open: This function is responsible for handling the open() system call.
  • sys_read: Reads up to count bytes from a file descriptor (fd) into a buffer.
  • sys_write: Writes data to a file descriptor.
  • sys_close: Closes a file descriptor.
  • sys_mkdir: Creates a new directory.
  • sys_rmdir: Removes a directory.
  • sys_unlink: Removes a file.
  • sys_chmod: Changes file permissions.
  • sys_mmap: Maps files or devices into memory.
  • sys_exit: Handles process termination.

We will now set up a breakpoint on sys_mkdir:

(gdb) break sys_mkdir
Breakpoint 1 at 0xffffffff8116e364
(gdb) c
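
The connect, break and continue sequence can also be kept in a GDB command file, so that a fresh session is ready with one command. The sketch below only writes that file; write_gdb_script and the file name mkdir.gdb are made up for this example.

```sh
#!/bin/sh
# write_gdb_script [FILE]: write the GDB commands used in this section to FILE
# and print the file name. (Hypothetical helper; FILE defaults to mkdir.gdb.)
write_gdb_script() {
    out=${1:-mkdir.gdb}
    cat > "$out" <<'EOF'
target remote :1234
break sys_mkdir
continue
EOF
    echo "$out"
}

# Typical use, with QEMU already waiting under -s -S:
#   gdb -x "$(write_gdb_script)" ./vmlinux
```

GDB's -x option reads and executes the commands from the given file after loading vmlinux.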

Now, we will create a directory named AAAA in Terminal 1.

# mkdir AAAA

On Terminal 2, we can see that we hit the function sys_mkdir().

Breakpoint 1, 0xffffffff8116e364 in sys_mkdir ()

Now, let’s view the registers.

(gdb) info registers  # I am using pwndbg :)
RAX  0x53
RBX  0xffffffff
RCX  0x464a40 ◂— pxor xmm0, xmm0
RDX  0x0
RDI  0x7ffcbda4cfc8 ◂— 0x4c48530041414141 /* 'AAAA' */
RSI  0x1ff
R8   0x0
R9   0x0
R10  0x464a40 ◂— pxor xmm0, xmm0
R11  0x246
R12  0x7ffcbda4cfc8 ◂— 0x4c48530041414141 /* 'AAAA' */
R13  0xffffffff
R14  0x0
R15  0x0
RBP  0xffff88000655ff78 —▸ 0x7ffcbda4cfc8 ◂— 0x4c48530041414141 /* 'AAAA' */
RSP  0xffff88000655ff78 —▸ 0x7ffcbda4cfc8 ◂— 0x4c48530041414141 /* 'AAAA' */
RIP  0xffffffff8116e364 (sys_mkdir+4) ◂— push r15

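We can see that registers RDI and R12 contain a pointer to the string “AAAA”. RDI carries the first system call argument (the pathname), and RSI carries the second (the mode, 0x1ff, which is 0777 in octal). We can modify the string… :)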
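
The value pwndbg shows behind the pointer, 0x4c48530041414141, is the first eight bytes at that address read as one little-endian number. The sketch below splits such a value back into bytes in memory order and prints the mode from RSI in octal; le_bytes is a hypothetical helper name.

```sh
#!/bin/sh
# le_bytes HEX: print the bytes of a little-endian value, lowest address first.
# (le_bytes is a hypothetical helper name used only in this example.)
le_bytes() {
    hex=${1#0x}
    # Pad to an even number of digits so the value splits cleanly into bytes.
    if [ $(( ${#hex} % 2 )) -eq 1 ]; then
        hex=0$hex
    fi
    out=""
    while [ -n "$hex" ]; do
        pair=${hex#"${hex%??}"}    # last two digits: the lowest remaining byte
        hex=${hex%??}
        out="$out${out:+ }$pair"
    done
    echo "$out"
}

le_bytes 0x4c48530041414141   # prints: 41 41 41 41 00 53 48 4c
printf '%o\n' 0x1ff           # prints: 777
```

The four 41 bytes are the letters AAAA and the 00 that follows is the string terminator, which matches what the x/c commands below show.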

VERIFY:

(gdb) x/c 0x7ffcbda4cfc8
0x7ffcbda4cfc8:	65 'A'
(gdb) x/c 0x7ffcbda4cfc9
0x7ffcbda4cfc9:	65 'A'
(gdb) x/c 0x7ffcbda4cfca
0x7ffcbda4cfca:	65 'A'
(gdb) x/c 0x7ffcbda4cfcb
0x7ffcbda4cfcb:	65 'A'
(gdb) x/c 0x7ffcbda4cfcc
0x7ffcbda4cfcc:	0 '\000'
(gdb) set {char}0x7ffcbda4cfc8 = 'K'
(gdb) set {char}0x7ffcbda4cfc9 = 'E'
(gdb) set {char}0x7ffcbda4cfca = 'R'
(gdb) set {char}0x7ffcbda4cfcb = 'N'
(gdb) c
Continuing.
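
For longer names, the set commands can be generated instead of typed by hand. The sketch below prints one command per character; gen_patch is a hypothetical helper name, and it assumes the new name contains only plain letters or digits and is no longer than the original, so that the terminating zero byte is left alone.

```sh
#!/bin/sh
# gen_patch ADDR STRING: print one GDB "set {char}" command per character,
# starting at ADDR. (Hypothetical helper; assumes STRING has no quote
# characters and fits within the original string.)
gen_patch() {
    addr=$1
    rest=$2
    while [ -n "$rest" ]; do
        ch=${rest%"${rest#?}"}     # first character of what is left
        rest=${rest#?}
        printf "set {char}0x%x = '%s'\n" "$addr" "$ch"
        addr=$(($addr + 1))
    done
}

gen_patch 0x7ffcbda4cfc8 KERN
```

The output can be pasted into the GDB prompt while the breakpoint is hit. The address is specific to this run, so take it from RDI in your own session.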

Now, let’s take a look at Terminal 1.

# ls
KERN     dev      linuxrc  root     sys
bin      init     proc     sbin     usr

There you have it! By hitting a breakpoint at the sys_mkdir() kernel function, we were able to manipulate the memory content and change the directory name. Initially, we used mkdir AAAA, but through modifying the memory content, we ended up creating a directory named KERN.

Congratulations! You’ve successfully built a custom Linux kernel, created a minimal filesystem with Busybox, run it on QEMU, and even debugged the kernel using GDB. This tutorial has given you hands-on experience in kernel development and embedded system basics.

Customizing kernels and building minimal filesystems are fundamental skills in the world of Linux and embedded systems. Feel free to explore more kernel configurations, Busybox features, and QEMU options to deepen your understanding.

Now you’re equipped with the knowledge to create and debug custom Linux kernels. Happy kernel hacking!

References