pwnlib.elf.elf — ELF Files
Exposes functionality for manipulating ELF files
Stop hard-coding things! Look them up at runtime with
>>> e = ELF('/bin/cat')
>>> print(hex(e.address))
0x400000
>>> print(hex(e.symbols['write']))
0x401680
>>> print(hex(e.got['write']))
0x60b070
>>> print(hex(e.plt['write']))
0x401680
You can even patch and save the files.
>>> e = ELF('/bin/cat')
>>> e.read(e.address+1, 3)
b'ELF'
>>> e.asm(e.address, 'ret')
>>> e.save('/tmp/quiet-cat')
>>> disasm(open('/tmp/quiet-cat','rb').read(1))
' 0: c3 ret'
Encapsulates information about an ELF file.
>>> bash = ELF(which('bash'))
>>> hex(bash.symbols['read'])
0x41dac0
>>> hex(bash.plt['read'])
0x41dac0
>>> u32(bash.read(bash.got['read'], 4))
0x41dac6
>>> print(bash.disasm(bash.plt.read, 16))
0: ff 25 1a 18 2d 00    jmp    QWORD PTR [rip+0x2d181a]    # 0x2d1820
6: 68 59 00 00 00       push   0x59
b: e9 50 fa ff ff       jmp    0xfffffffffffffa60
Abstract classes can override this to customize issubclass().
This is invoked early on by abc.ABCMeta.__subclasscheck__(). It should return True, False or NotImplemented. If it returns NotImplemented, the normal algorithm is used. Otherwise, it overrides the normal algorithm (and the outcome is cached).
Returns the uncompressed contents of the provided DWARF section.
Get the string table section corresponding to the section header table.
Given a section header, find this section’s name in the file’s string table
Create a SymbolTableIndexSection object
Parses the ELF file header and assigns the result to attributes of this object.
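As a rough illustration of what parsing the file header into attributes involves, the sketch below unpacks a 64-bit little-endian ELF header with the standard `struct` module. The `Header` class and the synthetic header bytes are made up for this example; the field names follow the ELF specification, and this is not the parser used by the library itself.

```python
import struct

# ELF64 little-endian header layout (64 bytes in total).
ELF64_HEADER = struct.Struct('<16sHHIQQQIHHHHHH')
FIELDS = ('e_ident', 'e_type', 'e_machine', 'e_version', 'e_entry',
          'e_phoff', 'e_shoff', 'e_flags', 'e_ehsize', 'e_phentsize',
          'e_phnum', 'e_shentsize', 'e_shnum', 'e_shstrndx')


class Header:
    """Hypothetical helper: exposes each header field as an attribute."""

    def __init__(self, data):
        if data[:4] != b'\x7fELF':
            raise ValueError('not an ELF file')
        for name, value in zip(FIELDS, ELF64_HEADER.unpack_from(data)):
            setattr(self, name, value)


# Synthetic header: 64-bit class, little-endian, ET_EXEC (2), EM_X86_64 (62).
ident = b'\x7fELF' + bytes([2, 1, 1]) + bytes(9)
raw = ELF64_HEADER.pack(ident, 2, 62, 1, 0x401000, 64, 0, 0, 64, 56, 1, 64, 0, 0)
hdr = Header(raw)
print(hex(hdr.e_entry))
```

A 32-bit or big-endian file would need a different format string; the sketch deliberately ignores those cases.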
patch_elf_and_read_maps(self) -> dict
Reads /proc/self/maps as if the ELF were executing.
This is done by replacing the code at the entry point with shellcode which dumps /proc/self/maps and exits, and actually executing the binary.
Returns a dict mapping file paths to the lowest address they appear at. No translation is done for e.g. QEMU emulation; the raw results are returned.
If there is not enough space to inject the shellcode in the segment which contains the entry point, returns
These tests are just to ensure that our shellcode is correct.
>>> for arch in CAT_PROC_MAPS_EXIT:
...     context.clear()
...     with context.local(arch=arch):
...         sc = shellcraft.cat2("/proc/self/maps")
...         sc += shellcraft.exit()
...         sc = asm(sc)
...         sc = enhex(sc)
...         assert sc == CAT_PROC_MAPS_EXIT[arch], (arch, sc)
Builds a dict of ‘functions’ (i.e. symbols of type ‘STT_FUNC’) by function name that map to a tuple consisting of the func address and size in bytes.
>>> from os.path import exists
>>> bash = ELF(which('bash'))
>>> all(map(exists, bash.libs.keys()))
True
>>> any(map(lambda x: 'libc' in x, bash.libs.keys()))
True
Loads the PLT symbols
>>> path = pwnlib.data.elf.path
>>> for test in glob(os.path.join(path, 'test-*')):
...     test = ELF(test)
...     assert '__stack_chk_fail' in test.got, test
...     if test.arch != 'ppc':
...         assert '__stack_chk_fail' in test.plt, test
>>> bash = ELF(which('bash'))
>>> bash.symbols['_start'] == bash.entry
True
Adds symbols from the GOT and PLT to the symbols dictionary.
Does not overwrite any existing symbols, and prefers PLT symbols.
Synthetic plt.xxx and got.xxx symbols are added for each PLT and GOT entry, respectively.
>>> bash = ELF(which('bash'))
>>> bash.symbols.wcscmp == bash.plt.wcscmp
True
>>> bash.symbols.wcscmp == bash.symbols.plt.wcscmp
True
>>> bash.symbols.stdin == bash.got.stdin
True
>>> bash.symbols.stdin == bash.symbols.got.stdin
True
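The merge rules described here (never overwrite an existing symbol, prefer PLT over GOT, add synthetic plt.xxx and got.xxx names) could look roughly like this in plain Python. The function name `populate_synthetic_symbols` and the sample addresses are invented for illustration; this is a sketch of the rules, not the library's own code.

```python
def populate_synthetic_symbols(symbols, plt, got):
    """Return a copy of `symbols` extended with PLT and GOT entries."""
    merged = dict(symbols)
    # The PLT is processed first, so a PLT address wins over a GOT address
    # when both tables contain the same name.
    for prefix, table in (('plt', plt), ('got', got)):
        for name, address in table.items():
            merged.setdefault(name, address)             # keep existing symbols
            merged['%s.%s' % (prefix, name)] = address   # synthetic alias
    return merged


symbols = {'main': 0x401000, 'write': 0x401680}
plt = {'write': 0x401030, 'read': 0x401040}
got = {'read': 0x404018, 'stdin': 0x404060}
merged = populate_synthetic_symbols(symbols, plt, got)
print(sorted(merged))
```

Here `write` keeps its original address, `read` resolves to the PLT entry, and `stdin` falls back to the GOT entry because no other table defines it.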
Read the contents of a DWARF section from the stream and return a DebugSectionDescriptor. Apply relocations if asked to.
Assembles the specified instructions and inserts them into the ELF at the specified address.
This modifies the ELF in-place. The resulting binary can be saved with ELF.save().
Prints out information in the binary, similar to
debug(argv=[], *a, **kw) → tube
Debug the ELF with
- argv (list) – List of arguments to the binary
- *args – Extra arguments to
- **kwargs – Extra arguments to
Disables NX for the ELF.
Zeroes out the
disasm(address, n_bytes) → str
Returns a string of disassembled instructions at the specified virtual memory address
dynamic_by_tag(tag) → tag
Parameters: tag (str) – Named
dynamic_string(offset) → bytes
Fetches an enumerated string from the
Parameters: offset (int) – String index
Returns: str – String from the table as raw bytes.
dynamic_value_by_tag(tag) → int
Retrieve the value from a dynamic tag a la
If the tag is missing, returns
fit(address, *a, **kw)
Writes fitted data into the specified address.
flat(address, *a, **kw)
Writes a full array of values to the specified address.
from_assembly(assembly) → ELF
Given an assembly listing, return a fully loaded ELF object which contains that assembly at its entry point.
>>> e = ELF.from_assembly('nop; foo: int 0x80', vma = 0x400000)
>>> e.symbols['foo'] = 0x400001
>>> e.disasm(e.entry, 1)
' 400000: 90 nop'
>>> e.disasm(e.symbols['foo'], 2)
' 400001: cd 80 int 0x80'
from_bytes(bytes) → ELF
Given a sequence of bytes, return a fully loaded ELF object which contains those bytes at its entry point.
>>> e = ELF.from_bytes(b'\x90\xcd\x80', vma=0xc000)
>>> print(e.disasm(e.entry, 3))
c000: 90       nop
c001: cd 80    int 0x80
Generally, a shared library or executable contains one .ARM.exidx section, while an object file contains many. We must therefore traverse every section and keep those whose type is SHT_ARM_EXIDX.
Get a section from the file, by name. Return None if no such section exists.
Gets the index of the section by name. Return None if no such section name exists.
get_segment_for_address(address, size=1) → Segment
Given a virtual address described by a PT_LOAD segment, return the first segment which describes the virtual address. An optional size may be provided to ensure the entire range falls into the same segment.
Either returns a
Check whether this file appears to have an ARM exception handler index table.
offset_to_vaddr(offset) → int
Translates the specified offset to a virtual address.
Parameters: offset (int) – Offset to translate
Returns: int – Virtual address which corresponds to the file offset, or
This example shows that regardless of changes to the virtual address layout by modifying ELF.address, the offset for any given address doesn’t change.
>>> bash = ELF('/bin/bash')
>>> bash.address == bash.offset_to_vaddr(0)
True
>>> bash.address += 0x123456
>>> bash.address == bash.offset_to_vaddr(0)
True
process(argv=[], *a, **kw) → process
Execute the binary with process. Note that argv is a list of arguments, and should not include
read(address, count) → bytes
Read data from the specified virtual address
The simplest example is just to read the ELF header.
>>> bash = ELF(which('bash'))
>>> bash.read(bash.address, 4)
b'\x7fELF'
ELF segments do not have to contain all of the data on-disk that gets loaded into memory.
First, let’s create an ELF file that has some code in two sections.
>>> assembly = '''
... .section .A,"awx"
... .global A
... A: nop
... .section .B,"awx"
... .global B
... B: int3
... '''
>>> e = ELF.from_assembly(assembly, vma=False)
By default, these come right after each other in memory.
>>> e.read(e.symbols.A, 2)
b'\x90\xcc'
>>> e.symbols.B - e.symbols.A
1
Let’s move the sections so that B is a little bit further away.
>>> objcopy = pwnlib.asm._objcopy()
>>> objcopy += [
...     '--change-section-vma', '.B+5',
...     '--change-section-lma', '.B+5',
...     e.path
... ]
>>> subprocess.check_call(objcopy)
0
Now let’s re-load the ELF, and check again
>>> e = ELF(e.path)
>>> e.symbols.B - e.symbols.A
6
>>> e.read(e.symbols.A, 2)
b'\x90\x00'
>>> e.read(e.symbols.A, 7)
b'\x90\x00\x00\x00\x00\x00\xcc'
>>> e.read(e.symbols.A, 10)
b'\x90\x00\x00\x00\x00\x00\xcc\x00\x00\x00'
Everything is relative to the user-selected base address, so moving things around keeps everything working.
>>> e.address += 0x1000
>>> e.read(e.symbols.A, 10)
b'\x90\x00\x00\x00\x00\x00\xcc\x00\x00\x00'
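The zero-filled gaps in the outputs above can be modelled with a small stand-alone function. This is a simplified sketch that assumes each segment is just a `(vaddr, data)` pair and that any address not backed by file data reads as zero; the helper name `read_virtual` and the sample addresses are hypothetical, not part of the library.

```python
def read_virtual(segments, address, count):
    """Read `count` bytes at virtual `address` from (vaddr, data) pairs."""
    out = bytearray(count)  # bytes not covered by any segment stay zero
    for vaddr, data in segments:
        start = max(address, vaddr)
        stop = min(address + count, vaddr + len(data))
        if start < stop:  # the request overlaps this segment
            out[start - address:stop - address] = data[start - vaddr:stop - vaddr]
    return bytes(out)


# Two one-byte "sections" placed six bytes apart, as in the example above.
segments = [(0x1000, b'\x90'), (0x1006, b'\xcc')]
print(read_virtual(segments, 0x1000, 7))
```

Only the bytes present in a segment come from file data; everything else is filled in, which is why reading past `A` returns zeros until `B` is reached.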
Save the ELF to a file
>>> bash = ELF(which('bash'))
>>> bash.save('/tmp/bash_copy')
>>> copy = open('/tmp/bash_copy', 'rb')
>>> bash = open(which('bash'), 'rb')
>>> bash.read() == copy.read()
True
search(needle, writable = False, executable = False) → generator
Search the ELF’s virtual address space for the specified string.
Does not search empty space between segments, or uninitialized data. This will only return data that actually exists in the ELF file. Searching for a long string of NULL bytes probably won’t work.
An iterator for each virtual address that matches.
An ELF header starts with the bytes \x7fELF, so we should be able to find it easily.
>>> bash = ELF('/bin/bash')
>>> bash.address + 1 == next(bash.search(b'ELF'))
True
We can also search for strings in the binary.
>>> len(list(bash.search(b'GNU bash'))) > 0
True
It is also possible to search for instructions in executable sections.
>>> binary = ELF.from_assembly('nop; mov eax, 0; jmp esp; ret')
>>> jmp_addr = next(binary.search(asm('jmp esp'), executable = True))
>>> binary.read(jmp_addr, 2) == asm('jmp esp')
True
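To make the search behaviour concrete, here is a minimal generator over in-memory segments. It assumes each segment is a `(vaddr, flags, data)` tuple using the standard `PF_X`, `PF_W`, and `PF_R` permission bits; the function and the sample data are illustrative only and do not reproduce the library's implementation.

```python
PF_X, PF_W, PF_R = 1, 2, 4  # ELF segment permission bits


def search_segments(segments, needle, writable=False, executable=False):
    """Yield every virtual address at which `needle` occurs."""
    for vaddr, flags, data in segments:
        if writable and not flags & PF_W:
            continue
        if executable and not flags & PF_X:
            continue
        offset = data.find(needle)
        while offset != -1:
            yield vaddr + offset
            offset = data.find(needle, offset + 1)  # also finds overlapping hits


segments = [
    (0x400000, PF_R | PF_X, b'\x7fELF\xff\xe4'),  # header bytes plus "jmp esp"
    (0x600000, PF_R | PF_W, b'GNU bash ELF'),
]
print([hex(a) for a in search_segments(segments, b'ELF')])
```

Because only bytes stored in a segment are scanned, gaps between segments and uninitialized data are never matched, in line with the caveat in the description above.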
section(name) → bytes
Gets data for the named section.
Parameters: name (str) – Name of the section
Returns: str – String containing the bytes for that section
string(address) → str
Reads a null-terminated string from the specified address.
Returns a str with the string contents (NUL terminator is omitted), or an empty string if no NUL terminator could be found.
vaddr_to_offset(address) → int
Translates the specified virtual address to a file offset.
Parameters: address (int) – Virtual address to translate
Returns: int – Offset within the ELF file which corresponds to the address, or None if the address is not mapped (as the last check below shows).
>>> bash = ELF(which('bash'))
>>> bash.vaddr_to_offset(bash.address)
0
>>> bash.address += 0x123456
>>> bash.vaddr_to_offset(bash.address)
0
>>> bash.vaddr_to_offset(0) is None
True
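The translation between virtual addresses and file offsets can be sketched with a few lines of arithmetic over PT_LOAD-style segments. The `Segment` tuple, the `load_delta` parameter (how far the image has been rebased), and the sample values are assumptions made for this example rather than the library's internals.

```python
from collections import namedtuple

# Hypothetical, simplified view of a PT_LOAD program header.
Segment = namedtuple('Segment', 'p_vaddr p_offset p_filesz')


def vaddr_to_offset(segments, address, load_delta=0):
    """Map a virtual address to a file offset, or None if unmapped."""
    address -= load_delta  # undo any rebasing first
    for seg in segments:
        if seg.p_vaddr <= address < seg.p_vaddr + seg.p_filesz:
            return address - seg.p_vaddr + seg.p_offset
    return None


def offset_to_vaddr(segments, offset, load_delta=0):
    """Map a file offset to a virtual address, or None if not in a segment."""
    for seg in segments:
        if seg.p_offset <= offset < seg.p_offset + seg.p_filesz:
            return offset - seg.p_offset + seg.p_vaddr + load_delta
    return None


segments = [Segment(0x400000, 0x0, 0x1000), Segment(0x601000, 0x1000, 0x200)]
print(vaddr_to_offset(segments, 0x601010))
```

Rebasing only changes `load_delta`, so the file offset for a given logical location stays the same, which is the property the doctest above demonstrates.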
Writes data to the specified virtual address
This routine does not check the bounds on the write to ensure that it stays in the same segment.
>>> bash = ELF(which('bash'))
>>> bash.read(bash.address+1, 3)
b'ELF'
>>> bash.write(bash.address, b"HELO")
>>> bash.read(bash.address, 4)
b'HELO'
Address of the lowest segment loaded in the ELF.
When updated, the addresses of the following fields are also updated:
However, the following fields are NOT updated:
>>> bash = ELF('/bin/bash')
>>> read = bash.symbols['read']
>>> text = bash.get_section_by_name('.text').header.sh_addr
>>> bash.address += 0x1000
>>> read + 0x1000 == bash.symbols['read']
True
>>> text == bash.get_section_by_name('.text').header.sh_addr
True
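One way to picture this behaviour is a property setter that shifts derived addresses by the same delta while leaving parsed header values alone. The `RebasableImage` class below is a made-up stand-in with sample addresses, intended only to illustrate the idea.

```python
class RebasableImage:
    """Hypothetical stand-in for an ELF whose base address can be changed."""

    def __init__(self, address, symbols, section_addrs):
        self._address = address
        self.symbols = dict(symbols)              # rebased together with address
        self.section_addrs = dict(section_addrs)  # raw header values, never rebased

    @property
    def address(self):
        return self._address

    @address.setter
    def address(self, new_address):
        delta = new_address - self._address
        self.symbols = {name: addr + delta for name, addr in self.symbols.items()}
        self._address = new_address


image = RebasableImage(0x400000, {'read': 0x41dac0}, {'.text': 0x41e000})
image.address += 0x1000
print(hex(image.symbols['read']))
```

After the update, the symbol has moved by 0x1000 while the stored section address is unchanged, matching the two checks in the doctest above.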
Architecture of the file (e.g.
Whether the current binary uses an executable stack.
This is based on whether a PT_GNU_STACK program header is present, and on its flags.
The p_flags member specifies the permissions on the segment containing the stack and is used to indicate whether the stack should be executable. The absence of this header indicates that the stack will be executable.
In particular, if the header is missing the stack is executable. If the header is present, it may explicitly mark that the stack is executable.
This is only somewhat accurate. When using the GNU Linker, it uses DEFAULT_STACK_PERMS to decide whether a lack of PT_GNU_STACK should mark the stack as executable:
/* On most platforms presume that PT_GNU_STACK is absent and the stack is
 * executable. Other platforms default to a nonexecutable stack and don't
 * need PT_GNU_STACK to do so. */
uint_fast16_t stack_flags = DEFAULT_STACK_PERMS;
By searching the source for DEFAULT_STACK_PERMS, we can see which architectures have which settings.
$ git grep '#define DEFAULT_STACK_PERMS' | grep -v PF_X
sysdeps/aarch64/stackinfo.h:31:#define DEFAULT_STACK_PERMS (PF_R|PF_W)
sysdeps/nios2/stackinfo.h:31:#define DEFAULT_STACK_PERMS (PF_R|PF_W)
sysdeps/tile/stackinfo.h:31:#define DEFAULT_STACK_PERMS (PF_R|PF_W)
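The decision described in this section can be summarised in a few lines. The sketch below assumes program headers are given as `(p_type, p_flags)` pairs and takes the per-architecture default as a parameter; the function name and the sample headers are hypothetical, while the `PT_GNU_STACK` and `PF_X` values are the standard ELF constants.

```python
PT_LOAD = 1
PT_GNU_STACK = 0x6474e551
PF_X = 0x1


def has_executable_stack(program_headers, default_executable=True):
    """Decide stack executability from (p_type, p_flags) pairs.

    `default_executable` models the platform default that applies when the
    PT_GNU_STACK header is missing; it is True on most platforms.
    """
    for p_type, p_flags in program_headers:
        if p_type == PT_GNU_STACK:
            return bool(p_flags & PF_X)  # header present: its flags decide
    return default_executable            # header absent: platform default


print(has_executable_stack([(PT_LOAD, 5), (PT_GNU_STACK, 6)]))
```

Passing `default_executable=False` models platforms such as the ones listed in the grep output, whose default stack permissions exclude PF_X.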
List of all segments which are executable.
If this ELF imports any libraries which contain 'libc[.-], and we can determine the appropriate path to it on the local system, returns a new ELF object pertaining to that library.
If not found, the value will be
Try to find the return address from main into __libc_start_main. The heuristic to find the call to the function pointer of main is to list all calls inside __libc_start_main, find the call to exit after the call to main and select the previous call.
List of all segments which are NOT writeable.
Whether the current binary uses NX protections.
Specifically, we are checking for READ_IMPLIES_EXEC being set by the kernel, as a result of honoring PT_GNU_STACK in the kernel.
The Linux kernel directly honors PT_GNU_STACK to mark the stack as executable.
case PT_GNU_STACK:
    if (elf_ppnt->p_flags & PF_X)
        executable_stack = EXSTACK_ENABLE_X;
    else
        executable_stack = EXSTACK_DISABLE_X;
    break;
Additionally, it then sets read_implies_exec, so that all readable pages are executable.
if (elf_read_implies_exec(loc->elf_ex, executable_stack))
    current->personality |= READ_IMPLIES_EXEC;
Whether the current binary uses RELRO protections.
This requires both the presence of the dynamic tag DT_BIND_NOW and a PT_GNU_RELRO program header.
The ELF Specification describes how the linker should resolve symbols immediately, as soon as a binary is loaded. This can be emulated with the
If present in a shared object or executable, this entry instructs the dynamic linker to process all relocations for the object containing this entry before transferring control to the program. The presence of this entry takes precedence over a directive to use lazy binding for this object when specified through the environment or via
Separately, an extension to the GNU linker allows a binary to specify a PT_GNU_RELRO program header, which describes the region of memory which is to be made read-only after relocations are complete.
Finally, a new-ish extension which doesn’t seem to have a canonical source of documentation is DF_BIND_NOW, which has supposedly superseded
If set in a shared object or executable, this flag instructs the dynamic linker to process all relocations for the object containing this entry before transferring control to the program. The presence of this entry takes precedence over a directive to use lazy binding for this object when specified through the environment or via
>>> path = pwnlib.data.elf.relro.path
>>> for test in glob(os.path.join(path, 'test-*')):
...     e = ELF(test)
...     expected = os.path.basename(test).split('-')
...     actual = str(e.relro).lower()
...     assert actual == expected
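The combination of checks described in this section might be expressed as follows. This sketch assumes the segment types and dynamic entries have already been extracted into plain Python containers, and the return labels ('Full', 'Partial', None) are an assumption about how the levels are named; the `PT_GNU_RELRO` and `DF_BIND_NOW` values are the standard ELF constants.

```python
PT_GNU_RELRO = 0x6474e552
DF_BIND_NOW = 0x8


def relro_level(segment_types, dynamic_tags):
    """Classify RELRO from program header types and dynamic entries.

    `segment_types` is an iterable of p_type values and `dynamic_tags` maps
    tag names to values; both are simplified, hypothetical inputs.
    """
    if PT_GNU_RELRO not in segment_types:
        return None  # no region is remapped read-only after relocation
    bind_now = ('DT_BIND_NOW' in dynamic_tags
                or bool(dynamic_tags.get('DT_FLAGS', 0) & DF_BIND_NOW))
    # Immediate binding lets the whole GOT be covered; lazy binding cannot.
    return 'Full' if bind_now else 'Partial'


print(relro_level([1, PT_GNU_RELRO], {'DT_FLAGS': DF_BIND_NOW}))
```

Both the legacy DT_BIND_NOW tag and the DF_BIND_NOW flag are accepted, since either instructs the dynamic linker to process all relocations before control reaches the program.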
List of all segments which are writeable and executable.
A list of elftools.elf.sections.Section objects for the sections in the ELF.
A list of elftools.elf.segments.Segment objects for the segments in the ELF.
Whether the current binary was built with Undefined Behavior Sanitizer (
List of all segments which are writeable.
Function(name, address, size, elf=None)
Encapsulates information about a function in an
__init__(name, address, size, elf=None)
x.__init__(…) initializes x; see help(type(x)) for signature
Wrapper to allow dotted access to dictionary elements.
Is a real dict object, but also serves up keys as attributes when reading attributes.
Supports recursive instantiation for keys which contain dots.
>>> x = pwnlib.elf.elf.dotdict()
>>> isinstance(x, dict)
True
>>> x['foo'] = 3
>>> x.foo
3
>>> x['bar.baz'] = 4
>>> x.bar.baz
4
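A dictionary with this kind of dotted access can be approximated in a few lines. The `DotDict` class below is an independent sketch written for illustration, not the library's source: it is a real dict subclass, and attribute lookups that miss fall back first to keys and then to keys sharing a dotted prefix.

```python
class DotDict(dict):
    """Sketch of a dict whose keys can also be read as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        if name in self:
            return self[name]
        prefix = name + '.'
        nested = {key[len(prefix):]: value
                  for key, value in self.items()
                  if isinstance(key, str) and key.startswith(prefix)}
        if nested:
            return DotDict(nested)  # allows chained access such as x.bar.baz
        raise AttributeError(name)


x = DotDict()
x['foo'] = 3
x['bar.baz'] = 4
print(x.foo, x.bar.baz)
```

Raising AttributeError for unknown names keeps `hasattr` and similar introspection working as expected.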