pwnlib.dynelf — Resolving remote functions using leaks

Resolve symbols in loaded, dynamically-linked ELF binaries. Given a function which can leak data at an arbitrary address, any symbol in any loaded library can be resolved.

Example

from pwn import *

# Assume a process or remote connection
p = process('./pwnme')

# Declare a function that takes a single address, and
# leaks at least one byte at that address.
def leak(address):
    # Placeholder: substitute the read primitive your exploit provides.
    data = p.read(address, 4)
    log.debug("%#x => %s" % (address, enhex(data or b'')))
    return data

# For the sake of this example, let's say that we
# have any of these pointers.  One is a pointer into
# the target binary, the other two are pointers into libc
main   = 0xfeedf4ce
libc   = 0xdeadb000
system = 0xdeadbeef

# With our leaker, and a pointer into our target binary,
# we can resolve the address of anything.
#
# We do not actually need to have a copy of the target
# binary for this to work.
d = DynELF(leak, main)
assert d.lookup(None,     'libc') == libc
assert d.lookup('system', 'libc') == system

# However, if we *do* have a copy of the target binary,
# we can speed up some of the steps.
d = DynELF(leak, main, elf=ELF('./pwnme'))
assert d.lookup(None,     'libc') == libc
assert d.lookup('system', 'libc') == system

# Alternately, we can resolve symbols inside another library,
# given a pointer into it.
d = DynELF(leak, libc + 0x1234)
assert d.lookup('system')      == system

DynELF

class pwnlib.dynelf.DynELF(leak, pointer=None, elf=None, libcdb=True)

DynELF knows how to resolve symbols in remote processes via an infoleak or memleak vulnerability encapsulated by pwnlib.memleak.MemLeak.

Implementation Details:

Resolving Functions:

In every ELF that exports symbols for other libraries to import (e.g. libc.so), a series of tables records the exported symbol names, the exported symbol addresses, and a hash of each exported name. Applying the hash function to the name of the desired symbol (e.g. 'printf') locates it in the hash table. Its position in the hash table is an index into the symbol table (symtab), whose entry holds the symbol's address and the offset of its name in the string table (strtab).

Assuming we have the base address of libc.so, the way to resolve the address of printf is to locate the symtab, strtab, and hash table. The string "printf" is hashed according to the style of the hash table (SYSV or GNU), and the hash table is walked until a matching entry is located. We can verify an exact match by checking the string table, and then get the offset into libc.so from the symtab.

Resolving Library Addresses:

If we have a pointer into a dynamically-linked executable, we can leverage an internal linker structure called the link map. This is a linked list structure which contains information about each loaded library, including its full path and base address.

A pointer to the link map can be found in two ways. Both are referenced from entries in the DYNAMIC array.

  • In non-RELRO binaries, a pointer is placed in the .got.plt area of the binary, which is located via the DT_PLTGOT entry.
  • In all binaries, a pointer can be found in the area pointed to by the DT_DEBUG entry. This exists even in stripped binaries.

For maximum flexibility, both mechanisms are used exhaustively.
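
To make the link map traversal concrete, here is a minimal sketch that walks a simulated link map. It assumes a 64-bit little-endian target and the glibc field order (l_addr, l_name, l_ld, l_next, l_prev). The fake memory, the addresses, and the helper names (make_memory, read_ptr, read_cstr, walk_link_map) are hypothetical and are not part of pwnlib.

```python
import struct

PTR = 8  # assumption: 64-bit little-endian target


def make_memory():
    """Build fake memory holding two link_map nodes and their name strings."""
    mem = {}

    def put(addr, data):
        for i, byte in enumerate(data):
            mem[addr + i] = byte

    # Assumed node layout: l_addr, l_name, l_ld, l_next, l_prev
    put(0x1000, struct.pack("<5Q", 0x400000, 0x3000, 0, 0x2000, 0))
    put(0x2000, struct.pack("<5Q", 0x7F0000000000, 0x3010, 0, 0, 0x1000))
    put(0x3000, b"\x00")                 # main executable: empty name
    put(0x3010, b"/lib/libc.so.6\x00")   # made-up library path
    return mem


def read_ptr(mem, addr):
    """Read one pointer-sized little-endian integer from fake memory."""
    raw = bytes(mem[addr + i] for i in range(PTR))
    return struct.unpack("<Q", raw)[0]


def read_cstr(mem, addr):
    """Read a NUL-terminated string from fake memory."""
    out = bytearray()
    while mem[addr] != 0:
        out.append(mem[addr])
        addr += 1
    return bytes(out)


def walk_link_map(mem, head):
    """Follow l_next from head and collect {library name: base address}."""
    libs = {}
    node = head
    while node:
        base = read_ptr(mem, node)                        # l_addr
        name = read_cstr(mem, read_ptr(mem, node + PTR))  # l_name
        libs[name] = base
        node = read_ptr(mem, node + 3 * PTR)              # l_next
    return libs
```

Against a real target, each read would go through the leak primitive instead of a dictionary, so every field access costs at least one leak.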

Instantiates an object which can resolve symbols in a running binary given a pwnlib.memleak.MemLeak leaker and a pointer inside the binary.

Parameters:
  • leak (MemLeak) – Instance of pwnlib.memleak.MemLeak for leaking memory
  • pointer (int) – A pointer into a loaded ELF file
  • elf (str,ELF) – Path to the ELF file on disk, or a loaded pwnlib.elf.ELF.
  • libcdb (bool) – Attempt to use libcdb to speed up libc lookups

__init__(leak, pointer=None, elf=None, libcdb=True)

_dynamic_load_dynelf(libname) → DynELF

Looks up information about a loaded library via the link map.

Parameters: libname (str) – Name of the library to resolve, or a substring (e.g. ‘libc.so’)
Returns: A DynELF instance for the loaded library, or None.

_find_dt(tag)

Find an entry in the DYNAMIC array.

Parameters: tag (int) – Single tag to find
Returns: Pointer to the data described by the specified entry.
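
The search through the DYNAMIC array can be sketched as follows. This is an illustration only, assuming 64-bit little-endian entries (an 8-byte d_tag followed by an 8-byte value) held in a local bytes object. The function find_dt and the sample array are hypothetical; the tag numbers are the standard ELF values.

```python
import struct

# Standard ELF dynamic tag values
DT_NULL, DT_PLTGOT, DT_DEBUG = 0, 3, 21
ENTRY_SIZE = 16  # assumption: Elf64_Dyn (8-byte tag + 8-byte value)


def find_dt(dynamic, tag):
    """Return the value of the first entry with the given tag, or None."""
    for off in range(0, len(dynamic) - ENTRY_SIZE + 1, ENTRY_SIZE):
        d_tag, d_val = struct.unpack_from("<qQ", dynamic, off)
        if d_tag == tag:
            return d_val
        if d_tag == DT_NULL:  # DT_NULL terminates the array
            return None
    return None


# Made-up DYNAMIC array: PLTGOT, DEBUG, then the terminator
SAMPLE_DYNAMIC = struct.pack(
    "<qQqQqQ",
    DT_PLTGOT, 0x601000,
    DT_DEBUG, 0x7FFFF7FFE100,
    DT_NULL, 0,
)
```

For example, find_dt(SAMPLE_DYNAMIC, DT_DEBUG) returns the made-up value stored next to the DT_DEBUG tag.
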
_find_dynamic_phdr()

Returns the address of the first Program Header with the type PT_DYNAMIC.
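
As a rough illustration of this step, the sketch below scans the program headers of a synthetic ELF64 image for PT_DYNAMIC. It assumes a 64-bit little-endian header layout and works on a local bytes object rather than leaked memory. The names find_dynamic_phdr and make_fake_elf64 are hypothetical, and the function returns an offset into the image, not a runtime address.

```python
import struct

PT_LOAD, PT_DYNAMIC = 1, 2  # standard ELF segment types


def find_dynamic_phdr(image):
    """Return the offset of the first PT_DYNAMIC program header, or None."""
    (e_phoff,) = struct.unpack_from("<Q", image, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 0x36)
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from("<I", image, off)
        if p_type == PT_DYNAMIC:
            return off
    return None


def make_fake_elf64():
    """Build a minimal fake ELF64 header followed by two program headers."""
    ehdr = bytearray(64)
    ehdr[:4] = b"\x7fELF"
    struct.pack_into("<Q", ehdr, 0x20, 64)       # e_phoff: right after the header
    struct.pack_into("<HH", ehdr, 0x36, 56, 2)   # e_phentsize, e_phnum
    # Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr,
    #             p_filesz, p_memsz, p_align (values are made up)
    load = struct.pack("<IIQQQQQQ", PT_LOAD, 5, 0, 0, 0, 0x1000, 0x1000, 0x1000)
    dyn = struct.pack("<IIQQQQQQ", PT_DYNAMIC, 6, 0x2000, 0x2000, 0x2000, 0x100, 0x100, 8)
    return bytes(ehdr) + load + dyn
```

On a live target the same fields would be read through the leaker, and the resulting offset would be added to the ELF base address.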

_find_linkmap(pltgot=None, debug=None)

The linkmap is a chained structure created by the loader at runtime which contains information on the names and load addresses of all libraries.

For non-RELRO binaries, a pointer to this is stored in the .got.plt area.

For RELRO binaries, a pointer is additionally stored in the DT_DEBUG area.

_find_linkmap_assisted(path)

Uses an ELF file to assist in finding the link_map.

_find_mapped_pages(readonly=False, page_size=4096)

A generator of all mapped pages, as found using the Program Headers.

Yields tuples of the form: (virtual address, memory size)

_lookup(symb)

Performs the actual symbol lookup within one ELF file.

_make_absolute_ptr(ptr_or_offset)

For shared libraries (or PIE executables), many ELF fields may contain offsets rather than actual pointers. If the ELF type is ‘DYN’, the argument may be an offset. It will not necessarily be an offset, because the run-time linker may have fixed it up to be a real pointer already. In this case an educated guess is made, and the ELF base address is added to the value if it is determined to be an offset.
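
One plausible form of that educated guess is sketched below. This is an assumption made for illustration, not necessarily the rule pwnlib applies: a value smaller than the load base is treated as an offset, anything else as an already-relocated pointer. The function name and its explicit libbase and elftype arguments are hypothetical.

```python
def make_absolute_ptr(ptr_or_offset, libbase, elftype):
    """Return an absolute pointer, rebasing values that look like offsets."""
    if elftype != "DYN":
        # Non-PIE executables store real addresses already.
        return ptr_or_offset
    if 0 <= ptr_or_offset < libbase:
        # Smaller than the load base: assume it is an offset.
        return ptr_or_offset + libbase
    # Otherwise assume the run-time linker has already fixed it up.
    return ptr_or_offset
```

The heuristic can only misfire if a genuine pointer is numerically smaller than the load base, which is unlikely for typical library mappings.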

_resolve_symbol_gnu(libbase, symb, hshtab, strtab, symtab)

Internal Documentation:

The GNU hash structure is a bit more complex than the normal hash structure.

Oracle has good documentation: https://blogs.oracle.com/ali/entry/gnu_hash_elf_sections

You can force an ELF to use this type of symbol table by compiling with ‘gcc -Wl,--hash-style=gnu’

_resolve_symbol_sysv(libbase, symb, hshtab, strtab, symtab)

Internal Documentation:

See the ELF manual for more information. Search for the phrase “A hash table of Elf32_Word objects supports symbol table access”, or see: https://docs.oracle.com/cd/E19504-01/802-6319/6ia12qkfo/index.html#chapter6-48031

struct Elf_Hash {
    uint32_t nbucket;
    uint32_t nchain;
    uint32_t bucket[nbucket];
    uint32_t chain[nchain];
}

You can force an ELF to use this type of symbol table by compiling with ‘gcc -Wl,--hash-style=sysv’
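
The bucket and chain walk implied by this structure can be sketched on local data as follows. This is a simplified model, not pwnlib's code: the symbol and string tables are plain Python lists, the symbol values are made up, and the helper names build_hash_table and resolve are hypothetical. The hash itself follows the standard System V ELF hash with 32-bit arithmetic.

```python
def sysv_hash(name):
    """System V ELF hash of a bytes name, using 32-bit arithmetic."""
    h = 0
    for byte in name:
        h = ((h << 4) + byte) & 0xFFFFFFFF
        g = h & 0xF0000000
        if g:
            h ^= g >> 24
        h &= ~g & 0xFFFFFFFF
    return h


def build_hash_table(names, nbucket):
    """Build bucket[] and chain[] for names; index 0 is the reserved null symbol."""
    bucket = [0] * nbucket
    chain = [0] * len(names)  # nchain equals the number of symbols
    for index in range(1, len(names)):
        slot = sysv_hash(names[index]) % nbucket
        chain[index] = bucket[slot]  # link to the previous head of this bucket
        bucket[slot] = index
    return bucket, chain


def resolve(name, bucket, chain, names, values):
    """Walk the chain for name's bucket until the name matches exactly."""
    index = bucket[sysv_hash(name) % len(bucket)]
    while index != 0:  # index 0 marks the end of the chain
        if names[index] == name:
            return values[index]
        index = chain[index]
    return None


# Made-up symbol table: names and their offsets into the library
NAMES = [b"", b"printf", b"system", b"exit"]
VALUES = [0, 0x64E80, 0x4F440, 0x43120]
BUCKET, CHAIN = build_hash_table(NAMES, nbucket=2)
```

The exact name comparison inside the loop corresponds to checking the string table, since different names can land in the same bucket.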

bases()

Resolve base addresses of all loaded libraries.

Return a dictionary mapping library path to its base address.

dump(libs = False, readonly = False)

Dumps the ELF’s memory pages to allow further analysis.

Parameters:
  • libs (bool, optional) – True to dump the libraries as well (False by default)
  • readonly (bool, optional) – True to dump read-only pages (False by default)
Returns:

A dictionary of the form { address : bytes }

static find_base(leak, ptr)

Given a pwnlib.memleak.MemLeak object and a pointer into a library, find its base address.
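
A common way to do this, sketched here as an assumption rather than as pwnlib's exact algorithm, is to round the pointer down to a page boundary and step backwards one page at a time until the ELF magic appears. The leak callable below takes an address and a byte count, which differs from the MemLeak interface. Both functions, including this simplified find_base, are hypothetical stand-ins.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages
ELF_MAGIC = b"\x7fELF"


def find_base(leak, ptr, page_size=PAGE_SIZE):
    """Scan backwards page by page from ptr until the ELF magic is found."""
    addr = ptr & ~(page_size - 1)  # round down to a page boundary
    while addr >= 0:
        if leak(addr, 4) == ELF_MAGIC:
            return addr
        addr -= page_size
    return None


def make_fake_leak(base):
    """Return a fake leaker that exposes the ELF magic only at base."""
    def leak(addr, count):
        if addr == base:
            return ELF_MAGIC[:count]
        return b"\x00" * count
    return leak
```

This relies on every page between the pointer and the base being readable through the leak; a gap in the mapping would need extra handling.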

heap()

Finds the beginning of the heap via __curbrk, which is an exported symbol in the linker, which points to the current brk.

lookup(symb = None, lib = None) → int

Find the address of symb, which is found in lib.

Parameters:
  • symb (str) – Named routine to look up. If omitted, the base address of the library will be returned.
  • lib (str) – Substring to match for the library name. If omitted, the current library is searched. If set to 'libc', 'libc.so' is assumed.
Returns:

Address of the named symbol, or None.

stack()

Finds a pointer to the stack via __environ, which is an exported symbol in libc, which points to the environment block.

__weakref__

list of weak references to the object (if defined)

dynamic

Returns: Pointer to the .DYNAMIC area.

elfclass

32 or 64

elftype

e_type from the elf header. In practice the value will almost always be ‘EXEC’ or ‘DYN’. If the value is architecture-specific (between ET_LOPROC and ET_HIPROC) or invalid, KeyError is raised.

libc

Leak the Build ID of the remote libc.so, download the file, and load an ELF object with the correct base address.

Returns: An ELF object, or None.

link_map

Pointer to the runtime link_map object

pwnlib.dynelf.gnu_hash(str) → int

Function used to generate GNU-style hashes for strings.
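
For reference, the GNU hash is commonly described as a 32-bit variant of the djb2 string hash. The sketch below is a standalone reimplementation for illustration, not the pwnlib source, and it takes a bytes argument, which is an assumption about the input type.

```python
def gnu_hash(name):
    """GNU-style symbol hash: h = h * 33 + c, starting at 5381, modulo 2**32."""
    h = 5381
    for byte in name:
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h
```

The empty string hashes to the initial value 5381, and b"printf" hashes to 0x156b2bb8.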

pwnlib.dynelf.sysv_hash(str) → int

Function used to generate SYSV-style hashes for strings.