pwnlib.util.sh_string — Shell Expansion is Hard

Routines here are for getting any NULL-terminated sequence of bytes evaluated intact by any shell. This includes all variants of quotes, whitespace, and non-printable characters.

Supported Shells

The following shells have been evaluated:

  • Ubuntu (dash/sh)

  • MacOS (GNU Bash)

  • Zsh

  • FreeBSD (sh)

  • OpenBSD (sh)

  • NetBSD (sh)

Debian Almquist shell (Dash)

Ubuntu 14.04 and 16.04 use the Dash shell, and /bin/sh is actually just a symlink to /bin/dash. The feature set supported when invoked as “sh” instead of “dash” is different, and we focus exclusively on the “/bin/sh” implementation.

From the Ubuntu Man Pages, every character except for single-quote can be wrapped in single-quotes, and a backslash can be used to escape unquoted single-quotes.

Quoting
  Quoting is used to remove the special meaning of certain characters or
  words to the shell, such as operators, whitespace, or keywords.  There
  are three types of quoting: matched single quotes, matched double quotes,
  and backslash.

Backslash
  A backslash preserves the literal meaning of the following character,
  with the exception of ⟨newline⟩.  A backslash preceding a ⟨newline⟩ is
  treated as a line continuation.

Single Quotes
  Enclosing characters in single quotes preserves the literal meaning of
  all the characters (except single quotes, making it impossible to put
  single-quotes in a single-quoted string).

Double Quotes
  Enclosing characters within double quotes preserves the literal meaning
  of all characters except dollarsign ($), backquote (`), and backslash
  (\).  The backslash inside double quotes is historically weird, and
  serves to quote only the following characters:
        $ ` " \ <newline>.
  Otherwise it remains literal.
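The single-quote rule quoted above can be turned into a small escaping routine. The following is a minimal sketch, not this module's implementation: `single_quote` is a hypothetical helper name, and Python's `shlex.split` is used only as a stand-in for a POSIX shell's word parsing.

```python
import shlex


def single_quote(s):
    """Wrap s in single quotes.

    Each embedded single quote closes the quoted run, is emitted as a
    backslash-escaped quote, and then reopens the quoted run.
    """
    return "'" + s.replace("'", "'\\''") + "'"


quoted = single_quote("foo'bar baz")
print(quoted)               # 'foo'\''bar baz'
print(shlex.split(quoted))  # ["foo'bar baz"]
```

Because nothing is special inside single quotes, the only character that needs handling is the single quote itself.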

GNU Bash

Bash is the default interactive shell on many systems, though it is generally not the system-wide default shell (i.e., the system() library function does not generally invoke it).

That said, its prevalence suggests that it also be addressed.

From the GNU Bash Manual, every character except for single-quote can be wrapped in single-quotes, and a backslash can be used to escape unquoted single-quotes.

3.1.2.1 Escape Character

A non-quoted backslash ‘\’ is the Bash escape character. It preserves the
literal value of the next character that follows, with the exception of
newline. If a \newline pair appears, and the backslash itself is not
quoted, the \newline is treated as a line continuation (that is, it
is removed from the input stream and effectively ignored).

3.1.2.2 Single Quotes

Enclosing characters in single quotes (‘'’) preserves the literal value of
each character within the quotes. A single quote may not occur between single
quotes, even when preceded by a backslash.

3.1.2.3 Double Quotes

Enclosing characters in double quotes (‘"’) preserves the literal value of
all characters within the quotes, with the exception of ‘$’, ‘`’, ‘\’, and,
when history expansion is enabled, ‘!’. The characters ‘$’ and ‘`’ retain their
special meaning within double quotes (see Shell Expansions). The backslash retains
its special meaning only when followed by one of the following characters:
‘$’, ‘`’, ‘"’, ‘\’, or newline. Within double quotes, backslashes that are
followed by one of these characters are removed. Backslashes preceding
characters without a special meaning are left unmodified. A double quote may
be quoted within double quotes by preceding it with a backslash. If enabled,
history expansion will be performed unless an ‘!’ appearing in double quotes
is escaped using a backslash. The backslash preceding the ‘!’ is not removed.

The special parameters ‘*’ and ‘@’ have special meaning when in double quotes
(see Shell Parameter Expansion).

Z Shell

The Z shell is also a relatively common user shell, even though it’s not generally the default system-wide shell.

From the Z Shell Manual, every character except for single-quote can be wrapped in single-quotes, and a backslash can be used to escape unquoted single-quotes.

A character may be quoted (that is, made to stand for itself) by preceding
it with a ‘\’. ‘\’ followed by a newline is ignored.

A string enclosed between ‘$'’ and ‘'’ is processed the same way as the
string arguments of the print builtin, and the resulting string is considered
to be entirely quoted. A literal ‘'’ character can be included in the string
by using the ‘\'’ escape.

All characters enclosed between a pair of single quotes ('') that is not
preceded by a ‘$’ are quoted. A single quote cannot appear within single
quotes unless the option RC_QUOTES is set, in which case a pair of single
quotes are turned into a single quote. For example,

print ''''
outputs nothing apart from a newline if RC_QUOTES is not set, but one single
quote if it is set.

Inside double quotes (""), parameter and command substitution occur, and
‘\’ quotes the characters ‘\’, ‘`’, ‘"’, and ‘$’.

FreeBSD Shell

Compatibility with the FreeBSD shell is included for completeness.

From the FreeBSD man pages, every character except for single-quote can be wrapped in single-quotes, and a backslash can be used to escape unquoted single-quotes.

Quoting is used to remove the special meaning of certain characters or
words to the shell, such as operators, whitespace, keywords, or alias
names.

There are four types of quoting: matched single quotes, dollar-single
quotes, matched double quotes, and backslash.

Single Quotes
    Enclosing characters in single quotes preserves the literal meaning
    of all the characters (except single quotes, making it impossible
    to put single-quotes in a single-quoted string).

Dollar-Single Quotes
    Enclosing characters between $' and ' preserves the literal meaning
    of all characters except backslashes and single quotes.  A
    backslash introduces a C-style escape sequence:

    ...

Double Quotes
    Enclosing characters within double quotes preserves the literal
    meaning of all characters except dollar sign (`$'), backquote
    (``'), and backslash (`\').  The backslash inside double quotes
    is historically weird.  It remains literal unless it precedes the
    following characters, which it serves to quote:

      $     `     "     \     \n

Backslash
    A backslash preserves the literal meaning of the following character,
    with the exception of the newline character (`\n').  A
    backslash preceding a newline is treated as a line continuation.

OpenBSD Shell

From the OpenBSD Man Pages, every character except for single-quote can be wrapped in single-quotes, and a backslash can be used to escape unquoted single-quotes.

A backslash (\) can be used to quote any character except a newline.
If a newline follows a backslash the shell removes them both, effectively
making the following line part of the current one.

A group of characters can be enclosed within single quotes (') to quote
every character within the quotes.

A group of characters can be enclosed within double quotes (") to quote
every character within the quotes except a backquote (`) or a dollar
sign ($), both of which retain their special meaning. A backslash (\)
within double quotes retains its special meaning, but only when followed
by a backquote, dollar sign, double quote, or another backslash.
An at sign (@) within double quotes has a special meaning
(see SPECIAL PARAMETERS, below).

NetBSD Shell

The NetBSD shell’s documentation is identical to the Dash documentation.

Android Shells

Android has gone through some number of shells.

  • Mksh, a Korn shell, was used with Toolbox releases (5.0 and prior)

  • Toybox, also derived from the Almquist Shell (6.0 and newer)

Notably, the Toolbox implementation is not POSIX compliant as it lacks a “printf” builtin (e.g. Android 5.0 emulator images).

Toybox Shell

Android 6.0 (and possibly other versions) uses a shell based on toybox.

While it does not include a printf builtin, toybox itself includes a POSIX-compliant printf binary.

The Ash shells should be feature-compatible with dash.

BusyBox Shell

According to BusyBox’s Wikipedia page, BusyBox uses an ash-compatible shell, which should therefore be compatible with dash.

pwnlib.util.sh_string.sh_command_with(f, arg0, ..., argN) → command

Returns a command created by evaluating f(new_arg0, …, new_argN) whenever f is a function and f % (new_arg0, …, new_argN) otherwise.

If the arguments are purely alphanumeric, then they are simply passed to the function. If they are simple to escape, they will be escaped and passed to the function.

If the arguments contain trailing newlines, then it is hard to use them directly because of a limitation in the POSIX shell. In this case the output from f is prepended with a bit of code to create the variables.

Examples

>>> sh_command_with(lambda: "echo hello")
'echo hello'
>>> sh_command_with(lambda x: "echo " + x, "hello")
'echo hello'
>>> sh_command_with(lambda x: "/bin/echo " + x, "\\x01")
"/bin/echo '\\x01'"
>>> sh_command_with(lambda x: "/bin/echo " + x, "\\x01\\n")
"/bin/echo '\\x01\\n'"
>>> sh_command_with("/bin/echo %s", "\\x01\\n")
"/bin/echo '\\x01\\n'"
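The choice between calling f and applying % formatting can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `command_with` is a hypothetical name, the standard library's `shlex.quote` stands in for this module's escaping, and the variable-creating prefix used for trailing newlines is not implemented.

```python
import shlex


def command_with(f, *args):
    """Quote every argument, then hand the quoted forms to f.

    f may be a callable or a %-style format string.
    """
    quoted = tuple(shlex.quote(a) for a in args)
    if callable(f):
        return f(*quoted)
    return f % quoted


print(command_with(lambda x: "echo " + x, "hello"))  # echo hello
print(command_with("/bin/echo %s", "hello world"))   # /bin/echo 'hello world'
```

Quoting before the call means f only ever concatenates strings that are already safe shell words.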
pwnlib.util.sh_string.sh_prepare(variables, export=False)

Outputs a POSIX-compliant shell command that will put the data specified by the dictionary into the environment.

It is assumed that the keys in the dictionary are valid variable names that do not need any escaping.

Parameters
  • variables (dict) – The variables to set.

  • export (bool) – Should the variables be exported or only stored in the shell environment?

Returns
  • output (str) – A valid POSIX shell command that will set the given variables.

It is assumed that var is a valid name for a variable in the shell.

Examples

>>> sh_prepare({'X': 'foobar'})
b'X=foobar'
>>> r = sh_prepare({'X': 'foobar', 'Y': 'cookies'})
>>> r == b'X=foobar;Y=cookies' or r == b'Y=cookies;X=foobar' or r
True
>>> sh_prepare({'X': 'foo bar'})
b"X='foo bar'"
>>> sh_prepare({'X': "foo'bar"})
b"X='foo'\\''bar'"
>>> sh_prepare({'X': "foo\\\\bar"})
b"X='foo\\\\bar'"
>>> sh_prepare({'X': "foo\\\\'bar"})
b"X='foo\\\\'\\''bar'"
>>> sh_prepare({'X': "foo\\x01'bar"})
b"X='foo\\x01'\\''bar'"
>>> sh_prepare({'X': "foo\\x01'bar"}, export = True)
b"export X='foo\\x01'\\''bar'"
>>> sh_prepare({'X': "foo\\x01'bar\\n"})
b"X='foo\\x01'\\''bar\\n'"
>>> sh_prepare({'X': "foo\\x01'bar\\n"}, export = True)
b"export X='foo\\x01'\\''bar\\n'"
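The general shape of such a command can be sketched in a few lines. This is a simplified illustration, not the function's actual implementation: `prepare` is a hypothetical name, it returns `str` rather than the bytes shown in the examples, and it relies on `shlex.quote`, which escapes a single quote as `'"'"'` instead of the `'\''` form shown above.

```python
import shlex


def prepare(variables, export=False):
    """Build NAME=value assignments joined by ';'.

    Keys are assumed to be valid shell variable names and are not escaped.
    """
    prefix = "export " if export else ""
    return ";".join(
        prefix + name + "=" + shlex.quote(value)
        for name, value in variables.items()
    )


print(prepare({"X": "foo bar"}))             # X='foo bar'
print(prepare({"X": "foobar"}, export=True))  # export X=foobar
```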
pwnlib.util.sh_string.sh_string(s)

Outputs a string in a format that will be understood by /bin/sh.

If the string does not contain any bad characters, it will simply be returned, possibly with quotes. If it contains bad characters, it will be escaped in a way which is compatible with most known systems.

Warning

This does not work well with the shell’s built-in “echo”. It works exactly as expected for setting environment variables and passing arguments, unless the command receiving them is the shell’s built-in echo.

Parameters
  • s (str) – String to escape.

Examples

>>> sh_string('foobar')
'foobar'
>>> sh_string('foo bar')
"'foo bar'"
>>> sh_string("foo'bar")
"'foo'\\''bar'"
>>> sh_string("foo\\\\bar")
"'foo\\\\bar'"
>>> sh_string("foo\\\\'bar")
"'foo\\\\'\\''bar'"
>>> sh_string("foo\\x01'bar")
"'foo\\x01'\\''bar'"
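The warning about the built-in echo can be made concrete with a small experiment. The sketch below assumes a POSIX shell is available at `/bin/sh` and uses a hypothetical helper, `run_sh`; it passes the same correctly quoted word to `printf` and to `echo`. POSIX leaves echo's handling of backslashes implementation-defined, so some built-in echo implementations (dash's, for example) rewrite sequences such as `\t` even though the quoting delivered the bytes intact. That this is the exact problem the warning refers to is an assumption.

```python
import subprocess


def run_sh(command, shell="/bin/sh"):
    """Run command through `shell -c` and return the bytes written to stdout."""
    result = subprocess.run(
        [shell, "-c", command], stdout=subprocess.PIPE, check=True
    )
    return result.stdout


# The shell word 'a\tb': the characters a, backslash, t, b in single quotes.
quoted = "'a\\tb'"

via_printf = run_sh("printf '%s\\n' " + quoted)
via_echo = run_sh("echo " + quoted)

print(via_printf)  # a, backslash, t, b, then a newline
print(via_echo)    # shell-dependent: some builtins turn \t into a tab
```

Using `printf '%s'` (or an external `/bin/echo`, as in the sh_command_with examples) avoids depending on how a particular shell's echo treats backslashes.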
pwnlib.util.sh_string.test(original)

Tests that a shell interpreting the escaped string outputs the original string.

>>> test(b'foobar')
>>> test(b'foo bar')
>>> test(b'foo bar\n')
>>> test(b"foo'bar")
>>> test(b"foo\\\\bar")
>>> test(b"foo\\\\'bar")
>>> test(b"foo\\x01'bar")
>>> test(b'\n')
>>> test(b'\xff')
>>> test(os.urandom(16 * 1024).replace(b'\x00', b''))
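A round-trip check of this kind can be sketched as follows. It is an assumption-laden illustration and not necessarily how test is implemented: `single_quote` and `survives_shell` are hypothetical names, a POSIX shell is assumed at `/bin/sh`, and `printf '%s'` is used to print the argument without interpreting it.

```python
import subprocess


def single_quote(data):
    """Wrap bytes in single quotes, escaping embedded single quotes."""
    return b"'" + data.replace(b"'", b"'\\''") + b"'"


def survives_shell(data, shell=b"/bin/sh"):
    """Return True if data comes back unchanged after shell evaluation.

    data must not contain NUL bytes, which cannot appear in an argument.
    """
    command = b"printf '%s' " + single_quote(data)
    result = subprocess.run(
        [shell, b"-c", command], stdout=subprocess.PIPE, check=True
    )
    return result.stdout == data


for sample in (b"foobar", b"foo bar\n", b"foo'bar", b"foo\\\\bar", b"\n"):
    print(sample, survives_shell(sample))
```

Capturing stdout as raw bytes keeps trailing newlines, which command substitution inside the shell would strip.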