Mayhem Technology¶

Mayhem’s secret sauce is a portfolio of dynamic analyses that work together to test your target program. Dynamic analyses monitor the target program as it executes on real inputs. Mayhem's dynamic analyses have zero false positives, since the test case used to produce the result is a witness of any reported behavior.

Mayhem performs portfolio analysis, meaning it fuses together a set of different, complementary dynamic analysis techniques. Mayhem analyses are the result of over 20 years of research in leading academic and industry institutions, with many specific techniques patented and published in highly-regarded top peer-reviewed venues.

The main Mayhem analyses include:

Coverage-guided intelligent fuzzing
Symbolic execution
Dynamic run-time monitoring of each test case
Static analysis to check security quality indicators

In this section we give an overview of the different analyses, and give links for you to explore the topic in more detail if you wish.

What is Fuzzing¶

Fuzzing is the process of uncovering defects in an application by forcing the application to process different inputs and monitoring for any abnormal behaviors. Fuzzing has become a popular method across multiple disciplines and organizations for discovering problems in software.

Fuzzing is a negative testing technique, meaning it tests for incorrect or undesired (negative) behavior. Traditional testing focuses on positive testing, where each test case checks that a program exhibits a particular desired behavior.

Fuzzing differs sharply from traditional testing frameworks, where developers manually implement tests to test code. Manually-generated tests offer the developer great fidelity in how to test their code, with a trade-off that only the tests imagined and implemented by the developer are run. Fuzzing generates hundreds of thousands, to millions of inputs, and runs these over the program, testing functionality and edge-cases developers don't imagine, or simply don't have time to implement themselves.

Mayhem, as well as the typical fuzzer, consists of three algorithms:

Input Generation: The fuzzer generates a new input for the program. Mayhem uses a suite of heuristics to generate the next input from a previously known input.

Tip

You can provide Mayhem with a known test suite, which will help speed up Mayhem.
Invocation: The user needs to tell Mayhem how to run the program. This is done in the Mayhemfile. A fuzzer will invoke the program on each input generated.

Note

Each invocation of the target should be independent for best results.
Oracle: Mayhem checks whether the program crashed or not to determine whether a particular input is problematic.

Info

Mayhem considers a single input at a time for file-based inputs, and uses advanced proprietary heuristics for UDP and TCP network inputs. If Mayhem's heuristics fail to determine when a single input is processed, a user may need to perform harnessing.

Coverage-guided Fuzzing¶

Mayhem uses maximizing code coverage as the goal function for fuzzing and generating new inputs. Mayhem uses the edge coverage metric, which measures how many control flow graph edges have been exercised.

Formally, a coverage-guided fuzzer keeps a working set S of inputs for the target program p. The initial working set, called the seed set, is either a set of inputs provided by the user or a Mayhem-generated default set. For each input x in S, the Mayhem fuzzer:

Runs p on x and collects information on what program branches and statements are executed.
Uses the coverage information in step 1 to generate a new input y. Mayhem is called a coverage-guided fuzzer because it uses both the input x and the coverage information to generate the new input y.
Runs p on y. If y results in new coverage, Mayhem will add y to the working set S.
Reduce S to be the minimal set of inputs needed to maximize coverage so far. This step keeps the size of S as small as possible.
Pick a new input from S and go to step 1.

Step (4) is an important optimization to make sure your generated test suites are as small and as quick to execute as possible. Since we compute the minimal set, we often call S the minset.

On a modern CPU, it's common to be able to execute 100's to 1000's of tests per second on a target program. Compared to symbolic execution (discussed below), that is extremely fast! The main downside is that fuzzing can degenerate to randomly or semi-randomly trying new inputs, which can be inefficient in the long term. The general rule-of-thumb is fuzzing is much more valuable than symbolic execution for the first 10-30 minutes, at which time symbolic execution begins to add significant benefit.

Coverage Fuzzing Example: Let's look at an example program:

a = user_input0()
b = user_input1()
if (a < 10) {
  c = 30
}
else {
  if (b = 0xdeadbeef) {
    c = 100
  }
  else {
    c = 0
  }
}

We often think of a program in terms of its Control Flow Graph (CFG). A CFG is a directed graph where each node is a single program statement, and an edge exists between nodes if there is a possible transfer of control.

The CFG for the example is:

The CFG has 6 blocks, which represent 6 different locations: block-0, block-1, block-2, block-3, block-4, and block-5.

The next step is to determine which blocks are encountered during the execution of the program under test. If a block is executed, we will consider that block, "Covered." We can do this by inserting a special function at the beginning of each block. Each time a block is covered, this function will execute, and the calling block is added to a set of covered blocks.

Our instrumented program:

Our instrumentation will now tell us what blocks we cover during an execution of the program. For example, if we supply the value 5 to this program as user_input0(), we will cover blocks block-0, block-1, and block-5.

Coverage-guided fuzzers favor inputs which cover additional blocks. In our example programs, if we supply the inputs 1, 2, 3, 4, up to 9 for user_input0(), we always cover the blocks {block-0, block-1, block-5}. Our fuzzer may begin by supplying the values {user_input0 = 0, user_input1 = 0}, and then randomly change one of these numbers. Once we supply a value for user_input0() which is >= 10, our instrumentation will inform us we have covered new blocks.

The fuzzer saves inputs to its test suite, the set of inputs it continues to mutate, when inputs cover new code. Inputs that cover and exercise the same code the fuzzer has already tested are discarded. This allows the fuzzer more quickly discover new inputs which explore new parts of the program under test.

Symbolic Execution¶

Symbolic execution is a technique from formal methods that translates a program execution path into a logical formula. The formula is true for all inputs that take the selected path. Consider the following C program:

#include <stdio.h>
#include <assert.h>

int main() {
  int input;
  input = getchar();
  printf("%d\n", input);
  if(input > 64) {
     if(input < 91)
        if((input % 2) == 0)
             printf("bug!\n");
  } else {
    printf("safe!\n");
  }
  return 0;
}

This program prints "bug" if the user enters in an ASCII-character that, when translated to an integer, is:

Greater than 64
Less than 91
Is even (since any number that is zero mod 2 is even)

Symbolic execution will build up formulas that capture the constraints above and automatically solve them to generate inputs.

Symbolic execution represents each input as a symbolic variable instead of a concrete number as during normal execution. For example, if you provide the program the input 65, the variable input will have the concrete value 65, and each branch will evaluate to either 0 or 1. In contrast, a symbolic variable is simply a variable (paired with a type, e.g., "integer") with no specific value.

Symbolic execution works by:

Running the program on an input. For example, suppose you ran the above program on the input A, which is 65 (decimal) in ASCII.
Recording the execution trace where the input is represented as a symbolic variable x. For the input "A", we would create the formula:

x > 64 && x < 91 && !((x %2) == 0)

Note

Any assignment to x that makes the formula true (called satisfying the formula) will execute the same path as our original input "A".
Flipping one of the branch conditions in the formula, and checking if there are any solutions. For example, suppose symbolic execution flipped the last branch that checks if x is even (the same as the formula above with the ! negation removed):

x > 64 && x < 91 && ((x %2) == 0)

Note

This formula is satisfied by the assignment x = 42, which is the ASCII character B. Therefore, if we run the program on "B", we see we trigger the bug.

In a nutshell, symbolic execution utilizes formal logical and formula solvers to explore new branches, and come up with new inputs that take new program paths.

Contrast with Fuzzing: Fuzzing is not a formal technique and does not use a solver. For example, a random mutation fuzzer would have only a 10% probability of finding the bug, since only 13/127 of the possible ASCII inputs reach that line of code.

As another example, consider the example in the coverage-guided fuzzing section above. If all integers are 32-bits, a random input will only satisfy the if condition b == 0xdeadbeef with probability 2^{-32}.

Symbolic Execution Example: In this section, we describe how symbolic execution builds path formulas using the coverage-guided fuzzing example above.

We begin by replacing the result of user_input0() with A, and user_input1() with B, so that at the end of block-0 our symbolic execution may be tracking the following values:

States:
  State-0:
    Location: After block-0
    Variables: {a = A, b = B}
    PathFormula = {}

Here we have the block, and attached to it we have the symbolic state as it exists at the conclusion of symbolic execution of the block. In our symbolic execution, our symbolic execution engine has returned A for user_input0(), and B for user_input0(). In a real program, if we read in bytes from a file called, input_file.txt, we might make each individual byte of that file its own separate symbolic value, perhaps naming them input_file.txt_0x1, input_file.txt_0x2, and so on.

As we exit block-0, we encounter a conditional branch. One path is guarded by a < 10, and one path is guarded by a >= 10. We copy, or "Fork," our symbolic state, so that we have two identical copies, and add the conditions to our path formulas. We then execute the following blocks, block-1 and block-2.

States:
  State-0:
    Location: After block-1
    Variables: {a = A, b = B, C = 30}
    PathFormula = {A < 10}
  State-1:
    Location: After block-2
    Variables: {a = A, b = B}
    PathFormula = {A >= 10}

Now that we know how to fork state and add path formula, we can fork state after block-2, and symbolically execute block-3 and block-4.

States:
  State-0:
    Location: After block-1
    Variables: {a = A, b = B, C = 30}
    PathFormula = {A < 10}
  State-1:
    Location: After block-3
    Variables: {a = A, b = B, C = 0}
    PathFormula = {A >= 10 AND B != 0xdeadbeef}
  State-2:
    Location: After block-4
    Variables: {a = A, b = B, C = 100}
    PathFormula = {A >= 10 AND B == 0xdeadbeef}

And finally, we execute all symbolic states forward until the function ends.

States:
  State-0:
    Location: After block-5
    Variables: {a = A, b = B, C = 30}
    PathFormula = {A < 10}
  State-1:
    Location: After block-5
    Variables: {a = A, b = B, C = 0}
    PathFormula = {A >= 10 AND B != 0xdeadbeef}
  State-2:
    Location: After block-5
    Variables: {a = A, b = B, C = 100}
    PathFormula = {A >= 10 AND B == 0xdeadbeef}

We are left with three states, with three different sets of path formulas:

A < 10
A >= 10 AND B != 0xdeadbeef
A >= 10 AND B == 0xdeadbeef

These 3 path formulas accurately and completely describe which values our inputs must have in order to traverse all of the paths in this program.

The formulas can then be solved to find particular values for variables that make the formula true. Mayhem uses an SMT (Satisfiability Modulo Theory) solver, which can reason about machine-level representations of numbers. Given a formula, the SMT can return either:

SAT with an example value for each variable. By construction of the formula, each value is an input to the program.
UNSAT saying there is no satisfying answer. For example, the formula A and NOT A is unsatisfiable.
Not terminate. SMT's are at least NP-hard, and may not terminate in the timeout provided.

In an example run, here are three example SAT solutions:

A < 10 -> {A = 5}
A >= 10 AND B != 0xdeadbeef -> {A = 20, B = 0}
A >= 10 AND B == 0xdeadbeef -> {A = 15, B = 0xdeadbeef}

In this example, symbolic execution was able to discover that B = 0xdeadbeef in fewer steps than random fuzzing.

Combining Fuzzing and Symbolic Execution¶

Mayhem combines best-of-breed coverage-guided fuzzing with an advanced symbolic executor to quickly discover inputs which test more code, and find bugs faster. Mayhem accomplishes this by sharing inputs between its fuzzer and symbolic executor.

The symbolic executor takes inputs from the fuzzer, follows the same path as described by the input up to a certain point, and then uses symbolic execution to fork and discover new inputs which traverse edges our fuzzer has a hard time reasoning about. Likewise, the fuzzer can take inputs from the symbolic executor, and quickly modify them to test the program and explore new blocks which are easy to reach.