Analyzing Block, Function, and Line Coverage

As with other kinds of testing, it's important to not only understand what code is being covered, but also what code is not being covered so that you can make smarter decisions about your next steps with fuzzing.

A best practice would be to make sure that your fuzzing is covering parts of code that deal with input parsing or input processing from external or untrusted sources. In addition, functions with high cyclomatic complexity should be thoroughly tested, as high complexity functions have more distinct code paths, making them harder to reason about and more likely to contain unexpected corner cases.

Understanding how the target application processes input, and therefore which functions are covered, not covered, or should be covered, allows developers to make more informed (and ultimately better) decisions when it comes to fuzzing.

Info

Check out the test coverage term in the glossary for more information on how coverage works.

Pre-requisites

Remember that in order to analyze coverage files, you must first download the coverage files for a completed Mayhem run. Therefore, we'll be utilizing the testme coverage results obtained in Executing and Download Coverage via Mayhem CLI:

Files: coverage.tgz

Tip

When downloading via the UI, a coverage.tgz archive will be downloaded which contains a directory named something like <target_name>_coverage. On the other hand, using the mayhem sync or mayhem download commands via the Mayhem CLI will put this output directory in the same parent directory as the Mayhemfile and tests directories.

Once downloaded, the coverage directory will contain up to three files: block_coverage.drcov, func_coverage.json, and line_coverage.lcov.

Each file describes the aggregate of all code coverage from each of the inputs or test cases in the test suite that were executed by the target application; however, each file describes the aggregate coverage from a different perspective—at the block, function, and source code level, respectively.

Note

The edge coverage metric that Mayhem uses internally to distinguish behaviors is related to, but different from, block coverage.

You'll also need access to the testme compiled binary. Therefore, use the 2.5 Tutorial Docker image.

docker pull forallsecure/tutorial:2.5
docker run -ti --privileged --rm forallsecure/tutorial:2.5

Let's see how to analyze coverage using the three types of files!

Block Coverage

A basic block is a straight-line sequence of instructions with a single entry point and a single exit point. The block_coverage.drcov file describes basic block coverage in a packed binary format, and as such is meant for use with binary analysis tools such as Binary Ninja (bncov), IDA Pro (lighthouse), or Ghidra (Dragon Dance), where the additional plugins help visualize or manipulate the data.
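For a quick sanity check without a disassembler, the basic block table can also be read directly. The following is a minimal sketch that assumes the file follows the standard DynamoRIO drcov layout: a text header and module table, then a `BB Table: N bbs` line followed by N packed 8-byte entries (start offset, size, module id). The function name `read_drcov_blocks` is hypothetical, and the assumed layout should be checked against your own coverage file before relying on it.

```python
import struct
from pathlib import Path

# Assumed entry layout (standard DynamoRIO drcov): little-endian
# uint32 start offset within the module, uint16 block size, uint16 module id.
BB_ENTRY = struct.Struct("<IHH")
BB_HEADER = b"BB Table: "


def read_drcov_blocks(path):
    """Return a list of (module_id, start_offset, size) tuples."""
    data = Path(path).read_bytes()
    header_at = data.find(BB_HEADER)
    if header_at < 0:
        raise ValueError("no 'BB Table' header found; not a binary drcov file?")
    line_end = data.index(b"\n", header_at)
    # The header line is expected to look like: b"BB Table: 1234 bbs"
    count = int(data[header_at + len(BB_HEADER):line_end].split()[0])
    body = data[line_end + 1:]
    if len(body) < count * BB_ENTRY.size:
        raise ValueError("file is truncated: fewer entries than the header claims")
    blocks = []
    for i in range(count):
        start, size, mod_id = BB_ENTRY.unpack_from(body, i * BB_ENTRY.size)
        blocks.append((mod_id, start, size))
    return blocks
```

Calling `read_drcov_blocks("block_coverage.drcov")` and taking `len()` of the result gives the number of recorded blocks, which is a cheap way to confirm that a run produced coverage before loading the file into a plugin.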

Tip

If you have the source code for the target and can compile the target with debug symbols, the line coverage file will likely be more convenient to use.

Using Binary Ninja and bncov for Block Coverage

The bncov plugin for Binary Ninja will allow you to visualize block coverage when the block_coverage.drcov output file is overlaid on top of the original binary that produced it.

Warning

In order to visualize block coverage using Binary Ninja, you'll need to import the compiled target application as well as the resulting block_coverage.drcov output file from your Mayhem run into the Binary Ninja application. Therefore, if you are using Docker, copy (docker cp) the files over from the Docker container to a Host OS that can run Binary Ninja.

Now, let's open up Binary Ninja and import the compiled testme application.

binary-ninja

Once imported, you can now visualize the individual blocks of code that comprise the compiled testme application.

Note

You will have to purchase a Binary Ninja license to be able to view the disassembly graph.

binary-ninja-disassembly

Finally, install the bncov plugin and import the block_coverage.drcov file for the testme application to highlight the individual blocks that were covered as well as the edges or pathways that were taken.

Note

The easiest way to install bncov is through the Binary Ninja plugin manager! Then, simply go to your Tools menu to begin using bncov.

binary-ninja-bncov

As a result, Binary Ninja can be used to analyze individual block coverage using the bncov plugin.

Using Ghidra and Dragon Dance for Block Coverage

Using Ghidra and the Dragon Dance plugin, you can visualize and manipulate the binary code coverage for an underlying application to understand more about its implementation and code coverage behaviors.

Warning

In order to visualize block coverage using Ghidra, you'll need to import the compiled target application as well as the resulting block_coverage.drcov output file from your Mayhem run into the Ghidra application. Therefore, if you are using Docker, copy (docker cp) the files over from the Docker container to a Host OS that can run Ghidra.

Now, we'll need to download and install Ghidra and the Dragon Dance plugin.

# Download Ghidra 
wget https://ghidra-sre.org/ghidra_9.1.2_PUBLIC_20200212.zip
unzip ghidra_9.1.2_PUBLIC_20200212.zip

# Download Dragon Dance 0.2.2
git clone https://github.com/0ffffffffh/dragondance.git
cd dragondance
git checkout v0.2.2

We'll also need to download and install the Java Development Kit (JDK) for your specific operating system (OS) as well as a compatible version of gradle to build the Dragon Dance plugin.

Note

For this exercise, we are utilizing Java 15.0.2, Gradle 6.8.2, Ghidra 9.1.2, and Dragon Dance 0.2.2.

Next, navigate to the cloned dragondance folder (named dragondance-master if you downloaded the repository as a ZIP archive instead) and run the following command to build the Dragon Dance plugin against your Ghidra installation; the resulting package will be located under the dist directory of that folder:

gradle -PGHIDRA_INSTALL_DIR=<path_to_ghidra_install>/ghidra_9.1.2_PUBLIC

Now navigate to the Ghidra installation directory and run the following to open the application:

./GhidraRun

Since we've already built the Dragon Dance plugin, now we just have to import it as an available plugin to Ghidra. Go to File > Install Extensions and add the Dragon Dance plugin by navigating to the dragondance-master/dist folder and selecting the .zip file.

import-dragon-dance

Next, create a new project testme_coverage and import the compiled testme application. The file should be located at testme-pkg/root/root/tutorial/testme/v1/testme.

Note

You may see some warnings when importing the testme binary into Ghidra; you can ignore them and proceed as usual.

ghidra-testme

Double-click into the imported testme file and Ghidra should now begin to display the disassembled view of the testme binary. Go to the Symbol Tree pane on the left-hand side and select testme under the Functions folder. This shows us the underlying assembly code of the testme function in the testme binary.

ghidra-testme-diassembly

Now click on Window > Dragon Dance and import the block_coverage.drcov file from the testme coverage results.

Warning

You may need to initialize Dragon Dance for the first time before you can use it in Ghidra. Check out the official documentation for launching Dragon Dance.

In addition, for more complex binaries, Ghidra may not properly identify all the code regions and will prompt the user if it encounters block addresses outside of what it has identified as code. The proper action here is to view the coverage file as ground truth and tell Ghidra to fix up / mark the blocks as code.

dragon-dance-window

Right-click on the block coverage item in Dragon Dance and select Switch To. This will overlay the testme coverage results over the testme disassembly view.

ghidra-testme-diassembly-coverage

And that's it! We can now see parts of the testme code that were covered and not covered.

Function Coverage

The func_coverage.json coverage file is a plaintext JSON file describing the basic information about functions contained in the target binary as well as the block coverage information for each corresponding function.

Note

We've supplied this file in JSON format with the intent of allowing developers to parse the data as needed. For example, developers can use the function coverage data for automated report generation via their own custom scripts.

The key-value pair definitions for the func_coverage.json file are as follows:

  • address: The address of the function.
  • name: The name of the function.
  • complexity: The cyclomatic complexity.
  • callers: The other functions that call the current function.
  • callees: The other functions that are called from the current function.
  • all_blocks: All blocks corresponding to the function.
  • covered_blocks: All blocks corresponding to the function that are covered.
  • called: True or False. Indicates whether the function was covered or not.

Note

Notice all addresses are in decimal, and represent the starting address of the function (specifically the address of the first byte of the first instruction of the function, which is also the start of the first block).
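Because disassemblers usually display addresses in hexadecimal, it can help to convert the decimal values before searching for them. This is a small sketch (the helper name `to_hex` is hypothetical); whether the resulting address matches what your tool shows also depends on the base address at which it loaded the binary.

```python
def to_hex(address):
    """Format a decimal address from func_coverage.json as hexadecimal."""
    return f"0x{address:x}"


# 1078889 is the function address used in the sample output.
print(to_hex(1078889))  # 0x107669
```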

Sample output for a given function is shown below.

[
    {
        "address": 1078889,
        "name": "http_parser_init",
        "complexity": 3,
        "callers": [
            1053267
        ],
        "callees": [
            1052768
        ],
        "all_blocks": [
            1078889,
            1078981,
            1078987,
            1078994,
            1079001,
            1079006
        ],
        "covered_blocks": [
            1078889,
            1079001
        ],
        "called": true
    },
    ...
]
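As an illustration of how these fields combine, the sketch below computes per-function block coverage and lists complex, poorly covered functions first. It relies only on the keys documented above; the function names `load_functions` and `summarize_functions` and the sort order are assumptions made for this example, not part of Mayhem.

```python
import json


def load_functions(path="func_coverage.json"):
    """Load the list of function records from a func_coverage.json file."""
    with open(path) as handle:
        return json.load(handle)


def summarize_functions(functions):
    """Return per-function block coverage, complex and poorly covered first."""
    rows = []
    for fn in functions:
        total = len(fn["all_blocks"])
        covered = len(fn["covered_blocks"])
        rows.append({
            "name": fn["name"],
            "complexity": fn["complexity"],
            "called": fn["called"],
            "covered": covered,
            "total": total,
            "percent": 100.0 * covered / total if total else 0.0,
        })
    # Highest complexity first; within equal complexity, lowest coverage first.
    rows.sort(key=lambda r: (-r["complexity"], r["percent"]))
    return rows
```

For the sample record above, http_parser_init has 2 of its 6 blocks covered, which is roughly 33% block coverage even though `called` is true.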

Creating Custom Apps for Function Coverage Analysis

As a quick example of how developers could use func_coverage.json to suit their needs, the data can be imported into a Pandas DataFrame using Python.

Pandas is a popular and robust data analysis library that utilizes table-like structures known as DataFrames to represent records via rows and columns. By importing the func_coverage.json into a Pandas DataFrame, developers can utilize the full suite of data analysis functions to explore their function coverage further.

func-analysis-app

Note

You may have to denormalize your data as there may be nested objects.
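One possible way to denormalize the records is sketched below using only the standard library: address lists are resolved to function names where possible, and block lists are reduced to counts. The helper name `flatten_functions` and the chosen column names are assumptions for this example.

```python
def flatten_functions(functions):
    """Turn nested func_coverage.json records into flat, table-friendly rows."""
    # Map each function address to its name so address lists become readable.
    names = {fn["address"]: fn["name"] for fn in functions}

    def resolve(addresses):
        # Fall back to the raw address when the target is not in the file.
        return ", ".join(names.get(a, str(a)) for a in addresses)

    rows = []
    for fn in functions:
        rows.append({
            "address": fn["address"],
            "name": fn["name"],
            "complexity": fn["complexity"],
            "called": fn["called"],
            "callers": resolve(fn["callers"]),
            "callees": resolve(fn["callees"]),
            "block_count": len(fn["all_blocks"]),
            "covered_block_count": len(fn["covered_blocks"]),
        })
    return rows
```

Since every value in a row is now a scalar, the list can be passed to `pandas.DataFrame(rows)` if Pandas is installed, giving one column per key.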

Line Coverage

Line coverage can be derived by mapping basic blocks to their origin as a line in the source code.

The line_coverage.lcov coverage file contains line coverage information in the LCOV format, which specifies which lines in a given file are covered. As a result, .lcov files must be processed alongside the original source directories and files. Otherwise, any changes to file paths or source code versions will result in discrepancies or missing information in the lcov report.

Note

This file will only be created if the target contains debug symbols; otherwise, line coverage information cannot be generated automatically.

A file in the .lcov format (also named .info by some tools) can be ingested by a number of tools such as LCOV to either generate browseable coverage reports or integrate with other IDEs, plugins, and third-party tools to display additional coverage information.
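Because the LCOV tracefile is plain text, it can also be read with a short script. The sketch below handles only the `SF:` (source file), `DA:<line>,<hit count>` (line data), and `end_of_record` entries of the format and ignores everything else; the function names `parse_lcov` and `line_coverage_percent` are hypothetical.

```python
from collections import defaultdict


def parse_lcov(path):
    """Return {source_file: {line_number: hit_count}} from an LCOV tracefile."""
    coverage = defaultdict(dict)
    current = None
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if line.startswith("SF:"):
                current = line[3:]
            elif line.startswith("DA:") and current is not None:
                fields = line[3:].split(",")
                number, hits = int(fields[0]), int(fields[1])
                # The same line may appear in several records; sum the hits.
                lines = coverage[current]
                lines[number] = lines.get(number, 0) + hits
            elif line == "end_of_record":
                current = None
    return dict(coverage)


def line_coverage_percent(lines):
    """Percentage of instrumented lines with at least one hit."""
    if not lines:
        return 0.0
    hit = sum(1 for count in lines.values() if count > 0)
    return 100.0 * hit / len(lines)
```

Iterating over `parse_lcov("line_coverage.lcov").items()` and passing each inner dictionary to `line_coverage_percent` gives a per-file summary without generating an HTML report.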

Generating Line Coverage Reports using genhtml

It is a best practice to process .lcov files in the original environment from which they were produced. Therefore, we will install lcov within our docker container and generate the line coverage report using the genhtml utility.

First, we'll need to install lcov.

apt-get update -y
apt-get install -y lcov

Next, executing the genhtml utility on the line_coverage.lcov file produces the resulting line coverage report for the testme application.

Note

The genhtml command follows the pattern genhtml <file> --output-directory <directory_name>.

lcov-genhtml

We can then move our generated HTML files from the Docker container to the Host OS and run a local HTTP server to view the line coverage report.

Note

Alternatively, you can also configure the Docker container network via port forwarding to spin up an HTTP server from within the container and connect via the Host OS.
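Any static file server works for viewing the report. As one option, the sketch below uses Python's standard `http.server` module; the function name `make_report_server`, the default port 8000, and the choice to bind to localhost only are assumptions for this example.

```python
import functools
import http.server


def make_report_server(directory, port=8000):
    """Create an HTTP server that serves a genhtml output directory."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    # Bind to localhost only; the report is meant for local viewing.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling `make_report_server("<directory_name>").serve_forever()` with the directory passed to `--output-directory` serves the report at http://127.0.0.1:8000/ until interrupted. To reach a server running inside the container from the Host OS, it would need to bind to an address reachable through the forwarded port instead of localhost.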

docker-cp-genthml

The code coverage report will display aggregate line coverage results as well as allow you to drill down and visualize the individual lines that were covered--0 for not covered and 1 for covered.

lcov-report

As can be seen, visualizing line code coverage for a target application can be both extremely valuable and highly readable.

Summary

The three coverage files available for download and analysis from Mayhem are the block_coverage.drcov, func_coverage.json, and line_coverage.lcov files, each describing aggregate code coverage from different perspectives--at the block, function, and source code levels, respectively.

Lastly, the percentage of code coverage isn't the end of the story. If a target is only partly covered, consider whether the untested parts of a function indicate that an insufficient variety of input is reaching it, or whether those parts simply will not execute under normal conditions (such as code for handling out-of-memory errors or network failures). Even if a function is 100% covered, bugs may remain, such as a divide-by-zero or NULL-pointer dereference that only triggers for specific input values. Deciding when fuzzing has covered enough code is a matter of individual judgement, and typically means weighing the cost of additional fuzzing improvement against the potential security or reliability impacts.

Knowing how to analyze these coverage files and taking into consideration the relevance of their results will allow you to make better fuzzing decisions when it comes to your target applications in Mayhem!