Detecting Uninitialized Memory Read Access Bugs using Pin (a la Valgrind)

Abstract

In today’s post we will discuss a tool that automatically detects reads
of uninitialized memory, on both the stack and the heap.
This tool is similar to a (really simple) variant of the Valgrind
[1] project, but made for Windows
(adding Linux support to our tool would be really, really easy, though).

Besides that, it can also be used to learn more about the
Pin framework.

We will show the reader simple examples of accessing uninitialized memory,
how we can prevent them, how we can detect them, and finally, a Pintool (a
tool that uses Pin) which detects uninitialized memory read access bugs.

Introduction

First of all, this blog entry was inspired by
this
blog post.

Uninitialized memory is, as the name suggests, memory which has not been
initialized yet, and therefore its contents must be assumed to be garbage.
Stack variables are uninitialized by default, until they have been
assigned. The same goes for memory allocated on the heap; it is
uninitialized until it has been written to.

Uninitialized variables may lead to crashes or other undefined behavior,
because the variable contains garbage data. For example, when an
uninitialized pointer is dereferenced, chances are that it points to
nonexistent memory, and therefore the application will
crash.

That being said, it’s time to show some examples of accessing
uninitialized memory, and how to prevent them.

Uninitialized Memory Access Examples

The first Proof of Concept application looks like the following. We
allocate a variable on the stack, and read from it before writing to it.

#include <stdio.h>

int main()
{
    int a;
    printf("a: %d\n", a);
    return 0;
}

The contents of the variable a must be considered undefined.
Therefore the printf() statement will print out some garbage number (do
note that, when running the application several times, the number might
remain constant.)

By assigning a value to the a variable we would, obviously, get rid
of the bug. Take for example the following code, which does not contain an
uninitialized memory access bug.

#include <stdio.h>

int main()
{
    int a = 42;
    printf("a: %d\n", a);
    return 0;
}

A somewhat more advanced Proof of Concept application can be found
here.

// http://reversingonwindows.blogspot.com/2012/07/detecting-read-access-to-uninitialized.html

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX 16

typedef struct _CONTEXT {
    int arr[MAX];
    int a;
    int b;
    int c;
} CONTEXT;

void init(CONTEXT *ctx)
{
    memset(ctx->arr, 0, sizeof(ctx->arr[0]) * (MAX-1));
    ctx->a = 1;
}

void process(CONTEXT *ctx)
{
    int trash;
    for (int i = 0; i < MAX; i++) {
        trash = ctx->arr[i];
    }
}

void process2(CONTEXT *ctx)
{
    ctx->b = ctx->c;
}

void process3(int num)
{
    int trash;
    if(num != 0) {
        trash = num;
    }
}

int main(int argc, char *argv[])
{
    CONTEXT ctx;

    // Erroneously initializes context. The last element of arr
    // member remains uninitialized. b and c members remain
    // uninitialized, too.
    init(&ctx);

    // Accesses to each element of the array. Read-before-write
    // error should be reported in this function.
    process(&ctx);

    // Copies c to b but c is uninitialized. Read-before-write
    // error should be reported in this function.
    process2(&ctx);

    // This contains no read-before-write bug.
    process3(ctx.a);
}

This Proof of Concept contains two uninitialized memory access bugs (both
on the stack, again.) The first happens in the process function,
because the last element of the ctx->arr array has not been set by
the init function. The second bug occurs in the process2
function, because, as you may notice, the c member of the
ctx object has not been set yet.

A simple fix might look like the following (by altering only the
init function.)

void init(CONTEXT* ctx)
{
    memset(ctx, 0, sizeof(*ctx));
    ctx->a = 1;
}

This fix initializes the entire ctx object to zeros, and sets the
a member to the value one afterwards, ensuring that the object does
not contain garbage data, but zeros instead (which we consider
initialized data here.)

The third Proof of Concept application is based on heap memory. By
intercepting calls to the malloc function (actually, we intercept
calls to the RtlAllocateHeap
[2] function, although that one is
Windows-specific) we can determine how many bytes have been
allocated at which address. For example, when an application allocates
32 bytes, we mark each of these 32 bytes as
uninitialized. The following Proof of Concept application shows
uninitialized memory access from the heap.

#include <stdio.h>
#include <stdlib.h>

int main()
{
    int *a = (int *) malloc(sizeof(int) * 3);
    a[1] = 0;
    printf("a0: %d\n", *a);
    printf("a1: %d\n", a[1]);
    printf("a2: %d\n", a[2]);
    return 0;
}

In this application we allocated memory for three integers, only set one
(the second) and print all three of them. This results in two
uninitialized memory access bugs (reading the first integer and the third
integer from the array.)

A simple fix is to replace the malloc call with a
calloc [3]
call, which initializes the memory to zero.

#include <stdio.h>
#include <stdlib.h>

int main()
{
    int *a = (int *) calloc(3, sizeof(int));
    a[1] = 0;
    printf("a0: %d\n", *a);
    printf("a1: %d\n", a[1]);
    printf("a2: %d\n", a[2]);
    return 0;
}

Detecting Uninitialized Memory Access

We handle two kinds of uninitialized memory reads: those on the stack
and those on the heap.

For the stack it works as follows. The prologue of a function (usually)
starts by saving a copy of the stack pointer, followed by subtracting an
immediate from the stack pointer. The amount that is subtracted
is the amount of memory needed for stack variables. In our Pintool
we detect such subtract instructions, and when one executes, we mark the
memory which has just been allocated (by subtracting from the stack
pointer) as uninitialized.
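
As a rough illustration of the stack rule, here is a minimal sketch in plain C rather than actual Pintool code. It models a 256-byte stack with one shadow byte per stack byte. The toy stack and the names on_stack_alloc, on_write and is_initialized are made up for this example; they are not part of Pin and are not taken from the Pintool's source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model: a 256-byte stack, one shadow byte per stack byte. */
#define STACK_SIZE 256
static uint8_t initialized[STACK_SIZE]; /* 1 = written to, 0 = uninitialized */

/* To be called after "sub esp, imm" has executed. new_sp is the stack
 * pointer after the subtraction, expressed as an offset into the toy stack.
 * The imm bytes starting at new_sp form the fresh area for local variables. */
static void on_stack_alloc(size_t new_sp, size_t imm)
{
    assert(new_sp <= STACK_SIZE && imm <= STACK_SIZE - new_sp);
    memset(&initialized[new_sp], 0, imm);
}

/* To be called for every memory write that lands on the toy stack. */
static void on_write(size_t addr, size_t size)
{
    assert(addr <= STACK_SIZE && size <= STACK_SIZE - addr);
    memset(&initialized[addr], 1, size);
}

/* Returns 1 if all size bytes starting at addr have been written to. */
static int is_initialized(size_t addr, size_t size)
{
    assert(addr <= STACK_SIZE && size <= STACK_SIZE - addr);
    for (size_t i = 0; i < size; i++) {
        if (!initialized[addr + i]) {
            return 0;
        }
    }
    return 1;
}
```

In this model, resetting the range on every subtract matters because stack memory is reused: bytes written by an earlier call would otherwise still count as initialized when a later function gets the same addresses for its local variables.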

For the heap we do something similar. When a chunk of memory
has been allocated by the application, we mark it as uninitialized (unless
the zero-memory flag has been set for the RtlAllocateHeap
[2] function on
Windows, which initializes the memory to zeros.)
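
A minimal sketch of the heap rule, again in plain C and not actual hooking code. The 128-byte arena, its shadow array and the function on_heap_alloc are hypothetical; only the HEAP_ZERO_MEMORY flag value (0x00000008) comes from the Windows headers. The function stands in for whatever runs once an intercepted allocation has returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifndef HEAP_ZERO_MEMORY
#define HEAP_ZERO_MEMORY 0x00000008u /* flag value from the Windows headers */
#endif

/* Hypothetical toy heap: 128 bytes, one shadow byte per heap byte. */
#define ARENA_SIZE 128
static uint8_t arena_shadow[ARENA_SIZE]; /* 1 = initialized, 0 = uninitialized */

/* To be called after an allocation returned size bytes at the given offset
 * into the toy arena. flags is the Flags argument of the allocation call. */
static void on_heap_alloc(uint32_t flags, size_t offset, size_t size)
{
    assert(offset <= ARENA_SIZE && size <= ARENA_SIZE - offset);

    /* Zeroed memory has a well-defined value, so treat it as initialized. */
    uint8_t state = (flags & HEAP_ZERO_MEMORY) ? 1 : 0;
    memset(&arena_shadow[offset], state, size);
}
```

A plain allocation such as on_heap_alloc(0, 0, 12) leaves all 12 shadow bytes at zero, whereas passing HEAP_ZERO_MEMORY marks the whole block as initialized right away.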

From here on, all read and write instructions are traced and we simply
keep track of which memory has been written to and which has not
(uninitialized data.) When the application reads from a memory address
which is uninitialized, we print that address and the value of the
instruction pointer, so that somebody can investigate the problem further
and attempt to fix it using one of the (simple) techniques listed
earlier.
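
The bookkeeping can be sketched as three callbacks. This is an illustrative model in plain C, not the Pintool itself: the 64-byte address window starting at 0x1000 and the names on_alloc, on_write and on_read are assumptions made for the example. The report line mimics the format of the Pintool output shown in the Proof of Concept section.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy window: only these 64 bytes of address space are tracked. */
#define WINDOW_BASE 0x1000u
#define WINDOW_SIZE 64u
static uint8_t written[WINDOW_SIZE]; /* 1 = byte has been written to */

/* Maps an address range to its shadow bytes, checking that it is tracked. */
static uint8_t *shadow_for(uint32_t addr, uint32_t size)
{
    assert(addr >= WINDOW_BASE && addr - WINDOW_BASE <= WINDOW_SIZE);
    assert(size <= WINDOW_SIZE - (addr - WINDOW_BASE));
    return &written[addr - WINDOW_BASE];
}

/* Fresh stack or heap memory: nothing has been written to it yet. */
static void on_alloc(uint32_t addr, uint32_t size)
{
    memset(shadow_for(addr, size), 0, size);
}

/* Every traced write marks the bytes it touches as initialized. */
static void on_write(uint32_t addr, uint32_t size)
{
    memset(shadow_for(addr, size), 1, size);
}

/* Every traced read is checked; returns 1 (and prints a report) if any
 * byte in the range has never been written to, 0 otherwise. */
static int on_read(uint32_t addr, uint32_t size, uint32_t ip)
{
    const uint8_t *shadow = shadow_for(addr, size);
    for (uint32_t i = 0; i < size; i++) {
        if (!shadow[i]) {
            printf("untainted address 0x%08x is being read @ 0x%08x..\n",
                   (unsigned)(addr + i), (unsigned)ip);
            return 1;
        }
    }
    return 0;
}
```

Replaying the third Proof of Concept against this model means calling on_alloc(0x1000, 12), then on_write(0x1004, 4) for the assignment to a[1], and finally one on_read per printf. The reads at 0x1000 and 0x1008 are reported, the read at 0x1004 is not. The addresses here are invented for the example.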

Other than that, we store our taint data (data which keeps track of which
memory is initialized and which is not) in 128 KB chunks. That
is, we have a list in which every entry points to the taint data for 128 KB
of memory (we store this in 16 KB of memory by using one taint bit per byte.)
These entries are allocated on demand in order to reduce the memory
footprint, but the overhead is still fairly big (as always, with taint
data.)
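
One possible layout for such a table, as a hedged sketch in C. It assumes a 32-bit address space (so 4 GB / 128 KB = 32768 entries) and uses made-up names (chunks, shadow_set, shadow_get, chunks_allocated). What a set bit means, and how the real Pintool treats memory that has no entry yet, is left open here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_BYTES  (128u * 1024u)     /* memory covered by one table entry */
#define BITMAP_BYTES (CHUNK_BYTES / 8u) /* 16 KB: one bit per covered byte */
#define NUM_CHUNKS   (1u << 15)         /* 4 GB / 128 KB (32-bit assumption) */

static uint8_t *chunks[NUM_CHUNKS]; /* NULL until a chunk is first modified */
static unsigned chunks_allocated;   /* only here to observe on-demand use */

/* Sets (value 1) or clears (value 0) the bit that shadows one address,
 * allocating the 16 KB bitmap for its chunk on first use. */
static void shadow_set(uint32_t addr, int value)
{
    uint32_t idx = addr / CHUNK_BYTES;
    uint32_t off = addr % CHUNK_BYTES;
    assert(value == 0 || value == 1);

    if (chunks[idx] == NULL) {
        chunks[idx] = calloc(BITMAP_BYTES, 1);
        if (chunks[idx] == NULL) {
            abort(); /* out of memory; a real tool would handle this */
        }
        chunks_allocated++;
    }

    uint8_t mask = (uint8_t)(1u << (off % 8u));
    if (value) {
        chunks[idx][off / 8u] |= mask;
    } else {
        chunks[idx][off / 8u] &= (uint8_t)~mask;
    }
}

/* Returns the bit that shadows one address. A chunk that was never
 * allocated reads as 0, and looking it up does not allocate it. */
static int shadow_get(uint32_t addr)
{
    const uint8_t *bitmap = chunks[addr / CHUNK_BYTES];
    uint32_t off = addr % CHUNK_BYTES;
    if (bitmap == NULL) {
        return 0;
    }
    return (bitmap[off / 8u] >> (off % 8u)) & 1;
}
```

With this layout, a program that only touches a few stack pages and a few heap pages pays for a handful of 16 KB bitmaps plus the pointer table, instead of a bitmap covering the whole address space.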

Proof of Concept

Binaries of the Pintool and of the Proof of Concept applications presented
earlier can be found here;
the up-to-date source can be found
here.

An example run of the Pintool against the Proof of Concept binaries looks
like the following.

gcc -std=c99 -O0 -o poc1.exe poc1.c
../../../ia32/bin/pin -t obj-ia32/readb4write.dll -- poc1.exe
untainted address 0x0027ff1c is being read @ 0x004013c5..
a: 2130567168

gcc -std=c99 -O0 -o poc2.exe poc2.c
../../../ia32/bin/pin -t obj-ia32/readb4write.dll -- poc2.exe
untainted address 0x0027ff10 is being read @ 0x004013c6..
untainted address 0x0027ff1c is being read @ 0x004013dd..

gcc -std=c99 -O0 -o poc3.exe poc3.c
../../../ia32/bin/pin -t obj-ia32/readb4write.dll -- poc3.exe
untainted address 0x02792bd0 is being read @ 0x004013e6..
a0: 41490040
a1: 0
untainted address 0x02792bd8 is being read @ 0x00401418..
a2: 0

References

  1. Valgrind
  2. RtlAllocateHeap – MSDN
  3. calloc – C++
