Automated Deobfuscation of Android Applications

Automated Deobfuscation of Android Applications

During AthCon 2013 last week I talked about Automated Deobfuscation
of Android Applications and Malware
. In particular, this presentation
focussed on using automated deobfuscation tools in order to speed up the
analysis of 3rd party applications which have been obfuscated.
Click here for the slides.
The cyanide and obad.a samples discussed in the presentation can be found
here (password infected.)

The Dexguard Scripts

At the moment the dexguard scripts are focussed at deobfuscating all
obfuscated strings and reconstructing a new dex file. As you will quickly see
when analyzing the _undexguard.dex files (explained further in the samples
section), this is far from complete deobfuscation. However, it gives a good
start at analyzing the sample more quickly. (And it’s work in progress.)

The tools?

As for now I have decided to, unfortunately, keep the code private. I’m
considering setting up a deobfuscation website later, where one can upload
a sample and download the deobfuscated sample a few seconds later.

Please do let me know if you (or your company) is interested in such
service!
You know where to contact me.

The Samples

A little explanation on the given samples.

Cyanide Sample

Cyanide.dex is a root exploit by Justin Case.

  • cyanide_original.dex is the original Cyanide binary
  • cyanide_dexguard.dex is a Dexguarded version of cyanide_original.dex
  • Running unchina.py on cyanide_dexguard.dex gives us cyanide_unchina.dex
  • Running our dexguard scripts on cyanide_unchina.dex gives us cyanide_undexguard.dex

Note: the dexguard version used for this sample is the one-to-last, however,
our framework has support for the latest version as well.

Obad.a Sample

Obad.a is a Most Sophisticated Android Trojan.

  • obad_original.dex is the original obad.a binary
  • We get obad_undexguard.dex after running our dexguard scripts on obad_original.dex

Note: obad_undexguard.dex will not run on an emulator or a real device due
to the way it’s built. (The cyanide_undexguard.dex, however, should work.)
Even though it doesn’t run, JEB loads the undexguarded file just fine,
so it’s mostly useful for analysis.

Pintool and Z3 Introduction

I’ve posted an introduction on Pintool and one on Z3 on the blog of our CTF Team, De Eindbazen.

And, yes, these actually provided us with results!

Pintool Introduction: http://eindbazen.net/2013/04/pctf-2013-hypercomputer-1-bin-100/

Z3 Introduction: http://eindbazen.net/2013/04/pctf-2013-cone-binary-250-2/

Cheers,
Jurriaan

Python Source Obfuscation using ASTs

Python Source Obfuscation using ASTs

Introduction

For one of the challenges of the Hack in the Box Capture the Flag
game last week, I decided to release an obfuscated and compiled Python class.
After doing some research on the internet about this particular topic, it
appeared there is no real up-to-date tool for this. I mostly found paid
software and/or software that has been outdated for several years, or at least
looks like it is.

Well, that’s great. This allows me to make something new-ish.

The Actual Challenge

The actual challenge was actually rather easy; given a teamname
and a flag on the commandline, the Python script would verify whether the flag
is correct or not. By correctly using a few prints here and there, the
challenge can be solved within minutes. Which is why we obfuscate it! As we
obviously want the teams playing our CTF to get some headaches :)

Abstract Syntax Trees

According to Wikipedia, an Abstract Syntax Tree is a tree representation of
the abstract syntactic structure of source code written in a programming
language. In other words, an AST represents the original source code as a
tree.

Fortunately for us, Python provides a built-in ast module which is able to
parse Python source into an AST (actually using the built-in compile()
method.) Besides that, the ast module gives us access to all available ast
nodes (e.g., Call, BinOp, etc.)

With this knowledge, we’re able to rewrite the original AST using the
ast.NodeTransformer class. For a brief example of what this
looks like, please refer to the official documentation.

Finally, after rewriting the AST, we can do two things. We can generate a
compiled python object directly from the AST. Unfortunately I did not find a
way to do this (or a library, for that matter), if you know of one, feel free
to point me to it :) The other option is to generate Python code from the AST
again and to compile it from there. For this step, one would use the
codegen.py module. (Note that I submitted a pull request,
as the current version gave me an error with regards to the omission of
parentheses for binary operations
.)

We’re now at the point where we can parse Python code into an AST, rewrite it,
and write a new Python source from the rewritten AST. The final step for my
challenge was to compile the created Python source into an object, which can
be done by executing the following command on the commandline. (I’m sure it’s
also possible using a Python function, but this works just fine for the
moment.)

$ python -mcompileall .
    

Decompilation

Before diving into the obfuscations I performed on the AST, I’d like to note
that the compiled Python object can be decompiled using Mysterie’s
python decompiler, uncompyle2.

Obfuscation through ASTs

So basically I only did a few simple obfuscations, which already proved to be
painful enough, but it’s a nice start for anyone that’s looking into doing
something similar.

The ast.NodeVisitor class, which was mentioned earlier in this blogpost,
allows one to visit each AST node in the tree, with the possibility to
modify them or to delete them. We can do this by implementing visit_
functions. For example, in order to analyzer/modify/delete certain Name
nodes, which are used for variable lookups etc, we implement a visit_Name
function in our obfuscation class (which, btw, extends ast.NodeVisitor.)

Modifying AST nodes using the NodeTransformer

The NodeTransformer can modify an existing AST in a fairly simple way. By
returning the original node, the AST remains untouched, as is showed in the
following example code.

from ast import NodeTransformer

class Example01(NodeTransformer):
    def visit_Str(self, node):
        return node
    

One can modify an AST node by return a new node. For example, to replace all
strings with an empty string, see the following snippet.

from ast import NodeTransformer

class Example02(NodeTransformer):
    def visit_Str(self, node):
        return Str(s='')
    

And, finally, to delete a node, simply return None in the visit_ function,
although this can give weird situations in which the new AST is not valid
anymore.

Example Obfuscation – Strings

In the AST, constant String nodes are represented with Str nodes. These
Str nodes have one interesting field, namely the s field, which contains
the actual string. For example, in the AST of the following Python source,
there will be exactly one Str node with the s field set to “Hello AST”.

print 'Hello AST'
    

For the challenge, I implemented a handful simple string obfuscations. Take
for example the following code (rewritten a bit, but similar to the code in
the obfuscator.)

from ast import NodeTransformer, BinOp, Str, Add

class StringObfuscator(NodeTransformer):
    def visit_Str(self, node):
        return BinOp(left=Str(s=node.s[:len(node.s)/2]),
                     op=Add(),
                     right=Str(s=node.s[len(node.s)/2:])),
    

Noteworthy in this example code is that BinOp is a node representing a
binary operation, in this case addition (because of the Add node.) A binary
operation takes a left operand and a right one. On the left we put the first
half of the actual string, and on the right we put the second half of the
string. When running this “obfuscator” on our example once, we get the
following code. (Note that you can run such obfuscator multiple times to
achieve extra painful code. This is what I did for the challenge :p)

print ('Hell' + 'o AST')
    

Other string obfuscations included reversing a string, i.e., “abc” ->
“cba”[::-1], and converting single-length strings (which you’ll get soon
enough when recursively running the obfuscator a few times) into a chr()
statement (i.e., “a” -> chr(0×61).)

The Obfuscated Challenge

After running the original challenge a few times through
the obfuscator, which, in addition to obfuscating strings, also
obfuscates integers, import statements, and global variable names, we get
our actual challenge.

And, yes, running the obfuscator several times does indeed look like the
following.

$ python hitbctfobf.py hitbctforig.py|python hitbctfobf.py -|...
    

Outro

Having pasted the original challenge in the blogpost, there’s not much left of
the challenge itself. However, I found the methods behind the obfuscation
fairly interesting, and perhaps so does somebody else.. :)

Cross-referencing stand-alone Dalvik Bytecode

Cross-Referencing stand-alone Dalvik Bytecode

A few days ago Patrick Schulz from BlueBox Security
posted an Android Challenge on BlueBox’ blog. In this blogpost we will
not go into the entire challenge, but rather focus on the patched bytecode.

Shameless self-promotion: tweet, reddit (I didn’t even have to make
the reddit post, hah.)

Introduction

After reading the blogpost, including the spoiler, it’s evident that the
native library will patch the bytecode of a particular function that was
originally implemented in classes.dex (the container which keeps all
dalvik bytecode with metadata.)

As part of research I’m doing for my presentation at AthCon I found
the patching process interesting in particular. This is actually a technique
I’ve thought about earlier, but then again, I’m sure many people have :)

Just-in-Time Bytecode

In order to speed up the process of executing the Dalvik bytecode, Android has
a Just in Time compiler, which may compile certain functions into native ARMv7
instructions. This allows the virtual machine to execute faster compared to
interpreting the bytecode naively.

I do not know the following for sure, as it depends on the internals of the
dalvik JIT, but it may require the bytecode to be patched before executing
it. If we were to patch the bytecode after it has been compiled by the JIT,
then who’s going to execute it? (This is just a side-note for anyone looking
to do the same in the future.)

Locating the Bytecode

Opening up the native library that can be found inside the apk, we find
ourselves with various functions dealing with the Dex file format.
After looking through the functions for a minute or two, we get to a
mprotect() call followed by a memcpy() call, this is where the function is
being patched, as described in the spoiler by the Patrick’s blogpost.

Extracting the Bytecode

I loaded the native library in IDA Pro. It appears that the symbols were not
stripped, so that makes it easier for us as well. Anyway, when the relevant
memcpy function is found, we see an obvious inject_ptr. Which is a pointer
to the target bytecode. We extract the few hundred bytes of bytecode directly
from the binary, as it’s not encrypted or anything, and put the hexdump in a
file. (Use the Hex View in IDA Pro.)

We then translate the hex dump into a binary file using the following command.

$ xxd -r -p hexfile binfile
    

Analyzing the Bytecode

Now we’ve got raw bytecode. It appears that dexdump doesn’t really know what
to do with it. We can do two things now. The first option would be to create
a new .dex file with the patched function (i.e., patching the original
.dex file with the new bytecode), but I don’t feel like doing that.
The second option would be to disassemble the raw bytecode and to fix all
cross-references to other methods, strings, etc. So that’s what I did.

For simplicity sake I wrote some Python bindings for my dalvik disassembler,
which I made a few months ago and is unfortunately still not public.
Disassembling the raw bytecode then results in the following output.

$ python dalvik.py binfile
0 const/4 v2, #+0
2 invoke-virtual {v12}, meth@3103
8 move-result v4
10 new-instance v5, type@475
14 invoke-direct {v5}, meth@3149
[...]
    

However, as you may notice, we’re missing some information here. So I also
wrote some Python bindings around my Dex file parser, which is still private,
just like the dalvik disassembler. The references in the bytecode, i.e.,
meth@3103 etc., are references to the original .dex file, so I dumped all
the relevant tables from the original .dex file into a simple database file
(actually just a pickle‘d dictionary, to make life easy.)

Having a database with all lookup tables we can now continue onto the
disassembling part. When disassembling a dalvik instruction, the disassembler
also returns whether there’s a lookup and in which table this lookup is.
Printing the correct information next to the instruction is therefore as easy
as reading from the correct table with the correct index. This looks like the
following in the disassembler code.

length, d = disasm(...)
if d.kind is None:
    print offset, d.string
else:
    print offset, d.string, ';', c[d.kind][d.index]
    

Disassembling again, with the database file as parameter, we get the following
output.

$ python dalvik.py -c bindb binfile
0 const/4 v2, #+0
2 invoke-virtual {v12}, meth@3103 ; ()I Ljava/lang/String; length
8 move-result v4
10 new-instance v5, type@475 ; Ljava/util/HashMap;
14 invoke-direct {v5}, meth@3149 ; ()V Ljava/util/HashMap; <init>
[...]
    

Now I thought that was pretty cool, so.. :)

Challenge Spoiler

For those of you that would like to do the challenge without having to write
several hundreds if not thousands of lines of code and/or without directly
patching the binary, the complete output of the bytecode can be found
here.

Small note: it appears my disassembler doesn’t really understand signed
shorts at the moment, but that’ll be fixed another time.

Toolz

I will release all the tools after my AthCon presentation. In the meantime,
I’ll be working on extending the code to do lots of other cool stuff with it
\o/

Darm – An armv7 disassembler

Darm – An armv7 disassembler

First of all, if you like this post and/or library, don’t hesitate to check
out the project on github, the official tweet (is this
even possible?) or the reddit thread on /r/programming.

Introduction

Darm is a lightweight, highly efficient, BSD 3-Clause licensed ARMv7
disassembler
written in C which gives you all the information you need, such
as flags and operands, in a compact structure.
Optionally you can generate a string representation from the given structure,
unlike every other ARMv7 disassembler I’ve come across, which only generate
strings.

Furthermore, darm ships with Python bindings.

Internals

Darm is, as advertised, efficient. Benchmarks will be presented in a follow-up
blogpost, but I’d estimate that for each supported instruction a maximum of a
few dozen if-statements and a handful table lookups are performed.

At the moment of writing this blogpost all regular instructions are supported,
this means all instructions except for the NEON and some funky
coprocessor instructions.

Support for NEON, Thumb2, and the coprocessor instructions is
planned for upcoming versions.

Usage

Darm features a simple C api, as well as Python bindings. Following is a C
snippet disassembling the “add r2, r3, r5, ror #5″ instruction.

#include <stdio.h>
#include "darm.h"

int main()
{
    darm_t d; darm_str_t str;

    // disassemble the instruction
    if(darm_armv7_disasm(&d, 0xe08322e5) == 0) {

        // print the register indices
        printf("Rd: %d, Rn: %d, Rm: %d\n",
            d.Rd, d.Rn, d.Rm);

        // print a string version of the
        // disassembled instruction
        if(darm_str2(&d, &str, 1) == 0) {
            printf("instr: %s\n", str.instr);
        }
    }
}
    
$ gcc sample.c libdarm.so -o sample && ./sample
Rd: 2, Rn: 3, Rm: 5
instr: add r2, r3, r5, ror #5
    

And, of course, the Python equivalent.

import darm

d = darm.disasm(0xe08322e5)
print d.Rd, d.Rn, d.Rm, d.shift
print d
    
$ python sample.py
r2 r3 r5 ROR #5
add r2, r3, r5, ror #5
    

Documentation

Documentation is currently being worked on and will be available in the
Git repository. For now, please refer to the darm.h header file as most of
the fields and functions are documented there.

License

As outlined in the introduction, darm is BSD 3-Clause licensed. This is
a flexible license which should allow you to use it as you wish.

Support

If you have questions, suggestions, or anything else, feel free to drop me an
email or join the official IRC chat, #darm on freenode.

Python Binary Extensions for Compilers

Python Binary Extension for Compilers

Or, pre-processing the pre-preprocessor

As many have stumbled upon before, the C standard does does not support Binary
Constants. In this post we’ll see how we can add support for Binary Constants
ourselves, in a hacky hacky way, of course!

Introduction

Inspired by the Python syntax, where 0b111 equals 7, while working a bit
on a potential ARMv7 disassembler (which will likely never be finished), I
decided that C should support binary constants just like Python.

Turns out there is actually a GCC Extension to do exactly this, but
then you wouldn’t be able to compile your favourite project using the
compiler provided by microsoft.

So, in order to keep cross-compiler support, I’ve decided to hack up some
experimental wrappers around gcc and cl.

Pybinext to the rescue

Pybinext is the utility which wraps around gcc and cl and basically
pre-processes the pre-processor. Using a simple regular expressions, we detect
the pattern which matches the binary constants, namely 0b[0-1]+. (The actual
regex query is a bit more complex, so it tries to avoid false positives as
much as possible.)

From there on, the script temporarily replaces the source file with an updated
source file which has been pre-processed to contain normal numbers rather than
the binary constants. The script then leaves the compiling upto the real
compiler, i.e., gcc or cl, and after compilation restores the original
source file.

That’s pretty much. There might still be some unhandled edge cases, but the
script correctly handles whenever a binary constant is in a string literal, on
a newline, etc.

Extending Pybinext

It’s fairly easy to extend pybinext in such a way to support more exotic
features, but I’ve yet to come up with one, so that’s for another blogpost
perhaps.

Proof of Concept

Assume we’ve got the following source file.

#include <stdio.h>

int main()
{
    printf("-> %d\n", 0b1110);
}
    

Running it under the microsoft compiler will normally give a series of errors,
as can be seen here:

$ cl main.c
main.c(7) : error C2059: syntax error : 'bad suffix on number'
main.c(7) : error C2146: syntax error : missing ')' before identifier 'b1110'
main.c(7) : error C2059: syntax error : ')'
    

However, running it under our pycl.py script, the result is a successful
compilation, as can be here:

$ ./pycl.py main.c
    

Same applies to the gcc script, namely pygcc.py. Now go and use Binary
Constants in your script!

Source

As always, the source of this simple utility can be found on github.

Apache Log Parsing

Apache Log Parsing

Just like Joe Landman I’ve been annoyed for years by the fact that
there hasn’t been a single Apache Log Parsing Library out there, on
the internet. Today I finally decided to do one last Google search query, only
to find out that there’s a few libraries, but these
are too bloated for my taste (and usage.)

Tweaking the Regex

The Regex query as given by Joe is almost perfect. I modified two parts in
order to be able to parse a few more queries and make it a bit more useful.

Split Request Method and Path

One of the return values of the original Regex query is the following:

# c[6] = incoming request (GET, PUT, HEAD, ... with relative URI part)

By splitting the request method and the path, it’s easier to handle these
values later. I.e., instead of “GET /abc” we will now get a “GET” and a “/abc”
value.

Fix Content-Length Bug

In some cases apache will redirect a request, i.e., send a Location: header.
These responses don’t have a Content-Length:, instead they have a dash as
value in the logs. The original Regex incorrectly tries to parse this dash
with \d. By changing this into a \S we fix this bug.

Simple Usage

Because webalizer limits the amount of Search Strings to the top 20
(Search Strings being the queries that people google for before they land on
my website) I decided to make a simple utility based on apachelog.

(Note that I found out about the existing apachelog library only afterwards I
created my own, hence the duplicate name.)

Following is a simple tool to search queries from Google referers.

import apachelog
import sys

if __name__ == '__main__':
    # enumerate each entry in the apache log file
    for entry in apachelog.enumerate(sys.argv[1]):
        # check whether the domain is something-google
        # and contains a query string
        if entry.referer and \
                'google' in entry.referer.netloc and \
                entry.referer.kwargs.get('q', None):
            print r.kwargs['q']
    

See, it’s really not that hard to parse some log files. Btw, turns out
somebody googled for pointer contain garbege untilit is uninitialized in
order to get to my website, now tell me that isn’t cute.

Code

As always, code can be found on github.

Pin Denial of Service

Pintool Denial of Service

As described in an earlier blogposts, Pintool is a
Dynamic Binary Instrumentation framework. In this blogpost we analyze a
simple Denial of Service approach which is tested for Pintool, but should also
work with other Dynamic Binary Instrumentation frameworks.

Internal Workings of Pin

When instrumenting a binary using Pin, Pin will translate every executed
basic block into their own version. That is, the translated block of
instructions will do the same as before, but the new basic
block will also execute additional code. For example, if the original basic
block calls another function, Pin will check if it should translate the new
basic block, or if it has already been translated (and if so, it will execute
the existing translation, rather than translating the basic block again and
executing it from there.)

Abusing Pin Internals

As just-in-time compilation is a fairly expensive operation (rather
than executing a set of instructions, such as a basic block, Pin has to
disassemble the instructions, check if any adjustments should be made,
translate them into the new block, and execute the translated block), we can
get Pin to perform expensive operations in order to trigger a Denial of
Service.

That being said, the expensive part of executing a new basic block under Pin,
is the translation. In other words, if we want to exhaust Pin, we will want to
have Pin translate many basic blocks. We do this by making sure that Pin
executes as many new basic blocks as possible.

Denial of Service Idea

In order to have Pin translate as many basic blocks as possible, without
continously altering memory of basic blocks in the process, we execute a
sequence of nop instructions in a backwards fashion. So imagine we have a
nopsled of length ten (that is, ten consecutive nop instructions), now if we
execute the last nop instruction, and then jump back to the second last nop
instruction, Pin will have to translate the new basic block. At this point
of time, there will be two basic blocks, the one starting at the last nop
instruction, and the one starting at the second last nop instruction.
Continuing like this we can have Pin translate N “unique” basic blocks from
a nopsled containing N nop instructions.

Proof of Concept

In the Proof of Concept we show two similar ways to mess around with
Pin. The first by creating a nopsled, as described above, with N = 0×10000
(in other words, there will be 65536 nop instructions, resulting in the
execution of 2147516416 nop instructions in total.) This takes Pin give or
take a 100mb of ram and a few minutes to finish.

The second way is by using N = 0×1000000 (or, 16777216 nop instruction),
and not by stepping one nop instruction back every iteration (because that
would result into a lot of nop instructions), but 0×10000. So if the first
iteration would execute the last nop instruction, then the second iteration
would execute the last 0×10001 nop instructions (0×10000 + 1, the last.) In
theory this almost equals in the amount of nop instructions that will be
executed compared to the first way, namely 2155872256. In practice, however,
Pin will have to translate 0×10000 * i nop instructions for every iteration.

I didn’t even wait for the second method to finish when running under Pin, but
after ten minutes or so, it’s using over one gigabyte of memory, and it will
probably run out of memory or so..

Note that this behavior is by design, and that it is probably rather hard if
not impossible (in my opinion), to solve. Besides that, this behavior is seen
when using a plain pintool. It might be possible, however, to fix
this in a specific pintool, but I’ll leave this as an exercise to the reader.

When running poc.exe under normal circumstances, i.e., not under Pin, we get
something like the following output:

    $ poc.exe
    nop-instructions: 65536
    executing nops:   2147516416
    that took 0.296000 milliseconds
    nop-instructions: 16777216
    executing nops:   2155872256
    that took 0.406000 milliseconds
    

Running the poc.exe under Pin, we get something like the following:

    $ ../pintool/ia32/bin/pin.exe -t pindos-x32.dll -- poc.exe
    nop-instructions: 65536
    executing nops:   2147516416
    that took 40.186000 milliseconds
    nop-instructions: 16777216
    executing nops:   2155872256
    ...
    

It is obviously taking a little extra time under Pin.. ;-) Other than that
this is not really that interesting.. So, enjoy your day doing something
useful instead.

Btw, the source of a bare Pintool and the source of the Proof of Concept can
be found in the following repository: pindos.

Malware.lu HackGyver Challenges

Malware.lu HackGyver Challenges

On the 13th of december, 2012, malware.lu
released two challenges.
One for windows,
and one for linux.
In this post we will discuss both challenges, first the windows binary,
then the linux binary.

HackGyver Windows Binary

After loading the binary into IDA Pro, we start off by analyzing the main
function, which is located at sub_401240. The application registers
a regular windows user interface so the user can enter the PIN Number in a
textbox and press on the validate button. When registering a user
interface, the application has to give a handler function. This function
contains all code related to initializing the user interface, handling
button clicks, etc. For this application, the handler is located at
sub_4013a0.

Following the handler function we first see some initializing stuff (some
function calls to create the button etc for the user interface.)
Then we get to the interesting part.

The PIN Number is retrieved from the GUI, using GetWindowTextW.
This unicode string is then converted to ascii, after which it is passed
onto sub_401000. This function checks whether the pin number is
correct or not, and returns 1 on success and 0 on failure.

Function sub_401000 contains routines to extract an encrypted image
from the resource section (i.e. an embedded image.) This image is then
decrypted using the given PIN Number, a number in the range 10000..99999.
Although the code doesn’t necessarily require the pin number to be a
number, the text “Enter your PIN code” is a dead giveaway that this in
fact the case. The length five is hardcoded though, hence the range starts
at the number 10000 and finished at 99999 (the pin number in ascii form is
the key to decrypt the encrypted image data.)

Further analysis show that the sub_401000 function calls two other
functions in order to decrypt the encrypted image. These functions are
standalone, i.e. don’t require any global values or anything else but the
parameters. After the image has been decrypt in-memory, the first four
bytes of some internal state are checked against the magic value
“\xaa\xbb\xcc\xdd”, in order to verify that the decryption key was
correct.

The sub_401000 function is, just like the two decrypt functions, a
standalone function, and can be called simply by passing a string
containing the pin code as first parameter.

In order to solve this challenge I decided to go for a bruteforce
approach, i.e. trying the pin numbers 10000 upto and including 99999. As
stated earlier, the sub_401000 function returns 1 on success and 0
on failure. So by checking the return value we can see if the pin number
was correct.

Finally, we find that the sub_401000 function has the fastcall
calling convention
and that the binary is ASLR enabled.

Bruteforcing the PIN Number through IDA Pro

I decided to bruteforce through IDA Pro by using the
AppCall function provided by IDC/IDAPython.

To use AppCall we first have to execute the binary and attach to it
using IDA Pro. This is as simple as pressing F9 after the binary has
loaded up in IDA Pro. Then we have to suspend the process, do this by
pressing the pause button in IDA Pro.

We then have to define our own prototype as outlined in
this
blogpost by hexblog. As the binary has ASLR enabled, the address of the
sub_401000 function changes every time. Let’s assume, for this example,
that the function is located at 0x12f1000. The prototype looks like
the following (note that you have to type the lines of code into the
IDAPython console within IDA Pro, or do some fancy stuff using
idascript..):

proto = 'int __usercall decrypt<eax>(const char *a<ecx>);'
fn = Appcall.proto(0x12f1000, proto)

It’s fairly straightforward; we define a function with parameters given
through registers using __usercall (note that there appears to be
no __fastcall), we call the function decrypt (although this
name has no special value here), the return value is in the eax
register, and finally, the first parameter is in the ecx register.
(The fastcall calling convention places the first parameter in the
ecx register, and the second parameter in the edx register,
although this function doesn’t take a second parameter.)

Now we have defined the prototype, it’s time to bruteforce. There’s not
much to bruteforcing, so here is the script to do it.

for x in xrange(10000, 100000):
    if fn(str(x)):
        print x
        break

Success!

After waiting for a few seconds the code has executed, and we see a number
being printed; 13044. Looking into our directory we find
that an image has been written. This is the following image:

Challenge Succeeded Image

That being said, we have solved the challenge :-)

HackGyver Linux Binary

Whereas the Windows Binary was focused on decryption of encrypted data,
the linux binary focusses more on hashing and encoding.

The Linux binary takes one parameter on the command line, the key. The
key has to be atleast nine characters long, or it will be rejected.

The main routine calls two different functions, namely md6 and
RC4_encode, and finally checks two strings with strcmp.

md6 and RC4_encode

The debug symbols were messed up on purpose. The function md6 is
not really using the
md6 hashing algorithm,
instead, it’s a wrapper around the
md5 hashing algorithm. The
md6 function takes a zero-terminated string and returns the
hexadecimal representation of the md5 hash of this string.

Then we have the RC4_encode function which takes only two
parameters, whereas one would expect an
rc4 algorithm to take
atleast an input, an output, and a key parameter. It turns out that the
RC4_encode function simply takes a string and a length (which is
hardcoded to nine) as arguments. Since the length parameter is hardcoded
to nine, it will treat the input string as if it was truncated. It will
then copy the string into a new buffer, append the
base64
encoded version of the string and return this newly created string.

In other words, the RC4_encode function can be represented in
Python as the following.

def RC4_encode(s):
    # strip the string because the base64 encoding
    # tends to append a newline character
    return s + s.encode('base64').strip()

strcmp(hash1, hash2)

Looking back at the decompiled code of the main function, we see
the strcmp which compares both hashes, and they
have to be equal. So what happens is the following.

Hash1 is calculated by taking the md5 hash of the entire key which
is given on the commandline.

Hash2, however, is the md5 hash of the string generated by the
RC4_encode function. The RC4_encode function is called with
the hardcoded length nine, as mentioned earlier. In other words,
everything after the first nine characters in the key which we give on the
commandline is ignored.

Conclusion: we have to come up with a key on the commandline which equals
the output of the RC4_encode function. To automatically generate
a correct key for any combination of nine characters we have the following
few lines of Python code.

import sys

# grab the first 9 chars of the key
if len(sys.argv) != 2 or len(sys.argv[1]) != 9:
    print 'Usage: %s <9-char key>' % sys.argv[0]
    exit(0)

# generate the key
print sys.argv[1] + sys.argv[1].encode('base64').strip()

Pwnd :-)

Generating a key and solving the challenge is as simple as executing the
following two commands.

$ python hackgyver.py 123456789
123456789MTIzNDU2Nzg5

$ ./hackgyverlnx 123456789MTIzNDU2Nzg5
Well, now create your own keygen ; )

That was it for today. Now get yourself curious for the next release of
Cuckoo Sandbox, because a new
version will be released soon!

Pintool Makefile

Pintool Makefile

Unfortunately there hasn’t been a new blogpost in a while. Today’s
blogpost will be a short one.

So, anyone that has been using pintool on windows has faced the beautiful
Nmakefile that you get to use. And it only works using the Visual
Studio Command Prompt, which is basically cmd.exe with some extra
environment variables set.

Being a bash user nowadays (yes, you can have bash on windows using e.g.
Cygwin), it is useful to me to be able to
build my pintools from the bash terminal, rather than the command prompt.
(Note that Git for Windows also
ships with a bash terminal and a nice amount of bash utilities, such as
grep etc.)

That being said, the bash Pintool Makefile can be found
here.
(Click on the raw button at github in order to get it in
plaintext.)

You will want to set the PINTOOL variable to the relative or
absolute path in which you’ve installed (read: unpacked) pintool. Finally,
set the DLLS variable to the name of your dll (you can also do
multiple dll’s, if you want to build multiple pintools.) Note that a
pintool called abc.dll is built from the sourcefile abc.cpp.

From now on, you can build your pintool(s) simply by executing make
in your bash terminal. If you want to delete the compiled files, simply
run make clean. (In the Makefile you can see that the *.dll
etc are single-quoted, this is because the windows port of make
simply crashes when you try to run e.g. rm *.dll directly. If
anyone has a solution for me.. please do tell, and I’ll buy you a beer in
return.)