Transparent MITM with Cuckoo Sandbox

In a series of upcoming blogposts I will be sharing a fair amount of cool
features that have been worked on over the past year in Cuckoo Sandbox. This
first blogpost features Man in the Middle support for Cuckoo Sandbox.

(For those who are familiar with Cuckoo Sandbox and the general ideas behind
MITM, please scroll down to the slightly more exciting stuff in the
"Transparent snooping of HTTPS traffic" section).

So, man in the middle?

As we are well aware, MITM generally refers to the process of snooping on
otherwise encrypted information, in this case network traffic. In this
blogpost we will dive into two different ways of doing MITM:

  • Providing a CA Root Certificate to allow a MITM proxy to intercept traffic.
  • Transparent dumping of TLS Master Secrets to decrypt TLS traffic.

Eh, Cuckoo Sandbox?

Before we continue with the MITM stuff, first a reminder on Cuckoo Sandbox. As
some of you will know, Cuckoo Sandbox is an Open Source Automated Malware
Analysis Sandbox. Analyses are performed by starting a VM (Virtual Machine),
running the potentially malicious sample (or URL, as we will be exploring in
this blogpost) inside the VM, and stopping the VM once the analysis is done.

Due to a personal interest, and that of some of my clients, Cuckoo has been
getting much, much better at analyzing Internet Explorer and the like in the
past few months. This is due both to improvements in the analysis itself and
to developments outside of the actual analysis, as will be outlined in this
blogpost.
(For a part of the improvements on the analysis part I would like to thank
Brad Spengler for continuously providing feedback and bug fixes).

At work with mitmproxy

The first solution to provide MITM support to Cuckoo was to integrate a tool
called mitmproxy, created by Aldo Cortesi and maintained by fellow
The Honeynet Project member Maximilian Hils.

As outlined by the documentation, mitmproxy works by
installing a CA Root Certificate on the target device, in this case a
VM running either Windows XP or Windows 7.

After Googling around and looking at GUI dialogs to import certificates into
the Windows Certificate store I finally managed to find an
easy command-line way to import a certificate (one that only works on
Windows 7, not Windows XP). Basically, invoking certutil.exe imports a .p12
certificate. This certificate can be found in ~/.mitmproxy after running
mitmproxy once (the first time mitmproxy is run on a system it automatically
creates a unique set of certificates).
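
As a rough illustration of that import step, here is a minimal Python sketch
that builds and runs the certutil command inside the guest. The -importpfx
verb and the mitmproxy-ca-cert.p12 file name are my assumptions of what is
meant here, and both function names are made up for this example; this is not
necessarily how Cuckoo does it.

```python
import subprocess


def certutil_import_command(cert_path):
    """Build the certutil invocation for importing a PKCS#12 (.p12) bundle.

    Assumption: the "-importpfx" verb is the certutil option in question.
    """
    return ["certutil.exe", "-importpfx", cert_path]


def import_certificate(cert_path):
    """Run certutil and return its exit code.

    Only meaningful inside the Windows 7 guest, where certutil.exe exists.
    """
    return subprocess.call(certutil_import_command(cert_path))


if __name__ == "__main__":
    # Hypothetical guest-side location of the file copied from ~/.mitmproxy.
    print(certutil_import_command(r"C:\mitmproxy-ca-cert.p12"))
```

The command is built separately from running it, so the argument list can be
inspected on the host without a Windows guest at hand.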

At this point there are two ways to route traffic from the VM into
mitmproxy. For the time being I have taken the easy way, which
involves explicitly routing traffic through a socks4/5 proxy, but this
approach has obvious disadvantages:

  • This technique is not compatible with Certificate Pinning.
  • Looking at the PCAP file all traffic goes to the proxy.
  • Having to explicitly tunnel traffic through socks4/5 translates into this
    technique not working for anything but Internet Explorer (i.e., at this
    point no support has been provided for other applications).
  • Hostnames are not resolved in the VM. Did I mention all the traffic goes to
    the proxy?

A better approach would be to route VM traffic to the proxy by the use of a
tool such as redsocks (not to be confused with RedSocks, a
Dutch startup and one of my clients, providing the malware threat defender, a
network security appliance for detecting malware infections and other unwanted
software in your corporate network).
Anyway, a possible drawback of such a tool is that it has to be configured
through various root commands, and root privileges are generally not available
to Cuckoo once it is running. I have to look into this later.
(This technique also still requires the CA Root Certificate and is thus not
compatible with Certificate Pinning).

Transparent snooping of HTTPS traffic

Going a bit more in-depth with HTTPS and TLS we learn
that in the TLS protocol the client and server exchange a per-session
random which, in combination with the master secret, can be
used to derive the encryption keys, MAC keys, and IVs (when needed) which in
turn allow one to fully decrypt the TLS stream.
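
To make that derivation a bit more concrete, below is a minimal Python sketch
of the TLS 1.0 PRF from RFC 2246 and the "key expansion" call that turns the
master secret and both randoms into the key block. The function names are
mine, and newer TLS versions use a different hash construction, so treat this
as an illustration of the idea rather than a complete implementation.

```python
import hashlib
import hmac


def p_hash(hash_name, secret, seed, length):
    """P_hash from RFC 2246: chain HMACs until enough bytes are produced."""
    out = b""
    a = seed  # A(0) = seed
    while len(out) < length:
        a = hmac.new(secret, a, hash_name).digest()  # A(i) = HMAC(secret, A(i-1))
        out += hmac.new(secret, a + seed, hash_name).digest()
    return out[:length]


def tls10_prf(secret, label, seed, length):
    """TLS 1.0 PRF: P_MD5 over the first half of the secret XOR P_SHA1 over
    the second half (the halves share a byte when the length is odd)."""
    half = (len(secret) + 1) // 2
    s1, s2 = secret[:half], secret[len(secret) - half:]
    md5_part = p_hash("md5", s1, label + seed, length)
    sha1_part = p_hash("sha1", s2, label + seed, length)
    return bytes(x ^ y for x, y in zip(md5_part, sha1_part))


def key_block(master_secret, client_random, server_random, length):
    """Key block holding the MAC keys, encryption keys and IVs.

    Note the seed order: server random first, then client random.
    """
    return tls10_prf(master_secret, b"key expansion",
                     server_random + client_random, length)
```

The seed order matters here: the server random comes first in the "key
expansion" call, which is also why that call is a convenient place to pick up
the server random later on.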

Reading further we find a write-up by Brendan Dolan-Gavitt, the developer of
PANDA, on where and how to intercept the PRF function. We also find which
information is required to decrypt TLS streams in Wireshark.

Time to take a step back. So we require the RSA Session ID, which, as defined
in the TLS protocol, can be extracted from the Server Hello record.
We also require the Master Secret, which, as we have seen, can
be extracted from the PRF function call. By instrumenting the PRF function
call looking for calls which feature the “key expansion” string (as defined in
RFC 2246) we see that we can extract the master secret together with
the server random.
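
Extracting the Session ID and server random from a Server Hello record can be
sketched in Python as follows. It assumes the Server Hello is the first
handshake message in an already reassembled TLS record, which real traffic
does not always guarantee; parse_server_hello is a name made up for this
example.

```python
import struct


def parse_server_hello(record):
    """Return (session_id, server_random) from one TLS record, or None.

    Assumes `record` is a complete TLS record starting with its 5-byte header.
    """
    if len(record) < 5:
        return None
    content_type, _version, length = struct.unpack("!BHH", record[:5])
    body = record[5:5 + length]
    # 22 = handshake record, 2 = Server Hello handshake message.
    if content_type != 22 or len(body) < 39 or body[0] != 2:
        return None
    # Layout: type(1) length(3) version(2) random(32) sid_len(1) sid(sid_len)
    server_random = body[6:38]
    sid_len = body[38]
    session_id = body[39:39 + sid_len]
    if len(session_id) != sid_len:
        return None  # truncated record
    return session_id, server_random
```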

Long story short. If we extract each pair of server random and master secret
from the PRF function in lsass.exe (Brendan outlined that all TLS
encryption is performed by the lsass.exe service on Windows, Windows 7 at
least), and if we extract the Server Hello records from the PCAP file, which
link the Session IDs to the server randoms, then
we can cross-reference this information to write the Master Secret file with
matching RSA Session ID and Master Secret for each TLS session that was
negotiated during the analysis in the VM. (Note that to cross-reference we
extracted the server random in both scenarios, once from the Server Hello
record and once from the PRF "key expansion" function call).
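
The cross-referencing itself boils down to a join on the server random. A
minimal Python sketch, assuming both sides have already been collected into
dictionaries and that the output uses the "RSA Session-ID:... Master-Key:..."
line format that Wireshark understands (build_tlsmaster and both parameter
names are hypothetical):

```python
def build_tlsmaster(session_to_random, random_to_master):
    """Join the PCAP-side and lsass-side data on the server random.

    session_to_random: {session_id: server_random} from Server Hello records.
    random_to_master:  {server_random: master_secret} from the PRF calls.
    All keys and values are bytes. Returns the text of the secrets file.
    """
    lines = []
    for session_id, server_random in sorted(session_to_random.items()):
        master_secret = random_to_master.get(server_random)
        if master_secret is None:
            continue  # no matching PRF call was seen for this session
        lines.append("RSA Session-ID:%s Master-Key:%s"
                     % (session_id.hex(), master_secret.hex()))
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    sessions = {b"\x01" * 32: b"\xaa" * 32}
    secrets = {b"\xaa" * 32: b"\x42" * 48}
    print(build_tlsmaster(sessions, secrets), end="")
```

Sessions for which no master secret was captured are simply skipped, so the
resulting file only lists sessions that can actually be decrypted.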

Fast forward through various long nights of debugging code, many changes and
improvements to Cuckoo to be able to facilitate all of this in the first
place, and matching the various pieces of extracted information to each other,
and we finally end up with functionality in Cuckoo to dump a tlsmaster.txt
file for each analysis.

To recap some facts about this transparent approach:

  • It does not require any special handling for the analyzed application,
    only instrumentation of the lsass.exe service.
  • Cuckoo can decrypt any TLS/HTTPS stream that uses the Windows API to perform
    the TLS/HTTPS encryption. Including those of Windows Update, etc.
  • Since there is no need to proxy the traffic through some 3rd party tool, the
    PCAP file looks the same as it would without our transparent sniffer.
  • As nothing happens with the TLS itself, applications that use Certificate
    Pinning are supported.

The following screenshot shows Wireshark with a PCAP containing decrypted
HTTPS traffic of an analysis visiting the login page of the Dutch banking
website ING, using the latest Cuckoo Sandbox:

Wireshark vs ING

HTTP/HTTPS replay tool

Because I was not really able to find such code elsewhere, and because tshark
falls under the not invented here rule, I worked up a small Python project
that extracts HTTP and HTTPS streams from a PCAP file with the
accompanying TLS Master Secrets file. To be fair, integrating a tool such as
tshark with a tool such as Cuckoo Sandbox is suboptimal, as naturally one of
the future goals is to include decrypted HTTPS traffic in the Cuckoo reports
without having to depend on tools like mitmproxy (due to the non-transparency
thing).

The final goal of httpreplay will, as one might expect, be to transparently
replay HTTP/HTTPS traffic from a PCAP file. At the moment this last step has
not been implemented yet, though. Among other goals, this can be used to
reproduce and unittest the analysis of certain websites with Cuckoo Sandbox,
etc, etc.

Quickly running the httpreplay tool on the same PCAP as shown earlier we find
the following output (just URLs of extracted HTTP/HTTPS streams):

$ python httpreplay.py dump.pcap tlsmaster.txt
http://mijn.ing.nl/
https://mijn.ing.nl/internetbankieren/
https://mijn.ing.nl/favicon.ico
...

Some readers may note this tool is very similar to a thousand others in its
field, one of which is CapTipper, developed by our friends at
CheckPoint. At the moment the only added value of httpreplay would be HTTPS
support (and perhaps proper TCP reassembly and the future goal of being able
to operate on multi-gigabyte files - in-memory loading and all that).

Conclusion

Knowledge about TLS was gained. Tools were reinvented. Cuckoo Sandbox gained
some new tricks. I finally wrote another blogpost ;)
