XSS

Cross Site Scripting with Examples

What is Cross Site Scripting?

“Cross-site scripting (XSS) is a type of computer security vulnerability typically found in Web applications that enables attackers to inject client-side script into Web pages viewed by other users.” ^[1]

How does XSS Work?

The idea of Cross Site Scripting is to inject code into an existing website, thereby “extending” the website’s functionality. Websites consist mainly of html, css and javascript. Each of these languages is plain text, so what happens when a website includes data supplied by a user? User-supplied data is plain text as well, so the browser cannot tell it apart from the markup the developer wrote. Websites therefore have to be very careful when using and displaying data they cannot trust, which is ALL data that a regular user can insert or tamper with (think username, “About myself”, URLs, messages, etc). Unfortunately a lot of websites fail to correctly sanitize the user’s content and are therefore vulnerable.
The key to success for Cross Site Scripting lies in the fact that each website has its own context in a browser: a script running under the context of website A cannot access or modify the context of another website, website B. So in order to access or modify data from website B, one has to run inside the context of website B. This is where XSS comes into play.

What can XSS do?

As outlined in the introduction, it’s possible for an attacker to include their own code in a vulnerable website. What the attacker can do with it depends on the discovered vulnerability. Assume that user Bob is surfing the internet and stumbles upon a ‘prepared’ webpage. Bob is on a legitimate website, but unknowingly he is now running malicious code from Vladimir. The attack might be limited by certain restrictions (various situations will be discussed later), but Vladimir is very smart and has obviously found a bypass for the given restrictions. At this point Vladimir might steal cookies, extract information from the current user, automate actions and much, much more. It all depends on the type of website that is affected and the intentions of the attacker.

Stealing Cookies

Cookie stealing is one of the most common uses of Cross Site Scripting attacks. Each opened website has its own context, as explained earlier in How does XSS Work?. With a Cross Site Scripting attack, however, we can access and/or modify the user’s cookies for the attacked website. A cookie typically contains a unique session identifier for the currently browsing (or logged in) user. If an attacker obtains this cookie, it’s possible to use the same session from another (the attacker’s) computer. Example: Vladimir found an XSS vulnerability on hotmail.com and tricked Bob into opening a specially crafted URL/page that triggers the vulnerability. Once Vladimir’s evil code has executed, Bob has unknowingly given his cookie to Vladimir. While Bob is still reading his mail, Vladimir is already at work: downloading important mails from Bob’s account and/or sending spam to all of Bob’s friends. The possibilities are only limited by Vladimir’s imagination.

Extract Information & Automated Processes

As you can imagine, stealing cookies is not always interesting for an attacker. Some websites are very strict with their cookies: they might lock a session to an IP address^[2], thereby rendering cookie stealing useless. There are other protections against cookie theft as well, such as the HttpOnly flag, which hides a cookie from javascript. In these cases it’s more convenient for an attacker to do all data gathering (and whatnot) immediately when the vulnerability is triggered. Keep in mind that an attacker might be able to do *anything* the victim can do on the website; this could include sending money from your bank account to another bank account, showing phishing pages, exploiting your browser to install malware, etc.
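
To give an idea of what the HttpOnly protection looks like in practice, here is a minimal javascript sketch that builds the value of a Set-Cookie response header. The helper name buildSessionCookie, the cookie name session and the extra attributes are my own assumptions, not taken from any particular website.

```javascript
// Hypothetical helper: builds a Set-Cookie header value for a session cookie.
// The cookie name "session" and the chosen attributes are assumptions.
function buildSessionCookie(sessionId) {
  return [
    `session=${encodeURIComponent(sessionId)}`,
    'Path=/',
    'HttpOnly', // the browser will not expose this cookie via document.cookie
    'Secure',   // the cookie is only sent over HTTPS
  ].join('; ');
}

console.log(buildSessionCookie('abc123'));
// session=abc123; Path=/; HttpOnly; Secure
```

Note that this only keeps the cookie away from injected javascript; the injected code itself still runs.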

Exploitation

XSS Exploitation 101

Now we are going to work through the basics of XSS exploitation. To start off, I would like to mention that XSS can appear in any (web-)scripting language, including html, javascript, css, vbscript, etc. We will start with the basic vulnerabilities and end with obscure and/or rarely used ones (rare because they are not common, not because nobody knows about them).

Even though XSS vulnerabilities can appear in several (web-)scripting languages, the most common vulnerabilities are found in HTML^[3]. Here is a simple example of a website that has a title, some text and some javascript that shows a messagebox:

<html>
    <head>
        <title>Simple Page's Title</title>
    </head>
    <body>
        <script>
            alert('msgbox in javascript');
        </script>
        <p>Some text here</p>
    </body>
</html>

Usually the idea of XSS is to trigger javascript execution (just like the messagebox). A PoC^[4] (Proof of Concept, showing off a vulnerability in this case) will most of the time pop up a messagebox with a reference to the person that found the vulnerability. Most webpages are dynamically created by a PHP^[5] (or ASP.NET^[6], Python^[7], …) script, even though browsers only see the resulting html. These scripts take user-supplied information as input and create a webpage as output. Take the following PHP script:

<html>
    <head>
        <title>Simple Page's Title</title>
    </head>
    <body>
        <p>Some text here</p>
        <p>Some more text here.. <?php echo $_GET['text']; ?></p>
    </body>
</html>

The 7th line is where the magic happens. The <?php tag is where the PHP code starts (it ends with the ?> tag). It would be nice to have a little PHP experience for the next part of the article, but the reader will manage without it as well, I hope. The little PHP line in this script will ‘echo’ (i.e. print the contents of) the variable $_GET['text']. $_GET is an array of the parameters passed to the PHP script in the URL’s query string. In this line of code we obtain the ‘text’ parameter and print it directly into the html code. Below is the output for a harmless value (HELLOTHERE), followed by the output when the script is exploited.

<html>
    <head>
        <title>Simple Page's Title</title>
    </head>
    <body>
        <p>Some text here</p>
        <p>Some more text here.. HELLOTHERE</p>
    </body>
</html>

<html>
    <head>
        <title>Simple Page's Title</title>
    </head>
    <body>
        <p>Some text here</p>
        <p>Some more text here.. <script>alert(1)</script></p>
    </body>
</html>
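
The same substitution can be reproduced outside PHP. Below is a small javascript sketch that mimics what the PHP script does with the text parameter. The name renderPage is made up and the template is shortened, so treat it as an illustration and not as the real script.

```javascript
// JavaScript stand-in for the PHP page: the parameter is concatenated into
// the markup without any encoding (renderPage is a hypothetical name).
function renderPage(text) {
  return [
    '<body>',
    '    <p>Some text here</p>',
    `    <p>Some more text here.. ${text}</p>`,
    '</body>',
  ].join('\n');
}

const benign = renderPage('HELLOTHERE');
const malicious = renderPage('<script>alert(1)</script>');

console.log(benign);
console.log(malicious);

// The second page now contains a script element the developer never wrote.
console.log(malicious.includes('<script>alert(1)</script>')); // true
```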

In the second output we exploited the vulnerability that resides in vuln1.php (the PHP script above) simply by submitting a script element with some javascript in it as the text parameter. This is the easiest example of an XSS vulnerability. Another common vulnerability is in an attribute of an html element:

<html>
    <head>
        <title>Simple Page's Title</title>
    </head>
    <body>
        <p>Look! it's a picture!!!1</p>
        <img
            src="<?php echo $_GET['url']; ?>"
            title="Image" alt="Image"
        />
    </body>
</html>

<html>
    <head>
        <title>Simple Page's Title</title>
    </head>
    <body>
        <p>Look! it's a picture!!!1</p>
        <img
            src="image.jpg"
            title="Image" alt="Image"
        />
    </body>
</html>

<html>
    <head>
        <title>Simple Page's Title</title>
    </head>
    <body>
        <p>Look! it's a picture!!!1</p>
        <img
            src="image.jpg" /><script>alert(1)</script><p
            title="Image" alt="Image"
        />
    </body>
</html>

Our input consists of a URL to an image (this is actually optional when the only point is to exploit the vulnerability) followed by the characters that close the src attribute and the img-element (the " /> part). After the img-element has been closed we are free to insert any html code we want, so I chose a simple javascript messagebox as PoC, again. (I also added a <p after the script-element in order not to break the entire html document; after all, <p title="Image" alt="Image" /> is a valid element in html.)
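
A rough javascript sketch of the same breakout, assuming a one-line version of the img template (renderImage is a made-up name). For brevity the payload here leaves out the trailing <p, so the leftover quote and attributes simply end up as text after the script element.

```javascript
// Stand-in for the second PHP page: the url parameter lands inside a
// double-quoted src attribute (renderImage is a hypothetical name).
function renderImage(url) {
  return `<img src="${url}" title="Image" alt="Image" />`;
}

// The double quote ends the src attribute, " />" ends the img element,
// and everything after that is parsed as ordinary html.
const payload = 'image.jpg" /><script>alert(1)</script>';

console.log(renderImage('image.jpg'));
console.log(renderImage(payload));
```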

XSS Sanitizing

I first have to show you how to protect against these basic XSS attacks before we move on to more obscure variants of XSS vulnerabilities.
Imagine we have a combination of vuln1.php and vuln2.php:

<html>
    <head>
        <title>Simple Page's Title</title>
    </head>
    <body>
        <p>Some text here</p>
        <p>Some more text here.. <?php echo $_GET['text']; ?></p>
        <p>Look! it's a picture!!!1</p>
        <img src="<?php echo $_GET['url']; ?>" title="Image" alt="Image" />
    </body>
</html>

As discussed earlier, we can exploit vuln3.php in two different ways. Instead of exploiting it, we will look at ways to defend against it. There is a small set of ready-made PHP functions that help with sanitizing possible XSS vectors:
htmlentities()^[8], addslashes()^[9] and urlencode()^[10] are commonly used and will help in these examples.
htmlentities() will encode characters such as < and > into their html entity^[11] variants (< becomes &lt; and > becomes &gt;). After sanitizing $_GET['text'] (the first vulnerability) with htmlentities() it is no longer possible to insert any sort of html element, and therefore the vulnerability has been ‘fixed’. We can still insert a string between <p> and </p>, but we cannot create new elements (the < character is ‘escaped’: it is written as &lt; and is therefore not treated as html markup), and we cannot break out of the p-element because, again, the < is escaped.
The second vulnerability is easy to fix as well. Because ‘src’ is not an html event attribute, we just have to make sure that the user-supplied string cannot get out of the surrounding double quotes. For this we will use htmlentities() again. urlencode() would not be correct because the protocol and domain part of the url should *not* be urlencode()’d (note: even though you might think that addslashes() would work, it doesn’t).
Time for some examples:

<html>
<body>
    <p><?php echo htmlentities($_GET['text']); ?></p>
    <img
        src="<?php echo htmlentities($_GET['text']); ?>"
        title="Image" alt="Image"
    />
    <!-- vulnerable! -->
    <img
        src="<?php echo addslashes($_GET['text']); ?>"
        title="Image" alt="Image"
    />
    <!-- wrong! -->
    <img
        src="<?php echo urlencode($_GET['text']); ?>"
        title="Image" alt="Image"
    />
</body>
</html>

<html>
<body>
    <p>image.jpg&quot; /&gt;&lt;script&gt;alert(1)&lt;/script&gt;</p>
    <img
        src="image.jpg&quot; /&gt;&lt;script&gt;alert(1)&lt;/script&gt;"
        title="Image" alt="Image"
    />
    <!-- vulnerable! -->
    <img
        src="image.jpg\" /><script>alert(1)</script>"
        title="Image" alt="Image"
    />
    <!-- wrong! -->
    <img
        src="image.jpg%22+%2F%3E%3Cscript%3Ealert%281%29%3C%2Fscript%3E"
        title="Image" alt="Image"
    />
</body>
</html>

Let’s analyze this sample. The p-element looks fine: it contains some inner text, but definitely no javascript execution. Do note that the browser will show the text as if there were no html entities at all, which is image.jpg" /><script>alert(1)</script> in this case, and that is exactly what we want.
The first img-element, using htmlentities() to encode the ‘src’ attribute, is doing well too: the double quote is replaced by &quot;, so it’s all safe and well.
The next test, however, wasn’t that successful. As you can see, escaping the double quote with a backslash doesn’t work in html attributes, so I was still able to inject my javascript code. Do not use addslashes() for anything but SQLi^[12].
Finally, the last test from this sample. It sure as hell is encoded, but it’s worthless as an image URL: a URL can only have urlencode()’d data in the path/parameters. For example, this would work: http://example.com/image.jpg%22+%2F%3E%3Cscript%3Ealert%281%29%3C%2Fscript%3E.
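
For readers who want to play with the three encodings without a PHP setup, here is a javascript sketch. escapeHtml and addSlashes are my own rough approximations of htmlentities() and addslashes(), not exact ports, and the built-in encodeURIComponent() differs from urlencode() in details (for instance it writes a space as %20 instead of + and leaves parentheses alone).

```javascript
// Approximation of htmlentities(): encode the characters that matter in html.
function escapeHtml(s) {
  return s
    .replace(/&/g, '&amp;') // must come first, or the & of new entities is re-encoded
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Approximation of addslashes(): put a backslash before \ ' and ".
function addSlashes(s) {
  return s.replace(/[\\'"]/g, '\\$&');
}

const payload = 'image.jpg" /><script>alert(1)</script>';

console.log(escapeHtml(payload));         // no raw < > or " left
console.log(addSlashes(payload));         // the " is still there, so it still closes the attribute
console.log(encodeURIComponent(payload)); // safe, but no longer a usable full URL
```

Only the first variant keeps the value both harmless and readable as the original URL once the browser has decoded the attribute.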

HTML Events

Now that we have finished the basics of XSS, we will look at some more specific ‘problems’, starting with HTML Events.
HTML Events are a way for a developer to do some magic when a user interacts with a website. Some simple examples: checking whether a user has filled in all required fields in a form when registering a new account, nice graphics when a user hovers over a menu with the mouse, getting updates for sites such as twitter, etc.
As you can imagine, these events give developers loads of opportunities to make their website fancier, but they also give loads of opportunities to an attacker: each html event might be xss-able.

<html>
    <body>
        <img src="http://example.com/image.jpg" onclick="alert(1)" />
    </body>
</html>

This little example will show a messagebox when the user clicks on the image. Now let’s look at a better example.

<html>
<body>
    <script type="text/javascript">
    function f(username) {
        // process that the user has clicked on the image
    }
    </script>
    <?php $username = htmlentities($_GET['username'], ENT_QUOTES); ?>
    <img
        src="profile/<?php echo $username; ?>/profile.png"
        onclick="javascript:f('<?php echo $username; ?>')"
    />
</body>
</html>

Now this looks all shiny and safe, you’d think. It’s not (note: the ENT_QUOTES parameter tells PHP to html entity encode single quotes as well). HTML event attributes have another nice feature: before the javascript is executed, the attribute value is html-entity decoded!
What does this mean? It means that even though the page source doesn’t directly contain quotes, the executed javascript does. Let’s see how this works out for an attacker..

<html>
<body>
    <script type="text/javascript">
    function f(username) {
        // process that the user has clicked on the image
    }
    </script>
    <img
        src="profile/jbremer/profile.png?&#039;);alert(1)///profile.png"
        onclick="javascript:f('jbremer/profile.png?&#039;);alert(1)//')"
    />
</body>
</html>

So what did I do here? First off, I made sure the image path remained correct. This was done by putting the last part of the path (/profile.png) into the username, followed by a question mark to terminate the path and start the “parameters”.
Secondly, I made sure that we get code execution when the onclick html event is triggered: as stated above, the html entities are decoded before the code is executed, so the javascript that actually runs is:

    f('jbremer/profile.png?');alert(1)//')

It doesn’t take a trained eye to notice that f() is called with the last part of the image path, followed by execution of our code, followed by a comment (to ignore the ') part).
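
The decoding step can be imitated in a few lines of javascript. decodeEntities is a hypothetical helper that only knows the handful of entities used in this article (a real browser knows many more), and the attribute value is the onclick content from above without the javascript: prefix.

```javascript
// Minimal decoder for the few entities used in this article.
function decodeEntities(s) {
  return s
    .replace(/&#0*39;/g, "'")
    .replace(/&quot;/g, '"')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&amp;/g, '&'); // last, so that &amp;lt; is decoded only once
}

// What htmlentities(..., ENT_QUOTES) wrote into the onclick attribute:
const attribute = "f('jbremer/profile.png?&#039;);alert(1)//')";

// What the javascript engine gets to see after the browser decoded it:
console.log(decodeEntities(attribute));
// f('jbremer/profile.png?');alert(1)//')
```
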
Before we continue with XSS Sanitizing #2 (HTML Events) there is one more feature, related to URLs. URLs are encoded differently, and I’m sure you have seen it many times before.
All characters except alphanumerics (all digits and lower- and uppercase letters of the alphabet) and a few more are encoded as a % token followed by the hexadecimal representation of the character.
For example, the character < has the ascii value 60, which is 0x3c in hexadecimal, so its representation in a URL is %3c. There are two attributes that are url-decoded: href and src (of an anchor and an image element respectively).
Note that the attribute is html entity decoded first and url-decoded after that. For example, <a href="javascript:alert(%22hello&quot;)">hello</a> turns into executing javascript:alert(%22hello") when the anchor is clicked, which is url-decoded into alert("hello") (the hexadecimal value of " is 0x22) and triggers an alert box in javascript.
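
Both steps are easy to check in javascript. percentEncodeChar is a made-up helper for single ascii characters; decodeURIComponent() is the built-in function and stands in here for the url-decoding the browser performs.

```javascript
// Percent-encoding by hand: '%' followed by the character's hexadecimal code.
function percentEncodeChar(ch) {
  return '%' + ch.charCodeAt(0).toString(16).padStart(2, '0');
}

console.log(percentEncodeChar('<')); // %3c
console.log(percentEncodeChar('"')); // %22

// Step 1 (html entity decoding) has already turned &quot; into ", leaving:
const afterEntityDecoding = 'javascript:alert(%22hello")';

// Step 2: url-decoding turns the remaining %22 into " as well.
console.log(decodeURIComponent(afterEntityDecoding));
// javascript:alert("hello")
```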

XSS Sanitizing #2 (HTML Events)

DOM Injection:

HTML Events can be used to trigger javascript, some examples:

<!-- Shows a messagebox when we click the Image -->
<img src="image.jpg" onclick="alert(1)" />

<!-- Shows a messagebox when we click on 'Link' -->
<a href="javascript:alert(1)">Link</a>

Sometimes it might be interesting to use user-supplied input in these html events, some examples:

<!-- Display image of user "jbremer" and give more information when the image is clicked -->
<img src="image.jpg" onclick="alert('User is: ' + 'jbremer')" />

<!-- Show a link to users' page -->
<a href="jbremer.html">Goto users' page</a>

Exploiting these examples is easily done:

<!-- Run our code using the username: jbremer'); alert(document.cookie);// -->
<img src="image.jpg" onclick="alert('User is: ' + 'jbremer'); alert(document.cookie);//')" />

<!-- Run our code using the username: javascript:alert(document.cookie);// -->
<a href="javascript:alert(document.cookie);//.html">Goto users' page</a>

Someone unfamiliar with the problem will think that this can be fixed using htmlentities()/htmlspecialchars() [with ENT_QUOTES set], which results in:

<!-- htmlentities($str, ENT_QUOTES) / htmlspecialchars($str, ENT_QUOTES) -->
<img src="image.jpg" onclick="alert('User is: ' + 'jbremer&#039;); alert(document.cookie);//')" />
<a href="javascript:alert(document.cookie);//.html">Goto users' page</a>

Unfortunately not. When the HTML Event is triggered, the string will first be html-entity decoded. So entities such as &#039;, &#39; and &quot; will be decoded, rendering such a ‘defense’ useless.

When javascript is inside an href or src attribute there is another attack vector: the string is also url-decoded, rendering urlencode() useless as well:

<!-- urlencode('alert(document.cookie);//') -->
<a href="javascript:alert%28document.cookie%29%3B%2F%2F.html">Goto users' page</a>

How to protect against this attack?
As we have seen, in HTML event attributes one escape is not enough, so the solution is simple: two escapes.
This is not the only solution though; in this case it is enough to make sure the user-supplied input cannot break out of the single-quoted string.
Some examples:

<!-- htmlentities(htmlentities($str, ENT_QUOTES), ENT_QUOTES) -->
<img src="image.jpg" onclick="alert('User is: ' + 'jbremer&amp;#039;); alert(document.cookie);//')" />

<!-- str_replace("'", "\\'", $str) -->
<img src="image.jpg" onclick="alert('User is: ' + 'jbremer\'); alert(document.cookie);//')" />
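
The second idea can be sketched in javascript as escaping once per context: first for the single-quoted javascript string, then for the html attribute around it. This is my own combination of the two approaches and not the exact PHP from above; all three function names are made up. I also escape the backslash itself, because otherwise an input ending in \ would cancel out the backslash we add.

```javascript
// Inner context: a single-quoted javascript string.
function escapeJsString(s) {
  return s.replace(/\\/g, '\\\\').replace(/'/g, "\\'");
}

// Outer context: a double-quoted html attribute.
function escapeHtml(s) {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#039;');
}

function renderOnclick(username) {
  const safe = escapeHtml(escapeJsString(username));
  return `<img src="image.jpg" onclick="alert('User is: ' + '${safe}')" />`;
}

console.log(renderOnclick("jbremer'); alert(document.cookie);//"));
```

After the browser has entity-decoded the attribute, the quote is still preceded by a backslash, so javascript sees the whole username as one string and our alert(document.cookie) is never executed. The sketch only deals with quotes and backslashes; other characters (newlines, for example) would need their own handling.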

Vulnerable PHP Variables:

References:

Development & Security

By Jurriaan Bremer @skier_t