Randomizd | Random Thoughts Serialized: securing js code

Despite how cryptic JS obfuscation may sound, deobfuscating JS is only slightly more difficult than the obfuscating part.

Decompaction

Since this is exactly the job of a beautifier, any popular JS beautifier can get a whitespace-trimmed, cryptic JS one-liner back in good shape in a split second. Online Javascript Beautifier is my favourite, as it can process entire webpages (HTML+CSS+JS) as well, and even supports some JS deobfuscation features.

Packer

Ironically enough, packer itself has a deobfuscation feature. Although it's disabled by default, you can easily enable it by stripping the disabled attribute off the lower textarea control, using the Inspect Element feature available on web browsers. After this hack, you can paste the packed code into the lower textarea and use the Decode button to get it de-obfuscated.

Token replacement

I haven't yet come across a straightforward way to defend this trick, although there's most probably an extremely simple way. Yet, the JS consoles of most web browsers can help you what each set of tokens would concatenate into. All I can say is that it's just dead painful to copy and paste each set of tokens to see what they mean, so some regex and text manipulation magic may often come in handy.

Experience: It's the Key!

As with anything else, when it comes to deciphering a JS puzzle, what matters most is prior experience and, more importantly, attention to the tiniest levels of detail. Taking a 'snapshot' of the current JS context of the browser (window) can sometimes come in handy. The following script can do the trick, when run in the browser console:

var keys = Object.keys(window);
var backup = {}
for(i in keys) {
	var key = keys[i];
	backup[key] = window[key];
}

If you want to compare this snapshot with a later state of the context, the following script will help:

var keys = Object.keys(window);
for(i in keys) {
	var key = keys[i];
	if(backup[key] != window[key])
		console.log(key + ': ' + backup[key] + ' -> ' + window[key]);
}

Examining HTTP requests sent out by the web page can also provide hints regarding what might be going on inside the page, behind the curtain of obfuscation. Most modern browsers provide a network request tab in their developer consoles, while there are popular analyzer plug-ins like FireBug as well.

Circumvention

Although JS is nearly taken for granted in modern web apps, some old (or more "compatible") pages still contain <noscript> tags to tackle non-JS environments. No matter how complicated the corresponding JS may look, the <noscript> tag always has to use plain old HTML (and CSS) to accomplish the same goal. An excellent example is the CAPTCHA component used on many pages; the cryptic JS fragment used to dynamically load and validate the CAPTCHA is often accompanied by a plain <noscript> tag that simply contains the URL of a CAPTCHA image (usually in an <img> tag) and a form that simply POSTs the user input to the solution URL.

Situation-specific Strategies (ඌරගෙ මාලු උෟරගෙ ඇඟේ තියල කැපීම)

Once I came across a page (on a popular PTC site) that had gone to such lengths as to dynamically generate a JS fragment by concatenating an array of numbers (ASCII codes), which then sends another HTTP request for a different obfuscated script file, and evaluates the resulting content to obtain the actual JS source, which then gets evaluated yet again to bring about the desired functionality. I simply copied the initial script and ran it 'as-is', removing only the outermost (last) eval(), to obtain the final JS source that first appeared to be impossible to intercept. :) Sometimes, all that is required to defeat JS obfuscation is a little bit of strategy.

Please note that some keywords used in this article may not correspond to their actual technical meanings, as most of the scenarios are my personal experiences and not validated against standard 'hacking' literature.

Whether you're a professional WWW hacker or just an average WWW kiddie like me, JavaScript (JS) will definitely play a huge role in your line of work. JS is what gives life to almost anything on a web page—save for some fancy CSS, a web page without JS would be plain lifeless.

The great thing about JS is that it's browser-interpreted, so the source code is already in your browser, ready for examination and sometimes even modification. This may sound cool to the client-side hacker but it's a nuisance for the server-side programmer and security team; a part of the application logic is getting leaked to the client side, all ready for exploitation!

As a result, many websites now use obfuscation techniques to scramble their JS sources. JS programmers use the concept of metaprogramming—creating a program that mutates itself to become something else—to devise elaborate tricks for hiding their logic inside cryptic, uncommented, space-stripped blocks of sphagetti code. The browser doesn't give a damn, however, and gracefully executes this sphagetti metaprogram which eventually does the entire job as nicely as the original code (perhaps at the cost of a few extra milliseconds, but who cares?).

These are some of the obtuscation 'techniques', which I have come across, while examining obfuscated JS:

eval()

This native JS function executes whatever string passed to it, on the JS engine, just as if it were regular inline JS code. eval(['a','l','e','r','t','(','"','H','e','l','l','o','"',')'].join('')) will display an alert "Hello" on the browser. This is often the core trick of more advanced techniques.

Packer, and similar code compacting tools

/packer/ is a service that can convert an arbitrary JS fragment to a "packer" function, often with the signature function(p,a,c,k,e,r) which, when eval()'d, produces the same result as the original fragment. Inside the function is a total mess for the naked eye, with a mixture of math operations, concatenations, arrays of keyword and function names, nested eval()s, and more. There are other similar converters as well, doing the same thing with slightly different algorithms. Many PTC websites including Neobux and the late Probux use this technique to shield their ad view and validation logic from prying eyes (but not from mine ;-)).

Token replacement

A list of textual tokens of 2-3 characters each, often having cryptic names, are declared. Subsequent code uses various combinations of these tokens to invoke desired operations. For example,

a2x = 'do'
//...
n9p = 'ran'
//...
bb7 = 'm'
//...
alert(Math[n9p + a2x + bb7]())

AdFly

Plain compaction

To be honest, this is not obfuscation. However, given the possibility of complex syntactic tricks of JS, removing whitespace can make JS code highly unreadable and cryptic; yet it's just an illusion devised to frighten away inexperienced script kiddies.

Of course, this list is not exhausive; given the flexibility of JS, there are infinitely many possibilities of obfuscating the same piece of code. Nevertheless, all of them are based on the fundamentals—confusion, illusion and eval()—as they are meant for the man, not for the beast machine.

Sunday, March 15, 2015

JS Obfuscation (Part 2) - Countermeasures

Monday, March 2, 2015

JS Obfuscation - The Nuts and Bolts