Monday, March 2, 2015

JS Obfuscation - The Nuts and Bolts

Whether you're a professional WWW hacker or just an average WWW kiddie like me, JavaScript (JS) will definitely play a huge role in your line of work. JS is what gives life to almost anything on a web page—save for some fancy CSS, a web page without JS would be plain lifeless.

The great thing about JS is that it's browser-interpreted, so the source code is already in your browser, ready for examination and sometimes even modification. This may sound cool to the client-side hacker but it's a nuisance for the server-side programmer and security team; a part of the application logic is getting leaked to the client side, all ready for exploitation!

As a result, many websites now use obfuscation techniques to scramble their JS sources. JS programmers use the concept of metaprogramming (creating a program that mutates itself to become something else) to devise elaborate tricks for hiding their logic inside cryptic, uncommented, space-stripped blocks of spaghetti code. The browser doesn't give a damn, however, and gracefully executes this spaghetti metaprogram, which eventually does the entire job as nicely as the original code (perhaps at the cost of a few extra milliseconds, but who cares?).
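
To make the metaprogramming idea a bit more concrete, here is a minimal sketch of a program that builds another program at run time. The reversed string and the variable names are my own invention for illustration and are not taken from any real site; the only built-in used is the standard Function constructor.

```javascript
// Hypothetical example: the "real" logic exists only as a reversed string.
var scrambled = ';b + a nruter';

// Undo the scrambling at run time to recover the function body text.
var body = scrambled.split('').reverse().join('');

// The Function constructor compiles the recovered text into a callable.
var add = new Function('a', 'b', body);

// The generated function behaves like a hand-written one.
var sum = add(2, 3);
```

Nobody skimming the source sees a readable addition anywhere, yet the engine ends up running perfectly ordinary code.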

These are some of the obfuscation techniques I have come across while examining obfuscated JS:

  • eval()
  • This native JS function executes whatever string is passed to it on the JS engine, just as if it were regular inline JS code. For example, eval(['a','l','e','r','t','(','"','H','e','l','l','o','"',')'].join('')) will display an alert saying "Hello" in the browser. This is often the core trick of more advanced techniques.

  • Packer, and similar code compacting tools
  • /packer/ is a service that can convert an arbitrary JS fragment to a "packer" function, often with the signature function(p,a,c,k,e,r), which, when eval()'d, produces the same result as the original fragment. To the naked eye, the inside of the function is a total mess: a mixture of math operations, concatenations, arrays of keyword and function names, nested eval()s, and more. There are other similar converters as well, doing the same thing with slightly different algorithms. Many PTC (paid-to-click) websites, including Neobux and the late Probux, use this technique to shield their ad view and validation logic from prying eyes (but not from mine ;-)).

  • Token replacement
  • A list of textual tokens of 2-3 characters each, often with cryptic names, is declared. Subsequent code combines these tokens in various ways to look up and invoke the desired operations. For example,

    var a2x = 'do'
    //...
    var n9p = 'ran'
    //...
    var bb7 = 'm'
    //...
    // 'ran' + 'do' + 'm' spells 'random', so this calls Math.random()
    alert(Math[n9p + a2x + bb7]())
    will produce an alert with a random number. I have observed this technique extensively in AdFly.

  • Plain compaction
  • To be honest, this is not obfuscation. However, given how many complex syntactic tricks JS allows, removing whitespace alone can make JS code highly unreadable and cryptic; yet it's just an illusion devised to frighten away inexperienced script kiddies.
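
To give a rough feel for how a packer-style tool from the list above might work, here is a toy sketch. It is emphatically not the real /packer/ algorithm: pack() and unpack() are hypothetical helpers I made up, and the scheme (swap every word for a base-36 index into a keyword table) is only an assumption about the general idea.

```javascript
// Toy illustration only: NOT the real /packer/ algorithm.
// pack() replaces each distinct word in the source with a base-36 index
// into a keyword table, leaving punctuation and spacing untouched.
function pack(source) {
  var keywords = [];
  var payload = source.replace(/\b\w+\b/g, function (word) {
    var index = keywords.indexOf(word);
    if (index === -1) {
      index = keywords.push(word) - 1; // push() returns the new length
    }
    return index.toString(36);
  });
  return { payload: payload, keywords: keywords };
}

// unpack() reverses the substitution by looking each index up again.
function unpack(packed) {
  return packed.payload.replace(/\b\w+\b/g, function (token) {
    return packed.keywords[parseInt(token, 36)];
  });
}

var original = 'var total = 6 * 7; total + 1';
var packed = pack(original);     // payload no longer contains readable names
var restored = unpack(packed);   // identical text to `original`
var result = eval(restored);     // runs the restored source
```

The payload on its own is meaningless without the keyword table, which is roughly why a packed script looks like line noise until something puts the pieces back together and hands them to eval().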

Of course, this list is not exhaustive; given the flexibility of JS, there are infinitely many ways of obfuscating the same piece of code. Nevertheless, all of them are based on the same fundamentals (confusion, illusion and eval()), as they are meant for the man, not for the beast machine.
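
Since the machine has to be handed plain source sooner or later, one way to see through eval()-based tricks is to listen in on that handover. The following is only a sketch under my own assumptions: captureEvalSources is a hypothetical helper name, it relies on globalThis being available, and it only catches code that reaches the global eval() by name.

```javascript
// Sketch: record every string handed to eval() while `run` executes.
// captureEvalSources is a hypothetical helper name, not a standard API.
function captureEvalSources(run) {
  var captured = [];
  var originalEval = globalThis.eval;

  // Any call written as eval(...) now reaches this wrapper first.
  globalThis.eval = function (code) {
    captured.push(String(code));
    return originalEval(code); // indirect eval: runs in the global scope
  };

  try {
    run();
  } finally {
    globalThis.eval = originalEval; // always restore the real eval
  }
  return captured;
}

// A small obfuscated sample in the style of the eval() item in the list.
var sources = captureEvalSources(function () {
  eval(['1', ' ', '+', ' ', '2'].join(''));
});
// sources[0] now holds the plain text the engine was asked to run.
```

Because the wrapper calls the original eval() indirectly, the captured code runs in the global scope rather than the caller's, so this is a tool for peeking at hidden source and not a drop-in replacement.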
