arei.net
Minification

There has been a great deal of discussion lately about the value of minification, yet very little concrete value. To that end I have decided to set down my thoughts from these discussions and share them such that everyone can be on the same page. Below I will outline what minification is and what it gains you. This is followed by discussion of minification versus HTTP Compression. Next, I will look at concept of minification as obfuscation and the usefulness of obfuscating in general. Finally, I will share my recommendation on the subject of Minification and which tools I would recommend.

MINIFICATION

To start, lets define what minification is: Minify or Minification is the process of taking some source file and compressing it to a smaller size by removing whitespace and comments. A number of other small rules are sometimes applied as part of the minify process, but these are less significant. For example, Minify will shorten variable names to two or three characters and thus gain some space there.

Consider this simple function which is not minified:

for (var i = 0; i <; 100 ; i++)
{
    var randomnumber = Math.floor(Math.random() * i);
    document.write("A random number between 0 and " + i +
                   " is " + randomnumber);
}

When minification is applied this function becomes thus:

for(var i=0;i<;100;i++){var randomnumber =Math.floor(Math.random()*L);document.write("A
random number between 0 and "+i+" is "+randomnumber);}

(Please note that any line breaks you see in the minified code above are due to word wrapping in your mail reader. Minified code is generally always a single line.)

In essence, minification reduces overall code size, and thus reduces the clients download time. The downside of minification is that by removing whitespace and comments and renaming some variables, the code becomes unreadable, extremely hard to debug, and may introduce unforeseen and hard to find bugs.

When we are talking about Web applications both Javascript (JS) and CSS files can be minified.

In a small test case I put two files through the minification process to see what the net gain would be. The Javascript file is of small size with few comments. The CSS is a large CSS file with few comments.

FILE SIZE                                JS                    CSS             
Bytes                                    17954                 39020
Minified Bytes                           10413                 31178
Net Gain                                 42%                   20%

As can be seen minification is much more useful to Javascript where whitespace is used much more. CSS tends to have less whitespace and thus it's gains are lesser. Also, gains will be much larger if your code is heavily commented. Since comments are removed, minification gains are directly related to the amount of comments.

So, a 42% gain seems like a lot when you are talking about really slow network connections. Yet the question is, are those gains really worth the sacrifice in the ability to debug your code? And if one does not want to sacrifice the ability to debug and read the code, what can be done instead of minification?

HTTP COMPRESSION

HTTP Compression is a standard part of the HTTP 1.1 protocol that is in use in almost all major browsers and web servers on the market today. In essence whenever an HTTP request is made by a browser, if that browser supports HTTP Compression a special parameter is added to the outgoing request that lists the different compression algorithms that the browser can uncompress. When a Web Server sees this special parameter, it checks to see if the requested file can be compressed and if the server has one of the compression algorithms that the browser has said it can use. If all this is true, the server will run the compression algorithm over the response before it is sent back to the browser. The browser receives the compressed response and make it uncompressed before making it available.

HTTP Compression has many advantages and few disadvantages. Foremost, HTTP Compression is an entirely automated process that is handled by the server and requires only a minor configuration change on most Web Servers to enable. The Browser requires no special functionality to use compression. The disadvantage of compression is that there is a minor performance hit to both the server side and the client side. The server side must spend time compressing the response. The client side must spend time uncompressing the response. Yet, in most cases the file sizes involved make the time spent in compression minimal. If files sizes were significantly larger, compression might begin to cause performance problems.

Let us consider our test files again. This time let us look at our net gain from using HTTP compression only.

FILE SIZE                                JS                    CSS             
Bytes                                    17954                 39020
Compressed Bytes                         3906                  6708
Net Gain                                 78%                   82%

Right away its apparent that HTTP compression offers significant gains. When compared to our previous results from minification, HTTP Compression is the clear favorite. The reason for these significant gains that the HTTP Compression works across all the bytes of the source files whereas Minification largely leaves the meat of the CSS and JS files alone and works mostly on the whitespace of those files.

Given the gains of HTTP Compression the next logical step is to ask, what if I did both minification and HTTP Compression. The result is even greater improvements, but not as much as one might imagine. Consider our test files again:

FILE SIZE                                JS                    CSS             
Bytes                                    17954                 39020
Compressed & Minified                    2514                  5907
Net Gain                                 86%                   85%

Overall when employing both techniques, the gain of minification is not as significant. For our JS file there is only an 8% gain over just plain HTTP Compression. There is even less gain when looking at the CSS file. However, please remember that both the JS and CSS files in our test are very light on comments. More comments will increase the gains somewhat, but less than one might think.

Given these results, the question to be asking is does the sacrifice of being able to debug and read my Javascript and CSS outweigh my need to gain an additional 8% over just using HTTP Compression.

So maybe Minification is not the way to go after all, but what about Minification as a means of protecting my code from people whom want to steal my hard work?

OBFUSCATION

In the world of interpretive computer languages such as Java, C#, Javascript, PHP, Groovy, etc there has long been the question of how one can prevent someone from reading or stealing their code. The more openly accessible the language, the easier it is to read and copy it.

Obfuscation, is the process of applying a series of heuristics to code that will make it harder to read and more confusing to understand. The intent is that by applying these rules, the code becomes such an confusing mess that nobody would bother to read the code.

Java Obfuscation is a fairly complex process. There are hundreds of heuristics that are applied to the incoming source code. The resulting end code is a nightmare of crazy classes, method and members names that make scanning the code painful.

Obfuscation for Javascript is much less powerful. Because Javascript is meant to be a widely open language, the ability to obfuscate the code is minimal. Variables, objects and methods in Javascript have no scope privacy and thus can be accessed by anyone and are often designed that way. This means that their names cannot be obfuscated because the names have relevance to the structure. You cannot change the name of a given method, because in many cases there is no way to tell whom has to call that method or where in the code that might happen.

Yet the biggest problem of all with Obfuscation is that it is almost completely worthless. Obfuscation will only stop the most basic of people from copying your code. Extracting obfuscated code is fairly simple in most interpreted languages. With many of today's sophisticated IDEs reformatting and stepping through code is fairly simple. Additionally, many IDEs support variable name matching such that when an obfuscated variable name is selected all of the matching uses are highlighted. All of this means taking obfuscated Javascript back to readable code is far easier than one might suspect.

Another big strike against obfuscation is that Javascript can be employed in very diverse and powerful ways. Closures, functions as objects, objects as associative arrays, and the like, can all lead up to some fairly complex Javascript. Often times this kind of programming can confuse an obfuscator that is not designed to handle the diversity of the language. Even when writing this email I managed to break one obfuscator with just the sample for loop code from above.

Obfuscation, at its best, serves to help keep honest people honest. It is not going to stop someone whose intent is to copy your code, it's not even going to put up a decent struggle.

Minification, while not technically an obfuscator, does provide some obfuscation like processes. The removal of whitespace and the collapsing of localized variables all makes the code harder to read. Yet, not that much is really changed otherwise. Consider our sample code block that we looked at earlier. With minification obfuscation, it is still fairly readable:

for(var L=0;L<;100;L++){var dS0=Math.floor(Math.random()*L);document.write("A random number between 0 and "+L+" is "+dS0);}

Given this code and about 2 minutes on the internet or a few search and replace calls and you can turn it back into fairly readable code as shown here:

for (var L = 0; L <; 100; L++) {
    var dS0 = Math.floor(Math.random() * L);
    document.write("A random number between 0 and" + L + "is" + dS0);
}

It's not a perfect match of our original code, but it is fairly close and takes little real work to achieve.

Ultimately, Obfuscation can have some utility in giving you a very basic first line of defense, but the ability to debug and read your code is reduced and in some instances obfuscation can add to code size, albeit a very small amount. These limitations must be carefully considered before undertaking obfuscation. What are you really trying to do when you obfuscate your code?

Unfortunately, there is no way to really protect your Javascript code on the internet. The design of HTML, Javascript, and CSS is all meant to be plain text readable formats and this means that absolutely nothing stands between the site code and the end user. The only real protection you have from people reading and copying your code is in how much the worry about your legal recourses.

CONCLUSIONS

So where does this leave us with regards to Minification?

Well, we showed that from a size compression stand point, minification has some small value, but not nearly as much value as just turning on HTTP Compression. From an obfuscation standpoint we outlined the fundamental weakness of obfuscation in general. Given these factors my recommendation is for careful consideration of whether or not minification is actually needed in the particular case you are making.

I would ask myself these questions:

1). How important is the ability to read and debug my code?

2). How much time and resources do I wish to invest in tracking down bugs in minified code?

3). How relevant is the commenting in my code to the ability to debug and read my code?

4). With Obfuscation, whom am I trying to protect my code from and how competent are they?

Ideally, I think the best application is to do no Obfuscation, HTTP Compression, and a reduced form of Minification in which only comments are removed from the code. This provides you with high compression and still maintains the ability to read and debug your code. In my opinion obfuscation really is not worth the effort to employ it for the benefit you receive from it.

CONFIGURATION NOTES

HTTP Compression can be turned on in either Tomcat of Apache Web Server with just a little bit of effort. Tomcat requires a mere 3 lines of code to get HTTP Compression working. Apache HTTP Server requires a few more lines, but it is still relatively easy. For more information about configuring tomcat see http://viralpatel.net/blogs/2008/11/enable-gzip-compression-in-tomcat.html. For more information about configuring Apache Web Server see http://httpd.apache.org/docs/2.0/mod/mod_deflate.html.

TOOLS NOTES

Finally, I will plug the YUI Compressor. There are a lot of Minification programs out there, but YUI Compressor seems to allow for the most configurability. Also, I liked YUI Compressors ability to minify both CSS and JS files within one simple program. You can find out more about YUI Compressor at http://developer.yahoo.com/yui/compressor/

permalink: http://www.arei.net/archives/97