An intro to GZIP and compression

GZIP is one of those big words that gets used in programming as shorthand for compression in general, although strictly speaking it is one specific compression format. Then again, you end up asking what “compression” is in the context of programming and computers. These two terms can be a little intimidating to the noob, and until you’ve implemented compression or experienced it first hand, you won’t really know its true nature and the fundamental idea behind it. So let’s do that.

These two terms come up frequently during page speed optimization, in tools like GTmetrix and PageSpeed Insights. If you’re running into page speed issues or a really slow loading page, these tools will recommend compressing your files. Alright, so what does that mean?

A lot, actually. But in simple terms:

Compression is the process of encoding information using fewer bits. — Ilya Grigorik, Google

Compression has a lot to do with how our data is passed through the wire or through waves. A little bit of physics knowledge can help you understand this, actually. Think of how information is sent and what “state” it is in once displayed. Behind the scenes are 1’s and 0’s, and the fewer of them we have to send, the faster the transfer. Those bits can be encoded more efficiently: the more order (repetition) there is in the data, the fewer bits we need, and the more disorder there is, the less there is to squeeze out.

I mentioned “disorder” here because it relates to “entropy”. In compression this means information entropy, a measure borrowed from physics of how unpredictable the data is, and it sets a limit on how far data can be compressed without losing anything. I couldn’t find much on the internet that explains the subject of compression well enough, aside from Wikipedia. This guy explains it pretty well – I really like how he makes mention of “Entropy”.
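To make “entropy” a bit more concrete: in the information-theory sense it can be measured as the average number of bits needed per character. Here is a rough sketch in Python (my choice of language for illustration; the function name shannon_entropy is made up for this example). Treat the number as a rough guide only, since real compressors like gzip also take advantage of repeated strings, not just how often each character shows up.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Average information per character, in bits (Shannon entropy)."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    # Sum p * log2(1/p) over every distinct character in the text.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Very ordered text: one character repeated, nothing unpredictable about it.
print(shannon_entropy("aaaaaaaa"))

# More disorder: more distinct characters, more bits needed per character.
print(shannon_entropy("To be or not to be, that is the question"))
```

The lower the number, the more predictable the text is, and the more room there is to shrink it.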

So you start with some text, something like this:

“To be or not to be, that is the question”

Using compression, this text can be encoded (not encrypted, since anyone can decode it again) into something like this:

“sdfsarr234234jcvwrwerrweklrj;lkwjer;lkwje;rlkjw”

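Yeah, I know that looks cryptic, but that’s just a side effect; the real point is to end up with fewer bytes. The string above is only a made-up illustration, and real compressed output is binary data rather than readable characters. Don’t worry, that’s what it looks like while it’s being passed through the wire or wirelessly. Once it gets to the browser client, that sentence is “decompressed” and restored exactly to the original. Everything is all good and gravy.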
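If you want to see the round trip for yourself, here is a minimal sketch using the gzip module from Python’s standard library (Python is just my pick for the demo, and round_trip is a made-up helper name). One caveat: a sentence this short can actually come out bigger after gzip, because the format adds its own header and trailer, so the sketch also compresses the sentence repeated 100 times, where there is plenty of repetition to remove.

```python
import gzip

sentence = "To be or not to be, that is the question"


def round_trip(text: str) -> tuple[int, int, str]:
    """Compress text with gzip, then decompress it again.

    Returns (original size in bytes, compressed size in bytes, restored text).
    """
    raw = text.encode("utf-8")
    compressed = gzip.compress(raw)
    restored = gzip.decompress(compressed).decode("utf-8")
    return len(raw), len(compressed), restored


# A single short sentence: gzip's fixed overhead matters a lot here.
print(round_trip(sentence)[:2])

# The same sentence repeated 100 times: lots of redundancy to remove.
print(round_trip(sentence * 100)[:2])
```

In both cases the restored text is identical to what went in, which is what makes this kind of compression safe for HTML, CSS and JavaScript.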

Note that all this magic doesn’t happen by default – some implementation is involved on your part, as the programmer.

Below is a common implementation in an Apache environment, within the .htaccess file. It relies on Apache’s mod_deflate module.


<IfModule mod_deflate.c>
  # Compress HTML, CSS, JavaScript, Text, XML and fonts
  AddOutputFilterByType DEFLATE application/javascript
  AddOutputFilterByType DEFLATE application/rss+xml
  AddOutputFilterByType DEFLATE application/vnd.ms-fontobject
  AddOutputFilterByType DEFLATE application/x-font
  AddOutputFilterByType DEFLATE application/x-font-opentype
  AddOutputFilterByType DEFLATE application/x-font-otf
  AddOutputFilterByType DEFLATE application/x-font-truetype
  AddOutputFilterByType DEFLATE application/x-font-ttf
  AddOutputFilterByType DEFLATE application/x-javascript
  AddOutputFilterByType DEFLATE application/xhtml+xml
  AddOutputFilterByType DEFLATE application/xml
  AddOutputFilterByType DEFLATE font/opentype
  AddOutputFilterByType DEFLATE font/otf
  AddOutputFilterByType DEFLATE font/ttf
  AddOutputFilterByType DEFLATE image/svg+xml
  AddOutputFilterByType DEFLATE image/x-icon
  AddOutputFilterByType DEFLATE text/css
  AddOutputFilterByType DEFLATE text/html
  AddOutputFilterByType DEFLATE text/javascript
  AddOutputFilterByType DEFLATE text/plain
  AddOutputFilterByType DEFLATE text/xml

  # Remove browser bugs (only needed for really old browsers)
  BrowserMatch ^Mozilla/4 gzip-only-text/html
  BrowserMatch ^Mozilla/4\.0[678] no-gzip
  BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
  Header append Vary User-Agent
</IfModule>

Let’s use this line for example:

  AddOutputFilterByType DEFLATE application/javascript

Understanding the above should help you understand the rest of the lines, since the same concept applies to them all. All we’re doing here is applying compression to all JavaScript files; this is what “application/javascript” means. That value is a MIME type (the Content-Type the server sends along with the file), and each kind of file has a specific one, so we can’t just use arbitrary names here.

For instance, if we want to compress HTML files we use “text/html” as the type.
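To picture the decision being made for each file, here is a simplified sketch in Python. This is not how mod_deflate is actually implemented; COMPRESSIBLE_TYPES and should_compress are hypothetical names, and the list only mirrors a handful of the types from the .htaccess block. The idea is that a response gets compressed only when its type is on the list and the browser has said, through its Accept-Encoding request header, that it understands gzip.

```python
# Hypothetical allow-list, mirroring a few of the AddOutputFilterByType lines.
COMPRESSIBLE_TYPES = {
    "application/javascript",
    "image/svg+xml",
    "text/css",
    "text/html",
    "text/plain",
}


def should_compress(content_type: str, accept_encoding: str) -> bool:
    """Return True when the response type is on the allow-list and the
    client's Accept-Encoding header lists gzip."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    mime = content_type.split(";")[0].strip().lower()
    # Simplification: quality values such as "gzip;q=0" are ignored.
    encodings = {
        part.split(";")[0].strip().lower()
        for part in accept_encoding.split(",")
    }
    return mime in COMPRESSIBLE_TYPES and "gzip" in encodings


print(should_compress("text/html; charset=utf-8", "gzip, deflate, br"))
print(should_compress("image/png", "gzip, deflate, br"))
```

Types like image/png are left off the list on purpose in this sketch: those formats are already compressed, so running them through gzip again gains very little.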

Yes, I know this is a lot to digest, but gaining a basic understanding of compression is a good start. Just know that compression has to do with encoding information using fewer bits, and then restoring it to its original, human readable form using decompression.