HTMLMinifier in use on this blog now

07 August 2018   3 comments   Web development, Javascript, Web Performance

Last week I enabled HTMLMinifier as a post-build step for server-rendered content here on this blog. Basically, after a page is rendered in Django, it's sent to a Celery queue that does things to the index.html file. The first thing it does is extract the stylesheets and replace them with a block of inline CSS. More details in this blog post. The second thing the background job does is send the index.html file to node_modules/.bin/html-minifier. See the code here.

What that does is remove quotation marks where they're not needed (e.g. <div id=foo> instead of <div id="foo">), remove HTML comments, and lastly remove whitespace that is not needed. The result is that the HTML now looks like this:

View source
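
To make the minification step concrete, here is a minimal sketch of how a background job might shell out to html-minifier from Python. It is not the code this blog actually runs. The function names and the temporary `.min.html` file are made up for illustration, and the three flags are my assumption about which html-minifier options match the behavior described above (attribute quotes, comments, whitespace).

```python
import subprocess
from pathlib import Path
from typing import List

# Path to the CLI as mentioned in the post (relative to the project root).
HTML_MINIFIER = Path("node_modules/.bin/html-minifier")


def build_minify_command(source: Path, destination: Path) -> List[str]:
    """Build the html-minifier command line (hypothetical helper).

    The flags are an assumption: they are the html-minifier options that
    correspond to the three behaviors the post describes.
    """
    return [
        str(HTML_MINIFIER),
        "--remove-attribute-quotes",  # <div id="foo"> becomes <div id=foo>
        "--remove-comments",          # drop <!-- ... -->
        "--collapse-whitespace",      # drop whitespace that isn't needed
        "-o",
        str(destination),
        str(source),
    ]


def minify_html_file(source: Path) -> Path:
    """Minify `source` in place by writing to a sibling file first."""
    destination = source.with_suffix(".min.html")
    # check=True raises CalledProcessError if the minifier exits non-zero,
    # so a failed run never overwrites the original file.
    subprocess.run(
        build_minify_command(source, destination), check=True, timeout=30
    )
    destination.replace(source)
    return source
```

Writing to a sibling file and only then replacing the original means a crash or timeout in the Node process leaves the un-minified index.html untouched.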

I also added a line of logging that spits out the size of the HTML before, before with gzip, after, and after with gzip. Why? Because the gain from HTML minification is usually insignificant once you gzip. See this blog post about how insignificant space optimization is in comparison to gzip. Look at the sample log lines:

...
Minified before: 38,249 bytes (11,150 gzipped), After: 36,098 bytes (10,875 gzipped), Shaving 2,151 bytes (275 gzipped)
Minified before: 37,698 bytes (10,534 gzipped), After: 35,622 bytes (10,243 gzipped), Shaving 2,076 bytes (291 gzipped)
Minified before: 58,846 bytes (14,623 gzipped), After: 55,540 bytes (14,313 gzipped), Shaving 3,306 bytes (310 gzipped)
...
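
A log line in that shape could be produced with something like the sketch below. The function name is hypothetical, and the real logging code may measure gzip differently (compression level, for instance), so treat it as an illustration of the measurement rather than the exact implementation.

```python
import gzip


def describe_savings(before: bytes, after: bytes) -> str:
    """Format raw and gzipped sizes before and after minification."""
    # gzip.compress uses the default (maximum) compression level here;
    # a web server may well be configured with a different level.
    before_gz = len(gzip.compress(before))
    after_gz = len(gzip.compress(after))
    return (
        f"Minified before: {len(before):,} bytes ({before_gz:,} gzipped), "
        f"After: {len(after):,} bytes ({after_gz:,} gzipped), "
        f"Shaving {len(before) - len(after):,} bytes "
        f"({before_gz - after_gz:,} gzipped)"
    )
```

Calling `describe_savings(original_html, minified_html)` with the two byte strings returns one line ready to hand to a logger.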

So this last one saved 3.2KB of HTML document, which is nothing to sneeze at, but since 99% of clients support gzip, it actually only saved 310 bytes. As a matter of fact, I parsed the log lines and calculated the average: it was saving 338 bytes per page.
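
For the curious, averaging the gzipped savings from log lines in this format takes only a few lines. This is a sketch with a hypothetical function name, not the script I actually used.

```python
import re

# Captures the gzipped saving, e.g. "310" from
# "Shaving 3,306 bytes (310 gzipped)".
GZIP_SAVING = re.compile(r"Shaving [\d,]+ bytes \(([\d,]+) gzipped\)")


def average_gzipped_saving(log_lines):
    """Return the mean gzipped saving in bytes, or 0.0 if nothing matches."""
    savings = []
    for line in log_lines:
        match = GZIP_SAVING.search(line)
        if match:
            # Strip the thousands separators before converting.
            savings.append(int(match.group(1).replace(",", "")))
    return sum(savings) / len(savings) if savings else 0.0
```

Run on just the three sample lines above, this gives 292.0 bytes; the 338 figure comes from the full log.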

Worth it? I doubt it. It's not without risks and now it's slightly harder and weirder to view the source. However, with 338 bytes multiplied by the total number of visitors per month, I estimate a saving of 161 MB of data that no longer needs to be sent.
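
The monthly traffic figure behind that estimate isn't stated, but it can be backed out from the two numbers that are. The sketch below does that arithmetic; the function name is made up, and whether "MB" means 10^6 or 2^20 bytes is an assumption either way.

```python
AVERAGE_SAVING_BYTES = 338  # average gzipped saving per page, from the log
TOTAL_SAVED_MB = 161        # estimated monthly total


def implied_page_views(total_mb, bytes_per_page, bytes_per_mb=1_000_000):
    """Back out how many page loads the total estimate implies."""
    return total_mb * bytes_per_mb / bytes_per_page


decimal_mb = implied_page_views(TOTAL_SAVED_MB, AVERAGE_SAVING_BYTES)
binary_mb = implied_page_views(TOTAL_SAVED_MB, AVERAGE_SAVING_BYTES, 2**20)
```

That works out to roughly 476,000 page loads a month with decimal megabytes, or roughly 499,000 with binary ones.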

Comments

Martin
Peter Bengtsson
Interesting! What do you think of it? In particular, what do you think of django-htmlmin? I see that it's a mix of beautifulsoup4 and html5lib. Has it been solid?

I'm quite hesitant towards tools that are called "django-" because HTML minification should just be you and your HTML.

I noticed there's another project called https://github.com/mankyd/htmlmin which you'd think django-htmlmin wraps but that's not the case. Have you tried this one?
Martin
Ah, it works fine so far. I haven't put much effort into research but I checked that the minifier doesn't strip linebreaks within pre and textarea tags.

I like the idea of the npm minifier that it also removes quotes to really squeeze the last byte out of it, but the overhead in this deployment wouldn't be worth it for me. I like that django-htmlmin is a simple middleware so I can run it before my full page cache kicks in.
