Last week I enabled HTMLMinifier as a post-build step for server-rendered content here on this blog. Basically, after a page is rendered in Django, it's sent to a Celery queue that does things to the index.html file. The first thing it does is extract the stylesheets and replace them with a block of inline CSS. More details in this blog post. Secondly, the background job sends the index.html file to node_modules/.bin/html-minifier. See the code here.
What that does is remove quotation marks where they're not needed (e.g. <div id=foo> instead of <div id="foo">), remove HTML comments, and lastly remove whitespace that is not needed. The result is that the HTML now looks like this:
I also added a line of logging that spits out the size of the HTML before, before with gzip, after, and after with gzip. Why? Because the gain from HTML minification is usually insignificant after you gzip. See this blog post about how insignificant space optimization is in comparison to gzip. Look at the sample log lines:
...
Minified before: 38,249 bytes (11,150 gzipped), After: 36,098 bytes (10,875 gzipped), Shaving 2,151 bytes (275 gzipped)
Minified before: 37,698 bytes (10,534 gzipped), After: 35,622 bytes (10,243 gzipped), Shaving 2,076 bytes (291 gzipped)
Minified before: 58,846 bytes (14,623 gzipped), After: 55,540 bytes (14,313 gzipped), Shaving 3,306 bytes (310 gzipped)
...
So this last one saved 3.2KB of HTML, which is nothing to sneeze at, but since 99% of clients support gzip, it actually only saved 310 bytes. As a matter of fact, I parsed the log lines and calculated the average: it was saving 338 bytes per page.
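A minimal sketch of how such a log line could be produced and later averaged, using only the Python standard library. The function names `size_report` and `average_gzip_saving` are hypothetical and this is not the code that runs on the blog; it only assumes the log format shown in the sample lines above.

```python
import gzip
import re


def size_report(before: str, after: str) -> str:
    """Build a log line comparing raw and gzipped sizes of two HTML strings."""
    before_bytes = before.encode("utf-8")
    after_bytes = after.encode("utf-8")
    before_gz = len(gzip.compress(before_bytes))
    after_gz = len(gzip.compress(after_bytes))
    return (
        f"Minified before: {len(before_bytes):,} bytes ({before_gz:,} gzipped), "
        f"After: {len(after_bytes):,} bytes ({after_gz:,} gzipped), "
        f"Shaving {len(before_bytes) - len(after_bytes):,} bytes "
        f"({before_gz - after_gz:,} gzipped)"
    )


# Captures the raw and gzipped savings; a minus sign is allowed because
# gzip can occasionally produce a larger result after minification.
SHAVING = re.compile(r"Shaving (-?[\d,]+) bytes \((-?[\d,]+) gzipped\)")


def average_gzip_saving(log_lines) -> float:
    """Average the gzipped saving over all log lines that match the format."""
    savings = []
    for line in log_lines:
        match = SHAVING.search(line)
        if match:
            savings.append(int(match.group(2).replace(",", "")))
    return sum(savings) / len(savings) if savings else 0.0
```

Feeding only the three sample lines above into `average_gzip_saving` gives 292 bytes; the 338 figure comes from the full log.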
Worth it? I doubt it. It's not without risks, and now it's slightly harder and weirder to view the source. However, multiplying 338 bytes by the total number of visitors per month, I estimate it saves a total of 161 MB of data sent.
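As a rough check on that estimate, the arithmetic can be run backwards. This sketch assumes 1 MB means 10**6 bytes and that every page load saves the 338-byte average; the traffic figure it prints is only what those two numbers imply (roughly 476,000 loads per month), not a measured one.

```python
BYTES_SAVED_PER_PAGE = 338            # average gzipped saving quoted in the post
MONTHLY_SAVING_BYTES = 161 * 10**6    # 161 MB, assuming 1 MB = 10**6 bytes


def implied_monthly_page_loads(total_bytes: int, bytes_per_page: int) -> float:
    """Number of page loads needed for the per-page saving to add up to the total."""
    return total_bytes / bytes_per_page


loads = implied_monthly_page_loads(MONTHLY_SAVING_BYTES, BYTES_SAVED_PER_PAGE)
print(f"Implied page loads per month: {loads:,.0f}")
```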
Comments
I'm doing the same but with https://pypi.org/project/django-staticinline/ and https://pypi.org/project/django-htmlmin/.
Interesting! What do you think of it? In particular, what do you think of django-htmlmin? I see that it's a mix of beautifulsoup4 and html5lib. Has it been solid?
I'm quite hesitant towards tools that are called "django-" because HTML minification should just be you and your HTML.
I noticed there's another project called https://github.com/mankyd/htmlmin which you'd think django-htmlmin wraps but that's not the case. Have you tried this one?
Ah, it works fine so far. I haven't put much effort into research, but I checked that the minifier doesn't strip linebreaks within pre and textarea tags.
I like that the npm minifier also removes quotes to really squeeze the last byte out of it, but the overhead in this deployment wouldn't be worth it for me. I like that django-htmlmin is a simple middleware so I can run it before my full page cache kicks in.