I've written about mozjpeg before where I showed what it can do to a sample directory full of different kinds of JPEGs. But let's get more real. Let's actually install it and look at one thumbnail and one big photo.
To install, I used the pre-compiled binaries from this wonderful site. Like this:
# wget http://mozjpeg.codelove.de/bin/mozjpeg_3.1_amd64.deb
# dpkg -i mozjpeg_3.1_amd64.deb
# ls -l /opt/mozjpeg/bin/cjpeg
-rwxr-xr-x 1 root root 50784 Sep  3 19:03 /opt/mozjpeg/bin/cjpeg
I don't know why the binary executable ends up being called cjpeg, but that's fine. Let's put it in $PATH so other users can execute it:
# cd /usr/local/bin
# ln -s /opt/mozjpeg/bin/cjpeg
Now, let's actually use it for something. First we need a realistic lossy thumbnail that we can optimize.
$ wget http://www.peterbe.com/static/cache/eb/f0/ebf08e64e80170dc009e97f6f9681ceb.jpg
This was one of the thumbnails from a previous post called Panasonic Lumix from 2008 or an iPhone 5S from 2014.
$ cjpeg -outfile ebf08e64e80170dc009e97f6f9681ceb.moz.jpg -optimize ebf08e64e80170dc009e97f6f9681ceb.jpg
$ ls -l ebf08e64e80170dc009e97f6f9681ceb.*
-rw-rw-r-- 1 django django 11391 Sep 26 17:04 ebf08e64e80170dc009e97f6f9681ceb.jpg
-rw-r--r-- 1 django django  9414 Oct 10 01:40 ebf08e64e80170dc009e97f6f9681ceb.moz.jpg
Yay! It's 17.4% smaller. Saving 1.93Kb.
So what do they look like? See for yourself:
I have to zoom in (⌘-+) 3 times until I can see any difference. Remember, the saving isn't massive, but the use case here is a thumbnail.
So, let's do the same with a non-thumbnail. Some huge JPEG.
$ time cjpeg -outfile Lumix-2.moz.jpg -optimize Lumix-2.jpg
real    0m3.285s
user    0m3.122s
sys     0m0.080s
$ ls -l Lumix*
-rw-rw-r-- 1 django django 4880446 Sep 26 17:20 Lumix-2.jpg
-rw-rw-r-- 1 django django 1546978 Oct 10 02:02 Lumix-2.moz.jpg
$ ls -lh Lumix*
-rw-rw-r-- 1 django django 4.7M Sep 26 17:20 Lumix-2.jpg
-rw-rw-r-- 1 django django 1.5M Oct 10 02:02 Lumix-2.moz.jpg
In other words, from 4.7Mb to 1.5Mb. That's a 68.3% saving; the optimized file is only 31.7% the size of the original. And the visual difference?
Again, I have to zoom in 3 times to be able to tell any difference, and even when I've done that it's hard to tell which is which.
In conclusion, let's go ahead and use mozjpeg to optimize thumbnails.
I'm currently working on a Django library that uses mozjpeg to optimize thumbnails that are generated from stored images. I first wanted to get a feel for how good mozjpeg really is.
In my ~/Downloads directory I have all sorts of "junk" from all sorts of saves and experiments. It'll work as a good testbed of relatively random JPEG images of all sorts of sizes and qualities. Without further ado, here's the results:
FILENAME                                          OPTIMIZE   ORIGINAL    SAVING  PERCENT
-----------------------------------------------------------------------------------------
180697_1836563311933_3364808_n.jpg                  45.2Kb     50.4Kb     5.1Kb    10.2%
2014-03-20 17.35.39.jpg                           2040.1Kb   2207.8Kb   167.7Kb     7.6%
2015-03-04 21.18.16.jpg                           1521.5Kb   1629.2Kb   107.7Kb     6.6%
2015-03-04 21.19.16.jpg                           1602.4Kb   1720.0Kb   117.6Kb     6.8%
2015-03-04 21.23.16.jpg                           1181.7Kb   1272.1Kb    90.4Kb     7.1%
2015-03-05 06.03.00.jpg                           1426.7Kb   1557.7Kb   131.0Kb     8.4%
20150626_200629_001.jpg                           1566.4Kb   1717.3Kb   151.0Kb     8.8%
20150626_200631.jpg                               2157.6Kb   2319.6Kb   162.0Kb     7.0%
Boba_Fett_by_RobD4E.jpg                             96.2Kb    104.3Kb     8.1Kb     7.8%
Horse_Play.jpg                                     170.4Kb    185.2Kb    14.9Kb     8.0%
Image (107).jpg                                    344.9Kb    390.6Kb    45.7Kb    11.7%
Misc Candle Holder NECA FOTR Balrog Dec2002.jpg     37.1Kb     37.7Kb     0.6Kb     1.5%
Mozilla_Lightbeam.jpg                               55.1Kb     79.7Kb    24.6Kb    30.8%
Photo on 12-17-14 at 5.55 PM.jpg                   168.5Kb    187.7Kb    19.2Kb    10.2%
dev.jpg                                             17.5Kb     30.8Kb    13.3Kb    43.2%
dev2.jpg                                            41.1Kb     54.3Kb    13.3Kb    24.4%
dev3.jpg                                            35.3Kb     49.0Kb    13.7Kb    28.0%
dev4.jpg                                            42.0Kb     56.0Kb    14.0Kb    25.0%
dev5.jpg                                            24.6Kb     37.9Kb    13.2Kb    35.0%
dev6.jpg                                            28.9Kb     42.8Kb    13.9Kb    32.4%
hr_0570_220_135__0570220135006.jpg                3124.3Kb   3467.8Kb   343.5Kb     9.9%
hr_0570_220_158__0570220158006.jpg                3010.0Kb   3319.1Kb   309.1Kb     9.3%
hr_0570_220_175__0570220175006.jpg                2245.5Kb   2442.6Kb   197.0Kb     8.1%
hr_0570_227_599__0570227599006.jpg                2561.7Kb   2809.8Kb   248.1Kb     8.8%
hr_0596_622_701__0596622701006.jpg                3238.8Kb   3453.6Kb   214.7Kb     6.2%
hr_0596_623_849__0596623849006.jpg                2902.9Kb   3102.1Kb   199.3Kb     6.4%
hr_0622_219_873__0622219873006.jpg                 985.3Kb   1066.9Kb    81.7Kb     7.7%
logo.jpg                                            43.5Kb     51.2Kb     7.7Kb    15.1%
mvm-header.jpg                                       8.5Kb     12.4Kb     3.9Kb    31.6%
mvm-postcard-picture.jpg                            72.2Kb     73.4Kb     1.3Kb     1.7%
overhang_pixels.jpg                               3014.3Kb   3370.8Kb   356.4Kb    10.6%
peterbe copy.jpg                                     4.2Kb     10.4Kb     6.2Kb    59.7%
peterbe.jpg                                         36.7Kb     44.3Kb     7.5Kb    17.0%
pjt-mcguinty-2.jpg                                  96.8Kb    101.6Kb     4.8Kb     4.8%
sl1.jpg                                             28.7Kb     35.4Kb     6.7Kb    18.9%
That's a median of 9.3% (average of 15.3%) savings.
It's not very fast though. Some of the large files take more than a second. In total it took 23.7 seconds to create all of those optimized files. Do what you want with that fact, but bear in mind that these are hopefully "once in a lifetime" operations (depending on the ephemerality of your thumbnail storage). Mind you, the really large JPEGs skew that: the median is 72.1 milliseconds and the average is 527.0 milliseconds. Also, when I look through the numbers I find that the large JPEGs take the longest but have the least benefit in terms of byte savings.
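By the way, if you want to reproduce this kind of test, a loop along these lines does the job. This is just a sketch, assuming cjpeg is on your $PATH and the test JPEGs live in ~/Downloads:

#!/bin/bash
# Sketch: run mozjpeg's cjpeg over every JPEG in a directory and
# print the optimized size next to the original size.
for f in ~/Downloads/*.jpg; do
    out="/tmp/$(basename "$f")"
    cjpeg -optimize -outfile "$out" "$f"
    original=$(stat -c%s "$f")    # on OSX, use: stat -f%z
    optimized=$(stat -c%s "$out")
    echo "$f: $original bytes -> $optimized bytes"
done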
Chris Adams, in the comment below, inspired me to compare my trials with jpegoptim and jpegrescan. So, I took my script that generated a directory of 45 JPEGs and changed it to use those tools instead:

mozjpeg: the total size of that output directory is 34.1Mb and it took a total of 23.3 seconds (median 76.4 milliseconds).

jpegoptim & jpegrescan: the total size of that output directory is 35.6Mb and it took a total of 4.6 seconds (median 32.1 milliseconds).

In other words, roughly speaking, mozjpeg is 4.2% more space effective and about 5 times slower (23.3 vs. 4.6 seconds in total) than jpegoptim & jpegrescan.
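For reference, the jpegoptim & jpegrescan combination is run along these lines. A sketch, not necessarily the exact invocation my script used:

# Losslessly optimize Huffman tables and strip metadata in place,
# then try progressive scan optimizations with jpegrescan.
jpegoptim --strip-all photo.jpg
jpegrescan photo.jpg photo.rescan.jpg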
tl;dr Crash-stats is Mozilla's crash reporter dashboard. Simply fixing the static assets made the site 25% faster.
(The "First Byte Time" is still terrible but that's for another discussion. We're working on a re-write of the underlying data model for that particular report.)
Note how the SpeedIndex dropped from 2823 to 2098, which basically means you can see stuff sooner.
The Load Time used to be 5.7 seconds on average. Now it takes 3.5 seconds.
It used to weigh 717 KB to load the whole thing. Now it weighs 326 KB.
The only thing we changed was a long overdue correction of static asset headers and Gzip compression. Now, files with unique URLs (e.g. /static/CACHE/css/23a811f100bc.css) have maximally aggressive cache headers. And now all text/html is Gzipped.
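If you're curious, you can spot-check the effect of changes like these with curl. This is a hypothetical check (the URL is just an example; the content hash will differ, and the exact header values depend on the server configuration):

# Inspect cache and compression headers on a static asset.
curl -sI -H 'Accept-Encoding: gzip' \
  https://crash-stats.mozilla.com/static/CACHE/css/23a811f100bc.css \
  | grep -iE 'cache-control|expires|content-encoding'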
Was it easy to do? Hell no!
Does it matter? Hell yeah! We don't have a lot of users or traffic on these reports, but the people who use them do this for a living, and making the site feel snappier for them makes their lives more productive.
The way it works is that I use a library called Lazyr.js which notices when you scroll down and, when certain pictures are about to come into view, changes the src attribute of the img tag to the real image URL (stored in the data-lazyr attribute).
So it basically looks like this:
<article>
  <h3>Event 1</h3>
  <img src="event1.png">
</article>
<article>
  <h3>Event 2</h3>
  <img src="event2.png">
</article>
<article>
  <h3>Event 3</h3>
  <img src="event3.png">
</article>
<article>
  <h3>Event 4</h3>
  <img src="placeholder.png" data-lazyr="event4.png">
</article>
<article>
  <h3>Event 5</h3>
  <img src="placeholder.png" data-lazyr="event5.png">
</article>
<article>
  <h3>Event 6</h3>
  <img src="placeholder.png" data-lazyr="event6.png">
</article>
That means that to load this page it only needs to download:
event1.png
event2.png
event3.png
placeholder.png
Only 4 images instead of 6 (in this example).
When you scroll down to see the rest of the list, it then also downloads:
event4.png
event5.png
event6.png
The actual numbers on Air Mozilla are that there are 10 events per page and I lazy load 6 of them.
There is more work to do though. At the moment, the thumbnails in the sidebar (Trending and Upcoming events) are above the fold when you're browsing but below the fold when you're viewing an individual event. That's something I have yet to implement.
We're proud to announce that we've now published our first Roku channel: Air Mozilla.
We actually started this work in the third quarter of 2014, but the review process for adding a channel is really slow. The people we've talked to have been super friendly and provided really helpful feedback on the changes that needed to be made. After the first submission it took about a month for them to get back to us, and after some procrastination we submitted a second time about a month ago. Yesterday we found out it's been fully published, i.e. gone live.
Obviously it would be nice if they could get back to us quicker, but another thing they could improve is to appreciate that we're a team. All communication with Roku has been with just me, and I always have to forward emails or add my teammates as CC when I communicate with them.
Anyway, now we can start on a version 2. We deliberately kept this first version ultra-simple, just to prove that it's possible and to not get held back by feature creep.
What we're looking to add in version 2 are, in no particular order:
It's going to be much easier to find the energy to work on those features now that we know it's live.
Also, we currently have a problem watching live and archived streams on HTTPS. It's not a huge problem right now because we're not making any restricted content available and we're lucky in that the CDNs we use allow for HTTP traffic equally.
tl;dr Don't run ffmpeg over HTTP(S) and use ffmpegthumbnailer
UPDATE tl;dr Download the file, then run ffmpeg with -ss HH:MM:SS before the -i input. Don't bother with ffmpegthumbnailer.
At work I work on something called Air Mozilla. It's a site for hosting live video broadcasts and then archiving those so they can be retrieved later.
Unlike sites like YouTube we can't take a screencap from the video because many videos are future (aka. "upcoming") videos so instead we use a little placeholder thumbnail (for example, the Rust logo).
However, once it has been recorded we want to switch from the logo to an actual screen capture from the video itself. We set up a cronjob that uses ffmpeg to extract these as JPGs and then the users can go in and select whichever picture they like the best.
This is all work in progress by the way (as of December 2014).
One problem we have is that the command for extracting JPGs is really slow. So slow that we can't wrap the subprocess in a Django database connection, because the connection is often killed before the command finishes.
The command to extract them looks something like this:
ffmpeg -i https://cdnexample.com/url/to/file.mp4 -r 0.0143 /tmp/screencaps-%02d.jpg
Where the rate -r is based on the duration and how many pictures we want out. E.g. 0.0143 ≈ 15 / 1049, where 15 is how many JPGs we want and 1049 is the duration (17 minutes and 29 seconds) in seconds.
The script I used first was: ffmpeg1.sh
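The script isn't reproduced here, but based on the command above it would be something like this sketch (the URL is the example one from before):

#!/bin/bash
# ffmpeg1.sh (sketch): one ffmpeg run that samples frames at a low
# fixed rate across the whole video (15 frames over 1049 seconds).
ffmpeg -i https://cdnexample.com/url/to/file.mp4 \
       -r 0.0143 /tmp/screencaps-%02d.jpg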
My first experiment was to try to extract one picture at a time, hoping that way, internally, ffmpeg might be able to optimize something.
The second script I used was: ffmpeg2.sh
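Again, a sketch of what that might look like; the seek points are hypothetical, spaced evenly over the 1049-second duration:

#!/bin/bash
# ffmpeg2.sh (sketch): extract one frame at a time by seeking to a
# timestamp, repeated 15 times. Note: -ss comes AFTER -i here.
for i in $(seq 0 14); do
    ss=$((i * 1049 / 15))
    ffmpeg -i https://cdnexample.com/url/to/file.mp4 \
           -ss "$ss" -vframes 1 "/tmp/screencap-$i.jpg"
done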
The third alternative was to try ffmpegthumbnailer, which is an intricate wrapper on ffmpeg, and it has the benefit that you can produce slightly higher picture quality too.
The third script I used was: ffmpeg3.sh
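A sketch of the ffmpegthumbnailer version; the percentage offsets are hypothetical:

#!/bin/bash
# ffmpeg3.sh (sketch): one thumbnail per seek point, expressed as a
# percentage into the video. -s 0 keeps the original dimensions.
for pct in 10 20 30 40 50 60 70 80 90; do
    ffmpegthumbnailer -i https://cdnexample.com/url/to/file.mp4 \
                      -o "/tmp/screencap-$pct.jpg" -t "$pct%" -s 0
done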
The results, for a video clip that is 17 minutes long and a 138Mb mp4 file:
ffmpeg1.sh    2m0.847s
ffmpeg2.sh    11m46.734s
ffmpeg3.sh    0m29.780s
Clearly it's not efficient to do one screenshot at a time.
Note that with ffmpegthumbnailer you can tell it not to reduce the picture quality; the total weight of the produced JPGs from ffmpeg1.sh was 784Kb and the total weight from ffmpeg3.sh was 1.5Mb.
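The quality knobs I'm referring to are along these lines (a sketch; -q is JPEG quality on a 0-10 scale and -s 0 keeps the original size):

# Produce a full-size, maximum-quality thumbnail at the midpoint.
ffmpegthumbnailer -i file.mp4 -o cap.jpg -t 50% -q 10 -s 0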
Just to try again, I ran a similar experiment with a 35 minutes long and 890Mb mp4 file. And this time I didn't bother with ffmpeg2.sh. The results were:
ffmpeg1.sh    18m21.330s
ffmpeg3.sh    2m48.656s
So that means that using ffmpegthumbnailer is about 5 times faster than ffmpeg. Huge difference!
The reason for doing ffmpeg -i https://... was so that we don't have to first download the whole beast and run the command on a local file. However, in light of how much longer this takes, and my disdain for having to install and depend on a new tool (ffmpegthumbnailer) across all servers, why not download the whole file and run the ffmpeg command locally?
So I download the file, which is slow because of my, currently, terrible home DSL. Then I run and time them again, but against a local file instead:
ffmpeg1.sh    0m20.426s
ffmpeg3.sh    0m0.635s
Did you see that!? That's an insane difference. Clearly doing this command over HTTP(S) is a bad idea. It'll be worth downloading it first.
On Stackoverflow, LordNeckBeard gave a great tip of using the -ss option before the input file, and now it's much faster. At this point, I'm no longer interested in having to bother with ffmpegthumbnailer.
Let's fork ffmpeg2.sh into two versions.
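The difference between the two is just where the -ss flag goes. A sketch (the timestamp is hypothetical):

# ffmpeg2.1.sh (sketch): -ss AFTER -i, so ffmpeg decodes the input
# from the beginning until it reaches the seek point. Slow.
ffmpeg -i mp4.mp4 -ss 00:10:00 -vframes 1 /tmp/screencap.jpg

# ffmpeg2.2.sh (sketch): -ss BEFORE -i, so ffmpeg seeks in the input
# first and only starts decoding there. Fast.
ffmpeg -ss 00:10:00 -i mp4.mp4 -vframes 1 /tmp/screencap.jpg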
Now, let's run them again on the 138Mb file:
# the 138Mb mp4.mp4 file
ffmpeg2.1.sh    2m10.898s
ffmpeg2.2.sh    0m0.672s
195 times faster
And again, I re-ran this against a bigger file that is 1.4Gb:
# the 1.4Gb mp4-1.44Gb.mp4 file
ffmpeg2.1.sh    10m1.143s
ffmpeg2.2.sh    0m1.428s
420 times faster