The Air Mozilla project is a regular Django webapp. It's reasonably big for a more or less one man project. It's ~200K lines of Python and ~100K lines of JavaScript. There are 816 "unit tests" at the time of writing. Most of them are kinda typical Django tests. Like:
def test_some_feature(self):
    thing = MyModel.objects.create(key='value')
    url = reverse('namespace:name', args=(thing.id,))
    response = self.client.get(url)
    ....
Also, the site uses sorl.thumbnail to automatically generate thumbnails from uploaded images. It's a great library.
However, when running tests, you almost never actually care about the image itself. Your eyes will never feast on them. All you care about is that there is an image, that it was resized and that nothing broke. You don't write tests that check the new image dimensions of a generated thumbnail. If you need tests that go into that kind of detail, they probably belong somewhere else.
So, I thought, why not fake ALL operations inside sorl.thumbnail that have to do with resizing and cropping images?
Here's the changeset that does it. Note that the trick is to override the default THUMBNAIL_ENGINE that sorl.thumbnail loads. It usually defaults to sorl.thumbnail.engines.pil_engine.Engine and I just wrote my own that does no-ops in almost every instance.
I admittedly threw it together quite quickly just to see if it was possible. Turns out, it was.
# Depends on setting something like:
#   THUMBNAIL_ENGINE = 'airmozilla.base.tests.testbase.FastSorlEngine'
# in your settings specifically for running tests.

from sorl.thumbnail.engines.base import EngineBase


class _Image(object):
    def __init__(self):
        self.size = (1000, 1000)
        self.mode = 'RGBA'
        self.data = '\xa0'


class FastSorlEngine(EngineBase):

    def get_image(self, source):
        return _Image()

    def get_image_size(self, image):
        return image.size

    def _colorspace(self, image, colorspace):
        return image

    def _scale(self, image, width, height):
        image.size = (width, height)
        return image

    def _crop(self, image, width, height, x_offset, y_offset):
        image.size = (width, height)
        return image

    def _get_raw_data(self, image, *args, **kwargs):
        return image.data

    def is_valid_image(self, raw_data):
        return bool(raw_data)
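One possible way to switch the engine only for test runs is sketched below. It is not taken from the changeset; it assumes the tests are started with `python manage.py test`, so that the second command-line argument is `test`, and that this code lives in the Django settings module. `RUNNING_TESTS` is a made-up name used only for this illustration.

```python
import sys

# The engine sorl.thumbnail normally loads, as mentioned in the post.
THUMBNAIL_ENGINE = 'sorl.thumbnail.engines.pil_engine.Engine'

# Assumption: tests are started with `python manage.py test ...`,
# which makes sys.argv[1] equal to 'test'. RUNNING_TESTS is a
# hypothetical helper name, not a Django or sorl setting.
RUNNING_TESTS = len(sys.argv) > 1 and sys.argv[1] == 'test'

if RUNNING_TESTS:
    # Swap in the no-op engine so no real resizing or cropping happens.
    THUMBNAIL_ENGINE = 'airmozilla.base.tests.testbase.FastSorlEngine'
```

If your project already keeps a separate settings file for tests, setting THUMBNAIL_ENGINE there is probably cleaner than sniffing sys.argv.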
So, was it much faster?
It's hard to measure because the time it takes to run the whole test suite depends on other stuff going on on my laptop during the long time it takes to run the tests. So I ran them 8 times with the old code and 8 times with this new hack.
Iteration | Before | After |
---|---|---|
1 | 82.789s | 73.519s |
2 | 82.869s | 67.009s |
3 | 77.100s | 60.008s |
4 | 74.642s | 58.995s |
5 | 109.063s | 80.333s |
6 | 100.452s | 81.736s |
7 | 85.992s | 61.119s |
8 | 82.014s | 73.557s |
Average | 86.865s | 69.535s |
Median | 82.869s | 73.519s |
Std Dev | 11.826s | 9.076s |
So roughly 11% faster going by the medians (closer to 20% going by the averages). Not a lot, but it adds up when you're doing test-driven development or debugging, where you run a suite or a test over and over as you're saving the files/tests you're working on.
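For anyone who wants to check the summary rows, here is a small sketch using Python's standard `statistics` module with the timings copied from the table. Note that `statistics.median` averages the two middle values when there is an even number of runs (giving about 82.829s and 70.264s), while the Median row in the table matches `statistics.median_high`. The helper names `summarize` and `reduction` are just for this illustration.

```python
import statistics

# Timings in seconds, copied from the table above.
before = [82.789, 82.869, 77.100, 74.642, 109.063, 100.452, 85.992, 82.014]
after = [73.519, 67.009, 60.008, 58.995, 80.333, 81.736, 61.119, 73.557]


def summarize(times):
    """Return the summary statistics for one column of timings."""
    return {
        'mean': statistics.mean(times),
        'median': statistics.median(times),            # averages the two middle runs
        'median_high': statistics.median_high(times),  # upper of the two middle runs
        'stdev': statistics.stdev(times),              # sample standard deviation
    }


def reduction(old, new):
    """Fraction of run time saved when going from old to new."""
    return 1 - new / old


b, a = summarize(before), summarize(after)
for key in ('mean', 'median', 'median_high', 'stdev'):
    print('%-12s before=%8.3fs  after=%8.3fs' % (key, b[key], a[key]))

# How much time was saved, depending on which summary you trust.
for key in ('mean', 'median', 'median_high'):
    print('saved by %-12s %.1f%%' % (key, 100 * reduction(b[key], a[key])))
```

With only 8 noisy runs per column, none of these percentages should be read as more than a rough indication.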
Room for improvement
In my case, it just worked with this simple solution. Your site might do fancier things with the thumbnails. Perhaps we can combine forces on this and finalize a working solution into a standalone package.
Comments
Could you speak to the benefits of using this approach over something like unittest.mock.Mock?
First of all, I didn't even know that mock was part of unittest now. I thought you still had to install it separately.
Generally, I suspect both will work. Maybe more a matter of taste. I'm generally pessimistic towards mocking unless it's the only way possible. Mocking is a clever but equally nasty hack and the code often becomes hard to read (once it's escaped your short-term memory) and it's so easy to "overmock" and accidentally make everything a mock object that doesn't help you check your sanity.
Done well, I believe mocking can be incredibly insightful and readable. I'll admit that doing it well is often less trivial than it sounds! I also see the value in creating objects that are essentially test harnesses, so I'm not necessarily saying I'd never follow your approach. Just wanted to get your thoughts. Thanks for the input!
Yeah, you could overmock things, but I will always prefer using the same approach in all of my tests. Writing a specific mock for each thing seems weird, and not all code lends itself to the pattern you demonstrated (i.e. having pluggable engines).
Mocking is a very valid approach, as you demonstrated.
I think that unit tests should be very specific, and anything beyond the limits of your process should be avoided (mocked).
There are other types of tests, like component/integration tests, where the opposite is advised (but still, for a lot of reasons, it's perfectly valid to use pretenders/simulators for some parts of your system).
For example, I recently started testing every component I write in a docker compose setup, which gives me control over the connections to the DB or other services, i.e. you can stop the database container and test reconnection.
Why not use unittest.mock? Then you can also check if it was called.
See response above to Dane.