Django test optimization with no-op PIL engine

27 October 2016   6 comments   Python, Django

The Air Mozilla project is a regular Django webapp. It's reasonably big for a more or less one-man project: ~200K lines of Python and ~100K lines of JavaScript. There are 816 "unit tests" at the time of writing. Most of them are fairly typical Django tests, like:

def test_some_feature(self):
    thing = MyModel.objects.create(key='value')
    url = reverse('namespace:name', args=(thing.id,))
    response = self.client.get(url)
    ...

Also, the site uses sorl.thumbnail to automatically generate thumbnails from uploaded images. It's a great library.

However, when running tests, you almost never actually care about the image itself. Your eyes will never feast on it. All you care about is that there is an image, that it was resized and that nothing broke. You don't write tests that check the new dimensions of a generated thumbnail. If you need tests that go into that kind of detail, they probably belong somewhere else.

So, I thought, why not fake ALL the operations inside sorl.thumbnail that have to do with resizing and cropping images?

Here's the changeset that does it. Note that the trick is to override the default THUMBNAIL_ENGINE that sorl.thumbnail loads. It usually defaults to sorl.thumbnail.engines.pil_engine.Engine, and I just wrote my own that does no-ops in almost every instance.

I admittedly threw it together quite quickly just to see if it was possible. Turns out, it was.

# Depends on setting something like:
#    THUMBNAIL_ENGINE = 'airmozilla.base.tests.testbase.FastSorlEngine'
# in your settings specifically for running tests.


from sorl.thumbnail.engines.base import EngineBase


class _Image(object):
    def __init__(self):
        self.size = (1000, 1000)
        self.mode = 'RGBA'
        self.data = '\xa0'


class FastSorlEngine(EngineBase):

    def get_image(self, source):
        return _Image()

    def get_image_size(self, image):
        return image.size

    def _colorspace(self, image, colorspace):
        return image

    def _scale(self, image, width, height):
        image.size = (width, height)
        return image

    def _crop(self, image, width, height, x_offset, y_offset):
        image.size = (width, height)
        return image

    def _get_raw_data(self, image, *args, **kwargs):
        return image.data

    def is_valid_image(self, raw_data):
        return bool(raw_data)
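
The comment at the top of that snippet mentions a setting that only applies when running tests. Here is a minimal sketch of one way to do that switch, assuming the tests are started with `python manage.py test`. The helper name pick_thumbnail_engine and the argv check are my assumptions for illustration, not something taken from the changeset; a separate test settings file would work just as well.

```python
import sys

# The default engine and the fake engine path, both as named in the post.
DEFAULT_ENGINE = 'sorl.thumbnail.engines.pil_engine.Engine'
FAST_ENGINE = 'airmozilla.base.tests.testbase.FastSorlEngine'


def pick_thumbnail_engine(argv):
    """Return the no-op engine path when the command looks like a test run.

    Hypothetical helper. Assumption: tests are launched as
    `python manage.py test ...`, so argv[1] is 'test'.
    """
    running_tests = len(argv) > 1 and argv[1] == 'test'
    return FAST_ENGINE if running_tests else DEFAULT_ENGINE


# In a settings module, this is the line sorl.thumbnail ends up reading.
THUMBNAIL_ENGINE = pick_thumbnail_engine(sys.argv)
```

Keeping the decision in a small function makes it easy to check without starting Django at all.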

So, was it much faster?

It's hard to measure because the time it takes to run the whole test suite depends on whatever else is going on on my laptop while the tests are running. So I ran them 8 times with the old code and 8 times with this new hack.

Iteration   Before      After
1           82.789s     73.519s
2           82.869s     67.009s
3           77.100s     60.008s
4           74.642s     58.995s
5           109.063s    80.333s
6           100.452s    81.736s
7           85.992s     61.119s
8           82.014s     73.557s
Average     86.865s     69.535s
Median      82.829s     70.264s
Std Dev     11.826s     9.076s

So roughly 15% faster going by the medians (closer to 20% going by the averages). Not a lot, but it adds up when you're doing test-driven development or debugging, where you run a suite or a single test over and over as you save the files you're working on.
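
For anyone who wants to recheck the summary rows, here is a small sketch that feeds the sixteen timings from the table into Python's standard statistics module. The helper names summarize and speedup_percent are made up for this example. Note that statistics.median averages the two middle values when the number of runs is even, and statistics.stdev is the sample standard deviation.

```python
import statistics

# Wall-clock timings in seconds, copied from the table above.
before = [82.789, 82.869, 77.100, 74.642, 109.063, 100.452, 85.992, 82.014]
after = [73.519, 67.009, 60.008, 58.995, 80.333, 81.736, 61.119, 73.557]


def summarize(timings):
    """Return the mean, median and sample standard deviation of the runs."""
    return {
        'mean': statistics.mean(timings),
        'median': statistics.median(timings),
        'stdev': statistics.stdev(timings),
    }


def speedup_percent(old, new):
    """How much smaller `new` is than `old`, as a percentage of `old`."""
    return 100.0 * (old - new) / old


before_stats = summarize(before)
after_stats = summarize(after)

for name in ('mean', 'median', 'stdev'):
    print('%-7s %9.3fs %9.3fs' % (name, before_stats[name], after_stats[name]))

for name in ('mean', 'median'):
    percent = speedup_percent(before_stats[name], after_stats[name])
    print('faster by %s: %.1f%%' % (name, percent))
```

With only 8 runs per variant and a spread of around 10 seconds, the exact percentage should be taken as a rough indication rather than a precise measurement.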

Room for improvement

In my case, it just worked with this simple solution. Your site might do fancier things with the thumbnails. Perhaps we can combine forces on this and finalize a working solution into a standalone package.

Comments

Dane Hillard
Could you speak to the benefits of using this approach over something like unittest.mock.Mock?
Peter Bengtsson
First of all, I didn't even know that mock was part of unittest now. I thought you still had to install it separately.

Generally, I suspect both will work. Maybe more a matter of taste. I'm generally pessimistic towards mocking unless it's the only way possible. Mocking is a clever but equally nasty hack and the code often becomes hard to read (once it's escaped your short-term memory) and it's so easy to "overmock" and accidentally make everything a mock object that doesn't help you check your sanity.
Dane Hillard
Done well, I believe mocking can be incredibly insightful and readable. I'll admit that doing it well is often less trivial than it sounds! I also see the value in creating objects that are essentially test harnesses, so I'm not necessarily saying I'd never follow your approach. Just wanted to get your thoughts. Thanks for the input!
Israel Fruchter
Yeah, you could over-mock things, but I will always prefer using the same approach in all of my tests. Writing a specific mock for each thing seems weird, and not all code lends itself to the pattern you demonstrated (i.e. having pluggable engines).

Mocking is a very valid approach, as you demonstrated.
I think that unit tests should be very specific, and anything beyond the limits of your process should be avoided (mocked).

There are other types of tests, like component/integration tests, where the opposite is advised (though even there it's perfectly valid, for a lot of reasons, to use pretenders/simulators for some parts of your system).

For example, I recently started testing every component I write in a docker compose setup, which lets me control the connections to the DB or other services, i.e. you can stop the database container and test reconnection.
Israel Fruchter
Why not use unittest.mock? Then you can also check whether it was called.
Peter Bengtsson
See response above to Dane.