Fastest Redis configuration for Django

11 May 2017   Python, Linux, Web development, Django

https://github.com/peterbe/django-fastest-redis

I have an app that does a lot of Redis queries. It all runs in AWS with ElastiCache Redis. Due to the nature of the app, it stores really large hash tables in Redis, and the application depends on querying Redis for these. The question is: what is the best configuration for the fastest service possible?

Note! Last month I wrote Fastest cache backend possible for Django which looked at comparing Redis against Memcache. Might be an interesting read too if you're not sold on Redis.

Options

All options are variations on the compressor, serializer and parser, which are things you can override in django-redis. All of them affect performance. Even compression matters: if fewer bytes travel between Redis and the application, network throughput should be better.

Without further ado, here are the variations:

CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": config('REDIS_LOCATION', 'redis://127.0.0.1:6379') + '/0',
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
        }
    },
    "json": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": config('REDIS_LOCATION', 'redis://127.0.0.1:6379') + '/1',
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
            "SERIALIZER": "django_redis.serializers.json.JSONSerializer",
        }
    },
    "ujson": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": config('REDIS_LOCATION', 'redis://127.0.0.1:6379') + '/2',
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
            "SERIALIZER": "fastestcache.ujson_serializer.UJSONSerializer",
        }
    },
    "msgpack": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": config('REDIS_LOCATION', 'redis://127.0.0.1:6379') + '/3',
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
            "SERIALIZER": "django_redis.serializers.msgpack.MSGPackSerializer",
        }
    },
    "hires": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": config('REDIS_LOCATION', 'redis://127.0.0.1:6379') + '/4',
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
            "PARSER_CLASS": "redis.connection.HiredisParser",
        }
    },
    "zlib": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": config('REDIS_LOCATION', 'redis://127.0.0.1:6379') + '/5',
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
            "COMPRESSOR": "django_redis.compressors.zlib.ZlibCompressor",
        }
    },
    "lzma": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": config('REDIS_LOCATION', 'redis://127.0.0.1:6379') + '/6',
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
            "COMPRESSOR": "django_redis.compressors.lzma.LzmaCompressor"
        }
    },
}

As you can see, they each have a variation on the OPTIONS.PARSER_CLASS, OPTIONS.SERIALIZER or OPTIONS.COMPRESSOR.

The default configuration is to use redis-py and to pickle the Python objects to a bytestring. Pickling in Python is pretty fast, but it has the disadvantage that it's Python-specific, so you can't have a Ruby application reading the same Redis database.
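The ujson variation above points to a custom serializer module (fastestcache.ujson_serializer.UJSONSerializer). Roughly, a django-redis serializer only needs to expose dumps() and loads(); here is a minimal sketch of what such a class could look like (this is an illustration, not necessarily the exact code in the repo):

import ujson


class UJSONSerializer:
    # Minimal sketch of a ujson-based serializer for django-redis.
    # django-redis calls dumps()/loads(); the exact base-class interface
    # can differ between versions, so treat this as an illustration.
    def __init__(self, options=None):
        self.options = options or {}

    def dumps(self, value):
        # Redis stores bytes, so encode the JSON string.
        return ujson.dumps(value).encode("utf-8")

    def loads(self, value):
        if isinstance(value, bytes):
            value = value.decode("utf-8")
        return ujson.loads(value)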

The Experiment

Note how I have one LOCATION per configuration. That's crucial for the sake of testing. That way one database is all JSON, another is all zlib-compressed, etc.

The benchmark measures how long it takes to READ a specific key (called benchmarking). Once it has done that, it appends that time to the previous value (or [] if it was the first time) and writes the list back into the database. That way, towards the end, you have one key whose value looks something like this: [0.013103008270263672, 0.003879070281982422, 0.009411096572875977, 0.0009970664978027344, 0.0002830028533935547, ..... MANY MORE ....].

Towards the end, each of these lists is pretty big: about 500 to 1,000 measurements, depending on the benchmark run.
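In code, the measurement step looks roughly like this (a simplified sketch with a hypothetical helper name; the real benchmark code in the repo is a bit more involved):

import time

from django.core.cache import caches


def measure_read(config_name, key="benchmarking"):
    # Simplified sketch of the measurement described above.
    cache = caches[config_name]

    t0 = time.time()
    times = cache.get(key) or []  # this read is what we're timing
    elapsed = time.time() - t0

    times.append(elapsed)         # append and write the list back
    cache.set(key, times)
    return elapsed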

In the experiment I used wrk to basically bombard the Django server on the URL /random (which makes a measurement with a random configuration). On the EC2 experiment node it settles at around 1,300 requests per second, which is a decent number for an application that does a fair amount of writes.
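The /random view is, roughly, something like this (again a hypothetical sketch, reusing the measure_read helper sketched above):

import random

from django.http import HttpResponse

CONFIG_NAMES = ["default", "json", "ujson", "msgpack", "hires", "zlib", "lzma"]


def random_benchmark(request):
    # Pick one of the cache configurations at random and time one read.
    name = random.choice(CONFIG_NAMES)
    elapsed = measure_read(name)
    return HttpResponse("{} {:.6f}".format(name, elapsed))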

The way I run the Django server is with uwsgi like this:

uwsgi --http :8000 --wsgi-file fastestcache/wsgi.py --master --processes 4 --threads 2

And the wrk command like this:

wrk -d30s  "http://127.0.0.1:8000/random"

(that, by default, runs 2 threads on 10 connections)

Once the benchmarking is done, I open http://localhost:8000/summary which spits out a table and some simple charts.

An Important Quirk

[Chart: time measurements over time]
One thing I noticed early on was that the final averages were very different from the medians, which indicates spikes. The chart above shows the times put into that huge Python list for the default configuration for the first 200 measurements. There are little spikes, but it's generally quite flat over time once it gets past the beginning.

Sure enough, it turns out that in almost all configurations, the time it takes to make the query in the beginning is almost an order of magnitude slower than the times once the benchmark has been running for a while.

So in the test code you'll see that it chops off the first 10 measurements. Perhaps it should be more than 10. After all, if you don't like the spikes you can simply treat the median as the most reliable number.
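Trimming the warm-up measurements before summarising is simple; something like this sketch (the cut-off of 10 is the one mentioned above):

import statistics

WARMUP = 10


def summarize(times):
    # Drop the warm-up spikes, then report average/median/stddev.
    trimmed = times[WARMUP:]
    return {
        "average": statistics.mean(trimmed),
        "median": statistics.median(trimmed),
        "stddev": statistics.stdev(trimmed),
    }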

The Code

The benchmarking code is here. Please be aware that this is quite rough. I'm sure there are many things that can be improved, but I'm not sure I'm going to keep this around.

The Equipment

The ElastiCache Redis I used was a cache.m3.xlarge (13 GiB, high network performance) with a single node, no sharding and no multi-AZ enabled.

The EC2 node was a m4.xlarge Ubuntu 16.04 64-bit (4 vCPUs and 16 GiB RAM with High network performance).

Both the Redis instance and the EC2 node ran in us-west-1c (Northern California).

The Results

Here are the results! Sorry if it looks terrible on mobile devices.

root@ip-172-31-2-61:~# wrk -d30s  "http://127.0.0.1:8000/random" && curl "http://127.0.0.1:8000/summary"
Running 30s test @ http://127.0.0.1:8000/random
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     9.19ms    6.32ms  60.14ms   80.12%
    Req/Sec   583.94    205.60     1.34k    76.50%
  34902 requests in 30.03s, 2.59MB read
Requests/sec:   1162.12
Transfer/sec:     88.23KB
                         TIMES        AVERAGE         MEDIAN         STDDEV
json                      2629        2.596ms        2.159ms        1.969ms
msgpack                   3889        1.531ms        0.830ms        1.855ms
lzma                      1799        2.001ms        1.261ms        2.067ms
default                   3849        1.529ms        0.894ms        1.716ms
zlib                      3211        1.622ms        0.898ms        1.881ms
ujson                     3715        1.668ms        0.979ms        1.894ms
hires                     3791        1.531ms        0.879ms        1.800ms

Best Averages (shorter better)
###############################################################################
██████████████████████████████████████████████████████████████   2.596  json
█████████████████████████████████████                            1.531  msgpack
████████████████████████████████████████████████                 2.001  lzma
█████████████████████████████████████                            1.529  default
███████████████████████████████████████                          1.622  zlib
████████████████████████████████████████                         1.668  ujson
█████████████████████████████████████                            1.531  hires
Best Medians (shorter better)
###############################################################################
███████████████████████████████████████████████████████████████  2.159  json
████████████████████████                                         0.830  msgpack
████████████████████████████████████                             1.261  lzma
██████████████████████████                                       0.894  default
██████████████████████████                                       0.898  zlib
████████████████████████████                                     0.979  ujson
█████████████████████████                                        0.879  hires


Size of Data Saved (shorter better)
###############################################################################
█████████████████████████████████████████████████████████████████  60K  json
██████████████████████████████████████                             35K  msgpack
████                                                                4K  lzma
█████████████████████████████████████                              35K  default
█████████                                                           9K  zlib
████████████████████████████████████████████████████               48K  ujson
█████████████████████████████████████                              34K  hires

Conclusion

This experiment has led me to the conclusion that the best serializer is msgpack and the best compression is zlib. That is the best configuration for django-redis.

msgpack has implementation libraries for many other programming languages. Right now that doesn't matter for my application, but since msgpack is both faster and more versatile (because it supports multiple languages), I conclude that it's a better choice than the default pickle serializer.
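Put together, the winning configuration would look something like this (a sketch that simply combines the msgpack serializer and zlib compressor options shown earlier):

CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": config('REDIS_LOCATION', 'redis://127.0.0.1:6379') + '/0',
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
            "SERIALIZER": "django_redis.serializers.msgpack.MSGPackSerializer",
            "COMPRESSOR": "django_redis.compressors.zlib.ZlibCompressor",
        }
    },
}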
