Fastest cache backend possible for Django

07 April 2017   5 comments   Python, Linux, Web development

Powered by Fusion×

tl;dr; Redis is twice as fast as memcached as a Django cache backend when installed using AWS ElastiCache. Only tested for reads.

Django has a wonderful caching framework. I think I say "wonderful" because it's so simple. Not because it has a hundred different bells or whistles. Each cache gets a name (e.g. "mymemcache" or "redis append only"). The only configuration you generally have to worry about is 1) what backed and 2) what location.

For example, to set up a memcached backend:

# this in settings.py
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'KEY_PREFIX': 'myapp',
        'LOCATION': config('MEMCACHED_LOCATION', '127.0.0.1:11211'),
    },
}

With that in play you can now do:

>>> from django.core.cache import caches
>>> caches['default'].set('key', 'value', 60)  # 60 seconds
>>> caches['default'].get('key')
'value'

Django comes without built-in backend called django.core.cache.backends.locmem.LocMemCache which is basically a simply Python object in memory with no persistency between Python processes. This one is of course super fast because it involves no further network (local or remote) beyond the process itself. But it's not really useful because if you care about performance (which you probably are if you're here because of the blog post title) because it can't be reused amongst processes.

Anyway, the most common backends to use are:

These are semi-persistent and built for extremely fast key lookups. They can both be reached over TCP or via a socket.

What I wanted to see, is which one is fastest.

The Experiment

First of all, in this blog post I'm only measuring the read times of the various cache backends.

Here's the Django view function that is the experiment:

from django.conf import settings
from django.core.cache import caches

def run(request, cache_name):
    if cache_name == 'random':
        cache_name = random.choice(settings.CACHE_NAMES)
    cache = caches[cache_name]
    t0 = time.time()
    data = cache.get('benchmarking', [])
    t1 = time.time()
    if random.random() < settings.WRITE_CHANCE:
        data.append(t1 - t0)
        cache.set('benchmarking', data, 60)
    if data:
        avg = 1000 * sum(data) / len(data)
    else:
        avg = 'notyet'
    # print(cache_name, '#', len(data), 'avg:', avg, ' size:', len(str(data)))
    return http.HttpResponse('{}\n'.format(avg))

It records the time to make a cache.get read and depending settings.WRITE_CHANCE it also does a write (but doesn't record that).
What it records is a list of floats. The content of that piece of data stored in the cache looks something like this:

  1. [0.0007331371307373047]
  2. [0.0007331371307373047, 0.0002570152282714844]
  3. [0.0007331371307373047, 0.0002570152282714844, 0.0002200603485107422]

So the data grows from being really small to something really large. If you run this 1,000 times with settings.WRITE_CACHE of 1.0 the last time it has to fetch a list of 999 floats out of the cache backend.

You can either test it with 1 specific backend in mind and see how fast Django can do, say, 10,000 of these. Here's one such example:

$ wrk -t10 -c400 -d10s http://127.0.0.1:8000/default
Running 10s test @ http://127.0.0.1:8000/default
  10 threads and 400 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    76.28ms  155.26ms   1.41s    92.70%
    Req/Sec   349.92    193.36     1.51k    79.30%
  34107 requests in 10.10s, 2.56MB read
  Socket errors: connect 0, read 0, write 0, timeout 59
Requests/sec:   3378.26
Transfer/sec:    259.78KB

$ wrk -t10 -c400 -d10s http://127.0.0.1:8000/memcached
Running 10s test @ http://127.0.0.1:8000/memcached
  10 threads and 400 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    96.87ms  183.16ms   1.81s    95.10%
    Req/Sec   213.42     82.47     0.91k    76.08%
  21315 requests in 10.09s, 1.57MB read
  Socket errors: connect 0, read 0, write 0, timeout 32
Requests/sec:   2111.68
Transfer/sec:    159.27KB

$ wrk -t10 -c400 -d10s http://127.0.0.1:8000/redis
Running 10s test @ http://127.0.0.1:8000/redis
  10 threads and 400 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    84.93ms  148.62ms   1.66s    92.20%
    Req/Sec   262.96    138.72     1.10k    81.20%
  25271 requests in 10.09s, 1.87MB read
  Socket errors: connect 0, read 0, write 0, timeout 15
Requests/sec:   2503.55
Transfer/sec:    189.96KB

But an immediate disadvantage with this is that the "total final rate" (i.e. requests/sec) is likely to include so many other factors. However, you can see that LocMemcache got 3378.26 req/s, MemcachedCache got 2111.68 req/s and RedisCache got 2503.55 req/s.

The code for the experiment is available here: https://github.com/peterbe/django-fastest-cache

The Infra Setup

I created an AWS m3.xlarge EC2 Ubuntu node and two nodes in AWS ElastiCache. One 2-node memcached cluster based on cache.m3.xlarge and one 2-node 1-replica Redis cluster also based on cache.m3.xlarge.

The Django server was run with uWSGI like this:

uwsgi --http :8000 --wsgi-file fastestcache/wsgi.py  --master --processes 6 --threads 10

The Results

Instead of hitting one backend repeatedly and reporting the "requests per second" I hit the "random" endpoint for 30 seconds and let it randomly select a cache backend each time and once that's done, I'll read each cache and look at the final massive list of timings it took to make all the reads. I run it like this:

wrk -t10 -c400 -d30s http://127.0.0.1:8000/random && curl http://127.0.0.1:8000/summary
...wrk output redacted...

                         TIMES        AVERAGE         MEDIAN         STDDEV
memcached                 5738        7.523ms        4.828ms        8.195ms
default                   3362        0.305ms        0.187ms        1.204ms
redis                     4958        3.502ms        1.707ms        5.591ms

Best Averages (shorter better)
###############################################################################
█████████████████████████████████████████████████████████████  7.523  memcached
██                                                             0.305  default
████████████████████████████                                   3.502  redis

Things to note:

Other Things To Test

Perhaps pylibmc is faster than python-memcached.

                         TIMES        AVERAGE         MEDIAN         STDDEV
pylibmc                   2893        8.803ms        6.080ms        7.844ms
default                   3456        0.315ms        0.181ms        1.656ms
redis                     4754        3.697ms        1.786ms        5.784ms

Best Averages (shorter better)
###############################################################################
██████████████████████████████████████████████████████████████   8.803  pylibmc
██                                                               0.315  default
██████████████████████████                                       3.697  redis

Using pylibmc didn't make things much faster. What if we we pit memcached against pylibmc?:

                         TIMES        AVERAGE         MEDIAN         STDDEV
pylibmc                   3005        8.653ms        5.734ms        8.339ms
memcached                 2868        8.465ms        5.367ms        9.065ms

Best Averages (shorter better)
###############################################################################
█████████████████████████████████████████████████████████████  8.653  pylibmc
███████████████████████████████████████████████████████████    8.465  memcached

What about that fancy hiredis Redis Python driver that's supposedly faster?

                         TIMES        AVERAGE         MEDIAN         STDDEV
redis                     4074        5.628ms        2.262ms        8.300ms
hiredis                   4057        5.566ms        2.296ms        8.471ms

Best Averages (shorter better)
###############################################################################
███████████████████████████████████████████████████████████████  5.628  redis
██████████████████████████████████████████████████████████████   5.566  hiredis

These last two results are both surprising and suspicious. Perhaps the whole setup is wrong. Why wouldn't the C-based libraries be faster? Is it so incredibly dwarfed by the network I/O in the time between my EC2 node and the ElastiCache nodes?

In Conclusion

I personally like Redis. It's not as stable as memcached. On a personal server I've run for years the Redis server sometimes just dies due to corrupt memory and I've come to accept that. I don't think I've ever seen memcache do that.

But there are other benefits with Redis as a cache backend. With the django-redis library you have really easy access to the raw Redis connection and you can do much more advanced data structures. You can also cache certain things indefinitely. Redis also supports storing much larger strings than memcached (1MB for memcached and 512MB for Redis).

The conclusion is that Redis is faster than memcached by a factor of 2. Considering the other feature benefits you can get out of having a Redis server available, it's probably a good choice for your next Django project.

Bonus Feature

In big setups you most likely have a whole slur of web heads that are servers that do nothing by handle web requests. And these are configured to talk to databases and caches over the near network. However, so many of us have cheap servers on DigitalOcean or Linode where we run web servers, relational databases and cache servers all on the same machine. (I do. This blog is one of those where there is Nginx, Redis, memcached and PostgreSQL on a 4GB DigitalOcean NYC SSD machine).

So here's one last test where I installed a local Redis and a local memcached on the EC2 node itself:

$ cat .env | grep 127.0.0.1
MEMCACHED_LOCATION="127.0.0.1:11211"
REDIS_LOCATION="redis://127.0.0.1:6379/0"

Here are the results:

                         TIMES        AVERAGE         MEDIAN         STDDEV
memcached                 7366        3.456ms        1.380ms        5.678ms
default                   3716        0.263ms        0.189ms        1.002ms
redis                     5582        2.334ms        0.639ms        4.965ms

Best Averages (shorter better)
###############################################################################
█████████████████████████████████████████████████████████████  3.456  memcached
████                                                           0.263  default
█████████████████████████████████████████                      2.334  redis

The conclusion of that last benchmark is that Redis is still faster and it's roughly 1.8x faster to run these backends on the web head than to use ElastiCache. Perhaps that just goes to show how amazingly fast the AWS inter-datacenter fiber network is!

Comments

Cezar
I've run redis with a million users without it crashing for years at a time. Not sure what's up with your install.
Peter Bengtsson
My problem was that the server ran out of memory. And instead of grinding to a halt, it crashes.
Grant Jenks
Great to see more benchmarks out there. In my local testing, Memcached often outperforms Redis by a fair margin. Seems like AWS ElastiCache is having a large effect on the results. You may be interested in my DiskCache project: http://www.grantjenks.com/docs/diskcache/ which also benchmarks Django cache backends.
Peter Bengtsson
How did you measure memcache outperformed Redis?

Also, disk cache seems dangerously inefficient when you have multiple web heads.
Mind you, this while blog is served from disk. Django renders HTML which is dumped on disk and Nginx renders that, and falls back to Django (via uWSGI).
Grant Jenks
Similar setup to yours. Initialize Django with multiple caches then test the performance of get/set/delete operations. Script in Github repo at tests/benchmark_djangocache.py Results are here: http://www.grantjenks.com/docs/diskcache/djangocache-benchmarks.html I'm measuring multiple percentiles: 50th (median), 90th, 99th, and max.

I'm not sure what "multiple web heads" refers to but I will guess you mean a load balancer with multiple servers behind it. That is a scenario where you've probably outgrown DiskCache. Although if you load balance consistently (say by ip-hash) then the issue can be mitigated.
Thank you for posting a comment

Your email will never ever be published


Related posts

Previous:
Fastest way to download a file from S3 29 March 2017
Next:
A decent Elasticsearch search engine implementation 09 April 2017
Related:
Fastest *local* cache backend possible for Django 04 August 2017
Fastest Redis configuration for Django 11 May 2017
Autocompeter.com 02 April 2015
Bye bye AWS EC2, Hello Digital Ocean 03 November 2014
All my apps are now running on one EC2 server 03 November 2013
Fastest database for Tornado 09 October 2013
django-fancy-cache with or without stats 11 March 2013
Secs sell! How frickin' fast this site is! (server side) 05 April 2012
Persistent caching with fire-and-forget updates 14 December 2011
Comparing Google Closure with UglifyJS 10 July 2011