I have a Django ORM model called `Category`. It's simple and it looks like this:
```
from django.db import models

class Category(models.Model):
    name = models.CharField(max_length=100)
```
What I occasionally need is all of those collected up in one big dict. Like this:
```
mapping = {}
for id, name in Category.objects.values_list("id", "name"):
    mapping[id] = name
```
If you want to be fancy, you can use the `dict` constructor directly:
```
mapping = dict(Category.objects.values_list("id", "name"))
```
...which does the same thing.
Even though it's not strictly necessary, I put a class method on the model so it's all neatly together:
```
class Category(models.Model):
    name = models.CharField(max_length=100)

    @classmethod
    def get_category_id_name_map(cls):
        return dict(cls.objects.values_list("id", "name"))
```
The `Category` model doesn't change very often, so it's ripe for caching to avoid the SQL query. You can use `functools.lru_cache` for memoization.
```
from functools import lru_cache

class Category(models.Model):
    name = models.CharField(max_length=100)

    @classmethod
    @lru_cache(maxsize=1)
    def get_category_id_name_map(cls):
        return dict(cls.objects.values_list("id", "name"))
```
Now, within each Python process, the first call to `Category.get_category_id_name_map()` caches the result indefinitely, and consecutive calls are pretty much instant. Obviously, this assumes the number of `Category` objects is reasonably bounded.
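You can see the memoization at work by poking at the counters that `lru_cache` exposes on the wrapped function. A minimal sketch, assuming the model above, for example in a `./manage.py shell` session:

```
# First call misses the cache and runs the SQL query.
Category.get_category_id_name_map()

# Second call is served straight from the cache.
Category.get_category_id_name_map()

# lru_cache exposes hit/miss counters on the wrapped function.
print(Category.get_category_id_name_map.cache_info())
# e.g. CacheInfo(hits=1, misses=1, maxsize=1, currsize=1)
```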
Next, the `Category` model does sometimes change. To be made aware when it changes, you can use Django signals. To purge that memoization cache, you do this:
```
from django.db.models.signals import post_delete, post_save
from django.dispatch import receiver

@receiver(post_save, sender=Category)
@receiver(post_delete, sender=Category)
def purge_get_category_id_name_map(sender, instance, **kwargs):
    Category.get_category_id_name_map.cache_clear()
```
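One caveat: `@receiver` only connects the signal if the module it lives in actually gets imported. A common Django pattern is to keep the receivers in a `signals.py` and import it from the app config's `ready()` method; a minimal sketch, where the app name `catalog` is just an assumption for illustration:

```
# catalog/apps.py
from django.apps import AppConfig

class CatalogConfig(AppConfig):
    name = "catalog"

    def ready(self):
        # Importing the module is what hooks up the @receiver decorators.
        from . import signals  # noqa: F401
```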
Simple benchmark
```
def f1():
    return dict(Category.objects.values_list("id", "name"))

def f2():
    return Category.get_category_id_name_map()

assert f1() == f2()

# Reporting
import time
import random
import statistics

functions = f1, f2
times = {f.__name__: [] for f in functions}

for i in range(10000):  # adjust accordingly so whole thing takes a few sec
    func = random.choice(functions)
    t0 = time.time()
    func()
    t1 = time.time()
    times[func.__name__].append(t1 - t0)  # store raw seconds

def ms(s):
    return f"{s * 1000:.3f}ms"

for name, numbers in times.items():
    print("FUNCTION:", name, "Used", len(numbers), "times")
    print("\tMEDIAN", ms(statistics.median(numbers)))
    print("\tMEAN  ", ms(statistics.mean(numbers)))
    print("\tSTDEV ", ms(statistics.stdev(numbers)))
```
The output when you run it:
```
FUNCTION: f1 Used 5016 times
	MEDIAN 92.983ms
	MEAN   97.828ms
	STDEV  19.589ms
FUNCTION: f2 Used 4984 times
	MEDIAN 0.000ms
	MEAN   0.189ms
	STDEV  0.396ms
```
On average, it's about 500x faster to cache the result once and read it from the cache the other 999 times than to run the SQL query every time.
Comments
This looks like it won’t work in multi-thread/process servers as the signal is only sent/received in one thread.
I would have to agree with Stefan. LRU caching is great for single-threaded apps that run for a long time, but in a Django environment the cache would only exist for a short term during request processing. You need to externalise the cache using Redis or similar to reap the benefit here. That brings its own pitfalls, but correctly implemented this works really well with Django.
Django comes with a great caching framework. I have mine set up to use Redis via `django_redis.cache.RedisCache`.
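That kind of setup is only a few lines in `settings.py`. A minimal sketch, where the Redis URL and database number are placeholders:

```
# settings.py -- hypothetical django-redis configuration
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/1",  # adjust to your Redis instance
    }
}
```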
I extended the benchmark to include:
```
from django.core.cache import cache

def f3():
    value = cache.get('all-categories')
    if value is None:
        value = f1()
        cache.set('all-categories', value, timeout=60)
    return value
```
Re-running the benchmark yields a median that is 80% faster than going to PostgreSQL through the ORM.
However, this benchmark was run with Redis AND Postgres both available on localhost, which might not be realistic in a production system (which is where optimizations matter).
Yeah, it's fraught. You need to be careful when you depend on something like `gunicorn wsgi -w 2`, which I actually do for my Django server.
Another solution is using a TTL cache from `cachetools` and setting it to something like 60 seconds just to feel a little safer.
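A minimal sketch of what that could look like, reusing the model from the post; the `cachetools` API is real, but the module-level cache name and the 60-second TTL are just illustrative:

```
from cachetools import TTLCache, cached
from django.db import models

# A single entry that automatically expires 60 seconds after it's written,
# so even without signal-based purging the map is never more than a minute stale.
_category_cache = TTLCache(maxsize=1, ttl=60)

class Category(models.Model):
    name = models.CharField(max_length=100)

    @classmethod
    @cached(_category_cache)
    def get_category_id_name_map(cls):
        return dict(cls.objects.values_list("id", "name"))
```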