I have a Django ORM model called Category. It's simple and it looks like this:


class Category(models.Model):
    name = models.CharField(max_length=100)

What I occasionally need is all of those collected up in one big dict. Like this:


mapping = {}
for id, name in Category.objects.values_list("id", "name"):
    mapping[id] = name

If you want to be fancy, you can use the dict constructor directly:


mapping = dict(Category.objects.values_list("id", "name"))

...which does the same thing.

Even though it's not strictly necessary, I put a class method on the ORM class so it's all neatly kept together:


class Category(models.Model):
    name = models.CharField(max_length=100)

    @classmethod
    def get_category_id_name_map(cls):
        return dict(cls.objects.values_list("id", "name"))

The Category model doesn't change very often, so it's ripe for caching to avoid the SQL query. You can use functools.lru_cache for memoization.


from functools import lru_cache

class Category(models.Model):
    name = models.CharField(max_length=100)

    @classmethod
    @lru_cache(maxsize=1)
    def get_category_id_name_map(cls):
        return dict(cls.objects.values_list("id", "name"))

Now, within each Python process, calling Category.get_category_id_name_map() caches the result indefinitely, and subsequent calls are pretty much instant. Obviously, this assumes the list of Category objects is reasonably bounded.
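You can see the memoization at work with lru_cache's built-in instrumentation. A minimal sketch, using a plain function and a hard-coded dict as a stand-in for the ORM query (the CALLS counter and the sample data are mine, just for illustration):

```python
from functools import lru_cache

CALLS = 0  # counts how many times the "query" actually runs


@lru_cache(maxsize=1)
def load_mapping():
    # Stand-in for dict(Category.objects.values_list("id", "name"))
    global CALLS
    CALLS += 1
    return {1: "Food", 2: "Drinks"}


load_mapping()
load_mapping()
load_mapping()
print(CALLS)  # 1 -- the underlying "query" only ran once
print(load_mapping.cache_info())  # CacheInfo(hits=2, misses=1, maxsize=1, currsize=1)
```

One thing to keep in mind: every caller gets the exact same dict object back, so nobody should mutate the returned mapping.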

Next, the Category data does sometimes change. To be made aware when it changes, you can use Django signals. To purge that memoization cache, you do this:


from django.db.models.signals import post_delete, post_save
from django.dispatch import receiver

@receiver(post_save, sender=Category)
@receiver(post_delete, sender=Category)
def purge_get_category_id_name_map(sender, instance, **kwargs):
    Category.get_category_id_name_map.cache_clear()
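One gotcha: the module containing those receivers has to be imported at startup, or the @receiver decorators never run. The Django convention is to do that import in your app's AppConfig.ready(). A sketch, assuming a hypothetical app called categories with the receivers in categories/signals.py:

```python
# categories/apps.py  (hypothetical app and module names)
from django.apps import AppConfig


class CategoriesConfig(AppConfig):
    name = "categories"

    def ready(self):
        # Importing the module is what registers the @receiver handlers
        from . import signals  # noqa: F401
```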

Simple benchmark


def f1():
    return dict(Category.objects.values_list("id", "name"))


def f2():
    return Category.get_category_id_name_map()


assert f1() == f2()

# Reporting
import time
import random
import statistics

functions = f1, f2
times = {f.__name__: [] for f in functions}

for i in range(10000):  # adjust accordingly so whole thing takes a few sec
    func = random.choice(functions)
    t0 = time.time()
    func()
    t1 = time.time()
    times[func.__name__].append((t1 - t0) * 1000)


def ms(s):
    return f"{s * 1000:.3f}ms"


for name, numbers in times.items():
    print("FUNCTION:", name, "Used", len(numbers), "times")
    print("\tMEDIAN", ms(statistics.median(numbers)))
    print("\tMEAN  ", ms(statistics.mean(numbers)))
    print("\tSTDEV ", ms(statistics.stdev(numbers)))

The output when you run it:


FUNCTION: f1 Used 5016 times
    MEDIAN 92.983ms
    MEAN   97.828ms
    STDEV  19.589ms
FUNCTION: f2 Used 4984 times
    MEDIAN 0.000ms
    MEAN   0.189ms
    STDEV  0.396ms

On average, it's roughly 500x faster to cache the result once and retrieve it from memory than to run the SQL query every time.

Comments

Stefan

This looks like it won’t work in multi-thread/process servers as the signal is only sent/received in one thread.

David L Nugent

I would have to agree with Stefan. LRU is great for single-threaded apps that run for a long time, but they would exist only for a short term during request processing within a Django environment. You need to externalise the cache using redis or similar to reap the benefit here. That brings its own pitfalls, but correctly implemented this works really well with Django.

Peter Bengtsson

Django comes with a great caching framework. I have mine set up to use Redis via `django_redis.cache.RedisCache`.
I extended the benchmark to include:

```
from django.core.cache import cache

def f3():
    value = cache.get('all-categories')
    if value is None:
        value = f1()
        cache.set('all-categories', value, timeout=60)
    return value
```

Re-running the benchmark yields a median that is 80% faster than the PostgreSQL ORM.
However, this benchmark was made where Redis AND Postgres are both available on localhost, which might not be realistic in a production system (which is where optimizations matter).

Peter Bengtsson

Yeah, it's fraught. You need to be careful when you depend on something like `gunicorn wsgi -w 2`, which I actually do for my Django server.

Another solution is using a TTL cache from `cachetools` and setting it to something like 60 seconds just to feel a little safer.
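If you'd rather not add the `cachetools` dependency, you can get a similar TTL effect with just the standard library by keying `lru_cache` on a coarse time bucket. A sketch under my own assumptions (the `ttl_hash` parameter, the 60-second window, and the stand-in data are not from the post):

```python
import time
from functools import lru_cache

TTL_SECONDS = 60


def _ttl_hash():
    # Changes value once per TTL window, which forces a cache miss afterwards
    return int(time.time() // TTL_SECONDS)


@lru_cache(maxsize=2)
def _load_mapping(ttl_hash):
    # ttl_hash is deliberately unused in the body; it only partitions
    # the cache by time window.
    # Stand-in for dict(Category.objects.values_list("id", "name"))
    return {1: "Food", 2: "Drinks"}


def get_category_id_name_map():
    return _load_mapping(_ttl_hash())
```

Within a window the cached dict is reused; once the bucket rolls over, the next call recomputes, so stale data lives for at most TTL_SECONDS.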
