Persistent caching with fire-and-forget updates

Wednesday, Dec 14, 2011
4 comments Python, Tornado

I recently landed some patches on toocool that implement an interesting pattern that is seen more and more these days. I call it: Persistent caching with fire-and-forget updates

Basically, the implementation is this: You issue a request that requires information about a Twitter user, e.g. http://toocoolfor.me/following/chucknorris/vs/peterbe. The app looks in its MongoDB for information about the tweeter. If it can't find the user, it goes to the Twitter REST API, looks the user up and saves the result in MongoDB. The next time the same information is requested and the data is available in MongoDB, it instead checks whether the modify_date is more than an hour old and, if so, sends a job to the message queue (Celery with Redis in my case) to perform an update on this tweeter.

You can basically see the code here but just to reiterate and abbreviate, it looks like this:


tweeter = self.db.Tweeter.find_one({'username': username})
if not tweeter:
    # nothing cached yet: look the user up on the Twitter REST API
    result = yield tornado.gen.Task(...)
    if result:
        tweeter = self.save_tweeter_user(result)
    else:
        pass  # deal with the error!
elif age(tweeter['modify_date']) > 3600:
    # cached but more than an hour old: queue an update and carry on
    tasks.refresh_user_info.delay(username, ...)
# render the template!
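
The age() helper isn't shown in the abbreviated snippet. A minimal sketch of what it could look like, assuming modify_date is stored as a naive UTC datetime (the optional now parameter is my addition, there only to make the function easy to test):

```python
import datetime


def age(modify_date, now=None):
    """Return the number of seconds that have passed since modify_date.

    Assumption: modify_date is a naive datetime expressed in UTC.
    """
    if now is None:
        # current UTC time, made naive so it can be compared to modify_date
        now = datetime.datetime.now(datetime.timezone.utc).replace(tzinfo=None)
    return (now - modify_date).total_seconds()
```

With that, age(tweeter['modify_date']) > 3600 reads as "last modified more than an hour ago".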

What the client gets, i.e. the user using the site, is instant results every time apart from the very first time that URL is requested, while the data is still being maintained and refreshed in the background.
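
To make the flow easier to play with, here is a self-contained sketch of the same pattern with no MongoDB, Celery or Twitter involved. Everything in it is a stand-in I made up for illustration: a dict plays the database, a list plays the message queue, fetch_from_api() plays the REST call and run_worker() plays the Celery worker. Timestamps are plain seconds and can be passed in as now.

```python
import time

MAX_AGE = 3600  # seconds; the same one-hour limit as in the post

cache = {}          # stand-in for the MongoDB collection
refresh_queue = []  # stand-in for the Celery/Redis message queue


def fetch_from_api(username):
    """Hypothetical slow lookup, standing in for the Twitter REST API."""
    return {'username': username, 'followers': 0}


def get_tweeter(username, now=None):
    """Return cached info, blocking only when nothing is cached yet."""
    now = time.time() if now is None else now
    tweeter = cache.get(username)
    if tweeter is None:
        # First request ever for this user: pay for the slow lookup once.
        tweeter = fetch_from_api(username)
        tweeter['modify_date'] = now
        cache[username] = tweeter
    elif now - tweeter['modify_date'] > MAX_AGE:
        # Stale: queue a refresh and return the old data right away.
        refresh_queue.append(username)
    return tweeter


def run_worker(now=None):
    """Drain the queue, replacing each queued user's cached record."""
    now = time.time() if now is None else now
    while refresh_queue:
        username = refresh_queue.pop(0)
        fresh = fetch_from_api(username)
        fresh['modify_date'] = now
        cache[username] = fresh
```

Note that this sketch queues a refresh on every stale request, so a popular stale user gets queued many times until the worker catches up. A real setup would probably want to guard against that.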

This pattern works great for data that doesn't have to be up-to-date to the second but that still needs a way to invalidate the cache and re-fetch. This works because my limit of 1 hour is quite arbitrary. An alternative implementation would be something like this:


tweeter = self.db.Tweeter.find_one({'username': username})
if not tweeter or age(tweeter['modify_date']) > 3600 * 24 * 7:
    pass  # re-fetch from Twitter REST API
elif age(tweeter['modify_date']) > 3600:
    pass  # fire-and-forget update
That way you don't suffer from persistently cached data that is too old.
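
The two thresholds boil down to a three-way decision, which can be pulled out into a small function. This is only a sketch; the name cache_action and the returned labels are mine, not something from the toocool code:

```python
HOUR = 3600
WEEK = HOUR * 24 * 7


def cache_action(age_seconds):
    """Decide what to do with a cached record of the given age.

    age_seconds is None when nothing is cached for the user.
    """
    if age_seconds is None or age_seconds > WEEK:
        return 'refetch'   # block and re-fetch from the REST API
    elif age_seconds > HOUR:
        return 'refresh'   # serve cached data, fire-and-forget update
    return 'serve'         # fresh enough, serve as-is
```

Both limits are as arbitrary as the one-hour limit was, so they are easy to tune per kind of data.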

Comments

Shawn Wheatley December 14, 2011

What you describe is really a specific implementation of Memoization - http://en.wikipedia.org/wiki/Memoization. You're right, it's a very powerful design.

Peter February 9, 2012

I'm just testing something here.

Paul Winkler December 14, 2011

Can you explain the use of "yield" in that code, or link to something that explains it? This doesn't look like a generator. I don't think I've seen the value of a yield expression assigned to a local before.

Peter Bengtsson December 14, 2011

It's a Tornado thing. It's awesome because it's an alternative to using callbacks basically. Unlike callbacks which need a new function/method with new parameters and scope, this method just carries on on the next line like any procedural program. I was turned on by Tornado before this was added and now it just makes it even sexier.
