A blog and website by Peter Bengtsson


Following on from yesterday's blog about How and why to use django-mongokit, I extended the exampleproject inside the django-mongokit project with another app called exampleapp_sql, which does the same thing as exampleapp but with SQL instead. Then I added a very simple benchmarker app to the same project and wrote three functions:

  1. One to create 10/100/500/1000 instances of my class
  2. One to edit one field of all 10/100/500/1000 instances
  3. One to delete each of the 10/100/500/1000 instances
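
All three functions have the same shape: do the work for N instances, one at a time, and report the elapsed time. Here is a minimal sketch of that shape, using a plain in-memory list as a stand-in for the database. The names `timed`, `create_talks`, `edit_talks`, `delete_talks` and `run_benchmark` are made up for this illustration; this is not the benchmarker app's actual code.

```python
import time


def timed(func, *args):
    """Run func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def create_talks(store, how_many):
    # stand-in for creating and saving one instance at a time
    for i in range(how_many):
        store.append({'topic': 'Talk %d' % i})
    return len(store)


def edit_talks(store):
    # change one field on every instance
    for talk in store:
        talk['topic'] += ' (edited)'
    return len(store)


def delete_talks(store):
    # delete the instances one by one rather than in bulk
    deleted = 0
    while store:
        store.pop()
        deleted += 1
    return deleted


def run_benchmark(how_many):
    store = []
    timings = {}
    for label, func, args in (
        ('Creating', create_talks, (store, how_many)),
        ('Editing', edit_talks, (store,)),
        ('Deleting', delete_talks, (store,)),
    ):
        count, seconds = timed(func, *args)
        timings[label] = seconds
        print('%s %d talks took %s seconds' % (label, count, seconds))
    print('IN TOTAL %s seconds' % sum(timings.values()))
    return timings
```

Calling `run_benchmark(10)`, `run_benchmark(100)` and so on prints output in the same format as the results in this post.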

Speed test between django_mongokit and postgresql_psycopg2

The results can speak for themselves:

# 10
mongokit django_mongokit.mongodb
Creating 10 talks took 0.0108649730682 seconds
Editing 10 talks took 0.0238521099091 seconds
Deleting 10 talks took 0.0241661071777 seconds
IN TOTAL 0.058883190155 seconds

sql django.db.backends.postgresql_psycopg2
Creating 10 talks took 0.0994439125061 seconds
Editing 10 talks took 0.088721036911 seconds
Deleting 10 talks took 0.0888710021973 seconds
IN TOTAL 0.277035951614 seconds

# 100
mongokit django_mongokit.mongodb
Creating 100 talks took 0.114995002747 seconds
Editing 100 talks took 0.181537866592 seconds
Deleting 100 talks took 0.13414812088 seconds
IN TOTAL 0.430680990219 seconds

sql django.db.backends.postgresql_psycopg2
Creating 100 talks took 0.856637954712 seconds
Editing 100 talks took 1.16229200363 seconds
Deleting 100 talks took 0.879518032074 seconds
IN TOTAL 2.89844799042 seconds

# 500
mongokit django_mongokit.mongodb
Creating 500 talks took 0.505300998688 seconds
Editing 500 talks took 0.809900999069 seconds
Deleting 500 talks took 0.65673494339 seconds
IN TOTAL 1.97193694115 seconds

sql django.db.backends.postgresql_psycopg2
Creating 500 talks took 4.4399368763 seconds
Editing 500 talks took 5.72280597687 seconds
Deleting 500 talks took 4.34039878845 seconds
IN TOTAL 14.5031416416 seconds

# 1000
mongokit django_mongokit.mongodb
Creating 1000 talks took 0.957674026489 seconds
Editing 1000 talks took 1.60552191734 seconds
Deleting 1000 talks took 1.28869891167 seconds
IN TOTAL 3.8518948555 seconds

sql django.db.backends.postgresql_psycopg2
Creating 1000 talks took 8.57405209541 seconds
Editing 1000 talks took 14.8357069492 seconds
Deleting 1000 talks took 11.9729249477 seconds
IN TOTAL 35.3826839924 seconds

On average, MongoDB is 7 times faster.
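
One way to arrive at that figure, assuming "on average" means the mean of the per-size ratios between the IN TOTAL times (that reading is an assumption):

```python
# (mongokit total, sql total) in seconds, copied from the runs above
totals = {
    10: (0.058883190155, 0.277035951614),
    100: (0.430680990219, 2.89844799042),
    500: (1.97193694115, 14.5031416416),
    1000: (3.8518948555, 35.3826839924),
}

# how many times longer the SQL run took for each batch size
ratios = {n: sql / mongo for n, (mongo, sql) in totals.items()}
average = sum(ratios.values()) / len(ratios)

for n in sorted(ratios):
    print('%5d talks: %.1f times faster' % (n, ratios[n]))
print('average: %.1f times faster' % average)
```

The ratio is not constant: it grows from just under 5 with 10 talks to just over 9 with 1000 talks.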

All in all it doesn't really mean that much. We expect MongoDB to be faster than PostgreSQL because what it lacks in features it makes up for in speed. It's interesting to see it in action and nice to see that MongoKit is fast enough to benefit from the database's speed.

As always with benchmarks: Lies, lies and more damn lies! This doesn't really compare apples with apples but hopefully with django-mongokit the comparison is becoming fairer. Also, you're free to fork the project on GitHub, do your own optimizations and re-run the tests yourself.

How and why to use django-mongokit

Here I'm going to explain how to combine Django and MongoDB using MongoKit and django-mongokit.

MongoDB is a document store built for high speed and high concurrency with a very good redundancy story. It's an alternative to the relational databases (e.g. MySQL) that Django is tightly coupled with in its ORM (Object Relational Mapping). The equivalent for a document store is called an ODM (Object Document Mapping), for lack of a better acronym. That's where MongoKit comes in. It's written in Python, connects to MongoDB using a library called pymongo and turns the data from MongoDB into instances of classes you have defined. MongoKit has nothing to do with Django. That's where django-mongokit comes in. Written by yours truly.

So we start by defining a MongoKit subclass:

import datetime
from mongokit import Document

class Computer(Document):

    structure = {
      'make': unicode,
      'model': unicode,
      'purchase_date': datetime.datetime,
      'cpu_ghz': float,
    }

    validators = {
      'cpu_ghz': lambda x: x > 0,
      'make': lambda x: x.strip(),
    }

    default_values = {
      'purchase_date': datetime.datetime.utcnow,
    }

    use_dot_notation = True

    indexes = [
      {'fields': ['make']},
    ]

All of these class attributes are features of MongoKit. Their names are so obvious that they need no explanation, except perhaps use_dot_notation: it makes it possible to access data in the structure with a dot on the instance rather than the normal dictionary lookup. Now let's work with this class in the shell. Important: to actually try this you need MongoDB and pymongo installed, and MongoDB up and running:

>>> from mongokit import Connection
>>> conn = Connection()
>>> from mymodels import Computer
>>> conn.register([Computer])
>>> database = conn.mydb # will be created if it didn't exist
>>> collection = database.mycollection # equivalent of a SQL table
>>> instance = collection.Computer()
>>> instance.make = u"Apple"
>>> instance.model = u"G5"
>>> instance.cpu_hrz = 2.66
>>> type(instance)
<class 'mymodels.Computer'>
>>> instance
{'model': u'G5', 'make': u'Apple', '_id':
ObjectId('4b9244989d40b334b4000000'), 'cpu_ghz': None,
'purchase_date': datetime.datetime(2010, 3, 6, 12, 3, 8, 281905)}
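
To get a feel for what dot notation means, here is a toy dictionary subclass that allows both styles of access. It only illustrates the idea and is not how MongoKit implements use_dot_notation; the class name `DotDict` is made up.

```python
class DotDict(dict):
    """Toy dict that also allows attribute access to its keys."""

    def __getattr__(self, name):
        # only called when normal attribute lookup fails
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value


computer = DotDict(make=None, model=None)
computer.make = 'Apple'        # same as computer['make'] = 'Apple'
computer['model'] = 'G5'       # plain dictionary access still works
print(computer.make, computer.model)
```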

As you can see it's pretty easy to work with and it just feels so pythonic and obvious. What you get is something that works just like a normal base class with some extra sugar, plus the fact that it can save the data persistently and does so efficiently and redundantly (assuming you do some work on your MongoDB to set it up with replication and/or sharding). Now let's look at retrieval which, as per the design principles of MongoKit, follows the basic interface of pymongo. To learn about querying you can skim the MongoKit documentation, but really the thing to read is the pymongo documentation, which MongoKit layers thinly:

>>> from mongokit import Connection
>>> conn = Connection()
>>> from mymodels import Computer
>>> conn.register([Computer])
>>> database = conn.mydb
>>> collection = database.mycollection
>>> instances = collection.Computer.find()
>>> type(instances)
<class 'mongokit.generators.MongoDocumentCursor'>
>>> list(instances)[0]
{u'cpu_ghz': None, u'model': u'G5', u'_id':
ObjectId('4b9244989d40b334b4000000'), u'purchase_date':
datetime.datetime(2010, 3, 6, 12, 3, 8, 281000), u'make': u'Apple'}
>>> instances = collection.Computer.find().count()
>>> collection.Computer.one() == list(collection.Computer.find())[0]

The query methods one() and find() can take search parameters that limit what you get back. They are quite similar to the objects.get() and objects.filter() methods on Django's default Manager, which should make you feel at home.
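
To show what "search parameters" means without needing a running MongoDB, here is a toy in-memory stand-in that matches documents on exact field values. It is not pymongo or MongoKit code; the functions `find` and `one` below are plain Python written for this illustration, and error handling for multiple matches is left out.

```python
def find(documents, spec=None):
    """Yield every document whose fields equal all the values in spec."""
    spec = spec or {}
    for doc in documents:
        if all(doc.get(key) == value for key, value in spec.items()):
            yield doc


def one(documents, spec=None):
    """Return the first match, or None if nothing matches."""
    return next(find(documents, spec), None)


computers = [
    {'make': 'Apple', 'model': 'G5'},
    {'make': 'Apple', 'model': 'MacBook'},
    {'make': 'Dell', 'model': 'XPS'},
]

# comparable in spirit to objects.filter(make='Apple')
apples = list(find(computers, {'make': 'Apple'}))
# comparable in spirit to objects.get(model='G5')
g5 = one(computers, {'model': 'G5'})
```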

So, what would it take to be able to do this MongoKit business in a running Django so that you can write Django views and templates that interface with your Mongo "documents"? Answer: use django-mongokit. django-mongokit is a thin wrapper around MongoKit that makes it just slightly more convenient to use MongoKit in a Django environment. The primary tasks django-mongokit takes care of are: (1) the connection and (2) giving your classes a _meta class attribute. Especially important regarding the connection is that django-mongokit takes care of setting up and destroying a test database for you when running your tests. And since it's all in one place you don't have to worry about creating various connections to MongoDB in your views or management commands. Let's first define the database in your settings.py file:

DATABASES = {
    'default': {
        'ENGINE': 'sqlite3',
        'NAME': 'example-sqlite3.db',
    },
    'mongodb': {
        'ENGINE': 'django_mongokit.mongodb',
        'NAME': 'mydb',
    },
}

Then, with that in place all you need to get a connection are these lines:

>>> from django_mongokit import get_database
>>> database = get_database()

The reason it's a function and not an instance is that the database is going to be different depending on whether you're running tests or running in production/development mode. Had we imported a database instance instead of a function that returns one, the code would need to know which database you want at the time the Python files are imported, which happens before we even know what you're doing with the imported code. django-mongokit also gives you the connection instance, which you'll need to register your own models:

>>> from django_mongokit import connection
>>> connection.register([Computer])
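
The argument for exposing get_database() as a function rather than a ready-made instance can be sketched without Django. In this illustration the decision about which database to use is made when the function is called, not when the module is imported. The helper names and the 'test_' prefix are assumptions made for the example, not django-mongokit's actual code.

```python
import sys

DATABASE_NAME = 'mydb'


def running_tests():
    # hypothetical check; a real project would ask its test runner instead
    return 'test' in sys.argv


def get_database_name(testing=None):
    """Decide which database to use when called, not when imported."""
    if testing is None:
        testing = running_tests()
    return 'test_' + DATABASE_NAME if testing else DATABASE_NAME
```

A module-level `database = ...` assignment would have to make the same decision once, at import time, and every later caller would be stuck with that answer.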

But I recommend, as a best practice, always registering your models right after you have defined them. This brings us to the DjangoDocument class, so let's get straight into it, this time in the models.py file inside a Django app you've just created:

import datetime
from django_mongokit import connection
from django_mongokit.document import DjangoDocument

class Computer(DjangoDocument): # notice difference from above
    class Meta:
        verbose_name_plural = "Computerz"

    structure = {
      'make': unicode,
      'model': unicode,
      'purchase_date': datetime.datetime,
      'cpu_ghz': float,
    }

    validators = {
      'cpu_ghz': lambda x: x > 0,
      'make': lambda x: x.strip(),
    }

    default_values = {
      'purchase_date': datetime.datetime.utcnow,
    }

    use_dot_notation = True

    indexes = [
      {'fields': ['make']},
    ]

# register the model right after defining it, as recommended above
connection.register([Computer])


That's all you need to get on with your code. The DjangoDocument class offers a few more gems that make your life easier, such as handling signals and registering itself in a global variable (import django_mongokit.document.model_names and inspect). See the django-mongokit README file for more information.

So, what's so great about this setup? It comes down to personal taste, but for me it's the simplicity and purity. I like the thin layer MongoKit adds on top of pure pymongo, which becomes oh so practical: it helps you make sure you only store what you said you would, and it's easier to work with class instances whose definition you can see than with dictionaries and lists.

And here's one of MongoKit's best selling points for me: the few times you need speed, speed and more speed it's possible to go straight to the source without doing any wrapping. This is the equivalent of how you sometimes run raw SQL queries in Django which, let's be honest, does happen quite frequently once the project becomes non-trivial. Django's ORM has the ability to turn the output of a raw SQL query into objects, and with MongoKit, when you go straight into MongoDB, you get pure Python dictionaries which you can use to create instances. Here's an example where you can't query for what you're looking for, so you might be trawling through thousands of documents:

>>> from some.thirdparty import my_kind_of_cpu
>>> computers = []
>>> for item in collection.find():
...     # can't use dot notation when it's not a document
...     cpu = item['cpu_ghz']
...     if my_kind_of_cpu(cpu):
...         computers.append(collection.Computer(item))

A use case for this is when you want to store different types of documents in the same collection and, based on a value extracted from a raw query, only turn a selected few results into mapped instances. More about that in a later post maybe.

If you, like me, have various projects that do things like OAuth on Twitter or Google, or a development site that goes to PayPal, you'll know this problem. You're doing some Django development on http://localhost:8000/foo and click, for example, to do an OAuth on Twitter with an app you have there. Twitter then redirects you back to the live site with which you've set it up. But you're doing local development so you want to go back to http://localhost:8000/... instead.

Add this bookmarklet, "to localhost:8000", to your browser's Bookmarks toolbar and it does exactly that.

Here's its code in more verbose form:

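(function() {
   var a = function(){
     // assumption: the replacement host is taken from the bookmarklet's name
     location.href = window.location.href.replace(/http:\/\/[^\/]+\//,
                                                  'http://localhost:8000/');
   };
   // assumption: both branches just call a(); any Firefox-specific handling is not shown
   if (/Firefox/.test(navigator.userAgent)) { a(); } else { a(); }
})();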
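
The heart of the bookmarklet is the regular expression that swaps the scheme and host for the local development server while keeping the path and query string. Here is the same substitution sketched in Python so it can be tried outside a browser. The function name `to_localhost` and the example URL are made up, and the replacement host is assumed from the bookmarklet's name.

```python
import re


def to_localhost(url, local='http://localhost:8000/'):
    """Replace the scheme and host of an http URL, keeping the path."""
    # count=1 mirrors JavaScript's replace() with a non-global regex
    return re.sub(r'http://[^/]+/', local, url, count=1)


print(to_localhost('http://www.example.com/oauth/callback?token=abc'))
```

Note that the pattern only matches http:// URLs that have a slash after the host, so an https:// address is returned unchanged.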

I have a Django model that looks something like this:

class MyModel(models.Model):
   modify_date = models.DateTimeField(auto_now=True)

I now wanted to retroactively add a field called add_date which uses the auto_now_add=True trick. The migration tool used in this project is South, which is great but doesn't work very well with auto_now_add=True because the field doesn't have a straightforward default. So, first I changed the model to this:

class MyModel(models.Model):
   modify_date = models.DateTimeField(auto_now=True)
   add_date = models.DateTimeField(auto_now_add=True, null=True)

Notice the null=True, which is important. Then I used startmigration to generate the code for the forwards and backwards migration, to which I added this stuff:

from south.db import db
from myapp.models import MyModel

class Migration:

   def forwards(self, orm):

       db.add_column('myapp_mymodel', 'add_date', orm['myapp.mymodel:add_date'])
       for each in MyModel.objects.all():
           # since MyModel is referenced elsewhere I can work out the oldest date
           # (assumption: any further arguments to this helper are not shown)
           oldest_date = get_oldest_related_date(each)
           each.add_date = oldest_date
           each.save()

That way all old records will have the date (not entirely accurate but good enough) and all new records will automatically get a date. Is there a better way? I bet, but I don't know how to do it.

Those Crazy Chinese

My friend Chris West has built a great new site called Those Crazy Chinese which describes itself like this:

"Chinese (Mandarin) is a beautiful and highly literal language - directly translated, many words have entertaining and occasionally logical meanings."

It's built in Django and it integrates with Twitter, so if you're on Twitter just follow it there to get the latest additions of interesting and amusing literal translations. The website is mainly geared towards people who are, like Chris, learning Mandarin.

I'm writing this down in case it bites someone else like it bit me and chewed off many, many minutes of debugging time.

If you ever get weird columns in your Django Administration interface, I now know why that happens. See this screenshot example:

Messed up columns in Django Admin

This happens when you've defined a TEMPLATE_STRING_IF_INVALID in your settings. I always define one in mine so that I can quickly see which variable references in template code are potential typos. I'm not a big fan of the implicit magic of equating absence to False/None so I try to avoid the confusion altogether.

The current project I'm working on has, at the time of writing, 20 different forms (90% model forms) instantiated in different scenarios. Django doesn't automatically strip whitespace in text based fields. Rather than doing this for every form:

class ContactMarketingForm(forms.ModelForm):
   class Meta:
       model = ContactMarketing
       exclude = ('contact',)

   def clean_notes(self):
       return self.cleaned_data['notes'].strip()

   def clean_name(self):
       return self.cleaned_data['name'].strip()

I wrote a common class for all of my form classes to use:

class _BaseForm(object):
   def clean(self):
       for field in self.cleaned_data:
           if isinstance(self.cleaned_data[field], basestring):
               self.cleaned_data[field] = self.cleaned_data[field].strip()
       return self.cleaned_data

class BaseModelForm(_BaseForm, forms.ModelForm):
   pass

class ContactMarketingForm(BaseModelForm):
   class Meta:
       model = ContactMarketing
       exclude = ('contact',)

Now all text inputs and textareas are automatically whitespace stripped. Perhaps useful for other Djangonauts.
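
The stripping loop itself can be tried without Django. Here is a minimal sketch that runs the same logic over a plain dictionary; the function name `strip_strings` and the sample data are made up, and it uses str where the Python 2 code above uses basestring.

```python
def strip_strings(cleaned_data):
    """Strip leading/trailing whitespace from every string value."""
    for field, value in cleaned_data.items():
        if isinstance(value, str):
            cleaned_data[field] = value.strip()
    return cleaned_data


data = {'name': '  Peter ', 'notes': '\tcall back\n', 'age': 30, 'email': None}
print(strip_strings(data))
```

Values that aren't strings, such as numbers or None, are left untouched, which is why the isinstance check matters.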

London Frock Exchange launched

Today we launched The London Frock Exchange which is a joint project between Fry-IT, Charlotte Davies and Sarah Caverhill.

Elevator sales pitch: Unlike other clothes swapping sites, with Charlotte, Sarah and Rani as an expert hub in the middle you don't swap straight across; instead you swap one frock in and can choose a frock (of equal value) from the pool of frocks.

Fry-IT is co-founding this venture and hopes it'll make us billionaires by the end of the year (they take a small admin fee of £25 for sending you a frock back, but sending one in is free with freepost). It's been great fun to work on it over the last couple of months as it means we (Fry-IT is an all-male, highly technical company) have had to learn about sizes and body shapes and try to learn how a female web audience thinks. The ladies have done a great job of seeding it with lots and lots of frocks, all of which you can wear in a matter of days if you just swap one of equal value in first. Enjoy!

Here I'm going to explain a solution I had to make for a site I recently launched. Basically, I wanted to cache the whole page in memcache and set the appropriate Expires and Cache-Control headers so that my view is only rendered once an hour, even though parts of the page need to be unique to each user (e.g. "Hi, logged in as xxxx").

The advantages are great: the page loads fast, the content is stored in memcache every hour, and the page still appears to be dynamic.

The disadvantages are not so great: the AJAX loads fast but causes a flicker.

Basically, I wrote a custom decorator called custom_cache_page(<delay in seconds>) that works like the normal cache_page(<delay in seconds>) decorator available in stock Django. However, my decorator inserts a piece of HTML into the rendered HTML (before it's stored in memcache) that I later use to update certain elements of the page with AJAX instead of on the server side. Enough talking, let's look at the code:

from django.conf import settings
from django.utils.decorators import decorator_from_middleware
from django.middleware.cache import CacheMiddleware
class CustomCacheMiddleware(CacheMiddleware):
   def __init__(self, cache_delay=0, *args, **kwargs):
       super(CustomCacheMiddleware, self).__init__(*args, **kwargs)
       self.cache_delay = cache_delay

   def process_response(self, request, response):
       if self.cache_delay:
           extra_js = '<script type="text/javascript">var '\
                      'CACHE_CONTROL=%s;</script>' %\
                      self.cache_delay
           response.content = response.content.replace(u'</body>',
                                             u'%s\n</body>' % extra_js)

       return super(CustomCacheMiddleware, self
                   ).process_response(request, response)

custom_cache_page = decorator_from_middleware(CustomCacheMiddleware)

if settings.DEBUG:
   def custom_cache_page(delay):
       def rendered(view):
           def inner(request, *args, **kwargs):
               return view(request, *args, **kwargs)
           return inner
       return rendered

@custom_cache_page(60 * 60 * 1) # 1 hours
def my_expensive_but_cacheable_view(request):
   # only run max. once every hour
   pass  # build and return the response here
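
The HTML injection done in process_response above can be tried on its own, outside Django. This is a minimal sketch of just that step, operating on a plain string; the function name `inject_cache_flag` and the sample page are made up for the illustration.

```python
def inject_cache_flag(html, cache_delay):
    """Insert a CACHE_CONTROL script just before the closing body tag."""
    if not cache_delay:
        return html  # nothing to flag when the page is not cached
    extra_js = ('<script type="text/javascript">var '
                'CACHE_CONTROL=%s;</script>' % cache_delay)
    return html.replace('</body>', '%s\n</body>' % extra_js)


page = '<html><body><p>Hi</p></body></html>'
print(inject_cache_flag(page, 60 * 60))
```

Because the flag is written into the HTML before it is stored, every cached copy of the page carries it, and the JavaScript on the page can tell that it is looking at a cached response.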

What you now get is that every page that is server-side cached in memcache (and has its Expires and Cache-Control headers set) gets a little piece of JavaScript code inserted into the rendered HTML. Now, all I need to do is write some jQuery code that loads the navigation menu dynamically by instigating it from an AJAX request. Your mileage here might vary (put this in your base.html or whatever you call it):

$(function() {
  if (typeof CACHE_CONTROL != "undefined" && CACHE_CONTROL) {
     // the page is cached, need to use AJAX to load what should be dynamic
  }
});

I usually prefix all my views with an underscore when they only return a limited chunk of HTML rather than a whole HTML document.


98% of my pages are deliberately not cached because they either aren't expensive to render or really can't be cached because of logged in users or whatnot.

This solution has the added benefit of being totally contained in Django, so it doesn't require any work on the nginx/apache/lighttpd front end but if you really want some speed (e.g. getting nginx talking directly to memcache) you can still use the above solution to get what you want.

My little decorator can admittedly be improved. You might, for example, add a dynamic check for whether the page should not be cached. Or perhaps some other trick for invalidating the cache if you really need to. Or perhaps some other tricks can be made to automate the JavaScript AJAX loading.

Calling all kung fu people

Tonight we're launching our new Kung Fu website.

My friend Chris and I have been busy building a website where people who do kung fu can put themselves on a map to say where they train kung fu, what style they do and what kung fu club they belong to. The site is very much centred on having a world map and each little pin on the map is one kung fu martial artist.

This site is built in Django and is based on the work that was done to build Django People, originally developed by Simon Willison. We took his original code and revamped it almost completely.

Our goal is to slowly build up a world map of people from all sorts of clubs and styles and hopefully one day become the best place on the Internet for understanding what clubs are available where and what styles different people do. The site has been in an "alpha testing" phase now for a couple of weeks and even though we still have lots of ideas and cool features to add we believe it's ready to go live.

So if you train kung fu, or know someone who trains kung fu, go to our website and add yourself to the map.