A blog and website by Peter Bengtsson

Filtered home page! Currently only showing blog entries under the category: Django. Clear filter

The idea with template context processors in Django is to inject some defaults thing to be available when rendering a template that is rendered with a request.

I.e. instead of...:

def view1(request):
    context = {
        'name': 'View 1', 
        'on_dev_server': request.get_host() in settings.DEV_HOSTNAMES
    return render(request, 'view1.html', context)

def view2(request):
    context = {
        'name': 'View 2', 
        'other': 'things', 
        'on_dev_server': request.get_host() in settings.DEV_HOSTNAMES
    return render(request, 'view2.html', context)

And in your nominal templates/base.html you might have something like this:

  <p>&copy; You 2015</p>
  {% if on_dev_server %}
    <p color="red">Note! We're currently on a dev server!</p>
  {% endif %}

Instead you do this trick; in your you write down the list of defaults plus the one you want to always have available:


And to accompany that you define your myprojects/myapp/ like so:

def debug_info(request):
    return {
        'on_dev_server': request.get_host() in settings.DEV_HOSTNAMES,

So far so good.

However, there's a problem with this. Two problems in fact.

First problem is that when all the templates in your big complicated website renders, it's quite possible that some pages don't need everything you set up in your context processors. That might mean a heck of a lot of extra computation when it won't ever be displayed.

For example, I have a project where most pages have a sidebar where I show "Trending Events" which is something I compute in a function called def sidebar_events(request):. But the sidebar is not always shown and on the pages where it's not shown it's a waste to compute the stuff that sidebar_events computes. Also, I have management pages which uses a totally different base.html template. So there's a big chance you're wasting precious CPU.

Another problem is that of code-readability (aka. how frustrating is this to debug for someone else or yourself after months of idle activity). If you're skimming through your base.html and you see this "random" variable called on_dev_server it's very very hard to tell where the heck that's defined. Hopefully grepping the whole source code is a way to go. A much better way to solve that problem would be sensible namespace naming.

And also, by being too liberal with globally scoped variables there's a chance you might clash from a different piece of functionality that uses the same variable names. That chance is smaller when you use namespaces.

So, to remedy this, let your template context processor functions return closures. It wraps the request automagically.

Let's rewrite our trivial example from above, the should now look like this:

def debug_info(request):
    def inner():
        return {
            'on_dev_server': request.get_host() in settings.DEV_HOSTNAMES,
    return {'debug_info': inner}

Now executing that becomes more optional and more deliberate in the template instead. E.g.

  <p>&copy; You 2015</p>
  {% set debug_info = debug_info() %}
  {% if debug_info['on_dev_server'] %}
    <p color="red">Note! We're currently on a dev server!</p>
  {% endif %}

This makes it more explicity which is a good thing. It also has the potential to be avoided if the stuff in there isn't needed in some templates.

In airmozilla the tests almost all derive from one base class whose tearDown deletes the automatically generated settings.MEDIA_ROOT directory and everything in it.

Then there's some code that makes sure a certain thing from the fixtures has a picture uploaded to it.

That means it has do that shutil.rmtree(directory) and that shutil.copy(src, dst) on almost every single test. Some might also not need or depend on it but it's conveninent to put it here.

Anyway, I thought this is all a bit excessive and I could probably optimize that by defining a custom test runner that is first responsible for creating a clean settings.MEDIA_ROOT with the necessary file in it and secondly, when the test suite ends, it deletes the directory.

But before I write that, let's measure how many gazillion milliseconds this is chewing up.

Basically, the tearDown was called 361 times and the _upload_media 281 times. In total, this adds to a whopping total of 0.21 seconds! (of the total of 69.133 seconds it takes to run the whole thing).

I think I'll cancel that optimization idea. Doing some light shutil operations are dirt cheap.

So recently, I moved home for this blog. It used to be on AWS EC2 and is now on Digital Ocean. I wanted to start from scratch so I started on a blank new Ubuntu 14.04 and later rsync'ed over all the data bit by bit (no pun intended).

When I moved this site I copied the /etc/uwsgi/apps-enabled/peterbecom.ini file and started it with /etc/init.d/uwsgi start peterbecom. The settings were the same as before:

# this is /etc/uwsgi/apps-enabled/peterbecom.ini
virtualenv = /var/lib/django/django-peterbecom/venv
pythonpath = /var/lib/django/django-peterbecom
user = django
master = true
processes = 3
env = DJANGO_SETTINGS_MODULE=peterbecom.settings
module = django_wsgi2:application

But I kept getting this error:

Traceback (most recent call last):
  File "/var/lib/django/django-peterbecom/venv/local/lib/python2.7/site-packages/django/db/backends/postgresql_psycopg2/", line 182, in _cursor
    self.connection = Database.connect(**conn_params)
  File "/var/lib/django/django-peterbecom/venv/local/lib/python2.7/site-packages/psycopg2/", line 164, in connect
    conn = _connect(dsn, connection_factory=connection_factory, async=async)
psycopg2.OperationalError: FATAL:  Peer authentication failed for user "django"

What the heck! I thought. I was able to connect perfectly fine with the same config on the old server and here on the new server I was able to do this:

django@peterbecom:~/django-peterbecom$ source venv/bin/activate
(venv)django@peterbecom:~/django-peterbecom$ ./ shell
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from peterbecom.apps.plog.models import *
>>> BlogItem.objects.all().count()

Clearly I've set the right password in the settings/ file. In fact, I haven't changed anything and I pg_dump'ed the data over from the old server as is.

I edit edited the file psycopg2/ and added a print "DSN=", dsn and those details were indeed correct.
I'm running the uwsgi app as user django and I'm connecting to Postgres as user django.

Anyway, what I needed to do to make it work was the following change:

# this is /etc/uwsgi/apps-enabled/peterbecom.ini
virtualenv = /var/lib/django/django-peterbecom/venv
pythonpath = /var/lib/django/django-peterbecom
user = django
uid = django   # THIS IS ADDED
master = true
processes = 3
env = DJANGO_SETTINGS_MODULE=peterbecom.settings
module = django_wsgi2:application

The difference here is the added uid = django.

I guess by moving across (I'm currently on uwsgi I get a newer version of uwsgi or something that simply can't just take the user directive but needs the uid directive too. That or something else complicated to do with the users and permissions that I don't understand.

Hopefully, by having blogged about this other people might find it and get themselves a little productivity boost.

If you do things with the Django ORM and want an audit trails of all changes you have two options:

  1. Insert some cleverness into a pre_save signal that writes down all changes some way.

  2. Use eventlog and manually log things in your views.

(you have other options too but I'm trying to make a point here)

eventlog is almost embarrassingly simple. It's basically just a model with three fields:

  • User
  • An action string
  • A JSON dump field

You use it like this:

from eventlog.models import log

def someview(request):
    if request.method == 'POST':
        form = SomeModelForm(request.POST)
        if form.is_valid():
            new_thing =
            log(request.user, 'mymodel.create', {
                # You can put anything JSON 
                # compatible in here
            return redirect('someotherview')
        form = SomeModelForm()
    return render(request, 'view.html', {'form': form})

That's all it does. You then have to do something with it. Suppose you have an admin page that only privileged users can see. You can make a simple table/dashboard with these like this:

from eventlog.models import Log  # Log the model, not log the function

def all_events(request):
    all = Log.objects.all()
    return render(request, 'all_events.html', {'all': all})

And something like this to to all_events.html:

  {% for event in all %}
    <td>{{ event.user.username }}</td>
    <td>{{ event.timestamp | date:"D d M Y" }}</td>
    <td>{{ event.action }}</td>
    <td>{{ event.extra }}</td>
  {% endfor %}

What I like about it is that it's very deliberate. By putting it into views at very specific points you're making it an audit log of actions, not of data changes.

Projects with overly complex model save signals tend to dig themselves into holes that make things slow and complicated. And it's not unrealistic that you'll then record events that aren't particularly important to review. For example, a cron job that increments a little value or something. It's more interesting to see what humans have done.

I just wanted to thank the Eldarion guys for eventlog. It's beautifully simple and works perfectly for me.

In action
A couple of weeks ago we had accidentally broken our production server (for a particular report) because of broken HTML. It was an unclosed tag which rendered everything after that tag to just plain white. Our comprehensive test suite failed to notice it because it didn't look at details like that. And when it was tested manually we simply missed the conditional situation when it was caused. Neither good excuses. So it got me thinking how can we incorporate HTML (html5 in particular) validation into our test suite.

So I wrote a little gist and used it a bit on a couple of projects and was quite pleased with the results. But I thought this might be something worthwhile to keep around for future projects or for other people who can't just copy-n-paste a gist.

With that in mind I put together a little package with a README and a and now you can use it too.

There are however some caveats. Especially if you intend to run it as part of your test suite.

Caveat number 1

You can't flood Well, you can I guess. It would be really evil of you and kittens will die. If you have a test suite that does things like response = self.client.get(reverse('myapp:myview')) and there are many tests you might be causing an obscene amount of HTTP traffic to them. Which brings us on to...

Caveat number 2

The site is written in Java and it's open source. You can basically download their validator and point django-html-validator to it locally. Basically the way it works is java -jar vnu.jar myfile.html. However, it's slow. Like really slow. It takes about 2 seconds to run just one modest HTML file. So, you need to be patient.

I built something. It's called Wish List Granted.

It's a mash-up using's Wish List functionality. What you do is you hook up your Amazon wish list onto and pick one item. Then you share that page with friends and familiy and they can then contribute a small amount each. When the full amount is reached, Wish List Granted will purchase the item and send it to you.

The Rules page has more details if you're interested.

The problem it tries to solve is that you have friends would want something and even if it's a good friend you might be hesitant to spend $50 on a gift to them. I'm sure you can afford it but if you have many friends it gets unpractical. However, spending $5 is another matter. Hopefully Wish List Granted solves that problem.

Wish List Granted started as one of those insomnia late-night project. I first wrote a scraper using pyQuery then a couple of Django models and views and then tied it up by integrating Balanced Payments. It was actually working on the first night. Flawed but working start to finish.

When it all started, I used Persona to require people to authenticate to set up a Wish List. After some thought I decided to ditch that and use "email authentication" meaning they have to enter an email address and click a secure link I send to them.

One thing I'm very proud of about Wish List Granted is that it does NOT store any passwords, any credit cards or any personal shipping addresses. Despite being so totally void of personal data I thought it'd look nicer if the whole site is on HTTPS.

More information on the Help & Frequently Asked Questions page.

This has served me well of the last couple of years of using Django:

from django import forms

class _BaseForm(object):
    def clean(self):
        cleaned_data = super(_BaseForm, self).clean()
        for field in cleaned_data:
            if isinstance(cleaned_data[field], basestring):
                cleaned_data[field] = (
                    cleaned_data[field].replace('\r\n', '\n')
                    .replace(u'\u2018', "'").replace(u'\u2019', "'").strip())

        return cleaned_data

class BaseModelForm(_BaseForm, forms.ModelForm):

class BaseForm(_BaseForm, forms.Form):

So instead of doing...

class SigupForm(forms.Form):
    name = forms.CharField(max_length=100)
    nick_name = forms.CharField(max_length=100, required=False) do:

class SigupForm(BaseForm):
    name = forms.CharField(max_length=100)
    nick_name = forms.CharField(max_length=100, required=False)

What it does is that it makes sure that any form field that takes a string strips all preceeding and trailing whitespace. It also replaces the strange "curved" apostrophe ticks that Microsoft Windows sometimes uses.

Yes, this might all seem trivial and I'm sure there's something as good or better out there but isn't it a nice thing to never have to worry about doing things like this again:

class SignupForm(forms.Form):

    def clean_name(self):
        return self.cleaned_data['name'].strip()


form = SignupForm(request.POST)
if form.is_valid():
    name = form.cleaned_data['name'].strip()


This breaks some fields, like DateField.

>>> class F(BaseForm):
...     start_date = forms.DateField()
...     def clean_start_date(self):
...         return self.cleaned_data['start_date']
>>> f=F({'start_date': '2013-01-01'})
>>> f.is_valid()
>>> f.cleaned_data['start_date']
datetime.datetime(2013, 1, 1, 0, 0)

As you can see, it cleans up '2013-01-01' into datetime.datetime(2013, 1, 1, 0, 0) when it should become, 1, 1).

Not sure why yet.

If you use django-fancy-cache you can either run with stats or without. With stats, you can get a number of how many times a cache key "hits" and how many times it "misses". Keeping stats incurs a small performance slowdown. But how much?

I created a simple page that either keeps stats or ignores it. I ran the benchmark over Nginx and Gunicorn with 4 workers. The cache server is a memcached running on the same host (my OSX 10.7 laptop).

With stats:

Average: 768.6 requests/second
Median: 773.5 requests/second
Standard deviation: 14.0

Without stats:

Average: 808.4 requests/second
Median: 816.4 requests/second
Standard deviation: 30.0

That means, roughly that running with stats incurs a 6% slower performance.

The stats is completely useless to your users. The stats tool is purely for your own curiousity and something you can switch on and off easily.

Note: This benchmark assumes that the memcached server is running on the same host as the Nginx and the Gunicorn server. If there was more network in between, obviously all the .incr() commands would cause more performance slowdown.

This personal blog site of mine uses django-fancy-cache and mincss.

What that means is that I can cache the whole output of every blog post for weeks and when I do that I can first preprocess the HTML and convert every external CSS into one inline STYLE block which will only reference selectors that are actually used.

To see it in action, right-click and select "View Page Source". You'll see something like this:

Stats about using
Requests:         1 (now: 0)
Before:           81Kb
After:            11Kb
After (minified): 11Kb
Saving:           70Kb

The reason the saving is so huge, in my case, is because I'm using Twitter Bootstrap CSS framework which is awesome but as any framework, it will inevitably contain a bunch of stuff that I don't use. Some stuff I don't use on any page at all. Some stuff is used only on some pages and some other stuff is used only on some other pages.

What I gain by this, is faster page loads. What the browser does is that it, gets a URL, downloads all HTML, opens the HTML to look for referenced CSS (using the link tag) and downloads that too. Once all of that is downloaded, it starts to render the page. Approximately after that it starts to download all referenced Javascript and starts evaluating and executing that.

By not having to download the CSS the browser has one less thing to do. Only one request? Well, that request might be on a CDN (not a great idea actually) so even though it's just 1 request it will involve another DNS look-up.

Here's what the loading of the homepage looks like in Firefox from a US east coast IP.

Granted, a downloaded CSS file can be cached by the browser and used for other pages under the same domain. But, on my blog the bounce rate is about 90%. That doesn't necessarily mean that visitors leave as soon as they arrived, but it does mean that they generally just read one page and then leave. For those 10% of visitors who visit more than one page will have to download the same chunk of CSS more than once. But mind you, it's not always the same chunk of CSS because it's different for different pages. And the amount of CSS that is now in-line only adds about 2-3Kb on the HTML load when sent gzipped.

Getting to this point wasn't easy because I first had to develop mincss and django-fancy-cache and integrate it all. However, what this means is that you can have it done on your site too! All the code is Open Source and it's all Python and Django which are very popular tools.

A Django cache_page on steroids

Django ships with an awesome view decorator called cache_page which is awesome. But a bit basic too.

What it does is that it stores the whole view response in memcache and the key to it is the URL it was called with including any query string. All you have to do is specify the length of the cache timeout and it just works.
Now, it's got some shortcomings which django-fancy-cache upgrades. These "steroids" are:

  1. Ability to override the key prefix with a callable.
  2. Ability to remember every URL that was cached so you can do invalidation by a URL pattern.
  3. Ability to modify the response before it's stored in the cache.
  4. Ability to ignore certain query string parameters that don't actually affect the view but does yield a different cache key.
  5. Ability to serve from cache but always do one last modification to the response.
  6. Incrementing counter of every hit and miss to satisfy your statistical curiosity needs.

The documentation is here:

You can see it in a real world implementation by seeing how it's used on my blog here. You basically use it like this::

from fancy_cache import cache_page

@cache_page(60 * 60)
def myview(request):
    return render(request, 'template.html', stuff)

What I'm doing with it here on my blog is that I make the full use of caching on each blog post but as soon as a new comment is posted, I wipe the cache by basically creating a new key prefix. That means that pages are never cache stale but the views never have to generate the same content more than once.

I'm also using django-fancy-cache to do some optimizations on the output before it's stored in cache.