
6th of December

Django

Cryptic errors when using django-nose

After about 3 days of debugging using pdb, print and writing to a log file I've almost finally solved the bizarre errors I was getting when running a whole test suite. The error it led to was that Django refused to re-register models with the admin, and the errors looked something like this:

  ...
  File "/Users/peterbe/dev/MOZILLA/PTO/pto/urls.py", line 6, in <module>
    admin.autodiscover()
  File "/Users/peterbe/dev/MOZILLA/PTO/pto/vendor/src/django/django/contrib/admin/__init__.py", line 26, in autodiscover
    import_module('%s.admin' % app)
  File "/Users/peterbe/dev/MOZILLA/PTO/pto/vendor/src/django/django/utils/importlib.py", line 35, in import_module
    __import__(name)
  File "/Users/peterbe/dev/MOZILLA/PTO/pto/apps/users/admin.py", line 30, in <module>
    admin.site.register(UserProfile, UserProfileAdmin)
  File "/Users/peterbe/dev/MOZILLA/PTO/pto/vendor/src/django/django/contrib/admin/sites.py", line 85, in register
    raise AlreadyRegistered('The model %s is already registered' % model.__name__)
 AlreadyRegistered: The model UserProfile is already registered

It turned out to be independent of which Django project I ran, and no one else was able to reproduce it on any machine with the exact same code.

After 2 days I found that the difference between a successful run and a failing run was how I specified (to nose) which module to load:

 ./manage.py test users  # fails!
 ./manage.py test users.test  # works!

In both cases it finds the same tests. So it would either fail 10 times or work 10 times. Hmmm...

The bridging between nose and Django is done by the awesome django-nose, developed here at Mozilla by Django extraordinaire Jeff Balogh. It's a non-trivial piece of code as it depends on some really smart importing tricks and stuff which I haven't even begun to understand.

However, after so much trial and error I finally discovered that the solution (for me) was to delete the ~/.noserc file. What's strange is that all it contained was:

 [nosetests]
 with-doctest=1

I might never actually find out what went wrong. Ultimately I think things went wrong because it incorrectly populated sys.modules with excessive keys, which would cause a double import of urls.py, which in turn runs admin.autodiscover() a second time.
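To make that suspected failure mode concrete, here is a minimal sketch using only the standard library. It is not Django or django-nose code: Site, load_as and the module names are hypothetical stand-ins. It only shows that when the same file is executed under two different module names, its module-level registration call runs twice and the second run fails.

```python
import importlib.util
import os
import tempfile


class AlreadyRegistered(Exception):
    pass


class Site(object):
    """Hypothetical stand-in for an admin site that refuses duplicates."""

    def __init__(self):
        self._registry = set()

    def register(self, name):
        if name in self._registry:
            raise AlreadyRegistered(
                'The model %s is already registered' % name)
        self._registry.add(name)


def load_as(path, module_name, site):
    """Execute the file at `path` as a fresh module called `module_name`."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    module.site = site  # make the shared site visible to the module code
    spec.loader.exec_module(module)
    return module


def demo():
    site = Site()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'admin.py')
        with open(path, 'w') as f:
            f.write("site.register('UserProfile')\n")
        load_as(path, 'users.admin', site)  # first import: fine
        try:
            # same file under a second (hypothetical) name: runs again
            load_as(path, 'apps.users.admin', site)
        except AlreadyRegistered as exc:
            return str(exc)
    return None
```

Calling demo() returns the error message from the second load, which has the same shape as the traceback above.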

Sorry for the rambling. And sorry for not actually finding the real bug. I did spend 2-3 days debugging this non-stop and hopefully some other poor frustrated person is going to see this and also look into their ~/.noserc for ways to fix it.

1st of August

Django

EmailInput HTML5 friendly for Django

Suppose you have a Django app with a login where people can only log in with their email address. Then use this widget on your login form:

 import django.contrib.auth.forms
 from django import forms

 ## The input widget class
 class EmailInput(forms.widgets.Input):
     input_type = 'email'

     def render(self, name, value, attrs=None):
         if attrs is None:
             attrs = {}
         # stop mobile browsers from "helping" with the email address
         attrs.update(dict(autocorrect='off',
                           autocapitalize='off',
                           spellcheck='false'))
         return super(EmailInput, self).render(name, value, attrs=attrs)

 ## Example usage
 class AuthenticationForm(django.contrib.auth.forms.AuthenticationForm):
     """override the authentication form because we use the email address as the
     key to authentication."""

     # allows for using email to log in
     username = forms.CharField(label="Username", max_length=75,
                                widget=EmailInput())
     rememberme = forms.BooleanField(label="Remember me", required=False)

This input field does some cool stuff in the browser, such as automatic validation, as seen in this screenshot.

More importantly, it fixes a very annoying problem when surfing on a smartphone or a tablet like the iPad. As I'm about to type "someusername@mozilla.com" it first wants to capitalize the first letter, which might make the login fail. Also, if the email address contains a word that it wants to correct (like "mozilla" -> "Mozilla") you have to click the little correction tooltip to tell it that the input is correct verbatim.

A note to Djangonauts who want to use this and have a dual authentication backend that takes both usernames and email addresses: this form will make it impossible to log in as something called "admin", for example, since browsers that validate the field will reject a value that is not an email address.

22nd of July

Django

A taste of the Django on inside Mozilla, Sheriffs Duty

One of the many great things about working for Mozilla is that everything we do is Open Source. Even our wiki is open (however, we have an internal wiki for boring corporate stuff such as meeting rooms, HR etc.).

Last week I wrote an internal application for Mozilla's build engineers. Essentially it's a roster that lists one user per day and it's helped by being visualized as a calendar and as a vCal export. It's very unlikely that anybody outside Mozilla will find this particularly useful. But who knows, perhaps other companies have needs to take turns to sheriff build machines.

Anyway, the project was easy to write because we have something called Playdoh. It's a set of nifty and useful settings and a folder structure and it comes with a submodule called "playdoh-lib" which is stuffed with lots of useful packages that you'll most likely want to use. If you browse Playdoh on Github it might look like a lot of stuff but after a second look you'll see that there's actually almost no code. So don't you dare to play the "bloat card"! :)

What this app uses is TastyPie for the REST API which was awesome by the way.

For the authentication I used django-auth-ldap and some custom classes because at Mozilla we use email addresses instead of usernames.

To make the vCal export I use VObject, which was easy to work with but has some unusual syntax in places.

Jinja was used for the template rendering and it meant I had to do some tricks to use the django.contrib.auth.views.login view but with my templates. Might be worth looking into if people are interested.

The code has 98% test coverage but I had to upgrade to the latest nose to be able to run test coverage on app modules that have similar names to modules in the standard lib.

2nd of June

Django

Test static resources in Django tests

At Mozilla we use jingo-minify to bundle static resources such as .js and .css files. It's not a perfect solution but it's got some great benefits. One of them is that you need to know exactly which static resources you need in a template and because things are bundled you don't need to care too much about what files it originally consisted of. For example "jquery-1.6.2.js" + "common.js" + "jquery.cookies.js" can become "bundles/core.js"

A drawback of this is that if you forget to compress and prepare all assets (using the compress_assets management command in jingo-minify) you break your site with missing static resources. So how to test for this?
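The rest of the post is not shown on this page, so the following is not the author's solution. It is just one possible sketch, using only the standard library, of the general idea: scan rendered HTML for script and stylesheet URLs and report the ones that have no file on disk. The /static/ prefix, the function names and the file names in demo() are assumptions made for the illustration.

```python
import os
import tempfile
from html.parser import HTMLParser


class StaticURLCollector(HTMLParser):
    """Collect the src of <script> tags and the href of <link> tags."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'script' and attrs.get('src'):
            self.urls.append(attrs['src'])
        elif tag == 'link' and attrs.get('href'):
            self.urls.append(attrs['href'])


def missing_static_files(html, static_root, url_prefix='/static/'):
    """Return the static URLs in `html` with no file under `static_root`."""
    collector = StaticURLCollector()
    collector.feed(html)
    missing = []
    for url in collector.urls:
        if not url.startswith(url_prefix):
            continue  # external or non-static URL, not our concern
        relative = url[len(url_prefix):].split('?', 1)[0]
        if not os.path.isfile(os.path.join(static_root, relative)):
            missing.append(url)
    return missing


def demo():
    html = ('<link href="/static/bundles/core.css" rel="stylesheet" />'
            '<script src="/static/bundles/core.js"></script>')
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, 'bundles'))
        # only the JS bundle was "compressed"; the CSS one was forgotten
        with open(os.path.join(root, 'bundles', 'core.js'), 'w') as f:
            f.write('// bundled')
        return missing_static_files(html, root)
```

In a real test the HTML would come from the test client's response and the assertion would be that the returned list is empty.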


> Read the whole text (367 more words)

22nd of February

Django

Optimization of getting random rows out of a PostgreSQL in Django

There was a really interesting discussion on the django-users mailing list about the most efficient way to select random elements out of a SQL database. I knew using a regular RANDOM() in SQL can be very slow on big tables but I didn't know by how much. Had to run a quick test!

Cal Leeming discussed a snippet of his for paginating huge tables which uses the MAX(id) aggregate function.

So, I did a little experiment on a table with 84,000 rows in it. Realistic enough to matter even though it's less than millions. So, how long would it take to select 10 random items, 10 times? Benchmark code looks like this:

 from time import time

 TIMES = 10

 def using_normal_random(model):
     # order_by('?') makes the database sort the whole table randomly
     for i in range(TIMES):
         yield model.objects.all().order_by('?')[0].pk

 t0 = time()
 for i in range(TIMES):
     list(using_normal_random(SomeLargishModel))
 t1 = time()
 print("%s seconds" % (t1 - t0))

Result:

 41.8955321312 seconds

Nasty!! Also running this you'll notice postgres spiking your CPU like crazy.

A much better approach is to use Python's random.randint(1, <max ID>). Looks like this:

 from random import randint
 from time import time

 from django.db.models import Max

 def using_max(model):
     max_ = model.objects.aggregate(Max('id'))['id__max']
     i = 0
     while i < TIMES:
         try:
             yield model.objects.get(pk=randint(1, max_)).pk
             i += 1
         except model.DoesNotExist:
             # hit a gap in the IDs (e.g. a deleted row), guess again
             pass

 t0 = time()
 for i in range(TIMES):
     list(using_max(SomeLargishModel))
 t1 = time()
 print("%s seconds" % (t1 - t0))

Result:

 0.63835811615 seconds

Much more pleasant!

UPDATE

Commenter Ken Swift asked: what if your requirement is to select 100 random items instead of just 10? Won't those 101 database queries be more costly than just 1 query with RANDOM()? The answer turns out to be no.

I changed the script to select 100 random items 1 time (instead of 10 items 10 times) and the times were the same:

 using_normal_random() took 41.4467599392 seconds
 using_max() took 0.6027739048 seconds

And what about 1000 items 1 time:

 using_normal_random() took 204.685141802 seconds
 using_max() took 2.49527382851 seconds
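For anyone who wants to play with the two strategies without a Django project, here is a rough standalone sketch using sqlite3 from the standard library. The table, its size and the function names are made up for the illustration, and SQLite on a small in-memory table will not reproduce the PostgreSQL timings above; the point is only to show the shape of the two queries.

```python
import random
import sqlite3


def make_table(conn, rows=1000, gap_every=7):
    """Create a table whose IDs have gaps, as after deleting rows."""
    conn.execute('CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)')
    conn.executemany(
        'INSERT INTO item (id, name) VALUES (?, ?)',
        [(i, 'item %d' % i) for i in range(1, rows + 1) if i % gap_every])


def random_ids_order_by(conn, count):
    """Let the database shuffle the whole table and keep `count` rows."""
    cursor = conn.execute(
        'SELECT id FROM item ORDER BY RANDOM() LIMIT ?', (count,))
    return [row[0] for row in cursor]


def random_ids_max(conn, count):
    """Guess IDs between 1 and MAX(id); retry whenever a guess hits a gap."""
    max_id = conn.execute('SELECT MAX(id) FROM item').fetchone()[0]
    found = []
    while len(found) < count:
        guess = random.randint(1, max_id)
        row = conn.execute(
            'SELECT id FROM item WHERE id = ?', (guess,)).fetchone()
        if row is not None:
            found.append(row[0])
    return found
```

Two caveats apply to the MAX(id) approach, here and in the Django version: it can return the same row more than once, and it needs more retries the sparser the IDs are.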

20th of February

Django

Nice testimonial about django-static

My friend Chris is a Django newbie who has managed to build a whole e-shop site in Django. It will launch in a couple of days and when it launches I will blog about it here too. He sent me this today which gave me a smile:

"I spent today setting up django_static for the site, and optimising it for performance. If there's one thing I've learned from you, it's optimisation.

So, my homepage is now under 100KB (was 330KB), and it loads in @5-6 seconds from hard refresh (was 13-14 seconds at its worst). And I just got a 92 score on Yslow. I do believe I have the fastest tea website around now, and I still haven't installed caching.

Wicked huh?"

He's talking about using django-static. Then I get another email shortly after with this:

"correction - I get 97 on YSlow if I use a VPN.

I just found that the Great Firewall tags extra HTTP requests onto every request I make from my browser, pinging a server in Shanghai with a PHP script which probably checks the page for its content or if its on some kind of blocked list. Cheeky buggers!"

Isn't that interesting! (Note: Chris is based in China but hosts the test site in the UK.)

13th of January

Django

Fastest "boolean SQL queries" possible with Django

For those familiar with the Django ORM they know how easy it is to work with and that you can do lots of nifty things with the result (QuerySet in Django lingo).

So I was working on a report that basically just needed to figure out if a particular product has been invoiced. Not for how much or when, just if it's included in an invoice or not.


> Read the whole text (610 more words)

11th of January

Django

django-static version 1.5 automatically taking care of imported CSS

I just released django-static 1.5 (github page) which takes care of optimizing imported CSS files.

To explain, suppose you have a file called foo.css and do this in your Django template:

 {% load django_static %}
 <link href="{% slimfile "/css/foo.css" %}"
   rel="stylesheet" type="text/css" />

And in foo.css you have the following:

 @import "bar.css";
 body {
     background-image: url(/images/foo.png);
 }

And in bar.css you have this:

 div.content {
     background-image: url("bar.png");
 }

The outcome is the following:

 # foo.css
 @import "/css/bar.1257701299.css";
 body{background-image:url(/images/foo.1257701686.png)}

 # bar.css
 div.content{background-image:url("/css/bar.1257701552.png")}

In other words, not only does it parse your CSS content and give images unique names you can set aggressive caching headers on, it will also unfold imported CSS files and optimize them too.
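To give a feel for the renaming part, here is a simplified sketch. It is not django-static's actual implementation: stamp_urls, stamped_name and the mtimes dictionary are names invented for this example, and it skips things the real package handles, such as resolving relative paths and following @import.

```python
import re

# url(...) with optional single or double quotes around the path
URL_RE = re.compile(r'url\((["\']?)([^)"\']+)\1\)')


def stamped_name(path, mtime):
    """Turn /images/foo.png into /images/foo.<mtime>.png."""
    base, dot, ext = path.rpartition('.')
    if not dot:
        return path  # no extension, leave it alone
    return '%s.%d.%s' % (base, mtime, ext)


def stamp_urls(css, mtimes):
    """Rewrite every url(...) whose path has a known modification time."""
    def replace(match):
        quote, path = match.group(1), match.group(2)
        if path not in mtimes:
            return match.group(0)
        return 'url(%s%s%s)' % (quote, stamped_name(path, mtimes[path]), quote)
    return URL_RE.sub(replace, css)
```

Because the timestamp changes whenever the file changes, the old URL can be cached forever without anyone being served a stale image.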

I think that's really useful. With one single setting (settings.DJANGO_STATIC=True) you can get all your static resources massaged and prepared for the best possible HTTP optimization. Also, it's all automated so you never need to run any build scripts, and the definition of what static resources to use (and how to optimize them) is all defined in the template. This I think makes a lot more sense than maintaining static resources in a config file.

The coverage is 93% and there is an example app to look at if you prefer that over a README.

27th of October

Django

In Django, how much faster is it to aggregate?

Being able to do aggregate functions with Django's QuerySet API is really useful. Not because it's difficult to write your own loop but because the summation is then done inside the SQL database. I had this piece of code:

 from decimal import Decimal

 t = Decimal('0')
 for each in some_queryset:
     t += each.cost

Which can be rewritten like this instead:

 from django.db.models import Sum

 t = some_queryset.aggregate(Sum('cost'))['cost__sum']

For my 6,000+ records in the database the first one takes about 0.7 seconds. The aggregate takes 0.02 seconds. Blimey! That's over a 30-fold difference in speed for practically the same thing.

Granted, when doing the loop you can do some other stuff such as counting or additional function calls, but that difference is quite significant. In my current application those 0.7 seconds aren't really a problem, but they quickly become one when it has to be done over and over for multiple sets.
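The same contrast can be tried outside Django with sqlite3 from the standard library. This is only a sketch with a made-up expense table and integer cents instead of Decimal, and its timings will not match the numbers above; it just shows summing row by row in Python next to letting the database do the SUM.

```python
import sqlite3


def make_costs(conn, count=6000):
    """Create a made-up table with `count` rows of costs in cents."""
    conn.execute(
        'CREATE TABLE expense (id INTEGER PRIMARY KEY, cost_cents INTEGER)')
    conn.executemany(
        'INSERT INTO expense (cost_cents) VALUES (?)',
        [(i % 500,) for i in range(count)])


def total_in_python(conn):
    """Fetch every row and add the costs up in a Python loop."""
    total = 0
    for (cost,) in conn.execute('SELECT cost_cents FROM expense'):
        total += cost
    return total


def total_in_sql(conn):
    """Let the database do the summation and return a single value."""
    return conn.execute(
        'SELECT SUM(cost_cents) FROM expense').fetchone()[0]
```

Both functions return the same total; the second one just moves one number across the database connection instead of thousands of rows.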

11th of October

Django

Local Django development with Nginx

When doing local Django development with runserver you end up doing some changes, then refreshing in Firefox/Chrome/Safari again and again. Doing this means that all your static resources are probably served via Django. Presumably via django.views.static.serve, right? What's wrong with that? Not much, but we can do better.

So, you serve it via Nginx and let Nginx take care of all static resources. You'll still use Django's own runserver so no need for mod_wsgi, gunicorn or uWSGI. This requires that you have Nginx installed and running on your local development environment. First you need to decide on a fake domain name. For example mylittlepony. Edit your /etc/hosts file by adding this line:

 127.0.1.1       mylittlepony


> Read the whole text (410 more words)

27th of August

Python, Django

Musings about django.contrib.auth.models.User

It dawned on me that the Django auth user model that ships with Django is like the string built-in of a high level programming language. With the string built-in it's oh so tempting to add custom functionality to it, like a fancy capitalization method or some other function that automatically strips whitespace or what not. Yes, I'm looking at you, Prototype, for example.

By NOT doing that, and leaving it as it is, you automatically manage to Keep It Simple Stupid and your application code makes sense to the next developer who joins your project.

I'm not a smart programmer but I'm a smart developer in that I'm good at keeping things pure and simple. It means I can't show off any fancy generators, monads or metaclasses but it does mean that fellow coders who follow my steps can more quickly hit the ground running.

My colleagues and I now have more than ten Django projects that rely on, without overriding, the django.contrib.auth.models.User class, and there have been many times where I've been tempted to use it as a base class or something instead, but in retrospect I'm wholeheartedly happy I didn't. The benefit isn't technical; it's a matter of teamwork and holistic productivity.

9th of July

Django

Hosting Django static images with Amazon Cloudfront (CDN) using django-static

About a month ago I added a new feature to django-static that makes it possible to define a function that all files of django-static go through.

First of all a quick recap. django-static is a Django plugin that you use from your templates to reference static media. django-static takes care of giving the file the optimum name for static serving and if applicable compresses the file by trimming all whitespace and what not. For more info, see The awesomest way possible to serve your static stuff in Django with Nginx

The new, popular, kid on the block for CDN (Content Delivery Network) is Amazon Cloudfront. It's a service sitting on top of the already proven Amazon S3 service, which is a cloud file storage solution. What a CDN does is that it registers a domain for your resources such that, with some DNS tricks, users of this resource URL download it from the geographically nearest server. So if you live in Sweden you might download myholiday.jpg from a server in Frankfurt and if you live in North Carolina, USA you might download the very same picture from Virginia, USA. That ensures that the distance to the resource is minimized. If you're not convinced or sure about how CDNs work check out THE best practice guide for faster webpages by Steve Souders (it's number two).

A disadvantage with Amazon Cloudfront is that it's unable to negotiate with the client to compress downloadable resources with GZIP. GZIPping a resource is considered a bigger optimization win than using a CDN. So, I continue to serve my static CSS and Javascript files from my Nginx but put all the images on Amazon Cloudfront. How to do this with django-static? Easy: add this to your settings:

 DJANGO_STATIC = True
 ...other DJANGO_STATIC_... settings...
 # equivalent of 'from cloudfront import file_proxy' in this PYTHONPATH
 DJANGO_STATIC_FILE_PROXY = 'cloudfront.file_proxy'

Then you need to write that function that gets a chance to do something with every static resource that django-static prepares. Here's a naive first version:

 # in cloudfront.py

 conversion_map = {} # global variable
 def file_proxy(uri, new=False, filepath=None, changed=False, **kwargs):
     if filepath and (new or changed):
         if filepath.lower().split('.')[-1] in ('jpg','gif','png'):
             conversion_map[uri] = _upload_to_cloudfront(filepath)
     return conversion_map.get(uri, uri)
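To see how this proxy behaves without an Amazon account, here is a self-contained sketch that repeats the function above with a stub uploader. The stub _upload_to_cloudfront and the example.cloudfront.net domain are hypothetical; the real upload code is in the part of the post not shown here.

```python
conversion_map = {}  # uri -> CDN URL, filled in as files get uploaded


def _upload_to_cloudfront(filepath):
    """Hypothetical stub: pretend to upload and return the CDN URL."""
    return 'http://example.cloudfront.net/' + filepath.split('/')[-1]


def file_proxy(uri, new=False, filepath=None, changed=False, **kwargs):
    # only new or changed images are sent to the CDN
    if filepath and (new or changed):
        if filepath.lower().split('.')[-1] in ('jpg', 'gif', 'png'):
            conversion_map[uri] = _upload_to_cloudfront(filepath)
    # anything never uploaded keeps its local URI
    return conversion_map.get(uri, uri)
```

With this in place an image URI comes back as a CDN URL (and keeps doing so on later calls), while a CSS or Javascript URI comes back unchanged and is still served locally. Note that the map lives in memory only, so it is empty again after a restart, which is one reason the version above is called naive.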


> Read the whole text (1013 more words)

30th of May

Django, MongoDB

Correction: running Django tests with MongoDB is NOT slow

At Euro DjangoCon I met lots of people and talked a lot about MongoDB as the backend. I even did a presentation on the subject which led to a lot of people asking me more questions about MongoDB.

I did mention to some people that one of the drawbacks of using MongoDB, which doesn't have transactions, is that you have to create and destroy the collections (like SQL tables) each time for every single test run. I thought this was slow. It's not.

Today I've been doing some more profiling and testing and debugging and I can conclude that it's not a problem. Creating the database has a slight delay but it's something you only have to do once and actually it's very fast. Here's how I tear down the collections in between each test:

 class BaseTest(TestCase):

    def tearDown(self):
        for name in self.database.collection_names():
            if name not in ('system.indexes',):
                self.database.drop_collection(name)

For example, running the tests of one of my apps looks like this:

 $ ./manage.py test myapp
 ...........lots.............
 ----------------------------------------------------------------------
 Ran 55 tests in 3.024s

So, don't fear writing lots of individual unit tests. MongoDB will not slow you down.

25th of May

Django

"Using MongoDB in your Django app - implications and benefits"

Straight from DjangoCon 2010 here in Berlin. Slides from my talk on "Using MongoDB in your Django app - implications and benefits" are available as an HTML5 web page so you'll need one of those fancy browsers like Chrome to be able to view it. Sorry.

23rd of May

Python, Django

mongoengine vs. django-mongokit

django-mongokit is the project you want to use if you want to connect your Django project to your MongoDB database via the pymongo Python wrapper. An alternative (dare I say competing alternative) is MongoEngine, which is a bridge straight from Django to pymongo. The immediate difference you notice is the syntax. django-mongokit looks like MongoKit syntax and MongoEngine looks like the Django ORM. They both accomplish pretty much the same thing. So, which one is fastest?

First of all, remember this, where I showed how django-mongokit sped past the SQL ORM like a lightning bullet? Well, it appears MongoEngine is even faster.

[Chart: mongoengine vs. django-mongokit]

That's an average of 23% faster for all three operations!

 
