So I set out to benchmark good old threaded fcgi and gunicorn and then with a source compiled nginx with the uwsgi module baked in I also benchmarked uwsgi. The first mistake I did was testing a Django view that was using sessions and other crap. I profiled the view to make sure it wouldn't be the bottleneck as it appeared to take only 0.02 seconds each. However, with fcgi, gunicorn and uwsgi I kept being stuck on about 50 requests per second. Why? 1/0.02 = 50.0!!! Clearly the slowness of the Django view was thee bottleneck (for the curious, what took all of 0.02 was the need to create new session keys and putting them into the database).
So I wrote a really dumb Django view with no sessions middleware enabled. Now we're getting some interesting numbers:
fcgi (threaded) 640 r/s fcgi (prefork 4 processors) 240 r/s (*) gunicorn (2 workers) 1100 r/s gunicorn (5 workers) 1300 r/s gunicorn (10 workers) 1200 r/s (?!?) uwsgi (2 workers) 1800 r/s uwsgi (5 workers) 2100 r/s uwsgi (10 workers) 2300 r/s (* this made my computer exceptionally sluggish as CPU when through the roof)
If you're wondering why the numbers appear to be rounded it's because I ran the benchmark multiple times and guesstimated an average (also obviously excluded the first run).
gunicorn is the winner in my eyes. It's easy to configure and get up and running and certainly fast enough and I don't have to worry about stray threads being created willy nilly like threaded fcgi. uwsgi definitely worth coming back to the day I need to squeeze few more requests per second but right now it just feels to inconvenient as I can't convince my sys admins to maintain compiled versions of nginx for the little extra benefit.
Having said that, the day uwsgi becomes available as a Debian package I'm all over it like a dog on an ass-flavored cookie.
And the "killer benefit" with gunicorn is that I can predict the memory usage. I found, on my laptop: 1 worker = 23Mb, 5 workers = 82Mb, 10 workers = 155Mb and these numbers stayed like that very predictably which means I can decide quite accurately how much RAM I should let Django (ab)use.
Since this was publish we, in my company, have changed all Djangos to run over uWSGI. It's proven faster than any alternatives and extremely stable. We actually started using it before it was merged into core Nginx but considering how important this is and how many sites we have it's not been a problem to run our own Nginx package.
Voila! Now feel free to flame away about the inaccuracies and what multitude of more wheels and knobs I could/should twist to get even more juice out.