How I profile my Nginx + proxy pass server

Wednesday, Feb 16, 2011
3 comments Python, Web development

Like so many others you probably have an Nginx server sitting in front of your application server (Django, Zope, Rails). The Nginx server serves static files right off the filesystem and when it doesn't do that it proxy passes the request on to the backend. You might be using proxy_pass, uwsgi or fastcgi_pass or at least something very similar. Most likely you have an Nginx site configure something like this:


server {
   access_log /var/log/nginx/mysite.access.log;
   location ^~ /static/ {
       root /var/lib/webapp;
       access_log off;
   }
   location / {
       proxy_pass http://localhost:8000;
   }
}

What I do is that I add an access log directive that times every request. This makes it possible to know how long every non-trivial request takes for the backend to complete:


server {
   log_format timed_combined '$remote_addr - $remote_user [$time_local]  ' 
                             '"$request" $status $body_bytes_sent '
                             '"$http_referer" "$http_user_agent" $request_time';
   access_log /var/log/nginx/timed.mysite.access.log timed_combined;

   location ^~ /css/ {
       root /var/lib/webapp/static;
       access_log off;
   }
   location / {
       proxy_pass http://localhost:8000;
   }
}

If you do that your access log file will look the same as before except now it will have a last column that contains the timing. Now, let your site spin for a couple of days/weeks/months and later download the access log:


$ rsync -avzP root@myserver.com:/var/log/nginx/timed.mysite.access.log .

Excellent, now download this script and save it next to your log file. When you run it you get a nice little menu that should be sufficiently self-explanatory:


$ python analyze.py timed.mysite.access.log
What do you want to know?

     1) Slowest performer
     2) Most common
     3) Total cumulative time

The most interesting one is probably the last one because that's where most time is spent by your web application and perhaps that's the place to start if you want to make your site faster or if you want to know what is most important in terms of test coverage. Here's a sample output from my on the "Total cumulative time":


GETS
537.582    /?replypath=/comment-20041222-x54i/comment-20041231-0gug
519.277    /?replypath=/comment-20041222-x54i
306.039    /
259.845    /rss.xml?oc=Django
251.064    /Bush-country/
233.165    /plog/blogitem-040601-1
224.459    /plog/blogitem-040601-1/?replypath=/c0610287425
186.032    /plog/interior-octopus/octopus.jpg?display=large
170.430    /plog/blogitem-040601-1/?replypath=/comment-20050714-12mf
POSTS
182.964    /plog/button-tag-in-IE
65.311     /plog/unicode-to-ascii
30.086     /plog/blogitem-040406-1/compressor
7.581      /plog/blogitem-040806-1/test-printsql
1.676      /gkc/callback
0.372      /plog/blogitem-040404-1/getCommentCookie
0.364      /plog/donecal-homepage-10k-req-per-sec/manage_editKeywords
0.363      /plog/blogitem-040627-1/previewComment
0.274      /plog/createelement-a
0.246      /plog/blogitem-20031027-2106/previewComment

I hope it helps. Perhaps other people can help me improve the script and later we can turn it into a package. One thing I would like to see for example is to use the median to reduce crazy spikes (e.g. a URL that normally takes 10 milliseconds just once takes 10 seconds)

Comments

Aldrin Leal March 2, 2011

You missed the "$" in request_time, obviously. I just discovered after running for one week hehe

Peter Bengtsson March 23, 2011

Thanks for noticing. Fixed now.

Nopz March 6, 2012

It seems that the link to the script does not work anymore.

Previous:: DoneCal homepage now able to do 10,000 requests/second February 13, 2011 DoneCal
Next:: Nice testimonial about django-static February 21, 2011 Django

Related by category:: Native connection pooling in Django 5 with PostgreSQL June 25, 2025 Python; A Python dict that can report which keys you did not use June 12, 2025 Python; How I run standalone Python in 2025 January 14, 2025 Python; How to resolve a git conflict in poetry.lock February 7, 2020 Python

Related by keyword:: Be very careful with your add_header in Nginx! You might make your site insecure February 11, 2018 Linux, Web development, Nginx; ssl_session_cache in Nginx and the ab benchmark December 31, 2010 Linux, DoneCal; How I simulate a CDN with Nginx May 15, 2019 Python, Nginx; fcgi vs. gunicorn vs. uWSGI April 9, 2010 Python, Linux, Django

How I profile my Nginx + proxy pass server

Comments

Related posts