
13th of December

Python, Tornado

Persistent caching with fire-and-forget updates

I just recently landed some patches on toocool that implement an interesting pattern that is seen more and more these days. I call it: Persistent caching with fire-and-forget updates

Basically, the implementation is this: you issue a request that requires information about a Twitter user, e.g. http://toocoolfor.me/following/chucknorris/vs/peterbe. The app looks in its MongoDB for information about the tweeter; if it can't find the user, it goes to the Twitter REST API, looks the user up and saves the result in MongoDB. The next time the same information is requested and the data is available in MongoDB, it instead checks whether the modify_date is more than an hour old and, if so, sends a job to the message queue (Celery with Redis in my case) to perform an update on this tweeter.

You can basically see the code here but just to reiterate and abbreviate, it looks like this:

 tweeter = self.db.Tweeter.find_one({'username': username})
 if not tweeter:
    result = yield tornado.gen.Task(...)
    if result:
        tweeter = self.save_tweeter_user(result)
    else:
        # deal with the error!
 elif age(tweeter['modify_date']) > 3600:
    tasks.refresh_user_info.delay(username, ...)
 # render the template!

What the client (i.e. the user using the site) gets is instant results every time apart from the very first time a URL is requested, while the data is still being maintained and refreshed in the background.

This pattern works great for data that doesn't have to be up-to-date to the second but that still needs a way to cache invalidate and re-fetch. This works because my limit of 1 hour is quite arbitrary. An alternative implementation would be something like this:

 tweeter = self.db.Tweeter.find_one({'username': username})
 if not tweeter or age(tweeter['modify_date']) > 3600 * 24 * 7:
     # re-fetch from Twitter REST API
 elif age(tweeter['modify_date']) > 3600:
     # fire-and-forget update

That way you don't suffer from persistently cached data that is too old.
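To make the two thresholds concrete, here is a minimal, self-contained sketch of the pattern in Python 3. It is not the toocool code: a plain dict stands in for MongoDB, a list stands in for the Celery queue, and the names get_tweeter, fetch, enqueue, STALE_AFTER and EXPIRED_AFTER are all made up for illustration.

```python
import time

STALE_AFTER = 3600              # older than an hour: refresh in the background
EXPIRED_AFTER = 3600 * 24 * 7   # older than a week: too old to serve, re-fetch now


def get_tweeter(username, store, fetch, enqueue, now=None):
    """Return cached user info, refreshing it lazily.

    store   -- dict standing in for the MongoDB collection (assumption)
    fetch   -- callable standing in for the Twitter REST API lookup (assumption)
    enqueue -- callable standing in for sending a job to the queue (assumption)
    """
    now = time.time() if now is None else now
    record = store.get(username)
    age = None if record is None else now - record['modify_date']

    if record is None or age > EXPIRED_AFTER:
        # Cache miss, or the data is far too old: fetch synchronously and save.
        record = dict(fetch(username), modify_date=now)
        store[username] = record
    elif age > STALE_AFTER:
        # Serve the cached copy right away and let a worker update it later.
        enqueue(username)
    return record


# Tiny demonstration with fake collaborators.
store = {}
jobs = []
fetch = lambda username: {'username': username, 'followers': 10}

first = get_tweeter('peterbe', store, fetch, jobs.append, now=0)
fresh = get_tweeter('peterbe', store, fetch, jobs.append, now=60)
stale = get_tweeter('peterbe', store, fetch, jobs.append, now=7200)
expired = get_tweeter('peterbe', store, fetch, jobs.append, now=3600 * 24 * 8)
```

Only the third call puts a job on the queue, and only the first and last calls block on the (fake) API.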

2nd of December

Python

Python file with closing automatically

Perhaps someone who knows more about the internals of Python and the recent changes in 2.6 and 2.7 can answer this question that came up today in a code review.

I suggest using with instead of try: ... finally: to close a file that was written to. Instead of this:

 dest = file('foo', 'w')
 try:
    dest.write('stuff')
 finally:
    dest.close()
 print open('foo').read()  # will print 'stuff'

We can use this:

 with file('foo', 'w') as dest: 
     dest.write('stuff')
 print open('foo').read()  # will print 'stuff'

Why does that work? I'm guessing it's because the file() instance object has a built in __exit__ method. Is that right?

That means I don't need to use contextlib.closing(thing) right?

For example, suppose you have this class:

 class Farm(object):
    def __enter__(self):
        print "Entering"
        return self
    def __exit__(self, err_type, err_val, err_tb):
        print "Exiting", err_type
        self.close()
    def close(self):
        print "Closing"

 with Farm() as farm:
    pass
 # this will print:
 #   Entering
 #   Exiting None
 #   Closing

Another way to achieve the same specific result would be to use the closing() helper from contextlib:

 class Farm(object):
    def close(self):
        print "Closing"

 from contextlib import closing
 with closing(Farm()) as farm:
    pass
 # this will print:
 #   Closing

So closing() "steals" the __enter__ and __exit__. This last one can be handy if you do this:

 from contextlib import closing
 with closing(Farm()) as farm:
    raise ValueError

 # this will print
 #  Closing
 #  Traceback (most recent call last):
 #   File "dummy.py", line 16, in <module>
 #     raise ValueError
 #  ValueError

This is turning into my own thinking out loud, and I think I get it now. contextlib.closing() basically makes it possible to do what I did there with the __enter__ and __exit__, and it seems the file() built-in has an __exit__ handler that already takes care of the closing, so you don't need any extra wrappers.
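A quick way to check both conclusions is the sketch below. The snippets above are Python 2; this one is written for Python 3, where open() replaces file(). The temporary path and the closed flag on Farm are additions made purely for the demonstration.

```python
import os
import tempfile
from contextlib import closing


class Farm(object):
    """Has close() but no __enter__/__exit__ of its own."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


# 1. File objects are context managers, so `with` closes them for you.
path = os.path.join(tempfile.mkdtemp(), 'foo')
with open(path, 'w') as dest:
    dest.write('stuff')

file_has_exit = hasattr(dest, '__exit__')
file_closed_after_with = dest.closed

# 2. closing() wraps anything with a close() method, and close() is
#    still called when the body raises.
farm = Farm()
try:
    with closing(farm):
        raise ValueError('boom')
except ValueError:
    pass

farm_closed_after_error = farm.closed
```

So contextlib.closing() is only needed for objects that offer close() but are not context managers themselves.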

18th of November

Python

Trivial but powerful tips for nosetests

I'm clearly still a nosetests beginner because it was only today that I figured out how to set certain plugins to always be on.

First of all you might like these plugins too:

 $ pip install rudolf
 $ pip install disabledoc

Docs: rudolf and disabledoc

To get these gorgeous little tricks into every run of nosetests edit the file ~/.noserc and add the following:

 [nosetests]
 with-disable-docstring=1
 with-color=1

That should make your life a little easier.

UPDATE:

I've since managed to shoot myself in both legs with messing around with nosetests plugins because I heavily rely on django-nose in Django. Long story short: be careful if you get strange import related errors!

8th of July

Python

Slides about Kwissle from yesterday's London Python Dojo

Here are the slides from yesterday's London Python Dojo event.

I presented and demo'ed Kwissle to my fellow Python London friends and focused a lot on the technology but also tried to plug the game a bit.

Having seen that there's a lot of interest in "socket" related web applications about, I thought this was a good chance to say that you don't need NodeJS and that tornadio is a great framework for that.

12th of June

Python, MongoDB

Optimization story involving something silly I call "dict+"

Here's an interesting little story about using MongoKit to quickly draw items from a MongoDB.

So I had a piece of code that was doing a big batch update. It was slow. It took about 0.5 seconds per user and I sometimes had a lot of users to run it for.

The code looked something like this:

 for play in db.PlayedQuestion.find({'user.$id': user._id}):
     if play.winner == user:
         bla()
     elif play.draw:
         ble()
     else:
         blu()

Because the model PlayedQuestion contains DBRefs MongoKit will automatically look them up for every iteration in the main loop. Individually very fast (thank you indexes) but because of the number of operations very slow in total. Here's how to make it much faster:

    for play in db.PlayedQuestion.collection.find({'user.$id': user._id}):

The problem with this is that you get plain dict instances, which are more awkward to work with. I.e. instead of `play.winner` you have to use `play['winner'].id`. Here's my solution that makes this a lot easier:

 class dict_plus(dict):

     def __init__(self, *args, **kwargs):
         if 'collection' in kwargs: # excess we don't need
             kwargs.pop('collection')
         dict.__init__(self, *args, **kwargs)
         self._wrap_internal_dicts()

     def _wrap_internal_dicts(self):
         for key, value in self.items():
             if isinstance(value, dict):
                 self[key] = dict_plus(value)

     def __getattr__(self, key):
         return self[key]

 ...

 for play in db.PlayedQuestion.collection.find({'user.$id': user._id}):
     play = dict_plus(play)
     if play.winner.id == user._id:
         bla()
     elif play.draw:
         ble()
     else:
         blu()

Now, the whole thing takes 0.01 seconds instead of 0.5. 50 times faster!!
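For anyone who wants to try the idea without a database, here is a standalone Python 3 sketch of the same wrapper. The class name DictPlus and the sample document are made up for illustration, and unlike the version above it raises AttributeError for missing keys so that hasattr() and getattr() with a default behave normally.

```python
class DictPlus(dict):
    """A dict whose keys can also be read as attributes, recursively."""

    def __init__(self, *args, **kwargs):
        # Mirrors the version above: drop a 'collection' keyword if present.
        kwargs.pop('collection', None)
        super().__init__(*args, **kwargs)
        # Wrap nested dicts so attribute access works at every level.
        for key, value in list(self.items()):
            if isinstance(value, dict) and not isinstance(value, DictPlus):
                self[key] = DictPlus(value)

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails.
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)


# Hypothetical document shaped like the ones in the loop above.
play = DictPlus({'winner': {'id': 42}, 'draw': False})
winner_id = play.winner.id
is_draw = play.draw
has_missing = hasattr(play, 'missing')
```

The object is still a real dict, so play['winner']['id'] keeps working alongside play.winner.id.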

6th of April

Python, Tornado

TornadoGists.org - launched and ready!

Today Felinx Lee and I launched TornadoGists.org which is a site for discussing gists related to Tornado (the Python web framework open sourced by Facebook).

Everyone in the Tornado community seems to solve similar problems in different ways. Oftentimes, these solutions are just a couple of lines or so and not something you can really turn into a full package with setup.py and everything.

Sharing a snippet of code is a great way to a) help other people and b) to get feedback on your solutions.

The goal is to make it a very open and active project with lots of contributors. I'll be accepting and reviewing all forks but hopefully control will be opened up to all Tornado developers. Also, since the code is quite generic to any open source project Felinx and I might one day port this to rubygists.org or lispgists.org or something like that. After all, Github does all the heavy lifting and we just wrap it up nicely.

10th of March

Python

More productive than Lisp? Really??!

Erann Gat reveals why he lost his mojo with Lisp

What caught my attention (for busy people who don't want to read the whole email):

"So I can't really go into many specifics about what happened at Google because of confidentiality, but the upshot was this: I saw, pretty much for the first time in my life, people being as productive and more in other languages as I was in Lisp. What's more, once I got knocked off my high horse (they had to knock me more than once -- if anyone from Google is reading this, I'm sorry) and actually bothered to really study some of these other languges I found myself suddenly becoming more productive in other languages than I was in Lisp. For example, my language of choice for doing Web development now is Python."

I'm currently studying Lisp myself and it's hard. Really hard. I blame it on being spoiled with a programming language that I can work in without having to read the manual. With Python's brilliant introspection I can use the interpreter to find out how a library works just by using help() and dir() without even having to read the source code. (Not always true, of course.)

As we're entering the 21st century, the new contender "Usability" is becoming more and more important. Considering that I've now done Python for more than a decade, I remind myself of one of the reasons I liked it so much; yes, exactly that: Usability.

23rd of February

Python

Connecting with psycopg2 without a username and password

My colleague Lukas and I banged our heads against this for much too long today. So, our SQLAlchemy was configured like this:

 ENV_DB_CONNECTION_DSN = postgresql://localhost:5432/mydatabase

And the database doesn't have a password (local) so I can log in to it like this on the command line:

 $ psql mydatabase

Which assumes the username peterbe, which is what I'm logged in as. So, this is a shortcut for doing this:

 $ psql mydatabase -U peterbe

Which assumes a blank/empty password.


> Read the whole text (145 more words)

15th of February

Web development, Python

How I profile my Nginx + proxy pass server

Like so many others you probably have an Nginx server sitting in front of your application server (Django, Zope, Rails). The Nginx server serves static files right off the filesystem and when it doesn't do that it proxy passes the request on to the backend. You might be using proxy_pass, uwsgi or fastcgi_pass or at least something very similar. Most likely you have an Nginx site configured something like this:

 server {
    access_log /var/log/nginx/mysite.access.log;
    location ^~ /static/ {
        root /var/lib/webapp;
        access_log off;
    }
    location / {
        proxy_pass http://localhost:8000;
    }
 }

What I do is that I add an access log directive that times every request. This makes it possible to know how long every non-trivial request takes for the backend to complete:

 # log_format is only allowed in the http context, so it goes outside server { }
 log_format timed_combined '$remote_addr - $remote_user [$time_local]  '
                           '"$request" $status $body_bytes_sent '
                           '"$http_referer" "$http_user_agent" $request_time';

 server {
    access_log /var/log/nginx/timed.mysite.access.log timed_combined;

    location ^~ /css/ {
        root /var/lib/webapp/static;
        access_log off;
    }
    location / {
        proxy_pass http://localhost:8000;
    }
 }
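Once the log has the request time as its last field, a few lines of Python are enough to see which URLs are slow. This is only a sketch under the assumption that every line follows the timed_combined format above; the function name slowest_paths and the sample log lines are made up for illustration.

```python
import re
from collections import defaultdict

# Matches the quoted request, the status and the trailing $request_time.
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) .* (?P<time>\d+\.\d+)$'
)


def slowest_paths(lines, top=5):
    """Return (path, average_seconds, hits) tuples, slowest first."""
    totals = defaultdict(lambda: [0, 0.0])  # path -> [hits, total seconds]
    for line in lines:
        match = LINE_RE.search(line.strip())
        if not match:
            continue  # skip lines that don't follow the expected format
        entry = totals[match.group('path')]
        entry[0] += 1
        entry[1] += float(match.group('time'))
    averages = [
        (path, total / hits, hits) for path, (hits, total) in totals.items()
    ]
    return sorted(averages, key=lambda item: item[1], reverse=True)[:top]


# Invented sample lines in the timed_combined format.
sample = [
    '127.0.0.1 - - [15/Feb/2011:10:00:00 +0000]  "GET / HTTP/1.1" 200 512 "-" "curl/7.21" 0.120',
    '127.0.0.1 - - [15/Feb/2011:10:00:01 +0000]  "GET /slow HTTP/1.1" 200 2048 "-" "curl/7.21" 1.500',
    '127.0.0.1 - - [15/Feb/2011:10:00:02 +0000]  "GET / HTTP/1.1" 200 512 "-" "curl/7.21" 0.080',
]
report = slowest_paths(sample)
```

In real use you would pass an open file handle for the timed access log instead of the sample list.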


> Read the whole text (259 more words)

20th of December

Python

To code or to pdb in Python

This feels like a bit of a face-plant moment but I've never understood why anyone would use the code module when you can use the pdb when the pdb is like the code module but with less.

What you use it for is to create your own custom shell. Django does this nicely with its shell management command. I often find myself doing this:

 $ python
 Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) 
 [GCC 4.4.3] on linux2
 Type "help", "copyright", "credits" or "license" for more information.
 >>> from this import that
 >>> from module import something
 >>> from db import connection
 >>> con = connection('0.0.0.0', bla=True)
 >>> con.insert(something(that))

And there's certain things I almost always import (depending on the project). So use code to write your own little custom shell loader that imports all the stuff you need. Here's one I wrote real quick. Ultra useful time-saver:

 #!/usr/bin/env python
 import code, re
 if __name__ == '__main__':
    from apps.main.models import *
    from mongokit import Connection
    from pymongo.objectid import InvalidId, ObjectId
    con = Connection()
    db = con.worklog
    print "AVAILABLE:"
    print '\n'.join(['\t%s'%x for x in locals().keys()
                     if re.findall('[A-Z]\w+|db|con', x)])
    print "Database available as 'db'"
    code.interact(local=locals())

This is working really well for me and saving me lots of time. Hopefully someone else finds it useful.

22nd of November

Python, Tornado

Welcome to the world: DoneCal.com

After about two months of evening hacking I'm finally ready to release my latest project: DoneCal.com

It's a simple calendar that doesn't get in your way. You just click on a day and type what you did that day. DoneCal can be an ideal replacement to boring spreadsheet-like timesheets. And unlike regular timesheets/timetrackers with tags you immediately get statistics about how you've spent your time.

I'm personally excited about the Bookmarklet because I practically live in my webbrowser and now I can quickly type what I've just done (could be a piece of support work for a client) with one single click.

If you're a project manager trying to track what your developers are working on, ask them to start tracking time on DoneCal and then ask them to share their calendar with you. They can set up their share so that it only shares on relevant tags.

I'm going to be improving it more and more as feedback comes in. Hopefully later this week I'm going to be writing about the technical side of this, since this is my first web app built with the uber-fast Tornado framework.

21st of November

Python

jsonpprint - a Python script to format JSON data nicely

This isn't rocket science but it might help someone else.

I often do testing of my various restful HTTP APIs on the command line with curl but often the format the server spits out is very compact and not easy to read. So I pipe it to a little script I've written. Used like this:

 $ curl http://worklog/api/events.json?u=1234 | jsonpprint
 {'events': [{'allDay': True,
             'end': 1290211200.0,
             'id': '4ce6a2096da6814e5b000000',
             'start': 1290211200.0,
             'title': '@DoneCal test sample'},
            {'allDay': True,
             'end': 1290729600.0,
             'id': '4ce6a22b6da6814e5b000001',
 ...
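The script itself isn't shown in this excerpt. Judging by the output above (single quotes, True instead of true), it presumably parses the JSON and pretty-prints the resulting Python object, so a minimal sketch under that assumption could look like this. The function name format_json and the sample string are made up for illustration.

```python
import json
from pprint import pformat


def format_json(text):
    """Parse a JSON string and return it pretty-printed as a Python object."""
    return pformat(json.loads(text))


# Hypothetical compact response, similar in shape to the one above.
compact = '{"title": "@DoneCal test sample", "allDay": true}'
pretty = format_json(compact)
```

To use it as a pipe target, the script would end with print(format_json(sys.stdin.read())) after importing sys.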


> Read the whole text (151 more words)

21st of October

Python, MongoDB

How I made my MongoDB based web app 10 times faster

MongoKit is a Python wrapper on top of pymongo that adds structure and validation and some other bits and pieces. It's like an ORM, not for an SQL database but for a document store like MongoDB. It's a great piece of code because it's thin. It's very careful not to mollycoddle you or restrict your access to the source. What I discovered was that I was doing an advanced query and the results were being instantiated as class instances and later turned into JSON for the HTTP response. Bad idea. I don't really need them to be objects, so with MongoKit it's possible to go straight to the source and that's what I did.

With a few very simple changes I managed to make my restful API app almost 10 times faster!!

Read the whole story here

13th of October

Python, Tornado

My tricks for using AsyncHTTPClient in Tornado

I've been doing more and more web development with Tornado recently. It's got an awesome class for running client HTTP calls in your integration tests. To run a normal GET it looks something like this:

 from tornado.testing import AsyncHTTPTestCase
 class ApplicationTestCase(AsyncHTTPTestCase):
    def get_app(self):
        return app.Application(database_name='test', xsrf_cookies=False)

    def test_homepage(self):
        url = '/'
        self.http_client.fetch(self.get_url(url), self.stop)
        response = self.wait()
        self.assertTrue('Click here to login' in response.body)

Now, to run a POST request you can use the same client. It looks something like this:

    def test_post_entry(self):
        url = '/entries'
        data = dict(comment='Test comment')
        from urllib import urlencode
        self.http_client.fetch(self.get_url(url), self.stop, 
                               method="POST",
                               body=urlencode(data))  # HTTPRequest takes the payload as body
        response = self.wait()
        self.assertEqual(response.code, 302)


> Read the whole text (519 more words)

27th of August

Python, Django

Musings about django.contrib.auth.models.User

It dawned on me that the auth user model that ships with Django is like the string built-in of a high level programming language. With the string built-in it's oh so tempting to add custom functionality to it like a fancy capitalization method or some other function that automatically strips whitespace or what not. Yes, I'm looking at you Prototype for example.

By NOT doing that, and leaving it as it is, you automatically manage to Keep It Simple Stupid and your application code makes sense to the next developer who joins your project.

I'm not a smart programmer but I'm a smart developer in that I'm good at keeping things pure and simple. It means I can't show off any fancy generators, monads or metaclasses but it does mean that fellow coders who follow my steps can more quickly hit the ground running.

My colleagues and I now have more than ten Django projects that rely on, without overriding, the django.contrib.auth.models.User class and there have been many times where I've been tempted to use it as a base class or something instead but in retrospect I'm wholeheartedly happy I didn't. The benefit isn't technical; it's a matter of teamwork and holistic productivity.

 
