

9th of March

Speed test between django_mongokit and postgresql_psycopg2

Following on from yesterday's blog about How and why to use django-mongokit I extended the exampleproject which is inside the django-mongokit project with another app called exampleapp_sql which does the same thing as the exampleapp but does it with SQL instead. Then I added a very simple benchmarker app in the same project and wrote three functions:

  1. One to create 10/100/500/1000 instances of my class
  2. One to edit one field of all 10/100/500/1000 instances
  3. One to delete each of the 10/100/500/1000 instances



8th of March

How and why to use django-mongokit (aka. Django to MongoDB)

Here I'm going to explain how to combine Django and MongoDB using MongoKit and django-mongokit.

MongoDB is a document store built for high speed and high concurrency with a very good redundancy story. It's an alternative to the relational databases (e.g. MySQL) that Django is tightly coupled with through its ORM (Object Relational Mapping). The equivalent for a document store is called an ODM (Object Document Mapping), for lack of a better acronym. That's where MongoKit comes in. It's written in Python, connects to MongoDB using a library called pymongo and turns data from MongoDB into instances of classes you have defined. MongoKit has nothing to do with Django. That's where django-mongokit comes in. Written by yours truly.



28th of February

Massive improvement on sorting a fat list

IssueTrackerMassContainer is a simple Zope product that is used to put a bunch of IssueTrackerProduct instances into. It doesn't add much apart from a nice looking dashboard that lists all recent issues and then with an AJAX poll it keeps updating automatically.

But what it was doing was it recursively put together all issues across all issue trackers, sorting them and then returning only the first 20. Fine, but once the numbers start to add up it can become a vast sort operation to deal with.

In my local development copy of 814 issues, with the help of pympler and time() I was able to go from using 7 MB and taking 2 seconds down to using only 8 KB and taking 0.05 seconds.
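As a rough illustration of the general idea only (this is not the IssueTrackerMassContainer code), Python's heapq.nlargest can pick the 20 most recent items without sorting everything. The Issue tuple and its modified field are made up for this sketch.

```python
import heapq
from collections import namedtuple

# Hypothetical stand-in for an issue object; only the timestamp matters here.
Issue = namedtuple("Issue", "title modified")


def most_recent(issue_iterables, count=20):
    """Return the `count` newest issues from several trackers, newest first."""
    def all_issues():
        for issues in issue_iterables:
            for issue in issues:
                yield issue

    # nlargest keeps at most `count` candidates around while it scans,
    # instead of building and sorting one big list.
    return heapq.nlargest(count, all_issues(), key=lambda issue: issue.modified)
```

Whether this beats a plain sort depends on how many issues there are compared to `count`, so it is worth measuring on real data.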



18th of January

Tip: creating a Xapian database in Python

This cost me some hair-pulling today as I was trying to write a custom test runner for a Django project I'm working on that creates a test Xapian database just for running the tests. Basically, you can't do this:

 os.mkdir(database_file_path)

Because if you do you end up getting these strange DatabaseOpeningError exceptions. So, here's how you do it:

 import xapian
 xapian.WritableDatabase(database_file_path,
                         xapian.DB_CREATE_OR_OPEN)

Hopefully by blogging about this some other poor coder will save some time.

17th of November

Comparing YUI Compressor and slimmer

YUI Compressor apparently supports compressing CSS. Cool! I had to try it, and what's even cooler is that it doesn't choke on CSS hacks (the ones I tried). The only other CSS minifier/compressor I know of that manages that is my very own slimmer. So, let's see what the difference is.

Running the YUI Compressor 10 times on a relatively small file takes 0.3 seconds on average. Running the same with Python 2.6 and slimmer.css_slimmer takes 0.1 seconds on average. I think most of that time is spent loading the jar file rather than actually running the compression.



8th of November

Mocking os.stat in Python

I have some code that checks that a file is treated differently if some time has passed. In other words, it reacts if the modification time is more than it was before. The code uses os.stat(filename)[stat.ST_MTIME] to do this. The challenge was to mock os.stat so that I can pretend some time has passed without having to wait. Here was my first solution which made running many tests sloooow:

 def test_file_changed(self):
     filename = 'foo.txt'
     expect_timestamp = int(time.time())
     # ...run the unit test with 'expect_timestamp'...
     time.sleep(1)  # this wait is what makes the test slow
     expect_timestamp += 1
     # ...run the unit test with 'expect_timestamp'...

So, here's how I mock os.stat to avoid having to do the time.sleep(1) in the test:

 import os
 import stat
 import time

 def test_file_changed(self):
     filename = 'foo.txt'
     expect_timestamp = int(time.time())
     # ...run the unit test with 'expect_timestamp'...
     orig_os_stat = os.stat

     def fake_stat(arg):
         if arg == filename:
             # pretend the file was modified one second later
             faked = list(orig_os_stat(arg))
             faked[stat.ST_MTIME] = faked[stat.ST_MTIME] + 1
             return os.stat_result(faked)
         else:
             return orig_os_stat(arg)

     os.stat = fake_stat
     try:
         expect_timestamp += 1
         # ...run the unit test with 'expect_timestamp'...
     finally:
         os.stat = orig_os_stat  # always put the real os.stat back

I hope this helps someone else who's trying to do the same. It took me some time to figure out that os.stat is used by lots of various sub routines in the os module so it's important to only mock the relevant argument otherwise you might get unexpected problems.
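On Python 3 the same trick can be written with unittest.mock from the standard library, which puts the real os.stat back automatically. This is a minimal sketch; the helper names shifted_stat and mtime_with_shift are made up for illustration.

```python
import os
import stat
from unittest import mock


def shifted_stat(target, seconds):
    """Build an os.stat replacement that ages only `target` by `seconds`."""
    real_stat = os.stat  # keep a reference to the real function

    def fake_stat(path, *args, **kwargs):
        result = real_stat(path, *args, **kwargs)
        if path != target:
            return result  # every other path is left untouched
        fields = list(result)  # the ten classic integer fields
        fields[stat.ST_MTIME] += seconds
        return os.stat_result(fields)

    return fake_stat


def mtime_with_shift(filename, seconds):
    """Read the modification time while the patch is active."""
    # mock.patch.object restores the real os.stat when the block exits.
    with mock.patch.object(os, "stat", shifted_stat(filename, seconds)):
        return os.stat(filename)[stat.ST_MTIME]
```

As in the hand-rolled version, only the one filename is faked, because os.stat is called from many other places in the standard library.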

19th of October

What I hate about PIL and Image in Python

One really annoying thing about PIL is that it's importable as both Image and PIL. It leads me and other newbies to wonder whether they're different. I don't want choices:

 Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
 [GCC 4.3.3] on linux2
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import PIL
 >>> import Image
 >>> ?

When PIL/Image is put into standard lib, can we call the module: imaging?

22nd of September

My first Twitter app - KungFuPeople.com

I've just finished my first Twitter app. It's basically just about using OAuth to allow people to sign up to KungFuPeople.com without having to pick yet another password.

I simply took the oauth.py module by Leah Culver and wrapped it with some useful functions taken from a similar Twitter app we've done at work.

Unlike other Twitter apps, this one uses Twitter solely for handling authorization and authentication. That means that it has to work with the existing user + profile functionality but just side-step the sign up and login.

Next goal: Google OAuth

17th of September

Comparing jsmin and slimmer

JSMIN - The Javascript Minifier is written in C and it does an excellent job of minifying Javascript code. After all, Douglas Crockford wrote it. I noticed there's a Python implementation of it so I wanted to see how it stacks up against my slimmer which is also written in Python.

For sake of argument I compiled the C version and ran that in my little benchmark and did so by using the subprocess module. Also, for the sake of comparison I threw in a run with YUI Compressor. Here are some quick results:

 On js/signup-core.js
 --------------------
 js_slimmer
 from 9708 to 6905 in 0.0245039463043 seconds
 jsmin
 from 9708 to 6720 in 0.0850019454956 seconds
 jsmin.c
 from 9708 to 6721 in 0.0026159286499 seconds
 yuicompressor
 from 9708 to 6102 in 0.914173126221 seconds

 On js/zoom.js 
 -------------
 js_slimmer
 from 5920 to 3712 in 0.0106379985809 seconds
 jsmin
 from 5920 to 3582 in 0.0582370758057 seconds
 jsmin.c
 from 5920 to 3583 in 0.00282216072083 seconds
 yuicompressor
 from 5920 to 2771 in 0.839382171631 seconds

 On js/diypack.js
 ----------------
 js_slimmer
 from 21559 to 14059 in 0.0409741401672 seconds
 jsmin
 from 21559 to 13655 in 0.177556037903 seconds
 jsmin.c
 from 21559 to 13656 in 0.00346994400024 seconds
 yuicompressor
 from 21559 to 11638 in 0.891603946686 seconds

So, roughly, slimmer is 4 times faster than jsmin.py but leaves the output 100 to 400 bytes larger. jsmin.c is somewhere between 4 and 12 times faster than slimmer in these runs but is awkward since it's in C. I guess jsmin.c is the way forward when you want speed and the best result. slimmer has the advantage of being all in Python and on PyPI, and it contains functions for CSS, HTML and XHTML as well.

It's clear the YUI Compressor does a wicked job at minifying, but running a .jar file every time in a subprocess is crazily slow, if that matters to you.

14th of September

Python Code Dojo London - 17 Sep 2009

If you're on the python-uk mailing list you will have already seen this but if you're not, here we go.

Fry-IT is hosting a Code Dojo in our offices. It's free and open to anyone. My colleague Nicholas has written up a little bit about what a Code Dojo is which should get you excited.

Details are available on this page which is also the place to go to secure your place. Currently there are 12 people who say they're coming and we've decided to cap the geek influx to 30 people.

Cheers,
Peter-san

17th of August

Google Reverse Geocoding vs. GeoNames

I've been experimenting with the new Google Reverse Geocoding which allows you to get the location name and country and stuff from a latitude/longitude coordinate.

What I've been doing is comparing this with GeoNames. GeoNames is available from geopy in the reverse-geocode branch.

I wrote down a list of about 15 lat/long points and the results I expect from them (taken from an existing app that I'm contemplating switching over to Google Reverse Geocoding) and ran a batch of timed tests on them. These results might satisfy the impatient:

 FAILURES:
 geonames_json        0
 google               0
 geonames             12

 TOTAL TIMES:
 geonames_json        2.43582677841        0.143283928142 seconds/request
 google               2.24999976158        0.132352927152 seconds/request
 geonames             1.78063511848        0.104743242264 seconds/request



20th of July

gorun.py - Using (py)inotify to run commands when files change

By popular demand I've made my little pyinotify wrapper available for download. It's nothing fancy really but damn useful and productive.

It relies on inotify (so you're stuffed on OSX and Windows) which makes it very fast and efficient (as opposed to periodic polling and file modification time comparisons).

At the moment it's actually quite generic for any command and any file but I'm hoping to take this to the next level with some magic dust that automatically only runs unit tests that fail or something. We'll see what happens.



15th of July

setuptools usability - not good, what can be done?

Gun to your head; what would it take to make setuptools easy to use as a package author?

I've spent far too long today trying to create a package for a little piece of code I've written. Because I can never remember all the bizarre options and commands to setup.py I tried to do it by following Tarek Ziade's wonderful Expert Python Programming but I still got stuck.

Granted, I did not read the f**n manual. Why should I have to? I've got more important things to do such as eating cookies and watching tv.



11th of July

premailer.py - Transform CSS into line style attributes with lxml.html

By blogging about it I can pretty much guarantee that someone will comment and say "Hey, why didn't you use the pypi/alreadyexists package which does the same thing but better". I couldn't find one after a quick search and I felt the hacker mood creeping up on me, begging me to (re)invent it.

premailer.py takes an HTML page, finds all CSS blocks and transforms these into style attributes. For example, from this:

 <html>
   <head>
     <title>Test</title>
     <style>
     h1, h2 { color:red; }
     strong {
       text-decoration:none
     }
     </style>
   </head>
   <body>
     <h1>Hi!</h1>
     <p><strong>Yes!</strong></p>
   </body>
 </html>

You get this:

 <html>
   <head>
     <title>Test</title>
   </head>
   <body>
     <h1 style="color:red">Hi!</h1>
     <p><strong style="text-decoration:none">Yes!</strong></p>
   </body>
 </html>

Why is this useful? When you're writing HTML emails. Like this newsletter app that I'm working on.

I just wrote it late yesterday and it needs lots of work to impress but for the moment it works for me. If I take the time to tidy it up properly I'll turn it into a package. Assuming there isn't one already :)

UPDATE

Now available on github.com and as a PyPI package

11th of May

Most unusual letters in English language

I needed to find out what are the least used letters in the English language. I pulled down a list of about 100,000+ English words, split them all and made a list of about 1,000,000 letters. Sorted them by usage and came up with this as the result:

 esiarntoldcugpmhbyfkwvzxjq

It would be interesting to make a heatmap of this over an image of a QWERTY keyboard.
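The counting step can be sketched in a few lines with collections.Counter. The function name and the tiny word list in the usage note are made up for illustration; the result above came from a much larger word list.

```python
from collections import Counter


def letters_by_frequency(words):
    """Return the letters a-z found in `words`, most common first."""
    counts = Counter(
        char
        for word in words
        for char in word.lower()
        if "a" <= char <= "z"  # ignore digits, hyphens, apostrophes etc.
    )
    return "".join(letter for letter, _ in counts.most_common())
```

Calling letters_by_frequency(["aab", "abc"]) counts three a's, two b's and one c, so the least used letter ends up last in the returned string.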



8th of May

To JSON, Pickle or Marshal in Python

I was reading David Cramer's tip to use JSONField in Django to be able to store arbitrary fields in a SQL database. Nice. But is it fast enough? Well, I can't answer that but I did look into the difference in read/write performance between simplejson, cPickle and marshal.

Only reading:

 JSON 0.00593531370163
 PICKLE 0.0109532237053
 MARSHAL 0.00413788318634

Reading and writing:

 JSON 0.0434390544891
 PICKLE 0.0289686655998
 MARSHAL 0.00728442907333

Clearly marshal is faster but to quote the documentation:

"Warning: The marshal module is not intended to be secure against erroneous or maliciously constructed data. Never unmarshal data received from an untrusted or unauthenticated source."

Clearly simplejson is a very fast reader and the JSON format has the delicious advantage that it's "human readable" (compared to the others).
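A comparison like this can be sketched with timeit. This is not the script behind the numbers above: it assumes Python 3, where the standard json and pickle modules stand in for simplejson and cPickle, and the SAMPLE data is made up.

```python
import json
import marshal
import pickle
import timeit

# Made-up sample; real results depend heavily on the shape of the data.
SAMPLE = {"name": "Peter", "tags": ["python", "django"], "age": 28, "score": 1.5}

CODECS = {
    "JSON": (json.dumps, json.loads),
    "PICKLE": (pickle.dumps, pickle.loads),
    "MARSHAL": (marshal.dumps, marshal.loads),
}


def time_codecs(data=SAMPLE, number=1000):
    """Time `number` write+read round trips per codec; returns seconds by name."""
    results = {}
    for name, (dump, load) in CODECS.items():
        results[name] = timeit.timeit(lambda: load(dump(data)), number=number)
    return results
```

Printing each name and value from time_codecs() gives a listing in the same shape as the ones above, though the absolute numbers will differ by machine and Python version.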

NOTE! I spent about 5 minutes putting together the script and about 10 minutes writing this so feel free to doubt its scientific accuracy.



22nd of April

Git + Twitter = Friedcode

I've now written my first Git hook. For the people who don't know what Git is, you have either lived under a rock for the past few years or you're not into computer programming at all.

The hook is a post-commit hook and what it does is that it sends the last commit message up to a twitter account I called "friedcode". I guess it's not entirely useful but for you who want to be loud about your work and the progress you make I guess it can make sense. Or if you're a team and you want to get a brief overview of what your team mates are up to. For me, it was mostly an experiment to try Git hooks and pytwitter. Here's how I did it:


>Read the whole text (252 more words)

14th of February

To assert or assertEqual in Python unit testing

When you write unit tests in Python you can use these widgets:

 self.assertEqual(var1, var2, msg=None)
 self.assertNotEqual(var1, var2, msg=None)
 self.assertTrue(expr, msg=None)
 self.assertRaises(exception, func, para, meters, ...)

That's fine but is it "pythonic" enough? The alternative is to do it with "pure python". Eg:

 assert var1 == var2, msg
 assert var1 != var2, msg
 assert expr, msg
 try:
    func(para, meter)
 except exception:
    pass  # the expected exception was raised
 else:
    raise AssertionError('%s not raised' % exception)

I'm sure there are several benefits with using the unittest methods that I don't understand but I understand the benefits of brevity and readability. The more tests you write the more tedious it becomes to write self.assertEqual(..., ...) every time. In my own code I prefer to use simple assert statements rather than the verbose unittest alternative. Partially because I'm lazy and partially because they read better and the word assert is highlighted in red in my editor so it just looks nicer from a distance.

Perhaps some much more clever people than me can explain what a cardinal sin it is to not use the unittest methods over the lazy more pythonic ones.

Incidentally, during the course of jotting down this blog I reviewed some old inherited code and changed this:

 self.assertEqual(len(errors),0)

into this:

 assert not errors

Isn't that just nicer to use/read/write?
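One practical difference is the failure message. A small sketch, assuming Python 3 where unittest.TestCase can be instantiated directly; the helper name failure_messages is made up.

```python
import unittest


def failure_messages(first, second):
    """Return (unittest_message, bare_assert_message) for an unequal pair."""
    case = unittest.TestCase()
    try:
        case.assertEqual(first, second)
    except AssertionError as error:
        unittest_message = str(error)  # describes both values
    else:
        unittest_message = None
    try:
        assert first == second
    except AssertionError as error:
        bare_message = str(error)  # whatever message the assert carried
    else:
        bare_message = None
    return unittest_message, bare_message
```

For two different lists the unittest message spells out both values and where they differ, while the bare assert carries only the message you give it. Note also that plain assert statements are skipped entirely when Python runs with the -O flag.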

4th of December

bool is instance of int in Python

I lost about half an hour just moments ago debugging this and pulling out a fair amount of hair. I had some code that looked like this:

 result = []
 for key, value in data.items():
    if isinstance(value, int):
        result.append(dict(name=key, value=value, type='int'))
    elif isinstance(value, float):
        result.append(dict(name=key, value=value, type='float'))
    elif isinstance(value, bool):
        result.append(dict(name=key, type='bool',
                           value=value and 'true' or 'false'))
 ...

It looked so simple but further up the tree I never got any entries with type="bool" even though I knew there were boolean values in the dictionary.

The pitfall I fell into was this:

 >>> isinstance(True, bool)
 True
 >>> isinstance(False, bool)
 True
 >>> isinstance(True, int)
 True
 >>> isinstance(False, int)
 True

Not entirely obvious if you ask me. The solution in my case was just to change the order of the if and the elif so that bool is tested first.
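With the order swapped, the loop looks roughly like this. It is a sketch in Python 3 syntax, wrapped in a function called describe_values for illustration.

```python
def describe_values(data):
    """Tag each value with its type name, checking bool before int."""
    result = []
    for key, value in data.items():
        if isinstance(value, bool):  # must come first: bool subclasses int
            result.append(dict(name=key, type="bool",
                               value="true" if value else "false"))
        elif isinstance(value, int):
            result.append(dict(name=key, value=value, type="int"))
        elif isinstance(value, float):
            result.append(dict(name=key, value=value, type="float"))
    return result
```

The underlying reason is that bool is a subclass of int, so issubclass(bool, int) is True and any isinstance(value, int) test also matches True and False.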

19th of November

domstripper - A lxml.html test project

I'm just playing with the impressive lxml.html package. It makes it possible to easily work with HTML trees and manipulate them.

I had this crazy idea of a "DOM stripper" that removes all but specified elements from an HTML file. For example you want to keep the contents of the <head> tag intact but you just want to keep the <div id="content">...</div> tag thus omitting <div id="banner">...</div> and <div id="nav">...</div>. domstripper now does that. This can be used for example as a naive proxy that transforms a bloated HTML page into a more stripped down smaller version suitable for say mobile web browsers. It's more a proof of concept than anything else.

To test you just need a virtual python environment and the right system libs needed to install lxml. This worked for me:

 $ sudo apt-get install cython libxslt1-dev zlib1g-dev libxml2-dev
 $ cd /tmp
 $ virtualenv --no-site-packages testenv
 $ cd testenv
 $ source bin/activate
 $ easy_install domstripper

Now you can use it like this:

 >>> from domstripper import domstripper
 >>> help(domstripper)
 ...
 >>> domstripper('bloat.html', ['#content', 'h1.header'])
 <!DOCTYPE...
 ...

Best to just play with it and see if it makes sense. I'm not saying this is an amazing package but it goes to show what can be done with lxml.html and the extremely user friendly CSS selectors.

18th of September

The importance of env (and how it works with virtualenv)

I have for a long time wondered why I'm supposed to use this in the top of my executable python files:

 #!/usr/bin/env python

Today I figured out why.

The alternative, which you see a lot around is something like this:

 #!/usr/bin/python

Here's why it's better to use env rather than the direct path to the executable: virtualenv. Perhaps there are plenty of other reasons the Linux experts can teach me but this is now my first obvious benefit of doing it the way I'm supposed to do it.

If you create a virtualenv, enter it and activate it so that writing:

 $ python 

starts the python executable of the virtual environment, then this will be respected if you use the env shebang header. Good to know.
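What env does here is look the command up on PATH, and activating a virtualenv puts its bin directory first on PATH. The lookup can be imitated in Python with shutil.which. This is a sketch assuming a POSIX system; the function name resolve_like_env is made up.

```python
import os
import shutil


def resolve_like_env(command, search_path=None):
    """Return the first executable named `command` on the search path."""
    if search_path is None:
        search_path = os.environ.get("PATH", os.defpath)
    # shutil.which walks the directories in order, as the env lookup does.
    return shutil.which(command, path=search_path)
```

Running resolve_like_env("python") before and after activating a virtualenv should show the path switching to the virtualenv's own interpreter, whereas a hard-coded #!/usr/bin/python never changes.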

16th of September

The stupidity of 'id' as a variable name (or stupidity of me)

Both in Zope2 and in Django you need to work with attributes called id. This is a shame since it's such a huge pitfall. Despite having done Python programming for so many years I today fell into this pitfall twice!! The pitfall is that id is a builtin function, not a suitable variable name. I ran into it because I was changing a complex app to use something called the UUID as the identifier instead of the ID, which happened to be the name of a primary key in a table.

This meant lots of changes and I tested and tested and kept getting really strange errors. I took the whole thing apart and put it back together when I discovered my error of checking if variable id was set or not. id, if undefined, defaults to the builtin function id() which will always return true on bool(id).
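The mistake boils down to something like this. The function names and the record dictionary are made up for illustration.

```python
def has_identifier_buggy(record):
    """Mistake: `id` is never assigned here, so this tests the builtin."""
    return bool(id)  # a function object is always truthy


def has_identifier_fixed(record):
    """Use a distinct name so a missing value is really detected."""
    record_id = record.get("id")
    return record_id is not None
```

The buggy version reports an identifier even for an empty record, and no NameError is raised to give the game away.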

It's been a long day. I'm going home. Two newbie mistakes in one programming session. I'm sure I'm not the only one who's been trapped by this.

12th of July

Python new-style classes and the super() function

I've never really understood the impact of new-style Python classes and what it means to your syntax until now. With new-style classes you can use the super() builtin, otherwise you can't. This works for new-style classes:

 class Farm(object):
    def __init__(self): pass

 class Barn(Farm):
    def __init__(self):
        super(Barn, self).__init__()

If you want to do the same for old-style classes you simply can't use super() so you'll have to do this:

 class Farm:
    def __init__(self): pass

 class Barn(Farm):
    def __init__(self):
        Farm.__init__(self)

Strange that I've never realised this before. The reason I did now was that I had to back-port some code into Zope 2.7 which doesn't support setting security on new-style classes.

Now I need to do some reading on new-style classes because clearly I haven't understood it all.
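For comparison, in Python 3 every class is new-style, so the explicit object base is no longer needed and super() can be called without arguments. The buildings attribute is made up for this sketch.

```python
class Farm:
    def __init__(self):
        self.buildings = []


class Barn(Farm):
    def __init__(self):
        # Python 3 shorthand for super(Barn, self).__init__()
        super().__init__()
        self.buildings.append("barn")
```

The old-style spelling Farm.__init__(self) still works there too, but it hard-codes the parent class name.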

15th of May

split_search() - A Python functional for advanced search applications

Inspired by Google's way of working I today put together a little script in Python for splitting a search. The idea is that you can search by entering certain keywords followed by a colon like this:

 Free Text name:Peter age: 28

And this will be converted into two parts:

 'Free Text'
 {'name': 'Peter', 'age': '28'}

You can configure which keywords should be recognized and to make things simple, you can basically set this to be the columns you have to do advanced search on in your application. For example (from_date,to_date)

Feel free to download and use it as much as you like. You might not agree completely with its purpose and design so you're allowed to change it as you please.

Here's how to use it:

 $ wget http://www.peterbe.com/plog/split_search/split_search.py
 $ python
 >>> from split_search import split_search
 >>> free_text, parameters = split_search('Foo key1:bar', ('key1',))
 >>> free_text
 'Foo'
 >>> parameters
 {'key1': 'bar'}
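To give a feel for how such a split can work, here is a minimal regex-based sketch. It is not the downloadable split_search.py, and the name split_search_sketch is made up; it only covers the simple cases shown above.

```python
import re


def split_search_sketch(query, keywords):
    """Split `query` into free text and a dict of keyword:value parameters."""
    if not keywords:
        return query.strip(), {}
    # One capturing group so re.split keeps the matched keyword names.
    pattern = re.compile(
        r"\b(%s):\s*" % "|".join(re.escape(key) for key in keywords)
    )
    parts = pattern.split(query)
    free_text = parts[0].strip()
    # After the free text, parts alternate keyword, value, keyword, value...
    parameters = {}
    for key, value in zip(parts[1::2], parts[2::2]):
        parameters[key] = value.strip()
    return free_text, parameters
```

In this sketch each value runs until the next recognised keyword, and words followed by a colon that are not in keywords stay part of the text around them.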

29th of April

Releasing IssueTrackerProduct 0.9

Tonight I released an experimental version of the IssueTrackerProduct that is packed with new cool stuff. I call this an experimental release (but I run it on my production systems) because it's got so many new features.

During the course of preparing for this release and writing the news item I deployed the latest version to real.issuetrackerproduct.com and immediately noticed two bugs to do with user names. So I immediately fixed those and prepared a new release minutes after. I expect to release another more stable version within a few weeks.

10th of March

See you at PyCon 2008

I'm going to Chicago on Wednesday for the PyCon 2008 conference. I'm going to stay at the Crowne Plaza (or whatever it was called) like many of the other people at the conference.

This is what I look like:

(photo: See you at PyCon 2008)

If you see this mug, go up to it and say Hi. It speaks British, Swedish and some American and loves food, beer and tea, which might be helpful to know if you feel like talking to it some more. Its interests for this conference are: Grok, Zope, Django, Plone, buildout, automated testing, agile development and Javascript. Its main claim-to-fame is an Open Source bug/issue tracker program called IssueTrackerProduct which it is more than delighted to talk about.

I've never been to Chicago before and I'm really excited about Tuesday night as I've bought tickets to a Chicago Bulls NBA game (basketball). All other nights I'm hoping to socialise, get drunk, get full and get down and dirty nerdy all week. See you there!

21st of February

CommandLineApp by Doug Hellmann

I just read the feature article "Command line programs are classes, too!" by Doug Hellmann in the January 2008 issue of Python Magazine about his program CommandLineApp and I've tried it out on one of my old Python programs where I do the opt parsing manually with getopt. The results are beautiful and quick. It's sprinkled with Doug-specific magic but I quickly got over that when I saw how easy it was to work with. There are still a few things I didn't manage to work out but that will unfortunately have to wait.

If anything, the worst thing about this library is that it's not part of the standard library so either you have to tell people to sudo easy_install CommandLineApp in the instructions or include it yourself in your packages if you prefer to ship things with a kitchen sink included.

If you want to check it out in action, either subscribe to the magazine (and support the effort) or just download csvcat

22nd of December

String comparison function in Python (alpha)

I was working on a unittest which when it failed would say "this string != that string" and because some of these strings were very long (output of an HTML lib I wrote which spits out snippets of HTML code) it became hard to spot how they were different. So I decided to override the usual self.assertEqual(str1, str2) in Python's unittest class instance with this little baby:

 def assertEqualLongString(a, b):
    NOT, POINT = '-', '*'
    if a != b:
        print a
        o = ''
        for i, e in enumerate(a):
            try:
                if e != b[i]:
                    o += POINT
                else:
                    o += NOT
            except IndexError:
                o += '*'

        o += NOT * (len(a)-len(o))
        if len(b) > len(a):
            o += POINT* (len(b)-len(a))

        print o
        print b

        raise AssertionError, '(see string comparison above)'

It's far from perfect and doesn't really work when you've got Unicode characters that the terminal you use can't print properly. It might not look great on strings that are really really long but I'm sure that's something that can be solved too. After all, this is just a quick hack that helped me spot that the difference between one snippet and another was that one produced <br/> and the other produced <br />. Below are some examples of this utility function in action.
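As an aside, the marker idea can be written more compactly on Python 3. This is a rough sketch rather than a port of the function above, and the names diff_markers and format_difference are made up.

```python
from itertools import zip_longest


def diff_markers(a, b, same="-", differ="*"):
    """Return one marker per position: `same` where a and b agree, else `differ`."""
    return "".join(
        same if x == y else differ
        for x, y in zip_longest(a, b)  # the shorter string is padded with None
    )


def format_difference(a, b):
    """Stack a, the marker line and b so the differences line up visually."""
    return "\n".join([a, diff_markers(a, b), b])
```

For the <br/> versus <br /> case, the markers start flagging positions from the fourth character onwards, which is where the extra space appears.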


>Read the whole text (145 more words)

17th of December

Calculator in Python for dummies

I need a mini calculator in my web app so that people can enter basic mathematical expressions instead of having to work it out themselves and then enter the result in the input box. I want them to be able to enter "3*2" or "110/3" without having to do the math first. I want this to work like a pocket calculator such that 110/3 returns 36.6666666667 and not 36 like pure Python arithmetic would. Here's a solution that works, but it still divides like Python does:

 import math

 def safe_eval(expr, symbols={}):
    return eval(expr, dict(__builtins__=None), symbols)

 def calc(expr):
    return safe_eval(expr, vars(math))

 assert calc('3*2')==6
 assert calc('12.12 + 3.75 - 10*0.5')==10.87
 assert calc('110/3')==36
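A different way to get pocket-calculator behaviour, shown here only as a sketch and not as the solution from the rest of this post, is to walk the parsed expression and allow nothing but arithmetic. It assumes Python 3.8 or later, where / is already true division; the name calc_sketch is made up.

```python
import ast
import operator

# Only plain arithmetic is allowed; anything else raises ValueError.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,  # true division, so 110/3 is not truncated
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def calc_sketch(expr):
    """Evaluate +, -, *, / and parentheses on numbers, nothing else."""
    def evaluate(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](evaluate(node.operand))
        raise ValueError("unsupported expression: %r" % expr)

    return evaluate(ast.parse(expr, mode="eval").body)
```

Unlike the eval() version, this sketch does not expose the math functions, and names or function calls in the input are rejected rather than evaluated.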



13th of December

WSSE Authentication and Apache

I recently wrote a Grok application that implements a REST API for Atom Publishing so that I can connect a website I have to my new Nokia phone, which has LifeBlog, which uses the Atom API to talk to the server.

Anyway, the authentication on Atom is WSSE (good introduction article) which basically works like this:

 PasswordDigest = Base64(SHA1(Nonce + CreationTimestamp + Password))

This is one of the pieces in a request header called Authorization which can look something like this:

 Authorization: WSSE profile="UsernameToken"
 X-WSSE: UsernameToken Username="bob", PasswordDigest="quR/EWLAV4xLf9Zqyw4pDmfV9OY=",
   Nonce="d36e316282959a9ed4c89851497a717f", Created="2003-12-15T14:43:07Z"

What I did was I wrote a simple Python script to mimic what the Nokia does but from a script. The script creates a password digest using these python modules: sha, binascii and base64 and then fires off a POST request. Here's the thing: if you generate this header with base64.encodestring(ascii_string) you get something like this:

 quR/EWLAV4xLf9Zqyw4pDmfV9OY=\n

Notice the extra newline character at the end of the base64 encoded string. This is perfectly valid and is decoded easily with base64.decodestring(base64_string) by the Grok app. Everything was working fine when I tried posting to http://localhost:8080/++rest++atompub/snapatom and my application successfully authenticated the dummy user. I was happy.

Then I set this up properly on atom.someotherdomain.com which was managed by Apache, which internally rewrote the URL to a Grok on localhost:8080. The problem now was that the Authentication header value was broken into two lines because of the newline character and then the whole request was rejected by Apache because some header lines came without a : (colon).

The solution was to not use base64.encodestring() and base64.decodestring() but to instead use base64.urlsafe_b64encode() and base64.urlsafe_b64decode(). Let me show you:

 >>> import base64
 >>> x = 'Peter'
 >>> base64.encodestring(x)
 'UGV0ZXI=\n'
 >>> base64.urlsafe_b64encode(x)
 'UGV0ZXI='
 >>> base64.decodestring(base64.urlsafe_b64encode(x))
 'Peter'

If you're still reading, then hopefully you won't make the same mistake I did and waste time trying to debug Apache. The lesson learned is to use the URL safe base64 functions for header values and not the usual ones.
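For reference, the digest formula can be sketched on Python 3 with hashlib. Two assumptions to flag: the pieces are concatenated as plain UTF-8 text, and the sketch uses base64.b64encode, which also never appends a newline, instead of the urlsafe variant. The function name password_digest is made up.

```python
import base64
import hashlib


def password_digest(nonce, created, password):
    """Base64(SHA1(nonce + created + password)) as a single-line string."""
    raw = hashlib.sha1((nonce + created + password).encode("utf-8")).digest()
    # b64encode never appends a newline, unlike encodebytes (the Python 3
    # name for encodestring).
    return base64.b64encode(raw).decode("ascii")
```

The difference between the two encoders is that the urlsafe variant writes - and _ where the standard alphabet has + and /, so both ends need to agree on which one is in use.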

 
