Python

String comparison function in Python (alpha)


22nd of December 2007

I was working on a unit test which, when it failed, would say "this string != that string". Because some of these strings were very long (the output of an HTML lib I wrote which spits out snippets of HTML code), it became hard to spot how they differed. So I decided to override the usual self.assertEqual(str1, str2) on my unittest test case class with this little baby:

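 def assertEqualLongString(self, a, b):
     # NOT marks a position where a and b agree, POINT one where they differ
     NOT, POINT = '-', '*'
     if a != b:
         print a
         o = ''
         for i, e in enumerate(a):
             try:
                 if e != b[i]:
                     o += POINT
                 else:
                     o += NOT
             except IndexError:
                 # b is shorter than a, so the rest of a counts as different
                 o += POINT

         # b is longer than a, so the rest of b counts as different
         if len(b) > len(a):
             o += POINT * (len(b) - len(a))

         print o
         print b

         raise AssertionError, '(see string comparison above)'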

It's far from perfect and doesn't really work when you've got Unicode characters that your terminal can't print properly. It might not look great on really long strings either, but I'm sure that can be solved too. After all, this is just a quick hack that helped me spot that the difference between one snippet and another was that one produced <br/> and the other produced <br />. Below are some examples of this utility function in action.

Note that you can use this in many different ways to fit your needs, so I'm not going to focus on how it's executed:

 u = MyUnittest()
 u.assertEqualLongString('Peter Bengtsson 123', 'Peter PengtsXon 124'); print ""
 u.assertEqualLongString('Bengtsson','Bengtzzon'); print ""
 u.assertEqualLongString('Bengtsson','BengtzzonLonger'); print ""
 u.assertEqualLongString('BengtssonLonger','Bengtzzon'); print ""
 u.assertEqualLongString('Bengtssonism','Bengtsson'); print ""

 # Results:

 Peter Bengtsson 123
 ------*-----*-----*
 Peter PengtsXon 124

 Bengtsson
 -----**--
 Bengtzzon

 Bengtsson
 -----**--******
 BengtzzonLonger

 BengtssonLonger
 -----**--******
 Bengtzzon

 Bengtssonism
 ---------***
 Bengtsson
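
The long-string problem mentioned earlier could be handled by printing the comparison in chunks. Here is a minimal sketch of that idea, assuming Python 3 rather than the Python 2 used above; the names diff_markers and format_comparison and the width parameter are made up for illustration and are not part of the original hack:

```python
def diff_markers(a, b, same='-', diff='*'):
    """Return one marker per position: `same` where a and b agree, `diff` elsewhere."""
    length = max(len(a), len(b))
    return ''.join(
        # Positions past the end of either string count as differences
        same if i < len(a) and i < len(b) and a[i] == b[i] else diff
        for i in range(length)
    )


def format_comparison(a, b, width=60):
    """Lay out a, the marker line and b in chunks of at most `width` characters."""
    markers = diff_markers(a, b)
    blocks = []
    for start in range(0, len(markers), width):
        end = start + width
        blocks.append('\n'.join([a[start:end], markers[start:end], b[start:end]]))
    # A blank line separates one chunk from the next
    return '\n\n'.join(blocks)


print(format_comparison('<p>one<br/>two</p>', '<p>one<br />two</p>', width=10))
```

Returning a string instead of printing directly means the text can also be passed as the message of an AssertionError, so it shows up in the test report itself.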



Comment

Ivo - 22nd December 2007
I love writing Python one-liners :)

"".join([x[0]==x[1] and "-" or "*" for x in map(None, a, b)])

In Python 2.5 you can use the ternary op to make it slightly more elegant.
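
For readers on Python 3, where map(None, a, b) is no longer accepted, a rough equivalent of the one-liner might look like the sketch below. It assumes itertools.zip_longest as the stand-in for the padding behaviour and uses the conditional expression; marker_line is just an illustrative name:

```python
from itertools import zip_longest


def marker_line(a, b):
    # zip_longest pads the shorter string with None, so extra characters count as differences
    return "".join("-" if x == y else "*" for x, y in zip_longest(a, b))


print(marker_line("Bengtsson", "BengtzzonLonger"))  # prints -----**--******
```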

Marius Gedminas - 22nd December 2007
Nice.

I've often used Python's difflib.ndiff to make test failures easier to understand, back when I wrote PyUnit-style unit tests. It was especially useful for multiline strings.

Nowadays with doctests getting readable diffs is easy (#doctest: +REPORT_NDIFF). Unless you're comparing several-thousand-line-long HTML pages in your functional tests.
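
A minimal sketch of the difflib.ndiff approach for multiline strings, assuming Python 3; ndiff_report is a hypothetical helper name, not something difflib provides, and the HTML snippets are made-up sample data:

```python
import difflib


def ndiff_report(expected, actual):
    """Return a line-by-line ndiff of two multiline strings as one string."""
    return '\n'.join(difflib.ndiff(expected.splitlines(), actual.splitlines()))


expected = '<ul>\n<li>one<br/></li>\n</ul>'
actual = '<ul>\n<li>one<br /></li>\n</ul>'
print(ndiff_report(expected, actual))
```

In the report, unchanged lines start with two spaces, lines only in the first string start with "- ", lines only in the second start with "+ ", and "? " lines point at the characters that changed within a line.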

pacificator - 22nd December 2007
Why don't you use the built-in difflib module?

Krys - 22nd December 2007
Hi Peter,

I did the same thing recently, but I used the built-in difflib module to show me where the differences are. Might be worth a look. :)

HTH

Krys - 22nd December 2007
Hi again,

Just FYI, I only posted a repeat of what others had already mentioned because with cookies off, I do not see any comments.

The warning about needing to turn on JavaScript in order to comment is nice, but with cookies off, your site looks like no one has commented.

Anyway, just thought I'd mention that. Sorry for the duplicate info.

Krys

Peter Bengtsson - 23rd December 2007
It's not about cookies, it's just that the caching is set to 1 hour which might have tricked you. I haven't had the time to find a good solution to this yet.

Ian Bicking - 26th December 2007
lxml.html.usedoctest (and lxml.doctestcompare) implement a smarter comparison for HTML fragments. It's not perfect, but it's also somewhat less sensitive to unimportant differences in the HTML.
 