I was working on a unit test that, when it failed, would say "this string != that string". Because some of these strings were very long (the output of an HTML library I wrote that spits out snippets of HTML code), it became hard to spot how they differed. So I decided to replace the usual self.assertEqual(str1, str2) from Python's unittest.TestCase with this little baby:


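def assertEqualLongString(self, a, b):
    # Meant to live on a unittest.TestCase subclass, hence `self`.
    # Marker characters: NOT where the strings agree, POINT where they differ.
    NOT, POINT = '-', '*'
    if a != b:
        print(a)
        o = ''
        for i, e in enumerate(a):
            try:
                if e != b[i]:
                    o += POINT
                else:
                    o += NOT
            except IndexError:
                # b is shorter than a, so everything past its end is a difference
                o += POINT

        # If b is the longer string, its extra tail is a difference too
        if len(b) > len(a):
            o += POINT * (len(b) - len(a))

        print(o)
        print(b)

        raise AssertionError('(see string comparison above)')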

It's far from perfect, and it doesn't really work when you've got Unicode characters that your terminal can't print properly. It might not look great on really long strings either, but I'm sure that can be solved too. After all, this is just a quick hack that helped me spot that the difference between one snippet and another was that one produced <br/> and the other produced <br />. Below are some examples of this utility function in action.

Note that you can hook this up in many different ways to fit your needs, so I'm not going to focus on how it's executed:


u = MyUnittest()
u.assertEqualLongString('Peter Bengtsson 123', 'Peter PengtsXon 124'); print("")
u.assertEqualLongString('Bengtsson', 'Bengtzzon'); print("")
u.assertEqualLongString('Bengtsson', 'BengtzzonLonger'); print("")
u.assertEqualLongString('BengtssonLonger', 'Bengtzzon'); print("")
u.assertEqualLongString('Bengtssonism', 'Bengtsson'); print("")

# Results:

Peter Bengtsson 123
------*-----*-----*
Peter PengtsXon 124

Bengtsson
-----**--
Bengtzzon

Bengtsson
-----**--******
BengtzzonLonger

BengtssonLonger
-----**--******
Bengtzzon

Bengtssonism
---------***
Bengtsson
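On the point about really long strings: one way it might be solved is to cut the three lines into fixed-width blocks so that each marker stays directly under the characters it refers to. This is only a rough sketch; the names marker_line and format_chunks and the width parameter are made up for illustration and are not part of the function above.

```python
def marker_line(a, b, same="-", diff="*"):
    """Return `same` where a and b agree and `diff` everywhere else."""
    longest = max(len(a), len(b))
    out = []
    for i in range(longest):
        # Positions past the end of either string count as differences.
        if i < len(a) and i < len(b) and a[i] == b[i]:
            out.append(same)
        else:
            out.append(diff)
    return "".join(out)


def format_chunks(a, b, width=60):
    """Return the three-line comparison split into blocks of `width` columns."""
    markers = marker_line(a, b)
    blocks = []
    for start in range(0, len(markers), width):
        end = start + width
        blocks.append("\n".join([a[start:end], markers[start:end], b[start:end]]))
    return "\n\n".join(blocks)


# A narrow width just to show the wrapping on a short example.
print(format_chunks("Peter Bengtsson 123", "Peter PengtsXon 124", width=10))
```

Returning a string instead of printing also means the comparison could go straight into the AssertionError message, if that suits your test runner better.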

Comments

Ivo

I love writing Python one-liners :)

"".join([x[0]==x[1] and "-" or "*" for x in map(None, a, b)])

In Python 2.5 you can use the ternary operator to make it slightly more elegant.
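For anyone reading along on Python 3, where map(None, a, b) no longer works, here is a sketch of the same one-liner using a conditional expression and itertools.zip_longest. The function name diff_markers is just an illustrative choice.

```python
from itertools import zip_longest


def diff_markers(a, b):
    # zip_longest pads the shorter string with None, much like
    # map(None, a, b) did in Python 2, so the extra tail shows up as "*".
    return "".join("-" if x == y else "*" for x, y in zip_longest(a, b))


print(diff_markers("Bengtsson", "BengtzzonLonger"))
```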

Marius Gedminas

Nice.

I've often used Python's difflib.ndiff to make test failures easier to understand, back when I wrote PyUnit-style unit tests. It was especially useful for multiline strings.
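Roughly what that can look like, as a sketch rather than the exact code from those tests: the helper ndiff_report and the method name assertEqualWithDiff are invented for this example, while difflib.ndiff and TestCase.fail are the standard library calls.

```python
import difflib
import unittest


def ndiff_report(first, second):
    """Return a line-by-line ndiff of two strings as one printable string."""
    diff = difflib.ndiff(first.splitlines(), second.splitlines())
    # The "?" hint lines from ndiff carry their own newline, so strip it.
    return "\n".join(line.rstrip("\n") for line in diff)


class LongStringTestCase(unittest.TestCase):
    def assertEqualWithDiff(self, first, second):
        # Only build the diff when the strings actually differ.
        if first != second:
            self.fail("strings differ:\n" + ndiff_report(first, second))


print(ndiff_report("<p>one<br/>two</p>", "<p>one<br />two</p>"))
```

Lines starting with "-" come from the first string, lines starting with "+" from the second, and the "?" lines point at the characters that changed within a line.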

Nowadays with doctests getting readable diffs is easy (#doctest: +REPORT_NDIFF). Unless you're comparing several-thousand-line-long HTML pages in your functional tests.
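A small sketch of that directive in use. The render_break function is a made-up stand-in for an HTML helper, and its expected output is deliberately wrong so that there is a failure to report; doctest_report is likewise just a helper invented here to capture the report as a string.

```python
import doctest


def render_break():
    """Hypothetical HTML helper whose doctest is wrong on purpose.

    >>> print(render_break())  # doctest: +REPORT_NDIFF
    <br/>
    """
    return "<br />"


def doctest_report(func):
    """Run func's docstring examples and return (results, report text)."""
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0)
    chunks = []
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test, out=chunks.append)
    return results, "".join(chunks)


results, report = doctest_report(render_break)
print(report)
```

The failure report then contains an ndiff of the expected and actual output instead of the plain "Expected" and "Got" sections.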

pacificator

Why don't you use the built-in difflib module?

Krys

Hi Peter,

I did the same thing recently, but I used the built-in difflib module to show me where the differences are. Might be worth a look. :)

HTH

Krys

Hi again,

Just FYI, I only posted a repeat of what others had already mentioned because with cookies off, I do not see any comments.

The warning about needing to turn on JavaScript in order to comment is nice, but with cookies off, your site looks like no one has commented.

Anyway, just thought I'd mention that. Sorry for the duplicate info.

Krys

Peter Bengtsson

It's not about cookies, it's just that the caching is set to 1 hour which might have tricked you. I haven't had the time to find a good solution to this yet.

Ian Bicking

lxml.html.usedoctest (and lxml.doctestcompare) implement a smarter comparison for HTML fragments. It's not perfect, but it's also somewhat less sensitive to unimportant differences in the HTML.
