\B in Python regular expressions
regular expressions, regular expression, \B, \b, wordboundry, word boundry
23rd of July 2005
Today I learnt how to use the \B gadget in Python regular expressions. I've previously talked about the usefulness of \b, but sometimes there's a big benefit to using \B too.
What \b does is match at a word boundary: the position between an alphanumeric character (or underscore) and anything else. It allows you to find "peter" in "peter bengtsson" but not "peter" in "nickname: peterbe". In other words, the letters have to be preceded and followed by a word boundary such as whitespace, a newline, the start or end of the string, or a non-alphanumeric character like (.
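To make the "peter" case concrete, here is a small sketch. The variable names and the extra "(peter)" test string are my own illustrations, not something from a real code base.

```python
import re

# \b matches the empty string at the edge of a run of word characters (\w)
pattern = re.compile(r'\bpeter\b')

# "peter" stands alone, so both boundaries are satisfied
found_in_name = pattern.findall('peter bengtsson')

# "peter" is followed by "be", so there is no boundary after the "r"
found_in_nick = pattern.findall('nickname: peterbe')

# parentheses are non-word characters, so they count as boundaries too
found_in_parens = pattern.findall('(peter)')

print(found_in_name)    # ['peter']
print(found_in_nick)    # []
print(found_in_parens)  # ['peter']
```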
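\B is the opposite of \b: it matches only where there is no word boundary. So what \b does for finding alphanumerics, \B does for finding non-alphanumerics that stand on their own. Example: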
>>> import re
>>> re.compile(r'\bX\b').findall('X + Y')  # \b can find 'X'
['X']
>>> re.compile(r'\b\+\b').findall('X + Y')  # same technique can't find '+' between spaces
[]
>>> re.compile(r'\B\+\B').findall('X + Y')  # better to use \B when finding '+'
['+']
>>> re.compile(r'\BX\B').findall('X + Y')  # 'X' sits on word boundaries, so \B fails
[]
The lesson is: \b is a really useful tool, but it only matches at the edge of a word character (letters, digits and the underscore). To find a non-alphanumeric such as + that is surrounded by spaces or other non-word characters, \B is what you have to use.
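One caveat worth checking for yourself: what decides the outcome is what sits next to the symbol, not the symbol alone. The sketch below compares the spaced string from above with a hypothetical unspaced 'X+Y', which is my own addition for illustration.

```python
import re

spaced = 'X + Y'  # '+' has spaces on both sides
tight = 'X+Y'     # '+' touches word characters on both sides

plus_non_boundary = re.compile(r'\B\+\B')
plus_boundary = re.compile(r'\b\+\b')

# Between a space and '+' there is no word boundary, so \B matches and \b does not
spaced_non_boundary = plus_non_boundary.findall(spaced)
spaced_boundary = plus_boundary.findall(spaced)

# Between 'X' and '+' there is a word boundary, so the results flip
tight_non_boundary = plus_non_boundary.findall(tight)
tight_boundary = plus_boundary.findall(tight)

print(spaced_non_boundary, spaced_boundary)  # ['+'] []
print(tight_non_boundary, tight_boundary)    # [] ['+']
```

So \B\+\B is the right choice when the + stands on its own, while an operator squeezed between two identifiers needs \b (or no anchor at all).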






