\b in Python regular expressions
regular expressions, regular expression, special character
14th of June 2005
Boy, did that shut me up! The \b special character in Python regular expressions is so useful. I've used it before but had forgotten about it. The following code:
def createStandaloneWordRegex(word):
    """ return a regular expression that can find 'peter'
    only if it's written alone (next to space, start of
    string, end of string, comma, etc) but not if inside
    another word like peterbe """
    return re.compile(r"""
      (
      ^ %s
      (?=\W | $)
      |
      (?<=\W)
      %s
      (?=\W | $)
      )
      """% (re.escape(word), re.escape(word)),
      re.I|re.L|re.M|re.X)
can, with the \b special character, be simplified to this:
def createStandaloneWordRegex(word):
    """ return a regular expression that can find 'peter'
    only if it's written alone (next to space, start of
    string, end of string, comma, etc) but not if inside
    another word like peterbe """
    return re.compile(r'\b%s\b' % word, re.I)
Quite a lot simpler, isn't it? The simplified version passes the few unit tests I had.
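To make the behaviour of \b concrete, here is a minimal sketch of how such a pattern might be exercised. The function name standalone_word_regex and the sample strings are made up for illustration and are not the unit tests mentioned above. The sketch also wraps the word in re.escape, which the simplified version does not do, so that a word containing a character like '.' is matched literally.

```python
import re

def standalone_word_regex(word):
    """Compile a pattern that matches `word` only as a whole word.

    Hypothetical variant of the simplified function: re.escape is
    added so regex metacharacters in `word` are taken literally.
    """
    return re.compile(r'\b%s\b' % re.escape(word), re.I)

regex = standalone_word_regex('peter')

# \b matches between a word character and a non-word character
# (or the start/end of the string), so punctuation counts as a break.
found_alone = regex.search('Hello, Peter!') is not None
found_inside = regex.search('Visit peterbe.com') is not None

# Without re.escape, the '.' in 'a.b' would match any character.
unescaped = re.compile(r'\b%s\b' % 'a.b', re.I)
escaped = standalone_word_regex('a.b')
unescaped_hits_axb = unescaped.search('axb') is not None
escaped_hits_axb = escaped.search('axb') is not None
escaped_hits_literal = escaped.search('see a.b here') is not None

print(found_alone, found_inside)
print(unescaped_hits_axb, escaped_hits_axb, escaped_hits_literal)
```

The word is found when it stands alone, even next to a comma or exclamation mark, and is skipped inside peterbe.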
Comment
YuppY -
1st July 2005
First variant could be shorter:
re.compile(r'((?<=\W)|^)%s(?=\W|$)' % re.escape(word), re.I)
re.escape is necessary.
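As a rough comparison, the sketch below puts this shorter lookaround variant next to a \b-based pattern. The helper names lookaround_regex and boundary_regex and the sample strings are assumptions made for illustration. For an ordinary word the two agree, but they can differ when the word itself ends in a non-word character such as '#', because \b needs a word character on one side of the position.

```python
import re

def lookaround_regex(word):
    # The shorter lookaround variant quoted in the comment above.
    return re.compile(r'((?<=\W)|^)%s(?=\W|$)' % re.escape(word), re.I)

def boundary_regex(word):
    # The \b-based pattern, with re.escape applied to the word.
    return re.compile(r'\b%s\b' % re.escape(word), re.I)

def matches(regex, text):
    # group(0) is the whole match, regardless of capturing groups.
    return [m.group(0) for m in regex.finditer(text)]

text = 'Peter, peterbe and peter'
lookaround_words = matches(lookaround_regex('peter'), text)
boundary_words = matches(boundary_regex('peter'), text)

# 'c#' ends in a non-word character: there is no \b between '#'
# and the following space, so the \b pattern finds nothing here.
lookaround_hit = lookaround_regex('c#').search('I like c# a lot') is not None
boundary_hit = boundary_regex('c#').search('I like c# a lot') is not None

print(lookaround_words, boundary_words)
print(lookaround_hit, boundary_hit)
```

Which behaviour is preferable depends on the words being searched for. For plain alphanumeric words like 'peter' the \b form is shorter and gives the same matches in this example.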

Excellent! I was wondering how to do this for a script (http://schinckel.blogsome.com/2005/06/27/ecto-auto-abbracronym/) that automatically adds abbr and acronym tags to text in ecto, a blogging client for MacOS X.