Quick PostgreSQL optimization story

11 March 2006   1 comment   Work

Powered by Fusion×

There are several ways to do case insensitive string matching in SQL. Here are two ways that I've tried and analyzed on a table that doesn't have any indices.

Option 1:

 LOWER(u.first_name) = LOWER('Lazy') OR 
 LOWER(u.last_name) = LOWER('Lazy') OR
 LOWER(u.first_name || u.last_name) = LOWER('Lazy')

Option 2:

 u.first_name ILIKE 'Lazy' OR 
 u.last_name ILIKE 'Lazy' OR
 u.first_name || u.last_name ILIKE 'Lazy'

A potentially third option is to make sure that the parameters sent to the SQL code is cooked, in this case we make the parameter into lower case before sent to the SQL code

Option 1b:

 LOWER(u.first_name) = 'lazy' OR 
 LOWER(u.last_name) = 'lazy' OR
 LOWER(u.first_name || u.last_name) = 'lazy'

Which one do you think is fastest?

The results are:

Option 1:  2.0ms - 2.5ms (average 2.25ms)
Option 1b: 2.0ms - 2.1ms (average 2.05ms)
Option 2: 1.7ms - 2.0ms (average 1.85ms)

Conclusion: the ILIKE operator method is the fastest. Not only is it faster, it also supports regular expressions.

I've always thought that the LIKE and ILIKE were sinfully slow (yet useful when time isn't an issue). I should perhaps redo these tests with an index on the first_name and last_name columns.



CREATE INDEX u_first_name_index ON u (lower(first_name));

or a variation thereof.

(Ah, I just noticed you wrote "without any indices". You probably already know this, then. I'll post it anyway - for the search engines.)

You could also use a shadow column maintained by a trigger, but that's an evil solution, only to be used in almost never-met circumstances.

Your email will never ever be published

Related posts

Squeezebox + Pandora 08 March 2006
Carbon XEmacs installed 14 March 2006
Related by keywords:
Fastest way to uniqify a list in Python 14 August 2006
mincss "Clears the junk out of your CSS" 21 January 2013
Gzip rules the world of optimization, often 09 August 2014
Fastest database for Tornado 09 October 2013
Integer division in programming languages 04 August 2004
Optimization of getting random rows out of a PostgreSQL in Django 23 February 2011
Connecting with psycopg2 without a username and password 24 February 2011
Adding a year in PostgreSQL 04 February 2004
pg_class to check if table exists 20 April 2005
Fastest way to thousands-commafy large numbers in Python/PyPy 13 October 2012
mincss in action - sample report from the wild 22 January 2013
Optimizing MozTrap 04 June 2014