Fastest way to uniqify a list in Python

Monday, Aug 14, 2006

⬅︎ Back to Fastest way to uniqify a list in Python

Comment

Ely Spears September 11, 2013

I'm curious if the recursive solution here: (http://stackoverflow.com/a/18729193/567620) is comparable for average-case kinds of lists (where number of unique elements is relatively small compared to total number of elements)?

This was inspired by Haskell's `nub` function. It would be cool to add it to the collection of uniquifiers in the main post and explore some timing about it. Below is the code for good measure:

def unique(lst):
    return [] if lst==[] else [lst[0]] + unique(filter(lambda x: x != lst[0], lst[1:]))

I also think it's interesting that this could be readily generalized to uniqueness by other operations. Like this:

import operator
def unique(lst, cmp_op=operator.ne):
    return [] if lst==[] else [lst[0]] + unique(filter(lambda x: cmp_op(x, lst[0]), lst[1:]), cmp_op)

For example, you could pass in a function that uses the notion of rounding to the same integer as if it was "equality" for uniqueness purposes, like this:

def test_round(x,y):
    return round(x) != round(y)

then unique(some_list, test_round) would provide the unique elements of the list where uniqueness no longer meant traditional equality (which is implied by using any sort of set-based or dict-key-based approach to this problem) but instead meant to take only the first element that rounds to K for each possible integer K that the elements might round to, e.g.:

In [6]: unique([1.2, 5, 1.9, 1.1, 4.2, 3, 4.8], test_round)
Out[6]: [1.2, 5, 1.9, 4.2, 3]