I'm curious if the recursive solution here: (http://stackoverflow.com/a/18729193/567620) is comparable for average-case kinds of lists (where number of unique elements is relatively small compared to total number of elements)?
This was inspired by Haskell's `nub` function. It would be cool to add it to the collection of uniquifiers in the main post and explore some timing about it. Below is the code for good measure:
def unique(lst): return [] if lst==[] else [lst[0]] + unique(filter(lambda x: x != lst[0], lst[1:]))
I also think it's interesting that this could be readily generalized to uniqueness by other operations. Like this:
For example, you could pass in a function that uses the notion of rounding to the same integer as if it was "equality" for uniqueness purposes, like this:
def test_round(x,y): return round(x) != round(y)
then unique(some_list, test_round) would provide the unique elements of the list where uniqueness no longer meant traditional equality (which is implied by using any sort of set-based or dict-key-based approach to this problem) but instead meant to take only the first element that rounds to K for each possible integer K that the elements might round to, e.g.:
Comment
I'm curious if the recursive solution here: (http://stackoverflow.com/a/18729193/567620) is comparable for average-case kinds of lists (where number of unique elements is relatively small compared to total number of elements)?
This was inspired by Haskell's `nub` function. It would be cool to add it to the collection of uniquifiers in the main post and explore some timing about it. Below is the code for good measure:
def unique(lst):
return [] if lst==[] else [lst[0]] + unique(filter(lambda x: x != lst[0], lst[1:]))
I also think it's interesting that this could be readily generalized to uniqueness by other operations. Like this:
import operator
def unique(lst, cmp_op=operator.ne):
return [] if lst==[] else [lst[0]] + unique(filter(lambda x: cmp_op(x, lst[0]), lst[1:]), cmp_op)
For example, you could pass in a function that uses the notion of rounding to the same integer as if it was "equality" for uniqueness purposes, like this:
def test_round(x,y):
return round(x) != round(y)
then unique(some_list, test_round) would provide the unique elements of the list where uniqueness no longer meant traditional equality (which is implied by using any sort of set-based or dict-key-based approach to this problem) but instead meant to take only the first element that rounds to K for each possible integer K that the elements might round to, e.g.:
In [6]: unique([1.2, 5, 1.9, 1.1, 4.2, 3, 4.8], test_round)
Out[6]: [1.2, 5, 1.9, 4.2, 3]