Hi All,
I added a new variant of f5 / f5b (see below) that simply swaps the seen container from a dict to a set. My colleagues and I expected a set() to be faster. Running the tests (simply adding the code below to the downloadable benchmark .py) showed that the dict versions (both f5 and f5b) are 10-20% faster across Python versions 2.6 to 3.1. Does anyone else find this counterintuitive, or have a reason why this is so? (Oh, for py3 testing I removed f7 and used the print function too.)
# BEGIN f5c
def f5c(seq, idfun=None): # Alex Martelli ******* order preserving
    if idfun is None:
        def idfun(x): return x
    seen = set()
    result = []
    for item in seq:
        marker = idfun(item)
        # in old Python versions:
        # if seen.has_key(marker)
        # but in new ones:
        if marker not in seen:
            seen.add(marker)
            result.append(item)
    return result
# END f5c
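For anyone who wants to reproduce the comparison without the full benchmark file, here is a minimal sketch of timing the two variants side by side. The names f5_dict, f5c_set and compare are hypothetical. The dict-based body is an assumption about what f5 looks like (the same loop with seen as a dict), not a copy from the benchmark. The input size and repeat counts are arbitrary.

```python
import random
import timeit


def f5_dict(seq, idfun=None):
    # Assumed dict-based variant: 'seen' is a dict used only for its keys.
    if idfun is None:
        def idfun(x): return x
    seen = {}
    result = []
    for item in seq:
        marker = idfun(item)
        if marker not in seen:
            seen[marker] = 1
            result.append(item)
    return result


def f5c_set(seq, idfun=None):
    # Same loop as above, with 'seen' as a set.
    if idfun is None:
        def idfun(x): return x
    seen = set()
    result = []
    for item in seq:
        marker = idfun(item)
        if marker not in seen:
            seen.add(marker)
            result.append(item)
    return result


def compare(data, repeat=5, number=20):
    # Best-of-N time per variant; taking min() damps scheduling noise.
    timings = {}
    for func in (f5_dict, f5c_set):
        runs = timeit.repeat(lambda: func(data), repeat=repeat, number=number)
        timings[func.__name__] = min(runs)
    return timings


if __name__ == "__main__":
    random.seed(0)
    data = [random.randint(0, 1000) for _ in range(10000)]
    # Both variants must agree before their timings mean anything.
    assert f5_dict(data) == f5c_set(data)
    for name, seconds in sorted(compare(data).items()):
        print("%-8s %.4f s" % (name, seconds))
```

This only measures the gap and does not explain it. If the gap persists, one thing worth checking is whether the attribute lookup and method call in seen.add(marker) account for it, since seen[marker] = 1 is a plain subscript store with no method call.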
TIA,