Comment

Joshua

I am new to Python and have run into a serious problem that is causing me to grey prematurely. I am trying to pull unique IP addresses from a list of IPs that often contains duplicates. What I have read above seems to use a list populated by the user. How would you do this if the list is a prepopulated text file? Right now this is what my script looks like for pulling unique IPs out:

def uniq_count(item):
      seen = set()
      for item in item:
           seen.add(ips)
           yield len(seen)

In small batches it works, but when the list is larger it seems to spit out two of each. Any help would be appreciated.

Replies

Peter Bengtsson

`for item in item`??
Is that a typo?

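If you want a single count of the unique items in an iterator, the function can't be a generator: with `yield` inside the loop you get a running count for every item instead of one final number. Something like this should work: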

def uniq_count(iterator):
    seen = set()
    for item in iterator:
        seen.add(item)
    return len(seen)

And you'd be able to use it something like this:

print(uniq_count(open('some.log')))
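
One thing to watch for with a file: each line keeps its trailing newline, so '10.0.0.1\n' and '10.0.0.1' (or a line with stray spaces) count as different items. That may be why duplicates seem to slip through on larger files, though I can't tell for sure without seeing the data. A minimal sketch, assuming one IP per line; the name `uniq_count_lines` is made up for illustration:

```python
def uniq_count_lines(lines):
    """Count distinct non-empty lines, ignoring surrounding whitespace."""
    seen = set()
    for line in lines:
        ip = line.strip()  # drop '\n', '\r\n' and stray spaces
        if ip:             # skip blank lines
            seen.add(ip)
    return len(seen)

# Works on any iterable of strings, for example an open file:
# with open('some.log') as f:
#     print(uniq_count_lines(f))
sample = ['10.0.0.1\n', '10.0.0.2\r\n', '10.0.0.1', '\n', ' 10.0.0.2 \n']
print(uniq_count_lines(sample))  # prints 2
```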

However, there's an even simpler way. Suppose you have a list of things, like ['a', 'b', 'a', 'c'].
Then, to get the count of unique elements, you simply convert it to a set and take its length, like this:

print(len(set(['a', 'b', 'a', 'c'])))  # will print 3
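
If you need the unique addresses themselves rather than just the count, and want them in the order they first appeared (a set does not promise any order), a sketch along these lines should do. `unique_in_order` is a hypothetical helper name, and the IPs are made-up sample data:

```python
def unique_in_order(items):
    """Return the distinct items, in the order they first appear."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

ips = ['10.0.0.1', '10.0.0.2', '10.0.0.1', '10.0.0.3']
print(unique_in_order(ips))  # ['10.0.0.1', '10.0.0.2', '10.0.0.3']
print(len(set(ips)))         # 3
```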