02 September 2005 0 comments Python

I was working on a little application that is similar to tinyurl.com (where any URL gets a unique random id that can be used to redirect with long clumsy URL strings) and came up with this little algorithm that I wanted to share with the world for some feedback.

There are two loops, one nested inside one big loop that goes on for infinity. The inner loop creates a list of possible combinations from a sample of letters and numbers eg (abc, acb, bac, bca, cab, cba). For each such found combinations it does a check to see if this has been saved before and if not it exists the loop like this:

```
if not hasattr(self, combination):
return combination
```

What's important is the sample to use. This algorithm starts with 1 letter and 1 number, then when it's exhausted all the combinations using that samplespace, it increments the number of letters by 1 and the next time the samplespace is exhausted the number of digits is increased by 1.

A clever thing about this little algorithm is that its inner loop is done of the combinations and not its permutations. The number of combinations of `AAA`

is 1 and `ABC`

is `3!`

but the number of permutations of `AAA`

is `3!`

and `ABC`

is also `3!`

. This efficiency is accomplished with these lines:

```
unique = Set(genid_bits)
possible_combinations = factorial(len(unique))
```

This means that if `genid_bits`

is (A, A, B) you get `possible_combinations`

= `2!`

= 2, not `3!`

which it would be had we not taken the unique set.

The whole algorithm looks like this in python code:

```
import random
from sets import Set
class Container:
def __init__(self):
self.minlength_alpha=1
self.minlength_numeric=1
self.upper_and_lower=False
def _generateID(self):
""" generate a new ID that hasn't been instanciated in this
self before """
assert self.minlength_alpha >= 1
assert self.minlength_numeric >= 1
letters = "abcdefghijklmnopqrstuvwxyz"
if self.upper_and_lower:
letters += letters.upper()
numbers = "0123456789"
while 1:
try:
genid_bits = random.sample(letters, self.minlength_alpha) + \
random.sample(numbers, self.minlength_numeric)
except:
letters += letters
numbers += numbers
continue
unique = Set(genid_bits)
possible_combinations = factorial(len(unique))
combinations = []
while len(combinations) < possible_combinations:
random.shuffle(genid_bits)
combination = ''.join(genid_bits)
if combination not in combinations:
if not hasattr(self, combination):
return combination
combinations.append(combination)
if self.minlength_numeric < self.minlength_alpha:
self.minlength_numeric += 1
else:
self.minlength_alpha += 1
```

If you want to run it, download idgeneratordummy.py and run it's `profile()`

function and you'll see that it's damn fast. What do you think?

So far, this actually has little to do with Zope but it will. I can now write my little competing system to tinyurl.com. I think I'll also drop certain characters that might cause confusion such as `0`

and `o`

.

- Previous:
- British English for Americans 31 August 2005
- Next:
- Amazon bug in shopping basket 05 September 2005

- Related by Text:
- Don't forget your sets in Python! 10 March 2017
- Optimization of getting random rows out of a PostgreSQL in Django 23 February 2011
- How to do performance micro benchmarks in Python 24 June 2017
- My tricks for using AsyncHTTPClient in Tornado 13 October 2010
- Case insensitive list remove call (part II) 11 April 2006