02 September 2005 0 comments Python

I was working on a little application that is similar to tinyurl.com (where any URL gets a unique random id that can be used to redirect with long clumsy URL strings) and came up with this little algorithm that I wanted to share with the world for some feedback.

There are two loops, one nested inside one big loop that goes on for infinity. The inner loop creates a list of possible combinations from a sample of letters and numbers eg (abc, acb, bac, bca, cab, cba). For each such found combinations it does a check to see if this has been saved before and if not it exists the loop like this:

```
if not hasattr(self, combination):
return combination
```

What's important is the sample to use. This algorithm starts with 1 letter and 1 number, then when it's exhausted all the combinations using that samplespace, it increments the number of letters by 1 and the next time the samplespace is exhausted the number of digits is increased by 1.

A clever thing about this little algorithm is that its inner loop is done of the combinations and not its permutations. The number of combinations of `AAA`

is 1 and `ABC`

is `3!`

but the number of permutations of `AAA`

is `3!`

and `ABC`

is also `3!`

. This efficiency is accomplished with these lines:

```
unique = Set(genid_bits)
possible_combinations = factorial(len(unique))
```

This means that if `genid_bits`

is (A, A, B) you get `possible_combinations`

= `2!`

= 2, not `3!`

which it would be had we not taken the unique set.

The whole algorithm looks like this in python code:

```
import random
from sets import Set
class Container:
def __init__(self):
self.minlength_alpha=1
self.minlength_numeric=1
self.upper_and_lower=False
def _generateID(self):
""" generate a new ID that hasn't been instanciated in this
self before """
assert self.minlength_alpha >= 1
assert self.minlength_numeric >= 1
letters = "abcdefghijklmnopqrstuvwxyz"
if self.upper_and_lower:
letters += letters.upper()
numbers = "0123456789"
while 1:
try:
genid_bits = random.sample(letters, self.minlength_alpha) + \
random.sample(numbers, self.minlength_numeric)
except:
letters += letters
numbers += numbers
continue
unique = Set(genid_bits)
possible_combinations = factorial(len(unique))
combinations = []
while len(combinations) < possible_combinations:
random.shuffle(genid_bits)
combination = ''.join(genid_bits)
if combination not in combinations:
if not hasattr(self, combination):
return combination
combinations.append(combination)
if self.minlength_numeric < self.minlength_alpha:
self.minlength_numeric += 1
else:
self.minlength_alpha += 1
```

If you want to run it, download idgeneratordummy.py and run it's `profile()`

function and you'll see that it's damn fast. What do you think?

So far, this actually has little to do with Zope but it will. I can now write my little competing system to tinyurl.com. I think I'll also drop certain characters that might cause confusion such as `0`

and `o`

.

- Previous:
- British English for Americans 31 August 2005
- Next:
- Amazon bug in shopping basket 05 September 2005

- Related by keywords:
- To readline() or readlines() 12 March 2004
- bool is instance of int in Python 05 December 2008
- Reciprocal lesson about gender perspectives 02 September 2011
- Nginx vs. Squid 17 March 2009
- IssueTrackerProduct now officially abandoned 30 March 2012
- How and why to use django-mongokit (aka. Django to MongoDB) 08 March 2010
- On the command line no one can hear you screen. Or can they? 03 May 2012
- Nasty surprise of Django cache 09 December 2008
- Google Calendar, iCalendar Validator but not bloody Apple iCal 09 April 2009
- tempfile in Python standard library 07 February 2006
- In Django, how much faster is it to aggregate? 27 October 2010
- Google and Python code 22 February 2006