17 December 2007

I need a mini calculator in my web app so that people can enter basic mathematical expressions instead of having to work them out themselves and then enter the result in the input box. I want them to be able to enter "3*2" or "110/3" without having to do the math first. I want this to work like a pocket calculator such that `110/3` returns `36.6666666667` and not `36` like pure Python arithmetic would. Here's the solution which works but works like Python:

```
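import math

def safe_eval(expr, symbols={}):
    # Evaluate with builtins removed; only names in `symbols` are visible
    return eval(expr, dict(__builtins__=None), symbols)

def calc(expr):
    return safe_eval(expr, vars(math))

assert calc('3*2')==6
assert calc('12.12 + 3.75 - 10*0.5')==10.87
assert calc('110/3')==36  # integer division, as in plain Python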
```

But to make it work like non-Python-geek users would expect it I ended up with the following solution which also adds a few more bells and whistles:

```
import math
import re

# Matches integer and decimal literals such as "110" or "3.0"
integers_regex = re.compile(r'\b[\d\.]+\b')

def calc(expr, advanced=False):
    def safe_eval(expr, symbols={}):
        return eval(expr, dict(__builtins__=None), symbols)

    def whole_number_to_float(match):
        group = match.group()
        if group.find('.') == -1:
            return group + '.0'
        return group

    expr = expr.replace('^','**')
    expr = integers_regex.sub(whole_number_to_float, expr)
    if advanced:
        return safe_eval(expr, vars(math))
    else:
        return safe_eval(expr)

def test():
    print calc("147.43 - 40") # 107.43
    print calc('110/3') # 36.6666666667
    print calc('110/3.0') # 36.6666666667
    print calc('(10-(3+5))^2') # 4.0
    # The next five are meant to give None; until calc() is wrapped
    # in try/except they raise an exception instead
    print calc('sys.exit(100)') # None
    print calc('a+b') # None
    print calc('(3+10))') # None
    print calc('del expr') # None
    print calc('cos(2*pi)') # None
    print calc('pow(3,2)', advanced=True) # 9.0
    print calc('cos(2*pi)', advanced=True) # 1.0
```

What this does is replace whole numbers with floating-point-looking numbers before the expression is evaluated. It also replaces `^` with `**`, so that `^` works as an alias, because I think most non-Python people expect `10^2` to be 100.
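To make the rewriting step concrete, here is a minimal sketch of just the string transformation, kept separate from `eval`. The helper name `to_float_literals` is made up for illustration; the regex is the same one used in the code above. It is written for Python 3, which is an assumption, since the post itself uses Python 2.

```python
import re

# Same pattern as integers_regex above: runs of digits and dots between word boundaries
number_regex = re.compile(r'\b[\d\.]+\b')

def to_float_literals(expr):
    """Hypothetical helper: turn ^ into ** and give whole numbers a trailing .0."""
    def whole_number_to_float(match):
        group = match.group()
        return group if '.' in group else group + '.0'
    return number_regex.sub(whole_number_to_float, expr.replace('^', '**'))

print(to_float_literals('110/3'))         # 110.0/3.0
print(to_float_literals('(10-(3+5))^2'))  # (10.0-(3.0+5.0))**2.0
print(to_float_literals('log10(100)'))    # log10(100.0)
```

The last line shows why the word boundaries matter: the `10` inside `log10` is left alone because it is not preceded by a boundary.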

I haven't put this into production yet. I'm still playing around with it to get a feel for how it could work and what the implications might be. There is of course more work needed to wrap this with try-except statements so that dodgy attempts are captured correctly.
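A rough sketch of what that try-except wrapping might look like, written for Python 3 (an assumption; the post uses Python 2). The name `calc_or_none` is hypothetical, and the empty `__builtins__` dict is a choice made for this sketch so that unknown names raise a plain `NameError`.

```python
import math
import re

number_regex = re.compile(r'\b[\d\.]+\b')

def _to_float(match):
    group = match.group()
    return group if '.' in group else group + '.0'

def calc_or_none(expr, advanced=False):
    """Hypothetical wrapper: evaluate like calc() above, but return None on any failure."""
    expr = number_regex.sub(_to_float, expr.replace('^', '**'))
    symbols = vars(math) if advanced else {}
    try:
        # Empty builtins (an assumption of this sketch): unknown names raise NameError
        return eval(expr, {'__builtins__': {}}, symbols)
    except Exception:
        # SyntaxError, NameError, ZeroDivisionError and friends all end up here
        return None

print(calc_or_none('110/3'))
print(calc_or_none('a+b'))                       # None
print(calc_or_none('cos(2*pi)'))                 # None
print(calc_or_none('cos(2*pi)', advanced=True))  # 1.0
```

Catching exceptions only deals with expressions that fail quickly. It does nothing about expressions that run for a very long time or allocate a lot of memory.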

Note that it's still possible to do evil things like `cos.__class__.__bases__[0].__subclasses__()` and get access to other types in the system, or create a list comprehension which grabs a huge amount of memory.
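A small Python 3 demonstration of that escape route (the Python version is an assumption; the post uses Python 2). It starts from the literal `()` rather than `cos`, so no `math` names are needed at all.

```python
# Evaluate with builtins removed, the same kind of restriction safe_eval() relies on
restricted_globals = {'__builtins__': {}}

# Walk from a harmless literal up to `object`, then down to every class loaded so far
leaked = eval("().__class__.__bases__[0].__subclasses__()", restricted_globals, {})

print(len(leaked) > 0)  # True
print(int in leaked)    # True
```

Nothing in the expression uses a builtin name, so removing `__builtins__` does not stop it.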

About the dunder __, I'll just kick that out with a search for the string '__'

You can't search for "__" because someone can use "_"+"_" or even "_" "_" because of the implicit string concatenation by the parser.

The following will check the input and make it safe to use. It lets the user use all functions in the `math` module as well as "natural" expressions.

```
import math
import re

whitelist = '|'.join(
    # operators, digits
    ['-', '\+', '/', '\\', '\*', '\^', '\*\*', '\(', '\)', '\d+']
    # functions of math module (excluding __xxx__)
    + [f for f in dir(math) if f[:2] != '__'])

valid = lambda exp: re.match(whitelist, exp)
```

```
>>> valid('23**2')
<_sre.SRE_Match object at 0xb78ac218>
>>> valid('del exp') == None
True
```

```
whitelist = '^('+'|'.join(
    # operators, digits
    ['-', r'\+', '/', r'\\', r'\*', r'\^', r'\*\*', r'\(', r'\)', '\d+']
    # functions of math module (excluding __xxx__)
    + [f for f in dir(math) if f[:2] != '__']) + ')*$'
```

The little "r"s are just to make the strings work more correctly, the "^...$" forces it to check the whole string, and the "(...)*" matches an arbitrary string of allowable tokens. Now `re.match(whitelist, expr)` actually does what was expected above.
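To see the difference between the two whitelists, here is a small Python 3 comparison (the Python version and the names `loose` and `strict` are assumptions of this sketch). The token list follows the ones above, minus the backslash token.

```python
import math
import re

# Operators, parentheses and digit runs, followed by the public names of the math module
tokens = ['-', r'\+', '/', r'\*\*', r'\*', r'\^', r'\(', r'\)', r'\d+']
tokens += [name for name in dir(math) if not name.startswith('__')]

# Unanchored: re.match only needs ONE allowed token at the start of the string
loose = re.compile('|'.join(tokens))
# Anchored and repeated: the WHOLE string must consist of allowed tokens
strict = re.compile('^(' + '|'.join(tokens) + ')*$')

attack = '2+__import__("os")'
print(bool(loose.match(attack)))          # True, because the leading "2" matches
print(bool(strict.match(attack)))         # False
print(bool(strict.match('cos(2*pi)')))    # True
print(bool(strict.match('2 + 2')))        # False, spaces are not in the whitelist
```

As the last line shows, the token list has no entry for spaces (or for a decimal point), so those would need to be added before this is usable for everyday input.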

It's very hard to make Python's eval safe. It's much easier to use something like PyParsing or PLY to parse the string yourself, and in doing that add the extra precautions you need, like checking for too large results before actually doing the computation.
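As a rough illustration of the parse-it-yourself idea, here is a sketch that uses the standard library `ast` module as a stand-in for PyParsing or PLY, which is a swapped-in technique and not what the comment names. It assumes Python 3.8 or later. The names `parse_calc`, `evaluate` and `MAX_EXPONENT`, and the limit of 100, are made up for this example.

```python
import ast
import operator

# Only these operators are allowed; every other node type is rejected
_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
           ast.Div: operator.truediv, ast.Pow: operator.pow}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}
MAX_EXPONENT = 100  # hypothetical limit, checked before the power is computed

def evaluate(node):
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if (isinstance(node, ast.Constant) and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        left, right = evaluate(node.left), evaluate(node.right)
        if isinstance(node.op, ast.Pow) and abs(right) > MAX_EXPONENT:
            raise ValueError('exponent too large')
        return _BINARY[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
        return _UNARY[type(node.op)](evaluate(node.operand))
    raise ValueError('unsupported expression')

def parse_calc(expr):
    # ast.parse raises SyntaxError for malformed input such as "(3+10))"
    return evaluate(ast.parse(expr.replace('^', '**'), mode='eval'))

print(parse_calc('(10-(3+5))^2'))  # 4
print(parse_calc('110/3'))
```

Names, attribute access and function calls never reach an evaluator here, so `cos.__class__` is rejected with a `ValueError`, and `9**9**9` is refused before any large number is computed.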

If you can trust your users then don't worry about it.

Another option would be to start another Python process in a chroot jail, and send expressions to that process and get the response back. You could place process limits on the executable to avoid some DoS problems.

A nice example of how to do that is at http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/364469.
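The separate-process suggestion above could be sketched roughly like this in Python 3.7 or later (an assumption). This is not a chroot jail and sets no process limits; it only shows the send-expression, get-response part with a timeout. The names `calc_in_subprocess` and `CHILD_CODE` are hypothetical.

```python
import subprocess
import sys

# Code run by the child interpreter; the expression arrives as its first argument
CHILD_CODE = "import sys; print(eval(sys.argv[1], {'__builtins__': {}}, {}))"

def calc_in_subprocess(expr, timeout=5.0):
    """Hypothetical helper: evaluate expr in a separate Python process, or return None."""
    try:
        done = subprocess.run(
            [sys.executable, '-I', '-c', CHILD_CODE, expr],
            capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None  # the child took too long and was killed
    if done.returncode != 0:
        return None  # the child raised (syntax error, unknown name, ...)
    try:
        return float(done.stdout.strip())
    except ValueError:
        return None  # the result was not a number

print(calc_in_subprocess('110/3.0'))
print(calc_in_subprocess('a+b'))  # None
```

A runaway expression can still use a lot of memory before the timeout hits, which is where the jail and the process limits mentioned above would come in.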

http://blog.dowski.com/2007/12/19/simpleparse-plug/ might be helpful too. ;-)

```
import math
import re

class calculator(object):

    def __init__(self):
        self.error = None
        self.intRegex = re.compile(r'\b[\d\.]+\b')

    def _return(self, value, error=None):
        if error:
            self.error = error
        return value

    def _safeEval(self, expr, symbols={}):
        return eval(expr, dict(__builtins__=None), symbols)

    def _toFloat(self, match):
        group = match.group()
        if group.find('.') == -1:
            return group + '.0'
        return group

    def calc(self, expr, advanced=False):
        self.error = None
        expr = expr.replace('^','**')
        expr = self.intRegex.sub(self._toFloat, expr)
        try:
            if advanced:
                return self._return(self._safeEval(expr, vars(math)))
            else:
                return self._return(self._safeEval(expr))
        except Exception as e:
            return self._return(None, error=e)

    def fancyCalc(self, expr, advanced=False):
        result = self.calc(expr, advanced=advanced)
        # Check the recorded error, not the result, so that 0 is not reported as an error
        if self.error is not None:
            return "Error [{1}]: `{0}`".format(self.error, self.error.__class__.__name__)
        else:
            return result

calc = calculator()
for equation in ["2+2","test"]:
    print "Result for equation `{0}` is: {1}".format(equation, calc.fancyCalc(equation))

#That's my little addition :3 - thanks!!
```

`expr = string.replace(expr,",",".")` at the beginning would handle users who use "," instead of ".", because it is common to write 3,14 instead of 3.14 in several European countries.
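One thing to watch with a blanket replace is the advanced mode, where a comma also separates function arguments. A narrower variant, sketched here in Python 3 with a made-up helper name `normalize_decimal_comma`, only touches commas that sit directly between two digits:

```python
import re

# A comma with a digit on both sides is treated as a decimal separator
decimal_comma = re.compile(r'(?<=\d),(?=\d)')

def normalize_decimal_comma(expr):
    return decimal_comma.sub('.', expr)

print(normalize_decimal_comma('3,14*2'))     # 3.14*2
print(normalize_decimal_comma('pow(3, 2)'))  # pow(3, 2)
print(normalize_decimal_comma('pow(3,2)'))   # pow(3.2), the argument separator is lost
```

The last case is still ambiguous, so this is probably only appropriate when `advanced` is off.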