Bestest and securest way to handle Python dependencies

01 February 2016   1 comment   Python

pip 8 is out and with it, the ability to only install dependencies you've vetted. Thank Erik Rose! Now you can be absolutely certain that dependencies you downloaded and installed locally is absolutely identical to the dependencies you download and install in your production server.

First pipstrap.py

So your server needs pip to install those dependencies safely and securely. Initially you have to trust the pip/virtualenv that is installed globally on the system. If you can trust it but unsure it's a good version of pip version 8 and up, that's where pipstrap.py comes in. It makes sure you get a pip version installed that supports pip install with hashes:

Add pipstrap.py to your git/hg repo and use it to make sure you have a good pip. For example your deployment script might look like this now:

#!/bin/bash
git pull origin master
virtualenv venv
source venv/bin/activate
python ./tools/pipstrap.py
pip install --require-hashes -r requirements.txt

Then hashin

Thanks to pipstrap we now have a version of pip that really does check the hashes you've put in the requirements.txt file.

(By the way, the --require-hashes on pip install is optional. pip will imply it if the requirements.txt file appears to have hashes defined. But to avoid the risk and you accidentally fumbling a bad requirements.txt it's good to specify --require-hashes to pip install)

Now that you're up and running and you sleep well at night because you know your production server has exactly the same dependencies you had when you did the development and unit testing, how do you get the hashes in there?

The tricks is to install hashin. (pip install hashin). It helps you write those hashes.

Suppose you have a requirements.txt file that looks like this:

Django==1.9.1
bgg==0.22.1
html2text==2016.1.8

You can try to run pip install --require-hashes -r requirements.txt and learn from the errors. E.g.:

Hashes are required in --require-hashes mode, but they are missing from some requirements. 
Here is a list of those requirements along with the hashes their downloaded archives actually 
had. Add lines like these to your requirements files to prevent tampering. (If you did not 
enable --require-hashes manually, note that it turns on automatically when any package has a hash.)
    Django==1.9.1 --hash=sha256:9f7ca04c6dbcf08b794f2ea5283c60156a37ebf2b8316d1027f594f34ff61101
    bgg==0.22.1 --hash=sha256:e5172c3fda0e8a42d1797fd1ff75245c3953d7c8574089a41a219204dbaad83d
    html2text==2016.1.8 --hash=sha256:088046f9b126761ff7e3380064d4792279766abaa5722d0dd765d011cf0bb079

But those are just the hashes for your particular environment (and your particular support for Python wheels). Instead, take each requirement and run it through hashin

$ hashin Django==1.9.1
$ hashin bgg==0.22.1
$ hashin html2text==2016.1.8

Now your requirements.txt will look like this:

Django==1.9.1 \
    --hash=sha256:9f7ca04c6dbcf08b794f2ea5283c60156a37ebf2b8316d1027f594f34ff61101 \
    --hash=sha256:a29aac46a686cade6da87ce7e7287d5d53cddabc41d777c6230a583c36244a18
bgg==0.22.1 \
    --hash=sha256:e5172c3fda0e8a42d1797fd1ff75245c3953d7c8574089a41a219204dbaad83d \
    --hash=sha256:aaa53aea1cecb8a6e1288d6bfe52a51408a264a97d5c865c38b34ae16c9bff88
html2text==2016.1.8 \
    --hash=sha256:088046f9b126761ff7e3380064d4792279766abaa5722d0dd765d011cf0bb079

One Last Note

pip is smart enough to traverse the nested dependencies of packages that need to be installed. For example, suppose you do:

$ hashin premailer

It will only add...

premailer==2.9.7 \
    --hash=sha256:1516cbb972234446660bf7862b28521f0fc8b5e7f3087655f35ae5dd233013a3 \
    --hash=sha256:843e624bdac9d28725b217559904aa5a217c1a94707bc2ecef6c91a8d82f1a23

...to your requirements.txt. But this package has a bunch of dependencies of its own. To find out what those are, let pip "fail for you".

$ pip install --require-hashes -r requirements.txt
Collecting premailer==2.9.7 (from -r r.txt (line 1))
  Downloading premailer-2.9.7-py2.py3-none-any.whl
Collecting lxml (from premailer==2.9.7->-r r.txt (line 1))
Collecting cssutils (from premailer==2.9.7->-r r.txt (line 1))
Collecting cssselect (from premailer==2.9.7->-r r.txt (line 1))
In --require-hashes mode, all requirements must have their versions pinned with ==. These do not:
    lxml from https://pypi.python.org/packages/source/l/lxml/lxml-3.5.0.tar.gz#md5=9f0c5f1eb43ff44d5455dab4b4efbe73 (from premailer==2.9.7->-r r.txt (line 1))
    cssutils from https://pypi.python.org/packages/2.7/c/cssutils/cssutils-1.0.1-py2-none-any.whl#md5=b173f51f1b87bcdc5e5e20fd39530cdc (from premailer==2.9.7->-r r.txt (line 1))
    cssselect from https://pypi.python.org/packages/source/c/cssselect/cssselect-0.9.1.tar.gz#md5=c74f45966277dc7a0f768b9b0f3522ac (from premailer==2.9.7->-r r.txt (line 1))

So apparently you need to hashin those three dependencies:

$ hashin lxml
$ hashin cssutils
$ hashin cssselect

Now your requirements.txt file will look something like this:

premailer==2.9.7 \
    --hash=sha256:1516cbb972234446660bf7862b28521f0fc8b5e7f3087655f35ae5dd233013a3 \
    --hash=sha256:843e624bdac9d28725b217559904aa5a217c1a94707bc2ecef6c91a8d82f1a23
lxml==3.5.0 \
    --hash=sha256:349f93e3a4b09cc59418854ab8013d027d246757c51744bf20069bc89016f578 \
    --hash=sha256:8628cc82957c41be10abce889a1976ceb7b9e3f36ebffa4fcb1a80901bf77adc \
    --hash=sha256:1c9c26bb6c31c3d5b3c104e843211d9c105db60b4df6770ac42673263d55d494 \
    --hash=sha256:01e54511034333f18772c335ec0b33a76bba988135eaf727a075897866d19604 \
    --hash=sha256:2abf6cac9b7952047d8b7265384a9565e419a727dba675e83e4b7f5b7892b6bb \
    --hash=sha256:6dff909020d0c030fb26004626c8f87f9116e0381702fed415caf94f5a9b9493
cssutils==1.0.1 \
    --hash=sha256:78ac48006ac2336b9456e88a75ed35f6a31a030c65162503b7af01a60d78db5a \
    --hash=sha256:d8a18b2848ea1011750231f1dd64fe9053dbec1be0b37563c582561e7a529063
cssselect==0.9.1 \
    --hash=sha256:0535a7e27014874b27ae3a4d33e8749e345bdfa62766195208b7996bf1100682

Ah... Now you feel confident.

Actually, One More Last Note

Sorry for repeating the obvious but it's so important it's worth making it loud and clear:

Use the same pip install procedure and requirements.txt file everywhere

I.e. Install the depdendencies the same way on your laptop, your continuous integration server, your staging server and production server. That really makes sure you're running the same process and the same dependencies everywhere.

Comments

Piotr Zalewa

Thanks Peter, that was a good read.

Your email will never ever be published


Related posts

Previous:
A quicksearch for Bugzilla using Autocompeter 27 January 2016
Next:
How to no-mincss links with django-pipeline 03 February 2016
Related by Keyword:
hashin 0.14.0 with --update-all and a bunch of other features 13 November 2018
hashin 0.12.0 is much much faster 20 March 2018
hashin 0.7.0 and multiple packages 30 August 2016
hashin 0.5.0 bug fix 17 May 2016
hashin - a replacement for peepin 26 January 2016
Related by Text:
jQuery and Highslide JS 08 January 2008
I'm back! Peterbe.com has been renewed 05 June 2005
Anti-McCain propaganda videos 12 August 2008
Ever wondered how much $87 Billion is? 04 November 2003
Guake, not Yakuake or Yeahconsole 23 January 2010