Peterbe.com

A blog and website by Peter Bengtsson

Filtered home page! Currently only showing blog entries under the category: PostgreSQL. Clear filter

Yesterday I had the good fortune to present Crontabber to the The San Francisco Bay Area PostgreSQL Meetup Group organized by my friend Josh Berkus.

To spare you having to watch the whole presentation I'm going to summarize some of it here.

My colleague Lars also has a blog post about Crontabber and goes into a bit more details about the nitty gritty of using PostgreSQL.

What is crontabber?

It's a tool for running cron jobs. It's written in Python and PostgreSQL and it's a tool we need for Socorro which has lots and lots of stored procedures and cron jobs.

So it's basically a script that gets started by UNIX crontab and runs every 5 minutes. Internally it keeps an index of all the apps it needs to run and it manages dependencies between jobs and is self-healing meaning that if something goes wrong during run-time it just retries again and again until it works. Amongst many of the juicy features it offers on top of regular "cron wrappers" is something we call "backfilling".

Backfill jobs are jobs that happen on a periodic basis and get given a date. If all is working this date is the same as "now()" but if something was to go wrong it remembers all the dates that did not work and re-attempts with those exact dates. That means that you can guarantee that the job gets started with every date possible even if it needs to catch up on past dates.

There's plenty of documentation on how to install and create jobs but it all starts with a simple pip install crontabber.

To get a feel for how you write crontabber "apps", checkout the ones for Socorro or flick through the slides in the PDF.

Is it mature?

Yes! It all started in early 2012 as a part of the Socorro code base and after some hard months of it stabalizing and maturing we decided to extract it out of Socorro and into its own project on GitHub under the Mozilla Public Licence 2.0 licence. Now it stands on its own legs and has no longer anything to do with Socorro and can be used for anything and anyone who has a lot of complicated cron jobs that need to run in Python with a PostgreSQL connection. In Socorro we use it primarily for executing stored procedures but that's just one type of app. You can also make it call out on to anything the command line with a "@with_subprocess" decorator.

Is it finished?

No. It works really well for us but there's a decent list of features we want to add. The hope is that by open sourcing it we can get other organizations to adopt it and not only find bugs but also contribute code to add more juicy features.

One of the soon-to-come features we have in mind is to "internalize locking". At the moment you have to wrap it in a bash script that prevents it from being run concurrently. Crontabber is single-threaded and we don't have to worry about "dependent jobs" starting before "parent jobs" because the depdendencies and their state is stored in the database. But you might need to worry about the same job (the one due next) to be started concurrently. By internalizing the locking we can store, in the state database, that a particular job is being started on and thus not have to worry about it starting the same job twice.

I really hope this project can grow and continue to support us in our needs.

If you run Homebrew on your OSX and you use that to install postgres you will have noted there's a new formula for Postgres 9.3(.4). Yay! (actually this was done many months ago but I'm slow to keep my brews upgraded)

When you run the upgrade you'll notice that psql no longer works because the server can't start.

Bummer! But there's hope: This excellent blog post

That's all you need. ...unless you have some uncompatible library extension installed. E.g. json_enhancements

The problem is that you can't install json_enhancements into a Postgres 9.3 server (json_enhancements is a backport from 9.3 to desperate people still on 9.2). And you can't do the upgrade because the new server has one less installed library. You'll get this after a failing pg_upgrade:

peterbe@mbp:~$ cat loadable_libraries.txt
Could not load library "$libdir/json_enhancements"
ERROR:  could not access file "$libdir/json_enhancements": No such file or directory

I left some more details about this on the pgsql-admin mailing list.

The trick that worked for me was to start the old 9.2 server like this:

/usr/local/Cellar/postgresql/9.2.2/bin/postgres -D /usr/local/var/postgres -p 9000

And now I can open psql against the old data[base] and drop the extension:

$ psql -p 9000 mydatabase
psql (9.3.4, server 9.2.2)
mydatabase=# drop extension json_enhancements;

After I had done that I was able to successfully complete the pg_upgrade.

I hope this helps some other poor sucker stuck in the same chicken and egg situation.