Introducing: HUGEpic - a web app for showing massive pictures

03 November 2012   19 comments   Python



So here's my latest little fun side-project:

Zoomed in on Mona Lisa
It's a web app for uploading massive pictures and looking at them like maps.

The advantages with showing pictures like this are:

All the code is here on Github and, as you can see, it's a Tornado app that uses two databases: MongoDB and Redis. When it connects to MongoDB it uses the new Tornado-specific driver called Motor, which is great.

Before I get to the juicy client-side stuff, let me talk about something awesome in between Tornado and JavaScript, namely: RQ
It's an awesomely simple Python message queue that only works with Python, Redis and UNIXy systems. All of which I happen to use. My only real experience with message queues has honestly been with Celery, which is also great but a right pain compared to RQ. With RQ, all I do is reduce the heavy tasks down to pure Python functions, for example make_thumbnail(), and then I simply do this:

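from utils import make_thumbnail
from rq import Queue
# self.redis is the Redis connection held by the Tornado handler
queue = Queue(connection=self.redis)
# The original snippet is cut off after "enqueue("; the argument passed
# to make_thumbnail here (image) is an illustrative placeholder.
job = queue.enqueue(make_thumbnail, image)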

and that's it. Start rqworker on the command line (from somewhere where "from utils import make_thumbnail" makes sense) and we're off!

You might think that using a message queue is all fancy pants and just something I need to bother with because of Tornado's event loop nature. But no, it's so much more than that. When a massive 5 MB JPG is uploaded, a little algorithm figures out roughly how many zoom levels can be used and what all the 256x256 tiles are going to be for each resized version of the original. Then it needs to generate a thumbnail to represent that JPG, and all the tiles and the thumbnail need to go through an optimizer (I'm using jpegoptim and optipng).
Lastly, to be able to serve all tiles from a fast CDN, I have to upload every single tile to a Reduced Redundancy Amazon S3 storage that the Amazon CloudFront CDN is hooked up to. This might sometimes fail due to network hiccups, so the upload must be resilient enough to continue where it left off.
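
To make the zoom level and tile bookkeeping a bit more concrete, here is a minimal sketch of how such a plan could be computed. It is not the code HUGEpic actually runs. It assumes each zoom level halves the resolution of the one above it and that edge tiles are simply clipped to the image. The function names (max_zoom, resized_dimensions, tile_boxes, plan_tiles) are made up for illustration.

```python
import math

TILE_SIZE = 256  # tile edge in pixels, as mentioned in the post


def max_zoom(width, height, tile_size=TILE_SIZE):
    """Smallest zoom level whose 2**zoom tiles per side cover the longest edge."""
    zoom = 0
    while tile_size * 2 ** zoom < max(width, height):
        zoom += 1
    return zoom


def resized_dimensions(width, height, zoom, top_zoom):
    """Size of the resized image at `zoom`, halving once per level below the top."""
    divisor = 2 ** (top_zoom - zoom)
    return max(1, math.ceil(width / divisor)), max(1, math.ceil(height / divisor))


def tile_boxes(width, height, tile_size=TILE_SIZE):
    """Yield (col, row, (left, top, right, bottom)) crop boxes in row-major order."""
    for row in range(math.ceil(height / tile_size)):
        for col in range(math.ceil(width / tile_size)):
            left, top = col * tile_size, row * tile_size
            # Tiles on the right and bottom edges are clipped to the image.
            box = (left, top, min(left + tile_size, width), min(top + tile_size, height))
            yield col, row, box


def plan_tiles(width, height, tile_size=TILE_SIZE):
    """Map every zoom level to the list of tiles that have to be generated."""
    top = max_zoom(width, height, tile_size)
    plan = {}
    for zoom in range(top + 1):
        w, h = resized_dimensions(width, height, zoom, top)
        plan[zoom] = list(tile_boxes(w, h, tile_size))
    return plan
```

Calling plan_tiles(width, height) gives one list of crop boxes per zoom level, which is the kind of work list a background job can chew through tile by tile.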

All of that stuff takes a very long time, but RQ has made it much easier and much more comfortable to deal with.
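
The "continue where it left off" part of the tile upload can be sketched roughly like this. It is only an illustration under a few assumptions: upload_one is a hypothetical callable that pushes a single tile and raises OSError on a network failure, and done is any set-like store of already uploaded keys (a plain Python set here, but something persistent would be needed across worker restarts).

```python
import time


def upload_all(keys, upload_one, done, retries=3, delay=0.0):
    """Upload every key that is not yet in `done` and return the keys that failed.

    Successes are recorded in `done` immediately, so calling this function
    again later only retries what is still missing.
    """
    failed = []
    for key in keys:
        if key in done:
            continue  # uploaded during an earlier run
        for attempt in range(retries):
            try:
                upload_one(key)
            except OSError:
                # Back off a little longer after each failed attempt.
                time.sleep(delay * (attempt + 1))
            else:
                done.add(key)
                break
        else:
            # Every attempt failed; leave the key for a later run.
            failed.append(key)
    return failed
```

Because the function is idempotent with respect to done, the same job can simply be enqueued again after a failure without uploading any tile twice.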

Now, on to the front end. The original inspiration for this was a library called Polymaps, which isn't bad, but when I later switched to Leaflet I was blown away. It is lighter, runs smoother and has an absolutely stunning API that even I could understand.

And that led me to find another amazingly neat library, for Leaflet, called Leaflet.draw which makes it really easy to add tools for drawing on the pictures. You can draw lines, rectangles, circles, polygons and drop markers. And for all of them it was relatively easy to bind cute popup bubbles so you can type in comments like this or this.

And lastly, there's Filepicker. It's a brilliant web service that simply takes care of your uploads. Uploading an 8 MB JPG through a little file upload form not only takes an incredibly long time, it's fragile and has no good default UX. Filepicker takes care of all of that and makes it possible to upload files the way you want to. For example, if you use Google Drive to back up your massive pictures, Filepicker can handle that. And Dropbox. And Box. And of course, regular drag-and-drop uploads, but with a lovely progress bar indicator and thumbnail preview.

Uploading by URL
There's also upload by simply entering a URL. So, try to find a picture on Google Images, click on one, then in the right-hand bar right-click the URL, choose "Copy Link Location" and paste that into HUGEpic to test.

So for a weekend project that has taken only a couple of weeks, I'm quite proud. My hopes for big success are nil, but it has been a great learning experience mixing interesting client-side programming, web programming and interesting CPU-bound and networking challenges.

On iPhone Safari
Oh, and did I mention it works great on mobile too? Even the file uploading part. Thanks to Filepicker.


Is there any way to not have it show each action/move on the "map" as a new page? It makes going back to a previous page almost impossible and fills up your web history super fast.

But in general, awesome!
Peter Bengtsson
That has been fixed now. Just remove the location hash part of the URL and reload.

I kinda liked it but I'm rather home blind being the developer. I'm so used to using the Refresh button.
Mathieu Virbel
The idea is very nice! I'm also very impressed about how fast you can fill the browser history! Just kidding, it would be nice to not change the hash part at every movement, just set it at a timeout or something :/
Peter Bengtsson
That hash thing is now taken care of. It will now only happen if you have a location hash in the URL or else it just won't track like that any more.
Nice interface, pretty snappy over DSL with the samples... but umm, it's 2012, my *phone* takes 8Mb images ; have you checked out this code on anything that actually deserves the term "massive"? (I'd suggest grabbing stuff from but I'm not sure there's a way to download, there's custom software that cooks them up from hundreds or thousands of individual camera jpgs...)
Peter Bengtsson
It looks like this gigapan is a "competing" solution. But in Flash.

8Mb is easy peasy. There are some much larger images in there now.

My current problem is with how to resize pictures that are 50+Mb in file size without the server dying with memory errors. Poor VM. The downloading part is easy. The resizing is harder.
How does this compare against Djatoka? Do you also use JPEG2000?
Peter Bengtsson
I have no idea what this is but I'm reading up on it right now.
Peter Bengtsson
Actually, I don't see how Djatoka compares at all. It's a picture gallery thing.

And I don't use JPEG2000. I use ImageMagick to do the conversion.
Mr Whirleygig
Interesting, but when I tried to upload my 100mb tif I got an alert box saying 400 bad request.
Peter Bengtsson
TIFs don't work. Only JPEG and PNG.

It would be interesting to solve that. What I've learned today is that the hardest problem isn't the downloading (uploading as seen from your end) part but the resizing. When I resize, the VM runs out of RAM entirely, everything freezes up and it breaks all the message queues and stuff.

I need to figure out a way to gently do the resizing without maxing out the RAM. Like a Hadoop job but for image resizing.

A thought would be crop, resize and re-assembly. Perhaps that's the way to go.
Dustin J. Mitchell
This is really cool!

Something funny I've noticed at Mozilla is that "message queueing" and the worker model are conflated. I first noticed this when I learned that our web clusters' rabbitmq hosts are named "celery", even though they're only running rabbitmq.

Message queueing can be used for so much more than the worker model. You can use it to implement shared state (using Paxos if you want to get fancy). You can build multi-user games with it. You can build APIs around it to allow different applications to communicate - the classic example being sales and billing.

It's a little thing, I know, but I'd hate to see the webdev community miss the forest for the trees by thinking that MQ is *just* the worker model. It's *so* much more.
Peter Bengtsson
Wise words. Thanks. I'll try to keep an open mind about MQs. Personally I know I'm probably only going to really understand it by actually building something with it. I'm not very academic.
Daniel Gasienica
Glad people like you keep exploring the space. I worked on and before.
Peter Bengtsson
That's awesome! Nice to meet you. I didn't know about those till today.
Blogging about stuff I build is the best way to find out about things that you don't think to google for.

How did you solve the problem of resizing seriously large images?
Neil Rashbrook
This made me think of XKCD 1110.
Dustin J. Mitchell
There's an open project for Buildbot to build a reliable means of message queueing between the browser and a Python daemon, if you want something to work on ;)

Buildbot's using MQ for notifications and dynamic content updates. The Buildbot servers communicate using a "normal" MQ protocol like AMQP, then filter and relay some of those messages to browser-side clients that have requested them.
How come my image is tiled horizontally?
Peter Bengtsson
That's unfortunately a limitation of Leaflet, which is the library used for the actual display.
