Peterbe.com

A blog and website by Peter Bengtsson

Screenshot-sharing performance comparison

13 November 2015 10 comments   MacOSX, Web development


One tool that I use many times per day, at work, is to take a screenshot on my mac and then that gets uploaded to the clouds and a permalink to that picture gets put in my clipboard so I can quickly and easily share it.

First I was using CloudApp, which was awesome. I can't remember how much I paid but they started being very unreliable. Sometimes the upload just failed. Sometimes viewing the image failed. It was mostly working but unreliable enough that I just couldn't cope.

So I switched to Dropbox and they have been very reliable. I can't remember how much I pay them but the primary use for paying them is that they back up a folder on the hard drive and make it easy to share other files in a nice way.

But when I take a screenshot, and share that link, that page, that shows the screenshot, is horribly slow. It's just supposed to show an image! It's not supposed to load so slowly that it makes my browser tremble. Shame on you Dropbox!

So lastly, people have been saying great things about Jumpshare. It's free!! Their "plus upgrade", for $9.99/month gives you more options, more storage (1TB), possible password protection, custom branding, custom domain and analytics. That's nice but I'm not desperate so I might upgrade later.

Samples

But let's look at the difference in how these three perform in showing an image:

  1. Dropbox sample

  2. CloudApp sample

  3. Jumpshare sample

By the way, I'm sorry about the motif in the pictures but I encourage you to open each of these and notice that they all look different. I don't know if that's because those sites (CloudApp and Jumpshare) apply some CSS filters a la Instagram but they look different. Here's the original. That might be topic enough for a whole new blog post. But that's for another time.

Webpagetest.org

First let's load these on Webpagetest.org:

  1. Dropbox

  2. CloudApp

  3. Jumpshare

Last but not least; a visual comparison of all three on Firefox, DSL from San Jose, CA, USA. Here's the video comparison.

Devtools

Here using the pure browser Firefox Devtools to measure the network requests needed:

  1. Dropbox
    Dropbox

  2. CloudApp
    CloudApp

  3. Jumpshare
    Jumpshare

Things to note about these:

In numbers

Metric Dropbox CloudApp Jumpshare
Length of URL 85 17 44
HTTPS Yes No Yes
Fully loaded (time) 21.216s 12.420s 13.839s
Fully loaded (bytes) 2,747 KB 1,772 KB 1,910 KB
Fully loaded (requests) 198 90 44
Speed Index 13065 8707 8685
Upgrade price (per month) $9.99 $8.25 $9.99

The winner?

As you can see, CloudApp loads marginally faster than Jumpshare (and Dropbox trails long long after). Also, CloudApp wins more rows in the "In numbers" section above. But lack of HTTPS I kinda sad.

But remember, the reason I ditched CloudApp was because it was unreliable to the point of serious frustration. They might win todays performance comparsion but I dare not go back. This new contender, Jumpshare, looks and feels great. The OSX app worked wonderfully and was really easy peasy to set up. Now I have a cute little kanguru in the OSX toolbar.

So, I think I'll stick with Jumpshare.com for now. I can't tell how much storage they give you for free but...

My money

So you get more features and more storage if you pay $X per month? What I really would pay for is a much faster web page. I know it would be possible. The image you view is 1,074.4KB and all you actually only need is a little bit of HTML around it and maybe some really basic CSS. It should be possible entirely without any JavaScripts. That, I would happily pay for.

UPDATE

On closer inspection, it seems Jumpshare's CSS is NOT 636.18KB. The requests analyzer in the Firefox Devtools most likely have a bug.

Whatsdeployed

11 November 2015 4 comments   Mozilla, Web development, Python

http://whatsdeployed.io/


Whatsdeployed was a tool I developed for my work at Mozilla. I think many other organizations can benefit from using it too.

So, on many sites, what we do when deploying a site, is that we note which git sha was used and write that to a file which is then exposed via the web server. Like this for example. If you know that sha and what's at the tip of the master branch on the project's GitHub page, you can build up an interesting dashboard that allows you to see what's available and what's been deployed.

Sample Whatsdeployed screen for the Mozilla Socorro project
The other really useful case is when you have more than just one environment. For example, you might have a dev, stage and prod environment and, always lastly, the master branch on GitHub. Now you can see what code has been shipped on prod versus your staging environment for example.

This is one of those far too few projects that you build quickly one Friday afternoon and it turns out to be surprisingly useful to a lot of people. I for one, check various projects like this several times per day.

The code is on GitHub and it's basically a tiny bit of Flask with some jQuery doing a couple of AJAX requests. If you enjoy it and use it, please share.

Chainable catches in a JavaScript promise

05 November 2015 0 comments   Javascript, Web development


If you have a Promise that you're executing, you can chain multiple things quite nicely by simply returning the value as it "passes through".
For example:

new Promise((resolve) => {
  resolve('some value')
})
.then((value) => {
  console.log('1', value)
  return value
})
.then((value) => {
  console.log('2', value)
  return value
})

This will console log

1 some value
2 some value

And you can add more .then() to it. As many as you like. Just remember to "play ball" by passing the value. In fact, you can actually pass a different value. Like this for example:

new Promise((resolve) => {
  resolve('some value')
})
.then((value) => {
  console.log('1', value)
  return value
})
.then((value) => {
  console.log('2', value)
  return value.toUpperCase()
})
.then((value) => {
  console.log('3', value)
  return value
})

Demo here. This'll console log

1 some value
2 some value
3 SOME VALUE

But how do you do the same with multiple .catch()?

This is NOT how you do it:

new Promise((resolve, reject) => {
  reject('some reason')
})
.catch((reason) => {
  console.warn('1', reason)
  return reason
})
.catch((reason) => {
  console.warn('2', reason)
  return reason
})

Demo here. When you run that you just get:

1 some reason

To chain catches you have to re-raise (aka re-throw) it:

new Promise((resolve, reject) => {
  reject('some reason')
})
.catch((reason) => {
  console.warn('1', reason)
  throw reason
})
.catch((reason) => {
  console.warn('2', reason)
})

Demo here. The output if you run this is:

1 some value
2 some value

But you have to be a bit more careful here. Note that in the second .catch() it doesn't re-throw the reason one last time. If you do that, you get a general JavaScript error on that page. I.e. an unhandled error that makes it all the way out to the web console. Meaning, you have to be aware of errors and take care of them.

Why does this matter?

It matters because you might want to have a, for example, low level and a high level dealing with errors. For example, you might want to log all exceptions AND still pass them along so that higher level code can be aware of it. For example, suppose you have a function that fetches data using the fetch API. You use it from multiple places and you don't want to have to log it everywhere. Instead, that wrapping function can be responsible for logging it but you still have to deal with it.

For example, this is contrived by not totally unrealistic code:

let fetcher = (url) => {
  // this function might be more advanced
  // and do other fancy things
  return fetch(url)
}

// 1st
fetcher('http://example.com/crap')
.then((response) => {
  document.querySelector('#result').textContent = response
})
.catch((exception) => {
  console.error('oh noes!', exception)
  document.querySelector('#result-error').style['display'] = 'block'
})

// 2nd
fetcher('http://example.com/other')
.then((response) => {
  document.querySelector('#other').textContent = response
})
.catch((exception) => {
  console.error('oh noes!', exception)
  document.querySelector('#other-error').style['display'] = 'block'
})

Demo here

Notice how each .catch() handler does the same kind of logging but deals with the error in a human way differently.
Wouldn't it be nice if you could have a general and central .catch() for logging but continue dealing with the errors in a human way?

Here's one such example:

let fetcher = (url) => {
  // this function might be more advanced
  // and do other fancy things
  return fetch(url)
  .catch((exception) => {
    console.error('oh noes! on:', url, 'exception:', exception)
    throw exception
  })
}

// 1st
fetcher('http://example.com/crap')
.then((response) => {
  document.querySelector('#result').textContent = response
})
.catch(() => {
  document.querySelector('#result-error').style['display'] = 'block'
})

// 2nd
fetcher('http://example.com/other')
.then((response) => {
  document.querySelector('#other').textContent = response
})
.catch(() => {
  document.querySelector('#other-error').style['display'] = 'block'
})

Demo here

Here you get the best of both worlds. You have a central place where all exceptions are logged in a nice way, and the higher level code only has to deal with the human way of explaining that something went wrong.

It's pretty basic but it's probably useful to somebody else who gets confused about how to deal with exceptions in promises.

Weight of your PostgreSQL tables "lumped together"

31 October 2015 0 comments   PostgreSQL


We have lots of tables that weigh a lot. Some of the tables are partitions so they're called "mytable_20150901" and "mytable_20151001" etc.

To find out how much each table weighs you can use this query:

select table_name, pg_relation_size(table_name), pg_size_pretty(pg_relation_size(table_name))
from information_schema.tables
where table_schema = 'public'
order by 2 desc limit 10;

It'll give you an output like this:

        table_name        | pg_relation_size | pg_size_pretty
--------------------------+------------------+----------------
 raw_adi_logs             |      14724538368 | 14 GB
 raw_adi                  |      14691426304 | 14 GB
 tcbs                     |       7173865472 | 6842 MB
 exploitability_reports   |       6512738304 | 6211 MB
 reports_duplicates       |       4428742656 | 4224 MB
 addresses                |       4120412160 | 3930 MB
 missing_symbols_20150601 |       3264897024 | 3114 MB
 missing_symbols_20150608 |       3170762752 | 3024 MB
 missing_symbols_20150622 |       3039731712 | 2899 MB
 missing_symbols_20150615 |       2967281664 | 2830 MB
(10 rows)

But as you can see in this example, it might be interesting to know what the sum is of all the missing_symbols_* partitions.

Without further ado, here's how you do that:

select table_name, total, pg_size_pretty(total)
from (
  select trim(trailing '_0123456789' from table_name) as table_name, 
  sum(pg_relation_size(table_name)) as total
  from information_schema.tables
  where table_schema = 'public'
  group by 1
) as agg
order by 2 desc limit 10;

Then you'll get possibly very different results:

        table_name        |    total     | pg_size_pretty
--------------------------+--------------+----------------
 reports_user_info        | 157111115776 | 146 GB
 reports_clean            | 106995695616 | 100 GB
 reports                  | 100983242752 | 94 GB
 missing_symbols          |  42231529472 | 39 GB
 raw_adi_logs             |  14724538368 | 14 GB
 raw_adi                  |  14691426304 | 14 GB
 extensions               |  12237242368 | 11 GB
 tcbs                     |   7173865472 | 6842 MB
 exploitability_reports   |   6512738304 | 6211 MB
 signature_summary_uptime |   6027468800 | 5748 MB
(10 rows)

You can read more about the trim() function here.

How to "onchange" in ReactJS

21 October 2015 1 comment   ReactJS, Javascript


Normally, in vanilla Javascript, the onchange event is triggered after you have typed something into a field and then "exited out of it", e.g. click outside the field so the cursor isn't blinking in it any more. This for example

document.querySelector('input').onchange = function(event) {
  document.querySelector('code').textContent = event.target.value;
}

First of all, let's talk about what this is useful for. One great example is a sign-up form where you have to pick a username or type in an email address or something. Before the user gets around to pressing the final submit button you might want to alert them early that their chosen username is available or already taken. Or you might want to alert early that the typed in email address is not a valid one. If you execute that kind of validation on every key stroke, it's unlikely to be a pleasant UI.

Problem is, you can't do that in ReactJS. It doesn't work like that. The explanation is quite non-trivial:

*"<input type="text" value="Untitled"> renders an input initialized with the value, Untitled. When the user updates the input, the node's value property will change. However, node.getAttribute('value') will still return the value used at initialization time, Untitled.

Unlike HTML, React components must represent the state of the view at any point in time and not only at initialization time."*

Basically, you can't easily rely on the input field because the state needs to come from the React app's state, not from the browser's idea of what the value should be.

You might try this

var Input = React.createClass({
  getInitialState: function() {
    return {typed: ''};
  },
  onChange: function(event) {
    this.setState({typed: event.target.value});
  },
  render: function() {
    return <div>
        <input type="text" onChange={this.onChange.bind(this)}/>
        You typed: <code>{this.state.typed}</code>
      </div>
  }
});
React.render(<Input/>, document.querySelector('div'));

But what you notice is the the onChange handler is fired on every key stroke. Not just when the whole input field has changed.

So, what to do?

The trick is surprisingly simple. Use onBlur instead!

Same snippet but using onBlur instead

var Input = React.createClass({
  getInitialState: function() {
    return {typed: ''};
  },
  onBlur: function(event) {
    this.setState({typed: event.target.value});
  },
  render: function() {
    return <div>
        <input type="text" onBlur={this.onBlur.bind(this)}/>
        You typed: <code>{this.state.typed}</code>
      </div>
  }
});
React.render(<Input/>, document.querySelector('div'));

Now, your handler is triggered after the user has finished with the field.

And bash basics

16 October 2015 2 comments   MacOSX, Linux


It's one of those things; not hard to understand and certainly not an advanced trick but I sometimes see people miss out on this.

In bash there are sort of two ways of saying "Do this and then do that". You can either say "Do this and no matter what happens then do that" or you can say "Do this and if that worked also do that".

Examples

Suppose you have two command executables you want to run. They can succeed or fail.

$ echo "Do this and no matter what happens then do that"
$ ./command1 ; ./command2

If you run that, ./command2 will run even if ./command1 failed.
The other one is...

$ echo "Do this and if that worked also do that"
$ ./command1 && ./command2

You might recognize the && thing from JavaScript or Java or C or one of those. If you recognize it you might quickly also conclude that you can do this too:

$ echo "Do this and only if it failed do that"
$ ./command1 || ./command2

In this latter case only one of those (or none!) will succeed.

So when does this come in handy?

Here are some examples that I often use:

Meaning, I know my code is good to push, iff the tests pass

$ nosetests && git commit -a -m "some feature" && git push peterbe mybranch

Or if you might want to be alerted if something failed after the first command slowly takes its time to finish:

$ nosetests && say "Tests finished" || say "Work harder"

(say is an OSX specific command and not a built-in in bash)

The ; is useful when you don't care if the first command finished and this is more rare. For example:

$ rm static/ ; ./manage.py collectstatic --noinput

Why bother?

Perhaps it goes without saying, the reason for doing all of these is generally when the first command takes a long time and you don't want to sit and wait till it's finished to run the second time. By "piping them together" like this, the second command will safely start as soon as possible whilst you go away and pay attention to something else.

mozjpeg installation and sample

10 October 2015 3 comments   Mozilla, Web development, Linux

https://github.com/mozilla/mozjpeg


I've written about mozjpeg before where I showed what it can do to a sample directory full of different kinds of JPEGs. But let's get more real. Let's actually install it and look at one thumbnail and one big photo.

To install, I used the pre-compiled binaries from this wonderful site. Like this:

# wget http://mozjpeg.codelove.de/bin/mozjpeg_3.1_amd64.deb
# dpkg -i mozjpeg_3.1_amd64.deb
# ls -l /opt/mozjpeg/bin/cjpeg
-rwxr-xr-x 1 root root 50784 Sep  3 19:03 /opt/mozjpeg/bin/cjpeg

I don't know why the binary executable becomes called cjpeg but that's fine. Let's put it in $PATH so other users can execute it:

# cd /usr/local/bin
# ln -s /opt/mozjpeg/bin/cjpeg

Now, let's actually use it for something. First we need a realistic lossy thumbnail that we can optimize.

$ wget http://d1ac1bzf3lrf3c.cloudfront.net/static/cache/eb/f0/ebf08e64e80170dc009e97f6f9681ceb.jpg

This was one of the thumbnails from a previous post called Panasonic Lumix from 2008 or a iPhone 5S from 2014.

Let's optimize!

$ jpeg -outfile ebf08e64e80170dc009e97f6f9681ceb.moz.jpg -optimise ebf08e64e80170dc009e97f6f9681ceb.jpg
$ ls -l ebf08e64e80170dc009e97f6f9681ceb.*
-rw-rw-r-- 1 django django 11391 Sep 26 17:04 ebf08e64e80170dc009e97f6f9681ceb.jpg
-rw-r--r-- 1 django django  9414 Oct 10 01:40 ebf08e64e80170dc009e97f6f9681ceb.moz.jpg

Yay! It's 17.4% smaller. Saving 1.93Kb.

So what do they look like? See for yourself:

I have to zoom in (⌘-+) 3 times until I can see any difference. But remember, the saving isn't massive but the usecase here is a thumbnail.

So, let's do the same with a non-thumbnail. Some huge JPEG.

$ time cjpeg -outfile Lumix-2.moz.jpg -optimise Lumix-2.jpg
real    0m3.285s
user    0m3.122s
sys     0m0.080s
$ ls -l Lumix*
-rw-rw-r-- 1 django django 4880446 Sep 26 17:20 Lumix-2.jpg
-rw-rw-r-- 1 django django 1546978 Oct 10 02:02 Lumix-2.moz.jpg
$ ls -lh Lumix*
-rw-rw-r-- 1 django django 4.7M Sep 26 17:20 Lumix-2.jpg
-rw-rw-r-- 1 django django 1.5M Oct 10 02:02 Lumix-2.moz.jpg

In other words, from 4.7Mb to 1.5Mb. It's 68.3% the size of the original. And the visual difference?

Again, I have to zoom in 3 times to be able to tell any difference and even when I've done that it's hard to tell which is which.

In conclusion, let's go ahead and use mozjpeg to optimize thumbnails.

localStorage is not async, but it's FAST!

06 October 2015 7 comments   Javascript, AngularJS, Web development


A long time I go I wrote an angular app that was pleasantly straight forward. It loads all records from the server in one big fat AJAX GET. The data is large, ~550Kb as a string of JSON, but that's OK because it's a fat-client app and it's extremely unlikely to grow any multiples of this. Yes, it'll some day go up to 1Mb but even that is fine.

Once ALL records are loaded with AJAX from the server, you can filter the whole set and paginate etc. It feels really nice and snappy. However, the app is slightly smarter than that. It has two cool additional features...

  1. Every 10 seconds it does an AJAX query to ask "Have any records been modified since {{insert latest modify date of all known records}}?" and if there's stuff, it updates.

  2. All AJAX queries from the server are cached in the browser's local storage (note, I didn't write localStorage, "local storage" encompasses multiple techniques). The purpose of that is to that on the next full load of the app, we can at least display what we had last time whilst we wait for the server to return the latest and greatest via a slowish network request.

  3. Suppose we have brand new browser with no local storage, because the default sort order is always known, instead of doing a full AJAX get of all records, it does a small one first: "Give me the top 20 records ordered by modify date" and once that's in, it does the big full AJAX request for all records. Thus bringing data to the eyes faster.

All of these these optimization tricks are accompanied with a flash message at the top that says: <img src="spinner.gif"> Currently using cached data. Loading all remaining records from server....

When I built this I decided to use localForage which is a convenience wrapper over localStorage AND IndexedDB that does it all asynchronously and with proper promises. And to make it work in AngularJS I used angular-localForage so it would work with Angular's cycle updates without custom $scope.$apply() stuff. I thought the advantage of this is that it being async means that the main event can continue doing important rendering stuff whilst the browser saves things to "disk" in the background.

Also, I was once told that localStorage, which is inherently blocking, has the risk that calling it the first time in a while might cause the browser to have to take a major break to boot data from actual disk into the browsers allocated memory. Turns out, that is extremely unlikely to be a problem (more about this is a future blog post). The warming up of fetching from disk and storing into the browser's memory happens when you start the browser the very first time. Chrome might be slightly different but I'm confident that this is how things work in Firefox and it has for many many months.

What's very important to note is that, by default, localForage will use IndexedDB as the storage backend. It has the advantage that it's async to boot and it supports much large data blobs.

So I timed, how long does it take for localForage to SET and GET the ~500Kb JSON data? I did that like this, for example:

var t0 = performance.now();
$localForage.getItem('eventmanager')
.then(function(data) {
    var t1 = performance.now();
    console.log('GET took', t1 - t0, 'ms');
    ...

The results are as follows:

Operation Iterations Average time
SET 4 341.0ms
GET 4 184.0ms

In all fairness, it doesn't actually matter how long it takes to save because my app actually doesn't depend on waiting for that promise to resolve. But it's an interesting number nevertheless.

So, here's what I did. I decided to drop all of that fancy localForage stuff and go back to basics. All I really need is these two operations:

// set stuff
localStorage.setItem('mykey', JSON.stringify(data))
// get stuff
var data = JSON.parse(localStorage.getItem('mykey') || '{}')

So, after I've refactored my code and deleted (6.33Kb + 22.3Kb) of extra .js files and put some performance measurements in:

Operation Iterations Average time
SET 4 5.9ms
GET 4 3.3ms

Just WOW!
That is so much faster. Sure the write operation is now blocking, but it's only taking 6 milliseconds. And the reason it took IndexedDB less than half a second also probably means more hard work for it to sweat CPU over.

Sold? I am :)

django-pipeline + django-jinja

04 October 2015 2 comments   Django


Do you have django-jinja in your Django 1.8 project to help you with your Jinja2 integration, and you use django-pipeline for your static assets?
If so, you need to tie them together by passing pipeline.templatetags.ext.PipelineExtension "to your Jinja2 environment". But how? Here's how:

# in your settings.py


from django_jinja.builtins import DEFAULT_EXTENSIONS

TEMPLATES = [
    {
        'BACKEND': 'django_jinja.backend.Jinja2',
        'APP_DIRS': True,
        'OPTIONS': {
            'match_extension': '.jinja',
            'context_processors': [
                ...
            ],
            'extensions': DEFAULT_EXTENSIONS + [
                'pipeline.templatetags.ext.PipelineExtension',
            ],
        }
    },
    ...

Now, in your template you simply use the {% stylesheet '...' %} or {% javascript '...' %} tags in your .jinja templates without the {% load pipeline %} stuff.

It took me a little while to figure that out so I hope it helps someone else googling around for a solution alike.

Using Lovefield as an AJAX proxy maybe

30 September 2015 0 comments   Javascript, Web development

http://www.peterbe.com/ajaxornot/view7a


Lovefield, by Arthur Hsu at Google, is a cool little Javascript browser abstraction on top of IndexedDB. IndexedDB is this amazingly powerful asynchronous framework for storing data in the browser tied to the domain you're visiting. Unlike it's much "simpler" sibling localStorage, with IndexedDB you can store individual keys in a schema, use indexes for faster retrieval, asynchronous querying and much larger memory capacity than general DOM storage (e.g. localStorage and sessionStorage).

What Lovefield brings is best described by watching this video. But to save you time let me try...:

Anyway, it sounds really cool and I'm looking forward to using it for something. But before I do I thought I'd try using it as a "AJAX proxy".

So what's an AJAX proxy, you ask. Well, it can mean many things but what I have in mind is a pattern where a web app's MVC is tied to a local storage and the local storage is tied to AJAX. That means two immediate benefits:

Another subtle benefit is a "corner case" of that offline capability; when you load up the app you can read from local storage much much faster than from the network meaning you can display user data on the screen sooner. (With the obvious caveat that it might be stale and that the data will change once the network read is completed later)

So, to take this idea for a spin (the use of a local storage to remember the loaded data last time first) I extended my AJAX or Not playground with a hybrid that uses React to render the data, but render the data from Lovefield (and from localStorage too). Remember, it's an experiment so it's a bit clunky and perhaps contrieved. The objective is to notice how soon after loading the page, that data because available for your eyes to consume.

Here is the playground test page

You have to load it at least once to fill your IndexedDB with some data from an AJAX request. Then, reload the page and it'll display what it has locally (in IndexedDB extracted with the Lovefield API). Then, after it's loaded, try refreshing the browser repeatedly. With or without a Wifi connection.

Basically, it works. However, perhaps I've chosen the worst possible test bed for playing with Lovefield. Because it is super slow. If you open the web console, you'll see it reports how long it takes to extract the data out of Lovefield. The code looks like this:

...
getPostsFromDB: function() {
  return schemaBuilder.connect().then(function(db) {
    var table = db.getSchema().table('post');
    return db.select().from(table).exec();
  });
},
...
var t0 = performance.now();
this.getPostsFromDB()
.then(function(results) {
  var t1 = performance.now();
  console.log(results.length, 'records extracted');
  console.log((t1 - t0).toFixed(2) + 'ms to extract');
  ...

You can see the source here in full.

So out of curiousity, I forked this experiment. Kept almost all the React code but replaced the Lovefield stuff with good old JSON.parse(localStorage.getItem('posts') || '[]'). See code here.
This only takes 1-2 milliseconds. Lovefield repeatedly takes about 400-550 milliseconds on my Firefox version 43.

By the way, load up the localStorage fork and after a first load, try refreshing it over and over and notice how amazingly fast it is. Yay localStorage!