Peterbe.com

A blog and website by Peter Bengtsson

tl;dr Crash-stats is Mozilla's crash reporter dashboard. Simply fixing the static assets made the site 25% faster.

Before http://www.webpagetest.org/result/150820_X5_V5T/

After http://www.webpagetest.org/result/150824_7F_1C3Q/

(The "First Byte Time" is still terrible but that's for another discussion. We're working on a re-write of the underlying data model for that particular report.)

  • Note how the SpeedIndex dropped from 2823 to 2098 which basically means, you can see stuff sooner.

  • The Load Time used to be 5.7 seconds on average. Now it takes 3.5 seconds.

  • It used to weigh 717 KB to load the whole thing. Now it weighs 326 KB.

The only thing we changed was a long overdue correction of static asset headers and Gzip compression. Now, files with unique URLs (e.g. /static/CACHE/css/23a811f100bc.css) have maximum aggressive cache headers. And now all .js, .css and text/html is Gzipped.

Was it easy to do? Hell no!
Does it matter? Hell yeah! We don't have a lot of users or traffic on these reports but the people who use them do this for a living and making the site feel snappier for them would make their lives more productive.

Suppose that you've enabled gzip delivery of your site and its static assets. How do you test it?

One obvious way is to load the site with the developer tools in your browser and look at the headers there. Like this for example:
Is it gzip'ed? Yes!

Another more hard code and geek-power way is to simply use curl.

It goes without saying, the ideal way to set up Nginx is to make it optional. Don't upload a gzipped file to your server and force gzip down on every client. Instead, let something like Nginx handle it on-the-fly (don't worry, it's ultrafast).

So to see if gzip is working, take your URL and run these two commands:

$ curl -v --compressed http://www.example.com/page.cat > /dev/null

And look for this line:

< Content-Encoding: gzip

Also you should look for this line:

Content-Length: 2403

(number will obviously vary)

Now run the same curl command but without the --compressed. E.g.

$ curl -v http://www.example.com/page.cat > /dev/null

Now there won't be a Content-Encoding header in the response. It defaults to plain text.
Also, now look for the Content-Length and amuse yourself in the profit that this number is likely to be much larger than before.

Here's a realistic example:

With --compressed

$ curl -v --compressed https://crash-stats.allizom.org/home/products/Firefox > /dev/null
* Hostname was NOT found in DNS cache
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 52.26.241.244...
* Connected to crash-stats.allizom.org (52.26.241.244) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
* Server certificate: crash-stats.allizom.org
* Server certificate: DigiCert SHA2 Secure Server CA
* Server certificate: DigiCert Global Root CA
> GET /home/products/Firefox HTTP/1.1
> User-Agent: curl/7.37.1
> Host: crash-stats.allizom.org
> Accept: */*
> Accept-Encoding: deflate, gzip
>
< HTTP/1.1 200 OK
< Content-Encoding: gzip
< Content-Type: text/html; charset=utf-8
< Date: Thu, 20 Aug 2015 18:38:00 GMT
* Server nginx/1.6.3 is not blacklisted
< Server: nginx/1.6.3
< Set-Cookie: anoncsrf=yieBMvzCn4fO4lmMQbjuq0Cibl9s7oxG; expires=Thu, 20-Aug-2015 20:38:00 GMT; httponly; Max-Age=7200; Path=/; secure
< Vary: Accept-Encoding
< Vary: Cookie
< X-Frame-Options: DENY
< Content-Length: 2403
< Connection: keep-alive
<
{ [data not shown]
100  2403  100  2403    0     0   4734      0 --:--:-- --:--:-- --:--:--  4730
* Connection #0 to host crash-stats.allizom.org left intact

Without

$ curl -v  https://crash-stats.allizom.org/home/products/Firefox > /dev/null
* Hostname was NOT found in DNS cache
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 54.213.30.86...
* Connected to crash-stats.allizom.org (54.213.30.86) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
* Server certificate: crash-stats.allizom.org
* Server certificate: DigiCert SHA2 Secure Server CA
* Server certificate: DigiCert Global Root CA
> GET /home/products/Firefox HTTP/1.1
> User-Agent: curl/7.37.1
> Host: crash-stats.allizom.org
> Accept: */*
>
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0< HTTP/1.1 200 OK
< Content-Type: text/html; charset=utf-8
< Date: Thu, 20 Aug 2015 18:38:05 GMT
* Server nginx/1.6.3 is not blacklisted
< Server: nginx/1.6.3
< Set-Cookie: anoncsrf=evG8kmoXjHv5aeyFIQHxNcnGahdxwIOy; expires=Thu, 20-Aug-2015 20:38:05 GMT; httponly; Max-Age=7200; Path=/; secure
< Vary: Accept-Encoding
< Vary: Cookie
< X-Frame-Options: DENY
< Content-Length: 12299
< Connection: keep-alive
<
{ [data not shown]
100 12299  100 12299    0     0  24314      0 --:--:-- --:--:-- --:--:-- 24306
* Connection #0 to host crash-stats.allizom.org left intact

optisorl is a Python package for sorl-thumbnail which is a kick-ass Python package for Django. sorl-thumbnail is pretty popular and used by a lot of people who have images they want to display as thumbnails.

A problem you find is that oftentimes the PNG thumbnails aren't as optimized as they can be. A great tool for having a second optimization pass on an PNG file is pngquant. You basically, run it like this:

$ ls -l bugzilla.png
-rw-r--r--@ 1 peterbe  staff  12188 Dec 12  2014 bugzilla.png
$ pngquant bugzilla.png
:~/Downloads$ ls -l bugzilla-fs8.png
-rw-r--r--@ 1 peterbe  staff  6630 Aug 18 13:15 bugzilla-fs8.png

That's a 140x140 pixel PNG that became 5,558 bytes smaller (46% saving).

Anyway, this is where optisorl comes in. It's an extension to sorl-thumbnail that is able to execute pngquant on the PNG right after the thumbnail file has been created. It does so by calling out a sub-process command to pngquant. See the code here which is all the magic there is to it really.

The reason I built this was to reduce the images on Air Mozilla. At the time I did the measurement, the PNGs total weight on the home page was 129KB and after running them all through optisorl the total weight was only 65KB.

To install, it just pip install it like so:

$ pip install optisorl

And you need to install pngquant like brew install pngquant or apt-get install pngquant.

Then, to activate it you need to set this Django setting:

THUMBNAIL_BACKEND = 'optisorl.backend.OptimizingThumbnailBackend'

If you decide to put the pngquant executable somewhere not on the PATH you can add to your settings.py file something like this:

PNGQUANT_LOCATION = '/path/to/bin/pngquant'

There's a bunch of features it doesn't have but we can work together on that. For example, there are certain PNG images that you might want to display as thumbnails but due to something about the image, e.g. its use of Alpha channels, you might want to explicitly disable optimizations.

These days, it's almost a painful experience reading newspaper news online.

It's just generally a pain to read these news sites. Because they are soooooo sloooooooow. Your whole browser is stuttering and coughing blood and every click feels like a battle.

The culprit is generally that these sites are so full of crap. Like click trackers, ads, analytics trackers, videos, lots of images and heavy Javascript.

Here's a slight comparison (in no particular order) of various popular news websites (using a DSL connection):

Site Load Time Requests Bytes in SpeedIndex
BBC News homepage 14.4s 163 1,316 KB 6935
Los Angeles Times 35.4s 264 1,909 KB 13530
The New York Times 28.6s 330 4,154 KB 17948
New York Post 40.7s 197 6,828 KB 13824
USA Today 19.6s 368 2,985 KB 3027
The Washington Times 81.7s 547 12,629 KB 18104
The Verge 18.9s 152 2,107 KB 7850
The Huffington Post 22.3s 213 3,873 KB 4344
CNN 45.9s 272 5,988 KB 12579

Wow! That's an average of...

  • 34.2 seconds per page
  • 278 requests per page
  • 4,643 KB per page

That is just too much. I appreciate that these news companies that make these sites need to make a living and most of the explanation of the slow-down is the ads. Almost always, if you click to open one of those pages, it's just a matter of sitting and waiting. It loading in a background tab takes so much resources you feel it in other tabs. And by the time you go to view and you start scrolling your computer starts to cough blood and crying for mercy.

But what about the user? Imagine if you simply pull out of some of the analytics trackers and opt for simple images for the ads and just simplify the documents down to the minimum? Google Search does this and they seem to do OK in terms of ad revenue.

An interesting counter-example; this page on nytimes.com is a long article and when you load it on WebPageTest it yields a result of 163 requests. But if you use your browser dev tools and keep an eye on all network traffic as you slowly scroll down the length of the article all the way down to the footer, you notice it downloads an additional 331 requests. It starts with about 2KB of data but by the time you've downloaded to the end and scrolled past all ads and links it's amassed about 5.5KB of data. I think that's fair. At least the inital feeling when you arrive on the page isn't that of disgust.

Another realization I've found whilst working on this summary is that oftentimes, sites that are REALLY really slow and horrible to use don't necessarily have super many external resources from different domains but they just have far far too much Javascript. This random page on Washington Times for example has 209 Javascript files that together weighs 9.4KB (roughly 8 times the amount of image data). Not only does that need to be downloaded. It also needs to be parsed and I bet a zillion event handlers and DOM manipulators kick in and basically make scrolling a crying game even on a fast desktop computer.

And then it hit me!

We've all been victims of websites annoyingly try to lure you to install their native app when trying to visit their website on a smartphone. The reason they're doing that is because they've been given a second chance to build something without eleventeen iframes and 4MB of Javascript so the native app feels a million better because they've started over and gotten things right.

My parting advice to newspapers who can do something about their website; Radically refactor your code. Question EVERYTHING and delete, remove and clean up things that doesn't add value. You might lose some stats and you might not be able to show as much ads but if your site loads faster you'll get a lot more visitors and a lot more potential to earn bigger $$$ on fewer more targetted ads.

the-toast.net on WebPageTest
The number one rule in the general best practice of web performance is "Minimize HTTP Requests".

Yeah, it's more important than anything else.

So, I took the biggest offender from the collected sites Number of Domains and ran it through WebPageTest. The result is almost comical.

That's amazing! I'm not even sure I could achieve that even if I tried. It's almost like those sickeningly infested Internet Explorer toolbars that are often a joke.

Some numbers to blow your mind (on a DSL connection from San Jose, CA, USA)...

  • Time to fully loaded: 85 seconds
  • Bytes in: 9,851 KB
  • Total requests: 1,390 (!!)
  • SpeedIndex: 10028 (my own site's 950)

What's mindblowing about how bad this site is that even on the repeated view, where a browser cache is supposed to help a lot, is that it takes 30.6 seconds and still makes 347 requests. Just wow!

In the last month I've been playing with React in my spare time. Said time is extremely limited so I'm unable to read the documentation or source code as a way to learn. Instead, I follow various tutorials and snippets on stackoverflow etc. to get going. I have something now that will soon be production ready and I'm excited about it even though it only scratches the surface of what React can do.

If you're learning React, starting from more or less scratch, here are some of my tips:

  1. Start with skimming the official tutorial or this little gem on scotch.io. Both start with a simple index.html in which you include JSXTransformer.js and you write your React/JSX code in a <script type="text/jsx"> block. Not a bad idea. That helps you appreciate what JSX is and how it relates to turning your code into production code.

  2. Don't fear jQuery! For those who've lept from jQuery based apps to AngularJS or Ember are often told to stay away from jQuery and learn to do it the "new way". But don't fear jQuery. Suppose you're working on a widget/component with React code; then you can continue to let your trusted jQuery code handle some other effect or widget on the page like a bootstrap modal window or something. Apparently, at Facebook, the first production use of React was just the little commenting widget underneath posts. They didn't rewrite the whole site in React when it all started. Also, even the official tutorial advocates using jQuery to do an AJAX fetch. (Personally I prefer the built-in fetch and this polyfill for doing AJAX fetches)

  3. Avoid "super-normalization" of components. A lot of React apps is about one master component rendering one or more other components that renders itself with other components. That's fine but can easily get messy when you're starting out. Don't fear writing extra rendering functions in your class instead of always relying on writing yet another deep component. For example, this is perfectly fine.
    It's good to split up distinct functionality into sub-components but don't go overboard. Ideal things for sub-components are things that have their own and different context. Just calling other local functions is for when your render function simply gets too many lines long.

  4. Avoid ES6 (aka Babel) unless you're comfortable with tooling like Webpack and Gulp. I personally jumped into the deep end straight away writing my first big React app in ES6 which has been fun but it's been hard sometimes to find matching resources. In particular around testing frameworks. A lot of stackoverflow posts and blog posts don't use ES6 so some things just don't work. I chose ES6 because I was curious about React and this project is not on any deadline so I was OK with getting stuck here and there.

  5. Remember, React is just Javascript. For someone who has done a lot of AngularJS I sometimes stop and have to think when I'm in AngularJS, "How do this the AngularJS way?". For example, in AngularJS you can't just use var timer = setTimeout(function() { ... because you're leaving "the way AngularJS works". React has its own state-awareness pitfalls but it's not nearly has precarious. Just write your code. It's just Javascript. Use React for what it's good at but don't be scared to just write code. Having said that, it might help to be aware of how binding this works. Here's a good example. (And here's a good counter-example to avoid too much function() {...}.bind(this) noise)

  6. Learn to distinguish between state and props. It's confusing at first. Especially in terms of which should I use when? My attempt of explaining it is that you can think of the state as the database and the props as that data being extracted and passed around to be shown and changed.

  7. Let render() just render. Every component has a render function. Its job is to render the current state and nothing else. It's tempting to do too much logic in there before it returns the JSX but you should avoid it. Suppose your render function renders a list of object. You might be tempted to apply filtering or sorting of that list using other queues, like the state, before displaying. Try to avoid that, it means you can't change the state without causing that whole filtering/sorting logic to run again. Basically, keep the render function short and simple if you can.

Note, I'm a beginner too. My hope is that by sharing these tips more people will get a chance to enjoy React too without being too intimidated by all the things you think you need to learn and understand to build something.

When you're using PostgreSQL for local development it's sometimes important to get an insight into ALL SQL that happens on the PostgreSQL server. Especially if you're trying to debug all the transaction action too.

To do this on OSX where you have PostgreSQL installed with Homebrew you have to do the following:

1. Locate the right postgresql.conf file. On my computer this is in /opt/boxen/homebrew/var/postgres/ but that might vary depending on how you set up Homebrew. Another easier way is to just start psql and ask there:

$ psql
psql (9.4.0)
Type "help" for help.

peterbe=# show config_file;
                   config_file
--------------------------------------------------
 /opt/boxen/homebrew/var/postgres/postgresql.conf
(1 row)

peterbe=#

Open that file in your favorite editor.

2. Look for a line that looks like this:

#log_statement = 'all'                   # none, ddl, mod, all

Uncomment that line.

3. Now if you can't remember how to restart PostgreSQL on your system you can ask brew:

$ brew info postgresql

Towards the end you'll see some files that look something like this:

$ launchctl unload ~/Library/LaunchAgents/homebrew.mxcl.postgresql.plist
$ launchctl load ~/Library/LaunchAgents/homebrew.mxcl.postgresql.plist

4. To locate where PostgreSQL dumps all logging, you have to look to how Homebrew set it up. You can do that by opening ~/Library/LaunchAgents/homebrew.mxcl.postgresql.plist. You'll find something like this:

  StandardErrorPath
  /opt/boxen/homebrew/var/postgres/server.log

That's it. You can now see all SQL going on by running:

$ tail -f /opt/boxen/homebrew/var/postgres/server.log

Remember to reverse this config change when you're done debugging because that server.log file can quickly grow to an insane size since it's probably not log rotated.

Happy SQL debuggin'!

Last year I put together a little experiment called AJAX or Not? and blogged about it here. The basic idea was to display 1,000 rows in a table. There are several ways of doing it but I decided to compare the following three patterns:

  1. Rendering the whole table in Django server-side and return the whole HTML document.
  2. Rendering a skeleton page, then load the table content as a big chunk of HTML via AJAX.
  3. Rendering a skeleton page, and let AngularJS load all the content of the table from the server as JSON and let AngularJS render it into the DOM.

It was clear as day that the server-side rendering version was hands down the fastest. And the AngularJS rendering the slowest.

Note! AngularJS is amazing and super flexible and powerful because you don't really need to worry about how to re-render once the data changes. This this really useful when you do things like loading more data from a remote endpoint or doing some in-page filtering.

Enter ReactJS

The point of AJAX or Not was not to compare Javascript frameworks but I had some time and I thought I'd write an equivalent version of the AngularJS one with ReactJS (version 0.13.3).

Anyway, here's the code and it's using the GitHub fetch polyfill to do the AJAX query. The AngularJS code is here and here and as you can see it's using track by on the ng-repeat.

WebPageTest
To measure the difference I ran a comparison in WebPageTest which I encourage you open and study for a bit. You can watch the video and download the video here.

Also, note that the Django rendered version loads jQuery. That's because the functionality dictates that clicking on a link should show a confirmation box before going to the link. I know, it's silly but it's very realistic that every page needs some Javascript functionality.

Executive summary...

  • Django server-side takes 0.8 seconds, ReactJS version takes 2.0 seconds and the AngularJS version takes 2.9 seconds.
  • The ReactJS version is the fastest to display something. It displays the header and the image first. Only by 0.2 seconds before the Django server-side version.
  • The AngularJS version causes a lot more CPU utilization. This might really matter when you're on a low-end smartphone.
  • The ReactJS causes twice as much CPU utilization than the server-side version. The AngularJS causes twice as much CPU utilization than the ReactJS version.
  • AngularJS is slightly larger than ReactJS + fetch but I don't think this has a huge effect on the total load time.

Some other thoughts...

  • The ReactJS code is all in one place more or less. That's neat! But it's pretty darn big in terms of number of lines. AngularJS code is split half in the Javascript code and half in the HTML.
  • It's clear, if you want a fast loading page, avoid Javascript as much as you possibly can.
  • This experiment is very optimized in how it gets the data to be displayed. In fact, the server-side rendering time is close to 0 seconds because the whole HTML blob is stored in memcache. A more realistic thing is that extracting the data would take a lot longer if the query isn't so easy to cache. That would be a huge disadvantage for the fully server-side rendered version since if the data query takes a long time you'll sit and stare at a white screen longer. Doing the AJAX approach would definitely be a nicer experience.
  • The difference isn't that big. Both fancy Javascript frameworks have amazing features that leaves jQuery in the dust but if you want your page to load crazy-fast, do as much server-side as you possibly can.

Premailer is a Python library for turning a HTML + CSS into HTML with all the CSS embedded as inline style attributes. This is sadly very necessary to ensure that your fancy HTML emails look spiffy across all email clients and email webapps.

So, last week I put together a little site to test the library via a browser: Premailer.io

It's just a simple webapp with a form where you can enter HTML in three different ways; textarea, by URL and by file upload.

You can also override all the possible advanced options that premailer supports.

What's kinda cool is that you can get a preview of how the HTML document will look like in an iframe that is dynamically loaded with the result from the conversion.

The webapp is of course open source and available on github.com/peterbe/premailer.io. The front-end is an AngularJS app and the build system is Lineman.js. The server is a Falcon server running on uWSGI via Nginx.

There's very little fancy here. There's no limitations or protections. I just hope it becomes handy for people to test premailer out.

The inspiration came from MailChimp's CSS Inliner Tool which is cute but very basic and doesn't allow you the same kinds of input.

If anybody with some AngularJS or highlight.js chops has time I'd love to help fix why the HTML is not syntax highlighted.

Over the years, style guides have come and gone. And contributors have come and gone.
Some people, at some times use 2-spaces indentation in JavaScript. Some prefer 4-spaces.

Even I have changed my mind over the years and now I'm content to do either. I just go by whatever the projectroot/.editconfig config tells me.

So I wanted to clean up all the files so that they are use the same type of indentation (as dictated by the project's .editorconfig file). But which files are what indentation? I could open each file in turn and look at it and keep a tally of which is what. Or I can script it.

I wrote a script. Usage example included in the gist.

Now I easily see which files use what indentation. That makes it easy to file bugs for refactoring.

Some of the files in this grep search include vendor scripts that I'm not going to touch but as you can see, most files use 4 spaces but some still us 2 spaces.

4   base/static/angular/watchcounter.js
2   base/static/dropzone/dropzone.js
4   base/static/js/base.js
2   base/static/js/gallery_select.js
1   base/static/js/libs/include.js
4   base/static/js/libs/moment.js
4   base/static/select2/select2.js
4   comments/static/comments/js/comments.js
1   main/static/main/fullcalendar/gcal.js
4   main/static/main/js/autocompeter.js
4   main/static/main/js/calendar.js
4   main/static/main/js/discussion.js
4   main/static/main/js/download.js
4   main/static/main/js/edit.js
4   main/static/main/js/embed.js
4   main/static/main/js/event_video.js
4   main/static/main/js/eventstatus.js
4   main/static/main/js/include-tabzilla.js
4   main/static/main/js/jwplay.js
4   main/static/main/js/livehits.js
4   main/static/main/js/nav.js
4   main/static/main/js/playbackrate.js
4   main/static/main/js/tabzilla.js
4   main/static/main/js/tearout.js
4   manage/static/manage/js/autocompeter.js
1   manage/static/manage/js/bootstrap-datepicker.js
2   manage/static/manage/js/bootstrap-typeahead.js
4   manage/static/manage/js/channel-html-edit.js
4   manage/static/manage/js/confirm-delete.js
4   manage/static/manage/js/cronlogger.js
4   manage/static/manage/js/dashboard.js
4   manage/static/manage/js/dashboard_graphs.js
4   manage/static/manage/js/discussion-configuration.js
4   manage/static/manage/js/event-archive.js
4   manage/static/manage/js/event-assignment.js
4   manage/static/manage/js/event-edit.js
4   manage/static/manage/js/event-request.js
2   manage/static/manage/js/event-tweets.js
4   manage/static/manage/js/event-upload.js
4   manage/static/manage/js/event-vidly-submissions.js
4   manage/static/manage/js/eventmanager.js
4   manage/static/manage/js/events.js
4   manage/static/manage/js/form-errors.js
4   manage/static/manage/js/locations.js
4   manage/static/manage/js/mainmanager.js
4   manage/static/manage/js/manage.js
4   manage/static/manage/js/picture-add.js
4   manage/static/manage/js/picturegallery.js
4   manage/static/manage/js/staticpage-edit.js
4   manage/static/manage/js/suggestions.js
4   manage/static/manage/js/survey-edit.js
4   manage/static/manage/js/tagmanager.js
4   manage/static/manage/js/url-transforms.js
4   manage/static/manage/js/user-edit.js
4   manage/static/manage/js/usermanager.js
4   manage/static/manage/js/vidly-media-timings.js
4   manage/static/manage/js/vidly-media.js
4   new/static/new/js/RecordRTC.js
4   new/static/new/js/app.js
1   new/static/new/js/ccv.js
4   new/static/new/js/controllers.js
2   new/static/new/js/humanize-duration.js
4   new/static/new/js/services.js
4   starred/static/starred/js/star_event.js
4   starred/static/starred/js/starredevents.js
4   suggest/static/suggest/js/details.js
4   suggest/static/suggest/js/discussion.js
4   suggest/static/suggest/js/file.js
4   suggest/static/suggest/js/start.js
4   suggest/static/suggest/js/suggest.js
4   surveys/static/surveys/js/survey.js
2   uploads/static/uploads/js/s3upload.js
4   uploads/static/uploads/js/upload.js
2   webrtc/static/webrtc/js/camera.js
4   webrtc/static/webrtc/js/libs/RecordRTC.js
4   webrtc/static/webrtc/js/photobooth.js
4   webrtc/static/webrtc/js/summary.js
4   webrtc/static/webrtc/js/video.js
4   webrtc/static/webrtc/js/webrtc.js