Filtered by Web Performance


I think I might put my whole site behind a CDN

April 23, 2019
0 comments Web development, Nginx, Web Performance

tl;dr; I'm going to put this blog behind KeyCDN and I expect a 2-4x performance boost (on Time To First Byte).

Right now, requests to my blog go straight to an Nginx server in DigitalOcean in NYC, USA. The Nginx server, 99% of the time, serves the blog posts (and static assets) as index.html files straight from disk. If the request is GET /plog/some-slug it will look for a file called /path/to/cached/files/plog/some-slug/index.html (or index.html.br or index.html.gz depending on the user agent's Accept-Encoding header). Only if the file doesn't exist on disk does the request go through to Django (via Nginx's uwsgi module). All of it is served over HTTP/2 with SSL from LetsEncrypt.
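Not the actual Nginx config, but to make that lookup concrete, here is a minimal Python sketch of what it amounts to (the cache root and function name are made up for illustration):

import os

CACHE_ROOT = "/path/to/cached/files"  # hypothetical root, matching the example path above

def pick_cached_file(path, accept_encoding):
    """Roughly what the Nginx lookup does for e.g. GET /plog/some-slug."""
    base = os.path.join(CACHE_ROOT, path.lstrip("/"), "index.html")
    # Prefer Brotli, then gzip, depending on what the client says it accepts
    if "br" in accept_encoding and os.path.isfile(base + ".br"):
        return base + ".br"
    if "gzip" in accept_encoding and os.path.isfile(base + ".gz"):
        return base + ".gz"
    if os.path.isfile(base):
        return base
    return None  # nothing on disk; fall through to Django (uWSGI) to render it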

This has been working great but it's time to step it up. It's time to put the whole site behind a CDN. And I think I'm going to use KeyCDN for it.

In the past, the best practice was to serve your HTML document from your smart server (e.g. Django) and then put the static assets on a CDN. Like this:


<html>
  <link rel="stylesheet" href="https://myaccount123.cloudakamaifastlyflare.com/static/main.d910ef9a33.css">

...
<body>
  <img src="https://myaccount123.cloudakamaifastlyflare.com/images/hero.jpg">

...

But with HTTP/2, this becomes an anti-pattern for web performance because your client has already made an expensive HTTP/2 connection (and SSL negotiation) to https://yourcooldomain.com and now it's cheap to just download the rest. I used to do it like that too and I don't regret it. As a matter of fact, the HTML on https://songsear.ch comes straight from Nginx but all its images are (lazy) loaded via songsearch-2916.kxcdn.com. But I think, when time allows, I'll put all of Song Search behind a CDN too.

Basically, it's time to put the whole site behind a CDN. With smart purging techniques and smarter CDNs respecting your dynamic content cache control headers, it's time to share the load. ...all over the world.

CDN Choices

There are many sites that compare CDNs, but many are affiliated with, or even run by, one of them, so it's hard to get unbiased comparisons. For example, KeyCDN demonstrates they're the cheapest by comparing themselves with 5 others that they picked. (But mind you, that seems reasonably backed up by this comparison on cdn.reviews).

CDNPerf does a decent job with cool graphs and stuff. Incidentally, they rank my current favorite (KeyCDN) as the slowest compared to the well known giants that I compared it to.

CDNPerf graph

But mind you, the perf difference between KeyCDN and the winner (topmost in the graph as of today) is 36ms vs 47ms which are both fantastic numbers.

CDNPerf list

It's hard to compare CDNs because they're all pretty fast, and actually, they're all reasonably cheap. What really matters is the features, and that's a lot harder to compare. CloudFlare often comes up as a CDN provider with stellar features that impress me. I've never actually used them, but at least they mention "Fast cache purge" and "API programmability" as their key features. They don't mention Brotli caching, though, which I know is a feature KeyCDN supports.

KeyCDN has been great to me in the past when I've used it to host static assets. I'm familiar with their interface and they recently launched an API to do things like purge-by-tag and purge-by-URL. They're cheap, which matters because in this context it's all side-project stuff I want to put behind a CDN. They have a Python library which, although very rough around the edges, works. Also very important: I've communicated very successfully with them through their support and they've been responsive and helpful. So I'll go with KeyCDN.
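Purge-by-URL, for example, boils down to a single authenticated DELETE request against their API. Here's a rough sketch using plain requests rather than their Python library; the zone ID is made up and the exact endpoint and payload shape should be double-checked against KeyCDN's API documentation:

import requests

API_KEY = "your-keycdn-api-key"  # from the KeyCDN dashboard
ZONE_ID = "12345"  # made-up zone ID

def purge_urls(urls):
    # The API key is used as the HTTP Basic auth username, with an empty password
    response = requests.delete(
        f"https://api.keycdn.com/zones/purgeurl/{ZONE_ID}.json",
        auth=(API_KEY, ""),
        json={"urls": urls},
    )
    response.raise_for_status()
    return response.json()

purge_urls(["myzone-1234.kxcdn.com/plog/some-slug"])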

The Opportunity

Before I move my domain www.peterbe.com to become a CNAME for one of their CDN domains, I wanted to experiment a little and see how it works and what performance numbers I get for comparison. So I set up beta.peterbe.com and did some Django and Nginx wiring so it would work the same but with the difference that it goes through a CDN for everything.

Then I picked a random page and set up a Hyperping monitor from all of its available regions and let it brew for a while. Unfortunately, Hyperping doesn't let you compare two monitors side-by-side so you're going to have to use your own eyes to compare the graphs:

www means no CDN, just the origin Nginx
NOT behind a CDN (server is New York, USA)

beta means with a CDN in front
Behind a CDN

The "total Response Time" in Hyperping doesn't really make sense. They're an average across all regions it pings from. If you live in, for example, Germany; the only response time that matters to you is 1,215 ms versus 40 ms. Equally, if you live somewhere in New York, the only response time that matters to you is 20 ms versus 64 ms.

I actually ran another benchmark. I used Python like this:


import time

import requests

t0 = time.time()
r = requests.get('https://www.peterbe.com/plog/some-slug')
t1 = time.time()
print("Took", t1 - t0)

I did this from South Carolina which means my nearest KeyCDN edge location could be Atlanta, Miami, or New York. Either way, I'm reasonably near New York (compared to the rest of the world) so it'd be a fair performance comparison for all US east coast traffic. (Insert disclaimer here). It downloads the most recent blog posts, in repeated cycles, which gives the CDN a solid chance to warm up and then it compares the median of the last 100 downloads. The output of this is as follows:

beta
    COUNT               1854 (but only using the last 100)
    HIT RATIO           100.0%
    AVERAGE (all)       63.12ms
    MEDIAN (all)        61.89ms

www
    COUNT               1856 (but only using the last 100)
    HIT RATIO           100.0%
    AVERAGE             136.22ms
    MEDIAN              135.61ms

("HIT RATIO" for the non-CDN URL means it was served entirely without Djando server rendering)

What it means is that the median with a CDN is 62ms, and 135.6ms without. That's a 2x boost.

The crawler stats script is available here: github.com/peterbe/peterbecom-cdn-crawl and I would be thrilled if you can clone it and run it and report what numbers you get and where you're running it from.
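For what it's worth, the gist of the comparison is simple: time a bunch of repeated GETs, let the CDN warm up, and compare the median of the last 100. Something like this (a sketch, not the actual crawler script):

import statistics
import time

import requests

def median_response_time(url, cycles=200, keep=100):
    timings = []
    for _ in range(cycles):
        t0 = time.time()
        requests.get(url)
        timings.append((time.time() - t0) * 1000)  # milliseconds
    # Only the last `keep` downloads count, so the CDN cache has had time to warm up
    return statistics.median(timings[-keep:])

print("beta", median_response_time("https://beta.peterbe.com/plog/some-slug"))
print("www ", median_response_time("https://www.peterbe.com/plog/some-slug"))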

Notes and Conclusion

Mind you, 62ms vs. 136ms might sound like a silly difference if Webpagetest says it takes 700ms until the page is interactive (on an LTE connection). And this is a tiny super-optimized page. But never forget A) we can't all live in the US east-coast area and B) if the HTML can download marginally faster it allows the browser to parse it sooner and start downloading all the other stuff much sooner. It'll make a big difference! I'm sure you've all seen graphs like this:

Cold-cache MDN page on 4G
Imagine if all those static asset downloads could have started a whole second "to the left"

Of course a CDN is faster. It's no news. But it's also a hassle and it costs money. It's 2019 and most good CDNs now support Brotli, fast purge-by-url, and HTTP/2. It's time to make the switch! It's not like cache-invalidation is hard.

UPDATE April 23 2019 (same day)

KeyCDN has a neat-looking tool that is similar to Hyperping but more of a one-off kinda deal. It's called Performance Test and I wouldn't be surprised if it's biased as heck, because they probably run these pings from roughly the same locations as their edge locations. Anyway, the results are nevertheless juicy. Note the numbers in the last column, TTFB.

Performance Test without CDN
Performance Test without CDN

Performance Test with CDN
Performance Test with CDN

KeyCDN vs. DigitalOcean Nginx

April 12, 2019
0 comments Web development, Nginx, Web Performance

tl;dr; The global average response time of serving an image from my NYC DigitalOcean server is almost 10x that of serving it from a CDN.

KeyCDN is a CDN service that I use for side-projects. It's great. It has ~35 edge locations. I don't know much about how their web servers work but I can't imagine they're much different from the origin server. In principle.

The origin server is my DigitalOcean (6 vCPU, 16 GB RAM, Ubuntu 14) droplet. It's running an up-to-date CloudFlare build of Nginx and the static images are served straight from (SSD) disk with a 4 weeks TTL (max-age=2419200,public,immutable). The SSL is done with LetsEncrypt and I'm somewhat confident the Nginx is decently configured and uses HTTP/2.

So the CDN, on songsearch-2916.kxcdn.com, is basically configured to front any requests to songsear.ch. If the origin sends cache-control headers, KeyCDN knows it can hold on to the response for a while, but there's no guarantee it will for the full time specified in the cache-control. Either way, how does it compare?

The Experiment

I picked a random static asset URL. It's a 32 KB JPEG file. Its origin URL and its CDN URL are:

  1. https://songsear.ch/static/albums/2017/05/24/08/170630_300x300.jpg
  2. https://songsearch-2916.kxcdn.com/static/albums/2017/05/24/08/170630_300x300.jpg
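Before getting to the measurements, one quick sanity check is to look at the response headers on both URLs; the origin should show the long cache-control, and the CDN response typically includes an X-Cache header saying HIT or MISS. A rough sketch:

import requests

urls = (
    "https://songsear.ch/static/albums/2017/05/24/08/170630_300x300.jpg",
    "https://songsearch-2916.kxcdn.com/static/albums/2017/05/24/08/170630_300x300.jpg",
)
for url in urls:
    r = requests.get(url)
    print(url)
    print("  Cache-Control:", r.headers.get("Cache-Control"))
    print("  X-Cache:", r.headers.get("X-Cache"))  # expected on the CDN response, absent on the origin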

Next, I set up a Hyperping monitor on both URLs as GET requests. For the regions (the places Hyperping pings from), I picked the following:

  1. San Francisco, USA
  2. New York, USA
  3. London, United Kingdom
  4. Frankfurt, Germany
  5. Mumbai, India
  6. São Paulo, Brazil
  7. Sydney, Australia

(I wish I had selected all 12 possible regions when I started but now it's too late for lazy me)

Then, I let Hyperping GET these URLs for a while and behold, here are the numbers:

The Results

Average response time:

  • The CDN: 79 ms
  • The origin: 714 ms

That's almost a 10x difference!

Mind you, the "average response time" is across all regions. It doesn't reflect what people get. If 90% of your visitors are from Australia, the average response times would, of course, be very different. But as an example, the origin server is in New York and there, the average response time is 26 ms vs. 105 ms which is a 5x difference.

Here are some screenshots from Hyperping:

KeyCDN results
KeyCDN

Origin server
Origin server

Conclusion

KeyCDN's servers are clearly fast, and putting them in front is worth doing. It's unsurprising that the CDN performs better far away from New York, but it's surprising how much faster it is at serving than the origin even when pinged from New York (a 5x difference).

The site is still NOT fronted by a CDN because, apart from the images, almost all content is un-cacheable. However, I need to do more research and experimentation with putting everything behind a CDN and being meticulous with setting no-cache headers on dynamic stuff and using async tools to invalidate CDN caches when appropriate.
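On the Django side, that mostly comes down to being explicit with cache headers per view. A sketch using Django's stock cache decorators (the view names are made up; this isn't my actual code):

from django.views.decorators.cache import cache_control, never_cache

@cache_control(public=True, max_age=2419200, immutable=True)
def album_image(request, pk):
    ...  # long-lived, static-ish content the CDN can cache for 4 weeks

@never_cache
def search(request):
    ...  # dynamic content the CDN must never hold on to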

Optimize inlined SVG on developer.mozilla.org

April 4, 2019
0 comments Web development, Web Performance

tl;dr We could make the initial HTML document 40% smaller if we moved from inline SVG to external, optimized .svg static assets. But there are lots of caveats unless the SVG can be used as an image.

One of the many goals of MDN Web Docs this year is to make it faster. That makes users happier and as a side-effect, it makes Google happier. And hopefully, being faster will mean Google ranks us higher.

I'm still new to the MDN code base and there are many things we can do. One thing I noticed is that the site uses inline SVG. E.g.


<a href="/en-US/docs/Learn">References &amp; Guides
    <svg class="icon icon-caret-down" xmlns="http://www.w3.org/2000/svg" width="16" height="28" viewBox="0 0 16 28">
      <path d="M16 11a.99.99 0 0 1-.297.703l-7 7C8.516 18.89 8.265 19 8 19s-.516-.109-.703-.297l-7-7A.996.996 0 0 1 0 11c0-.547.453-1 1-1h14c.547 0 1 .453 1 1z"/>
    </svg>
</a>

The site uses HTTP/2, so the argument about reducing the number of requests is not valid. Well, with caveats. Browser support for HTTP/2 is getting really good. Definitely good enough to make it worth betting on.
It used to be that making static assets external was a trade-off: you can potentially avoid downloads entirely thanks to browser caching, and the initial HTML document becomes smaller, but all at the cost of more requests.

There are other, more subtle, differences with SVG. For example, the content of the SVG might depend on dynamic data. There might be other differences I'm not aware of, and I'm quick to admit that I don't know much about <use> and the like, but this article might be full of those details.

The Experiment

I wrote a script that opens https://developer.mozilla.org/en-US/, extracts every <svg> tag, and puts them on disk. E.g. svg.icon.icon-smile_fbf6292.svg. The filenames include a hash checksum of the content in case two <svg>s are different but have the same classList. Then it does the following (a rough sketch of these steps comes after the list):

  1. Run svgo on each .svg to create a .min.svg.
  2. Run zopfli on each .min.svg to create a .min.svg.gz
  3. Run brotli on each .min.svg to create a .min.svg.br
  4. Sum the total size of all the inline ones, the total size of all .min.svg, all .min.svg.gz, and all .min.svg.br.
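Steps 1-3 are roughly just shelling out to the three tools for each extracted file; something along these lines (a sketch, assuming svgo, zopfli, and brotli are installed and on $PATH; exact CLI flags can vary between versions):

import subprocess
from pathlib import Path

svg_dir = Path("svgs")  # hypothetical directory with the extracted .svg files
originals = [p for p in svg_dir.glob("*.svg") if not p.name.endswith(".min.svg")]

for svg in originals:
    minified = svg.with_suffix(".min.svg")
    # 1. Optimize the markup itself
    subprocess.run(["svgo", str(svg), "-o", str(minified)], check=True)
    # 2. Gzip with zopfli (writes <name>.gz next to the input)
    subprocess.run(["zopfli", str(minified)], check=True)
    # 3. Brotli
    subprocess.run(["brotli", "-f", "-o", str(minified) + ".br", str(minified)], check=True)

def total(pattern):
    return sum(p.stat().st_size for p in svg_dir.glob(pattern))

print("min.svg:   ", total("*.min.svg"))
print("min.svg.gz:", total("*.min.svg.gz"))
print("min.svg.br:", total("*.min.svg.br"))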

Results

Technique               Number    Total Bytes
Inline                  27        22,142 (21.6KB)
Optimized with svgo     15        14,566 (14.2KB)
Zopfli compressed       15        6,236 (6.1KB)
Brotli compressed       15        5,789 (5.7KB)

Conclusions and Caveats

For every single MDN page, we stand to make the initial HTML document 22KB smaller. Every time. Most web developers I know often Google for something and end up on MDN and do so frequently enough that there's a good chance for a warm browser cache.

But! This 22KB is uncompressed. Since the HTML documents are served gzipped, at a ratio of about 1:4, the total of the inline SVGs is roughly 5.6KB over the wire. At the time of writing, the MDN home page is 58,496 bytes decompressed and 14,570 bytes gzipped. So that means we stand to potentially strip away roughly 40% of the document size!
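The back-of-the-envelope math with those numbers, as a quick check:

inline_svg = 22_142  # bytes of inline SVG, uncompressed
page = 58_496        # MDN home page, decompressed
page_gz = 14_570     # MDN home page, gzipped

print(round(100 * inline_svg / page))           # ~38% of the decompressed document
print(round(100 * (inline_svg / 4) / page_gz))  # ~38% of the gzipped document, assuming a ~1:4 gzip ratio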

Second but! There are some non-trivial differences in usage of SVG. You can't simply replace...


<a href="/en-US/docs/Learn">References &amp; Guides
  <svg class="icon icon-caret-down" xmlns="http://www.w3.org/2000/svg" width="16" height="28" viewBox="0 0 16 28">
    <path d="M16 11a.99.99 0 0 1-.297.703l-7 7C8.516 18.89 8.265 19 8 19s-.516-.109-.703-.297l-7-7A.996.996 0 0 1 0 11c0-.547.453-1 1-1h14c.547 0 1 .453 1 1z"/>
  </svg>
</a>

...with...


<a href="/en-US/docs/Learn">References &amp; Guides
  <img src="/static/icon-caret.914d0e4.min.svg">
</a>

(Compare this and this)

You can, instead, use <svg><use xlink:href="/static/icon-caret.914d0e4.min.svg"/></svg> but it comes with its own host of challenges and problems (styling and IE support), and you still need the <svg> tag to do the wrapping in the first place, which adds bytes.

It's not always worth compressing tiny static assets. And it might be worth experimenting with what the CPU cost is for a low-performance mobile device to decompress the asset versus just eating the extra network download cost of leaving it uncompressed.

HTTP/2 is great in that it allows the browser to download external assets earlier, on the same open connection, as the initial HTML document. But it's not without risks and costs that need to be carefully considered.

Optimize DOM selector lookups by pre-warming by selectors' parents

February 11, 2019
0 comments Web development, Node, Web Performance, JavaScript

tl;dr; minimalcss 0.8.2 introduces a 20% post-processing optimization by grouping many CSS selectors under their parent CSS selectors as a pre-emptive cache.

The core of minimalcss is that it downloads a DOM tree, as HTML, parses it, and parses all the associated CSS stylesheets. These might come from <link rel="stylesheet"> or <style> tags.
Once the CSS stylesheets are turned into an AST, it loops over each and every CSS selector and asks a simple question: "Does this CSS selector exist in the DOM?". The equivalent is to open your browser's Web Console and type:

>>> document.querySelectorAll('div.foo span.bar b').length > 0
false

Based on these lookups (which are done with cheerio, by the way), minimalcss reduces the CSS, as an AST, and eventually spits the AST back out as a CSS string. The only problem is: it's slow. In the case of view-source:https://semantic-ui.com/, there are 6,784 of these selectors in the CSS it uses. What to do?

First of all, there isn't a lot you can do. This is the work that needs to be done. But one thing you can do is be smart about which selectors you look at and use a "decision cache" to pre-emptively draw conclusions. So, if this is what you have to check:

  1. #example .alternate.stripe
  2. #example .theming.stripe
  3. #example .solid .column p b
  4. #example .solid .column p

As you process the first one, you extract that the parent CSS selector is #example, and if that doesn't exist in the DOM, you can efficiently draw conclusions about all subsequent selectors that start with #example .... Granted, if the parents all exist you pay the penalty of doing an extra lookup. But that's a trade-off this optimization is worth.
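In other words, the cache is keyed on the parent selector. A language-agnostic sketch of the idea in Python (exists_in_dom() is a hypothetical stand-in for the real DOM lookup):

def build_checker(exists_in_dom):
    parent_cache = {}  # parent selector -> does it exist in the DOM?

    def check(selector):
        parent = selector.split()[0]  # e.g. "#example" from "#example .solid .column p"
        if parent not in parent_cache:
            parent_cache[parent] = exists_in_dom(parent)
        if not parent_cache[parent]:
            return False  # the parent isn't in the DOM, so no descendant selector can match
        if parent == selector:
            return True
        return exists_in_dom(selector)  # the extra-lookup penalty when the parent does exist

    return check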

Check out the comments where I tested a bloated page that uses Semantic-UI before and after. Instead of doing 3,285 of these document.querySelector(selector) calls, it's now able to come to the exact same conclusion with just 1,563 lookups.

Sadly, the majority of the processing time lies in network I/O and other overheads, but this work did reduce something that used to take 6.3s (median) to 5.1s (median).

How much HTML is too much for optimal web performance

October 17, 2018
4 comments Web development, Web Performance

Right off the bat; I don't know. All I know is that it's complicated.

I have this page, which is just a blog post page. It's entirely rendered on the server, comments and all. At the time of writing, the total size of the HTML document is 119KB (30KB gzipped). If you remove all the comments, which make up the bulk of the HTML, it shrinks to 31KB (7KB gzipped). Fair enough. That's 23KB less to download. But does it matter (much)?

Downloading

First of all, I noticed this:

Waterfall
WebPagetest with iPhone 6, 4G on the same US coast as the datacenter

That's a WebPagetest run using an iPhone 6 on 4G and, lemme emphasize this, it took 126ms to download the HTML document. If you subtract "DNS Lookup" (283ms), "Initial Connection" (1013ms), and "SSL Negotiation" (733ms), it took 684ms to serve the file, download it, and parse it. Remember, this is all on 4G. Pretty fast. In conclusion, there's probably not too much HTML in that page to download. This downloadingness is a fraction of the total "web performance cost". Let's dig deeper.

Note! With WebPagetest all those numbers like DNS Lookup, Initial Connection and SSL Negotiation are wildly unpredictable between tests. Chances are, the numbers are very different the next time you run a test using the exact same input. Who knows. Deep internet plumbings beyond the control of WebPagetest.

Note! I ran it one more time with the exact same parameters and this time it was 535ms (instead of 684ms) to serve, download, and parse.

Parsing & layout

Parsing is hard to measure but here's what I found when using the Google Chrome dev tools:

Google Chrome Performance devtools
Google Chrome Performance devtools

It says it took...

  • parsed HTML - 94ms
  • recalculate style - 43ms
  • layout - 386ms

That's half a second just loading and rendering. Definitely sucks. But note, this test uses 4x CPU slowdown and 3G simulation. So perhaps it's not so bad.

Let's try again with a smaller HTML document

So I butchered up a hybrid version that has almost the same HTML, except that all but 1 of those 166'ish div.comment DOM nodes are gone. It's now 31KB (7KB gzipped) to download instead of 119KB (30KB gzipped).

Same WebPagetest parameters but now with this smaller HTML document:

WebPagetest with a much smaller HTML footprint
WebPagetest with a much smaller HTML footprint

Now it says it only took 39ms to download and 232ms (it was 684ms before) to serve the file, download it, and parse it. Interesting!

Note! I ran it one more time with the exact same parameters and this time it was 237ms (instead of 232ms) to serve, download, and parse.

Clearly it's working. The smaller the HTML document the faster it performs. No surprise. But stick around for the conclusion.

Parsing & layout with a smaller HTML document

Check this out:

Google Chrome Performance devtools (smaller HTML document)
Google Chrome Performance devtools (smaller HTML document)

It says it took...

  • parsed HTML - 91ms
  • recalculate style - 6ms
  • layout - 29ms

Mind you, all of these numbers are at the mercy of what my laptop is up to at the moment as it can affect Chrome's rendering if it has, at that moment, less (or more) access to CPU and memory caching.

Either way, it parses + layout in 126ms instead of 523ms for the larger HTML document.

Side-by-side

The best test to see how much faster the smaller HTML document variant is, is to compare them side-by-side. It looks like this:

Side-by-side
Visual comparison on WebPagetest (using 4G)

Three major takeaways from this:

  1. The smaller HTML version starts rendering half a second before the original one.
  2. The complete time favors the smaller HTML version by 2.5 seconds, but that's possibly influenced more by the ads that load than by any slow layout rendering.
  3. This is using 4G which isn't unheard of but definitely much less common than better speeds.

Here they are compared on "Desktop" which appears to give the smaller HTML version a 0.2 second advantage:

Visual comparison on WebPagetest (using "Desktop")
Visual comparison on WebPagetest (using "Desktop")

And here are the Lighthouse reports side-by-side:

Lighthouses
Side-by-side using Lighthouse

Discussion

The above concludes rather unsurprisingly that a smaller HTML footprint downloads, parses and lays out quicker.

The killer reason that page is so large, with all those comments rendered in the original HTML, is simple: SEO. Google loves comments because comments indicate that the page is thriving and a place where people go, spend time, and stick around. I've experimented with this in the past and found that if I make the HTML document smaller (or load the rest after document load) the SEO takes a big hit. Yes, Google's bot renders with JavaScript, but not always, and even if it does, I assume it's smart enough to appreciate that content that is loaded async or post-DOMContentLoaded is less important and thus not what the page is about.

Regarding SEO, we know that Google loves fast sites. Especially for mobile. But content is still king my gut tells me. Left as an exercise to the reader to take a stand on this.

Another problem with lazy loading the comments (or whatever else might be applicable to your site) is that it might cause "flicker". I put that word in quotes because sometimes flicker is literally visual flicker and sometimes it's moments of browser sluggishness. The XHR request and the subsequent post-rendering cause a bunch of work that strains the browser and might make things unpleasant when your eyes and brain are in the midst of committing to consuming the page.

Basically, there are significant real benefits to not trying to squeeze every little millisecond out by making the HTML smaller upfront. Remember that the "smaller HTML" version in this test is drastic. I butchered it from 119KB to 31KB, which might be so drastic that it's not really applicable at all. In other words, had I reduced the HTML size by just 20%, it might not even register on the performance graph but could be significant in terms of SEO keywords.

Conclusion

The time it takes to make a web page useful to a user is a sum of all sorts of things. The size of the HTML document does matter, but remember that it's just one of multiple aspects to watch out for.

In conclusion, it's complicated and depends on your needs and context. I hope you can benefit a little bit from the metrics above.

An awesome snippet to web performance test a page programmatically

October 1, 2018
0 comments Web development, JavaScript, Web Performance

I found this in an issue discussing measuring page performance with puppeteer and it's pure gold. Especially because it's so accessible and easy to use.

Here's the code:


const puppeteer = require('puppeteer');

async function run() {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.goto('https://www.peterbe.com/');

  console.log('\n==== performance.getEntries() ====\n');
  console.log(
    await page.evaluate(() =>
      JSON.stringify(performance.getEntries(), null, '  ')
    )
  );

  console.log('\n==== performance.toJSON() ====\n');
  console.log(
    await page.evaluate(() => JSON.stringify(performance.toJSON(), null, '  '))
  );

  console.log('\n==== page.metrics() ====\n');
  const perf = await page.metrics();
  console.log(JSON.stringify(perf, null, '  '));

  await browser.close();
}

run();

Network waterfall Google Chrome

When you run it you get this output: https://gist.github.com/peterbe/afb09bf9277e5fa9242f8d270c687640
To run it you need to have a decently up-to-date version of puppeteer installed.

I don't claim (far from it actually!) to understand all the metrics in there, but I believe this is basically what the Network panel in the Google Chrome Dev tools is built upon. But some details and facts are easy to figure out and use in your analysis. For example, getEntries() lists all the resources that had to be downloaded, in the order they were downloaded. Also, at the end of getEntries() you get the first-paint, which is often a useful metric.

Anyway, give it a spin. Wrap this up in a platform and see if you can build something really simple and really tailored to your web project's web performance testing.