Peterbe.com

A blog and website by Peter Bengtsson

Filtered home page!
Currently only showing blog entries under the category: Web development. Clear filter

Find out all localStorage keys and their value sizes

13 July 2019 0 comments   Javascript, Web development


I use localhost:3000 for a lot of different projects. It's the default port on create-react-app 's dev server. The browser profile remains but projects come and go. There's a lot of old stuff in there that I have no longer any memory of adding.

My Storage tab in Firefox

Working in a recent single page app, I tried to use localStorage as a cache for some XHR requests and got: DOMException: "The quota has been exceeded.".
Wat?! I'm only trying to store a ~250KB JSON string. Surely that's far away from the mythical 5MB limit. Do I really have to lzw compress the string in and out to save room and pay for it in CPU cycles?

Better yet, find out what junk I still have in there.

Paste this into your Web Console (it's safe as milk):

Object.entries(localStorage).forEach(([k,v]) => console.log(k, v.length, (v.length / 1024).toFixed(1) + 'KB'))

The output looks something like this:

Web Console output

Or, sorted and filtered a bit:

Object.entries(localStorage).sort((a, b) => b[1].length -a[1].length).slice(0,5).forEach(
([k,v]) => console.log(k, v.length, (v.length / 1024).toFixed(1) + 'KB'));

Looks like this:

Sorted and sliced

And for the record, summed total in kilobytes:

(Object.values(localStorage).map(x => x.length).reduce((a, b) => a + b) / 1024).toFixed(1) + 'KB';

Summed in KB

Wrapping up

Seems my Firefox browser's localStorage limit is still 5MB.

Also, you can do the loop using localStorage.length and localStorage.key(n) and localStorage.getItem(localStorage.key(n)).length but using Object.entries(localStorage) seems neater.

I guess this means I can still use localStorage in my app. It seems I just need to localStorage.removeItem('massive-list:items') which sounds like an experiment, from eons ago, for seeing how much I can stuff in there.

From jQuery to Cash

18 June 2019 0 comments   Javascript, Web development


tl;dr; The main JavaScript bundle goes from 29KB to 6KB by switching from JQuery to Cash. Both with Brotli compression.

In Web Performance, every byte counts. Downloading less stuff means faster network operations but for JavaScript it also means less to parse and execute. This site used use JQuery 3.4.1 but now uses Cash 4.1.2. It requires some changes to how you use $ and most noticeable is the lack of animations and $.ajax.

I still stand by the $ function. It's great when you have a regular (static) website that isn't a single page app but still needs a little bit of interactive JavaScript functionality. On this site, I use it for making the commenting work and some various navigation/header stuff.

Switching to Cash means you have to stop doing things like $.getJSON() and $('.classname').fadeIn(400) which, in a sense, gives Cash an unfair advantage because those bits take up a large portion of the bundle size. Yes, there is a custom build of jQuery without those but check out this size comparison:

BundleUncompressed (bytes)Gzipped (bytes)
jQuery 3.4.188,14530,739
jQuery 3.4.1 Slim71,03724,403
Cash 4.1.214,8185,167

I still needed a fadeIn function, which I was relying on from jQuery, but to remedy that I just copied one of these from youmightnotneedjquery.com. It would be better to not do that an use a CSS transform instead but, well, I'm only human.

Before: with jQuery
Before: with jQuery

Another thing you'll need to replace is to switch from $.ajax to fetch but there are good polyfills but I haven't bothered with polyfills because the tiny percentage of visitors I have, without fetch support still get a working site but can't post comments.

I was contemplating doing what GitHub did in 2018 which was to replace jQuery with real vanilla JavaScript code but it didn't seem worth it now that Cash is only 5KB (gzipped) and it's an actively maintained project too.

Before: with jQuery
Before: with jQuery

After: with Cash
After: with Cash

WebSockets vs. XHR 2019

05 May 2019 0 comments   Javascript, Web Performance, Web development

https://sockshootout.app/


Back in 2012, I did an experiment to compare if and/or how much faster WebSockets are compared to AJAX (aka. XHR). It would be a "protocol benchmark" to see which way was faster to schlep data back and forth between a server and a browser in total. The conclusion of that experiment was that WebSockets were faster but when you take latency into account, the difference was minimal . Considering the added "complexities" of WebSockets (keeping connections, results don't come where the request was made, etc.) it's not worth it.

But, 7 years later browsers are very different. Almost all browsers that support JavaScript also support WebSockets. HTTP/2 might make things better too. And perhaps the WebSocket protocol is just better implemented in the browsers. Who knows? An experiment knows.

So I made a new experiment with similar tech. The gist of the code is best explained with some code:

// Inside App.js

loopXHR = async count => {
  const res = await fetch(`/xhr?count=${count}`);
  const data = await res.json();
  const nextCount = data.count;
  if (nextCount) {
    this.loopXHR(nextCount);
  } else {
    this.endXHR();
  }
};

Basically, pick a big number (e.g. 100) and send that integer to the server which does this:

# Inside app.py 

# from the the GET querystring "?count=123"
count = self.get_argument("count")   
data = {"count": int(count) - 1}
self.write(json.dumps(data))

So the browser keeps sending the number back to the server that decrements it and when the server returns 0 the loop ends and you look how long the whole thing took.

Try It

The code is here: https://github.com/peterbe/sockshootout2019

And the demo app is here: https://sockshootout.app (Just press "Start!", wait and press it 2 or 3 more times)

Location, location, location

What matters is the geographical distance between you and the server. The server used in this experiment is in New York, USA.

What you'll find is that the closer you are to the server (lower latency) the better WebSocket performs. Here's what mine looks like:

My result between South Carolina, USA and New York, USA
My result between South Carolina, USA and New York, USA

Now, when I run the whole experiment all on my laptop the results look very different:

Running all locally
Running all locally

I don't have a screenshot for it but a friend of mine ran this from his location in Perth, Australia. There was no difference. If any difference it was "noise".

Same Conclusion?

Yes, latency matters most. The technique, for the benefit of performance, doesn't matter much.

No matter how fancy you're trying to be, what matters is the path the bytes have to travel. Or rather, the distance the bytes have to travel. If you're far away a large majority of the total time is sending and receiving the data. Not the time it takes the browser (or the server) to process it.

However, suppose you do have all your potential clients physically near the server, it might be beneficial to use WebSockets.

Thoughts and Conclusions

My original thought was to use WebSockets instead of XHR for an autocomplete widget. At almost every keystroke, you send it to the server and as search results come in, you update the search result display. Things like that need to be fast and "snappy". But that's not where WebSockets shine. They shine in their ability to actively await results without having a loop that periodically pulls. There's nothing wrong with WebSocket and it has its brilliant use cases.

In summary, don't bother just to get a single-digit percentage performance increase if the complexity of the code and infrastructure is non-trivial. Keep building cool stuff with WebSockets but if you expect one result per action, XHR is good enough.

Bonus

The experiment app does collect everyone's results (just the timings and IP) and I hope to find the time to process this and build graph a correlating the geographical distance compared to the difference between the two techniques. Watch this space!

By the way, if you do plan on writing some WebSocket implementation code I highly recommend Sockette. It's solid and easy to use.

KeyCDN vs AWS CloudFront

29 April 2019 0 comments   Web Performance , Web development


Before I commit to KeyCDN for my little blog I wanted to check if CloudFront is better. Why? Because I already have an AWS account set up, familiar with boto3, it's what we use for work, and it's AWS so it's usually pretty good stuff. As an attractive bonus, CloudFront has 44 edge locations (KeyCDN 34).

Price-wise it's hard to compare because the AWS CloudFront pricing page is hard to read because the costs are broken up by regions. KeyCDN themselves claim KeyCDN is about 2x cheaper than CloudFront. This seems to be true if you look at cdnoverview.com's comparison too. CloudFront seems to have more extra specific costs. For example, with AWS CloudFront you have to pay to invalidate the cache whereas that's free for KeyCDN.

I also ran a little global latency test comparing the two using Hyperping using 7 global regions. The results are as follows:

KeyCDN on Hyperping.io
KeyCDN on Hyperping.io

CloudFront on Hyperping.io
CloudFront on Hyperping.io

RegionKeyCDNCloudFrontWinner
London27 ms36 msKeyCDN
San Francisco29 ms46 msKeyCDN
Frankfurt47 ms1001 msKeyCDN
New York City52 ms68 msKeyCDN
São Paulo105 ms162 msKeyCDN
Sydney162 ms131 msCloudFront
Mumbai254 ms76 msCloudFront

Take these with a pinch of salt because it's only an average for the last 1 hour. Let's agree that they both faster than your regular Nginx server in a single location.

By the way, both KeyCDN and CloudFront support Brotli compression. For CloudFront, this was added in July 2018 and if your origin can serve according to Content-Encoding you simply tell CloudFront to cache based on that header.

Although I've never tried it CloudFront does have an API for doing cache invalidation (aka. purging) and you can use boto3 to do it but I've never tried it. For KeyCDN here's how you do cache invalidation with the python-keycdn-api:

api = keycdn.Api(settings.KEYCDN_API_KEY)
call = "zones/purgeurl/{}.json".format(settings.KEYCDN_ZONE_ID)
all_urls = [
    'origin.example.com/static/foo.css',
    'origin.example.com/static/foo.cssbr',
    'origin.example.com/images/foo.jpg',
]
params = {"urls": all_urls}
response = api.delete(call, params)
print(response)

I'm not in love with that API but I know it issues the invalidation fast whereas with CloudFront I heard it takes a while to take effect.

I think I might put my whole site behind a CDN

23 April 2019 0 comments   Web Performance, Nginx, Web development


tl;dr; I'm going to put this blog behind KeyCDN and I expect a 2-4x performance boost (on Time To First Byte).

Right now, requests to my blog go straight to an Nginx server in DigitalOcean in NYC, USA. The Nginx server, 99% of the time, serves the blog posts (and static assets) as index.html files straight from disk. If the request is GET /plog/some-slug it will search for a file called /path/to/cached/files/plog/some-slug/index.html (or index.html.br or index.html.gz depending on the user agent's Accept-Encoding header). Only if the file doesn't exist on disk, it goes through to Django (via uWSGI built into Nginx). All of it is done with HTTP/2 and uses LetsEncrypt for SSL.

This has been working great but it's time to step it up. It's time to put the whole site behind a CDN. And I think I'm going to use KeyCDN for it.

In the past, it used to be best-practice that you serve your HTML document from your smart server (e.g. Django) and then, for the static assets, you put in a CDN. Like this:

<html>
  <link rel="stylesheet" href="https://myaccount123.cloudakamaifastlyflare.com/static/main.d910ef9a33.css">

...
<body>
  <img src="https://myaccount123.cloudakamaifastlyflare.com/images/hero.jpg">

...

But with HTTP/2, this becomes an anti-pattern for web performance because your client has already made an expensive HTTP/2 connection (and SSL negotiation) to https://yourcooldomain.com and now it's cheap to just download the rest. I used to do it like that too and I don't regret it. As a matter of fact, on https://songsear.ch is straight to Nginx but all its images are (lazy) loaded via songsearch-2916.kxcdn.com. But I think, when time allows, I'll put all of Song Search behind a CDN too.

Basically, it's time to put the whole site behind a CDN. With smart purging techniques and smarter CDNs respecting your dynamic content cache control headers, it's time to share the load. ...all over the world.

CDN Choices

There are many sites that want to compare CDNs. But many are affiliated or even made by one of them. So it's hard to get comparisons. For example, KeyCDN demonstrates they're the cheapest by comparing themselves with 5 others that they picked. (But mind you, that seems reasonably backed up by this comparison on cdn.reviews).

CDNPerf does a decent job with cool graphs and stuff. Incidentally, they rank my current favorite (KeyCDN) as the slowest compared to the well known giants that I compared it to.

CDNPerf graph

But mind you, the perf difference between KeyCDN and the winner (topmost in the graph as of today) is 36ms vs 47ms which are both fantastic numbers.

CDNPerf list

It's hard to compare CDNs because they're all pretty fast, and actually, they're all reasonably cheap. What really matters is the features and that's a lot harder to compare. CloudFlare often comes up as a CDN provider with stellar features that impress me. I've never actually used them but at least they mention "Fast cache purge" and "API programmability" are their key features. But they also don't mention Brotli caching which I know is a feature KeyCDN supports.

KeyCDN has been great to me in the past when I've used it to CDN host static assets. I'm familiar with their interface and they recently launched an API to do things like purge-by-tag and purge-by-URL. They're cheap, which matters because in this context it's all side-project stuff I want to put behind a CDN. They have a Python library which, although very rough around the edges, it works. And also very important; I've communicated very successfully with them through their support and they've been responsive and helpful. So I'll go with KeyCDN.

The Opportunity

Before I move my domain www.peterbe.com to become a CNAME for one of their CDN domains, I wanted to experiment a little and see how it works and what performance numbers I get for comparison. So I set up beta.peterbe.com and did some Django and Nginx wiring so it would work the same but with the difference that it goes through a CDN for everything.

Then I picked a random page and set up a Hyperping monitor from all of its available regions and let it brew for a while. Unfortunately, Hyperping doesn't let you compare two monitors side-by-side so you're going to have to use your own eyes to compare the graphs:

www means no CDN, just the origin Nginx
NOT behind a CDN (server is New York, USA)

beta means with a CDN in front
Behind a CDN

The "total Response Time" in Hyperping doesn't really make sense. They're an average across all regions it pings from. If you live in, for example, Germany; the only response time that matters to you is 1,215 ms versus 40 ms. Equally, if you live somewhere in New York, the only response time that matters to you is 20 ms versus 64 ms.

I actually ran another benchmark. I used Python like this:

t0 = time.time()
r = requests.get('https://www.peterbe.com/plog/some-slug')
t1 = time.time()
print("Took", t1 - t0)

I did this from South Carolina which means my nearest KeyCDN edge location could be Atlanta, Miami, or New York. Either way, I'm reasonably near New York (compared to the rest of the world) so it'd be a fair performance comparison for all US east coast traffic. (Insert disclaimer here). It downloads the most recent blog posts, in repeated cycles, which gives the CDN a solid chance to warm up and then it compares the median of the last 100 downloads. The output of this is as follows:

beta
    COUNT               1854 (but only using the last 100)
    HIT RATIO           100.0%
    AVERAGE (all)       63.12ms
    MEDIAN (all)        61.89ms

www
    COUNT               1856 (but only using the last 100)
    HIT RATIO           100.0%
    AVERAGE             136.22ms
    MEDIAN              135.61ms

("HIT RATIO" for the non-CDN URL means it was served entirely without Djando server rendering)

What it means is that the median, with a CDN is: 62ms and 135.6ms without. That's a 2x boost.

The crawler stats script is available here: github.com/peterbe/peterbecom-cdn-crawl and I would be thrilled if you can clone it and run it and report what numbers you get and where you're running it from.

Notes and Conclusion

Mind you, 62ms vs. 136ms might sound like a silly difference if Webpagetest says it takes 700ms until the page is interactive (on an LTE connection). And this is a tiny super-optimized page. But never forget A) we can't all live in the US east-coast area and B) if the HTML can download marginally faster it allows the browser to parse it sooner and start downloading all the other stuff much sooner. It'll make a big difference! I'm sure you've all seen graphs like this:

Cold-cache MDN page on 4G
Imagine if all those static asset downloads could have started a whole second "to the left"

Of course a CDN is faster. It's no news. But it's also a hassle and it costs money. It's 2019 and most good CDNs now support Brotli, fast purge-by-url, and HTTP/2. It's time to make the switch! It's not like cache-invalidation is hard.

UPDATE April 23 2019 (same day)

KeyCDN has a neat looking tool that is similar to Hyperping but more of a one-off kinda deal. It's called Performance Test and I wouldn't be surprised it's biased as heck because they probably run these pings from the same location'ish as where they have the edge locations. Anyway, the results are nevertheless juicy. Note the last, TTFB column numbers.

Performance Test without CDN
Performance Test without CDN

Performance Test with CDN
Performance Test with CDN

Whatsdeployed rewritten in React

15 April 2019 0 comments   Javascript, ReactJS, Python, Web development


A couple of months ago my colleague Michael @mythmon Cooper wanted to add a feature to the front-end code of Whatsdeployed and learned that the whole front-end is spaghetti jQuery code. So, instead, he re-wrote it in React. My only requirements were "Use create-react-app and no redux", i.e. keep it simple.

We also took the opportunity to rewrite some of the ways that URLs are handled. It used to be that a "short link" would redirect. For example GET /s-5HY would return 302 to Location: ?org=mozilla&repo=tecken&name[]=Dev&url[]=https://symbols.dev.mozaws.net/__version__&name[]=Stage... Basically, the short link was just an alias for a redirect. Just like those services like bit.ly or g.co. Now, the short link is a permanent fixture. The short link is included in the XHR calls to the server for getting the relevant data.

All old URLs will continue to work but now the canonical URL becomes /s/5HY/mozilla-services/tecken, for example. The :org/:repo isn't really necessary because the server knows exactly what 5HY (in this example means), but it's nice for the URL bar's memory.

Another thing that changed was how it can recognize "bors commits". When you use bors, you put a bunch of commits into a GitHub Pull Request and then ask the bors bot to merge them into master. Using "bors mode" in Whatsdeployed is optional but we believe it looks a lot more user-friendly. Here is an example of mozilla/normandy with and without bors toggled on and off.

Without "bors mode"
Without "bors mode"

With "bors mode"
With "bors mode"

Thank you mythmon!

Lastly, hopefully this will make it a lot easier to contribute. Check out https://github.com/peterbe/whatsdeployed. All you need is Python 3, a PostgreSQL, and almost any version of Node that can run create-react-apps. Ping me if you find it hard to get up and running.