Peterbe.com

A blog and website by Peter Bengtsson

Filtered home page!
Currently only showing blog entries under thecategory: Nginx. Clear filter

I think I might put my whole site behind a CDN

Web development, Nginx, Web Performance

tl;dr; I'm going to put this blog behind KeyCDN and I expect a 2-4x performance boost (on Time To First Byte).

Right now, requests to my blog go straight to an Nginx server in DigitalOcean in NYC, USA. The Nginx server, 99% of the time, serves the blog posts (and static assets) as index.html files straight from disk. If the request is GET /plog/some-slug it will search for a file called /path/to/cached/files/plog/some-slug/index.html (or index.html.br or index.html.gz depending on the user agent's Accept-Encoding header). Only if the file doesn't exist on disk, it goes through to Django (via uWSGI built into Nginx). All of it is done with HTTP/2 and uses LetsEncrypt for SSL.

This has been working great but it's time to step it up. It's time to put the whole site behind a CDN. And I think I'm going to use KeyCDN for it.

In the past, it used to be best-practice that you serve your HTML document from your smart server (e.g. Django) and then, for the static assets, you put in a CDN. Like this:

<html>
  <link rel="stylesheet" href="https://myaccount123.cloudakamaifastlyflare.com/static/main.d910ef9a33.css">

...
<body>
  <img src="https://myaccount123.cloudakamaifastlyflare.com/images/hero.jpg">

...

But with HTTP/2, this becomes an anti-pattern for web performance because your client has already made an expensive HTTP/2 connection (and SSL negotiation) to https://yourcooldomain.com and now it's cheap to just download the rest. I used to do it like that too and I don't regret it. As a matter of fact, on https://songsear.ch is straight to Nginx but all its images are (lazy) loaded via songsearch-2916.kxcdn.com. But I think, when time allows, I'll put all of Song Search behind a CDN too.

Basically, it's time to put the whole site behind a CDN. With smart purging techniques and smarter CDNs respecting your dynamic content cache control headers, it's time to share the load. ...all over the world.

CDN Choices

There are many sites that want to compare CDNs. But many are affiliated or even made by one of them. So it's hard to get comparisons. For example, KeyCDN demonstrates they're the cheapest by comparing themselves with 5 others that they picked. (But mind you, that seems reasonably backed up by this comparison on cdn.reviews).

CDNPerf does a decent job with cool graphs and stuff. Incidentally, they rank my current favorite (KeyCDN) as the slowest compared to the well known giants that I compared it to.

CDNPerf graph

But mind you, the perf difference between KeyCDN and the winner (topmost in the graph as of today) is 36ms vs 47ms which are both fantastic numbers.

CDNPerf list

It's hard to compare CDNs because they're all pretty fast, and actually, they're all reasonably cheap. What really matters is the features and that's a lot harder to compare. CloudFlare often comes up as a CDN provider with stellar features that impress me. I've never actually used them but at least they mention "Fast cache purge" and "API programmability" are their key features. But they also don't mention Brotli caching which I know is a feature KeyCDN supports.

KeyCDN has been great to me in the past when I've used it to CDN host static assets. I'm familiar with their interface and they recently launched an API to do things like purge-by-tag and purge-by-URL. They're cheap, which matters because in this context it's all side-project stuff I want to put behind a CDN. They have a Python library which, although very rough around the edges, it works. And also very important; I've communicated very successfully with them through their support and they've been responsive and helpful. So I'll go with KeyCDN.

The Opportunity

Before I move my domain www.peterbe.com to become a CNAME for one of their CDN domains, I wanted to experiment a little and see how it works and what performance numbers I get for comparison. So I set up beta.peterbe.com and did some Django and Nginx wiring so it would work the same but with the difference that it goes through a CDN for everything.

Then I picked a random page and set up a Hyperping monitor from all of its available regions and let it brew for a while. Unfortunately, Hyperping doesn't let you compare two monitors side-by-side so you're going to have to use your own eyes to compare the graphs:

www means no CDN, just the origin Nginx
NOT behind a CDN (server is New York, USA)

beta means with a CDN in front
Behind a CDN

The "total Response Time" in Hyperping doesn't really make sense. They're an average across all regions it pings from. If you live in, for example, Germany; the only response time that matters to you is 1,215 ms versus 40 ms. Equally, if you live somewhere in New York, the only response time that matters to you is 20 ms versus 64 ms.

I actually ran another benchmark. I used Python like this:

t0 = time.time()
r = requests.get('https://www.peterbe.com/plog/some-slug')
t1 = time.time()
print("Took", t1 - t0)

I did this from South Carolina which means my nearest KeyCDN edge location could be Atlanta, Miami, or New York. Either way, I'm reasonably near New York (compared to the rest of the world) so it'd be a fair performance comparison for all US east coast traffic. (Insert disclaimer here). It downloads the most recent blog posts, in repeated cycles, which gives the CDN a solid chance to warm up and then it compares the median of the last 100 downloads. The output of this is as follows:

beta
    COUNT               1854 (but only using the last 100)
    HIT RATIO           100.0%
    AVERAGE (all)       63.12ms
    MEDIAN (all)        61.89ms

www
    COUNT               1856 (but only using the last 100)
    HIT RATIO           100.0%
    AVERAGE             136.22ms
    MEDIAN              135.61ms

("HIT RATIO" for the non-CDN URL means it was served entirely without Djando server rendering)

What it means is that the median, with a CDN is: 62ms and 135.6ms without. That's a 2x boost.

The crawler stats script is available here: github.com/peterbe/peterbecom-cdn-crawl and I would be thrilled if you can clone it and run it and report what numbers you get and where you're running it from.

Notes and Conclusion

Mind you, 62ms vs. 136ms might sound like a silly difference if Webpagetest says it takes 700ms until the page is interactive (on an LTE connection). And this is a tiny super-optimized page. But never forget A) we can't all live in the US east-coast area and B) if the HTML can download marginally faster it allows the browser to parse it sooner and start downloading all the other stuff much sooner. It'll make a big difference! I'm sure you've all seen graphs like this:

Cold-cache MDN page on 4G
Imagine if all those static asset downloads could have started a whole second "to the left"

Of course a CDN is faster. It's no news. But it's also a hassle and it costs money. It's 2019 and most good CDNs now support Brotli, fast purge-by-url, and HTTP/2. It's time to make the switch! It's not like cache-invalidation is hard.

UPDATE April 23 2019 (same day)

KeyCDN has a neat looking tool that is similar to Hyperping but more of a one-off kinda deal. It's called Performance Test and I wouldn't be surprised it's biased as heck because they probably run these pings from the same location'ish as where they have the edge locations. Anyway, the results are nevertheless juicy. Note the last, TTFB column numbers.

Performance Test without CDN
Performance Test without CDN

Performance Test with CDN
Performance Test with CDN

Please post a comment if you have thoughts or questions.

KeyCDN vs. DigitalOcean Nginx

Web development, Nginx, Web Performance

tl;dr; The global average response time of serving an image from my NYC DigitalOcean server compared to a CDN is almost 10x.

KeyCDN is a CDN service that I use for side-projects. It's great. It has about ~35 edge locations. I don't know much about how their web servers work but I can't imagine it's much different from the origin server. In principle.

The origin server is my DigitalOcean (6 vCPU, 16 GB RAM, Ubuntu 14) droplet. It's running an up-to-date CloudFlare build of Nginx and the static images are served straight from (SSD) disk with a 4 weeks TTL (max-age=2419200,public,immutable). The SSL is done with LetsEncrypt and I'm somewhat confident the Nginx is decently configured and uses HTTP/2.

So the CDN, on songsearch-2916.kxcdn.com, is basically configured to front any requests to songsear.ch. If the origin has cache-control headers, KeyCDN knows it can hold on to it for a while, but it's not a guarantee that it will for the full time specified in the cache-control. Either way; how does it compare?

The Experiment

I picked a random static asset URL. It's a 32 KB JPEG file. Its origin URL and its CDN URL are:

  1. https://songsear.ch/static/albums/2017/05/24/08/170630_300x300.jpg
  2. https://songsearch-2916.kxcdn.com/static/albums/2017/05/24/08/170630_300x300.jpg

Next, I set up a Hyperping monitor on both URLs as GET requests. For the regions (regions from where Hyperping will do pings from), I picked the following:

  1. San Francisco, USA
  2. New York, USA
  3. London, United Kindom
  4. Frankfurt, Germany
  5. Mumbai, India
  6. São Paulo, Brazil
  7. Sydney, Australia

(I wish I had selected all 12 possible regions when I started but now it's too late for lazy me)

Then, I let Hyperping GET these URLs for a while and behold, here are the numbers:

The Results

Average response time:

  • The CDN: 79 ms
  • The origin: 714 ms

That's a 10x difference!

Mind you, the "average response time" is across all regions. It doesn't reflect what people get. If 90% of your visitors are from Australia, the average response times would, of course, be very different. But as an example, the origin server is in New York and there, the average response time is 26 ms vs. 105 ms which is a 5x difference.

Here are some screenshots from Hyperping:

KeyCDN results
KeyCDN

Origin server
Origin server

Conclusion

KeyCDN's server is clearly fast and worth doing. It's unsurprising that it performs better far away from New York but it's surprising how much faster it is at serving than the origin when pinged from New York (5x difference).

The site is still NOT fronted by a CDN because, apart from the images, almost all content is un-cacheable. However, I need to do more research and experimentation with putting everything behind a CDN and being meticulous with setting no-cache headers on dynamic stuff and using async tools to invalidate CDN caches when appropriate.

Please post a comment if you have thoughts or questions.

Experimenting with Nginx worker_processes

Web development, Nginx, MacOSX, Linux

I have Nginx 1.15.8 installed with Homebrew on my macOS. By default the /usr/local/etc/nginx/nginx.conf it set to...:

worker_processes  1;

But, from the documentation, it says:

"The optimal value depends on many factors including (but not limited to) the number of CPU cores, the number of hard disk drives that store data, and load pattern. When one is in doubt, setting it to the number of available CPU cores would be a good start (the value “auto” will try to autodetect it)." (bold emphasis mine)

What is the ideal number for me? The performance of Nginx on my laptop doesn't really matter. But for my side-projects it's important to have a fast Nginx since it serves static HTML and lots of static assets. However, on my personal servers I have a bunch of other resource hungry stuff going on that I know is more likely to need the resources, like Elasticsearch and uwsgi.

To figure this out, I wrote a benchmark program that requested a small index.html about 10,000 times across 10 concurrent clients with hey.

hey -n 10000 -c 10 http://peterbecom.local/plog/variable_cache_control/awspa

I ran this 10 times between changing the worker_processes in the nginx.conf file. Here's the output:

1 WORKER PROCESSES
BEST  : 13,607.24 reqs/s

2 WORKER PROCESSES
BEST  : 17,422.76 reqs/s

3 WORKER PROCESSES
BEST  : 18,886.60 reqs/s

4 WORKER PROCESSES
BEST  : 19,417.35 reqs/s

5 WORKER PROCESSES
BEST  : 19,094.18 reqs/s

6 WORKER PROCESSES
BEST  : 19,855.32 reqs/s

7 WORKER PROCESSES
BEST  : 19,824.86 reqs/s

8 WORKER PROCESSES
BEST  : 20,118.25 reqs/s

Or, as a graph:

Graph

Now note, this is done here on my MacBook Pro. Not on my Ubuntu DigitalOcean servers. For now, I just want to get a feeling for how these numbers correlate.

Conclusion

The benchmark isn't good enough. The numbers are pretty stable but I'm doing this on my laptop with multiple browsers idling, Slack, and Spotify running. Clearly, the throughput goes up a bit when you allocate more workers but if anything can be learned from this, start with going beyond 1 for a quick fix and from there start poking and more exhaustive benchmarks. And don't forget, if you have time to go deeper on this, to look at the combination of worker_connections and worker_processes.

Please post a comment if you have thoughts or questions.

Be very careful with your add_header in Nginx! You might make your site insecure

Linux, Web development, Nginx

tl;dr; When you use add_header in a location block in Nginx, it undoes all "parent" add_header directives. Dangerous!

Gist of the problem is this:

There could be several add_header directives. These directives are inherited from the previous level if and only if there are no add_header directives defined on the current level.

From the documentation on add_header

The grand but subtle mistake

Basically, I had this:

server {
    server_name example.com;

    ...gzip...
    ...ssl...
    ...root...

    # Great security headers...
    add_header X-Frame-Options SAMEORIGIN;
    add_header X-XSS-Protection "1; mode=block";
    ...more security headers...

    location / {
        try_files    $uri /index.html;
    }
}

And when you curl it, you can see that it works:

$ curl -I https://example.com
[snip]
X-Frame-Options: SAMEORIGIN
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Strict-Transport-Security: max-age=63072000; includeSubdomains; preload

The mistake I had, was that I added a new add_header inside a relevant location block. If you do that, all the other "global" add_headers are dropped.
E.g.

server {
    server_name example.com;

    ...gzip...
    ...ssl...
    ...root...

    # Great security headers...
    add_header X-Frame-Options SAMEORIGIN;
    add_header X-XSS-Protection "1; mode=block";
    ...more security headers...

    location / {
        try_files    $uri /index.html;
        # NOTE! Adding some more headers here
+       add_header X-debug-whats-going-on on; 
    }
}

Now, same curl command:

$ curl -I https://example.com
[snip]
X-debug-whats-going-on: on

Bad score on Observatory for www.peterbe.com
Yikes! Now those other useful security headers are gone!

Here are your options:

  1. Don't add headers like that inside location blocks. Yeah, that's not always a choice.
  2. Copy-n-paste all the general security add_header blocks into the location blocks where you have to have "custom" add_header entries.
  3. Use an include file, see below.

How to include files

First create a new file, like /etc/nginx/snippets/general-security-headers.conf then put this into it:

# Great security headers...
add_header X-Frame-Options SAMEORIGIN;
add_header X-XSS-Protection "1; mode=block";
...more security headers...
# More realistically, see https://gist.github.com/plentz/6737338

Now, instead of saying these add_header lines in your /etc/nginx/sites-enabled/example.conf change that to:

server {
    server_name example.com;

    ...gzip...
    ...ssl...
    ...root...

    include /etc/nginx/snippets/general-security-headers.conf;

    location / {
        try_files    $uri /index.html;
        # Note! This gets included *again* because
        # this location block needs its own custom add_header
        # directives.
        include /etc/nginx/snippets/general-security-headers.conf;
        # NOTE! Adding some more headers here
        add_header X-debug-whats-going-on on; 
    }
}

(You need to use your imagination that a real Nginx config site probably has many different more complex location directives)

It's arguably a bit clunky but it works and it's the best of both worlds. The right security headers for all locations and ability to set custom add_header directives for specific locations.

Discussion

I'm most disappointed in myself for not noticing. Not for not noticing this in the Nginx documentation, but that I didn't check my security headers on more than one path. But I'm also quite disappointed in Nginx for this rather odd behaviour. To quote my security engineer at Mozilla, April King:

"add" doesn't usually mean "subtract everything else"

She agreed with me that the way it works is counter-intuitive and showed me this snippet which uses include files the same way.

Please post a comment if you have thoughts or questions.