Peterbe.com

Peter Bengtsson's blog

Page 5

In TypeScript, how to combine known and unknown keys to an object

July 3, 2024
0 comments JavaScript

More than happy to be informed of a better solution here! But this came up in a real-work situation and I "stumbled" on the solution by more or less guessing.

In plain JavaScript, you have an object which you know you set certain keys on. But because this object is (ab)used for a templating engine, we also put keys/values on it that are not known in advance. In our use case, these keys and booleans came from parsing a .yml file which. It looks something like this:


// Code simplified for the sake of the example

const context = {
  currentVersion: "3.12", 
  currentLanguage: "en",
  activeDate: someDateObject,
  // ... other things that are values of type number, bool, Date, and string
  // ...
}
if (someCondition()) {
  context.hasSomething = true
}

for (const [featureFlag, truth] of Object.entries(parseYamlFile('features.yml')) {
  context[featureFlag] = truth
}

const rendered = render(template: { context })

I don't like this design where you "combine" an object with known keys with a spread of unknown keys coming from an external source. But here we are and we have to convert this to TypeScript, the clock's ticking!

Truncated! Read the rest by clicking the link below.

Rate my golf swing (June 2024)

July 1, 2024
0 comments Golf

Never done this before but I thought it could be fun.

Things to keep in mind, I'm 45 years old and have only played for 2-3 years. My handicap is 8. I have a coach who has given me 5+ things to keep in mind, and I usually think only of 1-2 when I swing out on the course, and maybe 2-3 when I'm practicing on the mat.

What I'm currently working on is to not move the right hip too much to my right, but to instead load the power into the right thigh. I'm also trying to not bend my left (glove) hand but have a straight line from the underarm to the knuckles. The other thing I'm trying really hard to do is to move the club far out to my right in the first 20-30° with extended arms, rather than lifting the club up by my hands.

I'm super-far from an expert and I see lots of flaws compared to the pros. There's plenty of stuff still on my plate with the things I know I need to work on. Truth be told, at the moment, the thing I'm working on the most isn't swing mechanics but all mental; take it easy, and don't try to smash it.

Simple object lookup in TypeScript

June 14, 2024
2 comments JavaScript

Ever got this error:

Element implicitly has an 'any' type because expression of type 'string' can't be used to index type '{ foo: string; bar: string; }'. No index signature with a parameter of type 'string' was found on type '{ foo: string; bar: string; }'.(7053)

Yeah, me too. What used to be so simple in JavaScript suddenly feels hard in TypeScript.

In JavaScript,


const greetings = {
  good: "Excellent",
  bad: "Sorry to hear",
}
const answer = prompt("How are you?")
if (typeof answer === "string") {
  alert(greetings[answer] || "OK")
}

To see it in action, I put it into a CodePen.

Now, port that to TypeScript,

Truncated! Read the rest by clicking the link below.

Search GitHub issues by title, only

May 31, 2024
0 comments GitHub

tl;dr You can search GitHub issues specifically only in the title by adding in:title.

Suppose you go to a GitHub repository's Issue list. If you search, it will find issues that match your search in any field. For example:

Finding 70 issues by dialog

Now, add in:title to the search input:

Truncated! Read the rest by clicking the link below.

How do you thousands-comma AND whitespace format a f-string in Python

March 17, 2024
1 comment Python

For some reason, I always forget how to do this. Tired of that. Let's blog about it so it sticks.

To format a number with thousand-commas you do:


>>> n = 1234567
>>> f"{n:,}"
'1,234,567'

To add whitespace to a string you do:


>>> name="peter"
>>> f"{name:<20}"
'peter               '

How to combine these in one expression, you do:


>>> n = 1234567
>>> f"{n:<15,}"
'1,234,567      '

Leibniz formula for π in Python, JavaScript, and Ruby

March 14, 2024
0 comments Python, JavaScript

Officially, I'm one day behind, but here's how you can calculate the value of π using the Leibniz formula.

Python


import math

sum = 0
estimate = 0
i = 0
epsilon = 0.0001
while abs(estimate - math.pi) > epsilon:
    sum += (-1) ** i / (2 * i + 1)
    estimate = sum * 4
    i += 1
print(
    f"After {i} iterations, the estimate is {estimate} and the real pi is {math.pi} "
    f"(difference of {abs(estimate - math.pi)})"
)

Outputs:

After 10000 iterations, the estimate is 3.1414926535900345 and the real pi is 3.141592653589793 (difference of 9.99999997586265e-05)

Truncated! Read the rest by clicking the link below.

Notes on porting a Next.js v14 app from Pages to App Router

March 2, 2024
0 comments React, JavaScript

Unfortunately, the app I ported from using the Pages Router to using App Router, is in a private repo. It's a Next.js static site SPA (Single Page App).

It's built with npm run build and then exported so that the out/ directory is the only thing I need to ship to the CDN and it just works. There's a home page and a few dynamic routes whose slugs depend on an SQL query. So the SQL (PostgreSQL) connection, using knex, has to be present when running npm run build.

In no particular order, let's look at some differences

Build times

With caching

After running next build a bunch of times, the rough averages are:

Pages Router: 20.5 seconds
App Router: 19.5 seconds

Without caching

After running rm -fr .next && next build a bunch of times, the rough averages are:

Pages Router: 28.5 seconds
App Router: 31 seconds

Note

Truncated! Read the rest by clicking the link below.

How to avoid a count query in Django if you can

February 14, 2024
1 comment Django, Python

Suppose you have a complex Django QuerySet query that is somewhat costly (in other words slow). And suppose you want to return:

The first N results
A count of the total possible results

So your implementation might be something like this:


def get_results(queryset, fields, size):
    count = queryset.count()
    results = []
    for record in queryset.values(*fields)[:size]
        results.append(record)
    return {"count": count, "results": results}

That'll work. If there are 1,234 rows in your database table that match those specific filters, what you might get back from this is:


>>> results = get_results(my_queryset, ("name", "age"), 5)
>>> results["count"]
1234
>>> len(results["results"])
5

Or, if the filters would only match 3 rows in your database table:


>>> results = get_results(my_queryset, ("name", "age"), 5)
>>> results["count"]
3
>>> len(results["results"])
3

Between your Python application and your database you'll see:

query 1: SELECT COUNT(*) FROM my_database WHERE ...
query 2: SELECT name, age FROM my_database WHERE ... LIMIT 5

The problem with this is that, in the latter case, you had to send two database queries when all you needed was one.
If you knew it would only match a tiny amount of records, you could do this:


def get_results(queryset, fields, size):
-   count = queryset.count()
    results = []
    for record in queryset.values(*fields)[:size]:
        results.append(record)
+   count = len(results)
    return {"count": count, "results": results}

But that is wrong. The count would max out at whatever the size is.

The solution is to try to avoid the potentially unnecessary .count() query.


def get_results(queryset, fields, size):
    count = 0
    results = []
    for i, record in enumerate(queryset.values(*fields)[: size + 1]):
        if i == size:
            # Alas, there are more records than the pagination
            count = queryset.count()
            break
        count = i + 1
        results.append(record)
    return {"count": count, "results": results}

This way, you only incur one database query when there wasn't that much to find, but if there was more than what the pagination called for, you have to incur that extra database query.

How to restore all unstaged files in with git

February 8, 2024
1 comment GitHub, macOS, Linux

tl;dr git restore -- .

I can't believe I didn't know this! Maybe, at one point, I did, but, since forgotten.

You're in a Git repo and you have edited 4 files and run git status and see this:


❯ git status
On branch main
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
    modified:   four.txt
    modified:   one.txt
    modified:   three.txt
    modified:   two.txt

no changes added to commit (use "git add" and/or "git commit -a")

Suppose you realize; "Oh no! I didn't mean to make those changes in three.txt" You can restore that file by mentioning it by name:

Truncated! Read the rest by clicking the link below.

How slow is Node to Brotli decompress a file compared to not having to decompress?

January 19, 2024
3 comments Node, macOS, Linux

tl;dr; Not very slow.

At work, we have some very large .json that get included in a Docker image. The Node server then opens these files at runtime and displays certain data from that. To make the Docker image not too large, we compress these .json files at build-time. We compress the .json files with Brotli to make a .json.br file. Then, in the Node server code, we read them in and decompress them at runtime. It looks something like this:


export function readCompressedJsonFile(xpath) {
  return JSON.parse(brotliDecompressSync(fs.readFileSync(xpath)))
}

The advantage of compressing them first, at build time, which is GitHub Actions, is that the Docker image becomes smaller which is advantageous when shipping that image to a registry and asking Azure App Service to deploy it. But I was wondering, is this a smart trade-off? In a sense, why compromise on runtime (which faces users) to save time and resources at build-time, which is mostly done away from the eyes of users? The question was; how much overhead is it to have to decompress the files after its data has been read from disk to memory?

The benchmark

The files I test with are as follows:


❯ ls -lh pageinfo*
-rw-r--r--  1 peterbe  staff   2.5M Jan 19 08:48 pageinfo-en-ja-es.json
-rw-r--r--  1 peterbe  staff   293K Jan 19 08:48 pageinfo-en-ja-es.json.br
-rw-r--r--  1 peterbe  staff   805K Jan 19 08:48 pageinfo-en.json
-rw-r--r--  1 peterbe  staff   100K Jan 19 08:48 pageinfo-en.json.br

There are 2 groups:

Only English (en)
3 times larger because it has English, Japanese, and Spanish

And for each file, you can see the effect of having compressed them with Brotli.

The smaller JSON file compresses 8x
The larger JSON file compresses 9x

Here's the benchmark code:


import fs from "fs";
import { brotliDecompressSync } from "zlib";
import { Bench } from "tinybench";

const JSON_FILE = "pageinfo-en.json";
const BROTLI_JSON_FILE = "pageinfo-en.json.br";
const LARGE_JSON_FILE = "pageinfo-en-ja-es.json";
const BROTLI_LARGE_JSON_FILE = "pageinfo-en-ja-es.json.br";

function f1() {
  const data = fs.readFileSync(JSON_FILE, "utf8");
  return Object.keys(JSON.parse(data)).length;
}

function f2() {
  const data = brotliDecompressSync(fs.readFileSync(BROTLI_JSON_FILE));
  return Object.keys(JSON.parse(data)).length;
}

function f3() {
  const data = fs.readFileSync(LARGE_JSON_FILE, "utf8");
  return Object.keys(JSON.parse(data)).length;
}

function f4() {
  const data = brotliDecompressSync(fs.readFileSync(BROTLI_LARGE_JSON_FILE));
  return Object.keys(JSON.parse(data)).length;
}

console.assert(f1() === 2633);
console.assert(f2() === 2633);
console.assert(f3() === 7767);
console.assert(f4() === 7767);

const bench = new Bench({ time: 100 });
bench.add("f1", f1).add("f2", f2).add("f3", f3).add("f4", f4);
await bench.warmup(); // make results more reliable, ref: https://github.com/tinylibs/tinybench/pull/50
await bench.run();

console.table(bench.table());

Here's the output from tinybench:

┌─────────┬───────────┬─────────┬────────────────────┬──────────┬─────────┐
│ (index) │ Task Name │ ops/sec │ Average Time (ns)  │  Margin  │ Samples │
├─────────┼───────────┼─────────┼────────────────────┼──────────┼─────────┤
│    0    │   'f1'    │  '179'  │  5563384.55941942  │ '±6.23%' │   18    │
│    1    │   'f2'    │  '150'  │ 6627033.621072769  │ '±7.56%' │   16    │
│    2    │   'f3'    │  '50'   │ 19906517.219543457 │ '±3.61%' │   10    │
│    3    │   'f4'    │  '44'   │ 22339166.87965393  │ '±3.43%' │   10    │
└─────────┴───────────┴─────────┴────────────────────┴──────────┴─────────┘

Note, this benchmark is done on my 2019 Intel MacBook Pro. This disk is not what we get from the Apline Docker image (running inside Azure App Service). To test that would be a different story. But, at least we can test it in Docker locally.

I created a Dockerfile that contains...

ARG NODE_VERSION=20.10.0

FROM node:${NODE_VERSION}-alpine

and run the same benchmark in there by running docker composite up --build. The results are:

┌─────────┬───────────┬─────────┬────────────────────┬──────────┬─────────┐
│ (index) │ Task Name │ ops/sec │ Average Time (ns)  │  Margin  │ Samples │
├─────────┼───────────┼─────────┼────────────────────┼──────────┼─────────┤
│    0    │   'f1'    │  '151'  │ 6602581.124978315  │ '±1.98%' │   16    │
│    1    │   'f2'    │  '112'  │  8890548.4166656   │ '±7.42%' │   12    │
│    2    │   'f3'    │  '44'   │ 22561206.40002191  │ '±1.95%' │   10    │
│    3    │   'f4'    │  '37'   │ 26979896.599974018 │ '±1.07%' │   10    │
└─────────┴───────────┴─────────┴────────────────────┴──────────┴─────────┘

Analysis/Conclusion

First, focussing on the smaller file: Processing the .json is 25% faster than the .json.br file

Then, the larger file: Processing the .json is 16% faster than the .json.br file

So that's what we're paying for a smaller Docker image. Depending on the size of the .json file, your app runs ~20% slower at this operation. But remember, as a file on disk (in the Docker image), it's ~8x smaller.

I think, in conclusion: It's a small price to pay. It's worth doing. Your context depends.
Keep in mind the numbers there to process that 300KB pageinfo-en-ja-es.json.br file, it was able to do that 37 times in one second. That means it took 27 milliseconds to process that file!

The caveats

To repeat, what was mentioned above: This was run in my Intel MacBook Pro. It's likely to behave differently in a real Docker image running inside Azure.

The thing that I wonder the most about is arguably something that actually doesn't matter. 🙃
When you ask it to read in a .json.br file, there's less data to ask from the disk into memory. That's a win. You lose on CPU work but gain on disk I/O. But only the end net result matters so in a sense that's just an "implementation detail".

Admittedly, I don't know if the macOS or the Linux kernel does things with caching the layer between the physical disk and RAM for these files. The benchmark effectively asks "Hey, hard disk, please send me a file called ..." and this could be cached in some layer beyond my knowledge/comprehension. In a real production server, this only happens once because once the whole file is read, decompressed, and parsed, it won't be asked for again. Like, ever. But in a benchmark, perhaps the very first ask of the file is slower and all the other runs are unrealistically faster.

Feel free to clone https://github.com/peterbe/reading-json-files and mess around to run your own tests. Perhaps see what effect async can have. Or perhaps try it with Bun and it's file system API.