Peterbe.com

A blog and website by Peter Bengtsson


What English stop words overlap with JavaScript reserved keywords?

07 May 2021 0 comments   JavaScript, MDN


The list of stop words in Elasticsearch is:

a, an, and, are, as, at, be, but, by, for, if, in, into, 
is, it, no, not, of, on, or, such, that, the, their, 
then, there, these, they, this, to, was, will, with

The list of JavaScript reserved keywords is:

abstract, arguments, await, boolean, break, byte, case, 
catch, char, class, const, continue, debugger, default, 
delete, do, double, else, enum, eval, export, extends, 
false, final, finally, float, for, function, goto, if, 
implements, import, in, instanceof, int, interface, let, 
long, native, new, null, package, private, protected, 
public, return, short, static, super, switch, synchronized, 
this, throw, throws, transient, true, try, typeof, var, 
void, volatile, while, with, yield

That means that the overlap is:

for, if, in, this, with

And the remainder of the English stop words is:

a, an, and, are, as, at, be, but, by, into, is, it, no, 
not, of, on, or, such, that, the, their, then, there, 
these, they, to, was, will
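
The overlap and the remainder above can be double-checked with a few lines of JavaScript. This is only a sketch: the two arrays are copied from the lists above, and the variable names are made up for illustration.

```javascript
// Both lists are copied verbatim from the post.
const stopWords = [
  "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in",
  "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the",
  "their", "then", "there", "these", "they", "this", "to", "was", "will", "with",
];
const reservedKeywords = [
  "abstract", "arguments", "await", "boolean", "break", "byte", "case",
  "catch", "char", "class", "const", "continue", "debugger", "default",
  "delete", "do", "double", "else", "enum", "eval", "export", "extends",
  "false", "final", "finally", "float", "for", "function", "goto", "if",
  "implements", "import", "in", "instanceof", "int", "interface", "let",
  "long", "native", "new", "null", "package", "private", "protected",
  "public", "return", "short", "static", "super", "switch", "synchronized",
  "this", "throw", "throws", "transient", "true", "try", "typeof", "var",
  "void", "volatile", "while", "with", "yield",
];

// A Set makes the membership test cheap and the intent obvious.
const reserved = new Set(reservedKeywords);
const overlap = stopWords.filter((word) => reserved.has(word));
const remainder = stopWords.filter((word) => !reserved.has(word));

console.log("overlap:", overlap.join(", "));
console.log("remainder:", remainder.join(", "));
```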

Why does this matter? It matters when you're writing a search engine for English text that is about JavaScript, such as MDN Web Docs. At the time of writing, you can search for "this" because there's a special case explicitly for that word. But you can't search for "for", which is unfortunate.

But there's more! I think certain prototype words should be considered "reserved" too, because they are important JavaScript words that should not be treated as stop words. For example...

How to simulate slow lazy chunk-loading in React

25 March 2021 0 comments   React, JavaScript


Suppose you have one of those React apps that lazy-load some chunk. That basically means it injects a .js static asset URL into the DOM and, once the browser has downloaded it, carries on the React rendering with the new code loaded. Well, what if the network is really slow? In local development, it can be hard to simulate this. You can mess with the browser's DevTools to try to slow down the network, but even that can be too fast sometimes.

What I often do is, I take this:

const SettingsApp = React.lazy(() => import("./app"));

...and change it to this:

const SettingsApp = React.lazy(() =>
  import("./app").then((module) => {
    return new Promise((resolve) => {
      setTimeout(() => {
        resolve(module as any);
      }, 10000);
    });
  })
);

Now, React won't get hold of that JS chunk until 10 seconds after the browser has downloaded it. Only temporarily, in local development.

Admittedly, it's just a hack, but it's nifty. Just don't forget to undo it when you're done simulating your snail-speed web app.

PS. That resolve(module as any); is for TypeScript. You can just change that to resolve(module); if it's regular JavaScript.
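
If you do this often, the same trick can be wrapped in a small helper. The following is a minimal sketch in plain JavaScript. The name delayed is hypothetical (it is not part of React), and the demonstration uses a fake loader instead of a real chunk so that it runs on its own in Node.

```javascript
// Hypothetical helper: wraps any function that returns a promise, such as
// () => import("./app"), and holds back its result for `ms` milliseconds.
function delayed(loader, ms) {
  return () =>
    loader().then(
      (module) =>
        new Promise((resolve) => {
          setTimeout(() => resolve(module), ms);
        })
    );
}

// In a React app the usage would look like this (assumes React and ./app):
// const SettingsApp = React.lazy(delayed(() => import("./app"), 10000));

// Stand-alone demonstration with a fake loader instead of a real chunk.
const fakeLoader = () => Promise.resolve({ default: "SettingsApp" });
const started = Date.now();
const demo = delayed(fakeLoader, 200)().then((module) => {
  const elapsed = Date.now() - started;
  console.log(`loaded ${module.default} after ${elapsed}ms`);
  return { module, elapsed };
});
```

Because the delay lives in one place, undoing the hack is a matter of removing the delayed(...) wrapper around the import.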

In JavaScript (Node) which is fastest, generator function or a big array function?

05 March 2021 0 comments   Node, JavaScript


Sorry about the weird title of this blog post. Not sure what else to call it.

I have a function that recursively traverses the file system. You can iterate over this function to do something with each found file on disk. Silly example:

for (const filePath of walker("/lots/of/files/here")) {
  count += filePath.length;
}

The implementation looks like this:

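const fs = require("fs");
const path = require("path");

// Yields every file path under root, one at a time, depth first.
function* walker(root) {
  const files = fs.readdirSync(root);
  for (const name of files) {
    const filepath = path.join(root, name);
    const isDirectory = fs.statSync(filepath).isDirectory();
    if (isDirectory) {
      // Delegate to the nested generator for the subdirectory.
      yield* walker(filepath);
    } else {
      yield filepath;
    }
  }
}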

But I wondered: is it faster to not use a generator function, since there might be an overhead in swapping from the generator to whatever callback does something with each yielded thing? A pure big-array function looks like this:

function walker(root) {
  const files = fs.readdirSync(root);
  const all = [];
  for (const name of files) {
    const filepath = path.join(root, name);
    const isDirectory = fs.statSync(filepath).isDirectory();
    if (isDirectory) {
      all.push(...walker(filepath));
    } else {
      all.push(filepath);
    }
  }
  return all;
}

It gets the same result/outcome.
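
One way to convince yourself of that is to run both versions against the same small directory tree and compare the output. This is just a sketch: the two functions are renamed walkerGenerator and walkerArray so they can live in one file, and the throwaway tree in the temp directory is made up for the demonstration.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Generator version, same logic as in the post.
function* walkerGenerator(root) {
  for (const name of fs.readdirSync(root)) {
    const filepath = path.join(root, name);
    if (fs.statSync(filepath).isDirectory()) {
      yield* walkerGenerator(filepath);
    } else {
      yield filepath;
    }
  }
}

// Big-array version, same logic as in the post.
function walkerArray(root) {
  const all = [];
  for (const name of fs.readdirSync(root)) {
    const filepath = path.join(root, name);
    if (fs.statSync(filepath).isDirectory()) {
      all.push(...walkerArray(filepath));
    } else {
      all.push(filepath);
    }
  }
  return all;
}

// Build a small throwaway tree: one.txt, a/two.txt, a/b/three.txt
const root = fs.mkdtempSync(path.join(os.tmpdir(), "walker-"));
fs.mkdirSync(path.join(root, "a", "b"), { recursive: true });
for (const relative of ["one.txt", "a/two.txt", "a/b/three.txt"]) {
  fs.writeFileSync(path.join(root, relative), "x");
}

const fromGenerator = [...walkerGenerator(root)].sort();
const fromArray = walkerArray(root).sort();
console.log(fromGenerator.length, fromArray.length);

// Clean up the throwaway tree (fs.rmSync needs Node 14.14 or later).
fs.rmSync(root, { recursive: true, force: true });
```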

It's hard to measure this but I pointed it to some large directory with many files and did something silly with each one just to make sure it does something:

const label = "generator";
console.time(label);
let count = 0;
for (const filePath of walker(SEARCH_ROOT)) {
  count += filePath.length;
}
console.timeEnd(label);
const heapBytes = process.memoryUsage().heapUsed;
console.log(`HEAP: ${(heapBytes / 1024.0).toFixed(1)}KB`);

I ran it a bunch of times. After a while, the numbers settle and you get:

In other words, no speed difference.

Obviously building up a massive array in memory will increase the heap memory usage. Taking a snapshot at the end of the run and printing it each time, you can see that...

Conclusion

The potential swap overhead for a Node generator function is absolutely minuscule. At least in contexts similar to mine.

It's not unexpected that the generator function uses less heap memory because it doesn't build up a big array at all.

What's lighter than ExpressJS?

25 February 2021 0 comments   Node, JavaScript


tl;dr; polka is the lightest Node HTTP server package.

Highly unscientific but nevertheless worth writing down. Lightest here refers to the eventual weight added to the node_modules directory which is a reflection of network and disk use.

When you write a serious web server in Node you probably don't care about which one is lightest. It's probably more important which ones are actively maintained, reliable, well documented, and generally "more familiar". However, I was interested in setting up a little Node HTTP server for the benefit of wrapping some HTTP endpoints for an integration test suite.

The test

In a fresh new directory, right after having run yarn init -y, run the yarn add ... for the package and see how big the node_modules directory becomes afterward (du -sh node_modules).
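
If you'd rather measure from Node than with du, a rough equivalent can be sketched like this. The function name directorySize is made up, and it sums the apparent file sizes, whereas du counts disk blocks, so expect the numbers to differ somewhat from the ones below.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: recursively sums the byte size of every file in dir.
// Symlinks are not followed; their own (small) size is counted instead.
function directorySize(dir) {
  let total = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      total += directorySize(full);
    } else {
      total += fs.lstatSync(full).size;
    }
  }
  return total;
}

// Demonstration on a throwaway directory: 1000 + 24 bytes.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "size-"));
fs.mkdirSync(path.join(root, "nested"));
fs.writeFileSync(path.join(root, "big.js"), "x".repeat(1000));
fs.writeFileSync(path.join(root, "nested", "small.js"), "y".repeat(24));

const size = directorySize(root);
console.log(`${(size / 1024).toFixed(1)}K`);

fs.rmSync(root, { recursive: true, force: true });
```

For the real test you would call directorySize("node_modules") after each yarn add.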

The results

  1. polka: 116K
  2. koa: 1.7M
  3. express: 2.4M
  4. fastify: 8.0M

bar chart

Conclusion

polka is the lightest. But I'm not so sure it matters. But it could if this has to be installed a lot. For example, in CI where you run that yarn install a lot. Then it might save quite a bit of electricity for the planet.

The best and simplest way to parse an RSS feed in Node

13 February 2021 0 comments   Node, JavaScript


There are a lot of 'rss' related NPM packages but I think I've found a combination that is great for parsing RSS feeds. Something that takes up minimal space in node_modules and works great. I think the killer combination is got and fast-xml-parser.

The code is impressively simple:

const got = require("got");
const parser = require("fast-xml-parser");

(async function main() {
  const buffer = await got("https://hacks.mozilla.org/feed/", {
    responseType: "buffer",
    resolveBodyOnly: true,
    timeout: 5000,
    retry: 5,
  });
  const feed = parser.parse(buffer.toString());
  for (const item of feed.rss.channel.item) {
    console.log({ title: item.title, url: item.link });
    break;
  }
})();


// Outputs...
// {
//   title: 'MDN localization update, February 2021',
//   url: 'https://hacks.mozilla.org/2021/02/mdn-localization-update-february-2021/'
// }

What I like about fast-xml-parser is that it has no dependencies. And it's tiny:

▶ du -sh node_modules/fast-xml-parser
104K    node_modules/fast-xml-parser

The got package is quite a bit larger and has more dependencies. But I still love it. It's proven itself to be very reliable and it has a very pleasant API. Both packages support TypeScript too.

A particular detail I like about fast-xml-parser is that it doesn't try to do the downloading part too. This way, I can use my own preferred library and I could potentially write my own caching code if I want to protect against a flaky network.
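
Here is a sketch of what such caching code could look like. Everything in it is an assumption for illustration: withCache is a made-up name, the cache is a plain in-memory Map, and a fake fetcher stands in for got so the example runs without a network.

```javascript
// Hypothetical wrapper: remembers the last successful body per URL and
// falls back to it if a later download fails.
function withCache(fetcher, ttlMs) {
  const cache = new Map();
  return async function cached(url) {
    const hit = cache.get(url);
    // Fresh enough: skip the network entirely.
    if (hit && Date.now() - hit.time < ttlMs) return hit.body;
    try {
      const body = await fetcher(url);
      cache.set(url, { body, time: Date.now() });
      return body;
    } catch (error) {
      // Flaky network: serve the stale copy if there is one.
      if (hit) return hit.body;
      throw error;
    }
  };
}

// Fake fetcher standing in for got: works once, then the "network" is down.
let calls = 0;
const flakyFetcher = async (url) => {
  calls += 1;
  if (calls > 1) throw new Error("network down");
  return `<rss><channel><title>${url}</title></channel></rss>`;
};

// A TTL of 0 forces a refetch every time, which exercises the fallback.
const cachedFetch = withCache(flakyFetcher, 0);
const demo = (async () => {
  const first = await cachedFetch("https://example.com/feed/");
  const second = await cachedFetch("https://example.com/feed/");
  console.log(first === second, calls);
  return { first, second, calls };
})();
```

The returned string can then be handed to the XML parser exactly as in the code above, since the parser never needs to know where the text came from.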

Sneaky block-scoping variables in JavaScript that eslint can't even detect

03 February 2021 0 comments   JavaScript


What do you think this code will print out?

function validateURL(url) {
  if (url.includes("://")) {
    const url = new URL(url);
    return url.protocol === "https:";
  } else {
    return "dunno";
  }
}
console.log(validateURL("http://www.peterbe.com"));

I'll give you a clue that isn't helpful,

▶ eslint --version
v7.19.0

▶ eslint code.js

▶ echo $?
0

OK, the answer is that it crashes:

▶ node code.js
/Users/peterbe/dev/JAVASCRIPT/catching_consts/code.js:3
    const url = new URL(url);
                        ^

ReferenceError: Cannot access 'url' before initialization
    at validateURL (/Users/peterbe/dev/JAVASCRIPT/catching_consts/code.js:3:25)
    at Object.<anonymous> (/Users/peterbe/dev/JAVASCRIPT/catching_consts/code.js:9:13)
...

▶ node --version
v15.2.1

It's an honest and easy mistake to make. If the code was this:

function validateURL(url) {
  const url = new URL(url);
  return url.protocol === "https:";
}
// console.log(validateURL("http://www.peterbe.com"));

you'd get this error:

▶ node code2.js
/Users/peterbe/dev/JAVASCRIPT/catching_consts/code2.js:2
  const url = new URL(url);
        ^

SyntaxError: Identifier 'url' has already been declared

which means node refuses to even start it. But it can't do that with the original code, because there the const lives in its own block scope, where shadowing the parameter is legal, so the mistake only surfaces at runtime.
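
To see the runtime behavior without crashing the whole script, the call can be wrapped in a try/catch. This is a small sketch. The helper describeFailure is made up for the demonstration, and the exact error message text depends on the JavaScript engine.

```javascript
function validateURL(url) {
  // Here `url` is still the function parameter.
  if (url.includes("://")) {
    // Inside this block, `url` refers to the const below for the whole
    // block, so the right-hand side reads it before it is initialized.
    const url = new URL(url);
    return url.protocol === "https:";
  } else {
    return "dunno";
  }
}

// Hypothetical helper: runs fn(arg) and reports either the result or the error.
function describeFailure(fn, arg) {
  try {
    return { ok: true, value: fn(arg) };
  } catch (error) {
    return { ok: false, name: error.name, message: error.message };
  }
}

const withScheme = describeFailure(validateURL, "http://www.peterbe.com");
const withoutScheme = describeFailure(validateURL, "www.peterbe.com");
console.log(withScheme);
console.log(withoutScheme);
```

Note that the second call works fine, which is what makes this kind of bug easy to miss: only inputs that reach the inner block trigger the error.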

Easiest solution

function validateURL(url) {
  if (url.includes("://")) {
-   const url = new URL(url);
+   const parsedURL = new URL(url);
-   return url.protocol === "https:";
+   return parsedURL.protocol === "https:";
  } else {
    return "dunno";
  }
}
console.log(validateURL("http://www.peterbe.com"));

Best solution

Switch to TypeScript.

▶ cat code.ts
function validateURL(url: string) {
  if (url.includes('://')) {
    const url = new URL(url);
    return url.protocol === 'https:';
  } else {
    return "dunno";
  }
}
console.log(validateURL('http://www.peterbe.com'));

▶ tsc --noEmit --lib es6,dom code.ts
code.ts:3:25 - error TS2448: Block-scoped variable 'url' used before its declaration.

3     const url = new URL(url);
                          ~~~

  code.ts:3:11
    3     const url = new URL(url);
                ~~~
    'url' is declared here.


Found 1 error.