Comparing different efforts with WebP in Sharp

October 5, 2023
0 comments Node, JavaScript

When you, in a Node program, use sharp to convert an image buffer to a WebP buffer, you have an option of effort. The higher the number the longer it takes but the image it produces is smaller on disk.

I wanted to put some realistic numbers for this, so I wrote a benchmark, run on my Intel MacbookPro.

The benchmark

It looks like this:


async function e6() {
  return await f("screenshot-1000.png", 6);
}
async function e5() {
  return await f("screenshot-1000.png", 5);
}
async function e4() {
  return await f("screenshot-1000.png", 4);
}
async function e3() {
  return await f("screenshot-1000.png", 3);
}
async function e2() {
  return await f("screenshot-1000.png", 2);
}
async function e1() {
  return await f("screenshot-1000.png", 1);
}
async function e0() {
  return await f("screenshot-1000.png", 0);
}

async function f(fp, effort) {
  const originalBuffer = await fs.readFile(fp);
  const image = sharp(originalBuffer);
  const { width } = await image.metadata();
  const buffer = await image.webp({ effort }).toBuffer();
  return [buffer.length, width, { effort }];
}

Then, I ran each function in serial and measured how long it took. Then, do that whole thing 15 times. So, in total, each function is executed 15 times. The numbers are collected and the median (P50) is reported.

A 2000x2000 pixel PNG image

1. e0: 191ms                   235KB
2. e1: 340.5ms                 208KB
3. e2: 369ms                   198KB
4. e3: 485.5ms                 193KB
5. e4: 587ms                   177KB
6. e5: 695.5ms                 177KB
7. e6: 4811.5ms                142KB

What it means is that if you use {effort: 6} the conversion of a 2000x2000 PNG took 4.8 seconds but the resulting WebP buffer became 142KB instead of the least effort which made it 235 KB.

Comparing effort, time and size

This graph demonstrates how the (blue) time goes up the more effort you put in. And how the final size (red) goes down the more effort you put in.

A 1000x1000 pixel PNG image

1. e0: 54ms                    70KB
2. e1: 60ms                    66KB
3. e2: 65ms                    61KB
4. e3: 96ms                    59KB
5. e4: 169ms                   53KB
6. e5: 193ms                   53KB
7. e6: 1466ms                  51KB

A 500x500 pixel PNG image

1. e0: 24ms                    23KB
2. e1: 26ms                    21KB
3. e2: 28ms                    20KB
4. e3: 37ms                    19KB
5. e4: 57ms                    18KB
6. e5: 66ms                    18KB
7. e6: 556ms                   18KB

Conclusion

Up to you but clearly, {effort: 6} is to be avoided if you're worried about it taking a huge amount of time to make the conversion.

Perhaps the takeaway is; that if you run these operations in the build step such that you don't have to ever do it again, it's worth the maximum effort. Beyond that, find a sweet spot for your particular environment and challenge.

Zipping files is appending by default - Watch out!

October 4, 2023
0 comments Linux

This is not a bug in the age-old zip Linux program. It's maybe a bug in its intuitiveness.

I have a piece of automation that downloads a zip file from a file storage cache (GitHub Actions actions/cache in this case). Then, it unpacks it, and plucks some of the files from it into another fresh new directory. Lastly, it creates a new .zip file with the same name. The same name because that way, when the process is done, it uploads the new .zip file into the file storage cache. But be careful; does it really create a new .zip file?

To demonstrate the surprise:


$ cd /tmp/
$ mkdir somefiles
$ touch somefiles/file1.txt
$ touch somefiles/file2.txt
$ zip -r somefiles.zip somefiles
  adding: somefiles/ (stored 0%)
  adding: somefiles/file1.txt (stored 0%)
  adding: somefiles/file2.txt (stored 0%)

Now we have a somefiles.zip to work with. It has 2 files in it.

Next session. Let's say it's another day and a fresh new /tmp directory and the previous somefiles.txt has been downloaded from the first session. This time we want to create a new somefile directory but in it, only have file2.txt from before and a new file file3.txt.


$ rm -fr somefiles
$ unzip somefiles.zip
Archive:  somefiles.zip
   creating: somefiles/
 extracting: somefiles/file1.txt
 extracting: somefiles/file2.txt
$ rm somefiles/file1.txt
$ touch somefiles/file3.txt
$ zip -r somefiles.zip somefiles
updating: somefiles/ (stored 0%)
updating: somefiles/file2.txt (stored 0%)
  adding: somefiles/file3.txt (stored 0%)

And here comes the surprise, let's peek into the newly zipped up somefiles.txt (which was made from the somefiles/ directory which only contained file2.txt and file3.txt):


$ rm -fr somefiles
$ unzip -l somefiles.zip
Archive:  somefiles.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  2023-10-04 16:06   somefiles/
        0  2023-10-04 16:05   somefiles/file1.txt
        0  2023-10-04 16:06   somefiles/file2.txt
        0  2023-10-04 16:06   somefiles/file3.txt
---------                     -------
        0                     4 files

I did not see that coming! The command zip -r somefiles.zip somefiles/ doesn't create a fresh new .zip file based on recursively walking the somefiles directory. It does an append by default!

The solution is easy. Right before the zip -r somefiles.zip somefiles command, do a rm somefiles.zip.

Introducing hylite - a Node code-syntax-to-HTML highlighter written in Bun

October 3, 2023
0 comments Node, Bun, JavaScript

hylite is a command line tool for syntax highlight code into HTML. You feed it a file or some snippet of code (plus what language it is) and it returns a string of HTML.

Suppose you have:


❯ cat example.py
# This is example.py
def hello():
    return "world"

When you run this through hylite you get:


❯ npx hylite example.py
<span class="hljs-keyword">def</span> <span class="hljs-title function_">hello</span>():
    <span class="hljs-keyword">return</span> <span class="hljs-string">&quot;world&quot;</span>

Now, if installed with the necessary CSS, it can finally render this:


# This is example.py
def hello():
    return "world"

(Note: At the time of writing this, npx hylite --list-css or npx hylite --css don't work unless you've git clone the github.com/peterbe/hylite repo)

How I use it

This originated because I loved how highlight.js works. It supports numerous languages, can even guess the language, is fast as heck, and the HTML output is compact.

Originally, my personal website, whose backend is in Python/Django, was using Pygments to do the syntax highlighting. The problem with that is it doesn't support JSX (or TSX). For example:


export function Bell({ color }: {color: string}) {
  return <div style={{ backgroundColor: color }}>Ding!</div>
}

The problem is that Python != Node so to call out to hylite I use a sub-process. At the moment, I can't use bunx or npx because that depends on $PATH and stuff that the server doesn't have. Here's how I call hylite from Python:


command = settings.HYLITE_COMMAND.split()
assert language
command.extend(["--language", language, "--wrapped"])
process = subprocess.Popen(
    command,
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
    cwd=settings.HYLITE_DIRECTORY,
)
process.stdin.write(code)
output, error = process.communicate()

The settings are:


HYLITE_DIRECTORY = "/home/django/hylite"
HYLITE_COMMAND = "node dist/index.js"

How I built hylite

What's different about hylite compared to other JavaScript packages and CLIs like this is that the development requires Bun. It's lovely because it has a built-in test runner, TypeScript transpiler, and it's just so lovely fast at starting for anything you do with it.

In my current view, I see Bun as an equivalent of TypeScript. It's convenient when developing but once stripped away it's just good old JavaScript and you don't have to worry about compatibility.

So I use bun for manual testing like bun run src/index.ts < foo.go but when it comes time to ship, I run bun run build (which executes, with bun, the src/build.ts) which then builds a dist/index.js file which you can run with either node or bun anywhere.

By the way, the README as a section on Benchmarking. It concludes two things:

  1. node dist/index.js has the same performance as bun run dist/index.js
  2. bunx hylite is 7x times faster than npx hylite but it's bullcrap because bunx doesn't check the network if there's a new version (...until you restart your computer)

Shallow clone vs. deep clone, in Node, with benchmark

September 29, 2023
0 comments Node, JavaScript

A very common way to create a "copy" of an Object in JavaScript is to copy all things from one object into an empty one. Example:


const original = {foo: "Foo"}
const copy = Object.assign({}, original)
copy.foo = "Bar"
console.log([original.foo, copy.foo])

This outputs


[ 'Foo', 'Bar' ]

Obviously the problem with this is that it's a shallow copy, best demonstrated with an example:


const original = { names: ["Peter"] }
const copy = Object.assign({}, original)
copy.names.push("Tucker")
console.log([original.names, copy.names])

This outputs:


[ [ 'Peter', 'Tucker' ], [ 'Peter', 'Tucker' ] ]

which is arguably counter-intuitive. Especially since the variable was named "copy".
Generally, I think Object.assign({}, someThing) is often a red flag because if not today, maybe in some future the thing you're copying might have mutables within.

The "solution" is to use structuredClone which has been available since Node 16. Actually, it was introduced within minor releases of Node 16, so be a little bit careful if you're still on Node 16.

Same example:


const original = { names: ["Peter"] };
// const copy = Object.assign({}, original);
const copy = structuredClone(original);
copy.names.push("Tucker");
console.log([original.names, copy.names]);

This outputs:


[ [ 'Peter' ], [ 'Peter', 'Tucker' ] ]

Another deep copy solution is to turn the object into a string, using JSON.stringify and turn it back into a (deeply copied) object using JSON.parse. It works like structuredClone but full of caveats such as unpredictable precision loss on floating point numbers, and not to mention date objects ceasing to be date objects but instead becoming strings.

Benchmark

Given how much "better" structuredClone is in that it's more intuitive and therefore less dangerous for sneaky nested mutation bugs. Is it fast? Before even running a benchmark; no, structuredClone is slower than Object.assign({}, ...) because of course. It does more! Perhaps the question should be: how much slower is structuredClone? Here's my benchmark code:


import fs from "fs"
import assert from "assert"

import Benchmark from "benchmark"

const obj = JSON.parse(fs.readFileSync("package-lock.json", "utf8"))

function f1() {
  const copy = Object.assign({}, obj)
  copy.name = "else"
  assert(copy.name !== obj.name)
}

function f2() {
  const copy = structuredClone(obj)
  copy.name = "else"
  assert(copy.name !== obj.name)
}

function f3() {
  const copy = JSON.parse(JSON.stringify(obj))
  copy.name = "else"
  assert(copy.name !== obj.name)
}

new Benchmark.Suite()
  .add("f1", f1)
  .add("f2", f2)
  .add("f3", f3)
  .on("cycle", (event) => {
    console.log(String(event.target))
  })
  .on("complete", function () {
    console.log("Fastest is " + this.filter("fastest").map("name"))
  })
  .run()

The results:

❯ node assign-or-clone.js
f1 x 8,057,542 ops/sec ±0.84% (93 runs sampled)
f2 x 37,245 ops/sec ±0.68% (94 runs sampled)
f3 x 37,978 ops/sec ±0.85% (92 runs sampled)
Fastest is f1

In other words, Object.assign({}, ...) is 200 times faster than structuredClone.
By the way, I re-ran the benchmark with a much smaller object (using the package.json instead of the package-lock.json) and then Object.assign({}, ...) is only 20 times faster.

Mind you! They're both ridiculously fast in the grand scheme of things.

If you do this...


for (let i = 0; i < 10; i++) {
  console.time("f1")
  f1()
  console.timeEnd("f1")

  console.time("f2")
  f2()
  console.timeEnd("f2")

  console.time("f3")
  f3()
  console.timeEnd("f3")
}

the last bit of output of that is:

f1: 0.006ms
f2: 0.06ms
f3: 0.053ms

which means that it took 0.06 milliseconds for structuredClone to make a convenient deep copy of an object that is 5KB as a JSON string.

Conclusion

Yes Object.assign({}, ...) is ridiculously faster than structuredClone but structuredClone is a better choice.

Pip-Outdated.py with interactive upgrade

September 21, 2023
0 comments Python

Last year I wrote a nifty script called Pip-Outdated.py "Pip-Outdated.py - a script to compare requirements.in with the output of pip list --outdated". It basically runs pip list --outdated but filters based on the packages mentioned in your requirements.in. For people familiar with Node, it's like checking all installed packages in node_modules if they have upgrades, but filter it down by only those mentioned in your package.json.

I use this script often enough that I added a little interactive input to ask if it should edit requirements.in for you for each possible upgrade. Looks like this:


❯ Pip-Outdated.py
black               INSTALLED: 23.7.0    POSSIBLE: 23.9.1
click               INSTALLED: 8.1.6     POSSIBLE: 8.1.7
elasticsearch-dsl   INSTALLED: 7.4.1     POSSIBLE: 8.9.0
fastapi             INSTALLED: 0.101.0   POSSIBLE: 0.103.1
httpx               INSTALLED: 0.24.1    POSSIBLE: 0.25.0
pytest              INSTALLED: 7.4.0     POSSIBLE: 7.4.2

Update black from 23.7.0 to 23.9.1? [y/N/q] y
Update click from 8.1.6 to 8.1.7? [y/N/q] y
Update elasticsearch-dsl from 7.4.1 to 8.9.0? [y/N/q] n
Update fastapi from 0.101.0 to 0.103.1? [y/N/q] n
Update httpx from 0.24.1 to 0.25.0? [y/N/q] n
Update pytest from 7.4.0 to 7.4.2? [y/N/q] y

and then,


❯ git diff requirements.in | cat
diff --git a/requirements.in b/requirements.in
index b7a246e..0e996e5 100644
--- a/requirements.in
+++ b/requirements.in
@@ -9,7 +9,7 @@ python-decouple==3.8
 fastapi==0.101.0
 uvicorn[standard]==0.23.2
 selectolax==0.3.16
-click==8.1.6
+click==8.1.7
 python-dateutil==2.8.2
 gunicorn==21.2.0
 # I don't think this needs `[secure]` because it's only used by
@@ -18,7 +18,7 @@ requests==2.31.0
 cachetools==5.3.1

 # Dev things
-black==23.7.0
+black==23.9.1
 flake8==6.1.0
-pytest==7.4.0
+pytest==7.4.2
 httpx==0.24.1

That's it. Then if you want to actually make these upgrades you run:


❯ pip-compile --generate-hashes requirements.in && pip install -r requirements.txt

To install it, download the script from: https://gist.github.com/peterbe/a2b158c39f1f835c0977c82befd94cdf
and put it in your ~/bin and make it executable.
Now go into a directory that has a requirements.in and run Pip-Outdated.py

Parse a CSV file with Bun

September 13, 2023
0 comments Bun

I'm really excited about Bun and look forward to trying it out more and more.
Today I needed a quick script to parse a CSV file to compute some simple arithmetic on some numbers in it.

To do that, here's what I did:


bun init
bun install csv-simple-parser
code index.ts

And the code:


import parse from "csv-simple-parser";

console.time("total");
const numbers: number[] = [];
const file = Bun.file(process.argv.slice(2)[0]);
type Rec = {
  Pageviews: string;
};
const csv = parse(await file.text(), { header: true }) as Rec[];
for (const row of csv) {
  numbers.push(parseInt(row["Pageviews"] || "0"));
}
console.timeEnd("total");
console.log("Mean  ", numbers.reduce((a, b) => a + b, 0) / numbers.length);
console.log("Median", numbers.sort()[Math.floor(numbers.length / 2)]);

And running it:

wc -l file.csv
   13623 file.csv

❯ /usr/bin/time bun run index.ts file.csv
[8.20ms] total
Mean   7.205534757395581
Median 1
        0.04 real         0.03 user         0.01 sys

(On my Intel MacBook Pro...) The reading in the file and parsing the 13k lines took 8.2 milliseconds. The whole execution took 0.04 seconds. Pretty neat.

Hello-world server in Bun vs Fastify

September 9, 2023
4 comments Node, JavaScript, Bun

Bun 1.0 just launched and I'm genuinely impressed and intrigued. How long can this madness keep going? I've never built anything substantial with Bun. Just various scripts to get a feel for it.

At work, I recently launched a micro-service that uses Node + Fastify + TypeScript. I'm not going to rewrite it in Bun, but I'm going to get a feel for the difference.

Basic version in Bun

No need for a package.json at this point. And that's neat. Create a src/index.ts and put this in:


const PORT = parseInt(process.env.PORT || "3000");

Bun.serve({
  port: PORT,
  fetch(req) {
    const url = new URL(req.url);
    if (url.pathname === "/") return new Response(`Home page!`);
    if (url.pathname === "/json") return Response.json({ hello: "world" });
    return new Response(`404!`);
  },
});
console.log(`Listening on port ${PORT}`);

What's so cool about the convenience-oriented developer experience of Bun is that it comes with a native way for restarting the server as you're editing the server code:


❯ bun --hot src/index.ts
Listening on port 3000

Let's test it:


❯ xh http://localhost:3000/
HTTP/1.1 200 OK
Content-Length: 10
Content-Type: text/plain;charset=utf-8
Date: Sat, 09 Sep 2023 02:34:29 GMT

Home page!

❯ xh http://localhost:3000/json
HTTP/1.1 200 OK
Content-Length: 17
Content-Type: application/json;charset=utf-8
Date: Sat, 09 Sep 2023 02:34:35 GMT

{
    "hello": "world"
}

Basic version with Node + Fastify + TypeScript

First of all, you'll need to create a package.json to install the dependencies, all of which, at this gentle point are built into Bun:


❯ npm i -D ts-node typescript @types/node nodemon
❯ npm i fastify

And edit the package.json with some scripts:


  "scripts": {
    "dev": "nodemon src/index.ts",
    "start": "ts-node src/index.ts"
  },

And of course, the code itself (src/index.ts):


import fastify from "fastify";

const PORT = parseInt(process.env.PORT || "3000");

const server = fastify();

server.get("/", async () => {
  return "Home page!";
});

server.get("/json", (request, reply) => {
  reply.send({ hello: "world" });
});

server.listen({ port: PORT }, (err, address) => {
  if (err) {
    console.error(err);
    process.exit(1);
  }
  console.log(`Server listening at ${address}`);
});

Now run it:


❯ npm run dev

> fastify-hello-world@1.0.0 dev
> nodemon src/index.ts

[nodemon] 3.0.1
[nodemon] to restart at any time, enter `rs`
[nodemon] watching path(s): *.*
[nodemon] watching extensions: ts,json
[nodemon] starting `ts-node src/index.ts`
Server listening at http://[::1]:3000

Let's test it:


❯ xh http://localhost:3000/
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 10
Content-Type: text/plain; charset=utf-8
Date: Sat, 09 Sep 2023 02:42:46 GMT
Keep-Alive: timeout=72

Home page!

❯ xh http://localhost:3000/json
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 17
Content-Type: application/json; charset=utf-8
Date: Sat, 09 Sep 2023 02:43:08 GMT
Keep-Alive: timeout=72

{
    "hello": "world"
}

For the record, I quite like this little setup. nodemon can automatically understand TypeScript. It's a neat minimum if Node is a desire.

Quick benchmark

Bun

Note that this server has no logging or any I/O.


❯ bun src/index.ts
Listening on port 3000

Using hey to test 10,000 requests across 100 concurrent clients:

❯ hey -n 10000 -c 100 http://localhost:3000/

Summary:
  Total:    0.2746 secs
  Slowest:  0.0167 secs
  Fastest:  0.0002 secs
  Average:  0.0026 secs
  Requests/sec: 36418.8132

  Total data:   100000 bytes
  Size/request: 10 bytes

Node + Fastify


❯ npm run start

Using hey again:

❯ hey -n 10000 -c 100 http://localhost:3000/

Summary:
  Total:    0.6606 secs
  Slowest:  0.0483 secs
  Fastest:  0.0001 secs
  Average:  0.0065 secs
  Requests/sec: 15138.5719

  Total data:   100000 bytes
  Size/request: 10 bytes

About a 2x advantage to Bun.

Serving an HTML file with Bun


Bun.serve({
  port: PORT,
  fetch(req) {
    const url = new URL(req.url);
    if (url.pathname === "/") return new Response(`Home page!`);
    if (url.pathname === "/json") return Response.json({ hello: "world" });
+   if (url.pathname === "/index.html")
+     return new Response(Bun.file("src/index.html"));
    return new Response(`404!`);
  },
});

Serves the src/index.html file just right:


❯ xh --headers http://localhost:3000/index.html
HTTP/1.1 200 OK
Content-Length: 889
Content-Type: text/html;charset=utf-8

Serving an HTML file with Node + Fastify

First, install the plugin:

❯ npm i @fastify/static

And make this change:


+import path from "node:path";
+
 import fastify from "fastify";
+import fastifyStatic from "@fastify/static";

 const PORT = parseInt(process.env.PORT || "3000");

 const server = fastify();

+server.register(fastifyStatic, {
+  root: path.resolve("src"),
+});
+
 server.get("/", async () => {
   return "Home page!";
 });
 server.get("/json", (request, reply) => {
   reply.send({ hello: "world" });
 });

+server.get("/index.html", (request, reply) => {
+  reply.sendFile("index.html");
+});
+
 server.listen({ port: PORT }, (err, address) => {
   if (err) {
     console.error(err);

And it works great:


❯ xh --headers http://localhost:3000/index.html
HTTP/1.1 200 OK
Accept-Ranges: bytes
Cache-Control: public, max-age=0
Connection: keep-alive
Content-Length: 889
Content-Type: text/html; charset=UTF-8
Date: Sat, 09 Sep 2023 03:04:15 GMT
Etag: W/"379-18a77e4e346"
Keep-Alive: timeout=72
Last-Modified: Sat, 09 Sep 2023 03:03:23 GMT

Quick benchmark of serving the HTML file

Bun


❯ hey -n 10000 -c 100 http://localhost:3000/index.html

Summary:
  Total:    0.6408 secs
  Slowest:  0.0160 secs
  Fastest:  0.0001 secs
  Average:  0.0063 secs
  Requests/sec: 15605.9735

  Total data:   8890000 bytes
  Size/request: 889 bytes

Node + Fastify


❯ hey -n 10000 -c 100 http://localhost:3000/index.html

Summary:
  Total:    1.5473 secs
  Slowest:  0.0272 secs
  Fastest:  0.0078 secs
  Average:  0.0154 secs
  Requests/sec: 6462.9597

  Total data:   8890000 bytes
  Size/request: 889 bytes

Again, a 2x performance win for Bun.

Conclusion

There isn't much to conclude here. Just an intro to the beauty of how quick Bun is, both in terms of developer experience and raw performance.
What I admire about Bun being such a convenient bundle is that Python'esque feeling of simplicity and minimalism. (For example python3.11 -m http.server -d src 3000 will make http://localhost:3000/index.html work)

The basic boilerplate of Node with Fastify + TypeScript + nodemon + ts-node is a great one if you're not ready to make the leap to Bun. I would certainly use it again. Fastify might not be the fastest server in the Node ecosystem, but it's good enough.

What's not shown in this little intro blog post, and is perhaps a silly thing to focus on, is the speed with which you type bun --hot src/index.ts and the server is ready to go. It's as far as human perception goes instant. The npm run dev on the other hand has this ~3 second "lag". Not everyone cares about that, but I do. It's more of an ethos. It's that wonderful feeling that you don't pause your thinking.

npm run dev GIF

It's hard to see when I press the Enter key but compare that to Bun:

bun --hot GIF

UPDATE (Sep 11, 2023)

I found this: github.com/SaltyAom/bun-http-framework-benchmark
It's a much better benchmark than mine here. Mind you, as long as you're not using something horribly slow, and you're not doing any I/O the HTTP framework performances don't matter much.

ts-node vs. esrun vs. esno vs. bun

August 28, 2023
0 comments Node, JavaScript

UPDATE (Jan 31, 2024)

Since this was published, I've added tsx to the benchmark. The updated results, if you skip the two slowest are:


Summary
  bun src/index.ts ran
    4.69 ± 0.20 times faster than esrun src/index.ts
    7.07 ± 0.30 times faster than tsx src/index.ts
    7.24 ± 0.33 times faster than esno src/index.ts
    7.40 ± 0.68 times faster than ts-node --transpileOnly src/index.ts

END OF UPDATE

From the totally unscientific bunker research lab of executing TypeScript files on the command line...

I have a very simple TypeScript app that you can run from the command line:


// This is src/index.ts

import { Command } from "commander";
const program = new Command();
program
  .option("-d, --debug", "output extra debugging")
  .option("-s, --small", "small pizza size")
  .option("-p, --pizza-type <type>", "flavour of pizza");

program.parse(process.argv);

const options = program.opts();

console.log("options", options);

tsc

In the original days, there was just tsc which, when given your *.ts would create an equivalent *.js file. Remember this?:


> tsc src/index.ts
> node src/index.js
> rm src/index.js

(note, most likely you'd put "outDir": "./build", in your tsconfig.json so it creates build/index.js instead)

Works. And it checks potential faults in your TypeScript code itself. For example:

❯ tsc src/index.ts
src/index.ts:8:21 - error TS2339: Property 'length' does not exist on type 'Command'.

8 console.log(program.length);
                      ~~~~~~

I don't know about you, but I rarely encounter these kinds of errors. If you view a .ts[x] file you're working on in Zed or VS Code it's already red and has squiggly lines.

VS Code with active TypeScript error

Sure, you'll make sure, one last time in your CI scripts that there are no TypeScript errors like this:

ts-node

ts-node, from that I gather is the "original gangster" of abstractions on top of TypeScript. It works quite similarly to tsc except you don't bother dumping the .js file to disk to then run it with node.

tsc src/index.ts && node src/index.js is the same as ts-node src/index.ts

It also has error checking, by default, when you run it. It can look like this:

❯ ts-node src/index.ts
/Users/peterbe/dev/JAVASCRIPT/esrun-tsnode-esno/node_modules/ts-node/src/index.ts:859
    return new TSError(diagnosticText, diagnosticCodes, diagnostics);
           ^
TSError: ⨯ Unable to compile TypeScript:
src/index.ts:8:21 - error TS2339: Property 'length' does not exist on type 'Command'.

8 console.log(program.length);
                      ~~~~~~

    at createTSError (/Users/peterbe/dev/JAVASCRIPT/esrun-tsnode-esno/node_modules/ts-node/src/index.ts:859:12)
    at reportTSError (/Users/peterbe/dev/JAVASCRIPT/esrun-tsnode-esno/node_modules/ts-node/src/index.ts:863:19)
    at getOutput (/Users/peterbe/dev/JAVASCRIPT/esrun-tsnode-esno/node_modules/ts-node/src/index.ts:1077:36)
    at Object.compile (/Users/peterbe/dev/JAVASCRIPT/esrun-tsnode-esno/node_modules/ts-node/src/index.ts:1433:41)
    at Module.m._compile (/Users/peterbe/dev/JAVASCRIPT/esrun-tsnode-esno/node_modules/ts-node/src/index.ts:1617:30)
    at Module._extensions..js (node:internal/modules/cjs/loader:1310:10)
    at Object.require.extensions.<computed> [as .ts] (/Users/peterbe/dev/JAVASCRIPT/esrun-tsnode-esno/node_modules/ts-node/src/index.ts:1621:12)
    at Module.load (node:internal/modules/cjs/loader:1119:32)
    at Function.Module._load (node:internal/modules/cjs/loader:960:12)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12) {
  diagnosticCodes: [ 2339 ]
}

But, suppose you don't really want those TypeScript errors right now. Suppose you are confident it doesn't error, then you want it to run as fast as possible. That's where ts-node --transpileOnly src/index.ts comes in. It's significantly faster. If you compare ts-node src/index.ts with ts-node --transpileOnly src/index.ts:

❯ hyperfine "ts-node src/index.ts" "ts-node --transpileOnly src/index.ts"
Benchmark 1: ts-node src/index.ts
  Time (mean ± σ):     990.7 ms ±  68.5 ms    [User: 1955.5 ms, System: 124.7 ms]
  Range (min … max):   916.5 ms … 1124.7 ms    10 runs

Benchmark 2: ts-node --transpileOnly src/index.ts
  Time (mean ± σ):     301.5 ms ±  10.6 ms    [User: 286.7 ms, System: 44.4 ms]
  Range (min … max):   283.0 ms … 313.9 ms    10 runs

Summary
  ts-node --transpileOnly src/index.ts ran
    3.29 ± 0.25 times faster than ts-node src/index.ts

In other words, ts-node --transpileOnly src/index.ts is 3 times faster than ts-node src/index.ts

esno and @digitak/esrun

@digitak/esrun and esno are improvements to ts-node, as far as I can understand, are improvements on ts-node that can only run. I.e. you still have to use tsc --noEmit in your CI scripts. But they're supposedly both faster than ts-node --transpileOnly:

❯ hyperfine "ts-node --transpileOnly src/index.ts" "esrun src/index.ts" "esno src/index.ts"
Benchmark 1: ts-node --transpileOnly src/index.ts
  Time (mean ± σ):     291.8 ms ±  10.5 ms    [User: 276.9 ms, System: 43.9 ms]
  Range (min … max):   280.3 ms … 309.1 ms    10 runs

Benchmark 2: esrun src/index.ts
  Time (mean ± σ):     226.4 ms ±   6.0 ms    [User: 187.9 ms, System: 42.8 ms]
  Range (min … max):   216.8 ms … 237.5 ms    13 runs

Benchmark 3: esno src/index.ts
  Time (mean ± σ):     237.2 ms ±   3.9 ms    [User: 222.8 ms, System: 45.2 ms]
  Range (min … max):   229.6 ms … 244.6 ms    12 runs

Summary
  esrun src/index.ts ran
    1.05 ± 0.03 times faster than esno src/index.ts
    1.29 ± 0.06 times faster than ts-node --transpileOnly src/index.ts

In other words, esrun is 1.05e times faster than esno and 1.29 times faster than ts-node --transpileOnly.

But given that I quite like running npm run dev to use ts-node without the --transpileOnly error for realtime TypeScript errors in the console that runs a dev server, I don't know if it's worth it.

(BONUS) bun

If you haven't heard of bun in the Node ecosystem, you've been living under a rock. It's kinda like deno but trying to appeal to regular Node projects from the ground up and it does things like bun install so much faster than npm install that you wonder if it even ran. It too can run in transpile-only mode and just execute the TypeScript code as if it was JavaScript directly. And it's fast!

Because ts-node --transpileOnly is a bit of a "standard", let's compare the two:

❯ hyperfine "ts-node --transpileOnly src/index.ts" "bun src/index.ts"
Benchmark 1: ts-node --transpileOnly src/index.ts
  Time (mean ± σ):     286.9 ms ±   6.9 ms    [User: 274.4 ms, System: 41.6 ms]
  Range (min … max):   272.0 ms … 295.8 ms    10 runs

Benchmark 2: bun src/index.ts
  Time (mean ± σ):      40.3 ms ±   2.0 ms    [User: 29.5 ms, System: 9.9 ms]
  Range (min … max):    36.5 ms …  47.1 ms    60 runs

Summary
  bun src/index.ts ran
    7.12 ± 0.40 times faster than ts-node --transpileOnly src/index.ts

Wow! Given its hype, I'm not surprised bun is 7 times faster than ts-node --transpileOnly.

But admittedly, not all programs work seamlessly in bun like my sample app did this in example.

Here's the complete result comparing all of them:

❯ hyperfine "tsc src/index.ts && node src/index.js" "ts-node src/index.ts" "ts-node --transpileOnly src/index.ts" "esrun src/index.ts" "esno src/index.ts" "bun src/index.ts"
Benchmark 1: tsc src/index.ts && node src/index.js
  Time (mean ± σ):      2.158 s ±  0.097 s    [User: 5.145 s, System: 0.201 s]
  Range (min … max):    2.032 s …  2.276 s    10 runs

Benchmark 2: ts-node src/index.ts
  Time (mean ± σ):     942.0 ms ±  40.6 ms    [User: 1877.2 ms, System: 115.6 ms]
  Range (min … max):   907.4 ms … 1012.4 ms    10 runs

Benchmark 3: ts-node --transpileOnly src/index.ts
  Time (mean ± σ):     307.1 ms ±  14.4 ms    [User: 291.0 ms, System: 45.3 ms]
  Range (min … max):   283.1 ms … 329.0 ms    10 runs

Benchmark 4: esrun src/index.ts
  Time (mean ± σ):     276.4 ms ± 121.0 ms    [User: 198.9 ms, System: 45.7 ms]
  Range (min … max):   212.2 ms … 619.2 ms    10 runs

  Warning: The first benchmarking run for this command was significantly slower than the rest (619.2 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.

Benchmark 5: esno src/index.ts
  Time (mean ± σ):     257.7 ms ±  14.3 ms    [User: 238.3 ms, System: 48.0 ms]
  Range (min … max):   238.8 ms … 282.0 ms    10 runs

Benchmark 6: bun src/index.ts
  Time (mean ± σ):      40.5 ms ±   1.6 ms    [User: 29.9 ms, System: 9.8 ms]
  Range (min … max):    36.4 ms …  44.8 ms    62 runs

Summary
  bun src/index.ts ran
    6.36 ± 0.44 times faster than esno src/index.ts
    6.82 ± 3.00 times faster than esrun src/index.ts
    7.58 ± 0.47 times faster than ts-node --transpileOnly src/index.ts
   23.26 ± 1.38 times faster than ts-node src/index.ts
   53.29 ± 3.23 times faster than tsc src/index.ts && node src/index.js

Bar chart comparing bun to esno, esrun, ts-node and tsc

Conclusion

Perhaps you can ignore bun. It might best fastest, but it's also "weirdest". It usually works great in small and simple apps and especially smaller ones that just you have to maintain (if "maintain" is even a concern at all).

I don't know how to compare them in size. ts-node is built on top of acorn which is written in JavaScript. @digitak/esrun is a wrapper for esbuild (and esno is wrapper for tsx which is also on top of esbuild) which is a fast bundler written in Golang. So it's packaged as a binary in your node_modules which hopefully works between your laptop, your CI, and your Dockerfile but it's nevertheless a binary.

Given that esrun and esno isn't that much faster than ts-node and ts-node can check your TypeScript that's a bonus for ts-node.
But esbuild is an actively maintained project that seems to become stable and accepted.

As always, this was just a quick snapshot of an unrealistic app that is less than 10 lines of TypeScript code. I'd love to hear more about what kind of results people are getting comparing the above tool when you apply it on much larger projects that have more complex tsconfig.json for things like JSX.

Switching from Next.js to Vite + wouter

July 28, 2023
0 comments React, Node, JavaScript

Next.js is a full front-end web framework. Vite is a build tool so they don't easily compare. But if you're building a single-page app ("SPA"), the difference isn't that big, especially if you bolt on a routing library which is something that Next.js has built in.

My SPA is a relatively straight forward one. It's a React app that uses wonderful Mantine UI framework. The app is CRM for real-estate agents that I've been hacking on with my wife. SEO is not a concern because you can't do anything until you've signed in. So server-side rendering is not a requirement. In that sense, it's like loading Gmail. Yes, users might want a speedy first load when they open it in a fresh new browser tab, but the static assets are most likely going to be heavily (browser) cached by the few users it has.

With that out of the way, let's skim through some of the differences.

Build times

Immediately, this is a tricky one to compare because Next.js has the ability to cache. You get that .next/cache/ directory which is black magic to me, but it clearly speeds things up. And it's incremental so the caching can help partially when only some of the code has changed.

Running, npm run build && npm run export a couple of times yields:

Next.js

Without no .next/cache/ directory

Total time to run npm run build && npm run export: 52 seconds

With the .next/cache/ left before each build

Total time to run npm run build && npm run export: 30 seconds

Vite

Total time to run npm run build: 12 seconds

A curious thing about Vite here is that its output contains a measurement of the time it took. But I ignored that and used /usr/bin/time -h ... instead. This gives me the total time.
I.e. the output of npm run build will say:

✓ built in 7.67s

...but it actually took 12.2 seconds with /usr/bin/time.

Build artifacts

Perhaps not very important because Next.js automatically code splits in its wonderfully clever way.

Next.js

❯ du -sh out
1.8M    out
❯ tree out | rg '\.js|\.css' | wc -l
      52

Vite

❯ du -sh dist
960K    dist

and

❯ tree dist/assets
dist/assets
├── index-1636ae43.css
└── index-d568dfbf.js

Again, it's probably unfair to compare at this point. Most of the weight of these static assets (particularly the .js files) is due to Mantine components being so heavy.

Routing

This isn't really a judgment in any way. More of a record how it differs in functionality.

Next.js

In my app, that I'm switching from Next.js to Vite + wouter, I use the old way of using Next.js which is to use a src/pages/* directory. For example, to make a route to the /account/settings page I first create:


// src/pages/account/settings.tsx

import { Settings } from "../../components/account/settings"

const Page = () => {
  return <Settings />
}
export default Page

I'm glad I built it this way in the first place. When I now port to Vite + wouter, I don't really have to touch that src/components/account/settings.tsx code because that component kinda assumes it's been invoked by some routing.

Vite + wouter

First I installed the router in the src/App.tsx. Abbreviated code:


// src/App.tsx

import { Routes } from "./routes"

export default function App() {
  const { myTheme, colorScheme, toggleColorScheme } = useMyTheme()
  return (
    <ColorSchemeProvider
      colorScheme={colorScheme}
      toggleColorScheme={toggleColorScheme}
    >
      <MantineProvider withGlobalStyles withNormalizeCSS theme={myTheme}>
        <Routes />
      </MantineProvider>
    </ColorSchemeProvider>
  )
}

By the way, the code for Next.js looks very similar in its src/pages/_app.tsx with all those contexts that Mantine make you wrap things in.

And here's the magic routing:


// src/routes.tsx

import { Router, Switch, Route } from "outer"

import { Home } from "./components/home"
import { Authenticate } from "./components/authenticate"
import { Settings } from "./components/account/settings"
import { Custom404 } from "./components/404"

export function Routes() {
  return (
    <Router>
      <Switch>
        <Route path="/signin" component={Authenticate} />
        <Route path="/account/settings" component={Settings} />
        {/* many more lines like this ... */}

        <Route path="/" component={Home} />

        <Route>
          <Custom404 />
        </Route>
      </Switch>
    </Router>
  )
}

Redirecting with router

This is a made-up example, but it demonstrates the pattern with wouter compared to Next.js

Next.js


const { push } = useRouter()

useEffect(() => {
  if (user) {
    push('/signedin')
  }
}, [user])

wouter


const [, setLocation] = useLocation()

useEffect(() => {
  if (user) {
    setLocation('/signedin')
  }
}, [user])

Linking

Next.js


import Link from 'next/link'

// ...

<Link href="/settings" passHref>
  <Anchor>Settings</Anchor>
</Link>

wouter


import { Link } from "wouter"

// ...

<Link href="/settings">
  <Anchor>Settings</Anchor>
</Link>

Getting a query string value

Next.js


import { useRouter } from "next/router"

// ...

const { query } = useRouter()

if (query.name) {
  const name = Array.isArray(query.name) ? query.name[0] : query.name
  // ...
}

wouter


import { useSearch } from "wouter/use-location"

// ...

const search = useSearch()
const searchParams = new URLSearchParams(search)

if (searchParams.get('name')) {
  const name = searchParams.get('name')
  // ...
}

Conclusion

The best thing about Next.js is its momentum. It gets lots of eyes on it. Lots of support opportunities and great chance of its libraries being maintained well into the future. Vite also has great momentum and adaptation. But wouter is less "common".

Comparing apples and oranges is often counter-productive if you don't take all constraints and angles into account and those are usually quite specific. In my case, I just want to build a single-page app. I don't want a Node server. In fact, my particular app is a Python backend that does all the API responses from a fetch in the JavaScript app. That Python app also serves the built static files, including the dist/index.html file. That's how my app can serve the app straight away if the current URL is something like /account/settings. A piece of Python code (more or less the only code that doesn't serve /api/* URLs) collapses all initial serving URLs to serve the dist/index.html file. It's a classic pattern and honestly feels a bit dated in 2023. But it works. And what's so great about all of this is that I have a multi-stage Dockerfile that first does the npm run build (and some COPY --from=frontend /home/node/app/dist ./server/out) and now I can "lump" together the API backend and the front-end code in just 1 server (which I host on Digital Ocean).

If you had to write a SPA in 2023 what would you use? In particular, if it has to be React. Remix is all about server-side rendering. Create-react-app is completely unsupported. Building it from scratch yourself rolling your own TypeScript + Eslint + Rollup/esbuild/Parcel/Webpack does not feel productive unless you have enough time and energy to really get it all right.

In terms of comparing the performance between Next.js and Vite + wouter, the time it takes to build the whole app is actually not that big a deal. It's a rare thing to do. It's something I do after a long coding/debugging session. What's more pressing is how npm run dev works.
With Vite, I type npm run dev and hit Enter. Faster than I can almost notice, after hitting Enter I see...

VITE v4.4.6  ready in 240 ms

  ➜  Local:   http://localhost:3000/
  ➜  Network: use --host to expose
  ➜  press h to show help

and I'm ready to open http://localhost:3000/ to play. With Next.js, after having typed npm run dev and Enter, there's this slight but annoying delay before it's ready.

Benchmark comparison of Elasticsearch highlighters

July 5, 2023
0 comments Elasticsearch

tl;dr; fvh is marginally faster than unified and unified is a bit faster than plain.

When you send a full-text search query to Elasticsearch, you can specify (if and) how it should highlight, with HTML tags, highlights. E.g.


The correct way to index data into <mark>Elasticsearch</mark> with (Python) <mark>elasticsearch</mark>-dsl

Among other configuration options, you can pick one of 3 different highlighter algorithms:

  1. unified (default)
  2. plain
  3. fvh

The last one, fvh, requires that you index more at index-time (in particular to add term_vector="with_positions_offsets" to the mapping). In a previous benchmark I did, the total document size on disk, as described by http://localhost:9200/_cat/indices?v grew by 38%.

I bombarded my local Elasticsearch 7.7 instance with thousands of queries collected from logs. Some single-word, some multi-word. The fields it highlights are things like title (~5-50 words) and body (~100-2,000 words).
Basically, I edited the search query by testing one at a time. For example:


search_query = search_query.highlight(
-   "title", fragment_size=120, number_of_fragments=1, type="unified"
+   "title", fragment_size=120, number_of_fragments=1, type="plain"
)

...etc.

After doing 1,000 searches 3 different times per each highlighter type option, and recording the times it took I recorded the following:

(milliseconds per query, lower is better)

UNIFIED:
  MEAN  18.1ms
  MEDIAN 19.0ms

PLAIN:
  MEAN  24.5ms
  MEDIAN 27.5ms

FVH:
  MEAN  16.1ms
  MEDIAN 17.6ms

Thin marginal win for fvh over unified.

Conclusion

Conclusion? Or should I say "Caveats" instead? There's a lot more to it than raw performance speed. In this benchmark, it takes ~20 milliseconds to search on 2 different indexes, each with a scoring function and indexes containing between 1,000 and 5,000 documents with hundreds of thousands of words. So it's pretty minor.

Each highlighter performs slightly differently too, so you'd have to study the outcome a bit more carefully to get a better feel for if it works the way you and your team prefer it to work.

If there's any conclusion, other than the boring usual "it depends on your setup and preferences", the performance difference is noticeable but not blowing you away. It makes sense that fvh is a bit faster because you've paid for it by indexing more upfront (the offsets) at the expense of memory.