findMatchesInText - Find line and column of matches in a text, in JavaScript
June 22, 2020
0 comments Node, JavaScript
I needed this function for open-editor, which is a Node program that can open your $EDITOR from Node and jump to a specific file, line, and column.
Here's the code:
function* findMatchesInText(needle, haystack, { inQuotes = false } = {}) {
  const escaped = needle.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  let rex;
  if (inQuotes) {
    rex = new RegExp(`['"](${escaped})['"]`, "g");
  } else {
    rex = new RegExp(`(${escaped})`, "g");
  }
  for (const match of haystack.matchAll(rex)) {
    const left = haystack.slice(0, match.index);
    const line = (left.match(/\n/g) || []).length + 1;
    const lastIndexOf = left.lastIndexOf("\n") + 1;
    const column = match.index - lastIndexOf + 1;
    yield { line, column };
  }
}
And you use it like this:
const text = ` bravo
Abra
cadabra

bravo
`;
console.log(Array.from(findMatchesInText("bra", text)));
Which prints:
[
  { line: 1, column: 2 },
  { line: 2, column: 2 },
  { line: 3, column: 5 },
  { line: 5, column: 1 }
]
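Since the point is to hand these positions to open-editor, here is a minimal sketch of turning each match into a file:line:column string. The helper name toEditorTargets and the file name notes.txt are made up for illustration, and the file:line:column string form is my reading of the open-editor README, so check it against the version you install.

```javascript
// Hypothetical helper (not part of open-editor): turn { line, column }
// matches into "file:line:column" strings.
function toEditorTargets(file, matches) {
  return matches.map(({ line, column }) => `${file}:${line}:${column}`);
}

const targets = toEditorTargets("notes.txt", [
  { line: 1, column: 2 },
  { line: 3, column: 5 },
]);
console.log(targets);
// [ 'notes.txt:1:2', 'notes.txt:3:5' ]

// Assumed usage, after `yarn add open-editor`:
// openEditor(targets);
```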
The inQuotes option is there because, a lot of the time, this function is going to be used for finding the href value in unstructured documents that contain HTML <a> tags.
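To make the inQuotes option concrete, here is a small sketch with a made-up HTML fragment. The generator is repeated (slightly condensed) so the snippet runs on its own. Note that with inQuotes the reported column is that of the opening quote, because that is where the regex match starts.

```javascript
// Same generator as above, repeated so this snippet is self-contained.
function* findMatchesInText(needle, haystack, { inQuotes = false } = {}) {
  const escaped = needle.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const rex = inQuotes
    ? new RegExp(`['"](${escaped})['"]`, "g")
    : new RegExp(`(${escaped})`, "g");
  for (const match of haystack.matchAll(rex)) {
    const left = haystack.slice(0, match.index);
    const line = (left.match(/\n/g) || []).length + 1;
    const column = match.index - (left.lastIndexOf("\n") + 1) + 1;
    yield { line, column };
  }
}

// Made-up sample: "/docs" appears once as plain text and once as an href.
const html = `<p>See /docs for more</p>
<a href="/docs">Docs</a>
`;

const everywhere = Array.from(findMatchesInText("/docs", html));
const quotedOnly = Array.from(
  findMatchesInText("/docs", html, { inQuotes: true })
);

console.log(everywhere); // [ { line: 1, column: 8 }, { line: 2, column: 10 } ]
console.log(quotedOnly); // [ { line: 2, column: 9 } ]
```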
Benchmark compare Highlight.js vs. Prism
May 19, 2020
0 comments Node, JavaScript
tl;dr; I wanted to see which is faster in Node, Highlight.js or Prism. The result: they're both plenty fast, but Prism is 9% faster.
The context is all the thousands of little snippets of CSS, HTML, and JavaScript code on MDN.
I first wrote a script that stored almost 9,000 snippets of code. About 60% are JavaScript, 22% are CSS, and the rest is HTML.
The mean snippet size was 400 bytes and the median 300 bytes. All ASCII.
Then I wrote three functions:
f1 - opens the snippet, extracts the payload, and saves it in a different place. This measures the baseline for how long the disk I/O read and the disk I/O write takes.
f2 - same as f1 but uses const html = Prism.highlight(payload, Prism.languages[name], name); before saving.
f3 - same as f1 but uses const html = hljs.highlight(name, payload).value; before saving.
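The shape of those three functions can be sketched as follows. This is not the actual benchmark code (that is linked in the next section), only an illustration under assumptions: snippets are plain files in one directory, makeRunner and timeIt are made-up names, and a trivial wrapper stands in for the real Prism or Highlight.js call.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Build a runner that copies every snippet from srcDir to destDir,
// passing the payload through `highlight` first. With the default
// identity function this corresponds to the f1 baseline (disk I/O only).
function makeRunner(highlight = (payload) => payload) {
  return function run(srcDir, destDir) {
    for (const name of fs.readdirSync(srcDir)) {
      const payload = fs.readFileSync(path.join(srcDir, name), "utf8");
      fs.writeFileSync(path.join(destDir, name), highlight(payload));
    }
  };
}

// Time one synchronous call and return the elapsed seconds.
function timeIt(fn, ...args) {
  const start = process.hrtime.bigint();
  fn(...args);
  return Number(process.hrtime.bigint() - start) / 1e9;
}

// Tiny made-up corpus so the sketch runs anywhere.
const srcDir = fs.mkdtempSync(path.join(os.tmpdir(), "snippets-"));
const destDir = fs.mkdtempSync(path.join(os.tmpdir(), "highlighted-"));
fs.writeFileSync(path.join(srcDir, "a.css"), ".item::after { content: 'x'; }");
fs.writeFileSync(path.join(srcDir, "b.js"), "console.log('hi');");

const f1 = makeRunner();
// Stand-in highlighter; f2 and f3 would call Prism or Highlight.js here.
const fake = makeRunner((payload) => `<span>${payload}</span>`);

console.log("f1  ", timeIt(f1, srcDir, destDir).toFixed(4) + "s");
console.log("fake", timeIt(fake, srcDir, destDir).toFixed(4) + "s");
```

Keeping the I/O identical across runners is what makes the f1 number usable as a baseline to subtract.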
The experiment
You can see the hacky benchmark code here: https://github.com/peterbe/syntax-highlight-node-benchmark/blob/master/index.js
Results
The results are (after running each function 12 times):
f1 0.947s fastest
f2 1.361s 43.6% slower
f3 1.494s 57.7% slower
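For anyone wondering how the "9% faster" in the tl;dr relates to these numbers, here is the arithmetic as a sketch. It uses only the rounded seconds shown above, so the first two figures land 0.1 away from the published percentages, which were presumably computed from unrounded timings (that is my assumption).

```javascript
// Seconds copied from the results above.
const seconds = { f1: 0.947, f2: 1.361, f3: 1.494 };

// How much slower `value` is than `reference`, as a percentage.
function percentSlower(value, reference) {
  return ((value - reference) / reference) * 100;
}

// How much less time `value` takes than `reference`, as a percentage.
function percentFaster(value, reference) {
  return ((reference - value) / reference) * 100;
}

const prismVsBaseline = percentSlower(seconds.f2, seconds.f1).toFixed(1);
const hljsVsBaseline = percentSlower(seconds.f3, seconds.f1).toFixed(1);
const prismVsHljs = percentFaster(seconds.f2, seconds.f3).toFixed(1);

console.log(prismVsBaseline); // 43.7
console.log(hljsVsBaseline); // 57.8
console.log(prismVsHljs); // 8.9
```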
Memory
In terms of memory usage, Prism maxes heap memory at 60MB (the f1 baseline was 18MB), and Highlight.js maxes heap memory at 60MB too.
Disk space in HTML
Each library produces different HTML. Examples:
Prism
<span class="token selector">.item::after</span> <span class="token punctuation">{</span>
<span class="token property">content</span><span class="token punctuation">:</span> <span class="token string">"This is my content."</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
Highlight.js
<span class="hljs-selector-class">.item</span><span class="hljs-selector-pseudo">::after</span> {
<span class="hljs-attribute">content</span>: <span class="hljs-string">"This is my content."</span>;
}
Yes, not only does it mean they look different, they use up a different amount of disk space when saved. That matters for web performance and also has an impact on build resources.
f1 - baseline "HTML" files amounts to 11.9MB (across 3,025 files)
f2 - Prism: 17.6MB
f3 - Highlight.js: 13.6MB
Conclusion
Prism is plenty fast for Node. If you're already using Prism, don't worry about having to switch to Highlight.js for added performance.
RAM memory consumption is about the same.
Final HTML from Prism is 30% larger than Highlight.js but when the rendered HTML is included in a full HTML page, the HTML compresses very well because of all the repetition so this is not a good comparison. Or rather, not a lot to worry about.
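If you want to check the compression point yourself, here is a rough sketch using Node's built-in zlib on the two CSS examples from earlier. A single tiny snippet is not representative (gzip's fixed overhead matters a lot at this size), so treat it as a way to run the comparison, not as a measurement; real numbers should come from gzipping whole pages.

```javascript
const zlib = require("zlib");

// The two highlighted CSS examples from the "Disk space in HTML" section.
const prismHtml = `<span class="token selector">.item::after</span> <span class="token punctuation">{</span>
  <span class="token property">content</span><span class="token punctuation">:</span> <span class="token string">"This is my content."</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>`;

const hljsHtml = `<span class="hljs-selector-class">.item</span><span class="hljs-selector-pseudo">::after</span> {
  <span class="hljs-attribute">content</span>: <span class="hljs-string">"This is my content."</span>;
}`;

// Byte size before and after gzip.
function sizes(html) {
  return {
    raw: Buffer.byteLength(html),
    gzipped: zlib.gzipSync(html).length,
  };
}

const prism = sizes(prismHtml);
const hljs = sizes(hljsHtml);

console.log({ prism, hljs });
console.log("raw ratio    ", (prism.raw / hljs.raw).toFixed(2));
console.log("gzipped ratio", (prism.gzipped / hljs.gzipped).toFixed(2));
```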
Well, speed is just one dimension. The features differ too. MDN already uses Prism but does so in the browser. The ultimate context for this blog post is the speed if we were to do all the syntax highlighting on the server as a build step.
Throw JavaScript errors with extra information
May 12, 2020
0 comments Node, JavaScript
Did you know that you can create your own new Error instance and attach your own custom properties to it? This can come in very handy when you, from the caller, want to get more structured information from the error without relying on the error message.
// WRONG ⛔️
try {
  for (const i of [...Array(10000).keys()]) {
    if (Math.random() > 0.999) {
      throw new Error(`Failed at ${i}`);
    }
  }
} catch (err) {
  const iteration = parseInt(err.toString().match(/Failed at (\d+)/)[1]);
  console.warn(`Made it to ${iteration}`);
}
// RIGHT ✅
try {
  for (const i of [...Array(10000).keys()]) {
    if (Math.random() > 0.999) {
      const failure = new Error(`Failed at ${i}`);
      failure.iteration = i;
      throw failure;
    }
  }
} catch (err) {
  const iteration = err.iteration;
  console.warn(`Made it to ${iteration}`);
}
The above examples are obviously a bit contrived but you have to imagine that whatever code can throw an error might be "far away" from where you deal with errors thrown. For example, imagine you start off a build and you want to get extra information about what the context was. In Python, you use exception classes in except clauses as a form of natural filtering, but a JavaScript catch block has no such filter; it catches everything that is thrown. Using custom error properties can be a great tool to separate unexpected errors from expected errors.
Bonus - Checking for the custom property
Imagine this refactoring:
try {
  for (const i of [...Array(10000).keys()]) {
    if (Math.random() > 0.999) {
      const failure = new Error(`Failed at ${i}`);
      failure.iteration = i;
      throw failure;
    }
    if (Math.random() < 0.001) {
      throw new Error("something else is wrong");
    }
  }
} catch (err) {
  const iteration = err.iteration;
  console.warn(`Made it to ${iteration}`);
}
With that code it's very possible you'd get Made it to undefined. So here's how you'd make the distinction:
try {
  for (const i of [...Array(10000).keys()]) {
    if (Math.random() > 0.999) {
      const failure = new Error(`Failed at ${i}`);
      failure.iteration = i;
      throw failure;
    }
    if (Math.random() < 0.001) {
      throw new Error("something else is wrong");
    }
  }
} catch (err) {
  if (err.hasOwnProperty("iteration")) {
    const iteration = err.iteration;
    console.warn(`Made it to ${iteration}`);
  } else {
    throw err;
  }
}
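Because the examples above depend on Math.random(), they are hard to try out. Here is a deterministic sketch of the same pattern; processItems, report, and the "bad"/null markers are made up for illustration. It also uses Object.prototype.hasOwnProperty.call, a common defensive spelling of the same check.

```javascript
// Throws an "expected" error (with a custom property) on "bad",
// and an "unexpected" plain error on null.
function processItems(items) {
  for (const [i, item] of items.entries()) {
    if (item === "bad") {
      const failure = new Error(`Failed at ${i}`);
      failure.iteration = i; // structured info for the caller
      throw failure;
    }
    if (item === null) {
      throw new Error("something else is wrong");
    }
  }
}

function report(items) {
  try {
    processItems(items);
    return "Made it to the end";
  } catch (err) {
    // Also works for objects that lack a hasOwnProperty method of their
    // own, e.g. ones created with Object.create(null).
    if (Object.prototype.hasOwnProperty.call(err, "iteration")) {
      return `Made it to ${err.iteration}`;
    }
    throw err; // unexpected: let it bubble up
  }
}

console.log(report(["ok", "ok", "bad"])); // Made it to 2
```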
How to use minimalcss without a server
April 24, 2020
0 comments Web development, Node, JavaScript
minimalcss requires that your HTML is served as a web page over HTTP so that puppeteer can open it to find out the CSS within. Suppose, in your build system, you don't yet really have a server. Well, what you can do is start one on-the-fly and shut it down as soon as you're done.
Suppose you have an .html file
First install all the stuff:
yarn add minimalcss http-server
Then run it:
const path = require("path");
const minimalcss = require("minimalcss");
const httpServer = require("http-server");
const HTML_FILE = "index.html"; // THIS IS YOURS
(async () => {
  const server = httpServer.createServer({
    root: path.dirname(path.resolve(HTML_FILE)),
  });
  server.listen(8080);
  let result;
  try {
    result = await minimalcss.minimize({
      urls: ["http://0.0.0.0:8080/" + HTML_FILE],
    });
  } catch (err) {
    console.error(err);
    throw err;
  } finally {
    server.close();
  }
  console.log(result.finalCss);
})();
And the index.html file:
<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="styles.css">
  </head>
  <body>
    <p>Hi @peterbe</p>
  </body>
</html>
And the styles.css file:
h1 {
  color: red;
}
p,
li {
  font-weight: bold;
}
And the output from running that Node script:
p{font-weight:700}
It works!
Suppose all you have is the HTML string and the CSS blob(s)
Suppose all you have is a string of HTML and a list of strings of CSS:
const fs = require("fs");
const path = require("path");
const minimalcss = require("minimalcss");
const httpServer = require("http-server");
const HTML_BODY = `
<p>Hi Peterbe</p>
`;
const CSSes = [
`
h1 {
color: red;
}
p,
li {
font-weight: bold;
}
`,
];
(async () => {
  const csses = CSSes.map((css, i) => {
    fs.writeFileSync(`${i}.css`, css);
    return `<link rel="stylesheet" href="${i}.css">`;
  });
  // Join explicitly; interpolating the array directly would insert commas.
  const html = `<!doctype html><html>
  <head>${csses.join("")}</head>
  <body>${HTML_BODY}</body>
  </html>`;
  const fp = path.resolve("./index.html");
  fs.writeFileSync(fp, html);
  const server = httpServer.createServer({
    root: path.dirname(fp),
  });
  server.listen(8080);
  let result;
  try {
    result = await minimalcss.minimize({
      urls: ["http://0.0.0.0:8080/" + path.basename(fp)],
    });
  } catch (err) {
    console.error(err);
    throw err;
  } finally {
    server.close();
    fs.unlinkSync(fp);
    CSSes.forEach((_, i) => fs.unlinkSync(`${i}.css`));
  }
  console.log(result.finalCss);
})();
Truth be told, you'll need a good pinch of salt to appreciate that example code. It works but most likely, if you're into web performance so much that you're even doing this, your parameters are likely to be more complex.
Suppose you have your own puppeteer instance
In the first example above, minimalcss will create an instance of puppeteer (e.g. const browser = await puppeteer.launch()) but that means you have less control over which version of puppeteer or which parameters you need. Also, if you have to run minimalcss on a bunch of pages it's costly to have to create and destroy puppeteer browser instances repeatedly.
To modify the original example, here's how you use your own instance of puppeteer:
  const path = require("path");
+ const puppeteer = require("puppeteer");
  const minimalcss = require("minimalcss");
  const httpServer = require("http-server");
  const HTML_FILE = "index.html"; // THIS IS YOURS
  (async () => {
    const server = httpServer.createServer({
      root: path.dirname(path.resolve(HTML_FILE)),
    });
    server.listen(8080);
+   const browser = await puppeteer.launch(/* your special options */);
+
    let result;
    try {
      result = await minimalcss.minimize({
        urls: ["http://0.0.0.0:8080/" + HTML_FILE],
+       browser,
      });
    } catch (err) {
      console.error(err);
      throw err;
    } finally {
+     await browser.close();
      server.close();
    }
    console.log(result.finalCss);
  })();
Note that this doesn't buy us anything in this particular example. But that's where your imagination comes in!
Conclusion
You can see the code here as a git repo if that helps.
The point is that this might solve the chicken-and-egg problem you have when you're building your HTML + CSS and you want to perfect it before you ship it.
Note also that there are ways to run minimalcss other than programmatically. For example, minimalcss-server is minimalcss wrapped in an express server.
Another situation you might have is multiple .html files that you want to process. The same technique applies but you just need to turn it into a loop and make sure you call server.close() (and optionally await browser.close()) when you know you've processed the last file. Exercise left to the reader?
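One possible answer to that exercise, as a sketch. The function name minimizeAll is made up, and the server, browser, and minimize function are passed in as parameters so the loop can be read (and tried with stand-ins) without installing anything. The commented wiring at the bottom assumes the same packages as the examples above.

```javascript
// Sketch only: process several HTML files with one server and one
// browser, and close both exactly once when done.
async function minimizeAll(
  htmlFiles,
  { minimize, server, browser, port = 8080 }
) {
  const results = {};
  server.listen(port);
  try {
    for (const file of htmlFiles) {
      const { finalCss } = await minimize({
        urls: [`http://0.0.0.0:${port}/${file}`],
        browser,
      });
      results[file] = finalCss;
    }
  } finally {
    // Runs after the last file, or as soon as one of them fails.
    await browser.close();
    server.close();
  }
  return results;
}

// Assumed wiring (inside an async function):
//
// const results = await minimizeAll(["index.html", "about.html"], {
//   minimize: (options) => minimalcss.minimize(options),
//   server: httpServer.createServer({ root: "." }),
//   browser: await puppeteer.launch(),
// });
```

The files are processed one at a time on purpose; running them concurrently would need more care about when it is safe to close the browser.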
How to post JSON with curl to an Express app
April 15, 2020
2 comments Node, JavaScript
tl;dr; No need to install or require body-parser, and it's important to send the right content-type header.
I know Express has great documentation but I'm still confused about how to receive JSON and/or how to test it from curl. A great deal of confusion comes from the fact that, I think, body-parser used to be a third-party library you had to install yourself to add it to your Express app. You no longer have to. It now gets installed by installing express. E.g.
▶ yarn init -y
▶ yarn add express
▶ ls node_modules/body-parser
HISTORY.md LICENSE README.md index.js lib package.json
Let's work backward. This is how you set up the Express handler:
const express = require("express"); // v4.17.x as of Apr 2020
const app = express();
app.use(express.json());
app.post("/echo", (req, res) => {
  res.json(req.body);
});
app.listen(5000);
And, this is how you test it:
▶ curl -XPOST -d '{"foo": "bar"}' -H 'content-type: application/json' localhost:5000/echo
{"foo":"bar"}%
That's it. No need to require("body-parser") or anything like that. And make sure you're sending the content-type: application/json in the curl command.
Things that can go wrong
I kept fumbling around on StackOverflow questions and rummaging through the Express documentation until I figured out what mistake I kept making. So, here's a variant of the handler above, but much more verbose:
app.post("/echo", (req, res) => {
  if (req.body === undefined) {
    throw new Error("express.json middleware not installed");
  }
  if (!Object.keys(req.body).length) {
    // E.g curl -v -XPOST http://localhost:5000/echo
    if (!req.get("content-Type")) {
      return res.status(400).send("no content-type header\n");
    }
    // E.g. curl -v -XPOST -d '{"foo": "bar"}' http://localhost:5000/echo
    if (!req.get("content-Type").includes("application/json")) {
      return res.status(400).send("content-type not application/json\n");
    }
    // E.g. curl -XPOST -H 'content-type:application/json' http://localhost:5000/echo
    return res.status(400).send("no data payload included\n");
  }
  // At this point 'req.body' is *something*.
  // For example, you might want to `console.log(req.body.foo)`
  res.json(req.body);
});
How you treat these things is up to you. For example, an empty JSON object might be OK in your application. I.e. perhaps curl -XPOST -d '{}' -H 'content-type:application/json' http://localhost:5000/echo might be fine.
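The same checks can be pulled out into a plain function, which makes them easy to try without starting a server. This is a sketch: diagnoseBody is a made-up name, not an Express API, and unlike the handler above it returns a message for the missing-middleware case instead of throwing.

```javascript
// Hypothetical helper: the checks from the verbose handler, minus Express.
// Returns an error message, or null when the body looks usable.
function diagnoseBody(contentType, body) {
  if (body === undefined) {
    return "express.json middleware not installed";
  }
  if (Object.keys(body).length > 0) {
    return null;
  }
  if (!contentType) {
    return "no content-type header";
  }
  if (!contentType.includes("application/json")) {
    return "content-type not application/json";
  }
  return "no data payload included";
}

console.log(diagnoseBody(undefined, {}));
// no content-type header
console.log(diagnoseBody("application/x-www-form-urlencoded", {}));
// content-type not application/json
console.log(diagnoseBody("application/json", {}));
// no data payload included
console.log(diagnoseBody("application/json", { foo: "bar" }));
// null

// Assumed wiring inside the Express handler:
// const problem = diagnoseBody(req.get("content-type"), req.body);
// if (problem) return res.status(400).send(problem + "\n");
```

The second call uses application/x-www-form-urlencoded because that is what curl sends by default when you pass -d without a content-type header.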
An important option
express.json() is a piece of middleware. By default, it has a simple mechanism for deciding whether to bother putting .body into the request object. The default configuration is as if you'd typed:
app.use(express.json({
  type: 'application/json',
}));
(It's actually a bit more complicated than that.)
If you're confident that you'll always be sending JSON to this handler, and you don't want to force clients to specify the application/json Content-Type, you can change this to:
app.use(express.json({
  type: '*/*',
}));
Now you'll find that curl -XPOST -d '{"foo": "bar3"}' localhost:5000/echo will work fine.
Instead of curl, let's fetch
This code works the same with node-fetch or the browser Fetch API.
fetch("http://localhost:5000/echo", {
  method: "post",
  body: JSON.stringify({ foo: "bar" }),
  headers: { "Content-Type": "application/json" },
})
  .then((res) => res.json())
  .then((json) => console.log(json));
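The headers line matters for the same reason it does with curl. As a sketch, you can inspect what the Fetch API would send without making any network call by constructing Request objects. This assumes a runtime that has the Fetch API globals (modern browsers, or Node 18 and later); node-fetch may behave slightly differently.

```javascript
// Nothing is sent here; the Request objects are only constructed.
const url = "http://localhost:5000/echo";
const body = JSON.stringify({ foo: "bar" });

// No explicit header: a string body gets labelled as plain text.
const withoutHeader = new Request(url, { method: "POST", body });
console.log(withoutHeader.headers.get("content-type"));
// text/plain;charset=UTF-8

// Explicit header, as in the example above.
const withHeader = new Request(url, {
  method: "POST",
  body,
  headers: { "Content-Type": "application/json" },
});
console.log(withHeader.headers.get("content-type"));
// application/json
```

With the default express.json() configuration, the first request would not be parsed as JSON, so the handler would see an empty req.body.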
How to install Node 12 on Ubuntu (Eoan Ermine) 19.10
April 8, 2020
0 comments Node, Linux
I'm setting up a new Ubuntu (Eoan Ermine) 19.10 server and I noticed that apt install nodejs gives you Node v10 which is an LTS (Long Term Support) version that'll last till April 2021. However, I want Node v12 which is the most recent LTS release as of April 2020.
To install it I used these instructions:
curl -sL https://deb.nodesource.com/setup_12.x | sudo -E bash -
sudo apt-get install -y nodejs
That worked great.
When it finished, it spat out this nice little blurb about how to install yarn:
...
Fetched 7454 B in 1s (12.3 kB/s)
Reading package lists... Done
## Run `sudo apt-get install -y nodejs` to install Node.js 12.x and npm
## You may also need development tools to build native addons:
     sudo apt-get install gcc g++ make
## To install the Yarn package manager, run:
     curl -sL https://dl.yarnpkg.com/debian/pubkey.gpg | sudo apt-key add -
     echo "deb https://dl.yarnpkg.com/debian/ stable main" | sudo tee /etc/apt/sources.list.d/yarn.list
     sudo apt-get update && sudo apt-get install yarn
By the way, I have no idea what nodejs-mozilla is, but running apt show nodejs-mozilla yields:
Package: nodejs-mozilla
Version: 12.16.1-0ubuntu0.19.10.1
Priority: optional
Section: universe/javascript
Origin: Ubuntu
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Installed-Size: 42.0 MB
Depends: libc6 (>= 2.29), libgcc1 (>= 1:3.4), libstdc++6 (>= 9)
Homepage: http://nodejs.org/
Download-Size: 10.4 MB
APT-Sources: http://mirrors.digitalocean.com/ubuntu eoan-updates/universe amd64 Packages
Description: evented I/O for V8 javascript
 Node.js is a platform built on Chrome's JavaScript runtime for easily
 building fast, scalable network applications. Node.js uses an
 event-driven, non-blocking I/O model that makes it lightweight and
 efficient, perfect for data-intensive real-time applications that run
 across distributed devices.
 .
 Node.js is bundled with several useful libraries to handle server tasks:
 .
 System, Events, Standard I/O, Modules, Timers, Child Processes, POSIX,
 HTTP, Multipart Parsing, TCP, DNS, Assert, Path, URL, Query Strings.
Installing it doesn't add a node executable and I can't find a home page for it. apt can be weird sometimes.