Peterbe.com

Peter Bengtsson's blog

Page 8

The technology behind You Should Watch

January 28, 2023
0 comments You Should Watch, React, Firebase, JavaScript

I recently launched You Should Watch, which is a mobile-friendly web app for keeping a to-watch list of movies and TV shows, as well as being able to quickly share links when you want to tell someone "you should watch" this.

I'll be honest, much of the motivation of building that web app was to try a couple of newish technologies that I wanted to either improve on or just try for the first time. These are the interesting tech pillars that made it possible to launch this web app in what was maybe 20-30 hours of total time.

All the code for You Should Watch is here: https://github.com/peterbe/youshouldwatch-next

The Movie Database API

The cornerstone that made this app possible in the first place. The API is free for developers who don't intend to earn revenue on whatever project they build with it. More details in their FAQ.

The search functionality is important. The way it works is that you can do a "multisearch" which means it finds movies, TV shows, or people. Then, when you have each search result's id and media_type you can fetch a lot more information specifically. For example, that's how the page for a person displays things differently than the page for a movie.
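As a rough sketch of that two-step flow (in Python rather than the app's TypeScript): the endpoint paths follow TMDB's v3 API (`/search/multi` and `/{media_type}/{id}`), while the function names and the `API_KEY` placeholder are assumptions made purely for illustration. It only builds the URLs; it doesn't make any requests.

```python
from urllib.parse import urlencode

TMDB_BASE = "https://api.themoviedb.org/3"
API_KEY = "YOUR_API_KEY"  # placeholder (assumption), not a real key


def multisearch_url(query: str) -> str:
    """URL for a multi-search, which can return movies, TV shows, and people."""
    params = urlencode({"api_key": API_KEY, "query": query})
    return f"{TMDB_BASE}/search/multi?{params}"


def details_url(result: dict) -> str:
    """URL for the details of one search result, based on its media_type and id."""
    media_type = result["media_type"]
    if media_type not in ("movie", "tv", "person"):
        raise ValueError(f"unexpected media_type: {media_type!r}")
    params = urlencode({"api_key": API_KEY})
    return f"{TMDB_BASE}/{media_type}/{result['id']}?{params}"
```

Because the `media_type` decides which endpoint gets called, it's also a natural thing to branch on when rendering the page.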

Next.js and the new App dir

In Next.js 13 you have a choice between regular pages directory or an app directory where every page (which becomes a URL) has to be called page.tsx.

No judgment here. It was a bit cryptic to rewire my brain on how this works. In particular, the head.tsx is now separate from the page.tsx and since both, in server-side rendering, need some async data, I have to duplicate the await getMediaData() instead of being able to fetch it once and share it with prop-drilling or context.

Vercel deployment

Wow! This was the most pleasant experience I've experienced in years. So polished and so much "just works". You sign in, with your GitHub auth, click to select the GitHub repo (that has a next.config.js and package.json etc) and you're done. That's it! Now, not only does every merged PR automatically (and fast!) get deployed, but you also get a preview deployment for every PR (which I didn't use).

I'm still using the free hobby tier but god forbid this app gets serious traffic, I'd just bump it up to $20/month which is cheap. Besides, the app is almost entirely CDN cacheable so only the search XHR backend would linearly increase its load with traffic I think.

Well done Vercel!

Playwright and VS Code

Not the first time I used Playwright but it was nice to return and start afresh. It definitely has improved in developer experience.

Previously I used npx and the terminal to run tests, but this time I tried "Playwright Test for VSCode" which was just fantastic! There are some slightly annoying things in that I had to use the mouse cursor more than I'd hoped, but it genuinely helped me be productive. Playwright also has the ability to generate JS code based on me clicking around in a temporary incognito browser window. You do a couple of things in the browser, then paste the generated source code into tests/basics.spec.ts and do some manual tidying up. To start that code generator, one simply types pnpm dlx playwright codegen

`pnpm`

It seems hip and a lot of people seem to recommend it. Kinda like yarn was hip and often recommended over npm (me included!).

Sure it works and it installs things fast but is it noticeable? Not really. Perhaps it's 4 seconds when it would have been 5 seconds with npm. Apparently pnpm does clever symlinking to avoid a disk-heavy node_modules/ but does it really matter ...much?
It's still large:


❯ du -sh node_modules
468M    node_modules

A disadvantage with pnpm is that GitHub Dependabot currently doesn't support it :(
An advantage with pnpm is that pnpm up -i --latest is a great interactive CLI which works like yarn upgrade-interactive --latest

`just`

just is like make but written in Rust. Now I have a justfile in the root of the repo and can type shortcut commands like just dev or just emu[TAB] (to tab autocomplete).

In hindsight, my justfile ended up being just a list of pnpm run ... commands but the idea is that just would be for all and any command under one roof.

End of the day, it becomes a nifty little file of "recipes" of useful commands and you can easily string them together. For example just lint is the combination of typing pnpm run prettier:check and pnpm run tsc and pnpm run lint.

Pico.css

A gorgeously simple looking pure-CSS framework. Yes, it's very limited in components and I don't know how well it "tree shakes" but it's so small and so neat that it had everything I needed.

My favorite React component library is Mantine but I definitely love the peace of mind that Pico.css is just CSS so you're A) not stuck with React forever, and B) not shipping unnecessary JS code that slows things down.

Firebase

Good old Firebase. The bestest and easiest way to get a reliable and realtime database that is dirt cheap, simple, and has great documentation. I do regret not trying Supabase but I knew that getting the OAuth stuff to work with Google on a custom domain would be tricky so I stayed with Firebase.

`react-lite-youtube-embed`

A port of Paul Irish's Lite YouTube Embed which makes it easy to display YouTube thumbnails in a web performant way. All you have to do is:


import LiteYouTubeEmbed from "react-lite-youtube-embed";

<LiteYouTubeEmbed
   id={youtubeVideo.id}
   title={youtubeVideo.title} />

In conclusion

It's amazing how much time these tools saved compared to just years ago. I could build a fully working side-project with automation and high quality entirely thanks to great open source or well-tuned proprietary components, in just about one day if you sum up the hours.

Announcing: You Should Watch

January 27, 2023
3 comments You Should Watch

tl;dr You Should Watch is a mobile-friendly web app for remembering and sharing movies and TV shows you should watch. Like a to-do list to open when you finally sit down with the remote in your hand.

I built this in the holidays in some spare afternoons and evenings around the 2022 new years. The idea wasn't new in my head because it's an actual itch I've often had: what on earth should I watch now tonight?! Oftentimes you read or hear about great movies or TV shows but a lot of those memories are long gone by the time the kids have been put to sleep and you have control of the remote.

It's not rocket science. It's a neat little app that works great on your phone browser but also works as a website on your computer. You don't have to sign in but if you do, your list outlives the memory of your device. I.e. your selections get saved in the cloud and you can pick back up whenever you're on a different device. If you do decide to sign in, it currently uses Google for that. Your list is anonymized and even I, the owner of the database, can't tell which movies and TV shows you select.

If you do sign in to save your list, every time you check a movie or TV show off it goes into your archive. That can be useful for your next dinner party or date or cookout when you're scrambling to answer "Seen any good shows recently?".

One feature that I've personally enjoyed is the list of recommendations that each TV show or movie has. It's not a perfect list but it's fun and useful. Suppose you can't even think of what to watch and just want inspiration. Start off by finding a movie you like (but don't want to watch right now), then click on it and scroll down to the "Recommendations". Even if your next movie or TV show isn't on that list, perhaps clicking on something similar will take you to the next page where the right inspiration might be found. Try it.

It's free and will always be free. It was fun to build and thankfully free to run so it won't go away. I hope you enjoy it and get value from it. Please please share your thoughts and constructive criticism.

Try it on https://ushdwat.ch/

Some pictures

Your Watch list (mobile)

Search results

Navigating by "Recommendations"

"Add to Home Screen" on iOS Safari

First impressions of Meilisearch and how it compares to Elasticsearch

January 26, 2023
1 comment Elasticsearch

tl;dr Meilisearch is like Elasticsearch but simpler. Decent parity in functionality and performance, but definitely intriguing if you don't already know Elasticsearch or want to run with fewer resources.

Meilisearch is a full-text search solution that you can use to power a really good site-search solution. My personal blog uses Elasticsearch but I wanted to experiment with switching to Meilisearch. This blog post is about some impressions based on this experiment.

Here are some of my observations:

Memory usage

When I start Elasticsearch and index all my blog posts and all comments, on the Activity Monitor that java process uses 1.3GB. The meilisearch process peaks at 290MB.

Indexing performance

In my case, it doesn't matter. When you continuously update the index one document at a time (as I do with Elasticsearch), the time per update is negligible when the dataset is small.
If you do a mass-reindexing you do that, in Elasticsearch, by creating a new index with a timestamp (e.g. blogpost-20230126134512) and then swap the alias with which you send your search queries. In that strategy, it doesn't matter how many seconds it takes because nobody's waiting for it to finish fast.
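To make the alias-swap strategy concrete, here's a minimal sketch in Python. The request body shape is the one Elasticsearch's update-aliases (`_aliases`) API accepts, where the remove and add actions are applied atomically. The function names and the `blogpost` prefix are assumptions for illustration, and nothing here talks to a real cluster.

```python
from datetime import datetime, timezone


def timestamped_index_name(prefix, now=None):
    """Build a name like blogpost-20230126134512 for a freshly created index."""
    now = now or datetime.now(timezone.utc)
    return f"{prefix}-{now:%Y%m%d%H%M%S}"


def alias_swap_actions(alias, old_index, new_index):
    """Body for the update-aliases API that moves the alias in one request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }
```

Search queries keep going to the alias the whole time, so it doesn't matter how long it takes to fill the new index before the swap.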

At the moment I don't even know how to append more to a Meilisearch index. I only know how to index everything all at once.

Note: with the Elasticsearch SDK (in Python) you can pass a generator to the parallel_bulk helper function, meaning you can kinda "stream" in all the documents without loading them all into memory. With the Meilisearch SDK that doesn't work. I.e. I can't do queued = index.add_documents(get_docs_iterator()) so I have to instead do queued = index.add_documents(list(get_docs_iterator())).
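One possible middle ground, sketched here as an assumption rather than something I've verified against a large dataset, is to chop the iterator into smaller lists and send each list separately. The `in_batches` helper below is hypothetical (it's plain Python, not part of any SDK). As far as I understand, add_documents adds or replaces documents by their primary key, so repeated calls should accumulate rather than overwrite the whole index.

```python
from itertools import islice


def in_batches(iterable, batch_size):
    """Yield lists of at most batch_size items without materializing everything."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

With that, something like `for batch in in_batches(get_docs_iterator(), 500): index.add_documents(batch)` would only ever hold 500 documents in memory at a time.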

Complex relevancy is easier, but more magic

Ranking search results by "matchness" is easy. I.e. when the search terms are "more present" it's likely to be more relevant. Elasticsearch sorts by _score by default. So does Meilisearch. In reality, you want to control that more. In my use case, I want the ranking to be a function of "matchness" and popularity (which is a number I derive from pageview metrics). In Elasticsearch you have the power of writing a "Function score query" which gives you lots of flexibility to control exactly how. E.g. multiply or sum or an average or a combination where you write a log function.
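For a flavor of what that looks like, here's a minimal sketch of a function score query expressed as a Python dict. The field names `title` and `popularity` are assumptions, and this is not the exact query my blog uses. It adds log1p(popularity) to the regular text-match score.

```python
def popularity_boosted_query(search_terms):
    """Function score query: text match score plus a log of the popularity field."""
    return {
        "query": {
            "function_score": {
                "query": {"match": {"title": search_terms}},
                "field_value_factor": {
                    "field": "popularity",  # assumed numeric field on each document
                    "modifier": "log1p",  # dampen very popular documents
                    "missing": 0,  # documents without the field get no boost
                },
                "boost_mode": "sum",  # could also be "multiply", "avg", etc.
            }
        }
    }
```

Switching `boost_mode` or `modifier` is how you move between the sum, multiply, and log variants mentioned above.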

With Meilisearch you can only control the sort order of relevancy "algorithms". An example is:


[
  "words",
+ "popularity:desc",
  "typo",
  "proximity",
  "attribute",
  "sort",
  "exactness"
]

I can't exactly phrase how I'd prefer it to work, but it feels a bit like magic. The lack of functionality also speaks to a strength of Meilisearch in that it's easy to get something working with it.

Highlighting is easy

What I want is for the title to be highlighted. In full. For the text I'd rather have snippets that focus where the highlights appear within the text. This was a breeze! Here's how you do it:


res = client.index("blogitems").search(
    q,
    {
        "attributesToHighlight": ["title", "text"],
        "highlightPreTag": "<mark>",
        "highlightPostTag": "</mark>",
        "attributesToCrop": ["text"],
        "cropLength": 30,
    },
)

Relevancy in between fields is too basic

In Elasticsearch you use a boost multiplier to specify that the title is more important than the body. For example:


from elasticsearch_dsl import Q

...

title_match = Q("match", title={"query": word, "boost": 3.5}) 
body_match = Q("match", body={"query": word, "boost": 1.0})
match = title_match | body_match

That means you can express that the title is 3.5 times more important than the match on body.

With Meilisearch you have no such functionality (as far as I can see) but you can say that title > body and the way you do that is a bit odd. It's the order with which you define the searchable fields.
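A minimal sketch of that, assuming the Meilisearch Python SDK's `update_searchable_attributes` method and the `title` and `text` fields from the highlighting example earlier. The `prioritize_title` wrapper is just a hypothetical helper name.

```python
def prioritize_title(index):
    """List "title" before "text" so title matches rank above text matches."""
    # The order of this list is what the "attribute" ranking rule goes by.
    return index.update_searchable_attributes(["title", "text"])
```

You'd call it with something like `prioritize_title(client.index("blogitems"))`. There's no number to tune, only the order of the list.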

Search performance

My dataset is small. About 2,000 documents. I wasn't able to measure a significant difference. In both my search query implementations the time it takes is between 10 and 30 milliseconds. Both are fast enough. The time that matters is the networking overhead. The networking to and from the database probably matters more but if the network is localhost the search time is irrelevant.

In conclusion

When you're already comfortable and versed in the more powerful beast that is Elasticsearch, it's less relevant. However, Meilisearch feels like a nicer experience in its simplicity if you're confronted with a choice on your next full-text search project.

You could say that in terms of core search functionality, to me, Meilisearch sits between PostgreSQL's full-text and Elasticsearch.

What often matters more, if the project is a team effort that involves many people and might outlast you, is the operational side. I.e. do you install it yourself or do you use a proprietary cloud provider (which both Elastic and Meilisearch Cloud are)? That's what needs to be more carefully considered. It's good to know though that Meilisearch has most of the core functionality, including great documentation, to build something really great.

Pip-Outdated.py - a script to compare requirements.in with the output of pip list --outdated

December 22, 2022
0 comments Python

Simply by posting this, there's a big chance you'll say "Hey! Didn't you know there's already a well-known script that does this? Better." Or you'll say "Hey! That'll save me hundreds of seconds per year!"

The problem

Suppose you have a requirements.in file that is used by pip-compile to generate the requirements.txt that you actually install in your Dockerfile or whatever server deployment. The requirements.in is meant to be the human-readable file and the requirements.txt is for the computers. You manually edit the version numbers in the requirements.in and then run pip-compile --generate-hashes requirements.in to generate a new requirements.txt. But the "first-class" packages in the requirements.in aren't the only packages that get installed. For example:

▶ cat requirements.in | rg '==' | wc -l
      54

▶ cat requirements.txt | rg '==' | wc -l
     102

In other words, in this particular example, there are 48 "second-class" packages that get installed (102 minus 54). There might actually be more stuff installed that you didn't describe. That's why pip list | wc -l can be even higher. For example, you might have locally and manually done pip install ipython for a nicer interactive prompt.
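The same counting can be sketched in Python. This is only an illustration of the arithmetic; the helper names are made up and the file contents in the usage note are toy data, not my real requirements files.

```python
def count_pinned(lines):
    """Count lines that pin a package with ==, skipping comments and blank lines."""
    count = 0
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "==" in stripped:
            count += 1
    return count


def second_class_count(requirements_in_lines, requirements_txt_lines):
    """How many pinned packages are in requirements.txt but not requirements.in."""
    return count_pinned(requirements_txt_lines) - count_pinned(requirements_in_lines)
```

For example, a requirements.in that pins only `requests==2.28.1` and a requirements.txt that pins `requests` plus `urllib3` would give a second-class count of 1. Lines such as `--hash=sha256:...` and `# via requests` are ignored because they don't contain `==`.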

The solution

The command pip list --outdated will list packages based on the requirements.txt not the requirements.in. To mitigate that, I wrote a quick Python CLI script that combines the output of pip list --outdated with the packages mentioned in requirements.in:


#!/usr/bin/env python

import subprocess


def main(*args):
    if not args:
        requirements_in = "requirements.in"
    else:
        requirements_in = args[0]
    required = {}
    with open(requirements_in) as f:
        for line in f:
            if "==" in line:
                package, version = line.strip().split("==")
                package = package.split("[")[0]
                required[package] = version

    res = subprocess.run(["pip", "list", "--outdated"], capture_output=True)
    if res.returncode:
        raise Exception(res.stderr)

    lines = res.stdout.decode("utf-8").splitlines()
    relevant = [line for line in lines if line.split()[0] in required]

    longest_package_name = max([len(x.split()[0]) for x in relevant]) if relevant else 0

    for line in relevant:
        p, installed, possible, *_ = line.split()
        if p in required:
            print(
                p.ljust(longest_package_name + 2),
                "INSTALLED:",
                installed.ljust(9),
                "POSSIBLE:",
                possible,
            )


if __name__ == "__main__":
    import sys

    sys.exit(main(*sys.argv[1:]))

Installation

To install this, you can just download the script and run it in any directory that contains a requirements.in file.

Or you can install it like this:

curl -L https://gist.github.com/peterbe/099ad364657b70a04b1d65aa29087df7/raw/23fb1963b35a2559a8b24058a0a014893c4e7199/Pip-Outdated.py > ~/bin/Pip-Outdated.py
chmod +x ~/bin/Pip-Outdated.py

Pip-Outdated.py

How to change the current query string URL in NextJS v13 with next/navigation

December 9, 2022
4 comments React, JavaScript

At the time of writing, I don't know if this is the optimal way, but after some trial and error, I got it working.

This example demonstrates a hook that gives you the current value of the ?view=... (or a default) and a function you can call to change it so that ?view=before becomes ?view=after.

In NextJS v13 with the pages directory:


import { useRouter } from "next/router";

type Options = "buttons" | "table";

export function useNamesView() {
    const KEY = "view";
    const DEFAULT_NAMES_VIEW = "buttons";
    const router = useRouter();

    let namesView: Options = DEFAULT_NAMES_VIEW;
    const raw = router.query[KEY];
    const value = Array.isArray(raw) ? raw[0] : raw;
    if (value === "buttons" || value === "table") {
        namesView = value;
    }

    function setNamesView(value: Options) {
        const [asPathRoot, asPathQuery = ""] = router.asPath.split("?");
        const params = new URLSearchParams(asPathQuery);
        params.set(KEY, value);
        const asPath = `${asPathRoot}?${params.toString()}`;
        router.replace(asPath, asPath, { shallow: true });
    }

    return { namesView, setNamesView };
}

In NextJS v13 with the app directory:


import { useRouter, useSearchParams, usePathname } from "next/navigation";

type Options = "buttons" | "table";

export function useNamesView() {
    const KEY = "view";
    const DEFAULT_NAMES_VIEW = "buttons";
    const router = useRouter();
    const searchParams = useSearchParams();
    const pathname = usePathname();

    let namesView: Options = DEFAULT_NAMES_VIEW;
    const value = searchParams.get(KEY);
    if (value === "buttons" || value === "table") {
        namesView = value;
    }

    function setNamesView(value: Options) {
        const params = new URLSearchParams(searchParams);
        params.set(KEY, value);
        router.replace(`${pathname}?${params}`);
    }

    return { namesView, setNamesView };
}

The trick is that you only want to change 1 query string value and respect whatever was there before. So if the existing URL was /page?foo=bar and you want that to become /page?foo=bar&and=also you have to consume the existing query string and you do that with:


const searchParams = useSearchParams();
...
const params = new URLSearchParams(searchParams);
params.set('and', 'also')

How much faster is Cheerio at parsing depending on xmlMode?

December 5, 2022
0 comments Node, JavaScript

Cheerio is a fantastic Node library for parsing HTML and then being able to manipulate and serialize it. But you can also just use it for parsing HTML and plucking out what you need. We use that to prepare the text that goes into our search index for our site. It basically works like this:


const body = await getBody('http://localhost:4002' + eachPage.path)
const $ = cheerio.load(body)
const title = $('h1').text()
const intro = $('p.intro').text()
...

But it hit me, can we speed that up? cheerio actually ships with two different parsers: parse5, which is the default, and htmlparser2, which is what you get with xmlMode.

One is faster and one is more strict.
But I wanted to see this in a real-world example.

So I made two runs where I used:


const $ = cheerio.load(body)

in one run, and:


const $ = cheerio.load(body, { xmlMode: true })

in another.

After having parsed 1,635 pages of HTML of various sizes the results are:

FILE: load.txt
MEAN:   13.19457640586797
MEDIAN: 10.5975

FILE: load-xmlmode.txt
MEAN:   3.9020372860635697
MEDIAN: 3.1020000000000003

So, using {xmlMode:true} leads to roughly a 3x speedup.

I think it pretty much confirms the original benchmark, but now I know based on a real application.
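The numbers above are just a mean and a median over per-page timings. A minimal sketch of how such a summary could be computed, assuming each file (like load.txt) holds one timing per line. That file format and the `summarize` name are assumptions, not necessarily how I actually did it.

```python
import statistics


def summarize(timings_text):
    """Return (mean, median) for text containing one numeric timing per line."""
    values = [float(line) for line in timings_text.splitlines() if line.strip()]
    return statistics.mean(values), statistics.median(values)
```

Comparing the two means (or the two medians) from each run is what gives the speedup factor.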

Programmatically control the matrix in a GitHub Action workflow

November 30, 2022
0 comments GitHub

If you've used GitHub Actions before you might be familiar with the matrix strategy. For example:


name: My workflow

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        version: [10, 12, 14, 16, 18]
    steps:
      - name: Set up Node ${{ matrix.version }}
        uses: actions/setup-node@v3
        with:
          node-version: ${{ matrix.version }}
      ...

But what if you want that list of things in the matrix to be variable? For example, on rainy days you want it to be [10, 12, 14] and on sunny days you want it to be [14, 16, 18]. Or, more seriously, what if you want it to depend on how the workflow is started?

Let's explain this with a scoped example

You can make a workflow run on a schedule, on pull requests, on pushes, on manual "Run workflow", or as a result of some other workflow finishing.

First, let's set up some sample on directives:


name: My workflow

on:
  workflow_dispatch:
  schedule:
    - cron: '*/5 * * * *'
  workflow_run:
    workflows: ['Build and Deploy stuff']
    types:
      - completed

The workflow_dispatch makes it so that a "Run workflow" button appears on the workflow's page in the Actions tab.

The schedule, in this example, means "At every 5th minute"

And workflow_run, in this example, means that it waits until another workflow, in the same repo, with name: 'Build and Deploy stuff' has finished (but not necessarily successfully).

Let's define some choice business logic

For the sake of the demo, let's say this is the rule:

If the workflow runs because of the schedule, you want the matrix to be [16, 18].
If the workflow runs because of the "Run workflow" button press, you want the matrix to be [18].
If the workflow runs because the Build and Deploy stuff workflow has successfully finished, you want the matrix to be [10, 12, 14, 16, 18].

It's arbitrary but it could be a lot more complex than this.

What's also important to appreciate is that you could use individual steps that look something like this:


    steps:
      - name: Only if started on a workflow_dispatch
        if: ${{ github.event_name == 'workflow_dispatch' }}
        run: echo "yes it was run because of a workflow_dispatch"

But the rest of the workflow is realistically a lot more complex with many steps and you don't want to have to sprinkle the line if: ${{ github.event_name == 'workflow_dispatch' }} into every single step.

The solution to avoiding repetition is to use a job that depends on another job. We'll have a job that figures out the array for the matrix and another job that uses that.

Let's write the business logic in JavaScript

First we inject a job that looks like this:


jobs:
  matrix_maker:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.set-matrix.outputs.result }}
    steps:
      - uses: actions/github-script@v6
        id: set-matrix
        with:
          script: |
            if (context.eventName === "workflow_dispatch") {
              return [18]
            }
            if (context.eventName === "schedule") {
              return [16, 18]
            }
            if (context.eventName === "workflow_run") {
              if (context.payload.workflow_run.conclusion === "success") {
                return [10, 12, 14, 16, 18]
              }
              throw new Error(`It was a workflow_run but not success ('${context.payload.workflow_run.conclusion}')`)
            }
            throw new Error("Unable to find a reason")

      - name: Debug output
        run: echo "${{ steps.set-matrix.outputs.result }}"

Now we can write the "meat" of the workflow that uses this output:



  build:
    needs: matrix_maker
    runs-on: ubuntu-latest
    strategy:
      matrix:
        version: ${{ fromJSON(needs.matrix_maker.outputs.matrix) }}
    steps:
      - name: Set up Node ${{ matrix.version }}
        uses: actions/setup-node@v3
        with:
          node-version: ${{ matrix.version }}

Combined, the entire thing can look like this:


name: My workflow

on:
  workflow_dispatch:
  schedule:
    - cron: '*/5 * * * *'
  workflow_run:
    workflows: ['Build and Deploy stuff']
    types:
      - completed

jobs:
  matrix_maker:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.set-matrix.outputs.result }}
    steps:
      - uses: actions/github-script@v6
        id: set-matrix
        with:
          script: |
            if (context.eventName === "workflow_dispatch") {
              return [18]
            }
            if (context.eventName === "schedule") {
              return [16, 18]
            }
            if (context.eventName === "workflow_run") {
              if (context.payload.workflow_run.conclusion === "success") {
                return [10, 12, 14, 16, 18]
              }
              throw new Error(`It was a workflow_run but not success ('${context.payload.workflow_run.conclusion}')`)
            }
            throw new Error("Unable to find a reason")

      - name: Debug output
        run: echo "${{ steps.set-matrix.outputs.result }}"

  build:
    needs: matrix_maker
    runs-on: ubuntu-latest
    strategy:
      matrix:
        version: ${{ fromJSON(needs.matrix_maker.outputs.matrix) }}
    steps:
      - name: Set up Node ${{ matrix.version }}
        uses: actions/setup-node@v3
        with:
          node-version: ${{ matrix.version }}

Conclusion

I've extrapolated this demo from a more complex one at work. (this is my defense for typos and why it might fail if you verbatim copy-n-paste this). The bare bones are there for you to build on.

In this demo, I've used actions/github-script with JavaScript, because it's convenient and you don't need to do things like actions/checkout and npm ci, which you would if you wanted this to be a standalone Node script. Hopefully you can see that this is just a start and the sky's the limit.

Thanks to fellow GitHub Hubber @joshmgross for the tips and help!

Also, check out Tips and tricks to make you a GitHub Actions power-user

First impressions trying out Rome to format/lint my TypeScript and JavaScript

November 14, 2022
1 comment Node, JavaScript

Rome is a new contender to compete with Prettier and eslint, combined. It's fast and its suggestions are much easier to understand.

I have a project that uses .js, .ts, and .tsx files. At first, I thought, I'd just use rome to do formatting but the linter part was feeling nice as I was experimenting so I thought I'd kill two birds with one stone.

Things that worked well

It is fast

My little project only has 28 files, but time rome check lib scripts components *.ts consistently takes 0.08 seconds.

The CLI looks great

You get this nice prompt after running npx rome init the first time:

Suggestions just look great

Easy to understand and needs no explanation because the suggested fix tells a story that means it's immediately easy to understand what the warning is trying to say.

It is smaller

If I run npx create-next-app@latest, say yes to ESLint, and then run npm i -D prettier, the node_modules becomes 275.3 MiB.
Whereas if I run npx create-next-app@latest, say no to ESLint, and then run npm i -D rome, the node_modules becomes 200.4 MiB.

Editing the `rome.json`'s JSON schema works in VS Code

I don't know how this magically worked, but I'm guessing it just does when you install the Rome VS Code extension. Neat with autocomplete!

Things that didn't work so well

Almost all the things that I'm going to "complain" about are down to usability. I might look back at this in a year (or tomorrow!) and laugh at myself for being dim, but it nevertheless was part of my experience so it's worth pointing out.

Lint, check, or format?

It's confusing what is what. If lint means checking without modifying, what is check then? I'm guessing rome format means run the lint but with permission to edit my files.

What is rome format compared to rome check --apply then??

I guess rome check --apply doesn't just complain but actually applies the things it spots. So what is rome check --apply-suggested?? (if you're reading this and feel eager to educate me with a comment, please do, but I'm trying to point out that it's not user-friendly)

How do I specify wildcards?

Unfortunately, in this project, not all files are in one single directory (e.g. rome check src/ is not an option). How do I specify a wildcard expression?


▶ rome check *.ts
Checked 3 files in 942µs

Cool, but how do I do all .ts files throughout the project?


▶ rome check "**/*.ts"
**/*.ts internalError/io ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  ✖ No such file or directory (os error 2)


Checked 0 files in 66µs

Clearly, it's not this:


▶ rome check **/*.ts

...

The number of diagnostics exceeds the number allowed by Rome.
Diagnostics not shown: 1018.
Checked 2534 files in 1387ms
Skipped 1 files
Error: errors where emitted while running checks

...because bash will include all the files from node_modules/**/*.ts.

In the end, I ended up with this (in my package.json):

"scripts": {
    "code:lint": "rome check lib scripts components *.ts",
    ...

There's no documentation about how to ignore certain rules

Yes, I can contribute this back to the documentation, but today's not the day to do that.

It took me a long time to find out how to disable certain rules (in the rome.json file) and finally I landed on this:

{
  "linter": {
    "enabled": true,
    "rules": {
      "recommended": true,
      "style": {
        "recommended": true,
        "noImplicitBoolean": "off"
      },
      "a11y": {
        "useKeyWithClickEvents": "off",
        "useValidAnchor": "warn"
      }
    }
  }
}

Much better than having to write inline code comments within the source files themselves.

However, it's still not clear to me what "recommended": true means. Is it shorthand for listing all the default rules all set to true? If I remove that, are no rules activated?

The `rome.json` file is JSON

JSON is cool for many things, but writing comments is not one of them.

For example, I don't know what would be better, Yaml or Toml, but it would be nice to write something like:

"a11y": {
    # Disabled because of issue #1234
    # Consider putting this back in December after the refactor launch
    "useKeyWithClickEvents": "off",

Nextjs and rome needs to talk

When create-react-app first came onto the scene, the coolest thing was the zero-config webpack. But, if you remember, it also came with a really nice zero-config eslint configuration for React apps. It would even print warnings when the dev server was running. Now it's many years later and good linting config is something you depend/rely on in a framework. Like it or not, there are specific things in Nextjs that are exclusive to that framework. It's obviously not an easy people-problem to solve but it would be nice if Nextjs and rome could be best friends so you get all the good linting ideas from the core Nextjs framework but all done using rome instead.

How to count the most common lines in a file

October 7, 2022
0 comments Bash, macOS, Linux

tl;dr sort myfile.log | uniq -c | sort -n -r

I wanted to count recurring lines in a log file and started writing a complicated Python script but then wondered if I can just do it with bash basics.
And after some poking and experimenting I found a really simple one-liner that I'm going to try to remember for next time:

You can't argue with the nice results :)


▶ cat myfile.log
one
two
three
one
two
one
once
one

▶ sort myfile.log | uniq -c | sort -n -r
   4 one
   2 two
   1 three
   1 once
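For comparison, the Python version I was about to over-engineer doesn't have to be complicated either. A minimal sketch using collections.Counter from the standard library (the `most_common_lines` name is just for illustration):

```python
from collections import Counter


def most_common_lines(text):
    """Return (line, count) pairs, most frequent first."""
    return Counter(text.splitlines()).most_common()
```

You'd feed it the file contents, e.g. `most_common_lines(open("myfile.log").read())`. Lines with equal counts may come out in a different order than with the shell one-liner, since Counter keeps them in first-seen order.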

Find the largest node_modules directories with bash

September 30, 2022
0 comments Bash, macOS, Linux

tl;dr; fd -I -t d node_modules | rg -v 'node_modules/(\w|@)' | xargs du -sh | sort -hr

It's very possible that there's a tool that does this, but if so please enlighten me.
The objective is to find which of all your various projects' node_modules directory is eating up the most disk space.
The challenge is that often you have nested node_modules within and they shouldn't be included.

The command uses fd which comes from brew install fd and it's a fast alternative to the built-in find. Definitely worth investing in if you like to live fast on the command line.
The other important command here is rg which comes from brew install ripgrep and is a fast alternative to built-in grep. Sure, I think one can use find and grep but that can be left as an exercise to the reader.

▶ fd -I -t d node_modules | rg -v 'node_modules/(\w|@)' | xargs du -sh | sort -hr
1.1G    ./GROCER/groce/node_modules/
1.0G    ./SHOULDWATCH/youshouldwatch/node_modules/
826M    ./PETERBECOM/django-peterbecom/adminui/node_modules/
679M    ./JAVASCRIPT/wmr/node_modules/
546M    ./WORKON/workon-fire/node_modules/
539M    ./PETERBECOM/chiveproxy/node_modules/
506M    ./JAVASCRIPT/minimalcss-website/node_modules/
491M    ./WORKON/workon/node_modules/
457M    ./JAVASCRIPT/battleshits/node_modules/
445M    ./GITHUB/DOCS/docs-internal/node_modules/
431M    ./GITHUB/DOCS/docs/node_modules/
418M    ./PETERBECOM/preact-cli-peterbecom/node_modules/
418M    ./PETERBECOM/django-peterbecom/adminui0/node_modules/
399M    ./GITHUB/THEHUB/thehub/node_modules/
...

How it works:

fd -I -t d node_modules: Find all directories called node_modules but ignore any .gitignore directives in their parent directories.
rg -v 'node_modules/(\w|@)': Exclude all finds where node_modules/ is followed by a @ or a word character (\w, i.e. a letter, digit, or underscore).
xargs du -sh: For each line, run du -sh on it. That's like doing cd some/directory && du -sh, where du means "disk usage" and -s means total and -h means human-readable.
sort -hr: Sort by the first column as a "human numeric sort" meaning it understands that "1M" is more than "20K"

Now, if I want to free up some disk space, I can look through the list and if I recognize a project I almost never work on any more, I just send it to rm -fr.
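If you don't have fd and rg installed, the same idea can be sketched with only the Python standard library. This is an assumption-laden alternative, not what I actually ran: the function names are made up, and it sums file sizes rather than disk blocks, so the numbers won't exactly match du.

```python
import os


def top_level_node_modules(root):
    """Yield node_modules directories under root, skipping nested ones."""
    for dirpath, dirnames, _filenames in os.walk(root):
        if "node_modules" in dirnames:
            yield os.path.join(dirpath, "node_modules")
            # Prune it so os.walk never descends into nested node_modules.
            dirnames.remove("node_modules")


def directory_size(path):
    """Total size in bytes of all regular files below path (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def largest_node_modules(root):
    """Return (size_in_bytes, path) tuples, largest first."""
    sizes = [(directory_size(p), p) for p in top_level_node_modules(root)]
    return sorted(sizes, reverse=True)
```

Pruning inside the walk plays the same role as the rg -v filter: a node_modules inside another node_modules is counted as part of its parent instead of being listed separately.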
