Filtered by JavaScript, Python

Page 13

Reset

Optimize DOM selector lookups by pre-warming by selectors' parents

February 11, 2019
0 comments Web development, Node, Web Performance, JavaScript

tl;dr; minimalcss 0.8.2 introduces a 20% post-processing optimization by lumping many CSS selectors to their parent CSS selectors as a pre-emptive cache.

In minimalcss the general core of it is that it downloads a DOM tree, as HTML, parses it and parses all the CSS stylesheets associated. These might be from <link ref="stylesheet"> or <style> tags.
Once the CSS stylesheets are turned into an AST it loops over each and every CSS selector and asks a simple question; "Does this CSS selector exist in the DOM?". The equivalent is to open your browser's Web Console and type:

>>> document.querySelectorAll('div.foo span.bar b').length > 0
false

For each of these lookups (which is done with cheerio by the way), minimalcss reduces the CSS, as an AST, and eventually spits the AST back out as a CSS string. The only problem is; it's slow. In the case of view-source:https://semantic-ui.com/ in the CSS it uses, there are 6,784 of them. What to do?

First of all, there isn't a lot you can do. This is the work that needs to be done. But one thing you can do is be smart about which selectors you look at and use a "decision cache" to pre-emptively draw conclusions. So, if this is what you have to check:

  1. #example .alternate.stripe
  2. #example .theming.stripe
  3. #example .solid .column p b
  4. #example .solid .column p

As you process the first one you extract that the parent CSS selector is #example and if that doesn't exist in the DOM, you can efficiently draw conclusion about all preceeding selectors that all start with #example .... Granted, if they call exist you will pay a penalty of doing an extra lookup. But that's the trade-off that this optimization is worth.

Check out the comments where I tested a bloated page that uses Semantic-UI before and after. Instead of doing 3,285 of these document.querySelector(selector) calls, it's now able too come to the exact same conclusion with just 1,563 lookups.

Sadly, the majority of the time spent processing lies in network I/O and other overheads but this work did reduce something that used to take 6.3s (median) too 5.1s (median).

Hooks tip! Avoid infinite recursion in React.useEffect()

February 6, 2019
1 comment React, JavaScript

React 16.8.0 with Hooks was released today. A big deal. Executive summary; components as functions is all the rage now.

What used to be this:


class MyComponent extends React.Component {
  ...

  componentDidMount() {
    ...
  }
  componentDidUpdate() {
    ...
  }

  render() { STUFF }
}

...is now this:


function MyComponent() {
  ...

  React.useEffect(() => {
    ...
  })

  return STUFF
}

Inside the useEffect "side-effect callback" you can actually update state. But if you do, and this is no different that old React.Component.componentDidUpdate, it will re-run the side-effect callback. Here's a simple way to cause an infinite recursion:


// DON'T DO THIS

function MyComponent() {
  const [counter, setCounter] = React.useState(0);

  React.useEffect(() => {
    setCounter(counter + 1);
  })

  return <p>Forever!</p>
}

The trick is to pass a second argument to React.useEffect that is a list of states to exclusively run on.

Here's how to fix the example above:


function MyComponent() {
  const [counter, setCounter] = React.useState(0);
  const [times, setTimes] = React.useState(0);

  React.useEffect(
    () => {
      if (times % 3 === 0) {
        setCounter(counter + 1);
      }
    },
    [times]  // <--- THIS RIGHT HERE IS THE KEY!
  );

  return (
    <div>
      <p>
        Divisible by 3: {counter}
        <br />
        Times: {times}
      </p>
      <button type="button" onClick={e => setTimes(times + 1)}>
        +1
      </button>
    </div>
  );
}

You can see it in this demo.

Note, this isn't just about avoiding infinite recursion. It can also be used to fit your business logic and/or an optimization to avoid executing the effect too often.

Displaying fetch() errors and unwanted responses in React

February 6, 2019
0 comments Web development, React, JavaScript

tl;dr; You can use error instanceof window.Response to distinguish between fetch exceptions and fetch responses.

When you do something like...


const response = await fetch(URL);

...two bad things can happen.

  1. The XHR request fails entirely. I.e. there's not even a response with a HTTP status code.
  2. The response "worked" but the HTTP status code was not to your liking.

Either way, your React app needs to deal with this. Ideally in a not-too-clunky way. So here is one take on this challenge/opportunity which I hope can inspire you to extend it the way you need it to go.

The trick is to "clump" exceptions with responses. Then you can do this:


function ShowServerError({ error }) {
  if (!error) {
    return null;
  }
  return (
    <div className="alert">
      <h3>Server Error</h3>
      {error instanceof window.Response ? (
        <p>
          <b>{error.status}</b> on <b>{error.url}</b>
          <br />
          <small>{error.statusText}</small>
        </p>
      ) : (
        <p>
          <code>{error.toString()}</code>
        </p>
      )}
    </div>
  );
}

The greatest trick the devil ever pulled was to use if (error instanceof window.Reponse) {. Then you know that error thing is the outcome of THIS = await fetch(URL) (or fetch(URL).then(THIS) if you prefer). Another good trick the devil pulled was to be aware that exceptions, when asked to render in React does not naturally call its .toString() so you have to do that yourself with {error.toString()}.

This codesandbox demonstrates it quite well. (Although, at the time of writing, codesandbox will spew warnings related to testing React components in the console log. Ignore that.)

If you can't open that codesandbox, here's the gist of it:


React.useEffect(() => {
  url &&
    (async () => {
      let response;
      try {
        response = await fetch(url);
      } catch (ex) {
        return setServerError(ex);
      }
      if (!response.ok) {
        return setServerError(response);
      }
      // do something here with `await response.json()`
    })(url);
}, [url]);

By the way, another important trick is to be subtle with how you put the try { and } catch(ex) {.


// DON'T DO THIS

try {
  const response = await fetch(url);
  if (!response.ok) {
    setServerError(response);
  }
  // do something here with `await response.json()`
} catch (ex) {
  setServerError(ex);
}

Instead...


// DO THIS

let response;
try {
  response = await fetch(url);
} catch (ex) {
  return setServerError(ex);
}
if (!response.ok) {
  return setServerError(response);
}
// do something here with `await response.json()`

If you don't do that you risk catching other exceptions that aren't exclusively the fetch() call. Also, notice the use of return inside the catch block which will exit the function early leaving you the rest of the code (de-dented 1 level) to deal with the happy-path response object.

Be aware that the test if (!response.ok) is simplistic. It's just a shorthand for checking if the "status in the range 200 to 299, inclusive". Realistically getting a response.status === 400 isn't an "error" really. It might just be a validation error hint from a server, and likely the await response.json() will work and contain useful information. No need to throw up a toast or a flash message that the communication with the server failed.

Conclusion

The details matter. You might want to deal with exceptions entirely differently from successful responses with bad HTTP status codes. It's nevertheless important to appreciate two things:

  1. Handle complete fetch() failures and feed your UI or your retry mechanisms.

  2. You can, in one component distinguish between a "successful" fetch() call and thrown JavaScript exceptions.

Format thousands in Python

February 1, 2019
9 comments Python

tl;dr; Use f"{number:,}" to thousands format an integer to a string.

I keep forgetting and having to look this up every time. Hopefully by blogging about it, this time it'll stick in my memory. And hopefully in yours too :)

Suppose you have a number, like 1234567890 and you want to display it, here's how you do it:


>>> number = 1234567890
>>> f"{number:,}"
'1,234,567,890'

In the past, before Python 3.6, I've been using:


>>> number = 1234567890
>>> format(number, ",")
'1,234,567,890'

All of this and more detail can be found in PEP 378 -- Format Specifier for Thousands Separator. For example, you can do this beast too:


>>> number = 1234567890
>>> f"{number:020,.2f}"
'0,001,234,567,890.00'

which demonstrates (1) how to do zero-padding (of length 20), (2) the thousands comma, (3) round to 2 significant figures. All useful weapons to be able to draw from the top of your head.

UPDATE

Also, incredibly useful is the equivalent of somestring.ljust(10):


>>> mystr = "peter"
>>> f"{mystr:10}"
'peter     '
>>> f"{mystr:>10}"
'     peter'

hashin 0.14.5 and canonical pip hashes

January 31, 2019
0 comments Python

Prior to version 0.14.5 hashin would write write down the hashes of PyPI packages in the order they appear in PyPI's JSON response. That means there's a slight chance that two distinct clients/computers/humans might actually get different output when then run hashin Django==2.1.5.

The pull request has a pretty hefty explanation as it demonstrates the fix.

Do note that if the existing order of hashes in a requirements file is not in the "right" order, hashin won't correct it unless any of the hashes are different.

Thanks @SomberNight for patiently pushing for this.

variable_cache_control - Django view decorator to set max_age in runtime

January 22, 2019
0 comments Django, Python

tl;dr; If you use the django.views.decorators.cache.cache_control decorator, consider this one instead to change the max_age depending on the request.

I had/have a Django view function that looks something like this:


@cache_control(public=True, max_age=60 * 60)
def home(request, oc=None, page=1):
    ...

But, that number 60 * 60 I really needed it to be different depending on the request parameters. For example, that oc=None, if that's not None I know the page's Cache-Control header can and should be different.

So I wrote this decorator:


from django.utils.cache import patch_cache_control


def variable_cache_control(**kwargs):
    """Same as django.views.decorators.cache.cache_control except this one will
    allow the `max_age` parameter be a callable.
    """

    def _cache_controller(viewfunc):
        @functools.wraps(viewfunc)
        def _cache_controlled(request, *args, **kw):
            response = viewfunc(request, *args, **kw)
            copied = kwargs
            if kwargs.get("max_age") and callable(kwargs["max_age"]):
                max_age = kwargs["max_age"](request, *args, **kw)
                # Can't re-use, have to create a shallow clone.
                copied = dict(kwargs, max_age=max_age)
            patch_cache_control(response, **copied)
            return response

        return _cache_controlled

    return _cache_controller

Now, I can do this instead:


def _best_max_age(req, oc=None, **kwargs):
    max_age = 60 * 60
    if oc:
        max_age *= 10
    return max_age

@variable_cache_control(public=True, max_age=_best_max_age)
def home(request, oc=None, page=1):
    ...

I hope it inspires.

An example of using Immer to handle nested objects in React state

January 18, 2019
1 comment React, JavaScript

When Immer first came out I was confused. I kinda understood what I was reading but I couldn't really see what was so great about it. As always, nothing beats actual code you type yourself to experience how something works.

Here is, I believe, a great example: https://codesandbox.io/s/y2m399pw31

If you're reading this on your mobile it might be hard to see what it does. Basically, it's a very simple React app that displays a "todo list like" thing. The state (aka. this.state.tasks) is a pure JavaScript array. The React components that display the data (e.g. <List tasks={this.state.tasks}/> and <ShowItem item={item} />) are pure (i.e. extends React.PureComponent) meaning React natively protects from re-rendering a component when the props haven't changed. So no wasted render-cycles.

What Immer does is that it helps mutate an object in a smart way. I'm sure you've heard that you're never supposed to mutate state objects (arrays are a form of mutable objects too!) and instead do things like const stuff = Object.assign({}, this.state.stuff); or const things = this.state.things.slice(0);. However, those things are shallow copies meaning any mutable objects within (i.e. nested objects) don't get the clone treatment and can thus cause problems with not re-rendering when they should.

Here's the core gist:


import React from "react";
import produce from "immer";

class App extends React.Component {
  state = {
    tasks: [[false, { text: "Do something", date: new Date() }]]
  };
  onToggleDone = (i, done) => {
    // Immer
    // This is what the blog post is all about...
    const tasks = produce(this.state.tasks, draft => {
      draft[i][0] = done;
      draft[i][1].date = new Date();
    });

    // Pure JS
    // Don't do this!
    // const tasks = this.state.tasks.slice(0);
    // tasks[i][0] = done;
    // tasks[i][1].date = new Date();

    this.setState({ tasks });
  };
  render() {
    // appreviated, but...
    return <List tasks={this.state.tasks}/>
  }
}

class List extends React.PureComponent {
   ...

It just works. Neat!

By the way, here's a code sandbox that accomplishes the same thing but with ImmutableJS which I think is uglier. I think it's uglier because now the rendering components need to be aware that it's rendering immutable.Map objects instead.

Caveats

  1. The cost of doing what immer.produce isn't free. It's some smart work that needs to be done. But the alternative is to deep clone the object which is going to be much slower. Immer isn't the fastest kid on the block but unlike MobX and ImmutableJS once you've done this smart stuff you're back to plain JavaScript objects.

  2. Careful with doing something like console.log(draft) since it will raise a TypeError in your web console. Just be aware of that or use console.log(JSON.stringify(draft)) instead.

  3. If you know with confidence that your mutable object does not, and will not, have nested mutable objects you can use object spread, Object.assign(), or .slice(0) and save yourself the trouble of another dependency.

Use vars() to send an argparse Namespace into a function in Python

January 8, 2019
1 comment Python

I only just learned about this today after all these years and thought you might like it too.

The trick is to conveniently turn an argparse.Namespace into keyword arguments that you can send to a function. This is the old/wrong way I've been doing it for years:


# THE OLD WAY

def main(things, option_a, option_n):
    print(locals())  # Debugging 


import argparse

parser = argparse.ArgumentParser()
parser.add_argument("things", help="Bla bla", nargs="*")
parser.add_argument("-o", "--option-a", help="Bla bla", default="Op A")
parser.add_argument("-n", "--option-n", help="Ble ble", default="Op N")
args = parser.parse_args()

main(
    things=args.things,
    option_a=args.option_a,
    option_n=args.option_n
)

That works but the tedious thing is to have to have spell out every single argument, twice!, when sending the argparse Namespace into the function. Here's the much moar betterest way:


# THE NEW WAY

def main(things, option_a, option_n):
    print(locals())  # Debugging 


import argparse

parser = argparse.ArgumentParser()
parser.add_argument("things", help="Bla bla", nargs="*")
parser.add_argument("-o", "--option-a", help="Bla bla", default="Op A")
parser.add_argument("-n", "--option-n", help="Ble ble", default="Op N")
args = parser.parse_args()

# The only difference and the magic sauce...
main(**vars(args))  

What's neat about this is that you don't have to type up every argument defined in the parser to the get it as arguments into a function. And as a bonus, Python will name match keyword arguments to arguments so the order doesn't matter.

Caveat! This "trick" assumes that the arguments in the parser match the arguments in the function. So if the main() function takes an argument called foo_bar you have to have an argument in the parser called --foo-bar.

Number.prototype.toString() is incredibly useful to display numbers

January 4, 2019
0 comments JavaScript

tl;dr; Use Number.prototype.toString() to display percentages that might be floating point numbers.

10% entered
I started writing a complicated solution but as I discovered corner cases and surprised I was brutally forced to do some research and actually read some documentation. Turns out Number.prototype.toString(), with the precision argument omitted, is the ideal solution.

The application I was working on has an input field to type in a percentage. I.e. a number between 0 and 100. But whatever the user types in, we store the number in decimal. So, if the user typed in "10" into the input widget, we actually store it as 0.1 in the database. Most people will type in a whole number (aka. an integer) like "12" or "5" but some people actually need more precision so they might type in "0.2%" which means 0.002 stored in the backend database.

But the widget is a React controlled component meaning it's value prop needs to be potentially formatted to what gives the best user experience. If the user types in whole numbers set the value prop to a whole number. If the user types in floating point numbers set the value prop type a floating point number with the "matching formatting".

0.12% entered
I started writing an overly complicated function that tries to figure out how many decimal-points the user typed in. For example 0.123 is 3 because parseInt(0.123 * 10 ** 3, 10) === 0.123 * 10 ** 3. But, that approach doesn't work because of floating point arithmetic and the rounding problem. For example 103441 !== 10.3441 * (10 ** 4) === 103440.99999999999. So, don't look for a number to pass into .toFixed().

Turns out Number.prototype.toString() is all you need. If you omit the precision argument, it figures out how many significant digits to use based on the input. It's best explained with some examples:

> (33).toString()
"33"
> (33.3).toString()
"33.3"
> (33.10000).toString()
"33.1"
> (10.3441).toString()
"10.3441"

Perfect!

Next level stuff

So actually, it's a bit more complicated than that. You see, the number stored in the backend database might be 0.007 which you and I know as "0.7%" but be warned:

> 0.008 * 100
0.8
> 0.007 * 100
0.7000000000000001

You know, because of floating-point arithmetic, which every high-level software engineer remembers understanding one time years ago but now know just to watch out for.

So if you use the toString() on that you'd get...

> var backendPercentage = 0.007
> (100 * backendPercentage).toString() + '%'
"0.700000000000001%"

Ouch! So how to solve that? Use Math.round(number * 100) / 100 to get rid of those rounding errors. Apparently, it's very fast too. So, now combine this with the toString():

> var backendPercentage = 0.007
> (Math.round(100 * backendPercentage * 100) / 100).toString() + '%'
"0.7%"

Perfect!

Concurrent download with hashin without --update-all

December 18, 2018
0 comments Web development, Python

Last week, I landed concurrent downloads in hashin. The example was that you do something like...


$ time hashin -r some/requirements.txt --update-all

...and the whole thing takes ~2 seconds even though it that some/requirements.txt file might contain 50 different packages, and thus 50 different PyPI.org lookups.

Just wanted to point out, this is not unique to use with --update-all. It's for any list of packages. And I want to put some better numbers on that so here goes...

Suppose you want to create a requirements file for every package in the current virtualenv you might do it like this:


# the -e filtering removes locally installed packages from git URLs
$ pip freeze | grep -v '-e ' | xargs hashin -r /tmp/reqs.txt

Before running that I injected a little timer on each pypi.org download. It looked like this:


def get_package_data(package, verbose=False):
    url = "https://pypi.org/pypi/%s/json" % package
    if verbose:
        print(url)
+   t0 = time.time()
    content = json.loads(_download(url))
    if "releases" not in content:
        raise PackageError("package JSON is not sane")
+   t1 = time.time()
+   print(t1 - t0)

I also put a print around the call to pre_download_packages(lookup_memory, specs, verbose=verbose) to see what the "total time" was.

The output looked like this:

▶ pip freeze | grep -v '-e ' | xargs python hashin.py -r /tmp/reqs.txt
0.22896194458007812
0.2900810241699219
0.2814369201660156
0.22658205032348633
0.24882292747497559
0.268247127532959
0.29332590103149414
0.23981380462646484
0.2930259704589844
0.29442572593688965
0.25312376022338867
0.34232664108276367
0.49491214752197266
0.23823285102844238
0.3221290111541748
0.28302812576293945
0.567702054977417
0.3089122772216797
0.5273139476776123
0.31477880477905273
0.6202089786529541
0.28571176528930664
0.24558186531066895
0.5810830593109131
0.5219211578369141
0.23252081871032715
0.4650228023529053
0.6127192974090576
0.6000659465789795
0.30976200103759766
0.44440698623657227
0.3135409355163574
0.638585090637207
0.297544002532959
0.6462509632110596
0.45389699935913086
0.34597206115722656
0.3462028503417969
0.6250648498535156
0.44159507751464844
0.5733060836791992
0.6739277839660645
0.6560370922088623
SUM TOTAL TOOK 0.8481268882751465

If you sum up all the individual times it would have become 17.3 seconds. It's 43 individual packages and 8 CPUs multiplied by 5 means it had to wait with some before downloading the rest.

Clearly, this works nicely.