Peterbe.com

A blog and website by Peter Bengtsson

An example of using Immer to handle nested objects in React state

18 January 2019 0 comments   Javascript, ReactJS

https://github.com/mweststrate/immer


When Immer first came out I was confused. I kinda understood what I was reading but I couldn't really see what was so great about it. As always, nothing beats actual code you type yourself to experience how something works.

Here is, I believe, a great example: https://codesandbox.io/s/y2m399pw31

If you're reading this on your mobile it might be hard to see what it does. Basically, it's a very simple React app that displays a "todo list like" thing. The state (aka. this.state.tasks) is a plain JavaScript array. The React components that display the data (e.g. <List tasks={this.state.tasks}/> and <ShowItem item={item} />) are pure (i.e. they extend React.PureComponent), meaning React skips re-rendering a component when a shallow comparison shows its props haven't changed. So no wasted render-cycles.

What Immer does is help you mutate an object in a smart way. I'm sure you've heard that you're never supposed to mutate state objects (arrays are a form of mutable objects too!) and instead do things like const stuff = Object.assign({}, this.state.stuff); or const things = this.state.things.slice(0);. However, those are shallow copies, meaning any mutable objects within (i.e. nested objects) don't get the clone treatment. The copy still shares them with the original, so a pure component comparing old and new props sees the very same nested object and doesn't re-render when it should.

Here's the core gist:

import React from "react";
import produce from "immer";

class App extends React.Component {
  state = {
    tasks: [[false, { text: "Do something", date: new Date() }]]
  };
  onToggleDone = (i, done) => {
    // Immer
    // This is what the blog post is all about...
    const tasks = produce(this.state.tasks, draft => {
      draft[i][0] = done;
      draft[i][1].date = new Date();
    });

    // Pure JS
    // Don't do this!
    // const tasks = this.state.tasks.slice(0);
    // tasks[i][0] = done;
    // tasks[i][1].date = new Date();

    this.setState({ tasks });
  };
  render() {
    // abbreviated, but...
    return <List tasks={this.state.tasks}/>
  }
}

class List extends React.PureComponent {
   ...

It just works. Neat!

By the way, here's a code sandbox that accomplishes the same thing but with ImmutableJS which I think is uglier. I think it's uglier because now the rendering components need to be aware that they're rendering immutable.Map objects instead.

Caveats

  1. What immer.produce does isn't free. It's some smart work that needs to be done. But the alternative is to deep clone the object which is going to be much slower. Immer isn't the fastest kid on the block but unlike MobX and ImmutableJS once you've done this smart stuff you're back to plain JavaScript objects.

  2. Careful with doing something like console.log(draft) since it will raise a TypeError in your web console. Just be aware of that or use console.log(JSON.stringify(draft)) instead.

  3. If you know with confidence that your mutable object does not, and will not, have nested mutable objects you can use object spread, Object.assign(), or .slice(0) and save yourself the trouble of another dependency.

Use vars() to send an argparse Namespace into a function in Python

08 January 2019 0 comments   Python

https://docs.python.org/3/library/functions.html#vars


I only just learned about this today after all these years and thought you might like it too.

The trick is to conveniently turn an argparse.Namespace into keyword arguments that you can send to a function. This is the old/wrong way I've been doing it for years:

# THE OLD WAY

def main(things, option_a, option_n):
    print(locals())  # Debugging 


import argparse

parser = argparse.ArgumentParser()
parser.add_argument("things", help="Bla bla", nargs="*")
parser.add_argument("-o", "--option-a", help="Bla bla", default="Op A")
parser.add_argument("-n", "--option-n", help="Ble ble", default="Op N")
args = parser.parse_args()

main(
    things=args.things,
    option_a=args.option_a,
    option_n=args.option_n
)

That works but the tedious thing is having to spell out every single argument, twice!, when sending the argparse Namespace into the function. Here's the much moar betterest way:

# THE NEW WAY

def main(things, option_a, option_n):
    print(locals())  # Debugging 


import argparse

parser = argparse.ArgumentParser()
parser.add_argument("things", help="Bla bla", nargs="*")
parser.add_argument("-o", "--option-a", help="Bla bla", default="Op A")
parser.add_argument("-n", "--option-n", help="Ble ble", default="Op N")
args = parser.parse_args()

# The only difference and the magic sauce...
main(**vars(args))  

What's neat about this is that you don't have to type up every argument defined in the parser to get them as arguments into a function. And as a bonus, Python matches keyword arguments to parameters by name, so the order doesn't matter.

Caveat! This "trick" assumes that the arguments in the parser match the arguments in the function. So if the main() function takes an argument called foo_bar you have to have an argument in the parser called --foo-bar.
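
Here is a small sketch of that caveat in action. It parses an explicit list instead of the real command line so it runs anywhere. The --number flag, the dest= mapping and the --verbose parser are made up for illustration; they are not from the example above.

```python
import argparse


def main(things, option_a, option_n):
    # Return the received values so the name matching is easy to inspect.
    return {"things": things, "option_a": option_a, "option_n": option_n}


parser = argparse.ArgumentParser()
parser.add_argument("things", nargs="*")
# argparse derives the attribute name from the long flag:
# "--option-a" becomes "option_a" (dashes turn into underscores).
parser.add_argument("-o", "--option-a", default="Op A")
# Hypothetical flag: dest= lets the flag and the function argument differ.
parser.add_argument("-n", "--number", dest="option_n", default="Op N")

# An explicit list is parsed instead of sys.argv.
args = parser.parse_args(["foo", "bar", "--number", "9"])
result = main(**vars(args))
print(result)

# A parser argument that main() does not accept breaks the trick.
extra = argparse.ArgumentParser()
extra.add_argument("--verbose", action="store_true")
mismatch_error = None
try:
    main(**vars(extra.parse_args([])))
except TypeError as exc:
    mismatch_error = exc
print(type(mismatch_error).__name__)
```

So the names only have to line up after argparse has turned flags into attribute names, and any parser argument the function doesn't know about raises a TypeError.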

Number.prototype.toString() is incredibly useful to display numbers

04 January 2019 0 comments   Javascript


tl;dr; Use Number.prototype.toString() to display percentages that might be floating point numbers.

I started writing a complicated solution but as I discovered corner cases and surprises I was brutally forced to do some research and actually read some documentation. Turns out Number.prototype.toString(), called without any argument, is the ideal solution.

The application I was working on has an input field to type in a percentage. I.e. a number between 0 and 100. But whatever the user types in, we store the number as a decimal fraction. So, if the user typed in "10" into the input widget, we actually store it as 0.1 in the database. Most people will type in a whole number (aka. an integer) like "12" or "5" but some people actually need more precision so they might type in "0.2%" which means 0.002 stored in the backend database.

But the widget is a React controlled component meaning its value prop needs to be formatted to whatever gives the best user experience. If the user types in whole numbers, set the value prop to a whole number. If the user types in floating point numbers, set the value prop to a floating point number with the "matching formatting".

I started writing an overly complicated function that tries to figure out how many decimal places the user typed in. For example, for 0.123 it's 3 because parseInt(0.123 * 10 ** 3, 10) === 0.123 * 10 ** 3. But that approach doesn't work because of floating point arithmetic and the rounding problem. For example, 10.3441 * (10 ** 4) evaluates to 103440.99999999999, not 103441. So, don't look for a number to pass into .toFixed().

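Turns out Number.prototype.toString() is all you need. Unlike .toFixed() and .toPrecision() it takes no precision argument (its only optional argument is a radix). It figures out how many digits to use based on the input, giving you the shortest string that still identifies the number. It's best explained with some examples: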

> (33).toString()
"33"
> (33.3).toString()
"33.3"
> (33.10000).toString()
"33.1"
> (10.3441).toString()
"10.3441"

Perfect!

Next level stuff

So actually, it's a bit more complicated than that. You see, the number stored in the backend database might be 0.007 which you and I know as "0.7%" but be warned:

> 0.008 * 100
0.8
> 0.007 * 100
0.7000000000000001

You know, because of floating-point arithmetic, which every high-level software engineer remembers understanding one time years ago but now just knows to watch out for.

So if you use the toString() on that you'd get...

> var backendPercentage = 0.007
> (100 * backendPercentage).toString() + '%'
"0.7000000000000001%"

Ouch! So how to solve that? Use Math.round(number * 100) / 100, which rounds to two decimal places, to get rid of those rounding errors. Apparently, it's very fast too. So, now combine this with the toString():

> var backendPercentage = 0.007
> (Math.round(100 * backendPercentage * 100) / 100).toString() + '%'
"0.7%"

Perfect!

Concurrent download with hashin without --update-all

18 December 2018 0 comments   Python, Web development


Last week, I landed concurrent downloads in hashin. The example was that you do something like...

$ time hashin -r some/requirements.txt --update-all

...and the whole thing takes ~2 seconds even though that some/requirements.txt file might contain 50 different packages, and thus 50 different PyPI.org lookups.

Just wanted to point out, this is not unique to use with --update-all. It's for any list of packages. And I want to put some better numbers on that so here goes...

Suppose you want to create a requirements file for every package in the current virtualenv you might do it like this:

# the -e filtering removes locally installed packages from git URLs
$ pip freeze | grep -v '-e ' | xargs hashin -r /tmp/reqs.txt

Before running that I injected a little timer on each pypi.org download. It looked like this:

def get_package_data(package, verbose=False):
    url = "https://pypi.org/pypi/%s/json" % package
    if verbose:
        print(url)
+   t0 = time.time()
    content = json.loads(_download(url))
    if "releases" not in content:
        raise PackageError("package JSON is not sane")
+   t1 = time.time()
+   print(t1 - t0)

I also put a print around the call to pre_download_packages(lookup_memory, specs, verbose=verbose) to see what the "total time" was.

The output looked like this:

▶ pip freeze | grep -v '-e ' | xargs python hashin.py -r /tmp/reqs.txt
0.22896194458007812
0.2900810241699219
0.2814369201660156
0.22658205032348633
0.24882292747497559
0.268247127532959
0.29332590103149414
0.23981380462646484
0.2930259704589844
0.29442572593688965
0.25312376022338867
0.34232664108276367
0.49491214752197266
0.23823285102844238
0.3221290111541748
0.28302812576293945
0.567702054977417
0.3089122772216797
0.5273139476776123
0.31477880477905273
0.6202089786529541
0.28571176528930664
0.24558186531066895
0.5810830593109131
0.5219211578369141
0.23252081871032715
0.4650228023529053
0.6127192974090576
0.6000659465789795
0.30976200103759766
0.44440698623657227
0.3135409355163574
0.638585090637207
0.297544002532959
0.6462509632110596
0.45389699935913086
0.34597206115722656
0.3462028503417969
0.6250648498535156
0.44159507751464844
0.5733060836791992
0.6739277839660645
0.6560370922088623
SUM TOTAL TOOK 0.8481268882751465

If you sum up all the individual times it comes to 17.3 seconds. There are 43 individual packages, and 8 CPUs multiplied by 5 means at most 40 could run at once, so a few had to wait before they could start downloading.

Clearly, this works nicely.
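
To get a feel for why the total is so much smaller than the sum, here's a rough sketch of the same pattern. It is not hashin's actual code. It assumes a thread pool whose size is the CPU count times 5, and fake_download is a made-up stand-in that just sleeps instead of talking to PyPI.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(package, delay=0.05):
    # Stand-in for an HTTP request; sleeping mimics waiting on the network.
    t0 = time.time()
    time.sleep(delay)
    return package, time.time() - t0


def download_all(packages, cpus=8):
    # Assumed pool size: number of CPUs multiplied by 5.
    t0 = time.time()
    with ThreadPoolExecutor(max_workers=cpus * 5) as executor:
        # map() returns the results in the same order as the input.
        results = list(executor.map(fake_download, packages))
    return results, time.time() - t0


packages = ["package-%d" % i for i in range(43)]
results, total = download_all(packages)
summed = sum(elapsed for _, elapsed in results)
print("SUM OF INDIVIDUAL TIMES %.2f" % summed)
print("SUM TOTAL TOOK %.2f" % total)
```

With 43 fake packages and 40 workers, the last 3 only start once a worker is free, so the total lands at roughly two "downloads" worth of time rather than 43.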

elapsed function in bash to print how long things take

12 December 2018 0 comments   Linux, MacOSX


I needed this for a project and it has served me pretty well. Let's jump right into it:

# This is elapsed.sh

SECONDS=0

function elapsed()
{
  local T=$SECONDS
  local D=$((T/60/60/24))
  local H=$((T/60/60%24))
  local M=$((T/60%60))
  local S=$((T%60))
  (( $D > 0 )) && printf '%d days ' $D
  (( $H > 0 )) && printf '%d hours ' $H
  (( $M > 0 )) && printf '%d minutes ' $M
  (( $D > 0 || $H > 0 || $M > 0 )) && printf 'and '
  printf '%d seconds\n' $S
}

And here's how you use it:

# Assume elapsed.sh to be in the current working directory
source elapsed.sh

echo "Doing some stuff..."
# Imagine it does something slow that
# takes about 3 seconds to complete.
sleep 3
elapsed

echo "Some quick stuff..."
sleep 1
elapsed

echo "Doing some slow stuff..."
sleep 61
elapsed

The output of running that is:

Doing some stuff...
3 seconds
Some quick stuff...
4 seconds
Doing some slow stuff...
1 minutes and 5 seconds

Basically, if you have a bash script that does a bunch of slow things, putting a line of elapsed after some blocks of code will print out how long the script has been running so far.

It's not beautiful but it works.

How I performance test PostgreSQL locally on macOS

10 December 2018 2 comments   PostgreSQL, MacOSX, Web development


It's weird to do performance analysis of a database you run on your laptop. When testing some app, your local instance probably has 1/1000 the amount of realistic data compared to a production server. Or, you're running a bunch of end-to-end integration tests whose PostgreSQL performance doesn't make sense to measure.

Anyway, if you are doing some performance testing of an app that uses PostgreSQL one great tool to use is pghero. I use it for my side-projects and it gives me such nice insights into slow queries that I'm willing to live with the cost of running it on a production database.

This is more of a brain dump of how I run it locally:

First, you need to edit your postgresql.conf. Even if you used Homebrew to install it, it's not clear where the right config file is. Start psql (on any database) and type this to find out which file is the one:

$ psql kintobench
kintobench=# show config_file;
               config_file
-----------------------------------------
 /usr/local/var/postgres/postgresql.conf
(1 row)

Now, open /usr/local/var/postgres/postgresql.conf and add the following lines:

# Peterbe: From Pghero's configuration help.
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all

Now, to restart the server use:

▶ brew services restart postgresql
Stopping `postgresql`... (might take a while)
==> Successfully stopped `postgresql` (label: homebrew.mxcl.postgresql)
==> Successfully started `postgresql` (label: homebrew.mxcl.postgresql)

The next thing you need is pghero itself and it's easy to run in Docker. So to start, you need Docker for Mac installed. You also need to know the database URL. Here's how I ran it:

docker run -ti -e DATABASE_URL=postgres://peterbe:@host.docker.internal:5432/kintobench -p 8080:8080 ankane/pghero

Note the trick of peterbe:@host.docker.internal. I don't use a password, but the username has to be spelled out because inside the Docker container it doesn't know my terminal username. And the host.docker.internal is so the Docker container can reach the PostgreSQL installed on the host.

Once that starts up you can go to http://localhost:8080 in a browser and see a listing of all the cumulatively slowest queries. There are other cool features in pghero too that you can immediately benefit from, such as hints about unused/redundant database indices.

Hope it helps!