Peterbe.com

A blog and website by Peter Bengtsson

Currently showing only blog entries in the category: MacOSX.

How I performance test PostgreSQL locally on macOS

10 December 2018 2 comments   Web development, MacOSX, PostgreSQL


It's weird to do performance analysis of a database you run on your laptop. When testing some app, your local instance probably has 1/1000 the amount of realistic data compared to a production server. Or, you're running a bunch of end-to-end integration tests whose PostgreSQL performance doesn't make sense to measure.

Anyway, if you are doing some performance testing of an app that uses PostgreSQL, one great tool to use is pghero. I use it for my side-projects and it gives me such nice insights into slow queries that I'm willing to live with the cost of running it on a production database.

This is more of a brain dump of how I run it locally:

First, you need to edit your postgresql.conf. Even if you used Homebrew to install PostgreSQL, it's not obvious where the right config file is. Start psql (on any database) and type this to find out which file it is:

$ psql kintobench

kintobench=# show config_file;
               config_file
-----------------------------------------
 /usr/local/var/postgres/postgresql.conf
(1 row)

Now, open /usr/local/var/postgres/postgresql.conf and add the following lines:

# Peterbe: From Pghero's configuration help.
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all

Now, to restart the server use:

▶ brew services restart postgresql
Stopping `postgresql`... (might take a while)
==> Successfully stopped `postgresql` (label: homebrew.mxcl.postgresql)
==> Successfully started `postgresql` (label: homebrew.mxcl.postgresql)
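
Depending on your setup, you may also need to enable the pg_stat_statements extension in the database you want to profile before pghero can read from it. A minimal sketch, reusing the kintobench database from above (note that the exact column names in pg_stat_statements vary between PostgreSQL versions):

$ psql kintobench

kintobench=# CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
CREATE EXTENSION
kintobench=# SELECT calls, query FROM pg_stat_statements ORDER BY calls DESC LIMIT 3;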

The next thing you need is pghero itself, and it's easy to run in Docker. So to start, you need Docker for Mac installed. You also need to know the database URL. Here's how I ran it:

docker run -ti -e DATABASE_URL=postgres://peterbe:@host.docker.internal:5432/kintobench -p 8080:8080 ankane/pghero

[Screenshot: Duplicate indexes]

Note the trick of peterbe:@host.docker.internal: I don't use a password, but inside the Docker container it doesn't know my terminal username, so it has to be spelled out. And host.docker.internal is there so the Docker container can reach the PostgreSQL installed on the host.
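
If your local PostgreSQL does require a password, the same command should presumably work with the password in the URL (the user, password and database names here are hypothetical, just to show the shape):

docker run -ti -e DATABASE_URL=postgres://myuser:mypassword@host.docker.internal:5432/mydatabase -p 8080:8080 ankane/pghero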

Once that starts up you can go to http://localhost:8080 in a browser and see a listing of all the cumulatively slowest queries. There are other cool features in pghero too that you can immediately benefit from, such as hints about unused/redundant database indices.

Hope it helps!

The best grep tool in the world; ripgrep

19 June 2018 3 comments   Linux, Web development, MacOSX

https://github.com/BurntSushi/ripgrep


tl;dr; ripgrep (aka. rg) is the best tool to grep today.

ripgrep is a tool for searching files. Its killer feature is that it's fast. Like, really really fast. Faster than sift, git grep, ack, regular grep etc.

If you don't believe me, either read this detailed blog post from its author or just jump straight to the conclusion:

[Screenshot: Benchmark]

I used to use git grep whenever I was inside a git repo and sift for everything else. That alone was a huge step up from regular grep. Granted, almost all my git repos are small enough that git grep is often faster than I can perceive anyway. But with ripgrep I can just add --no-ignore-vcs and it searches the files mentioned in .gitignore too. That's useful when you want to search in your own source as well as the files in node_modules.

The installation instructions are easy. I installed it with brew install ripgrep and the best way to learn how to use it is rg --help. Remember that it has a lot of cool features that are well worth learning. It's written in Rust and so far I haven't had a single crash, ever. The ability to search by file type takes some getting used to (tip: use rg --type-list) and remember that you can pipe rg output to another rg. For example, to search for all lines that contain both query and string you can use rg query | rg string.
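
To make that concrete, a few example invocations (the search strings here are made up, purely for illustration):

# see which file types rg knows about
▶ rg --type-list
# only search JavaScript files
▶ rg -t js useState
# also search files that .gitignore would normally exclude (e.g. node_modules)
▶ rg --no-ignore-vcs useState
# lines that contain both "query" and "string"
▶ rg query | rg string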

How to unset aliases set by Oh My Zsh

14 June 2018 3 comments   Linux, MacOSX


I use Oh My Zsh and I highly recommend it. However, it sets some aliases that I don't want. In particular, there's a plugin called git.plugin.zsh (located in ~/.oh-my-zsh/plugins/git/git.plugin.zsh) that interferes with a global binary I have in $PATH. So when I start a shell, the executable gg becomes...:

▶ which gg
gg: aliased to git gui citool

That overrides /usr/local/bin/gg, which is the one I want to execute when I type gg. To remove that alias I can run...:

unalias gg

▶ which gg
/usr/local/bin/gg

To override it "permanently", I added this to the end of ~/.zshrc:

# This removes the gg alias set by ~/.oh-my-zsh/plugins/git/git.plugin.zsh
# so that my /usr/local/bin/gg works instead
unalias gg

Now, whenever I start a new terminal, gg resolves to /usr/local/bin/gg instead.
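
An alternative, if you don't use any of the git plugin's aliases at all, would be to not load the plugin in the first place. That's controlled by the plugins=(...) line in ~/.zshrc, which has to be set before oh-my-zsh.sh is sourced. A sketch (the plugin list here is made up; yours will differ):

# in ~/.zshrc
# plugins=(git brew docker)   # before
plugins=(brew docker)          # after: the git plugin, and its gg alias, is never loaded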

gtop is best

02 May 2018 0 comments   Linux, MacOSX, JavaScript

https://github.com/aksakalli/gtop


To me, using top inside a Linux server via SSH is all muscle memory and it's definitely good enough. But on my MacBook, when working on some long-running, resource-intensive code, the best tool I know of is gtop.
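
gtop is a Node package, so installation is presumably a global npm install (check the project's README for the exact, current instructions):

▶ npm install gtop -g
▶ gtop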

[Screenshot: gtop in action]

I like it because it has the graphs I want and need. It breaks the load down per CPU core, which is awesome. That's useful for understanding how well a program is able to leverage more than one core.

And it's really nice to have the list of processes there, so you can quickly see which programs are running and how they affect CPU usage.

Instead of listing the alternatives I've tried before, I'll point to this Reddit discussion, which has good links to other options.

Make .local domains NOT slow in macOS

29 January 2018 15 comments   Linux, MacOSX


Problem

I used to have a bunch of domains in /etc/hosts, like peterbecom.dev, for testing Nginx configurations locally. But then it became impossible to test local sites in Chrome, because .dev domains are force-redirected to HTTPS. No problem, I thought, I'll use .local instead. However, DNS resolution then became horribly slow. For example:

time curl -I http://peterbecom.local/about/minimal.css > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0  1763    0     0    0     0      0      0 --:--:--  0:00:05 --:--:--     0
curl -I http://peterbecom.local/about/minimal.css > /dev/null  0.01s user 0.01s system 0% cpu 5.585 total

5.6 seconds to open a local file in Nginx.

Solution

Here's that one weird trick that solves it: add an entry for both IPv4 AND IPv6 in /etc/hosts. (Presumably this is because .local is otherwise reserved for Bonjour/mDNS on macOS, so a lookup for the missing address family falls through to a multicast query that has to time out.)

So now I have:

▶ cat /etc/hosts | grep peterbecom
127.0.0.1       peterbecom.local
::1             peterbecom.local
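
If the change doesn't seem to take effect right away, flushing macOS's DNS cache might help (it may not be necessary, and the exact incantation varies between macOS versions):

▶ sudo dscacheutil -flushcache
▶ sudo killall -HUP mDNSResponder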

Verification

Ah! Much better. Things are fast again:

time curl -I http://peterbecom.local/about/minimal.css > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0  1763    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl -I http://peterbecom.local/about/minimal.css > /dev/null  0.01s user 0.01s system 37% cpu 0.041 total

0.04 seconds instead of 5.6.

When Docker is too slow, use your host

11 January 2018 3 comments   Web development, Django, MacOSX, Docker


I have a side-project that is basically a React frontend, a Django API server and a Node universal React renderer. The killer feature is its Elasticsearch database that searches almost 2.5M large texts and 200K named objects. All the data is stored in PostgreSQL, and there's some Python code that copies it over to Elasticsearch for indexing.

[Screenshot: Timings for searches in Songsearch]
The PostgreSQL database is about 10GB and the Elasticsearch (version 6.1.0) indices are about 6GB. It's moderately big, and even though individual searches take, on average, ~75ms in production, it's hefty. At least for a side-project.

On my MacBook Pro laptop, I use Docker for development. Docker makes it really easy to run one command that starts memcached, Django, an AWS Product API Node app, create-react-app for the search and a separate create-react-app for the stats web app.

At first I tried to run PostgreSQL and Elasticsearch in Docker too, but after many attempts I had to just give up. It was too slow. Elasticsearch would keep crashing even after I increased Docker's memory allocation to 4GB.

This very blog (www.peterbe.com) has a similar stack: Redis, PostgreSQL and Elasticsearch, all running in Docker. It works great. A single docker-compose up web starts everything I need. But when it comes to much larger databases, I found my macOS host to be much more performant.

So the dark side of this is that I have to remember to do more things when starting work on this project. My PostgreSQL was installed with Homebrew and is always running on my laptop. For Elasticsearch I have to open a dedicated terminal and go to a specific location to start the Elasticsearch for this project (e.g. make start-elasticsearch).

The way I do this is to have this in my Django project's settings.py:

import dj_database_url
from decouple import config, Csv


DATABASES = {
    'default': config(
        'DATABASE_URL',
        # Hostname 'docker.for.mac.host.internal' assumes
        # you have at least Docker 17.12.
        # For older versions of Docker use 'docker.for.mac.localhost'
        default='postgresql://peterbe@docker.for.mac.host.internal/songsearch',
        cast=dj_database_url.parse
    )
}

ES_HOSTS = config('ES_HOSTS', default='docker.for.mac.host.internal:9200', cast=Csv())

(Actually, in reality the defaults in the settings.py code are localhost and I use docker-compose.yml environment variables to override them, but the point is hopefully still there.)
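
To sanity-check that a container can actually reach the PostgreSQL running on the host, something like this should work. It's only a sketch: it reuses the user and database names from the settings above, and the postgres image is used purely for its psql client:

▶ docker run -it --rm postgres:10 \
    psql postgres://peterbe@docker.for.mac.host.internal:5432/songsearch -c 'SELECT 1'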

And that's basically it. Now I get Docker to do what various virtualenvs and terminal scripts used to do, but with the performance of running the big databases on the host.