Peterbe.com

A blog and website by Peter Bengtsson

Filtered home page!
Currently only showing blog entries under thecategory: Linux. Clear filter

Create a large empty file for testing

Linux

Because I always end up Googling this and struggling to find it easily, I'm going to jot it down here so it's more present on the web for others (and myself!) to quickly find.

Suppose you want to test something like a benchmark; for example, a unit test that has to process a largish file. You can use the dd command which is available on macOS and most Linuxes.

▶ dd if=/dev/zero of=big.file count=1024 bs=1024

▶ ls -lh big.file
-rw-r--r--  1 peterbe  staff   1.0M Sep  8 15:54 big.file

So the count=1024 creates a 1MB file. To create a 500KB one you simply use...

▶ dd if=/dev/zero of=big.file count=500 bs=1024

▶ ls -lh big.file
-rw-r--r--  1 peterbe  staff   500K Sep  8 15:55 big.file

It creates a binary file so you can't cat view it. But if you try to use less, for example, you'll see this:

▶ less big.file
"big.file" may be a binary file.  See it anyway? [Enter]

^@^@^@...snip...^@^@^@
big.file (END)

Please post a comment if you have thoughts or questions.

Comparing compression commands with hyperfine

Bash, MacOSX, Linux

Today I stumbled across a neat CLI for benchmark comparing CLIs for speed: hyperfine. By David @sharkdp Peter.
It's a great tool in your arsenal for quick benchmarks in the terminal.

It's written in Rust and is easily installed with brew install hyperfine. For example, let's compare a couple of different commands for compressing a file into a new compressed file. I know it's comparing apples and oranges but it's just an example:

hyperfine usage example
(click to see full picture)

It basically executes the following commands over and over and then compares how long each one took on average:

  • apack log.log.apack.gz log.log
  • gzip -k log.log
  • zstd log.log
  • brotli -3 log.log

If you're curious about the ~results~ apples vs oranges, the final result is:

▶ ls -lSh log.log*
-rw-r--r--  1 peterbe  staff    25M Jul  3 10:39 log.log
-rw-r--r--  1 peterbe  staff   2.4M Jul  5 22:00 log.log.apack.gz
-rw-r--r--  1 peterbe  staff   2.4M Jul  3 10:39 log.log.gz
-rw-r--r--  1 peterbe  staff   2.2M Jul  3 10:39 log.log.zst
-rw-r--r--  1 peterbe  staff   2.1M Jul  3 10:39 log.log.br

The point is that you type hyperfine followed by each command in quotation marks. The --prepare is run for each command and you can also use --cleanup="{cleanup command here}.

It's versatile so it doesn't have to be different commands but it can be: hyperfine "python optimization1.py" "python optimization2.py" to compare to Python scripts.

🎵 You can also export the output to a Markdown file. Here, I used:

▶ hyperfine "apack log.log.apack.gz log.log" "gzip -k log.log" "zstd log.log" "brotli -3 log.log" --prepare="rm -fr log.log.*" --export-markdown log.compress.md
▶ cat log.compress.md | pbcopy

and it becomes this:

Command Mean [ms] Min [ms] Max [ms] Relative
apack log.log.apack.gz log.log 291.9 ± 7.2 283.8 304.1 4.90 ± 0.19
gzip -k log.log 240.4 ± 7.3 232.2 256.5 4.03 ± 0.18
zstd log.log 59.6 ± 1.8 55.8 65.5 1.00
brotli -3 log.log 122.8 ± 4.1 117.3 132.4 2.06 ± 0.09

Please post a comment if you have thoughts or questions.

./bin/huey-isnt-running.sh - A bash script to prevent lurking ghosts

Python, Linux, Bash

tl;dr; Here's a useful bash script to avoid starting something when its already running as a ghost process.

Huey is a great little Python library for doing background tasks. It's like Celery but much lighter, faster, and easier to understand.

What cost me almost an hour of hair-tearing debugging today was that I didn't realize that a huey daemon process had gotten stuck in the background with code that wasn't updating as I made changes to the tasks.py file in my project. I just couldn't understand what was going on.

The way I start my project is with honcho which is a Python Foreman clone. The Procfile looks something like this:

elasticsearch: cd /Users/peterbe/dev/PETERBECOM/elasticsearch-7.7.0 && ./bin/elasticsearch -q
web: ./bin/run.sh web
minimalcss: cd minimalcss && PORT=5000 yarn run start
huey: ./manage.py run_huey --flush-locks --huey-verbose
adminui: cd adminui && yarn start
pulse: cd pulse && yarn run dev

And you start that with simply typing:

honcho start

When you Ctrl-C, it kills all those processes but somehow somewhere it doesn't always kill everything. Restarting the computer isn't a fun alternative.

So, to prevent my sanity from draining I wrote this script:

#!/usr/bin/env bash
set -eo pipefail

# This is used to make sure that before you start huey, 
# there isn't already one running the background.
# It has happened that huey gets lingering stuck as a 
# ghost and it's hard to notice it sitting there 
# lurking and being weird.

bad() {
    echo "Huey is already running!"
    exit 1
}

good() {
    echo "Huey is NOT already running"
    exit 0
}

ps aux | rg huey | rg -v 'rg huey' | rg -v 'huey-isnt-running.sh' && bad || good

(If you're wondering what rg is; it's short for ripgrep)

And I change my Procfile accordingly:

-huey: ./manage.py run_huey --flush-locks --huey-verbose
+huey: ./bin/huey-isnt-running.sh && ./manage.py run_huey --flush-locks --huey-verbose

There really isn't much rocket science or brain surgery about this blog post but I hope it inspires someone who's been in similar trenches that a simple bash script can make all the difference.

Please post a comment if you have thoughts or questions.

How I added brotli_static to nginx 1.17 in Ubuntu (Eoan Ermine) 19.10

Nginx, Linux

I knew I didn't want to download the sources to nginx to install it on my new Ubuntu 19.10 server because I'll never have the discipline to remember to keep it upgraded. No, I'd rather just run apt update && apt upgrade every now and then.

Why is this so hard?! All I need is the ability to set brotli_static on; in my Nginx config so it'll automatically pick the .br file if it exists on disk.

These instructions totally helped but here they are specifically for my version (all run as root):

git clone --recursive https://github.com/google/ngx_brotli.git

apt install brotli
apt-get build-dep nginx

# Note the version of which nginx you have installed
nginx -v
# ...which informs which URL to wget
wget https://nginx.org/download/nginx-1.17.9.tar.gz
aunpack nginx-1.17.9.tar.gz
nginx -V 2>&1 >/dev/null | grep -o " --.*" | grep -oP .+?(?=--add-dynamic-module)| head -1 > nginx-1.17.9/build_args.txt
cd nginx-1.17.9/
./configure --with-compat $(cat build_args.txt) --add-dynamic-module=../ngx_brotli
make install

cp objs/ngx_http_brotli_filter_module.so  /usr/lib/nginx/modules/
chmod 644 /usr/lib/nginx/modules/ngx_http_brotli_filter_module.so
cp objs/ngx_http_brotli_static_module.so /usr/lib/nginx/modules/
chmod 644 /usr/lib/nginx/modules/ngx_http_brotli_static_module.so

ls -l /etc/nginx/modules

Now I can edit my /etc/nginx/nginx.conf (somewhere near the top) to:

load_module /usr/lib/nginx/modules/ngx_http_brotli_filter_module.so;
load_module /usr/lib/nginx/modules/ngx_http_brotli_static_module.so;

And test that it works:

nginx -t

Please post a comment if you have thoughts or questions.

How to install Node 12 on Ubuntu (Eoan Ermine) 19.10

Node, Linux

https://github.com/nodesource/distributions/blob/master/README.md#debinstall

I'm setting up a new Ubuntu (Eoan Ermine) 19.10 server and I noticed that apt install nodejs gives you Node v10 which is an LTS (Long Term Support) version that'll last till April 2021. However, I want Node v12 which is the most recent LTS release as of April 2020.

To install it I used these instructions:

curl -sL https://deb.nodesource.com/setup_12.x | sudo -E bash -
sudo apt-get install -y nodejs

That worked great.
When it finished, it spat out this nice little blurb about how to install yarn:

...
Fetched 7454 B in 1s (12.3 kB/s)
Reading package lists... Done

## Run `sudo apt-get install -y nodejs` to install Node.js 12.x and npm
## You may also need development tools to build native addons:
     sudo apt-get install gcc g++ make
## To install the Yarn package manager, run:
     curl -sL https://dl.yarnpkg.com/debian/pubkey.gpg | sudo apt-key add -
     echo "deb https://dl.yarnpkg.com/debian/ stable main" | sudo tee /etc/apt/sources.list.d/yarn.list
     sudo apt-get update && sudo apt-get install yarn

By the way, I have no idea what nodejs-mozilla but running apt show nodejs-mozilla yields:

Package: nodejs-mozilla
Version: 12.16.1-0ubuntu0.19.10.1
Priority: optional
Section: universe/javascript
Origin: Ubuntu
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Installed-Size: 42.0 MB
Depends: libc6 (>= 2.29), libgcc1 (>= 1:3.4), libstdc++6 (>= 9)
Homepage: http://nodejs.org/
Download-Size: 10.4 MB
APT-Sources: http://mirrors.digitalocean.com/ubuntu eoan-updates/universe amd64 Packages
Description: evented I/O for V8 javascript
 Node.js is a platform built on Chrome's JavaScript runtime for easily
 building fast, scalable network applications. Node.js uses an
 event-driven, non-blocking I/O model that makes it lightweight and
 efficient, perfect for data-intensive real-time applications that run
 across distributed devices.
 .
 Node.js is bundled with several useful libraries to handle server
 tasks:
 .
 System, Events, Standard I/O, Modules, Timers, Child Processes, POSIX,
 HTTP, Multipart Parsing, TCP, DNS, Assert, Path, URL, Query Strings.

Installing it doesn't add a node executable and I can't find a home page for it. apt can be weird sometimes.

Please post a comment if you have thoughts or questions.

uwsgi weirdness with --http

Python, Linux

Instead of upgrading everything on my server, I'm just starting from scratch. From Ubuntu 16.04 to Ubuntu 19.04 and I also upgraded everything else in sight. One of them was uwsgi. I copied various user config files but for uwsgi things didn't very well. On the old server I had uwsgi version 2.0.12-debian and on the new one 2.0.18-debian. The uWSGI changelog is pretty hard to read but I sure don't see any mention of this.

You see, on SongSearch I have it so that Nginx talks to Django via a uWSGI socket. But the NodeJS server talks to Django via 127.0.0.1:PORT. So I need my uWSGI config to start both. Here was the old config:

[uwsgi]
plugins = python35
virtualenv = /var/lib/django/songsearch/venv
pythonpath = /var/lib/django/songsearch
user = django
uid = django
master = true
processes = 3
enable-threads = true
touch-reload = /var/lib/django/songsearch/uwsgi-reload.touch
http = 127.0.0.1:9090
module = songsearch.wsgi:application
env = LANG=en_US.utf8
env = LC_ALL=en_US.UTF-8
env = LC_LANG=en_US.UTF-8

(The only difference on the new server was the python37 plugin instead)

I start it and everything looks fine. No errors in the log files. And netstat looks like this:

# netstat -ntpl | grep 9090
tcp        0      0 127.0.0.1:9090          0.0.0.0:*               LISTEN      1855/uwsgi

But every time I try to curl localhost:9090 I kept getting curl: (52) Empty reply from server. Nothing in the log files! It seemed no matter what I tried I just couldn't talk to it over HTTP. No, I'm not a sysadmin. I'm just a hobbyist trying to stand up my little server with the tools and limited techniques I know but I was stumped.

The solution

After endless Googling for a resolution and trying all sorts of uwsgi commands directly, I somehow stumbled on the solution.

[uwsgi]
plugins = python35
virtualenv = /var/lib/django/songsearch/venv
pythonpath = /var/lib/django/songsearch
user = django
uid = django
master = true
processes = 3
enable-threads = true
touch-reload = /var/lib/django/songsearch/uwsgi-reload.touch
-http = 127.0.0.1:9090
+http-socket = 127.0.0.1:9090
module = songsearch.wsgi:application
env = LANG=en_US.utf8
env = LC_ALL=en_US.UTF-8
env = LC_LANG=en_US.UTF-8

With this one subtle change, I can now curl localhost:9090 and I still have the /var/run/uwsgi/app/songsearch/socket socket. So, yay!

I'm blogging about this in case someone else ever gets stuck in the same nasty surprise as me.

Also, I have to admit, I was fuming with rage from this frustration. It's really inspired me to revive the quest for an alternative to uwsgi because I'm not sure it's that great anymore. There are new alternatives such as gunicorn, gunicorn with Meinheld, bjoern etc.

Please post a comment if you have thoughts or questions.