A blog and website by Peter Bengtsson
Inside a step in a GitHub Action, I want to run a script, and depending on the outcome of that, maybe do some more things. Essentially, if the script fails, I want to print some extra user-friendly messages, but the whole Action should still fail with the same exit code.
In pseudo-code, this is what I want to achieve:
exit_code = that_other_script()
if exit_code > 0:
    print("Extra message if it failed")
    exit(exit_code)
So here's how to do that with bash:
# If it's not the default, make it so that it proceeds even if
# any one line exits non-zero
set +e
./script/update-internal-links.js --check
exit_code=$?
if [ "$exit_code" -ne 0 ]; then
  echo "Extra message here informing that the script failed"
  exit "$exit_code"
fi
The origin, for me, at the moment, was that I had a GitHub Action that calls another script that might fail. If it fails, I wanted to print out a verbose extra hint to whoever looks at the output. Steps in GitHub Actions run with set -e by default, I think, meaning that if anything goes wrong in the step, it leaves the step and runs those steps with if: ${{ failure() }} next.
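If you'd rather do this from a Node script than from bash, here's a rough sketch of the same idea. It only uses Node's built-in child_process module. The helper name runWithHint is made up for this example, and treating a command that could not be started as exit code 1 is my own assumption.

```javascript
const { spawnSync } = require("node:child_process");

// Run a command, print an extra hint if it fails, and hand back its exit code.
// `runWithHint` is a hypothetical helper name, not part of any library.
function runWithHint(command, args, hint) {
  // stdio: "inherit" lets the child's own output show up in the Action log.
  const { status, error } = spawnSync(command, args, { stdio: "inherit" });
  // status is null if the process could not be started or was killed by a
  // signal. Falling back to 1 in that case is an assumption of this sketch.
  const exitCode = error || status === null ? 1 : status;
  if (exitCode !== 0) {
    console.error(hint);
  }
  return exitCode;
}

// Usage (the script path is the one from the bash example above):
// process.exitCode = runWithHint(
//   "./script/update-internal-links.js",
//   ["--check"],
//   "Extra message here informing that the script failed"
// );
```

Setting process.exitCode, rather than swallowing the failure, is what makes the step still fail with the same exit code as the script it wraps.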
If you've used GitHub Actions before you might be familiar with the matrix strategy. For example:
name: My workflow
jobs:
  build:
    strategy:
      matrix:
        version: [10, 12, 14, 16, 18]
    steps:
      - name: Set up Node ${{ matrix.version }}
        uses: actions/setup-node@v3
        with:
          node-version: ${{ matrix.version }}
      ...
But what if you want that list of things in the matrix to be variable? For example, on rainy days you want it to be [10, 12, 14] and on sunny days you want it to be [14, 16, 18]. Or, more seriously, what if you want it to depend on how the workflow is started?
You can make a workflow run on a schedule, on pull requests, on pushes, on manual "Run workflow", or as a result of some other workflow finishing.
First, let's set up some sample on directives:
name: My workflow
on:
  workflow_dispatch:
  schedule:
    - cron: '*/5 * * * *'
  workflow_run:
    workflows: ['Build and Deploy stuff']
    types:
      - completed
The workflow_dispatch makes it so that a "Run workflow" button appears for this workflow in the repository's "Actions" tab.
The schedule, in this example, means "At every 5th minute".
And workflow_run, in this example, means that it waits until another workflow, in the same repo, with name: 'Build and Deploy stuff' has finished (but not necessarily successfully).
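For the sake of the demo, let's say this is the rule:
- If the workflow is started by the schedule, you want the matrix to be [16, 18].
- If it's started manually with the "Run workflow" button, you want the matrix to be [18].
- If it's started because the Build and Deploy stuff workflow has successfully finished, you want the matrix to be [10, 12, 14, 16, 18].
It's arbitrary but it could be a lot more complex than this.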
What's also important to appreciate is that you could use individual steps that look something like this:
steps:
  - name: Only if started on a workflow_dispatch
    if: ${{ github.event_name == 'workflow_dispatch' }}
    run: echo "yes it was run because of a workflow_dispatch"
But the rest of the workflow is realistically a lot more complex with many steps and you don't want to have to sprinkle the line if: ${{ github.event_name == 'workflow_dispatch' }} into every single step.
The solution to avoiding repetition is to use a job that depends on another job. We'll have a job that figures out the array for the matrix and another job that uses that.
First we inject a job that looks like this:
jobs:
  matrix_maker:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.set-matrix.outputs.result }}
    steps:
      - uses: actions/github-script@v6
        id: set-matrix
        with:
          script: |
            if (context.eventName === "workflow_dispatch") {
              return [18]
            }
            if (context.eventName === "schedule") {
              return [16, 18]
            }
            if (context.eventName === "workflow_run") {
              if (context.payload.workflow_run.conclusion === "success") {
                return [10, 12, 14, 16, 18]
              }
              throw new Error(`It was a workflow_run but not success ('${context.payload.workflow_run.conclusion}')`)
            }
            throw new Error("Unable to find a reason")
      - name: Debug output
        run: echo "${{ steps.set-matrix.outputs.result }}"
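If you want to try out the decision logic without pushing a commit, here's a small sketch of the same branching as a plain function you can run with Node. The name pickMatrix is made up, and its eventName and payload arguments stand in for context.eventName and context.payload. As far as I know, actions/github-script JSON-encodes the return value into the result output by default, so the last line prints the string the next job would receive.

```javascript
// Same branching as the github-script step, as a plain function.
// `pickMatrix` is a hypothetical name; `eventName` and `payload` mirror
// `context.eventName` and `context.payload`.
function pickMatrix(eventName, payload = {}) {
  if (eventName === "workflow_dispatch") return [18];
  if (eventName === "schedule") return [16, 18];
  if (eventName === "workflow_run") {
    const conclusion = payload.workflow_run?.conclusion;
    if (conclusion === "success") return [10, 12, 14, 16, 18];
    throw new Error(`It was a workflow_run but not success ('${conclusion}')`);
  }
  throw new Error("Unable to find a reason");
}

// The output is a JSON string, not an array, which is why the job that
// consumes it has to turn it back into an array before using it as a matrix.
console.log(JSON.stringify(pickMatrix("schedule")));
// → [16,18]
```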
Now we can write the "meat" of the workflow that uses this output:
  build:
    needs: matrix_maker
    runs-on: ubuntu-latest
    strategy:
      matrix:
        version: ${{ fromJSON(needs.matrix_maker.outputs.matrix) }}
    steps:
      - name: Set up Node ${{ matrix.version }}
        uses: actions/setup-node@v3
        with:
          node-version: ${{ matrix.version }}
Combined, the entire thing can look like this:
name: My workflow
on:
  workflow_dispatch:
  schedule:
    - cron: '*/5 * * * *'
  workflow_run:
    workflows: ['Build and Deploy stuff']
    types:
      - completed
jobs:
  matrix_maker:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.set-matrix.outputs.result }}
    steps:
      - uses: actions/github-script@v6
        id: set-matrix
        with:
          script: |
            if (context.eventName === "workflow_dispatch") {
              return [18]
            }
            if (context.eventName === "schedule") {
              return [16, 18]
            }
            if (context.eventName === "workflow_run") {
              if (context.payload.workflow_run.conclusion === "success") {
                return [10, 12, 14, 16, 18]
              }
              throw new Error(`It was a workflow_run but not success ('${context.payload.workflow_run.conclusion}')`)
            }
            throw new Error("Unable to find a reason")
      - name: Debug output
        run: echo "${{ steps.set-matrix.outputs.result }}"
  build:
    needs: matrix_maker
    runs-on: ubuntu-latest
    strategy:
      matrix:
        version: ${{ fromJSON(needs.matrix_maker.outputs.matrix) }}
    steps:
      - name: Set up Node ${{ matrix.version }}
        uses: actions/setup-node@v3
        with:
          node-version: ${{ matrix.version }}
I've extrapolated this demo from a more complex one at work. (This is my defense for typos and why it might fail if you verbatim copy-n-paste this.) The bare bones are there for you to build on.
In this demo, I've used actions/github-script with JavaScript, because it's convenient and you don't need to do things like actions/checkout and npm ci, which you would if you wanted this to be a standalone Node script. Hopefully you can see that this is just a start and the sky's the limit.
Thanks to fellow GitHub Hubber @joshmgross for the tips and help!
Also, check out Tips and tricks to make you a GitHub Actions power-user
tl;dr
- name: Only if auto-merge is enabled
  if: ${{ github.event.pull_request.auto_merge }}
  run: echo "Auto-merge IS ENABLED"
- name: Only if auto-merge is NOT enabled
  if: ${{ !github.event.pull_request.auto_merge }}
  run: echo "Auto-merge is NOT enabled"
The use case that I needed was that I have a workflow that does a bunch of things that aren't really critical to test the PR, but they also take a long time. In particular, every pull request deploys a "preview environment" so you get a "staging" site for each pull request. Well, if you know with confidence that you're not going to be clicking around on that preview/staging site, why bother deploying it (again)?
Also, a lot of PRs get the "Auto-merge" enabled because whoever pressed that button knows that as long as it builds OK, it's ready to merge in.
What's cool about the if: statements above is that they will work in all of these cases too:
on:
  workflow_dispatch:
  pull_request:
  push:
    branches:
      - main
I.e. if this runs because it was a push to main, the line ${{ !github.event.pull_request.auto_merge }} will resolve to truthy. Same if you start the workflow manually via workflow_dispatch.
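To make the truthiness a bit more concrete, here's a JavaScript stand-in for the negated if: expression. It is only an analogy, not how GitHub evaluates expressions, and the event payloads are made-up samples trimmed down to the one field that matters. The assumption is that a missing pull_request key behaves like an empty value, which is what optional chaining mimics here.

```javascript
// JavaScript stand-in for `!github.event.pull_request.auto_merge`.
// The payloads below are made-up samples, not real webhook payloads.
function autoMergeNotEnabled(event) {
  // Optional chaining mimics a missing `pull_request` key resolving to nothing.
  return !event.pull_request?.auto_merge;
}

const pushEvent = { ref: "refs/heads/main" };
const plainPullRequest = { pull_request: { auto_merge: null } };
const autoMergePullRequest = {
  pull_request: { auto_merge: { merge_method: "squash" } },
};

console.log(autoMergeNotEnabled(pushEvent)); // → true
console.log(autoMergeNotEnabled(plainPullRequest)); // → true
console.log(autoMergeNotEnabled(autoMergePullRequest)); // → false
```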
Auto-merge is a fantastic GitHub feature. You first need to set up some branch protections and then, as soon as you've created the PR, you can press the "Enable auto-merge (squash)" button. It will merge ("Squash and merge") the PR as soon as all branch protection checks have succeeded. Neat.
But what if you have a workflow that is made up of half critical and half not-so-important stuff? In particular, what if there's stuff in the workflow that is really slow and you don't want to wait? One example is that you might have a build-and-deploy workflow where you've decided that the "build" part of that is a required check, but the (slow) deployment is just a nice-to-have. Here's an example of that:
name: Build and Deploy stuff
on:
  workflow_dispatch:
  pull_request:
permissions:
  contents: read
jobs:
  build-stuff:
    runs-on: ubuntu-latest
    steps:
      - name: Slight delay
        run: sleep 5
  deploy-stuff:
    needs: build-stuff
    runs-on: ubuntu-latest
    steps:
      - name: Do something
        run: sleep 26
It's a bit artificial but perhaps you can see beyond that. What you can do is set up a required status check, as a branch protection, just for the build-stuff job.
Note how the workflow is made up of build-stuff and deploy-stuff, where the latter depends on the first. Now set up branch protection purely based on build-stuff. This option should appear as you start typing buil there in the "Status checks that are required." section of Branch protections.
Now, when the PR is created, it immediately starts working on that build-stuff job. While that's running you press the "Enable auto-merge (squash)" button.
What will happen is that as soon as the build-stuff job (technically the full name becomes "Build and Deploy stuff / build-stuff") goes green, the PR is auto-merged. But the next (dependent) job deploy-stuff now starts, so even if the PR is merged you still have an ongoing workflow job running. Note the little orange dot (instead of the green checkmark).
It's quite an advanced pattern and perhaps you don't have the use case yet, but it's good to know it's possible. Our use case at work was that we use auto-merge a lot in automation and our complete workflow depended on a slow step that is actually conditional. So we didn't want the auto-merge to be delayed because of something that might be slow and might also turn out to not be necessary.
tl;dr; docsQL is a web app for analyzing lots of Markdown content files with SQL queries.
Sample instance based on MDN's open source content.
When I worked on the code for MDN in 2019-2021 I often found that I needed to understand the content better to debug or test or just find a sample page that uses some feature. I ended up writing a lot of one-off Python scripts that would traverse the repository files just to do some quick lookup that was too complex for grep. Eventually, I built a prototype called "Traits DB" which was powered by an in-browser SQL engine called alasql. Then in 2021, I joined GitHub to work on GitHub Docs and here there are lots of Markdown files too that trigger different features based on various front-matter keys.
docsQL does two things:
- It analyzes your .md files into a docs.json file.
- It gives you a web app in which that docs.json file can be queried.
The analyzing portion has a killer feature in that you can write your own plugins tailored specifically to your project. Your project might use some quirks that are unique. In GitHub Docs, for example, we use something called "LiquidJS" which is like a pre-Markdown processing to do things like versioning. So I can write a custom JavaScript plugin that extends the data you get from reading in the front-matter.
Here's an example plugin:
const regex = /💩/g;
export default function countCocoIceMentions({ data, content }) {
  const inTitle = (data.title.match(regex) || []).length;
  const inBody = (content.match(regex) || []).length;
  return {
    chocolateIcecreamMentions: inTitle + inBody,
  };
}
Now, if you add that to your project, you'll be able to run:
SELECT title, chocolateIcecreamMentions FROM ?
WHERE chocolateIcecreamMentions > 0
ORDER BY 2 DESC LIMIT 15
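To see what a plugin like that returns for a single page, here's a sketch you can run directly with Node. The function is the same as in the plugin above, minus the export so it works as a plain script. The sample page, with its title and content, is made up for illustration.

```javascript
// Same logic as the plugin above, without `export` so it runs as a plain script.
const regex = /💩/g;
function countCocoIceMentions({ data, content }) {
  const inTitle = (data.title.match(regex) || []).length;
  const inBody = (content.match(regex) || []).length;
  return { chocolateIcecreamMentions: inTitle + inBody };
}

// Hypothetical page: one mention in the title, two in the body.
const page = {
  data: { title: "Why 💩 happens" },
  content: "First 💩, then some text, then another 💩.",
};

console.log(countCocoIceMentions(page));
// → { chocolateIcecreamMentions: 3 }
```

The key of the returned object is what ends up as the queryable field, which is why the SQL above can select and filter on chocolateIcecreamMentions.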
It's up to you. One important fact to keep in mind is that not everyone speaks SQL fluently. And even if you're somewhat confident with SQL, it might not be obvious how this particular engine works or what the fields are. (Mind you, there's a "Help" which shows you all fields and a collection of sample queries).
But it's really intuitive to extend an already written SQL query. So if someone shares their query, it's easy to just extend it. For example, your colleague might share a URL with an SQL query in the query string, but you want to change the sort order so you just swap DESC for ASC.
I would recommend that any team that has a project with a bunch of Markdown files add docsql as a dependency somewhere, have it build with your directory of Markdown files, and then publish the docsql/out/ directory as a static web page which you can host on Netlify or GitHub Pages.
This way, your team gets a centralized place where team members can share URLs with each other that have queries in them. When someone shares one of these, they get added to your "Saved queries" and you can extend them from there to add to your own list.
The project is here: github.com/peterbe/docsql and it's MIT licensed. The analyzing part is all Node. It's a CLI that is able to dynamically import other .mjs files based on scanning the directory at runtime.
The front-end is a NextJS static build which uses Mantine for the React UI components.
You can run it with npx like this:
npx docsql /path/to/my/markdown/files
But if you want to control it a bit better you can simply add it to your own Node project with: npm install docsql or yarn add docsql.
First of all, it's a very new project. My initial goal was to get the basics working. A lot of edges have been left rough, especially in the areas of installation, performance, and the SQL editor. Please come and help out if you see something. In particular, if you tried to set it up but found it hard, we can work together to either improve the documentation or fix some scripts that would help the next person.
For feature requests and bug reports use: https://github.com/peterbe/docsql/issues/new
Or just comment here on the blog post.
Table of contents
actions/github-script when bash is too clunky
workflow_dispatch instead of running on pushes
pull_request_target with the code from a pull request
actions/github-script when bash is too clunky
bash is impressively simple but sometimes you want a bit more scripting. Use actions/github-script to be able to express yourself with JavaScript and get a bunch of goodies built-in.
- name: Print something
  uses: actions/github-script@v5.1.0
  with:
    script: |
      const { owner, repo } = context.repo
      console.log(`The owner of ${repo} is ${owner}`)
This example obviously doesn't demonstrate the benefit because it's only 2 lines of actual business logic. But if you find yourself typing more and more complex bash, reaching for actions/github-script is a nifty alternative.
See the documentation on actions/github-script.
In the above example on actions/github-script we saw a simple way to use JavaScript instead of bash and how it has access to useful stuff in the context. An immediate disadvantage, as you might have noticed, is that the JavaScript in the YAML file isn't syntax highlighted in any way because it's treated as a blob string of code.
If your business logic needs to evolve to something more sophisticated, you can just create a regular Node script anywhere in your repo and do:
run: ./scripts/my-script.js
Thing is, it doesn't really matter what you call the script or where you put it. But my recommendation is, put your scripts in a directory called .github/actions-scripts/ because it reminds you that this script is all about complementing your GitHub Actions. If you put it in scripts/ or bin/ in the root of your project, it's not clear that those scripts are related to running Actions.
Note that if you do this, you'll need to make sure you use actions/checkout and actions/setup-node too if you haven't done so already. Example:
name: Using Action script
on:
  pull_request:
permissions:
  contents: read
jobs:
  action-scripts:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v2
      - name: Install Node dependencies
        run: npm install --no-save @actions/core @actions/github
      - name: Gets labels of this PR
        id: label-getter
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: .github/actions-scripts/get-labels.mjs
      - name: Debug what the step above did
        run: echo "${{ steps.label-getter.outputs.currentLabels }}"
And .github/actions-scripts/get-labels.mjs:
#!/usr/bin/env node
import { context, getOctokit } from "@actions/github";
import { setOutput } from "@actions/core";

console.assert(process.env.GITHUB_TOKEN, "GITHUB_TOKEN not present");
const octokit = getOctokit(process.env.GITHUB_TOKEN);

main();

async function getCurrentPRLabels() {
  const {
    repo: { owner, repo },
    payload: { number },
  } = context;
  console.assert(number, "number not present");
  const { data: currentLabels } = await octokit.rest.issues.listLabelsOnIssue({
    owner,
    repo,
    issue_number: number,
  });
  console.log({ currentLabels });
  return currentLabels.map((label) => label.name).join(", ");
}

async function main() {
  const labels = await getCurrentPRLabels();
  setOutput("currentLabels", labels);
}
Don't forget to chmod +x .github/actions-scripts/get-labels.mjs
You don't need to go into your repo's "Secrets" tab to make ${{ secrets.GITHUB_TOKEN }} available.
But imagine you want to hack on this script locally. You just need to create a personal access token and type, in your terminal:
GITHUB_TOKEN=arealonefromyourdevelopersettings node .github/actions-scripts/get-labels.mjs
And working that way is more convenient than having to constantly edit the .github/workflows/*.yml file to see if the changes worked.
workflow_dispatch instead of running on pushes
The most common Actions are run on pull requests or on pushes. For actions that test stuff, it's not uncommon to see at the top of the .yml file, something like this:
name: Testing that the pull request is good
on:
  pull_request:
...
And perhaps you have something operational that runs when the pull requests have been merged:
name: Celebrate that the pull request landed
on:
  push:
...
That's nice but what if you're debugging something in that workflow and you don't want to trigger it by making a commit into main? What you can do is add this:
 name: Celebrate that the pull request landed
 on:
   push:
+  workflow_dispatch:
 ...
In fact, you don't have to use it with on.push: you can use it with on.schedule:.cron: too. Or even, on its own. At work, we have a workflow that is just:
name: Manually purge CDN
on:
  workflow_dispatch:
jobs:
  purge:
    runs-on: ubuntu-latest
    ...
Now, to run it, you just need to find the workflow in your repository's "Actions" tab and press the "Run workflow" button.
pull_request_target with the code from a pull request
You might have heard of pull_request_target as an option so that you can do privileged things in the workflow that you would otherwise not allow in untrusted pull requests. In particular, you might need to use secrets when analyzing a new pull request. But you can't use secrets on a regular on.pull_request workflow when the pull request comes from a fork. So you use on.pull_request_target. But now, how do you get the code that was being changed in the PR, since pull_request_target runs on HEAD (the latest commit in the main (or master) branch)?
To run a pull_request_target workflow, based on the code in a PR, use:
name: Analyze and report on PR code
on:
  pull_request_target:
permissions:
  contents: read
jobs:
  action-scripts:
    runs-on: ubuntu-latest
    steps:
      - name: Check out PR code
        uses: actions/checkout@v2
        with:
          # THIS is the magic
          ref: ${{ github.event.pull_request.head.sha }}
Word of warning, that will mix the pull request's code with your fully-loaded pull_request_target workflow. Even if that pull request comes with its own attempt to override your pull_request_target Actions workflow, it won't be run here. But, if your workflow depends on external scripts (e.g. run: node .github/actions-scripts/something.mjs) then that would be run from the pull request, with secrets potentially enabled and available.
Another option is to do two checkouts. One of your HEAD code and one of the pull request, but carefully mix the two. Example:
name: Analyze and report on PR code
on:
  pull_request_target:
permissions:
  contents: read
jobs:
  action-scripts:
    runs-on: ubuntu-latest
    steps:
      - name: Check out HEAD code
        uses: actions/checkout@v2
      - name: Check out *their* code
        uses: actions/checkout@v2
        with:
          path: ./pr-code
          ref: ${{ github.event.pull_request.head.sha }}
      - name: Analyze their code
        env:
          SECRET_TOKEN_NEEDED: ${{ secrets.SPECIAL_SECRET }}
        run: ./scripts/analyze.py --repo-root=./pr-code
Now, you can be certain that it's only code in the main branch HEAD that executes things, but it safely has access to the pull request's suggested code.
No denying, NextJS is massively popular and a lot of web apps depend on it and their Actions will test things like npm run build working.
The problem with NextJS, for any non-trivial app, is that it's slow, even with its fancy SWC compiler, simply because you probably have a lot of files. The fastest transpiler is one that doesn't need to do anything and that's where .next/cache comes in. To use it in your CI add this:
 - name: Setup node
   uses: actions/setup-node@v2
   with:
     cache: npm
 - name: Install dependencies
   run: npm ci
+- name: Cache nextjs build
+  uses: actions/cache@v2
+  with:
+    path: .next/cache
+    key: ${{ runner.os }}-nextjs-${{ hashFiles('package*.json') }}
 - name: Run build script
   run: npm run build
If you use yarn, add ${{ hashFiles('yarn.lock') }} there on the key line.
But here's the rub. If you run this workflow only on on.pull_request, any caching made will only be reusable by other runs on the same pull request. I.e. if you commit some, make a pull request, commit some more and run the workflow again.
To make that cached asset become available to other pull requests, you need to do one of two things: also run this on on.push, or have a dedicated workflow that runs on on.push whose only job is to execute these lines that warm up the cache.
You might have experienced an Action that uses a matrix strategy to test once per version of Node, Python, or whatever. For example:
jobs:
  test:
    name: Node ${{ matrix.node }}
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        node:
          - 12
          - 14
          - 16
          - 17
But, it doesn't have to be for just different versions of a language like that. It can be any array of strings. For example, if you have a slow set of tests you can break it up by your own things:
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        test-group:
          [
            content,
            graphql,
            meta,
            rendering,
            routing,
            unit,
            linting,
            translations,
          ]
    steps:
      - name: Check out repo
        uses: actions/checkout@v2.4.0
      - name: Setup node
        uses: actions/setup-node@v2
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npm test -- tests/${{ matrix.test-group }}/
You might have something like this at the top of your workflow:
on:
  push:
    branches:
      - main
  workflow_dispatch:
  pull_request:
So your workflow might be doing some special scripts or something that depends on the branch name. I.e. if it's a PR that's running the workflow (e.g. a PR to merge someone-fork:my-cool-branch to origin:main), you want the name my-cool-branch. But if it's run again after it's been merged into main, you want the name main.
When it's a pull request (or pull_request_target) you want to read ${{ github.head_ref }} and when it's a push you want to read ${{ github.ref_name }}. So, in a simple way, to get either my-cool-branch or main use:
- name: Name of the branch
  run: echo "${{ github.head_ref || github.ref_name }}"
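If you need the same fallback inside a Node script instead of in the YAML, here's a minimal sketch. It assumes the default environment variables GITHUB_HEAD_REF and GITHUB_REF_NAME that the runner sets. The function name branchName and the sample environments are made up for this example.

```javascript
// Pick the branch name from the default environment variables of a run.
// Pass in `process.env` inside a real Action; plain objects work for testing.
function branchName(env) {
  // GITHUB_HEAD_REF is only set for pull_request and pull_request_target runs,
  // so an empty or missing value falls through to GITHUB_REF_NAME.
  return env.GITHUB_HEAD_REF || env.GITHUB_REF_NAME;
}

// Made-up environments for a pull request run and a push run.
const pullRequestEnv = { GITHUB_HEAD_REF: "my-cool-branch", GITHUB_REF_NAME: "123/merge" };
const pushEnv = { GITHUB_HEAD_REF: "", GITHUB_REF_NAME: "main" };

console.log(branchName(pullRequestEnv)); // → my-cool-branch
console.log(branchName(pushEnv)); // → main
```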
For better reproducibility, it's good to use exact versions of third-party actions. That way, it's less likely to surprise you when new versions come out and suddenly fail things.
But once you use more specific versions (or perhaps the exact SHA like uses: actions/checkout@ec3a7ce113134d7a93b817d10a8272cb61118579) then you'll want upgrades of these to be automated.
Create a file called .github/dependabot.yml and put this in it:
version: 2
updates:
  - package-ecosystem: 'github-actions'
    directory: '/'
    schedule:
      interval: monthly