Testing Shell Scripts: bats-core, shunit2, Mocking Commands, Fixtures & CI Integration — Stop Shipping Untested Bash

When a Python or Go developer adds a function, they add a test. When a shell developer adds a function, they… usually don’t. The result: bash codebases that nobody dares refactor, where every change is “deploy and pray,” and where production failures often turn out to be regressions in code that “obviously works.”

This isn’t because shell can’t be tested. It’s because the tooling is less famous. It’s actually as easy as Python’s pytest once you know the pattern.

This lesson covers:

bats-core — the de-facto standard test framework for bash. Clean syntax, isolated test runs, parallel execution.
shunit2 — the POSIX-shell-friendly alternative. Use this if you support dash/ash/busybox.
Test patterns for pure functions, side-effecting functions, scripts with arguments, and scripts that call external commands.
Mocking curl, aws, kubectl, anything by overriding PATH.
Fixtures — mktemp -d, golden files, isolating tests from each other.
Coverage with kcov.
CI integration — GitHub Actions and GitLab CI examples that run your test suite on every push.

By the end, you’ll have a tested shell project that’s safe to refactor and a CI pipeline that catches regressions before they reach production.

1. Why test shell? The five most common bugs that tests catch

Before tooling — what are we testing for? In real-world shell, the bugs that bite hardest are:

Quoting: mv $f /backup/ breaks the moment $f contains a space.
Off-by-one in arithmetic: $(( count - 1 )) going negative.
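Glob patterns in unexpected places: [[ $x == *.log ]] is a pattern match, not a string comparison (quote the right-hand side to compare literally), and an unquoted $pattern passed to a command is expanded against the filesystem unless you quote it or set -f.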
Subshell variable scope: cmd | while read; do FOUND=1; done and $FOUND is empty afterwards.
Exit code propagation: cmd1 | cmd2 returning cmd2’s code, not cmd1’s, and set -o pipefail was forgotten.

These are exactly the kinds of bugs unit tests find. A 5-line test that runs your function with a filename containing a space catches the quoting bug for life.
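To make two of these concrete, here is a minimal sketch of the subshell-scope and exit-code bugs. The function names (found_via_pipe, found_via_redirect, pipeline_status) are made up for illustration, and the sketch assumes bash with its default options (lastpipe off).

```bash
#!/usr/bin/env bash

# Subshell scope: the loop on the right of a pipe runs in a subshell,
# so the assignment to `found` never reaches the function's own variable.
found_via_pipe() {
  local found=0 line
  printf 'a\nb\n' | while read -r line; do found=1; done
  echo "$found"
}

# Fix: feed the loop through a redirect so it runs in the current shell.
found_via_redirect() {
  local found=0 line
  while read -r line; do found=1; done < <(printf 'a\nb\n')
  echo "$found"
}

# Exit codes: a pipeline reports the last command's status unless
# pipefail is set. The body runs in a subshell so the option does not leak.
pipeline_status() (
  rc=0
  [ "${1:-}" = "pipefail" ] && set -o pipefail
  false | true || rc=$?
  echo "$rc"
)

found_via_pipe            # prints 0
found_via_redirect        # prints 1
pipeline_status           # prints 0
pipeline_status pipefail  # prints 1
```

Each of these is a one-assertion test once a framework is in place.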

2. bats-core — the modern bash testing framework

bats-core is a maintained fork of the original bats (Bash Automated Testing System). Test files look like this:

#!/usr/bin/env bats

@test "addition works" {
  result=$(( 1 + 1 ))
  [ "$result" -eq 2 ]
}

@test "string contains substring" {
  haystack="hello world"
  [[ $haystack == *world* ]]
}

@test "command succeeds" {
  run echo "hello"
  [ "$status" -eq 0 ]
  [ "$output" = "hello" ]
}

The @test "name" { ... } syntax is bats’s only syntax extension; everything else is plain bash. Each @test block is run in its own subshell, so tests can’t pollute each other.

2.1 Installation

# Recommended: pin a version via git submodule or vendored install.
git clone https://github.com/bats-core/bats-core.git
sudo bats-core/install.sh /usr/local

# Or via package manager (older versions, but fine for casual use):
brew install bats-core           # macOS
sudo apt install bats            # Debian/Ubuntu (often older)
sudo dnf install bats            # Fedora

# In CI, pin a specific version for reproducibility:
git clone --depth=1 --branch=v1.10.0 https://github.com/bats-core/bats-core.git /tmp/bats
sudo /tmp/bats/install.sh /usr/local

Verify:

$ bats --version
Bats 1.10.0

2.2 The `run` helper — captures output and status

The most important bats primitive:

@test "ls shows expected file" {
  run ls /tmp
  [ "$status" -eq 0 ]
  [[ "$output" == *"somefile"* ]]
}

run executes its arguments and captures:

$status: exit code.
$output: combined stdout+stderr (one string).
${lines[0]}, ${lines[1]}, …: output split by newlines (lines is a bash array).

This is how you test side-effecting code. Without run, a non-zero exit aborts the test (because set -e is on by default in bats); with run, the failure is captured into $status and you can assert on it.
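To see why run makes failures assertable, it can help to look at a stripped-down imitation. The function below, my_run, is a hypothetical stand-in written for this lesson; it is not bats’s implementation, and the real run has more options and edge-case handling.

```bash
#!/usr/bin/env bash

# my_run: simplified imitation of bats's `run` (illustrative only).
# Sets the globals `status`, `output` and `lines`, like the real helper.
my_run() {
  local rc=0 line
  # Capture stdout and stderr together, and remember the exit code
  # instead of letting a failure abort the caller.
  output=$("$@" 2>&1) || rc=$?
  status=$rc
  lines=()
  if [ -n "$output" ]; then
    while IFS= read -r line; do
      lines+=("$line")
    done <<< "$output"
  fi
}

my_run sh -c 'echo one; echo two >&2; exit 3'
echo "status=$status"            # prints status=3
echo "line count=${#lines[@]}"   # prints line count=2
echo "first line=${lines[0]}"    # prints first line=one
```

The key move is `|| rc=$?`: the failing command sits in a context errexit ignores, so the test keeps going and can inspect the result.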

2.3 Common assertions

bats relies on bash’s [ ] and [[ ]] for assertions:

@test "various assertions" {
  # Exit code:
  [ "$status" -eq 0 ]                  # Equal
  [ "$status" -ne 0 ]                  # Not equal
  [ "$status" -lt 5 ]                  # Less than

  # String:
  [ "$output" = "exact" ]              # Exact match
  [ "$output" != "" ]                  # Not empty
  [[ "$output" == prefix* ]]           # Glob match
  [[ "$output" =~ ^[0-9]+$ ]]          # Regex match

  # Lines:
  [ "${#lines[@]}" -eq 3 ]             # Number of lines
  [ "${lines[0]}" = "first" ]          # First line content

  # Files:
  [ -f /path/to/file ]                 # File exists
  [ -d /path/to/dir ]                  # Directory exists
  [ ! -e /path/that/should/not/exist ] # Doesn't exist
}

2.4 `bats-assert` and `bats-support` — better failure messages

Plain [ ] tells you which line failed but not what the values were. The companion libraries bats-support and bats-assert give you readable diagnostic output:

load 'test_helper/bats-support/load'
load 'test_helper/bats-assert/load'

@test "with assertion library" {
  run echo "hello"
  assert_success                       # equiv. [ "$status" -eq 0 ]
  assert_output "hello"                # equiv. [ "$output" = "hello" ]
  assert_line --index 0 "hello"        # specific line
  refute_output --partial "world"      # output does NOT contain
}

If the expectation were wrong, say assert_output "hello world" instead of assert_output "hello", the failure would read:

✗ with assertion library
   (in test file test/example.bats, line 6)
     `assert_output "hello world"' failed

     -- output differs --
     expected : hello world
     actual   : hello
     --

Compare the equivalent plain [ failure:

✗ with assertion library
   (in test file test/example.bats, line 6)
     `[ "$output" = "hello world" ]' failed

The diff is much more actionable. Always install bats-assert + bats-support for serious projects.

Set them up as git submodules:

mkdir -p test/test_helper
git submodule add https://github.com/bats-core/bats-support test/test_helper/bats-support
git submodule add https://github.com/bats-core/bats-assert  test/test_helper/bats-assert
git submodule add https://github.com/bats-core/bats-file    test/test_helper/bats-file
git commit -m "test: add bats helper libraries"

bats-file is also useful — assert_file_exists, assert_file_contains, assert_dir_exists, etc.

2.5 setup / teardown — fixtures

setup() {
  # Runs before each test. Common pattern: make a temp dir.
  TEST_DIR=$(mktemp -d)
  export TEST_DIR

  # If your script depends on env vars, set them here:
  export TZ=UTC
  export LC_ALL=C
}

teardown() {
  # Runs after each test, even on failure. Clean up.
  rm -rf "$TEST_DIR"
}

@test "creates a file" {
  touch "$TEST_DIR/test.txt"
  [ -f "$TEST_DIR/test.txt" ]
}

setup_file() and teardown_file() (newer bats) run once for the whole file — useful for expensive setup like building binaries.

2.6 Running tests

# Single file:
bats test/myscript.bats

# Multiple files / a directory:
bats test/

# Force pretty output (the default on a terminal; TAP is the default when output is piped, as in CI):
bats --pretty test/

# Parallel execution (much faster on many tests; requires GNU parallel):
bats --jobs 8 test/

# Filter to tests matching a pattern:
bats --filter 'addition' test/

Output looks like:

test/myscript.bats
 ✓ addition works
 ✓ string contains substring
 ✓ command succeeds
 ✗ broken test
   (in test file test/myscript.bats, line 12)
     `[ "$status" -eq 0 ]' failed

4 tests, 1 failure

bats --jobs N runs N tests in parallel and relies on GNU parallel being installed. Tests must be independent (no shared state; that’s what setup/teardown is for) for parallelism to be safe.

3. Testing pure functions

The easiest case: a function with no side effects, no external commands.

Suppose we have:

# lib/string.sh
trim() {
  local s=$1
  s=${s#"${s%%[![:space:]]*}"}        # Strip leading whitespace
  s=${s%"${s##*[![:space:]]}"}        # Strip trailing whitespace
  printf '%s' "$s"
}

starts_with() {
  local prefix=$1 str=$2
  [[ $str == "$prefix"* ]]
}

Test file:

#!/usr/bin/env bats
# test/lib/string.bats

setup() {
  load '../test_helper/bats-support/load'
  load '../test_helper/bats-assert/load'
  source "$BATS_TEST_DIRNAME/../../lib/string.sh"
}

@test "trim removes leading whitespace" {
  result=$(trim "   hello")
  assert_equal "$result" "hello"
}

@test "trim removes trailing whitespace" {
  result=$(trim "hello   ")
  assert_equal "$result" "hello"
}

@test "trim removes both" {
  result=$(trim "  hello  ")
  assert_equal "$result" "hello"
}

@test "trim handles tabs" {
  result=$(trim $'\t\thello\t\t')
  assert_equal "$result" "hello"
}

@test "trim of empty string is empty" {
  result=$(trim "")
  assert_equal "$result" ""
}

@test "trim of whitespace-only is empty" {
  result=$(trim "    ")
  assert_equal "$result" ""
}

@test "starts_with: matching prefix" {
  starts_with "foo" "foobar"
}

@test "starts_with: non-matching prefix" {
  ! starts_with "bar" "foobar"
}

@test "starts_with: empty prefix matches anything" {
  starts_with "" "foobar"
}

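Note the ! starts_with pattern: bats treats a non-zero exit as a test failure, so ! flips it to success when we expect the function to return non-zero. This is only reliable because the negated call is the last command in the test. Bash’s errexit ignores commands negated with !, so anywhere earlier in a test body use run ! starts_with "bar" "foobar" (bats 1.5.0 or newer), or run followed by [ "$status" -ne 0 ].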
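The interaction between ! and errexit is easy to check outside bats. The sketch below uses three made-up functions whose bodies are subshells with set -e, which roughly mimics how a test body behaves; it is an illustration of plain bash semantics, not of bats internals.

```bash
#!/usr/bin/env bash

plain_failure_mid_body() (
  set -e
  false                  # errexit stops the body here
  echo "unreachable"
)

negated_success_mid_body() (
  set -e
  ! true                 # status is 1, but errexit ignores negated commands
  echo "still running"
)

negated_success_last() (
  set -e
  ! true                 # last command, so its status 1 becomes the body's status
)

echo "plain failure printed: '$(plain_failure_mid_body)'"      # prints ''
echo "negated mid-body printed: '$(negated_success_mid_body)'" # prints 'still running'
if negated_success_last; then
  echo "negated last: passed"
else
  echo "negated last: failed as expected"
fi
```

In other words, a mid-body ! cmd that "fails" is silently skipped, which is why the negation only works as the final line of a test.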

3.1 Sourcing patterns

Two ways to bring code under test into the test file:

Sourcing — for libraries:

source "$BATS_TEST_DIRNAME/../lib/mylib.sh"

$BATS_TEST_DIRNAME is the directory of the current test file — use this rather than relative paths so tests work regardless of where bats is invoked.

Running — for executables:

@test "myscript with --version" {
  run "$BATS_TEST_DIRNAME/../bin/myscript" --version
  assert_success
  assert_output --partial "version"
}

You can also export the executable path in setup_file() for cleaner tests:

setup_file() {
  export MYSCRIPT="$BATS_TEST_DIRNAME/../bin/myscript"
}

@test "version flag" {
  run "$MYSCRIPT" --version
  assert_success
}

4. Mocking external commands

The hardest part of shell testing: your script calls curl, aws, kubectl — how do you test it without hitting the real network?

The answer: prepend a mocks directory to PATH and put fake versions of those commands in it.

4.1 The PATH-override pattern

Suppose myscript calls curl:

#!/usr/bin/env bash
# bin/fetch-config
set -Eeuo pipefail
URL=$1
curl -fsS "$URL" -o config.json

Test:

#!/usr/bin/env bats

setup() {
  TEST_DIR=$(mktemp -d)
  export TEST_DIR                      # The fake curl below reads $TEST_DIR at run time.
  export PATH="$TEST_DIR/bin:$PATH"
  mkdir -p "$TEST_DIR/bin"

  # Fake curl that records its args and writes a fixed response.
  cat > "$TEST_DIR/bin/curl" <<'EOF'
#!/usr/bin/env bash
echo "curl called: $*" >> "$TEST_DIR/curl-calls.log"
# Find the -o argument and write to it.
out=""
while [[ $# -gt 0 ]]; do
  case $1 in
    -o) out=$2; shift 2;;
    *)  shift;;
  esac
done
[[ -n $out ]] && echo '{"key":"value"}' > "$out"
exit 0
EOF
  chmod +x "$TEST_DIR/bin/curl"
}

teardown() {
  rm -rf "$TEST_DIR"
}

@test "fetch-config writes config.json" {
  cd "$TEST_DIR"
  run "$BATS_TEST_DIRNAME/../bin/fetch-config" "https://example.com/config"
  [ "$status" -eq 0 ]
  [ -f config.json ]
  run cat config.json
  [[ "$output" == *'"key":"value"'* ]]
}

@test "fetch-config calls curl with the URL" {
  cd "$TEST_DIR"
  "$BATS_TEST_DIRNAME/../bin/fetch-config" "https://example.com/config"
  run cat "$TEST_DIR/curl-calls.log"
  [[ "$output" == *"https://example.com/config"* ]]
}

How it works:

setup creates $TEST_DIR/bin/curl — a real script that masquerades as curl.
PATH=$TEST_DIR/bin:$PATH puts our fake first, so when the script-under-test runs curl ..., it actually invokes our fake.
The fake records its invocation to a log file (so we can assert what was called) and writes a canned response.

This plays the same role as Python’s unittest.mock.patch, except that in shell the mechanism is just PATH. Simple, no library needed.
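The lookup itself can be verified in a few lines of plain bash, outside any test framework. This is a minimal sketch; the mock_dir variable and the fake’s message are illustrative choices.

```bash
#!/usr/bin/env bash

mock_dir=$(mktemp -d)
trap 'rm -rf "$mock_dir"' EXIT

# A two-line fake that only reports how it was called.
printf '%s\n' '#!/bin/sh' 'echo "fake curl called with: $*"' > "$mock_dir/curl"
chmod +x "$mock_dir/curl"

PATH="$mock_dir:$PATH"
hash -r   # forget any location of curl this shell has already cached

command -v curl                        # now prints the path inside $mock_dir
curl -fsS https://example.com/config   # runs the fake; no network involved
```

The hash -r matters only when the same shell has already run the real command; a freshly started script under test does its own lookup through the modified PATH.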

4.2 A reusable mock builder

Writing the mock script inline gets repetitive. A small helper:

# test/test_helper/mock.sh

# mock_command NAME [OUTPUT] [EXIT_CODE]
# Creates an executable in $TEST_DIR/bin that prints OUTPUT and exits with EXIT_CODE.
# Records every invocation to $TEST_DIR/<name>-calls.log.
mock_command() {
  local name=$1
  local out=${2:-}
  local code=${3:-0}
  # The heredoc is unquoted, so $TEST_DIR, $out and $code are baked in now.
  # printf %q keeps OUTPUT intact even if it contains quotes or spaces.
  cat > "$TEST_DIR/bin/$name" <<EOF
#!/usr/bin/env bash
echo "\$*" >> $(printf '%q' "$TEST_DIR/$name-calls.log")
out=$(printf '%q' "$out")
[[ -n \$out ]] && printf '%s\n' "\$out"
exit $code
EOF
  chmod +x "$TEST_DIR/bin/$name"
}

# assert_called COMMAND ARGS_PATTERN
# Asserts the mock COMMAND was invoked with arguments matching the regex.
assert_called() {
  local name=$1 pattern=$2
  local log="$TEST_DIR/$name-calls.log"
  [ -f "$log" ] || return 1
  grep -qE "$pattern" "$log"
}

# assert_call_count COMMAND N
assert_call_count() {
  local name=$1 expected=$2
  local log="$TEST_DIR/$name-calls.log"
  local actual=0
  [ -f "$log" ] && actual=$(wc -l < "$log")
  [ "$actual" -eq "$expected" ]
}

Use it like this:

setup() {
  TEST_DIR=$(mktemp -d)
  mkdir -p "$TEST_DIR/bin"
  export PATH="$TEST_DIR/bin:$PATH"
  source "$BATS_TEST_DIRNAME/test_helper/mock.sh"
}

@test "fetch-config calls curl exactly once" {
  mock_command curl '{"k":"v"}' 0
  cd "$TEST_DIR"
  run "$BATS_TEST_DIRNAME/../bin/fetch-config" "https://example.com/config"
  assert_call_count curl 1
  assert_called curl 'https://example.com/config'
}

Now your test reads cleanly. Mock building is one line per dependency.

4.3 Mocking commands that need different responses on different calls

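# Variant: fail on the first call, succeed on the second.
# Call this from setup() or from the test itself, after TEST_DIR has been created
# and exported (the quoted heredoc defers $TEST_DIR to the mock's run time).
make_flaky_curl() {
  cat > "$TEST_DIR/bin/curl" <<'EOF'
#!/usr/bin/env bash
COUNT_FILE="$TEST_DIR/curl-count"
count=0
[ -f "$COUNT_FILE" ] && count=$(cat "$COUNT_FILE")
count=$((count + 1))
echo "$count" > "$COUNT_FILE"

if [ "$count" -lt 2 ]; then
  echo "transient error" >&2
  exit 22
fi
exit 0
EOF
  chmod +x "$TEST_DIR/bin/curl"
}

@test "fetch-config retries on transient failure" {
  make_flaky_curl
  cd "$TEST_DIR"
  run "$BATS_TEST_DIRNAME/../bin/fetch-config" "https://example.com/config"
  [ "$status" -eq 0 ]
  # Verify it tried twice:
  count=$(cat "$TEST_DIR/curl-count")
  [ "$count" -eq 2 ]
}

This is how you test the retry-with-backoff logic from L17. Note that the fetch-config shown in 4.1 has no retry loop, so this test assumes a version of the script that retries.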

4.4 What can’t be mocked easily

bash builtins (echo, printf, read, cd) — these don’t go through PATH. You can override them with bash functions, but it’s tricky.
Functions defined in the same script — these aren’t subprocesses; they’re just code paths. Use code-level dependency injection instead (pass the function name as a parameter).
Shell features (<(), |, >>) — tested via the surrounding integration test, not mocking.

For most DevOps scripts, mocking the ~5 external CLIs (curl, aws, kubectl, jq, psql, etc.) is enough.
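For the second case above (functions defined in the same script), dependency injection by function name might look like the following sketch. The names retry_fetch and fails_once are invented for this example, and the retry loop deliberately omits backoff to stay short.

```bash
#!/usr/bin/env bash

# retry_fetch FETCHER URL [ATTEMPTS]
# FETCHER is the *name* of a command or function that takes a URL.
# Production code would pass a function wrapping curl; tests pass a double.
retry_fetch() {
  local fetcher=$1 url=$2 attempts=${3:-3} i
  for (( i = 1; i <= attempts; i++ )); do
    if "$fetcher" "$url"; then
      return 0
    fi
  done
  return 1
}

# Test double: fails on the first call, succeeds from the second call on.
# It runs in the current shell, so the counter survives between calls.
CALLS=0
fails_once() {
  CALLS=$(( CALLS + 1 ))
  [ "$CALLS" -ge 2 ]
}

retry_fetch fails_once "https://example.com/config"
echo "fetcher was called $CALLS times"   # prints: fetcher was called 2 times
```

Because the double is an ordinary function, no PATH manipulation or temp files are needed, and the call count is just a variable.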

5. Fixtures and golden files

5.1 The golden-file pattern

For commands that produce non-trivial output (a JSON, a config file), compare against a checked-in expected output:

test/
  fixtures/
    input/
      sample.csv
    expected/
      summary.json
  reports.bats

@test "report generation matches golden output" {
  cp "$BATS_TEST_DIRNAME/fixtures/input/sample.csv" "$TEST_DIR/"
  cd "$TEST_DIR"
  run "$BATS_TEST_DIRNAME/../bin/generate-report" sample.csv

  [ "$status" -eq 0 ]
  diff -u "$BATS_TEST_DIRNAME/fixtures/expected/summary.json" summary.json
}

diff -u produces a unified diff that bats prints on failure. You’ll see exactly what differed.

To regenerate golden files when the expected output legitimately changes:

make update-golden
# or:
UPDATE_GOLDEN=1 bats test/

@test "report generation matches golden output" {
  cp "$BATS_TEST_DIRNAME/fixtures/input/sample.csv" "$TEST_DIR/"
  cd "$TEST_DIR"
  run "$BATS_TEST_DIRNAME/../bin/generate-report" sample.csv
  [ "$status" -eq 0 ]

  if [ "${UPDATE_GOLDEN:-0}" = "1" ]; then
    cp summary.json "$BATS_TEST_DIRNAME/fixtures/expected/summary.json"
  fi
  diff -u "$BATS_TEST_DIRNAME/fixtures/expected/summary.json" summary.json
}

The UPDATE_GOLDEN=1 pattern mirrors the -update flag that many Go test suites define and the --snapshot-update option of pytest snapshot plugins such as syrupy. Useful for test-driven changes to output formats.
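If several tests compare against golden files, the compare-or-update logic can be pulled into one helper. The following is a sketch; assert_matches_golden is a hypothetical name, not part of bats or its helper libraries.

```bash
#!/usr/bin/env bash

# assert_matches_golden ACTUAL GOLDEN
# With UPDATE_GOLDEN=1, overwrite GOLDEN with ACTUAL first; then compare.
# Returns diff's exit status, so a mismatch fails the calling test.
assert_matches_golden() {
  local actual=$1 golden=$2
  if [ "${UPDATE_GOLDEN:-0}" = "1" ]; then
    mkdir -p "$(dirname "$golden")"
    cp "$actual" "$golden"
  fi
  diff -u "$golden" "$actual"
}
```

Inside a test this would be called as assert_matches_golden summary.json "$BATS_TEST_DIRNAME/fixtures/expected/summary.json", and the diff output shows up in the failure report.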

5.2 Strategies for non-deterministic output

If your output contains timestamps, UUIDs, or random data, strip them before comparing:

@test "report output (timestamps stripped)" {
  run "$BATS_TEST_DIRNAME/../bin/generate-report" sample.csv

  # Filter out the timestamp line before comparing.
  filtered=$(echo "$output" | grep -v 'generated_at')
  expected=$(grep -v 'generated_at' "$BATS_TEST_DIRNAME/fixtures/expected/summary.txt")
  [ "$filtered" = "$expected" ]
}

Or post-process the output to a normalised form:

normalise() {
  sed -E 's/[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9:Z]+/<TIMESTAMP>/g'
}
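As a usage sketch, the normaliser can be extended with a second expression and applied to both sides of the comparison. The UUID pattern below is an addition for illustration and assumes lowercase hexadecimal UUIDs; the sample input line is made up.

```bash
#!/usr/bin/env bash

# Replace ISO-8601 timestamps and lowercase UUIDs with fixed placeholders.
normalise() {
  sed -E \
    -e 's/[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9:Z]+/<TIMESTAMP>/g' \
    -e 's/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/<UUID>/g'
}

printf 'at 2024-05-01T12:30:00Z id 123e4567-e89b-12d3-a456-426614174000\n' | normalise
# prints: at <TIMESTAMP> id <UUID>
```

In a test, pipe both the actual output and the golden file through normalise before comparing, so the golden file can keep realistic values.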

6. shunit2 — for POSIX scripts

If your script runs under /bin/sh (not bash), bats won’t work — it requires bash for its own runtime. Use shunit2.

6.1 Installation and basic test

# macOS:
brew install shunit2

# Manually (vendored in your repo):
curl -L https://raw.githubusercontent.com/kward/shunit2/master/shunit2 \
  -o test/shunit2
chmod +x test/shunit2

Test file:

#!/bin/sh
# test/string_test.sh

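# lib/string.sh must itself be POSIX sh here: the bash version from section 3
# uses [[ ]], which dash rejects, and local, which POSIX does not define.
. /path/to/lib/string.sh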

testTrim() {
  result=$(trim "   hello   ")
  assertEquals "hello" "$result"
}

testStartsWith() {
  if starts_with "foo" "foobar"; then
    :
  else
    fail "expected 'foo' to be a prefix of 'foobar'"
  fi
}

testStartsWithNegative() {
  if starts_with "bar" "foobar"; then
    fail "should not match"
  fi
}

# Load shunit2 — must be the LAST line.
. ./test/shunit2

Run:

$ ./test/string_test.sh
testTrim
testStartsWith
testStartsWithNegative

Ran 3 tests.

OK
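The test file above sources the library under /bin/sh, which only works if the library avoids bashisms. A POSIX-only sketch of the two functions follows, assuming you are free to rewrite them; it uses case instead of [[ ]] and a subshell body instead of local.

```bash
# POSIX sh versions of the section 3 helpers (these also run unchanged in bash).

# starts_with PREFIX STRING: succeeds if STRING begins with PREFIX.
# Quoting "$1" makes the prefix literal; only the trailing * is a pattern.
starts_with() {
  case $2 in
    "$1"*) return 0 ;;
    *)     return 1 ;;
  esac
}

# trim STRING: print STRING without leading or trailing whitespace.
# The ( ) body runs in a subshell, so `s` cannot leak into the caller.
trim() (
  s=$1
  s=${s#"${s%%[![:space:]]*}"}   # Strip leading whitespace
  s=${s%"${s##*[![:space:]]}"}   # Strip trailing whitespace
  printf '%s' "$s"
)
```

The subshell costs a fork per call, which is usually irrelevant for a helper like this.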

6.2 shunit2 vs bats — the trade-off

Feature              bats-core                shunit2
Shell required       bash 3.2+                `/bin/sh` (POSIX)
Syntax               `@test "name" {...}`     `testFoo() {...}`
Per-test isolation   Automatic (subshell)     Manual via `setUp`/`tearDown`
Parallel runs        `--jobs N` built-in      No
Output               TAP, pretty, junit       shunit2’s own format
Helpers (assert_x)   bats-assert              Built-in (assertEquals, etc.)
Adoption             Most modern projects     Older / POSIX projects

Pick bats-core unless you specifically need POSIX/dash compatibility. The vast majority of shell scripts are bash scripts; bats’s better tooling and parallelism win.

7. Coverage with `kcov`

kcov is a code-coverage tool that works for shell scripts. It traces execution and produces a line-by-line coverage report.

7.1 Installation

# Linux:
sudo apt install kcov                  # Debian/Ubuntu
sudo dnf install kcov                  # Fedora

# macOS: kcov is developed mainly on Linux. A Homebrew formula exists, but running coverage in (Linux) CI is the safer default.

# Verify:
kcov --version

7.2 Running tests under kcov

# Run bats with kcov instrumenting it:
kcov --include-path=lib,bin coverage/ bats test/

# Open the HTML report:
xdg-open coverage/index.html       # Linux
open coverage/index.html           # macOS

The report shows which lines of your lib/*.sh and bin/* were executed by tests, and which weren’t. A line that’s never hit is a candidate for either deletion or a new test.

7.3 Coverage in CI (with Codecov)

# .github/workflows/test.yml
- name: Run tests with coverage
  run: kcov --include-path=lib,bin coverage/ bats test/

- name: Upload to Codecov
  uses: codecov/codecov-action@v3
  with:
    files: ./coverage/*/cov.xml

Codecov supports the kcov output format. Now every PR shows coverage delta.

8. CI integration

8.1 GitHub Actions

# .github/workflows/test.yml
name: tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          submodules: recursive          # Pull in bats-support / bats-assert.

      - name: Install bats
        run: |
          git clone --depth=1 --branch=v1.10.0 https://github.com/bats-core/bats-core.git /tmp/bats
          sudo /tmp/bats/install.sh /usr/local

      - name: Install shellcheck
        run: sudo apt-get update && sudo apt-get install -y shellcheck

      - name: Run shellcheck
        run: shellcheck bin/* lib/*.sh

      - name: Run bats tests
        run: bats --jobs 4 --pretty test/

      - name: Run kcov for coverage
        run: |
          sudo apt-get update && sudo apt-get install -y kcov
          kcov --include-path=lib,bin coverage/ bats test/

      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: ./coverage/*/cov.xml

Three layers of testing: shellcheck (static analysis from L13), bats (unit tests), kcov (coverage). Run all three on every push.

8.2 GitLab CI

# .gitlab-ci.yml
stages:
  - test

variables:
  GIT_SUBMODULE_STRATEGY: recursive    # Pull in bats-support / bats-assert.

shellcheck:
  stage: test
  image: koalaman/shellcheck-alpine
  script:
    - shellcheck bin/* lib/*.sh

bats:
  stage: test
  image:
    name: bats/bats:1.10.0
    entrypoint: [""]                   # The image's entrypoint is bats itself; GitLab needs a shell.
  script:
    - bats --pretty test/

coverage:
  stage: test
  image: ubuntu:24.04
  script:
    - apt-get update && apt-get install -y bats kcov
    - kcov --include-path=lib,bin coverage/ bats test/
  artifacts:
    paths:
      - coverage/
    reports:
      coverage_report:
        coverage_format: cobertura
        path: coverage/*/cobertura.xml

8.3 Run tests across multiple shell/OS combinations

A matrix build catches portability bugs:

jobs:
  test:
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-22.04, ubuntu-24.04, macos-13, macos-14]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
        with:
          submodules: recursive          # Tests load bats-support / bats-assert.
      - run: brew install bats-core || (sudo apt-get update && sudo apt-get install -y bats)
      - run: bats test/

For BSD-vs-GNU bugs (the kind L19 was full of), this is invaluable — run the same tests on macOS to catch BSD-specific failures.

9. A complete tested mini-project

Let’s tie it all together. A small CLI that fetches JSON from a URL and extracts a field:

my-tool/
├── bin/
│   └── my-tool
├── lib/
│   ├── http.sh
│   └── json.sh
├── test/
│   ├── lib/
│   │   ├── http.bats
│   │   └── json.bats
│   ├── integration/
│   │   └── my-tool.bats
│   ├── fixtures/
│   │   └── sample.json
│   └── test_helper/
│       ├── bats-support/  (submodule)
│       ├── bats-assert/   (submodule)
│       └── mock.sh
├── .github/workflows/test.yml
└── Makefile

9.1 The library

# lib/json.sh
extract_field() {
  local json=$1 field=$2
  printf '%s' "$json" | jq -r ".$field"
}

# lib/http.sh
fetch_url() {
  local url=$1
  curl -fsS --max-time 30 "$url"
}

9.2 The CLI

#!/usr/bin/env bash
# bin/my-tool
set -Eeuo pipefail

SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)
source "$SCRIPT_DIR/../lib/http.sh"
source "$SCRIPT_DIR/../lib/json.sh"

main() {
  local url=$1 field=$2
  local body
  body=$(fetch_url "$url") || { echo "Failed to fetch $url" >&2; exit 1; }
  extract_field "$body" "$field"
}

main "$@"

9.3 Unit tests for `lib/json.sh`

#!/usr/bin/env bats
# test/lib/json.bats

setup() {
  load '../test_helper/bats-support/load'
  load '../test_helper/bats-assert/load'
  source "$BATS_TEST_DIRNAME/../../lib/json.sh"
}

@test "extract_field: top-level field" {
  result=$(extract_field '{"name":"alice"}' 'name')
  assert_equal "$result" "alice"
}

@test "extract_field: nested field" {
  result=$(extract_field '{"user":{"name":"bob"}}' 'user.name')
  assert_equal "$result" "bob"
}

@test "extract_field: missing field returns 'null'" {
  result=$(extract_field '{"name":"alice"}' 'age')
  assert_equal "$result" "null"
}

@test "extract_field: invalid JSON exits non-zero" {
  run extract_field 'not json' 'name'
  assert_failure
}

9.4 Integration test for the full CLI (with mocked curl)

#!/usr/bin/env bats
# test/integration/my-tool.bats

setup() {
  load '../test_helper/bats-support/load'
  load '../test_helper/bats-assert/load'
  source "$BATS_TEST_DIRNAME/../test_helper/mock.sh"

  TEST_DIR=$(mktemp -d)
  mkdir -p "$TEST_DIR/bin"
  export PATH="$TEST_DIR/bin:$PATH"
  export TEST_DIR
}

teardown() {
  rm -rf "$TEST_DIR"
}

@test "my-tool extracts a field from the URL response" {
  mock_command curl '{"name":"alice","age":30}' 0
  run "$BATS_TEST_DIRNAME/../../bin/my-tool" 'http://example/api/user/1' 'name'
  assert_success
  assert_output "alice"
}

@test "my-tool exits 1 when curl fails" {
  mock_command curl '' 22
  run "$BATS_TEST_DIRNAME/../../bin/my-tool" 'http://example/missing' 'name'
  assert_failure
  assert_output --partial "Failed to fetch"
}

@test "my-tool extracts nested fields" {
  mock_command curl '{"user":{"name":"bob"}}' 0
  run "$BATS_TEST_DIRNAME/../../bin/my-tool" 'http://example/api' 'user.name'
  assert_success
  assert_output "bob"
}

@test "my-tool calls curl with the right URL" {
  mock_command curl '{"name":"x"}' 0
  "$BATS_TEST_DIRNAME/../../bin/my-tool" 'http://example/abc' 'name'
  assert_called curl 'http://example/abc'
}

9.5 The Makefile

.PHONY: test test-unit test-integration shellcheck coverage clean

test: shellcheck test-unit test-integration

shellcheck:
	shellcheck bin/* lib/*.sh

test-unit:
	bats --pretty test/lib/

test-integration:
	bats --pretty test/integration/

coverage:
	kcov --include-path=lib,bin coverage/ bats test/

clean:
	rm -rf coverage/

Now make test runs everything in a developer’s local checkout, and CI can call the same target so both run identical checks. One command covers linting, unit tests and integration tests.

10. Testing patterns by problem type

10.1 Testing scripts that read stdin

@test "process-csv handles standard input" {
  run bash -c 'echo "1,2,3" | "$1"' _ "$BATS_TEST_DIRNAME/../bin/process-csv"
  [ "$status" -eq 0 ]
  [[ "$output" == *"3 fields"* ]]
}

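The bash -c wrapper is needed because run executes a single command, not a pipeline: in echo "1,2,3" | run cmd, run ends up in a subshell and $status and $output never reach the test. Or use a heredoc: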

@test "process-csv handles heredoc input" {
  run "$BATS_TEST_DIRNAME/../bin/process-csv" <<EOF
1,2,3
4,5,6
EOF
  [ "$status" -eq 0 ]
}

10.2 Testing scripts that prompt the user

For scripts with interactive prompts, provide input via the <<< here-string:

@test "confirm-action accepts 'yes'" {
  run "$BATS_TEST_DIRNAME/../bin/confirm-action" <<< "yes"
  [ "$status" -eq 0 ]
}

@test "confirm-action rejects 'no'" {
  run "$BATS_TEST_DIRNAME/../bin/confirm-action" <<< "no"
  [ "$status" -ne 0 ]
}

10.3 Testing async / background work

For scripts that fork background processes, you have to wait for them. The pattern:

@test "async-job completes and writes result file" {
  run timeout 10 "$BATS_TEST_DIRNAME/../bin/async-job"
  [ "$status" -eq 0 ]
  [ -f "$TEST_DIR/result" ]
}

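timeout ensures the test doesn’t hang forever. It is part of GNU coreutils, so stock macOS lacks it (Homebrew’s coreutils installs it as gtimeout). Use BATS_TEST_TIMEOUT (newer bats) for per-test limits.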

10.4 Testing functions that read environment variables

@test "function uses CONFIG_FILE env var" {
  CONFIG_FILE=/tmp/custom-config run my_function
  assert_output --partial "/tmp/custom-config"
}

KEY=VALUE run cmd sets the env var only for that one invocation — perfect for tests.
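The scoping rule is plain bash behaviour and can be checked without bats. A small sketch, where show_config is a made-up function standing in for the code under test:

```bash
#!/usr/bin/env bash

show_config() {
  echo "config=${CONFIG_FILE:-unset}"
}

unset CONFIG_FILE
CONFIG_FILE=/tmp/custom-config show_config   # prints config=/tmp/custom-config
show_config                                  # prints config=unset
```

The assignment prefix applies to functions as well as external commands in bash’s default mode, which is why it also works in front of run.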

11. Common pitfalls and how to avoid them

11.1 Pitfall: Tests that pass locally, fail in CI

Almost always one of:

Locale: CI uses C.UTF-8, you use en_US.UTF-8. Set export LC_ALL=C in setup.
Time zone: CI is UTC, you’re in IST. Set export TZ=UTC in setup.
PATH: CI doesn’t have the binary you have. Vendor it or install it explicitly.
Dependencies: jq/yq/curl version differences. Pin versions in CI.

The fix: make setup exhaustive — set every environment variable your script reads, in setup.

11.2 Pitfall: Tests that occasionally fail (flaky)

If a test passes 9/10 times, it’s broken. Common causes:

Race conditions in parallel tests sharing a resource. Use mktemp -d per test.
Timing — sleeping for “long enough” rather than waiting for a condition. Replace sleep N with a polling loop.
External services — never call real APIs in unit tests. Mock everything.
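For the timing case, a polling helper might look like the sketch below. wait_until is a hypothetical name, and the 0.1-second interval is an arbitrary choice; fractional sleep works with GNU and BSD sleep but is not guaranteed by POSIX.

```bash
#!/usr/bin/env bash

# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds; gives up once TIMEOUT_SECONDS have passed.
wait_until() {
  local timeout=$1
  shift
  local deadline=$(( SECONDS + timeout ))
  until "$@"; do
    if (( SECONDS >= deadline )); then
      return 1
    fi
    sleep 0.1
  done
  return 0
}

# Demo: a background job creates a file after a short delay.
work_dir=$(mktemp -d)
( sleep 0.3; touch "$work_dir/result" ) &
wait_until 5 test -f "$work_dir/result" && echo "result file appeared"
rm -rf "$work_dir"
```

The test then finishes as soon as the condition holds, instead of always paying for the worst-case sleep.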

11.3 Pitfall: Tests that test the implementation, not the behaviour

# BAD — testing implementation:
@test "myfn uses 'awk' to parse" {
  source lib/myfn.sh
  type myfn | grep -q awk    # This breaks if you switch to sed.
}

# GOOD — testing behaviour:
@test "myfn returns the right answer" {
  source lib/myfn.sh
  result=$(myfn input)
  [ "$result" = "expected" ]
}

If you can change the implementation without breaking the test, the test is good. If a refactor breaks tests without changing behaviour, the tests are coupled too tightly.

11.4 Pitfall: Running tests as root

Tests should run as a regular user. If your script needs root, mock the privileged commands (apt-get, systemctl, etc.) and verify they were called correctly. Don’t actually run them.

mock_command sudo '' 0     # sudo becomes a no-op; the wrapped command never runs, so assert on sudo's call log
mock_command systemctl '' 0

12. Quick reference card

bats-core essentials

@test "name" {
  run command args
  [ "$status" -eq 0 ]
  [ "$output" = "expected" ]
}

setup()     { TEST_DIR=$(mktemp -d); }
teardown()  { rm -rf "$TEST_DIR"; }
load 'test_helper/bats-support/load'
load 'test_helper/bats-assert/load'
assert_success / assert_failure
assert_equal "$a" "$b"
assert_output "exact" / assert_output --partial "substring"

Mocking in 3 lines

mkdir -p "$TEST_DIR/bin"
export PATH="$TEST_DIR/bin:$PATH"
printf '#!/bin/bash\necho mocked\n' > "$TEST_DIR/bin/curl" && chmod +x "$TEST_DIR/bin/curl"

Running tests

bats test/                  # All tests
bats --pretty test/         # Friendlier output
bats --jobs 8 test/         # Parallel
bats --filter 'pattern' test/  # Subset

Project layout

lib/        # functions to test
bin/        # scripts to test
test/
  lib/      # unit tests for functions (one .bats per .sh)
  integration/  # end-to-end tests of bin/* scripts
  fixtures/     # test data
  test_helper/  # bats-support, bats-assert, mock.sh

CI in 4 lines

- run: git clone --depth=1 --branch=v1.10.0 https://github.com/bats-core/bats-core.git /tmp/bats
- run: sudo /tmp/bats/install.sh /usr/local
- run: shellcheck bin/* lib/*.sh
- run: bats --jobs 4 test/

13. Wrap-up

Shell scripts deserve the same testing discipline as any other code. The tools are there: bats-core is genuinely pleasant to use, mocking via PATH is mechanical, and CI integration is a handful of lines.

The recipe:

Pure functions in lib/*.sh — easy to test, just source and call.
CLI scripts in bin/* — test via run with mocked external commands.
Mocks via PATH override — one helper function, used everywhere.
Fixtures via mktemp -d — fresh, isolated, auto-cleaned.
CI runs shellcheck + bats + kcov — every push, every PR.

Once you have this in place, refactoring shell becomes safe. Adding features becomes test-first. Production regressions become far rarer. The investment is small (an afternoon to set up; minutes per test thereafter); the payoff is enormous.

Next: L22 — packaging shell scripts: shebangs, PATH discipline, distro-portable scripts, make install, deb/rpm packaging, and how to ship a shell tool that installs cleanly on any modern Unix.