This is the first lesson of Wave 2. Wave 1 (L1–L12) gave you the mental model and the toolkit. From here onwards, the focus shifts: we’re not learning what shell can do — we’re learning how to write shell you’d be willing to ship to production and trust to run unattended at 03:00.
The first and biggest discipline is defensive scripting: writing scripts that, when something goes wrong, fail loudly and immediately rather than silently corrupting state and continuing. Every senior engineer has a horror story about a shell script that “ran fine” but actually skipped half its work because of an unquoted variable or a missing pipefail.
By the end of this lesson you will:
- Know every flag in
set -Eeuo pipefail, what it does, and (critically) where it doesn’t fire. - Apply IFS hardening as a routine reflex, not a special case.
- Run ShellCheck on every script and read its findings fluently — including knowing which rules to disable and how to suppress them safely.
- Write a reusable
lib/errors.shwithdie,warn,assert_*helpers that you can drop into any project. - Use the ERR trap to print line, function, and command context when something fails.
- Propagate errors across function boundaries (a non-trivial bash quirk most engineers get wrong).
This is the most important lesson in Tier 3. Internalise it now and the next nine lessons will land with much less friction.
1. The strict-mode preamble in full
Every production shell script you write should start with the same five lines (after the shebang):
#!/usr/bin/env bash
set -Eeuo pipefail
IFS=$'\n\t'
That’s the canonical “strict mode” preamble. Let’s go through it flag by flag. The order in -Eeuo is conventional but irrelevant; bash treats them independently.
-e (errexit) — exit on any unhandled error
set -e
Causes the shell to exit immediately if any simple command exits with non-zero status. Without -e, scripts blunder through errors:
#!/usr/bin/env bash
# Without -e
cp /nonexistent /tmp/somewhere # FAILS — but we keep going
echo "Done!" # prints "Done!" — but the cp failed
With -e, the script exits as soon as cp fails. This is what you want 99% of the time.
The gotcha: -e does NOT trigger in many cases that surprise people:
- Inside
if,while,untilconditions: that’s the whole point of those constructs — they test exit codes.if grep -q PATTERN file; then ... # grep returning 1 (no match) doesn't exit - After
&&or||in a chain (until the rightmost):cmd1 && cmd2 # cmd1 failing doesn't exit; the whole chain exits 1 - Negated commands (
! cmd):! cmd # never exits; the `!` *expects* failure - Inside command substitution (
$(…)) — by default, the failure of a command in$(...)does not propagate. (This was tightened in bash 4.4+ withinherit_errexit.)X=$(cmd1; cmd2) # cmd1 failing doesn't exit if cmd2 succeeds shopt -s inherit_errexit # bash 4.4+ — propagates errexit into $(...) - The last command in a pipeline is what
$?becomes; intermediate command failures are silently swallowed unlesspipefailis set. - Functions return-statuses for
set -epurposes: if a function fails insideif !,set -edoesn’t fire there either — same as any other command.
So -e is a coarse safety net, not a fine-grained correctness check. It catches “I forgot to handle the failure of cp” but won’t help with subtler bugs.
-E (errtrace) — extend ERR trap to functions and subshells
set -E
Without -E, an ERR trap defined at the top level does not fire inside functions, subshells, or command substitutions. With -E, traps inherit. This is essential if you want the ERR trap pattern (covered below) to be useful.
If you don’t use ERR traps, -E is harmless. But you should use ERR traps. So set -E always.
-u (nounset) — fail on unset variables
set -u
Causes the shell to exit when you reference an unset variable:
set -u
echo "$UNDEFINED_VAR" # error: UNDEFINED_VAR: unbound variable
This catches typos:
NAME="alice"
echo "$NAEM" # without -u, prints empty string. with -u, fails immediately.
The gotcha: things you might think are “unset” actually aren’t. The empty string "" is a set variable with empty value:
NAME=""
echo "$NAME" # works fine — empty, but set
If you genuinely want “default if unset,” use parameter expansion:
echo "${NAME:-default}" # default if unset OR empty
echo "${NAME-default}" # default if unset only (empty stays empty)
For arrays, "${ARR[@]}" on an empty array sometimes triggers -u in older bash versions. The safe pattern:
ARR=()
for item in "${ARR[@]+"${ARR[@]}"}"; do …; done # works on bash 3.2 with -u
For modern bash (4.4+), "${ARR[@]}" on an empty array is fine.
-o pipefail — pipeline returns first non-zero status
We covered this in L8. Without pipefail, the exit status of a | b | c is just c’s status. With pipefail, the whole pipeline returns the rightmost non-zero status. So:
set -o pipefail
cat /nonexistent | grep PATTERN # cat fails with 1, grep "succeeds" with 1 (no match)
# without pipefail: $? = 1 (looks like normal grep no-match)
# with pipefail: $? = 1 (because cat failed)
Combined with -e, this means the script also exits.
IFS=$'\n\t' — restrict word splitting to newline + tab
We covered this in L2. The default IFS includes space, which means for f in $UNQUOTED_VAR splits on spaces — disastrous if $UNQUOTED_VAR contains filenames with spaces. Setting IFS to just newline and tab means word-splitting only happens on those, so spaces in filenames are preserved.
IFS=$'\n\t'
NAMES="alice bob carol"
for n in $NAMES; do echo "[$n]"; done # one element: [alice bob carol] — no space-split
This forces you to always quote when iterating, which is the right discipline. for n in "${ARR[@]}" (quoted, array) is the canonical correct way; the loose for n in $VAR becomes harder to misuse.
Checking strict mode is active
# These lines should appear right after the shebang:
set -Eeuo pipefail
IFS=$'\n\t'
# To verify in a running script:
echo "errexit: $-"
# 'h', 'B', 'e', 'u', 'o' (and others) appear in $-
The $- variable shows the active flags as letters.
2. The ERR trap — line-precise diagnostics
The minimum-viable error handler is just set -e: exit on failure. But that gives you no information about what failed. The ERR trap is the next level:
#!/usr/bin/env bash
set -Eeuo pipefail
IFS=$'\n\t'
trap 'on_error $LINENO' ERR
on_error() {
local line=$1
echo "[ERROR] $0: line $line: command failed (exit $?)" >&2
# cleanup if needed
exit 1
}
# Now any uncaught failure prints which line failed
cp /nonexistent /tmp/x # ERR trap fires before exit
Use single quotes for the trap argument so $LINENO expands at trap-time, not at registration-time. (Same as L10’s signal trap rule.)
For richer diagnostics, capture more context:
trap 'on_error $? $LINENO "$BASH_COMMAND"' ERR
on_error() {
local exit_code=$1 line=$2 cmd=$3
printf '[ERROR] %s:%d: command "%s" exited with %d\n' "$0" "$line" "$cmd" "$exit_code" >&2
exit "$exit_code"
}
$BASH_COMMAND is the command that just failed. Combined with $LINENO, this gives you a full diagnostic line:
[ERROR] ./deploy.sh:42: command "kubectl apply -f manifest.yaml" exited with 1
This is the level of diagnostics you want in production.
Why -E matters
Without -E, ERR traps registered at the top level do NOT fire inside functions:
#!/usr/bin/env bash
set -eo pipefail # NOTE: no -E
trap 'echo "TRAPPED at $LINENO"' ERR
myfn() {
cp /nonexistent /tmp/x # this fails but ERR trap does NOT fire
}
myfn # script exits silently
With -E:
#!/usr/bin/env bash
set -Eeo pipefail
trap 'echo "TRAPPED at $LINENO"' ERR
myfn() {
cp /nonexistent /tmp/x
}
myfn # ERR trap fires, prints line number
This is why -E is part of the canonical preamble.
Multi-line stack traces
For really verbose error reporting, walk the bash call stack:
on_error() {
local exit_code=$? line=${BASH_LINENO[0]}
printf '\n[FATAL] script %s exited with %d at line %d\n' "$0" "$exit_code" "$line" >&2
printf 'Call stack:\n' >&2
local i=0
while caller $i >/dev/null 2>&1; do
printf ' %s\n' "$(caller $i)" >&2
((i++))
done
exit "$exit_code"
}
trap on_error ERR
caller N prints LINE FUNCTION FILE for the Nth frame in the call stack. Combined with set -E, this gives full traces:
[FATAL] script ./deploy.sh exited with 1 at line 42
Call stack:
42 do_deploy ./deploy.sh
78 main ./deploy.sh
103 main ./deploy.sh
Use this in any script complex enough to have nested functions.
3. The die / warn / info family
Every production script should have a small set of helpers for consistent error reporting and structured output. The canonical names are die (fatal), warn (non-fatal), info (informational). Here’s the minimum set:
#!/usr/bin/env bash
set -Eeuo pipefail
IFS=$'\n\t'
readonly SCRIPT_NAME="${0##*/}"
die() { printf '[%s] FATAL: %s\n' "$SCRIPT_NAME" "$*" >&2; exit 1; }
warn() { printf '[%s] WARN: %s\n' "$SCRIPT_NAME" "$*" >&2; }
info() { printf '[%s] INFO: %s\n' "$SCRIPT_NAME" "$*" >&2; }
debug() { [[ "${DEBUG:-0}" == 1 ]] && printf '[%s] DEBUG: %s\n' "$SCRIPT_NAME" "$*" >&2 || true; }
# Usage:
[[ -f "$CONFIG" ]] || die "config file $CONFIG not found"
warn "skipping $f — not a regular file"
info "starting deploy of $TAG"
DEBUG=1 ./script # turn on debug
Notes:
- All output goes to stderr (
>&2). Stdout is reserved for the script’s actual data output. This is how Unix tools have always worked, and it lets the script be used in pipelines:output_data=$(./myscript)works even if it logs heavily. diemustexit— it never returns. Convention: exit code 1 for general errors, 2 for “usage” errors, anything else as needed.${0##*/}strips the path so we just print the script’s basename.debugis a no-op unlessDEBUG=1is set in the environment.
Argument-validation idioms with die
[[ $# -ge 2 ]] || die "usage: $0 <env> <tag>"
[[ -d "$WORK_DIR" ]] || die "WORK_DIR ($WORK_DIR) is not a directory"
command -v kubectl >/dev/null 2>&1 || die "kubectl not found in PATH"
[[ "$ENV" =~ ^(dev|staging|prod)$ ]] || die "invalid env: $ENV"
Read those left-to-right: “this condition must be true OR die.” It’s the most idiomatic shell error-checking pattern.
assert_* helpers
For more elaborate validation, build up an assert family:
assert_command() { command -v "$1" >/dev/null 2>&1 || die "command not found: $1"; }
assert_file() { [[ -f "$1" ]] || die "file not found: $1"; }
assert_dir() { [[ -d "$1" ]] || die "directory not found: $1"; }
assert_var() { [[ -n "${!1:-}" ]] || die "required variable is unset or empty: $1"; }
assert_readable() { [[ -r "$1" ]] || die "file not readable: $1"; }
# Usage:
assert_command kubectl
assert_command jq
assert_var KUBE_NAMESPACE # checks that $KUBE_NAMESPACE is set & non-empty
assert_dir "$DEPLOY_DIR"
${!1:-} is indirect variable expansion — the value of the variable whose name is $1. Combined with :- (default empty if unset), it lets us check arbitrary variable names.
This sort of helpers turn 30-line “validate inputs” blocks into a tight 5-line preamble.
4. ShellCheck — your script linter
ShellCheck is to shell what TypeScript is to JavaScript: it catches an enormous fraction of real bugs before runtime. Run it on every script. Always.
Install
# macOS
brew install shellcheck
# Debian / Ubuntu
sudo apt install shellcheck
# Alpine
apk add shellcheck
# Other: download from https://github.com/koalaman/shellcheck/releases
Run
shellcheck myscript.sh
shellcheck -x myscript.sh # follow `source` statements (-x = external sources)
The most common findings (and what to do)
ShellCheck assigns each rule a number: SC2086, SC2155, etc. The most common in real scripts:
SC2086 — “Double quote to prevent globbing and word splitting”
NAME="alice smith"
cp $NAME /tmp/ # SC2086: cp alice smith /tmp/ — looks for two files
cp "$NAME" /tmp/ # CORRECT
The single most-common shell bug. The fix is always to quote.
SC2046 — “Quote this to prevent word splitting”
ls $(find . -name '*.txt') # SC2046 — output of find may have spaces
ls "$(find . -name '*.txt')" # NOPE — collapses all into single arg
mapfile -d '' FILES < <(find . -name '*.txt' -print0)
ls "${FILES[@]}" # CORRECT
This one is harder to fix mechanically — sometimes you really do want word-splitting. ShellCheck warns regardless; you have to think about which case applies.
SC2155 — “Declare and assign separately to avoid masking return values”
local var="$(get_value)" # SC2155
# Why? `local` always succeeds, so $? after this is local's status, NOT get_value's.
# If get_value fails, you don't notice.
# CORRECT:
local var
var="$(get_value)"
Fix: declare first, assign on the next line. Otherwise set -e won’t catch failure of the assigned expression.
SC2034 — “var appears unused”
NAME="alice" # never used elsewhere — typo? dead code?
Either remove or use, or annotate as exported (export NAME=...).
SC2154 — “var is referenced but not assigned”
echo "$VAARS" # typo — meant $VARS?
SC2126 — “Consider using grep -c instead of grep | wc -l”
grep PATTERN file | wc -l # SC2126 — slower
grep -c PATTERN file # better
SC2068 — “Double quote array expansions”
for arg in $@; do …; done # SC2068
for arg in "$@"; do …; done # CORRECT
SC2002 — “Useless cat”
cat file | grep PATTERN # SC2002
grep PATTERN < file # better
grep PATTERN file # best
SC2164 — “Use cd … || exit”
cd /tmp # SC2164 — what if /tmp doesn't exist?
cd /tmp || die "cannot cd" # better
Suppressing rules
When ShellCheck is wrong (rare but happens), suppress the specific rule above the line:
# shellcheck disable=SC2086
COMMAND="ls -la"
$COMMAND # we WANT word-splitting here
# Or for an entire file (top of file, after shebang):
# shellcheck disable=SC2086,SC2046
Always add a comment explaining why you suppressed:
# shellcheck disable=SC2086 # we deliberately split — kubectl args from build
$KUBECTL_ARGS
CI integration
Add a step to your pipeline that fails the build on any ShellCheck warning:
# .github/workflows/lint.yml (snippet)
- name: ShellCheck
run: shellcheck scripts/*.sh
For pre-commit:
# .pre-commit-config.yaml
repos:
- repo: https://github.com/koalaman/shellcheck-precommit
rev: v0.10.0
hooks:
- id: shellcheck
There’s no excuse for not running ShellCheck. Install it on day one.
5. Error propagation across function boundaries
This is where most engineers get tripped up. Bash has subtle rules about how errors propagate, especially with command substitution and pipelines.
The problem: command-substitution exit codes
#!/usr/bin/env bash
set -Eeuo pipefail
get_count() {
cat /nonexistent # this fails
echo "10"
}
# What does $COUNT become?
COUNT=$(get_count) # $COUNT = "10"; the cat failure was swallowed!
echo "got: $COUNT" # "got: 10" — function appeared to succeed!
The function get_count internally failed at cat /nonexistent, but because it ran inside $(...), set -e doesn’t propagate the failure out by default.
The fix is inherit_errexit (bash 4.4+):
shopt -s inherit_errexit # add to your preamble
COUNT=$(get_count) # NOW the cat failure does propagate; script exits
For older bash, the workaround is to capture and check explicitly:
if ! COUNT=$(get_count); then
die "get_count failed"
fi
Or to split assignment from the failable command:
if ! OUTPUT=$(get_count 2>&1); then
die "get_count failed: $OUTPUT"
fi
The pipefail dance with set -e
A subtle case:
set -eo pipefail
cmd1 | cmd2 # if cmd1 fails, $? becomes cmd1's status (pipefail), then -e fires
That’s the desired behaviour. But:
set -eo pipefail
output=$(cmd1 | cmd2) # $? = the right-most failure ... but does -e fire?
Without inherit_errexit, no — the failure is contained in the command sub. With inherit_errexit, yes. Same lesson: enable inherit_errexit when on bash 4.4+.
Functions and local masking exit codes
We saw SC2155 above — but the symptom is worth re-stating:
process() {
local result="$(some_failing_command)" # SC2155
echo "$result"
}
Here, local always succeeds. Even though some_failing_command failed, the value passed to local is its empty stdout, and the function continues. set -e does not fire. The fix:
process() {
local result
result="$(some_failing_command)" # NOW set -e can fire on failure
echo "$result"
}
This is one of the strongest reasons to lint with ShellCheck.
if ! and until swallow failures
if ! some_check; then
…
fi
Inside an if (or while, until) condition, set -e is suspended for that command. Don’t put expensive logic in if conditions if you want set -e to catch its failures — restructure so the call is at the top level:
some_check
status=$?
if [[ $status -ne 0 ]]; then …; fi # but now set -e already exited :(
Easier:
if some_check; then
…
else
die "some_check failed"
fi
The “explicit if/else with explicit error path” is sometimes the cleanest pattern.
6. The lib/errors.sh framework
Let’s assemble everything into a reusable library. Save this as lib/errors.sh and source it from every script:
# lib/errors.sh — defensive shell helpers
# Source this from every script:
# source "$(dirname "${BASH_SOURCE[0]}")/lib/errors.sh"
# strict mode
set -Eeuo pipefail
shopt -s inherit_errexit nullglob 2>/dev/null || true # bash 4.4+; nullglob always ok
IFS=$'\n\t'
readonly SCRIPT_NAME="${0##*/}"
# Logging
_log() { local lvl=$1; shift; printf '[%s] %s: %s\n' "$SCRIPT_NAME" "$lvl" "$*" >&2; }
die() { _log "FATAL" "$@"; exit 1; }
warn() { _log "WARN " "$@"; }
info() { _log "INFO " "$@"; }
debug() { [[ "${DEBUG:-0}" == 1 ]] && _log "DEBUG" "$@" || true; }
# Validation
assert_command() { command -v "$1" >/dev/null 2>&1 || die "command not found: $1"; }
assert_file() { [[ -f "$1" ]] || die "file not found: $1"; }
assert_dir() { [[ -d "$1" ]] || die "directory not found: $1"; }
assert_var() { [[ -n "${!1:-}" ]] || die "required variable is unset or empty: \$$1"; }
assert_readable() { [[ -r "$1" ]] || die "file not readable: $1"; }
assert_writable() { [[ -w "$1" ]] || die "file not writable: $1"; }
assert_in() {
local val=$1; shift
local choice
for choice in "$@"; do [[ "$val" == "$choice" ]] && return 0; done
die "value '$val' is not in: $*"
}
# ERR trap — run on any uncaught failure
_on_err() {
local exit_code=$? line=${BASH_LINENO[0]} cmd=${BASH_COMMAND}
printf '\n[%s] FATAL: %s:%d: "%s" exited %d\n' "$SCRIPT_NAME" "${BASH_SOURCE[1]:-$0}" "$line" "$cmd" "$exit_code" >&2
if [[ ${#FUNCNAME[@]} -gt 1 ]]; then
printf 'Call stack:\n' >&2
local i=0
while caller $i >/dev/null 2>&1; do
printf ' %s\n' "$(caller $i)" >&2
((i++))
done
fi
exit "$exit_code"
}
trap _on_err ERR
Now any script that sources this gets:
#!/usr/bin/env bash
source "$(dirname "${BASH_SOURCE[0]}")/lib/errors.sh"
# strict mode active, ERR trap active, helpers available
assert_command kubectl
assert_command jq
assert_var TARGET_ENV
assert_in "$TARGET_ENV" dev staging prod
info "deploying to $TARGET_ENV"
kubectl apply -f manifests/
info "deploy complete"
Any failure produces:
[deploy.sh] FATAL: ./deploy.sh:8: "kubectl apply -f manifests/" exited 1
Call stack:
8 do_deploy ./deploy.sh
20 main ./deploy.sh
This is what production-grade shell looks like.
7. Patterns for partial success
Strict mode is the default. But sometimes you genuinely want “do this, ignore failure”:
# Pattern 1: Explicit `|| true` to suppress one command
rm -rf /tmp/cache 2>/dev/null || true
# Pattern 2: Capture exit code without exiting
if some_optional_check; then
info "optional check passed"
fi
# (no else — failure is fine)
# Pattern 3: Set a flag, branch on it
if has_optional_dep; then
USE_OPTIONAL=1
else
USE_OPTIONAL=0
fi
# Pattern 4: Run a series with set +e locally
do_optional_steps() {
set +e
cmd1
cmd2
cmd3
set -e
}
Pattern 4 is sometimes necessary for “best effort” cleanup or batch operations — but use it sparingly and document why. For most production scripts, fail-fast is right.
try/catch the bash way
Bash has no try/catch. The closest pattern:
{
some_thing
another_thing
} || {
warn "best-effort batch failed; continuing"
}
The { } group runs as a unit; the || after the closing brace gives you the “catch” branch. Use this for genuinely-optional work; don’t normalize it.
8. Complete example: a real defensive script
Putting it all together — a deploy script with full defensive engineering:
#!/usr/bin/env bash
# deploy.sh — push image to a Kubernetes namespace
# Usage: deploy.sh <env> <image-tag>
# 1. Strict-mode preamble
set -Eeuo pipefail
shopt -s inherit_errexit nullglob
IFS=$'\n\t'
# 2. Source our helpers
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck source=lib/errors.sh
source "$SCRIPT_DIR/lib/errors.sh"
# 3. Argument validation
[[ $# -eq 2 ]] || die "usage: $0 <env> <image-tag>"
readonly ENV="$1"
readonly TAG="$2"
assert_in "$ENV" dev staging prod
[[ "$TAG" =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]] || die "tag must be vMAJOR.MINOR.PATCH; got: $TAG"
assert_command kubectl
assert_command jq
# 4. Look up cluster context
KUBE_CONTEXT="kloudvin-${ENV}"
kubectl config get-contexts -o name | grep -qx "$KUBE_CONTEXT" || die "no kubectl context: $KUBE_CONTEXT"
# 5. Atomic deploy — set, then poll for rollout
info "switching to context $KUBE_CONTEXT"
kubectl config use-context "$KUBE_CONTEXT"
NAMESPACE="kloudvin"
DEPLOYMENT="api"
info "setting image to $TAG"
kubectl -n "$NAMESPACE" set image "deployment/$DEPLOYMENT" "$DEPLOYMENT=ghcr.io/kloudvin/api:$TAG"
info "waiting for rollout (timeout 5m)"
if ! kubectl -n "$NAMESPACE" rollout status "deployment/$DEPLOYMENT" --timeout=5m; then
warn "rollout failed; rolling back"
kubectl -n "$NAMESPACE" rollout undo "deployment/$DEPLOYMENT"
die "rollout failed for $DEPLOYMENT in $NAMESPACE; rolled back"
fi
info "rollout complete: $DEPLOYMENT @ $TAG in $NAMESPACE/$ENV"
Breakdown:
- Section 1: strict-mode preamble.
- Section 2: import helpers via sourcing. The
# shellcheck source=...directive tells ShellCheck where to find the sourced file, so it can lint cross-file. - Section 3: every input validated up-front. We fail before doing any actual work.
- Section 4: validate cluster context exists.
- Section 5: do the actual work; on rollout failure, roll back and die. We want to die, so the CI pipeline marks the deploy as failed.
This kind of script — strict, validated, atomic, fail-loud — is what you ship to a team you’ll work with for years.
9. Common pitfalls
“I added set -e but it’s still not catching X”
set -e has many exemptions (covered above). When something isn’t being caught, check:
- Is it inside
if/while/until? - Is it in
$(...)withoutinherit_errexit? - Is it after
&&or||? - Is it a
local var=$(failing_cmd)?
The fix is usually to restructure or add explicit checking.
Trap fires twice on script exit
If you have both an EXIT trap and an ERR trap, they both fire. Either:
# Pattern A: ERR exits, EXIT does cleanup. ERR triggers EXIT.
trap '_on_err' ERR
trap '_cleanup' EXIT
Or:
# Pattern B: One unified handler that knows the difference
_on_exit() {
local exit_code=$?
if [[ $exit_code -ne 0 ]]; then
warn "script exiting with $exit_code"
fi
_cleanup
}
trap _on_exit EXIT
We covered this in L10.
pipefail makes intentional head short-circuit fail
set -o pipefail
some_long_output | head -n 10 # head closes stdin early; some_long_output gets SIGPIPE; pipefail trips
The fix:
set -o pipefail
some_long_output | { head -n 10; cat >/dev/null; } # absorb the rest
# Or:
some_long_output | head -n 10 || [[ $? -eq 141 ]] # tolerate SIGPIPE (128+13)
Pick one; document the choice in a comment.
Sourcing lib/errors.sh from a non-standard relative path
source ./lib/errors.sh # WRONG — depends on cwd at invocation
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/lib/errors.sh" # CORRECT — always relative to the script's location
Always use BASH_SOURCE[0] + dirname for reliable sourcing.
ShellCheck false positives
Sometimes you’ll really want word-splitting (SC2086) or really want to use unquoted $@. The right answer is to suppress the rule on that line with an explanatory comment:
# shellcheck disable=SC2086 # OPTS is built up as a string of safe shell tokens
some_command $OPTS
Don’t suppress globally to silence noise. Diagnose each case.
10. Twelve idioms for daily use
# 1. Strict-mode preamble (every script)
set -Eeuo pipefail
shopt -s inherit_errexit nullglob
IFS=$'\n\t'
# 2. die / warn / info — minimum logging
die() { printf '[%s] FATAL: %s\n' "${0##*/}" "$*" >&2; exit 1; }
warn() { printf '[%s] WARN: %s\n' "${0##*/}" "$*" >&2; }
info() { printf '[%s] INFO: %s\n' "${0##*/}" "$*" >&2; }
# 3. Basic ERR trap with line number
trap 'die "line $LINENO: $BASH_COMMAND failed (exit $?)"' ERR
# 4. assert_command
assert_command() { command -v "$1" >/dev/null 2>&1 || die "missing command: $1"; }
# 5. assert_var (checks variable is set and non-empty by name)
assert_var() { [[ -n "${!1:-}" ]] || die "required var unset: \$$1"; }
# 6. assert_in (value must be in a list)
assert_in() { local v=$1; shift; for c; do [[ "$v" == "$c" ]] && return 0; done; die "$v not in: $*"; }
# 7. Quoted argument validation
[[ $# -ge 2 ]] || die "usage: $0 <env> <tag>"
# 8. Regex validation
[[ "$TAG" =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]] || die "bad tag: $TAG"
# 9. Robust source path
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/lib/errors.sh"
# 10. Suppress ShellCheck rule with reason
# shellcheck disable=SC2086 # intentional split
# 11. local + assign on separate lines (avoid SC2155)
local result
result="$(some_command)"
# 12. Atomic write + rename
TMP=$(mktemp); some_command > "$TMP" && mv "$TMP" target
11. What you must internalise before lesson 14
- What does
set -Edo, and why is it necessary? (Propagates ERR trap into functions / subshells / command substitution.) - What’s
inherit_errexit? (Propagatesset -einto command substitution$(...).) - Where does
set -enot fire? (Insideif/while/untilconditions; in$(...)withoutinherit_errexit; after&&/||; on commands prefixed with!.) - What’s the fix for SC2155 (
local x="$(...)")? (Declare and assign on separate lines.) - Why must trap commands be in single quotes? (To prevent expansion at registration; expansion happens at fire time.)
- What’s the canonical ERR trap line? (
trap 'die "line $LINENO: $BASH_COMMAND failed (exit $?)"' ERR.) - What goes to stdout vs stderr? (Data → stdout. Logs and errors → stderr.)
- Why call ShellCheck on every script? (Catches an enormous fraction of common shell bugs at lint time.)
- How do you suppress a ShellCheck rule with explanation? (
# shellcheck disable=SC2086 # reason.) - What’s
assert_var "VAR_NAME"checking? (That the variable named “VAR_NAME” is set and non-empty, via indirect expansion${!1:-}.)
If anything felt fuzzy, re-read. The next nine lessons assume you internalised these.
What’s next
Lesson 14: Argument Parsing — getopts, getopt, manual parsing & long options. Most real scripts take options (-v, --verbose, --config FILE). Bash gives you getopts for short options out of the box, but long options (--verbose) need extra work. We cover getopts deeply, the GNU getopt command (different! easy to confuse with getopts), the canonical manual long-option parsing pattern, optional vs required arguments, and how to integrate with usage() and the strict-mode preamble. After L14 your scripts will accept arguments the way kubectl, git, and other production CLIs do.
See you there.