Most shell scripts run sequentially: one command, then the next. That’s fine until you’re processing 1,000 files, hitting 50 endpoints, or fanning out across 30 nodes. Suddenly serial execution means waiting hours when the CPU is at 8% utilisation.
Real concurrency in shell isn’t hard, but it has sharp edges:
- Backgrounding (&) with wait: the bare-metal primitive.
- xargs -P N: the simplest job pool; one command per input line, N concurrent.
- GNU parallel: declarative parallelism with progress, retry, and structured output.
- FIFOs (mkfifo): named pipes for IPC between long-running processes.
- flock: kernel-level mutual exclusion to serialise access to shared resources.
We covered the basics of &/wait in L9. This lesson goes deep, builds patterns you’ll actually use in production, and covers the race conditions to avoid.
By the end you’ll be able to run 100 deploys in parallel safely, monitor them, recover from partial failures, and never accidentally clobber a shared file.
1. Backgrounding refresher: & and wait
cmd & # start cmd, return immediately, $! is its PID
wait # wait for ALL background jobs
wait $PID # wait for one specific PID
wait -n # wait for ANY background job to finish (bash 4.3+)
#!/usr/bin/env bash
# Run three jobs in parallel, wait for all
do_thing 1 &
do_thing 2 &
do_thing 3 &
wait
echo "all done"
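With a PID argument, wait returns that job's exit status. With no arguments, it waits for every background job and returns 0 regardless of how they exited, so a bare wait cannot tell you whether anything failed. To capture per-job: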
do_thing 1 & PID1=$!
do_thing 2 & PID2=$!
do_thing 3 & PID3=$!
wait $PID1; RC1=$?
wait $PID2; RC2=$?
wait $PID3; RC3=$?
echo "results: $RC1 $RC2 $RC3"
This works for a known small number of jobs. For arbitrary counts, you need a job pool.
Job pool: bounded parallelism with wait -n
#!/usr/bin/env bash
set -Eeuo pipefail
MAX_JOBS=${MAX_JOBS:-4}
JOBS=()
for input in input1 input2 input3 input4 input5 input6 input7 input8 input9 input10; do
  # If we've reached the cap, wait for any one to finish first
  while (( ${#JOBS[@]} >= MAX_JOBS )); do
    wait -n || true # a failed job must not trip set -e here
    # Rebuild JOBS: keep only the PIDs that are still alive
    NEW=()
    for pid in "${JOBS[@]}"; do
      if kill -0 "$pid" 2>/dev/null; then NEW+=("$pid"); fi
    done
    JOBS=(${NEW[@]+"${NEW[@]}"}) # safe under set -u even when NEW is empty (bash < 4.4)
  done
  do_thing "$input" &
  JOBS+=("$!")
done
wait # wait for the last batch
This caps concurrency at MAX_JOBS. For most use cases, xargs -P does this more cleanly.
wait -n exit code (bash 4.3+)
After wait -n returns, $? is the exit code of the job that finished. To loop until all are done while monitoring:
JOBS=()
for input in "${INPUTS[@]}"; do
worker "$input" &
JOBS+=($!)
done
FAILURES=0
for _ in "${JOBS[@]}"; do # exactly one wait -n per launched job
  rc=0
  wait -n || rc=$?
  if (( rc != 0 )); then FAILURES=$((FAILURES + 1)); fi
  # Note: plain wait -n doesn't say WHICH job finished; bash 5.1+ adds wait -n -p VAR for that.
done
echo "$FAILURES failures"
If you need per-job exit codes, track PIDs and wait $PID individually. A bare wait (no args) always returns 0, so for “all-or-nothing” semantics count failures as above and exit non-zero when FAILURES is greater than 0.
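A minimal sketch of per-job tracking, assuming a hypothetical worker function that fails for any input containing "bad" (the inputs and variable names are illustrative, not from the lesson). It records which input each PID belongs to, waits on each PID in turn, and also shows that a bare wait reports 0 even when a job failed.

```bash
#!/usr/bin/env bash
# Hypothetical worker: succeeds unless the input contains "bad"
worker() { [[ $1 != *bad* ]]; }

INPUTS=(alpha bad-beta gamma bad-delta)

declare -A INPUT_OF=()        # PID -> input (associative arrays need bash 4+)
PIDS=()
for input in "${INPUTS[@]}"; do
  worker "$input" &
  PIDS+=("$!")
  INPUT_OF[$!]=$input
done

FAILED=()
for pid in "${PIDS[@]}"; do
  # "wait PID" returns that job's own exit status, even if it finished earlier
  if ! wait "$pid"; then
    FAILED+=("${INPUT_OF[$pid]}")
  fi
done

false &                       # a job that fails...
wait                          # ...is invisible to a bare wait
BARE_WAIT_RC=$?

echo "failed: ${FAILED[*]} (bare wait returned $BARE_WAIT_RC)"
```

Keeping the PID-to-input map is what lets the script name the failing inputs instead of just counting them.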
2. xargs -P N — the simplest job pool
The cleanest way to run a bounded-parallel set of commands over a list:
# Run gzip on every .log file, 8 in parallel
find /var/log -name '*.log' -print0 | xargs -0 -P 8 -n 1 gzip
Flags:
- -P N: run N processes concurrently.
- -n 1: pass 1 argument per command (so each gzip handles one file).
- -0: input is NUL-separated (paired with find -print0).
- -I {}: substitute {} in the command line (lets you put the arg somewhere other than the end).
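# Custom command shape: pass each filename as $1 to an exported shell function
export -f process
printf '%s\n' "${FILES[@]}" | xargs -I {} -P 4 bash -c 'process "$@"' _ {}
The trick bash -c '...' _ {} is: bash -c runs the script, the _ is $0, and {} becomes $1. Then process "$@" calls your function with the filename. Two details matter: the function must be exported with export -f so the child bash can see it, and -I {} already runs one command per input line, so -n 1 is dropped (GNU xargs treats -I and -n as mutually exclusive).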
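The bash -c pattern can be exercised end to end with a small sketch. The process function, the file names, and OUT_DIR below are hypothetical stand-ins; the input is NUL-delimited so a name containing a space survives.

```bash
#!/usr/bin/env bash
OUT_DIR=$(mktemp -d)
export OUT_DIR                 # the child bash needs this too

# Hypothetical per-file worker: writes one marker file per input name
process() {
  local name=$1
  printf 'processed %s\n' "$name" > "$OUT_DIR/$name.out"
}
export -f process              # bash -c starts a new bash; it only sees exported functions

FILES=(a.txt "b c.txt" d.txt)

# -I {} runs one command per item; _ fills $0 so the item arrives as $1
printf '%s\0' "${FILES[@]}" | xargs -0 -P 4 -I {} bash -c 'process "$@"' _ {}
XARGS_RC=$?

PRODUCED=$(find "$OUT_DIR" -type f -name '*.out' | wc -l)
PRODUCED=$((PRODUCED))         # strip any padding wc adds
echo "xargs rc=$XARGS_RC, files produced=$PRODUCED"
```

Passing the item as an argument rather than splicing {} into the quoted script keeps odd file names from being interpreted as shell code.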
Number of cores
# Use all available cores
NPROC=$(nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 4)
# Then:
xargs -P "$NPROC" -n 1 ...
For I/O-bound work (network calls, disk I/O), you can usefully set this to 2-4x cores. For CPU-bound (compression, encryption), stay at nproc.
Capturing output safely
When parallel commands write to stdout, lines can interleave. Best practice: have each one write to its own file, merge at the end.
mkdir -p /tmp/joblogs
find . -name '*.log' -print0 | xargs -0 -P 8 -I {} \
  bash -c 'gzip --keep "$1" 2>"/tmp/joblogs/$(basename "$1").err"' _ {}
cat /tmp/joblogs/*.err # merge afterwards
Or use xargs --process-slot-var:
xargs -P 4 -I {} --process-slot-var=SLOT \
  bash -c 'echo "slot=$SLOT processing $1"' _ {} \
  < input.txt
SLOT becomes 0…3, letting each worker write to its own log file or use its own port etc. This is GNU-only but useful.
xargs -P exit code semantics
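GNU xargs exits with:
- 0 if everything succeeded
- 123 if any command exited with status 1-125
- 124 if a command exited with status 255 (xargs stops immediately)
- 125 if a command was killed by a signal
- 126 if a command could not be run
- 127 if a command was not found
- 1 for any other error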
So xargs -P correctly fails if any subprocess fails. Good for use with set -e.
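These statuses are easy to observe directly. The sketch below assumes GNU xargs (BSD xargs uses different codes); run_with_statuses is a hypothetical helper that turns each argument into the exit status of one child command.

```bash
#!/usr/bin/env bash
# Each argument becomes one input line, and each line becomes a child's exit status
run_with_statuses() {
  printf '%s\n' "$@" | xargs -n 1 sh -c 'exit "$1"' _ 2>/dev/null
}

run_with_statuses 0 0 0; RC_ALL_OK=$?
run_with_statuses 0 3 0; RC_ONE_FAILED=$?
run_with_statuses 255;   RC_255=$?

# A command that does not exist at all
echo x | xargs /nonexistent/command 2>/dev/null
RC_NOT_FOUND=$?

echo "ok=$RC_ALL_OK failed=$RC_ONE_FAILED exit255=$RC_255 notfound=$RC_NOT_FOUND"
```

With GNU findutils this should report 0, 123, 124 and 127 respectively, which is why a script can branch on "some job failed" (123) separately from "xargs could not even start the command" (126 or 127).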
3. GNU parallel — declarative parallelism
xargs -P is fine for “run this command on every input.” parallel is for everything more elaborate: progress bars, retries, ETA, structured output, multi-input combinations.
brew install parallel # macOS
sudo apt install parallel # Debian/Ubuntu
Basic equivalent to xargs -P
# All three are equivalent
ls *.log | xargs -P 8 -n 1 gzip
ls *.log | parallel -j 8 gzip
parallel -j 8 gzip ::: *.log
::: is parallel’s syntax for inline argument lists. -j N is the parallelism degree.
Templated commands
parallel -j 4 'curl -sL https://api.example.com/{} -o {}.json' ::: alice bob carol dave
The {} is each input. parallel also supports:
- {1}, {2}, …: positional from multiple input lists
- {.}: input with extension stripped
- {/}: basename only
- {//}: dirname only
- {#}: job number
# Compress, naming output by job number
parallel -j 4 'gzip -c {} > {.}.{#}.gz' ::: file*.log
Multiple input lists
parallel -j 8 'echo {1} {2}' ::: a b c ::: 1 2 3
# a 1
# a 2
# a 3
# b 1
# ...
Cartesian product of inputs by default. Use --xapply (spelled --link in newer releases) or :::+ for paired:
parallel -j 8 --xapply 'echo {1} {2}' ::: a b c ::: 1 2 3
# a 1
# b 2
# c 3
Progress bar and ETA
parallel --bar 'process {}' ::: input1 input2 input3 ...
parallel --eta 'process {}' ::: input1 input2 input3 ...
Both show live progress. Useful for long-running batches.
Retry on failure
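parallel --retries 3 'curl -fsS https://api.example.com/{}' ::: $(seq 1 1000)
Each failing command is run up to 3 times in total. The -f flag matters: without it curl exits 0 on HTTP 4xx/5xx responses, so parallel would never see a failure to retry. Combined with --joblog: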
parallel --joblog jobs.log --retries 3 'curl ...' ::: ...
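You get a tab-separated log with one row per job: sequence number, host, start time, runtime, bytes sent and received, exit value, signal, and the command line.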
Result aggregation with --results
parallel --results /tmp/jobout 'curl https://api.example.com/{}' ::: 1 2 3
Each job’s stdout/stderr go to /tmp/jobout/1/..., organized by argument. No interleaving.
Limit memory and CPU
parallel --memfree 1G 'big_cmd {}' ::: ... # only run new jobs while >1G free
parallel --load 80% 'big_cmd {}' ::: ... # only while CPU load <80%
Distribute across machines
parallel can SSH to remote machines and run jobs there:
parallel -S host1,host2,host3 -j 4 'process {}' ::: ...
-j 4 is per-host concurrency. This is genuinely impressive for distributed work without any framework — but you need passwordless SSH set up.
parallel vs xargs -P summary
Use xargs -P when:
- The task is “run command X on each line of input.”
- You don’t need progress, retry, or per-job logs.
- You want maximum portability (xargs is on every system; -P is not in POSIX, but both GNU and BSD xargs support it).
Use parallel when:
- You need progress, retry, joblog, ETA.
- You have multiple input lists to combine.
- You’re doing distributed work via SSH.
- Output needs to be aggregated cleanly.
Both have their place.
4. FIFOs — named pipes for IPC
A FIFO is a “named pipe” — a filesystem entry that acts as a pipe. Two unrelated processes can communicate through it.
Basics
mkfifo /tmp/myfifo
# Process A: writes
echo "hello from A" > /tmp/myfifo &
# Process B: reads
cat /tmp/myfifo
# hello from A
The write blocks until something opens the FIFO for reading; the read blocks until something writes. This is synchronous IPC.
Use case: background producer + consumer
#!/usr/bin/env bash
set -Eeuo pipefail
FIFO=$(mktemp -u /tmp/myfifo.XXXXXX)
mkfifo "$FIFO"
trap 'rm -f "$FIFO"' EXIT
# Producer in background
(
for i in {1..10}; do
sleep 0.1
echo "msg $i"
done > "$FIFO"
) &
# Consumer in foreground
while IFS= read -r line; do
echo "got: $line"
done < "$FIFO"
wait
This pattern lets you set up a producer/consumer pipeline where the producer’s output is processed line-by-line by the consumer in the same shell context. With anonymous pipes (|), the right side runs in a subshell and can’t easily update parent variables.
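The difference is easy to see with a counter. In this sketch (produce and the variable names are hypothetical), the same loop runs once behind an anonymous pipe and once reading from a FIFO; only the FIFO version leaves the count visible afterwards.

```bash
#!/usr/bin/env bash
produce() { printf 'msg %s\n' 1 2 3; }

# Anonymous pipe: the while loop runs in a subshell, so its counter is lost
PIPE_COUNT=0
produce | while IFS= read -r line; do PIPE_COUNT=$((PIPE_COUNT + 1)); done

# FIFO: the loop runs in the current shell, so the counter survives
FIFO=$(mktemp -u "${TMPDIR:-/tmp}/demo.XXXXXX")
mkfifo "$FIFO"
FIFO_COUNT=0
produce > "$FIFO" &            # the writer blocks until the loop below opens the FIFO
while IFS= read -r line; do FIFO_COUNT=$((FIFO_COUNT + 1)); done < "$FIFO"
wait
rm -f "$FIFO"

echo "pipe=$PIPE_COUNT fifo=$FIFO_COUNT"
```

This assumes bash's default settings; with shopt -s lastpipe in a non-interactive shell, the pipe version would also keep its counter.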
A worker pool with FIFOs
#!/usr/bin/env bash
set -Eeuo pipefail
NUM_WORKERS=${NUM_WORKERS:-4}
FIFO=$(mktemp -u /tmp/workers.XXXXXX)
mkfifo "$FIFO"
trap 'rm -f "$FIFO"' EXIT
# Open the FIFO twice (read+write) so it doesn't close
exec 3<>"$FIFO"
# Pre-fill with NUM_WORKERS tokens
for ((i=0; i<NUM_WORKERS; i++)); do
echo >&3
done
worker() {
  local item=$1 rc=0
  process "$item" || rc=$? # the actual work; a failure must not skip the line below
  echo >&3 # always return the token, or the pool slowly deadlocks
  return "$rc"
}
for item in "${ITEMS[@]}"; do
read -u 3 # consume a token (blocks if none)
worker "$item" &
done
wait
exec 3>&- # close FD 3
This is a classic “semaphore via FIFO” pattern. The FIFO acts as a counting semaphore: tokens limit concurrency to NUM_WORKERS.
xargs -P does this internally and more cleanly. The FIFO version is useful when you need finer control (e.g., variable-cost jobs, weighted slots).
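As one illustration of "weighted slots", here is a sketch in which each job takes several tokens at once. The job names, weights, and the event log are hypothetical, and the sleep stands in for real work; it is one possible design under those assumptions, not a general-purpose scheduler.

```bash
#!/usr/bin/env bash
TOTAL_SLOTS=4
EVENTS=$(mktemp)                 # "N name" when a job starts, "-N name" when it ends

FIFO=$(mktemp -u "${TMPDIR:-/tmp}/slots.XXXXXX")
mkfifo "$FIFO"
exec 3<>"$FIFO"                  # read+write so the pipe never sees EOF
rm -f "$FIFO"                    # FD 3 keeps the pipe alive after the name is gone

for ((i = 0; i < TOTAL_SLOTS; i++)); do echo >&3; done

acquire() { local i; for ((i = 0; i < $1; i++)); do read -r -u 3; done; }
release() { local i; for ((i = 0; i < $1; i++)); do echo >&3; done; }

# Hypothetical jobs as "name:weight"; a weight must not exceed TOTAL_SLOTS
JOB_SPECS=(small1:1 big1:3 small2:1 big2:3 small3:1 huge:4)

for spec in "${JOB_SPECS[@]}"; do
  name=${spec%%:*}
  weight=${spec##*:}
  # Only this loop acquires tokens, so two jobs can never deadlock
  # while each holds part of what it needs
  acquire "$weight"
  echo "$weight $name" >> "$EVENTS"
  (
    sleep 0.1                    # stand-in for the real work
    echo "-$weight $name" >> "$EVENTS"
    release "$weight"
  ) &
done
wait
exec 3>&-

# Peak number of slots in use, reconstructed from the event log
PEAK=$(awk '{ used += $1; if (used > peak) peak = used } END { print peak + 0 }' "$EVENTS")
EVENT_LINES=$(wc -l < "$EVENTS")
EVENT_LINES=$((EVENT_LINES))
echo "peak slots in use: $PEAK of $TOTAL_SLOTS"
```

Acquiring in the parent rather than inside each worker is the design choice that matters: if workers grabbed their own tokens one at a time, two heavy jobs could each hold half of what they need and wait forever.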
5. flock — cross-process mutual exclusion
When multiple invocations of a script (cron jobs, signal handlers, manual runs) might collide, you need a lock. We saw flock briefly in L10. Here’s the full pattern.
Single-instance script
#!/usr/bin/env bash
set -Eeuo pipefail
LOCKFILE=/var/run/myscript.lock
# Acquire exclusive lock on FD 200; fail if already locked
exec 200>"$LOCKFILE"
flock -n 200 || { echo "another instance is running" >&2; exit 1; }
# ... rest of script ...
flock -n is non-blocking — it fails immediately if the lock is held. flock (no -n) blocks until acquired.
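The lock auto-releases when the process exits (even on SIGKILL), because the kernel closes all of its FDs. No explicit unlock needed. One caveat: background children inherit FD 200, and the lock is held until the last copy of that descriptor is closed, so a long-running child keeps the script locked after the parent has exited.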
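The acquire, contend, release cycle can be watched in a single script. This sketch assumes the util-linux flock command; try_lock is a hypothetical helper that opens the lock file afresh, the way a second run of the script would.

```bash
#!/usr/bin/env bash
LOCKFILE=$(mktemp)

# Take the lock through a fresh open of the file, as a second script run would
try_lock() { ( exec 201>"$LOCKFILE"; flock -n 201 ); }

exec 200>"$LOCKFILE"
flock -n 200 && FIRST=acquired || FIRST=refused

try_lock && SECOND=acquired || SECOND=refused   # contends with FD 200

exec 200>&-                                     # closing the FD drops the lock
try_lock && THIRD=acquired || THIRD=refused

rm -f "$LOCKFILE"
echo "first=$FIRST second=$SECOND third=$THIRD"
```

The second attempt is refused even though it comes from the same script, because the lock belongs to the open file description behind FD 200, not to the process.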
Locking a region of work
{
flock 200
critical_section
} 200>"$LOCKFILE"
The block in { ... } 200>FILE opens FD 200 and runs flock 200 to acquire. When the block exits, FD 200 is closed and the lock released. Useful when only part of a script is sensitive.
Self-locking script (one-liner)
#!/usr/bin/env bash
set -Eeuo pipefail
exec 200>"/var/run/${0##*/}.lock"
flock -n 200 || exit 0 # silently exit if locked
# ... rest ...
Combined with cron, this gives you “run every minute, but skip if previous run is still going” semantics.
flock shared vs exclusive
flock -s 200 # shared (multiple readers OK)
flock -x 200 # exclusive (default; only one)
For database-style “many readers, one writer” patterns. Most scripts just want exclusive.
Timeout on flock acquisition
flock -w 30 200 || die "could not acquire lock after 30s"
Useful when “wait but don’t wait forever” is the right behaviour.
6. Race conditions to avoid
TOCTOU — Time-Of-Check, Time-Of-Use
The classic shell race:
if [[ ! -f "$FILE" ]]; then
touch "$FILE"
fi
Between the [[ -f ]] test and the touch, another process can create the file. Then both processes proceed, possibly clobbering. The fix is to use atomic operations:
# Atomic create-if-not-exists
( set -C; echo "$$" > "$FILE" ) 2>/dev/null && IS_CREATOR=1 || IS_CREATOR=0
set -C (noclobber) makes > fail if the file exists. The whole thing is atomic at the kernel level: the file either exists or is created by this process. No race.
Or use flock — acquire the lock before checking, so no one else can race.
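To see the noclobber guarantee under contention, the sketch below races 20 background subshells for the same path and counts how many believe they created it. The paths and the claim helper are hypothetical.

```bash
#!/usr/bin/env bash
TARGET=$(mktemp -u "${TMPDIR:-/tmp}/claim.XXXXXX")   # a path that does not exist yet
WINNERS_DIR=$(mktemp -d)

# Succeeds for exactly one caller: noclobber makes ">" refuse an existing file
claim() { ( set -C; echo "$1" > "$TARGET" ) 2>/dev/null; }

for i in $(seq 1 20); do
  ( if claim "$i"; then : > "$WINNERS_DIR/$i"; fi ) &
done
wait

WINNERS=$(find "$WINNERS_DIR" -type f | wc -l)
WINNERS=$((WINNERS))                                 # strip any padding wc adds
echo "contenders=20 winners=$WINNERS owner=$(< "$TARGET")"
rm -rf "$TARGET" "$WINNERS_DIR"
```

Which contender wins varies from run to run; the point is that there is only ever one.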
Concurrent writes to the same file
# Three jobs in parallel all writing to log.txt — lines interleave at byte level
job1 >> log.txt &
job2 >> log.txt &
job3 >> log.txt &
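With >> the file is opened in append mode (O_APPEND), so on a local filesystem every write() call lands at the current end of the file and jobs never overwrite each other. What is not guaranteed is that a whole line arrives in a single write(): a job that buffers output in large blocks, or builds a line from several writes, can have its lines split and interleaved with another job's. (The PIPE_BUF limit, 4096 bytes on Linux, is an atomicity guarantee for pipes and FIFOs, not for regular files.) Prefer: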
# Each job writes to its own file
job1 > log.1 &
job2 > log.2 &
job3 > log.3 &
wait
cat log.1 log.2 log.3 > log.txt
Or use flock to serialise:
log() { ( flock 200; printf '%s\n' "$*" >> log.txt ) 200>log.lock; }
Stale lock files (the cleanup problem)
If a process crashes without cleanup, its lock file may remain:
LOCKFILE=/var/run/myscript.lock
[[ -f "$LOCKFILE" ]] && exit 1 # WRONG — stale lock blocks forever
The right answer is flock: kernel-managed locks auto-release on process death. No PID files, no staleness, no manual cleanup.
exec 200>"$LOCKFILE"
flock -n 200 || exit 1
# lock is held by THIS process; releases when this process exits, no matter how
This is why flock, rather than a hand-rolled PID-file check, is the usual choice for single-instance scripts on Linux.
Subshell variable scoping
COUNT=0
{ for i in {1..100}; do ((COUNT++)); done; } &
wait
echo "$COUNT" # still 0 — subshell ran in its own COUNT
We covered this in L4. Subshells don’t propagate variables back. For accumulation across parallel work, use a file:
echo 0 > /tmp/count
for i in {1..100}; do
( count=$(< /tmp/count); echo $((count + 1)) > /tmp/count ) & # WRONG — race!
done
wait
Even this has a TOCTOU race. Use flock to serialise the read-modify-write:
LOCK=/tmp/count.lock
COUNTFILE=/tmp/count
echo 0 > "$COUNTFILE"
increment() {
( flock 200; echo $(( $(< "$COUNTFILE") + 1 )) > "$COUNTFILE" ) 200>"$LOCK"
}
for i in {1..100}; do
increment &
done
wait
echo "count: $(< "$COUNTFILE")"
Or accept the limitation and accumulate after-the-fact:
{ for i in {1..100}; do echo "$i"; done > items; }
parallel -j 4 do_thing :::: items > results
total=$(wc -l < results)
For most scripts, “do work in parallel, write results to per-job files, aggregate after” is the simplest and safest pattern.
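A bash-only sketch of that pattern, with a hypothetical job function that prints the squares of 1..n. Each job gets a private output file, and nothing is read until wait has returned.

```bash
#!/usr/bin/env bash
RESULTS_DIR=$(mktemp -d)

# Hypothetical job: emits the squares of 1..n, one per line
job() {
  local n=$1 i
  for ((i = 1; i <= n; i++)); do echo $((i * i)); done
}

IDS=(3 5 2)
for id in "${IDS[@]}"; do
  job "$id" > "$RESULTS_DIR/$id.out" &    # one private output file per job
done
wait                                      # aggregate only after every job has finished

TOTAL_LINES=$(cat "$RESULTS_DIR"/*.out | wc -l)
TOTAL_LINES=$((TOTAL_LINES))              # strip any padding wc adds
SUM=$(cat "$RESULTS_DIR"/*.out | awk '{ s += $1 } END { print s + 0 }')
echo "lines=$TOTAL_LINES sum=$SUM"        # prints: lines=10 sum=74
rm -rf "$RESULTS_DIR"
```

No locking is needed because no two jobs ever share a file, and the aggregation step is ordinary sequential code.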
7. A complete parallel deploy script
Tying everything together — deploy 50 services in parallel, with bounded concurrency, retry, locking, and structured logging:
#!/usr/bin/env bash
# parallel-deploy.sh — deploy a list of services in parallel
set -Eeuo pipefail
source "$(dirname "${BASH_SOURCE[0]}")/lib/log.sh"
LOCKFILE=/var/run/parallel-deploy.lock
exec 200>"$LOCKFILE"
flock -n 200 || { error "another deploy is running"; exit 1; }
[[ $# -ge 1 ]] || { error "usage: $0 <services-file> [tag]"; exit 2; }
SVC_FILE=$1
TAG="${2:-latest}"
JOBS="${JOBS:-8}"
[[ -r "$SVC_FILE" ]] || { error "cannot read $SVC_FILE"; exit 1; }
readarray -t SERVICES < "$SVC_FILE"
info "deploying ${#SERVICES[@]} services with concurrency=$JOBS, tag=$TAG"
RESULTS_DIR=$(mktemp -d /tmp/deploy.XXXXXX)
# Remove the logs only after a fully successful run; keep them for debugging otherwise
trap 'rc=$?; if (( rc == 0 )); then rm -rf "$RESULTS_DIR"; fi' EXIT
deploy_one() {
  local svc=$1
  local logfile="$RESULTS_DIR/$svc.log"
  # Up to 3 attempts with linear backoff (5s, 10s, 15s)
  local attempt
  for attempt in 1 2 3; do
    info "deploy $svc (attempt $attempt)"
    # Group both kubectl calls so the redirect captures output from each
    if {
      kubectl set image "deployment/$svc" "$svc=ghcr.io/myorg/$svc:$TAG" \
        && kubectl rollout status "deployment/$svc" --timeout=2m
    } >> "$logfile" 2>&1; then
      info "deploy $svc OK"
      return 0
    fi
    warn "deploy $svc attempt $attempt failed; sleeping $((attempt * 5))s"
    sleep $((attempt * 5))
  done
  error "deploy $svc FAILED after 3 attempts"
  return 1
}
# parallel runs each job in a fresh shell, so export the function and the
# logging helpers it calls (assumes lib/log.sh defines them as shell functions)
export -f deploy_one info warn error
export TAG RESULTS_DIR
# "|| RC=$?" keeps set -e from aborting before we can summarise
RC=0
printf '%s\n' "${SERVICES[@]}" | parallel -j "$JOBS" --joblog "$RESULTS_DIR/joblog" \
  --halt soon,fail=10% deploy_one {} || RC=$?
# Summarise results (column 7 of the joblog is the exit value)
SUCCESS=$(awk -F'\t' 'NR>1 && $7==0 { c++ } END { print c+0 }' "$RESULTS_DIR/joblog")
FAILED=$(awk -F'\t' 'NR>1 && $7!=0 { c++ } END { print c+0 }' "$RESULTS_DIR/joblog")
info "deploy summary" total=${#SERVICES[@]} success=$SUCCESS failed=$FAILED
[[ $RC -eq 0 ]] || error "some deploys failed; see $RESULTS_DIR for details"
exit $RC
Notes:
- flock ensures only one deploy can run at a time across the whole machine.
- parallel -j $JOBS caps concurrency.
- --halt soon,fail=10% stops launching new jobs once 10% or more have failed and lets the running ones finish (don’t keep deploying after a clear pattern of failure).
- Each service’s kubectl output goes to its own file under RESULTS_DIR.
- The joblog (--joblog) gives us machine-parseable results we summarise via awk.
- export -f deploy_one is needed for parallel to find the function in the child shells.
- lib/log.sh from L15 provides info/warn/error.
This is what shipping shell scripts looks like at scale.
8. Common pitfalls
wait returning 127
If you call wait $PID for a PID that has already been reaped, wait returns 127. Avoid by always wait’ing exactly once per spawned PID.
xargs -P and signals
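Ctrl-C in a terminal sends SIGINT to the whole foreground process group, so xargs and its children all receive it. A signal aimed at one PID is different: if something kills only the script or only xargs, the commands xargs started may keep running. When xargs runs as a background job of your script, forward the signal to it:
trap 'kill $(jobs -p) 2>/dev/null' INT TERM
That reaches xargs itself, not the commands it spawned. For a hard stop, signal the whole process group instead, or make each child handle the signal and exit.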
parallel lecture on first run
The first time you run parallel, it asks you to “cite” via parallel --citation. To skip in scripts:
parallel --will-cite ...
Or run parallel --citation manually once to suppress the prompt forever.
Background jobs killed when parent exits
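A non-interactive script does not signal its background jobs when it exits: they are simply orphaned and keep running. What can kill them is SIGHUP when the controlling terminal goes away (closing the terminal window, dropping the SSH session). Use nohup or disown for jobs that must survive that: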
nohup long_thing & # ignores SIGHUP, redirects stdout/err
disown $! # remove from shell's job table
& inside a function vs at script top level
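my_func() {
  cmd & # runs in the current shell, so $! is updated for the caller as well
}
my_func
echo "$!" # this IS the PID of cmd: functions do not get their own $!
The real trap is capturing the PID through command substitution:
my_func() {
  cmd &
  echo $!
}
PID=$(my_func) # WRONG: cmd becomes a child of the $( ) subshell, and $( ) blocks until cmd closes stdout
wait "$PID" # fails with 127: the PID is not a child of this shell
Call the function directly and read $! afterwards instead: my_func; PID=$!; wait "$PID".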
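One way to hand a PID back to the caller while keeping the job a child of the current shell is to pass the name of a result variable. In this sketch, launch and job are hypothetical helpers, and the nameref (local -n) assumes bash 4.3 or newer.

```bash
#!/usr/bin/env bash
# Hypothetical job that exits with the status it is given
job() { sleep 0.1; return "$1"; }

# Launch in the CURRENT shell and store the PID in the variable named by $1
launch() {
  local -n pid_out=$1          # nameref to the caller's variable (bash 4.3+)
  shift
  "$@" &
  pid_out=$!
}

launch PID_OK  job 0
launch PID_BAD job 7

wait "$PID_OK";  RC_OK=$?
wait "$PID_BAD"; RC_BAD=$?
echo "ok=$RC_OK bad=$RC_BAD"   # prints: ok=0 bad=7
```

Because no command substitution is involved, both jobs stay children of the calling shell and wait can collect their real exit statuses.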
FIFO blocking forever
A FIFO write blocks until a reader opens it. If your reader exited or never started, the writer hangs. Fix: ensure both ends are open, or use exec 3<>FIFO to keep an FD open in the script itself so neither end “closes.”
flock on NFS
flock doesn’t work reliably across NFS: different kernels handle remote locks differently. For NFS, use lockfile-create (from the lockfile-progs package), procmail’s lockfile, or rely on atomic ln (hard links are atomic on most filesystems).
9. Twelve idioms for daily use
# 1. Run three commands in parallel, wait for all
cmd1 & cmd2 & cmd3 & wait
# 2. xargs job pool over a list
find . -name '*.log' -print0 | xargs -0 -P 8 -n 1 gzip
# 3. Number of cores cross-platform
NPROC=$(nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 4)
# 4. parallel basic
parallel -j 8 'curl -s https://api.example.com/{}' ::: 1 2 3 4 5
# 5. parallel with retry + joblog
parallel --joblog jobs.log --retries 3 'cmd {}' ::: ...
# 6. parallel with progress
parallel --bar 'cmd {}' ::: input*
# 7. Single-instance via flock (non-blocking)
exec 200>/var/run/myscript.lock; flock -n 200 || exit 0
# 8. Atomic create-if-not-exists (no race)
( set -C; echo "$$" > "$LOCK" ) 2>/dev/null
# 9. FIFO-based semaphore worker pool
mkfifo "$FIFO"; exec 3<>"$FIFO"
for i in $(seq 1 4); do echo >&3; done
for item in "${ITEMS[@]}"; do
read -u 3
( do_work "$item"; echo >&3 ) &
done
wait
# 10. Per-job result files (no interleaving)
job() { local id=$1; do_work > "results/$id.out"; }
for id in 1 2 3; do job $id & done; wait
# 11. wait -n for any-finishes (bash 4.3+)
for _ in "${JOBS[@]}"; do wait -n; done
# 12. Disown a long-running job from the shell
nohup long_job & disown $!
10. What you must internalise before lesson 17
- What’s wait -n for? (Wait for ANY background job to finish; bash 4.3+.)
- What’s xargs -P 4 -n 1? (Run 4 in parallel, 1 input arg per command.)
- What’s the -0 flag’s purpose? (NUL-separated input, paired with find -print0 for filename safety.)
- What does parallel --joblog give you? (A tab-separated file with one row per job: sequence number, host, start time, runtime, exit value, signal, command. Machine-parseable results.)
- What does parallel --halt soon,fail=10% do? (Stop launching new jobs as soon as 10% of jobs have failed.)
- What’s a FIFO and how do you create one? (mkfifo /tmp/fifo: a filesystem-named pipe.)
- What’s flock -n? (Non-blocking lock acquisition: fails immediately if already held.)
- Why use flock instead of [[ -f $LOCKFILE ]]? (flock uses kernel-managed locks that auto-release on process death; no staleness.)
- What’s a TOCTOU race? (Time-Of-Check, Time-Of-Use: the gap between checking a condition and acting on it allows another process to change state in between.)
- What’s the safest pattern for accumulating results from parallel jobs? (Each job writes to its own file; aggregate after wait.)
What’s next
Lesson 17: Network Operations — curl/wget Mastery, /dev/tcp Sockets, Retry-with-Backoff & Idempotent HTTP. Almost every modern script makes HTTP calls — to APIs, to artifact registries, to webhooks. We’ll cover curl (every flag worth knowing), wget (when and why), bash’s built-in /dev/tcp socket support (no curl needed!), retry-with-exponential-backoff patterns, idempotency keys for safe API calls, and the canonical “wait for service to be up” pattern. After L17, your scripts will hit the network reliably.
See you there.