Loops are where good shell scripts grow up and bad shell scripts destroy production. Almost every notorious shell-script disaster you’ve ever read about — files deleted by accident, partial deployments left in inconsistent states, log-rotation jobs that ate the wrong directory — boils down to one of three categories of loop bug:
- Word-splitting bugs: iterating over $(ls) or unquoted variables and getting wrong filenames when names contain spaces, tabs, newlines, or globs.
- Subshell bugs: cmd | while read line; do COUNT=$((COUNT+1)); done. The loop body runs in a subshell, so COUNT doesn’t update in the parent, and after the loop your counter is still zero.
- Empty-glob bugs: for f in *.log when no .log files exist. Bash leaves the literal *.log as the iterator value, so you process a “file” called *.log.
Every one of these is a specific corollary of the lessons we covered in L2 (quoting and IFS) and L3 (exit codes). This lesson shows you the loop forms themselves, and shows you which iteration patterns are correct and which are landmines.
Read this carefully. Type the examples. Run them with weird filenames (spaces, newlines, leading dashes) and watch them break — then fix them with the patterns below.
1. The four loop forms in bash
Bash has four general-purpose loop constructs, shown below. A fifth, select (an interactive menu loop), you’ll see rarely.
for VAR in LIST; do BODY; done
for (( init; condition; update )); do BODY; done
while CONDITION; do BODY; done
until CONDITION; do BODY; done
The for-in form iterates over a list of words. The C-style for (( )) is not POSIX (bash, ksh93, and zsh provide it) and gives you C-like loops. while runs the body as long as CONDITION exits zero (success). until is the inverse: it runs the body until CONDITION exits zero. Both while and until evaluate CONDITION before each iteration.
There is no do-while form; the closest you can get is to put the work at the start of the body and break when done.
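A minimal sketch of that emulation, assuming a hypothetical counter named attempts stands in for the real work. The body runs once before the exit test is ever evaluated, which is the property a do-while would give you:

```bash
# Emulated do-while: do the work first, test at the bottom of the body.
attempts=0
while :; do
    attempts=$(( attempts + 1 ))      # the "work" (placeholder for real commands)
    echo "attempt $attempts"
    (( attempts < 3 )) || break       # exit test runs only after the work
done
```

The test sits in an || list, so it stays safe even if the script later turns on set -e.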
2. The for-in loop and the four ways to use it correctly
The for-in loop iterates over a list of words. The fundamental thing to remember: bash word-splits every unquoted expansion in the list on IFS (space, tab, and newline by default) and then expands any glob patterns in the result. You almost never want this for arbitrary input.
Form 1: explicit list of words
for fruit in apple banana cherry; do
echo "$fruit"
done
Output:
apple
banana
cherry
This works because the words are literal — no variables, no globs, no IFS surprises. The list can also span multiple lines:
for service in \
postgres \
redis \
nginx; do
systemctl restart "$service"
done
The trailing backslashes continue the line. This is a perfectly fine pattern for short, fixed lists.
Form 2: glob expansion (the right way to iterate over files)
for log in /var/log/*.log; do
[ -e "$log" ] || continue # handle the no-match case
echo "Processing $log"
gzip "$log"
done
Bash expands /var/log/*.log to a list of matching paths. This is the correct way to iterate over files. Unlike for f in $(ls), glob expansion is byte-exact — filenames with spaces, tabs, newlines, glob characters in their names, all of them work correctly.
The [ -e "$log" ] || continue line handles the empty-glob case: when no files match, bash by default leaves the pattern as the literal value. So for log in /var/log/*.log with no logs would set log="/var/log/*.log" for one iteration, which is almost never what you want.
There are two ways to handle this:
Option A (per-loop) — check existence at the top of the body:
for log in /var/log/*.log; do
[ -e "$log" ] || continue
process "$log"
done
Option B (script-wide) — enable nullglob:
shopt -s nullglob # bash-only; non-matching globs expand to nothing
for log in /var/log/*.log; do
process "$log"
done
shopt -u nullglob # restore default if needed
nullglob is bash-only. POSIX shells don’t have it. If you’re writing portable shell, use option A. We’ll cover nullglob, dotglob, globstar, and the glob options in detail in lesson 11.
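To see the empty-glob behaviour for yourself, here is a small sketch that loops over an empty temporary directory twice, once with default settings and once with nullglob. The names demo_dir, literal_hits, and nullglob_hits are made up for the demonstration:

```bash
demo_dir=$(mktemp -d)                 # empty directory: nothing matches *.log

literal_hits=0
for f in "$demo_dir"/*.log; do        # default: the unmatched pattern is kept as-is
    literal_hits=$(( literal_hits + 1 ))
    literal_value=$f
done

shopt -s nullglob
nullglob_hits=0
for f in "$demo_dir"/*.log; do        # nullglob: the pattern expands to nothing
    nullglob_hits=$(( nullglob_hits + 1 ))
done
shopt -u nullglob

echo "default: $literal_hits iteration(s), last value: $literal_value"
echo "nullglob: $nullglob_hits iteration(s)"
rm -rf -- "$demo_dir"
```

The default run performs one iteration on a path ending in /*.log; the nullglob run performs none.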
Form 3: iterate over an array (the safe way to iterate over collected data)
LOGS=(
/var/log/app.log
"/var/log/with space.log"
/var/log/another.log
)
for log in "${LOGS[@]}"; do
echo "Processing: $log"
done
The "${LOGS[@]}" form expands to one quoted token per array element — preserving spaces, newlines, every byte exactly. This is the right way to iterate over a list of names you’ve collected from somewhere. Lesson 6 covers arrays in depth.
Form 4: iterate over $@ — the script’s positional parameters
for arg in "$@"; do
echo "Got argument: $arg"
done
This is the canonical way to iterate over command-line arguments. The double-quotes are essential — for arg in $@ (unquoted) splits arguments on IFS again, breaking arguments that contain spaces. We covered this in L2. Always "$@".
You can also write for arg do ... done — bash treats a for with no in as iterating over "$@" automatically. Useful shorthand:
for arg do
echo "Got: $arg"
done
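The difference between "$@" and $@ is easy to check. This sketch assumes default IFS and uses two hypothetical helpers, count_args and compare_expansions, that exist only for the demonstration:

```bash
# count_args reports how many arguments it received.
count_args() { echo "$#"; }

compare_expansions() {
    quoted=$(count_args "$@")   # each original argument stays one word
    unquoted=$(count_args $@)   # re-split on IFS: "one arg" becomes two words
}

compare_expansions "one arg" "two"
echo "quoted: $quoted, unquoted: $unquoted"
```

With two arguments passed in, the quoted form sees 2 and the unquoted form sees 3.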
The four wrong ways to iterate
These are the common mistakes. Each one breaks under specific inputs:
for f in $(ls /var/log) # WRONG — splits on whitespace, breaks on filenames with spaces
for f in `ls /var/log` # WRONG — same as above, with deprecated syntax
for f in $FILES # WRONG — splits on $IFS, breaks if FILES has weird chars
cat list.txt | while read f # SUBTLE BUG — body runs in subshell; variables don't propagate (section 5)
Memorise: never iterate over the output of ls, never iterate over an unquoted variable expansion, never use a pipe to feed while read if you need to capture state. The right replacements are: globs, find -print0 + xargs -0, mapfile/readarray, and while read fed by input redirection not a pipe (section 5).
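If you want to watch the first wrong form break, this sketch creates two files in a temporary directory (one with a space in its name) and counts iterations both ways. It assumes default IFS; demo_dir and the counter names are illustrative:

```bash
demo_dir=$(mktemp -d)
touch "$demo_dir/my file.log" "$demo_dir/plain.log"

ls_count=0
for f in $(ls "$demo_dir"); do        # WRONG: "my file.log" becomes two words
    ls_count=$(( ls_count + 1 ))
done

glob_count=0
for f in "$demo_dir"/*; do            # one iteration per real file
    glob_count=$(( glob_count + 1 ))
done

echo "ls saw $ls_count names, the glob saw $glob_count files"
rm -rf -- "$demo_dir"
```

Two files exist, but the $(ls) loop runs three times, and none of the words "my" or "file.log" names a real file.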
3. The C-style for (( )) loop
When you need a numeric counter, the C-style for is far cleaner than seq-and-for-in:
for (( i = 0; i < 10; i++ )); do
echo "Iteration $i"
done
Inside (( )) you have full C arithmetic — increment (i++), decrement (i--), compound assignment (i += 5), bitwise, etc. No $ prefix on variables.
The three sections (init, condition, update) are all optional — for ((;;)) is a forever-loop, like while true:
for ((;;)); do
echo "Press Ctrl+C to stop"
sleep 1
done
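Compared to the older seq form (the for-in syntax is POSIX, but seq itself is a GNU/BSD utility that POSIX does not specify):
# Works in any shell where seq is installed, but slower (forks seq)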
for i in $(seq 0 9); do
echo "Iteration $i"
done
# Bash-only, faster (no fork)
for ((i = 0; i < 10; i++)); do
echo "Iteration $i"
done
The C-style form is faster (no seq subprocess), more flexible (custom step, descending counts), and more readable when you’re doing real arithmetic.
If you need to iterate by a non-default step:
for ((i = 100; i > 0; i -= 5)); do
echo "Countdown: $i"
done
Or with bash’s brace expansion (lesson 11), for fixed ranges:
for i in {0..9}; do
echo "$i"
done
for i in {0..100..5}; do # step of 5; bash 4+
echo "$i"
done
Brace expansion is faster than seq and doesn’t fork, but it’s expanded eagerly: {0..1000000} builds the entire list in memory before the loop starts. For very large counts, use the C-style for (( )) loop.
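One more practical limit is worth knowing: bash performs brace expansion before variable expansion, so a range with a variable bound is not expanded as a sequence. A short sketch, with n and the counter names chosen only for illustration:

```bash
n=3

brace_iterations=0
for i in {1..$n}; do                  # braces are processed before $n is known
    brace_iterations=$(( brace_iterations + 1 ))
    last_brace_value=$i
done

c_iterations=0
for (( i = 1; i <= n; i++ )); do      # C-style loop reads n at run time
    c_iterations=$(( c_iterations + 1 ))
done

echo "brace form: $brace_iterations iteration(s), value: $last_brace_value"
echo "C-style form: $c_iterations iteration(s)"
```

The brace form runs once with the literal string {1..3}; the C-style form runs three times. When the bound lives in a variable, the C-style loop is the straightforward choice.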
4. while and until
while and until are command-runners just like if. They run a command, look at its exit code, and decide whether to enter (or re-enter) the body.
COUNT=0
while (( COUNT < 5 )); do
echo "Count: $COUNT"
(( COUNT++ ))
done
i=0
until (( i >= 5 )); do
echo "i: $i"
(( i++ ))
done
while CONDITION; do BODY; done is read “while CONDITION is true (zero exit), run BODY.” until CONDITION; do BODY; done is “until CONDITION becomes true, run BODY.” Mechanically until X is exactly while ! X. Use while 99% of the time; until only when “until X happens” reads more naturally.
Infinite loops
while true; do
echo "Forever"
sleep 1
done
while :; do # the colon command — minimal overhead, equivalent to true
echo "Forever"
sleep 1
done
while : is a tiny bit faster than while true (: is a built-in always; true is a built-in in bash but a separate binary in some shells). Rare to matter, but a common idiom.
Reading input line-by-line (the right way)
This is the most important while pattern in shell. To process a file (or any input) one line at a time:
while IFS= read -r line; do
echo "Got line: $line"
done < input.txt
Three critical pieces:
- IFS= clears IFS for this command only, so leading and trailing whitespace are preserved exactly.
- read -r is raw mode: do not interpret backslash escapes. Without -r, a backslash escapes the character after it, and a literal \ followed by a newline is interpreted as a line continuation. Almost always wrong for arbitrary data.
- done < input.txt is input redirection (lesson 7). The file is connected to the loop’s stdin, so each call to read consumes one line.
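This pattern preserves every character of each line, including spaces, tabs, leading dashes, embedded glob characters, and embedded backslashes. Two caveats: bash variables cannot hold NUL bytes, and if the final line has no trailing newline, read returns non-zero and the loop skips that line unless you write while IFS= read -r line || [[ -n $line ]].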
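A small sketch of what IFS= and -r each buy you, using a here-string (bash-only) as the input. The variable names are illustrative. The second half shows one common way to keep a final line that lacks a trailing newline:

```bash
# Input: two leading spaces, a literal backslash followed by "t", two trailing spaces.
input='  indented\tvalue  '
read plain <<< "$input"               # trims whitespace and eats the backslash
IFS= read -r exact <<< "$input"       # keeps the line exactly as written
printf 'plain: [%s]\nexact: [%s]\n' "$plain" "$exact"

# read returns non-zero at end of input even if it filled the variable,
# so the extra test keeps a last line that has no trailing newline.
lines=0
while IFS= read -r line || [[ -n $line ]]; do
    lines=$(( lines + 1 ))
done < <(printf 'first\nlast-without-newline')
echo "lines seen: $lines"
```

The plain read yields indentedtvalue; the IFS= read -r form yields the original string, spaces and backslash included.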
Parsing structured input with read
read can split a line into multiple variables:
echo "alice 30 engineer" | while read -r name age role; do
echo "Name: $name, Age: $age, Role: $role"
done
If you give read more variables than fields, the extras are empty. If you give read fewer variables than fields, the last variable gets everything that’s left over, including the original separators between the remaining fields. This behaviour is sometimes useful, sometimes surprising.
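Both cases are quick to verify. This sketch feeds read from here-strings (bash-only) and assumes default IFS; the sample data is invented:

```bash
# Fewer variables than fields: the last variable takes the remainder verbatim,
# including the double space between "senior" and "staff".
read -r name age rest <<< "alice 30 senior  staff engineer"
printf 'rest: [%s]\n' "$rest"

# More variables than fields: the extras are set to the empty string.
read -r a b c <<< "only two"
printf 'a=[%s] b=[%s] c=[%s]\n' "$a" "$b" "$c"
```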
For CSV-like input, set IFS for the read:
while IFS=',' read -r name age role; do
echo "$name | $age | $role"
done < users.csv
For tab-separated files:
while IFS=$'\t' read -r col1 col2 col3; do
echo "$col1 | $col2 | $col3"
done < data.tsv
The IFS=',' or IFS=$'\t' is set only for the read invocation — bash’s prefix-environment-variable syntax (lesson 1). The rest of the script’s IFS is unaffected.
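You can confirm that the assignment is scoped to the single read command with a sketch like this one (ifs_before and ifs_restored are illustrative names):

```bash
ifs_before=$IFS
IFS=',' read -r name age role <<< "alice,30,engineer"

# The prefix assignment applied only to read; the shell's own IFS is untouched.
if [[ $IFS == "$ifs_before" ]]; then
    ifs_restored=yes
else
    ifs_restored=no
fi
echo "role=$role, IFS unchanged: $ifs_restored"
```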
5. The subshell trap: pipes and while read
This is one of the most subtle, surprising bugs in bash. Watch closely.
COUNT=0
echo "line1
line2
line3" | while read -r line; do
COUNT=$((COUNT + 1))
done
echo "Count: $COUNT"
What does this print? You might expect 3. It actually prints 0.
The reason: bash by default (like dash and most other sh implementations) runs each stage of a pipeline in its own subshell. POSIX permits this; ksh and zsh instead run the last stage in the current shell. The while loop on the right side of | is a subshell with its own copy of COUNT. When the loop ends, the subshell exits, and the parent’s COUNT is unchanged.
This is one of the most common shell bugs in production scripts, and it silently produces wrong results.
The fix: use input redirection instead of a pipe. Input redirection runs the body in the current shell, so variables persist:
COUNT=0
while read -r line; do
COUNT=$((COUNT + 1))
done < <(echo "line1
line2
line3")
echo "Count: $COUNT" # 3 — correct
The < <(echo ...) is process substitution (lesson 7) — <(cmd) produces a filename that bash makes readable as input. The outer < redirects that file into the loop’s stdin. Same effect as a pipe, but the while runs in the parent shell.
Or read from a real file:
COUNT=0
while read -r line; do
(( COUNT++ ))
done < input.txt
echo "Count: $COUNT" # works
Or use mapfile to read the whole file into an array first:
mapfile -t LINES < input.txt
COUNT="${#LINES[@]}"
echo "Count: $COUNT"
mapfile is bash 4+. The -t flag strips trailing newlines from each line. Lesson 6 covers mapfile and arrays in depth.
There are bash settings that change this behaviour:
shopt -s lastpipe # bash-only; when in non-interactive mode,
# the last pipe stage runs in the current shell
With lastpipe enabled, cmd | while read line; do COUNT=...; done works as expected. But it’s a non-default setting and only works in scripts (not interactive shells), and only if job control is off. The portable, explicit fix is process substitution or input redirection.
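For completeness, a sketch of lastpipe in action. It assumes bash 4.2 or newer running non-interactively; set +m is included only to make the job-control requirement explicit:

```bash
set +m                 # job control off (already the default in scripts)
shopt -s lastpipe      # last pipeline stage runs in the current shell

count=0
printf 'line1\nline2\nline3\n' | while read -r _; do
    count=$(( count + 1 ))
done
shopt -u lastpipe

echo "Count: $count"
```

Under those assumptions the counter survives the pipeline. Paste the same lines into an interactive shell with job control on and it will not.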
The rule: never use cmd | while read if the loop body needs to update variables. Use < <(cmd) instead.
A real-world example: counting matches
# WRONG — the count is always 0
COUNT=0
grep -E '^ERROR' /var/log/app.log | while read -r line; do
(( COUNT++ ))
done
echo "Errors: $COUNT"
# RIGHT — process substitution preserves variable scope
COUNT=0
while read -r line; do
(( COUNT++ ))
done < <(grep -E '^ERROR' /var/log/app.log)
echo "Errors: $COUNT"
# BEST — let grep do the counting
COUNT="$(grep -cE '^ERROR' /var/log/app.log)"
echo "Errors: $COUNT"
The third form is fastest and clearest. Whenever you find yourself counting in a shell loop, ask whether grep -c, wc -l, or awk could do it for you. Shell loops over many lines are slow; dedicated tools are fast.
6. break and continue
break exits the innermost loop. continue skips to the next iteration. Both can take an integer to operate on outer loops:
for i in 1 2 3; do
    for j in a b c; do
        if [[ "$j" == "b" ]]; then
            break # exits the inner loop
        fi
        echo "$i $j"
    done
done
# Output:
# 1 a
# 2 a
# 3 a
for i in 1 2 3; do
    for j in a b c; do
        if [[ "$j" == "b" ]]; then
            break 2 # exits BOTH loops
        fi
        echo "$i $j"
    done
done
# Output:
# 1 a
for i in 1 2 3; do
    for j in a b c; do
        if [[ "$j" == "b" ]]; then
            continue # skip this j, move to next j
        fi
        echo "$i $j"
    done
done
# Output:
# 1 a
# 1 c
# 2 a
# 2 c
# 3 a
# 3 c
break N and continue N work on N levels of nesting. Use break 2 instead of flag variables — it’s clearer and faster.
7. The case statement inside loops
case was introduced in lesson 3 as a multi-branch conditional. Inside a loop, it’s the cleanest way to dispatch on per-item values:
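# while + shift rather than for-in: shift does not change a for loop's word list,
# so the arguments after -- would otherwise be lost.
POSITIONAL=()
while (( $# > 0 )); do
    case "$1" in
        -v|--verbose)
            VERBOSE=1
            ;;
        -q|--quiet)
            QUIET=1
            ;;
        -h|--help)
            show_help
            exit 0
            ;;
        --)
            shift
            POSITIONAL+=("$@") # everything after -- is positional
            break
            ;;
        -*)
            echo "Unknown option: $1" >&2
            exit 2
            ;;
        *)
            POSITIONAL+=("$1")
            ;;
    esac
    shift
done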
This is a hand-rolled argument parser. getopts (lesson 17) does this more rigorously, but the hand-rolled form is often clearer and supports long options without extra work.
Inside a case arm, ;; ends the arm. ;& falls through to the next arm. ;;& continues evaluating subsequent patterns. We covered these in lesson 3.
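As a quick refresher on ;;&, here is a sketch in which one value can match several arms. It needs bash 4 or newer, and classify is a hypothetical function written only for this illustration:

```bash
classify() {
    local out=""
    case "$1" in
        [0-9]*)   out+="starts-with-digit " ;;&   # keep testing later patterns
        *[a-z]*)  out+="has-lowercase " ;;&
        *)        out+="checked" ;;
    esac
    printf '%s\n' "$out"
}

mixed=$(classify "9lives")
digits=$(classify "42")
echo "$mixed"
echo "$digits"
```

With plain ;; the first matching arm would end the statement; with ;;& every matching arm contributes.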
8. Command substitution gotchas
Lesson 2 covered the basics of $(cmd). Inside a loop, the gotchas multiply.
Trailing newlines are stripped
NAME=$(echo "hello")
printf '%q\n' "$NAME" # prints hello; the trailing newline from echo is gone
$(...) strips all trailing newlines from the command’s output. This is usually what you want. But if the trailing newlines are significant (rare, but possible — e.g., a file’s exact byte content), you need to preserve them with a sentinel:
CONTENT=$(cat file.txt; printf x)
CONTENT="${CONTENT%x}" # remove the sentinel
The trailing printf x adds a non-newline byte that isn’t stripped; then we remove it with parameter expansion. Niche but occasionally critical.
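Comparing string lengths makes the stripping visible. In this sketch the command emits data followed by two newlines; the variable names are illustrative:

```bash
stripped=$(printf 'data\n\n')          # both trailing newlines are removed
kept=$(printf 'data\n\n'; printf x)    # sentinel protects them
kept=${kept%x}                         # drop the sentinel again

echo "stripped length: ${#stripped}, kept length: ${#kept}"
```

The plain substitution holds 4 characters; the sentinel version holds all 6.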
Output that contains globs
FILES_OUTPUT=$(ls /tmp)
for f in $FILES_OUTPUT; do # WRONG: word-split on IFS, glob-expand
echo "$f"
done
If /tmp has a file named *, the unquoted $FILES_OUTPUT will glob-expand and you’ll iterate over every file in your current directory instead. Never iterate over $(ls)-style output unquoted.
Multi-line output and IFS
TEXT=$(printf 'line1\nline2\nline3\n')
for line in $TEXT; do # works ONLY because IFS includes \n by default
echo "$line"
done
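This works only because these particular lines contain no spaces, tabs, or glob characters. Under default IFS (space tab newline), a line containing a space splits into several words. If you’ve set IFS=$'\n\t' in strict mode, spaces survive, but any line containing a tab still splits there. In both cases the resulting words are also glob-expanded. The robust replacement: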
mapfile -t LINES < <(printf 'line1\nline2\nline3\n')
for line in "${LINES[@]}"; do
echo "$line"
done
mapfile -t reads input line-by-line into an array, splitting only on newlines. It’s the modern, correct replacement for for line in $(cmd).
9. Iterating over files: the canonical patterns
This is the section everyone needs, copied from real production shell scripts.
Pattern A: glob (when files are on disk in a known location)
shopt -s nullglob # optional but recommended
for f in /path/to/*.log; do
process "$f"
done
shopt -u nullglob
Or, without nullglob:
for f in /path/to/*.log; do
[ -e "$f" ] || continue
process "$f"
done
Pattern B: find -print0 + while read -d ''
For deeply nested directories or filtered traversal:
while IFS= read -r -d '' f; do
process "$f"
done < <(find /path -type f -name '*.log' -print0)
find -print0 separates filenames with NUL bytes (\0), the only byte that can’t appear in a path. read -d '' (empty delimiter = NUL) parses NUL-separated input. This is the most robust way to consume find output in a shell loop, because no legal filename can be mistaken for the delimiter.
Pattern C: mapfile for line-oriented input
mapfile -t FILES < <(find /path -type f -name '*.log')
for f in "${FILES[@]}"; do
process "$f"
done
Simpler than the find -print0 form, but breaks on filenames containing newlines. Acceptable for trusted inputs (your own deployment artifacts), risky for user-supplied data.
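The difference between Patterns B and C shows up as soon as a filename contains a newline. This sketch builds three files in a temporary directory and collects them both ways. It assumes a filesystem that allows newlines in names and a find that supports -print0; the array names are illustrative:

```bash
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.log" "$demo_dir/with space.log" "$demo_dir/"$'new\nline.log'

# Pattern B: NUL-delimited, one array element per real file.
nul_safe=()
while IFS= read -r -d '' f; do
    nul_safe+=("$f")
done < <(find "$demo_dir" -type f -name '*.log' -print0)

# Pattern C: newline-delimited, so the third name is split in two.
mapfile -t by_line < <(find "$demo_dir" -type f -name '*.log')

echo "NUL-delimited: ${#nul_safe[@]} names, newline-delimited: ${#by_line[@]} names"
rm -rf -- "$demo_dir"
```

Three files exist; the NUL-delimited loop sees 3 names, while mapfile sees 4 entries, two of which are not real paths.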
Pattern D: xargs for parallel processing
find /path -type f -name '*.log' -print0 | xargs -0 -P 4 -I {} process "{}"
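xargs -0 parses NUL-separated input. -P 4 runs up to 4 processes in parallel. -I {} substitutes the filename for {}. Because xargs executes programs rather than shell code, process here must be an executable on PATH, not a shell function. This is the right pattern for “process N files concurrently.” Lesson 14 covers xargs and parallelism in depth.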
Choosing between them
- Pattern A (glob): when files are in one or two known directories. Fastest, simplest.
- Pattern B (find -print0): when traversal is deep or filtered, or when filenames may be hostile.
- Pattern C (mapfile): when you want all the names in memory and need to access them by index.
- Pattern D (xargs -0 -P): when each file is independent and parallelism wins.
10. Real-world example: archive and compress old logs
#!/usr/bin/env bash
# rotate-logs.sh: compress logs older than N days, then delete after M days
set -euo pipefail
IFS=$'\n\t'
LOG_DIR="${LOG_DIR:-/var/log/myapp}"
COMPRESS_AFTER_DAYS="${COMPRESS_AFTER_DAYS:-7}"
DELETE_AFTER_DAYS="${DELETE_AFTER_DAYS:-30}"
# 1. Compress logs older than N days, but not already compressed
COMPRESSED_COUNT=0
while IFS= read -r -d '' f; do
    echo "Compressing: $f"
    if gzip -- "$f"; then
        # Not (( COMPRESSED_COUNT++ )): that returns status 1 when the old value
        # is 0, which would abort the script under set -e.
        COMPRESSED_COUNT=$(( COMPRESSED_COUNT + 1 ))
    else
        echo "Failed to compress: $f" >&2
    fi
done < <(find "$LOG_DIR" -type f -name '*.log' -mtime "+${COMPRESS_AFTER_DAYS}" -print0)
echo "Compressed ${COMPRESSED_COUNT} files."
# 2. Delete compressed logs older than M days
DELETED_COUNT=0
while IFS= read -r -d '' f; do
    echo "Deleting: $f"
    if rm -- "$f"; then
        DELETED_COUNT=$(( DELETED_COUNT + 1 )) # same set -e consideration as above
    else
        echo "Failed to delete: $f" >&2
    fi
done < <(find "$LOG_DIR" -type f -name '*.log.gz' -mtime "+${DELETE_AFTER_DAYS}" -print0)
echo "Deleted ${DELETED_COUNT} files."
Things to notice:
- Strict mode and IFS hardening.
- Defaults via ${VAR:-default} for everything tunable.
- find -print0 + while IFS= read -r -d '' f: the bulletproof iteration pattern. Handles every possible filename, including ones with newlines, spaces, glob chars, leading dashes.
- rm -- and gzip -- use -- as end-of-options to handle filenames that start with -. You should always do this when passing variable filenames to commands that take options.
- Counting is done in the parent shell (process substitution, not pipes), so the counters update correctly.
- Error handling: each operation that can fail has an explicit non-fatal error path that logs to stderr and continues.
This is the production-grade form. The naive form using for f in $(ls *.log) would have broken in three different ways on this filesystem: spaces, leading-dash filenames, and the no-match case.
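One strict-mode detail is worth checking whenever you count inside a loop: an arithmetic command returns a non-zero exit status when its expression evaluates to 0, and a post-increment evaluates to the old value. The sketch below captures both statuses without letting either abort the shell; the variable names are illustrative:

```bash
n=0
post_status=0
(( n++ )) || post_status=$?      # expression value is 0 (the old n), so status is 1

m=0
m=$(( m + 1 ))                   # plain assignment always succeeds
assign_status=$?

echo "(( n++ )) from 0: status $post_status, n is now $n"
echo "m=\$(( m + 1 )): status $assign_status, m is now $m"
```

Under set -e, a bare (( n++ )) starting from zero would therefore stop the script, which is why the assignment form is the safer habit for counters.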
11. The ten loop idioms you should have in muscle memory
# 1. Iterate over files in a directory (with empty-glob protection)
for f in /path/*.log; do
    [ -e "$f" ] || continue
    process "$f"
done
# 2. Iterate over command-line arguments
for arg in "$@"; do
    echo "$arg"
done
# 3. C-style numeric loop
for ((i=0; i<10; i++)); do
    echo "$i"
done
# 4. Read a file line by line
while IFS= read -r line; do
    echo "Got: $line"
done < file.txt
# 5. Read CSV
while IFS=',' read -r col1 col2 col3; do
    echo "$col1 | $col2 | $col3"
done < data.csv
# 6. Iterate over the output of a command (variables persist)
while IFS= read -r line; do
    (( COUNT++ ))
done < <(my-command)
# 7. Find with NUL-safe iteration
while IFS= read -r -d '' f; do
    process "$f"
done < <(find . -type f -print0)
# 8. mapfile into array
mapfile -t LINES < file.txt
for line in "${LINES[@]}"; do
    echo "$line"
done
# 9. Retry loop with exponential backoff
for ((i=1; i<=5; i++)); do
    if my-command; then
        break
    fi
    sleep $((2 ** i))
done
# 10. Forever loop with controlled exit
while :; do
    if [[ -f /tmp/stop ]]; then break; fi
    do_one_iteration
    sleep 5
done
Internalise these and 90% of your loop-writing time disappears.
12. What you must internalise before lesson 5
- Why is for f in $(ls) always wrong? (Word splitting on IFS plus globbing breaks any filename with spaces, tabs, newlines, or glob chars.)
- What are the four correct ways to iterate over files? (Glob with empty-glob protection; find -print0 + while read -d ''; mapfile; xargs -0.)
- Why does cmd | while read line; do COUNT=...; done not update COUNT in the parent? (Each pipe stage runs in a subshell; subshell variables don’t propagate back.)
- What’s the fix? (Process substitution: done < <(cmd). Or mapfile -t arr < <(cmd) then iterate the array.)
- What’s the difference between read line and IFS= read -r line? (IFS= preserves leading/trailing whitespace; -r disables backslash interpretation. Almost always you want both.)
- When do you use the C-style for ((;;)) vs for-in? (C-style for numeric loops with arithmetic; for-in for word lists, arrays, and globs.)
- What does break 2 do? (Breaks out of two levels of nested loops.)
- What’s the difference between until X and while ! X? (Mechanically identical; pick whichever reads more naturally.)
- What’s nullglob? (A bash option that makes non-matching globs expand to nothing instead of leaving the literal pattern.)
- Why use -- when calling commands with variable filenames? (-- is end-of-options; protects against filenames that start with -.)
If any felt fuzzy, re-read. Lesson 5 (functions, scope, return) builds on all of this — every function is an exit-code-returning thing that often contains loops.
What’s next
Lesson 5 covers functions, local scope, the difference between return and exit, argument passing ($1, $@, $*), positional-parameter manipulation with shift, recursive functions, and the function-style for organising larger scripts (the main "$@" pattern). Bring everything from lessons 1–4.