Most bugs are in your error handling code

While reading What tools made you better programmer I came across a link to Error Handling in a Correctness-Critical Rust Project which included these two tidbits:

almost all (92%) of the catastrophic system failures are the result of incorrect handling of non-fatal errors explicitly signaled in software.

in 58% of the catastrophic failures, the underlying faults could easily have been detected through simple testing of error handling code.

Error handling in bash

I’ve been learning about approaches to error handling in bash. I found Error handling in BASH, Exit Shell Script Based on Process Exit Code, Writing Robust Bash Shell Scripts and Bash: Error handling all of which contained a nugget of useful information.

Particularly I learned about $LINENO, set -e (equiv: set -o errexit), set -o pipefail, set -E, set -u (equiv: set -o nounset), set -C (equiv: set -o noclobber), #!/bin/bash -eEu, trap, shopt -s expand_aliases, alias, $PIPESTATUS, subshells, unset, and more.

If you call exit without specifying a return code then $? is used, I didn’t know that.

I found this code snippet that demos a relatively race-condition free method of acquiring a lock file:

if ( set -o noclobber; echo "$$" > "$lockfile") 2> /dev/null; 
then
   trap 'rm -f "$lockfile"; exit $?' INT TERM EXIT

   critical-section
   
   rm -f "$lockfile"
   trap - INT TERM EXIT
else
   echo "Failed to acquire lockfile: $lockfile." 
   echo "Held by $(cat $lockfile)"
fi