Bash common mistakes and useful tricks

Never use backticks

  • ❌ wrong

    # backticks are harder to read and easy to mistake for single quotes
    # they cannot be nested without escaping the inner backticks
    result=`call_command_in_subshell`
  • ✔️ correct

    result=$(call_command_in_subshell)

Multiline pipe

  • ❌ incorrect

    # hard to read
    ls ${long_list_of_parameters} | grep "${foo}" | grep -v grep | pgrep | wc -l | sort | uniq
  • ✔️ correct

    ls ${long_list_of_parameters} \
        | grep "${foo}" \
        | grep -v grep \
        | pgrep \
        | wc -l \
        | sort \
        | uniq

Avoid overusing grep and grep -v

  • ❌ incorrect

    ps ax | grep "${processname}" | grep -v grep
  • ✔️ correct (with userland utilities):

    pgrep "${processname}"

Use an array instead of awk(1) to print an element

  • ❌ incorrect

    echo "${listofthings}" | awk '{ print $3 }' # get the third item
  • ✔️ correct

    listofthings=(${listofthings}) # convert to array (relies on word splitting)
    echo "${listofthings[2]}" # get the third item (indices start at 0)

Use built in variable expansion instead of sed/awk

  • ❌ incorrect

    VAR=FOO
    printf '%s\n' "${VAR}" | awk '{print tolower($0)}' # foo
  • ✔️ correct

    # ${VAR^}  # uppercase the first character
    # ${VAR^^} # uppercase all characters
    # ${VAR,}  # lowercase the first character
    # ${VAR,,} # lowercase all characters
    # ${VAR~}  # swap the case of the first character
    # ${VAR~~} # swap the case of all characters

    VAR=BAR
    printf '%s\n' "${VAR,,}" # bar
  • The same applies to string replacement.

    # ${VAR/PATTERN/STRING}  # replace the first match
    # ${VAR//PATTERN/STRING} # replace all matches
    # use ${VAR#PATTERN}, ${VAR%PATTERN} and ${VAR/PATTERN} for string removal

    VAR=foofoobar
    echo "${VAR/foo/bar}"  # barfoobar
    echo "${VAR//foo/bar}" # barbarbar
    echo "${VAR//foo}"     # bar
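
The prefix and suffix removal forms (# and %) are named above without an example. A minimal sketch, assuming a made-up path that exists only for illustration, of how they can stand in for basename, dirname and similar calls:

```bash
path="/var/log/app/server.log.gz"   # hypothetical path, for illustration only

name="${path##*/}"   # remove the longest prefix matching */  -> server.log.gz
dir="${path%/*}"     # remove the shortest suffix matching /* -> /var/log/app
stem="${name%%.*}"   # remove the longest suffix matching .*  -> server
ext="${name#*.}"     # remove the shortest prefix matching *. -> log.gz

printf '%s\n' "${name}" "${dir}" "${stem}" "${ext}"
```

A single # or % removes the shortest match, a doubled ## or %% removes the longest one.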

Avoid seq for ranges

Use the built-in {x..y} brace expansion instead.

for k in {1..100}; do
    do_awesome_stuff_with_input "${k}"
done
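
One caveat worth knowing: brace expansion happens before variable expansion, so {x..y} only works with literal bounds. A small sketch, with first and last as made-up variable names, showing the pitfall and the C-style for loop that avoids seq when the bounds are variables:

```bash
first=3   # hypothetical bounds, for illustration only
last=6

# brace expansion runs before the variables are expanded, so no range is generated
literal=$(echo {$first..$last})

# the arithmetic for loop accepts variables as bounds
total=0
for (( k = first; k <= last; k++ )); do
    total=$(( total + k ))
done

printf '%s\n' "${literal}" "${total}"
```

Here literal ends up as the plain string {3..6}, while the loop visits 3, 4, 5 and 6.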

Timeouts

Bash arithmetic instead of expr
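
This heading has no body in the source. As a hedged sketch of what it most likely means, bash's $(( )) and (( )) constructs do integer arithmetic in the shell itself, without forking expr. The variable names are made up for illustration:

```bash
count=7

next=$(( count + 1 ))   # instead of: next=$(expr "${count}" + 1)
(( count += 5 ))        # in-place update, no command substitution needed

if (( count > 10 )); then
    big=yes
else
    big=no
fi

printf '%s %s %s\n' "${next}" "${count}" "${big}"
```

Inside (( )) and $(( )) variables can be written without the leading $.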

Never use bc(1) for modulo operations
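
This heading also has no body in the source. The probable reason is that bc's % operator depends on the current scale (with bc -l, 17 % 5 yields 0), whereas bash's built-in integer modulo has no such setting. A minimal sketch with illustrative values:

```bash
value=17

remainder=$(( value % 5 ))   # instead of: echo "${value} % 5" | bc
negative=$(( -7 % 3 ))       # the result takes the sign of the dividend

printf '%s %s\n' "${remainder}" "${negative}"
```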

disown

disown is a bash built-in that removes a job from the shell's job table. You can disown one or several background processes; the shell then no longer tracks them, so they no longer appear in jobs and are not sent SIGHUP when the shell exits.
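
A minimal sketch of the effect, using sleep as a stand-in for a real long-running job: the background process disappears from the job table after disown but keeps running.

```bash
sleep 30 &           # stand-in for a long-running background job
pid=$!

before=$(jobs -p)    # the job table still lists the background job
disown "${pid}"      # remove it from the job table
after=$(jobs -p)     # now empty: the shell no longer tracks the job

kill "${pid}"        # the process itself is still alive, so stop it explicitly
```

disown also accepts job specs such as %1, and -h keeps the job in the table while only suppressing SIGHUP.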

Basic parallelism

  • A common pattern is to send a process to the background with & and then call wait to block until it has finished.
  • Named pipes and files are then often used to communicate between the parent and child processes. Variables are only copied into the child when it starts, so changes made in the child never reach the parent.
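
The & and wait pattern can be sketched as follows. worker is a hypothetical function standing in for real work; the exit status of each job is collected through wait rather than through shared variables.

```bash
# hypothetical worker: sleeps briefly, then exits with the status it was given
worker() {
    sleep 1
    return "$1"
}

pids=()
for status in 0 3 0; do
    worker "${status}" &   # run in the background
    pids+=( "$!" )         # remember the PID of each job
done

failures=0
for pid in "${pids[@]}"; do
    # wait returns the exit status of the job it waited for
    wait "${pid}" || failures=$(( failures + 1 ))
done

printf 'failed jobs: %s\n' "${failures}"
```

Waiting on each PID individually keeps the per-job exit status, which a bare wait without arguments would discard.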

xargs

For file-based in-node parallelization, xargs is the easiest way to parallelize the processing of list elements.

# simple example: replace all occurrences of "foo" with "bar" in ".txt" files
# processes each file individually, with up to 16 processes in parallel
# (-n1 and -I{} are mutually exclusive in GNU xargs, so the file name is simply appended)
find . -name "*.txt" -print0 | xargs -0 -n1 -P16 sed -i 's/foo/bar/g'

# complex example: HDF5 repack for transparent compression of files
# find all ".h5" files in "${dirName}" and use up to 64 processes in parallel to independently compress them
find "${dirName}" -name "*.h5" -print0 | xargs -0 -n1 -P64 \
    sh -c 'echo "compress $1 ..." && \
        h5repack -i "$1" -o "$1.gz" -f GZIP=1 && mv "$1.gz" "$1"' _

coproc and GNU parallel

Trapping, exception handling and failing gracefully

  • trap is used for signal handling in bash. A generic error-handling function can be used like this:

    readonly banner="my first bash project >>"
    function fail() {
        # generic fail function for bash scripts
        # arg: 1 - custom error message
        # arg: 2 - file
        # arg: 3 - line number
        # arg: 4 - exit status
        echo "${banner} ERROR: ${1}." >&2
        [[ ${2+defined} && ${3+defined} && ${4+defined} ]] && \
            echo "${banner} file: ${2}, line number: ${3}, exit code: ${4}. exiting!" >&2

        # generic clean up code goes here (tempfiles, forked processes,..)

        exit 1
    }
    # KILL cannot be trapped, so TERM is handled in its place
    trap 'fail "caught signal"' HUP QUIT TERM

    # BASH_SOURCE[0] is the current script file; bash has no FILENAME variable
    do_stuff "${withinput}" || fail "did not do stuff correctly" "${BASH_SOURCE[0]}" "${LINENO}" "$?"

Trapping on EXIT instead of a specific signal is particularly useful for cleanup handlers since this executes the handler regardless of the reason for the script’s termination. This also includes reaching the end of your script and aborts due to set -e.
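
A minimal sketch of such an EXIT handler. scratch and cleanup are made-up names, and the script body runs in a subshell only so that the example can observe the cleanup afterwards.

```bash
scratch=$(mktemp -d)    # hypothetical scratch directory

(
    cleanup() { rm -rf "${scratch}"; }
    trap cleanup EXIT   # runs on normal exit, on 'exit N' and on set -e aborts

    touch "${scratch}/work.tmp"
    exit 3              # simulate a failure part-way through
)
status=$?

if [[ -e "${scratch}" ]]; then
    cleaned=no
else
    cleaned=yes
fi
printf 'exit status: %s, cleaned up: %s\n' "${status}" "${cleaned}"
```

The handler does not change the exit status unless it calls exit itself, so the caller still sees the original failure code.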

You don’t need cat

  • Sometimes cat is not available, but with bash you can read files anyhow.

    batterystatus=$(< /sys/class/power_supply/BAT0/status)
    printf "%s\n" "${batterystatus}"
  • Avoid cat where reading a file can be achieved through passing the file name as a parameter.

    • ❌ incorrect: cat "${FILENAME}" | grep -v ...
    • ✔️ correct: grep -v ... "${FILENAME}"

locking (file based)

  • flock(1) is a userland utility for managing file-based locking from within shell scripts. It supports exclusive and shared locks.
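
A minimal sketch of exclusive locking, assuming the flock(1) from util-linux is installed. The lock file is a temporary file here for illustration; a real script would use a fixed, well-known path so that all instances contend for the same lock.

```bash
lockfile=$(mktemp)      # hypothetical lock file; real scripts use a fixed path

exec 9>"${lockfile}"    # open the lock file on file descriptor 9
if flock -n 9; then first=acquired; else first=busy; fi

# a second, independent attempt fails while descriptor 9 holds the lock
if flock -n "${lockfile}" true; then second=acquired; else second=busy; fi

flock -u 9              # release the lock
if flock -n "${lockfile}" true; then third=acquired; else third=busy; fi

exec 9>&-               # close the descriptor
rm -f "${lockfile}"

printf '%s %s %s\n' "${first}" "${second}" "${third}"
```

-n makes flock fail immediately instead of blocking, and -s requests a shared lock instead of the default exclusive one.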

Use the getopts builtin for command-line parameters

printf "This script is: %s\n" "${0##*/}"

(( $# == 0 )) && {
    # no arguments
    printf "No options given: %s\n" "${OPTIND}"
    exit 1
}

log=""     # counter, incremented by each -l
table=""   # single value, the first -t wins
stores=( ) # array, one element per -s

# a ':' after a letter means that option takes an argument;
# the leading ':' makes getopts report errors through opt and OPTARG
while getopts ":dhls:t:" opt; do
    case "${opt}" in
        d) set -x ;;
        h) printf "Help page\n" ; exit ;;
        s) stores+=( "${OPTARG}" ) ;;
        t)
            if [ -z "${table}" ]; then
                table="${OPTARG}"
            fi
            ;;
        l) (( log += 1 )) ;;
        *)
            printf "\nInvalid option or missing argument: -%s\n" "${OPTARG}" >&2
            exit 1
            ;;
    esac
done

# enable debug output if -l was given at least twice
(( log >= 2 )) && {
    set -x ; log=""
}
[[ -z "${log}" ]] && unset log