Most programming languages have a set of "best practices" that should be followed when writing code in that language. However, I have not been able to find a comprehensive one for shell scripting, so I have decided to write my own, based on my experience writing shell scripts over the years.
A note on portability: Since I mainly write shell scripts to run on systems which have Bash 4.2 installed, I don't need to worry about portability much, but you might need to! The list below is written with Bash 4.2 (and other modern shells) in mind. If you are writing a portable script, some points will not apply. Needless to say, you should perform sufficient testing after making any changes based on this list :-)
Here is my list of best practices for shell scripting (in no particular order):
- Use functions
- Document your functions
- Use `shift` to read function arguments
- Declare your variables
- Quote all parameter expansions
- Use arrays where appropriate
- Use `"$@"` to refer to all arguments
- Use uppercase variable names for environment variables only
- Prefer shell builtins over external programs
- Avoid unnecessary pipelines
- Avoid parsing `ls`
- Use globbing
- Use null delimited output where possible
- Don't use backticks
- Use process substitution instead of creating temporary files
- Use `mktemp` if you have to create temporary files
- Use `[[` and `((` for test conditions
- Use commands in test conditions instead of exit status
- Use `set -e`
- Write error messages to stderr
Each one of the points above is described in some detail below.
- Use functions

Unless you're writing a very small script, use functions to modularise your code and make it more readable, reusable and maintainable. The template I use for all my scripts is shown below. As you can see, all code is written inside functions. The script starts off with a call to the `main` function.

```bash
#!/bin/bash
set -e

usage() {
    : # print usage message here
}

my_function() {
    : # function body goes here
}

main() {
    : # parse arguments and call functions here
}

main "$@"
```
- Document your functions

Add sufficient documentation to your functions to specify what they do and what arguments are required to invoke them. Here is an example:

```bash
# Processes a file.
# $1 - the name of the input file
# $2 - the name of the output file
process_file() {
    :
}
```
- Use `shift` to read function arguments

Instead of using `$1`, `$2` etc. to pick up function arguments, use `shift` as shown below. This makes it easier to reorder arguments if you change your mind later.

```bash
# Processes a file.
# $1 - the name of the input file
# $2 - the name of the output file
process_file() {
    local -r input_file="$1";  shift
    local -r output_file="$1"; shift
}
```
- Declare your variables

If your variable is an integer, declare it as such. Also, make all your variables `readonly` unless you intend to change their value later in your script. Use `local` for variables declared within functions. This helps convey your intent. If portability is a concern, use `typeset` instead of `declare`. Here are a few examples:

```bash
declare -r -i port_number=8080
declare -r -a my_array=( apple orange )

my_function() {
    local -r name=apple
}
```
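To see what `readonly` buys you, here is a minimal sketch (the variable name is just for illustration) showing the shell rejecting an accidental reassignment:

```bash
declare -r -i port_number=8080

# an accidental reassignment is caught by the shell:
port_number=9090
# bash: port_number: readonly variable
```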
- Quote all parameter expansions

To prevent word-splitting and file globbing you must quote all variable expansions. In particular, you must do this if you are dealing with filenames that may contain whitespace (or other special characters). Consider this example:

```bash
# create a file containing a space in its name
touch "foo bar"

declare -r my_file="foo bar"

# try rm-ing the file without quoting the variable
rm $my_file
# it fails because rm sees two arguments: "foo" and "bar"
# rm: cannot remove `foo': No such file or directory
# rm: cannot remove `bar': No such file or directory

# need to quote the variable
rm "$my_file"

# file globbing example:
msg="my pattern is *.txt"
echo $msg
# this is not quoted so *.txt will undergo expansion
# will print "my pattern is foo.txt bar.txt"

# need to quote it for correct output
echo "$msg"
```

It's good practice to quote all your variables. If you do need word-splitting, consider using an array instead. See the next point.
- Use arrays where appropriate

Don't store a collection of elements in a string. Use an array instead. For example:

```bash
# using a string to hold a collection
declare -r hosts="host1 host2 host3"
for host in $hosts # not quoting $hosts here, since we want word splitting
do
    echo "$host"
done

# use an array instead!
declare -r -a host_array=( host1 host2 host3 )
for host in "${host_array[@]}"
do
    echo "$host"
done
```
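Arrays are also the safe way to pass a whole collection to a command, since each element expands as a single argument even if it contains whitespace. A small sketch (the filenames are made up for illustration):

```bash
declare -a files=( "report 2015.txt" "notes.txt" )

# each element reaches rm as one argument, spaces and all
rm -- "${files[@]}"
```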
- Use `"$@"` to refer to all arguments

Don't use `$*`. Refer to my previous post: Difference between $*, $@, "$*" and "$@". Here is an example:

```bash
main() {
    # print each argument
    for i in "$@"
    do
        echo "$i"
    done
}

# pass all arguments to main
main "$@"
```
- Use uppercase variable names for environment variables only

My personal preference is that all variables should be lowercase, except for environment variables. For example:

```bash
declare -i port_number=8080

# JAVA_HOME and CLASSPATH are environment variables
"$JAVA_HOME"/bin/java -cp "$CLASSPATH" app.Main "$port_number"
```
- Prefer shell builtins over external programs

The shell has the ability to manipulate strings and perform simple arithmetic so you don't need to invoke programs like `cut` and `sed`. Here are a few examples:

```bash
declare -r my_file="/var/tmp/blah"

# instead of dirname, use:
declare -r file_dir="${my_file%/*}"

# instead of basename, use:
declare -r file_base="${my_file##*/}"

# instead of sed 's/blah/hello/', use:
declare -r new_file="${my_file/blah/hello}"

# instead of bc <<< "2+2", use:
echo $(( 2 + 2 ))

# instead of grepping a pattern in a string, use:
[[ $line =~ .*blah$ ]]

# instead of cut -d:, use an array:
IFS=: read -r -a arr <<< "one:two:three"
```
Note that an external program will usually perform better when operating on large files or large amounts of input.
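As an illustration, summing a column from a large file is much faster with a single `awk` process than with a bash read loop. The file name and two-column layout below are hypothetical:

```bash
# pure shell: one iteration per line, slow on big input
# (assuming a two-column "key value" file)
total=0
while read -r _ value
do
    total=$(( total + value ))
done < big_file.txt
echo "$total"

# one external process: much faster for large files
awk '{ total += $2 } END { print total }' big_file.txt
```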
- Avoid unnecessary pipelines

Pipelines add extra overhead to your script so try to keep your pipelines small. Common examples of useless pipelines involve `cat`, `echo` and `grep`, as shown below.

- Avoid unnecessary `cat`

If you are not familiar with the infamous Useless Use of Cat award, take a look here. The `cat` command should only be used for concatenating files, not for sending the contents of a file to another command.

```bash
# instead of
cat file | command

# use
command < file
```
- Avoid unnecessary `echo`

You should only use `echo` if you want to output some text to stdout, stderr, a file etc. If you want to send text to another command, don't `echo` it through a pipe! Use a here-string instead. Note that here-strings are not portable (but most modern shells support them) so use a heredoc if you are writing a portable script. (See my earlier post: Useless Use of Echo.)

```bash
# instead of
echo text | command

# use
command <<< text

# for portability, use a heredoc
command << END
text
END
```
- Avoid unnecessary `grep`

Piping from `grep` to `awk` or `sed` is unnecessary. Since both `awk` and `sed` can match patterns themselves, you don't need the `grep` in your pipeline. (Check out my previous post: Useless Use of Grep.)

```bash
# instead of
grep pattern file | awk '{print $1}'

# use
awk '/pattern/{print $1}' file

# instead of
grep pattern file | sed 's/foo/bar/g'

# use
sed -n '/pattern/{s/foo/bar/gp}' file
```
- Other unnecessary pipelines

Here are a few other examples:

```bash
# instead of
command | sort | uniq

# use
command | sort -u

# instead of
command | grep pattern | wc -l

# use
command | grep -c pattern
```
- Avoid parsing `ls`

The problem is that `ls` outputs filenames separated by newlines, so if you have a filename containing a newline character you won't be able to parse it correctly. It would be nice if `ls` could output null delimited filenames but, unfortunately, it can't. Instead of `ls`, use file globbing or an alternative command which outputs null terminated filenames, such as `find -print0`.
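For example, here is a sketch of both alternatives for looping over the `.txt` files in the current directory (the pattern is just illustrative):

```bash
# fragile: for f in $(ls *.txt) breaks on whitespace in filenames

# use globbing instead:
for f in *.txt
do
    echo "$f"
done

# or use null delimited find output:
find . -maxdepth 1 -type f -name '*.txt' -print0 |
while IFS= read -r -d $'\0' f
do
    echo "$f"
done
```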
- Use globbing

Globbing (or filename expansion) is the shell's way of generating a list of files matching a pattern. In bash, you can make globbing more powerful by enabling extended pattern matching operators using the `extglob` shell option. Also, enable `nullglob` so that you get an empty list if no matches are found. Globbing can be used instead of `find` in some cases and, once again, don't parse `ls`! Here are a couple of examples:

```bash
shopt -s nullglob
shopt -s extglob

# get all files with a .yyyymmdd.txt suffix
declare -a dated_files=( *.[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].txt )

# get all non-zip files
declare -a non_zip_files=( !(*.zip) )
```
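Beyond `!(...)`, `extglob` enables a few more operators. A short sketch of two of them (the patterns here are hypothetical examples, not from the post above):

```bash
shopt -s extglob

# +(pattern) matches one or more occurrences of the pattern
declare -a numbered_files=( file+([0-9]).txt )

# @(pattern|pattern) matches exactly one of the given patterns
declare -a archives=( *.@(tar|tgz|zip) )
```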
- Use null delimited output where possible

In order to correctly handle filenames containing whitespace and newline characters, you should use null delimited output, which results in each record being terminated by a NUL (`\000`) character instead of a newline. Most programs support this. For example, `find -print0` outputs filenames followed by a null character and `xargs -0` reads arguments separated by null characters.

```bash
# instead of
find . -type f -mtime +5 | xargs rm -f

# use
find . -type f -mtime +5 -print0 | xargs -0 rm -f

# looping over files
find . -type f -print0 |
while IFS= read -r -d $'\0' filename
do
    echo "$filename"
done
```
- Don't use backticks

Use `$(command)` instead of `` `command` `` because it is easier to nest multiple commands and makes your code more readable. Here is a simple example:

```bash
# ugly escaping required when using nested backticks
a=`command1 \`command2\``

# $(...) is cleaner
b=$(command1 $(command2))
```
- Use process substitution instead of creating temporary files

In most cases, if a command takes a file as input, the file can be replaced by the output of another command using process substitution: `<(command)`. This saves you from having to write out a temp file, pass that temp file to the command and finally delete the temp file. This is shown below:

```bash
# using temp files
command1 > file1
command2 > file2
diff file1 file2
rm file1 file2

# using process substitution
diff <(command1) <(command2)
```
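Process substitution is also handy for feeding a loop without a pipeline, so that variables set inside the loop keep their values (a pipeline would run the loop in a subshell). A minimal sketch, with `command1` standing in for any command:

```bash
# count the lines produced by command1;
# "count" survives because the loop runs in the current shell
count=0
while IFS= read -r line
do
    count=$(( count + 1 ))
done < <(command1)

echo "$count lines"
```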
- Use `mktemp` if you have to create temporary files

Try to avoid creating temporary files. If you must, use `mktemp` to create a temporary directory and then write your files to it. Make sure you remove the directory after you are done.

```bash
# set up a trap to delete the temp dir when the script exits
unset temp_dir
trap '[[ -d "$temp_dir" ]] && rm -rf "$temp_dir"' EXIT

# create the temp dir
declare -r temp_dir=$(mktemp -dt myapp.XXXXXX)

# write to the temp dir
command > "$temp_dir"/foo
```
- Use `[[` and `((` for test conditions

Prefer `[[ ... ]]` over `[ ... ]` because it is safer and provides a richer set of features. Use `(( ... ))` for arithmetic conditions because it allows you to perform comparisons using familiar mathematical operators such as `<` and `>` instead of `-lt` and `-gt`. Note that if you desire portability, you have to stick to the old-fashioned `[ ... ]`. Here are a few examples:

```bash
[[ $foo == "foo" ]] && echo "match" # don't need to quote the variable inside [[
[[ $foo == "a" && $bar == "a" ]] && echo "match"

declare -i num=5
(( num < 10 )) && echo "match" # don't need the $ on num inside ((
```
- Use commands in test conditions instead of exit status

If you want to check whether a command succeeded before doing something, use the command directly in the condition of your if-statement instead of checking the command's exit status.

```bash
# don't use exit status
grep -q pattern file
if (( $? == 0 ))
then
    echo "pattern was found"
fi

# use the command as the condition
if grep -q pattern file
then
    echo "pattern was found"
fi
```
- Use `set -e`

Put this at the top of your script. This tells the shell to exit the script as soon as any statement returns a non-zero exit code.
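A minimal sketch of the effect, using `false` as a stand-in for any failing command:

```bash
#!/bin/bash
set -e

false                # returns a non-zero exit code...
echo "never reached" # ...so the shell exits before this line runs
```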
- Write error messages to stderr

Error messages belong on stderr, not stdout.

```bash
echo "An error message" >&2
```
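A common convenience is to wrap this in a small helper function that reports the error and exits. A sketch; the name `die` is just a convention, and the `input_file` variable is hypothetical:

```bash
# print an error message to stderr and exit with a non-zero status
die() {
    echo "$1" >&2
    exit 1
}

[[ -f $input_file ]] || die "cannot find input file: $input_file"
```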
If you have any other suggestions for my list, please share them in the comments section below!