Shell programming is a minefield of errors. It is easy to make mistakes. Given that shell scripts run and protect hundreds of billions of dollars worth of assets, it is important to avoid these easy-to-make mistakes and remember these easy-to-forget details.
Last year, I wrote a book on Linux command-line tips and tricks (see note below) and made several updates to it. Annoyingly, I continue to discover something new and important about the bash shell program almost every week. I did not want this to keep happening after I had ordered my author copy. The discoveries make me wonder what I have been doing all these years without knowing them.
sh and bash Are Not the Same
The Bourne shell (sh) program began its life in the 70s with the Unix operating system. Ubuntu Linux continues to ship an sh (nowadays a link to dash, a small POSIX shell) alongside its new avatar, the Bourne-Again shell (bash). If you type bash -version in the command line, it will display its version number. Try sh -version and you get an error. The two are different. While sh sticks to the old, minimal feature set, bash continues to be developed and has a lot more features.
It was my practice (in the late 90s) to run my shell scripts in SCO Unix with the sh command. I continued this in Ubuntu and found that a lot of online script examples did not work with it. (As a security measure, I never give the extension .sh or the +x permission to my scripts. My scripts remain anonymous with an innocuous
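Aware of this problem, a lot of script authors place the comment #!/bin/bash (the shebang) on the first line. When the script is executed directly, for example as ./myscript, this line makes the system run it with bash. It does not help if the script is passed explicitly to another shell: sh myscript still runs it with sh, which treats the first line as an ordinary comment.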
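Whether a script really runs under bash can be checked from inside the script itself. The following is a minimal sketch, assuming that BASH_VERSION is a variable that only bash sets; shell_name is a hypothetical name used for illustration.

```shell
#!/bin/bash
# BASH_VERSION is set by bash itself; other shells normally leave it unset.
if [ -n "${BASH_VERSION:-}" ]; then
    shell_name="bash"
else
    shell_name="some other shell"
fi
echo "This script is being run by: $shell_name"
```

Saving this in a file and starting it in different ways (directly, with bash, or with sh) shows which interpreter is actually in charge.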
Some fanatics use the comment #!/usr/bin/env bash instead as a more failsafe measure. They say that bash may not always be at /bin so it is better to let env find it. By this, they assume that env will always be found at /usr/bin. Seems overkill to me. If you are on Ubuntu, as many people are, then #!/bin/bash should do fine.
if Statements Are Not What They Seem to Be
The if statement in shell is very unusual.
if test-expression; then
The test-expression needs to return 0 (zero) to be true and any non-zero value to be false. In most languages, 1 (one) evaluates as true and 0 (zero) evaluates as false. Why does bash behave differently?
This is because shell scripts often need to determine how other programs have performed. They do this by reading the exit value of those programs. By convention, when a program exits without an error, it returns control to the invoking program (the shell program) with an exit code of 0 (zero). If it needs to exit after encountering an error, it returns with a non-zero exit code. To help in troubleshooting, program authors assign and document a specific meaning for each non-zero exit code.
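The exit code of the last command is available in the special parameter $?. A small sketch, assuming a Linux system where / exists and the made-up path /no/such/directory does not; the variable names are illustrative.

```shell
#!/bin/bash
# A command that succeeds exits with 0.
ls / > /dev/null 2>&1
ok_status=$?

# A command that fails exits with a non-zero code
# (the exact value depends on the program).
ls /no/such/directory > /dev/null 2>&1
fail_status=$?

echo "success: $ok_status, failure: $fail_status"
```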
Thus, in the if statement, the test-expression could be a program. If the program executed successfully and returned 0 to the shell, then the if statement behaves as if it evaluated to true. If the program exited with a non-zero value, then the if statement behaves as if it evaluated to false.
The if statement evaluates commands and checks their exit values. It does not evaluate expressions as true or false.
What you need to remember is that the if statement is not looking for the boolean values true and false. It is looking at the exit value of a command.
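Because if only looks at the exit value, any command can serve as the test. A minimal sketch using grep -q, which exits with 0 when it finds a match and with 1 when it does not; the sample text and the variable names are made up for illustration.

```shell
#!/bin/bash
# grep -q prints nothing; it only reports through its exit value.
if printf 'alpha\nbeta\n' | grep -q 'beta'; then
    beta_result="found"
else
    beta_result="missing"
fi

if printf 'alpha\nbeta\n' | grep -q 'gamma'; then
    gamma_result="found"
else
    gamma_result="missing"
fi

echo "beta: $beta_result, gamma: $gamma_result"
```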
Does this mean that if true; then will evaluate to false because it is not 0 (zero)? No!
That brings us to another strange feature of shell.
true is actually a command! It is not a boolean literal of the shell language. bash has it as a builtin, and in Ubuntu there is also a standalone program at /usr/bin/true. Either way, it exits with a return code of 0. There is also a false command (with a program at /usr/bin/false) and it exits with a return code of 1.
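The two commands are easy to inspect. A sketch, with branch variable names invented for illustration; type reports how the current shell resolves a command name, and its wording differs between shells.

```shell
#!/bin/bash
# Show how this shell resolves the two names.
type true
type false

# if takes the "then" branch for exit value 0 and the "else" branch otherwise.
if true; then true_branch="then"; else true_branch="else"; fi
if false; then false_branch="then"; else false_branch="else"; fi

echo "true -> $true_branch, false -> $false_branch"
```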
[ Is a Program
To test whether a file exists, you can use if [ -f the-file.ext ]; then. Here, the single bracket [ is not part of the language syntax. It is a command (bash has it as a builtin, and there is also a program at /usr/bin/[) and its arguments are -f, the-file.ext and the closing ].
What seems like keywords or programming constructs are actually programs.
To ensure that the [ command is executed properly, there has to be a space after the opening bracket and before the closing bracket. If you omit the first, you are not invoking the correct program. If you omit the latter, you fail to terminate the command with the correct closing argument.
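Since [ is a command, it also exists under a name spelled as a word: test. The two forms below do the same job. A sketch, assuming mktemp is available to create a scratch file; the variable names are illustrative.

```shell
#!/bin/bash
scratch_file=$(mktemp)    # an empty temporary file that certainly exists

# Form 1: [ with a closing ] as its last argument
if [ -f "$scratch_file" ]; then bracket_result="exists"; else bracket_result="missing"; fi

# Form 2: test, which takes the same arguments without the closing ]
if test -f "$scratch_file"; then test_result="exists"; else test_result="missing"; fi

rm -f "$scratch_file"

# After removal, the same check fails
if [ -f "$scratch_file" ]; then after_result="exists"; else after_result="missing"; fi

echo "$bracket_result $test_result $after_result"
```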
Beware of Space in String Comparisons
When you assign a value to a string variable, DO NOT leave any space before or after the = sign. If you do, it seems to the shell that the variable is a command and the = and the attempted value for the variable are its arguments.
When you test whether two strings are equal, DO leave a space before and after the = sign. If you do not, the [ program receives the whole comparison as one single argument. With a single argument, [ only checks that the string is not empty, so it exits with a value of 0 (zero). That means the if statement will always be forced to evaluate to true.
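sTest = "hello"    # wrong: the shell tries to run a command named sTest
sTest="hello"      # right: no spaces around = in an assignment
if [ "$sTest"="hellooooooooo" ]; then echo "always true"; fi    # wrong: one non-empty argument
if [ "$sTest" = "hello" ]; then echo "equal"; fi    # right: a real comparison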
[[ Is Not the Fail-Safe Version of [
Unlike [, which is a program, the [[ construct is a part of the shell language (of bash, that is; plain sh does not have it). Some misguided fellows on the Internet recommend that you replace all your [ evaluations with [[ ones. Do not follow this advice. The [[ brackets are used for a more literal evaluation of text strings. You do not have to quote everything.
- Word splitting and file name expansion are not performed. However, other forms of expansion, such as parameter expansion and command substitution, are performed.
- The = operator behaves the same way as the == operator does with [[.
- The == and != operators compare the text expression on the left with a pattern on the right.
- A pattern is a text string containing at least one wildcard character (* or ?) or a bracket expression [..]. A bracket expression encloses a set of characters or a range of characters (separated by a hyphen (-)) between the square brackets ([ and ]).
- A new =~ operator is available. (It cannot be used with [.) It compares the text on the left with a regular expression on the right. (It will exit with a return value of 2 if the regular expression is invalid.)
The =~ operator is great for matching substrings.
$ if [[ "Hello?" =~ ell ]]; then echo "Yes"; else echo "No"; fi
$ if [[ "Hello?" =~ ^Hell ]]; then echo "Yes"; else echo "No"; fi
$ if [[ "Hello?" =~ ?$ ]]; then echo "Yes"; else echo "No"; fi
$ if [[ "Hello?" =~ "?"$ ]]; then echo "Yes"; else echo "No"; fi
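The difference between pattern matching with == and regular-expression matching with =~ can be sketched as follows. The file name and the variable names are made up; BASH_REMATCH is the array in which bash stores the text matched by the regular expression and by its parenthesised groups. Keeping the regular expression in a variable is a choice made here to avoid quoting surprises.

```shell
#!/bin/bash
file_name="report-2023.txt"

# Unquoted right side: treated as a pattern, * matches any text
if [[ $file_name == report-*.txt ]]; then pattern_match="yes"; else pattern_match="no"; fi

# Quoted right side: compared literally, * is just an asterisk
if [[ $file_name == "report-*.txt" ]]; then literal_match="yes"; else literal_match="no"; fi

# =~ takes a regular expression; group 1 captures the four digits
year_regex='^report-([0-9]{4})\.txt$'
if [[ $file_name =~ $year_regex ]]; then year="${BASH_REMATCH[1]}"; else year="none"; fi

echo "pattern: $pattern_match, literal: $literal_match, year: $year"
```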
Both [ and [[ have their legitimate use-cases. Do not use one for the other.
| Operator use | Result |
| [ -f "$file" ] | Does it exist as a file? |
| [ -d "$file" ] | Does it exist as a directory? |
| [ -h "$file" ] | Does it exist as a soft link? |
| [ -r "$file" ] | Is the file readable? |
| [ -w "$file" ] | Is the file writeable? |
| [ -x "$file" ] | Is the file executable? |
| [ -z "$string" ] | Is the string empty? |
| [ -n "$string" ] | Is the string not empty? |
| [ "$string1" = "$string2" ] | Are the strings the same? (In bash, = is the same as ==.) |
| [ "$string1" != "$string2" ] | Are the strings different? |
| [ "$string1" \< "$string2" ] | Does the first string sort ahead of the second? (Escape the < so that it is not read as a redirection.) |
| [ "$string1" \> "$string2" ] | Does the first string sort after the second? (Escape the > so that it is not read as a redirection.) |
| [ n1 -eq n2 ] | Are the numbers the same? |
| [ n1 -ne n2 ] | Are the numbers different? |
| [ n1 -le n2 ] | Is n1 less than or equal to n2? |
| [ n1 -ge n2 ] | Is n1 greater than or equal to n2? |
| [ n1 -lt n2 ] | Is n1 less than n2? |
| [ n1 -gt n2 ] | Is n1 greater than n2? |
| [ ! e ] | Is the expression false? |
| [ e1 ] && [ e2 ] | Are both expressions true? |
| [ e1 ] || [ e2 ] | Is one of the expressions true? |
Do not use the -a and -o logical operators. You will make mistakes reading and writing them. They are the sh way of doing things. Separate square brackets joined by the operators && and || are so much easier to read.
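Combining separate tests with && and || can look like this. A sketch with made-up values; the variable names are only for illustration.

```shell
#!/bin/bash
count=5
user_name="alice"

# Both tests must succeed: is count between 1 and 10?
if [ "$count" -ge 1 ] && [ "$count" -le 10 ]; then
    range_check="in range"
else
    range_check="out of range"
fi

# Either test may succeed: is the name empty or equal to root?
if [ -z "$user_name" ] || [ "$user_name" = "root" ]; then
    name_check="rejected"
else
    name_check="accepted"
fi

echo "$range_check, $name_check"
```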
Arithmetic Operations Are Not Straight-Forward
If you set a=1 and then try a=a+1, will $a echo as 2 or 11? The answer is a+1. Until a few years back, I did not know how to perform arithmetic operations in bash. I never had to so I never learned it. I just assumed that it must be the same as in other languages but it was not to be. To add one to a, you can use:
a=$(( a+1 ))
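The possible outcomes can be compared side by side. A minimal sketch; the result variable names are invented for illustration.

```shell
#!/bin/bash
a=1
a=a+1                 # plain assignment: the text a+1 is stored as-is
literal_result=$a

a=1
a=$a+1                # $a is expanded first, then the text 1+1 is stored
concat_result=$a

a=1
a=$(( a + 1 ))        # arithmetic expansion: the expression is calculated
sum_result=$a

echo "$literal_result | $concat_result | $sum_result"
```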
Array Operations Can Be Cryptic
Does every language out there need to have a totally different method to create and use arrays? Who is so evil? Why?
var=(hello world how are you)
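A few common operations on such an array, as a sketch. The array is the one from the line above; the other variable names are illustrative.

```shell
#!/bin/bash
var=(hello world how are you)

first_word=${var[0]}          # indexes start at 0
word_count=${#var[@]}         # number of elements

var+=(today)                  # append one element
all_words="${var[*]}"         # all elements joined by spaces

# Quote "${var[@]}" so that each element stays one word.
# Find the first of the longest words.
longest=""
for word in "${var[@]}"; do
    if [ "${#word}" -gt "${#longest}" ]; then longest=$word; fi
done

echo "$first_word, $word_count, $all_words, $longest"
```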
Bash Is a Minefield of Careless Errors
A shell script will execute like there is no tomorrow, irrespective of any errors it encounters. If a statement encounters an error and exits with a non-zero exit code, bash is happy to display any error message it wants but will nonchalantly continue to execute the subsequent statements.
If you try to use an undefined variable, bash will not treat it as an error. bash will substitute an empty string and proceed. If you try sudo rm -rf $non_existent_variable/, the command evaluates to sudo rm -rf /. (Variable names cannot contain hyphens, hence the underscores.) I have not tried it yet. GNU rm refuses to operate on / by default, but a path such as $non_existent_variable/* gets no such protection.
These shell behaviours are extremely dangerous. To fail early, place the following statement at the top of your scripts.
set -eu
That is, after the #!/bin/bash line. The option -u prevents the use of undefined variables. The option -e stops execution of the script when an error is encountered. This is convenient when you are building your scripts. Its disadvantage is that your code will never get a chance to evaluate the error code of the previous statement. If you are using if-else constructs in which you check previous error codes, then use only set -u. There is also a -x option, which prints each command before it is executed and helps in tracing errors.
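The effect of the -e and -u options can be observed safely by running small snippets in a child bash, so that the outer script survives. A sketch, assuming bash is on the PATH; undefined_name is deliberately never set, and the other variable names are illustrative.

```shell
#!/bin/bash
# Default behaviour: the failure of false is ignored and the script carries on.
default_output=$(bash -c 'false; echo "still running"')

# With -e: the child shell stops at the first failing command.
errexit_output=$(bash -c 'set -e; false; echo "still running"')
errexit_status=$?

# With -u: expanding an undefined variable is a fatal error.
nounset_output=$(bash -c 'set -u; echo "value: $undefined_name"' 2>/dev/null)
nounset_status=$?

echo "default: '$default_output'"
echo "set -e:  '$errexit_output' (exit $errexit_status)"
echo "set -u:  '$nounset_output' (exit $nounset_status)"
```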
It is not possible to cover all bash secrets in one article. In the next article, I will cover how bash performs text expansions, substitutions and removals.
- This article was originally published in the Open Source For You magazine in 2022. I re-posted it on CodeProject in 2023.
- I made the ebook version of Linux Command-Line Tips & Tricks free on many ebook stores. This article has been sourced from it.