Showing posts with label bash. Show all posts

Saturday, August 26, 2023

Using "flock" to Prevent Multiple Instances of a Script from Running

In a previous post, I wrote about how you can use the lockfile command to ensure that only one instance of a script is running at a time. An alternative to lockfile is the flock command, which is used as follows:

flock /path/to/mylockfile cmd

By default, if the lock cannot be immediately acquired, flock will wait indefinitely until it becomes available. However, you can use the --nonblock (or -n) flag if you want flock to fail (with an exit code of 1) rather than wait if the lock cannot be immediately acquired. You can also specify how long flock should wait by passing in a --timeout in seconds.
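
As a rough sketch of how the wait limit can be used, the helper below wraps flock in a small function. The function name and its arguments are made up for illustration, and it assumes the util-linux flock described above is installed.

```shell
#!/bin/bash

# Hypothetical helper (not part of flock): runs a command under a lock,
# giving up if the lock cannot be acquired within the given time.
# $1 - seconds to wait for the lock
# $2 - path of the lock file
# $3... - the command to run
run_with_lock_timeout() {
    local -r wait_secs="$1"; shift
    local -r lock_file="$1"; shift
    # -w is the short form of --timeout; flock exits with status 1
    # if the lock is still held when the time runs out
    flock -w "$wait_secs" "$lock_file" "$@"
}
```

For example, run_with_lock_timeout 10 /path/to/mylockfile cmd would wait up to 10 seconds for the lock before giving up.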

A convenient form of flock often used within shell scripts is to use a file descriptor, as follows:

(
flock -n 9 || exit 1
# ... commands executed under lock ...
) 9>/path/to/mylockfile

If you want to prevent multiple instances of a shell script from running simultaneously, add the following boilerplate at the top of your script, which will cause the script to lock itself automatically on first run:

[ "${FLOCKER}" != "$0" ] && exec env FLOCKER="$0" flock -en "$0" "$0" "$@" || :

Related posts:
Using "lockfile" to Prevent Multiple Instances of a Script from Running
Retrying Commands in Shell Scripts
Executing a Shell Command with a Timeout

Saturday, December 24, 2016

Shell Scripting: Parsing options using getopt and getopts

This post shows how you can parse shell options using getopts and getopt.

Using getopts:

getopts is a bash built-in command. I find it a lot easier to use than getopt.

Here is an example of using getopts:

# options a and b are followed by a colon because they require arguments
while getopts "ha:b:" opt; do
  case "$opt" in
    h)
      echo "help"
      ;;
    a)
      option_a=$OPTARG
      ;;
    b)
      option_b=$OPTARG
      ;;
  esac
done
shift $((OPTIND-1))

echo "option_a: $option_a"
echo "option_b: $option_b"

# read positional parameters
echo "Param 1: $1"
echo "Param 2: $2"

Running it:

$ myscript.sh -a foo -b bar hello world
option_a: foo
option_b: bar
Param 1: hello
Param 2: world
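
The example above relies on getopts to print its own error messages. As a possible variation, the sketch below puts a leading colon in the option string, which switches getopts to silent mode so that the script can report bad usage itself. The function name parse_opts and the remaining array are hypothetical names chosen for this sketch.

```shell
#!/bin/bash

# Parses -a and -b (both take a value) and reports bad usage.
# Results are stored in option_a, option_b and the remaining array.
parse_opts() {
    # keep the getopts state local so the function can be called repeatedly
    local opt OPTIND=1 OPTARG
    option_a=""
    option_b=""
    while getopts ":a:b:" opt; do
        case "$opt" in
            a) option_a=$OPTARG ;;
            b) option_b=$OPTARG ;;
            # in silent mode, ":" means a required argument is missing
            :)  echo "Option -$OPTARG requires an argument" >&2; return 1 ;;
            # and "?" means the option is not in the option string
            \?) echo "Unknown option -$OPTARG" >&2; return 1 ;;
        esac
    done
    shift $((OPTIND-1))
    remaining=("$@")
}
```

Calling parse_opts -a foo -b bar hello world should set option_a to foo, option_b to bar and leave hello and world in remaining, while parse_opts -z should fail with an error on stderr.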

Using getopt:

getopt supports long options and that's the only time I use it.

Here is an example of using getopt:

options=$(getopt -n "$0" -o ha:b: -l "help,alpha:,bravo:"  -- "$@")
(( $? != 0 )) && echo "Incorrect options provided" >&2 && exit 1
eval set -- "$options"

while true; do
  case "$1" in
    -h|--help)
        echo "help"
        ;;
    -a|--alpha)
        shift
        option_a="$1"
        ;;
    -b|--bravo)
        shift
        option_b="$1"
        ;;
    --)
        shift
        break
        ;;
  esac
  shift
done

echo "option_a: $option_a"
echo "option_b: $option_b"

# read positional parameters
echo "Param 1: $1"
echo "Param 2: $2"

Running it:

$ myscript.sh --alpha foo --bravo bar hello world
option_a: foo
option_b: bar
Param 1: hello
Param 2: world

Saturday, March 22, 2014

Bash Redirection and Piping Shortcuts

Redirecting both stdout and stderr

In order to redirect both standard output and standard error to a file, you would traditionally do this:

my_command > file 2>&1

A shorter way to write the same thing is by using &> (or >&) as shown below:

my_command &> file

Similarly, to append both standard output and standard error to a file, use &>>:

my_command &>> file

Piping both stdout and stderr

To pipe both standard output and standard error, you would traditionally do this:

my_command 2>&1 | another_command

A shorter way is to use |& as shown below:

my_command |& another_command
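
To see the difference, here is a small sketch that counts the lines arriving at the other end of the pipe. The noisy function is a made-up stand-in for a real command, and |& needs Bash 4 or later.

```shell
#!/bin/bash

# Hypothetical command that writes one line to each stream.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# A plain pipe carries stdout only (stderr is discarded here
# just to keep the demo quiet).
plain=$(noisy 2>/dev/null | wc -l)

# The traditional form and the |& shortcut both send the two streams
# down the pipe.
traditional=$(noisy 2>&1 | wc -l)
shortcut=$(noisy |& wc -l)

echo "plain: $((plain)), traditional: $((traditional)), shortcut: $((shortcut))"
```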

Other posts you might like:
Shell Scripting - Best Practices
All posts with label: bash

Sunday, February 23, 2014

Using "lockfile" to Prevent Multiple Instances of a Script from Running

This post describes how you can ensure that only one instance of a script is running at a time, which is useful if your script:

  • uses significant CPU or IO and running multiple instances at the same time would risk overloading the system, or
  • writes to a file or other shared resource and running multiple instances at the same time would risk corrupting the resource

In order to prevent multiple instances of a script from running, your script must first acquire a "lock" and hold on to that lock until the script completes. If the script cannot acquire the lock, it must wait until the lock becomes available. So, how do you acquire a lock? There are different ways, but the simplest is to use the lockfile command to create a "semaphore file". This is shown in the snippet below:

#!/bin/bash
set -e

# waits until a lock is acquired and
# deletes the lock on exit.
# prevents multiple instances of the script from running
acquire_lock() {
    lock_file=/var/tmp/foo.lock
    echo "Acquiring lock ${lock_file}..."
    lockfile "${lock_file}"
    trap "rm -f ${lock_file} && echo Released lock ${lock_file}" INT TERM EXIT
    echo "Acquired lock"
}

acquire_lock
# do stuff

The acquire_lock function first invokes the lockfile command in order to create a file. If lockfile cannot create the file, it will keep trying forever until it does. You can use the -r option if you only want to retry a certain number of times. Once the file has been created, we need to ensure that it is deleted once the script completes or is terminated. This is done using the trap command, which deletes the file when the script completes or when the shell receives an interrupt or terminate signal. I also like to use set -e in all my scripts, which makes the script exit if any command fails. In this case, if lockfile fails, the script will exit and the trap will not be set.

lockfile can be used in other ways as well. For example, instead of preventing multiple instances of the entire script from running, you may want to use a more granular approach and use locks only around those parts of your script which are not safe to run concurrently.

Note that if you cannot use lockfile, there are alternatives, such as mkdir or flock, as described in BashFAQ/045.
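
As a minimal sketch of the mkdir alternative (not taken from BashFAQ/045 verbatim), the function below tries once and fails if another instance already holds the lock. The lock_dir path is a hypothetical name picked for this example.

```shell
#!/bin/bash

# lock_dir is an assumed location; change it to suit your script
lock_dir="${TMPDIR:-/tmp}/myscript.lock.d"

# Tries to acquire the lock once and removes it on exit.
# Returns non-zero if another instance already holds it.
acquire_lock() {
    # mkdir is atomic: it either creates the directory or fails
    if mkdir "$lock_dir" 2>/dev/null; then
        trap 'rmdir "$lock_dir"' EXIT
        return 0
    fi
    echo "Another instance holds $lock_dir" >&2
    return 1
}
```

Unlike lockfile, this sketch does not wait for the lock; wrap the call in a retry loop if you need that behaviour.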

Other posts you might like:
Shell Scripting - Best Practices
Retrying Commands in Shell Scripts
Executing a Shell Command with a Timeout

Saturday, February 08, 2014

Retrying Commands in Shell Scripts

There are many cases in which you may wish to retry a failed command a certain number of times. Examples are database failures, network communication failures or file IO problems.

The snippet below shows a simple method of retrying commands in bash:

#!/bin/bash

MAX_ATTEMPTS=5
attempt_num=1
until command || (( attempt_num == MAX_ATTEMPTS ))
do
    echo "Attempt $attempt_num failed! Trying again in $attempt_num seconds..."
    sleep $(( attempt_num++ ))
done

In this example, the command is attempted a maximum of five times and the interval between attempts is increased incrementally whenever the command fails. The time between the first and second attempt is 1 second, that between the second and third is 2 seconds and so on. If you want, you can change this to a constant interval or random exponential backoff instead.
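
For exponential backoff, one possible approach is to compute the delay in a small helper. The sketch below doubles the delay on each attempt up to a cap; the function name backoff_delay and the default cap of 60 seconds are assumptions made for this example, and it leaves out the random jitter.

```shell
#!/bin/bash

# Prints the delay in seconds to use before retry number $1 (1-based):
# 1, 2, 4, 8, ... capped at $2 seconds (default 60).
backoff_delay() {
    local -r -i attempt="$1"
    local -r -i cap="${2:-60}"
    local -i delay=$(( 2 ** (attempt - 1) ))
    if (( delay > cap )); then
        delay=$cap
    fi
    echo "$delay"
}
```

Inside the retry loop, sleep "$(backoff_delay "$attempt_num")" could then take the place of the incremental sleep, with attempt_num incremented separately.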

I have created a useful retry function (shown below) which allows me to retry commands from different places in my script without duplicating the retry logic. This function returns a non-zero exit code when all attempts have been exhausted.

#!/bin/bash

# Retries a command on failure.
# $1 - the max number of attempts
# $2... - the command to run
retry() {
    local -r -i max_attempts="$1"; shift
    # keep the command as an array so arguments containing spaces survive
    local -r -a cmd=("$@")
    local -i attempt_num=1

    until "${cmd[@]}"
    do
        if (( attempt_num == max_attempts ))
        then
            echo "Attempt $attempt_num failed and there are no more attempts left!"
            return 1
        else
            echo "Attempt $attempt_num failed! Trying again in $attempt_num seconds..."
            sleep $(( attempt_num++ ))
        fi
    done
}

# example usage:
retry 5 ls -ltr foo

Related Posts:
Executing a Shell Command with a Timeout
Retrying Operations in Java

Sunday, October 20, 2013

Shell Scripting - Best Practices

Most programming languages have a set of "best practices" that should be followed when writing code in that language. However, I have not been able to find a comprehensive list for shell scripting, so I have decided to write my own, based on my experience writing shell scripts over the years.

A note on portability: Since I mainly write shell scripts to run on systems which have Bash 4.2 installed, I don't need to worry about portability much, but you might need to! The list below is written with Bash 4.2 (and other modern shells) in mind. If you are writing a portable script, some points will not apply. Needless to say, you should perform sufficient testing after making any changes based on this list :-)

Here is my list of best practices for shell scripting (in no particular order):

  1. Use functions
  2. Document your functions
  3. Use shift to read function arguments
  4. Declare your variables
  5. Quote all parameter expansions
  6. Use arrays where appropriate
  7. Use "$@" to refer to all arguments
  8. Use uppercase variable names for environment variables only
  9. Prefer shell builtins over external programs
  10. Avoid unnecessary pipelines
  11. Avoid parsing ls
  12. Use globbing
  13. Use null delimited output where possible
  14. Don't use backticks
  15. Use process substitution instead of creating temporary files
  16. Use mktemp if you have to create temporary files
  17. Use [[ and (( for test conditions
  18. Use commands in test conditions instead of exit status
  19. Use set -e
  20. Write error messages to stderr

Each one of the points above is described in some detail below.

  1. Use functions

    Unless you're writing a very small script, use functions to modularise your code and make it more readable, reusable and maintainable. The template I use for all my scripts is shown below. As you can see, all code is written inside functions. The script starts off with a call to the main function.

    #!/bin/bash
    set -e
    
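    usage() {
        :  # print usage information here
    }
    
    my_function() {
        :  # placeholder body (bash does not allow an empty function)
    }
    
    main() {
        :  # script logic goes here
    }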
    
    main "$@"
    
  2. Document your functions

    Add sufficient documentation to your functions to specify what they do and what arguments are required to invoke them. Here is an example:

    # Processes a file.
    # $1 - the name of the input file
    # $2 - the name of the output file
    process_file(){
        :  # function body goes here
    }
    
  3. Use shift to read function arguments

    Instead of using $1, $2 etc to pick up function arguments, use shift as shown below. This makes it easier to reorder arguments, if you change your mind later.

    # Processes a file.
    # $1 - the name of the input file
    # $2 - the name of the output file
    process_file(){
        local -r input_file="$1";  shift
        local -r output_file="$1"; shift
    }
    
  4. Declare your variables

    If your variable is an integer, declare it as such. Also, make all your variables readonly unless you intend to change their value later in your script. Use local for variables declared within functions. This helps convey your intent. If portability is a concern, use typeset instead of declare. Here are a few examples:

    declare -r -i port_number=8080
    declare -r -a my_array=( apple orange )
    
    my_function() {
        local -r name=apple
    }
    
  5. Quote all parameter expansions

    To prevent word-splitting and file globbing you must quote all variable expansions. In particular, you must do this if you are dealing with filenames that may contain whitespace (or other special characters). Consider this example:

    # create a file containing a space in its name
    touch "foo bar"
    
    declare -r my_file="foo bar"
    
    # try rm-ing the file without quoting the variable
    rm  $my_file
    # it fails because rm sees two arguments: "foo" and "bar"
    # rm: cannot remove `foo': No such file or directory
    # rm: cannot remove `bar': No such file or directory
    
    # need to quote the variable
    rm "$my_file"
    
    # file globbing example:
    mesg="my pattern is *.txt"
    echo $mesg
    # this is not quoted so *.txt will undergo expansion
    # will print "my pattern is foo.txt bar.txt"
    
    # need to quote it for correct output
    echo "$mesg"
    
    

    It's good practice to quote all your variables. If you do need word-splitting, consider using an array instead. See the next point.
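
A quick way to check what a command actually receives is to print each argument separately. The show_args helper below is a hypothetical name used only for this sketch, which assumes the default IFS.

```shell
#!/bin/bash

# Prints each argument it receives on its own line, wrapped in <>.
show_args() {
    printf '<%s>\n' "$@"
}

my_file="foo bar"

unquoted=$(show_args $my_file)     # two arguments: <foo> and <bar>
quoted=$(show_args "$my_file")     # one argument:  <foo bar>

echo "$unquoted"
echo "$quoted"
```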

  6. Use arrays where appropriate

    Don't store a collection of elements in a string. Use an array instead. For example:

    # using a string to hold a collection
    declare -r hosts="host1 host2 host3"
    for host in $hosts  # not quoting $hosts here, since we want word splitting
    do
        echo "$host"
    done
    
    # use an array instead!
    declare -r -a host_array=( host1 host2 host3 )
    for host in "${host_array[@]}"
    do
        echo "$host"
    done
    
  7. Use "$@" to refer to all arguments

    Don't use $*. Refer to my previous post: Difference between $*, $@, "$*" and "$@". Here is an example:

    main() {
        # print each argument
        for i in "$@"
        do
            echo "$i"
        done
    }
    # pass all arguments to main
    main "$@"
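
To illustrate why "$@" is the safe choice, the sketch below counts how many arguments a function receives under each form. The helper names are made up for this example, and the default IFS is assumed.

```shell
#!/bin/bash

# Prints the number of arguments received.
count_args() {
    echo "$#"
}

show_difference() {
    count_args "$@"    # each original argument stays one word
    count_args "$*"    # all arguments are joined into a single word
    count_args $*      # unquoted: arguments are split again on whitespace
}

# two arguments, the first of which contains a space
show_difference "a b" c
```

With the two arguments shown, the three calls report 2, 1 and 3 respectively.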
    
  8. Use uppercase variable names for ENVIRONMENT variables only

    My personal preference is that all variables should be lowercase, except for environment variables. For example:

    declare -i port_number=8080
    
    # JAVA_HOME and CLASSPATH are environment variables
    "$JAVA_HOME"/bin/java -cp "$CLASSPATH" app.Main "$port_number"
    
  9. Prefer shell builtins over external programs

    The shell has the ability to manipulate strings and perform simple arithmetic so you don't need to invoke programs like cut and sed. Here are a few examples:

    declare -r my_file="/var/tmp/blah"
    
    # instead of dirname, use:
    declare -r file_dir="${my_file%/*}"
    
    # instead of basename, use:
    declare -r file_base="${my_file##*/}"
    
    # instead of sed 's/blah/hello/', use:
    declare -r new_file="${my_file/blah/hello}"
    
    # instead of bc <<< "2+2", use:
    echo $(( 2+2 ))
    
    # instead of grepping a pattern in a string, use:
    [[ $line =~ .*blah$ ]]
    
    # instead of cut -d:, use an array:
    IFS=: read -a arr <<< "one:two:three"
    

    Note that an external program will perform better when operating on large files/input.

  10. Avoid unnecessary pipelines

    Pipelines add extra overhead to your script so try to keep your pipelines small. Common examples of useless pipelines are cat and echo, shown below:

    1. Avoid unnecessary cat

      If you are not familiar with the infamous Useless Use of Cat award, it is worth looking up. The cat command should only be used for concatenating files, not for sending the contents of a file to another command.

      # instead of
      cat file | command
      # use
      command < file
      
    2. Avoid unnecessary echo

      You should only use echo if you want to output some text to stdout, stderr, file etc. If you want to send text to another command, don't echo it through a pipe! Use a here-string instead. Note that here-strings are not portable (but most modern shells support them) so use a heredoc if you are writing a portable script. (See my earlier post: Useless Use of Echo.)

      # instead of
      echo text | command
      # use
      command <<< text
      
      # for portability, use a heredoc
      command << END
      text
      END
      
    3. Avoid unnecessary grep

      Piping from grep to awk or sed is unnecessary. Since both awk and sed can grep, you don't need the grep in your pipeline. (Check out my previous post: Useless Use of Grep.)

      # instead of
      grep pattern file | awk '{print $1}'
      # use
      awk '/pattern/{print $1}' file
      
      # instead of
      grep pattern file | sed 's/foo/bar/g'
      # use
      sed -n '/pattern/{s/foo/bar/g;p;}' file
      
    4. Other unnecessary pipelines

      Here are a few other examples:

      # instead of
      command | sort | uniq
      # use
      command | sort -u
      
      # instead of
      command | grep pattern | wc -l
      # use
      command | grep -c pattern
      
  11. Avoid parsing ls

    The problem is that ls outputs filenames separated by newlines, so if you have a filename containing a newline character you won't be able to parse it correctly. Traditionally, ls has had no way to output null delimited filenames (recent GNU coreutils releases add a --zero option, but you cannot rely on it being available). Instead of ls, use file globbing or an alternative command which outputs null terminated filenames, such as find -print0.
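
As a small sketch of the globbing alternative, the function below counts the regular files in a directory without going through ls, so names containing spaces or newlines are handled correctly. The function name count_files is hypothetical.

```shell
#!/bin/bash

# Counts the regular files in directory $1 without parsing ls output.
count_files() {
    local -r dir="$1"
    local -i count=0
    local f
    for f in "$dir"/*; do
        # skips non-files, and the literal pattern when nothing matches
        [[ -f "$f" ]] || continue
        count+=1
    done
    echo "$count"
}
```

By contrast, ls "$dir" | wc -l would count a file whose name contains a newline twice.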

  12. Use globbing

    Globbing (or filename expansion) is the shell's way of generating a list of files matching a pattern. In bash, you can make globbing more powerful by enabling extended pattern matching operators using the extglob shell option. Also, enable nullglob so that you get an empty list if no matches are found. Globbing can be used instead of find in some cases and, once again, don't parse ls! Here are a couple of examples:

    
    shopt -s nullglob
    shopt -s extglob
    
    # get all files with a .yyyymmdd.txt suffix
    declare -a dated_files=( *.[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].txt )
    
    # get all non-zip files
    declare -a non_zip_files=( !(*.zip) )
    
    
  13. Use null delimited output where possible

    In order to correctly handle filenames containing whitespace and newline characters, you should use null delimited output, which results in each filename being terminated by a NUL (\000) character instead of a newline. Many programs support this. For example, find -print0 outputs filenames followed by a null character and xargs -0 reads arguments separated by null characters.

    # instead of
    find . -type f -mtime +5 | xargs rm -f
    # use
    find . -type f -mtime +5 -print0 | xargs -0 rm -f
    
    # looping over files
    find . -type f -print0 | while IFS= read -r -d $'\0' filename; do
        echo "$filename"
    done
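
One thing to watch for with the loop above is that the right-hand side of a pipe runs in a subshell, so variables set inside the loop are gone once it ends. If you need the results afterwards, a possible variation is to feed the loop using process substitution instead. The function and array names below are assumptions made for this sketch.

```shell
#!/bin/bash

# Collects all regular files under directory $1 into the files array.
collect_files() {
    local -r dir="$1"
    local f
    files=()
    # the loop runs in the current shell, so files survives after it ends
    while IFS= read -r -d '' f; do
        files+=("$f")
    done < <(find "$dir" -type f -print0)
}
```

After calling collect_files /some/dir, "${files[@]}" can be passed safely to other commands.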
    
  14. Don't use backticks

    Use $(command) instead of `command` because it is easier to nest multiple commands and makes your code more readable. Here is a simple example:

    # ugly escaping required when using nested backticks
    a=`command1 \`command2\``
    
    # $(...) is cleaner
    b=$(command1 $(command2))
    
  15. Use process substitution instead of creating temporary files

    In most cases, if a command takes a file as an input, the file can be replaced by the output of another command using process substitution: <(command). This saves you from having to write out a temp file, passing that temp file to the command and finally deleting the temp file. This is shown below:

    # using temp files
    command1 > file1
    command2 > file2
    diff file1 file2
    rm file1 file2
    
    # using process substitution
    diff <(command1) <(command2)
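
A more concrete sketch of the same idea: the hypothetical function below checks whether two files contain the same lines regardless of order, without writing sorted copies to disk.

```shell
#!/bin/bash

# Returns 0 if files $1 and $2 contain the same lines in any order.
same_lines() {
    diff <(sort "$1") <(sort "$2") > /dev/null
}
```

It can be used directly in a condition, e.g. if same_lines old.txt new.txt; then echo "no changes"; fi (the file names here are placeholders).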
    
  16. Use mktemp if you have to create temporary files

    Try to avoid creating temporary files. If you must, use mktemp to create a temporary directory and then write your files to it. Make sure you remove the directory after you are done.

    # set up a trap to delete the temp dir when the script exits
    unset temp_dir
    trap '[[ -d "$temp_dir" ]] && rm -rf "$temp_dir"' EXIT
    
    # create the temp dir
    declare -r temp_dir=$(mktemp -dt myapp.XXXXXX)
    
    # write to the temp dir
    command > "$temp_dir"/foo
    
  17. Use [[ and (( for test conditions

    Prefer [[ ... ]] over [ ... ] because it is safer and provides a richer set of features. Use (( ... )) for arithmetic conditions because it allows you to perform comparisons using familiar mathematical operators such as < and > instead of -lt and -gt. Note that if you desire portability, you have to stick to the old-fashioned [ ... ]. Here are a few examples:

    [[ $foo == "foo" ]] && echo "match"  # don't need to quote variable inside [[
    [[ $foo == "a" && $bar == "a" ]] && echo "match"
    
    declare -i num=5
    (( num < 10 )) && echo "match"       # don't need the $ on $num in ((
    
  18. Use commands in test conditions instead of exit status

    If you want to check whether a command succeeded before doing something, use the command directly in the condition of your if-statement instead of checking the command's exit status.

    
    # don't use exit status
    grep -q pattern file
    if (( $? == 0 ))
    then
        echo "pattern was found"
    fi
    
    # use the command as the condition
    if grep -q pattern file
    then
        echo "pattern was found"
    fi
    
  19. Use set -e

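    Put this at the top of your script. This tells the shell to exit as soon as a command returns a non-zero exit code. There are exceptions: a failing command does not trigger an exit when it is used as the condition of an if or while, is negated with !, or is part of an && or || list (other than the last command in the list).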

  20. Write error messages to stderr

    Error messages belong on stderr, not stdout.

    echo "An error message" >&2
    

If you have any other suggestions for my list, please share them in the comments section below!

Sunday, August 25, 2013

Executing a Shell Command with a Timeout

Sometimes you may want to kill a command if it has been running for more than a specific time limit. For example, a shell script connecting to a network resource may hang for a long period of time if the resource is unavailable and it would be desirable to kill it and send out an alert.

This post describes different ways of running commands with time limits.

1) GNU coreutils timeout command

The easiest way to run a command with a time limit is by using the timeout command from GNU coreutils. For example, to run a command with a timeout of 2 minutes:

$ timeout 2m /path/to/command with args
$ echo $?
124

If the command has not completed within the specified time limit, the timeout utility will kill it (by sending it a TERM signal) and then exit with status 124.
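
In a script, the 124 status can be used to tell a timeout apart from an ordinary failure. Here is a minimal sketch, assuming GNU timeout is available; the wrapper name run_with_limit is made up for illustration.

```shell
#!/bin/bash

# Runs a command with a time limit and reports a timeout on stderr.
# $1 - the time limit, in any form accepted by timeout (e.g. 10 or 2m)
# $2... - the command to run
run_with_limit() {
    local -r limit="$1"; shift
    timeout "$limit" "$@"
    local -r status=$?
    if (( status == 124 )); then
        echo "Command timed out after $limit" >&2
    fi
    # pass on the command's own exit status, or 124 on timeout
    return "$status"
}
```

This is where you could send out an alert instead of just printing a message.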

2) The expect command

Another way to run a command with a timeout is by using expect as shown below:

$ expect -c "
    set echo '-noecho';
    set timeout 10;
    spawn -noecho /path/to/command with args;
    expect timeout { exit 124 } eof { exit 0 }"
$ echo $?
124

In the example above, the timeout is set to 10 seconds and expect will exit with a status of 124 when the command exceeds this time limit. Otherwise, it will exit with a status of 0. Unfortunately, you lose the exit code of the command you are running.

3) Using a custom timeout script

If you cannot use the two approaches above, you can write your own timeout script. Mine is shown below. It first starts a "watchdog" process which keeps checking to see if the command is running by executing kill -0 periodically. If it is still running after the time limit has been exceeded, the watchdog kills it.

#!/bin/bash
while getopts "t:" opt; do
  case "$opt" in
      t) timeout=$OPTARG ;;
  esac
done
shift $((OPTIND-1))

start_watchdog(){
  timeout="$1"
  (( i = timeout ))
  while (( i > 0 ))
  do
    kill -0 $$ || exit 0
    sleep 1
    (( i -= 1 ))
  done

  echo "killing process after timeout of $timeout seconds"
  kill $$
}

start_watchdog "$timeout" 2>/dev/null &
exec "$@"

Example:

$ timeout.sh -t 2 sleep 5
killing process after timeout of 2 seconds
Terminated

Saturday, December 22, 2012

Useless Use of Echo

Most of us are familiar with the Useless Use of Cat Award which is awarded for unnecessary use of the cat command. For example, in nearly all cases, cat file | command arg can be rewritten as <file command arg.

In a similar vein, this post is about the useless use of the echo command. In nearly all cases:

echo string | command arg

can be rewritten using a heredoc:

command arg << END
string
END

or, using a here-string:

command arg <<< string

Note: Here-strings are not portable (but most modern shells support them) so use the heredoc alternative shown above if you are writing a portable script!

Thursday, August 09, 2012

Running a command on multiple hosts

There are different ways you can run a command on multiple machines.

1. For loop

If you want to execute the same command on a few hosts, you can use a for loop as shown below:

for host in host1 host2 host3
do
    ssh "$host" "hostname; who -b"
done

The example above iterates over a list of hosts and runs two commands on each one to print the name of the host and the time it was last booted.

2. While loop

If your list of hosts is stored in a file, you can use a while loop as shown below:

while IFS= read -r host
do
    ssh -n "$host" "hostname; who -b"
done < /tmp/myhosts

You must provide the -n option to ssh. Without it, ssh reads from standard input and consumes the rest of the hosts file, so the command runs only on the first host and then the loop terminates.
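
The effect is easy to reproduce without ssh. In the sketch below, the made-up functions greedy and polite stand in for ssh without and with -n: one drains standard input, the other reads from /dev/null instead.

```shell
#!/bin/bash

# Stands in for plain ssh: swallows whatever is left on stdin.
greedy() { cat > /dev/null; }

# Stands in for ssh -n: stdin is redirected from /dev/null.
polite() { cat < /dev/null > /dev/null; }

# Reads hosts from stdin, runs the given command once per host
# and prints how many iterations the loop completed.
count_iterations() {
    local -i n=0
    local host
    while IFS= read -r host; do
        n+=1
        "$@"
    done
    echo "$n"
}

printf 'host1\nhost2\nhost3\n' | count_iterations greedy
printf 'host1\nhost2\nhost3\n' | count_iterations polite
```

With greedy the loop stops after the first host because nothing is left for read to consume; with polite all three hosts are processed.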

3. Parallel ssh

Parallel ssh (pssh) allows you to run a command on several hosts at the same time and is much faster than using a sequential loop if the number of hosts is large. You can specify how many parallel processes it uses to ssh to the various hosts (default is 32).

$ pssh
Usage: pssh [OPTIONS] -h hosts.txt prog [arg0] ..

  -h --hosts   hosts file (each line "host[:port] [user]")
  -l --user    username (OPTIONAL)
  -p --par     max number of parallel threads (OPTIONAL)
  -o --outdir  output directory for stdout files (OPTIONAL)
  -t --timeout timeout in seconds to do ssh to a host (OPTIONAL)
  -v --verbose turn on warning and diagnostic messages (OPTIONAL)
  -O --options SSH options (OPTIONAL)

$ pssh -h /tmp/myhosts -o /tmp/output "hostname; who -b"

Saturday, August 04, 2012

bash error: value too great for base

I came across this interesting error today:

-bash: 08: value too great for base (error token is "08")

It was coming from a script which works out the previous month by extracting the current month from the current date and then decrementing it. The code looks like this:

today="$(date +%Y%m%d)"
month=${today:4:2}
prevmonth=$((--month))

This script throws an error only if the current month is 08 or 09. I found that the reason for this is that numbers starting with 0 are interpreted as octal numbers and 8 and 9 are not in the base-8 number system, hence the error. There are more details on the bash man page:

Constants with a leading 0 are interpreted as octal numbers. A leading 0x or 0X denotes hexadecimal. Otherwise, numbers take the form [base#]n, where base is a decimal number between 2 and 64 representing the arithmetic base, and n is a number in that base. If base# is omitted, then base 10 is used.

To fix this issue, I specified the base-10 prefix as shown below:

today="$(date +%Y%m%d)"
month=10#${today:4:2}
prevmonth=$((--month))
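
The same idea can be wrapped in a function. The sketch below is one possible version; the name prev_month is made up, and the wrap from January to December and the zero-padded output are my own assumptions about what such a script would want.

```shell
#!/bin/bash

# Prints the month before $1 as a two-digit number.
# $1 - a month from 01 to 12 (leading zero allowed)
prev_month() {
    # 10# forces base 10 so that 08 and 09 are not read as octal
    local -r -i month=$(( 10#$1 ))
    local -i prev=$(( month - 1 ))
    if (( prev == 0 )); then
        prev=12   # assumed behaviour: January wraps to December
    fi
    printf '%02d\n' "$prev"
}
```

For example, prev_month 08 prints 07 and prev_month 01 prints 12.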

Saturday, November 05, 2011

Regular Expressions in Bash

Traditionally, external tools such as grep, sed, awk and perl have been used to match a string against a regular expression, but the Bash shell has this functionality built into it as well!

In Bash, the =~ operator allows you to match a string on the left against an extended regular expression on the right. It returns 0 if the string matches the pattern, 1 if it does not, and 2 if the regular expression is syntactically incorrect. Capturing groups are saved in the array variable BASH_REMATCH, with the first element, Group 0, holding the portion of the string that matched the entire expression.

The following script matches a string against a regex and prints out the capturing groups:

#!/bin/bash

if [ $# -lt 2 ]
then
    echo "Usage: $0 regex string" >&2
    exit 1
fi

regex=$1
input=$2

if [[ $input =~ $regex ]]
then
    echo "$input matches regex: $regex"

    #print out capturing groups
    for (( i=0; i<${#BASH_REMATCH[@]}; i++))
    do
        echo -e "\tGroup[$i]: ${BASH_REMATCH[$i]}"
    done
else
    echo "$input does not match regex: $regex"
fi

Example usage:

sharfah@starship:~> matcher.sh '(.*)=(.*)' foo=bar
foo=bar matches regex: (.*)=(.*)
    Group[0]: foo=bar
    Group[1]: foo
    Group[2]: bar
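
The capturing groups can also be used directly in a script rather than just printed. Here is a small sketch that splits a key=value string; the function name split_pair and the variables key and value are hypothetical. Note that the regex is kept in a variable and expanded unquoted, because quoting the right-hand side of =~ makes Bash treat it as a literal string.

```shell
#!/bin/bash

# Splits a "key=value" string into the variables key and value.
# Returns non-zero if the input does not contain an equals sign.
split_pair() {
    local -r re='^([^=]+)=(.*)$'
    if [[ $1 =~ $re ]]; then
        key=${BASH_REMATCH[1]}
        value=${BASH_REMATCH[2]}
    else
        return 1
    fi
}
```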

Saturday, October 08, 2011

Splitting a large file into smaller pieces

If you have a large file and want to break it into smaller pieces, you can use the Unix split command. You can tell it what the prefix of each split file should be and it will then append an alphabetic (or numeric) suffix to the end of each name.

In the example below, I split a file containing 100,000 lines. I instruct split to use numeric suffixes (-d), put 10,000 lines in each split file (-l 10000) and use suffixes of length 3 (-a 3). As a result, ten split files are created, each with 10,000 lines.

$ ls
hugefile

$ wc -l hugefile
100000 hugefile

$ split -d -l 10000 -a 3 hugefile hugefile.split.

$ ls
hugefile                hugefile.split.005
hugefile.split.000      hugefile.split.006
hugefile.split.001      hugefile.split.007  
hugefile.split.002      hugefile.split.008
hugefile.split.003      hugefile.split.009
hugefile.split.004

$ wc -l *split*
 10000 hugefile.split.000
 10000 hugefile.split.001
 10000 hugefile.split.002
 10000 hugefile.split.003
 10000 hugefile.split.004
 10000 hugefile.split.005
 10000 hugefile.split.006
 10000 hugefile.split.007
 10000 hugefile.split.008
 10000 hugefile.split.009
100000 total
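
To put the pieces back together, you can concatenate them in order. The sketch below splits a file and then checks that the pieces add up to the original. The function name is made up, and it assumes a split that supports -d, such as the GNU coreutils version used above.

```shell
#!/bin/bash

# Splits a file into pieces and verifies that the pieces,
# concatenated in order, are identical to the original.
# $1 - the file to split
# $2 - the number of lines per piece
split_and_verify() {
    local -r input="$1"
    local -r -i lines="$2"
    local -r prefix="$input.split."
    split -d -l "$lines" -a 3 "$input" "$prefix"
    # the glob expands in sorted order, which matches the numeric suffixes
    cat "$prefix"* | cmp -s - "$input"
}
```

For example, split_and_verify hugefile 10000 should return 0 once the ten pieces have been written.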

Sunday, October 02, 2011

Better Bash Completion for Tmux

In my previous post, I wrote about how awesome tmux is for managing multiple terminals. However, even though it is widely used, I haven't been able to find a good Bash completion script for it. The tmux package does come with bash_completion_tmux.sh but this does not complete command options or command aliases. So I wrote a better version which completes tmux commands, aliases and their options. However, there is still room for improvement. It would be nice if it could complete session and window names too, but I haven't found the time to implement this yet.

Here is a demo:

$ tmux lis[TAB]
list-buffers   list-clients   list-commands
list-keys      list-panes     list-sessions  list-windows

$ tmux list-windows -[TAB]
-a -t

$ tmux list-windows -a
sharfah:0: less [180x82] [layout f0de,180x82,0,0]
sharfah:1: tmp [180x82] [layout f0de,180x82,0,0] (active)
sharfah:2: isengard [180x82] [layout f0de,180x82,0,0]
sharfah:3: java [180x82] [layout f0de,180x82,0,0]

My completion script is shown below. You need to source it in your Bash profile. Alternatively, save it to your Bash completion directory e.g. ~/.bash/.bash/.bash_completion.d and it should automatically get picked up.

The script is also available in my GitHub dotfiles repository. If you can improve it, fork it and send me a pull request!

#
# tmux completion
# by Fahd Shariff
#
_tmux() {
  # an array of commands and their options
  declare -A tmux_cmd_map
  tmux_cmd_map=( ["attach-session"]="-dr -t target-session" \
                 ["bind-key"]="-cnr -t key-table key command arguments" \
                 ["break-pane"]="-d -t target-pane" \
                 ["capture-pane"]="-b buffer-index -E end-line -S start-line -t target-pane" \
                 ["choose-buffer"]="-t target-window template" \
                 ["choose-client"]="-t target-window template" \
                 ["choose-session"]="-t target-window template" \
                 ["choose-window"]="-t target-window template" \
                 ["clear-history"]="-t target-pane" \
                 ["clock-mode"]="-t target-pane" \
                 ["command-prompt"]="-I inputs -p prompts -t target-client template" \
                 ["confirm-before"]="-p prompt -t target-client command" \
                 ["copy-mode"]="-u -t target-pane" \
                 ["delete-buffer"]="-b buffer-index" \
                 ["detach-client"]="-P -s target-session -t target-client" \
                 ["display-message"]="-p -c target-client -t target-pane message" \
                 ["display-panes"]="-t target-client" \
                 ["find-window"]="-t target-window match-string" \
                 ["has-session"]="-t target-session" \
                 ["if-shell"]="shell-command command" \
                 ["join-pane"]="-dhv -p percentage|-l size -s src-pane -t dst-pane" \
                 ["kill-pane"]="-a -t target-pane" \
                 ["kill-server"]="kill-server" \
                 ["kill-session"]="-t target-session" \
                 ["kill-window"]="-t target-window" \
                 ["last-pane"]="-t target-window" \
                 ["last-window"]="-t target-session" \
                 ["link-window"]="-dk -s src-window -t dst-window" \
                 ["list-buffers"]="list-buffers" \
                 ["list-clients"]="-t target-session" \
                 ["list-commands"]="list-commands" \
                 ["list-keys"]="-t key-table" \
                 ["list-panes"]="-as -t target" \
                 ["list-sessions"]="list-sessions" \
                 ["list-windows"]="-a -t target-session" \
                 ["load-buffer"]="-b buffer-index path" \
                 ["lock-client"]="-t target-client" \
                 ["lock-server"]="lock-server" \
                 ["lock-session"]="-t target-session" \
                 ["move-window"]="-dk -s src-window -t dst-window" \
                 ["new-session"]="-d -n window-name -s session-name -t target-session -x width -y height command" \
                 ["new-window"]="-adk -n window-name -t target-window command" \
                 ["next-layout"]="-t target-window" \
                 ["next-window"]="-a -t target-session" \
                 ["paste-buffer"]="-dr -s separator -b buffer-index -t target-pane" \
                 ["pipe-pane"]="-o -t target-pane command" \
                 ["previous-layout"]="-t target-window" \
                 ["previous-window"]="-a -t target-session" \
                 ["refresh-client"]="-t target-client" \
                 ["rename-session"]="-t target-session new-name" \
                 ["rename-window"]="-t target-window new-name" \
                 ["resize-pane"]="-DLRU -t target-pane adjustment" \
                 ["respawn-pane"]="-k -t target-pane command" \
                 ["respawn-window"]="-k -t target-window command" \
                 ["rotate-window"]="-DU -t target-window" \
                 ["run-shell"]="command" \
                 ["save-buffer"]="-a -b buffer-index" \
                 ["select-layout"]="-np -t target-window layout-name" \
                 ["select-pane"]="-lDLRU -t target-pane" \
                 ["select-window"]="-lnp -t target-window" \
                 ["send-keys"]="-t target-pane key " \
                 ["send-prefix"]="-t target-pane" \
                 ["server-info"]="server-info" \
                 ["set-buffer"]="-b buffer-index data" \
                 ["set-environment"]="-gru -t target-session name value" \
                 ["set-option"]="-agsuw -t target-session|target-window option value" \
                 ["set-window-option"]="-agu -t target-window option value" \
                 ["show-buffer"]="-b buffer-index" \
                 ["show-environment"]="-g -t target-session" \
                 ["show-messages"]="-t target-client" \
                 ["show-options"]="-gsw -t target-session|target-window" \
                 ["show-window-options"]="-g -t target-window" \
                 ["source-file"]="path" \
                 ["split-window"]="-dhvP -p percentage|-l size -t target-pane command" \
                 ["start-server"]="start-server" \
                 ["suspend-client"]="-t target-client" \
                 ["swap-pane"]="-dDU -s src-pane -t dst-pane" \
                 ["swap-window"]="-d -s src-window -t dst-window" \
                 ["switch-client"]="-lnp -c target-client -t target-session" \
                 ["unbind-key"]="-acn -t key-table key" \
                 ["unlink-window"]="-k -t target-window" )

   declare -A tmux_alias_map
   tmux_alias_map=( ["attach"]="attach-session" \
                  ["detach"]="detach-client" \
                  ["has"]="has-session" \
                  ["lsc"]="list-clients" \
                  ["lscm"]="list-commands" \
                  ["ls"]="list-sessions" \
                  ["lockc"]="lock-client" \
                  ["locks"]="lock-session" \
                  ["new"]="new-session" \
                  ["refresh"]="refresh-client" \
                  ["rename"]="rename-session" \
                  ["showmsgs"]="show-messages" \
                  ["source"]="source-file" \
                  ["start"]="start-server" \
                  ["suspendc"]="suspend-client" \
                  ["switchc"]="switch-client" \
                  ["breakp"]="break-pane" \
                  ["capturep"]="capture-pane" \
                  ["displayp"]="display-panes" \
                  ["findw"]="find-window" \
                  ["joinp"]="join-pane" \
                  ["killp"]="kill-pane" \
                  ["killw"]="kill-window" \
                  ["lastp"]="last-pane" \
                  ["last"]="last-window" \
                  ["linkw"]="link-window" \
                  ["lsp"]="list-panes" \
                  ["lsw"]="list-windows" \
                  ["movew"]="move-window" \
                  ["neww"]="new-window" \
                  ["nextl"]="next-layout" \
                  ["next"]="next-window" \
                  ["pipep"]="pipe-pane" \
                  ["prevl"]="previous-layout" \
                  ["prev"]="previous-window" \
                  ["renamew"]="rename-window" \
                  ["resizep"]="resize-pane" \
                  ["respawnp"]="respawn-pane" \
                  ["respawnw"]="respawn-window" \
                  ["rotatew"]="rotate-window" \
                  ["selectl"]="select-layout" \
                  ["selectp"]="select-pane" \
                  ["selectw"]="select-window" \
                  ["splitw"]="split-window" \
                  ["swapp"]="swap-pane" \
                  ["swapw"]="swap-window" \
                  ["unlinkw"]="unlink-window" \
                  ["bind"]="bind-key" \
                  ["lsk"]="list-keys" \
                  ["send"]="send-keys" \
                  ["unbind"]="unbind-key" \
                  ["set"]="set-option" \
                  ["setw"]="set-window-option" \
                  ["show"]="show-options" \
                  ["showw"]="show-window-options" \
                  ["setenv"]="set-environment" \
                  ["showenv"]="show-environment" \
                  ["confirm"]="confirm-before" \
                  ["display"]="display-message" \
                  ["clearhist"]="clear-history" \
                  ["deleteb"]="delete-buffer" \
                  ["lsb"]="list-buffers" \
                  ["loadb"]="load-buffer" \
                  ["pasteb"]="paste-buffer" \
                  ["saveb"]="save-buffer" \
                  ["setb"]="set-buffer" \
                  ["showb"]="show-buffer" \
                  ["if"]="if-shell" \
                  ["lock"]="lock-server" \
                  ["run"]="run-shell" \
                  ["info"]="server-info" )

   local cur="${COMP_WORDS[COMP_CWORD]}"
   local prev="${COMP_WORDS[COMP_CWORD-1]}"
   COMPREPLY=()

   # completing an option
   if [[ "$cur" == -* ]]; then
     #tmux options
     if [[ "$prev" == "tmux" ]]; then
         COMPREPLY=( $( compgen -W "-2 -8 -c -f -L -l -q -S -u -v -V" -- $cur ) )
     else
         #find the tmux command so that we can complete the options
         local cmd="$prev"
         local i=$COMP_CWORD
         while [[ "$cmd" == -* ]]
         do
             cmd="${COMP_WORDS[i]}"
             ((i--))
         done

         #if it is an alias, look up what the alias maps to
         local alias_cmd=${tmux_alias_map[$cmd]}
         if [[ -n ${alias_cmd} ]]
         then
             cmd=${alias_cmd}
         fi

         #now work out the options to this command
         local opts=""
         for opt in ${tmux_cmd_map[$cmd]}
         do
              if [[ "$opt" == -* ]]; then
                  len=${#opt}
                  i=1
                  while [ $i -lt $len ]; do
                      opts="$opts -${opt:$i:1}"
                      ((i++))
                  done
              fi
         done
         COMPREPLY=($(compgen -W "$opts" -- ${cur}))
     fi
   else
     COMPREPLY=($(compgen -W "$(echo ${!tmux_cmd_map[@]} ${!tmux_alias_map[@]})" -- ${cur}))
   fi
   return 0
}
complete -F _tmux tmux
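A completion function can be exercised without pressing TAB by setting COMP_WORDS and COMP_CWORD by hand and then inspecting COMPREPLY. The following is a minimal sketch of that idea, using a cut-down, hypothetical _demo function with a two-entry command map in place of the full _tmux function above. It assumes Bash 4 or later for associative arrays.

```bash
# Hypothetical miniature of the technique above: a map of commands to options.
_demo() {
  declare -A demo_cmd_map=( ["attach-session"]="-dr -t target-session" \
                            ["kill-pane"]="-a -t target-pane" )
  local cur="${COMP_WORDS[COMP_CWORD]}"
  local cmd="${COMP_WORDS[1]}"
  COMPREPLY=()
  if [[ "$cur" == -* ]]; then
    # expand clustered flags such as -dr into "-d -r"
    local opts="" opt i
    for opt in ${demo_cmd_map[$cmd]}; do
      if [[ "$opt" == -* ]]; then
        for (( i = 1; i < ${#opt}; i++ )); do
          opts="$opts -${opt:$i:1}"
        done
      fi
    done
    COMPREPLY=( $(compgen -W "$opts" -- "$cur") )
  else
    # complete the command name from the keys of the map
    COMPREPLY=( $(compgen -W "${!demo_cmd_map[*]}" -- "$cur") )
  fi
}

# Simulate typing: demo attach-session -[TAB]
COMP_WORDS=(demo attach-session -)
COMP_CWORD=2
_demo
printf '%s\n' "${COMPREPLY[@]}"
```

This should list -d, -r and -t as the candidates for attach-session. The same trick works with _tmux once the script has been sourced.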
Related posts:
Managing Multiple Terminals with Tmux
Writing your own Bash Completion Function

Saturday, October 01, 2011

Managing Multiple Terminals with Tmux

I've started using tmux, which is a "terminal multiplexer", similar to screen. It allows you to manage a number of terminals from a single screen. So, for example, instead of having 5 PuTTY windows cluttering up your desktop, you now have only one window, containing 5 terminals. If you close this window, you can simply open a new one and "attach" to your running tmux session, to get all your terminals back at the same state you left them in.

There are lots of cool things you can do with tmux. For example, you can split a terminal window horizontally or vertically into "panes". This allows you to look at files side by side, or simply watch a process in one pane while you do something else in another.

I took the following screenshot of tmux in action:


The status bar along the bottom shows that I have 5 terminal windows open. I am currently in the one labelled "1-demo" and within this window I have 4 panes, each running a different command.

There are quite a few key bindings to learn, but once you have mastered them you will be able to jump back and forth between windows, move them around and kill them without lifting your hands off the keyboard. You can also set your own key bindings for things you do frequently. For example, my Ctrl+b / binding splits my window vertically and opens up a specified man page on the right. My Ctrl+b S binding allows me to SSH to a server in a new window.

Here is my tmux configuration taken from ~/.tmux.conf which shows my key bindings and colour setup. You can download this file from my GitHub dotfiles repository.

bind | split-window -h
bind - split-window -v
bind _ split-window -v
bind R source-file ~/.tmux.conf \; display-message "tmux.conf reloaded!"

bind / command-prompt -p "man" "split-window -h 'man %%'"
bind S command-prompt -p "ssh" "new-window -n %1 'exec ssh %1'"
bind h split-window -h  "man tmux"

set -g terminal-overrides 'xterm*:smcup@:rmcup@'

set -g history-limit 9999

# Terminal emulator window title
set -g set-titles on
set -g set-titles-string '#S:#I.#P #W'

# notifications
setw -g monitor-activity on
setw -g visual-activity on

# auto rename
setw -g automatic-rename on

# Clock
setw -g clock-mode-colour green
setw -g clock-mode-style 24

# Window status colors
setw -g window-status-bg colour235
setw -g window-status-fg colour248
setw -g window-status-alert-attr underscore
setw -g window-status-alert-bg colour235
setw -g window-status-alert-fg colour248
setw -g window-status-current-attr bright
setw -g window-status-current-bg colour235
setw -g window-status-current-fg colour248

# Message/command input colors
set -g message-bg colour240
set -g message-fg yellow
set -g message-attr bright

# Status Bar
set -g status-bg colour235
set -g status-fg colour248
set -g status-interval 1
set -g status-left '[#H]'
set -g status-right ''

set -g pane-border-fg white
set -g pane-border-bg default
set -g pane-active-border-fg white
set -g pane-active-border-bg default

Sunday, September 25, 2011

Speeding up Bash Profile Load Time

I started noticing a considerable delay whenever opening a new terminal or connecting to another server. After profiling my Bash profile with a few time commands, I discovered that the slowest part was the loading of the completion file:
$ time  ~/.bash/.bash_completion

real    0m0.457s
user    0m0.183s
sys     0m0.276s
The Bash completion script I use is from http://bash-completion.alioth.debian.org. I found an existing bug report for this issue, #467231: "bash_completion is big and loads slowly; load-by-need proposed", in which someone has submitted a script called dyncomp.sh that speeds up Bash completion load time.

This is a one-time script, which only needs to be run when you install your Bash completions or modify them. It loads your completions and moves the completion functions out of the script and into a separate directory. They are only loaded when needed. This speeds up the load time considerably and new terminal windows open up instantly!

$ time  ~/.bash/.bash_dyncompletion

real    0m0.020s
user    0m0.018s
sys     0m0.002s
You can visit my GitHub dotfiles repository for the latest version of my Bash profile.
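
The load-on-demand idea can be illustrated with a stub function that sources the real definition the first time it is called. This is only a sketch of the general technique, not the dyncomp.sh script itself; the temporary directory and the _hello function are made up for the example.

```bash
# Directory standing in for the place where split-out functions are stored.
lazy_dir=$(mktemp -d)

# Pretend this file was split out of a large completion script.
cat > "$lazy_dir/_hello" <<'EOF'
_hello() { echo "real _hello loaded"; }
EOF

# Stub: on first call, source the real definition (which replaces this stub),
# then call the real function with the same arguments.
_hello() {
  source "$lazy_dir/_hello"
  _hello "$@"
}

_hello   # first call loads the file, then runs the real function
_hello   # later calls go straight to the real function

rm -rf "$lazy_dir"
```

Defining a one-line stub is much cheaper than parsing the full function body, which is why the profile loads faster when there are many such functions.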

Saturday, August 20, 2011

LESSOPEN Powers Up Less

A really useful feature of the Unix less pager is LESSOPEN which is the "input preprocessor" for less. This is a script, defined in the LESSOPEN environment variable, which is invoked before the file is opened. It gives you the chance to modify the way the contents of the file are displayed. Why would you want to do this? The most common reason is to uncompress files before you view them, allowing you to less GZ files. But it also allows you to list the contents of zip files and other archives. I like to use it to format XML files and to view Java class files by invoking jad.

You can download a really useful LESSOPEN script from http://sourceforge.net/projects/lesspipe/ and then extend it if necessary.

To use it, simply add export LESSOPEN="|/path/to/bin/lesspipe.sh %s" to your bashrc.

You can then less:

  • directories
  • compressed files
  • archives, to list the files contained in them
  • files contained in archives e.g. less foo.zip:bar.txt
  • binary files
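
To give an idea of what an input preprocessor does, here is a minimal sketch written as a shell function. It is not the lesspipe.sh script, and the name my_lessfilter is hypothetical. When LESSOPEN starts with a pipe character, less shows whatever the preprocessor writes to standard output; if nothing is written, less falls back to the original file.

```bash
# Hypothetical miniature preprocessor: $1 is the file less was asked to open.
my_lessfilter() {
  case "$1" in
    *.gz)  gzip -dc -- "$1" ;;   # show the uncompressed contents
    *.tar) tar -tf "$1" ;;       # list the files in the archive
    *)     : ;;                  # print nothing, so less shows the file as-is
  esac
}

# Try it on a small gzip file created for the demo.
demo_dir=$(mktemp -d)
printf 'hello from a gzip file\n' > "$demo_dir/note.txt"
gzip "$demo_dir/note.txt"
my_lessfilter "$demo_dir/note.txt.gz"
```

To use something like this with less, the body of the function would go into an executable script, referenced from LESSOPEN in the same way as lesspipe.sh above.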

Saturday, August 13, 2011

Dotfiles in Git

I've added all my dotfiles (including my entire bash profile and vimrc) to my GitHub dotfiles repository. Whenever I make any changes, I will commit them to the repository.

To download the latest version, go to my Downloads page. Alternatively, if you have git installed, use the following command to clone my repository:

git clone git://github.com/sharfah/dotfiles.git
This will download them to a directory called dotfiles. You can then copy the files recursively (cp -r) to your home directory (don't forget to back up your original files first!). Alternatively, use symlinks.
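
The symlink approach could look something like the sketch below. The two temporary directories stand in for the cloned dotfiles directory and your home directory, so nothing real is touched; the .bak suffix for backups is also just an assumption for the example.

```bash
repo=$(mktemp -d)     # stands in for the cloned dotfiles directory
target=$(mktemp -d)   # stands in for $HOME
touch "$repo/.bashrc" "$repo/.vimrc"
echo "old settings" > "$target/.bashrc"   # a pre-existing file to back up

for f in "$repo"/.[!.]*; do
  name=${f##*/}
  # skip the repository metadata
  [ "$name" = ".git" ] && continue
  # back up an existing real file before replacing it with a link
  if [ -e "$target/$name" ] && [ ! -L "$target/$name" ]; then
    mv "$target/$name" "$target/$name.bak"
  fi
  ln -sfn "$f" "$target/$name"
done

ls -la "$target"
```

With symlinks in place, pulling new changes into the repository updates your live configuration without copying anything again.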

Saturday, August 06, 2011

My Bash Profile - Part VI: Inputrc

inputrc is the name of the readline startup file. You can set key bindings and certain variables in this file. One of my favourite key bindings is Alt+L, which runs ls -ltrF. I also have bindings which allow you to go back and forth across words using the Ctrl+Left/Right Arrow keys.

To take a look at all your current key bindings execute the command bind -P or bind -p. Check out the man pages for more information.

Update: My dotfiles are now in Git. For the latest version, please visit my GitHub dotfiles repository.

Here is my INPUTRC:

set bell-style none
set completion-ignore-case On
set echo-control-characters Off
set enable-keypad On
set mark-symlinked-directories On
set show-all-if-ambiguous On
set show-all-if-unmodified On
set skip-completed-text On
set visible-stats On

"\M-l": "ls -ltrF\r"
"\M-h": "dirs -v\r"

# If you type any text and press Up/Down,
# you can search your history for commands starting
# with that text
"\e[B": history-search-forward
"\e[A": history-search-backward

# Use Ctrl or Alt Arrow keys to move along words
"\C-[OD": backward-word
"\C-[OC": forward-word
"\e\e[C": forward-word
"\e\e[D": backward-word

"\M-r": forward-search-history
If you have any useful bindings, please share them in the comments section below.

Saturday, June 18, 2011

Efficiently Navigating Directories on UNIX

I find myself, like most developers, spending a lot of time navigating directories. Flipping back and forth between logs and application directories with long names can be quite tedious. So, with the help of a few new functions, aliases and config tweaks I've made the navigation process easier and more efficient. You no longer need to remember long paths because you can jump straight to them using their names. You can also choose to bookmark your favourite directories. Here is my setup:

1. Go up to a specific directory
I have a function called upto which allows you to jump up to any directory, on the current path, just by name. This is very useful if you are deep in a directory. I also have autocompletion for this function, so that it shows me valid directory names and completes them for me.

#
# Go up to the specified directory
#
upto(){
  if [ -z "$1" ]; then
      echo "Usage: upto [directory]"
      return 1
  fi
  local upto=$1
  cd "${PWD/\/$upto\/*//$upto}"
}

#
# Completion function for upto
#
_upto(){
  local cur=${COMP_WORDS[COMP_CWORD]}
  # turn the current path into a space-separated list of directory names
  local d=${PWD//\//\ }
  COMPREPLY=( $( compgen -W "$d" -- "$cur" ) )
}
complete -F _upto upto
Example:
[/www/public_html/animals/hippopotamus/habitat/swamps/images] $ upto h[TAB][TAB]
habitat       hippopotamus
[/www/public_html/animals/hippopotamus/habitat/swamps/images] $ upto hippopotamus
[/www/public_html/animals/hippopotamus] $
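
The work in upto is done by a single pattern substitution on the current path. The sketch below applies the same expansion to an ordinary variable (path, used here instead of PWD so that no directory change is needed) to show what it produces.

```bash
path=/www/public_html/animals/hippopotamus/habitat/swamps/images

# Replace "/hippopotamus/<everything after it>" with "/hippopotamus".
upto=hippopotamus
echo "${path/\/$upto\/*//$upto}"

# A name that is not a complete directory name on the path does not match,
# so the path comes back unchanged.
upto=hippo
echo "${path/\/$upto\/*//$upto}"

# The completion function uses a similar substitution: every "/" becomes
# a space, giving a word list for compgen.
words=${path//\// }
echo "$words"
```

The first echo prints /www/public_html/animals/hippopotamus, the second prints the original path, and the third prints the directory names separated by spaces.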
2. Go up a specific number of directories
If you know how many levels you want to go up, you can use the up function e.g. up 5 will move you up 5 directories.
#
# Go up a specified number of directories
#
up(){
  if [ -z "$1" ]
  then
    cd ..
    return
  fi
  local levels=$1
  local result="."
  while [ "$levels" -gt 0 ]
  do
    result=$result/..
    ((levels--))
  done
  cd "$result"
}
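To see the path that gets built, the loop can be pulled out into a helper that prints the result instead of changing directory. This is a sketch only: up_path is a hypothetical name, and the check that the argument is a number is an addition that is not in the function above.

```bash
# Hypothetical helper: print the relative path that "up N" would cd to.
up_path() {
  local levels=${1:-1}
  local result="."
  # reject anything that is not a non-negative integer
  if ! [[ $levels =~ ^[0-9]+$ ]]; then
    echo "Usage: up_path [levels]" >&2
    return 1
  fi
  levels=$((10#$levels))   # force base 10 so that 08 is not read as octal
  while (( levels > 0 )); do
    result=$result/..
    (( levels-- ))
  done
  echo "$result"
}

up_path 3
up_path
```

The first call prints ./../../.. and the second, with no argument, prints ./.. which matches the plain cd .. case.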
3. Go down to a specific directory
Sometimes you want to change to a directory but can't remember the path, or the path name is too long to type. I have a function called jd which allows you to jump down to a directory any level below the current one. It uses Bash's globstar feature so make sure you have it enabled (using shopt -s globstar). (Warning: this may be slow on large directory structures because of the searching involved.)
#
# Jumps to a directory at any level below.
# using globstar
#
jd(){
  if [ -z "$1" ]; then
      echo "Usage: jd [directory]";
      return 1
  else
      cd **/"$1"
  fi
}
Example:
[/www/public_html/animals/hippopotamus/habitat/swamps/images] $ upto hippopotamus
[/www/public_html/animals/hippopotamus] $ jd images
[/www/public_html/animals/hippopotamus/habitat/swamps/images] $
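If several directories below the current one share the same name, the glob expands to more than one path, and cd may then either fail or use only one of them, depending on the Bash version. The variant below is a sketch that collects the matches in an array and always uses the first one; jd_first is a hypothetical name, and the directory tree is created in a temporary location for the demo.

```bash
shopt -s globstar

# Hypothetical variant of jd that copes with zero or many matches.
jd_first() {
  local matches=( **/"$1"/ )   # trailing slash keeps directories only
  if [ ! -d "${matches[0]}" ]; then
    echo "jd_first: no directory named $1 below $PWD" >&2
    return 1
  fi
  cd "${matches[0]}"
}

demo=$(mktemp -d)
mkdir -p "$demo/habitat/swamps/images"
cd "$demo" && jd_first images && pwd
```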
4. CDPATH
The CDPATH variable is a colon-separated list of directories in which the shell looks for destination directories specified by the cd command. Mine is shown below. No matter what directory I am currently in, I can quickly jump to a project in my dev directory with cd <project> because ~/dev/ is on my CDPATH.
export CDPATH=".::..:../..:~:~/dev/"
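The effect is easy to try out in a throwaway location. In this sketch the temporary directory plays the role of the home directory and cdpath_demo_project is a made-up project name; it assumes there is no directory of that name directly under /.

```bash
base=$(mktemp -d)                       # stands in for the home directory
mkdir -p "$base/dev/cdpath_demo_project"

CDPATH=".:$base/dev"

cd /
# There is no /cdpath_demo_project, so cd searches the CDPATH entries
# and finds the directory under $base/dev.
cd cdpath_demo_project
echo "Now in: $PWD"
```

When cd resolves a directory through a CDPATH entry other than the current directory, it also prints the full path it changed to.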
5. Shell Options
I have set the following useful shell options in my .bashrc. The autocd option allows you to change to a directory without using the cd command and cdspell automatically corrects typos in directory names.
shopt -s cdspell     # correct dir spelling errors on cd
shopt -s autocd      # if a command is a dir name, cd to it
shopt -s cdable_vars # if cd arg is not a dir, assume it is a var
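The cdable_vars option is the least obvious of the three, so here is a small sketch of it. The variable name my_demo_dir is made up, and the example assumes there is no directory called my_demo_dir directly under /.

```bash
shopt -s cdable_vars

my_demo_dir=$(mktemp -d)   # a variable whose value is a directory

cd /
# "my_demo_dir" is not a directory in /, so cd falls back to the value
# of the variable with that name.
cd my_demo_dir
echo "Now in: $PWD"
```

This makes it possible to keep long paths in variables in your profile and change to them by name.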
6. Quick Aliases
alias ..='cd ..'
alias ...='cd ../..'
alias ....='cd ../../..'
alias .....='cd ../../../..'
alias ......='cd ../../../../..'
7. Keeping a history of visited directories
I came across a useful post on Linux Gazette: History of visited directories in BASH. It contains a script which maintains a history of directories you have visited and then allows you to switch to them easily using a reference number. The command cd -- shows you your history and cd -2 would take you to the second item in your history list. For example:
[/www/public_html/animals] $ cd --
 1  /tmp
 2  /www/public_html/animals/hippopotamus/habitat/swamps/images
 3  /www/public_html/animals/lion
[/www/public_html/animals] $ cd -2
[/www/public_html/animals/hippopotamus/habitat/swamps/images] $
8. Bookmarks
I spend a lot of time moving between different directories especially between logs and application directories. I have implemented a bookmarking feature which allows you to bookmark your favourite directories and then change to them easily.
  • bm: bookmark the current directory
  • bcd: change to the specified bookmark
  • brm: remove a bookmark
  • bcl: clear all bookmarks
  • bll: list all bookmarks
#-------------------------------
# Directory Bookmark Functions
#-------------------------------

#
# Add a bookmark, if it doesn't exist
#
bm(){
  local val=$(pwd)
  local i
  # quote the array so that paths containing spaces stay intact
  for i in "${bookmarks[@]}"
  do
    if [ "$i" == "$val" ]
    then
       return 1
    fi
  done
  local num=${#bookmarks[@]}
  bookmarks[$num]=$val
}

#
# Goto specified bookmark
# or previous one by default
#
bcd(){
  local index=$1
  if [ -z "$index" ]
  then
     index=$((${#bookmarks[@]}-1))
  fi
  local val=${bookmarks[$index]}
  if [ -z "$val" ]
  then
     echo "No such bookmark. Type bll to list bookmarks."
     return 1
  else
     cd "$val"
  fi
}

#
# Remove a bookmark
#
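brm(){
  if [ $# -lt 1 ]
  then
     echo "Usage: brm [bookmark-index]"
     return 1
  fi
  if [ -z "${bookmarks[$1]}" ]
  then
     echo "No such bookmark"
     return 1
  fi
  # rebuild the array without the removed entry; the quotes keep
  # paths containing spaces as single elements
  bookmarks=("${bookmarks[@]:0:$1}" "${bookmarks[@]:$(($1 + 1))}")
}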

#
# Remove all bookmarks
#
bcl(){
    bookmarks=()
}

#
# List all bookmarks
#
bll(){
  if [ ${#bookmarks[@]} -ne 0 ]
  then
     local i=0
     while [ $i -lt ${#bookmarks[@]} ]
     do
       echo "$i: ${bookmarks[$i]}"
       ((i++))
     done
  fi
  return 0
}
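
The removal step in brm relies on array slicing: the elements before the index and the elements after it are joined into a new array, which also renumbers the remaining bookmarks. The sketch below shows the idiom on its own, with made-up paths (one containing a space, to show why the quotes matter).

```bash
bookmarks=(/tmp "/var/log/my app" /etc)
idx=1

# Keep elements [0, idx) and (idx, end]; the quotes preserve each path
# as a single element even if it contains spaces.
bookmarks=( "${bookmarks[@]:0:idx}" "${bookmarks[@]:idx+1}" )

echo "Remaining: ${#bookmarks[@]}"
printf '%s\n' "${bookmarks[@]}"
```

After the removal the array holds /tmp at index 0 and /etc at index 1.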

If you have any useful directory-related functions, share them in the comments below!

Saturday, June 04, 2011

Associative Arrays in Bash 4

Associative arrays allow you to store key-value pairs and retrieve values using their keys. You can think of them as maps (or hashes) in other programming languages. The following example illustrates how associative arrays can be used to map countries to their capital cities:
#!/bin/bash

#declare an associative array
declare -A capital

#there are different ways to populate the array:
capital=([UK]="London" [Japan]="Tokyo")

#or...
capital[Germany]="Berlin"
capital[China]="Beijing"

#you can even append using +=
capital+=([Belgium]="Brussels" [Egypt]="Cairo")

#print the number of entries
echo "Size: ${#capital[@]}"

#retrieve the capital of Germany
echo "Capital of Germany: ${capital[Germany]}"

#iterate over the keys and print all the entries
echo "Country -> Capital"
for country in "${!capital[@]}"
do
   echo "$country -> ${capital[$country]}"
done
Output:
Size: 6
Capital of Germany: Berlin
Country -> Capital
UK -> London
Germany -> Berlin
Belgium -> Brussels
China -> Beijing
Japan -> Tokyo
Egypt -> Cairo
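Note that the keys do not come back in insertion order. A few other operations are often useful alongside the ones above: testing whether a key exists, removing an entry, and iterating in sorted key order. The following sketch shows one way to do each, using a smaller version of the same capital array.

```bash
declare -A capital=([UK]="London" [Japan]="Tokyo" [Egypt]="Cairo")

# test whether a key exists (works even if the stored value is empty)
if [[ -n ${capital[Japan]+set} ]]; then
   echo "Japan is present"
fi

# remove an entry
unset 'capital[Egypt]'
echo "Size: ${#capital[@]}"

# iterate over the keys in sorted order
while IFS= read -r country
do
   echo "$country -> ${capital[$country]}"
done < <(printf '%s\n' "${!capital[@]}" | sort)
```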
Further reading:
Bash, version 4, Associative Arrays