Trim shortest pattern match from the beginning of var ${var#pattern}
Trim shortest pattern match from the end of var ${var%pattern}
Trim longest pattern match from the beginning of var ${var##pattern}
Trim longest pattern match from the end of var ${var%%pattern}
strip extension of file f ${f%.*} / get extension only ${f##*.}
get folder only of path ${pathstr%/*}
get filename only of path ${pathstr##*/}
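A quick sanity check of the trims above, using a hypothetical NIfTI path:

```shell
f='path/to/target/image.nii'
echo "${f##*/}"   # image.nii  (filename only: longest */ trimmed from the front)
echo "${f%/*}"    # path/to/target  (folder only: shortest /* trimmed from the end)
echo "${f%.*}"    # path/to/target/image  (extension stripped)
```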
targetDir='path/to/target/dir'
if [ ! -d "$targetDir" ]; then
mkdir "$targetDir"
fi
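A one-step alternative (a sketch with a hypothetical path): mkdir -p creates any missing intermediate directories and stays silent when the target already exists, so the existence test becomes unnecessary.

```shell
targetDir='path/to/target/dir'  # hypothetical path
mkdir -p "$targetDir"           # no error if it already exists; quotes guard against spaces
```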
e.g. list all files from spmT_0013.nii through spmT_0150.nii, i.e. whose numbers fall within the range 13 to 150
ls spmT_0*{13..150}.nii
However, if the range is grabbed from variables, the following will not work:
iA=13
iZ=150
ls spmT_0*{$iA..$iZ}.nii
One solution would be the following:
shopt -s extglob
iA=13
iZ=150
iZ=$(( iZ + 1 )) ## add one: the substitution below puts ".nii" after every number except the last, so the last wanted number needs a bare follower
range=$( seq $iA $iZ )
ls spmT_0*@(${range//$'\n'/\.nii|})
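To see where that off-by-one comes from, print the pattern the substitution builds (a sketch with a small hypothetical range): the replacement appends ".nii|" after every newline, so the final alternative is left as a bare number that matches nothing.

```shell
iA=13; iZ=15                            # small range for illustration
range=$( seq $iA $(( iZ + 1 )) )
pattern="@(${range//$'\n'/.nii|})"
echo "$pattern"                         # @(13.nii|14.nii|15.nii|16)
```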
dat_arr=( $(cut -d ',' -f2 $csvFile ) )
or
readarray -t dat_arr < "$csvFile"
# keep only the second field: strip through the first comma, then strip from the next comma on
dat_arr=("${dat_arr[@]#*,}")
dat_arr=("${dat_arr[@]%%,*}")
or (this one seems most reliable, at least for what I have tried)
var1=$(cat "$csvFile")
IFS=',' dat_arr=( ${var1} ) ## note: the IFS change persists after this line
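Since a line containing only assignments runs them in the current shell, that IFS=',' stays in effect afterwards. Splitting with read instead keeps the IFS change local to the read command (a sketch with a hypothetical row):

```shell
line='a,b,c'                            # hypothetical CSV row
IFS=',' read -r -a dat_arr <<< "$line"  # IFS=',' applies to read only
echo "${dat_arr[1]}"                    # b
```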
oldIFS=${IFS} # keep track of old column separator
IFS=","
declare -A assocArr # initialize assoc array
while read -r -a linedata
do
# keys that look like numbers with leading zeros (e.g. "0013") can throw "value too great for base"
# if the array gets treated as indexed (e.g. declare -A was skipped): the key is then evaluated
# arithmetically and leading zeros mean octal; prepending an alphacharacter, e.g. "x", sidesteps this:
assocArr["x${linedata[0]}"]="${linedata[1]}"
done < "$csvFile"
## random key lookup if you know there's a key called "akey"
# echo "${assocArr[akey]}"
## check all entries in assoc arr
for key in "${!assocArr[@]}"
do
echo "${key} ---> ${assocArr[${key}]}"
done
## set delimiter char back
IFS=${oldIFS}
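To check whether a key exists before looking it up (a sketch with a hypothetical entry; the ${var+set} expansion is empty only when the element is unset):

```shell
declare -A assocArr=( [xsub001]='42' )  # hypothetical entry
key='xsub001'
if [ -n "${assocArr[$key]+set}" ]; then
    echo "found: ${assocArr[$key]}"
else
    echo "no such key: $key"
fi
```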
Collected from the Ultimate Bash array tutorial and forum posts.
# declare -a arrayname=(element1 element2 element3)
declare -a Unix=('Debian' 'Red hat' 'Suse' 'Fedora')
Use curly brackets like ${dat_arr[index]} (indexing begins at 0).
printf '%s\n' "${dat_arr[1]}"
Length of an array
echo ${#Unix[@]} # number of elements in the array
echo ${#Unix}    # number of characters in the first element of the array, i.e. Debian
printf '%s\n' "${dat_arr[@]}"
Pitfall: not quoting. Quoted, "$@" expands each element as a separate argument, while "$*" expands to the args merged into one argument: "$1c$2c..." (where c is the first char of IFS). You almost always want "$@". Same goes for "${arr[@]}".
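A throwaway function makes the difference visible (a sketch; showargs is hypothetical):

```shell
showargs() { printf '[%s]' "$@"; printf '\n'; }
arr=('a b' 'c')
IFS=','
showargs "${arr[@]}"   # [a b][c]  -- two separate arguments, elements preserved
showargs "${arr[*]}"   # [a b,c]   -- one argument, joined with the first char of IFS
unset IFS              # back to default splitting behavior
```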
for each in "${dat_arr[@]}"
do
echo "$each"
done
Very useful for RMarkdown files: knitting can bail, or RStudio can refuse to open a file, because of stray non-ASCII characters.
grep --color='auto' -P -n "[\x80-\xFF]" "$fileName"
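To strip the offending bytes rather than just locate them (a sketch on a throwaway file; tr -cd deletes every byte outside the listed octal range):

```shell
printf 'caf\xc3\xa9 ok\n' > demo.txt          # hypothetical file containing a UTF-8 é
tr -cd '\0-\177' < demo.txt > demo_clean.txt  # keep only 7-bit ASCII bytes
cat demo_clean.txt                            # caf ok
rm demo.txt demo_clean.txt
```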
Lots of ways to do this (e.g. perl, sed), but as I'm a Perl girl, here's the Perl way. E.g., to replace an old string with a new string in all files with the extension .txt in the current directory:
stringOld='changeMe'
stringNew='newMe'
## perl switches (more info with the command 'perldoc perlrun')
## -i in-place changes
## -p loop over the input lines, printing each back out (what makes a one-liner over multiple files possible)
## -e so Perl does not expect a filename but commands
## s = substitute, g = global (do not stop at first match in a line)
perl -i -pe "s/$stringOld/$stringNew/g" *.txt ## double quotes, so the shell (not Perl) expands the variables
Note that Perl does not recursively search through subdirectories unless you call its Find module -- the simplest way I find to run the command recursively is to use find for the recursive search and wrap the Perl command with xargs, which passes its standard input iteratively as arguments to whatever command follows it:
find . -name '*.txt' | xargs perl -i -pe "s/$stringOld/$stringNew/g"
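Filenames containing spaces will break that plain pipe; a sketch of the null-delimited variant (GNU find/xargs), with Perl's \Q...\E added so any regex metacharacters in the search string are treated literally:

```shell
stringOld='changeMe'   # hypothetical search/replace strings
stringNew='newMe'
find . -name '*.txt' -print0 | xargs -0 perl -i -pe "s/\Q$stringOld\E/$stringNew/g"
```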
In base bash, sending processes to the background (i.e. commands issued with &) in effect lets the system distribute them across your processors. To track the background jobs (from a Stack Exchange answer):
doSeq=$( seq 1 5 )
doSubjs=( $(printf "sub%03d " $doSeq) ) # the outer ( ) makes doSubjs an array
# Make an associative array in which you'll record pids.
declare -A pids
start=$(date '+%s')
for subj in "${doSubjs[@]}"
do
echo ____________________ $subj
## start background processes to trigger parallel processing across processors
## to kill: pkill -f fsl
someProcess $subj &
pid=$!
echo "queuing for $subj (pid=$pid)"
pids[$pid]=$subj
done
# Watch your backgrounded processes.
# If job completed, remove pid from the array.
while [ -n "${pids[*]}" ]; do
sleep 30s
for pid in "${!pids[@]}"; do
if ! ps "$pid" >/dev/null; then
unset pids[$pid]
echo "unset: $pid"
fi
done
if [ -z "${!pids[*]}" ]; then
break
fi
printf "\rStill waiting for: %s ... " "${pids[*]}"
done
printf "\r%-25s \n" "Done."
printf "Total runtime: %d seconds\n" "$((`date '+%s'` - $start))"
My notes/mini-tutorial:
Really useful zwischenzugs post: Ten Things I Wish I’d Known About Bash
Nicely organized devhints.io Cheatsheet