Scripting

Commands can be placed in a text file, called a script, to be interpreted and executed by an interpreter.

The interpreter is specified in the first line of the script, e.g. by:

 #! /bin/sh
 #! /bin/bash
 #! /bin/tcsh
 #! /usr/bin/awk -f
 #! /usr/bin/env python
 ...

(Note that while # starts a comment in all the languages above, the sequence #! on the first line is used to identify the interpreter.)

Bash scripting

Among the many scripting languages, bash is particularly relevant to us (bash is also the interpreter of the command-line shell we have been using so far).

Unix commands (enriched by bash built-in functions & structures) can be used in bash scripts:

$> cat ./get_users.sh

#! /bin/bash  -x
filein=/etc/passwd
#
# extract user names
cat $filein | awk -v FS=":" '{print $1}'

Note that in order to execute get_users.sh, we need to change its permissions,

 $> chmod a+x ./get_users.sh

When executing, the output of the script can also be redirected to a file,

 $> ./get_users.sh > users.dat

Within the script, $0 corresponds to the invocation name (./get_users.sh, in the example above), while $1, $2, ..., $n correspond to the first, second, ..., n-th argument, if present. $# is the number of command-line arguments passed to the script.

$> cat ./get_users2.sh

#! /bin/bash
if [ $# == 0 ] ; then echo "Usage:  ./get_users2.sh  <filename>" ; exit 1 ; fi
filein=$1
# 
# extract user names
cat $filein | awk -v FS=":" '{print $1}'

Now, this second version of the script needs to be run as:

$> ./get_users2.sh /etc/passwd

Sed & Awk

These two commands, available almost everywhere, are very widely used in bash scripting.

sed

Substitutes text matching a regular expression in files or strings. Examples follow:

$> echo "Ciao Ciao" | sed 's/C/M/'
     ->  Miao Ciao
$> echo "Ciao Ciao" | sed 's/C/M/g'
     ->  Miao Miao                  # g stands for "global substitution"

Regular expressions can also be used in the search pattern.

"." in a regular expression matches any single character (wildcard); it must be escaped as \. to be treated as a literal dot
\n means newline
\t means tab
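
As a small illustration of the difference between the wildcard "." and the escaped "\." (the sample strings are made up for this sketch):

```shell
#!/bin/bash
# Unescaped "." matches any single character: both "a.c" and "abc" are replaced.
wild=$(echo "a.c abc" | sed 's/a.c/X/g')
echo "$wild"      # X X

# Escaped "\." matches only a literal dot: "abc" is left untouched.
literal=$(echo "a.c abc" | sed 's/a\.c/X/g')
echo "$literal"   # X abc

# Typical use: replace a file extension, anchoring the match at the end with "$".
renamed=$(echo "scf.diamond.in" | sed 's/\.in$/.out/')
echo "$renamed"   # scf.diamond.out
```

Single quotes around the sed expression keep the shell from interpreting the backslash and the "$" anchor.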

awk

Performs line-by-line operations (on numbers and strings, with a syntax similar to C).

$> echo 10 4.0 | awk '{print $1 * sqrt($2)}'
$> echo "LabQSM 2020" | awk '{print $1; print "Year", $2}'

Awk has its own scripting language, useful e.g. for parsing or data post-processing (the same operation or search is applied line by line).

Take e.g. the file apt.txt, with the list of tennis players that we have used in previous examples:

 9850, Nadal,  Rafael 
 6630, Federer,  Roger
 3075, Berrettini,  Matteo 
12030, Djokovic,  Novak

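The problem (printing the first and last name of the player found on a given line, selected through the variable ind) can be solved with awk as follows: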

#! /usr/bin/awk -f
BEGIN{ i=ind; nlines=0; FS="," }
{
  if (NF != 3) next
  nlines++
  if (nlines == i) {printf "%s, %s\n", $3,$2}
}
END{
# place here any operation to be done at the end
}

Run as:

$> ./solution.awk -v ind=2  apt.txt

Note that the columns do not need to be comma-separated: if the file uses whitespace as a separator instead, it is enough to drop the redefinition of the field separator FS=",".
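
The END block is the natural place for operations that need the whole file, such as totals or averages. A minimal sketch, assuming the comma-separated layout of apt.txt shown above (the data are recreated inline so that the example is self-contained; the file name apt_demo.txt is hypothetical):

```shell
#!/bin/bash
# Recreate the sample data in a scratch file (hypothetical name).
cat > apt_demo.txt << EOF
 9850, Nadal,  Rafael
 6630, Federer,  Roger
 3075, Berrettini,  Matteo
12030, Djokovic,  Novak
EOF

# Accumulate the first column over all well-formed lines;
# the END block runs once, after the last line has been read.
summary=$(awk -F"," 'NF == 3 { sum += $1; n++ }
                     END     { if (n > 0) printf "%d players, mean %.2f\n", n, sum / n }' apt_demo.txt)
echo "$summary"   # 4 players, mean 7896.25

rm -f apt_demo.txt
```

Here the field separator is set on the command line with -F"," rather than in a BEGIN block; the two forms are equivalent.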

Bash control statements

Conditionals

if [ "$var1" = "$var2" ] ; then
   <some-statements>
else
   <some-statements>
fi

if [ -e "$file" ] ; then echo "File exists" ; fi
if [ ! -e "$file" ] ; then echo "File does not exist" ; fi
if [ -d "$dir" ] ;  then echo "Dir exists" ; fi
if [ -x "$file" ] ; then echo "File exists and is exec" ; fi
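
Several tests can be chained with elif. A minimal sketch (the function name classify and the scratch files are hypothetical, introduced only for illustration):

```shell
#!/bin/bash
# classify <path>: report what kind of object <path> is, using the file tests.
# -d is checked first because directories usually also pass the -x test.
classify() {
    if [ -d "$1" ] ; then
        echo "directory"
    elif [ -x "$1" ] ; then
        echo "executable file"
    elif [ -e "$1" ] ; then
        echo "non-executable file"
    else
        echo "missing"
    fi
}

tmpdir=$(mktemp -d)            # scratch directory for the demonstration
touch "$tmpdir/plain.txt"
touch "$tmpdir/run.sh" ; chmod u+x "$tmpdir/run.sh"

classify "$tmpdir"             # directory
classify "$tmpdir/run.sh"      # executable file
classify "$tmpdir/plain.txt"   # non-executable file
classify "$tmpdir/nothing"     # missing

rm -r "$tmpdir"
```

Quoting "$1" inside the brackets keeps the test well-formed even when the argument is empty or contains spaces.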

Loops

list="item1 item2 item3"
for item in $list
do
   echo $item
done
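
Loops can also run over the output of a command, or be driven by a condition. A minimal sketch (variable names are illustrative) using seq to generate a numeric list, and a while loop with a counter:

```shell
#!/bin/bash
# for loop over a numeric sequence generated by seq (first, step, last)
squares=""
for i in $(seq 1 1 4)
do
   squares="$squares $((i * i))"
done
echo "squares:$squares"        # squares: 1 4 9 16

# while loop: repeat as long as the test is true
count=3
countdown=""
while [ "$count" -gt 0 ]
do
   countdown="$countdown$count "
   count=$((count - 1))
done
echo "countdown: $countdown"   # countdown: 3 2 1
```

Note that $(( ... )) only handles integer arithmetic; floating-point operations are usually delegated to awk or bc.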

Input from command line

#! /bin/bash
echo "number of arguments : $#"
echo "        	command : $0"
echo "        	1st arg : $1"
echo "        	2nd arg : $2"
echo "            	... "
echo "       	all args : $*"

Execute as:
$> ./example.sh  p1 p2 p3

Dealing with extended text (useful e.g. to write input files)

For a few lines, repeated echo commands with redirection are enough (> creates or overwrites the file, >> appends to it):

echo line1 >  file.txt 
echo line2 >> file.txt
echo line3 >> file.txt

When the text gets longer, a here-document is more convenient:

cat >file.txt << EOF
   line1
   line2
   line3
EOF
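
Shell variables are expanded inside the text between << EOF and EOF unless the delimiter is quoted, which makes this construct convenient for generating input files from a script. A minimal sketch (the file names, the variable alat and its value are hypothetical):

```shell
#!/bin/bash
alat=6.74    # arbitrary value, for illustration only

# Unquoted delimiter: $alat is replaced by its value when the file is written.
cat > expanded.txt << EOF
celldm(1) = $alat
EOF

# Quoted delimiter: the text is written literally, with no expansion.
cat > literal.txt << 'EOF'
celldm(1) = $alat
EOF

cat expanded.txt   # celldm(1) = 6.74
cat literal.txt    # celldm(1) = $alat
```

The closing delimiter must be alone on its line, starting in the first column.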

Exercises

Exercise 1

Job Script: Write a script (run.sh) to run pw.x once an input file is provided

Solution 1:

#! /bin/bash -x
filein=scf.diamond.in
bindir=/usr/local/bin
#
# fileout=scf.diamond.out
fileout=$(echo "$filein" | sed 's/\.in$/.out/')
#
$bindir/pw.x < $filein > $fileout
exit 0

Solution 2:

#! /bin/bash -x
filein=$1
if [ -z "$filein" ] ;   then echo "ERROR: filein needed"; exit 1 ; fi
if [ ! -e "$filein" ] ; then echo "ERROR: filein not found"; exit 1 ; fi 
#
bindir=/usr/local/bin
#
fileout="$(echo "$filein" | sed 's/\.in$//')".out
#
$bindir/pw.x < $filein > $fileout
exit 0

Exercise 2

Modify the run.sh script of Exercise 1 to loop over different lattice parameters.
Consider the grid: -3%, -2%, -1%, 0, +1%, +2%, +3%
Keep track of the parameters used in the file names

Hint1: create a template input file, where the celldm1 field is assigned @alat@, which you will substitute in the script
Hint2: save both input and output files;
Hint3: $ basename <name>.dat .dat -> <name>
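
The hints can be tried in isolation before writing the full script. The following is a minimal sketch and not the reference solution: the one-line template, the reference value alat0 and all file names are made up for illustration, and the call to pw.x is left out.

```shell
#!/bin/bash
# Hypothetical one-line template; a real template would hold the full pw.x input.
echo "celldm(1) = @alat@" > demo.tmpl

alat0=10.0                             # arbitrary reference lattice parameter
prefix=$(basename demo.tmpl .tmpl)     # Hint3: strip the suffix -> demo

for delta in -1 0 1                    # percent variations (short grid for brevity)
do
   # Floating-point arithmetic is delegated to awk (bash only handles integers).
   alat=$(awk -v a="$alat0" -v d="$delta" 'BEGIN { printf "%.3f", a * (1 + d / 100) }')
   filein="$prefix.alat_$alat.in"
   # Hint1: substitute the @alat@ placeholder with the current value.
   sed "s/@alat@/$alat/" demo.tmpl > "$filein"
   echo "$filein: $(cat "$filein")"
done
```

Double quotes are used around the sed expression here so that the shell expands $alat before sed runs.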

Solution: have a look at

LAB_1/test_diamond/run_lattice.sh and
LAB_1/test_diamond/scf.tmpl

Exercise 3

Same as Exercise 2.
Include the input file in the script, so that the substitution of parameters in the template input file can be done via shell variables and loops