Difference between revisions of "Scripting"

Revision as of 13:41, 25 November 2020

Commands (to be interpreted and executed) can be placed in a text file, called a script, which is then executed by an interpreter.

The interpreter is specified in the first line of the script, e.g. by:

 #! /bin/sh
 #! /bin/bash
 #! /bin/tcsh
 #! /usr/bin/awk -f
 #! /usr/bin/env python
 ...

(Note that while # starts a comment in all of the languages above, the sequence #! on the first line is what the system uses to identify the interpreter.)

Bash scripting

Among the many scripting languages, bash is particularly relevant to us (bash is also the command-line shell we have been using so far).

Unix commands (enriched by bash built-in functions & structures) can be used in bash scripts:

$> cat ./get_users.sh

#! /bin/bash  -x
filein=/etc/passwd
#
# extract user names
awk -v FS=":" '{print $1}' "$filein"

Note that in order to execute get_users.sh, we first need to make it executable by changing its permissions:

 $> chmod a+x ./get_users.sh

When executing, the output of the script can also be redirected to a file:

 $> ./get_users.sh > users.dat
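The steps above (write the script, make it executable, run it with redirection) can be tried end to end in a scratch directory. This is only a minimal sketch: the file names get_users_demo.sh and passwd.sample and the two sample entries are made up for illustration, so that nothing depends on the real /etc/passwd.

```shell
#!/bin/bash
# Work in a scratch directory so that no existing file is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A tiny stand-in for /etc/passwd (hypothetical, colon-separated entries).
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
              'alice:x:1000:1000::/home/alice:/bin/bash' > passwd.sample

# Write the script; quoting 'EOF' keeps $1 and $filein literal in the file.
cat > get_users_demo.sh <<'EOF'
#!/bin/bash
filein=passwd.sample
awk -v FS=":" '{print $1}' "$filein"
EOF

chmod a+x get_users_demo.sh        # without this step the script cannot be run as ./get_users_demo.sh
./get_users_demo.sh > users.dat    # standard output goes to users.dat instead of the terminal
cat users.dat
```

Here users.dat ends up with one user name per line, taken from the first colon-separated field of the sample file.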

Within the script, $0 is the name the script was invoked with (./get_users.sh in the example above), and $1, $2, ..., $n are the first, second, ..., n-th command-line arguments, if present. $# is the number of command-line arguments passed to the script.

$> cat ./get_users2.sh

#! /bin/bash
if [ $# -eq 0 ] ; then echo "Usage:  ./get_users2.sh  <filename>" ; exit 1 ; fi
filein=$1
# 
# extract user names
awk -v FS=":" '{print $1}' "$filein"

Now, this second version of the script needs to be run as:

$> ./get_users2.sh /etc/passwd
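To get a feeling for how $#, $1, $2, ... behave, one can experiment with a small function, since inside a bash function these parameters refer to the function's own arguments in the same way they refer to the command-line arguments in a script ($0 is the exception and keeps the script name). The helper name show_args is hypothetical; the sketch also shows why quoting ("$@", "$1") matters when an argument contains spaces.

```shell
#!/bin/bash
# show_args: hypothetical helper that reports how many arguments it received
# and prints them one per line.
show_args() {
    echo "number of arguments: $#"
    local i=1
    local arg
    for arg in "$@"; do          # "$@" keeps each argument intact, spaces included
        echo "argument $i: $arg"
        i=$((i + 1))
    done
}

show_args /etc/passwd "two words"
```

The call above reports two arguments, the second one being the whole string "two words"; with an unquoted $@ it would be split into two separate words.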

Sed & Awk

These two commands, available almost everywhere, are very widely used in bash scripting.

sed

sed is a stream editor, most often used to substitute text matching a regular expression in files or strings. Examples follow:

$> echo "Ciao Ciao" | sed 's/C/M/'
     ->  Miao Ciao
$> echo "Ciao Ciao" | sed 's/C/M/g'
     ->  Miao Miao                  # g stands for "global substitution"

Regular expressions can also be used in the search pattern:

"." matches any single character (a wildcard) and must be escaped as \. to be treated as a literal dot
\n means newline
\t means tab
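The difference between the wildcard dot and the escaped dot is easiest to see side by side. The following is a small sketch on made-up input strings; the variable names are only for illustration.

```shell
#!/bin/bash
# An unescaped dot matches any single character, so all five characters are replaced.
any_char=$(echo "a.b.c" | sed 's/./-/g')
# An escaped dot is a literal dot, so only the two dots are replaced.
literal_dot=$(echo "a.b.c" | sed 's/\./-/g')
# A bracket expression followed by * matches a whole run of digits.
digits=$(echo "LabQSM 2020" | sed 's/[0-9][0-9]*/YEAR/')

echo "$any_char"      # -----
echo "$literal_dot"   # a-b-c
echo "$digits"        # LabQSM YEAR
```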

awk

awk performs line-by-line operations on numbers and strings, with a syntax similar to C.

$> echo 10 4.0 | awk '{print $1 * sqrt($2)}'
$> echo "LabQSM 2020" | awk '{print $1; print "Year", $2}'

Awk has its own scripting language, useful e.g. for parsing or data post-processing (the same operation or search is applied line by line).
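As an illustration of this line-by-line model, the sketch below accumulates a sum while the lines are read and prints the mean in the END block, once the input is exhausted. The helper name column_mean and the three data lines are made up for the example.

```shell
#!/bin/bash
# column_mean: hypothetical helper that averages the second column of its input.
column_mean() {
    awk '
        NF >= 2 { sum += $2; n++ }      # executed once for every input line
        END     { if (n > 0) printf "n=%d mean=%.2f\n", n, sum / n }
    '
}

# Made-up sample data: a label and a value on each line.
printf '%s\n' 'a 1.5' 'b 2.5' 'c 5.0' | column_mean    # prints n=3 mean=3.00
```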

Take, e.g., the file apt.txt with the list of tennis players that we have used in previous examples:

 9850, Nadal,  Rafael 
 6630, Federer,  Roger
 3075, Berrettini,  Matteo 
12030, Djokovic,  Novak

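The problem of printing the first name and surname of the player listed on a given line (selected through the variable ind) can be solved with awk as follows: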

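#! /usr/bin/awk -f
# ind (the index of the wanted line) is set on the command line with -v ind=...
BEGIN{ i=ind; nlines=0; FS="," }
{
  if (NF != 3) next      # skip lines that do not have exactly three fields
  nlines++
  if (nlines == i) {printf "%s, %s\n", $3,$2}
}
END{
# place here any operation to be done at the end
}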

Run as:

$> ./solution.awk -v ind=2  apt.txt

Note that the columns do not need to be comma-separated: if the file uses blanks instead of commas, it is enough to drop the redefinition of the field separator, FS=",", and awk falls back to its default of splitting fields on whitespace.
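A whitespace-separated variant can be sketched as follows. The file name apt_ws.txt and the helper pick_player are hypothetical; the data are the same four players as above with the commas dropped, and the awk program is a compact form of the script above without the FS redefinition.

```shell
#!/bin/bash
workdir=$(mktemp -d)

# Same players as in apt.txt, but with blanks instead of commas (hypothetical file).
cat > "$workdir/apt_ws.txt" <<'EOF'
 9850 Nadal Rafael
 6630 Federer Roger
 3075 Berrettini Matteo
12030 Djokovic Novak
EOF

# pick_player INDEX FILE: print "first-name surname" of the INDEX-th valid line.
# No FS is set, so awk splits the fields on runs of blanks.
pick_player() {
    awk -v ind="$1" 'NF == 3 { n++; if (n == ind) printf "%s %s\n", $3, $2 }' "$2"
}

pick_player 2 "$workdir/apt_ws.txt"    # prints Roger Federer
```

A side effect of the default separator is that leading and trailing blanks are no longer part of the fields, which is not the case when FS="," is used on the comma-separated file.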
