Awk is a tool for processing text files. It can be used to format reports and to perform string and arithmetic operations.
An awk program does something at the beginning, reads a line from a file (or from a pipe), performs operations on it, reads the next line, and so on. When the input is exhausted it performs an end operation.
Sample awk command:
[amittal@hills awk]$ ls -l | awk '{ sum += $5 } END { print sum }'
783
In the above command we use "ls -l" to list all the files in the folder; the fifth column holds the size of each file in bytes. The "$5" signifies the 5th field, and we add it to a variable named "sum". The input is processed line by line. The statement after "END" is executed once all the lines have been processed, and it prints the sum of the bytes.
The word "awk" does not come from the word "awkward" but rather from the authors "Alfred Aho", "Peter J. Weinberger" and "Brian Kernighan". Awk has its own scripting language that may look similar to the "C" programming language. It uses regular expressions for the pattern matching parts of the program. These are the same Unix regular expressions that were covered in a previous section. The awk language, though, is not the same as the Unix shell language. It is its own language.
The structure of an awk program is usually:
Do the BEGIN section
For every line read from a file
Pattern match Execute the command
Do the END section
The BEGIN section is optional, as is the END section. Between them we can have one or more pattern/action pairs. The pattern is optional: with only an action, the action runs for every line. The action is optional too: with only a pattern, the matching lines are printed. If both are absent nothing is printed.
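As a minimal sketch of this structure, the command below feeds awk some made-up input with printf instead of reading a real file. The item names and quantities are assumptions chosen only for illustration.

```shell
# Hypothetical input: item name and quantity
result=$(printf 'pen 5\nbook 12\nbag 30\n' | awk '
BEGIN { print "Large quantities:" }     # runs once, before any input
$2 > 10 { print $1; total += $2 }       # runs for each line that matches
END { print "Total:", total }           # runs once, after all input
')
echo "$result"
```

The line for "pen" does not match the pattern, so it is neither printed nor added to the total.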
Let us modify our awk command to include more stuff.
ls -l | awk 'BEGIN { print "Sum of the files modified in November." } $6 == "Nov" { sum += $5 } END { print sum }'
Sum of the files modified in November.
19
The "BEGIN" section has a command that prints a line and then we have a pattern that checks if the 6th field is "Nov" and the rest of the line is the same as before.
We can have multiple pattern / action statements.
ls -l | awk 'BEGIN { print "Sum of the files modified in October November." } $6 == "Nov" { sum += $5 }
$6 == "Oct" { sum += $5 } END { print sum }'
Here is another awk example. Let's say we want to kill a Unix process whose name matches a pattern and whose elapsed time is greater than some value (here 20 seconds).
kill -9 $( ps -eo comm,pid,etimes | awk '/main/ {if( $3 > 20) { print $2 }}')
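Since "kill -9" cannot be undone, it can help to preview the process ids before killing anything. The sketch below applies a similar filter to made-up lines in the same column order as "ps -eo comm,pid,etimes". The process names and numbers are hypothetical, and the two conditions are combined in the pattern with "&&" instead of using an "if" inside the action.

```shell
# Hypothetical sample in the column order: command, pid, elapsed seconds
candidates=$(printf 'main 101 35\nbash 102 500\nmain 103 4\n' |
    awk '/main/ && $3 > 20 { print $2 }')
echo "PIDs that would be killed: $candidates"
```

Only the first line both contains "main" and has an elapsed time above 20.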
Running awk
There are different ways to run awk commands.
Command Line:
We can run awk from the command line.
ls -l | awk '{ print $0 }'
Remember the awk structure is :
Do the BEGIN section
For every line read from a file
Pattern match Execute the command
Do the END section
Our command:
ls -l | awk '{ print $0 }'
does not have the optional BEGIN or END sections. It also does not have the optional pattern, so the command in the curly braces is executed for every line.
The "$0" means the whole of the input line.
$ ls -l | awk '{ print $0 }'
total 3
-rwxr-xr-x 1 user None 96 Feb 20 16:19 1.sh
-rwxr-xr-x 1 user None 104 Feb 20 16:52 2.sh
-rwxr-xr-x 1 user None 83 Feb 20 16:56 3.cmd
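If we don't provide an input to "awk" then it takes its input from standard input, which is the terminal when nothing is piped in. This is similar to the way "grep" and "sed" behave. In the session below each line we type is printed back by awk.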
$ awk '{ print $0 }'
Test
Test
First
First
Commands in a file
We can place the commands in a file. In this example we place the commands in a file called "awk4_commands" .
File: awk4_commands
BEGIN { print "Sum of the files modified in October November." } $6 == "Nov" { sum += $5 }
$6 == "Oct" { sum += $5 } END { print sum }
[amittal@hills awk_manual]$ ls -l | awk -f awk4_commands
Sum of the files modified in October November.
1459
To run the commands we use the "-f" option. This is usually the easiest way to run awk: we do not need to put single quotes around the program and we can place statements on different lines. However, we do have to be careful about where we break the lines.
BEGIN {print "Printing the sum of file sizes"}
{ sum += $5 }
END { print sum }
The above works fine because each BEGIN, END or pattern has the opening curly brace of its action on the same line. But the below does not work, because "END" is on a line by itself.
BEGIN {print "Printing the sum of file sizes"}
{ sum += $5 }
END
{ print sum }
We can place backslashes at the end of each line except the last to take care of this problem. A backslash at the end of a line tells awk that the line continues on the next one.
BEGIN {print "Printing the sum of file sizes"}\
{ sum += $5 }\
END\
{ print sum }
$ ls -l | awk -f 32.cmd
Printing the sum of file sizes
257
Let's look at the following command:
BEGIN \
{ print "Sum of the files modified in October November.";
sum = 0 }
$6 == "Nov"
{ sum += $5 }
$6 == "Oct" { sum += $5 }
END \
{ print sum }
We intend the condition $6 == "Nov" to have the action { sum += $5 }. However, that is not how awk reads it. Because both the pattern and the action are optional, $6 == "Nov" on a line by itself is read as a pattern with no action, so awk applies the default action and prints every matching line. The next line is an action with no pattern, so it is executed for every line regardless of the month. What we wanted was for the action to be executed only if the pattern matched.
BEGIN \
{ print "Sum of the files modified in October November.";
sum = 0 }
$6 == "Nov" \
{ sum += $5 }
$6 == "Oct" { sum += $5 }
END \
{ print sum }
The "\" forces awk to consider $6 == "Nov" and { sum += $5 } as one line. Now the action is executed only if the pattern matches.
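The difference between the two layouts can be seen with a small experiment. The sketch below uses made-up data (file name, month, size) passed in with printf; the names and sizes are assumptions for illustration only.

```shell
# Hypothetical sample data: name, month, size
data='a.txt Nov 10
b.txt Oct 20
c.txt Nov 5'

# Pattern and action on separate lines: awk sees two independent rules
split_rules=$(printf '%s\n' "$data" | awk '
$2 == "Nov"
{ sum += $3 }
END { print sum }')

# A backslash joins the pattern and the action into one rule
joined_rule=$(printf '%s\n' "$data" | awk '
$2 == "Nov" \
{ sum += $3 }
END { print sum }')

echo "$split_rules"
echo "$joined_rule"
```

The first program prints the two "Nov" lines and then 35, the sum of all three sizes. The second prints only 15, the sum of the "Nov" sizes.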
Shell Script
We can place all the commands in a shell script, make the shell script executable and then run it.
ls -l | awk 'BEGIN { print "Sum of the files modified in October November." } $6 == "Nov" { sum += $5 }
$6 == "Oct" { sum += $5 }
END { print sum }'
The above can be placed in a file called "awk3.sh", made executable and then run. The same rules apply for breaking lines. If the line does not terminate with "}" then we need to place a backslash at the end of it.
ls -l | awk 'BEGIN { print "Sum of the files modified in October November." } $6 == "Nov" { sum += $5 }
$6 == "Oct" { sum += $5 }
END \
{ print sum }'
We can run the awk command with multiple files also:
awk -f awk4_commands file1 file2 file3
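As a rough sketch of running a command file against several data files, the script below creates everything in a temporary directory. The file names "file1", "file2" and "sum.awk" are made up for this example.

```shell
# Work in a temporary directory so no existing files are touched
dir=$(mktemp -d)
printf '10\n20\n' > "$dir/file1"
printf '30\n' > "$dir/file2"

# The awk program lives in its own file (name chosen for this example)
cat > "$dir/sum.awk" <<'EOF'
{ sum += $1; lines++ }
END { print "Lines:", lines, "Sum:", sum }
EOF

# awk reads file1 and then file2 as one continuous input
summary=$(awk -f "$dir/sum.awk" "$dir/file1" "$dir/file2")
echo "$summary"
rm -r "$dir"
```

The END section runs only once, after the last file has been read.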
If we do not pipe anything to the awk command and do not specify any files either then awk will take the input from the console.
[amittal@hills awk_manual]$ awk /test/
testing
testing
rose
home
this is a test
this is a test
The above does not have a begin, end or an action part but does have the pattern part.
Pattern Action
We have studied that in addition to the normal Regular Expressions we also have Extended Regular Expressions. Awk works with Extended Regular Expressions by default.
We can omit the pattern or the action .
The "print $5" prints the 5th field.
ls -l | awk '/Nov/ { print $5 }'
Output:
0
19
45
49
Using only the pattern.
ls -l | awk '/Nov/'
-rw------- 1 amittal csdept 0 Nov 5 07:28 data.txt
drwxrwxrwx 3 amittal csdept 19 Nov 13 10:49 midterm1
-rw-r--r-- 1 amittal csdept 45 Nov 5 06:53 notes.txt
-rwxrwxrwx 1 amittal csdept 49 Nov 5 06:52 var1.sh
The above prints the whole lines that have the word "Nov" in them .
Exercise
Ex1:
Create a folder called "awk" and in this folder create 2 files: "notes1.txt" and "notes2.txt" . Remember "$" means the end of line anchor. Use the regular expression "txt$" in awk to print out the listing of the text files.
ls -l | awk TODO
Ex2:
In the same folder "awk" create another file "notestxt" (a name that ends in txt but does not have a dot). Write the regular expression in the awk command that will list only the files with ".txt" at the end.
Ex3:
Print only the first field of the "ls -l" output. This is the permissions field.
Solutions
Soln 1:
ls -l | awk '/txt$/'
Soln 2:
ls -l | awk '/\.txt$/'
Soln 3:
$ ls -l | awk '{ print $1 }'
Using only the action
ls -l | awk '{ print $0 }'
Output:
total 8
-rw------- 1 amittal csdept 0 Nov 5 07:28 data.txt
drwxrwxrwx 3 amittal csdept 19 Nov 13 10:49 midterm1
-rw-r--r-- 1 amittal csdept 45 Nov 5 06:53 notes.txt
-rwxrwxrwx 1 amittal csdept 49 Nov 5 06:52 var1.sh
The above prints every line in the input.
Patterns
The awk command's structure is
awk
Begin
Pattern Command
End
For the examples below we will be using a data file called "marks1.txt" .
1) John 80
2) Peter 90
3) David 47
4) James 25
5) Lisa 89
6) Kenny 56
7) Sam 95
8) Julia 74
9) Cassie 66
10) Marelena 45
Let's look at the different ways we can input the pattern command.
We shall place the commands in the file "awk1" and then run the awk command from the command line as:
awk -f awk1 marks1.txt
We write a pattern or regular expression between forward slashes "/ /". Here we place the letter "l" between them.
/l/ {print $0}
Output:
[amittal@hills pattern]$ awk -f awk1 marks1.txt
8) Julia 74
10) Marelena 45
All the lines with the letter "l" are printed out.
Using a regular expression:
/a?e/ {print $0}
[amittal@hills pattern]$ awk -f awk1 marks1.txt
2) Peter 90
4) James 25
6) Kenny 56
9) Cassie 66
10) Marelena 45
Two patterns separated by a comma signify a range:
/is/,/am/ {print $0}
[amittal@hills pattern]$ awk -f awk1 marks1.txt
5) Lisa 89
6) Kenny 56
7) Sam 95
The range starts at the line that matches the first pattern and continues up to and including the line that matches the second pattern.
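A range pattern is handy for pulling a block out of a larger input. The sketch below uses made-up marker lines "START" and "STOP"; they are assumptions for this example and have no special meaning to awk.

```shell
# Print everything from the START line through the STOP line
section=$(printf 'intro\nSTART\nalpha\nbeta\nSTOP\noutro\n' |
    awk '/START/,/STOP/ { print }')
echo "$section"
```

The lines "intro" and "outro" fall outside the range and are not printed.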
We can use an expression as a pattern also. In the example below we are using "&&" to combine two expressions with a logical AND.
($3 > 50 && $3 < 60 ) {print $0}
Output:
[amittal@hills pattern]$ awk -f awk1 marks1.txt
6) Kenny 56
Exercises
Ex1: Write an awk command in a file called "ex1.cmd" that will print the id, name and a letter grade of "A" to a student whose score is above 50. It should also print the title Id, Name and Grade.
Output:
Id Name Grade
1) John A
2) Peter A
5) Lisa A
6) Kenny A
7) Sam A
8) Julia A
9) Cassie A
Using BEGIN and END
We shall use the same technique of placing the awk commands in a file .
BEGIN { print "BEGIN" } { } END { print "END" }
In the above, before awk processes the input it prints the word "BEGIN" . Then the data is processed one line at a time, but the action that we have is blank so nothing is done. After the data has been processed, the "END" statement, which prints "END" to the console, is executed.
[amittal@hills BeginEnd]$ awk -f awk1 marks1.txt
BEGIN
END
BEGIN { sum=0 ; average=0; noPeople=0 } { sum += $3 ; noPeople++ }
END { print "Average marks:", sum/noPeople }
[amittal@hills BeginEnd]$ awk -f awk1 marks1.txt
Average marks: 66.7
We can use user defined variables in our awk script. The variables are similar to the variables in Shell . We do not have to declare the type. The increment operator "++" increases the value by 1 .
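One thing to watch for in the averaging program is empty input: "noPeople" stays at zero and the division in END is a division by zero, which some awk implementations report as a fatal error. The sketch below is one possible guard, not the only way to handle it. The two input records are copied from the data file, and the "No records" message is made up for this example.

```shell
# The same averaging program, with a guard added in END
prog='{ sum += $3; noPeople++ }
END {
    if (noPeople > 0) {
        print "Average marks:", sum/noPeople
    } else {
        print "No records"
    }
}'

empty=$(printf '' | awk "$prog")
filled=$(printf '1) John 80\n2) Peter 90\n' | awk "$prog")
echo "$empty"
echo "$filled"
```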
Exercise:
1) Modify the command file to print:
Number of people: (Actual number of people) Total: ( Actual total )
Average marks: 66.7
Let's rewrite the above to use the average variable.
BEGIN { sum=0 ; average=0; noPeople=0 } { sum += $3 ; noPeople++ }
END {
average = sum/noPeople
print "Average marks:", average }
[amittal@hills BeginEnd]$ awk -f awk2 marks1.txt
Average marks: 66.7
We are going to change the above command slightly by moving the "{" that follows "END" to the next line.
BEGIN { sum=0 ; average=0; noPeople=0 } { sum += $3 ; noPeople++ }
END
{
average = sum/noPeople
print "Average marks:", average }
We receive an error of the form:
[amittal@hills BeginEnd]$ awk -f awk2 marks1.txt
awk: awk2:5: END blocks must have an action part
We need the "{" on the same line as the BEGIN or END keyword, or we can use the backslash "\" to continue the line.
BEGIN { sum=0 ; average=0; noPeople=0 } { sum += $3 ; noPeople++ }
END \
{
average = sum/noPeople
print "Average marks:", average }
Output:
[amittal@hills BeginEnd]$ awk -f awk2 marks1.txt
Average marks: 66.7
Fields
Assume we have a data file called "marks.txt"
1) Amit Physics 80
2) Rahul Maths 90
3) Shyam Biology 87
4) Kedar English 85
5) Hari History 89
The fields are labelled as $1, $2 and so on to represent first, second fields and so on. The "$0" represents the whole line. The "print" command without any arguments will print the whole line.
awk '{print}' marks.txt
or the equivalent statement:
awk '{print $0}' marks.txt
Output:
1) Amit Physics 80
2) Rahul Maths 90
3) Shyam Biology 87
4) Kedar English 85
5) Hari History 89
The field number does not have to be a constant:
awk 'BEGIN {var1=3} {print $var1}' marks.txt
Output:
Physics
Maths
Biology
English
History
We can separate the items in a print statement with a comma. The string literals must be quoted.
awk 'BEGIN {var1=3} {print $var1, " " , $(var1-1)}' marks.txt
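When the field number is computed, the parentheses matter. The "$" binds more tightly than "+", so "$n + 1" means "field n, plus one" while "$(n + 1)" means "the field after field n". A small sketch with made-up numeric input:

```shell
# With n = 1: $(n + 1) is the second field, $n + 1 is the first field plus 1
picked=$(echo "10 20 30" | awk '{ n = 1; print $(n + 1), $n + 1 }')
echo "$picked"
```

This prints 20 (the second field) followed by 11 (the first field plus one).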
Awk Built In Variables
ENVIRON
ENVIRON is an associative array holding info about environment variables.
[amittal@hills arrays]$ awk 'BEGIN { print ENVIRON["USER"] }'
amittal
[amittal@hills arrays]$ awk 'BEGIN { print ENVIRON["PATH"] }'
/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/users/amittal/.local/bin:/users/amittal/bin
Notice that since the awk program is small and does not read a data file, we can put everything in the BEGIN section and run it as a single command.
FS
This is the field separator. By default its value is a single space, which makes awk split fields on runs of spaces and tabs, but we can change that.
[amittal@hills arrays]$ echo "first:second:third" | awk 'BEGIN { FS=":" } { print $1,$2,$3 }'
first second third
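The field separator can also be given on the command line with the "-F" option, which saves writing a BEGIN section for short commands. A minimal sketch using the same input:

```shell
# -F: sets the field separator to a colon before any input is read
second=$(echo "first:second:third" | awk -F: '{ print $2 }')
echo "$second"
```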
RS
RS is the record separator. Usually this is the new line but we can change that.
[amittal@hills arrays]$ echo "first:second:third" | awk 'BEGIN { RS=":" } { print $1 }'
first
second
third
In the above we specified the colon ":" as the record separator. There are 3 records and the first field of each record is printed out.
Exercise:
echo "line1a:line1b:line1c&line2a:line2b:line2c&" | awk -f f1.cmd
Write "f1.cmd" to have the RS as & and FS as : to print the output as:
line1a:line1b:line1c
line2a:line2b:line2c
NR
The "NR" variable holds the number of the current record.
[amittal@hills arrays]$ echo "first:second:third" | awk 'BEGIN { RS=":" } { print $1, NR }'
first 1
second 2
third 3
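With the default record separator, NR is simply the line number, so it can be used to number the lines of any input. A short sketch with made-up input:

```shell
# Prefix each line with its record number
numbered=$(printf 'apple\nbanana\ncherry\n' | awk '{ print NR, $0 }')
echo "$numbered"
```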
Exercise
1) Modify the original example with:
BEGIN { sum=0 ; average=0; noPeople=0 } { sum += $3 ; noPeople++ }
END { print "Average marks:", sum/noPeople }
Take out the "noPeople" and instead use NR .
2) File: "marks2.txt"
Id Name Grade
---------------------
1) John 80
2) Peter 90
3) David 47
4) James 25
5) Lisa 89
6) Kenny 56
7) Sam 95
8) Julia 74
9) Cassie 66
10) Marelena 45
BEGIN { sum=0 ; average=0; noPeople=0 } { sum += $3 ; noPeople++ }
END { print "Average marks:", sum/noPeople }
Add the condition ( NR > 2) to the above command so that the first 2 lines are skipped when doing the calculations.
NF
The "NF" represents the number of fields in a record. We can use this to grab the last field from a record.
[amittal@hills arrays]$ echo "first second third" | awk '{ print $NF }'
third
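Because NF is recomputed for every record, "$NF" works even when the lines have different numbers of fields, and "$(NF - 1)" gives the field before the last. A small sketch with made-up input:

```shell
# Print the field count, the last field and the one before it
counts=$(printf 'a b c\nd e\n' | awk '{ print NF, $NF, $(NF - 1) }')
echo "$counts"
```

The first line has 3 fields and the second has 2, so the output is "3 c b" followed by "2 e d".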
Exercise
1) Use NF and the condition ( NR > 2 ) to just print the grade from the previous example.
Printf
The "printf" function allows us to specify format specifiers . The "printf" function is very powerful and has extra features that are not there in the "print" function.
File: "p1.cmd"
{ printf( "%10s%10s%10s\n", $1 , $2 , $3 ) }
$ echo "Id Name Marks" | awk -f p1.cmd
        Id      Name     Marks
We specify a place holder in the format string, which is the first argument, by using the percent symbol. The values follow the format string, one for each place holder. In the above, "$1" is used for the first "%10s", "$2" for the second and "$3" for the third. The "s" means the value is a string. We must have the same number of values as place holders. The "10" means reserve 10 spaces for the string. If the string is shorter then it is padded with spaces. This can help in aligning the values.
The format string does not need to contain any place holders.
{ printf( "Id Name Marks\n" ) }
$ echo "" | awk -f p2.cmd
Id Name Marks
The function "print" will print a new line by default but "printf" does not do that .
We can use the usual backslash escape sequences of "\n" to represent a new line and "\t" to represent a tab.
Format Specifiers
%c   ASCII character
%d   Decimal integer
%e   Floating point number in scientific notation
%f   Floating point number in fixed notation
%g   The shorter of %e or %f
%o   Octal
%s   String
%x   Hexadecimal
%%   Literal %
We do not declare types in the awk language, but the value we pass to "printf" should suit the specifier we use. If we state "%s" then we should supply a string, and if we state "%d" we should supply a number.
We saw how the statement:
{ printf( "%10s%10s%10s\n", $1 , $2 , $3 ) }
allocated a width of 10 for the string. The spaces are padded on the left. If we want the string to be on the left hand side with the spaces padded on the right then we use the "-10" notation.
File: "p3.cmd"
{ printf( "%-10s%-10s%-10s\n", $1 , $2 , $3 ) }
$ echo "Id Name Marks" | awk -f p3.cmd
We can also restrict the number of decimal places with a specifier such as "%.2f".
$ echo "" | awk '{ printf("%.2f" , 3.41256) }'
3.41
In the above we are stating that the floating point value should be printed with 2 digits after the decimal point. The value is rounded to fit.
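Width, alignment and precision can be combined in one format string. The sketch below is only an illustration; the label "pi" and the "|" characters are made up so that the padding is easy to see.

```shell
# %-6s  : string, left aligned in 6 columns
# %6.2f : number, right aligned in 6 columns with 2 decimal places
row=$(awk 'BEGIN { printf("%-6s|%6.2f|\n", "pi", 3.14159) }')
echo "$row"
```

The string is padded on the right to 6 characters and the number is padded on the left to 6 characters.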
File: "marks.txt"
1) John 80
2) Peter 90
3) David 47
4) James 25
5) Lisa 89
6) Kenny 56
7) Sam 95
8) Julia 74
9) Cassie 66
10) Marelena 45
Exercise:
1) Write an awk command in file "pr4.cmd" . Create a file "pr4.sh" that will have the following line.
File: "pr4.sh"
cat marks.txt | awk -f pr4.cmd
Run the file "./pr4.sh" to produce the output:
$ ./pr4.sh
Id Name Marks
1) John 80
2) Peter 90
3) David 47
4) James 25
5) Lisa 89
6) Kenny 56
7) Sam 95
8) Julia 74
9) Cassie 66
10) Marelena 45
2)
The int function keeps the integer part of a number and throws away the fractional part. It can be used as int( 3.142 ) . Use it together with printf to change the following file.
File: "data1.txt"
1.5 3.1425
14.23 7.5678
3.7 8.6523
4.9 9.4567
to:
1 3.14
14 7.57
3 8.65
4 9.46
Strings
Concatenation of strings.
There is no explicit operator to join strings. All we have to do is write the strings next to each other.
File: "s1.cmd"
{
str1="Ajay" "Mittal"
print str1
str1="Ajay"
str1 = str1 " " "Ajay"
print str1
}
$ echo "" | awk -f s1.cmd
AjayMittal
Ajay Ajay
The expression
str1 " " "Ajay"
joins 3 strings. The contents of the string str1 and a blank space and the string "Ajay" .
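Concatenation is often used to build up a string piece by piece. As a small sketch, the command below joins the fields of a line with a "-" between them; the input "a b c" and the separator are made up for this example.

```shell
# Start with the first field, then append "-" and each following field
joined=$(echo "a b c" | awk '{
    s = $1
    for (i = 2; i <= NF; i++)
        s = s "-" $i
    print s
}')
echo "$joined"
```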
File: "s2.cmd"
{
str1="table"
str2 = ""
l1 = length( str1 )
for( i1=l1; i1 > 0 ; i1-- )
{
#print i1
str2 = str2 substr( str1, i1, 1 )
#print str2
}
print str2
}
$ echo "" | awk -f s2.cmd
elbat
The above code reverses the word in the variable "str1". The function substr takes up to 3 arguments. The first argument is the string. The second argument is the position at which the sub string starts (positions are counted from 1), and the third argument is the number of characters to grab. If the third argument is left out, the rest of the string is returned. If "str1" contains the string "table" then some possible examples are:
substr( str1, 1, 1 ) Result is "t"
substr( str1, 1, 3 ) Result is "tab"
substr( str1, 2 ) Result is "able"
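These calls can be checked directly from the command line. A minimal sketch, with everything in a BEGIN section so that no input is needed:

```shell
# Positions start at 1; omitting the length returns the rest of the string
pieces=$(awk 'BEGIN {
    str1 = "table"
    print substr(str1, 1, 1), substr(str1, 1, 3), substr(str1, 2)
}')
echo "$pieces"
```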
File: "s3.cmd"
{
str1="wood table"
split ( str1 , arr1, " " )
print arr1[1]
print arr1[2]
}
$ echo "" | awk -f s3.cmd
wood
table
The "split" function splits the input string and places the split strings into an array that can be indexed by numbers.
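The "split" function also returns the number of pieces it produced, and the third argument can be any separator. A short sketch with a made-up comma separated string:

```shell
# n receives the number of elements placed in the array "parts"
colours=$(awk 'BEGIN {
    n = split("red,green,blue", parts, ",")
    print n, parts[1], parts[n]
}')
echo "$colours"
```

The array is indexed from 1, so parts[n] is the last piece.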
Comments
Comments in awk are preceded by the hash symbol.
Exercise
1)
Add some comments after the BEGIN part but before the action part in any of the previous exercises.
Awk Control Flow
If condition
Let us modify our "marks1.txt" slightly by adding a column that holds "M" or "F".
1) John M 80
2) Peter M 90
3) David M 47
4) James M 25
5) Lisa F 89
6) Kenny M 56
7) Sam M 95
8) Julia F 74
9) Cassie F 66
10) Marelena F 45
and our awk command:
BEGIN { sum=0 ; average=0; noPeople=0 } {
if ( $3 == "F" )
{
print $0
noPeople++
sum += $4 ;
}
}
END { print "Average marks:", sum/noPeople }
We can use the semicolon to separate each statement . If a statement is on a line and the next statement is on another line then the semicolon is not necessary.
Using the "if" with "else if"
BEGIN { sum1=0 ; average1=0; noPeople1=0
sum2=0 ; average2=0; noPeople2=0
}
{
if ( $3 == "F" )
{
print $0
noPeople1++
sum1 += $4 ;
}
else if ( $3 == "M" )
{
print $0
noPeople2++
sum2 += $4 ;
}
}
END { print "Average marks for F:", sum1/noPeople1
print "Average marks for M:", sum2/noPeople2
}
Output:
[amittal@hills if_condition]$ awk -f awk2 marks1.txt
1) John M 80
2) Peter M 90
3) David M 47
4) James M 25
5) Lisa F 89
6) Kenny M 56
7) Sam M 95
8) Julia F 74
9) Cassie F 66
10) Marelena F 45
Average marks for F: 68.5
Average marks for M: 65.5
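A chain of "else if" branches can end with a plain "else" that catches everything not handled earlier. The sketch below is separate from the marks example and uses made-up numbers as input.

```shell
# Classify each number as negative, zero or positive
labels=$(printf '%s\n' -4 0 7 | awk '{
    if ($1 < 0) {
        print $1, "negative"
    } else if ($1 == 0) {
        print $1, "zero"
    } else {
        print $1, "positive"
    }
}')
echo "$labels"
```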
Exercise
Using the above data file determine the person with the highest mark and the person with the lowest mark. The output should be:
Sam has the highest mark of 95.
James has the lowest mark of 25.
Awk loops
The for loop has the structure:
for( Initial ; Condition ; Post )
{
# Body of the loop
}
The "Initial" part is run once and can be used to initialize variables. The "Condition" part is tested and if true the body of the loop is executed. After the body has been executed the "Post" statement is run. Then the condition is tested again, and so on until the condition becomes false.
Ex:
File: "1.cmd"
{
print "For loop"
for( i1=0 ; i1<3 ; i1++)
print i1
}
$ echo "" | awk -f 1.cmd
For loop
0
1
2
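A common use of the for loop is to visit every field of a record, with NF as the upper limit. A small sketch that adds up the numbers on a made-up input line:

```shell
# Loop from field 1 to field NF and add each field to the total
total=$(echo "1 2 3 4" | awk '{
    total = 0
    for (i = 1; i <= NF; i++)
        total += $i
    print total
}')
echo "$total"
```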
Ex:
{
ind1 = 2 ;
ind2 = $0 - 1
#print ind2 ;
isPrime = 1 ;
for ( ; ind1 <= ind2 ; ind1++ )
{
if ( $0 % ind1 == 0 )
isPrime = 0 ;
}
if ( isPrime == 1 && length( $0 ) > 0 )
{
print $0, " is a prime number."
}
}
data.txt:
20
21
23
17
7
8
9
Command is run as:
awk -f awk1 data.txt
Output:
[amittal@hills while]$ awk -f awk1 data.txt
23 is a prime number.
17 is a prime number.
7 is a prime number.
There is another notation for looping through the array and that is:
for( i1 in array )
do something
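For each item in the array the variable "i1" takes on the value of the index (not the element), and we can access the array value with the notation "array[i1]" . The order in which the indices are visited is not specified, so the output order below may differ on another system.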
File: "awk3_states"
BEGIN {
state["Dublin"] = "California";
state["Reno"] = "Nevada"
state["San Jose"] = "California"
state["Oakland"] = "California"
state["Las Vegas"] = "Nevada"
for( str1 in state )
print str1 , state[str1]
}
Output:
[amittal@hills while]$ awk -f awk3_states
Reno Nevada
Dublin California
Las Vegas Nevada
San Jose California
Oakland California
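The same loop is useful for counting how often each value occurs. Since awk does not promise any particular order for the "for ( in )" loop, one option is to pipe the result through the "sort" command. The sketch below counts made-up state names; the array name "count" is chosen only for this example.

```shell
# count[$1]++ creates the entry on first use and increments it afterwards
tally=$(printf 'Nevada\nCalifornia\nNevada\nCalifornia\nCalifornia\n' |
    awk '{ count[$1]++ } END { for (name in count) print name, count[name] }' |
    sort)
echo "$tally"
```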
While loops
File: "2.cmd"
{
print "While loop"
i1=0
while ( i1 < 3 )
{
print i1
i1++
}
}
$ echo "" | awk -f 2.cmd
While loop
0
1
2
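A while loop fits well when we do not know in advance how many times the body will run. As a small sketch, the command below adds up the digits of a number using the "%" operator and the "int" function; the input 1234 is made up for this example.

```shell
# Peel off the last digit until nothing is left
digits=$(echo "1234" | awk '{
    n = $1
    sum = 0
    while (n > 0) {
        sum += n % 10
        n = int(n / 10)
    }
    print sum
}')
echo "$digits"
```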
Exercise
1)
Modify the prime number example to use a "while" loop instead of a "for" loop.
Awk Arrays
Let's assume we have a file called "cities.txt"
File: "cities.txt"
1) "Dublin"
2) "Reno"
3) "San Jose"
4) "Oakland"
5) "Las Vegas"
We want to try to get the second field but notice some of the cities like "San Jose" have spaces in them .
File: "cities"
{ print $2 }
[amittal@hills arrays]$ awk -f cities cities.txt
"Dublin"
"Reno"
"San
"Oakland"
"Las
We see that the output is not what we want. With the default field separator there isn't any easy way to fix this in awk.
What we can do is change the field separator with our sed command.
File: "convert_cities.sh"
cat cities.txt | sed -r 's/[ ]+/|/' > cities1.txt
File: "cities1.txt"
1)|"Dublin"
2)|"Reno"
3)|"San Jose"
4)|"Oakland"
5)|"Las Vegas"
We have replaced the first series of spaces with the pipe character "|" .
Another way of getting around this problem is to use the quotation as the field separator character.
Exercise:
Write a command in file that prints the cities using the quotation mark as the separator. Run the command file as:
cat cities.txt | awk -f awk1.cmd
Our awk script:
File: "cities1"
BEGIN {
FS="|"
state["Dublin"] = "California";
state["Reno"] = "Nevada"
state["San Jose"] = "California"
state["Oakland"] = "California"
state["Las Vegas"] = "Nevada"
}
{
#gsub is an awk function
gsub(/"/, "", $2)
print $2 " : " state[$2] }
In the BEGIN section we set our field separator to "|" with the command:
FS="|"
There is still a problem and that is the text file contains quotes for the cities. We can remove the quotes with the awk built in "gsub" function . We use an array in the "BEGIN" section to assign states for cities. In the action command we look up the cities with the statement "state[$2]" .
[amittal@hills arrays]$ awk -f cities1 cities1.txt
Dublin : California
Reno : Nevada
San Jose : California
Oakland : California
Las Vegas : Nevada
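The "gsub" function returns the number of replacements it made, and when the third argument is left out it works on the whole line "$0". A short sketch with a made-up input line:

```shell
# Remove every double quote from the line and report how many were removed
cleaned=$(echo 'say "hi" now' | awk '{ n = gsub(/"/, ""); print n, $0 }')
echo "$cleaned"
```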
We do not have to declare the array or its size. The arrays are associative, which means a subscript value can be a number or a string.
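To check whether a subscript exists we can use the "in" operator. This is safer than comparing the element with an empty string, because merely referring to an element creates it. A minimal sketch, where the city "Boston" is a made-up example of a missing entry:

```shell
# "Reno" was assigned, "Boston" was not
lookup=$(awk 'BEGIN {
    state["Reno"] = "Nevada"
    if ("Reno" in state) print "Reno is in", state["Reno"]
    if (!("Boston" in state)) print "Boston is not listed"
}')
echo "$lookup"
```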
We can of course use the awk arrays in the traditional sense:
File: "fib1"
BEGIN {
#holder 1 to 10 for fibonacci number
holder[1] = 1
holder[2] = 1
for( i1=3 ; i1<=10 ; i1++ )
{
holder[ i1 ] = holder[ i1 - 1 ] + holder[ i1 - 2 ]
}
for( i1=1 ; i1<=10 ; i1++ )
{
print holder[i1]
}
}
Output:
[amittal@hills arrays]$ awk -f fib1
1
1
2
3
5
8
13
21
34
55
Awk functions
We have awk built in functions that are provided to us and we can also define our own functions if we wish.
File: awk3
BEGIN {
arr[0] = "Three"
arr[1] = "One"
arr[2] = "Two"
print "Array elements before sorting:"
for (i1 in arr) {
print arr[i1]
}
asort(arr)
print "Array elements after sorting:"
for (i1 in arr) {
print arr[i1] , length( arr[i1] )
}
}
Output:
[amittal@hills functions]$ awk -f awk3
Array elements before sorting:
Three
One
Two
Array elements after sorting:
One 3
Three 5
Two 3
We are using the "asort" function to sort and the "length" function to obtain the length of the string. Note that "asort" is a GNU awk (gawk) extension and is not available in every awk implementation.
In the below example we have written a function that returns a 1 if the number passed to it in the argument is a prime number.
File: "awk1"
function isPrimeNo( num1 )
{
# Use the argument num1 rather than $0 so the function works for any value
ind1 = 2 ;
ind2 = num1 - 1
#print ind2 ;
isPrime = 1 ;
for ( ; ind1 <= ind2 ; ind1++ )
{
if ( num1 % ind1 == 0 )
isPrime = 0 ;
}
if ( isPrime == 1 && length( num1 ) > 0 )
{
#print num1, " is a prime number."
return 1
}
return 0
}
{
if ( isPrimeNo( $0 ) == 1 )
print $0, " is a prime number."
}
File: "data.txt"
20
21
23
17
7
8
9
[amittal@hills functions]$ awk -f awk1 data.txt
23 is a prime number.
17 is a prime number.
7 is a prime number.
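Variables used inside an awk function are global unless they appear in the parameter list. A common convention is to list the local variables as extra parameters, separated from the real ones by some extra spaces. The sketch below is a separate illustration; the function name "sumTo" is made up for this example.

```shell
result=$(awk '
# i and total are listed as extra parameters so they stay local
function sumTo(n,    i, total) {
    for (i = 1; i <= n; i++)
        total += i
    return total
}
BEGIN {
    i = 99
    print sumTo(4), i
}')
echo "$result"
```

The global variable "i" still holds 99 after the call because the loop inside the function used its own local "i".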
Exercise
1) File: "data.txt"
2 3
4 2
3 3
10 3
Use the file "power1.cmd" to fill in the function for power.
File: "power1.cmd"
function power( num1 , num2 )
{
# TO DO
}
{
value=power($1, $2)
print $1, $2 , value
}
$ cat data.txt | awk -f power1.cmd
2 3 8
4 2 16
3 3 27
10 3 1000
Exercise
The date command prints out the date and time.
$ date
Sat, Mar 13, 2021 7:42:55 AM
Create a folder called "exb" and in it create a folder called "source" and create some files in it. Write a script that will create a folder using the output of the date command and copy the files from the source folder to this folder. For the output above our folder will have the name:
backup_2021_Mar_13_7_42_55
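1)
whoami | sed -r 's/[.]*/Thanks for spending some time with me today: /'
prints
Thanks for spending some time with me today: amittal
whereas
whoami | sed -r 's/.*/Thanks for spending some time with me today: /'
prints
Thanks for spending some time with me today:
Inside square brackets the dot matches only a literal dot, so "[.]*" matches the empty string at the start of the line and the text is inserted in front of the name. Outside brackets the dot matches any character, so ".*" matches the whole line and replaces it.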