Awk

Awk is used to process text files: it can format reports and perform string and arithmetic operations.

An awk program does something at the beginning, reads a line from a file (or from a pipe), performs operations on it, reads the next line, and so on. When the input is exhausted it performs an end operation.

Sample awk command:

[amittal@hills awk]$ ls -l | awk '{ sum += $5 } END { print sum }'

783

In the above command we use "ls -l" to list all the files in the folder; the fifth column holds the size of each file in bytes. The "$5" signifies the 5th field, and we add it to a variable named "sum". The statement after "END" is executed once all the lines have been processed, so the sum of the bytes is printed at the end.

The word "awk" does not come from the word "awkward" but from the surnames of its authors: Alfred Aho, Peter J. Weinberger and Brian Kernighan. Awk has its own scripting language that may look similar to the "C" programming language. It uses regular expressions for the pattern matching parts of the program. These are the same Unix regular expressions that were covered in a previous section. The awk language is not the same as the Unix shell language, though. It is its own language.

The structure of an awk program is usually:

Do the BEGIN section

For every line read from a file

Pattern match Execute the command

Do the END section

The command section consists of one or more pattern/action pairs. The BEGIN section is optional, as is the END section. Within a pair we can give only the action, in which case it runs for every line, or only the pattern, in which case the matching lines are printed. If both are absent there is nothing to execute and nothing is printed out.
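As a minimal sketch of this structure, the following command feeds three made-up lines to awk with printf. The sample data and the variable name "total" are assumptions chosen for illustration; they are not part of the examples above.

```shell
# The input lines are invented sample data, generated with printf
# so that the example does not depend on any files.
result=$(printf 'a 10\nb 20\nc 30\n' | awk '
BEGIN   { print "start" }         # runs once, before any input is read
$2 > 15 { total += $2 }           # runs only for lines whose 2nd field exceeds 15
END     { print "total", total }  # runs once, after the last line
')
echo "$result"
```

Only the lines "b 20" and "c 30" satisfy the pattern, so the command prints "start" followed by "total 50".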

Let us modify our awk command to include more stuff.

ls -l | awk 'BEGIN { print "Sum of the files modified in November." } $6 == "Nov" { sum += $5 } END { print sum }'

Sum of the files modified in November.

The "BEGIN" section has a command that prints a line and then we have a pattern that checks if the 6th field is "Nov" and the rest of the line is the same as before.

We can have multiple pattern / action statements.

ls -l | awk 'BEGIN { print "Sum of the files modified in October November." } $6 == "Nov" { sum += $5 }

$6 == "Oct" { sum += $5 } END { print sum }'

Another awk example. Let's say we want to kill every Unix process whose name matches a pattern and whose elapsed time is greater than a given value (here, 20 seconds).

kill -9 $( ps -eo comm,pid,etimes | awk '/main/ {if( $3 > 20) { print $2 }}')
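Because "kill -9" is destructive, it can help to first check which process ids awk would select. The sketch below runs the same kind of filter over made-up "ps" output; the process names and numbers are invented for illustration. It also tests the command name in $1 with the "~" match operator rather than matching the whole line, which is a variation on the command above.

```shell
# Invented sample of "ps -eo comm,pid,etimes" output: name, pid, elapsed seconds.
sample='COMMAND PID ELAPSED
main 101 45
main 102 5
bash 103 900
mainloop 104 30'

# Select the pid when the name contains "main" and the elapsed time exceeds 20.
pids=$(printf '%s\n' "$sample" | awk '$1 ~ /main/ && $3 > 20 { print $2 }')
echo "$pids"
```

With this sample the command prints 101 and 104. Once the list looks right, the same awk program can be placed inside the "kill" command.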

Running awk

There are different ways to run awk commands.

Command Line:

We can run awk from the command line.

ls -l | awk '{ print $0 }'

Remember the awk structure is :

Do the BEGIN section

For every line read from a file

Pattern match Execute the command

Do the END section

Our command:

ls -l | awk '{ print $0 }'

does not have the optional BEGIN or END sections. It also does not have the (optional) pattern, so the command in the curly braces is executed for every line.

The "$0" means the whole of the input line.

$ ls -l | awk '{ print $0 }'

total 3

-rwxr-xr-x 1 user None 96 Feb 20 16:19 1.sh

-rwxr-xr-x 1 user None 104 Feb 20 16:52 2.sh

-rwxr-xr-x 1 user None 83 Feb 20 16:56 3.cmd

If we don't provide an input to "awk" then it takes its input from standard input, that is, whatever we type at the terminal. This is similar to the way "grep" and "sed" behave.

$ awk '{ print $0 }'

Test

First

Commands in a file

We can place the commands in a file. In this example we place the commands in a file called "awk4_commands" .

File: awk4_commands

BEGIN { print "Sum of the files modified in October November." } $6 == "Nov" { sum += $5 }

$6 == "Oct" { sum += $5 } END { print sum }

[amittal@hills awk_manual]$ ls -l | awk -f awk4_commands

Sum of the files modified in October November.

1459

We do not have to put single quotes around the commands and can place statements on different lines. To run the commands we use the "-f" option. This is often the easiest way to run awk. However, we do have to be careful about where we break the lines.

BEGIN {print "Printing the sum of file sizes"}

{ sum += $5 }

END { print sum }

The above works fine as each line ends with a closing curly brace. But the below does not work, because END is separated from its action.

BEGIN {print "Printing the sum of file sizes"}

{ sum += $5 }

END

{ print sum }

We can place a backslash at the end of each line except the last to take care of this problem. Only the one after "END" is strictly needed here.

BEGIN {print "Printing the sum of file sizes"}\

{ sum += $5 }\

END\

{ print sum }

$ ls -l | awk -f 32.cmd

Printing the sum of file sizes

257

Let's look at the following command:

BEGIN \

{ print "Sum of the files modified in October November.";

sum = 0 }

$6 == "Nov"

{ sum += $5 }

$6 == "Oct" { sum += $5 }

END \

{ print sum }

We have a condition $6 == "Nov" and we want the action associated with this condition to be { sum += $5 }. However, that's not how awk reads it. The pattern and the action are each optional. The $6 == "Nov" line is read as a pattern with no action, so awk applies the default action and prints the matching line. The next line is an action with no pattern, so it is executed for every line regardless of the pattern. What we wanted was for the action to be executed only if the pattern matched.

BEGIN \

{ print "Sum of the files modified in October November.";

sum = 0 }

$6 == "Nov" \

{ sum += $5 }

$6 == "Oct" { sum += $5 }

END \

{ print sum }

The "\" forces awk to consider the $6 == "Nov" and { sum += $5 } as one line. Now the action is executed only if the pattern matches.
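The difference can be seen on a small scale with the following sketch. The two input lines are invented, and the fields are simplified to a month in $1 and a size in $2, which is an assumption made to keep the example short.

```shell
# Pattern and action on separate lines: awk sees two independent rules.
# The bare pattern prints matching lines; the bare action runs for every line.
wrong=$(printf 'Nov 5\nOct 7\n' | awk '
$1 == "Nov"
{ sum += $2 }
END { print sum }')

# The trailing backslash joins the pattern to its action, giving one rule.
right=$(printf 'Nov 5\nOct 7\n' | awk '
$1 == "Nov" \
{ sum += $2 }
END { print sum }')

echo "without backslash:"; echo "$wrong"
echo "with backslash:";    echo "$right"
```

Without the backslash the output is the line "Nov 5" followed by 12 (both sizes added). With the backslash only the November size is added and the output is 5.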

Shell Script

We can place all the commands in a shell script, make the shell script executable and then run it.

ls -l | awk 'BEGIN { print "Sum of the files modified in October November." } $6 == "Nov" { sum += $5 }

$6 == "Oct" { sum += $5 }

END { print sum }'

The above can be placed in a file called "awk3.sh", made executable and then run. The same rules apply for breaking lines. If a pattern such as END is on a separate line from its action, we need to place a backslash at the end of that line.

ls -l | awk 'BEGIN { print "Sum of the files modified in October November." } $6 == "Nov" { sum += $5 }

$6 == "Oct" { sum += $5 }

END \

{ print sum }'

We can run the awk command with multiple files also:

awk -f awk4_commands file1 file2 file3
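A self-contained sketch of running a command file against several input files follows. The file names "sum.cmd", "file1" and "file2" and their contents are hypothetical, and everything is created in a temporary directory so that no existing files are touched.

```shell
# Work in a temporary directory so that no existing files are touched.
tmpdir=$(mktemp -d)

# A two-line awk program file (the name "sum.cmd" is hypothetical).
printf '{ sum += $1 }\nEND { print sum }\n' > "$tmpdir/sum.cmd"

# Two small input files with one number per line.
printf '1\n2\n' > "$tmpdir/file1"
printf '3\n4\n' > "$tmpdir/file2"

# awk reads file1 and then file2 as one continuous stream of records.
total=$(awk -f "$tmpdir/sum.cmd" "$tmpdir/file1" "$tmpdir/file2")
echo "$total"

rm -r "$tmpdir"
```

The END block runs once, after the last file, so the total covers both files and the command prints 10.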

If we do not pipe anything to the awk command and do not specify any files either then awk will take the input from the console.

[amittal@hills awk_manual]$ awk /test/

testing

rose

home

this is a test

The above does not have a begin, end or an action part but does have the pattern part.

Pattern Action

We have studied that in addition to the normal Regular Expressions we also have Extended Regular Expressions. Awk works with Extended Regular Expressions by default.

We can omit the pattern or the action .

The "print $5" prints the 5th field.

ls -l | awk '/Nov/ { print $5 }'

Output:

Using only the pattern.

ls -l | awk '/Nov/'

-rw------- 1 amittal csdept 0 Nov 5 07:28 data.txt

drwxrwxrwx 3 amittal csdept 19 Nov 13 10:49 midterm1

-rw-r--r-- 1 amittal csdept 45 Nov 5 06:53 notes.txt

-rwxrwxrwx 1 amittal csdept 49 Nov 5 06:52 var1.sh

The above prints the whole lines that have the word "Nov" in them .

Exercise

Ex1:

Create a folder called "awk" and in this folder create 2 files: "notes1.txt" and "notes2.txt" . Remember "$" means the end of line anchor. Use the regular expression "txt$" in awk to print out the listing of the text files.

ls -l | awk TODO

Ex2:

In the same folder "awk" create another file "notestxt" (a name that ends in txt but does not have a dot). Write the regular expression in the awk command that will list only the files with ".txt" at the end.

Ex3:

Print only the first field of the "ls -l" output. This is the permissions field.

Solutions

Soln 1:

ls -l | awk '/txt$/'

Soln 2:

ls -l | awk '/\.txt$/'

Soln 3:

$ ls -l | awk '{ print $1 }'

Using only the action

ls -l | awk '{ print $0 }'

Output:

total 8

-rw------- 1 amittal csdept 0 Nov 5 07:28 data.txt

drwxrwxrwx 3 amittal csdept 19 Nov 13 10:49 midterm1

-rw-r--r-- 1 amittal csdept 45 Nov 5 06:53 notes.txt

-rwxrwxrwx 1 amittal csdept 49 Nov 5 06:52 var1.sh

The above prints every line in the input.

Patterns

The awk command's structure is

awk

Begin

Pattern Command

End

For the examples below we will be using a data file called "marks1.txt" .

1) John 80

2) Peter 90

3) David 47

4) James 25

5) Lisa 89

6) Kenny 56

7) Sam 95

8) Julia 74

9) Cassie 66

10) Marelena 45

Let's look at the different ways we can input the pattern command.

We shall place the commands in the file "awk1" and then run the awk command from the command line as:

awk -f awk1 marks1.txt

We place a regular expression between forward slashes "/ /". Here we place the letter "l" inside them.

/l/ {print $0}

Output:

[amittal@hills pattern]$ awk -f awk1 marks1.txt

8) Julia 74

10) Marelena 45

All the lines with the letter "l" are printed out.

Using a regular expression:

/a?e/ {print $0}

[amittal@hills pattern]$ awk -f awk1 marks1.txt

2) Peter 90

4) James 25

6) Kenny 56

9) Cassie 66

10) Marelena 45

Two patterns separated by a comma signify a range:

/is/,/am/ {print $0}

[amittal@hills pattern]$ awk -f awk1 marks1.txt

5) Lisa 89

6) Kenny 56

7) Sam 95

The above starts printing at the first line that matches the first pattern and continues up to and including the next line that matches the second pattern.
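The inclusive behaviour of a range can be checked with a short sketch. The five input words are invented for illustration.

```shell
# A range pattern with no action: print from the first line matching /two/
# through the next line matching /four/, both ends included.
out=$(printf 'one\ntwo\nthree\nfour\nfive\n' | awk '/two/,/four/')
echo "$out"
```

The lines "two", "three" and "four" are printed; "one" and "five" fall outside the range.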

We can also use an expression as a pattern. In the example below we use "&&" to "and" two expressions.

($3 > 50 && $3 < 60 ) {print $0}

Output:

[amittal@hills pattern]$ awk -f awk1 marks1.txt

6) Kenny 56

Exercises

Ex1: Write an awk command in a file called "ex1.cmd" that will print the id, name and a letter grade of "A" to a student whose score is above 50. It should also print the title Id, Name and Grade.

Output:

Id Name Grade

1) John A

2) Peter A

5) Lisa A

6) Kenny A

7) Sam A

8) Julia A

9) Cassie A

Using BEGIN and END

We shall use the same technique of placing the awk commands in a file .

BEGIN { print "BEGIN" } { } END { print "END" }

In the above, before awk processes the input it prints the word "BEGIN". The data is then processed one line at a time, but the action that we have is blank so nothing is done. After the data has been processed, the "END" statement, which prints "END" to the console, is executed.

[amittal@hills BeginEnd]$ awk -f awk1 marks1.txt

BEGIN

END

BEGIN { sum=0 ; average=0; noPeople=0 } { sum += $3 ; noPeople++ }

END { print "Average marks:", sum/noPeople }

[amittal@hills BeginEnd]$ awk -f awk1 marks1.txt

Average marks: 66.7

We can use user defined variables in our awk script. The variables are similar to the variables in Shell . We do not have to declare the type. The increment operator "++" increases the value by 1 .
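One detail worth guarding against is dividing by zero when the input is empty. The sketch below wraps the average calculation in a shell function; the function name "average", the counter name "n" and the two sample lines are assumptions made for illustration.

```shell
# average: prints the mean of the 3rd field, or a message when there is no input.
average() {
    awk '{ sum += $3; n++ }
         END { if (n > 0) print "Average marks:", sum / n
               else       print "No records" }'
}

with_data=$(printf '1) John 80\n2) Peter 90\n' | average)
no_data=$(average < /dev/null)

echo "$with_data"
echo "$no_data"
```

With the two sample lines the function prints "Average marks: 85". With empty input the counter is never incremented, so the END block prints "No records" instead of attempting the division.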

Exercise:

1) Modify the command file to print:

Number of people: (Actual number of people) Total: ( Actual total )

Average marks: 66.7

Let's rewrite the above to use the average variable.

BEGIN { sum=0 ; average=0; noPeople=0 } { sum += $3 ; noPeople++ }

END {

average = sum/noPeople

print "Average marks:", average }

[amittal@hills BeginEnd]$ awk -f awk2 marks1.txt

Average marks: 66.7

We are going to change the above command slightly by moving the "{" that follows "END" to the next line.

BEGIN { sum=0 ; average=0; noPeople=0 } { sum += $3 ; noPeople++ }

END

{

average = sum/noPeople

print "Average marks:", average }

We receive an error of the form:

[amittal@hills BeginEnd]$ awk -f awk2 marks1.txt

awk: awk2:5: END blocks must have an action part

We need the "{" on the same line as the BEGIN or END keyword, or we can use the backslash "\" to continue the line.

BEGIN { sum=0 ; average=0; noPeople=0 } { sum += $3 ; noPeople++ }

END \

{

average = sum/noPeople

print "Average marks:", average }

Output:

[amittal@hills BeginEnd]$ awk -f awk2 marks1.txt

Average marks: 66.7

Fields

Assume we have a data file called "marks.txt"

1) Amit Physics 80

2) Rahul Maths 90

3) Shyam Biology 87

4) Kedar English 85

5) Hari History 89

The fields are labelled as $1, $2 and so on to represent first, second fields and so on. The "$0" represents the whole line. The "print" command without any arguments will print the whole line.

awk '{print}' marks.txt

or the equivalent statement:

awk '{print $0}' marks.txt

Output:

1) Amit Physics 80

2) Rahul Maths 90

3) Shyam Biology 87

4) Kedar English 85

5) Hari History 89

The field number does not have to be a constant.

awk 'BEGIN {var1=3} {print $var1}' marks.txt

Output:

Physics

Maths

Biology

English

History

We can separate the items in a print statement with a comma. The string literals must be quoted.

awk 'BEGIN {var1=3} {print $var1, " " , $(var1-1)}' marks.txt
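The parentheses in $(var1-1) matter. The following sketch, which uses one line in the style of "marks.txt" and sets var1 to 4 (an arbitrary choice for illustration), shows the two readings side by side.

```shell
# $(var1-1) computes the field number first: field 3.
# $var1-1 takes field 4 first and then subtracts 1 from its value.
out=$(echo "1) Amit Physics 80" | awk '{
    var1 = 4
    print $(var1-1)
    print $var1-1
}')
echo "$out"
```

The first print gives "Physics" (field 3) and the second gives 79 (the value of field 4 minus 1), because "$" binds more tightly than the subtraction.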

Awk Built In Variables

ENVIRON

ENVIRON is an associative array holding info about environment variables.

[amittal@hills arrays]$ awk 'BEGIN { print ENVIRON["USER"] }'

amittal

[amittal@hills arrays]$ awk 'BEGIN { print ENVIRON["PATH"] }'

/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/users/amittal/.local/bin:/users/amittal/bin

Notice that since the awk program consists only of a BEGIN block, awk does not read any input, so we do not need to supply a data file.

FS is the field separator. By default its value is a space but we can change that.

[amittal@hills arrays]$ echo "first:second:third" | awk 'BEGIN { FS=":" } { print $1,$2,$3 }'

first second third
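Two related features of standard awk that are not covered above may be useful here: the "-F" command line option sets the field separator without a BEGIN block, and the built in variable OFS sets the separator that "print" places between comma separated items. A small sketch:

```shell
# Setting FS in a BEGIN block and using the -F option give the same fields.
a=$(echo "first:second:third" | awk 'BEGIN { FS=":" } { print $2 }')
b=$(echo "first:second:third" | awk -F: '{ print $2 }')

# OFS is the output field separator used between the items of a print.
c=$(echo "first:second:third" | awk 'BEGIN { FS=":"; OFS="-" } { print $1, $2, $3 }')

echo "$a"; echo "$b"; echo "$c"
```

The first two commands both print "second", and the third prints "first-second-third".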

RS is the record separator. Usually this is the new line but we can change that.

[amittal@hills arrays]$ echo "first:scond:third" | awk 'BEGIN { RS=":" } { print $1 }'

first

scond

third

In the above we specified the colon ":" as the record separator. There are 3 records and the first field of each record is printed out.

Exercise:

echo "line1a:line1b:line1c&line2a:line2b:line2c&" | awk -f f1.cmd

Write "f1.cmd" to have the RS as & and FS as : to print the output as:

line1a:line1b:line1c

line2a:line2b:line2c

The "NR" variable holds the number of the current record.

[amittal@hills arrays]$ echo "first:scond:third" | awk 'BEGIN { RS=":" } { print $1, NR }'

first 1

scond 2

third 3

Exercise

1) Modify the original example with:

BEGIN { sum=0 ; average=0; noPeople=0 } { sum += $3 ; noPeople++ }

END { print "Average marks:", sum/noPeople }

Take out the "noPeople" and instead use NR .

2) File: "marks2.txt"

Id Name Grade

---------------------

1) John 80

2) Peter 90

3) David 47

4) James 25

5) Lisa 89

6) Kenny 56

7) Sam 95

8) Julia 74

9) Cassie 66

10) Marelena 45

BEGIN { sum=0 ; average=0; noPeople=0 } { sum += $3 ; noPeople++ }

END { print "Average marks:", sum/noPeople }

Add the condition ( NR > 2) to the above command so that the first 2 lines are skipped when doing the calculations.

The "NF" represents the number of fields in a record. We can use this to grab the last field from a record.

[amittal@hills arrays]$ echo "first scond third" | awk '{ print $NF }'

third
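Since NF is an ordinary number, it can also be used in arithmetic to reach fields counted from the end. A short sketch with an invented input line:

```shell
# NF is the field count, $NF the last field, $(NF-1) the one before it.
out=$(echo "a b c d" | awk '{ print NF, $NF, $(NF-1) }')
echo "$out"
```

The line has four fields, so the command prints "4 d c".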

Exercise

1) Use NF and the condition ( NR > 2 ) to just print the grade from the previous example.

Printf

The "printf" function allows us to specify format specifiers . The "printf" function is very powerful and has extra features that are not there in the "print" function.

File: "p1.cmd"

{ printf( "%10s%10s%10s\n", $1 , $2 , $3 ) }

$ echo "Id Name Marks" | awk -f p1.cmd

        Id      Name     Marks

We specify a placeholder in the first argument by using the percent symbol, and then supply the values after the first argument. In the above, the first value, $1, is used for the first "%10s". The "s" means the value is a string. We must supply the same number of values as there are placeholders. The "10" means reserve 10 character positions for the string. If the string is shorter, it is padded with spaces. This helps in aligning the values.

The format string does not need to contain any placeholders.

{ printf( "Id Name Marks\n" ) }

$ echo "" | awk -f p2.cmd

Id Name Marks

The function "print" will print a new line by default but "printf" does not do that .

We can use the usual backslash escape sequences: "\n" represents a new line and "\t" represents a tab.

Format Specifiers

%c ASCII Character

%d Decimal integer

%e Floating point number in scientific notation

%f Floating point number in fixed point notation

%g The shorter of %e or %f

%o Octal

%s String

%x Hexadecimal

%% Literal %

Awk variables do not have declared types, but the value we print should match the type named in the "printf" format string. If we use "%s" then we should supply a string.
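A few of the specifiers from the table can be tried in one command. The numbers 255, 65 and 1234.5 are arbitrary values chosen for illustration.

```shell
# The same number shown as decimal, octal and hexadecimal,
# then a character code shown with %c and a value in scientific notation.
out=$(awk 'BEGIN { printf("%d %o %x %c %e\n", 255, 255, 255, 65, 1234.5) }')
echo "$out"
```

The command prints "255 377 ff A 1.234500e+03": 255 is 377 in octal and ff in hexadecimal, and 65 is the ASCII code of "A".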

We saw how the statement:

{ printf( "%10s%10s%10s\n", $1 , $2 , $3 ) }

allocated a width of 10 for the string. The spaces are padded on the left. If we want the string to be on the left hand side with the spaces padded on the right then we use the "-10" notation.

File: "p3.cmd"

{ printf( "%-10s%-10s%-10s\n", $1 , $2 , $3 ) }

$ echo "Id Name Marks" | awk -f p3.cmd

We can also restrict the number of decimal places with a specifier such as ".2f".

$ echo "" | awk '{ printf("%.2f" , 3.41256) }'

3.41

In the above we are stating that the floating point value should be printed with 2 digits after the decimal point.
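Width, alignment and precision can be combined and compared in one format string. In the sketch below the square brackets are only there to make the padding visible, and the values are arbitrary.

```shell
# %5s pads on the left, %-5s pads on the right,
# %05d pads a number with zeros, %.2f keeps two digits after the point.
out=$(awk 'BEGIN { printf("[%5s][%-5s][%05d][%.2f]\n", "ab", "ab", 42, 3.14159) }')
echo "$out"
```

The command prints "[   ab][ab   ][00042][3.14]".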

File: "marks.txt"

1) John 80

2) Peter 90

3) David 47

4) James 25

5) Lisa 89

6) Kenny 56

7) Sam 95

8) Julia 74

9) Cassie 66

10) Marelena 45

Exercise:

1) Write an awk command in file "pr4.cmd". Create a file "pr4.sh" that will have the following line.

File: "pr4.sh"

cat marks.txt | awk -f pr4.cmd

Run the file "./pr4.sh" to produce the output:

$ ./pr4.sh

Id Name Marks

1) John 80

2) Peter 90

3) David 47

4) James 25

5) Lisa 89

6) Kenny 56

7) Sam 95

8) Julia 74

9) Cassie 66

10) Marelena 45

The int function keeps the integer part of a number and throws away the fractional part. It can be used as int( 3.142 ). Use it together with printf to change the following file.

File: "data1.txt"

1.5 3.1425

14.23 7.5678

3.7 8.6523

4.9 9.4567

to:

1 3.14

14 7.57

3 8.65

4 9.46

Strings

Concatenation of strings.

There is no explicit operation to join strings. All we have to do is write the strings next to each other.

File: "s1.cmd"

{

str1="Ajay" "Mittal"

print str1

str1="Ajay"

str1 = str1 " " "Ajay"

print str1

}

$ echo "" | awk -f s1.cmd

AjayMittal

Ajay Ajay

The expression

str1 " " "Ajay"

joins 3 strings. The contents of the string str1 and a blank space and the string "Ajay" .

File: "s2.cmd"

{

str1="table"

str2 = ""

l1 = length( str1 )

for( i1=l1; i1 > 0 ; i1-- )

{

#print i1

str2 = str2 substr( str1, i1, 1 )

#print str2

}

print str2

}

$ echo "" | awk -f s2.cmd

elbat

The above code reverses the word in the variable "str1". The function substr takes up to 3 arguments. The first argument is the string. The second argument is the position at which the substring starts (counting from 1), and the third argument is the number of characters to take. If the third argument is omitted, the rest of the string is taken. If "str1" contains the string "table" then some possible examples are:

substr( str1, 1, 1 ) Result is "t"

substr( str1, 1, 3 ) Result is "tab"

substr( str1, 2 ) Result is "able"
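The three calls can be run directly from a BEGIN block, so no input file is needed. The variable name "s" is just a shorthand chosen for this sketch.

```shell
# substr positions start at 1; omitting the length takes the rest of the string.
out=$(awk 'BEGIN {
    s = "table"
    print substr(s, 1, 1)
    print substr(s, 1, 3)
    print substr(s, 2)
}')
echo "$out"
```

The command prints "t", "tab" and "able" on separate lines.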

File: "s3.cmd"

{

str1="wood table"

split ( str1 , arr1, " " )

print arr1[1]

print arr1[2]

}

$ echo "" | awk -f s3.cmd

wood

table

The "split" function splits the input string and places the split strings into an array that can be indexed by numbers.
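The "split" function also returns the number of pieces it produced, which is convenient for looping over the result. The sketch below splits an invented date string on ":"; the names "parts" and "n" are illustrative.

```shell
# split returns the number of elements and fills parts[1] .. parts[n].
out=$(awk 'BEGIN {
    n = split("2021:Mar:13", parts, ":")
    print n
    for (i = 1; i <= n; i++)
        print i, parts[i]
}')
echo "$out"
```

The command prints 3, followed by "1 2021", "2 Mar" and "3 13".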

Comments

Comments in awk are preceded by the hash symbol.

Exercise

Add some comments after the BEGIN part but before the action part in any of the previous exercises.

Awk Control Flow

If condition

Let us modify our "marks.txt" slightly .

1) John M 80

2) Peter M 90

3) David M 47

4) James M 25

5) Lisa F 89

6) Kenny M 56

7) Sam M 95

8) Julia F 74

9) Cassie F 66

10) Marelena F 45

and our awk command:

BEGIN { sum=0 ; average=0; noPeople=0 } {

if ( $3 == "F" )

{

print $0

noPeople++

sum += $4 ;

}

}

END { print "Average marks:", sum/noPeople }

We can use the semicolon to separate each statement . If a statement is on a line and the next statement is on another line then the semicolon is not necessary.

Using the "if" with "else if"

BEGIN { sum1=0 ; average1=0; noPeople1=0

sum2=0 ; average2=0; noPeople2=0

}

{

if ( $3 == "F" )

{

print $0

noPeople1++

sum1 += $4 ;

}

else if ( $3 == "M" )

{

print $0

noPeople2++

sum2 += $4 ;

}

}

END { print "Average marks for F:", sum1/noPeople1

print "Average marks for M:", sum2/noPeople2

}

Output:

[amittal@hills if_condition]$ awk -f awk2 marks1.txt

1) John M 80

2) Peter M 90

3) David M 47

4) James M 25

5) Lisa F 89

6) Kenny M 56

7) Sam M 95

8) Julia F 74

9) Cassie F 66

10) Marelena F 45

Average marks for F: 68.5

Average marks for M: 65.5

Exercise

Using the above data file determine the person with the highest marks and the person with the lowest mark.

Sam has the highest mark of 95.

James has the lowest mark of 25.

Awk loops

The for loop has the structure:

for( Initial ; Condition ; Post )

{

# Body of the loop

}

The "Initial" part is run once and can be used to initialize variables. The condition part is tested and, if true, the body of the loop is executed. After the body has been executed the post statement is run, after which the condition is tested again, and so on until the condition becomes false.

Ex:

File: "1.cmd"

{

print "For loop"

for( i1=0 ; i1<3 ; i1++)

print i1

}

$ echo "" | awk -f 1.cmd

For loop

0

1

2

Ex:

{

ind1 = 2 ;

ind2 = $0 - 1

#print ind2 ;

isPrime = 1 ;

for ( ; ind1 <= ind2 ; ind1++ )

{

if ( $0 % ind1 == 0 )

isPrime = 0 ;

}

if ( isPrime == 1 && length( $0 ) > 0 )

{

print $0, " is a prime number."

}

}

data.txt:

Command is run as:

awk -f awk1 data.txt

Output:

[amittal@hills while]$ awk -f awk1 data.txt

23 is a prime number.

17 is a prime number.

7 is a prime number.

There is another notation for looping through the array and that is:

for( i1 in array )

do something

For each item in the array the variable "i1" will take on the value of the "index" element and we can access the array value with the notation "array[i1]" .

File: "awk3_states"

BEGIN {

state["Dublin"] = "California";

state["Reno"] = "Nevada"

state["San Jose"] = "California"

state["Oakland"] = "California"

state["Las Vegas"] = "Nevada"

for( str1 in state )

print str1 , state[str1]

}

Output:

[amittal@hills while]$ awk -f awk3_states

Reno Nevada

Dublin California

Las Vegas Nevada

San Jose California

Oakland California
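A common use of the "for ... in" loop is to report counts collected in an associative array. Awk does not guarantee the order in which the loop visits the elements, so the sketch below pipes the result through "sort" to get a stable order. The input lines and the array name "count" are invented for illustration.

```shell
# Count how many times each state name appears, then list the counts.
# The output of the for-in loop is sorted because its order is unspecified.
out=$(printf 'Nevada\nCalifornia\nNevada\nCalifornia\nCalifornia\n' |
      awk '{ count[$1]++ } END { for (k in count) print k, count[k] }' |
      sort)
echo "$out"
```

After sorting, the command prints "California 3" followed by "Nevada 2".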

While loops

File: "2.cmd"

{

print "While loop"

i1=0

while ( i1 < 3 )

{

print i1

i1++

}

}

$ echo "" | awk -f 2.cmd

While loop

0

1

2
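A while loop is a natural fit when the number of iterations is not known in advance. As a sketch, the following program halves its input for as long as it stays even; the input value 40 and the variable names are arbitrary choices for illustration.

```shell
# Count how many times the input number can be halved before it becomes odd.
out=$(echo 40 | awk '{
    n = $1
    steps = 0
    # Keep halving while n is still even.
    while (n % 2 == 0) {
        n = n / 2
        steps++
    }
    print steps, n
}')
echo "$out"
```

Starting from 40 the loop visits 20, 10 and 5, so the command prints "3 5".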

Exercise

Modify the prime number example to use a "while" loop instead of a "for" loop.

Awk Arrays

Let's assume we have a file called "cities.txt"

File: "cities.txt"

1) "Dublin"

2) "Reno"

3) "San Jose"

4) "Oakland"

5) "Las Vegas"

We want to try to get the second field but notice some of the cities like "San Jose" have spaces in them .

File: "cities"

{ print $2 }

[amittal@hills arrays]$ awk -f cities cities.txt

"Dublin"

"Reno"

"San

"Oakland"

"Las

We see that the output is not what we want, because awk splits the city names at the space. There isn't an easy way to fix this while the fields are separated by spaces.

What we can do is change the field separator with our sed command.

File: "convert_cities.sh"

cat cities.txt | sed -r 's/[ ]+/|/' > cities1.txt

File: "cities1.txt"

1)|"Dublin"

2)|"Reno"

3)|"San Jose"

4)|"Oakland"

5)|"Las Vegas"

We have replaced the first series of spaces with the pipe character "|" .

Another way of getting around this problem is to use the quotation as the field separator character.

Exercise:

Write a command in a file that prints the cities using the quotation mark as the separator. Run the command file as:

cat cities.txt | awk -f awk1.cmd

Our awk script:

File: "cities1"

BEGIN {

FS="|"

state["Dublin"] = "California";

state["Reno"] = "Nevada"

state["San Jose"] = "California"

state["Oakland"] = "California"

state["Las Vegas"] = "Nevada"

}

{

#gsub is an awk function

gsub(/"/, "", $2)

print $2 " : " state[$2] }

In the BEGIN section we set our field separator to "|" with the command:

FS="|"

There is still a problem and that is the text file contains quotes for the cities. We can remove the quotes with the awk built in "gsub" function . We use an array in the "BEGIN" section to assign states for cities. In the action command we look up the cities with the statement "state[$2]" .

[amittal@hills arrays]$ awk -f cities1 cities1.txt

Dublin : California

Reno : Nevada

San Jose : California

Oakland : California

Las Vegas : Nevada

We do not have to declare the array or its size, and the arrays are associative, which means a subscript value can be a number or a string.
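Because simply referring to an element such as state["Boston"] creates it, standard awk provides the "in" operator to test whether a subscript exists without creating anything. A minimal sketch, with one invented entry:

```shell
out=$(awk 'BEGIN {
    state["Reno"] = "Nevada"
    if ("Reno" in state)      print "Reno found"
    if (!("Boston" in state)) print "Boston missing"
    # The tests above did not create any new elements.
    n = 0
    for (city in state) n++
    print n
}')
echo "$out"
```

The command prints "Reno found", "Boston missing" and then 1, which confirms that the array still holds only the one element that was assigned.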

We can of course use the awk arrays in the traditional sense:

File: "fib1"

BEGIN {

#holder 1 to 10 for fibonacci number

holder[1] = 1

holder[2] = 1

for( i1=3 ; i1<=10 ; i1++ )

{

holder[ i1 ] = holder[ i1 - 1 ] + holder[ i1 - 2 ]

}

for( i1=1 ; i1<=10 ; i1++ )

{

print holder[i1]

}

}

Output:

[amittal@hills arrays]$ awk -f fib1

1

1

2

3

5

8

13

21

34

55

Awk functions

We have awk built in functions that are provided to us and we can also define our own functions if we wish.

File: awk3

BEGIN {

arr[0] = "Three"

arr[1] = "One"

arr[2] = "Two"

print "Array elements before sorting:"

for (i1 in arr) {

print arr[i1]

}

asort(arr)

print "Array elements after sorting:"

for (i1 in arr) {

print arr[i1] , length( arr[i1] )

}

}

Output:

[amittal@hills functions]$ awk -f awk3

Array elements before sorting:

Three

One

Two

Array elements after sorting:

One 3

Three 5

Two 3

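We are using the "asort" function to sort and the "length" function to obtain the length of the string. Note that "asort" is provided by GNU awk (gawk) and is not available in every awk implementation. Also, the order in which "for (i1 in arr)" visits the elements is not guaranteed by awk.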

In the below example we have written a function that returns a 1 if the number passed to it in the argument is a prime number.

File: "awk1"

function isPrimeNo( num1 )

{

ind1 = 2 ;

ind2 = num1 - 1

#print ind2 ;

isPrime = 1 ;

for ( ; ind1 <= ind2 ; ind1++ )

{

if ( num1 % ind1 == 0 )

isPrime = 0 ;

}

if ( isPrime == 1 && length( num1 ) > 0 )

{

#print $0, " is a prime number."

return 1

}

return 0

}

{

if ( isPrimeNo( $0 ) == 1 )

print $0, " is a prime number."

}

File: "data.txt"

[amittal@hills functions]$ awk -f awk1 data.txt

23 is a prime number.

17 is a prime number.

7 is a prime number.
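Variables used inside an awk function are global unless they appear in the parameter list. A common convention is to list extra parameters, separated from the real ones by additional spaces, and use them as local variables. The sketch below applies this to a factorial function; the function and its input values are illustrative and not part of the examples above.

```shell
out=$(printf '5\n0\n' | awk '
# The extra parameters i and result act as local variables;
# the caller passes only n.
function factorial(n,    i, result) {
    result = 1
    for (i = 2; i <= n; i++)
        result *= i
    return result
}
{ print $1, factorial($1) }')
echo "$out"
```

For the inputs 5 and 0 the command prints "5 120" and "0 1".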

Exercise

1) File: "data.txt"

2 3

4 2

3 3

10 3

Use the file "power1.cmd" to fill in the function for power.

File: "power1.cmd"

function power( num1 , num2 )

{

# TO DO

}

{

value=power($1, $2)

print $1, $2 , value

}

$ cat data.txt | awk -f power1.cmd

2 3 8

4 2 16

3 3 27

10 3 1000

Exercise

The date command prints out the date and time.

$ date

Sat, Mar 13, 2021 7:42:55 AM

Create a folder called "exb" and in it create a folder called "source" and create some files in it. Write a script that will create a folder using the output of the date command and copy the files from the source folder to this folder. For the output above our folder will have the name:

backup_2021_Mar_13_7_42_55

Review

whoami | sed -r 's/[.]*/Thanks for sending some time with me today: /'

prints

Thanks for sending some time with me today: amittal

whereas

whoami | sed -r 's/.*/Thanks for sending some time with me today: /'

prints

Thanks for sending some time with me today:
