Linux is a powerful, complete operating system that makes programming flexible and efficient. Many programming tools are available on Linux, and the most common way to use them is through the terminal.
There are several commands that you will use often in scripting, so it is worth spending time learning the basic commands of the Linux environment.
In this section, we will discuss the basic commands that every programmer should know at some point in their life. Most of these commands are Linux-based, so you may wonder: why am I learning Linux “stuff” if this is a scripting course? The answer is simple: since much Linux programming is based on commands (i.e., shell scripts), it is important to understand basic shell scripts and terminal commands in Linux before we jump into Python programming. In this section, we will discuss the “first-aid kit” commands that you might need for the rest of your life.
If you are interested in learning more about Linux, we recommend taking a course in Linux or any UNIX flavors. This will benefit you enormously for your career in computer programming/computer science.
A script is a sequence of instructions that perform actions. Now that we know a couple of commands that perform actions through the terminal, we would like to put them together to perform several actions as a “program”. These sets of instructions are called scripts. Although a script does not need to be a large or elaborate program, scripts play an important role in the computing community these days. Typical uses include:
Gluing procedures
Transitioning commands to programs
Manipulating input and output
Preliminaries
These scripts require a few preliminary actions: creating a file, changing its attributes, and editing the file to add specific commands.
Create a file
There are different ways to create a file through the terminal. Let us simply create an empty file as follows:
> myscript01
Listing the attributes of the file we just created with ls -l myscript01 gives the following description:
-rw-rw-rw- 1 cservin cservin 0 Jun 4 16:24 myscript01
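As a minimal sketch of the step above, the following creates empty files in a scratch directory and lists them. The scratch directory (made with mktemp), the variable name workdir, and the second file name myscript02 are illustrative assumptions. The ":" in front of ">" is a do-nothing command that makes the redirection work in every POSIX shell; the bare "> myscript01" form also works in bash.

```shell
# Work in a throwaway directory so nothing in your home directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Redirecting the output of a do-nothing command creates an empty file.
: > myscript01

# touch is another common way to create an empty file
# (myscript02 is a hypothetical second file name).
touch myscript02

# The size column (fifth field of ls -l) is 0 for both files.
ls -l myscript01 myscript02
```

The permission string in the first column depends on your system's default settings (the umask), so it may differ from the listing shown in the text.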
Change File Attributes
A script is a file with executable attributes. The script acts as a program for whoever is “running” it on the system, which is why it needs executable attributes. There are several ways to provide them:
chmod +x myscript01
is an example that gives the executable attribute to EVERYBODY (user, group, and others). We can see the attributes with ls -l myscript01:
-rwxrwxrwx 1 cservin cservin 0 Jun 4 16:44 myscript01
Many Linux distributions highlight the name of a file that has the executable attribute in green or yellow.
If you would like to give the executable attribute only to certain users, you can put “u”, “g”, or “o” in front of “+x” to specify “user”, “group”, or “other”, respectively. For example, to give the executable attribute only to the user, the command looks as follows:
chmod u+x myscript01
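The symbolic form can be tried out safely on a scratch file. The sketch below is only an illustration: the scratch directory and the variable names are assumptions, and the starting permissions are set explicitly so the result does not depend on your system defaults.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
: > myscript01

# Start from a known state: read/write for the user, read-only for group and others.
chmod u=rw,g=r,o=r myscript01

# Add the executable attribute for the user only.
chmod u+x myscript01
user_only=$(ls -l myscript01 | cut -c1-10)
echo "$user_only"    # prints -rwxr--r--

# Add it for group and others as well, then remove it again from others.
chmod g+x,o+x myscript01
chmod o-x myscript01
final=$(ls -l myscript01 | cut -c1-10)
echo "$final"        # prints -rwxr-xr--
```

The cut -c1-10 part keeps only the first ten characters of the listing, i.e., the file type and the three permission triplets.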
Another way to change the mode is to use a numeric value. To demonstrate, suppose we have a file whose current mode (ignoring the leftmost character) is “rwxr-xr-x”, which equals “755”; each digit represents one set of “r w x”. To get these numbers we use powers of two: r = 2^2 = 4, w = 2^1 = 2, and x = 2^0 = 1. If the letter is present we count its value; if it is not, we count 0.
For example, “r w x” = 2^2 + 2^1 + 2^0 = 7, “r - x” = 2^2 + 0 + 2^0 = 5, and again “r - x” = 2^2 + 0 + 2^0 = 5. So we have “7 5 5”, which belong to user, group, and others, in that order. Remember that any nonzero number raised to the zero power equals 1. Once you have decided on your permissions, change them as follows:
chmod 755 myscript01
(Note: the three digits will differ depending on the permissions you want.)
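The arithmetic above can be checked with a small helper. This is a sketch under stated assumptions: the function name triplet_to_digit and the variable names are made up for this example, and the mode is applied to a scratch file rather than to a real script.

```shell
# Convert one "rwx"-style triplet into its digit: r = 4, w = 2, x = 1.
triplet_to_digit() {
    digit=0
    case "$1" in r??) digit=$((digit + 4)) ;; esac
    case "$1" in ?w?) digit=$((digit + 2)) ;; esac
    case "$1" in ??x) digit=$((digit + 1)) ;; esac
    echo "$digit"
}

# user = rwx, group = r-x, others = r-x
mode="$(triplet_to_digit rwx)$(triplet_to_digit r-x)$(triplet_to_digit r-x)"
echo "$mode"    # prints 755

# Apply the computed mode to a scratch file and read it back.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
: > myscript01
chmod "$mode" myscript01
perms=$(ls -l myscript01 | cut -c1-10)
echo "$perms"   # prints -rwxr-xr-x
```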
Edit File and Incorporate Commands
Similarly, there are several ways to edit files or scripts, and this tutorial will show a few of them. This time, let us use nano to add a command that prints hello:
nano myscript01
We can test this script by using the “echo” command discussed previously. The script will look as follows:
echo "Hello world…"
Make sure you save the changes to the file myscript01 (in nano, Ctrl+O saves and Ctrl+X exits).
Run the script
Once you have created the file and changed its attributes, you can run it as an executable file. The way to execute the script is as follows:
./myscript01
Notice that “./” must be placed in front of the name of the script, which must already have executable attributes. The “./” tells the shell to look for the script in the current directory.
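Putting the preliminary steps together, a minimal end-to-end sketch might look like this. Two things here are assumptions that go beyond the steps above: the file is written with printf instead of an editor so the example can run unattended, and a "#!/bin/sh" first line is added to name the interpreter that should run the file.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Write the script. The "#!/bin/sh" first line is an addition for this sketch:
# it names the interpreter that should run the file.
printf '%s\n' '#!/bin/sh' 'echo "Hello world..."' > myscript01

# Without the executable attribute the file cannot be run as a program.
chmod +x myscript01

# "./" tells the shell to look for the file in the current directory.
output=$(./myscript01)
echo "$output"    # prints Hello world...
```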
To concatenate a variable or variables to a text we use:
${name_of_variable}
in the place where you would like the variable to appear. Suppose we have a script that greets users after they type their name. The name is saved in a variable, and we concatenate it with the greeting text. Assuming the name is saved in a variable named “VAR1”, we do the following:
echo "Hello ${VAR1}, it's nice to see you!"
So if the name in the variable is Sam, the output will be the following:
Hello Sam, it's nice to see you!
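The concatenation can be tried directly in a script. In this sketch the value "Sam" is assigned by hand, and the variable names greeting and with_braces are illustrative assumptions.

```shell
VAR1="Sam"

# The braces mark exactly where the variable name ends.
greeting="Hello ${VAR1}, it's nice to see you!"
echo "$greeting"       # prints Hello Sam, it's nice to see you!

# Braces matter when text follows the name directly:
# ${VAR1}antha expands VAR1 and then appends "antha", whereas
# $VAR1antha would look up a different variable named VAR1antha.
with_braces="${VAR1}antha"
echo "$with_braces"    # prints Samantha
```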
We have already seen how to execute a script that carries out whatever set of instructions we wrote in it, but sometimes it is necessary to give a script information that it cannot get on its own. To do this, we feed it what are called parameters, as follows:
./script parameter
We type the command that runs the script followed by the value to pass (the parameter). Inside the script, this parameter is referred to by the token “$1”.
To demonstrate, recall the previous example on concatenation, where a variable held the name for a greeting text. Instead of using a variable to hold the value, we now pass the name “Samantha” to the script as a parameter. The script contains:
echo "Hello $1, it's nice to see you!"
To execute:
./script Samantha
will output:
Hello Samantha, it's nice to see you!
Of course, we can pass more than one parameter by writing each one after the previous one, separated by spaces, and we refer to each parameter in the script by the number of its position. So if we were to add a second parameter to our script, we would use $2 to refer to it.
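A sketch with two parameters is shown below. The script name greet, the extra sentence in the message, and the value 7 are all hypothetical; they only illustrate how $1 and $2 are filled in by position.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# greet is a hypothetical script name; $1 and $2 are its first and second parameters.
# The quoted 'EOF' keeps $1 and $2 from being expanded while the file is written.
cat > greet <<'EOF'
#!/bin/sh
echo "Hello $1, it's nice to see you! You are visitor number $2."
EOF
chmod +x greet

output=$(./greet Samantha 7)
echo "$output"    # prints Hello Samantha, it's nice to see you! You are visitor number 7.
```

A parameter that contains spaces has to be quoted when the script is called (for example "Mary Ann"); otherwise each word is treated as a separate parameter.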
Changing the file permissions to executable.
A script must have the executable attribute; therefore, we need to run chmod +x on the file before executing it.
Redirection
Challenges
Using the terminal, perform the following:
Write a script with executable attributes for the user, group, and others; with read attributes only for user and others; and with write attributes only for user and group. Using the echo command, output the message "Hello class!". [solution]
The next two sections discuss awk and grep, two well-known tools used for scripting (awk is also a small programming language). We will discuss these two tools very briefly so you can get an idea of the power of scripting.
The grep tool is very useful for searching any kind of text in a file or in multiple files. Extensive guides on working with grep can be found in books and articles, and the course Introduction to Linux covers grep in more depth. Since scripting requires a lot of text manipulation, it is important to know at least the basic ideas behind grep.
Here are some of the basic operations:
Assume that target is the string you want to find and filename is the file you want to search.
To find a target string in a file:
grep target filename
To find a target in multiple files:
grep target filename1 filename2
or use the wildcard * to search all matching files:
grep target *.txt
For a case-insensitive search, include the flag -i:
grep -i target *.txt
To count how many lines of a file contain the target:
grep target filename | wc -l
where wc -l (word count, lines only) counts the lines it receives.
You can also display a number of lines before, after, or around each line that contains the target. For example:
Before
grep -B <number> -i "target" filename
After
grep -A <number> -i "target" filename
Around
grep -C <number> -i "target" filename
Search recursively in subdirectories and count the matching lines
grep -r "target" * | wc -l
Here is the file walrus.txt
I am he as you are he as you are me
And we are all together
See how they run like pigs from a gun see how they fly
I'm crying
I am the EGGMAN, they are the EGGMEN
I am the walrus, goo goo goo joob
If you want to find a particular string, grep lets you do so as follows:
grep <string> filename
For example, if we are interested in finding the string 'are' in the file walrus.txt, we write something like:
grep "are" walrus.txt
and we obtain the following:
I am he as you are he as you are me
And we are all together
I am the EGGMAN, they are the EGGMEN
Similarly, grep can search for a string in multiple files:
grep "walrus" *.txt
The command says: go find the string walrus in ALL the files whose names end in .txt (the wildcard * matches any name).
Like many programming languages, grep is case sensitive. If you try to find the string eggman in the file walrus.txt as follows:
grep "eggman" walrus.txt
grep will not find any occurrence, since EGGMAN is written in capital letters. By adding the flag -i for a case-insensitive search:
grep -i "eggman" walrus.txt
we now get the following:
I am the EGGMAN, they are the EGGMEN
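The difference between the two searches can be reproduced with a short script. This sketch recreates walrus.txt in a scratch directory; the variable names sensitive and insensitive are illustrative assumptions.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate walrus.txt with the six lines shown above.
cat > walrus.txt <<'EOF'
I am he as you are he as you are me
And we are all together
See how they run like pigs from a gun see how they fly
I'm crying
I am the EGGMAN, they are the EGGMEN
I am the walrus, goo goo goo joob
EOF

# Case-sensitive search: no line contains lowercase "eggman",
# so grep prints nothing and returns a non-zero exit status.
if grep "eggman" walrus.txt; then
    sensitive="found"
else
    sensitive="not found"
fi
echo "$sensitive"      # prints not found

# Case-insensitive search with -i matches the uppercase text.
insensitive=$(grep -i "eggman" walrus.txt)
echo "$insensitive"    # prints I am the EGGMAN, they are the EGGMEN
```

Because grep reports through its exit status whether anything matched, it can be used directly as the condition of an if statement in a script.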
We can also count the number of lines in a file that contain a certain string by using the flag -c in the following command:
grep -c <string> filename
For example, to count the lines containing the string "I am" in walrus.txt, we can use the following:
grep -c -i "I am" walrus.txt
3
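Note that -c counts matching lines, not individual occurrences. The sketch below contrasts the two using the string "are", which appears twice on the first line of walrus.txt. It relies on the -o flag (print each match on its own line), which is available in GNU and BSD grep; the variable names are illustrative.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > walrus.txt <<'EOF'
I am he as you are he as you are me
And we are all together
See how they run like pigs from a gun see how they fly
I'm crying
I am the EGGMAN, they are the EGGMEN
I am the walrus, goo goo goo joob
EOF

# -c counts the LINES that contain the string.
matching_lines=$(grep -c "are" walrus.txt)
echo "$matching_lines"    # prints 3

# -o prints every match on its own line, so wc -l counts OCCURRENCES.
# tr removes the padding that some versions of wc add.
occurrences=$(grep -o "are" walrus.txt | wc -l | tr -d ' ')
echo "$occurrences"       # prints 4
```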
If you also want to display the line number where the pattern was found, include the flag -n as follows:
grep -n -i "I am" walrus.txt
1:I am he as you are he as you are me
5:I am the EGGMAN, they are the EGGMEN
6:I am the walrus, goo goo goo joob
grep can also display a <number> of lines before, after, or around each line that contains a specific target. For example:
Before: use the flag -B
grep -B <number> -i "target" filename
After: use the flag -A
grep -A <number> -i "target" filename
Around: use the flag -C
grep -C <number> -i "target" filename
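Here is a small sketch of the three context flags applied to walrus.txt, using "crying" as the target and 1 as the number of lines. The variable names before, after, and around are illustrative assumptions.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > walrus.txt <<'EOF'
I am he as you are he as you are me
And we are all together
See how they run like pigs from a gun see how they fly
I'm crying
I am the EGGMAN, they are the EGGMEN
I am the walrus, goo goo goo joob
EOF

# One line before the match, plus the match itself (lines 3 and 4).
before=$(grep -B 1 "crying" walrus.txt)

# The match itself, plus one line after it (lines 4 and 5).
after=$(grep -A 1 "crying" walrus.txt)

# One line on each side of the match (lines 3, 4, and 5).
around=$(grep -C 1 "crying" walrus.txt)

printf '%s\n' "$around"
```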
A recursive search descends into sub-directories: with the flag -r, grep looks for the target in every file of the given directories and of all the directories below them. E.g.,
grep -r "target" * | wc -l
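To see the recursive search at work, the sketch below builds a tiny directory tree and counts the matching lines. All directory names, file names, and file contents are made up for this example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A small tree: one file at the top and three more in nested directories.
mkdir -p notes/old
echo "target in the top directory"    > a.txt
echo "nothing to see here"            > notes/b.txt
echo "another target, one level down" > notes/c.txt
echo "target again, two levels down"  > notes/old/d.txt

# -r descends into every subdirectory; each output line is "file:matching line".
grep -r "target" .

# Count the matching lines across the whole tree.
total=$(grep -r "target" . | wc -l | tr -d ' ')
echo "$total"    # prints 3
```

Here "." (the current directory) is used as the starting point instead of *, so hidden files and directories are searched as well.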
The awk programming language is used in scripting because it was designed for text processing and is often used as a data extraction tool. In this section we will demonstrate several uses of awk.
Let us use the file data.txt as an example. The file data.txt contains the following information:
name id major
name1 981 computer science
name2 789 psychology
name3 098 bioinformatics
name4 456 computer programming
name5 765 security and networks
If you are interested in printing the contents of the file, you can run the following:
awk '{print;}' data.txt
and this will display the file content. This is very similar to using the “more” command, as we showed in previous sections.
If you are interested in searching for a specific string in the file, you can specify the string as follows:
awk '/computer/' data.txt
In this case, we are interested in finding the line(s) containing the string "computer" in the file data.txt. After executing the command, we obtain the following:
name1 981 computer science
name4 456 computer programming
awk provides variables that hold the values of each line according to its fields, i.e., the columns of the file. awk splits each line on white space (i.e., blank spaces and tabs), and each delimited value is associated with a variable: $1 for the first field, $2 for the second, and so on. For example, if we are interested in printing only the first field and the third field of the file, we can execute the following statement:
$ awk '{print $1,$3;}' data.txt
The script says the following: go to field number 1 and field number 3 of each line in the file data.txt and print them. The comma (,) between $1 and $3 inserts a space between the two fields in the output; without it, the two fields are printed stuck together.
We get the following output:
name major
name1 computer
name2 psychology
name3 bioinformatics
name4 computer
name5 security
awk associates $1 with the first column and $3 with the third column of each line. Because awk splits on any run of blanks or tabs by default, a multi-word major such as "computer science" is split into several fields, which is why only its first word appears above.
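If the columns of a file are separated by tabs and you want to keep multi-word values together, awk's -F option can set the tab as the only field separator. The sketch below assumes a tab-separated copy of part of the data; the file name data.tsv and the variable names are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A tab-separated copy of part of data.txt (the file name data.tsv is hypothetical).
printf 'name\tid\tmajor\n'                    >  data.tsv
printf 'name1\t981\tcomputer science\n'       >> data.tsv
printf 'name5\t765\tsecurity and networks\n'  >> data.tsv

# Default splitting: any blank or tab separates fields, so $3 is a single word.
default_split=$(awk '{print $1, $3}' data.tsv)

# -F '\t' makes the tab the only separator, so $3 is the whole major.
tab_split=$(awk -F '\t' '{print $1, $3}' data.tsv)

printf '%s\n' "$default_split"
printf '%s\n' "$tab_split"
```

With the default splitting the last line printed is "name5 security"; with the tab separator it is "name5 security and networks".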
Comment: awk also provides the variable NF, which holds the number of fields in the current line (similar to a "length" for the fields). Consequently, $NF refers to the last field of the line, which is handy when you do not know how many columns there are. For instance, in our example, if we execute:
$ awk '{print $NF;}' data.txt
we get:
major
science
psychology
bioinformatics
programming
networks
since the last word of each major is the last field of its line, i.e., $NF.
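Printing NF next to $NF makes the relationship between the two visible. This sketch recreates data.txt in a scratch directory; the variable name fields is an illustrative assumption.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > data.txt <<'EOF'
name id major
name1 981 computer science
name2 789 psychology
name3 098 bioinformatics
name4 456 computer programming
name5 765 security and networks
EOF

# NF is the number of fields on the line; $NF is the value of the last field.
fields=$(awk '{print NF, $NF}' data.txt)
printf '%s\n' "$fields"
```

The header line has 3 fields, while "name1 981 computer science" has 4 and "name5 765 security and networks" has 5, so NF changes from line to line but $NF always picks the last word.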
awk supports relational operators, which let us extract the data that satisfies certain conditions. For example, if we want only the records whose id is greater than 750, we can execute the following:
awk '$2 > 750' data.txt
The script says the following: go to field number 2 of each line in data.txt (i.e., the id field), check whether its value is greater than 750, and print every line for which this holds.
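These are the results that awk will retrieve (the header line is printed as well because its second field, "id", is not a number, so awk compares it with 750 as a string):
name id major
name1 981 computer science
name2 789 psychology
name5 765 security and networks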
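If the header line is not wanted in the result, one common approach is to skip the first line with awk's built-in NR variable (the number of the current line). The sketch below is an illustration only; the variable name rows is an assumption.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > data.txt <<'EOF'
name id major
name1 981 computer science
name2 789 psychology
name3 098 bioinformatics
name4 456 computer programming
name5 765 security and networks
EOF

# NR > 1 skips the header; $2 > 750 keeps only the ids above 750.
rows=$(awk 'NR > 1 && $2 > 750' data.txt)
printf '%s\n' "$rows"
```

This prints the records for name1 (981), name2 (789), and name5 (765); the ids 098 and 456 are below the threshold.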
We can create more complex scripts in awk, for example to keep track of the number of times an element or pattern occurs. Suppose we want to know how many times the string "computer" appears in the third field, i.e., the first word of the major. We can write a script as follows:
awk 'BEGIN { count=0; }
$3 ~ /computer/ { count++; }
END { print "Number of computer-related majors=",count; }' data.txt
The script says the following:
Before reading any input (BEGIN), create a counter and set it to 0.
For every line, look at field number 3 and check whether the pattern "computer" appears in it. If it does, increment the counter (i.e., count++).
After the last line has been processed (that is what END means), print "Number of computer-related majors=" followed by the value of count (which hopefully we incremented a few times by this point).
Oh, by the way, do not forget the file in which you are searching for the data, i.e., data.txt.
After executing this script, we obtain the following:
Number of computer-related majors= 2
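The order in which the three parts of an awk program run can be observed with an even smaller sketch: the BEGIN block runs once before any input, the middle rule runs once per line, and the END block runs once after the last line. The messages and the variable names lines and report are made up for this example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > data.txt <<'EOF'
name id major
name1 981 computer science
name2 789 psychology
name3 098 bioinformatics
name4 456 computer programming
name5 765 security and networks
EOF

# BEGIN: before any line is read. Middle rule: once for every line.
# END: after the last line has been read.
report=$(awk 'BEGIN { print "starting" }
              { lines++ }
              END { print "lines read:", lines }' data.txt)
printf '%s\n' "$report"
```

The output is "starting" followed by "lines read: 6", since data.txt has six lines including the header.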
We could go further with awk and grep; however, the purpose of this course is to learn Python as the scripting language. We wanted to give you a gentle introduction to scripting, or at least to the way scripting has been done for the past decades.
By the way, everything that can be done in awk and grep can also be done in Python!