Unix/Linux Origins
Unix is an operating system that was developed at Bell Labs in the early 1970s. The source code was made available to other organizations, and from this code base many other systems such as AIX, HP-UX and Solaris evolved. The Berkeley Software Distribution (BSD) was also based on the Bell Labs version and was the base for FreeBSD, NetBSD, OpenBSD and Mac OS X. Mac OS X has its own GUI, but the underlying operating system is Unix based.
An operating system can be certified by the Open Group (which owns the Unix trademark) in order to be officially called Unix. However, there are many operating systems that are Unix-like in terms of their tools and the way we interact with them (commands and shell). An example of a Unix-like system is Linux. Both Unix and Linux have the same parts, such as the kernel and the shell. The commands that the shell takes are largely the same on both systems.
A Unix operating system can be pictured as:
The shell is a program through which a user can type commands; the shell then executes those commands. The kernel contains the core components of the operating system, such as process management, the file system, device drivers and the code that interacts with the hardware. The shell takes your command and then invokes APIs (system calls) to interact with the kernel. There are different kinds of shells, such as the Bourne shell, the C shell and the Korn shell. In addition to running commands, the shell supports a scripting language, namely the shell scripting language. We will be working with the Bourne shell.
Since there are many shells and also many variants of the operating system, how do we know that a script written on one system will work on another? This is where POSIX (Portable Operating System Interface) comes in. As long as we adhere to the POSIX standard, we can be reasonably confident that our scripts will work on another POSIX-compliant shell and on another Unix variant. The POSIX specification is at:
https://pubs.opengroup.org/onlinepubs/9699919799/
The specification is pretty thorough and extensive but we can refer to it to make sure that we are sticking to the standard.
The shell provides a scripting language for the Unix system. We can execute commands from the command line, and we can also write the commands in a file and ask the shell to run the file for us. A Unix system also comes with utilities such as "vi" and "sed" that we can call from the shell. "vi" is an editor, while "sed" is a utility that finds patterns in text and replaces them.
Why do we need a scripting language when we have a high level language like Java or C? Suppose we have 2 files "foo1.txt" and "foo2.txt" and we want to copy the contents of both files into a single file "foo3.txt". With a high level language we have to obtain a file handle to "foo1.txt" and a file handle to "foo2.txt", read lines from the files and write them to a new file "foo3.txt". The program will be around 20-30 lines long. Compare that with the single line shell command:
cat foo1.txt foo2.txt > foo3.txt
If we have a mosquito in a room, do we throw a grenade into the room to kill the mosquito, or do we get some spray and spray the room? We need to use the right tool for the right job.
An operating system allows us to use the resources that are provided by the computer. If we have a car that has an engine, a gear mechanism and brakes on the wheels, then we can think of the operating system as the steering wheel, brake pedals and gear stick. Imagine how difficult it would be to drive a car if we had to get under the hood and turn the wheels with pliers every time we wanted to turn. The operating system runs your programs and figures out where to load them into RAM. It decides how much CPU time to give to each process. It has a component that implements the TCP/IP protocols and is responsible for network communication. It is responsible for storing your files and folders on the hard drive. Let's see what it would be like if the operating system did not have a component (the file system) to help us with our files.
The above is a picture depicting the internals of a hard drive. It has tracks, which are concentric circles, and sectors, which are represented by the purple section. The hard drive manufacturer may give us some software (a driver) that lets us store data at a given track and sector. Now suppose we are going to create a file. We need to store the file name somewhere. We also need to store information related to the file, such as its track, sector and size. Without a file system we have to do all that work ourselves. A file system is a must if we are to use the hard drive in any meaningful way.
Discuss Flash drive. Show file system on Windows.
Those of you who have been using Windows might be familiar with the notation "C:" and "D:" to denote different drives or partitions. Unix works differently. It starts with the root, which is at "/", and a sub folder is written after it, for example "/home/". Similarly, to denote that "home" has a sub folder we can have "/home/a4m13w7/". Accessing the CD-ROM might use the notation "/cdrom", depending on how it was mounted.
A Unix system will usually provide the "vi" editor. This is useful if you are working from a terminal and need to edit files. On Windows you can edit files with an editor such as "Notepad", "Notepad++" or "Textpad".
Make sure that you are saving the files in plain text mode. If you installed "cygwin" on the "C:" drive then the cygwin home path will look like:
C:\cygwin64\home\amittal
On Mac and Linux based systems the home path may be at:
/users/amittal
The exact location varies (commonly "/Users/amittal" on Mac and "/home/amittal" on Linux). The name "amittal" will of course be different on your system (it will be your username for that system).
If you open a new Terminal window on Mac or Unix it will usually take you to your home folder and you can view the complete path with the command "pwd" .
Some commonly used "Unix" commands are shown below. Many commands have options that can be specified using the "-" character .
man
The "man" command shows the manual page for a command. It is used as:
man "nameofcommand"
For example,
man vi
will show the manual for the vi editor. Use the spacebar to scroll down and q to quit.
touch
The "touch" command can be used to update the modification timestamp of a file. It can also be used to create a new, empty file.
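Both uses of "touch" can be tried in a scratch folder. The following is a minimal sketch; the folder comes from "mktemp -d" and the file name "notes.txt" is a hypothetical name chosen for illustration.

```shell
# Work in a throwaway folder so no real files are affected.
work=$(mktemp -d)
cd "$work"

touch notes.txt               # the file does not exist yet, so an empty one is created
size=$(wc -c < notes.txt)     # byte count of the new file
echo "size after first touch: $size"

echo "some text" > notes.txt
touch notes.txt               # the file exists now: only its timestamp is updated
cat notes.txt                 # the contents are left untouched
```

The second "touch" leaves "some text" in place, which is why it is safe to run on existing files.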
whereis
Locates source, binary and manuals for the program listed.
which
Finds the binary executable associated with the name
rm
Removes a file
rm filename
Recursively removes a folder and all its sub files and sub folders
rm -r folder
The "-f" option (force) does not prompt the user before deleting the files.
cd
Changes directory.
cd directoryName
The "cd .." command takes us to a directory one level up.
The "cd ~" command changes to the home folder.
"cd" by itself will also take us to the home folder.
The "cd -" command takes us to the previous folder.
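The "cd .." and "cd -" forms can be tried out as below. This is only a sketch: the folders "a" and "b" and the variable names are hypothetical and are created in a scratch location.

```shell
# Build a small folder tree in a throwaway location.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/a/b"

cd "$demo_root/a/b"
cd ..                      # one level up: we are now in .../a
after_up=$(pwd)

cd "$demo_root/a/b"
cd "$demo_root"            # go somewhere else ...
cd - > /dev/null           # ... and jump back to the previous folder (.../a/b)
after_dash=$(pwd)

echo "$after_up"
echo "$after_dash"
```

"cd -" prints the folder it switches to, so the sketch sends that message to "/dev/null" to keep the output short.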
mkdir
Creates a folder.
mkdir directoryName
mv
Moves a file from one place to another. This command can also be used to rename files.
mv 1.txt 2.txt
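The two uses of "mv" (rename and move) can be sketched as follows. The folder name "archive" is a hypothetical name used only for this illustration.

```shell
# Work in a throwaway folder.
work=$(mktemp -d)
cd "$work"

touch 1.txt
mv 1.txt 2.txt        # same folder, new name: a rename
mkdir archive
mv 2.txt archive      # destination is a folder: the file moves into it, keeping its name
ls archive
```

After the second "mv" neither "1.txt" nor "2.txt" remains in the starting folder.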
pwd
Shows us the folder we are in.
$ pwd
/home/user/x412/intro
cp
The command "cp" copies a file. Its syntax is:
cp sourcefile destination
The destination can be another file name, in which case the sourcefile gets copied to a file with that name, or the destination can be a folder, in which case the sourcefile gets copied into the destination folder with the same name as sourcefile.
cp can also copy multiple files to a folder.
cp a1.txt b1.txt c1.txt destinationFolder
cp can copy all the contents of a folder to another folder.
cp -r folder1 folder2
This will create a "folder2" with all the contents from "folder1". If "folder2" already exists, "folder1" is copied inside it instead.
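The result of "cp -r" depends on whether the destination folder already exists. A small sketch, run in a scratch folder with the hypothetical file "a1.txt", shows both cases.

```shell
# Work in a throwaway folder.
work=$(mktemp -d)
cd "$work"
mkdir folder1
touch folder1/a1.txt

cp -r folder1 folder2      # folder2 does not exist yet: it becomes a copy of folder1
ls folder2                 # lists a1.txt

cp -r folder1 folder2      # folder2 exists now: folder1 is copied inside it
ls folder2                 # lists a1.txt and folder1
```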
cat
The "cat" command can be used to print the contents of a file. It can also print contents of multiple files.
ls
The command "ls" will list the files and folders in our current directory.
Ex: ls
Ex: ls -l
ls with the "-l" option produces a detailed list of files.
ls -ltr
The "l" option prints the long format. The "t" option sorts the files by modification time, and the "r" option reverses the order.
Together, the "-ltr" options place the most recent files at the bottom.
tar
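The "tar" command can be used to pack files into an archive and extract them again; with the "z" option the archive is also compressed with gzip. To compress the folder "mymath" to a file called "mymath.gz" (such files are conventionally given a ".tar.gz" extension):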
tar -czvf mymath.gz mymath
To extract the compressed file use:
tar -xvf mymath.gz
This will create a folder "mymath" in the current directory and extract all the files there.
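A full round trip with "tar" might look like the sketch below. The file "sums.txt", the folder "restore" and the archive name "mymath.tar.gz" are hypothetical names for this illustration; the "-C" option, which picks the folder to extract into, is assumed to be available (it is in the common GNU and BSD versions of tar).

```shell
# Work in a throwaway folder.
work=$(mktemp -d)
cd "$work"
mkdir mymath
echo "2 + 2 = 4" > mymath/sums.txt

tar -czf mymath.tar.gz mymath        # c = create, z = gzip, f = name of the archive file
tar -tzf mymath.tar.gz               # t = list the contents without extracting

mkdir restore
tar -xzf mymath.tar.gz -C restore    # x = extract, -C = extract into this folder
cat restore/mymath/sums.txt
```

Listing with "t" before extracting is a cheap way to check what an archive will create.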
chmod
The "chmod" command is used to change the permissions of a file. A file on a Unix system has read, write and execute permissions. There are 3 classes of users to which these permissions can be applied: the user (the owner of the file, usually whoever created it), the group that the file belongs to, and others, which refers to everyone else on the system.
Example:
$ ls -l
total 0
-rw-r--r-- 1 a4m13w7 Domain Users 0 Jun 10 09:50 new.txt
drwxr-xr-x+ 1 a4m13w7 Domain Users 0 Jun 9 18:06 sub1
drwxr-xr-x+ 1 a4m13w7 Domain Users 0 Jun 9 18:07 sub2
drwxr-xr-x+ 1 a4m13w7 Domain Users 0 Jun 9 21:58 zipped
The first character is "d" if the file is a directory and "-" if it is a regular file. Then we have "rwx" to denote the read, write and execute permissions. There are 3 sets of these permissions, mapping to user, group and others.
We can change permissions. There are 2 ways to do this. One is by symbolic mode.
chmod ug+x new.txt
$ ls -l
total 0
-rwxr-xr-- 1 a4m13w7 Domain Users 0 Jun 10 09:50 new.txt
The "ug" means user and group and the "+x" means add execute permission. The command adds the execute permission for user and group. The letter "o" means others. We have "r", "w" and "x" for the read, write and execute permissions.
The other way is numeric mode, which uses binary notation. It is necessary to understand binary arithmetic.
Let's assume we have the file "new.txt"
-rw-rw-r-- 1 a4m13w7 Domain Users 0 Jun 10 21:37 new.txt
To add execute permissions to the user and group our permissions need to be of the form:
rwxrwxr--
We substitute the letters by 1 and the entries without any letters by 0. This gives us:
111 111 100
Converting each group of three bits to a single digit (111 is 7, 100 is 4) gives us 774.
chmod 774 new.txt
We can verify the new permissions.
-rwxrwxr-- 1 a4m13w7 Domain Users 0 Jun 10 22:43 new.txt
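The numeric and symbolic modes can be compared side by side. The sketch below assumes a scratch file named "new.txt" and uses "cut" to keep only the first 10 characters of the "ls -l" output, which hold the file type and the permissions.

```shell
# Work in a throwaway folder.
work=$(mktemp -d)
cd "$work"
touch new.txt

chmod 664 new.txt                         # rw-rw-r--  is  110 110 100  is  6 6 4
before=$(ls -l new.txt | cut -c1-10)

chmod 774 new.txt                         # rwxrwxr--  is  111 111 100  is  7 7 4
numeric=$(ls -l new.txt | cut -c1-10)

chmod 664 new.txt                         # reset, then reach the same result symbolically
chmod ug+x new.txt
symbolic=$(ls -l new.txt | cut -c1-10)

echo "$before $numeric $symbolic"
```

Numeric mode sets all nine permission bits at once, while symbolic mode changes only the bits that are named.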
Read, Write and Execute permissions on a folder
These permissions have the following meanings when applied to a folder.
Read
We can list the files in the folder. Without read permission the listing fails, even if we "cd" into the folder first. Below, "chmod 333" removes the read permission for everyone.
$ ls -l
drwxr-xr-x+ 1 user None 0 Jan 16 22:58 temp1
$ chmod 333 temp1
$ ls temp1
ls: cannot open directory 'temp1': Permission denied
$ cd temp1
$ ls
ls: cannot open directory '.': Permission denied
Write
We can add or delete files in the folder.
$ chmod 555 temp1
$ cd temp1
$ ls
1.txt
$ touch 2.txt
touch: cannot touch '2.txt': Permission denied
Execute
We can enter the folder with "cd" and access the files in it by name. On a folder this is also known as search permission.
date
The "date" command prints the current date and time.
$ date
Wed, Jun 10, 2020 1:20:04 AM
wc
The "wc" command counts the lines, words and characters of its input.
$ wc data.txt
3 4 24 data.txt
This indicates that there are 3 lines, 4 words and 24 characters in the file.
The file "data.txt" contains the following text, followed by one empty line:
First Line
Second Line
echo
The "echo" command is one of the most commonly used commands in the shell language. It prints to the console.
$ echo "Some text"
Some text
who
The "who" command lists the users logged in to the system. The command "who am i" (the "am" and "i" are arguments) lists the current user.
set
The "set" command can be used to manipulate shell variables and options. Without any arguments, it shows all the variables of the current shell, including the environment variables.
grep
The "grep" command searches for a word/pattern in text and if it finds it in the line it prints the line else does not print anything.
$ grep "tall"
It's a tall tree.
It's a tall tree.
It's a bad apple.
If we use "grep" without a file then it waits for us to type something. We type in the line "It's a tall tree." and it gets printed out because it contains the word "tall" . The next line "It's a bad apple." does not get printed out because it does not contain the word "tall" .
The "grep" utility has options like "-v", which prints the lines that do not contain the word, and "-i", which ignores case.
We can grep for multiple words by separating them with the pipe symbol inside a quoted string.
$ grep 'red\|juicy'
The apple is red.
The apple is red.
The apple is juicy.
The apple is juicy.
The lemon is sour.
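We need to escape the pipe character. By default grep uses basic regular expressions, in which a plain "|" is an ordinary character; GNU grep treats the escaped form "\|" as "or". The unescaped "|" belongs to the extended regular expression syntax, which grep accepts with the "-E" option. We shall study this in detail in the section on regular expressions.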
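The grep options mentioned above can be tried without typing the input by hand. In this sketch the sample lines are fed to grep with "printf" and a pipe, and the variable names are hypothetical; "-E" switches grep to extended regular expressions so the "|" needs no backslash.

```shell
# Three sample lines, stored in a variable for reuse.
fruit_lines='The apple is red.
The apple is juicy.
The lemon is sour.'

either=$(printf '%s\n' "$fruit_lines" | grep -E 'red|juicy')   # lines with "red" or "juicy"
not_apple=$(printf '%s\n' "$fruit_lines" | grep -v 'apple')    # lines without "apple"
any_case=$(printf '%s\n' "$fruit_lines" | grep -i 'LEMON')     # case is ignored

echo "$either"
echo "$not_apple"
echo "$any_case"
```

The "-E" form is the more portable way to search for alternatives, since "\|" in a basic regular expression is an extension that not every grep supports.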
Ex1:
1) Create a folder called "temp1" .
2) Cd to this folder and create a folder called "sub2" in this folder.
3) Create a file called "file2.txt" in the folder "sub2".
4) Execute the command "ls -R" .
5) You should have the folder structure as below
temp1 -> sub2 -> file2.txt
Ex 2
We have the file structure from the Exercise 1 as:
temp1 -> sub2 -> file2.txt
Navigate to the folder "temp1" and execute the command:
tar -cvf sub2.tar sub2
Now create a folder called "zipped" under temp1 .
mkdir zipped
Move the sub2.tar file to the zipped folder using the command:
mv sub2.tar zipped
Change folder to the zipped folder.
cd zipped
Extract the archive using the command:
tar -xvf sub2.tar
Do a "ls" to make sure that we have a new folder called "sub2" and it has a file by doing:
ls sub2
Ex3:
Cd to the "temp1" folder created in exercise 1 . Create a new file called "1.txt" by doing:
touch 1.txt
Now do a "ls -l" to see the permissions.
Using the symbolic method add write permissions for user, group and others. Using the binary method remove write permissions for the "others". Your final file will look like:
-rw-rw-r-- 1 a4m13w7 Domain Users 0 Jun 10 21:37 1.txt
Ex4:
Create a file named "data.txt" containing the following line:
"This is a sentence."
Use the "man" command to look up the options of "wc" and write 1 line commands to output only the number of characters, only the number of words and only the number of lines in the file "data.txt".
Solution 4:
$ wc -c data.txt
20 data.txt
$ wc -w data.txt
4 data.txt
$ wc -l data.txt
1 data.txt
There are many different kinds of shells available, such as bash, tcsh, csh and ksh. They differ in things like automatic file name completion, interactive features and syntax. The "csh" shell offers a syntax similar to the "C" language.
Usually the output of a command is shown at the console such as :
[amittal@hills ~]$ date
Sun Oct 28 12:02:07 PDT 2018
However we can redirect the output to go to a file instead.
[amittal@hills ~]$ date > t1.txt
[amittal@hills ~]$ cat t1.txt
Sun Oct 28 12:06:07 PDT 2018
Now the output of date is sent to the file "t1.txt" instead. We can also use the ">>" symbol to append to a file instead of overwriting the file.
$ date >> data.txt
$ date >> data.txt
$ cat data.txt
Sat, Jan 23, 2021 6:31:36 PM
Sat, Jan 23, 2021 6:31:37 PM
Some Unix commands can take input from the console.
[amittal@hills ~]$ grep Oct
The month is October
The month is October
The grep command looks for a string and waits for us to type in a line. If the line contains the string then the line is repeated; otherwise it is not. We can redirect the "grep" command to take its input from a file instead.
[amittal@hills ~]$ grep Oct < t1.txt
Sun Oct 28 12:13:37 PDT 2018
Let's say we have 2 commands and we want the output of the first command to go as the input to the second command. We could do this with a temporary file.
[amittal@hills ~]$ date > t1.txt
[amittal@hills ~]$ grep Oct < t1.txt
Sun Oct 28 12:17:11 PDT 2018
However, Unix gives us an easier way. Using a pipe we can say that the output of the "date" command should be used as the input of the "grep" command.
[amittal@hills ~]$ date | grep Oct
Sun Oct 28 12:18:59 PDT 2018
[amittal@hills ~]$
We can keep repeating the "|" as many times as we want in a single command.
[amittal@hills ~]$ date | grep Oct | wc -l
1
The "wc -l" counts the number of lines in the input.
We can also combine the redirect with pipe.
date | grep -v Oct > data2.txt
Assuming we are not in the month of October this will create the file "data2.txt" with the following sample contents.
$ cat data2.txt
Sun, Jun 21, 2020 4:36:36 PM
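The operators ">", ">>", "<" and "|" from this section can be combined in one short sketch. It uses "echo" instead of "date" so that the result does not depend on the day it is run; the file name "log.txt" is a hypothetical name.

```shell
# Work in a throwaway folder.
work=$(mktemp -d)
cd "$work"

echo "first run"  > log.txt      # ">" creates the file, or overwrites it if it exists
echo "second run" >> log.txt     # ">>" appends to the file
echo "third run"  >> log.txt

# "<" feeds the file to grep as input; each "|" hands the output to the next command.
matches=$(grep "run" < log.txt | grep -v "second" | wc -l)
echo $matches

echo "start over" > log.txt      # overwriting discards the three earlier lines
cat log.txt
```

Two of the three lines survive the "grep -v", so the count is 2, and the final "cat" shows only "start over".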
Variables in Shell Script do not have types associated with them. We can assign a value to a variable :
var1=John
To print the value of the variable we can use the "$" sign.
echo $var1
There should not be spaces around the equals sign. The following is invalid:
var1 = John
The shell assumes that "var1" is a command and states that it cannot find it.
-bash: var1: command not found
If we want to assign a string with spaces then we must enclose them in quotes.
var1=John Wayne
-bash: Wayne: command not found
var1="John Wayne"
echo $var1
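Quotes matter when a value is used as well as when it is assigned. The following sketch, with the hypothetical variable "var2", shows what happens to repeated spaces when the variable is printed without quotes.

```shell
var1="John Wayne"           # the quotes keep the space inside the value
echo "$var1"

var2='John    Wayne'        # four spaces between the two words
unquoted=$(echo $var2)      # no quotes: the shell splits the value into words, extra spaces are lost
quoted=$(echo "$var2")      # quotes: the value is passed through unchanged

echo "$unquoted"
echo "$quoted"
```

Writing "$var1" in double quotes is a reasonable default whenever the value may contain spaces.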
Valid variable assignments:
MY_MESSAGE="Hello World"
MY_SHORT_MESSAGE=hi
MY_NUMBER=1
MY_PI=3.142
MY_OTHER_PI="3.142"
MY_MIXED=123abc
A variable name can be composed of letters, digits or the underscore character. The name must start with a letter or an underscore.
A number can be assigned to a variable. If a command is expecting a number and we supply a string then it will report an error.
var1="text"
expr $var1 + 2
expr: non-integer argument
var1=3
expr $var1 + 2
5
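Besides "expr", POSIX shells also offer the built-in arithmetic expansion "$(( ))". The sketch below shows both, and checks the exit status of "expr" instead of reading its error message; the variable names are hypothetical.

```shell
var1=3
sum_expr=$(expr "$var1" + 2)     # expr is a separate program; the spaces around + are required
sum_shell=$((var1 + 2))          # arithmetic expansion, done by the shell itself
echo "$sum_expr $sum_shell"

var1="text"
# expr exits with a non-zero status when an argument is not an integer.
if expr "$var1" + 2 > /dev/null 2>&1; then
    status="ok"
else
    status="not a number"
fi
echo "$status"
```

Testing the exit status in an "if" lets a script react to bad input without parsing error text.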
Variables do not have to be declared in Shell language. If we try to use an undeclared variable then the value is blank .
Prompt: echo $var123
Prompt:
Using the export keyword.
Let's say we have a script named var1.sh:
#!/bin/sh
echo "MYVAR is: $MYVAR"
MYVAR="hello there"
echo "MYVAR is: $MYVAR"
Output:
$ ./var1.sh
MYVAR is:
MYVAR is: hello there
Since the value of "MYVAR" is not initialized it is blank and then we set a value of "hello there" to it and that gets printed out by the second "echo" statement.
[amittal@hills variables]$ MYVAR=hello
[amittal@hills variables]$ ./var1.sh
MYVAR is:
MYVAR is: hello there
Even though we set "MYVAR" in the main shell, the first line is still blank. Running "var1.sh" spawns another process, which is the subshell, and it does not see the "MYVAR" variable. In order for the subshell to see the variable we have to export it.
[amittal@hills variables]$ MYVAR=hello
[amittal@hills variables]$ export MYVAR
[amittal@hills variables]$ ./var1.sh
MYVAR is: hello
MYVAR is: hello there
[amittal@hills variables]$ echo $MYVAR
hello
[amittal@hills variables]$
The script changed the variable "MYVAR" to "hello there", but after running the script we see that the value of MYVAR in the main shell is still "hello". What if we wanted the script to change the value so that we can see it in the main shell? To do that we must run the script in the main shell itself, using the "." command.
[amittal@hills variables]$ . ./var1.sh
MYVAR is: hello
MYVAR is: hello there
[amittal@hills variables]$
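The effect of "export" can also be seen without a script file, by starting a child shell with "sh -c". This is a sketch under that assumption; the variables "before" and "after" are hypothetical names that capture what the child prints.

```shell
unset MYVAR                                   # start from a clean state
MYVAR=hello                                   # plain shell variable, not exported
before=$(sh -c 'echo "child sees: $MYVAR"')   # the child process gets no MYVAR

export MYVAR                                  # now part of the environment
after=$(sh -c 'echo "child sees: $MYVAR"')

# The child only receives a copy; changing it there does not affect the parent.
sh -c 'MYVAR="hello there"'

echo "$before"
echo "$after"
echo "parent still has: $MYVAR"
```

The single quotes around the "sh -c" argument stop the parent shell from expanding "$MYVAR" itself, so the child does the lookup.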
Assigning the output of a command to a variable.
We can use the backquote to execute a command and assign the output to a variable.
$ VAR1=`echo "Testing"`
$ echo $VAR1
Testing
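POSIX shells also accept the form "$( )" for the same purpose. It behaves like the backquotes but is easier to nest, as this small sketch with the hypothetical variable "upper" suggests.

```shell
VAR1=$(echo "Testing")          # same effect as the backquote form
echo "$VAR1"

# One substitution nested inside another, with no extra escaping needed:
# the inner echo produces the text, tr converts it to upper case.
upper=$(echo "$(echo "Testing")" | tr '[:lower:]' '[:upper:]')
echo "$upper"
```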
There are different ways we can run a shell script. Writing our first shell script program.
Create a file called "first.sh" and enter the contents as:
echo "First Script"
Make it executable by changing its permissions.
chmod +x first.sh
And now we can run it :
$ ./first.sh
First Script
Why do we need the "./" ? Let's see what happens if we don't put the "./" in front of "first.sh".
$ first.sh
-bash: first.sh: command not found
Unix can't find our executable shell script. The reason is that the environment variable "PATH" does not have our folder listed. The shell goes through each folder in the PATH variable and tries to find our script. Since it doesn't find it there, we get a "command not found" error.
[amittal@hills cs160b]$ echo $PATH
/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/users/amittal/.local/bin:/users/amittal/bin
Our current path is:
[amittal@hills cs160b]$ pwd
/users/amittal/cs160b
Now we are going to modify the "PATH" variable and add our own folder.
[amittal@hills cs160b]$ PATH=$PATH:/users/amittal/cs160b
Now we can run our script without putting the "./"
[amittal@hills cs160b]$ first.sh
First Script
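The PATH lookup can be observed with "command -v", which reports where the shell would find a command. The sketch below assumes a scratch folder from "mktemp -d" that allows running scripts, and that no other "first.sh" is already on the PATH.

```shell
# Put a copy of the script into a throwaway folder.
bin_dir=$(mktemp -d)
printf '%s\n' '#!/bin/sh' 'echo "First Script"' > "$bin_dir/first.sh"
chmod +x "$bin_dir/first.sh"

# Before: the folder is not on PATH, so the lookup fails.
if command -v first.sh > /dev/null 2>&1; then found_before=yes; else found_before=no; fi

PATH=$PATH:$bin_dir             # append our folder; this lasts only for the current shell session
found_at=$(command -v first.sh) # now the lookup reports the full path
output=$(first.sh)

echo "$found_before"
echo "$found_at"
echo "$output"
```

Because the change is made only in the current session, a new terminal starts again with the original PATH.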
We are going to modify our script by placing another command below it:
[amittal@hills cs160b]$ cat first.sh
echo "First Script"
cd /tmp
[amittal@hills cs160b]$ ./first.sh
First Script
We execute the script, but after execution our folder does not change. We are not placed in the "/tmp" folder. That's because the main shell spawns a child process to run the script and shows us its output, but things like changing the folder in the child do not affect the main shell. Another way of running the shell script is:
[amittal@hills cs160b]$ bash first.sh
First Script
[amittal@hills cs160b]$ pwd
/users/amittal/cs160b
In both cases we are still in the same folder that we started out from.
In the 2 approaches "./first.sh" and "bash first.sh" the file is treated as a shell script file.
In order for the script to make changes we need to run the content of the script in our shell instead of a sub shell. Using this approach we call the files command files. We can run the scripts either as:
source first.sh
or
. ./first.sh
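The difference between the two ways of running a file can be checked with a script that changes folder and sets a variable. In this sketch the script "go.sh", the folder "target" and the variable "MARKER" are hypothetical names, and the portable "." form is used rather than "source".

```shell
# Work in a throwaway folder.
work=$(mktemp -d)
cd "$work"
mkdir target
unset MARKER
printf '%s\n' 'cd target' 'MARKER="set by script"' > go.sh

sh go.sh                            # runs in a child process
after_child=$(basename "$(pwd)")    # we are still in the scratch folder
marker_after_child=${MARKER:-unset} # and MARKER was never set in our shell

. ./go.sh                           # runs in the current shell
after_dot=$(basename "$(pwd)")      # now we are inside "target"

echo "$marker_after_child / $MARKER / $after_dot"
```

Only the second run leaves a trace in our shell, in both the current folder and the variable.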
Environment Variables
Every shell session (started when we open a terminal) has environment variables associated with it. These are variables that determine how your shell acts. We can list the variables using the set command. Since the output is pretty long we can do
set | more
Internal vs External Commands
Internal commands are the ones that are built into the shell program, such as "source". Since there is no separate executable for them, "which" does not find one:
[amittal@hills w2]$ which source
/usr/bin/which: no source in (/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/users/amittal/.local/bin:/users/amittal/bin)
An external command is one that has an executable somewhere in the folders listed in PATH. Example cat:
[amittal@hills w2]$ which cat
/usr/bin/cat
[amittal@hills w2]$
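The shell itself can also tell us whether a command is internal or external, through "type" and "command -v". The sketch uses "cd" as the internal command because it is built into every POSIX shell; the exact wording printed by "type" differs between shells.

```shell
type cd          # reports that cd is a shell builtin
type cat         # reports the path of the external executable

builtin_lookup=$(command -v cd)     # for a builtin, command -v prints just the name
external_lookup=$(command -v cat)   # for an external command, it prints the full path

echo "$builtin_lookup"
echo "$external_lookup"
```

Unlike "which", these two are part of the shell, so they know about builtins as well as executables on the PATH.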
A device can be made available automatically by the Unix system at a path such as "/mnt/cdrom". The path can also be set manually with the mount command.
Ex1
Create a file called "data.txt" whose contents are:
This is first line.
This is second line.
This is third.
This is fourth line.
Using pipes and redirection, "cat" and "grep", write all the lines that do not contain the word "line" to another file called "data1.txt".
Ex2
The command "date" writes out the current date to the console. Using pipes write a command that will print the date if the current month is not Sept or Oct .
Ex3
Create a folder say "ex3" and create 2 files in it with the following contents.
File: "data1.txt"
This is file data1.txt
File: "data2.txt"
This is file data2.txt
Write a shell script, say "ex3.sh", that creates a file called "data3.txt" containing the contents of each file, with a line of dashes appended after each file's contents, as shown below.
File: "data3.txt"
This is file data1.txt
-------------------------
This is file data2.txt
-------------------------
Ex4
In the shell create 2 variables with the following values:
num1=10
num2=5
Do not declare or set these variables in the shell script.
Write a shell script, say "ex4.sh", that prints the sum. Use the expression:
expr $num1 + $num2
You must execute the script with the following syntax.
./ex4.sh
Ex5
Create a script say "ex5.sh" that will create a folder called "tmp" in the folder where the script is executing and create a file in this called "trash.txt" .
Ex6:
Create a script that has a single "echo" statement :
echo "Script Ex6"
Run the script using the following commands.
./ex6.sh
bash ex6.sh
. ./ex6.sh
source ex6.sh
Soln 1
cat data.txt | grep -v "line" > data1.txt
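Soln 2
The "date" command typically abbreviates September as "Sep", so we match on "Sep" rather than "Sept".
date | grep -v "Sep" | grep -v "Oct"
The two patterns can also be combined in a single grep by using the pipe symbol. We need to place a "\" in front of it, because the unescaped "|" belongs to extended regular expressions. We shall study the difference between regular and extended expressions later on.
date | grep -v 'Sep\|Oct'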