First things first, where are you? Run the command that will help you figure it out!
pwd
So, now we know we are in /home/genomics/
To access the practice lab of today, we need to access the folder "workshop_materials", and within it, a folder called "unix_tutorial".
How would you move to that folder using the absolute path?
cd /home/genomics/workshop_materials/unix_tutorial
How would you move to the directory using a relative path?
cd ./workshop_materials/unix_tutorial
Inside the folder, there are two folders, "data" and "unix". Let's access the unix one, using either the absolute or the relative path!
cd /home/genomics/workshop_materials/unix_tutorial/unix
cd unix
Choose your way, and access the unix folder that contains today's material!
If you try both ways, how do you move back to the parent directory of unix (unix_tutorial)?
cd ..
Remember that the command cd can always easily take us back home (/home/username) with either option:
cd
cd ~
If we want to go back to the previous working directory where we were before the last cd, we type:
cd -
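If you want to try these moves without touching the workshop folders, here is a minimal sketch. It assumes a throwaway sandbox created with mktemp; the sandbox layout and the variable names (start_dir, sandbox, parent, previous) are made up for illustration and are not part of the workshop material.

```shell
# Hypothetical sandbox that mimics the workshop layout (not the real folders).
start_dir=$(pwd)
sandbox=$(mktemp -d)
mkdir -p "$sandbox/unix_tutorial/unix" "$sandbox/unix_tutorial/data"

cd "$sandbox/unix_tutorial/unix"   # absolute path: works from anywhere
cd ..                              # up one level, into unix_tutorial
parent=$(basename "$(pwd)")
cd unix                            # relative path: resolved from where we are now
cd ../data                         # up one level, then down into the sibling folder
cd - > /dev/null                   # back to the previous directory (cd - also prints it)
previous=$(basename "$(pwd)")
echo "after 'cd ..': $parent, after 'cd -': $previous"

# Return to where we started and remove the sandbox.
cd "$start_dir"
rm -r "$sandbox"
```

Note that cd - only remembers one step back: running it twice in a row brings you to where you were before the first one.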
Let's take a look at what is inside the unix folder, what command do we use for that?
ls
Now visualize the contents of the unix_tutorial and home directories.
ls ..
ls ../../../
Check the manual of the command ls!
man ls
What option of the command can we use to see more information of each file?
ls -l
Let's combine flags! Try to make the formatted list more "human readable", so that file sizes are clearer!
ls -lh
Are file1 and file2 different in size?
Finally... Do we have any hidden files in the directory? 👀
ls -a
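To see what each flag changes, here is a small sketch on a throwaway folder. The file names small_file and .hidden_file are invented for this example; the real unix folder will contain different files.

```shell
# Hypothetical sandbox with one visible and one hidden file.
sandbox=$(mktemp -d)
printf 'short\n' > "$sandbox/small_file"
touch "$sandbox/.hidden_file"

ls "$sandbox"       # names only; files starting with a dot are not shown
ls -l "$sandbox"    # long format: permissions, owner, size in bytes, date, name
ls -lh "$sandbox"   # same, with sizes printed in human-readable units
ls -a "$sandbox"    # also shows hidden entries: ., .. and .hidden_file

# Keep the two listings so we can compare them.
visible=$(ls "$sandbox")
all_entries=$(ls -a "$sandbox")
rm -r "$sandbox"
```

Flags can be combined in any order, so ls -lah gives a human-readable long listing that includes hidden files.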
Let's create a directory inside the unix folder and call it new_directory:
mkdir new_directory
Remember to always try to avoid spaces in the names of both files and directories, as some programs will not be able to interpret them correctly as a single name!
Now copy file1 to the directory you just created!
cp file1 new_directory/
Check the location of file1, is it inside the new directory? Is it still in the unix folder?
ls new_directory/
ls .
We know that file2 is larger than file1, so we do not want file2 duplicated, filling up extra space.
Let's instead move it to the new directory!
mv file2 new_directory/
Take a look again in both folders! Where can we see file2?
You should be able to see file1 in both locations, and file2 only in the new location.
Let's now make a copy of file1 in the unix folder, and call it file2!
cp file1 file2
Check the file sizes of file2 in unix and file2 in new_directory, are they the same?
This can be a bit confusing for us, as we might work with the wrong file, so let's rename the copy of file1 in unix (now named file2), and call it file1_copy
mv file2 file1_copy
Great! Now we have more informative names! This is important when we work with a lot of files and we might forget which is which.
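The difference between copying, moving and renaming can be replayed safely in a sandbox. This is only a sketch: the file contents are invented, and the sandbox stands in for the unix folder.

```shell
# Hypothetical sandbox standing in for the unix folder.
sandbox=$(mktemp -d)
printf 'one\n' > "$sandbox/file1"
printf 'one\ntwo\nthree\n' > "$sandbox/file2"
mkdir "$sandbox/new_directory"

cp "$sandbox/file1" "$sandbox/new_directory/"   # copy: file1 now exists in both places
mv "$sandbox/file2" "$sandbox/new_directory/"   # move: file2 exists only in new_directory
cp "$sandbox/file1" "$sandbox/file2"            # copy file1 under the name file2
mv "$sandbox/file2" "$sandbox/file1_copy"       # mv within one folder is a rename

top_level=$(ls "$sandbox")
inside=$(ls "$sandbox/new_directory")
echo "top level:"; echo "$top_level"
echo "new_directory:"; echo "$inside"
rm -r "$sandbox"
```

Keep in mind that both cp and mv silently overwrite an existing file with the same name at the destination.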
Now, let's try copying a directory! Copy the new_directory, and call the new one new_directory2
cp new_directory new_directory2
List the contents of the unix folder to check if it worked!
Did it work? Or did you forget to add the flag -r? ;)
cp -r new_directory new_directory2
If you managed, just run ls to check!
We have created a lot of duplicates, let's keep only the files in new_directory (which are the original file1 and file2!)
Remove file1 and file1_copy from the unix folder.
🛑 The remove command will not ask for confirmation or print any warning! Always double-check before running it: make sure you are in the right directory and that the file is really the one you want to remove!
Are you in /home/genomics/workshop_materials/unix_tutorial/unix? If yes, go ahead!
rm file1
rm file1_copy
Now remove new_directory2, make sure to use the adequate flags to not get an error message!
rm -r new_directory2
Check one last time the contents of unix and new_directory to confirm it all worked!
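Here is a sandboxed sketch of copying and removing a directory, so you can see the effect of -r without risking real files. The folder and variable names are placeholders chosen for this example.

```shell
# Hypothetical sandbox with one directory to copy and then remove.
sandbox=$(mktemp -d)
mkdir "$sandbox/new_directory"
printf 'one\n' > "$sandbox/new_directory/file1"

# Without -r, cp refuses to copy a directory and reports an error.
if cp "$sandbox/new_directory" "$sandbox/new_directory2" 2>/dev/null; then
    plain_cp="worked"
else
    plain_cp="failed"
fi

cp -r "$sandbox/new_directory" "$sandbox/new_directory2"   # recursive copy
copied=$(ls "$sandbox/new_directory2")

# Look before you delete: list the target first, then remove it.
ls "$sandbox/new_directory2"
rm -r "$sandbox/new_directory2"
if [ -e "$sandbox/new_directory2" ]; then still_there="yes"; else still_there="no"; fi

echo "cp without -r: $plain_cp, copied: $copied, still there: $still_there"
rm -r "$sandbox"
```

If you would like a safety net, rm -i asks for confirmation before each removal.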
We will keep working with the files in new_directory, but we will do so from a new folder dedicated to that, named working_directory, inside the unix folder. Again, we do not want duplicates filling up our precious server space!
Let's create a symlink instead! A symlink will be located in the folder in which you are working, and will point to the true location of that file, while not occupying any extra space!
First, create a directory named working_directory, inside the unix folder.
mkdir working_directory
and change location to working_directory:
cd working_directory/
Then, create a symbolic link of file1 inside working_directory:
ln -s /home/genomics/workshop_materials/unix_tutorial/unix/new_directory/file1 /home/genomics/workshop_materials/unix_tutorial/unix/working_directory/
As we are already inside working_directory, you could use the relative path as well:
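ln -s /home/genomics/workshop_materials/unix_tutorial/unix/new_directory/file1 .
Important: it is safest to always specify the original location of the file as an absolute path! A relative target is interpreted relative to the folder where the symlink lives, not the folder you run ln from, so it is easy to create a broken link.
Visualize the contents of working_directory to check if it worked! What information tells you that it is a symlink in the formatted visualization?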
ls -l
The -> symbol after the file name shows the path the symlink points to.
Now, repeat the same procedure with file2!
Great! file1 and file2 are still stored in /home/genomics/workshop_materials/unix_tutorial/unix/new_directory, but now we can also access them easily in /home/genomics/workshop_materials/unix_tutorial/unix/working_directory!!
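The whole symlink procedure can be tried in a sandbox first. This sketch assumes made-up file contents and uses readlink, which prints the path a symlink points to; the variable names are only for illustration.

```shell
# Hypothetical sandbox with the same two folder names as in the exercise.
sandbox=$(mktemp -d)
mkdir "$sandbox/new_directory" "$sandbox/working_directory"
printf 'This is line 1\n' > "$sandbox/new_directory/file1"

# Absolute path to the original, then the folder where the link should appear.
ln -s "$sandbox/new_directory/file1" "$sandbox/working_directory/"

ls -l "$sandbox/working_directory"                       # the line starts with l and shows ->
target=$(readlink "$sandbox/working_directory/file1")    # path stored in the link
content=$(cat "$sandbox/working_directory/file1")        # reading goes through to the original
if [ -L "$sandbox/working_directory/file1" ]; then is_link="yes"; else is_link="no"; fi

echo "link points to: $target"
rm -r "$sandbox"
```

Removing a symlink with rm deletes only the link; the original file stays where it is. Removing the original instead leaves a broken link behind.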
Now we want to start looking at what our files contain.
First, we can get an idea of how big they are. This is sometimes important to avoid printing hundreds, or even thousands, of lines to our screen. We can always take a look at the file sizes, as we did before, which helped us discover that file1 is smaller than file2.
Let's use the command "word count" (wc) to get more information about the file content.
How many lines does file1 have?
How many words?
How many characters?
Check the manual to figure out the options of the command to get each bit of information!
Repeat the procedure with file2.
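Once you have found the options in the manual, you can compare with this sketch. It runs on an invented three-line sample rather than on file1 or file2, so the numbers will differ from yours.

```shell
# Hypothetical three-line sample file.
sample=$(mktemp)
printf 'This is line 1\nThis is line 2\nThis is line 3\n' > "$sample"

# Reading from standard input (<) makes wc print only the number, not the file name.
# The $(( )) wrapper strips the padding some wc versions add around the number.
lines=$(( $(wc -l < "$sample") ))   # -l counts lines
words=$(( $(wc -w < "$sample") ))   # -w counts words
chars=$(( $(wc -m < "$sample") ))   # -m counts characters (-c would count bytes)

echo "$lines lines, $words words, $chars characters"
rm "$sample"
```

Running wc with no options prints lines, words and bytes in one go.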
It is time to finally read the files' content! 📖 We can use the program less for that! less is a read-only program that allows us to read the contents of a file without making any changes to it.
less is a more complete version of the command more, offering a wider range of ways to navigate through the file. A trick to remember it: "less is more". For the whole array of options in less, check the Wikipedia page or its manual in the command line.
Use it to read both file1 and file2.
To quit less, we just press q.
Another command that will show us the complete contents of the file is cat. In this case, it will be printed as standard output directly in our terminal.
Use it to read both file1 and file2.
Short files like file1 are ok to be visualized directly as stdout, but others, like file2, will fill our terminal with all its lines. This command will be very useful later, to redirect file contents to other commands! :)
When working with larger files, sometimes we just want to quickly check their contents instead of opening an entire huge file. Use the appropriate commands to check the first 10 lines and the last 10 lines of file2.
Check now the first 15 and the last 15! Find the right option in the manual.
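If you want to compare your answer afterwards, here is a sketch on an invented 30-line sample. It assumes lines of the form "This is line X", which may not match the real file2.

```shell
# Hypothetical 30-line sample file.
sample=$(mktemp)
i=1
while [ "$i" -le 30 ]; do
    printf 'This is line %d\n' "$i"
    i=$((i + 1))
done > "$sample"

head "$sample"          # first 10 lines by default
tail "$sample"          # last 10 lines by default
head -n 15 "$sample"    # -n sets how many lines to show
tail -n 15 "$sample"

# Chaining the two picks out a single line: here the last of the first 15,
# and the first of the last 15.
first15_last=$(head -n 15 "$sample" | tail -n 1)
last15_first=$(tail -n 15 "$sample" | head -n 1)
echo "$first15_last / $last15_first"
rm "$sample"
```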
Create a new_file2 that contains two times the contents of file2!
Take a look at the contents of the file (use less, head, tail...)
Are the lines in order? If not, let's sort the contents! Save the result in a new file named new_file2_sort.
Is the order ok? Do we need some special sort flag to get a natural order?
Check how the new_file2_sort differs from the original file2, and how new_file2 differs from new_file2_sort.
Split new_file2 into 2 files! How many lines should we put in each file? What about if we want to split file2 in two files?
Make new_file2 look again like file2. How can you do that with only one command?
Finally, extract only the number from every line!
1. Create a new_file2 that contains two times the contents of file2:
cat file2 file2 > new_file2
2. Take a look at the contents of the file:
less new_file2
3. Sort the contents of the file and save it in a new file named new_file2_sort:
sort new_file2 > new_file2_sort
4. Check if the order is natural. If not, sort using a special flag. For example, if each line starts with a number and you want numerical order:
sort -n new_file2 > new_file2_sort
If the number is in a later field instead (for example, the fourth field in "This is line X"), specify that field with -k:
sort -n -k4 new_file2 > new_file2_sort
5. Check how new_file2_sort differs from the original file2, and how new_file2 differs from new_file2_sort:
diff new_file2_sort file2
diff new_file2_sort new_file2
6. Split new_file2 into 2 files. To determine how many lines should be in each file, you can count the total lines and divide by 2. If you want to split file2 into two files, you can do the same:
wc -l new_file2
(divide the number by 2, and use the result in place of "#number" below)
split -l "#number" new_file2 new_file2_split
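7. Make new_file2 look again like file2. uniq only removes duplicated lines that are next to each other, so it has no effect on new_file2 itself. Run it on the sorted file instead (the output will be in sorted order):
uniq new_file2_sort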
Other tricks:
cp file2 new_file2
cat file2 > new_file2
8. Finally, extract only the number from every line. If the lines contain the phrase "This is line X", you can extract X using the following command:
cut -d ' ' -f 4 new_file2
This will extract the fourth field (delimited by space) from each line.
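To see all of these steps working together, here is a sandboxed sketch. It assumes a made-up file2 with 12 lines of the form "This is line X", already in numerical order; the real file2 may look different, and the variable names are only for illustration.

```shell
# Hypothetical sandbox with a 12-line stand-in for file2.
sandbox=$(mktemp -d)
i=1
while [ "$i" -le 12 ]; do
    printf 'This is line %d\n' "$i"
    i=$((i + 1))
done > "$sandbox/file2"

# Step 1: two copies of file2, one after the other.
cat "$sandbox/file2" "$sandbox/file2" > "$sandbox/new_file2"

# Steps 3-4: plain sort compares text, so "line 10" lands before "line 2".
sort "$sandbox/new_file2" | head -n 4
# -n -k4 compares the fourth field as a number instead.
sort -n -k4 "$sandbox/new_file2" > "$sandbox/new_file2_sort"

# Step 7: uniq removes only adjacent duplicates.
unsorted_uniq=$(( $(uniq "$sandbox/new_file2" | wc -l) ))   # nothing removed
uniq "$sandbox/new_file2_sort" > "$sandbox/restored"
if diff "$sandbox/restored" "$sandbox/file2" > /dev/null; then
    restored="identical"
else
    restored="different"
fi

# Step 6: split into two pieces; rounding up keeps an odd line count in two files.
total=$(( $(wc -l < "$sandbox/new_file2") ))
half=$(( (total + 1) / 2 ))
split -l "$half" "$sandbox/new_file2" "$sandbox/new_file2_split"
pieces=$(ls "$sandbox" | grep -c '^new_file2_split')

# Step 8: the number is the fourth space-separated field.
last_number=$(cut -d ' ' -f 4 "$sandbox/file2" | tail -n 1)

echo "uniq on unsorted: $unsorted_uniq lines, restored: $restored, pieces: $pieces, last number: $last_number"
rm -r "$sandbox"
```

split names its output files by adding aa, ab, ... to the prefix you give it, so here the two pieces are new_file2_splitaa and new_file2_splitab.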