
Linux Tips


Foundation

Most of the following ideas involve using Bash and manipulating the inputs and outputs of basic commands. For that to make sense, we need to review three concepts: stdin, stdout, and stderr. Every program, when run, gets three connections to the outside environment: an input, an output, and a separate output for error messages. These are called stdin, stdout, and stderr, and they can be redirected in a few ways.

  • stdin
    • command < file
      In this method, the contents of the file are taken as the stdin for the command you are running
    • command1 | command2
      In this method the stdin to command2 is taken from the output of command1.  The operator "|" is called a pipe.
  • stdout
    • command > file
      This will write the stdout (note: NOT the stderr) to the specified file, overwriting any previous contents
    • command >> file 
      Distinct from the previous option, this will append the stdout to the file, without overwriting the previous contents
    • command1 | command2
      As before, but the emphasis now is that the stdout can be redirected to another command. The operator "|" is called a pipe.
  • stderr
    • command 2> file
      This will write the stderr (note: NOT the stdout) to the specified file, overwriting any previous contents
    • command &> file
      This will write BOTH stderr and stdout to the same file.
  • tee
    • The data stream can be split with a 'tee' command. For example
      ls | tee file
    • This will take the output of ls and print it to screen in addition to writing it to file. 
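
The redirections listed above can be tried out safely in a scratch directory. The following is a minimal sketch, not a definitive recipe: the helper function talk and the file names are made up for illustration, and "> file 2>&1" is used as the portable spelling of "&> file".

```shell
#!/bin/bash
# Scratch directory so the example does not touch real files.
workdir=$(mktemp -d)

# Hypothetical helper that writes one line to stdout and one to stderr.
talk() {
    echo "to stdout"
    echo "to stderr" >&2
}

talk >  "$workdir/out.txt" 2> "$workdir/err.txt"   # stdout and stderr to separate files
talk >> "$workdir/out.txt" 2> /dev/null            # append stdout, discard stderr
talk >  "$workdir/both.txt" 2>&1                   # both streams to one file (same as &> in bash)
talk 2> /dev/null | tee "$workdir/tee.txt"         # print stdout and keep a copy

# "command < file" feeds the file to wc as its stdin.
out_lines=$(wc -l < "$workdir/out.txt" | tr -d ' ')
both_lines=$(wc -l < "$workdir/both.txt" | tr -d ' ')
err_text=$(cat "$workdir/err.txt")
tee_text=$(cat "$workdir/tee.txt")
echo "out.txt: $out_lines lines, both.txt: $both_lines lines"

rm -rf "$workdir"
```

out.txt ends up with two lines because the second call appends to it rather than overwriting it.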

Simple Scripts

Now that we have the basic tools and vocabulary to start doing more interesting things with bash, let's take a look at some handy dandy scripts that make day-to-day life a lot easier. 

Running from the command line

You need a few things to be able to run a script from the command line like:

~ $ command 

instead of:

~ $ bash /path/to/command 

First, make sure the top of the script says which interpreter to use for its contents. For bash, this means that the first line needs to read "#!/bin/bash". Second, make the script executable. Third, add the directory that contains the script (not the script itself) to your PATH environment variable. Finally, either restart your session or re-source your bashrc file. 

~ $ head -1 /path/to/command 

#!/bin/bash

~ $ chmod +x /path/to/command

~ $ echo 'export PATH=$PATH:/path/to' >> ~/.bashrc

~ $ source ~/.bashrc
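
The same steps can be rehearsed without editing ~/.bashrc. This is a sketch under assumptions: the command name hello_cmd and the scratch directory are hypothetical, and the PATH change only lasts for the current shell.

```shell
#!/bin/bash
# Hypothetical scratch directory standing in for ~/bin.
bindir=$(mktemp -d)

# 1. Write a script whose first line names the interpreter.
printf '%s\n' '#!/bin/bash' 'echo "hello from hello_cmd"' > "$bindir/hello_cmd"

# 2. Make it executable.
chmod +x "$bindir/hello_cmd"

# 3. Add the directory (not the script file) to PATH for this shell only.
export PATH="$PATH:$bindir"

# The script can now be run by name alone.
result=$(hello_cmd)
echo "$result"
```

Putting the export line in ~/.bashrc, as shown above, is what makes the change apply to every new session.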

is

Lists any processes you are running which contain the first command line argument.

~ $ cat ~/bin/is
#!/bin/bash
ps aux | grep "$1" | grep "^$USER"

~ $ is import

oqmd      5311 18.9  1.1 345152 189940 pts/15  S    15:01   4:44 /usr/bin/python2.6 import.py

iskill

Works the same as 'is', but instead of listing the results, the processes found are killed.

~ $ cat ~/bin/iskill
#!/bin/bash
ps aux | grep "$1" | grep "^$USER" | awk '{print $2}' | while read i; do kill -9 $i; done

~ $ iskill import
~ $ is import 
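
Since 'is' and 'iskill' share the same filter, it can help to try the pipeline on canned text before letting it call kill. The sketch below assumes a made-up sample of ps output (only the first line comes from the example above) and adds one extra stage, which is not part of the original scripts, to drop the grep process itself.

```shell
#!/bin/bash
# Canned stand-in for `ps aux`: the first line is the example output above,
# the other two lines are hypothetical.
sample_ps='oqmd      5311 18.9  1.1 345152 189940 pts/15  S    15:01   4:44 /usr/bin/python2.6 import.py
oqmd      5400  0.0  0.0  10000    900 pts/15  S    15:02   0:00 grep import
root      6000  0.0  0.0  20000   1200 ?       S    15:02   0:00 /usr/bin/python2.6 import.py'

pattern=import   # plays the role of $1
user=oqmd        # plays the role of $USER

# Same stages as iskill, plus a filter that removes the grep process.
pids=$(printf '%s\n' "$sample_ps" \
    | grep "$pattern" \
    | grep "^$user" \
    | grep -v ' grep ' \
    | awk '{print $2}')

echo "would kill: $pids"
```

Only the PID of the user's own import.py process survives the filters; the line owned by root is rejected by the "^$user" match.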

qkill

Searches the output of qstat for jobs whose line contains the first command line argument. For example, suppose I submitted a bunch of jobs named 'fcc_convergence_E_K', where E and K are indices tracking energy cutoff and k-point mesh, and then realized that I made a mistake in the queue file. They can all be easily killed with this command. Alternatively, if you pass your username, you terminate all of your jobs at once.

~ $ cat ~/bin/qkill
#!/bin/bash
qstat | grep "$1" | grep "$USER" | awk -F"." '{print $1}' | while read i; do qdel $i; done

~ $ qkill sjk648
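
To see what the awk stage does, the pipeline can be run on sample text instead of a live queue. The qstat lines below are hypothetical (job IDs, server name, and job names are invented), so treat this as an illustration of the field splitting rather than real scheduler output.

```shell
#!/bin/bash
# Hypothetical qstat-style lines: "<id>.<server>  <name>  <user>  <time> <state> <queue>".
sample_qstat='184525.josquin  fcc_convergence_1_1  sjk648  00:10:00 R batch
184526.josquin  fcc_convergence_1_2  sjk648  0        Q batch
184527.josquin  bcc_relax            sjk648  00:01:00 R batch
184528.josquin  fcc_convergence_9_9  other   0        Q batch'

pattern=fcc_convergence   # plays the role of $1
user=sjk648               # plays the role of $USER

# -F"." splits each line on dots, so field 1 is the numeric job ID.
ids=$(printf '%s\n' "$sample_qstat" | grep "$pattern" | grep "$user" | awk -F"." '{print $1}')

echo "would qdel:" $ids
```

In the real script each of these IDs is passed to qdel in turn.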

jnode

Output the resource list for the job in question. Useful for investigating why the job isn't running correctly. In the example below the job is running on node27.

~ $ cat ~/bin/jnode

#!/bin/bash

qstat -f $1 | sed -n '/exec_host/,/Hold_Types/p' | grep -v Hold

 ~ # qstat -f 184525 | sed -n '/exec_host/,/Hold_Types/p' | grep -v Hold

exec_host = node27.cl.northwestern.edu/7+node27.cl.northwestern.edu/6+node27.cl.northwestern.edu/5+node27.cl.northwestern.edu/4+node27.cl.northwestern.edu/3+node27.cl.northwestern.edu/2+node27.cl.northwestern.edu/1+node27.cl.northwestern.edu/0
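
The sed range in jnode prints everything from the exec_host line through the Hold_Types line, and grep -v then drops the Hold_Types line. The sketch below shows this on a made-up fragment of qstat -f output; the field values and the wrapped continuation line are assumptions for illustration.

```shell
#!/bin/bash
# Hypothetical fragment of `qstat -f` output with a wrapped exec_host value.
sample_full='Job Id: 184525.josquin
    Job_Name = example
    job_state = R
    exec_host = node27/3+node27/2+
        node27/1+node27/0
    Hold_Types = n
    Join_Path = oe'

# -n suppresses default printing; /A/,/B/p prints the lines from A through B.
hosts=$(printf '%s\n' "$sample_full" | sed -n '/exec_host/,/Hold_Types/p' | grep -v Hold)

echo "$hosts"
host_lines=$(printf '%s\n' "$hosts" | wc -l | tr -d ' ')
```

Using a range rather than a plain grep for exec_host keeps the continuation lines of a long host list.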


Bash Tips

alias

Turn a simple one-liner into a single command that can be issued from the command line. To keep an alias across sessions, append its definition to a file that your ~/.bashrc sources (here ~/.bash_alias):

echo "alias name='command'" >> ~/.bash_alias 

Some aliases that I personally like and use:
alias pal='ssh -Y sjk648@palestrina'

alias vcl='if [ -e OUTCAR ]; then gzip -fq OUTCAR; fi; if [ -e CHGCAR ]; then gzip -fq CHGCAR; fi; if [ -e WAVECAR ]; then gzip -fq WAVECAR; fi && rm -f PROCAR vasprun.xml'

alias fl="find ~/ -type f -size +200000k -exec ls -lh {} \; | awk '{ print \$9 \": \" \$5 }'"
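
When an alias grows as long as vcl, a shell function is often easier to read and avoids nested quoting. The following is one possible rewrite, not the author's method; the name vcl_func is hypothetical and the demo runs in a scratch directory with empty stand-in files.

```shell
#!/bin/bash
# Hypothetical function equivalent of the vcl alias.
vcl_func() {
    local f
    for f in OUTCAR CHGCAR WAVECAR; do
        # Compress each output file only if it exists.
        if [ -e "$f" ]; then gzip -fq "$f"; fi
    done
    # Remove the files that are not worth keeping.
    rm -f PROCAR vasprun.xml
}

# Demo in a scratch directory with empty stand-in files.
demo=$(mktemp -d)
touch "$demo/OUTCAR" "$demo/PROCAR"
(cd "$demo" && vcl_func)
listing=$(ls "$demo")
echo "$listing"
rm -rf "$demo"
```

A function defined in ~/.bashrc is called exactly like an alias.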

nohup

Allows you to run a process which will keep going after the session is closed. stderr and stdout are written to a file (nohup.out by default).

~ $ nohup long_script.sh

(appends output to nohup.out, runs in the foreground of the current shell)
~ $ nohup long_script.sh > long.out 2> long.err &

(writes stdout to long.out and stderr to long.err, overwriting any previous contents, and runs in the background so the current shell is free for other commands)
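
A small self-contained run shows where the two streams end up. This is a sketch: the sh -c one-liner stands in for long_script.sh, the scratch directory is hypothetical, and stdin is taken from /dev/null so that nohup has nothing to report about the terminal.

```shell
#!/bin/bash
workdir=$(mktemp -d)

# Stand-in for long_script.sh: writes one line to each stream.
nohup sh -c 'echo "progress"; echo "warning" >&2' \
    > "$workdir/long.out" 2> "$workdir/long.err" < /dev/null &
job_pid=$!

# wait is only here so the example can read the files;
# normally you would log out and let the job keep running.
wait "$job_pid"

out_text=$(cat "$workdir/long.out")
err_text=$(cat "$workdir/long.err")
echo "stdout file: $out_text"
echo "stderr file: $err_text"
rm -rf "$workdir"
```

Because stdout is redirected explicitly, no nohup.out file is created in this case.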

cron

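Allows you to schedule the execution of a process based on the system clock.
crontab file # Installs file as your crontab, replacing any existing entries

Each line of file has five time fields (minute, hour, day of month, month, day of week) followed by the command to run:
Every 2 hours at :30          30 */2 * * * command
Every day at 11:45PM          45 23 * * * command
Every Sunday at 1:00AM        0 1 * * 0 command
Every last day of month
at 10:00AM and 10:00PM        0 10,22 28-31 * * [ "$(date -d tomorrow +\%d)" = 01 ] && command

Standard cron does not accept the "?" and "L" characters (they belong to Quartz-style schedulers), so the last entry runs on days 28-31 and uses GNU date to check that tomorrow is the 1st.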
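
Since installing a file replaces the whole crontab, one common pattern is to start from the current table and append to it. The sketch below builds such a file but leaves the install step commented out; /path/to/backup.sh is a hypothetical script, and the entry uses the standard five-field crontab layout.

```shell
#!/bin/bash
cronfile=$(mktemp)

# Start from the current table so existing entries are kept.
# (crontab -l fails harmlessly if you have no crontab yet.)
crontab -l > "$cronfile" 2> /dev/null || true

# Field order: minute hour day-of-month month day-of-week command
echo '45 23 * * * /path/to/backup.sh' >> "$cronfile"

new_entry=$(tail -n 1 "$cronfile")
echo "$new_entry"

# Uncomment to install the file as your crontab:
# crontab "$cronfile"
```

Running crontab -l afterwards is a quick way to confirm what was installed.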

rsync / rdiff-backup

Can be used to keep a backup of data. Only copies new data, doesn’t recopy everything every time. 

rsync -avze ssh --delete netid@quest.it.northwestern.edu:~/ /backup/path

(Executed locally, this backs up the home directory from quest to the local machine at /backup/path)

rdiff-backup josquin-backup::/etc /backup/josquin/etc

(The command used to back up from one cluster to another)
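
Because --delete removes files from the destination, it is worth rehearsing with -n (dry run) first. The sketch below works on two local scratch directories instead of a remote host, and it skips the rsync calls if rsync is not installed; the file names are made up.

```shell
#!/bin/bash
src=$(mktemp -d)
dst=$(mktemp -d)
echo "data" > "$src/keep.txt"
echo "old"  > "$dst/stale.txt"

if command -v rsync > /dev/null; then
    # -n lists what would change without touching anything.
    rsync -avn --delete "$src/" "$dst/"
    # The real run copies keep.txt and, because of --delete, removes stale.txt.
    rsync -a --delete "$src/" "$dst/"
fi

after=$(ls "$dst")
echo "$after"
```

The trailing slash on the source means "the contents of this directory", which matches the ~/ in the command above.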

sort/ uniq

sort orders the lines of a file. uniq removes duplicate lines, but only when they are adjacent, so sort first.

~ $ cat file
6 apples
4 oranges
7 bananas
4 oranges
~ $ cat file | sort -n | uniq
4 oranges
6 apples
7 bananas
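
The sketch below reuses the same four lines to show why the sort step matters, along with two common variations: uniq -c to count repeats and sort -u to sort and de-duplicate in one step. The variable names are only for illustration.

```shell
#!/bin/bash
fruit='6 apples
4 oranges
7 bananas
4 oranges'

# uniq alone misses the repeat because the two "4 oranges" lines are not adjacent.
unsorted_count=$(printf '%s\n' "$fruit" | uniq | wc -l | tr -d ' ')

# Sorting first makes duplicates adjacent; -c prefixes each line with its count.
counted=$(printf '%s\n' "$fruit" | sort -n | uniq -c)

# sort -u sorts and removes duplicate lines in one step.
deduped=$(printf '%s\n' "$fruit" | sort -u)

echo "lines after uniq without sort: $unsorted_count"
echo "$counted"
echo "$deduped"
```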

xargs

stdout | xargs command

Pipe the stdout of some process into xargs and use it as a sequence of arguments for the command

Example:

find . -name WAVECAR | xargs rm

find . -name WAVECAR will find every file named 'WAVECAR' in the current directory and any of its subdirectories. Piping this list into 'xargs rm' will delete every such file.
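
Plain xargs splits its input on whitespace, so a directory name containing a space would break the command above. A variant that is safe for such names, supported by both GNU and BSD find and xargs, is sketched below in a scratch directory; the directory names are made up.

```shell
#!/bin/bash
root=$(mktemp -d)
mkdir -p "$root/run 1" "$root/run2"
touch "$root/run 1/WAVECAR" "$root/run2/WAVECAR" "$root/run2/OUTCAR"

# -print0 and -0 separate names with NUL bytes, so "run 1" stays one argument.
find "$root" -name WAVECAR -print0 | xargs -0 rm -f

wavecars_left=$(find "$root" -type f -name WAVECAR | wc -l | tr -d ' ')
files_left=$(find "$root" -type f | wc -l | tr -d ' ')
echo "WAVECAR files left: $wavecars_left, other files left: $files_left"
rm -rf "$root"
```

Replacing rm with echo in the xargs stage is a simple way to preview which files would be deleted.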

Bash one-liners



