Howto: Linux / Cygwin Command Line

The Linux or Cygwin command line is very powerful and lets you perform sophisticated tasks without leaving the terminal. A list of common tasks that I find useful is given below.

Viewing Contents of a Text File 

  • Find line count of file without opening it in a viewer
    • wc <fileName>
    • This returns the number of lines, words and bytes within the file; wc -l <fileName> returns the line count alone
  • View a large file one screen at a time, starting from the beginning
    • less <fileName>
  • View the end of a file 
    • tail <fileName>
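
The commands above can be tried out on a throwaway file. The following is only a sketch: the temporary file and the variable names are my own, and head (the counterpart of tail) is added for comparison.

```shell
# Create a throwaway sample file containing the numbers 1..100, one per line.
sample=$(mktemp)
seq 1 100 > "$sample"

# Reading from stdin keeps the file name out of wc's output;
# tr strips the padding that some wc implementations add.
line_count=$(wc -l < "$sample" | tr -d ' ')

# First three and last three lines of the file.
first_lines=$(head -n 3 "$sample")
last_lines=$(tail -n 3 "$sample")

echo "lines: $line_count"
echo "$first_lines"
echo "$last_lines"

rm -f "$sample"
```

By default tail (and head) show ten lines; -n changes that.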

Useful Commands for Data Analysis

  • Extract a column of data (where , is a delimiter)
    • cut -d "," -f 2 <fileName>
  • Producing a list of unique values of a column (output as a set of lines)
    • cut -d "," -f 2 <fileName> | sort | uniq
    • Note that the output of the cut command has to be sent through sort before it is sent to uniq, because uniq only removes duplicate lines that are adjacent
  • Producing a list of unique values of a column across a set of instances, output as a comma-delimited list
    • cut -d "," -f 3 <fileName> | sort | uniq | sed -e ':a; N; s/\n/,/; ta'
    • The final 'sed ...' stage joins the lines output by the uniq command into a single comma-separated line (the syntax shown is for GNU sed)
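
To see why the sort step matters, here is a small sketch on made-up data. The sample rows and variable names are assumptions of mine, and the last step uses paste -s as an alternative to the sed loop, not the sed command itself.

```shell
# Made-up sample data: id, colour, size.
data=$(mktemp)
printf '%s\n' '1,red,S' '2,blue,M' '3,red,L' '4,blue,S' > "$data"

# Without sort, no two equal values are adjacent, so uniq removes nothing.
adjacent_only=$(cut -d "," -f 2 "$data" | uniq | wc -l | tr -d ' ')

# Sorting first brings equal values together so uniq can collapse them.
unique_values=$(cut -d "," -f 2 "$data" | sort | uniq)

# paste -s joins all input lines into one, using "," as the separator.
joined=$(cut -d "," -f 2 "$data" | sort | uniq | paste -sd, -)

echo "lines left without sort: $adjacent_only"
echo "$unique_values"
echo "$joined"

rm -f "$data"
```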

Search Related

  • Finding a file by name, searching subdirectories recursively
    • find . -name file_name -print
  • Search for a string within all files, recursing into subdirectories
    • grep -r 'search_string' *
  • Loop through a set of files within a directory
    • for i in $(ls .); do (action using $i as filename); done
    • E.g. for i in $(ls .); do (a2ps --center-title=$i-Syntax_constraints --output=$i/ $i/); done
    • In the example, the current directory contains only directories. The loop body calls a2ps on the .cl file within each directory.
    • Note that $(ls .) is split on whitespace, so this form only works when the file names contain no spaces
  • Find a class in a set of jar archives in a directory 
    • for i in $(ls .); do (echo $i;jar tf $i | grep class_name );done
  • Display file names without their extensions. This is very useful in conjunction with the for loop, to perform some operation on each file and give the output file a different extension. This command can be used to replace the 'ls .' within the for statements above.
    • ls -1 | sed -e 's/\.[a-zA-Z]*$//'
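
The loop and the extension stripping can also be combined without ls or sed. This is a sketch under my own assumptions: it swaps in a shell glob and the ${name%.*} parameter expansion, which also cope with spaces in file names, and the directory, file names and .bak extension are made up for illustration.

```shell
# Throwaway directory with two sample files, one with a space in its name.
workdir=$(mktemp -d)
touch "$workdir/report.txt" "$workdir/my notes.txt"

# The glob expands to one word per file, even when a name contains spaces.
for f in "$workdir"/*.txt; do
    base=${f%.*}           # strip the last extension (.txt)
    cp "$f" "$base.bak"    # write the output under a different extension
done

# Collect the names of the generated files.
bak_files=$(find "$workdir" -name '*.bak' -exec basename {} \; | sort)
echo "$bak_files"

rm -rf "$workdir"
```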


  • Clear contents of console
    • The clear command clears the console window. However, Cygwin does not have clear installed by default.
    • Ctrl+L clears the screen in Bash, and it works in Cygwin as well
  • Produce a printout of nicely formatted code. 
    • a2ps file_name
  • Dump file contents character by character, with non-printing characters shown as escapes or octal codes
    • od -c file_name
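  • Remove a character (char) from a file
    • tr -d 'char' < file_name > file_name.tmp && mv file_name.tmp file_name
    • Do not redirect the output straight back to file_name: the shell truncates the file before it is read, which leaves it empty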
  • Printing 2 pages on one physical page (the input must be PostScript)
    • cat file_name | psnup -pa4 -2 | lpr
  • Using tar 
    • Creating a tar archive
      • tar -cvf tar_filename.tar source_dir 
    • Creating a gzip-compressed tar archive
      • tar -czvf tar_filename.tar.gz source_dir 
    • Extracting a tar archive
      • tar -xvf tar_filename.tar
    • Extracting a gzip-compressed tar archive
      • tar -xzvf tar_filename.tar.gz
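
A round trip is a handy way to check the tar commands. The sketch below uses temporary directories and a sample file of my own invention; -t (list contents) and -C (change directory before acting) are standard tar options that the list above does not cover.

```shell
# Throwaway source and destination directories.
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir "$src/source_dir"
echo "hello" > "$src/source_dir/a.txt"

# -C makes tar change into $src first, so the archive stores the
# relative path source_dir/... rather than the full temporary path.
tar -czf "$src/archive.tar.gz" -C "$src" source_dir

# List the archive contents without extracting anything.
listing=$(tar -tzf "$src/archive.tar.gz")

# Extract into the destination directory and read the file back.
tar -xzf "$src/archive.tar.gz" -C "$dest"
restored=$(cat "$dest/source_dir/a.txt")

echo "$listing"
echo "$restored"

rm -rf "$src" "$dest"
```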