Sed

Sed stands for "Stream Editor". It is a powerful utility for manipulating text, whether the text comes from a file or from a pipe. As an example of what "sed" can do, there is a file attached to this page named "6_29.cpp" that has line numbers at the beginning of each line. We can use the following command to remove the line numbers.

cat 6_29.cpp | sed -r 's/^ *[0-9]+//g'

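The regular expression "^ *[0-9]+" matches any number of spaces at the beginning of a line followed by at least one digit. The match is replaced by nothing, which removes it. The "-r" option turns on extended regular expressions so that "+" can be used without a backslash.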

cat 6_29.cpp | sed -r '/\/\*/,/\*\//d'

The command looks for a line with the pattern matching "/*" and then another line that matches the end pattern "*/" and deletes these lines along with any lines in between.
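The two commands above can be tried without the attached file. The sketch below is only an illustration: the five-line listing is a made-up stand-in for "6_29.cpp", and the variable names "sample" and "cleaned" are assumptions that do not come from this page. It assumes GNU sed, because of the "-r" option.

```shell
# Hypothetical stand-in for 6_29.cpp: a numbered listing with a block comment.
sample=$(printf '%s\n' \
  ' 1 /* demo' \
  ' 2    program */' \
  ' 3 int main() {' \
  ' 4   return 0;' \
  ' 5 }')

# First strip the leading line numbers, then delete the /* ... */ block.
cleaned=$(printf '%s\n' "$sample" \
  | sed -r 's/^ *[0-9]+//' \
  | sed -r '/\/\*/,/\*\//d')

printf '%s\n' "$cleaned"
```

Only the three lines of code remain; the space that followed each line number is kept because the pattern stops at the last digit.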

The sed substitute command takes a string of the form:

s/../../

The "s" states that we are using the substitution command. We specify the pattern and what to replace the pattern with. The part between the first 2 slashes is the pattern and the part between the second and third slash is the replacement string. This is one use of sed and we shall see other ways that sed can manipulate text.

[amittal@hills sed]$ echo "Lemon tree" | sed 's/tree/juice/'

Lemon juice

[amittal@hills sed]$

We do not have to use the forward slash as the separator; almost any other character can be used instead. Using the question mark:

[amittal@hills sed]$ echo "Lemon tree" | sed 's?tree?juice?'

Lemon juice

[amittal@hills sed]$

$ echo "Lemon tree" | sed 's_tree_juice_'

Lemon juice
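A different separator is most useful when the pattern itself contains slashes. The following is a small sketch of that case; the path and the variable names are made up for illustration.

```shell
# Hypothetical path used only for illustration.
path="/usr/local/bin/tool"

# With "/" as the separator every slash in the pattern must be escaped.
escaped=$(echo "$path" | sed 's/\/usr\/local/\/opt/')

# With "|" as the separator the slashes need no escaping.
plain=$(echo "$path" | sed 's|/usr/local|/opt|')

echo "$escaped"
echo "$plain"
```

Both commands print "/opt/bin/tool"; the second form is simply easier to read.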

The expression below replaces every "t" that is followed by one or more letters, whether the "t" starts a word or appears inside one. The match runs from the "t" to the end of the word.

$ echo "Lemon tree tank top" | sed -r 's/t[a-zA-Z]+/juice /g'

Lemon juice juice juice

$ echo "Lemon atree tank top" | sed -r 's/t[a-zA-Z]+/juice /g'

Lemon ajuice juice juice

Exercises

1) What does the following do ?

$ echo "this is something for tom." | sed -r 's/^t/T/' | sed -r 's/ t/ T/'

The problem with the command below is that it changes words beginning with "t" but also changes a word when the "t" is in the middle of it. Change it so that only words that begin with the letter "t" are modified. Spaces should be preserved as in the original string.

echo "temon its tree tank top" | sed -r 's/t[a-zA-Z]+/juice /g'

Solutions

echo "temon its tree tank top" | sed -r 's/^t[a-zA-Z]+/juice/g' | sed -r 's/ t[a-zA-Z]+/ juice/g'

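We can also use the pipe "|" as an "or" operator, but then the space matched by the second alternative is replaced as well, so the spaces between the words are lost.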

echo "temon its tree tank top" | sed -r 's/(^t[a-zA-Z]+| t[a-zA-Z]+)/juice/g'
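One possible way to keep the spaces in a single command is to capture the start of the line or the space in brackets and put it back with "\1". This is only a sketch: it relies on the "()" and "\1" syntax explained in a later section of this page, and the variable names are assumptions.

```shell
input="temon its tree tank top"

# (^| ) captures either the start of the line or one space;
# \1 writes that captured text back in front of the replacement.
result=$(echo "$input" | sed -r 's/(^| )t[a-zA-Z]+/\1juice/g')

echo "$result"
```

This prints "juice its juice juice juice": "its" is left alone because its "t" is not preceded by a space or the start of the line.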

The "&" symbol in the replacement stands for the matched string.

$ echo "Lemon tree" | sed -r 's/tree/& &/'

Lemon tree tree

Rest of the string that is not matched stays the same.

$ echo "Lemon 5-6" | sed -r 's/[+,-]/ & /'

Lemon 5 - 6

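In the above, the first "+", "," or "-" symbol found in the input string gets spaces placed around it. The comma is part of the bracket expression, so it is matched too. Adding the "g" flag would handle every occurrence instead of only the first.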

[amittal@hills sed]$ echo "123 abc" | sed -r 's/[0-9]+/& &/'

123 123 abc

The pattern that was matched was "123" and that got repeated with "& &" .

[amittal@hills sed]$ echo "123 abc" | sed -r 's/[0-9]+/(&)/'

(123) abc

The above line puts brackets around the number "123". What if we wanted to get rid of the word "abc" and only have "(123)" as the output? We could do something like:

echo "123 abc" | sed -r 's/ [a-zA-Z]+//' | sed -r 's/[0-9][0-9]*/(&)/'

$ ./sed2.sh

(123)

We can do this in a better way because sed allows us to capture part of the match and refer to just that part in the replacement.

Exercises:

1) Place the command

echo "123 abc" | sed -r 's/[0-9]+/& &/'

in a shell script and then run the shell script. This method has the advantage that the command can be edited in a text file and is saved for future reference.

Using "()" and "\1"

We can use the "()" and "\number" syntax to capture parts of a pattern and refer to those parts by number.

[amittal@hills sed]$ echo "123 abc" | sed -r 's/(^[0-9]+) .*/\1/'

123

In the above example the brackets capture the number, and the rest of the line is matched by the pattern " .*". The replacement contains only "\1", so the captured number is kept while the rest of the line is dropped.

The first pair of brackets "()" is referred to as "\1", the next pair as "\2", and so on. We will get an error if the replacement refers to a number that has no matching pair of round brackets in the pattern.

$ echo "123 abc" | sed -r 's/^[0-9]+ .*/\1/'

sed: -e expression #1, char 16: invalid reference \1 on `s' command's RHS

We are missing the round brackets in the pattern.

$ echo "This is a lemon tree" | sed -r 's/(is) (a)/\2 \1/'

This a is lemon tree

In the above the patterns are "is" and "a" .
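The same idea scales to more than two groups. As a hedged illustration, the sketch below reorders the three parts of a date; the date value and the variable names are made up, and "#" is used as the separator so that the slashes in the replacement need no escaping.

```shell
# Hypothetical date in year-month-day form.
iso="2024-01-15"

# \1 = year, \2 = month, \3 = day; write them back in day/month/year order.
dmy=$(echo "$iso" | sed -r 's#^([0-9]{4})-([0-9]{2})-([0-9]{2})$#\3/\2/\1#')

echo "$dmy"
```

This prints "15/01/2024".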

The below line shows how we can switch the first and the second word.

[amittal@hills PartOfPattern]$ echo "We are in a unix scripting class." | sed -r 's/(^[A-Za-z]+) ([A-Za-z]+)/\2 \1/'

are We in a unix scripting class.

[amittal@hills PartOfPattern]$

What if we wanted to grab the second word only from the above example:

echo "We are in a unix scripting class." | sed -r 's/(^[A-Za-z]+) ([A-Za-z]+).*/\2/'

There is usually more than one way to write something.

echo "We are in a unix scripting class." | sed -r 's/^[A-Za-z]+ ([A-Za-z]+).*/\1/'

$ echo "We are in a unix scripting class." | sed -r 's/(^[A-Za-z]+) ([A-Za-z]+)/\2/'

are in a unix scripting class.

We are replacing the first 2 words by just the second word.

We can place "\1" on the left hand side also .

[amittal@hills PartOfPattern]$ echo "This This contains a mistake." | sed -r 's/([A-Za-z]+) \1/\1/'

This contains a mistake.

[amittal@hills PartOfPattern]$

[amittal@hills PartOfPattern]$ echo "This contains contains a mistake." | sed -r 's/([A-Za-z]+) \1/\1/'

This contains a mistake.

Removing duplicated words at the beginning and end of the line:

echo "This contains a mistake. This" | sed -r 's/(^[A-Za-z]+)(.*)\1$/\1\2/'

Removing duplicated words.

$ echo "This contains This a mistake." | sed -r 's/(^[A-Za-z]+)(.*)\1/\1\2/'

This contains a mistake.

Exercises

Assume we have a string "We are in a unix scripting class."

Switch the first and last word.

Switch the first and third word.

Switch the first and third word and remove the second word.

echo "We are in a unix scripting class." | sed -r 'TODO'

Output should be as:

class. are in a unix scripting We

in are We a unix scripting class.

in We a unix scripting class.

Flags

-n and p

The "-n" option means that lines will not be printed to the console automatically.

Ex:

data.txt

This is a test.

The dog is chasing the cat.

A test is coming up.

Are we having fun in this class ?

sed -n 's/test/Test/' data.txt

The "-n" option suppresses the output so we don't get any output printed to the console at all. If we use the "p" flag then the lines that match will get printed out.

[amittal@hills Flags]$ sed -n 's/test/Test/p' data.txt

This is a Test.

A Test is coming up.

The "-n" option suppressed the lines that would normally get printed out and the "p" option prints out the lines that match. What if we have only the "p" option and not the "-n" option.

[amittal@hills Flags]$ sed 's/test/Test/p' data.txt

This is a Test.

This is a Test.

The dog is chasing the cat.

A Test is coming up.

A Test is coming up.

Are we having fun in this class ?

All the lines in the file "data.txt" get printed out, and the lines on which the substitution was made are printed a second time by the "p" flag.

We can use the "-n" option together with the "p" command to simply print the lines that match without replacing anything.

sed -rn '/([a-z]+) \1/p'

The above will print the lines in which a run of lower case letters is followed by a space and the same run again, such as a duplicated word. In this way the sed command is working like grep.

$ cat data.txt | sed -rn '/fun/p'

Are we having fun in this class ?

The above command prints the lines that have the word "fun" in them.
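The combinations of "-n" and "p" described in this section can be compared side by side. The sketch below recreates "data.txt" in a temporary directory so that it is self-contained; the variable names are assumptions made for the example.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' \
  'This is a test.' \
  'The dog is chasing the cat.' \
  'A test is coming up.' \
  'Are we having fun in this class ?' > "$tmpdir/data.txt"

# -n with the p command behaves like grep.
with_sed=$(sed -n '/test/p' "$tmpdir/data.txt")
with_grep=$(grep 'test' "$tmpdir/data.txt")

# -n without p prints nothing at all.
silent=$(sed -n 's/test/Test/' "$tmpdir/data.txt")

# p without -n prints every line, and the changed lines twice.
line_count=$(sed 's/test/Test/p' "$tmpdir/data.txt" | wc -l | tr -d ' ')

rm -r "$tmpdir"

printf '%s\n' "$with_sed"
echo "lines printed without -n: $line_count"
```

The file has 4 lines and 2 of them contain "test", so the last command prints 6 lines.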

Exercises

1) What does the below print ?

cat 6_29.cpp | sed -nr 's/([0-9]+)/\1/p'

Flag g

The flag "g" will make the replacements globally.

[amittal@hills Flags]$ echo "Testing the Tesla car." | sed 's/Tes/TES/'

TESting the Tesla car.

Normally the substitution is done on the first pattern match that sed found. Using the "g" flag causes the replacements to occur throughout the line.

[amittal@hills Flags]$ echo "Testing the Tesla car." | sed 's/Tes/TES/g'

TESting the TESla car.

[amittal@hills Flags]$ echo "Testing the Tesla car." | sed -r 's/[^ ]+/(&)/g'

(Testing) (the) (Tesla) (car.)

The above line uses "^" inside the square brackets as a "not" operator: "[^ ]+" matches a run of one or more characters that are not spaces.

Exercise

echo "suden unflatering noncommital subcommitee" | sed -r 'TODO'

Complete the sed command above so that the single "d" is replaced by "dd" and each single "t" is replaced by "tt".

Remove both the duplicated words.

This This contains a mistake mistake

Specifying the occurrence

We can specify which occurrence of the match the substitution should be applied to.

[amittal@hills Flags]$ echo "Testing the Tesla car." | sed -r 's/[^ ]+/(&)/4'

Testing the Tesla (car.)

In the above line we are stating that the substitution should apply to the 4th match only, which here is the 4th word. We can combine the number with the "g" flag to apply the substitution to the nth occurrence and every occurrence after it.

[amittal@hills Flags]$ echo "Testing the Tesla car." | sed -r 's/[^ ]+/(&)/2g'

Testing (the) (Tesla) (car.)

In the above line we are stating that the substitution applies to the 2nd word and beyond.
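The occurrence number works for any pattern, not only for words. The following sketch applies it to the commas in a made-up record; the variable names are assumptions, and combining a number with "g" assumes GNU sed.

```shell
# Hypothetical comma-separated record.
csv="a,b,c,d"

# Replace only the 2nd comma.
second_only=$(echo "$csv" | sed 's/,/;/2')

# Replace the 2nd comma and every comma after it (GNU sed).
second_onward=$(echo "$csv" | sed 's/,/;/2g')

echo "$second_only"
echo "$second_onward"
```

The first command prints "a,b;c,d" and the second prints "a,b;c;d".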

Exercises:

1) Write a sed command to work on the file "data.txt" to keep just the first 3 words in each line.

Solution

$ cat data.txt | sed -r 's/[^ ]+//4g'

Writing the output to a file

File even.txt contents

22 Even number

23 Odd Number

24 Even Number

25 Odd Number

sed -n 's/^[0-9]*[02468] /&/w even' even.txt

Contents of the file "even" :

22 Even number

24 Even Number

This can also be done using redirection:

sed -n 's/^[0-9]*[02468] /&/p' even.txt > even

You can also combine options such as:

sed -n -r 's/^[0-9]*[02468] /&/w even' even.txt

sed -nr 's/^[0-9]*[02468] /&/w even' even.txt
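To check that the "w" flag and shell redirection give the same result, both can be run on a copy of "even.txt" and the two output files compared. This is a sketch under assumptions: the temporary directory and the output file names "even_w" and "even_redirect" are invented for the example.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' \
  '22 Even number' \
  '23 Odd Number' \
  '24 Even Number' \
  '25 Odd Number' > "$tmpdir/even.txt"

# Let sed write the matching lines itself with the w flag.
sed -n "s/^[0-9]*[02468] /&/w $tmpdir/even_w" "$tmpdir/even.txt"

# Print the matching lines with p and let the shell redirect them.
sed -n 's/^[0-9]*[02468] /&/p' "$tmpdir/even.txt" > "$tmpdir/even_redirect"

if cmp -s "$tmpdir/even_w" "$tmpdir/even_redirect"; then
  same=yes
else
  same=no
fi
written=$(cat "$tmpdir/even_w")

rm -r "$tmpdir"
echo "identical output files: $same"
```

Note that the redirected version needs the "p" flag and a different output file name from the input; redirecting into the file that sed is reading would empty it before sed starts.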

[amittal@hills Flags]$ var1=`sed -n 's/^[0-9]*[02468] /&/p' even.txt`

[amittal@hills Flags]$ echo $var1

22 Even number 24 Even Number

[amittal@hills Flags]$ echo "$var1"

22 Even number

24 Even Number

[amittal@hills Flags]$

When we print the value of "$var1" without the quotes, the shell splits the value into separate words and echo joins them with single spaces, so the newlines are lost.

Exercises

1) Using the file "even.txt" place the values 22,23,24,25 in a variable "var1" . There should only be spaces between the numbers . Write your commands in a script and execute the script. Do not hard code the numbers.

$ ./ex_even.sh

22 23 24 25

Ignoring case

$ echo "cAt ate the fish" | sed -r 's/cat/Cat/i'

Cat ate the fish

echo "Abc" | sed -n '/abc/I p'

We can use the capital letter "I" to ignore the case. The above sed command does not substitute but merely searches for a pattern in a manner similar to grep. In the substitute command the flag can be written as either "i" or "I".

Exercises

1) What does the below do

$ echo "a dog jumps A fence" | sed -n 's/a/A/2ipw data'

Multiple Commands

Instead of pipes we can use "-e" option to give multiple commands.

[amittal@hills Flags]$ echo "cab is coming" | sed -e 's/a/A/g' -e 's/c/C/g'

CAb is Coming
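The same pair of commands can be written in a few equivalent ways. The sketch below compares the "-e" form with a semicolon-separated script and with a pipe; the variable names are assumptions made for the example.

```shell
input="cab is coming"

# Two -e options, as shown above.
with_e=$(echo "$input" | sed -e 's/a/A/g' -e 's/c/C/g')

# One script with the commands separated by a semicolon.
with_semicolon=$(echo "$input" | sed 's/a/A/g; s/c/C/g')

# Two sed processes joined by a pipe.
with_pipe=$(echo "$input" | sed 's/a/A/g' | sed 's/c/C/g')

echo "$with_e"
echo "$with_semicolon"
echo "$with_pipe"
```

All three print "CAb is Coming". The single-process forms avoid starting a second sed.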

Exercises:

1) What does the below print ?

$ echo "cab is coming" | sed -e 's/a/A/g' -e 's/A/C/ig'

Using Multiple FileNames

Sed can work with multiple files at the same time. Assume we have the following files.

File: "f1.txt"

#12 This is the first line in file f1.

#Abc This is the second line in file f1

File: "f2.txt"

#132 This is a line in f2.

#Abcdef This is another line in f2

$ sed -r 's/^#[^ ]+ //' f1.txt f2.txt

Output:

This is the first line in file f1.

This is the second line in file f1

This is a line in f2.

This is another line in f2

Exercises:

1) Use the above sed command to save the output in a variable "var1" and output the contents of the variable.

Using Sed to work with a script in a file

If we have many commands we can place the commands in a file and use the "-f" option to run the sed command.

contents of the "myscript" file.

# sed comment - This script changes lower case vowels to upper case

s/a/A/g

s/e/E/g

s/i/I/g

s/o/O/g

s/u/U/g

[amittal@hills FileBased]$ echo "cat is sitting on the roof" | sed -f myscript

cAt Is sIttIng On thE rOOf
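A script file can hold any sequence of sed commands, with "#" lines as comments. As a further illustration, the sketch below writes a small hypothetical clean-up script to a temporary file and runs it with "-f"; the script contents and the variable names are assumptions, not part of the "myscript" file above.

```shell
script=$(mktemp)

# Hypothetical sed script: the commands run in order on every input line.
cat > "$script" <<'EOF'
# remove spaces at the end of the line
s/ *$//
# squeeze every run of spaces down to one space
s/  */ /g
EOF

tidy=$(echo "cat   is  sitting   " | sed -f "$script")
rm -f "$script"

echo "[$tidy]"
```

This prints "[cat is sitting]"; the square brackets are only there to make the missing trailing spaces visible.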

Exercises

1) Place your sed command in a file to increment a 2 digit number so that each digit gets converted to the one higher with 9 getting converted to 0 .

45 -> 56

91 -> 02

99 -> 00

00 -> 11

Printing a specific line using Sed

Let's create a file called "data.txt" containing the following 10 lines.

Line 1

Line 2

Line 3

Line 4

Line 5

Line 6

Line 7

Line 8

Line 9

Line 10

To print out the 5th line we can use the command:

[amittal@hills LineNumbers]$ cat data.txt | sed -n 5p

Line 5

To print out lines 3 to 5 we can use:

[amittal@hills LineNumbers]$ cat data.txt | sed -n 3,5p

Line 3

Line 4

Line 5
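The line addresses can be tried without creating "data.txt" by hand. In this sketch the ten lines are generated with printf and held in a variable; the variable names are assumptions made for the example.

```shell
# Generate "Line 1" to "Line 10", one per line.
data=$(printf 'Line %s\n' 1 2 3 4 5 6 7 8 9 10)

# A single line number.
fifth=$(printf '%s\n' "$data" | sed -n '5p')

# A range of line numbers.
three_to_five=$(printf '%s\n' "$data" | sed -n '3,5p')

# "$" as an address stands for the last line.
last=$(printf '%s\n' "$data" | sed -n '$p')

echo "$fifth"
echo "$three_to_five"
echo "$last"
```

The three commands print "Line 5", then lines 3 to 5, then "Line 10".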

We can also specify that a pattern should apply to a specific line.

[amittal@hills LineNumbers]$ cat data.txt | sed -n '3 s/3/31/p'

Line 31

[amittal@hills LineNumbers]$ cat data.txt | sed -n '4 s/3/3/p'

[amittal@hills LineNumbe

The second command prints nothing because line 4 does not contain "3".

Applying a range of line numbers

[amittal@hills LineNumbers]$ cat data.txt | sed -n '1,4 s/3/31/p'

Line 31

The above looks at lines 1 to 4 and applies the substitute operation wherever a match for the string "3" is found.

$ cat data.txt | sed '1,4 s/Line/LINE/'

LINE 1

LINE 2

LINE 3

LINE 4

Line 5

Line 6

Line 7

Line 8

Line 9

Line 10

The above looks at lines 1 to 4 and changes "Line" to "LINE".

The "$" sign stands for the last line, so a range ending in "$" runs to the end of the file.

$ cat data.txt | sed '3,$ s/Line/LINE/'

Line 1

Line 2

LINE 3

LINE 4

LINE 5

LINE 6

LINE 7

LINE 8

LINE 9

LINE 10

Searching for a range by pattern :

[amittal@hills LineNumbers]$ sed '/3/,/5/ s/Line//' data.txt

Line 1

Line 2

 3

 4

 5

Line 6

Line 7

Line 8

Line 9

Line 10

The substitute command is applied from the first line that matches the first pattern up to the next line that matches the second pattern.

We can also combine a line number and a pattern in a range.

$ cat data.txt | sed '2,/4/ s/Line/LINE/'

Line 1

LINE 2

LINE 3

LINE 4

Line 5

Line 6

Line 7

Line 8

Line 9

Line 10

The above states that we start at line 2 and go up to the next line that contains the pattern "4".
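A range given by two patterns is not limited to one section: after the end pattern has matched, sed starts looking for the start pattern again. The sketch below shows this on a made-up list of lines; the markers "on" and "off" and the variable names are assumptions for the example.

```shell
# Hypothetical input with two sections marked by "on" and "off".
log=$(printf '%s\n' idle on 'work 1' off idle on 'work 2' off idle)

# Print every section from a line "on" to the next line "off".
sections=$(printf '%s\n' "$log" | sed -n '/^on$/,/^off$/p')

printf '%s\n' "$sections"
```

Both sections are printed and the three "idle" lines are left out.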

Exercises:

File: "data1.txt"

This is a test.

BEGIN The dog is chasing the cat.

A test is coming up.

Are we having fun in this class ? END

Some more lines.

Write a sed command that will place a "#" in the section marked BEGIN to END .

Deleting a line:

[amittal@hills LineNumbers]$ sed 3d data.txt

Line 1

Line 2

Line 4

Line 5

Line 6

Line 7

Line 8

Line 9

Line 10

Deleting by a range:

$ sed 1,3d data.txt

Deleting by a pattern:

[amittal@hills LineNumbers]$ sed '/3/ d' data.txt

Line 1

Line 2

Line 4

Line 5

Line 6

Line 7

Line 8

Line 9

Line 10

$ sed '5,$ d' data.txt

Line 1

Line 2

Line 3

Line 4
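A common use of deleting by pattern is to strip comment lines and blank lines from a file. The following is a small sketch of that; the settings shown and the variable names are made up for illustration.

```shell
# Hypothetical configuration text with comments and a blank line.
config=$(printf '%s\n' '# settings' 'name=demo' '' '# more' 'size=3')

# Delete lines that start with "#" and lines that are empty.
active=$(printf '%s\n' "$config" | sed -e '/^#/d' -e '/^$/d')

printf '%s\n' "$active"
```

Only "name=demo" and "size=3" are left.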

Exercise

In this exercise we combine the line range with a pattern. Write a sed command to delete from line "1" to the pattern "3" .

Adding a line after a pattern match:

[amittal@hills LineNumbers]$ sed '/3/ a\ Add' data.txt

Line 1

Line 2

Line 3

Add

Line 4

Line 5

Line 6

Line 7

Line 8

Line 9

Line 10

Adding a line after a line number.

sed '3 a\ Add' data.txt

Adding a line at the end of the file.

sed '$ a\ Add' data.txt
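The one-line "a\ Add" form used above relies on GNU sed. As a hedged alternative, the sketch below uses the traditional form, where the text to add goes on the line after "a\"; the three-line input and the variable names are assumptions for the example.

```shell
data=$(printf 'Line %s\n' 1 2 3)

# Append after line 2; the added text is written on the next line.
appended=$(printf '%s\n' "$data" | sed '2 a\
Added after line 2')

# Append after the last line of the input.
at_end=$(printf '%s\n' "$data" | sed '$ a\
Added at the end')

printf '%s\n' "$appended"
printf '%s\n' "$at_end"
```

The first command places the new line between "Line 2" and "Line 3"; the second places it after "Line 3".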

Changing a line using the "c" command

[amittal@hills LineNumbers]$ sed '/3/ c\ Change a line' data.txt

Line 1

Line 2

Change a line

Line 4

Line 5

Line 6

Line 7

Line 8

Line 9

Line 10

Exercises

1) Write a shell script using sed and line ranges to create a file "data1.txt" with the following contents.

Line 6

Line 7

Line 8

Line 9

Line 10

Line 1

Line 2

Line 3

Line 4

Line 5

Adding a line number.

The "=" command prints the line number on a line of its own before each line.

$ sed = data.txt

1

Line 1

2

Line 2

File: "data2.txt"

Line a

Line b

Line c

Line d

Line e

Line f

Line g

Line h

Line i

Line j

$ sed -n '/c/ =' data2.txt

3

The above matches the line with "c" in it and prints its line number.
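Because "=" prints the number on its own line, a second sed is often used to join each number with the line that follows it. The sketch below is one way to do that; it uses the "N" command, which is not covered on this page and appends the next input line to the current one, and the variable names are assumptions.

```shell
data=$(printf 'Line %s\n' a b c)

# First sed: print each line number on its own line before the line.
# Second sed: N joins the number and the following line into one buffer,
# then the embedded newline is replaced by a space.
numbered=$(printf '%s\n' "$data" | sed = | sed 'N; s/\n/ /')

printf '%s\n' "$numbered"
```

The result is "1 Line a", "2 Line b" and "3 Line c", each on one line.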

Transforming Characters

$ sed 'y/ie/IE/' data.txt

LInE 1

LInE 2

LInE 3

We use the "y" command to state that "i" should be changed to "I" and "e" should be changed to "E".
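The "y" command translates all the listed characters in one pass, so a character that has already been translated is not translated again. The sketch below illustrates this with a made-up string in which "A" and "T" swap places, as do "C" and "G"; the variable names are assumptions.

```shell
# Hypothetical input made of the letters A, C, G and T.
strand="GATTACA"

# Each character in the first list becomes the character at the
# same position in the second list.
complement=$(echo "$strand" | sed 'y/ACGT/TGCA/')

echo "$complement"
```

This prints "CTAATGT". Two chained "s" commands with the "g" flag could not do this swap, because the second would undo the first.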

Exercise

Place your sed command in a file to increment a 2 digit number so that each digit gets converted to the one higher with 9 getting converted to 0 .

45 -> 56

91 -> 02

99 -> 00

00 -> 11

Do the above using the "y" option with sed.

The "\u" option

$ echo "cat" | sed -r 's/.$/\u&/'

caT

$ echo "cat" | sed -r 's/.*/\u&/'

Cat

The lower case "\u" turns the next character of the replacement into upper case. The capital "\U" turns everything that follows it into upper case.

$ echo "cat" | sed -r 's/.*/\U&/'

CAT
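These escapes can be combined with brackets to fix the case of whole words. The sketch below assumes GNU sed; besides "\u" and "\U" it uses "\L", which turns what follows into lower case. The input strings and variable names are made up for the example.

```shell
# Capitalize the first letter of every word.
# \1 is the start of the line or a space, \2 is the first letter.
title=$(echo "lemon tree tank" | sed -r 's/(^| )([a-z])/\1\u\2/g')

# Upper case the first letter of each word and lower case the rest.
fixed=$(echo "hELLO wORLD" | sed -r 's/([a-zA-Z])([a-zA-Z]*)/\U\1\L\2/g')

echo "$title"
echo "$fixed"
```

The first command prints "Lemon Tree Tank" and the second prints "Hello World".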

Grouping

File: "data.txt"

BEGIN The dog is chasing the cat.

A test is coming up.

#Comment1

Are we having fun in this class ? END

Some more lines.

#Comment2

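File: "1.sh"

sed -n '

/BEGIN/,/END/ {

s/#.*//

/^$/ d

p

}

' data.txt

Grouping with "{" and "}" allows us to apply several sed commands to the same range of lines. Inside the BEGIN to END section the script removes comments, deletes the lines that become empty and prints the rest. Because of "-n" nothing outside the section is printed.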
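Grouping works with line number ranges as well as with pattern ranges. The sketch below applies two substitutions to lines 2 and 3 only; the four-line input and the variable names are assumptions made for the example.

```shell
data=$(printf 'Line %s\n' 1 2 3 4)

# Both commands inside the braces run only on lines 2 to 3.
grouped=$(printf '%s\n' "$data" | sed '2,3 {
s/Line/LINE/
s/$/ !/
}')

printf '%s\n' "$grouped"
```

Lines 2 and 3 become "LINE 2 !" and "LINE 3 !", while lines 1 and 4 are printed unchanged.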
