Saturday, May 3, 2014

Unix Shell Text Substitution: sed(2)

1. To print or not to print

test:
 #! /bin/bash  
 New York  
 Hello Hello  
 Hello  

terminal:
The -n option stops sed from printing each line automatically:
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sed 's/Hello/world/' ./test  
 #! /bin/bash  
 New York  
 world Hello  
 world  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sed -n 's/Hello/world/' ./test  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$   

The p flag of the s command makes sed print a line explicitly. It is a flag written after the last delimiter of the s command, not a command-line option like -n. Printing happens in two places. First, the p flag prints the line as soon as its substitution succeeds. Second, once all commands have been applied to a line, sed outputs the result automatically; this automatic print is the one that -n suppresses. In the following example the last 2 lines were matched and replaced, so they are printed twice.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sed 's/Hello/world/p' ./test  
 #! /bin/bash  
 New York  
 world Hello  
 world Hello  
 world  
 world  

The p flag is normally used together with -n: -n stops all automatic printing, and p prints only the lines that were changed.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sed -n 's/Hello/world/p' ./test  
 world Hello  
 world  
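
The three printing behaviours above can be reproduced without a file by piping the sample lines into sed. This is only a minimal sketch; the variable names (input, default_out, quiet_out, changed_out) are made up for illustration.

```shell
#!/bin/sh
# Sample text, same as the test file above.
input='#! /bin/bash
New York
Hello Hello
Hello'

# Default: every line is printed once after the command has run.
default_out=$(printf '%s\n' "$input" | sed 's/Hello/world/')

# -n alone: automatic printing is off and nothing else asks for output.
quiet_out=$(printf '%s\n' "$input" | sed -n 's/Hello/world/')

# -n plus the p flag: only lines where the substitution succeeded.
changed_out=$(printf '%s\n' "$input" | sed -n 's/Hello/world/p')

printf '%s\n' "$changed_out"
```

Comparing default_out and changed_out shows that -n with p behaves like a filter for changed lines.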

2. Specify the range

Specify the single line:

test is the same as above
terminal:
The leading /New York/ is an address: sed applies the command only to lines matching this pattern.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sed '/New York/ s/New York/Chicago/' ./test  
 #! /bin/bash  
 Chicago  
 Hello Hello  
 Hello  

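Combined with a command file:
command.sed contains the command "s/New York/&, the US Financial Capital/" (& stands for the matched text). Note that the address cannot be passed as a separate argument. When -f is used, sed treats every non-option argument as an input file, so '/New York/' is read as a file name; that is why the run below reports that it can't read /New York/. The substitution still happens because the script in command.sed applies to every line. To restrict it to matching lines, put the address inside command.sed, for example "/New York/ s/New York/&, the US Financial Capital/".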

 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sed '/New York/' -f command.sed ./test  
 sed: can't read /New York/: No such file or directory  
 #! /bin/bash  
 New York, the US Financial Capital  
 Hello Hello  
 Hello  
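
A possible way to get the intended effect is to keep the address and the command together in the script file. The sketch below assumes a throwaway directory created with mktemp; the names workdir and result are illustrative only.

```shell
#!/bin/sh
# Work in a temporary directory so no real file is overwritten.
workdir=$(mktemp -d)
printf '%s\n' '#! /bin/bash' 'New York' 'Hello Hello' 'Hello' > "$workdir/test"

# Address and command live together in the script file.
printf '%s\n' '/New York/ s/New York/&, the US Financial Capital/' \
    > "$workdir/command.sed"

# Only -f and the input file are passed on the command line.
result=$(sed -f "$workdir/command.sed" "$workdir/test")
printf '%s\n' "$result"

rm -r "$workdir"
```

Run this way, sed prints no "can't read" error, because the only non-option argument is the input file.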
======================================
Specify the line range with number:

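test is the same as above
terminal:
1) The first command addresses lines 1 to 5, which covers every line of the file, so both of the last two lines are changed.
2) The second command addresses only line 5 (5,5 is the same as 5), so only the last line is changed. Since line 5 is the last line here, the file evidently has five lines; the listing above seems to have dropped a blank line after the first line.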
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sed '1,5 s/Hello/world/g' ./test  
 #! /bin/bash  
 New York  
 world world  
 world  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sed '5,5 s/Hello/world/g' ./test  
 #! /bin/bash  
 New York  
 Hello Hello  
 world  
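
Hard-coded numbers like 5 depend on the exact length of the file. As a small sketch of an alternative, the address $ selects the last input line whatever its number is; the variable names below are hypothetical.

```shell
#!/bin/sh
# Four-line sample; the numeric address 5 would match nothing here.
input='#! /bin/bash
New York
Hello Hello
Hello'

# $ addresses the last line of the input.
last_only=$(printf '%s\n' "$input" | sed '$ s/Hello/world/g')

# 3,$ covers line 3 through the end of the input.
from_third=$(printf '%s\n' "$input" | sed '3,$ s/Hello/world/g')

printf '%s\n' "$last_only"
printf '%s\n' "$from_third"
```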
======================================
Specify the line range with pattern:

test:
 #! /bin/bash  
 New York  
 Hello Hello  
 Hello  
 Chicago  

terminal:
This command selects the range that starts at the first line matching /New York/ and ends at the next line matching /Chicago/, and replaces every "Hello" in that range with "world".
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sed '/New York/,/Chicago/ s/Hello/world/g' ./test  
 #! /bin/bash  

 New York  
 world world  
 world  
 Chicago  
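
To see that lines outside the range are left alone, the sketch below uses a slightly different sample (an assumption, not the test file above) with a "Hello" before and after the range.

```shell
#!/bin/sh
# "Hello" appears before, inside, and after the New York..Chicago range.
input='Hello
New York
Hello Hello
Chicago
Hello'

# The substitution only runs on lines from /New York/ through /Chicago/.
ranged=$(printf '%s\n' "$input" | sed '/New York/,/Chicago/ s/Hello/world/g')

printf '%s\n' "$ranged"
```

Only the third line becomes "world world"; the first and last lines keep "Hello".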
======================================
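Specify the line range by negating the pattern:

test is the same as above.
terminal:
The intent is to pick the lines that do not match "Hello" and apply all commands to those lines only. As written, though, '/Hello/!' is a separate argument, and because the commands are given with -e, sed treats it as an input file name (hence the "can't read" error below). The two substitutions therefore run on every line; the output looks right only because the "Hello" lines contain neither "New York" nor "Chicago". For the negation to take effect, the address must be part of each command, for example -e '/Hello/! s/New York/&, US Financial Capital/'.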
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sed '/Hello/!' -e 's/New York/&, US Financial Capital/' -e 's/Chicago/&, Middle Financial Center/' ./test  
 sed: can't read /Hello/!: No such file or directory  
 #! /bin/bash  
 New York, US Financial Capital  
 Hello Hello  
 Hello  
 Chicago, Middle Financial Center  
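
A sketch of the negated address attached to each command follows. The extra sample line "Hello from New York" is an assumption added here so that the effect of the negation is visible.

```shell
#!/bin/sh
# The fourth line contains both "Hello" and "New York".
input='#! /bin/bash
New York
Hello Hello
Hello from New York
Chicago'

# /Hello/! restricts each command to lines that do NOT match Hello.
negated=$(printf '%s\n' "$input" | sed \
    -e '/Hello/! s/New York/&, US Financial Capital/' \
    -e '/Hello/! s/Chicago/&, Middle Financial Center/')

printf '%s\n' "$negated"
```

The line "Hello from New York" stays unchanged because it matches /Hello/ and is therefore excluded.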

3. Use different delimiter:
test:
 #! /bin/bash  
 New York  
 Boston  
 Washington D.C  

terminal:
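1) In the first command we want to select lines matching "New York" and replace the match with Chicago, using ":" instead of "/" to delimit the address pattern. sed does not accept this: a command that starts with ":" defines a label, so sed reads ":New" as a label and then fails on "York", as the error message shows.
2) In the second command we put "\" before the first ":", which tells sed that the colon is the delimiter of the address pattern. sed then matches "New York" and replaces it with Chicago. The s command uses ";" as its own delimiter, and its empty pattern means "reuse the last pattern", here New York.
3) The third command works the same way as the second; the only difference is the address delimiter, which here is the default "/" (so the backslash is optional).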
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sed ':New York: s;;Chicago;' ./test  
 sed: -e expression #1, char 6: unknown command: `Y'  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sed '\:New York: s;;Chicago;' ./test  
 #! /bin/bash  
 Chicago  
 Boston  
 Washington D.C  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sed '\/New York/ s;;Chicago;' ./test  
 #! /bin/bash  
 Chicago  
 Boston  
 Washington D.C  
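
Alternative delimiters are most useful when the pattern itself contains "/". The following sketch uses made-up path lines (not the test file above) to illustrate this.

```shell
#!/bin/sh
# Sample lines containing slashes.
input='/usr/local/bin
/usr/bin'

# \: opens an address delimited by colons; s|...|...| uses | as delimiter,
# so none of the slashes in the paths need escaping.
moved=$(printf '%s\n' "$input" | sed '\:^/usr/local: s|/usr/local|/opt|')

printf '%s\n' "$moved"
```

With the default "/" delimiter the same command would need every slash in the pattern written as "\/".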

4. Note the "pattern range"
sed matches patterns greedily: a repetition such as ".*" takes as much text as it can.
test:
 #! /bin/bash  
 
 Jersey City  
 abc  

terminal:
We aim to replace "Jersey" with "Union" and get "Union City". But sed matches "J.*y" against the entire string "Jersey City", because ".*" extends to the last "y" on the line, and we get "Union".
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sed -n '3 s/J.*y/Union/p' ./test  
 Union  
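
One common way to keep the match inside a single word is to replace "." with a class that excludes the space. This is a sketch of that idea, with illustrative variable names.

```shell
#!/bin/sh
line='Jersey City'

# .* runs to the last y on the line, so the whole string is replaced.
greedy=$(printf '%s\n' "$line" | sed 's/J.*y/Union/')

# [^ ]* cannot cross the space, so the match stops at the end of "Jersey".
limited=$(printf '%s\n' "$line" | sed 's/J[^ ]*y/Union/')

printf '%s\n' "$greedy" "$limited"
```

The first result is "Union" and the second is "Union City".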

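"b*" means zero or more b, so it can also match the empty string. For the 4th line "abc", sed tries the pattern like this:
1) At the very beginning, before 'a', "b*" matches the empty string
2) 'a' itself does not match and is copied unchanged
3) 'b' matches
4) 'c' does not match and is copied unchanged
5) At the end of the line, "b*" matches the empty string again
Without the g flag only the first match (the empty string at the beginning) is replaced, giving "1abc". With g every match is replaced, giving "1a1c1".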
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sed -n '4 s/b*/1/p' ./test  
 1abc  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sed -n '4 s/b*/1/pg' ./test  
 1a1c1  
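
If the intent is to replace only real runs of b, the pattern needs at least one b. The sketch below compares the two forms; it assumes GNU sed, as in the transcript above, and the variable names are hypothetical.

```shell
#!/bin/sh
# b* can match the empty string, so the first match is at the line start.
first=$(printf 'abc\n' | sed 's/b*/1/')

# With g, the empty matches at both ends are replaced as well.
all=$(printf 'abc\n' | sed 's/b*/1/g')

# bb* requires one b followed by zero or more, so empty matches disappear.
one_or_more=$(printf 'abc\n' | sed 's/bb*/1/g')

printf '%s\n' "$first" "$all" "$one_or_more"
```

Here bb* turns "abc" into "a1c", which is usually what was meant.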
