Saturday, June 21, 2014

awk: one-line examples(3)

1. Convert double-spaced lines to single-spaced lines
text2:
 1 Hello  
   
 2 World  
   
 3 Hello  
   
 4 Chicago!  
   
 5 Hello  
   

terminal:
 aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ awk -v RS="\n *\n" '{ print; }' text2  
 1 Hello  
 2 World  
 3 Hello  
 4 Chicago!  
 5 Hello  
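
A regular expression in RS is an extension that gawk and mawk support; POSIX leaves a multi-character RS unspecified. As a hedged alternative, the sketch below drops every line that has no fields, which gives the same result for this input. The function name single_space is only an illustrative choice, not something from the original post.

```shell
# single_space is a hypothetical helper name used only for this sketch.
# NF is the number of fields on the current line; it is 0 for empty and
# whitespace-only lines, so the pattern "NF > 0" keeps only real content.
single_space() {
  awk 'NF > 0'
}

# Usage: blank and whitespace-only lines are removed.
printf '1 Hello\n  \n2 World\n\n3 Hello\n' | single_space
```

Unlike the RS version, this removes every blank line, including runs of several blank lines in a row.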

2. Locate lines whose length exceeds the upper limit
text2:
 Hello  
 Hello Chicago!  
 Hello Los Angeles!  

terminal:
 aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ egrep "^.{9,}" text2  
 Hello Chicago!  
 Hello Los Angeles!  
 aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ awk 'length($0)>8' text2  
 Hello Chicago!  
 Hello Los Angeles!  
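
To make the limit adjustable and to see where the long lines are, the limit can be passed in with -v. This is a minimal sketch: the function name long_lines and the awk variable max are illustrative names, not part of the original post.

```shell
# long_lines is a hypothetical helper; $1 is the upper limit.
# "max + 0" forces a numeric comparison against the line length.
# For each line longer than the limit, print: line number, length, text.
long_lines() {
  awk -v max="$1" 'length($0) > max + 0 { printf "%d:%d:%s\n", NR, length($0), $0 }'
}

# Usage: report lines longer than 8 characters.
printf 'Hello\nHello Chicago!\nHello Los Angeles!\n' | long_lines 8
```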

3. Strip the markup tags
text2:
 <head>Hello<head />  
 <body>Hello Chicago!<body />  
 <end>Hello Los Angeles!<end />  

terminal:
We set the record separator (RS) to a regular expression that matches a markup tag, and the output record separator (ORS) to a single space. awk then splits the input at every tag and, for each record, runs the action that prints it followed by a space.
 aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ awk 'BEGIN { ORS=" "; RS="<[^<>]*>" } { print }' text2  
  Hello   
  Hello Chicago!   
  Hello Los Angeles!   
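
Another way to get a similar result, sketched here as an alternative rather than as the method above, is to keep the default record separator and delete the tags from each line with gsub. The function name strip_tags is an illustrative choice. This version does not add extra spaces, but it will not remove a tag that is split across two lines.

```shell
# strip_tags is a hypothetical helper name used only for this sketch.
# gsub replaces every match of the tag pattern in $0 with the empty string.
strip_tags() {
  awk '{ gsub(/<[^<>]*>/, ""); print }'
}

# Usage: the tags disappear and the line structure is preserved.
printf '<head>Hello<head />\n<body>Hello Chicago!<body />\n' | strip_tags
```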

4. Extract the title tags from XML
text2:
 <title>Unix Shell</title>    
 <body>Unix shell is very awesome!</body>  
   <title>Algorithm</title>  
 <body>Algorithm is very awesome</body>   
     <title>Machine Learning</title>   
 <body>Machine Learning is very awesome</body>  

terminal:
awk prints every record that matches the title tag pattern; since no action is given, the default action (print) applies. Piping the result to sed then removes the spaces in front of each title tag.
 aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ awk '/<title *>.*<\/title *>/' text2  
 <title>Unix Shell</title>    
   <title>Algorithm</title>  
     <title>Machine Learning</title>   

 aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ awk '/<title *>.*<\/title *>/' text2 | \  
 > sed -e 's/ *<title>/<title>/g'  
 <title>Unix Shell</title>    
 <title>Algorithm</title>  
 <title>Machine Learning</title>   
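
The sed step can also be folded into awk itself. The following is a minimal sketch under that assumption; the function name titles is illustrative, and it also trims trailing blanks, which the sed command above does not do.

```shell
# titles is a hypothetical helper name used only for this sketch.
# For each line that contains a title element, strip leading and
# trailing spaces/tabs with sub(), then print the line.
titles() {
  awk '/<title *>.*<\/title *>/ {
    sub(/^[ \t]+/, "")
    sub(/[ \t]+$/, "")
    print
  }'
}

# Usage: only the title line is printed, without surrounding blanks.
printf '   <title>Algorithm</title>  \n<body>Algorithm is very awesome</body>\n' | titles
```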
