Sunday, May 4, 2014

Unix Shell Fields Rearrangement: awk

1. Output a specific field

test_1 (note the empty first line):
With whitespace as the delimiter, each non-empty line has 2 fields; the empty line has none.
With ":" as the delimiter, each non-empty line has 3 fields. For the 2nd line they are: empty, "YY", and " Female" (with a leading space).

 :YY: Female  
 :XX: Male  
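
To check the field counts stated above, the sample file can be recreated and NF printed for each line. This is only a sketch: the temporary file stands in for test_1, and it assumes the file holds an empty line followed by the two records, with no leading or trailing spaces.

```shell
# Recreate the sample file: an empty first line, then the two records.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '\n:YY: Female\n:XX: Male\n' > "$sample"

# NF per line with the default (whitespace) delimiter, joined onto one line.
nf_space=$(awk '{print NF}' "$sample" | paste -s -d ' ' -)
# NF per line with ":" as the delimiter.
nf_colon=$(awk -F: '{print NF}' "$sample" | paste -s -d ' ' -)

echo "whitespace: $nf_space"   # whitespace: 0 2 2
echo "colon:      $nf_colon"   # colon:      0 3 3
```

The empty line reports 0 fields with either delimiter.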

terminal:
Using whitespace as the delimiter (the default), output the first field, then the second field. The empty line produces an empty output line.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ awk '{print $1}' ./test_1  

 :YY:  
 :XX:  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ awk '{print $2}' ./test_1  

 Female  
 Male  

2. NF: number of fields
Output the last field:
test_1 is the same as above

terminal:
NF holds the number of fields in the current line, so it is also the index of the last field, and $NF is the value of that field. 'print $NF' prints the last field.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ awk '{print $NF}' ./test_1  

 Female  
 Male  
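
The difference between NF (a count) and $NF (a field value) may be easier to see side by side. A small sketch, assuming the same test_1 content recreated in a temporary file:

```shell
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '\n:YY: Female\n:XX: Male\n' > "$sample"

# NF is the field count, $NF is the last field, $(NF-1) is the one before it.
# The NF>0 pattern skips the empty line, where NF-1 would be negative.
result=$(awk 'NF>0 {print NF, $NF, $(NF-1)}' "$sample")
echo "$result"
# 2 Female :YY:
# 2 Male :XX:
```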

Output only the non-empty lines (NF>0 skips the empty line):
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ awk 'NF>0 {print $1, $NF}' ./test_1  
 :YY: Female  
 :XX: Male  


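Output lines with a specific number of fields. There are 2 lines with 2 fields and one line with no fields (the empty line). For the empty line, $1 and $NF are both empty, so only the separating space is printed. No line has exactly one field, so the last command prints nothing.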
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ awk 'NF==2 {print $1, $NF}' ./test_1  
 :YY: Female  
 :XX: Male  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ awk 'NF==0 {print $1, $NF}' ./test_1  

 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ awk 'NF==1 {print $1, $NF}' ./test_1  

3. Use a different delimiter
test_1 is the same as above

terminal:
The -F option sets the awk variable FS (the field separator). Changing FS changes how awk splits each line into fields.

Use ":" as the delimiter to print specific fields:
Note: with ":" as the delimiter, the first field is empty on all three lines, so 'print $1' prints 3 empty lines. The last field keeps the leading space that follows the second ":".
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ awk -F: '{print $2}' ./test_1  

 YY  
 XX  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ awk -F: '{print $1}' ./test_1  



 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ awk -F: '{print $NF}' ./test_1  

  Female  
  Male  
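
Since -F only sets FS, the same split can be requested with -v. The sketch below also tries a multi-character separator, which awk treats as a regular expression; the pattern ": *" is an illustrative choice, not something from the commands above, and it removes the leading space from the last field.

```shell
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '\n:YY: Female\n:XX: Male\n' > "$sample"

# -F: and -v FS=: set the same variable, so both commands split identically.
with_F=$(awk -F: 'NF {print $2}' "$sample")
with_v=$(awk -v FS=: 'NF {print $2}' "$sample")

# A separator longer than one character is a regular expression.
# ": *" matches a colon plus any following spaces.
trimmed=$(awk -F': *' 'NF {print $NF}' "$sample")

echo "$with_F"    # YY, then XX
echo "$with_v"    # same as above
echo "$trimmed"   # Female, then Male (no leading space)
```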

4. Use a different output field separator
In awk, the variable OFS holds the output field separator, which print inserts between its comma-separated arguments. The -v option sets an awk variable from the command line.

test_1 is the same as above

terminal:
1) The first command uses ":" as the delimiter and "##" as the output separator. On the 1st line (the empty one) all three fields are empty, and three fields are joined by two "##" separators, so we see four "#". On the 2nd and 3rd lines the first field is empty and is followed by "##", which is why those lines start with "##".
2) The second command uses the default delimiter (whitespace) and "##" as the output field separator. The 1st line again has three empty fields, as in the first command. On the 2nd line the first field is ":YY:", followed by "##", then "Female" as the second field. The line has only 2 fields, so the 3rd field is empty, but "##" is still printed between the 2nd and 3rd fields. The 3rd line is similar.
3) The third command uses the same delimiter and output separator but prints only two fields. On the 1st line both fields are empty, so a single "##" separates them. The 2nd and 3rd lines have exactly 2 fields, so a single "##" appears between them.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ awk -F: -v 'OFS=##' '{print $1, $2, $3}' ./test_1  
 ####  
 ##YY## Female  
 ##XX## Male  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ awk -v 'OFS=##' '{print $1, $2, $3}' ./test_1  
 ####  
 :YY:##Female##  
 :XX:##Male##  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ awk -v 'OFS=##' '{print $1, $2}' ./test_1  
 ##  
 :YY:##Female  
 :XX:##Male  
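
One detail worth trying: OFS appears only where print has a comma, or when awk rebuilds the line after a field assignment. A short sketch, assuming the same test_1 content; the `$1 = $1` assignment is a common idiom and is not used in the commands above.

```shell
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '\n:YY: Female\n:XX: Male\n' > "$sample"

# A comma between arguments inserts OFS.
comma=$(awk -v 'OFS=##' 'NF {print $1, $2}' "$sample")
# Fields written side by side are concatenated; OFS is not used.
concat=$(awk -v 'OFS=##' 'NF {print $1 $2}' "$sample")
# Assigning to a field makes awk rebuild $0 with OFS between all fields.
rebuilt=$(awk -F: -v 'OFS=##' 'NF {$1 = $1; print}' "$sample")

echo "$comma"     # :YY:##Female      :XX:##Male
echo "$concat"    # :YY:Female        :XX:Male
echo "$rebuilt"   # ##YY## Female     ##XX## Male
```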

5. Print lines
test_1 is the same as above

terminal:
1) The first command uses ":" as the separator. On the 1st line all fields are empty, so the output is " is ". On the 2nd and 3rd lines the first field is empty, and the 2nd and 3rd fields are the name and the sex.
2) The second command produces the same output with printf instead of print. Note: printf does not append a newline automatically, so we add "\n" ourselves.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ awk -F: '{print $2, "is", $3}' ./test_1  
  is   
 YY is  Female  
 XX is  Male  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ awk -F: '{printf "%s is %s\n", $2, $3}' ./test_1  
  is   
 YY is  Female  
 XX is  Male  
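
To see what happens when the "\n" is left out, the sketch below drops it from the format string; the square brackets are only there to mark where each record's output starts and ends. It assumes the same test_1 content.

```shell
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '\n:YY: Female\n:XX: Male\n' > "$sample"

# No "\n" in the format: every record is written onto the same output line.
# $3 already begins with a space, so the format has none after "is".
joined=$(awk -F: 'NF {printf "[%s is%s]", $2, $3}' "$sample")
echo "$joined"   # [YY is Female][XX is Male]
```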

6. BEGIN
test_1 is the same as above

terminal:
We use a BEGIN block to initialize awk variables before any input is read.
FS controls the input field separator; OFS controls the output field separator.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ awk 'BEGIN { FS=":";OFS="#" }  
 > {print $2, $3}' ./test_1  
 #  
 YY# Female  
 XX# Male  
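
Setting FS and OFS in BEGIN should give the same result as passing them on the command line. A small comparison, assuming the same test_1 content; the "name#sex" header is a made-up illustration of BEGIN running before the first line is read.

```shell
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '\n:YY: Female\n:XX: Male\n' > "$sample"

# Variables set inside BEGIN ...
in_begin=$(awk 'BEGIN { FS=":"; OFS="#" } {print $2, $3}' "$sample")
# ... versus the same variables set with -F and -v.
on_cmdline=$(awk -F: -v 'OFS=#' '{print $2, $3}' "$sample")

# BEGIN runs once before any input, so it can also print a header line.
with_header=$(awk 'BEGIN { FS=":"; OFS="#"; print "name#sex" }
                   NF { print $2, $3 }' "$sample")

[ "$in_begin" = "$on_cmdline" ] && echo "BEGIN and -F/-v agree"
echo "$with_header"
```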
