Monday, May 5, 2014

Unix Shell Text Processing: uniq, fmt, wc, head, tail, pr

1. output unique records: uniq
test_1:
 Hello  
 Hello  
 Hello  
 world  

terminal:
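By default, uniq collapses adjacent duplicate lines into one. Duplicates that are not next to each other are kept, so unsorted input is usually piped through sort first.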
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ uniq ./test_1  
 Hello  
 world  

-d means output only the lines that are repeated, one copy of each:
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ uniq -d ./test_1  
 Hello  

-c means prefix each output line with the number of times it occurred:
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ uniq -c ./test_1  
    3 Hello  
    1 world  

-u means output only the lines that are not repeated:
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ uniq -u ./test_1  
 world  
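
Because uniq only compares neighbouring lines, a duplicate that is separated from its twin is not removed. The sketch below illustrates this with a made-up three-line input (the variable names input, plain and sorted are just for this example) and shows the usual workaround of sorting first:

```shell
# Sample input with a duplicate that is NOT adjacent (hypothetical data).
input='Hello
world
Hello'

# Plain uniq compares only neighbouring lines, so all three lines survive.
plain=$(printf '%s\n' "$input" | uniq)

# Sorting first puts equal lines next to each other, so uniq can collapse them.
sorted=$(printf '%s\n' "$input" | sort | uniq)

printf 'uniq alone:\n%s\n' "$plain"
printf 'sort | uniq:\n%s\n' "$sorted"
```

The same idea applies to -c, -d and -u: on unsorted input their counts refer to runs of adjacent lines, not to the whole file.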

2. reformat paragraphs: fmt
test_1:
 Hello world Hello world  
 Hello  
 world  

terminal:
-w specifies the maximum line width. fmt reads the whole text of the given file and re-flows it so that each line has at most 12 characters. Here the first line is split into two lines, and the last two lines are joined into one.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ fmt -w 12 ./test_1  
 Hello world  
 Hello world  
 Hello world  

-s means "split only" (this is the GNU fmt behaviour, as used here): fmt breaks long lines but never joins short lines, even when they would fit together within the width.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ fmt -s -w 12 ./test_1  
 Hello world  
 Hello world  
 Hello  
 world  
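
fmt also reads standard input, so the width limit can be checked without a file. This is a minimal sketch, assuming the same three lines as test_1; the variable names text, reflowed and longest are made up for the example:

```shell
# Same three lines as test_1, fed through a pipe instead of a file.
text='Hello world Hello world
Hello
world'

# Re-flow the text to at most 12 columns.
reflowed=$(printf '%s\n' "$text" | fmt -w 12)
printf '%s\n' "$reflowed"

# Length of the longest output line; it should not exceed the requested width.
longest=$(printf '%s\n' "$reflowed" | awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }')
printf 'longest line: %s characters\n' "$longest"
```

fmt only moves line breaks around, so the words themselves and their order stay the same.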

3. count text: wc
wc by default outputs three counts for the input text: 1) number of lines, 2) number of words, 3) number of bytes.
-c means: output the number of bytes (use -m to count characters; for plain ASCII text like this the two are the same)
-l means: output the number of lines
-w means: output the number of words
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ echo Hello world | wc  
    1    2   12  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ echo Hello world | wc -c  
 12  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ echo Hello world | wc -l  
 1  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ echo Hello world | wc -w  
 2  
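
"Hello world" has only 11 letters and spaces, yet wc -c reports 12 above, because echo appends a newline and wc counts it. A small sketch to make that visible, assuming printf is used to produce the text without a trailing newline (the variable names are made up for the example; tr strips the padding some wc versions add):

```shell
# echo appends a newline; printf with this argument does not.
with_newline=$(echo Hello world | wc -c | tr -d ' ')
without_newline=$(printf 'Hello world' | wc -c | tr -d ' ')

# wc -l counts newline characters, so text without one counts as 0 lines.
lines_without_newline=$(printf 'Hello world' | wc -l | tr -d ' ')

echo "echo:   $with_newline bytes"
echo "printf: $without_newline bytes, $lines_without_newline lines"
```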

4. Output the first n lines of a text file: head
test_1:
 Hello world Hello world  
 Hello  
 world  


1) The first command outputs the first line of ./test_1.
2) The second command outputs the first 2 lines of ./test_1.
3) The third command asks for the first 5 lines, but the file has only 3 lines, so it outputs just those 3.
Note: if the -n option is not given, head uses "-n 10" by default.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ head -n 1 ./test_1  
 Hello world Hello world  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ head -n 2 ./test_1  
 Hello world Hello world  
 Hello  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ head -n 5 ./test_1  
 Hello world Hello world  
 Hello  
 world  
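
test_1 is too short to show the default of 10 lines, so here is a small sketch that uses seq to generate 15 numbered lines as a stand-in for a longer file (seq and the variable names are choices made for this example only):

```shell
# seq 1 15 prints the numbers 1..15, one per line.
# With no -n option, head stops after the first 10 lines.
default_head=$(seq 1 15 | head)

# With -n 3, only the first 3 lines are printed.
first_three=$(seq 1 15 | head -n 3)

printf 'default head:\n%s\n' "$default_head"
printf 'head -n 3:\n%s\n' "$first_three"
```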

5. Output the last n lines of a text file: tail
test_1 is the same as above.

terminal:
-f means "follow". It tells tail to keep monitoring ./test_1 after printing its last lines: whenever something else appends text to the end of ./test_1, the new text shows up here. tail -f keeps running until it is interrupted (for example with Ctrl-C).
Monitor the appended text:
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ tail -f ./test_1  
 Hello world Hello world  
 Hello  
 world  

Output the last n lines of text:
-n 20 means tail should output the last 20 lines of text, but ./test_1 has only 3 lines, so it outputs just those 3.
Without the -n option, tail defaults to 10 lines, like head.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ tail -n 1 ./test_1  
 world  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ tail -n 2 ./test_1  
 Hello  
 world  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ tail -n 20 ./test_1  
 Hello world Hello world  
 Hello  
 world  
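
head and tail can be chained to pull lines out of the middle of a text. A minimal sketch, again assuming seq as a stand-in for a longer file (the variable names middle and default_tail are made up for the example):

```shell
# Lines 4 and 5 of a 15-line input: keep the first 5 lines, then the last 2 of those.
middle=$(seq 1 15 | head -n 5 | tail -n 2)

# With no -n option, tail prints the last 10 lines, here 6 through 15.
default_tail=$(seq 1 15 | tail)

printf 'lines 4-5:\n%s\n' "$middle"
printf 'default tail:\n%s\n' "$default_tail"
```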
