Friday, May 2, 2014

Unix Shell Basic Regular Expression(1)

test:
 #! /bin/bash  
 echo $1*  
 echo $2*\  
 echo ${10}  
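
To follow along, the sample file can be recreated with a quoted here-document, which keeps "$", "*" and the trailing "\" exactly as typed. This is only a sketch: the scratch directory created by mktemp is an assumption, not the path used in the sessions below.

```shell
# Create the sample file in a scratch directory (hypothetical location).
workdir=$(mktemp -d)
cat > "$workdir/test" <<'EOF'
#! /bin/bash
echo $1*
echo $2*\
echo ${10}
EOF

# An empty pattern matches every line, so -c counts the lines in the file.
line_count=$(grep -c '' "$workdir/test")
echo "lines: $line_count"
```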


1. Match one ordinary character:
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ grep e ./test  
 echo $1*  
 echo $2*\  
 echo ${10}  

2. Match one special character:
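"#" is special to the shell: it starts a comment, so the shell drops everything after it and grep never receives a pattern. We have to use "\" to escape the character: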
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ grep # ./test  
 Usage: grep [OPTION]... PATTERN [FILE]...  
 Try 'grep --help' for more information.  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ grep \# ./test  
 #! /bin/bash  

If the character is not special, the shell simply drops the "\". The following examples show that "/" and "\/" give the same result.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ grep / ./test  
 #! /bin/bash  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ grep \/ ./test  
 #! /bin/bash  
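
Quoting the pattern is another way to protect it from the shell. A minimal sketch, assuming the sample lines are fed in through a helper function named sample (a hypothetical name, used here instead of the ./test file):

```shell
# Hypothetical helper that prints the four lines of the sample file.
sample() { printf '%s\n' '#! /bin/bash' 'echo $1*' 'echo $2*\' 'echo ${10}'; }

# Single quotes stop the shell from treating # as the start of a comment.
quoted=$(sample | grep '#')
# A backslash does the same job; the shell strips it before grep runs.
escaped=$(sample | grep \#)

echo "$quoted"
echo "$escaped"
```

Both commands hand grep the same one-character pattern, so they select the same line.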

3. Match multiple characters:
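e*o means "zero or more e, followed by o". Any line that contains such a string is picked up. In the following example only the "o" is highlighted: in "echo" the "e" is not directly followed by "o", so the match is "zero e + one o". Note that an unquoted e*o is also a shell glob. If a matching file name exists in the current directory, the shell expands it before grep sees it, so quoting the pattern is safer.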
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ grep e*o ./test  
 echo $1*  
 echo $2*\  
 echo ${10}  
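
To see exactly what e*o matched without relying on colors, grep's -o option (supported by GNU and BSD grep, not required by POSIX) prints only the matched text. The second half sketches the glob problem in a scratch directory; the sample helper and the file named "echo" are assumptions made for the illustration.

```shell
# Hypothetical helper that prints the four lines of the sample file.
sample() { printf '%s\n' '#! /bin/bash' 'echo $1*' 'echo $2*\' 'echo ${10}'; }

# -o prints each match on its own line: one "o" per echo line.
matches=$(sample | grep -o 'e*o')
echo "$matches"

# Unquoted, e*o is a glob first: with a file named "echo" present,
# the shell replaces the word e*o with "echo" before the command runs.
scratch=$(mktemp -d)
touch "$scratch/echo"
expanded=$(cd "$scratch" && echo e*o)
echo "$expanded"
```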

As long as the line contains the string "ech", it is picked up.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ grep ech ./test  
 echo $1*  
 echo $2*\  
 echo ${10}  

We are picking up "e + any one character + c". The dot stands for exactly one character, never zero, so the "ec" in "echo" does not match and nothing is printed.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ grep e.c ./test  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$  
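
A small sketch with made-up input lines (not from the sample file) shows which strings e.c does accept:

```shell
# "exc" has exactly one character between e and c; "ec" and "ech" do not.
dot_match=$(printf '%s\n' ec exc ech | grep 'e.c')
echo "$dot_match"
```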

We are picking up "e + any two characters + o", so "echo" gets picked up.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ grep e..o ./test  
 echo $1*  
 echo $2*\  
 echo ${10}  

We are picking up "e + any characters + o". The dot means exactly one character, and the following * means "zero or more of the preceding item". So ".*" is any run of characters, including none, and the result is "e + anything + o".
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ grep e.*o ./test  
 echo $1*  
 echo $2*\  
 echo ${10}  
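
Two properties of ".*" are easy to check with made-up input: it takes the longest match it can, and it may match nothing at all. The sketch below again uses -o, which is a GNU/BSD extension rather than a POSIX requirement.

```shell
# The match starts at the first e and stretches to the last o on the line.
longest=$(printf '%s\n' 'echo foo' | grep -o 'e.*o')
echo "$longest"

# ".*" can be empty, so "eo" is a match as well; -c counts matching lines.
zero_between=$(printf '%s\n' eo | grep -c 'e.*o')
echo "$zero_between"
```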

4. Match one set of characters:

As long as a line in ./test contains "a" or "b" (any one of the characters listed inside the brackets), it is picked up.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ grep [ab] ./test  
 #! /bin/bash  

As long as a line in ./test contains an alphabetic character, it is picked up.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ grep [[:alpha:]] ./test  
 #! /bin/bash  
 echo $1*  
 echo $2*\  
 echo ${10}  

"^" at the beginning of a bracket complements the set. The following example specifies a set containing the characters "#!/", alphabetic characters, and space characters, which covers everything on the first line. With "^" in front, grep picks up every line that contains at least one character outside this set. The first line has no such character, so only the remaining lines are picked.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ grep [^#\!/[:alpha:][:space:]] ./test  
 echo $1*  
 echo $2*\  
 echo ${10}  
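
Printing only the matched characters makes it clearer why those three lines qualify. This is a sketch using the GNU/BSD -o option and a hypothetical sample helper in place of ./test; the pattern is single-quoted so the shell leaves "!" and the brackets alone.

```shell
# Hypothetical helper that prints the four lines of the sample file.
sample() { printf '%s\n' '#! /bin/bash' 'echo $1*' 'echo $2*\' 'echo ${10}'; }

# Each character outside the set is one match; tr joins them on one line.
outside=$(sample | grep -o '[^#!/[:alpha:][:space:]]' | tr -d '\n')
echo "$outside"

# A line made only of letters and spaces has nothing outside the set.
# "|| true" hides grep's exit status 1, which just means "no match".
letters_only=$(printf '%s\n' 'abc def' | grep -c '[^[:alpha:][:space:]]' || true)
echo "$letters_only"
```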

Note: special characters are no longer special inside brackets:
".*" means any number of any characters, so it matches every line.
[.*] matches any line containing a literal "." or "*", so only 2 lines are returned.
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ grep ".*" ./test  
 #! /bin/bash  
 echo $1*  
 echo $2*\  
 echo ${10}  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ grep [.*] ./test  
 echo $1*  
 echo $2*\  
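
Outside brackets, a backslash inside the quoted pattern has the same effect of making "*" literal. A short sketch comparing the two spellings, with the sample helper standing in for ./test (an assumption for the illustration):

```shell
# Hypothetical helper that prints the four lines of the sample file.
sample() { printf '%s\n' '#! /bin/bash' 'echo $1*' 'echo $2*\' 'echo ${10}'; }

# Inside brackets "." and "*" are literal; -o shows what was matched.
literal=$(sample | grep -o '[.*]')
echo "$literal"

# The regex escape \* also means a literal asterisk; -c counts the lines.
escaped_count=$(sample | grep -c '\*')
echo "$escaped_count"
```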


5. Match Range of Characters:
We can specify a range. [0-9] means any single character from 0 to 9. Be cautious, though: different locales order characters differently, so a range such as [a-z] is not portable.
We also have to make sure the range is valid, which means the starting character must not sort after the ending character.

 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ grep [0-9] ./test  
 echo $1*  
 echo $2*\  
 echo ${10}  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ grep [a-9] ./test  
 grep: Invalid range end  
 aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ grep [9-5] ./test  
 grep: Invalid range end  
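
Two common ways to sidestep the locale problem are to pin the locale to C for the one command, or to use a character class such as [[:digit:]]. The sketch below assumes a hypothetical sample helper in place of ./test, and only records the exit status of the reversed range rather than relying on a particular error message.

```shell
# Hypothetical helper that prints the four lines of the sample file.
sample() { printf '%s\n' '#! /bin/bash' 'echo $1*' 'echo $2*\' 'echo ${10}'; }

# In the C locale a range follows plain byte order.
digits_c=$(sample | LC_ALL=C grep -c '[0-9]')
# A character class does not depend on range ordering at all.
digits_class=$(sample | grep -c '[[:digit:]]')
echo "$digits_c $digits_class"

# A reversed range does not produce a match; GNU grep, as in the session
# above, rejects it with an error. Capture the status instead of aborting.
status=0
printf '%s\n' 7 | grep '[9-5]' >/dev/null 2>&1 || status=$?
echo "exit status: $status"
```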
