dict: a dictionary file of American English words, one word per line
A
A's
AA's
AB's
ABM's
AC's
ACTH's
AI's
AIDS's
AM's
AOL
AOL's
ASCII's
ASL's
ATM's
ATP's
AWOL's
AZ's
......
script:
-h: suppress the file-name prefix on each output line (it only matters when several files are searched)
-i: ignore letter case when matching
#! /bin/bash
pattern="$1"
grep -E -h -i "$pattern" ./dict | sort -u -f
terminal:
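1) First command: list every word that contains "hello", ignoring letter case. The pattern is not anchored, so "Othello" matches too. Quoting the pattern ('Hello.*') keeps the shell from expanding it as a glob.
2) Second command: list every word that contains "world", ignoring letter case.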
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ ./script Hello.*
hello
hellos
hello's
Othello
Othello's
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ ./script world.*
otherworldly
underworld
underworlds
underworld's
unworldly
world
worldlier
worldliest
worldliness
worldliness's
worldly
worlds
world's
worldwide
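The results above include "Othello" and "underworld" because the pattern may match anywhere in a word. Below is a minimal sketch of how anchoring with ^ restricts the match to the start of a word. The function name lookup and the short inline word list are assumptions made for illustration; the function reads standard input instead of ./dict.

```shell
#!/bin/bash
# lookup PATTERN: case-insensitive extended-regex match over a word list on stdin.
# Hypothetical helper; it mirrors the script above but does not read ./dict.
lookup() {
  grep -E -i "$1" | sort -u -f
}

# A tiny stand-in word list (assumption, not the real dict file).
words='hello
hellos
Othello
world
underworld'

printf '%s\n' "$words" | lookup 'hello'    # matches anywhere: includes Othello
printf '%s\n' "$words" | lookup '^hello'   # anchored: only words that begin with "hello"
```

The same idea applies to the end of a word with $, for example 'world$'.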
2. Word lists
Given a text block, we need to list how many times each word occurs.
text:
Hello world!
Hello Hello my great world!
script:
1) First command in the pipeline: -c complements the set and -s squeezes repeated output characters, so each run of characters outside the given set (letters and "!") becomes a single newline. This leaves one word per line.
2) Second command in the pipeline: translate all upper-case letters into lower case.
3) sort: with no options it compares whole lines, which brings identical words next to each other for uniq.
4) uniq -c: collapse each run of identical lines into one line, prefixed with the number of occurrences.
5) sort -k1,1nr -k2,2: sort on the first field (the count) numerically in descending order; if the counts are equal, sort on the second field (the word).
6) ${1:-3} expands to the first positional parameter, or to the default "3" if it is unset or empty. In sed -n "1,3 s/ / /p", -n suppresses the automatic printing of the pattern space, and the p flag prints only the lines in the range 1,3 where the substitution succeeded. So we output just the first 3 lines.
#! /bin/bash
tr -cs 'A-Za-z!' '\n' < ./text \
| tr 'A-Z' 'a-z' \
| sort \
| uniq -c \
| sort -k1,1nr -k2,2 \
| sed -n "1,${1:-3} s/ / /p"
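As a rough sketch, the same steps can be wrapped in a function that reads standard input, which makes the pipeline easy to try on any text. The name word_freq is hypothetical. The last stage is a swapped-in variant, not the script's own sed command: it strips the padding that uniq -c puts in front of each count and quits after N lines.

```shell
#!/bin/bash
# word_freq N: read text on stdin, print the N most frequent words (default 3).
# Hypothetical helper; the first five stages follow the script above.
word_freq() {
  local n="${1:-3}"              # fall back to 3 when no argument is given
  tr -cs 'A-Za-z!' '\n' |        # one word per line
    tr 'A-Z' 'a-z' |             # fold to lower case
    sort |                       # bring identical words together
    uniq -c |                    # count each run of identical words
    sort -k1,1nr -k2,2 |         # highest count first, ties ordered by word
    sed -e 's/^ *//' -e "${n}q"  # trim leading padding, stop after n lines
}

printf 'Hello world!\nHello Hello my great world!\n' | word_freq 2
```

Reading from standard input is a design choice for the sketch only; the script itself reads ./text.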
terminal:
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ ./script
3 hello
2 world!
1 great
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ ./script 2
3 hello
2 world!
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ ./script 1
3 hello
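The three runs above differ only in how ${1:-3} expands. Here is a small sketch of that expansion in isolation; the function name limit is made up for the example.

```shell
#!/bin/bash
# limit: hypothetical helper that echoes the line count the script would use.
limit() {
  echo "${1:-3}"   # first argument, or 3 when it is unset or empty
}

limit       # prints 3
limit 2     # prints 2
limit ""    # prints 3, because :- also treats an empty value as missing
```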