Given two sorted files, the comm command reports the lines common to both files, the lines unique to file1, and the lines unique to file2.
Note: both files must be sorted first.
terminal:
1 - 2) Print the contents of t1 and t2; each contains two lines of text.
3) Use the comm command to get the common and unique lines of the two files. The --check-order option makes comm verify that its input is sorted as it reads each line. At this step, it complains that file 1 is not sorted.
4 - 5) Sort t1 and t2, writing the results to sorted_t1 and sorted_t2 respectively.
6 - 7) Print the contents of sorted_t1 and sorted_t2; both files are now sorted.
8) Use the comm command to get the common and unique lines.
First column: "Hello Los Angeles!" exists only in the first file, sorted_t1.
Second column: "Hello New York!" exists only in the second file, sorted_t2.
Third column: "Hello world!" exists in both files.
9) The -1 option suppresses the first column (lines unique to file 1).
10) The -2 option suppresses the second column (lines unique to file 2).
11) The -3 option suppresses the third column (lines common to both files).
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ cat t1
Hello world!
Hello Los Angeles!
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ cat t2
Hello world!
Hello New York!
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ comm --check-order t1 t2
		Hello world!
comm: file 1 is not in sorted order
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ sort <t1 >sorted_t1
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ sort <t2 >sorted_t2
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ cat sorted_t1
Hello Los Angeles!
Hello world!
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ cat sorted_t2
Hello New York!
Hello world!
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ comm --check-order sorted_t1 sorted_t2
Hello Los Angeles!
	Hello New York!
		Hello world!
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ comm --check-order -1 sorted_t1 sorted_t2
Hello New York!
	Hello world!
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ comm --check-order -2 sorted_t1 sorted_t2
Hello Los Angeles!
	Hello world!
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ comm --check-order -3 sorted_t1 sorted_t2
Hello Los Angeles!
	Hello New York!
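The single-column options can also be combined so that only one column is left, which is handy in scripts. The following is a minimal sketch, not part of the session above. It assumes a POSIX shell, recreates t1 and t2 in a temporary directory, pins LC_ALL=C so that sort and comm agree on ordering, and feeds one sorted file to comm on standard input via "-". The variable names (workdir, common, only_t1, only_t2) are made up for illustration.

```shell
#!/bin/sh
# Sketch: combine comm's column options to keep exactly one column.
set -eu
LC_ALL=C            # assumption: byte-order collation for both sort and comm
export LC_ALL

workdir=$(mktemp -d)
printf 'Hello world!\nHello Los Angeles!\n' > "$workdir/t1"
printf 'Hello world!\nHello New York!\n'    > "$workdir/t2"

# comm needs sorted input; keep one sorted copy and pipe the other file in.
sort "$workdir/t2" > "$workdir/sorted_t2"

# -12 suppresses columns 1 and 2: only lines common to both files remain.
common=$(sort "$workdir/t1" | comm -12 - "$workdir/sorted_t2")
# -23 suppresses columns 2 and 3: only lines unique to t1 remain.
only_t1=$(sort "$workdir/t1" | comm -23 - "$workdir/sorted_t2")
# -13 suppresses columns 1 and 3: only lines unique to t2 remain.
only_t2=$(sort "$workdir/t1" | comm -13 - "$workdir/sorted_t2")

printf 'common:  %s\n' "$common"
printf 'only t1: %s\n' "$only_t1"
printf 'only t2: %s\n' "$only_t2"

rm -rf "$workdir"
```

Because only one column survives in each call, the output carries no leading tabs and can be stored or piped onward directly.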
2. A simple homemade spell checker using the "comm" command
terminal:
1) Print the contents of owndict, a simple homemade "dictionary". The spell checker works by comparing owndict with t1.
2) Print the contents of t1, which contains a misspelled word: "worlds". (t1 happens to be in sorted order already, which comm requires.)
3) Sort owndict and write the result to sorted_dict.
4) Print the contents of sorted_dict.
5) Use the comm command to list the lines that exist only in t1 and not in sorted_dict. The -13 option suppresses the first and third columns, which hold the lines found only in sorted_dict and the lines found in both files. Any line that exists in t1 but not in our dictionary sorted_dict is printed and treated as a misspelled word.
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ cat owndict
Hello
world
New
York
Los
Angeles
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ cat t1
Hello
worlds
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ sort <owndict >sorted_dict
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ cat sorted_dict
Angeles
Hello
Los
New
world
York
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ comm -13 sorted_dict t1
worlds
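The session above only works when the text already has one word per line and is in sorted order. As a rough sketch of how the same idea could be stretched to ordinary sentences, the script below splits the text into words with tr, sorts them, and compares them against the dictionary. It assumes a POSIX shell and LC_ALL=C; the file name "text" and the variable names are hypothetical, and only plain ASCII letters are treated as word characters.

```shell
#!/bin/sh
# Sketch: a comm-based spell check for running text (not just one word per line).
set -eu
LC_ALL=C            # assumption: sort and comm must use the same collation
export LC_ALL

workdir=$(mktemp -d)
printf 'Hello\nworld\nNew\nYork\nLos\nAngeles\n' > "$workdir/owndict"
printf 'Hello worlds!\nHello New York!\n'        > "$workdir/text"

# Sort the dictionary once; -u also drops duplicate entries.
sort -u "$workdir/owndict" > "$workdir/sorted_dict"

# tr -cs turns every run of non-letters into a single newline (one word per line),
# sort -u orders the words, and comm -13 keeps the words missing from the dictionary.
misspelled=$(tr -cs 'A-Za-z' '\n' < "$workdir/text" \
    | sort -u \
    | comm -13 "$workdir/sorted_dict" -)

printf '%s\n' "$misspelled"    # prints: worlds

rm -rf "$workdir"
```

The comparison is case-sensitive, so "hello" would be reported even though "Hello" is in the dictionary; folding case with tr before sorting both sides would be one way around that.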
3. Spell command
The spell command can be used to find misspelled words in a text file.
terminal:
1) Print the contents of t1. The first line's "color" is American English, and the second line's "colour" is British English.
2) Print the contents of owndict, which is our own "dictionary".
3) Use the spell command to find misspelled words. In this case, "colour" in the second line is flagged, because spell uses American English as its standard by default.
4) The -b option tells the spell command to use British English as the standard, so "colour" is no longer flagged.
5) The -d option lets the user specify their own dictionary file. Since owndict does not contain "colour", "colour" in t1 is flagged as misspelled.
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ cat t1
I love the blue color!
I love the red colour!
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ cat owndict
I
love
the
blue
red
color
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ spell t1
colour
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ spell -b t1
aubinxia@aubinxia-fastdev:~/Desktop/xxdev$ spell -d owndict t1
colour
Note: If the locale changes, the dictionary must be re-sorted under the new locale's collation rules; otherwise the comparison results may be wrong.
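To make the locale point concrete, here is a small sketch that checks whether a file counts as sorted under a given locale by using sort -c. It assumes a POSIX shell; the file name transcript_order and the variable names are invented for the example. The order "New, world, York" shown in the sorted_dict listing earlier is what a case-insensitive locale such as en_US typically produces, while in the C locale uppercase letters sort before lowercase ones.

```shell
#!/bin/sh
# Sketch: the same word list is "sorted" in one locale and "unsorted" in another.
set -eu

workdir=$(mktemp -d)
printf 'Hello\nworld\nNew\nYork\nLos\nAngeles\n' > "$workdir/owndict"

# Sort under an explicitly chosen locale instead of whatever the session uses.
LC_ALL=C sort "$workdir/owndict" > "$workdir/sorted_dict"
last_word=$(tail -n 1 "$workdir/sorted_dict")    # "world": lowercase sorts last in C

# sort -c only checks the order; it exits non-zero if the file is not sorted.
if LC_ALL=C sort -c "$workdir/sorted_dict" 2>/dev/null; then
    c_sorted_state=sorted
else
    c_sorted_state=unsorted
fi

# The order from the earlier listing (world before York) is not sorted in C.
printf 'Angeles\nHello\nLos\nNew\nworld\nYork\n' > "$workdir/transcript_order"
if LC_ALL=C sort -c "$workdir/transcript_order" 2>/dev/null; then
    transcript_state=sorted
else
    transcript_state=unsorted
fi

printf 'C-sorted file: %s, last word: %s\n' "$c_sorted_state" "$last_word"
printf 'Listing order checked under C: %s\n' "$transcript_state"

rm -rf "$workdir"
```

Running sort and comm with the same LC_ALL value, as in the first check, is one simple way to avoid the mismatch.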