test_1 (note the leading spaces before ":YY:"):
   :YY: Female
:XX: Male
terminal:
-k1 means the sort key starts at the first field; because no end field is given, the key runs to the end of the line. By default, sort uses whitespace as the field separator. The leading spaces in front of ":YY:" do not affect the comparison here (the -b option ignores leading blanks explicitly, and the collation rules of many non-C locales ignore them as well), so the dictionary comparison of ":XX:" and ":YY:" puts the ":XX:" line first.
aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sort -k1 ./test_1
:XX: Male
   :YY: Female
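-t: means ":" is used as the field separator, and we are still sorting from the first field. With ":" as the separator, the first field is empty for a line that begins with ":" and consists of the three leading spaces for the indented ":YY:" line. Because -k1 extends the key to the end of the line, the rest of each line is compared as well, which gives the following result: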
aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sort -k1 -t: ./test_1
:XX: Male
   :YY: Female
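Whether the leading spaces matter depends on the locale and on the -b option. The sketch below pins the locale to C (an assumption made so the result is reproducible; the transcript above most likely ran in a different locale) and compares the default behaviour with -b. The variable names are only for illustration.

```shell
# Two records; the first has three leading spaces (same data as test_1).
data='   :YY: Female
:XX: Male'

# C locale, no -b: the space byte sorts before ":", so the indented line is first.
plain=$(printf '%s\n' "$data" | LC_ALL=C sort -k1)

# C locale with -b: leading blanks are skipped when locating the key,
# so ":XX:" is compared with ":YY:" and the ":XX:" line is first.
ignore_blanks=$(printf '%s\n' "$data" | LC_ALL=C sort -b -k1)

printf '%s\n---\n%s\n' "$plain" "$ignore_blanks"
```

If the order on your machine differs from the C-locale result, checking the output of `locale` is a reasonable first step.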
2. Sort starting from one field
test_2:
10:YY:Lawyer
10:XX:Engineer
terminal:
We use ":" as the delimiter and sort starting from the first field. Note: -k1 means "sort on a key that starts at the 1st field and runs to the end of the record", not "sort on the 1st field only". That is why sort swaps the two records even though their first fields are equal: the 2nd field "XX" is less than "YY", so the "XX" line goes first.
aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sort -k1 -t: ./test_2
10:XX:Engineer
10:YY:Lawyer
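To see the difference between -k1 and -k1,1, here is a small sketch. The data is a hypothetical variation of test_2 in which the second and third fields disagree about the order, and LC_ALL=C is pinned so the comparison is plain byte order.

```shell
# Hypothetical variation of test_2: equal first fields, and the second and
# third fields disagree about which record should come first.
data='10:YY:Clerk
10:XX:Engineer'

# Key runs from field 1 to the end of the line, so "XX" vs "YY" decides.
to_end=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -k1)

# Key is field 1 only; the tie is then broken by the next key, field 3.
bounded=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -k1,1 -k3,3)

printf '%s\n---\n%s\n' "$to_end" "$bounded"
```

With -k1 the "XX" record comes first; with -k1,1 -k3,3 the "Clerk" record comes first, because field 2 is no longer part of any key.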
3. Reverse the order
test_2:
10:YY:Lawyer
15:XX:Engineer
terminal:
With -r, the sorting order is reversed, so the line starting with 15 goes first.
Without -r, the normal ascending order is used, so the line starting with 10 goes first.
aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sort -k1r ./test_2
15:XX:Engineer
10:YY:Lawyer
aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sort -k1 ./test_2
10:YY:Lawyer
15:XX:Engineer
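The r modifier can also be attached to a single key, so one field sorts ascending while another sorts descending. A minimal sketch, using test_2 plus one hypothetical record ("10:XX:Clerk") to create a tie on the first field:

```shell
# test_2 plus one hypothetical record ("10:XX:Clerk") to create a tie on field 1.
data='10:YY:Lawyer
15:XX:Engineer
10:XX:Clerk'

# Field 1 ascending; among equal first fields, field 2 descending.
result=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -k1,1 -k2,2r)

printf '%s\n' "$result"
```

Here the two "10" records stay ahead of "15", but within that group "YY" is listed before "XX".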
4. Sort by number
test_2:
10:YY:Lawyer
9:XX:Engineer
terminal:
First command: with ":" as the delimiter, we sort starting from the 1st field, which is a number. The default comparison is in dictionary order, character by character, so "1" sorts before "9" and the line starting with "10" goes first.
Second command: -n compares the key by its numeric value, so 9 < 10 and the order is reversed.
aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sort -t: -k1 ./test_2
10:YY:Lawyer
9:XX:Engineer
aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sort -t: -k1n ./test_2
9:XX:Engineer
10:YY:Lawyer
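The effect is easier to see with a third value. The sketch below adds one hypothetical record starting with "100" and prints only the first field of each sorted line; n and r are combined in the last command.

```shell
# test_2 plus one hypothetical record starting with "100".
data='10:YY:Lawyer
9:XX:Engineer
100:ZZ:Clerk'

# Helper for this example only: print field 1 of each line on a single row.
first_fields() { cut -d: -f1 | paste -s -d ' ' -; }

text=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -k1,1 | first_fields)
numeric=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -k1,1n | first_fields)
descending=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -k1,1nr | first_fields)

printf 'text: %s\nnumeric: %s\nnumeric, reversed: %s\n' \
  "$text" "$numeric" "$descending"
```

Dictionary order gives 10 100 9, numeric order gives 9 10 100, and numeric order with r gives 100 10 9.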
5. Sort By Range of Fields
test_2:
10 20 XX
10 10 YY
terminal:
First command: the sort key runs from field 1 through field 3.
Second command: the sort key runs from field 2 through field 3.
Third command: the sort key is field 3 only.
aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sort -k1,3 ./test_2
10 10 YY
10 20 XX
aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sort -k2,3 ./test_2
10 10 YY
10 20 XX
aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sort -k3,3 ./test_2
10 20 XX
10 10 YY
6. Sort by specific position
test_2:
XX20 Engineer
YY10 Lawyer
terminal:
First command: sort by the 1st field; "XX20" comes before "YY10".
Second command: -k1.3 means the sort key starts at the 3rd character of the 1st field, so the comparison starts with "10" versus "20" in this case.
aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sort -k1 ./test_2
XX20 Engineer
YY10 Lawyer
aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sort -k1.3 ./test_2
YY10 Lawyer
XX20 Engineer
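A key can also end at a character position. As a sketch, -k1.3,1.4n limits the key to characters 3 and 4 of the first field and compares them as a number, so nothing after the digits takes part in the comparison:

```shell
data='XX20 Engineer
YY10 Lawyer'

# Key = characters 3..4 of field 1 ("20" and "10"), compared numerically.
by_digits=$(printf '%s\n' "$data" | LC_ALL=C sort -k1.3,1.4n)

printf '%s\n' "$by_digits"
```

This keeps the result from section 6 (the "YY10" line first) while making the extent of the key explicit.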
7. Sort and output unique records
test_2:
10 XX Engineer
10 YY Lawyer
terminal:
First command: the key runs from field 1 to the end of the record, so the two records differ and neither is treated as a duplicate.
Second command: the key runs from field 1 to field 1, so only field 1 is compared. Both records have the same key, and -u outputs only one record for each distinct key.
aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sort -k1 -u ./test_2
10 XX Engineer
10 YY Lawyer
aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ sort -k1,1 -u ./test_2
10 XX Engineer
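A quick way to check how -u interacts with the key is to count the surviving lines. This is a small sketch; count_lines is a helper defined only for this example.

```shell
data='10 XX Engineer
10 YY Lawyer'

# Helper for this example only: count the lines read from stdin.
count_lines() { awk 'END { print NR }'; }

# No key: whole lines are compared, and the two lines differ.
whole_line=$(printf '%s\n' "$data" | LC_ALL=C sort -u | count_lines)

# Key limited to field 1: both lines have the key "10".
first_field=$(printf '%s\n' "$data" | LC_ALL=C sort -k1,1 -u | count_lines)

printf 'whole line: %s, field 1 only: %s\n' "$whole_line" "$first_field"
```

The first count is 2 and the second is 1, which matches the two transcripts above.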
8. Sort Text Block
display the text block:
aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ cat ./test
#sortkey: Bill
Bill Gates
President of Microsoft
=====
#sortkey: Jobs
Steve Jobs
President of CEO
=====
#sortkey: Barack
Barack Obama
President of United States
put each text block on one line:
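RS sets the input record separator, so each block is read as one record; gsub replaces every "\n" in the record with "--". (A multi-character RS works in gawk and mawk, but POSIX leaves its behavior unspecified.)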
aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ cat ./test |
> awk -v RS="=====\n" '{ gsub("\n","--"); print}'
#sortkey: Bill--Bill Gates--President of Microsoft--
#sortkey: Jobs--Steve Jobs--President of CEO--
#sortkey: Barack--Barack Obama--President of United States--
sort all text blocks:
We use sort to order the one-line blocks. -f folds lowercase letters to uppercase for the comparison, so case is ignored. -k2 means the key starts at the 2nd field, which is the name that follows "#sortkey:".
aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ cat ./test |
> awk -v RS="=====\n" '{ gsub("\n","--"); print}' |
> sort -f -k2
#sortkey: Barack--Barack Obama--President of United States--
#sortkey: Bill--Bill Gates--President of Microsoft--
#sortkey: Jobs--Steve Jobs--President of CEO--
convert the sorting result to the original format:
ORS sets the output record separator, so "=====" is printed after each record; gsub replaces every "--" with "\n".
aubinxia@aubinxia-VirtualBox:~/Desktop/xxdev$ cat ./test |
> awk -v RS="=====\n" '{ gsub("\n","--"); print}' |
> sort -f -k2 |
> awk -v ORS="=====\n" '{ gsub("--", "\n"); print}'
#sortkey: Barack
Barack Obama
President of United States
=====
#sortkey: Bill
Bill Gates
President of Microsoft
=====
#sortkey: Jobs
Steve Jobs
President of CEO
=====
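The same three steps can be wrapped in a function. The version below is a sketch, not the pipeline above: sort_blocks is a hypothetical name, and it collects the lines of each block one at a time instead of setting a multi-character RS, on the assumption that this is more portable across awk implementations. Like the original, it assumes "--" never appears in the text itself.

```shell
# Hypothetical helper: reads blocks separated by "=====" lines on stdin and
# writes them back sorted by the text after "#sortkey:".
# Step 1 joins each block into one line, step 2 sorts, step 3 splits again.
sort_blocks() {
  awk '
    $0 == "=====" { print block; block = ""; next }  # separator closes a block
    { block = block $0 "--" }                         # join lines with "--"
    END { if (block != "") print block }              # final block has no separator
  ' |
  LC_ALL=C sort -f -k2 |
  awk '{ gsub(/--/, "\n"); printf "%s=====\n", $0 }'
}

sorted=$(sort_blocks <<'EOF'
#sortkey: Bill
Bill Gates
President of Microsoft
=====
#sortkey: Jobs
Steve Jobs
President of CEO
=====
#sortkey: Barack
Barack Obama
President of United States
EOF
)

printf '%s\n' "$sorted"
```

For a file on disk, the call would be `sort_blocks < ./test`.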
9. Note:
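Unix sort is efficient: it does not use a simple quadratic algorithm such as bubble sort (GNU sort, for example, uses a merge sort).
Unix sort is not a stable sort by default, which means that for two records whose keys compare equal, the original order is not guaranteed. GNU sort provides the -s option to request a stable sort.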
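The sketch below illustrates the stability point. It assumes a sort that supports -s (GNU coreutils does; the option is not part of POSIX), and the two records are hypothetical, chosen so that their first fields are equal and their input order differs from their whole-line order.

```shell
# Hypothetical records with equal first fields, given in this input order.
data='10 YY Lawyer
10 XX Engineer'

# Default: equal keys fall back to a whole-line comparison, so "XX" moves up.
default_order=$(printf '%s\n' "$data" | LC_ALL=C sort -k1,1)

# -s (assumed available): equal keys keep their input order.
stable_order=$(printf '%s\n' "$data" | LC_ALL=C sort -s -k1,1)

printf '%s\n---\n%s\n' "$default_order" "$stable_order"
```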