sort can be used to sort the lines of a given file, or of whatever is piped into it. A simple use would be:
echo -e "c\na\nb" | sort
which produces:
a
b
c
Sweet! Now let's look at the uniq program:
echo -e "a\na\nb\nc\nb" | uniq
gives:
a
b
c
b
That removed the duplicate a, but not the duplicate b! uniq only removes consecutive duplicate lines. So what if you want to remove all duplicates? Easy:
echo -e "a\na\nb\nc\nb" | sort | uniq
gives:
a
b
c
Ta-da! That's easy! Sorting first puts identical lines next to each other, so uniq can collapse every one of them.
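As an aside, sort can do both jobs on its own: for plain whole-line comparisons like this one, its -u flag gives the same result as piping through uniq. A small sketch (the variable name result is just for illustration, and printf is used instead of echo -e because not every shell's echo understands -e):

```shell
# Same input as before; sort -u sorts and drops duplicate lines in one step.
result=$(printf 'a\na\nb\nc\nb\n' | sort -u)
echo "$result"
```

Keeping uniq as a separate step is still worthwhile when you want its other options, such as counting how often each line occurs.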
This duo is useful in a lot of situations. Just the other day I needed to find two XML elements that had the same value. I used 'sed' to pull out all the values in the given element, sort to put these values in lexicographically sorted order, and uniq to tell me which duplicates were found:
sed -n 's/.*<tag>
For the 'sed' part, you can put in whatever tag name you need to find duplicate values for.
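A minimal sketch of what such a pipeline can look like. The tag name <id>, the sample data, the variable names, and the one-element-per-line layout are all assumptions made for illustration, and uniq -d is just one way to list only the repeated lines:

```shell
# Hypothetical sample data: one <id> element per line.
xml='<item><id>42</id></item>
<item><id>7</id></item>
<item><id>42</id></item>'

# sed -n with the p flag prints only the lines where the substitution matched,
# keeping just the text between <id> and </id>.
# sort places equal values next to each other.
# uniq -d prints one copy of each value that occurs more than once.
dupes=$(printf '%s\n' "$xml" | sed -n 's/.*<id>\(.*\)<\/id>.*/\1/p' | sort | uniq -d)
echo "$dupes"
```

Real XML can spread an element over several lines or put several on one line, in which case a line-based sed expression like this will miss values and a proper XML tool is the safer choice.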