The uniq command in Linux filters or reports adjacent duplicate lines in a text file or input stream. It is commonly used to remove duplicates, count occurrences, or identify unique or repeated lines. For non-adjacent duplicates, pair uniq with sort to pre-process the input. Below are practical examples:
- Remove Adjacent Duplicate Lines
- Remove All Duplicates (After Sorting)
- Count Occurrences of Each Line
- Show Only Duplicated Lines
- Show Only Unique Lines
- Case-Insensitive Comparison
- Skip Fields Before Checking
- Skip Characters Before Checking
- Compare Only the First N Characters
- Combine with cut to Process Columns
- Check for Duplicates in Raw (Unsorted) Files
Remove Adjacent Duplicate Lines
uniq file.txt
Prints file.txt with each run of consecutive duplicate lines collapsed to a single line. The file itself is not modified, and duplicates that are not adjacent are kept.
Remove All Duplicates (After Sorting)
sort file.txt | uniq
Sorts the file first, then removes all duplicates.
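The difference between the two commands above is easiest to see on a small input. The following is a minimal sketch; the sample data and the variable names (input, adjacent_only, fully_deduped) are made up for illustration.

```shell
# Hypothetical input: "apple" occurs three times, but only the first two are adjacent.
input='apple
apple
banana
apple'

# uniq alone collapses only the adjacent pair, so the last "apple" survives.
adjacent_only=$(printf '%s\n' "$input" | uniq)
printf '%s\n' "$adjacent_only"
# apple
# banana
# apple

# Sorting first makes every duplicate adjacent, so uniq removes them all.
fully_deduped=$(printf '%s\n' "$input" | sort | uniq)
printf '%s\n' "$fully_deduped"
# apple
# banana
```

When no other uniq option is needed, sort -u file.txt gives the same result as sort file.txt | uniq in a single command.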
Count Occurrences of Each Line
sort file.txt | uniq -c
-c adds a count of occurrences before each line (e.g., 3 apples).
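A common follow-up is to rank lines by how often they occur by sorting the counts numerically. This is a sketch on hypothetical sample data; the variable name counts is illustrative only.

```shell
# Hypothetical sample data: one fruit name per line, in no particular order.
counts=$(printf 'apple\nbanana\napple\ncherry\napple\nbanana\n' \
  | sort \
  | uniq -c \
  | sort -rn)   # -r: descending, -n: compare the leading count numerically

printf '%s\n' "$counts"
# Lists apple with count 3, then banana with 2, then cherry with 1.
# The amount of padding in front of each count varies between uniq implementations.
```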
Show Only Duplicated Lines
sort file.txt | uniq -d
-d prints one copy of each line that appears more than once.
Show Only Unique Lines
sort file.txt | uniq -u
-u prints lines that appear exactly once.
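Running -d and -u on the same input shows that they split the lines into two disjoint groups. A small sketch, with made-up sample data and variable names:

```shell
# Hypothetical input: "apple" appears twice, the other lines once.
fruit='apple
banana
apple
cherry'

# -d: one copy of each line that occurs more than once.
repeated=$(printf '%s\n' "$fruit" | sort | uniq -d)
printf '%s\n' "$repeated"
# apple

# -u: only the lines that occur exactly once.
singles=$(printf '%s\n' "$fruit" | sort | uniq -u)
printf '%s\n' "$singles"
# banana
# cherry
```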
Case-Insensitive Comparison
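sort -f file.txt | uniq -i
-i ignores case differences when comparing lines (e.g., Error = error). sort -f folds case while sorting, so lines that differ only in case end up adjacent regardless of locale.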
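The sketch below pairs uniq -i with a case-folding sort (sort -f) so that lines differing only in case are adjacent before they are compared. The sample data and the variable name merged are hypothetical, and the locale is pinned only to make the sort order predictable.

```shell
# Pin the locale so the sort order does not depend on the user's environment.
export LC_ALL=C

# Hypothetical log levels with inconsistent capitalisation.
merged=$(printf 'Error\nwarning\nerror\nERROR\n' | sort -f | uniq -i)

printf '%s\n' "$merged"
# Two lines remain: one spelling of "error" and "warning".
# uniq keeps the first line of each group, so the surviving spelling
# depends on how sort ordered the case variants.
```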
Skip Fields Before Checking
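sort -k3 data.txt | uniq -f2
-f2 skips the first 2 fields when comparing lines. uniq separates fields by blanks (spaces or tabs) and has no option to change the delimiter, so this works on whitespace-separated data rather than on CSV. sort -k3 orders the lines from the third field onward, so lines that match after the skipped fields become adjacent.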
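Because uniq -f counts blank-separated fields, a typical use is ignoring leading date and time columns in a log. This is a minimal sketch; the log lines and the names log and distinct_messages are invented for illustration.

```shell
# Hypothetical log: date, time, then a free-text message.
log='2024-01-01 10:00 disk full
2024-01-01 10:02 login failed
2024-01-01 10:05 disk full'

# sort -k3 orders by the message (field 3 to end of line) so equal messages
# are adjacent; uniq -f2 then ignores the date and time when comparing.
distinct_messages=$(printf '%s\n' "$log" | sort -k3 | uniq -f2)

printf '%s\n' "$distinct_messages"
# 2024-01-01 10:00 disk full
# 2024-01-01 10:02 login failed
```

uniq prints the first line of each group, so the timestamp shown is that of the first occurrence in sorted order.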
Skip Characters Before Checking
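sort -k1.6 file.txt | uniq -s5
-s5 ignores the first 5 characters of each line. sort -k1.6 orders the lines starting from the sixth character, so lines that match after the skipped prefix become adjacent; a plain sort compares whole lines and may leave them apart.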
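The sketch below applies -s5 to records that start with a fixed-width, 5-character prefix (a 4-digit ID plus a space). The data and the names records and distinct_names are hypothetical, and sorting from the sixth character with sort -k1.6 is one way, assumed here, to bring matching lines together first.

```shell
# Hypothetical records: 4-digit ID, one space, then a name.
records='0001 alpha
0002 beta
0003 alpha'

# sort -k1.6 compares from character 6 onward, so both "alpha" lines end up
# adjacent; uniq -s5 then skips the 5-character prefix when comparing.
distinct_names=$(printf '%s\n' "$records" | sort -k1.6 | uniq -s5)

printf '%s\n' "$distinct_names"
# 0001 alpha
# 0002 beta
```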
Compare Only the First N Characters
sort file.txt | uniq -w10
-w10 compares only the first 10 characters of each line. -w is a GNU extension, so other uniq implementations may not support it.
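Limiting the comparison width is handy when each line starts with a fixed-width key. The following sketch assumes GNU uniq (for -w) and uses invented event lines with a 6-character code; the names events and first_per_code are illustrative.

```shell
# Hypothetical events: a 6-character code followed by free text.
events='ERR001 disk full
ERR002 network down
ERR001 disk still full'

# A plain sort already groups lines by their leading code;
# uniq -w6 then treats lines with the same first 6 characters as duplicates.
first_per_code=$(printf '%s\n' "$events" | sort | uniq -w6)

printf '%s\n' "$first_per_code"
# ERR001 disk full
# ERR002 network down
```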
Combine with cut to Process Columns
cut -d',' -f1 data.csv | sort | uniq
Extracts the first CSV column, sorts it, and removes duplicates.
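As a concrete illustration of this pipeline, the sketch below lists the distinct values in the first column of a small, made-up CSV (no quoted fields or embedded commas are assumed). The names csv and distinct_users are hypothetical.

```shell
# Hypothetical CSV: user name, role.
csv='alice,admin
bob,user
alice,user'

# Keep only column 1, sort so duplicates are adjacent, then deduplicate.
distinct_users=$(printf '%s\n' "$csv" | cut -d',' -f1 | sort | uniq)

printf '%s\n' "$distinct_users"
# alice
# bob
```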
Check for Duplicates in Raw (Unsorted) Files
uniq raw_data.txt
Note: On unsorted input, uniq removes only adjacent duplicates. Non-adjacent duplicates remain.
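When the original line order matters and sorting is not an option, a different tool is needed. The sketch below swaps in a well-known awk idiom rather than uniq: it prints a line only the first time it is seen. The sample data and the names raw, still_duplicated, and order_preserved are made up for illustration.

```shell
# Hypothetical unsorted input in which no two duplicates are adjacent.
raw='b
a
b
c
a'

# uniq changes nothing here, because no duplicates are adjacent.
still_duplicated=$(printf '%s\n' "$raw" | uniq)

# awk alternative (not uniq): seen[$0]++ is 0 the first time a line appears,
# so !seen[$0]++ is true only for first occurrences.
order_preserved=$(printf '%s\n' "$raw" | awk '!seen[$0]++')

printf '%s\n' "$order_preserved"
# b
# a
# c
```

The awk approach keeps every distinct line in memory, so for very large inputs sort | uniq may be the more practical choice.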
Key Notes:
- Sorted Input: Always use sort before uniq unless duplicates are guaranteed to be adjacent.
- Delimiters: Use -t with sort or -d with cut for structured data (e.g., CSV); uniq itself only recognizes blank-separated fields.
- Options:
  - -c: Count lines.
  - -d: Show duplicates.
  - -u: Show uniques.
  - -i: Case-insensitive mode.