How to Remove Duplicate Lines in Linux

The uniq command in Linux filters or reports adjacent duplicate lines in a text file or input stream. It is commonly used to remove duplicates, count occurrences, or identify unique/repeated lines. For non-adjacent duplicates, pair uniq with sort to pre-process the input. Below are practical examples:

Contents

Remove Adjacent Duplicate Lines
Remove All Duplicates (After Sorting)
Count Occurrences of Each Line
Show Only Duplicated Lines
Show Only Unique Lines
Case-Insensitive Comparison
Skip Fields Before Checking
Skip Characters Before Checking
Compare Only the First N Characters
Combine with cut to Process Columns
Check for Duplicates in Raw (Unsorted) Files

Remove Adjacent Duplicate Lines

uniq file.txt

Prints file.txt with each run of consecutive duplicate lines collapsed to a single line. The file itself is not modified, and only adjacent duplicates are detected.
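To see why adjacency matters, here is a minimal sketch using made-up sample input (the fruit names and the variable adjacent_out are only for illustration):

```shell
# Hypothetical sample input: "apple" repeats back to back, then appears again later.
adjacent_out=$(printf '%s\n' apple apple banana apple | uniq)

# Only the back-to-back pair is collapsed; the final "apple" survives.
printf '%s\n' "$adjacent_out"
```

The output still contains two "apple" lines, because the last one is not next to the others.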

Remove All Duplicates (After Sorting)

sort file.txt | uniq

Sorts the file first so that identical lines become adjacent, then removes all duplicates. sort -u file.txt gives the same result in one step.
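As a rough illustration, the sketch below runs the pipeline on invented sample data and compares it with sort -u, which should produce the same list. The variable names are hypothetical.

```shell
# Hypothetical sample input with non-adjacent repeats.
fruits=$(printf '%s\n' pear apple pear banana apple)

# LC_ALL=C pins the sort order so the result does not depend on the locale.
piped=$(printf '%s\n' "$fruits" | LC_ALL=C sort | uniq)
one_step=$(printf '%s\n' "$fruits" | LC_ALL=C sort -u)

printf '%s\n' "$piped"
```

The separate uniq step is still needed when you want options such as -c, -d, or -u, which sort -u does not offer.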

Count Occurrences of Each Line

sort file.txt | uniq -c

-c prefixes each output line with the number of times it occurred (e.g., 3 apples).
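A common follow-up is to rank lines by frequency. The sketch below assumes single-word lines (the sample data is invented) and adds a second sort plus awk to tidy the output:

```shell
# Hypothetical sample input: single-word lines only.
words=$(printf '%s\n' apple pear apple banana apple pear)

# Count each line, then sort numerically in reverse so the most frequent comes first.
# uniq -c pads the count with leading spaces (width varies by implementation),
# so awk reprints each row as "count word".
ranked=$(printf '%s\n' "$words" | LC_ALL=C sort | uniq -c | sort -rn | awk '{print $1, $2}')

printf '%s\n' "$ranked"
```

For lines containing spaces, the awk step would need adjusting, since it keeps only the first word after the count.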

Show Only Duplicated Lines

sort file.txt | uniq -d

-d prints one copy of each line that appears more than once.

Show Only Unique Lines

sort file.txt | uniq -u

-u prints lines that appear exactly once.
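The -d and -u options split the same input into two complementary sets. A small sketch with invented data (variable names are hypothetical):

```shell
# Hypothetical sample input: "apple" occurs 3 times, "pear" twice, "banana" once.
input=$(printf '%s\n' apple pear apple banana apple pear)

# -d keeps one copy of every line that occurs more than once.
repeated=$(printf '%s\n' "$input" | LC_ALL=C sort | uniq -d)

# -u keeps only the lines that occur exactly once.
singles=$(printf '%s\n' "$input" | LC_ALL=C sort | uniq -u)

printf 'repeated:\n%s\nsingles:\n%s\n' "$repeated" "$singles"
```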

Case-Insensitive Comparison

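sort -f file.txt | uniq -i

-i ignores case differences when comparing lines (e.g., Error = error). sort -f folds case while sorting, so case variants of the same line end up adjacent.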
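The sketch below, on invented log-level names, folds case in the sort step as well (sort -f) so that differently cased spellings are guaranteed to be neighbours before uniq -i compares them:

```shell
# Hypothetical sample input: three spellings of "error" separated by another line.
levels=$(printf '%s\n' Error warning error ERROR)

# sort -f folds case while ordering, so the three spellings sit together;
# uniq -i then treats them as one line and keeps a single spelling.
folded=$(printf '%s\n' "$levels" | LC_ALL=C sort -f | uniq -i)

printf '%s\n' "$folded"
```

Two lines remain: one spelling of "error" and "warning". Which spelling survives depends on how sort orders the tied lines.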

Skip Fields Before Checking

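sort -k3 data.txt | uniq -f2

-k3 sorts on the third field onward, so lines that match after the first 2 fields end up adjacent.
-f2 skips the first 2 fields when comparing lines.
uniq fields are separated by blanks (spaces or tabs). uniq has no delimiter option, so -f does not work on comma-separated data.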
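Because uniq splits fields on blanks, field skipping suits whitespace-separated data such as logs. A minimal sketch, assuming invented log lines with a date, a time, and a message:

```shell
# Hypothetical log lines: date, time, then a message.
log=$(printf '%s\n' \
  '2024-01-01 10:05 disk full' \
  '2024-01-02 09:00 login ok' \
  '2024-01-01 10:00 disk full')

# sort -k3 orders by the message so equal messages become adjacent;
# uniq -f2 then ignores the date and time fields when comparing.
messages=$(printf '%s\n' "$log" | LC_ALL=C sort -k3 | uniq -f2)

printf '%s\n' "$messages"
```

One line per distinct message remains, even though the timestamps differ.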

Skip Characters Before Checking

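sort -k1.6 file.txt | uniq -s5

-s5 ignores the first 5 characters of each line when comparing.
-k1.6 makes sort start comparing at the 6th character, so lines that match after the skipped prefix end up adjacent.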
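A sketch of character skipping, assuming invented records that start with a fixed 5-character ID and contain no blanks. The sort key starts at character 6 so that matching payloads are grouped before uniq sees them:

```shell
# Hypothetical records with a 5-character ID prefix ("0001:" and so on).
records=$(printf '%s\n' '0001:alpha' '0002:beta' '0003:alpha')

# sort -k1.6 starts comparing at the 6th character, grouping equal payloads;
# uniq -s5 then skips the 5-character prefix when checking for duplicates.
payloads=$(printf '%s\n' "$records" | LC_ALL=C sort -k1.6 | uniq -s5)

printf '%s\n' "$payloads"
```

Only one "alpha" record and one "beta" record remain.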

Compare Only the First N Characters

sort file.txt | uniq -w10

-w10 compares only the first 10 characters of each line. -w is a GNU extension and may be missing from non-GNU versions of uniq.
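One use for a width limit is keeping the first entry per date. The sketch below assumes GNU uniq and invented event lines whose first 10 characters are an ISO date:

```shell
# Hypothetical events: a 10-character date followed by a description.
events=$(printf '%s\n' '2024-01-01 backup' '2024-01-01 restore' '2024-01-02 backup')

# Only the first 10 characters (the date) are compared, so the second
# 2024-01-01 line counts as a duplicate of the first.
first_per_day=$(printf '%s\n' "$events" | LC_ALL=C sort | uniq -w10)

printf '%s\n' "$first_per_day"
```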

Combine with `cut` to Process Columns

cut -d',' -f1 data.csv | sort | uniq

Extracts the first CSV column, sorts it, and removes duplicates.
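A short sketch of the same pipeline on invented CSV rows (a name column and a role column, both hypothetical):

```shell
# Hypothetical CSV rows: name,role
csv=$(printf '%s\n' 'alice,admin' 'bob,user' 'alice,dev')

# Keep only the first column, then sort and de-duplicate it.
names=$(printf '%s\n' "$csv" | cut -d',' -f1 | LC_ALL=C sort | uniq)

printf '%s\n' "$names"
```

Note that cut splits on every comma, so this approach assumes the fields contain no quoted commas.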

Check for Duplicates in Raw (Unsorted) Files

uniq raw_data.txt

Note: Only removes adjacent duplicates. Non-adjacent duplicates remain.
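If the original line order must be preserved, sorting is not an option. One widely used alternative, which relies on awk rather than uniq, is sketched below on invented input:

```shell
# Hypothetical unsorted input with non-adjacent repeats.
raw=$(printf '%s\n' banana apple banana cherry apple)

# seen[$0]++ is 0 (false) the first time a line appears, so !seen[$0]++ prints
# each line only on its first occurrence, wherever later copies sit.
first_seen=$(printf '%s\n' "$raw" | awk '!seen[$0]++')

printf '%s\n' "$first_seen"
```

This keeps the first occurrence of every line in its original position, at the cost of holding each distinct line in memory.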

Key Notes:

Sorted Input: Always use sort before uniq unless duplicates are guaranteed to be adjacent.
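Delimiters: Use -t with sort or -d with cut for structured data (e.g., CSV). uniq itself has no delimiter option.
Options:
-c: Prefix each line with its count.
-d: Show one copy of each repeated line.
-u: Show lines that occur exactly once.
-i: Case-insensitive mode.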
