The uniq command in Linux filters or reports adjacent duplicate lines in a text file or input stream. It is commonly used to remove duplicates, count occurrences, or identify unique or repeated lines. For non-adjacent duplicates, pair uniq with sort to pre-process the input. Below are practical examples:
Remove Adjacent Duplicate Lines
uniq file.txt
Prints file.txt with consecutive duplicate lines collapsed to one (duplicates must be adjacent to be detected); the file itself is not modified.
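As a quick sanity check with a made-up fruit list, only the adjacent pair collapses:
$ printf 'apple\napple\nbanana\napple\n' | uniq
apple
banana
apple
The trailing apple survives because it is not adjacent to the first two.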
Remove All Duplicates (After Sorting)
sort file.txt | uniq
Sorts the file first, then removes all duplicates.
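Running the same made-up input through sort first collapses every duplicate:
$ printf 'apple\napple\nbanana\napple\n' | sort | uniq
apple
banana
sort -u file.txt produces the same result in a single command.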
Count Occurrences of Each Line
sort file.txt | uniq -c
-c prefixes each output line with its number of occurrences (e.g., 3 apples).
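With the same made-up input, each surviving line is prefixed by its count (GNU uniq right-aligns the number):
$ printf 'apple\napple\nbanana\napple\n' | sort | uniq -c
      3 apple
      1 banana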
Show Only Duplicated Lines
sort file.txt | uniq -d
-d prints one copy of each line that appears more than once.
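For example, with a made-up list where only apple repeats:
$ printf 'apple\nbanana\ncherry\napple\n' | sort | uniq -d
apple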
Show Only Unique Lines
sort file.txt | uniq -u
-u prints only the lines that appear exactly once.
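Using the same made-up list, -u is the complement of -d:
$ printf 'apple\nbanana\ncherry\napple\n' | sort | uniq -u
banana
cherry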
Case-Insensitive Comparison
sort file.txt | uniq -i
-i ignores case differences when comparing (e.g., Error and error count as duplicates).
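A minimal check (the three spellings are already adjacent, so no sort is needed); uniq keeps the first line of each run:
$ printf 'Error\nerror\nERROR\n' | uniq -i
Error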
Skip Fields Before Checking
sort -t',' -k2 data.csv | uniq -f2
-t',' tells sort to treat the comma as its field delimiter. -f2 tells uniq to skip the first 2 fields when comparing lines. Caveat: uniq itself has no delimiter option and treats fields as whitespace-separated, so -f only behaves as expected on blank-separated data; for CSV, convert commas to blanks first (e.g., tr ',' ' ').
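A sketch with whitespace-separated fields, where -f1 skips a hypothetical timestamp column so the two requests compare equal:
$ printf '10:01 GET /home\n10:02 GET /home\n' | uniq -f1
10:01 GET /home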
Skip Characters Before Checking
sort file.txt | uniq -s5
-s5 skips the first 5 characters of each line when comparing.
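For instance, skipping a 5-character line-number prefix (digits, colon, and space) in made-up input:
$ printf '001: apple\n002: apple\n' | uniq -s5
001: apple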
Compare Only the First N Characters
sort file.txt | uniq -w10
-w10 compares at most the first 10 characters of each line (a GNU extension; not in POSIX uniq).
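For example, comparing only a leading 10-character date stamp in made-up log lines, so differing suffixes still count as duplicates:
$ printf '2024-01-02 first\n2024-01-02 second\n' | uniq -w10
2024-01-02 first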
Combine with cut to Process Columns
cut -d',' -f1 data.csv | sort | uniq
Extracts the first CSV column, sorts it, and removes duplicates.
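A self-contained sketch of the same pipeline with invented CSV data:
$ printf 'alice,admin\nbob,user\nalice,dev\n' | cut -d',' -f1 | sort | uniq
alice
bob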
Check for Duplicates in Raw (Unsorted) Files
uniq raw_data.txt
Note: Only removes adjacent duplicates. Non-adjacent duplicates remain.
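If you need to remove non-adjacent duplicates while preserving the original line order (which sort | uniq destroys), a common awk idiom does this; note this is an alternative tool, not a uniq feature:
$ awk '!seen[$0]++' raw_data.txt
It prints each line only the first time it is seen.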
Key Notes:
- Sorted input: Always run sort before uniq unless duplicates are guaranteed to be adjacent.
- Delimiters: Use -t with sort or -d with cut for structured data (e.g., CSV); uniq only understands whitespace-separated fields.
- Options: -c counts occurrences, -d shows duplicated lines, -u shows lines that appear exactly once, -i compares case-insensitively.