Delete Duplicate Rows

Delete Duplicate Rows explained with practical SQL patterns, edge cases, and production-ready guidance.

6 min read · Last updated: March 2026 · Back to overview

Quick Answer

To delete duplicate rows in PostgreSQL, use DELETE ... USING with a self-join to keep the row with the highest (or lowest) ID, or use a subquery with ROW_NUMBER() OVER (PARTITION BY ...) to identify and delete all but one duplicate per group.

Spin up a Postgres database in 20 seconds with Vela.

Try Vela Sandbox

Duplicate rows accumulate when tables lack unique constraints or data is imported from external sources without deduplication. PostgreSQL offers several approaches to find and remove duplicates: a self-join with DELETE USING, a ROW_NUMBER() subquery, and an immediate table technique.

Finding duplicate rows

Before deleting, identify which values are duplicated and how many copies exist:

SELECT fruit, COUNT(fruit)
FROM basket
GROUP BY fruit
HAVING COUNT(fruit) > 1
ORDER BY fruit;

Output:

  fruit  | count
---------+-------
 apple   |     2
 orange  |     3
(2 rows)

Method 1: DELETE USING with a self-join

Join the table to itself and delete the rows with lower IDs, keeping the row with the highest ID per duplicate group:

DELETE FROM basket a
USING basket b
WHERE a.id < b.id
  AND a.fruit = b.fruit;

After this, each fruit value appears only once — specifically the row with the highest ID. To keep the lowest ID instead, flip the comparison operator:

DELETE FROM basket a
USING basket b
WHERE a.id > b.id
  AND a.fruit = b.fruit;

Method 2: Subquery with ROW_NUMBER()

Use ROW_NUMBER() partitioned by the duplicate column to number each duplicate group, then delete rows with a row number greater than 1:

DELETE FROM basket
WHERE id IN (
  SELECT id
  FROM (
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY fruit ORDER BY id) AS row_num
    FROM basket
  ) t
  WHERE t.row_num > 1
);

This keeps the row with the lowest ID per group. To keep the highest, change ORDER BY id to ORDER BY id DESC.

For duplicates defined across multiple columns, partition by all of them:

DELETE FROM table_name
WHERE id IN (
  SELECT id
  FROM (
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY column_1, column_2 ORDER BY id) AS row_num
    FROM table_name
  ) t
  WHERE t.row_num > 1
);

Method 3: Immediate table

Create a deduplicated copy, drop the original, and rename the copy. This is useful when the table lacks a primary key or other unique identifier:

-- 1. Create a new table with the same structure
CREATE TABLE basket_temp (LIKE basket);

-- 2. Insert only distinct rows
INSERT INTO basket_temp (fruit, id)
SELECT DISTINCT ON (fruit) fruit, id
FROM basket;

-- 3. Drop the original
DROP TABLE basket;

-- 4. Rename the copy
ALTER TABLE basket_temp RENAME TO basket;

Practical tips

  • Always run a SELECT with your deduplication logic first to confirm which rows will be deleted before executing the DELETE.
  • Wrap the delete in a transaction so you can roll back if the result is not what you expected.
  • After removing duplicates, add a UNIQUE constraint on the relevant columns to prevent future duplicates from accumulating.
  • The self-join method (DELETE USING) is typically faster on small-to-medium tables. The ROW_NUMBER() subquery is more flexible for complex duplicate definitions.

Reference: PostgreSQL documentation — DELETE.

Continue in Modifying Data: Upsert.

Related in this section: INSERT · INSERT Multiple Rows · UPDATE

Frequently Asked Questions

What is the fastest way to delete duplicate rows in PostgreSQL?

The DELETE USING self-join is typically the most straightforward and performant for simple cases. For complex deduplication (multiple columns, specific row to keep), the ROW_NUMBER() subquery approach is more flexible.

How do I keep the most recently inserted duplicate row?

Keep the row with the highest ID (assuming IDs are assigned sequentially): DELETE FROM basket a USING basket b WHERE a.id < b.id AND a.fruit = b.fruit; This removes rows with lower IDs, leaving the highest.

How do I delete duplicates defined across multiple columns?

Use the ROW_NUMBER() approach with PARTITION BY on all relevant columns: DELETE FROM t WHERE id IN (SELECT id FROM (SELECT id, ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY id) AS rn FROM t) x WHERE rn > 1);

How do I prevent duplicate rows from appearing in the future?

Add a UNIQUE constraint on the column or combination of columns that should be unique: ALTER TABLE basket ADD CONSTRAINT basket_fruit_unique UNIQUE (fruit); PostgreSQL will then reject any INSERT or UPDATE that would create a duplicate.

Can I use the immediate table method if the table has foreign key references?

The immediate table method (CREATE, INSERT DISTINCT, DROP, RENAME) will fail if other tables have foreign keys referencing the original table. In that case, use the DELETE USING or subquery methods, which operate on the table in place.