DISTINCT ON in PostgreSQL

Learn how PostgreSQL's DISTINCT ON clause retrieves the first row per group, how ORDER BY controls which row is kept, and when to use it versus window functions.

6 min read · PostgreSQL 7.4+ · Back to overview

Quick Answer

DISTINCT ON (column) is a PostgreSQL-specific clause that returns one row per unique value of the specified column. An ORDER BY clause controls which row is considered 'first' within each group — the DISTINCT ON expression must match the leftmost ORDER BY expression.

Spin up a Postgres database in 20 seconds with Vela.

Try Vela Sandbox

DISTINCT ON is a PostgreSQL-specific extension to standard SQL that retrieves exactly one row per unique value of a specified column (or combination of columns). Unlike SELECT DISTINCT, which only eliminates rows that are completely identical across all selected columns, DISTINCT ON lets the other columns in each row vary — it simply picks the “first” row per group as determined by the ORDER BY clause.

This makes DISTINCT ON the most concise tool for “latest record per entity,” “highest score per player,” or “most recent event per category” queries.

Syntax

SELECT DISTINCT ON (expression1, expression2, ...)
  expression1, expression2, other_columns, ...
FROM table_name
ORDER BY expression1, expression2, ..., tiebreaker_column [ASC | DESC];

Key rule: the expressions inside DISTINCT ON (...) must exactly match the leftmost expressions in ORDER BY. The remaining ORDER BY expressions determine which row within each group is kept.

Practical Example

Create a table of sensor readings from IoT devices reporting temperature every few minutes:

CREATE TABLE sensor_readings (
  id          SERIAL PRIMARY KEY,
  device_id   VARCHAR(20) NOT NULL,
  location    VARCHAR(100) NOT NULL,
  temperature NUMERIC(5, 2) NOT NULL,
  recorded_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

INSERT INTO sensor_readings (device_id, location, temperature, recorded_at) VALUES
  ('DEV-A1', 'Server Room',  22.1, '2026-04-07 08:00:00+00'),
  ('DEV-A1', 'Server Room',  22.4, '2026-04-07 08:05:00+00'),
  ('DEV-A1', 'Server Room',  23.0, '2026-04-07 08:10:00+00'),
  ('DEV-B2', 'Office Floor', 20.5, '2026-04-07 08:00:00+00'),
  ('DEV-B2', 'Office Floor', 20.3, '2026-04-07 08:05:00+00'),
  ('DEV-C3', 'Data Center',  19.8, '2026-04-07 08:00:00+00'),
  ('DEV-C3', 'Data Center',  20.1, '2026-04-07 08:05:00+00'),
  ('DEV-C3', 'Data Center',  20.6, '2026-04-07 08:10:00+00');

Get the most recent reading per device:

SELECT DISTINCT ON (device_id)
  device_id,
  location,
  temperature,
  recorded_at
FROM sensor_readings
ORDER BY device_id, recorded_at DESC;

Result:

device_id | location     | temperature | recorded_at
-----------+--------------+-------------+------------------------
DEV-A1    | Server Room  |       23.00 | 2026-04-07 08:10:00+00
DEV-B2    | Office Floor |       20.30 | 2026-04-07 08:05:00+00
DEV-C3    | Data Center  |       20.60 | 2026-04-07 08:10:00+00

DISTINCT ON (device_id) groups rows by device_id. ORDER BY device_id, recorded_at DESC sorts within each group so the most recent reading is first — that is the row kept.

Get the highest temperature reading per device:

SELECT DISTINCT ON (device_id)
  device_id,
  temperature AS peak_temp,
  recorded_at AS peak_at
FROM sensor_readings
ORDER BY device_id, temperature DESC;

Get the most recent reading per location:

SELECT DISTINCT ON (location)
  location,
  device_id,
  temperature,
  recorded_at
FROM sensor_readings
ORDER BY location, recorded_at DESC;

Verify with pg_catalog:

-- Check the query plan to confirm index usage
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT ON (device_id)
  device_id, temperature, recorded_at
FROM sensor_readings
ORDER BY device_id, recorded_at DESC;

DISTINCT ON vs. Window Functions

Both DISTINCT ON and ROW_NUMBER() can solve “first row per group” queries. Choose based on context:

ApproachSyntaxWhen to prefer
DISTINCT ONConcise, PostgreSQL-onlySimple “latest/first per group” in PostgreSQL
ROW_NUMBER()Verbose, portable SQLCross-database portability, or when you need multiple rows per group

Equivalent query using ROW_NUMBER():

SELECT device_id, temperature, recorded_at
FROM (
  SELECT
    device_id,
    temperature,
    recorded_at,
    ROW_NUMBER() OVER (PARTITION BY device_id ORDER BY recorded_at DESC) AS rn
  FROM sensor_readings
) ranked
WHERE rn = 1
ORDER BY device_id;

Both produce the same result. DISTINCT ON is more concise; the window function version works in MySQL, SQL Server, and other databases.

Testing with Vela

DISTINCT ON queries with composite indexes are sensitive to data distribution — the query plan that’s fast on a small development dataset may perform differently at production scale. Vela’s database branching lets you clone your production database and benchmark your DISTINCT ON queries with realistic data volumes and cardinalities before deploying. You can also test the composite index creation (CREATE INDEX ON sensor_readings (device_id, recorded_at DESC)) on a branch to confirm performance without impacting production.

Production Tips

  • Create a composite index matching your DISTINCT ON and ORDER BY columns: CREATE INDEX ON sensor_readings (device_id, recorded_at DESC). PostgreSQL can then satisfy the query with an index scan in the correct order, skipping the sort step entirely.
  • Always include ORDER BY in DISTINCT ON queries. Without it, the row selected within each group is non-deterministic and can change between executions.
  • The DISTINCT ON expressions must be the leftmost expressions in ORDER BY in the same order. Violating this rule produces an error: SELECT DISTINCT ON expressions must match initial ORDER BY expressions.
  • DISTINCT ON is PostgreSQL-specific. If your application must support other databases, use the ROW_NUMBER() window function approach instead.
  • For very large tables where only a few distinct groups exist (low cardinality), DISTINCT ON with a proper index is typically faster than a GROUP BY + subquery approach.
  • When combined with LIMIT, PostgreSQL can short-circuit the scan — once it has found one row per group for the requested limit, it stops reading. This makes DISTINCT ON particularly efficient for paginated “latest per entity” feeds.

Continue in Querying Data: Back to tutorial overview.

Related in this section: PostgreSQL SELECT · Column Aliases · ORDER BY

Frequently Asked Questions

What is the difference between DISTINCT ON and SELECT DISTINCT?
SELECT DISTINCT removes rows where all selected columns are identical. DISTINCT ON (col) keeps one row per unique value of col — the other columns in that row can vary across groups. DISTINCT ON is designed for 'get the latest/best/first row per group' queries, while SELECT DISTINCT is for simple full-row deduplication.
Does DISTINCT ON lock the table?
No. DISTINCT ON is evaluated as part of a SELECT statement, which acquires only an ACCESS SHARE lock that does not block concurrent writes. The deduplication happens after rows are retrieved and sorted — no additional locking is involved.
Why does PostgreSQL require ORDER BY with DISTINCT ON?
Without ORDER BY, PostgreSQL has no defined way to choose which row within a group is 'first.' The ORDER BY clause makes the selection deterministic. The DISTINCT ON expression must match the leftmost ORDER BY expression — this is enforced by the query planner and produces an error if violated.
Can I use DISTINCT ON with multiple columns?
Yes. DISTINCT ON (col1, col2) treats the combination of col1 and col2 as the grouping key, keeping one row per unique (col1, col2) pair. The ORDER BY clause must begin with col1, col2 in the same order, followed by any additional sort columns.
Is DISTINCT ON available in MySQL, SQL Server, or other databases?
No. DISTINCT ON is a PostgreSQL-specific extension. The portable equivalent in other databases uses a subquery or CTE with ROW_NUMBER() OVER (PARTITION BY col ORDER BY sort_col DESC) and then filters WHERE rn = 1.