PostgreSQL GROUP BY

How PostgreSQL GROUP BY works: grouping rows with aggregate functions, combining with WHERE and HAVING, grouping by expressions, and common analytics patterns.

12 min read · Last updated: March 2026 · Back to overview

Quick Answer

GROUP BY groups rows that have the same values in specified columns into summary rows. It is always used with aggregate functions (COUNT, SUM, AVG, MIN, MAX) to compute one value per group. Every column in the SELECT list must either be in GROUP BY or wrapped in an aggregate function.

Spin up a Postgres database in 20 seconds with Vela.

Try Vela Sandbox

Introduction to PostgreSQL GROUP BY

GROUP BY divides rows into groups based on shared column values, then applies aggregate functions to compute summary values for each group.

Syntax:

SELECT column1, aggregate_function(column2)
FROM table_name
WHERE condition
GROUP BY column1
ORDER BY column1;

Every non-aggregated column in SELECT must appear in GROUP BY.

Aggregate functions used with GROUP BY

  • COUNT(*) — number of rows in the group
  • COUNT(col) — number of non-NULL values
  • SUM(col) — sum of values
  • AVG(col) — average value
  • MIN(col) / MAX(col) — smallest / largest value

Example: orders per customer

SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
ORDER BY order_count DESC;

Example: revenue by month

SELECT
  date_trunc('month', placed_at) AS month,
  COUNT(*) AS orders,
  SUM(total_amount) AS revenue,
  AVG(total_amount) AS avg_order_value
FROM orders
WHERE placed_at >= '2024-01-01'
GROUP BY 1
ORDER BY 1;

GROUP BY with multiple columns

-- Revenue broken down by year and product category
SELECT
  date_part('year', o.placed_at) AS year,
  p.category,
  SUM(oi.unit_price * oi.quantity) AS revenue
FROM order_items oi
JOIN orders o ON o.order_id = oi.order_id
JOIN products p ON p.product_id = oi.product_id
GROUP BY 1, 2
ORDER BY 1, revenue DESC;

Filtering groups with HAVING

HAVING filters groups after aggregation (unlike WHERE, which filters rows before grouping):

-- Only customers with more than 5 orders
SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5
ORDER BY order_count DESC;

COUNT DISTINCT: unique values per group

-- Unique products ordered per customer
SELECT
  customer_id,
  COUNT(DISTINCT product_id) AS unique_products
FROM order_items oi
JOIN orders o USING (order_id)
GROUP BY customer_id;

Production tips

  • Index columns used in GROUP BY when the query is part of a reporting or analytics workload.
  • Prefer HAVING for post-aggregation filters and WHERE for pre-aggregation filters — WHERE can use indexes; HAVING cannot.
  • Use GROUPING SETS or ROLLUP for multi-level aggregations without writing multiple UNION ALL queries.
  • For dashboards querying large tables, consider partial indexes or materialized views to avoid full table scans on every request.

Reference: PostgreSQL documentation — GROUP BY and HAVING.

Continue in Group Data: HAVING.

Related in this section: HAVING · GROUPING SETS · CUBE

Frequently Asked Questions

What is the difference between WHERE and HAVING in PostgreSQL GROUP BY queries?

WHERE filters rows before grouping — it operates on individual rows and cannot reference aggregate functions. HAVING filters groups after aggregation — it operates on group-level aggregates. Example: WHERE amount > 10 keeps only rows where amount > 10 before grouping; HAVING SUM(amount) > 100 keeps only groups where the sum exceeds 100 after grouping. You can use both in the same query.

Do I need to list all non-aggregated columns in GROUP BY?

Yes. In standard SQL (and PostgreSQL), every column in the SELECT list that is not inside an aggregate function must appear in the GROUP BY clause. PostgreSQL will throw an error if you reference a non-aggregated column that is not in GROUP BY. Exception: if you group by a primary key column, you can reference any other column from the same table without including it in GROUP BY (since the primary key uniquely identifies the row).

Can I GROUP BY an expression or alias in PostgreSQL?

You can GROUP BY an expression directly: GROUP BY date_trunc('month', created_at). PostgreSQL also allows referencing a SELECT column position: GROUP BY 1 means "group by the first column in the SELECT list." This is a PostgreSQL-specific convenience. You cannot use a SELECT alias in GROUP BY because GROUP BY is evaluated before SELECT.

How do I count distinct values per group?

Use COUNT(DISTINCT column) inside the aggregate. For example: SELECT category, COUNT(DISTINCT user_id) AS unique_users FROM events GROUP BY category. This counts unique user IDs per category, not total rows. Note that COUNT(DISTINCT) can be slower than COUNT(*) on large datasets — for performance-critical queries, consider using a subquery or window function.