Introduction to PostgreSQL GROUP BY
GROUP BY divides rows into groups based on shared column values, then applies aggregate functions to compute summary values for each group.
Syntax:
SELECT column1, aggregate_function(column2)
FROM table_name
WHERE condition
GROUP BY column1
ORDER BY column1;
Every non-aggregated column in SELECT must appear in GROUP BY.
Aggregate functions used with GROUP BY
COUNT(*)— number of rows in the groupCOUNT(col)— number of non-NULL valuesSUM(col)— sum of valuesAVG(col)— average valueMIN(col)/MAX(col)— smallest / largest value
Example: orders per customer
SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
ORDER BY order_count DESC;
Example: revenue by month
SELECT
date_trunc('month', placed_at) AS month,
COUNT(*) AS orders,
SUM(total_amount) AS revenue,
AVG(total_amount) AS avg_order_value
FROM orders
WHERE placed_at >= '2024-01-01'
GROUP BY 1
ORDER BY 1;
GROUP BY with multiple columns
-- Revenue broken down by year and product category
SELECT
date_part('year', o.placed_at) AS year,
p.category,
SUM(oi.unit_price * oi.quantity) AS revenue
FROM order_items oi
JOIN orders o ON o.order_id = oi.order_id
JOIN products p ON p.product_id = oi.product_id
GROUP BY 1, 2
ORDER BY 1, revenue DESC;
Filtering groups with HAVING
HAVING filters groups after aggregation (unlike WHERE, which filters rows before grouping):
-- Only customers with more than 5 orders
SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5
ORDER BY order_count DESC;
COUNT DISTINCT: unique values per group
-- Unique products ordered per customer
SELECT
customer_id,
COUNT(DISTINCT product_id) AS unique_products
FROM order_items oi
JOIN orders o USING (order_id)
GROUP BY customer_id;
Production tips
- Index columns used in GROUP BY when the query is part of a reporting or analytics workload.
- Prefer
HAVINGfor post-aggregation filters andWHEREfor pre-aggregation filters — WHERE can use indexes; HAVING cannot. - Use
GROUPING SETSorROLLUPfor multi-level aggregations without writing multiple UNION ALL queries. - For dashboards querying large tables, consider partial indexes or materialized views to avoid full table scans on every request.
Reference: PostgreSQL documentation — GROUP BY and HAVING.