PostgreSQL Self-Join

Learn how PostgreSQL self-joins work: querying hierarchical data, finding pairs within the same table, choosing INNER vs LEFT JOIN, and performance guidance.

5 min read · PostgreSQL 9.4+ · Back to overview

Quick Answer

A PostgreSQL self-join is any standard JOIN where both sides reference the same table, distinguished by two different aliases. Self-joins are used to query hierarchical data (employees and their managers) or to find matching pairs within the same table.

Spin up a Postgres database in 20 seconds with Vela.

Try Vela Sandbox

A self-join joins a table to itself. You reference the same table twice in the FROM clause — once for each role the data plays — and assign each reference a distinct alias so PostgreSQL can tell them apart. Self-joins are the natural tool for hierarchical data and for finding pairs of rows within the same table that share a common value.

Syntax

Using INNER JOIN (excludes rows with no matching partner):

SELECT select_list
FROM table_name t1
INNER JOIN table_name t2
  ON join_predicate;

Using LEFT JOIN (includes rows even when no matching partner exists):

SELECT select_list
FROM table_name t1
LEFT JOIN table_name t2
  ON join_predicate;

Practical Example

The following staff table represents employees in an organization, where each employee may report to a manager who is also a row in the same table:

CREATE TABLE staff (
  staff_id    SERIAL PRIMARY KEY,
  full_name   VARCHAR(120) NOT NULL,
  job_title   VARCHAR(80),
  manager_id  INT REFERENCES staff(staff_id)
);

INSERT INTO staff (full_name, job_title, manager_id) VALUES
  ('Eleanor Vasquez', 'VP of Engineering',  NULL),   -- top of hierarchy
  ('Kwame Asante',    'Engineering Manager', 1),
  ('Lena Bauer',      'Engineering Manager', 1),
  ('Rami El-Said',    'Senior Engineer',     2),
  ('Yuki Mori',       'Engineer',            2),
  ('Carmen Ruiz',     'Engineer',            3),
  ('Tobias Wirth',    'Junior Engineer',     3);

List employees and their managers (INNER JOIN)

SELECT
  e.full_name  AS employee,
  e.job_title,
  m.full_name  AS manager
FROM staff e
INNER JOIN staff m ON m.staff_id = e.manager_id
ORDER BY m.full_name, e.full_name;

Eleanor Vasquez is absent because she has no manager (manager_id IS NULL) and INNER JOIN requires a match on both sides.

Include the top-level employee (LEFT JOIN)

SELECT
  e.full_name  AS employee,
  e.job_title,
  m.full_name  AS manager
FROM staff e
LEFT JOIN staff m ON m.staff_id = e.manager_id
ORDER BY m.full_name NULLS LAST, e.full_name;

Eleanor now appears with NULL in the manager column — the LEFT JOIN preserves all rows from the e alias even when no matching row exists in the m alias.

Verify the self-referencing foreign key

SELECT
  conname   AS constraint_name,
  pg_get_constraintdef(oid) AS definition
FROM pg_constraint
WHERE conrelid = 'staff'::regclass
  AND contype  = 'f';

Finding Matching Pairs Within the Same Table

A self-join can compare every row against every other row to find pairs that share a common attribute. The < inequality on the primary key prevents self-matches and duplicate pairs:

-- Find pairs of staff with the same job title
SELECT
  s1.full_name AS employee_a,
  s2.full_name AS employee_b,
  s1.job_title
FROM staff s1
INNER JOIN staff s2
  ON  s1.job_title = s2.job_title
  AND s1.staff_id  < s2.staff_id   -- prevents (A,B) and (B,A) duplicates
ORDER BY s1.job_title, s1.full_name;

INNER vs LEFT JOIN in Self-Joins

INNER JOIN self-joinLEFT JOIN self-join
Root node (NULL foreign key)ExcludedIncluded (NULL on parent side)
Non-root nodesIncludedIncluded
Use caseStrict hierarchy traversalFull hierarchy including root

Testing with Vela

Hierarchical self-joins often need index tuning — an index on manager_id can make a large org-chart query orders of magnitude faster. Vela’s database branching lets you add the index on a production-data branch, run EXPLAIN ANALYZE to confirm the improvement, and then apply the change to production with confidence.

Production Tips

  • Always assign meaningful aliases (e for employee, m for manager) so the query is self-documenting.
  • Add an index on the self-referencing foreign key column (e.g., manager_id) — PostgreSQL does not create this index automatically and unindexed self-joins cause full table scans.
  • Use LEFT JOIN rather than INNER JOIN when the root of the hierarchy must appear in results.
  • For unique pair comparisons within the same table, use t1.id < t2.id rather than t1.id != t2.id to eliminate both self-matches and reversed duplicates in a single condition.
  • For hierarchies deeper than two or three levels, switch to a WITH RECURSIVE Common Table Expression — it handles arbitrary depth without stacking multiple self-joins.
  • Self-joins on large tables produce row counts proportional to n², so ensure your ON predicate is selective enough to keep the result set manageable.

Continue in Join Tables: FULL OUTER JOIN.

Related in this section: PostgreSQL Joins · Table Aliases · PostgreSQL INNER JOIN

Frequently Asked Questions

What is a self-join in PostgreSQL?
A self-join is a regular join — INNER JOIN, LEFT JOIN, or any other type — where both the left and right sides reference the same physical table, each given a distinct alias. PostgreSQL treats it identically to a join between two different tables; the aliases make the two references distinguishable.
Does a self-join lock the table twice?
No. PostgreSQL acquires one AccessShareLock on the table regardless of how many times it appears in the FROM clause within the same query. The same table referenced under two aliases does not require or produce two separate locks.
What happens to the root node of a hierarchy in a self-join?
With INNER JOIN, the root row (the one with NULL in the parent foreign key) is excluded because no matching row exists for it. Use LEFT JOIN instead — the root row appears with NULL in the parent-side columns, letting you see the entire hierarchy including the top-level node.
How do I avoid duplicate pairs in a self-join comparison?
Use a strict inequality on the primary key: ON t1.id < t2.id and the same join condition. This prevents a row from matching itself and ensures each pair (A, B) appears only once rather than as both (A, B) and (B, A).
What is the safest way to use self-joins in production with deep hierarchies?
For hierarchies deeper than two or three levels, replace the self-join with a recursive Common Table Expression (WITH RECURSIVE). Recursive CTEs handle arbitrary depth in one query, are easier to read, and let PostgreSQL optimize the traversal more efficiently than stacked self-joins.