XML Data Type in PostgreSQL

Learn how to store and query XML documents in PostgreSQL using the XML type: XMLPARSE, the xpath() function, extracting fields, filtering rows, and creating expression indexes on XML data.

5 min read · Back to overview

Quick Answer

The PostgreSQL XML type stores well-formed XML documents or content fragments with built-in validation. Use XMLPARSE() to insert XML data and xpath() to extract values using XPath expressions. For new schemas, prefer JSONB — use XML when integrating with systems that require XML.

Spin up a Postgres database in 20 seconds with Vela.

Try Vela Sandbox

The PostgreSQL XML data type stores well-formed XML documents directly in the database. PostgreSQL validates XML on insert and provides built-in XPath query support via the xpath() function. Use this type when integrating with systems that produce or require XML — legacy enterprise APIs, SOAP services, or EDI pipelines. For new designs, JSONB is usually the better choice.

Syntax

column_name XML

Practical Example

Store product specifications as XML documents for compatibility with a legacy procurement system:

CREATE TABLE products (
  id    SERIAL PRIMARY KEY,
  sku   VARCHAR(50) NOT NULL UNIQUE,
  spec  XML
);

-- Insert using XMLPARSE(DOCUMENT ...) for full XML documents
INSERT INTO products (sku, spec) VALUES
  ('BOLT-TI-M6',
   XMLPARSE(DOCUMENT '<?xml version="1.0"?>
   <product>
     <name>Titanium M6 Bolt</name>
     <material>titanium</material>
     <weight_g>4.2</weight_g>
     <thread>M6</thread>
   </product>')),
  ('PANEL-CF-A4',
   XMLPARSE(DOCUMENT '<?xml version="1.0"?>
   <product>
     <name>Carbon Fiber Panel A4</name>
     <material>carbon_fiber</material>
     <weight_g>82.0</weight_g>
     <dimensions><w>210</w><h>297</h><t>2</t></dimensions>
   </product>'));

Querying XML with xpath()

-- Extract a single text value (returns XML array; cast element to text)
SELECT
  sku,
  (xpath('/product/name/text()', spec))[1]::text        AS name,
  (xpath('/product/material/text()', spec))[1]::text    AS material,
  (xpath('/product/weight_g/text()', spec))[1]::text::numeric AS weight_g
FROM products;
    sku      |         name          | material    | weight_g
-------------+-----------------------+-------------+----------
 BOLT-TI-M6  | Titanium M6 Bolt      | titanium    |      4.2
 PANEL-CF-A4 | Carbon Fiber Panel A4 | carbon_fiber|     82.0

Extracting Nested Fields

-- Extract a nested element
SELECT
  sku,
  (xpath('/product/dimensions/w/text()', spec))[1]::text AS width_mm,
  (xpath('/product/dimensions/h/text()', spec))[1]::text AS height_mm
FROM products
WHERE xpath_exists('/product/dimensions', spec);

Filtering Rows by XML Content

-- Filter using xpath() in WHERE
SELECT sku
FROM products
WHERE (xpath('/product/material/text()', spec))[1]::text = 'titanium';

-- xpath_exists() is cleaner for existence checks
SELECT sku
FROM products
WHERE xpath_exists('/product/dimensions', spec);

Creating an Expression Index on XML Data

XPath queries on large tables are slow without an index. Create a B-tree expression index:

CREATE INDEX CONCURRENTLY idx_products_material
ON products USING BTREE
  (cast(xpath('/product/material', spec) AS text[]));

-- Query using the same expression to trigger the index
EXPLAIN ANALYZE
SELECT sku FROM products
WHERE cast(xpath('/product/material', spec) AS text[])
    = '{<material>titanium</material>}';

Testing with Vela

XPath expression indexes have a fragile contract — the query expression must match the index expression exactly. Before adding an XML expression index to a production table, test on a database branch to verify the EXPLAIN ANALYZE output shows an index scan and that the cast expression matches what your application generates.

Production Tips

  • For new designs, prefer JSONB over XML — it has richer operator support, better indexing, and is more commonly used in modern applications.
  • Use XML when integrating with systems that produce or require XML (legacy enterprise APIs, SOAP services, XSLT pipelines).
  • Always cast extracted XML values to text before casting to another type — PostgreSQL does not support direct casting from XML to integer or date.
  • Create expression indexes on frequently queried XPath expressions using CREATE INDEX CONCURRENTLY to avoid full table scans without blocking production traffic.
  • Use XMLPARSE(CONTENT ...) for XML fragments (no root element required) and XMLPARSE(DOCUMENT ...) for full XML documents with a declaration.

Continue in PostgreSQL Data Types: BYTEA.

Related in this section: Boolean · CHAR, VARCHAR, and TEXT · NUMERIC

Frequently Asked Questions

What is the PostgreSQL XML data type?
The XML type stores XML documents or content fragments. PostgreSQL validates that inserted data is well-formed XML. It provides built-in functions for querying (xpath()), generating (xmlelement(), xmlforest()), and transforming XML data. Use it when integrating with systems that produce or consume XML, such as legacy enterprise APIs or SOAP services.
Does inserting XML data lock the table?
No. INSERT and UPDATE on XML columns acquire only row-level locks. The XML well-formedness validation happens at parse time without blocking other connections.
What happens to expression indexes when I change the XPath expression in a query?
If you create an expression index on a specific XPath expression and then query with a different XPath, the index will not be used. Keep the expression in your WHERE clause identical to the expression in the index definition.
Can I cast an XML value directly to integer or date in PostgreSQL?
No. PostgreSQL does not support direct casting from XML to most other types. You must first extract the value as text with ::text, then cast to the target type: (xpath('/path/text()', col))[1]::text::integer.
What is the safest way to index XML data in production?
Create a B-tree expression index on the specific XPath value you query most frequently: CREATE INDEX CONCURRENTLY idx ON t USING BTREE ((xpath('/path', col)::text[])). Use CONCURRENTLY to avoid blocking. Test the query plan on a Vela branch to confirm the index is used.