BYTEA Data Type in PostgreSQL

Learn how to use the PostgreSQL BYTEA type to store binary data: hex and escape output formats, inserting binary strings, common functions, and when to use BYTEA versus external file storage.

4 min read · Back to overview

Quick Answer

The PostgreSQL BYTEA type stores variable-length binary strings — raw byte sequences up to 1 GB. Use BYTEA for small binary objects like cryptographic hashes, digital signatures, or certificates. For large files, store them externally and save the path in a VARCHAR column.

Spin up a Postgres database in 20 seconds with Vela.

Try Vela Sandbox

The PostgreSQL BYTEA (binary array) type stores variable-length sequences of bytes — raw binary data including null bytes and non-printable characters. The maximum size is 1 GB per column value. Common use cases include cryptographic hashes, digital signatures, encrypted values, and small binary file content.

Syntax

column_name BYTEA

Storage and Output Formats

PostgreSQL stores BYTEA values as raw bytes and displays them in either of two formats:

-- Hex format (default since PostgreSQL 9.0)
SET bytea_output = 'hex';
SELECT '\x48656c6c6f'::bytea;
-- \x48656c6c6f

-- Escape format (older, human-readable for ASCII bytes)
SET bytea_output = 'escape';
SELECT '\x48656c6c6f'::bytea;
-- Hello

Practical Example

Store product image hashes and API key hashes in BYTEA columns:

CREATE TABLE products (
  id           SERIAL PRIMARY KEY,
  sku          VARCHAR(50) NOT NULL UNIQUE,
  name         VARCHAR(200) NOT NULL,
  image_hash   BYTEA,          -- SHA-256 hash of the product image
  created_at   TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE TABLE api_keys (
  id          SERIAL PRIMARY KEY,
  owner_id    INTEGER     NOT NULL REFERENCES products(id),
  key_hash    BYTEA       NOT NULL,   -- store hash, never the raw key
  created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
  last_used   TIMESTAMPTZ
);

-- Insert using hex literal syntax
INSERT INTO products (sku, name, image_hash)
VALUES (
  'BOLT-TI-M6',
  'Titanium M6 Bolt',
  '\x3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2c3d4e5f6a7b8c9d0e1f2a3b4c'
);

Common BYTEA Functions

-- Length in bytes
SELECT octet_length('\x48656c6c6f'::bytea);
-- 5

-- Concatenate two BYTEA values
SELECT '\x4865'::bytea || '\x6c6c6f'::bytea;
-- \x48656c6c6f

-- Extract a slice (1-based offset)
SELECT substring('\x48656c6c6f'::bytea FROM 1 FOR 3);
-- \x48656c

-- Encode to Base64 (for API responses)
SELECT encode('\x48656c6c6f'::bytea, 'base64');
-- SGVsbG8=

-- Decode from Base64 back to BYTEA
SELECT decode('SGVsbG8=', 'base64');
-- \x48656c6c6f

Storing a Cryptographic Hash

-- Store a SHA-256 hash using pgcrypto (more secure than MD5)
CREATE EXTENSION IF NOT EXISTS pgcrypto;

INSERT INTO api_keys (owner_id, key_hash)
VALUES (1, digest('my-raw-api-key-value', 'sha256'));

-- Look up by hash (never store the raw key)
SELECT id, created_at
FROM api_keys
WHERE key_hash = digest('my-raw-api-key-value', 'sha256');

Storage and Performance Considerations

BYTEA values larger than approximately 2 KB are automatically compressed and moved to PostgreSQL’s TOAST storage. This is transparent to queries but affects:

  • Backup size: Large BYTEA values inflate backup files significantly.
  • Replication: Every BYTEA update is replicated in full via WAL.
  • Memory: Fetching large BYTEA values requires allocating memory to decompress them.

For files larger than a few megabytes, store them in object storage (S3, GCS, Azure Blob) and save the URL in a VARCHAR column.

Testing with Vela

Before adding large BYTEA columns to a production table, test on a database branch to measure the impact on backup size, query performance with TOAST decompression, and replication lag. This is especially important for product image pipelines or document storage systems where values may grow to hundreds of kilobytes per row.

Production Tips

  • For files larger than a few megabytes, store them in object storage and save the URL in a VARCHAR column — BYTEA at that scale increases backup size and memory pressure.
  • Use parameterized queries when inserting binary data from application code — your database driver handles proper encoding without manual hex escaping.
  • Use encode(data, 'base64') when you need to return binary data as a text payload in an API response.
  • For cryptographic hashes, use SHA-256 or stronger instead of MD5 — PostgreSQL provides digest(data, 'sha256') in the pgcrypto extension.
  • Monitor TOAST table size with SELECT pg_total_relation_size('products') to detect unexpected growth from large BYTEA values.

Continue in PostgreSQL Data Types: Composite Types.

Related in this section: Boolean · CHAR, VARCHAR, and TEXT · NUMERIC

Frequently Asked Questions

What is the PostgreSQL BYTEA data type?
BYTEA (binary array) stores variable-length sequences of bytes — any binary data including non-printable characters and null bytes. The maximum size is 1 GB per value. PostgreSQL displays BYTEA values in hex format by default (e.g. \x012345) but also supports an older escape format.
Does storing large BYTEA values lock the table?
No. Large BYTEA values are stored in PostgreSQL's TOAST (The Oversized-Attribute Storage Technique) tables automatically without table-level locking. However, very large values increase the size of backups and WAL replication traffic.
What is the difference between the hex and escape formats for BYTEA?
The hex format (default since PostgreSQL 9.0) represents each byte as two hexadecimal digits: \x0123ABCD. The escape format uses printable ASCII for printable bytes and octal sequences (\NNN) for non-printable ones. Control the output with SET bytea_output = 'hex'; or SET bytea_output = 'escape';
Can I use IF EXISTS when creating a BYTEA column?
Use ALTER TABLE t ADD COLUMN IF NOT EXISTS col_name BYTEA to safely skip the statement if the column already exists. This is useful in idempotent migration scripts.
What is the safest way to store and retrieve binary data in production?
Use parameterized queries from your application code — your database driver handles hex encoding automatically. Avoid constructing hex literals by string concatenation. Use encode(col, 'base64') when you need to return binary data in an API response.