# Databases — PostgreSQL Patterns


# PostgreSQL

## When to Use

- PostgreSQL database operations
- SQL query optimization
- Schema design and migrations
- JSONB document storage within a relational model
- Full-text search without a dedicated search engine
- Complex analytical queries with window functions and CTEs

## When NOT to Use

- NoSQL-only projects where no relational database is involved
- In-memory databases like Redis or SQLite used purely for caching or ephemeral storage
- File-based storage scenarios that do not require a database engine

---

## Core Patterns

### 1. Schema Design

Design tables with explicit constraints, proper types, and clear relationships.

```sql
-- Enums for constrained value sets
CREATE TYPE user_role AS ENUM ('admin', 'editor', 'viewer');
CREATE TYPE order_status AS ENUM ('pending', 'processing', 'shipped', 'delivered', 'cancelled');

-- Composite types for reusable structures
CREATE TYPE address AS (
    street TEXT,
    city   TEXT,
    state  TEXT,
    zip    VARCHAR(10)
);

-- Users table with constraints
CREATE TABLE users (
    id          BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email       TEXT NOT NULL UNIQUE,
    name        TEXT NOT NULL CHECK (char_length(name) >= 1),
    role        user_role NOT NULL DEFAULT 'viewer',
    metadata    JSONB DEFAULT '{}',
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Organizations with self-referencing hierarchy
CREATE TABLE organizations (
    id          BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name        TEXT NOT NULL,
    parent_id   BIGINT REFERENCES organizations(id) ON DELETE SET NULL,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Membership join table with composite primary key
CREATE TABLE org_memberships (
    user_id     BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    org_id      BIGINT NOT NULL REFERENCES organizations(id) ON DELETE CASCADE,
    role        user_role NOT NULL DEFAULT 'viewer',
    joined_at   TIMESTAMPTZ NOT NULL DEFAULT now(),
    PRIMARY KEY (user_id, org_id)
);

-- Orders with foreign keys, check constraints, and enum status
CREATE TABLE orders (
    id          BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    user_id     BIGINT NOT NULL REFERENCES users(id) ON DELETE RESTRICT,
    status      order_status NOT NULL DEFAULT 'pending',
    total_cents BIGINT NOT NULL CHECK (total_cents >= 0),
    shipping    address,
    items       JSONB NOT NULL DEFAULT '[]',
    placed_at   TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Auto-update updated_at with a trigger
CREATE OR REPLACE FUNCTION set_updated_at()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_users_updated_at
    BEFORE UPDATE ON users
    FOR EACH ROW EXECUTE FUNCTION set_updated_at();
```

**Key principles:**
- Use `BIGINT GENERATED ALWAYS AS IDENTITY` over `SERIAL` for new projects
- Use `TIMESTAMPTZ` (not `TIMESTAMP`) to store times with timezone awareness
- Prefer `TEXT` over `VARCHAR(n)` unless a hard length limit is business-critical
- Add `ON DELETE` actions on every foreign key (CASCADE, RESTRICT, or SET NULL)
- Use `CHECK` constraints for business rules that live at the data level

---

### 2. Index Strategy

Choose the right index type based on your query patterns.

**Decision guide:**

| Query Pattern | Index Type | Example |
|---------------|-----------|---------|
| Equality (`=`) and range (`<`, `>`, `BETWEEN`) | B-tree (default) | `WHERE created_at > '2025-01-01'` |
| Array containment (`@>`), JSONB queries | GIN | `WHERE tags @> '{postgres}'` |
| Full-text search (`@@`) | GIN | `WHERE to_tsvector(body) @@ query` |
| Geometry, range overlap | GiST | `WHERE location <-> point '(40.7,-74.0)' < 0.01` |
| Filtered subset of rows | Partial | `WHERE active = true` |
| Index-only scans (no heap lookup) | Covering (INCLUDE) | Frequently selected columns |

```sql
-- B-tree: default, good for equality and range
CREATE INDEX idx_orders_placed_at ON orders(placed_at DESC);
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

-- GIN: arrays and JSONB containment
CREATE INDEX idx_users_metadata ON users USING GIN (metadata);
CREATE INDEX idx_orders_items ON orders USING GIN (items jsonb_path_ops);

-- GIN: full-text search
ALTER TABLE articles ADD COLUMN search_vector tsvector
    GENERATED ALWAYS AS (
        setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
        setweight(to_tsvector('english', coalesce(body, '')), 'B')
    ) STORED;

CREATE INDEX idx_articles_search ON articles USING GIN (search_vector);

-- Full-text search query
SELECT id, title, ts_rank(search_vector, query) AS rank
FROM articles, plainto_tsquery('english', 'database optimization') AS query
WHERE search_vector @@ query
ORDER BY rank DESC
LIMIT 20;

-- GiST: geometry and range types
CREATE INDEX idx_events_duration ON events USING GiST (
    tstzrange(starts_at, ends_at)
);

-- Find overlapping events
SELECT * FROM events
WHERE tstzrange(starts_at, ends_at) && tstzrange('2025-06-01', '2025-06-02');

-- Partial index: only index rows you actually query
CREATE INDEX idx_orders_pending ON orders(placed_at)
    WHERE status = 'pending';

-- Covering index: avoids heap lookup for common queries
CREATE INDEX idx_users_email_covering ON users(email)
    INCLUDE (name, role);

-- This query can now be answered entirely from the index
SELECT name, role FROM users WHERE email = 'user@example.com';
```

**When to add an index:** Run `EXPLAIN ANALYZE` first. Add an index when you see sequential scans on large tables with selective WHERE clauses. Do not index columns with very low cardinality (e.g., a boolean on a small table) unless combined with other columns.

---

### 3. Query Optimization

#### Reading EXPLAIN ANALYZE

```sql
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE o.placed_at > now() - INTERVAL '30 days'
GROUP BY u.id, u.name
ORDER BY order_count DESC
LIMIT 10;
```

**What to look for in the output:**
- **Seq Scan on large tables** -- add an index or rewrite the WHERE clause
- **Nested Loop with high row counts** -- consider a Hash Join (may need more `work_mem`)
- **actual rows far exceeding estimated rows** -- run `ANALYZE tablename` to update statistics
- **Buffers: shared read** large numbers -- data not cached, check `shared_buffers` sizing
- **Sort Method: external merge** -- increase `work_mem` for this query

#### Common Query Rewrites

```sql
-- BAD: correlated subquery runs once per row
SELECT u.name,
    (SELECT COUNT(*) FROM orders o WHERE o.user_id = u.id) AS order_count
FROM users u;

-- GOOD: single pass with JOIN + GROUP BY
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id, u.name;

-- BAD: OR on different columns defeats index usage
SELECT * FROM orders WHERE user_id = 5 OR status = 'pending';

-- GOOD: UNION ALL lets each branch use its own index
SELECT * FROM orders WHERE user_id = 5
UNION ALL
SELECT * FROM orders WHERE status = 'pending' AND user_id != 5;

-- BAD: function call on indexed column prevents index use
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';

-- GOOD: expression index or use citext
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
-- or better: define email as CITEXT type

-- Avoiding N+1: fetch users and their latest order in one query
SELECT DISTINCT ON (u.id)
    u.id, u.name, o.id AS latest_order_id, o.total_cents, o.placed_at
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
ORDER BY u.id, o.placed_at DESC;
```

---

### 4. Migrations

Follow the up/down pattern and plan for zero-downtime deployments.

```sql
-- ============================================
-- Migration: 20250601_001_add_user_preferences
-- ============================================

-- UP
ALTER TABLE users ADD COLUMN preferences JSONB DEFAULT '{}';

-- Create index CONCURRENTLY to avoid locking the table
CREATE INDEX CONCURRENTLY idx_users_preferences
    ON users USING GIN (preferences);

-- DOWN
DROP INDEX IF EXISTS idx_users_preferences;
ALTER TABLE users DROP COLUMN IF EXISTS preferences;
```

**Safe vs unsafe operations:**

| Operation | Safe? | Notes |
|-----------|-------|-------|
| ADD COLUMN (nullable or with volatile default) | Yes | Instant in PG 11+ with non-volatile default too |
| ADD COLUMN NOT NULL without default | No | Fails if rows exist; add nullable first, backfill, then set NOT NULL |
| DROP COLUMN | Mostly | Quick, but ORM queries may break if they SELECT * |
| RENAME COLUMN | Dangerous | Breaks all queries referencing old name; use a transition period |
| ADD INDEX | Safe with CONCURRENTLY | Without CONCURRENTLY, locks writes for duration |
| ADD CONSTRAINT (CHECK/FK) | Careful | Use NOT VALID then VALIDATE CONSTRAINT in two steps |
| Change column type | Dangerous | Rewrites entire table; use a new column + migration instead |

```sql
-- Zero-downtime: add NOT NULL constraint safely
-- Step 1: add column as nullable
ALTER TABLE users ADD COLUMN phone TEXT;

-- Step 2: backfill in batches
UPDATE users SET phone = '' WHERE phone IS NULL AND id BETWEEN 1 AND 10000;
UPDATE users SET phone = '' WHERE phone IS NULL AND id BETWEEN 10001 AND 20000;
-- ... continue in batches

-- Step 3: add constraint without full table lock
ALTER TABLE users ADD CONSTRAINT users_phone_not_null
    CHECK (phone IS NOT NULL) NOT VALID;

-- Step 4: validate (scans table but allows concurrent writes)
ALTER TABLE users VALIDATE CONSTRAINT users_phone_not_null;

-- Step 5: optionally convert to proper NOT NULL
ALTER TABLE users ALTER COLUMN phone SET NOT NULL;
ALTER TABLE users DROP CONSTRAINT users_phone_not_null;
```

---

### 5. JSON/JSONB

Use JSONB for semi-structured data that lives alongside relational columns.

**When to use JSONB:**
- User preferences, settings, or metadata with varying keys
- API response caching or event payloads
- Flexible attributes that differ per row

**When NOT to use JSONB:**
- Data you regularly JOIN on or use in WHERE clauses across tables -- normalize it
- Data that has a fixed, well-known schema -- use proper columns

```sql
-- Querying JSONB: operators
-- ->  returns JSONB element (keeps type)
-- ->> returns TEXT value
-- @>  containment (left contains right)
-- ?   key exists

-- Get a nested value
SELECT
    metadata->>'department' AS department,
    metadata->'settings'->>'theme' AS theme
FROM users
WHERE metadata @> '{"role": "admin"}';

-- Check if a key exists
SELECT * FROM users WHERE metadata ? 'avatar_url';

-- Query inside JSONB arrays
SELECT * FROM orders
WHERE items @> '[{"sku": "WIDGET-001"}]';

-- Update a nested JSONB field
UPDATE users
SET metadata = jsonb_set(metadata, '{settings,notifications}', '"email"')
WHERE id = 42;

-- Remove a key
UPDATE users
SET metadata = metadata - 'deprecated_field'
WHERE metadata ? 'deprecated_field';

-- Aggregate JSONB: expand array elements into rows
SELECT o.id, item->>'sku' AS sku, (item->>'qty')::int AS qty
FROM orders o, jsonb_array_elements(o.items) AS item
WHERE o.status = 'pending';

-- Index strategies for JSONB
-- General containment queries: GIN with jsonb_ops (default)
CREATE INDEX idx_users_metadata_gin ON users USING GIN (metadata);

-- Containment-only queries (smaller, faster index): jsonb_path_ops
CREATE INDEX idx_orders_items_path ON orders USING GIN (items jsonb_path_ops);

-- Specific key lookups: expression index on extracted value
CREATE INDEX idx_users_department ON users ((metadata->>'department'));
```

---

### 6. CTEs and Window Functions

#### Common Table Expressions (CTEs)

```sql
-- Readable multi-step query with CTEs
WITH monthly_revenue AS (
    SELECT
        date_trunc('month', placed_at) AS month,
        SUM(total_cents) AS revenue_cents
    FROM orders
    WHERE status = 'delivered'
    GROUP BY 1
),
revenue_with_growth AS (
    SELECT
        month,
        revenue_cents,
        LAG(revenue_cents) OVER (ORDER BY month) AS prev_month,
        ROUND(
            100.0 * (revenue_cents - LAG(revenue_cents) OVER (ORDER BY month))
            / NULLIF(LAG(revenue_cents) OVER (ORDER BY month), 0),
            1
        ) AS growth_pct
    FROM monthly_revenue
)
SELECT * FROM revenue_with_growth ORDER BY month DESC;

-- Recursive CTE: org hierarchy tree
WITH RECURSIVE org_tree AS (
    -- Base case: top-level orgs
    SELECT id, name, parent_id, 0 AS depth, name::TEXT AS path
    FROM organizations
    WHERE parent_id IS NULL

    UNION ALL

    -- Recursive step
    SELECT o.id, o.name, o.parent_id, t.depth + 1, t.path || ' > ' || o.name
    FROM organizations o
    JOIN org_tree t ON o.parent_id = t.id
)
SELECT * FROM org_tree ORDER BY path;
```

#### Window Functions

```sql
-- ROW_NUMBER: assign rank within a partition
SELECT
    user_id,
    id AS order_id,
    total_cents,
    ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY placed_at DESC) AS rn
FROM orders;

-- Get each user's most recent order
SELECT * FROM (
    SELECT
        o.*,
        ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY placed_at DESC) AS rn
    FROM orders o
) sub WHERE rn = 1;

-- LAG/LEAD: compare with previous/next row
SELECT
    placed_at::date AS order_date,
    total_cents,
    LAG(total_cents) OVER (ORDER BY placed_at) AS prev_order_total,
    total_cents - LAG(total_cents) OVER (ORDER BY placed_at) AS diff
FROM orders
WHERE user_id = 42;

-- Running total
SELECT
    placed_at::date AS order_date,
    total_cents,
    SUM(total_cents) OVER (
        ORDER BY placed_at
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM orders
WHERE user_id = 42;

-- NTILE: divide rows into equal buckets (e.g., quartiles)
SELECT
    user_id,
    SUM(total_cents) AS lifetime_spend,
    NTILE(4) OVER (ORDER BY SUM(total_cents) DESC) AS spend_quartile
FROM orders
GROUP BY user_id;
```

---

### 7. Transaction Isolation

PostgreSQL supports four isolation levels. The two most commonly used:

| Level | Dirty Read | Non-Repeatable Read | Phantom Read | Use Case |
|-------|-----------|-------------------|-------------|----------|
| READ COMMITTED (default) | No | Possible | Possible | Most OLTP workloads |
| REPEATABLE READ | No | No | No (in PG) | Reports, consistent snapshots |
| SERIALIZABLE | No | No | No | Financial transactions, inventory |

```sql
-- Default: READ COMMITTED
-- Each statement sees the latest committed data
BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

-- SERIALIZABLE: full isolation, detects write conflicts
BEGIN ISOLATION LEVEL SERIALIZABLE;
    -- Read current inventory
    SELECT quantity FROM inventory WHERE sku = 'WIDGET-001';
    -- Decrement if sufficient (PG will abort if concurrent tx conflicts)
    UPDATE inventory SET quantity = quantity - 1 WHERE sku = 'WIDGET-001';
COMMIT;
-- If another SERIALIZABLE tx modified the same row, one will get:
-- ERROR: could not serialize access due to concurrent update
-- Your application must retry on serialization failure (SQLSTATE 40001)

-- Advisory locks for application-level coordination
SELECT pg_advisory_xact_lock(hashtext('process-user-' || '42'));
-- Lock is held until transaction ends; no table-level contention
```

**Guidelines:**
- Use READ COMMITTED for general CRUD operations
- Use SERIALIZABLE when correctness requires that concurrent transactions behave as if run sequentially (e.g., balance transfers, seat reservations)
- Always implement retry logic for serialization failures
- Keep transactions as short as possible to reduce contention

---

### 8. Connection Pooling

Direct PostgreSQL connections are expensive (~1-10 MB RAM each). Use a pooler.

**PgBouncer configuration (pgbouncer.ini):**

```ini
[databases]
myapp = host=127.0.0.1 port=5432 dbname=myapp

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt

; Pool mode: transaction is best for most web apps
pool_mode = transaction

; Sizing: start conservative, tune with monitoring
default_pool_size = 20
max_client_conn = 200
min_pool_size = 5
reserve_pool_size = 5
reserve_pool_timeout = 3

; Timeouts
server_idle_timeout = 300
client_idle_timeout = 60
query_timeout = 30
```

**Pool sizing formula:**

```
optimal_pool_size = ((2 * cpu_cores) + effective_disk_spindles)
```

For a 4-core SSD server: `(2 * 4) + 1 = 9` connections is a good starting point. More connections does not mean more throughput -- too many causes contention.

**Pool modes:**

| Mode | Description | Caveats |
|------|-------------|---------|
| `transaction` | Connection returned after each transaction | Cannot use session-level features (LISTEN/NOTIFY, prepared statements, temp tables) |
| `session` | Connection held for entire client session | Fewer pooling benefits; use only when session features needed |
| `statement` | Connection returned after each statement | No multi-statement transactions; rarely used |

**Application-level pooling (Python example with asyncpg):**

```python
import asyncpg

pool = await asyncpg.create_pool(
    dsn="postgresql://user:pass@localhost:6432/myapp",
    min_size=5,
    max_size=20,
    max_inactive_connection_lifetime=300,
    command_timeout=30,
)

async with pool.acquire() as conn:
    rows = await conn.fetch("SELECT * FROM users WHERE active = true")
```

---

## Best Practices

1. **Use parameterized queries everywhere.** Never concatenate user input into SQL strings. ORMs and query builders handle this, but verify in raw SQL contexts.

2. **Run ANALYZE after bulk data changes.** The query planner relies on statistics. After large imports or deletes, run `ANALYZE tablename` to update them.

3. **Prefer BIGINT for primary keys.** INTEGER (max ~2.1 billion) can be exhausted sooner than expected in high-write systems. BIGINT costs 4 extra bytes per row but avoids a painful migration later.

4. **Store money as integers (cents).** Floating-point arithmetic causes rounding errors. Use `BIGINT` for cents or `NUMERIC(19,4)` if sub-cent precision is needed.

5. **Add indexes for foreign keys.** PostgreSQL does not automatically index the child side of a foreign key. Without it, DELETE on the parent table triggers a sequential scan on the child.

6. **Use TIMESTAMPTZ, not TIMESTAMP.** `TIMESTAMP WITHOUT TIME ZONE` silently drops timezone info. Always use `TIMESTAMPTZ` and let the application control display timezone.

7. **Set statement_timeout for web requests.** Prevent runaway queries from holding connections: `SET statement_timeout = '5s';` at session start, or configure per-role in PostgreSQL.

8. **Monitor with pg_stat_statements.** Enable this extension to track query performance over time. The top queries by `total_exec_time` are your optimization targets.

```sql
-- Find slowest queries
SELECT
    calls,
    round(total_exec_time::numeric, 1) AS total_ms,
    round(mean_exec_time::numeric, 1) AS mean_ms,
    query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```

## Common Pitfalls

1. **N+1 queries from ORM lazy loading.** Loading a list of users and then accessing `user.orders` in a loop generates one query per user. Use eager loading (`joinedload` in SQLAlchemy, `select_related` in Django) or batch the query with a JOIN.

2. **Locking the table during migrations.** `ALTER TABLE ... ADD COLUMN NOT NULL DEFAULT 'x'` is safe in PG 11+, but `CREATE INDEX` without `CONCURRENTLY` locks writes. Always use `CREATE INDEX CONCURRENTLY` in production migrations.

3. **Bloated tables from UPDATE-heavy workloads.** PostgreSQL MVCC creates dead tuples on every UPDATE. If autovacuum cannot keep up, table size and query times grow. Monitor `pg_stat_user_tables.n_dead_tup` and tune autovacuum settings for hot tables.

4. **Using OFFSET for pagination on large datasets.** `OFFSET 100000` forces PG to scan and discard 100,000 rows. Use keyset pagination instead:

```sql
-- BAD: slow for deep pages
SELECT * FROM orders ORDER BY id LIMIT 20 OFFSET 100000;

-- GOOD: keyset pagination
SELECT * FROM orders WHERE id > 100000 ORDER BY id LIMIT 20;
```

5. **Ignoring connection limits.** Each PostgreSQL connection consumes RAM. Opening hundreds of direct connections (e.g., one per serverless function invocation) will exhaust `max_connections` and crash the server. Always use PgBouncer or an application-level pool.

6. **Storing large blobs in the database.** Files over a few KB should go in object storage (S3, R2). Store the URL/key in PostgreSQL. Large `bytea` or `TEXT` columns bloat the table, slow backups, and waste shared_buffers cache.

## Related Skills

- `mongodb` - Document-based database patterns for non-relational data
- `caching` - Caching strategies to reduce database load
- `logging` - Logging patterns for query debugging and monitoring