Multiple dbt models or ADF pipeline activities writing to the same target table concurrently under SERIALIZABLE isolation — common when parallelism is set too high for overlapping models.. High write throughput on hot rows, e.g. a single 'last_updated' row in a metadata table that every pipeline run touches.. Long-running analytical queries holding read locks that conflict with concurrent write transactions — a 30-minute dashboard query can block short ETL writes.. ORM frameworks (Django, SQLAlchemy) defaulting to SERIALIZABLE isolation without the developer realizing it, causing conflicts in multi-worker task queues like Celery.. Schema migrations running concurrently with data loads — an ALTER TABLE under SERIALIZABLE can conflict with INSERT/UPDATE transactions on the same table.

Step 1: Add retry logic at the transaction level. In Python/SQLAlchemy: wrap your session.commit() in a loop that catches sqlalchemy.exc.OperationalError with pgcode='40001', waits with exponential backoff (e.g. 100ms, 200ms, 400ms), and retries the full transaction up to 5 times.. Step 2: Check your current isolation level: run SHOW default_transaction_isolation; on the database. If it returns 'serializable' and your workload does not strictly require it, switch to READ COMMITTED: ALTER DATABASE yourdb SET default_transaction_isolation = 'read committed';. Step 3: Identify conflicting transactions by querying active locks: SELECT pid, state, query, xact_start FROM pg_stat_activity WHERE state != 'idle' AND datname = 'yourdb' ORDER BY xact_start; — look for long-running transactions that overlap with your pipeline runs.. Step 4: Reduce transaction duration. Break large bulk loads into smaller batches (e.g. 10,000 rows per transaction instead of 1 million). In dbt, set dbt run --threads 1 for models that write to the same table to serialize them.. Step 5: If a specific long-running query is the blocker, kill it: SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE pid = ; — confirm the PID first to avoid killing the wrong session.. Step 6: For tables with extreme write contention, partition by date or tenant so concurrent writes hit different physical partitions: CREATE TABLE events (id bigint, created_at timestamptz, data jsonb) PARTITION BY RANGE (created_at);. Step 7: In Airflow or ADF, configure task-level retries with backoff. In Airflow: set retries=3 and retry_delay=timedelta(seconds=30) on the task. In ADF: set retry count on the Copy Activity or Stored Procedure activity.

High severityconfiguration

PostgreSQL Error:
40001

Impact

When 40001 errors happen on more than 5% of pipeline runs or recur on every scheduled refresh. Downstream tables go stale, SLA windows get missed, and dbt or ADF retries pile up. If retries exhaust without success, your reporting layer serves outdated data until someone manually intervenes.

A 40001 failure rolls back the entire transaction, meaning no data from that pipeline run reaches the target table. Without retry logic, the run fails and downstream models or reports consume stale data. In dbt, dependent models skip or fail. In ADF, the activity marks as failed and downstream activities do not execute. If retries exhaust, the refresh window is missed entirely — SLA breaches follow, and any alerting system (including MetricSign) fires a pipeline-failure notification. Repeated 40001s during peak hours indicate a structural concurrency bottleneck, not a transient glitch.

What does this error mean?

SQLSTATE 40001 means PostgreSQL aborted and rolled back your transaction because it conflicted with another concurrent transaction under SERIALIZABLE isolation. The database detected that committing both transactions would violate serializability guarantees, so it picked one to kill. In a data-pipeline context (dbt, ADF, Airflow, custom ETL), this typically surfaces as a failed run or task retry. The engineer sees 'could not serialize access due to concurrent update' or 'could not serialize access due to read/write dependencies among transactions' in the pipeline logs. The transaction is fully rolled back — no partial writes land in the target table. The only correct response is to retry the entire transaction from the start, not just the failing statement.

Common causes

1Multiple dbt models or ADF pipeline activities writing to the same target table concurrently under SERIALIZABLE isolation — common when parallelism is set too high for overlapping models.
2High write throughput on hot rows, e.g. a single 'last_updated' row in a metadata table that every pipeline run touches.
3Long-running analytical queries holding read locks that conflict with concurrent write transactions — a 30-minute dashboard query can block short ETL writes.
4ORM frameworks (Django, SQLAlchemy) defaulting to SERIALIZABLE isolation without the developer realizing it, causing conflicts in multi-worker task queues like Celery.
5Schema migrations running concurrently with data loads — an ALTER TABLE under SERIALIZABLE can conflict with INSERT/UPDATE transactions on the same table.

How to fix it

1Step 1: Add retry logic at the transaction level. In Python/SQLAlchemy: wrap your session.commit() in a loop that catches sqlalchemy.exc.OperationalError with pgcode='40001', waits with exponential backoff (e.g. 100ms, 200ms, 400ms), and retries the full transaction up to 5 times.
2Step 2: Check your current isolation level: run SHOW default_transaction_isolation; on the database. If it returns 'serializable' and your workload does not strictly require it, switch to READ COMMITTED: ALTER DATABASE yourdb SET default_transaction_isolation = 'read committed';
3Step 3: Identify conflicting transactions by querying active locks: SELECT pid, state, query, xact_start FROM pg_stat_activity WHERE state != 'idle' AND datname = 'yourdb' ORDER BY xact_start; — look for long-running transactions that overlap with your pipeline runs.
4Step 4: Reduce transaction duration. Break large bulk loads into smaller batches (e.g. 10,000 rows per transaction instead of 1 million). In dbt, set dbt run --threads 1 for models that write to the same table to serialize them.
5Step 5: If a specific long-running query is the blocker, kill it: SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE pid = <blocking_pid>; — confirm the PID first to avoid killing the wrong session.
6Step 6: For tables with extreme write contention, partition by date or tenant so concurrent writes hit different physical partitions: CREATE TABLE events (id bigint, created_at timestamptz, data jsonb) PARTITION BY RANGE (created_at);
7Step 7: In Airflow or ADF, configure task-level retries with backoff. In Airflow: set retries=3 and retry_delay=timedelta(seconds=30) on the task. In ADF: set retry count on the Copy Activity or Stored Procedure activity.

Example log output

ERROR:  could not serialize access due to read/write dependencies among transactions
DETAIL:  Reason code: Canceled on identification as a pivot, during write.
HINT:  The transaction might succeed if retried.

Frequently asked questions

How do I add retry logic for SQLSTATE 40001 in Python?

Wrap your transaction in a for-loop (max 5 attempts). Catch sqlalchemy.exc.OperationalError, check that the pgcode is '40001', call session.rollback(), sleep with exponential backoff (e.g. 0.1 * 2^attempt seconds), and re-execute the entire transaction block. Do not retry just the last statement — the full transaction must restart.

Can I fix 40001 by switching from SERIALIZABLE to READ COMMITTED?

Yes, in most ETL and pipeline workloads READ COMMITTED is sufficient and eliminates serialization failures entirely. Run SHOW default_transaction_isolation; to check your current level. Change it with ALTER DATABASE yourdb SET default_transaction_isolation = 'read committed'; Only keep SERIALIZABLE if your application logic depends on it for correctness (e.g. financial double-spend prevention).

Why does 40001 keep happening even after I added retries?

If retries consistently fail, the contention is too high for retries alone to solve. Check whether multiple pipelines write to the same table at the same time (stagger their schedules by 5-10 minutes), reduce transaction size to shorten lock windows, or partition the target table so writes don't overlap physically. Also verify your retry actually restarts the full transaction, not just the failing statement.

Does 40001 cause data loss or partial writes?

No. PostgreSQL rolls back the entire transaction atomically — no partial data lands in the target table. However, if your pipeline does not retry, the expected data simply never arrives. Downstream tables remain stale until the next successful run, which is why retry logic is mandatory, not optional.

Source · www.postgresql.org/docs/current/errcodes-appendix.html