PostgreSQL Error: 23000
What does this error mean?
SQLSTATE 23000 (integrity_constraint_violation) is the generic code of class 23, the class that covers all integrity constraint violations in PostgreSQL. In practice the server usually reports one of the more specific class 23 codes: 23505 (unique_violation), 23503 (foreign_key_violation), 23502 (not_null_violation), 23514 (check_violation), 23P01 (exclusion_violation), or 23001 (restrict_violation). Any of these means that an INSERT, UPDATE, or DELETE statement conflicted with a unique, foreign key, CHECK, NOT NULL, or exclusion constraint on the target table. In data pipelines (Azure Data Factory (ADF), dbt, Airflow, custom Python ETL), this error typically surfaces during the load phase and rolls back the enclosing transaction. The symptom you see: the pipeline run fails, the error log reports the violation (usually with the specific constraint name), and, if the batch runs as a single transaction, zero rows from that batch make it into the destination. Downstream tables that depend on this load remain stale until the constraint conflict is resolved and the pipeline is re-run.
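As a rough illustration of how a Python ETL job might tell these codes apart, the sketch below maps a SQLSTATE string to its condition name. The function name `describe_sqlstate` is hypothetical; the input is assumed to be the five-character code that the database driver exposes on its exception (psycopg2, for instance, calls it `pgcode`).

```python
# Class 23 codes and condition names from the PostgreSQL error codes appendix.
CLASS_23_CODES = {
    "23000": "integrity_constraint_violation",
    "23001": "restrict_violation",
    "23502": "not_null_violation",
    "23503": "foreign_key_violation",
    "23505": "unique_violation",
    "23514": "check_violation",
    "23P01": "exclusion_violation",
}


def describe_sqlstate(code):
    """Return the condition name for a class 23 SQLSTATE, or None for other classes.

    `describe_sqlstate` is a hypothetical helper, not part of any driver.
    """
    if not code.startswith("23"):
        return None
    # Unrecognised class 23 codes fall back to the generic class-level name.
    return CLASS_23_CODES.get(code, "integrity_constraint_violation")


print(describe_sqlstate("23505"))
print(describe_sqlstate("23503"))
```

Branching on the specific code lets a pipeline decide, for example, to retry with an upsert on a unique violation but fail fast on a NOT NULL violation.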
Common causes
- Duplicate rows in the source data or staging table being loaded without deduplication. This is common when a pipeline retries after a partial failure and reprocesses already-loaded rows.
- Foreign key reference to a parent row that does not exist yet because the parent table load ran after the child table load, or the parent row was deleted between extraction and loading.
- NOT NULL column receiving NULL values from the source because an upstream schema change added a new required field that the extraction query does not yet include.
- CHECK constraint violated by out-of-range values, for example a percentage column constrained to 0–100 receiving a raw count value from the source.
- Concurrent pipeline runs inserting the same natural key simultaneously, causing a race condition on a unique index that neither run anticipated.
- Exclusion constraint conflict (SQLSTATE 23P01), such as overlapping ranges under an EXCLUDE constraint. This is most likely with time-series data where range boundaries shift during a schema migration.
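Three of the causes above (duplicate keys, NULLs in required columns, and out-of-range values) can often be caught in the transformation layer before the load reaches PostgreSQL. The following is a minimal sketch of such a pre-load check, assuming the batch is held as a list of dicts. The function `find_violations` and the column names `order_id`, `customer_id`, and `discount_pct` are hypothetical.

```python
from collections import Counter


def find_violations(rows, key, required=(), ranges=None):
    """Return (row_index, problem) tuples for rows likely to violate a constraint.

    rows     -- list of dicts, one per staged row
    key      -- column expected to be unique within the batch
    required -- columns that must not be None (mirrors NOT NULL)
    ranges   -- {column: (low, high)} inclusive bounds (mirrors a simple CHECK)
    """
    ranges = ranges or {}
    problems = []
    key_counts = Counter(row.get(key) for row in rows)
    for index, row in enumerate(rows):
        if key_counts[row.get(key)] > 1:
            problems.append((index, f"duplicate key {key}={row.get(key)!r}"))
        for column in required:
            if row.get(column) is None:
                problems.append((index, f"NULL in required column {column}"))
        for column, (low, high) in ranges.items():
            value = row.get(column)
            # NULLs pass here, as they would pass a CHECK constraint.
            if value is not None and not (low <= value <= high):
                problems.append((index, f"{column}={value!r} outside [{low}, {high}]"))
    return problems


batch = [
    {"order_id": 1, "customer_id": 7, "discount_pct": 10},
    {"order_id": 1, "customer_id": None, "discount_pct": 250},
]
for index, problem in find_violations(
    batch, "order_id", required=("customer_id",), ranges={"discount_pct": (0, 100)}
):
    print(index, problem)
```

A check like this only sees the current batch. It cannot detect conflicts with rows already in the target table or with a concurrent run, so the database constraints remain the final safeguard.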
How to fix it
- Step 1: Identify the exact constraint. Run: SELECT conname, contype, pg_get_constraintdef(oid) FROM pg_constraint WHERE conrelid = 'your_table'::regclass; This shows all constraints on the target table with their definitions.
- Step 2: For unique constraint violations, add an upsert clause: INSERT INTO target_table (...) VALUES (...) ON CONFLICT (key_column) DO UPDATE SET col = EXCLUDED.col; Alternatively, use ON CONFLICT DO NOTHING if duplicates should be silently skipped.
- Step 3: For foreign key violations, verify parent data exists before loading child rows: SELECT parent_id FROM staging_child_table WHERE parent_id IS NOT NULL EXCEPT SELECT id FROM parent_table; Any results are parent keys referenced by the staged child rows but missing from the parent table, so those parents need to be loaded first.
- Step 4: For NOT NULL violations, add a COALESCE or default in your transformation layer: SELECT COALESCE(source_col, 'default_value') AS target_col FROM staging; Alternatively, fix the extraction query to include the missing column.
- Step 5: For CHECK constraint violations, query the offending rows: SELECT * FROM staging_table WHERE NOT (check_expression); Replace check_expression with the constraint definition from Step 1.
- Step 6: For concurrent load race conditions, serialize overlapping pipeline runs with an advisory lock taken in the same transaction as the INSERT: BEGIN; SELECT pg_advisory_xact_lock(hashtext('your_table_load')); INSERT ...; COMMIT; The lock is released automatically when the transaction commits or rolls back.
- Step 7: After fixing, validate with a dry run: BEGIN; INSERT INTO target_table SELECT * FROM staging_table; then inspect the results and issue ROLLBACK; This confirms the fix without committing data.
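The foreign key check in Step 3 can be tried out without a PostgreSQL server. The sketch below runs the anti-join (staged child keys EXCEPT existing parent keys) against an in-memory SQLite database, which is used here purely as a stand-in so that the example is self-contained. The query is plain SQL and is expected to behave the same way in PostgreSQL. The table names follow Step 3; the `child_id` column, the sample data, and the helper `missing_parent_ids` are hypothetical.

```python
import sqlite3


def missing_parent_ids(conn):
    """Return parent keys referenced by staged child rows but absent from the parent table."""
    query = """
        SELECT parent_id FROM staging_child_table WHERE parent_id IS NOT NULL
        EXCEPT
        SELECT id FROM parent_table
    """
    return sorted(row[0] for row in conn.execute(query))


# SQLite is an assumption made for the demo only; a real pipeline would run
# the same query through its PostgreSQL connection.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE parent_table (id INTEGER PRIMARY KEY);
    CREATE TABLE staging_child_table (child_id INTEGER, parent_id INTEGER);
    INSERT INTO parent_table (id) VALUES (1), (2);
    INSERT INTO staging_child_table (child_id, parent_id)
        VALUES (10, 1), (11, 3), (12, 3);
    """
)

missing = missing_parent_ids(conn)
print(missing)  # [3]: parent 3 must be loaded before the child rows
```

An empty result means every staged child row has a parent, so the child load should not fail on the foreign key.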
Example log output
ERROR: duplicate key value violates unique constraint "orders_order_id_key"
DETAIL: Key (order_id)=(10042) already exists.
STATEMENT: INSERT INTO orders (order_id, customer_id, amount) VALUES ($1, $2, $3)
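For alerting or triage, a pipeline might extract the constraint name and the conflicting key from log text like the above. The sketch below handles only the unique-violation wording shown in this example; other violation types use different message text and are not covered. The function name `parse_unique_violation` is hypothetical.

```python
import re

# Patterns written against the message format shown in the example log only.
CONSTRAINT_RE = re.compile(r'violates unique constraint "([^"]+)"')
DETAIL_RE = re.compile(r"Key \((?P<cols>[^)]*)\)=\((?P<vals>.*)\) already exists\.")


def parse_unique_violation(log_text):
    """Extract constraint name, key columns, and raw key values, or return None."""
    constraint = CONSTRAINT_RE.search(log_text)
    if constraint is None:
        return None
    detail = DETAIL_RE.search(log_text)
    return {
        "constraint": constraint.group(1),
        "columns": detail.group("cols").split(", ") if detail else [],
        # Values stay as one raw string because they may themselves contain commas.
        "values": detail.group("vals") if detail else None,
    }


log_text = (
    'ERROR: duplicate key value violates unique constraint "orders_order_id_key"\n'
    "DETAIL: Key (order_id)=(10042) already exists.\n"
)
parsed = parse_unique_violation(log_text)
print(parsed)
```

Where the driver exposes structured error fields, those are generally a more robust source than parsing message text, which can vary with server version and locale.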