Medium severity · data integrity

SQL Server Error: 2627

Impact

When the pipeline fails mid-batch, the target table is left in a partial state — some new rows are written, others are not. Downstream reports read incomplete data until the pipeline succeeds, and any SLA-bound refresh will miss its window. If the pipeline has no alerting, the stale data may go undetected for hours.

A 2627 failure stops the copy activity and marks the pipeline run as Failed. The failing batch is not committed, but rows written in earlier batches of the same run already are, which leaves the target table in an inconsistent state. Downstream dbt models, Power BI imports, or Synapse views that depend on this table will either fail their own refresh or silently return stale data. If the pipeline has a retry policy, each retry hits the same duplicate and fails again until the root cause is fixed, consuming the retry quota and delaying recovery.
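For full-load tables, one way to keep a failed run from leaving a partial state is to wrap the reload in a single transaction. The following is a minimal T-SQL sketch under stated assumptions: the temp tables `#orders` and `#stg_orders` and their columns are hypothetical stand-ins for a real target and staging table.

```sql
-- Hypothetical stand-ins for a target table and its staging source.
CREATE TABLE #orders     (order_id INT NOT NULL PRIMARY KEY, amount DECIMAL(10, 2) NOT NULL);
CREATE TABLE #stg_orders (order_id INT NOT NULL,             amount DECIMAL(10, 2) NOT NULL);

INSERT INTO #orders     (order_id, amount) VALUES (10043, 25.00);
INSERT INTO #stg_orders (order_id, amount) VALUES (10043, 25.00), (10044, 12.50);

SET XACT_ABORT ON;  -- a run-time error dooms the whole transaction

BEGIN TRY
    BEGIN TRANSACTION;

    -- TRUNCATE TABLE is transactional in SQL Server, so it is undone on rollback.
    TRUNCATE TABLE #orders;

    INSERT INTO #orders (order_id, amount)
    SELECT order_id, amount
    FROM #stg_orders;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Undo both the truncate and any inserted rows, then surface the error.
    IF XACT_STATE() <> 0 ROLLBACK TRANSACTION;
    THROW;
END CATCH;

SELECT order_id, amount FROM #orders ORDER BY order_id;

DROP TABLE #orders;
DROP TABLE #stg_orders;
```

If the staging data contained a duplicate key, the INSERT would raise 2627, the CATCH block would roll back, and the target would keep its previous contents. Note that TRUNCATE TABLE is not allowed on tables referenced by a foreign key; DELETE is the usual substitute there.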

What does this error mean?

SQL Server error 2627 is raised when an INSERT or UPDATE tries to store a value that already exists in a column protected by a PRIMARY KEY or UNIQUE constraint. The engine terminates the statement and rolls back every row that statement tried to write; work committed by earlier statements or batches is not undone. In a data-pipeline context this typically surfaces when a copy activity runs without idempotency guards: a second pipeline execution attempts to re-insert rows that were already loaded in a previous run. The engineer sees the pipeline fail at the sink step, the error message includes the constraint name and the offending duplicate key value, and the target table is left in whatever partial state it was in at the point of failure.
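The statement-level behavior can be reproduced in a scratch session. This is a small illustrative T-SQL sketch; the temp table `#orders` and its columns are hypothetical, and the key value is borrowed from the example log on this page.

```sql
-- Hypothetical demo table.
CREATE TABLE #orders (
    order_id INT NOT NULL PRIMARY KEY,
    amount   DECIMAL(10, 2) NOT NULL
);

INSERT INTO #orders (order_id, amount) VALUES (10043, 25.00);

BEGIN TRY
    -- One new key (10044) and one existing key (10043) in the same statement.
    INSERT INTO #orders (order_id, amount)
    VALUES (10044, 12.50), (10043, 25.00);
END TRY
BEGIN CATCH
    -- Shows the error number and the message with the duplicate key value.
    SELECT ERROR_NUMBER() AS error_number, ERROR_MESSAGE() AS error_message;
END CATCH;

-- The failed statement inserted nothing, including the non-duplicate row 10044.
SELECT order_id, amount FROM #orders;

DROP TABLE #orders;
```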

Common causes

1. An ADF pipeline was triggered twice, manually or via a duplicate schedule, and the copy activity uses 'Insert' write behavior without a deduplication check. The second run attempts to insert the same primary key values that the first run already wrote.
2. Source data contains duplicate key values before they reach the target. A staging layer or upstream extraction did not deduplicate, so multiple rows carry the same business key and only the first one can be inserted.
3. A dbt incremental model is configured with `incremental_strategy='append'` instead of `incremental_strategy='merge'`. On each run it appends every selected source record rather than matching on the unique key and updating existing rows.
4. An ADF upsert is configured but the key columns in the sink settings do not match the actual primary key of the target table. The MERGE statement generated by ADF cannot match existing rows, so every row falls into the INSERT branch and collides with existing data.
5. A pipeline was retried after a partial failure mid-batch. The first attempt committed a subset of rows before failing; because the source query has no checkpoint or watermark, the retry starts from the beginning and tries to re-insert those rows.
6. A TRUNCATE + INSERT pattern was replaced with a pure INSERT to avoid locking, but the TRUNCATE step was removed without adding duplicate-safe write logic. Rows accumulate across runs until a primary key collision occurs.
7. Multiple pipelines write to the same target table concurrently, for example a full-load job and a delta job running in overlapping windows. Both read the same source rows and race to insert them, so one of them hits a duplicate key error.
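For the duplicate-source cause in the list above, a common approach is to detect repeated keys in staging and keep one row per key before loading. The T-SQL sketch below assumes a hypothetical staging table `#stg_orders` with an `updated_at` column, and assumes the latest row per key is the one to keep; a different tie-break rule may suit other data.

```sql
-- Hypothetical staging table with a repeated business key.
CREATE TABLE #stg_orders (
    order_id   INT NOT NULL,
    amount     DECIMAL(10, 2) NOT NULL,
    updated_at DATETIME2 NOT NULL
);

INSERT INTO #stg_orders (order_id, amount, updated_at) VALUES
    (10043, 25.00, '2024-01-01T08:00:00'),
    (10043, 27.50, '2024-01-01T09:00:00'),
    (10044, 12.50, '2024-01-01T08:30:00');

-- Keys that appear more than once in the source.
SELECT order_id, COUNT(*) AS cnt
FROM #stg_orders
GROUP BY order_id
HAVING COUNT(*) > 1;

-- Keep only the most recent row for each key.
WITH ranked AS (
    SELECT order_id, amount, updated_at,
           ROW_NUMBER() OVER (PARTITION BY order_id
                              ORDER BY updated_at DESC) AS rn
    FROM #stg_orders
)
SELECT order_id, amount, updated_at
FROM ranked
WHERE rn = 1;

DROP TABLE #stg_orders;
```

The deduplicated result set can then feed the INSERT or MERGE that writes to the target.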

How to fix it

Step 1: Identify the duplicate key. The error message contains the constraint name and the offending value. Confirm which keys are duplicated in the source or staging data (the constraint prevents duplicates in the target itself): SELECT <key_columns>, COUNT(*) AS cnt FROM <source_or_staging_table> GROUP BY <key_columns> HAVING COUNT(*) > 1; Note the key value from the error message and cross-reference it with the source query.
Step 2: For ADF Copy Activity, open the sink tab, change 'Write behavior' from 'Insert' to 'Upsert', and add the key columns (e.g. id or a composite business key) in the 'Key columns' field. ADF will generate a MERGE statement that updates existing rows and inserts new ones.
Step 3: For dbt incremental models, set incremental_strategy to merge in the config block and specify unique_key. Example: {{ config(materialized='incremental', unique_key='order_id', incremental_strategy='merge') }}. Run dbt run --full-refresh once to rebuild the table cleanly; subsequent runs will merge.
Step 4: For custom T-SQL pipelines, replace the bare INSERT with an idempotent pattern: MERGE target_table AS tgt USING source_cte AS src ON tgt.id = src.id WHEN MATCHED THEN UPDATE SET tgt.col1 = src.col1, tgt.col2 = src.col2 WHEN NOT MATCHED THEN INSERT (id, col1, col2) VALUES (src.id, src.col1, src.col2); The statement is atomic and safe to re-run. If several writers can hit the table at once, consider adding WITH (HOLDLOCK) to the target, and make sure the source has one row per key, because MERGE fails when a target row matches more than one source row.
Step 5: If the load is insert-only (no updates needed), use INSERT ... WHERE NOT EXISTS: INSERT INTO target (id, col1, col2) SELECT s.id, s.col1, s.col2 FROM source AS s WHERE NOT EXISTS (SELECT 1 FROM target AS t WHERE t.id = s.id); This skips rows that already exist instead of failing.
Step 6: Add a watermark or checkpoint to delta pipelines to avoid re-processing already-loaded data. Store the last successful high-watermark (e.g. max updated_at) in a control table and filter source rows: WHERE updated_at > (SELECT last_watermark FROM pipeline_control WHERE pipeline_name = 'my_pipeline').
Step 7: If the pipeline is retrying after a partial load, check whether the target table needs to be cleaned before the retry: DELETE FROM target WHERE batch_id = '<failed_batch_id>'; then re-run. For full-load pipelines, switch to a TRUNCATE + INSERT pattern inside a transaction so partial states never persist.
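The watermark step and the MERGE step above can be combined so that the data load and the watermark update succeed or fail together. What follows is a hedged T-SQL sketch, not a definitive implementation: the temp tables `#pipeline_control`, `#source`, and `#target`, their columns, and the sample rows are all assumptions made for illustration.

```sql
-- Hypothetical control, source, and target tables.
CREATE TABLE #pipeline_control (pipeline_name NVARCHAR(128) NOT NULL PRIMARY KEY,
                                last_watermark DATETIME2 NOT NULL);
CREATE TABLE #source (id INT NOT NULL, col1 NVARCHAR(50) NOT NULL, updated_at DATETIME2 NOT NULL);
CREATE TABLE #target (id INT NOT NULL PRIMARY KEY, col1 NVARCHAR(50) NOT NULL, updated_at DATETIME2 NOT NULL);

INSERT INTO #pipeline_control (pipeline_name, last_watermark)
VALUES (N'my_pipeline', '2024-01-01T00:00:00');
INSERT INTO #target (id, col1, updated_at) VALUES (1, N'a', '2023-12-31T23:00:00');
INSERT INTO #source (id, col1, updated_at) VALUES
    (1, N'a',  '2023-12-31T23:00:00'),   -- already loaded, below the watermark
    (2, N'b',  '2024-01-02T10:00:00'),   -- new row
    (1, N'a2', '2024-01-03T09:00:00');   -- later change to an existing key

SET XACT_ABORT ON;
BEGIN TRANSACTION;

DECLARE @old_wm DATETIME2 = (SELECT last_watermark FROM #pipeline_control
                             WHERE pipeline_name = N'my_pipeline');
DECLARE @new_wm DATETIME2 = (SELECT MAX(updated_at) FROM #source
                             WHERE updated_at > @old_wm);

-- Upsert only the rows inside the (old, new] watermark window.
MERGE #target WITH (HOLDLOCK) AS tgt
USING (SELECT id, col1, updated_at
       FROM #source
       WHERE updated_at > @old_wm AND updated_at <= @new_wm) AS src
    ON tgt.id = src.id
WHEN MATCHED THEN
    UPDATE SET tgt.col1 = src.col1, tgt.updated_at = src.updated_at
WHEN NOT MATCHED BY TARGET THEN
    INSERT (id, col1, updated_at) VALUES (src.id, src.col1, src.updated_at);

-- Advance the watermark only when there was something to load.
UPDATE #pipeline_control
SET last_watermark = @new_wm
WHERE pipeline_name = N'my_pipeline' AND @new_wm IS NOT NULL;

COMMIT TRANSACTION;

SELECT id, col1, updated_at FROM #target ORDER BY id;

DROP TABLE #pipeline_control;
DROP TABLE #source;
DROP TABLE #target;
```

Because the watermark moves in the same transaction as the MERGE, a failure leaves both unchanged and a retry reprocesses the same window. The sketch assumes at most one source row per key inside the window; if a key can change several times between runs, deduplicate the window first.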

Example log output

ErrorCode=UserErrorSqlException,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=SQL operation failed with the following error: 'Violation of PRIMARY KEY constraint 'PK_orders'. Cannot insert duplicate key in object 'dbo.orders'. The duplicate key value is (10043).',Source=Microsoft.DataTransfer.ClientLibrary

Frequently asked questions

What is the difference between SQL Server error 2627 and error 2601?

Error 2627 is raised when a PRIMARY KEY or UNIQUE constraint is violated — constraints are declared with CREATE TABLE or ALTER TABLE ADD CONSTRAINT. Error 2601 is raised when a unique index (created with CREATE UNIQUE INDEX, not a constraint) is violated. Both indicate a duplicate key, but 2627 means the enforcement is at the constraint level. In practice the fix is the same: deduplicate the source or switch to MERGE/upsert logic.
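The distinction can be seen side by side in a scratch session. In this illustrative T-SQL sketch the temp table `#customers`, its columns, and the index name are hypothetical; the primary key is a constraint and the email rule is a plain unique index.

```sql
-- Hypothetical table: a PRIMARY KEY constraint plus a separate unique index.
CREATE TABLE #customers (
    customer_id INT NOT NULL PRIMARY KEY,
    email       NVARCHAR(200) NOT NULL
);
CREATE UNIQUE INDEX UX_customers_email ON #customers (email);

INSERT INTO #customers (customer_id, email) VALUES (1, N'a@example.com');

BEGIN TRY
    -- Duplicate customer_id: violates the PRIMARY KEY constraint.
    INSERT INTO #customers (customer_id, email) VALUES (1, N'b@example.com');
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS error_number;  -- reports 2627
END CATCH;

BEGIN TRY
    -- Duplicate email: violates the unique index.
    INSERT INTO #customers (customer_id, email) VALUES (2, N'a@example.com');
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS error_number;  -- reports 2601
END CATCH;

DROP TABLE #customers;
```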

How do I fix error 2627 in an ADF copy activity?

Open the copy activity, go to the Sink tab, and change 'Write behavior' from 'Insert' to 'Upsert'. In the 'Key columns' field, enter the column name(s) that form the primary key of the target table (e.g. id). ADF will generate a MERGE statement on those columns so existing rows are updated and new rows are inserted without collisions.

Can I make SQL Server ignore error 2627 and continue inserting other rows?

Not with a plain INSERT statement: 2627 aborts the entire statement. If you want to skip duplicates instead of failing, use INSERT ... SELECT ... WHERE NOT EXISTS (...) to filter out already-existing rows before inserting. Alternatively, set IGNORE_DUP_KEY = ON on the unique index, or in the index options of the PRIMARY KEY or UNIQUE constraint, so the engine discards duplicate rows with a warning ("Duplicate key was ignored.") rather than raising an error. Use this option with care, because it silently drops data that would otherwise have flagged an upstream problem.
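The IGNORE_DUP_KEY option can be tried on a scratch table. This is a minimal T-SQL sketch; the temp table `#events`, its columns, and the index name are hypothetical.

```sql
-- Hypothetical table with a unique index that ignores duplicate keys.
CREATE TABLE #events (
    event_id INT NOT NULL,
    payload  NVARCHAR(100) NOT NULL
);
CREATE UNIQUE CLUSTERED INDEX UX_events_id
    ON #events (event_id)
    WITH (IGNORE_DUP_KEY = ON);

INSERT INTO #events (event_id, payload) VALUES (1, N'first');

-- event_id 1 already exists: it is discarded with a warning,
-- while event_id 2 from the same statement is still inserted.
INSERT INTO #events (event_id, payload)
VALUES (1, N'duplicate'), (2, N'second');

-- The original row for event_id 1 is unchanged.
SELECT event_id, payload FROM #events ORDER BY event_id;

DROP TABLE #events;
```

The option only skips the insert; it never updates the existing row, so it suits insert-only loads where the first version of a row is the one to keep.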

Error 2627 appears only on retries — why does the first run succeed?

The first run inserts the rows successfully. When the pipeline is retried — due to a transient failure, a manual re-trigger, or a duplicate schedule — the same rows are presented to the sink again. Because the target already contains them, the second INSERT hits the constraint. The fix is to make the pipeline idempotent: use MERGE so re-runs update rather than re-insert, or add a watermark so the source query only returns rows not yet loaded.
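A retry after a partial load can be simulated to check that a guarded insert is safe to repeat. The T-SQL sketch below is illustrative only: `#source` and `#target` are hypothetical tables, and the target starts with one row to mimic what a failed first attempt might have committed.

```sql
-- Hypothetical source and target tables.
CREATE TABLE #source (id INT NOT NULL, col1 NVARCHAR(50) NOT NULL);
CREATE TABLE #target (id INT NOT NULL PRIMARY KEY, col1 NVARCHAR(50) NOT NULL);

INSERT INTO #source (id, col1) VALUES (1, N'a'), (2, N'b'), (3, N'c');

-- State left behind by a failed first attempt: only row 1 was committed.
INSERT INTO #target (id, col1) VALUES (1, N'a');

-- Retry from the beginning: rows already present are skipped.
INSERT INTO #target (id, col1)
SELECT s.id, s.col1
FROM #source AS s
WHERE NOT EXISTS (SELECT 1 FROM #target AS t WHERE t.id = s.id);

-- Running the same load again inserts nothing and raises no error.
INSERT INTO #target (id, col1)
SELECT s.id, s.col1
FROM #source AS s
WHERE NOT EXISTS (SELECT 1 FROM #target AS t WHERE t.id = s.id);

SELECT COUNT(*) AS row_count FROM #target;  -- one row per source key

DROP TABLE #source;
DROP TABLE #target;
```

This guard covers single-writer retries. With concurrent writers two sessions can still pass the NOT EXISTS check at the same time, so serializing the pipelines or using a MERGE with a HOLDLOCK hint may be needed.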

Source · learn.microsoft.com/en-us/sql/relational-databases/errors-events/mssqlserver-2627-database-engine-error