Medium severitydata integrity
SQL Server Error:
2601
What does this error mean?
SQL Server throws error 2601 when an INSERT attempts to write a row whose key value already exists in a unique index — not necessarily a PRIMARY KEY constraint, but any index created with UNIQUE. In a data pipeline context this surfaces when an ADF copy activity, SSIS package, or custom ETL script tries to load rows that were already committed in a previous run, or when the source dataset contains duplicates within the same batch. The pipeline activity fails immediately at the first violation; rows processed before the offending record may already be committed depending on transaction scope. The error message includes the table name, index name, and the duplicate key value, which is the starting point for diagnosis.
Common causes
- 1Pipeline retry after partial success: an ADF pipeline failed at row 50,000 of 100,000 and was re-triggered from the start without truncating the target table first. Rows 1–49,999 were already committed, so the re-run collides on the first batch.
- 2Source system sends duplicate records within a single extract: a CDC feed, API response, or flat-file export contains the same business key twice — common after source-side replication lag or manual data corrections that generate two versions of the same record.
- 3Missing deduplication in a staging-to-target transform: a SQL transform or dbt model joins multiple source tables and produces fan-out rows that share the same unique key in the destination.
- 4Concurrent pipeline runs writing to the same table: two ADF triggers fire simultaneously (e.g. a scheduled run overlaps with a manual re-run), and both attempt to insert the same keys in parallel.
- 5Incremental load window overlap: a watermark-based incremental load uses a non-exclusive upper bound (BETWEEN or <=) and re-reads rows that were already loaded in the previous window, causing duplicates on the next run.
- 6Schema mismatch after index rebuild: a unique index was added to a column that previously had duplicates cleaned up manually; subsequent loads fail because the source still produces those duplicates.
- 7MERGE statement logic error: a custom MERGE used as an upsert has a WHEN NOT MATCHED branch that fires for rows already inserted earlier in the same batch, due to a missing GROUP BY or incorrect join condition on the source.
How to fix it
- 1Step 1 — Read the error message precisely. It contains the table, index name, and duplicate value: 'Cannot insert duplicate key row in object 'dbo.Orders' with unique index 'IX_Orders_ExternalId'. Duplicate key value is (EXT-9921).' Note the index name and key value — you need both for the next steps.
- 2Step 2 — Find which columns the index covers: SELECT c.name, ic.key_ordinal FROM sys.index_columns ic JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id WHERE ic.object_id = OBJECT_ID('dbo.Orders') AND ic.index_id = (SELECT index_id FROM sys.indexes WHERE object_id = OBJECT_ID('dbo.Orders') AND name = 'IX_Orders_ExternalId');
- 3Step 3 — Find duplicates in the target to understand scope: SELECT ExternalId, COUNT(*) AS cnt FROM dbo.Orders GROUP BY ExternalId HAVING COUNT(*) > 1 ORDER BY cnt DESC; If this returns rows, the table already has inconsistencies from a prior partial load — decide whether to truncate or deduplicate before retrying.
- 4Step 4 — Deduplicate the source before loading. Use a CTE with ROW_NUMBER() to keep only one row per key: WITH deduped AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY ExternalId ORDER BY UpdatedAt DESC) AS rn FROM staging.Orders) INSERT INTO dbo.Orders SELECT <columns> FROM deduped WHERE rn = 1;
- 5Step 5 — Switch to MERGE/upsert for idempotent loads. In ADF: open the Copy Data sink, set Write method to 'Upsert', and specify the unique index columns as Key columns. ADF generates a MERGE statement that updates existing rows and inserts new ones, making retries safe without truncation.
- 6Step 6 — If using a custom MERGE statement, ensure the source CTE is pre-deduplicated before the MERGE: a MERGE that receives duplicate source rows for the same target key will throw error 2601 even with a correct WHEN MATCHED branch. Add: WITH src AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY ExternalId ORDER BY UpdatedAt DESC) AS rn FROM staging.Orders) ... WHERE rn = 1.
- 7Step 7 — Fix the retry strategy for the pipeline. Either (a) add a truncate activity before the copy activity so retries always start clean, or (b) implement a proper watermark with an exclusive upper bound (< @watermark instead of <= @watermark) to prevent window overlap on incremental loads.
Example log output
ErrorCode=UserErrorInvalidData,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error happened when loading data into SQL Database. 'Cannot insert duplicate key row in object 'dbo.Orders' with unique index 'IX_Orders_ExternalId'. The duplicate key value is (EXT-9921).',Source=Microsoft.DataTransfer.ClientLibrary