
Delta MERGE From Multiple Source Tables Fails Because UNION ALL Isn't Enough

A UNION ALL in the USING clause looks correct until two source tables contribute a row for the same key. Delta rejects the ambiguity outright.

UNION ALL in USING passes the parser but not the executor

The pattern seems straightforward. You have an orders table in a bronze layer and a corrections table from a downstream system. Both contain rows keyed on order_id. You combine them in the USING clause:

```sql
MERGE INTO gold.orders AS target
USING (
  SELECT * FROM bronze.orders
  UNION ALL
  SELECT * FROM bronze.corrections
) AS source
ON target.order_id = source.order_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
```

This compiles. Spark builds the plan. Then execution hits a wall: DELTA_MULTIPLE_SOURCE_ROW_MATCHING_TARGET_ROW_IN_MERGE. The error surfaces when more than one source row matches the same target row on the ON condition and a WHEN MATCHED clause is present.

Delta's MERGE follows SQL standard semantics. If order_id = 1001 appears in both bronze.orders and bronze.corrections, Spark cannot decide which source row should supply the UPDATE values. Rather than picking arbitrarily — which would make results non-deterministic — it throws the error.

This is not a bug. It is the engine enforcing a contract: every matched target row must correspond to exactly one source row. The confusion comes from the fact that the UNION ALL itself is perfectly legal SQL. Nothing in the parser warns you about the downstream collision. You only discover it at runtime, after Spark has already shuffled and joined the data, which on large tables means you've burned cluster time before seeing the failure.
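The non-determinism is easy to see outside Spark. Here is a toy pure-Python model (hypothetical row values, not Delta's implementation): a naive last-writer-wins update gives a different answer depending on which source row is applied first.

```python
# Hypothetical rows: two sources contribute a row for merge key 1001.
source_a = {"order_id": 1001, "status": "shipped"}    # from bronze.orders
source_b = {"order_id": 1001, "status": "cancelled"}  # from bronze.corrections

def apply_updates(target, source_rows):
    """Naively apply each matching source row as an UPDATE; last writer wins."""
    for row in source_rows:
        if row["order_id"] == target["order_id"]:
            target = {**target, **row}
    return target

target = {"order_id": 1001, "status": "pending"}
result_1 = apply_updates(target, [source_a, source_b])
result_2 = apply_updates(target, [source_b, source_a])
# The two orderings disagree, so there is no single "correct" result --
# which is exactly the ambiguity Delta refuses to resolve arbitrarily.
```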

The error message itself is somewhat opaque. It tells you multiple source rows matched but doesn't identify which keys are duplicated across your sources. On a 50-million-row combined source, that leaves you hunting.

Deduplicate before MERGE, not inside the WHEN clause

The fix is a CTE or subquery that collapses the combined source to one row per key before MERGE ever sees it. The most reliable pattern uses ROW_NUMBER():

```sql
WITH combined AS (
  SELECT *, 'orders' AS _source FROM bronze.orders
  UNION ALL
  SELECT *, 'corrections' AS _source FROM bronze.corrections
),
deduped AS (
  -- Drop the helper columns so UPDATE SET * / INSERT * match the target schema
  SELECT * EXCEPT (_source, rn) FROM (
    SELECT *,
      ROW_NUMBER() OVER (
        PARTITION BY order_id
        ORDER BY
          CASE _source WHEN 'corrections' THEN 1 WHEN 'orders' THEN 2 END,
          updated_at DESC
      ) AS rn
    FROM combined
  )
  WHERE rn = 1
)
MERGE INTO gold.orders AS target
USING deduped AS source
ON target.order_id = source.order_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
```

The _source tag and the CASE expression establish precedence: corrections override orders. Within the same source, the most recent updated_at wins. The outer filter WHERE rn = 1 guarantees one row per order_id.
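The precedence logic is worth internalizing, so here is a pure-Python sketch of what the window function computes (illustrative only; in the pipeline this runs as the SQL above, and the row values are invented):

```python
# Lower precedence value sorts first, mirroring the CASE expression
SOURCE_PRECEDENCE = {"corrections": 1, "orders": 2}

def dedupe(rows):
    """Keep one row per order_id: corrections beat orders, then latest updated_at."""
    best = {}
    for row in rows:
        key = row["order_id"]
        # Mirrors ORDER BY CASE _source ... END, updated_at DESC
        rank = (SOURCE_PRECEDENCE[row["_source"]], -row["updated_at"])
        if key not in best or rank < best[key][0]:
            best[key] = (rank, row)
    return [row for _, row in best.values()]

rows = [
    {"order_id": 1, "_source": "orders", "updated_at": 10},
    {"order_id": 1, "_source": "corrections", "updated_at": 5},
    {"order_id": 2, "_source": "orders", "updated_at": 7},
]
deduped = dedupe(rows)
# order_id 1 keeps the corrections row even though its timestamp is older:
# source precedence is evaluated before recency.
```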

This approach has a cost. The ROW_NUMBER window function forces a shuffle on the partition key. On a 100-million-row combined source, that's a full data exchange across the cluster. Two things help: first, if your sources are already partitioned or Z-ordered on the merge key, Spark can use a sort-merge strategy instead of a hash shuffle. Second, you can filter each source to only include recent changes (using _commit_version from Delta's change data feed or a watermark column) so the combined dataset is small relative to the target.
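The watermark filter from the second point is conceptually simple; a minimal sketch, assuming a `updated_at` watermark column (rows and cutoff value are hypothetical):

```python
def filter_since(rows, watermark):
    """Keep only rows changed since the last successful merge."""
    return [r for r in rows if r["updated_at"] > watermark]

orders = [{"order_id": 1, "updated_at": 90}, {"order_id": 2, "updated_at": 110}]
corrections = [{"order_id": 3, "updated_at": 120}]
last_merge_watermark = 100  # persisted after the previous successful merge

# Only rows newer than the watermark reach the dedup + MERGE step,
# keeping the combined source small relative to the target.
combined = (filter_since(orders, last_merge_watermark)
            + filter_since(corrections, last_merge_watermark))
```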

A tempting alternative — using QUALIFY instead of the subquery — works on Databricks SQL warehouses but not on all Databricks Runtime versions for notebooks. Stick with the subquery pattern for portability.

[Diagram] Delta MERGE with multi-source deduplication: UNION ALL the source tables into a CTE, tag each row with a _source identifier, apply ROW_NUMBER() OVER (PARTITION BY merge_key ORDER BY source precedence), filter WHERE rn = 1, then pass the deduplicated result to the MERGE INTO USING clause for a single atomic commit to the target Delta table.

Sequential MERGE per source creates worse problems than duplicates

When engineers hit the multiple-match error, a common reaction is to split the operation: run one MERGE for bronze.orders, then a second MERGE for bronze.corrections. This avoids the ambiguity error because each MERGE has a single source. It also introduces two problems that are harder to debug than the original.

First, each MERGE rewrites the target table's affected files. Delta's MERGE implementation reads the target, identifies matching files, and writes new versions of those files with the updates applied. Two sequential MERGEs that touch overlapping key ranges will each rewrite the same files. On a target table with 500 Parquet files, the first MERGE might rewrite 120 of them. The second MERGE rewrites another 80 — some overlapping with the first batch. You've now done roughly 200 file rewrites instead of the 130 a single MERGE would require. For large tables, this doubles write amplification and transaction log entries.
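The file-rewrite arithmetic can be checked with a quick sketch; the file-ID ranges are invented to match the illustrative counts above:

```python
# Invented file-ID sets: the first MERGE rewrites 120 files, the second
# rewrites 80, and 70 of those overlap with the first batch.
merge_1_files = set(range(0, 120))   # 120 files
merge_2_files = set(range(50, 130))  # 80 files, 70 shared with merge 1

sequential_rewrites = len(merge_1_files) + len(merge_2_files)
single_merge_rewrites = len(merge_1_files | merge_2_files)
# Sequential MERGEs rewrite 200 files; a single MERGE rewrites only the
# union of affected files, 130.
```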

Second, there is a concurrency hazard. If another job writes to the same target table between your two MERGEs, the second MERGE may fail with a ConcurrentAppendException or ConcurrentDeleteReadException, depending on isolation level. Even with serializable isolation, the second MERGE must retry against the new table version, adding latency.

The sequential approach also breaks atomicity. If the first MERGE succeeds and the second fails — say, because of a schema mismatch in bronze.corrections — your target table is in a half-updated state. With the single CTE-based MERGE, either the entire upsert commits or nothing does.

There's a third issue specific to Databricks jobs: each MERGE is a separate Spark action, which means a separate stage in the Spark UI. Debugging is harder because you're correlating two stages with two different sets of metrics, shuffle reads, and spill statistics. A single MERGE gives you one stage to inspect.

Diagnosing which keys collide across sources

Before applying the ROW_NUMBER fix, you should verify which keys actually collide. This avoids over-engineering a deduplication step that might mask upstream data quality issues.

Run this diagnostic query against your combined source:

```sql
WITH combined AS (
  SELECT order_id, 'orders' AS _source FROM bronze.orders
  UNION ALL
  SELECT order_id, 'corrections' AS _source FROM bronze.corrections
)
SELECT
  order_id,
  COUNT(*) AS row_count,
  COUNT(DISTINCT _source) AS source_count,
  COLLECT_SET(_source) AS sources
FROM combined
GROUP BY order_id
HAVING COUNT(*) > 1
ORDER BY row_count DESC
LIMIT 50
```

This tells you three things: which keys are duplicated, how many rows each key has, and whether duplicates come from the same source or across sources. If duplicates are within a single source, the problem is upstream — your bronze table has duplicate records, and the UNION ALL is incidental. If duplicates span sources, you need the precedence-based deduplication.
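The same grouping logic, modeled in pure Python for reasoning about the output (row values are hypothetical; in practice you run the SQL above against the real tables):

```python
from collections import defaultdict

def find_collisions(rows):
    """Return {key: (row_count, sources)} for keys appearing more than once."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["order_id"]].append(row["_source"])
    return {
        key: (len(sources), set(sources))
        for key, sources in groups.items()
        if len(sources) > 1
    }

rows = [
    {"order_id": 1, "_source": "orders"},
    {"order_id": 1, "_source": "corrections"},  # cross-source collision
    {"order_id": 2, "_source": "orders"},
    {"order_id": 2, "_source": "orders"},       # duplicate within one source
    {"order_id": 3, "_source": "orders"},
]
collisions = find_collisions(rows)
# key 1 spans two sources (needs precedence-based dedup);
# key 2 duplicates within one source (an upstream data quality issue)
```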

Another diagnostic worth running: check whether the collisions are stable or growing. Run the same query against successive micro-batches or daily loads. If the collision count grows linearly with load volume, your sources have overlapping scopes by design — for example, a corrections feed that replays the original record alongside the correction. If collisions are sporadic, you may have an intermittent upstream issue like a CDC replay or a source system re-emitting records after a recovery.

For streaming MERGE (using foreachBatch), the diagnostic is the same but you apply it inside the batch function. Log the collision count as a custom Spark metric so you can track it over time without re-running ad hoc queries.

Streaming foreachBatch needs the same deduplication guard

If you're running a structured streaming job that merges from multiple sources into a Delta table using foreachBatch, the same error applies — but the failure mode is worse. A batch job that fails can be restarted manually. A streaming job that fails on batch N will restart, reprocess the same micro-batch, hit the same collision, and fail again in a loop.

The pattern inside foreachBatch looks like this in PySpark:

```python
def upsert_to_gold(batch_df, batch_id):
    batch_df.createOrReplaceTempView("source_batch")
    # Use the micro-batch's own session so the temp view resolves correctly
    batch_df.sparkSession.sql("""
        WITH deduped AS (
          SELECT *,
            ROW_NUMBER() OVER (
              PARTITION BY order_id
              ORDER BY updated_at DESC
            ) AS rn
          FROM source_batch
        )
        MERGE INTO gold.orders AS target
        USING (SELECT * EXCEPT (rn) FROM deduped WHERE rn = 1) AS source
        ON target.order_id = source.order_id
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
    """)
```

The batch_df here might already be the result of reading multiple sources joined or unioned upstream in the streaming graph. The deduplication inside foreachBatch is a safety net, not a replacement for fixing upstream duplication.

One subtlety: if you use spark.conf.set("spark.databricks.delta.merge.repartitionBeforeWrite.enabled", "true"), the MERGE will repartition output files to match the target's partitioning scheme. This is helpful for preventing small files but adds shuffle overhead on top of the ROW_NUMBER shuffle. On small batches (under 1 million rows), the overhead is negligible. On larger batches, consider whether you need both.

MetricSign detects Databricks job failures with the specific error context — including the DELTA_MULTIPLE_SOURCE_ROW_MATCHING_TARGET_ROW_IN_MERGE error class — and groups recurring failures on the same job, so you can distinguish a one-time data collision from a systemic upstream issue before the streaming lag compounds into hours of missing data.

DBR 16.1 relaxes the constraint but doesn't eliminate it

Databricks Runtime 16.1 introduced a narrower interpretation of the multiple-match rule. Previously, if two source rows matched the same target row on the ON clause, MERGE would fail even if only one of those source rows satisfied the WHEN MATCHED condition. DBR 16.1 now allows the MERGE to proceed as long as exactly one source row passes the full WHEN MATCHED predicate.

This means you can write:

```sql
MERGE INTO gold.orders AS target
USING combined_source AS source
ON target.order_id = source.order_id
WHEN MATCHED AND source._source = 'corrections' THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
```

If order_id = 1001 appears in both the orders and corrections sources, but only the corrections row satisfies _source = 'corrections', the MERGE succeeds on DBR 16.1+. On older runtimes, it still fails.

This is useful but limited. It only works when you can express precedence as a filter in the WHEN MATCHED clause. If both source rows have _source = 'corrections' — because the corrections feed itself contains duplicates — you're back to the original error. The ROW_NUMBER approach handles both cases.
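A toy model of the relaxed rule makes the limitation concrete (illustrative only, not Delta's implementation; modeling "at most one matching source row may fire the clause" as the acceptance condition):

```python
def matched_rows_ok(matching_source_rows, when_matched_predicate):
    """A matched target row is acceptable iff at most one matching source
    row passes the full WHEN MATCHED predicate (zero means no clause fires)."""
    passing = [r for r in matching_source_rows if when_matched_predicate(r)]
    return len(passing) <= 1

is_correction = lambda r: r["_source"] == "corrections"

# One correction among the matches: MERGE proceeds on DBR 16.1+.
allowed = matched_rows_ok(
    [{"_source": "orders"}, {"_source": "corrections"}], is_correction)

# Two corrections for the same key: the predicate no longer disambiguates,
# and you are back to the original error -- ROW_NUMBER dedup still needed.
rejected = matched_rows_ok(
    [{"_source": "corrections"}, {"_source": "corrections"}], is_correction)
```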

There's also a portability concern. If your Databricks workspace runs mixed runtime versions across clusters — perhaps the interactive cluster is on DBR 15.4 while the job cluster is on DBR 16.2 — the same notebook succeeds on one and fails on the other. The CTE deduplication pattern works identically on every runtime version from DBR 10 onward, making it the safer default for production pipelines that need to survive cluster upgrades and workspace migrations.
