Severity: Medium | Category: Data flow
Power BI Refresh Error: DF-Executor-BroadcastFailure
What does this error mean?
A Spark broadcast join failed because the dataset being broadcast is too large for the cluster's available memory. In a broadcast join, Spark copies one dataset to every executor node — if the dataset exceeds available memory, the broadcast fails.
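To make the mechanism concrete, here is a minimal pure-Python sketch of what a broadcast hash join does (illustrative only, not ADF or Spark internals; all names are hypothetical): the small side is built into a hash table, a copy is shipped to every executor, and each executor probes it against its own partition of the large side.

```python
# Illustrative sketch of a broadcast hash join (not Spark code):
# the "broadcast" side is materialized as a hash table and copied to
# every executor, which probes it locally against its own partition.

def broadcast_hash_join(partitions, small_side, key=0):
    """Join each partition of the large side against a broadcast copy
    of the small side. This only works if small_side fits in each
    worker's memory -- when it does not, the broadcast fails."""
    # Build the hash table once; Spark would ship a copy to every executor.
    broadcast = {}
    for row in small_side:
        broadcast.setdefault(row[key], []).append(row)

    joined = []
    for partition in partitions:  # one partition per "executor"
        for row in partition:
            for match in broadcast.get(row[key], []):
                joined.append(row + match[1:])
    return joined

# Large side split into two partitions; small side broadcast to both.
orders = [[(1, "laptop"), (2, "mouse")], [(1, "dock")]]
customers = [(1, "Ada"), (2, "Bob")]
print(broadcast_hash_join(orders, customers))
```

Because every executor holds a full copy of the broadcast table, memory cost scales with the size of the broadcast side times the number of executors, which is why oversized datasets trigger this failure.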
Common causes
1. The dataset on one side of a join transformation exceeds the memory available per executor for broadcast; as a rule of thumb, anything much larger than about 500 MB per executor is likely to fail
2. The Broadcast option is set to 'Auto' and Spark underestimates the dataset's size, so it attempts to broadcast a dataset that is too large
3. The Azure IR compute type is too small to hold the broadcast dataset in memory alongside the main data stream
4. Both sides of the join are set to broadcast, doubling the memory pressure
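Cause 2 comes down to size estimation. The hypothetical check below (sample a few rows, extrapolate to the full row count, compare against a per-executor budget) is a rough sketch of that kind of 'Auto' decision; the 500 MB budget and all function names are illustrative, not Spark's actual estimator.

```python
import sys

# Hypothetical size estimator in the spirit of an 'Auto' broadcast
# decision: sample a few rows, extrapolate to the full row count, and
# compare against a per-executor budget.

def estimate_broadcast_bytes(sample_rows, total_rows):
    """Extrapolate total size from a sample. Wide or skewed rows that
    the sample misses make this underestimate -- which is exactly how
    an 'Auto' broadcast ends up shipping a dataset that is too large."""
    if not sample_rows:
        return 0
    avg = sum(sys.getsizeof(r) for r in sample_rows) / len(sample_rows)
    return int(avg * total_rows)

BUDGET = 500 * 1024 * 1024  # illustrative per-executor broadcast budget

def safe_to_broadcast(sample_rows, total_rows, budget=BUDGET):
    return estimate_broadcast_bytes(sample_rows, total_rows) < budget

sample = ["id,region"] * 100
print(safe_to_broadcast(sample, 1_000))           # small table: fine to broadcast
print(safe_to_broadcast(sample, 1_000_000_000))   # huge table: do not broadcast
```

Setting the Broadcast option to 'Off' or 'Fixed' sidesteps this estimation entirely, which is why those are the first fixes below.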
How to fix it
1. Open the failing join transformation in ADF Studio and set the Broadcast option to 'Off' to disable the broadcast join entirely.
2. Alternatively, set the Broadcast option to 'Fixed' and explicitly choose the smaller stream to broadcast, rather than relying on Spark's size estimate.
3. If the dataset being broadcast is genuinely large, switch the join strategy to a sort-merge join, which does not require broadcasting.
4. Increase the Azure IR compute size so Spark has more memory available for the broadcast operation.
5. Enable debug mode and preview both sides of the join to verify which dataset is triggering the broadcast failure.
Official documentation: https://learn.microsoft.com/en-us/azure/data-factory/data-flow-troubleshoot-guide