Severity: Medium | Category: Data flow
Power BI Refresh Error: DF-Executor-BroadcastFailure
What does this error mean?
A Spark broadcast join failed because the dataset being broadcast is too large for the cluster's available memory. In a broadcast join, Spark copies one dataset to every executor node — if the dataset exceeds available memory, the broadcast fails.
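To make the mechanism concrete, here is a minimal pure-Python sketch of what a broadcast hash join does (illustrative only, not ADF or Spark internals; all names are hypothetical): the small side is built into a hash table, a copy is shipped to every executor, and each executor probes it against its own partition of the large side.

```python
# Illustrative sketch of a broadcast hash join (not Spark code):
# the "broadcast" side is materialized as a hash table and copied to
# every executor, which probes it locally against its own partition.

def broadcast_hash_join(partitions, small_side, key=0):
    """Join each partition of the large side against a broadcast copy
    of the small side. This only works if small_side fits in each
    worker's memory -- when it does not, the broadcast fails."""
    # Build the hash table once; Spark would ship a copy to every executor.
    broadcast = {}
    for row in small_side:
        broadcast.setdefault(row[key], []).append(row)

    joined = []
    for partition in partitions:  # one partition per "executor"
        for row in partition:
            for match in broadcast.get(row[key], []):
                joined.append(row + match[1:])
    return joined

# Large side split into two partitions; small side broadcast to both.
orders = [[(1, "laptop"), (2, "mouse")], [(1, "dock")]]
customers = [(1, "Ada"), (2, "Bob")]
print(broadcast_hash_join(orders, customers))
```

Because every executor holds a full copy of the broadcast table, memory cost scales with the size of the broadcast side times the number of executors, which is why oversized datasets trigger this failure.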
Common causes
1. The dataset on one side of a join transformation exceeds the memory available per executor for broadcast; as a rule of thumb, anything much larger than about 500 MB per executor is likely to fail
2. The Broadcast option is set to 'Auto' and Spark underestimates the dataset's size, so it attempts to broadcast a dataset that is too large
3. The Azure IR compute type is too small to hold the broadcast dataset in memory alongside the main data stream
4. Both sides of the join are set to broadcast, doubling the memory pressure
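Cause 2 comes down to size estimation. The hypothetical check below (sample a few rows, extrapolate to the full row count, compare against a per-executor budget) is a rough sketch of that kind of 'Auto' decision; the 500 MB budget and all function names are illustrative, not Spark's actual estimator.

```python
import sys

# Hypothetical size estimator in the spirit of an 'Auto' broadcast
# decision: sample a few rows, extrapolate to the full row count, and
# compare against a per-executor budget.

def estimate_broadcast_bytes(sample_rows, total_rows):
    """Extrapolate total size from a sample. Wide or skewed rows that
    the sample misses make this underestimate -- which is exactly how
    an 'Auto' broadcast ends up shipping a dataset that is too large."""
    if not sample_rows:
        return 0
    avg = sum(sys.getsizeof(r) for r in sample_rows) / len(sample_rows)
    return int(avg * total_rows)

BUDGET = 500 * 1024 * 1024  # illustrative per-executor broadcast budget

def safe_to_broadcast(sample_rows, total_rows, budget=BUDGET):
    return estimate_broadcast_bytes(sample_rows, total_rows) < budget

sample = ["id,region"] * 100
print(safe_to_broadcast(sample, 1_000))           # small table: fine to broadcast
print(safe_to_broadcast(sample, 1_000_000_000))   # huge table: do not broadcast
```

Setting the Broadcast option to 'Off' or 'Fixed' sidesteps this estimation entirely, which is why those are the first fixes below.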
How to fix it
1. Open the failing join transformation in ADF Studio and set the Broadcast option to 'Off' to disable the broadcast join entirely.
2. Alternatively, set the Broadcast option to 'Fixed' and explicitly choose the smaller stream to broadcast, rather than relying on Spark's size estimate.
3. If the dataset being broadcast is genuinely large, switch the join strategy to a sort-merge join, which does not require broadcasting.
4. Increase the Azure IR compute size so Spark has more memory available for the broadcast operation.
5. Enable debug mode and preview both sides of the join to verify which dataset is triggering the broadcast failure.
Official documentation: https://learn.microsoft.com/en-us/azure/data-factory/data-flow-troubleshoot-guide