
Azure Data Factory Data Flow Error:
DF-Executor-AcquireStorageMemoryFailed

What does this error mean?

The Spark executor failed to acquire storage memory, the region of Spark's unified memory that holds cached data and broadcast variables (shuffle spill is handled by the execution region). Either the Azure IR compute type does not have enough memory for the data volume being processed, or a broadcast join is caching too much data in the storage memory region.
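To make the sizes concrete, here is a minimal sketch of how Spark carves an executor heap into unified memory and then into the storage and execution regions, using Spark's documented defaults (`spark.memory.fraction=0.6`, `spark.memory.storageFraction=0.5`, 300 MB reserved). The executor heap size below is a hypothetical example, not a value taken from the error message:

```python
# Illustrative arithmetic only: Spark's unified memory split with default settings.
RESERVED_MB = 300          # fixed reservation Spark keeps for internal objects
MEMORY_FRACTION = 0.6      # default spark.memory.fraction
STORAGE_FRACTION = 0.5     # default spark.memory.storageFraction

def unified_memory_split(executor_heap_mb: float) -> dict:
    """Return the unified, storage, and execution memory sizes in MB."""
    unified = (executor_heap_mb - RESERVED_MB) * MEMORY_FRACTION
    storage = unified * STORAGE_FRACTION   # caching + broadcast region
    execution = unified - storage          # shuffle / sort / aggregation region
    return {"unified": unified, "storage": storage, "execution": execution}

split = unified_memory_split(8 * 1024)  # e.g. a hypothetical 8 GB executor heap
print({k: round(v, 1) for k, v in split.items()})
```

With these defaults, an 8 GB heap leaves roughly 2.3 GB of storage memory per executor, which is the budget a broadcast join or cache has to fit into.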

Common causes

  • The Azure IR compute type is too small for the data volume: each executor node does not have enough memory to hold the required shuffle and cache data
  • A broadcast join is loading a large dataset into storage memory: broadcast caches consume storage memory proportional to the dataset size
  • The data flow has many transformations that each consume storage memory (caching, windowing, sort), and the cumulative usage exceeds the executor capacity
  • Data volume has grown since the IR was last sized, and the pipeline now processes significantly more rows per run
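For the broadcast cause, a back-of-envelope size check helps decide whether a dataset is safe to broadcast at all. The sketch below is assumption-laden: the row count, average row width, and 2x deserialization overhead are hypothetical inputs you would replace with your own source statistics, and the 10 MB figure is Spark SQL's default `spark.sql.autoBroadcastJoinThreshold` (ADF applies its own broadcast heuristics):

```python
# Rough estimate (illustrative assumptions) of a dataset's in-memory broadcast size.
def estimated_broadcast_mb(row_count: int, avg_row_bytes: int,
                           overhead_factor: float = 2.0) -> float:
    """Deserialized rows often occupy ~2x their raw byte size in memory."""
    return row_count * avg_row_bytes * overhead_factor / (1024 * 1024)

# Hypothetical example: 5 million rows averaging ~200 bytes each.
size_mb = estimated_broadcast_mb(5_000_000, 200)
print(f"{size_mb:.0f} MB")  # orders of magnitude above a 10 MB broadcast threshold
```

A result in the gigabyte range, as here, is a strong hint that the join should not broadcast that stream.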

How to fix it

  1. Upgrade the Azure Integration Runtime compute type: open the data flow activity's Settings tab and increase the core count or switch to a memory-optimized compute type.
  2. Reduce the data volume processed per run by adding source filter conditions or processing the data in smaller partitions.
  3. Disable broadcast on the data flow's join transformations to prevent large datasets from being broadcast into memory.
  4. Enable data flow debug mode and run with a limited row count to identify which transformation step exhausts storage memory.
  5. Review the ADF activity run output for the specific sub-error detail and consult the Azure Data Factory data flow performance tuning documentation.
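Step 2 above can be sketched as splitting one large load into bounded date windows, each small enough for the current IR. The window size and dates below are hypothetical; in practice each window would feed the source transformation's filter condition via a pipeline parameter:

```python
# Sketch for fix #2: process the data in smaller slices rather than one run.
from datetime import date, timedelta

def date_windows(start: date, end: date, days: int):
    """Yield (window_start, window_end) pairs covering [start, end)."""
    cur = start
    while cur < end:
        nxt = min(cur + timedelta(days=days), end)
        yield cur, nxt
        cur = nxt

# Hypothetical example: a 30-day load split into 7-day windows.
windows = list(date_windows(date(2024, 1, 1), date(2024, 1, 31), 7))
print(len(windows))  # 5 windows: four 7-day slices plus a 2-day remainder
```

Each window bounds the rows a single data flow run must shuffle and cache, keeping the per-run working set under the executor's storage memory.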

Frequently asked questions

What is the difference between storage memory and execution memory in Spark?

Spark divides the executor heap into execution memory (shuffle, sort, aggregation) and storage memory (caching and broadcast variables). Under Spark's unified memory model the two regions can borrow from each other, but AcquireStorageMemoryFailed means the storage region is exhausted and cannot grow further, typically because of a broadcast join or an explicit cache. Increasing the IR compute size expands both regions.

How do I know which join is causing the broadcast memory problem?

Enable debug mode and run with a small row count. The error stack trace often names the stage where memory allocation failed. Look for join transformations whose Broadcast option is set to 'Auto' or 'Fixed' rather than 'Off'.

Will this error appear on every run or only sometimes?

It appears when data volume nears the memory limit. With incremental growth, the pipeline may succeed for months before crossing the threshold — once it fails, subsequent runs also fail unless the IR is scaled up.
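This gradual-growth failure mode can be illustrated with some simple arithmetic. All figures below are hypothetical, purely to show how a pipeline that fits comfortably today can cross a fixed memory budget after months of steady growth:

```python
# Illustrative only: months until a steadily growing working set exceeds a budget.
def months_until_exceeded(current_gb: float, budget_gb: float,
                          monthly_growth: float) -> int:
    """Count months of compound growth until current_gb first exceeds budget_gb."""
    months = 0
    while current_gb <= budget_gb:
        current_gb *= 1 + monthly_growth
        months += 1
    return months

# Hypothetical: 6 GB working set, 10 GB memory budget, 5% monthly growth.
print(months_until_exceeded(6.0, 10.0, 0.05))  # 11 months of clean runs, then failure
```

Once the threshold is crossed, the volume stays above it, which is why the error persists on subsequent runs until the IR is scaled up or the volume is reduced.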

Will downstream Power BI datasets be affected?

Yes: the pipeline fails, so the target table receives no new data, and dependent Power BI datasets continue serving stale figures on their next refresh.

Official documentation: https://learn.microsoft.com/en-us/azure/data-factory/data-flow-troubleshoot-guide
