Medium severity · data flow
Power BI Refresh Error:
DF-Executor-OutOfDiskSpaceError
What does this error mean?
The Spark executor ran out of local disk space on the Azure integration runtime (IR) compute node. Data flows write shuffle spill files to local disk when in-memory buffers are exhausted. If the spilled data volume exceeds the local disk capacity, the executor fails with an out-of-disk error.
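The spill behaviour described above can be illustrated with a rough back-of-the-envelope check. This is a deliberately simplified sketch, not how Spark or the Azure IR actually accounts for disk usage: the function name `spill_exceeds_disk` and all of its parameters are hypothetical, and it assumes every byte that does not fit in the in-memory buffer is written to local disk uncompressed.

```python
def spill_exceeds_disk(
    partition_bytes: int,
    memory_buffer_bytes: int,
    local_disk_bytes: int,
    concurrent_partitions: int = 1,
) -> bool:
    """Return True if the estimated spill would not fit on local disk.

    Simplified model (an assumption, not Spark's real accounting):
    whatever part of a partition does not fit in the memory buffer is
    spilled, and all concurrently processed partitions share one disk.
    """
    # Data that overflows the in-memory buffer for a single partition.
    spill_per_partition = max(0, partition_bytes - memory_buffer_bytes)
    # Partitions processed at the same time on the node compete for disk.
    total_spill = spill_per_partition * concurrent_partitions
    return total_spill > local_disk_bytes


GIB = 1024 ** 3

# Illustrative numbers only: 4 partitions of 20 GiB each, 2 GiB of buffer
# per partition, and 50 GiB of local disk on the node.
print(spill_exceeds_disk(20 * GIB, 2 * GIB, 50 * GIB, concurrent_partitions=4))
```

Under this model, shrinking the per-partition size or adding local disk are the two levers that move the result, which matches the causes and fixes listed in this article.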
Common causes
1. The data volume being processed per partition exceeds the local disk capacity available per executor node in the Azure IR
2. A large shuffle operation (sort, group by, join) is spilling much more data to disk than the node's temporary storage can accommodate
3. Multiple concurrent data flows are running on the same IR, competing for the limited local disk on each node
4. Intermediate files from a previous failed run were not cleaned up and are consuming disk space alongside the current run
How to fix it
1. Move the Azure IR to a compute type whose nodes have more local storage, since the Spark executor has run out of local disk space for shuffle or spill files.
2. Reduce the amount of data processed in a single run by adding source filters or by partitioning the pipeline so it processes smaller batches.
3. Disable caching transformations that write intermediate data to local disk, or restrict them to in-memory caching of smaller datasets.
4. Check whether intermediate files from previous failed runs are accumulating on the IR nodes and consuming disk space, then retry with a fresh cluster session.
5. Increase the partition count in the data flow to reduce the amount of data each partition spills to disk during shuffle operations.
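Two of the fixes above, raising the partition count and processing smaller batches, come down to simple arithmetic. The sketch below is one hedged way to plan them. The helper names `partitions_needed` and `date_batches` are hypothetical, the 256 MiB target partition size is an illustrative assumption rather than a documented recommendation, and the date-window batching assumes the source can be filtered on a date column.

```python
import math
from datetime import date, timedelta
from typing import Iterator, Tuple


def partitions_needed(total_bytes: int, target_partition_bytes: int) -> int:
    """Partition count that keeps each partition at or below the target size."""
    if target_partition_bytes <= 0:
        raise ValueError("target_partition_bytes must be positive")
    return max(1, math.ceil(total_bytes / target_partition_bytes))


def date_batches(
    start: date, end: date, days_per_batch: int
) -> Iterator[Tuple[date, date]]:
    """Yield half-open (batch_start, batch_end) windows covering [start, end)."""
    if days_per_batch <= 0:
        raise ValueError("days_per_batch must be positive")
    current = start
    while current < end:
        # Clamp the last window so it never runs past the overall end date.
        batch_end = min(current + timedelta(days=days_per_batch), end)
        yield current, batch_end
        current = batch_end


GIB = 1024 ** 3
MIB = 1024 ** 2

# Assumed example: 500 GiB of input, aiming for roughly 256 MiB per partition.
print(partitions_needed(500 * GIB, 256 * MIB))  # 2000

# Assumed example: split 30 days of data into weekly pipeline runs.
for batch_start, batch_end in date_batches(date(2024, 1, 1), date(2024, 1, 31), 7):
    print(batch_start, "->", batch_end)
```

Each window can then be passed to the pipeline as source filter parameters, so that each run shuffles only a fraction of the data.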
Official documentation: https://learn.microsoft.com/en-us/azure/data-factory/data-flow-troubleshoot-guide