
Power BI Refresh Error:
DF-Executor-OutOfDiskSpaceError

What does this error mean?

The Spark executor ran out of local disk space on the Azure IR compute node. Data flows use local disk for shuffle spill files when in-memory buffers are exhausted — if the spilled data volume exceeds the local disk capacity, the executor fails with an out-of-disk error.

Common causes

  • The data volume being processed per partition exceeds the local disk capacity available per executor node in the Azure IR
  • A large shuffle operation (sort, group by, join) is spilling more data to disk than the node's temporary storage can accommodate
  • Multiple concurrent data flows are running on the same IR, competing for the limited local disk on each node
  • Intermediate files from a previous failed run were not cleaned up and are consuming disk space alongside the current run
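The first cause above is a capacity question you can sanity-check before rerunning the flow. Below is a minimal back-of-the-envelope sketch; the disk size, node count, and safety factor are illustrative assumptions, not Azure IR specifications — check your IR's actual compute configuration.

```python
def spill_fits(total_shuffle_gb, node_count, local_disk_gb, safety_factor=0.8):
    """Rough check: does the estimated shuffle spill per node stay under
    usable local disk? All inputs are estimates supplied by the caller."""
    spill_per_node = total_shuffle_gb / node_count
    # Leave headroom: shuffle files, logs, and temp data share the same disk.
    return spill_per_node <= local_disk_gb * safety_factor

# Example: ~2 TB shuffled across 8 nodes with ~256 GB local disk each
# (hypothetical figures) -> 256 GB/node exceeds 204.8 GB usable.
print(spill_fits(2048, 8, 256))  # False
```

If this check fails for your estimated volumes, the fixes below (larger nodes, smaller batches, more partitions) are the levers that change its inputs.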

How to fix it

  1. Increase the IR compute type to use nodes with more local storage — the Azure IR node running the Spark executor has run out of local disk space for shuffle or spill files.
  2. Reduce the amount of data processed in a single run by adding source filters or partitioning the pipeline into smaller batches.
  3. Disable caching transformations that write intermediate data to local disk, or restrict them to in-memory cache for smaller datasets.
  4. Check whether intermediate files from previous failed runs are accumulating on the IR nodes and consuming disk — retry with a fresh cluster session.
  5. Increase the partition count in the data flow to reduce the per-partition data size spilled to disk during shuffle operations.
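For steps 2 and 5, a simple way to reason about a target partition count is to work backwards from a per-partition size budget. This is a hedged sketch with an assumed 200 MB-per-partition target (a common rule of thumb, not an ADF requirement); the function name and defaults are hypothetical.

```python
import math

def suggest_partitions(total_gb, target_partition_mb=200):
    """Pick a partition count so each partition holds roughly
    target_partition_mb of data; smaller partitions spill less per task."""
    total_mb = total_gb * 1024
    return max(1, math.ceil(total_mb / target_partition_mb))

# Example: a 500 GB dataset at ~200 MB per partition (assumed budget).
print(suggest_partitions(500))  # 2560
```

Enter a value like this in the data flow's Optimize tab and adjust based on monitoring; very high counts add task-scheduling overhead.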

Frequently asked questions

What is shuffle spill and why does it consume disk?

During shuffle (sort, group by, join), Spark redistributes data across executors. Partitions that don't fit in memory spill to local disk — large shuffles on small IR nodes produce large spill files.
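The spill mechanism can be modeled crudely: whatever a shuffle task cannot hold in its in-memory buffer goes to local disk. The buffer size below is an arbitrary illustrative number, not a Spark or Azure IR default.

```python
def simulate_shuffle(partition_mb, memory_buffer_mb):
    """Toy model of shuffle spill: data beyond the in-memory buffer
    is written to the executor's local disk."""
    spilled = max(0, partition_mb - memory_buffer_mb)
    return {"in_memory_mb": min(partition_mb, memory_buffer_mb),
            "spilled_mb": spilled}

# A 1.5 GB partition against an assumed 512 MB buffer spills ~1 GB to disk.
print(simulate_shuffle(1500, 512))  # {'in_memory_mb': 512, 'spilled_mb': 988}
```

This is why node-local disk, not just memory, becomes the bottleneck for large shuffles.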

How does increasing partition count help with disk space?

More partitions means smaller spill files per partition. Start by doubling the partition count (Optimize tab) and monitor whether disk usage drops below capacity. Too many partitions add overhead.
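The doubling strategy above can be sketched as a loop: keep doubling until the estimated spill per partition drops under a chosen budget. The 1 GB default target is an assumption for illustration, not a documented threshold.

```python
def double_until_under(total_spill_gb, partitions, target_gb_per_partition=1.0):
    """Double the partition count until estimated spill per partition
    falls to or below the target. Inputs are caller-supplied estimates."""
    while total_spill_gb / partitions > target_gb_per_partition:
        partitions *= 2
    return partitions

# Example: ~400 GB of spill starting from 50 partitions (assumed figures).
print(double_until_under(400, 50))  # 400
```

In practice you would apply each doubling in the Optimize tab and confirm with a monitored run rather than iterating blindly.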

Is OutOfDiskSpace the same as OutOfMemory?

No — OOM means Spark can't allocate heap memory. OutOfDiskSpace means shuffle spill files filled the executor's local disk. OOM needs more memory; OutOfDiskSpace needs more disk or less data per partition.

Will downstream Power BI datasets be affected?

Yes — the pipeline fails and the target table receives no new data. Dependent datasets serve stale figures.

Official documentation: https://learn.microsoft.com/en-us/azure/data-factory/data-flow-troubleshoot-guide
