MetricSign
Severity: Critical | Category: resource

Power BI Refresh Error: CLUSTER_TERMINATED_UNEXPECTEDLY

What does this error mean?

A running Databricks cluster stopped without a user-initiated action, typically due to an out-of-memory condition, a driver JVM crash, or an underlying cloud infrastructure failure.

Common causes

  • The driver or a worker ran out of memory (OOM) and the JVM was killed by the OS
  • An Azure/AWS spot instance was preempted by the cloud provider
  • A runaway shuffle or broadcast join caused disk spill beyond the available temp space
  • A cluster init script raised an unhandled exception that terminated the process
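
As a rough aid for telling the first and third causes apart, the sketch below counts lines in a delivered driver or executor log that match two well-known messages: the JVM's `java.lang.OutOfMemoryError` and the OS-level `No space left on device`. The function name `scan_log` and the signature table are hypothetical, and exact log wording can vary by runtime, so treat a zero count as inconclusive rather than as proof.

```python
import re
from typing import Dict

# Illustrative signatures only; extend or adjust them for your own logs.
SIGNATURES = {
    "out_of_memory": re.compile(r"java\.lang\.OutOfMemoryError"),
    "disk_full": re.compile(r"No space left on device"),
}


def scan_log(text: str) -> Dict[str, int]:
    """Count how many log lines match each signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Spot preemption and init script failures usually do not leave a trace in Spark logs, so they are better diagnosed from the cluster's Event Log.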

How to fix it

  1. Open the cluster's Event Log and check the Termination Reason field for the root cause code.
  2. If the cause is OOM: increase driver or worker memory, reduce partition size (use more, smaller partitions), or add more workers.
  3. If the cause is spot preemption: switch to on-demand instances for production jobs, or enable spot with fallback to on-demand.
  4. Review the cluster metrics (Ganglia on older runtimes) or the Spark UI for memory and GC pressure in the period before the termination.
  5. Enable cluster log delivery to DBFS or S3 so that logs remain available after the cluster is gone.
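
The spot fallback and log delivery steps above can be expressed as changes to the cluster specification. The following is a minimal sketch, assuming the cluster is defined as a JSON-style dictionary in the shape used by the Databricks Clusters API. The field names (`aws_attributes`, `azure_attributes`, `availability`, `first_on_demand`, `cluster_log_conf`) are given as I understand that API and should be checked against the current reference; `harden_cluster_spec` itself is a hypothetical helper.

```python
import copy
from typing import Any, Dict


def harden_cluster_spec(spec: Dict[str, Any], cloud: str, log_destination: str) -> Dict[str, Any]:
    """Return a copy of the spec with spot fallback and DBFS log delivery set."""
    hardened = copy.deepcopy(spec)  # never mutate the caller's spec

    if cloud == "aws":
        attrs = hardened.setdefault("aws_attributes", {})
        attrs["availability"] = "SPOT_WITH_FALLBACK"
    elif cloud == "azure":
        attrs = hardened.setdefault("azure_attributes", {})
        attrs["availability"] = "SPOT_WITH_FALLBACK_AZURE"
    else:
        raise ValueError(f"unsupported cloud: {cloud!r}")

    # Assumption: with a value of at least 1, the driver runs on on-demand capacity.
    attrs.setdefault("first_on_demand", 1)

    # Deliver driver and executor logs so they outlive the cluster.
    hardened["cluster_log_conf"] = {"dbfs": {"destination": log_destination}}
    return hardened
```

For example, `harden_cluster_spec({"cluster_name": "etl", "num_workers": 4}, "aws", "dbfs:/cluster-logs/etl")` returns a new dictionary and leaves the input untouched. Delivering to S3 instead would need an `s3` entry with its own destination and region, which this sketch does not cover.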

Frequently asked questions

How is this different from COMMUNICATION_LOST?

COMMUNICATION_LOST means Databricks lost contact with the driver but the underlying VM may still be running (e.g., network partition). CLUSTER_TERMINATED_UNEXPECTEDLY means the cluster process itself has definitively stopped.
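
The distinction can be turned into a small lookup on the termination reason code. This is a sketch, assuming the cluster details are available as a dictionary with a `termination_reason` object that carries a `code` field, which is how I understand the Clusters API to report it. The `next_step` helper and its advice strings are hypothetical, and only the two codes discussed here are mapped.

```python
from typing import Any, Dict

# Advice text paraphrases this page; it is not an official mapping.
NEXT_STEP = {
    "COMMUNICATION_LOST": (
        "Contact with the driver was lost; the VM may still be running. "
        "Check for a network partition first."
    ),
    "CLUSTER_TERMINATED_UNEXPECTEDLY": (
        "The cluster process has stopped. Check for OOM, spot preemption, "
        "disk spill, or a failing init script."
    ),
}


def next_step(cluster_info: Dict[str, Any]) -> str:
    """Suggest where to look next based on the termination reason code."""
    reason = cluster_info.get("termination_reason") or {}
    code = reason.get("code")
    if code is None:
        return "No termination reason recorded; the cluster may still be running."
    return NEXT_STEP.get(code, f"Unmapped code {code}; consult the cluster Event Log.")
```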

Will Databricks automatically retry a job that fails this way?

Yes, if you configure a retry policy on the job. Set the maximum retries and minimum retry interval in the job settings, and ensure your job is idempotent before enabling retries.
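
A retry policy of this kind might look as follows when a job task is defined programmatically. This is a sketch, assuming the task is a dictionary in the shape used by the Databricks Jobs API, where I understand the relevant fields to be `max_retries`, `min_retry_interval_millis`, and `retry_on_timeout`. The `with_retry_policy` helper, its defaults, and the task key in the usage note are hypothetical.

```python
import copy
from typing import Any, Dict


def with_retry_policy(
    task: Dict[str, Any],
    max_retries: int = 2,
    min_retry_interval_seconds: int = 60,
    retry_on_timeout: bool = False,
) -> Dict[str, Any]:
    """Return a copy of a job task definition with a bounded retry policy."""
    # This sketch deliberately rejects negative values to keep retries bounded.
    if max_retries < 0:
        raise ValueError("max_retries must be zero or positive")
    if min_retry_interval_seconds < 0:
        raise ValueError("min_retry_interval_seconds must be zero or positive")

    updated = copy.deepcopy(task)
    updated["max_retries"] = max_retries
    # The interval field is expressed in milliseconds.
    updated["min_retry_interval_millis"] = min_retry_interval_seconds * 1000
    updated["retry_on_timeout"] = retry_on_timeout
    return updated
```

Calling `with_retry_policy({"task_key": "refresh_sales"}, max_retries=3, min_retry_interval_seconds=120)` yields a task that retries up to three times with at least two minutes between attempts. Retries only help if a rerun cannot duplicate or corrupt data, so confirm idempotency before applying this.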
