Medium severity · Infrastructure

Power BI Refresh Error:
DF-Executor-RemoteRPCClientDisassociated

What does this error mean?

A Spark executor lost its RPC connection to the driver — almost always because the executor crashed due to an out-of-memory kill. The data flow fails with this network-level symptom rather than surfacing the underlying OOM.

Common causes

  • An executor process was killed by the JVM out-of-memory handler, and the driver received the disconnection as an RPC error instead of a clean failure message
  • A large shuffle operation between executors exhausted disk space on a node, causing the executor to terminate abnormally
  • The Azure IR cluster experienced a transient node failure or eviction — more common with spot-instance IR configurations
  • A broadcast join attempted to materialize a dataset that exceeded executor heap capacity, triggering a cascading JVM crash
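The broadcast-join cause above comes down to arithmetic: Spark's default `spark.sql.autoBroadcastJoinThreshold` is 10 MB, and a table far above executor heap capacity cannot be safely materialized on every executor. A minimal back-of-the-envelope check (the helper name `fits_broadcast` and the row-size inputs are illustrative, not part of any ADF or Spark API):

```python
# Rough estimate of whether a table is small enough to broadcast.
# 10 MB is Spark's default spark.sql.autoBroadcastJoinThreshold.
def fits_broadcast(row_count: int,
                   avg_row_bytes: int,
                   threshold_bytes: int = 10 * 1024 * 1024) -> bool:
    """Return True if the estimated in-memory size is under the threshold."""
    return row_count * avg_row_bytes <= threshold_bytes
```

A 10-million-row table at ~200 bytes per row is roughly 2 GB — far past the threshold, and a likely trigger for the executor crash described above.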

How to fix it

  1. Retry the pipeline — RemoteRPCClientDisassociated is often a transient Spark cluster failure in which an executor lost contact with the driver, and a retry frequently succeeds.
  2. Increase the Azure IR compute type to provide more memory per executor node, reducing the likelihood of executor crashes under memory pressure.
  3. Check whether the error coincides with a large shuffle operation — reduce the partition count or add filtering to shrink the data volume shuffled between executors.
  4. Disable broadcast joins if the error occurs near a join transformation — large broadcasts can exhaust executor memory and trigger disconnection.
  5. If the error recurs consistently on the same data, check ADF Monitor for patterns (same time of day, same data volume) to identify the root cause.
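In ADF data flows, steps 3 and 4 are exposed through IR and transformation settings rather than raw Spark configuration, but the underlying Spark knobs look roughly like this (a sketch assuming a Spark session you control, e.g. Synapse or Databricks — not directly applicable to the managed ADF runtime):

```python
# Sketch only: these two settings are real Spark SQL configs, but you cannot
# set them directly on an ADF-managed integration runtime.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    # -1 disables automatic broadcast joins entirely (step 4)
    .config("spark.sql.autoBroadcastJoinThreshold", "-1")
    # fewer, larger shuffle partitions can reduce shuffle overhead (step 3);
    # 64 is an arbitrary example value, not a recommendation
    .config("spark.sql.shuffle.partitions", "64")
    .getOrCreate()
)
```

In a data flow itself, the equivalent levers are the join transformation's broadcast option and the source/sink partitioning settings.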

Frequently asked questions

Why does ADF show RemoteRPCClientDisassociated instead of an OutOfMemory error?

When a Spark executor is killed by the JVM OOM handler, it terminates abnormally and the driver reports an RPC disconnection. Check the Spark logs in ADF Monitor for OOM messages to find the true root cause.

How do I check whether an OOM killed the executor?

In ADF Monitor, click the failed activity then 'View details' for the Spark job logs. Look for 'java.lang.OutOfMemoryError' or 'ExecutorLostFailure' in executor stderr — these confirm a JVM OOM, not a network failure.
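Once you have the stderr text (for example, copied or downloaded from ADF Monitor), the check above is a simple string scan. A minimal sketch — the helper `classify_failure` is hypothetical, not an ADF or Spark API:

```python
# Markers that indicate a JVM OOM rather than a pure network failure,
# per the troubleshooting guidance above.
OOM_MARKERS = ("java.lang.OutOfMemoryError", "ExecutorLostFailure")

def classify_failure(stderr_text: str) -> str:
    """Return 'oom' if OOM evidence appears in executor stderr, else 'transient'."""
    if any(marker in stderr_text for marker in OOM_MARKERS):
        return "oom"
    return "transient"
```

An "oom" result means retries will reproduce the failure and you should tune memory first; "transient" means a bounded retry is reasonable.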

Should I retry automatically, or is manual intervention needed?

If the error is truly transient (single occurrence, no OOM evidence), 1–2 retries with a 5-minute interval reduce noise. If logs show consistent OOM kills, retrying reproduces the failure — fix the root cause first.
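The "1–2 retries with an interval, transient errors only" policy above can be sketched as a small wrapper (the names `run_with_retries`, `run_pipeline`, and `is_transient` are illustrative; in practice ADF's built-in activity retry count and retry interval settings implement the same idea):

```python
import time

def run_with_retries(run_pipeline, is_transient, retries=2, interval_s=300):
    """Run a pipeline, retrying only transient failures a bounded number of times.

    run_pipeline: zero-argument callable that raises on failure.
    is_transient: callable(exc) -> bool; non-transient errors are re-raised at once.
    """
    for attempt in range(retries + 1):
        try:
            return run_pipeline()
        except Exception as exc:
            # Give up on the last attempt or when the error is not transient
            # (e.g. logs show a consistent OOM kill).
            if attempt == retries or not is_transient(exc):
                raise
            time.sleep(interval_s)
```

With `retries=2` and `interval_s=300` this matches the 1–2 retries / 5-minute guidance; an OOM-backed failure should make `is_transient` return False so the root cause gets fixed instead.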

Will downstream Power BI datasets be affected?

Yes — the data flow fails and no data is written to the target. Dependent datasets and reports serve stale data until the run completes successfully.

Official documentation: https://learn.microsoft.com/en-us/azure/data-factory/data-flow-troubleshoot-guide
