MetricSign
High severity | streaming

Power BI Refresh Error:
STATEFUL_PROCESSOR_CORRUPTED_STATE

What does this error mean?

A stateful streaming operator (such as mapGroupsWithState or transformWithState) encountered state data it cannot deserialize. The state store schema or class definitions no longer match what was written during a previous run.

Common causes

  • The state encoder class was changed (field added, removed, or renamed) between streaming restarts
  • A case class used as the state type was recompiled with an incompatible serialization
  • The checkpoint was migrated from a different Spark or Databricks runtime version with incompatible Kryo serialization
  • A schema evolution change to the input stream altered how state values were derived
  • The RocksDB state store backing files were partially overwritten or corrupted
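
The first two causes come down to the same mechanism: bytes written under one layout are read back under another. The sketch below is only an analogy in plain Python, using the standard `struct` module rather than Spark's actual state encoding. The layouts and the function names (`write_state_v1`, `read_state_v2`, `try_restart`) are hypothetical and exist only for illustration.

```python
import struct

# Hypothetical layouts: v1 state holds (count, total); v2 adds last_seen.
V1_LAYOUT = "<qd"   # int64 count, float64 total
V2_LAYOUT = "<qdq"  # int64 count, float64 total, int64 last_seen


def write_state_v1(count: int, total: float) -> bytes:
    """Serialize state the way the previous deployment would have."""
    return struct.pack(V1_LAYOUT, count, total)


def read_state_v2(blob: bytes) -> tuple:
    """Deserialize state the way the new deployment expects it."""
    return struct.unpack(V2_LAYOUT, blob)


def try_restart(blob: bytes) -> str:
    """Return 'ok' if the stored bytes decode, else a short failure reason."""
    try:
        read_state_v2(blob)
        return "ok"
    except struct.error as exc:
        return f"corrupted state: {exc}"


# State written before the upgrade, read after the upgrade.
stored = write_state_v1(count=3, total=12.5)
result = try_restart(stored)
print(result)
```

The stored bytes are intact; they simply no longer match what the reader expects, which is the situation this error describes.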

How to fix it

  1. Compare the current state class definition against the one used when the checkpoint was last written.
  2. If class definitions changed incompatibly, delete the checkpoint and restart from scratch. This discards all accumulated state and the stored stream offsets, so only do it when the state can be rebuilt or safely lost.
  3. Use Avro or a versioned encoder for state types to allow forward- and backward-compatible schema evolution.
  4. Test state class changes in a staging environment before deploying to production streaming jobs.
  5. If using the RocksDB state store, check for storage errors in the cluster logs before assuming a schema mismatch.
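
One way to make step 1 routine is to record a fingerprint of the state class at deploy time and compare it before each restart. The following is a minimal sketch in plain Python, not a Spark or Databricks feature. The helper names (`schema_fingerprint`, `is_compatible`) and the two example state classes are assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, fields


def schema_fingerprint(state_cls) -> str:
    """Hash the ordered (name, type) pairs of a dataclass."""
    parts = [
        f"{f.name}:{getattr(f.type, '__name__', str(f.type))}"
        for f in fields(state_cls)
    ]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


@dataclass
class SessionStateV1:  # hypothetical state class used by the old deployment
    user_id: str
    event_count: int


@dataclass
class SessionStateV2:  # hypothetical state class after adding a field
    user_id: str
    event_count: int
    last_seen_ms: int


def is_compatible(recorded: str, current_cls) -> bool:
    """True when the current class matches the fingerprint recorded earlier."""
    return recorded == schema_fingerprint(current_cls)


# Fingerprint saved when the checkpoint was first written (assumed to be
# stored somewhere alongside the job configuration).
recorded = schema_fingerprint(SessionStateV1)
print(is_compatible(recorded, SessionStateV1))
print(is_compatible(recorded, SessionStateV2))
```

A mismatch here does not prove the restart will fail, but it flags a state class change that should go through staging (step 4) before it reaches production.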

Frequently asked questions

Can I add a new field to my state class without losing state?

Only if you use a schema-evolution-safe encoder such as Avro or Protobuf. With default Kryo or Java serialization, even adding an optional field will produce STATEFUL_PROCESSOR_CORRUPTED_STATE on restart.
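
To show what "schema-evolution-safe" means in practice, here is a small sketch of a versioned encoder that reads fields by name and fills defaults for fields missing from older payloads. It uses JSON from the Python standard library as a stand-in for Avro or Protobuf and is not tied to any Spark API. The `SessionState` class, the `v` version tag, and the function names are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, fields

CURRENT_VERSION = 2


@dataclass
class SessionState:  # hypothetical state type
    user_id: str
    event_count: int
    last_seen_ms: int = 0  # added in version 2; default covers v1 payloads


def encode_state(state: SessionState) -> bytes:
    """Write the state with an explicit version tag."""
    payload = {"v": CURRENT_VERSION, "data": asdict(state)}
    return json.dumps(payload).encode("utf-8")


def decode_state(blob: bytes) -> SessionState:
    """Read any version up to CURRENT_VERSION, defaulting missing fields."""
    payload = json.loads(blob.decode("utf-8"))
    version = payload["v"]
    if version > CURRENT_VERSION:
        raise ValueError(
            f"state version {version} is newer than supported {CURRENT_VERSION}"
        )
    data = payload["data"]
    # Fields are matched by name, so fields absent from an older payload
    # fall back to the dataclass defaults instead of failing to decode.
    known = {f.name: data[f.name] for f in fields(SessionState) if f.name in data}
    return SessionState(**known)


# A payload written by the old (version 1) code, before last_seen_ms existed.
old_blob = json.dumps(
    {"v": 1, "data": {"user_id": "u1", "event_count": 4}}
).encode("utf-8")
restored = decode_state(old_blob)
print(restored)
```

The design choice that matters is name-based decoding with defaults: a new optional field then becomes a compatible change, whereas a positional format sees it as a layout mismatch.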

Is this the same as CHECKPOINT_CORRUPTED?

No. CHECKPOINT_CORRUPTED means the checkpoint directory itself is damaged. STATEFUL_PROCESSOR_CORRUPTED_STATE means the checkpoint is intact but the state data inside it is incompatible with the current code.
