Low severity · sql
Power BI Refresh Error:
DELTA_STATS_COLLECTION_FAILED
What does this error mean?
Delta Lake failed to collect file-level statistics (min/max values, null counts) for one or more columns during a write operation. Statistics are used for data skipping; their absence degrades query performance but does not corrupt data.
Common causes
1. A column type (e.g., STRUCT, MAP, ARRAY, or a very long STRING) does not support file-level statistics collection.
2. The table has more than 32 columns; by default, Delta collects statistics only for the first 32.
3. Auto Loader or a streaming job wrote rows that triggered a stats computation error on a nested schema.
4. Custom data types (UDTs) used in the schema are not supported by Delta's stats collection engine.
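To narrow down which of these causes applies, it can help to look at the column order, the column types, and any existing override of the stats window. The statements below are a minimal sketch, assuming a Databricks or Spark SQL session; the table name `sales.events` is hypothetical.

```sql
-- Hypothetical table name; substitute your own.
-- Lists columns in schema order with their types. Look for STRUCT, MAP,
-- ARRAY, or UDT columns, and count how many columns the table has.
DESCRIBE TABLE sales.events;

-- Shows table properties, including delta.dataSkippingNumIndexedCols
-- if it has been set explicitly (the default of 32 is not listed).
SHOW TBLPROPERTIES sales.events;
```

The position of a column in the `DESCRIBE TABLE` output matters because the stats window is defined by column order, not by column name.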
How to fix it
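1. For unsupported column types, exclude them from stats collection: `ALTER TABLE t SET TBLPROPERTIES ('delta.dataSkippingNumIndexedCols' = <n>)`, where n is the number of leading columns (in schema order) to collect statistics for. Columns after position n are skipped.
2. If the failing column is a complex type (STRUCT/MAP/ARRAY), choose n so that the column falls outside the first n columns, moving it later in the schema if necessary.
3. Check whether the write completed successfully despite the stats failure; data is written even when stats collection fails.
4. After addressing the root cause, rebuild the stats. `ANALYZE TABLE t COMPUTE STATISTICS` refreshes query-optimizer statistics; on Databricks Runtime 14.3 LTS and above, `ANALYZE TABLE t COMPUTE DELTA STATISTICS` recomputes the file-level statistics used for data skipping.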
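The steps above might look like the following in practice. This is a sketch under stated assumptions, not a definitive procedure: the table `sales.events`, the columns `payload` and `updated_at`, and the value 5 are all hypothetical, and the `COMPUTE DELTA STATISTICS` variant is assumed to be available (Databricks Runtime 14.3 LTS and above).

```sql
-- Assumption: the first 5 columns of sales.events are simple types and
-- the complex-typed column sits at position 6 or later.
ALTER TABLE sales.events
  SET TBLPROPERTIES ('delta.dataSkippingNumIndexedCols' = '5');

-- Alternative (hypothetical column names): move the complex column
-- behind the last simple column so it falls outside the stats window.
ALTER TABLE sales.events ALTER COLUMN payload AFTER updated_at;

-- Confirm that the failing write still committed: the most recent
-- WRITE or STREAMING UPDATE entry should appear in the history.
DESCRIBE HISTORY sales.events LIMIT 5;

-- Recompute file-level statistics for files already written.
-- Assumes a runtime that supports the DELTA STATISTICS variant.
ANALYZE TABLE sales.events COMPUTE DELTA STATISTICS;
```

Changing the table property only affects files written afterwards, which is why the final recompute step is included for existing files.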