Medium severity · data flow
Azure Data Factory Data Flow Error:
DF-Hive-InvalidDataType
What does this error mean?
A column in the Hive source or sink uses a data type that ADF Mapping Data Flows cannot handle — common culprits are Hive-specific complex types (ARRAY, MAP, STRUCT) or timestamp formats that don't map cleanly to Spark SQL types. The data flow fails when it encounters the unsupported type during schema inference or write.
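To make the schema side concrete, here is a hedged Python sketch of the kind of pre-flight check you could run over a table's column types before building the data flow. The schema dict, column names, and helper function are illustrative only; ADF does not expose this API. Type names follow Hive DDL syntax.

```python
# Flag Hive complex types (ARRAY, MAP, STRUCT) that a flat column
# mapping cannot carry. Hypothetical helper -- not an ADF API.
UNSUPPORTED_PREFIXES = ("array<", "map<", "struct<")

def unsupported_columns(schema: dict) -> list:
    """Return names of columns whose Hive type is a complex type."""
    return [name for name, htype in schema.items()
            if htype.lower().startswith(UNSUPPORTED_PREFIXES)]

# Illustrative schema: 'tags' and 'attrs' would trigger the error.
schema = {"id": "bigint", "tags": "array<string>",
          "attrs": "map<string,string>"}
print(unsupported_columns(schema))  # ['tags', 'attrs']
```

Running a check like this against the table's DDL (e.g. the output of `DESCRIBE`) tells you up front which columns will need a cast to STRING.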
Common causes
1. The Hive table contains ARRAY, MAP, or STRUCT columns, which ADF Mapping Data Flows do not natively support
2. A Hive TIMESTAMP column uses a format or precision (for example, nanoseconds) that Spark SQL rejects
3. A derived column or cast expression produces a type incompatible with the target Hive column
4. Import schema was run while the table was empty or had different column types, so the stored projection is now stale
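The timestamp-precision cause above can be reproduced outside ADF: Hive TIMESTAMP values carry up to nanosecond precision (nine fractional digits), while Spark SQL timestamps stop at microseconds (six). A minimal Python sketch, with an illustrative value, shows the truncation a cast has to perform before such a value parses cleanly:

```python
from datetime import datetime

# Hypothetical Hive timestamp string with nanosecond precision.
hive_ts = "2024-03-01 12:34:56.123456789"

def to_microseconds(ts: str) -> datetime:
    """Drop fractional-second digits beyond six, then parse.
    Assumes the value includes a fractional part."""
    base, frac = ts.split(".")
    return datetime.strptime(f"{base}.{frac[:6]}",
                             "%Y-%m-%d %H:%M:%S.%f")

parsed = to_microseconds(hive_ts)  # microsecond field is 123456
```

The extra three digits are simply lost; if your downstream logic depends on sub-microsecond ordering, a cast alone cannot preserve it.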
How to fix it
1. In ADF Studio, open the source transformation and click 'Import schema' to refresh the column types from Hive.
2. Identify the offending column: the ADF activity run output includes the column name and type in the error detail.
3. Add a Derived Column transformation that casts complex types to STRING before writing, e.g. `toString(complexColumn)`.
4. For timestamp precision issues, add an explicit cast: `toTimestamp(timestampCol, 'yyyy-MM-dd HH:mm:ss')`.
5. Enable debug mode and run a data preview on the source transformation to confirm which columns have type mismatches.
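Steps 3 and 4 above can be sketched together as a single per-row transform. Python stands in for the data flow expression language here, and the column names (`payload`, `event_ts`) are illustrative, not from the original error:

```python
import json
from datetime import datetime

def derive(row: dict) -> dict:
    """Rough analogue of a Derived Column stage: stringify a complex
    value and parse a timestamp with an explicit format."""
    out = dict(row)
    # ~ toString(payload): serialize the complex value to a string
    out["payload"] = json.dumps(row["payload"])
    # ~ toTimestamp(event_ts, 'yyyy-MM-dd HH:mm:ss')
    out["event_ts"] = datetime.strptime(row["event_ts"],
                                        "%Y-%m-%d %H:%M:%S")
    return out

fixed = derive({"payload": {"k": 1},
                "event_ts": "2024-03-01 08:00:00"})
```

The point of the explicit format string is determinism: the cast no longer depends on whatever schema inference guessed from the first rows it sampled.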
Official documentation: https://learn.microsoft.com/en-us/azure/data-factory/data-flow-troubleshoot-guide