Medium severitydata integrity
MySQL Error:
1406
What does this error mean?
MySQL error 1406 is thrown when an INSERT or UPDATE statement tries to write a value that exceeds the byte length defined for the target column. In strict SQL mode (the default since MySQL 5.7) the server rejects the write and rolls back the statement; in non-strict mode it silently truncates the value. In a data pipeline this typically surfaces as a failed ADF copy activity, a failed dbt run, or a broken LOAD DATA INFILE job — the engineer sees the error in the pipeline run log referencing the column name and the offending row count. The root cause is almost always a schema definition that has not kept pace with the growth of source data.
Common causes
- 1Source VARCHAR column was widened upstream without a corresponding ALTER in the destination MySQL table. For example, a CRM field for 'company_name' was expanded from 100 to 200 characters in Salesforce, but the MySQL staging table still defines it as VARCHAR(100).
- 2A free-text API field (description, notes, comments) receives an unusually long value for the first time. The column definition was sized on initial samples; production data introduces edge cases that exceed that estimate.
- 3Multi-byte UTF-8 characters inflate byte length beyond character count. MySQL's VARCHAR length is defined in characters, but certain collations store emoji or CJK characters as 3-4 bytes each, which can push the byte count past the row-size limit even when CHAR_LENGTH looks acceptable.
- 4ADF copy activity maps source and destination columns automatically but does not apply truncation or validation. When the source schema changes, ADF passes the raw value unchanged and MySQL rejects it at write time.
- 5A dbt model performs a string concatenation (e.g. CONCAT(first_name, ' ', last_name, ' — ', department)) and the result exceeds the destination column width. The model passes silently in development with limited data but fails in production with full cardinality.
- 6LOAD DATA INFILE or a bulk insert script reads a flat file where one field contains an unexpectedly long value — such as a CSV export that includes line breaks or un-escaped commas that shift column boundaries, causing a long blob to land in the wrong column.
- 7Implicit type coercion during JOIN or expression evaluation produces a string longer than the target column. For instance, GROUP_CONCAT without an explicit SEPARATOR limit defaults to a result up to group_concat_max_len bytes, which can far exceed the column definition.
How to fix it
- 1Step 1 — Find the actual maximum length in the source: `SELECT MAX(CHAR_LENGTH(col)), MAX(LENGTH(col)) FROM source_table;` — CHAR_LENGTH counts characters, LENGTH counts bytes. Compare both against the destination column definition to understand whether the issue is character count or byte size.
- 2Step 2 — Inspect the destination column definition: `SHOW COLUMNS FROM your_table LIKE 'col';` — note the Type field (e.g. VARCHAR(100)). Cross-reference with the source maximum from step 1.
- 3Step 3 — Widen the column to accommodate the source data with headroom: `ALTER TABLE your_table MODIFY COLUMN col VARCHAR(500) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;` — choose a size at least 20% larger than the current maximum to absorb future growth without an immediate re-migration.
- 4Step 4 — If widening is not an option (e.g. the column is part of a unique index with a length constraint), apply truncation in the pipeline. In an ADF derived column transformation use: `left(col, 100)`. In a dbt model use: `LEFT(col, 100) AS col`. Document the truncation in a code comment so future maintainers understand the intent.
- 5Step 5 — Switch to TEXT type for genuinely unbounded free-text fields where indexing is not required: `ALTER TABLE your_table MODIFY COLUMN col TEXT;` — TEXT supports up to 65535 bytes and never triggers error 1406. Note that TEXT columns cannot have a default value and require a prefix length for any index: `CREATE INDEX idx ON your_table (col(255));`
- 6Step 6 — Add a pre-load data quality check in dbt or ADF that asserts column length before writing. In dbt: add a `dbt_utils.expression_is_true` test — `expression: "CHAR_LENGTH(col) <= 100"` — so schema violations are caught at test time, not at load time.
- 7Step 7 — If strict mode is the blocker and a short-term fix is needed, you can disable it per session: `SET SESSION sql_mode = REPLACE(@@sql_mode, 'STRICT_TRANS_TABLES', '');` — this causes silent truncation and is not recommended for production; treat it as a temporary diagnostic measure only.
Example log output
ErrorCode=UserErrorDatabaseOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error happened when loading data into database. SQL Error Number: 1406. Error Message: Data too long for column 'notes' at row 1.,Source=Microsoft.DataTransfer.ClientLibrary'