Medium severity · sql · Databricks

Databricks Error: COLUMN_ALREADY_EXISTS

What does this error mean?

A DDL statement or DataFrame operation attempted to add a column with a name that already exists in the table or schema. Databricks rejects this operation to prevent silent column shadowing or data loss.

Common causes

1. An ALTER TABLE ADD COLUMN statement names a column that already exists in the Delta table.
2. A CREATE TABLE AS SELECT query produces duplicate column names, typically from a SELECT across multiple joins.
3. A DataFrame.withColumn call reuses an existing column name, which silently replaces that column; a later merge or write then surfaces the conflict.
4. A migration script that is not idempotent runs twice without first checking whether the column exists.
5. A schema evolution path adds the same column in two separate migration steps.
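
Several of the causes above come down to a column list that contains the same name twice. The following is a minimal sketch of a pre-flight check in plain Python, not part of any Databricks API. The function name `find_duplicate_columns` and the sample column list are hypothetical, and the case-insensitive comparison assumes Spark's default setting (`spark.sql.caseSensitive` is false).

```python
from collections import Counter


def find_duplicate_columns(columns, case_sensitive=False):
    """Return the column names that occur more than once, in first-seen order."""
    # Assumption: names are compared case-insensitively unless told otherwise,
    # mirroring Spark's default (spark.sql.caseSensitive = false).
    keys = [c if case_sensitive else c.lower() for c in columns]
    counts = Counter(keys)
    seen = set()
    duplicates = []
    for key in keys:
        if counts[key] > 1 and key not in seen:
            seen.add(key)
            duplicates.append(key)
    return duplicates


# Hypothetical column list, as a join with SELECT * might produce it.
joined_columns = ["id", "name", "ID", "amount", "name"]
print(find_duplicate_columns(joined_columns))
```

Passing a DataFrame's `columns` list to a check like this before a CREATE TABLE AS SELECT or a write makes the offending names visible before the statement fails.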

How to fix it

1. Before running ALTER TABLE ADD COLUMN, check the current columns with DESCRIBE TABLE <table_name> or SHOW COLUMNS IN <table_name>.
2. Use an IF NOT EXISTS guard where supported: ALTER TABLE t ADD COLUMN IF NOT EXISTS col INT.
3. For CREATE TABLE AS SELECT, use explicit aliases to deduplicate column names in joins.
4. Wrap migration scripts in IF NOT EXISTS checks, or catch AnalysisException in Python code.
5. Review the schema evolution history with DESCRIBE HISTORY <table_name> to find out when the column was added.
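
The check-before-add step can be sketched as a small helper that makes a migration safe to rerun. This is an illustration under stated assumptions, not an official pattern: `add_column_if_missing` is a hypothetical name, `spark` is assumed to be a SparkSession (the helper only relies on `spark.table(name).columns` and `spark.sql(statement)`), and it only looks at top-level columns, not nested struct fields.

```python
def add_column_if_missing(spark, table_name, column_name, data_type):
    """Add a top-level column only when the table does not already have it.

    `spark` is assumed to be a SparkSession; any object exposing
    `.table(name).columns` and `.sql(statement)` works.
    Returns True when the ALTER TABLE statement was issued.
    """
    # Compare case-insensitively, assuming Spark's default name resolution.
    existing = {c.lower() for c in spark.table(table_name).columns}
    if column_name.lower() in existing:
        return False  # nothing to do; rerunning the migration is safe
    # Backticks quote the column name; table_name is passed through as given.
    spark.sql(
        f"ALTER TABLE {table_name} ADD COLUMN `{column_name}` {data_type}"
    )
    return True
```

A call might look like `add_column_if_missing(spark, "main.sales.orders", "discount", "DOUBLE")`, where the table and column names are placeholders. The check and the ALTER TABLE are two separate steps, so two migrations running at the same time can still collide.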

Frequently asked questions

Does Delta Lake support IF NOT EXISTS for ADD COLUMN?

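Reportedly yes: ALTER TABLE t ADD COLUMN IF NOT EXISTS col INT is described as supported as of Delta Lake 2.0 and Databricks Runtime 11.3+. Confirm this against the ALTER TABLE reference for your runtime; if the parser rejects the clause, check the table schema before adding the column instead.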
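
Where the guard is not available, one alternative is to attempt the statement and treat only this error as benign. The sketch below assumes the raised exception carries the COLUMN_ALREADY_EXISTS condition name as this page describes. Both function names are hypothetical, the accessors `getCondition` and `getErrorClass` are looked up defensively because their availability depends on the PySpark version, and the message-text fallback is an assumption about how the error is rendered.

```python
def is_column_already_exists(exc):
    """Best-effort check that an exception reports COLUMN_ALREADY_EXISTS."""
    # Assumption: newer PySpark exceptions expose getCondition() and older
    # ones getErrorClass(); either may be missing, so look them up by name.
    for accessor in ("getCondition", "getErrorClass"):
        method = getattr(exc, accessor, None)
        if callable(method):
            try:
                if method() == "COLUMN_ALREADY_EXISTS":
                    return True
            except Exception:
                pass  # accessor unusable here; fall through to the message
    # Fallback: look for the condition name in the rendered message.
    return "COLUMN_ALREADY_EXISTS" in str(exc)


def add_column_tolerating_existing(spark, statement):
    """Run an ADD COLUMN statement; return False if the column already existed."""
    try:
        spark.sql(statement)
        return True
    except Exception as exc:
        if is_column_already_exists(exc):
            return False
        raise  # any other failure is a real error and must propagate
```

Re-raising everything else keeps unrelated failures, such as a missing table or a permissions problem, from being silently swallowed.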

Why does withColumn not raise this error in PySpark?

DataFrame.withColumn silently replaces a column with the same name at the DataFrame level. The error surfaces later when writing to Delta if the column definition conflicts with the table schema.

Source · docs.databricks.com/aws/en/error-messages/error-classes.html