MetricSign
Severity: Medium | Category: sql

Power BI Refresh Error: COLUMN_ALREADY_EXISTS

What does this error mean?

A DDL statement or DataFrame operation attempted to add a column with a name that already exists in the table or schema. Databricks rejects this operation to prevent silent column shadowing or data loss.

Common causes

  1. An ALTER TABLE ADD COLUMN statement specifies a column name that already exists in the Delta table
  2. A CREATE TABLE AS SELECT query produces duplicate column names from a SELECT with multiple joins
  3. A DataFrame.withColumn call reuses an existing column name and silently replaces that column; a later merge or write then surfaces the conflict
  4. A migration script that is not idempotent runs twice without first checking whether the column exists
  5. A schema evolution path adds the same column in two separate migration steps
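
The duplicate column names that a multi-join SELECT can produce are easy to spot before the statement runs. The sketch below is a plain-Python illustration, not a Databricks API: `find_duplicate_columns` and `aliased_select_list` are hypothetical helper names, and the table aliases and columns are made-up sample data. It compares names case-insensitively, on the assumption that Spark's default case-insensitive column resolution is in effect.

```python
from collections import Counter
from typing import Dict, List


def find_duplicate_columns(columns: List[str]) -> List[str]:
    """Return the lowercased names that appear more than once."""
    counts = Counter(name.lower() for name in columns)
    return sorted(name for name, n in counts.items() if n > 1)


def aliased_select_list(tables: Dict[str, List[str]]) -> List[str]:
    """Build select items, aliasing only the columns that would collide.

    `tables` maps a table alias (hypothetical, e.g. "o") to its column names.
    """
    counts = Counter(col.lower() for cols in tables.values() for col in cols)
    items = []
    for alias, cols in tables.items():
        for col in cols:
            if counts[col.lower()] > 1:
                # Colliding name: prefix it with the table alias.
                items.append(f"{alias}.{col} AS {alias}_{col}")
            else:
                items.append(f"{alias}.{col}")
    return items


# Sample data only: an orders table joined to a customers table.
joined_columns = ["id", "amount", "ID", "name"]
print(find_duplicate_columns(joined_columns))
# ['id']

print(", ".join(aliased_select_list({"o": ["id", "amount"], "c": ["id", "name"]})))
# o.id AS o_id, o.amount, c.id AS c_id, c.name
```

In a notebook, the input list could come from the `columns` attribute of the joined DataFrame; the generated select items would then replace a `SELECT *` in the CREATE TABLE AS SELECT statement.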

How to fix it

  1. Before running ALTER TABLE ADD COLUMN, check the current columns with DESCRIBE TABLE <table_name> or SHOW COLUMNS IN <table_name>.
  2. Use an IF NOT EXISTS guard where supported: ALTER TABLE t ADD COLUMN IF NOT EXISTS col INT.
  3. For CREATE TABLE AS SELECT, use explicit aliases to deduplicate column names in joins.
  4. Wrap migration scripts in IF NOT EXISTS checks, or catch AnalysisException in Python code.
  5. Review schema evolution history with DESCRIBE HISTORY <table_name> to understand when the column was added.
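
The existence check in the first and fourth steps can be sketched as a small guard that only emits the DDL when the column is missing. This is a minimal illustration under stated assumptions: `build_add_column_ddl` is a hypothetical helper, the table and column names are sample values, and names are compared case-insensitively to match Spark's default column resolution.

```python
from typing import Iterable, Optional


def build_add_column_ddl(
    table: str,
    column: str,
    data_type: str,
    existing_columns: Iterable[str],
) -> Optional[str]:
    """Return an ALTER TABLE statement, or None if the column already exists."""
    # Compare lowercased names so "Region" and "region" count as the same column.
    existing = {name.lower() for name in existing_columns}
    if column.lower() in existing:
        return None
    return f"ALTER TABLE {table} ADD COLUMN {column} {data_type}"


# First run: the column is missing, so a statement is produced.
print(build_add_column_ddl("sales", "region", "STRING", ["id", "amount"]))
# ALTER TABLE sales ADD COLUMN region STRING

# Second run: the column is already there, so nothing is emitted.
print(build_add_column_ddl("sales", "Region", "STRING", ["id", "amount", "region"]))
# None
```

In a Databricks notebook, `existing_columns` could be read from `spark.table("sales").columns`, and a non-None result passed to `spark.sql(...)`. Because the second call is a no-op, a migration built this way can safely run more than once.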

Frequently asked questions

Does Delta Lake support IF NOT EXISTS for ADD COLUMN?

Yes, as of Delta Lake 2.0 and Databricks Runtime 11.3+: ALTER TABLE t ADD COLUMN IF NOT EXISTS col INT is supported.

Why does withColumn not raise this error in PySpark?

DataFrame.withColumn silently replaces a column with the same name at the DataFrame level. The error surfaces later when writing to Delta if the column definition conflicts with the table schema.
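
One way to catch such a conflict before the write is to compare the DataFrame's column types with the target table's. The sketch below is plain Python with sample data; `schema_conflicts` is a hypothetical helper, and it assumes both schemas are given as (name, type) pairs, which is the shape of the `dtypes` attribute on a PySpark DataFrame.

```python
from typing import Dict, List, Tuple

Schema = List[Tuple[str, str]]


def schema_conflicts(df_schema: Schema, table_schema: Schema) -> Dict[str, Tuple[str, str]]:
    """Return {column: (dataframe_type, table_type)} for columns whose types differ."""
    table_types = {name.lower(): dtype.lower() for name, dtype in table_schema}
    conflicts = {}
    for name, dtype in df_schema:
        expected = table_types.get(name.lower())
        # Only columns present in both schemas can conflict.
        if expected is not None and expected != dtype.lower():
            conflicts[name] = (dtype, expected)
    return conflicts


# Sample data: withColumn replaced "amount" with a string-typed expression.
df_schema = [("id", "bigint"), ("amount", "string")]
table_schema = [("id", "bigint"), ("amount", "double")]

print(schema_conflicts(df_schema, table_schema))
# {'amount': ('string', 'double')}
```

An empty result means every shared column has the same type in both schemas; it does not cover columns that exist on only one side.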
