MetricSign
Start free
Data Observability7 min·

dbt Snowflake Monitoring: How Failures Travel from Transformation to Warehouse

The dbt job finished. The Snowflake table refreshed. The Power BI report is showing wrong numbers. The failure crossed two tool boundaries without triggering a single alert.

The failure that lives between dbt and Snowflake

dbt transforms data in Snowflake. It runs SQL models that read from source tables and write to output tables. When a dbt model executes, the result — good or bad — is stored in Snowflake. Snowflake stores it faithfully.

This creates a failure surface that neither tool individually monitors well. dbt reports on execution: did the SQL run, did the tests pass, how long did each model take? Snowflake reports on storage: is the table there, does it have rows, are queries against it succeeding?

Neither tool is designed to answer the question that matters for downstream consumers: is the data in the Snowflake table correct, current, and complete relative to what was expected?

Here is the scenario that makes this concrete. A source table renames a field. The dbt model that references it uses a flexible reference that compiles without error. The model runs and writes to the Snowflake output table — minus the renamed field, which is now absent from the SELECT. dbt reports Succeeded. Snowflake has a table with the new structure. Power BI refreshes against it and loads data without the missing field. Measures that referenced it evaluate to blank.

The failure crossed two tool boundaries. dbt did not flag it because the SQL compiled. Snowflake did not flag it because the data was written cleanly. Power BI did not flag it because the import completed. A cross-layer monitoring approach is required — and that starts with understanding what both layers actually surface. See data observability platform overview for how this fits into a complete monitoring strategy.

What dbt monitors about Snowflake execution

When dbt runs against Snowflake, it executes models (SQL statements that CREATE OR REPLACE tables or views) and tests (SQL assertions that should return zero rows if the data is valid). The monitoring surface is:

Model execution status — Did the SQL complete without a fatal error? model_execution_time and status per model in run_results.json.

Test results — Which dbt tests passed or failed? Tests are the primary quality gate in dbt. A not_null test on a primary key that fails is a clear signal. A test that runs against an empty table returns zero rows and 'passes' — which is misleading.

Source freshnessdbt source freshness checks whether upstream source tables in Snowflake have been updated within a configured window. This is a scheduled check, not a continuous one, but it is the one native dbt freshness signal for Snowflake source tables.

Compilation warnings — dbt surfaces compile-time warnings in the run output. These do not fail the model but indicate problems that warrant attention.

What dbt does not monitor: the row count of what it wrote to the Snowflake table, the schema of what it wrote relative to what it wrote last run, or whether the output is consistent with the data the downstream Power BI datasets expect. See dbt monitoring for a deeper treatment of the full dbt monitoring surface.

How Snowflake table monitoring fills the gap

After a dbt job completes, Snowflake holds the output. The right post-run checks on the Snowflake side:

Row count comparison — How many rows does the output table contain now compared to the previous run (for full-refresh models) or how many rows were inserted in the current run period (for incremental models)? A deviation of more than 20–25% from the day-of-week baseline is worth alerting on.

Watermark check — What is the maximum value of the primary timestamp column in the output table? For a table that aggregates today's transactions, the watermark should be from today. If it is from yesterday, either the dbt model ran against stale source data or the incremental filter missed the current period.

Schema comparison — Compare the column set of the Snowflake table after the dbt run against the previous run. Any column that was present before and is absent now is a schema change introduced by the dbt model or an upstream source. Alert before the Power BI consumer loads an incomplete table.

Cross-run consistency — For dbt models that aggregate or join data, compare the total value of key metrics (sum of revenue, count of active customers) between consecutive runs. Significant changes that cannot be explained by the expected data volume change are worth investigating.

These checks run after the dbt job completes, using Snowflake's query capabilities. MetricSign automates them for tables connected to Snowflake and correlates the results with the downstream Power BI refresh cycle.

When dbt tests catch what Snowflake monitoring misses

The two layers are complementary. A Snowflake row count check tells you that the table has 60% of its expected volume — but it cannot tell you why. The dbt test that would have caught the specific cause (a filter that excluded records incorrectly, a null in a join key) needs to exist before the run.

The most valuable dbt tests for the Snowflake combination:

Referential integrity between dbt models — The relationships test checks that every foreign key value in one model exists as a primary key in another. For a dbt project where fact_orders joins to dim_customers on customer_id, a referential integrity test catches rows dropped by a bad join.

Non-null and unique on output columns — Primary keys in the output tables should always be non-null and unique. A model that aggregates to a non-unique primary key indicates a logic error in the transformation.

Accepted values for categorical fields — If status in the orders model should be one of a known set, an accepted_values test catches records with unexpected values before they reach Snowflake.

Custom row count assertions — The dbt_utils.expression_is_true macro can enforce a minimum row count: SELECT COUNT(*) > 0 FROM {{ ref('fact_orders') }}. This does not replace baseline comparison, but it catches the zero-row case explicitly.

For the combination of dbt and Snowflake running as a pipeline that feeds Snowflake into Power BI, the monitoring chain requires both layers: dbt tests for expected data properties, and Snowflake table checks for volume and freshness anomalies after the run.

Connecting the chain: dbt failure → Snowflake table → Power BI report

The full monitoring picture for a dbt-Snowflake-Power BI pipeline requires three nodes:

  1. dbt run output — Model execution status, test results, source freshness check results
  2. Snowflake table state — Row count vs baseline, watermark freshness, schema comparison post-run
  3. Power BI dataset — Did the dataset import the expected data? Freshness of the imported watermark, row count of the imported model

A failure at node 1 (dbt model error) may or may not affect node 2 — the Snowflake table might have data from the previous successful run. A failure at node 2 (Snowflake table empty or stale) always affects node 3 (Power BI imports what is there).

The most common cross-layer failure: dbt runs successfully but produces incorrect output (silent success), Snowflake stores it, Power BI imports it. This is the class of failure that passes through all three tools without triggering an alert in any of them — until a business user notices.

Making this chain visible requires a monitoring platform that connects to all three layers. MetricSign reads dbt job results, Snowflake table metadata, and Power BI refresh history, and correlates them into a single incident when the chain breaks. An alert that says 'dbt model daily_orders_summary completed with 40% of expected row count; Snowflake table analytics.fact_orders is now at 43% volume; Power BI dataset Sales Overview is scheduled to refresh in 2 hours' is the difference between proactive resolution and a Monday morning escalation.

Frequently asked questions

How does dbt monitoring work with Snowflake?+
dbt runs SQL models against Snowflake and reports execution status and test results. Snowflake stores the output tables. Monitoring the combination requires watching dbt run output (model status, test failures, source freshness) and post-run Snowflake table state (row counts, watermark, schema) to confirm the output is correct.
Can a dbt model succeed but write wrong data to Snowflake?+
Yes. A dbt model can execute without a fatal error while producing incorrect output — a renamed column silently dropped, a filter that matches no rows, a join that loses records due to a key mismatch. Snowflake stores the result faithfully. dbt's test suite must exist and cover the specific failure mode to catch it.
What is dbt source freshness for Snowflake?+
dbt source freshness runs a query against each configured Snowflake source table and checks whether the data has been updated within a configured time window (warn_after, error_after). It is a scheduled check that runs as part of the dbt build process and is the primary native freshness signal in dbt for Snowflake sources.
How do you monitor row counts in dbt Snowflake models?+
The dbt_utils.expression_is_true macro can check that a model's row count is greater than zero. For baseline comparison — detecting that today's row count is 40% below the day-of-week average — you need an observability tool that tracks historical row counts and alerts on significant deviations.
Why does monitoring dbt and Snowflake separately miss failures?+
Separate monitoring misses the cross-layer failures where the failure mode crosses a tool boundary: dbt runs successfully but produces partial output; Snowflake stores it; Power BI imports it. Each tool reports success. Only a monitoring platform that connects all three layers can detect this class of failure proactively.

Related integrations

Related articles