Hard Failures vs. Silent Failures
When a Power BI refresh fails with an error, you know about it. The API returns a failed status, your alert fires, and someone starts investigating. The problem is visible, the impact is known, and the clock starts on remediation.
Silent failures work differently. The refresh completes. No alert fires. The report opens without error. But the underlying data is wrong — incomplete, stale, or based on a corrupt upstream source. By the time anyone notices, the damage may already be done: decisions made, reports distributed, KPIs reported to leadership.
In most data teams, far more effort goes into preventing and detecting hard failures than silent ones. The monitoring infrastructure for catching a failed refresh is straightforward: poll the API, send an alert. The infrastructure for catching a silent failure — a dataset that loaded successfully from an empty source table — is harder to build and rarely exists.
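The hard-failure side really is that simple. A minimal sketch in Python, assuming the refresh history has already been fetched from the Power BI REST API's refresh-history endpoint (the `status` and `requestId` fields mirror that API's payload; `send_alert` is a hypothetical placeholder):

```python
from typing import Dict, List


def failed_refreshes(history: List[Dict]) -> List[Dict]:
    """Pick out refresh-history entries whose status marks a hard failure."""
    return [entry for entry in history if entry.get("status") == "Failed"]


def send_alert(entry: Dict) -> None:
    """Hypothetical alert hook; wire this to email, Teams, PagerDuty, etc."""
    print(f"Refresh {entry.get('requestId')} failed")


if __name__ == "__main__":
    # Simplified shape of the refresh-history response.
    history = [
        {"requestId": "a1", "status": "Completed"},
        {"requestId": "b2", "status": "Failed"},
    ]
    for entry in failed_refreshes(history):
        send_alert(entry)
```

Note what this sketch cannot see: a refresh with status "Completed" that loaded nothing.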
The ADF Pipeline That Copied Nothing
Here's a scenario that plays out regularly in Azure-heavy data stacks: an Azure Data Factory pipeline runs on schedule, executes without errors, and marks the run as Succeeded. Power BI refreshes an hour later, also succeeds. Your morning dashboard looks normal.
What actually happened: the source system — a REST API, an SFTP server, an SAP export — didn't produce its file on time. ADF saw an empty source, copied zero bytes to the destination, and reported success because "no data to copy" is not an error in ADF's execution model. The staging table now has no new rows. Power BI loaded those non-existent rows into the model.
If the model doesn't surface a "last loaded date" anywhere in the report, users have no indication that anything is wrong. They're looking at yesterday's data — or last week's — believing it's current. This is a silent failure with real business impact, and it's completely invisible to any monitoring that only checks refresh status.
This is one of the most common patterns in enterprise data pipelines. ADF's native monitoring won't flag a zero-row copy as a failure. You need an additional layer that inspects the actual data — either at the destination (row count check) or at the model layer (watermark monitoring).
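A destination-side row count check can be a single guard that turns "zero rows" into a hard, alertable failure before the Power BI refresh runs. A sketch, assuming you have already queried the staging table for today's row count (the query in the comment uses illustrative table and column names):

```python
def assert_rows_loaded(row_count: int, minimum: int = 1) -> None:
    """Turn a silent zero-row load into a hard failure."""
    if row_count < minimum:
        raise RuntimeError(
            f"Load reported success but staged {row_count} rows; "
            f"expected at least {minimum}. Treating as a failed load."
        )


# row_count would come from something like:
#   SELECT COUNT(*) FROM staging.orders
#   WHERE load_date = CAST(GETDATE() AS date)
```

Raising here fails the orchestration step, so the same alerting that already catches hard failures now catches this silent one too.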
The Cost in Decisions, Trust, and Time
Quantifying the cost of bad data is difficult, but the categories are consistent across organizations.
Wrong decisions: A sales team runs pipeline forecasts off stale data. Operations allocates resources based on yesterday's figures. Finance reports a number to leadership that's based on a partial load. These decisions may not be immediately reversible.
Lost trust: The second-order effect of a data incident is often more damaging than the incident itself. After one high-profile case where a report showed incorrect numbers, users start doubting everything. "Is this right?" becomes a refrain in every meeting that uses data. The investment in the data platform is undermined by uncertainty.
Investigation time: Every data quality incident generates investigation work. A data engineer has to retrace the pipeline, check the source, validate the model, confirm what went wrong. Even a 2-hour investigation, repeated across multiple incidents per quarter, adds up to significant lost engineering capacity.
Escalation cost: Senior leaders who receive bad data escalate. This involves data engineers, data owners, and sometimes executives in a chain of meetings. The opportunity cost — time that could have been spent building rather than explaining — is real.
Scenarios Data Engineers Recognize
The following patterns show up repeatedly in production data stacks.
The incrementally growing null column: A source system starts returning null for a field that used to have values — not a schema change, the application simply stopped populating the field. The Power BI model refreshes fine. A calculated column that depended on that field now returns zero for every new row. Reports look subtly wrong for weeks before anyone notices.
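Catching this pattern means profiling the data, not the refresh: compare a column's null rate against its historical baseline. A sketch (the baseline and tolerance values are illustrative assumptions):

```python
def null_fraction(values) -> float:
    """Share of values in a column sample that are null (None)."""
    values = list(values)
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def null_rate_ok(values, baseline: float, tolerance: float = 0.10) -> bool:
    """Flag a column whose null rate drifts above its historical baseline."""
    return null_fraction(values) <= baseline + tolerance
```

Run against only the newly loaded rows, this catches the drift in the first refresh rather than weeks later.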
The timezone offset incident: A data warehouse loads timestamps in UTC. A report uses a measure that does date math assuming local time. A timezone configuration changes somewhere in the stack, and suddenly all date-based aggregations are off by one day. Refreshes succeed. Data is wrong.
The incremental refresh partition error: A Power BI dataset uses incremental refresh with rolling monthly partitions. A misconfigured refresh policy causes the process to reload a historical partition instead of the current month. The refresh completes, row counts are similar to the previous run, but every report shows last month's values where this month's should be. Volume monitoring won't catch this — only watermark inspection (checking the max date actually loaded) will.
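Watermark inspection reduces to comparing the maximum date actually loaded against the date you expected to load. A sketch (the one-day lag threshold is an illustrative assumption):

```python
from datetime import date


def watermark_is_current(max_loaded: date, as_of: date,
                         max_lag_days: int = 1) -> bool:
    """True when the newest date actually loaded is within the allowed lag.

    In the partition-error scenario, row counts look normal but
    max_loaded lands in last month, so this check fails even though
    volume monitoring passes.
    """
    return (as_of - max_loaded).days <= max_lag_days
```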
The dbt model that errored silently: A dbt Cloud job runs, one model fails with a non-fatal error, and the job continues and completes. The failed model writes partial data. Power BI refreshes off that partial output. Everything looks fine in Power BI Service's refresh history.
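One defense here is to gate the Power BI refresh on dbt's run_results.json artifact rather than on the job's overall exit status. A sketch that inspects the artifact's per-model `status` field; treating everything other than "success" as blocking is an assumption you may want to relax (e.g. for "skipped"):

```python
def blocking_results(run_results: dict) -> list:
    """Return unique_ids of models whose status is not 'success'."""
    return [
        r["unique_id"]
        for r in run_results.get("results", [])
        if r.get("status") != "success"
    ]


# Trigger the Power BI refresh only when blocking_results(...) is empty.
```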
What Detection Actually Requires
Detecting silent failures requires monitoring at multiple points in the data chain, not just at the Power BI API layer.
At the pipeline layer: Row count validation after each copy activity. ADF's Copy Activity reports a rowsCopied metric in its output — you can add a conditional step that fails the pipeline if that value is zero or below an expected threshold. Most teams don't configure this.
At the staging layer: Data contracts between pipeline output and model input. If a table should have rows where event_date = today, assert that before the Power BI refresh runs.
At the model layer: Volume monitoring on loaded datasets. If the fact table has 40% fewer rows than yesterday, the refresh completed successfully but something is wrong.
At the report layer: Watermark monitoring. Surface the max date in the dataset somewhere visible — either in the report itself or in your monitoring tool — so someone can glance and see whether the data is current.
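Taken together, the four layers can be sketched as a single pass that aggregates every check into one named pass/fail summary. All thresholds and parameter names here are illustrative assumptions:

```python
from datetime import date


def run_checks(rows_copied: int, staged_today: int,
               fact_rows: int, fact_rows_prev: int,
               max_loaded: date, as_of: date) -> dict:
    """Run the four layer checks and return one pass/fail summary."""
    # Volume check: flag a drop of more than 40% vs. the prior load.
    volume_ok = fact_rows_prev == 0 or fact_rows >= fact_rows_prev * 0.6
    return {
        "pipeline: rows copied":     rows_copied > 0,
        "staging: rows for today":   staged_today > 0,
        "model: volume vs. prior":   volume_ok,
        "report: watermark current": (as_of - max_loaded).days <= 1,
    }


if __name__ == "__main__":
    summary = run_checks(
        rows_copied=12_500, staged_today=12_500,
        fact_rows=980_000, fact_rows_prev=1_000_000,
        max_loaded=date(2024, 5, 2), as_of=date(2024, 5, 2),
    )
    for name, ok in summary.items():
        print("PASS" if ok else "FAIL", name)
```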
None of these checks are individually complex. The challenge is building the infrastructure to run all of them reliably and surface the results in a single place rather than across four different consoles.