MetricSign

How do I monitor an on-premises data gateway for Power BI?

The on-premises data gateway is a single point of failure for every Power BI dataset that reaches on-premises data sources through it. When the gateway fails, every dataset routed through it stops refreshing, often with little warning before the failure appears in the Power BI service.

What to monitor

1. Gateway service status

The gateway runs as a Windows service. The service can crash because of updates, memory pressure, or underlying system issues. The Power BI service detects an offline gateway through a heartbeat polling interval (typically 10 minutes), so by the time the portal shows the gateway as offline, scheduled refreshes that ran in that window have already failed silently.

Direct monitoring of the Windows service status (via WinRM, a monitoring agent, or a custom script) gives faster detection than waiting for the Power BI portal to reflect the offline state.
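
As a minimal sketch of such a custom script, the Python below asks the Windows `sc query` command for the service state and parses the result. It assumes the gateway service is registered under the name `PBIEgwService` and that `sc` prints its labels in English; confirm both on your own host. The function names are illustrative, not part of any gateway tooling.

```python
import re
import subprocess

# Assumption: the gateway's Windows service is registered as "PBIEgwService".
# Verify the name on your gateway host before relying on it.
SERVICE_NAME = "PBIEgwService"

# Matches lines such as "STATE              : 4  RUNNING" in `sc query` output.
_STATE_RE = re.compile(r"STATE\s*:\s*\d+\s+(\w+)")


def parse_sc_state(sc_output: str) -> str:
    """Extract the state word (RUNNING, STOPPED, ...) from `sc query` output."""
    match = _STATE_RE.search(sc_output)
    return match.group(1) if match else "UNKNOWN"


def gateway_service_state(service_name: str = SERVICE_NAME) -> str:
    """Run `sc query` locally (Windows only) and return the parsed state."""
    result = subprocess.run(
        ["sc", "query", service_name],
        capture_output=True,
        text=True,
        check=False,  # a missing service is reported as UNKNOWN, not an exception
    )
    return parse_sc_state(result.stdout)


if __name__ == "__main__":
    state = gateway_service_state()
    if state != "RUNNING":
        print(f"ALERT: {SERVICE_NAME} is {state}")
```

Run on a schedule of a minute or so, a check like this would notice a stopped service well inside the portal's polling interval.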

2. Gateway cluster capacity

A single gateway node handles a limited number of concurrent refreshes effectively. When the gateway is processing more jobs than its capacity supports, refreshes queue up and appear slow before they start failing with timeout errors. Monitoring active job counts and queue depth helps identify capacity pressure before it causes failures.
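
One way to turn queue depth into an alert is to look for sustained pressure rather than a single spike. The sketch below assumes you already collect periodic queue-depth samples from some source (the collection method is not covered here); the window size and threshold are placeholder values to tune against your own gateway.

```python
from statistics import mean
from typing import Sequence


def capacity_pressure(
    queue_depths: Sequence[int],
    window: int = 5,
    threshold: float = 3.0,
) -> bool:
    """Return True when the mean of the last `window` samples exceeds `threshold`.

    `window` and `threshold` are hypothetical defaults, not recommended values.
    """
    if len(queue_depths) < window:
        return False  # too few samples to call it a sustained trend
    return mean(queue_depths[-window:]) > threshold


if __name__ == "__main__":
    samples = [0, 1, 2, 4, 5, 6, 7]  # illustrative queue depths, oldest first
    if capacity_pressure(samples):
        print("ALERT: refresh queue is growing; gateway may be over capacity")
```

Averaging over a window keeps a brief burst of overlapping refreshes from paging anyone, while a queue that stays deep still triggers.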

3. Gateway version

Microsoft releases gateway updates regularly. Running an outdated gateway version can cause authentication failures, connection errors, and compatibility issues with new Power BI features. Compare the installed gateway version against the current release and raise an alert when it falls behind by more than one release cycle.
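
The version comparison can be sketched as follows. It assumes versions are dotted numeric strings and that you maintain your own list of known releases; the version numbers in the usage example are made up for illustration. Parsing into integer tuples avoids the string-comparison trap where "3000.9.5" sorts after "3000.10.1".

```python
from typing import Sequence, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as '3000.10.1' into (3000, 10, 1)."""
    return tuple(int(part) for part in version.split("."))


def releases_behind(installed: str, known_releases: Sequence[str]) -> int:
    """Count distinct known releases that are newer than the installed version."""
    current = parse_version(installed)
    return sum(
        1 for release in set(known_releases) if parse_version(release) > current
    )


if __name__ == "__main__":
    # Hypothetical version numbers, not real gateway releases.
    known = ["3000.10.1", "3000.11.2", "3000.12.3"]
    behind = releases_behind("3000.10.1", known)
    if behind > 1:
        print(f"ALERT: gateway is {behind} releases behind")
```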

4. Per-data-source connectivity

The gateway can be running and healthy while individual data source connections are broken, for example because of a firewall rule change, an expired certificate, or a SQL Server that moved to a new host. Regular connectivity tests (attempted connections to each registered data source) catch these failures before a scheduled refresh runs into them.
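
A basic form of this test is a TCP connection attempt from the gateway host to each data source. The sketch below uses only the Python standard library; the source names, hosts, and ports are hypothetical. Note the limit of the technique: a successful TCP connect shows the host and port are reachable, but it does not validate credentials or certificates, so an expired certificate would need a protocol-level check on top.

```python
import socket
from typing import Dict, Iterable, Tuple


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False


def check_sources(sources: Iterable[Tuple[str, str, int]]) -> Dict[str, bool]:
    """Map each (name, host, port) entry to its reachability result."""
    return {name: tcp_reachable(host, port) for name, host, port in sources}


if __name__ == "__main__":
    # Hypothetical data sources; replace with the ones registered on your gateway.
    registered = [
        ("sales-sql", "sql01.example.internal", 1433),
        ("warehouse-pg", "pg01.example.internal", 5432),
    ]
    for name, ok in check_sources(registered).items():
        if not ok:
            print(f"ALERT: data source {name} is unreachable from the gateway host")
```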

5. Error pattern analysis

Gateway error codes in Power BI refresh failures follow patterns. Multiple datasets failing simultaneously with DM_GWPipeline_Client_GatewayUnreachable indicate that the gateway is offline. Multiple datasets failing with ServiceBusFailed indicate that the gateway cannot reach Azure Service Bus (a network or proxy issue). Analyzing error patterns across datasets gives a clearer signal than looking at individual refresh failures.
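
The grouping idea can be sketched as below: collect refresh failures, group them by error code, and flag any code that hits several distinct datasets within a short time window. The input shape, the 15-minute window, and the two-dataset minimum are all assumptions for illustration. Error codes are normalized so that dotted and underscored spellings of the same code compare equal, since the separator may differ between sources.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Tuple

Failure = Tuple[str, str, datetime]  # (dataset name, error code, failure time)


def correlated_failures(
    failures: Iterable[Failure],
    window: timedelta = timedelta(minutes=15),
    min_datasets: int = 2,
) -> Dict[str, List[str]]:
    """Return error codes seen on >= min_datasets datasets within one window."""
    by_code: Dict[str, List[Tuple[datetime, str]]] = defaultdict(list)
    for dataset, code, failed_at in failures:
        by_code[code.replace(".", "_")].append((failed_at, dataset))

    flagged: Dict[str, List[str]] = {}
    for code, events in by_code.items():
        events.sort()
        # Slide a window starting at each failure and count distinct datasets.
        for start, _ in events:
            datasets = {d for t, d in events if start <= t <= start + window}
            if len(datasets) >= min_datasets:
                flagged[code] = sorted(datasets)
                break
    return flagged


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 6, 0)  # illustrative timestamps
    sample = [
        ("Sales", "DM_GWPipeline_Client_GatewayUnreachable", t0),
        ("Finance", "DM_GWPipeline_Client_GatewayUnreachable", t0 + timedelta(minutes=2)),
        ("HR", "ServiceBusFailed", t0 + timedelta(minutes=3)),
    ]
    for code, datasets in correlated_failures(sample).items():
        print(f"ALERT: {code} on {len(datasets)} datasets: {', '.join(datasets)}")
```

A single dataset failing with a given code is left out of the result, which matches the point that the cross-dataset pattern, not the individual failure, is the signal.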

Gateway monitoring in practice

The most common gap in gateway monitoring is the 10-minute detection lag. A gateway that goes offline at 05:50 and comes back at 06:05 will have silently failed any refresh scheduled in that window, and Power BI may or may not report those as failures depending on timing. Proactive service-level monitoring closes this gap.
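
To make the 05:50 to 06:05 example concrete, the sketch below lists which scheduled refresh start times fall inside an outage window recorded by your own service monitoring. The schedule shown is hypothetical, and the function only handles a window that starts and ends on the same day.

```python
from datetime import time
from typing import Iterable, List


def refreshes_in_outage(
    scheduled: Iterable[time], down_from: time, up_at: time
) -> List[time]:
    """Return scheduled start times in [down_from, up_at); same-day window only."""
    return [slot for slot in scheduled if down_from <= slot < up_at]


if __name__ == "__main__":
    schedule = [time(5, 30), time(6, 0), time(6, 30)]  # hypothetical refresh slots
    affected = refreshes_in_outage(schedule, time(5, 50), time(6, 5))
    for slot in affected:
        print(f"Re-run refresh scheduled at {slot.strftime('%H:%M')}")
```

With the outage start and end taken from direct service monitoring, the affected refreshes can be re-run without waiting for the portal to report them.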
