
Power BI On-Premises Gateway Offline: Causes, Diagnostics, and Fixes

A gateway that goes offline at 02:00 and recovers by 09:00 can silently fail dozens of scheduled refreshes while everyone sleeps.

What the on-premises gateway actually does — and why it fails

The Power BI on-premises data gateway is a Windows service (PBIEgwService) that runs on a machine inside your network and maintains a persistent outbound connection to Azure Service Bus. When Power BI Service needs to execute a scheduled refresh against an on-premises data source, it sends the job to the gateway through that Service Bus channel. The gateway queries the source, transforms the data through the Power Query Mashup Engine, and streams the result back to Power BI Service.

This architecture has two links that can break independently. The first is the connection from Power BI Service to the gateway through Azure Service Bus — a cloud-to-gateway link. The second is the connection from the gateway machine to the actual data source — a gateway-to-datasource link. Most gateway troubleshooting sessions start without knowing which link has failed, which is why systematic diagnosis matters.

When the gateway is offline, all scheduled refreshes for every dataset routed through that gateway fail simultaneously. The simultaneity is itself a diagnostic signal: when ten datasets fail at exactly the same time, the problem is in the gateway, not in any individual dataset.
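The simultaneity check can be sketched as a small PowerShell helper. This is a minimal sketch, not part of any Power BI tooling: the function name Find-SimultaneousFailure, the Dataset and FailedAt property names, and the five-minute window are assumptions chosen for illustration, and the input would come from wherever you export refresh history.

```powershell
# Hypothetical helper: reports time windows in which several datasets failed together.
function Find-SimultaneousFailure {
    param(
        [Parameter(Mandatory)] [object[]] $Failures,  # objects with Dataset and FailedAt
        [int] $WindowMinutes = 5,                     # assumed bucket size
        [int] $MinDatasets = 3                        # assumed threshold for "many"
    )
    $Failures |
        Group-Object -Property {
            # Round each failure time down to the start of its window
            $t = [datetime]$_.FailedAt
            $t.AddMinutes(-($t.Minute % $WindowMinutes)).ToString('yyyy-MM-dd HH:mm')
        } |
        ForEach-Object {
            $datasets = @($_.Group.Dataset | Sort-Object -Unique)
            if ($datasets.Count -ge $MinDatasets) {
                [pscustomobject]@{
                    WindowStart = $_.Name
                    Datasets    = $datasets
                }
            }
        }
}

# Example input (made-up data)
$failures = @(
    [pscustomobject]@{ Dataset = 'Sales';   FailedAt = '2024-01-10 02:00:05' }
    [pscustomobject]@{ Dataset = 'Finance'; FailedAt = '2024-01-10 02:01:40' }
    [pscustomobject]@{ Dataset = 'HR';      FailedAt = '2024-01-10 02:03:12' }
    [pscustomobject]@{ Dataset = 'Ops';     FailedAt = '2024-01-10 05:30:00' }
)
Find-SimultaneousFailure -Failures $failures
```

The bucketing is deliberately coarse: failures that straddle a window boundary land in different buckets, so a wider window or a lower threshold may suit slow-failing refreshes better.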

PBIEgwService stopped: the most common cause

The most frequently cited cause for gateway offline status in the Fabric Community forums — including a thread with over 22,000 views — is the gateway Windows service stopping unexpectedly. Windows Update reboots the server, an antivirus process terminates the service, or a crash leaves the service in a stopped state without auto-recovery configured.

Check service status and configure auto-restart:

Get-Service -Name "PBIEgwService" | Select-Object Name, Status, StartType

If stopped, restart and configure failure recovery:

Restart-Service -Name "PBIEgwService" -Force
sc.exe failure PBIEgwService reset= 86400 actions= restart/5000/restart/10000/restart/20000

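This configures the service to restart automatically after a crash, with increasing delays between restart attempts (5 seconds, then 10, then 20). The reset= 86400 value clears the failure counter after 24 hours without a failure, so a crash on a later day starts again from the shortest delay. Also set the startup type to Automatic in Services (services.msc) so the service starts after server reboots without manual intervention.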
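The startup type can also be set from the command line instead of services.msc. A minimal sketch, assuming an elevated PowerShell session on the gateway machine and the default service name used throughout this article:

```powershell
# Make the gateway service start with Windows after a reboot.
Set-Service -Name "PBIEgwService" -StartupType Automatic

# Print the recovery (failure) actions currently configured for the service,
# to confirm that the restart actions were stored.
sc.exe qfailure PBIEgwService
```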

For servers managed by Windows Update, schedule update restarts outside the scheduled refresh windows, or pause updates during critical refresh periods.

Gateway offline troubleshooting checklist — five checks in order, each with a targeted fix and resolution likelihood.

Firewall blocking Azure Service Bus

The gateway communicates with Power BI Service exclusively through outbound connections to Azure Service Bus on port 443 (HTTPS/WebSocket). If a firewall rule blocks this outbound traffic, the gateway cannot maintain its connection to Power BI and appears offline even though the gateway service itself is running.

The required endpoints for gateway communication:
- *.servicebus.windows.net on port 443
- *.frontend.clouddatahub.net on port 443

Test connectivity from the gateway machine:

Test-NetConnection -ComputerName "westeurope-prod-sb.servicebus.windows.net" -Port 443

If the test fails, the firewall is blocking the connection. Add the required endpoints to the outbound allowlist. Note that proxy servers that perform TLS inspection may intercept and break the WebSocket connection even when the port is open — the gateway app's diagnostics section can verify whether the Service Bus connection succeeds end-to-end.
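When several hostnames need checking, the same test can be looped. This is a rough sketch: the list below contains only the example host from this article, and the hostnames your gateway actually resolves are an assumption you need to fill in for your own region and tenant.

```powershell
# Assumption: replace this list with the concrete hostnames your gateway uses.
$endpoints = @(
    "westeurope-prod-sb.servicebus.windows.net"   # example host from this article
)

foreach ($endpoint in $endpoints) {
    # Test-NetConnection attempts a TCP connection to the given port
    $result = Test-NetConnection -ComputerName $endpoint -Port 443 -WarningAction SilentlyContinue
    [pscustomobject]@{
        Endpoint  = $endpoint
        Reachable = $result.TcpTestSucceeded
        RemoteIP  = $result.RemoteAddress
    }
}
```

Wildcard entries such as *.servicebus.windows.net cannot be tested directly; a concrete hostname is always required. A successful TCP test also does not rule out TLS inspection breaking the WebSocket connection afterwards.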

The gateway configuration file (Microsoft.PowerBI.DataMovement.Pipeline.GatewayCore.dll.config) can be used to specify proxy settings if the gateway machine requires proxy authentication for outbound connections.

Gateway installed on a laptop or non-dedicated machine

Installing the Power BI gateway on a laptop, a developer workstation, or any machine that is not always on is one of the most reliable ways to get unpredictable scheduled refresh failures. Any interruption to the machine — sleep, lid close, or removal from the network — takes the gateway offline with it.

Microsoft's documentation explicitly recommends a dedicated server or VM that never sleeps and has a stable network connection. This is not a suggestion for large environments — it is the minimum requirement for reliable production scheduled refresh.

For machines that must run the gateway but are not fully dedicated, disable sleep mode:

powercfg /change standby-timeout-ac 0
powercfg /change monitor-timeout-ac 0

This prevents the machine from entering standby while plugged in. It does not help if the machine is shut down or restarted — for those cases, a dedicated server with Automatic service startup and recovery is the only reliable configuration.

Outdated gateway version

Microsoft releases the on-premises data gateway on a monthly update cycle. Older versions accumulate known bugs: connectivity issues with updated Azure Service Bus endpoints, certificate expiry problems, and Mashup Engine bugs that cause specific types of data source connections to fail.

The gateway version is visible in Power BI Service under Settings > Manage gateways. If the installed version is more than two or three months behind the current release, update it before spending time on other diagnostics — the issue may be a known bug with a published fix.
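The installed version can also be read on the gateway machine itself. The sketch below assumes the default installation directory and executable name; both are assumptions, so adjust the path if the gateway was installed elsewhere.

```powershell
# Assumption: default install location of the on-premises data gateway.
$gatewayExe = "C:\Program Files\On-premises data gateway\EnterpriseGatewayConfigurator.exe"

if (Test-Path $gatewayExe) {
    # VersionInfo exposes the version metadata embedded in the executable
    (Get-Item $gatewayExe).VersionInfo | Select-Object ProductVersion, FileVersion
} else {
    Write-Warning "Gateway executable not found at $gatewayExe"
}
```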

Download the latest version from the Microsoft Download Center or from the Manage gateways page in Power BI Service. The update process is a standard installer that preserves the existing gateway configuration and data source credentials.

Organizations that skip gateway updates for extended periods occasionally find that their gateway loses connectivity altogether because Azure-side endpoints have changed and the old version no longer knows how to reach them. Updating monthly or quarterly prevents this class of failure.

Credential expiry and data source authentication failures

The gateway stores credentials for each configured data source. When a data source credential expires — because a service account password was rotated, an OAuth token was revoked, or an API key was cycled — the gateway can still be online (connected to Azure Service Bus) but fail to connect to the specific data source. Power BI may report this as a gateway error rather than a credential error, making it look like a gateway infrastructure problem when it is actually an authentication problem.

To distinguish: check whether all datasets on the gateway fail or only datasets using a specific data source. If only one or a few datasets fail while others succeed, the gateway is likely healthy and the problem is the credential for that specific data source.
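The all-versus-some distinction can be expressed as a short helper. This is an illustrative sketch only: the function name Get-FailureScope and the Dataset, DataSource, and Succeeded property names are hypothetical, and the input is assumed to be the latest refresh result per dataset on one gateway.

```powershell
# Hypothetical helper: decides whether failures point at the gateway or at a data source.
function Get-FailureScope {
    param(
        [Parameter(Mandatory)] [object[]] $Refreshes  # objects with Dataset, DataSource, Succeeded
    )
    $failed = @($Refreshes | Where-Object { -not $_.Succeeded })

    if ($failed.Count -eq 0) {
        return [pscustomobject]@{ Scope = 'None'; DataSources = @() }
    }
    if ($failed.Count -eq $Refreshes.Count) {
        # Everything on the gateway failed: investigate the gateway first
        return [pscustomobject]@{ Scope = 'Gateway'; DataSources = @() }
    }
    # Only some datasets failed: list the data sources they have in common
    $sources = @($failed.DataSource | Sort-Object -Unique)
    [pscustomobject]@{ Scope = 'DataSource'; DataSources = $sources }
}

# Example input (made-up data)
$refreshes = @(
    [pscustomobject]@{ Dataset = 'Sales';   DataSource = 'sql-sales';   Succeeded = $true }
    [pscustomobject]@{ Dataset = 'Finance'; DataSource = 'sql-finance'; Succeeded = $false }
    [pscustomobject]@{ Dataset = 'Budget';  DataSource = 'sql-finance'; Succeeded = $false }
)
Get-FailureScope -Refreshes $refreshes
```

A result of DataSource with a single entry is the pattern that most strongly suggests an expired credential rather than a gateway infrastructure problem.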

Update credentials in Power BI Service: Settings > Manage gateways > select the gateway > select the data source > Edit credentials > enter the current values. After saving, trigger a manual refresh to verify the new credentials work before the next scheduled refresh window.

For organizations with 90-day AD password rotation policies, add a calendar reminder to update Power BI data source credentials three days before each rotation. The predictable failure that arrives every 90 days has a predictable fix — the only thing that makes it an incident instead of a planned maintenance step is forgetting.
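The reminder date is simple date arithmetic. A minimal sketch, assuming you know the date of the last rotation; the function name and parameter names are hypothetical, and the defaults mirror the 90-day policy and three-day lead time mentioned above.

```powershell
# Hypothetical helper: returns the date on which to update Power BI credentials.
function Get-CredentialReminderDate {
    param(
        [Parameter(Mandatory)] [datetime] $LastRotation,
        [int] $RotationDays = 90,   # length of the password rotation cycle
        [int] $LeadDays = 3         # how many days before rotation to be reminded
    )
    # Next rotation is LastRotation + RotationDays; the reminder falls LeadDays earlier
    $LastRotation.Date.AddDays($RotationDays - $LeadDays)
}

Get-CredentialReminderDate -LastRotation '2024-01-01'
```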

Reading gateway logs for the actual error

The gateway logs contain more detail than what Power BI Service reports in the refresh history. When the refresh history shows a generic gateway error, the log file on the gateway machine shows the specific exception, the timestamp, and often the exact component that failed.

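Gateway log location:

explorer "$env:LOCALAPPDATA\Microsoft\On-premises data gateway"

If this folder contains no GatewayErrors files, check the service account's profile instead: a gateway service running under the default service account typically writes its logs to C:\Windows\ServiceProfiles\PBIEgwService\AppData\Local\Microsoft\On-premises data gateway.

Filter the log for errors (the error log file names usually carry a timestamp suffix, hence the wildcard):

Get-Content "$env:LOCALAPPDATA\Microsoft\On-premises data gateway\GatewayErrors*.log" -Tail 200 | Where-Object { $_ -match "ERROR|WARN" }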

The log entries include an InnerType and InnerMessage field in JSON format. For example, a network connectivity failure appears as message GatewayNotReachable with InnerType System.Net.Sockets.SocketException and InnerMessage "No connection could be made because the target machine actively refused it". The InnerType identifies the exception class — CryptographicException for certificate or SSO problems, SocketException for network connectivity, and SqlException for data source authentication failures. Reading InnerType and InnerMessage is almost always faster than working from the generic error code in Power BI Service.
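The InnerType triage described above can be sketched as a small parser. This is an assumption-laden sketch: the function name is hypothetical, the sample log line is constructed from the example in this article rather than copied from a real log, and the regular expression assumes InnerType appears as a quoted JSON field.

```powershell
# Hypothetical helper: pulls InnerType out of a log line and maps it to a failure category.
function Get-GatewayErrorCategory {
    param([Parameter(Mandatory)] [string] $LogLine)

    if ($LogLine -match '"InnerType"\s*:\s*"([^"]+)"') {
        $innerType = $Matches[1]
    } else {
        return $null   # no InnerType field on this line
    }

    # Map the exception class name to the categories named in this article
    $category = switch -Regex ($innerType) {
        'CryptographicException$' { 'Certificate or SSO'; break }
        'SocketException$'        { 'Network connectivity'; break }
        'SqlException$'           { 'Data source authentication'; break }
        default                   { 'Other' }
    }
    [pscustomobject]@{ InnerType = $innerType; Category = $category }
}

# Constructed sample line, for illustration only
$line = '{"Message":"GatewayNotReachable","InnerType":"System.Net.Sockets.SocketException","InnerMessage":"No connection could be made because the target machine actively refused it"}'
Get-GatewayErrorCategory -LogLine $line
```

Piping the filtered log lines through a helper like this gives a quick count of failure categories before reading individual entries.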

MetricSign and gateway failure grouping

MetricSign groups refresh_failed incidents by gateway cluster, so ten simultaneous dataset failures appear as one incident listing all affected datasets — not as ten separate alerts. The on-call engineer sees the scope immediately and goes straight to the gateway investigation.

MetricSign also generates refresh_delayed incidents when an expected refresh does not start within a window after its scheduled time. This catches the case where the gateway is offline and the refresh never starts — which produces no error code in Power BI Service and would otherwise be invisible until a user opens a stale dashboard.
