High severityconnection
MySQL Error:
2013
What does this error mean?
MySQL error 2013 means the TCP connection between client and server was severed while a query was actively running — the server was in the middle of sending or receiving data when the link dropped. Unlike error 2006 (server gone away between queries), 2013 hits mid-execution: the client sent a query, got no complete response, and the socket closed or timed out. In a data pipeline this surfaces as a failed copy activity, an incomplete dataset load, or a broken dbt run. The engineer typically sees partial row counts in the destination or a pipeline run that hangs for the full timeout duration before failing.
Common causes
- 1net_read_timeout or net_write_timeout expired mid-query: MySQL's default is 30 seconds for both. Any query or result transfer that takes longer — such as a full-table scan on a 50M-row table — will be killed by the server without warning, dropping the connection.
- 2OOM killer terminated the mysqld process: When the MySQL server runs out of RAM (often due to a large sort buffer, an uncapped innodb_buffer_pool_size, or concurrent workloads), the Linux kernel's OOM killer sends SIGKILL to the process. The client sees this as a sudden connection drop mid-query.
- 3Network interruption between client and server: Firewalls, load balancers, and NAT gateways often enforce idle or long-running TCP session limits (typically 300–900 seconds). A query that runs longer than the appliance's TCP timeout is silently dropped at the network layer, leaving the client hanging until its own read timeout fires.
- 4Large result set exhausting network buffers: Fetching millions of rows without server-side cursors causes the MySQL server to buffer the entire result set before streaming. If the result set exceeds max_allowed_packet or overwhelms the send buffer, the write stalls and the server eventually drops the connection.
- 5Binary log or replication lag causing server stall: On a busy primary, heavy replication writes can starve query threads. If the server stalls long enough during result streaming, the client-side net_read_timeout fires and drops the connection from the client side.
- 6Query killed by wait_timeout on the server side during execution: Although wait_timeout normally governs idle connections, a misconfigured interactive_timeout combined with a long-running query on a sleeping connection can race and terminate the session before the query completes.
- 7Cloud NAT gateway or Azure VNet terminating long TCP sessions: ADF's integration runtime connects from Azure infrastructure through NAT. Azure Load Balancer enforces a 4-minute idle TCP timeout; if the MySQL server stops sending rows for more than 4 minutes (e.g., sorting a large result), the NAT entry expires and the connection is silently dropped.
How to fix it
- 1Step 1: Raise server-side network timeouts to accommodate long-running queries. Connect as root and run: `SET GLOBAL net_read_timeout=3600; SET GLOBAL net_write_timeout=3600;` Then persist in my.cnf under [mysqld]: `net_read_timeout=3600` and `net_write_timeout=3600`. Restart not required if SET GLOBAL is used, but the my.cnf change survives reboots.
- 2Step 2: Check for OOM kills in the OS kernel log: `dmesg | grep -i 'killed process' | tail -20` and `journalctl -k | grep -i oom | tail -20`. If you see mysqld in the output, the server ran out of memory. Cross-check mysqld memory usage: `ps aux --sort=-%mem | grep mysqld`.
- 3Step 3: If OOM is confirmed, reduce InnoDB buffer pool to ~70% of available RAM: `SET GLOBAL innodb_buffer_pool_size=<bytes>;` and set in my.cnf. Also check sort_buffer_size and join_buffer_size — these are per-thread and multiply with connection count.
- 4Step 4: For ADF copy activities on large tables, enable partition-based reads in the ADF MySQL source settings. Set 'Partition option' to 'Physical partitions of table' or 'Dynamic range', specify the partition column (typically a numeric PK or timestamp) and upper/lower bounds. This splits one long query into many short parallel queries, each well within timeout limits.
- 5Step 5: Add a TCP keepalive and connection retry policy in ADF. In the linked service advanced settings, set 'Connection timeout' to 300 and enable retry: in the copy activity → Settings tab → set 'Retry count' to 3 and 'Retry interval' to 120 seconds. ADF will re-issue the query on transient network drops.
- 6Step 6: If the connection drops at a consistent time interval (e.g., exactly 240 seconds), the cause is a NAT or firewall TCP idle timeout. Fix by enabling TCP keepalive on the MySQL server: `SET GLOBAL wait_timeout=28800; SET GLOBAL interactive_timeout=28800;` and configure the OS-level keepalive: `echo 60 > /proc/sys/net/ipv4/tcp_keepalive_time`. For Azure VNet, increase the load balancer idle timeout to 30 minutes in the LB rule settings.
- 7Step 7: For dbt or application-level connections, configure the MySQL connector to use server-side cursors and fetch rows in batches rather than buffering the full result set. In Python/SQLAlchemy: use `execution_options(stream_results=True)`. In dbt profiles.yml under mysql adapter, set `connect_timeout: 3600`.
Example log output
[2026-05-11 03:14:22] ERROR 2013 (HY000): Lost connection to MySQL server during query
[2026-05-11 03:14:22] ADF Copy Activity 'Copy_Orders_Full' failed after 1802s: Source read error on table `orders` — OperationalError: (2013, 'Lost connection to MySQL server during query')
[2026-05-11 03:14:22] Rows read before failure: 18,442,310 of estimated 31,000,000