Troubleshooting SQL Connection Issues with Network Watcher

Let’s walk through a classic “everything is on fire” scenario. This is the bread-and-butter case that Network Watcher exists for.

The Scenario: “The Database is Down (But it’s Not)”

The Setup: You have a 3-tier application. Your Frontend Web VM is trying to connect to a Backend SQL VM on port 1433.

The Symptom: The web app is throwing “Connection Timed Out” errors. Your database admin swears the SQL server is up and running perfectly.

Here is how you use Network Watcher to find the culprit in 5 minutes.


Step 1: The “Bouncer” Check (IP Flow Verify)

First, you need to know whether a firewall rule is blocking the traffic. You can run this check from the portal, or script it (see the sketch after this list).

  • Action: Run IP Flow Verify.
  • Input: Source IP (Web VM), Destination IP (SQL VM), Port 1433, Protocol TCP.
  • The Result: Network Watcher tells you: “Denied by Security Rule: DefaultRule_DenyAllInBound”.
  • The Fix: The traffic is falling through to the default deny rule, which means nothing explicitly allows it into the backend subnet. You add an inbound allow rule for TCP 1433 from the frontend subnet.
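
If you’d rather script this check than click through the portal, here’s a minimal sketch using the azure-mgmt-network Python SDK. The subscription ID, resource IDs, and IP addresses are placeholders invented for this scenario, and I’m assuming the default regional Network Watcher (resource group NetworkWatcherRG, name like NetworkWatcher_eastus).

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import VerificationIPFlowParameters

# Placeholder values for this scenario -- substitute your own.
SUBSCRIPTION_ID = "<subscription-id>"
SQL_VM_ID = (
    "/subscriptions/<subscription-id>/resourceGroups/app-rg"
    "/providers/Microsoft.Compute/virtualMachines/sql-vm"
)

client = NetworkManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Ask the platform: would an inbound TCP packet from the web VM
# reach port 1433 on the SQL VM, given the effective NSG rules?
result = client.network_watchers.begin_verify_ip_flow(
    "NetworkWatcherRG",       # default Network Watcher resource group
    "NetworkWatcher_eastus",  # default watcher name for the region
    VerificationIPFlowParameters(
        target_resource_id=SQL_VM_ID,
        direction="Inbound",
        protocol="TCP",
        local_ip_address="10.0.1.4",   # SQL VM side (the target VM)
        local_port="1433",
        remote_ip_address="10.0.0.4",  # web VM side
        remote_port="49152",           # any ephemeral client port
    ),
).result()

print(result.access)     # "Allow" or "Deny"
print(result.rule_name)  # the exact NSG rule that made the decision
```

If access comes back “Deny”, rule_name tells you exactly which rule to go argue about.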

Step 2: The “GPS” Check (Next Hop)

Traffic is now “Allowed” by the NSG, but the app still can’t connect. Next, you check whether the packets are actually being routed to the right place (also scriptable; see the sketch after this list).

  • Action: Run Next Hop.
  • The Result: It shows the next hop type is VirtualAppliance (an NVA such as a Palo Alto or FortiGate firewall) instead of VnetLocal, direct delivery within the virtual network.
  • The Insight: You find an old User-Defined Route (UDR) that is forcing traffic through a firewall that isn’t configured to handle SQL traffic.
  • The Fix: You update the route table so traffic to the backend subnet is delivered directly over the virtual network instead of being forced through the NVA. (Routes match address prefixes, not ports, so you scope the change by destination subnet.)
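
Here’s the scripted version of the same check, again a rough sketch with made-up resource names:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import NextHopParameters

SUBSCRIPTION_ID = "<subscription-id>"
WEB_VM_ID = (
    "/subscriptions/<subscription-id>/resourceGroups/app-rg"
    "/providers/Microsoft.Compute/virtualMachines/web-vm"
)

client = NetworkManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Where does the platform actually send a packet from the web VM
# to the SQL VM's private IP?
hop = client.network_watchers.begin_get_next_hop(
    "NetworkWatcherRG",
    "NetworkWatcher_eastus",
    NextHopParameters(
        target_resource_id=WEB_VM_ID,
        source_ip_address="10.0.0.4",       # web VM
        destination_ip_address="10.0.1.4",  # SQL VM
    ),
).result()

# "VnetLocal" is healthy here; "VirtualAppliance" means a UDR detour.
print(hop.next_hop_type)
print(hop.next_hop_ip_address)  # the NVA's IP, if one is in the path
print(hop.route_table_id)       # which route table made the call
```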

Step 3: The “All-in-One” Diagnostic (Connection Troubleshoot)

The rules look good, the route looks good, but it’s still failing. You’re starting to sweat. Time to test the whole path end to end (there’s a code sketch after this list).

  • Action: Run Connection Troubleshoot.
  • The Result: It checks everything (DNS, routing, NSGs), and every hop in the path comes back clean, yet the overall status is “Unreachable”: the packets are reaching the VM, but nothing is answering on port 1433.
  • The Insight: This is the “Eureka” moment. The network is fine, but the application is rejecting the connection.
  • The Fix: You log into the SQL VM and realize the Windows Firewall is on and blocking 1433, or that SQL Server isn’t listening on the VM’s private IP (it’s bound only to localhost, for example).
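
The scripted flavor, sketched with the same placeholder IDs. One assumption worth flagging: Connection Troubleshoot needs the Network Watcher agent extension installed on the source VM.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import (
    ConnectivityDestination,
    ConnectivityParameters,
    ConnectivitySource,
)

SUBSCRIPTION_ID = "<subscription-id>"
WEB_VM_ID = (
    "/subscriptions/<subscription-id>/resourceGroups/app-rg"
    "/providers/Microsoft.Compute/virtualMachines/web-vm"
)
SQL_VM_ID = (
    "/subscriptions/<subscription-id>/resourceGroups/app-rg"
    "/providers/Microsoft.Compute/virtualMachines/sql-vm"
)

client = NetworkManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# One call that exercises the whole path:
# source VM -> routing -> NSGs -> port 1433 on the destination.
info = client.network_watchers.begin_check_connectivity(
    "NetworkWatcherRG",
    "NetworkWatcher_eastus",
    ConnectivityParameters(
        source=ConnectivitySource(resource_id=WEB_VM_ID),
        destination=ConnectivityDestination(resource_id=SQL_VM_ID, port=1433),
    ),
).result()

print(info.connection_status)  # "Reachable" or "Unreachable"
for hop in info.hops:          # per-hop breakdown with any issues found
    print(hop.type, hop.address, [issue.type for issue in (hop.issues or [])])
```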

Step 4: The “Deep Dive” (Packet Capture)

If Connection Troubleshoot had said the network was reachable but you were still seeing weird data corruption or intermittent drops, you’d go nuclear (the capture setup is sketched in code after this list).

  • Action: Start a Remote Packet Capture on both VMs.
  • The Result: You download the .cap file and open it in Wireshark.
  • The Insight: You see a “TCP Reset” (RST) packet arriving on a connection that had gone quiet. This proves a middle-box (like a Load Balancer) is killing the connection due to an idle timeout.
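
Going nuclear in code looks roughly like this. It’s a sketch assuming a storage account for the capture output, the Network Watcher extension on the target VM, and the same placeholder names as before:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import (
    PacketCapture,
    PacketCaptureFilter,
    PacketCaptureStorageLocation,
)

SUBSCRIPTION_ID = "<subscription-id>"
SQL_VM_ID = (
    "/subscriptions/<subscription-id>/resourceGroups/app-rg"
    "/providers/Microsoft.Compute/virtualMachines/sql-vm"
)
STORAGE_ID = (
    "/subscriptions/<subscription-id>/resourceGroups/app-rg"
    "/providers/Microsoft.Storage/storageAccounts/diagstore"
)

client = NetworkManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Capture only SQL traffic, for at most five minutes; the .cap file
# lands in the storage account when the capture stops.
capture = client.packet_captures.begin_create(
    "NetworkWatcherRG",
    "NetworkWatcher_eastus",
    "sql-handshake-capture",
    PacketCapture(
        target=SQL_VM_ID,
        time_limit_in_seconds=300,
        storage_location=PacketCaptureStorageLocation(storage_id=STORAGE_ID),
        filters=[PacketCaptureFilter(protocol="TCP", local_port="1433")],
    ),
).result()

print(capture.provisioning_state)

# Stop early once you've reproduced the failure:
# client.packet_captures.begin_stop(
#     "NetworkWatcherRG", "NetworkWatcher_eastus", "sql-handshake-capture"
# ).result()
```

Run the same capture on the Web VM too, so you can compare both sides of the conversation in Wireshark.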

Summary of the “Defense in Depth” Workflow

For each tool, ask yourself:

  • IP Flow Verify: “Is the Bouncer (NSG) letting me in?”
  • Next Hop: “Is the GPS (Routing) sending me to the right house?”
  • NSG Flow Logs: “Did the packet actually arrive at the gate?”
  • Connection Troubleshoot: “Is the whole path from A to B clear?”
  • Packet Capture: “What exactly are these two talking about?”
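
One note on NSG Flow Logs, since they only appear in this table: before you lean on them, it’s worth confirming they’re actually enabled for the NSG in question. A quick sketch, same placeholder names as above:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import FlowLogStatusParameters

SUBSCRIPTION_ID = "<subscription-id>"
NSG_ID = (
    "/subscriptions/<subscription-id>/resourceGroups/app-rg"
    "/providers/Microsoft.Network/networkSecurityGroups/backend-nsg"
)

client = NetworkManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Is flow logging on for the backend NSG, and where do the logs go?
status = client.network_watchers.begin_get_flow_log_status(
    "NetworkWatcherRG",
    "NetworkWatcher_eastus",
    FlowLogStatusParameters(target_resource_id=NSG_ID),
).result()

print(status.enabled)     # True / False
print(status.storage_id)  # the storage account receiving the JSON logs
```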

Peer Tip: Always start with IP Flow Verify. In 90% of Azure networking cases, the problem is a “Deny” rule in a Network Security Group that someone forgot existed.
