“peering” in Azure

When discussing “peering” in Azure, it’s important to clarify the context. Usually, this refers to VNet Peering (connecting virtual networks) or Direct Peering (which can refer to Azure Peering Service for optimized internet or ExpressRoute Direct for high-speed private fiber).

Here is what you need to consider for each to ensure a secure and performant design.


1. VNet Peering (Connecting VNets)

VNet Peering is the primary way to connect two Azure Virtual Networks. They behave as a single network using private IP addresses.

🔑 Key Considerations:

  • Address Space Overlap: CRITICAL. You cannot peer VNets if their IP address spaces (CIDR blocks) overlap. Plan your IP schema early; fixing an overlap later requires deleting and recreating the VNet.
  • Transitivity: VNet peering is not transitive. If VNet A is peered with VNet B, and VNet B is peered with VNet C, VNet A cannot talk to VNet C.
    • Solution: Use a Hub-and-Spoke model with an Azure Firewall/NVA or Azure Virtual WAN for transitive routing.
  • Gateway Transit: If VNet A has a VPN/ExpressRoute gateway, you can allow VNet B to use it.
    • Check: Enable “Allow gateway transit” on VNet A and “Use remote gateways” on VNet B.
  • Cost: Local peering (same region) is cheaper than Global peering (different regions). You are charged for both inbound and outbound data transfer on both sides of the peering.
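The overlap rule above can be checked up front. A minimal sketch using Python's stdlib ipaddress module (the CIDR values are hypothetical examples, not a prescribed schema):

```python
# Pre-flight check that two VNets' address spaces do not overlap before
# attempting to peer them. VNet CIDR values below are hypothetical.
from ipaddress import ip_network

def can_peer(vnet_a_prefixes, vnet_b_prefixes):
    """Return True if no CIDR block of VNet A overlaps any block of VNet B."""
    return not any(
        ip_network(a).overlaps(ip_network(b))
        for a in vnet_a_prefixes
        for b in vnet_b_prefixes
    )

print(can_peer(["10.1.0.0/16"], ["10.2.0.0/16"]))    # True  -> safe to peer
print(can_peer(["10.1.0.0/16"], ["10.1.128.0/17"]))  # False -> overlap, peering will fail
```

Running this against your planned IP schema before deployment is far cheaper than deleting and recreating a VNet later.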

2. Direct Peering (ExpressRoute Direct & Peering Service)

“Direct Peering” usually refers to ExpressRoute Direct, where you connect your own hardware directly to Microsoft’s edge routers at 10 Gbps or 100 Gbps.

🔑 Key Considerations:

  • Physical Connectivity: You are responsible for the “Last Mile” fiber from your data center to the Microsoft Peering Location.
  • SKU Selection:
    • Local: For traffic within the same geopolitical region (cheapest).
    • Standard: For traffic within the same continent.
    • Premium: Required for global connectivity and more than 10 VNet links.
  • Microsoft Peering vs. Private Peering:
    • Private Peering: Connects your on-prem network to your Azure VNets (internal traffic).
    • Microsoft Peering: Connects your on-prem network to Microsoft 365, Dynamics 365, and Azure public PaaS services (Storage, SQL) over the ExpressRoute connection rather than the public internet.

3. Comparison Summary

| Consideration | VNet Peering | Direct Peering (ExpressRoute Direct) |
|---|---|---|
| Primary Use | Cloud-to-cloud connectivity. | On-prem-to-cloud (high bandwidth). |
| Medium | Microsoft global backbone. | Dedicated physical fiber + backbone. |
| Bandwidth | Limited by VM/Gateway SKU. | Up to 100 Gbps. |
| Complexity | Low (point-and-click). | High (requires physical fiber/BGP). |
| Security | Encapsulated in Azure backbone. | Private, dedicated physical path. |

🚦 Common Pitfall: Asymmetric Routing

If you have both a VNet Peering and an ExpressRoute circuit connecting the same two locations, Azure might send traffic out via the peering but receive it back via ExpressRoute.

The Fix: Use User-Defined Routes (UDRs) or BGP weights to ensure the “return” path matches the “outbound” path. Azure will prioritize VNet Peering routes over ExpressRoute routes by default if the address prefixes are the same.
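The tie-break described above can be sketched as a small decision function. This is a deliberately simplified model (real Azure route selection has more nuance than four route types), but it captures "most-specific prefix wins, and for equal prefixes peering beats ExpressRoute BGP":

```python
# Simplified model of Azure route selection: longest prefix match first,
# then a precedence order breaks ties. The four route-type labels here are
# an illustrative simplification of Azure's actual route sources.
from ipaddress import ip_address, ip_network

# Lower number = higher priority when prefixes are equally specific.
PRECEDENCE = {"UserDefined": 0, "VNetPeering": 1, "ExpressRouteBGP": 2, "System": 3}

def select_route(routes, destination):
    """routes: list of (prefix, source) tuples; returns the winning route."""
    dest = ip_address(destination)
    candidates = [(p, s) for p, s in routes if dest in ip_network(p)]
    # Most-specific prefix first; type precedence breaks ties.
    return max(candidates, key=lambda r: (ip_network(r[0]).prefixlen, -PRECEDENCE[r[1]]))

routes = [
    ("10.2.0.0/16", "VNetPeering"),
    ("10.2.0.0/16", "ExpressRouteBGP"),
]
print(select_route(routes, "10.2.1.4"))  # VNet peering wins the tie
```

This is why identical prefixes advertised over both paths produce the asymmetry: the outbound choice silently prefers the peering route.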


When a VM can’t talk to a Storage Private Endpoint

When a VM can’t talk to a Storage Private Endpoint, the issue almost always boils down to one of three things: DNS, Network Rules, or Approval State.

Here is your step-by-step troubleshooting checklist.


๐Ÿ” Step 1: The “Approval” Check

Before looking at technical networking, ensure the connection is actually “On.”

  • Check the Status: Go to the Storage Account > Networking > Private Endpoint Connections.
  • Look for “Approved”: If it says Pending, the connection isn’t active yet. Someone needs to manually approve it (common if the Storage Account is in a different subscription than the Private Endpoint).

๐ŸŒ Step 2: The DNS Resolution Check (Most Likely Culprit)

This is where 90% of Private Endpoint issues live. Your VM needs to resolve the Storage Account’s URL to a Private IP (e.g., 10.0.0.5), not its Public IP.

  1. Run a Test: From your VM (PowerShell or Bash), run:
    • nslookup yourstorage.blob.core.windows.net
  2. Evaluate the Result:
    • Bad: It returns a Public IP. Your VM is bypassing the Private Link and hitting the internet (which is likely blocked by the storage firewall).
    • Good: It returns a Private IP (usually in the range of your VNet) and shows an alias like yourstorage.privatelink.blob.core.windows.net.

The Fix:

  • Ensure you have a Private DNS Zone named privatelink.blob.core.windows.net.
  • Ensure that DNS Zone is linked to the Virtual Network where your VM sits.
  • If you use a Custom DNS/Domain Controller, ensure it has a conditional forwarder pointing to the Azure DNS IP 168.63.129.16.
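The zone-link logic behind the good/bad nslookup outcome can be modeled in a few lines. This is a conceptual sketch of how the Azure-provided resolver behaves, not real DNS code; the hostnames and IPs are hypothetical:

```python
# Why a Private Endpoint resolves publicly when the private zone is not
# linked to the querying VNet. Names, VNets, and IPs are hypothetical.
def resolve(hostname, vnet, zone_links, zone_records, public_records):
    """Mimic 168.63.129.16: answer from a linked private zone, else public DNS."""
    for zone, linked_vnets in zone_links.items():
        if hostname.endswith(zone) and vnet in linked_vnets:
            return zone_records[hostname], "private"
    return public_records[hostname], "public"

zone_links = {"privatelink.blob.core.windows.net": {"hub-vnet"}}
zone_records = {"yourstorage.privatelink.blob.core.windows.net": "10.0.0.5"}
public_records = {"yourstorage.privatelink.blob.core.windows.net": "20.60.1.1"}

host = "yourstorage.privatelink.blob.core.windows.net"
print(resolve(host, "hub-vnet", zone_links, zone_records, public_records))    # private IP - good
print(resolve(host, "spoke-vnet", zone_links, zone_records, public_records))  # public IP - bad
```

The second call is exactly the failure mode: the zone exists, but the querying VNet isn't linked to it, so resolution falls through to the public record.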

๐Ÿ›ก๏ธ Step 3: Network Security Group (NSG) Check

Even with Private Link, your Subnet’s “Firewall” rules still apply.

  1. Outbound Rules (VM Subnet): Does the NSG on your VM’s subnet allow traffic to the Private Endpoint’s IP? (Usually, the default “AllowVnetOutbound” covers this, but check for manual “Deny” rules).
  2. Inbound Rules (Private Endpoint Subnet): Private Endpoint subnets support network policies (once enabled on the subnet, NSG rules apply to endpoint traffic). Check if the NSG on the Private Endpoint’s subnet allows inbound traffic from your VM on Port 443.
  3. ASG Check: If you are using Application Security Groups, ensure your VM is a member of the ASG allowed in the NSG rules.

🧱 Step 4: Storage Firewall Settings

By default, when you enable a Private Endpoint, you usually “Lock Down” the Storage Account.

  • Go to Storage Account > Networking.
  • Ensure Public Network Access is set to “Disabled” or “Enabled from selected virtual networks and IP addresses.”
  • Crucial: Even if public access is disabled, the Private Endpoint connection itself must be listed and active in the “Private endpoint connections” tab.

๐Ÿ› ๏ธ Step 5: The “Quick Tools” Test

If you’re still stuck, run these two commands from the VM to narrow down if it’s a DNS or Port issue:

  • Test the Port (TCP 443): From PowerShell on Windows, run Test-NetConnection -ComputerName yourstorage.blob.core.windows.net -Port 443. If this fails but DNS is correct, an NSG or firewall is blocking you.
  • Check the IP directly: Find the Private IP of the endpoint in the Azure Portal and try to ping it (if ICMP is allowed) or use it in the connection string to see if the error changes.
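If the VM is Linux without Test-NetConnection, a Python one-liner does the same TCP handshake test. The storage hostname below is a placeholder for your own account:

```python
# Cross-platform stand-in for Test-NetConnection's TCP test.
import socket

def tcp_check(host, port, timeout=3.0):
    """Return True if a TCP handshake to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (placeholder account name):
# tcp_check("yourstorage.blob.core.windows.net", 443)
# True with correct DNS -> path is open; False with correct DNS -> NSG/firewall block
```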

Summary Checklist:

  1. Is the Private Endpoint Approved?
  2. Does nslookup return a Private IP?
  3. Is the Private DNS Zone linked to the VM’s VNet?
  4. Does the NSG allow traffic on Port 443?

Identity and Access Management (IAM)

Identity and Access Management (IAM) in Azure is the framework of policies and technologies that ensures the right people (and software) have the appropriate access to technology resources.

In 2026, Azure IAM is primarily managed through Microsoft Entra ID (formerly Azure AD). It is built on the philosophy of Zero Trust: “Never trust, always verify.”


๐Ÿ—๏ธ The Core Architecture

Azure IAM is governed by two separate but integrated systems:

  1. Entra ID Roles: Control access to “Identity” tasks (e.g., creating users, resetting passwords, managing domain names).
  2. Azure RBAC (Role-Based Access Control): Control access to “Resources” (e.g., starting a VM, reading a database, managing a virtual network).

🔑 The Three Pillars of IAM

To understand any IAM request, Azure looks at three specific components:

1. Who? (The Security Principal)

This is the “Identity” requesting access. It can be:

  • User: A human (Employee or Guest).
  • Group: A collection of users (Best practice: always assign permissions to groups, not individuals).
  • Service Principal: An identity for an application/tool (e.g., a backup script).
  • Managed Identity: The “most secure” ID for Azure-to-Azure communication.

2. What can they do? (The Role Definition)

A “Role” is a collection of permissions.

  • Owner: Can do everything, including granting access to others.
  • Contributor: Can create/manage resources but cannot grant access.
  • Reader: Can only view resources.
  • Custom Roles: You can create your own if the “Built-in” ones are too broad.

3. Where? (The Scope)

Scope defines the boundary of the access. Azure uses a hierarchy:

  • Management Group: Multiple subscriptions.
  • Subscription: The billing and resource boundary.
  • Resource Group: A logical container for related resources.
  • Resource: The individual VM, SQL DB, or Storage Account.

Note: Permissions are inherited. If you are a “Reader” at the Subscription level, you are a “Reader” for every single resource inside that subscription.
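The inheritance rule can be sketched as a scope-prefix check. The scope paths, principals, and roles below are hypothetical, and real RBAC evaluation (deny assignments, data actions, conditions) is more involved:

```python
# Sketch of RBAC scope inheritance: an assignment at a scope applies to
# every scope nested beneath it. Scope paths and names are hypothetical.
def has_access(assignments, principal, role, scope):
    """True if principal holds role at `scope` or at any ancestor scope."""
    return any(
        p == principal and r == role and (scope == s or scope.startswith(s + "/"))
        for p, r, s in assignments
    )

assignments = [
    # (principal, role, scope) - a group assignment at subscription level
    ("grp-readers", "Reader", "/sub-prod"),
]
print(has_access(assignments, "grp-readers", "Reader", "/sub-prod/rg-app/vm-web01"))  # True (inherited)
print(has_access(assignments, "grp-readers", "Contributor", "/sub-prod/rg-app"))      # False
```

One assignment at the subscription flows down to every resource group and resource beneath it, which is exactly why assigning broad roles high in the hierarchy deserves scrutiny.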


๐Ÿ›ก๏ธ Advanced IAM Tools (The “Pro” Features)

Privileged Identity Management (PIM)

In a modern setup, no one should have “Permanent” admin access. PIM provides:

  • Just-In-Time (JIT) Access: You are “Eligible” for a role, but you only activate it for 2 hours when you need to do work.
  • Approval Workflows: A manager must approve your request to become an Admin.

Conditional Access (The “Smart” Gatekeeper)

Conditional Access is like a “Check-in Desk” that looks at signals before letting you in:

  • Signal: Is the user in a weird location? Is their device unmanaged?
  • Decision: Require MFA, Block access, or allow it.
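The signal-to-decision flow can be modeled as a tiny policy function. The signal names and the policy itself are invented for illustration; real Conditional Access policies are configured in Entra ID, not written in code:

```python
# Toy model of a Conditional Access evaluation: signals in, decision out.
# Signal names and thresholds here are hypothetical.
def evaluate(signals):
    risky_location = signals.get("location_risk") == "high"
    unmanaged_device = not signals.get("device_managed", False)
    if risky_location and unmanaged_device:
        return "block"
    if risky_location or unmanaged_device:
        return "require_mfa"
    return "allow"

print(evaluate({"location_risk": "high", "device_managed": False}))  # block
print(evaluate({"location_risk": "low", "device_managed": False}))   # require_mfa
print(evaluate({"location_risk": "low", "device_managed": True}))    # allow
```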

ABAC (Attribute-Based Access Control)

As of 2025/2026, Azure has expanded into ABAC. This allows you to add “Conditions” to roles.

  • Example: “User can only read storage blobs if the blob is tagged with Project=Blue.”
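That example condition can be sketched as a check layered on top of the role. Tag names and values are hypothetical; real ABAC conditions are expressed in Azure's condition syntax on the role assignment:

```python
# ABAC-style condition on top of an RBAC role: the Reader role alone is
# no longer sufficient; the blob must also carry the required tag.
def can_read_blob(has_reader_role, blob_tags, condition=("Project", "Blue")):
    key, required = condition
    return has_reader_role and blob_tags.get(key) == required

print(can_read_blob(True, {"Project": "Blue"}))   # True
print(can_read_blob(True, {"Project": "Red"}))    # False - role alone is not enough
print(can_read_blob(False, {"Project": "Blue"}))  # False - tag alone is not enough
```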

✅ Best Practices

  • Principle of Least Privilege: Give users only the bare minimum access they need.
  • Use Groups: Never assign a role to a single user; assign it to a group so you can easily audit it later.
  • Enable MFA: 99.9% of identity attacks are blocked by Multi-Factor Authentication.
  • Use Managed Identities: Avoid using passwords or “Client Secrets” in your code.

AZ – NSG and ASG

Think of NSG and ASG as two sides of the same coin. The NSG is the actual “firewall” that enforces the rules, while the ASG is a “labeling” system that makes those rules easier to manage and understand.


๐Ÿ›ก๏ธ Network Security Group (NSG)

An NSG is a filter for network traffic. It contains a list of security rules that allow or deny traffic based on the “5-tuple” (Source IP, Source Port, Destination IP, Destination Port, and Protocol).

  • Where it lives: You associate it with a Subnet or a Network Interface (NIC).
  • What it does: It acts as a basic firewall for your Virtual Machines (VMs).
  • The Problem: If you have 50 web servers, you’d traditionally have to list all 50 IP addresses in your NSG rules. If you add a 51st server, you have to update the NSG rule. This is tedious and prone to error.

๐Ÿท๏ธ Application Security Group (ASG)

An ASG is not a firewall itself; it is a logical object (a grouping) that you put inside an NSG rule. It allows you to group VMs together based on their function (e.g., “Web-Servers” or “DB-Servers”) regardless of their IP addresses.

  • Where it lives: You assign it directly to a Network Interface (NIC).
  • What it does: It allows you to write “natural language” rules. Instead of saying “Allow IP 10.0.0.4 to 10.0.0.5,” you can say “Allow Web-Servers to talk to DB-Servers.”
  • The Benefit: If you scale up and add 10 more web servers, you just tag them with the “Web-Servers” ASG. The NSG automatically applies the correct rules to them without you needing to change a single IP address in the security policy.

🔄 Key Differences at a Glance

| Feature | Network Security Group (NSG) | Application Security Group (ASG) |
|---|---|---|
| Primary Role | The “Enforcer” (filters traffic). | The “Organizer” (groups VMs). |
| Logic | Based on IP addresses and ports. | Based on application roles/labels. |
| Association | Applied to Subnets or NICs. | Applied only to NICs. |
| Rule Limit | Up to 1,000 rules per NSG. | Used as a source/destination inside NSG rules. |
| Maintenance | High (must update IPs manually). | Low (rules update automatically as VMs are added). |

Better Together: A Real-World Example

Imagine a 3-tier app (Web, App, Database).

  1. You create three ASGs: ASG-Web, ASG-App, and ASG-DB.
  2. You assign each VM to its respective ASG.
  3. In your NSG, you create a rule: Allow Source: ASG-Web to Destination: ASG-App on Port 8080.

Now, it doesn’t matter if your web tier has 1 VM or 100 VMs—the security policy remains exactly the same and stays clean!
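The 3-tier rule above can be sketched as a membership lookup plus a rule match. VM names are hypothetical, and the evaluation is simplified (real NSGs evaluate rules by priority with an implicit deny at the end):

```python
# NSG rule written against ASGs instead of IPs: scaling a tier only
# changes ASG membership, never the rule. Names are hypothetical.
asg_membership = {
    "web-vm-01": "ASG-Web", "web-vm-02": "ASG-Web",
    "app-vm-01": "ASG-App", "db-vm-01": "ASG-DB",
}
rules = [("ASG-Web", "ASG-App", 8080, "Allow")]  # (source ASG, dest ASG, port, action)

def is_allowed(src_vm, dst_vm, port):
    src_asg, dst_asg = asg_membership[src_vm], asg_membership[dst_vm]
    for rule_src, rule_dst, rule_port, action in rules:
        if (rule_src, rule_dst, rule_port) == (src_asg, dst_asg, port):
            return action == "Allow"
    return False  # implicit deny

print(is_allowed("web-vm-02", "app-vm-01", 8080))  # True  - matched by ASG, not IP
print(is_allowed("web-vm-01", "db-vm-01", 8080))   # False - no Web->DB rule exists
```

Adding a 51st web server is just one more asg_membership entry; the rule list never changes.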


DNS in Azure

DNS in Azure is one of the most misunderstood parts of network architecture. Let’s build it up in two stages: first the overall structure of where name resolution happens, then the full query flow so you can see the decision logic in motion.

The DNS landscape in Azure

Every VNet has a built-in resolver at the special address 168.63.129.16. By default this resolves Azure public hostnames and the VNet’s own internal hostnames, but it knows nothing about your private zones or your on-premises DNS. The whole point of a DNS architecture is to extend that default with three building blocks: Private DNS Zones, DNS Private Resolver, and conditional forwarding rules.

Now let’s trace exactly what happens when a VM makes a DNS query, because the decision logic is what makes this architecture tick. Here’s the full picture in prose:


The three building blocks

Private DNS Zones are Azure-managed authoritative zones that exist outside any VNet but get linked to VNets. When a zone is linked, 168.63.129.16 can resolve names in it from that VNet. The most important zones are the privatelink.* zones — one exists for every Azure PaaS service (Blob, SQL, Key Vault, etc.) and they map service hostnames to private endpoint IPs. Without a correct private zone, a private endpoint is useless because DNS still returns the public IP.

Auto-registration is a separate feature: link a zone named e.g. internal.contoso.com to a VNet with auto-registration enabled, and Azure automatically creates A records for every VM in that VNet. Useful for simple intra-VNet name resolution without managing records manually.

DNS Private Resolver is a fully managed, scalable DNS proxy service deployed into a VNet subnet (/28 minimum). It has two endpoint types. The inbound endpoint gets a static private IP — this is what you point all your spoke VNets’ custom DNS settings at, and what you configure on-prem DNS to forward Azure-destined queries to. The outbound endpoint is used by forwarding rulesets to reach external DNS servers (on-prem). Before DNS Private Resolver existed, people ran custom DNS VMs (Windows Server or BIND) in the hub — the resolver replaces that with a managed, zone-redundant service.

Forwarding rulesets are attached to the outbound endpoint of the resolver. They’re just ordered lists of <domain suffix> → <target DNS IP> rules. The catch-all dot (.) rule is critical — it determines where unmatched queries go. Typically that’s 168.63.129.16 so Azure private zones still work for everything else. Rulesets can be associated with multiple VNets, which makes them very powerful in hub-and-spoke: one ruleset attached to the hub propagates to all linked spokes.
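The suffix-matching behaviour can be sketched as follows. This is a conceptual model (longest matching suffix wins, with the dot as catch-all); the on-prem DNS IP is hypothetical, while 168.63.129.16 is Azure's well-known resolver address:

```python
# Sketch of forwarding-ruleset matching: the longest matching domain
# suffix wins, and the "." catch-all handles everything else.
def pick_target(ruleset, qname):
    best = None
    for suffix, target in ruleset.items():
        matches = suffix == "." or qname == suffix or qname.endswith("." + suffix)
        if matches and (best is None or len(suffix) > len(best[0])):
            best = (suffix, target)
    return best[1] if best else None

ruleset = {
    "corp.local": "10.50.0.10",  # on-prem DNS, reachable over VPN/ExpressRoute
    ".": "168.63.129.16",        # unmatched queries -> Azure-provided DNS
}
print(pick_target(ruleset, "fileserver.corp.local"))             # 10.50.0.10
print(pick_target(ruleset, "yourstorage.blob.core.windows.net")) # 168.63.129.16
```

Forgetting the catch-all rule is the classic mistake: corporate names resolve, but everything Azure-side goes dark.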


The hybrid DNS loop

The trickiest part of Azure DNS is making it work symmetrically between cloud and on-premises:

On-premises → Azure: configure a conditional forwarder on your on-prem DNS server for each privatelink.* zone (and any Azure-private zones) pointing to the resolver’s inbound endpoint IP. Traffic flows: on-prem workstation → on-prem DNS → resolver inbound endpoint → 168.63.129.16 → private zone → private endpoint IP.

Azure → on-premises: configure a forwarding ruleset rule for your corporate domains (e.g. corp.local) pointing to your on-prem DNS server IP, reachable over the VPN/ExpressRoute. Traffic flows: Azure VM → resolver → ruleset matches → on-prem DNS → answer returned.


Common pitfalls

The most frequent DNS problem in Azure is a Private Endpoint resolving to its public IP instead of its private IP. This happens when the private zone isn’t linked to the VNet making the query, or when a spoke VNet is using the default 168.63.129.16 directly (bypassing the resolver) and the zone is only linked to the hub. The fix: link private zones to all VNets that need resolution, or ensure all VNets point their DNS to the resolver inbound endpoint.

The second most frequent issue is forgetting to set custom DNS on spoke VNets after creating them. The Azure default (168.63.129.16) works fine for public names but can’t do conditional forwarding. Always explicitly set the DNS server to the resolver inbound IP via the VNet’s DNS servers setting, then restart VMs so they pick up the new DHCP-assigned DNS server.

AZ Hub & Spoke

What is hub-and-spoke?

Hub-and-spoke is the most widely recommended network topology for enterprise Azure environments. The idea is simple: instead of connecting every VNet to every other VNet (which creates an unmanageable mesh of peering links and duplicate security controls), you designate one central VNet as the hub — the place where all shared infrastructure lives — and connect all other VNets as spokes that only peer with the hub.

The spokes never peer with each other directly. If Spoke A needs to talk to Spoke B, the traffic flows through the hub, which means it passes through your centralized firewall or NVA where you can inspect, log, and control it.


What lives in the hub

Now let’s zoom into what actually goes inside the hub VNet and why. The hub is divided into purpose-built subnets, each with a reserved name that Azure recognizes. The GatewaySubnet (name is mandatory and exact) hosts your VPN or ExpressRoute gateway for on-premises connectivity. The AzureFirewallSubnet (also exact) requires at least a /26 and hosts Azure Firewall, which becomes the traffic inspection point for everything flowing between spokes and out to the internet. AzureBastionSubnet hosts Azure Bastion, giving your operations team secure browser-based RDP/SSH to VMs in any spoke without exposing public IPs.

A UDR (User-Defined Route) is the mechanism that makes forced inspection work: every spoke subnet gets a route table with a default route pointing to the Azure Firewall’s private IP. This ensures no traffic can bypass the hub.


Why hub-and-spoke over full mesh?

With full mesh, connecting 5 VNets requires 10 peering links, 10 separate NSG/firewall rule sets to maintain, and no single place to audit traffic. With hub-and-spoke, you have N peering links (one per spoke), a single firewall policy to manage, centralized logging in one place, and a topology that scales linearly as you add spokes.
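The arithmetic behind that comparison, as a quick sketch (counting bidirectional peering relationships, one per VNet pair):

```python
# Full mesh of N VNets needs N*(N-1)/2 peering relationships; hub-and-spoke
# needs one per spoke (N-1 relationships for N VNets including the hub).
def mesh_links(n_vnets):
    return n_vnets * (n_vnets - 1) // 2

def hub_spoke_links(n_vnets):
    return n_vnets - 1  # one peering relationship per spoke

print(mesh_links(5), hub_spoke_links(5))    # 10 vs 4
print(mesh_links(30), hub_spoke_links(30))  # 435 vs 29
```

The gap widens quadratically, which is why mesh topologies collapse under their own weight past a handful of VNets.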


Traffic flows

There are three traffic paths to understand:

Spoke to spoke — traffic from Spoke A travels into the hub, hits the Azure Firewall, which evaluates your network rules, and if permitted forwards it out to Spoke B. Neither spoke knows about the other at the routing level; they only know the hub’s firewall IP as their default gateway.

Spoke to internet — the UDR default route sends internet-bound traffic to the Firewall rather than out directly. The Firewall applies application rules (FQDNs, categories), performs SNAT, and egresses through its own public IP. This gives you a single, auditable egress point for the entire organization.

On-premises to spoke — traffic from your corporate network arrives via VPN or ExpressRoute into the GatewaySubnet, then routes through the Firewall before reaching any spoke. Gateway transit (enabled on the hub side of each peering, “use remote gateways” on the spoke side) lets all spokes share a single gateway.


Key design decisions and best practices

Address space planning is everything. The hub needs enough space for its subnets (the gateway needs at least a /27, the Firewall a /26, Bastion a /26). Spokes should get their own non-overlapping ranges (e.g., a /16 or /24 each). Plan for future growth — historically a peered VNet’s address space couldn’t be resized without tearing down the peering, and although newer peering sync features relax this, re-planning IP space after the fact is always painful.
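Those minimum sizes can be sanity-checked by carving a hypothetical hub /24 with the stdlib ipaddress module. This is a greedy illustration with no error handling, not an allocation tool:

```python
# Carve the reserved hub subnets out of a hypothetical 10.0.0.0/24 hub,
# honouring the minimum sizes above (/27 gateway, /26 firewall, /26 bastion).
from ipaddress import ip_network

def carve(hub_cidr, wanted):
    """wanted: [(subnet_name, prefix_length)]; returns {name: network}."""
    free = [ip_network(hub_cidr)]
    allocated = {}
    for name, plen in sorted(wanted, key=lambda w: w[1]):  # biggest blocks first
        block = free.pop(0)
        pieces = list(block.subnets(new_prefix=plen))
        allocated[name] = pieces[0]
        free = pieces[1:] + free  # return the remainder to the pool
    return allocated

plan = carve("10.0.0.0/24", [
    ("GatewaySubnet", 27),
    ("AzureFirewallSubnet", 26),
    ("AzureBastionSubnet", 26),
])
for name, subnet in sorted(plan.items()):
    print(name, subnet)
```

A /24 comfortably fits all three reserved subnets with room left over; a smaller hub block starts to get tight quickly.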

Use Azure Firewall Policy, not classic Firewall rules. Policy objects can be shared across multiple Firewall instances in different regions, making multi-region hub-and-spoke consistent.

NSGs at every spoke subnet. The Firewall is your perimeter, but NSGs at the subnet level are your last line of defence. They provide micro-segmentation even if a firewall rule is misconfigured.

One hub per region. In a multi-region setup, deploy a hub in each region. Spokes peer to their regional hub. The two hubs can be globally peered to each other, but remember: peering is not transitive, so on-premises routes to remote-region spokes need careful planning (usually handled via BGP and ExpressRoute Global Reach, or Azure Virtual WAN).

Consider Azure Virtual WAN for large scale. If you have dozens of spokes, branches, and multiple regions, Azure Virtual WAN automates hub management, routing, and scaling. It’s hub-and-spoke as a managed service.

Tag everything. Use resource tags (environment, cost-center, spoke-owner) consistently on VNets and peering resources so you can attribute costs and audit ownership as the topology grows.

Azure VNet peering

What is VNet peering?

VNet peering connects two Azure Virtual Networks so that resources in each can communicate with each other using private IP addresses, routing traffic over the Microsoft backbone — never the public internet. From the VM’s perspective, the remote VNet feels like it’s on the same network.

There are two types:

Regional peering connects VNets within the same Azure region. Traffic stays entirely within that region’s backbone, latency is minimal, and data transfer charges are lower than for global peering (ingress and egress are still billed).

Global peering connects VNets across different Azure regions. It uses the same private backbone but crosses the WAN layer between regions. This incurs additional data transfer charges and has a few extra restrictions (covered below).


How it works under the hood

Peering is non-transitive by design. If VNet A peers with VNet B, and VNet B peers with VNet C, VNet A cannot reach VNet C through B automatically. This is intentional — it keeps the blast radius of any misconfiguration small and forces deliberate topology decisions. To allow A–C traffic, you must either peer them directly or use a Network Virtual Appliance (NVA) or Azure Firewall as a transit hub.

Peering is also directional: each side must create its own peering link pointing to the other. Both links must exist and be in “Connected” state before traffic flows.


Key restrictions

Address space: The two VNets being peered must have non-overlapping CIDR ranges. This is the most common cause of failed peerings — plan your IP space carefully before deploying. Historically, you could not resize a VNet’s address space after peering without first removing and re-adding the peering; Azure now supports syncing address-space changes across peerings, but careful up-front planning remains the rule.

No transitivity (without help): As noted above, peering is point-to-point only. Traffic does not flow through an intermediate VNet unless you explicitly route it via an NVA or gateway.

Gateway transit limits (global peering): Gateway transit was originally supported only for regional peering; Microsoft has since extended it to global peering, but verify current support for your gateway type and SKU before depending on it. Cross-region gateway sharing remains a common operational gotcha.

Basic Load Balancer: Over global peering, resources behind an Azure Basic SKU Load Balancer in one VNet are not reachable from the other VNet. A Standard SKU Load Balancer works fine. Microsoft has retired the Basic SKU anyway.

IPv6: Dual-stack (IPv4 + IPv6) peering is supported, but you must configure both address families explicitly.

Subscription and tenant: You can peer VNets across different subscriptions and even across different Microsoft Entra tenants, but this requires explicit authorization on both sides and the right RBAC roles (Network Contributor or a custom role with Microsoft.Network/virtualNetworks/peer/action).


Best practices

Plan your IP addressing first. This is the cardinal rule. Overlapping CIDRs cannot be peered, and changing address space after the fact is painful. Use RFC 1918 space systematically — e.g. one /16 per major environment, subdivided by region and VNet purpose.

Use hub-and-spoke. Rather than full-mesh peering (N×(N−1) peering links between N VNets), peer all spoke VNets to a central hub that hosts shared services: Azure Firewall, DNS resolvers, VPN/ExpressRoute gateways. This centralises traffic inspection and keeps the number of peering links manageable. Azure Virtual WAN automates much of this for large-scale deployments.

Enable gateway transit deliberately. If spoke VNets need to reach on-premises networks via a hub gateway, enable “Allow gateway transit” on the hub side and “Use remote gateways” on the spoke side. Historically this worked only for regional peering; verify current support if your spokes are globally peered to the hub.

Monitor with Network Watcher. Use Connection Monitor and VNet Flow Logs to validate that peered traffic is flowing as expected and to detect routing anomalies early.

Tag and document peerings. Peering links don’t carry tags natively, but you should document each peering in your infrastructure-as-code (Bicep/Terraform) with clear naming conventions — e.g. peer-hubeastus-to-spokeeastus-app1 — so intent is obvious six months later.

Use NSGs on subnets, not VNets. Peering opens the network path, but you still control traffic with Network Security Groups at the subnet level. Don’t assume peering = trusted; apply least-privilege NSG rules between peered VNets just as you would for any other traffic.

Prefer Infrastructure-as-Code. Peering configuration done manually in the portal is error-prone (easy to create only one side). Terraform’s azurerm_virtual_network_peering or Bicep’s Microsoft.Network/virtualNetworks/virtualNetworkPeerings resource create both sides atomically.


Regional vs global at a glance

| | Regional | Global |
|---|---|---|
| Latency | Lower (same-region backbone) | Higher (cross-region WAN) |
| Data transfer cost | Lower / free in some cases | Additional per-GB charges |
| Gateway transit | ✓ Supported | Supported with caveats (originally regional-only) |
| Basic LB reachability | ✓ Supported | ✗ Not supported |
| Typical use case | App tiers, dev/prod separation | Disaster recovery, multi-region apps |


Managing cross-spoke traffic

Managing cross-spoke traffic—often called East-West traffic—is a critical design challenge in large-scale Azure environments. As of 2026, the shift toward Zero Trust and automated routing has made traditional manual User-Defined Routes (UDRs) less sustainable for large enterprises.

Depending on your scale and security requirements, here are the preferred strategies for managing this traffic.


1. The Modern Enterprise Standard: Azure Virtual WAN (vWAN)

For environments with dozens or hundreds of spokes, Azure Virtual WAN with a Secured Hub is the gold standard. It replaces the manual effort of managing peering and route tables.

  • How it works: You deploy a Virtual Hub and connect spokes to it. By enabling Routing Intent, you tell Azure to automatically attract all East-West traffic to an Azure Firewall (or supported third-party NVA) within the hub.
  • Why it’s preferred:
    • Auto-propagation: No need to manually create UDRs in every spoke to point to a central firewall; the hub manages the route injection.
    • Transitive Routing: vWAN provides “any-to-any” connectivity by default, solving the non-transitive nature of standard VNet peering.
    • Scale: It is designed to handle thousands of connections and massive throughput across multiple regions.

2. The Granular Control Approach: Hub-and-Spoke with Azure Virtual Network Manager (AVNM)

If you prefer a traditional Hub-and-Spoke model but want to avoid the “UDR Hell” of manual updates, Azure Virtual Network Manager (AVNM) is the strategic choice.

  • Strategy: Use AVNM to define Network Groups (e.g., “Production-Spokes”). AVNM can then automatically:
    • Create and manage VNet peerings.
    • Deploy Admin Rules (security) and Routing Rules (UDRs) across all VNets in a group.
  • Best For: Environments that require high customization or the use of specific third-party NVAs that may not yet be fully integrated into the vWAN “Managed” model.

3. The “Service-First” Strategy: Azure Private Link

Not all cross-spoke traffic needs to be “network-level” (Layer 3). For communication between applications (e.g., a web app in Spoke A talking to a database in Spoke B), Private Link is often superior to VNet peering.

  • Strategy: Instead of peering the entire VNets, expose the specific service in Spoke B via a Private Link Service. Spoke A then consumes it via a Private Endpoint.
  • Why it’s preferred:
    • Isolation: It eliminates the risk of lateral movement across the network because the VNets are not actually “connected.”
    • IP Overlap: It allows spokes with overlapping IP ranges to communicate, which is impossible with standard peering.
    • Security: Traffic stays on the Microsoft backbone and is mapped to a specific resource, reducing the attack surface.

4. Architectural Comparison: At-a-Glance

| Feature | Standard Peering + UDR | Virtual WAN (Secured Hub) | Private Link |
|---|---|---|---|
| Complexity | High (manual) | Low (automated) | Medium (per-service) |
| Transitivity | None (requires NVA/UDR) | Native | N/A (service-based) |
| Scale | Hard to maintain | Excellent | Excellent |
| Security | NSG + Firewall NVA | Integrated Firewall | Least privilege (resource-level) |

5. Critical Best Practice: “Zero Trust” at the Spoke

Regardless of the routing strategy, large environments should implement Micro-segmentation within the spokes.

  1. NSGs and ASGs: Use Network Security Groups (NSGs) combined with Application Security Groups (ASGs) to control traffic between subnets within the same spoke.
  2. Explicit Outbound (2026 Change): Note that as of March 31, 2026, Azure has retired “Default Outbound Access.” You must now explicitly define outbound paths (NAT Gateway or Firewall) for all spokes, which prevents accidental “leaking” of traffic to the internet while managing your internal East-West flows.
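A simple audit of that second point can be sketched in Python. The spoke data shape here is invented for illustration; in practice you'd pull this from the Azure SDK or exported IaC state:

```python
# Flag spokes that would have relied on retired default outbound access,
# i.e. spokes with neither a NAT Gateway nor a 0.0.0.0/0 route to a
# firewall. Spoke names and the data shape are hypothetical.
def missing_explicit_outbound(spokes):
    return [
        name for name, cfg in spokes.items()
        if not cfg.get("nat_gateway")
        and "0.0.0.0/0" not in cfg.get("udr_routes", {})
    ]

spokes = {
    "spoke-app1": {"udr_routes": {"0.0.0.0/0": "10.0.0.4"}},  # firewall next hop
    "spoke-app2": {"nat_gateway": True},
    "spoke-legacy": {},  # no explicit egress path - will break
}
print(missing_explicit_outbound(spokes))  # ['spoke-legacy']
```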

Designing a large-scale Azure environment from scratch in 2026 requires moving away from “bespoke” networking toward a Productized Infrastructure model.

The most robust strategy follows the Azure Landing Zone (ALZ) conceptual architecture, utilizing Azure Virtual WAN (vWAN) as the connectivity backbone. This setup minimizes manual routing while providing maximum security.


1. The Foundation: Management Group Hierarchy

Before touching a VNet, you must organize your governance. Use Management Groups to enforce “Guardrails” (Azure Policy) that automatically configure networking for every new subscription.

  • Root Management Group
    • Platform MG: Contains the Connectivity, Identity, and Management subscriptions.
    • Landing Zones MG:
      • Corp MG: For internal workloads (connected to the Hub).
      • Online MG: For internet-facing workloads (isolated or DMZ).
    • Sandbox MG: For disconnected R&D.

2. The Network Backbone: Virtual WAN with Routing Intent

In a greenfield 2026 design, Virtual WAN (Standard SKU) is the preferred “Hub.” It acts as a managed routing engine.

The “Routing Intent” Strategy

Traditional hubs require you to manually manage Route Tables (UDRs) in every spoke. With Routing Intent enabled in your Virtual Hub:

  1. Centralized Inspection: You define that “Private Traffic” (East-West) must go to the Azure Firewall in the Hub.
  2. Auto-Propagation: Azure automatically “attracts” the traffic from the spokes to the Firewall. You no longer need to write a 0.0.0.0/0 or 10.0.0.0/8 UDR in every spoke.
  3. Inter-Hub Routing: If you expand to another region (e.g., East US to West Europe), vWAN handles the inter-region routing natively without complex chains of global peerings.

3. The Security Strategy: Micro-segmentation

Don’t rely solely on the central Firewall; it’s too “coarse” for large environments. Use a layered approach:

  • North-South (Internet): Managed by Azure Firewall Premium in the vWAN Hub (IDPS, TLS Inspection).
  • East-West (Cross-Spoke): Managed by Routing Intent + Azure Firewall.
  • Intra-Spoke (Subnet-to-Subnet): Use Network Security Groups (NSGs) and Application Security Groups (ASGs).
    • Tip: Use Azure Virtual Network Manager (AVNM) to deploy “Security Admin Rules” that stay at the top of the NSG stack across all spokes, preventing developers from accidentally opening SSH/RDP to the world.
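
A hedged sketch of such a Security Admin Rule in Terraform (the rule collection reference, priority, and prefixes are illustrative, and AVNM must already be scoped to your Management Groups):

Terraform

# Illustrative: deny inbound SSH/RDP from anywhere, evaluated before NSGs
resource "azurerm_network_manager_admin_rule" "deny_mgmt_ports" {
  name                     = "deny-ssh-rdp-inbound"
  admin_rule_collection_id = azurerm_network_manager_admin_rule_collection.security.id
  action                   = "Deny"
  direction                = "Inbound"
  priority                 = 1
  protocol                 = "Tcp"
  destination_port_ranges  = ["22", "3389"]

  source {
    address_prefix_type = "IPPrefix"
    address_prefix      = "0.0.0.0/0"
  }

  destination {
    address_prefix_type = "IPPrefix"
    address_prefix      = "0.0.0.0/0"
  }
}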

4. The “Subscription Vending” Machine

In 2026, you shouldn’t “build” a spoke; you should “vend” it. When a team needs a new environment:

  1. A CI/CD pipeline (Terraform/Bicep) creates a new Subscription.
  2. Azure Policy automatically moves it to the Corp Management Group.
  3. Policy triggers the creation of a Spoke VNet and Peerings to the vWAN Hub.
  4. Routing Intent automatically secures the traffic without the team ever seeing a Route Table.
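
As a rough sketch, steps 1-2 can be expressed in Terraform. The billing scope variable and management group reference below are assumptions; subscription creation also requires an EA/MCA billing agreement.

Terraform

# Illustrative subscription vending: create the subscription, then
# associate it with the Corp management group.
resource "azurerm_subscription" "corp_app" {
  subscription_name = "sub-corp-app-001"
  alias             = "sub-corp-app-001"
  billing_scope_id  = var.billing_scope_id # assumed EA/MCA billing scope
}

resource "azurerm_management_group_subscription_association" "corp" {
  management_group_id = azurerm_management_group.corp.id
  subscription_id     = "/subscriptions/${azurerm_subscription.corp_app.subscription_id}"
}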

5. Architectural Blueprint (Summary)

| Component       | Recommendation                                                  |
|-----------------|-----------------------------------------------------------------|
| Topology        | Azure Virtual WAN (vWAN) Hub-and-Spoke.                         |
| Routing Control | Routing Intent (Standard Hub) to eliminate manual UDRs.         |
| Governance      | Azure Virtual Network Manager (AVNM) for global security rules. |
| Internal Comms  | Private Link for sensitive PaaS-to-PaaS traffic (SQL, Storage). |
| Automation      | Infrastructure as Code (Bicep/Terraform) via ALZ Accelerators.  |

Implementation Pro-Tip

Since you are starting from scratch, ensure you implement Azure DDoS Protection on the Hub IP addresses and enable Diagnostic Settings to a central Log Analytics Workspace immediately. In 2026, observability is not an “add-on”; it’s a requirement for troubleshooting the automated routing of vWAN.
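
For example, firewall logs could be wired to the central workspace like this (a sketch; both resource references are illustrative):

Terraform

resource "azurerm_monitor_diagnostic_setting" "hub_firewall" {
  name                       = "diag-hub-firewall"
  target_resource_id         = azurerm_firewall.hub.id # assumed hub firewall
  log_analytics_workspace_id = azurerm_log_analytics_workspace.central.id

  # Stream every log category to the central workspace
  enabled_log {
    category_group = "allLogs"
  }
}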

Since you’re using Terraform, you have the advantage of leveraging the Azure Verified Modules (AVM) and the ALZ (Azure Landing Zones) Terraform module. These are the industry standard for codifying the “Hub-and-Spoke” architecture with Virtual WAN.

Here is the architectural design and the strategic Terraform implementation path for your greenfield environment.


1. The Terraform-Centric Architecture

The goal is to treat your network as a “platform” that provides services to “application” spokes.

Core Components

  • Virtual WAN Module: Deploys the azurerm_virtual_wan and azurerm_virtual_hub.
  • Secured Hub: Deploy azurerm_firewall within the hub.
  • Routing Intent: Configures azurerm_virtual_hub_routing_intent to point all 0.0.0.0/0 (Internet) and private traffic (Internal) to the firewall.
  • Spoke Vending: A reusable module that creates a VNet, subnets, and the azurerm_virtual_hub_connection.

2. Recommended Terraform Structure

For a large environment, do not put everything in one state file. Use a layered approach with remote state lookups or specialized providers.

Layer 1: Foundation (Identity & Governance)

  • Deploys Management Groups and Subscription aliases.
  • Sets up the Terraform Backend (Azure Storage Account with State Locking).
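
A typical backend block for this layer might look like the following; the names are placeholders, and use_azuread_auth avoids storing storage account keys.

Terraform

terraform {
  backend "azurerm" {
    resource_group_name  = "rg-tfstate"          # illustrative names
    storage_account_name = "sttfstateprod001"
    container_name       = "tfstate"
    key                  = "foundation.tfstate"
    use_azuread_auth     = true # Entra ID auth instead of access keys
  }
}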

Layer 2: Connectivity (The “Hub”)

  • Deploys the vWAN, Hubs, Firewalls, and VPN/ExpressRoute Gateways.
  • Crucial Logic: Define your routing_intent here. This ensures that the moment a spoke connects, it is governed by the central firewall.

Layer 3: Landing Zones (The “Spokes”)

  • Use a Terraform for_each loop or a “Spoke Factory” pattern.
  • Each spoke is its own module instance, limiting the blast radius if one VNet deployment fails.
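
A minimal sketch of that pattern, assuming a local ./modules/spoke module and illustrative spoke names and CIDRs:

Terraform

locals {
  # Hypothetical spoke inventory; extend the map to stamp out more spokes
  spokes = {
    "app"  = "10.1.0.0/16"
    "data" = "10.2.0.0/16"
  }
}

module "spoke" {
  source   = "./modules/spoke" # assumed local spoke module
  for_each = local.spokes

  name          = "vnet-spoke-${each.key}"
  address_space = [each.value]
  hub_id        = data.terraform_remote_state.connectivity.outputs.hub_id
}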

3. Handling “East-West” Traffic in Code

With vWAN and Routing Intent, your Terraform code for a spoke becomes incredibly simple because you omit the azurerm_route_table.

Terraform

# Example of a Spoke Connection in Terraform
resource "azurerm_virtual_hub_connection" "spoke_a" {
  name                      = "conn-spoke-prod-001"
  virtual_hub_id            = data.terraform_remote_state.connectivity.outputs.hub_id
  remote_virtual_network_id = azurerm_virtual_network.spoke_a.id

  # Routing Intent at the Hub level handles the traffic redirection,
  # so no complex 'routing' block is needed here for East-West inspection.
}

4. Addressing Modern Constraints (2026)

  • Provider Constraints: Ensure you are using azurerm version 4.x or higher, as many vWAN Routing Intent features were stabilized in late 2024/2025.
  • Orchestration: Use Terraform Cloud or GitHub Actions/Azure DevOps with “OIDC” (Workload Identity) for authentication. Avoid using static Service Principal secrets.
  • Policy as Code: Use the terraform-azurerm-caf-enterprise-scale module (often called the ALZ module) to deploy Azure Policies that deny the creation of VNets that aren’t peered to the Hub.
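
An illustrative provider block for OIDC authentication:

Terraform

provider "azurerm" {
  features {}

  # Federated workload identity: no client secret is stored anywhere.
  # client_id, tenant_id, and subscription_id are supplied via the
  # ARM_* environment variables injected by the pipeline.
  use_oidc = true
}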

5. Summary of Design Benefits

  1. Zero UDR Maintenance: Routing Intent removes the need to calculate and update CIDR blocks in Route Tables every time a new spoke is added.
  2. Scalability: Terraform can stamp out 100 spokes in a single plan/apply cycle.
  3. Security by Default: All cross-spoke traffic is forced through the Firewall IDPS via the Hub connection logic.

Would you like to see a more detailed code snippet for the vWAN Routing Intent configuration, or should we look at how to structure the Spoke Vending module?

Azure DNS Private Resolver

The Problem It Solves

Before DNS Private Resolver existed, if you wanted to resolve Azure Private DNS Zone records from on-premises, or forward on-premises domain queries from Azure, you had to run a custom DNS forwarder VM (e.g., Windows DNS Server or BIND on a Linux VM). This meant managing, patching, scaling, and ensuring high availability of that VM yourself: a maintenance burden and a potential single point of failure.

Azure DNS Private Resolver eliminates that entirely.


What It Is

Azure DNS Private Resolver is a fully managed, cloud-native DNS service deployed inside your VNet that acts as a bridge between:

  • Azure (Private DNS Zones, VNet-internal resolution)
  • On-premises networks (your corporate DNS servers)

It handles DNS queries coming in from on-premises and DNS queries going out from Azure, without any VMs to manage.


How It Works: The Two Endpoints

The resolver has two distinct components:

1. Inbound Endpoint

  • Gets assigned a private IP address inside your VNet
  • On-premises DNS servers can forward queries to this IP over ExpressRoute or VPN
  • Allows on-premises clients to resolve Azure Private DNS Zone records, something that was previously impossible without a forwarder VM
  • Example use case: on-premises user needs to resolve mystorageaccount.privatelink.blob.core.windows.net to its private IP

2. Outbound Endpoint

  • Used with DNS Forwarding Rulesets
  • Allows Azure VMs to forward specific domain queries to external DNS servers (e.g., on-premises DNS)
  • Example use case: Azure VM needs to resolve server01.corp.contoso.local which only exists on-premises

DNS Forwarding Rulesets

A Forwarding Ruleset is a set of rules attached to the Outbound Endpoint that says:

| Domain               | Forward To             |
|----------------------|------------------------|
| corp.contoso.local   | 10.0.0.5 (on-prem DNS) |
| internal.company.com | 10.0.0.6 (on-prem DNS) |
| . (everything else)  | Azure default resolver |

Rulesets are associated with VNets, so multiple Spokes can share the same ruleset without duplicating configuration.


How It Fits Into Hub-and-Spoke

In an enterprise Hub-and-Spoke architecture, DNS Private Resolver lives in the Hub VNet and serves all Spokes centrally:

On-Premises DNS
      │
      │ (conditional forward)
      ▼
DNS Private Resolver ──► Inbound Endpoint (resolves Azure Private DNS Zones)
      │
      │ (outbound ruleset)
      ▼
On-Premises DNS (for corp.contoso.local queries from Azure VMs)

Spoke VNets ──► point DNS setting to Private Resolver inbound IP

All Spoke VNets are configured to use the resolver’s inbound endpoint IP as their DNS server, giving every workload consistent, centralized DNS resolution.
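
In Terraform terms, that per-spoke DNS setting is a one-line change on the VNet. A sketch, assuming the resolver's inbound endpoint is managed in the same configuration; names are illustrative:

Terraform

resource "azurerm_virtual_network" "spoke_app" {
  name                = "vnet-spoke-app"
  location            = azurerm_resource_group.spoke.location
  resource_group_name = azurerm_resource_group.spoke.name
  address_space       = ["10.1.0.0/16"]

  # Point the spoke at the resolver's inbound endpoint
  # instead of the Azure-provided DNS
  dns_servers = [
    azurerm_private_dns_resolver_inbound_endpoint.inbound.ip_configurations[0].private_ip_address,
  ]
}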


Key Benefits Over a Forwarder VM

|              | DNS Forwarder VM                      | DNS Private Resolver         |
|--------------|---------------------------------------|------------------------------|
| Management   | You manage patching, reboots, scaling | Fully managed by Microsoft   |
| Availability | You build HA (2 VMs, load balancer)   | Built-in high availability   |
| Scalability  | Manual VM resizing                    | Scales automatically         |
| Cost         | VM + disk + load balancer costs       | Pay per endpoint per hour    |
| Security     | VM attack surface                     | No VM, no management ports   |
| Integration  | Manual config to reach Azure DNS      | Native Azure DNS integration |

A Real-World DNS Flow Example

Scenario: On-premises user wants to access a Storage Account via its private endpoint.

  1. User’s machine queries on-premises DNS for mystorageaccount.privatelink.blob.core.windows.net
  2. On-premises DNS has a conditional forwarder: send privatelink.blob.core.windows.net queries to the DNS Private Resolver inbound endpoint IP
  3. DNS Private Resolver receives the query
  4. It checks the Azure Private DNS Zone linked to the Hub VNet
  5. Finds the A record โ†’ returns the private endpoint IP (e.g., 10.1.2.5)
  6. Traffic flows from on-premises over ExpressRoute directly to the private endpoint, never touching the public internet

In One Sentence

Azure DNS Private Resolver is a managed service that sits inside your VNet and acts as the intelligent DNS bridge between your on-premises network and Azure, handling both inbound queries from on-premises and outbound forwarding from Azure, without any VMs to maintain.


Here is how to read the diagram step by step:

① Query starts: An on-premises client needs to resolve a private endpoint hostname (e.g. mystorageaccount.privatelink.blob.core.windows.net). It asks the on-premises DNS server.

② Conditional forwarding: The on-premises DNS server has a conditional forwarder rule for privatelink.* domains. It forwards the query through the ExpressRoute/VPN tunnel to the DNS Private Resolver’s inbound endpoint IP inside the Hub VNet.

③ Zone lookup: The DNS Private Resolver receives the query and checks the linked Private DNS Zone inside Azure, which holds the A record for the private endpoint.

④ Private IP returned: The zone returns the private endpoint’s internal IP address (e.g. 10.1.2.5), never the public IP, back through the chain to the on-premises client.

⑤ Traffic flows privately: The client now connects directly to that private IP over ExpressRoute, completely bypassing the public internet. Traffic never touches a public endpoint.

Spoke VNets inside Azure follow the same pattern: their DNS settings point to the resolver, and all resolution flows through the Hub centrally.

Great question: this is one of the trickiest concepts in hybrid Azure networking. Here is the full picture:


The Core Challenge

By default, Azure Private DNS Zones are only visible inside Azure VNets that are linked to them. An on-premises client has no way to query them directly, since it sits outside Azure entirely. So without special architecture, an on-premises client resolving mystorageaccount.privatelink.blob.core.windows.net would get the public IP back, defeating the purpose of a private endpoint.

The solution is to build a DNS resolution chain that bridges on-premises and Azure.


The Resolution Chain โ€” Step by Step

Step 1 โ€” Client queries its local DNS

The on-premises client (laptop, server, application) sends a DNS query to its configured DNS server, just as it always would. Nothing special happens at the client level โ€” it has no knowledge of Azure.

Step 2 โ€” On-premises DNS checks its conditional forwarder

The on-premises DNS server (Windows DNS, BIND, etc.) has a conditional forwarder rule configured by your network team that says:

“Any query for privatelink.blob.core.windows.net: don’t try to resolve it yourself. Forward it to this IP address instead.”

That IP address is the inbound endpoint of Azure DNS Private Resolver, which is a private IP routable over ExpressRoute or VPN (e.g. 10.0.1.4).

Step 3 โ€” Query travels over ExpressRoute or VPN

The forwarded query travels from on-premises, through the private tunnel, and arrives at the DNS Private Resolver’s inbound endpoint inside the Hub VNet. This is just a UDP packet on port 53; it looks like any other DNS query.

Step 4 โ€” DNS Private Resolver checks the Private DNS Zone

The resolver receives the query and uses Azure’s built-in DNS (168.63.129.16) to look up the answer. Because the Hub VNet is linked to the Private DNS Zone for privatelink.blob.core.windows.net, it can see the A record inside that zone, which contains the private endpoint’s internal IP (e.g. 10.1.2.5).

Step 5 โ€” Private IP is returned all the way back

The resolver returns 10.1.2.5 back through the tunnel to the on-premises DNS server, which passes it back to the client. The client now has the private IP, not the public one.

Step 6 โ€” Traffic flows privately

The client connects to 10.1.2.5 directly over ExpressRoute or VPN. The traffic never touches the public internet; it flows entirely over your private network into Azure.


What Has to Be in Place

For this to work, several things must be correctly configured:

On the Azure side:

  • Azure DNS Private Resolver deployed in the Hub VNet with an inbound endpoint assigned a private IP
  • The relevant Private DNS Zone (e.g. privatelink.blob.core.windows.net) linked to the Hub VNet
  • An A record in that zone pointing to the private endpoint’s IP
  • The inbound endpoint’s IP must be reachable from on-premises over ExpressRoute or VPN (NSGs and routing must allow UDP 53)

On the on-premises side:

  • A conditional forwarder on the on-premises DNS server for each privatelink.* domain pointing to the resolver’s inbound endpoint IP
  • Note: you need a separate conditional forwarder per private link zone (blob, sql, vault, etc.); there is no wildcard forwarder for all privatelink.* zones in most DNS servers
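
On the Azure side, the zone and its Hub link might be declared like this (a sketch; the A record itself is usually created automatically by the private endpoint's private_dns_zone_group rather than by hand):

Terraform

resource "azurerm_private_dns_zone" "blob" {
  name                = "privatelink.blob.core.windows.net"
  resource_group_name = azurerm_resource_group.hub.name
}

# Link the zone to the Hub VNet so the resolver can see its records
resource "azurerm_private_dns_zone_virtual_network_link" "hub" {
  name                  = "link-hub"
  resource_group_name   = azurerm_resource_group.hub.name
  private_dns_zone_name = azurerm_private_dns_zone.blob.name
  virtual_network_id    = azurerm_virtual_network.hub.id
}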

The Most Common Misconfiguration

The single most frequent mistake is when Azure VMs also use a custom DNS server (e.g. a forwarder VM), and that custom server does not forward privatelink.* queries to Azure’s resolver at 168.63.129.16.

The result: the custom DNS server tries to resolve the private endpoint hostname itself, fails to find it, and either returns the public IP or returns nothing. The fix is always to ensure your custom DNS server has a forwarder rule sending privatelink.* queries to 168.63.129.16 before any other resolution is attempted.


Why You Can’t Just Use Azure’s Built-in DNS from On-Premises

You might wonder: why not just point on-premises clients directly at Azure’s built-in DNS? The answer is that Azure’s built-in DNS (168.63.129.16) is a virtual IP, only reachable from inside an Azure VNet. It is not routable from on-premises at all. That is precisely why the DNS Private Resolver inbound endpoint exists: it gives you a routable private IP that acts as the on-ramp into Azure’s DNS infrastructure.


In short: the on-premises client never knows any of this is happening. It asks its local DNS, the local DNS quietly forwards to Azure via the tunnel, Azure resolves using its private zone, and the private IP comes back. The entire chain is transparent to the end user.

Would you like a diagram showing this flow for a specific service like Storage or SQL, or would you like to go deeper on the conditional forwarder setup?

DNS resolution patterns across hybrid environments (Azure)

Managing DNS in a hybrid environment (Azure + On-premises) can feel like a high-stakes game of “telephone.” As of 2026, the industry standard has moved away from the old “DNS Forwarder VMs” and settled on the Azure DNS Private Resolver.

Here are the primary resolution patterns you should know to keep your traffic flowing smoothly over VPN or ExpressRoute.


1. The Modern Hub-Spoke Pattern (Azure DNS Private Resolver)

This is the recommended architecture. It uses a managed service instead of VMs, reducing overhead and providing built-in high availability.

How it Works:

  • Azure to On-Prem: You create an Outbound Endpoint in your Hub VNet and a Forwarding Ruleset. You link this ruleset to your Spoke VNets. When an Azure VM tries to resolve internal.corp.com, Azure DNS sees the rule and sends the query to your on-premises DNS servers.
  • On-Prem to Azure: You create an Inbound Endpoint (a static IP in your VNet). On your local Windows/Linux DNS servers, you set up a Conditional Forwarder for Azure zones (like privatelink.blob.core.windows.net) pointing to that Inbound Endpoint IP.

2. The “Private Link” Pattern (Split-Brain Avoidance)

One of the biggest “gotchas” in hybrid setups is resolving Azure Private Endpoints. If you aren’t careful, your on-premises machine might resolve the public IP of a storage account instead of the private one.

  • The Pattern: Always forward the public service suffix (e.g., blob.core.windows.net) to the Azure Inbound Endpoint, not just the privatelink version.
  • Why: Azure DNS is “smart.” If you query the public name from an authorized VNet, it automatically checks for a matching Private DNS Zone and returns the private IP. If you only forward the privatelink zone, local developers have to change their connection strings, which is a massive headache.

3. Legacy DNS Forwarder Pattern (IaaS VMs)

While largely replaced by the Private Resolver, some organizations still use Domain Controllers or BIND servers sitting in a Hub VNet.

| Feature      | VM-based Forwarders                        | Azure DNS Private Resolver          |
|--------------|--------------------------------------------|-------------------------------------|
| Management   | You patch, scale, and back up.             | Fully managed by Microsoft.         |
| Availability | Requires Load Balancers/Availability Sets. | Built-in 99.9%-99.99% SLA.          |
| Cost         | High (Compute + Licenses + Management).    | Consumption-based (often cheaper).  |
| Complexity   | High (Custom scripts for sync).            | Low (Native ARM/Terraform support). |

4. Key Configuration Rules for 2026

  • The 168.63.129.16 Rule: This is the “Magic IP” for Azure DNS. You cannot query this IP directly from on-premises. You must use an Inbound Endpoint as a bridge.
  • Subnet Delegation: Remember that the subnets used for Private Resolver endpoints must be delegated specifically to Microsoft.Network/dnsResolvers. You can’t put VMs or other resources in those subnets.
  • Avoid Recursive Loops: Never point an Azure Outbound Forwarder to an on-premises server that is also configured to forward those same queries back to Azure. This creates a “DNS death loop” that will drop your resolution performance to zero.

Pro-Tip: If you are using Azure Virtual WAN, the DNS Private Resolver can be integrated into the Hub, allowing all connected spokes and branches to share the same resolution logic without redundant configurations.

Starting a fresh greenfield deployment?

For a greenfield deployment in 2026, you have the advantage of skipping the “technical debt” of legacy VM forwarders. The goal is a Hub-and-Spoke architecture using the Azure DNS Private Resolver.

This setup ensures that your on-premises office and your Azure cloud act as a single, cohesive network for naming.


1. The Essential Architecture

In a greenfield setup, you should centralize the resolver in your Hub VNet.

  • Inbound Endpoint: Provides a static IP address in your Hub VNet. Your on-premises DNS servers (Windows/BIND) will use this as a Conditional Forwarder.
  • Outbound Endpoint: A dedicated egress point that Azure DNS uses to send queries out to your on-premises DNS.
  • Forwarding Ruleset: A logic engine where you define: “If a query is for corp.local, send it to On-Prem IP 10.50.0.4.”

2. Step-by-Step Implementation Strategy

Step 1: Subnet Design (Non-Negotiable)

The Private Resolver requires two dedicated subnets in your Hub VNet. They cannot contain any other resources (no VMs, no Gateways).

  • Subnet 1 (Inbound): Min /28. Delegate to Microsoft.Network/dnsResolvers.
  • Subnet 2 (Outbound): Min /28. Delegate to Microsoft.Network/dnsResolvers.

Step 2: The “Private Link” Rule of Thumb

To avoid the common “2026 DNS Trap” where resolution fails for Private Endpoints, do not just forward privatelink zones.

  • On-Prem Config: Create conditional forwarders for the public suffixes (e.g., blob.core.windows.net, database.windows.net) pointing to your Azure Inbound Endpoint IP.
  • Why? This allows the Azure Resolver to handle the logic. If a Private Endpoint exists, it returns the private IP. If not, it returns the public IP. This keeps your application connection strings identical regardless of where the app sits.

Step 3: Global Reach (Spoke VNets)

You don’t need a resolver in every spoke.

  • Link your Forwarding Ruleset to every Spoke VNet.
  • This instantly gives all spokes the ability to resolve on-premises names without changing their DNS server settings from “Azure Provided.”

3. Financial Blueprint (2026 Pricing)

Budgeting for this is straightforward but requires awareness of the “Base Fee.”

| Component          | Estimated Cost (Monthly)   |
|--------------------|----------------------------|
| Inbound Endpoint   | ~$180                      |
| Outbound Endpoint  | ~$180                      |
| Forwarding Ruleset | ~$2.50                     |
| Query Processing   | ~$0.60 per million queries |
| Total Base         | ~$362.50 / month           |

4. Greenfield “Gotchas” to Avoid

  • DNS Loops: Never link a Forwarding Ruleset to the same VNet where the Inbound Endpoint lives if that ruleset contains a rule pointing back to that Inbound Endpoint. You’ll create a recursive loop that crashes resolution.
  • The 168.63.129.16 Myth: In 2026, some older documentation still suggests querying the Azure WireServer IP directly from on-prem. This will not work. Traffic to this IP is only routable within the Azure backbone.
  • Subnet Sizing: While /28 is the minimum, if you plan on a massive global architecture with dozens of endpoints, consider a /27 to avoid future re-addressing headaches.

deploy this via Terraform

Deploying a Private DNS Resolver via Terraform requires a specific sequence of resources. In a greenfield scenario, you’ll likely want to use the Azure Verified Modules (AVM) or the standard azurerm provider resources.

Here is the “cleanest” implementation using standard azurerm resources for a Hub-and-Spoke architecture.


1. Subnet Delegation (The Foundation)

You cannot use standard subnets. You must delegate them to Microsoft.Network/dnsResolvers.

Terraform

# Inbound Subnet
resource "azurerm_subnet" "dns_inbound" {
  name                 = "snet-dns-inbound"
  resource_group_name  = azurerm_resource_group.hub.name
  virtual_network_name = azurerm_virtual_network.hub.name
  address_prefixes     = ["10.0.10.0/28"]

  delegation {
    name = "delegation"
    service_delegation {
      name    = "Microsoft.Network/dnsResolvers"
      actions = ["Microsoft.Network/virtualNetworks/subnets/join/action"]
    }
  }
}

# Outbound Subnet
resource "azurerm_subnet" "dns_outbound" {
  name                 = "snet-dns-outbound"
  resource_group_name  = azurerm_resource_group.hub.name
  virtual_network_name = azurerm_virtual_network.hub.name
  address_prefixes     = ["10.0.10.16/28"]

  delegation {
    name = "delegation"
    service_delegation {
      name    = "Microsoft.Network/dnsResolvers"
      actions = ["Microsoft.Network/virtualNetworks/subnets/join/action"]
    }
  }
}

2. The Resolver & Endpoints

The Resolver itself is just a container; the endpoints do the heavy lifting.

Terraform

resource "azurerm_private_dns_resolver" "main" {
  name                = "res-hub-dns"
  resource_group_name = azurerm_resource_group.hub.name
  location            = azurerm_resource_group.hub.location
  virtual_network_id  = azurerm_virtual_network.hub.id
}

resource "azurerm_private_dns_resolver_inbound_endpoint" "inbound" {
  name                    = "inbound-endpoint"
  private_dns_resolver_id = azurerm_private_dns_resolver.main.id
  location                = azurerm_private_dns_resolver.main.location

  ip_configurations {
    subnet_id                    = azurerm_subnet.dns_inbound.id
    private_ip_allocation_method = "Dynamic" # Azure will assign an IP from the subnet
  }
}

resource "azurerm_private_dns_resolver_outbound_endpoint" "outbound" {
  name                    = "outbound-endpoint"
  private_dns_resolver_id = azurerm_private_dns_resolver.main.id
  location                = azurerm_private_dns_resolver.main.location
  subnet_id               = azurerm_subnet.dns_outbound.id
}

3. Forwarding Rulesets (The “Logic”)

This is where you tell Azure to send traffic to your on-premises DNS servers.

Terraform

resource "azurerm_private_dns_resolver_dns_forwarding_ruleset" "onprem" {
  name                                       = "ruleset-onprem"
  resource_group_name                        = azurerm_resource_group.hub.name
  location                                   = azurerm_resource_group.hub.location
  private_dns_resolver_outbound_endpoint_ids = [azurerm_private_dns_resolver_outbound_endpoint.outbound.id]
}

resource "azurerm_private_dns_resolver_forwarding_rule" "corp_local" {
  name                      = "rule-corp-local"
  dns_forwarding_ruleset_id = azurerm_private_dns_resolver_dns_forwarding_ruleset.onprem.id
  domain_name               = "corp.local." # Note the trailing dot
  enabled                   = true

  target_dns_servers {
    ip_address = "10.50.0.4" # Your On-Prem DNS Server IP
    port       = 53
  }
}

4. Linking to Spokes

To make this work for your entire environment, you must link the ruleset to your spoke VNets.

Terraform

resource "azurerm_private_dns_resolver_virtual_network_link" "spoke_links" {
  name                      = "link-spoke-app"
  dns_forwarding_ruleset_id = azurerm_private_dns_resolver_dns_forwarding_ruleset.onprem.id
  virtual_network_id        = azurerm_virtual_network.spoke_app.id
}

Important Terraform Considerations for 2026:

  • Trailing Dots: Always include the trailing dot in domain_name (e.g., internal.com.). If omitted, some versions of the provider may show a perpetual “diff” in your plan.
  • IP Outputs: Since the Inbound Endpoint uses dynamic allocation, use an output block to capture the IP address. You’ll need this IP to configure the conditional forwarders on your On-Premises DNS servers:

Terraform

output "dns_inbound_ip" {
  value = azurerm_private_dns_resolver_inbound_endpoint.inbound.ip_configurations[0].private_ip_address
}
  • Lifecycle Management: Endpoints can take 10-15 minutes to deploy. If you are running this in a CI/CD pipeline, ensure your timeout settings are sufficient.

Does your current Terraform setup include a central “Hub” module where this configuration would live?