Deploy AKS Clusters with Terraform: Best Practices

To deploy a production-ready AKS cluster using Terraform, it is best practice to separate your Network (VNet/Subnet) from the AKS Cluster resource. This ensures that if you ever need to destroy the cluster, your networking infrastructure remains intact.

Here is a clean, modular example using the AzureRM provider.

1. The Provider Configuration

First, create a main.tf to define your requirements.

Terraform

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0" # The examples below use 3.x argument names; 4.x renames some of them (e.g. enable_auto_scaling)
    }
  }
}

provider "azurerm" {
  features {}
}

2. Networking Resources

AKS needs a dedicated subnet. We’ll use Azure CNI (Advanced Networking) as it’s the standard for enterprise security.

Terraform

resource "azurerm_resource_group" "aks_rg" {
  name     = "rg-production-aks"
  location = "East US"
}

resource "azurerm_virtual_network" "aks_vnet" {
  name                = "vnet-aks-prod"
  location            = azurerm_resource_group.aks_rg.location
  resource_group_name = azurerm_resource_group.aks_rg.name
  address_space       = ["10.0.0.0/16"]
}

resource "azurerm_subnet" "aks_subnet" {
  name                 = "snet-aks-nodes"
  resource_group_name  = azurerm_resource_group.aks_rg.name
  virtual_network_name = azurerm_virtual_network.aks_vnet.name
  address_prefixes     = ["10.0.1.0/24"]
}
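If the network is managed in its own Terraform configuration, as the introduction recommends, the cluster configuration can look the subnet up instead of declaring it. A minimal sketch, assuming the VNet and subnet above already exist under the same names; the data source label aks_nodes is only illustrative:

```hcl
# Look up a subnet that was created by a separate networking configuration.
data "azurerm_subnet" "aks_nodes" {
  name                 = "snet-aks-nodes"
  virtual_network_name = "vnet-aks-prod"
  resource_group_name  = "rg-production-aks"
}

# In the cluster's default_node_pool you would then reference:
#   vnet_subnet_id = data.azurerm_subnet.aks_nodes.id
```

With that split, running terraform destroy against the cluster configuration has no network resources in its state to remove.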

3. The AKS Cluster Resource

This block includes the security features we discussed: System Assigned Identity, Azure RBAC, and Azure Linux as the OS.

Terraform

resource "azurerm_kubernetes_cluster" "aks" {
  name                = "aks-prod-01"
  location            = azurerm_resource_group.aks_rg.location
  resource_group_name = azurerm_resource_group.aks_rg.name
  dns_prefix          = "aksprod"

  # Azure Policy add-on (this is not the same thing as Azure RBAC)
  azure_policy_enabled = true

  # Disabling local accounts requires the AKS-managed Entra ID integration below
  local_account_disabled = true

  # Enable Azure RBAC for Kubernetes (azurerm 3.x syntax)
  azure_active_directory_role_based_access_control {
    managed            = true
    azure_rbac_enabled = true
  }

  default_node_pool {
    name           = "systempool"
    node_count     = 3
    vm_size        = "Standard_DS2_v2"
    vnet_subnet_id = azurerm_subnet.aks_subnet.id

    # Use Azure Linux for better security/performance
    os_sku = "AzureLinux"

    # Enable auto-scaling for production
    enable_auto_scaling = true
    min_count           = 3
    max_count           = 5
  }

  identity {
    type = "SystemAssigned"
  }

  network_profile {
    network_plugin    = "azure"
    load_balancer_sku = "standard"
    network_policy    = "azure" # Enables Kubernetes Network Policies
  }

  tags = {
    Environment = "Production"
    ManagedBy   = "Terraform"
  }
}
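One thing to watch with auto-scaling: once the autoscaler changes the node count, a later plan may try to set node_count back to 3. A common way to avoid that, sketched here as an addition to the same resource block, is to ignore drift on that one attribute:

```hcl
resource "azurerm_kubernetes_cluster" "aks" {
  # ... existing config ...

  lifecycle {
    # Let the cluster autoscaler own the node count after creation.
    ignore_changes = [
      default_node_pool[0].node_count,
    ]
  }
}
```

The min_count and max_count bounds stay under Terraform's control; only the live count is left alone.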

4. Essential Outputs

You’ll need the cluster configuration to connect via kubectl.

Terraform

output "client_certificate" {
  value     = azurerm_kubernetes_cluster.aks.kube_config[0].client_certificate
  sensitive = true
}

output "kube_config" {
  value     = azurerm_kubernetes_cluster.aks.kube_config_raw
  sensitive = true
}

Key Implementation Steps

  1. Initialize: Run terraform init to download the Azure provider.
  2. Plan: Run terraform plan -out=main.tfplan to preview the 4 resources being created.
  3. Apply: Run terraform apply "main.tfplan".
  4. Connect: Once finished, use the Azure CLI to get your credentials:

Bash

az aks get-credentials --resource-group rg-production-aks --name aks-prod-01

Why this is a “Support Pro” Move

By delivering this in Terraform, you are telling the company: “I don’t just click buttons in the portal. I provide Infrastructure as Code that is version-controlled, repeatable, and documented.” This makes it much easier to propose a “Disaster Recovery” service later on.

Integrating the Azure Key Vault (AKV) Secrets Store CSI Driver into your Terraform code is the final step in removing sensitive data (like database passwords or API keys) from your Kubernetes manifests.

Here is the additional code to enable the driver and set up the necessary permissions.


1. Enable the CSI Driver in AKS

In your azurerm_kubernetes_cluster resource block (from the previous code), you need to add the key_vault_secrets_provider block:

Terraform

resource "azurerm_kubernetes_cluster" "aks" {
  # ... existing config ...

  key_vault_secrets_provider {
    secret_rotation_enabled  = true
    secret_rotation_interval = "2m"
  }
}

2. Create the Key Vault

You need a vault to actually store the secrets.

Terraform

resource "azurerm_key_vault" "kv" {
  name                        = "kv-prod-aks-01"
  location                    = azurerm_resource_group.aks_rg.location
  resource_group_name         = azurerm_resource_group.aks_rg.name
  enabled_for_disk_encryption = true
  tenant_id                   = data.azurerm_client_config.current.tenant_id
  sku_name                    = "standard"

  # Best practice: Don't use access policies, use RBAC
  enable_rbac_authorization = true
}

data "azurerm_client_config" "current" {}
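With RBAC authorization turned on, creating the vault does not by itself grant access to the secrets inside it, so whoever adds secrets needs a data-plane role as well. A minimal sketch, assuming the identity running Terraform is the one that should manage secrets; the resource label kv_secrets_officer is illustrative:

```hcl
# Allow the identity running Terraform to create and update secrets in the vault.
resource "azurerm_role_assignment" "kv_secrets_officer" {
  scope                = azurerm_key_vault.kv.id
  role_definition_name = "Key Vault Secrets Officer"
  principal_id         = data.azurerm_client_config.current.object_id
}
```

Creating role assignments itself requires Owner or User Access Administrator rights on the scope, so this may need to be applied by a more privileged identity.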

3. Link AKS to Key Vault (The “Magic” Link)

When you enable the CSI driver add-on, AKS creates a user-assigned managed identity for it. You must give that identity permission to read secrets from the Key Vault.

Terraform

# Grant the managed identity created for the CSI driver add-on read access to secrets
resource "azurerm_role_assignment" "aks_kv_reader" {
  scope                = azurerm_key_vault.kv.id
  role_definition_name = "Key Vault Secrets User"
  principal_id         = azurerm_kubernetes_cluster.aks.key_vault_secrets_provider[0].secret_identity[0].object_id
}

4. Usage: The SecretProviderClass (K8s Manifest)

Terraform sets up the infrastructure, but you still need a small Kubernetes object to tell the pod which secrets to pull. You can apply this via kubectl or a Terraform kubernetes_manifest resource:

YAML

apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: azure-kv-provider
  namespace: production
spec:
  provider: azure
  parameters:
    usePodIdentity: "false"
    useVMManagedIdentity: "true"
    userAssignedIdentityID: "<AKS_CSI_CLIENT_ID>" # Output this from Terraform
    keyvaultName: "kv-prod-aks-01"
    objects: |
      array:
        - |
          objectName: db-password
          objectType: secret
    tenantId: "<YOUR_TENANT_ID>"
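The two placeholders in the manifest can be filled from Terraform outputs. A minimal sketch, assuming the cluster and data source names used earlier; the output names aks_csi_client_id and tenant_id are illustrative:

```hcl
# Client ID of the managed identity used by the Key Vault CSI driver add-on.
output "aks_csi_client_id" {
  value = azurerm_kubernetes_cluster.aks.key_vault_secrets_provider[0].secret_identity[0].client_id
}

# Tenant that owns the Key Vault.
output "tenant_id" {
  value = data.azurerm_client_config.current.tenant_id
}
```

After an apply, terraform output -raw aks_csi_client_id prints the value to paste into userAssignedIdentityID.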

Why this is a “Gold Standard” Setup

By using this approach, your Linux servers and Docker microservices become significantly more secure:

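  • No “Cleartext” Secrets: Production passwords never appear in manifests, repositories, or container images.
  • Auto-Rotation: If you change the password in Key Vault, the CSI driver refreshes the mounted file inside the running container on its next poll, which is every 2 minutes with the interval configured above. The application still has to re-read the file to pick up the new value.
  • Audit Trail: Key Vault can log every secret read to Azure Monitor, provided diagnostic settings are enabled on the vault.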
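The audit trail depends on the vault's diagnostic logs being routed somewhere. A minimal sketch, assuming a new Log Analytics workspace is acceptable as the destination; both resource names are hypothetical, and the enabled_log block assumes a recent 3.x provider:

```hcl
# Destination for the audit logs (hypothetical name).
resource "azurerm_log_analytics_workspace" "logs" {
  name                = "log-prod-aks-01"
  location            = azurerm_resource_group.aks_rg.location
  resource_group_name = azurerm_resource_group.aks_rg.name
  sku                 = "PerGB2018"
  retention_in_days   = 30
}

# Send Key Vault audit events (including secret reads) to the workspace.
resource "azurerm_monitor_diagnostic_setting" "kv_audit" {
  name                       = "kv-audit-logs"
  target_resource_id         = azurerm_key_vault.kv.id
  log_analytics_workspace_id = azurerm_log_analytics_workspace.logs.id

  enabled_log {
    category = "AuditEvent"
  }
}
```

The log entries identify the managed identity that read the secret, so they point at the cluster's CSI driver rather than at an individual pod.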

Pro-Tip for your Proposal

When talking to the client, use this phrasing:

“I am implementing a Zero-Trust Secret Architecture. This ensures that sensitive credentials never touch our code repository or container images, and they are rotated automatically to prevent long-term credential leakage.”
