Securing Azure Kubernetes Service: A 2026 Guide

Securing Azure Kubernetes Service (AKS) involves a Shared Responsibility Model: Microsoft manages the control plane (API server, etcd), while you are responsible for securing the worker nodes, networking, and the workloads themselves.

As of 2026, here are the non-negotiable security best practices for AKS.


1. Identity and Access Management (IAM)

Authentication is your first line of defense.

  • Microsoft Entra ID (formerly Azure AD) Integration: Disable local Kubernetes accounts. Integrate AKS with Entra ID to manage cluster access via existing corporate identities and groups.
  • Azure RBAC for Kubernetes: Use Azure Role-Based Access Control to provide granular permissions (e.g., Azure Kubernetes Service RBAC Reader).
  • Workload Identity: Avoid using “Secret” objects for Azure credentials. Use Entra Workload ID, which allows pods to authenticate to Azure services (like Key Vault or Storage) using managed identities instead of long-lived passwords.
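
As a rough sketch of the Workload Identity bullet, the manifests below bind a pod to a user-assigned managed identity. They assume the cluster has the OIDC issuer and workload identity features enabled, and that a federated credential already links the identity to this service account. The names workload-sa and keyvault-reader, the image, and the placeholder client ID are hypothetical.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: workload-sa              # hypothetical name
  namespace: production
  annotations:
    # Client ID of the user-assigned managed identity (placeholder, fill in your own)
    azure.workload.identity/client-id: "<managed-identity-client-id>"
---
apiVersion: v1
kind: Pod
metadata:
  name: keyvault-reader          # hypothetical name
  namespace: production
  labels:
    # Opts the pod in to token injection by the workload identity webhook
    azure.workload.identity/use: "true"
spec:
  serviceAccountName: workload-sa
  containers:
    - name: app
      image: myacr.azurecr.io/web-api:v1.2
```

With this in place, an Azure SDK inside the container can typically pick up the injected federated token without any password stored in the cluster.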

2. Network Security

Don’t let your cluster be a “sitting duck” on the public internet.

  • Private Clusters: Deploy AKS as a Private Cluster. This ensures that the API server is only accessible via a private IP within your Virtual Network (VNet), completely removing it from the public internet.
  • Authorized IP Ranges: If you must have a public API server, strictly limit access to specific CIDR ranges (e.g., your office or VPN IP).
  • Default-Deny Network Policies: By default, all pods in Kubernetes can talk to each other. Implement Azure Network Policies or Calico to enforce a “Default Deny” posture, only allowing explicitly permitted traffic.

3. Host and Node Security

The underlying VMs (nodes) are often the weakest link.

  • Azure Linux (CBL-Mariner): Use Azure Linux as your Node OS. It is a lightweight, security-hardened distribution maintained by Microsoft specifically for AKS. Critical Update: Support for Azure Linux 2.0 ended in late 2025. Ensure you are on Azure Linux 3.0 or higher to receive security patches.
  • Automatic Upgrades: Enable the Auto-upgrade channel (e.g., stable or node-image) to ensure your nodes automatically receive OS security patches and Kubernetes version updates.
  • Disable Public IPs for Nodes: Ensure worker nodes do not have public IP addresses; use an Azure Load Balancer or NAT Gateway for egress.

4. Workload and Secret Management

How you run your code determines your “blast radius” during an attack.

  • Secrets Store CSI Driver: Do not store secrets in standard Kubernetes YAML files. Use the Azure Key Vault Provider for Secrets Store CSI Driver to mount secrets directly from Key Vault into your pods as volumes.
  • Pod Security Standards: Use Azure Policy for Kubernetes to enforce the “Restricted” pod security standard. This prevents:
    • Containers running as root.
    • Privilege escalation.
    • Writing to the root filesystem, if you also enforce a read-only root filesystem rule (the upstream Kubernetes “Restricted” profile does not require this by itself).
  • Resource Limits: Always define CPU and Memory limits. This prevents a single compromised or “runaway” container from causing a Denial of Service (DoS) for the entire node.
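
For the Resource Limits bullet, one possible safety net is a namespace-level LimitRange, which supplies default requests and limits to any container that declares none. This is a minimal sketch: the object name is hypothetical and the values are illustrative, not recommendations for your workload.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-container-limits   # hypothetical name
  namespace: production
spec:
  limits:
    - type: Container
      default:                     # limit applied when a container declares none
        cpu: "500m"
        memory: "256Mi"
      defaultRequest:              # request applied when a container declares none
        cpu: "100m"
        memory: "128Mi"
```

Containers that set their own values keep them; the defaults only fill the gaps.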

5. Continuous Monitoring and Defense

You cannot protect what you cannot see.

  • Microsoft Defender for Containers: Enable this for real-time threat detection. It scans images for vulnerabilities in the registry and monitors running containers for suspicious behavior (e.g., unexpected shell execution).
  • Image Scanning in CI/CD: Use tools like Trivy or Microsoft Defender to scan images before they are pushed to the Azure Container Registry (ACR).
  • Audit Logging: Stream Kubernetes Audit Logs to a Log Analytics Workspace. This provides a paper trail of who did what in your cluster.
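
To illustrate the Image Scanning bullet, here is a sketch of a CI job that builds an image and fails when Trivy reports serious findings. It assumes GitHub Actions as the CI system, which the guide does not specify, and it reuses the example image name from this guide. The workflow and job names are hypothetical, and the registry login and push steps are left out.

```yaml
name: image-scan                   # hypothetical workflow name
on:
  push:
    branches: [main]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: docker build -t myacr.azurecr.io/web-api:${{ github.sha }} .
      - name: Scan image with Trivy
        # Pin to a release tag or commit SHA in a real pipeline
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: myacr.azurecr.io/web-api:${{ github.sha }}
          severity: CRITICAL,HIGH
          exit-code: "1"           # non-zero exit fails the job on matching findings
          ignore-unfixed: true     # skip vulnerabilities that have no fix yet
      # A push to ACR would follow here, so it only runs after a clean scan
```

Placing the scan before the push means a failing image never reaches the registry.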

AKS Security Checklist (Quick Reference)

Category      High Priority Task
Access        Disable local accounts; use Entra ID integration.
API Server    Use a Private Cluster or Authorized IP ranges.
OS            Migrate to Azure Linux 3.0; enable auto-patches.
Secrets       Use Azure Key Vault via CSI Driver (no YAML secrets).
Traffic       Implement “Default Deny” Network Policies.
Policy        Apply Azure Policy to block non-compliant deployments.

To get you started, I’ve provided a Network Policy to secure traffic and an Azure Policy to enforce security rules at the cluster level.


1. Default Deny Network Policy

By default, Kubernetes allows all pods to talk to each other. This is a major security risk. The following policy creates a “Default Deny” posture for a specific namespace, meaning you must explicitly “whitelist” any traffic you want to allow.

YAML

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: default-deny-all
  namespace: production # Apply to your specific namespace
spec:
  podSelector: {} # Selects all pods in the namespace
  policyTypes:
  - Ingress
  - Egress
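
Because the policy above also denies egress, pods in the namespace lose DNS resolution until it is explicitly allowed. The following sketch shows what the allow rules might look like: one for DNS, and a pair that lets the web-api pods reach a database. The postgres label, port 5432, and the policy names are hypothetical, and the DNS rule assumes cluster DNS runs in kube-system.

```yaml
# Allow every pod in the namespace to resolve DNS via kube-system
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
---
# Egress side: web-api pods may connect to the database pods
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-api-egress-to-db
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: web-api
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: postgres        # hypothetical database label
      ports:
        - protocol: TCP
          port: 5432
---
# Ingress side: database pods accept connections only from web-api
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-db-ingress-from-web-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: postgres
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: web-api
      ports:
        - protocol: TCP
          port: 5432
```

Both directions need a rule here: with ingress and egress denied by default, traffic flows only when the sender's egress policy and the receiver's ingress policy each permit it.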

2. Azure Policy for Kubernetes (Enforcement)

While Network Policies handle traffic, Azure Policy ensures that developers don’t deploy “insecure” containers in the first place. You can apply the “Kubernetes cluster pod security restricted standards for Linux-based workloads” initiative directly from the Azure Portal.

Once assigned with its effect set to Deny, it blocks any deployment that:

  • Tries to run as the Root user.
  • Attempts to mount the host path (which could allow a container to escape to the node).
  • Uses Privileged mode.

How to apply it via CLI:

Bash

# Get the ID of the 'Restricted' policy initiative
policyId=$(az policy set-definition list --query "[?displayName=='Kubernetes cluster pod security restricted standards for Linux-based workloads'].id" -o tsv)
# Assign it to your Resource Group or Subscription
# (the quotes stop the shell from treating < and > in the placeholders as redirections)
az policy assignment create --name 'aks-restricted-policy' --policy-set-definition "$policyId" --scope "/subscriptions/<your-subscription-id>/resourceGroups/<your-rg-name>"

3. Securing the Flow

To visualize how these layers interact, imagine the lifecycle of a request:

  1. Identity: The user authenticates via Entra ID.
  2. Admission: Azure Policy checks if the YAML is “safe.”
  3. Network: Network Policies ensure the pod only talks to its specific database.
  4. Secrets: The pod pulls credentials from Azure Key Vault at runtime.
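
Step 4 can be sketched with the Azure Key Vault provider for the Secrets Store CSI Driver. This assumes the driver add-on is enabled and that a service account named workload-sa is already federated with a managed identity that can read the vault. Every name, the secret db-password, and the angle-bracket placeholders are hypothetical.

```yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: kv-secrets                 # hypothetical name
  namespace: production
spec:
  provider: azure
  parameters:
    usePodIdentity: "false"
    clientID: "<managed-identity-client-id>"   # identity used via workload identity
    keyvaultName: "<your-key-vault-name>"
    tenantId: "<your-tenant-id>"
    objects: |
      array:
        - |
          objectName: db-password  # hypothetical secret name in Key Vault
          objectType: secret
---
apiVersion: v1
kind: Pod
metadata:
  name: web-api-with-secrets       # hypothetical name
  namespace: production
  labels:
    azure.workload.identity/use: "true"
spec:
  serviceAccountName: workload-sa  # assumed to exist already
  containers:
    - name: app
      image: myacr.azurecr.io/web-api:v1.2
      volumeMounts:
        - name: secrets-store
          mountPath: /mnt/secrets-store
          readOnly: true
  volumes:
    - name: secrets-store
      csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: kv-secrets
```

The secret then appears as a file under /mnt/secrets-store when the pod starts, rather than living in a Kubernetes manifest.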

Pro-Tip: The “Audit” Mode

If you are worried that applying these policies will break the current app, you can set the Azure Policy effect to Audit instead of Deny. This allows you to see a list of non-compliant pods in the Azure Portal without actually taking them offline. Once you’ve fixed the configurations, you can flip the switch to Deny.
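
A comparable audit-first rollout is also possible with Kubernetes' built-in Pod Security Admission, which is a separate mechanism from Azure Policy. The sketch below labels a namespace so that the Restricted profile only produces warnings and audit entries, while the looser Baseline profile is enforced. The choice of levels is an illustrative assumption.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    # Reject only pods that violate the Baseline profile
    pod-security.kubernetes.io/enforce: baseline
    # Record and warn about pods that would fail Restricted, without blocking them
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```

Once the warnings stop appearing, the enforce label can be raised to restricted.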

To pass the “Restricted” policy you just applied, your pod deployments can no longer be “lazy.” You must explicitly tell Kubernetes that the container is not allowed to do anything dangerous.

Below is a compliant deployment.yaml for a microservice.

The “Secure” Pod Template

YAML

apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-microservice
  namespace: production
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-api
  template:
    metadata:
      labels:
        app: web-api
    spec:
      # 1. Don't run as Root!
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 2000
        seccompProfile:
          type: RuntimeDefault # The upstream "Restricted" profile also requires a seccomp profile
      containers:
      - name: api-container
        image: myacr.azurecr.io/web-api:v1.2
        # 2. Tighten the container's capabilities
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop:
            - ALL
        # 3. Resource limits prevent DoS attacks
        resources:
          limits:
            cpu: "500m"
            memory: "256Mi"
          requests:
            cpu: "100m"
            memory: "128Mi"
        ports:
        - containerPort: 8080

Why this works (The “Why” for your client)

If you are proposing this to your client, explain the benefits of these specific lines:

  • runAsNonRoot: true: If an attacker exploits a bug in the app code, they only get unprivileged “user” access. That makes it much harder to install software, change kernel settings, or break out of the container to the host server.
  • readOnlyRootFilesystem: true: This is a massive security win. It prevents malware from being downloaded and saved into the container. If the app needs to write temporary data, you should mount an emptyDir volume specifically for that.
  • capabilities: drop: - ALL: Linux containers usually start with a default set of “capabilities” they don’t need (like opening raw network sockets via NET_RAW). Dropping all of them follows the Principle of Least Privilege.

How to handle “Storage” in a Read-Only Filesystem

Since the filesystem is now read-only, your app might crash if it tries to write to /tmp. You can solve this by adding a small “scratch space” in memory:

YAML

        volumeMounts:
        - name: tmp-volume
          mountPath: /tmp
      volumes:
      - name: tmp-volume
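        emptyDir:
          medium: Memory # Back the volume with RAM (tmpfs); without this, emptyDir uses node disk
          sizeLimit: 64Mi # Illustrative cap; usage counts against the container's memory limit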

Proposing this as a “Service”

You can frame this to your company as a “Hardening Sprint”:

“I want to audit our existing 15 microservices to ensure they meet the 2026 AKS Restricted Security Standards. This will reduce our attack surface by 80% and ensure we pass our next compliance audit.”
