Certificate Management in Kubernetes
Why It Matters
Kubernetes workloads need TLS certificates for ingress (HTTPS), service-to-service mTLS, and webhook endpoints. Manual cert management doesn’t scale — automation is essential.
The Cert-Management Stack
```text
┌───────────────────────────────────────────────────────┐
│                   INGRESS / GATEWAY                   │
│        NGINX / Traefik / Azure Application GW         │
└───────────────────────────┬───────────────────────────┘
                            ↓
┌───────────────────────────────────────────────────────┐
│                     CERT-MANAGER                      │
│   Watches → Requests → Stores → Renews certificates   │
└───────┬──────────────────┬────────────────────────────┘
        ↓                  ↓
┌───────────────┐ ┌──────────────────┐
│   Issuer /    │ │   Certificate    │
│ ClusterIssuer │ │      (CRD)       │
└───────────────┘ └──────────────────┘
        ↓
┌───────────────────────────────────────────────────────┐
│                CERTIFICATE AUTHORITIES                │
│ Let's Encrypt │ Azure Key Vault │ Vault │ Self-signed │
└───────────────────────────────────────────────────────┘
                            ↓
┌───────────────────────────────────────────────────────┐
│                  KUBERNETES SECRETS                   │
│         TLS cert stored as kubernetes.io/tls          │
└───────────────────────────────────────────────────────┘
```
Option 1 — cert-manager (Industry Standard)
Install cert-manager
```bash
# Install via Helm
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.14.0 \
  --set installCRDs=true \
  --set global.leaderElection.namespace=cert-manager

# Verify
kubectl get pods -n cert-manager
# NAME                           READY
# cert-manager-xxxx              1/1
# cert-manager-cainjector-xxxx   1/1
# cert-manager-webhook-xxxx      1/1
```
Issuers — Who Signs Your Certs
1. Let’s Encrypt (Public HTTPS)
```yaml
# ClusterIssuer: available across all namespaces
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: devops@yourcompany.com
    privateKeySecretRef:
      name: letsencrypt-prod-key
    solvers:
      - http01:
          ingress:
            class: nginx   # or azure/application-gateway
---
# Staging (for testing: much higher rate limits, but the certificates
# are not trusted by browsers)
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: devops@yourcompany.com
    privateKeySecretRef:
      name: letsencrypt-staging-key
    solvers:
      - http01:
          ingress:
            class: nginx
```
2. Self-Signed (Internal / Dev)
```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: selfsigned-issuer
spec:
  selfSigned: {}
---
# Create a CA from self-signed
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: internal-ca
  namespace: cert-manager
spec:
  isCA: true
  commonName: internal-ca
  secretName: internal-ca-secret
  privateKey:
    algorithm: ECDSA
    size: 256
  issuerRef:
    name: selfsigned-issuer
    kind: ClusterIssuer
---
# Use the CA as an issuer
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: internal-ca-issuer
spec:
  ca:
    secretName: internal-ca-secret
```
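Once `internal-ca-issuer` exists, workloads can request leaf certificates from it. The following is a minimal sketch rather than a definitive manifest: the `orders` service name, the resource names, and the lifetime values are hypothetical and chosen only for illustration.

```yaml
# Leaf certificate signed by the internal CA defined above.
# All names below (orders-*, the DNS names) are illustrative assumptions.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: orders-internal-tls
  namespace: production
spec:
  secretName: orders-internal-tls-secret   # cert-manager writes tls.crt, tls.key, ca.crt here
  duration: 2160h                          # 90 days
  renewBefore: 360h                        # renew 15 days before expiry
  dnsNames:
    - orders.production.svc
    - orders.production.svc.cluster.local
  usages:
    - server auth
    - client auth                          # allows the same cert to be used for mTLS
  privateKey:
    algorithm: ECDSA
    size: 256
  issuerRef:
    name: internal-ca-issuer
    kind: ClusterIssuer
```

Clients only trust this certificate if they trust the internal CA, so the `ca.crt` entry from the resulting Secret (or the CA certificate itself) has to be distributed to them.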
3. Vault Issuer (Enterprise)
```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: vault-issuer
spec:
  vault:
    server: https://vault.internal.company.com
    path: pki/sign/my-role
    auth:
      kubernetes:
        mountPath: /v1/auth/kubernetes
        role: cert-manager
        secretRef:
          name: vault-token
          key: token
```
Certificate Resources
```yaml
# Manually request a certificate
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: api-tls
  namespace: production
spec:
  secretName: api-tls-secret   # Where cert is stored
  duration: 2160h              # 90 days (ACME CAs such as Let's Encrypt set their own lifetime)
  renewBefore: 360h            # Renew 15 days before expiry
  subject:
    organizations:
      - Acme Corp
  commonName: api.acme.com
  dnsNames:
    - api.acme.com
    - api-internal.acme.com
  # Private IP SANs only work with an internal issuer (CA, Vault);
  # a public CA such as Let's Encrypt cannot validate them.
  # ipAddresses:
  #   - 10.0.1.100
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  privateKey:
    algorithm: RSA
    size: 2048
    rotationPolicy: Always     # Rotate key on renewal
```
Ingress — Auto Certificate (Most Common Pattern)
```yaml
# Just annotate your Ingress; cert-manager does the rest
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  namespace: production
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.acme.com
      secretName: api-tls-secret   # cert-manager creates this
  rules:
    - host: api.acme.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 80
```
Option 2 — DNS-01 Challenge (Wildcard Certs)
Use DNS-01 when you need wildcard certificates (Let's Encrypt issues them only through DNS-01) or when the HTTP-01 challenge isn't possible (internal clusters, firewalled environments).
```yaml
# Azure DNS solver
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: devops@company.com
    privateKeySecretRef:
      name: letsencrypt-dns-key
    solvers:
      - dns01:
          azureDNS:
            subscriptionID: "your-subscription-id"
            resourceGroupName: "rg-dns"
            hostedZoneName: "acme.com"
            environment: AzurePublicCloud
            managedIdentity:
              clientID: "your-managed-identity-client-id"
---
# Request wildcard cert
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-tls
  namespace: production
spec:
  secretName: wildcard-tls-secret
  dnsNames:
    - "*.acme.com"
    - "acme.com"
  issuerRef:
    name: letsencrypt-dns
    kind: ClusterIssuer
```
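HTTP-01 and DNS-01 do not have to live in separate issuers. A single ACME issuer can list several solvers and use a `selector` to decide which one handles a given name. The sketch below assumes a hypothetical `internal.acme.com` zone that is not reachable from the internet; the issuer name and key Secret name are also illustrative, and the Azure values are the same placeholders used above.

```yaml
# One issuer, two solvers: HTTP-01 by default, DNS-01 for an internal zone.
# The issuer name and the internal.acme.com zone are hypothetical.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-mixed
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: devops@company.com
    privateKeySecretRef:
      name: letsencrypt-mixed-key
    solvers:
      # No selector: used for any name not matched by a more specific solver
      - http01:
          ingress:
            class: nginx
      # Names under internal.acme.com are validated with DNS-01 instead
      - selector:
          dnsZones:
            - internal.acme.com
        dns01:
          azureDNS:
            subscriptionID: "your-subscription-id"
            resourceGroupName: "rg-dns"
            hostedZoneName: "acme.com"
            environment: AzurePublicCloud
            managedIdentity:
              clientID: "your-managed-identity-client-id"
```

With this layout a Certificate for `api.acme.com` would be solved over HTTP-01, while one for `db.internal.acme.com` should fall through to the Azure DNS solver. It is worth confirming the selection on a staging issuer before relying on it.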
Option 3 — Azure Key Vault Integration (AKS)
Using CSI Secrets Driver (Mount cert as volume)
```bash
# Enable on AKS
az aks enable-addons \
  --addons azure-keyvault-secrets-provider \
  --name myAKSCluster \
  --resource-group myResourceGroup
```
```yaml
# SecretProviderClass: pull cert from Key Vault
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: azure-kv-tls
  namespace: production
spec:
  provider: azure
  parameters:
    usePodIdentity: "false"
    clientID: "your-managed-identity-client-id"
    keyvaultName: "kv-mycompany-prod"
    tenantID: "your-tenant-id"
    objects: |
      array:
        - |
          objectName: api-tls-cert
          objectType: secret        # certificate stored as secret in KV
          objectVersion: ""
  secretObjects:
    - secretName: api-tls-k8s-secret   # creates K8s secret
      type: kubernetes.io/tls
      data:
        - objectName: api-tls-cert
          key: tls.crt
        - objectName: api-tls-cert
          key: tls.key
```
```yaml
# Pod that mounts the cert
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: production     # must match the SecretProviderClass namespace
spec:
  selector:                 # required for apps/v1 Deployments
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: myapp:latest
          volumeMounts:
            - name: tls-secret
              mountPath: "/mnt/tls"
              readOnly: true
      volumes:
        - name: tls-secret
          csi:
            driver: secrets-store.csi.k8s.io
            readOnly: true
            volumeAttributes:
              secretProviderClass: "azure-kv-tls"
```
Using akv2k8s (Azure Key Vault to Kubernetes)
```bash
# Install
helm repo add spv-charts https://charts.spvapi.no
helm install akv2k8s spv-charts/akv2k8s \
  --namespace akv2k8s \
  --create-namespace
```
```yaml
# Sync cert from Key Vault to K8s Secret automatically
apiVersion: spv.no/v2beta1
kind: AzureKeyVaultSecret
metadata:
  name: api-tls-sync
  namespace: production
spec:
  vault:
    name: kv-mycompany-prod
    object:
      name: api-tls-cert
      type: certificate
  output:
    secret:
      name: api-tls-secret
      type: kubernetes.io/tls
```
Option 4 — mTLS with Service Mesh
For service-to-service cert management inside the cluster.
Istio (Auto mTLS)
```yaml
# Enable strict mTLS for a namespace
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: production
spec:
  mtls:
    mode: STRICT   # All service-to-service must use mTLS
---
# Istio handles cert issuance, rotation, distribution
# automatically via its built-in CA (istiod)
# Zero config needed per service
```
Linkerd (Automatic mTLS)
```bash
# Install Linkerd (mTLS is on by default)
curl --proto '=https' --tlsv1.2 -sSfL https://run.linkerd.io/install | sh
export PATH=$HOME/.linkerd2/bin:$PATH   # the install script places the CLI here
linkerd install --crds | kubectl apply -f -
linkerd install | kubectl apply -f -

# Annotate namespace; pods get the proxy (and mTLS) once they are recreated
kubectl annotate namespace production \
  linkerd.io/inject=enabled
```
Certificate Rotation & Monitoring
Check Certificate Status
```bash
# List all certificates
kubectl get certificates -A

# Check cert details
kubectl describe certificate api-tls -n production

# Check cert expiry
kubectl get secret api-tls-secret -n production \
  -o jsonpath='{.data.tls\.crt}' | \
  base64 -d | openssl x509 -noout -dates

# Check cert-manager logs
kubectl logs -n cert-manager \
  deployment/cert-manager -f
```
Monitor Expiry with Prometheus + Grafana
```yaml
# cert-manager exposes metrics; scrape them with Prometheus
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cert-manager
  namespace: cert-manager
spec:
  selector:
    matchLabels:
      app: cert-manager
  endpoints:
    - port: http-metrics
      interval: 60s
```
Key Prometheus metrics:
```text
certmanager_certificate_expiration_timestamp_seconds
certmanager_certificate_ready_status
certmanager_http_acme_client_request_duration_seconds
```
Alert Rules
```yaml
# PrometheusRule: alert before cert expires
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cert-expiry-alerts
spec:
  groups:
    - name: certificates
      rules:
        - alert: CertificateExpiringSoon
          expr: |
            certmanager_certificate_expiration_timestamp_seconds - time() < 7 * 24 * 3600
          for: 1h
          labels:
            severity: warning
          annotations:
            summary: "Certificate expiring in < 7 days"
            description: "{{ $labels.namespace }}/{{ $labels.name }}"
        - alert: CertificateExpired
          expr: |
            certmanager_certificate_expiration_timestamp_seconds - time() < 0
          labels:
            severity: critical
          annotations:
            summary: "Certificate has EXPIRED"
```
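Expiry alerts only fire late in a certificate's life. The `certmanager_certificate_ready_status` metric listed under the key metrics can surface a failed issuance or renewal much earlier. The rule below is a sketch: it assumes the metric carries a `condition` label (as it does in recent cert-manager releases), and the rule name, namespace, and 15-minute window are illustrative choices.

```yaml
# Alert when a Certificate reports Ready=False for a sustained period.
# Rule name, namespace and thresholds are assumptions; tune them to your setup.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cert-not-ready-alerts
  namespace: cert-manager
spec:
  groups:
    - name: certificate-readiness
      rules:
        - alert: CertificateNotReady
          expr: |
            certmanager_certificate_ready_status{condition="False"} == 1
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: "Certificate has not been Ready for 15 minutes"
            description: "{{ $labels.namespace }}/{{ $labels.name }}"
```

The `for: 15m` window is meant to avoid paging on the short not-ready period during a normal issuance.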
Troubleshooting Common Issues
```bash
# 1. Certificate stuck in "False" ready state
kubectl describe certificate api-tls -n production
kubectl describe certificaterequest -n production
kubectl describe order -n production
kubectl describe challenge -n production

# 2. HTTP-01 challenge failing
# Check ingress is reachable on port 80
# Check /.well-known/acme-challenge/ path is not blocked

# 3. Rate limited by Let's Encrypt
# Switch to letsencrypt-staging for testing
# Production limit: 50 certs/domain/week

# 4. Secret not being created
kubectl get events -n production --sort-by='.lastTimestamp'

# 5. Webhook issues
kubectl get validatingwebhookconfigurations | grep cert-manager
kubectl logs -n cert-manager deployment/cert-manager-webhook
```
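For item 3, one way to debug without touching the production quota is a throwaway Certificate that points at the `letsencrypt-staging` ClusterIssuer defined earlier. This is only a sketch; the resource and Secret names are hypothetical, and the resulting certificate is not trusted by browsers.

```yaml
# Temporary test certificate issued by the Let's Encrypt staging endpoint.
# Names ending in -staging-test are illustrative; delete the resource after testing.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: api-tls-staging-test
  namespace: production
spec:
  secretName: api-tls-staging-test-secret   # separate Secret, so the live cert is untouched
  dnsNames:
    - api.acme.com
  issuerRef:
    name: letsencrypt-staging
    kind: ClusterIssuer
```

If this reaches `Ready`, the challenge path works and the remaining problem is most likely the production rate limit rather than the cluster configuration.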
Decision Guide
| Scenario | Solution |
|---|---|
| Public HTTPS on AKS ingress | cert-manager + Let’s Encrypt HTTP-01 |
| Wildcard cert (*.domain.com) | cert-manager + Let’s Encrypt DNS-01 |
| Internal cluster, no internet | cert-manager + self-signed CA or Vault |
| Certs managed in Azure Key Vault | CSI Secrets Driver or akv2k8s |
| Service-to-service mTLS | Istio or Linkerd (mesh handles it) |
| Enterprise PKI / custom CA | cert-manager + Vault issuer |
| Dev / local cluster | cert-manager + self-signed |
Best Practices
| Practice | Why |
|---|---|
| Always use ClusterIssuer over Issuer | Reusable across namespaces |
| Set renewBefore to 15–30 days | Buffer time if renewal fails |
| Set rotationPolicy: Always | Rotate private key on every renewal |
| Use staging LE first | Avoid hitting production rate limits |
| Monitor expiry via Prometheus | Catch failures before users do |
| Store CA private key in Key Vault | Never leave it only in K8s Secret |
| Use DNS-01 for internal clusters | HTTP-01 requires public exposure |
| Enable mTLS via service mesh | Zero-config service-to-service security |
cert-manager is the de facto standard — deploy it first, then layer in Key Vault integration and service mesh mTLS for a fully automated, enterprise-grade certificate lifecycle.