Ingress

In Kubernetes, Ingress is an API object that acts as a “smart router” for your cluster. While a standard Service (like a LoadBalancer) simply opens a hole in the firewall for one specific app, Ingress allows you to consolidate many services behind a single entry point and route traffic based on the URL or path.

Think of it as the receptionist of an office building: instead of every employee having their own front door, everyone uses one main entrance, and the receptionist directs visitors to the correct room based on who they are looking for.


1. How Ingress Works

There are two distinct parts required to make this work:

  1. Ingress Resource: A YAML file where you define your “rules” (e.g., “Send all traffic for myapp.com/api to the api-service”).
  2. Ingress Controller: The actual software (like NGINX, HAProxy, or Traefik) that sits at the edge of your cluster, reads those rules, and physically moves the traffic. Kubernetes does not come with a controller by default; you must install one.

2. Key Capabilities

Ingress is much more powerful than a NodePort or LoadBalancer Service because it operates at Layer 7 (HTTP/HTTPS).

  • Host-based Routing: Route blue.example.com to the Blue Service and green.example.com to the Green Service using a single IP.
  • Path-based Routing: Route example.com/login to the Auth service and example.com/search to the Search service.
  • SSL/TLS Termination: You can handle your SSL certificates at the Ingress level so your individual application pods don’t have to deal with encryption/decryption.
  • Name-based Virtual Hosting: Supporting multiple domain names on the same IP address.
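
As an illustration, host-based routing can be expressed in a single Ingress (a minimal sketch; the hostnames and the blue-service/green-service names are placeholders):

YAML

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: host-routing-example
spec:
  rules:
  - host: blue.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: blue-service
            port:
              number: 80
  - host: green.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: green-service
            port:
              number: 80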

3. Ingress vs. LoadBalancer vs. NodePort

Choosing how to expose your app is a common point of confusion. Here is the breakdown:

Method | Best For | Pros/Cons
NodePort | Testing/Dev | Opens a high-range port (30000-32767) on every node. Not recommended for production.
LoadBalancer | Single Apps | Standard way to get an external IP. Expensive if you have 50 apps, as you pay for 50 cloud load balancers.
Ingress | Production | Multiple apps share one Load Balancer. Extremely cost-effective and feature-rich.

4. What about OpenShift “Routes”?

If you are working with OpenShift (OCP), it's important to note that it has its own version of this concept called a Route.

  • Routes were created by Red Hat before the Kubernetes Ingress standard existed.
  • In modern OpenShift, an Ingress is automatically converted into a Route behind the scenes.
  • The main difference: Routes natively support extra router features (such as TLS passthrough and re-encrypt termination on the built-in HAProxy router), while Ingress is the industry-standard, portable way to do it. A minimal Route is sketched below.
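
The host, service name, and port in this sketch are placeholders:

YAML

apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: my-route
spec:
  host: myapp.example.com
  to:
    kind: Service
    name: my-web-service
  port:
    targetPort: 8080
  tls:
    termination: edge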

5. Basic Ingress Example

This YAML routes traffic to two different services based on the URL path:

YAML

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: main-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /billing
        pathType: Prefix
        backend:
          service:
            name: billing-service
            port:
              number: 80
      - path: /inventory
        pathType: Prefix
        backend:
          service:
            name: inventory-service
            port:
              number: 80

To set up HTTPS for an Ingress, you need a TLS Secret. This secret acts as a container for your SSL certificate and private key, which the Ingress Controller (like NGINX or HAProxy) uses to encrypt the traffic before it reaches your pods.


1. Create the TLS Secret

First, you need your certificate files (tls.crt and tls.key). You can create a Kubernetes Secret from these files using the CLI:

Bash

kubectl create secret tls my-app-tls \
  --cert=path/to/tls.crt \
  --key=path/to/tls.key

Note: If you are using OpenShift, you can often skip this manual step by using Service Serving Certificates, which allow OpenShift to generate and manage the certificates for you automatically.
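
For example, annotating a Service like this (a sketch; the secret name is arbitrary) makes OpenShift generate the certificate and store it in that secret automatically:

YAML

apiVersion: v1
kind: Service
metadata:
  name: my-web-service
  annotations:
    service.beta.openshift.io/serving-cert-secret-name: my-app-tls
spec:
  selector:
    app: my-web-app
  ports:
  - port: 8443
    targetPort: 8443

Note that these certificates are signed by the cluster's internal service CA, so they are best suited for in-cluster traffic rather than public-facing hosts.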


2. Update the Ingress YAML

Now, you tell the Ingress resource to use that secret for a specific hostname.

YAML

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: secure-ingress
  annotations:
    # This annotation tells the controller to redirect HTTP to HTTPS
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  tls:
  - hosts:
    - myapp.example.com
    secretName: my-app-tls  # This must match the secret name created in Step 1
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-web-service
            port:
              number: 80

3. Automation with Cert-Manager

Manually updating certificates before they expire is a headache. Most production clusters use Cert-Manager.

Cert-Manager is an operator that talks to certificate authorities like Let’s Encrypt. You simply add an annotation to your Ingress, and Cert-Manager handles the rest:

The “Magic” Annotation:

YAML

metadata:
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"

Once you add this, Cert-Manager will:

  1. See the Ingress request.
  2. Reach out to Let’s Encrypt to verify you own the domain.
  3. Generate the tls.crt and tls.key.
  4. Create the Secret for you and renew it every 90 days automatically.

Summary Checklist for HTTPS

Step | Action
1. Certificate | Obtain a CA-signed cert or use Let's Encrypt.
2. Secret | Store the cert/key in a kind: Secret (type kubernetes.io/tls).
3. Ingress Spec | Add the tls: section to your Ingress YAML.
4. DNS | Ensure your domain points to the Ingress Controller's IP.

To automate SSL certificates with Cert-Manager, you need a ClusterIssuer. This is a cluster-wide resource that tells Cert-Manager how to talk to a Certificate Authority (CA) like Let’s Encrypt.

Before you start, ensure the Cert-Manager Operator is installed in your cluster (in OpenShift, you can find this in the OperatorHub).


1. Create a ClusterIssuer (The “Account”)

This YAML defines your identity with Let’s Encrypt. It uses the ACME (Automated Certificate Management Environment) protocol.

YAML

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    # The ACME server address for Let's Encrypt production
    server: https://acme-v02.api.letsencrypt.org/directory
    # Your email address for expiration notices
    email: admin@yourdomain.com
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    # Enable the HTTP-01 challenge provider
    solvers:
    - http01:
        ingress:
          class: nginx  # Or 'openshift-default' depending on your ingress controller

2. Update your Ingress to “Request” the Cert

Once the ClusterIssuer is created, you don’t need to manually create secrets anymore. You just “tag” your Ingress with an annotation. Cert-Manager will see this, perform the challenge, and create the secret for you.

YAML

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-secure-app
  annotations:
    # THIS IS THE TRIGGER: It links the Ingress to your ClusterIssuer
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  tls:
  - hosts:
    - app.yourdomain.com
    secretName: app-tls-cert  # Cert-Manager will create this secret automatically
  rules:
  - host: app.yourdomain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-service
            port:
              number: 80

3. How to verify it’s working

After you apply the Ingress, Cert-Manager creates a Certificate object and a Challenge object. You can track the progress:

  • Check the certificate status: kubectl get certificate (look for READY: True).
  • Check the challenge status (if it's stuck): kubectl get challenges
  • Check the secret: kubectl get secret app-tls-cert (if this exists, your site is now serving HTTPS).

Why use Let’s Encrypt?

  1. Cost: It is 100% free.
  2. Trust: It is recognized by all major browsers (unlike self-signed certs).
  3. No Maintenance: Cert-Manager automatically renews the cert 30 days before it expires.

A Small Warning:

Let’s Encrypt has rate limits. If you are just testing, use the “Staging” URL (https://acme-staging-v02.api.letsencrypt.org/directory) first. Browsers will show a warning for staging certs, but you won’t get blocked for hitting limit thresholds while debugging.

When Cert-Manager fails to issue a certificate, it usually gets stuck in the Challenge phase. Let’s look at how to diagnose and fix the most common “Let’s Encrypt” roadblocks.


1. The Troubleshooting Command Chain

If your certificate isn’t appearing, follow this hierarchy to find where the “handshake” broke:

  • Check the Certificate status: oc get certificate. If READY is False, move to the next step.
  • Check the Order: oc get order. The Order is the request sent to Let's Encrypt; look at the STATE column.
  • Check the Challenge (the most important step): oc get challenges. If a challenge exists, it means Let's Encrypt is trying to verify your domain but can't.
  • Describe the Challenge for the error message: oc describe challenge <challenge-name>

2. Common Failure Reasons

A. The “I Can’t See You” (Firewall/Network)

Let’s Encrypt uses the HTTP-01 challenge. It tries to reach http://yourdomain.com/.well-known/acme-challenge/<TOKEN>.

  • The Problem: Your firewall, Security Group (AWS/Azure), or OpenShift Ingress Controller is blocking Port 80.
  • The Fix: Ensure Port 80 is open to the public internet. Let’s Encrypt cannot verify your domain over Port 443 (HTTPS) because the certificate doesn’t exist yet!

B. DNS Record Mismatch

  • The Problem: Your DNS A record or CNAME for app.yourdomain.com hasn’t propagated yet or is pointing to the wrong Load Balancer IP.
  • The Fix: Use dig app.yourdomain.com or nslookup to ensure the domain points exactly to your Ingress Controller’s external IP.
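
For a quick check, assuming an NGINX ingress controller exposed through a Service in the ingress-nginx namespace (adjust names to your environment):

Bash

# What the world currently resolves for your hostname
dig +short app.yourdomain.com

# The external IP/hostname your Ingress Controller is actually using
kubectl get svc -n ingress-nginx ingress-nginx-controller -o wide

The two values must match before the HTTP-01 challenge can succeed.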

C. Rate Limiting

  • The Problem: You’ve tried to issue the same certificate too many times in one week (Let’s Encrypt has a limit of 5 duplicate certs per week).
  • The Fix: Switch your ClusterIssuer to use the Staging URL (mentioned in the previous step) until your configuration is 100% correct, then switch back to Production.

3. Dealing with Internal/Private Clusters

If your OpenShift cluster is behind a VPN and not accessible from the public internet, the HTTP-01 challenge will always fail because Let’s Encrypt can’t “see” your pods.

The Solution: DNS-01 Challenge

Instead of a web check, Cert-Manager proves ownership by adding a temporary TXT record to your DNS provider (Route53, Cloudflare, Azure DNS).

Example DNS-01 Issuer (Route53):

YAML

spec:
  acme:
    solvers:
    - dns01:
        route53:
          region: us-east-1
          hostedZoneID: Z123456789
          # Credentials come from an IAM role or an access-key secret (not shown)

Summary Checklist

  1. Is Port 80 open?
  2. Does DNS point to the cluster?
  3. Are you hitting Rate Limits?
  4. Is your Ingress Class correct in the Issuer?

Persistent Volumes (PV) and Persistent Volume Claims (PVC)

In Kubernetes, storage is handled separately from your application’s logic. To understand Persistent Volumes (PV) and Persistent Volume Claims (PVC), it helps to use the “Electricity” analogy:

  • PV (The Infrastructure): This is like the power plant and the grid. It’s the actual physical storage (a disk, a cloud drive, or a network share).
  • PVC (The Request): This is like the power outlet in your wall. Your application “plugs in” to the PVC to get what it needs without needing to know where the power plant is.

1. Persistent Volume (PV)

A PV is a piece of storage in the cluster that has been provisioned by an administrator or by a storage class. It is a cluster-level resource (like a Node) and exists independently of any individual Pod.

  • Capacity: How much space is available (e.g., 5Gi, 100Gi).
  • Access Modes:
    • ReadWriteOnce (RWO): Can be mounted by one node at a time.
    • ReadOnlyMany (ROX): Many nodes can read it simultaneously.
    • ReadWriteMany (RWX): Many nodes can read and write at the same time (requires a backend that supports it, such as NFS or ODF).
  • Reclaim Policy: What happens to the data when you delete the PVC? (Retain it for manual cleanup, or Delete it immediately.)
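
For reference, a statically provisioned PV might look like this (a minimal NFS-backed sketch; the server and path are placeholders):

YAML

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-example
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: nfs.example.com
    path: /exports/data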

2. Persistent Volume Claim (PVC)

A PVC is a request for storage by a user. If a Pod needs a “hard drive,” it doesn’t look for a specific disk; it creates a PVC asking for “10Gi of storage with ReadWriteOnce access.”

  • The “Binding” Process: Kubernetes looks at all available PVs. If it finds a PV that matches the PVC’s request, it “binds” them together.
  • Namespace Scoped: Unlike PVs, PVCs live inside a specific Namespace.

3. Dynamic Provisioning (StorageClasses)

In modern clusters (like OpenShift), admins don’t manually create 100 different PVs. Instead, they use a StorageClass.

  1. The user creates a PVC.
  2. The StorageClass notices the request.
  3. It automatically talks to the cloud provider (AWS/Azure/GCP) to create a new disk.
  4. It automatically creates the PV and binds it to the PVC.
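
As an illustration, a StorageClass for AWS EBS gp3 volumes might look like this (a sketch; the parameters depend on your CSI driver):

YAML

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-csi
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true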

4. How a Pod uses it

Once the PVC is bound to a PV, you tell your Pod to use that “outlet.”

YAML

spec:
  containers:
  - name: my-db
    image: postgres
    volumeMounts:
    - mountPath: "/var/lib/postgresql/data"
      name: my-storage
  volumes:
  - name: my-storage
    persistentVolumeClaim:
      claimName: task-pv-claim  # This matches the name of your PVC

Summary Comparison

Feature | Persistent Volume (PV) | Persistent Volume Claim (PVC)
Who creates it? | Administrator or Storage System | Developer / Application
Scope | Cluster-wide | Namespace-specific
Analogy | The actual Hard Drive | The request for a Hard Drive
Lifecycle | Exists even if no one uses it | Tied to the application's needs

Here is a standard YAML example for a Persistent Volume Claim (PVC).

In this scenario, we aren’t manually creating a disk. Instead, we are telling OpenShift/Kubernetes: “I need 10Gi of fast storage. Please go talk to the cloud provider or storage backend and create it for me.”

1. The PVC Definition

This is the “request” for storage.

YAML

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc-example
  namespace: my-app-project
spec:
  storageClassName: gp3-csi  # Or 'thin', 'ocs-storagecluster-ceph-rbd', etc.
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

2. How the Binding Works

When you apply this YAML, the following chain reaction happens:

  1. The Claim: You submit the PVC.
  2. The Provisioner: The StorageClass (e.g., AWS EBS, Azure Disk, or OpenShift Data Foundation) sees the request.
  3. The Asset: The storage backend creates a physical 10Gi volume.
  4. The Volume: Kubernetes automatically creates a PersistentVolume (PV) object to represent that physical disk.
  5. The Binding: The PVC status changes from Pending to Bound.

3. Attaching the PVC to a Pod

A PVC is useless until a Pod “claims” it. Here is how you mount that 10Gi disk into a container:

YAML

apiVersion: v1
kind: Pod
metadata:
  name: storage-test-pod
spec:
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: my-data-volume
      mountPath: /usr/share/nginx/html  # Where the disk appears inside the container
  volumes:
  - name: my-data-volume
    persistentVolumeClaim:
      claimName: dynamic-pvc-example  # Must match the name in the PVC YAML

Important “Gotchas” with PVCs

  • Access Modes:
    • ReadWriteOnce (RWO): Most common. If Pod A is using the disk on Node 1, Pod B cannot use it if Pod B is on Node 2.
    • ReadWriteMany (RWX): Required if you want multiple Pods across different nodes to share the same files (common for web servers sharing an uploads folder).
  • Expansion: Many modern StorageClasses allow you to increase the storage size in the PVC YAML after it's created, and Kubernetes will expand the disk on the fly, provided the underlying storage supports it (see the patch example below).
  • Sticky Nodes: If you use a cloud-based RWO disk (like AWS EBS), your Pod becomes "stuck" to the availability zone where that disk was created.
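
For example, assuming the StorageClass has allowVolumeExpansion: true, you can grow the claim from the earlier example in place:

Bash

oc patch pvc dynamic-pvc-example \
  -p '{"spec":{"resources":{"requests":{"storage":"20Gi"}}}}'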

Checking for available StorageClasses is one of the most common tasks for an OpenShift administrator or developer. It tells you exactly what “flavors” of storage are available for your apps.

1. Using the CLI (Recommended)

Run the following command to see a list of all storage providers configured in your cluster:

Bash

oc get storageclass

(Or use the shorthand: oc get sc)

Example Output:

Plaintext

NAME                PROVISIONER               RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
gp3-csi (default)   ebs.csi.aws.com           Delete          WaitForFirstConsumer   true                   45d
thin                kubernetes.io/vsphere     Delete          Immediate              false                  102d
ocs-storage-rbd     openshift-storage.rbd...  Delete          Immediate              true                   12d

2. How to Read the Results

  • NAME: This is what you put in the storageClassName field of your PVC YAML.
  • (default): If you see this next to a name, it means any PVC that doesn’t specify a class will automatically get this one.
  • PROVISIONER: This tells you the underlying technology (e.g., AWS EBS, VMware vSphere, or Ceph/ODF).
  • RECLAIMPOLICY:
    • Delete: When you delete the PVC, the physical disk is also deleted.
    • Retain: When you delete the PVC, the physical disk stays (so you can recover data manually).
  • VOLUMEBINDINGMODE:
    • Immediate: The disk is created the second you create the PVC.
    • WaitForFirstConsumer: The disk isn’t created until a Pod actually tries to use it. This is smarter because it ensures the disk is created in the same Availability Zone as the Pod.

3. Inspecting a Specific StorageClass

If you want to see the “fine print” (like encryption settings or IOPS), you can look at the YAML of a specific class:

Bash

oc get sc gp3-csi -o yaml

4. Using the Web Console

If you prefer the GUI:

  1. Log in to the OpenShift Web Console.
  2. Ensure you are in the Administrator perspective.
  3. Navigate to Storage -> StorageClasses.
  4. Here, you can see all classes, which one is the default, and even create new ones using a wizard.

Pro-Tip: If your cluster has no default storage class, your PVCs will stay in a Pending state forever unless you explicitly name one in your YAML.
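
If you need to designate a default, you can patch the class's annotation (using the gp3-csi class from the example output above):

Bash

oc patch storageclass gp3-csi \
  -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'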

Autoscaling in Kubernetes

Autoscaling in Kubernetes is the process of automatically adjusting your resources to match the current demand. Instead of guessing how many servers or how much memory you need, Kubernetes monitors your traffic and “flexes” the infrastructure in real-time.

There are three main “layers” of autoscaling. Think of them as a chain: if one layer can’t handle the load, the next one kicks in.


1. Horizontal Pod Autoscaler (HPA)

The Concept: Adding more “lanes” to the highway.

HPA is the most common form of scaling. It increases or decreases the number of pod replicas based on metrics like CPU usage, memory, or custom traffic data.

  • How it works: It checks your pods every 15 seconds. If the average CPU across all pods is above your target (e.g., 70%), it tells the Deployment to spin up more pods.
  • Best for: Stateless services like web APIs or microservices that can handle traffic by simply having more copies running.

2. Vertical Pod Autoscaler (VPA)

The Concept: Making the “cars” bigger.

VPA doesn’t add more pods; instead, it looks at a single pod and decides if it needs more CPU or Memory. It “right-sizes” your containers.

  • How it works: It observes your app’s actual usage over time. If a pod is constantly hitting its memory limit, VPA will recommend (or automatically apply) a higher limit.
  • The Catch: Currently, in most versions of Kubernetes, changing a pod’s size requires restarting the pod.
  • Best for: Stateful apps (like databases) that can’t easily be “split” into multiple copies, or apps where you aren’t sure what the resource limits should be.
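
A minimal VPA sketch looks like this (it requires the VPA components/operator to be installed in the cluster; the target Deployment name is a placeholder):

YAML

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-db-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-db
  updatePolicy:
    updateMode: "Auto"  # Use "Off" to only collect recommendations without restarting pods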

3. Cluster Autoscaler (CA)

The Concept: Adding more “pavement” to the highway.

HPA and VPA scale Pods, but eventually, you will run out of physical space on your worker nodes (VMs). This is where the Cluster Autoscaler comes in.

  • How it works: It watches for “Pending” pods—pods that want to run but can’t because no node has enough free CPU/RAM. When it sees this, it calls your cloud provider (AWS, Azure, GCP) and asks for a new VM to be added to the cluster.
  • Downscaling: It also watches for underutilized nodes. If a node is mostly empty, it will move those pods elsewhere and delete the node to save money.

The “Scaling Chain” in Action

Imagine a sudden surge of users hits your website:

  1. HPA sees high CPU usage and creates 10 new Pods.
  2. The cluster is full, so those 10 Pods stay in Pending status.
  3. Cluster Autoscaler sees the Pending pods and provisions 2 new Worker Nodes.
  4. The Pods finally land on the new nodes, and your website stays online.

Comparison Summary

Feature | HPA | VPA | Cluster Autoscaler
What it scales | Number of Pods | Size of Pods (CPU/RAM) | Number of Nodes (VMs)
Primary Goal | Handle traffic spikes | Optimize resource efficiency | Provide hardware capacity
Impact | Fast, no downtime | Usually requires pod restart | Slower (minutes to boot VM)

Pro-Tip: Never run HPA and VPA on the same metric (like CPU) for the same app. They will “fight” each other—HPA will try to add pods while VPA tries to make them bigger, leading to a “flapping” state where your app is constantly restarting.

To set up a Horizontal Pod Autoscaler (HPA), you need two things: a Deployment (your app) and an HPA resource that watches it.

Here is a breakdown of how to configure this in a way that actually works.

1. The Deployment

First, your pods must have resources.requests defined. If the HPA doesn’t know how much CPU a pod should use, it can’t calculate the percentage.

YAML

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  replicas: 1
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: registry.k8s.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m  # HPA uses this as the baseline

2. The HPA Resource

This YAML tells Kubernetes: “Keep the average CPU usage of these pods at 50%. If it goes higher, spin up more pods (up to 10). If it goes lower, scale back down to 1.”

YAML

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

3. How to Apply and Test

You can apply these using oc apply -f <filename>.yaml (in OpenShift) or kubectl apply.

Once applied, you can watch the autoscaler in real-time:

  • View status: oc get hpa
  • Watch it live: oc get hpa php-apache-hpa --watch

The Calculation Logic:

The HPA uses a specific formula to decide how many replicas to run:

$$\text{Desired Replicas} = \lceil \text{Current Replicas} \times \frac{\text{Current Metric Value}}{\text{Desired Metric Value}} \rceil$$

Quick Tip: If you are using OpenShift, you can also do this instantly via the CLI without a YAML file:

oc autoscale deployment/php-apache --cpu-percent=50 --min=1 --max=10

To make your autoscaling more robust, you can combine CPU and Memory metrics in a single HPA. Kubernetes will look at both and scale based on whichever one hits the limit first.

Here is the updated YAML including both resource types and a “Scale Down” stabilization period to prevent your cluster from “flapping” (rapidly adding and removing pods).

1. Advanced HPA YAML (CPU + Memory)

YAML

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: advanced-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: advanced-app
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # Wait 5 mins before scaling down to ensure traffic is actually gone
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60

2. Scaling on Custom Metrics (e.g., HTTP Requests)

Sometimes CPU doesn’t tell the whole story. If your app is waiting on a database, CPU might stay low while users experience lag. In these cases, you can scale based on Requests Per Second (RPS).

To use this, you must have the Prometheus Adapter installed (which comes standard in OpenShift’s monitoring stack).

YAML

  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 500 # Scale up if pods average more than 500 requests/sec


Pro-Tips for Memory Scaling

  1. Memory is “Sticky”: Unlike CPU, which drops the moment a process finishes, many runtimes (like Java/JVM or Node.js) do not immediately release memory back to the OS.
  2. The Danger: If your app doesn’t have a good Garbage Collector configuration, the HPA might see high memory usage, spin up 10 pods, and never scale back down because the memory stays “reserved” by the app.
  3. The Fix: Always ensure your memory.requests in the Deployment are set to what the app actually needs to start, not its peak limit.

Summary Table: Which metric to use?

Scenario | Recommended Metric | Why?
Calculation heavy | CPU | Directly maps to processing power.
Caching/Large Data | Memory | Prevents OOM (Out of Memory) kills.
Web APIs | Requests Per Second | Scales based on actual user load.
Message Queue | Queue Depth | Scales based on "work to be done."

When an HPA isn’t behaving as expected—maybe it’s not scaling up during a spike, or it’s “stuck” at the minimum replicas—you need to look at the Controller Manager’s internal logic.

Here is how you can perform a “health check” on your HPA’s decision-making process.


1. The “Describe” Command (Most Useful)

The describe command provides a chronological log of every scaling action and, more importantly, why a request failed.

Bash

oc describe hpa advanced-app-hpa

What to look for in the “Events” section:

  • SuccessfulRescale: The HPA successfully changed the replica count.
  • FailedComputeMetricsReplicas: Usually means the HPA can’t talk to the Metrics Server (check if your pods have resources.requests defined!).
  • FailedGetResourceMetric: The pods might be crashing or “Unready,” so the HPA can’t pull their CPU/Memory usage.

2. Checking the “Conditions”

In the output of the describe command, look for the Conditions section. It tells you the current “brain state” of the autoscaler:

Condition | Status | Meaning
AbleToScale | True | The HPA is healthy and can talk to the Deployment.
ScalingActive | True | Metrics are being received and the scaling logic is running.
ScalingLimited | True | Warning: You've hit your maxReplicas or minReplicas. It wants to scale further but you've capped it.

3. Real-time Metric Monitoring

If you want to see exactly what numbers the HPA is seeing right now compared to your target, use:

Bash

oc get hpa advanced-app-hpa -w

Example Output:

Plaintext

NAME               REFERENCE                 TARGETS            MINPODS   MAXPODS   REPLICAS   AGE
advanced-app-hpa   Deployment/advanced-app   75%/60%, 40%/80%   2         15        5          10m

In this example, CPU is at 75% (above the 60% target), so it has already scaled to 5 replicas.


4. Debugging Common “Stuck” Scenarios

Scenario A: TARGETS shows "<unknown>"

If the TARGETS column shows <unknown>, it almost always means:

  1. Missing Requests: You forgot to set resources.requests in your Deployment YAML.
  2. Metrics Server Down: The cluster-wide metrics service is having issues.
  3. Labels Mismatch: The HPA selector doesn’t match the Deployment labels.

Scenario B: High CPU but No Scaling

Check if the pods are in a Ready state. HPA ignores “Unready” pods to prevent scaling up based on the high CPU usage often seen during a container’s startup/boot phase.


Pro-Tip: The “Cooldown” Period

If you just stopped a load test and the pods are still running, don’t panic! By default, Kubernetes has a 5-minute stabilization window for scaling down. This prevents the “Flapping” effect where pods are deleted and then immediately recreated because of a small traffic blip.

How to audit an MCP server

Auditing a Model Context Protocol (MCP) server in 2026 requires a shift from traditional web auditing to Agentic Security Auditing. Since an LLM acts as the user of your server, you must audit not just the code, but the “instructions” and “boundaries” presented to the AI.

Here is the professional workflow for conducting a comprehensive MCP server audit.


1. Discovery & Tool Inspection

The first step is to see exactly what the AI sees. A malicious agent or a prompt injection can only exploit what is exposed in the tool definitions.

  • Use the MCP Inspector: Run npx @modelcontextprotocol/inspector to launch a local GUI. Connect your server and inspect the Tools tab.
  • Audit Tool Descriptions: Check if the descriptions are too “helpful.”
    • Bad: “This tool runs any bash command.”
    • Good: “This tool lists files in the /public directory only.”
  • Schema Strictness: Ensure every tool uses strict JSON Schema. AI agents are prone to “hallucinating” extra arguments; your server should reject any input that doesn’t perfectly match the schema.

2. Static Analysis (The “Code” Audit)

Since most MCP servers are written in TypeScript or Python, use standard security scanners with MCP-specific rules.

  • Dependency Check: Use npm audit or pip-audit. MCP is a new ecosystem; many early community servers use outdated, vulnerable libraries.
  • Path Traversal Check: This is the #1 vulnerability in MCP (found in 80% of filesystem-based servers).
    • Audit Task: Search your code for fs.readFile or open(). Ensure user-provided paths are sanitized using path.resolve and checked against a “Root” directory.
  • Command Injection: If your tool executes shell commands (e.g., a Git or Docker tool), ensure inputs are passed as arrays, never as strings.
    • Vulnerable: exec("git log " + user_input)
    • Secure: spawn("git", ["log", user_input])

3. Runtime & Behavioral Auditing

In 2026, we use eBPF-based monitoring or MCP Gateways to watch what the server actually does during a session.

  • Sandbox Verification: Run the server in a restricted Docker container. Audit the Dockerfile to ensure it runs as a non-root user (USER node or USER python).
  • Network Egress Audit: Does your server need to talk to the internet? If it’s a “Local File” tool, use firewall rules (or Docker network flags) to block all outgoing traffic. This prevents “Data Exfiltration” where an AI is tricked into sending your files to a remote server.
  • AIVSS Scoring: Use the AI Vulnerability Scoring System (AIVSS) to rank findings. A “Prompt Injection” that leads to a file read is a High; a “Prompt Injection” that leads to a shell execution is Critical.

4. The 2026 Audit Checklist

If you are performing a formal audit, ensure you can check “Yes” to all of the following:

Category | Audit Check
Authentication | Does the server require a token for every request (especially for HTTP transports)?
Sanitization | Are all LLM-generated arguments validated against a regex or allowlist?
Least Privilege | Does the server only have access to the specific folders/APIs it needs?
Human-in-Loop | Are "Write" or "Delete" actions flagged to require manual user approval in the client?
Logging | Does the server log the User ID, Tool Name, and Arguments for every call?

5. Automated Auditing Tools

To speed up the process, you can use these 2026-standard tools:

  1. mcpserver-audit: A GitHub-hosted tool that scans MCP source code for common dangerous patterns (like unparameterized SQL or open shell calls).
  2. Trivy / Docker Scout: For scanning the container image where your MCP server lives.
  3. Semgrep (MCP Ruleset): Use specialized Semgrep rules designed to find “AI Injection” points in Model Context Protocol implementations.

Multi-Layered Test Plan

To perform a professional audit of an MCP server in 2026, you should follow a Multi-Layered Test Plan. Since MCP servers act as “Resource Servers” in an agentic ecosystem, your audit must verify that a compromised or malicious AI cannot “break out” of its intended scope.

Here is a 5-step Security Test Plan for an MCP server.


1. Static Analysis: “The Code Review”

Before running the server, scan the source code for common “agent-trap” patterns.

  • Check for shell=True (Python) or exec() (Node.js): These are the most common entry points for Remote Code Execution (RCE).
    • Test: Ensure all CLI tools use argument arrays instead of string concatenation.
  • Path Traversal Audit: Look for any tool that takes a path or filename as an argument.
    • Test: Verify that the code uses path.resolve() and checks if the resulting path starts with an allowed root directory.
    • Common Fail: Using simple string .startsWith() without resolving symlinks first (CVE-2025-53109).

2. Manifest & Metadata Audit

The LLM “sees” your server through its JSON-RPC manifest. If your tool descriptions are vague, the LLM might misuse them.

  • Tool Naming: Ensure tool names use snake_case (e.g., get_user_data) for optimal tokenization and clarity.
  • Prompt Injection Resilience: Check if tool descriptions include “Safety instructions.”
    • Example: “This tool reads files. Safety: Never read files ending in .env or .pem.”
  • Annotations: Verify that “destructive” tools (delete, update, send) are marked with destructiveHint: true. This triggers a mandatory confirmation popup in modern MCP clients like Cursor or Claude Desktop.

3. Dynamic “Fuzzing” (The AI Stress Test)

In 2026, we use tools like mcp-sec-audit to “fuzz” the server. This involves sending nonsensical or malicious JSON-RPC payloads to see how the server reacts.

Test Scenario | Payload Example | Expected Result
Path Traversal | {"path": "../../../etc/passwd"} | 403 Forbidden or "Error: Invalid Path"
Command Injection | {"cmd": "ls; rm -rf /"} | The server should treat ; as a literal string, not a command separator.
Resource Exhaustion | Calling read_file 100 times in 1 second | The server should trigger rate limiting.

4. Sandbox & Infrastructure Audit

An MCP server should never “run naked” on your host machine.

  • Docker Isolation: Audit the Dockerfile. It should use a distroless or minimal image (like alpine) and a non-root user.
  • Network Egress: Use iptables or Docker network policies to block the MCP server from reaching the internet unless its specific function requires it (e.g., a “Web Search” tool).
  • Memory/CPU Limits: Ensure the container has cpus: 0.5 and memory: 512mb limits to prevent a “Looping AI” from crashing your host.

5. OAuth 2.1 & Identity Verification

If your MCP server is shared over a network (HTTP transport), it must follow the June 2025 MCP Auth Spec.

  • PKCE Implementation: Verify that the server requires Proof Key for Code Exchange (PKCE) for all client connections. This prevents “Authorization Code Interception.”
  • Scope Enforcement: If a user only authorized the read_only scope, ensure the server rejects calls to delete_record even if the token is valid.
  • Audit Logging: Every tool call must be logged with:
    1. The user_id who initiated it.
    2. The agent_id that generated the call.
    3. The exact arguments used.

Pro-Tooling for 2026

  • MCP Inspector: Use npx @modelcontextprotocol/inspector for a manual “sanity check” of your tools.
  • Snyk / Trivy: Run these against your MCP server’s repository to catch vulnerable 3rd-party dependencies.


MCP security

The Model Context Protocol (MCP) is a powerful “USB-C for AI,” but because it allows LLMs to execute code and access private data, it introduces unique security risks.

In 2026, security for MCP has moved beyond simple API keys to a Zero Trust architecture. Here are the best practices for securing your MCP implementation.


1. The “Human-in-the-Loop” (HITL) Requirement

The most critical defense is ensuring an AI never executes “side-effect” actions (writing, deleting, or sending data) without manual approval.

  • Tiered Permissions: Classify tools into read-only (safe) and sensitive (requires approval).
  • Explicit Confirmation: The MCP client must display the full command and all arguments to the user before execution. Never allow the AI to “hide” parameters.
  • “Don’t Ask Again” Risks: Avoid persistent “allowlists” for bash commands or file writes; instead, scope approvals to a single session or specific directory.

2. Secure Architecture & Isolation

Running an MCP server directly on your host machine is a major risk. If the AI is tricked into running a malicious command, it has the same permissions as you.

  • Containerization: Always run MCP servers in a Docker container or a WebAssembly (Wasm) runtime. This prevents “Path Traversal” attacks where an AI might try to read your ~/.ssh/ folder.
  • Least Privilege: Use a dedicated, unprivileged service account to run the server. If the tool only needs to read one folder, do not give it access to the entire drive.
  • Network Egress: Block the MCP server from accessing the public internet unless it’s strictly necessary for that tool’s function.

3. Defense Against Injection Attacks

MCP is vulnerable to Indirect Prompt Injection, where a malicious instruction is hidden inside data the AI reads (like a poisoned webpage or email).

  • Tool Description Sanitization: Attackers can “poison” tool descriptions to trick the AI into exfiltrating data. Regularly audit the descriptions of third-party MCP servers.
  • Input Validation: Treat all inputs from the LLM as untrusted. Use strict typing (Pydantic/Zod) and regex patterns to ensure the AI isn’t passing malicious flags to a bash command.
  • Semantic Rate Limiting: Use an MCP Gateway to kill connections if an agent attempts to call a “Read File” tool hundreds of times in a few seconds—a classic sign of data exfiltration.

4. Identity & Authentication (2026 Standards)

For remote or enterprise MCP setups, static API keys are no longer sufficient.

  • OAuth 2.1 + PKCE: This is the mandated standard for HTTP-based MCP. It ensures that tokens are bound to specific users and cannot be easily intercepted.
  • Token Scoping: Never use a single “Master Key.” Issue short-lived tokens that are scoped only to the specific MCP tools the user needs.
  • Separation of Roles: Keep your Authorization Server (which identifies the user) separate from your Resource Server (the MCP server). This makes auditing easier and prevents a breach of one from compromising the other.

5. Supply Chain Security

The “Rug Pull” is a common 2026 threat where a popular open-source MCP server is updated with malicious code (e.g., a BCC field added to an email tool).

  • Pin Versions: Never pull the latest version of an MCP server in production. Pin to a specific, audited version or hash.
  • Vetted Registries: Only use servers from trusted sources like the Official MCP Catalog or internally vetted company registries.
  • Audit Logs: Log every tool invocation, including who requested it, what the arguments were, and what the output was.

Summary Checklist for Developers

Risk | Mitigation
Data Exfiltration | Disable network access for local tools; use PII redaction.
Command Injection | Use argument arrays (parameterized) instead of shell strings.
Unauthorized Access | Implement OAuth 2.1 with scope-based tool control.
Lateral Movement | Sandbox servers in Docker/Wasm; limit filesystem access.

MCP + Kubernetes Management

Here's a breakdown of how MCP applies across three infrastructure domains (Kubernetes management, Terraform pipelines, and observability):


MCP + Kubernetes Management

What it looks like: An LLM agent connects to a Kubernetes MCP server that exposes kubectl operations as tools. The agent can then:

  • list_pods(namespace) → find failing pods
  • get_pod_logs(pod, namespace) → fetch logs
  • describe_deployment(name) → inspect rollout status
  • scale_deployment(name, replicas) → auto-scale
  • apply_manifest(yaml) → deploy changes

Real implementations:

  • kubectl-ai — natural language to kubectl commands
  • Robusta — AI-powered Kubernetes troubleshooting with MCP support
  • k8s-mcp-server — open-source MCP server wrapping the Kubernetes API
  • OpenShift + ACM — Red Hat is building AI-assisted cluster management leveraging MCP for tool standardization

Example agent workflow:

User: "Why is the payments service degraded?"

Agent → list_pods(namespace="payments")
      → get_pod_logs(pod="payments-7f9b", tail=100)
      → describe_deployment("payments")
      → LLM reasons: "OOMKilled: memory limit too low"
      → Proposes: patch_deployment(memory_limit="1Gi")
      → HITL: "Approve this change?" → Engineer approves
      → apply_patch() → monitors rollout → confirms healthy


MCP + Terraform Pipelines

What it looks like: A Terraform MCP server exposes infrastructure operations. The agent can plan, review, and apply infrastructure changes conversationally.

MCP tools exposed:

  • terraform_plan(module, vars) → generate and review a plan
  • terraform_apply(plan_id) → apply approved changes
  • terraform_state_show(resource) → inspect current state
  • terraform_output(name) → read output values
  • detect_drift() → compare actual vs declared state

Key use cases:

  • Drift detection agent: continuously checks for infrastructure drift and auto-raises PRs to correct it
  • Cost optimization agent: analyzes Terraform state, identifies oversized resources, proposes rightsizing
  • Compliance agent: scans Terraform plans against OPA/Sentinel policies before apply
  • PR review agent: reviews Terraform PRs, flags security misconfigs, suggests improvements

Example pipeline:

PR opened with Terraform changes
       │
       ▼
MCP Terraform Agent
  ├── terraform_plan() → generates plan
  ├── scan_security(plan) → checks for open security groups, no encryption
  ├── estimate_cost(plan) → computes monthly cost delta
  ├── LLM summarizes: "This adds an unencrypted S3 bucket costing ~$12/mo"
  └── Posts review comment to PR with findings + recommendations


📊 MCP + Infrastructure Observability

What it looks like: Observability tools (Prometheus, Grafana, Loki, Datadog) are wrapped as MCP servers. The agent queries them in natural language and correlates signals across tools autonomously.

MCP tools exposed:

  • query_prometheus(promql, time_range) → fetch metrics
  • search_logs(query, service, time_range) → Loki/Elasticsearch
  • get_traces(service, error_only) → Jaeger/Tempo
  • list_active_alerts() → current firing alerts
  • get_dashboard(name) → Grafana snapshot
  • create_annotation(text, time) → mark events on dashboards

Key use cases:

  • Natural language observability: “Show me error rate for the checkout service in the last 30 mins” — no PromQL needed
  • Automated RCA: agent correlates metrics + logs + traces to pinpoint root cause
  • Alert noise reduction: agent groups related alerts, suppresses duplicates, and writes a single incident summary
  • Capacity planning: agent queries historical metrics, detects trends, forecasts when resources will be exhausted

🔗 How MCP Ties It All Together

The power of MCP is that a single agent can hold tools from all three domains simultaneously:

┌─────────────────────────────────────────────────────┐
│                   LLM Agent                         │
│              (Claude / GPT-4o)                      │
└────────────────────┬────────────────────────────────┘
                     │ MCP
        ┌────────────┼────────────┐
        ▼            ▼            ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────────┐
│  Kubernetes  │ │  Terraform   │ │  Observability   │
│  MCP Server  │ │  MCP Server  │ │   MCP Server     │
│  (kubectl,   │ │ (plan, apply,│ │(Prometheus, Loki,│
│   Helm, ACM) │ │    drift)    │ │ Grafana, Jaeger) │
└──────────────┘ └──────────────┘ └──────────────────┘

End-to-end scenario:

  1. Observability MCP detects CPU spike on node pool
  2. Agent queries Terraform MCP → finds node group is at max capacity
  3. Agent queries Kubernetes MCP → confirms pods are pending due to insufficient nodes
  4. Agent generates Terraform plan to scale node group from 3→5 nodes
  5. HITL approval → Terraform apply → Kubernetes confirms new nodes joined
  6. Agent posts incident summary to Slack with full audit trail

docker


Here are comprehensive Docker interview questions organized by level:


🟢 Beginner Level

Concepts

Q1: What is Docker and why is it used?

Docker is an open-source containerization platform that packages applications and their dependencies into lightweight, portable containers — ensuring they run consistently across any environment (dev, staging, production).

Q2: What is the difference between a container and a virtual machine?

Aspect | Container | Virtual Machine
OS | Shares host OS kernel | Has its own OS
Size | Lightweight (MBs) | Heavy (GBs)
Startup | Seconds | Minutes
Isolation | Process-level | Full hardware-level
Performance | Near-native | Overhead

Q3: What is a Docker image vs a Docker container?

  • Image — A read-only blueprint/template used to create containers
  • Container — A running instance of an image

Q4: What is a Dockerfile?

A text file containing step-by-step instructions to build a Docker image automatically.

Q5: What is Docker Hub?

A public cloud-based registry where Docker images are stored, shared, and distributed.


Basic Commands

Q6: What are the most common Docker commands?

docker build -t myapp .          # Build image
docker run -d -p 8080:80 myapp   # Run container
docker ps                        # List running containers
docker ps -a                     # List all containers
docker stop <container_id>       # Stop container
docker rm <container_id>         # Remove container
docker images                    # List images
docker rmi <image_id>            # Remove image
docker logs <container_id>       # View logs
docker exec -it <id> /bin/bash   # Enter container shell

Q7: What is the difference between CMD and ENTRYPOINT?

Aspect | CMD | ENTRYPOINT
Purpose | Default command, easily overridden | Fixed command, always executes
Override | Yes, at runtime | Only with --entrypoint flag
Use case | Flexible defaults | Enforced commands

ENTRYPOINT ["python"]   # always runs python
CMD ["app.py"]          # default arg, can be overridden

Q8: What is the difference between COPY and ADD?

  • COPY — Simply copies files from host to container (preferred)
  • ADD — Same as COPY but also supports URLs and auto-extracts tar files

🟡 Intermediate Level

Networking

Q9: What are Docker network types?

Network | Description | Use Case
bridge | Default, isolated network | Single host containers
host | Shares host network stack | High performance needs
none | No networking | Fully isolated containers
overlay | Multi-host networking | Docker Swarm / distributed apps

docker network create my-network
docker run --network my-network myapp

Q10: How do containers communicate with each other?

Containers on the same custom bridge network can communicate using their container name as hostname.

# Both containers on same network can reach each other by name

docker run --network my-net --name db postgres
docker run --network my-net --name app myapp   # app can reach "db"


Volumes & Storage

Q11: What is the difference between volumes, bind mounts, and tmpfs?

Type | Description | Use Case
Volume | Managed by Docker | Persistent data (databases)
Bind Mount | Maps host directory to container | Development, live code reload
tmpfs | Stored in memory only | Sensitive/temporary data

docker run -v myvolume:/data myapp          # volume

docker run -v /host/path:/container myapp   # bind mount

Q12: How do you persist data in Docker?

Use named volumes — data persists even after the container is removed.

docker volume create mydata

docker run -v mydata:/app/data myapp


Docker Compose

Q13: What is Docker Compose and when do you use it?

Docker Compose defines and runs multi-container applications using a single docker-compose.yml file.

version: "3.8"

services:
  app:
    build: .
    ports:
      - "8080:80"
    depends_on:
      - db
    environment:
      - DB_HOST=db
  db:
    image: postgres:15
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=secret

volumes:
  pgdata:

docker-compose up -d      # Start all services
docker-compose down       # Stop and remove
docker-compose logs -f    # Follow logs

Q14: What is the difference between docker-compose up and docker-compose start?

  • up — Creates and starts containers (builds if needed)
  • start — Starts existing stopped containers only

Images & Optimization

Q15: How do you reduce Docker image size?

  • Use minimal base images like alpine
  • Use multi-stage builds
  • Combine RUN commands to reduce layers
  • Use .dockerignore to exclude unnecessary files

# Multi-stage build example
FROM node:18 AS builder
WORKDIR /app
COPY . .
RUN npm ci && npm run build

FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html

Q16: What is a .dockerignore file?

Similar to .gitignore — tells Docker which files to exclude from the build context.

node_modules
.git
*.log
.env
dist


🔴 Advanced Level

Security

Q17: How do you secure Docker containers?

  • Run containers as non-root user
  • Use read-only filesystems where possible
  • Scan images for vulnerabilities (docker scout)
  • Limit container capabilities with –cap-drop
  • Never store secrets in Dockerfiles — use Docker Secrets or environment variables

# Run as non-root

RUN adduser --disabled-password appuser
USER appuser

Q18: What is the difference between docker save and docker export?

Aspect | docker save | docker export
Target | Image | Container
Includes | All layers & history | Flattened filesystem only
Use case | Backup/transfer images | Snapshot a running container

Performance & Production

Q19: How do you limit container resources?

docker run \
  --memory="512m" \
  --cpus="1.0" \
  --memory-swap="1g" \
  myapp

Q20: What is the difference between Docker Swarm and Kubernetes?

Aspect | Docker Swarm | Kubernetes
Complexity | Simple | Complex but powerful
Setup | Easy | Steeper learning curve
Scaling | Basic auto-scaling | Advanced auto-scaling
Community | Smaller | Very large
Best for | Small-medium workloads | Large enterprise workloads

Q21: What happens when a Docker container crashes?

Use restart policies to handle crashes automatically:

docker run --restart=always myapp          # Always restart
docker run --restart=on-failure:3 myapp    # Restart up to 3 times on failure
docker run --restart=unless-stopped myapp  # Restart unless manually stopped


Dockerfile Best Practices

Q22: What are Dockerfile best practices?

# ✅ Good Dockerfile
FROM node:18-alpine                  # Use minimal base image
WORKDIR /app
COPY package*.json ./                # Copy dependency files first
RUN npm ci --only=production         # Install dependencies
COPY . .                             # Copy source code
RUN adduser -D app                   # Create non-root user (Alpine adduser syntax)
USER app                             # Switch to non-root
EXPOSE 3000
CMD ["node", "server.js"]


⚡ Quick-Fire Questions

Question | Answer
Default Docker network? | bridge
Docker config file location? | /etc/docker/daemon.json
How to see container resource usage? | docker stats
How to copy files into a container? | docker cp file.txt container:/path
Difference between stop and kill? | stop = graceful (SIGTERM), kill = forceful (SIGKILL)
What is a dangling image? | An image with no tag, created by rebuilds
How to clean up unused resources? | docker system prune

kong – client certs

Short answer: Yes—use the full chain (leaf + intermediates), not just the leaf. Don’t include the root CA in the chain you send.

Here’s how it applies in the three common Kong TLS cases:

  1. Clients → Kong (mTLS client-auth)
  • The client must present its leaf cert + intermediate(s) during the handshake.
  • Kong must trust the issuing CA (configure trusted CA(s) for client verification).
  • If you only send the leaf, you’ll hit errors like “unable to get local issuer certificate.”

Example (client side):

# build a fullchain for the client cert (no root)
cat client.crt intermediate.crt > client-fullchain.crt

# test against Kong (mTLS)
curl --cert client-fullchain.crt --key client.key https://kong.example.com/secure

  2. Kong → Upstream (mTLS to your backend)
  • In Kong, create a Certificate whose cert field is full chain (leaf + intermediates) and key is the private key.
  • Attach it to the service via client_certificate.
  • Ensure the upstream trusts the issuing CA.

Kong (DB mode, gist):

# upload cert+key (cert must be full chain)
POST /certificates
{ "cert": "<PEM fullchain>", "key": "<PEM key>" }

# bind to service
PATCH /services/{id}
{ "client_certificate": "<certificate_id>" }

  3. Kong’s server cert (TLS termination at Kong)
  • Serve a full chain so browsers/clients validate without needing to have the intermediate locally.
  • If using Kong Ingress, put the full chain in tls.crt of the Kubernetes secret.

Quick checks & common pitfalls

  • Do not include the root CA in the chain you send.
  • Order matters: leaf first, then each intermediate in order up to (but excluding) the root.
  • If you see “No required SSL certificate was sent” → the client didn’t present a cert at all.
  • If you see “certificate verify failed” / “unable to get local issuer certificate” → chain or trust store problem (usually missing intermediate).
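
A quick way to sanity-check a chain before loading it into Kong (assuming the intermediate and root PEM files are on disk; file names are placeholders):

Bash

# Verify the leaf against the intermediate (and your root bundle)
openssl verify -CAfile root.crt -untrusted intermediate.crt client.crt

# See exactly which certificates a TLS endpoint presents during the handshake
openssl s_client -connect kong.example.com:443 -showcerts </dev/null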


latency

That message comes from Nginx (which Kong runs on). It means the upstream response didn’t fit in memory buffers, so Nginx spooled it to disk under /usr/local/kong/proxy_temp. It’s not fatal, but it adds I/O and latency.

Here are practical fixes—pick what matches your API pattern.

What you can do

1) Quick stopgap: don’t write to disk

Prevents temp-file writes; response still buffered in memory.

proxy_buffering on;
proxy_max_temp_file_size 0;

2) Stream responses (no buffering at all)

Great for large downloads/streaming; reduces latency & disk I/O. (Backpressure goes directly to upstream.)

proxy_buffering off;

3) Increase memory buffers (keep buffering, avoid disk)

Size these to your typical response size and concurrency.

proxy_buffering on;
proxy_buffer_size 64k;        # header/first buffer
proxy_buffers 32 64k;         # total ~2 MB per connection here
proxy_busy_buffers_size 256k; # busy threshold before spooling

4) If you must spool, make it fast

Put Kong’s proxy temp on tmpfs or faster disk:

  • Mount /usr/local/kong/proxy_temp on tmpfs (container/VM)
  • Or move it via: proxy_temp_path /usr/local/kong/proxy_temp 1 2;

5) Tame slow clients (common cause of spooling)

Slow downloads force Nginx to hold data. Tighten/adjust:

send_timeout 30s;
tcp_nodelay on;
keepalive_timeout 65s;

(Shorter timeouts reduce long-lived slow sends.)

How to set these in Kong

Pick one method you already use:

A) With kong.conf

Add proxy-level directives (no need to rebuild Kong):

nginx_proxy_proxy_buffering = on
nginx_proxy_proxy_max_temp_file_size = 0
nginx_proxy_proxy_buffer_size = 64k
nginx_proxy_proxy_buffers = 32 64k
nginx_proxy_proxy_busy_buffers_size = 256k

(Each Nginx proxy-block directive is injected as its own nginx_proxy_-prefixed entry; restart/reload Kong afterwards.)

B) With environment variables (containers)

KONG_NGINX_PROXY_PROXY_BUFFERING=on
KONG_NGINX_PROXY_PROXY_MAX_TEMP_FILE_SIZE=0
KONG_NGINX_PROXY_PROXY_BUFFER_SIZE=64k
KONG_NGINX_PROXY_PROXY_BUFFERS="32 64k"
KONG_NGINX_PROXY_PROXY_BUSY_BUFFERS_SIZE=256k

C) With a custom Nginx template

If you already manage an nginx-kong.conf template, put the directives in the proxy context and redeploy.

D) Helm (Kubernetes)

If using the Kong Helm chart, set the proxy directives block (values key often named like proxy.nginx.* or nginxProxyDirectives) to include the same directives. (Exact key names vary by chart version—mirror the options above.)

What to choose

  • APIs returning big files / streaming → proxy_buffering off;
  • APIs with occasional large JSON → keep buffering on and increase proxy_buffer_* sizes; also set proxy_max_temp_file_size 0;
  • Mobile or very slow clients → consider buffering on + larger buffers, or keep temp on tmpfs

Validate

  1. Hit the route and watch /usr/local/kong/proxy_temp usage.
  2. Check error log: the “buffered to a temporary file” line should stop.
  3. Watch p95/p99 request latency; it usually drops when disk spooling stops.
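
For example (paths assume the default Kong prefix; adjust to your install):

Bash

# Watch how much data is being spooled to disk
watch -n 5 'du -sh /usr/local/kong/proxy_temp'

# Confirm the warning has stopped appearing
tail -f /usr/local/kong/logs/error.log | grep "buffered to a temporary file"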


500 errors

Best practice for 500 errors

  • Enable error.log (debug) in Kong.
  • Use file-log or http-log plugin to capture structured traffic.
  • Use correlation-id plugin to align Kong + upstream logs.
  • Always check upstream service logs — a 500 originates there, not in Kong.

Kong automatically adds X-Kong-Request-ID (if enabled).

  • Add plugin: correlation-id
  • Configure a header (e.g. X-Request-ID) so you can trace across Kong logs, upstream app logs, and client logs.
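
A minimal way to enable it on a service via the Admin API (a sketch; the service name and header value are illustrative):

Bash

curl -X POST http://localhost:8001/services/my-service/plugins \
  --data "name=correlation-id" \
  --data "config.header_name=X-Request-ID" \
  --data "config.generator=uuid" \
  --data "config.echo_downstream=true"

With echo_downstream enabled, the same ID is returned to the client, so one value ties together the client, Kong, and upstream logs.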