Debugging OVN Issues in OpenShift

Let’s walk through a realistic, production-style OVN debugging scenario in
OpenShift Container Platform using OVN-Kubernetes.


Scenario

A frontend pod cannot reach a backend service

You have:

  • frontend pod
  • backend pod
  • backend-service (ClusterIP)

From the frontend pod, this fails:

curl http://backend-service


Step-by-step debugging (real flow)

Step 1: Check if backend pod is healthy

oc get pods -o wide

You want:

  • Backend pod = Running
  • Has an IP (e.g., 10.128.2.15)

If the pod is not running → stop here (not an OVN issue)


Step 2: Test direct pod-to-pod connectivity

From frontend pod:

oc exec -it frontend -- curl http://10.128.2.15

Outcomes:

Case A: This FAILS

→ Problem is networking (OVN / policy / routing)

Case B: This WORKS

→ Networking is fine → problem is service layer


Branch A: Pod-to-pod FAILS (OVN issue)

Step 3A: Check NetworkPolicies

oc get networkpolicy -A

Look for anything like:

  • Deny all ingress
  • Missing allow rules

Quick test:
Create temporary allow-all policy

If it suddenly works → root cause = NetworkPolicy
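A minimal allow-all policy for this test might look like the following (the namespace is an assumption — adjust to yours, and delete the policy after the experiment):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: temp-allow-all
  namespace: my-app        # assumption: replace with your namespace
spec:
  podSelector: {}          # selects every pod in the namespace
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - {}                   # empty rule = allow all ingress
  egress:
    - {}                   # empty rule = allow all egress
```

Because NetworkPolicies are additive, this allow-all overrides any default-deny policy in the namespace for the duration of the test.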


Step 4A: Check node-level OVN

Find which nodes the frontend and backend pods are on (NODE column):

oc get pods -o wide

Then:

oc get pods -n openshift-ovn-kubernetes -o wide

Check:

  • Is ovnkube-node Running on both the frontend's and backend's nodes?
  • Any recent restarts?

Step 5A: Test OVS health

oc debug node/<node>
chroot /host
ovs-vsctl show

Look for:

  • br-int bridge
  • Proper interfaces

Missing interfaces = OVN not wiring pods correctly
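As an abbreviated, illustrative sketch, healthy output contains the br-int bridge plus one veth-backed port per local pod (names and versions below are made up — yours will differ):

```
Bridge br-int
    fail_mode: secure
    Port br-int
        Interface br-int
            type: internal
    Port "a1b2c3d4e5f6a7b"       # one port per pod on this node
        Interface "a1b2c3d4e5f6a7b"
ovs_version: "3.1.x"
```

If the pod you are debugging has no corresponding port here, OVN never wired it up on this node.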


Step 6A: Check OVN logs

oc logs -n openshift-ovn-kubernetes <ovnkube-node-pod> -c ovnkube-controller

(The pod has several containers; the container name varies by OpenShift version, so list them with oc get pod ... -o jsonpath if this one is not present.)

Common errors:

  • Flow install failures
  • DB sync issues

Branch B: Pod-to-pod WORKS, Service FAILS

This is VERY common and often misunderstood.


Step 3B: Check service

oc get svc backend-service -o wide

Check:

  • ClusterIP exists
  • Correct port

Step 4B: Check endpoints

oc get endpoints backend-service

If EMPTY:

→ Service is not linked to pods

Root cause:

  • Wrong selector labels

Fix: make the Service selector match the pod's labels:

selector:
  app: backend
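As a sketch, assuming the backend pods carry the label app: backend, a matching Service would look like this (ports are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend-service
spec:
  selector:
    app: backend         # must match the pods' labels exactly
  ports:
    - port: 80           # the ClusterIP port clients use
      targetPort: 8080   # the port the backend container listens on
```

Any mismatch between spec.selector and the pod labels (even a typo) yields zero endpoints, and the service silently routes nowhere.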

Step 5B: Test service IP directly

curl http://<ClusterIP>:<port>

Fails but pod IP works:

→ OVN load-balancing issue


Step 6B: Check OVN load balancer

On node:

ovn-nbctl lb-list

You should see:

  • Service IP mapped to pod IPs

If missing → OVN not programming service


Bonus: DNS check (often confused with OVN)

From frontend:

nslookup backend-service

(If frontend is in a different namespace, use the FQDN: backend-service.<namespace>.svc.cluster.local — the short name only resolves within the same namespace.)

If fails:

→ DNS issue, NOT OVN

Check:

oc get pods -n openshift-dns

Real root cause examples (from production)

Case 1: Wrong labels

  • Service selector doesn’t match pod
    → No endpoints → service fails

Case 2: NetworkPolicy blocking traffic

  • Default deny policy applied
    → Pods isolated

Case 3: OVN desync

  • Pod exists but not in OVN DB
    → No routing

Case 4: Node issue

  • Only pods on one node fail
    → ovnkube-node broken on that node

Case 5: MTU mismatch

  • Small packets work, large fail
    → Very tricky to spot
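One way to confirm an MTU problem is a non-fragmenting ping sized to the suspected MTU: if that fails while a default-size ping succeeds, you have found it. A sketch of the arithmetic (1400 is an assumed OVN-Kubernetes overlay MTU — on many clusters it is the host MTU minus encapsulation overhead; check yours in the cluster network config):

```shell
# ICMP payload = MTU - 20 (IP header) - 8 (ICMP header)
MTU=1400                   # assumption: your overlay MTU may differ
PAYLOAD=$((MTU - 28))
# -M do forbids fragmentation, so an oversized packet fails loudly
echo "ping -M do -s $PAYLOAD -c 3 <backend-pod-ip>"
```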

The mental model (this is what experts use)

When debugging:

  1. Pod IP → works?
    • ❌ → OVN / policy / routing
    • ✅ → go to service layer
  2. Service endpoints exist?
    • ❌ → labels problem
    • ✅ → OVN load balancing
  3. DNS works?
    • ❌ → DNS, not OVN
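The checklist above can be sketched as a tiny shell helper (illustrative only — you feed it the results of your manual tests as "ok" or "fail" and it tells you where to look next):

```shell
#!/bin/sh
# Encode the triage tree: arguments are the outcomes of the manual
# tests, in order: pod-IP reachable, endpoints present, DNS resolves.
triage() {
  pod_ip=$1 endpoints=$2 dns=$3
  if [ "$pod_ip" = "fail" ]; then
    echo "OVN / NetworkPolicy / routing"
  elif [ "$endpoints" = "fail" ]; then
    echo "service selector labels"
  elif [ "$dns" = "fail" ]; then
    echo "DNS (openshift-dns), not OVN"
  else
    echo "OVN load balancing"
  fi
}

triage fail ok ok     # → OVN / NetworkPolicy / routing
triage ok fail ok     # → service selector labels
triage ok ok fail     # → DNS (openshift-dns), not OVN
triage ok ok ok       # → OVN load balancing
```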

Pro move (what senior engineers do)

Spin up a throwaway debug pod:

oc run debug --rm -it --image=busybox --restart=Never -- sh

Then test from inside it:

  • ping <pod-ip>
  • wget -O- http://<service> (busybox ships wget, not curl)
  • nslookup <service>

This removes app complexity completely.