Troubleshooting Egress Issues in OpenShift Namespaces

This is a classic OpenShift case: egress controls can be namespace-scoped, so one project can reach the internet while another cannot, even though both run on the same cluster. In OpenShift with OVN-Kubernetes, the main things to check are Kubernetes NetworkPolicy egress rules, OpenShift EgressFirewall objects, and sometimes EgressIP if the namespace is supposed to leave the cluster from a specific source IP.

OpenShift documents EgressFirewall as a namespace-level object, and Kubernetes documents that once a pod is selected by an egress policy, only the explicitly allowed outbound traffic is permitted. (Red Hat Documentation)

Scenario

Pods in namespace team-a can reach external sites, but pods in team-b cannot.

Examples:

oc exec -n team-a deploy/app -- curl https://example.com # works
oc exec -n team-b deploy/app -- curl https://example.com # fails

That pattern strongly suggests the problem is policy attached to the namespace, not a cluster-wide outage. OpenShift’s EgressFirewall is evaluated per namespace, and if there is no matching rule then traffic is allowed by default unless something else, like a NetworkPolicy, restricts it. (Red Hat Documentation)

Diagram

          Namespace team-a                 Namespace team-b
      +---------------------+           +---------------------+
      | pod -> external IP  |           | pod -> external IP  |
      +----------+----------+           +----------+----------+
                 |                                 |
                 v                                 v
        [no blocking policy]            [NetworkPolicy and/or
                 |                     EgressFirewall applies]
                 v                                 |
           traffic allowed
                                        traffic denied or limited

Where namespace-specific egress can break:
1) Egress NetworkPolicy in that namespace
2) EgressFirewall object in that namespace
3) EgressIP expected for that namespace but misconfigured
4) DNS works, but external traffic is filtered after resolution


How to debug it

1. Prove it is really namespace-specific

Run the same test from a working namespace and a failing one:

oc exec -n team-a deploy/app -- curl -I https://example.com
oc exec -n team-b deploy/app -- curl -I https://example.com

Then test DNS and a direct IP separately from the failing namespace (pass -k to curl for the bare IP, since the TLS certificate will not match it):

oc exec -n team-b deploy/app -- nslookup example.com
oc exec -n team-b deploy/app -- curl -kI https://93.184.216.34

If DNS works but outbound HTTP to external IPs fails, that points more toward egress filtering than DNS. This is an inference from Kubernetes DNS and policy behavior together. (Kubernetes)

2. Check NetworkPolicy in the failing namespace

This is the first thing I’d inspect:

oc get networkpolicy -n team-b
oc get networkpolicy -n team-b -o yaml

Kubernetes says that if a pod is selected by a policy with policyTypes: [Egress], the allowed outbound traffic is restricted to what the policy permits. A “default deny all ingress and all egress” policy is a standard pattern. (Kubernetes)
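For reference, the default-deny egress pattern from the Kubernetes docs looks like this (namespace and policy name here are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
  namespace: team-b
spec:
  podSelector: {}        # empty selector: applies to every pod in the namespace
  policyTypes:
  - Egress               # no egress rules listed, so all egress is denied
```

If something like this exists in team-b but not team-a, you have found your answer.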

Typical bad case:

policyTypes:
- Egress
egress:
- to:
  - namespaceSelector:
      matchLabels:
        name: internal-only

That would allow only a narrow set of destinations and block internet egress.
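If internet egress is actually intended for these pods, one possible fix is an additional policy that allows DNS plus external destinations. This is a sketch, not a drop-in: the excluded internal range is an assumption you must adjust for your cluster.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-and-external
  namespace: team-b
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - ports:               # rule with no "to": matches all destinations on these ports
    - protocol: UDP
      port: 53           # DNS lookups
    - protocol: TCP
      port: 53
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 10.0.0.0/8     # assumed internal range to keep restricted; adjust as needed
```

Because NetworkPolicies are additive, this can sit alongside an existing restrictive policy rather than replacing it.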

3. Check for an OpenShift EgressFirewall

OpenShift provides EgressFirewall as a namespace object for controlling traffic from pods to destinations outside the cluster. It is specific to OVN-Kubernetes. (Red Hat Documentation)

Commands:

oc get egressfirewall -n team-b
oc get egressfirewall -n team-b -o yaml

OpenShift documents that traffic to an IP outside the cluster is checked against the namespace’s EgressFirewall rules in order. If a rule matches, that action applies; if no rule matches, traffic is allowed by default. (Red Hat Documentation)

A realistic blocking example is a namespace with rules allowing only a few CIDRs or DNS names and denying everything else.
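Such an object might look like this; the allowed DNS name and CIDR are hypothetical, but the shape and the rule ordering are what matter:

```yaml
apiVersion: k8s.ovn.org/v1
kind: EgressFirewall
metadata:
  name: default           # OVN-Kubernetes requires this exact name
  namespace: team-b
spec:
  egress:
  - type: Allow
    to:
      dnsName: api.partner.example.com   # hypothetical allowlisted domain
  - type: Allow
    to:
      cidrSelector: 203.0.113.0/24       # hypothetical allowlisted CIDR
  - type: Deny
    to:
      cidrSelector: 0.0.0.0/0            # everything else outside the cluster is blocked
```

With the final Deny rule in place, any external destination not matched earlier fails, which reproduces exactly the team-b symptom in this scenario.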

4. Check whether the namespace is supposed to use EgressIP

If the application depends on a fixed source IP for outbound allowlisting, verify whether EgressIP is configured and healthy. OpenShift documents that an egress IP can be assigned to a namespace and is distinct from an egress router. (Red Hat Documentation)

Check:

oc get egressip
oc describe egressip <name>

If team-b is expected to leave via a specific egress IP and that configuration is broken, outbound access to third-party systems may fail even though generic internet access from other namespaces works. That last part is an inference from how vendor allowlists usually interact with source IP–based egress. (Red Hat Documentation)
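As a sketch, an EgressIP object tying a reserved address to the namespace looks roughly like this (the address is a placeholder):

```yaml
apiVersion: k8s.ovn.org/v1
kind: EgressIP
metadata:
  name: egressip-team-b
spec:
  egressIPs:
  - 192.0.2.10            # hypothetical reserved egress source address
  namespaceSelector:
    matchLabels:
      kubernetes.io/metadata.name: team-b
```

Also check that at least one node carries the k8s.ovn.org/egress-assignable label, since the egress IP can only be hosted on labeled nodes; if no node qualifies, the object exists but the address is never assigned.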

5. Verify DNS separately

Sometimes people say “egress is broken” when the real failure is DNS.

oc exec -n team-b deploy/app -- nslookup example.com
oc exec -n team-b deploy/app -- curl -I https://example.com
oc exec -n team-b deploy/app -- curl -kI https://93.184.216.34

Interpretation:

  • nslookup fails, IP curl fails: maybe DNS or broader networking
  • nslookup works, IP curl fails: likely egress filtering
  • nslookup fails, IP curl works: DNS-only issue

That distinction follows from Kubernetes DNS behavior plus the documented policy mechanisms above. (Kubernetes)
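The decision table above can be captured in a small helper driven by the exit codes of the two probes; the function name and labels are mine, not an OpenShift convention:

```shell
# Classify an egress failure from the exit codes of two probes:
#   $1 = exit code of "nslookup example.com"
#   $2 = exit code of "curl -kI https://<external-ip>"
diagnose_egress() {
  dns_rc=$1
  ip_rc=$2
  if [ "$dns_rc" -ne 0 ] && [ "$ip_rc" -ne 0 ]; then
    echo "dns-or-broader-networking"
  elif [ "$dns_rc" -eq 0 ] && [ "$ip_rc" -ne 0 ]; then
    echo "likely-egress-filtering"
  elif [ "$dns_rc" -ne 0 ] && [ "$ip_rc" -eq 0 ]; then
    echo "dns-only-issue"
  else
    echo "no-egress-problem-detected"
  fi
}
```

For example, diagnose_egress 0 1 (DNS fine, direct IP blocked) prints likely-egress-filtering, which is the team-b pattern this article is about.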

6. Compare with a working namespace

This is one of the fastest ways to spot the difference:

oc get networkpolicy -n team-a -o yaml
oc get networkpolicy -n team-b -o yaml
oc get egressfirewall -n team-a -o yaml
oc get egressfirewall -n team-b -o yaml

When only one namespace is failing, the delta between those objects often explains it immediately.

7. Check whether the block is by destination type

OpenShift's EgressFirewall rules cover external destinations, and audit logging is available for both egress firewall and network policy, which helps when you need proof of what is being denied. (Red Hat Documentation)
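With OVN-Kubernetes, one way to get that proof is ACL logging, enabled per namespace via an annotation; the log levels shown here are examples:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-b
  annotations:
    k8s.ovn.org/acl-logging: '{"deny": "alert", "allow": "notice"}'
```

Denied flows should then appear in the OVN-Kubernetes ACL audit logs on the node; treat the exact log location and format as version-dependent and confirm against your cluster's documentation.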

Ask:

  • does external IP fail?
  • does internal service traffic still work?
  • does only one external domain fail?

That helps separate “internet blocked” from “specific destinations blocked.”

What this usually turns out to be

Most common causes:

  • Default deny egress NetworkPolicy in the failing namespace. Kubernetes explicitly documents this pattern. (Kubernetes)
  • Namespace EgressFirewall allowing only selected external destinations. OpenShift documents EgressFirewall as namespace-scoped and processed rule by rule for external IP traffic. (Red Hat Documentation)
  • Broken or missing EgressIP where the app depends on outbound source-IP allowlists. OpenShift documents namespace egress IP configuration separately from egress routers. (Red Hat Documentation)
  • Misdiagnosed DNS problem, where name resolution fails and looks like internet egress failure. (Red Hat Documentation)

Fast triage sequence

oc exec -n team-b deploy/app -- nslookup example.com
oc exec -n team-b deploy/app -- curl -kI https://93.184.216.34
oc get networkpolicy -n team-b -o yaml
oc get egressfirewall -n team-b -o yaml
oc get egressip

Mental model

When egress fails only in some namespaces:

  • think namespace policy first
  • then think OpenShift EgressFirewall
  • then think EgressIP expectations
  • only after that think cluster-wide OVN trouble

If it were a true cluster-wide OVN failure, you would usually see problems across many namespaces, not just one. That last point is an operational inference, but it is a very useful one. (Red Hat Documentation)
