For an OpenShift (OCP) interview in 2026, you should expect questions that move beyond basic Kubernetes concepts and focus on enterprise operations, automation (Operators), and security.
Here is a curated list of high-value interview questions categorized by role and complexity.
1. Architectural Concepts
- What is the role of the Cluster Version Operator (CVO)?
- Answer: The CVO is the heart of OCP 4.x upgrades. It monitors the “desired state” of the cluster’s operators (the “payload”) and ensures the cluster is updated in a safe, coordinated manner across all components.
- Explain the difference between an Infrastructure Node and a Worker Node.
- Answer: Infrastructure nodes host “cluster-level” services such as the Router (Ingress), Monitoring (Prometheus/Alertmanager), and the internal Registry. Labeling nodes as infra and moving only those infrastructure workloads onto them (typically with node selectors plus taints and tolerations) can reduce Red Hat subscription costs, because nodes that run only qualifying infrastructure components typically don’t require the same licensing as nodes running application workloads.
- What is the “Etcd Quorum” and why is it important in OCP?
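- Answer: OpenShift typically requires an odd number of Control Plane nodes (usually 3) so that etcd can maintain a quorum, meaning a majority of members (2 of 3) that must agree on every write. If a majority of the control plane nodes is lost, etcd stops accepting writes to prevent data corruption, and the cluster API becomes unavailable or effectively read-only until quorum is restored.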
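The quorum arithmetic behind that answer can be sketched in a few lines. This is a minimal illustration of majority math only; the function names are made up for the example and nothing here talks to a real etcd cluster.

```python
def quorum_size(members: int) -> int:
    """Smallest number of etcd members that still forms a majority."""
    if members < 1:
        raise ValueError("an etcd cluster needs at least one member")
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority still remains."""
    return members - quorum_size(members)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(f"{n} members: quorum={quorum_size(n)}, "
              f"tolerates {fault_tolerance(n)} failure(s)")
```

Running it shows why odd sizes are preferred: going from 3 to 4 members raises the quorum size without raising the number of failures the cluster can tolerate.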
2. Networking & Traffic (The Gateway API Era)
- Explain Ingress vs. Route vs. Gateway API.
- Key Focus: Interviewers want to know if you understand that Routes are OCP-native, Ingress is K8s-standard, and Gateway API is the future standard for advanced traffic management (canary, mirroring, etc.).
- How does “Service Serving Certificate Secrets” work in OCP?
- Answer: OCP can automatically generate a TLS certificate for a Service. You annotate the Service with service.beta.openshift.io/serving-cert-secret-name, setting the value to the name of the secret you want. OCP then creates that secret with a cert/key pair (tls.crt and tls.key) signed by the cluster’s internal service CA, allowing for easy end-to-end encryption inside the cluster.
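To make the annotation concrete, here is a small sketch that builds such a Service manifest as a Python dictionary and prints it as JSON, which oc apply -f - accepts. The service name, namespace, secret name, port, and selector label are all hypothetical values chosen for the example.

```python
import json

SERVING_CERT_ANNOTATION = "service.beta.openshift.io/serving-cert-secret-name"


def service_with_serving_cert(name, namespace, secret_name, port=8443):
    """Return a Service manifest that requests a serving certificate."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # The annotation value is the name of the secret to be created.
            "annotations": {SERVING_CERT_ANNOTATION: secret_name},
        },
        "spec": {
            "selector": {"app": name},  # assumed pod label, for illustration
            "ports": [{"name": "https", "port": port, "targetPort": port}],
        },
    }


if __name__ == "__main__":
    # Hypothetical names; pipe the output to: oc apply -f -
    print(json.dumps(service_with_serving_cert("payments", "shop", "payments-tls"), indent=2))
```

The pod would then mount the generated secret and serve TLS with the key pair it contains.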
3. Security (The “Hardest” Category)
- Scenario: A developer says their pod won’t start because of a “Security Context” error. What do you check?
- Answer: I would check the Security Context Constraints (SCC). By default, OCP admits pods under the restricted-v2 SCC, which prevents running as root. If the pod requires root or host access, I’d check whether its ServiceAccount has been granted a more permissive SCC like anyuid or privileged. The openshift.io/scc annotation on a running pod shows which SCC was actually applied.
- What are NetworkPolicies vs. EgressFirewalls?
- Answer: NetworkPolicies control traffic between pods inside the cluster (East-West). EgressFirewalls (part of OCP’s OVN-Kubernetes) control traffic leaving the cluster to external IPs or CIDR blocks (North-South).
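As an illustration of the North-South side, the sketch below builds an OVN-Kubernetes EgressFirewall manifest that allows a few external CIDRs and denies everything else. It assumes the k8s.ovn.org/v1 API, where the object must be named default and rules are evaluated in order; the namespace and CIDR are placeholder values.

```python
import ipaddress
import json


def egress_firewall(namespace, allowed_cidrs):
    """Build an EgressFirewall: allow the given CIDRs, deny all other egress."""
    rules = []
    for cidr in allowed_cidrs:
        ipaddress.ip_network(cidr)  # raises ValueError on a malformed CIDR
        rules.append({"type": "Allow", "to": {"cidrSelector": cidr}})
    # Rules are matched top to bottom, so the catch-all deny goes last.
    rules.append({"type": "Deny", "to": {"cidrSelector": "0.0.0.0/0"}})
    return {
        "apiVersion": "k8s.ovn.org/v1",
        "kind": "EgressFirewall",
        "metadata": {"name": "default", "namespace": namespace},
        "spec": {"egress": rules},
    }


if __name__ == "__main__":
    # Hypothetical namespace and documentation-range CIDR.
    print(json.dumps(egress_firewall("shop", ["203.0.113.0/24"]), indent=2))
```

In an interview, the point to draw out is that this object is scoped to one namespace, while a NetworkPolicy selects pods.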
4. Troubleshooting & Operations
- How do you recover a cluster if the Control Plane certificates have expired?
- Answer: This usually involves using the oc adm certificate approve command to approve pending CSRs (Certificate Signing Requests) or, in an emergency, manually rolling back the cluster clock. OCP 4.x generally tries to auto-renew these certificates, but clock drift or a cluster that stayed powered off past the expiry date can break the renewal.
- Describe the Source-to-Image (S2I) workflow.
- Answer: S2I takes source code from Git, injects it into a “builder image” (like Node.js or Java), and outputs a ready-to-run container image. It simplifies the CI/CD process for developers who don’t want to write Dockerfiles.
5. Advanced / 2026 Trends
- What is OpenShift Virtualization (KubeVirt)?
- Answer: It allows you to run legacy Virtual Machines (VMs) as pods on OpenShift. This is critical for “modernizing” apps where one part is a container and the other is a legacy Windows or Linux VM that can’t be containerized yet.
- How does Red Hat Advanced Cluster Management (RHACM) help in a multi-cluster setup?
- Answer: RHACM provides a single pane of glass to manage security policies, application placement, and cluster lifecycle (creation/deletion) across multiple OCP clusters on AWS, Azure, and on-prem.
Quick Tip for the Interview
Whenever you answer, use the phrase “Operator-led design.” OpenShift 4 is built entirely on Operators. If the interviewer asks, “How do I fix the registry?” the best answer starts with, “I would check the status of the Image Registry Operator using oc get clusteroperator.” This shows you understand the fundamental architecture of the platform.
As an OpenShift Administrator, your interview will focus heavily on cluster stability, lifecycle management (upgrades), security enforcement, and the “Day 2” operations that keep an enterprise cluster running.
Here are the top admin-focused interview questions for 2026, divided by functional area.
1. Cluster Lifecycle & Maintenance
- How does the Cluster Version Operator (CVO) manage upgrades, and what do you check if an upgrade hangs at 57%?
- Answer: The CVO coordinates with all other cluster operators to reach a specific “desired version.” If it hangs, I check oc get clusteroperators to see which specific operator is degraded. Usually, it’s the Machine Config Operator (MCO) waiting for nodes to drain or the Authentication Operator having issues with etcd.
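The same check can be scripted against the JSON form of that command (oc get clusteroperators -o json). The sketch below filters the status conditions for operators that are unavailable or degraded; the SAMPLE data is invented for the example and only mimics the shape of real output.

```python
def unhealthy_operators(clusteroperators: dict) -> dict:
    """Map operator name to the problems found in its status conditions."""
    problems = {}
    for item in clusteroperators.get("items", []):
        name = item["metadata"]["name"]
        conditions = {c["type"]: c
                      for c in item.get("status", {}).get("conditions", [])}
        found = []
        if conditions.get("Available", {}).get("status") != "True":
            found.append("not Available")
        if conditions.get("Degraded", {}).get("status") == "True":
            found.append("Degraded: " + conditions["Degraded"].get("message", ""))
        if found:
            problems[name] = found
    return problems


# Invented sample shaped like `oc get clusteroperators -o json` output.
SAMPLE = {"items": [
    {"metadata": {"name": "authentication"},
     "status": {"conditions": [
         {"type": "Available", "status": "True"},
         {"type": "Degraded", "status": "False"}]}},
    {"metadata": {"name": "machine-config"},
     "status": {"conditions": [
         {"type": "Available", "status": "True"},
         {"type": "Degraded", "status": "True",
          "message": "pool worker is not ready"}]}},
]}

if __name__ == "__main__":
    print(unhealthy_operators(SAMPLE))
```

Against a live cluster, the dictionary would come from json.load on the command output instead of SAMPLE.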
- What is the “Must-Gather” tool, and when would you use it?
- Answer: oc adm must-gather is the primary diagnostic tool. It launches a pod that collects logs, CRD states, and operating system debugging info. I use it before opening a Red Hat support ticket or when a complex issue involves multiple operators.
- Explain how to back up and restore the etcd database.
- Answer: I take the backup with the cluster-backup.sh script provided on the control plane nodes (early 4.x releases shipped the snapshot script under a different name), which saves an etcd snapshot together with the static pod resources. For restoration, I must stop the static pods for the API server and etcd, then use the backup to restore the data directory. It’s critical to do this on a single control plane node first to re-establish a quorum.
2. Node & Infrastructure Management
- What is a MachineConfigPool (MCP), and why would you pause it?
- Answer: An MCP groups nodes (like master or worker) so the MCO can apply configurations to them. I would pause an MCP during a sensitive maintenance window or when troubleshooting a configuration change that I don’t want to roll out to all nodes at once.
- How do you add a custom SSH key or a CronJob to the underlying RHCOS nodes?
- Answer: You don’t log into the nodes manually. You create a MachineConfig YAML. The MCO then detects this, reboots the nodes (if necessary), and applies the change to the immutable filesystem.
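For the SSH key case, a MachineConfig might look like the one built below. This is a sketch under assumptions: the Ignition spec version (3.2.0) depends on the cluster release, the object name is only a naming convention, and the public key is a placeholder.

```python
import json


def ssh_key_machineconfig(role: str, public_keys: list) -> dict:
    """Build a MachineConfig that sets authorized SSH keys for the core user."""
    return {
        "apiVersion": "machineconfiguration.openshift.io/v1",
        "kind": "MachineConfig",
        "metadata": {
            "name": f"99-{role}-ssh",  # assumed naming convention
            # This label ties the config to the matching MachineConfigPool.
            "labels": {"machineconfiguration.openshift.io/role": role},
        },
        "spec": {
            "config": {
                "ignition": {"version": "3.2.0"},  # assumption: varies by release
                "passwd": {
                    "users": [{"name": "core", "sshAuthorizedKeys": public_keys}]
                },
            }
        },
    }


if __name__ == "__main__":
    # Placeholder key; pipe the output to: oc apply -f -
    mc = ssh_key_machineconfig("worker", ["ssh-ed25519 AAAA...example admin@example.com"])
    print(json.dumps(mc, indent=2))
```

Once applied, the MCO renders a new configuration for the worker pool and rolls it out node by node.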
- What happens if a node enters a NotReady state?
- Answer: First, I check node pressure (CPU/Memory/Disk). Then I check the kubelet and crio services on the node using oc debug node/<node-name>, or over SSH if the kubelet itself is down, since oc debug needs a working kubelet to start its pod. I also check for network reachability between the node and the Control Plane.
3. Networking & Security
- What is the benefit of OVN-Kubernetes over the legacy OpenShift SDN?
- Answer: OVN-Kubernetes has been the default network plugin for new clusters since OCP 4.12, and the legacy OpenShift SDN is deprecated. It supports modern features like IPsec encryption for pod-to-pod traffic, smarter load balancing, and Egress IPs that let specific projects exit the cluster via a fixed IP address for firewall allow-listing.
- A user is complaining they can’t reach a service in another project. What do you check?
- Answer:
- NetworkPolicies: Is there a policy blocking “Cross-Namespace” traffic?
- Service/Endpoints: Does the Service have active Endpoints (oc get endpoints)?
- Namespace labels: If using a high-isolation network plugin, do the namespaces have the correct labels to “talk” to each other?
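If a NetworkPolicy turns out to be the blocker, the usual fix is an explicit allow rule for the calling project. The sketch below builds one that selects the source namespace through the kubernetes.io/metadata.name label, which Kubernetes sets on every namespace automatically; the project names and port are hypothetical.

```python
import json


def allow_from_namespace(target_ns: str, source_ns: str, port: int) -> dict:
    """Build a NetworkPolicy letting pods in source_ns reach target_ns on a TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"allow-from-{source_ns}", "namespace": target_ns},
        "spec": {
            "podSelector": {},  # empty selector: applies to all pods in target_ns
            "policyTypes": ["Ingress"],
            "ingress": [{
                "from": [{
                    "namespaceSelector": {
                        "matchLabels": {"kubernetes.io/metadata.name": source_ns}
                    }
                }],
                "ports": [{"protocol": "TCP", "port": port}],
            }],
        },
    }


if __name__ == "__main__":
    # Hypothetical projects: let "frontend" call services in "backend" on 8080.
    print(json.dumps(allow_from_namespace("backend", "frontend", 8080), indent=2))
```

NetworkPolicies are additive, so this rule opens the path without removing any existing default-deny policy.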
- How do you restrict a specific group of users from creating LoadBalancer type services?
- Answer: RBAC alone can’t do this, because its rules match on resource and verb rather than on a field like spec.type: removing the create/update verbs on services would block every Service type. I would instead use admission control, most commonly a Policy Engine like Gatekeeper/OPA that denies Service requests of type LoadBalancer from that group. If the restriction can apply per project rather than per user, a ResourceQuota with services.loadbalancers set to 0 also works.
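The decision an admission policy makes here is simple enough to sketch. The function below is not Gatekeeper’s Rego or any real webhook server; it only mirrors the logic on an AdmissionReview-style request dictionary, and the group name is hypothetical.

```python
def review_service_request(request: dict, blocked_groups: set) -> dict:
    """Return an AdmissionReview-style response for a Service create/update."""
    obj = request.get("object") or {}  # object is null on DELETE requests
    service_type = obj.get("spec", {}).get("type", "ClusterIP")
    user_groups = set(request.get("userInfo", {}).get("groups", []))
    denied = service_type == "LoadBalancer" and bool(user_groups & blocked_groups)

    response = {"uid": request.get("uid", ""), "allowed": not denied}
    if denied:
        response["status"] = {
            "code": 403,
            "message": "LoadBalancer Services are not permitted for your group",
        }
    return response


if __name__ == "__main__":
    # Hypothetical request from a user in the "contractors" group.
    request = {
        "uid": "example-uid",
        "userInfo": {"username": "alice", "groups": ["contractors"]},
        "object": {"kind": "Service", "spec": {"type": "LoadBalancer"}},
    }
    print(review_service_request(request, {"contractors"}))
```

A policy engine expresses the same rule declaratively, which is easier to audit than a custom webhook.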
4. Storage & Capacity Planning
- How do you handle “Volume Expansion” if a database runs out of space?
- Answer: If the underlying StorageClass has allowVolumeExpansion: true, I simply edit the PersistentVolumeClaim (PVC) and increase the storage value under spec.resources.requests. OpenShift and the CSI driver then handle the resizing of the volume and the file system, usually on the fly, although some drivers only finish the resize after the pod restarts. A PVC can only grow, not shrink.
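The edit itself is a one-field change, so it can also be sent as a merge patch (for example with oc patch pvc <name> --type merge -p '<patch>'). The helper names below are invented for the example, and the size is a placeholder.

```python
import json


def can_expand(storage_class: dict) -> bool:
    """True only if the StorageClass explicitly allows volume expansion."""
    return storage_class.get("allowVolumeExpansion", False) is True


def pvc_resize_patch(new_size: str) -> str:
    """Return a JSON merge patch that raises the PVC's requested storage."""
    return json.dumps({"spec": {"resources": {"requests": {"storage": new_size}}}})


if __name__ == "__main__":
    # Hypothetical StorageClass as it might appear in `oc get sc <name> -o json`.
    storage_class = {"metadata": {"name": "fast"}, "allowVolumeExpansion": True}
    if can_expand(storage_class):
        print(pvc_resize_patch("20Gi"))
```

Checking the StorageClass first avoids a rejected patch on classes that do not permit expansion.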
- What is the difference between ReadWriteOnce (RWO) and ReadWriteMany (RWX)?
- Answer: RWO allows only one node to mount the volume (good for databases). RWX allows multiple nodes/pods to mount it simultaneously (required for shared file storage like NFS or ODF).
5. Scenario-Based: “The Midnight Call”
- Scenario: The Web Console is down, and oc commands are timing out. Where do you start?
- Answer: This sounds like an API Server or etcd failure. I would:
- Log into a Control Plane node directly via SSH.
- Check the status of static pods in /etc/kubernetes/manifests.
- Run crictl ps to see if the kube-apiserver or etcd containers are crashing.
- Check the node’s disk space (etcd often fails if the disk is 100% full).
💡 Pro-Tip for Admin Interviews:
In 2026, emphasize GitOps. Mention that you prefer managing cluster configurations (like HTPasswd providers or Quota objects) via Argo CD rather than manual oc apply commands. This shows you are an admin who values idempotency and disaster recovery.