OpenShift Troubleshooting: Essential OC Commands

OC Commands for OpenShift (OCP) Troubleshooting

Here’s a comprehensive reference organized by troubleshooting category:


Cluster & Node Health

# Cluster version and status
oc get clusterversion
oc describe clusterversion
# Node status
oc get nodes
oc get nodes -o wide
oc describe node <node-name>
# Node resource usage
oc adm top nodes
# Check cluster operators
oc get clusteroperators
oc get co # shorthand
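
On a large cluster, the node and operator lists get long. As a rough sketch, the two helpers below filter the plain-text output down to the entries that need attention. The function names are made up for this post, and the column positions (STATUS in column 2 for nodes; AVAILABLE and DEGRADED in columns 3 and 5 for cluster operators) are assumptions based on the default table layout, so check them against your own output first.

```shell
# Hypothetical helpers (not part of oc). Both read --no-headers output on stdin.

# Nodes whose STATUS is anything other than exactly "Ready"
# (this also flags cordoned nodes such as Ready,SchedulingDisabled).
unhealthy_nodes() {
  awk '$2 != "Ready" { print $1, $2 }'
}

# Cluster operators that are not Available or that report Degraded.
# Assumed columns: NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE
unhealthy_operators() {
  awk '$3 != "True" || $5 != "False" { print $1 }'
}

# Usage against a live cluster:
#   oc get nodes --no-headers | unhealthy_nodes
#   oc get co --no-headers | unhealthy_operators
```

If an operator has not reported a version yet, the VERSION column is blank and the fields shift, so treat the operator filter as a first pass rather than a verdict.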

Pod Troubleshooting

# List pods (all namespaces)
oc get pods -A
oc get pods -n <namespace> -o wide
# Pod details & events
oc describe pod <pod-name> -n <namespace>
# Logs
oc logs <pod-name> -n <namespace>
oc logs <pod-name> -n <namespace> --previous # previous (crashed) container instance
oc logs <pod-name> -n <namespace> -c <container> # specific container
oc logs <pod-name> -n <namespace> --tail=100 -f # start at the last 100 lines, then follow
# Exec into pod (use /bin/sh if the image has no bash)
oc exec -it <pod-name> -n <namespace> -- /bin/bash
# Pod resource usage
oc adm top pods -n <namespace>
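
To narrow a cluster-wide pod list down to the pods worth describing, a filter like the one below can help. It is a minimal sketch: `problem_pods` is a hypothetical name, and it assumes the `oc get pods -A --no-headers` column order (NAMESPACE NAME READY STATUS ...).

```shell
# Hypothetical helper: reads `oc get pods -A --no-headers` on stdin and prints
# pods that are not Running/Completed, or are Running with containers not ready.
problem_pods() {
  awk '
    { split($3, ready, "/") }          # READY column, e.g. 1/2
    $4 == "Completed" { next }         # finished jobs are fine
    $4 != "Running" || ready[1] != ready[2] { print $1 "/" $2, $3, $4 }
  '
}

# Usage against a live cluster:
#   oc get pods -A --no-headers | problem_pods
```

Each line it prints is a candidate for `oc describe pod` and `oc logs --previous`.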

Deployment / DeploymentConfig

# Status
oc get deployments -n <namespace>
oc get dc -n <namespace> # DeploymentConfig
oc rollout status deployment/<name> -n <namespace>
# Rollout history & rollback
oc rollout history deployment/<name> -n <namespace>
oc rollout undo deployment/<name> -n <namespace>
# Scale
oc scale deployment/<name> --replicas=3 -n <namespace>
# Restart (rolling)
oc rollout restart deployment/<name> -n <namespace>
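
The restart, status, and undo commands above can be chained so that a restart that never becomes ready gets rolled back. This is only a sketch: the function name `restart_or_undo` and the 120s default timeout are assumptions for illustration, and it applies to Deployments, not DeploymentConfigs.

```shell
# Hypothetical helper: restart a Deployment, wait for it, undo on failure.
# Arguments: <namespace> <deployment-name> [timeout, default 120s]
restart_or_undo() {
  ru_ns=$1
  ru_name=$2
  ru_timeout=${3:-120s}
  oc rollout restart "deployment/$ru_name" -n "$ru_ns" || return 1
  if oc rollout status "deployment/$ru_name" -n "$ru_ns" --timeout="$ru_timeout"; then
    echo "rollout ok"
  else
    echo "rollout failed, undoing"
    oc rollout undo "deployment/$ru_name" -n "$ru_ns"
    return 1
  fi
}

# Usage:
#   restart_or_undo my-namespace my-app 180s
```

The undo returns the Deployment to its previous revision, so look at `oc rollout history` afterwards to confirm which revision is live.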

Networking & Routes

# Services
oc get svc -n <namespace>
oc describe svc <service-name> -n <namespace>
# Routes
oc get routes -n <namespace>
oc describe route <route-name> -n <namespace>
# Endpoints (checks pod–service binding)
oc get endpoints -n <namespace>
# Network policies
oc get networkpolicy -n <namespace>
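
A service with no endpoints usually means its selector matches no ready pods. As a small sketch (the function name is hypothetical, and it assumes the ENDPOINTS column shows `<none>` for an empty list), you could pick those services out like this:

```shell
# Hypothetical helper: reads `oc get endpoints --no-headers` on stdin and
# prints the names of services that currently have no endpoints.
services_without_endpoints() {
  awk '$2 == "<none>" { print $1 }'
}

# Usage against a live cluster:
#   oc get endpoints -n my-namespace --no-headers | services_without_endpoints
```

For each name it prints, compare the selector from `oc describe svc` with the labels on the pods you expected to match.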

RBAC & Permissions

# Check what a user can do
oc auth can-i <verb> <resource> --as=<user> -n <namespace>
oc auth can-i --list --as=<user> -n <namespace>
# Role bindings
oc get rolebindings -n <namespace>
oc get clusterrolebindings -o wide | grep <user> # -o wide adds the subject columns
# Service account permissions
oc get sa -n <namespace>
oc describe sa <sa-name> -n <namespace>
# Add role to user
oc adm policy add-role-to-user <role> <user> -n <namespace>
oc adm policy add-cluster-role-to-user <role> <user>
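
When a user reports "forbidden" errors, it can be quicker to check several verbs at once than to run `oc auth can-i` one verb at a time. The sketch below loops over a fixed verb list; the function name and the verb list are assumptions, and `--as` only works if your own account is allowed to impersonate users.

```shell
# Hypothetical helper: print a yes/no line per verb for one user and resource.
# Arguments: <user> <namespace> <resource>
can_i_matrix() {
  cm_user=$1
  cm_ns=$2
  cm_resource=$3
  for cm_verb in get list create update delete; do
    # can-i prints yes/no and exits non-zero on "no"; empty output means an error
    cm_answer=$(oc auth can-i "$cm_verb" "$cm_resource" --as="$cm_user" -n "$cm_ns" 2>/dev/null) \
      || cm_answer=${cm_answer:-error}
    echo "$cm_verb $cm_resource: $cm_answer"
  done
}

# Usage:
#   can_i_matrix alice my-namespace pods
```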

Events & Alerts

# Namespace events (sorted by time)
oc get events -n <namespace> --sort-by='.lastTimestamp'
# Warning events only
oc get events -n <namespace> --field-selector type=Warning
# Cluster-wide events
oc get events -A --sort-by='.lastTimestamp'
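
A noisy namespace can produce hundreds of warning events, so a count per reason is often a better starting point than the raw list. This is a rough sketch with a made-up function name; it assumes namespace-scoped output where REASON is the third column (with `-A` a NAMESPACE column is added in front and the reason moves to column 4).

```shell
# Hypothetical helper: reads `oc get events --no-headers` on stdin and prints
# how many events each REASON has, most frequent first.
warning_reasons() {
  awk '{ count[$3]++ } END { for (reason in count) print count[reason], reason }' | sort -rn
}

# Usage against a live cluster:
#   oc get events -n my-namespace --field-selector type=Warning --no-headers | warning_reasons
```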

Storage / PVCs

# PVCs and PVs
oc get pvc -n <namespace>
oc get pv
oc describe pvc <pvc-name> -n <namespace>
# Storage classes
oc get storageclass
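
Pods stuck in Pending are often waiting on a claim that never bound. A minimal sketch for spotting those claims follows; the function name is hypothetical and it assumes STATUS is the second column of the `oc get pvc` output.

```shell
# Hypothetical helper: reads `oc get pvc --no-headers` on stdin and prints
# claims whose STATUS is not Bound (for example Pending or Lost).
unbound_pvcs() {
  awk '$2 != "Bound" { print $1, $2 }'
}

# Usage against a live cluster:
#   oc get pvc -n my-namespace --no-headers | unbound_pvcs
```

The events at the bottom of `oc describe pvc` usually say why a claim is still Pending, such as a missing storage class.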

etcd & Control Plane

# etcd pod health
oc get pods -n openshift-etcd
oc logs -n openshift-etcd <etcd-pod-name> -c etcd
# API server
oc get pods -n openshift-kube-apiserver
oc get pods -n openshift-apiserver
# Scheduler & controller manager
oc get pods -n openshift-kube-scheduler
oc get pods -n openshift-kube-controller-manager

MachineConfig & Node Config

# MachineConfig pools
oc get mcp
oc describe mcp worker
# MachineConfigs
oc get mc
oc describe mc <mc-name>
# Machine health checks and machines
oc get machinehealthcheck -n openshift-machine-api
oc get machines -n openshift-machine-api
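
After a MachineConfig change, the question is usually which pool is still rolling out or has degraded. The filter below is a sketch under the assumption that `oc get mcp` prints UPDATED, UPDATING, and DEGRADED as columns 3 to 5; the function name is invented for this example.

```shell
# Hypothetical helper: reads `oc get mcp --no-headers` on stdin and prints
# pools that are not fully updated or that report degraded.
# Assumed columns: NAME CONFIG UPDATED UPDATING DEGRADED ...
unsettled_pools() {
  awk '$3 != "True" || $5 != "False" {
    print $1, "updated=" $3, "updating=" $4, "degraded=" $5
  }'
}

# Usage against a live cluster:
#   oc get mcp --no-headers | unsettled_pools
```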

Resource Inspection (General)

# Get all resources in a namespace
oc get all -n <namespace>
# Output as YAML/JSON
oc get pod <pod-name> -n <namespace> -o yaml
oc get pod <pod-name> -n <namespace> -o json
# jsonpath query
oc get pods -n <namespace> -o jsonpath='{.items[*].metadata.name}'
# Watch resources live
oc get pods -n <namespace> -w
# Diff live vs local config
oc diff -f <manifest.yaml>
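
The jsonpath query above prints every name on one line. With a `range` block you can emit one line per pod and add more fields, as in this sketch (the wrapper name `pod_phases` is hypothetical; the jsonpath itself uses the standard range/end syntax):

```shell
# Hypothetical wrapper: print "<pod-name><TAB><phase>" for each pod in a namespace.
# Argument: <namespace>
pod_phases() {
  oc get pods -n "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}'
}

# Usage:
#   pod_phases my-namespace
```

The same pattern works for other fields; `oc explain pod.status` shows what is available.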

Quick Fixes

# Force delete stuck pod
oc delete pod <pod-name> -n <namespace> --grace-period=0 --force
# Cordon / uncordon node
oc adm cordon <node-name>
oc adm uncordon <node-name>
# Drain node
oc adm drain <node-name> --ignore-daemonsets --delete-emptydir-data
# Mark node schedulable (legacy form; prefer oc adm uncordon, since newer oc clients may no longer ship manage-node)
oc adm manage-node <node-name> --schedulable=true
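
For planned node maintenance, the cordon and drain commands are typically run back to back. The function below is one way to sketch that sequence; the name `drain_for_maintenance` and the 300s drain timeout are assumptions, and it deliberately leaves the node cordoned so you decide when to uncordon.

```shell
# Hypothetical helper: cordon a node, then drain it with a timeout.
# Argument: <node-name>
drain_for_maintenance() {
  dm_node=$1
  oc adm cordon "$dm_node" || return 1
  if ! oc adm drain "$dm_node" --ignore-daemonsets --delete-emptydir-data --timeout=300s; then
    echo "drain of $dm_node did not finish; node stays cordoned" >&2
    return 1
  fi
  echo "$dm_node drained; run: oc adm uncordon $dm_node"
}

# Usage:
#   drain_for_maintenance worker-1
```

A drain that times out is often blocked by a PodDisruptionBudget, so check `oc get pdb -A` before reaching for a forced delete.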

Pro tip: Add -o wide to most get commands for extra detail, and --sort-by='.lastTimestamp' to events for chronological ordering. Use oc explain <resource> to explore any resource’s field schema inline.
