When moving from on-premises to a managed service like ROSA (Red Hat OpenShift Service on AWS) or ARO (Azure Red Hat OpenShift), the interview shifts. The technical “heavy lifting” of managing master nodes and etcd is now handled by the provider's SRE teams (Red Hat together with AWS or Microsoft).
Your role as an administrator moves from “Keeping the lights on” to “Governance, Cost Optimization, and Integration.”
1. The Shared Responsibility Model
This is the #1 question for managed services.
Q: Who is responsible for what in a ROSA/ARO environment?
- The Provider (Red Hat/Cloud Provider): Manages the Control Plane (Masters), etcd health, patching of the underlying OS, and the core OpenShift Operators.
- The Customer (You): Manages Worker nodes (scaling), Application lifecycle, RBAC, Network Policies, and Quotas.
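Interview Tip: Mention that your admin access is scoped differently on a managed cluster. On ROSA, the default elevated role is `dedicated-admin` rather than unrestricted `cluster-admin`. ARO 3.11 used a similar `customer-admin` role, while current ARO 4 clusters do expose `cluster-admin`, but changes to managed components are unsupported. In either case, you should not expect to SSH into master nodes or modify the `etcd` configuration directly.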
2. Day 1: Provisioning & Connectivity
Q1: How does networking differ in ROSA/ARO compared to on-prem?
Answer: In a managed service, OpenShift is integrated into the Cloud’s Virtual Private Cloud (VPC/VNet).
- Private vs. Public Clusters: You must decide if the API and Ingress are “Public” (accessible over the internet) or “Private” (only accessible from inside your network, e.g., via VPN, AWS Direct Connect, or Azure ExpressRoute).
- VPC Peering/Transit Gateway: You are responsible for connecting the OpenShift VPC to the rest of your cloud infrastructure (e.g., to reach a managed RDS database or Azure SQL).
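One practical check before peering networks is that their address ranges do not overlap, since peered VPCs/VNets cannot share overlapping CIDR blocks. The sketch below uses Python's standard `ipaddress` module. The network names and CIDR ranges are hypothetical and only illustrate the idea.

```python
import ipaddress


def overlapping_pairs(cidrs):
    """Return every pair of named CIDR blocks that overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    names = sorted(nets)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if nets[a].overlaps(nets[b])
    ]


# Hypothetical address plan: the cluster VPC plus two networks to connect.
plan = {
    "openshift-vpc": "10.0.0.0/16",
    "database-vpc": "10.1.0.0/16",
    "legacy-vpc": "10.0.128.0/20",
}

if __name__ == "__main__":
    for a, b in overlapping_pairs(plan):
        print(f"CIDR conflict: {a} overlaps {b}")
```

In an interview, this is a good place to mention that the address plan should be settled before the cluster is created, because changing the machine network afterwards generally means rebuilding the cluster.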
Q2: What is the “Assisted Installer” vs. “Cloud CLI”?
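Answer: The Assisted Installer is a tool for self-managed clusters (typically bare metal or on-prem virtualization), so it does not apply to managed services. For ROSA, you use the `rosa` CLI. For ARO, you use the `az aro` command. These tools abstract the cloud infrastructure provisioning (IAM roles, networking, VMs) that you would otherwise script yourself with CloudFormation or ARM templates.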
3. Day 2: Managed Operations
Q3: How do you handle cluster upgrades in ROSA/ARO?
Answer: You don’t just “hit update” and pray.
- In ROSA, you can schedule upgrade windows via the OpenShift Cluster Manager (OCM).
- The Red Hat SRE team monitors the upgrade. If it fails, they are the ones paged, not you. However, you must ensure your applications have correct Pod Disruption Budgets (PDBs) so the rolling update doesn’t take down your service.
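To make the PDB point concrete, here is a minimal sketch that builds a `PodDisruptionBudget` manifest as a Python dictionary (JSON is accepted by `oc apply -f`) and shows the arithmetic behind it. The names `web-pdb`, `shop`, and `web` are hypothetical, and the helper only models the simple integer `minAvailable` case.

```python
import json


def pod_disruption_budget(name, namespace, app_label, min_available):
    """Build a policy/v1 PodDisruptionBudget manifest as a dict."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


def allowed_disruptions(healthy_pods, min_available):
    """How many pods may be evicted at once without violating the budget."""
    return max(0, healthy_pods - min_available)


# Hypothetical application: 3 replicas, at least 2 must stay up.
pdb = pod_disruption_budget("web-pdb", "shop", "web", min_available=2)

if __name__ == "__main__":
    print(json.dumps(pdb, indent=2))
    print("evictions allowed:", allowed_disruptions(3, 2))
```

Note the failure mode in the other direction: if `minAvailable` equals the replica count, no eviction is ever allowed, and node drains during the upgrade can stall on that workload.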
Q4: How do you scale the cluster in the cloud?
Answer: You use the ClusterAutoscaler together with MachineAutoscalers.
- Unlike on-prem, where you are limited by physical hardware, in ROSA/ARO you define a `MachineAutoscaler` that sets the minimum and maximum size of a machine set. The ClusterAutoscaler watches for pods that can’t be scheduled due to a lack of CPU or memory and, within those bounds, provisions a new EC2 instance or Azure VM and joins it to the cluster.
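As an illustration, the following sketch builds a `MachineAutoscaler` manifest as a Python dictionary. The machine set name is hypothetical. On ROSA in particular, autoscaling is typically configured on machine pools through OCM or the `rosa` CLI rather than by applying this resource by hand, so treat this as a picture of the underlying object, not a recommended workflow.

```python
import json


def machine_autoscaler(machine_set, min_replicas, max_replicas):
    """Build a MachineAutoscaler manifest targeting one MachineSet."""
    if not 0 <= min_replicas <= max_replicas:
        raise ValueError("need 0 <= min_replicas <= max_replicas")
    return {
        "apiVersion": "autoscaling.openshift.io/v1beta1",
        "kind": "MachineAutoscaler",
        "metadata": {
            "name": machine_set,
            "namespace": "openshift-machine-api",
        },
        "spec": {
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "scaleTargetRef": {
                "apiVersion": "machine.openshift.io/v1beta1",
                "kind": "MachineSet",
                "name": machine_set,
            },
        },
    }


if __name__ == "__main__":
    # "workers-zone-a" is a hypothetical machine set name.
    print(json.dumps(machine_autoscaler("workers-zone-a", 2, 6), indent=2))
```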
4. Cost & Security
Q5: How do you control costs in a managed OpenShift environment?
Answer: Since you pay for every worker node, I implement:
- Cluster Autoscaling: Scaling down to minimum nodes at night.
- Resource Quotas: Preventing developers from requesting 16GB of RAM for a “Hello World” app.
- Spot Instances: Using AWS Spot Instances or Azure Spot VMs for non-production workloads, which can save up to 70% on compute costs in exchange for the risk of the instance being reclaimed.
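The savings from these levers are easy to estimate with back-of-the-envelope arithmetic. The sketch below is a rough model with hypothetical numbers: the hourly price, node counts, and discount are assumptions, and it covers worker compute only, not service fees, storage, or data transfer.

```python
def monthly_compute_cost(hourly_price, day_nodes, night_nodes,
                         night_hours=12, days=30, spot_discount=0.0):
    """Estimate monthly worker-node cost for a day/night scaling pattern."""
    day_hours = 24 - night_hours
    node_hours = days * (day_nodes * day_hours + night_nodes * night_hours)
    return round(node_hours * hourly_price * (1 - spot_discount), 2)


if __name__ == "__main__":
    price = 0.40  # hypothetical on-demand price per node-hour
    always_on = monthly_compute_cost(price, day_nodes=6, night_nodes=6)
    scaled = monthly_compute_cost(price, day_nodes=6, night_nodes=3)
    spot = monthly_compute_cost(price, day_nodes=6, night_nodes=3,
                                spot_discount=0.70)
    print(f"always on:          {always_on}")
    print(f"night scale-down:   {scaled}")
    print(f"scale-down + spot:  {spot}")
```

Having even a rough model like this lets you quote a concrete number when asked how much autoscaling or Spot capacity would actually save.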
Q6: How do you handle Authentication?
Answer: You typically don’t use local users. You integrate OpenShift with an external identity provider, such as Microsoft Entra ID (formerly Azure AD) or another OIDC-compatible provider.
- Question: “How do pods access cloud resources (like S3 or Azure Key Vault)?”
- Answer: AWS STS (Security Token Service) on ROSA, or managed identities on Azure. This allows pods to assume a cloud role using short-lived credentials, without needing to store static “Access Keys” inside a Secret.
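On the AWS side, this works through an IAM role whose trust policy allows one specific service account to call `sts:AssumeRoleWithWebIdentity` via the cluster's OIDC provider. The sketch below builds such a trust policy. The account ID, OIDC provider host, namespace, and service account name are all hypothetical placeholders.

```python
import json


def web_identity_trust_policy(account_id, oidc_provider, namespace,
                              service_account):
    """Build an IAM trust policy scoped to a single service account."""
    subject = f"system:serviceaccount:{namespace}:{service_account}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": (
                        f"arn:aws:iam::{account_id}:"
                        f"oidc-provider/{oidc_provider}"
                    )
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {f"{oidc_provider}:sub": subject}
                },
            }
        ],
    }


if __name__ == "__main__":
    # All values below are placeholders, not real identifiers.
    policy = web_identity_trust_policy(
        account_id="123456789012",
        oidc_provider="oidc.example.com/abc123",
        namespace="shop",
        service_account="s3-reader",
    )
    print(json.dumps(policy, indent=2))
```

The `sub` condition is the important part: without it, any service account in the cluster that can obtain a token could assume the role.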
5. Summary Comparison: On-Prem vs. Managed
| Feature | On-Prem (Bare Metal/VMware) | Managed (ROSA/ARO) |
| --- | --- | --- |
| Control Plane | You manage (3 VMs/Servers) | Managed by SRE (Hidden/Bundled) |
| Updates | Manual / High Risk | Scheduled / Automated |
| Load Balancer | MetalLB / F5 / HAProxy | AWS NLB/ALB or Azure LB |
| Storage | ODF / vSphere CSI | EBS/EFS or Azure Disk/Files |
| Failure Response | You get paged at 3 AM | Red Hat/Cloud SRE handles infra |
The “Pro” Managed Question:
“If the cluster is managed by Red Hat, why do they still need an Administrator like you?”
Winning Answer: “Because while Red Hat manages the platform, I manage the consumption. I ensure the networking between our VPCs is secure, I manage the RBAC and onboarding for our developers, I optimize costs so we aren’t over-provisioning cloud resources, and I implement the CI/CD patterns that allow our apps to run reliably on that platform.”