Here’s a comprehensive breakdown of OCP backup — covering the two distinct layers you need to protect.
The two backup layers in OCP
OCP backup is not a single thing — you need two separate strategies working together:
| Layer | What it protects | Tool |
|---|---|---|
| Control plane (etcd) | Cluster state — all Kubernetes/OCP objects, CRDs, configs, RBAC | cluster-backup.sh / EtcdBackup CR |
| Application data | Namespaces, workloads, PVs/PVCs, images | OADP (OpenShift API for Data Protection) |
Use etcd backups, ideally automated snapshots, to protect and recover the cluster itself. Use OADP to protect and recover your applications and their data on top of a healthy cluster. The two are complementary, not interchangeable: OADP cannot successfully back up and restore operators or etcd.
Layer 1 — etcd backup (control plane)
etcd is the key-value store for OpenShift Container Platform, which persists the state of all resource objects. An etcd backup plays a crucial role in disaster recovery.
What the backup produces
Running cluster-backup.sh on a control plane node generates two files:
- snapshot_<timestamp>.db: the etcd snapshot (all cluster state)
- static_kuberesources_<timestamp>.tar.gz: static pod manifests, plus the encryption keys if etcd encryption is enabled
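
Because a usable backup needs both files, a quick check before copying anything off the node can catch an incomplete run. The following is a minimal sketch: check_backup_dir is a hypothetical helper, not part of OCP, and it assumes only the two file name patterns listed above.

```shell
# Hypothetical helper: succeed only if DIR contains at least one
# snapshot_*.db and one static_kuberesources_*.tar.gz file.
check_backup_dir() {
  cbd_dir=$1
  cbd_snap=0
  cbd_static=0
  # An unmatched glob stays literal, so test that each match really exists.
  for cbd_f in "$cbd_dir"/snapshot_*.db; do
    if [ -e "$cbd_f" ]; then cbd_snap=1; fi
  done
  for cbd_f in "$cbd_dir"/static_kuberesources_*.tar.gz; do
    if [ -e "$cbd_f" ]; then cbd_static=1; fi
  done
  if [ "$cbd_snap" -ne 1 ]; then
    echo "missing snapshot_*.db in $cbd_dir" >&2
    return 1
  fi
  if [ "$cbd_static" -ne 1 ]; then
    echo "missing static_kuberesources_*.tar.gz in $cbd_dir" >&2
    return 1
  fi
  echo "ok: backup pair present in $cbd_dir"
}
```

For example, check_backup_dir /home/core/backup could run right after the backup script, with the off-cluster copy proceeding only when it succeeds.
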
How to take a manual backup

    # SSH into any control plane node
    ssh core@master-0.example.com

    # Run the built-in backup script
    sudo /usr/local/bin/cluster-backup.sh /home/core/backup

    # From your workstation: copy the backup off-cluster immediately
    scp 'core@master-0.example.com:/home/core/backup/*' /safe/offsite/location/

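
Copying the files off the node is the step where silent corruption is most likely to go unnoticed, so it can help to record checksums next to the backup and verify them at the destination. This is a sketch under assumptions: write_checksums and verify_checksums are hypothetical helper names, the SHA256SUMS file name is arbitrary, and the sha256sum utility from GNU coreutils is assumed to be available on both sides.

```shell
# Hypothetical helpers; assumes GNU coreutils sha256sum is installed.

# Record checksums for both backup files inside the backup directory.
write_checksums() {
  ( cd "$1" && sha256sum snapshot_*.db static_kuberesources_*.tar.gz > SHA256SUMS )
}

# Re-check the files against the recorded checksums.
# Returns non-zero if any file is missing or has changed.
verify_checksums() {
  ( cd "$1" && sha256sum -c SHA256SUMS > /dev/null )
}
```

One possible flow is to run write_checksums /home/core/backup on the node before the copy (the SHA256SUMS file then travels with the scp), and verify_checksums /safe/offsite/location afterwards.
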
Automated scheduled backup (OCP 4.14+)
You can create a Backup custom resource (CR) that defines the schedule and retention policy for automated backups:

    # 1. Create a PVC for backup storage
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: etcd-backup-pvc
      namespace: openshift-etcd
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 200Gi
    ---
    # 2. Schedule recurring backups
    apiVersion: config.openshift.io/v1alpha1
    kind: Backup
    metadata:
      name: etcd-recurring-backup
    spec:
      etcd:
        schedule: "20 4 * * *"   # Daily at 04:20 UTC
        timeZone: "UTC"
        pvcName: etcd-backup-pvc
        retentionPolicy:
          retentionType: RetentionNumber
          retentionNumber:
            maxNumberOfBackups: 15

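
The RetentionNumber policy above keeps at most a fixed number of backups on the PVC. If you also keep manually copied snapshots elsewhere, the same idea can be applied by hand. The sketch below is illustrative only: prune_snapshots is a hypothetical helper, and it assumes the timestamp in each file name sorts chronologically (for example snapshot_2024-01-03_041500.db), so that sorted glob order is oldest first.

```shell
# Hypothetical helper: keep only the KEEP newest snapshot_*.db files in DIR.
# Assumes file name timestamps sort chronologically.
prune_snapshots() {
  ps_dir=$1
  ps_keep=$2
  # Count existing snapshots (an unmatched glob stays literal, so test -e).
  ps_total=0
  for ps_f in "$ps_dir"/snapshot_*.db; do
    [ -e "$ps_f" ] || continue
    ps_total=$(( ps_total + 1 ))
  done
  ps_excess=$(( ps_total - ps_keep ))
  # Globs expand in sorted order, so the first entries are the oldest.
  for ps_f in "$ps_dir"/snapshot_*.db; do
    [ "$ps_excess" -gt 0 ] || break
    [ -e "$ps_f" ] || continue
    rm -f -- "$ps_f"
    echo "removed ${ps_f##*/}"
    ps_excess=$(( ps_excess - 1 ))
  done
  return 0
}
```

Calling prune_snapshots /safe/offsite/location 15 would mirror the maxNumberOfBackups value used above. A fuller version would also remove the matching static_kuberesources archive for each pruned snapshot.
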
Key rules for etcd backups
- Do not take an etcd backup before the first certificate rotation completes, which occurs 24 hours after installation; otherwise the backup will contain expired certificates.
- Take etcd backups during non-peak usage hours, because taking the snapshot is a blocking action.
- Backups only need to be taken from one control plane node; there is no need to run the script on every one. Store backups offsite, or at least off the node that produced them.
- Be sure to take an etcd backup after you upgrade your cluster. When you restore your cluster, you must use an etcd backup that was taken from the same z-stream release. For example, an OCP 4.14.2 cluster must use a backup taken from 4.14.2.
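
Two of these rules are easy to check mechanically before a backup or restore. The functions below are a minimal sketch with hypothetical names: same_zstream compares two full version strings, and past_first_rotation checks that 24 hours have elapsed between two epoch timestamps.

```shell
# Hypothetical pre-flight checks for the rules above.

# same_zstream CLUSTER_VERSION BACKUP_VERSION
# True only when both full versions (x.y.z) are identical.
same_zstream() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# past_first_rotation INSTALL_EPOCH NOW_EPOCH
# True once at least 24 hours (86400 seconds) have passed since install.
past_first_rotation() {
  [ $(( $2 - $1 )) -ge 86400 ]
}
```

One way to feed these, assuming a logged-in oc session, is to read the running version with oc get clusterversion version -o jsonpath='{.status.desired.version}' and to record the cluster version alongside each backup when it is taken.
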
Restore procedure (high level)

    # On the designated recovery control plane node:
    sudo -E /usr/local/bin/cluster-restore.sh /home/core/backup

    # After restore completes, force etcd redeployment:
    oc edit etcd cluster
    # Set directly under spec:
    #   forceRedeploymentReason: recovery-2025-04-17

    # Monitor etcd pods coming back up
    oc get pods -n openshift-etcd | grep -v quorum

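
If you prefer not to edit the resource interactively, the redeployment can also be triggered with a merge patch that sets spec.forceRedeploymentReason. The helper below is a hypothetical sketch that only builds the patch body; it assumes the reason string contains no characters that need JSON escaping.

```shell
# Hypothetical helper: print a merge patch that sets
# spec.forceRedeploymentReason to the given reason.
# Assumes the reason has no quotes or backslashes.
redeploy_patch() {
  printf '{"spec":{"forceRedeploymentReason":"%s"}}\n' "$1"
}
```

It could then be applied with oc patch etcd cluster --type=merge -p "$(redeploy_patch "recovery-$(date +%F)")". Using a fresh reason each time matters, because the redeployment is only triggered when the value changes.
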
Layer 2 — OADP (application backup)
OADP uses Velero to back up and restore cluster resources, internal images, or both. It can also protect persistent volumes, either with a file-system backup (Restic or Kopia) or with volume snapshots.
Install OADP via OperatorHub
Operators → OperatorHub → search "OADP" → Install
Configure a backup location (S3 example)

    apiVersion: oadp.openshift.io/v1alpha1
    kind: DataProtectionApplication
    metadata:
      name: dpa-cluster
      namespace: openshift-adp
    spec:
      configuration:
        velero:
          defaultPlugins:
            - openshift   # Required for OCP-specific resources
            - aws
        nodeAgent:
          enable: true
          uploaderType: kopia   # Preferred over restic in OADP 1.3+
      backupLocations:
        - name: default
          velero:
            provider: aws
            default: true
            objectStorage:
              bucket: my-ocp-backups
              prefix: cluster-1
            credential:
              name: cloud-credentials
              key: cloud
      snapshotLocations:
        - name: default
          velero:
            provider: aws
            config:
              region: ca-central-1

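
The credential block above points at a secret named cloud-credentials with a key named cloud, which has to exist before the backup location can become available. As a sketch, the hypothetical helper below writes an AWS-style credentials file with a default profile; the output file name and the helper name are assumptions made for illustration.

```shell
# Hypothetical helper: write an AWS-style credentials file.
# $1 = output file, $2 = access key ID, $3 = secret access key
write_velero_credentials() {
  (
    # Restrict permissions, since the file contains a secret.
    umask 077
    printf '[default]\naws_access_key_id=%s\naws_secret_access_key=%s\n' "$2" "$3" > "$1"
  )
}
```

The file could then be loaded into the expected secret with oc create secret generic cloud-credentials -n openshift-adp --from-file cloud=credentials-velero, where credentials-velero is whatever path you passed as the first argument.
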
Taking an application backup

    # Back up specific namespaces
    apiVersion: velero.io/v1
    kind: Backup
    metadata:
      name: my-app-backup
      namespace: openshift-adp
    spec:
      includedNamespaces:
        - my-app
        - my-app-db
      defaultVolumesToFsBackup: true   # Use kopia/restic for PVs
      storageLocation: default
      ttl: 720h0m0s                    # 30-day retention
    ---
    # Scheduled backup (daily at 02:00)
    apiVersion: velero.io/v1
    kind: Schedule
    metadata:
      name: daily-app-backup
      namespace: openshift-adp
    spec:
      schedule: "0 2 * * *"
      template:
        includedNamespaces:
          - "*"             # All namespaces
        excludedNamespaces:
          - openshift-*     # Exclude platform namespaces
          - kube-*
        defaultVolumesToFsBackup: true
        storageLocation: default
        ttl: 168h0m0s       # 7-day retention

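
The ttl values are durations expressed in hours, which is why 30 days appears as 720h0m0s and 7 days as 168h0m0s. A tiny hypothetical helper makes the arithmetic explicit when you derive retention from a policy stated in days:

```shell
# Hypothetical helper: convert a retention period in whole days
# to a duration string of the form used in the ttl field.
ttl_from_days() {
  printf '%dh0m0s\n' "$(( $1 * 24 ))"
}

ttl_from_days 30   # prints 720h0m0s
ttl_from_days 7    # prints 168h0m0s
```
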
Restoring from OADP

    apiVersion: velero.io/v1
    kind: Restore
    metadata:
      name: my-app-restore
      namespace: openshift-adp
    spec:
      backupName: my-app-backup
      includedNamespaces:
        - my-app
      restorePVs: true

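
Creating the Restore object only starts the process; the outcome is reported in its status.phase field. The sketch below polls a command until it prints a terminal phase. wait_for_phase is a hypothetical helper, and the set of phases it treats as terminal (Completed, PartiallyFailed, Failed, FailedValidation) is an assumption that may not cover every phase your Velero version reports.

```shell
# Hypothetical helper: wait_for_phase TRIES DELAY_SECONDS COMMAND...
# Runs COMMAND repeatedly until it prints a terminal phase.
# Returns 0 on Completed, 1 on a failure phase, 2 on timeout.
wait_for_phase() {
  wfp_tries=$1
  wfp_delay=$2
  shift 2
  while [ "$wfp_tries" -gt 0 ]; do
    # Treat a failing command as "no phase yet" and keep polling.
    wfp_phase=$("$@" 2>/dev/null) || wfp_phase=""
    case $wfp_phase in
      Completed)
        echo "$wfp_phase"
        return 0 ;;
      PartiallyFailed|Failed|FailedValidation)
        echo "$wfp_phase"
        return 1 ;;
    esac
    wfp_tries=$(( wfp_tries - 1 ))
    sleep "$wfp_delay"
  done
  echo "timeout"
  return 2
}
```

For the restore above, one possible invocation is wait_for_phase 60 10 oc get restores.velero.io my-app-restore -n openshift-adp -o jsonpath='{.status.phase}', which polls every 10 seconds for up to 10 minutes. The same function works for Backup objects by changing the command.
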
PV backup methods
| Method | How it works | Best for |
|---|---|---|
| CSI Snapshots | Point-in-time volume snapshot via storage driver | Cloud PVs (AWS EBS, Azure Disk, Ceph RBD) |
| Kopia/Restic (fs backup) | File-level copy streamed to object storage | Any PV, slower but universal |
Supported backup storage targets
OADP supports AWS, Microsoft Azure, GCP, Multicloud Object Gateway, and S3-compatible object storage (MinIO, NooBaa, etc.) as backup storage targets. Snapshot backups can be performed for AWS, Azure, GCP, and CSI snapshot-enabled storage such as CephFS and Ceph RBD.
Best practices summary
| Practice | Detail |
|---|---|
| 3-2-1 rule | 3 copies, 2 media types, 1 offsite — etcd snapshots must be stored outside the cluster |
| Test restores | Regularly restore to a test cluster — an untested backup is not a backup |
| Version lock | etcd restores must use a backup from the same OCP z-stream version |
| Frequency | etcd: at least daily, plus before and after every upgrade; OADP: daily or per RPO requirement |
| Exclude platform namespaces | Don’t include openshift-* in OADP — OADP doesn’t restore operators or etcd |
| Encryption | Encrypt backup storage at rest; etcd snapshot includes encryption keys if etcd encryption is on |
| Monitor backup jobs | Set up alerts on failed Schedule or EtcdBackup CRs |