The Problem: Hundreds of Unattached Disks

During our migration to Azure Red Hat OpenShift (ARO), we frequently recreated disks in development, using a custom storage class with `reclaimPolicy: Retain` to protect PersistentVolumes (PVs) from accidental deletion. This worked as intended: deleting a PersistentVolumeClaim (PVC) in OpenShift marked the PV as `Released`, and the underlying Azure disk stayed intact even after the PV itself was deleted.

Months later, our manager flagged unexpectedly high Azure storage costs. We discovered hundreds of unattached disks lingering in Azure: the `Retain` policy had prevented automatic deletion of the Azure disks. To make matters worse, our team lacked the Azure permissions needed to delete them manually.

Solution: Manually Set `persistentVolumeReclaimPolicy` to `Delete` on the PV

-> Existing PVs

If the PVC was deleted but the PV still exists in OpenShift, the fix is straightforward:

  1. Verify the PV is no longer needed to avoid accidental data loss
  2. Edit the PV’s YAML to change `persistentVolumeReclaimPolicy` from `Retain` to `Delete`
  3. Delete the PV in OpenShift, which will also remove the underlying Azure disk
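The steps above can be scripted in a few commands; a minimal sketch, where the PV name `pv-example` is a placeholder for the Released PV you verified is safe to remove:

```shell
# Hypothetical PV name; substitute your own Released PV.
PV_NAME=pv-example

# Flip the reclaim policy so deleting the PV also deletes the backing Azure disk.
oc patch pv "$PV_NAME" -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'

# Deleting the PV now triggers the CSI driver to remove the Azure disk.
oc delete pv "$PV_NAME"
```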

-> Handle Orphaned Disks

If both the PVC and PV are gone but the Azure disk remains, we need to create a new PV and PVC that bind to the disk, then delete them so the disk is removed along the way:

  1. Identify the unattached disk in Azure (e.g., via the Azure Portal or CLI: `az disk list --query "[?managedBy==null]"`)
  2. Create a storage class with `reclaimPolicy: Delete`
  3. Create a PV pointing to the Azure disk
  4. Create a PVC to bind to the PV
  5. Delete the PVC, which will delete the PV and the Azure disk
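For step 1, the CLI query can be expanded to show the columns you will need later (the selected columns are my own choice, not part of the original query):

```shell
# List managed disks not attached to any VM; managedBy is null for unattached disks.
az disk list \
  --query "[?managedBy==null].{name:name, resourceGroup:resourceGroup, sizeGb:diskSizeGb}" \
  --output table
```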

Once no OpenShift objects reference the Azure disk anymore, a new PV and PVC have to be created that point to that disk. Use or create a storage class similar to your existing one, but with `reclaimPolicy` set to `Delete`. Below are some examples.
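The storage class referenced by the examples below might look like this. This is a sketch: the SKU and binding mode are assumptions about your setup, so match them to your existing class.

```yaml
# Hypothetical storage class matching the name used in the PV/PVC examples.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: my-managed-csi-premium-with-reclaimpolicy-delete
provisioner: disk.csi.azure.com
parameters:
  skuName: Premium_LRS   # assumed SKU; copy from your existing class
reclaimPolicy: Delete    # the important part: deleting the PV deletes the disk
volumeBindingMode: Immediate
```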

Example PV (pv-to-delete.yaml)

kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv-to-delete
  annotations:
    pv.kubernetes.io/provisioned-by: disk.csi.azure.com
spec:
  capacity:
    storage: 1Gi
  csi:
    driver: disk.csi.azure.com
    volumeHandle: /subscriptions/eae2918f-edf0-4d65-bc8c-5d4f0662d5f0/resourceGroups/aro-infra-my-cluster-name/providers/Microsoft.Compute/disks/pvc-6348c859-9577-4fd5-88f9-ebfc019eb428
    fsType: ext4
    volumeAttributes:
      csi.storage.k8s.io/pv/name: pvc-6348c859-9577-4fd5-88f9-ebfc019eb428
      csi.storage.k8s.io/pvc/name: pvc-to-delete
      csi.storage.k8s.io/pvc/namespace: my-namespace
      requestedsizegib: '1'
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: my-managed-csi-premium-with-reclaimpolicy-delete
  volumeMode: Filesystem

Example PVC (pvc-to-delete.yaml)

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-to-delete
  namespace: my-namespace
  annotations:
    pv.kubernetes.io/bind-completed: 'yes'
    pv.kubernetes.io/bound-by-controller: 'yes'
    volume.beta.kubernetes.io/storage-provisioner: disk.csi.azure.com
    volume.kubernetes.io/storage-provisioner: disk.csi.azure.com
  finalizers:
    - kubernetes.io/pvc-protection
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  volumeName: pv-to-delete
  storageClassName: my-managed-csi-premium-with-reclaimpolicy-delete
  volumeMode: Filesystem
status:
  phase: Bound
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 1Gi

This time, once the PVC is bound to the PV (and thus to the Azure disk), simply delete the PVC. The corresponding PV and the Azure disk will be deleted along with it.
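The final cleanup might look like this. The filenames match the examples above; the `oc wait` timeout is an arbitrary choice:

```shell
# Recreate the PV and PVC pointing at the orphaned disk.
oc apply -f pv-to-delete.yaml
oc apply -f pvc-to-delete.yaml

# Wait for the static binding to complete before deleting.
oc wait --for=jsonpath='{.status.phase}'=Bound pvc/pvc-to-delete \
  -n my-namespace --timeout=60s

# Deleting the PVC cascades: the PV goes away and the CSI driver deletes the Azure disk.
oc delete pvc pvc-to-delete -n my-namespace
```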