Clean up K10 orphaned volumes on GKE k8s 1.28.x

This guide explains how to clean up Kasten K10 provisioned PVs that fail to delete on GKE k8s 1.28, leaving orphaned volumes behind.

It has been observed that on GKE clusters running k8s 1.28, Kasten K10 provisioned PVs using the in-tree provisioner cannot be deleted via “kubectl delete pvc <pvcname>”, resulting in volume sprawl that requires manual remediation. Restoring from snapshots/backups still functions.

There is no risk of data loss, and existing backups remain fully recoverable.

This issue affects only the in-tree provisioner “kubernetes.io/gce-pd” on GKE k8s 1.28.x.
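
To see whether your workloads are exposed, check which provisioner their StorageClass uses. The following is a minimal check (it only assumes permission to read StorageClasses):

# List StorageClasses and their provisioners; only classes backed by the
# in-tree "kubernetes.io/gce-pd" provisioner are affected by this issue.
kubectl get storageclass -o custom-columns=NAME:.metadata.name,PROVISIONER:.provisioner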

Environment

  • GKE k8s 1.28.x

  • Kasten K10 Version: 6.5.2

  • In-tree provisioner “kubernetes.io/gce-pd”
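
As a quick sanity check of the environment above, the following sketch assumes K10 was installed with Helm into the kasten-io namespace:

# Node version should report v1.28.x
kubectl get nodes

# K10 release and app version (assumes a Helm install in the kasten-io namespace)
helm list -n kasten-io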

The following example shows how to check for orphaned volumes in the environment described above and clean them up successfully.

Step 1: Protect workloads using Kasten K10 and verify any orphaned PVs

a. Run the policy to take a Snap+Export and capture the result. The screenshot below shows the snapshot and export completing successfully.
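
The policy run can also be checked from the CLI; this is a sketch that assumes the standard K10 CRD group names and the default kasten-io install namespace:

# List K10 policies and the actions created by the policy run
kubectl get policies.config.kio.kasten.io -n kasten-io
kubectl get runactions.actions.kio.kasten.io -n kasten-io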

However, after the export completes, the report shows a Failed status caused by the orphaned volumes:

 
status:
  message: 'error getting deleter volume plugin for volume "kio-4e1c8777bcd411eeb71cde3cda5267f6-0":
    no volume plugin matched'
  phase: Failed
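
The same failure details can be pulled directly from the PV object with a jsonpath query, for example:

# Print the phase and failure message of the orphaned PV
kubectl get pv kio-4e1c8777bcd411eeb71cde3cda5267f6-0 \
  -o jsonpath='{.status.phase}: {.status.message}{"\n"}'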

b. Check for failed PVs to trace the orphaned volumes

kubectl get pv | grep -i failed
kio-4e1c8777bcd411eeb71cde3cda5267f6-0     8589934592   RWO            Delete           Failed   kasten-io/kio-4e1c8777bcd411eeb71cde3cda5267f6-0   standard                39m
kio-e747f57abcd111eeb71cde3cda5267f6-0     8589934592   RWO            Delete           Failed   kasten-io/kio-e747f57abcd111eeb71cde3cda5267f6-0   standard                57m

kubectl describe pv kio-4e1c8777bcd411eeb71cde3cda5267f6-0

Events:
  Type     Reason              Age   From                         Message
  ----     ------              ----  ----                         -------
  Warning  VolumeFailedDelete  15m   persistentvolume-controller  error getting deleter volume plugin for volume "kio-4e1c8777bcd411eeb71cde3cda5267f6-0": no volume plugin matched

Step 2: Attempt to delete the orphaned PVC

kubectl delete pvc kio-4e1c8777bcd411eeb71cde3cda5267f6-0 -n kasten-io

The PV remains in a Failed status:

status:
  message: 'error getting deleter volume plugin for volume "kio-4e1c8777bcd411eeb71cde3cda5267f6-0":
    no volume plugin matched'
  phase: Failed
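
A quick check confirms that the PVC itself is gone while the PV object (and the underlying disk) persists:

# The PVC no longer exists in the kasten-io namespace...
kubectl get pvc -n kasten-io | grep kio- || echo "no kio- PVCs found"
# ...but the PV is still present and stuck in Failed
kubectl get pv | grep -i failed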
 
Step 3: Restore the workload from the K10 snapshot/export; the restore succeeds.
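
If desired, the restore can also be verified from the CLI; this sketch assumes the K10 RestoreAction CRD (actions.kio.kasten.io) is available in the cluster:

# List K10 restore actions across namespaces and check their state
kubectl get restoreactions.actions.kio.kasten.io --all-namespaces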

Resolution:

The following steps outline the process to clean up any orphaned volumes.

  • Identify the GCE disk names for the orphaned volumes

kubectl get pv --selector k10pvmatchid \
  -o jsonpath='{.items[?(@.status.phase == "Failed")].spec.gcePersistentDisk.pdName}'  
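
Before deleting anything, it can help to confirm those disks actually exist in GCE; this sketch assumes the orphaned disks keep the kio- prefix shown above:

# List GCE disks whose names start with "kio-"
gcloud compute disks list --filter="name~^kio-"
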
  • Clean up the orphaned disks in GCE

# Collect the GCE disk names backing the Failed K10 PVs
disks=$(kubectl get pv --selector k10pvmatchid -o jsonpath='{.items[?(@.status.phase == "Failed")].spec.gcePersistentDisk.pdName}')
# Delete each orphaned disk (add --zone <zone> if no default compute zone is configured)
for disk in $disks; do
  gcloud compute disks delete "$disk" --quiet
done
  • Clean up the failed K10 PV resources in the k8s cluster

# Collect the names of the Failed K10 PVs and remove the PV objects from the cluster
failedpvs=$(kubectl get pv --selector k10pvmatchid -o jsonpath='{.items[?(@.status.phase == "Failed")].metadata.name}')
for failedpv in $failedpvs; do
  kubectl delete pv "$failedpv"
done
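
Finally, verify that the cleanup worked; both commands below should return no results (the gcloud filter again assumes the kio- disk name prefix):

# No Failed K10 PVs should remain in the cluster
kubectl get pv --selector k10pvmatchid \
  -o jsonpath='{.items[?(@.status.phase == "Failed")].metadata.name}'

# No orphaned kio- disks should remain in GCE
gcloud compute disks list --filter="name~^kio-"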