Debugging the never-ending backup job while using K10 with Longhorn CSI
A K10 backup job that uses the Longhorn CSI driver may never complete, even when the CSI snapshotter components and controllers are installed correctly.
This issue generally produces no errors. The job simply waits for the VolumeSnapshot object in Kubernetes to become readyToUse.
While the job is in this state, the VolumeSnapshot manifest shows "readyToUse: false":
$ kubectl get volumesnapshot k10-csi-snap-h5vglbt6b7fr6dnd -n postgresql
NAME                            READYTOUSE   SOURCEPVC                    SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS   SNAPSHOTCONTENT                                    CREATIONTIME   AGE
k10-csi-snap-h5vglbt6b7fr6dnd   false        data-postgres-postgresql-0                                         longhorn        snapcontent-51725945-bed7-42cb-8c4c-fde992f5553c                  53m
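When a snapshot stays in this state, it can help to look at the status fields directly instead of the table view. The following is a minimal sketch, not an official K10 or Longhorn tool: snapshot_status is a hypothetical helper name, and it assumes the standard status fields of the snapshot.storage.k8s.io API (readyToUse, error.message, boundVolumeSnapshotContentName).

```shell
# Print the readyToUse flag of a VolumeSnapshot, plus any error message
# recorded on it or on its bound VolumeSnapshotContent.
# Usage: snapshot_status <namespace> <volumesnapshot-name>
snapshot_status() {
  ns="$1"
  name="$2"

  # status.readyToUse stays "false" (or empty) until the driver finishes.
  kubectl get volumesnapshot "$name" -n "$ns" \
    -o jsonpath='readyToUse={.status.readyToUse}{"\n"}error={.status.error.message}{"\n"}'

  # VolumeSnapshotContent is cluster-scoped; its name is stored on the snapshot.
  content="$(kubectl get volumesnapshot "$name" -n "$ns" \
    -o jsonpath='{.status.boundVolumeSnapshotContentName}')"

  if [ -n "$content" ]; then
    kubectl get volumesnapshotcontent "$content" \
      -o jsonpath='contentError={.status.error.message}{"\n"}'
  fi
}

# Example (names taken from the output above):
#   snapshot_status postgresql k10-csi-snap-h5vglbt6b7fr6dnd
```

Empty error fields together with readyToUse=false match the symptom described here: the driver is still working on the snapshot (or waiting on a backup target) rather than failing outright.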
Whether a CSI snapshot is associated by default with a Longhorn backup (a backup to an S3 or NFS target) or with a Longhorn snapshot depends on the version of the Longhorn driver in use.
The sections below describe this behavior and the recommended way to configure CSI snapshots for different versions of the Longhorn driver.
Longhorn versions <= 1.2.4
In Longhorn versions 1.2.4 and below, a snapshot invoked through the CSI driver always results in a Longhorn backup, because these versions cannot map a CSI snapshot to a Longhorn snapshot.
Note: This behavior is not explicitly mentioned in the Longhorn documentation. However, a reference to it can be found in a GitHub issue comment.
A backup target must therefore be configured in Longhorn before K10 policies are run.
The following can be configured as backup targets:
- S3 Object store
- S3 compatible Object store like MinIO
- NFS (must support NFSv4)
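Before running a K10 policy on these versions, it may be worth confirming that a backup target is actually set. The sketch below is only illustrative: check_backup_target is a hypothetical helper, and it assumes that Longhorn is installed in the longhorn-system namespace and exposes the target through a Setting resource named backup-target with a top-level value field. Verify both assumptions against your Longhorn version.

```shell
# Print the configured Longhorn backup target, or fail if none is set.
# Assumptions: namespace "longhorn-system", Setting name "backup-target".
check_backup_target() {
  target="$(kubectl -n longhorn-system get settings.longhorn.io backup-target \
    -o jsonpath='{.value}')"

  if [ -z "$target" ]; then
    echo "no Longhorn backup target configured" >&2
    return 1
  fi

  echo "backup target: $target"
}

# Example:
#   check_backup_target || echo "configure a backup target before running K10 policies"
```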
Longhorn versions >= 1.3.0
From version 1.3.0, Longhorn provides configuration options for handling CSI snapshots: a CSI snapshot can be mapped to either a Longhorn snapshot or a Longhorn backup, depending on the requirement. It is recommended to use Longhorn snapshots with K10, so that K10's export capability can turn them into platform-agnostic, portable backups.
For CSI snapshots to be associated with Longhorn snapshots, the type parameter must be set to snap in the Longhorn VolumeSnapshotClass.
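A VolumeSnapshotClass with this parameter might look like the sketch below. The class name longhorn-snapshot-vsc and the helper longhorn_snap_class are hypothetical, the deletionPolicy is a choice made for illustration, and the apiVersion should match the snapshot CRDs installed in the cluster (older installations may still use v1beta1). The k10.kasten.io/is-snapshot-class annotation is not mentioned above; it is included on the assumption that K10 should pick this class for the Longhorn driver.

```shell
# Emit a VolumeSnapshotClass manifest that maps CSI snapshots to
# Longhorn snapshots (type: snap) instead of Longhorn backups.
longhorn_snap_class() {
  cat <<'EOF'
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: longhorn-snapshot-vsc          # hypothetical name
  annotations:
    k10.kasten.io/is-snapshot-class: "true"   # assumption: lets K10 select this class
driver: driver.longhorn.io
deletionPolicy: Delete                 # illustrative choice
parameters:
  type: snap                           # associate CSI snapshots with Longhorn snapshots
EOF
}

# Example: review the manifest, then apply it.
#   longhorn_snap_class | kubectl apply -f -
```

Keeping the manifest in a function makes it easy to inspect before it is applied to the cluster.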