Debugging backups with Longhorn CSI

Debugging the never-ending backup job while using K10 with Longhorn CSI

Description:

A K10 backup job never completes when using the Longhorn CSI driver, even though the CSI snapshotter components and controllers are installed correctly.

Error:

Generally, no errors are reported for this issue. The job waits for the VolumeSnapshot object in Kubernetes to become readyToUse.

While the job is in this state, the VolumeSnapshot shows "readyToUse: false":

$ kubectl get volumesnapshot k10-csi-snap-h5vglbt6b7fr6dnd -n postgresql
NAME                            READYTOUSE   SOURCEPVC                    SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS   SNAPSHOTCONTENT                                    CREATIONTIME   AGE
k10-csi-snap-h5vglbt6b7fr6dnd   false        data-postgres-postgresql-0                                         longhorn        snapcontent-51725945-bed7-42cb-8c4c-fde992f5553c                  53m
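
Because the job itself reports nothing, the status fields of the VolumeSnapshot and its bound VolumeSnapshotContent are often the only place a reason is recorded. The following is a minimal sketch of how they might be read; the function names are hypothetical helpers, and the error fields may simply be empty if the driver has not reported a failure.

```shell
# Print the readiness flag, the bound content name, and any error message
# recorded on a VolumeSnapshot (one value per line).
# Usage: snapshot_status <snapshot-name> <namespace>
snapshot_status() {
  kubectl get volumesnapshot "$1" -n "$2" \
    -o jsonpath='{.status.readyToUse}{"\n"}{.status.boundVolumeSnapshotContentName}{"\n"}{.status.error.message}{"\n"}'
}

# Print any error message recorded on a VolumeSnapshotContent.
# VolumeSnapshotContent is cluster-scoped, so no namespace is passed.
# Usage: snapshot_content_error <snapshotcontent-name>
snapshot_content_error() {
  kubectl get volumesnapshotcontent "$1" \
    -o jsonpath='{.status.error.message}{"\n"}'
}
```

For the snapshot shown above, this would be called as "snapshot_status k10-csi-snap-h5vglbt6b7fr6dnd postgresql", followed by "snapshot_content_error snapcontent-51725945-bed7-42cb-8c4c-fde992f5553c".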

Resolution:

Whether a CSI snapshot maps by default to a Longhorn backup (a backup to an S3 or NFS target) or to a Longhorn snapshot depends on the version of the Longhorn driver in use.

The sections below describe this behavior and the recommended way to configure CSI snapshots for different versions of the Longhorn driver.

Longhorn versions <= 1.2.4

In Longhorn 1.2.4 and earlier, a snapshot requested through the CSI driver defaults to a Longhorn backup, because these versions cannot map a CSI snapshot to a Longhorn snapshot.

Note: This behavior is not explicitly mentioned in the Longhorn documentation. However, it is referenced in a comment on a Longhorn GitHub issue.

A backup target is therefore required when running K10 policies. Without one, the Longhorn backup cannot be created and the VolumeSnapshot never becomes ready.
The following can be configured as backup targets:

S3 object store
S3-compatible object store such as MinIO
NFS (must support NFSv4)

Refer to the Longhorn documentation to configure a backup target.
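
Before re-running a policy, it can help to confirm that a backup target is actually set. The sketch below assumes that Longhorn is installed in the longhorn-system namespace and that the target is stored in the backup-target Setting object (settings.longhorn.io), which may not hold for every release; the function name is a hypothetical helper.

```shell
# Print the configured Longhorn backup target URL.
# Empty output suggests that no backup target has been configured.
# Usage: longhorn_backup_target [namespace]   (assumed default: longhorn-system)
longhorn_backup_target() {
  kubectl -n "${1:-longhorn-system}" get settings.longhorn.io backup-target \
    -o jsonpath='{.value}'
}
```

A non-empty value such as an s3:// or nfs:// URL indicates that a target is configured; whether it is reachable still has to be checked in Longhorn itself.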

Longhorn versions >= 1.3.0

Starting with version 1.3.0, Longhorn lets you configure how CSI snapshots are handled. A CSI snapshot can map to either a Longhorn snapshot or a Longhorn backup, depending on your requirements.

Using Longhorn snapshots with K10 is recommended, because K10 can then export those snapshots to create portable, platform-agnostic backups.

To associate CSI snapshots with Longhorn snapshots, set the type parameter to snap in the Longhorn VolumeSnapshotClass:

apiVersion: snapshot.storage.k8s.io/v1
deletionPolicy: Delete
driver: driver.longhorn.io
kind: VolumeSnapshotClass
metadata:
  name: longhorn
parameters:
  type: snap
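
To check which mapping an existing class uses, the type parameter can be read back from the cluster. This is a minimal sketch; the function name is a hypothetical helper, and the class name defaults to longhorn as in the manifest above.

```shell
# Print the "type" parameter of a VolumeSnapshotClass.
# Usage: snapshot_class_type [class-name]   (default: longhorn)
snapshot_class_type() {
  kubectl get volumesnapshotclass "${1:-longhorn}" \
    -o jsonpath='{.parameters.type}'
}
```

With the manifest above applied, the function should print snap; empty output means that the class does not set the parameter.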