Debugging backups with Longhorn CSI

Debugging a never-ending backup job when using K10 with Longhorn CSI

Description:

A K10 backup job that runs against volumes managed by the Longhorn CSI driver never completes, even though the CSI snapshotter components and controllers are installed correctly.

Error:

This issue does not usually produce an explicit error. Instead, the job waits indefinitely for the VolumeSnapshot object in Kubernetes to become readyToUse.

Listing the VolumeSnapshot shows READYTOUSE as false ("readyToUse: false" in the manifest):

$ kubectl get volumesnapshot k10-csi-snap-h5vglbt6b7fr6dnd -n postgresql
NAME                            READYTOUSE   SOURCEPVC                    SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS   SNAPSHOTCONTENT                                    CREATIONTIME   AGE
k10-csi-snap-h5vglbt6b7fr6dnd   false        data-postgres-postgresql-0                                         longhorn        snapcontent-51725945-bed7-42cb-8c4c-fde992f5553c                  53m
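
To check the same field without reading the whole table, the status can be queried directly with a jsonpath expression. The following is a minimal sketch: snapshot_status is a hypothetical helper name (not part of kubectl or K10), and it assumes the standard VolumeSnapshot status fields readyToUse and error.message.

```shell
# Sketch: print the readiness flag and any error recorded on a VolumeSnapshot.
# snapshot_status is a hypothetical helper name, not part of kubectl or K10.
snapshot_status() {
  name="$1"
  namespace="$2"
  # .status.readyToUse stays "false" until the CSI driver reports the snapshot as ready.
  ready=$(kubectl get volumesnapshot "$name" -n "$namespace" -o jsonpath='{.status.readyToUse}')
  # .status.error.message is only populated when the snapshotter recorded a failure.
  error=$(kubectl get volumesnapshot "$name" -n "$namespace" -o jsonpath='{.status.error.message}')
  echo "readyToUse=${ready:-unknown} error=${error:-none}"
}

# Example call (commented out so the snippet can be sourced safely):
# snapshot_status k10-csi-snap-h5vglbt6b7fr6dnd postgresql
```

For the issue described here we would expect readyToUse to be false and, typically, no error message, which matches the job waiting silently.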

Resolution:

Longhorn supports both snapshots stored locally with the volume and backups stored on an external backup target.

However, when a snapshot is requested through the CSI driver, Longhorn creates a local snapshot and also backs it up to the backup target. If no backup target is configured, that backup cannot complete, which is why the VolumeSnapshot never becomes readyToUse.

The Longhorn documentation does not state this explicitly, but it is mentioned in this GitHub issue comment.

A backup target must therefore be configured in Longhorn before running K10 policies.
The following can be configured as backup targets:

  • S3 object store
  • S3-compatible object store such as MinIO
  • NFS (must support NFSv4)

Please refer to this Longhorn documentation to configure a backup target.
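
Before re-running the K10 policy, it can help to confirm that Longhorn actually has a backup target set. The sketch below is illustrative only: check_backup_target is a hypothetical helper name, and it assumes that Longhorn is installed in the longhorn-system namespace and that the release in use stores the target in a Setting resource named backup-target. Other releases may expose the backup target differently, so adjust the query to match your installation.

```shell
# Sketch: report whether a Longhorn backup target is configured.
# check_backup_target is a hypothetical helper name.
# Assumptions: Longhorn runs in the longhorn-system namespace and keeps the
# backup target in a Setting resource named backup-target. This may not hold
# for every Longhorn release.
check_backup_target() {
  target=$(kubectl -n longhorn-system get settings.longhorn.io backup-target -o jsonpath='{.value}')
  if [ -z "$target" ]; then
    # An empty value (or a failed lookup) is treated as "not configured".
    echo "no backup target configured"
    return 1
  fi
  echo "backup target: $target"
}

# Example call (commented out so the snippet can be sourced safely):
# check_backup_target
```

If the helper reports that no backup target is configured, CSI snapshots taken by K10 would be expected to stay at readyToUse: false until a target is added.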