K10 - vSphere Block Mode Export Failure with Error Code 14009

This guide explains how to troubleshoot vSphere block mode exports that fail with error code 14009 in the debug logs.

Problem description

During a vSphere block mode export, the folders and objects may be created in the storage bucket while the export job remains stuck and the following error message appears in the debug logs:

"Open virtual disk file failed. The error code is 14009., try --help\nError: exit status 1"

K10 requires network access from the cluster to vCenter and the ESX hosts in TKGs environments.

Workaround/Resolution:

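Check the firewall configuration: port 902 must be open between the worker nodes and the ESX hosts. To test this, start a temporary pod in the cluster and try to connect to an ESX host on port 902 from inside it: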

kubectl run -i --tty busybox --image=busybox:1.28 -- sh 

telnet <esxhost name> 902
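When there are several ESX hosts, the same port test can be scripted. The following is a minimal sketch, not part of K10: the function name check_port and the host names in the example are hypothetical, and it assumes it is run from a machine or pod on the worker node network that has bash and the coreutils timeout command (the busybox image above does not include bash).

```shell
# check_port HOST [PORT] [TIMEOUT_SECONDS]
# Tries to open a TCP connection using bash's /dev/tcp pseudo-device.
# PORT defaults to 902, the port named in this article.
check_port() {
  host=$1
  port=${2:-902}
  wait_s=${3:-5}
  if timeout "$wait_s" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
    echo "$host:$port reachable"
  else
    echo "$host:$port NOT reachable"
    return 1
  fi
}

# Example with hypothetical host names:
#   for h in esx01.example.com esx02.example.com; do check_port "$h"; done
```

A "NOT reachable" result only shows that the connection could not be opened within the timeout. It does not distinguish a firewall block from a name resolution failure, so the DNS check later in this article is still worth running.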

If error 3014 is seen in addition to the 14009 error, check whether the vSphere account used to create the Infrastructure Profile has sufficient permissions to create disks and snapshots in the vSphere console.

 

Check whether the ESX host can be reached by its DNS name. In some cases, the 14009 error is thrown because the ESX host cannot be reached using its DNS name.

ping <ESX Host Name>
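Because ping can fail for reasons unrelated to DNS (for example, blocked ICMP), it can help to test name resolution on its own. The sketch below is an illustration only: the function name resolves is hypothetical, and it assumes a Linux environment with the getent command that uses the same DNS servers as the worker nodes.

```shell
# resolves HOST
# Succeeds if the system resolver returns at least one address for HOST.
resolves() {
  if addr=$(getent hosts "$1"); then
    # getent prints "ADDRESS NAME ..."; keep only the first address.
    echo "$1 resolves to ${addr%% *}"
  else
    echo "$1 does not resolve"
    return 1
  fi
}

# Example with a hypothetical host name:
#   resolves esx01.example.com
```

If the name does not resolve, the ESX host name registered in vCenter and the DNS configuration available to the cluster are the first things to compare.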

If the job runs for some time before the 14009 errors appear, the cause is the Kanister backup timeout, which is set to 45 minutes by default. Increase the "KanisterBackupTimeout" value to more than 45 minutes, for example to 600 minutes:

helm upgrade k10 kasten/k10 --namespace=kasten-io --reuse-values --set kanister.backupTimeout=600
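After the upgrade, the effective value can be read back from the release. This is a small sketch that assumes Helm 3 and the release name (k10) and namespace (kasten-io) used in the command above; the function name show_backup_timeout is hypothetical.

```shell
# show_backup_timeout [RELEASE] [NAMESPACE]
# Prints the backupTimeout line from the computed values of the release.
show_backup_timeout() {
  release=${1:-k10}
  ns=${2:-kasten-io}
  helm get values "$release" --namespace "$ns" --all --output yaml \
    | grep -i 'backupTimeout'
}

# Example:
#   show_backup_timeout
```

If the printed value is still 45, the upgrade most likely did not apply the new setting and the helm upgrade output should be reviewed.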

In some cases, the data-mover pod does not start and the logs show 14009 errors. In these cases, further troubleshooting is recommended to find out why the data-mover pod is not starting.
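A possible starting point for that troubleshooting is to describe the pod and review its status and events. The sketch below assumes that the pod name contains the string "data-mover" and that K10 runs in the kasten-io namespace; the function name describe_data_movers is hypothetical.

```shell
# describe_data_movers [NAMESPACE]
# Runs "kubectl describe" on every pod whose name contains "data-mover".
describe_data_movers() {
  ns=${1:-kasten-io}
  # "|| true" keeps the script going when grep finds no match.
  pods=$(kubectl get pods --namespace "$ns" --output name | grep 'data-mover' || true)
  if [ -z "$pods" ]; then
    echo "no pod with 'data-mover' in its name found in $ns"
    return 1
  fi
  for p in $pods; do
    kubectl describe --namespace "$ns" "$p"
  done
}

# Example:
#   describe_data_movers
```

The Events section at the end of the describe output usually shows scheduling, image pull, or volume mount problems that prevent a pod from starting.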