Issue starting a VZEN VM

Hello,
I am contacting you about VZEN and more particularly about the backup of these VMs.

For a client in Qatar, I deployed a VZEN cluster on ESXi. I also made a manual backup, i.e. a copy of the production VZEN VM's disk.
Then, to test that the copies worked, we started a VM from these disk backups, and it booted and ran fine.

Recently, we deployed another VZEN at another site (Abu Dhabi).

As with the Qatar VZEN, we made a copy of the production VZEN disk.

But when we start a VM from these disks, we get the following error:

I opened a ticket with Zscaler support, but they don’t seem to know how to fix this.

Could you help us with this problem?

I have searched everywhere on the Internet and could not find a fix for this issue.

Thank you.

Adrien.

Hi @Adrien_Maquin, for most of my customers, I do not recommend using VM backup and recovery. There’s no data on the VZEN (or on NSS and ZAB, for that matter) that needs a local backup.

If there is an issue, the general guidance is to redeploy the OVA, take a minute or two to configure the network/certificate, and everything just comes back up (the remaining config is pulled from the cloud). Considering VZENs must always be deployed in a cluster, this can be done without service impact.

Hello Scott,

Thank you for your reply.
Yes, the network and certificate configuration don’t take a lot of time.
But the “download-build” step takes 20 minutes at the affiliate where I deployed the VZEN.
And converting the VZEN OVA to a format compatible with our ESXi using the ESX tool takes another 15 minutes. So it takes some time to build another VZEN VM if the production VZEN VM crashes.
What solution can we use to avoid spending all this time building a new VZEN VM?

Thank you.

Adrien.

Hi Adrien,
If possible, I’d build some automation workflows for this, possibly triggered in the event of actually losing an instance. Generally, I’d steer away from cloning: even with the full build downloaded, cloned images are stored cold, and the threat databases (the bulk of the build) will age, which is why a redeploy is recommended.

As there will be a minimum of two instances in a cluster, the restore activity/time is de-risked, and if needed, you can always add more live instances to a cluster for increased resiliency.
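
To make the automation idea a bit more concrete, here is a minimal Python sketch of what such a workflow runner could look like. It is only an illustration under assumptions: the step labels and the `needs_redeploy` / `run_redeploy` helpers are hypothetical names, and each step is a placeholder you would implement with your own tooling (nothing here calls a real Zscaler or VMware API).

```python
import time
from typing import Callable, List, Tuple

# A step is a (label, action) pair. Every action is a placeholder that the
# operator would implement with their own tooling (hypothetical, not a real API).
Step = Tuple[str, Callable[[], None]]


def needs_redeploy(healthy_instances: int, expected_instances: int) -> bool:
    """Return True when the cluster has fewer healthy instances than expected."""
    return healthy_instances < expected_instances


def run_redeploy(steps: List[Step]) -> List[Tuple[str, float]]:
    """Run the steps in order and return (label, seconds) for each one.

    Any exception raised by a step propagates and aborts the workflow,
    so later steps never run against a half-built instance.
    """
    timings = []
    for label, action in steps:
        started = time.monotonic()
        action()
        timings.append((label, time.monotonic() - started))
    return timings
```

In practice the step list would mirror the manual procedure described in this thread (deploy the OVA, configure network/certificate, wait for the build download), and the per-step timings show where the rebuild time actually goes.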

Cheers,
@skottieb