Policy update fails intermittently

Hello folks,

All of our users have been experiencing intermittent policy update failures for a few days now. When manually refreshing the policy via the button in ZCC, we get the following error:

image

Sometimes it works instantly, sometimes we need 3-10 retries. After repeatedly pressing “refresh policy” it eventually works. We use ZCC V3.5.0.108.

Any ideas? Hints? ZPA/ZIA are running fine. We even tried forcibly removing one client and completely re-authenticating ZCC, which worked without issues.

Thanks and BR
Manuel

We have been facing this issue too, and noticed it in versions prior to 3.5.108 as well.

Hi Manuel,
Have you tried connecting the same laptop to another network and updating the policy? Normally this error indicates a network issue when communicating with our backend servers.
Try running ping -t against mobile.[cloudname].net and login.[cloudname].net and see if there are any timeouts.
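
If ICMP is blocked somewhere along the path, a TCP connect test can serve as a rough substitute for the ping. The following is only a sketch: it swaps ping for a plain TCP connection, and the port (443) and the helper names `tcp_check` and `probe` are assumptions made for illustration, not anything ZCC itself uses.

```python
import socket
import time


def tcp_check(host, port=443, timeout=3.0):
    """Return the connect time in seconds, or None if the connection failed."""
    start = time.monotonic()
    try:
        # create_connection resolves the name and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return None


def probe(hosts, attempts=5, port=443):
    """Try each host several times and return the number of failures per host."""
    return {
        host: sum(tcp_check(host, port) is None for _ in range(attempts))
        for host in hosts
    }
```

Calling `probe([...])` with the mobile and login hostnames of your own cloud returns a failure count per host. A count of 0 on a client that still shows the error would point away from basic connectivity.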

Hey Jamil,

thanks, but it happens for all of our users. Different clients, different locations (mostly homeoffices).
IMHO it seems to be an issue caused by something != network connection.

BR
Manuel

Yes, we see the same error on different networks: GRE tunnel, Tunnel 2, off/on trusted networks. Pings to both mobile.[cloudname].net and login.[cloudname].net are fine, no timeouts.

Also tried the mentioned pings without any issues.

Have you tested other Client Connector versions? You might need to report the issue to the support team so they can investigate it further and take it to engineering if needed.

No, did not test any other client yet. I would like to avoid downgrading issues…
At least I found some hints in ZSATray-Logs:

Working:

2021-09-08 14:17:36.170903(+0200)[3752:11904] INF UI: Main Form, Update Policy Label Clicked
2021-09-08 14:17:36.174903(+0200)[3752:11312] INF Keep alive req rpc sent
2021-09-08 14:17:37.746631(+0200)[3752:19796] INF RPC notification code: ZSATRAYMANAGER_INSTALL_FIREFOX_CERT
2021-09-08 14:17:37.760631(+0200)[3752:18812] INF Installing certificates for FireFox
[...]
2021-09-08 14:17:38.367632(+0200)[3752:15556] INF Pulling tray policy.
2021-09-08 14:17:39.262699(+0200)[3752:4412] INF RPC notification code: ZSATRAYMANAGER_SEND_KEEPALIVE_RESPONSE
2021-09-08 14:17:39.262699(+0200)[3752:11312] INF sendKeepAliveResponse: {"error":0,"errorMessage":"","logFetchTs":0,"loginName":"USERNAME","success":true}

Not Working:

2021-09-08 14:18:54.910859(+0200)[3752:11904] INF UI: Main Form, Update Policy Label Clicked
2021-09-08 14:18:54.914855(+0200)[3752:19796] INF Keep alive req rpc sent
2021-09-08 14:18:55.306341(+0200)[3752:10492] INF RPC notification code: ZSATRAYMANAGER_SEND_KEEPALIVE_RESPONSE
2021-09-08 14:18:55.306341(+0200)[3752:18812] INF sendKeepAliveResponse: {"error":0,"errorMessage":"","logFetchTs":0,"loginName":"USERNAME","success":false}
2021-09-08 14:18:55.307071(+0200)[3752:18812] INF Keep Alive failed: {"error":0,"errorMessage":"","logFetchTs":0,"loginName":"USERNAME","success":false}

Every time ZCC pulls the Firefox certs (for some reason?), it works. Looks like an application issue to me.
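
To compare many attempts without reading the log by hand, something like the following could pull the `success` flag out of each `sendKeepAliveResponse` line. It is a minimal sketch that assumes the line layout shown in the excerpts above; the function name `keepalive_results` is made up for illustration.

```python
import json
import re

# Timestamp, timezone offset, [pid:tid], level and message, as in the excerpts.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\([+-]\d{4}\)"
    r"\[\d+:\d+\] (?P<level>[A-Z]{3}) (?P<msg>.*)$"
)
PREFIX = "sendKeepAliveResponse: "


def keepalive_results(lines):
    """Yield (timestamp, success) for every sendKeepAliveResponse line."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        msg = match.group("msg")
        if msg.startswith(PREFIX):
            # The rest of the message is a JSON object with a "success" field.
            payload = json.loads(msg[len(PREFIX):])
            yield match.group("ts"), bool(payload.get("success"))


SAMPLE = [
    '2021-09-08 14:17:39.262699(+0200)[3752:11312] INF sendKeepAliveResponse: '
    '{"error":0,"errorMessage":"","logFetchTs":0,"loginName":"USERNAME","success":true}',
    '2021-09-08 14:18:55.306341(+0200)[3752:18812] INF sendKeepAliveResponse: '
    '{"error":0,"errorMessage":"","logFetchTs":0,"loginName":"USERNAME","success":false}',
]

for ts, ok in keepalive_results(SAMPLE):
    print(ts, "OK" if ok else "FAILED")
```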

BR
Manuel

Ticket opened, ID 03023186.

BR
Manuel

JFYI: confirmed as a known bug (ticket MO-4412). Affects zscloud and one other Zscaler cloud. Support stated it is now a P1 priority and a fix should arrive soon.

BR
Manuel

Hi,
We have/had the same issue. It seems to have started after the big ZSCloud login issue, when Zscaler applied a fix/maintenance. After the P1 for policy updates was resolved it seemed to get better, but we still sometimes get "Policy update failed" errors when doing a manual policy update. Zscaler support claimed our DNS is not OK instead of searching for a proper root cause. Was it completely solved for you, and are you also using zscloud.net?

Hello Andi,

no fix yet, it even got slightly worse after the latest ZEN updates. The ticket is still open, with no ETA for another fix. I do not think this is related to DNS issues, as all our users experience this, most of them in home offices with all kinds of different connection setups and providers.

Z-Support also stated we “are not the only company still seeing this issue pretty often in their environment”.

And yes, we are assigned to zscloud.net.

BR
Manuel

I’ve been seeing this on v3.5.0.100, and we had not seen it before. We’ll be evaluating v3.6.0.26 and will see if that version is also affected.

Still happening after updating to v3.6.0.26

Yep, same here. I think the ZCC version does not matter, as it seems to be a ZEN-related issue. But thanks for the verification.

Hi all,
do you still have this issue? This morning I manually updated policy around 50 times to test but did not get any error.

Regards,
Andi

Hello Andi,

definitely better, although not yet solved. Of course, I have to admit that manually updating the policy so frequently in such a short time is not really a valid scenario. It works now MOST of the time, and that's the important step forward.

	Line 2517: 2021-10-11 07:21:20.774704(+0200)[11140:15740] INF Keep Alive success
	Line 4855: 2021-10-11 07:21:23.492444(+0200)[11140:15740] INF Keep Alive success
	Line 7396: 2021-10-11 07:21:26.199813(+0200)[11140:15740] INF Keep Alive success
	Line 9821: 2021-10-11 07:21:29.045333(+0200)[11140:15740] ERR Keep Alive failed: {"message":"Error: please try again in few minutes"}
	Line 9891: 2021-10-11 07:21:32.966049(+0200)[11140:15740] INF Keep Alive success
	Line 12548: 2021-10-11 07:21:36.868555(+0200)[11140:15740] INF Keep Alive success
	Line 14915: 2021-10-11 07:21:41.146833(+0200)[11140:15740] INF Keep Alive success
	Line 17601: 2021-10-11 07:21:45.555845(+0200)[11140:15740] INF Keep Alive success
	Line 18808: 2021-10-11 07:21:50.115295(+0200)[11140:15740] INF Keep Alive success
	Line 21233: 2021-10-11 07:21:53.272805(+0200)[11140:15740] INF Keep Alive success
	Line 23571: 2021-10-11 07:21:56.227874(+0200)[11140:15740] ERR Keep Alive failed: {"message":"Error: please try again in few minutes"}
	Line 23608: 2021-10-11 07:22:00.262496(+0200)[11140:15740] INF Keep Alive success
	Line 26149: 2021-10-11 07:22:04.694365(+0200)[11140:15740] INF Keep Alive success
	Line 28516: 2021-10-11 07:22:09.046980(+0200)[11140:15740] INF Keep Alive success
	Line 30999: 2021-10-11 07:22:12.043242(+0200)[11140:17856] INF Keep Alive success
	Line 33279: 2021-10-11 07:22:15.944130(+0200)[11140:17856] INF Keep Alive success
	Line 35791: 2021-10-11 07:22:20.579317(+0200)[11140:17856] ERR Keep Alive failed: {"message":"Error: please try again in few minutes"}
	Line 35828: 2021-10-11 07:22:24.337144(+0200)[11140:17856] INF Keep Alive success
	Line 38630: 2021-10-11 07:22:27.445852(+0200)[11140:17856] INF Keep Alive success
	Line 40968: 2021-10-11 07:22:30.496433(+0200)[11140:17856] INF Keep Alive success
	Line 43538: 2021-10-11 07:22:33.583889(+0200)[11140:17856] INF Keep Alive success
	Line 45905: 2021-10-11 07:22:36.443031(+0200)[11140:17856] INF Keep Alive success
	Line 48446: 2021-10-11 07:22:39.276382(+0200)[11140:17856] INF Keep Alive success
	Line 50784: 2021-10-11 07:22:41.982466(+0200)[11140:17856] INF Keep Alive success
	Line 53151: 2021-10-11 07:22:44.927333(+0200)[11140:17856] ERR Keep Alive failed: {"message":"Error: please try again in few minutes"}
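
In the excerpt above, 4 of 25 attempts failed. A tally like that could be automated with a sketch such as the one below. It assumes the "Keep Alive success" / "Keep Alive failed" wording shown in these logs, and the function name `keepalive_failure_rate` is hypothetical.

```python
import re

# Matches both outcomes regardless of log level or any "Line N:" prefix.
KEEPALIVE_RE = re.compile(r"Keep Alive (success|failed)")


def keepalive_failure_rate(lines):
    """Return (successes, failures, failure_ratio) for the given log lines."""
    ok = failed = 0
    for line in lines:
        match = KEEPALIVE_RE.search(line)
        if not match:
            continue
        if match.group(1) == "success":
            ok += 1
        else:
            failed += 1
    total = ok + failed
    return ok, failed, (failed / total if total else 0.0)


SAMPLE = [
    "2021-10-11 07:21:26.199813(+0200)[11140:15740] INF Keep Alive success",
    '2021-10-11 07:21:29.045333(+0200)[11140:15740] ERR Keep Alive failed: '
    '{"message":"Error: please try again in few minutes"}',
    "2021-10-11 07:21:32.966049(+0200)[11140:15740] INF Keep Alive success",
]

ok, failed, ratio = keepalive_failure_rate(SAMPLE)
print(f"{ok} ok, {failed} failed ({ratio:.0%} failure rate)")
```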

We’ll see how it plays out during business hours here.

BR
Manuel

Update: got word from support it has been fixed. And indeed, looks good for now.
