Zscaler App Connector - Performance and Troubleshooting

Zscaler App Connectors are deployed in customer environments to provide connectivity to client applications. The Zscaler App Connector is provided as an OVA for installation in VMware environments and as an AMI for deployment in AWS. In both cases it is a CentOS 7 image which has been hardened by removing unnecessary services and listeners. The App Connector is also available as an RPM for installation on Enterprise Linux platforms.

In all cases, once deployed, the customer is responsible for operating system patching and maintenance. Regular maintenance, log rotation and operating system monitoring are all important. The App Connector OVA/AMI is also updated periodically with a patched operating system and the latest software release, so App Connectors can be re-deployed using standard DevSecOps processes to keep them at a current baseline.

Once the base App Connector is deployed, or the RPM installed, there are several changes which should be considered to ensure the Zscaler processes run appropriately and make best use of the available resources. This document describes the changes which may be necessary.

Changes to Operating System

Network Interface
DHCP may be used for connectors, especially in IaaS environments. However, best practice is to use a static IP address. If IPv6 is not available on the network, disable it on the network interface as well.
Configuring DNS servers which are in the same location as, or geographically close to, the App Connector will keep DNS response times low. Whilst ZPA caches responses, DNS performance is critical to operation.

#Configure Network Interface    
cat > /etc/sysconfig/network-scripts/ifcfg-eth0 <<-EOT
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
TYPE=Ethernet
USERCTL=yes
IPADDR=<IPADDR>
PREFIX=<PREFIX LENGTH, e.g. 24>
GATEWAY=<IPGATEWAY>
DNS1=<DNS SERVER 1>
DNS2=<DNS SERVER 2>
DEFROUTE=yes
IPV6INIT=no
EOT

Once the network settings have been applied, the network service must be restarted to re-read the configuration. Stop the Zscaler Private Access process before restarting the network so that it picks up the changes when it starts again. Since this will interrupt network connectivity, it is preferable to perform this from the VM console.

#Restart Network Interfaces
sudo systemctl stop zpa-connector
sudo systemctl restart network
#Verify the new network settings work (for example, ping the gateway) before starting the connector
sudo systemctl start zpa-connector

IPv6
IPv6 is enabled by default, but it may not be needed in the network the App Connector is deployed in. If the App Connector does not have an IPv6 address, it will not be able to connect to IPv6 applications.

When a client requests an application, the App Connector makes both a DNS A record lookup and a DNS AAAA record lookup to resolve it. If the DNS server cannot answer the AAAA query, this can delay resolution. Best practice is to disable IPv6 if it is not available on the DNS server or in the network.

#Disable IPv6 Entirely 
sysctl -w net.ipv6.conf.all.disable_ipv6=1
sysctl -w net.ipv6.conf.default.disable_ipv6=1
#Persist Change Across Reboot
cat >> /etc/sysctl.d/99-sysctl.conf <<-EOT
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
EOT
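
To confirm the change took effect, the kernel exposes the current value under /proc/sys. The following is a minimal sketch; the function name ipv6_state and its optional file argument are illustrative and not part of any Zscaler tooling.

```shell
#!/usr/bin/env bash
# Report whether IPv6 is disabled, based on a kernel flag file.
# The optional argument lets the function be pointed at another flag file
# (for example a per-interface one); it defaults to the "all" setting.
ipv6_state() {
    local flag_file="${1:-/proc/sys/net/ipv6/conf/all/disable_ipv6}"
    if [ ! -r "$flag_file" ]; then
        echo "unknown"
        return 1
    fi
    if [ "$(cat "$flag_file")" = "1" ]; then
        echo "disabled"
    else
        echo "enabled"
    fi
}
```

Running ipv6_state with no arguments after the sysctl commands above should print "disabled".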

DNS

DNS is important to all functions across Zscaler Private Access. The configuration changes on the network adaptor ensure an IPv6 address is not initialized (if appropriate), and the IPv6 settings ensure AAAA records are not requested (if appropriate).

Linux handles /etc/resolv.conf as an ordered preference list, using secondary entries only as backups. The App Connector processes handle /etc/resolv.conf as a round-robin pool, load-balancing between all entries.

It’s important to ensure the DNS servers are used correctly. DNS servers can get overloaded, especially since Zscaler Private Access will generate a large number of queries based on the users the connector is serving. Similarly, it’s important to ensure the DNS servers are functioning: since ZPA rotates between the DNS servers in /etc/resolv.conf, no DNS server should be offline. ZPA re-reads the DNS servers from /etc/resolv.conf every 5 minutes to ensure consistency.

The timeout values and retries could be tailored to the specific environment. Rotating through DNS servers will ensure a single DNS server doesn’t take all the requests, and having appropriate timeout (how long to wait for a response) and retries (how many times to ask the same DNS server for a request) will ensure a single DNS server isn’t saturated.

ZPA also uses DNS to locate the Zscaler Private Access Service Edge (public or private) it connects to. For public DNS resolution, it is important that all DNS servers return the same response for Zscaler entries, for example co2br.prod.zpath.net.

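#Rotate through DNS servers, with a 1 second timeout and 1 attempt per server
#(the resolv.conf keyword for the retry count is "attempts")
echo "options rotate timeout:1 attempts:1" >> /etc/resolv.conf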
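
Because every server in /etc/resolv.conf receives a share of the queries, it can help to check each one individually. The sketch below assumes dig (from bind-utils) is installed; the function names and the hostname you pass in are illustrative.

```shell
#!/usr/bin/env bash
# Print the nameserver addresses listed in a resolv.conf-style file.
list_nameservers() {
    local conf="${1:-/etc/resolv.conf}"
    awk '$1 == "nameserver" { print $2 }' "$conf"
}

# Query every nameserver once for NAME and print the query time dig reports,
# or "no response" if the server did not answer within one second.
check_nameservers() {
    local name="$1" conf="${2:-/etc/resolv.conf}" ns
    for ns in $(list_nameservers "$conf"); do
        printf '%s: ' "$ns"
        dig +time=1 +tries=1 "@${ns}" "$name" A | grep 'Query time' || echo 'no response'
    done
}
```

For example, check_nameservers co2br.prod.zpath.net queries each configured server in turn, so a slow or offline server stands out.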

Increase Ephemeral Ports

The Zscaler App Connector’s IP address is used as the source for connections initiated to applications. Every connection uses a TCP or UDP source port, and once a TCP connection is closed its port is held in a TIME_WAIT state before it can be re-used. The default port range is 32768-60999 (28232 available ports each for UDP and TCP). Increasing this port range helps ensure the App Connector does not run out of source ports. This is especially important for Active Directory, which generates a lot of CLDAP and LDAP traffic during AD Site discovery.

#Increase Port Range 
sysctl -w net.ipv4.ip_local_port_range="1024 65000"
#Persist Change Across Reboot
cat >> /etc/sysctl.d/99-sysctl.conf <<-EOT
net.ipv4.ip_local_port_range = 1024 65000
EOT
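
The port range is inclusive, so its size is the upper bound minus the lower bound plus one. As a small sketch of that arithmetic (the function names are illustrative):

```shell
#!/usr/bin/env bash
# Number of usable source ports in the inclusive range LOW..HIGH.
port_count() {
    echo $(( $2 - $1 + 1 ))
}

# Read the live range from the kernel and report its size.
current_port_count() {
    local low high
    read -r low high < /proc/sys/net/ipv4/ip_local_port_range
    port_count "$low" "$high"
}
```

port_count 32768 60999 prints 28232 and port_count 1024 65000 prints 63977. To see how many TCP ports are currently held in TIME_WAIT, ss -tan state time-wait | wc -l gives an approximate count (it includes one header line).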

Zscaler Debug Parameters

Occasionally Zscaler Engineering may ask for enhanced debug statistics from a connector during a troubleshooting session. After a debug session, it’s important to ensure these statistics are disabled again. Whilst leaving them enabled should not cause an issue, it’s best practice to revert to the base configuration. The flags can only be changed while the Zscaler processes are running.

#Update debug flags (URLs are quoted so the shell does not treat "?" as a wildcard)
curl "http://localhost:9000/debug/fohh?value=0"
curl "http://localhost:9000/debug/wally?value=0"
curl "http://localhost:9000/debug/zpath_lib?value=0"
curl "http://localhost:9000/debug/zpn?value=0"
curl "http://localhost:9000/debug/assistant?value=0"
curl "http://localhost:9000/debug/zhealth?value=0"
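
The same reset can be written as a loop so that no module is missed. This is only a sketch: the module names are taken from the commands above, while the function and variable names are illustrative.

```shell
#!/usr/bin/env bash
# Module names taken from the debug endpoints listed above.
ZPA_DEBUG_MODULES="fohh wally zpath_lib zpn assistant zhealth"

# Print the URL that sets one module's debug flag to the given value.
debug_url() {
    local module="$1" value="${2:-0}"
    echo "http://localhost:9000/debug/${module}?value=${value}"
}

# Reset every module to 0; the connector process must be running.
reset_debug_flags() {
    local module
    for module in $ZPA_DEBUG_MODULES; do
        curl -s "$(debug_url "$module" 0)" || echo "failed: $module" >&2
    done
}
```

Calling reset_debug_flags at the end of a troubleshooting session returns all six flags to 0 and reports any module that could not be reached.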

Log Rotation

The Zscaler Private Access App Connector writes logs to /var/log/messages via the systemd journal. On any system it is important to ensure the log files do not fill the disk partition. Appropriate log rotation takes care of this and compresses older log files. This update should be customized to match the customer's standard log rotation; for the base AMI/OVA, the configuration below rotates the logs daily, keeps 7 days' worth of logs, and compresses old log files as they are rotated.

#Update log rotation
cat > /etc/logrotate.conf <<-EOT
# rotate log files daily
daily
# keep 7 days worth of backlogs
rotate 7
# create new (empty) log files after rotating old ones
create
# use date as a suffix of the rotated file
dateext
# compress rotated log files
compress
# RPM packages drop log rotation information into this directory
include /etc/logrotate.d
# no packages own wtmp and btmp -- we'll rotate them here
/var/log/wtmp {
    monthly
    create 0664 root utmp
    minsize 1M
    rotate 1
}
/var/log/btmp {
    missingok
    monthly
    create 0600 root utmp
    rotate 1
}
# system-specific logs may also be configured here.
EOT
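
After changing the configuration, logrotate -d /etc/logrotate.conf performs a dry run and reports what would be rotated without touching any files. To keep an eye on the partition itself, a small helper along these lines can be used; the function name is illustrative and it relies only on the POSIX output format of df.

```shell
#!/usr/bin/env bash
# Print the percentage of space used on the filesystem holding a path.
# Defaults to /var/log, where the rotated logs are stored.
disk_used_percent() {
    df -P "${1:-/var/log}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

The number printed can be compared against a threshold in a monitoring job, for example to alert when /var/log is more than 80 percent full.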

Troubleshooting

When troubleshooting Zscaler Private Access App Connectors, it’s worth considering all the above changes, as well as the CPU/Memory/Disk allocated to the VM/OS. Also look at the TLS connections/second to ensure the CPUs are capable of processing the volume of mTunnels created during application access.
The script below can be run on a connector to collect the parameters needed to investigate offline. Running it across all connectors makes it possible to check for consistent system configuration and to identify any connector “hotspots” or “notspots” which could be addressed through policy/configuration.

#!/usr/bin/bash
#scp zpa-diag.sh connector.domain.com:/tmp
#ssh -t connector.domain.com '/tmp/zpa-diag.sh'
#scp 'connector.domain.com:/tmp/zpa-diag*.tar.gz' ./
#Some commands need ROOT - run this script as root, or sudo ./zpa-diag.sh
#or run visudo and add the following line to the end of the sudoers file
#admin ALL=(ALL) NOPASSWD: /usr/sbin/lsof, /usr/sbin/ss, /usr/bin/openssl

exec 3>&2
exec 2> /dev/null
echo Creating Diagnostics Directory
mkdir -p /tmp/zpa-diag
#Following commands require ROOT.  Either run script as root, or edit SUDOERS as above
#If you run script as root, remove sudo commands below
CID=$(sudo openssl x509 -subject -noout -in /opt/zscaler/var/cert.pem | cut -d '=' -f 4 | cut -d '-' -f 2 | cut -d '.' -f 1)
sudo lsof -n -P > /tmp/zpa-diag/lsof-output.txt
sudo lsof -n | wc -l >/tmp/zpa-diag/lsof-opencount.txt
sudo ss -s > /tmp/zpa-diag/ss.txt

echo "Connector ID = $CID"
echo "$CID" >> "/tmp/zpa-diag/$CID"
echo Collecting AWS Instance Type
curl --connect-timeout 2 http://169.254.169.254/latest/meta-data/instance-type -o /tmp/zpa-diag/instance-type
curl --connect-timeout 2 http://169.254.169.254/latest/meta-data/placement/availability-zone -o /tmp/zpa-diag/availability-zone
echo Collecting Azure Instance Type
curl --connect-timeout 2  -H metadata:true "http://169.254.169.254/metadata/instance/compute/vmSize?api-version=2017-08-01&format=text" -o /tmp/zpa-diag/azure-instance-type
curl --connect-timeout 2  -H metadata:true "http://169.254.169.254/metadata/instance/compute/location?api-version=2017-08-01&format=text" -o /tmp/zpa-diag/azure-availability-zone

echo Running Openssl Checks
echo openssl speed -evp aes-256-cbc > /tmp/zpa-diag/openssl.txt
openssl speed -evp aes-256-cbc >> /tmp/zpa-diag/openssl.txt
echo >> /tmp/zpa-diag/openssl.txt
echo openssl speed aes-256-cbc >> /tmp/zpa-diag/openssl.txt
openssl speed aes-256-cbc >> /tmp/zpa-diag/openssl.txt

echo Collecting Journal
journalctl > /tmp/zpa-diag/journal.log
journalctl -u zpa-connector -S -1m | grep Mtunnels >/tmp/zpa-diag/mtunnels.txt

echo Collecting CPU/Memory Info
echo Memory Report
date >> /tmp/zpa-diag/memory_report.txt
echo >> /tmp/zpa-diag/memory_report.txt
echo UNAME >> /tmp/zpa-diag/memory_report.txt
uname -a >> /tmp/zpa-diag/memory_report.txt
echo >> /tmp/zpa-diag/memory_report.txt
echo HOSTNAME >> /tmp/zpa-diag/memory_report.txt
hostname >> /tmp/zpa-diag/memory_report.txt
echo >> /tmp/zpa-diag/memory_report.txt
echo LSCPU >> /tmp/zpa-diag/memory_report.txt
lscpu >> /tmp/zpa-diag/memory_report.txt
echo >> /tmp/zpa-diag/memory_report.txt
echo /PROC/CPUINFO >> /tmp/zpa-diag/memory_report.txt
cat /proc/cpuinfo >> /tmp/zpa-diag/memory_report.txt
echo >> /tmp/zpa-diag/memory_report.txt
echo /PROC/MEMINFO >> /tmp/zpa-diag/memory_report.txt
cat /proc/meminfo >> /tmp/zpa-diag/memory_report.txt
echo >> /tmp/zpa-diag/memory_report.txt
echo Processes >> /tmp/zpa-diag/memory_report.txt
echo "ps aux --sort=-pmem | head -5" >> /tmp/zpa-diag/memory_report.txt
ps aux --sort=-pmem | head -5 >> /tmp/zpa-diag/memory_report.txt
echo >> /tmp/zpa-diag/memory_report.txt
echo "curl -s 127.0.0.1:9000/memory/status" >> /tmp/zpa-diag/memory_report.txt
curl -s 127.0.0.1:9000/memory/status >> /tmp/zpa-diag/memory_report.txt
echo >> /tmp/zpa-diag/memory_report.txt
echo "curl -s 127.0.0.1:9000/memory/argo" >> /tmp/zpa-diag/memory_report.txt
curl -s 127.0.0.1:9000/memory/argo >> /tmp/zpa-diag/memory_report.txt

echo Collecting File Descriptors
echo sysctl fs.file-max > /tmp/zpa-diag/file_descriptors.txt
sysctl fs.file-max >> /tmp/zpa-diag/file_descriptors.txt
echo >> /tmp/zpa-diag/file_descriptors.txt
echo ulimit -Hn >> /tmp/zpa-diag/file_descriptors.txt
ulimit -Hn >> /tmp/zpa-diag/file_descriptors.txt
echo >> /tmp/zpa-diag/file_descriptors.txt
echo ulimit -Sn >> /tmp/zpa-diag/file_descriptors.txt
ulimit -Sn >> /tmp/zpa-diag/file_descriptors.txt

echo Collecting Disk Utilisation
echo DISK Utilisation >> /tmp/zpa-diag/disk_report.txt
df -h >> /tmp/zpa-diag/disk_report.txt

echo Collecting Port Range
#Save the live port range and the sysctl configuration files
mkdir -p /tmp/zpa-diag/portrange
cat /proc/sys/net/ipv4/ip_local_port_range > /tmp/zpa-diag/portrange/ip_local_port_range
cp /etc/sysctl.conf /tmp/zpa-diag/portrange
cp /etc/sysctl.d/* /tmp/zpa-diag/portrange
sysctl net.ipv4.ip_local_port_range >> /tmp/zpa-diag/portrange/current


echo Resolving co2br.prod.zpath.net - performing MTR
echo resolved IPs
#dig +short prints one answer per line; keep only the IPv4 addresses
IPS=$(dig +short co2br.prod.zpath.net A | grep -E '^[0-9]+(\.[0-9]+){3}$')
echo "$IPS"
for x in $IPS
do
    echo MTR to $x
    mtr -rnc5 "$x" > "/tmp/zpa-diag/mtr-$x.txt"
done
cp /etc/resolv.conf /tmp/zpa-diag
cp /etc/hosts /tmp/zpa-diag

echo Collecting ZPA Statistics
curl '127.0.0.1:9000/debug'  >> /tmp/zpa-diag/connector_debug_state.txt
curl '127.0.0.1:9000/assistant/dns/state/dump' >> /tmp/zpa-diag/connector_dns_state_dump.txt
curl '127.0.0.1:9000/assistant/app/dump/state_summary' >> /tmp/zpa-diag/connector_app_state_summary.txt
curl '127.0.0.1:9000/assistant/data/mtunnel/dump/stats' >> /tmp/zpa-diag/connector_mtunnel_stats.txt
ls -lR /opt/zscaler/ >> /tmp/zpa-diag/dir.txt
cp /opt/zscaler/var/version /tmp/zpa-diag
cp /opt/zscaler/var/updater.version /tmp/zpa-diag
uptime >> /tmp/zpa-diag/uptime.txt



tar -zcvf /tmp/zpa-diag-$CID.tar.gz /tmp/zpa-diag/*
rm -rf /tmp/zpa-diag
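
To run the diagnostics across many connectors, the three commands from the script's header comments can be wrapped in a loop. The sketch below assumes a text file with one connector hostname per line and SSH access to each host; the function names and the DRY_RUN variable are illustrative.

```shell
#!/usr/bin/env bash
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Copy the script to one connector, run it, and fetch the resulting archive.
collect_from_host() {
    local host="$1"
    run scp zpa-diag.sh "${host}:/tmp" &&
    run ssh -t "$host" /tmp/zpa-diag.sh &&
    run scp "${host}:/tmp/zpa-diag*.tar.gz" ./
}

# Process every hostname in a file (one per line, blank lines skipped).
# The file is read on descriptor 3 so ssh keeps the terminal on stdin.
collect_from_hosts() {
    local hosts_file="$1" host
    while read -r host <&3; do
        [ -n "$host" ] || continue
        collect_from_host "$host" || echo "failed: $host" >&2
    done 3< "$hosts_file"
}
```

Running DRY_RUN=1 collect_from_hosts connectors.txt first prints the commands for review; running it without DRY_RUN executes them and leaves one archive per connector in the current directory.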