How to Back Up Your ZPA Configuration via the ZPA Terraform Provider to an AWS S3 Backend (Part 1)

One of the questions asked by some ZPA administrators is how they can back up and eventually restore their ZPA configuration. As of the writing of this post, ZPA does not provide such a capability; with that said, however, it is possible to combine the (Unofficial) Zscaler Private Access Terraform Provider and the AWS Terraform Provider to achieve just that. The diagram below provides a high-level understanding of the flow, which I will walk through in this article/demo series.

In this post, I am going to discuss how you can use the ZPA Terraform provider along with the AWS Terraform provider to back up and store your ZPA configuration in an AWS S3 bucket, while at the same time leveraging AWS DynamoDB for state locking to prevent concurrent or conflicting updates that could lead to data loss or even corruption of the ZPA Terraform state file.

Here are the topics I’ll cover in this two-part series:

  • What is Terraform state?
  • Shared storage for state files
  • Locking state files
  • Enabling remote state storage with an AWS S3 bucket (Part 2)

What is Terraform state?

First things first, it is important to understand what the Terraform state file is. Imagine that you used the ZPA Terraform Provider to push configurations to the ZPA platform. The question that then comes up is: how does Terraform know which resources it is supposed to manage after the configuration is pushed? You could have all sorts of resources in your ZPA tenant, deployed either manually via the UI or via Terraform, so the question remains: how does Terraform know which resources it’s responsible for?

The answer is that Terraform records information about the ZPA resources it created in a Terraform state file. By default, when you run Terraform in a folder, e.g. /foo/bar, Terraform creates the file /foo/bar/terraform.tfstate.
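
As a quick illustration, after you run terraform apply in that folder, the state file sits alongside your configuration (the main.tf file name here is just a placeholder):

$ cd /foo/bar
$ terraform apply
...
$ ls
main.tf  terraform.tfstate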

This file contains a custom JSON format that records a mapping from the Terraform resources in your templates to the representation of those resources in the real world. For example, let’s say your Terraform template contained the following:

  resource "zpa_segment_group" "example" {
   name                   = "Example"
   description            = "Example"
   enabled                = true
   tcp_keep_alive_enabled = "1"
 }

After running terraform apply, the terraform.tfstate file will look something like this:

{
  "version": 4,
  "terraform_version": "1.0.7",
  "serial": 1,
  "lineage": "3fa5123d-ecce-1378-f258-eaea0e6e615f",
  "outputs": {},
  "resources": [
    {
      "mode": "managed",
      "type": "zpa_segment_group",
      "name": "Example",
      "provider": "provider[\"zscaler.com/zpa/zpa\"]",
      "instances": [
        {
          "schema_version": 0,
          "attributes": {
            "applications": [],
            "config_space": "DEFAULT",
            "description": "Example",
            "enabled": true,
            "id": "216196257331290913",
            "name": "Example",
            "tcp_keep_alive_enabled": "1"
          },
          "sensitive_attributes": [],
          "private": "bnVsbA=="
        }
      ]
    }
  ]
}

Using this simple JSON format, Terraform knows that a resource with type zpa_segment_group and name Example corresponds to a Segment Group in your ZPA tenant with ID 216196257331290913. Every time you run Terraform, it can fetch the latest status of this Segment Group from ZPA and compare it to what’s in your Terraform configuration to determine what changes need to be applied. In other words, the output of the terraform plan command is a diff between the code on your computer and the infrastructure deployed in the real world, as discovered via the IDs in the state file.
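
For example, if you changed the description in the template above, an abridged (and purely illustrative) plan run might look something like this:

$ terraform plan
zpa_segment_group.example: Refreshing state... [id=216196257331290913]

Terraform will perform the following actions:

  # zpa_segment_group.example will be updated in-place
  ~ resource "zpa_segment_group" "example" {
      ~ description = "Example" -> "Updated example"
        id          = "216196257331290913"
        name        = "Example"
        # (remaining attributes unchanged)
    }

Plan: 0 to add, 1 to change, 0 to destroy.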

If you’re using Terraform for a personal project, storing state in a local terraform.tfstate file works just fine. But if you want to use Terraform as a team in a real production environment, you run into two problems:

  1. Shared storage for state files: To be able to use Terraform to update your ZPA infrastructure, each of your team members needs access to the same Terraform state files. That means you need to store those files in a shared location.

  2. Locking state files: As soon as data is shared, you run into a new problem: locking. Without locking, if two team members are running Terraform at the same time, you may run into race conditions as multiple Terraform processes make concurrent updates to the state files, leading to conflicts, data loss, and state file corruption.

In the following sections, I’ll dive into each of these problems and show you how to solve them.

Shared storage for state files

The most common technique for allowing multiple team members to access a common set of files is to put them in version control (e.g., Git). With Terraform state, this is a bad idea for the following reasons:

  1. Manual error: It’s too easy to forget to pull down the latest changes from version control before running Terraform, or to push your latest changes to version control after running Terraform. It’s just a matter of time before someone on your team runs Terraform with out-of-date state files and, as a result, accidentally rolls back or duplicates previous deployments.

  2. Locking: Most version control systems do not provide any form of locking that would prevent two team members from running terraform apply on the same state file at the same time.

  3. Secrets: All data in Terraform state files is stored in plain text. This is a problem because certain Terraform resources need to store sensitive data. For example, if you use the zpa_provisioning_key resource to create a Provisioning Key, Terraform will store the provisioning_key for, say, an App Connector Group or Service Edge Group in the state file in plain text. Storing plain-text provisioning keys, or any type of secret, anywhere is a bad idea, and that includes version control.
    Note: Speaking of securely storing provisioning keys, you may want to look at my previous article, ZPA App Connector Deployment in AWS Using Terraform.

Back to the topic at hand, instead of using version control, the best way to manage shared storage for ZPA state files is to use Terraform’s built-in support for remote backends. A Terraform backend determines how Terraform loads and stores state. The default backend, which you’ve been using this whole time, is the local backend, which stores the state file on your local disk. Remote backends allow you to store the state file in a remote, shared store. A number of remote backends are supported, including Amazon S3, Azure Storage, Google Cloud Storage, and HashiCorp’s Terraform Pro and Terraform Enterprise.
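
As a preview of what Part 2 covers in detail, here is a minimal sketch of an S3 backend configuration; the bucket, key, region, and table names below are hypothetical placeholders:

terraform {
  backend "s3" {
    # Hypothetical bucket and key; replace with your own values
    bucket         = "my-zpa-terraform-state"
    key            = "zpa/terraform.tfstate"
    region         = "us-east-1"

    # Encrypt the state file at rest in S3
    encrypt        = true

    # Hypothetical DynamoDB table used for state locking
    dynamodb_table = "zpa-terraform-locks"
  }
}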

Remote backends solve all three of the issues listed above:

  1. Manual error: Once you configure a remote backend, Terraform will automatically load the state file from that backend every time you run plan or apply and it’ll automatically store the ZPA state file in that backend after each apply, so there’s no chance of manual error.

  2. Locking: Most of the remote backends natively support locking. To run terraform apply, Terraform will automatically acquire a lock; if someone else is already running apply, they will already have the lock, and you will have to wait. You can run apply with the -lock-timeout=<TIME> parameter to tell Terraform to wait up to TIME for a lock to be released (e.g., -lock-timeout=10m will wait for 10 minutes). With the S3 backend, locking is provided by a DynamoDB table; see the sketch after this list.

  3. Secrets: Most of the remote backends natively support encryption in transit and encryption on disk of the state file. Moreover, those backends usually expose ways to configure access permissions (e.g., using IAM policies with an S3 bucket), so you can control who has access to your ZPA state files and the secrets they may contain. It would still be better if Terraform natively supported encrypting secrets within the state file, but these remote backends reduce most of the security concerns, as at least the state file isn’t stored in plain text on disk anywhere.
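
For the S3 backend, locking is implemented with a DynamoDB table whose partition key must be named LockID. Here is a minimal sketch of creating one with the AWS provider; the table name is a hypothetical placeholder:

resource "aws_dynamodb_table" "terraform_locks" {
  name         = "zpa-terraform-locks" # hypothetical name
  billing_mode = "PAY_PER_REQUEST"     # on-demand pricing, no capacity planning

  # The S3 backend requires the partition key to be named "LockID"
  hash_key = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}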

If you’re using Terraform with AWS, Amazon S3 (Simple Storage Service), which is Amazon’s managed file store, is typically your best bet as a remote backend for the following reasons:

  • It’s a managed service, so you don’t have to deploy and manage extra infrastructure to use it.
  • It’s designed for 99.999999999% durability and 99.99% availability, which means you don’t have to worry too much about data loss or outages.
  • It supports encryption, which reduces worries about storing sensitive data in state files. Anyone on your team who has access to that S3 bucket will be able to see the state files in an unencrypted form, so this is still a partial solution, but at least the data will be encrypted at rest (S3 supports server-side encryption using AES-256) and in transit (Terraform uses SSL to read and write data in S3).
  • It supports locking via DynamoDB, as discussed above.
  • It supports versioning, so every revision of your state file is stored, and you can roll back to an older version if something goes wrong (see the sketch after this list).
  • It’s inexpensive, with most Terraform usage easily fitting into the free tier.
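
To make those last few points concrete, here is a minimal sketch of creating such a bucket with versioning and server-side encryption enabled; it follows the AWS provider v4+ resource layout, and the bucket name is a hypothetical placeholder:

resource "aws_s3_bucket" "terraform_state" {
  bucket = "my-zpa-terraform-state" # hypothetical name; must be globally unique
}

# Keep every revision of the state file so you can roll back if something goes wrong
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Encrypt the state file at rest using AES-256 server-side encryption
resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}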

In Part 2 of this article/demo series, I will walk step by step through how to enable remote state storage with an AWS S3 bucket, so you can store your ZPA provider state file in a secure and straightforward fashion.
