Multiple GRE Tunnels for Load Sharing on Cisco ASR1000

Hi All,

I want to set up 2 active GRE tunnels so we can push more than 1 Gbps of traffic, but I cannot find any documentation for the Cisco ASR1000. (The WAN link bandwidth is 10 Gbps, and we might scale to more GRE tunnels in the long run.)
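For context, the starting point I have in mind is simply two tunnel interfaces towards Zscaler, roughly like the sketch below (the addressing, tunnel sources and Zscaler endpoints are placeholders, not our real values):

  interface Tunnel1
   description GRE to Zscaler DC 1 (placeholder endpoint)
   ip address 172.16.1.1 255.255.255.252
   tunnel source Loopback1
   tunnel destination 203.0.113.10
  !
  interface Tunnel2
   description GRE to Zscaler DC 2 (placeholder endpoint)
   ip address 172.16.2.1 255.255.255.252
   tunnel source Loopback2
   tunnel destination 203.0.113.20

The open question is how to steer traffic into both tunnels at the same time.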

My research in the Zscaler KB only suggested PBR on the Cisco router, but PBR requires a large amount of reconfiguration whenever I need to add or remove a specific route-map/ACL entry. In addition, PBR ends up pinning a specific subnet to one particular active GRE tunnel, so that subnet is still restricted to the 1 Gbps bandwidth limit, which is not desirable.
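For illustration, the PBR setup from the KB looks roughly like this (the subnets, ACL names and interface names are only placeholders): every time a subnet is added, removed or moved, the ACLs and route-map have to be edited, and each matched subnet stays pinned to a single tunnel.

  ! hypothetical source subnets, one ACL per tunnel
  ip access-list extended LAN-TO-TUNNEL1
   permit ip 10.1.0.0 0.0.255.255 any
  ip access-list extended LAN-TO-TUNNEL2
   permit ip 10.2.0.0 0.0.255.255 any
  !
  ! each route-map entry pins the matched subnet to one tunnel
  route-map ZSCALER-PBR permit 10
   match ip address LAN-TO-TUNNEL1
   set interface Tunnel1
  route-map ZSCALER-PBR permit 20
   match ip address LAN-TO-TUNNEL2
   set interface Tunnel2
  !
  interface GigabitEthernet0/0/1
   description LAN-facing interface (placeholder)
   ip policy route-map ZSCALER-PBR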

The ideal outcome I can think of is to have 2 active GRE tunnels, with every subnet able to forward traffic to Zscaler via either tunnel, in round-robin fashion or via whichever tunnel is carrying less load at the time, so that we consume the WAN link bandwidth more effectively.

Is there a good way to load-share traffic over 2 GRE tunnels, and which load-sharing mechanism should we use?

I would also like to know if there are any limitations or cautions, e.g.:

  • Must not be static/dynamic route
  • Must not be per-packet load sharing
  • Must be source IP based load sharing

Thanks,
Sean

Have you been able to find any additional answers on this? @Mithun would you be able to point us in the right direction?

Hello Sean,

You have to be mindful of the session persistence used by some of the servers your users may be consuming. Round-robin load balancing, which I assume means per-packet load balancing, will achieve the ~50/50 load split across the two ACTIVE GRE tunnels; however, it will definitely break connectivity to servers that enforce an IP/HTTP persistence check.

Let me give you an example of what I mean. Let’s assume there is a server “server.domain.com” with an IP persistence check in place. If a client sends an HTTP request via GRE tunnel 1, it lands on proxy 1 in our backend, and the GET for the index.html page reaches the server from egress IP 1.1.1.1. The server responds with a JSESSIONID cookie and maps the IP 1.1.1.1 to that JSESSIONID.

After index.html is served to IP 1.1.1.1, it contains redirects and JS, CSS, font, etc. resources that also have to be fetched. These are of course retrieved over additional HTTP and TCP sessions, and definitely over more than a single IP packet.

These subsequent IP packets have a 50% chance of being round-robin load-balanced over the other GRE tunnel 2. That means they land on another proxy in our backend and no longer get the egress IP of 1.1.1.1; let’s assume they get IP 2.2.2.2 instead. The client, however, is the same, and it injects the JSESSIONID cookie into each call. When the request hits the server, it has traversed proxy 2 and GRE tunnel 2 and therefore arrives from another IP (2.2.2.2). This violates the IP/HTTP persistence on the server, whose table has an entry mapping that JSESSIONID to 1.1.1.1 and not 2.2.2.2, so the HTTP request is denied. Depending on the application, the impact can range from the app crashing or freezing to hanging on loading and retrying.

There aren’t many servers doing this, but some do, and nobody has a list of all of them; knowing everything that is on the Internet is beyond any of us. The rule of thumb, when you need multiple ACTIVE GRE tunnels, is to make sure you maintain persistence, because once you break it there is no way in our backend to unbreak it.

Going back to the last section you posted: per-packet load balancing must not be used. The load balancing should provide at least session stickiness. With session stickiness some of the problems are fixed, but not all; to fix all of them you should stick with source IP persistence. I agree it is not feasible to reach a 50/50 split using PBR, as you could have heavier clients in one subnet than in the other.

The “how” is a difficult question to answer, as there are many variables to consider: hardware, software versions, licensing, utilisation on the peer, and routing (do you use VRFs, or can you split your internal ranges down the middle to achieve source IP persistence?).

I do know that some customers use a BIG-IP appliance downstream from the peer router to achieve the persistence, some do PBR and split the IP ranges, and some do ECMP. So whilst I cannot give you the exact config for your router and your network, I hope I have given you a few pointers to keep in mind.
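One more pointer, purely as an illustration on the Cisco side (the addresses below are placeholders): once two equal-cost paths are in place, IOS-XE can show you which tunnel a given source/destination pair will hash to, which is a quick way to see whether what you have is per-flow stickiness or true source IP persistence:

  show ip cef exact-route 10.1.1.50 198.51.100.10
  show ip cef exact-route 10.1.1.50 198.51.100.20

If the same client IP maps to different tunnels for different destinations, you have flow stickiness but not source IP persistence.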


Hi Jan,
Thanks for the brief.

Regards
Ramesh M

Hi,

Since you are using Cisco edge routers, it probably makes sense to leverage ECMP:

Equal Cost Multiple Path (ECMP) processing is a networking feature that enables the edge appliance to use up to four equal-cost routes to the same destination. Without this feature, if there are multiple equal-cost routes to the same destination, the virtual router chooses one of those routes from the routing table and adds it to its forwarding table; it will not use any of the other routes unless there is an outage in the chosen route. Enabling ECMP functionality on a virtual router gives the firewall up to four equal-cost paths to a destination in its forwarding table, allowing the firewall to:

  1. Load balance flows (sessions) to the same destination over multiple equal-cost links.
  2. Make use of the available bandwidth on all links to the same destination rather than leave some links unused.
  3. Dynamically shift traffic to another ECMP member to the same destination if a link fails, rather than waiting for the routing protocol or RIB table to elect an alternative path, which can help reduce downtime when links fail.

ECMP load balancing is done at the session level, not at the packet level. This means the firewall chooses an equal-cost path at the start of a new session, not each time the firewall receives a packet.
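On an ASR1000, a minimal way to get two equal-cost paths is simply two static routes for the same prefix pointing at the two tunnels. This is only a sketch: it reuses the placeholder Tunnel1/Tunnel2 and 203.0.113.x endpoints from earlier in the thread, assumes 192.0.2.1 as a stand-in WAN next hop, and assumes the default CEF per-destination behaviour.

  ! keep the tunnel endpoints reachable via the WAN so the default
  ! routes below do not recurse back into the tunnels
  ip route 203.0.113.10 255.255.255.255 192.0.2.1
  ip route 203.0.113.20 255.255.255.255 192.0.2.1
  !
  ! two equal-cost static routes = ECMP across both tunnels
  ip route 0.0.0.0 0.0.0.0 Tunnel1
  ip route 0.0.0.0 0.0.0.0 Tunnel2

With both routes installed, CEF load-shares per destination (per flow) by default, not per packet, so a given source/destination pair stays on one tunnel; whether that level of stickiness is sufficient is exactly the persistence question discussed above.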

Hi Sean,

Regarding your last option: it must be source IP based load balancing.

We have several deployments balancing GRE and IPsec tunnels. In the case of IPsec, starting each tunnel from a unique public source IP is also mandatory.

Adrian Larsen
Maidenhead Bridge
Cloud Security Connectors for Zscaler.