SNAT Port Exhaustion in AKS

Kubernetes networking is complex, and understanding Source Network Address Translation (SNAT) is crucial for reliable outbound communication from a cluster. In this post, I will explain SNAT port exhaustion in AKS: what it is, how to detect it, and how to prevent it.

SNAT Port Exhaustion occurs when the available ports for Source Network Address Translation are depleted. If not addressed properly, it can block the outbound connectivity of your entire cluster and disrupt the applications.

Understanding SNAT (Source Network Address Translation)

SNAT is a networking technique that enables multiple devices in a network to share a single public IP address. It plays a pivotal role in Kubernetes networking, facilitating efficient communication between pods and external resources.

One of the key techniques under SNAT's umbrella is IP masquerading, a simplified form of network address translation that hides the internal IP addresses of pods in an AKS cluster behind the public IP address of the cluster's nodes or load balancer. Outgoing traffic from the pods then appears to originate from that public IP, which keeps pod addresses private and simplifies network configuration. Essentially, IP masquerading ensures that external entities see communication from the AKS cluster as coming from a single, unified source.

An AKS cluster that uses the load balancer outbound type with a single public IP has 64,000 ports available for SNAT. These ports are allocated to the backend VMs, and the number each VM receives depends on the number of nodes in the cluster. Every outbound connection to the same destination IP and destination port consumes one SNAT port. When applications on a node open too many outbound connections, the ports allocated to that node run out, and port exhaustion occurs.
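To make the per-node arithmetic concrete, here is a small Python sketch of the default allocation tiers that Azure documents for a Standard Load Balancer when ports are assigned automatically. The function name and table layout are illustrative, and the tier values are worth checking against the current Azure documentation before you rely on them.

```python
# Default SNAT port tiers as documented for Azure Standard Load Balancer
# when ports are assigned automatically:
# (largest backend pool size in the tier, ports pre-allocated per node).
DEFAULT_TIERS = [
    (50, 1024),
    (100, 512),
    (200, 256),
    (400, 128),
    (800, 64),
    (1000, 32),
]


def default_ports_per_node(node_count: int) -> int:
    """Return the default SNAT port allocation for one node in the pool."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    for max_nodes, ports in DEFAULT_TIERS:
        if node_count <= max_nodes:
            return ports
    raise ValueError("backend pool is larger than the documented tiers")


# Show how much of a single IP's 64,000 ports the defaults actually hand out.
for nodes in (3, 60, 150):
    ports = default_ports_per_node(nodes)
    print(f"{nodes} nodes: {ports} ports each, {nodes * ports} of 64000 allocated")
```

For a small cluster, most of the 64,000 ports are never handed to any node, which is why the default is described later in this post as conservative.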

Identifying SNAT Port Exhaustion Issues

Azure Standard Load Balancer has metrics that show the status of SNAT ports usage. To find the SNAT connection statistics:

Select SNAT Connection Count as the metric type and Sum as the aggregation.
Group by Connection State so that successful and failed SNAT connection counts are shown as separate lines.

A failed connection count greater than zero indicates SNAT port exhaustion.

Preventive Measures

The recommended approach for outbound connectivity is to use a NAT gateway. Azure NAT gateway allocates ports dynamically from the entire pool of ports. It also employs a port reuse technique that makes connection issues less likely. Sometimes, migrating traffic from a load balancer to a NAT gateway is not an option because of constraints in your environment or known limitations of NAT gateway. In that case, there are some steps you can easily configure that help reduce the occurrence of port exhaustion:

Use private link or service endpoints wherever possible.
Services that send traffic over private link or service endpoints do not use SNAT functionality and hence do not consume SNAT ports for outbound connectivity.
Configure the idle timeout on the load balancer.
Choose a low value. In most cases, 4 minutes should be enough. Idle connections are released after this timeout, which makes their ports available for new connections.
Configure the pre-allocated number of ports on the load balancer.
A load balancer with a single public IP has 64,000 allocatable ports. By default, the number of ports allocated to each node is conservative. Based on the maximum number of nodes you expect, you can calculate and manually set how many ports each node receives.
Add additional public IPs to the external load balancer.
Each public IP has 64,000 available ports. Attaching an additional public IP will add another 64,000 ports that can be allocated to the nodes and used for outbound connectivity.
Use a greater number of nodes with smaller sizes.
Using a larger number of smaller nodes reduces the blast radius during outages. If a faulty application has three replicas, only the nodes running those replicas are affected, which minimizes the impact. Consider the billing implications and choose a VM size that balances performance and cost.
Use sensible max pods values.
Setting a sensible maximum number of pods per node limits how many workloads share one node's ports, which restricts the blast radius of a faulty pod.
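The port calculation mentioned above can be sketched in a few lines of Python. This assumes the sizing rule from the Azure documentation: the number of public IPs times 64,000 must cover the maximum node count times the ports per node, and the per-node value must be a multiple of 8. The function names are my own, and the node count you pass in should include any extra nodes created during scaling or upgrades.

```python
PORTS_PER_PUBLIC_IP = 64_000


def max_ports_per_node(public_ips: int, max_nodes: int) -> int:
    """Largest per-node allocation (a multiple of 8) that still fits max_nodes."""
    if public_ips < 1 or max_nodes < 1:
        raise ValueError("public_ips and max_nodes must be at least 1")
    raw = (public_ips * PORTS_PER_PUBLIC_IP) // max_nodes
    rounded = raw - raw % 8  # round down to a multiple of 8
    return min(rounded, PORTS_PER_PUBLIC_IP)


def max_nodes_supported(public_ips: int, ports_per_node: int) -> int:
    """How many nodes a given per-node allocation can serve."""
    if public_ips < 1 or ports_per_node < 1:
        raise ValueError("public_ips and ports_per_node must be at least 1")
    return (public_ips * PORTS_PER_PUBLIC_IP) // ports_per_node


print(max_ports_per_node(public_ips=1, max_nodes=10))   # one IP, up to 10 nodes
print(max_ports_per_node(public_ips=2, max_nodes=30))   # a second IP doubles the pool
print(max_nodes_supported(public_ips=1, ports_per_node=4000))
```

Setting the per-node value too high caps how far the cluster can scale, so it is worth running the calculation in both directions before changing the load balancer profile.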

Whichever technique you use, a faulty application can still consume all the ports on a node and block other applications from establishing outbound connections. The best solution is to tune your application so that it is connection-efficient. Useful techniques include connection pooling, connection reuse, less aggressive retry logic, and keep-alives that reset the outbound idle timeout.
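To illustrate why connection reuse matters, the following Python sketch starts a small HTTP server on the loopback interface and counts how many distinct client source ports it sees. No SNAT is involved locally; the local server only stands in for an external endpoint, and the point is that every new TCP connection takes a new source port, which behind a load balancer means a new SNAT port. The names PortRecorder and count_source_ports are made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class PortRecorder(BaseHTTPRequestHandler):
    """Local stand-in for an external service; records each caller's source port."""

    protocol_version = "HTTP/1.1"  # lets the server keep connections open
    seen_ports = []

    def do_GET(self):
        PortRecorder.seen_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def count_source_ports(requests: int, reuse: bool) -> int:
    """Send `requests` GETs and return how many distinct source ports were used."""
    PortRecorder.seen_ports = []
    server = ThreadingHTTPServer(("127.0.0.1", 0), PortRecorder)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address[:2]
    conn = None
    try:
        for _ in range(requests):
            if conn is None:
                conn = http.client.HTTPConnection(host, port, timeout=5)
            conn.request("GET", "/")
            conn.getresponse().read()  # drain the body so the socket can be reused
            if not reuse:
                conn.close()  # the next request opens a brand-new connection
                conn = None
    finally:
        if conn is not None:
            conn.close()
        server.shutdown()
        server.server_close()
    return len(set(PortRecorder.seen_ports))


if __name__ == "__main__":
    print("new connection per request:", count_source_ports(5, reuse=False), "ports")
    print("one reused connection:     ", count_source_ports(5, reuse=True), "ports")
```

With reuse enabled, all requests travel over one connection and therefore one source port. Opening a connection per request typically uses a new port each time. A connection pool applies the same idea across many concurrent requests.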

What to Do When SNAT Port Exhaustion Occurs

Sometimes a single faulty pod consumes all the ports on a node and affects every other pod running on that node. The quickest response is to identify and isolate the faulty application. Metrics such as container_sockets can help here: look for the pod whose container_sockets value rose sharply around the time the port exhaustion occurred, since that pod is the likely cause. Once the pod is identified, move the other pods on that node to other nodes in the cluster. This mitigates the impact on the other pods until the faulty application is fixed.
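The comparison described above can be automated once the container_sockets samples are exported from your monitoring system. The Python sketch below ranks pods by how much their socket count grew in a window before the exhaustion time. The input format, the pod names, and the sample values are all invented for illustration, and the 10-minute window is an arbitrary starting point.

```python
from datetime import datetime, timedelta


def rank_pods_by_socket_growth(samples, exhaustion_time, window=timedelta(minutes=10)):
    """Rank pods by socket growth in the window before exhaustion_time.

    samples maps a pod name to a list of (timestamp, socket_count) tuples.
    Returns (pod_name, growth) pairs, largest growth first.
    """
    start = exhaustion_time - window
    growth = {}
    for pod, series in samples.items():
        in_window = sorted((t, v) for t, v in series if start <= t <= exhaustion_time)
        if len(in_window) >= 2:
            # Growth is the last value in the window minus the first one.
            growth[pod] = in_window[-1][1] - in_window[0][1]
    return sorted(growth.items(), key=lambda item: item[1], reverse=True)


# Hypothetical samples for two pods on the affected node.
t0 = datetime(2024, 1, 1, 12, 0)
samples = {
    "checkout-7d9f": [
        (t0 - timedelta(minutes=8), 40),
        (t0 - timedelta(minutes=4), 42),
        (t0, 45),
    ],
    "report-worker-5c2b": [
        (t0 - timedelta(minutes=8), 60),
        (t0 - timedelta(minutes=4), 480),
        (t0, 1010),
    ],
}

for pod, delta in rank_pods_by_socket_growth(samples, exhaustion_time=t0):
    print(f"{pod}: socket count changed by {delta}")
```

The pod at the top of the ranking is only a candidate. Confirm it against application logs before isolating it, since a legitimate traffic spike can look the same.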

Conclusion: Mitigating SNAT Port Exhaustion Challenges in Kubernetes

In conclusion, understanding, monitoring, and implementing preventive measures are crucial steps in mitigating SNAT Port Exhaustion challenges in Kubernetes. By following best practices and staying vigilant, you can ensure the smooth operation of your clusters.
