Work Around Max Count of Security Group Rules on EKS
AWS EKS clusters running on VPC networks rely on AWS Security Group (SG) rules to admit ingress traffic. But what happens when you hit the maximum number of rules allowed in your SG?
Background
LoadBalancer-type Service and Security Group Rules
Kubernetes users can expose a Service in two ways:
- Register with the Istio ingress gateways—the golden path for most tenants
- Create a dedicated LoadBalancer-type Service object, which tells the cloud provider to create a load balancer and set up health checks.
EKS recommends the aws-load-balancer-controller, which reacts to updates to LoadBalancer-type Service objects and sets up an NLB accordingly. For example, if a Service object exposes ports 80 and 443, the controller will create five Security Group (SG) rules on the EKS worker Nodes:
- allow ingress from source `0.0.0.0/0` to the NodePort corresponding to port 80
- allow ingress from source `0.0.0.0/0` to the NodePort corresponding to port 443
- allow the EKS zonal subnet in `us-west-2a` to reach the health-check NodePort
- allow the EKS zonal subnet in `us-west-2b` to reach the health-check NodePort
- allow the EKS zonal subnet in `us-west-2c` to reach the health-check NodePort
Note: the health check will fail on a Node if a) the Node does not host any target Pods, or b) none of the target Pods on that Node is ready, as determined by the Pods' readiness probes.
The SG rules are added to an SG that is attached to all worker Nodes in the given EKS cluster.
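To make this concrete, here is a minimal sketch of a LoadBalancer-type Service that would trigger those five rules. The name, namespace, selector, and target ports are hypothetical; the annotations mirror the NLB settings used in the full example later in this post.

```yaml
# Minimal sketch (hypothetical names): a LoadBalancer-type Service exposing
# ports 80 and 443 through an NLB managed by the aws-load-balancer-controller.
apiVersion: v1
kind: Service
metadata:
  name: demo                 # hypothetical
  namespace: demo            # hypothetical
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: external
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: instance
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
spec:
  type: LoadBalancer
  selector:
    app: demo                # hypothetical
  ports:
  - name: http
    port: 80
    targetPort: 8080         # hypothetical container port
    protocol: TCP
  - name: https
    port: 443
    targetPort: 8443         # hypothetical container port
    protocol: TCP
# Kubernetes allocates a NodePort for each of the two ports; the controller
# then opens 0.0.0.0/0 to those two NodePorts and adds the three zonal-subnet
# rules for the health-check NodePort, i.e. the five SG rules listed above.
```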
Security Group Limits
For each AWS account, there are two quota limits on Security Groups:
- Max number of inbound rules per SG
- Max number of SGs per network interface
These limits can be adjusted, subject to the constraint that the product of the two quotas cannot exceed 1,000 (AWS doc). This means a network interface can never be subject to more than 1,000 SG rules in total. For example, raising the inbound-rules-per-SG quota to 200 means the SGs-per-interface quota can be at most 5.
Problem
Once your EKS cluster approaches the limit on SG rules, your ability to create new load balancers is restricted. You won't be able to perform a blue-green upgrade of a load balancer, because that requires provisioning two sets of load balancers simultaneously. The lack of headroom also means you can no longer onboard more applications that require a dedicated load balancer.
Solutions
The following solutions are not mutually exclusive. They can be used together.
Second dedicated SG for each node pool
Suppose your current setup is that all worker Nodes, regardless of node pool, have a shared SG named “worker” attached, and the aws-load-balancer-controller adds new rules to that “worker” SG.
You can keep the shared “worker” SG for common rules but create a new SG for each node pool, and use the per-pool SG for NLB ingress. You need to change the node pool's launch template to attach the new SG.
If you decide to keep letting the AWS LB controller manage SG rules for you, tag the new SG with `kubernetes.io/cluster/{{ .ClusterName }}: shared`. This is necessary when multiple security groups are attached to an ENI, so that the controller knows which SG to add new rules to. Because the existing “worker” SG already carries this tag, you need to create a duplicate SG, say “worker2”, which does NOT have the tag, and then attach both the “worker2” SG and the per-pool SG to the node pool.
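For illustration, if the node pools are managed with eksctl, attaching the two SGs could look roughly like the sketch below; the cluster name, node pool name, and SG IDs are placeholders, and if you manage launch templates directly you would add the SG IDs to the launch template's network settings instead.

```yaml
# Hypothetical eksctl-style sketch: attach the untagged "worker2" SG plus a
# per-pool SG (tagged kubernetes.io/cluster/<cluster>: shared) to one node pool.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster           # placeholder cluster name
  region: us-west-2
nodeGroups:
  - name: pool-a             # placeholder node pool name
    instanceType: m5.xlarge
    desiredCapacity: 3
    securityGroups:
      attachIDs:
        - sg-0aaaaaaaaaaaaaaaa   # placeholder ID: "worker2", WITHOUT the cluster tag
        - sg-0bbbbbbbbbbbbbbbb   # placeholder ID: per-pool SG, WITH the cluster tag,
                                 # so the controller adds NLB rules to it
```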
Optimize SG rules outside of the AWS LB controller
Recall that the aws-load-balancer-controller creates 5 inbound SG rules per envoy-ingress Service. We can optimize this by managing the SG rules ourselves and asking the controller to skip SG rule creation, reducing the need to 2 inbound SG rules per envoy-ingress Service.
Add the `service.beta.kubernetes.io/aws-load-balancer-manage-backend-security-group-rules: "false"` annotation to the LoadBalancer-type Service object (the value must be a string). Documentation about this annotation is here.
Reserve 3 static NodePorts for each Service: one for the NLB to health-check the EKS nodes, one for frontend port 80, and one for frontend port 443. You can choose a static `healthCheckNodePort` if you set `externalTrafficPolicy: Local` (which comes with the benefit of preserving the client source IP address). The two regular NodePorts can be static regardless.
The two regular NodePorts should be consecutive, so that one SG rule can cover both. The `healthCheckNodePort` does not need to be consecutive with them, because the source range in its SG rule is different (i.e., it only allows the NLB to health-check the nodes).
Consider the following example:
```yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    external-dns.alpha.kubernetes.io/hostname: acmecorp.com
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: instance
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
    service.beta.kubernetes.io/aws-load-balancer-type: external
+   service.beta.kubernetes.io/aws-load-balancer-manage-backend-security-group-rules: "false"
  name: myapp
  namespace: myapp
spec:
  externalTrafficPolicy: Local
+ healthCheckNodePort: 30218
  ports:
  - name: https
+   nodePort: 30212
    port: 443
    protocol: TCP
    targetPort: 8095
  - name: http
+   nodePort: 30213
    port: 80
    protocol: TCP
    targetPort: 8089
  selector:
    app: myapp
  type: LoadBalancer
```
The optimized SG rules would be:
- allow source `0.0.0.0/0` to ingress to the NodePort range from `30212` to `30213` (a single rule covering both port 80 and port 443, replacing the two per-port rules)
- allow the EKS VPC network in region `us-west-2` to ingress to the health-check NodePort (a single rule replacing the three per-zone subnet rules for `us-west-2a`, `us-west-2b`, and `us-west-2c`)
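For illustration, the two self-managed rules could be written as CloudFormation resources along these lines; the worker-node SG parameter and the VPC CIDR are placeholders, and the port numbers match the example Service above.

```yaml
# Hypothetical CloudFormation sketch of the two self-managed inbound SG rules.
Parameters:
  WorkerNodeSecurityGroupId:
    Type: AWS::EC2::SecurityGroup::Id    # the SG attached to the worker Nodes
Resources:
  MyAppNodePortsIngress:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref WorkerNodeSecurityGroupId
      Description: myapp NLB traffic to the two consecutive NodePorts (frontend ports 80 and 443)
      IpProtocol: tcp
      FromPort: 30212
      ToPort: 30213
      CidrIp: 0.0.0.0/0
  MyAppHealthCheckIngress:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref WorkerNodeSecurityGroupId
      Description: NLB health checks from anywhere inside the VPC
      IpProtocol: tcp
      FromPort: 30218
      ToPort: 30218
      CidrIp: 10.0.0.0/16                # placeholder: your VPC CIDR
```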
Raise max inbound rules per SG by reducing SG count per ENI
This solution picks a different point on the trade-off spectrum between the number of inbound rules per SG and the number of SGs per ENI.
The SG quotas apply to the whole AWS account, so any adjustment will affect other workloads in the same account. Before lowering the SGs-per-ENI quota, verify whether any ENI in the account already has the maximum number of SGs attached.
Build EKS clusters in a separate AWS account
Building new clusters and shifting tenants over is expensive. Try the other solutions first.