Enable Horizontal Pod Autoscaler on EKS Kubernetes Cluster

4 min read · Oct 7, 2022

The Horizontal Pod Autoscaler (HPA) is a Kubernetes resource controller that automatically scales the number of pods in a replication controller, deployment, replica set or stateful set based on observed CPU utilization or, with custom metrics support, on other application-provided metrics. Horizontal Pod Autoscaling only applies to objects that can be scaled; it cannot be used with objects that cannot be scaled, such as DaemonSets.

The Horizontal Pod Autoscaler is implemented as a Kubernetes API resource and a controller. The resource determines the behavior of the controller. The controller periodically adjusts the number of replicas in a replication controller or deployment so that the observed average CPU utilization matches the target specified by the user.
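To make the adjustment step concrete, here is a minimal sketch of the replica calculation described in the Kubernetes documentation: desired replicas = ceil(current replicas × observed utilization / target utilization). The function name desired_replicas and the sample numbers are illustrative assumptions. The real controller also applies a tolerance (10% by default) and clamps the result to the configured minimum and maximum, which this sketch leaves out.

```shell
# desired_replicas CURRENT_REPLICAS OBSERVED_UTIL_PERCENT TARGET_UTIL_PERCENT
# Hypothetical helper: integer ceiling of current * observed / target.
desired_replicas() {
  echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

# Example: 2 replicas averaging 90% CPU against a 70% target.
desired_replicas 2 90 70   # prints 3
```

With these numbers, 2 × 90 / 70 ≈ 2.57, which rounds up to 3 replicas.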

Using Horizontal Pod Autoscaler on Kubernetes EKS Cluster

Before you can use the Horizontal Pod Autoscaler on an EKS cluster, you need to have Metrics Server installed. Follow the guide below for complete installation steps.

Install Kubernetes Metrics Server on Amazon EKS Cluster

Verify the metrics server is functional by using the command below.

$ kubectl get apiservice v1beta1.metrics.k8s.io -o yaml



apiVersion: apiregistration.k8s.io/v1

kind: APIService

metadata:

  annotations:

    kubectl.kubernetes.io/last-applied-configuration: |

      {"apiVersion":"apiregistration.k8s.io/v1beta1","kind":"APIService","metadata":{"annotations":{},"name":"v1beta1.metrics.k8s.io"},"spec":{"group":"metrics.k8s.io","groupPriorityMinimum":100,"insecureSkipTLSVerify":true,"service":{"name":"metrics-server","namespace":"kube-system"},"version":"v1beta1","versionPriority":100}}

  creationTimestamp: "2020-08-12T11:27:13Z"

  name: v1beta1.metrics.k8s.io

  resourceVersion: "130943"

  selfLink: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.metrics.k8s.io

  uid: 83c44e41-6346-4dff-8ce2-aff665199209

spec:

  group: metrics.k8s.io

  groupPriorityMinimum: 100

  insecureSkipTLSVerify: true

  service:

    name: metrics-server

    namespace: kube-system

    port: 443

  version: v1beta1

  versionPriority: 100

status:

  conditions:

  - lastTransitionTime: "2020-08-12T11:27:18Z"

    message: all checks passed

    reason: Passed

    status: "True"

    type: Available
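Rather than reading the full YAML, the same check can be scripted. The sketch below assumes kubectl is already pointed at the EKS cluster; the function name metrics_api_available is a hypothetical helper. It extracts the status of the Available condition with a JSONPath filter and succeeds only when that status is True.

```shell
# Hypothetical helper: exit status 0 only when the metrics API
# reports the condition Available=True.
metrics_api_available() {
  status=$(kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}')
  [ "$status" = "True" ]
}
```

A possible use is `metrics_api_available && kubectl top nodes`, which only queries node metrics once the API service is healthy.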

Deploy sample app for testing HPA

Let’s deploy a test application that we’ll use to demonstrate the working of Horizontal Pod Autoscaler.

Create the demo namespace:

$ kubectl create ns demo

namespace/demo created



$ kubectl get ns

NAME              STATUS   AGE

default           Active   2d20h

demo              Active   22s

kube-node-lease   Active   2d20h

kube-public       Active   2d20h

kube-system       Active   2d20h

Deploy a sample Apache web server application by running the following command in your terminal.

$ kubectl apply -f https://k8s.io/examples/application/php-apache.yaml -n demo

deployment.apps/php-apache created

service/php-apache created

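You can also use the kubectl run command to deploy the application and create a service. Note that with the run-pod/v1 generator this creates a standalone Pod rather than a Deployment, so the kubectl autoscale deployment command used later in this guide only applies to the manifest-based deployment above. The --generator, --requests and --limits flags have also been deprecated or removed in newer kubectl releases, so this form only works with older clients.

$ kubectl run php-apache \
  --generator=run-pod/v1 \
  --image=k8s.gcr.io/hpa-example \
  --requests=cpu=200m \
  --limits=cpu=500m \
  --expose \
  --port=80 \
  -n demo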
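If your kubectl client does not accept these kubectl run flags, a rough equivalent can be built from three separate commands. This is a sketch, not the original author's method: the function name deploy_php_apache is hypothetical, and it assumes the same image, CPU request, CPU limit and port as the command above. It creates a Deployment, so it works with the autoscaler commands in this guide.

```shell
# deploy_php_apache [NAMESPACE]
# Hypothetical helper: create the Deployment, set its CPU request and
# limit, then expose it as a Service on port 80.
deploy_php_apache() {
  ns=${1:-demo}
  kubectl create deployment php-apache \
    --image=k8s.gcr.io/hpa-example --port=80 -n "$ns" &&
  kubectl set resources deployment php-apache \
    --requests=cpu=200m --limits=cpu=500m -n "$ns" &&
  kubectl expose deployment php-apache --port=80 -n "$ns"
}
```

Call it as `deploy_php_apache demo`. The CPU request matters because the autoscaler computes utilization as a percentage of the requested CPU.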

Check the status of your application.

$ kubectl get pods -n demo

NAME                          READY   STATUS    RESTARTS   AGE

php-apache-79544c9bd9-wccnj   1/1     Running   0          40s

Create Kubernetes HPA resource

Once the application is running, we can create the HPA resource.

$ kubectl autoscale deployment php-apache --cpu-percent=70 --min=1 --max=5 -n demo

horizontalpodautoscaler.autoscaling/php-apache autoscaled

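The command above creates an autoscaler that adds Pods when the average CPU utilization across the deployment’s Pods exceeds 70% of their requested CPU, and removes them again when it falls below that target. The minimum number of Pods is set to 1 and the maximum to 5.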
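The same autoscaler can also be written as a manifest, which is easier to keep in version control. The sketch below assumes the cluster serves the autoscaling/v2 API (Kubernetes 1.23 or later); older clusters would need a different apiVersion. The wrapper function apply_php_apache_hpa is a hypothetical name, and the values mirror the kubectl autoscale command above.

```shell
# Hypothetical helper: apply an HPA manifest equivalent to
# kubectl autoscale deployment php-apache --cpu-percent=70 --min=1 --max=5
apply_php_apache_hpa() {
  kubectl apply -n demo -f - <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
EOF
}
```

Running `apply_php_apache_hpa` pipes the manifest to kubectl apply in the demo namespace.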

Get details of the autoscaler with the following commands:

$ kubectl get hpa -n demo

NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE

php-apache   Deployment/php-apache   0%/70%    1         5         1          80s



$ kubectl describe hpa -n demo

Name:                                                  php-apache

Namespace:                                             demo

Labels:                                                <none>

Annotations:                                           <none>

CreationTimestamp:                                     Fri, 14 Aug 2020 21:38:12 +0300

Reference:                                             Deployment/php-apache

Metrics:                                               ( current / target )

  resource cpu on pods  (as a percentage of request):  0% (1m) / 70%

Min replicas:                                          1

Max replicas:                                          5

Deployment pods:                                       1 current / 1 desired

Conditions:

  Type            Status  Reason               Message

  ----            ------  ------               -------

  AbleToScale     True    ScaleDownStabilized  recent recommendations were higher than current one, applying the highest recent recommendation

  ScalingActive   True    ValidMetricFound     the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)

  ScalingLimited  False   DesiredWithinRange   the desired count is within the acceptable range

Events:

Increasing Load

Let us now increase the load by sending continuous requests to the Service we deployed on Kubernetes. For this purpose we’re using a busybox container to generate load.

$ kubectl run -it --rm load-generator --image=busybox --restart=Never -n demo -- /bin/sh

You’ll be logged into the container terminal. Run the following command to execute a while loop which hits the service endpoint at http://php-apache

/ # while true; do wget -q -O - http://php-apache; done
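The two steps above can also be combined into one command that starts the loop as soon as the container is up. This sketch follows the pattern used in the Kubernetes HPA walkthrough; the function name start_load is hypothetical, and the short sleep between requests is an assumption added to pace the loop.

```shell
# Hypothetical helper: start a busybox Pod in the demo namespace that
# requests http://php-apache in a loop until interrupted.
start_load() {
  kubectl run -i --tty load-generator --rm --image=busybox \
    --restart=Never -n demo \
    -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"
}
```

Run `start_load` in one terminal and press CTRL+C there to stop the load and remove the Pod.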

Open a separate terminal and see how the autoscaler creates more Pods in the deployment as the load increases.

$ kubectl get hpa -n demo

NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE

php-apache   Deployment/php-apache   83%/70%   1         5         5          9m

As long as the actual CPU percentage is higher than the target percentage, the replica count keeps increasing, up to the maximum of 5. In this case utilization is 83% against a 70% target, and the deployment has already reached 5 replicas.

Stop the load using CTRL+C.

Watch as the autoscaler scales the deployment down:

$ kubectl get hpa -n demo -w

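It may take a few minutes before the number of running Pods drops back to 1, because the autoscaler waits out a scale-down stabilization window (5 minutes by default) before removing replicas. Clean up the setup once you are done.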
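If you would rather script the wait than watch the output, the sketch below polls the autoscaler status until the replica count reaches a wanted value. The function name wait_for_replicas and its default interval and retry count are illustrative assumptions; it reads the currentReplicas field from the HPA status.

```shell
# wait_for_replicas WANTED [INTERVAL_SECONDS] [MAX_TRIES]
# Hypothetical helper: returns 0 once the php-apache HPA reports WANTED
# current replicas, or 1 if that does not happen within MAX_TRIES polls.
wait_for_replicas() {
  wanted=$1
  interval=${2:-15}
  tries=${3:-40}
  while [ "$tries" -gt 0 ]; do
    current=$(kubectl get hpa php-apache -n demo \
      -o jsonpath='{.status.currentReplicas}')
    echo "current replicas: ${current:-unknown}"
    [ "$current" = "$wanted" ] && return 0
    tries=$((tries - 1))
    sleep "$interval"
  done
  return 1
}
```

For example, `wait_for_replicas 1 && echo "scaled down"` blocks until the deployment is back to a single Pod, or gives up after roughly ten minutes with the defaults.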

$ kubectl delete -f https://k8s.io/examples/application/php-apache.yaml -n demo

deployment.apps "php-apache" deleted

service "php-apache" deleted

Delete the autoscaler.

$ kubectl delete hpa php-apache -n demo

horizontalpodautoscaler.autoscaling "php-apache" deleted

Lastly, delete the demo namespace.

$ kubectl delete ns demo

namespace "demo" deleted

You can use the same approach to autoscale your own applications with HPA and Metrics Server.

https://www.computingpost.com/enable-horizontal-pod-autoscaler-on-eks-kubernetes-cluster/?feed_id=9058&_unique_id=634059d3737fe


Written by ComputingPost
