Pod Auto-Scaling

To enable horizontal auto-scaling for a component, you just need to set autoScaling.horizontal.maxReplicas greater than the value for replicas. Additionally, you should configure one or multiple of the target value parameters, averageCPU and averageMemory. These target values define how the autoscaler will set the number of replicas to achieve an average CPU utilization and/or an average memory usage by the pods that will be scaled within this component.

replicas: 4
autoScaling:
  horizontal:
    maxReplicas: 10
    averageCPU: 800m
    averageRelativeMemory: 50

The above example would create an horizontal pod autoscaler in Kubernetes which is configured to:

create at least 4 pods for the component
scale the component up to a maximum of 10 pods
observe the CPU usage of all replicas and try to scale between 4 and 10 replicas to achieve an average CPU utilization of 800m
observe the memory usage of all replicas and try to scale between 4 and 10 replicas to achieve an average memory utilization of 50% (of the requested memory)

`horizontal`

`maxReplicas`

The maxReplicas option expects an integer with the maximum number of replicas that the autoscaler is allowed to create.

Min Replicas

The minReplicas for the autoscaler will be defined by the replicas option for the component.

`averageCPU`

The averageCPU option expects a fixed amount of CPU. The autoscaler will try to achieve that on average, all replicas use this much CPU.

`averageRelativeCPU`

The averageRelativeCPU option expects a percentage number without % suffix. The autoscaler will try to achieve that, on average, all replicas use this much CPU relative to the amount of CPU each replica has requested.

`averageMemory`

The averageMemory option expects a fixed amount of memory. The autoscaler will try to achieve that, on average, all replicas use this much memory.

`averageRelativeMemory`

The averageRelativeMemory option expects a percentage number without % suffix. The autoscaler will try to achieve that, on average, all replicas use this much memory relative to the amount of memory each replica has requested.

#horizontal

#maxReplicas

Min Replicas

#averageCPU

#averageRelativeCPU

#averageMemory

#averageRelativeMemory

`horizontal`

`maxReplicas`

`averageCPU`

`averageRelativeCPU`

`averageMemory`

`averageRelativeMemory`