Explain this OpenShift resource manifest
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-resource-metrics-memory
  namespace: default
spec:
  ...
  minReplicas: 20
  ...
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Pods
        value: 4
        periodSeconds: 30
      - type: Percent
        value: 10
        periodSeconds: 60
      selectPolicy: Max
    scaleUp:
      selectPolicy: Disabled
The provided resource manifest is for a HorizontalPodAutoscaler (HPA), which automatically scales the number of pods in a deployment or replica set based on observed resource usage or other metrics.
This HorizontalPodAutoscaler is set to:
- Keep a minimum of 20 replicas running. Scaling down happens gradually: 10% of pods every minute or 4 pods every 30 seconds, whichever is bigger.
- Never scale up: the scaleUp behavior is disabled, so replicas are only ever reduced.
This configuration helps manage the number of replicas in response to changing workload demands while preventing rapid scaling oscillations.
A HorizontalPodAutoscaler (HPA) in OpenShift manages pod scaling based on resource usage.
It ensures at least 20 pods are always running (minReplicas: 20).
Scaling down: it can remove up to 4 pods every 30 seconds or 10% of pods every 60 seconds, whichever is larger.
Scaling up is disabled, meaning new pods won’t be added automatically.
Is that an old manifest? The latest stable version of the Horizontal Pod Autoscaler (HPA) API is apiVersion: autoscaling/v2. It went GA in Kubernetes 1.23, which is OpenShift 4.10, and the old autoscaling/v2beta2 API was removed entirely in Kubernetes 1.26, which is OpenShift 4.13. The latest OpenShift release as I write this in March 2025 is 4.17, with Kubernetes 1.30, to give you some perspective on how outdated that API version is.
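For reference, here is a minimal sketch of the same HPA migrated to autoscaling/v2. The scaleTargetRef, maxReplicas, and metrics blocks are assumptions on my part, since the original spec elides them with "..." (the memory metric is implied by the HPA's name):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-resource-metrics-memory
  namespace: default
spec:
  scaleTargetRef:               # assumed target workload; replace with yours
    apiVersion: apps/v1
    kind: Deployment
    name: example-app
  minReplicas: 20
  maxReplicas: 100              # assumed value; maxReplicas is required in v2
  metrics:                      # assumed memory target, implied by the HPA's name
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80  # assumed threshold
  behavior:                     # unchanged; the behavior schema is identical in v2
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Pods
        value: 4
        periodSeconds: 30
      - type: Percent
        value: 10
        periodSeconds: 60
      selectPolicy: Max
    scaleUp:
      selectPolicy: Disabled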
Beyond what others have already said, I'd like to raise some concerns about what is not there rather than what is!
Some Concerns:
1- With scaling up disabled, the system can't automatically handle increased load (see the sketch after this list for what a conservative scale-up policy could look like):
scaleUp: selectPolicy: Disabled
2- A minimum of 20 replicas might be resource-intensive if it isn't actually needed; I haven't seen a floor that high in my 6+ years of K8s operations?!
3- The 5-minute stabilization window might be too long for most workloads
4- Having two scale-down policies introduces confusion and might lead to aggressive scaling in some scenarios, since selectPolicy: Max always picks the larger allowed change
Thoughts:
Update to the right/latest API version for HPA.
Doesn't want automatic scaling up (possibly handled by other mechanisms?!)
Needs careful, controlled scaling down?!
Requires high availability (minimum 20 pods)?!
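On concern 1: if the workload ever does need to grow automatically, here is a minimal sketch of a conservative scaleUp block that could replace the disabled one. The numbers are illustrative, not from the original manifest:

  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60   # wait 1 minute before acting on scale-up recommendations
      policies:
      - type: Pods
        value: 2                       # add at most 2 pods...
        periodSeconds: 60              # ...per minute
      selectPolicy: Max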
It's a horizontal pod autoscaling configuration file which has the details given below:
setting the replicas minimum to 20
setting the scaling-down behaviour to remove 4 pods every 30 seconds (or 10% every 60 seconds) with a stabilization window of 5 minutes
it's not going to scale up as per this configuration
- This HPA defines the scale-down policy.
- This HPA maintains 20 minimum replicas. Even if load decreases, OpenShift will not scale below 20.
- The first policy (Pods) allows at most 4 replicas to be scaled down in 30 seconds. The second policy (Percent) allows at most 10 percent of the current replicas to be scaled down in 60 seconds. This repeats in each iteration.
- In this example there are two policies defined, and it specifies `selectPolicy: Max`. The Max policy allows the highest amount of change. For the lowest amount of change, Min can be used.
- The default value for selectPolicy is Max.
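To make selectPolicy: Max concrete, here's a rough worked example (the starting count of 100 replicas is hypothetical). At 100 replicas, the Percent policy allows removing 10% = 10 pods within its 60-second window, while the Pods policy allows 4 pods per 30 seconds, i.e. about 8 pods per minute. Max picks whichever permits the larger change, so the Percent policy governs at first. Once the count drops below roughly 80 replicas, 10% yields fewer pods per minute than the Pods policy, which then takes over, and scale-down continues until the minReplicas: 20 floor is reached.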
Minimum of 20 pods are always running.
Scaling down can happen gradually (4 pods per 30 sec OR 10% per 60 sec).
Scaling up is disabled, meaning new pods won't be added automatically.
Yo, heads up: this manifest is rocking autoscaling/v2beta2, the old beta API for resource-based autoscaling. It was the real deal once, but it's been retired; the stable autoscaling/v2 is where it's at now. This ain't your grandma's HPA, but it ain't the latest either.
We're summoning the mighty HPA! This ain't no Deployment, it's the auto-scaling overlord. We're telling Kubernetes to flex those pod muscles.
We're dubbing this beast 'hpa-resource-metrics-memory' and dumping it in the 'default' namespace. Keep your namespaces tidy, folks. We ain't running a wild cluster.
We're starting with a solid 20 pods. No wimpy deployments here. We're building a fortress of containers. This application means business.
When the load drops, we're not gonna panic and shrink immediately. We're gonna chill for 300 seconds (5 minutes) before we even think about scaling down. No jittery scaling, we're keeping it smooth and stable.
For the scale-down, we're taking it easy. We're only dropping 4 pods every 30 seconds. No sudden pod deletions, we're being gentle. If we're feeling a bit more aggressive, we'll scale down by 10% every 60 seconds. We're giving the system some flexibility.
When it comes to scaling down, we're going all in. We're picking the most aggressive policy. Max it out! We want to shrink fast when we need to.
Hold your horses! Scaling up? Nah, we're disabling that. We're only focused on scaling down in this scenario. We're keeping a tight rein on resource usage. Scale up? Not today, buddy.
This HPA is all about scaling down. It's designed to be conservative with resources, ensuring that the application doesn't hog more than it needs. The minimum replica count is 20, and the scale-down behavior is carefully controlled to prevent sudden changes. Scaling up is completely disabled. This is a very specific HPA configuration for a workload that needs to be scaled down, but never scaled up.
This is an OpenShift HorizontalPodAutoscaler (HPA) resource manifest that automatically scales the number of pods in a deployment based on resource utilization.
Breakdown of the Manifest
API Version and Kind
apiVersion: autoscaling/v2beta2 – Specifies the Kubernetes API version for HPA. This version supports advanced scaling policies (note: it has since been replaced by the stable autoscaling/v2).
kind: HorizontalPodAutoscaler – Defines the resource type as an HPA.
Metadata
name: hpa-resource-metrics-memory – The name of the HPA object.
namespace: default – The namespace where this HPA is applied.
Specification (spec)
This section defines the scaling behavior.
minReplicas: 20 – The minimum number of pod replicas must always be 20. Even if resource usage is low, the deployment will not scale below this number.
Scaling Behavior (behavior)
This section defines how the HPA scales pods up or down.
Scale Down (scaleDown)
stabilizationWindowSeconds: 300 – The system waits 300 seconds (5 minutes) before scaling down to prevent rapid fluctuations.
policies: – Defines scaling policies for decreasing the number of pods:
- type: Pods, value: 4, periodSeconds: 30 – Reduces up to 4 pods every 30 seconds.
- type: Percent, value: 10, periodSeconds: 60 – Reduces up to 10% of the total pod count every 60 seconds.
selectPolicy: Max – Uses the most aggressive scaling policy, meaning the largest reduction allowed by the policies is chosen.
Scale Up (scaleUp)
selectPolicy: Disabled – Scaling up is disabled, meaning the HPA will never increase the replica count on its own.
Summary
This HPA ensures that at least 20 pods are always running, allows controlled scale-down by removing a maximum of 4 pods every 30s or 10% every 60s, and completely disables scaling up.
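If you want to verify how the controller applies these limits at runtime, you can inspect the HPA with the standard oc commands (name and namespace taken from the manifest above):

# Show current/desired replicas, events, and controller conditions
oc describe hpa hpa-resource-metrics-memory -n default

# Watch the replica count converge as the scale-down policies apply
oc get hpa hpa-resource-metrics-memory -n default -w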