I have a working Kubernetes 1.23.9 cluster hosted on Google Kubernetes Engine with multi-cluster services enabled, one cluster hosted in the US and another in the EU. I have multiple deployment apps and an HPA configured for each through YAML. Out of 7 deployment apps, the HPA is only working for one. service-1 can only be accessed from service-2 internally, and service-2 is exposed through an HttpGateway by GKE. Please find more info below. Any help would be greatly appreciated.
Deployment file (I have posted only 2 apps): service-2's HPA is working fine, whereas service-1's is not.
$ cat deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-1
  namespace: backend
  labels:
    app: service-1
spec:
  replicas: 1
  selector:
    matchLabels:
      lbtype: internal
  template:
    metadata:
      labels:
        lbtype: internal
        app: service-1
    spec:
      containers:
      - name: service-1
        image: [REDACTED]
        ports:
        - containerPort: [REDACTED]
          name: "[REDACTED]"
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "500m"
      imagePullSecrets:
      - name: docker-gcr
      restartPolicy: Always
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-2
  namespace: backend
  labels:
    app: service-2
spec:
  replicas: 2
  selector:
    matchLabels:
      lbtype: external
  template:
    metadata:
      labels:
        lbtype: external
        app: service-2
    spec:
      containers:
      - name: service-2
        image: [REDACTED]
        ports:
        - containerPort: [REDACTED]
          name: "[REDACTED]"
        resources:
          requests:
            memory: "256Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "500m"
      imagePullSecrets:
      - name: docker-gcr
      restartPolicy: Always
HorizontalPodAutoscaler file:
$ cat horizontal-pod-scaling.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: service-1
  namespace: backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: service-1
  minReplicas: 1
  maxReplicas: 2
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: service-2
  namespace: backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: service-2
  minReplicas: 2
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
Service file:
$ cat service.yaml
apiVersion: v1
kind: Service
metadata:
  name: backend-internal
  namespace: backend
spec:
  type: ClusterIP
  ports:
  - name: service-1
    port: [REDACTED]
    targetPort: "[REDACTED]"
  selector:
    lbtype: internal
---
apiVersion: v1
kind: Service
metadata:
  name: backend-middleware
  namespace: backend
spec:
  ports:
  - name: service-2
    port: [REDACTED]
    targetPort: "[REDACTED]"
  selector:
    lbtype: external
$ kctl get hpa
NAME        REFERENCE              TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
service-1   Deployment/service-1   <unknown>/70%   1         2         1          18h
service-2   Deployment/service-2   4%/70%          2         4         2          18h
$ kctl top pod
NAME                        CPU(cores)   MEMORY(bytes)
service-1-8f7dc66cc-xtz76   3m           66Mi
service-2-5fd767cbc-vm7f5   4m           76Mi
$ kubectl describe deployment metrics-server-v0.5.2 -nkube-system
Name:               metrics-server-v0.5.2
Namespace:          kube-system
CreationTimestamp:  Fri, 02 Dec 2022 11:01:18 +0530
Labels:             addonmanager.kubernetes.io/mode=Reconcile
                    k8s-app=metrics-server
                    version=v0.5.2
Annotations:        components.gke.io/layer: addon
                    deployment.kubernetes.io/revision: 4
Selector:           k8s-app=metrics-server,version=v0.5.2
Replicas:           1 desired | 1 updated | 1 total | 1 available | 0 unavailable
...
  Containers:
   metrics-server:
    Image:      gke.gcr.io/metrics-server:v0.5.2-gke.1
    Port:       10250/TCP
    Host Port:  10250/TCP
    Command:
      /metrics-server
      --metric-resolution=30s
      --kubelet-port=10255
      --deprecated-kubelet-completely-insecure=true
      --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
      --cert-dir=/tmp
      --secure-port=10250
$ kctl describe hpa service-1
Conditions:
  Type            Status  Reason                   Message
  ----            ------  ------                   -------
  AbleToScale     True    ReadyForNewScale         recommended size matches current size
  ScalingActive   False   FailedGetResourceMetric  the HPA was unable to compute the replica count: no recommendation
  ScalingLimited  False   DesiredWithinRange       the desired count is within the acceptable range
Events:
  Type     Reason                   Age                  From                       Message
  ----     ------                   ----                 ----                       -------
  Warning  FailedGetResourceMetric  2m (x4470 over 18h)  horizontal-pod-autoscaler  no recommendation
$ kctl describe hpa service-2
Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    ReadyForNewScale  recommended size matches current size
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  True    TooFewReplicas    the desired replica count is less than the minimum replica count
Events:           <none>
As per my understanding, ScalingActive=False should not affect the autoscaling in a major way.
Check the possible solutions below:
1) Check the resource metric: you can remove the LIMITS from your deployments and try it. The pod's containers only need to have the relevant REQUESTS for resources set at the deployment level, and that alone may work (see the sketch after this list). If you see the HPA working, you can later play with LIMITS as well. This discussion tells you that using only REQUESTS is sufficient for the HPA.
2) FailedGetResourceMetric: check whether the metric is registered and available (also look at "Custom and external metrics"). Try executing kubectl top node and kubectl top pod -A to verify that metrics-server is working properly.
The HPA controller runs regularly to check whether any adjustments to the system are required. During each run, the controller manager queries the resource utilization against the metrics specified in each HorizontalPodAutoscaler definition. The controller manager obtains the metrics from the resource metrics API (for per-pod resource metrics) or from the custom metrics API (for all other metrics).
Basically, the HPA targets a Deployment by name and uses the Deployment's selector labels to get the pods' metrics. You may have two Deployments that use the same selector, in which case the HPA would get metrics for the pods of both Deployments. Try the same deployment with a kind cluster and it may work fine.
3) Kubernetes Metrics Server is a scalable, efficient source of container resource metrics for Kubernetes' built-in autoscaling pipelines, and it is what the HPA uses for CPU/memory based horizontal autoscaling.
Check the requirements: Kubernetes Metrics Server has specific requirements for cluster and network configuration. These requirements aren't the default for all cluster distributions. Please ensure that your cluster distribution supports them before using Metrics Server.
4) The HPA processes a scale-up event every 15-30 seconds, and it may take around 3-4 minutes because of the latency of metrics data.
5) Check this relevant SO for more information.
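For item 1, a minimal sketch of what service-1's container resources could look like while testing with requests only (the values are copied from the question; this is for troubleshooting, not a recommendation):
      containers:
      - name: service-1
        image: [REDACTED]
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
For item 2, you can also check which pods each Deployment's selector actually matches, to rule out overlapping selectors (namespace and labels taken from the question):
$ kubectl get pods -n backend -l lbtype=internal
$ kubectl get pods -n backend -l lbtype=external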
Related
I'm using a HPA based on a custom metric on GKE.
The HPA is not working and it's showing me this error log:
unable to fetch metrics from custom metrics API: the server is currently unable to handle the request
When I run kubectl get apiservices | grep custom I get
v1beta1.custom.metrics.k8s.io services/prometheus-adapter False (FailedDiscoveryCheck) 135d
This is the HPA spec config:
spec:
  scaleTargetRef:
    kind: Deployment
    name: api-name
    apiVersion: apps/v1
  minReplicas: 3
  maxReplicas: 50
  metrics:
  - type: Object
    object:
      target:
        kind: Service
        name: api-name
        apiVersion: v1
      metricName: messages_ready_per_consumer
      targetValue: '1'
And this is the service's spec config:
spec:
  ports:
  - name: worker-metrics
    protocol: TCP
    port: 8080
    targetPort: worker-metrics
  selector:
    app.kubernetes.io/instance: api
    app.kubernetes.io/name: api-name
  clusterIP: 10.8.7.9
  clusterIPs:
  - 10.8.7.9
  type: ClusterIP
  sessionAffinity: None
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
What should I do to make it work?
First of all, confirm that the Metrics Server pod is running in your kube-system namespace. Also, you can use the following manifest:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-server
  namespace: kube-system
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      volumes:
      # mount in tmp so we can safely use from-scratch images and/or read-only containers
      - name: tmp-dir
        emptyDir: {}
      containers:
      - name: metrics-server
        image: k8s.gcr.io/metrics-server-amd64:v0.3.1
        command:
        - /metrics-server
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP
        imagePullPolicy: Always
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
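Before going further, a quick way to confirm the metrics-server pod and its APIService are healthy (plain kubectl checks; the label matches the manifest above):
$ kubectl get pods -n kube-system -l k8s-app=metrics-server
$ kubectl get apiservice v1beta1.metrics.k8s.io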
If so, take a look at its logs and look for any stackdriver adapter lines. This issue is commonly caused by a problem with the custom-metrics-stackdriver-adapter. It usually crashes in the metrics-server namespace. To solve that, use the resource from this URL, and for the deployment, use this image:
gcr.io/google-containers/custom-metrics-stackdriver-adapter:v0.10.1
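A quick way to check whether the adapter itself is the problem (the custom-metrics namespace and deployment name are assumptions based on the adapter's default manifests; adjust them to your installation):
$ kubectl get apiservices | grep custom.metrics
$ kubectl get pods -n custom-metrics
$ kubectl logs -n custom-metrics deployment/custom-metrics-stackdriver-adapter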
Another common root cause of this is an OOM issue. In this case, adding more memory solves the problem. To assign more memory, you can specify the new memory amount in the configuration file, as the following example shows:
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo
  namespace: mem-example
spec:
  containers:
  - name: memory-demo-ctr
    image: polinux/stress
    resources:
      limits:
        memory: "200Mi"
      requests:
        memory: "100Mi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "150M", "--vm-hang", "1"]
In the above example, the container has a memory request of 100 MiB and a memory limit of 200 MiB. In the manifest, the "--vm-bytes", "150M" argument tells the container to attempt to allocate 150 MiB of memory. You can visit this Kubernetes official documentation for more information about memory settings.
You can use the following threads for more reference GKE - HPA using custom metrics - unable to fetch metrics, Stackdriver-metadata-agent-cluster-level gets OOMKilled, and Custom-metrics-stackdriver-adapter pod keeps crashing.
What do you get for kubectl get pod -l "app.kubernetes.io/instance=api,app.kubernetes.io/name=api-name"?
There should be a pod to which the service refers.
If there is a pod, check its logs with kubectl logs <pod-name>. You can add -f to the kubectl logs command to follow the logs.
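Two related checks can also help (the service name is taken from the question; add -n <namespace> if these resources are not in your current namespace):
$ kubectl get endpoints api-name   # should list the pod IPs the service selector matched
$ kubectl logs -f <pod-name>       # follow the logs of the matched pod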
Adding this block in my EKS nodes security group rules solved the issue for me:
node_security_group_additional_rules = {
  ...
  ingress_cluster_metricserver = {
    description                   = "Cluster to node 4443 (Metrics Server)"
    protocol                      = "tcp"
    from_port                     = 4443
    to_port                       = 4443
    type                          = "ingress"
    source_cluster_security_group = true
  }
  ...
}
I have a cluster in Google Kubernetes Engine and want to make one of the deployments autoscalable by memory.
After doing a deployment, I check the horizontal scaling with the following command:
kubectl describe hpa -n my-namespace
With this result:
Name:                                                   myapi-api-deployment
Namespace:                                              my-namespace
Labels:                                                 <none>
Annotations:                                            <none>
CreationTimestamp:                                      Tue, 15 Feb 2022 12:21:44 +0100
Reference:                                              Deployment/myapi-api-deployment
Metrics:                                                ( current / target )
  resource memory on pods (as a percentage of request): <unknown> / 50%
Min replicas:                                           1
Max replicas:                                           5
Deployment pods:                                        1 current / 1 desired
Conditions:
  Type            Status  Reason                   Message
  ----            ------  ------                   -------
  AbleToScale     True    ReadyForNewScale         recommended size matches current size
  ScalingActive   False   FailedGetResourceMetric  the HPA was unable to compute the replica count: failed to get memory utilization: missing request for memory
  ScalingLimited  False   DesiredWithinRange       the desired count is within the acceptable range
Events:
  Type     Reason                   Age                    From                       Message
  ----     ------                   ----                   ----                       -------
  Warning  FailedGetResourceMetric  2m22s (x314 over 88m)  horizontal-pod-autoscaler  failed to get memory utilization: missing request for memory
When I use the kubectl top command I can see the memory and cpu usage. Here is my deployment including the autoscale:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api-deployment
  namespace: my-namespace
  annotations:
    reloader.stakater.com/auto: "true"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-api
      version: v1
  template:
    metadata:
      labels:
        app: my-api
        version: v1
      annotations:
        sidecar.istio.io/rewriteAppHTTPProbers: "true"
    spec:
      serviceAccountName: my-api-sa
      containers:
      - name: esp
        image: gcr.io/endpoints-release/endpoints-runtime:2
        imagePullPolicy: Always
        args: [
          "--listener_port=9000",
          "--backend=127.0.0.1:8080",
          "--service=myproject.company.ai"
        ]
        ports:
        - containerPort: 9000
      - name: my-api
        image: gcr.io/myproject/my-api:24
        ports:
        - containerPort: 8080
        livenessProbe:
          httpGet:
            path: "/healthcheck"
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: "/healthcheck"
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        resources:
          limits:
            cpu: 500m
            memory: 2048Mi
          requests:
            cpu: 300m
            memory: 1024Mi
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my-api-deployment
  namespace: my-namespace
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-api-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: "Utilization"
        averageUtilization: 50
---
I'm using autoscaling/v2beta2, as recommended by the GKE documentation.
When using the HPA with memory or CPU, you need to set resource requests for whichever metric(s) your HPA is using. See How does a HorizontalPodAutoscaler work, specifically
For per-pod resource metrics (like CPU), the controller fetches the
metrics from the resource metrics API for each Pod targeted by the
HorizontalPodAutoscaler. Then, if a target utilization value is set,
the controller calculates the utilization value as a percentage of the
equivalent resource request on the containers in each Pod. If a target
raw value is set, the raw metric values are used directly.
Your HPA is set to target my-api-deployment, which has two containers. You have resource requests set for my-api but not for esp. So you just need to add a memory resource request to esp.
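A minimal sketch of what that could look like on the esp container (the request and limit values are placeholders to illustrate the shape, not tuned numbers; the name and image are from the question):
      - name: esp
        image: gcr.io/endpoints-release/endpoints-runtime:2
        # args and ports as in the question
        resources:
          requests:
            memory: "64Mi"   # placeholder: any memory request lets the HPA compute utilization
            cpu: "50m"       # optional placeholder
          limits:
            memory: "128Mi"  # placeholder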
My pod scaler fails to deploy, and keeps giving an error of FailedGetResourceMetric:
Warning FailedComputeMetricsReplicas 6s horizontal-pod-autoscaler failed to compute desired number of replicas based on listed metrics for Deployment/default/bot-deployment: invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
I have made sure to install metrics-server, as you can see when I run the following command to show the metrics-server resource on the cluster:
kubectl get deployment metrics-server -n kube-system
It shows this:
metrics-server
I also set the --kubelet-insecure-tls and --kubelet-preferred-address-types=InternalIP options in the args section of the metrics-server manifest file.
This is what my deployment manifest looks like:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bot-deployment
  labels:
    app: bot
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bot
  template:
    metadata:
      labels:
        app: bot
    spec:
      containers:
      - name: bot-api
        image: gcr.io/<repo>
        ports:
        - containerPort: 5600
        volumeMounts:
        - name: bot-volume
          mountPath: /core
      - name: wallet
        image: gcr.io/<repo>
        ports:
        - containerPort: 5000
        resources:
          requests:
            cpu: 800m
          limits:
            cpu: 1500m
        volumeMounts:
        - name: bot-volume
          mountPath: /wallet_
      volumes:
      - name: bot-volume
        emptyDir: {}
The specification for my pod scaler is shown below too:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: bot-scaler
spec:
  metrics:
  - resource:
      name: cpu
      target:
        averageUtilization: 85
        type: Utilization
    type: Resource
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: bot-deployment
  minReplicas: 1
  maxReplicas: 10
Because of this, the TARGET column always remains at <unknown>/80%. Upon inspection, the HPA makes the same complaint over and over again. I have tried all the options I have seen in some other questions, but none of them seem to work. I have also tried uninstalling and reinstalling the metrics-server many times, but it doesn't work.
One thing I notice, though, is that the metrics-server seems to shut down after I deploy the HPA manifest, and it fails to start. When I check the state of the metrics-server, the READY column shows 0/1 even though it was initially 1/1. What could be wrong?
I will gladly provide as much info as needed. Thank you!
Looks like your bot-api container is missing its resource requests and limits; your wallet container has them, though. The HPA uses the resources of all containers in the pod to calculate the utilization.
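A sketch of what that could look like for the bot-api container (the CPU figures are placeholders, not a recommendation; the name, image, port and volume mount are from the question):
      - name: bot-api
        image: gcr.io/<repo>
        ports:
        - containerPort: 5600
        resources:
          requests:
            cpu: 200m        # placeholder request so the HPA can compute CPU utilization
          limits:
            cpu: 500m        # placeholder limit
        volumeMounts:
        - name: bot-volume
          mountPath: /core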
I am trying to run a test pod with OpenShift CLI:
$oc run nginx --image=nginx --limits=cpu=2,memory=4Gi
deploymentconfig.apps.openshift.io/nginx created
$oc describe deploymentconfig.apps.openshift.io/nginx
Name:           nginx
Namespace:      myproject
Created:        12 seconds ago
Labels:         run=nginx
Annotations:    <none>
Latest Version: 1
Selector:       run=nginx
Replicas:       1
Triggers:       Config
Strategy:       Rolling
Template:
Pod Template:
  Labels:       run=nginx
  Containers:
   nginx:
    Image:        nginx
    Port:         <none>
    Host Port:    <none>
    Limits:
      cpu:        2
      memory:     4Gi
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Deployment #1 (latest):
  Name:         nginx-1
  Created:      12 seconds ago
  Status:       New
  Replicas:     0 current / 0 desired
  Selector:     deployment=nginx-1,deploymentconfig=nginx,run=nginx
  Labels:       openshift.io/deployment-config.name=nginx,run=nginx
  Pods Status:  0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Events:
  Type     Reason             Age                From                         Message
  ----     ------             ----               ----                         -------
  Normal   DeploymentCreated  12s                deploymentconfig-controller  Created new replication controller "nginx-1" for version 1
  Warning  FailedCreate       1s (x12 over 12s)  deployer-controller          Error creating deployer pod: pods "nginx-1-deploy" is forbidden: failed quota: quota-svc-myproject: must specify limits.cpu,limits.memory
I get "must specify limits.cpu,limits.memory" error, despite both limits being present in the same describe output.
What might be the problem and how do I fix it?
I found a solution!
Part of the error message was "Error creating deployer pod". It means that the problem is not with my pod, but with the deployer pod which performs my pod deployment.
It seems the quota in my project affects deployer pods as well.
I couldn't find a way to set deployer pod limits with CLI, so I've made a DeploymentConfig.
kind: "DeploymentConfig"
apiVersion: "v1"
metadata:
name: "test-app"
spec:
template:
metadata:
labels:
name: "test-app"
spec:
containers:
- name: "test-app"
image: "nginxinc/nginx-unprivileged"
resources:
limits:
cpu: "2000m"
memory: "20Gi"
ports:
- containerPort: 8080
protocol: "TCP"
replicas: 1
selector:
name: "test-app"
triggers:
- type: "ConfigChange"
- type: "ImageChange"
imageChangeParams:
automatic: true
containerNames:
- "test-app"
from:
kind: "ImageStreamTag"
name: "nginx-unprivileged:latest"
strategy:
type: "Rolling"
resources:
limits:
cpu: "2000m"
memory: "20Gi"
As you can see, two sets of limits are specified here: one for the container and one for the deployment strategy.
With this configuration it worked fine!
Looks like you have specified a resource quota, and the values you specified for limits seem to be larger than it allows. Can you describe the resource quota with oc describe quota quota-svc-myproject and adjust your configs accordingly?
A good reference could be https://docs.openshift.com/container-platform/3.11/dev_guide/compute_resources.html
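Not from either answer above, just a hedged sketch of another approach that is sometimes used for this class of quota error: a LimitRange in the project gives default limits to pods that do not declare any (such as the deployer pod), so they can satisfy the quota. The name and values below are hypothetical.
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits        # hypothetical name
  namespace: myproject
spec:
  limits:
  - type: Container
    default:                  # limits applied to containers that omit them
      cpu: "500m"
      memory: "512Mi"
    defaultRequest:           # requests applied to containers that omit them
      cpu: "100m"
      memory: "128Mi"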
I am trying to setup HPA for my AKS cluster. Following is the Kubernetes manifest file:
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    kompose.cmd: XXXXXX\tools\kompose.exe
      convert
    kompose.version: 1.21.0 (992df58d8)
  creationTimestamp: null
  labels:
    io.kompose.service: loginservicedapr
  name: loginservicedapr
spec:
  replicas: 1
  selector:
    matchLabels:
      io.kompose.service: loginservicedapr
  strategy: {}
  template:
    metadata:
      annotations:
        kompose.cmd: XXXXXX\kompose.exe
          convert
        kompose.version: 1.21.0 (992df58d8)
      creationTimestamp: null
      labels:
        io.kompose.service: loginservicedapr
    spec:
      containers:
      - image: XXXXXXX.azurecr.io/loginservicedapr:latest
        imagePullPolicy: ""
        name: loginservicedapr
        resources:
          requests:
            cpu: 250m
          limits:
            cpu: 500m
        ports:
        - containerPort: 80
        resources: {}
      restartPolicy: Always
      serviceAccountName: ""
      volumes: null
status: {}
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    kompose.cmd: XXXXXXXXXX\kompose.exe
      convert
    kompose.version: 1.21.0 (992df58d8)
  creationTimestamp: null
  labels:
    io.kompose.service: loginservicedapr
  name: loginservicedapr
spec:
  type: LoadBalancer
  ports:
  - name: "5016"
    port: 5016
    targetPort: 80
  selector:
    io.kompose.service: loginservicedapr
status:
  loadBalancer: {}
Following is my HPA yaml file:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: loginservicedapr-hpa
spec:
  maxReplicas: 10 # define max replica count
  minReplicas: 3  # define min replica count
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: loginservicedapr
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Pods
    pods:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
But the HPA is failing with the error 'FailedGetResourceMetric' - 'missing request for CPU'.
I have also installed metrics-server (though not sure whether that was required or not) using the following statement:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
But still I am getting the following output when I do 'kubectl describe hpa':
Name:                                                  loginservicedapr-hpa
Namespace:                                             default
Labels:                                                fluxcd.io/sync-gc-mark=sha256.Y6dHhIOs-hNYbDmJ25Ijw1YsJ_8f0PH3Vlruj5rfbFk
Annotations:                                           fluxcd.io/sync-checksum: d5c0d9eda6db0c40f1e5e23e1356d0268dbccc8f
                                                       kubectl.kubernetes.io/last-applied-configuration:
                                                         {"apiVersion":"autoscaling/v1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{"fluxcd.io/sync-checksum":"d5c0d9eda6db0c40f1e5...
CreationTimestamp:                                     Wed, 08 Jul 2020 17:19:47 +0530
Reference:                                             Deployment/loginservicedapr
Metrics:                                               ( current / target )
  resource cpu on pods (as a percentage of request):   <unknown> / 50%
Min replicas:                                          3
Max replicas:                                          10
Deployment pods:                                       3 current / 3 desired
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: missing request for cpu
Events:
  Type     Reason                        Age                      From                       Message
  ----     ------                        ----                     ----                       -------
  Warning  FailedComputeMetricsReplicas  33m (x1234 over 6h3m)    horizontal-pod-autoscaler  Invalid metrics (1 invalid out of 1), last error was: failed to get cpu utilization: missing request for cpu
  Warning  FailedGetResourceMetric       3m11s (x1340 over 6h3m)  horizontal-pod-autoscaler  missing request for cpu
I have 2 more services that I have deployed along with 'loginservicedapr'. But I have not written HPA for those services. But I have included resource limits for those services as well in their YAML files. How to make this HPA work?
resources appears twice in your pod spec.
        resources:        # once here
          requests:
            cpu: 250m
          limits:
            cpu: 500m
        ports:
        - containerPort: 80
        resources: {}     # another here, clearing it
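In other words, a sketch of the corrected container section with only the one resources block kept (values copied from the question), so the empty one no longer clears it:
        resources:
          requests:
            cpu: 250m
          limits:
            cpu: 500m
        ports:
        - containerPort: 80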
I was able to resolve the issue by changing the following in my kubernetes manifest file from this:
resources:
  requests:
    cpu: 250m
  limits:
    cpu: 500m
to the following:
resources:
  requests:
    cpu: "250m"
  limits:
    cpu: "500m"
HPA worked after that. Following is the GitHub link which gave the solution:
https://github.com/kubernetes-sigs/metrics-server/issues/237
But I did not add any Internal IP address command or anything else.
This is typically related to the metrics server.
Make sure you are not seeing anything unusual about the metrics server installation:
# This should show you metrics (they come from the metrics server)
$ kubectl top pods
$ kubectl top nodes
or check the logs:
$ kubectl logs <metrics-server-pod>
Also, check your kube-controller-manager logs for HPA-related event entries.
Furthermore, if you'd like to explore more on whether your pods have missing requests/limits you can simply see the full output of your running pod managed by the HPA:
$ kubectl get pod <pod-name> -o=yaml
Some other people have had luck deleting and renaming the HPA too.
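If you go the delete-and-recreate route, the commands are simply (the HPA name is from this question; the file name is a placeholder for whatever your HPA manifest is called):
$ kubectl delete hpa loginservicedapr-hpa
$ kubectl apply -f hpa.yaml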