Cluster autoscaler v1.0.4 Kubernetes error

I'm getting the error below:
W0316 22:04:26.025272 1 clusterstate.go:514] Failed to get nodegroup for <nodename>: Wrong id: expected format aws:///<zone>/<name>, got
W0316 22:04:26.025296 1 clusterstate.go:514] Failed to get nodegroup for <nodename>: Wrong id: expected format aws:///<zone>/<name>, got
W0316 22:04:26.025303 1 clusterstate.go:514] Failed to get nodegroup for <nodename>: Wrong id: expected format aws:///<zone>/<name>, got
W0316 22:04:26.025309 1 clusterstate.go:514] Failed to get nodegroup for <nodename>: Wrong id: expected format aws:///<zone>/<name>, got
W0316 22:04:26.025316 1 clusterstate.go:514] Failed to get nodegroup for <nodename>: Wrong id: expected format aws:///<zone>/<name>, got
W0316 22:04:26.025324 1 clusterstate.go:514] Failed to get nodegroup for <nodename>: Wrong id: expected format aws:///<zone>/<name>, got
W0316 22:04:26.025340 1 clusterstate.go:560] Readiness for node group *** not found
E0316 22:04:02.705833 1 static_autoscaler.go:257] Failed to scale up: failed to build node infos for node groups: Wrong id: expected format aws:///<zone>/<name>, got
I am using cluster-autoscaler:
https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler

This happens because some of your nodes do not have a tag identifying their node group.
As @Matthew L Daniel mentioned in his comment, the autoscaler needs a tag on the AWS instances to work properly.
Here is what the official documentation says about how identification works and why:
It is assumed that the underlying cluster is run on top of some kind of node groups. Inside a node group, all machines have identical capacity and have the same set of assigned labels. Thus, increasing a size of a node group will create a new machine that will be similar to those already in the cluster - they will just not have any user-created pods running (but will have all pods run from the node manifest and daemon sets.)
As you can find in the installation documentation:
To run a cluster-autoscaler which auto-discovers ASGs with nodes, use the --node-group-auto-discovery flag and tag the ASGs with the key k8s.io/cluster-autoscaler/enabled and the key kubernetes.io/cluster/< YOUR CLUSTER NAME >.
So, just add those tags to your ASGs.
Also, you can use as many additional AWS tags and Kubernetes labels on a node as you want; they will not affect the autoscaler.
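For example, the tags can be added with the AWS CLI. A minimal sketch, assuming an ASG named my-asg and a cluster named my-cluster (both placeholders):

# Tag the ASG so the autoscaler can discover it (names are placeholders)
aws autoscaling create-or-update-tags --tags \
  "ResourceId=my-asg,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=true" \
  "ResourceId=my-asg,ResourceType=auto-scaling-group,Key=kubernetes.io/cluster/my-cluster,Value=owned,PropagateAtLaunch=true"

PropagateAtLaunch=true makes new instances launched by the ASG inherit the tags.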
UPD:
The reason the autoscaler was not working and crashed on getting the ProviderID was a missing --cloud-provider option value in the kubelet. Adding the aws value should fix that kind of issue.
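For illustration, on a systemd-managed node this can be done with a kubelet drop-in; the file path and variable name below are assumptions, so adapt them to how your kubelet is launched:

# /etc/systemd/system/kubelet.service.d/20-cloud-provider.conf
[Service]
Environment="KUBELET_EXTRA_ARGS=--cloud-provider=aws"

Then run systemctl daemon-reload && systemctl restart kubelet on the node.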


AKS Application Gateway Ingress Controller Issue

Below is the exception I am facing while implementing AGIC (Application Gateway Ingress Controller) in AKS.
The readiness probe is failing for the ingress-azure pod.
Events:
Type Reason Age From Message
Normal Scheduled 5m22s default-scheduler Successfully assigned default/ingress-azure-fc5dcbcd8-bsgt8 to aks-agentpool-22890870-vmss000002
Normal Pulling 5m22s kubelet Pulling image "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.4.0"
Normal Pulled 5m22s kubelet Successfully pulled image "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.4.0" in 121.018102ms
Normal Created 5m22s kubelet Created container ingress-azure
Normal Started 5m22s kubelet Started container ingress-azure
Warning Unhealthy 21s (x30 over 5m11s) kubelet Readiness probe failed: Get "http://10.240.xx.xxx:8123/health/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
kubectl logs -f mic_xxxx:
failed to update user-assigned identities on node aks-agentpool-2xxxxx-vmss (add [1], del [0], update[0]), error: failed to get identity resource, error: failed to get vmss aks-agentpool-2xxxx-vmss in resource group MC_Axx-xx_axxx-ak8_koreacentral, error: failed to get vmss aks-agentpool-2xxxxx-vmss in resource group MC_Axx-axxx_agw-ak8_koreacentral, error: compute.VirtualMachineScaleSetsClient#Get: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed" Message="The client '4xxxxxx-xxxxxxx-7xxx-xxxxxxx' with object id '4xxxxxx-xxxxxxx-7xxx-xxxxxxx' does not have authorization to perform action 'Microsoft.Compute/virtualMachineScaleSets/read' over scope '/subscriptions/{subscription_id}/resourceGroups/{MC_rg_name}/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool-2xxxxx-vmss' or the scope is invalid. If access was recently granted, please refresh your credentials."
Steps Implemented:
AKS cluster with RBAC enabled & Azure CNI
2 subnets in the same vnet with same resource group (Not the RG which starts with MC_)
Provided Contributor & Reader access to the AGW after implementing it (a sketch of this step follows the list).
Applied
kubectl apply -f https://raw.githubusercontent.com/Azure/aad-pod-identity/v1.8.8/deploy/infra/deployment-rbac.yaml
Made the corresponding changes in helm-config.yaml and authenticated using identityResourceID.
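For reference, the Contributor & Reader access step above was done roughly like this (a minimal az CLI sketch; all IDs and scopes are placeholders):

# Grant the controller identity access to the Application Gateway (placeholder IDs/scopes)
az role assignment create --assignee <identity-client-id> --role Contributor \
  --scope <application-gateway-resource-id>
az role assignment create --assignee <identity-client-id> --role Reader \
  --scope /subscriptions/<subscription-id>/resourceGroups/<app-gateway-resource-group>

Note that the 403 in the mic logs above refers to the VMSS in the MC_ resource group, which is a different scope from the ones granted here.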
Any suggestions on this exception would be appreciated. Thanks.

Enable custom kubernetes scheduler for a namespace

I have a k8s Job that brings up multiple pods. This Job is used for load testing, so all the pods need to come up at the same time; the Job shouldn't be started until nodes are available for all pods to be scheduled.
I came across kube-batch (https://github.com/kubernetes-sigs/kube-batch) to do this scheduling. I have a couple of questions:
1. How do I enable kube-batch for only one namespace in a cluster?
2. I installed kube-batch by following the tutorial, but the pods are failing on startup with the error below. How do I resolve this error?
I1204 20:07:55.911393 1 allocate.go:96] Queue <default> is overused, ignore it.
I1204 20:07:55.911399 1 allocate.go:194] Leaving Allocate ...
I1204 20:07:55.911407 1 backfill.go:41] Enter Backfill ...
I1204 20:07:55.911413 1 backfill.go:71] Leaving Backfill ...
E1204 20:07:55.911521 1 runtime.go:69] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/home/root1/servicecomb/go/src/github.com/kubernetes-sigs/kube-batch/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:76
/home/root1/servicecomb/go/src/github.com/kubernetes-sigs/kube-batch/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/home/root1/servicecomb/go/src/github.com/kubernetes-sigs/kube-batch/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/usr/local/go/src/runtime/asm_amd64.s:522
/usr/local/go/src/runtime/panic.go:513
/usr/local/go/src/runtime/panic.go:82
/usr/local/go/src/runtime/signal_unix.go:390
/home/root1/servicecomb/go/src/github.com/kubernetes-sigs/kube-batch/pkg/scheduler/framework/session.go:368
/home/root1/servicecomb/go/src/github.com/kubernetes-sigs/kube-batch/pkg/scheduler/plugins/gang/gang.go:154
/home/root1/servicecomb/go/src/github.com/kubernetes-sigs/kube-batch/pkg/scheduler/framework/framework.go:58
/home/root1/servicecomb/go/src/github.com/kubernetes-sigs/kube-batch/pkg/scheduler/scheduler.go:102
/home/root1/servicecomb/go/src/github.com/kubernetes-sigs/kube-batch/pkg/scheduler/scheduler.go:85
/home/root1/servicecomb/go/src/github.com/kubernetes-sigs/kube-batch/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133
/home/root1/servicecomb/go/src/github.com/kubernetes-sigs/kube-batch/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134
/home/root1/servicecomb/go/src/github.com/kubernetes-sigs/kube-batch/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
/usr/local/go/src/runtime/asm_amd64.s:1333
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x148 pc=0x10ab979]
I'm not sure what you are trying to achieve is doable. In my opinion, what you can do is modify the pod's Dockerfile to include supervisord. Then, in the supervisord configuration, specify the commands you want to run when the pod comes into the running state, using supervisord priorities.
Example:
[program:api]
directory=/usr/local
command=go run main.go                ; command to start the service
priority=100                          ; lower numbers start first
autostart=true
autorestart=true
stderr_logfile=/var/log/api.err.log   ; supervisord expects file paths, not a directory
stdout_logfile=/var/log/api.out.log
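To tie this together, a minimal Dockerfile sketch; the base image, paths, and file names are assumptions:

# Debian-based Go image so apt-get is available (version is a placeholder)
FROM golang:1.21
RUN apt-get update && apt-get install -y supervisor
WORKDIR /usr/local
COPY main.go .
# Drop the [program:api] config where Debian's supervisor picks it up
COPY supervisord.conf /etc/supervisor/conf.d/api.conf
# Run supervisord in the foreground as PID 1
CMD ["supervisord", "-n"]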

Mounting a Kubernetes Volume with Quarkus

I am trying to mount a volume into a pod so that one deployment can write to it and another deployment can read from it. I am using Minikube with Docker on Ubuntu, and I am running ./mvnw clean package -Dquarkus.kubernetes.deploy=true.
From the Quarkus documentation, it seems pretty straightforward, but I'm running into trouble.
When I add this line quarkus.kubernetes.mounts.my-volume.path=/volumePath to my application.properties, I get the following error:
[ERROR] Failed to execute goal io.quarkus:quarkus-maven-plugin:1.6.0.Final:build (default) on project getting-started: Failed to build quarkus application: io.quarkus.builder.BuildException: Build failure: Build failed due to errors
[ERROR] [error]: Build step io.quarkus.kubernetes.deployment.KubernetesDeployer#deploy threw an exception: io.dekorate.deps.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://IP:8443/apis/apps/v1/namespaces/default/deployments. Message: Deployment.apps "getting-started" is invalid: spec.template.spec.containers[0].volumeMounts[0].name: Not found: "my-volume". Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[StatusCause(field=spec.template.spec.containers[0].volumeMounts[0].name, message=Not found: "my-volume", reason=FieldValueNotFound, additionalProperties={})], group=apps, kind=Deployment, name=getting-started, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Deployment.apps "getting-started" is invalid: spec.template.spec.containers[0].volumeMounts[0].name: Not found: "my-volume", metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
When I add quarkus.kubernetes.config-map-volumes.my-volume.config-map-name=my-volume (along with the previous property), the error goes away, but the pod does not start. Running "kubectl describe pods" returns:
Normal Scheduled <unknown> default-scheduler Successfully assigned default/getting-started-859d89fc8-tbg6w to minikube
Warning FailedMount 14s (x6 over 30s) kubelet, minikube MountVolume.SetUp failed for volume "my-volume" : configmap "my-volume" not found
Does this mean the volume is not being set correctly in the YAML file?
So my question is: how can I set the name of the volume in application.properties so that the volume gets mounted in the pod?
I recommend you look at the kubernetes.yml and kubernetes.json files generated under target/kubernetes.
For the first error: it looks like my-volume needs to exist in your cluster, for example as a PersistentVolume.
For the second error: quarkus.kubernetes.config-map-volumes.my-volume.config-map-name=my-volume mounts a ConfigMap, so the actual ConfigMap named my-volume needs to be defined and exist in your cluster.
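A minimal sketch of creating the missing ConfigMap with kubectl; the key and value here are placeholders for whatever data your application expects:

kubectl create configmap my-volume --from-literal=example.key=example-value

Once the ConfigMap exists, the FailedMount warning from the kubelet should go away and the pod should be able to start.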

Kubernetes cluster-autoscaler not working

I have deployed the cluster-autoscaler in my AWS kube cluster, and it is failing with the error below.
W0411 03:07:37.393124 1 clusterstate.go:514] Failed to get nodegroup for dev-k8s-node-asg-230-i-089e4d2f163533989: Wrong id: expected format aws:////, got
W0411 03:07:37.393145 1 clusterstate.go:514] Failed to get nodegroup for stg-k8s-w2-npe-master-3: Wrong id: expected format aws:////, got
W0411 03:07:37.393152 1 clusterstate.go:514] Failed to get nodegroup for dev-k8s-node-prm-1: Wrong id: expected format aws:////, got
W0411 03:07:37.393158 1 clusterstate.go:514] Failed to get nodegroup for dev-k8s-node-asg-230-i-0eb3341fce85be39c: Wrong id: expected format aws:////, got
W0411 03:07:37.393164 1 clusterstate.go:514] Failed to get nodegroup for dev-k8s-node-asg-230-i-091d1a037311d5daf: Wrong id: expected format aws:////, got
W0411 03:07:37.393169 1 clusterstate.go:514] Failed to get nodegroup for dev-k8s-node-asg-230-i-041dd54f2baaa4553: Wrong id: expected format aws:////, got
W0411 03:07:37.393188 1 clusterstate.go:560] Readiness for node group dev-k8s-node-asg-230 not found
W0411 03:07:37.393203 1 clusterstate.go:560] Readiness for node group stg-k8s-agent-w2-asg not found
Autoscaler configuration command:
./cluster-autoscaler \
--v=6 \
--stderrthreshold=info \
--cloud-provider=aws \
--skip-nodes-with-local-storage=false \
--expander=least-waste \
--node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,kubernetes.io/cluster/dev
I have added the corresponding tags to my autoscaling group. Can someone help me understand this error?
You may try to create a cloud-config.conf and insert the KubernetesClusterTag, KubernetesClusterID and Zone properties there. Please verify that
--cloud-provider=aws
is present on every node in the kubelet command line, and that
--cloud-config=
points to the correct path of cloud-config.conf.
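For illustration, a minimal cloud-config.conf sketch; the zone, tag, and cluster ID values are placeholders for your environment:

[Global]
Zone=us-west-2a
KubernetesClusterTag=kubernetes.io/cluster/dev
KubernetesClusterID=dev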

Kubernetes regression (1.6 - 1.7) - openstack cinder provider

I cannot get the Cinder plugin to work with Kubernetes 1.7.
It worked well with 1.6. With the same configuration, I get the following error with 1.7:
E1011 16:13:44.309318 5 openstack_volumes.go:320] Failed to create a 3 GB volume: Invalid request due to incorrect syntax or missing required parameters.
I1011 16:13:44.309411 5 cinder_util.go:207] Error creating cinder volume: Invalid request due to incorrect syntax or missing required parameters.
I1011 16:13:44.309458 5 pv_controller.go:1331] failed to provision volume for claim "default/my-persistent-volume-claim" with StorageClass "standard": Invalid request due to incorrect syntax or missing required parameters.
Thanks for your help.