Spinnaker pipeline failing when deployment strategy is Recreate - kubernetes

When the deployment strategy is changed from "Rolling update" to "Recreate", I get the error below:
Failure executing: PATCH at: https://3x.xxx.2x1.xxx/apis/extensions/v1beta1/namespaces/default/deployments/xxxxxx. Message: Deployment.apps "xxxxxx" is invalid: spec.strategy.rollingUpdate: Forbidden: may not be specified when strategy type is 'Recreate'. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[StatusCause(field=spec.strategy.rollingUpdate, message=Forbidden: may not be specified when strategy type is 'Recreate', reason=FieldValueForbidden, additionalProperties={})], group=apps, kind=Deployment, name=xxxxxx, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Deployment.apps "xxxxxx" is invalid: spec.strategy.rollingUpdate: Forbidden: may not be specified when strategy type is 'Recreate', metadata=ListMeta(resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
Any help on this? I am using Spinnaker 1.6.0

There are many tickets on GitHub related to this problem, for Kubernetes, cert-manager, and Spinnaker, and each one gives the same answer: it is not possible to switch the update strategy of an already created resource. The error message shows why the patch is rejected: the existing spec.strategy.rollingUpdate block is still part of the spec, and that field is forbidden once the strategy type is Recreate.
So, because of how Kubernetes applies updates, the only way is to create a new deployment with the new strategy.
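The validation rule quoted in the error can be sketched in a few lines. The snippet below is a minimal illustration in Python, not Kubernetes or Spinnaker code: `strategy_errors` and `manifest_for_recreate` are hypothetical helper names, and the Deployment spec is modeled as a plain dict. It shows why a spec that still carries a `rollingUpdate` block is rejected once the type is Recreate, and what the strategy section of a freshly created deployment would look like.

```python
import copy


def strategy_errors(spec):
    """Return validation messages for the strategy section of a Deployment spec.

    Mirrors only the single rule quoted in the error above; the real API
    server performs many more checks.
    """
    strategy = spec.get("strategy", {})
    errors = []
    if strategy.get("type") == "Recreate" and strategy.get("rollingUpdate") is not None:
        errors.append(
            "spec.strategy.rollingUpdate: Forbidden: may not be specified "
            "when strategy type is 'Recreate'"
        )
    return errors


def manifest_for_recreate(spec):
    """Return a copy of the spec whose strategy is Recreate with no rollingUpdate block."""
    new_spec = copy.deepcopy(spec)
    new_spec["strategy"] = {"type": "Recreate"}
    return new_spec


# Hypothetical spec: the type was changed, but the old rollingUpdate block remains.
patched = {
    "replicas": 2,
    "strategy": {
        "type": "Recreate",
        "rollingUpdate": {"maxSurge": 1, "maxUnavailable": 1},
    },
}

print(strategy_errors(patched))                         # one "Forbidden" message
print(strategy_errors(manifest_for_recreate(patched)))  # no messages
```

The second call passes because the new spec never contains the leftover `rollingUpdate` block, which is the situation a newly created deployment is in.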

Related

Cert-Manager: renewing certificate not working

Folks, I am trying to renew certificates for a wildcard domain, and I am seeing the following errors in the logs of the certmanager pod and in the certificaterequest:
Message: Waiting on certificate issuance from order
production/certmanager-xxxxxxxxx-pp9n2-3392968554: "pending"
production/cert-manager-877fd747c-4nf2f[cert-manager]: E0817 21:32:34.447585 1 controller.go:166] cert-manager/challenges "msg"="re-queuing item due to error processing" "error"="failed to change Route 53 record set: InvalidChangeBatch: [RRSet with DNS name _acme-challenge.xxxxxx.com., type TXT, SetIdentifier \"xxxxxxx\" cannot be created because a non multivalue answer rrset exists with the same name and type.]" "key"="production/certmanager-xxxx-pp9n2-3392968554-1376642102"
Do I need to update the TXT record in DNS? Currently it is set to a different value than the SetIdentifier value from the output above.
I am also noticing a strange error in the log. The pod name it mentions is incorrect; a different pod with another name is running:
production/cert-manager-877fd747c-4nf2f[cert-manager]: E0817 21:45:46.379332 1 controller.go:208] cert-manager/challenges "msg"="challenge in work queue no longer exists" "error"="challenge.acme.cert-manager.io \"certmanager-idrive-ssl-srvw4-3392968554-1376642102\" not found"
Thanks!

InvalidIdentityToken: Couldn't retrieve verification key from your identity provider

I am new to AWS and kubectl, and I need to deploy one of our apps to AWS. After deploying to an EKS cluster, I edited the ingress with kubectl, but unfortunately it returned 404 not found. (I am pretty sure the new service container works fine.)
After checking with kubectl describe ingress, I see the following events:
Warning FailedBuildModel 40m ingress Failed build model due to WebIdentityErr: failed to retrieve credentials
caused by: InvalidIdentityToken: Couldn't retrieve verification key from your identity provider, please reference AssumeRoleWithWebIdentity documentation for requirements
status code: 400, request id: xxxxxxxx-4a93-4e27-9d6b-xxxxxxxx
Warning FailedBuildModel 22m ingress Failed build model due to WebIdentityErr: failed to retrieve credentials
caused by: InvalidIdentityToken: Couldn't retrieve verification key from your identity provider, please reference AssumeRoleWithWebIdentity documentation for requirements
status code: 400, request id: xxxxxxxx-5368-41e1-8a4d-xxxxxxxx
Warning FailedBuildModel 5m8s ingress Failed build model due to WebIdentityErr: failed to retrieve credentials
caused by: InvalidIdentityToken: Couldn't retrieve verification key from your identity provider, please reference AssumeRoleWithWebIdentity documentation for requirements
status code: 400, request id: xxxxxxxx-20ea-4bd0-b1cb-xxxxxxxx
Does anyone have ideas about this issue?

Kubernetes control plane how to increase Endpoint Slices for Service

I am getting the error below and am unable to resolve it.
"Error updating Endpoint Slices for Service default/centosdei2-service: failed to update centosdei2-service-dc4tm EndpointSlice for Service default/centosdei2-service: EndpointSlice.discovery.k8s.io "centosdei2-service-dc4tm" is invalid: ports: Too many: 109: must have at most 100 items"
I tried adding " - --max-endpoints-per-slice=150" to /etc/kubernetes/manifests/kube-controller-manager.yaml and restarted the master, but it didn't work. Any pointers?
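Reading the message closely may help: it complains about the number of ports in the slice (109, with a limit of 100), not the number of endpoints. As far as I know, --max-endpoints-per-slice only bounds how many endpoints go into one slice, which would explain why changing it had no effect. Below is a minimal Python sketch for checking a Service's port count against that limit, assuming the Service has already been loaded into a dict (for example from JSON output). `MAX_PORTS_PER_SLICE` and `port_count_problem` are names made up for this illustration, and the sample Service is hypothetical.

```python
MAX_PORTS_PER_SLICE = 100  # limit quoted in the error message above


def port_count_problem(service):
    """Return a message if the Service declares more ports than the quoted limit, else None."""
    ports = service.get("spec", {}).get("ports", [])
    if len(ports) > MAX_PORTS_PER_SLICE:
        return (
            f"ports: Too many: {len(ports)}: "
            f"must have at most {MAX_PORTS_PER_SLICE} items"
        )
    return None


# Hypothetical Service with 109 ports, matching the count in the error.
service = {
    "metadata": {"name": "centosdei2-service", "namespace": "default"},
    "spec": {"ports": [{"name": f"p{i}", "port": 8000 + i} for i in range(109)]},
}

print(port_count_problem(service))
```

If the check fires, one direction to explore is splitting the ports across more than one Service so that each stays at or under the limit.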

Kubernetes cluster working but getting this error from the NGINX controller

Although the cluster is working as expected, this error is somewhat troublesome.
Kubernetes Version: v1.17.3
E0407 17:57:54.426952 1 reflector.go:123] github.com/nginxinc/kubernetes-ingress/nginx-ingress/internal/k8s/controller.go:341: Failed to list *v1.VirtualServerRoute: virtualserverroutes.k8s.nginx.org is forbidden: User "system:serviceaccount:kube-system:default" cannot list resource "virtualserverroutes" in API group "k8s.nginx.org" at the cluster scope
To fix the problem you have to disable list/watch operations on virtualserver and virtualserverroutes - set the --enable-custom-resources flag to false in your deployment/daemonset manifest.
--enable-custom-resources
Enables custom resources (default true)
See also: nginx-ingress-controller-configuration, disabling-list-watch-virtualserver.
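As a rough illustration of that change, the Python sketch below sets the flag in the container arguments of a manifest that has already been loaded into a dict. The manifest shape, the container name "nginx-ingress", the placeholder argument, and the helper `disable_custom_resources` are all assumptions made for this example; the flag is spelled the same way as above.

```python
FLAG = "--enable-custom-resources"


def disable_custom_resources(manifest, container_name):
    """Set FLAG=false in the args of the named container.

    Any existing occurrence of the flag is replaced. Returns True if the
    container was found, False otherwise.
    """
    containers = manifest["spec"]["template"]["spec"]["containers"]
    for container in containers:
        if container.get("name") != container_name:
            continue
        args = [
            arg
            for arg in container.get("args", [])
            if arg != FLAG and not arg.startswith(FLAG + "=")
        ]
        args.append(FLAG + "=false")
        container["args"] = args
        return True
    return False


# Hypothetical Deployment manifest; "--some-other-flag=value" is a placeholder.
manifest = {
    "kind": "Deployment",
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "nginx-ingress",
                        "args": ["--some-other-flag=value", FLAG + "=true"],
                    }
                ]
            }
        }
    },
}

print(disable_custom_resources(manifest, "nginx-ingress"))
print(manifest["spec"]["template"]["spec"]["containers"][0]["args"])
```

The same pod template path applies to a daemonset manifest, so the helper does not need to distinguish between the two kinds.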

Concourse 3.3.0 spitting hard to debug error: "json: unsupported type: map[interface {}]interface {}"

We are using some community custom resource types (https://github.com/ljfranklin/terraform-resource and https://github.com/cloudfoundry/bosh-deployment-resource). After upgrading to concourse 3.3.0, we've begun consistently seeing the following error on a few of our jobs at the same step: json: unsupported type: map[interface {}]interface {}.
This is fairly hard to debug, as there is no log output other than that message. We are unsure what is incompatible between those resources and Concourse.
Notes about our pipeline:
We originally had substituted all of our usages of {{}} to (()), but reverting that did not lead to the error going away.
We upgraded concourse from v3.0.1.
The failing step can be found here: https://github.com/cloudfoundry/capi-ci/blob/6a73764d09f544820ce39f16dca166d6d6861996/ci/pipeline.yml#L731-L739
We are using a resource called elsa-aws-storage-terraform, found here: https://github.com/cloudfoundry/capi-ci/blob/6a73764d09f544820ce39f16dca166d6d6861996/ci/pipeline.yml#L731-L739
That resource is of a custom resource-type terraform found here: https://github.com/cloudfoundry/capi-ci/blob/6a73764d09f544820ce39f16dca166d6d6861996/ci/pipeline.yml#L45-L48
A similar failing step can be found here: https://github.com/cloudfoundry/capi-ci/blob/6a73764d09f544820ce39f16dca166d6d6861996/ci/pipeline.yml#L871-L886
This is related to the issue of not being able to define nested maps in resource configuration: https://github.com/concourse/concourse/issues/1345