AKS Application Gateway Ingress Controller Issue - azure-devops

Below is the exception I am facing while implementing AGIC in AKS.
The readiness probe is failing for the ingress-azure pod:
Events:
Type     Reason     Age                   From               Message
Normal   Scheduled  5m22s                 default-scheduler  Successfully assigned default/ingress-azure-fc5dcbcd8-bsgt8 to aks-agentpool-22890870-vmss000002
Normal   Pulling    5m22s                 kubelet            Pulling image "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.4.0"
Normal   Pulled     5m22s                 kubelet            Successfully pulled image "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.4.0" in 121.018102ms
Normal   Created    5m22s                 kubelet            Created container ingress-azure
Normal   Started    5m22s                 kubelet            Started container ingress-azure
Warning  Unhealthy  21s (x30 over 5m11s)  kubelet            Readiness probe failed: Get "http://10.240.xx.xxx:8123/health/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
kubectl logs -f mic_xxxx:
failed to update user-assigned identities on node aks-agentpool-2xxxxx-vmss (add [1], del [0], update[0]), error: failed to get identity resource, error: failed to get vmss aks-agentpool-2xxxx-vmss in resource group MC_Axx-xx_axxx-ak8_koreacentral, error: failed to get vmss aks-agentpool-2xxxxx-vmss in resource group MC_Axx-axxx_agw-ak8_koreacentral, error: compute.VirtualMachineScaleSetsClient#Get: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed" Message="The client '4xxxxxx-xxxxxxx-7xxx-xxxxxxx' with object id '4xxxxxx-xxxxxxx-7xxx-xxxxxxx' does not have authorization to perform action 'Microsoft.Compute/virtualMachineScaleSets/read' over scope '/subscriptions/{subscription_id}/resourceGroups/{MC_rg_name}/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool-2xxxxx-vmss' or the scope is invalid. If access was recently granted, please refresh your credentials."
Steps implemented:
AKS cluster with RBAC enabled and Azure CNI.
2 subnets in the same VNet, in the same resource group (not the RG whose name starts with MC_).
Granted Contributor and Reader access on the AGW after creating it.
Applied:
kubectl apply -f https://raw.githubusercontent.com/Azure/aad-pod-identity/v1.8.8/deploy/infra/deployment-rbac.yaml
Made the corresponding changes in helm-config.yaml and authenticated using identityResourceID.
Please advise on this exception. Thanks.
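
The MIC log above names the identity, the denied action, and the scope, so it can be turned into a candidate fix mechanically. The sketch below is only an illustration: the helper name suggest_role_assignment is hypothetical, and the default role ("Virtual Machine Contributor" on the node resource group) is an assumption based on what aad-pod-identity generally asks for, not something confirmed in this thread. It only builds an az command string for review and does not run anything.

```python
import re

# Matches the Azure "AuthorizationFailed" message format seen in the MIC log.
AUTH_FAILED = re.compile(
    r"The client '(?P<client>[^']+)' with object id '(?P<object_id>[^']+)' "
    r"does not have authorization to perform action '(?P<action>[^']+)' "
    r"over scope '(?P<scope>[^']+)'"
)


def suggest_role_assignment(log_line, role="Virtual Machine Contributor"):
    """Return an az CLI command string granting `role`, or None if no match.

    The role is an assumption; pick whichever role covers the denied action.
    """
    match = AUTH_FAILED.search(log_line)
    if match is None:
        return None
    # Widen the scope from the single VMSS to its resource group (the MC_ group).
    rg_scope = match.group("scope").split("/providers/")[0]
    return (
        f"az role assignment create --assignee {match.group('object_id')} "
        f'--role "{role}" --scope {rg_scope}'
    )
```

Feeding it the error line from kubectl logs gives a command to inspect before running it by hand. Role assignments can take a few minutes to propagate, which fits the "refresh your credentials" hint at the end of the error.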

GitLab Kubernetes runner failing to authenticate to private Docker registry during prepare stage

I'm setting up a GitLab runner in my Kubernetes cluster.
The runner is properly deployed and running. When I trigger any pipeline, during the prepare stage it fails with an authentication error to pull from my private docker registry:
Preparing the "kubernetes" executor 00:00
Using Kubernetes namespace: gitlab-runner
Using Kubernetes executor with image myprivaterepo.com/terraform:light ...
Using attach strategy to execute scripts...
Preparing environment 00:04
Waiting for pod gitlab-runner/runner-d8cjrcgf-project-2156-concurrent-0nhsjb to be running, status is Pending
ContainersNotInitialized: "containers with incomplete status: [init-permissions]"
ContainersNotReady: "containers with unready status: [build helper]"
ContainersNotReady: "containers with unready status: [build helper]"
WARNING: Failed to pull image with policy "": image pull failed: rpc error: code = Unknown desc = failed to pull and unpack image "myprivaterepo.com/terraform:light": failed to resolve reference "myprivaterepo.com/terraform:light": pulling from host myprivaterepo.com failed with status code [manifests light]: 401 Unauthorized
ERROR: Job failed: prepare environment: waiting for pod running: pulling image "myprivaterepo.com/terraform:light": image pull failed: rpc error: code = Unknown desc = failed to pull and unpack image "myprivaterepo.com/terraform:light": failed to resolve reference "myprivaterepo.com/terraform:light": pulling from host myprivaterepo.com failed with status code [manifests light]: 401 Unauthorized. Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information
I already tried adding imagePullSecrets (of type kubernetes.io/dockerconfigjson) to the runner deployment, and also setting DOCKER_AUTH_CONFIG under GitLab -> Settings -> CI/CD -> environment variable, but neither worked.
Where is the correct place to add it? I'm using the Helm chart.
My .gitlab-ci.yaml:
.base-terraform:
  image:
    name: myprivaterepo.com/terraform:light
The problem was in my DOCKER_AUTH_CONFIG: it listed the registry domain with a different port than the actual one. With that corrected, GitLab CI picked up the project environment variable automatically, as it should.
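
A host or port mismatch like this can be caught before the pipeline runs. The sketch below builds a DOCKER_AUTH_CONFIG value and checks whether its auths keys cover the registry of a given image. The helper names are hypothetical, and the exact string comparison is a simplification of whatever matching GitLab Runner actually applies.

```python
import base64
import json


def docker_auth_config(registry, username, password):
    """Build the JSON value for DOCKER_AUTH_CONFIG for a single registry."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return json.dumps({"auths": {registry: {"auth": token}}})


def registry_of(image):
    """Return the registry host[:port] part of an image reference."""
    first = image.split("/", 1)[0]
    # A first path component containing a dot or colon (or "localhost")
    # is treated as a registry host; otherwise Docker Hub is implied.
    if "/" in image and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


def config_covers_image(config_json, image):
    """True if the config has credentials keyed exactly by the image's registry."""
    return registry_of(image) in json.loads(config_json)["auths"]
```

With the image from the question, a config keyed by myprivaterepo.com:5000 (a made-up port for illustration) does not cover myprivaterepo.com/terraform:light, while one keyed by myprivaterepo.com does.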

InvalidIdentityToken: Couldn't retrieve verification key from your identity provider

I am new to AWS and kubectl, and I need to deploy an app to AWS. After deploying to the EKS cluster, I edited the ingress with kubectl, but it returned 404 Not Found. (I am pretty sure the new service container works fine.)
After checking with kubectl describe ingress, here are some of the reported events:
Warning FailedBuildModel 40m ingress Failed build model due to WebIdentityErr: failed to retrieve credentials
caused by: InvalidIdentityToken: Couldn't retrieve verification key from your identity provider, please reference AssumeRoleWithWebIdentity documentation for requirements
status code: 400, request id: xxxxxxxx-4a93-4e27-9d6b-xxxxxxxx
Warning FailedBuildModel 22m ingress Failed build model due to WebIdentityErr: failed to retrieve credentials
caused by: InvalidIdentityToken: Couldn't retrieve verification key from your identity provider, please reference AssumeRoleWithWebIdentity documentation for requirements
status code: 400, request id: xxxxxxxx-5368-41e1-8a4d-xxxxxxxx
Warning FailedBuildModel 5m8s ingress Failed build model due to WebIdentityErr: failed to retrieve credentials
caused by: InvalidIdentityToken: Couldn't retrieve verification key from your identity provider, please reference AssumeRoleWithWebIdentity documentation for requirements
status code: 400, request id: xxxxxxxx-20ea-4bd0-b1cb-xxxxxxxx
Does anyone have ideas about this issue?

How to fix Unsupported Config Type "" error in Hyperledger Fabric on Kubernetes?

I am trying to follow this tutorial on deploying Hyperledger Fabric on Kubernetes. But instead of IBM Cloud, I'm doing it with Google Cloud. I encountered this same issue (see my logs below) and tried:
changing docker image to docker:18.09-dind in docker.yaml.
setting FABRIC_CFG_PATH=$PWD/configFiles instead of FABRIC_CFG_PATH=$PWD in create_channel.yaml according to another StackOverflow answer.
However, these workarounds did not work for me and I still encounter the error.
How do I fix this to be able to successfully deploy the network?
> ./setup_blockchainNetwork.sh
peersDeployment.yaml file was configured to use Docker in a container.
Creating Docker deployment
persistentvolume/docker-pv created
persistentvolumeclaim/docker-pvc created
service/docker created
deployment.apps/docker-dind created
Creating volume
The Persistant Volume does not seem to exist or is not bound
Creating Persistant Volume
Running: kubectl create -f /home/me/blockchain-network-on-kubernetes/configFiles/createVolume.yaml
persistentvolume/shared-pv created
persistentvolumeclaim/shared-pvc created
Success creating Persistant Volume
Creating Copy artifacts job.
Running: kubectl create -f /home/me/blockchain-network-on-kubernetes/configFiles/copyArtifactsJob.yaml
job.batch/copyartifacts created
Wating for container of copy artifact pod to run. Current status of copyartifacts-dcg4m is Pending
copyartifacts-dcg4m is now Running
Starting to copy artifacts in persistent volume.
Waiting for 10 more seconds for copying artifacts to avoid any network delay
Waiting for copyartifacts job to complete
Copy artifacts job completed
Generating the required artifacts for Blockchain network
Running: kubectl create -f /home/me/blockchain-network-on-kubernetes/configFiles/generateArtifactsJob.yaml
job.batch/utils created
Waiting for generateArtifacts job to complete
Waiting for generateArtifacts job to complete
Creating Services for blockchain network
Running: kubectl create -f /home/me/blockchain-network-on-kubernetes/configFiles/blockchain-services.yaml
service/blockchain-ca created
service/blockchain-orderer created
service/blockchain-org1peer1 created
service/blockchain-org2peer1 created
service/blockchain-org3peer1 created
service/blockchain-org4peer1 created
Creating new Deployment to create four peers in network
Running: kubectl create -f /home/me/blockchain-network-on-kubernetes/configFiles/peersDeployment.yaml
deployment.apps/blockchain-orderer created
deployment.apps/blockchain-ca created
deployment.apps/blockchain-org1peer1 created
deployment.apps/blockchain-org2peer1 created
deployment.apps/blockchain-org3peer1 created
deployment.apps/blockchain-org4peer1 created
Checking if all deployments are ready
Waiting for 15 seconds for peers and orderer to settle
Creating channel transaction artifact and a channel
Running: kubectl create -f /home/me/blockchain-network-on-kubernetes/configFiles/create_channel.yaml
job.batch/createchannel created
Waiting for createchannel job to be completed
Waiting for createchannel job to be completed
Create Channel Failed
> kubectl get pods
NAME                                   READY  STATUS            RESTARTS  AGE
blockchain-ca-58b4bbbcc7-dqmnw         1/1    Running           0         30s
blockchain-orderer-ddc9466d-2sqt8      1/1    Running           0         30s
blockchain-org1peer1-ffbf698bb-fd6nf   1/1    Running           0         29s
blockchain-org2peer1-98f7fb5f9-mb5m7   1/1    Running           0         29s
blockchain-org3peer1-75d6b8bf5c-bxd24  1/1    Running           0         29s
blockchain-org4peer1-675669ffff-b4dxj  1/1    Running           0         29s
copyartifacts-dcg4m                    0/1    Completed         0         60s
createchannel-9wt54                    1/2    Error             0         12s
docker-dind-54767c54c5-crk7b           0/1    CrashLoopBackOff  3         73s
utils-wbpcz                            0/2    Completed         0         37s
> kubectl logs createchannel-9wt54 -c createchanneltx
/shared
systemd-private-3cbb0a492497473087eda0bb66fbd738-systemd-networkd.service-QHqKfL
systemd-private-3cbb0a492497473087eda0bb66fbd738-systemd-resolved.service-NuNfWF
systemd-private-3cbb0a492497473087eda0bb66fbd738-systemd-timesyncd.service-SzE37R
2021-02-03 08:49:16.970 UTC [common.tools.configtxgen] main -> INFO 001 Loading configuration
2021-02-03 08:49:16.970 UTC [common.tools.configtxgen.localconfig] Load -> PANI 002 Error reading configuration: Unsupported Config Type ""
2021-02-03 08:49:16.970 UTC [common.tools.configtxgen] func1 -> PANI 003 Error reading configuration: Unsupported Config Type ""
panic: Error reading configuration: Unsupported Config Type "" [recovered]
panic: Error reading configuration: Unsupported Config Type ""
...
Your FABRIC_CFG_PATH setting is wrong.
This error occurs when there is a syntax problem in the configtx.yaml file, or when the file path is wrong and the file cannot be found.
configtxgen reads the configtx.yaml file under FABRIC_CFG_PATH.
In the tutorial you linked, configtx.yaml is not in the configFiles directory; it is in the artifacts directory.
I'll suggest two of the easiest solutions out of many.
Move artifacts/configtx.yaml to configFiles/configtx.yaml:
mv ./artifacts/configtx.yaml configFiles/configtx.yaml
Or, set FABRIC_CFG_PATH to the artifacts directory:
export FABRIC_CFG_PATH=${PWD}/artifacts
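
Before re-running the setup script, it can help to confirm which directory actually holds configtx.yaml. This is a minimal sketch; find_configtx is a hypothetical helper, and the candidate directory names come from the tutorial layout described above.

```python
from pathlib import Path


def find_configtx(candidates):
    """Return the first directory in `candidates` containing configtx.yaml, else None."""
    for directory in candidates:
        directory = Path(directory)
        if (directory / "configtx.yaml").is_file():
            return directory
    return None
```

Called as find_configtx([Path.cwd() / "configFiles", Path.cwd() / "artifacts"]), it returns the directory FABRIC_CFG_PATH should point to, or None if the file is in neither place.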

Mounting a Kubernetes Volume with Quarkus

I am trying to mount a volume to a Pod so that one deployment can write to it, and another deployment can read from it. I am using MiniKube with Docker on Ubuntu. I am running ./mvnw clean package -Dquarkus.kubernetes.deploy=true.
From the Quarkus documentation, it seems pretty straightforward, but I'm running into trouble.
When I add this line quarkus.kubernetes.mounts.my-volume.path=/volumePath to my application.properties, I get the following error:
[ERROR] Failed to execute goal io.quarkus:quarkus-maven-plugin:1.6.0.Final:build (default) on project getting-started: Failed to build quarkus application: io.quarkus.builder.BuildException: Build failure: Build failed due to errors
[ERROR] [error]: Build step io.quarkus.kubernetes.deployment.KubernetesDeployer#deploy threw an exception: io.dekorate.deps.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://IP:8443/apis/apps/v1/namespaces/default/deployments. Message: Deployment.apps "getting-started" is invalid: spec.template.spec.containers[0].volumeMounts[0].name: Not found: "my-volume". Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[StatusCause(field=spec.template.spec.containers[0].volumeMounts[0].name, message=Not found: "my-volume", reason=FieldValueNotFound, additionalProperties={})], group=apps, kind=Deployment, name=getting-started, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Deployment.apps "getting-started" is invalid: spec.template.spec.containers[0].volumeMounts[0].name: Not found: "my-volume", metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
When I add quarkus.kubernetes.config-map-volumes.my-volume.config-map-name=my-volume (along with the previous statement), the error goes away, but the pod does not start. Running "kubectl describe pods" returns:
Normal Scheduled <unknown> default-scheduler Successfully assigned default/getting-started-859d89fc8-tbg6w to minikube
Warning FailedMount 14s (x6 over 30s) kubelet, minikube MountVolume.SetUp failed for volume "my-volume" : configmap "my-volume" not found
Does it look like the volume is not being set in the YAML file?
So my question is, how can I set the name of the volume in application.properties, so I can have a volume mounted in the Pod?
I recommend you look at the generated kubernetes.yml and kubernetes.json files under target/kubernetes.
For the first error, it looks like my-volume needs to exist in your cluster, for example as a Persistent Volume.
For the second error, quarkus.kubernetes.config-map-volumes.my-volume.config-map-name=my-volume refers to a ConfigMap, so the actual ConfigMap needs to be defined and exist in your cluster.
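
The first error (volumeMounts name not found) can be spotted before deploying by checking that every mount declared in application.properties has a volume definition with the same name. The sketch below is a rough check under stated assumptions: it treats any quarkus.kubernetes.*-volumes.<name>.* key as a volume definition, ignores quoted names, and uses a hypothetical helper name.

```python
import re

# quarkus.kubernetes.mounts.<name>.path=...
MOUNT = re.compile(r"^quarkus\.kubernetes\.mounts\.([^.]+)\.path=")
# quarkus.kubernetes.<kind>-volumes.<name>.<property>=...
VOLUME = re.compile(r"^quarkus\.kubernetes\.[a-z-]+-volumes\.([^.]+)\.")


def undefined_mounts(properties_text):
    """Return the sorted mount names that have no matching *-volumes definition."""
    mounts, volumes = set(), set()
    for line in properties_text.splitlines():
        line = line.strip()
        if m := MOUNT.match(line):
            mounts.add(m.group(1))
        elif v := VOLUME.match(line):
            volumes.add(v.group(1))
    return sorted(mounts - volumes)
```

With only the mounts line from the question it reports my-volume; once the config-map-volumes line is added the list is empty. It cannot tell whether the referenced ConfigMap exists in the cluster, which is the second error.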

Kubernetes kubelet error updating node status

Running a kubernetes cluster in AWS via EKS. Everything appears to be working as expected, but just checking through all logs to verify. I hopped on to one of the worker nodes and I noticed a bunch of errors when looking at the kubelet service
Oct 09 09:42:52 ip-172-26-0-213.ec2.internal kubelet[4226]: E1009 09:42:52.335445 4226 kubelet_node_status.go:377] Error updating node status, will retry: error getting node "ip-172-26-0-213.ec2.internal": Unauthorized
Oct 09 10:03:54 ip-172-26-0-213.ec2.internal kubelet[4226]: E1009 10:03:54.831820 4226 kubelet_node_status.go:377] Error updating node status, will retry: error getting node "ip-172-26-0-213.ec2.internal": Unauthorized
Nodes are all showing as ready, but I'm not sure why those errors are appearing. Have 3 worker nodes and all 3 have the same kubelet errors (hostnames are different obviously)
Additional information. It would appear that the error is coming from this line in kubelet_node_status.go
node, err := kl.heartbeatClient.CoreV1().Nodes().Get(string(kl.nodeName), opts)
if err != nil {
	return fmt.Errorf("error getting node %q: %v", kl.nodeName, err)
}
From the workers I can execute get nodes using kubectl just fine:
kubectl get --kubeconfig=/var/lib/kubelet/kubeconfig nodes
NAME                          STATUS  ROLES   AGE  VERSION
ip-172-26-0-58.ec2.internal   Ready   <none>  1h   v1.10.3
ip-172-26-1-193.ec2.internal  Ready   <none>  1h   v1.10.3
Turns out this is not an issue. Official reply from AWS regarding these errors:
The kubelet will regularly report node status to the Kubernetes API. When it does so it needs an authentication token generated by the aws-iam-authenticator. The kubelet will invoke the aws-iam-authenticator and store the token in its global cache. In EKS this authentication token expires after 21 minutes.
The kubelet doesn't understand token expiry times so it will attempt to reach the API using the token in its cache. When the API returns the Unauthorized response, there is a retry mechanism to fetch a new token from aws-iam-authenticator and retry the request.
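
The behaviour described in that reply is a cache-then-retry pattern. The sketch below illustrates the idea only; it is not the kubelet's code. The class, the Unauthorized exception, and the two callables are all hypothetical stand-ins for invoking aws-iam-authenticator and calling the API server.

```python
class Unauthorized(Exception):
    """Raised by the (hypothetical) API call when the token is rejected."""


class CachedTokenClient:
    def __init__(self, fetch_token, call_api):
        self._fetch_token = fetch_token  # stand-in for invoking aws-iam-authenticator
        self._call_api = call_api        # performs one request with the given token
        self._token = None               # cached token, with no expiry tracking

    def request(self):
        if self._token is None:
            self._token = self._fetch_token()
        try:
            # The client does not know when the token expires, so it simply tries it.
            return self._call_api(self._token)
        except Unauthorized:
            # The cached token was rejected: fetch a new one and retry once.
            self._token = self._fetch_token()
            return self._call_api(self._token)
```

Each rejected attempt corresponds to one "Unauthorized" line in the kubelet log, and the retry succeeds, which is why the nodes still show as Ready.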