I have a Kubernetes cluster deployed locally to a node prepped by kubeadm.
I am experimenting with one of the pods. This pod fails to deploy, but I can't locate the cause. I have guesses as to what the problem is, but I'd like to see something related in the Kubernetes logs.
Here's what I have tried:
$ kubectl logs nmnode-0-0 -c hadoop -n test
Error from server (NotFound): pods "nmnode-0-0" not found
$ kubectl get event -n test | grep nmnode
(empty results here)
$ journalctl -m | grep nmnode
and I get a bunch of repeated entries like the one below. They talk about killing the pod but give no reason for it:
Aug 08 23:10:15 jeff-u16-3 kubelet[146562]: E0808 23:10:15.901051 146562 event.go:240] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"nmnode-0-0.15b92c3ff860aed6", GenerateName:"", Namespace:"test", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Pod", Namespace:"test", Name:"nmnode-0-0", UID:"743d2876-69cf-43bc-9227-aca603590147", APIVersion:"v1", ResourceVersion:"38152", FieldPath:"spec.containers{hadoop}"}, Reason:"Killing", Message:"Stopping container hadoop", Source:v1.EventSource{Component:"kubelet", Host:"jeff-u16-3"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbf4b616dacae12d6, ext:2812562895486, loc:(*time.Location)(0x781e740)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbf4b616dacae12d6, ext:2812562895486, loc:(*time.Location)(0x781e740)}}, Count:1, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "nmnode-0-0.15b92c3ff860aed6" is forbidden: unable to create new content in namespace test because it is being terminated' (will not retry!)
The shortened version of the above message is this:
Reason:"Killing", Message:"Stopping container hadoop",
The cluster is still running. Do you know how I can get to the bottom of this?
Try executing the command below:
$ kubectl get pods --all-namespaces
Check whether your pod was created in a different namespace.
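For example, a quick way to look for the pod and its events across every namespace (the nmnode name is taken from your output; adjust as needed):
$ kubectl get pods --all-namespaces -o wide | grep nmnode
$ kubectl get events --all-namespaces --sort-by=.metadata.creationTimestamp | grep nmnode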
The most common reasons for pod failures:
1. The container was never created because it failed to pull the image.
2. The container never existed in the runtime, and the error reason is not in the "special error list", so the containerStatus was never set and kept as "no state".
3. The container was then treated as "Unknown" and the pod was reported as Pending without any reason.
Since the containerStatus stayed at "no state" after each syncPod(), the status manager could never delete the pod even though the DeletionTimestamp was set.
Useful article: pod-failure.
Try this command to get some hints:
kubectl describe pod nmnode-0-0 -n test
Please also share the output from:
kubectl get po -n test
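Also, the event you quoted says the namespace test "is being terminated", so it is worth checking whether the namespace itself is stuck in Terminating; if it is, the pod was deleted along with it. A minimal check using only the names from your output:
$ kubectl get namespace test
$ kubectl get namespace test -o jsonpath='{.status.phase}'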
I'm trying to install Eclipse Che by following this blog: https://che.eclipseprojects.io/2022/07/25/#karatkep-installing-eclipse-che-on-aks.html, yet after following all the steps I'm not able to successfully install Eclipse Che.
1)
After running this command:
kubectl logs -l app.kubernetes.io/component=che-operator -n eclipse-che -f
these are the errors I'm facing:
logs: Waited for 1.034843163s due to client-side throttling, not priority and fairness, request: GET:https://10.1.0.1:443/apis/discovery.k8s.io/v1?timeout=32s
time="2022-09-12T14:08:29Z" level=info msg="Successfully reconciled."
2) The che-gateway pod is failing:
che-gateway-7d54ccdd59-bblw6 3/4 CrashLoopBackOff 18 (2m51s ago) 70m
Description: the oauth-proxy container keeps failing (CrashLoopBackOff).
Logs of the oauth-proxy container:
#invalid configuration:
missing setting: login-url
missing setting: redeem-url
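A diagnostic sketch that may help narrow this down: dump the arguments the oauth-proxy container is actually started with, and the auth section of the CheCluster resource. The deployment name che-gateway is inferred from your pod name, and the grep patterns are assumptions, so adjust them to what you see in your cluster:
$ kubectl get deployment che-gateway -n eclipse-che -o yaml | grep -A 20 'oauth-proxy'
$ kubectl get checluster -n eclipse-che -o yaml | grep -B 2 -A 10 -i auth
The login-url and redeem-url values are normally derived from the configured identity provider, so an empty or unreachable identity-provider URL in the CheCluster auth settings is a plausible cause.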
Below is the exception I am facing while implementing AGIC in AKS.
The readiness probe is failing for the ingress-azure pod.
Events:
Type Reason Age From Message
Normal Scheduled 5m22s default-scheduler Successfully assigned default/ingress-azure-fc5dcbcd8-bsgt8 to aks-agentpool-22890870-vmss000002
Normal Pulling 5m22s kubelet Pulling image "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.4.0"
Normal Pulled 5m22s kubelet Successfully pulled image "mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:1.4.0" in 121.018102ms
Normal Created 5m22s kubelet Created container ingress-azure
Normal Started 5m22s kubelet Started container ingress-azure
Warning Unhealthy 21s (x30 over 5m11s) kubelet Readiness probe failed: Get "http://10.240.xx.xxx:8123/health/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
kubectl logs -f mic_xxxx:
failed to update user-assigned identities on node aks-agentpool-2xxxxx-vmss (add [1], del [0], update[0]), error: failed to get identity resource, error: failed to get vmss aks-agentpool-2xxxx-vmss in resource group MC_Axx-xx_axxx-ak8_koreacentral, error: failed to get vmss aks-agentpool-2xxxxx-vmss in resource group MC_Axx-axxx_agw-ak8_koreacentral, error: compute.VirtualMachineScaleSetsClient#Get: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed" Message="The client '4xxxxxx-xxxxxxx-7xxx-xxxxxxx' with object id '4xxxxxx-xxxxxxx-7xxx-xxxxxxx' does not have authorization to perform action 'Microsoft.Compute/virtualMachineScaleSets/read' over scope '/subscriptions/{subscription_id}/resourceGroups/{MC_rg_name}/providers/Microsoft.Compute/virtualMachineScaleSets/aks-agentpool-2xxxxx-vmss' or the scope is invalid. If access was recently granted, please refresh your credentials."
Steps Implemented:
AKS cluster with RBAC enabled & Azure CNI
2 subnets in the same vnet with same resource group (Not the RG which starts with MC_)
Provided the contributor & reader access to the AGW after implementing it.
Applied
kubectl apply -f https://raw.githubusercontent.com/Azure/aad-pod-identity/v1.8.8/deploy/infra/deployment-rbac.yaml
Made changes accordingly in helm-config.yaml and authenticated using identityResourceID.
Please advise on this exception. Thanks.
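For reference, the 403 in the mic log says the identity with that object id cannot perform Microsoft.Compute/virtualMachineScaleSets/read over the MC_ (node) resource group. A hedged sketch of the role assignment that is usually granted for aad-pod-identity, with placeholders instead of real ids (the aad-pod-identity docs typically call for Virtual Machine Contributor on the node resource group and Managed Identity Operator on the resource group holding the identity):
$ az role assignment create \
    --role "Virtual Machine Contributor" \
    --assignee <object-id-from-the-error> \
    --scope "/subscriptions/<subscription_id>/resourceGroups/<MC_rg_name>"
As the error message itself notes, recently granted access may take a little while to be picked up.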
I am trying to follow this tutorial on deploying Hyperledger Fabric on Kubernetes. But instead of IBM Cloud, I'm doing it with Google Cloud. I encountered this same issue (see my logs below) and tried:
changing the Docker image to docker:18.09-dind in docker.yaml.
setting FABRIC_CFG_PATH=$PWD/configFiles instead of FABRIC_CFG_PATH=$PWD in create_channel.yaml according to another StackOverflow answer.
However, these workarounds did not work for me and I still encounter the error.
How do I fix this to be able to successfully deploy the network?
> ./setup_blockchainNetwork.sh
peersDeployment.yaml file was configured to use Docker in a container.
Creating Docker deployment
persistentvolume/docker-pv created
persistentvolumeclaim/docker-pvc created
service/docker created
deployment.apps/docker-dind created
Creating volume
The Persistant Volume does not seem to exist or is not bound
Creating Persistant Volume
Running: kubectl create -f /home/me/blockchain-network-on-kubernetes/configFiles/createVolume.yaml
persistentvolume/shared-pv created
persistentvolumeclaim/shared-pvc created
Success creating Persistant Volume
Creating Copy artifacts job.
Running: kubectl create -f /home/me/blockchain-network-on-kubernetes/configFiles/copyArtifactsJob.yaml
job.batch/copyartifacts created
Wating for container of copy artifact pod to run. Current status of copyartifacts-dcg4m is Pending
copyartifacts-dcg4m is now Running
Starting to copy artifacts in persistent volume.
Waiting for 10 more seconds for copying artifacts to avoid any network delay
Waiting for copyartifacts job to complete
Copy artifacts job completed
Generating the required artifacts for Blockchain network
Running: kubectl create -f /home/me/blockchain-network-on-kubernetes/configFiles/generateArtifactsJob.yaml
job.batch/utils created
Waiting for generateArtifacts job to complete
Waiting for generateArtifacts job to complete
Creating Services for blockchain network
Running: kubectl create -f /home/me/blockchain-network-on-kubernetes/configFiles/blockchain-services.yaml
service/blockchain-ca created
service/blockchain-orderer created
service/blockchain-org1peer1 created
service/blockchain-org2peer1 created
service/blockchain-org3peer1 created
service/blockchain-org4peer1 created
Creating new Deployment to create four peers in network
Running: kubectl create -f /home/me/blockchain-network-on-kubernetes/configFiles/peersDeployment.yaml
deployment.apps/blockchain-orderer created
deployment.apps/blockchain-ca created
deployment.apps/blockchain-org1peer1 created
deployment.apps/blockchain-org2peer1 created
deployment.apps/blockchain-org3peer1 created
deployment.apps/blockchain-org4peer1 created
Checking if all deployments are ready
Waiting for 15 seconds for peers and orderer to settle
Creating channel transaction artifact and a channel
Running: kubectl create -f /home/me/blockchain-network-on-kubernetes/configFiles/create_channel.yaml
job.batch/createchannel created
Waiting for createchannel job to be completed
Waiting for createchannel job to be completed
Create Channel Failed
> kubectl get pods
NAME READY STATUS RESTARTS AGE
blockchain-ca-58b4bbbcc7-dqmnw 1/1 Running 0 30s
blockchain-orderer-ddc9466d-2sqt8 1/1 Running 0 30s
blockchain-org1peer1-ffbf698bb-fd6nf 1/1 Running 0 29s
blockchain-org2peer1-98f7fb5f9-mb5m7 1/1 Running 0 29s
blockchain-org3peer1-75d6b8bf5c-bxd24 1/1 Running 0 29s
blockchain-org4peer1-675669ffff-b4dxj 1/1 Running 0 29s
copyartifacts-dcg4m 0/1 Completed 0 60s
createchannel-9wt54 1/2 Error 0 12s
docker-dind-54767c54c5-crk7b 0/1 CrashLoopBackOff 3 73s
utils-wbpcz 0/2 Completed 0 37s
> kubectl logs createchannel-9wt54 -c createchanneltx
/shared
systemd-private-3cbb0a492497473087eda0bb66fbd738-systemd-networkd.service-QHqKfL
systemd-private-3cbb0a492497473087eda0bb66fbd738-systemd-resolved.service-NuNfWF
systemd-private-3cbb0a492497473087eda0bb66fbd738-systemd-timesyncd.service-SzE37R
2021-02-03 08:49:16.970 UTC [common.tools.configtxgen] main -> INFO 001 Loading configuration
2021-02-03 08:49:16.970 UTC [common.tools.configtxgen.localconfig] Load -> PANI 002 Error reading configuration: Unsupported Config Type ""
2021-02-03 08:49:16.970 UTC [common.tools.configtxgen] func1 -> PANI 003 Error reading configuration: Unsupported Config Type ""
panic: Error reading configuration: Unsupported Config Type "" [recovered]
panic: Error reading configuration: Unsupported Config Type ""
...
The FABRIC_CFG_PATH setting is wrong.
This error occurs when there is a problem with the syntax in the configtx.yaml file, or when the file path is wrong and the file cannot be found.
configtxgen reads the configtx.yaml file under FABRIC_CFG_PATH.
In the tutorial you linked, configtx.yaml is not under the configFiles directory; it lives under the artifacts directory.
Here are two of the easiest solutions.
Move artifacts/configtx.yaml to configFiles/configtx.yaml:
mv ./artifacts/configtx.yaml configFiles/configtx.yaml
Or, set FABRIC_CFG_PATH to the artifacts directory:
export FABRIC_CFG_PATH=${PWD}/artifacts
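Whichever option you pick, a quick sanity check before re-running the script is to confirm configtx.yaml actually exists at the path FABRIC_CFG_PATH will point to (these just mirror the two options above):
$ ls ${PWD}/configFiles/configtx.yaml
$ ls ${PWD}/artifacts/configtx.yaml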
I am trying to mount a volume to a Pod so that one deployment can write to it, and another deployment can read from it. I am using MiniKube with Docker on Ubuntu. I am running ./mvnw clean package -Dquarkus.kubernetes.deploy=true.
From the Quarkus documentation, it seems pretty straightforward, but I'm running into trouble.
When I add this line quarkus.kubernetes.mounts.my-volume.path=/volumePath to my application.properties, I get the following error:
[ERROR] Failed to execute goal io.quarkus:quarkus-maven-plugin:1.6.0.Final:build (default) on project getting-started: Failed to build quarkus application: io.quarkus.builder.BuildException: Build failure: Build failed due to errors
[ERROR] [error]: Build step io.quarkus.kubernetes.deployment.KubernetesDeployer#deploy threw an exception: io.dekorate.deps.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://IP:8443/apis/apps/v1/namespaces/default/deployments. Message: Deployment.apps "getting-started" is invalid: spec.template.spec.containers[0].volumeMounts[0].name: Not found: "my-volume". Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[StatusCause(field=spec.template.spec.containers[0].volumeMounts[0].name, message=Not found: "my-volume", reason=FieldValueNotFound, additionalProperties={})], group=apps, kind=Deployment, name=getting-started, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Deployment.apps "getting-started" is invalid: spec.template.spec.containers[0].volumeMounts[0].name: Not found: "my-volume", metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
When I add quarkus.kubernetes.config-map-volumes.my-volume.config-map-name=my-volume (along with the previous statement), the error goes away, but the pod does not start. Running "kubectl describe pods" returns:
Normal Scheduled <unknown> default-scheduler Successfully assigned default/getting-started-859d89fc8-tbg6w to minikube
Warning FailedMount 14s (x6 over 30s) kubelet, minikube MountVolume.SetUp failed for volume "my-volume" : configmap "my-volume" not found
Does it look like the volume is not being set in the YAML file?
So my question is, how can I set the name of the volume in application.properties, so I can have a volume mounted in the Pod?
I recommend you look at your kubernetes.yml and kubernetes.json files under target/kubernetes.
For the first error: it looks like my-volume needs to exist in your cluster, e.g. as a Persistent Volume.
For the second error: quarkus.kubernetes.config-map-volumes.my-volume.config-map-name=my-volume refers to a ConfigMap, so the actual ConfigMap needs to be defined and exist in your cluster.
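Concretely, since the FailedMount event says configmap "my-volume" not found, pre-creating a ConfigMap with that name should unblock the pod. A minimal sketch; the literal key/value here is just a placeholder:
$ kubectl create configmap my-volume --from-literal=example.key=example-value
$ kubectl get configmap my-volume -o yaml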
I'm facing the following issue: when I add the logging addon to the Kubernetes cluster, Kibana doesn't work. Any clue how to troubleshoot it? Thanks in advance.
kubectl get pod/kibana-logging-v1-mertn --namespace=kube-system
NAME READY STATUS RESTARTS AGE
kibana-logging-v1-mertn 0/1 CrashLoopBackOff 8 21m
kubectl logs pod/kibana-logging-v1-mertn --namespace=kube-system
ELASTICSEARCH_URL=http://elasticsearch-logging.kube-system:9200
{"#timestamp":"2016-04-19T02:39:08.559Z","level":"error","message":"Service Unavailable","node_env":"production","error":{"message":"Service Unavailable","name":"Error","stack":"Error: Service Unavailable\n at respond (/kibana-4.0.2-linux-x64/src/node_modules/elasticsearch/src/lib/transport.js:235:15)\n at checkRespForFailure (/kibana-4.0.2-linux-x64/src/node_modules/elasticsearch/src/lib/transport.js:203:7)\n at HttpConnector.<anonymous> (/kibana-4.0.2-linux-x64/src/node_modules/elasticsearch/src/lib/connectors/http.js:156:7)\n at IncomingMessage.bound (/kibana-4.0.2-linux-x64/src/node_modules/elasticsearch/node_modules/lodash-node/modern/internals/baseBind.js:56:17)\n at IncomingMessage.emit (events.js:117:20)\n at _stream_readable.js:944:16\n at process._tickCallback (node.js:442:13)\n"}}
{"#timestamp":"2016-04-19T02:39:08.652Z","level":"fatal","message":"Service Unavailable","node_env":"production","error":{"message":"Service Unavailable","name":"Error","stack":"Error: Service Unavailable\n at respond (/kibana-4.0.2-linux-x64/src/node_modules/elasticsearch/src/lib/transport.js:235:15)\n at checkRespForFailure (/kibana-4.0.2-linux-x64/src/node_modules/elasticsearch/src/lib/transport.js:203:7)\n at HttpConnector.<anonymous> (/kibana-4.0.2-linux-x64/src/node_modules/elasticsearch/src/lib/connectors/http.js:156:7)\n at IncomingMessage.bound (/kibana-4.0.2-linux-x64/src/node_modules/elasticsearch/node_modules/lodash-node/modern/internals/baseBind.js:56:17)\n at IncomingMessage.emit (events.js:117:20)\n at _stream_readable.js:944:16\n at process._tickCallback (node.js:442:13)\n"}}
How did you create kibana? Did you create an elasticsearch service for it to talk to?
Did you follow the guide for setting it up during cluster creation?
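A small diagnostic sketch: the Kibana log shows it is calling http://elasticsearch-logging.kube-system:9200 and getting Service Unavailable, so check whether that Service exists, has endpoints, and has healthy backing pods. The k8s-app=elasticsearch-logging label is the one used by the standard logging addon manifests; adjust it if yours differs:
$ kubectl get svc elasticsearch-logging --namespace=kube-system
$ kubectl get endpoints elasticsearch-logging --namespace=kube-system
$ kubectl get pods --namespace=kube-system -l k8s-app=elasticsearch-logging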