I am receiving an error message when trying to access details from VS Code of my Azure Kubernetes Cluster. This problem prevents me from attaching a debugger to the pod.
I receive the following error message:
Error loading document: Error: cannot open k8smsx://loadkubernetescore/pod-kube-system%20%20%20coredns-748cdb7bf4-q9f9x.yaml?ns%3Dall%26value%3Dpod%2Fkube-system%20%20%20coredns-748cdb7bf4-q9f9x%26_%3D1611398456559. Detail: Unable to read file 'k8smsx://loadkubernetescore/pod-kube-system coredns-748cdb7bf4-q9f9x.yaml?ns=all&value=pod/kube-system coredns-748cdb7bf4-q9f9x&_=1611398456559' (Get command failed: error: there is no need to specify a resource type as a separate argument when passing arguments in resource/name form (e.g. 'kubectl get resource/<resource_name>' instead of 'kubectl get resource resource/<resource_name>'
)
My Setup
I have VS Code installed, with "Kubernetes", "Bridge to Kubernetes" and "Azure Kubernetes Service" installed
I have connected my Cluster through az login and can already access different information (e.g. my nodes, etc.)
When trying to access the workloads / pods on my cluster, I receive the above error message - and in the Kubernetes View in VS Code I get an error for the details of the pod.
Error in Kubernetes-View in VS Code
What I tried
I tried to reinstall the AKS Cluster and completely logging in freshly to it
I tried to reinstall all extensions mentioned above in VS Code
Browsing the internet, I do not find any comparable error message
The strange thing is that it used to work two weeks ago - and I did not change or update anything (as far as I remember)
Any ideas / hints that I can try further?
Thank you
As #mdaniel wrote: the Node view is just for human consumption, and that the tree item you actually want to click on is underneath Namespaces / kube-system / coredns-748cdb7bf4-q9f9x. Give that a try, and consider reporting your bad experience to their issue tracker since it looks like release 1.2.2 just came out 2 days ago and might not have been tested well.
final solution is to attach debugger in the other way - through Workloads / Deployments.
Related
When trying to sync the user directories of Jira to other atlassian products (confluence and bitbucket server running on aks) a 403 error is returned.
Upon looking into this error the following steps have been attempted:
https://confluence.atlassian.com/stashkb/unable-to-connect-to-jira-for-authentication-forbidden-403-323391874.html
The IP adresses have been added to the whitelist of Jira. The next step in solutions online is to restart the Jira
service.
This however causes issues as upon running the stop/start-jira.sh files inside the pod the service returns
with none of the previous settings and all configurations including backups are gone. Taking us back to square one.
cluster size:
current set-up
3 x Standard D8 v3 (8 vcpus, 32 GiB memory) cluster on aks
Used the following images installed through UI:
atlassian/jira-software
cptactionhank/docker-atlassian-jira
Exec into pod and go to /opt/atlassian/jira/bin
run ./(start/stop)-jira.sh
What should happen is that when going back to the url the Jira instance is reset and all configuration files in the pod for the service are lost.
The logs of the pod give error no 137 as a common error when restarting.
update:
https://github.com/int128/devops-kompose/tree/master/atlassian-jira-software
The following helm chart has also been used and achieved the same result.
The cluster was running fine for 255 days. I brought down the cluster and after that, I was unable to run the cluster up. It gives the following error while running the cluster up.
Creating minions.
Attempt 1 to create kubernetes-minion-template
ERROR: (gcloud.compute.instance-templates.create) Could not fetch image resource:
- The resource 'projects/google-containers/global/images/container-vm-v20170627' was not found
Attempt 1 failed to create instance template kubernetes-minion-template. Retrying.
This Attempt goes on and it always fails. Am I missing something?
The kubernetes version is v1.7.2.
It looks like the image you are trying to use to create the machines has been deprecated and/or is no longer available.
You should try specifying an alternative image to create these machines from Google's current public images.
I'm trying to run a container in a custom VM on Google Compute Engine. This is to perform a heavy ETL process so I need a large machine but only for a couple of hours a month. I have two versions of my container with small startup changes. Both versions were built and pushed to the same google container registry by the same computer using the same Google login. The older one works fine but the newer one fails by getting stuck in an endless list of the following error:
E0927 09:10:13 7f5be3fff700 api_server.cc:184 Metadata request unsuccessful: Server responded with 'Forbidden' (403): Transport endpoint is not connected
Can anyone tell me exactly what's going on here? Can anyone please explain why one of my images doesn't have this problem (well it gives a few of these messages but gets past them) and the other does have this problem (thousands of this message and taking over 24 hours before I killed it).
If I ssh in to a GCE instance then both versions of the container pull and run just fine. I'm suspecting the INTEGRITY_RULE checking from the logs but I know nothing about how that works.
MORE INFO: this is down to "restart policy: never". Even a simple Centos:7 container that says "hello world" deployed from the console triggers this if the restart policy is never. At least in the short term I can fix this in the entrypoint script as the instance will be destroyed when the monitor realises that the process has finished
I suggest you try creating a 3rd container that's focused on the metadata service functionality to isolate the issue. It may be that there's a timing difference between the 2 containers that's not being overcome.
Make sure you can ‘curl’ the metadata service from the VM and that the request to the metadata service is using the VM's service account.
I tried to set up a new Kubernetes cluster on IBM Bluemix and after a while I received the message that the deploy failed. To start over again I have tried to delete the cluster from the Bluemix interface, with no success. The error messages are not consistent, and go from elaborate messages error messages to the most common: 500: internal server error.
The command line does not help either. I expected this to work
bx cs cluster-rm k8s_demo
But the most of the time it leads to and EOF error. Somehow internal connections are an issue because
bx cs clusters
leads to the error
FAILED
unable to connect to https://us-south.containers.bluemix.net/v1/clusters, please check your Internet Connection
most of the time. Every so often a list including the k8s_demo cluster appears, but being as persistent with the cluster-rm command has not brought such luck that the cluster is deleted.
Is there any other way I can try to start over again? Apart from setting up another Bluemix account of course, something I would prefer to avoid.
If this problem is continuing, I would suggest contacting IBM Support. They'll have the tools to troubleshoot what has happened with the cluster provisioning and/or deleting.
Expert
I have a question related monitoring system for container.
Below picture has my thinking for monitoring.
I'd like to run monitoring combination(Grafana,Heapster,InfluxDB,cAdvisor) on baremetal as a daemon process instead of running in containers.
When i configure those 4 units...i got the error below.
It might be comes out from linking influxdb to Heapster (ex:--sink)
Below is my commands to run heapster
"./heapster --source=cadvisor:external?cadvisorPort=8081 --sink=influxdb:http://192.168.56.20:8086/db=k8s&user=root&pw=root"
Then I get following message
driver.go:326 Database creation failed : Server returned (404): 404
page not found
What symptom it is?
Someone if have a solution? let me get an advice
Thanks in advance