I'm running a simple example with Helm. Take a look below at values.yaml file:
cat << EOF | helm install helm/vitess -n vitess -f -
topology:
cells:
- name: 'zone1'
keyspaces:
- name: 'vitess'
shards:
- name: '0'
tablets:
- type: 'replica'
vttablet:
replicas: 1
mysqlProtocol:
enabled: true
authType: secret
username: vitess
passwordSecret: vitess-db-password
etcd:
replicas: 3
vtctld:
replicas: 1
vtgate:
replicas: 3
vttablet:
dataVolumeClaimSpec:
storageClassName: nfs-slow
EOF
Take a look at the output of current pods running below:
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-fb8b8dccf-8f5kt 1/1 Running 0 32m
kube-system coredns-fb8b8dccf-qbd6c 1/1 Running 0 32m
kube-system etcd-master1 1/1 Running 0 32m
kube-system kube-apiserver-master1 1/1 Running 0 31m
kube-system kube-controller-manager-master1 1/1 Running 0 32m
kube-system kube-flannel-ds-amd64-bkg9z 1/1 Running 0 32m
kube-system kube-flannel-ds-amd64-q8vh4 1/1 Running 0 32m
kube-system kube-flannel-ds-amd64-vqmnz 1/1 Running 0 32m
kube-system kube-proxy-bd8mf 1/1 Running 0 32m
kube-system kube-proxy-nlc2b 1/1 Running 0 32m
kube-system kube-proxy-x7cd5 1/1 Running 0 32m
kube-system kube-scheduler-master1 1/1 Running 0 32m
kube-system tiller-deploy-8458f6c667-cx2mv 1/1 Running 0 27m
vitess etcd-global-6pwvnv29th 0/1 Init:0/1 0 16m
vitess etcd-operator-84db9bc774-j4wml 1/1 Running 0 30m
vitess etcd-zone1-zwgvd7spzc 0/1 Init:0/1 0 16m
vitess vtctld-86cd78b6f5-zgfqg 0/1 CrashLoopBackOff 7 16m
vitess vtgate-zone1-58744956c4-x8ms2 0/1 CrashLoopBackOff 7 16m
vitess zone1-vitess-0-init-shard-master-mbbph 1/1 Running 0 16m
vitess zone1-vitess-0-replica-0 0/6 Init:CrashLoopBackOff 7 16m
Running logs I see this error:
$ kubectl logs -n vitess vtctld-86cd78b6f5-zgfqg
++ cat
+ eval exec /vt/bin/vtctld '-cell="zone1"' '-web_dir="/vt/web/vtctld"' '-web_dir2="/vt/web/vtctld2/app"' -workflow_manager_init -workflow_manager_use_election -logtostderr=true -stderrthreshold=0 -port=15000 -grpc_port=15999 '-service_map="grpc-vtctl"' '-topo_implementation="etcd2"' '-topo_global_server_address="etcd-global-client.vitess:2379"' -topo_global_root=/vitess/global
++ exec /vt/bin/vtctld -cell=zone1 -web_dir=/vt/web/vtctld -web_dir2=/vt/web/vtctld2/app -workflow_manager_init -workflow_manager_use_election -logtostderr=true -stderrthreshold=0 -port=15000 -grpc_port=15999 -service_map=grpc-vtctl -topo_implementation=etcd2 -topo_global_server_address=etcd-global-client.vitess:2379 -topo_global_root=/vitess/global
ERROR: logging before flag.Parse: E0422 02:35:34.020928 1 syslogger.go:122] can't connect to syslog
F0422 02:35:39.025400 1 server.go:221] Failed to open topo server (etcd2,etcd-global-client.vitess:2379,/vitess/global): grpc: timed out when dialing
I'm running behind vagrant with 1 master and 2 nodes. I suspect that is a issue with eth1.
The storage are configured to use NFS.
$ kubectl logs etcd-operator-84db9bc774-j4wml
time="2019-04-22T17:26:51Z" level=info msg="skip reconciliation: running ([]), pending ([etcd-zone1-zwgvd7spzc])" cluster-name=etcd-zone1 cluster-namespace=vitess pkg=cluster
time="2019-04-22T17:26:51Z" level=info msg="skip reconciliation: running ([]), pending ([etcd-zone1-zwgvd7spzc])" cluster-name=etcd-global cluster-namespace=vitess pkg=cluster
It appears that etcd is not fully initializing. Note that neither the pod for the global lockserver (etcd-global-6pwvnv29th) nor the local one for cell zone1 (pod etcd-zone1-zwgvd7spzc) are ready.
Related
I facing this error
status:
conditions:
- lastTransitionTime: "2022-07-04T14:09:43.687887Z"
message: 'GET /connectors/debezium-connector-mssql/topics returned 404 (Not Found):
Unexpected status code'
reason: ConnectRestException
status: "True"
type: NotReady
observedGeneration: 3
tasksMax: 1
topics: []
this is my pods status
:~/.ssh$ kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kafka my-cluster-entity-operator-77bcc4b67f-8qwbp 3/3 Running 0 3d19h
kafka my-cluster-kafka-0 1/1 Running 0 3d19h
kafka my-cluster-zookeeper-0 1/1 Running 0 3d19h
kafka my-connect-cluster-connect-f4b6ccc55-s5cxz 1/1 Running 0 3d19h
kafka strimzi-cluster-operator-86864b86d5-rq9pp 1/1 Running 0 3d19h
kube-system coredns-6d4b75cb6d-9gjxx 1/1 Running 0 3d19h
kube-system etcd-minikube 1/1 Running 0 3d19h
kube-system kube-apiserver-minikube 1/1 Running 0 3d19h
kube-system kube-controller-manager-minikube 1/1 Running 0 3d19h
kube-system kube-proxy-d6gl6 1/1 Running 0 3d19h
kube-system kube-scheduler-minikube 1/1 Running 0 3d19h
kube-system storage-provisioner 1/1 Running 1 (3d19h ago) 3d19h
And I logged in as a interactive terminal from the pods
$ kubectl exec -it my-cluster-kafka-0 bash -n kafka
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
[kafka#my-cluster-kafka-0 kafka]$ curl my-connect-cluster-connect-api:8083/connectors/debezium-connector-mssql/status
{"name":"debezium-connector-mssql","connector":{"state":"RUNNING","worker_id":"172.17.0.7:8083"},
"tasks":[{"id":0,"state":"FAILED","worker_id":"172.17.0.7:8083","trace":"org.apache.kafka.connect.errors.ConnectException: Error configuring an instance of SqlServerConnectorTask; check the logs for details\n\tat io.debezium.connector.common.BaseSourceTask.start(BaseSourceTask.java:91)\n\tat org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:208)\n\tat org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)\n\tat org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\n"}],"type":"source"}
Facing error with configuring an instance of SqlServerConnectorTask
I print all of my Pods with:
$ kubectl get pods --all-namespaces
and the output is:
NAMESPACE NAME READY STATUS RESTARTS AGE
calico-system calico-kube-controllers-7487d7f956-hx4fp 1/1 Running 0 88m
calico-system calico-node-vn52p 1/1 Running 0 88m
calico-system calico-typha-7588984c44-m6tsz 1/1 Running 0 88m
gitlab-managed-apps install-ingress 0/1 Error 0 14m********
gitlab-managed-apps install-prometheus 0/1 Error 0 12m
kube-system coredns-f9fd979d6-2n2pg 1/1 Running 0 91m
kube-system coredns-f9fd979d6-sq9bl 1/1 Running 0 91m
kube-system etcd-tuoputuo-iamnotstone-server 1/1 Running 0 91m
kube-system kube-apiserver-tuoputuo-iamnotstone-server 1/1 Running 0 91m
kube-system kube-controller-manager-tuoputuo-iamnotstone-server 1/1 Running 0 91m
kube-system kube-proxy-87jkr 1/1 Running 0 91m
kube-system kube-scheduler-tuoputuo-iamnotstone-server 1/1 Running 0 91m
tigera-operator tigera-operator-58f56c4958-4x9tp 1/1 Running 0 89m
But when I execute the logs command:
$ kubectl logs -f install-ingress
I see this error
Error from server (NotFound): pods "install-ingress" not found
The install-ingress pod is in gitlab-managed-apps namespace. If you do not specify namespace in the kubectl command then it will search for the pod in default namespace where the install-ingress pod is not present.
Could you try below command (specifying the namespace of the pod).
kubectl logs -f install-ingress -n gitlab-managed-apps
I don't know what to do to debug it. I have 1 Kubernetes master node and three slave nodes. I have deployed on the three nodes a Gluster cluster just fine with this guide https://github.com/gluster/gluster-kubernetes/blob/master/docs/setup-guide.md.
I created volumes and everything is working. But when I reboot a slave node, and the node reconnects to the master node, the glusterd.service inside the slave node shows up dead and nothing works after this.
[root#kubernetes-node-1 /]# systemctl status glusterd.service
● glusterd.service - GlusterFS, a clustered file-system server
Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled; vendor preset: disabled)
Active: inactive (dead)
I don't know what to do from here, for example /var/log/glusterfs/glusterd.log has been updated last time 3 days ago (it's not being updated with errors after a reboot or a pod deletion+recreation).
I just want to know where glusterd crashes so I can find out why.
How can I debug this crash?
All the nodes (master + slaves) run on Ubuntu Desktop 18 64 bit LTS Virtualbox VMs.
requested logs (kubectl get all --all-namespaces):
NAMESPACE NAME READY STATUS RESTARTS AGE
glusterfs pod/glusterfs-7nl8l 0/1 Running 62 22h
glusterfs pod/glusterfs-wjnzx 1/1 Running 62 2d21h
glusterfs pod/glusterfs-wl4lx 1/1 Running 112 41h
glusterfs pod/heketi-7495cdc5fd-hc42h 1/1 Running 0 22h
kube-system pod/coredns-86c58d9df4-n2hpk 1/1 Running 0 6d12h
kube-system pod/coredns-86c58d9df4-rbwjq 1/1 Running 0 6d12h
kube-system pod/etcd-kubernetes-master-work 1/1 Running 0 6d12h
kube-system pod/kube-apiserver-kubernetes-master-work 1/1 Running 0 6d12h
kube-system pod/kube-controller-manager-kubernetes-master-work 1/1 Running 0 6d12h
kube-system pod/kube-flannel-ds-amd64-785q8 1/1 Running 5 3d19h
kube-system pod/kube-flannel-ds-amd64-8sj2z 1/1 Running 8 3d19h
kube-system pod/kube-flannel-ds-amd64-v62xb 1/1 Running 0 3d21h
kube-system pod/kube-flannel-ds-amd64-wx4jl 1/1 Running 7 3d21h
kube-system pod/kube-proxy-7f6d9 1/1 Running 5 3d19h
kube-system pod/kube-proxy-7sf9d 1/1 Running 0 6d12h
kube-system pod/kube-proxy-n9qxq 1/1 Running 8 3d19h
kube-system pod/kube-proxy-rwghw 1/1 Running 7 3d21h
kube-system pod/kube-scheduler-kubernetes-master-work 1/1 Running 0 6d12h
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 6d12h
elastic service/glusterfs-dynamic-9ad03769-2bb5-11e9-8710-0800276a5a8e ClusterIP 10.98.38.157 <none> 1/TCP 2d19h
elastic service/glusterfs-dynamic-a77e02ca-2bb4-11e9-8710-0800276a5a8e ClusterIP 10.97.203.225 <none> 1/TCP 2d19h
elastic service/glusterfs-dynamic-ad16ed0b-2bb6-11e9-8710-0800276a5a8e ClusterIP 10.105.149.142 <none> 1/TCP 2d19h
glusterfs service/heketi ClusterIP 10.101.79.224 <none> 8080/TCP 2d20h
glusterfs service/heketi-storage-endpoints ClusterIP 10.99.199.190 <none> 1/TCP 2d20h
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 6d12h
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
glusterfs daemonset.apps/glusterfs 3 3 0 3 0 storagenode=glusterfs 2d21h
kube-system daemonset.apps/kube-flannel-ds-amd64 4 4 4 4 4 beta.kubernetes.io/arch=amd64 3d21h
kube-system daemonset.apps/kube-flannel-ds-arm 0 0 0 0 0 beta.kubernetes.io/arch=arm 3d21h
kube-system daemonset.apps/kube-flannel-ds-arm64 0 0 0 0 0 beta.kubernetes.io/arch=arm64 3d21h
kube-system daemonset.apps/kube-flannel-ds-ppc64le 0 0 0 0 0 beta.kubernetes.io/arch=ppc64le 3d21h
kube-system daemonset.apps/kube-flannel-ds-s390x 0 0 0 0 0 beta.kubernetes.io/arch=s390x 3d21h
kube-system daemonset.apps/kube-proxy 4 4 4 4 4 <none> 6d12h
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
glusterfs deployment.apps/heketi 1/1 1 0 2d20h
kube-system deployment.apps/coredns 2/2 2 2 6d12h
NAMESPACE NAME DESIRED CURRENT READY AGE
glusterfs replicaset.apps/heketi-7495cdc5fd 1 1 0 2d20h
kube-system replicaset.apps/coredns-86c58d9df4 2 2 2 6d12h
requested:
tasos#kubernetes-master-work:~$ kubectl logs -n glusterfs glusterfs-7nl8l
env variable is set. Update in gluster-blockd.service
Please check these similar topics:
GlusterFS deployment on k8s cluster-- Readiness probe failed: /usr/local/bin/status-probe.sh
and
https://github.com/gluster/gluster-kubernetes/issues/539
Check tcmu-runner.log log to debug it.
UPDATE:
I think it will be your issue:
https://github.com/gluster/gluster-kubernetes/pull/557
PR is prepared, but not merged.
UPDATE 2:
https://github.com/gluster/glusterfs/issues/417
Be sure that rpcbind is installed.
I have just started a new Kubernetes 1.8.0 environment using minikube (0.27) on Windows 10.
I followed this steps but it didn't work:
https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/
When I list pods this is the result:
C:\WINDOWS\system32>kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system etcd-minikube 1/1 Running 0 23m
kube-system heapster-69b5d4974d-s9vrf 1/1 Running 0 5m
kube-system kube-addon-manager-minikube 1/1 Running 0 23m
kube-system kube-apiserver-minikube 1/1 Running 0 23m
kube-system kube-controller-manager-minikube 1/1 Running 0 23m
kube-system kube-dns-545bc4bfd4-xkt7l 3/3 Running 3 1h
kube-system kube-proxy-7jnk6 1/1 Running 0 23m
kube-system kube-scheduler-minikube 1/1 Running 0 23m
kube-system kubernetes-dashboard-5569448c6d-8zqnc 1/1 Running 2 52m
kube-system kubernetes-dashboard-869db7f6b4-ddlmq 0/1 CrashLoopBackOff 19 51m
kube-system monitoring-influxdb-78d4c6f5b6-b66m9 1/1 Running 0 4m
kube-system storage-provisioner 1/1 Running 2 1h
As you can see, I have 2 kubernets-dashboard pods now, one of then is running and the other one is CrashLookBackOff.
When I try to run minikube dashboard this is the result:
"Waiting, endpoint for service is not ready yet..."
I have tried to remove kubernetes-dashboard-869db7f6b4-ddlmq pod:
kubectl delete pod kubernetes-dashboard-869db7f6b4-ddlmq
This is the result:
"Error from server (NotFound): pods "kubernetes-dashboard-869db7f6b4-ddlmq" not found"
"Error from server (NotFound): pods "kubernetes-dashboard-869db7f6b4-ddlmq" not found"
You failed to delete the pod due to the lack of namespace (add -n kube-system). And it should be 1 dashboard pod if no modification's applied. If it still fails to run minikube dashboard after you delete the abnormal pod, more logs should be provided.
v1.8.2,installed by kubeadm
2 node:
NAME STATUS ROLES AGE VERSION
192-168-99-102.node Ready <none> 8h v1.8.2
192-168-99-108.master Ready master 8h v1.8.2
run nginx to test:
NAME READY STATUS RESTARTS AGE IP NODE
curl-6896d87888-smvjm 1/1 Running 0 7h 10.244.1.99 192-168-99-102.node
nginx-fbb985966-5jbxd 1/1 Running 0 7h 10.244.1.94 192-168-99-102.node
nginx-fbb985966-8vp9g 1/1 Running 0 8h 10.244.1.93 192-168-99-102.node
nginx-fbb985966-9bqzh 1/1 Running 1 7h 10.244.0.85 192-168-99-108.master
nginx-fbb985966-fd22h 1/1 Running 1 7h 10.244.0.83 192-168-99-108.master
nginx-fbb985966-lmgmf 1/1 Running 0 7h 10.244.1.98 192-168-99-102.node
nginx-fbb985966-lr2rh 1/1 Running 0 7h 10.244.1.96 192-168-99-102.node
nginx-fbb985966-pm2p7 1/1 Running 0 7h 10.244.1.97 192-168-99-102.node
nginx-fbb985966-t6d8b 1/1 Running 0 7h 10.244.1.95 192-168-99-102.node
kubectl exec pod on master is OK!
but when i exec pod on other node,return a error:
kubectl exec -it nginx-fbb985966-8vp9g bash
error: unable to upgrade connection: pod does not exist