Failed to open topo server on vitess with etcd - kubernetes-helm

I'm running a simple example with Helm. Take a look at the values.yaml file below:
cat << EOF | helm install helm/vitess -n vitess -f -
topology:
  cells:
    - name: 'zone1'
      keyspaces:
        - name: 'vitess'
          shards:
            - name: '0'
              tablets:
                - type: 'replica'
                  vttablet:
                    replicas: 1
      mysqlProtocol:
        enabled: true
        authType: secret
        username: vitess
        passwordSecret: vitess-db-password
      etcd:
        replicas: 3
      vtctld:
        replicas: 1
      vtgate:
        replicas: 3
vttablet:
  dataVolumeClaimSpec:
    storageClassName: nfs-slow
EOF
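To confirm that these values were actually applied to the release, the Helm release can be inspected. This is only a sketch: the release name vitess comes from the install command above, and the commands use Helm 2 / Tiller syntax, since Tiller is running in this cluster.
$ helm status vitess       # deployed resources and their state
$ helm get values vitess   # the user-supplied values the release was rendered with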
Take a look at the output of the currently running pods below:
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-fb8b8dccf-8f5kt 1/1 Running 0 32m
kube-system coredns-fb8b8dccf-qbd6c 1/1 Running 0 32m
kube-system etcd-master1 1/1 Running 0 32m
kube-system kube-apiserver-master1 1/1 Running 0 31m
kube-system kube-controller-manager-master1 1/1 Running 0 32m
kube-system kube-flannel-ds-amd64-bkg9z 1/1 Running 0 32m
kube-system kube-flannel-ds-amd64-q8vh4 1/1 Running 0 32m
kube-system kube-flannel-ds-amd64-vqmnz 1/1 Running 0 32m
kube-system kube-proxy-bd8mf 1/1 Running 0 32m
kube-system kube-proxy-nlc2b 1/1 Running 0 32m
kube-system kube-proxy-x7cd5 1/1 Running 0 32m
kube-system kube-scheduler-master1 1/1 Running 0 32m
kube-system tiller-deploy-8458f6c667-cx2mv 1/1 Running 0 27m
vitess etcd-global-6pwvnv29th 0/1 Init:0/1 0 16m
vitess etcd-operator-84db9bc774-j4wml 1/1 Running 0 30m
vitess etcd-zone1-zwgvd7spzc 0/1 Init:0/1 0 16m
vitess vtctld-86cd78b6f5-zgfqg 0/1 CrashLoopBackOff 7 16m
vitess vtgate-zone1-58744956c4-x8ms2 0/1 CrashLoopBackOff 7 16m
vitess zone1-vitess-0-init-shard-master-mbbph 1/1 Running 0 16m
vitess zone1-vitess-0-replica-0 0/6 Init:CrashLoopBackOff 7 16m
Checking the logs, I see this error:
$ kubectl logs -n vitess vtctld-86cd78b6f5-zgfqg
++ cat
+ eval exec /vt/bin/vtctld '-cell="zone1"' '-web_dir="/vt/web/vtctld"' '-web_dir2="/vt/web/vtctld2/app"' -workflow_manager_init -workflow_manager_use_election -logtostderr=true -stderrthreshold=0 -port=15000 -grpc_port=15999 '-service_map="grpc-vtctl"' '-topo_implementation="etcd2"' '-topo_global_server_address="etcd-global-client.vitess:2379"' -topo_global_root=/vitess/global
++ exec /vt/bin/vtctld -cell=zone1 -web_dir=/vt/web/vtctld -web_dir2=/vt/web/vtctld2/app -workflow_manager_init -workflow_manager_use_election -logtostderr=true -stderrthreshold=0 -port=15000 -grpc_port=15999 -service_map=grpc-vtctl -topo_implementation=etcd2 -topo_global_server_address=etcd-global-client.vitess:2379 -topo_global_root=/vitess/global
ERROR: logging before flag.Parse: E0422 02:35:34.020928 1 syslogger.go:122] can't connect to syslog
F0422 02:35:39.025400 1 server.go:221] Failed to open topo server (etcd2,etcd-global-client.vitess:2379,/vitess/global): grpc: timed out when dialing
I'm running behind Vagrant with 1 master and 2 nodes. I suspect it is an issue with eth1.
The storage is configured to use NFS.
$ kubectl logs etcd-operator-84db9bc774-j4wml
time="2019-04-22T17:26:51Z" level=info msg="skip reconciliation: running ([]), pending ([etcd-zone1-zwgvd7spzc])" cluster-name=etcd-zone1 cluster-namespace=vitess pkg=cluster
time="2019-04-22T17:26:51Z" level=info msg="skip reconciliation: running ([]), pending ([etcd-zone1-zwgvd7spzc])" cluster-name=etcd-global cluster-namespace=vitess pkg=cluster

It appears that etcd is not fully initializing. Note that neither the pod for the global lockserver (etcd-global-6pwvnv29th) nor the local one for cell zone1 (etcd-zone1-zwgvd7spzc) is ready.
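A minimal way to dig into why the etcd pods are stuck in Init (pod names taken from the listing above; adjust them to whatever names the operator generated in your cluster):
$ kubectl describe pod etcd-global-6pwvnv29th -n vitess   # init container status and pod events
$ kubectl describe pod etcd-zone1-zwgvd7spzc -n vitess
$ kubectl get events -n vitess --sort-by=.lastTimestamp   # scheduling, volume and DNS problems surface here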

Related

Debezium with kafka connetor for the mssql connector class in kubernetes

I am facing this error:
status:
conditions:
- lastTransitionTime: "2022-07-04T14:09:43.687887Z"
message: 'GET /connectors/debezium-connector-mssql/topics returned 404 (Not Found):
Unexpected status code'
reason: ConnectRestException
status: "True"
type: NotReady
observedGeneration: 3
tasksMax: 1
topics: []
This is my pod status:
:~/.ssh$ kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kafka my-cluster-entity-operator-77bcc4b67f-8qwbp 3/3 Running 0 3d19h
kafka my-cluster-kafka-0 1/1 Running 0 3d19h
kafka my-cluster-zookeeper-0 1/1 Running 0 3d19h
kafka my-connect-cluster-connect-f4b6ccc55-s5cxz 1/1 Running 0 3d19h
kafka strimzi-cluster-operator-86864b86d5-rq9pp 1/1 Running 0 3d19h
kube-system coredns-6d4b75cb6d-9gjxx 1/1 Running 0 3d19h
kube-system etcd-minikube 1/1 Running 0 3d19h
kube-system kube-apiserver-minikube 1/1 Running 0 3d19h
kube-system kube-controller-manager-minikube 1/1 Running 0 3d19h
kube-system kube-proxy-d6gl6 1/1 Running 0 3d19h
kube-system kube-scheduler-minikube 1/1 Running 0 3d19h
kube-system storage-provisioner 1/1 Running 1 (3d19h ago) 3d19h
And I opened an interactive terminal in the pod:
$ kubectl exec -it my-cluster-kafka-0 bash -n kafka
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
[kafka@my-cluster-kafka-0 kafka]$ curl my-connect-cluster-connect-api:8083/connectors/debezium-connector-mssql/status
{"name":"debezium-connector-mssql","connector":{"state":"RUNNING","worker_id":"172.17.0.7:8083"},
"tasks":[{"id":0,"state":"FAILED","worker_id":"172.17.0.7:8083","trace":"org.apache.kafka.connect.errors.ConnectException: Error configuring an instance of SqlServerConnectorTask; check the logs for details\n\tat io.debezium.connector.common.BaseSourceTask.start(BaseSourceTask.java:91)\n\tat org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:208)\n\tat org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)\n\tat org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\n"}],"type":"source"}
I am facing an error configuring an instance of SqlServerConnectorTask.
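The stack trace itself says to check the logs for details. As a sketch of a next step, assuming the connector is managed as a Strimzi KafkaConnector resource named debezium-connector-mssql (pod and namespace names taken from the listing above):
$ kubectl logs my-connect-cluster-connect-f4b6ccc55-s5cxz -n kafka | grep -i -A 20 sqlserverconnectortask
$ kubectl describe kafkaconnector debezium-connector-mssql -n kafka   # connector resource status and conditions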

kubectl does not output the logs

I print all of my Pods with:
$ kubectl get pods --all-namespaces
and the output is:
NAMESPACE NAME READY STATUS RESTARTS AGE
calico-system calico-kube-controllers-7487d7f956-hx4fp 1/1 Running 0 88m
calico-system calico-node-vn52p 1/1 Running 0 88m
calico-system calico-typha-7588984c44-m6tsz 1/1 Running 0 88m
gitlab-managed-apps install-ingress 0/1 Error 0 14m
gitlab-managed-apps install-prometheus 0/1 Error 0 12m
kube-system coredns-f9fd979d6-2n2pg 1/1 Running 0 91m
kube-system coredns-f9fd979d6-sq9bl 1/1 Running 0 91m
kube-system etcd-tuoputuo-iamnotstone-server 1/1 Running 0 91m
kube-system kube-apiserver-tuoputuo-iamnotstone-server 1/1 Running 0 91m
kube-system kube-controller-manager-tuoputuo-iamnotstone-server 1/1 Running 0 91m
kube-system kube-proxy-87jkr 1/1 Running 0 91m
kube-system kube-scheduler-tuoputuo-iamnotstone-server 1/1 Running 0 91m
tigera-operator tigera-operator-58f56c4958-4x9tp 1/1 Running 0 89m
But when I execute the logs command:
$ kubectl logs -f install-ingress
I see this error
Error from server (NotFound): pods "install-ingress" not found
The install-ingress pod is in the gitlab-managed-apps namespace. If you do not specify a namespace in the kubectl command, it will search for the pod in the default namespace, where the install-ingress pod is not present.
Could you try the command below (specifying the namespace of the pod)?
kubectl logs -f install-ingress -n gitlab-managed-apps
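If you are ever unsure which namespace a pod lives in, a quick generic way to find it:
$ kubectl get pods --all-namespaces | grep install-ingress
$ kubectl get pods -A --field-selector metadata.name=install-ingress   # same, without grep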

Gluster cluster in Kubernetes: Glusterd inactive (dead) after node reboot. How to debug?

I don't know what to do to debug it. I have 1 Kubernetes master node and three slave nodes. I have deployed a Gluster cluster on the three nodes just fine with this guide: https://github.com/gluster/gluster-kubernetes/blob/master/docs/setup-guide.md.
I created volumes and everything is working. But when I reboot a slave node and the node reconnects to the master node, the glusterd.service inside the slave node shows up as dead and nothing works after this.
[root@kubernetes-node-1 /]# systemctl status glusterd.service
● glusterd.service - GlusterFS, a clustered file-system server
Loaded: loaded (/usr/lib/systemd/system/glusterd.service; enabled; vendor preset: disabled)
Active: inactive (dead)
I don't know what to do from here; for example, /var/log/glusterfs/glusterd.log was last updated 3 days ago (it is not being updated with errors after a reboot or a pod deletion+recreation).
I just want to know where glusterd crashes so I can find out why.
How can I debug this crash?
All the nodes (master + slaves) run as Ubuntu Desktop 18 LTS 64-bit VirtualBox VMs.
requested logs (kubectl get all --all-namespaces):
NAMESPACE NAME READY STATUS RESTARTS AGE
glusterfs pod/glusterfs-7nl8l 0/1 Running 62 22h
glusterfs pod/glusterfs-wjnzx 1/1 Running 62 2d21h
glusterfs pod/glusterfs-wl4lx 1/1 Running 112 41h
glusterfs pod/heketi-7495cdc5fd-hc42h 1/1 Running 0 22h
kube-system pod/coredns-86c58d9df4-n2hpk 1/1 Running 0 6d12h
kube-system pod/coredns-86c58d9df4-rbwjq 1/1 Running 0 6d12h
kube-system pod/etcd-kubernetes-master-work 1/1 Running 0 6d12h
kube-system pod/kube-apiserver-kubernetes-master-work 1/1 Running 0 6d12h
kube-system pod/kube-controller-manager-kubernetes-master-work 1/1 Running 0 6d12h
kube-system pod/kube-flannel-ds-amd64-785q8 1/1 Running 5 3d19h
kube-system pod/kube-flannel-ds-amd64-8sj2z 1/1 Running 8 3d19h
kube-system pod/kube-flannel-ds-amd64-v62xb 1/1 Running 0 3d21h
kube-system pod/kube-flannel-ds-amd64-wx4jl 1/1 Running 7 3d21h
kube-system pod/kube-proxy-7f6d9 1/1 Running 5 3d19h
kube-system pod/kube-proxy-7sf9d 1/1 Running 0 6d12h
kube-system pod/kube-proxy-n9qxq 1/1 Running 8 3d19h
kube-system pod/kube-proxy-rwghw 1/1 Running 7 3d21h
kube-system pod/kube-scheduler-kubernetes-master-work 1/1 Running 0 6d12h
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 6d12h
elastic service/glusterfs-dynamic-9ad03769-2bb5-11e9-8710-0800276a5a8e ClusterIP 10.98.38.157 <none> 1/TCP 2d19h
elastic service/glusterfs-dynamic-a77e02ca-2bb4-11e9-8710-0800276a5a8e ClusterIP 10.97.203.225 <none> 1/TCP 2d19h
elastic service/glusterfs-dynamic-ad16ed0b-2bb6-11e9-8710-0800276a5a8e ClusterIP 10.105.149.142 <none> 1/TCP 2d19h
glusterfs service/heketi ClusterIP 10.101.79.224 <none> 8080/TCP 2d20h
glusterfs service/heketi-storage-endpoints ClusterIP 10.99.199.190 <none> 1/TCP 2d20h
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 6d12h
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
glusterfs daemonset.apps/glusterfs 3 3 0 3 0 storagenode=glusterfs 2d21h
kube-system daemonset.apps/kube-flannel-ds-amd64 4 4 4 4 4 beta.kubernetes.io/arch=amd64 3d21h
kube-system daemonset.apps/kube-flannel-ds-arm 0 0 0 0 0 beta.kubernetes.io/arch=arm 3d21h
kube-system daemonset.apps/kube-flannel-ds-arm64 0 0 0 0 0 beta.kubernetes.io/arch=arm64 3d21h
kube-system daemonset.apps/kube-flannel-ds-ppc64le 0 0 0 0 0 beta.kubernetes.io/arch=ppc64le 3d21h
kube-system daemonset.apps/kube-flannel-ds-s390x 0 0 0 0 0 beta.kubernetes.io/arch=s390x 3d21h
kube-system daemonset.apps/kube-proxy 4 4 4 4 4 <none> 6d12h
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
glusterfs deployment.apps/heketi 1/1 1 0 2d20h
kube-system deployment.apps/coredns 2/2 2 2 6d12h
NAMESPACE NAME DESIRED CURRENT READY AGE
glusterfs replicaset.apps/heketi-7495cdc5fd 1 1 0 2d20h
kube-system replicaset.apps/coredns-86c58d9df4 2 2 2 6d12h
requested:
tasos@kubernetes-master-work:~$ kubectl logs -n glusterfs glusterfs-7nl8l
env variable is set. Update in gluster-blockd.service
Please check these similar topics:
GlusterFS deployment on k8s cluster-- Readiness probe failed: /usr/local/bin/status-probe.sh
and
https://github.com/gluster/gluster-kubernetes/issues/539
Check the tcmu-runner.log file to debug it.
UPDATE:
I think this is your issue:
https://github.com/gluster/gluster-kubernetes/pull/557
The PR is prepared but not merged yet.
UPDATE 2:
https://github.com/gluster/glusterfs/issues/417
Be sure that rpcbind is installed.
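As a sketch of what to check on the rebooted node (or inside the glusterfs pod, wherever glusterd runs in your setup), using plain systemd commands:
$ systemctl status rpcbind                   # confirm rpcbind is installed and running
$ systemctl enable --now rpcbind             # make sure it comes up on boot
$ journalctl -u glusterd -b --no-pager       # glusterd messages from the current boot explain why it is dead
$ systemctl restart glusterd && systemctl status glusterd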

Kubernetes dashboard (web UI) not working

I have just started a new Kubernetes 1.8.0 environment using minikube (0.27) on Windows 10.
I followed these steps but it didn't work:
https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/
When I list pods this is the result:
C:\WINDOWS\system32>kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system etcd-minikube 1/1 Running 0 23m
kube-system heapster-69b5d4974d-s9vrf 1/1 Running 0 5m
kube-system kube-addon-manager-minikube 1/1 Running 0 23m
kube-system kube-apiserver-minikube 1/1 Running 0 23m
kube-system kube-controller-manager-minikube 1/1 Running 0 23m
kube-system kube-dns-545bc4bfd4-xkt7l 3/3 Running 3 1h
kube-system kube-proxy-7jnk6 1/1 Running 0 23m
kube-system kube-scheduler-minikube 1/1 Running 0 23m
kube-system kubernetes-dashboard-5569448c6d-8zqnc 1/1 Running 2 52m
kube-system kubernetes-dashboard-869db7f6b4-ddlmq 0/1 CrashLoopBackOff 19 51m
kube-system monitoring-influxdb-78d4c6f5b6-b66m9 1/1 Running 0 4m
kube-system storage-provisioner 1/1 Running 2 1h
As you can see, I now have 2 kubernetes-dashboard pods; one of them is Running and the other one is in CrashLoopBackOff.
When I try to run minikube dashboard this is the result:
"Waiting, endpoint for service is not ready yet..."
I have tried to remove kubernetes-dashboard-869db7f6b4-ddlmq pod:
kubectl delete pod kubernetes-dashboard-869db7f6b4-ddlmq
This is the result:
"Error from server (NotFound): pods "kubernetes-dashboard-869db7f6b4-ddlmq" not found"
"Error from server (NotFound): pods "kubernetes-dashboard-869db7f6b4-ddlmq" not found"
The delete failed because you did not specify the namespace (add -n kube-system). There should be only 1 dashboard pod if no modifications have been applied. If minikube dashboard still fails to run after you delete the abnormal pod, more logs should be provided.
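For reference, a minimal sequence (pod name taken from the listing above) would be:
$ kubectl delete pod kubernetes-dashboard-869db7f6b4-ddlmq -n kube-system
$ kubectl get pods -n kube-system -w   # watch the replacement pod come up
$ minikube dashboard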

kubectl cannot exec or logs pod on other node

v1.8.2, installed by kubeadm.
2 nodes:
NAME STATUS ROLES AGE VERSION
192-168-99-102.node Ready <none> 8h v1.8.2
192-168-99-108.master Ready master 8h v1.8.2
run nginx to test:
NAME READY STATUS RESTARTS AGE IP NODE
curl-6896d87888-smvjm 1/1 Running 0 7h 10.244.1.99 192-168-99-102.node
nginx-fbb985966-5jbxd 1/1 Running 0 7h 10.244.1.94 192-168-99-102.node
nginx-fbb985966-8vp9g 1/1 Running 0 8h 10.244.1.93 192-168-99-102.node
nginx-fbb985966-9bqzh 1/1 Running 1 7h 10.244.0.85 192-168-99-108.master
nginx-fbb985966-fd22h 1/1 Running 1 7h 10.244.0.83 192-168-99-108.master
nginx-fbb985966-lmgmf 1/1 Running 0 7h 10.244.1.98 192-168-99-102.node
nginx-fbb985966-lr2rh 1/1 Running 0 7h 10.244.1.96 192-168-99-102.node
nginx-fbb985966-pm2p7 1/1 Running 0 7h 10.244.1.97 192-168-99-102.node
nginx-fbb985966-t6d8b 1/1 Running 0 7h 10.244.1.95 192-168-99-102.node
kubectl exec for a pod on the master is OK!
But when I exec into a pod on the other node, it returns an error:
kubectl exec -it nginx-fbb985966-8vp9g bash
error: unable to upgrade connection: pod does not exist
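A minimal way to see which node and address the API server associates with the failing pod (pod and node names taken from the listing above), without assuming a root cause:
$ kubectl get nodes -o wide                                   # INTERNAL-IP column recorded for each node
$ kubectl describe pod nginx-fbb985966-8vp9g | grep -i node   # node and IP the pod is bound to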