CockroachDB distributed workload on all nodes - kubernetes

I've deployed a CockroachDB cluster on Kubernetes using this guide:
https://github.com/cockroachlabs-field/kubernetes-examples/blob/master/SECURE.md
I deployed it with
$ helm install k8crdb --set Secure.Enabled=true cockroachdb/cockroachdb --namespace=thesis-crdb
Here is how it looks when I list it with $ helm list --namespace=thesis-crdb
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
k8crdb thesis-crdb 1 2021-01-29 20:18:25.5710691 +0100 CET deployed cockroachdb-5.0.4 20.2.4
Here is how it looks when I list it with $ kubectl get all --namespace=thesis-crdb
NAME READY STATUS RESTARTS AGE
pod/k8crdb-cockroachdb-0 1/1 Running 0 3h1m
pod/k8crdb-cockroachdb-1 1/1 Running 0 3h1m
pod/k8crdb-cockroachdb-2 1/1 Running 0 3h1m
pod/k8crdb-cockroachdb-init-j2h7t 0/1 Completed 0 3h1m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/k8crdb-cockroachdb ClusterIP None <none> 26257/TCP,8080/TCP 3h1m
service/k8crdb-cockroachdb-public ClusterIP 10.99.163.201 <none> 26257/TCP,8080/TCP 3h1m
NAME READY AGE
statefulset.apps/k8crdb-cockroachdb 3/3 3h1m
NAME COMPLETIONS DURATION AGE
job.batch/k8crdb-cockroachdb-init 1/1 33s 3h1m
Now I wanna simulate traffic to this cluster. First I access the pod with: $ kubectl exec -i -t -n thesis-crdb k8crdb-cockroachdb-0 -c db "--" sh -c "clear; (bash || ash || sh)"
Which gets me inside the first pod/node.
From here I initiate the workload
[root@k8crdb-cockroachdb-0 cockroach]# cockroach workload init movr 'postgresql://root@localhost:26257?sslmode=disable'
And then I run the workload for 5 minutes
[root@k8crdb-cockroachdb-0 cockroach]# cockroach workload run movr --duration=5m 'postgresql://root@localhost:26257?sslmode=disable'
I am aware that I'm running the workload on one node, but I was under the impression that the workload would be distributed among all nodes. When I monitor the performance in the CockroachDB console, I see that only the first node is doing any work and the other nodes are idle.
As you can see, the second (and third) node haven't had any workload at all. Is this just a visual glitch in the console? Or how can I run the workload so it gets distributed evenly among all nodes in the cluster?
-UPDATE-
Yes, glad you brought up the cockroachdb-client-secure pod, because that's where I no longer could follow the guide. I tried as they did in the guide by doing: $ curl https://raw.githubusercontent.com/cockroachdb/cockroach/master/cloud/kubernetes/client-secure.yaml | sed -e 's/serviceAccountName\: cockroachdb/serviceAccountName\: k8crdb-cockroachdb/g' | kubectl create -f -
But it throws this error:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1638 100 1638 0 0 4136 0 --:--:-- --:--:-- --:--:-- 4146
Error from server (Forbidden): error when creating "STDIN": pods "cockroachdb-client-secure" is forbidden: error looking up service account default/k8crdb-cockroachdb: serviceaccount "k8crdb-cockroachdb" not found
I also don't know if my certificates have been approved, because when I try this:
$ kubectl get csr k8crdb-cockroachdb-0 --namespace=thesis-crdb
It throws this:
Error from server (NotFound): certificatesigningrequests.certificates.k8s.io "k8crdb-cockroachdb-0" not found
And when I try to approve the certificate: $ kubectl certificate approve k8crdb-cockroachdb-0 --namespace=thesis-crdb
It throws:
Error from server (NotFound): certificatesigningrequests.certificates.k8s.io "k8crdb-cockroachdb-0" not found
Any idea how to proceed from here?

This is not a glitch. Nodes will only receive SQL traffic if clients connect to them and issue SQL statements. It seems like you're running the workload by logging in to one of the cockroach pods and directing it to connect to that pod on its local port. That means only that pod is going to receive queries. The cockroach workload subcommand takes an arbitrary number of pgurl strings and will balance load over all of them. Note also that k8crdb-cockroachdb-public represents a load balancer over all of the cockroach pods in the cluster.
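For example, here is a minimal sketch of passing several pgurls yourself from inside a pod, assuming the standard <pod>.<service>.<namespace> DNS names that the StatefulSet creates and the same connection parameters you used above (adjust the hostnames and TLS settings to your actual setup):
cockroach workload run movr --duration=5m \
  'postgresql://root@k8crdb-cockroachdb-0.k8crdb-cockroachdb.thesis-crdb:26257?sslmode=disable' \
  'postgresql://root@k8crdb-cockroachdb-1.k8crdb-cockroachdb.thesis-crdb:26257?sslmode=disable' \
  'postgresql://root@k8crdb-cockroachdb-2.k8crdb-cockroachdb.thesis-crdb:26257?sslmode=disable'
Each pgurl names one node, so the workload client spreads its connections across all three.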
If you look at the guide you posted, it continues on to describe how to deploy the cockroachdb-client-secure pod. If you were to run the workload there, pointed at the load balancer with a connection string like the one below, the queries would be spread across all of the nodes:
'postgres://root@k8crdb-cockroachdb-public?sslcert=cockroach-certs%2Fclient.root.crt&sslkey=cockroach-certs%2Fclient.root.key&sslrootcert=cockroach-certs%2Fca.crt&sslmode=verify-full'
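For example, from inside that client pod the full commands would look something like this (a sketch only: the 26257 port is added explicitly, and the certificate paths are assumed to be mounted at cockroach-certs/ as in the guide):
cockroach workload init movr \
  'postgres://root@k8crdb-cockroachdb-public:26257?sslcert=cockroach-certs%2Fclient.root.crt&sslkey=cockroach-certs%2Fclient.root.key&sslrootcert=cockroach-certs%2Fca.crt&sslmode=verify-full'
cockroach workload run movr --duration=5m \
  'postgres://root@k8crdb-cockroachdb-public:26257?sslcert=cockroach-certs%2Fclient.root.crt&sslkey=cockroach-certs%2Fclient.root.key&sslrootcert=cockroach-certs%2Fca.crt&sslmode=verify-full'
Because the public service load-balances the connections, each node should then show SQL traffic in the console.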
UPDATE
I'm not a k8s expert, but I think your issue creating the client pod relates to the namespace. The manifest currently assumes that everything is in the default namespace, while you're working in thesis-crdb. Consider adding a namespace flag to the kubectl create -f - command (sketched below). Or, potentially, consider setting the namespace for the session:
kubectl config set-context --current --namespace=thesis-crdb
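If you go the flag route instead, here is a sketch of your original create command with the namespace added:
curl https://raw.githubusercontent.com/cockroachdb/cockroach/master/cloud/kubernetes/client-secure.yaml \
  | sed -e 's/serviceAccountName: cockroachdb/serviceAccountName: k8crdb-cockroachdb/g' \
  | kubectl create --namespace=thesis-crdb -f -
That way the pod is created alongside the k8crdb-cockroachdb service account in thesis-crdb instead of in default.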

Related

Does `kubectl log deployment/name` get all pods or just one pod?

I need to see the logs of all the pods in a deployment with N worker pods
When I run kubectl logs deployment/name --tail=0 --follow, the command syntax makes me assume that it will tail all pods in the deployment.
However, when the process runs I don't see the output I expect until I manually view the logs for each of the N pods in the deployment.
Does kubectl log deployment/name get all pods or just one pod?
If you run kubectl logs against a deployment, it will return the logs of only one pod from that deployment.
However, you can accomplish what you are trying to achieve by using the -l flag to return the logs of all pods matching a label.
For example, let's say you create a deployment using:
kubectl create deployment my-dep --image=nginx --replicas=3
Each of the pods gets a label app=my-dep, as seen here:
$ kubectl get pods -l app=my-dep
NAME READY STATUS RESTARTS AGE
my-dep-6d4ddbf4f7-8jnsw 1/1 Running 0 6m36s
my-dep-6d4ddbf4f7-9jd7g 1/1 Running 0 6m36s
my-dep-6d4ddbf4f7-pqx2w 1/1 Running 0 6m36s
So, if you want to get the combined logs of all pods in this deployment you can use this command:
kubectl logs -l app=my-dep
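If you also want to stream the combined output and tell the pods apart, newer kubectl versions let you add flags such as --follow and --prefix (a sketch, assuming your kubectl supports --prefix):
# Follow logs from every pod matching the label, prefixing each line with the pod name
kubectl logs -l app=my-dep --follow --prefix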
Only one pod seems to be the answer.
I went here: How do I get logs from all pods of a Kubernetes replication controller? and it seems that the command kubectl logs deployment/name only shows one pod of the N.
Also, when you execute kubectl logs on a deployment, it prints a message saying that the output is from only one pod (not all the pods).

gcloud - BrokerCell cloud-run-events/default is not ready

I am trying to use Google Cloud for my Pub/Sub event-driven application.
Currently, I am setting up Cloud Run for Anthos following the tutorials below:
https://codelabs.developers.google.com/codelabs/cloud-run-events-anthos#7
https://cloud.google.com/anthos/run/archive/docs/events/cluster-configuration
I have created the GKE clusters. It is successful and is up and running.
However, I am getting the below error when I try to create event broker.
$ gcloud beta events brokers create default --namespace default
X Creating Broker... BrokerCell cloud-run-events/default is not ready
- Creating Broker...
Failed.
ERROR: gcloud crashed (TransportError): HTTPSConnectionPool(host='oauth2.googleapis.com', port=443): Max retries exceeded with url: /token (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')])")))
gcloud's default CA certificates failed to verify your connection, which can happen if you are behind a proxy or firewall.
To use a custom CA certificates file, please run the following command:
gcloud config set core/custom_ca_certs_file /path/to/ca_certs
However, when I rerun the command, it says the broker already exists:
$ gcloud beta events brokers create default --namespace default
ERROR: (gcloud.beta.events.brokers.create) Broker [default] already exists.
Checking the status of broker, it shows BrokerCellNotReady
$ kubectl get broker -n default
NAME URL AGE READY REASON
default http://default-brokercell-ingress.cloud-run-events.svc.cluster.local/default/default 39m Unknown BrokerCellNotReady
And I am getting a Pending status for the default-brokercell-fanout pod.
$ kubectl get pods -n cloud-run-events
NAME READY STATUS RESTARTS AGE
controller-648c495796-b5ccb 1/1 Running 0 105m
default-brokercell-fanout-855494bb9b-2c7zv 0/1 Pending 0 100m
default-brokercell-ingress-5f8cdc6467-wwq42 1/1 Running 0 100m
default-brokercell-retry-6f4f9696d6-tg898 1/1 Running 0 100m
webhook-85f7bc69b4-qrpck 1/1 Running 0 109m
I couldn't find any discussion related to this error.
Please give me some ideas to resolve this issue.
I encountered the same issue. The reason might be that the cluster setup does not have enough CPU resources.
You can check it with:
kubectl describe pod/default-brokercell-fanout-855494bb9b-2c7zv -n cloud-run-events
If the output shows a FailedScheduling event with a message like "Insufficient cpu", then that's the reason.
After knowing the root cause, you can fix it in various ways, e.g., enable auto-scaling in your node pool.
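For example, a sketch of enabling autoscaling on an existing GKE node pool (the cluster, zone, and pool names here are placeholders for your own):
gcloud container clusters update my-cluster \
  --zone us-central1-a \
  --node-pool default-pool \
  --enable-autoscaling --min-nodes 1 --max-nodes 4
With room to add nodes, the scheduler can then place the pending default-brokercell-fanout pod.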

fail to run istio-ingressgateway, got Readiness probe failed: connection refused

I failed to deploy Istio and ran into this problem when I tried to deploy it using istioctl install --set profile=default -y. The output looks like:
➜ istio-1.11.4 istioctl install --set profile=default -y
✔ Istio core installed
✔ Istiod installed
✘ Ingress gateways encountered an error: failed to wait for resource: resources not ready after 5m0s: timed out waiting for the condition
Deployment/istio-system/istio-ingressgateway (containers with unready status: [istio-proxy])
- Pruning removed resources Error: failed to install manifests: errors occurred during operation
After running kubectl get pods -n=istio-system, I found that the istio-ingressgateway pod had been created; here is the result of describing it:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m36s default-scheduler Successfully assigned istio-system/istio-ingressgateway-8dbb57f65-vc85p to k8s-slave
Normal Pulled 4m35s kubelet Container image "docker.io/istio/proxyv2:1.11.4" already present on machine
Normal Created 4m35s kubelet Created container istio-proxy
Normal Started 4m35s kubelet Started container istio-proxy
Warning Unhealthy 3m56s (x22 over 4m34s) kubelet Readiness probe failed: Get "http://10.244.1.4:15021/healthz/ready": dial tcp 10.244.1.4:15021: connect: connection refused
And I can't get the logs of this pod:
➜ ~ kubectl logs pods/istio-ingressgateway-8dbb57f65-vc85p -n=istio-system
Error from server: Get "https://192.168.0.154:10250/containerLogs/istio-system/istio-ingressgateway-8dbb57f65-vc85p/istio-proxy": dial tcp 192.168.0.154:10250: i/o timeout
I ran all these commands on two VMs in Huawei Cloud, with a 2C8G master and a 2C4G slave running Ubuntu 18.04. I have reinstalled the environment and the Kubernetes cluster, but that doesn't help.
Without ingressgateway
I also tried istioctl install --set profile=minimal -y, which only runs istiod. But when I try to run httpbin (kubectl apply -f samples/httpbin/httpbin.yaml) with auto-injection on, the deployment can't create a pod.
➜ istio-1.11.4 kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
httpbin 0/1 0 0 5m24s
➜ istio-1.11.4 kubectl describe deployment/httpbin
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 6m6s deployment-controller Scaled up replica set httpbin-74fb669cc6 to 1
When I unlabel the default namespace (kubectl label namespace default istio-injection-), everything works fine.
I want to deploy istio-ingressgateway and run demos like httpbin through it, but I have no idea how to resolve this situation. Thanks for any help.
I made a silly mistake Orz.
After communication with my cloud provider, I was informed that there was a network security policy on my cloud servers. Strangely, one server had full access while the other had only partial access (only ports like 80, 443, and so on were allowed). After I changed the policy, everything worked fine.
For anyone who runs into a similar issue: after hours of searching on Google, I found that these problems usually come down to network issues such as DNS configuration, k8s configuration, or the server network itself. As howardjohn said in this issue, this is not an Istio problem.
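If you suspect something similar, here is a quick sketch for checking whether the ports from the errors above are actually reachable (IPs and ports taken from the output earlier; anything blocked here points back at the cloud security policy):
# kubelet port on the worker node (needed for kubectl logs/exec)
nc -vz 192.168.0.154 10250
# istio-proxy readiness port on the gateway pod (run this from the node hosting the pod)
nc -vz 10.244.1.4 15021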

Minikube got stuck when creating container

I recently started learning Kubernetes by using Minikube locally on my Mac. Previously, I was able to start a local Kubernetes cluster with Minikube 0.10.0, create a deployment, and view the Kubernetes dashboard.
Yesterday I tried to delete the cluster and re-did everything from scratch. However, I found I cannot get the assets deployed and cannot view the dashboard. From what I saw, everything seemed to get stuck during container creation.
After I ran minikube start, it reported
Starting local Kubernetes cluster...
Kubectl is now configured to use the cluster.
When I ran kubectl get pods --all-namespaces, it reported (pay attention to the STATUS column):
kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system kube-addon-manager-minikube 0/1 ContainerCreating 0 51s
docker ps showed nothing:
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
minikube status tells me the VM and cluster are running:
minikubeVM: Running
localkube: Running
If I tried to create a deployment and an autoscaler, I was told they were created successfully:
kubectl create -f configs
deployment "hello-minikube" created
horizontalpodautoscaler "hello-minikube-autoscaler" created
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default hello-minikube-661011369-1pgey 0/1 ContainerCreating 0 1m
default hello-minikube-661011369-91iyw 0/1 ContainerCreating 0 1m
kube-system kube-addon-manager-minikube 0/1 ContainerCreating 0 21m
When exposing the service, it said:
$ kubectl expose deployment hello-minikube --type=NodePort
service "hello-minikube" exposed
$ kubectl get service
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hello-minikube 10.0.0.32 <nodes> 8080/TCP 6s
kubernetes 10.0.0.1 <none> 443/TCP 22m
When I tried to access the service, I was told:
curl $(minikube service hello-minikube --url)
Waiting, endpoint for service is not ready yet...
docker ps still showed nothing. It looked to me like everything got stuck when creating a container. I tried some other ways to work around this issue:
Upgraded to minikube 0.11.0
Used the xhyve driver instead of the VirtualBox driver
Deleted everything cached, like ~/.minikube, ~/.kube, and the cluster, and re-tried
None of them worked for me.
Kubernetes is still new to me and I would like to know:
How can I troubleshoot this kind of issue?
What could be the cause of this issue?
Any help is appreciated. Thanks.
It turned out to be a network problem in my case.
The pod status was "ContainerCreating", and I found that during container creation the Docker image is pulled from gcr.io, which is inaccessible in China (blocked by the GFW). The previous time it worked for me because I happened to be connected to it via a VPN.
I haven't tried minikube, but I do use Kubernetes. With the information provided it is difficult to say what the cause of the issue is. Your minikube has no problem creating resources, but ContainerCreating points to a problem with the Docker daemon, improper communication between kube-apiserver and the Docker daemon, or some problem with the kubelet.
You can try the following command:
kubectl describe po POD_NAME
This will give you the pod's events. Maybe this will provide a path to the root cause of the issue.
You may also check the logs of kubelet to get the events.
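For example, with the pod names from the output above (and minikube's own log bundle, which includes the kubelet/localkube logs):
kubectl describe po kube-addon-manager-minikube --namespace=kube-system
kubectl describe po hello-minikube-661011369-1pgey
minikube logs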
I had this problem on Windows, but it was related to an NTLM proxy. I deleted the minikube VM then recreated it with the correct proxy settings for my CNTLM installation:
minikube start \
--docker-env http_proxy=http://10.0.2.2:3128 \
--docker-env https_proxy=http://10.0.2.2:3128 \
--docker-env no_proxy=localhost,127.0.0.1,::1,192.168.99.100
See https://blog.alexellis.io/minikube-behind-proxy/
The horizontalpodautoscaler (HPA) requires Heapster in order to work, so you'll need to run Heapster in minikube. You can always debug these kinds of issues with minikube logs or interactively through the dashboard found at minikube dashboard.
You can find the steps to run heapster and grafana at https://github.com/kubernetes/heapster
For me, it took several minutes before the ContainerCreating problem appeared. After executing the following command:
systemctl status kube-controller-manager.service
I get this error:
Sync "default/redis-master-2229813293" failed with unable to create pods: No API token found for service account "default", retry after the token is automatically created and added to the service account.
There are two ways to solve this:
Set up the service account with a token
Remove ServiceAccount from the KUBE_ADMISSION_CONTROL setting of the API server (a sketch follows below)
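For the second option, a sketch assuming a package-based install that configures the API server through /etc/kubernetes/apiserver (adjust the path and the remaining plugin list to your setup):
# /etc/kubernetes/apiserver: drop ServiceAccount from the admission-control list
KUBE_ADMISSION_CONTROL="--admission-control=NamespaceLifecycle,NamespaceExists,LimitRanger,SecurityContextDeny,ResourceQuota"
# then restart the API server
systemctl restart kube-apiserver.service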

A pending status after command "kubectl create -f busybox.yaml"

Here is my setup, running on my Mac:
K8S Cluster(on VirtualBox, 1*master, 2*workers)
OS Ubuntu 15.04
K8S version 1.1.1
When I try to create a pod from busybox.yaml, it goes to Pending status.
How can I resolve it?
I pasted the status below for reference (kubectl get nodes, kubectl get ev, and kubectl get pods).
Status
kubectl get nodes
192.168.56.11 kubernetes.io/hostname=192.168.56.11 Ready 7d
192.168.56.12 kubernetes.io/hostname=192.168.56.12 Ready 7d
kubectl get ev
1h 39s 217 busybox Pod FailedScheduling {scheduler } no nodes available to schedule pods
kubectl get pods
NAME READY STATUS RESTARTS AGE
busybox 0/1 Pending 0 1h
And I also added one more status.
"kubectl describe pod busybox" or "kubectl get pod busybox -o yaml" output could be useful.
Since you didn't specify, I assume that the busybox pod was created in the default namespace, and that no resource requirements nor nodeSelectors were specified.
In many cluster setups, including vagrant, we create a LimitRange for the default namespace to request a nominal amount of CPU for each pod (.1 cores). You should be able to confirm that this is the case using "kubectl get pod busybox -o yaml".
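You can also inspect that LimitRange directly, and see whether a default CPU request is being applied, with for example:
kubectl get limitrange --namespace=default -o yaml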
We also create a number of system pods automatically. You should be able to see them using "kubectl get pods --all-namespaces -o wide".
It is possible for nodes with sufficiently small capacity to fill up with just system pods, though I wouldn't expect this to happen with 2-core nodes.
If the busybox pod were created before the nodes were registered, that could be another reason for that event, though I would expect to see a subsequent event for the reason that the pod remained pending even after nodes were created.
Please take a look at the troubleshooting guide for more troubleshooting tips, and follow up here or on Slack (slack.k8s.io) with more information.
http://kubernetes.io/v1.1/docs/troubleshooting.html