HTTPError 400 while deploying production-ready GitLab on Google Kubernetes Engine - kubernetes

I'm following the official tutorial, Deploying production-ready GitLab on Google Kubernetes Engine.
The step "Create the PostgreSQL instance and database", item 1 ("Create the Cloud SQL database that GitLab will use to store most of its metadata"), gave me this error:
gcloud beta sql instances create gitlab-db --network default \
--database-version=POSTGRES_9_6 --cpu 4 --memory 15 --no-assign-ip \
--storage-auto-increase --zone us-central1-a
ERROR: (gcloud.beta.sql.instances.create) HTTPError 400: Invalid request: Project {here_stands_my_correct_Project_ID} has invalid private network
name https://compute.googleapis.com/compute/v1/projects/{here_stands_my_correct_Project_ID}/global/networks/default.
Any ideas? Thank you!
EDIT: I used the following command instead, then manually switched gitlab-db to a Private IP with the attached network (default) in the Console, and got a 503 error at the end of the tutorial.
gcloud beta sql instances create gitlab-db --database-version=POSTGRES_9_6 --cpu 4 --memory 15 --storage-auto-increase --zone us-central1-a
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
gitlab-certmanager-788c6859c6-szqqm 1/1 Running 0 28m
gitlab-gitaly-0 0/1 Pending 0 28m
gitlab-gitlab-runner-6cfb858756-l8gxr 0/1 CrashLoopBackOff 6 28m
gitlab-gitlab-shell-6cc87fcd4c-2mqph 1/1 Running 0 28m
gitlab-gitlab-shell-6cc87fcd4c-jvp8n 1/1 Running 0 27m
gitlab-issuer.1-cx8tm 0/1 Completed 0 28m
gitlab-nginx-ingress-controller-5f486c5f7b-md8rj 1/1 Running 0 28m
gitlab-nginx-ingress-controller-5f486c5f7b-rps6m 1/1 Running 0 28m
gitlab-nginx-ingress-controller-5f486c5f7b-xc8fv 1/1 Running 0 28m
gitlab-nginx-ingress-default-backend-7f87d67c8-6xhhz 1/1 Running 0 28m
gitlab-nginx-ingress-default-backend-7f87d67c8-7w2s2 1/1 Running 0 28m
gitlab-registry-8dfc8f979-9hdbr 0/1 Init:0/2 0 28m
gitlab-registry-8dfc8f979-qr5nd 0/1 Init:0/2 0 27m
gitlab-sidekiq-all-in-1-88f47878-26nh8 0/1 Init:CrashLoopBackOff 7 28m
gitlab-task-runner-74fc4ccdb9-pm592 1/1 Running 0 28m
gitlab-unicorn-5b74ffdff8-4kkj4 0/2 Init:CrashLoopBackOff 7 28m
gitlab-unicorn-5b74ffdff8-nz662 0/2 Init:CrashLoopBackOff 7 27m
kube-state-metrics-57b88466db-h7xkj 1/1 Running 0 27m
node-exporter-q4bpv 1/1 Running 0 27m
node-exporter-x8mtj 1/1 Running 0 27m
node-exporter-xrdlv 1/1 Running 0 27m
prometheus-k8s-5cf4c4cf6c-hsntr 2/2 Running 1 27m

Possibly this is because the feature is still in beta and not all features and/or options work correctly yet.
I'd advise checking whether you have just one network available.
You can do that with gcloud compute networks list.
$ gcloud compute networks list
NAME SUBNET_MODE BGP_ROUTING_MODE IPV4_RANGE GATEWAY_IPV4
default AUTO REGIONAL
If you see only the default network, then there is no need to provide the --network flag at all.
Also, from what it looks like, the instance needs an IP address, either public or private, so you can leave out the --no-assign-ip flag.
A working command might look like this:
gcloud beta sql instances create gitlab-db --database-version=POSTGRES_9_6 --cpu 4 --memory 15 --storage-auto-increase --zone us-central1-a
You can read about the flags and usage in the docs for gcloud beta sql instances create.
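Alternatively, the "invalid private network" error usually means private services access has not been configured for the VPC yet, which is required before a Cloud SQL instance can get a private IP. A sketch of setting it up, if you do want to keep --no-assign-ip (the address name google-managed-services-default is an assumption; any unused name works):

```shell
# Reserve an IP range for Google-managed services in the default VPC.
gcloud compute addresses create google-managed-services-default \
  --global --purpose=VPC_PEERING --prefix-length=16 --network=default

# Peer the VPC with the Service Networking service so Cloud SQL can use it.
gcloud services vpc-peerings connect \
  --service=servicenetworking.googleapis.com \
  --ranges=google-managed-services-default --network=default

# The original command with --no-assign-ip should now succeed.
gcloud beta sql instances create gitlab-db --network default \
  --database-version=POSTGRES_9_6 --cpu 4 --memory 15 --no-assign-ip \
  --storage-auto-increase --zone us-central1-a
```

This keeps the instance private-IP-only, as the tutorial intends.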

Related

How to debug GKE internal network issue?

UPDATE 1:
Some more logs from api-servers:
https://gist.github.com/nvcnvn/47df8798e798637386f6e0777d869d4f
This question is mostly about debugging methods for current GKE, but solutions are welcome too.
We're using GKE version 1.22.3-gke.1500 with the following configuration:
We've recently been facing an issue where commands like kubectl logs and kubectl exec don't work, and deleting a namespace takes forever.
Checking some services inside the cluster, it seems that for some reason some network operations just randomly fail. For example, metrics-server keeps crashing with these error logs:
message: "pkg/mod/k8s.io/client-go#v0.19.10/tools/cache/reflector.go:156: Failed to watch *v1.Node: failed to list *v1.Node: Get "https://10.97.0.1:443/api/v1/nodes?resourceVersion=387681528": net/http: TLS handshake timeout"
HTTP request timeout also:
unable to fully scrape metrics: unable to fully scrape metrics from node gke-staging-n2d-standard-8-78c35b3a-6h16: unable to fetch metrics from node gke-staging-n2d-standard-8-78c35b3a-6h16: Get "http://10.148.15.217:10255/stats/summary?only_cpu_and_memory=true": context deadline exceeded
I also tried restarting (via kubectl delete) most of the pods in this list:
kubectl get pod
NAME READY STATUS RESTARTS AGE
event-exporter-gke-5479fd58c8-snq26 2/2 Running 0 4d7h
fluentbit-gke-gbs2g 2/2 Running 0 4d7h
fluentbit-gke-knz2p 2/2 Running 0 85m
fluentbit-gke-ljw8h 2/2 Running 0 30h
gke-metadata-server-dtnvh 1/1 Running 0 4d7h
gke-metadata-server-f2bqw 1/1 Running 0 30h
gke-metadata-server-kzcv6 1/1 Running 0 85m
gke-metrics-agent-4g56c 1/1 Running 12 (3h6m ago) 4d7h
gke-metrics-agent-hnrll 1/1 Running 13 (13h ago) 30h
gke-metrics-agent-xdbrw 1/1 Running 0 85m
konnectivity-agent-87bc84bb7-g9nd6 1/1 Running 0 2m59s
konnectivity-agent-87bc84bb7-rkhhh 1/1 Running 0 3m51s
konnectivity-agent-87bc84bb7-x7pk4 1/1 Running 0 3m50s
konnectivity-agent-autoscaler-698b6d8768-297mh 1/1 Running 0 83m
kube-dns-77d9986bd5-2m8g4 4/4 Running 0 3h24m
kube-dns-77d9986bd5-z4j62 4/4 Running 0 3h24m
kube-dns-autoscaler-f4d55555-dmvpq 1/1 Running 0 83m
kube-proxy-gke-staging-n2d-standard-8-78c35b3a-8299 1/1 Running 0 11s
kube-proxy-gke-staging-n2d-standard-8-78c35b3a-fp5u 1/1 Running 0 11s
kube-proxy-gke-staging-n2d-standard-8-78c35b3a-rkdp 1/1 Running 0 11s
l7-default-backend-7db896cb4-mvptg 1/1 Running 0 83m
metrics-server-v0.4.4-fd9886cc5-tcscj 2/2 Running 82 33h
netd-5vpmc 1/1 Running 0 30h
netd-bhq64 1/1 Running 0 85m
netd-n6jmc 1/1 Running 0 4d7h
Some logs from metrics-server:
https://gist.github.com/nvcnvn/b77eb02705385889961aca33f0f841c7
If you cannot use kubectl to get info from your cluster, you can try to access it through its RESTful API:
http://blog.madhukaraphatak.com/understanding-k8s-api-part-2/
Try to delete the metrics-server pods or get logs from them using podman or a curl command.
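A sketch of querying the API server directly, bypassing kubectl (the cluster IP 10.97.0.1 comes from the error logs above; adjust for your cluster):

```shell
# Option 1: local proxy (works if kubectl can still reach the API server).
kubectl proxy --port=8001 &
curl -s http://127.0.0.1:8001/api/v1/namespaces/kube-system/pods

# Option 2: from inside any running pod, use the mounted service-account token.
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
curl -sk -H "Authorization: Bearer $TOKEN" \
  https://10.97.0.1:443/api/v1/nodes
```

If Option 2 times out from inside a pod while Option 1 works from your workstation, that points at the in-cluster network path (e.g. konnectivity or kube-proxy) rather than the API server itself.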

Debug a pod stuck in pending state [duplicate]

This question already has an answer here:
Error "pod has unbound immediate PersistentVolumeClaim" during statefulset deployment
(1 answer)
Closed 2 years ago.
How can I debug a pod stuck in the Pending state? I am using k8ssandra (https://k8ssandra.io/docs/) to create a Cassandra cluster; it uses Helm charts. To create a 3-node cluster, I changed the size value to 3 in the local values.yaml file - https://github.com/k8ssandra/k8ssandra/blob/main/charts/k8ssandra-cluster/values.yaml
no_reply#cloudshell:~ (k8ssandra-299315)$ kubectl get pods
NAME READY STATUS RESTARTS AGE
cass-operator-86d4dc45cd-588c8 1/1 Running 0 29h
grafana-deployment-66557855cc-j7476 1/1 Running 0 29h
k8ssandra-cluster-a-grafana-operator-k8ssandra-5b89b64f4f-8pbxk 1/1 Running 0 29h
k8ssandra-cluster-a-reaper-k8ssandra-847c99ccd8-dsnj4 1/1 Running 0 28h
k8ssandra-cluster-a-reaper-k8ssandra-schema-5fzpn 0/1 Completed 0 28h
k8ssandra-cluster-a-reaper-operator-k8ssandra-87d56d56f-wn8hw 1/1 Running 0 29h
k8ssandra-dc1-default-sts-0 2/2 Running 0 29h
k8ssandra-dc1-default-sts-1 0/2 Pending 0 14m
k8ssandra-dc1-default-sts-2 2/2 Running 0 14m
k8ssandra-tools-kube-prome-operator-6bcdf668d4-ndhw9 1/1 Running 0 29h
prometheus-k8ssandra-cluster-a-prometheus-k8ssandra-0 2/2 Running 1 29h
The best way, as described by Arghya, is checking the events of the pod:
kubectl describe pod k8ssandra-dc1-default-sts-1
You can also check the logs of the pod:
kubectl logs k8ssandra-dc1-default-sts-1
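If describe produces a lot of output, you can also pull just the events for the stuck pod, sorted by time (pod name taken from the question; adjust for your cluster):

```shell
# List only the events attached to the Pending pod, newest last.
kubectl get events \
  --field-selector involvedObject.name=k8ssandra-dc1-default-sts-1 \
  --sort-by=.lastTimestamp
```

For a Pending pod the last event usually names the blocker directly, e.g. an unbound PersistentVolumeClaim or insufficient CPU/memory on every node.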

Delete all the pods created by applying Helm 2.13.1

I'm new to Helm. I'm trying to deploy a simple server on the master node. When I run helm install and check the details with kubectl get po,svc, I see a lot of pods created other than the ones I intended to deploy. So my precise questions are:
Why were so many pods created?
How do I delete all those pods?
Below is the output of the command kubectl get po,svc:
NAME READY STATUS RESTARTS AGE
pod/altered-quoll-stx-sdo-chart-6446644994-57n7k 1/1 Running 0 25m
pod/austere-garfish-stx-sdo-chart-5b65d8ccb7-jjxfh 1/1 Running 0 25m
pod/bald-hyena-stx-sdo-chart-9b666c998-zcfwr 1/1 Running 0 25m
pod/cantankerous-pronghorn-stx-sdo-chart-65f5699cdc-5fkf9 1/1 Running 0 25m
pod/crusty-unicorn-stx-sdo-chart-7bdcc67546-6d295 1/1 Running 0 25m
pod/exiled-puffin-stx-sdo-chart-679b78ccc5-n68fg 1/1 Running 0 25m
pod/fantastic-waterbuffalo-stx-sdo-chart-7ddd7b54df-p78h7 1/1 Running 0 25m
pod/gangly-quail-stx-sdo-chart-75b9dd49b-rbsgq 1/1 Running 0 25m
pod/giddy-pig-stx-sdo-chart-5d86844569-5v8nn 1/1 Running 0 25m
pod/hazy-indri-stx-sdo-chart-65d4c96f46-zmvm2 1/1 Running 0 25m
pod/interested-macaw-stx-sdo-chart-6bb7874bbd-k9nnf 1/1 Running 0 25m
pod/jaundiced-orangutan-stx-sdo-chart-5699d9b44b-6fpk9 1/1 Running 0 25m
pod/kindred-nightingale-stx-sdo-chart-5cf95c4d97-zpqln 1/1 Running 0 25m
pod/kissing-snail-stx-sdo-chart-854d848649-54m9w 1/1 Running 0 25m
pod/lazy-tiger-stx-sdo-chart-568fbb8d65-gr6w7 1/1 Running 0 25m
pod/nonexistent-octopus-stx-sdo-chart-5f8f6c7ff8-9l7sm 1/1 Running 0 25m
pod/odd-boxer-stx-sdo-chart-6f5b9679cc-5stk7 1/1 Running 1 15h
pod/orderly-chicken-stx-sdo-chart-7889b64856-rmq7j 1/1 Running 0 25m
pod/redis-697fb49877-x5hr6 1/1 Running 0 25m
pod/rv.deploy-6bbffc7975-tf5z4 1/2 CrashLoopBackOff 93 30h
pod/sartorial-eagle-stx-sdo-chart-767d786685-ct7mf 1/1 Running 0 25m
pod/sullen-gnat-stx-sdo-chart-579fdb7df7-4z67w 1/1 Running 0 25m
pod/undercooked-cow-stx-sdo-chart-67875cc5c6-mwvb7 1/1 Running 0 25m
pod/wise-quoll-stx-sdo-chart-5db8c766c9-mhq8v 1/1 Running 0 21m
You can run helm ls to see all the deployed Helm releases in your cluster.
To remove a release (and every resource it created, including the pods), run: helm delete RELEASE_NAME --purge.
If you want to delete all the pods in your namespace without deleting your Helm release (I DON'T think this is what you're looking for), you can run: kubectl delete pods --all.
On a side note, if you're new to Helm, consider starting with Helm v3: it has many improvements, and the migration from v2 to v3 can become cumbersome, so if you can avoid it, you should.
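Since your output shows many releases (one per randomly-named pod prefix), a sketch of purging them all at once with Helm v2. This is destructive, so review the output of helm ls first:

```shell
# Print just the release names, then purge each one in turn.
helm ls --short | xargs -L1 helm delete --purge
```

Each helm delete --purge removes the release's pods, services, and its record in Helm's store, so the name can be reused later.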

Kubernetes can't access pod in multi worker nodes

I was following a tutorial on YouTube, and the guy said that if you deploy your application in a multi-node setup with a Service of type NodePort, you don't have to worry about which node your pod gets scheduled on. You can access it with any node's IP address, like:
worker1IP:servicePort or worker2IP:servicePort or workerNIP:servicePort
But I just tried it and this is not the case; I can only access the pod on the node where it is scheduled and deployed. Is this the correct behavior?
kubectl version --short
> Client Version: v1.18.5
> Server Version: v1.18.5
kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-66bff467f8-6pt8s 0/1 Running 288 7d22h
coredns-66bff467f8-t26x4 0/1 Running 288 7d22h
etcd-redhat-master 1/1 Running 16 7d22h
kube-apiserver-redhat-master 1/1 Running 17 7d22h
kube-controller-manager-redhat-master 1/1 Running 19 7d22h
kube-flannel-ds-amd64-9mh6k 1/1 Running 16 5d22h
kube-flannel-ds-amd64-g2k5c 1/1 Running 16 5d22h
kube-flannel-ds-amd64-rnvgb 1/1 Running 14 5d22h
kube-proxy-gf8zk 1/1 Running 16 7d22h
kube-proxy-wt7cp 1/1 Running 9 7d22h
kube-proxy-zbw4b 1/1 Running 9 7d22h
kube-scheduler-redhat-master 1/1 Running 18 7d22h
weave-net-6jjd8 2/2 Running 34 7d22h
weave-net-ssqbz 1/2 CrashLoopBackOff 296 7d22h
weave-net-ts2tj 2/2 Running 34 7d22h
[root@redhat-master deployments]# kubectl logs weave-net-ssqbz -c weave -n kube-system
DEBU: 2020/07/05 07:28:04.661866 [kube-peers] Checking peer "b6:01:79:66:7d:d3" against list &{[{e6:c9:b2:5f:82:d1 redhat-master} {b2:29:9a:5b:89:e9 redhat-console-1} {e2:95:07:c8:a0:90 redhat-console-2}]}
Peer not in list; removing persisted data
INFO: 2020/07/05 07:28:04.924399 Command line options: map[conn-limit:200 datapath:datapath db-prefix:/weavedb/weave-net docker-api: expect-npc:true host-root:/host http-addr:127.0.0.1:6784 ipalloc-init:consensus=2 ipalloc-range:10.32.0.0/12 metrics-addr:0.0.0.0:6782 name:b6:01:79:66:7d:d3 nickname:redhat-master no-dns:true port:6783]
INFO: 2020/07/05 07:28:04.924448 weave 2.6.5
FATA: 2020/07/05 07:28:04.938587 Existing bridge type "bridge" is different than requested "bridged_fastdp". Please do 'weave reset' and try again
Update:
So basically the issue is that iptables is deprecated in RHEL 8. But even after downgrading my OS to RHEL 7, I can access the NodePort only on the node where the pod is deployed.
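The FATA line in the weave-net logs above names its own fix. A sketch of applying it on the affected node (the download URL was Weave's documented install path at the time and may have changed since):

```shell
# On the node running the crashing weave-net pod: install the weave CLI
# and clear the stale bridge state it complained about.
curl -L git.io/weave -o /usr/local/bin/weave
chmod +x /usr/local/bin/weave
/usr/local/bin/weave reset

# Then delete the crashing pod so the DaemonSet recreates it cleanly.
kubectl delete pod weave-net-ssqbz -n kube-system
```

With weave-net healthy on every node, NodePort traffic arriving at one node can be forwarded to a pod on another, which is the behavior the tutorial described.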

How to resolve Kubernetes DNS issues when trying to install Weave Cloud Agents for Minikube

I was trying to install the Weave Cloud Agents for my minikube. I used the provided command
curl -Ls https://get.weave.works |sh -s -- --token=xxx
but keep getting the following error:
There was an error while performing a DNS check: checking DNS failed, the DNS in the Kubernetes cluster is not working correctly. Please check that your cluster can download images and run pods.
I have following dns:
kube-system coredns-6955765f44-7zt4x 1/1 Running 0 38m
kube-system coredns-6955765f44-xdnd9 1/1 Running 0 38m
I tried different suggestions, such as https://www.jeffgeerling.com/blog/2019/debugging-networking-issues-multi-node-kubernetes-on-virtualbox and https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/. However, none of them resolved my issue.
It seems to be an issue that has happened before: https://github.com/weaveworks/launcher/issues/285.
My Kubernetes version is v1.17.3.
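The DNS-debugging doc linked above boils down to one check; if this lookup fails, the installer's DNS check will fail too (the dnsutils image name is taken from that doc and may have moved since):

```shell
# Start a throwaway pod with DNS tooling and resolve the API server's
# in-cluster name through CoreDNS.
kubectl run dnsutils --restart=Never \
  --image=gcr.io/kubernetes-e2e-test-images/dnsutils:1.3 -- sleep 3600
kubectl exec -it dnsutils -- nslookup kubernetes.default
```

A successful lookup returns the cluster-internal Service IP; a timeout here points at CoreDNS or the pod network rather than at the Weave Cloud installer.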
I reproduced your issue and got the same error.
minikube v1.7.2 on Centos 7.7.1908
Docker 19.03.5
vm-driver=virtualbox
Connecting cluster to "Old Tree 34" (id: old-tree-34) on Weave Cloud
Installing Weave Cloud agents on minikube at https://192.168.99.100:8443
Performing a check of the Kubernetes installation setup.
There was an error while performing a DNS check: checking DNS failed, the DNS in the Kubernetes cluster is not working correctly. Please check that your cluster can download images and run pods.
I wasn't able to fix this problem, but I found a workaround instead: use Helm. There is a second tab, 'Helm', in 'Install the Weave Cloud Agents' with the provided command, like:
helm repo update && helm upgrade --install --wait weave-cloud \
--set token=xxx \
--namespace weave \
stable/weave-cloud
Let's install Helm and use it:
curl https://raw.githubusercontent.com/helm/helm/master/scripts/get | bash
kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
helm init --service-account tiller
.....
Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.
helm repo update
helm upgrade --install --wait weave-cloud \
> --set token=xxx \
> --namespace weave \
> stable/weave-cloud
Release "weave-cloud" does not exist. Installing it now.
NAME: weave-cloud
LAST DEPLOYED: Thu Feb 13 14:52:45 2020
NAMESPACE: weave
STATUS: DEPLOYED
RESOURCES:
==> v1/Deployment
NAME AGE
weave-agent 35s
==> v1/Pod(related)
NAME AGE
weave-agent-69fbf74889-dw77c 35s
==> v1/Secret
NAME AGE
weave-cloud 35s
==> v1/ServiceAccount
NAME AGE
weave-cloud 35s
==> v1beta1/ClusterRole
NAME AGE
weave-cloud 35s
==> v1beta1/ClusterRoleBinding
NAME AGE
weave-cloud 35s
NOTES:
Weave Cloud agents had been installed!
First, verify all Pods are running:
kubectl get pods -n weave
Next, login to Weave Cloud (https://cloud.weave.works) and verify the agents are connect to your instance.
If you need help or have any question, join our Slack to chat to us – https://slack.weave.works.
Happy hacking!
Check (wait around 10 minutes for everything to deploy):
kubectl get pods -n weave
NAME READY STATUS RESTARTS AGE
kube-state-metrics-64599b7996-d8pnw 1/1 Running 0 29m
prom-node-exporter-2lwbn 1/1 Running 0 29m
prometheus-5586cdd667-dtdqq 2/2 Running 0 29m
weave-agent-6c77dbc569-xc9qx 1/1 Running 0 29m
weave-flux-agent-65cb4694d8-sllks 1/1 Running 0 29m
weave-flux-memcached-676f88fcf7-ktwnp 1/1 Running 0 29m
weave-scope-agent-7lgll 1/1 Running 0 29m
weave-scope-cluster-agent-8fb596b6b-mddv8 1/1 Running 0 29m
[vkryvoruchko#nested-vm-image1 bin]$ kubectl get all -n weave
NAME READY STATUS RESTARTS AGE
pod/kube-state-metrics-64599b7996-d8pnw 1/1 Running 0 30m
pod/prom-node-exporter-2lwbn 1/1 Running 0 30m
pod/prometheus-5586cdd667-dtdqq 2/2 Running 0 30m
pod/weave-agent-6c77dbc569-xc9qx 1/1 Running 0 30m
pod/weave-flux-agent-65cb4694d8-sllks 1/1 Running 0 30m
pod/weave-flux-memcached-676f88fcf7-ktwnp 1/1 Running 0 30m
pod/weave-scope-agent-7lgll 1/1 Running 0 30m
pod/weave-scope-cluster-agent-8fb596b6b-mddv8 1/1 Running 0 30m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/prometheus ClusterIP 10.108.197.29 <none> 80/TCP 30m
service/weave-flux-memcached ClusterIP None <none> 11211/TCP 30m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/prom-node-exporter 1 1 1 1 1 <none> 30m
daemonset.apps/weave-scope-agent 1 1 1 1 1 <none> 30m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/kube-state-metrics 1/1 1 1 30m
deployment.apps/prometheus 1/1 1 1 30m
deployment.apps/weave-agent 1/1 1 1 31m
deployment.apps/weave-flux-agent 1/1 1 1 30m
deployment.apps/weave-flux-memcached 1/1 1 1 30m
deployment.apps/weave-scope-cluster-agent 1/1 1 1 30m
NAME DESIRED CURRENT READY AGE
replicaset.apps/kube-state-metrics-64599b7996 1 1 1 30m
replicaset.apps/prometheus-5586cdd667 1 1 1 30m
replicaset.apps/weave-agent-69fbf74889 0 0 0 31m
replicaset.apps/weave-agent-6c77dbc569 1 1 1 30m
replicaset.apps/weave-flux-agent-65cb4694d8 1 1 1 30m
replicaset.apps/weave-flux-memcached-676f88fcf7 1 1 1 30m
replicaset.apps/weave-scope-cluster-agent-8fb596b6b 1 1 1 30m
Log in to https://cloud.weave.works/ and check the same:
Started installing agents on Kubernetes cluster v1.17.2
All Weave Cloud agents are connected!