Rancher installation on AKS - kubernetes

I am trying to install Rancher on AKS using Helm 3, following the documentation below:
https://rancher.com/docs/rancher/v2.5/en/installation/install-rancher-on-k8s/
helm upgrade --install rancher rancher-stable/rancher \
  -f rancher.yaml -n cattle-system \
  --set ingress.tls.source=rancher \
  --set proxy="export https_proxy=http://x.x.x.x/" \
  --set proxy="export http_proxy=http://x.x.x.x/"
[rancher]$ kubectl get pod -n cattle-system
NAME                               READY   STATUS      RESTARTS   AGE
rancher-854b498848-gw4cb           1/1     Running     0          15m
rancher-854b498848-nnbqs           1/1     Running     0          15m
rancher-854b498848-wbcvs           1/1     Running     0          15m
helm-operation-jkjzb               0/2     Completed   0          29m
rancher-webhook-6979fbd4bf-qzkgl   1/1     Running     0          5d19h
Helm created the ConfigMap, certificate, ServiceAccount, Service, and CRDs. The rendered ClusterRoleBinding is:
---
# Source: rancher/templates/clusterRoleBinding.yaml
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rancher
  namespace: cattle-system
  labels:
    app: rancher
    app.kubernetes.io/managed-by: Helm
    chart: rancher-2.6.2
    heritage: Helm
    release: rancher
subjects:
  - kind: ServiceAccount
    name: rancher
    namespace: cattle-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
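For reference, a quick, read-only way to confirm what the chart created (assuming the default release name rancher and the cattle-system namespace):

# Namespaced objects rendered by the chart
kubectl -n cattle-system get serviceaccount,service,configmap,deployment

# Cluster-scoped pieces
kubectl get clusterrolebinding rancher -o yaml
kubectl get crd | grep -i cattle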
Please find the pod logs:
POD 1
=====
[rancher]$ kubectl logs -f rancher-854b498848-nnbqs -n cattle-system | grep ERROR
2021/12/14 09:35:46 [ERROR] error syncing 'cattle-system/serving-cert': handler tls-storage: Secret "serving-cert" is invalid: [data[tls.crt]: Required value, data[tls.key]: Required value], requeuing
2021/12/14 09:35:48 [ERROR] Failed to connect to peer wss://10.241.1.180/v3/connect [local ID=10.241.0.234]: websocket: bad handshake
2021/12/14 09:35:50 [ERROR] Failed to handling tunnel request from 10.241.0.235:53604: response 400: cluster not found
POD 2
=====
[rancher]$ kubectl logs -f rancher-854b498848-wbcvs -n cattle-system | grep ERROR
2021/12/14 09:35:48 [ERROR] Failed to connect to peer wss://10.241.1.180/v3/connect [local ID=10.241.0.235]: websocket: bad handshake
POD 3
=====
[rancher]$ kubectl logs -f rancher-854b498848-gw4cb -n cattle-system | grep ERROR | head -10
2021/12/14 09:35:36 [ERROR] error syncing 'cattle-system/serving-cert': handler tls-storage: Secret "serving-cert" is invalid: [data[tls.crt]: Required value, data[tls.key]: Required value], requeuing
2021/12/14 09:35:48 [ERROR] Failed to handling tunnel request from 10.241.0.235:55192: response 400: cluster not found
2021/12/14 09:36:17 [ERROR] error parsing azure-group-cache-size, skipping update strconv.Atoi: parsing "": invalid syntax
2021/12/14 09:36:17 [ERROR] error syncing 'cluster-admin': handler auth-prov-v2-roletemplate: clusterroles.rbac.authorization.k8s.io "cluster-admin" not found, requeuing
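For anyone hitting the same errors, the objects referenced in the logs can be inspected directly; a small hedged sketch, assuming the default cattle-system namespace and service name rancher:

# The secret the tls-storage handler complains about
kubectl -n cattle-system get secret serving-cert -o yaml

# The peers behind the websocket errors appear to be the rancher pods themselves;
# check that the service endpoints match the pod IPs in the log lines
kubectl -n cattle-system get endpoints rancher
kubectl -n cattle-system get pods -o wide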

Related

flannel" cannot get resource "pods" in API group "" in the namespace "kube-flannel"

I'm trying to install Kubernetes with dashboard but I get the following issue:
test@ubuntukubernetes1:~$ kubectl get pods --all-namespaces
NAMESPACE              NAME                                         READY   STATUS              RESTARTS         AGE
kube-flannel           kube-flannel-ds-ksc9n                        0/1     CrashLoopBackOff    14 (2m15s ago)   49m
kube-system            coredns-6d4b75cb6d-27m6b                     0/1     ContainerCreating   0                4h
kube-system            coredns-6d4b75cb6d-vrgtk                     0/1     ContainerCreating   0                4h
kube-system            etcd-ubuntukubernetes1                       1/1     Running             1 (106m ago)     4h
kube-system            kube-apiserver-ubuntukubernetes1             1/1     Running             1 (106m ago)     4h
kube-system            kube-controller-manager-ubuntukubernetes1    1/1     Running             1 (106m ago)     4h
kube-system            kube-proxy-6v8w6                             1/1     Running             1 (106m ago)     4h
kube-system            kube-scheduler-ubuntukubernetes1             1/1     Running             1 (106m ago)     4h
kubernetes-dashboard   dashboard-metrics-scraper-7bfdf779ff-dfn4q   0/1     Pending             0                48m
kubernetes-dashboard   dashboard-metrics-scraper-8c47d4b5d-9kh7h    0/1     Pending             0                73m
kubernetes-dashboard   kubernetes-dashboard-5676d8b865-q459s        0/1     Pending             0                73m
kubernetes-dashboard   kubernetes-dashboard-6cdd697d84-kqnxl        0/1     Pending             0                48m
test@ubuntukubernetes1:~$
Log files:
test@ubuntukubernetes1:~$ kubectl logs --namespace kube-flannel kube-flannel-ds-ksc9n
Defaulted container "kube-flannel" out of: kube-flannel, install-cni-plugin (init), install-cni (init)
I0808 23:40:17.324664 1 main.go:207] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[] ifaceRegex:[] ipMasq:true ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true}
W0808 23:40:17.324753 1 client_config.go:614] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
E0808 23:40:17.547453 1 main.go:224] Failed to create SubnetManager: error retrieving pod spec for 'kube-flannel/kube-flannel-ds-ksc9n': pods "kube-flannel-ds-ksc9n" is forbidden: User "system:serviceaccount:kube-flannel:flannel" cannot get resource "pods" in API group "" in the namespace "kube-flannel"
test@ubuntukubernetes1:~$
Do you know how this issue can be solved? I tried the following installation:
swapoff -a
Remove the following line from /etc/fstab:
/swap.img none swap sw 0 0
sudo apt update
sudo apt install docker.io
sudo systemctl start docker
sudo systemctl enable docker
sudo apt install apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" >> ~/kubernetes.list
sudo mv ~/kubernetes.list /etc/apt/sources.list.d
sudo apt update
sudo apt install kubeadm kubelet kubectl kubernetes-cni
sudo kubeadm init --pod-network-cidr=192.168.0.0/16
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.5.0/aio/deploy/recommended.yaml
kubectl proxy --address 192.168.1.133 --accept-hosts '.*'
Can you advise?
I had the same situation on a new deployment today. Turns out, the kube-flannel-rbac.yml file had the wrong namespace. It's now 'kube-flannel', not 'kube-system', so I modified it and re-applied.
I also added a 'namespace' entry under each 'name' entry in kube-flannel.yml, except under the roleRef heading (it threw an error when I added it there). All pods came up as 'Running' after the new yml was applied.
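For reference, a minimal sketch of what the corrected binding ends up looking like once the subject points at the kube-flannel namespace (names taken from the upstream kube-flannel.yml):

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-flannel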
Seems like the problem is with kube-flannel-rbac.yml:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml
It expects a ServiceAccount in the kube-system namespace:
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
So just delete it:
kubectl delete -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml
as kube-flannel.yml already creates the binding in the right namespace.
https://github.com/flannel-io/flannel/blob/master/Documentation/kube-flannel.yml#L43
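A quick way to confirm which namespace the binding currently points at (read-only; assumes the binding is named flannel):

kubectl get clusterrolebinding flannel -o jsonpath='{.subjects[0].namespace}{"\n"}'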
I tried to deploy a 3-node cluster with 1 master and 2 workers, following a similar method to the one described above.
Then I tried to deploy Nginx, but it failed. When I checked my pods, flannel on the master was running but on the worker nodes it was failing.
I deleted flannel and started from the beginning.
First I applied only kube-flannel.yml, since there was some mention that kube-flannel-rbac.yml was causing issues.
ubuntu@master:~$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
namespace/kube-flannel created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
ubuntu@master:~$ kubectl describe ClusterRoleBinding flannel
Name:         flannel
Labels:
Annotations:
Role:
  Kind:  ClusterRole
  Name:  flannel
Subjects:
  Kind            Name     Namespace
  ServiceAccount  flannel  kube-flannel
Then I was able to deploy the nginx image.
However, I then deleted it and applied the second yaml, which changed the namespace:
ubuntu@master:~$ kubectl describe ClusterRoleBinding flannel
Name:         flannel
Labels:
Annotations:
Role:
  Kind:  ClusterRole
  Name:  flannel
Subjects:
  Kind            Name     Namespace
  ServiceAccount  flannel  kube-system
and again the nginx deployment was successful.
What is the purpose of this config? Is it needed, given that the deployment works both with and without it?
https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml

Retrieve secrets from Vault using csi driver returning “permission denied”

I can get a token using curl:
curl \
  --request POST \
  --data '{"jwt": "'$TOKEN_REVIEW_SJWT'", "role": "teste-role"}' \
  http://<ip>:8200/v1/auth/kubernetes/login
I’m able to run vault login <token> and read the secret with vault read secret/data/k8s-secret. But when I deploy a pod to test it, it returns “permission denied”.
Warning FailedMount 103s (x23 over 32m) kubelet, <ip> MountVolume.SetUp failed for volume "secrets-store-inline" : rpc error: code = Unknown desc = failed to mount secrets store objects for pod csi/nginx-secrets-store-inline, err: rpc error: code = Unknown desc = error making mount request: couldn't read secret "k8s-secret": Error making API request.
URL: GET http://<vault-ip>:8200/v1/%!!(MISSING)E(MISSING)2%!C(MISSING)secret/data/k8s-secret%!!(MISSING)E(MISSING)2%!D(MISSING)
Code: 403. Errors:
* 1 error occurred:
* permission denied
Pods status:
kubectl get pods -n csi
NAME                                 READY   STATUS    RESTARTS   AGE
csi-secrets-store-csi-driver-4n789   3/3     Running   0          24h
csi-secrets-store-csi-driver-8zfbp   3/3     Running   0          10d
csi-secrets-store-csi-driver-b6hqv   3/3     Running   0          10d
vault-csi-provider-f488v             1/1     Running   0          11d
vault-csi-provider-l2982             1/1     Running   0          24h
vault-csi-provider-zztxb             1/1     Running   0          10d
To install the Vault provider and CSI driver:
helm install vault hashicorp/vault -n csi \
  --set "server.enabled=false" \
  --set "injector.enabled=false" \
  --set "csi.enabled=true"
helm install csi secrets-store-csi-driver/secrets-store-csi-driver -n csi
Pod yaml to consume the secret:
kind: Pod
apiVersion: v1
metadata:
  name: nginx-secrets-store-inline
  namespace: app
spec:
  containers:
    - image: nginx
      name: nginx
      volumeMounts:
        - name: secrets-store-inline
          mountPath: "/mnt/secrets-store"
          readOnly: true
  serviceAccountName: app-sa
  volumes:
    - name: secrets-store-inline
      csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: vault-secret
I was able to create the pod when I removed the double quotes from the SecretProviderClass.
objects: |
  - objectName: password
    secretPath: secret/data/k8s-secret/
    secretKey: password
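For context, a rough sketch of the full SecretProviderClass this objects block would sit in; the vault-secret name, namespace, and teste-role come from the question, the address is assumed to match the curl example, and older driver versions may need apiVersion secrets-store.csi.x-k8s.io/v1alpha1:

apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: vault-secret
  namespace: app
spec:
  provider: vault
  parameters:
    vaultAddress: "http://<vault-ip>:8200"  # assumed: same address as the curl example
    roleName: "teste-role"                  # assumed: the role used in the curl login
    objects: |
      - objectName: password
        secretPath: secret/data/k8s-secret
        secretKey: password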

FailedScheduling: 0/3 nodes are available: 3 Insufficient pods

I'm trying to deploy my NodeJS application to EKS and run 3 pods with exactly the same container.
Here's the error message:
$ kubectl get pods
NAME                                 READY   STATUS             RESTARTS   AGE
cm-deployment-7c86bb474c-5txqq       0/1     Pending            0          18s
cm-deployment-7c86bb474c-cd7qs       0/1     ImagePullBackOff   0          18s
cm-deployment-7c86bb474c-qxglx       0/1     ImagePullBackOff   0          18s
public-api-server-79b7f46bf9-wgpk6   0/1     ImagePullBackOff   0          2m30s
$ kubectl describe pod cm-deployment-7c86bb474c-5txqq
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  23s (x4 over 2m55s)  default-scheduler  0/3 nodes are available: 3 Insufficient pods.
So it says that 0/3 nodes are available. However, if I run kubectl get nodes --watch:
$ kubectl get nodes --watch
NAME                                                  STATUS   ROLES    AGE    VERSION
ip-192-168-163-73.ap-northeast-2.compute.internal     Ready    <none>   6d7h   v1.14.6-eks-5047ed
ip-192-168-172-235.ap-northeast-2.compute.internal    Ready    <none>   6d7h   v1.14.6-eks-5047ed
ip-192-168-184-236.ap-northeast-2.compute.internal    Ready    <none>   6d7h   v1.14.6-eks-5047ed
all 3 nodes are running.
Here are my configurations:
aws-auth-cm.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: [MY custom role ARN]
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cm-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: cm-literal
  template:
    metadata:
      name: cm-literal-pod
      labels:
        app: cm-literal
    spec:
      containers:
        - name: cm
          image: docker.io/cjsjyh/public_test:1
          imagePullPolicy: Always
          ports:
            - containerPort: 80
          #imagePullSecrets:
          #  - name: regcred
          env:
            [my environment variables]
I applied both .yaml files.
How can I solve this?
Thank you
My guess, without running the manifests you've got, is that the image tag 1 on your image doesn't exist, so you're getting ImagePullBackOff, which usually means the container runtime can't find the image to pull.
Looking at the Docker Hub page, there's no 1 tag there, just latest.
So either removing the tag or replacing 1 with latest may resolve your issue.
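A quick way to act on that, sketched under the assumption that latest is the tag you want (the deployment and container names come from the manifest in the question):

# Check whether the tag exists (needs a Docker CLI with manifest support)
docker manifest inspect docker.io/cjsjyh/public_test:latest

# Point the existing deployment at the latest tag
kubectl set image deployment/cm-deployment cm=docker.io/cjsjyh/public_test:latest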
I experienced this issue with AWS instance types that allow only a small number of pods per node; on EKS the per-node pod limit is tied to the instance type's ENI/IP capacity, so "3 Insufficient pods" can simply mean the nodes are already at their pod limit.
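A minimal check for that, assuming you just want to compare each node's pod limit with what is already scheduled:

# Per-node pod capacity as reported by the kubelet
kubectl get nodes -o custom-columns=NAME:.metadata.name,PODS:.status.allocatable.pods

# Pods currently scheduled on one node (replace the node name)
kubectl get pods --all-namespaces --field-selector spec.nodeName=<node-name> | wc -l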

Install kube-dns using the GitHub file "kube-dns.yaml.sed" gets error "CrashLoopBackOff"

$ kubectl logs -n kube-system po/kube-dns-5d6f4dbccf-gt5j2
Error from server (BadRequest): a container name must be specified for pod kube-dns-5d6f4dbccf-gt5j2, choose one of: [kubedns dnsmasq sidecar]
root@k8s-server01:~/source/dns# kubectl logs -n kube-system po/kube-dns-5d6f4dbccf-gt5j2 -c kubedns
I1101 01:49:49.294664 1 dns.go:48] version: 1.14.10
F1101 01:49:49.364594 1 server.go:56] Failed to create a kubernetes client: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory

Kubernetes Deployment Hanging

Following the Deployment example in the docs, I'm trying to deploy the example nginx with the following config:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.7.9
          ports:
            - containerPort: 80
So far, the deployment always hangs. I checked whether I needed a pod named nginx to exist already, but that didn't solve the problem.
$ sudo kubectl get deployments
NAME               UPDATEDREPLICAS   AGE
nginx-deployment   0/3               34m
$ sudo kubectl describe deployments
Name: nginx-deployment
Namespace: default
CreationTimestamp: Sat, 30 Jan 2016 06:03:47 +0000
Labels: app=nginx
Selector: app=nginx
Replicas: 0 updated / 3 total
StrategyType: RollingUpdate
RollingUpdateStrategy: 1 max unavailable, 1 max surge, 0 min ready seconds
OldReplicationControllers: nginx (2/2 replicas created)
NewReplicationController: <none>
No events.
When I check the events from Kubernetes, I see no events which belong to this deployment. Has anyone experienced this before?
The versions are as followed:
Client Version: version.Info{Major:"1", Minor:"1", GitVersion:"v1.1.3", GitCommit:"6a81b50c7e97bbe0ade075de55ab4fa34f049dc2", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"1", GitVersion:"v1.1.3", GitCommit:"6a81b50c7e97bbe0ade075de55ab4fa34f049dc2", GitTreeState:"clean"}
If the deployment is not creating any pods, you could have a look at the events; an error might be reported there. For example:
kubectl get events --all-namespaces
NAMESPACE LASTSEEN FIRSTSEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
default 8m 2d 415 wordpress Ingress Normal Service loadbalancer-controller no user specified default backend, using system default
kube-lego 2m 8h 49 kube-lego-7c66c7fddf ReplicaSet Warning FailedCreate replicaset-controller Error creating: pods "kube-lego-7c66c7fddf-" is forbidden: service account kube-lego/kube-lego2-kube-lego was not found, retry after the service account is created
Also have a look at kubectl get rs --all-namespaces.
I found an answer on the issues page:
In order to get Deployments to work after you enable the Deployments API and restart the kube-apiserver, you must also restart the kube-controller-manager.
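How the restart is done depends on how the control plane was installed; a minimal sketch for a systemd-managed binary install (the unit names are an assumption, adjust them to your environment):

# Restart both control-plane components
sudo systemctl restart kube-apiserver
sudo systemctl restart kube-controller-manager

# Confirm they came back up
sudo systemctl status kube-apiserver kube-controller-manager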
You can check what is wrong with the command kubectl describe pod <name_of_your_pod>.