Deploying Portainer on a Kubernetes cluster failed

After deploying Portainer on a Kubernetes cluster (1 master, 2 workers), following https://documentation.portainer.io/v2.0/deploy/ceinstallk8s/, with
helm install --create-namespace -n portainer portainer portainer/portainer --set persistence.storageClass=slow
I got this status:
kubectl get all -n portainer
NAME READY STATUS RESTARTS AGE
pod/portainer-6cb48f955f-qmtdq 0/1 Pending 0 2d
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/portainer NodePort 10.97.158.200 <none> 9000:30777/TCP,30776:30776/TCP 2d3h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/portainer 0/1 1 0 2d
NAME DESIRED CURRENT READY AGE
replicaset.apps/portainer-6cb48f955f 1 1 0 2d
So,
The pod is not READY, with STATUS Pending.
The service is up but has no EXTERNAL-IP.
The deployment is not READY or AVAILABLE.
The ReplicaSet is not READY.
And I can't access the instance on port 30777.
i.e. http://20.199.64.113:30777/
More 'kubectl describe' info:
root#kubemaster:/home/kubemaster# kubectl describe pod portainer -n portainer
Name: portainer-7b94d88f67-plz9d
Namespace: portainer
Priority: 0
Node: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 129m default-scheduler 0/3 nodes are available: 3 pod has unbound immediate Persiste
root#kubemaster:/home/kubemaster# kubectl describe pvc portainer -n portainer
Name: portainer
Namespace: portainer
StorageClass: slow
Status: Pending
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning ProvisioningFailed 2m22s (x259 over 9h) persistentvolume-controller Failed to provision volume with S
root#kubemaster:/home/kubemaster# kubectl describe pv portainer -n portainer
Error from server (NotFound): persistentvolumes "portainer" not found
I researched the warnings below:
Warning FailedScheduling 129m default-scheduler 0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims.
Warning ProvisioningFailed 2m22s (x259 over 9h) persistentvolume-controller Failed to provision volume with StorageClass "slow": AzureDisk - failed to get Azure Cloud Provider. GetCloudProvider returned <nil> instead
But I still wasn't able to bring up the Portainer instance.
Is there anything I missed, or any way to debug this?
Thanks in advance.

If you are using a PersistentVolumeClaim, you need a volume provisioner for dynamic volume provisioning. The bigger cloud providers typically have this. In your case the StorageClass "slow" points at the AzureDisk provisioner, which, judging by the ProvisioningFailed event, cannot find an Azure cloud provider in your cluster.
If you don't have a volume provisioner in your cluster, you have to create a PersistentVolume resource, and possibly also a StorageClass, and declare how to use your storage system.
Take a look: portainer-on-kubernetes.
So in your case, as you mentioned, you can install an external volume provisioner such as the NFS subdir external provisioner.
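As a rough sketch of the static route (creating a PersistentVolume by hand), the script below writes a hostPath PersistentVolume manifest whose storageClassName matches the class on the claim (slow). The PV name, the 10Gi size, the ReadWriteOnce mode and the /mnt/portainer-data path are assumptions for illustration, not values from the question; compare them with what kubectl describe pvc portainer -n portainer reports. hostPath keeps data on a single node, so this is suitable for testing only.

```shell
#!/bin/sh
# Sketch only: writes a static PersistentVolume manifest to a file.
# Name, size, access mode and host path are assumed values.
cat > portainer-pv.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: portainer-pv            # hypothetical name
spec:
  capacity:
    storage: 10Gi               # assumed; must be >= the size the PVC requests
  accessModes:
    - ReadWriteOnce             # assumed; must include the mode the PVC requests
  persistentVolumeReclaimPolicy: Retain
  storageClassName: slow        # must match the StorageClass shown on the PVC
  hostPath:
    path: /mnt/portainer-data   # hypothetical directory on the node
EOF
echo "wrote portainer-pv.yaml"
# Then, on a machine with cluster access:
#   kubectl apply -f portainer-pv.yaml
#   kubectl get pvc -n portainer
```

If the claim and the volume match on class, size and access mode, the PVC should move from Pending to Bound and the pod can then be scheduled.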

Related

Error when getting IngressClass nginx: "nginx" not found

I'm using Kubernetes version 1.19.16 on a bare-metal Ubuntu 18.04 LTS server. When I try to deploy the nginx-ingress YAML file, it always fails with the errors below.
I followed these steps to deploy nginx-ingress:
$ git clone https://github.com/nginxinc/kubernetes-ingress.git
cd kubernetes-ingress/deployments
kubernetes-ingress/deployments$ git branch
* main
$ kubectl apply -f common/ns-and-sa.yaml
$ kubectl apply -f rbac/rbac.yaml
$ kubectl apply -f rbac/ap-rbac.yaml
$ kubectl apply -f common/default-server-secret.yaml
$ kubectl apply -f common/nginx-config.yaml
$ kubectl apply -f deployment/nginx-ingress.yaml
deployment.apps/nginx-ingress created
$ kubectl get pods -n nginx-ingress -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-ingress-75c4bd64bd-mm52x 0/1 Error 2 21s 10.244.1.5 k8s-master <none> <none>
$ kubectl -n nginx-ingress get all
NAME READY STATUS RESTARTS AGE
pod/nginx-ingress-75c4bd64bd-mm52x 0/1 CrashLoopBackOff 12 38m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx-ingress 0/1 1 0 38m
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-ingress-75c4bd64bd 1 1 0 38m
$ kubectl logs nginx-ingress-75c4bd64bd-mm52x -n nginx-ingress
W1003 04:53:02.833073 1 flags.go:273] Ignoring unhandled arguments: []
I1003 04:53:02.833154 1 flags.go:190] Starting NGINX Ingress Controller Version=2.3.1 PlusFlag=false
I1003 04:53:02.833158 1 flags.go:191] Commit=a8742472b9ddf27433b6b1de49d250aa9a7cb47e Date=2022-09-16T08:09:31Z DirtyState=false Arch=linux/amd64 Go=go1.18.5
I1003 04:53:02.844374 1 main.go:210] Kubernetes version: 1.19.16
F1003 04:53:02.846604 1 main.go:225] Error when getting IngressClass nginx: ingressclasses.networking.k8s.io "nginx" not found
$ kubectl describe pods nginx-ingress-75c4bd64bd-mm52x -n nginx-ingress
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m6s default-scheduler Successfully assigned nginx-ingress/nginx-ingress-75c4bd64bd-mm52x to k8s-worker-1
Normal Pulled 87s (x5 over 3m5s) kubelet Container image "nginx/nginx-ingress:2.3.1" already present on machine
Normal Created 87s (x5 over 3m5s) kubelet Created container nginx-ingress
Normal Started 87s (x5 over 3m5s) kubelet Started container nginx-ingress
Warning BackOff 75s (x10 over 3m3s) kubelet Back-off restarting failed container
The NGINX Ingress Controller Deployment file is linked for reference.
As I'm using the main branch of the kubernetes-ingress.git repository, I'm not sure whether the main branch is compatible with my Kubernetes version.
Can anyone share some pointers to solve this?
I think you missed creating the IngressClass resource named "nginx", which is why the controller cannot find it: https://github.com/nginxinc/kubernetes-ingress/blob/main/deployments/common/ingress-class.yaml#L4
kubectl apply -f common/ingress-class.yaml
You can follow the steps from this document: https://docs.nginx.com/nginx-ingress-controller/installation/installation-with-manifests/
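For orientation, here is a minimal sketch of what such an IngressClass resource looks like. The file in the repository is the authoritative version; the controller string below is what I would expect for the NGINX Inc. controller and should be treated as an assumption, while the name nginx comes from the error in the log.

```shell
#!/bin/sh
# Sketch only: writes an IngressClass manifest named "nginx" to a file.
cat > ingress-class.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: nginx                                 # the name the controller failed to find
spec:
  controller: nginx.org/ingress-controller    # assumed controller identifier
EOF
echo "wrote ingress-class.yaml"
# Then, on a machine with cluster access:
#   kubectl apply -f ingress-class.yaml
#   kubectl get ingressclass
```

Once the resource exists, the next restart of the crashing pod should get past the "Error when getting IngressClass nginx" check.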

Kubernetes OpenSearch Deployment | "no persistent volumes available for this claim and no storage class is set" error

We deployed OpenSearch on a 3-node Kubernetes cluster following the documentation instructions (https://opensearch.org/docs/latest/opensearch/install/helm/). After deployment the pods are in the Pending state, and when we check them we see the following message:
"persistentvolume-controller no persistent volumes available for this claim and no storage class is set"
Can you please advise what could be wrong in our OpenSearch/Kubernetes deployment, or what could be missing from the configuration?
Some info:
Cluster nodes:
[root#I***-M1 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
ir***-m1 Ready control-plane,master 4h34m v1.23.4
ir***-w1 Ready 3h41m v1.23.4
ir***-w2 Ready 3h19m v1.23.4
Pods State:
[root#I****1 ~]# kubectl get pods
NAME READY STATUS RESTARTS AGE
opensearch-cluster-master-0 0/1 Pending 0 80m
opensearch-cluster-master-1 0/1 Pending 0 80m
opensearch-cluster-master-2 0/1 Pending 0 80m
[root#I****M1 ~]# kubectl describe pvc
Name: opensearch-cluster-master-opensearch-cluster-master-0
Namespace: default
StorageClass:
Status: Pending
Volume:
Labels: app.kubernetes.io/instance=my-deployment
app.kubernetes.io/name=opensearch
Annotations: <none>
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Used By: opensearch-cluster-master-0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal FailedBinding 2m24s (x18125 over 3d3h) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
.....
[root#IR****M1 ~]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
opensearch-cluster-master-opensearch-cluster-master-0 30Gi RWO Retain Available manual 6h24m
opensearch-cluster-master-opensearch-cluster-master-1 30Gi RWO Retain Available manual 6h22m
opensearch-cluster-master-opensearch-cluster-master-2 30Gi RWO Retain Available manual 6h23m
task-pv-volume 60Gi RWO Retain Available manual 7h48m
[root#I****M1 ~]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
opensearch-cluster-master-opensearch-cluster-master-0 Pending 3d3h
opensearch-cluster-master-opensearch-cluster-master-1 Pending 3d3h
opensearch-cluster-master-opensearch-cluster-master-2 Pending 3d3h
...no storage class is set...
Try upgrading your deployment with a storage class. Assuming you run on AWS EKS: helm upgrade my-deployment opensearch/opensearch --set persistence.storageClass=gp2
If you are running on GKE, change gp2 to standard. On AKS, change it to default.
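On a cluster without a cloud provisioner, the same setting can point at the class of the pre-created volumes instead. The sketch below writes a values file that sets the claims' class to manual, the STORAGECLASS shown by kubectl get pv in the question; the persistence.storageClass key is taken from the --set flag above, and the file name is hypothetical.

```shell
#!/bin/sh
# Sketch only: writes a Helm values override that makes the chart's PVCs
# request the "manual" class used by the existing PersistentVolumes.
cat > storage-values.yaml <<'EOF'
persistence:
  storageClass: manual   # taken from the STORAGECLASS column of `kubectl get pv`
EOF
echo "wrote storage-values.yaml"
# Then, on a machine with cluster access:
#   helm upgrade my-deployment opensearch/opensearch -f storage-values.yaml
#   kubectl get pvc
```

If the upgrade is rejected or the old claims stay Pending, one possible way forward is to remove the release and its Pending PVCs and install again with the same values file.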

When does Ansible AWX install postgreSQL?

I tried installing Ansible AWX. However, AWX also installs PostgreSQL on the system (I am using kubernetes for AWX btw). I understand that PostgreSQL is one of the requirements for AWX.
Now, for another project, I have to install PostgreSQL (on Kubernetes itself). I looked up a method online and it is working. However, is there some way I can do it automatically, just like the installation of AWX?
Thanks,
Suhas
This can be achieved by using the awx-operator. Below is a demo installation using Helm. By default AWX and its PostgreSQL (PG) database are placed on the same worker node, but this requires a default StorageClass (SC).
Helm Deployment
Add the Helm repository for awx-operator
┌──[root#vms81.liruilongs.github.io]-[~/AWK]
└─$helm repo add awx-operator https://ansible.github.io/awx-operator/
"awx-operator" has been added to your repositories
┌──[root#vms81.liruilongs.github.io]──[~/AWK]
└─$helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "liruilong_repo" chart repository
...Successfully got an update from the "elastic" chart repository
...Successfully got an update from the "prometheus-community" chart repository
...Successfully got an update from the "azure" chart repository
...Unable to get an update from the "ali" chart repository (https://apphub.aliyuncs.com):
failed to fetch https://apphub.aliyuncs.com/index.yaml : 504 Gateway Timeout
...Successfully got an update from the "awx-operator" chart repository
...Successfully got an update from the "stable" chart repository
Update Complete. ⎈Happy Helming!⎈
Search for the awx-operator chart
┌──[root#vms81.liruilongs.github.io]-[~/AWK]
└─$helm search repo awx-operator
NAME CHART VERSION APP VERSION DESCRIPTION
awx-operator/awx-operator 0.30.0 0.30.0 A Helm chart for the AWX Operator
For an installation with custom parameters: helm install my-awx-operator awx-operator/awx-operator -n awx --create-namespace -f myvalues.yaml
For a custom installation, enable the corresponding switches in myvalues.yaml; you can configure HTTPS, a standalone PG database, a load balancer, LDAP authentication, and so on. A template for the file can be obtained by pulling the chart package (helm pull) and using the values.yaml inside it.
Here we install with the default configuration, so no configuration file is needed.
┌──[root#vms81.liruilongs.github.io]-[~/AWK]
└─$helm install -n awx --create-namespace my-awx-operator awx-operator/awx-operator
NAME: my-awx-operator
LAST DEPLOYED: Mon Oct 10 16:29:24 2022
NAMESPACE: awx
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
AWX Operator installed with Helm Chart version 0.30.0
┌──[root#vms81.liruilongs.github.io]──[~/AWK]
└─$
Then check the pod status
┌──[root#vms81.liruilongs.github.io]-[~/awx/awx-operator]
└─$kubectl get pods
NAME READY STATUS RESTARTS AGE
awx-demo-postgres-13-0 0/1 Pending 0 105s
awx-operator-controller-manager-79ff9599d8-2v5fn 2/2 Running 0 128m
┌──[root#vms81.liruilongs.github.io]-[~/awx/awx-operator]
└─$kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
awx-demo-postgres-13 ClusterIP None <none> 5432/TCP 5m48s
awx-operator-controller-manager-metrics-service ClusterIP 10.107.17.167 <none> 8443/TCP 132m
The PG pod, awx-demo-postgres-13-0, is Pending; look at its events
┌──[root#vms81.liruilongs.github.io]-[~/awx/awx-operator]
└─$kubectl describe pods awx-demo-postgres-13-0 | grep -i -A 10 event
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 23s (x8 over 7m31s) default-scheduler 0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims.
┌──[root#vms81.liruilongs.github.io]-[~/awx/awx-operator]
└─$kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
postgres-13-awx-demo-postgres-13-0 Pending 10m
┌──[root#vms81.liruilongs.github.io]-[~/awx/awx-operator]
└─$kubectl describe pvc postgres-13-awx-demo-postgres-13-0 | grep -i -A 10 event
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal FailedBinding 82s (x42 over 11m) persistentvolume-controller no persistent volumes available for this claim and no storage class is set.
┌──[root#vms81.liruilongs.github.io]-[~/awx/awx-operator]
└─$kubectl get sc
No resources found
So the reason for Pending is that there is no default SC.
For stateful applications, a default SC (for dynamic volume provisioning) should exist before the StatefulSet is created; it handles creating PVs for the PVCs dynamically and provides the data storage for PG. So we need to create an SC here.
Here, for convenience, we use local storage as the back-end storage. In general a PV is backed by network storage that does not belong to any single node, so NFS is the more common choice. The SC names its provisioner in the provisioner field. Once the StorageClass exists and is the default, PVCs that do not name a class get their storage from it.
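The creation step itself is not shown in this walkthrough. As a sketch, a StorageClass with the same name and fields that the listing below reports could be written like this. It assumes the local-path provisioner (rancher.io/local-path) is already deployed in the cluster, which is not shown here; without it the class exists but nothing will provision volumes.

```shell
#!/bin/sh
# Sketch only: writes a StorageClass manifest for the local-path provisioner.
# Assumes the provisioner itself is already running in the cluster.
cat > local-path-sc.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-path
provisioner: rancher.io/local-path        # the provisioner field selects the allocator
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer   # bind only once a pod using the claim is scheduled
EOF
echo "wrote local-path-sc.yaml"
# Then, on a machine with cluster access:
#   kubectl apply -f local-path-sc.yaml
```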
To confirm successful creation
┌──[root#vms81.liruilongs.github.io]-[~/awx/awx-operator]
└─$kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
local-path rancher.io/local-path Delete WaitForFirstConsumer false 2m6s
Set to default SC:
https://kubernetes.io/zh-cn/docs/tasks/administer-cluster/change-default-storage-class/
┌──[root#vms81.liruilongs.github.io]-[~/awx/awx-operator]
└─$kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
storageclass.storage.k8s.io/local-path patched
┌──[root#vms81.liruilongs.github.io]-[~/awx/awx-operator]
└─$kubectl get pods
NAME READY STATUS RESTARTS AGE
awx-demo-postgres-13-0 0/1 Pending 0 46m
awx-operator-controller-manager-79ff9599d8-2v5fn 2/2 Running 0 173m
The existing PVC was created before a default SC existed and is still Pending, so export it to a YAML file, then delete and recreate it
┌──[root#vms81.liruilongs.github.io]-[~/awx/awx-operator]
└─$kubectl get pvc postgres-13-awx-demo-postgres-13-0 -o yaml > postgres-13-awx-demo-postgres-13-0.yaml
┌──[root#vms81.liruilongs.github.io]-[~/awx/awx-operator]
└─$kubectl delete -f postgres-13-awx-demo-postgres-13-0.yaml
persistentvolumeclaim "postgres-13-awx-demo-postgres-13-0" deleted
┌──[root#vms81.liruilongs.github.io]-[~/awx/awx-operator]
└─$kubectl apply -f postgres-13-awx-demo-postgres-13-0.yaml
persistentvolumeclaim/postgres-13-awx-demo-postgres-13-0 created
Check the status of the PVC. You may need to wait a while; Bound means the claim has been bound to a volume.
┌──[root#vms81.liruilongs.github.io]-[~/awx/awx-operator]
└─$kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
postgres-13-awx-demo-postgres-13-0 Pending local-path 3s
┌──[root#vms81.liruilongs.github.io]-[~/awx/awx-operator]
└─$kubectl describe pvc postgres-13-awx-demo-postgres-13-0 | grep -i -A 10 event
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal WaitForPodScheduled 42s persistentvolume-controller waiting for pod awx-demo-postgres-13-0 to be scheduled
Normal ExternalProvisioning 41s persistentvolume-controller waiting for a volume to be created, either by external provisioner "rancher.io/local-path" or manually created by system administrator
Normal Provisioning 41s rancher.io/local-path_local-path-provisioner-7c795b5576-gmrx4_d69ca393-bcbe-4abb-8b22-cd8db3b26bf8 External provisioner is provisioning volume for claim "awx/postgres-13-awx-demo-postgres-13-0"
Normal ProvisioningSucceeded 39s rancher.io/local-path_local-path-provisioner-7c795b5576-gmrx4_d69ca393-bcbe-4abb-8b22-cd8db3b26bf8 Successfully provisioned volume pvc-44b7687c-de18-45d2-bef6-8fb2d1c415d3
┌──[root#vms81.liruilongs.github.io]-[~/awx/awx-operator]
└─$kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
postgres-13-awx-demo-postgres-13-0 Bound pvc-44b7687c-de18-45d2-bef6-8fb2d1c415d3 8Gi RWO local-path 53s
┌──[root#vms81.liruilongs.github.io]-[~/awx/awx-operator]
└─$
┌──[root#vms81.liruilongs.github.io]-[~/awx-operator/crds]
└─$kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-44b7687c-de18-45d2-bef6-8fb2d1c415d3 8Gi RWO Delete Bound awx/postgres-13-awx-demo-postgres-13-0 local-path 54s
Look at the pod status; the PG pod has now been created successfully.
You may need to wait a while before all the Pods are Running.
┌──[root#vms81.liruilongs.github.io]-[~/ansible]
└─$kubectl get pods
NAME READY STATUS RESTARTS AGE
awx-demo-65d9bf775b-hc58x 4/4 Running 0 79m
awx-demo-postgres-13-0 1/1 Running 0 143m
awx-operator-controller-manager-79ff9599d8-m7t8k 2/2 Running 0 81m
Check the Service and test access
┌──[root#vms81.liruilongs.github.io]-[~/ansible]
└─$kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
awx-demo-postgres-13 ClusterIP None <none> 5432/TCP 143m
awx-demo-service NodePort 10.104.176.210 <none> 80:30066/TCP 79m
awx-operator-controller-manager-metrics-service ClusterIP 10.108.71.67 <none> 8443/TCP 82m
┌──[root#vms81.liruilongs.github.io]-[~/ansible]
└─$curl 192.168.26.82:30066
<!doctype html><html lang="en"><head><script nonce="cw6jhvbF7S5bfKJPsimyabathhaX35F5hIyR7emZNT0=" type="text/javascript">window.....
┌──[root#vms81.liruilongs.github.io]-[~/ansible]
└─$
Get Password
┌──[root#vms81.liruilongs.github.io]-[~/ansible]
└─$kubectl get secrets
NAME TYPE DATA AGE
awx-demo-admin-password Opaque 1 146m
awx-demo-app-credentials Opaque 3 82m
awx-demo-broadcast-websocket Opaque 1 146m
awx-demo-postgres-configuration Opaque 6 146m
awx-demo-receptor-ca kubernetes.io/tls 2 82m
awx-demo-receptor-work-signing Opaque 2 82m
awx-demo-secret-key Opaque 1 146m
awx-demo-token-sc92t kubernetes.io/service-account-token 3 82m
awx-operator-controller-manager-token-tpv2m kubernetes.io/service-account-token 3 84m
default-token-864fk kubernetes.io/service-account-token 3 4h32m
redhat-operators-pull-secret Opaque 1 146m
sh.helm.release.v1.my-awx-operator.v1 helm.sh/release.v1 1 84m
┌──[root#vms81.liruilongs.github.io]-[~/awx-operator/crds]
└─$echo $(kubectl get secret awx-demo-admin-password -o jsonpath="{.data.password}" | base64 --decode)
tP59YoIWSS6NgCUJYQUG4cXXJIaIc7ci
┌──[root#vms81.liruilongs.github.io]-[~/awx-operator/crds]
└─$
Access test
The service is published as a NodePort by default, so it can be reached via a node's IP plus the node port: http://192.168.26.82:30066/#/login

Kubernetes add Toleration from CLI

I'm using the Oracle Cloud Infrastructure with Kubernetes and Docker. I've got the following pod:
$ kubectl describe pod $podname -n $namespace
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 19m default-scheduler 0/1 nodes are available: 1 node(s) had taint {nvidia.com/gpu: }, that the pod didn't tolerate.
Warning FailedScheduling 18m default-scheduler 0/1 nodes are available: 1 node(s) had taint {nvidia.com/gpu: }, that the pod didn't tolerate.
I want to add a toleration to this pod. Is there a command to do so without creating the pod config YAML file? This pod is created by some other system that I don't want to edit; I just want to add the toleration to resolve this issue.
Thanks.
====================
gpu-config.yaml
apiVersion: v1 # What version of the Kubernetes API to use
kind: Pod # What kind of object you want to create
metadata: # Data that helps uniquely identify the object, including a name, string, UID and optional namespace
  name: nvidia-gpu-workload
spec: # What state you desire for the object, differs for every type of Kubernetes object.
  restartPolicy: OnFailure
  containers:
    - name: cuda-vector-add
      image: k8s.gcr.io/cuda-vector-add:v0.1
      resources:
        limits:
          nvidia.com/gpu: 1
  tolerations:
    - key: "nvidia.com/gpu"
      operator: "Equal"
      effect: "NoSchedule"
# Update command
$ kubectl create -f ./gpu-config.yaml
# All this seems to do is create a pod by the name of nvidia-gpu-workload-v2, and it doesn't add these configurations to the pod that I require.
Just to note that this issue is occurring on a pod called hook-image-awaiter-5tq5 and I don't think I should re-create that pod with a different config as it seems to be configured by part of the system.
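One possible approach, sketched under assumptions: Kubernetes permits adding entries to spec.tolerations on an existing pod, so a JSON patch can append a toleration without a full manifest. The script below only writes the patch file; the file name is hypothetical, and the "/spec/tolerations/-" path assumes the pod already has a tolerations list (pods usually receive default ones). The toleration uses operator Exists with no effect, so it matches the nvidia.com/gpu taint whatever its effect is.

```shell
#!/bin/sh
# Sketch only: writes a JSON patch that appends a toleration for the
# nvidia.com/gpu taint to an existing pod's tolerations list.
cat > toleration-patch.json <<'EOF'
[
  {
    "op": "add",
    "path": "/spec/tolerations/-",
    "value": { "key": "nvidia.com/gpu", "operator": "Exists" }
  }
]
EOF
echo "wrote toleration-patch.json"
# Then, on a machine with cluster access ($namespace as in the question):
#   kubectl patch pod hook-image-awaiter-5tq5 -n "$namespace" \
#     --type=json -p "$(cat toleration-patch.json)"
```

If a controller recreates the pod, the replacement will not carry the patch, so a lasting fix would have to go into whatever system generates the pod spec.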

Failed to provision volume with StorageClass "google-storage"

I'm testing K8s with my custom cluster on GCP. I'm trying to create a StorageClass with this yaml:
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: google-storage
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  replication-type: none
But I'm having this error when I tried to create the resource:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning ProvisioningFailed 87s (x7 over 7m47s) persistentvolume-controller Failed to provision volume with StorageClass "google-storage": failed to get GCE GCECloudProvider with error <nil>
I'm not using GKE but my own install; I installed everything myself with kubeadm, with version 1.20 of Kubernetes:
NAME STATUS ROLES AGE VERSION
master-1 Ready control-plane,master 56m v1.20.1
worker-1 Ready <none> 55m v1.20.1
worker-2 Ready <none> 55m v1.20.1
Everything is working properly when I create pods, but I'm having an issue with StorageClass. Did I miss something during the creation?
I found this same issue on Stack Overflow, but the answer does not seem to apply to me, as I'm running a more recent version of Kubernetes that should not have the problem described:
Container-VM Image with GPD Volumes fails with "Failed to get GCE Cloud Provider. plugin.host.GetCloudProvider returned <nil> instead"