Deploy GitLab with Helm on Kubernetes

I need to deploy GitLab with Helm on Kubernetes, and I have a problem: the PVCs stay Pending.
I see volume.alpha.kubernetes.io/storage-class: default in the PVC description, even though I set gitlabDataStorageClass: gluster-heketi in values.yaml.
I can successfully deploy a simple nginx with dynamic provisioning by following https://github.com/gluster/gluster-kubernetes/blob/master/docs/examples/hello_world/README.md
Yes, I use GlusterFS distributed storage: https://github.com/gluster/gluster-kubernetes
# kubectl get pvc
NAME                  STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS     AGE
gitlab1-gitlab-data   Pending                                                                                         19s
gitlab1-gitlab-etc    Pending                                                                                         19s
gitlab1-postgresql    Pending                                                                                         19s
gitlab1-redis         Pending                                                                                         19s
gluster1              Bound     pvc-922b5dc0-6372-11e8-8f10-4ccc6a60fcbe   5Gi        RWO            gluster-heketi   43m
Here is the structure of one of the pending PVCs:
# kubectl get pvc gitlab1-gitlab-data -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.alpha.kubernetes.io/storage-class: default
  creationTimestamp: 2018-05-29T19:43:18Z
  finalizers:
  - kubernetes.io/pvc-protection
  name: gitlab1-gitlab-data
  namespace: default
  resourceVersion: "263950"
  selfLink: /api/v1/namespaces/default/persistentvolumeclaims/gitlab1-gitlab-data
  uid: 8958d4f5-6378-11e8-8f10-4ccc6a60fcbe
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
status:
  phase: Pending
In the describe output I see:
# kubectl describe pvc gitlab1-gitlab-data
Name:          gitlab1-gitlab-data
Namespace:     default
StorageClass:
Status:        Pending
Volume:
Labels:        <none>
Annotations:   volume.alpha.kubernetes.io/storage-class=default
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
Events:
  Type    Reason         Age                From                         Message
  ----    ------         ----               ----                         -------
  Normal  FailedBinding  2m (x43 over 12m)  persistentvolume-controller  no persistent volumes available for this claim and no storage class is set
My values.yaml file:
# Default values for kubernetes-gitlab-demo.
# This is a YAML-formatted file.
# Required variables
# baseDomain is the top-most part of the domain. Subdomains will be generated
# for gitlab, mattermost, registry, and prometheus.
# Recommended to set up an A record on the DNS to *.your-domain.com to point to
# the baseIP
# e.g. *.your-domain.com. A 300 baseIP
baseDomain: my-domain.com
# legoEmail is a valid email address used by Let's Encrypt. It does not have to
# be at the baseDomain.
legoEmail: my#mail.com
# Optional variables
# baseIP is an externally provisioned static IP address to use instead of the provisioned one.
#baseIP: 95.165.135.109
nameOverride: gitlab
# `ce` or `ee`
gitlab: ce
gitlabCEImage: gitlab/gitlab-ce:10.6.2-ce.0
gitlabEEImage: gitlab/gitlab-ee:10.6.2-ee.0
postgresPassword: NDl1ZjNtenMxcWR6NXZnbw==
initialSharedRunnersRegistrationToken: "tQtCbx5UZy_ByS7FyzUH"
mattermostAppSecret: NDl1ZjNtenMxcWR6NXZnbw==
mattermostAppUID: aadas
redisImage: redis:3.2.10
redisDedicatedStorage: true
redisStorageSize: 5Gi
redisAccessMode: ReadWriteOnce
postgresImage: postgres:9.6.5
# If you disable postgresDedicatedStorage, you should consider bumping up gitlabRailsStorageSize
postgresDedicatedStorage: true
postgresAccessMode: ReadWriteOnce
postgresStorageSize: 30Gi
gitlabDataAccessMode: ReadWriteOnce
#gitlabDataStorageSize: 30Gi
gitlabRegistryAccessMode: ReadWriteOnce
#gitlabRegistryStorageSize: 30Gi
gitlabConfigAccessMode: ReadWriteOnce
#gitlabConfigStorageSize: 1Gi
gitlabRunnerImage: gitlab/gitlab-runner:alpine-v10.6.0
# Valid values for provider are `gke` for Google Container Engine. Leaving it blank (or any other value) will disable fast disk options.
#provider: gke
# Gitlab pages
# The following 3 lines are needed to enable gitlab pages.
# pagesExternalScheme: http
# pagesExternalDomain: your-pages-domain.com
# pagesTlsSecret: gitlab-pages-tls # An optional reference to a tls secret to use in pages
## Storage Class Options
## If defined, volume.beta.kubernetes.io/storage-class: <storageClass>
## If not defined, but provider is gke, will use SSDs
## Otherwise default: volume.alpha.kubernetes.io/storage-class: default
gitlabConfigStorageClass: gluster-heketi
gitlabDataStorageClass: gluster-heketi
gitlabRegistryStorageClass: gluster-heketi
postgresStorageClass: gluster-heketi
redisStorageClass: gluster-heketi
healthCheckToken: 'SXBAQichEJasbtDSygrD'
# Optional, for GitLab EE images only
#gitlabEELicense: base64-encoded-license
# Additional omnibus configuration,
# see https://docs.gitlab.com/omnibus/settings/configuration.html
# for possible configuration options
#omnibusConfigRuby: |
# gitlab_rails['smtp_enable'] = true
# gitlab_rails['smtp_address'] = "smtp.example.org"
gitlab-runner:
  checkInterval: 1
  # runnerRegistrationToken must equal initialSharedRunnersRegistrationToken
  runnerRegistrationToken: "tQtCbx5UZy_ByS7FyzUH"
  # resources:
  #   limits:
  #     memory: 500Mi
  #     cpu: 600m
  #   requests:
  #     memory: 500Mi
  #     cpu: 600m
  runners:
    privileged: true
    ## Build Container specific configuration
    ##
    # builds:
    #   cpuLimit: 200m
    #   memoryLimit: 256Mi
    #   cpuRequests: 100m
    #   memoryRequests: 128Mi
    ## Service Container specific configuration
    ##
    # services:
    #   cpuLimit: 200m
    #   memoryLimit: 256Mi
    #   cpuRequests: 100m
    #   memoryRequests: 128Mi
    ## Helper Container specific configuration
    ##
    # helpers:
    #   cpuLimit: 200m
    #   memoryLimit: 256Mi
    #   cpuRequests: 100m
    #   memoryRequests: 128Mi
You can see I have the StorageClass:
# kubectl get sc
NAME             PROVISIONER               AGE
gluster-heketi   kubernetes.io/glusterfs   48m

Without a link to the actual Helm chart you used, it's impossible for anyone to troubleshoot why the Go template isn't correctly consuming your values.yaml.
I see volume.alpha.kubernetes.io/storage-class: default in PVC description, but I set value gitlabDataStorageClass: gluster-heketi in values.yaml
I appreciate that you set what you wanted in values.yaml, but as long as the StorageClass the PVC ends up with doesn't match any existing StorageClass, nothing positive will come of it. You can certainly try creating a StorageClass named default containing the same values as your gluster-heketi SC, or update the PVCs to use the correct SC.
To be honest, this may be a bug in the Helm chart, but until it is fixed (and/or we get the link to the chart so we can help you adjust your YAML), if you want your GitLab to deploy, you will need to work around this bad situation manually.
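For instance, here is a minimal sketch (not tested against your chart) of a StorageClass named default that reuses the same GlusterFS provisioner as gluster-heketi; the resturl is a placeholder and must point at your own Heketi endpoint:
# Hedged sketch: a StorageClass named "default", mirroring gluster-heketi,
# so PVCs annotated with volume.alpha.kubernetes.io/storage-class: default
# can be dynamically provisioned by GlusterFS/Heketi.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: default
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://heketi.example.com:8080"   # placeholder: your Heketi REST URL
  restauthenabled: "false"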

Related

Unable to mount NFS on Kubernetes Pod

I am working on deploying a Hyperledger Fabric test network on a Kubernetes minikube cluster. I intend to use a PersistentVolume to share crypto-config and channel artifacts among the various peers and orderers. Following are my PersistentVolume.yaml and PersistentVolumeClaim.yaml:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: persistent-volume
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  nfs:
    path: "/nfsroot"
    server: "3.128.203.245"
    readOnly: false

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: persistent-volume-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
Following is the pod where the above claim is mounted on /data:
kind: Pod
apiVersion: v1
metadata:
  name: test-shell
  labels:
    name: test-shell
spec:
  containers:
    - name: shell
      image: ubuntu
      command: ["/bin/bash", "-c", "while true ; do sleep 10 ; done"]
      volumeMounts:
        - mountPath: "/data"
          name: pv
  volumes:
    - name: pv
      persistentVolumeClaim:
        claimName: persistent-volume-claim
NFS is set up on my EC2 instance. I have verified that the NFS server is working fine and I was able to mount it inside minikube. I don't understand what I am doing wrong, but any file present inside 3.128.203.245:/nfsroot is not present in test-shell:/data.
What am I missing? I even tried a hostPath mount, but to no avail. Please help me out.
I think you should check the following things to verify whether NFS is mounted successfully.
Run this command on the node where you want to mount:
$ showmount -e nfs-server-ip
Like in my case: $ showmount -e 172.16.10.161
Export list for 172.16.10.161:
/opt/share *
Use the $ df -hT command to see whether NFS is mounted; in my case it gives this output: 172.16.10.161:/opt/share nfs4 91G 32G 55G 37% /opt/share
If it is not mounted, then use the following command:
$ sudo mount -t nfs 172.16.10.161:/opt/share /opt/share
If the above commands show an error, then check whether the firewall is allowing NFS:
$ sudo ufw status
If not, then allow it using this command:
$ sudo ufw allow from nfs-server-ip to any port nfs
I made the same setup and I don't face any issues. My k8s cluster of Fabric is running successfully. The HF k8s YAML files can be found in my GitHub repo, where I have deployed a consortium of banks on Hyperledger Fabric as a dynamic multi-host blockchain network, which means you can add orgs and peers, join peers, create channels, and install and instantiate chaincode on the fly in an existing running blockchain network.
By default in minikube you should have a default StorageClass:
Each StorageClass contains the fields provisioner, parameters, and reclaimPolicy, which are used when a PersistentVolume belonging to the class needs to be dynamically provisioned.
For example, NFS doesn't provide an internal provisioner, but an external provisioner can be used. There are also cases when 3rd party storage vendors provide their own external provisioner.
Change the default StorageClass
In your example this default StorageClass behavior can lead to problems.
In order to list enabled addons in minikube please use:
minikube addons list
To list all StorageClasses in your cluster use:
kubectl get sc
NAME                 PROVISIONER
standard (default)   k8s.io/minikube-hostpath
Please note that at most one StorageClass can be marked as default. If two or more of them are marked as default, a PersistentVolumeClaim without storageClassName explicitly specified cannot be created.
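As an aside, this is how a class is marked as the default via the standard annotation; a hedged sketch using the minikube class from the listing above:
# Sketch: the storageclass.kubernetes.io/is-default-class annotation marks a
# class as the default; only one class should carry it at a time.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: k8s.io/minikube-hostpath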
In your example the most probable scenario is that you already have a default StorageClass. Applying those resources caused: a new PV creation (without a StorageClass) and a new PVC creation (with a reference to the existing default StorageClass). In this situation there is no reference between your custom PV and PVC, so they do not bind. As an example, please take a look:
kubectl get pv,pvc,sc
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM             STORAGECLASS   REASON   AGE
persistentvolume/nfs                                        3Gi        RWX            Retain           Available                                             50m
persistentvolume/pvc-8aeb802f-cd95-4933-9224-eb467aaa9871   1Gi        RWX            Delete           Bound       default/pvc-nfs   standard                50m

NAME                            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/pvc-nfs   Bound    pvc-8aeb802f-cd95-4933-9224-eb467aaa9871   1Gi        RWX            standard       50m

NAME                                             PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
storageclass.storage.k8s.io/standard (default)   k8s.io/minikube-hostpath   Delete          Immediate           false                  103m
This example will not work because:
- a new persistentvolume/nfs has been created (without a reference to the PVC)
- a new persistentvolume/pvc-8aeb802f-cd95-4933-9224-eb467aaa9871 has been created by dynamic PV provisioning using the default StorageClass, with a reference to the default/pvc-nfs claim (persistentvolumeclaim/pvc-nfs)
Solution 1.
According to the information from the comments:
Also I am able to connect to it within my minikube and also my actual ubuntu system.
If you were able to mount this NFS share on your minikube node, please try this example, which uses a hostPath volume directly from your pod:
apiVersion: v1
kind: Pod
metadata:
  name: test-shell
  namespace: default
spec:
  volumes:
    - name: pv
      hostPath:
        path: /path/shares # path to nfs mount point on minikube node
  containers:
    - name: shell
      image: ubuntu
      command: ["/bin/bash", "-c", "sleep 1000 "]
      volumeMounts:
        - name: pv
          mountPath: /data
Solution 2.
If you are using PV/PVC approach:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: persistent-volume
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: "" # Empty string must be explicitly set otherwise default StorageClass will be set / or custom storageClassName name
  nfs:
    path: "/nfsroot"
    server: "3.128.203.245"
    readOnly: false
  claimRef:
    name: persistent-volume-claim
    namespace: default

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: persistent-volume-claim
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: "" # Empty string must be explicitly set otherwise default StorageClass will be set / or custom storageClassName name
  volumeName: persistent-volume
Note:
If you are not referencing any provisioner associated with your StorageClass, note:
Helper programs relating to the volume type may be required for consumption of a PersistentVolume within a cluster. In this example, the PersistentVolume is of type NFS and the helper program /sbin/mount.nfs is required to support the mounting of NFS filesystems.
Please keep in mind that when you create a PVC, the Kubernetes persistent-volume controller tries to bind the PVC to a matching PV. During this process different factors are taken into account, such as: storageClassName (default/custom), accessModes, claimRef, and volumeName.
In this case you can use:
PersistentVolume.spec.claimRef.name: persistent-volume-claim
PersistentVolumeClaim.spec.volumeName: persistent-volume
Note:
The control plane can bind PersistentVolumeClaims to matching PersistentVolumes in the cluster. However, if you want a PVC to bind to a specific PV, you need to pre-bind them.
By specifying a PersistentVolume in a PersistentVolumeClaim, you declare a binding between that specific PV and PVC. If the PersistentVolume exists and has not reserved PersistentVolumeClaims through its claimRef field, then the PersistentVolume and PersistentVolumeClaim will be bound.
The binding happens regardless of some volume matching criteria, including node affinity. The control plane still checks that storage class, access modes, and requested storage size are valid.
Once the PV/PVC have been created, or in case of any problem with PV/PVC binding, please use the following commands to figure out the current state:
kubectl get pv,pvc,sc
kubectl describe pv
kubectl describe pvc
kubectl describe pod
kubectl get events

Why ReadWriteOnce is working on different nodes?

Our platform, which runs on K8s, has different components. We need to share storage between two of these components (comp-A and comp-B), but by mistake we defined the PV and PVC for it as ReadWriteOnce. Even when those two components were running on different nodes everything worked, and we were able to read and write to the storage from both components.
Based on the K8s docs, ReadWriteOnce can be mounted by only a single node, so we should have used ReadWriteMany:
ReadWriteOnce -- the volume can be mounted as read-write by a single node
ReadOnlyMany -- the volume can be mounted read-only by many nodes
ReadWriteMany -- the volume can be mounted as read-write by many nodes
So I am wondering why everything was working fine when it shouldn't have?
More info:
We use NFS for storage and we are not using dynamic provisioning. Below is how we defined our PV and PVC (we use Helm):
- apiVersion: v1
  kind: PersistentVolume
  metadata:
    name: gstreamer-{{ .Release.Namespace }}
  spec:
    capacity:
      storage: 10Gi
    accessModes:
      - ReadWriteOnce
    persistentVolumeReclaimPolicy: Recycle
    mountOptions:
      - hard
      - nfsvers=4.1
    nfs:
      server: {{ .Values.global.nfsserver }}
      path: /var/nfs/general/gstreamer-{{ .Release.Namespace }}
- apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: gstreamer-claim
    namespace: {{ .Release.Namespace }}
  spec:
    volumeName: gstreamer-{{ .Release.Namespace }}
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 10Gi
Update
The output of some kubectl commands:
$ kubectl get -n 149 pvc
NAME              STATUS   VOLUME          CAPACITY   ACCESS MODES   STORAGECLASS   AGE
gstreamer-claim   Bound    gstreamer-149   10Gi       RWO                           177d

$ kubectl get -n 149 pv
NAME            CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                 STORAGECLASS   REASON   AGE
gstreamer-149   10Gi       RWO            Recycle          Bound    149/gstreamer-claim                           177d
I think somehow it takes care of it, because the only thing the pods need to do is connect to that IP.
The accessMode concept is quite misleading, especially with NFS.
In the Kubernetes Persistent Volumes docs it's mentioned that NFS supports all access modes: RWO, ROX and RWX.
However, accessMode is more of a matching criterion, just like storage size. It's described better in the OpenShift Access Mode documentation:
A PersistentVolume can be mounted on a host in any way supported by the resource provider. Providers have different capabilities and each PV’s access modes are set to the specific modes supported by that particular volume. For example, NFS can support multiple read-write clients, but a specific NFS PV might be exported on the server as read-only. Each PV gets its own set of access modes describing that specific PV’s capabilities.
Claims are matched to volumes with similar access modes. The only two matching criteria are access modes and size. A claim’s access modes represent a request. Therefore, you might be granted more, but never less. For example, if a claim requests RWO, but the only volume available is an NFS PV (RWO+ROX+RWX), the claim would then match NFS because it supports RWO.
Direct matches are always attempted first. The volume’s modes must match or contain more modes than you requested. The size must be greater than or equal to what is expected. If two types of volumes, such as NFS and iSCSI, have the same set of access modes, either of them can match a claim with those modes. There is no ordering between types of volumes and no way to choose one type over another.
All volumes with the same modes are grouped, and then sorted by size, smallest to largest. The binder gets the group with matching modes and iterates over each, in size order, until one size matches.
In the next paragraph:
A volume’s AccessModes are descriptors of the volume’s capabilities. They are not enforced constraints. The storage provider is responsible for runtime errors resulting from invalid use of the resource.
For example, NFS offers ReadWriteOnce access mode. You must mark the claims as read-only if you want to use the volume’s ROX capability. Errors in the provider show up at runtime as mount errors.
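As a hedged illustration of that read-only point (all names below are made up), a pod can mount an existing claim read-only via the persistentVolumeClaim.readOnly flag:
# Hypothetical sketch: mounting a claim read-only to use a volume's ROX capability.
apiVersion: v1
kind: Pod
metadata:
  name: reader-pod
spec:
  containers:
    - name: reader
      image: ubuntu
      command: ["/bin/bash", "-c", "sleep 3600"]
      volumeMounts:
        - name: shared
          mountPath: /data
          readOnly: true
  volumes:
    - name: shared
      persistentVolumeClaim:
        claimName: some-existing-claim   # placeholder claim name
        readOnly: true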
Another example is that you can request several AccessModes at once, since they are not constraints but matching criteria.
$ cat <<EOF | kubectl create -f -
> apiVersion: v1
> kind: PersistentVolumeClaim
> metadata:
>   name: exmaple-pvc
> spec:
>   accessModes:
>     - ReadOnlyMany
>     - ReadWriteMany
>     - ReadWriteOnce
>   resources:
>     requests:
>       storage: 1Gi
> EOF
or as per GKE example:
$ cat <<EOF | kubectl create -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: exmaple-pvc-rwo-rom
spec:
  accessModes:
    - ReadOnlyMany
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF
persistentvolumeclaim/exmaple-pvc-rwo-rom created
PVC Output
$ kubectl get pvc
NAME                  STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
exmaple-pvc           Pending                                                                        standard       2m18s
exmaple-pvc-rwo-rom   Bound     pvc-d704d346-42b3-4090-af96-aebeee3053f5   1Gi        RWO,ROX        standard       6s
persistentvolumeclaim/exmaple-pvc created
exmaple-pvc is in Pending state because the default GKE StorageClass uses GCEPersistentDisk, which does not support ReadWriteMany.
Warning ProvisioningFailed 10s (x5 over 69s) persistentvolume-controller Failed to provision volume with StorageClass "standard": invalid AccessModes [ReadOnlyMany ReadWriteMany ReadWriteOnce]: only AccessModes [ReadWriteOnce ReadOnlyMany] are supported
However, the second PVC, exmaple-pvc-rwo-rom, was created, and you can see it has two access modes: RWO, ROX.
In short, accessMode is more like a requirement for the PVC and PV to bind. If NFS, which provides all access modes, binds with RWO, it fulfills the requirement; however, it will still work as RWX because NFS provides that capability.
Hope that clears it up a bit.
In addition, you can check other Stack Overflow threads regarding accessMode.
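To make the matching behaviour concrete, here is a minimal sketch (server address and export path are placeholders) of an NFS PV that advertises several access modes and a claim that requests only RWO; the claim still binds because the PV's modes contain the requested mode:
# Hypothetical sketch: the PV advertises RWO+ROX+RWX; the PVC asks only for RWO.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv-example
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
    - ReadOnlyMany
    - ReadWriteMany
  storageClassName: ""            # keep dynamic provisioning out of the way
  nfs:
    server: 10.0.0.10             # placeholder NFS server
    path: /exports/example        # placeholder export path
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc-example
spec:
  accessModes:
    - ReadWriteOnce               # requests less than the PV offers
  storageClassName: ""
  resources:
    requests:
      storage: 1Gi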

GKE `ResourceQuota` on a namespace - limit higher than specified

I have a Kubernetes cluster running on GKE, and I created a new namespace with a ResourceQuota:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: bot-quota
spec:
  hard:
    requests.cpu: '500m'
    requests.memory: 1Gi
    limits.cpu: '1000m'
    limits.memory: 2Gi
which I apply to my namespace (called bots). kubectl describe resourcequota --namespace=bots then gives me:
Name:            bot-quota
Namespace:       bots
Resource         Used  Hard
--------         ----  ----
limits.cpu       0     1
limits.memory    0     2Gi
requests.cpu     0     500m
requests.memory  0     1Gi

Name:                        gke-resource-quotas
Namespace:                   bots
Resource                     Used  Hard
--------                     ----  ----
count/ingresses.extensions   0     5k
count/jobs.batch             0     10k
pods                         0     5k
services                     0     1500
This is what I expect: the bots namespace should be hard-limited to the limits above.
Now I would like to deploy a single pod onto that namespace, using this simple yaml:
apiVersion: v1
kind: Pod
metadata:
  name: podname
  namespace: bots
  labels:
    app: someLabel
spec:
  nodeSelector:
    cloud.google.com/gke-nodepool: default-pool
  containers:
    - name: containername
      image: something-image-whatever:latest
      resources:
        requests:
          memory: '96Mi'
          cpu: '300m'
        limits:
          memory: '128Mi'
          cpu: '0.5'
      args: ['0']
Given the resources specified, I'd expect to be well within range when deploying a single instance. When I apply the YAML, though:
Error from server (Forbidden): error when creating "pod.yaml": pods "podname" is forbidden: exceeded quota: bot-quota, requested: limits.cpu=2500m, used: limits.cpu=0, limited: limits.cpu=1
If I change the pod's YAML to use a CPU limit of 0.3, then the same error appears with limits.cpu=2300m requested.
In other words: it seems to miraculously add 2000m (=2) cpu to my limit.
We do NOT have any LimitRange applied.
What am I missing?
As discussed in the comments above, it is indeed related to Istio. How?
As is (now) obvious, the requests and limits are specified at the container level, NOT at the pod/deployment level. Why is that relevant?
When running Istio (in our case, managed Istio on GKE), the container is not alone in the workload; it also has istio-init (which terminates soon after starting) plus the istio-proxy sidecar.
And these additional containers apply their own requests & limits; in the pod I am currently looking at, for example:
Limits:
  cpu:     2
  memory:  1Gi
Requests:
  cpu:     100m
  memory:  128Mi
on istio-proxy (using: kubectl describe pods <podid>)
This indeed explains why the WHOLE pod has 2 CPU more in its limits than expected.
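If the namespace quota is too tight for the sidecar defaults, Istio supports per-pod annotations to shrink the injected proxy's resources. This is only a hedged sketch; annotation support depends on the Istio version, and the values below are placeholders:
# Hedged sketch: per-pod overrides for the injected istio-proxy resources.
# The annotation names are Istio's sidecar resource annotations; values are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: podname
  namespace: bots
  annotations:
    sidecar.istio.io/proxyCPU: "100m"
    sidecar.istio.io/proxyMemory: "128Mi"
    sidecar.istio.io/proxyCPULimit: "200m"
    sidecar.istio.io/proxyMemoryLimit: "256Mi"
spec:
  containers:
    - name: containername
      image: something-image-whatever:latest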

Unable to delete perconaxtradbclusters which are in errored state

I've installed Percona XtraDB on Kubernetes using the 1.3.0 operator.
After using it, I wanted to delete the namespace, so I deleted the resources in the order in which I had applied them.
Everything is deleted and nothing is visible in svc or pods, but there are two resources that are in an errored state and cannot be deleted.
~ kubectl get perconaxtradbclusters -n pxc
NAME       ENDPOINT   STATUS   PXC   PROXYSQL   AGE
cluster1              Error    0     0          4h1m
cluster2              Error    0     0          3h34m
I am not able to delete either of them, and because of this I'm not able to create a cluster with the same name.
When I run the delete command, it gets stuck forever:
~ kubectl delete perconaxtradbclusters -n pxc cluster1
perconaxtradbcluster.pxc.percona.com "cluster1" deleted
The command execution never completes.
The YAML of the object:
apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBCluster
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"pxc.percona.com/v1-3-0","kind":"PerconaXtraDBCluster"}
  creationTimestamp: "2020-04-21T18:06:13Z"
  deletionGracePeriodSeconds: 0
  deletionTimestamp: "2020-04-21T18:38:33Z"
  finalizers:
  - delete-pxc-pods-in-order
  generation: 2
  name: cluster2
  namespace: pxc
  resourceVersion: "5445879"
  selfLink: /apis/pxc.percona.com/v1/namespaces/pxc/perconaxtradbclusters/cluster2
  uid: 8c100840-b7a8-40d1-b976-1f80c469622b
spec:
  allowUnsafeConfigurations: false
  backup:
    image: percona/percona-xtradb-cluster-operator:1.3.0-backup
    schedule:
    - keep: 3
      name: sat-night-backup
      schedule: 0 0 * * 6
      storageName: s3-us-west
    - keep: 5
      name: daily-backup
      schedule: 0 0 * * *
      storageName: fs-pvc
    serviceAccountName: percona-xtradb-cluster-operator
    storages:
      fs-pvc:
        type: filesystem
        volume:
          persistentVolumeClaim:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 6Gi
      s3-us-west:
        s3:
          bucket: S3-BACKUP-BUCKET-NAME-HERE
          credentialsSecret: my-cluster-name-backup-s3
          region: us-west-2
        type: s3
  pmm:
    enabled: false
    image: percona/percona-xtradb-cluster-operator:1.3.0-pmm
    serverHost: monitoring-service
    serverUser: pmm
  proxysql:
    affinity:
      antiAffinityTopologyKey: kubernetes.io/hostname
    enabled: true
    gracePeriod: 30
    image: percona/percona-xtradb-cluster-operator:1.3.0-proxysql
    podDisruptionBudget:
      maxUnavailable: 1
    resources:
      requests:
        cpu: 600m
        memory: 1G
    size: 3
    volumeSpec:
      persistentVolumeClaim:
        resources:
          requests:
            storage: 2Gi
  pxc:
    affinity:
      antiAffinityTopologyKey: kubernetes.io/hostname
    gracePeriod: 600
    image: percona/percona-xtradb-cluster-operator:1.3.0-pxc
    podDisruptionBudget:
      maxUnavailable: 1
    resources:
      requests:
        cpu: 600m
        memory: 4G
    size: 3
    volumeSpec:
      persistentVolumeClaim:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 60Gi
        storageClassName: local-storage
  secretsName: my-cluster-secrets
  sslInternalSecretName: my-cluster-ssl-internal
  sslSecretName: my-cluster-ssl
status:
  conditions:
  - lastTransitionTime: "2020-04-21T18:06:13Z"
    message: 'wrong PXC options: set version: new version: Malformed version: '
    reason: ErrorReconcile
    status: "True"
    type: Error
  message:
  - 'Error: wrong PXC options: set version: new version: Malformed version: '
  proxysql:
    ready: 0
  pxc:
    ready: 0
  state: Error
How can I get rid of them?
Your perconaxtradbclusters yaml example mentions pvc resources, so you'll probably have to delete the associated pvc first, if you haven't already done so.
Can you edit the resources to remove the finalizer blocks, and try deleting them again?
kubectl edit perconaxtradbclusters cluster1 -n pxc
and delete
finalizers:
- delete-pxc-pods-in-order
If there's nothing left relying on those resources, that is.
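For reference, a hedged sketch of what the metadata could look like once the finalizer entry has been removed via kubectl edit (the rest of the object stays untouched):
# Hypothetical sketch: with the delete-pxc-pods-in-order finalizer gone,
# the pending deletion can complete.
apiVersion: pxc.percona.com/v1
kind: PerconaXtraDBCluster
metadata:
  name: cluster1
  namespace: pxc
  finalizers: []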
Edit:
I would generally only use this method if I've exhausted all other possibilities and I can't find the hanging resources that are blocking the deletion. I did some digging around. This comment here describes other steps to take before resorting to removing the finalizers.
- Check that the API services are available
- Find any lingering resources that still exist and delete them.

How to avoid creating pod on node that doesn't have enough storage?

I'm setting up my application with Kubernetes. I have 2 Docker images (Oracle and WebLogic). I have 2 Kubernetes nodes, Node1 (20 GB storage) and Node2 (60 GB storage).
When I run kubectl apply -f oracle.yaml, it tries to create the Oracle pod on Node1 and after a few minutes it fails due to lack of storage. How can I force Kubernetes to check the free storage of a node before creating the pod there?
Thanks
First of all, you probably want to give Node1 more storage.
But if you don't want the pod to start at all, you can probably run a check with an initContainer that checks how much space you are using, with something like du or df. It could be a script that checks a threshold and exits unsuccessfully if there is not enough space. Something like this:
#!/bin/bash
# Fail if <dir> uses more than 10000 blocks (du's default 1K units)
if [ `du <dir> | tail -1 | awk '{print $1}'` -gt "10000" ]; then exit 1; fi
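A rough sketch, with placeholder paths and threshold, of wiring such a check into the pod as an initContainer; if the init container exits non-zero, the main container never starts:
# Hypothetical sketch: an initContainer that fails when /checked-dir uses more
# than ~10000 1K blocks, preventing the main container from starting.
apiVersion: v1
kind: Pod
metadata:
  name: oracle-with-space-check
spec:
  initContainers:
    - name: space-check
      image: busybox
      command:
        - sh
        - -c
        - 'test "$(du -s /checked-dir | awk ''{print $1}'')" -le 10000'
      volumeMounts:
        - name: data
          mountPath: /checked-dir
  containers:
    - name: oracle
      image: oracle-image:latest     # placeholder image
      volumeMounts:
        - name: data
          mountPath: /checked-dir
  volumes:
    - name: data
      hostPath:
        path: /var/oracle-data       # placeholder host directory to measure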
Another alternative is to use a persistent volume (PV) with a persistent volume claim (PVC) that has enough space, together with the default StorageClass Admission Controller, and allocate the appropriate space in your volume definition.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 40Gi
  storageClassName: mytype
Then on your Pod:
kind: Pod
apiVersion: v1
metadata:
  name: mypod
spec:
  containers:
    - name: mycontainer
      image: nginx
      volumeMounts:
        - mountPath: "/var/www/html"
          name: mypd
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: myclaim
The Pod will not start if your claim cannot be allocated (i.e. there isn't enough space).
You may try to specify an ephemeral storage requirement for the pod:
resources:
  requests:
    ephemeral-storage: "40Gi"
  limits:
    ephemeral-storage: "40Gi"
Then it would be scheduled only on nodes with sufficient ephemeral storage available.
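For context, a minimal sketch (image and names are placeholders) of where that block sits in a full pod spec:
# Hypothetical sketch: a pod that will only be scheduled on nodes that can
# satisfy the requested ephemeral storage.
apiVersion: v1
kind: Pod
metadata:
  name: oracle
spec:
  containers:
    - name: oracle
      image: oracle-image:latest   # placeholder image
      resources:
        requests:
          ephemeral-storage: "40Gi"
        limits:
          ephemeral-storage: "40Gi"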
You can verify the amount of ephemeral storage available on each node in the output of "kubectl describe node".
$ kubectl describe node somenode | grep -A 6 Allocatable
Allocatable:
  attachable-volumes-gce-pd:  64
  cpu:                        3920m
  ephemeral-storage:          26807024751
  hugepages-2Mi:              0
  memory:                     12700032Ki
  pods:                       110