I have created a StatefulSet using the manifest below; the issue is that the StatefulSet itself is created correctly, but its replicas never get to run. I hope you can help me, thanks.
Name: phponcio
Namespace: default
Priority: 0
Node: juanaraque-worknode/10.0.0.2
Start Time: Wed, 01 Jun 2022 13:49:08 +0200
Labels: run=phponcio
Annotations: cni.projectcalico.org/containerID: 768caa8f4e9b0683033e279aa259d6673d81fe38b62a2ea06c7d256805d522f0
cni.projectcalico.org/podIP: 192.168.240.69/32
cni.projectcalico.org/podIPs: 192.168.240.69/32
Status: Running
IP: 192.168.240.69
IPs:
IP: 192.168.240.69
Containers:
phponcio:
Container ID: docker://75f90344ada396be965f0ae5c33bb7ea8c110d9ea1e014b75f36263b9cd72504
Image: 84d3f7ba5876/php-apache-mysql
Image ID: docker-pullable://84d3f7ba5876/php-apache-mysql@sha256:bf0cd01ae4b77cca146dcd54d8a447ba8b6c7f5a9e11d6dab6d19f429fb111d1
Port: <none>
Host Port: <none>
State: Running
Started: Wed, 01 Jun 2022 14:31:53 +0200
Last State: Terminated
Reason: Error
Exit Code: 255
Started: Wed, 01 Jun 2022 13:54:26 +0200
Finished: Wed, 01 Jun 2022 14:30:44 +0200
Ready: True
Restart Count: 2
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-bld5b (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-bld5b:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
Name: phponcio
Namespace: default
CreationTimestamp: Thu, 02 Jun 2022 10:35:08 +0200
Selector: app=phponcio
Labels: <none>
Annotations: <none>
Replicas: 3 desired | 1 total
Update Strategy: RollingUpdate
Partition: 0
Pods Status: 0 Running / 1 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=phponcio
Containers:
phponcio:
Image: 84d3f7ba5876/php-apache-mysql
Port: 80/TCP
Host Port: 0/TCP
Environment: <none>
Mounts:
/usr/share from slow (rw)
Volumes: <none>
Volume Claims:
Name: slow
StorageClass: slow
Labels: <none>
Annotations: <none>
Capacity: 1Gi
Access Modes: [ReadWriteOnce]
Events: <none>
apiVersion: v1
kind: Service
metadata:
name: phponcio
labels:
app: phponcio
spec:
ports:
- port: 80
name: web
clusterIP: None
selector:
app: phponcio
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: phponcio
spec:
selector:
matchLabels:
app: phponcio # has to match .spec.template.metadata.labels
serviceName: phponcio
replicas: 3 # defaults to 1
template:
metadata:
labels:
app: phponcio # has to match .spec.selector.matchLabels
spec:
terminationGracePeriodSeconds: 10
containers:
- name: phponcio
image: 84d3f7ba5876/php-apache-mysql
ports:
- containerPort: 80
name: phponcio
volumeMounts:
- name: slow
mountPath: /usr/share
volumeClaimTemplates:
- metadata:
name: slow
spec:
accessModes: [ "ReadWriteOnce" ]
storageClassName: slow
resources:
requests:
storage: 1Gi
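A note on the storage side (not from the original post): the volumeClaimTemplates above request storageClassName: slow, so the PVCs that the StatefulSet generates can only bind if a StorageClass named slow actually exists and either provisions volumes dynamically or has pre-created PVs to match. A minimal sketch, assuming a manually provisioned local-volume setup; the PV name and host path are illustrative, and the node name is taken from the describe output above:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: slow                                  # must match storageClassName in the volumeClaimTemplates
provisioner: kubernetes.io/no-provisioner     # no dynamic provisioning; PVs are created by hand
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: phponcio-pv-0                         # hypothetical; one PV is needed per replica
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  storageClassName: slow
  local:
    path: /mnt/phponcio-0                     # hypothetical path that must exist on the node
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - juanaraque-worknode               # node name from the pod describe output above

Without a usable StorageClass, the generated PVCs stay Pending and the StatefulSet pods never get scheduled, which would be consistent with the 0 Running / 1 Waiting status shown earlier.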
Related
I'm trying to create persistent storage to share with all of my applications in the K8s cluster.
storageClass.yaml file:
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: my-local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
persistentVolume.yaml file:
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: my-local-pv
spec:
capacity:
storage: 50Mi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
storageClassName: my-local-storage
local:
path: /base-xapp/data
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- juniper-ric
persistentVolumeClaim.yaml file:
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: my-claim
spec:
accessModes:
- ReadWriteOnce
storageClassName: my-local-storage
resources:
requests:
storage: 50Mi
selector:
matchLabels:
name: my
and finally, this is the deployment yaml file:
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ .Values.appName }}-deployment
labels:
app: {{ .Values.appName }}
xappRelease: {{ .Release.Name }}
spec:
replicas: 1
selector:
matchLabels:
app: {{ .Values.appName }}
template:
metadata:
labels:
app: {{ .Values.appName }}
xappRelease: {{ .Release.Name }}
spec:
containers:
- name: {{ .Values.appName }}
image: "{{ .Values.image }}:{{ .Values.tag }}"
imagePullPolicy: IfNotPresent
ports:
- name: rmr
containerPort: {{ .Values.rmrPort }}
protocol: TCP
- name: rtg
containerPort: {{ .Values.rtgPort }}
protocol: TCP
volumeMounts:
- name: app-cfg
mountPath: {{ .Values.routingTablePath }}{{ .Values.routingTableFile }}
subPath: {{ .Values.routingTableFile }}
- name: app-cfg
mountPath: {{ .Values.routingTablePath }}{{ .Values.vlevelFile }}
subPath: {{ .Values.vlevelFile }}
- name: {{ .Values.appName }}-persistent-storage
mountPath: {{ .Values.appName }}/data
envFrom:
- configMapRef:
name: {{ .Values.appName }}-configmap
volumes:
- name: app-cfg
configMap:
name: {{ .Values.appName }}-configmap
items:
- key: {{ .Values.routingTableFile }}
path: {{ .Values.routingTableFile }}
- key: {{ .Values.vlevelFile }}
path: {{ .Values.vlevelFile }}
- name: {{ .Values.appName }}-persistent-storage
persistentVolumeClaim:
claimName: my-claim
---
apiVersion: v1
kind: Service
metadata:
name: {{ .Values.appName }}-rmr-service
labels:
xappRelease: {{ .Release.Name }}
spec:
selector:
app: {{ .Values.appName }}
type : NodePort
ports:
- name: rmr
protocol: TCP
port: {{ .Values.rmrPort }}
targetPort: {{ .Values.rmrPort }}
- name: rtg
protocol: TCP
port: {{ .Values.rtgPort }}
targetPort: {{ .Values.rtgPort }}
When I deploy it, the pod stays in Pending status:
base-xapp-deployment-6799d6cbf6-lgjks 0/1 Pending 0 3m25s
this is the output of the describe:
Name: base-xapp-deployment-6799d6cbf6-lgjks
Namespace: near-rt-ric
Priority: 0
Node: <none>
Labels: app=base-xapp
pod-template-hash=6799d6cbf6
xappRelease=base-xapp
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/base-xapp-deployment-6799d6cbf6
Containers:
base-xapp:
Image: base-xapp:0.1.0
Ports: 4565/TCP, 4561/TCP
Host Ports: 0/TCP, 0/TCP
Environment Variables from:
base-xapp-configmap ConfigMap Optional: false
Environment: <none>
Mounts:
/rmr_route from app-cfg (rw,path="rmr_route")
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rxmwm (ro)
/vlevel from app-cfg (rw,path="vlevel")
base-xapp/data from base-xapp-persistent-storage (rw)
Conditions:
Type Status
PodScheduled False
Volumes:
app-cfg:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: base-xapp-configmap
Optional: false
base-xapp-persistent-storage:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: my-claim
ReadOnly: false
kube-api-access-rxmwm:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 10s (x6 over 4m22s) default-scheduler 0/1 nodes are available: 1 persistentvolumeclaim "my-claim" not found.
This is the output of the relevant kubectl commands:
get pv:
dan@linux$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
my-local-pv 50Mi RWO Retain Available my-local-storage 6m2s
get pvc:
dan@linux$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
my-claim Pending my-local-storage 36m
You're missing spec.volumeName in your PVC manifest.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: my-claim
spec:
volumeName: my-local-pv # this line was missing
accessModes:
- ReadWriteOnce
storageClassName: my-local-storage
resources:
requests:
storage: 50Mi
selector:
matchLabels:
name: my
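Related to this: the claim above also keeps a selector with matchLabels name: my, but the my-local-pv shown earlier carries no labels, so a selector-based match on its own can never succeed. If you want to keep the selector, a sketch of the PV with the matching label added (everything else as in the original PV manifest):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-local-pv
  labels:
    name: my                       # matches selector.matchLabels in the PVC above
spec:
  capacity:
    storage: 50Mi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: my-local-storage
  local:
    path: /base-xapp/data
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - juniper-ric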
I can see your Deployment is in the namespace near-rt-ric,
but your PVC doesn't specify a namespace, so it was probably created in the default namespace.
Use this command to check: kubectl get pvc -A
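If that is the case, a sketch of the same claim created explicitly in the Deployment's namespace (assuming near-rt-ric, as in the describe output above; selector omitted for brevity):

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-claim
  namespace: near-rt-ric           # same namespace as the Deployment that references the claim
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: my-local-storage
  resources:
    requests:
      storage: 50Mi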
I am deploying Traefik v2.1.6 using this YAML:
apiVersion: v1
kind: Service
metadata:
name: traefik
annotations:
prometheus.io/scrape: 'true'
prometheus.io/port: '8080'
spec:
ports:
- name: web
port: 80
- name: websecure
port: 443
- name: metrics
port: 8080
selector:
app: traefik
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: traefik-ingress-controller
labels:
app: traefik
spec:
selector:
matchLabels:
app: traefik
template:
metadata:
name: traefik
labels:
app: traefik
spec:
serviceAccountName: traefik-ingress-controller
terminationGracePeriodSeconds: 1
containers:
- image: traefik:2.1.6
name: traefik-ingress-lb
ports:
- name: web
containerPort: 80
hostPort: 80 # expose the port on the cluster node via hostPort
- name: websecure
containerPort: 443
hostPort: 443 # expose the port on the cluster node via hostPort
- name: metrics
containerPort: 8080
resources:
limits:
cpu: 2000m
memory: 1024Mi
requests:
cpu: 1000m
memory: 1024Mi
securityContext:
capabilities:
drop:
- ALL
add:
- NET_BIND_SERVICE
envFrom:
- secretRef:
name: traefik-alidns-secret
args:
- --configfile=/config/traefik.yaml
- --logLevel=INFO
- --metrics=true
- --metrics.prometheus=true
- --entryPoints.metrics.address=:8080
- --metrics.prometheus.entryPoint=metrics
- --metrics.prometheus.addServicesLabels=true
- --metrics.prometheus.addEntryPointsLabels=true
- --metrics.prometheus.buckets=0.100000, 0.300000, 1.200000, 5.000000
# HTTPS certificate configuration
- --entryPoints.web.address=:80
- --entryPoints.websecure.address=:443
# email configuration
- --certificatesResolvers.default.acme.email=jiangtingqiang@gmail.com
# where to store the ACME certificates
- --certificatesResolvers.default.acme.storage=/config/acme.json
- --certificatesResolvers.default.acme.httpChallenge.entryPoint=web
# the CA server below is only for testing; remove this parameter once the HTTPS certificate is generated successfully
- --certificatesResolvers.default.acme.dnsChallenge.provider=alidns
- --certificatesResolvers.default.acme.dnsChallenge=true
- --certificatesresolvers.default.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
volumeMounts:
- mountPath: "/config"
name: "config"
volumes:
- name: config
configMap:
name: traefik-config
tolerations: # tolerate all taints, in case taints are set on the node
- operator: "Exists"
nodeSelector: # node selector: only start on nodes with this specific label
app-type: "online-app"
The service starts successfully:
$ k get daemonset -n kube-system
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
traefik-ingress-controller 1 1 1 1 1 app-type=online-app 61d
But when I access Traefik using this command, it returns 404 not found:
[root@fat001 ~]# curl -k --header 'Host:traefik.example.com' https://172.19.104.230
404 page not found
172.19.104.230 is the edge node of the Kubernetes cluster (v1.15.2) that runs Traefik. What should I do to access Traefik successfully? This is the pod describe output:
$ k describe pod traefik-ingress-controller-t4rmx -n kube-system
Name: traefik-ingress-controller-t4rmx
Namespace: kube-system
Priority: 0
Node: azshara-k8s02/172.19.104.230
Start Time: Tue, 31 Mar 2020 00:14:38 +0800
Labels: app=traefik
controller-revision-hash=547587d6d5
pod-template-generation=44
Annotations: <none>
Status: Running
IP: 172.30.208.18
IPs: <none>
Controlled By: DaemonSet/traefik-ingress-controller
Containers:
traefik-ingress-lb:
Container ID: docker://88b74826c5e380e00a53d2d4741ab6b74d8628412275f062dda861ad26681971
Image: traefik:2.1.6
Image ID: docker-pullable://traefik@sha256:13c5e62a0757bd8bf57c8c36575f7686f06186994ad6d2bda773ed8f140415c2
Ports: 80/TCP, 443/TCP, 8080/TCP
Host Ports: 80/TCP, 443/TCP, 0/TCP
Args:
--configfile=/config/traefik.yaml
--logLevel=INFO
--metrics=true
--metrics.prometheus=true
--entryPoints.metrics.address=:8080
--metrics.prometheus.entryPoint=metrics
--metrics.prometheus.addServicesLabels=true
--metrics.prometheus.addEntryPointsLabels=true
--metrics.prometheus.buckets=0.100000, 0.300000, 1.200000, 5.000000
--entryPoints.web.address=:80
--entryPoints.websecure.address=:443
--certificatesResolvers.default.acme.email=jiangtingqiang@gmail.com
--certificatesResolvers.default.acme.storage=/config/acme.json
--certificatesResolvers.default.acme.httpChallenge.entryPoint=web
--certificatesResolvers.default.acme.dnsChallenge.provider=alidns
--certificatesResolvers.default.acme.dnsChallenge=true
--certificatesresolvers.default.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
State: Running
Started: Tue, 31 Mar 2020 00:14:39 +0800
Ready: True
Restart Count: 0
Limits:
cpu: 2
memory: 1Gi
Requests:
cpu: 1
memory: 1Gi
Environment Variables from:
traefik-alidns-secret Secret Optional: false
Environment: <none>
Mounts:
/config from config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from traefik-ingress-controller-token-92vsc (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: traefik-config
Optional: false
traefik-ingress-controller-token-92vsc:
Type: Secret (a volume populated by a Secret)
SecretName: traefik-ingress-controller-token-92vsc
Optional: false
QoS Class: Burstable
Node-Selectors: app-type=online-app
Tolerations:
node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/not-ready:NoExecute
node.kubernetes.io/pid-pressure:NoSchedule
node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unschedulable:NoSchedule
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 102m default-scheduler Successfully assigned kube-system/traefik-ingress-controller-t4rmx to azshara-k8s02
Normal Pulled 102m kubelet, azshara-k8s02 Container image "traefik:2.1.6" already present on machine
Normal Created 102m kubelet, azshara-k8s02 Created container traefik-ingress-lb
Normal Started 102m kubelet, azshara-k8s02 Started container traefik-ingress-lb
And this is my Traefik route config:
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
name: traefik-dashboard-route
namespace: kube-system
spec:
entryPoints:
- websecure
tls:
certResolver: default
routes:
- match: Host(`traefik.example.com`) && PathPrefix(`/default`)
kind: Rule
services:
- name: traefik
port: 8080
curl from a Kubernetes container works fine:
/ # curl -L traefik.kube-system.svc.cluster.local:8080
<!DOCTYPE html><html><head><title>Traefik</title><meta charset=utf-8><meta name=description content="Traefik UI"><meta name=format-detection content="telephone=no"><meta name=msapplication-tap-highlight content=no><meta name=viewport content="user-scalable=no,initial-scale=1,maximum-scale=1,minimum-scale=1,width=device-width"><link rel=icon type=image/png href=statics/app-logo-128x128.png><link rel=icon type=image/png sizes=16x16 href=statics/icons/favicon-16x16.png><link rel=icon type=image/png sizes=32x32 href=statics/icons/favicon-32x32.png><link rel=icon type=image/png sizes=96x96 href=statics/icons/favicon-96x96.png><link rel=icon type=image/ico href=statics/icons/favicon.ico><link href=css/019be8e4.d05f1162.css rel=prefetch><link href=css/099399dd.9310dd1b.css rel=prefetch><link href=css/0af0fca4.e3d6530d.css rel=prefetch><link href=css/162d302c.9310dd1b.css rel=prefetch><link href=css/29ead7f5.9310dd1b.css rel=prefetch><link href=css/31ad66a3.9310dd1b.css rel=prefetch><link href=css/524389aa.619bfb84.css rel=prefetch><link href=css/61674343.9310dd1b.css rel=prefetch><link href=css/63c47f2b.294d1efb.css rel=prefetch><link href=css/691c1182.ed0ee510.css rel=prefetch><link href=css/7ba452e3.37efe53c.css rel=prefetch><link href=css/87fca1b4.8c8c2eec.css rel=prefetch><link href=js/019be8e4.d8726e8b.js rel=prefetch><link href=js/099399dd.a047d401.js rel=prefetch><link href=js/0af0fca4.271bd48d.js rel=prefetch><link href=js/162d302c.ce1f9159.js rel=prefetch><link href=js/29ead7f5.cd022784.js rel=prefetch><link href=js/2d21e8fd.f3d2bb6c.js rel=prefetch><link href=js/31ad66a3.12ab3f06.js rel=prefetch><link href=js/524389aa.21dfc9ee.js rel=prefetch><link href=js/61674343.adb358dd.js rel=prefetch><link href=js/63c47f2b.caf9b4a2.js rel=prefetch><link href=js/691c1182.5d4aa4c9.js rel=prefetch><link href=js/7ba452e3.71a69a60.js rel=prefetch><link href=js/87fca1b4.ac9c2dc6.js rel=prefetch><link href=css/app.e4fba3f1.css rel=preload as=style><link href=js/app.841031a8.js rel=preload as=script><link href=js/vendor.49a1849c.js rel=preload as=script><link href=css/app.e4fba3f1.css rel=stylesheet><link rel=manifest href=manifest.json><meta name=theme-color content=#027be3><meta name=apple-mobile-web-app-capable content=yes><meta name=apple-mobile-web-app-status-bar-style content=default><meta name=apple-mobile-web-app-title content=Traefik><link rel=apple-touch-icon href=statics/icons/apple-icon-120x120.png><link rel=apple-touch-icon sizes=180x180 href=statics/icons/apple-icon-180x180.png><link rel=apple-touch-icon sizes=152x152 href=statics/icons/apple-icon-152x152.png><link rel=apple-touch-icon sizes=167x167 href=statics/icons/apple-icon-167x167.png><link rel=mask-icon href=statics/icons/safari-pinned-tab.svg color=#027be3><meta name=msapplication-TileImage content=statics/icons/ms-icon-144x144.png><meta name=msapplication-TileColor content=#000000></head><body><div id=q-app></div><script type=text/javascript src=js/app.841031a8.js></script><script type=text/javascript src=js/vendor.49a1849c.js></script></body></html>/ #
curl from the host fails:
[root@fat001 ~]# curl -k --header 'Host:traefik.example.com' https://172.19.104.230
404 page not found
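For reference, the IngressRoute above only matches requests that satisfy both Host(`traefik.example.com`) and PathPrefix(`/default`), so a curl against / like the ones shown here matches no router and Traefik answers 404. A sketch of a Host-only route, assuming the goal is to reach the dashboard service at the root path (illustrative, not a confirmed fix):

apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: traefik-dashboard-route
  namespace: kube-system
spec:
  entryPoints:
  - websecure
  tls:
    certResolver: default
  routes:
  - match: Host(`traefik.example.com`)   # no PathPrefix, so requests to / also match
    kind: Rule
    services:
    - name: traefik
      port: 8080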
env: vagrant + virtualbox
kubernetes: 1.14
docker 18.06.3~ce~3-0~debian
os: debian stretch
I have priority classes:
root@k8s-master:/# kubectl get priorityclass
NAME VALUE GLOBAL-DEFAULT AGE
cluster-health-priority 1000000000 false 33m < -- created by me
default-priority 100 true 33m < -- created by me
system-cluster-critical 2000000000 false 33m < -- system
system-node-critical 2000001000 false 33m < -- system
default-priority has been set as the globalDefault:
root@k8s-master:/# kubectl get priorityclass default-priority -o yaml
apiVersion: scheduling.k8s.io/v1
description: Used for all Pods without priorityClassName
globalDefault: true <------------------
kind: PriorityClass
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"scheduling.k8s.io/v1","description":"Used for all Pods without priorityClassName","globalDefault":true,"kind":"PriorityClass","metadata":{"annotations":{},"labels":{"addonmanager.kubernetes.io/mode":"Reconcile"},"name":"default-priority"},"value":100}
creationTimestamp: "2019-07-15T16:48:23Z"
generation: 1
labels:
addonmanager.kubernetes.io/mode: Reconcile
name: default-priority
resourceVersion: "304"
selfLink: /apis/scheduling.k8s.io/v1/priorityclasses/default-priority
uid: 5bea6f73-a720-11e9-8343-0800278dc04d
value: 100
I have some pods, which were created after the priority classes were created.
This one:
kube-state-metrics-874ccb958-b5spd 1/1 Running 0 9m18s 10.20.59.67 k8s-master <none> <none>
And this one:
tmp-shell-one-59fb949cb5-b8khc 1/1 Running 1 47s 10.20.59.73 k8s-master <none> <none>
The kube-state-metrics pod is using the priorityClass cluster-health-priority:
root@k8s-master:/etc/kubernetes/addons# kubectl -n kube-system get pod kube-state-metrics-874ccb958-b5spd -o yaml
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: "2019-07-15T16:48:24Z"
generateName: kube-state-metrics-874ccb958-
labels:
k8s-app: kube-state-metrics
pod-template-hash: 874ccb958
name: kube-state-metrics-874ccb958-b5spd
namespace: kube-system
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: ReplicaSet
name: kube-state-metrics-874ccb958
uid: 5c64bf85-a720-11e9-8343-0800278dc04d
resourceVersion: "548"
selfLink: /api/v1/namespaces/kube-system/pods/kube-state-metrics-874ccb958-b5spd
uid: 5c88143e-a720-11e9-8343-0800278dc04d
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kube-role
operator: In
values:
- master
containers:
- image: gcr.io/google_containers/kube-state-metrics:v1.6.0
imagePullPolicy: Always
name: kube-state-metrics
ports:
- containerPort: 8080
name: http-metrics
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 8080
scheme: HTTP
initialDelaySeconds: 5
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-state-metrics-token-jvz5b
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
nodeName: k8s-master
nodeSelector:
namespaces/default: "true"
priorityClassName: cluster-health-priority <------------------------
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: kube-state-metrics
serviceAccountName: kube-state-metrics
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoSchedule
key: dedicated
operator: Equal
value: master
- key: CriticalAddonsOnly
operator: Exists
volumes:
- name: kube-state-metrics-token-jvz5b
secret:
defaultMode: 420
secretName: kube-state-metrics-token-jvz5b
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2019-07-15T16:48:24Z"
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: "2019-07-15T16:48:58Z"
status: "True"
type: Ready
- lastProbeTime: null
lastTransitionTime: "2019-07-15T16:48:58Z"
status: "True"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: "2019-07-15T16:48:24Z"
status: "True"
type: PodScheduled
containerStatuses:
- containerID: docker://a736dce98492b7d746079728b683a2c62f6adb1068075ccc521c5e57ba1e02d1
image: gcr.io/google_containers/kube-state-metrics:v1.6.0
imageID: docker-pullable://gcr.io/google_containers/kube-state-metrics@sha256:c98991f50115fe6188d7b4213690628f0149cf160ac47daf9f21366d7cc62740
lastState: {}
name: kube-state-metrics
ready: true
restartCount: 0
state:
running:
startedAt: "2019-07-15T16:48:46Z"
hostIP: 10.0.2.15
phase: Running
podIP: 10.20.59.67
qosClass: BestEffort
startTime: "2019-07-15T16:48:24Z"
The tmp-shell pod has nothing about priority classes at all:
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: "2019-07-15T16:56:49Z"
generateName: tmp-shell-one-59fb949cb5-
labels:
pod-template-hash: 59fb949cb5
run: tmp-shell-one
name: tmp-shell-one-59fb949cb5-b8khc
namespace: monitoring
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: ReplicaSet
name: tmp-shell-one-59fb949cb5
uid: 89c3caa3-a721-11e9-8343-0800278dc04d
resourceVersion: "1350"
selfLink: /api/v1/namespaces/monitoring/pods/tmp-shell-one-59fb949cb5-b8khc
uid: 89c71bad-a721-11e9-8343-0800278dc04d
spec:
containers:
- args:
- /bin/bash
image: nicolaka/netshoot
imagePullPolicy: Always
name: tmp-shell-one
resources: {}
stdin: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
tty: true
volumeMounts:
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: default-token-g9lnc
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
nodeName: k8s-master
nodeSelector:
namespaces/default: "true"
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: default
serviceAccountName: default
terminationGracePeriodSeconds: 30
volumes:
- name: default-token-g9lnc
secret:
defaultMode: 420
secretName: default-token-g9lnc
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2019-07-15T16:56:49Z"
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: "2019-07-15T16:57:20Z"
status: "True"
type: Ready
- lastProbeTime: null
lastTransitionTime: "2019-07-15T16:57:20Z"
status: "True"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: "2019-07-15T16:56:49Z"
status: "True"
type: PodScheduled
containerStatuses:
- containerID: docker://545d4d029b440ebb694386abb09e0377164c87d1170ac79704f39d3167748bf5
image: nicolaka/netshoot:latest
imageID: docker-pullable://nicolaka/netshoot@sha256:b3e662a8730ee51c6b877b6043c5b2fa61862e15d535e9f90cf667267407753f
lastState:
terminated:
containerID: docker://dfdfd0d991151e94411029f2d5a1a81d67b5b55d43dcda017aec28320bafc7d3
exitCode: 130
finishedAt: "2019-07-15T16:57:17Z"
reason: Error
startedAt: "2019-07-15T16:57:03Z"
name: tmp-shell-one
ready: true
restartCount: 1
state:
running:
startedAt: "2019-07-15T16:57:19Z"
hostIP: 10.0.2.15
phase: Running
podIP: 10.20.59.73
qosClass: BestEffort
startTime: "2019-07-15T16:56:49Z"
According to the docs:
The globalDefault field indicates that the value of this PriorityClass
should be used for Pods without a priorityClassName
and
Pod priority is specified by setting the priorityClassName field of
podSpec. The integer value of priority is then resolved and populated
to the priority field of podSpec
So, the questions are:
Why is the tmp-shell pod not using the priorityClass default-priority, even though it was created after the priority class with globalDefault set to true?
Why does the kube-state-metrics pod not have a priority field in its podSpec with the value resolved from the priority class cluster-health-priority? (see the .yaml above)
What am I doing wrong?
The only way I can reproduce this is by disabling the Priority admission controller, by adding the argument --disable-admission-plugins=Priority to the kube-apiserver definition under /etc/kubernetes/manifests/kube-apiserver.yaml on the host running the API server.
According to the documentation, in v1.14 this plugin is enabled by default. Please make sure that it is enabled in your cluster as well.
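For reference, a sketch of the relevant part of that static pod manifest with the plugin explicitly enabled; the flag is the standard kube-apiserver one, while the surrounding fields are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - name: kube-apiserver
    image: k8s.gcr.io/kube-apiserver:v1.14.0        # illustrative tag
    command:
    - kube-apiserver
    # ... other flags kept as generated ...
    # Priority must not appear in --disable-admission-plugins; listing it here is
    # redundant in v1.14 (it is on by default) but makes the intent explicit:
    - --enable-admission-plugins=NodeRestriction,Priority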
I'm new to RabbitMQ and trying to set up a highly available queue using StatefulSets. The tutorial I followed is here.
After deploying the StatefulSet and Service to Kubernetes,
the nodes are not able to discover each other in the cluster and the pod goes to Status: CrashLoopBackOff. It seems peer discovery is not working as expected and the node is not able to join the cluster.
My cluster nodes are
rabbit@rabbitmq-0, rabbit@rabbitmq-1 and rabbit@rabbitmq-2
$ kubectl exec -it rabbitmq-0 /bin/sh
/ # rabbitmqctl status
Status of node 'rabbit@rabbitmq-0'
Error: unable to connect to node 'rabbit@rabbitmq-0': nodedown
DIAGNOSTICS
===========
attempted to contact: ['rabbit@rabbitmq-0']
rabbit@rabbitmq-0:
* connected to epmd (port 4369) on rabbitmq-0
* epmd reports: node 'rabbit' not running at all
no other nodes on rabbitmq-0
* suggestion: start the node
current node details:
- node name: 'rabbitmq-cli-22@rabbitmq-0'
- home dir: /var/lib/rabbitmq
- cookie hash: 5X3n5Gy+r4FL+M53FHwv3w==
rabbitmq.conf
{ rabbit, [
{ loopback_users, [ ] },
{ tcp_listeners, [ 5672 ] },
{ ssl_listeners, [ ] },
{ hipe_compile, false },
{ cluster_nodes, { [ rabbit@rabbitmq-0, rabbit@rabbitmq-1, rabbit@rabbitmq-2], disc } },
{ssl_listeners, [5671]},
{ssl_options, [{cacertfile,"/etc/rabbitmq/ca_certificate.pem"},
{certfile,"/etc/rabbitmq/server_certificate.pem"},
{keyfile,"/etc/rabbitmq/server_key.pem"},
{verify,verify_peer},
{versions, ['tlsv1.2', 'tlsv1.1']}
{fail_if_no_peer_cert,false}]}
] },
{ rabbitmq_management, [ { listener, [
{ port, 15672 },
{ ssl, false }
] } ] }
].
$ kubectl get statefulset rabbitmq
apiVersion: apps/v1
kind: StatefulSet
metadata:
labels:
app: rabbitmq
name: rabbitmq
namespace: development
resourceVersion: "119265565"
selfLink: /apis/apps/v1/namespaces/development/statefulsets/rabbitmq
uid: 10c2fabc-cbb3-11e7-8821-00505695519e
spec:
podManagementPolicy: OrderedReady
replicas: 3
revisionHistoryLimit: 10
selector:
matchLabels:
app: rabbitmq
serviceName: rabbitmq
template:
metadata:
creationTimestamp: null
labels:
app: rabbitmq
spec:
containers:
- env:
- name: RABBITMQ_ERLANG_COOKIE
valueFrom:
secretKeyRef:
key: rabbitmq-erlang-cookie
name: rabbitmq-erlang-cookie
image: rabbitmq:1.0
imagePullPolicy: IfNotPresent
lifecycle:
postStart:
exec:
command:
- /bin/sh
- -c
- |
if [ -z "$(grep rabbitmq /etc/resolv.conf)" ]; then
sed "s/^search \([^ ]\+\)/search rabbitmq.\1 \1/" /etc/resolv.conf > /etc/resolv.conf.new;
cat /etc/resolv.conf.new > /etc/resolv.conf;
rm /etc/resolv.conf.new;
fi; until rabbitmqctl node_health_check; do sleep 1; done; if [[ "$HOSTNAME" != "rabbitmq-0" && -z "$(rabbitmqctl cluster_status | grep rabbitmq-0)" ]]; then
rabbitmqctl stop_app;
rabbitmqctl join_cluster rabbit@rabbitmq-0;
rabbitmqctl start_app;
fi; rabbitmqctl set_policy ha-all "." '{"ha-mode":"exactly","ha-params":3,"ha-sync-mode":"automatic"}'
name: rabbitmq
ports:
- containerPort: 5672
protocol: TCP
- containerPort: 5671
protocol: TCP
- containerPort: 15672
protocol: TCP
- containerPort: 25672
protocol: TCP
- containerPort: 4369
protocol: TCP
resources:
limits:
cpu: 400m
memory: 2Gi
requests:
cpu: 200m
memory: 1Gi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/lib/rabbitmq
name: rabbitmq-persistent-data-storage
- mountPath: /etc/rabbitmq
name: rabbitmq-config
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 10
volumes:
- name: rabbitmq-config
secret:
defaultMode: 420
secretName: rabbitmq-config
updateStrategy:
type: OnDelete
volumeClaimTemplates:
- metadata:
creationTimestamp: null
name: rabbitmq-persistent-data-storage
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100Gi
status:
phase: Pending
status:
currentReplicas: 1
currentRevision: rabbitmq-4234207235
observedGeneration: 1
replicas: 1
updateRevision: rabbitmq-4234207235
$ kubectl get service rabbitmq
apiVersion: v1
kind: Service
metadata:
labels:
app: rabbitmq
name: rabbitmq
namespace: develop
resourceVersion: "59968950"
selfLink: /api/v1/namespaces/develop/services/rabbitmq
uid: ced85a60-cbae-11e7-8821-00505695519e
spec:
clusterIP: None
ports:
- name: tls-amqp
port: 5671
protocol: TCP
targetPort: 5671
- name: management
port: 15672
protocol: TCP
targetPort: 15672
selector:
app: rabbitmq
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
$ kubectl describe pod rabbitmq-0
Name: rabbitmq-0
Namespace: development
Node: node9/170.XX.X.Xx
Labels: app=rabbitmq
controller-revision-hash=rabbitmq-4234207235
Status: Running
IP: 10.25.128.XX
Controlled By: StatefulSet/rabbitmq
Containers:
rabbitmq:
Container ID: docker://f60b06283d3974382a068ded54782b24de4b6da3203c05772a77c65d76aa2e2f
Image: rabbitmq:1.0
Image ID: rabbitmq@sha256:6245a81a1fc0fb
Ports: 5672/TCP, 5671/TCP, 15672/TCP, 25672/TCP, 4369/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Ready: False
Restart Count: 104
Limits:
cpu: 400m
memory: 2Gi
Requests:
cpu: 200m
memory: 1Gi
Environment:
RABBITMQ_ERLANG_COOKIE: <set to the key 'rabbitmq-erlang-cookie' in secret 'rabbitmq-erlang-cookie'> Optional: false
Mounts:
/etc/rabbitmq from rabbitmq-config (rw)
/var/lib/rabbitmq from rabbitmq-persistent-data-storage (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-lqbp6 (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
rabbitmq-persistent-data-storage:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: rabbitmq-persistent-data-storage-rabbitmq-0
ReadOnly: false
rabbitmq-config:
Type: Secret (a volume populated by a Secret)
SecretName: rabbitmq-config
Optional: false
default-token-lqbp6:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-lqbp6
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: <none>
Events: <none>
This problem is due to DNS resolution failing inside the pod. The pods are not able to contact each other because there are no valid DNS records for them.
To solve this, try creating an additional Service, or editing an existing one, to handle DNS resolution.
Creating an additional headless Service for DNS can be done as follows:
kind: Service
apiVersion: v1
metadata:
namespace: default
name: rabbitmq
labels:
app: rabbitmq
type: Service
spec:
ports:
- name: http
protocol: TCP
port: 15672
targetPort: 15672
- name: amqp
protocol: TCP
port: 5672
targetPort: 5672
selector:
app: rabbitmq
type: ClusterIP
clusterIP: None
Here the Service spec is of type ClusterIP with clusterIP set to None, i.e. a headless Service. This should let the pods resolve each other via DNS.
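With such a headless Service in place, the StatefulSet should reference it via serviceName, which is what gives each pod a stable DNS record of the form <pod>.<service>.<namespace>.svc.cluster.local. A sketch of the relevant excerpt, assuming the Service above is named rabbitmq (most container fields from the original StatefulSet are omitted here):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rabbitmq
spec:
  serviceName: rabbitmq            # name of the headless Service above
  replicas: 3
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      containers:
      - name: rabbitmq
        image: rabbitmq:1.0        # image from the original StatefulSet
        ports:
        - containerPort: 5672
          protocol: TCP

With this in place, the node names become resolvable as, for example, rabbitmq-0.rabbitmq.default.svc.cluster.local (substitute the namespace the Service actually lives in).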
I'm trying to mimic volumes_from using OpenShift/Kubernetes. I have container A packaging the app and container B serving the packaged app.
I cannot use init containers since I'm stuck on kubernetes 1.2.
I've tried the postStart lifecycle hook, detailed here:
How to mimic '--volumes-from' in Kubernetes
But OpenShift/Kubernetes keeps complaining that container A is constantly crashing, because once it's done packaging, it exits.
How do I get OpenShift/Kubernetes to stop complaining about container A crashing and just accept that it finished its job?
Or is there another way of having one container build a package for another container to run?
Thanks in advance for your time.
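One way of having one container build a package for another container to run, without init containers, is to share an emptyDir and keep the builder alive after it finishes, so the platform never sees it exit. A sketch only; the image names, build command and paths are hypothetical, not taken from the original setup:

apiVersion: v1
kind: Pod
metadata:
  name: myapp-build-and-serve                   # hypothetical
spec:
  volumes:
  - name: packaged-app
    emptyDir: {}                                # shared scratch volume, the volumes_from stand-in
  containers:
  - name: builder                               # "container A": builds the package, then blocks
    image: my.ip:5000/myproject/myapp-builder   # hypothetical image
    command: ["/bin/sh", "-c"]
    args:
    - scripts/build.sh && cp -r /build/output/. /shared/ && tail -f /dev/null
    volumeMounts:
    - name: packaged-app
      mountPath: /shared
  - name: server                                # "container B": serves what the builder wrote
    image: my.ip:5000/myproject/myapp-server    # hypothetical image
    volumeMounts:
    - name: packaged-app
      mountPath: /usr/share/app
      readOnly: true

The tail -f /dev/null is only there to keep container A's process running so the pod is not reported as crash-looping; anything that blocks forever works.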
Update 1:
I don't have kubectl, but here is the output of oc describe pod myapp-2-prehook:
me:~/Projects/myapp (master) $ oc describe pod myapp-2-prehook
Name: myapp-2-prehook
Namespace: myproject
Node: my.host/my.ip
Start Time: Tue, 01 Nov 2016 15:30:55 -1000
Labels: openshift.io/deployer-pod-for.name=myapp-2
Status: Failed
IP:
Controllers: <none>
Containers:
lifecycle:
Container ID: docker://97a5272ebfa56f0c40fdc95094f13da06dba889049f2cc964fe3e89f61bd7792
Image: my.ip:5000/myproject/myapp@sha256:cde5739c5f2bdc8c25b1dd514f698c543cfb6c8b68c3f1afbc7760e11597fde9
Image ID: docker://3be476fec505e5b979bac69d327d4ffb53b3f568e85547c5b66c229948435f44
Port:
Command:
scripts/build.sh
QoS Tier:
cpu: BestEffort
memory: BestEffort
State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 01 Nov 2016 15:31:21 -1000
Finished: Tue, 01 Nov 2016 15:31:42 -1000
Ready: False
Restart Count: 0
Environment Variables:
CUSTOM_VAR1: custom_value1
OPENSHIFT_DEPLOYMENT_NAME: myapp
OPENSHIFT_DEPLOYMENT_NAMESPACE: myproject
Conditions:
Type Status
Ready False
Volumes:
default-token-goe98:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-goe98
No events.
Output of oc get pod assessor-2-prehook -o yaml:
apiVersion: v1
kind: Pod
metadata:
annotations:
openshift.io/deployment.name: myapp-2
openshift.io/scc: restricted
creationTimestamp: 2016-11-02T01:30:55Z
labels:
openshift.io/deployer-pod-for.name: myapp-2
name: myapp-2-prehook
namespace: myproject
resourceVersion: "21512896"
selfLink: /api/v1/namespaces/myproject/pods/myapp-2-prehook
uid: ffcb7766-a09b-11e6-9053-005056a65cf8
spec:
activeDeadlineSeconds: 21600
containers:
- command:
- scripts/build.sh
env:
- name: CUSTOM_VAR1
value: custom_value1
- name: OPENSHIFT_DEPLOYMENT_NAME
value: myapp-2
- name: OPENSHIFT_DEPLOYMENT_NAMESPACE
value: myproject
image: my.ip:5000/myproject/myapp@sha256:cde5739c5f2bdc8c25b1dd514f698c543cfb6c8b68c3f1afbc7760e11597fde9
imagePullPolicy: IfNotPresent
name: lifecycle
resources: {}
securityContext:
privileged: false
seLinuxOptions:
level: s0:c21,c0
terminationMessagePath: /dev/termination-log
volumeMounts:
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: default-token-goe98
readOnly: true
dnsPolicy: ClusterFirst
host: my.host
imagePullSecrets:
- name: default-dockercfg-srrog
nodeName: my.host
restartPolicy: Never
securityContext:
seLinuxOptions:
level: s0:c21,c0
serviceAccount: default
serviceAccountName: default
terminationGracePeriodSeconds: 30
volumes:
- name: default-token-goe98
secret:
secretName: default-token-goe98
status:
conditions:
- lastProbeTime: null
lastTransitionTime: 2016-11-02T01:31:49Z
message: 'containers with unready status: [lifecycle]'
reason: ContainersNotReady
status: "False"
type: Ready
containerStatuses:
- containerID: docker://97a5272ebfa56f0c40fdc95094f13da06dba889049f2cc964fe3e89f61bd7792
image: my.ip:5000/myproject/myapp@sha256:cde5739c5f2bdc8c25b1dd514f698c543cfb6c8b68c3f1afbc7760e11597fde9
imageID: docker://3be476fec505e5b979bac69d327d4ffb53b3f568e85547c5b66c229948435f44
lastState: {}
name: lifecycle
ready: false
restartCount: 0
state:
terminated:
containerID: docker://97a5272ebfa56f0c40fdc95094f13da06dba889049f2cc964fe3e89f61bd7792
exitCode: 1
finishedAt: 2016-11-02T01:31:42Z
reason: Error
startedAt: 2016-11-02T01:31:21Z
hostIP: 128.49.90.62
phase: Failed
startTime: 2016-11-02T01:30:55Z