I am deploying web agent via web-agent-deployment.yaml. So I ran the below command
root#ip-10-11.x.x.:~/ignite# kubectl create -f web-agent-deployment.yaml
deployment web-agent created
But still no web-agent pod spin at all. Please check below command output.
root#ip-10-10-11.x.x:~/ignite# kubectl get pods -n ignite
NAME READY STATUS RESTARTS AGE
ignite-cluster-6qhmf 1/1 Running 0 2h
ignite-cluster-lpgrt 1/1 Running 0 2h
as per official Doc, it should come Like blow.
$ kubectl get pods -n ignite
NAME READY STATUS RESTARTS AGE
web-agent-5596bd78c-h4272 1/1 Running 0 1h
This is my web-agent file:-
apiVersion: apps/v1beta2
kind: Deployment
metadata:
name: web-agent
namespace: ignite
spec:
selector:
matchLabels:
app: web-agent
replicas: 1
template:
metadata:
labels:
app: web-agent
spec:
serviceAccountName: ignite-cluster
containers:
- name: web-agent
image: apacheignite/web-agent
resources:
limits:
cpu: 500m
memory: 500Mi
env:
- name: DRIVER_FOLDER
value: "./jdbc-drivers"
- name: NODE_URI
value: ""https://10.11.X.Y:8080"" #my One of worker Node IP
- name: SERVER_URI
value: "http://frontend.web-console.svc.cluster.local"
- name: TOKENS
value: ""
- name: NODE_LOGIN
value: web-agent
- name: NODE_PASSWORD
value: password
Related
I'm currently learning Kubernetes and all its quircks.
I'm currently using a rabbitMQ Deployment, service and pod in my cluster to exchange messages between apps in the cluster. However, I saw an abnormal amount of the rabbitMQ pod restarts.
After installing prometheus and Grafana to see the problem, I saw that the rabbitMQ pod would consume more and more memory and cpu until it gets killed by the OOMkiller every two hours or so. The graph looks like this :
Graph of CPU consumption in my cluster (rabbitmq in red)
After that I looked into the rabbitMQ pod UI, and saw that an app in my cluster (ip 10.224.0.5) was constantly creating new connections, this IP corresponding to my kube-system and my prometheus instance, as shown by the following logs :
k get all -A -o wide | grep 10.224.0.5
E1223 12:13:48.231908 23198 memcache.go:255] couldn't get resource list for external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1
E1223 12:13:48.311831 23198 memcache.go:255] couldn't get resource list for external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1
kube-system pod/azure-ip-masq-agent-xh9jk 1/1 Running 0 25d 10.224.0.5 aks-agentpool-37892177-vmss000001 <none> <none>
kube-system pod/cloud-node-manager-h5ff5 1/1 Running 0 25d 10.224.0.5 aks-agentpool-37892177-vmss000001 <none> <none>
kube-system pod/csi-azuredisk-node-sf8sn 3/3 Running 0 3d15h 10.224.0.5 aks-agentpool-37892177-vmss000001 <none> <none>
kube-system pod/csi-azurefile-node-97nbt 3/3 Running 0 19d 10.224.0.5 aks-agentpool-37892177-vmss000001 <none> <none>
kube-system pod/kube-proxy-2s5tn 1/1 Running 0 3d15h 10.224.0.5 aks-agentpool-37892177-vmss000001 <none> <none>
monitoring pod/prometheus-prometheus-node-exporter-dztwx 1/1 Running 0 20h 10.224.0.5 aks-agentpool-37892177-vmss000001 <none> <none>
Also, I noticed that these connections seem tpo be blocked by rabbitMQ, as the field connection.blocked in the client properties is set to true, as shown in the follwing image:
Print screen of a connection details from rabbitMQ pod's UI
I saw in the documentation that rabbitMQ starts to blocks connections when it hits low on resources, but I set the cpu and memory limits to 1 cpu and 1 Gib RAM, and the connections are blocked from the start anyway.
On the cluster, I'm also using Keda which uses the rabbitmq pod, and polls it every one second to see if there are any messages in a queue (I set pollingInterval to 1 in the yaml). But as I said earlier, it's not Keda that's creating all the connections, it's kube-system. Unless keda uses a component described earlier in the log to poll rabbitmq, and that the Keda's polling interval does not corresponds to seconds (which is highly unlikely as it's written in the docs that this polling intertval is given in seconds), I don't know at all what's going on with all these connections.
The following section contains the yamls of all the components that might be involved with this problem (keda and rabbitmq) :
rabbitMQ Replica Count.yaml
apiVersion: v1
kind: ReplicationController
metadata:
labels:
component: rabbitmq
name: rabbitmq-controller
spec:
replicas: 1
template:
metadata:
labels:
app: taskQueue
component: rabbitmq
spec:
containers:
- image: rabbitmq:3.11.5-management
name: rabbitmq
ports:
- containerPort: 5672
name: amqp
- containerPort: 15672
name: http
resources:
limits:
cpu: 1
memory: 1Gi
rabbitMQ Service.yaml
apiVersion: v1
kind: Service
metadata:
labels:
component: rabbitmq
name: rabbitmq-service
spec:
type: LoadBalancer
ports:
- port: 5672
targetPort: 5672
name: amqp
- port: 15672
targetPort: 15672
name: http
selector:
app: taskQueue
component: rabbitmq
keda JobScaler, Secret and TriggerAuthentication (sample data is just a replacement for fields that I do not want to be revealed :) ):
apiVersion: v1
kind: Secret
metadata:
name: keda-rabbitmq-secret
data:
host: sample-host # base64 encoded value of format amqp://guest:password#localhost:5672/vhost
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: keda-trigger-auth-rabbitmq-conn
namespace: default
spec:
secretTargetRef:
- parameter: host
name: keda-rabbitmq-secret
key: host
---
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
name: builder-job-scaler
namespace: default
spec:
jobTargetRef:
parallelism: 1
completions: 1
activeDeadlineSeconds: 600
backoffLimit: 5
template:
spec:
volumes:
- name: shared-storage
emptyDir: {}
initContainers:
- name: sourcesfetcher
image: sample image
volumeMounts:
- name: shared-storage
mountPath: /mnt/shared
env:
- name: SHARED_STORAGE_MOUNT_POINT
value: /mnt/shared
- name: RABBITMQ_ENDPOINT
value: sample host
- name: RABBITMQ_QUEUE_NAME
value: buildOrders
containers:
- name: builder
image: sample image
volumeMounts:
- name: shared-storage
mountPath: /mnt/shared
env:
- name: SHARED_STORAGE_MOUNT_POINT
value: /mnt/shared
- name: MINIO_ENDPOINT
value: sample endpoint
- name: MINIO_PORT
value: sample port
- name: MINIO_USESSL
value: "false"
- name: MINIO_ROOT_USER
value: sample user
- name: MINIO_ROOT_PASSWORD
value: sampel password
- name: BUCKET_NAME
value: "hex"
- name: SERVER_NAME
value: sample url
resources:
requests:
cpu: 500m
memory: 512Mi
limits:
cpu: 500m
memory: 512Mi
restartPolicy: OnFailure
pollingInterval: 1
maxReplicaCount: 2
minReplicaCount: 0
rollout:
strategy: gradual
triggers:
- type: rabbitmq
metadata:
protocol: amqp
queueName: buildOrders
mode: QueueLength
value: "1"
authenticationRef:
name: keda-trigger-auth-rabbitmq-conn
Any help would very much appreciated!
I am trying to run a test pod with OpenShift CLI:
$oc run nginx --image=nginx --limits=cpu=2,memory=4Gi
deploymentconfig.apps.openshift.io/nginx created
$oc describe deploymentconfig.apps.openshift.io/nginx
Name: nginx
Namespace: myproject
Created: 12 seconds ago
Labels: run=nginx
Annotations: <none>
Latest Version: 1
Selector: run=nginx
Replicas: 1
Triggers: Config
Strategy: Rolling
Template:
Pod Template:
Labels: run=nginx
Containers:
nginx:
Image: nginx
Port: <none>
Host Port: <none>
Limits:
cpu: 2
memory: 4Gi
Environment: <none>
Mounts: <none>
Volumes: <none>
Deployment #1 (latest):
Name: nginx-1
Created: 12 seconds ago
Status: New
Replicas: 0 current / 0 desired
Selector: deployment=nginx-1,deploymentconfig=nginx,run=nginx
Labels: openshift.io/deployment-config.name=nginx,run=nginx
Pods Status: 0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal DeploymentCreated 12s deploymentconfig-controller Created new replication controller "nginx-1" for version 1
Warning FailedCreate 1s (x12 over 12s) deployer-controller Error creating deployer pod: pods "nginx-1-deploy" is forbidden: failed quota: quota-svc-myproject: must specify limits.cpu,limits.memory
I get "must specify limits.cpu,limits.memory" error, despite both limits being present in the same describe output.
What might be the problem and how do I fix it?
I found a solution!
Part of the error message was "Error creating deployer pod". It means that the problem is not with my pod, but with the deployer pod which performs my pod deployment.
It seems the quota in my project affects deployer pods as well.
I couldn't find a way to set deployer pod limits with CLI, so I've made a DeploymentConfig.
kind: "DeploymentConfig"
apiVersion: "v1"
metadata:
name: "test-app"
spec:
template:
metadata:
labels:
name: "test-app"
spec:
containers:
- name: "test-app"
image: "nginxinc/nginx-unprivileged"
resources:
limits:
cpu: "2000m"
memory: "20Gi"
ports:
- containerPort: 8080
protocol: "TCP"
replicas: 1
selector:
name: "test-app"
triggers:
- type: "ConfigChange"
- type: "ImageChange"
imageChangeParams:
automatic: true
containerNames:
- "test-app"
from:
kind: "ImageStreamTag"
name: "nginx-unprivileged:latest"
strategy:
type: "Rolling"
resources:
limits:
cpu: "2000m"
memory: "20Gi"
A you can see, two sets of limitations are specified here: for container and for deployment strategy.
With this configuration it worked fine!
Looks like you have specified resource quota and the values you specified for limits seems to be larger than that. Can you describe the resource quota oc describe quota quota-svc-myproject and adjust your configs accordingly.
A good reference could be https://docs.openshift.com/container-platform/3.11/dev_guide/compute_resources.html
I have an existing kubernetes deployment which is running fine. Now I want to edit it with some new environment variables which I will use in the pod.
Editing a deployment will delete and create new pod or it will update the existing pod.
My requirement is I want to create a new pod whenever I edit/update the deployment.
Kubernetes is always going to recreate your pods in case you change/create env vars.
Lets check this together creating a deployment without any env var on it:
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
Let's check and note these pod names so we can compare later:
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
nginx-deployment-56db997f77-9mpjx 1/1 Running 0 8s
nginx-deployment-56db997f77-mgdv9 1/1 Running 0 8s
nginx-deployment-56db997f77-zg96f 1/1 Running 0 8s
Now let's edit this deployment and include one env var making the manifest look like this:
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx
env:
- name: STACK_GREETING
value: "Hello from the MARS"
ports:
- containerPort: 80
After we finish the edition, let's check our pod names and see if it changed:
$ kubectl get pod
nginx-deployment-5b4b68cb55-9ll7p 1/1 Running 0 25s
nginx-deployment-5b4b68cb55-ds9kb 1/1 Running 0 23s
nginx-deployment-5b4b68cb55-wlqgz 1/1 Running 0 21s
As we can see, all pod names changed. Let's check if our env var got applied:
$ kubectl exec -ti nginx-deployment-5b4b68cb55-9ll7p -- sh -c 'echo $STACK_GREETING'
Hello from the MARS
The same behavior will occur if you change the var or even remove it. All pods need to be removed and created again for the changes to take place.
If you would like to create a new pod, then you need to create a new deployment for that. By design deployments are managing the replicas of pods that belong to them.
was hoping to get a little help, my Google-Fu didnt get me much closer. I'm trying to install the metrics server for my fedora-coreos kubernetes 4 node cluster like so:
kubectl apply -f deploy/kubernetes/
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
the service seems to never start
kubectl describe apiservice v1beta1.metrics.k8s.io
Name: v1beta1.metrics.k8s.io
Namespace:
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"apiregistration.k8s.io/v1beta1","kind":"APIService","metadata":{"annotations":{},"name":"v1beta1.metrics.k8s.io"},"spec":{"...
API Version: apiregistration.k8s.io/v1
Kind: APIService
Metadata:
Creation Timestamp: 2020-03-04T16:53:33Z
Resource Version: 1611816
Self Link: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.metrics.k8s.io
UID: 65d9a56a-c548-4d7e-a647-8ce7a865a266
Spec:
Group: metrics.k8s.io
Group Priority Minimum: 100
Insecure Skip TLS Verify: true
Service:
Name: metrics-server
Namespace: kube-system
Port: 443
Version: v1beta1
Version Priority: 100
Status:
Conditions:
Last Transition Time: 2020-03-04T16:53:33Z
Message: failing or missing response from https://10.3.230.59:443/apis/metrics.k8s.io/v1beta1: bad status from https://10.3.230.59:443/apis/metrics.k8s.io/v1beta1: 403
Reason: FailedDiscoveryCheck
Status: False
Type: Available
Events: <none>
Diagnosing I have found googling around:
kubectl get deploy,svc -n kube-system |egrep metrics-server
deployment.apps/metrics-server 1/1 1 1 8m7s
service/metrics-server ClusterIP 10.3.230.59 <none> 443/TCP 8m7s
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
Error from server (ServiceUnavailable): the server is currently unable to handle the request
kubectl get all --all-namespaces | grep -i metrics-server
kube-system pod/metrics-server-75b5d446cd-zj4jm 1/1 Running 0 9m11s
kube-system service/metrics-server ClusterIP 10.3.230.59 <none> 443/TCP 9m11s
kube-system deployment.apps/metrics-server 1/1 1 1 9m11s
kube-system replicaset.apps/metrics-server-75b5d446cd 1 1 1 9m11s
kubectl logs -f metrics-server-75b5d446cd-zj4jm -n kube-system
I0304 16:53:36.475657 1 serving.go:312] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
W0304 16:53:38.229267 1 authentication.go:296] Cluster doesn't provide requestheader-client-ca-file in configmap/extension-apiserver-authentication in kube-system, so request-header client certificate authentication won't work.
I0304 16:53:38.267760 1 secure_serving.go:116] Serving securely on [::]:4443
kubectl get -n kube-system deployment metrics-server -o yaml | grep -i args -A 10
{"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"labels":{"k8s-app":"metrics-server"},"name":"metrics-server","namespace":"kube-system"},"spec":{"selector":{"matchLabels":{"k8s-app":"metrics-server"}},"template":{"metadata":{"labels":{"k8s-app":"metrics-server"},"name":"metrics-server"},"spec":{"containers":[{"args":["--cert-dir=/tmp","--secure-port=4443","--kubelet-insecure-tls","--kubelet-preferred-address-types=InternalIP"],"image":"k8s.gcr.io/metrics-server-amd64:v0.3.6","imagePullPolicy":"IfNotPresent","name":"metrics-server","ports":[{"containerPort":4443,"name":"main-port","protocol":"TCP"}],"securityContext":{"readOnlyRootFilesystem":true,"runAsNonRoot":true,"runAsUser":1000},"volumeMounts":[{"mountPath":"/tmp","name":"tmp-dir"}]}],"nodeSelector":{"beta.kubernetes.io/os":"linux","kubernetes.io/arch":"amd64"},"serviceAccountName":"metrics-server","volumes":[{"emptyDir":{},"name":"tmp-dir"}]}}}}
creationTimestamp: "2020-03-04T16:53:33Z"
generation: 1
labels:
k8s-app: metrics-server
name: metrics-server
namespace: kube-system
resourceVersion: "1611810"
selfLink: /apis/apps/v1/namespaces/kube-system/deployments/metrics-server
uid: 006e758e-bd33-47d7-8378-d3a8081ee8a8
spec:
--
- args:
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
image: k8s.gcr.io/metrics-server-amd64:v0.3.6
imagePullPolicy: IfNotPresent
name: metrics-server
ports:
- containerPort: 4443
name: main-port
finally my deployment config:
spec:
selector:
matchLabels:
k8s-app: metrics-server
template:
metadata:
name: metrics-server
labels:
k8s-app: metrics-server
spec:
serviceAccountName: metrics-server
volumes:
# mount in tmp so we can safely use from-scratch images and/or read-only containers
- name: tmp-dir
emptyDir: {}
containers:
- name: metrics-server
image: k8s.gcr.io/metrics-server-amd64:v0.3.6
command:
- /metrics-server
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
args:
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
ports:
- name: main-port
containerPort: 4443
protocol: TCP
securityContext:
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
imagePullPolicy: IfNotPresent
volumeMounts:
- name: tmp-dir
mountPath: /tmp
hostNetwork: true
nodeSelector:
beta.kubernetes.io/os: linux
kubernetes.io/arch: "amd64"
I'm at a loss of what it could be getting the metrics service to start and just get the basic kubectl top node to display any info all I get is
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
I have searched the internet and tried adding the args: and command: lines but no luck
command:
- /metrics-server
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
args:
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
Can anyone shed light on how to fix this? Thanks
Pastebin log file
Log File
I've reproduced your issue. I have used Calico as CNI.
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
fedora-master Ready master 6m27s v1.17.3
fedora-worker-1 Ready <none> 4m48s v1.17.3
fedora-worker-2 Ready <none> 4m46s v1.17.3
fedora-master:~/metrics-server$ kubectl describe apiservice v1beta1.metrics.k8s.io
Status:
Conditions:
Last Transition Time: 2020-03-12T16:04:59Z
Message: failing or missing response from https://10.99.122.196:443/apis/metrics.k8s.io/v
1beta1: Get https://10.99.122.196:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting
for connection (Client.Timeout exceeded while awaiting headers)
fedora-master:~/metrics-server$ kubectl top pod
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
When you have only one node in cluster, default settings in metrics-server repo works correctly. Issue occurs when you have more than 2 nodes. Ive used 1 master and 2 workers to reproduce. Below example deployment which works correct (have all required args). Before, please remove your current metrics-server YAMLs (kubectl delete -f deploy/kubernetes) and execute:
$ git clone https://github.com/kubernetes-sigs/metrics-server
$ cd metrics-server/deploy/kubernetes/
$ vi metrics-server-deployment.yaml
Paste below YAML:
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: metrics-server
namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: metrics-server
namespace: kube-system
labels:
k8s-app: metrics-server
spec:
selector:
matchLabels:
k8s-app: metrics-server
template:
metadata:
name: metrics-server
labels:
k8s-app: metrics-server
spec:
serviceAccountName: metrics-server
volumes:
# mount in tmp so we can safely use from-scratch images and/or read-only containers
- name: tmp-dir
emptyDir: {}
hostNetwork: true
containers:
- name: metrics-server
image: k8s.gcr.io/metrics-server-amd64:v0.3.6
imagePullPolicy: IfNotPresent
args:
- /metrics-server
- --kubelet-preferred-address-types=InternalIP
- --kubelet-insecure-tls
- --cert-dir=/tmp
- --secure-port=4443
ports:
- name: main-port
containerPort: 4443
protocol: TCP
securityContext:
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
volumeMounts:
- name: tmp-dir
mountPath: /tmp
nodeSelector:
kubernetes.io/os: linux
kubernetes.io/arch: "amd64"
save and quit using :wq
$ cd ~/metrics-server
$ kubectl apply -f deploy/kubernetes/
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
Wait a while for metrics-server to gather a few metrics from nodes.
$ kubectl describe apiservice v1beta1.metrics.k8s.io
Name: v1beta1.metrics.k8s.io
Namespace:
...
Metadata:
Creation Timestamp: 2020-03-12T16:57:58Z
...
Spec:
Group: metrics.k8s.io
Group Priority Minimum: 100
Insecure Skip TLS Verify: true
Service:
Name: metrics-server
Namespace: kube-system
Port: 443
Version: v1beta1
Version Priority: 100
Status:
Conditions:
Last Transition Time: 2020-03-12T16:58:01Z
Message: all checks passed
Reason: Passed
Status: True
Type: Available
Events: <none>
after a few minutes you can use top.
$ kubectl top nodes
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
fedora-master 188m 9% 1315Mi 17%
fedora-worker-1 109m 5% 982Mi 13%
fedora-worker-2 84m 4% 969Mi 13%
If you will still encounter some issues, please add - --v=6 to deployment and provide logs from metrics-server pod.
containers:
- name: metrics-server
image: k8s.gcr.io/metrics-server-amd64:v0.3.1
args:
- /metrics-server
- --v=6
- --kubelet-preferred-address-types=InternalIP
- --kubelet-insecure-tls
You need to carefully check logs for calico-node pods. In my case i have some other network interfaces and the autodetection mechanism in calico was detecting wrong interface (ip address). You need to consult this documentation https://projectcalico.docs.tigera.io/reference/node/configuration.
What i did in my case, was simply:
kubectl set env daemonset/calico-node -n kube-system IP_AUTODETECTION_METHOD=cidr=172.16.8.0/24
cidr is my "working network". After this, all calico-nodes restarted and suddenly everything was fine.
OpenShift (and probably k8s, too) updates a deployment's existing environment variables and creates new ones when they were changed in the respective DeploymentConfig in a template file before applying it.
Is there a way to remove already existing environment variables if they are no longer specified in a template when you run oc apply?
There is a way to achieve what you need and for that you need to patch your objects. You need to use the patch type merge-patch+json and as a patch you need to supply a complete/desired list of env vars.
As an example lets consider this deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
name: mydeployment
labels:
app: sample
spec:
replicas: 2
selector:
matchLabels:
app: sample
template:
metadata:
labels:
app: sample
spec:
containers:
- name: envar-demo-container
image: gcr.io/google-samples/node-hello:1.0
env:
- name: VAR1
value: "Hello, I'm VAR1"
- name: VAR2
value: "Hey, VAR2 here. Don't kill me!"
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
mydeployment-db84d9bcc-jg8cb 1/1 Running 0 28s
mydeployment-db84d9bcc-mnf4s 1/1 Running 0 28s
$ kubectl exec -ti mydeployment-db84d9bcc-jg8cb -- env | grep VAR
VAR1=Hello, I'm VAR1
VAR2=Hey, VAR2 here. Don't kill me!
Now, to remove VAR2 we have to export our yaml deployment:
$ kubectl get deployments mydeployment -o yaml --export > patch-file.yaml
Edit this file and remove VAR2 entry:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
spec:
progressDeadlineSeconds: 600
replicas: 2
revisionHistoryLimit: 10
selector:
matchLabels:
app: sample
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app: sample
spec:
containers:
- env:
- name: VAR1
value: Hello, I'm VAR1
image: gcr.io/google-samples/node-hello:1.0
imagePullPolicy: IfNotPresent
name: patch-demo-ctr
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 30
status: {}
Now we need to patch it with the following command:
$ kubectl patch deployments mydeployment --type merge --patch "$(cat patch-file.yaml)"
deployment.extensions/mydeployment patched
Great, If we check our pods we can see that we have 2 new pods and the old ones are being terminated:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
mydeployment-8484d6887-dvdnc 1/1 Running 0 5s
mydeployment-8484d6887-xzkhb 1/1 Running 0 3s
mydeployment-db84d9bcc-jg8cb 1/1 Terminating 0 5m33s
mydeployment-db84d9bcc-mnf4s 1/1 Terminating 0 5m33s
Now, if we check the new pods, we can see they have only VAR1:
$ kubectl exec -ti mydeployment-8484d6887-dvdnc -- env | grep VAR
VAR1=Hello, I'm VAR1