nodeSelector not working 1.2.0 - kubernetes

apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - image: busybox
    command:
    - sleep
    - "3600"
    imagePullPolicy: IfNotPresent
    name: busybox
  restartPolicy: Always
  nodeSelector:
    docker: twelve
On the kube master:
kubectl get nodes -l docker=twelve
NAME LABELS STATUS AGE
10.10.1.4 docker=twelve,kubernetes.io/hostname=10.10.1.4 Ready 115d
10.10.1.5 docker=twelve,kubernetes.io/hostname=10.10.1.5 Ready 115d
From the event log:
4m 17s 20 busybox Pod FailedScheduling {scheduler } Failed for reason MatchNodeSelector and possibly others
If I remove the nodeSelector, it deploys w/o issue.
I am trying to handle docker 1.9.1 and docker 1.12.1 for various teams and this is preventing it.
This is a kube cluster on CentOS 7.2.1511 servers.
kubectl version
Client Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.0", GitCommit:"86327329213fed4af2661c5ae1e92f9956b24f55", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.0", GitCommit:"86327329213fed4af2661c5ae1e92f9956b24f55", GitTreeState:"clean"}

Restarting the kube-scheduler fixed the issue.
I guess when you add a new label to a node, you need to restart the scheduler.
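For anyone hitting the same thing, here is a rough sequence reflecting what worked here; it assumes the label values from the question and a scheduler running as a systemd unit on the master (typical for a 1.2 install on CentOS), so adjust for your setup:
# apply (or re-apply) the label to both nodes
kubectl label nodes 10.10.1.4 10.10.1.5 docker=twelve --overwrite
# confirm the selector matches
kubectl get nodes -l docker=twelve
# restart the scheduler so it re-reads node labels
systemctl restart kube-scheduler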

Related

Manually trigger kubernetes cronjob fails

My k8s version:
kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.2", GitCommit:"8b5a19147530eaac9476b0ab82980b4088bbc1b2", GitTreeState:"clean", BuildDate:"2021-09-15T21:38:50Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.7-eks-d88609", GitCommit:"d886092805d5cc3a47ed5cf0c43de38ce442dfcb", GitTreeState:"clean", BuildDate:"2021-07-31T00:29:12Z", GoVersion:"go1.15.12", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.22) and server (1.20) exceeds the supported minor version skew of +/-1
My cronjob file:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  labels:
    app: git-updater
  name: git-updater
  namespace: my-ns
spec:
  concurrencyPolicy: Replace
  failedJobsHistoryLimit: 2
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            app: git-updater
          name: git-updater
        spec:
          containers:
          - args:
            - sh
            - -c
            - apk add --no-cache git openssh-client && cd /repos && ls -d */*.git/
              | while read dir; do cd /repos/$dir; git gc; git fetch --prune; done
            image: alpine
            name: updater
            volumeMounts:
            - mountPath: test
              name: test
            - mountPath: test
              name: test
          restartPolicy: Never
          volumes:
          - persistentVolumeClaim:
              claimName: pvc
            name: repos
  schedule: '#daily'
  successfulJobsHistoryLimit: 4
...
When I create the job from file, all goes well:
kubectl -n my-ns create -f git-updater.yaml
cronjob.batch/git-updater created
But I would like to trigger it manually just for testing purposes, so I do:
kubectl -n my-ns create job --from=cronjob/git-updater test-job
error: unknown object type *v1beta1.CronJob
Which is strange because I have just been able to create it from file.
In another similar post it was suggested to switch from apiVersion batch/v1beta1 to batch/v1... but when I do so, I get:
kubectl -n my-ns create -f git-updater.yaml
error: unable to recognize "git-updater.yaml": no matches for kind "CronJob" in version "batch/v1"
I am a little stuck here, how can I test my newly and successfully created CronJob?
That's your v1.20 cluster version causing this.
The short answer is that you should upgrade the cluster to 1.21, where CronJobs work more or less stably.
Check
no matches for kind "CronJob" in version "batch/v1", especially this comment:
One thing is the API version; another is the version in which the
resource you are creating is available. On 1.19.x you do have
batch/v1 as an available API group, but you don't have the CronJob
resource under it.
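A quick way to see what your own cluster actually serves (standard kubectl commands; the exact output depends on your server version):
# list the batch API versions the server exposes
kubectl api-versions | grep batch
# list the resources served under the batch group
kubectl api-resources --api-group=batch
On a 1.20 server you would typically see cronjobs listed only under batch/v1beta1; batch/v1 CronJobs arrive in 1.21.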
Kubernetes Create Job from Cronjob not working
I have a 1.20.9 GKE cluster and hit the same issue as you:
$kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.2", GitCommit:"8b5a19147530eaac9476b0ab82980b4088bbc1b2", GitTreeState:"clean", BuildDate:"2021-09-15T21:38:50Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.9-gke.1001", GitCommit:"1fe18c314ed577f6047d2712a9d1c8e498e22381", GitTreeState:"clean", BuildDate:"2021-08-23T23:06:28Z", GoVersion:"go1.15.13b5", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.22) and server (1.20) exceeds the supported minor version skew of +/-1
cronjob.batch/git-updater created
$kubectl -n my-ns create job --from=cronjob/git-updater test-job
error: unknown object type *v1beta1.CronJob
At the same time, your CronJob YAML works perfectly with apiVersion: batch/v1 on a 1.21 GKE cluster.
$kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.2", GitCommit:"8b5a19147530eaac9476b0ab82980b4088bbc1b2", GitTreeState:"clean", BuildDate:"2021-09-15T21:38:50Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.3-gke.2001", GitCommit:"77cdf52c2a4e6c1bbd8ae4cd75c864e9abf4c79e", GitTreeState:"clean", BuildDate:"2021-08-20T18:38:46Z", GoVersion:"go1.16.6b7", Compiler:"gc", Platform:"linux/amd64"}
$kubectl -n my-ns create -f cronjob.yaml
cronjob.batch/git-updater created
$ kubectl -n my-ns create job --from=cronjob/git-updater test-job
job.batch/test-job created
$kubectl describe job test-job -n my-ns
Name: test-job
Namespace: my-ns
Selector: controller-uid=ef8ab972-cff1-4889-ae11-60c67a374660
Labels: app=git-updater
controller-uid=ef8ab972-cff1-4889-ae11-60c67a374660
job-name=test-job
Annotations: cronjob.kubernetes.io/instantiate: manual
Parallelism: 1
Completions: 1
Start Time: Tue, 02 Nov 2021 17:54:38 +0000
Pods Statuses: 1 Running / 0 Succeeded / 0 Failed
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 16m job-controller Created pod: test-job-vcxx9
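To check that the manually triggered run actually did what the schedule would do, you can tail the job's pod and clean it up afterwards (standard kubectl; the job name is the one created above):
# follow the logs of the pod created by the job
kubectl -n my-ns logs job/test-job -f
# remove the test job (and its pod) when done
kubectl -n my-ns delete job test-job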

metrics-service in kubernetes not working

I'm running Kubernetes on an EC2 machine on AWS.
The node runs Ubuntu.
My metrics-server version:
wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.7/components.yaml
components.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  serviceAccountName: metrics-server
  volumes:
  # mount in tmp so we can safely use from-scratch images and/or read-only containers
  - name: tmp-dir
    emptyDir: {}
  containers:
  - name: metrics-server
    image: k8s.gcr.io/metrics-server/metrics-server:v0.3.7
    imagePullPolicy: IfNotPresent
    args:
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-preferred-address-type=InternalIP,ExternalIP,Hostname
    - --kubelet-insecure-tls
Even after adding the args, the error still appears.
Error:
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
or
error: metrics not available yet
No matter how long I wait, that error appears.
My kops version: 1.18.0 (git-698bf974d8)
I use Calico networking.
Please help...
++
I tried wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.5.0/components.yaml
and viewed the logs:
kubectl logs -n kube-system deploy/metrics-server
"Failed to scrape node" err="GET "https://172.20.51.226:10250/stats/summary?only_cpu_and_memory=true": bad status code "401 Unauthorized"" node="ip-172-20-51-226.ap-northeast-2.compute.internal"
"Failed probe" probe="metric-storage-ready" err="not metrics to serve"
Download the components.yaml file manually:
wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Then edit the args section under Deployment:
spec:
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=443
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
    - --metric-resolution=15s
and add two more lines there:
    - --kubelet-insecure-tls=true
    - --kubelet-preferred-address-types=InternalIP
The kubelet's port 10250 uses HTTPS, and the connection needs to be verified by a TLS certificate. Adding --kubelet-insecure-tls tells the metrics server not to verify the kubelet's serving certificate.
After this modification, just apply the manifest:
kubectl apply -f components.yaml
Wait a minute and you will see the metrics-server pod come up.
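To confirm the change took effect, a few standard checks (assuming the APIService is registered under its usual name, v1beta1.metrics.k8s.io):
# wait for the patched Deployment to finish rolling out
kubectl -n kube-system rollout status deployment/metrics-server
# the aggregated metrics API should report Available=True
kubectl get apiservice v1beta1.metrics.k8s.io
# kubectl top should start returning data shortly after
kubectl top nodes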
The last comment is useful. You can also edit the Deployment directly; adding the line --kubelet-insecure-tls=true was enough for me.
Edit the Deployment:
$ kubectl edit deployment.apps/metrics-server -n kube-system
Add the line:
- --kubelet-insecure-tls=true
Similar result:
containers:
- args:
  - --cert-dir=/tmp
  - --secure-port=4443
  - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
  - --kubelet-use-node-status-port
  - --metric-resolution=15s
  - --kubelet-insecure-tls=true
And save with ":wq" and enjoy.
~$ kubectl top pods -n kube-system
NAME CPU(cores) MEMORY(bytes)
coredns-6d4b75cb6d-k8dmc 3m 18Mi
coredns-6d4b75cb6d-wxxn6 3m 17Mi
kube-apiserver-k8s-master1 82m 306Mi
kube-apiserver-k8s-master2 65m 247Mi
kube-controller-manager-k8s-master1 32m 47Mi
kube-controller-manager-k8s-master2 4m 19Mi
kube-proxy-9dbgk 1m 9Mi
kube-proxy-bwhdm 1m 14Mi
kube-proxy-fz8v8 1m 15Mi
kube-proxy-vcnrc 1m 9Mi
kube-scheduler-k8s-master1 7m 18Mi
kube-scheduler-k8s-master2 4m 16Mi
metrics-server-79576f7ff-97tpc 6m 15Mi
metrics-server-79576f7ff-qzczp 4m 13Mi
~$ kubectl top nodes
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-master1 318m 15% 1047Mi 55%
k8s-master2 208m 10% 1002Mi 52%
k8s-worker1 30m 3% 804Mi 42%
k8s-worker2 35m 3% 550Mi 29%

Kubernetes job and deployment

Can I run a Job and a Deployment in a single config file/action,
where the Deployment will wait for the Job to finish and check that it was successful, so it can continue with the deployment?
Based on the information you provided I believe you can achieve your goal using a Kubernetes feature called InitContainer:
Init containers are exactly like regular containers, except:
Init containers always run to completion.
Each init container must complete successfully before the next one starts.
If a Pod’s init container fails, Kubernetes repeatedly restarts the Pod until the init container succeeds. However, if the Pod has a restartPolicy of Never, Kubernetes does not restart the Pod.
I'll create an initContainer with a busybox image that runs a Linux command to wait for the service mydb to be running before proceeding with the deployment.
Steps to Reproduce:
- Create a Deployment with an initContainer which will run the job that needs to be completed before doing the deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: my-app
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      run: my-app
  template:
    metadata:
      labels:
        run: my-app
    spec:
      restartPolicy: Always
      containers:
      - name: myapp-container
        image: busybox:1.28
        command: ['sh', '-c', 'echo The app is running! && sleep 3600']
      initContainers:
      - name: init-mydb
        image: busybox:1.28
        command: ['sh', '-c', "until nslookup mydb.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for mydb; sleep 2; done"]
Many kinds of commands can be used in this field; you just have to select a Docker image that contains the binary you need (including your sequelize job).
Now let's apply it and see the status of the deployment:
$ kubectl apply -f my-app.yaml
deployment.apps/my-app created
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
my-app-6b4fb4958f-44ds7 0/1 Init:0/1 0 4s
my-app-6b4fb4958f-s7wmr 0/1 Init:0/1 0 4s
The pods are held in Init:0/1 status, waiting for the init container to complete.
- Now let's create the service the initContainer is waiting for before completing its task:
apiVersion: v1
kind: Service
metadata:
  name: mydb
spec:
  ports:
  - protocol: TCP
    port: 80
    targetPort: 9377
We will apply it and monitor the changes in the pods:
$ kubectl apply -f mydb-svc.yaml
service/mydb created
$ kubectl get pods -w
NAME READY STATUS RESTARTS AGE
my-app-6b4fb4958f-44ds7 0/1 Init:0/1 0 91s
my-app-6b4fb4958f-s7wmr 0/1 Init:0/1 0 91s
my-app-6b4fb4958f-s7wmr 0/1 PodInitializing 0 93s
my-app-6b4fb4958f-44ds7 0/1 PodInitializing 0 94s
my-app-6b4fb4958f-s7wmr 1/1 Running 0 94s
my-app-6b4fb4958f-44ds7 1/1 Running 0 95s
^C
$ kubectl get all
NAME READY STATUS RESTARTS AGE
pod/my-app-6b4fb4958f-44ds7 1/1 Running 0 99s
pod/my-app-6b4fb4958f-s7wmr 1/1 Running 0 99s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/mydb ClusterIP 10.100.106.67 <none> 80/TCP 14s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/my-app 2/2 2 2 99s
NAME DESIRED CURRENT READY AGE
replicaset.apps/my-app-6b4fb4958f 2 2 2 99s
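If you want to see what the init container was doing while the pods sat in Init:0/1, you can tail its logs (pod name taken from the output above; -c selects the init container):
kubectl logs my-app-6b4fb4958f-44ds7 -c init-mydb -f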
If you need help applying this to your environment, let me know.
Although initContainers are a viable option for this solution, there is another if you use helm to manage and deploy to your cluster.
Helm has chart hooks that allow you to run a Job before other installations in the helm chart occur. You mentioned that this is for a database migration before a service deployment. Some example helm config to get this done could be...
apiVersion: batch/v1
kind: Job
metadata:
  name: api-migration-job
  namespace: default
  labels:
    app: api-migration-job
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "-1"
    "helm.sh/hook-delete-policy": before-hook-creation
spec:
  template:
    spec:
      containers:
      - name: platform-migration
        ...
This will run the job to completion before moving on to the installation / upgrade phases in the helm chart. You can see there is a 'hook-weight' variable that allows you to order these hooks if you desire.
This in my opinion is a more elegant solution than init containers, and allows for better control.
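For completeness, a hedged usage sketch: the hook Job runs to completion on every install/upgrade before the rest of the chart is applied (the release and chart names here are placeholders):
# the pre-install/pre-upgrade hook runs before other chart resources
helm upgrade --install my-release ./my-chart --namespace default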

Kubernetes Deployment Hanging

Following the Deployment example in the docs, I'm trying to deploy the example nginx with the following config:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
So far, the deployment always hangs. I tried to see if for any reason I needed a pod named nginx to be deployed already. That didn't solve the problem.
$ sudo kubectl get deployments
NAME UPDATEDREPLICAS AGE
nginx-deployment 0/3 34m
$ sudo kubectl describe deployments
Name: nginx-deployment
Namespace: default
CreationTimestamp: Sat, 30 Jan 2016 06:03:47 +0000
Labels: app=nginx
Selector: app=nginx
Replicas: 0 updated / 3 total
StrategyType: RollingUpdate
RollingUpdateStrategy: 1 max unavailable, 1 max surge, 0 min ready seconds
OldReplicationControllers: nginx (2/2 replicas created)
NewReplicationController: <none>
No events.
When I check the events from Kubernetes, I see no events belonging to this deployment. Has anyone experienced this before?
The versions are as followed:
Client Version: version.Info{Major:"1", Minor:"1", GitVersion:"v1.1.3", GitCommit:"6a81b50c7e97bbe0ade075de55ab4fa34f049dc2", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"1", GitVersion:"v1.1.3", GitCommit:"6a81b50c7e97bbe0ade075de55ab4fa34f049dc2", GitTreeState:"clean"}
If the deployment is not creating any pods, you could have a look at the events; an error might be reported there, for example:
kubectl get events --all-namespaces
NAMESPACE LASTSEEN FIRSTSEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
default 8m 2d 415 wordpress Ingress Normal Service loadbalancer-controller no user specified default backend, using system default
kube-lego 2m 8h 49 kube-lego-7c66c7fddf ReplicaSet Warning FailedCreate replicaset-controller Error creating: pods "kube-lego-7c66c7fddf-" is forbidden: service account kube-lego/kube-lego2-kube-lego was not found, retry after the service account is created
Also have a look at kubectl get rs --all-namespaces.
I found an answer on the issues page:
In order to get Deployments to work after you enable the feature and restart the kube-apiserver, you must also restart the kube-controller-manager.
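On a 1.1/1.2-era cluster where the control plane components run as systemd units, that restart would look roughly like the following (unit names vary by install method, so treat these as placeholders):
systemctl restart kube-apiserver
systemctl restart kube-controller-manager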
You can check what is wrong with kubectl describe pod <name_of_your_pod>.

No nodes available to schedule pods, using google container engine

I'm having an issue where a container I'd like to run doesn't appear to be getting started on my cluster.
I've tried searching around for possible solutions, but there's surprisingly little information out there to help with this issue or anything of its nature.
Here's the most I could gather:
$ kubectl describe pods/elasticsearch
Name: elasticsearch
Namespace: default
Image(s): my.image.host/my-project/elasticsearch
Node: /
Labels: <none>
Status: Pending
Reason:
Message:
IP:
Replication Controllers: <none>
Containers:
elasticsearch:
Image: my.image.host/my-project/elasticsearch
Limits:
cpu: 100m
State: Waiting
Ready: False
Restart Count: 0
Events:
FirstSeen LastSeen Count From SubobjectPath Reason Message
Mon, 19 Oct 2015 10:28:44 -0500 Mon, 19 Oct 2015 10:34:09 -0500 12 {scheduler } failedScheduling no nodes available to schedule pods
I also see this:
$ kubectl get pod elasticsearch -o wide
NAME READY STATUS RESTARTS AGE NODE
elasticsearch 0/1 Pending 0 5s
I guess I'd like to know: What prerequisites exist so that I can be confident that my container is going to run in container engine? What do I need to do in this scenario to get it running?
Here's my yml file:
apiVersion: v1
kind: Pod
metadata:
  name: elasticsearch
spec:
  containers:
  - name: elasticsearch
    image: my.image.host/my-project/elasticsearch
    ports:
    - containerPort: 9200
    resources:
    volumeMounts:
    - name: elasticsearch-data
      mountPath: /usr/share/elasticsearch
  volumes:
  - name: elasticsearch-data
    gcePersistentDisk:
      pdName: elasticsearch-staging
      fsType: ext4
Here's some more output about my node:
$ kubectl get nodes
NAME LABELS STATUS
gke-elasticsearch-staging-00000000-node-yma3 kubernetes.io/hostname=gke-elasticsearch-staging-00000000-node-yma3 NotReady
You only have one node in your cluster and its status is NotReady, so you won't be able to schedule any pods. You can try to determine why your node isn't ready by looking in /var/log/kubelet.log. You can also add new nodes to your cluster (scale the cluster size up to 2) or delete the node (it will be automatically replaced by the instance group manager) to see if either of those options gets you a working node.
It appears that the scheduler couldn't see any nodes in your cluster. You can run kubectl get nodes and gcloud compute instances list to confirm whether you have any nodes in the cluster. Did you correctly specify the number of nodes (--num-nodes) when creating the cluster?
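Two follow-ups you can run from your workstation (cluster and node names taken from the output above; add your zone/project flags as needed, and note that older gcloud releases used --size instead of --num-nodes):
# see why the node is NotReady (conditions, kubelet messages)
kubectl describe node gke-elasticsearch-staging-00000000-node-yma3
# grow the node pool to two nodes
gcloud container clusters resize elasticsearch-staging --num-nodes=2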