How to configure an Airflow master node with Celery

I'm trying to modify an existing airflow-celery cluster to make the scheduler a dedicated master node that doesn't run jobs. (Currently, all of the nodes are functioning as workers; I want to prevent the scheduler from being a worker.)
How would I accomplish this?

If you are deploying with Docker Swarm using this Airflow distribution, you can specify which kind of node each service is deployed on.
For example, you can force the scheduler onto a manager node:
scheduler:
  image: puckel/docker-airflow:1.10.1
  deploy:
    replicas: 1
    placement:
      constraints: [node.role==manager]
And the worker service onto worker nodes:
worker:
  image: puckel/docker-airflow:1.10.1
  deploy:
    replicas: 3
    placement:
      constraints: [node.role==worker]
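Assuming these services are part of a complete compose file saved as docker-compose.yml (the stack name airflow below is just an example), you would deploy with:

docker stack deploy -c docker-compose.yml airflow

Swarm then places the scheduler on a manager node and the workers on worker nodes according to the constraints.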

In our cluster, the UI and the scheduler are on the same node. This is our “master” node.
When we start up these two components we use the commands:
airflow scheduler
airflow webserver
On the worker nodes, you start them with
airflow worker
This keeps your processes separate.
If you need additional help, please edit your question and post your config file.
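For reference, a minimal airflow.cfg sketch for a CeleryExecutor setup; the broker and result-backend hosts below are placeholders, not taken from the question:

[core]
# Use Celery so tasks run on worker nodes, not on the scheduler
executor = CeleryExecutor

[celery]
# Placeholder hosts; point these at your actual broker and metadata DB
broker_url = redis://redis-host:6379/0
result_backend = db+postgresql://airflow:airflow@db-host/airflow

As long as you never start airflow worker on the master, the scheduler node will not pick up any tasks itself.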

Related

Is there a way to deploy a StatefulSet first in a cluster?

Is there a way to make a Kubernetes cluster deploy the StatefulSet first and then all other deployments?
I'm working in GKE and I have a Redis pod which I want up and ready first, because the other deployments depend on the connection to it.
You can use an init container in the other deployments. Because init containers run to completion before any app containers start, they offer a mechanism to block or delay app container startup until a set of preconditions is met.
The init container can run a script that performs a readiness check against the Redis pods.
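A minimal sketch of such a deployment, assuming the Redis service is exposed inside the cluster as redis on port 6379 (the app name and image are illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app            # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      initContainers:
      - name: wait-for-redis
        image: busybox
        # Block startup until the Redis service accepts TCP connections
        command: ['sh', '-c', 'until nc -z redis 6379; do echo waiting for redis; sleep 2; done']
      containers:
      - name: app
        image: my-app-image   # placeholder image

The app container only starts once the init container has exited successfully, i.e. once Redis is reachable.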

How to run two Kubernetes masters without worker nodes, with the second k8s replicating the first

How can I run two Kubernetes masters without worker nodes, where one is the k8s master and the other works as a slave?
You can find two solutions for Creating Highly Available clusters with kubeadm here.
It describes the steps to create two kinds of cluster:
Stacked control plane and etcd nodes
External etcd nodes
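A hedged sketch of the first (stacked) approach; the load-balancer endpoint, token, hash and certificate key are placeholders that kubeadm prints for you:

# On the first control-plane node
sudo kubeadm init --control-plane-endpoint "LOAD_BALANCER_DNS:6443" --upload-certs

# On the second control-plane node, using the join command printed above
sudo kubeadm join LOAD_BALANCER_DNS:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> \
    --control-plane --certificate-key <key>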
Additional resources:
Install and configure a multi-master Kubernetes cluster with kubeadm - HAProxy as a load balancer
kubernetes-the-hard-way
Hope this helps.

How to start with Kubernetes?

I have two IPs, one for a master node and one for a worker node, and I need to deploy some services using them. I don't know anything about Kubernetes: what are a master node and a worker node?
How do I start?
You should start from the very basics.
The Kubernetes concepts page is your starting point.
The Kubernetes Master is a collection of three processes that run on a single node in your cluster, which is designated as the master node. Those processes are: kube-apiserver, kube-controller-manager and kube-scheduler.
Each individual non-master node in your cluster runs two processes: kubelet, which communicates with the Kubernetes Master, and kube-proxy, a network proxy which reflects Kubernetes networking services on each node.
Regarding your question in the comment: read Organizing Cluster Access Using kubeconfig Files and make sure you have the kubeconfig file in the right place.
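As a quick sanity check, assuming a kubeadm-based setup where the admin kubeconfig is written to /etc/kubernetes/admin.conf:

mkdir -p $HOME/.kube
sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl config view    # confirm the kubeconfig is picked up
kubectl cluster-info   # verify connectivity to the master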

GKE does not scale to/from 0 when autoscaling enabled

I want to run a CronJob on my GKE in order to perform a batch operation on a daily basis. The ideal scenario would be for my cluster to scale to 0 nodes when the job is not running and to dynamically scale to 1 node and run the job on it every time the schedule is met.
I am first trying to achieve this using a simple CronJob from the Kubernetes docs that only prints the current time and terminates.
I first created a cluster with the following command:
gcloud container clusters create $CLUSTER_NAME \
    --enable-autoscaling \
    --min-nodes 0 --max-nodes 1 --num-nodes 1 \
    --zone $CLUSTER_ZONE
Then, I created a CronJob with the following description:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: Never
The job is scheduled to run every hour and to print the current time before terminating.
First, I wanted to create the cluster with 0 nodes, but setting --num-nodes 0 results in an error. Why is that? Note that I can manually scale the cluster down to 0 nodes after it has been created.
Second, if my cluster has 0 nodes, the job won't be scheduled because the cluster does not scale to 1 node automatically but instead gives the following error:
Cannot schedule pods: no nodes available to schedule pods.
Third, if my cluster has 1 node, the job runs normally, but afterwards the cluster won't scale down to 0 nodes and stays at 1 node instead. I let my cluster run for two successive jobs and it did not scale down in between. I assume one hour should be long enough for it to do so.
What am I missing?
EDIT: I've got it to work and detailed my solution here.
Update:
Note: Beginning with Kubernetes version 1.7, you can specify a minimum
size of zero for your node pool. This allows your node pool to scale
down completely if the instances within aren't required to run your
workloads.
https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler
Old answer:
Scaling the entire cluster to 0 is not supported, because you always need at least one node for system pods (see the docs).
You could create one node pool with a small machine for system pods, and an additional node pool with a big machine where you would run your workload. This way the second node pool can scale down to 0 and you still have space to run the system pods.
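A hedged sketch of that two-pool setup; the machine types and the pool name are illustrative:

# Small cluster (default pool) to host the system pods; it never scales to 0
gcloud container clusters create $CLUSTER_NAME \
    --machine-type g1-small --num-nodes 1 \
    --zone $CLUSTER_ZONE

# Separate pool for the workload; this one can scale down to 0 between jobs
gcloud container node-pools create workload-pool \
    --cluster $CLUSTER_NAME \
    --machine-type n1-standard-4 \
    --enable-autoscaling --min-nodes 0 --max-nodes 1 --num-nodes 1 \
    --zone $CLUSTER_ZONE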
After trying this, #xEc notes: also, there are scenarios in which the node pool wouldn't scale, such as when the pool was created with an initial size of 0 instead of 1.
Initial suggestion:
Perhaps you could run a micro VM with cron to scale the cluster up, submit a Job (instead of a CronJob), wait for it to finish and then scale it back down to 0?
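A sketch of that approach, assuming a Job manifest named job.yaml that creates a Job called hello, and that kubectl wait is available (on older gcloud versions the resize flag is --size instead of --num-nodes):

# Scale the cluster up, run the job, then scale back down
gcloud container clusters resize $CLUSTER_NAME --num-nodes 1 --zone $CLUSTER_ZONE --quiet
kubectl create -f job.yaml           # a Job, not a CronJob
kubectl wait --for=condition=complete job/hello --timeout=600s
gcloud container clusters resize $CLUSTER_NAME --num-nodes 0 --zone $CLUSTER_ZONE --quiet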
I do not think it's a good idea to tweak GKE for this kind of job. If you really need 0 instances I'd suggest you use either
App Engine Standard Environment, which allows you to scale instances to 0 (https://cloud.google.com/appengine/docs/standard/go/config/appref)
or
Cloud Functions, which are 'instanceless'/serverless anyway. You can use this unofficial guide to trigger your Cloud Functions (https://cloud.google.com/community/tutorials/using-stackdriver-uptime-checks-for-scheduling-cloud-functions).

Will (can) Kubernetes run Docker containers on the master node(s)?

Kubernetes has master and minion nodes.
Will (can) Kubernetes run specified Docker containers on the master node(s)?
I guess another way of saying it is: can a master also be a minion?
Thanks for any assistance.
Update 2015-08-06: As of PR #12349 (available in 1.0.3 and will be available in 1.1 when it ships), the master node is now one of the available nodes in the cluster and you can schedule pods onto it just like any other node in the cluster.
A Docker container can only be scheduled onto a Kubernetes node running a kubelet (what you refer to as a minion). There is nothing preventing you from creating a cluster where the same machine (physical or virtual) runs both the Kubernetes master software and a kubelet, but the current cluster provisioning scripts separate the master onto a distinct machine.
This is going to change significantly when Issue #6087 is implemented.
You need to remove the taint from your master node to run containers on it, although this is not recommended.
Run this on your master node:
kubectl taint nodes --all node-role.kubernetes.io/master-
Courtesy of Alex Ellis' blog post here.
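Alternatively, instead of removing the taint cluster-wide, you can keep it and give specific pods a toleration for it, plus a nodeSelector for the master's built-in label. A sketch (the pod name and image are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: runs-on-master      # illustrative name
spec:
  tolerations:
  - key: node-role.kubernetes.io/master
    effect: NoSchedule      # tolerate the default master taint
  nodeSelector:
    node-role.kubernetes.io/master: ""   # built-in label on master nodes
  containers:
  - name: nginx
    image: nginx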
Another option is to label a node and target it with a nodeSelector:
kubectl label node [name_of_node] node-short-name=node-1
Create a YAML file (first.yaml):
apiVersion: v1
kind: Pod
metadata:
  name: nginxtest
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    node-short-name: node-1
Create the pod:
kubectl create -f first.yaml
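To verify the pod landed on the node you labeled:

kubectl get pod nginxtest -o wide   # the NODE column should show the labeled node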