Unable to start celery worker from airflow using systemd - celery

I am trying to set up Airflow under systemd. When I run the components in screen sessions, everything works perfectly. But when I start the worker with service airflow-worker start, I get the following error:
Mar 02 08:52:57 IP systemd[1]: Started Airflow celery worker daemon.
Mar 02 08:52:57 IP airflow[16162]: [2019-03-02 08:52:57,489] {settings.py:174} INFO - settings.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800, pid=16162
Mar 02 08:52:57 IP airflow[16162]: [2019-03-02 08:52:57,969] {__init__.py:51} INFO - Using executor CeleryExecutor
Mar 02 08:52:58 IP airflow[16162]: Traceback (most recent call last):
Mar 02 08:52:58 IP airflow[16162]: File "/bin/airflow", line 32, in <module>
Mar 02 08:52:58 IP airflow[16162]: args.func(args)
Mar 02 08:52:58 IP airflow[16162]: File "/home/ubuntu/.local/lib/python2.7/site-packages/airflow/utils/cli.py", line 74, in wrapper
Mar 02 08:52:58 IP airflow[16162]: return f(*args, **kwargs)
Mar 02 08:52:58 IP airflow[16162]: File "/home/ubuntu/.local/lib/python2.7/site-packages/airflow/bin/cli.py", line 1066, in worker
Mar 02 08:52:58 IP airflow[16162]: sp = subprocess.Popen(['airflow', 'serve_logs'], env=env, close_fds=True)
Mar 02 08:52:58 IP airflow[16162]: File "/usr/lib/python2.7/subprocess.py", line 394, in __init__
Mar 02 08:52:58 IP airflow[16162]: errread, errwrite)
Mar 02 08:52:58 IP airflow[16162]: File "/usr/lib/python2.7/subprocess.py", line 1047, in _execute_child
Mar 02 08:52:58 IP airflow[16162]: raise child_exception
Mar 02 08:52:58 IP airflow[16162]: OSError: [Errno 20] Not a directory
Mar 02 08:52:58 IP systemd[1]: airflow-worker.service: Main process exited, code=exited, status=1/FAILURE
Mar 02 08:52:58 IP systemd[1]: airflow-worker.service: Failed with result 'exit-code'.
Following is the sequence in which I start the service:
service airflow-webserver start
service airflow-worker start
service airflow-scheduler start
service airflow-flower start
I am referring to the following documentation.
http://site.clairvoyantsoft.com/installing-and-configuring-apache-airflow/
P.S: rabbitmq and Postgres are running fine in the background.

I was able to solve my issue by doing the following:
Created the file /etc/sysconfig/airflow (for example with vim)
Added the following environment variables:
AIRFLOW_CONFIG=/home/ubuntu/airflow/airflow.cfg
AIRFLOW_HOME=/home/ubuntu/airflow
PATH=/home/ubuntu/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/ubuntu/.local/bin/
Restarted the worker
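The steps above can be sketched as a small shell function. This is only a sketch under assumptions: it presumes the airflow-worker unit loads the file through a line such as EnvironmentFile=/etc/sysconfig/airflow (check your own unit file), and the name write_airflow_env is made up for illustration. The variable values are the ones from this answer.

```shell
#!/bin/sh
# Write the environment file that the Airflow systemd units read.
# Assumption: the unit contains "EnvironmentFile=/etc/sysconfig/airflow".
# write_airflow_env is a hypothetical helper name, not part of Airflow.
write_airflow_env() {
    target=${1:-/etc/sysconfig/airflow}
    # /etc/sysconfig may not exist on Ubuntu, so create it if needed.
    mkdir -p "$(dirname "$target")"
    cat > "$target" <<'EOF'
AIRFLOW_CONFIG=/home/ubuntu/airflow/airflow.cfg
AIRFLOW_HOME=/home/ubuntu/airflow
PATH=/home/ubuntu/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/ubuntu/.local/bin/
EOF
}

# Typical use (as root):
#   write_airflow_env
#   systemctl restart airflow-worker
```

A service started by systemd does not get the PATH of your login shell, which is probably why the same setup works in a screen session but not as a service.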

Related

Installing K3S offline failed with error: starting kubernetes: preparing server: building kine: dial tcp\\: unknown network tcp\\"

I am installing k3s on:
A centos7 with arm64;
A mysql8.0;
I have disabled:
firewall
selinux
swap
I have modified /etc/hosts;
I have installed docker-ce;
I have downloaded:
https://get.k3s.io to install.sh
k3s-airgap-images-arm64.tar to the right place "/var/lib/rancher/k3s/agent/images/"
chmod +x k3s-arm64 and move to "/usr/local/bin/k3s".
I'm quite sure the MySQL connection is OK. Then I run: INSTALL_K3S_SKIP_DOWNLOAD=true INSTALL_K3S_EXEC='server --docker --datastore-endpoint="mysql://root:root@tcp(172.16.149.139:3306)/k3s"' ./install.sh
But I always get this error in journalctl:
Nov 19 11:05:52 k3s01 systemd[1]: Starting Lightweight Kubernetes...
Nov 19 11:05:52 k3s01 k3s[16058]: time="2020-11-19T11:05:52.883415201+08:00" level=info msg="Starting k3s v1.19.3+k3s3 (0e4fbfef)"
Nov 19 11:05:52 k3s01 k3s[16058]: time="2020-11-19T11:05:52.884004317+08:00" level=fatal msg="starting kubernetes: preparing server: creating storage endpoint: building kine: dial tcp\\: unknown network tcp\\"
Nov 19 11:05:52 k3s01 systemd[1]: k3s.service: main process exited, code=exited, status=1/FAILURE
Nov 19 11:05:52 k3s01 systemd[1]: Failed to start Lightweight Kubernetes.
Nov 19 11:05:52 k3s01 systemd[1]: Unit k3s.service entered failed state.
Nov 19 11:05:52 k3s01 systemd[1]: k3s.service failed.
Nov 19 11:05:57 k3s01 systemd[1]: k3s.service holdoff time over, scheduling restart.
Nov 19 11:05:57 k3s01 systemd[1]: Stopped Lightweight Kubernetes.
Nov 19 11:05:57 k3s01 systemd[1]: Starting Lightweight Kubernetes...
Nov 19 11:05:58 k3s01 k3s[16086]: time="2020-11-19T11:05:58.341115144+08:00" level=info msg="Starting k3s v1.19.3+k3s3 (0e4fbfef)"
Nov 19 11:05:58 k3s01 k3s[16086]: time="2020-11-19T11:05:58.345448686+08:00" level=fatal msg="starting kubernetes: preparing server: creating storage endpoint: building kine: dial tcp\\: unknown network tcp\\"
Nov 19 11:05:58 k3s01 systemd[1]: k3s.service: main process exited, code=exited, status=1/FAILURE
Nov 19 11:05:58 k3s01 systemd[1]: Failed to start Lightweight Kubernetes.
Nov 19 11:05:58 k3s01 systemd[1]: Unit k3s.service entered failed state.
Nov 19 11:05:58 k3s01 systemd[1]: k3s.service failed.
Nov 19 11:06:03 k3s01 systemd[1]: k3s.service holdoff time over, scheduling restart.
Nov 19 11:06:03 k3s01 systemd[1]: Stopped Lightweight Kubernetes.
Nov 19 11:06:03 k3s01 systemd[1]: Starting Lightweight Kubernetes...
Nov 19 11:06:03 k3s01 k3s[16114]: time="2020-11-19T11:06:03.855567834+08:00" level=info msg="Starting k3s v1.19.3+k3s3 (0e4fbfef)"
Nov 19 11:06:03 k3s01 k3s[16114]: time="2020-11-19T11:06:03.856344291+08:00" level=fatal msg="starting kubernetes: preparing server: creating storage endpoint: building kine: dial tcp\\: unknown network tcp\\"
Nov 19 11:06:03 k3s01 systemd[1]: k3s.service: main process exited, code=exited, status=1/FAILURE
Nov 19 11:06:03 k3s01 systemd[1]: Failed to start Lightweight Kubernetes.
Nov 19 11:06:03 k3s01 systemd[1]: Unit k3s.service entered failed state.
Nov 19 11:06:03 k3s01 systemd[1]: k3s.service failed.
Nov 19 11:06:08 k3s01 systemd[1]: k3s.service holdoff time over, scheduling restart.
Nov 19 11:06:08 k3s01 systemd[1]: Stopped Lightweight Kubernetes.
Nov 19 11:06:08 k3s01 systemd[1]: Starting Lightweight Kubernetes...
Nov 19 11:06:09 k3s01 k3s[16142]: time="2020-11-19T11:06:09.430387037+08:00" level=info msg="Starting k3s v1.19.3+k3s3 (0e4fbfef)"
Nov 19 11:06:09 k3s01 k3s[16142]: time="2020-11-19T11:06:09.431185565+08:00" level=fatal msg="starting kubernetes: preparing server: creating storage endpoint: building kine: dial tcp\\: unknown network tcp\\"
Nov 19 11:06:09 k3s01 systemd[1]: k3s.service: main process exited, code=exited, status=1/FAILURE
Nov 19 11:06:09 k3s01 systemd[1]: Failed to start Lightweight Kubernetes.
Nov 19 11:06:09 k3s01 systemd[1]: Unit k3s.service entered failed state.
Nov 19 11:06:09 k3s01 systemd[1]: k3s.service failed.
Nov 19 11:06:14 k3s01 systemd[1]: k3s.service holdoff time over, scheduling restart.
Nov 19 11:06:14 k3s01 systemd[1]: Stopped Lightweight Kubernetes.
Nov 19 11:06:14 k3s01 systemd[1]: Starting Lightweight Kubernetes...
Nov 19 11:06:14 k3s01 k3s[16193]: time="2020-11-19T11:06:14.888534204+08:00" level=info msg="Starting k3s v1.19.3+k3s3 (0e4fbfef)"
Nov 19 11:06:14 k3s01 k3s[16193]: time="2020-11-19T11:06:14.889537923+08:00" level=fatal msg="starting kubernetes: preparing server: creating storage endpoint: building kine: dial tcp\\: unknown network tcp\\"
Nov 19 11:06:14 k3s01 systemd[1]: k3s.service: main process exited, code=exited, status=1/FAILURE
Nov 19 11:06:14 k3s01 systemd[1]: Failed to start Lightweight Kubernetes.
Nov 19 11:06:14 k3s01 systemd[1]: Unit k3s.service entered failed state.
Nov 19 11:06:14 k3s01 systemd[1]: k3s.service failed.
Nov 19 11:06:19 k3s01 systemd[1]: k3s.service holdoff time over, scheduling restart.
Nov 19 11:06:19 k3s01 systemd[1]: Stopped Lightweight Kubernetes.
Nov 19 11:06:19 k3s01 systemd[1]: Starting Lightweight Kubernetes...
Nov 19 11:06:20 k3s01 k3s[16221]: time="2020-11-19T11:06:20.442535396+08:00" level=info msg="Starting k3s v1.19.3+k3s3 (0e4fbfef)"
Nov 19 11:06:20 k3s01 k3s[16221]: time="2020-11-19T11:06:20.443421344+08:00" level=fatal msg="starting kubernetes: preparing server: creating storage endpoint: building kine: dial tcp\\: unknown network tcp\\"
Nov 19 11:06:20 k3s01 systemd[1]: k3s.service: main process exited, code=exited, status=1/FAILURE
Nov 19 11:06:20 k3s01 systemd[1]: Failed to start Lightweight Kubernetes.
Nov 19 11:06:20 k3s01 systemd[1]: Unit k3s.service entered failed state.
Nov 19 11:06:20 k3s01 systemd[1]: k3s.service failed.
Nov 19 11:06:24 k3s01 systemd[1]: Stopped Lightweight Kubernetes.
Nov 19 11:06:24 k3s01 systemd[1]: Starting Lightweight Kubernetes...
Nov 19 11:06:25 k3s01 k3s[16336]: time="2020-11-19T11:06:25.168513665+08:00" level=info msg="Starting k3s v1.19.3+k3s3 (0e4fbfef)"
Nov 19 11:06:25 k3s01 k3s[16336]: time="2020-11-19T11:06:25.168946929+08:00" level=fatal msg="starting kubernetes: preparing server: creating storage endpoint: building kine: dial tcp\\: unknown network tcp\\"
Nov 19 11:06:25 k3s01 systemd[1]: k3s.service: main process exited, code=exited, status=1/FAILURE
Nov 19 11:06:25 k3s01 systemd[1]: Failed to start Lightweight Kubernetes.
Nov 19 11:06:25 k3s01 systemd[1]: Unit k3s.service entered failed state.
Nov 19 11:06:25 k3s01 systemd[1]: k3s.service failed.
Nov 19 11:06:30 k3s01 systemd[1]: k3s.service holdoff time over, scheduling restart.
Nov 19 11:06:30 k3s01 systemd[1]: Stopped Lightweight Kubernetes.
Nov 19 11:06:30 k3s01 systemd[1]: Starting Lightweight Kubernetes...
Nov 19 11:06:30 k3s01 k3s[16363]: time="2020-11-19T11:06:30.645875517+08:00" level=info msg="Starting k3s v1.19.3+k3s3 (0e4fbfef)"
Nov 19 11:06:30 k3s01 k3s[16363]: time="2020-11-19T11:06:30.649172179+08:00" level=fatal msg="starting kubernetes: preparing server: creating storage endpoint: building kine: dial tcp\\: unknown network tcp\\"
Nov 19 11:06:30 k3s01 systemd[1]: k3s.service: main process exited, code=exited, status=1/FAILURE
Nov 19 11:06:30 k3s01 systemd[1]: Failed to start Lightweight Kubernetes.
Nov 19 11:06:30 k3s01 systemd[1]: Unit k3s.service entered failed state.
Nov 19 11:06:30 k3s01 systemd[1]: k3s.service failed.
I really don't know what's going on and need help!
Finally, I found that I must use K3S_DATASTORE_ENDPOINT='mysql://xxxxxxx' rather than INSTALL_K3S_EXEC='xxx --datastore-endpoint="mysql://xxxxxx"' to avoid this. But I don't know why that makes a difference.
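A sketch of the working form, using the values from the question. The wrapper name install_k3s_with_env is made up for illustration; install.sh is the script saved from https://get.k3s.io. The second function is only an illustration of one possible cause, and that cause is an assumption: the journal shows the network name as tcp\ (with a trailing backslash), which hints that a backslash ended up in front of the parenthesis while the endpoint passed through INSTALL_K3S_EXEC. Nothing in this thread confirms that.

```shell
#!/bin/sh
# Form reported to work in this thread: the endpoint is passed in its own
# environment variable instead of inside INSTALL_K3S_EXEC.
# install_k3s_with_env is a hypothetical wrapper name.
install_k3s_with_env() {
    K3S_DATASTORE_ENDPOINT='mysql://root:root@tcp(172.16.149.139:3306)/k3s' \
    INSTALL_K3S_SKIP_DOWNLOAD=true \
    INSTALL_K3S_EXEC='server --docker' \
    ./install.sh
}

# Illustration only (assumed failure mode, not confirmed): inside double
# quotes the shell keeps a backslash that precedes "(", so an escaped
# endpoint would reach the program with the backslashes still in it.
show_backslash_survives() {
    eval 'set -- "tcp\(172.16.149.139:3306\)"'
    printf '%s\n' "$1"
}
```

Calling show_backslash_survives prints the string with both backslashes intact, which would make the part before the parenthesis read as tcp\ instead of tcp.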

kubelet.service: Unit entered failed state, node stuck in NotReady state in Kubernetes cluster

I am trying to deploy Spring Boot microservices in a Kubernetes cluster that has 1 master and 2 worker nodes. When I check the node state with sudo kubectl get nodes, one of my worker nodes shows the status NotReady.
To troubleshoot, I ran the following command:
sudo journalctl -u kubelet
The output contains kubelet.service: Unit entered failed state, and the kubelet service has stopped. The following is the full output of sudo journalctl -u kubelet:
-- Logs begin at Fri 2020-01-03 04:56:18 EST, end at Fri 2020-01-03 05:32:47 EST. --
Jan 03 04:56:25 MILDEVKUB050 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Jan 03 04:56:31 MILDEVKUB050 kubelet[970]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --confi
Jan 03 04:56:31 MILDEVKUB050 kubelet[970]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --confi
Jan 03 04:56:32 MILDEVKUB050 kubelet[970]: I0103 04:56:32.053962 970 server.go:416] Version: v1.17.0
Jan 03 04:56:32 MILDEVKUB050 kubelet[970]: I0103 04:56:32.084061 970 plugins.go:100] No cloud provider specified.
Jan 03 04:56:32 MILDEVKUB050 kubelet[970]: I0103 04:56:32.235928 970 server.go:821] Client rotation is on, will bootstrap in background
Jan 03 04:56:32 MILDEVKUB050 kubelet[970]: I0103 04:56:32.280173 970 certificate_store.go:129] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-curre
Jan 03 04:56:38 MILDEVKUB050 kubelet[970]: I0103 04:56:38.107966 970 server.go:641] --cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /
Jan 03 04:56:38 MILDEVKUB050 kubelet[970]: F0103 04:56:38.109401 970 server.go:273] failed to run Kubelet: running with swap on is not supported, please disable swa
Jan 03 04:56:38 MILDEVKUB050 systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Jan 03 04:56:38 MILDEVKUB050 systemd[1]: kubelet.service: Unit entered failed state.
Jan 03 04:56:38 MILDEVKUB050 systemd[1]: kubelet.service: Failed with result 'exit-code'.
Jan 03 04:56:48 MILDEVKUB050 systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Jan 03 04:56:48 MILDEVKUB050 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Jan 03 04:56:48 MILDEVKUB050 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Jan 03 04:56:48 MILDEVKUB050 kubelet[1433]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --conf
Jan 03 04:56:48 MILDEVKUB050 kubelet[1433]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --conf
Jan 03 04:56:48 MILDEVKUB050 kubelet[1433]: I0103 04:56:48.901632 1433 server.go:416] Version: v1.17.0
Jan 03 04:56:48 MILDEVKUB050 kubelet[1433]: I0103 04:56:48.907654 1433 plugins.go:100] No cloud provider specified.
Jan 03 04:56:48 MILDEVKUB050 kubelet[1433]: I0103 04:56:48.907806 1433 server.go:821] Client rotation is on, will bootstrap in background
Jan 03 04:56:48 MILDEVKUB050 kubelet[1433]: I0103 04:56:48.947107 1433 certificate_store.go:129] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-curr
Jan 03 04:56:49 MILDEVKUB050 kubelet[1433]: I0103 04:56:49.263777 1433 server.go:641] --cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to
Jan 03 04:56:49 MILDEVKUB050 kubelet[1433]: F0103 04:56:49.264219 1433 server.go:273] failed to run Kubelet: running with swap on is not supported, please disable sw
Jan 03 04:56:49 MILDEVKUB050 systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Jan 03 04:56:49 MILDEVKUB050 systemd[1]: kubelet.service: Unit entered failed state.
Jan 03 04:56:49 MILDEVKUB050 systemd[1]: kubelet.service: Failed with result 'exit-code'.
Jan 03 04:56:59 MILDEVKUB050 systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Jan 03 04:56:59 MILDEVKUB050 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Jan 03 04:56:59 MILDEVKUB050 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --conf
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --conf
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: I0103 04:56:59.712729 1500 server.go:416] Version: v1.17.0
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: I0103 04:56:59.714927 1500 plugins.go:100] No cloud provider specified.
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: I0103 04:56:59.715248 1500 server.go:821] Client rotation is on, will bootstrap in background
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: I0103 04:56:59.763508 1500 certificate_store.go:129] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-curr
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: I0103 04:56:59.956706 1500 server.go:641] --cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to
Jan 03 04:56:59 MILDEVKUB050 kubelet[1500]: F0103 04:56:59.957078 1500 server.go:273] failed to run Kubelet: running with swap on is not supported, please disable sw
Jan 03 04:56:59 MILDEVKUB050 systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Jan 03 04:56:59 MILDEVKUB050 systemd[1]: kubelet.service: Unit entered failed state.
Jan 03 04:56:59 MILDEVKUB050 systemd[1]: kubelet.service: Failed with result 'exit-code'.
Jan 03 04:57:10 MILDEVKUB050 systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Jan 03 04:57:10 MILDEVKUB050 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Jan 03 04:57:10 MILDEVKUB050 systemd[1]: Started kubelet: The Kubernetes Node Agent.
I tried restarting the kubelet, but the node state did not change. It is still NotReady.
Updates
When I run the command systemctl list-units --type=swap --state=active, I get the following response:
docker@MILDEVKUB040:~$ systemctl list-units --type=swap --state=active
UNIT                                            LOAD   ACTIVE SUB    DESCRIPTION
dev-mapper-MILDEVDCR01\x2d\x2dvg\x2dswap_1.swap loaded active active /dev/mapper/MILDEVDCR01--vg-swap_1
Important
Each time a node becomes NotReady like this, I have to disable swap and then reload the daemon and the kubelet. After that the node becomes Ready, but later I have to repeat the same steps.
How can I find a permanent solution for this?
failed to run Kubelet: running with swap on is not supported, please disable swap
You need to disable swap on the system for the kubelet to work. You can disable swap with sudo swapoff -a.
On systemd-based systems, swap partitions can also be enabled through swap units, which are activated again whenever systemd reloads, even if you have turned swap off with swapoff -a.
https://www.freedesktop.org/software/systemd/man/systemd.swap.html
Check if you have any swap units using systemctl list-units --type=swap --state=active
You can permanently disable any active swap unit with systemctl mask <unit name>.
Note: Do not use systemctl disable <unit name> to disable the swap unit, because the swap unit will be activated again when systemd reloads. Use systemctl mask <unit name> only.
To make sure swap doesn't get re-enabled when your system reboots due to a power cycle or any other reason, remove or comment out the swap entries in /etc/fstab.
Summarizing:
Run sudo swapoff -a
Check if you have swap units with command systemctl list-units --type=swap --state=active. If there are any active swap units, mask them using systemctl mask <unit name>
Remove swap entries in /etc/fstab
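The summary above can be sketched as two shell functions. The names comment_out_swap and mask_active_swap_units are made up for illustration, and the sed pattern assumes the usual fstab layout where the filesystem type swap is a whitespace-separated field. The first function only comments out swap lines and keeps a .bak copy; /etc/fstab itself stays in place.

```shell
#!/bin/sh
# comment_out_swap and mask_active_swap_units are hypothetical helper names.

# Comment out every uncommented swap line in an fstab-style file.
# A backup of the original is kept next to it with a .bak suffix.
comment_out_swap() {
    fstab=${1:-/etc/fstab}
    sed -E -i.bak \
        '/^[[:space:]]*[^#[:space:]].*[[:space:]]swap[[:space:]]/ s/^/#/' \
        "$fstab"
}

# Turn swap off now, then mask every active systemd swap unit so that
# it is not activated again when systemd reloads.
mask_active_swap_units() {
    swapoff -a
    systemctl list-units --type=swap --state=active --no-legend --plain |
        awk '{print $1}' |
        while read -r unit; do
            systemctl mask "$unit"
        done
}

# Typical use (as root):
#   mask_active_swap_units
#   comment_out_swap
```

Try comment_out_swap on a copy of the file first and compare the result with the .bak backup before touching the real /etc/fstab.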
The root cause is the swap space. To disable it completely, follow these steps:
1. Run swapoff -a: this disables swap immediately, but swap will be activated again on restart.
2. Remove any swap entry from /etc/fstab.
3. Reboot the system.
If the swap is gone, good. If, for some reason, it is still there, you have to remove the swap partition. Repeat steps 1 and 2 and, after that, use fdisk or parted to remove the (now unused) swap partition. Use great care here: removing the wrong partition will have disastrous effects!
4. Reboot.
This should resolve your issue.
Removing /etc/fstab itself breaks the VM, so I think we should find another way to solve this issue. I tried removing fstab, and afterwards every command (install, ping and others) failed.

Kubeadm failed | service is down

I am trying to join a worker node to a k8s cluster.
sudo kubeadm join 10.2.67.201:6443 --token x --discovery-token-ca-cert-hash sha256:x
But I get an error at this stage:
curl -sSL http://localhost:10248/healthz'
failed with error: Get http://localhost:10248/healthz: dial tcp
Error:
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
I see that kubelet service is down:
journalctl -xeu kubelet
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit kubelet.service has finished shutting down.
Nov 22 15:49:00 s001as-ceph-node-03 systemd[1]: Started kubelet: The Kubernetes Node Agent.
-- Subject: Unit kubelet.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit kubelet.service has finished starting up.
--
-- The start-up result is done.
Nov 22 15:49:00 s001as-ceph-node-03 kubelet[286703]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag.
Nov 22 15:49:00 s001as-ceph-node-03 kubelet[286703]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag.
Nov 22 15:49:00 s001as-ceph-node-03 kubelet[286703]: F1122 15:49:00.224350 286703 server.go:251] unable to load client CA file /etc/kubernetes/ssl/ca.crt: open /etc/kubernetes/ssl/ca.cr
Nov 22 15:49:00 s001as-ceph-node-03 systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
Nov 22 15:49:00 s001as-ceph-node-03 systemd[1]: Unit kubelet.service entered failed state.
Nov 22 15:49:00 s001as-ceph-node-03 systemd[1]: kubelet.service failed.
Nov 22 15:49:10 s001as-ceph-node-03 systemd[1]: kubelet.service holdoff time over, scheduling restart.
Nov 22 15:49:10 s001as-ceph-node-03 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
-- Subject: Unit kubelet.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit kubelet.service has finished shutting down.
Nov 22 15:49:10 s001as-ceph-node-03 systemd[1]: Started kubelet: The Kubernetes Node Agent.
-- Subject: Unit kubelet.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit kubelet.service has finished starting up.
--
-- The start-up result is done.
Nov 22 15:49:10 s001as-ceph-node-03 kubelet[286717]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag.
Nov 22 15:49:10 s001as-ceph-node-03 kubelet[286717]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag.
Nov 22 15:49:10 s001as-ceph-node-03 kubelet[286717]: F1122 15:49:10.476478 286717 server.go:251] unable to load client CA file /etc/kubernetes/ssl/ca.crt: open /etc/kubernetes/ssl/ca.cr
Nov 22 15:49:10 s001as-ceph-node-03 systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
Nov 22 15:49:10 s001as-ceph-node-03 systemd[1]: Unit kubelet.service entered failed state.
Nov 22 15:49:10 s001as-ceph-node-03 systemd[1]: kubelet.service failed.
I fixed it.
Just copy /etc/kubernetes/pki/ca.crt into /etc/kubernetes/ssl/ca.crt

Installing kubernetes cluster on master node

I am new to the container world and am trying to set up a Kubernetes cluster locally on two Linux VMs. During cluster initialization it gets stuck at:
[apiclient] Created API client, waiting for the control plane to become ready
I have followed the pre-flight check steps:
[root@lm--kube-glusterfs--central ~]# kubeadm init --pod-network-cidr=10.244.0.0/16
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.6.0
[init] Using Authorization mode: RBAC
[preflight] Running pre-flight checks
[preflight] WARNING: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Starting the kubelet service
[certificates] Generated CA certificate and key.
[certificates] Generated API server certificate and key.
[certificates] API Server serving cert is signed for DNS names [lm--kube-glusterfs--central kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.99.7.215]
[certificates] Generated API server kubelet client certificate and key.
[certificates] Generated service account token signing key and public key.
[certificates] Generated front-proxy CA certificate and key.
[certificates] Generated front-proxy client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[apiclient] Created API client, waiting for the control plane to become ready
OS : Red Hat Enterprise Linux Server release 7.4 (Maipo)
Kubernetes version:
kubeadm-1.6.0-0.x86_64.rpm
kubectl-1.6.0-0.x86_64.rpm
kubelet-1.6.0-0.x86_64.rpm
kubernetes-cni-0.6.0-0.x86_64.rpm
cri-tools-1.12.0-0.x86_64.rpm
How can I debug this issue, or is there a version mismatch between the Kubernetes packages? The same setup worked before, when I used google.cloud.repo to install with yum -y install kubelet kubeadm kubectl.
I could not use the repo because of some firewall issues, so I am installing from the RPMs instead.
This is the output of journalctl -xeu kubelet:
Jul 02 09:45:09 lm--son-config-cn--central kubelet[28749]: W0702 09:45:09.841246 28749 kubelet_network.go:70] Hairpin mode set to "promiscuous-bridge" but kubenet
Jul 02 09:45:09 lm--son-config-cn--central kubelet[28749]: I0702 09:45:09.841304 28749 kubelet.go:494] Hairpin mode set to "hairpin-veth"
Jul 02 09:45:09 lm--son-config-cn--central kubelet[28749]: W0702 09:45:09.845626 28749 cni.go:157] Unable to update cni config: No networks found in /etc/cni/net.d
Jul 02 09:45:09 lm--son-config-cn--central kubelet[28749]: I0702 09:45:09.857969 28749 docker_service.go:187] Docker cri networking managed by kubernetes.io/no-op
Jul 02 09:45:09 lm--son-config-cn--central kubelet[28749]: error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs
Jul 02 09:45:09 lm--son-config-cn--central systemd[1]: kubelet.service: main process exited, code=exited, status=1/FAILURE
Jul 02 09:45:09 lm--son-config-cn--central systemd[1]: Unit kubelet.service entered failed state.
Jul 02 09:45:09 lm--son-config-cn--central systemd[1]: kubelet.service failed.
Jul 02 09:45:20 lm--son-config-cn--central systemd[1]: kubelet.service holdoff time over, scheduling restart.
Jul 02 09:45:20 lm--son-config-cn--central systemd[1]: Started Kubernetes Kubelet Server.
-- Subject: Unit kubelet.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit kubelet.service has finished starting up.
--
-- The start-up result is done.
Jul 02 09:45:20 lm--son-config-cn--central systemd[1]: Starting Kubernetes Kubelet Server...
-- Subject: Unit kubelet.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit kubelet.service has begun starting up.
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: I0702 09:45:20.251465 28772 feature_gate.go:144] feature gates: map[]
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: W0702 09:45:20.251889 28772 server.go:469] No API client: no api servers specified
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: I0702 09:45:20.252009 28772 docker.go:364] Connecting to docker on unix:///var/run/docker.sock
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: I0702 09:45:20.252049 28772 docker.go:384] Start docker client with request timeout=2m0s
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: W0702 09:45:20.259436 28772 cni.go:157] Unable to update cni config: No networks found in /etc/cni/net.d
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: I0702 09:45:20.275674 28772 manager.go:143] cAdvisor running in container: "/system.slice"
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: W0702 09:45:20.317509 28772 manager.go:151] unable to connect to Rkt api service: rkt: cannot tcp Dial r
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: I0702 09:45:20.328881 28772 fs.go:117] Filesystem partitions: map[/dev/vda2:{mountpoint:/ major:253 mino
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: I0702 09:45:20.335711 28772 manager.go:198] Machine: {NumCores:8 CpuFrequency:2095078 MemoryCapacity:337
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: [7] Caches:[{Size:32768 Type:Data Level:1} {Size:32768 Type:Instruction Level:1} {Size:4194304 Type:Unifie
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: I0702 09:45:20.338001 28772 manager.go:204] Version: {KernelVersion:3.10.0-693.11.6.el7.x86_64 Container
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: W0702 09:45:20.338967 28772 server.go:350] No api server defined - no events will be sent to API server.
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: I0702 09:45:20.338980 28772 server.go:509] --cgroups-per-qos enabled, but --cgroup-root was not specifie
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: I0702 09:45:20.342041 28772 container_manager_linux.go:245] container manager verified user specified cg
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: I0702 09:45:20.342071 28772 container_manager_linux.go:250] Creating Container Manager object based on N
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: W0702 09:45:20.346505 28772 kubelet_network.go:70] Hairpin mode set to "promiscuous-bridge" but kubenet
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: I0702 09:45:20.346571 28772 kubelet.go:494] Hairpin mode set to "hairpin-veth"
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: W0702 09:45:20.351473 28772 cni.go:157] Unable to update cni config: No networks found in /etc/cni/net.d
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: I0702 09:45:20.363583 28772 docker_service.go:187] Docker cri networking managed by kubernetes.io/no-op
Jul 02 09:45:20 lm--son-config-cn--central kubelet[28772]: error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs
Jul 02 09:45:20 lm--son-config-cn--central systemd[1]: kubelet.service: main process exited, code=exited, status=1/FAILURE
Jul 02 09:45:20 lm--son-config-cn--central systemd[1]: Unit kubelet.service entered failed state.
Jul 02 09:45:20 lm--son-config-cn--central systemd[1]: kubelet.service failed.
Related to issue
There are a few fixes shown there; all you need to do is change the cgroup driver to systemd.
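A minimal sketch of that change, under assumptions: it presumes the kubelet flags live in a kubeadm drop-in such as /etc/systemd/system/kubelet.service.d/10-kubeadm.conf and that the file already contains a --cgroup-driver=... flag. The path and the helper name set_kubelet_cgroup_driver are illustrative, so check where your RPMs put the kubelet flags.

```shell
#!/bin/sh
# set_kubelet_cgroup_driver is a hypothetical helper name.
# Assumption: the drop-in path below is where the kubelet flags are set
# and it already contains a --cgroup-driver=... flag.
set_kubelet_cgroup_driver() {
    driver=$1
    conf=${2:-/etc/systemd/system/kubelet.service.d/10-kubeadm.conf}
    # Rewrite the existing flag in place; a .bak copy of the file is kept.
    sed -E -i.bak "s/--cgroup-driver=[a-z]+/--cgroup-driver=${driver}/" "$conf"
}

# Typical use (as root):
#   docker info | grep -i cgroup      # shows the driver Docker is using
#   set_kubelet_cgroup_driver systemd
#   systemctl daemon-reload
#   systemctl restart kubelet
```

The point is that the kubelet and Docker have to use the same cgroup driver, so set the kubelet to whatever docker info reports.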

Kubespray: [Errno 111] Connection refused>", "redirected": false, "status": -1, "url": "http://127.0.0.1:8080/healthz

I am new to Kubernetes and I am trying to deploy a Kubernetes cluster using Kubespray (https://github.com/kubernetes-incubator/kubespray).
When I run ansible-playbook -bi inventory/inventory cluster.yml it fails with:
fatal: [kube-k8s-master-1]: FAILED! => {"attempts": 20, "changed": false, "content": "", "failed": true, "msg": "Status code was not [200]: Request failed: ", "redirected": false, "status": -1, "url": "http://127.0.0.1:8080/healthz"}
etcd service log:
Oct 05 13:53:06 kube-k8s-master-1 systemd[1]: etcd.service: main process exited, code=exited, status=1/FAILURE
Oct 05 13:53:11 kube-k8s-master-1 docker[32146]: etcd1
Oct 05 13:53:11 kube-k8s-master-1 systemd[1]: Unit etcd.service entered failed state.
Oct 05 13:53:11 kube-k8s-master-1 systemd[1]: etcd.service failed.
kubelet service log:
Oct 05 13:56:02 kube-k8s-master-1 kubelet[19938]: E1005 13:56:02.500377 19938 reflector.go:190] k8s.io/kubernetes/pkg/kubelet/kubelet.go:408: Failed to list *v1.Node: Get https://127.0.0.1:6443/api/v1/nodes?fieldSelector=metadata.name%3
Oct 05 13:56:03 kube-k8s-master-1 kubelet[19938]: W1005 13:56:03.190059 19938 container.go:352] Failed to create summary reader for "/docker/ce8dbd1a8edfbe6b604aab4f38eff406846b1cfc8858ba23e7db5cac36d2247d": none of the resources are be
Oct 05 13:56:03 kube-k8s-master-1 kubelet[19938]: E1005 13:56:03.497076 19938 reflector.go:190] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://127.0.0.1:6443/api/v1/pods?fieldSelector=spec.node
Oct 05 13:56:03 kube-k8s-master-1 kubelet[19938]: E1005 13:56:03.499795 19938 reflector.go:190] k8s.io/kubernetes/pkg/kubelet/kubelet.go:400: Failed to list *v1.Service: Get https://127.0.0.1:6443/api/v1/services?resourceVersion=0: dial
Oct 05 13:56:03 kube-k8s-master-1 kubelet[19938]: E1005 13:56:03.500800 19938 reflector.go:190] k8s.io/kubernetes/pkg/kubelet/kubelet.go:408: Failed to list *v1.Node: Get https://127.0.0.1:6443/api/v1/nodes?fieldSelector=metadata.name%3
Oct 05 13:56:04 kube-k8s-master-1 kubelet[19938]: W1005 13:56:04.155665 19938 cni.go:189] Unable to update cni config: No networks found in /etc/cni/net.d
Oct 05 13:56:04 kube-k8s-master-1 kubelet[19938]: E1005 13:56:04.155825 19938 kubelet.go:2136] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config
Oct 05 13:56:04 kube-k8s-master-1 kubelet[19938]: E1005 13:56:04.190285 19938 eviction_manager.go:238] eviction manager: unexpected err: failed GetNode: node 'kube-k8s-master-1' not found
How can I fix this?