Kubelet service not starting up - KUBELET_EXTRA_ARGS (code=exited, status=255) - kubernetes

I am trying to start up the kubelet service on a worker node (the 3rd worker node)... at the moment, I can't quite tell what the error is here.. I do however, see F0716 16:42:20.047413 556 server.go:155] unknown command: $KUBELET_EXTRA_ARGS in the output given by sudo systemctl status kubelet -l:
[svc.jenkins#node6 ~]$ sudo systemctl status kubelet -l
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: activating (auto-restart) (Result: exit-code) since Mon 2018-07-16 16:42:20 CDT; 4s ago
Docs: http://kubernetes.io/docs/
Process: 556 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CGROUP_ARGS $KUBELET_CERTIFICATE_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=255)
Main PID: 556 (code=exited, status=255)
Jul 16 16:42:20 node6 kubelet[556]: --tls-cert-file string File containing x509 Certificate used for serving HTTPS (with intermediate certs, if any, concatenated after server cert). If --tls-cert-file and --tls-private-key-file are not provided, a self-signed certificate and key are generated for the public address and saved to the directory passed to --cert-dir. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)
Jul 16 16:42:20 node6 kubelet[556]: --tls-cipher-suites strings Comma-separated list of cipher suites for the server. If omitted, the default Go cipher suites will be used. Possible values: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_RC4_128_SHA,TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_RC4_128_SHA,TLS_RSA_WITH_3DES_EDE_CBC_SHA,TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_RC4_128_SHA (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)
Jul 16 16:42:20 node6 kubelet[556]: --tls-min-version string Minimum TLS version supported. Possible values: VersionTLS10, VersionTLS11, VersionTLS12 (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)
Jul 16 16:42:20 node6 kubelet[556]: --tls-private-key-file string File containing x509 private key matching --tls-cert-file. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)
Jul 16 16:42:20 node6 kubelet[556]: -v, --v Level log level for V logs
Jul 16 16:42:20 node6 kubelet[556]: --version version[=true] Print version information and quit
Jul 16 16:42:20 node6 kubelet[556]: --vmodule moduleSpec comma-separated list of pattern=N settings for file-filtered logging
Jul 16 16:42:20 node6 kubelet[556]: --volume-plugin-dir string The full path of the directory in which to search for additional third party volume plugins (default "/usr/libexec/kubernetes/kubelet-plugins/volume/exec/")
Jul 16 16:42:20 node6 kubelet[556]: --volume-stats-agg-period duration Specifies interval for kubelet to calculate and cache the volume disk usage for all pods and volumes. To disable volume calculations, set to 0. (default 1m0s) (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)
Jul 16 16:42:20 node6 kubelet[556]: F0716 16:42:20.047413 556 server.go:155] unknown command: $KUBELET_EXTRA_ARGS
Here is the configuration for my dropin loacated at /etc/systemd/system/kubelet.service.d/10-kubeadm.conf (it is the same on the other nodes that are in a working state):
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true"
Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_DNS_ARGS=--cluster-dns=10.96.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_AUTHZ_ARGS=--authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt"
Environment="KUBELET_CADVISOR_ARGS=--cadvisor-port=0"
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs"
Environment="KUBELET_CERTIFICATE_ARGS=--rotate-certificates=true --cert-dir=/data01/kubelet/pki"
Environment="KUBELET_EXTRA_ARGS=$KUBELET_EXTRA_ARGS --root-dir=/data01/kubelet"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CGROUP_ARGS $KUBELET_CERTIFICATE_ARGS $KUBELET_EXTRA_ARGS
Just need help diagnosing what the issue preventing it from starting so that it can be resolved.. Thank in advanced :)
EDIT:
[svc.jenkins#node6 ~]$ kubelet --version
Kubernetes v1.10.4

Currently, in systemd a bit different approach is used. All options are put to separate file and systemd config script refers to that file.
In your case, it would be something like this:
/etc/sysconfig/kubelet
----------------------
KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf
KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true
KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin
KUBELET_DNS_ARGS=--cluster-dns=10.96.0.10 --cluster-domain=cluster.local
KUBELET_AUTHZ_ARGS=--authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt
KUBELET_CADVISOR_ARGS=--cadvisor-port=0
KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs
KUBELET_CERTIFICATE_ARGS=--rotate-certificates=true --cert-dir=/data01/kubelet/pki
KUBELET_EXTRA_ARGS=$KUBELET_EXTRA_ARGS --root-dir=/data01/kubelet
/etc/systemd/system/kubelet.service.d/10-kubeadm.conf
-----------------------------------------------------
...
[Service]
EnvironmentFile=/etc/sysconfig/kubelet
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CGROUP_ARGS $KUBELET_CERTIFICATE_ARGS $KUBELET_EXTRA_ARGS
The variables in systemd config file could look like ${VARIABLE} or $VARIABLE. Both cases should work fine.

Related

Service starting gunicorn failing with "Start request repeated too quickly"

Trying to start a service to run gunicorn as backend server for Flask, not working. Running nginx as frontend server for React, working.
Server:
Virtualization: vmware
Operating System: Red Hat Enterprise Linux 8.4 (Ootpa)
CPE OS Name: cpe:/o:redhat:enterprise_linux:8.4:GA
Kernel: Linux 4.18.0-305.3.1.el8_4.x86_64
Architecture: x86-64
Service file in /etc/systemd/system/myservice.service:
[Unit]
Description="Description"
After=network.target
[Service]
User=root
Group=root
WorkingDirectory=/home/project/app/api
ExecStart=/home/project/app/api/venv/bin/gunicorn -b 127.0.0.1:5000 api:app
Restart=always
[Install]
WantedBy=multi-user.target
/app/api:
-rwxr-xr-x. 1 root root 2018 Jun 9 20:06 api.py
drwxrwxr-x+ 5 root root 100 Jun 7 10:11 venv
Error message:
● myservice.service - "Description"
Loaded: loaded (/etc/systemd/system/myservice.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Thu 2021-06-10 19:01:01 CEST; 5s ago
Process: 18307 ExecStart=/home/project/app/api/venv/bin/gunicorn -b 127.0.0.1:5000 api:app (code=exited, status=203/EXEC)
Main PID: 18307 (code=exited, status=203/EXEC)
Jun 10 19:01:01 xxxx systemd[1]: myservice.service: Service RestartSec=100ms expired, scheduling restart.
Jun 10 19:01:01 xxxx systemd[1]: myservice.service: Scheduled restart job, restart counter is at 5.
Jun 10 19:01:01 xxxx systemd[1]: Stopped "Description".
Jun 10 19:01:01 xxxx systemd[1]: myservice.service: Start request repeated too quickly.
Jun 10 19:01:01 xxxx systemd[1]: myservice.service: Failed with result 'exit-code'.
Jun 10 19:01:01 xxxx systemd[1]: Failed to start "Description".
Tried, not working:
Adding Environment="PATH=/home/project/app/api/venv/bin" under [Service]
$ systemctl reset-failed myservice.service
$ systemctl daemon-reload
Reboot, ofc.
Tried, working:
Running (as root) /home/project/app/api/venv/bin/gunicorn -b 127.0.0.1:5000 api:app while in /app/api directory
Does anyone know how to fix this problem?
Typically enough, I figured it out shortly after posting this issue.
SELinux is messing with permissions for files and directories, so for anyone experiencing the same issue, make sure to test with the following alterings (as root):
$ setsebool -P httpd_can_network_connect on
$ chcon -Rt httpd_sys_content_t /path/to/your/Flask/dir
In my case: $ chcon -Rt httpd_sys_content_t /home/project/app/api
While this is NOT a permanent fix, it's worth a try. Check out the SELinux docs for more permanent solutions.

systemd service not starting on boot, starts when i restart it

I have made this service file to start a python script when my raspberry pi (4) boots up:
/etc/systemd/system/plants.service
[Unit]
Description=plant-sender
After=network.target
[Service]
Type=simple
User=root
Group=root
WorkingDirectory=/home/theo/Repos/plants-monitor/remote
ExecStart=/usr/bin/python main.py
Restart=on-failure
[Install]
WantedBy=multi-user.target
However, once the pi is on, I run sudo systemctl status plants, and get:
* plants.service - plant-sender
Loaded: loaded (/etc/systemd/system/plants.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Mon 2020-03-30 20:22:43 EDT; 1min 45s ago
Process: 323 ExecStart=/usr/bin/python main.py (code=exited, status=1/FAILURE)
Main PID: 323 (code=exited, status=1/FAILURE)
Mar 30 20:22:43 arpi systemd[1]: plants.service: Scheduled restart job, restart counter is at 5.
Mar 30 20:22:43 arpi systemd[1]: Stopped plant-sender.
Mar 30 20:22:43 arpi systemd[1]: plants.service: Start request repeated too quickly.
Mar 30 20:22:43 arpi systemd[1]: plants.service: Failed with result 'exit-code'.
Mar 30 20:22:43 arpi systemd[1]: Failed to start plant-sender.
But, after running sudo systemctl restart plants, the service starts up and everything is fine.
If it doesn't start on boot but does on systemctl restart, I'd be looking at whether /home/theo/Repos/plants-monitor/remote is mounted at that point.
There may be something automounting or home-mounting your home directory when you log in.
If so, you could change the working directory to something that exists always, even if only a test.
Additionally, using journalctl -n 9999 -u plants will get you more log messages, so you can see why it's failing, rather than just seeing the "tried too many times, giving up" messages.

Adding removed etcd member in Kubernetes master

I was following Kelsey Hightower's kubernetes-the-hard-way repo and successfully created a cluster with 3 master nodes and 3 worker nodes. Here are the problems encountered when removing one of the etcd members and then adding it back, also with all the steps used:
3 master nodes:
10.240.0.10 controller-0
10.240.0.11 controller-1
10.240.0.12 controller-2
Step 1:
isaac#controller-0:~$ sudo ETCDCTL_API=3 etcdctl member list --endpoints=https://127.0.0.1:2379 --cacert=/etc/etcd/ca.pem --cert=/etc/etcd/kubernetes.pem --key=/etc/etcd/kubernetes-key.pem
Result:
b28b52253c9d447e, started, controller-2, https://10.240.0.12:2380, https://10.240.0.12:2379
f98dc20bce6225a0, started, controller-0, https://10.240.0.10:2380, https://10.240.0.10:2379
ffed16798470cab5, started, controller-1, https://10.240.0.11:2380, https://10.240.0.11:2379
Step 2 (Remove etcd member of controller-2):
isaac#controller-0:~$ sudo ETCDCTL_API=3 etcdctl member remove b28b52253c9d447e --endpoints=https://127.0.0.1:2379 --cacert=/etc/etcd/ca.pem --cert=/etc/etcd/kubernetes.pem --key=/etc/etcd/kubernetes-key.pem
Step 3 (Add the member back):
isaac#controller-0:~$ sudo ETCDCTL_API=3 etcdctl member add controller-2 --peer-urls=https://10.240.0.12:2380 --endpoints=https://127.0.0.1:2379 --cacert=/etc/etcd/ca.pem --cert=/etc/etcd/kubernetes.pem --key=/etc/etcd/kubernetes-key.pem
Result:
Member 66d450d03498eb5c added to cluster 3e7cc799faffb625
ETCD_NAME="controller-2"
ETCD_INITIAL_CLUSTER="controller-2=https://10.240.0.12:2380,controller-0=https://10.240.0.10:2380,controller-1=https://10.240.0.11:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.240.0.12:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
Step 4 (run member list command):
isaac#controller-0:~$ sudo ETCDCTL_API=3 etcdctl member list --endpoints=https://127.0.0.1:2379 --cacert=/etc/etcd/ca.pem --cert=/etc/etcd/kubernetes.pem --key=/etc/etcd/kubernetes-key.pem
Result:
66d450d03498eb5c, unstarted, , https://10.240.0.12:2380,
f98dc20bce6225a0, started, controller-0, https://10.240.0.10:2380,
https://10.240.0.10:2379 ffed16798470cab5, started, controller-1,
https://10.240.0.11:2380, https://10.240.0.11:2379
Step 5 (Run the command to start etcd in controller-2):
isaac#controller-2:~$ sudo etcd --name controller-2 --listen-client-urls https://10.240.0.12:2379,http://127.0.0.1:2379 --advertise-client-urls https://10.240.0.12:2379 --listen-peer-urls https://10.240.0.12:
2380 --initial-advertise-peer-urls https://10.240.0.12:2380 --initial-cluster-state existing --initial-cluster controller-0=http://10.240.0.10:2380,controller-1=http://10.240.0.11:2380,controller-2=http://10.240.0.1
2:2380 --ca-file /etc/etcd/ca.pem --cert-file /etc/etcd/kubernetes.pem --key-file /etc/etcd/kubernetes-key.pem
Result:
2019-06-09 13:10:14.958799 I | etcdmain: etcd Version: 3.3.9
2019-06-09 13:10:14.959022 I | etcdmain: Git SHA: fca8add78
2019-06-09 13:10:14.959106 I | etcdmain: Go Version: go1.10.3
2019-06-09 13:10:14.959177 I | etcdmain: Go OS/Arch: linux/amd64
2019-06-09 13:10:14.959237 I | etcdmain: setting maximum number of CPUs to 1, total number of available CPUs is 1
2019-06-09 13:10:14.959312 W | etcdmain: no data-dir provided, using default data-dir ./controller-2.etcd
2019-06-09 13:10:14.959435 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2019-06-09 13:10:14.959575 C | etcdmain: cannot listen on TLS for 10.240.0.12:2380: KeyFile and CertFile are not presented
Clearly, the etcd service did not start as expected, so I do the troubleshooting as below:
isaac#controller-2:~$ sudo systemctl status etcd
Result:
● etcd.service - etcd Loaded: loaded
(/etc/systemd/system/etcd.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Sun 2019-06-09 13:06:55 UTC; 29min ago
Docs: https://github.com/coreos Process: 1876 ExecStart=/usr/local/bin/etcd --name controller-2
--cert-file=/etc/etcd/kubernetes.pem --key-file=/etc/etcd/kubernetes-key.pem --peer-cert-file=/etc/etcd/kubernetes.pem --peer-key-file=/etc/etcd/kube Main PID: 1876 (code=exited, status=0/SUCCESS) Jun 09 13:06:55 controller-2 etcd[1876]: stopped
peer f98dc20bce6225a0 Jun 09 13:06:55 controller-2 etcd[1876]:
stopping peer ffed16798470cab5... Jun 09 13:06:55 controller-2
etcd[1876]: stopped streaming with peer ffed16798470cab5 (writer) Jun
09 13:06:55 controller-2 etcd[1876]: stopped streaming with peer
ffed16798470cab5 (writer) Jun 09 13:06:55 controller-2 etcd[1876]:
stopped HTTP pipelining with peer ffed16798470cab5 Jun 09 13:06:55
controller-2 etcd[1876]: stopped streaming with peer ffed16798470cab5
(stream MsgApp v2 reader) Jun 09 13:06:55 controller-2 etcd[1876]:
stopped streaming with peer ffed16798470cab5 (stream Message reader)
Jun 09 13:06:55 controller-2 etcd[1876]: stopped peer ffed16798470cab5
Jun 09 13:06:55 controller-2 etcd[1876]: failed to find member
f98dc20bce6225a0 in cluster 3e7cc799faffb625 Jun 09 13:06:55
controller-2 etcd[1876]: forgot to set Type=notify in systemd service
file?
I indeed tried to start the etcd member using different commands but seems the etcd of controller-2 still stuck at unstarted state. May I know the reason of that? Any pointers would be highly appreciated! Thanks.
Turned out I solved the problem as follows (credit to Matthew):
Delete the etcd data directory with the following command:
rm -rf /var/lib/etcd/*
To fix the message cannot listen on TLS for 10.240.0.12:2380: KeyFile and CertFile are not presented, I revised the command to start the etcd as follows:
sudo etcd --name controller-2 --listen-client-urls https://10.240.0.12:2379,http://127.0.0.1:2379 --advertise-client-urls https://10.240.0.12:2379 --listen-peer-urls https://10.240.0.12:2380 --initial-advertise-peer-urls https://10.240.0.12:2380 --initial-cluster-state existing --initial-cluster controller-0=https://10.240.0.10:2380,controller-1=https://10.240.0.11:2380,controller-2=https://10.240.0.12:2380 --peer-trusted-ca-file /etc/etcd/ca.pem --cert-file /etc/etcd/kubernetes.pem --key-file /etc/etcd/kubernetes-key.pem --peer-cert-file /etc/etcd/kubernetes.pem --peer-key-file /etc/etcd/kubernetes-key.pem --data-dir /var/lib/etcd
A few points to note here:
The newly added arguments --cert-file and --key-file presented the required key and certificate of controller2.
Argument --peer-trusted-ca-file is also presented so as to check if the x509 certificate presented by controller0 and controller1 are signed by a known CA. If this is not presented, error etcdserver: could not get cluster response from https://10.240.0.11:2380: Get https://10.240.0.11:2380/members: x509: certificate signed by unknown authority may be encountered.
The value presented for the argument --initial-cluster needs to be in-line with that shown in the systemd unit file.
if you are re-adding the more easy solution is following
rm -rf /var/lib/etcd/*
kubeadm join phase control-plane-join etcd --control-plane

Redis fails to start with error: redis-server.service: Failed at step NAMESPACE spawning /usr/bin/redis-server: Stale file handle

After upgrading Debian, it has an issue starting redis-server.service.
In the output of journalctl -xe I see the following:
redis-server.service: Failed at step NAMESPACE spawning /usr/bin/redis-server: Stale file handle.
I can't start the redis-server.service and in the output of the systemctl start redis-server I have:
Job for redis-server.service failed because the control process exited with error code.
See "systemctl status redis-server.service" and "journalctl -xe" for details.
In the output of systemctl status redis-server I have:
● redis-server.service - Advanced key-value store
Loaded: loaded (/lib/systemd/system/redis-server.service; disabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Mon 2018-01-29 10:29:08 MSK; 58s ago
Docs: http://redis.io/documentation,
man:redis-server(1)
Process: 11701 ExecStop=/bin/kill -s TERM $MAINPID (code=exited, status=226/NAMESPACE)
Process: 11720 ExecStart=/usr/bin/redis-server /etc/redis/redis.conf (code=exited, status=226/NAMESPACE)
Main PID: 10193 (code=exited, status=0/SUCCESS)
Jan 29 10:29:08 xxx systemd[1]: redis-server.service: Service hold-off time over, scheduling restart.
Jan 29 10:29:08 xxx systemd[1]: redis-server.service: Scheduled restart job, restart counter is at 5.
Jan 29 10:29:08 xxx systemd[1]: Stopped Advanced key-value store.
My question how to fix this issue and start redis-server.service?
Found a workaround solution:
I've played around with the /lib/systemd/system/redis-server.service editing service file as root, commenting out different fields trying to find where the failure happens and restarting systemd (via systemctl daemon-reload, systemctl stop redis-server, systemctl start redis-server)
For me the issue was the following line in the redis-server.service file:
ReadOnlyDirectories=/
which I have commented out and that allowed redis-server to start succesfully.
So my current /lib/systemd/system/redis-server.service is:
[Unit]
Description=Advanced key-value store
After=network.target
Documentation=http://redis.io/documentation, man:redis-server(1)
[Service]
Type=forking
ExecStart=/usr/bin/redis-server /etc/redis/redis.conf
ExecStop=/bin/kill -s TERM $MAINPID
PIDFile=/var/run/redis/redis-server.pid
TimeoutStopSec=0
Restart=always
User=redis
Group=redis
RuntimeDirectory=redis
RuntimeDirectoryMode=2755
UMask=007
PrivateTmp=yes
LimitNOFILE=65535
PrivateDevices=yes
ProtectHome=yes
#Modified 20180129 to avoid issue to start redis
#redis-server.service: Failed at step NAMESPACE spawning /usr/bin/redis-server: Stale file handle
#ReadOnlyDirectories=/
ReadWriteDirectories=-/var/lib/redis
ReadWriteDirectories=-/var/log/redis
ReadWriteDirectories=-/var/run/redis
NoNewPrivileges=true
CapabilityBoundingSet=CAP_SETGID CAP_SETUID CAP_SYS_RESOURCE
MemoryDenyWriteExecute=true
ProtectKernelModules=true
ProtectKernelTunables=true
ProtectControlGroups=true
RestrictRealtime=true
RestrictNamespaces=true
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
# redis-server can write to its own config file when in cluster mode so we
# permit writing there by default. If you are not using this feature, it is
# recommended that you replace the following lines with "ProtectSystem=full".
ProtectSystem=true
ReadWriteDirectories=-/etc/redis
[Install]
WantedBy=multi-user.target
Alias=redis.service
I've just encountered this issue after running sudo apt -y dist-upgrade and got this slightly different error in /var/log/syslog
redis-server.service: Failed at step NAMESPACE spawning /usr/bin/redis-server: Invalid argument
The solution, Launchpad Bug 1638410, is:
sudo systemctl edit redis-server
[Service]
ProtectHome=no
Save and exit the editor and complete the upgrade:
sudo apt install -f
cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.5 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.5 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
$ apt policy redis-server
redis-server:
Installed: 5:5.0.0-3chl1~xenial1
Candidate: 5:5.0.0-3chl1~xenial1
Version table:
*** 5:5.0.0-3chl1~xenial1 500
500 http://ppa.launchpad.net/chris-lea/redis-server/ubuntu xenial/main amd64 Packages
100 /var/lib/dpkg/status
2:3.0.6-1 500
500 http://archive.ubuntu.com/ubuntu xenial/universe amd64 Packages
If you're using Ubuntu, in
/etc/redis/redis.conf
you should have :
supervised systemd

Cannot create systemd script for Puma

I created a service script named "puma.service" in /etc/systemd/system/ with the following contents:
[Unit]
Description=Puma HTTP Server
After=network.target
[Service]
Type=simple
User=ubuntu
WorkingDirectory=/home/username/appdir/current
ExecStart=/bin/bash -lc "/home/username/appdir/current/sbin/puma -C /home/username/appdir/current/config/puma.rb /home/username/appdir/current/config.ru"
Restart=always
[Install]
WantedBy=multi-user.target
I enabled the service and when started, I'm getting the following log from systemctl:
● puma.service - Puma HTTP Server
Loaded: loaded (/etc/systemd/system/puma.service; enabled; vendor preset: enabled)
Active: inactive (dead) (Result: exit-code) since Wed 2016-12-14 10:09:46 UTC; 12min ago
Process: 16889 ExecStart=/bin/bash -lc cd /home/username/appdir/current && bundle exec puma -C /home/username/appdir..
Main PID: 16889 (code=exited, status=127)
Dec 14 10:09:46 ip-172-31-29-40 systemd[1]: puma.service: Main process exited, code=exited, status=127/n/a
Dec 14 10:09:46 ip-172-31-29-40 systemd[1]: puma.service: Unit entered failed state.
Dec 14 10:09:46 ip-172-31-29-40 systemd[1]: puma.service: Failed with result 'exit-code'.
Dec 14 10:09:46 ip-172-31-29-40 systemd[1]: puma.service: Service hold-off time over, scheduling restart.
Dec 14 10:09:46 ip-172-31-29-40 systemd[1]: Stopped Puma HTTP Server.
Dec 14 10:09:46 ip-172-31-29-40 systemd[1]: puma.service: Start request repeated too quickly.
Dec 14 10:09:46 ip-172-31-29-40 systemd[1]: Failed to start Puma HTTP Server.
Although, when I give the command in SSH terminal the server started and was running perfect. Is there any changes I have to make in the service file?
Note:
I have changed the dirnames for your convenience.
I did some research and the cause of status 127 is due to the executable not in the path. But, I guess that won't be a problem.
Can you shed some light?
I found the problem and changed the ExecStart as mentioned below and it worked like a charm:
ExecStart=/home/username/.rbenv/shims/bundle exec puma -e production -C ./config/puma.rb config.ru
PIDFile=/home/username/appdir/shared/tmp/pids/puma.pid
bundle should be taken from the rbenv shims and also the puma's config file (config/puma.rb) and application's config file (config.ru) can be given in relative path.
One way to solve it is to specify a PID file and systemd will take a look at that file to check on service status.
Here's how we use this in our scripts (adapted to your given sample)
ExecStart=/bin/bash -lc '/home/username/appdir/current/sbin/puma -C /home/username/appdir/current/config/puma.rb /home/username/appdir/current/config.ru --pidfile /home/username/appdir/current/tmp/pids/puma.pid'
PIDFile=/home/username/appdir/current/tmp/pids/puma.pid
Take note that you might have to configure --pidfile via your -C puma.rb file instead of passing it an as parameter. I'm just showing it here to illustrate that --pidfile (in puma config) should be the same as PIDFile in the service file.
As for why the error message is that way, I'm not sure myself and is interested in the answer too.
For rvm users try with
[Unit]
Description=Puma HTTP Server
After=network.target
[Service]
Type=simple
User=user_name
Group=user_name
WorkingDirectory=/home/user_name/apps/app_name/current
Environment=RAILS_ENV=production
ExecStart=/home/user_name/.rvm/bin/rvm ruby-3.1.2#app_name do bundle exec --keep-file-descriptors puma -C /home/user_name/apps/app_name/current/config/puma/production.rb
ExecReload=/bin/kill -USR1 $MAINPID
StandardOutput=append:/home/user_name/apps/app_name/current/log/puma_access.log
StandardError=append:/home/user_name/apps/app_name/current/log/puma_error.log
SyslogIdentifier=app_name-puma
Restart=always
KillMode=process
[Install]
WantedBy=multi-user.target
if you are not using a gemset change
ExecStart=/home/user_name/.rvm/bin/rvm ruby-3.1.2#app_name ....
to
ExecStart=/home/user_name/.rvm/bin/rvm default