nfs-kernel-server failed to start on Debian - Kubernetes

I'm a bit new to this topic and it would be great if someone could provide some pointers in this case:
I have been trying to use NFS on an in-house built cluster. Everything from the link I followed worked except the shared file system part:
installed nfs-kernel-server on the master node
installed nfs-common on the slave / worker nodes
configured /etc/exports on the master node as required
configured /etc/fstab on the worker nodes with the mount location (examples of both files are sketched below)
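For reference, the relevant entries look roughly like this; the export path, subnet, and master hostname below are placeholders, not my real values:
/etc/exports on the master node:
/srv/nfs/kubedata 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
/etc/fstab entry on each worker node:
master-node:/srv/nfs/kubedata /mnt/nfs nfs defaults 0 0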
But I am unable to start nfs-server on the master node, and the errors point to a problem with its dependencies.
If needed I can provide the full output from journalctl -xe, but the main error is "Failed to mount NFSD configuration filesystem."
Any pointer / solution would be greatly appreciated.
Output from journalctl -xe:
The unit proc-fs-nfsd.mount has entered the 'failed' state with result 'exit-code'.
Dec 13 10:06:39 ccc-001 systemd[1]: Failed to mount NFSD configuration filesystem.
-- Subject: A start job for unit proc-fs-nfsd.mount has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
-- A start job for unit proc-fs-nfsd.mount has finished with a failure.
-- The job identifier is 1032 and the job result is failed.
Dec 13 10:06:39 ccc-001 systemd[1]: Dependency failed for NFS Mount Daemon.
-- Subject: A start job for unit nfs-mountd.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
-- A start job for unit nfs-mountd.service has finished with a failure.
-- The job identifier is 1034 and the job result is dependency.
Dec 13 10:06:39 ccc-001 systemd[1]: Dependency failed for NFS server and services.
-- Subject: A start job for unit nfs-server.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
-- A start job for unit nfs-server.service has finished with a failure.
-- The job identifier is 1027 and the job result is dependency.
Dec 13 10:06:39 ccc-001 systemd[1]: Dependency failed for NFSv4 ID-name mapping service.
-- Subject: A start job for unit nfs-idmapd.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
-- A start job for unit nfs-idmapd.service has finished with a failure.
-- The job identifier is 1037 and the job result is dependency.
Dec 13 10:06:39 ccc-001 systemd[1]: nfs-idmapd.service: Job nfs-idmapd.service/start failed with result 'dependency'.
Dec 13 10:06:39 ccc-001 systemd[1]: nfs-server.service: Job nfs-server.service/start failed with result 'dependency'.
Dec 13 10:06:39 ccc-001 systemd[1]: nfs-mountd.service: Job nfs-mountd.service/start failed with result 'dependency'.
Dec 13 10:06:39 ccc-001 systemd[1]: Condition check resulted in RPC security service for NFS server being skipped.
-- Subject: A start job for unit rpc-svcgssd.service has finished successfully
-- Defined-By: systemd
-- Support: https://www.debian.org/support
-- A start job for unit rpc-svcgssd.service has finished successfully.
-- The job identifier is 1042.
Dec 13 10:06:39 ccc-001 systemd[1]: Condition check resulted in RPC security service for NFS client and server being skipped.
-- Subject: A start job for unit rpc-gssd.service has finished successfully
-- Defined-By: systemd
-- Support: https://www.debian.org/support
-- A start job for unit rpc-gssd.service has finished successfully.
-- The job identifier is 1041.
Dec 13 10:06:39 ccc-001 sudo[4469]: pam_unix(sudo:session): session closed for user root
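For what it's worth, my first guess is that the nfsd kernel module / pseudo-filesystem is not available on this kernel, since proc-fs-nfsd.mount does nothing more than mount the nfsd filesystem on /proc/fs/nfsd. This is only a diagnostic sketch of what I intend to check, not a confirmed fix:
sudo modprobe nfsd                        # fails if the running kernel has no NFS server support
lsmod | grep nfsd                         # verify the module is actually loaded
sudo mount -t nfsd nfsd /proc/fs/nfsd     # roughly what proc-fs-nfsd.mount tries to do
sudo systemctl restart nfs-kernel-server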

Related

k3s.service won't start - Air Gapped install of k3s in Rocky 9 VM

I'm trying to install k3s in a disconnected environment on a VM running Rocky 9.
The k3s.service fails to start. It mentions permission denied.
As part of troubleshooting I did the following (the commands I used are sketched after this list):
Disabled SELinux
Disabled Swap memory
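For reference, this is roughly how I did those two steps (the exact commands are from memory and may not be exactly what the distribution expects):
sudo setenforce 0        # plus SELINUX=disabled in /etc/selinux/config for persistence
sudo swapoff -a          # plus commenting out the swap entry in /etc/fstab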
Install media:
Tar file: https://github.com/k3s-io/k3s/releases/download/v1.24.3%2Bk3s1/k3s-airgap-images-amd64.tar -> /var/lib/rancher/k3s/agent/images/k3s-airgap-images-amd64.tar
K3S binary: https://github.com/k3s-io/k3s/releases/download/v1.24.3%2Bk3s1/k3s -> /usr/local/bin/k3s
Install script: https://get.k3s.io/ -> /usr/local/install/k3s/install.sh
Install command using the install script:
sudo INSTALL_K3S_SKIP_DOWNLOAD=true ./install.sh
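In shell form, the placement above boils down to roughly the following; the mkdir and the chmod on the binary are my own assumptions about what is needed, not something the docs told me:
sudo mkdir -p /var/lib/rancher/k3s/agent/images/
sudo cp k3s-airgap-images-amd64.tar /var/lib/rancher/k3s/agent/images/
sudo cp k3s /usr/local/bin/k3s
sudo chmod +x /usr/local/bin/k3s     # assumption: the binary needs the execute bit
cd /usr/local/install/k3s
sudo INSTALL_K3S_SKIP_DOWNLOAD=true ./install.sh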
I noticed the following in /etc/systemd/system/:
-rw-r--r-- 1 root root 836 Aug 4 15:14 k3s.service
-rw------- 1 root root 0 Aug 4 15:14 k3s.service.env
The install script is meant to set permissions to 755 on the service, but that doesn't happen. Doing chmod 755 and rebooting the VM makes no difference to k3s.service starting (the check I ran is sketched below).
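Concretely, this is what I tried plus what I am guessing at next; whether the binary itself (rather than the unit file) also needs the execute bit is just a guess on my part:
ls -l /etc/systemd/system/k3s.service /usr/local/bin/k3s
sudo chmod 755 /etc/systemd/system/k3s.service
sudo chmod 755 /usr/local/bin/k3s    # guess: maybe the "permission denied" refers to the binary
sudo systemctl daemon-reload
sudo systemctl restart k3s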
Errors:
Job for k3s.service failed because the control process exited with error code.
See "systemctl status k3s.service" and "journalctl -xeu k3s.service" for details.
[admin@demolab01 k3s]$ systemctl status k3s.service
k3s.service - Lightweight Kubernetes
Loaded: loaded (/etc/systemd/system/k3s.service; enabled; vendor preset: disabled)
Active: activating (auto-restart) (Result: exit-code) since Thu 2022-08-04 15:30:22 UTC; 3s ago
Docs: https://k3s.io
Process: 4247 ExecStartPre=/bin/sh -xc ! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service (code=exited, status=0/SUCCESS)
Process: 4249 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
Process: 4250 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
Process: 4251 ExecStart=/usr/local/bin/k3s server (code=exited, status=1/FAILURE)
Main PID: 4251 (code=exited, status=1/FAILURE)
CPU: 18ms
[admin@demolab01 k3s]$ journalctl -xeu k3s.service
Subject: A start job for unit k3s.service has failed
Defined-By: systemd
Support: https://access.redhat.com/support
A start job for unit k3s.service has finished with a failure.
The job identifier is 38821 and the job result is failed.
Aug 04 15:31:24 demolab01.****<fqdn> systemd[1]: k3s.service: Scheduled restart job, restart counter is at 267.
Subject: Automatic restarting of a unit has been scheduled
Defined-By: systemd
Support: https://access.redhat.com/support
Automatic restarting of the unit k3s.service has been scheduled, as the result for
the configured Restart= setting for the unit.
Aug 04 15:31:24 demolab01.****<fqdn> systemd[1]: Stopped Lightweight Kubernetes.
Subject: A stop job for unit k3s.service has finished
Defined-By: systemd
Support: https://access.redhat.com/support
A stop job for unit k3s.service has finished.
The job identifier is 38959 and the job result is done.
Aug 04 15:31:24 demolab01.****<fqdn> systemd[1]: Starting Lightweight Kubernetes...
Subject: A start job for unit k3s.service has begun execution
Defined-By: systemd
Support: https://access.redhat.com/support
A start job for unit k3s.service has begun execution.
The job identifier is 38959.
Aug 04 15:31:24 demolab01.****<fqdn> sh[4359]: + /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service
Aug 04 15:31:24 demolab01.****<fqdn> sh[4360]: Failed to get unit file state for nm-cloud-setup.service: No such file or directory
Aug 04 15:31:25 demolab01.****<fqdn> k3s[4363]: time="2022-08-04T15:31:25Z" level=fatal msg="permission denied"
Aug 04 15:31:25 demolab01.****<fqdn> systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
Subject: Unit process exited
Defined-By: systemd
Support: https://access.redhat.com/support
An ExecStart= process belonging to unit k3s.service has exited.
The process' exit code is 'exited' and its exit status is 1.
Aug 04 15:31:25 demolab01.****<fqdn> systemd[1]: k3s.service: Failed with result 'exit-code'.
Subject: Unit failed
Defined-By: systemd
Support: https://access.redhat.com/support
The unit k3s.service has entered the 'failed' state with result 'exit-code'.
Aug 04 15:31:25 demolab01.****<fqdn> systemd[1]: Failed to start Lightweight Kubernetes.
Subject: A start job for unit k3s.service has failed
Defined-By: systemd
Support: https://access.redhat.com/support
A start job for unit k3s.service has finished with a failure.
Any ideas welcome. I am no Linux expert :-(

Ubuntu service stops randomly with "Main Process exited, status 143/n/a"

My apps are deployed as Debian packages and started via systemd services. The apps are crashing randomly and I am unable to find the reason for the crashes.
I have 4 applications running (built using Java and Scala), out of which two are getting killed (named op and common). All are started using systemd services.
The error in syslog is:
Jul 22 11:45:44 misqa mosquitto[2930]: Socket error on client 005056b76983-Common, disconnecting
Jul 22 11:45:44 misqa systemd[1]: commonmod.service: Main process exited, code=exited, status=143/n/a
Jul 22 11:45:44 misqa systemd[1]: commonmod.service: Unit entered failed state
Jul 22 11:45:44 misqa systemd[1]: commonmod.service: Failed with result 'exit-code'
Jul 22 11:45:44 misqa systemd[1]: opmod.service: Main process exited, code=exited, status=143/n/a
Jul 22 11:45:44 misqa systemd[1]: opmod.service: Unit entered failed state
Jul 22 11:45:44 misqa systemd[1]: opmod.service: Failed with result 'exit-code'
But I am not getting any error in the application log files for either op or common.
From further reading, I understand that the crashes are caused by a SIGTERM signal, but I am unable to find out what is sending it. None of these applications exec any kind of killall command.
Is there any way to identify which process is killing my applications?
My systemd service is like this:
[Unit]
Description=common Module
After=common-api
Requires=common-api
[Service]
TimeoutStartSec=0
ExecStart=/usr/bin/common-api
[Install]
WantedBy=multi-user.target
Exit status 143 is simply 128 + 15, i.e. the process exiting in response to SIGTERM; Java programs often don't map a SIGTERM shutdown back to the exit status systemd expects for a clean stop.
You should be able to suppress this by declaring that exit code as a "success" exit status in the systemd service file:
[Service]
SuccessExitStatus=143
This solution was successfully applied here (Server Fault) and here (Stack Overflow), both with Java apps.
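Applied to the unit from the question, the change would look something like this (same unit, only the SuccessExitStatus line is new); after editing, reload systemd and restart the service:
[Unit]
Description=common Module
After=common-api
Requires=common-api
[Service]
TimeoutStartSec=0
ExecStart=/usr/bin/common-api
SuccessExitStatus=143
[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl restart commonmod.service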

How to activate bcm2835_wdt watchdog kernel module for raspberry pi 3?

I have been trying to activate the bcm2835_wdt watchdog module of the Raspberry Pi 3 for 6 hours, but I couldn't get it working.
modprobe bcm2835_wdt returns no error, but lsmod does not show the bcm2835_wdt module in the list.
I have installed watchdog and chkconfig,
then:
sudo chkconfig watchdog on
When I try to start the service:
sudo /etc/init.d/watchdog start
I get an error:
[....] Starting watchdog (via systemctl): watchdog.service Job for watchdog.service failed because the control process exited with error code.
See "systemctl status watchdog.service" and "journalctl -xe" for details.
failed!
journalctl -xe returns;
-- Kernel start-up required 2093448 microseconds.
--
-- Initial RAM disk start-up required INITRD_USEC microseconds.
--
-- Userspace start-up required 5579375635 microseconds.
Jan 11 16:03:45 al sudo[935]: root : TTY=pts/1 ; PWD=/ ; USER=root ; COMMAND=/etc/init.d/watchdog start
Jan 11 16:03:45 al sudo[935]: pam_unix(sudo:session): session opened for user root by root(uid=0)
Jan 11 16:03:46 al systemd[1]: Starting watchdog daemon...
-- Subject: Unit watchdog.service has begun start-up
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit watchdog.service has begun starting up.
Jan 11 16:03:46 al sh[949]: modprobe: FATAL: Module dcm2835_wdt not found in directory /lib/modules/4.9.59-v7+
Jan 11 16:03:46 al systemd[1]: watchdog.service: Control process exited, code=exited status=1
Jan 11 16:03:46 al systemd[1]: Failed to start watchdog daemon.
-- Subject: Unit watchdog.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit watchdog.service has failed.
My question is: how do I enable the bcm2835_wdt watchdog kernel module for the Raspberry Pi 3?
Thank you in advance.
Maybe bcm2835_wdt has been compiled into the kernel on your system, so you don't see it with lsmod. Just try:
# cat /lib/modules/$(uname -r)/modules.builtin | grep wdt
kernel/drivers/watchdog/bcm2835_wdt.ko
If you can see it in the list, it has been compiled into the kernel. You can also check whether it has been enabled with:
journalctl --no-pager | grep -i watchdog
Regarding your watchdog configuration, look at this error:
modprobe: FATAL: Module dcm2835_wdt not found in directory /lib/modules/4.9.59-v7+
The module being requested is dcm2835_wdt, not bcm2835_wdt: there is a typo (d instead of b) in your watchdog configuration.
Also, keep in mind that the watchdog may already be in use by systemd, so you should refer to systemd's documentation for using it.
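If the typo lives in the watchdog package's defaults file, correcting the module name there should be enough. This assumes the standard Debian layout where /etc/default/watchdog names the module to load:
# /etc/default/watchdog  (assumed location of the setting)
watchdog_module="bcm2835_wdt"
sudo systemctl restart watchdog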
If you don't mind, you may also try a fork bomb to see whether the watchdog is able to restart your system when a problem is detected:
python -c "import os, itertools; [os.fork() for i in itertools.count()]"

Job for kube-apiserver.service failed because the control process exited with error code

To begin with, I want to point out that I am fairly new to Linux systems and totally, totally new to Kubernetes, so my question may be trivial.
As stated in the title, I have a problem setting up the Kubernetes cluster. I am working on Atomic Host version 7.1707 (2017-07-31 16:12:06).
I am following this guide:
http://www.projectatomic.io/docs/gettingstarted/
In addition to that I followed this:
http://www.projectatomic.io/docs/kubernetes/
To be precise, I ran this command:
rpm-ostree install kubernetes-master --reboot
Everything was going fine until this point:
systemctl start etcd kube-apiserver kube-controller-manager kube-scheduler
The problem is with:
systemctl start etcd kube-apiserver
as it gives me back this response:
Job for kube-apiserver.service failed because the control process exited with error code. See "systemctl status kube-apiserver.service" and "journalctl -xe" for details.
systemctl status kube-apiserver.service
gives me back:
● kube-apiserver.service - Kubernetes API Server
Loaded: loaded (/usr/lib/systemd/system/kube-apiserver.service; enabled; vendor preset: disabled)
Active: failed (Result: start-limit) since Fri 2017-08-25 14:29:56 CEST; 2s ago
Docs: https://github.com/GoogleCloudPlatform/kubernetes
Process: 17876 ExecStart=/usr/bin/kube-apiserver $KUBE_LOGTOSTDERR $KUBE_LOG_LEVEL $KUBE_ETCD_SERVERS $KUBE_API_ADDRESS $KUBE_API_PORT $KUBELET_PORT $KUBE_ALLOW_PRIV $KUBE_SERVICE_ADDRESSES $KUBE_ADMISSION_CONTROL $KUBE_API_ARGS (code=exited, status=255)
Main PID: 17876 (code=exited, status=255)
Aug 25 14:29:56 master systemd[1]: kube-apiserver.service: main process exited, code=exited, status=255/n/a
Aug 25 14:29:56 master systemd[1]: Failed to start Kubernetes API Server.
Aug 25 14:29:56 master systemd[1]: Unit kube-apiserver.service entered failed state.
Aug 25 14:29:56 master systemd[1]: kube-apiserver.service failed.
Aug 25 14:29:56 master systemd[1]: kube-apiserver.service holdoff time over, scheduling restart.
Aug 25 14:29:56 master systemd[1]: start request repeated too quickly for kube-apiserver.service
Aug 25 14:29:56 master systemd[1]: Failed to start Kubernetes API Server.
Aug 25 14:29:56 master systemd[1]: Unit kube-apiserver.service entered failed state.
Aug 25 14:29:56 master systemd[1]: kube-apiserver.service failed.
I have no clue where to start and I will be more than thankful for any advice.
It turned out to be a typo in /etc/kubernetes/config. I misunderstood the "# Comma separated list of nodes in the etcd cluster" comment.
I don't know how to close the thread or anything.
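For anyone hitting the same thing, the value in question usually ends up looking something like this; the address is a placeholder and the exact file and variable name depend on the packaging:
KUBE_ETCD_SERVERS="--etcd-servers=http://127.0.0.1:2379"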

Mongodb fails to start -> presents weird error logs

When I moved my environment from my local machine (Mac) to my server (Ubuntu), I unzipped my directory and ran npm install with no errors or warnings, but my database was failing, so I decided to reinstall it (after an apt remove mongo* first) based on this tutorial:
https://www.digitalocean.com/community/tutorials/how-to-install-mongodb-on-ubuntu-16-04
But then I get:
Job for mongodb.service failed because the control process exited with error code. See "systemctl status mongodb.service" and "journalctl -xe" for details.
Does anyone know what any of this means?
-- Unit mongodb.service has begun starting up.
Jun 20 03:54:18 ip-172-31-16-163 mongodb[25271]: * Starting database mongodb
Jun 20 03:54:19 ip-172-31-16-163 mongodb[25271]: ...fail!
Jun 20 03:54:19 ip-172-31-16-163 systemd[1]: mongodb.service: Control process exited, code=exited status=1
Jun 20 03:54:19 ip-172-31-16-163 sudo[25268]: pam_unix(sudo:session): session closed for user root
Jun 20 03:54:19 ip-172-31-16-163 systemd[1]: Failed to start LSB: An object/document-oriented database.
-- Subject: Unit mongodb.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit mongodb.service has failed.
--
-- The result is failed.
Jun 20 03:54:19 ip-172-31-16-163 systemd[1]: mongodb.service: Unit entered failed state.
Jun 20 03:54:19 ip-172-31-16-163 systemd[1]: mongodb.service: Failed with result 'exit-code'.
Looks familiar. Check the ownership of the files: the files in dbPath, the mongod lock file, the keyfile, and so on.
Basically, all the files that are listed in your /etc/mongod.conf.
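A quick sketch of the ownership fix, assuming the default Ubuntu paths and the mongodb user/group (adjust the paths to whatever your /etc/mongod.conf actually points at):
sudo chown -R mongodb:mongodb /var/lib/mongodb
sudo chown -R mongodb:mongodb /var/log/mongodb
sudo systemctl restart mongodb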
Alternatively, run the following command; it worked for me:
sudo apt-get install --reinstall mongodb