Program.service ExecStart fails but the program itself runs - service

I am testing how to run a script using a .service file on CentOS 7.
The script is a very simple loop, just to make sure it runs:
if [ "$1" == "start" ] || [ "$1" == "cycle" ]
then
/u/Test/Bincustom/haltrun_wrap.sh run &
echo $! /u/Test/Locks/start.pid
exit
elif [ "$1" == "stop" ] || [ "$1" == "halt" ]
then
killall -q -9 haltrun_wrap.sh
echo " " /u/Test/Locks/start.pid
elif [ "$1" == "run" ]
then
process_id=$(pidof haltrun_wrap.sh)
#echo $process_id /u/Test/Locks/start.pid
while [ 1 ]
do
CurTime=$(date)
echo $CurTime /u/Test/Logs/log
sleep 30s
done
else
cat /u/Test/Locks/start.pid
cat /u/Test/Logs/log
fi
That script runs fine as the root or test user if I launch it manually.
The Program.service file looks like this:
[Unit]
Description=Program
[Service]
Type=forking
RemainAfterExit=yes
PIDFile=/u/Test/Locks/start.pid
EnvironmentFile=/u/Test/Config/environ
Environment="Base="sudo -u sirsi '/u/Test/Bincustom/Program " "Stop=halt force'" "Start=cycle force'""
ExecStart=/bin/sh $Base$Start
ExecStop=/bin/sh $Base$Stop
[Install]
WantedBy=multi-user.target
WantedBy=WebServices
WantedBy=BCA
The error is always:
● Program.service - Program
Loaded: loaded (/usr/lib/systemd/system/Program.service; enabled; vendor preset: disabled)
Active: failed (Result: resources) since Wed 2017-01-11 14:53:10 MST; 1s ago
Process: 12014 ExecStart=/bin/sh $Base$Start (code=exited, status=0/SUCCESS)
Jan 11 14:53:09 localhost.localdomain systemd[1]: Starting Program...
Jan 11 14:53:10 localhost.localdomain systemd[1]: PID file /u/Test/Locks/start.pid not readable (yet?) after start.
Jan 11 14:53:10 localhost.localdomain systemd[1]: Failed to start Program.
Jan 11 14:53:10 localhost.localdomain systemd[1]: Unit Program.service entered failed state.
Jan 11 14:53:10 localhost.localdomain systemd[1]: Program.service failed.
Obviously I'm doing something wrong in the .service file, but for the life of me I am still missing it.

The issue was in these lines:
Environment="Base="sudo -u sirsi '/u/Test/Bincustom/Program " "Stop=halt force'" "Start=cycle force'""
ExecStart=/bin/sh $Base$Start
ExecStop=/bin/sh $Base$Stop
Apparently .service files do not expand variables the way a shell would, so the nested quoting in that Environment= line never produced the command I expected.
I also had an issue with sudo not being allowed to run my test script, so I had to add the sudo call inside the test script instead.
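For reference, here is a minimal sketch of a unit that avoids the variable trick entirely, assuming /u/Test/Bincustom/Program takes the same "cycle force" / "halt force" arguments and that letting systemd switch to the sirsi user via User= (instead of sudo) is acceptable:
[Unit]
Description=Program

[Service]
Type=forking
User=sirsi
PIDFile=/u/Test/Locks/start.pid
EnvironmentFile=/u/Test/Config/environ
# Commands written out in full; systemd does not apply shell quoting rules here
ExecStart=/u/Test/Bincustom/Program cycle force
ExecStop=/u/Test/Bincustom/Program halt force

[Install]
WantedBy=multi-user.target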

Related

How to run a process in daemon mode with systemd service?

I've googled and read quite a few blogs and posts on this, and I've also been trying things out manually on my EC2 instance. However, I'm still not able to configure the systemd service unit so that it runs the process in the background as I expect. The process I'm running is the Nessus agent service. Here's my service unit definition:
$ cat /etc/systemd/system/nessusagent.service
[Unit]
Description=Nessus
[Service]
ExecStart=/opt/myorg/bin/init_nessus
Type=simple
[Install]
WantedBy=multi-user.target
and here is my script /opt/myorg/bin/init_nessus:
$ cat /opt/apiq/bin/init_nessus
#!/usr/bin/env bash
set -e
NESSUS_MANAGER_HOST=...
NESSUS_MANAGER_PORT=...
NESSUS_CLIENT_GROUP=...
NESSUS_LINKING_KEY=...
#-------------------------------------------------------------------------------
# link nessus agent with manager host
#-------------------------------------------------------------------------------
/opt/nessus_agent/sbin/nessuscli agent link --key=${NESSUS_LINKING_KEY} --host=${NESSUS_MANAGER_HOST} --port=${NESSUS_MANAGER_PORT} --groups=${NESSUS_CLIENT_GROUP}
if [ $? -ne 0 ]; then
echo "Cannot link the agent to the Nessus manager, quitting."
exit 1
fi
/opt/nessus_agent/sbin/nessus-service -q -D
When I run the service, I always get the following:
$ systemctl status nessusagent.service
● nessusagent.service - Nessus
Loaded: loaded (/etc/systemd/system/nessusagent.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Mon 2020-08-24 06:40:40 UTC; 9min ago
Process: 27787 ExecStart=/opt/myorg/bin/init_nessus (code=exited, status=0/SUCCESS)
Main PID: 27787 (code=exited, status=0/SUCCESS)
...
Aug 24 06:40:40 ip-10-27-0-104 init_nessus[27787]: + /opt/nessus_agent/sbin/nessuscli agent link --key=... --host=... --port=8834 --groups=...
Aug 24 06:40:40 ip-10-27-0-104 init_nessus[27787]: [info] [agent] HostTag::getUnix: setting TAG value to '8596420322084e3ab97d3c39e5c92e00'
Aug 24 06:40:40 ip-10-27-0-104 init_nessus[27787]: [info] [agent] Successfully linked to <myorg.com>:8834
Aug 24 06:40:40 ip-10-27-0-104 init_nessus[27787]: + '[' 0 -ne 0 ']'
Aug 24 06:40:40 ip-10-27-0-104 init_nessus[28506]: + /opt/nessus_agent/sbin/nessus-service -q -D
However, I can't see the process that I expect to see:
$ ps faux | grep nessus
root 28565 0.0 0.0 12940 936 pts/0 S+ 06:54 0:00 \_ grep --color=auto nessus
If I run the last command manually, I can see it:
$ /opt/nessus_agent/sbin/nessus-service -q -D
$ ps faux | grep nessus
root 28959 0.0 0.0 12940 1016 pts/0 S+ 07:00 0:00 \_ grep --color=auto nessus
root 28952 0.0 0.0 6536 116 ? S 07:00 0:00 /opt/nessus_agent/sbin/nessus-service -q -D
root 28953 0.2 0.0 69440 9996 pts/0 Sl 07:00 0:00 \_ nessusd -q
What is it that I'm missing here?
Eventually I figured out that this was because of the extra -D option in the last command; removing the -D option fixed the issue. Running the process in daemon mode inside a service manager is not the way to go: we need to run it in the foreground and let the service manager handle it.
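In other words, the fix was to keep the last line of init_nessus in the foreground. A sketch of the corrected ending, with exec added here as an assumption so that systemd tracks nessus-service directly:
# Stay in the foreground (no -D) and replace the shell,
# so systemd supervises the nessus-service process itself
exec /opt/nessus_agent/sbin/nessus-service -q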

Can't start a systemd service

I am trying to create a service that runs a console command, so that in the future I can convert all my crontab commands to systemd, but I always get this error. I have tried different tutorials and hit the same problem every time.
# systemctl status hello-world.service
● hello-world.service - Hello World Service
Loaded: loaded (/usr/lib/systemd/system/hello-world.service; disabled; vendor preset: disabled)
Active: failed (Result: exit-code) since mié 2019-10-09 10:06:59 CEST; 4s ago
Process: 26080 ExecStart=/usr/share/nginx/html/scripts-systemd/hello-world.sh (code=exited, status=203/EXEC)
Main PID: 26080 (code=exited, status=203/EXEC)
oct 09 10:06:59 ns37 systemd[1]: Started Hello World Service.
oct 09 10:06:59 ns37 systemd[1]: hello-world.service: main process exited, code=exited, status=203/EXEC
oct 09 10:06:59 ns37 systemd[1]: Unit hello-world.service entered failed state.
oct 09 10:06:59 ns37 systemd[1]: hello-world.service failed.
hello-world.sh file
#!/bin/bash
while sleep 30;
do
echo "hello world"
done
hello-world.service file
[Unit]
Description=Hello World Service
After=systemd-user-sessions.service
[Service]
Type=simple
ExecStart=/usr/share/nginx/html/scripts-systemd/hello-world.sh
[Install]
WantedBy=multi-user.target
I'm using CentOS 7.
Edit:
What I need to do is execute console commands at certain times every day, because of problems with crontab.
I was using this example to check that everything works; once it works, I will change the commands.
Here is an example of a crontab command:
*/10 * * * * cd /usr/share/nginx/html/mywebsite.com; php wp-cron.php >/dev/null 2>&1
0 0 */3 * * date=`date -I`; zip -r /root/copias/copia-archivos-html-webs$date.zip /usr/share/nginx/html
15 15 * * * wget -q -O /dev/null https://mywebsite.com/?run_plugin=key_0_0
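For reference, the first entry above would translate to a service/timer pair roughly like this (the unit names and the php path are just placeholders):
# wp-cron.service
[Unit]
Description=Run wp-cron for mywebsite.com

[Service]
Type=oneshot
WorkingDirectory=/usr/share/nginx/html/mywebsite.com
ExecStart=/usr/bin/php wp-cron.php

# wp-cron.timer
[Unit]
Description=Run wp-cron every 10 minutes

[Timer]
OnCalendar=*:0/10
Unit=wp-cron.service

[Install]
WantedBy=timers.target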
Edit 2: Done! I've managed to do it and it works for now. I'm leaving the code here so it can be useful to other people.
hello-world.sh file
#!/usr/bin/env bash
/usr/bin/mysqldump -user -pass db_name >/root/copias/backupname.sql
hello-world.service
[Unit]
Description=CopiaSql
[Service]
Type=oneshot
ExecStart=/bin/bash /usr/share/nginx/html/scripts-systemd/hello-world.sh
[Install]
WantedBy=multi-user.target
hello-world.timer
[Unit]
Description=Runs every 2 minutes test.sh
[Timer]
OnCalendar=*:0/2
Unit=hello-world.service
[Install]
WantedBy=timers.target
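To activate it, enable and start the timer (not the service):
systemctl daemon-reload
systemctl enable hello-world.timer
systemctl start hello-world.timer
systemctl list-timers hello-world.timer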
Thanks everyone for your help!
I have a first boot install service that displays on the console and it looks like:
[Unit]
After=multi-user.target
# tty getty service login prompts for tty1 & tty6
# will not be seen until this install completes.
Before=getty@tty1.service getty@tty6.service
[Service]
Type=oneshot
ExecStart=/bin/bash -c "export TERM=vt100;/var/ssi/firstboot_install.sh"
StandardOutput=tty
StandardInput=tty
[Install]
WantedBy=multi-user.target
The script it runs also has this code at the start:
#---------------------------------------------------------------------
# Switch to tty6 so input is allowed from installation questions
# Change back to tty1 at end of this script to show normal booting
# messages from systemd.
#---------------------------------------------------------------------
exec < /dev/tty6 > /dev/tty6
chvt 6
At the end of the script I change it back:
# Now that the system has been registered and has a few channels added,
# I can have the installation go back to the main Anaconda output screen
# on tty1
chvt 1
exec < /dev/tty1 > /dev/tty1
exit 0
This might not be exactly what you want, but you can adapt it to your needs. The goal here is to display something on the console that starts during the boot sequence. My script asks a number of installation questions, and input is NOT allowed on tty1 (the console), which is why I switch to tty6 so input is allowed during the first-boot installation.
For your script, try:
#!/bin/bash
exec < /dev/tty6 > /dev/tty6
chvt 6
while sleep 30;
do
echo "hello world"
done
chvt 1
exec < /dev/tty1 > /dev/tty1
This might be overkill for what you're trying to do, but if you need input from the console, you should do the same with tty6.

Using Supervisord to manage mongos process

Background
I am trying to automate restarting the mongos process, used in a MongoDB sharded setup, in case of a crash or reboot.
Case 1: using the direct command, with the mongod user
supervisord config
[program:mongos_router]
command=/usr/bin/mongos -f /etc/mongos.conf --pidfilepath=/var/run/mongodb/mongos.pid
user=mongod
autostart=true
autorestart=true
startretries=10
Result
supervisord log
INFO spawned: 'mongos_router' with pid 19535
INFO exited: mongos_router (exit status 0; not expected)
INFO gave up: mongos_router entered FATAL state, too many start retries too quickly
mongodb log
2018-05-01T21:08:23.745+0000 I SHARDING [Balancer] balancer id: ip-address:27017 started
2018-05-01T21:08:23.745+0000 E NETWORK [mongosMain] listen(): bind() failed errno:98 Address already in use for socket: 0.0.0.0:27017
2018-05-01T21:08:23.745+0000 E NETWORK [mongosMain] addr already in use
2018-05-01T21:08:23.745+0000 I - [mongosMain] Invariant failure inShutdown() src/mongo/db/auth/user_cache_invalidator_job.cpp 114
2018-05-01T21:08:23.745+0000 I - [mongosMain]
***aborting after invariant() failure
2018-05-01T21:08:23.748+0000 F - [mongosMain] Got signal: 6 (Aborted).
The process is seen running, but if killed it does not restart automatically.
Case 2 : Using init script
Here the slight change in the scenario is that some ulimit commands and the creation of the pid file must be done as root, and then the actual process should be started as the mongod user.
mongos script
start()
{
    # Make sure the default pidfile directory exists
    if [ ! -d $PID_PATH ]; then
        install -d -m 0755 -o $MONGO_USER -g $MONGO_GROUP $PIDDIR
    fi
    # Make sure the pidfile does not exist
    if [ -f $PID_FILE ]; then
        echo "Error starting mongos. $PID_FILE exists."
        RETVAL=1
        return
    fi
    ulimit -f unlimited
    ulimit -t unlimited
    ulimit -v unlimited
    ulimit -n 64000
    ulimit -m unlimited
    ulimit -u 64000
    ulimit -l unlimited
    echo -n $"Starting mongos: "
    #daemon --user "$MONGO_USER" --pidfile $PID_FILE $MONGO_BIN $OPTIONS --pidfilepath=$PID_FILE
    #su $MONGO_USER -c "$MONGO_BIN -f $CONFIGFILE --pidfilepath=$PID_FILE >> /home/mav/startup_log"
    su - mongod -c "/usr/bin/mongos -f /etc/mongos.conf --pidfilepath=/var/run/mongodb/mongos.pid"
    RETVAL=$?
    echo -n "Return value : "$RETVAL
    echo
    [ $RETVAL -eq 0 ] && touch $MONGO_LOCK_FILE
}
The commented-out daemon line represents the original script, but daemonizing under supervisord is not logical, so I am using the su command to run the process in the foreground(?).
supervisord config
[program:mongos_router_script]
command=/etc/init.d/mongos start
user=root
autostart=true
autorestart=true
startretries=10
Result
supervisord log
INFO spawned: 'mongos_router_script' with pid 20367
INFO exited: mongos_router_script (exit status 1; not expected)
INFO gave up: mongos_router_script entered FATAL state, too many start retries too quickly
mongodb log
Nothing indicating an error, just normal logs.
The process is seen running, but if killed it does not restart automatically.
Problem
How do I correctly configure the script / no-script options for running mongos under supervisord?
EDIT 1
Modified Command
sudo su -c "/usr/bin/mongos -f /etc/mongos.conf --pidfilepath=/var/run/mongodb/mongos.pid" -s /bin/bash mongod
This works if run individually on the command line, as well as from the script, but not with supervisord.
EDIT 2
Added the following option to the mongos config file to force it to run in the foreground:
processManagement:
  fork: false  # fork and run in background
Now both the command line and the script properly run it in the foreground, but supervisord fails to launch it. At the same time, 3 processes show up when it is run from the command line or the script:
root sudo su -c /usr/bin/mongos -f /etc/mongos.conf --pidfilepath=/var/run/mongodb/mongos.pid -s /bin/bash mongod
root su -c /usr/bin/mongos -f /etc/mongos.conf --pidfilepath=/var/run/mongodb/mongos.pid -s /bin/bash mongod
mongod /usr/bin/mongos -f /etc/mongos.conf --pidfilepath=/var/run/mongodb/mongos.pid
EDIT 3
With the following supervisord config things are working fine, but I would still like to execute the script if possible, in order to set the ulimits.
[program:mongos_router]
command=/usr/bin/mongos -f /etc/mongos.conf --pidfilepath=/var/run/mongodb/mongos.pid
user=mongod
autostart=true
autorestart=true
startretries=10
numprocs=1
For mongos to run in the foreground, set the following option:
# how the process runs
processManagement:
  fork: false  # fork and run in background
With that and the above supervisord.conf settings, mongos will be launched and kept under supervisord's control.
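If the ulimit settings from the init script are still needed, one option (just a sketch, untested) is a small wrapper that raises the limits and then execs mongos in the foreground, with command= in the supervisord config pointed at the wrapper; the ulimit calls can only raise soft limits up to the hard limits inherited from supervisord, so very low hard limits would have to be raised where supervisord itself starts.
#!/usr/bin/env bash
# Hypothetical wrapper, e.g. /usr/local/bin/mongos_foreground.sh
# Raise what limits we can, then replace this shell with mongos so that
# supervisord ends up supervising the mongos process itself.
ulimit -f unlimited
ulimit -t unlimited
ulimit -v unlimited
ulimit -n 64000
ulimit -m unlimited
ulimit -u 64000
ulimit -l unlimited
exec /usr/bin/mongos -f /etc/mongos.conf --pidfilepath=/var/run/mongodb/mongos.pid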

gsutil rsync -C "continue" option not working

gsutil rsync -C "continue" option is not working from backup_script:
$GSUTIL rsync -c -C -e -r -x $EXCLUDES $SOURCE/Documents/ $DESTINATION/Documents/
From systemd log:
$ journalctl --since 12:00
Jul 25 12:00:14 localhost.localdomain CROND[9694]: (wolfv) CMDOUT (CommandException: Error opening file "file:///home/wolfv/Documents/PC_maintenance/backup_systems/gsutil/ssmtp.conf": .)
Jul 25 12:00:14 localhost.localdomain CROND[9694]: (wolfv) CMDOUT (Caught ^C - exiting)
Jul 25 12:00:14 localhost.localdomain CROND[9694]: (wolfv) CMDOUT (Caught ^C - exiting)
Jul 25 12:00:14 localhost.localdomain CROND[9694]: (wolfv) CMDOUT (Caught ^C - exiting)
Jul 25 12:00:14 localhost.localdomain CROND[9694]: (wolfv) CMDOUT (Caught ^C - exiting)
This is because the file's owner is root rather than the user:
$ ls -l ssmtp.conf
-rw-r-----. 1 root root 1483 Jul 24 21:30 ssmtp.conf
rsync worked fine after deleting the root-owned file.
This happened on a Fedora 22 machine, when cron called backup_script, which in turn called gsutil rsync.
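In the meantime, a possible workaround (untested sketch) is to exclude the unreadable file via the -x option the script already uses, since -x takes a Python regular expression of paths to skip:
# Skip the root-owned file until the underlying problem is fixed
EXCLUDES='.*ssmtp\.conf$'
$GSUTIL rsync -c -C -e -r -x "$EXCLUDES" $SOURCE/Documents/ $DESTINATION/Documents/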
Thanks for reporting that problem. We'll get a fix for this bug in gsutil release 4.14.
Mike

Mongodb over Lustre?

I need to install a mongodb instance with a lot of data storage.
We have a Lustre FS with hundreds of terabytes, but when mongodb starts it shows me this error:
Mon Jul 15 12:06:50.898 [initandlisten] exception in initAndListen: 10310 Unable to lock file: /var/lib/mongodb/mongod.lock. Is a mongod instance already running?, terminating
Mon Jul 15 12:06:50.898 dbexit:
But the permissions should be fine:
# ls -lart /project/mongodb/
total 8
drwxr-xr-x 19 root root 4096 Jul 15 11:12 ..
-rwxr-xr-x 1 mongod mongod 0 Jul 15 11:54 mongod.lock
drwxr-xr-x 2 mongod mongod 4096 Jul 15 12:10 .
And no other running process:
# ps -fu mongod
UID PID PPID C STIME TTY TIME CMD
#
Has anyone done this (Lustre+mongodb)?
# rm mongod.lock
rm: remove regular empty file `mongod.lock'? y
# ls -lrt
total 0
# ls -lart
total 8
drwxr-xr-x 19 root root 4096 Jul 15 11:12 ..
drwxr-xr-x 2 mongod mongod 4096 Jul 15 12:10 .
# ps aux | grep mongod
root 25865 0.0 0.0 103296 884 pts/15 S+ 13:04 0:00 grep mongod
# service mongod start
Starting mongod: about to fork child process, waiting until server is ready for connections.
forked process: 25935
all output going to: /var/log/mongo/mongod.log
ERROR: child process failed, exited with error number 100
[FAILED]
I realize that this is an old question, but I feel I should set the record straight.
MongoDB, or any DB, or any application for that matter, can run against a Lustre file system without issues. However, by default, Lustre clients do not mount with user_xattr or flock enabled.
Having set -o flock or even -o localflock while mounting the file system would have resolved the issue.
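For example, a client mount along these lines (the MGS NID, file system name, and mount point are placeholders) would enable the locking mongod expects:
# flock: cluster-wide coherent file locking (what mongod's lock file needs)
mount -t lustre -o flock mgs-node@tcp0:/fsname /mnt/lustre

# localflock: locking consistent only within this single client
mount -t lustre -o localflock mgs-node@tcp0:/fsname /mnt/lustre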