Cap deploy creates duplicate unicorns - capistrano

I have the following tasks in my deploy.rb
namespace :unicorn do
desc "stop unicorn"
task :stop, :roles => :app, :except => { :no_release => true } do
run "#{try_sudo} kill `cat #{unicorn_pid}`"
end
desc "start unicorn"
task :start, :roles => :app, :except => { :no_release => true } do
run "cd #{current_path} && #{try_sudo} unicorn -c #{current_path}/config/unicorn.rb -E #{rails_env} -D"
end
task :reload, :roles => :app, :except => { :no_release => true } do
run "#{try_sudo} kill -s USR2 `cat #{unicorn_pid}`"
end
after "deploy:restart", "unicorn:reload"
end
When I run unicorn:start or unicorn:reload tasks from my development machine everything looks fine on the server:
$ ps aux | grep unicorn
myuser 8196 77.9 12.2 81020 62748 ? Sl 19:18 0:14 unicorn master -c /home/myuser/www/myapp/current/config/unicorn.rb -E production -D
myuser 8216 0.0 11.5 81020 59232 ? Sl 19:18 0:00 unicorn worker[0] -c /home/myuser/www/myapp/current/config/unicorn.rb -E production -D
However when I run a full-on cap deploy I get multiple instances of the unicorn server, which confuses the hell out of nginx.
$ ps aux | grep unicorn
myuser 8196 4.4 12.2 81020 62764 ? Sl 19:18 0:14 unicorn master (old) -c /home/myuser/www/myapp/current/config/unicorn.rb -E production -D
myuser 8216 1.1 13.2 87868 67764 ? Sl 19:18 0:03 unicorn worker[0] -c /home/myuser/www/myapp/current/config/unicorn.rb -E production -D
myuser 8362 5.8 12.8 83448 65408 ? Sl 19:19 0:16 unicorn master -c /home/myuser/www/myapp/current/config/unicorn.rb -E production -D
myuser 8385 0.0 12.1 83712 61980 ? Sl 19:19 0:00 unicorn worker[0] -c /home/myuser/www/myapp/current/config/unicorn.rb -E production -D
I have no idea why unicorn:reload is spinning up these duplicate instances on deploy. Apparently it's not stopping the previous master/worker. I have to run the unicorn:stop task twice, then unicorn:start again, to rectify the problem.
Anyone else run into this? I've been poking at it for hours without any luck.

So it looks like the issue was a faulty unicorn install. I nuked my gems and rebundled, and now everything is sweet. The unicorn version is the same, so it's still a bit of a mystery, but at least it's working now.
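For what it's worth, the "master (old)" entry in the ps output above is how unicorn labels the previous master after it receives USR2: a new master is started alongside it, and the old one keeps running until it is sent QUIT. Below is a minimal sketch of that second step. It assumes unicorn's documented behaviour of renaming the old master's pid file with an .oldbin suffix, and quit_old_master is a hypothetical helper name, not part of unicorn or Capistrano.

```shell
#!/bin/sh
# Hypothetical helper: gracefully stop the old unicorn master left behind by USR2.
# Assumption: the old master's pid file is "<pid_file>.oldbin".
quit_old_master() {
  old_pid_file="$1.oldbin"
  if [ -s "$old_pid_file" ]; then
    # QUIT asks the old master to shut down gracefully.
    kill -s QUIT "$(cat "$old_pid_file")"
  else
    echo "no old master pid file at $old_pid_file" >&2
    return 1
  fi
}
```

Something like `quit_old_master /path/to/unicorn.pid` could be run once the new master is confirmed to be up. Many setups do the equivalent from a before_fork hook in unicorn.rb instead.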

run conditional migration sh script via docker compose yml

Been googling around for this but I can't get the exact answer.
I am building a mobile app and I want to run additional migration scripts when the environment is "local".
I have a docker-compose-local.yml which builds the db
database:
build:
context: ./src/Database/
dockerfile: Dockerfile
container_name: database
ports:
- 1401:1433
environment:
ACCEPT_EULA: 'Y'
SA_PASSWORD: 'password'
ENVIRONMENT: 'local'
networks:
- my-network
and then a Dockerfile with an entrypoint
ENTRYPOINT ["/usr/src/app/entry-point.sh"]
And then a script that runs migrations.
#!/bin/bash
# wait for MSSQL server to start
export STATUS=1
i=0
MIGRATIONS=$(ls migrations/*.sql | sort -V)
SEEDS=$(ls seed/*.sql)
while [[ $STATUS -ne 0 ]] && [[ $i -lt 30 ]]; do
i=$((i + 1))
/opt/mssql-tools/bin/sqlcmd -t 1 -U sa -P $SA_PASSWORD -Q "select 1" >> /dev/null
STATUS=$?
done
if [ $STATUS -ne 0 ]; then
echo "Error: MSSQL SERVER took more than thirty seconds to start up."
exit 1
fi
echo "======= MSSQL SERVER STARTED ========" | tee -a ./config.log
# Run the setup script to create the DB and the schema in the DB
/opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P $SA_PASSWORD -d master -i create-database.sql
/opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P $SA_PASSWORD -d master -i create-database-user.sql
for f in $MIGRATIONS
do
echo "Processing migration $f"
/opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P $SA_PASSWORD -d master -i $f
done
# RUN THIS ONLY FOR ENVIRONMENT = local
for s in $SEEDS
do
echo "Seeding $s"
/opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P $SA_PASSWORD -d master -i $s
done
Currently everything works perfectly fine, except the seeds are added for all environments.
I only want to run the seed scripts if environment = local.
How can this condition be written into this script?
Alternatively, is there a cleaner way to do this?
Thanks
There are multiple ways to achieve your goal. Three that come to mind quickly are:
1. Insert and check the environment variable in the script (what you are trying to do now).
2. Have two versions of the script in the container and change the entrypoint in the docker-compose file (either with environment variables or by using multiple compose files).
3. Build two different versions of the Docker image for the local and production environments.
With your current setup, the first alternative is the easiest:
# RUN THIS ONLY FOR ENVIRONMENT = local
if [[ "$ENVIRONMENT" == "local" ]]; then
for s in $SEEDS
do
echo "Seeding $s"
/opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P $SA_PASSWORD -d master -i $s
done
fi
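A slightly more defensive variant of the same check, as a sketch: it treats an unset or empty ENVIRONMENT as non-local, so the seeds are skipped unless the compose file opts in. The function name is_local_env is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only when ENVIRONMENT is exactly "local".
# An unset or empty ENVIRONMENT counts as non-local.
is_local_env() {
  [ "${ENVIRONMENT:-}" = "local" ]
}

if is_local_env; then
  # The seed loop from the entry-point script would go here.
  echo "Seeding enabled"
else
  echo "Seeding skipped"
fi
```

Wrapping the test in a function keeps the condition in one place if other local-only steps are added later.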

How to run a process in daemon mode with systemd service?

I've googled and read quite a few blogs, posts, etc. on this. I've also been trying them out manually on my EC2 instance. However, I'm still not able to configure the systemd service unit so that it runs the process in the background as I expect. The process I'm running is the Nessus service. Here's my service unit definition:
$ cat /etc/systemd/system/nessusagent.service
[Unit]
Description=Nessus
[Service]
ExecStart=/opt/myorg/bin/init_nessus
Type=simple
[Install]
WantedBy=multi-user.target
and here is my script /opt/myorg/bin/init_nessus:
$ cat /opt/myorg/bin/init_nessus
#!/usr/bin/env bash
set -e
NESSUS_MANAGER_HOST=...
NESSUS_MANAGER_PORT=...
NESSUS_CLIENT_GROUP=...
NESSUS_LINKING_KEY=...
#-------------------------------------------------------------------------------
# link nessus agent with manager host
#-------------------------------------------------------------------------------
/opt/nessus_agent/sbin/nessuscli agent link --key=${NESSUS_LINKING_KEY} --host=${NESSUS_MANAGER_HOST} --port=${NESSUS_MANAGER_PORT} --groups=${NESSUS_CLIENT_GROUP}
if [ $? -ne 0 ]; then
echo "Cannot link the agent to the Nessus manager, quitting."
exit 1
fi
/opt/nessus_agent/sbin/nessus-service -q -D
When I run the service, I always get the following:
$ systemctl status nessusagent.service
● nessusagent.service - Nessus
Loaded: loaded (/etc/systemd/system/nessusagent.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Mon 2020-08-24 06:40:40 UTC; 9min ago
Process: 27787 ExecStart=/opt/myorg/bin/init_nessus (code=exited, status=0/SUCCESS)
Main PID: 27787 (code=exited, status=0/SUCCESS)
...
Aug 24 06:40:40 ip-10-27-0-104 init_nessus[27787]: + /opt/nessus_agent/sbin/nessuscli agent link --key=... --host=... --port=8834 --groups=...
Aug 24 06:40:40 ip-10-27-0-104 init_nessus[27787]: [info] [agent] HostTag::getUnix: setting TAG value to '8596420322084e3ab97d3c39e5c92e00'
Aug 24 06:40:40 ip-10-27-0-104 init_nessus[27787]: [info] [agent] Successfully linked to <myorg.com>:8834
Aug 24 06:40:40 ip-10-27-0-104 init_nessus[27787]: + '[' 0 -ne 0 ']'
Aug 24 06:40:40 ip-10-27-0-104 init_nessus[28506]: + /opt/nessus_agent/sbin/nessus-service -q -D
However, I can't see the process that I expect to see:
$ ps faux | grep nessus
root 28565 0.0 0.0 12940 936 pts/0 S+ 06:54 0:00 \_ grep --color=auto nessus
If I run the last command manually, I can see it:
$ /opt/nessus_agent/sbin/nessus-service -q -D
$ ps faux | grep nessus
root 28959 0.0 0.0 12940 1016 pts/0 S+ 07:00 0:00 \_ grep --color=auto nessus
root 28952 0.0 0.0 6536 116 ? S 07:00 0:00 /opt/nessus_agent/sbin/nessus-service -q -D
root 28953 0.2 0.0 69440 9996 pts/0 Sl 07:00 0:00 \_ nessusd -q
What is it that I'm missing here?
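I eventually figured out that this was caused by the extra -D option in the last command. Removing -D fixed the issue. Running the process in daemon mode under a service manager is not the way to go: with Type=simple, systemd treats the process started by ExecStart as the service itself, so when the script exits after nessus-service forks into the background, the unit is considered finished and, by default, any processes left in it are cleaned up. The process needs to run in the foreground so that systemd can supervise it.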
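As a sketch of what running in the foreground can look like at the end of the wrapper script: run_foreground below is a made-up helper name, and the only change to the original command is dropping -D. Using exec replaces the shell with the service process, so the PID that systemd started becomes the PID of the long-running program.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace the current shell with the given command,
# so no wrapper process sits between the service manager and the program.
run_foreground() {
  exec "$@"
}

# At the end of init_nessus this would be called as (note: no -D):
#   run_foreground /opt/nessus_agent/sbin/nessus-service -q
```

A plain `exec /opt/nessus_agent/sbin/nessus-service -q` as the last line of the script does the same thing without the helper.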

Postgres No such interface 'org.freedesktop.DBus.Properties'

My Postgres database crashed after a restart. I have tried just about everything, including reinstalling Postgres, but it will not start on Ubuntu 14.04:
$ systemctl status postgresql@9.6-main.service
Failed to issue method call: No such interface 'org.freedesktop.DBus.Properties' on object at path /org/freedesktop/systemd1/unit/postgresql_409_2e6_2dmain_2eservice
$ pg_lsclusters
Ver Cluster Port Status Owner Data directory Log file
9.6 main 5432 down postgres /var/lib/postgresql/9.6/main /var/log/postgresql/postgresql-9.6-main.log
$ sudo service postgresql start
* Starting PostgreSQL 9.6 database server
* Failed to issue method call: Unit postgresql@9.6-main.service failed to
load: No such file or directory. See system logs and 'systemctl status
postgresql@9.6-main.service' for details.
$ ps uxa|grep dbus-daemon
message+ 751 0.0 0.0 40812 4064 ? Ss 18:39 0:03 dbus-daemon --system --fork
dominic 3058 0.0 0.0 40840 4252 ? Ss 18:40 0:02 dbus-daemon --fork --session --address=unix:abstract=/tmp/dbus-S1LhlCDwl2
dominic 3145 0.0 0.0 39400 3536 ? S 18:40 0:00 /bin/dbus-daemon --config-file=/etc/at-spi2/accessibility.conf --nofork --print-address 3
dominic 17462 0.0 0.0 15956 2244 pts/4 S+ 21:45 0:00 grep --color=auto dbus-daemon
Postgres log file is empty.
I had the same error after installing snap on Ubuntu 14.04. It installed some parts of systemd and broke the PostgreSQL init script.
You need to add the --skip-systemctl-redirect parameter to the pg_ctlcluster calls in the file /usr/share/postgresql-common/init.d-functions
The function you need to change:
do_ctl_all() {
...
# --skip-systemctl-redirect fix postgresql No such interface 'org.freedesktop.DBus.Properties'
if [ "$1" = "stop" ] || [ "$1" = "restart" ]; then
ERRMSG=$(pg_ctlcluster --skip-systemctl-redirect --force "$2" "$name" $1 2>&1)
else
ERRMSG=$(pg_ctlcluster --skip-systemctl-redirect "$2" "$name" $1 2>&1)
fi
...
}
Ubuntu 14.04 had not yet switched to systemd. I highly recommend upgrading to 16.04 or, even better, 18.04.

Celery doesn't restart subprocesses

I have an issue with my celery deployment: when I restart it, the old subprocesses don't stop and continue to process some of the jobs. I use supervisord to run celery. Here is my config:
$ cat /etc/supervisor/conf.d/celery.conf
[program:celery]
; Full path to use virtualenv, honcho to load .env
command=/home/ubuntu/venv/bin/honcho run celery -A stargeo worker -l info --no-color
directory=/home/ubuntu/app
environment=PATH="/home/ubuntu/venv/bin:%(ENV_PATH)s"
user=ubuntu
numprocs=1
stdout_logfile=/home/ubuntu/logs/celery.log
stderr_logfile=/home/ubuntu/logs/celery.err
autostart=true
autorestart=true
startsecs=10
; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600
; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true
; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=998
Here is how celery processes look:
$ ps axwu | grep celery
ubuntu 983 0.0 0.1 47692 10064 ? S 11:47 0:00 /home/ubuntu/venv/bin/python /home/ubuntu/venv/bin/honcho run celery -A stargeo worker -l info --no-color
ubuntu 984 0.0 0.0 4440 652 ? S 11:47 0:00 /bin/sh -c celery -A stargeo worker -l info --no-color
ubuntu 985 0.0 0.5 168720 41356 ? S 11:47 0:01 /home/ubuntu/venv/bin/python /home/ubuntu/venv/bin/celery -A stargeo worker -l info --no-color
ubuntu 990 0.0 0.4 167936 36648 ? S 11:47 0:00 /home/ubuntu/venv/bin/python /home/ubuntu/venv/bin/celery -A stargeo worker -l info --no-color
ubuntu 991 0.0 0.4 167936 36648 ? S 11:47 0:00 /home/ubuntu/venv/bin/python /home/ubuntu/venv/bin/celery -A stargeo worker -l info --no-color
When I run sudo supervisorctl restart celery it only stops the first process (the python ... honcho one) and all the other ones keep running. If I try to kill them they keep running as well (only kill -9 works).
This appeared to be a bug in honcho. I ended up with the workaround of starting this script from supervisor instead:
#!/bin/bash
source /home/ubuntu/venv/bin/activate
exec env $(cat .env | grep -v ^# | xargs) \
celery -A stargeo worker -l info --no-color
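To make the `env $(...)` line in the workaround less opaque, here is a small self-contained demonstration. It uses a temporary file in place of the real .env, and the variable names FOO and BAZ are placeholders. The approach assumes values without spaces or quotes, since the output is split on whitespace.

```shell
#!/bin/bash
# Demonstration only: a temporary file stands in for the real .env.
env_file=$(mktemp)
printf '%s\n' '# a comment line' 'FOO=bar' 'BAZ=qux' > "$env_file"

# grep drops comment lines, xargs joins the rest onto one line,
# and env sets those NAME=value pairs for the command that follows.
result=$(env $(grep -v '^#' "$env_file" | xargs) sh -c 'echo "$FOO $BAZ"')
echo "$result"   # prints: bar qux

rm -f "$env_file"
```

In the workaround script the command after env is celery itself, and because of exec it replaces the bash process, so signals from supervisord reach celery directly rather than a wrapper.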

memcached restart starts a new memcached and doesn't kill the old one

I'm running my rails app in production mode and in staging mode on the same server, in different folders. They both use memcache-client which requires memcached to be running.
As yet I haven't set up a deploy script, so I just deploy manually by sshing onto the server, going to the appropriate directory, updating the code, restarting memcached and then restarting unicorn (the processes which actually run the rails app). I restart memcached thus:
sudo /etc/init.d/memcached restart &
This starts a new memcached, but it doesn't kill the old one. Check it out:
ip-<an-ip>:test.millionaire[subjects]$ ps afx | grep memcache
11176 pts/2 S+ 0:00 | \_ grep --color=auto memcache
10939 pts/3 R 8:13 \_ sudo /etc/init.d/memcached restart
7453 ? Sl 0:00 /usr/bin/memcached -m 64 -p 11211 -u nobody -l 127.0.0.1
ip-<an-ip>:test.millionaire[subjects]$ sudo /etc/init.d/memcached restart &
[1] 11187
ip-<an-ip>:test.millionaire[subjects]$ ps afx | grep memcache
11187 pts/2 T 0:00 | \_ sudo /etc/init.d/memcached restart
11199 pts/2 S+ 0:00 | \_ grep --color=auto memcache
10939 pts/3 R 8:36 \_ sudo /etc/init.d/memcached restart
7453 ? Sl 0:00 /usr/bin/memcached -m 64 -p 11211 -u nobody -l 127.0.0.1
[1]+ Stopped sudo /etc/init.d/memcached restart
ip-<an-ip>:test.millionaire[subjects]$ sudo /etc/init.d/memcached restart &
[2] 11208
ip-<an-ip>:test.millionaire[subjects]$ ps afx | grep memcache
11187 pts/2 T 0:00 | \_ sudo /etc/init.d/memcached restart
11208 pts/2 R 0:01 | \_ sudo /etc/init.d/memcached restart
11218 pts/2 S+ 0:00 | \_ grep --color=auto memcache
10939 pts/3 R 8:42 \_ sudo /etc/init.d/memcached restart
7453 ? Sl 0:00 /usr/bin/memcached -m 64 -p 11211 -u nobody -l 127.0.0.1
What might be causing it is that there's another memcached running (see the bottom line). I'm mystified as to where this is from, and my instinct is to kill it, but I thought I'd better check with someone who actually knows more about memcached than I do.
Grateful for any advice - max
EDIT - solution
I figured this out after a bit of detective work with a colleague. In the rails console I typed CACHE.stats, which prints out a hash of values including "pid". I could see that it was set to the instance of memcached which wasn't started with memcached restart, i.e. this process:
7453 ? Sl 0:00 /usr/bin/memcached -m 64 -p 11211 -u nobody -l 127.0.0.1
The memcached control script (i.e. the one that defines the start, stop and restart commands) is /etc/init.d/memcached.
A line in it says:
# Edit /etc/default/memcached to change this.
ENABLE_MEMCACHED=no
So I looked in /etc/default/memcached, which was also set to ENABLE_MEMCACHED=no.
This was basically preventing memcached from being stopped and started. I changed it to ENABLE_MEMCACHED=yes, and then it would stop and start fine. Now when I stop and start memcached, it's the above process, the in-use memcached, that's stopped and started.
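A quick way to check the flag before restarting, as a sketch. The memcached_enabled function is hypothetical, and it assumes the defaults file contains a plain ENABLE_MEMCACHED=yes or ENABLE_MEMCACHED=no line as described above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the defaults file enables memcached.
# The path defaults to /etc/default/memcached but can be overridden.
memcached_enabled() {
  defaults_file="${1:-/etc/default/memcached}"
  grep -q '^ENABLE_MEMCACHED=yes' "$defaults_file" 2>/dev/null
}

if memcached_enabled; then
  echo "ENABLE_MEMCACHED=yes: the init script should manage memcached"
else
  echo "ENABLE_MEMCACHED is not yes: stop/start may be skipped"
fi
```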
Try using:
killall memcached