rkt "discovery failed" error while bringing flannel up (Kubernetes setup)

We are trying to set up a Kubernetes cluster on 3 nodes with CoreOS, following the official step-by-step documentation - https://coreos.com/kubernetes/docs/latest/deploy-master.html
The servers are behind a company proxy and have a proxy drop-in defined in both
/etc/systemd/system/docker.service.d
/etc/systemd/system/flanneld.service.d
The following is picked up by
systemctl cat flanneld
# /usr/lib/systemd/system/flanneld.service
[Unit]
Description=flannel - Network fabric for containers (System Application Container)
Documentation=https://github.com/coreos/flannel
After=etcd.service etcd2.service etcd-member.service
Before=docker.service flannel-docker-opts.service
Requires=flannel-docker-opts.service
[Service]
Type=notify
Restart=always
RestartSec=10s
LimitNOFILE=40000
LimitNPROC=1048576
Environment="FLANNEL_IMAGE_TAG=v0.6.2"
Environment="FLANNEL_OPTS=--ip-masq=true"
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/lib/coreos/flannel-wrapper.uuid"
EnvironmentFile=-/run/flannel/options.env
ExecStartPre=/sbin/modprobe ip_tables
ExecStartPre=/usr/bin/mkdir --parents /var/lib/coreos /run/flannel
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/lib/coreos/flannel-wrapper.uuid
ExecStart=/usr/lib/coreos/flannel-wrapper $FLANNEL_OPTS
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/lib/coreos/flannel-wrapper.uuid
[Install]
WantedBy=multi-user.target
# /etc/systemd/system/flanneld.service.d/40-ExecStartPre-symlink.conf
[Service]
ExecStartPre=/usr/bin/ln -sf /etc/flannel/options.env /run/flannel/options.env
# /etc/systemd/system/flanneld.service.d/proxy.conf
[Service]
Environment="HTTP_PROXY=http://10.140.65.114:8080/"
Environment="HTTPS_PROXY=http://10.140.65.114:8080/"
and
systemctl cat docker
# /usr/lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.com
After=containerd.service docker.socket early-docker.target network.target
Requires=containerd.service docker.socket early-docker.target
[Service]
Type=notify
EnvironmentFile=-/run/flannel/flannel_docker_opts.env
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/lib/coreos/dockerd --host=fd:// --containerd=/var/run/docker/libcontainerd/docker-containerd.sock $DOCKER_OPTS $DOCKER_CGROUPS $DOCKER_OPT_BIP $DOCKER_OP
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=1048576
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
[Install]
WantedBy=multi-user.target
# /etc/systemd/system/docker.service.d/40-flannel.conf
[Unit]
Requires=flanneld.service
After=flanneld.service
[Service]
EnvironmentFile=/etc/kubernetes/cni/docker_opts_cni.env
# /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://10.140.65.114:8080/"
Environment="HTTPS_PROXY=http://10.140.65.114:8080/"
# /etc/systemd/system/flanneld.service.d/40-ExecStartPre-symlink.conf
[Service]
ExecStartPre=/usr/bin/ln -sf /etc/flannel/options.env /run/flannel/options.env
# /etc/systemd/system/flanneld.service.d/proxy.conf
[Service]
Environment="HTTP_PROXY=http://10.140.65.114:8080/"
Environment="HTTPS_PROXY=http://10.140.65.114:8080/"
After running systemctl daemon-reload and systemctl start flanneld, we get the following error:
Feb 16 19:50:40 localhost systemd[1]: Starting flannel - Network fabric for containers (System Application Container)...
-- Subject: Unit flanneld.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit flanneld.service has begun starting up.
Feb 16 19:50:40 localhost rkt[52933]: rm: cannot get pod: no matches found for "26778eb4-9d8a-4d3c-9bb7-6ffb13a55d6a"
Feb 16 19:50:40 localhost rkt[52933]: rm: failed to remove one or more pods
Feb 16 19:50:40 localhost flannel-wrapper[52947]: + exec /usr/bin/rkt run --uuid-file-save=/var/lib/coreos/flannel-wrapper.uuid --trust-keys-from-https --mount volume=notify,target=/run/systemd/notify --volume notify,kind=host,source=/run/systemd/notify --set-env=NOTIFY_SOCKET=/run/systemd/notify --net=host --volume run-flannel,kind=host,source=/run/flannel,readOnly=false --volume etc-ssl-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true --volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true --volume etc-hosts,kind=host,source=/etc/hosts,readOnly=true --volume etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true --mount volume=run-flannel,target=/run/flannel --mount volume=etc-ssl-certs,target=/etc/ssl/certs --mount volume=usr-share-certs,target=/usr/share/ca-certificates --mount volume=etc-hosts,target=/etc/hosts --mount volume=etc-resolv,target=/etc/resolv.conf --inherit-env --stage1-from-dir=stage1-fly.aci quay.io/coreos/flannel:v0.6.2 -- --ip-masq=true
Feb 16 19:50:41 localhost sudo[52978]: admin : TTY=pts/1 ; PWD=/home/admin ; USER=root ; COMMAND=/bin/journalctl -e -u kubelet
Feb 16 19:50:41 localhost sudo[52978]: pam_unix(sudo:session): session opened for user root by admin(uid=0)
Feb 16 19:50:41 localhost sudo[52978]: pam_systemd(sudo:session): Cannot create session: Already running in a session
Feb 16 19:50:41 localhost sudo[52978]: pam_unix(sudo:session): session closed for user root
Feb 16 19:50:42 localhost flannel-wrapper[52947]: image: keys already exist for prefix "quay.io/coreos/flannel", not fetching again
Feb 16 19:50:43 localhost sudo[52990]: admin : TTY=pts/1 ; PWD=/home/admin ; USER=root ; COMMAND=/bin/journalctl -e -u kubelet
Feb 16 19:50:43 localhost sudo[52990]: pam_unix(sudo:session): session opened for user root by admin(uid=0)
Feb 16 19:50:43 localhost sudo[52990]: pam_systemd(sudo:session): Cannot create session: Already running in a session
Feb 16 19:50:43 localhost sudo[52990]: pam_unix(sudo:session): session closed for user root
Feb 16 19:50:44 localhost flannel-wrapper[52947]: Downloading signature: 0 B/473 B
Feb 16 19:50:44 localhost flannel-wrapper[52947]: Downloading signature: 473 B/473 B
Feb 16 19:50:45 localhost flannel-wrapper[52947]: Downloading signature: 473 B/473 B
Feb 16 19:50:45 localhost flannel-wrapper[52947]: run: Get https://quay-registry.s3.amazonaws.com/sharedimages/36acf4f7-a5bd-470b-9a44-13cbd244b571/layer?Signature=v8rQghQZR0k%2B1UxDG8oGw89vTqY%3D&Expires=1487255465&AWSAccessKeyId=AKIAJWZWUIS24TWSMWRA: Blocked site:
Feb 16 19:50:45 localhost systemd[1]: flanneld.service: Main process exited, code=exited, status=254/n/a
Feb 16 19:50:45 localhost rkt[52993]: stop: cannot get pod: no matches found for "26778eb4-9d8a-4d3c-9bb7-6ffb13a55d6a"
Feb 16 19:50:45 localhost rkt[52993]: stop: failed to stop 1 pod(s)
Feb 16 19:50:45 localhost systemd[1]: Failed to start flannel - Network fabric for containers (System Application Container).
-- Subject: Unit flanneld.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit flanneld.service has failed.
--
-- The result is failed.
Feb 16 19:50:45 localhost systemd[1]: flanneld.service: Unit entered failed state.
Feb 16 19:50:45 localhost systemd[1]: flanneld.service: Failed with result 'exit-code'.
Feb 16 19:50:45 localhost systemd[1]: Starting flannel docker export service - Network fabric for containers (System Application Container)...
-- Subject: Unit flannel-docker-opts.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit flannel-docker-opts.service has begun starting up.
Feb 16 19:50:45 localhost sudo[53003]: admin : TTY=pts/1 ; PWD=/home/admin ; USER=root ; COMMAND=/bin/journalctl -e -u kubelet
Feb 16 19:50:45 localhost sudo[53003]: pam_unix(sudo:session): session opened for user root by admin(uid=0)
Feb 16 19:50:45 localhost sudo[53003]: pam_systemd(sudo:session): Cannot create session: Already running in a session
Feb 16 19:50:45 localhost sudo[53003]: pam_unix(sudo:session): session closed for user root
Feb 16 19:50:45 localhost rkt[53000]: rm: cannot get pod: UUID cannot be empty
Feb 16 19:50:45 localhost rkt[53000]: rm: failed to remove one or more pods
Feb 16 19:50:45 localhost flannel-wrapper[53019]: + exec /usr/bin/rkt run --uuid-file-save=/var/lib/coreos/flannel-wrapper2.uuid --trust-keys-from-https --net=host --volume run-flannel,kind=host,source=/run/flannel,readOnly=false --volume etc-ssl-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true --volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true --volume etc-hosts,kind=host,source=/etc/hosts,readOnly=true --volume etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true --mount volume=run-flannel,target=/run/flannel --mount volume=etc-ssl-certs,target=/etc/ssl/certs --mount volume=usr-share-certs,target=/usr/share/ca-certificates --mount volume=etc-hosts,target=/etc/hosts --mount volume=etc-resolv,target=/etc/resolv.conf --inherit-env --stage1-from-dir=stage1-fly.aci quay.io/coreos/flannel:v0.6.2 --exec=/opt/bin/mk-docker-opts.sh -- -d /run/flannel/flannel_docker_opts.env -i
Feb 16 19:50:46 localhost flannel-wrapper[53019]: run: discovery failed
Feb 16 19:50:46 localhost systemd[1]: flannel-docker-opts.service: Main process exited, code=exited, status=254/n/a
Feb 16 19:50:46 localhost systemd[1]: Failed to start flannel docker export service - Network fabric for containers (System Application Container).
-- Subject: Unit flannel-docker-opts.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit flannel-docker-opts.service has failed.
--
-- The result is failed.
Feb 16 19:50:46 localhost systemd[1]: flannel-docker-opts.service: Unit entered failed state.
Feb 16 19:50:46 localhost systemd[1]: flannel-docker-opts.service: Failed with result 'exit-code'.
We also tried a different document, https://www.upcloud.com/support/deploy-kubernetes-coreos/; following it, we get the same type of error while starting the kubelet.
It seems to be a problem with rkt reaching the quay registry from behind the company proxy.
Let us know if we missed something or configured something incorrectly.

Can you please try
$ sudo rkt fetch quay.io/coreos/flannel:v0.6.2
first in the shell?
I believe the issue is due to either running the HTTPS proxy over HTTP, or rkt fetch running as an unprivileged user and not inheriting the system environment variables.
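If it is the environment, a quick check (a sketch, reusing the proxy address from your drop-ins, and assuming rkt's fetcher honors the standard proxy variables) is to pass them explicitly on the same command line, so the fetch does not depend on what the shell or systemd exports:
$ sudo HTTP_PROXY=http://10.140.65.114:8080/ HTTPS_PROXY=http://10.140.65.114:8080/ \
    rkt fetch quay.io/coreos/flannel:v0.6.2
If this succeeds while the plain fetch fails, the proxy settings are not reaching rkt; if both fail with the same "Blocked site" message seen in your journal, the proxy itself is refusing quay-registry.s3.amazonaws.com and that S3 endpoint needs to be allowed through it.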

Related

Service starting gunicorn failing with "Start request repeated too quickly"

I'm trying to start a service that runs gunicorn as the backend server for Flask, but it isn't working. nginx, running as the frontend server for React, works fine.
Server:
Virtualization: vmware
Operating System: Red Hat Enterprise Linux 8.4 (Ootpa)
CPE OS Name: cpe:/o:redhat:enterprise_linux:8.4:GA
Kernel: Linux 4.18.0-305.3.1.el8_4.x86_64
Architecture: x86-64
Service file in /etc/systemd/system/myservice.service:
[Unit]
Description="Description"
After=network.target
[Service]
User=root
Group=root
WorkingDirectory=/home/project/app/api
ExecStart=/home/project/app/api/venv/bin/gunicorn -b 127.0.0.1:5000 api:app
Restart=always
[Install]
WantedBy=multi-user.target
/app/api:
-rwxr-xr-x. 1 root root 2018 Jun 9 20:06 api.py
drwxrwxr-x+ 5 root root 100 Jun 7 10:11 venv
Error message:
● myservice.service - "Description"
Loaded: loaded (/etc/systemd/system/myservice.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Thu 2021-06-10 19:01:01 CEST; 5s ago
Process: 18307 ExecStart=/home/project/app/api/venv/bin/gunicorn -b 127.0.0.1:5000 api:app (code=exited, status=203/EXEC)
Main PID: 18307 (code=exited, status=203/EXEC)
Jun 10 19:01:01 xxxx systemd[1]: myservice.service: Service RestartSec=100ms expired, scheduling restart.
Jun 10 19:01:01 xxxx systemd[1]: myservice.service: Scheduled restart job, restart counter is at 5.
Jun 10 19:01:01 xxxx systemd[1]: Stopped "Description".
Jun 10 19:01:01 xxxx systemd[1]: myservice.service: Start request repeated too quickly.
Jun 10 19:01:01 xxxx systemd[1]: myservice.service: Failed with result 'exit-code'.
Jun 10 19:01:01 xxxx systemd[1]: Failed to start "Description".
Tried, not working:
Adding Environment="PATH=/home/project/app/api/venv/bin" under [Service]
$ systemctl reset-failed myservice.service
$ systemctl daemon-reload
Reboot, of course.
Tried, working:
Running (as root) /home/project/app/api/venv/bin/gunicorn -b 127.0.0.1:5000 api:app while in /app/api directory
Does anyone know how to fix this problem?
Typically enough, I figured it out shortly after posting this question.
SELinux is interfering with permissions on the files and directories, so for anyone experiencing the same issue, make sure to test the following changes (as root):
$ setsebool -P httpd_can_network_connect on
$ chcon -Rt httpd_sys_content_t /path/to/your/Flask/dir
In my case: $ chcon -Rt httpd_sys_content_t /home/project/app/api
While this is NOT a permanent fix, it's worth a try. Check out the SELinux docs for more permanent solutions.
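If you want something that survives a relabel, a more permanent variant of the same idea (a sketch, using the same path and type as above; semanage is typically provided by the policycoreutils-python-utils package) is to record the context rule and then apply it:
$ semanage fcontext -a -t httpd_sys_content_t "/home/project/app/api(/.*)?"
$ restorecon -Rv /home/project/app/api
Unlike chcon, the rule added by semanage is stored in the local policy, so restorecon will keep re-applying it rather than undoing it.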

celery daemon using systemd not running

I am using Ubuntu 16 and systemd to run Celery as a daemon. I have created the unit file, but I am not able to run Celery as a service. Why am I getting this error?
/etc/systemd/system/celery.service
[Unit]
Description=Celery Service
After=network.target
[Service]
Type=forking
User=celery
Group=celery
EnvironmentFile=-/etc/default/celery
WorkingDirectory=/srv/weaver/src
ExecStart=/bin/sh -c '${CELERY_BIN} multi start ${CELERYD_NODES} \
-A ${CELERY_APP} --pidfile=${CELERYD_PID_FILE} \
--logfile=${CELERYD_LOG_FILE} --loglevel=${CELERYD_LOG_LEVEL} ${CELERYD_OPTS}'
ExecStop=/bin/sh -c '${CELERY_BIN} multi stopwait ${CELERYD_NODES} \
--pidfile=${CELERYD_PID_FILE}'
ExecReload=/bin/sh -c '${CELERY_BIN} multi restart ${CELERYD_NODES} \
-A ${CELERY_APP} --pidfile=${CELERYD_PID_FILE} \
--logfile=${CELERYD_LOG_FILE} --loglevel=${CELERYD_LOG_LEVEL} ${CELERYD_OPTS}'
[Install]
WantedBy=multi-user.target
file at /etc/default/celery
ENABLED="true"
CELERYD_NODES="worker1"
#CELERYD_NODES="worker1 worker2 worker3"
CELERY_BIN="/usr/local/bin/celery"
CELERY_APP="main:celery_app"
CELERYD_CHDIR="/srv/weaver/src"
CELERYD_OPTS=" --queue=weaver --time-limit=100000 --concurrency=2"
CELERYD_LOG_FILE="/var/log/celery/%N.log"
CELERYD_PID_FILE="/var/run/celery2/%N.pid"
CELERYD_USER="celery"
CELERYD_GROUP="celery"
CELERY_CREATE_DIRS=1
# Change Celery Beat
CELERYBEAT_CHDIR="/srv/weaver/src"
# Log files
CELERYBEAT_LOG_FILE="/var/log/celery/celerybeat.log"
# Celery Beat Log files
CELERYBEAT_PID_FILE="/var/run/celery/celerybeat.pid"
# Scheduler for celery
CELERYBEAT_OPTS=" --pidfile=/var/run/celery/celerybeat.pid --sch
OUTPUT OF RUNNING SERVICE
● celery.service - Celery Service
Loaded: loaded (/etc/systemd/system/celery.service; disabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2017-01-12 17:12:32 IST; 2min 17s ago
Process: 18561 ExecStop=/bin/sh -c ${CELERY_BIN} multi stopwait ${CELERYD_NODES} --pidfile=${CELERYD_PID_FILE} (code=exited, status=0/SUCCESS)
Process: 18540 ExecStart=/bin/sh -c ${CELERY_BIN} multi start ${CELERYD_NODES} -A ${CELERY_APP} --pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FIL
Main PID: 18555 (code=exited, status=1/FAILURE)
Jan 12 17:12:30 fb01 systemd[1]: Starting Celery Service...
Jan 12 17:12:31 fb01 sh[18540]: celery multi v3.1.23 (Cipater)
Jan 12 17:12:31 fb01 sh[18540]: > Starting nodes...
Jan 12 17:12:31 fb01 sh[18540]: > worker1@fb01: OK
Jan 12 17:12:31 fb01 systemd[1]: Started Celery Service.
Jan 12 17:12:32 fb01 systemd[1]: celery.service: Main process exited, code=exited, status=1/FAILURE
Jan 12 17:12:32 fb01 sh[18561]: celery multi v3.1.23 (Cipater)
Jan 12 17:12:32 fb01 sh[18561]: > worker1@fb01: DOWN
Jan 12 17:12:32 fb01 systemd[1]: celery.service: Unit entered failed state.
Jan 12 17:12:32 fb01 systemd[1]: celery.service: Failed with result 'exit-code'.
I just hit this exact issue. My problem was a configuration issue. In particular, I wasn't setting
CELERYD_LOG_LEVEL
in my environment file (/etc/default/celery in your case). It looks like you have made the same mistake.
(I also had a few other configuration issues that I needed to resolve. I discovered these by running celery on the commandline.)
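For example, the missing line in the environment file would look something like this (a sketch; pick whichever level you prefer):
CELERYD_LOG_LEVEL="INFO"
With that in place, the ${CELERYD_LOG_LEVEL} reference in ExecStart expands to a real value instead of an empty string.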
I had this same symptom; it turned out to be permissions.
In my case something like:
chmod o+x /srv/weaver/src
sorted it.
Note that this is not the best way to grant the required permission, but that's not pertinent to this answer.
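If you want to check for this kind of problem before changing anything, namei from util-linux prints the permissions of every component on the way down, so a missing execute bit anywhere in the path stands out:
$ namei -l /srv/weaver/src
The celery user needs at least execute (x) permission on every parent directory in order to reach the working directory and the log/pid locations.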

Cannot create systemd script for Puma

I created a service script named "puma.service" in /etc/systemd/system/ with the following contents:
[Unit]
Description=Puma HTTP Server
After=network.target
[Service]
Type=simple
User=ubuntu
WorkingDirectory=/home/username/appdir/current
ExecStart=/bin/bash -lc "/home/username/appdir/current/sbin/puma -C /home/username/appdir/current/config/puma.rb /home/username/appdir/current/config.ru"
Restart=always
[Install]
WantedBy=multi-user.target
I enabled the service, and when it is started I get the following log from systemctl:
● puma.service - Puma HTTP Server
Loaded: loaded (/etc/systemd/system/puma.service; enabled; vendor preset: enabled)
Active: inactive (dead) (Result: exit-code) since Wed 2016-12-14 10:09:46 UTC; 12min ago
Process: 16889 ExecStart=/bin/bash -lc cd /home/username/appdir/current && bundle exec puma -C /home/username/appdir..
Main PID: 16889 (code=exited, status=127)
Dec 14 10:09:46 ip-172-31-29-40 systemd[1]: puma.service: Main process exited, code=exited, status=127/n/a
Dec 14 10:09:46 ip-172-31-29-40 systemd[1]: puma.service: Unit entered failed state.
Dec 14 10:09:46 ip-172-31-29-40 systemd[1]: puma.service: Failed with result 'exit-code'.
Dec 14 10:09:46 ip-172-31-29-40 systemd[1]: puma.service: Service hold-off time over, scheduling restart.
Dec 14 10:09:46 ip-172-31-29-40 systemd[1]: Stopped Puma HTTP Server.
Dec 14 10:09:46 ip-172-31-29-40 systemd[1]: puma.service: Start request repeated too quickly.
Dec 14 10:09:46 ip-172-31-29-40 systemd[1]: Failed to start Puma HTTP Server.
However, when I run the command in an SSH terminal, the server starts and runs perfectly. Are there any changes I have to make in the service file?
Note:
I have changed the dirnames for your convenience.
I did some research, and status 127 is caused by the executable not being in the path. But I don't think that is the problem here.
Can you shed some light?
I found the problem and changed the ExecStart as mentioned below and it worked like a charm:
ExecStart=/home/username/.rbenv/shims/bundle exec puma -e production -C ./config/puma.rb config.ru
PIDFile=/home/username/appdir/shared/tmp/pids/puma.pid
bundle should be taken from the rbenv shims, and Puma's config file (config/puma.rb) and the application's config file (config.ru) can be given as relative paths.
One way to solve it is to specify a PID file; systemd will then look at that file to check on the service status.
Here's how we use this in our scripts (adapted to your given sample)
ExecStart=/bin/bash -lc '/home/username/appdir/current/sbin/puma -C /home/username/appdir/current/config/puma.rb /home/username/appdir/current/config.ru --pidfile /home/username/appdir/current/tmp/pids/puma.pid'
PIDFile=/home/username/appdir/current/tmp/pids/puma.pid
Take note that you might have to configure --pidfile via your -C puma.rb file instead of passing it as a parameter. I'm just showing it here to illustrate that --pidfile (in the Puma config) should be the same as PIDFile in the service file.
As for why the error message is that way, I'm not sure myself and am interested in the answer too.
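To illustrate the point about moving --pidfile into the Puma config: the equivalent line in config/puma.rb would be roughly the following (a sketch, matching the PIDFile path used in the unit above):
pidfile "/home/username/appdir/current/tmp/pids/puma.pid"
Either way, the value has to match PIDFile= in the service file so that systemd tracks the right process.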
For rvm users try with
[Unit]
Description=Puma HTTP Server
After=network.target
[Service]
Type=simple
User=user_name
Group=user_name
WorkingDirectory=/home/user_name/apps/app_name/current
Environment=RAILS_ENV=production
ExecStart=/home/user_name/.rvm/bin/rvm ruby-3.1.2@app_name do bundle exec --keep-file-descriptors puma -C /home/user_name/apps/app_name/current/config/puma/production.rb
ExecReload=/bin/kill -USR1 $MAINPID
StandardOutput=append:/home/user_name/apps/app_name/current/log/puma_access.log
StandardError=append:/home/user_name/apps/app_name/current/log/puma_error.log
SyslogIdentifier=app_name-puma
Restart=always
KillMode=process
[Install]
WantedBy=multi-user.target
If you are not using a gemset, change
ExecStart=/home/user_name/.rvm/bin/rvm ruby-3.1.2@app_name ....
to
ExecStart=/home/user_name/.rvm/bin/rvm default

postmaster.pid permission denied CentOS 7

Postgres 9.2 on CentOS 7.
After "su - postgres" I installed using
pg-ctl initdb -D /var/lib/pgsql/data
which ran fine.
[root@server ~]# systemctl start postgresql
Job for postgresql.service failed. See 'systemctl status postgresql.service' and 'journalctl -xn' for details.
[root@server ~]# systemctl status postgresql.service
postgresql.service - PostgreSQL database server
Loaded: loaded (/usr/lib/systemd/system/postgresql.service; disabled)
Active: failed (Result: exit-code) since Fri 2015-11-27 13:48:57 EST; 9s ago
Process: 3262 ExecStart=/usr/bin/pg_ctl start -D ${PGDATA} -s -o -p ${PGPORT} -w -t 300 (code=exited, status=1/FAILURE)
Process: 3256 ExecStartPre=/usr/bin/postgresql-check-db-dir ${PGDATA} (code=exited, status=0/SUCCESS)
Nov 27 13:48:57 server.company.network systemd[1]: Starting PostgreSQL database server...
Nov 27 13:48:57 server.company.network pg_ctl[3262]: pg_ctl: could not open PID file "/var/lib/pgsql/data/postmaster.pid": Permission denied
Nov 27 13:48:57 server.company.network systemd[1]: postgresql.service: control process exited, code=exited status=1
Nov 27 13:48:57 server.company.network systemd[1]: Failed to start PostgreSQL database server.
Nov 27 13:48:57 server.company.network systemd[1]: Unit postgresql.service entered failed state.
[root@server ~]# journalctl -xn
-- Logs begin at Fri 2015-11-27 13:29:37 EST, end at Fri 2015-11-27 13:48:57 EST. --
Nov 27 13:48:35 server.company.network sudo[3228]: pam_unix(sudo:auth): conversation failed
Nov 27 13:48:35 server.company.network sudo[3228]: pam_unix(sudo:auth): auth could not identify password for [myuserid]
Nov 27 13:48:46 server.company.network sudo[3230]: myuserid : TTY=pts/0 ; PWD=/home/myuserid ; USER=root ; COMMAND=/bin/su -
Nov 27 13:48:46 server.company.network su[3234]: (to root) myuserid on pts/0
Nov 27 13:48:46 server.company.network su[3234]: pam_unix(su-l:session): session opened for user root by myuserid(uid=0)
Nov 27 13:48:57 server.company.network systemd[1]: Starting PostgreSQL database server...
-- Subject: Unit postgresql.service has begun with start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit postgresql.service has begun starting up.
Nov 27 13:48:57 server.company.network pg_ctl[3262]: pg_ctl: could not open PID file "/var/lib/pgsql/data/postmaster.pid": Permission denied
Nov 27 13:48:57 server.company.network systemd[1]: postgresql.service: control process exited, code=exited status=1
Nov 27 13:48:57 server.company.network systemd[1]: Failed to start PostgreSQL database server.
-- Subject: Unit postgresql.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit postgresql.service has failed.
--
-- The result is failed.
Nov 27 13:48:57 server.company.network systemd[1]: Unit postgresql.service entered failed state.
When I "su - postgres" I can "touch" the file, "ls" the file, "rm" /var/lib/pgsql/data/postmaster.pid. Permissions on data are 700 postgres:postgres. pgsql is a symlink to /data0/postgres and postgres is 700 postgres:postgres.
ADDITIONS:
I forgot to mention that after having this problem, I replaced the commands for ExecStartPre and ExecStart with shell scripts that wrote the user, primary group, PGDATA, and PGPORT values to a file. They were all correct. The start still died on postmaster.pid.
The postgresql.service file:
[root@server /]# cat /usr/lib/systemd/system/postgresql.service
# It's not recommended to modify this file in-place, because it will be
# overwritten during package upgrades. If you want to customize, the
# best way is to create a file "/etc/systemd/system/postgresql.service",
# containing
# .include /lib/systemd/system/postgresql.service
# ...make your changes here...
# For more info about custom unit files, see
# http://fedoraproject.org/wiki/Systemd#How_do_I_customize_a_unit_file.2F_add_a_custom_unit_file.3F
# For example, if you want to change the server's port number to 5433,
# create a file named "/etc/systemd/system/postgresql.service" containing:
# .include /lib/systemd/system/postgresql.service
# [Service]
# Environment=PGPORT=5433
# This will override the setting appearing below.
# Note: changing PGPORT or PGDATA will typically require adjusting SELinux
# configuration as well; see /usr/share/doc/postgresql-*/README.rpm-dist.
# Note: do not use a PGDATA pathname containing spaces, or you will
# break postgresql-setup.
# Note: in F-17 and beyond, /usr/lib/... is recommended in the .include line
# though /lib/... will still work.
[Unit]
Description=PostgreSQL database server
After=network.target
[Service]
Type=forking
User=postgres
Group=postgres
# Port number for server to listen on
Environment=PGPORT=5432
# Location of database directory
Environment=PGDATA=/var/lib/pgsql/data
# Where to send early-startup messages from the server (before the logging
# options of postgresql.conf take effect)
# This is normally controlled by the global default set by systemd
# StandardOutput=syslog
# Disable OOM kill on the postmaster
OOMScoreAdjust=-1000
ExecStartPre=/usr/bin/postgresql-check-db-dir ${PGDATA}
ExecStart=/usr/bin/pg_ctl start -D ${PGDATA} -s -o "-p ${PGPORT}" -w -t 300
ExecStop=/usr/bin/pg_ctl stop -D ${PGDATA} -s -m fast
ExecReload=/usr/bin/pg_ctl reload -D ${PGDATA} -s
# Give a reasonable amount of time for the server to start up/shut down
TimeoutSec=300
[Install]
WantedBy=multi-user.target
I figured it out. After running initdb, I copied the data directory to the other drive. With SELinux, the copied files take on the file type of the target parent directory. I tried to semanage the directory, but that wasn't working. So I started over again and moved the data directory instead, which preserved the original file type.
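If you do need to keep a copied (rather than moved) data directory, the usual approach (a sketch, assuming the /data0/postgres location mentioned above) is to register the PostgreSQL file context for the new path and relabel it, instead of relying on whatever the copy inherited:
# semanage fcontext -a -t postgresql_db_t "/data0/postgres(/.*)?"
# restorecon -Rv /data0/postgres
That gives the relocated directory the same postgresql_db_t type that the original /var/lib/pgsql/data had.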

Changing port of OpenLdap on Centos installed with yum

I am trying to change the default port of OpenLDAP (I am not so experienced with OpenLDAP, so I might be doing something incorrectly).
Currently I am installing it through the yum package manager on CentOS 7.1.1503 as follows:
yum install openldap-servers
After installing 'openldap-servers' I can start the OpenLDAP server by invoking service slapd start.
However, when I try to change the port by editing /etc/sysconfig/slapd, for instance by changing SLAPD_URLS to the following:
# OpenLDAP server configuration
# see 'man slapd' for additional information
# Where the server will run (-h option)
# - ldapi:/// is required for on-the-fly configuration using client tools
# (use SASL with EXTERNAL mechanism for authentication)
# - default: ldapi:/// ldap:///
# - example: ldapi:/// ldap://127.0.0.1/ ldap://10.0.0.1:1389/ ldaps:///
SLAPD_URLS="ldapi:/// ldap://127.0.0.1:3421/"
# Any custom options
#SLAPD_OPTIONS=""
# Keytab location for GSSAPI Kerberos authentication
#KRB5_KTNAME="FILE:/etc/openldap/ldap.keytab"
(see SLAPD_URLS="ldapi:/// ldap://127.0.0.1:3421/"), it fails to start:
service slapd start
Redirecting to /bin/systemctl start slapd.service
Job for slapd.service failed. See 'systemctl status slapd.service' and 'journalctl -xn' for details.
service slapd status
Redirecting to /bin/systemctl status slapd.service
slapd.service - OpenLDAP Server Daemon
Loaded: loaded (/usr/lib/systemd/system/slapd.service; disabled)
Active: failed (Result: exit-code) since Fri 2015-07-31 07:49:06 EDT; 10s ago
Docs: man:slapd
man:slapd-config
man:slapd-hdb
man:slapd-mdb
file:///usr/share/doc/openldap-servers/guide.html
Process: 41704 ExecStart=/usr/sbin/slapd -u ldap -h ${SLAPD_URLS} $SLAPD_OPTIONS (code=exited, status=1/FAILURE)
Process: 41675 ExecStartPre=/usr/libexec/openldap/check-config.sh (code=exited, status=0/SUCCESS)
Main PID: 34363 (code=exited, status=0/SUCCESS)
Jul 31 07:49:06 osboxes runuser[41691]: pam_unix(runuser:session): session opened for user ldap by (uid=0)
Jul 31 07:49:06 osboxes runuser[41693]: pam_unix(runuser:session): session opened for user ldap by (uid=0)
Jul 31 07:49:06 osboxes runuser[41695]: pam_unix(runuser:session): session opened for user ldap by (uid=0)
Jul 31 07:49:06 osboxes runuser[41697]: pam_unix(runuser:session): session opened for user ldap by (uid=0)
Jul 31 07:49:06 osboxes runuser[41699]: pam_unix(runuser:session): session opened for user ldap by (uid=0)
Jul 31 07:49:06 osboxes runuser[41701]: pam_unix(runuser:session): session opened for user ldap by (uid=0)
Jul 31 07:49:06 osboxes slapd[41704]: @(#) $OpenLDAP: slapd 2.4.39 (Mar 6 2015 04:35:49) $
mockbuild@worker1.bsys.centos.org:/builddir/build/BUILD/openldap-2.4.39/openldap-2.4.39/servers/slapd
Jul 31 07:49:06 osboxes systemd[1]: slapd.service: control process exited, code=exited status=1
Jul 31 07:49:06 osboxes systemd[1]: Failed to start OpenLDAP Server Daemon.
Jul 31 07:49:06 osboxes systemd[1]: Unit slapd.service entered failed state.
P.S. I also disabled firewalld.
The solution was provided when I ran journalctl -xn, which basically says:
SELinux is preventing /usr/sbin/slapd from name_bind access on the tcp_socket port 9312.
***** Plugin bind_ports (92.2 confidence) suggests ************************
If you want to allow /usr/sbin/slapd to bind to network port 9312
Then you need to modify the port type.
Do
# semanage port -a -t ldap_port_t -p tcp 9312
***** Plugin catchall_boolean (7.83 confidence) suggests ******************
If you want to allow nis to enabled
Then you must tell SELinux about this by enabling the 'nis_enabled' boolean.
You can read 'None' man page for more details.
Do
setsebool -P nis_enabled 1
***** Plugin catchall (1.41 confidence) suggests **************************
If you believe that slapd should be allowed name_bind access on the port 9312 tcp_socket by default.
Then you should report this as a bug.
You can generate a local policy module to allow this access.
Do
allow this access for now by executing:
# grep slapd /var/log/audit/audit.log | audit2allow -M mypol
# semodule -i mypol.pp
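Once the port label is added, you can verify it before restarting slapd (ldap_port_t should then list your custom port alongside the defaults such as 389 and 636):
# semanage port -l | grep ldap_port_t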