ceph orch host add host issues (env: centos8, ceph:12.2.5) - ceph

information:
hostnames:
cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.49.41 ceph-gw-one
172.16.49.42 ceph-gw-two
shell: ceph orch host add 172.16.49.42
Error EINVAL: New host 172.16.49.42 (172.16.49.42) failed check: ['INFO:cephadm:podman|docker (/bin/docker) is present', 'INFO:cephadm:systemctl is present', 'INFO:cephadm:lvcreate is present', 'INFO:cephadm:Unit chronyd.service is enabled and running', 'INFO:cephadm:Hostname "172.16.49.42" matches what is expected.', 'ERROR: hostname "ceph-gw-two" does not match expected hostname "172.16.49.42"']
shell: ceph orch host add ceph-gw-two
Error EINVAL: Failed to connect to ceph-gw-two (ceph-gw-two).
Check that the host is reachable and accepts connections using the cephadm SSH key
you may want to run:
ceph cephadm get-ssh-config > ssh_config
ceph config-key get mgr/cephadm/ssh_identity_key > key
ssh -F ssh_config -i key root@ceph-gw-two
I have checked that SSH login succeeds with both the IP address and the hostname.

I read the cephadm source:
out, err, code = self._run_cephadm(spec.hostname, cephadmNoImage, 'check-host',
                                   ['--expect-hostname', spec.hostname],
                                   addr=spec.addr,
                                   error_ok=True, no_fsid=True)
if code:
    raise OrchestratorError('New host %s (%s) failed check: %s' % (
        spec.hostname, spec.addr, err))
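The check passes spec.hostname as --expect-hostname, so the first argument has to be the host's real hostname, with the address given separately. So I changed the command to:
ceph orch host add ceph-gw-two 172.16.49.42
Done, it works well.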
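The behaviour above can be illustrated with a minimal Python sketch. This is not cephadm's code: `check_expected_hostname` and `build_host_add_command` are hypothetical helpers that only mimic the comparison the `check-host --expect-hostname` step appears to make, assuming the remote host reports `ceph-gw-two` as its own hostname.

```python
from typing import List


def check_expected_hostname(actual_hostname: str, expected_hostname: str) -> List[str]:
    """Return a list of error strings; empty when the names match.

    Hypothetical stand-in for the comparison reported in the error message.
    """
    if actual_hostname != expected_hostname:
        return ['ERROR: hostname "%s" does not match expected hostname "%s"'
                % (actual_hostname, expected_hostname)]
    return []


def build_host_add_command(hostname: str, addr: str) -> List[str]:
    """Build the argv for `ceph orch host add <hostname> <addr>`."""
    return ["ceph", "orch", "host", "add", hostname, addr]


if __name__ == "__main__":
    # Passing the IP as the hostname fails the comparison ...
    print(check_expected_hostname("ceph-gw-two", "172.16.49.42"))
    # ... while passing the real hostname passes it.
    print(check_expected_hostname("ceph-gw-two", "ceph-gw-two"))
    print(" ".join(build_host_add_command("ceph-gw-two", "172.16.49.42")))
```

The point of the sketch is only that the first argument is compared against the name the host reports, while the address is a separate value used for the connection.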

Related

Rancher desktop error when starting kubernetes

My Rancher desktop was working just fine, until today when I switched container runtime from containerd to dockerd. When I wanted to change it back to containerd, it says:
Error Starting Kubernetes
Error: unable to verify the first certificate
Some recent logfile lines:
client-key-data: LS0tLS1CRUdJTiBFQyBQUklWQVRFIEtFWS0tLS0tCk1IY0NBUUVFSUV1eXhYdFYvTDZOQmZsZVV0Mnp5ekhNUmlzK2xXRzUxUzBlWklKMmZ5MHJvQW9HQ0NxR1NNNDkKQXdFSG9VUURRZ0FFNGdQODBWNllIVzBMSW13Q3lBT2RWT1FzeGNhcnlsWU8zMm1YUFNvQ2Z2aTBvL29UcklMSApCV2NZdUt3VnVuK1liS3hEb0VackdvbTJ2bFJTWkZUZTZ3PT0KLS0tLS1FTkQgRUMgUFJJVkFURSBLRVktLS0tLQo=
2022-09-02T13:03:15.834Z: Error starting lima: Error: unable to verify the first certificate
at TLSSocket.onConnectSecure (node:_tls_wrap:1530:34)
at TLSSocket.emit (node:events:390:28)
at TLSSocket._finishInit (node:_tls_wrap:944:8)
at TLSWrap.ssl.onhandshakedone (node:_tls_wrap:725:12) {
code: 'UNABLE_TO_VERIFY_LEAF_SIGNATURE'
}
I tried reinstalling, a factory reset, etc., but no luck. I am using version 1.24.4.
TLDR: Try turning off Docker, or whatever else is bound to port 6443. Reset Kubernetes in Rancher Desktop, then try again.
Check whether anything else is listening on port 6443, which the Kubernetes instance in Rancher Desktop needs.
In my case, lsof -i :6443 gave me...
~ lsof -i :6443
COMMAND     PID USER          FD   TYPE DEVICE             SIZE/OFF NODE NAME
com.docke 63385 ~~~~~~~~~~~~ 150u  IPv4 0x44822db677e8e087      0t0  TCP localhost:sun-sr-https (LISTEN)
ssh       82481 ~~~~~~~~~~~~  27u  IPv4 0x44822db677ebb1e7      0t0  TCP *:sun-sr-https (LISTEN)
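If lsof is not available, a rough equivalent of the port check can be sketched in Python. The helper name `port_in_use` is made up for this example, and it only tells you that something accepts TCP connections on the port, not which process owns it.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("6443 in use:", port_in_use(6443))
```

If this reports the port as busy while Rancher Desktop's Kubernetes is stopped, something else (Docker Desktop or an SSH tunnel in the output above) is holding it.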

Unable to start docker container, "docker ps -a" STATUS = Exited (1)

I'm trying to start a postgres instance as described in docker hub.
To do that, I ran the following command:
sudo docker run --name database -e POSTGRES_PASSWORD=supersecret -p 5432:5432 -d postgres
When I run docker ps it shows nothing, and when I run docker ps -a it shows:
CONTAINER ID   IMAGE      COMMAND                  CREATED         STATUS                     PORTS   NAMES
967ebe7efb74   postgres   "docker-entrypoint.s…"   2 minutes ago   Exited (1) 2 minutes ago           database
Trying to docker start database also leads to STATUS Exited (1) as displayed above.
Here are the logs displayed by docker logs -f database:
PostgreSQL init process complete; ready for start up.
2019-09-10 14:08:26.941 UTC [1] LOG: could not create IPv6 socket for address "::": Permission denied
2019-09-10 14:08:26.941 UTC [1] LOG: could not create IPv4 socket for address "0.0.0.0": Permission denied
2019-09-10 14:08:26.941 UTC [1] WARNING: could not create listen socket for "*"
2019-09-10 14:08:26.941 UTC [1] FATAL: could not create any TCP/IP sockets
2019-09-10 14:08:26.941 UTC [1] LOG: database system is shut down
In my research on the internet to solve this problem, some people said that it could be something with my hosts file, but it seems fine as shown below.
127.0.0.1 localhost
127.0.1.1 user-PC
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
I tried to reinstall docker following the docker docs step-by-step tutorial, and I also did the post installation steps without success.
OS: Deepin GNU/Linux 15.11
Docker: Docker version 18.09.6, build 481bc77
Since you confirmed it, I am posting my comment as an answer.
The problem is caused by AppArmor.
Try disabling it or, better, configure the security profile.
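Before changing anything, it can help to confirm that AppArmor is actually active on the host. A minimal sketch, assuming the kernel exposes the usual `/sys/module/apparmor/parameters/enabled` flag file (containing `Y` when enabled); the function name `apparmor_enabled` is hypothetical.

```python
from pathlib import Path

# Assumed location of the AppArmor module flag on Linux kernels that include AppArmor.
APPARMOR_FLAG = Path("/sys/module/apparmor/parameters/enabled")


def apparmor_enabled(flag_path: Path = APPARMOR_FLAG) -> bool:
    """Return True if the flag file exists and contains 'Y'."""
    try:
        return flag_path.read_text().strip() == "Y"
    except OSError:
        # Flag file missing or unreadable: treat AppArmor as not active.
        return False


if __name__ == "__main__":
    print("AppArmor enabled:", apparmor_enabled())
```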

could not bind IPv4 socket: Permission denied

I am trying to set up a new instance of PostgreSQL 9.6 on a machine. I have tested it on another machine and it works fine there, but the same process does not work on the new machine. Below are the steps I am using.
Created a new data directory with the command below:
/opt/rh/rh-postgresql96/root/bin/initdb -D /var/lib/pgsql/9.6/data/
Created a service file /etc/systemd/system/rh-postgresql96-inst2.service with the content below:
.include /lib/systemd/system/rh-postgresql96-postgresql.service
[Service]
Environment=PGDATA=/var/lib/pgsql/9.6/data/
Environment=PGPORT=5433
User=postgres
Group=root
Registered the service using the command systemctl enable rh-postgresql96-inst2.
Started the service using the command systemctl start rh-postgresql96-inst2.
All these steps work fine on one machine but not on the second one.
I get the error below when starting the service on the second machine:
rh-postgresql96-inst2.service - PostgreSQL database server
Loaded: loaded (/etc/systemd/system/rh-postgresql96-inst2.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Mon 2018-06-18 09:59:01 UTC; 10s ago
Process: 7552 ExecStart=/opt/rh/rh-postgresql96/root/usr/libexec/postgresql-ctl start -D ${PGDATA} -s -w -t ${PGSTARTTIMEOUT} (code=exited, status=1/FAILURE)
Process: 7550 ExecStartPre=/opt/rh/rh-postgresql96/root/usr/libexec/postgresql-check-db-dir %N (code=exited, status=0/SUCCESS)
HINT: Is another postmaster already running on port 5433? If not, wait a few seconds and retry.
LOG: could not bind IPv4 socket: Permission denied
HINT: Is another postmaster already running on port 5433? If not, wait a few seconds and retry.
WARNING: could not create listen socket for "localhost"
FATAL: could not create any TCP/IP sockets
LOG: database system is shut down
systemd[1]: rh-postgresql96-inst2.service: control process exited, code=exited status=1
systemd[1]: Failed to start PostgreSQL database server.
systemd[1]: Unit rh-postgresql96-inst2.service entered failed state.
systemd[1]: rh-postgresql96-inst2.service failed.
However, I am able to start the server using pg_ctl.
I have also checked with netstat and lsof whether any other PostgreSQL instance is running on port 5433, but that is not the case.
In fact, I also tried ports 5431 and 5434, but the server does not start on those either.
Instead of turning off SELinux, you should allow postgres to bind to port 5433 in SELinux.
The SELinux port type postgresql_port_t contains ports 5432 and 9898 by default.
semanage port -l | grep post
postgresql_port_t tcp 5432, 9898
What you could do is simply add port 5433 to this list.
semanage port -a -t postgresql_port_t 5433 -p tcp
semanage port -l | grep post
postgresql_port_t tcp 5433, 5432, 9898
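The same check can be scripted. Below is a minimal Python sketch that parses text in the format printed by `semanage port -l` and reports whether a port is listed for a given type. The helper names `parse_semanage_ports` and `port_allowed` are made up for this example, and the sample text in the `__main__` block is illustrative rather than captured from a real system.

```python
from typing import Dict, List, Tuple

PortRanges = Dict[Tuple[str, str], List[Tuple[int, int]]]


def parse_semanage_ports(output: str) -> PortRanges:
    """Map (selinux_type, protocol) to a list of inclusive (low, high) port ranges."""
    table: PortRanges = {}
    for line in output.splitlines():
        parts = line.split(None, 2)
        # Skip blank lines and the header; data lines have a protocol in column two.
        if len(parts) < 3 or parts[1] not in ("tcp", "udp", "dccp", "sctp"):
            continue
        selinux_type, proto, ports = parts
        for item in ports.split(","):
            low, _, high = item.strip().partition("-")
            table.setdefault((selinux_type, proto), []).append((int(low), int(high or low)))
    return table


def port_allowed(output: str, selinux_type: str, proto: str, port: int) -> bool:
    """Return True if port falls in any range listed for (selinux_type, proto)."""
    ranges = parse_semanage_ports(output).get((selinux_type, proto), [])
    return any(low <= port <= high for low, high in ranges)


if __name__ == "__main__":
    sample = "postgresql_port_t              tcp      5432, 9898\n"  # illustrative
    print(port_allowed(sample, "postgresql_port_t", "tcp", 5433))
```

In practice the text would come from running `semanage port -l` (for example through `subprocess.run`), which requires root.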
After that you can start your postgres server listening on port 5433
systemctl enable rh-postgresql96-postgresql
systemctl start rh-postgresql96-postgresql
netstat -tulpn
tcp 0 0 127.0.0.1:5432 0.0.0.0:* LISTEN 2847/postgres
tcp 0 0 127.0.0.1:5433 0.0.0.0:* LISTEN 2775/postgres
There is also a handy tool called audit2allow to help debug selinux problems.
audit2allow -m whatiswrong < /var/log/audit/audit.log > /root/showme.te
The file showme.te shows you why SELinux is not allowing the service to do what you need.
You should not turn off SELinux just because it's hard to understand or because you don't know how it works. Instead you should study it :)
I recommend this lecture from the Red Hat Summit https://www.redhat.com/en/about/videos/summit-2018-security-enhanced-linux-mere-mortals
This issue was related to SELinux.
When I ran the command sestatus on both machines, the output was slightly different.
One server had Current mode: permissive and the second had Current mode: enforcing.
So I changed the current mode to permissive on the second machine using the command setenforce 0, and that resolved the permission issue. Now I am able to start the second instance.
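To compare machines quickly, the mode can be pulled out of the sestatus text with a few lines of Python. This is a sketch only: `selinux_current_mode` is a hypothetical helper, and the sample text is an abridged illustration of the sestatus format. Note also that setenforce 0 only lasts until the next reboot.

```python
from typing import Optional


def selinux_current_mode(sestatus_output: str) -> Optional[str]:
    """Return the value of the 'Current mode:' line, or None if it is absent."""
    for line in sestatus_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "current mode":
            return value.strip()
    return None


if __name__ == "__main__":
    # Abridged, illustrative sestatus output.
    sample = (
        "SELinux status:                 enabled\n"
        "Current mode:                   enforcing\n"
        "Mode from config file:          enforcing\n"
    )
    print(selinux_current_mode(sample))
```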

Catalyst exiting when started with start-stop-daemon

I am trying to run Catalyst on CentOS 7 using start-stop-daemon. Here is the start-stop-daemon command that I run:
start-stop-daemon --start --pidfile /var/run/myapp.pid -d "/home/user/myapp" --exec /opt/perlbrew/perls/perl-5.22.0/bin/perl --startas "/home/user/myapp/script/myapp_fastcgi.pl" --chuid root --make-pid -- "-l :8100 -n 6"
Then I get this error:
Cannot resolve host name -- exiting!
It displays this error after loading the chained actions and printing them to the screen, and after displaying the final message:
[info] myapp powered by Catalyst 5.90112
In /etc/hosts I've tried commenting out any hostnames I thought might be causing an issue:
127.0.0.1 myapp.com myapp.com
#127.0.0.1 localhost.localdomain localhost
#127.0.0.1 localhost4.localdomain4 localhost4
# The following lines are desirable for IPv6 capable hosts
#::1 myapp.com myapp.com
#::1 localhost.localdomain localhost
#::1 localhost6.localdomain6 localhost6
What's strange is that if I don't use start-stop-daemon and I just start the server from the command-line, the server starts fine.
Most likely it can't resolve your hostname.
Check what your hostname command returns and make sure that the same hostname is present in your /etc/hosts. Don't assign it to the loopback address; use a real IP.
You can also trace exactly what it is trying to resolve by using this method:
https://serverfault.com/questions/666482/how-to-find-out-pid-of-the-process-sending-packets-generating-network-traffic
Or it might be even simpler to run tcpdump -s 0 port 53
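The hostname check described above can be sketched in Python. The helper names `resolve_hostname` and `is_loopback` are invented for this example, and the `resolver` parameter exists only so the lookup can be swapped out; by default it uses the standard `socket.gethostbyname`.

```python
import ipaddress
import socket
from typing import Callable, Optional


def resolve_hostname(name: Optional[str] = None,
                     resolver: Callable[[str], str] = socket.gethostbyname) -> Optional[str]:
    """Return the IPv4 address for name (default: this machine's hostname), or None."""
    if name is None:
        name = socket.gethostname()
    try:
        return resolver(name)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def is_loopback(address: str) -> bool:
    """Return True for loopback addresses such as 127.0.0.1 or ::1."""
    return ipaddress.ip_address(address).is_loopback


if __name__ == "__main__":
    addr = resolve_hostname()
    if addr is None:
        print("hostname does not resolve")
    else:
        print(addr, "(loopback)" if is_loopback(addr) else "(non-loopback)")
```

Running it under the same user and environment that start-stop-daemon uses would show whether the lookup fails only in that context.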

Get this result back when putting `postgres -D /usr/local/var/postgres` in terminal

When I put postgres -D /usr/local/var/postgres in terminal I get this back:
LOG: could not translate host name "localhost", service "5432" to address: nodename nor servname provided, or not known
WARNING: could not create listen socket for "localhost"
FATAL: could not create any TCP/IP sockets
Can someone help me fix this?
Check your /etc/hosts file for a missing line:
127.0.0.1 localhost.localdomain localhost
Regards
H
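A quick way to check for that line is to parse the hosts file. The sketch below uses made-up helper names (`hosts_entries`, `addresses_for`) and handles only the basic "address name [aliases...]" format with `#` comments.

```python
from typing import Iterator, List, Tuple


def hosts_entries(text: str) -> Iterator[Tuple[str, List[str]]]:
    """Yield (address, [names]) for each non-comment line of hosts-file text."""
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) >= 2:
            yield fields[0], fields[1:]


def addresses_for(text: str, name: str) -> List[str]:
    """Return every address that the hosts text maps to name."""
    return [addr for addr, names in hosts_entries(text) if name in names]


if __name__ == "__main__":
    with open("/etc/hosts") as handle:
        found = addresses_for(handle.read(), "localhost")
    print(found or "no entry for localhost")
```

An empty result for localhost would match the "could not translate host name" error above.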