ceph-deploy mon add node2 admin_socket: exception - ceph

Following the official quick-ceph-deploy guide, at the adding-monitors step I hit the following problem:
[ubuntu@admin-node my-cluster]$ ceph-deploy mon add node2
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/ubuntu/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.37): /usr/bin/ceph-deploy mon add node2
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : add
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x1b0fc20>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] mon : ['node2']
[ceph_deploy.cli][INFO ] func : <function mon at 0x1b07320>
[ceph_deploy.cli][INFO ] address : None
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.mon][INFO ] ensuring configuration of new mon host: node2
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to node2
[node2][DEBUG ] connection detected need for sudo
[node2][DEBUG ] connected to host: node2
[node2][DEBUG ] detect platform information from remote host
[node2][DEBUG ] detect machine type
[node2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.mon][DEBUG ] Adding mon to cluster ceph, host node2
[ceph_deploy.mon][DEBUG ] using mon address by resolving host: 172.24.2.232
[ceph_deploy.mon][DEBUG ] detecting platform for host node2 ...
[node2][DEBUG ] connection detected need for sudo
[node2][DEBUG ] connected to host: node2
[node2][DEBUG ] detect platform information from remote host
[node2][DEBUG ] detect machine type
[node2][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO ] distro info: CentOS Linux 7.3.1611 Core
[node2][DEBUG ] determining if provided host has same hostname in remote
[node2][DEBUG ] get remote short hostname
[node2][DEBUG ] adding mon to node2
[node2][DEBUG ] get remote short hostname
[node2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[node2][DEBUG ] create the mon path if it does not exist
[node2][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-node2/done
[node2][DEBUG ] create a done file to avoid re-doing the mon deployment
[node2][DEBUG ] create the init path if it does not exist
[node2][INFO ] Running command: sudo systemctl enable ceph.target
[node2][INFO ] Running command: sudo systemctl enable ceph-mon@node2
[node2][INFO ] Running command: sudo systemctl start ceph-mon@node2
[node2][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.node2.asok mon_status
[node2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[node2][WARNIN] node2 is not defined in `mon initial members`
[node2][WARNIN] monitor node2 does not exist in monmap
[node2][WARNIN] neither `public_addr` nor `public_network` keys are defined for monitors
[node2][WARNIN] monitors may not be able to form quorum
[node2][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.node2.asok mon_status
[node2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[node2][WARNIN] monitor: mon.node2, might not be running yet
ceph.conf:
[global]
fsid = 5ec213d4-ae42-44c2-81d1-d7bdbee7f36a
mon_initial_members = node1
mon_host = 172.24.2.230
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
# by honghe
osd pool default size = 2
The ceph status is:
[ubuntu@admin-node my-cluster]$ ceph status
cluster 5ec213d4-ae42-44c2-81d1-d7bdbee7f36a
health HEALTH_OK
monmap e1: 1 mons at {node1=172.24.2.230:6789/0}
election epoch 3, quorum 0 node1
osdmap e25: 3 osds: 3 up, 3 in
flags sortbitwise,require_jewel_osds
pgmap v12180: 112 pgs, 7 pools, 1588 bytes data, 171 objects
19243 MB used, 90095 MB / 106 GB avail
112 active+clean
[ubuntu@admin-node my-cluster]$ ceph -v
ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185)
Any help will be appreciated!

Resolved.
I missed the network configuration.
Because each nodeX has 2 interfaces:
[ubuntu@node1 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 52:54:00:a8:33:13 brd ff:ff:ff:ff:ff:ff
inet 172.24.2.230/24 brd 172.24.2.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::9aae:a37b:289d:1745/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 52:54:00:7f:98:49 brd ff:ff:ff:ff:ff:ff
inet 192.168.122.210/24 brd 192.168.122.255 scope global dynamic eth1
valid_lft 2754sec preferred_lft 2754sec
inet6 fe80::19b4:5768:7469:5767/64 scope link
valid_lft forever preferred_lft forever
So we have to configure the network.
If you have more than one network interface, add the public network setting under the [global] section of your Ceph configuration file. See the Network Configuration Reference for details.
Just as follows:
[global]
fsid = 5ec213d4-ae42-44c2-81d1-d7bdbee7f36a
mon_initial_members = node1
mon_host = 172.24.2.230
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
# by honghe
osd pool default size = 2
public network = 172.24.2.0/24
Moreover, adding a cluster network configuration that uses the other interface can give better performance.
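For example, a minimal sketch of that extra step (my subnets from the ip a output above; eth1 carries 192.168.122.0/24, so adjust to your own networks):
[global]
# ... same settings as above ...
public network = 172.24.2.0/24
cluster network = 192.168.122.0/24
Then push the updated config to the nodes and restart the monitor, e.g.:
ceph-deploy --overwrite-conf config push node1 node2
sudo systemctl restart ceph-mon@node2   # run on node2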

Related

Ceph (cephadm) quincy: can't add OSD from remote nodes (command hanging)

I'm stuck on a problem while trying to create a cluster of 3 nodes (AWS EC2 instances):
fa11.something.com ~ # ceph orch host ls
HOST ADDR LABELS STATUS
fa11.something.com 172.16.24.67 _admin
fa12.something.com 172.16.23.159 _admin
fa13.something.com 172.16.25.119 _admin
3 hosts in cluster
Each of them has 2 disks (all accepted by Ceph):
fa11.something.com ~ # ceph orch device ls
HOST PATH TYPE DEVICE ID SIZE AVAILABLE REFRESHED REJECT REASONS
fa11.something.com /dev/nvme1n1 ssd Amazon_Elastic_Block_Store_vol016651cf7f3b9c9dd 8589M Yes 7m ago
fa11.something.com /dev/nvme2n1 ssd Amazon_Elastic_Block_Store_vol034082d7d364dfbdb 5368M Yes 7m ago
fa12.something.com /dev/nvme1n1 ssd Amazon_Elastic_Block_Store_vol0ec193fa3f77fee66 8589M Yes 3m ago
fa12.something.com /dev/nvme2n1 ssd Amazon_Elastic_Block_Store_vol018736f7eeab725f5 5368M Yes 3m ago
fa13.something.com /dev/nvme1n1 ssd Amazon_Elastic_Block_Store_vol0443a031550be1024 8589M Yes 84s ago
fa13.something.com /dev/nvme2n1 ssd Amazon_Elastic_Block_Store_vol0870412d37717dc2c 5368M Yes 84s ago
fa11.something.com is the first host, from which I manage the cluster.
Adding an OSD on fa11.something.com itself works fine:
fa11.something.com ~ # ceph orch daemon add osd fa11.something.com:/dev/nvme1n1
Created osd(s) 0 on host 'fa11.something.com'
But it doesn't work for the other 2 hosts (it hangs forever):
fa11.something.com ~ # ceph orch daemon add osd fa12.something.com:/dev/nvme1n1
^CInterrupted
The logs on fa12.something.com show that it hangs at the following step:
fa12.something.com ~ # tail /var/log/ceph/a9ef6c26-ac38-11ed-9429-06e6bc29c1db/ceph-volume.log
...
[2023-02-14 07:38:20,942][ceph_volume.process][INFO ] Running command: /usr/bin/ceph-authtool --gen-print-key
[2023-02-14 07:38:20,964][ceph_volume.process][INFO ] Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new a51506c2-e910-4763-9a0c-f6c2194944e2
I'm not sure what might be causing this hang.
Additional details:
cephadm was installed using curl (https://docs.ceph.com/en/quincy/cephadm/install/#curl-based-installation)
I use the user ceph instead of root, and port 2222 instead of 22. The first node was bootstrapped using the command below:
cephadm bootstrap --mon-ip 172.16.24.67 --allow-fqdn-hostname --ssh-user ceph --ssh-config /home/anton/ceph/ssh_config --cluster-network 172.16.16.0/20 --skip-monitoring-stack
Content of /home/anton/ceph/ssh_config:
fa11.something.com ~ # cat /home/anton/ceph/ssh_config
Host *
User ceph
Port 2222
IdentityFile /home/ceph/.ssh/id_rsa
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
Hosts fa12.something.com and fa13.something.com were added using the commands:
ceph orch host add fa12.something.com 172.16.23.159 --labels _admin
ceph orch host add fa13.something.com 172.16.25.119 --labels _admin
I'm not sure whether I need to check that some specific ports are not blocked; I expected to get an error at an early stage if Ceph couldn't reach a port...
Thanks in advance for any suggestion!
It turned out the problem was caused by blocked port(s). I dropped all restrictions at the iptables level: iptables -F; iptables -P INPUT ACCEPT, and it started to work. I'll check later exactly which ports are needed; I guess they are described here: https://docs.ceph.com/en/latest/rados/configuration/network-config-ref/
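As a narrower fix than flushing iptables entirely, the network configuration reference linked above lists the monitor ports (3300 and 6789) and the 6800-7300 range used by OSDs and managers. A sketch of per-node iptables rules for just those ports (any AWS security-group rules between the instances would also need to allow them):
# allow Ceph monitor traffic (msgr2 on 3300, legacy on 6789)
iptables -A INPUT -p tcp -m multiport --dports 3300,6789 -j ACCEPT
# allow OSD/MGR daemon traffic
iptables -A INPUT -p tcp --dport 6800:7300 -j ACCEPT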

Can't connect to PostgreSQL on host machine from Docker container

I use a VM with Ubuntu Server 20.04 LTS, where I set up the following Docker network:
[
{
"Name": "my-net",
"Id": "d06d15cbc443df8565b76e30aa13da05e26cd3bfc8d33551020d2ce3fe94a118",
"Created": "2022-05-29T22:08:12.618759894Z",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": {},
"Config": [
{
"Subnet": "172.18.0.0/16",
"Gateway": "172.18.0.1"
}
]
},
"Internal": false,
"Attachable": false,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {
"119b4767f4ec8fc7b5d8adcaab1a71999df2f79b32a02f8ba1a66270c7531a70": {
"Name": "server",
"EndpointID": "4c8cb2a54a82b0092dd677cdf8ce9264812b4f3e31bef59c723a085928cb0441",
"MacAddress": "02:42:ac:12:00:03",
"IPv4Address": "172.18.0.3/16",
"IPv6Address": ""
},
"c55c205642ebe8f5eb511af071a6f3183277d871ed4b66df9e00fe53e6eb9c54": {
"Name": "sso",
"EndpointID": "aff4f8ddfea45d718d73ba609f035ac871d3633ee7c47d318cb84a757f92e9ed",
"MacAddress": "02:42:ac:12:00:02",
"IPv4Address": "172.18.0.2/16",
"IPv6Address": ""
},
"cf80d38ab854f9d8fdb66129632a38fce575fc47de1044088871ff8a6e67016a": {
"Name": "gateway",
"EndpointID": "c98bd162d5383ff3e46753c0c9c4101a87dd189af8c0144bf999b64cf63691be",
"MacAddress": "02:42:ac:12:00:04",
"IPv4Address": "172.18.0.4/16",
"IPv6Address": ""
}
},
"Options": {},
"Labels": {}
}
]
I need to connect from the sso container to the PostgreSQL instance on the host machine at localhost:5432 (the default).
Below is pg_hba.conf:
local all postgres peer
# TYPE DATABASE USER ADDRESS METHOD
# "local" is for Unix domain socket connections only
local all all peer
# IPv4 local connections:
host all all 127.0.0.1/32 md5
# IPv6 local connections:
host all all ::1/128 md5
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication all peer
host replication all 127.0.0.1/32 md5
host replication all ::1/128 md5
NB! PostgreSQL is not reachable from the outside network (that was the default): if I connect to my VM by its IP address, the connection is refused. That is what I want, since I don't want my database to be visible from the Internet.
I run the sso container with the following command:
docker run --name sso \
--network my-net \
--add-host host.docker.internal:host-gateway \
-e DB_URL=jdbc:postgresql://host.docker.internal:5432/sso?user=my_user&password=my_password \
-d sso:latest
I've also tested a connection to my database from DBeaver using an SSH tunnel, and everything works.
However, when I send a request to sso, it crashes with this error:
com.zaxxer.hikari.pool.HikariPool$PoolInitializationException: Failed to initialize pool: Connection to host.docker.internal:5432 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
at com.zaxxer.hikari.pool.HikariPool.throwPoolInitializationException(HikariPool.java:596)
at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:582)
at com.zaxxer.hikari.pool.HikariPool.<init>(HikariPool.java:100)
at com.zaxxer.hikari.HikariDataSource.<init>(HikariDataSource.java:81)
at com.kleinstein.sso.data.gateways.db.DatabaseGateway.<init>(DatabaseGateway.kt:27)
at com.kleinstein.sso.DependencyInjectionKt.initDatabase(DependencyInjection.kt:47)
at com.kleinstein.sso.DependencyInjectionKt$installDi$1$2$1.invoke(DependencyInjection.kt:20)
at com.kleinstein.sso.DependencyInjectionKt$installDi$1$2$1.invoke(DependencyInjection.kt:20)
at org.kodein.di.bindings.Singleton$getFactory$1$1$1.invoke(standardBindings.kt:134)
at org.kodein.di.bindings.SingletonReference.make(references.kt:34)
at org.kodein.di.bindings.Singleton$getFactory$1$1.invoke(standardBindings.kt:134)
at org.kodein.di.bindings.Singleton$getFactory$1$1.invoke(standardBindings.kt:134)
at org.kodein.di.bindings.StandardScopeRegistry.getOrCreate(scopes.kt:66)
at org.kodein.di.bindings.Singleton$getFactory$1.invoke(standardBindings.kt:134)
at org.kodein.di.bindings.Singleton$getFactory$1.invoke(standardBindings.kt:131)
at org.kodein.di.DIContainer$DefaultImpls$provider$$inlined$toProvider$1.invoke(curry.kt:14)
at org.kodein.di.internal.DirectDIBaseImpl.Instance(DirectDIImpl.kt:30)
at com.kleinstein.sso.DependencyInjectionKt$installDi$1$5$1.invoke(DependencyInjection.kt:67)
at com.kleinstein.sso.DependencyInjectionKt$installDi$1$5$1.invoke(DependencyInjection.kt:23)
at org.kodein.di.bindings.Provider$getFactory$1.invoke(standardBindings.kt:89)
at org.kodein.di.bindings.Provider$getFactory$1.invoke(standardBindings.kt:89)
at org.kodein.di.DIContainer$DefaultImpls$provider$$inlined$toProvider$1.invoke(curry.kt:14)
at org.kodein.di.DIAwareKt$Instance$1.invoke(DIAware.kt:209)
at org.kodein.di.DIAwareKt$Instance$1.invoke(DIAware.kt:207)
at org.kodein.di.DIProperty$provideDelegate$1.invoke(properties.kt:57)
at kotlin.SynchronizedLazyImpl.getValue(LazyJVM.kt:74)
at com.kleinstein.sso.presentation.handlers.AuthenticationHandlersKt.installAuthHandlers$lambda-1(AuthenticationHandlers.kt:18)
at com.kleinstein.sso.presentation.handlers.AuthenticationHandlersKt.access$installAuthHandlers$lambda-1(AuthenticationHandlers.kt:1)
at com.kleinstein.sso.presentation.handlers.AuthenticationHandlersKt$installAuthHandlers$1$1$1.invokeSuspend(AuthenticationHandlers.kt:26)
at com.kleinstein.sso.presentation.handlers.AuthenticationHandlersKt$installAuthHandlers$1$1$1.invoke(AuthenticationHandlers.kt)
at com.kleinstein.sso.presentation.handlers.AuthenticationHandlersKt$installAuthHandlers$1$1$1.invoke(AuthenticationHandlers.kt)
at io.ktor.auth.BasicAuthKt$basic$1.invokeSuspend(BasicAuth.kt:81)
at io.ktor.auth.BasicAuthKt$basic$1.invoke(BasicAuth.kt)
at io.ktor.auth.BasicAuthKt$basic$1.invoke(BasicAuth.kt)
at io.ktor.util.pipeline.SuspendFunctionGun.loop(SuspendFunctionGun.kt:248)
at io.ktor.util.pipeline.SuspendFunctionGun.proceed(SuspendFunctionGun.kt:116)
at io.ktor.util.pipeline.SuspendFunctionGun.execute(SuspendFunctionGun.kt:136)
at io.ktor.util.pipeline.Pipeline.execute(Pipeline.kt:78)
at io.ktor.auth.Authentication.processAuthentication(Authentication.kt:235)
at io.ktor.auth.Authentication.access$processAuthentication(Authentication.kt:19)
at io.ktor.auth.Authentication$interceptPipeline$2.invokeSuspend(Authentication.kt:125)
at io.ktor.auth.Authentication$interceptPipeline$2.invoke(Authentication.kt)
at io.ktor.auth.Authentication$interceptPipeline$2.invoke(Authentication.kt)
at io.ktor.util.pipeline.SuspendFunctionGun.loop(SuspendFunctionGun.kt:248)
at io.ktor.util.pipeline.SuspendFunctionGun.proceed(SuspendFunctionGun.kt:116)
at io.ktor.util.pipeline.SuspendFunctionGun.execute(SuspendFunctionGun.kt:136)
at io.ktor.util.pipeline.Pipeline.execute(Pipeline.kt:78)
at io.ktor.routing.Routing.executeResult(Routing.kt:155)
at io.ktor.routing.Routing.interceptor(Routing.kt:39)
at io.ktor.routing.Routing$Feature$install$1.invokeSuspend(Routing.kt:107)
at io.ktor.routing.Routing$Feature$install$1.invoke(Routing.kt)
at io.ktor.routing.Routing$Feature$install$1.invoke(Routing.kt)
at io.ktor.util.pipeline.SuspendFunctionGun.loop(SuspendFunctionGun.kt:248)
at io.ktor.util.pipeline.SuspendFunctionGun.proceed(SuspendFunctionGun.kt:116)
at io.ktor.features.ContentNegotiation$Feature$install$1.invokeSuspend(ContentNegotiation.kt:145)
at io.ktor.features.ContentNegotiation$Feature$install$1.invoke(ContentNegotiation.kt)
at io.ktor.features.ContentNegotiation$Feature$install$1.invoke(ContentNegotiation.kt)
at io.ktor.util.pipeline.SuspendFunctionGun.loop(SuspendFunctionGun.kt:248)
at io.ktor.util.pipeline.SuspendFunctionGun.proceed(SuspendFunctionGun.kt:116)
at io.ktor.util.pipeline.SuspendFunctionGun.execute(SuspendFunctionGun.kt:136)
at io.ktor.util.pipeline.Pipeline.execute(Pipeline.kt:78)
at io.ktor.server.engine.DefaultEnginePipelineKt$defaultEnginePipeline$2.invokeSuspend(DefaultEnginePipeline.kt:127)
at io.ktor.server.engine.DefaultEnginePipelineKt$defaultEnginePipeline$2.invoke(DefaultEnginePipeline.kt)
at io.ktor.server.engine.DefaultEnginePipelineKt$defaultEnginePipeline$2.invoke(DefaultEnginePipeline.kt)
at io.ktor.util.pipeline.SuspendFunctionGun.loop(SuspendFunctionGun.kt:248)
at io.ktor.util.pipeline.SuspendFunctionGun.proceed(SuspendFunctionGun.kt:116)
at io.ktor.util.pipeline.SuspendFunctionGun.execute(SuspendFunctionGun.kt:136)
at io.ktor.util.pipeline.Pipeline.execute(Pipeline.kt:78)
at io.ktor.server.netty.NettyApplicationCallHandler$handleRequest$1.invokeSuspend(NettyApplicationCallHandler.kt:123)
at io.ktor.server.netty.NettyApplicationCallHandler$handleRequest$1.invoke(NettyApplicationCallHandler.kt)
at io.ktor.server.netty.NettyApplicationCallHandler$handleRequest$1.invoke(NettyApplicationCallHandler.kt)
at kotlinx.coroutines.intrinsics.UndispatchedKt.startCoroutineUndispatched(Undispatched.kt:55)
at kotlinx.coroutines.CoroutineStart.invoke(CoroutineStart.kt:112)
at kotlinx.coroutines.AbstractCoroutine.start(AbstractCoroutine.kt:126)
at kotlinx.coroutines.BuildersKt__Builders_commonKt.launch(Builders.common.kt:56)
at kotlinx.coroutines.BuildersKt.launch(Unknown Source)
at io.ktor.server.netty.NettyApplicationCallHandler.handleRequest(NettyApplicationCallHandler.kt:43)
at io.ktor.server.netty.NettyApplicationCallHandler.channelRead(NettyApplicationCallHandler.kt:34)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:61)
at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:370)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:503)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.ktor.server.netty.EventLoopGroupProxy$Companion.create$lambda-1$lambda-0(NettyApplicationEngine.kt:251)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: org.postgresql.util.PSQLException: Connection to host.docker.internal:5432 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:319)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:49)
at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:223)
at org.postgresql.Driver.makeConnection(Driver.java:400)
at org.postgresql.Driver.connect(Driver.java:259)
at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:121)
at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:359)
at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:201)
at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:470)
at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:561)
... 87 common frames omitted
Caused by: java.net.ConnectException: Connection refused
at java.base/sun.nio.ch.Net.pollConnect(Native Method)
at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:672)
at java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:542)
at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:597)
at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
at java.base/java.net.Socket.connect(Socket.java:633)
at org.postgresql.core.PGStream.createSocket(PGStream.java:241)
at org.postgresql.core.PGStream.<init>(PGStream.java:98)
at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:109)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:235)
... 96 common frames omitted
Where can the problem be? Unfortunately, I am bad at networking.
P.S. If PostgreSQL runs as just another Docker container, everything works; everything also works on my local machine (without Docker).
Finally I realised how to fix my problem.
First, I had a look at my Docker network, specifically at these lines:
"Subnet": "172.18.0.0/16",
"Gateway": "172.18.0.1"
This means Docker must have defined a new network interface on my host machine. Let's check with ip address show:
98: br-d06d15cbc443: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 02:42:78:ab:0f:d2 brd ff:ff:ff:ff:ff:ff
inet 172.18.0.1/16 brd 172.18.255.255 scope global br-d06d15cbc443
valid_lft forever preferred_lft forever
inet6 fe80::42:78ff:feab:fd2/64 scope link
valid_lft forever preferred_lft forever
Indeed, there is our interface. So we need to allow PostgreSQL to accept connections from this subnet.
There are two options (in the /etc/postgresql/12/main/postgresql.conf file):
Set listen_addresses to the my-net gateway IP, but then I won't be able to connect using DBeaver over an SSH tunnel (which is convenient).
Set listen_addresses to all addresses (*), but then everyone on the outside network will be able to connect to our PostgreSQL.
Fortunately, PostgreSQL has one more configuration file, /etc/postgresql/12/main/pg_hba.conf, where we can restrict the allowed subnets. Great! So set listen_addresses to * and edit pg_hba.conf as below:
# IPv4 local connections:
# my-net subnet
host all all 172.18.0.1/16 md5
# localhost to use SSH Tunnel in DBeaver
host all all 127.0.0.1/32 md5
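To apply and verify this on the host, something like the following should work (a sketch; paths match the PostgreSQL 12 / Ubuntu 20.04 layout mentioned above):
# in /etc/postgresql/12/main/postgresql.conf
listen_addresses = '*'
Then:
sudo systemctl restart postgresql
ss -tlnp | grep 5432    # should now show PostgreSQL listening on 0.0.0.0:5432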
Send a request to sso and voilà! It works!
As a last step, check a direct connection to the database from the outside network:
FATAL: no pg_hba.conf entry for host "xx.xx.xx.xx", user "my_user", database "sso", SSL on
Profit!
P.S. I am not sure whether this is a secure and optimal solution, but it's better than opening the DBMS to the whole Internet. I hope this answer helps someone who is as much of a networking newbie as me.
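To double-check reachability from the container side before blaming the application, something like this should work (a sketch, not from the original post; it reuses the my-net network and host-gateway alias from the question and assumes the official postgres image is available for its pg_isready tool):
docker run --rm --network my-net \
  --add-host host.docker.internal:host-gateway \
  postgres:14 pg_isready -h host.docker.internal -p 5432
If the host-side configuration is correct, this should report "host.docker.internal:5432 - accepting connections".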

Kubernetes cluster deployment in packstack

I am trying to deploy a k8s cluster in OpenStack Rocky, but after a long time it fails. I've checked the orchestration stack and see that the kube_minions resource never completes. Checking the log output for all the instances created:
[ 196.817505] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 215.082433] random: crng init done
Fedora 27 (Atomic Host)
Kernel 4.14.18-300.fc27.x86_64 on an x86_64 (ttyS0)
host-10-0-0-3 login: [ 691.438618] bridge: filtering via
arp/ip/ip6tables is no longer available by default. Update your scripts
to load br_netfilter if you need this.
[ 691.516277] Bridge firewalling registered
[ 692.149217] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[ 701.932912] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready
Digging deeper into the instances, I found that the master node cannot start the heat-container-agent service...
_prefix=docker.io/openstackmagnum/
atomic install --storage ostree --system --system-package no --set
REQUESTS_CA_BUNDLE=/etc/pki/tls/certs/ca-bundle.crt --name heat-container-
agent docker.io/openstackmagnum/heat-container-agent:rocky-stable
systemctl start heat-container-agent
Failed to start heat-container-agent.service: Unit heat-container-
agent.service not found.
2019-04-04 14:57:40,238 - util.py[WARNING]: Failed running
/var/lib/cloud/instance/scripts/part-013 [5]

Unable to add initial monitor to Ceph in Ubuntu

I am trying to set up a Ceph cluster. I have 4 nodes - 1 admin-node, 1 monitor and 2 object storage devices. The installation guide I am using is given at the following location:
http://ceph.com/docs/master/start/quick-ceph-deploy/.
When I am trying to add the initial monitor (step 5 in the guide), I am getting the following error:
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cloud-user/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.21): /usr/bin/ceph-deploy mon create-initial
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts worker-1-full
[ceph_deploy.mon][DEBUG ] detecting platform for host worker-1-full ...
[worker-1-full][DEBUG ] connection detected need for sudo
[worker-1-full][DEBUG ] connected to host: worker-1-full
[worker-1-full][DEBUG ] detect platform information from remote host
[worker-1-full][DEBUG ] detect machine type
[ceph_deploy.mon][INFO ] distro info: Ubuntu 14.04 trusty
[worker-1-full][DEBUG ] determining if provided host has same hostname in remote
[worker-1-full][DEBUG ] get remote short hostname
[worker-1-full][DEBUG ] deploying mon to worker-1-full
[worker-1-full][DEBUG ] get remote short hostname
[worker-1-full][DEBUG ] remote hostname: worker-1-full
[worker-1-full][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[worker-1-full][DEBUG ] create the mon path if it does not exist
[worker-1-full][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-worker-1-full/done
[worker-1-full][DEBUG ] create a done file to avoid re-doing the mon deployment
[worker-1-full][DEBUG ] create the init path if it does not exist
[worker-1-full][DEBUG ] locating the `service` executable...
[worker-1-full][INFO ] Running command: sudo initctl emit ceph-mon cluster=ceph id=worker-1-full
[worker-1-full][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.worker-1-full.asok mon_status
[worker-1-full][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[worker-1-full][WARNIN] monitor: mon.worker-1-full, might not be running yet
[worker-1-full][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.worker-1-full.asok mon_status
[worker-1-full][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[worker-1-full][WARNIN] monitor worker-1-full does not exist in monmap
[worker-1-full][WARNIN] neither `public_addr` nor `public_network` keys are defined for monitors
[worker-1-full][WARNIN] monitors may not be able to form quorum
[ceph_deploy.mon][INFO ] processing monitor mon.worker-1-full
[worker-1-full][DEBUG ] connection detected need for sudo
[worker-1-full][DEBUG ] connected to host: worker-1-full
[worker-1-full][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.worker-1-full.asok mon_status
[worker-1-full][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph_deploy.mon][WARNIN] mon.worker-1-full monitor is not yet in quorum, tries left: 5
[ceph_deploy.mon][WARNIN] waiting 5 seconds before retrying
[worker-1-full][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.worker-1-full.asok mon_status
[worker-1-full][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph_deploy.mon][WARNIN] mon.worker-1-full monitor is not yet in quorum, tries left: 4
[ceph_deploy.mon][WARNIN] waiting 10 seconds before retrying
[worker-1-full][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.worker-1-full.asok mon_status
[worker-1-full][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph_deploy.mon][WARNIN] mon.worker-1-full monitor is not yet in quorum, tries left: 3
[ceph_deploy.mon][WARNIN] waiting 10 seconds before retrying
[worker-1-full][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.worker-1-full.asok mon_status
[worker-1-full][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph_deploy.mon][WARNIN] mon.worker-1-full monitor is not yet in quorum, tries left: 2
[ceph_deploy.mon][WARNIN] waiting 15 seconds before retrying
[worker-1-full][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.worker-1-full.asok mon_status
[worker-1-full][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph_deploy.mon][WARNIN] mon.worker-1-full monitor is not yet in quorum, tries left: 1
[ceph_deploy.mon][WARNIN] waiting 20 seconds before retrying
[ceph_deploy.mon][ERROR ] Some monitors have still not reached quorum:
[ceph_deploy.mon][ERROR ] worker-1-full
"worker-1-full" is the node I am trying to set up as my monitor. The command I used is:
"ceph-deploy mon create-initial". Please help. Thanks in advance!
Please check your ceph-deploy version:
ceph-deploy --version
I hit the same problem on 1.5.30;
see http://docs.ceph.com/ceph-deploy/docs/changelog.html#id34,
which changed the default to the "infernalis" release.
Using ceph-deploy 1.5.29 with the "hammer" release works fine
(perhaps other combinations are also OK).
Good luck!
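If ceph-deploy was installed via pip, a sketch of pinning the older version mentioned above (this assumes the 1.5.29 package is available to your installer; adjust if it came from a distro repository instead):
sudo pip uninstall ceph-deploy
sudo pip install ceph-deploy==1.5.29
ceph-deploy --version    # should now report 1.5.29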

Cannot connect to mongodb using machine ip

I installed Mongo using Homebrew. If I type mongo in the shell, it connects to test. But when I use the IP address of the local machine instead of 127.0.0.1:
mongo --host 192.168.1.100 --verbose
It gives me error message
MongoDB shell version: 2.4.6
Fri Aug 23 15:18:27.552 versionArrayTest passed
connecting to: 192.168.1.100:27017/test
Fri Aug 23 15:18:27.579 creating new connection to:192.168.1.100:27017
Fri Aug 23 15:18:27.579 BackgroundJob starting: ConnectBG
Fri Aug 23 15:18:27.580 Error: couldn't connect to server 192.168.1.100:27017 at src/mongo/shell/mongo.js:147
Fri Aug 23 15:18:27.580 User Assertion: 12513:connect failed
I have tried modifying mongod.conf by commenting out bind_ip or by changing the IP address from 127.0.0.1 to 0.0.0.0, but no luck. This should be simple, but I have no clue now. I am using a Mac.
Thanks
Update: As requested. This works after I made the changes you suggested.
ifconfig output
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
options=3<RXCSUM,TXCSUM>
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
inet 127.0.0.1 netmask 0xff000000
inet6 ::1 prefixlen 128
gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
stf0: flags=0<> mtu 1280
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
options=b<RXCSUM,TXCSUM,VLAN_HWTAGGING>
ether XX:XX:XX:
media: autoselect (none)
status: inactive
en1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
ether XX:XX:XX:XX:01
inet6 XXXX:XXXX:XXXX: %en1 prefixlen 64 scopeid 0x5
inet 192.168.1.100 netmask 0xffffff00 broadcast 192.168.1.255
media: autoselect
status: active
p2p0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 2304
ether XX:XX:XX:XX:XX
media: autoselect
status: inactive
fw0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 4078
lladdr XX:XX:XX:XX
media: autoselect <full-duplex>
status: inactive
Output when executing command mongo --host 192.168.1.100 --verbose
MongoDB shell version: 2.4.5
Fri Aug 23 16:42:09.806 versionArrayTest passed
connecting to: 192.168.1.100:27017/test
Fri Aug 23 16:42:09.837 creating new connection to:192.168.1.100:27017
Fri Aug 23 16:42:09.837 BackgroundJob starting: ConnectBG
Fri Aug 23 16:42:10.129 connected connection!
Server has startup warnings:
Fri Aug 23 16:41:59.025 [initandlisten]
Fri Aug 23 16:41:59.025 [initandlisten] ** WARNING: soft rlimits too low. Number of files is 256, should be at least 1000
File mongod.conf
# Store data in /usr/local/var/mongodb instead of the default /data/db
dbpath = /usr/local/var/mongodb
# Append logs to /usr/local/var/log/mongodb/mongo.log
logpath = /usr/local/var/log/mongodb/mongo.log
logappend = true
# Only accept local connections
bind_ip = 0.0.0.0
I just tested this on my Mac with Homebrew; it works fine if you change the bind address. I suspect you just didn't get the bind configuration correct.
Just so we have all the information, can you paste the output of ifconfig please?
By default, MongoDB listens on all interfaces and you shouldn't need to change the configuration; however, the Homebrew setup overrides this (/usr/local/etc/mongod.conf):
# Only accept local connections
bind_ip = 127.0.0.1
Please kill MongoDB and run this (note the -v):
$ mongod --bind_ip 0.0.0.0 -v
warning: bind_ip of 0.0.0.0 is unnecessary; listens on all ips by default
all output going to: /usr/local/var/log/mongodb/mongo.log
Please paste your output for that.
And then just try:
$ mongo --host 192.168.43.2 --verbose
MongoDB shell version: 2.4.6
Sat Aug 24 09:07:14.556 versionArrayTest passed
connecting to: 192.168.43.2:27017/test
Sat Aug 24 09:07:14.657 creating new connection to:192.168.43.2:27017
Sat Aug 24 09:07:14.657 BackgroundJob starting: ConnectBG
Sat Aug 24 09:07:14.657 connected connection!
Server has startup warnings:
Sat Aug 24 09:06:44.360 [initandlisten]
Sat Aug 24 09:06:44.360 [initandlisten] ** WARNING: soft rlimits too low. Number of files is 256, should be at least 1000
>
Obviously replace it with your IP address. Let us know how that goes.
Success: change the service config file C:\Program Files\MongoDB\Server\4.0\bin\mongod.cfg:
# network interfaces (wjp: comment out bindIp to listen on all IPs)
net:
  port: 27017
  # bindIp: 127.0.0.1
  bindIp: 0.0.0.0
Restart the MongoDB service (or restart the computer).
Make sure the machine-level firewall opens port 27017.
Then the connection works.
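A sketch of that firewall step on Windows (the rule name is arbitrary; run it in an elevated command prompt):
netsh advfirewall firewall add rule name="MongoDB 27017" dir=in action=allow protocol=TCP localport=27017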
No. In fact, if you want to access the REST interface from any IP, you needn't set bind_ip to 0.0.0.0 in mongod.conf; just comment out or remove the configuration item, like so:
#bind_ip=127.0.0.1
Then restart your service, and you will find that you can access the REST service on port 28017 of your machine.
For me, I replaced bindIp with bindIpAll: true (see http://docs.mongodb.org/manual/reference/configuration-options/ for details).
This is the content of my mongod.conf file.
# mongod.conf
# for documentation of all options, see:
#   http://docs.mongodb.org/manual/reference/configuration-options/

# Where and how to store data.
storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true
#  engine:
#  mmapv1:
#  wiredTiger:

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

# network interfaces
net:
  port: 27017
#  bindIp: 127.0.0.1
  bindIpAll: true

# how the process runs
processManagement:
  timeZoneInfo: /usr/share/zoneinfo

#security:
#operationProfiling:
#replication:
#sharding:

## Enterprise-Only Options:
#auditLog:
#snmp:
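To pick up the change on a systemd-based Linux host (a sketch; the service name may be mongod or mongodb depending on how it was installed):
sudo systemctl restart mongod
ss -tlnp | grep 27017    # should show the server listening on 0.0.0.0:27017 or *:27017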
Note: the configuration below is for Windows 10 & MongoDB 4.0.
There are two ways to configure this:
Using shell commands.
Or through mongod.cfg.
Using shell commands
First open cmd as administrator and go to --YourPath--\MongoDB\Server\4.0\bin.
If you are already running the MongoDB service, run this command first; otherwise skip to the next step:
mongod --remove
Then set the path for the data folder, the log file, and the IP bind using the command below:
mongod --dbpath "C:\Program Files\MongoDB\Server\4.0\data" --logpath "C:\Program Files\MongoDB\Server\4.0\log\mongod.log" --bind_ip "0.0.0.0" --install --serviceName "MongoDB"
Caution: this IP configuration gives access to your DB from anywhere on the network.
Now go to services.msc, find MongoDB, then right-click -> Start the service.
Through mongod.cfg
MongoDB has very good documentation for the configuration options here.
Just change bindIp: 127.0.0.1 to bindIp: 0.0.0.0, then restart the service as in the last step of the section above.
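After restarting the Windows service, a quick way to confirm the bind took effect (a sketch using the built-in netstat):
netstat -an | findstr 27017
It should list 0.0.0.0:27017 in the LISTENING state.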