Ceph OSD (authenticate timed out) after node restart - ceph

A couple of our nodes restarted unexpectedly, and since then the OSDs on those nodes will no longer authenticate with the MONs.
I have tested with nc that the affected nodes still have access to all of the MON nodes and that the ports are open.
We cannot find anything in the MON logs about authentication errors.
At the moment 50% of the cluster is down because 2 of the 4 nodes are offline.
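For reference, the nc connectivity check would look roughly like this (the MON address is a placeholder; 3300 is the msgr2 port and 6789 the legacy msgr1 port):
nc -zv <mon-ip> 3300
nc -zv <mon-ip> 6789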
Feb 06 21:04:07 ceph1 systemd[1]: Starting Ceph osd.7 for d5126e5a-882e-11ec-954e-90e2baec3d2c...
Feb 06 21:04:08 ceph1 podman[520029]: 2023-02-06 21:04:08.056452052 +0100 CET m=+0.123533698 container create 0b396efc0543af48d593d1e4c72ed74d>
Feb 06 21:04:08 ceph1 podman[520029]: 2023-02-06 21:04:08.334525479 +0100 CET m=+0.401607145 container init 0b396efc0543af48d593d1e4c72ed74d30>
Feb 06 21:04:08 ceph1 podman[520029]: 2023-02-06 21:04:08.346028585 +0100 CET m=+0.413110241 container start 0b396efc0543af48d593d1e4c72ed74d3>
Feb 06 21:04:08 ceph1 podman[520029]: 2023-02-06 21:04:08.346109677 +0100 CET m=+0.413191333 container attach 0b396efc0543af48d593d1e4c72ed74d>
Feb 06 21:04:08 ceph1 bash[520029]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-7
Feb 06 21:04:08 ceph1 bash[520029]: Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-03539866-06e2-4>
Feb 06 21:04:08 ceph1 bash[520029]: Running command: /usr/bin/ln -snf /dev/ceph-03539866-06e2-4ba6-8809-6a491becb4fe/osd-block-1dd63d2a-9803-4>
Feb 06 21:04:08 ceph1 bash[520029]: Running command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-7/block
Feb 06 21:04:08 ceph1 bash[520029]: Running command: /usr/bin/chown -R ceph:ceph /dev/dm-0
Feb 06 21:04:08 ceph1 bash[520029]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-7
Feb 06 21:04:08 ceph1 bash[520029]: --> ceph-volume lvm activate successful for osd ID: 7
Feb 06 21:04:08 ceph1 podman[520029]: 2023-02-06 21:04:08.635416784 +0100 CET m=+0.702498460 container died 0b396efc0543af48d593d1e4c72ed74d30>
Feb 06 21:04:09 ceph1 podman[520029]: 2023-02-06 21:04:09.036165374 +0100 CET m=+1.103247040 container remove 0b396efc0543af48d593d1e4c72ed74d>
Feb 06 21:04:09 ceph1 podman[520260]: 2023-02-06 21:04:09.299438115 +0100 CET m=+0.070335845 container create d25c3024614dfb0a01c70bd56cf0758e>
Feb 06 21:04:09 ceph1 podman[520260]: 2023-02-06 21:04:09.384256486 +0100 CET m=+0.155154236 container init d25c3024614dfb0a01c70bd56cf0758ef1>
Feb 06 21:04:09 ceph1 podman[520260]: 2023-02-06 21:04:09.393054076 +0100 CET m=+0.163951816 container start d25c3024614dfb0a01c70bd56cf0758ef>
Feb 06 21:04:09 ceph1 bash[520260]: d25c3024614dfb0a01c70bd56cf0758ef16aa67f511ee4add8a85586c67beb0b
Feb 06 21:04:09 ceph1 systemd[1]: Started Ceph osd.7 for d5126e5a-882e-11ec-954e-90e2baec3d2c.
Feb 06 21:09:09 ceph1 conmon[520298]: debug 2023-02-06T20:09:09.394+0000 7f6c10705080 0 monclient(hunting): authenticate timed out after 300
Feb 06 21:14:09 ceph1 conmon[520298]: debug 2023-02-06T20:14:09.395+0000 7f6c10705080 0 monclient(hunting): authenticate timed out after 300
Feb 06 21:19:09 ceph1 conmon[520298]: debug 2023-02-06T20:19:09.397+0000 7f6c10705080 0 monclient(hunting): authenticate timed out after 300
Feb 06 21:24:09 ceph1 conmon[520298]: debug 2023-02-06T20:24:09.398+0000 7f6c10705080 0 monclient(hunting): authenticate timed out after 300
Feb 06 21:29:09 ceph1 conmon[520298]: debug 2023-02-06T20:29:09.399+0000 7f6c10705080 0 monclient(hunting): authenticate timed out after 300
We have restarted the OSD nodes and this did not resolve the issue.
Confirmed that nodes have access to all mon servers.
I have looked in /var/run/ceph and the admin sockets are not there.
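(For context, with cephadm the admin sockets are normally created under a per-fsid directory on the host, so the expected location would be something like the following; the exact socket name is an assumption:)
ls /var/run/ceph/d5126e5a-882e-11ec-954e-90e2baec3d2c/
# expected to contain osd.7.asok and one .asok file per daemon running on this host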
Here is the output as it's starting the OSD.
[2023-02-07 10:38:58,167][ceph_volume.main][INFO ] Running command: ceph-volume lvm list --format json
[2023-02-07 10:38:58,168][ceph_volume.process][INFO ] Running command: /usr/sbin/lvs --noheadings --readonly --separator=";" -a --units=b --nosuffix -S -o lv_tags,lv_path,lv_name,vg_name,lv_uuid,lv_size
[2023-02-07 10:38:58,213][ceph_volume.process][INFO ] stdout ceph.block_device=/dev/ceph-03539866-06e2-4ba6-8809-6a491becb4fe/osd-block-1dd63d2a-9803-452c-a102-3b826e6ef448,ceph.block_uuid=VjbtJW-iiCA-PMvC-TCnV-9xgJ-a8UU-IDo0Pv,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=d5126e5a-882e-11ec-954e-90e2baec3d2c,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=1dd63d2a-9803-452c-a102-3b826e6ef448,ceph.osd_id=7,ceph.osdspec_affinity=all-available-devices,ceph.type=block,ceph.vdo=0";"/dev/ceph-03539866-06e2-4ba6-8809-6a491becb4fe/osd-block-1dd63d2a-9803-452c-a102-3b826e6ef448";"osd-block-1dd63d2a-9803-452c-a102-3b826e6ef448";"ceph-03539866-06e2-4ba6-8809-6a491becb4fe";"VjbtJW-iiCA-PMvC-TCnV-9xgJ-a8UU-IDo0Pv";"16000896466944
[2023-02-07 10:38:58,213][ceph_volume.process][INFO ] stdout ceph.block_device=/dev/ceph-1ce58676-9409-4e19-ac66-f63b5025dfb0/osd-block-9949a437-7e8a-489b-ba10-ded82c775c43,ceph.block_uuid=KLNJDx-J1iC-V5GJ-0nw3-YuEA-Q41D-HNIXv8,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=d5126e5a-882e-11ec-954e-90e2baec3d2c,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=9949a437-7e8a-489b-ba10-ded82c775c43,ceph.osd_id=3,ceph.osdspec_affinity=all-available-devices,ceph.type=block,ceph.vdo=0";"/dev/ceph-1ce58676-9409-4e19-ac66-f63b5025dfb0/osd-block-9949a437-7e8a-489b-ba10-ded82c775c43";"osd-block-9949a437-7e8a-489b-ba10-ded82c775c43";"ceph-1ce58676-9409-4e19-ac66-f63b5025dfb0";"KLNJDx-J1iC-V5GJ-0nw3-YuEA-Q41D-HNIXv8";"16000896466944
[2023-02-07 10:38:58,213][ceph_volume.process][INFO ] stdout ceph.block_device=/dev/ceph-7053d77a-5d1c-450b-a932-d1590411ea2b/osd-block-29ac0ada-d23c-45c1-ae5d-c8aba5a60195,ceph.block_uuid=NTTkze-YV08-lOir-SJ6W-39un-oUc7-ZvOBra,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=d5126e5a-882e-11ec-954e-90e2baec3d2c,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=29ac0ada-d23c-45c1-ae5d-c8aba5a60195,ceph.osd_id=14,ceph.osdspec_affinity=all-available-devices,ceph.type=block,ceph.vdo=0";"/dev/ceph-7053d77a-5d1c-450b-a932-d1590411ea2b/osd-block-29ac0ada-d23c-45c1-ae5d-c8aba5a60195";"osd-block-29ac0ada-d23c-45c1-ae5d-c8aba5a60195";"ceph-7053d77a-5d1c-450b-a932-d1590411ea2b";"NTTkze-YV08-lOir-SJ6W-39un-oUc7-ZvOBra";"16000896466944
[2023-02-07 10:38:58,213][ceph_volume.process][INFO ] stdout ceph.block_device=/dev/ceph-e0a1e940-dec3-4369-a533-1e88bea5fa5e/osd-block-2d002c14-7751-4037-a070-7538e1264d88,ceph.block_uuid=1Gts1p-KwPO-LnIb-XlP2-zCGQ-92fb-Kvv53H,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=d5126e5a-882e-11ec-954e-90e2baec3d2c,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.encrypted=0,ceph.osd_fsid=2d002c14-7751-4037-a070-7538e1264d88,ceph.osd_id=11,ceph.osdspec_affinity=all-available-devices,ceph.type=block,ceph.vdo=0";"/dev/ceph-e0a1e940-dec3-4369-a533-1e88bea5fa5e/osd-block-2d002c14-7751-4037-a070-7538e1264d88";"osd-block-2d002c14-7751-4037-a070-7538e1264d88";"ceph-e0a1e940-dec3-4369-a533-1e88bea5fa5e";"1Gts1p-KwPO-LnIb-XlP2-zCGQ-92fb-Kvv53H";"16000896466944
[2023-02-07 10:38:58,214][ceph_volume.process][INFO ] Running command: /usr/sbin/pvs --noheadings --readonly --separator=";" -S lv_uuid=VjbtJW-iiCA-PMvC-TCnV-9xgJ-a8UU-IDo0Pv -o pv_name,pv_tags,pv_uuid,vg_name,lv_uuid
[2023-02-07 10:38:58,269][ceph_volume.process][INFO ] stdout /dev/sdb";"";"a6T0sC-DeMp-by25-wUjP-wL3R-u6d1-nPXfji";"ceph-03539866-06e2-4ba6-8809-6a491becb4fe";"VjbtJW-iiCA-PMvC-TCnV-9xgJ-a8UU-IDo0Pv
[2023-02-07 10:38:58,269][ceph_volume.process][INFO ] Running command: /usr/sbin/pvs --noheadings --readonly --separator=";" -S lv_uuid=KLNJDx-J1iC-V5GJ-0nw3-YuEA-Q41D-HNIXv8 -o pv_name,pv_tags,pv_uuid,vg_name,lv_uuid
[2023-02-07 10:38:58,333][ceph_volume.process][INFO ] stdout /dev/sda";"";"63b0j0-o1S7-FHqG-lwOk-0OYj-I9pH-g58TzB";"ceph-1ce58676-9409-4e19-ac66-f63b5025dfb0";"KLNJDx-J1iC-V5GJ-0nw3-YuEA-Q41D-HNIXv8
[2023-02-07 10:38:58,333][ceph_volume.process][INFO ] Running command: /usr/sbin/pvs --noheadings --readonly --separator=";" -S lv_uuid=NTTkze-YV08-lOir-SJ6W-39un-oUc7-ZvOBra -o pv_name,pv_tags,pv_uuid,vg_name,lv_uuid
[2023-02-07 10:38:58,397][ceph_volume.process][INFO ] stdout /dev/sde";"";"qDEqYa-cgXd-Tc2h-64wQ-zT63-vIBZ-ZfGGO0";"ceph-7053d77a-5d1c-450b-a932-d1590411ea2b";"NTTkze-YV08-lOir-SJ6W-39un-oUc7-ZvOBra
[2023-02-07 10:38:58,398][ceph_volume.process][INFO ] Running command: /usr/sbin/pvs --noheadings --readonly --separator=";" -S lv_uuid=1Gts1p-KwPO-LnIb-XlP2-zCGQ-92fb-Kvv53H -o pv_name,pv_tags,pv_uuid,vg_name,lv_uuid
[2023-02-07 10:38:58,457][ceph_volume.process][INFO ] stdout /dev/sdd";"";"aqhedj-aUlM-0cl4-P98k-XZRL-1mPG-0OgKLV";"ceph-e0a1e940-dec3-4369-a533-1e88bea5fa5e";"1Gts1p-KwPO-LnIb-XlP2-zCGQ-92fb-Kvv53H
config dump
WHO MASK LEVEL OPTION VALUE RO
global advanced cluster_network 10.125.0.0/24 *
global basic container_image quay.io/ceph/ceph@sha256:a39107f8d3daab4d756eabd6ee1630d1bc7f31eaa76fff41a77fa32d0b903061 *
mon advanced auth_allow_insecure_global_id_reclaim false
mon advanced public_network 10.123.0.0/24 *
mgr advanced mgr/cephadm/container_init True *
mgr advanced mgr/cephadm/migration_current 3 *
mgr advanced mgr/dashboard/ALERTMANAGER_API_HOST http://10.123.0.21:9093 *
mgr advanced mgr/dashboard/GRAFANA_API_SSL_VERIFY false *
mgr advanced mgr/dashboard/GRAFANA_API_URL https://10.123.0.21:3000 *
mgr advanced mgr/dashboard/PROMETHEUS_API_HOST http://10.123.0.21:9095 *
mgr advanced mgr/dashboard/ssl_server_port 8443 *
mgr advanced mgr/orchestrator/orchestrator cephadm
mgr advanced mgr/pg_autoscaler/autoscale_profile scale-up
mds advanced mds_max_caps_per_client 65536
mds.cephfs basic mds_join_fs cephfs
####
ceph status
  cluster:
    id:     d5126e5a-882e-11ec-954e-90e2baec3d2c
    health: HEALTH_WARN
            8 failed cephadm daemon(s)
            2 stray daemon(s) not managed by cephadm
            nodown,noout flag(s) set
            4 osds down
            1 host (4 osds) down
            Degraded data redundancy: 195662646/392133183 objects degraded (49.897%), 160 pgs degraded, 160 pgs undersized
            6 pgs not deep-scrubbed in time
            1 daemons have recently crashed

  services:
    mon: 3 daemons, quorum ceph5,ceph7,ceph6 (age 2d)
    mgr: ceph2.tofizp(active, since 9M), standbys: ceph1.vnkagp
    mds: 3/3 daemons up
    osd: 19 osds: 15 up (since 11h), 19 in (since 11h); 151 remapped pgs
         flags nodown,noout

  data:
    volumes: 1/1 healthy
    pools:   6 pools, 257 pgs
    objects: 102.97M objects, 67 TiB
    usage:   69 TiB used, 107 TiB / 176 TiB avail
    pgs:     195662646/392133183 objects degraded (49.897%)
             2620377/392133183 objects misplaced (0.668%)
             150 active+undersized+degraded+remapped+backfill_wait
             97  active+clean
             9   active+undersized+degraded
             1   active+undersized+degraded+remapped+backfilling

  io:
    client:   170 B/s rd, 0 op/s rd, 0 op/s wr
    recovery: 9.7 MiB/s, 9 objects/s

Related

Redis replication - Failed to resolve hostname

I am trying to set up a Redis replication cluster with 3 Redis servers (1 primary and 2 replicas) and 3 Redis sentinels.
Each machine runs one server and sentinel pair; there are 3 machines, and each machine has Docker installed.
The issue I am having is that the Redis server instances are unable to connect to MASTER and resolve the primary's name.
There seem to be no issues with the network or port openings, since I am able to specify each machine's IPv4 address in the
configuration below and then the communication works as expected.
The Redis version is shown in the logs further down.
docker-compose version 1.29.2, build 5becea4c
Docker version 20.10.14, build a224086
Ubuntu 20.04.4 LTS
Below are the docker-compose.yml files (I did not include the redis_3 config and output since it is almost identical to redis_2):
primary:
version: '3.8'
services:
  redis_1:
    image: bitnami/redis:6.2
    restart: always
    command: ["redis-server", "--protected-mode", "no", "--dir", "/data"]
    environment:
      - REDIS_REPLICA_IP=redis_1
      - REDIS_REPLICATION_MODE=master
      - REDIS_MASTER_PASSWORD=very-good-password
      - REDIS_PASSWORD=very-good-password
    ports:
      - "6379:6379"
    volumes:
      - "/opt/knowit/docker/data:/data"
  sentinel_1:
    image: bitnami/redis-sentinel:6.2
    restart: always
    environment:
      - REDIS_MASTER_HOST=redis_1
      - REDIS_MASTER_PASSWORD=very-good-password
      - REDIS_SENTINEL_ANNOUNCE_IP=redis_1
      - REDIS_SENTINEL_QUORUM=2
      - REDIS_SENTINEL_DOWN_AFTER_MILLISECONDS=5000
      - REDIS_SENTINEL_FAILOVER_TIMEOUT=60000
      - REDIS_SENTINEL_PASSWORD=other-good-password
      - REDIS_SENTINEL_ANNOUNCE_HOSTNAMES=yes
      - REDIS_SENTINEL_RESOLVE_HOSTNAMES=yes
    ports:
      - "26379:26379"
replica:
version: '3.8'
services:
  redis_2:
    image: bitnami/redis:6.2
    restart: always
    command: ["redis-server", "--protected-mode", "no", "--replicaof", "redis_1", "6379", "--dir", "/data"]
    environment:
      - REDIS_REPLICA_IP=redis_2
      - REDIS_REPLICATION_MODE=replica
      - REDIS_MASTER_PASSWORD=very-good-password
      - REDIS_PASSWORD=very-good-password
    ports:
      - "6379:6379"
    volumes:
      - "/opt/knowit/docker/data:/data"
  sentinel_2:
    image: bitnami/redis-sentinel:6.2
    restart: always
    environment:
      - REDIS_MASTER_HOST=redis_1
      - REDIS_MASTER_PASSWORD=very-good-password
      - REDIS_SENTINEL_ANNOUNCE_IP=redis_2
      - REDIS_SENTINEL_QUORUM=2
      - REDIS_SENTINEL_DOWN_AFTER_MILLISECONDS=5000
      - REDIS_SENTINEL_FAILOVER_TIMEOUT=60000
      - REDIS_SENTINEL_PASSWORD=other-good-password
      - REDIS_SENTINEL_ANNOUNCE_HOSTNAMES=yes
      - REDIS_SENTINEL_RESOLVE_HOSTNAMES=yes
    ports:
      - "26379:26379"
The docker logs look like this:
primary:
$ sudo docker-compose up
Starting docker_redis_1 ... done
Starting docker_sentinel_1 ... done
Attaching to docker_redis_1, docker_sentinel_1
redis_1 | redis 12:15:41.03
redis_1 | redis 12:15:41.04 Welcome to the Bitnami redis container
redis_1 | redis 12:15:41.04 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-redis
redis_1 | redis 12:15:41.04 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-redis/issues
redis_1 |
redis_1 | redis 12:15:41.04
redis_1 | 1:C 29 Apr 2022 12:15:41.068 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis_1 | 1:C 29 Apr 2022 12:15:41.069 # Redis version=6.2.7, bits=64, commit=00000000, modified=0, pid=1, just started
redis_1 | 1:C 29 Apr 2022 12:15:41.069 # Configuration loaded
redis_1 | 1:M 29 Apr 2022 12:15:41.070 * monotonic clock: POSIX clock_gettime
redis_1 | 1:M 29 Apr 2022 12:15:41.072 # A key '__redis__compare_helper' was added to Lua globals which is not on the globals allow list nor listed on the deny list.
redis_1 | 1:M 29 Apr 2022 12:15:41.072 * Running mode=standalone, port=6379.
redis_1 | 1:M 29 Apr 2022 12:15:41.072 # Server initialized
redis_1 | 1:M 29 Apr 2022 12:15:41.073 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
redis_1 | 1:M 29 Apr 2022 12:15:41.073 * Ready to accept connections
sentinel_1 | redis-sentinel 12:15:41.08
sentinel_1 | redis-sentinel 12:15:41.08 Welcome to the Bitnami redis-sentinel container
sentinel_1 | redis-sentinel 12:15:41.08 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-redis-sentinel
sentinel_1 | redis-sentinel 12:15:41.08 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-redis-sentinel/issues
sentinel_1 | redis-sentinel 12:15:41.08
sentinel_1 | redis-sentinel 12:15:41.08 INFO ==> ** Starting Redis sentinel setup **
sentinel_1 | redis-sentinel 12:15:41.11 INFO ==> Initializing Redis Sentinel...
sentinel_1 | redis-sentinel 12:15:41.11 INFO ==> Persisted files detected, restoring...
sentinel_1 | redis-sentinel 12:15:41.12 INFO ==> ** Redis sentinel setup finished! **
sentinel_1 |
sentinel_1 | redis-sentinel 12:15:41.13 INFO ==> ** Starting Redis Sentinel **
sentinel_1 | 1:X 29 Apr 2022 12:15:41.143 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
sentinel_1 | 1:X 29 Apr 2022 12:15:41.144 # Redis version=6.2.7, bits=64, commit=00000000, modified=0, pid=1, just started
sentinel_1 | 1:X 29 Apr 2022 12:15:41.144 # Configuration loaded
sentinel_1 | 1:X 29 Apr 2022 12:15:41.145 * monotonic clock: POSIX clock_gettime
sentinel_1 | 1:X 29 Apr 2022 12:15:41.146 # A key '__redis__compare_helper' was added to Lua globals which is not on the globals allow list nor listed on the deny list.
sentinel_1 | 1:X 29 Apr 2022 12:15:41.147 * Running mode=sentinel, port=26379.
sentinel_1 | 1:X 29 Apr 2022 12:15:41.148 # Sentinel ID is 232f6b838b76c348f123597f2852091a77bdae03
sentinel_1 | 1:X 29 Apr 2022 12:15:41.148 # +monitor master mymaster redis_1 6379 quorum 2
replica:
$ sudo docker-compose up
Starting docker_redis_2 ... done
Starting docker_sentinel_2 ... done
Attaching to docker_redis_2, docker_sentinel_2
redis_2 | redis 11:53:24.61
redis_2 | redis 11:53:24.62 Welcome to the Bitnami redis container
redis_2 | redis 11:53:24.63 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-redis
redis_2 | redis 11:53:24.63 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-redis/issues
redis_2 | redis 11:53:24.63
redis_2 |
redis_2 | 1:C 29 Apr 2022 11:53:24.649 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis_2 | 1:C 29 Apr 2022 11:53:24.651 # Redis version=6.2.7, bits=64, commit=00000000, modified=0, pid=1, just started
redis_2 | 1:C 29 Apr 2022 11:53:24.651 # Configuration loaded
redis_2 | 1:S 29 Apr 2022 11:53:24.653 * monotonic clock: POSIX clock_gettime
redis_2 | 1:S 29 Apr 2022 11:53:24.656 # A key '__redis__compare_helper' was added to Lua globals which is not on the globals allow list nor listed on the deny list.
redis_2 | 1:S 29 Apr 2022 11:53:24.657 * Running mode=standalone, port=6379.
redis_2 | 1:S 29 Apr 2022 11:53:24.657 # Server initialized
redis_2 | 1:S 29 Apr 2022 11:53:24.657 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
redis_2 | 1:S 29 Apr 2022 11:53:24.659 * Ready to accept connections
redis_2 | 1:S 29 Apr 2022 11:53:24.659 * Connecting to MASTER redis_1:6379
sentinel_2 | redis-sentinel 11:53:24.70
sentinel_2 | redis-sentinel 11:53:24.71 Welcome to the Bitnami redis-sentinel container
sentinel_2 | redis-sentinel 11:53:24.71 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-redis-sentinel
sentinel_2 | redis-sentinel 11:53:24.71 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-redis-sentinel/issues
sentinel_2 | redis-sentinel 11:53:24.71
sentinel_2 | redis-sentinel 11:53:24.71 INFO ==> ** Starting Redis sentinel setup **
redis_2 | 1:S 29 Apr 2022 11:53:34.673 # Unable to connect to MASTER: Resource temporarily unavailable
sentinel_2 | redis-sentinel 11:53:34.75 WARN ==> Hostname redis_1 could not be resolved, this could lead to connection issues
sentinel_2 | redis-sentinel 11:53:34.76 INFO ==> Initializing Redis Sentinel...
sentinel_2 | redis-sentinel 11:53:34.76 INFO ==> Persisted files detected, restoring...
sentinel_2 | redis-sentinel 11:53:34.77 INFO ==> ** Redis sentinel setup finished! **
sentinel_2 |
sentinel_2 | redis-sentinel 11:53:34.79 INFO ==> ** Starting Redis Sentinel **
redis_2 | 1:S 29 Apr 2022 11:53:35.675 * Connecting to MASTER redis_1:6379
sentinel_2 | 1:X 29 Apr 2022 11:53:44.813 # Failed to resolve hostname 'redis_1'
sentinel_2 | 1:X 29 Apr 2022 11:53:44.813 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
sentinel_2 | 1:X 29 Apr 2022 11:53:44.813 # Redis version=6.2.7, bits=64, commit=00000000, modified=0, pid=1, just started
sentinel_2 | 1:X 29 Apr 2022 11:53:44.813 # Configuration loaded
sentinel_2 | 1:X 29 Apr 2022 11:53:44.814 * monotonic clock: POSIX clock_gettime
sentinel_2 | 1:X 29 Apr 2022 11:53:44.815 # A key '__redis__compare_helper' was added to Lua globals which is not on the globals allow list nor listed on the deny list.
sentinel_2 | 1:X 29 Apr 2022 11:53:44.815 * Running mode=sentinel, port=26379.
sentinel_2 | 1:X 29 Apr 2022 11:53:44.816 # Sentinel ID is bfec501e81d8da33def75f23911b606aa395078d
sentinel_2 | 1:X 29 Apr 2022 11:53:44.816 # +monitor master mymaster redis_1 6379 quorum 2
sentinel_2 | 1:X 29 Apr 2022 11:53:44.817 # +tilt #tilt mode entered
redis_2 | 1:S 29 Apr 2022 11:53:45.687 # Unable to connect to MASTER: Resource temporarily unavailable
redis_2 | 1:S 29 Apr 2022 11:53:46.689 * Connecting to MASTER redis_1:6379
sentinel_2 | 1:X 29 Apr 2022 11:53:54.831 # Failed to resolve hostname 'redis_1'
sentinel_2 | 1:X 29 Apr 2022 11:53:54.914 # +tilt #tilt mode entered
redis_2 | 1:S 29 Apr 2022 11:53:56.701 # Unable to connect to MASTER: Resource temporarily unavailable
Is there some issue with resolving hostnames when the different Redis instances are located on separate machines, or have I just missed something basic?
I would assume the latter, since I have been able to get this up and running by specifying the IP addresses, and the replica does receive the hostname of the primary.
Any help would be much appreciated! Let me know if additional information is required.
Only Sentinel versions 6.2 and above can resolve hostnames, and even then this is not enabled by default. Adding sentinel resolve-hostnames yes to sentinel.conf will help.
If your Sentinel is an older version, the hostname redis_node should be replaced by an IP address.
For more details, check out the "IP Addresses and DNS names" section of the Redis Sentinel documentation.
Reference - answer by Tsonglew
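For illustration, a minimal sentinel.conf fragment along those lines (monitor and timing values taken from the compose files above; treat it as a sketch, not the exact generated config):
sentinel monitor mymaster redis_1 6379 2
sentinel resolve-hostnames yes
sentinel announce-hostnames yes
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000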

Why my mongodb docker not writing data to host folder using volume?

I want to know why my mongo docker is not writing data to the local folder. I run my mongo docker with this command:
(/data/db seems to be the mongo docker's data storage location, and /data/docker/mongo_volume is the folder on the "host" machine)
sudo docker run -it -v /data/db:/data/docker/mongo_volume -d mongo
When the mongo docker has successfully started on my host, $ docker ps looks good:
u@VM-0-9-ubuntu:/data/docker/mongo_volume$ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5848da7562a3 mongo "docker-entrypoint.s…" 10 minutes ago Up 10 minutes 0.0.0.0:27017->27017/tcp, :::27017->27017/tcp sleepy_clarke
and $ docker inspect <container_id> shows the mounted volume:
"Mounts": [
{
"Type": "bind",
"Source": "/data/db",
"Destination": "/data/docker/mongo_volume",
"Mode": "",
"RW": true,
"Propagation": "rprivate"
},
and when I check the container's /data/db folder (in the docker shell), everything looks good:
ls -al
total 716
drwxr-xr-x 4 mongodb mongodb 4096 Mar 25 00:38 .
drwxr-xr-x 1 root root 4096 Mar 25 00:28 ..
-rw------- 1 mongodb mongodb 50 Mar 25 00:28 WiredTiger
-rw------- 1 mongodb mongodb 21 Mar 25 00:28 WiredTiger.lock
-rw------- 1 mongodb mongodb 1472 Mar 25 00:38 WiredTiger.turtle
-rw------- 1 mongodb mongodb 94208 Mar 25 00:38 WiredTiger.wt
-rw------- 1 mongodb mongodb 4096 Mar 25 00:28 WiredTigerHS.wt
-rw------- 1 mongodb mongodb 36864 Mar 25 00:34 _mdb_catalog.wt
-rw------- 1 mongodb mongodb 20480 Mar 25 00:29 collection-0--6476415430291015248.wt
-rw------- 1 mongodb mongodb 65536 Mar 25 00:34 collection-11--6476415430291015248.wt
-rw------- 1 mongodb mongodb 20480 Mar 25 00:29 collection-2--6476415430291015248.wt
-rw------- 1 mongodb mongodb 4096 Mar 25 00:28 collection-4--6476415430291015248.wt
-rw------- 1 mongodb mongodb 20480 Mar 25 00:33 collection-7--6476415430291015248.wt
-rw------- 1 mongodb mongodb 225280 Mar 25 00:33 collection-9--6476415430291015248.wt
drwx------ 2 mongodb mongodb 4096 Mar 25 00:39 diagnostic.data
-rw------- 1 mongodb mongodb 20480 Mar 25 00:29 index-1--6476415430291015248.wt
-rw------- 1 mongodb mongodb 73728 Mar 25 00:33 index-10--6476415430291015248.wt
-rw------- 1 mongodb mongodb 20480 Mar 25 00:34 index-12--6476415430291015248.wt
-rw------- 1 mongodb mongodb 20480 Mar 25 00:29 index-3--6476415430291015248.wt
-rw------- 1 mongodb mongodb 4096 Mar 25 00:28 index-5--6476415430291015248.wt
-rw------- 1 mongodb mongodb 4096 Mar 25 00:28 index-6--6476415430291015248.wt
-rw------- 1 mongodb mongodb 20480 Mar 25 00:33 index-8--6476415430291015248.wt
drwx------ 2 mongodb mongodb 4096 Mar 25 00:28 journal
-rw-r--r-- 1 root root 0 Mar 25 00:29 lueluelue
-rw------- 1 mongodb mongodb 2 Mar 25 00:28 mongod.lock
-rw------- 1 mongodb mongodb 36864 Mar 25 00:35 sizeStorer.wt
-rw------- 1 mongodb mongodb 114 Mar 25 00:28 storage.bson
However, here comes the problem: I found there's nothing in my "host machine"'s /data/docker/mongo_volume:
ubuntu@VM-0-9-ubuntu:/data/docker/mongo_volume$ ll
total 8
drwxr-xr-x 2 root root 4096 Mar 20 13:46 ./
drwxr-xr-x 3 root root 4096 Mar 20 13:46 ../
So could anyone give me a clue? Thanks a lot!
Your docker command is incorrect: the -v argument should be -v <host_folder>:<container_folder>, e.g.
sudo docker run -it -v /data/docker/mongo_volume:/data/db -d mongo
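A quick way to confirm the fix is to re-create the container with the corrected mapping and check that the WiredTiger files land on the host (the container name here is just an example):
sudo docker run -d --name mongo_test -v /data/docker/mongo_volume:/data/db mongo
ls -al /data/docker/mongo_volume   # WiredTiger*, journal/, collection-*.wt should now appear here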

mongod : ERROR: child process failed, exited with 100

I just upgraded from MongoDB 3.4 to 4.4, and after that the database won't start.
As a service.....
[root@ssdnodes-54313 mongo]# systemctl restart mongod
Job for mongod.service failed because the control process exited with error code.
See "systemctl status mongod.service" and "journalctl -xe" for details.
-- Unit mongod.service has begun starting up.
Dec 09 15:30:22 ssdnodes-54313 mongod[217641]: about to fork child process, waiting until server is ready for connections.
Dec 09 15:30:22 ssdnodes-54313 mongod[217641]: forked process: 217643
Dec 09 15:30:22 ssdnodes-54313 mongod[217641]: ERROR: child process failed, exited with 1
Dec 09 15:30:22 ssdnodes-54313 mongod[217641]: To see additional information in this output, start without the "--fork" option.
Dec 09 15:30:22 ssdnodes-54313 systemd[1]: mongod.service: Control process exited, code=exited status=1
Dec 09 15:30:22 ssdnodes-54313 systemd[1]: mongod.service: Failed with result 'exit-code'.
Dec 09 15:30:22 ssdnodes-54313 systemd[1]: Failed to start MongoDB Database Server.
Executing mongod ....
[root@host mongo]# mongod --fork --logpath /var/log/mongodb/mongod.log
about to fork child process, waiting until server is ready for connections.
forked process: 217394
ERROR: child process failed, exited with 100
To see additional information in this output, start without the "--fork" option.
Executing without --fork (appears to do nothing: no error, and no server listening):
[root@host mongo]# mongod --logpath /var/log/mongodb/mongod.log
[root@host mongo]#
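With --logpath set, mongod writes its startup messages to the log file instead of the terminal, so the actual error is presumably in there; something like:
tail -n 50 /var/log/mongodb/mongod.log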
If I start it as:
[root@host mongo]# mongod --dbpath /var/lib/mongo
{"t":{"$date":"2020-12-09T15:28:43.674+00:00"},"s":"I", "c":"CONTROL", "id":23285, "ctx":"main","msg":"Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'"}
{"t":{"$date":"2020-12-09T15:28:43.687+00:00"},"s":"W", "c":"ASIO", "id":22601, "ctx":"main","msg":"No TransportLayer configured during NetworkInterface startup"}
{"t":{"$date":"2020-12-09T15:28:43.691+00:00"},"s":"I", "c":"NETWORK", "id":4648601, "ctx":"main","msg":"Implicit TCP FastOpen unavailable. If TCP FastOpen is required, set tcpFastOpenServer, tcpFastOpenClient, and tcpFastOpenQueueSize."}
{"t":{"$date":"2020-12-09T15:28:43.698+00:00"},"s":"I", "c":"STORAGE", "id":4615611, "ctx":"initandlisten","msg":"MongoDB starting","attr":{"pid":217522,"port":27017,"dbPath":"/var/lib/mongo","architecture":"64-bit","host":"ssdnodes-54313"}}
That does work! But how do I fork it to run in the background and start it as a service at boot time?
Edit:
my /etc/mongod.conf
[root@ssdnodes-54313 ~]# cat /etc/mongod.conf
# mongod.conf
# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

# Where and how to store data.
storage:
  dbPath: /var/lib/mongo
  journal:
    enabled: true
#  engine:
#  wiredTiger:

# how the process runs
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongod.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo

# network interfaces
net:
  port: 27017
  bindIp: 127.0.0.1  # Enter 0.0.0.0,:: to bind to all IPv4 and IPv6 addresses or, alternatively, use the net.bindIpAll setting.
File permissions:
[root@ssdnodes-54313 lib]# ls -la /var/lib
drwxr-xr-x 4 mongod mongod 4096 Dec 9 18:50 mongo
[root@ssdnodes-54313 lib]# ls -la /var/lib/mongo/
total 556
drwxr-xr-x 4 mongod mongod 4096 Dec 9 18:50 .
drwxr-xr-x. 40 root root 4096 Dec 9 14:01 ..
-rw------- 1 mongod mongod 32768 Dec 9 15:32 collection-0--1818581548198400736.wt
-rw------- 1 mongod mongod 36864 Dec 9 15:33 collection-0--4356046170403439820.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:34 collection-0-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:34 collection-10-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:34 collection-12-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:35 collection-14-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:35 collection-16-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:35 collection-18-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:35 collection-20-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:35 collection-22-7358854442417001382.wt
-rw------- 1 mongod mongod 24576 Dec 9 16:23 collection-2--4356046170403439820.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:36 collection-24-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:36 collection-26-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:34 collection-2-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:34 collection-4-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:34 collection-6-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:34 collection-8-7358854442417001382.wt
drwx------ 2 mongod mongod 4096 Dec 9 18:50 diagnostic.data
-rw------- 1 mongod mongod 4096 Dec 9 15:34 index-11-7358854442417001382.wt
-rw------- 1 mongod mongod 32768 Dec 9 15:32 index-1--1818581548198400736.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:34 index-13-7358854442417001382.wt
-rw------- 1 mongod mongod 36864 Dec 9 15:33 index-1--4356046170403439820.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:35 index-15-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:34 index-1-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:35 index-17-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:35 index-19-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:35 index-21-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:35 index-23-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:36 index-25-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:36 index-27-7358854442417001382.wt
-rw------- 1 mongod mongod 12288 Dec 9 16:24 index-3--4356046170403439820.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:34 index-3-7358854442417001382.wt
-rw------- 1 mongod mongod 12288 Dec 9 18:50 index-4--4356046170403439820.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:34 index-5-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:34 index-7-7358854442417001382.wt
-rw------- 1 mongod mongod 4096 Dec 9 15:34 index-9-7358854442417001382.wt
drwx------ 2 mongod mongod 4096 Dec 9 15:32 journal
-rw------- 1 mongod mongod 36864 Dec 9 15:36 _mdb_catalog.wt
-rw------- 1 mongod mongod 7 Dec 9 15:32 mongod.lock
-rw------- 1 mongod mongod 36864 Dec 9 16:23 sizeStorer.wt
-rw------- 1 mongod mongod 114 Dec 9 15:17 storage.bson
-rw------- 1 mongod mongod 47 Dec 9 15:17 WiredTiger
-rw------- 1 mongod mongod 4096 Dec 9 15:32 WiredTigerHS.wt
-rw------- 1 mongod mongod 21 Dec 9 15:17 WiredTiger.lock
-rw------- 1 mongod mongod 1256 Dec 9 18:50 WiredTiger.turtle
-rw------- 1 mongod mongod 143360 Dec 9 18:50 WiredTiger.wt
SELinux is disabled...
[root@ssdnodes-54313 log]# sestatus
SELinux status: disabled

Mongo db service is not running

mongod.service - High-performance, schema-free document-oriented database
Loaded: loaded (/usr/lib/systemd/system/mongod.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Fri 2018-10-19 11:54:22 BST; 1min 8s ago
Docs: https://docs.mongodb.org/manual
Process: 28567 ExecStart=/usr/bin/mongod --config /etc/mongod.conf (code=exited, status=2)
Process: 28565 ExecStartPre=/usr/bin/chmod 0755 /var/run/mongodb (code=exited, status=0/SUCCESS)
Process: 28559 ExecStartPre=/usr/bin/chown mongod:mongod /var/run/mongodb (code=exited, status=0/SUCCESS)
Process: 28557 ExecStartPre=/usr/bin/mkdir -p /var/run/mongodb (code=exited, status=0/SUCCESS)
Oct 19 11:54:22 d203.tld systemd[1]: Starting High-performance, schema-free document-oriented database...
Oct 19 11:54:22 d203.tld mongod[28567]: Unrecognized option: security
Oct 19 11:54:22 d203.tld mongod[28567]: try '/usr/bin/mongod --help' for more information
Oct 19 11:54:22 d203.tld systemd[1]: mongod.service: control process exited, code=exited status=2
Oct 19 11:54:22 d203.tld systemd[1]: Failed to start High-performance, schema-free document-oriented database.
Oct 19 11:54:22 d203.tld systemd[1]: Unit mongod.service entered failed state.
Oct 19 11:54:22 d203.tld systemd[1]: mongod.service failed.

Mongo db failed to start in centos 07

I followed this link [1] to install MongoDB under CentOS 7. The database started normally, but then, I don't know what happened, it would not start again, giving me this error:
[root@localhost ~]# systemctl start mongod
Job for mongod.service failed. See 'systemctl status mongod.service' and 'journalctl -xn' for details.
[root@localhost ~]# systemctl status mongod.service -l
mongod.service - SYSV: Mongo is a scalable, document-oriented database.
Loaded: loaded (/etc/rc.d/init.d/mongod)
Active: failed (Result: exit-code) since Wed 2015-08-05 17:13:12 CEST; 24s ago
Process: 2872 ExecStart=/etc/rc.d/init.d/mongod start (code=exited, status=1/FAILURE)
Aug 05 17:13:12 localhost systemd[1]: Starting SYSV: Mongo is a scalable, document-oriented database....
Aug 05 17:13:12 localhost runuser[2878]: pam_unix(runuser:session): session opened for user mongod by (uid=0)
Aug 05 17:13:12 localhost mongod[2872]: Starting mongod: [FAILED]
Aug 05 17:13:12 localhost systemd[1]: mongod.service: control process exited, code=exited status=1
Aug 05 17:13:12 localhost systemd[1]: Failed to start SYSV: Mongo is a scalable, document-oriented database..
Aug 05 17:13:12 localhost systemd[1]: Unit mongod.service entered failed state.
I configured SELinux to "enforcing" and I enabled access to port 27017 using:
semanage port -a -t mongod_port_t -p tcp 27017
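(As an aside, and an assumption on my part rather than something from the install guide: with SELinux enforcing, the labels on the data and log directories can also be checked, e.g.:)
ls -Zd /var/lib/mongo /var/log/mongodb
# expected types are mongod_var_lib_t and mongod_log_t; restorecon -Rv <dir> resets them if needed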
I can start the database using this command:
[root@localhost ~]# sudo -u root /usr/bin/mongod --quiet --config /etc/mongod.conf
about to fork child process, waiting until server is ready for connections.
forked process: 3058
child process started successfully, parent exiting
But still can't start it as a service :(
Any ideas of what I have missed?
Thanks in advance for your help!
[1] http://docs.mongodb.org/manual/tutorial/install-mongodb-on-red-hat/