FailedToSatisfyReadPreference: Could not find host matching read preference { mode: "primary" } - mongodb

I want to use a sharded MongoDB cluster for my project. I created a Helm chart https://github.com/b-rohit/mongodb-shard-chart to deploy it to a Kubernetes cluster.
I use a kind cluster running locally to test it. The config and shard servers are running properly, and I am able to execute commands in their mongo shells. However, the mongos server is not able to connect to the replica set on the config server. I get the following error messages in mongos:
2020-04-17T13:33:31.579+0000 W SHARDING [replSetDistLockPinger] pinging failed for distributed lock pinger :: caused by :: FailedToSatisfyReadPreference: Could not find host matching read preference { mode: "primary" } for set mongors1conf
2020-04-17T13:33:31.579+0000 W SHARDING [mongosMain] Error initializing sharding state, sleeping for 2 seconds and trying again :: caused by :: FailedToSatisfyReadPreference: Error loading clusterID :: caused by :: Could not find host matching read preference { mode: "nearest" } for set mongors1conf
2020-04-17T13:33:31.579+0000 I SHARDING [shard-registry-reload] Periodic reload of shard registry failed :: caused by :: FailedToSatisfyReadPreference: could not get updated shard list from config server :: caused by :: Could not find host matching read preference { mode: "nearest" } for set mongors1conf; will retry after 30s
The config server logs show the following:
2020-04-17T13:33:11.578+0000 I NETWORK [listener] connection accepted from 10.244.0.6:34400 #5 (1 connection now open)
2020-04-17T13:33:11.578+0000 I NETWORK [conn5] received client metadata from 10.244.0.6:34400 conn5: { driver: { name: "NetworkInterfaceTL", version: "4.2.5" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "18.04" } }
2020-04-17T13:33:11.589+0000 I ACCESS [conn5] Successfully authenticated as principal __system on local from client 10.244.0.6:34400
2020-04-17T13:33:38.197+0000 I SHARDING [replSetDistLockPinger] Marking collection config.lockpings as collection version: <unsharded>
2020-04-17T13:33:38.202+0000 W SHARDING [replSetDistLockPinger] pinging failed for distributed lock pinger :: caused by :: LockStateChangeFailed: findAndModify query predicate didn't match any lock document
2020-04-17T13:44:39.743+0000 I CONTROL [LogicalSessionCacheRefresh] Failed to create config.system.sessions: Cannot create config.system.sessions until there are shards, will try again at the next refresh interval
2020-04-17T13:44:39.743+0000 I CONTROL [LogicalSessionCacheRefresh] Sessions collection is not set up; waiting until next sessions refresh interval: Cannot create config.system.sessions until there are shards
2020-04-17T13:44:39.743+0000 I SH_REFR [ConfigServerCatalogCacheLoader-1] Refresh for collection config.system.sessions took 0 ms and found the collection is not sharded
2020-04-17T13:44:39.743+0000 I CONTROL [LogicalSessionCacheReap] Sessions collection is not set up; waiting until next sessions reap interval: Collection config.system.sessions is not sharded.
2020-04-17T13:44:42.570+0000 I NETWORK [conn5] end connection 10.244.0.10:37922 (0 connections now open)
I am new to MongoDB, and it took a lot of time to put this chart together. I also checked other similar questions, such as "could not find host matching read preferences in mongodb", but I am not able to debug this further.

Your config server replica set is either:
not running (not all nodes are up),
not a replica set (replSetInitiate not executed, or failed),
referenced incorrectly from the shard nodes (wrong host, IP, or replica set name), or
up and running but inaccessible to your shards due to firewall rules.
Ensure you can access the replica set nodes from a mongo shell on the machines where the shard mongods are running, as in the sketch below.
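A minimal connectivity check, run from inside the mongos pod (the pod, service, and host names here are assumptions; substitute the names your chart actually creates):

kubectl exec -it <mongos-pod> -- mongo --host mongors1conf-0.mongors1conf-svc --port 27017 --eval 'rs.status().members.forEach(function(m) { print(m.name, m.stateStr) })'

Every member name printed must be resolvable and reachable from the mongos pod exactly as shown, and one member must report PRIMARY; otherwise mongos keeps failing with FailedToSatisfyReadPreference.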

Related

MongoDB ReplicaSet in container, accessible on bridge and host network

I am running 3 MongoDB instances in a container named mongodb-local, and initiating them as a replica set with the command:
mongo --port 27020 --eval 'rs.initiate({_id: "rs1", members: [ { _id: 0, host: "mongodb-local:27018"}, { _id: 1, host: "mongodb-local:27019"}, { _id: 2, host: "mongodb-local:27020"} ] })'
This allows the MongoDB instances to talk to other services on the bridge network as mongodb://mongodb-local:27018,mongodb-local:27019,mongodb-local:27020/?replicaSet=rs1
However, I also want to connect to the MongoDB replica set from the host/public network. I have published the 3 ports there, but attempting to connect with mongodb://localhost:27018,localhost:27019,localhost:27020/?replicaSet=rs1 fails, because the replicas know each other as mongodb-local, which is not resolvable from the host.
For example, the C# driver throws errors such as:
One or more errors occurred. (A timeout occured after 30000ms selecting a server using CompositeServerSelector{ Selectors = MongoDB.Driver.MongoClient+AreSessionsSupportedServerSelector, LatencyLimitingServerSelector{ AllowedLatencyRange = 00:00:00.0150000 } }. Client view of cluster state is { ClusterId : "1", ConnectionMode : "ReplicaSet", Type : "ReplicaSet", State : "Disconnected", Servers : [{ ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/localhost:27018" }", EndPoint: "Unspecified/localhost:27018", State: "Disconnected", Type: "Unknown", HeartbeatException: "MongoDB.Driver.MongoConnectionException: An exception occurred while opening a connection to the server.
or in the shell
...>mongo mongodb://localhost:27020,localhost:27019,localhost:27018/?replicaSet=rs1
MongoDB shell version v4.0.2
connecting to: mongodb://localhost:27020,localhost:27019,localhost:27018/?replicaSet=rs1
2020-07-22T11:24:23.036+0100 I NETWORK [js] Starting new replica set monitor for rs1/localhost:27020,localhost:27019,localhost:27018
2020-07-22T11:24:23.042+0100 I NETWORK [js] Successfully connected to localhost:27018 (1 connections now open to localhost:27018 with a 5 second timeout)
2020-07-22T11:24:23.042+0100 I NETWORK [ReplicaSetMonitor-TaskExecutor] Successfully connected to localhost:27019 (1 connections now open to localhost:27019 with a 5 second timeout)
2020-07-22T11:24:23.046+0100 I NETWORK [ReplicaSetMonitor-TaskExecutor] Successfully connected to localhost:27020 (1 connections now open to localhost:27020 with a 5 second timeout)
2020-07-22T11:24:23.048+0100 I NETWORK [ReplicaSetMonitor-TaskExecutor] changing hosts to rs1/mongodb-local:27018,mongodb-local:27019,mongodb-local:27020 from rs1/localhost:27018,localhost:27019,localhost:27020
2020-07-22T11:24:28.563+0100 W NETWORK [js] Unable to reach primary for set rs1
2020-07-22T11:24:28.563+0100 I NETWORK [js] Cannot reach any nodes for set rs1. Please check network connectivity and the status of the set. This has happened for 1 checks in a row.
2020-07-22T11:24:32.072+0100 W NETWORK [js] Unable to reach primary for set rs1
2020-07-22T11:24:32.072+0100 I NETWORK [js] Cannot reach any nodes for set rs1. Please check network connectivity and the status of the set. This has happened for 2 checks in a row.
2020-07-22T11:24:35.581+0100 W NETWORK [js] Unable to reach primary for set rs1
2020-07-22T11:24:35.581+0100 I NETWORK [js] Cannot reach any nodes for set rs1. Please check network connectivity and the status of the set. This has happened for 3 checks in a row.
2020-07-22T11:24:39.091+0100 W NETWORK [js] Unable to reach primary for set rs1
2020-07-22T11:24:39.092+0100 I NETWORK [js] Cannot reach any nodes for set rs1. Please check network connectivity and the status of the set. This has happened for 4 checks in a row.
2020-07-22T11:24:39.097+0100 E QUERY [js] Error: connect failed to replica set rs1/localhost:27020,localhost:27019,localhost:27018 :
connect#src/mongo/shell/mongo.js:257:13
#(connect):1:6
exception: connect failed
Conversely, I could initiate the replica set as:
mongo --port 27020 --eval 'rs.initiate({_id: "rs1", members: [ { _id: 0, host: "localhost:27018"}, { _id: 1, host: "localhost:27019"}, { _id: 2, host: "localhost:27020"} ] })'
This allows connections from the host machine/public network, and by coincidence, since the replicas all run in the same container, they can also talk to each other.
However, other services then cannot connect with mongodb://mongodb-local:27018,mongodb-local:27019,mongodb-local:27020/?replicaSet=rs1, failing with a similar error.
It would also fail if I wanted to put each replica in a separate container.
I believe this is because the client receives the replica set configuration after connecting, and expects those hostnames to be reachable from the network it is connecting on. If so, this is not a great assumption.
How can I initiate the replica set on the bridge network and also allow connections from outside the bridge network?
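A minimal workaround sketch, assuming all three container ports are published to localhost: keep the replica set configured with the mongodb-local host names, and make that name resolvable on the host as well, so the addresses the client learns during discovery are valid on both networks.

echo "127.0.0.1 mongodb-local" | sudo tee -a /etc/hosts
mongo "mongodb://mongodb-local:27018,mongodb-local:27019,mongodb-local:27020/?replicaSet=rs1"

The client always switches to the host names stored in the replica set configuration (the "changing hosts" line in the shell log above), so giving those names a meaning on every network avoids the problem without reconfiguring the set.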

Mongo Zone Sharding Implementation Issue

I am new to MongoDB.
I have configured 1 config server and 2 shards, and all of them are connected with the mongo client. mongos is also up and running.
I am currently implementing sharding using MongoDB zones, since I have to store data according to country/zone. But I am facing the error below on the shards:
2018-05-03T16:22:40.200+0530 W SHARDING [conn13] Chunk move failed :: caused by ::
NoShardingEnabled: Cannot accept sharding commands if not started with --shardsvr
2018-05-03T16:27:11.223+0530 I SHARDING [Balancer] Balancer move testdb.user:
[{ zipcode: "380001" }, { zipcode: "39000" }), from rs1, to rs0 failed :: caused by ::
NoShardingEnabled: Cannot accept sharding commands if not started with --shardsvr
I have already started the servers with --shardsvr and also set it in the config file, but I am still facing the issue.
Please help me get this resolved.
Thanks in advance.
Ankit
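For reference, a minimal sketch of what each shard member's startup should look like (the port, replSet name, and dbpath are illustrative): every mongod in each shard replica set needs the shardsvr role, not just one node, and each must be restarted after the option is added.

mongod --shardsvr --replSet rs0 --port 27018 --dbpath /data/shard0
# the equivalent mongod.conf setting is sharding.clusterRole: shardsvr

Since the NoShardingEnabled error comes from a member still running without this role, checking db.serverCmdLineOpts() in a mongo shell on every member of rs0 and rs1 should reveal which node missed the restart.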

MongoDB stops when I try to make a connection

MongoDB stops whenever I try to make a connection.
When I run
sudo service mongod start
I get a message that MongoDB is running.
But then, when I try to make a connection using PyMongo, I get an error that says:
AutoReconnect: connection closed
I check my mongodb status:
sudo service mongod status
And it says that my MongoDB instance is stopped/waiting.
My mongo log file reports the following:
2015-09-17T18:19:46.816+0000 I NETWORK [initandlisten] waiting for connections on port 7000
2015-09-17T18:19:58.813+0000 I NETWORK [initandlisten] connection accepted from 54.152.111.120:51387 #1 (1 connection now open)
2015-09-17T18:19:58.816+0000 I STORAGE [conn1] _getOpenFile() invalid file index requested 4
2015-09-17T18:19:58.816+0000 I - [conn1] Invariant failure false src/mongo/db/storage/mmap_v1/mmap_v1_extent_manager.cpp 201
2015-09-17T18:19:58.837+0000 I CONTROL [conn1]
This is followed by a lengthy backtrace that I can't decipher, then closes with:
2015-09-17T18:19:58.837+0000 I - [conn1]
***aborting after invariant() failure
I've looked around SO, particularly trying the top two answers here, but haven't been able to figure out how to solve the problem.
I'll also note that last time I tried to connect last week, it was working fine.
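The invariant failure in mmap_v1_extent_manager.cpp points to corrupted MMAPv1 data files rather than a networking problem. A hedged recovery sketch (the dbpath and user are assumptions; check storage.dbPath in /etc/mongod.conf, and back the directory up first):

sudo service mongod stop
sudo cp -a /var/lib/mongodb /var/lib/mongodb.bak
sudo -u mongodb mongod --dbpath /var/lib/mongodb --port 7000 --repair

If --repair cannot salvage the files, restoring from a backup taken before the corruption is the remaining option.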

mongodb keyFile between replicas throws Permission denied

I have a single-node replica set with auth enabled, a root user, and a keyFile I've created following this tutorial. I also have two more mongod processes on the same server, on different ports (37017 and 47017), with the same replSet name. But when I try to add the secondaries from a mongo shell connected to the PRIMARY with rs.add("172.31.48.41:37017"), I get:
{
    "ok" : 0,
    "errmsg" : "Quorum check failed because not enough voting nodes responded; required 2 but only the following 1 voting nodes responded: 172.31.48.41:27017; the following nodes did not respond affirmatively: 172.31.48.41:37017 failed with Failed attempt to connect to 172.31.48.41:37017; couldn't connect to server 172.31.48.41:37017 (172.31.48.41), connection attempt failed",
    "code" : 74
}
Then I went to the mongod process log of the PRIMARY and found this:
2015-05-19T20:53:59.848-0400 I REPL [conn51] replSetReconfig admin command received from client
2015-05-19T20:53:59.848-0400 W NETWORK [conn51] Failed to connect to 172.31.48.41:37017, reason: errno:13 Permission denied
2015-05-19T20:53:59.848-0400 I REPL [conn51] replSetReconfig config object with 2 members parses ok
2015-05-19T20:53:59.849-0400 W NETWORK [ReplExecNetThread-0] Failed to connect to 172.31.48.41:37017, reason: errno:13 Permission denied
2015-05-19T20:53:59.849-0400 W REPL [ReplicationExecutor] Failed to complete heartbeat request to 172.31.48.41:37017; Location18915 Failed attempt to connect to 172.31.48.41:37017; couldn't connect to server 172.31.48.41:37017 (172.31.48.41), connection attempt failed
2015-05-19T20:53:59.849-0400 E REPL [conn51] replSetReconfig failed; NodeNotFound Quorum check failed because not enough voting nodes responded; required 2 but only the following 1 voting nodes responded: 172.31.48.41:27017; the following nodes did not respond affirmatively: 172.31.48.41:37017 failed with Failed attempt to connect to 172.31.48.41:37017; couldn't connect to server 172.31.48.41:37017 (172.31.48.41), connection attempt failed
And the log of the mongod that should become SECONDARY shows nothing; the last two lines are:
2015-05-19T20:48:36.584-0400 I REPL [initandlisten] Did not find local replica set configuration document at startup; NoMatchingDocument Did not find replica set configuration document in local.system.replset
2015-05-19T20:48:36.591-0400 I NETWORK [initandlisten] waiting for connections on port 37017
It's clear that I cannot run rs.initiate() on this node, because it would vote for itself to become PRIMARY and that would create a conflict, so the line stating "Did not find local replica set configuration document at startup" is to be ignored, as far as I know.
So I would think the permissions should be fine, since I'm using the same keyFile in every mongod process and the replSet name is the same in every config file, and that's all the tutorial says is needed, but obviously something is missing.
Any ideas? Is this a bug?
If you are using EC2 instances and have opened only port 27017 in the security group for both instances, just add the secondary instance's port as well. It worked for me.
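A hedged diagnostic sketch for this class of failure (the keyFile path is an assumption): verify the secondary's port is reachable from the primary's host, check that the keyFile permissions satisfy mongod, and on SELinux systems allow mongod to use the non-default port, since SELinux denials also surface as errno:13 Permission denied.

nc -zv 172.31.48.41 37017
ls -l /path/to/keyfile        # mongod rejects keyFiles readable by group or other; chmod 400 if needed
sudo semanage port -a -t mongod_port_t -p tcp 37017   # SELinux systems only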

Mongos distributes too many queries to one MongoD

We have a MongoDB cluster with 2 shards; each shard has these servers:
Shard 1: Master, running MongoD and Config server
Shard 1-s1: Slave, running MongoD and MongoS server
Shard 1-s2: Slave, running MongoD and MongoS and Arbiter server
Shard 2: Master, running MongoD and Config Server
Shard 2-s1: Slave, running MongoD and Config and MongoS server
Shard 2-s2: Slave, running MongoD and MongoS and Arbiter server
But MongoDB has kept failing in recent days. After days of searching, I found that the MongoD running on Shard 1 (Master) always goes down after receiving too many connections; the other MongoD instances don't have this problem.
When Shard 1 Master's MongoD has been running with too many connections for about 2 hours, the 4 MongoS servers shut down one by one. Here is the MongoS error log (10.81.4.72:7100 runs MongoD):
Tue Aug 20 20:01:52 [conn8526] DBClientCursor::init call() failed
Tue Aug 20 20:01:52 [conn3897] ns: user.dev could not initialize cursor across all shards because : stale config detected for ns: user.dev ParallelCursor::_init # s01/10.36.31.36:7100,10.42.50.24:7100,10.81.4.72:7100 attempt: 0
Tue Aug 20 20:01:52 [conn744] ns: user.dev could not initialize cursor across all shards because : stale config detected for ns: user.dev ParallelCursor::_init # s01/10.36.31.36:7100,10.42.50.24:7100,10.81.4.72:7100 attempt: 0
I don't know why this mongod received so many connections; the chunk distribution shows the sharding is working well.
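A hedged sketch for narrowing this down (the mongos address is a placeholder): compare connection counts on each mongod and confirm the chunk distribution from a mongos, since a skew will show up in one of these two places.

mongo --host 10.81.4.72 --port 7100 --eval 'printjson(db.serverStatus().connections)'
mongo --host <mongos-host> --eval 'sh.status()'

If only Shard 1's master is saturated while sh.status() shows balanced chunks, the load is more likely concentrated on a hot key range than caused by chunk placement.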