I'm extremely frustrated trying to set up a MongoDB replica set from scratch.
I have two machines running Debian with MongoDB installed. When I use rs.add() to add a member to the replica set, I get an error, even though I can still connect to that MongoDB instance with
mongo --host 13.212.31.212:27017
Here is the error message:
rs0:PRIMARY> rs.add("13.212.31.212:27017")
{
"operationTime" : Timestamp(1597144435, 1),
"ok" : 0,
"errmsg" : "Quorum check failed because not enough voting nodes responded; required 2 but only the following 1 voting nodes responded: 192.168.0.59:27017; the following nodes did not respond affirmatively: 13.212.31.212:27017 failed with Received heartbeat from member with the same member ID as ourself: 0",
"code" : 74,
"codeName" : "NodeNotFound",
"$clusterTime" : {
"clusterTime" : Timestamp(1597144440, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
Here is the mongod.conf:
# Where and how to store data.
storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true
#  engine:
#  mmapv1:
#  wiredTiger:
# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log
# network interfaces
net:
  port: 27017
  bindIp: 127.0.0.1,172.26.2.229
# how the process runs
processManagement:
  timeZoneInfo: /usr/share/zoneinfo
What am I doing wrong?
This is the relevant part of the error message:
the following 1 voting nodes responded: 192.168.0.59:27017; the following nodes did not respond affirmatively: 13.212.31.212:27017 failed with Received heartbeat from member with the same member ID as ourself: 0
It tells you that 192.168.0.59:27017 and 13.212.31.212:27017 are the same node (both report member ID 0), and you can't add the same node twice.
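One way to check whether two addresses really reach the same node is to compare the identity each one reports. The sketch below is a plain JavaScript helper (runnable in Node.js), not part of MongoDB. It assumes you have connected to each address separately and recorded the me field returned by db.hello(). The function name and the probe values are hypothetical.

```javascript
// Groups probed addresses by the identity ("me") each one reported.
// `probes` is an array of { address, me } objects; `me` is assumed to come from
// running db.hello().me while connected to `address`.
function findAliasedAddresses(probes) {
  const byIdentity = new Map();
  for (const { address, me } of probes) {
    if (!me) continue; // a node without a replica set config reports no identity
    if (!byIdentity.has(me)) byIdentity.set(me, []);
    byIdentity.get(me).push(address);
  }
  // Keep only identities reached through more than one address.
  return [...byIdentity.values()].filter((addresses) => addresses.length > 1);
}

// Hypothetical probe results for the two addresses in the error message.
const aliases = findAliasedAddresses([
  { address: "192.168.0.59:27017", me: "192.168.0.59:27017" },
  { address: "13.212.31.212:27017", me: "192.168.0.59:27017" },
]);
console.log(aliases);
```

If the result is non-empty, the grouped addresses point at one mongod and only one of them belongs in the member list.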
Related
I'm trying to configure a MongoDB replica set using Ansible.
I managed to install MongoDB on the primary server and create the replica set configuration file, but when I launch the playbook I get this error: MongoNetworkError: connect ECONNREFUSED 3.142.150.62:28041
Does anyone have an idea how to solve this?
The playbook and the error from the Jenkins console are below.
Playbook:
---
- name: Play1
  hosts: hhe
  #connection: local
  become: true
  #remote_user: ec2-user
  #remote_user: root
  tasks:
    - name: Install gnupg
      package:
        name: gnupg
        state: present
    - name: Import the public key used by the package management system
      shell: wget -qO - https://www.mongodb.org/static/pgp/server-5.0.asc | sudo apt-key add -
    - name: Create a list file for MongoDB
      shell: echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/5.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-5.0.list
    - name: Reload local package database
      command: sudo apt-get update
    - name: Installation of mongodb-org
      package:
        name: mongodb-org
        state: present
        update_cache: yes
    - name: Start mongodb
      service:
        name: mongod
        state: started
        enabled: yes
- name: Play2
  hosts: hhe
  become: true
  tasks:
    - name: create directories on all the EC2 instances
      shell: mkdir -p replicaset/member
- name: Play3
  hosts: secondary1
  become: true
  tasks:
    - name: Start mongoDB with the following command on secondary1
      shell: nohup mongod --port 28042 --bind_ip localhost,ec2-18-191-39-71.us-east-2.compute.amazonaws.com --replSet replica_demo --dbpath replicaset/member &
- name: Play4
  hosts: secondary2
  become: true
  tasks:
    - name: Start mongoDB with the following command on secondary2
      shell: nohup mongod --port 28043 --bind_ip localhost,ec2-18-221-31-81.us-east-2.compute.amazonaws.com --replSet replica_demo --dbpath replicaset/member &
- name: Play5
  hosts: arbiter
  become: true
  tasks:
    - name: Start mongoDB with the following command on arbiter
      shell: nohup mongod --port 27018 --bind_ip localhost,ec2-13-58-35-255.us-east-2.compute.amazonaws.com --replSet replica_demo --dbpath replicaset/member &
- name: Play6
  hosts: primary
  become: true
  tasks:
    - name: Start mongoDB with the following command on primary
      shell: nohup mongod --port 28041 --bind_ip localhost,ec2-3-142-150-62.us-east-2.compute.amazonaws.com --replSet replica_demo --dbpath replicaset/member &
    - name: Create replicaset initialize file
      copy:
        dest: /tmp/replicaset_conf.js
        mode: "u=rw,g=r,o=rwx"
        content: |
          var cfg =
          {
            "_id" : "replica_demo",
            "version" : 1,
            "members" : [
              {
                "_id" : 0,
                "host" : "3.142.150.62:28041"
              },
              {
                "_id" : 1,
                "host" : "18.191.39.71:28042"
              },
              {
                "_id" : 2,
                "host" : "18.221.31.81:28043"
              }
            ]
          }
          rs.initiate(cfg)
    - name: Pause for a while
      pause: seconds=20
    - name: Initialize the replicaset
      shell: mongo /tmp/replicaset_conf.js
The error on the Jenkins console:
PLAY [Play6] *******************************************************************
TASK [Gathering Facts] *********************************************************
ok: [primary]
TASK [Start mongoDB with the following command on primary] *********************
changed: [primary]
TASK [Create replicaset initialize file] ***************************************
ok: [primary]
TASK [Pause for a while] *******************************************************
Pausing for 20 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
ok: [primary]
TASK [Initialize the replicaset] ***********************************************
fatal: [primary]: FAILED! => {"changed": true, "cmd": "/usr/bin/mongo 3.142.150.62:28041 /tmp/replicaset_conf.js", "delta": "0:00:00.146406", "end": "2022-08-11 09:46:07.195269", "msg": "non-zero return code", "rc": 1, "start": "2022-08-11 09:46:07.048863", "stderr": "", "stderr_lines": [], "stdout": "MongoDB shell version v5.0.10\nconnecting to: mongodb://3.142.150.62:28041/test?compressors=disabled&gssapiServiceName=mongodb\nError: couldn't connect to server 3.142.150.62:28041, connection attempt failed: SocketException: Error connecting to 3.142.150.62:28041 :: caused by :: Connection refused :\nconnect#src/mongo/shell/mongo.js:372:17\n#(connect):2:6\nexception: connect failed\nexiting with code 1", "stdout_lines": ["MongoDB shell version v5.0.10", "connecting to: mongodb://3.142.150.62:28041/test?compressors=disabled&gssapiServiceName=mongodb", "Error: couldn't connect to server 3.142.150.62:28041, connection attempt failed: SocketException: Error connecting to 3.142.150.62:28041 :: caused by :: Connection refused :", "connect#src/mongo/shell/mongo.js:372:17", "#(connect):2:6", "exception: connect failed", "exiting with code 1"]}
You already start the service with
service:
  name: mongod
  state: started
  enabled: yes
so shell: nohup mongod ... & is pointless. You cannot start mongod multiple times on one host unless each instance uses a different port and dbPath. Prefer to run mongod as a service, i.e. systemctl start mongod or similar, instead of nohup mongod ... &. I also prefer a configuration file (typically /etc/mongod.conf) over command-line options.
A plain mongo command connects to the default port 27017, i.e. it does not connect to the MongoDB instance you started in the task above (port 28041).
You should wait until the replica set is initiated. You can do it like this:
content: |
  var cfg =
  {
    "_id" : "replica_demo",
    "version" : 1,
    "members" : [
      {
        "_id" : 0,
        "host" : "3.142.150.62:28041"
      },
      {
        "_id" : 1,
        "host" : "18.191.39.71:28042"
      },
      {
        "_id" : 2,
        "host" : "18.221.31.81:28043"
      }
    ]
  }
  rs.initiate(cfg)
  while (! db.hello().isWritablePrimary ) { sleep(1000) }
You configured an arbiter. However, an arbiter is useful only with an even number of data-bearing replica set members, where it makes the number of votes odd. With 3 members it does not make much sense. Anyway, you never add the arbiter to your replica set, so why define it?
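The arithmetic behind the arbiter remark can be sketched in a few lines. This is plain JavaScript (runnable in Node.js) with hypothetical function names. It assumes every member has one vote and that electing a primary needs a strict majority of voters.

```javascript
// Votes needed to elect a primary in a set with `votingMembers` voters.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many voters can be down while a primary can still be elected.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [2, 3, 4, 5]) {
  console.log(`${n} voters: majority ${majority(n)}, tolerates ${faultTolerance(n)} down`);
}
```

Going from 2 to 3 voters raises the tolerance from 0 to 1, which is where an arbiter helps. Going from 3 to 4 voters leaves it at 1, so adding an arbiter to three data-bearing members gains nothing.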
Just a note: you don't have to create a temp file. You can execute the script directly, e.g. like this:
shell:
  cmd: mongo --eval '{{ script }}'
  executable: /bin/bash
vars:
  script: |
    var cfg =
    {
      "_id" : "replica_demo",
      ...
    }
    rs.initiate(cfg)
    while (! db.hello().isWritablePrimary ) { sleep(1000) }
    print(rs.status().ok)
register: ret
failed_when: ret.stdout_lines | last != "1"
Be aware of correct quoting.
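The while loop in the scripts above never ends if the node does not become primary, which would hang the Ansible task. A bounded variant might look like the sketch below. The helper name waitFor and its parameters are hypothetical. The check and the sleep function are passed in so the helper can be tried in Node.js without a server.

```javascript
// Polls `isReady` up to `maxAttempts` times, pausing `delayMs` between tries.
// `sleepFn(ms)` must block for ms milliseconds (the legacy mongo shell has sleep()).
// Returns true as soon as `isReady()` is true, false if all attempts fail.
function waitFor(isReady, sleepFn, maxAttempts, delayMs) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (isReady()) return true;
    if (attempt < maxAttempts) sleepFn(delayMs);
  }
  return false;
}

// Stand-in check that succeeds on the third poll; no real sleeping here.
let polls = 0;
const becamePrimary = waitFor(() => ++polls >= 3, () => {}, 10, 1000);
console.log(becamePrimary, polls);

// Assumed usage inside the mongo shell script:
//   if (!waitFor(() => db.hello().isWritablePrimary, sleep, 30, 1000)) { quit(1) }
```

Exiting with a non-zero code on timeout lets Ansible mark the task as failed instead of waiting forever.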
I'm not a Go guy; I just need to use a plugin written in Go, and I'm having trouble getting the plugin to connect to MongoDB.
The error is:
server selection error: server selection timeout
current topology: Type: Unknown
Servers:
Addr: localhost:27017, Type: Unknown, State: Connected, Avergage RTT: 0, Last error: dial tcp 127.0.0.1:27017: connect: connection refused
exit status 1
My configuration:
time="2019-09-03T16:29:35Z" level=debug msg="Host: ip-XXX-XX-XX-XXX.sa-east-1.compute.internal"
time="2019-09-03T16:29:35Z" level=debug msg="Port: 27017"
time="2019-09-03T16:29:35Z" level=debug msg="Username: user"
time="2019-09-03T16:29:35Z" level=debug msg="Password: user123*"
time="2019-09-03T16:29:35Z" level=debug msg="DBName: dbBackend"
The plugin snippet that performs the connection:
addr := fmt.Sprintf("mongodb://%s:%s", m.Host, m.Port)
to := 60 * time.Second
opts := options.ClientOptions{
    ConnectTimeout: &to,
}
opts.ApplyURI(addr)
if m.Username != "" && m.Password != "" {
    opts.Auth = &options.Credential{
        AuthSource:  m.DBName,
        Username:    m.Username,
        Password:    m.Password,
        PasswordSet: true,
    }
}
client, err := mongo.Connect(context.TODO(), &opts)
if err != nil {
    return m, errors.Errorf("couldn't start mongo backend. error: %s\n", err)
}
err1 := client.Ping(context.TODO(), nil)
if err1 != nil {
    log.Fatal(err1) // error happens here
}
log.Debugf("MONGO CONNECTED")
m.Conn = client
return m, nil
I just can't figure out why the mongo driver is looking at localhost when I'm setting the address of my MongoDB server.
EDIT 1
My database has a replica set configured only so that I can use change streams.
This is my RS configuration:
{
"_id" : "rs0",
"version" : 69559,
"protocolVersion" : 1,
"writeConcernMajorityJournalDefault" : true,
"members" : [
{
"_id" : 0,
"host" : "localhost:27017",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 1,
"tags" : {
},
"slaveDelay" : 0,
"votes" : 1
}
],
"settings" : {
"chainingAllowed" : true,
"heartbeatIntervalMillis" : 2000,
"heartbeatTimeoutSecs" : 10,
"electionTimeoutMillis" : 10000,
"catchUpTimeoutMillis" : -1,
"catchUpTakeoverDelayMillis" : 30000,
"getLastErrorModes" : {
},
"getLastErrorDefaults" : {
"w" : 1,
"wtimeout" : 0
},
"replicaSetId" : ObjectId("5cf684c3c0db3f53727d1bb4")
}
}
Any help solving it appreciated.
Thanks
why the mongo driver is looking on localhost if I'm setting the address of my mongoDB server.
When mongo-go-driver's client connects to a MongoDB deployment, it performs Server Discovery and Monitoring to discover one or more servers (MongoDB being a distributed database by nature). One of the early steps is to begin monitoring the topology by invoking the isMaster command on all servers. Based on the output of isMaster, the client then tries to contact those servers. In the case of a replica set (your case), the client tries to connect to the primary (from isMaster.primary).
However, the hostname in your replica set configuration (localhost:27017) is not a fully qualified domain name (FQDN) that the client's machine can resolve to the server. The client's machine tries to connect to localhost, which is defined as the replica set primary, and fails to make a connection. This is also why the message shows current topology: Type: Unknown together with State: Connected: the driver failed to discover the deployment topology before it could even select a server to execute the command (ping).
You can solve this by setting resolvable hostnames as the values of the members field in the replica set configuration. In addition, when possible, use a logical DNS hostname instead of an IP address, as this avoids configuration changes when the IP address changes.
You can change the replica set hostnames using rs.reconfig(), e.g. for your single member at index 0:
cfg = rs.conf()
cfg.members[0].host = "<RESOLVABLE HOSTNAME>:<PORT NUMBER>"
rs.reconfig(cfg)
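With more than one member, editing members by index gets error-prone. A minimal sketch of an alternative is to rewrite hosts by their current value. The helper name withRenamedHosts and the hostname db1.example.internal are hypothetical, and the config object here is a trimmed copy of the one in the question. It runs in Node.js as is.

```javascript
// Returns a copy of a replica set config in which every member whose host
// appears as a key in `hostMap` gets the mapped host instead.
function withRenamedHosts(cfg, hostMap) {
  return {
    ...cfg,
    members: cfg.members.map((member) =>
      Object.prototype.hasOwnProperty.call(hostMap, member.host)
        ? { ...member, host: hostMap[member.host] }
        : member
    ),
  };
}

// Trimmed version of the config from the question.
const current = {
  _id: "rs0",
  version: 69559,
  members: [{ _id: 0, host: "localhost:27017" }],
};

const updated = withRenamedHosts(current, {
  "localhost:27017": "db1.example.internal:27017", // hypothetical resolvable name
});
console.log(JSON.stringify(updated.members));

// Assumed usage in the mongo shell:
//   rs.reconfig(withRenamedHosts(rs.conf(), { "localhost:27017": "<RESOLVABLE HOSTNAME>:<PORT NUMBER>" }))
```

The helper leaves the original object untouched, so the old configuration stays available for comparison if the reconfiguration is rejected.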
In your case, with only one replica set member, this is quite straightforward. However, if you're in production and have more than one member, you can follow the steps outlined in Change Hostnames in a Replica Set, which offers two options:
Change Hostnames without disrupting availability
Change Hostnames at the same time (one-go)
Alternatively, since your replica set deployment is only one server (development mode), you can set the connection mode to direct via ClientOptions.SetDirect(). This specifies that the client should connect directly to the given server instead of auto-discovering other servers in the cluster (although this means you have no redundancy), e.g.:
opts := options.ClientOptions{ ConnectTimeout: &timeoutVariable}
opts.SetDirect(true)
opts.ApplyURI(addr)
client, err := mongo.Connect(context.TODO(), &opts)
I'm setting up a sharded mongo cluster. I have two replica sets consisting of two nodes each, a replica set of three config servers, and a single mongos instance.
I have been able to add the replica set to the mongos instance:
sh.addShard("rs1/shard-rs01-s01");
This returns {"ok" : 1} and the same is true of the second replica set.
However when I try to do any database operations such as db.test.insert(...) I receive this error:
2017-02-23T01:17:28.599+0000 I ASIO [CatalogManagerReplacer]
Connecting to shard-RS01-S01:27017
2017-02-23T01:17:28.600+0000 I ASIO [CatalogManagerReplacer]
Connecting to config-01:27019
2017-02-23T01:17:28.603+0000 I ASIO [CatalogManagerReplacer]
Successfully connected to config-01:27019
2017-02-23T01:17:48.600+0000 I ASIO [CatalogManagerReplacer] Failed to connect to shard-RS01-S01:27017 - ExceededTimeLimit: Operation timed out
I double-checked that the firewall wasn't blocking the connection by disabling it on all of the systems. For what it's worth, on the node that runs the mongos instance I can connect to the replica set directly from the command line with this command, regardless of the firewall state:
mongo --host rs1/shard-rs01-s01:27017
So I am fairly sure it is not a firewall issue. Anyone have any ideas?
Here's a shard map of the setup if it is useful for anyone able to help...
mongos> db.runCommand("getShardMap")
{
"map" : {
"config" : "rs0/config-01:27019,config-02:27019,config-03:27019",
"config-01:27019" : "rs0/config-01:27019,config-02:27019,config-03:27019",
"config-02:27019" : "rs0/config-01:27019,config-02:27019,config-03:27019",
"config-03:27019" : "rs0/config-01:27019,config-02:27019,config-03:27019",
"rs0/config-01:27019,config-02:27019,config-03:27019" : "rs0/config-01:27019,config-02:27019,config-03:27019",
"rs1" : "rs1/shard-RS01-S01:27017,shard-RS01-S02:27017",
"rs1/shard-RS01-S01:27017,shard-RS01-S02:27017" : "rs1/shard-RS01-S01:27017,shard-RS01-S02:27017",
"rs2" : "rs2/shard-RS02-S03:27017,shard-RS02-S04:27017",
"rs2/shard-RS02-S03:27017,shard-RS02-S04:27017" : "rs2/shard-RS02-S03:27017,shard-RS02-S04:27017",
"shard-RS01-S01:27017" : "rs1/shard-RS01-S01:27017,shard-RS01-S02:27017",
"shard-RS01-S02:27017" : "rs1/shard-RS01-S01:27017,shard-RS01-S02:27017",
"shard-RS02-S03:27017" : "rs2/shard-RS02-S03:27017,shard-RS02-S04:27017",
"shard-RS02-S04:27017" : "rs2/shard-RS02-S03:27017,shard-RS02-S04:27017"
},
"ok" : 1
}
You need to initiate the config server replica set that your mongos uses, for example:
rs.initiate( { _id: "configReplSet", configsvr: true, members: [ { _id: 0, host: "mongo-config-1:27017" }] } )
Originally, mongod was running and logging to the file shard1.log. Then I ran mongod a second time, and that run failed.
shard1.log now contains only information about the failure of the second command. The original log was renamed to shard1.log.2015-02-01T07-41-46.
The problem is that the original mongod is not logging any more.
I checked the open files of the original mongod:
ls -l /proc/7980/fd | grep -v socket | grep log
l-wx------ 1 mongodb mongodb 64 2月 6 06:05 1 -> /data/mongodb/log/shard1.log.2015-02-01T07-41-46
The file shard1.log.2015-02-01T07-41-46 has stayed unchanged since then.
So how can I make the original mongod log again?
Edit
Here is my mongod config file:
logpath=/data/mongodb/log/shard1.log
fork=true
dbpath=/data/mongodb/db/shard1
pidfilepath=/data/mongodb/pid/shard1.pid
shardsvr=true
replSet=shard1/10.0.0.1:27017
port=27017
bind_ip=10.0.0.2
The output of db.serverCmdLineOpts():
{
"argv" : [
"/data/mongodb/bin/mongod",
"-f",
"/data/mongodb/conf/shard1.conf"
],
"parsed" : {
"bind_ip" : "10.0.0.2",
"config" : "/data/mongodb/conf/shard1.conf",
"dbpath" : "/data/mongodb/db/shard1",
"fork" : "true",
"logpath" : "/data/mongodb/log/shard1.log",
"pidfilepath" : "/data/mongodb/pid/shard1.pid",
"port" : 27017,
"replSet" : "shard1/10.0.0.1:27017",
"shardsvr" : "true"
},
"ok" : 1
}
I have restarted 2 shards on non-standard ports by changing their .conf files. Now when I connect via mongo and issue a listshards command, I get:
mongos> db.runCommand( { listshards : 1 } );
Tue Oct 23 17:36:21 uncaught exception: error {
"$err" : "error creating initial database config information :: caused by :: socket exception [CONNECT_ERROR] for vserver-dev-2:37017",
"code" : 11002
}
(37017 is the old port).
How can I update the shard ports on the router (mongos)?
Manually update the ports on the config server:
mongo
use config
configsvr> db.shards.update({_id: "shard0000"} , {$set: {"host" : "vserver-dev-2:37018"}})
configsvr> db.shards.find()
{ "_id" : "shard0000", "host" : "vserver-dev-2:37018" }
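When several shards moved, computing the new host value by hand for each one is easy to get wrong. The sketch below is a plain JavaScript helper (runnable in Node.js) that rewrites one host's port inside a shard host string. The name rewriteShardPort is hypothetical, and it assumes host strings look either like host:port or like rsName/host1:port,host2:port.

```javascript
// Replaces `hostname:oldPort` with `hostname:newPort` in a shard host string.
// Accepts "host:port" as well as "rsName/host1:port,host2:port".
function rewriteShardPort(hostString, hostname, oldPort, newPort) {
  const slash = hostString.indexOf("/");
  const prefix = slash === -1 ? "" : hostString.slice(0, slash + 1);
  const hosts = hostString.slice(slash + 1).split(",");
  const target = `${hostname}:${oldPort}`;
  return prefix + hosts.map((h) => (h === target ? `${hostname}:${newPort}` : h)).join(",");
}

console.log(rewriteShardPort("vserver-dev-2:37017", "vserver-dev-2", 37017, 37018));
```

The returned string is what would go into the $set of the db.shards.update() call shown above.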