MongoDB sharding problems - mongodb

Our MongoDB deployment has 2 shards, each with 1 master server and 2 slave servers.
The four slave servers run the mongo config servers as proxies, and two of the slave servers also run arbiters.
But MongoDB can't be used now.
I can connect to 192.168.0.1:8000 (mongos) and run commands like 'use database' or 'show dbs', but I can't run queries against a chosen database, such as 'db.foo.count()' or 'db.foo.findOne()'.
Here is the error log:
mongos> db.dev.count()
Fri Aug 16 12:55:36 uncaught exception: count failed: {
"assertion" : "DBClientBase::findN: transport error: 10.81.4.72:7100 query: { setShardVersion: \"\", init: true, configdb: \"10.81.4.72:7300,10.42.50.26:7300,10.81.51.235:7300\", serverID: ObjectId('520db0a51fa00999772612b9'), authoritative: true }",
"assertionCode" : 10276,
"errmsg" : "db assertion failure",
"ok" : 0
}
Fri Aug 16 11:23:29 [conn8431] DBClientCursor::init call() failed
Fri Aug 16 11:23:29 [conn8430] Socket recv() errno:104 Connection reset by peer 10.81.4.72:7100
Fri Aug 16 11:23:29 [conn8430] SocketException: remote: 10.81.4.72:7100 error: 9001 socket exception [1] server [10.81.4.72:7100]
Fri Aug 16 11:23:29 [conn8430] DBClientCursor::init call() failed
Fri Aug 16 11:23:29 [conn8430] DBException in process: could not initialize cursor across all shards because : DBClientBase::findN: transport error: 10.81.4.72:7100 query: { setShardVersion: "", init: true, configdb: "10.81.4.72:7300,10.42.50.26:7300,10.81.51.235:7300", serverID: ObjectId('520d99c972581e6a124d0561'), authoritative: true } # s01/10.36.31.36:7100,10.42.50.24:7100,10.81.4.72:7100
I can only start one mongos; queries can't be executed if more than one mongos runs at the same time. Error log:
mongos> db.dev.count() Fri Aug 16 15:12:29 uncaught exception: count failed: { "assertion" : "DBClientBase::findN: transport error: 10.81.4.72:7100 query: { setShardVersion: \"\", init: true, configdb: \"10.81.4.72:7300,10.42.50.26:7300,10.81.51.235:7300\", serverID: ObjectId('520dd04967557902f73a9fba'), authoritative: true }", "assertionCode" : 10276, "errmsg" : "db assertion failure", "ok" : 0 }

Could you please clarify whether your set-up was working before, or whether you are just setting it up now?
To repair your MongoDB, you might want to follow this link:
http://docs.mongodb.org/manual/tutorial/recover-data-following-unexpected-shutdown/
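For reference, a minimal sketch of what that repair procedure looks like (assuming a default data path of /data/db; adjust --dbpath and the config file path to your deployment):
# stop the affected mongod first, then repair its data files in place
mongod --dbpath /data/db --repair
# once the repair finishes, start the mongod again as usual
mongod --config /etc/mongod.conf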
References
MongoDB Documentation: Deploying a Sharded Cluster
MongoDB Documentation: Add Shards to an Existing Cluster
Older, outdated(!) info:
YouTube video on setting up sharding for MongoDB
Corresponding blog post on blog.serverdensity.com

Related

Older oplog entries are not getting truncated

I have a mongod instance running with oplogMinRetentionHours set to 24 hours and the maximum oplog size set to 50 GB. Despite these settings, oplog entries seem to be retained indefinitely: the oplog contains entries older than 24 hours, and it has grown to 1.4 TB (0.34 TB on disk).
db.runCommand( { serverStatus: 1 } ).oplogTruncation.oplogMinRetentionHours
24 hrs
db.getReplicationInfo()
{
    "logSizeMB" : 51200,
    "usedMB" : 1464142.51,
    "timeDiff" : 3601538,
    "timeDiffHours" : 1000.43,
    "tFirst" : "Fri Mar 19 2021 14:15:49 GMT+0000 (Greenwich Mean Time)",
    "tLast" : "Fri Apr 30 2021 06:41:27 GMT+0000 (Greenwich Mean Time)",
    "now" : "Fri Apr 30 2021 06:41:28 GMT+0000 (Greenwich Mean Time)"
}
MongoDB server version: 4.4.0
OS: Windows Server 2016 DataCenter 64bit
What I have noticed is that even a super user with the root role is not able to access replset.oplogTruncateAfterPoint; I'm not sure if this is by design.
mongod.log
{"t":{"$date":"2021-04-30T06:35:51.308+00:00"},"s":"I", "c":"ACCESS",
"id":20436, "ctx":"conn8","msg":"Checking authorization
failed","attr":{"error":{"code":13,"codeName":"Unauthorized","errmsg":"not
authorized on local to execute command { aggregate:
"replset.oplogTruncateAfterPoint", pipeline: [ { $indexStats: {} }
], cursor: { batchSize: 1000.0 }, $clusterTime: { clusterTime:
Timestamp(1619764547, 1), signature: { hash: BinData(0,
180A28389B6BBA22ACEB5D3517029CFF8D31D3D8), keyId: 6935907196995633156
} }, $db: "local" }"}}}
I am not sure why mongo would not delete older entries from the oplog.
MongoDB oplog truncation seems to be triggered by inserts, so the oplog only gets truncated as new inserts happen.
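As a quick way to check this from the shell, here is a minimal sketch (it reuses the serverStatus oplogTruncation section shown above; replSetResizeOplog is the documented way to change the size or retention window on 4.4):
// inspect the server's view of oplog truncation (size and retention settings)
db.runCommand({ serverStatus: 1 }).oplogTruncation

// adjust the retention window if needed (size is in MB, run against admin);
// actual truncation still only happens as new writes arrive
db.adminCommand({ replSetResizeOplog: 1, size: 51200, minRetentionHours: 24 })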

Mongo show collections on a remote server works on Windows but gets stuck on Ubuntu

I am in a strange situation with a MongoDB remote connection.
I have MongoDB on an EC2 instance, configured with password auth.
When I connect to it and browse collections from a Windows machine it works fine, but when I try to do the same from an Amazon EC2 (Ubuntu) instance it just gets stuck.
I am able to connect from the Ubuntu instance, but when I run the command show collections, the server just hangs.
Windows:
E:\check\scheduler\trunk>mongo 54.xxx.xx.xx:27017/vdb -u vadmin -p db#dummypass
MongoDB shell version: 2.6.4
connecting to: 54.xxx.xx.xx:27017/vdb
> use vdb;
switched to db vdb
> show collections;
activity
user
adminpush
adminuserlog
friend
demo
addressbook
branddemovideo
Ubuntu (EC2 instance):
ubuntu@ip-10-146-147-102:~$ mongo 54.xxx.xx.xx:27017/vdb -u vadmin -p 'db#dummypass'
MongoDB shell version: 2.6.5
connecting to: 54.xxx.xx.xx:27017/vdb
> show databases;
admin 0.078125GB
local 0.078125GB
vdb 1.49951171875GB
> use vdb;
switched to db vdb
> show collections;
It just gets stuck here.
After some time, I finally got to see an error:
Mon Oct 20 19:52:51.949 Socket recv() errno:104 Connection reset by peer 54.xxx.xx.xx:27017
Mon Oct 20 19:52:51.954 SocketException: remote: 54.xxx.xx.xx:27017 error: 9001 socket exception [RECV_ERROR] server [54.xxx.xx.xx:27017]
Mon Oct 20 19:52:51.954 DBClientCursor::init call() failed
Mon Oct 20 19:52:51.955 Error: error doing query: failed at src/mongo/shell/query.js:78
Mon Oct 20 19:52:51.955 trying reconnect to 54.xxx.xx.xx:27017
Mon Oct 20 19:52:51.969 reconnect 54.xxx.xx.xx:27017 ok
Mongo logs are given below.
Local server:
ubuntu@ip-172-xx-x-xx:/var/log/mongodb$ tail -f mongodb.log
2014-10-20T20:34:50.621+0530 [clientcursormon] connections:0
2014-10-20T20:39:50.633+0530 [clientcursormon] mem (MB) res:36 virt:342
2014-10-20T20:39:50.633+0530 [clientcursormon] mapped (incl journal view):160
2014-10-20T20:39:50.633+0530 [clientcursormon] connections:0
2014-10-20T20:44:50.645+0530 [clientcursormon] mem (MB) res:36 virt:342
2014-10-20T20:44:50.645+0530 [clientcursormon] mapped (incl journal view):160
2014-10-20T20:44:50.645+0530 [clientcursormon] connections:0
2014-10-20T20:49:50.658+0530 [clientcursormon] mem (MB) res:36 virt:342
2014-10-20T20:49:50.658+0530 [clientcursormon] mapped (incl journal view):160
2014-10-20T20:49:50.658+0530 [clientcursormon] connections:0
Remote server:
2014-10-20T20:55:05.688+0530 [initandlisten] connection accepted from 54.xxx.xxx.xxx:57881 #42 (11 connections now open)
2014-10-20T20:55:05.692+0530 [conn42] authenticate db: vdb { authenticate: 1, nonce: "xxx", user: "vadmin", key: "xxx" }
2014-10-20T20:55:36.039+0530 [conn13] query vdb.video query: { query: { videoType: 5, viewType: 1, status: { $nin: [ 2, 5, 3 ] }, location.regionId: 2, frameHeight: { $gt: 0 } }, orderby: { popularRating: -1 } } planSummary: IXSCAN { location.regionId: 1 }, IXSCAN { location.regionId: 1 } cursorid:86724525795 ntoreturn:3 ntoskip:0 nscanned:2892 nscannedObjects:2892 keyUpdates:0 numYields:1 locks(micros) r:232390 nreturned:3 reslen:8101 116ms
2014-10-20T20:55:36.087+0530 [conn9] query vdb.video query: { query: { videoType: 5, viewType: 1, status: { $nin: [ 2, 5, 3 ] }, location.regionId: 2, frameHeight: { $gt: 0 } }, orderby: { popularRating: -1 } } planSummary: IXSCAN { location.regionId: 1 }, IXSCAN { location.regionId: 1 } cursorid:86710924707 ntoreturn:3 ntoskip:0 nscanned:2892 nscannedObjects:2892 keyUpdates:0 numYields:1 locks(micros) r:162238 nreturned:3 reslen:8101 112ms
Finally, I resolved it. I was working in a VPC and had to use the private IP to connect to the remote Mongo (and to MySQL as well).
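A minimal sketch of the working connection, assuming a placeholder private IP of 10.0.x.x (substitute your instance's private IP from the EC2 console):
# connect via the VPC-internal (private) address instead of the public one
mongo 10.0.x.x:27017/vdb -u vadmin -p 'db#dummypass'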

MongoDs in ReplSet won't start after trying out some MapReduce

I was practicing some MapReduce inside my primary's mongo shell when it suddenly became a secondary. I SSHed into the two other VMs running the other secondaries and discovered that their mongod processes had been rendered inoperable. I killed them, ran mongod --config /etc/mongod.conf to start them again, and entered the mongo shell. After a few seconds the shells were interrupted with:
2014-09-14T22:29:54.142-0500 DBClientCursor::init call() failed
2014-09-14T22:29:54.143-0500 trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed
2014-09-14T22:29:54.143-0500 warning: Failed to connect to 127.0.0.1:27017, reason: errno:111 Connection refused
2014-09-14T22:29:54.143-0500 reconnect 127.0.0.1:27017 (127.0.0.1) failed failed couldn't connect to server 127.0.0.1:27017 (127.0.0.1), connection attempt failed
>
This is from their (the two original secondaries in the replicaset) logs:
2014-09-14T22:09:21.879-0500 [rsBackgroundSync] replSet syncing to: vm-billing-001:27017
2014-09-14T22:09:21.880-0500 [rsSync] replSet still syncing, not yet to minValid optime 54165090:1
2014-09-14T22:09:21.882-0500 [rsBackgroundSync] replset setting syncSourceFeedback to vm-billing-001:27017
2014-09-14T22:09:21.886-0500 [rsSync] replSet SECONDARY
2014-09-14T22:09:21.886-0500 [repl writer worker 1] build index on: test.tmp.mr.CCS.nonconforming_1_inc properties: { v: 1, key: { 0: 1 }, name: "_temp_0", ns: "test.tmp.mr.CCS.nonconforming_1_inc" }
2014-09-14T22:09:21.887-0500 [repl writer worker 1] added index to empty collection
2014-09-14T22:09:21.887-0500 [repl writer worker 1] build index on: test.tmp.mr.CCS.nonconforming_1 properties: { v: 1, key: { _id: 1 }, name: "_id_", ns: "test.tmp.mr.CCS.nonconforming_1" }
2014-09-14T22:09:21.887-0500 [repl writer worker 1] added index to empty collection
2014-09-14T22:09:21.888-0500 [repl writer worker 1] build index on: test.tmp.mr.CCS.nonconforming_1 properties: { v: 1, unique: true, key: { id: 1.0 }, name: "id_1", ns: "test.tmp.mr.CCS.nonconforming_1" }
2014-09-14T22:09:21.888-0500 [repl writer worker 1] added index to empty collection
2014-09-14T22:09:21.891-0500 [repl writer worker 2] ERROR: writer worker caught exception: :: caused by :: 11000 insertDocument :: caused by :: 11000 E11000 duplicate key error index: cisco.tmp.mr.CCS.nonconforming_1.$id_1 dup key: { : null } on: { ts: Timestamp 1410748561000|46, h: 9014687153249982311, v: 2, op: "i", ns: "cisco.tmp.mr.CCS.nonconforming_1", o: { _id: 14, value: 1.0 } }
2014-09-14T22:09:21.891-0500 [repl writer worker 2] Fatal Assertion 16360
2014-09-14T22:09:21.891-0500 [repl writer worker 2]
I can run mongo --host ... --port ... from both of the VMs whose mongods won't start and connect to the original primary, but I do see some connection-refused messages in the error log above.
I can still connect to my original primary mongod with the mongo shell, but it is a primary. If I kill it and restart it, it will start up as a secondary.
How can I roll back to the last known state and restart my replica set?

mongoexport does not write any records to json output file

I have tried to export a json file from MongoDB with mongoexport in the following way:
$ mongoexport --db db --collection ds --dbpath ~/db --out ds.json
exported 0 records
Sat Apr 20 23:13:18 dbexit:
Sat Apr 20 23:13:18 [tools] shutdown: going to close listening sockets...
Sat Apr 20 23:13:18 [tools] shutdown: going to flush diaglog...
Sat Apr 20 23:13:18 [tools] shutdown: going to close sockets...
Sat Apr 20 23:13:18 [tools] shutdown: waiting for fs preallocator...
Sat Apr 20 23:13:18 [tools] shutdown: closing all files...
Sat Apr 20 23:13:18 [tools] closeAllFiles() finished
Sat Apr 20 23:13:18 [tools] shutdown: removing fs lock...
Sat Apr 20 23:13:18 dbexit: really exiting now
I do not understand why the created JSON file is empty, because the database actually contains the following data:
$ mongo
MongoDB shell version: 2.2.3
connecting to: test
> use ds
switched to db ds
> db.ds.find().pretty()
{
    "_id" : "1_522311",
    "chr" : 1,
    "kg" : {
        "yri" : {
            "major" : "D",
            "minor" : "A",
            "maf" : 0.33036
        },
        "ceu" : {
            "major" : "C",
            "minor" : "A",
            "maf" : 0.05263
        }
    },
    "pos" : 522311
}
{
    "_id" : "1_223336",
    "chr" : 1,
    "kg" : {
        "yri" : {
            "major" : "G",
            "minor" : "C",
            "maf" : 0.473214
        },
        "ceu" : {
            "major" : "C",
            "minor" : "G",
            "maf" : 0.017544
        },
        "jptchb" : {
            "major" : "C",
            "minor" : "G",
            "maf" : 0.220339
        }
    },
    "pos" : 223336
}
What did I do wrong?
Thank you in advance.
It appears that you have a database called ds:
> use ds
switched to db ds
use ds switches the current database to the ds database (db from the shell is just an alias for the current database).
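For example, right after use ds the db alias reports the current database:
> db
ds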
Then, you have a collection called ds as well:
> db.ds.find().pretty()
So, that means you have a ds database with a ds collection (ds.ds).
You should then run the export like this, with the --db option set to ds (assuming the path to the database is correct):
mongoexport --db ds --collection ds --dbpath ~/db --out ds.json
Update for MongoDB 3.0+: the --dbpath option is no longer available.
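In that case, point mongoexport at a running mongod instead; a minimal sketch, assuming the server listens locally on the default port with auth disabled:
mongoexport --host localhost --port 27017 --db ds --collection ds --out ds.json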
I know this answer may not exactly satisfy the question, but I hope it will help people struggling with "mongoexport does not write any records to json output file".
My problem was that I was using quotes. For example:
$ mongoexport --db 'my-database' --collection 'my-collection' --out ds.json
but the correct command is (without quotes):
$ mongoexport --db my-database --collection my-collection --out ds.json
I discovered this when I ran mongodump and it created a folder with the quotes in its name. This seemed very strange to me, but then I understood that mongoexport interprets the quotes as part of the name; when I removed them, it worked fine.

mongodb reconfigure shard ports

I have restarted 2 shards on non-standard ports by changing their .conf files. Now when I connect via mongo and run listshards, I get:
mongos> db.runCommand( { listshards : 1 } );
Tue Oct 23 17:36:21 uncaught exception: error {
"$err" : "error creating initial database config information :: caused by :: socket exception [CONNECT_ERROR] for vserver-dev-2:37017",
"code" : 11002
}
(37017 is the old port).
How can I update the shard ports on the router (mongos)?
Manually update the ports on the mongo config server:
mongo
use config
configsvr> db.shards.update({_id: "shard0000"} , {$set: {"host" : "vserver-dev-2:37018"}})
configsvr> db.shards.find()
{ "_id" : "shard0000", "host" : "vserver-dev-2:37018" }