Mongo stalled with too many insertions - mongodb

I am trying to use MongoDB to run a multi-agent simulation.
I have one mongod instance on the same server that runs the simulation program, but when I have too many agents (~100,000 over 10 simulation steps) mongod stalls for several seconds.
The code that inserts data into MongoDB (using the legacy C driver) is similar to:
// note: this runs once per agent per step: connect, insert one doc, disconnect
if( mongo_client( &m_conn, m_dbhost.c_str(), m_dbport ) != MONGO_OK ) {
    cout << "failed to connect '" << m_dbhost << ":" << m_dbport << "'\n";
    cout << "  mongo error: " << m_conn.err << endl;
    return;
}
bson_init( &b );
bson_append_new_oid( &b, "_id" );
bson_append_double( &b, "time", time );
bson_append_double( &b, "x", posx );
bson_append_double( &b, "y", posy );
bson_finish( &b );
if( mongo_insert( &m_conn, ns.c_str(), &b, NULL ) != MONGO_OK ) {
    cout << "failed to insert in mongo\n";
}
bson_destroy( &b );
mongo_disconnect( &m_conn );
Also, during the simulation, if I try to access the database using the mongo shell, I get errors:
$ mongo
MongoDB shell version: 2.4.1
connecting to: test
Wed Apr 3 10:10:24.870 JavaScript execution failed: Error: couldn't connect to server 127.0.0.1:27017 at src/mongo/shell/mongo.js:L112
exception: connect failed
After the simulation ends, the mongo shell becomes responsive again, and I can check that there is data in the database, but it has gaps. In this example, agent m0n999 saved only 6 of its 10 steps:
> show dbs
dB0B7F527F0FA45518712C8CB27611BD7 5.951171875GB
local 0.078125GB
> db.ins.m0n999.find()
{ "_id" : ObjectId("515bdf564c60ec1e000003e7"), "time" : 1, "x" : 1.1, "y" : 8.1 }
{ "_id" : ObjectId("515be0214c60ec1e0001075f"), "time" : 2, "x" : 1.2000000000000002, "y" : 8.2 }
{ "_id" : ObjectId("515be1c04c60ec1e0002da3a"), "time" : 4, "x" : 1.4000000000000004, "y" : 8.399999999999999 }
{ "_id" : ObjectId("515be2934c60ec1e0003b82c"), "time" : 5, "x" : 1.5000000000000004, "y" : 8.499999999999998 }
{ "_id" : ObjectId("515be3664c60ec1e000497cf"), "time" : 6, "x" : 1.6000000000000005, "y" : 8.599999999999998 }
{ "_id" : ObjectId("515be6cc4c60ec1e000824b2"), "time" : 10, "x" : 2.000000000000001, "y" : 8.999999999999996 }
>
How can I solve this problem? How can I avoid losing connections and recover from mongo stalls?
UPDATE
I'm getting errors like these in the global log:
"Wed Apr 3 11:53:00.379 [conn1378573] error: hashtable namespace index max chain reached:1335",
"Wed Apr 3 11:53:00.379 [conn1378573] error: hashtable namespace index max chain reached:1335",
"Wed Apr 3 11:53:00.379 [conn1378573] error: hashtable namespace index max chain reached:1335",
"Wed Apr 3 11:53:00.379 [conn1378573] error: hashtable namespace index max chain reached:1335",
"Wed Apr 3 11:53:00.379 [conn1378573] end connection 127.0.0.1:40748 (1 connection now open)",

I solved the problem; it had two causes:
I was creating too many collections (which is what the "hashtable namespace index max chain reached" errors point to). I changed from one collection per agent to only one collection per simulation process.
I was creating too many connections. I changed from one connection per agent iteration to only one connection per simulation step.
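For reference, a minimal sketch of the corrected pattern, written with pymongo rather than the legacy C driver above (sim_id, the simulations database, and the document fields are hypothetical stand-ins): open one connection for the whole run, write into a single per-simulation collection, and do one batched insert per simulation step.

from pymongo import MongoClient

sim_id = "run_001"                       # hypothetical simulation identifier
num_steps, num_agents = 10, 1000

client = MongoClient("localhost", 27017)           # connect once, reuse for the whole run
coll = client["simulations"]["ins_" + sim_id]      # one collection per simulation process

for step in range(num_steps):
    # gather the whole step in memory, then issue one batched insert per step
    docs = [{"agent": n, "time": step, "x": 1.0 + 0.1 * step, "y": 8.0 + 0.1 * step}
            for n in range(num_agents)]
    coll.insert_many(docs)

client.close()

This avoids both failure modes at once: the namespace hashtable no longer fills up with per-agent collections, and mongod no longer has to service a connect/disconnect cycle per agent.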

Related

retrieving RDS license

I have a question. I am running a PowerShell command to retrieve RDS LicenseKeyPack details. According to Win32_TSLicenseKeyPack, KeyPackType should be a number between 0 and 6, yet I am getting 7 in some outputs. What does this mean?
Example output:
KeyPackId : 3
KeyPackType : 2
ProductVersion : Windows Server 2016
TypeAndModel : RDS Per User CAL
AvailableLicenses : 48
IssuedLicenses : 1202
ExpirationDate : 20380101000000.000000-000
KeyPackId : 5
KeyPackType : 7
ProductVersion : Windows Server 2012
TypeAndModel : RDS Per User CAL
AvailableLicenses : 0
IssuedLicenses : 1
ExpirationDate : 20380119031407.000000-000
KeyPackId : 7
KeyPackType : 7
ProductVersion : Windows Server 2016
TypeAndModel : RDS Per User CAL
AvailableLicenses : 0
IssuedLicenses : 49
ExpirationDate : 20380119031407.000000-000

MongoDB query to compute percentage

I am new to MongoDB and kind of stuck on this query. Any help/guidance will be highly appreciated. I am not able to calculate the percentage in the desired way: something is wrong with my pipeline, so the prerequisites of the percentage are not computed correctly. Below I provide my unsuccessful attempt along with the desired output.
Single entry in the collection looks like below:
_id : ObjectId("602fb382f060fff5419fd0d1")
time : "2019/05/02 00:00:00"
station_id : 3544
station_name : "Underhill Ave & Pacific St"
station_status : "In Service"
latitude : 40.6804836
longitude : -73.9646795
zipcode : 11238
borough : "Brooklyn"
neighbourhood : "Prospect Heights"
available_bikes : 5
available_docks : 21
The query I am trying to solve is:
Given a station_id (e.g., 522) and a num_hours (e.g., 3) passed as parameters:
- Consider only the measurements where station_status = "In Service".
- Consider only the measurements for that concrete "station_id".
- Compute the percentage of measurements with available_bikes = 0 for each hour of the day (e.g., for the period [8am, 9am) the percentage is 15.06% and for the period [9am, 10am) the percentage is 27.32%).
- Sort the percentage results in decreasing order.
- Return the top "num_hours" documents.
The desired output is:
--- DOCUMENT 0 INFO ---
---------------------------------
hour : 19
percentage : 65.37
total_measurements : 283
zero_bikes_measurements : 185
---------------------------------
--- DOCUMENT 1 INFO ---
---------------------------------
hour : 21
percentage : 64.79
total_measurements : 284
zero_bikes_measurements : 184
---------------------------------
--- DOCUMENT 2 INFO ---
---------------------------------
hour : 00
percentage : 63.73
total_measurements : 284
zero_bikes_measurements : 181
My attempt is:
command_1 = {"$match": {"station_status": "In Service", "station_id": station_id, "available_bikes": 0}}
my_query.append(command_1)
command_2 = {"$group": {"_id": "null", "total_measurements": {"$sum": 1}}}
my_query.append(command_2)
command_3 = {"$project": {"_id": 0,
                          "station_id": 1,
                          "station_status": 1,
                          "hour": {"$substr": ["$time", 11, 2]},
                          "available_bikes": 1,
                          "total_measurements": {"$sum": 1}}}
my_query.append(command_3)
command_4 = {"$group": {"_id": "$hour", "zero_bikes_measurements": {"$sum": 1}}}
my_query.append(command_4)
command_5 = {"$project": {"percent": {"$multiply": [{"$divide": ["$total_measurements", "$zero_bikes_measurements"]}, 100]}}}
my_query.append(command_5)
I've taken a look at this and I'm going to offer some sincere advice:
Don't try and do this in an aggregate query. Just go back to basics and pull the numbers out using find()s and then calculate the numbers in python.
If you want to persist with an aggregate query, note that your $match command filters on available_bikes equal to zero, so you never have the total number of measurements and can never compute the percentage. Also, once you have done your first $group, you "lose" your projection: at that point in the pipeline you only have total_measurements and nothing else (comment out commands 3 to 5 to see what I mean).
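That said, if you do want it in one pipeline, here is a hedged sketch of one way to write it, assuming pymongo and the document layout shown above (the database/collection names are hypothetical). The key idea is to keep available_bikes out of the $match and count the zero-bike measurements inside the $group with $cond:

from pymongo import MongoClient

client = MongoClient("localhost", 27017)
coll = client["citibike"]["measurements"]   # hypothetical database/collection names

station_id = 522
num_hours = 3

pipeline = [
    # match on station and status only, so the per-hour totals stay available
    {"$match": {"station_status": "In Service", "station_id": station_id}},
    {"$group": {
        "_id": {"$substr": ["$time", 11, 2]},          # the hour, e.g. "08"
        "total_measurements": {"$sum": 1},
        "zero_bikes_measurements": {
            "$sum": {"$cond": [{"$eq": ["$available_bikes", 0]}, 1, 0]}},
    }},
    {"$project": {
        "_id": 0,
        "hour": "$_id",
        "total_measurements": 1,
        "zero_bikes_measurements": 1,
        "percentage": {"$multiply": [
            {"$divide": ["$zero_bikes_measurements", "$total_measurements"]}, 100]},
    }},
    {"$sort": {"percentage": -1}},
    {"$limit": num_hours},
]

for doc in coll.aggregate(pipeline):
    print(doc)

Note that the original command_5 also had the $divide operands swapped (total divided by zero-count); the percentage is zero_bikes_measurements / total_measurements * 100.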

MongoDB benchmarking inserts

I am trying to benchmark MongoDB inserts with the JS benchmarking harness, following the example given on the Mongo website.
The insert operation itself works totally fine, but the reported queries/sec is wrong.
ops = [{op: "insert", ns: "benchmark.bench", safe: false, doc: {"a": 1}}]
The above works fine. Then I ran the following in the mongo shell:
for ( x = 1; x <= 128; x *= 2 ) {
    res = benchRun( { parallel : x,
                      seconds : 5,
                      ops : ops } )
    print( "threads: " + x + "\t queries/sec: " + res.query )
}
It gives out:
threads: 1 queries/sec: 0
threads: 2 queries/sec: 0
threads: 4 queries/sec: 0
threads: 8 queries/sec: 0
threads: 16 queries/sec: 0
threads: 32 queries/sec: 1.4
threads: 64 queries/sec: 0
threads: 128 queries/sec: 0
I don't understand why the queries/sec is 0, as if not a single doc had been inserted. Is this the right way of testing insert performance?
Answering because I just encountered a similar problem.
Try replacing your print statement with printjson(res).
You will see that res has the following fields:
{
    "note" : "values per second",
    "errCount" : NumberLong(0),
    "trapped" : "error: not implemented",
    "insertLatencyAverageMicros" : 8.173300153139357,
    "totalOps" : NumberLong(130600),
    "totalOps/s" : 25366.173139864142,
    "findOne" : 0,
    "insert" : 25366.173139864142,
    "delete" : 0,
    "update" : 0,
    "query" : 0,
    "command" : 0
}
As you can see, the query count is 0, hence when you print res.query it gives 0. To get the number of insert operations per second you would want to print res.insert. I believe res.query corresponds to the "find" operation.
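As a quick sanity check on those numbers: totalOps is 130,600 and totalOps/s is ~25,366, and 130600 / 25366.17 ≈ 5.15 seconds, which matches the 5-second run; so res.insert (≈ 25,366 inserts/sec) is the figure the loop should print instead of res.query.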

MongoDB chunk migration steps in changelog collection

Can someone clarify all the steps for "moveChunk.from" and "moveChunk.to"? I want to know which operations are performed at each step (I guess the value of each step represents the time in ms it took). This will help me identify the slowest step occurring during chunk migration.
{
    "_id" : "bdvlpabhishekk-2013-07-20T17:46:28-51eaccf40c5c5c12e0e451d5",
    "server" : "bdvlpabhishekk",
    "clientAddr" : "127.0.0.1:50933",
    "time" : ISODate("2013-07-20T17:46:28.589Z"),
    "what" : "moveChunk.from",
    "ns" : "test.test",
    "details" : {
        "min" : {
            "key1" : 151110
        },
        "max" : {
            "key1" : 171315
        },
        "step1 of 6" : 0,
        "step2 of 6" : 1,
        "step3 of 6" : 60,
        "step4 of 6" : 2067,
        "step5 of 6" : 7,
        "step6 of 6" : 0
    }
}
{
    "_id" : "bdvlpabhishekk-2013-07-20T17:46:31-51eaccf7d6a98a5663942b06",
    "server" : "bdvlpabhishekk",
    "clientAddr" : ":27017",
    "time" : ISODate("2013-07-20T17:46:31.671Z"),
    "what" : "moveChunk.to",
    "ns" : "test.test",
    "details" : {
        "min" : {
            "key1" : 171315
        },
        "max" : {
            "key1" : 192199
        },
        "step1 of 5" : 0,
        "step2 of 5" : 0,
        "step3 of 5" : 1712,
        "step4 of 5" : 0,
        "step5 of 5" : 344
    }
}
All these steps are explained in the "M202: MONGODB ADVANCED DEPLOYMENT AND OPERATIONS" course, which is available online for free (I can't post the link here because of Stack Overflow's limit on the number of posted URLs; just search for the course on Google).
Related videos from this course are: Anatomy of a migration overview and Anatomy of a migration deep dive.
The explanation follows.
All time values are in milliseconds.
Let's say F is a "moveChunk.from" and T is a "moveChunk.to". The steps are F1..F6 and T1..T5. The steps are performed sequentially: F1, F2, F3, F4: {T1, T2, T3, T4, T5}, F5, F6. Step F4 includes {T1..T5}, and the timing of F4 is roughly the sum of T1..T5 (there is no exact match).
F1 - mongos sends the 'moveChunk' command to F (the primary of the shard to migrate from)
F2 - sanity checking of the command
F3 - F sends the command to T ("read this chunk from me")
F4, T1..T3 - transfer starts; T performs sanity checking, index checking, etc.
F4, T4 - catch up on subsequent operations (if there were inserts into the chunk while the transfer was going, send the updates from F to T)
F4, T5 - steady state (changes are written to the primary log)
F5 - about to commit the new chunk location to the config servers (critical section)
F6 - clean up
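As a rough check against the example documents above: T1 + T2 + T3 + T4 + T5 = 0 + 0 + 1712 + 0 + 344 = 2056 ms, which is close to (but, as noted, not exactly) F4 = 2067 ms.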
All chunk migrations use the following procedure:
1. The balancer process sends the moveChunk command to the source shard.
2. The source starts the move with an internal moveChunk command. During the migration process, operations to the chunk route to the source shard. The source shard is responsible for incoming write operations for the chunk.
3. The destination shard begins requesting documents in the chunk and starts receiving copies of the data.
4. After receiving the final document in the chunk, the destination shard starts a synchronization process to ensure that it has the changes to the migrated documents that occurred during the migration.
5. When fully synchronized, the destination shard connects to the config database and updates the cluster metadata with the new location for the chunk.
6. After the destination shard completes the update of the metadata, and once there are no open cursors on the chunk, the source shard deletes its copy of the documents.
Taken from the MongoDB documentation on chunk migration.

$pull operation in MongoDB not working for me

I have a document with the following key-array pair:
"home" : [
"Kevin Garnett",
"Paul Pierce",
"Rajon Rondo",
"Brandon Bass",
" 5 sec inbound",
"Kevin Seraphin"
]
I want to remove the element " 5 sec inbound" from the array and use the following command (in the MongoDB shell):
>coll.update({},{"$pull":{"home":" 5 sec inbound"}})
This is not working as verified by a query:
>coll.findOne({"home":/5 sec inbound/})
"home" : [
"Kevin Garnett",
"Paul Pierce",
"Rajon Rondo",
"Brandon Bass",
" 5 sec inbound",
"Kevin Seraphin"
]
Any help would be greatly appreciated!
That very same statement works for me:
> db.test.insert({"home" : [
... "Kevin Garnett",
... "Paul Pierce",
... "Rajon Rondo",
... "Brandon Bass",
... " 5 sec inbound",
... "Kevin Seraphin"
... ]})
> db.test.find({"home":/5 sec inbound/}).count()
1
> db.test.update({},{"$pull":{"home":" 5 sec inbound"}})
> db.test.find({"home":/5 sec inbound/}).count()
0
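One thing worth checking if the statement still does nothing on your side: by default update() modifies only the first document matching the query, so if the collection holds several documents the $pull may be applied to a different one than the one you are inspecting. Passing {multi: true} as the third argument, e.g. db.test.update({}, {"$pull": {"home": " 5 sec inbound"}}, {multi: true}), applies the $pull to every matching document.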