How to use Casbah MongoDB connections?

Note: I realise there is a similar question on SO, but it talks about an old version of Casbah, and the behaviour explained in its answer is not what I see!
I was under the impression that Casbah's MongoClient handled connection pooling. However, running lsof on my process I see a large and growing number of MongoDB connections, which makes me doubt this pooling actually exists.
Basically, this is what I'm doing:
class MongodbDataStore {
  val mongoClient = MongoClient("host", 27017)("database")

  def getObject1(): Object1 = {
    val collection = mongoClient("object1Collection")
    ...
  }

  def getObject2(): Object2 = {
    val collection = mongoClient("object2Collection")
    ...
  }
}
So, I never close MongoClient.
Should I be closing it after every query? Should I implement my own pooling? What is the right approach?
Thank you

Casbah is a wrapper around the MongoDB Java driver, so connections are actually managed by the Java driver.
According to the Java driver documentation (http://docs.mongodb.org/ecosystem/drivers/java-concurrency/):
If you are using in a web serving environment, for example, you should
create a single MongoClient instance, and you can use it in every
request. The MongoClient object maintains an internal pool of
connections to the database (default maximum pool size of 100). For
every request to the DB (find, insert, etc) the Java thread will
obtain a connection from the pool, execute the operation, and release
the connection. This means the connection (socket) used may be
different each time.
By the way, that's what I've experienced in production. I did not see any problem with this.
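In other words, the pattern in the question is fine as long as the MongoClient is created once and shared. A minimal sketch of that, reusing the Casbah API from the question (the Mongo holder object and the DBObject return types are illustrative, not part of the original code):

import com.mongodb.casbah.Imports._

// One client for the whole process; the underlying Java driver keeps the pool.
object Mongo {
  val client = MongoClient("host", 27017)
  val db = client("database")
}

class MongodbDataStore {
  // Collections are cheap handles; it is the shared client that owns the sockets.
  def getObject1(): Option[DBObject] = Mongo.db("object1Collection").findOne()
  def getObject2(): Option[DBObject] = Mongo.db("object2Collection").findOne()
}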

Related

MongoDB - MongoClientSettings property deprecation

In my .NET Core 3.1 application I have configured the MongoDB waitQueueMultiple and maxPoolSize options in the Mongo connection string.
I would like to set these parameters in MongoClientSettings instead of in the connection string, but I read that from version 2.12.3 of the MongoDB driver waitQueueSize will be deprecated, and I don't understand what the alternative will be.
Do you have any suggestions?
This is how my code is configured now:
var url = new MongoUrl(_mongoDbConfiguration.ConnectionString);
var settings = MongoClientSettings.FromUrl(url);
settings.MaxConnectionPoolSize = _mongoDbConfiguration.MaxPoolSize;
settings.WaitQueueSize = _mongoDbConfiguration.WaitQueueMultiple;
var client = new MongoClient(settings);
var database = client.GetDatabase(url.DatabaseName);
return database;
Thanks,
Dave.
I found this documentation while digging around.
It comes from PyMongo, but I think it is useful for the .NET driver as well:
waitQueueMultiple has been deprecated without replacement. This option was a poor solution for putting an upper bound on queuing since it didn't affect queuing in other parts of the driver.
Once the pool reaches its maximum size, additional threads have to wait for sockets to become available. PyMongo does not limit the number of threads that can wait for sockets to become available and it is the application's responsibility to limit the size of its thread pool to bound queuing during a load spike. Threads are allowed to wait for any length of time unless waitQueueTimeoutMS is defined.
The default value for waitQueueTimeout is 2 minutes, as per the C# driver docs.
So in my application I expose both MaxConnectionPoolSize and WaitQueueTimeout as settings.
If they are not configured, the application falls back to the driver's default values.
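For comparison, the same two knobs exist on the JVM driver's settings builder (a different driver from the question's .NET one; the sketch below is in Scala, and the URL and values are placeholders, not taken from the question's configuration):

import java.util.concurrent.TimeUnit
import com.mongodb.{ConnectionString, MongoClientSettings}
import com.mongodb.client.MongoClients

// Cap the pool size and bound how long a thread may wait for a connection,
// i.e. the analogues of MaxConnectionPoolSize and WaitQueueTimeout.
val settings = MongoClientSettings.builder()
  .applyConnectionString(new ConnectionString("mongodb://localhost:27017/mydb"))
  .applyToConnectionPoolSettings { b =>
    b.maxSize(100).maxWaitTime(2, TimeUnit.MINUTES)
    ()
  }
  .build()

val client = MongoClients.create(settings)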

Can I reuse a connection in MongoDB? How do these connections actually work?

While trying to do some simple things with MongoDB, I got stuck on something that feels kind of strange to me.
from pymongo import MongoClient

client = MongoClient(connection_string)
db = client.database
print(db)
client.close()
I thought that when you make a connection, only that one is used throughout the rest of the code until the close() method is called. But it doesn't seem to work that way... I don't know how I ended up with 9 connections when it was supposed to be a single one, and even if each 'request' were a connection, that would still be too many.
For now it's not a big problem; it just bothers me that I don't know exactly how this works!
When you call MongoClient(), you are not establishing just one connection. You are creating the client, which maintains a connection pool. When you make one or more requests, the driver uses an available connection from the pool; when the operation completes, the connection goes back to the pool.
Calling the MongoClient constructor every time you need to talk to the db is very bad practice and incurs the handshake penalty each time. Use dependency injection or a singleton to hold the MongoClient.
According to the documentation, you should create one client per process.
Your code seems to be the correct way if it is a single-threaded process. If you don't need any more connections to the server, you can limit the pool size by explicitly specifying the number:
client = MongoClient(host, port, maxPoolSize=<num>)
On the other hand, if the code might later need the same connection again, it is better to simply create the client once at the beginning and use it across the code.

Vertx to mongoDB connections

I'm working on a Java/Vert.x project where the backend is MongoDB (I have been working with Elixir/Erlang for some time, and I'm quite new to Vert.x, but I believe it's the best fit). Basically, I have an HTTP API handled by some HttpServerVerticles which need to store data to (or retrieve data from) the Mongo DB and to send the appropriate reply to the API caller. I'm looking for the right pattern to implement the queries and the handling of the replies.
From the official guide and some tutorials, I see that for a relational JDBC database, it is necessary to define a dedicated verticle that will handle queries asynchronously. This was my first try with the mongo client but it introduces a lot of boilerplate.
On the other hand, from the mongo client documentation I read that it is "completely non-blocking" and that it has its own connection pool. Does that mean that we can safely (from the Vert.x event loop point of view) define and use the mongo client directly in the HTTP verticle?
Is there any alternative pattern?
Versions: Vert.x 3.5.4 / MongoDB 4.0.3
It's like this: the Mongo connection pool, exactly like an SQL connection pool, is synchronous and blocking in its nature, but it is wrapped in a non-blocking Vert.x API.
So, instead of the normal blocking style
JsonObject obj = mongo.get(someQuery);
you get a non-blocking call out of the box:
mongo.findOne("collectionName", someQuery, null /* null fields = return all fields */, res -> {
    JsonObject obj = res.result();
    doStuff(obj);
});
That means you can safely use it directly on the event loop in any type of verticle without reinventing the asynchronous wheel over and over again.
At our client we use mongodb-driver-rx. Vert.x has support for Rx (vertx-rx-java), and it fits pretty well with mongodb-driver-rx.
For more information see:
https://mongodb.github.io/mongo-java-driver-rx/
https://vertx.io/docs/vertx-rx/java/
https://github.com/vert-x3/vertx-examples/blob/master/rxjava-2-examples/src/main/java/io/vertx/example/reactivex/database/mongo/Client.java

Server Swift with mongodb manager singleton

I am working on a project using Vapor and Mongodb.
Let's say that at a specific route
drop.get("user", String.self) { request, user in
// ... query Mongodb
}
I want to query the database and see if an input user already exists.
Is it wise to have a singleton MongoManager class that handles all the connection with the database?
drop.get("user", String.self) { request, user in
MongoManager.sharedInstance.findUser(user)
}
Do I create a bottleneck with this implementation?
No, you will not create a bottleneck unless you have a single-threaded mechanism that stands between your Vapor Handler and MongoDB.
MongoKitten (the underlying driver for Swift + MongoDB projects) manages the connection pool internally. You can blindly fire queries at MongoKitten and it'll figure out what connection to use or will create a new one if necessary.
Users of MongoKitten 3 will use a single connection per request. If multiple requests are being handled simultaneously, additional connections will be opened.
Users of MongoKitten 4 will use a single connection per 3 requests; this is configurable. The connection pool will expand by opening more connections if too many requests are in flight at once.
Users of the upcoming Meow ORM (which works similarly to what you're building) will use a single connection per thread. The connection pool will expand if all connections are reserved.

Re-using sessions in ScalaQuery?

I need to do small (but frequent) operations on my database from one of my API methods. When I wrap each one in its own "withSession" block, I get terrible performance.
db withSession {
    SomeTable.insert(a, b)
}
Running the above example 100 times takes 22 seconds. Running them all in a single session is instantaneous.
Is there a way to re-use the session in subsequent function invocations?
Do you have some type of connection pooling in place (see JDBC Connection Pooling: Connection Reuse?)? If not, you'll be opening a new connection for every withSession(...), and that is a very slow approach. See http://groups.google.com/group/scalaquery/browse_thread/thread/9c32a2211aa8cea9 for a description of how to use c3p0 with ScalaQuery.
If you use a managed resource from an application server you'll usually get this for "free", but in stand-alone servers (for example Jetty) you'll have to configure this yourself.
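A minimal sketch of that approach, assuming c3p0's ComboPooledDataSource and ScalaQuery's Database.forDataSource (the driver class, JDBC URL, and pool sizes are placeholders; SomeTable, a and b are the ones from the question):

import com.mchange.v2.c3p0.ComboPooledDataSource
import org.scalaquery.session.Database

// Build a pooled DataSource once at application startup.
val ds = new ComboPooledDataSource
ds.setDriverClass("com.mysql.jdbc.Driver")
ds.setJdbcUrl("jdbc:mysql://localhost/mydb")
ds.setMinPoolSize(5)
ds.setMaxPoolSize(20)

// Sessions handed out by this Database reuse pooled connections,
// so frequent small withSession blocks stay cheap.
val db = Database.forDataSource(ds)

db withSession {
    SomeTable.insert(a, b)
}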
I'm probably stating the blindingly obvious, but you could just put more calls inside the withSession block, like:
db withSession {
    SomeTable.insert(a, b)
    SomeOtherTable.insert(a, b)
}
Alternatively, you can create an implicit session, do your work, and then close it when you're done:
implicit val session = db.createSession
SomeTable.insert(a,b)
SomeOtherTable.insert(a,b)
session.close