We have mongodb as datastorage, and there is a MongoClient which we are using for connection pooling.
Question is whether to explicitly use the MongoClient.close to shutdown the connection pool or not.
Here's what I have explored on this so far.
The documentation for the close API says
Closes all resources associated with this instance, in particular any open network connections. Once called, this instance and any databases obtained from it can no longer be used.
But when I referred other questions on this topic, it says you can perform your operations and don't need to explicitly manage operations like MongoClient.close, as this object manages connection pooling automatically.
Java MongoDB connection pool
Both of them are contradicting. If I were to follow the second, what will be the downsides of it?
Will the connections in the pool be closed when the mongoclient object is de-referenced from jvm?
or will the connections stay open for a particular period of time and then expire?
I would like to know what are the actual downsides of this approach. Any pointers on this is highly appreciated.
IMO, using close on server shut down seems to be the clean way to do it.
But I would like to get an expert opinion on this.
Update: There is no need to explicitly close the connection pool via API. Mongo driver takes care of it.
Related
Forgive me if there's already been an answer to this somewhere, but I haven't seen anything definitive in the docs.
Is there a limit to the size of the connection pool?
I have a situation where there could be 100s or 1000s of connections open at once - should the connection pool be used for this or would that be an abuse of the feature?
Is there a limit to the size of the connection pool?
Probably not, however a greater concern is that each connection will take up RAM.
I have a situation where there could be 100s or 1000s of connections open at once - should the connection pool be used for this or would that be an abuse of the feature?
I don't see this being an abuse. I think during the times you have 100 or 1000 clients connecting at once, the server would handle the connections much better.
However if there is only 10 clients connecting and you have a connection pool of 1000, 900 connections could be seen as wasted resources.
Source: Deep Dive into Connection Pooling
I do not have any experience with connection pools, so don't take my information as granted. I would like to hear from someone with experience on the topic.
It turns out that using the connection pool to manage traffic to more than one database isn't possible - which is what I was aiming to do. The solution I'll be using involves creating numerous connections using createConnection(), closing any unused ones.
See an issue I opened in the Mongoose git project for a fuller explanation https://github.com/Automattic/mongoose/issues/6206
I sometimes get the following exception
SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
While using play framework and scalikeJDBC to connect to a MariaDB instance
Googling around showed it can either be that connections aren't being properly closed or that I should configure my thread pool to be bigger
Now onto the actual question:
I'd like to investigate further but I need a way to monitor said connection thread pool, ideally in the form a graph of sorts, but how?
I have no idea how to configure JMX and MBeans for netty (Play uses netty, right?) or if it's possible at all and google is not helping. I don't even know if this would be the right approach so I'm giving a bounty sized ammount of points (or even more) to whoever can provide a sweet set of steps on how to proceed
A bit of googling lends me to think that scalikeJDBC use commons dbcp as the underlying connection pool.
More googling links back to Monitoring for Commons DBCP? , which proposes to monitor exactly what you want I believe !
I have been testing spymemcached and xmemcached clients. I have been trying to find answers in the projects documentation but it is very poor.
My questions are regarding opening, closing and reusing the connections. I found this in one document:
A client may just close the connection at any moment it no longer needs it. Note,
however, that clients are encouraged to cache their connections rather
than reopen them every time they need to store or retrieve data. Caching connections will eliminate the overhead associated with establishing a TCP connection".
Spymemcached doesn't provide a connection pool, so every time I create a MemcachedClient instance I am creating a new connection right? Then when should I close the connection? Should I provide the same instance to all the threads in my application or create a new one every time?
xmemcached does have a connection pool. In this case should I close connections I get from the pool?
Spymemcached doesn't provide a connection pool, so every time I create a MemcachedClient instance I am creating a new connection right?
Yes, every time you create a new MemcachedClient object you create a new connection. Each connection appear asynchronous to the application so even having one connection will probably be enough for your application. Some people do however build a connection pool of MemcachedClients though.
Then when should I close the connection?
You shut down connections as soon as you no longer need to communicate with memcached. If you application is short lived you need to shutdown the connection in order to get the jvm to stop since MemcachedClient connections are daemon connections by default.
Should I provide the same instance to all the threads in my application or create a new one every time?
Use the same connection with multiple threads. Creating a new connection for each call will cause a significant performance drop because of the overhead of creating a TCP connection.
xmemcached does have a connection pool. In this case should I close connections I get from the pool?
I'm not familiar with xmemcached, but I would imagine you would only want to create a few (16 maybe) threads and share them with your application threads for the best performance.
i am going to have a website with 20k+ concurrent users.
i am going to use mongodb using one management node and 3 or more nodes for data sharding.
now my problem is maximum connections. if i have that many users accessing the database, how can i make sure they don't reach the maximum limit? also do i have to change anything maybe on the kernel to increase the connections?
basically the database will be used to keep hold of connected users to the site, so there are going to be heavy read/write operations.
thank you in advance.
You don't want to open a new database connection each time a new user connects. I don't know if you'll be able to scale to 20k+ concurrent users easily, since MongoDB uses a new thread for each new connection. You want your web app backend to have just one to a few database connections open and just use those in a pool, particularly since web usage is very asynchronous and event driven.
see: http://www.mongodb.org/display/DOCS/Connections
The server will use one thread per TCP
connection, therefore it is highly recomended that your application
use some sort of connection pooling. Luckily, most drivers handle this
for you behind the scenes. One notable exception is setups where your
app spawns a new process for each request, such as CGI and some
configurations of PHP.
Whatever driver you're using, you'll have to find out how they handle connections and if they pool or not. For instance, Node's Mongoose is non-blocking and so you use one connection per app usually. This is the kind of thing you probably want.
I have this question asked in the Go mailing list, but I think it is more general to get better response from SO.
When work with Java/.Net platform, I never had to manage database connection manually as the drivers handle it. Now, when try to connect to a no sql db with very basic driver support, it is my responsibility to manage the connection. The driver let connect, close, reconnect to a tcp port, but not sure how should i manage it (see the link). Do i have to create a new connection for each db request? can I use other 3rd party connection pooling libraries?
thanks.
I don't know enough about MongoDB to answer this question directly, but do you know how MongoDB handles requests over TCP? For example, one problem with a single TCP connection can be that the db will handle each request serially, potentially causing high latency even though it may be bottlenecking on a single machine and could handle a higher capacity.
Are the machines all running on a local network? If so, the cost of opening a new connection won't be too high, and might even be insignificant from a performance perspective regardless.
My two cents: Do one TCP connection per request and just profile it and see what happens. It is very easy to add pooling later if you're DoSing yourself, but it may never be a problem. That'll work right now, and you won't have to mess around with a third party library that may cause more problems than it solves.
Also, TCP programming is really easy. Don't be intimidated by it, detecting a closed socket, and reconnecting synchronously or asynchronously is simple.
Most mongodb drivers (clients) will create and use a connection pool when connecting to the server. Each socket (connection) can do one operation at a time at the server; because of how data is read off the socket you can issue many requests and server will just get them one after another and return data as each one completes.
There is a Go mongo db driver but it doesn't seem to do connection pooling. http://github.com/mikejs/gomongo
In addition to the answers here: if you find you do need to do some kind of connection pooling redis.go is a decent example of a database driver that pools connections. Specifically, look at the Client.popCon and Client.pushCon methods in the source.