TCP connection management - mongodb

I asked this question on the Go mailing list, but I think it is general enough to get a better response on SO.
When working with the Java/.NET platforms, I never had to manage database connections manually because the drivers handle it. Now that I am trying to connect to a NoSQL db with very basic driver support, it is my responsibility to manage the connection. The driver lets me connect, close, and reconnect to a TCP port, but I am not sure how I should manage it (see the link). Do I have to create a new connection for each db request? Can I use other third-party connection pooling libraries?
Thanks.

I don't know enough about MongoDB to answer this question directly, but do you know how MongoDB handles requests over TCP? For example, one problem with a single TCP connection can be that the db handles each request on it serially, which can cause high latency even though the bottleneck is that one connection and the server could handle a higher load.
Are the machines all running on a local network? If so, the cost of opening a new connection won't be too high, and might even be insignificant from a performance perspective regardless.
My two cents: open one TCP connection per request, profile it, and see what happens. It is very easy to add pooling later if you're DoSing yourself, but it may never be a problem. That'll work right now, and you won't have to mess around with a third-party library that may cause more problems than it solves.
Also, TCP programming is really easy. Don't be intimidated by it: detecting a closed socket and reconnecting, synchronously or asynchronously, is simple.
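As a rough illustration of "detect a closed socket and reconnect", here is a minimal Go sketch. The reconnectingClient type, the one-line-per-request protocol, and the local test server are all assumptions made for the example; they are not part of any MongoDB driver. It treats any I/O error as a dead socket and redials once.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"sync"
	"time"
)

// reconnectingClient is a hypothetical wrapper that owns one TCP connection
// and redials when an exchange fails (for example, because the peer closed).
type reconnectingClient struct {
	mu   sync.Mutex
	addr string
	conn net.Conn
	r    *bufio.Reader
}

// roundTrip sends one line and reads one line back, redialing once on failure.
// Retrying is only safe for idempotent requests; that is an assumption here.
func (c *reconnectingClient) roundTrip(line string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	var lastErr error
	for attempt := 0; attempt < 2; attempt++ {
		if c.conn == nil {
			conn, err := net.DialTimeout("tcp", c.addr, 2*time.Second)
			if err != nil {
				return "", err
			}
			c.conn, c.r = conn, bufio.NewReader(conn)
		}
		c.conn.SetDeadline(time.Now().Add(2 * time.Second))
		_, err := fmt.Fprintln(c.conn, line)
		var reply string
		if err == nil {
			reply, err = c.r.ReadString('\n')
		}
		if err == nil {
			return reply, nil
		}
		// Any I/O error (EOF, reset, timeout) marks the socket as dead.
		c.conn.Close()
		c.conn, lastErr = nil, err
	}
	return "", lastErr
}

// startOneShotEcho starts a local test server that echoes a single line per
// connection and then closes it, which forces the client to reconnect.
func startOneShotEcho() (addr string, stop func()) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	go func() {
		for {
			conn, err := ln.Accept()
			if err != nil {
				return // listener closed
			}
			go func(c net.Conn) {
				defer c.Close()
				if line, err := bufio.NewReader(c).ReadString('\n'); err == nil {
					fmt.Fprint(c, line)
				}
			}(conn)
		}
	}()
	return ln.Addr().String(), func() { ln.Close() }
}

func main() {
	addr, stop := startOneShotEcho()
	defer stop()
	c := &reconnectingClient{addr: addr}
	for _, msg := range []string{"first", "second"} {
		reply, err := c.roundTrip(msg)
		fmt.Printf("%q %v\n", reply, err)
	}
}
```

The second request lands on a socket the server has already closed, so the client sees a read error, drops the connection, and succeeds on a fresh one.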

Most MongoDB drivers (clients) create and use a connection pool when connecting to the server. Each socket (connection) can do one operation at a time at the server; because of how data is read off the socket, you can issue many requests and the server will process them one after another, returning data as each one completes.
There is a Go MongoDB driver, but it doesn't seem to do connection pooling: http://github.com/mikejs/gomongo

In addition to the answers here: if you find you do need some kind of connection pooling, redis.go is a decent example of a database driver that pools connections. Specifically, look at the Client.popCon and Client.pushCon methods in the source.
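The general pop/push idea can be sketched in Go with a buffered channel. This is not a copy of redis.go; the connPool type, its method names, and the local sink server are hypothetical and exist only for illustration. The channel holds a fixed number of slots, and an empty (nil) slot means "dial on demand".

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// connPool is a hypothetical fixed-size pool. A buffered channel of slots
// caps how many connections exist; a nil slot means "dial on demand".
type connPool struct {
	addr  string
	slots chan net.Conn
}

func newConnPool(addr string, size int) *connPool {
	p := &connPool{addr: addr, slots: make(chan net.Conn, size)}
	for i := 0; i < size; i++ {
		p.slots <- nil
	}
	return p
}

// pop blocks until a slot is free, dialing lazily if the slot is empty.
func (p *connPool) pop() (net.Conn, error) {
	conn := <-p.slots
	if conn != nil {
		return conn, nil
	}
	conn, err := net.Dial("tcp", p.addr)
	if err != nil {
		p.slots <- nil // give the slot back so callers do not starve
		return nil, err
	}
	return conn, nil
}

// push returns a slot to the pool. Pass broken=true after an I/O error so
// the connection is closed and the slot is redialed on the next pop.
func (p *connPool) push(conn net.Conn, broken bool) {
	if broken && conn != nil {
		conn.Close()
		conn = nil
	}
	p.slots <- conn
}

// startSink starts a local test server that accepts connections and
// discards whatever they send.
func startSink() (addr string, stop func()) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	go func() {
		for {
			conn, err := ln.Accept()
			if err != nil {
				return // listener closed
			}
			go io.Copy(io.Discard, conn)
		}
	}()
	return ln.Addr().String(), func() { ln.Close() }
}

func main() {
	addr, stop := startSink()
	defer stop()
	pool := newConnPool(addr, 1)
	first, err := pool.pop()
	if err != nil {
		panic(err)
	}
	pool.push(first, false)
	second, err := pool.pop()
	if err != nil {
		panic(err)
	}
	fmt.Println("reused:", second == first)
	pool.push(second, false)
}
```

Because pop blocks when every slot is checked out, the pool size also acts as a cap on concurrent requests to the database.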

Related

MongoDB Connection Pooling Shutdown

We use MongoDB as our data storage, and we use a MongoClient for connection pooling.
The question is whether to explicitly call MongoClient.close to shut down the connection pool or not.
Here's what I have explored on this so far.
The documentation for the close API says:
Closes all resources associated with this instance, in particular any open network connections. Once called, this instance and any databases obtained from it can no longer be used.
But other questions on this topic say that you can perform your operations without explicitly calling methods like MongoClient.close, because this object manages connection pooling automatically.
Java MongoDB connection pool
The two contradict each other. If I were to follow the second, what would the downsides be?
Will the connections in the pool be closed when the MongoClient object is dereferenced in the JVM?
Or will the connections stay open for a particular period of time and then expire?
I would like to know the actual downsides of this approach. Any pointers on this are highly appreciated.
IMO, calling close on server shutdown seems to be the clean way to do it.
But I would like to get an expert opinion on this.
Update: There is no need to explicitly close the connection pool via the API. The Mongo driver takes care of it.

golang grpc socket tuning

I have a Go client application talking to a server via gRPC. I noticed that while the application is running, the number of sockets open on the client keeps climbing until it reaches around 9000, at which point I pause the client. However, even with no more traffic between the client and the server, the number of sockets stayed at that level after 8 hours.
Is there any way to tune gRPC socket usage, such as closing sockets after a timeout? Is using streaming another way to limit the number of sockets being opened?
Thanks for any help.
I'd start by making sure that your client application cleans up unused connections (grpc.ClientConn) by calling the Close() method on each of them.
Also, since I don't know exactly what your application does, I'm going to go ahead and suggest reusing connections for multiple RPCs (you're probably already doing this).
And to answer your question about setting a timeout or deadline on connections:
1. You shouldn't have to do this. Feel free to open an issue on https://github.com/grpc/grpc-go about whatever gRPC shortcoming is forcing you to take this route.
2. But if you must know, you can use a custom dialer (https://github.com/grpc/grpc-go/blob/13975c070286c7371aa3a8b3c230e90d7bf029fc/clientconn.go#L333) and set a deadline on the net.Conn that you return from it.
Best,
Mak
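A minimal sketch of the second suggestion, using only the Go standard library. The deadlineDialer name, the demoLifetime constant, and the idle test server are assumptions made for the example. The dial function has the shape func(context.Context, string) (net.Conn, error), which I believe matches what grpc.WithContextDialer accepts in recent grpc-go releases; the commit linked above is older and its dialer signature may differ, so check the version you use.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net"
	"time"
)

// demoLifetime is an arbitrary value chosen for the demo.
const demoLifetime = 50 * time.Millisecond

// deadlineDialer returns a dial function whose connections stop working
// once maxLifetime has elapsed, because an absolute deadline is set on them.
func deadlineDialer(maxLifetime time.Duration) func(context.Context, string) (net.Conn, error) {
	return func(ctx context.Context, addr string) (net.Conn, error) {
		var d net.Dialer
		conn, err := d.DialContext(ctx, "tcp", addr)
		if err != nil {
			return nil, err
		}
		if err := conn.SetDeadline(time.Now().Add(maxLifetime)); err != nil {
			conn.Close()
			return nil, err
		}
		return conn, nil
	}
}

// dialOnce is a small helper for the demo that dials with a background context.
func dialOnce(addr string, lifetime time.Duration) (net.Conn, error) {
	return deadlineDialer(lifetime)(context.Background(), addr)
}

// expired blocks on a read and reports whether it ended with a timeout.
func expired(conn net.Conn) bool {
	_, err := conn.Read(make([]byte, 1))
	var ne net.Error
	return errors.As(err, &ne) && ne.Timeout()
}

// startIdleServer starts a local test server that accepts connections,
// keeps them open, and never sends anything.
func startIdleServer() (addr string, stop func()) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	go func() {
		for {
			conn, err := ln.Accept()
			if err != nil {
				return // listener closed
			}
			go io.Copy(io.Discard, conn)
		}
	}()
	return ln.Addr().String(), func() { ln.Close() }
}

func main() {
	addr, stop := startIdleServer()
	defer stop()
	conn, err := dialOnce(addr, demoLifetime)
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	fmt.Println("expired:", expired(conn))
}
```

Note that this is a hard lifetime rather than an idle timeout: the deadline is set once at dial time, so even a busy connection fails once it passes.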

TCP Server is overwhelmed by clients that only "connect" without sending any data

I have created a TCP server using .NET TcpListener.
I have some concerns about how it could be abused by someone spamming a lot of bogus connections, as in a DoS attack.
I created a small console app that repeatedly initiates a connection to the server (only "connect", without transmitting any data). The "max allowable concurrent connections" limit, a server setting meant to prevent it from being overwhelmed, was reached in an instant. This rendered my server pretty much useless, since it could not accept new connections until the fake connections disconnected. This proves that my concern is not unfounded.
Is there anything we can do at the application level to prevent this?
I was thinking of requiring clients to send some kind of token when connecting, with the server refusing connections that don't, but I don't think TCP works that way.
Is relying on external solutions the only way, e.g. VPN, firewall, NAT, etc.?
Set a read timeout on every accepted socket, and close it if it triggers.
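The question uses .NET's TcpListener, but the same idea can be sketched in Go with the standard net package. The idleTimeout value, the line-based "ok" reply, and the probe helper are assumptions made for the example. Each accepted socket gets a read deadline, and a client that sends nothing before it fires is closed, which frees its slot.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"time"
)

// idleTimeout is an arbitrary value chosen for the demo.
const idleTimeout = 100 * time.Millisecond

// handle serves one client and drops it as soon as a read times out.
func handle(conn net.Conn, timeout time.Duration) {
	defer conn.Close()
	r := bufio.NewReader(conn)
	for {
		// Re-arm the deadline before every read so active clients stay connected.
		conn.SetReadDeadline(time.Now().Add(timeout))
		line, err := r.ReadString('\n')
		if err != nil {
			return // timeout, EOF, or reset: free the connection slot
		}
		fmt.Fprintf(conn, "ok %s", line)
	}
}

// startServer listens on a random local port and serves until stop is called.
func startServer(timeout time.Duration) (addr string, stop func()) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	go func() {
		for {
			conn, err := ln.Accept()
			if err != nil {
				return // listener closed
			}
			go handle(conn, timeout)
		}
	}()
	return ln.Addr().String(), func() { ln.Close() }
}

// probe connects, optionally sends some text, and returns the first line
// the server sends back (or the error that ended the read).
func probe(addr, send string) (string, error) {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	conn.SetDeadline(time.Now().Add(2 * time.Second)) // client-side safety net
	if send != "" {
		fmt.Fprint(conn, send)
	}
	return bufio.NewReader(conn).ReadString('\n')
}

func main() {
	addr, stop := startServer(idleTimeout)
	defer stop()
	reply, err := probe(addr, "hi\n")
	fmt.Printf("talking client: %q %v\n", reply, err)
	reply, err = probe(addr, "")
	fmt.Printf("silent client:  %q %v\n", reply, err)
}
```

A timeout alone does not stop a client that reconnects faster than connections expire, so limits per source address or a firewall rule may still be needed on top of it.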

how do I write my own production web server?

I am making a Unix SSL server/client. So far I have implemented FD_SET with select to handle all connections concurrently in one master server process. However, due to __FD_SETSIZE, the number of clients is capped at 1024. I need to increase the number of clients and the efficiency of the server. Changing __FD_SETSIZE has potential problems (apparently?), so I am stuck.
So far the network code includes: errno.h detection, signal detection -> atomic handling, fd_set -> select(), and successful stream-socket-based communication.
I would really appreciate it if someone could tell me what I should do. Do I fork() after 1024 clients (which presents its own problems, if it's even doable)? Do I use threads to handle each client request, or just client data, or both?
What is the best network architecture in your opinion? Keep in mind it's a stream-socket-based connection that is meant to handle as much punishment as possible and allow as many clients to connect to the server as possible.
Don't write your own production web server.
There are too many open source servers out there all written by people who know more about high connectivity and SSL than you do. They also have the advantage of being tested to a degree that you'd never be able to accomplish with your homebrew server.

mongodb & max connections

I am going to have a website with 20k+ concurrent users.
I am going to use MongoDB with one management node and 3 or more nodes for data sharding.
Now my problem is maximum connections. If I have that many users accessing the database, how can I make sure they don't reach the maximum limit? Also, do I have to change anything, maybe in the kernel, to increase the number of connections?
Basically the database will be used to keep track of users connected to the site, so there are going to be heavy read/write operations.
Thank you in advance.
You don't want to open a new database connection each time a new user connects. I don't know if you'll be able to scale to 20k+ concurrent users easily, since MongoDB uses a new thread for each new connection. You want your web app backend to have just one to a few database connections open and just use those in a pool, particularly since web usage is very asynchronous and event driven.
see: http://www.mongodb.org/display/DOCS/Connections
The server will use one thread per TCP connection, therefore it is highly recommended that your application use some sort of connection pooling. Luckily, most drivers handle this for you behind the scenes. One notable exception is setups where your app spawns a new process for each request, such as CGI and some configurations of PHP.
Whatever driver you're using, you'll have to find out how it handles connections and whether it pools them. For instance, Node's Mongoose is non-blocking, so you usually use one connection per app. This is the kind of thing you probably want.