Getting an error "use of closed network connection" - mongodb

My application stack consists of Golang for backend programming and MongoDB Atlas Free Tier for the database. I trigger events from the Stripe CLI, and my Go program updates a number of collections in MongoDB Atlas based on certain validations. This works fine for the most part, but at certain points in the process I get the error below while updating data in the Mongo collections.
connection(xxxxx-shard-00-02.ka3rc.mongodb.net:xxx[-15])
incomplete read of message header: read tcp
xxx.xxx.x.xx:xxxxx->xx.xx.xxx.xxx:xxxxx: use of closed network connection
I am trying to use the same Mongo client, opened when control enters my Go program, to execute all queries within the application.
Does anyone know why we would see this error? Could this be Mongo Atlas restricting the number of requests per minute on the free tier? The issue happens so randomly that I cannot find any pattern to when it occurs.

From the Go driver's ClientOptions documentation:
https://pkg.go.dev/go.mongodb.org/mongo-driver@v1.8.0/mongo/options#ClientOptions
Most of the timeouts are 0 by default (ConnectTimeout, MaxConnIdleTime, SocketTimeout).
This means that in some cases the server can close a connection while the driver is still unaware of it, so it is recommended to set these timeouts explicitly on the client side during the connection init phase.
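A minimal sketch of setting these timeouts explicitly with the Go driver at connect time; the URI and the specific durations below are placeholders, not recommended values:

```go
package main

import (
	"context"
	"log"
	"time"

	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	// Placeholder URI; substitute your own Atlas connection string.
	opts := options.Client().
		ApplyURI("mongodb+srv://user:pass@cluster0.example.mongodb.net/test").
		SetConnectTimeout(10 * time.Second).  // fail fast when a connection cannot be established
		SetMaxConnIdleTime(60 * time.Second). // retire pooled connections before the server drops them
		SetSocketTimeout(30 * time.Second)    // bound individual reads and writes

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	client, err := mongo.Connect(ctx, opts)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(context.Background())

	if err := client.Ping(ctx, nil); err != nil {
		log.Fatal(err)
	}
}
```

MaxConnIdleTime is the one most directly related to the error above: if the pool retires idle connections sooner than the server closes them, the driver should never hand out a connection that has already been torn down.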

Related

RDS Data API BatchExecute taking significantly longer than standard connection

I have an AWS Lambda function that needs to insert several thousand rows of data into an RDS PostgreSQL database within a Serverless Cluster. Previously I used a normal database connection with psycopg2, but I switched to the RDS Data API in order to improve performance. However, with the Data API, BatchExecute exceeds the 5-minute Lambda limit and still fails to commit the transaction in that time. Meanwhile, the psycopg2 solution, which uses a different transfer protocol, inserts all the data in under 30 seconds.
How is this possible? Shouldn't the Data API give superior performance, since it doesn't need to establish a connection? Can I change any settings to make the RDS Data API perform acceptably?
I don't believe I am hitting any of the data size limits, because the Lambda times out rather than explicitly throwing an error. Also, I know the connection is succeeding, as other small queries execute successfully.

Static lookup data stored on localhost for 1000+ users (connections)

Sometimes you have static data that is used by all customers. I am looking for a solution that fetches this from localhost (127.0.0.1) using some sort of database.
I have done some tests using Golang fetching from a local PostgreSQL database, and it works perfectly. But how does this scale to 1000+ users?
I noticed that only one session was started on the local server regardless of which computer made the request (since I used 127.0.0.1 in Golang to call Postgres). At some point this may or may not become a bottleneck if 1000 users share a single session.
My questions are:
How many concurrent users can PostgreSQL handle per session before it becomes a bottleneck? Or is this handled by the calling language (Golang)?
Is it even possible to handle many queries per session from different users?
Are there better ways to manage static lookup data for all customers than a local PostgreSQL database (Redis?)
I hope this question fits this forum. Otherwise, please point me in the right direction.
Every session creates a new postgres process, which is forked from the main postgres process listening on the port (default 5432).
By default up to 100 sessions can be open in parallel (max_connections = 100), but this can easily be changed in postgresql.conf.
No queries are executed in parallel within a single session.
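As for whether this is handled by the calling language: Go's database/sql package multiplexes many concurrent users over a bounded pool of sessions, so 1000 users do not require 1000 sessions. A minimal sketch, assuming the lib/pq driver and a hypothetical DSN and lookup table:

```go
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/lib/pq" // registers the "postgres" driver
)

func main() {
	// Placeholder DSN for the local lookup database.
	db, err := sql.Open("postgres", "postgres://app:secret@127.0.0.1:5432/lookup?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Many goroutines (one per user request) share this bounded pool, so
	// the server sees at most 20 sessions, well under max_connections.
	db.SetMaxOpenConns(20)
	db.SetMaxIdleConns(10)
	db.SetConnMaxLifetime(30 * time.Minute)

	// Hypothetical lookup table and key.
	var value string
	if err := db.QueryRow("SELECT value FROM static_lookup WHERE key = $1", "country:SE").Scan(&value); err != nil {
		log.Fatal(err)
	}
	log.Println(value)
}
```

Each query still runs on one session at a time, but the pool hands sessions out to whichever goroutine needs one next, which is usually enough for read-mostly static lookup data.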

How should I manage postgres database handles in a serverless environment?

I have an API running in AWS Lambda and API Gateway using Up. My API creates a database connection on startup, so Lambda does this when the function is triggered for the first time. My API is written in Node using Express and pg-promise to connect to and query the database.
The problem is that Lambda creates new instances of the function as it sees fit, and sometimes there appear to be multiple instances of it at one time.
I keep running out of DB connections because my Lambda function is using up too many database handles. If I log into Postgres and look at the pg_stat_activity view, I can see lots of connections to the database.
What is the recommended pattern for solving this issue? Can one limit the number of simultaneous instances of a function in Lambda? Can you share a connection pool across instances of a function? (I doubt it.)
UPDATE
AWS now provides a product called RDS Proxy which is a managed connection pooling solution to solve this very issue: https://aws.amazon.com/blogs/compute/using-amazon-rds-proxy-with-aws-lambda/
There are a couple of ways you can run out of database connections:
You have more concurrent Lambda executions than you have available database connections. This is certainly possible.
Your Lambda function is opening database connections but not closing them. This is a likely culprit, since web frameworks tend to keep database connections open across requests (which is more efficient), but on Lambda they have no opportunity to close them, since AWS will silently terminate the instance.
You can solve 1 by controlling the number of available connections on the database server (the max_connections setting on PostgreSQL) and the maximum number of concurrent Lambda function invocations (as documented here). Of course, that just trades one problem for another, since Lambda will return 429 errors when it hits the limit.
Addressing 2 is trickier. The traditional and correct way of dealing with database connection exhaustion is connection pooling. But with Lambda you can't do that on the client, and with RDS you don't have the option to do it on the server. You could set up an intermediary persistent connection pooler, but that makes for a more complicated setup.
In the absence of pooling, one option is to create and destroy a database connection on each function invocation. Unfortunately that will add quite a bit of overhead and latency to your requests.
Another option is to carefully control your client-side and server-side connection parameters. The idea is first to have the database close connections after a relatively short idle time (on PostgreSQL this is controlled by the tcp_keepalives_* settings). Then, to make sure the client never tries to use a closed connection, you set a connection lifetime on the client (how to do so is framework-dependent) that is shorter than that value; a sketch of the client side follows this answer.
My hope is that AWS will give us a solution for this at some point (such as server-side RDS connection pooling). You can see various proposed solutions in this AWS forum thread.
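A minimal sketch of the client-side half of that approach, shown with Go's database/sql since the other examples in this document are Go (pg-promise exposes analogous pool options); the DSN is a placeholder and the durations are assumptions, not recommendations:

```go
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/lib/pq" // registers the "postgres" driver
)

func main() {
	// Placeholder DSN.
	db, err := sql.Open("postgres", "postgres://app:secret@mydb.example.com:5432/app")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Assuming the server drops idle connections after ~60s: recycle them
	// on the client a little sooner, so the pool never hands out a
	// connection the server has already closed.
	db.SetConnMaxLifetime(50 * time.Second)
	db.SetMaxIdleConns(2)
}
```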
You have two options to fix this:
You can tweak Postgres to disconnect idle connections. This is the best way, but it may require some trial and error.
You have to make sure that you connect to the database inside your handler and disconnect before your function returns or exits. In Express, you'll have to connect and disconnect inside your route handlers.
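The question uses Node/Express, but the connect-inside-the-handler pattern is the same in any runtime; here is a hedged sketch in Go (this document's language elsewhere) using aws-lambda-go, with a placeholder DSN and query:

```go
package main

import (
	"context"
	"database/sql"
	"fmt"

	"github.com/aws/aws-lambda-go/lambda"
	_ "github.com/lib/pq" // registers the "postgres" driver
)

func handler(ctx context.Context) (string, error) {
	// Open inside the handler and close before returning, so a frozen or
	// silently terminated instance cannot leak the connection.
	db, err := sql.Open("postgres", "postgres://app:secret@mydb.example.rds.amazonaws.com:5432/app")
	if err != nil {
		return "", err
	}
	defer db.Close()

	var n int
	if err := db.QueryRow("SELECT count(*) FROM widgets").Scan(&n); err != nil {
		return "", err
	}
	return fmt.Sprintf("%d widgets", n), nil
}

func main() {
	lambda.Start(handler)
}
```

This trades connection leaks for per-request connection overhead, which is why the pooling options above (or RDS Proxy, per the update) are usually preferable under load.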

How to add arbitrary log data to MongoDB logs from the client?

I have a Node.js web application connecting to MongoDB for which I want to profile DB performance. Each request received by the application is assigned a request ID and can cause multiple queries to be sent to MongoDB. I want to see this request ID in each log line in MongoDB. Is there a way to do this? I would like to avoid adding always-true fields to each query, like "req<id>": null, because I suspect this may affect performance.
The docs reference a similar feature called Client Data at https://docs.mongodb.com/manual/reference/log-messages/; however, this appears to be sent once per connection, and I'm looking for client data that changes multiple times even on the same connection.
Try using cursor.comment() to record the request ID value into the log messages.
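The question's app is Node, but in the Go driver used elsewhere in this document the same idea is the query comment option; a minimal sketch, where the URI, database, collection, and filter are all placeholders:

```go
package main

import (
	"context"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func findForRequest(ctx context.Context, coll *mongo.Collection, requestID string) error {
	// Attach the request ID as a query comment; it appears alongside the
	// query in profiler output and slow-query log entries.
	opts := options.Find().SetComment("req-" + requestID)

	cur, err := coll.Find(ctx, bson.M{"status": "active"}, opts) // placeholder filter
	if err != nil {
		return err
	}
	var docs []bson.M
	return cur.All(ctx, &docs)
}

func main() {
	ctx := context.Background()
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017")) // placeholder URI
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)

	if err := findForRequest(ctx, client.Database("app").Collection("orders"), "12345"); err != nil {
		log.Fatal(err)
	}
}
```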

MongoDB: Checking the number of users currently using my applications that connect to my MongoDB server

Is it possible to check the number of users currently using my applications that connect to my MongoDB server? Is there any command to find it? Also, in the output below,
db.serverStatus().connections
{ "current" : 12, "available" : 807, "totalCreated" : NumberLong(96385) }
does the "current" count include the number of users connected to my database through my application? Please explain. Thanks in advance!
db.serverStatus().connections shows the number of connections made to MongoDB. But to find the number of users of the DB, you must know exactly how many connections a single client (user) opens. If your application opens 3 connections for a single user, then 3 open connections does not mean 3 users; it means one user using 3 connections.
So you must first define a fixed number of DB connections per user.
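For reference, the same counters can be read programmatically via the serverStatus command; a minimal sketch in Go (the document's language elsewhere), with a placeholder URI:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx := context.Background()
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017")) // placeholder URI
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)

	var status struct {
		Connections struct {
			Current      int32 `bson:"current"`
			Available    int32 `bson:"available"`
			TotalCreated int64 `bson:"totalCreated"`
		} `bson:"connections"`
	}
	res := client.Database("admin").RunCommand(ctx, bson.D{{Key: "serverStatus", Value: 1}})
	if err := res.Decode(&status); err != nil {
		log.Fatal(err)
	}

	// If each user opens a fixed N connections, users is roughly current / N.
	fmt.Printf("current=%d available=%d totalCreated=%d\n",
		status.Connections.Current, status.Connections.Available, status.Connections.TotalCreated)
}
```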
Remember: unlike MySQL, MongoDB is not intended to be used directly by a client. MongoDB is best used behind a REST server: the client connects to the REST server, and the DB is handled by the REST server, not by the client directly.