The issue was that even if I target just one node of my replica set in my connection string, mongo-go-driver always wants to discover and connect to the other nodes.
I found a solution here that basically says I should add the connect option to the connection string:
mongodb://host:27017/authDb?connect=direct
My question is: how good or bad a practice is this, why doesn't MongoDB document it, and are there other values this option can accept?
That option only exists for the Go driver. For all other drivers it is unrecognized, so it is not documented as a general connection string option.
It is documented for the Go Driver at https://godoc.org/go.mongodb.org/mongo-driver/mongo#example-Connect--Direct
How good or bad a practice is this, why doesn't MongoDB document it, and are there other values this option can accept?
As pointed out in the accepted answer, this is documented under the driver documentation. Now for the other parts of the question.
Generally speaking, in the replica set context you would want to connect to the topology instead of directly to a specific replica set member, with an exception for administrative purposes. Replication is designed to provide redundancy, and connecting directly to one member (i.e. the primary) is not recommended in case of failover.
All of the official MongoDB drivers follow the MongoDB Specifications. In regard to direct connections, the requirement currently is in server-discovery-and-monitoring.rst#general-requirements:
Direct connections: A client MUST be able to connect to a single
server of any type. This includes querying hidden replica set members,
and connecting to uninitialized members (see RSGhost) in order to run
"replSetInitiate". Setting a read preference MUST NOT be necessary to
connect to a secondary. Of course, the secondary will reject all
operations done with the PRIMARY read preference because the slaveOk
bit is not set, but the initial connection itself succeeds. Drivers
MAY allow direct connections to arbiters (for example, to run
administrative commands).
The spec only states that a driver MUST be able to do so, not how. The MongoDB Go driver is not the only driver that currently supports the direct option approach; .NET/C# and Ruby do as well.
Currently there is an open PR for the specifications to unify the behaviour. In the future, all drivers will have the same way of establishing a direct connection.
I have a request asking for a read-only schema replica for a role in PostgreSQL. After reading the documentation and getting a better understanding of replication in PostgreSQL, I'm trying to identify whether or not I can create the publisher and subscriber within the same database.
Any thoughts on the best approach without having a second server would be greatly appreciated.
You asked two different questions. Same database? No. Since pub/sub requires tables to have the same name (including schema) on both ends, you would be trying to replicate a table onto itself. Using logical replication plugins other than the built-in one might get around this restriction.
Same server? Yes. You can replicate between two databases of the same instance (but see the note in the docs about some extra hoops you need to jump through) or between two instances on the same host. So whichever of those things you meant by "same server", yes, you can.
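For illustration, here is a minimal sketch of the same-instance setup using Node's pg module, with hypothetical database, publication, slot, and subscription names (app, app_replica, ro_pub, ro_slot, ro_sub). The pre-created slot with create_slot = false is the main "extra hoop": CREATE SUBSCRIPTION hangs if asked to create its own slot when subscribing within the same cluster.

const { Client } = require('pg');

async function setUpSameInstanceReplication() {
  // Source database: publish the tables and pre-create the replication slot.
  const src = new Client({ database: 'app' });
  await src.connect();
  await src.query('CREATE PUBLICATION ro_pub FOR ALL TABLES');
  await src.query("SELECT pg_create_logical_replication_slot('ro_slot', 'pgoutput')");
  await src.end();

  // Destination database on the same instance: subscribe via the existing slot.
  const dst = new Client({ database: 'app_replica' });
  await dst.connect();
  await dst.query("CREATE SUBSCRIPTION ro_sub " +
    "CONNECTION 'host=localhost dbname=app' " +
    "PUBLICATION ro_pub " +
    "WITH (create_slot = false, slot_name = 'ro_slot')");
  await dst.end();
}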
But it seems like an odd way to do this. If the access is read only, why does it matter whether it is to a replica of the real data or to the real data itself?
If a system is already running SQL Server, is it possible to use a NoSQL database (i.e. MongoDB in particular) as the failover database in a SQL Server failover environment, such that if the primary SQL node fails, the secondary node running/hosting MongoDB takes over as primary?
The short answer to this question is "no". The long answer is anything is possible given enough code and resources.
SQL and MongoDB do not speak the same language, so there would need to be an intermediary that can translate, but this adds another failure mode to the system. It also needs to be complex enough to understand concepts such as "primary". There are connectors out there that will handle either SQL -> MongoDB or MongoDB -> SQL, but I'm not aware of any that are capable of syncing the two in real time. Additionally, it would be up to your application to determine where to query data from and where to write data to; that is outside the scope of what such connectors do.
I have an API running in AWS Lambda and AWS API Gateway, deployed using Up. My API creates a database connection on startup, and therefore Lambda does this when the function is triggered for the first time. My API is written in Node using Express and pg-promise to connect to and query the database.
The problem is that Lambda creates new instances of the function as it sees fit, and sometimes it appears as though there are multiple instances of it at one time.
I keep running out of DB connections as my Lambda function is using up too many database handles. If I log into Postgres and look at the pg_stat_activity table I can see lots of connections to the database.
What is the recommended pattern for solving this issue? Can one limit the number of simultaneous instances of a function in Lambda? Can you share a connection pool across instances of a function? (I doubt it.)
UPDATE
AWS now provides a product called RDS Proxy which is a managed connection pooling solution to solve this very issue: https://aws.amazon.com/blogs/compute/using-amazon-rds-proxy-with-aws-lambda/
There are a couple of ways that you can run out of database connections:
You have more concurrent Lambda executions than you have available database connections. This is certainly possible.
Your Lambda function is opening database connections but not closing them. This is a likely culprit, since web frameworks tend to keep database connections open across requests (which is more efficient), but on Lambda they have no opportunity to close them, because AWS will silently terminate the instance.
You can solve 1 by controlling the number of available connections on the database server (the max_connections setting on PostgreSQL) and the maximum number of concurrent Lambda function invocations (as documented here). Of course, that just trades one problem for another, since Lambda will return 429 errors when it hits the limit.
Addressing 2 is trickier. The traditional and right way of dealing with database connection exhaustion is to use connection pooling. But with Lambda you can't do that on the client, and with RDS you don't have the option to do that on the server. You could set up an intermediary persistent connection pooler, but that makes for a more complicated setup.
In the absence of pooling, one option is to create and destroy a database connection on each function invocation. Unfortunately that will add quite a bit of overhead and latency to your requests.
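A minimal sketch of that pattern, using the plain pg module and a hypothetical query (the DSN comes from an assumed DATABASE_URL environment variable):

const { Client } = require('pg');

exports.handler = async (event) => {
  // Open the connection inside the handler and always close it before
  // returning, so a frozen or terminated instance never strands a handle.
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();
  try {
    const { rows } = await client.query('SELECT now()');
    return rows[0];
  } finally {
    await client.end();
  }
};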
Another option is to carefully control your client-side and server-side connection parameters. The idea is first to have the database close connections after a relatively short idle time (on PostgreSQL this is controlled by the tcp_keepalives_* settings). Then, to make sure that the client never tries to use a closed connection, you set a connection timeout on the client (how to do so will be framework dependent) that is shorter than that value.
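On the client side, one way to approximate this with pg-promise (which passes pool options through to the underlying pg pool) is to have the pool retire idle connections before the server would close them. A sketch, where the 3-second idle timeout is only an assumed value that must stay below your server-side cutoff:

const pgp = require('pg-promise')();

const db = pgp({
  connectionString: process.env.DATABASE_URL, // assumed environment variable
  max: 1,                  // at most one connection per Lambda instance
  idleTimeoutMillis: 3000, // retire idle connections before the server does
});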
My hope is that AWS will give us a solution for this at some point (such as server-side RDS connection pooling). You can see various proposed solutions in this AWS forum thread.
You have two options to fix this:
You can tweak Postgres to disconnect those idle connections. This is the best way but may require some trial-and-error.
You have to make sure that you connect to the database inside your handler and disconnect before your function returns or exits. In express, you'll have to connect/disconnect while inside your route handlers.
This is not a basic question just asking about a mongo cluster. It is not a duplicate, in my opinion.
I have a MongoDB 3-node cluster, and my URL in a PlayFramework conf file is something along the following lines:
mongodb.uri = "mongodb://mongodb1:27017,mongodb2:27017,mongodb3:27017/myproj"
By default, when the replica set is configured, all reads and writes happen only on the primary, and that is what I want. However, I want reads to go to a secondary when no primary is left; that is, when two nodes go down, there will be no primary left and only one secondary.
I do not want to modify my code to achieve this for each read query. I tried the following on the secondary node but it does not help:
db.getMongo().setReadPref('primaryPreferred')
What exactly do I need to do to make this work?
I do not want to modify my code to achieve this for each read query. I tried the following on the secondary node but it does not help:
db.getMongo().setReadPref('primaryPreferred')
You are on the right track with read preferences, but need to set this in your connection string or driver. Setting a read preference in the mongo shell only affects the current shell session, and has no effect on remote connections.
mongodb.uri = "mongodb://mongodb1:27017,mongodb2:27017,mongodb3:27017/myproj"
You need to add some additional parameters as per MongoDB's Connection String URI Format documentation:
(required) The replicaSet=... option indicates that the driver should use "replica set" connection mode as opposed to the default direct connection mode. This parameter enables replica set monitoring, read preferences, and discovery of topology changes. The provided replica set name must match the replica set name configured for your deployment. For full details on the connection behaviour expected for officially supported MongoDB Drivers, see the Server Discovery and Monitoring (SDAM) specification. The rationale section of the spec includes answers to common questions about the chosen approach.
(required) The readPreference=primaryPreferred option indicates the preference to read from a primary but use a secondary if there is no primary available.
(optional) In MongoDB 3.4+ you can specify a maxStalenessSeconds=... option which limits the maximum replication lag (or staleness) when using a secondary read preference. By default there is no max staleness so the driver will not consider replication lag when selecting a secondary based on read preference. If your intent is to use primaryPreferred as a failover option for reads I would set max staleness with caution: you need to ensure that you have at least one secondary which has acceptable staleness.
So, assuming a replica set name of mongocluster and a database of myproj, the suggested connection string would be:
mongodb://mongodb1:27017,mongodb2:27017,mongodb3:27017/myproj?replicaSet=mongocluster&readPreference=primaryPreferred
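And if you also wanted to cap staleness per the third point, with an illustrative 120 seconds (values below 90 are rejected):
mongodb://mongodb1:27017,mongodb2:27017,mongodb3:27017/myproj?replicaSet=mongocluster&readPreference=primaryPreferred&maxStalenessSeconds=120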
I have to set up a database that can handle failover (if one crashes, the other takes over). For that, I decided to use MongoDB:
I set up a replica set with two instances. Each instance is running on a separate VM. I have several questions:
It is recommended to use at least 3 instances in a replica set. Is it OK to use only two?
I have two instances, and therefore two IP addresses. Which IP should I give to my application that will need to read/write in the database? When a database is down, how will requests be redirected to the instance that is still up?
Some help getting started would be great!
It is recommended to use at least 3 instances in a replica set. Is it OK to use only two?
No, the minimum requirement for a replica set is three processes (docs), but the third could be an arbiter, even though that is not recommended.
I have two instances, and therefore two IP addresses. Which IP should I give to my application that will need to read/write in the database? When a database is down, how will requests be redirected to the instance that is still up?
There are two alternatives:
#1 (recommended)
You provide the driver with all of the addresses (for more details on how, visit the docs); an example with the Node.js driver is below (it is similar for the other drivers). This way the driver will know all, or at least more than one, of the instances directly, which prevents the problem of all the specified instances being down (see #2).
var MongoClient = require('mongodb').MongoClient;
// List every member; the driver discovers the topology and follows elections.
MongoClient.connect('mongodb://[server1],[server2],[...]/[database]?replicaSet=[name]', function(err, db) {
  if (err) throw err;
  // use db here
});
#2
You provide the driver with one of them (probably the primary) and MongoDB will figure out the rest. However, if your app starts up while the specified instance(s) are down, the driver will not be able to find the other instances and therefore cannot connect to MongoDB.