I would like to ask why my external MongoDB instance is slower than the one launched by Meteor.js.
I set the MONGO_URL environment variable to connect to my local database, so the connection should be as fast as with the database created by Meteor.js.
However, when I tested publications against the external database I saw one or two seconds of latency, whereas when Meteor.js runs the database everything works properly (I see new data from the database without delay).
Thanks for any help!
Cheers
Meteor has two ways to access changes in MongoDB:
Pull: Meteor checks for updates at regular intervals, so you may notice a delay of a few seconds.
Push, also known as "oplog tailing": MongoDB pushes data changes as soon as they happen, and Meteor registers them immediately.
You'll need to set the MONGO_OPLOG_URL environment variable to enable oplog tailing and get instantaneous updates. When Meteor starts up a local Mongo instance, it also sets up oplog tailing automatically.
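Note that oplog tailing requires the external MongoDB to run as a replica set (even a single-member one), and MONGO_OPLOG_URL should point at its local database (for example mongodb://127.0.0.1:27017/local). A quick sanity check in the mongo shell that an oplog actually exists:

// mongo shell (JavaScript): the oplog is the capped collection oplog.rs in the 'local' database
var oplog = db.getSiblingDB('local').oplog.rs;
oplog.find().sort({ $natural: -1 }).limit(1); // newest entry; returns nothing if there is no replica set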
Here's a detailed article about it: https://meteorhacks.com/mongodb-oplog-and-meteor/
Related
We are facing a stale-read issue for some percentage of users of our MongoDB + Spring framework based app. It's a very low-volume app with fewer than 10K hits a day and a record count of 100K or less. Our tech stack is as follows:
MongoDB version v3.2.8
compile group: 'org.springframework.data', name: 'spring-data-mongodb', version:'1.5.5.RELEASE'
compile group: 'org.mongodb', name: 'mongo-java-driver', version:'2.13.2'.
Users report that after a new record is inserted or updated, the new value is not available to read for a certain duration, say half an hour, after which the latest values are reflected and available for reading across all users. However, when connecting with the mongo shell, we can see the latest values in the DB.
We confirmed that there is no application-level cache involved in the reported flows. For the JSPs we also added a timestamp on the reported pages and tried private browsing mode to rule out any browser issue.
We also tried changing the write concern on the MongoClient and the Mongo template, but there was no change in behavior:
MongoClientOptions.builder().writeConcern(WriteConcern.FSYNCED).build(); // MongoClient options
mongoTemplate.setWriteConcern(WriteConcern.FSYNCED); // Spring MongoTemplate
mongoTemplate.setWriteResultChecking(WriteResultChecking.LOG); // log write result problems instead of ignoring them
Also, the DB logs look clean; no exceptions or errors appear in the MongoDB logs.
We also didn't introduce any new library or DB changes, and this setup had been working perfectly for the past 2 years. Any pointers would be helpful.
NOTE: It's a single mongo instance with no slaves or secondaries configured.
Write concern does not affect reads.
Most likely you have some cache in your application or on your users' systems (such as their web browser) that you are overlooking.
The second most likely reason is that you are reading from secondaries (i.e. using any read preference other than primary).
I have a Node.js app using the Mongoose (v5.11.17) driver. I used to run a standalone DB, but recently, for reliability reasons, the app was migrated to a replica set (of 3). There is code that saves some object a to the DB, and then any number of calls may read object a. Since migrating to the replica set, reads sometimes come back as not found, which never happened with the standalone DB. My guess is that the read preferences are not right somehow and the read goes to a non-primary member (or maybe the write went to a non-primary). Can I inspect the connection in the mongo shell and query its read preference?
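What I'm hoping is possible is something along these lines (I'm not sure these are the right calls, and the hosts/db names are made up):

// Legacy mongo shell (JavaScript): reports the read preference of the shell's own connection
db.getMongo().getReadPrefMode();

// Mongoose: pin the read preference in the connection string so reads always go to the primary
const mongoose = require('mongoose');
mongoose.connect('mongodb://host1,host2,host3/mydb?replicaSet=rs0&readPreference=primary');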
I'm trying to set up an app that will act as a front end to an externally updated mongo database. The data will be pushed into the database by another process.
So far I have the app connecting to the external mongo instance and pulling data out with no issues, but it's not reactive (it isn't seeing any of the new data going into the mongo database).
I've done some digging and so far can only find that I might need to set up a replica set and use the oplog. Is there a way to do this without going to replica sets (or is that the best way anyway)?
The code so far is really simple, a single collection, a single publication (pulling out the last 10 records from the database) and a single template just displaying that data.
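Roughly, the server side boils down to something like this (collection and publication names are placeholders):

// Minimal sketch of the setup described above
import { Mongo } from 'meteor/mongo';
import { Meteor } from 'meteor/meteor';

export const Records = new Mongo.Collection('records');

if (Meteor.isServer) {
  Meteor.publish('lastRecords', function () {
    // Publishing a cursor is reactive; how fast changes arrive depends on oplog tailing vs. polling
    return Records.find({}, { sort: { _id: -1 }, limit: 10 });
  });
}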
No deps that I've written (not sure if that's what I'm missing).
Thanks.
Any reason not to use the oplog? From what I've read it is the recommended approach even if your DB isn't updated by an external process, and a must if it is.
Nevertheless, without the oplog your app should see the changes made to the DB by the external process anyway. It will take longer (up to 10 seconds, since Meteor falls back to polling), but it should update.
Background: I have a very specific use case where I have an existing MongoDB that I need to interact with via reads, but I have to ensure that the data can never be modified. However I also need to trigger some form of event when new data comes in so I can do post processing on it.
The current plan is to use replication to get the data onto a slave for the read processing. However for my purposes I only care about new data in various document stores. Part of the issue is that I can not modify the existing MongoDB and not all the data is timestamped, so there is no incremental way to handle this that I can think of.
Question: Is it possible to fire an event from a slave that would tell me I have new data and what it is? I will only have access to the slave DB, as the master will be locked.
I may have some limited ability to change the master DB, but I can not expect to change the document structure at all.
Instead of using a master/slave configuration you could use a replica set with a priority 0 secondary (so that it can never become primary).
You can tail the oplog on that secondary looking for insert operations.
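As a rough sketch of what that could look like with the Node.js driver (the connection URI and namespace are assumptions, not part of your setup):

// Tail the oplog on the secondary and react to insert operations only
const { MongoClient } = require('mongodb');

async function tailOplog() {
  // Direct connection to the priority 0 secondary
  const client = await MongoClient.connect('mongodb://secondary.example.com:27017/?readPreference=secondary');
  const oplog = client.db('local').collection('oplog.rs');

  // tailable + awaitData keeps the cursor open and blocks until new entries arrive;
  // op 'i' means insert, ns is the database.collection you care about
  const cursor = oplog.find(
    { op: 'i', ns: 'mydb.mycollection' },
    { tailable: true, awaitData: true }
  );

  while (await cursor.hasNext()) {
    const entry = await cursor.next();
    console.log('new document:', entry.o); // entry.o holds the inserted document
  }
}

tailOplog().catch(console.error);

In practice you would also remember the ts field of the last entry you processed and resume from there, so a restart does not replay the whole oplog.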
I need some way to push data from client databases to a central database. Basically, there are several instances of MongoDB running on remote machines (clients), and I need some method to periodically update the central mongo database with documents newly added or modified on the clients. Each client must replicate its records to the single central server.
Eg:
If I have 3 mongo instances running on 3 machines, each holding 10 GB of data, then after the data migration the 4th machine's MongoDB must have 30 GB of data, and the central MongoDB machine must be periodically updated with the data from all 3 machines. But these 3 machines not only receive new documents; existing documents in them may also be updated. I would like the central MongoDB machine to receive these updates as well.
Your desired replication strategy is not formally supported by MongoDB.
A MongoDB replica set consists of a single primary with asynchronous replication to one or more secondary servers in the same replica set. You cannot configure a replica set with multiple primaries or replication to a different replica set.
However, there are a few possible approaches for your use case depending on how actively you want to keep your central server up to date and the volume of data/updates you need to manage.
Some general caveats:
Merging data from multiple standalone servers can create unexpected conflicts. For example, unique indexes would not know about documents created on other servers.
Ideally the data you are consolidating will still be separated by a unique database name per origin server, so you don't get strange crosstalk between unrelated documents that happen to share the same namespace and _id across different origin servers.
Approach #1: use mongodump and mongorestore
If you just need to periodically sync content to your central server, one way to do so is using mongodump and mongorestore. You can schedule a periodic mongodump from each of your standalone instances and use mongorestore to import them into the central server.
Caveats:
There is a --db parameter for mongorestore that allows you to restore into a different database from the original name (if needed)
mongorestore only performs inserts into the existing database (i.e. does not perform updates or upserts). If existing data with the same _id already exists on the target database, mongorestore will not replace it.
You can use mongodump options such as --query to be more selective on data to export (for example, only select recent data rather than all)
If you want to limit the amount of data to dump & restore on each run (for example, only exporting "changed" data), you will need to work out how to handle updates and deletions on the central server.
Given the caveats, the simplest use of this approach would be to do a full dump & restore (i.e. using mongorestore --drop) to ensure all changes are copied.
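As a very rough sketch of what such a scheduled sync could look like (host names, database names, and paths are made up):

# On each origin server: dump one database (optionally restricted with --query)
mongodump --host client1.example.com --db appdata --out /backups/client1

# On the central server: re-import it under a per-client database name, dropping the old copy first
mongorestore --host central.example.com --db client1_appdata --drop /backups/client1/appdata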
Approach #2: use a tailable cursor with the MongoDB oplog.
If you need more realtime or incremental replication, a possible approach is creating tailable cursors on the MongoDB replication oplog.
This approach is basically "roll your own replication". You would have to write an application which tails the oplog on each of your MongoDB instances and looks for changes of interest to save to your central server. For example, you may only want to replicate changes for selective namespaces (databases or collections).
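If you want a feel for what those oplog documents look like before writing anything, you can inspect them from the mongo shell (the namespace below is made up):

// mongo shell (JavaScript); 'mydb.orders' is a made-up namespace
var oplog = db.getSiblingDB('local').oplog.rs;

// Latest operations for that namespace; op is 'i' (insert), 'u' (update) or 'd' (delete)
oplog.find({ ns: 'mydb.orders', op: { $in: ['i', 'u', 'd'] } }).sort({ $natural: -1 }).limit(5);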
A related tool that may be of interest is the experimental Mongo Connector from 10gen labs. This is a Python module that provides an interface for tailing the replication oplog.
Caveats:
You have to implement your own code for this, and learn/understand how to work with the oplog documents
There may be an alternative product which better supports your desired replication model "out of the box".
You should be aware that replica sets are the only built-in mechanism for replication, and a replica set always means one primary and multiple secondaries. Writes always go to the primary server. Apparently you want multi-master replication, which is not supported by MongoDB, so you may want to look into a different technology like CouchDB or Couchbase. MongoDB is simply not built for this.
There may be a way since MongoDB 3.6 to achieve your goal: Change Streams.
Change streams allow applications to access real-time data changes without the complexity and risk of tailing the oplog. Applications can use change streams to subscribe to all data changes on a single collection, a database, or an entire deployment, and immediately react to them. Because change streams use the aggregation framework, applications can also filter for specific changes or transform the notifications at will.
There are some configuration options that affect whether you can use Change Streams or not, so please read about them.
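As a minimal sketch of what a change stream consumer could look like with the Node.js driver (the connection string, database, and collection names are assumptions):

// Subscribe to insert events on one collection; requires a replica set and MongoDB 3.6+
const { MongoClient } = require('mongodb');

async function watchInserts() {
  const client = await MongoClient.connect('mongodb://localhost:27017/?replicaSet=rs0');
  const collection = client.db('mydb').collection('orders');

  // The aggregation pipeline filters the stream down to inserts
  const changeStream = collection.watch([{ $match: { operationType: 'insert' } }]);

  changeStream.on('change', (change) => {
    console.log('inserted document:', change.fullDocument);
  });
}

watchInserts().catch(console.error);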
Another option is Delayed Replica Set Members.
Because delayed members are a "rolling backup" or a running "historical" snapshot of the data set, they may help you recover from various kinds of human error. For example, a delayed member can make it possible to recover from unsuccessful application upgrades and operator errors including dropped databases and collections.
Hidden Replica Set Members may be another option to consider.
A hidden member maintains a copy of the primary's data set but is invisible to client applications. Hidden members are good for workloads with different usage patterns from the other members in the replica set.
Another option may be to configure a Priority 0 Replica Set Member.
A priority 0 member maintains a copy of the primary's data set but cannot become primary and cannot trigger elections, which makes it safe to dedicate to secondary workloads.
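All three member settings mentioned above (priority 0, hidden, delayed) are applied through a replica set reconfiguration; a rough sketch in the mongo shell, where the member index and the delay are assumptions:

// mongo shell (JavaScript); member index 2 and the one-hour delay are just for illustration
cfg = rs.conf();
cfg.members[2].priority = 0;      // can never become primary
cfg.members[2].hidden = true;     // invisible to client applications
cfg.members[2].slaveDelay = 3600; // lag one hour behind the primary (omit for a plain hidden member)
rs.reconfig(cfg);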
I am interested in these options myself, but I haven't decided what approach I will use.