Share a MongoDB instance between Meteor apps without lag in reactivity? - mongodb

This question has been asked multiple times, here and here, and the answer to get this working is fairly straight forward: add an environmental variable to your bash_profile and all Meteor instances on your localhost will share that MONGO_URL.
What I've noticed however is that while this may be the case, there's quite a bit of latency in the "reactivity" of Meteor. I've tested this with two very lean Meteor apps, with empty collections. Inserting a document to a collection from one Meteor app, where my second app is querying that same collection and printing out a field from the documents does work, but there's a noticeable lag before it updates. I've ruled out the possibility of the collection insertion being the source of the lag (simple console.log callback on the client of the first app, logging the id of the newly inserted document).
My purpose for having multiple apps (two to be precise) sharing the same MongoDB is to separate an admin panel from a mobile app without going crazy regarding name-spacing and bloat. This configuration works, but I'm not sure it's the "proper" way of accomplishing the task, and it certainly seems to be causing a performance hit.
Any insight into this matter would be appreciated. Thank you!
EDIT: To clarify, the db URL I'm using is on my localhost, and isn't something hosted online.

When you use an external database, by default meteor will use periodic polling (every few seconds) in order to observe any changes. The delay you are experiencing is a result of this polling process. You can remove the delay and reduce your app's CPU usage by taking advantage of meteor's oplog tailing feature. In order to use it you will:
Get access to a mongodb instance with the oplog turned on.
Set the environment variable MONGO_OPLOG_URL so your app(s) can read the oplog.
Personally, I'd recommend compose.io for this. They provide exactly this as part of their basic elastic deployment. See this post for detailed instructions.
For users who wish to connect to the oplog created locally for you, you can obtain the URL via:
MongoInternals.defaultRemoteCollectionDriver().mongo._oplogHandle._oplogUrl
It should end up looking something like mongodb://127.0.0.1:3001/local

Related

Meteor - Why not just publish all the collection data?

This may be quite an easy question to answer as it may just be my lack of understanding, but if you are having to run the query twice - once of the server and once on the client - why not just publish all the collection data, and then just run one query on the client?
Obviously I don't mean doing this for the users collection, but if you have a blog Posts collection, wouldn't this be beneficial?
Publish all the post data, then subscribe to it and running whatever query is necessary on the client to get the data you need.
Publishing everything is good for 'development' environment as meteor adds autopublish by default but this has some fallacies in 'production' environment. I find this two points to be of importance
Security : The idea is, supply only as much data to the client as required. You can never trust the client and you don't know what the client may use the data for. For your use case, of simple blog posts, this may not be a serious risk but may be a critical risk for e commerce application. The last thing, you want is a hacker to use the data and leverage a bug in your code to do nasty stuff.
Data Overheads: For subscriptions, generally waitOn is used. Thus, till all the data has been made available to the client, the templates are not rendered. IF you have a very large amount of data it will take considerable time to render. So, it is advised to keep the data at 'only what to need' stage to optimize this time too.

How could meteor know MongoDB changes instantly?

I had been wondering about knowing mongoDB changes instantly for a while and best solution that I have ever seen, is tailing mongoDb logs and I just had given wondering up till met Meteor.
When there is a changes on mongoDB somehow(through Meteor or directly), Meteor understands the changes instantly and apply them. We could see it on publishing changes or observing(observe or observeChange) changes.
I guess that, Meteor could understand it by tailing mongoDB logs as the best solution that I have ever seen suggests. But this assumption also make me ask that, Meteor can run with a mongoDB running on different host and if the way that meteor knows MongoDB changes instantly is tailing logs, how could Meteor tails logs of mongoDB running on different machine(different host) and handle changes? Tailing on different host requires painful things and I think Meteor does not follow this way.
I think tailing logs is not my answer because of this question.
I am curious about this knowledge. Cause, I think that, if I have it, I could use it on my other works without Meteor.
So, how could really Meteor knows mongoDB changes instantly? And I wanna ask once again; what would be the best way to know about mongoDB changes instantly?
Thank You!
Yes you are right this is from new new oplog tailing functionality. Meteor pretends that it is mongodb in a replica set and it tails mongodb's operation logs for collections it is monitoring.
Usually databases need multiple servers so that they can stay up in case one fails. MongoDB has this functionality in a replica set.
So this is where meteor comes in. Meteor pretends to be a mongodb replica database so when any data changes on this operations log for other copies to see, Meteor knows that data has changed and can push this new data down to the client instantly.
Usage in production
You can use a different mongo database with this new functionality as you set MONGO_OPLOG_URL. You will need to set mongodb as a replica set first so that the operations log is enabled
This was a very recent introduction put in just last week in version 0.7 you can see the blog post about it below
More links about it:
https://www.meteor.com/blog/2013/12/17/meteor-070-scalable-database-queries-using-mongodb-oplog-instead-of-poll-and-diff
https://www.youtube.com/watch?v=_dzX_LEbZyI
https://github.com/meteor/meteor/wiki/Oplog-Observe-Driver

MongoLab strategies for restoring a database that has people currently using it

My site runs on MongoDB and I'm using MongoLab as a host for the database. I have 10-15 people using the site at any given time and would rather not 'switch them off' if at all possible.
If I have a MongoDB dump that I'd like to restore, it takes a very long time (the file size is around 360mb and this takes a good while to upload on my connection speed). What's more, it appears that MongoDB wants me to delete my collection before doing a restore, so the user would have no data to look through while it's updating.
Is there any way around this, besides, say, having two mongoLab accounts, one 'active' and one for uploading backups and I switch between the two when I need to do a restore?
Is there a general recommended strategy for this sort of thing?

Caching repeating query results in MongoDB

I am going to build a page that is designed to be "viewed" alot, but much fewer users will "write" into the database. For example, only 1 in 100 users may post his news on my site, and the rest will just read the news.
In the above case, 100 SAME QUERIES will be performed when they visit my homepage while the actual database change is little. Actually 99 of those queries are a waste of computer power. Are there any methods that can cache the results of the first query, and when they detect the same query in a short time, can deliver the cached result?
I use MongoDB and Tornado. However, some posts say that the MongoDB does not do caching.
Making a static, cached HTML with something like Nginx is not preferred, because I want to render a personalized page by Tornado each time.
I use MongoDB and Tornado. However, some posts say that the MongoDB does not do caching.
I dunno who said that but MongoDB does have a way to cache queries, in fact it uses the OS' LRU to cache since it does not do memory management itself.
So long as your working set fits into the LRU without the OS having to page it out or swap constantly you should be reading this query from memory at most times. So, yes, MongoDB can cache but technically it doesn't; the OS does.
Actually 99 of those queries are a waste of computer power.
Caching mechanisms to solve these kind of problems is the same across most techs whether they by MongoDB or SQL. Of course, this only matters if it is a problem, you are probably micro-optimising if you ask me; unless you get Facebook or Google or Youtube type traffic.
The caching subject goes onto a huge subject that ranges from caching queries in either pre-aggregated MongoDB/Memcache/Redis etc to caching HTML and other web resources to make as little work as possible on the server end.
Your scenario, personally as I said, sounds as though you are thinking wrong about the wasted computer power. Even if you were to cache this query in another collection/tech you would probably use the same amount of power and resources retrieving the result from that tech than if you just didn't bother. However that assumption comes down to you having the right indexes, schema, set-up etc.
I recommend you read some links on good schema design and index creation:
http://docs.mongodb.org/manual/core/indexes/
https://docs.mongodb.com/manual/core/data-model-operations/#large-number-of-collections
Making a static, cached HTML with something like Nginx is not preferred, because I want to render a personalized page by Tornado each time.
Yea I think by trying to worry about query caching you are pre-maturely optimising, especially if you don't want to take off, what would be 90% of the load on your server each time; loading the page itself.
I would focus on your schema and indexes and then worry about caching if you really need it.
The author of the Motor (MOngo + TORnado) package gives an example of caching his list of categories here: http://emptysquare.net/blog/refactoring-tornado-code-with-gen-engine/
Basically, he defines a global list of categories and queries the database to fill it in; then, whenever he need the categories in his pages, he checks the list: if it exists, he uses it, if not, he queries again and fills it in. He has it set up to invalidate the list whenever he inserts to the database, but depending on your usage you could create a global timeout variable to keep track of when you need to re-query next. If you're doing something complicated, this could get out of hand, but if it's just a list of the most recent posts or something, I think it would be fine.

Cloudant / CouchDB "pull" replication for 600+ documents to iPhone

I'm using Cloudant and I'm struggling to pull/replicate 600 documents from server to my iPhone. First, it's pretty slow because it has to go one-document-at-a-time, and Second Cloudant was giving me "timeouts" after the 100th-or-so REST request. (I have a ticket with Cloudant for this one, as it's unacceptable!)
I was wondering if anyone has found a way / hack to "bulk" replicate when pulling. I was thinking, perhaps it's possible to "zip up" all of the changes, send them in one file, and fast-forward the iPhone database to the last-change seq.
Any helps is great -- thanks!
Can you not hit _all_docs?include_docs=true to get everything in one shot? http://wiki.apache.org/couchdb/HTTP_Document_API#all_docs
I don't know couchcoccoa but it looks like the API supports this: http://couchbaselabs.github.com/CouchCocoa/docs/interfaceCouchDatabase.html#a49d0904f438587b988860891e8049885
Actually, why not make a view. Make a view that gives you your list and make sure your id is there. With your id, you can then go to the document and get all the rest of the required information that you need in order to update it if you need to.
There really is no reason you would ever need to hit every document individually. They have views and search2.0 for that. Keep in mind you are using a cloud based technology. This stuff is not sitting in your basement, you can't just hit it a million times per device in a few seconds and expect anyone to not notice and/or get upset (an exaggeration, yes I know).
What I do not understand is that you are trying to replicate it to an iPhone? Are you running apache and couchdb in your app? Why not just read the JSON data and throw it into a database. or just throw it into a file if it updates that much and keep overwriting it. There is so many options that are a whole lot less messy.