Loading a large collection from MongoDB to Meteor makes pages slow

I am developing an app in Meteor for the first time. As seen in the documentation, I am loading my collections this way:
Items = new Mongo.Collection("items")
The items collection has more than a million documents and makes page loading very slow. How can I avoid this overhead?

First remove the autopublish package from the console:
$ meteor remove autopublish
Otherwise all records will be published to all clients and 1M records will be very slow.
Second, create a publication that filters the collection to only publish those documents that are actually relevant to the current user in the current application context:
Server:
Meteor.publish('myItems', function () {
  if (this.userId) {
    return Items.find(
      { /* some query relevant to the user */ },
      { fields: { key1: 1, key2: 1 /* ... only relevant fields */ } }
    );
  }
  // No logged-in user: publish nothing, but mark the subscription as ready.
  this.ready();
});
Client:
Meteor.subscribe('myItems');
Your query and list of relevant fields might vary by class of user. You can also have multiple publications on the same collection for different use cases.

If your collection has millions of documents, loading the entire data set on the front end is a bad idea under any circumstances.
You should use pagination; there are packages for pagination in Meteor. But if you want to implement pagination with simple Session variable handling (the variable holds the "skip" value, which the client passes on to the server) together with aslagle:reactive-table for better presentation, you can watch this video: https://www.youtube.com/watch?v=UivnTM1YA-I
After implementing this, you will see for yourself that loading the entire data set into the UI is not a feasible solution, and that pagination works without a page refresh, asynchronously and reactively.
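As a rough illustration of the skip/limit idea, here is a minimal sketch. The publication name itemsPage, the PAGE_SIZE constant and the pageOptions helper are all made up for this example, and the Meteor part assumes the Items collection from the question; it is not taken from the video or from any pagination package.

```javascript
// Hypothetical page size; tune it to what the UI can comfortably show.
const PAGE_SIZE = 50;

// Pure helper: turn a zero-based page number into MongoDB skip/limit options.
function pageOptions(page, pageSize = PAGE_SIZE) {
  // Fall back to the first page if the client sends something unexpected.
  const safePage = Number.isInteger(page) && page >= 0 ? page : 0;
  return { skip: safePage * pageSize, limit: pageSize };
}

// Only runs inside a Meteor server, where Meteor and Items are defined.
// "itemsPage" is a hypothetical publication name.
if (typeof Meteor !== "undefined" && Meteor.isServer) {
  Meteor.publish("itemsPage", function (page) {
    if (!this.userId) return this.ready();
    // A stable sort is needed so that pages do not overlap.
    return Items.find({}, { sort: { _id: 1 }, ...pageOptions(page) });
  });
}
```

On the client you would then subscribe inside an autorun with something like Meteor.subscribe("itemsPage", Session.get("page")), so that changing the Session variable swaps the published page.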

Related

Query documents in one collection that aren't referenced in another collection with Firestore

I have a firestore DB where I'm storing polls in one collection and responses to polls in another collection. I want to get a document from the poll collection that isn't referenced in the responses collection for a particular user.
The naive approach would be to get all of the poll documents and all of the responses filtered by user ID then filter the polls on the client side. The problem is that there may be quite a few polls and responses so those queries would have to pull down a lot of data.
So my question is, is there a way to structure my data so that I can query for polls that haven't been completed by a user without having to pull down the collections in their entirety? Or more generally, is there some pattern to use when you need to query for documents in one collection that aren't referenced by another?
The documents in each of the collections look something like this:
Polls:
{
question: string;
answers: Answer[];
}
Responses:
{
userId: string;
pollId: string;
answerId: string;
}
Any help would be much appreciated!

Queries in Firestore can only return documents from one collection (or from all collections with the same name) and can only contain conditions on the data that they actually return.
Since there's no way to filter based on a condition in some other documents, you'll need to include the information that you want to filter on in the polls documents.
For example, you could include a completionCount field in each poll document that you initially set to 0 and then update on every poll completion. With that in place, the query becomes a simple query on the completionCount field of the polls collection.
For a specific user I'd actually add all polls to their profile document, and remove each poll from there once the user has completed it. Duplicating data is usually the easiest (and sometimes only) way to implement use-cases such as this.
If you're worried about having to add each new poll to each new user profile when it is created, you can also query all polls on their creation timestamp when you next load a user profile and perform that sync at that moment.
load user profile,
check when they were last active,
query for new polls,
add them to user profile.
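The steps above can be sketched in plain JavaScript. This is an in-memory illustration only: it makes no Firestore calls, and the field names lastActive, openPollIds and createdAt are assumptions that do not appear in the question. In a real app the filter step would be a query on the polls collection by creation timestamp.

```javascript
// profile: { lastActive: number, openPollIds: string[] }  (hypothetical shape)
// polls:   [{ id: string, createdAt: number }]             (hypothetical shape)
function syncNewPolls(profile, polls, now) {
  // "Query for new polls": those created after the user was last active.
  const newPollIds = polls
    .filter((poll) => poll.createdAt > profile.lastActive)
    .map((poll) => poll.id);

  // "Add them to user profile", skipping ids that are already listed.
  const openPollIds = [...new Set([...profile.openPollIds, ...newPollIds])];
  return { ...profile, openPollIds, lastActive: now };
}

// Completing a poll removes it from the user's list of open polls.
function completePoll(profile, pollId) {
  return {
    ...profile,
    openPollIds: profile.openPollIds.filter((id) => id !== pollId),
  };
}
```

With this shape, "polls this user has not completed" is just the openPollIds list on the profile, so no cross-collection filtering is needed.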

Performance difference between storing the asset as subdocument vs single document in Mongoose

I have an API for synchronizing contacts from the user's phone to our database. The controller essentially iterates the data sent in the request body and if it passes validation a new contact is saved:
const contact = new Contact({ phoneNumber, name, surname, owner });
await contact.save();
Having a DB with 100 IOPS and considering the average user has around 300 contacts, when the server is busy this API takes a lot of time.
Since the frontend client is made in a way that a contact ID is necessary for other operations (edit, delete), I was thinking about changing the data structure to subdocuments, and instead of saving each Contact as a separate document, the idea is to save one document with many contacts inside:
const userContacts = new mongoose.Schema({
owner: //the id of the contacts owner,
contacts: [new mongoose.Schema({
name: { type: String },
phone: { type: String }
})]
});
This way I have to do just one save. But since Mongo has to generate an ID for each subdocument, is this really that much faster than the original approach?

Summary
This really depends on your exact usage scenarios:
are contacts often updated?
what is the max / average quantity of contacts per user?
are they ever partially loaded, or are they always fetched all together?
But for a fairly common collection such as contacts, I would not recommend storing them in subdocuments.
Instead you should be able to use insertMany for your initial sync scenario.
Explanation
Storing contacts as subdocuments makes a bulk write easier, but it will make querying and updating contacts slower and more awkward than with regular documents.
For example, if I have 100 contacts and I want to view and edit 1 of them, it needs to load the full 100 contacts. I can make the change via a partial update using $set, so the update itself will be OK. But when I add a new contact, a new contact subdocument has to be added to the Contacts document. This makes it a growing document, meaning your database will suffer from fragmentation, which can slow things down a lot (see this answer).
You will have to use aggregate with a $ projection or $unwind to search through contacts in MongoDB. If you want to apply a specific sort order, this too would have to be done via aggregate or in code.
Matching via projection can also lead to problems with duplicate contacts being difficult to find.
And this won't scale. What if you get users with 1000s of contacts later? Then this single document will grow large and querying it will become very slow.
Alternatives
If your contacts for sync are in the 100s, you might get away with splitting them into groups of ~50-100 and calling insertMany for each batch.
If they grow into the thousands, then I would suggest uploading all contacts, saving them as JSON / CSV files to disk, then slowly processing these in the background in batches.
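A minimal sketch of the batching idea follows. The helper names chunk and insertInBatches are made up for this example; the only thing assumed about model is that it has an insertMany(docs) method returning a promise that resolves to the inserted documents, which is how a Mongoose model behaves by default.

```javascript
// Split an array into consecutive batches of at most `size` elements.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Insert documents one batch at a time and return how many were inserted.
// `model` is assumed to expose insertMany(docs), e.g. the Contact model.
async function insertInBatches(model, docs, batchSize = 100) {
  let inserted = 0;
  for (const batch of chunk(docs, batchSize)) {
    const result = await model.insertMany(batch);
    inserted += result.length;
  }
  return inserted;
}
```

In the controller from the question, the loop of contact.save() calls could then become a single await insertInBatches(Contact, validatedContacts), where validatedContacts is whatever array passed validation. That is roughly 3 to 6 writes for 300 contacts instead of 300.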

Add an extra field to collection only on the client side and not affecting server side

Is there a way to add or inject an additional field into a collection on the client side only, so that I can access it and display it in a template without affecting the server side? I don't want to touch the APIs or the database, since it's simply a count of something, like when a customer has a total of 8 deliveries.
I was doing this code where I'm subscribing to a collection and then trying to update the collection on the client side but obviously I should have no rights on updating it:
Meteor.subscribe("all_customers", function () {
Customer.find({}).forEach(function(data) {
var delivery_count = Delivery.find({ customerID: data._id }).count();
Customer.update( { _id: data._id } , { $push: { deliveries: delivery_count } } );
});
});
And then I tried this one, where I manipulate each document by adding a new key-value pair, but nothing is displayed at all:
Meteor.subscribe("all_customers", function () {
Customer.find({}).forEach(function(data) {
var delivery_count = Delivery.find({ customerID: data._id }).count();
data.deliveries = delivery_count;
});
});
My main objective is to basically be able to sort the count of the deliveries in the EasySearch package without compromising with the APIs or database but the package only sorts an existing field. Or is there a way to do this with the package alone?
Forgive me if this question may sound dumb but I hope someone could help.

It's completely possible, for that you can use low level publications described here: https://docs.meteor.com/api/pubsub.html
You have 2 solutions, one standard, the other hacky:
Follow the documentation and transform your documents before sending them to the client. Your server data is not affected and you can enhance the data (even do joins) from your subscription. It's cool, but you can't push it too far, as it might collide and conflict with another publication for the same collection on the same page. The solution is well explained here: How to publish a view/transform of a collection in Meteor?
Create a client-only collection that holds custom objects with only the data you need. You still have a server-side observer on the source collection, but you'll fetch new objects that the client will handle and display without any calculation. You need more code, but this way you don't mess with the classic server-client collections. It's a bit overkill, so ping me if you want a detailed example.
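Here is a minimal, non-reactive sketch of the second option. The publication and collection name customer_counts and the countDeliveries helper are made up for this example; Customer and Delivery are the collections from the question, and the synchronous fetch/forEach calls assume the classic (pre-async) server API. A fully reactive version would use observeChanges instead.

```javascript
// Pure helper: count deliveries per customer id.
function countDeliveries(deliveries) {
  const counts = {};
  for (const delivery of deliveries) {
    counts[delivery.customerID] = (counts[delivery.customerID] || 0) + 1;
  }
  return counts;
}

// Only runs inside a Meteor server, where Customer and Delivery are defined.
// "customer_counts" is a hypothetical name for the client-only collection.
if (typeof Meteor !== "undefined" && Meteor.isServer) {
  Meteor.publish("customer_counts", function () {
    const counts = countDeliveries(
      Delivery.find({}, { fields: { customerID: 1 } }).fetch()
    );
    Customer.find().forEach((customer) => {
      // Sent to the client only; nothing is written to the database.
      this.added("customer_counts", customer._id, {
        ...customer,
        deliveries: counts[customer._id] || 0,
      });
    });
    this.ready();
  });
}
```

On the client you would declare CustomerCounts = new Mongo.Collection("customer_counts"), subscribe to the publication, and sort on the deliveries field like on any other field.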

Mongo Collections and Meteor Reactivity

I'm trying to decide the best approach for an app I'm working on. In my app each user has a number of custom forms; for example, user X will have custom forms and user Y will have 5 different forms customized to their needs.
My idea is to create a MongoDB collection for each custom form. At the start I wouldn't have too many users, and I understand the Mongo collection limit is set to 24000 (I think, not sure). If that's correct I'm OK for now.
But I think this might create issues down the line, and I'm also not sure this is the best approach for performance, management and so forth.
The other option is to create one collection, "forms", and add custom data under an object field like so:
{
_id: dfdfd34df4efdfdfdf,
data: {}
}
My concern with this is Meteor reactivity and scale.
First, I'm expecting each user to fill out each form at least 30 to 50 times per week, so I'm expecting the collection size to increase very fast. That makes me question this approach and lean towards the collection-per-form option, which breaks down the size.
My second concern or question is: will Meteor be able to identify changes in both the first-level object and the second-level object? I need the data to be reactive.
First Level
{
_id: dfdfd34df4efdfdfdf,
data: {}
}
Second Level
{
_id: dfdfd34df4efdfdfdf,
data: {
Object:
{
name:Y, _id: random id
}
}
}

The answer is somewhat covered here: limits of number of collections in databases
It's not a yes or no, but it's clear regarding the Mongo collection limit. As for Meteor reactivity, that's another topic.

Meteor: difference between names for collections, variables, publications, and subscriptions?

In the Discover Meteor examples, what's the diff between "posts" and "Posts"? Why is it that when we do an insert from the server we use "posts" but when querying from the browser we use "Posts"? Wouldn't the system be confused by the case differences?
I see the variable assignment for client Posts to the server posts in posts.js. Is it a conventional notation to capitalize client and use small caps for server?
Posts = new Meteor.Collection('posts')
Why does server/fixtures.js use "Posts"? I was under the assumption that we query "Posts" in the browser (client), and use "posts" in the server, like we did in meteor mongo. So why are we now using Posts in the server?

Let's distinguish between the different names you might have to deal with when programming Meteor:
Variable names, such as Posts = new Meteor.Collection(...). These are used only so your code knows how to access this variable. Meteor doesn't know or care what it is, although the convention is to capitalize.
Collection names, such as new Meteor.Collection("posts"). This maps to the name of a MongoDB collection (on the server) or a minimongo collection (on the client).
Publication and subscription names, used in Meteor.publish("foo", ...) or Meteor.subscribe("foo"). These have to match up for the client to subscribe to some data on the server.
There are two things you need to match up in the Meteor data model:
Names of publications and their corresponding subscriptions
(usually) Names of collections on the client and server, if using the default collection model
A subscription name needs to always match up with the name of a publication. However, the collections that are sent for a given subscription needn't have anything to do with the subscription name. In fact, one can send over multiple cursors in one publication or one collection over different publications or even multiple subscriptions per publication, which appear merged as one in the client. You can also have different collection names in the server and client; read on...
Let's review the different cases:
Simple subscription model. This is the one you usually see in straightforward Meteor demos.
On client and server,
Posts = new Meteor.Collection("posts");
On server only:
Meteor.publish("postsPub", function() {
return Posts.find()
});
On client only:
Meteor.subscribe("postsPub")
This synchronizes the Posts collection (which is named posts in the database) using the publication called postsPub.
Multiple collections in one publication. You can send multiple cursors over for a single publication, using an array.
On client and server:
Posts = new Meteor.Collection("posts");
Comments = new Meteor.Collection("comments");
On server only:
Meteor.publish("postsAndComments", function() {
return [
Posts.find(),
Comments.find()
];
});
On client only:
Meteor.subscribe("postsAndComments");
This synchronizes the Posts collection as well as the Comments collection using a single publication called postsAndComments. This type of publication is well-suited for relational data; for example, where you might want to publish only certain posts and the comments associated only with those posts. See a package that can build these cursors automatically.
Multiple publications for a single collection. You can use multiple publications to send different slices of data for a single collection which are merged by Meteor automatically.
On server and client:
Posts = new Meteor.Collection("posts");
On server only:
Meteor.publish("top10Posts", function() {
return Posts.find({}, {
sort: {comments: -1},
limit: 10
});
});
Meteor.publish("newest10Posts", function() {
return Posts.find({}, {
sort: {timestamp: -1},
limit: 10
});
});
On client only:
Meteor.subscribe("top10Posts");
Meteor.subscribe("newest10Posts");
This pushes both the 10 posts with the most comments and the 10 newest posts on the site to the user, who sees both sets of data merged into a single Posts collection. If one of the newest posts is also one of the posts with the most comments, or vice versa, the Posts collection will contain fewer than 20 items. This is an example of how the data model in Meteor allows you to do powerful data merging operations without implementing the details yourself.
Multiple subscriptions per publication. You can get multiple sets of data from the same publication using different arguments.
On server and client:
Posts = new Meteor.Collection("posts");
On server only:
Meteor.publish("postsByUser", function(user) {
return Posts.find({
userId: user
});
});
On client only:
Meteor.subscribe("postsByUser", "fooUser");
Meteor.subscribe("postsByUser", "barUser");
This causes the posts by fooUser and barUser to both show up in the posts collection. This model is convenient when you have several different computations that are looking at different slices of your data and may be updated dynamically. Note that when you subscribe inside a Deps.autorun(...), Meteor calls stop() on any previous subscription handle with the same name automatically, but if you are using these subscriptions outside of an autorun you will need to stop them yourself. As of right now, you can't do two subscriptions with the same name inside an autorun computation, because Meteor can't tell them apart.
Pushing arbitrary data over a publication. You can completely customize publications to not require the same collection names on the server and client. In fact, the server can publish data that isn't backed by a collection at all. To do this, you can use the API for the publish functions.
On server only:
Posts = new Meteor.Collection("posts");
Meteor.publish("newPostsPub", function() {
var sub = this;
var subHandle = null;
subHandle = Posts.find({}, {
sort: {timestamp: -1},
limit: 10
})
.observeChanges({
added: function(id, fields) {
sub.added("newposts", id, fields);
},
changed: function(id, fields) {
sub.changed("newposts", id, fields);
},
removed: function(id) {
sub.removed("newposts", id);
}
});
sub.ready();
sub.onStop(function() {
subHandle.stop();
})
});
On client only:
NewPosts = new Meteor.Collection("newposts");
Meteor.subscribe("newPostsPub");
This synchronizes the newest 10 posts from the Posts collection on the server (called posts in the database) to the NewPosts collection on the client (called newposts in minimongo) using the publication/subscription called newPostsPub. Note that observeChanges differs from observe, which can do a bunch of other things.
The code seems complicated, but when you return a cursor inside a publish function, this is basically the code that Meteor is generating behind the scenes. Writing publications this way gives you a lot more control over what is and isn't sent to the client. Be careful though, as you must manually turn off observe handles and mark when the subscription is ready. For more information, see Matt Debergalis' description of this process (however, that post is out of date). Of course, you can combine this with the other pieces above to potentially get very nuanced and complicated publications.
Sorry for the essay :-) but many people get confused about this and I thought it would be useful to describe all the cases.

You decide the naming conventions, and meteor doesn't care.
Posts becomes a collection of documents from the mongo server. You find posts by calling Posts.find({author: 'jim'}). In the example you wrote, meteor is being told to internally call that collection 'posts'. Hopefully this is easy to remember if the names are similar...
There needs to be a way to express and track what info is available to clients. Sometimes there may be multiple sets of information, of varying detail. Example: a summary for a title listing, but detail for a particular document. These are often also named 'posts' so it can be initially confusing:
Meteor.publish "posts", -> # on server
Posts.find()
and then
dbs.subscriptions.posts = Meteor.subscribe 'posts' # on client
publication and subscription names must match, but it could all be named like this:
PostsDB = new Meteor.Collection('postdocumentsonserver')
so in mongo you'd need to type
db.postdocumentsonserver.find()
but otherwise you never need to care about 'postdocumentsonserver'. Then
Meteor.publish "post_titles", ->
PostsDB.find({},{fields:{name:1}})
matching
dbs.subscriptions.post_titles = Meteor.subscribe 'post_titles'
