I have a collection containing around 100k documents. I want to add an auto-incrementing "custom_id" field to my documents, and keep incrementing that field for each new document I add from now on.
What's the best approach for this? I've seen some examples in the official documentation (http://docs.mongodb.org/manual/tutorial/create-an-auto-incrementing-field/), however they only cover inserting new documents, not updating an existing collection.
Example code I created based on the link above to increment my counter:
function incrementAndGetNext(counter, callback) {
    // Atomically increment the named counter and hand the updated document to the callback
    counters.findAndModify({
        name: counter
    }, [["_id", 1]], {
        $inc: { "count": 1 }
    }, {
        "new": true   // return the document *after* the increment
    }, function (err, doc) {
        if (err) return console.log(err);
        callback(doc.value);
    });
}
In the above code, counters is the db.counters collection, and it contains this document (note that count has to be a number, not a string, for $inc to work):
{ _id: "...", name: "post", count: 0 }
Would love to know the best way to go about this.
Thank you.
P.S. I'm using the mongojs driver (built on the native MongoDB driver) for Node.js.
Well, of the options in the link you mentioned, I'd rather use the counters collection approach.
The counters collection approach has some drawbacks, including:
It always generates multiple requests (two): one to get the sequence number and another to do the insert using the id you got from the sequence;
If you are using MongoDB's sharding features, the document holding the counter state may be hit a lot, and every one of those requests will land on the same server.
However, it should be appropriate for most uses.
The approach you mentioned ("the optimistic loop") should not break, in my opinion, and I can't see why you would have a problem with it. However, I'd not recommend it. What happens if you execute the code on multiple mongo clients, and one has a lot of latency while the others keep taking IDs? I'd not like to run into that kind of problem... Furthermore, it takes at least two requests per successful operation, but there is no upper bound on the number of retries before a success...
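If it helps, here is a rough, untested sketch of how you could backfill your existing 100k documents with the counters approach, reusing the incrementAndGetNext helper from your question (it passes the updated counter document to the callback, so counter.count is the value to assign). posts stands for your documents collection; adjust the names to your setup:
function backfillCustomIds(done) {
    // Only touch documents that don't have a custom_id yet.
    posts.find({ custom_id: { $exists: false } }, { _id: 1 }).toArray(function (err, docs) {
        if (err) return done(err);

        (function next(i) {
            if (i >= docs.length) return done(null);            // all documents processed
            incrementAndGetNext("post", function (counter) {    // reserve the next sequence value
                posts.update({ _id: docs[i]._id },
                             { $set: { custom_id: counter.count } },
                             function (err) {
                    if (err) return done(err);
                    next(i + 1);                                // one document at a time
                });
            });
        })(0);
    });
}
New documents would then follow the same pattern: call incrementAndGetNext("post", ...) first and include the returned count as custom_id in the insert.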
Related
Is there a way to add or inject an additional field into a collection, and then access that newly inserted field on the client side and display it in a template, without affecting the server side? I don't want to touch the APIs or the database, since it's simply a count of something, like when a customer has a total of 8 deliveries.
I tried this code, where I subscribe to a collection and then try to update it on the client side, but obviously I have no rights to update it:
Meteor.subscribe("all_customers", function () {
Customer.find({}).forEach(function(data) {
var delivery_count = Delivery.find({ customerID: data._id }).count();
Customer.update( { _id: data._id } , { $push: { deliveries: delivery_count } } );
});
}),
I then tried this, where I manipulate the fetched documents by inserting a new key-value pair, but nothing is displayed at all:
Meteor.subscribe("all_customers", function () {
Customer.find({}).forEach(function(data) {
var delivery_count = Delivery.find({ customerID: data._id }).count();
data.deliveries = delivery_count;
});
}),
My main objective is basically to be able to sort by the delivery count in the EasySearch package without touching the APIs or the database, but the package only sorts existing fields. Or is there a way to do this with the package alone?
Forgive me if this question sounds dumb, but I hope someone can help.
It's completely possible; for that you can use the low-level publication API described here: https://docs.meteor.com/api/pubsub.html
You have two solutions, one standard, the other hacky:
You follow the documentation and transform your documents before sending them to the client. Your server data is not affected, and you can enrich the data (even do joins) from your publication. It's cool, but you can't push it too far, as it might collide and conflict with another publication for the same collection on the same page (see the sketch after this list). The solution is well explained here: How to publish a view/transform of a collection in Meteor?
You can build a client-only collection whose documents are custom objects holding only the data you need. You still have a server-side observer on the source collection, but it pushes ready-made objects that the client simply displays without any computation. It takes more code, but this way you don't mess with the classic server-client collections. It's a bit overkill, so ping me if you want a detailed example.
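For option 1, a minimal sketch of such a transforming publication could look like the following, assuming the Customer collection is backed by a Mongo collection literally named "customers" and that deliveries reference customers through customerID (adapt the names to your schema):
Meteor.publish('customersWithDeliveryCount', function () {
    var self = this;

    var handle = Customer.find({}).observeChanges({
        added: function (id, fields) {
            // Attach the computed count before the document reaches the client.
            fields.deliveries = Delivery.find({ customerID: id }).count();
            self.added('customers', id, fields);
        },
        changed: function (id, fields) {
            self.changed('customers', id, fields);
        },
        removed: function (id) {
            self.removed('customers', id);
        }
    });

    self.ready();
    self.onStop(function () {
        handle.stop();
    });
});
Note that the count is computed only when a customer is first published; keeping it live as deliveries come and go would also require observing the Delivery collection.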
I am developing an app in Meteor for the first time. As seen in the documentation, I am loading my collections this way:
Items = new Mongo.Collection("items")
The items collection has more than a million documents and makes page loading very slow. How can I avoid this overhead?
First, remove the autopublish package from the command line:
$ meteor remove autopublish
Otherwise all records will be published to all clients, and with 1M records that will be very slow.
Second, create a publication that filters the collection to only publish those documents that are actually relevant to the current user in the current application context:
Server:
Meteor.publish('myItems', function () {
    if (this.userId) {
        return Items.find({ /* some query relevant to the user */ },
                          { fields: { key1: 1, key2: 1 /* ... only relevant fields */ } });
    }
    this.ready();
});
Client:
Meteor.subscribe('myItems');
Your query and list of relevant fields might vary by class of user. You can also have multiple publications on the same collection for different use cases.
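As a rough illustration of that last point, a second publication over the same Items collection for a different class of user might look like this (isAdmin is just an assumed flag on the user document, and the fields are placeholders):
Meteor.publish('itemsForAdmin', function (statusFilter) {
    // Adapt this to however you actually store roles.
    var user = Meteor.users.findOne(this.userId);
    if (!user || !user.isAdmin) return this.ready();

    return Items.find({ status: statusFilter },
                      { fields: { name: 1, status: 1, ownerId: 1 } });
});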
If your collection has millions of documents, it is very bad to have the entire data set loaded on the front end under any circumstances.
You should use pagination; there are packages out there for pagination in Meteor. But if you want pagination with simple Session variable handling (the variable keeps the "skip" value, which is accessible at both client and server), together with aslagle:reactive-table for better presentation, you can watch this video: https://www.youtube.com/watch?v=UivnTM1YA-I
After implementing this feature you will see for yourself that loading the entire data set into the UI is not a feasible solution, and that pagination works without page refreshes, asynchronously and reactively.
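For reference, a bare-bones sketch of that idea without any package, just skip/limit driven by a Session variable (the page size and field names are arbitrary):
// Server
Meteor.publish('pagedItems', function (skipCount) {
    skipCount = parseInt(skipCount, 10) || 0;         // basic sanitising of the client-supplied value
    return Items.find({}, {
        skip: skipCount,
        limit: 10,                                    // page size
        fields: { name: 1, createdAt: 1 }             // publish only what the list actually displays
    });
});

// Client
Session.setDefault('skipCount', 0);
Tracker.autorun(function () {
    // Re-runs (and re-subscribes) whenever the Session value changes.
    Meteor.subscribe('pagedItems', Session.get('skipCount'));
});

// Move to the next page by bumping the Session value, e.g.
// Session.set('skipCount', Session.get('skipCount') + 10);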
What I'm talking about is:
Meteor.users.findOne() =
{
    _id: "...",
    ...
    followers: {
        users: ["someUserId1", "someUserId2"],
        pages: ["somePageId1", "somePageId2"]
    }
}
vs.
Followings.findOne() =
{
    _id: "...",
    followeeId: "...",
    followeeType: "user",
    followerId: "..."
}
I find the second one quite inefficient, because I need to use smartPublish to publish a user's followers:
Meteor.smartPublish('userFollowers', function (userId) {
    var cursors = [],
        followings = Followings.find({ followeeId: userId });
    followings.forEach(function (following) {
        cursors.push(Meteor.users.find({ _id: following.followerId }));
    });
    return cursors;
});
And I can't filter the users inside iron-router. I cache subscriptions, so there may be more users than I need.
I want to do something like this:
data: function() {
return {
users: Meteor.users.find({_id: {$in: Meteor.user().followers.users}})
};
},
A bad thing about using nested arrays inside the user document is that whenever I add an item to followers.users, the whole array is sent back to the client.
So what do you think? Is it better to keep such data inside the user document, even though the document will get fat? Maybe that's the 'Meteor way' of solving this kind of problem.
I think it's a better idea to keep it nested inside the user document. Storing it in a separate collection leads to a lot of unnecessary duplication, and every time the publish function is run you have to scan the entire collection again. If you're worrying about the arrays growing too large, in most cases, don't (generally, a full-text novel only takes a few hundred kb). Plus, if you're publishing your user document already, you don't have to pull any new documents into memory; you already have everything you need.
This MongoDB blog post seems to advocate a similar approach (see one-to-many section). It might be worth checking out.
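If you do keep the array embedded, a small sketch of a publication built on it, replacing the per-follower cursors from the smartPublish example (the published fields are just examples), might be:
Meteor.publish('userFollowers', function () {
    if (!this.userId) return this.ready();

    // Read the embedded follower ids from the current user's document.
    var me = Meteor.users.findOne(this.userId, { fields: { 'followers.users': 1 } });
    var followerIds = (me && me.followers && me.followers.users) || [];

    // One $in query instead of one cursor per follower.
    return Meteor.users.find({ _id: { $in: followerIds } },
                             { fields: { username: 1, profile: 1 } });
});
Caveat: the findOne above is not reactive on the server, so followers added after the subscription starts won't show up until the client resubscribes.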
You seem to be aware of the pros and cons of each option. Unfortunately, your question is mostly opinion based.
Generally, if your follower arrays will be small in size and don't change often, keep them embedded.
Otherwise a dedicated collection is the way to go.
For that case, you might want to take a look at https://atmospherejs.com/cottz/publish which seems very efficient in what it does and very easy to implement syntactically.
I'm trying to decide the best approach for an app I'm working on. In my app each user has a number of custom forms; for example, user X will have their own custom forms and user Y will have 5 different forms customized to their needs.
My idea is to create a MongoDB collection for each custom form. At the start I wouldn't have too many users, and I understand the MongoDB collection limit is around 24,000 (I think, not sure). If that's correct I'm OK for now.
But I think this might create issues down the line, and I'm also not sure it's the best approach for performance, management and so forth.
The other option is to create one collection, "forms", and add the custom data under an object field like so:
{
    _id: "dfdfd34df4efdfdfdf",
    data: {}
}
My concerns with this are Meteor reactivity and scale.
First, I'm expecting each user to fill out each form at least 30 to 50 times per week, so the collection size will increase very fast. That makes me question this approach and lean towards the collection-per-form option, which breaks the size down.
My second concern or question is whether Meteor will be able to identify changes in the first-level and second-level objects, as I need the data to be reactive.
First Level
{
    _id: "dfdfd34df4efdfdfdf",
    data: {}
}
Second Level
{
    _id: "dfdfd34df4efdfdfdf",
    data: {
        someObject: {
            name: "Y",
            _id: "random id"
        }
    }
}
The answer is somewhat covered here: limits of number of collections in databases
It's not a yes-or-no answer, but it is clear regarding the MongoDB collection limit. As for Meteor reactivity, that's another topic.
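For what it's worth, here is a hedged sketch of what the single "forms" collection option from the question could look like; the userId/formType/data field names are purely illustrative:
Forms = new Mongo.Collection('forms');

// Server: publish only the current user's submissions for one form type.
Meteor.publish('myFormSubmissions', function (formType) {
    if (!this.userId) return this.ready();
    return Forms.find({ userId: this.userId, formType: formType });
});

// Inserting a submission (e.g. from a Meteor method); `data` holds whatever
// custom fields that particular form defines.
var submissionId = Forms.insert({
    userId: Meteor.userId(),
    formType: 'delivery-report',
    createdAt: new Date(),
    data: { name: 'Y', quantity: 3 }
});

// Updating a nested value with dot notation only touches that path in the database;
// over DDP the client still receives the whole changed top-level `data` field.
Forms.update(submissionId, { $set: { 'data.quantity': 4 } });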
I have trouble understanding when exactly the database is hit when using node-mongodb-native; I couldn't find any reference on that. As everything is callback-based, it gives me the feeling that every single call hits the database... For example, are these two snippets any different in terms of how many times the database is hit?
// ---- 1
db.collection('bla', function (err, coll) {
    coll.findOne({ 'blo': 'bli' }, function (err, doc) {
        coll.count(function (err, count) {
            console.log(doc, count);
        });
    });
});

// ---- 2
db.collection('bla', function (err, coll) {
    coll.findOne({ 'blo': 'bli' }, function (err, doc) {
        db.collection('bla', function (err, coll) {
            coll.count(function (err, count) {
                console.log(doc, count);
            });
        });
    });
});
I was basically wondering whether I can cache instances of collections and cursors. For example, why not fetch the collections I need only once, at server start, and reuse the same instances indefinitely?
I'd really like to understand how the whole thing works, so I'd really appreciate a good link explaining it in detail.
Looking at the source code of the Node.js driver for collection, it seems it will not ping MongoDB when the collection object is created unless you have strict mode on: https://github.com/mongodb/node-mongodb-native/blob/master/Readme.md#strict-mode
The source code I looked at ( https://github.com/mongodb/node-mongodb-native/blob/master/lib/mongodb/db.js#L446 ) reinforces the idea that if strict mode is not on, it will just create a new Node.js collection object and run the callback.
However, findOne and count do break this "lazy" behaviour and force the driver to query the database in order to get results.
Note: calling count on the collection won't force a "true" count of all items in the collection. Instead, it gleans this information from the collection metadata.
So for the first snippet you should see two queries run: one for the findOne and one for the count. The same goes for the second snippet, since creating the collection object after the findOne should not trigger a query to MongoDB.
After some googling, I found this link about best practices for node-mongodb-native. It is answered by Christian Kvalheim, who seems to be the maintainer of the library. He says:
"You can safely store the collection objects if you wish and reuse them"
So even though the call to collection might hit the database when it is made in strict mode, the actual client-side collection instance can safely be reused.
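Putting the two answers together, a small sketch (legacy driver API; the connection string and names are placeholders) of opening the connection once at server start and reusing the collection object afterwards:
var MongoClient = require('mongodb').MongoClient;

var blaCollection = null;   // cached collection instance

MongoClient.connect('mongodb://localhost:27017/mydb', function (err, db) {
    if (err) throw err;
    // Outside strict mode this does not hit the database; it only builds
    // a client-side collection object that we can keep around.
    blaCollection = db.collection('bla');
});

// Later, anywhere in the app: reuse the cached object.
// Only the actual queries (findOne, count, ...) go over the wire.
function findBlo(callback) {
    blaCollection.findOne({ blo: 'bli' }, callback);
}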