MongoDB: update a collection based on another collection [duplicate]

This question already has answers here:
MongoDB and "joins" [duplicate]
(7 answers)
Closed 7 years ago.
I have two collections.
The structure of collection one is:
{'click_id': "123345",
...
}
The structure of collection two is:
{'click_id': "123345",
...
}
What is the optimal way to produce the following in collection one, for every document whose click_id also exists in collection two?
{'click_id': "123345",
'collection2': true,
...
}
Collection one has around 1 billion records and collection two has around 30 million.

As far as I know, MongoDB cannot do something like a SQL JOIN for this.
If the job runs as a batch and you can treat the second collection as a state snapshot, you could load all 30 million IDs into memory as a dictionary/map (this should take under 1 GB, depending on the length of the IDs).
Then go through all 1 billion records in the first collection and write the results back with bulk inserts or updates, whichever you need.
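The approach above could be sketched roughly as follows. This is only a sketch under assumptions: buildFlagOps and chunk are hypothetical helper names, the collection names in the comments are placeholders, and the batch size is something you would need to tune yourself.

```javascript
// Sketch only: buildFlagOps and chunk are hypothetical helpers, not MongoDB APIs.

// Build updateOne operations that set collection2: true on every document
// from collection one whose click_id also appears in collection two.
function buildFlagOps(docs, idsInCollection2) {
  const ops = [];
  for (const doc of docs) {
    if (idsInCollection2.has(doc.click_id)) {
      ops.push({
        updateOne: {
          filter: { _id: doc._id },
          update: { $set: { collection2: true } },
        },
      });
    }
  }
  return ops;
}

// Split a list into batches of at most `size` items.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Possible use in the mongo shell (collection names are placeholders):
//   const ids = new Set();
//   db.collection2.find({}, { click_id: 1 }).forEach(d => ids.add(d.click_id));
//   // read collection1 in slices, call buildFlagOps(slice, ids), then:
//   //   chunk(ops, 1000).forEach(b => db.collection1.bulkWrite(b, { ordered: false }));
```

Keeping the IDs in a Set makes each membership check cheap, and sending the writes in unordered batches avoids one round trip per document.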

Related

mongodb - fast pagination in a sharded cluster [duplicate]

This question already has answers here:
Implementing pagination in mongodb
(2 answers)
How does MongoDB sort records when no sort order is specified?
(2 answers)
Closed 5 years ago.
I'm running MongoDB 3.4 (with WiredTiger). Up to now I have been using the 'fast pagination' strategy described in the following article (https://scalegrid.io/blog/fast-paging-with-mongodb), namely:
Retrieve the _id of the last document in the current page
Retrieve documents greater than this “_id” in the next page
//Page 1
db.users.find().limit(pageSize);
//Find the id of the last document in this page
last_id = ...
//Page 2
users = db.users.find({'_id': {$gt: last_id}}).limit(10);
//Update the last id with the id of the last document in this page
last_id = ...
I am about to shard my collection to allow horizontal scaling. As part of enabling sharding, I am going to use a unique composite key (on the fields "user_id" and "post_id") as the shard key. This will guarantee document uniqueness across shards and should give relatively good document distribution across them.
But after I shard my collection, will I be able to use the above fast-pagination strategy? If not, is there a common solution?
Thanks
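For reference, the paging strategy described in the question can be written as a small helper. This is a sketch, not an answer to the sharding question: nextPageQuery is a hypothetical name, and the explicit sort on _id is an assumption that the original snippet does not include (without it, the order of results is not guaranteed).

```javascript
// Hypothetical helper: describes the query for the page that follows lastId.
// Passing null for lastId describes the first page.
function nextPageQuery(lastId, pageSize) {
  const filter = lastId === null ? {} : { _id: { $gt: lastId } };
  // Assumption: sort explicitly on _id so that "last document" is well defined.
  return { filter: filter, sort: { _id: 1 }, limit: pageSize };
}

// Possible use in the mongo shell:
//   const q = nextPageQuery(last_id, 10);
//   const users = db.users.find(q.filter).sort(q.sort).limit(q.limit).toArray();
//   if (users.length > 0) { last_id = users[users.length - 1]._id; }
```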

update() function disregards limit() in mongo [duplicate]

This question already has answers here:
How to limit number of updating documents in mongodb
(8 answers)
Closed 5 years ago.
Let's say I have 10 Item documents in the database.
Let's retrieve 3 of them that match some condition, using limit():
documents = Item.objects(somefield=somecondition).limit(3)
Now if I call documents.update(), mongoengine updates every document in the database matched by the query, not just the 3 documents I limited the query to.
I also tried setting multi=False in the params, but then only one document gets updated.
Is there any way to do the update as part of the query itself, instead of looping over the documents one by one?
As far as I know, MongoDB does not provide a direct solution to this problem. However, you could try something like this in the mongo shell:
documents.forEach(
    function (e) {
        // Modify each limited document and write it back individually
        e.field = 'value';
        db.collection.save(e);
    }
);
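An alternative to saving documents one by one is a two-step update: fetch the _id values of the limited result first, then update only those IDs. The following is a sketch under assumptions: filterForDocs is a hypothetical helper, and the collection and field names in the comments are placeholders taken from the snippets above.

```javascript
// Hypothetical helper: build a filter that matches exactly the given documents.
function filterForDocs(docs) {
  const ids = docs.map(function (d) { return d._id; });
  return { _id: { $in: ids } };
}

// Possible use in the mongo shell:
//   const docs = db.collection
//     .find({ somefield: somecondition }, { _id: 1 })
//     .limit(3)
//     .toArray();
//   db.collection.updateMany(filterForDocs(docs), { $set: { field: 'value' } });
```

This costs two round trips in total rather than one per document. Note that it is not atomic: documents can change between the find and the update.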

Poor performance when upserting an item into a million-document collection

It takes 700~800 ms to upsert an item into a collection that contains about 2 million documents. I have tried the following functions:
Model.findOneAndUpdate()
bulk.find({...}).upsert().updateOne()
But both of them take almost 1 second to upsert ONE item.
I have another 1 million items to insert/upsert, so it will take me several days. How can I improve it?
Adding an index on the field(s) used in the upsert query will speed up the process, because without one each upsert has to scan the collection to find its match.
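A rough sketch of how the index and batched upserts might fit together. The names here are assumptions for illustration only: buildUpsertOps is a hypothetical helper, and the items collection and sku key field are placeholders for whatever your query actually matches on.

```javascript
// Hypothetical helper: turn plain items into upsert operations for bulkWrite.
// keyField is the field the upsert matches on; it should be indexed.
function buildUpsertOps(items, keyField) {
  return items.map(function (item) {
    const filter = {};
    filter[keyField] = item[keyField];
    return {
      updateOne: { filter: filter, update: { $set: item }, upsert: true },
    };
  });
}

// Possible use in the mongo shell (collection and field names are placeholders):
//   db.items.createIndex({ sku: 1 });
//   db.items.bulkWrite(buildUpsertOps(batch, "sku"), { ordered: false });
```

Sending many upserts in one bulkWrite call also reduces the number of round trips, which may matter as much as the index when there are a million items to process.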

mongodb $in limit [duplicate]

This question already has answers here:
What is the maximum number of parameters passed to $in query in MongoDB?
(4 answers)
Closed 6 years ago.
I was just wondering whether there is a limit to MongoDB's $in operator.
http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-%24in
I have a collection of users (BIG) and have a smaller subset of ObjectIds stashed somewhere, and I want to select all users (collections) that are in my ObjectIds.
Thanks
Since there is no limit on the number of items in an array as such, you shouldn't have any problem.
For the case when the array is embedded inside a document, you might want to have a look at these:
http://groups.google.com/group/mongodb-user/browse_thread/thread/4a7caeba972aa998?fwc=1
Filtering content based on words
http://groups.google.com/group/mongodb-user/browse_thread/thread/28ae76e5ad5fcfb5
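One practical bound is worth keeping in mind: a query is itself a BSON document, and BSON documents are limited to 16 MB, so an extremely large $in list can hit that ceiling. If that is a concern, the IDs can be split into several smaller queries. A minimal sketch, where inFilters is a hypothetical helper and the batch size is an arbitrary assumption:

```javascript
// Hypothetical helper: split a list of ids into several { _id: { $in: [...] } }
// filters, each holding at most batchSize ids.
function inFilters(ids, batchSize) {
  const filters = [];
  for (let i = 0; i < ids.length; i += batchSize) {
    filters.push({ _id: { $in: ids.slice(i, i + batchSize) } });
  }
  return filters;
}

// Possible use in the mongo shell (collection name is a placeholder):
//   inFilters(objectIds, 10000).forEach(function (f) {
//     db.users.find(f).forEach(printjson);
//   });
```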

How can I get last 50 documents in mongoDB? [duplicate]

This question already has answers here:
How to get the last N records in mongodb?
(16 answers)
Closed 4 years ago.
How can I get last 50 documents in mongoDB?
I have a collection that was created with
db.createCollection("collection",{capped:true, size:300000});
From this collection I would like to get the last 50 documents instead of the first 50.
I know that I can get the first 50 documents by using
db.collection.find().limit(50);
But how can I get the last 50 documents?
Can this be done simply with the MongoDB API, or do I have to implement it in my own code?
This should do it:
db.collection.find().sort({$natural: -1}).limit(50);
The last N added records, from less recent to most recent, can be seen with this query:
db.collection.find().skip(db.collection.count() - N)
In your case, N is 50:
db.collection.find().skip(db.collection.count() - 50)
You need the last 50 documents from the collection, so this is the query:
db.collection('collectionName').find().sort({$natural: -1}).limit(50);
// sort({$natural: -1}) returns the most recently inserted documents first (descending)
// limit(50) because you need only 50 documents
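The two approaches above differ in ordering: the skip() version returns documents from less recent to most recent, while the $natural: -1 version returns the newest first. The sketch below covers two small details, using hypothetical helper names: skipForLastN keeps the skip value from going negative when the collection holds fewer than N documents, and oldestFirst reverses a newest-first result.

```javascript
// Hypothetical helper: how many documents to skip so that the last n remain.
// Clamped at 0 so a collection with fewer than n documents is returned whole.
function skipForLastN(count, n) {
  return Math.max(0, count - n);
}

// Hypothetical helper: turn a newest-first result into oldest-first order
// without modifying the original array.
function oldestFirst(docs) {
  return docs.slice().reverse();
}

// Possible use in the mongo shell:
//   db.collection.find().skip(skipForLastN(db.collection.count(), 50));
//   oldestFirst(db.collection.find().sort({ $natural: -1 }).limit(50).toArray());
```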