mongodb - fast pagination in a sharded cluster [duplicate] - mongodb

This question already has answers here:
Implementing pagination in mongodb
(2 answers)
How does MongoDB sort records when no sort order is specified?
(2 answers)
Closed 5 years ago.
I'm running MongoDB 3.4 (with WiredTiger). Up to now I have been using the 'fast pagination' strategy described in the following article (https://scalegrid.io/blog/fast-paging-with-mongodb), namely:
Retrieve the _id of the last document in the current page
Retrieve documents greater than this “_id” in the next page
// Page 1 (an explicit sort keeps the page order well defined)
db.users.find().sort({_id: 1}).limit(pageSize);
// Find the _id of the last document in this page
last_id = ...
// Page 2
users = db.users.find({_id: {$gt: last_id}}).sort({_id: 1}).limit(pageSize);
// Update last_id with the _id of the last document in this page
last_id = ...
I am about to shard my collection in order to allow horizontal scaling. As part of enabling sharding, I am going to use a unique composite key (on fields "user_id" and "post_id") as the shard key. This will guarantee document uniqueness across shards, and should allow for relatively good document distribution across shards.
But after I shard my collection, will I be able to use the above fast-pagination strategy? If not, is there a common solution?
Thanks
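One way the range-based approach is commonly carried over to a compound key is to sort on both fields and resume after the last (user_id, post_id) pair seen. The sketch below only builds the filter and sort documents. compoundPageQuery and the collection name "posts" are hypothetical, and the sketch assumes an index on { user_id: 1, post_id: 1 }, which a shard key on those fields would require anyway.

```javascript
// Hypothetical helper: builds the filter and sort for the page that follows
// the last (user_id, post_id) pair seen. Pass null to get the first page.
function compoundPageQuery(last) {
  const sort = { user_id: 1, post_id: 1 };
  if (last === null) {
    return { filter: {}, sort };
  }
  const filter = {
    $or: [
      // Any later user_id comes after the last pair...
      { user_id: { $gt: last.user_id } },
      // ...as does a later post_id under the same user_id.
      { user_id: last.user_id, post_id: { $gt: last.post_id } },
    ],
  };
  return { filter, sort };
}

// Usage in the mongo shell (assumed collection name "posts"):
// const q = compoundPageQuery({ user_id: 7, post_id: 42 });
// db.posts.find(q.filter).sort(q.sort).limit(pageSize);
```

Because the query carries an explicit sort, the page contents do not depend on which shard holds a given document; as far as I know, mongos merges the sorted results it receives from the shards.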

Related

update() function disregards limit() in mongo [duplicate]

This question already has answers here:
How to limit number of updating documents in mongodb
(8 answers)
Closed 5 years ago.
Let's say I have 10 Item documents in the database.
Let's retrieve 3 Item documents matching some condition using limit().
documents = Item.objects(somefield=somecondition).limit(3)
Now if I call documents.update(), mongoengine updates all the documents matched by the query, not just the 3 documents I limited my query to.
I also tried setting multi=False in the params, but then only one document gets updated.
Is there any way to apply the update as part of the query itself instead of looping over the documents one by one?
As far as I know, MongoDB does not provide a way to combine an update with limit(). However, you could try something like this in the mongo shell:
documents.forEach(
    function (e) {
        // Change and save each document from the limited cursor individually
        e.field = 'value';
        db.collection.save(e);
    }
);
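An alternative that avoids one write per document is to fetch only the _id values of the limited result and then issue a single update restricted to those ids. This is a minimal sketch for the mongo shell, not a built-in feature: updateFirstN is a hypothetical helper, coll stands for something like db.collection, and the two steps are not atomic, so a matching document could change between the find and the update.

```javascript
// Hypothetical helper: apply `changes` (via $set) to at most `n` documents
// matching `filter`. Returns the number of documents that were targeted.
function updateFirstN(coll, filter, changes, n) {
  // Step 1: fetch only the _id of the first n matching documents.
  const ids = coll
    .find(filter, { _id: 1 })
    .limit(n)
    .toArray()
    .map((doc) => doc._id);
  if (ids.length === 0) {
    return 0;
  }
  // Step 2: one update restricted to exactly those documents.
  coll.updateMany({ _id: { $in: ids } }, { $set: changes });
  return ids.length;
}

// Usage in the mongo shell (collection and field names are assumptions):
// updateFirstN(db.item, { somefield: "somecondition" }, { field: "value" }, 3);
```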

how to efficient paging in mongodb [duplicate]

This question already has answers here:
Implementing pagination in mongodb
(2 answers)
Closed 5 years ago.
I want to sort all docs by field x (multiple docs can have the same value for this field). Then, every time I press "next", it loads 10 more docs.
If multiple docs have the same value, they can be displayed in any order among themselves; it doesn't matter.
Since skip() is inefficient on large datasets, how do I do this efficiently? No page numbers are needed, only infinite scroll.
If you don't require page numbers, then you can just use a monotonically increasing unique id value, such as _id with ObjectId().
Using your example:
/* Value of first scroll record, and to be updated every iteration */
var current_id;
var scroll_size = 10;
db.collection.find({_id: {$lt: current_id}})
    .limit(scroll_size)
    .sort({
        _id: -1,
        x: 1 // Depending on your use case
    });
The example above will give you the most recent records. Depending on your use case, you would have to decide what to do with newly inserted documents.
If you are scrolling on a field other than _id, make sure you add an appropriate index on that field.
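When the scroll field is not unique, as with x in the question, a common refinement is to order by (x, _id) and resume strictly after the last pair seen, so that ties on x are neither skipped nor repeated. The sketch below illustrates that ordering rule on a plain in-memory array with made-up sample documents; against MongoDB, the corresponding filter would be {$or: [{x: {$gt: lastX}}, {x: lastX, _id: {$gt: lastId}}]} with sort({x: 1, _id: 1}), presumably backed by an index on {x: 1, _id: 1}.

```javascript
// Compare two documents by x first, then by _id to break ties.
function compareByXThenId(a, b) {
  if (a.x !== b.x) return a.x < b.x ? -1 : 1;
  if (a._id !== b._id) return a._id < b._id ? -1 : 1;
  return 0;
}

// Return up to `size` documents that come strictly after `last` in
// (x, _id) order. `last` is the final document of the previous page,
// or null for the first page.
function nextPage(docs, last, size) {
  return docs
    .slice()
    .sort(compareByXThenId)
    .filter((d) => last === null || compareByXThenId(d, last) > 0)
    .slice(0, size);
}

// Made-up sample data with duplicate x values.
const sampleDocs = [
  { _id: 1, x: 5 }, { _id: 2, x: 3 }, { _id: 3, x: 5 },
  { _id: 4, x: 3 }, { _id: 5, x: 9 },
];
const page1 = nextPage(sampleDocs, null, 2);
const page2 = nextPage(sampleDocs, page1[page1.length - 1], 2);
const page3 = nextPage(sampleDocs, page2[page2.length - 1], 2);
```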

Check if document exists in different collection MongoDB Aggregation [duplicate]

This question already has an answer here:
How to Model a "likes" voting system with MongoDB
(1 answer)
Closed 6 years ago.
I insert a new document in Votes collection when a user votes on a poll.
{
_id: ObjectId(XXX),
card: 11,
user: 22
}
Now, when a user requests all the polls, I want to return a Voted: 1 field if the user has already voted on the poll, i.e. a document is already present in the Votes collection.
Can anyone tell me if there's a way to access documents from another collection in an aggregation command?
With MongoDB it's not possible to access multiple collections within a single query. You should change your data model and add an array or use embedded documents.
I don't know much about your use case so please take this just as an example and not as a final solution.
The following model contains an array of all the polls a user has voted on. You can therefore check whether the array contains the poll and return 1 if it does.
{
_id: ObjectId(XXX),
user: 22,
cards: [1, 3, 5]
}
See https://docs.mongodb.org/manual/core/data-modeling-introduction/ for more details about data modelling in mongoDB.
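With that model, the Voted flag can be computed in application code once the user's document has been loaded. This is a minimal sketch under assumptions: markVoted is a hypothetical helper, and it assumes each poll's _id is the same value that gets stored in the cards array.

```javascript
// Hypothetical helper: add a Voted flag (1 or 0) to each poll, based on the
// `cards` array stored on the user's document.
function markVoted(polls, userDoc) {
  const voted = new Set(userDoc.cards);
  return polls.map((poll) => ({
    ...poll,
    Voted: voted.has(poll._id) ? 1 : 0,
  }));
}

// Example data following the model above (poll ids are made up).
const userDoc = { user: 22, cards: [1, 3, 5] };
const polls = [{ _id: 1 }, { _id: 2 }, { _id: 3 }];
const pollsWithFlag = markVoted(polls, userDoc);
```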

Mongodb update a collection based on another collection [duplicate]

This question already has answers here:
MongoDB and "joins" [duplicate]
(7 answers)
Closed 7 years ago.
I have two collections
The structure of collection one is
{'click_id':"123345",
...
}
The structure of collection two is
{'click_id':"123345",
...
}
What is the optimal way to do the following in collection 1
{'click_id':"123345",
'collection2':true,
...
}
considering the fact there are around 1 billion records in collection 1 and around 30 million records in collection 2.
As far as I know, you can't do anything like a SQL JOIN in MongoDB.
If your job is a batch job and you can treat the second collection as a state snapshot, you could just load all 30 million IDs into memory as a dictionary/map (this should take under 1 GB, but it depends on the length of the IDs).
Then go through all 1 billion records from the first collection and save the results with bulk inserts or updates, whichever fits what you need.
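The batch approach described above can be sketched as a generator that turns the records of collection 1 into batches of bulk update operations. flagOperations, the batch size, and the collection names in the usage comment are assumptions for illustration; the operation documents follow the shape accepted by db.collection.bulkWrite().

```javascript
// Hypothetical helper: yield batches of bulkWrite-style operations that set
// collection2: true on every record whose click_id appears in `idsInTwo`.
function* flagOperations(records, idsInTwo, batchSize) {
  let batch = [];
  for (const rec of records) {
    if (!idsInTwo.has(rec.click_id)) continue; // no match in collection 2
    batch.push({
      updateOne: {
        filter: { click_id: rec.click_id },
        update: { $set: { collection2: true } },
      },
    });
    if (batch.length === batchSize) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) yield batch; // flush the final partial batch
}

// Usage in the mongo shell (collection names "one" and "two" are assumed):
// const ids = new Set(db.two.find({}, { click_id: 1 }).toArray().map((d) => d.click_id));
// for (const ops of flagOperations(db.one.find().toArray(), ids, 1000)) {
//   db.one.bulkWrite(ops, { ordered: false });
// }
```

At the sizes mentioned in the question, collection 1 would be streamed from a cursor rather than loaded with toArray(); the generator only needs something it can iterate over.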

Do unique indexes also increase query efficiency in mongoDB? [duplicate]

This question already has answers here:
Advantage of a unique index in MongoDB
(2 answers)
Closed 7 years ago.
I need to create a unique index on a field in mongoDB in order to prevent duplicates in my collection. I also want to create a single-field index on that same field in order to optimize for queries.
Do I need to create these two different indexes? Or will the unique index be used for queries as well?
Any help is appreciated!
The unique index will be used for queries, so the extra index is unnecessary.
You can test this by looking at the indexes considered in the output of explain() in the shell.
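To make that check concrete, the sketch below walks the winning plan of an explain() result and lists the indexes it scans. indexesUsed is a hypothetical helper, the collection and field names in the usage comment are assumptions, and the plan layout (stage, inputStage, inputStages, indexName) is the one I would expect from the classic query planner output; other server versions may nest the plan differently.

```javascript
// Hypothetical helper: walk an explain() winning plan and collect the names
// of the indexes used by IXSCAN stages.
function indexesUsed(stage, found = []) {
  if (!stage) return found;
  if (stage.stage === "IXSCAN" && stage.indexName) {
    found.push(stage.indexName);
  }
  if (stage.inputStage) indexesUsed(stage.inputStage, found);
  for (const child of stage.inputStages || []) {
    indexesUsed(child, found);
  }
  return found;
}

// Usage in the mongo shell (collection and field names are assumptions):
// db.users.createIndex({ email: 1 }, { unique: true });
// const plan = db.users.find({ email: "a@example.com" }).explain().queryPlanner.winningPlan;
// indexesUsed(plan); // should list the unique index if the query uses it
```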