I have a collection of users, and each user has a points attribute (see the schema below).
My end goal is to return a small, sorted slice of the leaderboard centered on a specific user's position.
Currently I query and sort all users, find the user I want in the returned array, and slice the array around that index. I was wondering if there is a query that would save me from returning all users.
Sample User Schema:
/* 1 */
{
"_id" : ObjectId("..."),
"points" : 5852 ,
"key" : "user1"
}
/* 2 */
{
"_id" : ObjectId("..."),
"points" : 3835,
"key" : "user2"
}
/* 3 */
{
"_id" : ObjectId("..."),
"points" : 1984,
"key" : "user3"
}
/* 4 */
{
"_id" : ObjectId("...."),
"points" : 2437,
"key" : "user4"
}
Let's assume I want the query to return 3 documents sorted by points:
The document of "user4" (the user in question), the document ranked directly below it in the leaderboard ("user3" in the example above), and the document ranked directly above it ("user2" in the example above).
Thanks!
Edit: I'm using mongoose as well, if this simplifies things.
Edit2: Please see below the expected output -
/* 1 */
{
"_id" : ObjectId("..."),
"points" : 3835,
"key" : "user2"
}
/* 2 */
{
"_id" : ObjectId("...."),
"points" : 2437,
"key" : "user4"
}
/* 3 */
{
"_id" : ObjectId("..."),
"points" : 1984,
"key" : "user3"
}
Note that "user1" does not appear in the output as the query asks for 1 user ranked above and 1 user ranked below "user4"
Sounds like you're trying to do this: rank leaderboard in mongo with surrounding players
As it says in the answer to that question, the easiest way to do it is with three queries. Even though that's an increase in the number of queries, the data you transfer will be significantly reduced.
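As a rough sketch of the three-query approach: one query fetches the user, a second fetches the nearest players with more points, and a third fetches the nearest players with fewer points. The helper names (neighborQueries, runQuery, surrounding) are made up for illustration, and runQuery is only an in-memory stand-in for find(filter).sort(sort).limit(n) so that the example runs without a database. It also assumes points are unique; ties would need a tiebreaker such as _id.

```javascript
// Sample documents from the question (ObjectIds omitted).
const users = [
  { key: "user1", points: 5852 },
  { key: "user2", points: 3835 },
  { key: "user3", points: 1984 },
  { key: "user4", points: 2437 },
];

// Hypothetical helper: describes the two neighbor queries for a given score.
function neighborQueries(points, above, below) {
  return {
    // Closest higher scores first, so sort ascending and take `above`.
    above: { filter: { points: { $gt: points } }, sort: { points: 1 }, limit: above },
    // Closest lower scores first, so sort descending and take `below`.
    below: { filter: { points: { $lt: points } }, sort: { points: -1 }, limit: below },
  };
}

// In-memory stand-in for find(filter).sort(sort).limit(limit).
// It only understands a single-field $gt or $lt filter and a single-field sort.
function runQuery(docs, { filter, sort, limit }) {
  const [field, cond] = Object.entries(filter)[0];
  const [sortField, dir] = Object.entries(sort)[0];
  return docs
    .filter((d) => ("$gt" in cond ? d[field] > cond.$gt : d[field] < cond.$lt))
    .sort((a, b) => dir * (a[sortField] - b[sortField]))
    .slice(0, limit);
}

function surrounding(docs, key, above, below) {
  const user = docs.find((d) => d.key === key); // query 1: the user itself
  if (!user) return [];
  const q = neighborQueries(user.points, above, below);
  const higher = runQuery(docs, q.above).reverse(); // query 2, flipped back to descending
  const lower = runQuery(docs, q.below); // query 3
  return [...higher, user, ...lower];
}

console.log(surrounding(users, "user4", 1, 1).map((d) => d.key));
```

On the sample data this gives user2, user4, user3, matching the expected output in the question. Against a real collection, each descriptor would map to something like db.users.find(q.above.filter).sort(q.above.sort).limit(q.above.limit), and an index on points keeps both neighbor queries cheap.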
I'm trying to add a new field to a document
/* 1 */
{
"_id" : ObjectId("5d3b1d8c708bf66fc760afb6"),
"user" : "myuser",
"password" : "$2y$10$Oh.RKbvf4eT5gozLnD7A0uS5C/k6YluYw0k7uShPD2Elu6FKNBQn2"
}
/* 2 */
{
"_id" : ObjectId("5d3c4e70faa55a342c08c40a"),
"user" : "user2",
"pass" : "$2y$10$Oh.RKbvf4eT5gozLnD7A0uS5C/k6YluYw0k7uShPD2Elu6FKNBQn2"
}
with the command
db.users.update({"user":"user2"},{"widgets":1})
But then the existing fields of the matching document are lost:
/* 2 */
{
"_id" : ObjectId("5d3c4e70faa55a342c08c40a"),
"widgets" : 1.0
}
How can I update the document while keeping its previous data?
You are missing $set here. If you do not provide $set, the whole document (apart from _id) is replaced by the one you pass; in your case user and pass are replaced with widgets. As mentioned in the docs:
db.users.update({"user":"user2"},{$set: {"widgets":1}})
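To make the difference concrete, here is a small in-memory model of the two update styles. It is a simplification for illustration, not the server's actual logic: applyUpdate is a hypothetical helper, and it only models plain replacement and $set.

```javascript
// Simplified model of replacement-style vs operator-style updates.
function applyUpdate(doc, update) {
  const usesOperators = Object.keys(update).some((k) => k.startsWith("$"));
  if (!usesOperators) {
    // Replacement-style update: everything except _id is replaced.
    return { _id: doc._id, ...update };
  }
  // Only $set is modeled here: merge the given fields into the document.
  return { ...doc, ...(update.$set || {}) };
}

// The hash is elided here; it plays no role in the example.
const original = { _id: "5d3c4e70faa55a342c08c40a", user: "user2", pass: "..." };

const replaced = applyUpdate(original, { widgets: 1 });
const patched = applyUpdate(original, { $set: { widgets: 1 } });

console.log(replaced); // user and pass are gone
console.log(patched); // user and pass are kept, widgets is added
```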
I have a MongoDB aggregation pipeline that performs the following steps:
Sorts a list of user stats objects by timestamp
Groups the results by user ID
Sorts by a specified stat's name
Pages the results via skip and limit stages
In plain English, this pipeline returns a page from a list of user stats sorted by a specified stat. Each user can have multiple stats objects, so I group to return only the most recent stats object for each user.
In Mongo Shell, this looks like:
db.getCollection("stats").aggregate(
[
{ "$sort" : { "Timestamp" : -1.0 } },
{
"$group" : {
"_id" : "$UserId",
"UserId" : { "$first" : "$UserId" },
"StatsOverall" : { "$first" : "$StatsOverall" },
"Timestamp" : { "$first" : "$Timestamp" }
}
},
{ "$sort" : { "StatsOverall.Rank" : -1.0 } },
{ "$skip" : specifiedPageNumber * specifiedNumResultsPerPage }, // assumes zero-based page numbers
{ "$limit" : specifiedNumResultsPerPage }
]
);
This works fine.
I now want to modify this query to be able to search the user by name, and get back the entire page that user is contained on. (This is for a leaderboard). So, if the user is on page 5 of the leaderboard, I want to return the entirety of page 5.
However, I'm having trouble seeing a solution that doesn't require me to either load all of the users into memory and page them there (awful idea), or go back and forth to the database iterating through pages (almost as awful).
Is there some way I can modify my aggregation pipeline to do all this at the database level?
EDIT: As requested, added some sample data and the expected result.
Sample data looks something like this... I've omitted some fields that aren't relevant. The initial data is a collection of user's stats, where each user can have more than one object. My existing pipeline returns the 1 most recent stats object for each user sorted by a specified stat name.
{
"_id" : "5c611e71ab0ffc430410e0ba",
"UserId" : "5c611e71ab0ffc430410e0ba",
"StatsOverall" : {
"Rank" : NumberInt(1000),
"GamesLost" : NumberInt(30),
"GamesWon" : NumberInt(50)
},
"Timestamp" : "2019-02-10T21:35:06.599Z"
}
// ----------------------------------------------
{
"_id" : "5c6238658966ae5860795879",
"UserId" : "5c6238658966ae5860795879",
"StatsOverall" : {
"Rank" : NumberInt(413),
"GamesLost" : NumberInt(2),
"GamesWon" : NumberInt(141),
},
"Timestamp" : "2019-02-10T21:35:06.599Z"
}
// many objects like this
The expected result looks like this:
{
"_id" : "5c611e71ab0ffc430410e0ba",
"UserId" : "5c611e71ab0ffc430410e0ba",
"StatsOverall" : {
"Rank" : NumberInt(1000),
"GamesLost" : NumberInt(30),
"GamesWon" : NumberInt(50)
},
"Timestamp" : "2019-02-10T21:35:06.599Z"
}
It returns the exact same type of object, sorted the same way as the existing pipeline; however, I want to return only the page that the user is on. In the example result, assume the page size is just 1 result per page. So, the result would contain the 1 page that the user with the given UserId is on. In my sample result, that ID would be 5c611e71ab0ffc430410e0ba.
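One possible way to avoid iterating through pages, sketched here under assumptions rather than taken from the thread: count how many users sort ahead of the target, derive the page index from that count, and only then run the paged query. Both helpers below are hypothetical, usersAhead is an in-memory stand-in for a count query over the latest stats per user, and ties in Rank are ignored.

```javascript
// Latest stats object per user, as produced by the $group stage (fields trimmed).
const latestStats = [
  { UserId: "5c611e71ab0ffc430410e0ba", StatsOverall: { Rank: 1000 } },
  { UserId: "5c6238658966ae5860795879", StatsOverall: { Rank: 413 } },
];

// Hypothetical helper: how many users come before the target when sorting
// by StatsOverall.Rank in descending order, as the existing pipeline does.
function usersAhead(stats, userId) {
  const target = stats.find((s) => s.UserId === userId);
  if (!target) return -1; // unknown user
  return stats.filter((s) => s.StatsOverall.Rank > target.StatsOverall.Rank).length;
}

// Hypothetical helper: zero-based page index plus the $skip/$limit values
// for the page that contains the user at position `ahead`.
function pageContaining(ahead, pageSize) {
  const page = Math.floor(ahead / pageSize);
  return { page, skip: page * pageSize, limit: pageSize };
}

const ahead = usersAhead(latestStats, "5c6238658966ae5860795879");
console.log(pageContaining(ahead, 1));
```

The resulting skip and limit values would then feed the existing $skip and $limit stages, so the database is hit twice (one count, one page) regardless of how deep the user sits in the leaderboard.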
I wonder how I should match a document by the last element of an array in a MongoDB document.
Say I want to update a specific document with new data if a field in the last element of the array is not equal to some specific value.
I know that I can do this to check if a field in Array does not contain that value already:
myTable.update({ Thing: thisThing,
'myArray.Element': {$ne: parseInt(thisValue)} }, ...)
But how should one check that the last element (myArray.Element) in myArray is not equal to thisValue?
Note that I want to do this with find and not aggregate.
Best Regards
Let's say we have collection names, looking like this:
/* 1 */
{
"_id" : ObjectId("58de74f8c1bb7f4256adf32c"),
"user" : "John",
"list_friends" : [
"Alice",
"Bob"
]
}
/* 2 */
{
"_id" : ObjectId("58de75d3c1bb7f4256adf32d"),
"user" : "Pop",
"list_friends" : [
"Eve",
"Oscar"
]
}
Now, let's say we want to change "user" field to "Updated" for all users whose last friend name is different than "Oscar" (in this case that is John). This query:
db.getCollection('names').update({$where: "this.list_friends[this.list_friends.length - 1] !== 'Oscar'"}, {"$set": {"user": "Updated"}}, {"multi": true})
modifies the collection and the final result is:
/* 1 */
{
"_id" : ObjectId("58de74f8c1bb7f4256adf32c"),
"user" : "Updated",
"list_friends" : [
"Alice",
"Bob"
]
}
/* 2 */
{
"_id" : ObjectId("58de75d3c1bb7f4256adf32d"),
"user" : "Pop",
"list_friends" : [
"Eve",
"Oscar"
]
}
I tested the solution using Mongo 3.2; I am not sure if it works for older versions.
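As an alternative sketch that avoids running JavaScript on the server, the same condition can be written with aggregation expressions, assuming a server version that supports $expr in query filters (MongoDB 3.6 or later). $arrayElemAt with index -1 picks the last array element. The helper names below are made up, and matches is only a plain-JavaScript mirror of the $where predicate for checking the logic locally.

```javascript
// Builds a filter matching documents whose last list_friends entry differs from `name`.
const lastFriendIsNot = (name) => ({
  $expr: { $ne: [{ $arrayElemAt: ["$list_friends", -1] }, name] },
});

// Local mirror of the $where predicate from the answer above.
const matches = (doc, name) =>
  doc.list_friends[doc.list_friends.length - 1] !== name;

const names = [
  { user: "John", list_friends: ["Alice", "Bob"] },
  { user: "Pop", list_friends: ["Eve", "Oscar"] },
];

console.log(JSON.stringify(lastFriendIsNot("Oscar")));
console.log(names.filter((d) => matches(d, "Oscar")).map((d) => d.user));
```

In the shell, the filter could be passed to something like db.getCollection('names').updateMany(lastFriendIsNot("Oscar"), {"$set": {"user": "Updated"}}). Documents with a missing or empty list_friends may be treated differently by the two variants, so that case is worth testing separately.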
Not a question about joins in mongoDB
I have two collections in mongoDB, which do not have a common field and which I would like to apply a zip function to (like in Python, Haskell). Both collections have the same number of documents.
For example:
Let's say one collection (Users) is for users, and the other (Codes) is of unique randomly generated codes.
Collection Users:
{ "_id" : ObjectId(""), "userId" : "123"}
{ "_id" : ObjectId(""), "userId" : "456"}
Collection Codes:
{ "_id" : ObjectId(""), "code" : "randomCode1"}
{ "_id" : ObjectId(""), "code" : "randomCode2"}
The desired output would assign each user a unique code, as follows:
Output
{ "_id" : ObjectId(""), "code" : "randomCode1", "userId" : "123"}
{ "_id" : ObjectId(""), "code" : "randomCode2", "userId" : "456"}
Is there any way of doing this with the aggregation pipeline?
Or perhaps with map reduce? Don't think so because it only works on one collection.
I've considered inserting another random id into both collections for each document pair, and then using $lookup with this new id, but this seems like overkill. The alternative would be to export and use Python, since there aren't so many documents, but again I feel like there should be a better way.
I would do something like this to get the records from both collections and merge the required fields into a single object.
You have already confirmed that the number of records in the two collections is the same.
The code below loops through both cursors and maps the required fields into one object. Finally, you can print the object to the console or insert it into a new collection (the insert is commented out).
// Iterate both collections in a fixed order so the pairing is repeatable.
var usersCursor = db.users.find({}).sort({ _id: 1 });
var codesCursor = db.codes.find({}).sort({ _id: 1 });
while (usersCursor.hasNext() && codesCursor.hasNext()) {
    var user = usersCursor.next();
    var code = codesCursor.next();
    // Merge the required fields of each pair into one object.
    var outputObj = {
        _id: new ObjectId(),
        userId: user.userId,
        code: code.code
    };
    printjson(outputObj);
    //db.collectionName.insertOne(outputObj);
}
Output:-
{
"_id" : ObjectId("58348512ba41f1f22e600c74"),
"userId" : "123",
"code" : "randomCode1"
}
{
"_id" : ObjectId("58348512ba41f1f22e600c75"),
"userId" : "456",
"code" : "randomCode2"
}
Unlike with a relational database, in MongoDB you generally do JOIN-style work at the application level (which makes it easier to scale the database horizontally), so this pairing needs to happen in the app.
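If the collections are small enough to load into arrays, the app-level pairing is just a zip. The sketch below is a generic helper, not a MongoDB API; zipWith is a made-up name, and the arrays stand in for the results of two find() calls that were sorted the same way.

```javascript
// Pair up two arrays element by element, stopping at the shorter one.
function zipWith(left, right, combine) {
  const n = Math.min(left.length, right.length);
  const out = [];
  for (let i = 0; i < n; i++) out.push(combine(left[i], right[i]));
  return out;
}

// Stand-ins for the documents in the Users and Codes collections.
const users = [{ userId: "123" }, { userId: "456" }];
const codes = [{ code: "randomCode1" }, { code: "randomCode2" }];

const assigned = zipWith(users, codes, (u, c) => ({ code: c.code, userId: u.userId }));
console.log(assigned);
```

Without an explicit sort on both queries, the order in which documents come back is not guaranteed, so the pairing may differ between runs.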
I have an ID of a document and need to return the document plus the 10 documents that come before and the 10 documents after it. 21 docs total.
I do not have a start or end value from any key. Only the limit in either direction.
Best way to do this? Thank you in advance.
Did you know that ObjectIds contain a timestamp? Because of that, automatically generated ones generally sort in insertion order. So if you are looking for documents before and after a known document _id, you can do this:
Our documents:
{ "_id" : ObjectId("5307f2d80f936e03d1a1d1c8"), "a" : 1 }
{ "_id" : ObjectId("5307f2db0f936e03d1a1d1c9"), "b" : 1 }
{ "_id" : ObjectId("5307f2de0f936e03d1a1d1ca"), "c" : 1 }
{ "_id" : ObjectId("5307f2e20f936e03d1a1d1cb"), "d" : 1 }
{ "_id" : ObjectId("5307f2e50f936e03d1a1d1cc"), "e" : 1 }
{ "_id" : ObjectId("5307f2e90f936e03d1a1d1cd"), "f" : 1 }
{ "_id" : ObjectId("5307f2ec0f936e03d1a1d1ce"), "g" : 1 }
{ "_id" : ObjectId("5307f2ee0f936e03d1a1d1cf"), "h" : 1 }
{ "_id" : ObjectId("5307f2f10f936e03d1a1d1d0"), "i" : 1 }
{ "_id" : ObjectId("5307f2f50f936e03d1a1d1d1"), "j" : 1 }
{ "_id" : ObjectId("5307f3020f936e03d1a1d1d2"), "j" : 1 }
So we know the _id of "f", get it and the next 2 documents:
> db.items.find({ _id: {$gte: ObjectId("5307f2e90f936e03d1a1d1cd") } }).limit(3)
{ "_id" : ObjectId("5307f2e90f936e03d1a1d1cd"), "f" : 1 }
{ "_id" : ObjectId("5307f2ec0f936e03d1a1d1ce"), "g" : 1 }
{ "_id" : ObjectId("5307f2ee0f936e03d1a1d1cf"), "h" : 1 }
And do the same in reverse:
> db.items.find({ _id: {$lte: ObjectId("5307f2e90f936e03d1a1d1cd") } })
.sort({ _id: -1 }).limit(3)
{ "_id" : ObjectId("5307f2e90f936e03d1a1d1cd"), "f" : 1 }
{ "_id" : ObjectId("5307f2e50f936e03d1a1d1cc"), "e" : 1 }
{ "_id" : ObjectId("5307f2e20f936e03d1a1d1cb"), "d" : 1 }
And that's a much better approach than scanning a collection.
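For the 10-before and 10-after case from the question, the two range queries can be described as below. This is a sketch with a made-up helper name (windowQueries); it uses $lt and $gt rather than $lte and $gte so the target document is fetched once on its own instead of appearing in both result sets. The id is passed as a plain string here only to keep the example runnable without a driver; in the shell it would be an ObjectId.

```javascript
// Describes the three queries needed for a window of n documents on each side.
function windowQueries(id, n) {
  return {
    // Nearest earlier documents first, so sort descending by _id.
    before: { filter: { _id: { $lt: id } }, sort: { _id: -1 }, limit: n },
    target: { filter: { _id: id } },
    // Nearest later documents first, so sort ascending by _id.
    after: { filter: { _id: { $gt: id } }, sort: { _id: 1 }, limit: n },
  };
}

const q = windowQueries("5307f2e90f936e03d1a1d1cd", 10);
console.log(JSON.stringify(q, null, 2));
```

The results of the before query come back newest first, so they need to be reversed before being concatenated with the target and the after results.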
Neil's answer is a good answer to the question as stated (assuming that you are using automatically generated ObjectIds), but keep in mind that there's some subtlety around the concept of the 10 documents before and after a given document.
The complete format for an ObjectId is described in the MongoDB documentation. Note that it consists of the following fields:
timestamp to 1-second resolution,
machine identifier
process id
counter
Generally, if you don't specify your own _ids, they are automatically generated by the driver on the client machine. So as long as the ObjectIds are generated by a single process on a single client machine, their order does indeed reflect the order in which they were generated, which in a typical application will also be the insertion order (but need not be). However, if you have multiple processes or multiple client machines, the order of the ObjectIds for objects generated within a given second by those multiple sources has an unpredictable relationship to the insertion order.
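The 1-second resolution is easy to see by decoding the timestamp by hand: the first 8 hex characters of an ObjectId are the creation time in seconds since the Unix epoch. The helper below is a hypothetical stand-in that works on the hex string directly, so it runs without a driver.

```javascript
// Extracts the embedded creation time (in seconds) from an ObjectId hex string.
function objectIdSeconds(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) throw new Error("expected 24 hex characters");
  return parseInt(hex.slice(0, 8), 16);
}

// The _id values of documents "f" and "g" from the answer above.
const f = objectIdSeconds("5307f2e90f936e03d1a1d1cd");
const g = objectIdSeconds("5307f2ec0f936e03d1a1d1ce");

console.log(g - f); // seconds between the two ids being generated
console.log(new Date(f * 1000).toISOString());
```

Two ids generated by different processes within the same second share this prefix, so their relative order is decided by the remaining bytes rather than by which one was created or inserted first.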