Insert if not exists, else remove MongoDB - mongodb

So I have a query in MongoDB (2.6.4) where I am trying to implement a simple upvote/downvote mechanism. When a user clicks upvote, I need to do the following:
If already upvoted by user, then remove upvote.
Else if not upvoted by user, then add upvote AND remove downvote if exists.
So far, the (incorrect) query I have formed is:
db.collection.aggregate([
{
$project: {
"_id" : ObjectId("53e4d45c198d7811248cefca"),
"upvote": {
"$cond":
[
{"$in": ["$upvote",1] },
{"$pull": {"upvote" : 1}},
{"$addToSet": {"upvote" : 1}, "$pull": {"downvote": 1}}
]
}
}
}
])
where '1' is the user id who is trying to upvote.
Both upvote and downvote are arrays that contain userIds of those who have upvoted and downvoted, respectively.
For output of query, I just want a bool value: true if $cond evaluated to true, else false.

That's not a good way to implement up-votes and downvotes. Aside from the aggregation framework not being a mechanism for updating documents in any way, you seem to have gravitated towards thinking it may be a solution due to the logic you want to implement. But aggregate does not update.
What you want on your, well, let's call it a "question" schema, is a structure like this:
{
"_id": ObjectId("53f51a844ffa9b02cf01c074"),
"upvoted": [],
"downvoted": [],
"upvoteCount": 0,
"downvoteCount": 0
}
That is something that can work well with atomic updates and actually give you some stateful information about the object at the same time.
For the "upvoted" and "downvoted" arrays, we are going to consider that the "users" voting have a similar unique ObjectId value. So what we are going to do is $push or $pull from either array and also "increment/decrement" the counter values along with each of those operations.
Here's how this works for an upvote:
db.questions.update(
{
"_id": ObjectId("53f51a844ffa9b02cf01c074"),
"upvoted": { "$ne": ObjectId("53f51c0a4ffa9b02cf01c075") },
"downvoted": ObjectId("53f51c0a4ffa9b02cf01c075")
},
{
"$push": { "upvoted": ObjectId("53f51c0a4ffa9b02cf01c075") },
"$inc": { "upvoteCount": 1, "downvoteCount": -1 },
"$pull": { "downvoted": ObjectId("53f51c0a4ffa9b02cf01c075") },
}
)
db.questions.update(
{
"_id": ObjectId("53f51a844ffa9b02cf01c074"),
"upvoted": { "$ne": ObjectId("53f51c0a4ffa9b02cf01c075") }
},
{
"$push": { "upvoted": ObjectId("53f51c0a4ffa9b02cf01c075") },
"$inc": { "upvoteCount": 1 },
}
)
Actually that's two operations, which you could also send with the Bulk operations API (probably the best way really), but there is a point to it. The first statement only matches a document where the current user has a "downvote" recorded, as in, that user id value was already "pushed" to the "downvoted" array. If it is not there then no update is made. When it does match, you push and pull from the respective arrays and "increment/decrement" the counter fields in the same operation.
With the second statement which will only match something where the first did not, you make a fair assessment that now you don't need to touch "downvotes" and just handle the upvote fields. In both cases the safe thing to do is make sure that the main condition is the current user id value is not present in the "upvoted" array.
For downvotes the fields are just reversed:
db.questions.update(
{
"_id": ObjectId("53f51a844ffa9b02cf01c074"),
"downvoted": { "$ne": ObjectId("53f51c0a4ffa9b02cf01c075") },
"upvoted": ObjectId("53f51c0a4ffa9b02cf01c075")
},
{
"$pull": { "upvoted": ObjectId("53f51c0a4ffa9b02cf01c075") },
"$inc": { "upvoteCount": -1, "downvoteCount": 1 },
"$push": { "downvoted": ObjectId("53f51c0a4ffa9b02cf01c075") },
}
)
db.questions.update(
{
"_id": ObjectId("53f51a844ffa9b02cf01c074"),
"downvoted": { "$ne": ObjectId("53f51c0a4ffa9b02cf01c075") }
},
{
"$push": { "downvoted": ObjectId("53f51c0a4ffa9b02cf01c075") },
"$inc": { "downvoteCount": 1 },
}
)
Naturally you can see the logical progression to simply cancelling any "upvote/downvote" for the user in question. Also you can be smart about it if you want and expose the information in your client to not only show if the current user has already "upvoted/downvoted" but also control click actions and eliminate unnecessary requests.
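As a rough illustration of that progression, here is a minimal in-memory sketch in plain JavaScript, assuming the schema above. The function name `toggleUpvote` and its return values are hypothetical, and nothing here talks to MongoDB; each branch only mirrors the guard condition that the corresponding update statement would carry in its query document.

```javascript
// In-memory model only: `toggleUpvote` is an illustrative name, not a MongoDB API.
// Each branch corresponds to one guarded update statement against the document.
function toggleUpvote(doc, userId) {
  const up = doc.upvoted.indexOf(userId);
  const down = doc.downvoted.indexOf(userId);

  if (up !== -1) {
    // Cancel: would match on { "upvoted": userId } and $pull / $inc -1
    doc.upvoted.splice(up, 1);
    doc.upvoteCount -= 1;
    return "cancelled";
  }
  if (down !== -1) {
    // Switch: would match on "upvoted" $ne userId AND "downvoted": userId
    doc.downvoted.splice(down, 1);
    doc.downvoteCount -= 1;
  }
  // Add: would match on "upvoted" $ne userId
  doc.upvoted.push(userId);
  doc.upvoteCount += 1;
  return down !== -1 ? "switched" : "added";
}
```

The return value stands in for the bool the question asks for; with real updates you would derive it from which statement reported a modified document.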

Related

Maintaining an embedded array with top 3 elements

I'm currently working on a mobile car racing game.
After a user finishes a track, a new document is added to "Plays" collections.
Also, if the user finishes the track 3rd/2nd/1st in time, the user id and time will be added to the "best" array of this track (and the new 4th place user will be removed from the array).
Since 2+ users can finish a track at the same time, I'll probably need to make this atomic, so I've used findAndModify.
So far I've managed to do it well if I only maintain the 1st position in the array. This is what I did:
db.collection('tracks').findAndModify(
{ $or: [ {_id: track_id, 'best': {$exists: false}}, {_id: track_id,'best.0.time': {$gt: _time}} ] },
[],
{$set : {'best.0' : {'user_id': _userId, 'time': _time} }},
(err, data) => {
if (err) return app_res.send(err);
app_res.send (data.value != null);
}
);
But my goal is to maintain the 3 best.
I've looked in the MongoDB documentation for array operators but I can't understand how (and if) they can help me achieve my goal.
Is there any way I can do it?
EDIT: Just to make this more clear, the top 3 indicates the top 3 users and their top times. for example, if "best" array is:
1. user: a, time : 5.
2. user: b, time : 9.
3. user: c, time : 20.
and then user c finishes the track in 7 seconds, then "best" changes to:
1. user: a, time : 5.
2. user: c, time : 7.
3. user: b, time : 9.
My Schema:
Users:
{
"_id": {
"$oid": "123"
},
"name": "A name"
}
Tracks:
{
"_id": {
"$oid": "765"
},
"name": "A track name",
"length": 34.65,
"best": [{"user_id": 467,"time": 24},{"user_id": 532,"time": 47},{"user_id": 953,"time": 89}]
}
Plays:
{
"_id": {
"$oid": "1"
},
"time": 300000,
"date": {
"$date": "2018-08-15T14:05:47.872Z"
},
"user_id": {
"$oid": "123"
},
"track_id": {
"$oid": "765"
}
}
Here is how you'd do that - using some special modifiers that can be used with $push:
db.tracks.update({}, {
$push: {
"best": {
$each: [ {"user_id": 123,"time": 1} ], // add a new item to the "best" array
$slice: 3, // keep only top three
$sort: { "time": 1 } // rank/sort based on "time" field
}
}
})
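To make the effect of those modifiers concrete, here is a small in-memory sketch in plain JavaScript. The helper `pushBest` is hypothetical and only imitates the append, sort, then truncate behaviour on a plain array; it does not call MongoDB. Like the `$push` statement itself, it does not remove an earlier entry for the same user, so the "user c improves their time" case from the question would need extra handling.

```javascript
// Illustrative stand-in for $push with $each / $sort / $slice on a plain array.
// `pushBest` and `limit` are assumed names, not part of any MongoDB API.
function pushBest(best, entry, limit = 3) {
  return [...best, entry]               // $each: append the new item
    .sort((a, b) => a.time - b.time)    // $sort: { "time": 1 }
    .slice(0, limit);                   // $slice: keep the first `limit` items
}
```

In a real update you would presumably also replace the empty `{}` query with the track's `_id` so that only one track is touched.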

Get sum of Nested Array in Aggregate

Ok, I have an issue I cannot seem to solve.
I have a document like this:
{
"playerId": "43345jhiuy3498jh4358yu345j",
"leaderboardId": "5b165ca15399c020e3f17a75",
"data": {
"type": "EclecticData",
"holeScores": [
{
"type": "RoundHoleData",
"xtraStrokes": 0,
"strokes": 3,
},
{
"type": "RoundHoleData",
"xtraStrokes": 1,
"strokes": 5,
},
{
"type": "RoundHoleData",
"xtraStrokes": 0,
"strokes": 4
}
]
}
}
Now, what I am trying to accomplish is to use aggregate to sum the strokes and then order by that sum afterwards. I am trying this:
var sortedBoard = db.collection.aggregate(
{$match: {"leaderboardId": boardId}},
{$group: {
_id: "$playerId",
played: { $sum: 1 },
strokes: {$sum: '$data.holeScores.strokes'}
}
},
{$project:{
type: "$SortBoard",
avgPoints: '$played',
sumPoints: "$strokes",
played : '$played'
}}
);
The issue here is that I do not get the strokes sum correct, since this is inside another array.
Hope someone can help me with this and thanks in advance :-)
You need to say $sum twice:
var sortedBoard = db.collection.aggregate([
{ "$match": { "leaderboardId": boardId}},
{ "$group": {
"_id": "$playerId",
"SortBoard": { "$first": "$SortBoard" },
"played": { "$sum": 1 },
"strokes": { "$sum": { "$sum": "$data.holeScores.strokes"} }
}},
{ "$project": {
"type": "$SortBoard",
"avgPoints": "$played",
"sumPoints": "$strokes",
"played": "$played"
}}
])
The reason is because you are using it both as a way to "sum array values" and also as an "accumulator" for $group.
The other thing you appear to be missing is that $group only outputs the fields you tell it to, therefore if you want to access other fields in other stages or output, you need to keep them with something like $first or another accumulator. We also appear to be missing a pipeline stage in the question anyway, but it's worth noting just to be sure.
Also note you really should wrap aggregation pipelines as an official array [], because the legacy usage is deprecated and can cause problems in some language implementations.
Returns the correct details of course:
{
"_id" : "43345jhiuy3498jh4358yu345j",
"avgPoints" : 1,
"sumPoints" : 12,
"played" : 1
}
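The two roles of `$sum` can be pictured with a plain JavaScript sketch. This is only an in-memory analogy, with the hypothetical helper `sumStrokesByPlayer`: the inner `reduce` plays the part of summing the array values inside one document, and the running total per player plays the part of the `$group` accumulator.

```javascript
// In-memory analogy of { "$sum": { "$sum": "$data.holeScores.strokes" } }.
// `sumStrokesByPlayer` is an assumed name for illustration only.
function sumStrokesByPlayer(docs) {
  const totals = {};
  for (const doc of docs) {
    // Inner sum: add up the strokes array within a single document
    const perDoc = doc.data.holeScores.reduce((sum, hole) => sum + hole.strokes, 0);
    // Outer sum: accumulate across all documents that share a playerId
    const entry = totals[doc.playerId] || (totals[doc.playerId] = { played: 0, strokes: 0 });
    entry.played += 1;
    entry.strokes += perDoc;
  }
  return totals;
}
```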

Mongodb aggregate to find if a user is in any other user's follower list

I collected followers list and friends list for n number of users from twitter and stored them in mongodb.
Here is a sample document:
{
"_id": ObjectId("561d6f8986a0ea57e51ec95c"),
"status": "True",
"UserId": "1489245878",
"followers": [
"1566382441",
"1155774331"
],
"followersCount": 2,
"friendsCount": 5,
"friends": [
"1135511478",
"998082481",
"565321118",
"848123988",
"343334562"
]
}
I wanted to know whether, within my collection, there are any user ids that also appear in the followers list of some other document. Let's say we have user "a"; I would like to know if user "a" is in the followers list of any other document within the same collection. I'm not sure how to do this. If there is a match, I would like to project the userid and the _id of the document that has the userid within the followers list.
I guess you can use aggregate function like below to get this result.
db.getCollection("your_collection").aggregate([
{
"$match": {
"followers": "1566382441"
}
},
{
"$project": {
"followers": 1
}
},
{
"$unwind": "$followers"
},
{
"$match": {
"followers": "1566382441"
}
},
{
"$group": {
"_id": "$followers",
"ids": {
"$addToSet": "$_id"
}
}
},
{
"$project": {
"userId": "$_id",
"ids": 1,
"_id": 0
}
}
])
I am using only a sample of your data. You can add your list of users for whom you are trying to filter in both stages of "$match". Just see if this helps.
P.S.: I know it's been a long time since you asked this question! But you know, it's never too late!
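For several users at once, one possible way to fill in both "$match" stages is with "$in". The helper below is a hypothetical sketch (the name `buildFollowerPipeline` is not from any library); it only builds the pipeline array, which you would then pass to `aggregate` yourself.

```javascript
// Hypothetical helper: builds the same pipeline for a list of user ids.
// Only standard operators are used ($match, $in, $project, $unwind, $group, $addToSet).
function buildFollowerPipeline(userIds) {
  const matchListed = () => ({ "$match": { "followers": { "$in": userIds } } });
  return [
    matchListed(),                       // documents with at least one listed follower
    { "$project": { "followers": 1 } },
    { "$unwind": "$followers" },
    matchListed(),                       // drop unwound followers that are not in the list
    { "$group": { "_id": "$followers", "ids": { "$addToSet": "$_id" } } },
    { "$project": { "userId": "$_id", "ids": 1, "_id": 0 } }
  ];
}
```

The list itself could come from the "UserId" values already stored in the collection, though how you collect it is up to you.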

MongoDB Sum Array With Objects

Say I have an aggregation that returns the following:
[
{driverId: 21312asd12, cars: 2, totalMiles: 30000, family: 4},
{driverId: 55512a23a2, cars: 3, totalMiles: 55000, family: 2},
...
]
How would I go about running a summation of each data set on a groupId basis to return the following? Do I use an $unwind? Do another grouping?
For example I would like to return:
{
totalDrivers: 2,
totalCars: 5,
totalMiles: 85000,
totalFamily: 6
}
You seem to just be referring to the documents in the output as an "array", therefore just add another $group to the end of your pipeline:
{ "$group": {
"_id": null,
"totalDrivers": { "$sum": 1 },
"totalCars": { "$sum": "$cars" },
"totalMiles": { "$sum": "$totalMiles" },
"totalFamily": { "$sum": "$family" }
}}
Where null is essentially just a blank grouping key that is not a field present in the document to group on. The result should be a single document (albeit in an array, depending on the API method call used or server version).
Or if you actually mean that each document has a field with an array like this, then $unwind and process the group either per document or with a null as above:
{ "$unwind": "$someArray" },
{ "$group": {
"_id": "$_id",
"totalDrivers": { "$sum": 1 },
"totalCars": { "$sum": "$someArray.cars" },
"totalMiles": { "$sum": "$someArray.totalMiles" },
"totalFamily": { "$sum": "$someArray.family" }
}}
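The first form, grouping on a `null` key, behaves much like a single reduce over all rows. A plain JavaScript sketch of that idea, assuming rows shaped like the output in the question (the helper name `totalUp` is made up for illustration):

```javascript
// In-memory analogy of { "$group": { "_id": null, ... } }: every row lands in one bucket.
// `totalUp` is an assumed name, not a MongoDB function.
function totalUp(rows) {
  return rows.reduce((acc, row) => ({
    totalDrivers: acc.totalDrivers + 1,            // { "$sum": 1 }
    totalCars: acc.totalCars + row.cars,           // { "$sum": "$cars" }
    totalMiles: acc.totalMiles + row.totalMiles,   // { "$sum": "$totalMiles" }
    totalFamily: acc.totalFamily + row.family      // { "$sum": "$family" }
  }), { totalDrivers: 0, totalCars: 0, totalMiles: 0, totalFamily: 0 });
}
```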
At any rate, you should really post the code you are using when asking questions like this. It is very likely that your pipeline may not be as efficient to get to your end goal as you think, and if you posted that it both gives a clear picture of what you are doing as well as leaves it open for suggested improvement.

Sub-query in MongoDB

I have two collections in MongoDB, one with users and one with actions. Users look roughly like:
{_id: ObjectId("xxxxx"), country: "UK",...}
and actions like
{_id: ObjectId("yyyyy"), createdAt: ISODate(), user: ObjectId("xxxxx"),...}
I am trying to count events and distinct users split by country. The first half of which is working fine, however when I try to add in a sub-query to pull the country I only get nulls out for country
db.events.aggregate({
$match: {
createdAt: { $gte: ISODate("2013-01-01T00:00:00Z") },
user: { $exists: true }
}
},
{
$group: {
_id: {
year: { $year: "$createdAt" },
user_obj: "$user"
},
count: { $sum: 1 }
}
},
{
$group: {
_id: {
year: "$_id.year",
country: db.users.findOne({
_id: { $eq: "$_id.user_obj" },
country: { $exists: true }
}).country
},
total: { $sum: "$count" },
distinct: { $sum: 1 }
}
})
No Joins in here, just us bears
So MongoDB "does not do joins". You might have tried something like this in the shell for example:
db.events.find().forEach(function(event) {
event.user = db.users.findOne({ "_id": event.user });
printjson(event)
})
But this does not do what you seem to think it does. It does exactly what it looks like: it runs a query on the "users" collection for every item that is returned from the "events" collection, with each round trip going "to and from" the "client" rather than running on the server.
For the same reasons your 'embedded' statement within an aggregation pipeline does not work like that. Unlike the above, the "whole pipeline" logic is sent to the server before execution. So if you did something like this to select "UK" users:
db.events.aggregate([
{ "$match": {
"user": {
"$in": db.users.distinct("_id",{ "country": "UK" })
}
}}
])
Then that .distinct() query is actually evaluated on the "client" and not the server, and therefore has no access to any document values in the aggregation pipeline. So the .distinct() runs first, returns its array as an argument, and then the whole pipeline is sent to the server. That is the order of execution.
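That order of execution is ordinary JavaScript argument evaluation, which can be seen without a database at all. The sketch below uses stand-in objects (`fakeUsers` and `fakeEvents` are hypothetical and merely record calls); it is not how the shell is implemented, just a way to watch the nested call run before the outer one.

```javascript
// Stand-ins that only log the order in which they are called; no MongoDB involved.
const calls = [];
const fakeUsers = {
  distinct(field, query) { calls.push("distinct"); return ["id1", "id2"]; }
};
const fakeEvents = {
  aggregate(pipeline) { calls.push("aggregate"); return pipeline; }
};

// The inner call is evaluated first, so the pipeline already contains a plain array
// by the time aggregate() receives it.
const sent = fakeEvents.aggregate([
  { "$match": { "user": { "$in": fakeUsers.distinct("_id", { "country": "UK" }) } } }
]);
```

After this runs, `calls` holds "distinct" followed by "aggregate", and `sent` contains only literal values. The same evaluation rule is why the `findOne()` in the question never sees "$_id.user_obj" as anything other than a literal string.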
Correcting
You need at least some level of de-normalization for the sort of query you want to run to work. So you generally have two choices:
Embed your whole user object data within the event data.
At least embed "some" of the user object data within the event data. In this case "country" because you are going to use it.
So then if you follow the "second" case there and at least "extend" your existing data a little to include the "country" like this:
{
"_id": ObjectId("yyyyy"),
"createdAt": ISODate(),
"user": {
"_id": ObjectId("xxxxx"),
"country": "UK"
}
}
Then the "aggregation" process becomes simple:
db.events.aggregate([
{ "$match": {
"createdAt": { "$gte": ISODate("2013-01-01T00:00:00Z") },
"user": { "$exists": true }
}},
{ "$group": {
"_id": {
"year": { "$year": "$createdAt" },
"user_id": "$user._id",
"country": "$user.country"
},
"count": { "$sum": 1 }
}},
{ "$group": {
"_id": "$_id.country",
"total": { "$sum": "$count" },
"distinct": { "$sum": 1 }
}}
])
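To see what the two "$group" stages compute, here is an in-memory sketch in plain JavaScript, assuming events that already carry the embedded "user" object. The helper `countByCountry` is hypothetical and works on a plain array rather than a collection.

```javascript
// In-memory analogy of the two $group stages; `countByCountry` is an assumed name.
function countByCountry(events) {
  // Stage 1: one bucket per (year, user, country), counting events in each
  const perUser = new Map();
  for (const e of events) {
    const key = JSON.stringify([e.createdAt.getUTCFullYear(), e.user._id, e.user.country]);
    perUser.set(key, (perUser.get(key) || 0) + 1);
  }
  // Stage 2: regroup by country; "distinct" counts stage-1 buckets, so a user
  // active in two different years contributes two buckets here
  const result = {};
  for (const [key, count] of perUser) {
    const country = JSON.parse(key)[2];
    const entry = result[country] || (result[country] = { total: 0, distinct: 0 });
    entry.total += count;
    entry.distinct += 1;
  }
  return result;
}
```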
We're not normal
Fixing your data to include the information it needs on a single collection where we "do not do joins" is a relatively simple process. Just really a variant on the original query sample above:
var bulk = db.events.initializeUnorderedBulkOp(),
count = 0;
db.users.find().forEach(function(user) {
// update multiple events for user
bulk.find({ "user": user._id }).update({
"$set": { "user": { "_id": user._id, "country": user.country } }
});
count++;
// Send batch every 1000
if ( count % 1000 == 0 ) {
bulk.execute();
bulk = db.events.initializeUnorderedBulkOp();
}
});
// Clear any queued
if ( count % 1000 != 0 )
bulk.execute();
So that's what it's all about. Individual queries to a MongoDB server get "one collection" and "one collection only" to work with. Even the fantastic "Bulk Operations" as shown above can still only be "batched" on a single collection.
If you want to do things like "aggregate on related properties", then you "must" contain those properties in the collection you are aggregating data for. It is perfectly okay to live with having data sitting in separate collections, as for instance "users" would generally have more information attached to them than just an "_id" and a "country".
But the point here is if you need "country" for analysis of "event" data by "user", then include it in the data as well. The most efficient server join is a "pre-join", which is the theory in practice here in general.