Mongo Aggregation select and push last element in array - mongodb

I have documents with the following structure:
{
...,
trials:[ {...,
ref:[{a:1,b:2},{a:2,b:2},...]
},
{...,
ref:[{a:1,b:2}]
},
...,
]
}
Where ref is an array guaranteed to be of length of at least 1.
If I want to count the individual occurrences of each element across all of the ref arrays, I would use the following aggregation (this works fine):
db.cl.aggregate([
{$unwind:"$trials"},
{$unwind:"$trials.ref"},
{$group:{_id:"$trials.ref", count:{$sum:1}}}
])
Now I want to do the same thing, but only with the last element in each ref array. I need a way to only select the last element of each array in the aggregation pipeline.
I first thought I could add an intermediate step to collect just the elements that I want to group, by doing something like this:
db.cl.aggregate([
{$unwind:"$trials"},
{$group:{_id:null,arr:{$push:"$trials.ref.-1"}}},...
])
I've also tried using a position operator with $match.
db.cl.aggregate([
{$unwind:"$trials"},
{$match:{"trials.ref.$":-1}},...
])
Or trying to project the last element.
db.cl.aggregate([
{$unwind:"$trials"},
{$project:{ref:"$trials.ref.1"}}
])
None of these gets me anywhere. The $pop operator is not valid in the aggregation pipeline, and the $last operator isn't really useful here.
Any ideas on how to only use the last element of the ref array? I'd rather keep with the aggregation framework and NOT use Map Reduce.

The aggregation framework really has no way of dealing with this. Aside from lacking any "slice" type operator, the real problem here is the lack of any marker to tell where your inner array ends, and there really isn't any way to do that with any other form of document re-shaping.
For now at least, the mapReduce approach is very simple, and does not even require a reducer:
db.cl.mapReduce(
function() {
this.trials.forEach(function(trial) {
trial.ref = trial.ref.slice(-1);
});
var id = this._id;
delete this._id;
emit( id, this );
},
function(){},
{ "out": { "inline": 1 } }
)
In the future there might be some hope. Some form of $slice has been sought after for some time. But I did notice this interesting snippet inside the $map operator code, listed here for reference:
output.reserve(input.size());
for (size_t i=0; i < input.size(); i++) {
vars->setValue(_varId, input[i]);
Value toInsert = _each->evaluateInternal(vars);
if (toInsert.missing())
toInsert = Value(BSONNULL); // can't insert missing values into array
output.push_back(toInsert);
}
Note the for loop and the index value. I for one would vote to have this exposed as a variable within the $map operator, since once you know the current position and the length of the array you can effectively do "slicing".
But for now, there is no way to tell where you are in the array using $map, and if you $unwind both of your arrays, you lose the end-points of the inner arrays. So the aggregation framework lacks a solution to this right now.
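For readers on a newer server: MongoDB 3.2 and later added the $arrayElemAt expression, which accepts a negative index. The sketch below assumes such a version. The pipeline is only constructed here, not run against a server, and countLastRefs is a hypothetical plain-JavaScript helper that shows the result the pipeline is meant to produce.

```javascript
// Pipeline sketch: assumes MongoDB 3.2+ where $arrayElemAt is available.
// It is only built here, not executed against a server.
const pipeline = [
  { $unwind: "$trials" },
  // index -1 selects the last element of each trials.ref array
  { $project: { last: { $arrayElemAt: ["$trials.ref", -1] } } },
  { $group: { _id: "$last", count: { $sum: 1 } } }
];

// Hypothetical in-memory equivalent, to illustrate the intended counts.
function countLastRefs(docs) {
  const counts = new Map();
  for (const doc of docs) {
    for (const trial of doc.trials) {
      const last = trial.ref[trial.ref.length - 1];
      const key = JSON.stringify(last); // stand-in for grouping by sub-document
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  return counts;
}

// Made-up sample data following the structure in the question.
const sampleDocs = [
  { trials: [ { ref: [{ a: 1, b: 2 }, { a: 2, b: 2 }] }, { ref: [{ a: 1, b: 2 }] } ] },
  { trials: [ { ref: [{ a: 2, b: 2 }] } ] }
];
const lastCounts = countLastRefs(sampleDocs);
```

With a live collection, the pipeline array would be passed to db.cl.aggregate(pipeline).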

Related

Mongoose find query - specific items or empty array

I'm trying to build a filter where it should be possible to query all items selected in an array, but also to show documents where the property has not been set.
Returning all documents that match specific values from an array works fine with the code below:
searchCriteria.filter1 = {
$in: array.category,
};
searchCriteria.filter2 = {
$in: array.carbrand,
};
// Then the data is fetched
const fetchedData = await Activity.find(searchCriteria)
.sort({ date: -1 })
.limit(limit)
.skip(startIndex)
.exec();
However, sometimes users have not added a category, and the field is just an empty array. My goal is to get the documents with these empty arrays as well as the ones matching the specific values.
So something like:
searchCriteria.filter2 = {
$in: array.carbrand OR is []
};
Any suggestions?
One way you could approach this is indeed to use the $or operator. But since $in is logically an OR for a single value, you should be able to just append [] to the list being used for comparison by the $in operator.
The approach is demonstrated in this playground example.
I believe you could adjust the code in a manner similar to this:
searchCriteria.filter2 = {
$in: [...array.carbrand, []]
};
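As a minimal sketch of the same idea, the helper below (inOrEmpty is a hypothetical name, and the sample selections are made up) builds the $in condition with the extra empty array appended, so both filters can share it:

```javascript
// Hypothetical helper: builds an $in condition that also lists an empty array.
function inOrEmpty(values) {
  return { $in: [...values, []] };
}

// Assumed sample selections; the filter names follow the question.
const selected = { category: ["rental"], carbrand: ["Volvo", "Saab"] };
const searchCriteria = {
  filter1: inOrEmpty(selected.category),
  filter2: inOrEmpty(selected.carbrand)
};
```

The resulting searchCriteria object can be passed to Activity.find() as in the question.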

MongoDB $pop element which is in 3 time nested array

Here is the data structure for each document in the collection. The data structure is fixed.
{
'_id': 'some-timestamp',
'RESULT': [
{
'NUMERATION': [ // numeration of divisions
{
// numeration of producttypes
'DIVISIONX': [{'PRODUCTTYPE': 'product xy', COUNT: 100}]
}
]
}
]
}
The query result should be in the same structure but only contain producttypes matching a regular expression.
I tried using a nested $elemMatch operator, but this doesn't get me any closer. I don't know how I can iterate over each value in the producttypes array for each division.
How can I do that? Then I could apply $pop, $in and $each.
I looked at:
Querying an array of arrays in MongoDB
https://docs.mongodb.com/manual/reference/operator/update/each/
https://docs.mongodb.com/manual/reference/operator/update/pop/
... and more
The solution I want to avoid is writing something like this:
collection.find().forEach(function(x) { /* more for eaches */ })
Edit:
Here is an example document to copy:
{"_id":"5ab550d7e85d5930b0879cbe","RESULT":[{"NUMERATION":[{"DIVISION":[{"PRODUCTTYPE":"Book","COUNT":10},{"PRODUCTTYPE":"Giftcard","COUNT":"300"}]}]}]}
E.g. the query result should only return the entry with the giftcard:
{"_id":"5ab550d7e85d5930b0879cbe","RESULT":[{"NUMERATION":[{"DIVISION":[{"PRODUCTTYPE":"Giftcard","COUNT":"300"}]}]}]}
Using the forEach approach the result is in the correct format. I'm still looking for a better way which does not involve the use of that function - therefore I will not mark this as an answer.
But for now this works fine:
db.collection.find().forEach(
function(wholeDocument) {
wholeDocument['RESULT'].forEach(function (resultEntry) {
resultEntry['NUMERATION'].forEach(function (numerationEntry) {
// example condition (will be replaced by regular expression evaluation);
// filter() is used because splice() inside forEach skips the element after each removal
numerationEntry['DIVISION'] = numerationEntry['DIVISION'].filter(function(divisionEntry) {
return divisionEntry['PRODUCTTYPE'] == 'Giftcard';
});
})
})
print(wholeDocument);
}
)
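The regular-expression version of that condition could look like the sketch below. It is plain JavaScript that works on a copy of the document instead of mutating it. filterProductTypes is a hypothetical helper name and /^Gift/ is only an example pattern.

```javascript
// Hypothetical helper: returns a copy of the document that keeps only
// DIVISION entries whose PRODUCTTYPE matches the given regular expression.
function filterProductTypes(doc, pattern) {
  return {
    ...doc,
    RESULT: doc.RESULT.map(function (resultEntry) {
      return {
        ...resultEntry,
        NUMERATION: resultEntry.NUMERATION.map(function (numerationEntry) {
          return {
            ...numerationEntry,
            DIVISION: numerationEntry.DIVISION.filter(function (divisionEntry) {
              return pattern.test(divisionEntry.PRODUCTTYPE);
            })
          };
        })
      };
    })
  };
}

// Example document from the question.
const exampleDoc = {"_id":"5ab550d7e85d5930b0879cbe","RESULT":[{"NUMERATION":[{"DIVISION":[{"PRODUCTTYPE":"Book","COUNT":10},{"PRODUCTTYPE":"Giftcard","COUNT":"300"}]}]}]};
const filtered = filterProductTypes(exampleDoc, /^Gift/);
```

The pattern is written without the g flag so that repeated test() calls do not carry state between entries.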
UPDATE
Thanks to Rahul Raj's comments I have read up on aggregation with the $redact operator. A prototype of the solution to the issue is this query:
db.getCollection('DeepStructure').aggregate( [
{ $redact: {
$cond: {
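// keep levels that have no PRODUCTTYPE field, and entries whose PRODUCTTYPE is "Giftcard"
if: { $eq: [ { $ifNull: [ "$PRODUCTTYPE", "Giftcard" ] }, "Giftcard" ] },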
then: "$$DESCEND",
else: "$$PRUNE"
}
}
}
]
)
I assume you're trying to update a nested array.
You need to use the positional operators $[] or $ for that.
If you use $[], you will be able to remove all matching nested array elements.
If you use $, only the first matching array element will be removed.
Use the $regex operator to pass in your regular expression.
Also, you need to use $pull to remove array elements based on a matching condition; in your case, that condition is a regular expression. Note that $elemMatch is not the right operator to use with $pull, because the arguments to $pull are already applied as a query against each array element.
db.collection.update(
{/*additional matching conditions*/},
{$pull: {"RESULT.$[].NUMERATION.$[].DIVISIONX":{PRODUCTTYPE: {$regex: "xy"}}}},
{multi: true}
)
Just replace xy with your regular expression and add your own matching conditions as required. I'm not quite sure about your data set, but I came up with the above answer based on my assumptions from the given info. Feel free to change according to your requirements.

Exact match when searching in arrays of array in MongoDB

I have two questions. I found similar things but I couldn't adapt to my problem.
query = {'$and': [{'cpc.class': u'24'},
{'cpc.section': u'A'},
{'cpc.subclass': u'C'}]}
collection:
{"_id":1,
"cpc":
[{u'class': u'24',
u'section': u'A',
u'subclass': u'B'},
{u'class': u'07',
u'section': u'C',
u'subclass': u'C'},]}
{"_id":2,
"cpc":
[{u'class': u'24',
u'section': u'A',
u'subclass': u'C'},
{u'class': u'07',
u'section': u'K',
u'subclass': u'L'},]}
With this query, both documents are fetched.
1) But I want to fetch only the second document ("_id": 2) because it matches the query exactly. That is, the second document contains a single cpc element whose class equals 24, whose section equals A, and whose subclass equals C.
2) And I want to fetch only the matching element of cpc if possible? Otherwise I have to traverse all elements of each retrieved documents; if I traverse and try to find out which element matches exactly then my first question would be meaningless.
Thanks!
1) You're looking for the $elemMatch operator, which compares subdocuments as a whole and is more concise than separate subelement queries (you don't need the $and in your query, by the way):
query = { 'cpc' : {
'$elemMatch': { 'class': u'24',
'section': u'A',
'subclass': u'C' } } };
2) That can be done using a projection:
db.collection.find(query, { "cpc.$" : 1 })
The $ projection operator documentation contains pretty much this use case as an example.
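To illustrate why the original dot-notation query returns both documents, here is an in-memory sketch in plain JavaScript. The two helper functions are hypothetical stand-ins that only imitate the matching rules for this example. They are not MongoDB APIs.

```javascript
// The two documents from the question, written as plain JavaScript objects.
const docs = [
  { _id: 1, cpc: [ { class: "24", section: "A", subclass: "B" },
                   { class: "07", section: "C", subclass: "C" } ] },
  { _id: 2, cpc: [ { class: "24", section: "A", subclass: "C" },
                   { class: "07", section: "K", subclass: "L" } ] }
];
const wanted = { class: "24", section: "A", subclass: "C" };

// Dot-notation behaviour: each condition may be satisfied by a different element.
function matchesAcrossElements(doc, cond) {
  return Object.keys(cond).every(function (key) {
    return doc.cpc.some(function (el) { return el[key] === cond[key]; });
  });
}

// $elemMatch behaviour: one single element must satisfy all conditions.
function matchesSingleElement(doc, cond) {
  return doc.cpc.some(function (el) {
    return Object.keys(cond).every(function (key) { return el[key] === cond[key]; });
  });
}

const dotIds = docs.filter(d => matchesAcrossElements(d, wanted)).map(d => d._id);
const elemIds = docs.filter(d => matchesSingleElement(d, wanted)).map(d => d._id);
```

Document 1 passes the first check because subclass C is found in its second element while class 24 and section A come from its first element. Only document 2 has all three values in the same element.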

How to force MongoDB pullAll to disregard document order

I have a mongoDB document that has the following structure:
{
user:user_name,
streams:[
{user:user_a, name:name_a},
{user:user_b, name:name_b},
{user:user_c, name:name_c}
]
}
I want to use $pullAll to remove from the streams array, passing it an array of streams (the size of the array varies from 1 to N):
var streamsA = [{user:"user_a", name:"name_a"},{user:"user_b", name:"name_b"}]
var streamsB = [{name:"name_a", user:"user_a"},{name:"name_b", user:"user_b"}]
I use the following mongoDB command to perform the update operation:
db.streams.update({user:"user_name"}, {"$pullAll":{streams:streamsA}})
db.streams.update({user:"user_name"}, {"$pullAll":{streams:streamsB}})
Removing streamsA succeeds, whereas removing streamsB fails. After digging through the mongoDB manuals, I saw that the order of fields in streamsA and streamsB records has to match the order of fields in the database. For streamsB the order does not match, that's why it was not removed.
I can reorder the streams to the database document order prior to performing an update operation, but is there an easier and cleaner way to do this? Is there some flag that can be set to update and/or pullAll to ignore the order?
Thank You,
Gary
The $pullAll operator is really a "special case" that was mostly intended for single "scalar" array elements and not for sub-documents in the way you are using it.
Instead use $pull which will inspect each element and use an $or condition for the document lists:
db.streams.update(
{ "user": "user_name" },
{ "$pull": { "streams": { "$or": streamsB } }}
)
That way the order of the fields does not matter, and the operation no longer looks for an "exact match" of the whole sub-document, which is what the current $pullAll operation is actually doing.
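The difference between the two behaviours can be sketched in plain JavaScript. Both helpers below are hypothetical stand-ins: comparing JSON.stringify output imitates an order-sensitive exact match, and the field-by-field check imitates a $pull with an $or of conditions. Neither one calls MongoDB.

```javascript
// Stored array and the re-ordered removal list from the question.
const stored = [
  { user: "user_a", name: "name_a" },
  { user: "user_b", name: "name_b" },
  { user: "user_c", name: "name_c" }
];
const streamsB = [ { name: "name_a", user: "user_a" }, { name: "name_b", user: "user_b" } ];

// Order-sensitive comparison, a stand-in for the exact match $pullAll performs.
function pullAllExact(array, targets) {
  const keys = targets.map(t => JSON.stringify(t));
  return array.filter(el => !keys.includes(JSON.stringify(el)));
}

// Field-by-field comparison, a stand-in for $pull with an $or of conditions.
function pullByFields(array, conditions) {
  return array.filter(el =>
    !conditions.some(cond => Object.keys(cond).every(k => el[k] === cond[k])));
}

const afterPullAll = pullAllExact(stored, streamsB);
const afterPull = pullByFields(stored, streamsB);
```

In this sketch the exact comparison removes nothing because the field order differs, while the field-by-field comparison leaves only the user_c entry.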

Mongodb alternative to Dot notation to edit nested fields with $inc

Conceptually, what I am trying to figure out is whether there is an alternative to dot notation for accessing nested documents in Mongo.
What I am trying to accomplish:
I have a user collection, and each user has a nested songVotes collection where the keys are the songIds and the value is the user's vote: -1, 0, or 1.
I have a "room collection" where many users go and their collective votes for each song influence the room. A single room also has a nested songVotes collection with songIds as keys, but the value is the accumulated total of the votes of all the users in the room. For the purposes of Meteor.js, it's more efficient to add users' votes to this nested cumulative vote collection as they enter the room.
Again, because reactive joins in Meteor.js aren't supported in any kind of efficient way, it also doesn't make sense to break out these nested collections to solve my problem.
So what I am having trouble with is the update operation when a user first enters the room, where I take a single user's nested songVotes collection and use the mongo $inc operator to apply it to the nested cumulative songVotes collection of the entire room.
The problem is that if you want to use the $inc operator with nested fields, you must use dot notation to access them. So what I am asking in a broad sense is whether there is a nice way to apply updates like this to a nested object, or perhaps to specify a global dot notation prefix for $inc, something like:
var userVotes = db.collection.users.findOne('user_id').songVotes
// userVotes --> { 'song1': 1, 'song2': -1 ... }
db.rooms.update({ _id: 'blah' }, { $set: { roomSongVotes: { $inc: userVotes } } })
You do need to use dot notation, but you can still do that in your case by programmatically building up the update object:
var userVotes = {'song1': 1, 'song2': -1};
var update = {$inc: {}};
for (var songId in userVotes) {
update.$inc['roomSongVotes.' + songId] = userVotes[songId];
}
db.rooms.update({ _id: 'blah' }, update);
This way, update gets built up as:
{ '$inc': { 'roomSongVotes.song1': 1, 'roomSongVotes.song2': -1 } }