$limit is ignored after $unwind with mongo aggregate - mongodb

My MongoDB documents have the following structure:
{
"_id":"EAovkxq63eXshdEb5",
"gender":"man",
"firstName":"Test",
"lastName":"Test",
"otherFirstName":"",
...
"cooperatives":[
{
"cooperativeId":"xaPFLy2XbJFsDA2st",
...
}
]
}
I am trying to fetch only the cooperators whose cooperatives.cooperativeId has a certain value, and also to filter the cooperatives sub-array so that only the matching entry is kept. Finally, I need pagination so that only a limited amount of data is sent back to the front end:
cooperators.aggregate(
{ $match: query },
{ $sort: {'displayName':1} },
{ $unwind: '$cooperatives' },
{ $match: { 'cooperatives.cooperativeId': 'myid'} },
{ $skip: 10 },
{ $limit: 50 }
);
The result contains all the documents matching my criteria, but $skip and $limit are completely ignored.
Thanks for your help!
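One possible explanation, offered as an assumption rather than a confirmed diagnosis: many drivers and wrappers expect the whole pipeline as a single array in the first argument and may treat later arguments as options, so stages passed as separate arguments can be dropped silently. A minimal sketch that builds the pipeline as one array (buildCooperatorPipeline and its parameters are hypothetical names, not part of any library):

```javascript
// Hypothetical helper: returns the whole pipeline as ONE array of stages.
function buildCooperatorPipeline(query, cooperativeId, skip, limit) {
  return [
    { $match: query },
    { $sort: { displayName: 1 } },
    { $unwind: "$cooperatives" },
    // Keep only the unwound entries that belong to the requested cooperative.
    { $match: { "cooperatives.cooperativeId": cooperativeId } },
    // Pagination comes last, after the per-element filter.
    { $skip: skip },
    { $limit: limit },
  ];
}

const cooperatorPipeline = buildCooperatorPipeline({}, "myid", 10, 50);
// Assumed call shape: cooperators.aggregate(cooperatorPipeline)
```

If the results are still not paginated with an array pipeline, the cause lies elsewhere and this sketch does not apply.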

Related

How to copy a value from an array to another field and then pop it?

Assume we have inserted the following object.
db.getCollection('things').insert(
{ plonks: [ "plonk1", "plonk2", "plonk3" ], _id: 1001 }
)
I can update the document and insert a new field like so.
db.getCollection('things').update(
{ _id: 1001 },
{ $set: { plonk: "beep" } }
)
I can also pop the latest addition like this.
db.getCollection('things').update(
{ _id: 1001 },
{ $pop: { plonks: 1 } }
)
Now, I'd like to connect those two operations somehow so that the popped element will be set as a value in the separate field. I want to end up in a document looking as follows.
{
_id: 1001,
plonk: "plonk3",
plonks: [ "plonk1", "plonk2" ]
}
There's nothing on fetching/storing the popped value in the docs for $pop. I've seen an answer to a similar question, but it didn't work out because it copies the element onto itself, reducing the original array. I tried reading up on aggregate, as I sensed it would let me provide a list of operations, but I'm fumbling in the dark without any method of approach.
I also tried a combination of $match and $set, with some success when copying over the whole array like this.
db.getCollection('things').aggregate([
{ $match: { _id: 1001 } },
{ $set: { plonk: "$plonks" } }
])
However, I haven't got the $slice to work.
db.getCollection('things').aggregate([
{ $match: { _id: 1001 } },
{ $set: { plonk: { plonks: { $slice: 1 } } } }
])
You can use MongoDB's support for aggregation pipelines in the update command.
Note: This works only on MongoDB version >= 4.2.
For the plonk key, use $arrayElemAt to fetch the last element of the array.
For the plonks key, use the $slice operator, where the number of elements to keep is the size of the plonks array minus 1.
db.getCollection('things').update({
"plonks": {"$exists": true}, // Find Conditions
},
[
{
"$set": {
"plonk": { // Fetch last element from `plonks` array
"$arrayElemAt": [
"$plonks",
-1
]
},
"plonks": {
"$slice": [ // Get elements from array till n-1
"$plonks",
{
"$subtract": [
{
"$size": "$plonks" // Fetches size of `plonks` array
},
1
]
},
],
},
}
},
])
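To make the effect of the two expressions concrete, here is a plain JavaScript mirror of what the pipeline computes for one document. It is an illustration only (popIntoField is a hypothetical name and nothing here talks to MongoDB); it assumes plonks is a non-empty array.

```javascript
// Plain JavaScript mirror of the update pipeline, for illustration only.
function popIntoField(doc) {
  const plonks = doc.plonks;
  return {
    ...doc,
    // Mirrors $arrayElemAt: ["$plonks", -1] (the last element).
    plonk: plonks[plonks.length - 1],
    // Mirrors $slice: ["$plonks", size - 1] (everything except the last element).
    plonks: plonks.slice(0, plonks.length - 1),
  };
}

const poppedDoc = popIntoField({
  _id: 1001,
  plonks: ["plonk1", "plonk2", "plonk3"],
});
```

Both fields are computed from the original array within the same $set, which is why the popped value and the shortened array stay consistent.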

MongoDB - count by field, and sort by count

I am new to MongoDB and to writing anything beyond very basic queries, and I haven't managed to create a query that does the following:
I have a collection like this, where each document represents one "use" of a benefit (e.g. the first document states that benefit "123" was used once):
[
{
"id" : "1111",
"benefit_id":"123"
},
{
"id":"2222",
"benefit_id":"456"
},
{
"id":"3333",
"benefit_id":"456"
},
{
"id":"4444",
"benefit_id":"789"
}
]
I need to create a query that outputs an array with the most used benefit at the top, along with how many times it was used.
For the above example the query should output:
[
{
"benefit_id":"456",
"cnt":2
},
{
"benefit_id":"123",
"cnt": 1
},
{
"benefit_id":"789",
"cnt":1
}
]
I have tried to work with the documentation and with $sortByCount but with no success.
$group
$group by benefit_id and get the count using $sum
$sort by count in descending order
db.collection.aggregate([
{
$group: {
_id: "$benefit_id",
count: { $sum: 1 }
}
},
{ $sort: { count: -1 } }
])
$sortByCount
Same operation using $sortByCount operator
db.collection.aggregate([
{ $sortByCount: "$benefit_id" }
])
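Both pipelines above return the fields _id and count, while the question asks for benefit_id and cnt. A sketch of one way to get that shape is to add a $project stage that renames the fields. The field names come from the question; countBenefits is a hypothetical in-memory mirror included only to show the expected result.

```javascript
// Pipeline sketch: count per benefit_id, sort, then rename the output fields.
const benefitCountPipeline = [
  { $group: { _id: "$benefit_id", cnt: { $sum: 1 } } },
  { $sort: { cnt: -1 } },
  { $project: { _id: 0, benefit_id: "$_id", cnt: 1 } },
];

// In-memory mirror of the same logic, for illustration only.
function countBenefits(docs) {
  const counts = new Map();
  for (const d of docs) {
    counts.set(d.benefit_id, (counts.get(d.benefit_id) || 0) + 1);
  }
  return [...counts]
    .map(([benefit_id, cnt]) => ({ benefit_id, cnt }))
    .sort((a, b) => b.cnt - a.cnt);
}

const benefitCounts = countBenefits([
  { id: "1111", benefit_id: "123" },
  { id: "2222", benefit_id: "456" },
  { id: "3333", benefit_id: "456" },
  { id: "4444", benefit_id: "789" },
]);
```

Note that MongoDB does not promise any particular order among benefits with the same count; adding a second sort key such as _id would make the order deterministic.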

How to update and return all array elements which matches a condition?

I tried the query below and learned that the positional $ projection returns only the first matching element.
var date = new Date();
date.setDate(date.getDate()-30);
db.getCollection('status').find({
'data.end_ts': {
'$lte': date
},
$or: [
{
"data.risk_status": 'inactive'
},
{
"data.risk_status": 'expired'
}
]
},
{
"data.$": 1
})
Then I planned to remove the projection and do the removal in Java.
The problem here is that I need to remove the elements and insert them into another collection, so I can't just use delete.
I came up with another way that lets me avoid the conditions in Java.
db.getCollection('status').aggregate([
{
"$match": {
$or: [
{
"data.risk_status": 'inactive'
},
{
"data.risk_status": 'expired'
}
]
}
},
{
$unwind: "$data"
},
{
$match: {
'data.end_ts': {
'$lte': date
}
}
},
{
$group:{
"_id":"$_id",
"a":{$push:"$$ROOT"}
}
},
{
$project:{
"_id":1,
"a.data":1
}
}
])
Is there another way that deletes and returns the documents, so that I can just save the returned documents to the other collection?
Can I use $out to do that? I am not sure. Any help that reduces network round trips is welcome.
Yes, you can use $out to write the results to a new collection. Since you need to copy data from one collection to another, $out does this on the server and reduces the amount of application code.
Aggregation aggregation = Aggregation.newAggregation(
match(Criteria.where("data.risk_status").in("inactive","expired")),
unwind("$data"),
// all other stages
o->new Document("$out","NEW_COLLECTION_NAME")
).withOptions(AggregationOptions.builder().allowDiskUse(Boolean.TRUE).build());
Note: $out must be the last stage of the aggregation pipeline. See the $out documentation.
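For readers working in the mongo shell rather than Spring, the same idea might look like the sketch below. It rests on assumptions: status_archive is a hypothetical target collection name, buildArchivePipeline is a hypothetical helper, and the status and date conditions are applied to each unwound element. Keep in mind that $out replaces the contents of the target collection and does not delete anything from the source, so removing the archived elements remains a separate step.

```javascript
// Cutoff date: 30 days ago, as in the question.
const cutoff = new Date();
cutoff.setDate(cutoff.getDate() - 30);

// Hypothetical helper that builds the archive pipeline as one array.
function buildArchivePipeline(cutoffDate, targetCollection) {
  const stale = {
    "data.risk_status": { $in: ["inactive", "expired"] },
    "data.end_ts": { $lte: cutoffDate },
  };
  return [
    { $match: stale },        // coarse document-level prefilter
    { $unwind: "$data" },
    { $match: stale },        // exact filter on each unwound element
    { $group: { _id: "$_id", data: { $push: "$data" } } },
    { $out: targetCollection }, // must be the last stage
  ];
}

const archivePipeline = buildArchivePipeline(cutoff, "status_archive");
// Assumed call shape: db.getCollection('status').aggregate(archivePipeline)
```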

Best usage for MongoDB Aggregate request

I would like to retrieve a list of document _id values (with a limit), ranked in descending order of their timestamp, based on a list of ObjectIds.
This corresponds to:
db.collection.aggregate( [ { $match: { _id: { $in: [ObjectId("X"), ObjectId("Y") ] } } }, { $sort: { timestamp: -1 } }, { $group: { _id: "$_id" } }, { $skip: 0 }, { $limit: 100 } ] )
Knowing that the list from the loop may contain well over 1000 ObjectIds (in the $in array), do you think my solution is viable? Is there a faster and less resource-intensive way?
Best regards.
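One observation that may help, offered as a suggestion rather than a measured result: because _id is unique within a collection, grouping by _id yields one group per document, and $group does not guarantee that its output keeps the order produced by the preceding $sort. A sketch that drops $group in favour of $project and places $limit directly after $sort (buildRecentIdsPipeline is a hypothetical name, and ids is assumed to be the array of ObjectIds):

```javascript
// Hypothetical helper: newest-first list of _id values for the given ids.
function buildRecentIdsPipeline(ids, limit) {
  return [
    { $match: { _id: { $in: ids } } },
    { $sort: { timestamp: -1 } },
    // $limit right after $sort lets the server keep only the top entries while sorting.
    { $limit: limit },
    // Return only the _id field, preserving the sorted order.
    { $project: { _id: 1 } },
  ];
}

// Placeholder strings stand in for ObjectId values in this sketch.
const recentIdsPipeline = buildRecentIdsPipeline(["X", "Y"], 100);
```

The same result can also be expressed without aggregation, as a find on _id with $in, a projection of _id, a sort on timestamp and a limit.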

mongodb aggregation framework group + project

I have the following issue:
This query returns one result, which is what I want:
> db.items.aggregate([ {$group: { "_id": "$id", version: { $max: "$version" } } }])
{
"result" : [
{
"_id" : "b91e51e9-6317-4030-a9a6-e7f71d0f2161",
"version" : 1.2000000000000002
}
],
"ok" : 1
}
This query (I just added a projection so I can later query for the entire document) returns multiple results. What am I doing wrong?
> db.items.aggregate([ {$group: { "_id": "$id", version: { $max: "$version" } }, $project: { _id : 1 } }])
{
"result" : [
{
"_id" : ObjectId("5139310a3899d457ee000003")
},
{
"_id" : ObjectId("513931053899d457ee000002")
},
{
"_id" : ObjectId("513930fd3899d457ee000001")
}
],
"ok" : 1
}
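A likely culprit, stated as an assumption based on the output shown: $group and $project sit inside the same stage object, and the ObjectId values in the result suggest that only $project took effect. Each pipeline stage is meant to be its own object with a single operator, and newer servers reject a stage object with more than one. A sketch of the pipeline with the stages separated:

```javascript
// Each stage is a separate object holding exactly one operator.
const latestVersionPipeline = [
  { $group: { _id: "$id", version: { $max: "$version" } } },
  { $project: { _id: 1 } },
];

// Quick sanity check that no stage object mixes several operators.
const singleOperatorStages = latestVersionPipeline.every(
  (stage) => Object.keys(stage).length === 1
);
// Assumed call shape: db.items.aggregate(latestVersionPipeline)
```

Note that _id here is the grouped "$id" value, not the document's ObjectId, so a follow-up find would need to filter on the id field.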
Found the answer.
1. First I need to get all the _ids:
db.items.aggregate( [
{ '$match': { 'owner.id': '9e748c81-0f71-4eda-a710-576314ef3fa' } },
{ '$group': { _id: '$item.id', dbid: { $max: "$_id" } } }
]);
2. Then I need to query the documents:
db.items.find({ _id: { '$in': "IDs returned from aggregate" } });
which will look like this:
db.items.find({ _id: { '$in': [ '1', '2', '3' ] } });
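The glue between the two steps can be sketched as follows. It assumes the aggregate output is available as an array of { _id, dbid } documents (in the older shell output shown in the question that would be the result field); extractDbIds and sampleGroups are hypothetical names with made-up placeholder values.

```javascript
// Collect the dbid value of every group returned by the aggregate step.
function extractDbIds(groups) {
  return groups.map((group) => group.dbid);
}

// Placeholder data in the shape produced by the $group stage above.
const sampleGroups = [
  { _id: "item-a", dbid: "1" },
  { _id: "item-b", dbid: "2" },
  { _id: "item-c", dbid: "3" },
];

// Filter document for the second step.
const latestItemsFilter = { _id: { $in: extractDbIds(sampleGroups) } };
// Assumed call shape: db.items.find(latestItemsFilter)
```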
(I know it's late, but I'm still answering so that other people don't have to search for the right answer somewhere else.)
See Deka's answer; it will do the job.
Not all accumulators are available in the $project stage. We need to consider what we can do with accumulators in $project and what we can do in $group. Let's take a look at this:
db.companies.aggregate([{
$match: {
funding_rounds: {
$ne: []
}
}
}, {
$unwind: "$funding_rounds"
}, {
$sort: {
"funding_rounds.funded_year": 1,
"funding_rounds.funded_month": 1,
"funding_rounds.funded_day": 1
}
}, {
$group: {
_id: {
company: "$name"
},
funding: {
$push: {
amount: "$funding_rounds.raised_amount",
year: "$funding_rounds.funded_year"
}
}
}
}, ]).pretty()
The $match stage keeps only companies whose funding_rounds array is not empty. $unwind then emits one document per element of funding_rounds for every company, and those documents flow into $sort and the later stages. The $sort stage orders them by:
funding_rounds.funded_year
funding_rounds.funded_month
funding_rounds.funded_day
The $group stage groups by company name and builds an array with $push. $push appears as the value of a field we name in the $group stage, and it accepts any valid expression. Here each pushed value is a document built from raised_amount and funded_year, and every pushed document is appended to the end of the array being accumulated. The $group stage therefore outputs a stream of documents whose _id holds the company name.
Notice that $push is available in $group stages but not in $project stages. This is because $group is designed to take a sequence of documents and accumulate values across that stream.
$project, on the other hand, works with one document at a time. We can calculate an average over an array inside an individual document in a $project stage, but accumulating a value as documents pass through one by one, with each document pushing a new value, is not something $project is designed to do. For that type of operation we want $group.
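As an illustration of the per-document case, the sketch below averages raised_amount across the funding_rounds array of each company inside a $project stage. The output field avg_raised is a name chosen for this example, and avgRaised is a hypothetical in-memory mirror that only approximates the operator by averaging the numeric amounts.

```javascript
// $avg inside $project works on the array within each single document.
const avgRaisedPipeline = [
  { $match: { funding_rounds: { $exists: true, $ne: [] } } },
  {
    $project: {
      _id: 0,
      name: 1,
      avg_raised: { $avg: "$funding_rounds.raised_amount" },
    },
  },
];

// In-memory mirror for one company document, for illustration only.
function avgRaised(company) {
  const amounts = company.funding_rounds
    .map((round) => round.raised_amount)
    .filter((value) => typeof value === "number");
  if (amounts.length === 0) return null;
  return amounts.reduce((sum, value) => sum + value, 0) / amounts.length;
}

const exampleAverage = avgRaised({
  name: "Example Co",
  funding_rounds: [{ raised_amount: 100 }, { raised_amount: 300 }],
});
```

No documents are combined here: every company is reshaped independently, which is exactly what $project is for.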
Let's take a look at another example:
db.companies.aggregate([{
$match: {
funding_rounds: {
$exists: true,
$ne: []
}
}
}, {
$unwind: "$funding_rounds"
}, {
$sort: {
"funding_rounds.funded_year": 1,
"funding_rounds.funded_month": 1,
"funding_rounds.funded_day": 1
}
}, {
$group: {
_id: {
company: "$name"
},
first_round: {
$first: "$funding_rounds"
},
last_round: {
$last: "$funding_rounds"
},
num_rounds: {
$sum: 1
},
total_raised: {
$sum: "$funding_rounds.raised_amount"
}
}
}, {
$project: {
_id: 0,
company: "$_id.company",
first_round: {
amount: "$first_round.raised_amount",
article: "$first_round.source_url",
year: "$first_round.funded_year"
},
last_round: {
amount: "$last_round.raised_amount",
article: "$last_round.source_url",
year: "$last_round.funded_year"
},
num_rounds: 1,
total_raised: 1,
}
}, {
$sort: {
total_raised: -1
}
}]).pretty()
In the $group stage we use the $first and $last accumulators. As with $push, we can't use these accumulators in $project stages, because $project stages are not designed to accumulate values across multiple documents; they reshape documents one at a time. The total number of rounds is calculated with the $sum operator: the value 1 simply counts each document grouped under a given _id value. The $project stage may look complex, but it only tidies the output, and it passes num_rounds and total_raised through from the previous stage.
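What the accumulators compute for a single company can be mirrored in plain JavaScript. This is an illustration under assumptions (summarizeRounds is a hypothetical name, the rounds are already sorted by date, and every raised_amount is numeric), not how the server implements $group.

```javascript
// In-memory mirror of the accumulators for one company's sorted rounds.
function summarizeRounds(rounds) {
  return {
    first_round: rounds[0],                  // like $first
    last_round: rounds[rounds.length - 1],   // like $last
    // Like $sum: 1, which adds 1 for every document in the group.
    num_rounds: rounds.reduce((count) => count + 1, 0),
    // Like $sum: "$funding_rounds.raised_amount".
    total_raised: rounds.reduce((total, round) => total + round.raised_amount, 0),
  };
}

const roundSummary = summarizeRounds([
  { funded_year: 2005, raised_amount: 100 },
  { funded_year: 2007, raised_amount: 250 },
  { funded_year: 2010, raised_amount: 650 },
]);
```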