Possible to use the slice operator with the aggregation framework?

I have a list of employees, each of whom belongs to a department and a company.
An employee also has a salary history. The last value is their current salary.
Example:
{
name: "Programmer 1",
employee_id: 1,
dept_id: 1,
company_id: 1,
salary: [50000,50100,50200]
},
{
name: "Programmer 2",
employee_id: 2,
dept_id: 1,
company_id: 1,
salary: [50000,50200,50300]
},
{
name: "Manager",
employee_id: 3,
dept_id: 2,
company_id: 1,
salary: [60000,60500,61000]
},
{
name: "Contractor (different company)",
employee_id: 4,
dept_id: 1,
company_id: 2,
salary: [60000,60500,75000]
}
I want to find the current average salary for employees, grouped by dept_id and company_id.
Something like:
db.employees.aggregate(
{ $project : { employee_id: 1, dept_id: 1, company_id: 1, salaries: 1}},
{ $unwind : "$salaries" },
{
"$group" : {
"_id" : {
"dept_id" : "$dept_id",
"company_id" : "$company_id",
},
current_salary_avg : { $avg : "$salaries.last()" }
}
}
);
In this case it would be
Company 1, Group 1: 50250
Company 1, Group 2: 61000
Company 2, Group 1: 75000
I've seen examples doing something similar with $unwind, but I'm struggling to get the last value of salary. Is $slice the correct operator in this case, and if so, how do I use it with $project?

In this case you need to set up your pipeline as follows:
1. Unwind the salary list to get all the salaries for each employee.
2. Group by employee, dept and company, and take the last salary.
3. Group by dept and company, and take the average salary.
The code for this aggregation pipeline is:
use test;
db.employees.aggregate( [
{$unwind : "$salary"},
{
"$group" : {
"_id" : {
"dept_id" : "$dept_id",
"company_id" : "$company_id",
"employee_id" : "$employee_id",
},
"salary" : {$last: "$salary"}
}
},
{
"$group" : {
"_id" : {
"company_id" : "$_id.company_id",
"dept_id" : "$_id.dept_id",
},
"current_salary_avg" : {$avg: "$salary"}
}
},
{$sort :
{
"_id.company_id" : 1,
"_id.dept_id" : 1,
}
},
]);
Assuming that you have imported the data with:
mongoimport --drop -d test -c employees <<EOF
{ name: "Programmer 1", employee_id: 1, dept_id: 1, company_id: 1, salary: [50000,50100,50200]}
{ name: "Programmer 2", employee_id: 2, dept_id: 1, company_id: 1, salary: [50000,50200,50300]}
{ name: "Manager", employee_id: 3, dept_id: 2, company_id: 1, salary: [60000,60500,61000]}
{ name: "Contractor (different company)", employee_id: 4, dept_id: 1, company_id: 2, salary: [60000,60500,75000]}
EOF

Since MongoDB 3.2 you can use $slice in the aggregation framework. To return elements from either the start or the end of the array: { $slice: [ <array>, <n> ] }
To return elements from the specified position in the array: { $slice: [ <array>, <position>, <n> ] }
A few examples from the MongoDB documentation:
{ $slice: [ [ 1, 2, 3 ], 1, 1 ] } // [ 2 ]
{ $slice: [ [ 1, 2, 3 ], -2 ] } // [ 2, 3 ]
{ $slice: [ [ 1, 2, 3 ], 15, 2 ] } // [ ]
{ $slice: [ [ 1, 2, 3 ], -15, 2 ] } // [ 1, 2 ]
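Applied to the salary question above, $slice with -1 yields a one-element array holding the most recent salary, and $arrayElemAt unwraps it, so no $unwind is needed. This is a minimal sketch, assuming MongoDB 3.2 or later and the employees collection from the question. The field name current_salary is made up for illustration.

```javascript
// Sketch only: assumes MongoDB 3.2+ ($slice and $arrayElemAt in aggregation).
// "current_salary" is an illustrative field name, not part of the original data.
var pipeline = [
  {
    $project: {
      dept_id: 1,
      company_id: 1,
      // $slice with -1 gives a one-element array such as [ 50200 ];
      // $arrayElemAt with index 0 takes that single element out.
      current_salary: { $arrayElemAt: [{ $slice: ["$salary", -1] }, 0] }
    }
  },
  {
    $group: {
      _id: { company_id: "$company_id", dept_id: "$dept_id" },
      current_salary_avg: { $avg: "$current_salary" }
    }
  },
  { $sort: { "_id.company_id": 1, "_id.dept_id": 1 } }
];

// Only runs where a mongo shell "db" object exists.
if (typeof db !== "undefined") {
  db.employees.aggregate(pipeline).forEach(printjson);
}
```

{ $arrayElemAt: [ "$salary", -1 ] } on its own also returns the last element; the $slice form is shown because it is the operator the question asks about.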

Related

Java MongoDB Projection

I am following the official MongoDB page on projection, where I came across the following example, in which the elements of an array of subdocuments are filtered:
https://docs.mongodb.com/manual/reference/operator/aggregation/filter/#exp._S_filter
db.sales.aggregate([
{
$project: {
items: {
$filter: {
input: "$items",
as: "item",
cond: { $gte: [ "$$item.price", 100 ] }
}
}
}
}
])
I am trying to implement this in Java, but I am not doing it correctly: the elements in the subdocument array are not filtered.
Input Collection:
{
_id: 0,
items: [
{ item_id: 43, quantity: 2, price: 10 },
{ item_id: 2, quantity: 1, price: 240 }
]
}
{
_id: 1,
items: [
{ item_id: 23, quantity: 3, price: 110 },
{ item_id: 103, quantity: 4, price: 5 },
{ item_id: 38, quantity: 1, price: 300 }
]
}
{
_id: 2,
items: [
{ item_id: 4, quantity: 1, price: 23 }
]
}
Expected Output Collection:
{
"_id" : 0,
"items" : [
{ "item_id" : 2, "quantity" : 1, "price" : 240 }
]
}
{
"_id" : 1,
"items" : [
{ "item_id" : 23, "quantity" : 3, "price" : 110 },
{ "item_id" : 38, "quantity" : 1, "price" : 300 }
]
}
{ "_id" : 2, "items" : [ ] }
In Java (MongoDB driver 3.9.1), this is what I am doing:
Bson priceFilter = Filters.gte("items.price", 100);
mongoCollection.aggregate(
Aggregates.project(Projections.fields(priceFilter))
);
How do I project in an aggregation so that elements of a subdocument array are filtered out based on some condition?

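In MongoDB Java Driver 3.9.1, collection.aggregate() takes a java.util.List as its parameter, and Filters.gte builds a query filter rather than a $filter expression. So you need to replace your Java code with the code below, which spells out the $filter expression with Document objects.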
mongoCollection.aggregate(
Arrays.asList(
Aggregates.project(Projections.computed("items",
new Document().append("$filter",
new Document().append("input", "$items").append("as", "item").append("cond",
new Document().append("$gte", Arrays.asList("$$item.price",100))))))
)
);

Atomically move object by ID from one array to another in same document [duplicate]

I have data that looks like this:
{
"_id": ObjectId("4d525ab2924f0000000022ad"),
"arrayField": [
{ id: 1, other: 23 },
{ id: 2, other: 21 },
{ id: 0, other: 235 },
{ id: 3, other: 765 }
],
"someOtherArrayField": []
}
Given a nested object's ID (0), I'd like to $pull the element from one array (arrayField) and $push it to another array (someOtherArrayField) within the same document. The result should look like this:
{
"_id": ObjectId("id"),
"arrayField": [
{ id: 1, other: 23 },
{ id: 2, other: 21 },
{ id: 3, other: 765 }
],
"someOtherArrayField": [
{ id: 0, other: 235 }
]
}
I realize that I can accomplish this with a find followed by an update, i.e.
db.foo.findOne({"_id": param._id})
.then((doc)=>{
db.foo.update(
{
"_id": param._id
},
{
"$pull": {"arrayField": {id: 0}},
"$push": {"someOtherArrayField": doc.arrayField[2]}
}
)
})
But I'm looking for an atomic operation like, in pseudocode, this:
db.foo.update({"_id": param._id}, {"$move": [{"arrayField": {id: 0}}, {"someOtherArrayField": 1}]})
Is there an atomic way to do this, perhaps using MongoDB 4.2's ability to specify a pipeline to an update command? How would that look?
I found this post that generously provided the data I used, but the provided solution isn't an atomic operation. Has an atomic solution become possible with MongoDB 4.2?

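Yes. Since MongoDB 4.2 an update can take an aggregation pipeline, and an update to a single document is atomic. Here's an example: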
> db.baz.find()
> db.baz.insert({
... "_id": ObjectId("4d525ab2924f0000000022ad"),
... "arrayField": [
... { id: 1, other: 23 },
... { id: 2, other: 21 },
... { id: 0, other: 235 },
... { id: 3, other: 765 }
... ],
... "someOtherArrayField": []
... })
WriteResult({ "nInserted" : 1 })
function extractIdZero(arrayFieldName) {
return {$arrayElemAt: [
{$filter: {input: arrayFieldName, cond: {$eq: ["$$this.id", 0]}}},
0
]};
}
extractIdZero("$arrayField")
{
"$arrayElemAt" : [
{
"$filter" : {
"input" : "$arrayField",
"cond" : {
"$eq" : [
"$$this.id",
0
]
}
}
},
0
]
}
db.baz.updateOne(
{_id: ObjectId("4d525ab2924f0000000022ad")},
[{$set: {
arrayField: {$filter: {
input: "$arrayField",
cond: {$ne: ["$$this.id", 0]}
}},
someOtherArrayField: {$concatArrays: [
"$someOtherArrayField",
[extractIdZero("$arrayField")]
]}
}}
])
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
> db.baz.findOne()
{
"_id" : ObjectId("4d525ab2924f0000000022ad"),
"arrayField" : [
{
"id" : 1,
"other" : 23
},
{
"id" : 2,
"other" : 21
},
{
"id" : 3,
"other" : 765
}
],
"someOtherArrayField" : [
{
"id" : 0,
"other" : 235
}
]
}
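The same update can be generalised to any id and any pair of array fields. This is a minimal sketch, assuming MongoDB 4.2 or later and that both fields already exist as arrays. buildMovePipeline is a hypothetical helper name, not a MongoDB API. Unlike the transcript above, it appends the $filter result directly instead of going through $arrayElemAt, so nothing is appended when no element matches, and every matching element moves when several match.

```javascript
// Hypothetical helper: builds an update pipeline that moves the element(s)
// whose id equals targetId from fromField to toField.
// Assumes MongoDB 4.2+ and that both fields are arrays on the document.
function buildMovePipeline(fromField, toField, targetId) {
  var matching = {
    $filter: { input: "$" + fromField, cond: { $eq: ["$$this.id", targetId] } }
  };
  var setStage = {};
  // Keep everything except the matching element(s) in the source array.
  setStage[fromField] = {
    $filter: { input: "$" + fromField, cond: { $ne: ["$$this.id", targetId] } }
  };
  // Append the matching element(s) to the destination array. Both expressions
  // in a $set stage are evaluated against the document as it was before the stage.
  setStage[toField] = { $concatArrays: ["$" + toField, matching] };
  return [{ $set: setStage }];
}

var movePipeline = buildMovePipeline("arrayField", "someOtherArrayField", 0);

// Only runs where a mongo shell "db" object exists.
if (typeof db !== "undefined") {
  db.baz.updateOne({ _id: ObjectId("4d525ab2924f0000000022ad") }, movePipeline);
}
```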

Match with only one record after using $unwind on the record

I have records as below
[
{
_id : 1, nm : 1,
usr:[
{id : 1, ty : 'team'},
{id : 2, em : 'a'},
{id : 3, em : 'b'}
]
},
{
_id : 2, nm : 2,
usr:[
{id : 2, em : 'a'},
{id : 3, em : 'b'}
]
},
{
_id : 3, nm : 3,
usr:[
{id : 1, ty : 'team'},
{id : 3, em : 'b'}
]
},
{
_id : 4, nm : 4,
usr:[
{id : 3, em : 'b'}
]
}
]
I want the count to be 3 when querying with userAndTeam = [1, 2], i.e. count the records in which some usr.id is 1 or 2.
I want the count to be 2 when querying with userAndTeam4 = [1, 4], i.e. count the records in which some usr.id is 1 or 4.
I have tried using $unwind, which made the count 4 for the first case, because $unwind created 3 documents from the first record and the query below matches 2 of them.
The query I have tried:
var userAndTeam = [1, 2] //User id and team id. Record should match one of the ids.
var userAndTeam4 = [1, 4]
[{
$unwind: '$usr'
},
{
$project:{
'count-2' : {$cond: {
if: {$in : ['$usr.id',userAndTeam]},
then: 1,
else: 0
}},
'count-4' : {$cond: {
if: {$in : ['$usr.id',userAndTeam4]},
then: 1,
else: 0
}}
}
}
]
Output:
Expected for user with id 2 - 'count-2' : 3
Got for user with id 2 - 'count-2' : 4
Expected for user with id 4 - 'count-4' : 2
Got for user with id 4 - 'count-4' : 2
Can someone guide me to solve this problem and to get the expected count?

You can try the aggregation below:
db.col.aggregate([
{
$unwind: "$usr"
},
{
$project:{
"userId": "$usr.id",
"count-2": {
$cond: {
if: { $in: [ "$usr.id", userAndTeam ] },
then: 1,
else: 0
}
},
"count-4": {
$cond: {
if: { $in: [ "$usr.id", userAndTeam4] },
then: 1,
else: 0
}
}
}
},
{
$group: {
_id: "$_id",
"count-2": { $max: "$count-2" },
"count-4": { $max: "$count-4" }
}
},
{
$group: {
_id: null,
"count-2": { $sum: "$count-2" },
"count-4": { $sum: "$count-4" }
}
}
])
What your code is missing is that each _id should contribute either 0 or 1, so you need to group by _id and take the $max of each group, which is 0 if no element matches or 1 if any element matches your array. A second $group then sums those values.
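An alternative that avoids $unwind altogether is to test each document once with $setIntersection. This is a minimal sketch, assuming every document has a usr array; hasAnyId is a hypothetical helper name. In an aggregation expression, "$usr.id" resolves to the array of id values of the usr elements.

```javascript
var userAndTeam = [1, 2];
var userAndTeam4 = [1, 4];

// Hypothetical helper: an expression that evaluates to 1 when the document's
// usr.id values share at least one element with ids, and to 0 otherwise.
// Assumes usr is always present and is an array.
function hasAnyId(ids) {
  return {
    $cond: [
      { $gt: [{ $size: { $setIntersection: ["$usr.id", ids] } }, 0] },
      1,
      0
    ]
  };
}

// A single $group over all documents sums the per-document 0/1 flags.
var countPipeline = [
  {
    $group: {
      _id: null,
      "count-2": { $sum: hasAnyId(userAndTeam) },
      "count-4": { $sum: hasAnyId(userAndTeam4) }
    }
  }
];

// Only runs where a mongo shell "db" object exists.
if (typeof db !== "undefined") {
  db.col.aggregate(countPipeline).forEach(printjson);
}
```

Because each document is seen exactly once, it can contribute at most 1 to each count, which is what the $max step achieves in the pipeline above.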

Get the set of all unique values in array field

Given the following documents:
{ "_id" : ObjectId("585901b7875bab86885cf54f"), "foo" : 24, "bar" : [ 1, 2, 5, 6 ] }
{ "_id" : ObjectId("585901be875bab86885cf550"), "foo" : 42, "bar" : [ 3, 4 ] }
I want to get all the unique values in the bar field, something like:
{"_id": "something", "bar": [1, 2, 3, 4, 5, 6]}
This is what I tried:
db.stuff.aggregate([{
$group: {
_id: null,
bar: {
$addToSet: {$each: "$bar"}
}
}
}])
But it complains that $each is not a recognized operator.
This does work:
db.stuff.aggregate([{
$group: {
_id: null,
bar: {
$addToSet: "$bar"
}
}
}])
But it obviously produces the wrong result:
{ "_id" : null, "bar" : [ [ 3, 4 ], [ 1, 2, 5, 6 ] ] }
EDIT
I managed to get the result I want by adding an $unwind stage first:
db.stuff.aggregate([{
$unwind: "$bar"
}, {
$group: {
_id: null,
bar: {
$addToSet: "$bar"
}
}
}])
=> { "_id" : null, "bar" : [ 4, 3, 5, 2, 6, 1 ] }
Is it possible at all to do it in a single pipeline stage?

distinct() works with array fields as well, so it does this neatly:
db.stuff.distinct('bar')
The aggregation framework is overkill for this and will likely not perform as well.
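For reference, distinct treats every element of an array field as a separate value. The plain JavaScript below mirrors that behaviour on the two sample documents. distinctValues is a made-up helper for illustration, not a MongoDB API, and the shell call at the end only runs where db exists.

```javascript
// Plain JavaScript mirror of distinct on an array field: each array element
// counts as its own value. Illustrative helper, not part of MongoDB.
function distinctValues(docs, field) {
  var out = [];
  docs.forEach(function (doc) {
    var value = doc[field];
    if (value === undefined) { return; } // skip documents without the field
    var items = Array.isArray(value) ? value : [value];
    items.forEach(function (v) {
      if (out.indexOf(v) === -1) { out.push(v); }
    });
  });
  return out;
}

var sampleDocs = [
  { foo: 24, bar: [1, 2, 5, 6] },
  { foo: 42, bar: [3, 4] }
];

// Sort numerically so the result does not depend on document order.
var uniqueBar = distinctValues(sampleDocs, "bar").sort(function (a, b) { return a - b; });
// uniqueBar is [1, 2, 3, 4, 5, 6]

// Only runs where a mongo shell "db" object exists.
if (typeof db !== "undefined") {
  printjson(db.stuff.distinct("bar").sort(function (a, b) { return a - b; }));
}
```

Sorting on the client is a cautious choice here; it is safer not to rely on the order in which distinct returns its values.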

Find documents having a specific count of matches in an array

I've searched high and low but have not been able to find what I'm looking for, so apologies if this has already been asked.
Consider the following documents
{
_id: 1,
items: [
{
category: "A"
},
{
category: "A"
},
{
category: "B"
},
{
category: "C"
}]
},
{
_id: 2,
items: [
{
category: "A"
},
{
category: "B"
}]
},
{
_id: 3,
items: [
{
category: "A"
},
{
category: "A"
},
{
category: "A"
}]
}
I'd like to be able to find those documents which have more than 1 category "A" item in the items array. So this should find documents 1 and 3.
Is this possible?

Using aggregation
> db.spam.aggregate([
{$unwind: "$items"},
{$match: {"items.category" :"A"}},
{$group: {
_id: "$_id",
item: {$push: "$items.category"}, count: {$sum: 1}}
},
{$match: {count: {$gt: 1}}}
])
Output
{ "_id" : 3, "item" : [ "A", "A", "A" ], "count" : 3 }
{ "_id" : 1, "item" : [ "A", "A" ], "count" : 2 }
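The same result can be reached without $unwind by counting the matching elements in place with $filter and $size. This is a minimal sketch, assuming MongoDB 3.2 or later (for $filter) and that items is always an array. The output field name count is chosen here for illustration.

```javascript
// Sketch only: assumes MongoDB 3.2+ and that every document has an items array.
var categoryPipeline = [
  {
    $project: {
      // Number of items whose category is "A".
      count: {
        $size: {
          $filter: {
            input: "$items",
            as: "item",
            cond: { $eq: ["$$item.category", "A"] }
          }
        }
      }
    }
  },
  // Keep only documents with more than one matching item.
  { $match: { count: { $gt: 1 } } }
];

// Only runs where a mongo shell "db" object exists.
if (typeof db !== "undefined") {
  db.spam.aggregate(categoryPipeline).forEach(printjson);
}
```

Each document stays a single document throughout, so no $group stage is needed to put the unwound elements back together.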