I am trying to write a MongoDB aggregation query in Scala.
How do I write Scala code that uses "$let" in a '$project' stage?
I am wondering whether Variable should be used, but I am not sure how.
'$project': {
  'myprojitem': {
    '$let': {
      'vars': { 'myVariable1': { '$or': [...] } },
      'in': {
        '$cond': [
          '$$myVariable1',
          { ... },
          { ... }
        ]
      }
    }
  }
}
I figured out the answer. Hopefully it helps someone.
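import org.bson.conversions.Bson
import org.mongodb.scala.bson.collection.immutable.Document
import org.mongodb.scala.model.{Aggregates, Projections}
import scala.collection.mutable

// A JSON string that spans several lines needs triple quotes in Scala
val doc: Document = Document("""{
  '$let': {
    'vars': { 'myVariable1': { '$or': [...] } },
    'in': { '$cond': ['$$myVariable1', { ... }, { ... }] }
  }
}""")
val pipeline = mutable.Buffer[Bson]()
pipeline += Aggregates.project(Projections.fields(
  Projections.computed("myprojitem", doc)
))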
Basically, every { name : expression } can be written as:
Document("name" -> expression)
or
Document("{name : expression}")
$let binds variables for use in a subexpression and returns the result of that subexpression. The syntax is:
{
$let:
{
vars: { <var1>: <expression>},
in: <expression>
}
}
For more details, see the $let (aggregation) definition in the MongoDB manual.
Here is a textbook example to make this more concrete:
Consider the following data:
{ _id: 1, price: 10, tax: 0.50, applyDiscount: true }
{ _id: 2, price: 10, tax: 0.25, applyDiscount: false }
And imagine that we want to compute finalTotal as:
finalTotal = (price + tax) * (1 - Disc)
where Disc = 10% if applyDiscount is true and 0 otherwise.
So we need to build an aggregation on the data that evaluates this equation and produces results like:
{ _id: 1, finalTotal: 9.45 }
{ _id: 2, finalTotal: 10.25 }
We can do this with:
$project: {
finalTotal: {
$let: {
vars: {
total: { $add: [ '$price', '$tax' ] },
discounted: { $cond: { if: '$applyDiscount', then: 0.9, else: 1 } }
},
in: { $multiply: [ "$$total", "$$discounted" ] }
}
}
}
We can break this down:
Step 1. Add price and tax together and bind the sum to a variable called total:
total: { $add: [ '$price', '$tax' ] },
Step 2. Turn the condition into a numeric factor (the variable discounted):
discounted: { $cond: { if: '$applyDiscount', then: 0.9, else: 1 } }
Step 3. Apply $multiply to the bound variables $$total and $$discounted:
in: { $multiply: [ "$$total", "$$discounted" ] }
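To sanity-check the arithmetic outside MongoDB, the same three steps can be mirrored in plain JavaScript. This is only a sketch: finalTotal is a hypothetical helper name, and the rounding to two decimals is an assumption added to hide floating-point noise; it is not something $let does.

```javascript
// Sample documents from the example above.
const docs = [
  { _id: 1, price: 10, tax: 0.50, applyDiscount: true },
  { _id: 2, price: 10, tax: 0.25, applyDiscount: false }
];

// Hypothetical helper mirroring the $let expression.
function finalTotal(doc) {
  // vars: bind the two intermediate values
  const total = doc.price + doc.tax;              // $add
  const discounted = doc.applyDiscount ? 0.9 : 1; // $cond
  // in: combine the bound variables
  return total * discounted;                      // $multiply
}

// Round to 2 decimals for display only (assumption, not part of $let).
const results = docs.map((doc) => ({
  _id: doc._id,
  finalTotal: Math.round(finalTotal(doc) * 100) / 100
}));

console.log(results);
```

The two intermediate constants play the role of vars, and the return expression plays the role of in.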
Is it possible to do the same filtering as in this JS code
const list = [
{
a: 1,
"mostImportant": "qwer",
"lessImportant": "rty"
},
{
a: 2,
"lessImportant": "weRt",
"notImportant": "asd",
},
{
a: 3,
"mostImportant": "qwe2",
"notImportant": "asd",
}
];
list.filter((data) => {
data.attrToSearch = data.mostImportant || data.lessImportant || data.notImportant;
return data.attrToSearch.match(/wer/i);
});
in MongoDB?
Look at the example:
https://mongoplayground.net/p/VQdfoQ-HQV4
So I want attrToSearch to contain the value of the first non-blank attribute, checked in the order mostImportant, lessImportant, notImportant,
and then match it against the regex.
The expected result is the first two documents.
I appreciate your help.
Approach 1: With $ifNull
Updated
$ifNull only checks whether a value is null or missing; it does not treat an empty string as blank. (Passing more than two expressions to $ifNull, as done below, requires MongoDB 5.0 or later.)
The attached JS function skips null, undefined, and empty-string values and falls through to the next attribute, so to reproduce it you first need to replace empty strings with null via $cond.
db.collection.aggregate([
{
$addFields: {
mostImportant: {
$cond: {
if: {
$eq: [
"$mostImportant",
""
]
},
then: null,
else: "$mostImportant"
}
},
lessImportant: {
$cond: {
if: {
$eq: [
"$lessImportant",
""
]
},
then: null,
else: "$lessImportant"
}
},
notImportant: {
$cond: {
if: {
$eq: [
"$notImportant",
""
]
},
then: null,
else: "$notImportant"
}
}
}
},
{
"$addFields": {
"attrToSearch": {
$ifNull: [
"$mostImportant",
"$lessImportant",
"$notImportant"
]
}
}
},
{
"$match": {
attrToSearch: {
$regex: "wer",
$options: "i"
}
}
}
])
Demo Approach 1 # Mongo Playground
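For comparison, here is a rough plain-JavaScript sketch of what the three stages of Approach 1 do, using the sample list from the question. The helper names (blankToNull, firstNonNull, searchByFirstAttr) are made up for illustration and are not MongoDB APIs.

```javascript
// Sample data from the question.
const list = [
  { a: 1, mostImportant: "qwer", lessImportant: "rty" },
  { a: 2, lessImportant: "weRt", notImportant: "asd" },
  { a: 3, mostImportant: "qwe2", notImportant: "asd" }
];

// Stage 1 ($cond): treat an empty string as null.
const blankToNull = (v) => (v === "" ? null : v);

// Stage 2 ($ifNull): return the first value that is neither null nor missing.
function firstNonNull(...values) {
  for (const v of values) {
    if (v !== null && v !== undefined) return v;
  }
  return null;
}

// Stage 3 ($match with $regex): keep documents whose attrToSearch matches.
function searchByFirstAttr(docs, pattern) {
  return docs
    .map((d) => ({
      ...d,
      attrToSearch: firstNonNull(
        blankToNull(d.mostImportant),
        blankToNull(d.lessImportant),
        blankToNull(d.notImportant)
      )
    }))
    .filter((d) => typeof d.attrToSearch === "string" && pattern.test(d.attrToSearch));
}

const matched = searchByFirstAttr(list, /wer/i);
console.log(matched.map((d) => d.a));
```

With the sample list, the documents with a: 1 and a: 2 are kept, which is the result the question asks for.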
Approach 2: With $function
$function lets you write a user-defined function (UDF) in JavaScript and call it from the pipeline. It is available from MongoDB 4.4.
db.collection.aggregate([
{
"$addFields": {
"attrToSearch": {
$function: {
body: "function(mostImportant, lessImportant, notImportant) { return mostImportant || lessImportant || notImportant; }",
args: [
"$mostImportant",
"$lessImportant",
"$notImportant"
],
lang: "js"
}
}
}
},
{
"$match": {
attrToSearch: {
$regex: "wer",
$options: "i"
}
}
}
])
Demo Approach 2 # Mongo Playground
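One thing to keep in mind with this approach: the || chain skips every falsy JavaScript value, not only null and the empty string. The snippet below runs the same function body outside MongoDB to illustrate this; pickFirst is a hypothetical name, and the sample inputs are invented.

```javascript
// Same body as the $function above, as a standalone function (hypothetical name).
function pickFirst(mostImportant, lessImportant, notImportant) {
  return mostImportant || lessImportant || notImportant;
}

// null, undefined and "" are skipped, as intended.
console.log(pickFirst(null, "", "asd"));        // falls through to the third value
console.log(pickFirst(undefined, "weRt", "x")); // picks the second value

// 0 and false are falsy too, so they are skipped as well.
console.log(pickFirst(0, "fallback", "x"));
```

If the attributes can legitimately hold 0 or false, the explicit empty-string check of Approach 1 may be the safer fit.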
I am learning MongoDB queries, and my requirement is to calculate the average time between two dates. I wrote a MongoDB query with project and group stages.
{
project: {
OrderObject:1,
date:1
}
},
{
group: {
objectId: '$OrderObject.pharmacy.companyName',
count: {
$sum: 1
},
duration: {
$avg: {
$abs: {
$divide: [
{
$subtract: [
{
$arrayElemAt: [
'$date',
0
]
},
{
$arrayElemAt: [
'$date',
1
]
}
]
},
60000
]
}
}
},
OrderIDs: {
$addToSet: '$OrderObject.orderID'
},
pharmacyName: {
$addToSet: '$OrderObject.pharmacy.companyName'
},
}
}
The output I get is
{
count: 3,
duration: 54.53004444444445,
OrderIDs: [ 'ABCDE', 'EWQSE', 'ERTRE' ],
pharmacyName: [ 'pharmacy business Name' ],
objectId: null
},
Can someone please tell me why objectId is null in this case while the value is printed in the pharmacyName field? I am using this pipeline in Parse Server as query.aggregate(pipeline, {useMasterKey:true}).
My expectation is pharmacyName === objectId.
Most probably your nested element has a different name here:
OrderObject.name.companyName
but this is not a problem for the $group stage: when the _id is null it aggregates all elements in the collection (3 in total) and does not raise any error ...
It is also interesting that in your output "pharmacyName" appears simply as "name"? ;)
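The effect described above can be reproduced outside MongoDB. The sketch below groups documents by a dotted path, roughly the way $group uses its key: if the path does not resolve, every document falls into a single null group and no error is raised. getPath and groupCount are hypothetical helpers, the sample documents are invented, and the real $group handles more cases (arrays, for instance) than this sketch.

```javascript
// Resolve a dotted path; return null when any segment is missing.
function getPath(doc, path) {
  let cur = doc;
  for (const key of path.split(".")) {
    if (cur === null || typeof cur !== "object" || !(key in cur)) return null;
    cur = cur[key];
  }
  return cur === undefined ? null : cur;
}

// Count documents per group key, similar to { $sum: 1 } in $group.
function groupCount(docs, path) {
  const groups = new Map();
  for (const doc of docs) {
    const key = getPath(doc, path);
    groups.set(key, (groups.get(key) || 0) + 1);
  }
  return groups;
}

// Invented sample documents.
const orders = [
  { OrderObject: { orderID: "A", pharmacy: { companyName: "P1" } } },
  { OrderObject: { orderID: "B", pharmacy: { companyName: "P1" } } },
  { OrderObject: { orderID: "C", pharmacy: { companyName: "P1" } } }
];

// A path that exists yields the real key...
console.log(groupCount(orders, "OrderObject.pharmacy.companyName"));
// ...while a path that does not resolve silently yields one null group.
console.log(groupCount(orders, "OrderObject.name.companyName"));
```

So a null group key with a full count is usually a hint to double-check the field path used as the key.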
I'm trying to analyse some data and I thought my queries would be faster ultimately by storing a relationship between my collections instead. So I wrote something to do the data normalisation, which is as follows:
var count = 0;
db.Interest.find({'PersonID':{$exists: false}, 'Data.DateOfBirth': {$ne: null}})
.toArray()
.forEach(function (x) {
if (null != x.Data.DateOfBirth) {
var peep = { 'Name': x.Data.Name, 'BirthMonth' :x.Data.DateOfBirth.Month, 'BirthYear' :x.Data.DateOfBirth.Year};
var person = db.People.findOne(peep);
if (null == person) {
peep._id = db.People.insertOne(peep).insertedId;
//print(peep._id);
}
db.Interest.updateOne({ '_id': x._id }, {$set: { 'PersonID':peep._id }})
++count;
if ((count % 1000) == 0) {
print(count + ' updated');
}
}
})
This script is just passed to mongo.exe.
Basically, I attempt to find an existing person, if they don't exist create them. In either case, link the originating record with the individual person.
However this is very slow! There's about 10 million documents and at the current rate it will take about 5 days to complete.
Can I speed this up simply? I know I can multithread it to cut it down, but have I missed something?
To insert new persons into the People collection, use this pipeline:
db.Interest.aggregate([
{
$project: {
Name: "$Data.Name",
BirthMonth: "$Data.DateOfBirth.Month",
BirthYear: "$Data.DateOfBirth.Year",
_id: 0
}
},
{
$merge: {
into: "People",
// requires a unique index on {Name: 1, BirthMonth: 1, BirthYear: 1}
on: ["Name", "BirthMonth", "BirthYear"]
}
}
])
To update PersonID in the Interest collection, use this pipeline:
db.Interest.aggregate([
{
$lookup: {
from: "People",
let: {
name: "$Data.Name",
month: "$Data.DateOfBirth.Month",
year: "$Data.DateOfBirth.Year"
},
pipeline: [
{
$match: {
$expr: {
$and: [
{ $eq: ["$Name", "$$name"] },
{ $eq: ["$BirthMonth", "$$month"] },
{ $eq: ["$BirthYear", "$$year"] }
]
}
}
},
{ $project: { _id: 1 } }
],
as: "interests"
}
},
{
$set: {
PersonID: { $first: "$interests._id" },
interests: "$$REMOVE"
}
},
{ $merge: { into: "Interest" } }
])
Mongo Playground
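To make the intent of the two pipelines easier to follow, here is a small in-memory sketch of the same two steps: first build the set of distinct people keyed by (Name, BirthMonth, BirthYear), then write the matching person id back onto each Interest record. The function name, the generated numeric ids, and the sample records are assumptions for illustration; in MongoDB the ids would be assigned on insert. Skipping records without a date of birth mirrors the filter in the question's script.

```javascript
// Hypothetical in-memory version of the two pipelines.
function normalise(interests) {
  const keyOf = (x) =>
    JSON.stringify([x.Data.Name, x.Data.DateOfBirth.Month, x.Data.DateOfBirth.Year]);

  // Step 1 (like $project + $merge): one person per distinct key.
  const people = new Map();
  for (const x of interests) {
    if (!x.Data.DateOfBirth) continue;
    const key = keyOf(x);
    if (!people.has(key)) {
      people.set(key, {
        _id: people.size + 1, // stand-in for a generated _id
        Name: x.Data.Name,
        BirthMonth: x.Data.DateOfBirth.Month,
        BirthYear: x.Data.DateOfBirth.Year
      });
    }
  }

  // Step 2 (like $lookup + $set): link each record to its person.
  const linked = interests.map((x) =>
    x.Data.DateOfBirth ? { ...x, PersonID: people.get(keyOf(x))._id } : x
  );

  return { people: [...people.values()], interests: linked };
}

// Invented sample records.
const sample = [
  { _id: "i1", Data: { Name: "Ann", DateOfBirth: { Month: 5, Year: 1980 } } },
  { _id: "i2", Data: { Name: "Ann", DateOfBirth: { Month: 5, Year: 1980 } } },
  { _id: "i3", Data: { Name: "Bob", DateOfBirth: { Month: 1, Year: 1975 } } }
];

const normalised = normalise(sample);
console.log(normalised.people.length, normalised.interests.map((x) => x.PersonID));
```

The key point is that each step is a single set-based pass, instead of one findOne, insertOne, and updateOne round trip per document.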
This is my schema:
new Schema({
  code: { type: String },
  toy_array: [
    {
      date: {
        type: Date,
        default: Date.now
      },
      toy: { type: String }
    }
  ]
})
This is my db:
{
"code": "Toystore A",
"toy_array": [
{
_id:"xxxxx", // automatic
"toy": "buzz"
},
{
_id:"xxxxx", // automatic
"toy": "pope"
}
]
},
{
"code": "Toystore B",
"toy_array": [
{
_id:"xxxxx", // automatic
"toy": "jessie"
}
]
}
I am trying to update a document. In this case I want to update the document with code: 'Toystore A' and add an array of subdocuments to the array named toy_array, but only for toys that do not already exist in the array.
For example, if I try to do this:
db.mydb.findOneAndUpdate(
  {
    code: 'Toystore A',
    /*toy_array: {
      $not: {
        $elemMatch: {
          toy: [{"toy":'woddy'},{"toy":"buzz"}],
        },
      },
    },*/
  },
  {
    $addToSet: {
      toy_array: {
        $each: [{"toy":'woddy'},{"toy":"buzz"}],
      },
    },
  },
  {
    new: false,
  }
)
Both toys are added, which is what I want to avoid.
How can I do it?
[
{
"code": "Toystore A",
"toy_array": [
{
"toy": "buzz"
},
{
"toy": "pope"
}
]
},
{
"code": "Toystore B",
"toy_array": [
{
"toy": "jessie"
}
]
}
]
In this example, with [{"toy":'woddy'},{"toy":"buzz"}], only 'woddy' should be added because 'buzz' is already in the array.
Note: when I insert a new toy, an insertion date is also inserted in addition to an _id (this is expected).
Because you're using $addToSet on an object, it fails for your use case, for this reason:
Let's say your subdocuments look like this:
{
_id: 123, // automatically generated
"toy": "buzz"
},
{
_id: 456, // automatically generated
"toy": "pope"
}
and the input is:
[{_id: 789, "toy":'woddy'},{_id: 098, "toy":"buzz"}]
Here, when comparing the two objects {_id: 098, "toy":"buzz"} and {_id: 123, "toy":"buzz"}, $addToSet considers them different, and you can't make $addToSet compare on a single field (toy) of an object. So try the query below on MongoDB version >= 4.2.
Query :
db.collection.updateOne({"code" : "Toystore A"},[{
$addFields: {
toy_array: {
$reduce: {
input: inputArrayOfObjects,
initialValue: "$toy_array", // taking existing `toy_array` as initial value
in: {
$cond: [
{ $in: [ "$$this.toy", "$toy_array.toy" ] }, // check if each new toy exists in existing arrays of toys
"$$value", // If yes, just return accumulator array
{ $concatArrays: [ [ "$$this" ], "$$value" ] } // If No, push new toy object into accumulator
]
}
}
}
}
}])
Test : aggregation pipeline test url : mongoplayground
Ref : $reduce
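The logic of the $reduce expression can be tried out in plain JavaScript. The sketch below is an approximation for illustration only: addMissingToys is a hypothetical helper, and the sample arrays are taken from the question with the _id and date fields left out.

```javascript
// Mirror of the $reduce expression: add only toys whose name is not already stored.
function addMissingToys(existing, incoming) {
  // "$toy_array.toy": names already stored in the document
  const existingNames = existing.map((t) => t.toy);
  return incoming.reduce(
    (value, current) =>
      existingNames.includes(current.toy)
        ? value                // already there: return the accumulator unchanged
        : [current, ...value], // new toy: like $concatArrays [ [ this ], value ]
    existing                   // initialValue: the existing toy_array
  );
}

const toyArray = [{ toy: "buzz" }, { toy: "pope" }];
const input = [{ toy: "woddy" }, { toy: "buzz" }];

const updated = addMissingToys(toyArray, input);
console.log(updated.map((t) => t.toy));
```

As in the pipeline, each incoming toy is compared against the original array rather than the accumulator, so duplicates inside the input itself are not filtered out.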
Note :
You don't need to specify { new: false }, as .findOneAndUpdate() returns the old document by default; if you need the new one, pass { new: true }. Also, if you can drop the _id fields from the schema of the array objects, you can just use $addToSet as the OP was doing earlier (assuming _id is the only unique field); see stop-mongoose-from-creating-id-property-for-sub-document-array-items.
I have written a find query that works; it returns records where name and level exist:
db.docs.find( { $and: [{name:{$exists:true}},{level:{ $exists:true}} ] },{_id:0, name:1}).sort({"name":1})
Now I want to combine it with something like the code below, which also works but needs to be merged with the above to pull the correct data:
db.docs.aggregate(
[
{
$project:
{
_id:0,
name: 1,
Honours:
{
$cond: { if: { $gte: [ "$level", 8 ] }, then: "True", else: "False" }
}
}
}
]
)
The find query returns records where name and level exist, but I need to enhance the result with a new column called Honours, showing True or False depending on whether the level is greater than or equal to 8 ($gte).
So I am basically trying to combine the above find filter with the $cond operator (I found an example and modified it here: $cond).
I tried the below and a few other permutations to combine find and sort with the $project and $cond aggregation, but it returned errors. I am very new to MongoDB syntax and how to make it all fit together. Can anyone please help?
db.docs.aggregate(
[{{ $and: [{name:{$exists:true}},{level:{ $exists:true}} ] },{_id:0, name:1}).sort({"name":1}
{
$project:
{
_id:0,
name: 1,
Honours:
{
$cond: { if: { $gte: [ "$level", 8 ] }, then: "True", else: "False" }
}
}
}
}
]
)
Try the aggregation pipeline below:
db.docs.aggregate([
/** $match filters docs much like .find() and reduces the dataset size for later stages */
{
$match: {
$and: [{ name: { $exists: true } }, { level: { $exists: true } }]
}
},
/** $project works as the projection; it reduces the size of each document for later stages */
{
$project: {
_id: 0,
name: 1,
Honours: {
$cond: { if: { $gte: ["$level", 8] }, then: "True", else: "False" }
}
}
},
/** $sort works like .sort() */
{ $sort: { name: 1 } }
]);
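To see how the three stages map onto the original find, projection, and sort, here is a plain-JavaScript sketch of the same flow over an in-memory array. The function name honoursReport and the sample records are invented for illustration.

```javascript
// Invented sample documents; only some have both name and level.
const records = [
  { _id: 1, name: "Zoe", level: 9 },
  { _id: 2, name: "Adam", level: 5 },
  { _id: 3, name: "NoLevel" },
  { _id: 4, level: 8 }
];

function honoursReport(input) {
  return input
    // $match: keep documents where both fields exist
    .filter((d) => "name" in d && "level" in d)
    // $project: drop _id, keep name, compute Honours with $cond
    .map((d) => ({ name: d.name, Honours: d.level >= 8 ? "True" : "False" }))
    // $sort: ascending by name
    .sort((a, b) => a.name.localeCompare(b.name));
}

console.log(honoursReport(records));
```

Each array method corresponds to one pipeline stage, which is why the filter, the computed field, and the sort each need their own stage object inside aggregate([...]).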