MongoDB Aggregation Pipeline: $match with expression not possible? - mongodb

I'm building a fairly complicated aggregation pipeline and ran into a strange phenomenon. I extracted a short example to illustrate the problem.
It seemed related to MongoDb $addFields and $match, but that question doesn't contain any information that helps me fix the problem at hand.
Note: my problem is not with this specific example of using date fields or dealing with values. The problem is that I'm not able to $match using an expression, whether or not it uses a field that was added earlier with $addFields.
Given MongoDB: 3.6.3 (currently latest)
Let's insert some test data:
db.testexample.insert({
"dateField": new ISODate("2016-05-18T16:00:00Z")
});
db.testexample.insert({
"dateField": new ISODate("2018-05-18T16:00:00Z")
});
Now let's make a simple pipeline that computes only the year of the date and $matches on that:
db.testexample.aggregate([
{
"$addFields": {
"dateFieldYear": {"$year": "$dateField"}
}
},
{
"$match": {
"dateFieldYear": {"$eq": "$dateFieldYear"}
}
}
])
--> No matches
It should match as it's the same field? Maybe with more trickery (using an $add)?
db.testexample.aggregate([
{
"$addFields": {
"dateFieldYear": {"$year": "$dateField"}
}
},
{
"$match": {
"dateFieldYear": {"$eq": {"$add": ["$dateFieldYear", 0]}}
}
}
])
--> No matches
Still no dice. Next I thought that variables altogether might be the problem. So let's fix the values:
db.testexample.aggregate([
{
"$addFields": {
"dateFieldYear": {"$year": "$dateField"}
}
},
{
"$match": {
"dateFieldYear": {"$eq": {"$add": [2016, 0]}}
}
}
])
--> No matches
Wait.. something is really wrong here.. Let's see with a static value:
db.testexample.aggregate([
{
"$addFields": {
"dateFieldYear": {"$year": "$dateField"}
}
},
{
"$match": {
"dateFieldYear": 2016
}
}
])
--> 1 record found!
So my conclusion seems to be that $match cannot take an expression on a field in an aggregate pipeline. But this doesn't seem possible - as the documentation states that $match follows the query syntax as described here.
Can anybody explain how to $match using the simple example "dateFieldYear": {"$eq": "$dateFieldYear"}, and why it doesn't work as expected?
Thanks so much for any help

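You can use $expr (an operator added in MongoDB 3.6) to use aggregation expressions in a regular query. Without it, $match follows the query language, where the operand of $eq is a literal value: "$dateFieldYear" is compared as a plain string and {"$add": [...]} as a literal document, so neither is evaluated.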
Compare query operators vs aggregation comparison operators.
In your case
db.testexample.find({$expr:{$eq:["$dateFieldYear", "$dateFieldYear"]}})
Regular Query:
db.testexample.find({$expr:{$eq:["$dateFieldYear", {"$year": "$dateField"}]}})
Aggregation Query:
db.testexample.aggregate({$match:{$expr:{$eq:["$dateFieldYear", {"$year": "$dateField"}]}}})
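To see why the original attempts return nothing, here is a toy model in plain JavaScript (runnable in Node, no MongoDB needed). It only illustrates the two comparison semantics and is not how the server is implemented; the helper names queryEq, resolve and exprEq are made up for this sketch.

```javascript
// Toy model only: this is NOT MongoDB's implementation.
const yearDoc = { dateFieldYear: 2016 };

// Query-language $eq: the operand is taken as a literal value.
function queryEq(document, field, operand) {
  return document[field] === operand;
}

// Expression-language operands: strings starting with "$" are field paths.
function resolve(document, operand) {
  return typeof operand === "string" && operand.startsWith("$")
    ? document[operand.slice(1)]
    : operand;
}

// Expression-language $eq: both operands are resolved before comparing.
function exprEq(document, left, right) {
  return resolve(document, left) === resolve(document, right);
}

console.log(queryEq(yearDoc, "dateFieldYear", "$dateFieldYear")); // false
console.log(queryEq(yearDoc, "dateFieldYear", 2016));             // true
console.log(exprEq(yearDoc, "$dateFieldYear", "$dateFieldYear")); // true
```

The first call mirrors the failing $match (2016 compared with the string "$dateFieldYear"), the second mirrors the static value that worked, and the third mirrors the $expr form.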

Related

Mongodb - aggregate match subarray

I am trying to match data in a subarray; for some reason it is grouped like this.
Data :
{
"_id": 1,
"addresDetails": [
[
{
"Name":"John",
"Place":"Berlin",
"Pincode":"10001"
},
{
"Name":"Sarah",
"Place":"Newyork",
"Pincode":"10002"
}
],
[
{
"Name":"Mark",
"Place":"Tokyo",
"Pincode":"10003"
},
{
"Name":"Michael",
"Place":"Newyork",
"Pincode":"10002"
}
]
]
}
I tried with this Match query:
{
"$match":{
"attributes":{
"$elemMatch":{
"$in":["Mark"]
}
}
}
}
I am getting "No data found". How do I match the elements in these subarrays?
Query
This is the aggregation way. In general, if you are stuck and the query or update operators don't seem to be enough, aggregation provides many more operators and is an alternative.
It uses 2 nested $filter expressions over the 2 levels of arrays to find a Name in the array ["Mark"].
*Maybe there is a shorter, more declarative way with $elemMatch, and possibly a way to use an index. Also think about a schema change: maybe you don't really need an array with array members (the query below doesn't use an index).
*I used addressDetails; remove one s (the sample data spells it addresDetails), otherwise the filters have no data to work on.
Playmongo
db.collection.aggregate(
[{"$match":
{"$expr":
{"$ne":
[{"$filter":
{"input": "$addressDetails",
"as": "a",
"cond":
{"$ne":
[{"$filter":
{"input": "$$a",
"as": "d",
"cond": {"$in": ["$$d.Name", ["Mark"]]}}},
[]]}}},
[]]}}}])
You can apparently nest elemMatch as well, e.g.:
db.collection.find({
"addresDetails": {
$elemMatch: {
$elemMatch: {
"Name": "Mark"
}
}
}
})
This matches your document, as shown by this mongo playground link, but is probably not very efficient.
Alternatively you can use aggregations. For example unwind may help to flatten out your nested arrays, and allow for easier match afterwards.
db.collection.aggregate([
{
"$unwind": "$addresDetails"
},
{
"$match": {
"addresDetails.Name": "Mark"
}
}
])
You can find the mongo playground link for this here. But unwind is usually not preferred as the first stage of the aggregation pipeline either, again because of performance reasons.
Also please note that the results for these 2 options are different!
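To make the difference between the two results concrete, here is a small plain-JavaScript model of both approaches (runnable in Node without a database). It is a sketch of the behavior, not MongoDB code; the function names nestedElemMatch and unwindThenMatch are invented for illustration, and the sample document is trimmed from the question.

```javascript
// Toy model in plain JavaScript, not MongoDB's implementation.
const sampleDoc = {
  _id: 1,
  addresDetails: [
    [{ Name: "John", Place: "Berlin" }, { Name: "Sarah", Place: "Newyork" }],
    [{ Name: "Mark", Place: "Tokyo" }, { Name: "Michael", Place: "Newyork" }],
  ],
};

// Nested $elemMatch: the document matches as a whole and is returned unchanged.
function nestedElemMatch(docs, name) {
  return docs.filter((d) =>
    d.addresDetails.some((inner) => inner.some((p) => p.Name === name))
  );
}

// $unwind + $match: one output document per inner array, keeping only
// the ones whose inner array contains the name.
function unwindThenMatch(docs, name) {
  return docs
    .flatMap((d) => d.addresDetails.map((inner) => ({ ...d, addresDetails: inner })))
    .filter((d) => d.addresDetails.some((p) => p.Name === name));
}

const whole = nestedElemMatch([sampleDoc], "Mark");
const unwound = unwindThenMatch([sampleDoc], "Mark");

console.log(whole[0].addresDetails.flat().map((p) => p.Name).join(","));
// John,Sarah,Mark,Michael
console.log(unwound[0].addresDetails.map((p) => p.Name).join(","));
// Mark,Michael
```

So the find() version hands back the full document with both inner arrays, while the $unwind version hands back only the inner array that contains Mark.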

Combine $in with $split in mongo

I want to combine $in with $split like in the following example, but it fails saying "$in needs an array". I understand the output of $split is an array so I don't know why it fails. Do you know how to solve it or another way to do it?
Thanks
db.mydoc.aggregate([
{
'$match': {
'myid': {
'$in': {
'$split': [
'136618,136620,136622',
',',
],
},
},
},
},
{
'$project': { ... },
},
]);
This "fail" is the expected behavior. Let's understand why.
We must first take a look at the $match behavior as specified in the docs:
$match takes a document that specifies the query conditions. The query syntax is identical to the read operation query syntax; i.e. $match does not accept raw aggregation expressions. Instead, use a $expr query expression to include aggregation expression in $match.
This means that $match uses the query language by default. The issue comes from the difference between the two $in operators: the query $in operator (which is being used) and the aggregation $in operator (which you assume is being used).
It is true that $split resolves to an array, but $split is an aggregation operator. The query $in operator does not evaluate it; it sees the document { $split: [...] } instead of an array, which is why it complains that "$in needs an array". The aggregation $in operator, however, does accept raw aggregation expressions.
This means all you have to do is convert your $match query to use $expr, so you can use the aggregation version of $in within the match, like so:
db.collection.aggregate([
{
"$match": {
$expr: {
$in: [
"$myid",
{
"$split": [
"136618,136620,136622",
","
]
}
]
}
}
}
])
Mongo Playground
@Tom Slabbaert gave a very comprehensive and good answer. Just for the sake of completeness, an alternative solution (if you work with JavaScript/Mongo shell) is this one:
db.mydoc.aggregate([
{
'$match': {
'myid': { '$in': '136618,136620,136622'.split(',') }
}
},
{
'$project': { ... },
},
]);
Be aware that both solutions create an array of strings, i.e. [ "136618", "136620", "136622" ]. It does not match if your collection has numeric values, e.g. { myid: 136618 }.
In that case you may use
'136618,136620,136622'.split(',').map(x => NumberInt(x))
or
{ $map: { input: { "$split": ["136618,136620,136622", ","] }, in: { $toInt: "$$this" } } }
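The string-versus-number pitfall can be checked in plain JavaScript before touching the database. The following is a minimal sketch (runnable in Node); the variable names are made up, and Number.parseInt stands in for the shell-only NumberInt helper.

```javascript
const raw = "136618,136620,136622";

// split() always yields strings
const stringIds = raw.split(",");
// Convert to numbers for collections that store numeric ids
const numericIds = stringIds.map((s) => Number.parseInt(s, 10));

console.log(stringIds.includes(136618));  // false: "136618" is not 136618
console.log(numericIds.includes(136618)); // true

// The query-language $in needs a literal array, which numericIds is
const matchStage = { $match: { myid: { $in: numericIds } } };
// In the mongo shell: db.mydoc.aggregate([matchStage])
```

Because the array is built on the client, the query-language $in receives a literal array and no $expr is needed.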

Select all where the months of "date" is December in Mongodb?

Do you know if I can do a findAll where the month of Date is December?
I tried this request but it doesn't work:
db.myCollection.aggregate({}, { "Date": { $month: 12 } });
It's similar to SELECT * FROM table WHERE Months(date)=december.
Consider running an aggregation pipeline that uses the $redact operator, as it lets you combine in a single stage the functionality of $project (to create a field that represents the month of a date field) and $match (to filter the documents which match the given condition of the month being December).
In the pipeline below, $redact uses the $cond ternary operator to provide the conditional expression that resolves to the system variable which does the redaction. The logical expression in $cond checks a date operator expression for equality with a given value. If it matches, $redact returns the document using the $$KEEP system variable; otherwise it discards it using $$PRUNE.
Running the following pipeline should give you the desired result:
db.myCollection.aggregate([
{
"$redact": {
"$cond": [
{ "$eq": [{ "$month": "$Date" }, 12] },
"$$KEEP",
"$$PRUNE"
]
}
}
])
This is similar to a $project + $match combo, but you'd then need to list all the other fields that should pass through the pipeline:
db.myCollection.aggregate([
{
"$project": {
"month": { "$month": "$Date" },
"field1": 1,
"field2": 1,
.....
}
},
{ "$match": { "month": 12 } }
])
Another alternative, albeit a slow one, is the find() method with $where (JavaScript months are zero-based, so December is 11):
db.myCollection.find({ "$where": "this.Date.getMonth() === 11" })
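$month is an aggregation operator, so in find() it has to be wrapped in $expr (MongoDB 3.6+):
db.collection('mycollection').find({ $expr: { $eq: [{ $month: "$Date" }, 12] } })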
https://docs.mongodb.com/manual/reference/method/db.collection.find/
This should work for finding records in December for a given year, if that is enough to suit your purposes.
db.myCollection.find({
"Date":{
$gte:new Date("2016-12-01T00:00:00Z"),
$lt:new Date("2017-01-01T00:00:00Z")
}})
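If you go with the range approach, the bounds can be computed instead of typed by hand. This is a minimal sketch in plain JavaScript (runnable in Node); monthRange is a hypothetical helper, and it assumes the dates should be interpreted in UTC.

```javascript
// Build a { $gte, $lt } range covering one calendar month in UTC.
// `month` is 1-based (12 = December), like the $month operator.
function monthRange(year, month) {
  return {
    $gte: new Date(Date.UTC(year, month - 1, 1)),
    // Date.UTC rolls month index 12 over to January of the next year
    $lt: new Date(Date.UTC(year, month, 1)),
  };
}

const december2016 = monthRange(2016, 12);
console.log(december2016.$gte.toISOString()); // 2016-12-01T00:00:00.000Z
console.log(december2016.$lt.toISOString());  // 2017-01-01T00:00:00.000Z

// JavaScript months are zero-based, which is why the $where variant uses 11
console.log(new Date("2016-12-15T00:00:00Z").getUTCMonth()); // 11

// In the mongo shell: db.myCollection.find({ Date: monthRange(2016, 12) })
```

Unlike the $redact and $where variants, a plain range filter like this can use an index on the date field, but it only covers one year at a time.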

$divide aggregation framework questions

I have this query for the MongoDB aggregation framework. I cannot figure out why I can't get this query to run. I checked the documentation and am still perplexed. Can anyone let me know what is wrong?
db.acquisitions.aggregate([
{ $match: {"acquired_year":{$gte:1999} } },
{ $group: {_id:"$acquired_year", "total_acquisition_amount(BBn)": { $divide :[ {$sum:"$acquistion_price"}, 1000000000 ] } }},
{ $sort : {"acquired_year" : -1} }
])
Read the $group manual page, which lists all valid "accumulators": these are the operators that must be the top-level operator of any field defined after the _id. $divide is not one of them.
This should then lead you to work out that if you want to $divide a summed total, you need to place that operation in a separate aggregation pipeline stage with $project:
db.acquisitions.aggregate([
{ "$match": { "acquired_year":{ "$gte": 1999 } }},
{ "$group": {
"_id":"$acquired_year",
"total_acquisition_amount(BBn)": { "$sum": "$acquistion_price" }
}},
{ "$project": {
"total_acquisition_amount(BBn)": {
"$divide": [ "$total_acquisition_amount(BBn)", 1000000000 ]
}
}},
{ "$sort": { "_id": -1 }}
])
The only way you can otherwise use math and other operators is "within" an accumulator like $sum, which does not apply in this case since the division must occur "after" the total has been determined.
Also, as a result of $group, the "acquired_year" field is no longer part of the document emitted, but instead this is the _id value, so you apply the sort on that instead.
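The order of operations (sum first, divide afterwards, sort on _id) can be mimicked in plain JavaScript. This is only a toy model of the four stages, runnable in Node; the sample numbers are invented, and the field names follow the question.

```javascript
// Hypothetical sample data; field names follow the question.
const acquisitions = [
  { acquired_year: 1998, acquistion_price: 4000000000 },
  { acquired_year: 1999, acquistion_price: 1500000000 },
  { acquired_year: 1999, acquistion_price: 500000000 },
  { acquired_year: 2001, acquistion_price: 3000000000 },
];

// $match: keep acquisitions from 1999 onwards
const matched = acquisitions.filter((a) => a.acquired_year >= 1999);

// $group: only an accumulator ($sum) runs here, one total per year
const totals = new Map();
for (const a of matched) {
  totals.set(a.acquired_year, (totals.get(a.acquired_year) || 0) + a.acquistion_price);
}

// $project: divide the finished total; $sort: by _id (the year), descending
const result = [...totals]
  .map(([year, total]) => ({ _id: year, "total_acquisition_amount(BBn)": total / 1000000000 }))
  .sort((x, y) => y._id - x._id);

console.log(JSON.stringify(result));
// [{"_id":2001,"total_acquisition_amount(BBn)":3},{"_id":1999,"total_acquisition_amount(BBn)":2}]
```

Note that the year only survives as _id after grouping, which is why the sort key changes.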

Mongodb query specific month|year not date

How can I query a specific month in MongoDB, not a date range? I need the month to make a list of customer birthdays for the current month.
In SQL it would be something like this:
SELECT * FROM customer WHERE MONTH(bday)='09'
Now I need to translate that to MongoDB.
Note: My dates are already saved as MongoDate type. I chose this thinking it would be easy to work with, but now I can't easily find how to do this simple thing.
With MongoDB 3.6 and newer, you can use the $expr operator in your find() query. This allows you to use aggregation expressions, such as $month, within the query language.
db.customer.find({ "$expr": { "$eq": [{ "$month": "$bday" }, 9] } })
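Since the question asks for the current month rather than a fixed 9, the filter can be built from a date. This is a minimal sketch in plain JavaScript (runnable in Node); birthdayFilter is a hypothetical helper, and it assumes UTC, which is what $month uses when no timezone is given.

```javascript
// Build the find() filter for birthdays in the month of `today`.
// getUTCMonth() is zero-based, $month is one-based, hence the + 1.
function birthdayFilter(today) {
  return { $expr: { $eq: [{ $month: "$bday" }, today.getUTCMonth() + 1] } };
}

const septemberFilter = birthdayFilter(new Date("2020-09-10T00:00:00Z"));
console.log(JSON.stringify(septemberFilter));
// {"$expr":{"$eq":[{"$month":"$bday"},9]}}

// In the mongo shell: db.customer.find(birthdayFilter(new Date()))
```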
For older MongoDB versions, consider running an aggregation pipeline that uses the $redact operator, as it lets you combine in a single stage the functionality of $project (to create a field that represents the month of a date field) and $match (to filter the documents which match the given condition of the month being September).
In the pipeline below, $redact uses the $cond ternary operator to provide the conditional expression that resolves to the system variable which does the redaction. The logical expression in $cond checks a date operator expression for equality with a given value. If it matches, $redact returns the document using the $$KEEP system variable; otherwise it discards it using $$PRUNE.
Running the following pipeline should give you the desired result:
db.customer.aggregate([
{ "$match": { "bday": { "$exists": true } } },
{
"$redact": {
"$cond": [
{ "$eq": [{ "$month": "$bday" }, 9] },
"$$KEEP",
"$$PRUNE"
]
}
}
])
This is similar to a $project + $match combo, but you'd then need to list all the other fields that should pass through the pipeline:
db.customer.aggregate([
{ "$match": { "bday": { "$exists": true } } },
{
"$project": {
"month": { "$month": "$bday" },
"bday": 1,
"field1": 1,
"field2": 1,
.....
}
},
{ "$match": { "month": 9 } }
])
Another alternative, albeit a slow one, is the find() method with $where (JavaScript months are zero-based, so September is 8):
db.customer.find({ "$where": "this.bday.getMonth() === 8" })
You can do that using aggregate with the $month projection operator:
db.customer.aggregate([
{$project: {name: 1, month: {$month: '$bday'}}},
{$match: {month: 9}}
]);
First, you need to check whether the field's data type is ISODate.
If not, you can convert it as in the following example.
db.collectionName.find().forEach(function (doc) {
  // Convert the stored value to a date and write the document back
  doc.your_date_field = new ISODate(doc.your_date_field);
  db.collectionName.save(doc);
});
Now you can find it in two ways:
db.collectionName.find({ $expr: {
$eq: [{ $year: "$your_date_field" }, 2017]
}});
Or by aggregation
db.collectionName.aggregate([
  { $project: {
    field1_you_need_in_result: 1,
    field12_you_need_in_result: 1,
    your_year_variable: { $year: '$your_date_field' },
    your_month_variable: { $month: '$your_date_field' }
  }},
  { $match: { your_year_variable: 2017, your_month_variable: 3 } }
]);
Yes, you can fetch this result from the date like this:
db.collection.find({
$expr: {
$and: [
{
"$eq": [
{
"$month": "$date"
},
3
]
},
{
"$eq": [
{
"$year": "$date"
},
2020
]
}
]
}
})
If you're concerned about efficiency, you may want to store the month data in a separate field within each document.
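As a rough sketch of that idea in plain JavaScript (runnable in Node): compute the month once when the document is written, so that later queries are a simple equality match. The field name bdayMonth and the helper withBirthdayMonth are hypothetical, and UTC is assumed.

```javascript
// Hypothetical field name `bdayMonth`; 1-based to match what $month returns.
function withBirthdayMonth(customer) {
  return { ...customer, bdayMonth: customer.bday.getUTCMonth() + 1 };
}

const customer = withBirthdayMonth({
  name: "Alice",
  bday: new Date("1990-09-21T00:00:00Z"),
});
console.log(customer.bdayMonth); // 9

// In the mongo shell (sketch):
//   db.customer.insertOne(customer)
//   db.customer.createIndex({ bdayMonth: 1 })
//   db.customer.find({ bdayMonth: 9 })
```

The trade-off is that the extra field has to be kept in sync whenever bday changes.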