MongoDB aggregation field with underscore - mongodb

I have a field _type_ in my documents like this:
{
"name" : "0",
"_type_" : "product"
}
I need to do aggregation on that field:
db.readImport.aggregate([
{
$match: {
"$_type_": "product"
}
},
...
]);
Without the underscores in the field name it would work, but this way I get
unknown top level operator: $_type_
How can I access the field _type_ with $?

You don't need $ for $match stage:
db.readImport.aggregate([
{
$match: { "_type_": "product" }
},
...
]);
because the $match stage accepts a plain query document as its parameter. Other aggregation stages, such as $group, accept expressions. Expressions use field paths (the field name prefixed with $) to access fields of the input documents.
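To make the distinction concrete, here is a small sketch of a pipeline that uses both forms: the plain field name in $match and the $-prefixed field path in $group. The $group stage and its count field are illustrative additions and not part of the question.

```javascript
// $match takes a query document: field names are written without "$".
// $group takes expressions: "$_type_" is a field path that reads the value of _type_.
const pipeline = [
  { $match: { "_type_": "product" } },
  // Hypothetical follow-up stage: count the documents per _type_ value.
  { $group: { _id: "$_type_", count: { $sum: 1 } } }
];

// In the mongo shell this would be run as:
// db.readImport.aggregate(pipeline);
console.log(JSON.stringify(pipeline));
```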

Related

Using cond to specify _id fields for group in mongodb aggregation

New to Mongo. I'm trying to group across different sub fields of a document based on a condition. The condition is a regex on a field value. It looks like this:
db.collection.aggregate([{
{
"$group": {
"$cond": [{
"upper.leaf": {
$not: {
$regex: /flower/
}
}
},
{
"_id": {
"leaf": "$upper.leaf",
"stem": "$upper.stem"
}
},
{
"_id": {
"stem": "$upper.stem",
"petal": "$upper.petal"
}
}
]
}
}])
Using api v4.0: cond in the docs shows - { $cond: [ <boolean-expression>, <true-case>, <false-case> ] }
The error I get with the above code is - "Syntax error: dotted field name 'upper.leaf' can not used in a sub object."
Reading up on that I tried $let to re-assign the dotted field name. But started to hit various syntax errors with no obvious issue in the query.
Also tried using $project to rename the fields, but got - Field names may not start with '$'
Thoughts on the best approach here? I can always address this at the application level and split my query into two but it's attractive potentially to solve it natively in mongo.
$group syntax is wrong
{
$group:
{
_id: <expression>, // Group By Expression
<field1>: { <accumulator1> : <expression1> },
...
}
}
You tried to do
{
$group:
<expression>
}
And even if your expression resulted in the same code, it's invalid syntax for $group (check the documentation to see where you are allowed to use expressions).
Another problem is that you use the query operator for regex, not the aggregation regex operators. Inside an aggregation you can only use aggregation operators; $match is the exception: it takes query operators, and also aggregation expressions if you wrap them in $expr.
I think you need this:
[{
"$group" : {
"_id" : {
"$cond" : [ {
"$not" : [ {
"$regexMatch" : {
"input" : "$upper.leaf",
"regex" : /flower/}}]},
{"leaf" : "$upper.leaf","stem" : "$upper.stem"},
{"stem" : "$upper.stem","petal" : "$upper.petal"}]
}
}}]
It's similar code, but the expression is now the value of "_id", and $regexMatch, which is an aggregation operator, is used instead of $regex.
Note that $regexMatch requires MongoDB 4.2 or later.
I haven't tested the code.
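To see which key each document would be grouped under, the branching can be modelled in plain JavaScript. This is only a sketch of the $cond logic and not MongoDB code: groupKey and the sample documents are hypothetical, and it assumes upper.leaf is always a string.

```javascript
// Plain-JavaScript model of the $cond used as the $group _id above.
function groupKey(doc) {
  // Assumption: upper.leaf is always a string in these documents.
  const matchesFlower = /flower/.test(doc.upper.leaf);
  return !matchesFlower
    ? { leaf: doc.upper.leaf, stem: doc.upper.stem }    // leaf does not contain "flower"
    : { stem: doc.upper.stem, petal: doc.upper.petal }; // leaf contains "flower"
}

// Hypothetical sample documents.
const docs = [
  { upper: { leaf: "oak", stem: "s1", petal: "p1" } },
  { upper: { leaf: "sunflower", stem: "s2", petal: "p2" } }
];
const keys = docs.map(groupKey);
console.log(JSON.stringify(keys));
// [{"leaf":"oak","stem":"s1"},{"stem":"s2","petal":"p2"}]
```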

Converting some fields in Mongo from String to Array

I have a collection of documents where a "tags" field was switched over from being a space separated list of tags to an array of individual tags. I want to update the previous space-separated fields to all be arrays like the new incoming data.
I'm also having problems with the $type selector because it is applying the type operation to individual array elements, which are strings. So filtering by type just returns everything.
How can I get every document that looks like the first example into the format for the second example?
{
"_id" : ObjectId("12345"),
"tags" : "red blue green white"
}
{
"_id" : ObjectId("54321"),
"tags" : [
"red",
"orange",
"black"
]
}
We can't use the $type operator to filter our documents here because the type of the elements in our array is "string" and as mentioned in the documentation:
When applied to arrays, $type matches any inner element that is of the specified BSON type. For example, when matching for $type : 'array', the document will match if the field has a nested array. It will not return results where the field itself is an array.
But fortunately MongoDB also provides the $exists operator which can be used here with a numeric array index.
Now how can we update those documents?
Well, in MongoDB 3.2 and earlier, the only option we have is mapReduce(), but first let's look at the alternative in the upcoming release of MongoDB.
Starting from MongoDB 3.4, we can $project our documents and use the $split operator to split our string into an array of substrings.
Note that to split only the "tags" values that are strings, we need the logical $cond operator. The condition here is $eq, which evaluates to true when the $type of the field is equal to "string". By the way, the $type aggregation operator is new in 3.4.
Finally we can overwrite the old collection using the $out pipeline stage. But we need to explicitly specify the inclusion of any other fields in the $project stage.
db.collection.aggregate(
[
{ "$project": {
"tags": {
"$cond": [
{ "$eq": [
{ "$type": "$tags" },
"string"
]},
{ "$split": [ "$tags", " " ] },
"$tags"
]
}
}},
{ "$out": "collection" }
]
)
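As a rough illustration of what that $project stage does to each document, here is a plain-JavaScript model (not MongoDB code). projectTags, the owner field, and the string ids are hypothetical. It also shows why other fields must be listed explicitly: anything not named in the projection is dropped.

```javascript
// Plain-JavaScript model of the $project stage above.
function projectTags(doc) {
  // $cond + $type: split only when tags is a string, otherwise pass it through.
  const tags = typeof doc.tags === "string" ? doc.tags.split(" ") : doc.tags;
  // Like $project, only _id and the listed field survive.
  return { _id: doc._id, tags };
}

// Hypothetical documents; plain strings stand in for ObjectId values.
const before = [
  { _id: "12345", tags: "red blue green white", owner: "x" },
  { _id: "54321", tags: ["red", "orange", "black"], owner: "y" }
];
const after = before.map(projectTags);
console.log(JSON.stringify(after));
```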
With mapReduce, we need to use Array.prototype.split() to emit the array of substrings in our map function. We also need to filter our documents using the "query" option. From there we iterate the "results" array and $set the new value for "tags" with bulk operations, using the bulkWrite() method new in 3.2, or the now deprecated Bulk() API if we are on 2.6 or 3.0, as shown here.
db.collection.mapReduce(
function() { emit(this._id, this.tags.split(" ")); },
function(key, value) {},
{
"out": { "inline": 1 },
"query": {
"tags.0": { "$exists": false },
"tags": { "$type": 2 }
}
}
)['results']
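The follow-up step, turning the inline results into bulk updates, could look roughly like the sketch below. buildTagUpdates and the sample result are hypothetical; the sketch assumes each inline result has the usual { _id, value } shape.

```javascript
// Turn inline mapReduce results ({ _id, value } pairs) into bulkWrite operations.
function buildTagUpdates(results) {
  return results.map(({ _id, value }) => ({
    updateOne: {
      filter: { _id },
      update: { $set: { tags: value } }
    }
  }));
}

// Hypothetical result in the shape returned by the mapReduce call above;
// a plain string stands in for the ObjectId.
const results = [{ _id: "12345", value: ["red", "blue", "green", "white"] }];
const operations = buildTagUpdates(results);

// In the mongo shell (3.2+) the operations would then be sent with:
// db.collection.bulkWrite(operations);
console.log(JSON.stringify(operations));
```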

MongoDB's Aggregation Framework: project only matching element of an array

I have a "class" document as:
{
className: "AAA",
students: [
{name:"An", age:"13"},
{name:"Hao", age:"13"},
{name:"John", age:"14"},
{name:"Hung", age:"12"}
]
}
And I want to get the student whose name is "An", returning only the matching element of the "students" array. I can do that with find():
>db.class.find({"students.name":"An"}, {"students.$":true})
{
"_id" : ObjectId("548b01815a06570735b946c1"),
"students" : [
{
"name" : "An",
"age" : "13"
}
]}
That's fine, but when I do the same with aggregation as follows, I get an error:
db.class.aggregate([
{$match:{"students.name":'An'}},
{$project:{"students.$":true}}
])
Error is:
uncaught exception: aggregate failed: {
"errmsg" : "exception: FieldPath field names may not start with '$'.",
"code" : 16410,
"ok" : 0
}
Why can't I use "$" for an array in the $project stage of aggregate() when I can use it in the projection of find()?
From the docs:
Use $ in the projection document of the find() method or the findOne()
method when you only need one particular array element in selected
documents.
The positional operator $ cannot be used in an aggregation pipeline projection stage. It is not recognized there.
This makes sense because, when you execute a projection along with a find query, the input to the projection is a single document that has matched the query. The context of the match is still known during projection, so for each document that matches the query, the projection operator is applied then and there, before the next match is found.
db.class.find({"students.name":"An"}, {"students.$":true})
In case of:
db.class.aggregate([
{$match:{"students.name":'An'}},
{$project:{"students.$":true}}
])
The aggregation pipeline is a set of stages. Each stage is completely unaware and independent of its previous or next stages. A set of documents pass a stage completely before being passed on to the next stage in the pipeline. The first stage in this case being the $match stage, all the documents are filtered based on the match condition. The input to the projection stage is now a set of documents that have been filtered as part of the match stage.
So a positional operator in the projection stage makes no sense, since in the current stage it doesn't know on what basis the fields had been filtered. Therefore, $ operators are not allowed as part of the field paths.
Why does the below work?
db.class.aggregate([
{ $match: { "students.name": "An" } },
{ $unwind: "$students" },
{ $project: { "students": 1 } }
])
As you see, the projection stage gets a set of documents as input, and projects the required fields. It is independent of its previous and next stages.
Try using the unwind operator in the pipeline: http://docs.mongodb.org/manual/reference/operator/aggregation/unwind/#pipe._S_unwind
Your aggregation would look like
db.class.aggregate([
{ $match: { "students.name": "An" } },
{ $unwind: "$students" },
{ $project: { "students": 1 } }
])
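One caveat: $unwind emits one document per array element, so this pipeline returns every student of the matching class unless a second $match on "students.name" follows the $unwind. The sketch below models that behaviour in plain JavaScript; the unwind helper is a hypothetical stand-in for the stage and not MongoDB code.

```javascript
// Plain-JavaScript model of $unwind: one output document per array element.
function unwind(docs, field) {
  return docs.flatMap(doc =>
    doc[field].map(element => ({ ...doc, [field]: element }))
  );
}

const classes = [{
  className: "AAA",
  students: [
    { name: "An", age: "13" },
    { name: "Hao", age: "13" },
    { name: "John", age: "14" },
    { name: "Hung", age: "12" }
  ]
}];

const unwound = unwind(classes, "students");
// A second match on the unwound documents keeps only the wanted student.
const onlyAn = unwound.filter(doc => doc.students.name === "An");
console.log(unwound.length, onlyAn.length); // 4 1
```

In pipeline terms, that second filter corresponds to adding { $match: { "students.name": "An" } } after the $unwind stage.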
You can use $filter to select a subset of an array to return, based on the specified condition.
db.class.aggregate([
{
$match:{
"className": "AAA"
}
},
{
$project: {
students: {
$filter: {
input: "$students",
as: "stu",
cond: { $eq: [ "$$stu.name", "An" ] }
}
}
}
}
])
This example filters the students array to include only the elements whose name is equal to "An".
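For intuition, $filter behaves much like Array.prototype.filter applied to the array field of each document. A minimal plain-JavaScript model follows; filterStudents is a hypothetical helper and not MongoDB code.

```javascript
// Plain-JavaScript model of projecting students through $filter.
function filterStudents(doc, name) {
  // "$$stu.name" in the pipeline corresponds to stu.name here.
  return { students: doc.students.filter(stu => stu.name === name) };
}

const classDoc = {
  className: "AAA",
  students: [
    { name: "An", age: "13" },
    { name: "Hao", age: "13" },
    { name: "John", age: "14" },
    { name: "Hung", age: "12" }
  ]
};

const projected = filterStudents(classDoc, "An");
console.log(JSON.stringify(projected));
// {"students":[{"name":"An","age":"13"}]}
```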

Include all existing fields and add new fields to document

I would like to define a $project aggregation stage where I can instruct it to add a new field and include all existing fields, without having to list all the existing fields.
My document looks like this, with many fields:
{
obj: {
obj_field1: "hi",
obj_field2: "hi2"
},
field1: "a",
field2: "b",
...
field26: "z"
}
I want to make an aggregation operation like this:
[
{
$project: {
custom_field: "$obj.obj_field1",
//the next part is that I don't want to do
field1: 1,
field2: 1,
...
field26: 1
}
},
... //group, match, and whatever...
]
Is there something like an "include all fields" keyword that I can use in this case, or some other way to avoid having to list every field separately?
In 4.2+, you can use the $set aggregation pipeline stage, which is simply an alias for $addFields, added in 3.4.
The $addFields stage is equivalent to a $project stage that explicitly specifies all existing fields in the input documents and adds the new fields.
db.collection.aggregate([
{ "$addFields": { "custom_field": "$obj.obj_field1" } }
])
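The equivalence described above can be modelled in plain JavaScript: every existing field is kept and the computed field is added on top. addFields here is a hypothetical helper for illustration and not the MongoDB stage itself; the sample document is a shortened version of the one in the question.

```javascript
// Plain-JavaScript model of $addFields: keep all fields, add the computed ones.
function addFields(doc, compute) {
  return { ...doc, ...compute(doc) };
}

// Shortened version of the document from the question.
const input = {
  obj: { obj_field1: "hi", obj_field2: "hi2" },
  field1: "a",
  field2: "b"
};

// Corresponds to { $addFields: { custom_field: "$obj.obj_field1" } }.
const output = addFields(input, doc => ({ custom_field: doc.obj.obj_field1 }));
console.log(JSON.stringify(output));
```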
You can use $$ROOT to reference the root document. Keep all fields of the document in one field and read them back afterwards (depending on your client system: Java, C++, ...).
[
{
$project: {
custom_field: "$obj.obj_field1",
document: "$$ROOT"
}
},
... //group, match, and whatever...
]
>>> There's something like "include all fields" keyword that I can use in this case or some another solution?
Unfortunately, there is no operator to "include all fields" in an aggregation operation. The reason is that aggregation was mostly designed to group and calculate data from collection fields (sum, avg, etc.), and returning all of a collection's fields is not its direct purpose.
To add new fields to your document you can use $addFields
(from the docs),
and to keep all the fields in your document, you can use $$ROOT:
db.collection.aggregate([
{ "$addFields": { "custom_field": "$obj.obj_field1" } },
{ "$group": {
_id : "$field1",
data: { $push : "$$ROOT" }
}}
])
As of version 2.6.4, Mongo DB does not have such a feature for the $project aggregation pipeline. From the docs for $project:
Passes along the documents with only the specified fields to the next stage in the pipeline. The specified fields can be existing fields from the input documents or newly computed fields.
and
The _id field is, by default, included in the output documents. To include the other fields from the input documents in the output documents, you must explicitly specify the inclusion in $project.
Following @Deka's reply, with the C# MongoDB driver 2.5 you can get the grouped document with all keys like this:
var group = new BsonDocument
{
{ "_id", "$groupField" },
{ "_document", new BsonDocument { { "$first", "$$ROOT" } } }
};
ProjectionDefinition<BsonDocument> projection = new BsonDocument{{ "document", "$_document"}};
var result = await col.Aggregate().Group(group).Project(projection).ToListAsync();
// For demo first record
var firstItemAsT = BsonSerializer.Deserialize<T>(result.ToArray()[0]["document"].AsBsonDocument);

How do I update Array Elements matching criteria in a MongoDB document?

I have a document with an array field, similar to this:
{
"_id" : "....",
"Statuses" : [
{ "Type" : 1, "Timestamp" : ISODate(...) },
{ "Type" : 2, "Timestamp" : ISODate(...) },
//Etc. etc.
]
}
How can I update a specific Status item's Timestamp, by specifying its Type value?
From the mongodb shell you can do this with:
db.your_collection.update(
{ _id: ObjectId("your_objectid"), "Statuses.Type": 1 },
{ $set: { "Statuses.$.Timestamp": "new timestamp" } }
)
and the C# equivalent is:
var query = Query.And(
Query.EQ("_id", "your_doc_id"),
Query.EQ("Statuses.Type", 1)
);
var result = your_collection.Update(
query,
Update.Set("Statuses.$.Timestamp", "new timestamp"),
UpdateFlags.Multi, SafeMode.True
);
This will update the specific document; you can remove the _id filter if you want to update documents across the whole collection.
Starting with MongoDB 3.6, the $[<identifier>] positional operator may be used. Unlike the $ positional operator — which updates at most one array element per document — the $[<identifier>] operator will update every matching array element. This is useful for scenarios where a given document may have multiple matching array elements that need to be updated.
db.yourCollection.update(
{ _id: "...." },
{ $set: {"Statuses.$[element].Timestamp": ISODate("2021-06-23T03:47:18.548Z")} },
{ arrayFilters: [{"element.Type": 1}] }
);
The arrayFilters option matches the array elements to update, and $[element] is used within the $set update operator to indicate that only array elements that matched the arrayFilters condition should be updated.
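The difference between the two positional operators can be modelled in plain JavaScript. This is a sketch only: setFirstMatch and setAllMatches are hypothetical helpers, the sample array is made up, and strings stand in for the ISODate values.

```javascript
// Model of "$": only the first element matching the condition is updated.
function setFirstMatch(statuses, type, timestamp) {
  const index = statuses.findIndex(s => s.Type === type);
  return statuses.map((s, i) => (i === index ? { ...s, Timestamp: timestamp } : s));
}

// Model of "$[element]" with arrayFilters: every matching element is updated.
function setAllMatches(statuses, type, timestamp) {
  return statuses.map(s => (s.Type === type ? { ...s, Timestamp: timestamp } : s));
}

// Hypothetical array with two elements of Type 1.
const statuses = [
  { Type: 1, Timestamp: "old" },
  { Type: 2, Timestamp: "old" },
  { Type: 1, Timestamp: "old" }
];

const first = setFirstMatch(statuses, 1, "new");
const all = setAllMatches(statuses, 1, "new");
console.log(first.filter(s => s.Timestamp === "new").length); // 1
console.log(all.filter(s => s.Timestamp === "new").length);   // 2
```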