I need to run a query against a MongoDB database from Google Colab. The existing documentation (official and otherwise) shows queries that are similar to what works in Colab but not identical: for example, the "less than" operator has to be written as the quoted string "$lt", and a bare $lt doesn't work. That's why I'm asking how to translate a group-by-and-sum query.
In more detail, I want to group by publisher name and sum another field (weeks on the best-seller list).
Query = collection.aggregate([
    {'$group': {'_id': 'publisher', 'Cantidad_total': {'$sum': 'weeks_on_list'}}}
])
for elemento in Query:
    pprint.pprint(elemento)
This is what I came up with. It doesn't fail, but it returns this:
{'Cantidad_total': 0, '_id': 'publisher'}
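You just need a couple of $ signs. Without the $ prefix, 'publisher' and 'weeks_on_list' are literal strings, so every document lands in a single group whose _id is the string 'publisher', and $sum ignores the non-numeric string and returns 0.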
From the docs:
Expressions:
Expressions can include field paths, literals, system variables,
expression objects, and expression operators. Expressions can be
nested.
Field Paths:
Aggregation expressions use field path to access fields in the input
documents. To specify a field path, prefix the field name or the
dotted field name (if the field is in the embedded document) with a
dollar sign $. For example, "$user" to specify the field path for the
user field or "$user.name" to specify the field path to "user.name"
field.
So for your particular example, you want your aggregation pipeline to be:
[
    {
        "$group": {
            "_id": "$publisher",
            "Cantidad_total": {
                "$sum": "$weeks_on_list"
            }
        }
    }
]
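In PyMongo (as used in Colab), that pipeline is a list of Python dicts with every operator and field path written as a quoted string. The following is a minimal sketch, assuming `collection` is the PyMongo collection from the question; the helper name `group_sum_pipeline` is made up for illustration and is not part of any library.

```python
def group_sum_pipeline(group_field, sum_field, total_name="Cantidad_total"):
    """Build a $group stage that sums `sum_field` per distinct `group_field`.

    The "$" prefix turns a plain string into a field path; without it,
    MongoDB treats the string as a literal value.
    """
    return [
        {
            "$group": {
                "_id": f"${group_field}",
                total_name: {"$sum": f"${sum_field}"},
            }
        }
    ]


pipeline = group_sum_pipeline("publisher", "weeks_on_list")

# With a live connection (not run here), the pipeline is used like this:
# for elemento in collection.aggregate(pipeline):
#     pprint.pprint(elemento)
```

Building the stage in a helper keeps the `$` prefixes in one place, which is where the original attempt went wrong.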
I have a collection of documents in which a field name appears to have a dot:
{
    "prod_id": "123",
    "prod_cost (whole)": 49,
    "prod_cost (dec.)": 49
}
How can I effectively run an aggregation pipeline using that field?
Currently it returns null values, because the dot makes MongoDB treat ")" as a nested field of "prod_cost (dec".
From MongoDB version 5,
MongoDB 5.0 adds improved support for the use of ($) and (.) in field names. There are some restrictions. See Field Name Considerations for more details.
Field Names with Periods (.) and Dollar Signs ($)
In most cases data that has been stored using field names like these is not directly accessible. You need to use helper methods like $getField, $setField, and $literal in queries that access those fields.
{ "$getField": "prod_cost (dec.)" }
Sample MongoPlayground
To access such a field inside an embedded object, see the "Query a Field in a Sub-document" example:
{
    "$getField": {
        field: {
            $literal: "prod_cost (dec.)"
        },
        input: "$productInfo"
    }
}
Sample Mongo Playground (Nested object)
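For PyMongo users, the two $getField forms above can be built as plain dicts. This is a rough sketch under the assumption of MongoDB 5.0 or later; the helper name `get_field_expr` and the output field name `dec_cost` are made up for illustration.

```python
def get_field_expr(field_name, input_expr=None):
    """Build a $getField expression for a field whose name contains "." or "$".

    field_name is wrapped in $literal so it is never parsed as a field path.
    input_expr is the document to read from; when omitted, $getField
    defaults to the current document ($$CURRENT).
    """
    expr = {"field": {"$literal": field_name}}
    if input_expr is not None:
        expr["input"] = input_expr
    return {"$getField": expr}


# Project the awkwardly named top-level field under a safer name.
# "dec_cost" is a hypothetical output name.
pipeline = [
    {"$project": {"prod_id": 1, "dec_cost": get_field_expr("prod_cost (dec.)")}}
]

# Same field, but nested under the "productInfo" sub-document.
nested = get_field_expr("prod_cost (dec.)", "$productInfo")
```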
How can I print a field together with the length of that field?
E.g. I have the document {name:"aaa"} in the collection "names";
the expected output is
{name:"aaa", name_length:3}
Please help.
MongoDB versions before 3.4 don't have a text aggregation operator to compute the length of a string value stored in a field. If you are using version 3.2 or older, you will need to implement the length computation outside the DB (such as in the controller layer of an MVC architecture).
Version 3.4, though, includes several new and useful aggregation operators including the $strLenCP operator which should serve your purpose. The usage for your case would be as follows:
db.names.aggregate(
    [
        {
            $project: {
                "name": 1,
                "name_length": { $strLenCP: "$name" }
            }
        }
    ]
)
See the $strLenCP entry in the MongoDB aggregation operator documentation for details.
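Both options mentioned above can be sketched in Python: the server-side $strLenCP projection for 3.4 and later, and a client-side fallback for older servers. The helper names `name_length_pipeline` and `add_length` are made up for illustration, and the fallback assumes every document actually has the field as a string.

```python
def name_length_pipeline(field="name"):
    """Server-side: project the field plus its length (MongoDB 3.4+)."""
    return [
        {
            "$project": {
                field: 1,
                f"{field}_length": {"$strLenCP": f"${field}"},
            }
        }
    ]


def add_length(docs, field="name"):
    """Client-side fallback for servers older than 3.4.

    len() on a Python str counts code points, like $strLenCP.
    """
    return [{**doc, f"{field}_length": len(doc[field])} for doc in docs]


pipeline = name_length_pipeline()
result = add_length([{"name": "aaa"}])
# result is [{"name": "aaa", "name_length": 3}]
```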
Can anyone tell me how to add a $match stage to an aggregation pipeline to filter for where a field MATCHES a query, (and may have other data in it too), rather than limiting results to entries where the field EQUALS the query?
The query specification...
var query = {hello:"world"};
...can be used to retrieve the following documents using the find() operation of MongoDb's native node driver, where the query 'map' is interpreted as a match...
{hello:"world"}
{hello:"world", extra:"data"}
...like...
collection.find(query);
The same query map can also be interpreted as a match when used with $elemMatch to retrieve documents with matching entries contained in arrays like these documents...
{
greetings:[
{hello:"world"},
]
}
{
greetings:[
{hello:"world", extra:"data"},
]
}
{
greetings:[
{hello:"world"},
{aloha:"mars"},
]
}
...using an invocation like [PIPELINE1] ...
collection.aggregate([
{$match:{greetings:{$elemMatch:query}}},
]).toArray()
However, trying to get a list of the matching greetings with unwind [PIPELINE2] ...
collection.aggregate([
{$match:{greetings:{$elemMatch:query}}},
{$unwind:"$greetings"},
]).toArray()
...produces all the array entries inside the documents with any matching entries, including the entries which don't match (simplified result)...
[
{greetings:{hello:"world"}},
{greetings:{hello:"world", extra:"data"}},
{greetings:{hello:"world"}},
{greetings:{aloha:"mars"}},
]
I have been trying to add a second match stage, but I was surprised to find that it limited results only to those where the greetings field EQUALS the query, rather than where it MATCHES the query [PIPELINE3].
collection.aggregate([
{$match:{greetings:{$elemMatch:query}}},
{$unwind:"$greetings"},
{$match:{greetings:query}},
]).toArray()
Unfortunately PIPELINE3 produces only the following entries, excluding the matching hello world entry with the extra:"data", since that entry is not strictly 'equal' to the query (simplified result)...
[
{greetings:{hello:"world"}},
{greetings:{hello:"world"}},
]
...where what I need as the result is rather...
[
{greetings:{hello:"world"}},
{greetings:{hello:"world"}},
{greetings:{"hello":"world","extra":"data"}}
]
How can I add a second $match stage to PIPELINE2, to filter for where the greetings field MATCHES the query, (and may have other data in it too), rather than limiting results to entries where the greetings field EQUALS the query?
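The results you're seeing are correct: matching a field against an embedded document, as in {greetings: query}, is an exact-equality test, so a sub-document with extra keys does not match. To get the results you expect, match on the individual key with dot notation instead: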
collection.aggregate([
{$match:{greetings:{$elemMatch:query}}},
{$unwind:"$greetings"},
{$match:{"greetings.hello":"world"}},
]).toArray()
With this, you should get the following output:
[
{greetings:{hello:"world"}},
{greetings:{hello:"world"}},
{greetings:{"hello":"world","extra":"data"}}
]
When building an aggregation pipeline in MongoDB, work incrementally: start with the first stage alone, then add one stage at a time and check the output after each.
The output of your $unwind stage would be:
[{
greetings:{hello:"world"}
},
{
greetings:{hello:"world", extra:"data"}
},
{
greetings:{hello:"world"}
},
{
greetings:{aloha:"mars"}
}]
If we now add the third stage that you used, it matches only documents whose greetings value is exactly {hello:"world"}, and only two documents in the pipeline have that exact value. So you would get only:
{ "greetings" : { "hello" : "world" } }
{ "greetings" : { "hello" : "world" } }
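If the query map is not known in advance, the dot-notation filter can be derived from it rather than hard-coded. The sketch below is written for PyMongo and assumes the query map is flat, with plain field names as keys (no top-level operators such as $or); the helper name `match_on_unwound` is hypothetical.

```python
def match_on_unwound(array_field, query):
    """Rewrite {key: value} as {"<array_field>.<key>": value}.

    After $unwind, array_field holds a single sub-document, so dot notation
    tests individual keys and ignores any extra keys in that sub-document.
    """
    return {f"{array_field}.{key}": value for key, value in query.items()}


query = {"hello": "world"}

pipeline = [
    {"$match": {"greetings": {"$elemMatch": query}}},
    {"$unwind": "$greetings"},
    {"$match": match_on_unwound("greetings", query)},
]
```

The last stage comes out as {"$match": {"greetings.hello": "world"}}, the same filter used in the corrected pipeline.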
I am using MongoDB and need to run a "less than or equal" filter based on a custom comparator. Details follow.
The "profile" collection has a "level" field stored as a string:
{"name":"Test1", "level":"intermediate"}
The possible values of level, in ascending order of weight, are:
novice
intermediate
experienced
advance
I want to write a query like the one below that returns all documents in the profile collection whose level is less than or equal to "experienced" (i.e. it includes results for "novice", "intermediate" and "experienced"):
db.profile.find( { level: { $lte: "experienced" } } )
I understand that I need to provide a custom comparator, but how can I do that?
You can't use custom comparators in a MongoDB Query. The ones available are: $eq, $gt, $gte, $lt, $lte, $ne, $in, $nin.
You can, however, use $in to get what you want:
db.profile.find( { level: { $in: [ "experienced", "intermediate", "novice" ] } } );
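To avoid typing the $in list by hand for each cutoff, it can be derived from the ordered list of levels on the client. This is a small sketch for PyMongo, assuming the ordering given in the question; `LEVELS` and `level_lte_filter` are made-up names.

```python
# Ordered from lowest to highest, as listed in the question.
LEVELS = ["novice", "intermediate", "experienced", "advance"]


def level_lte_filter(max_level, levels=LEVELS):
    """Return a filter matching every level up to and including max_level."""
    cutoff = levels.index(max_level)  # raises ValueError for unknown levels
    return {"level": {"$in": levels[: cutoff + 1]}}


query_filter = level_lte_filter("experienced")
# With a live connection (not run here): db.profile.find(query_filter)
```

An alternative design would be to store a numeric weight next to the string so that the built-in $lte applies, but that requires changing the stored documents.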
In MongoDB, I want to group my documents based on whether a certain field has a certain substring. I was trying to project each document to a boolean that says whether the field has that substring/matches that pattern, and then group by that field.
How can I do this?
Edit:
1) I have tried the aggregation pipeline as
db.aggregate([
    {
        '$match': {
            '$regex': '.*world'
        }
    }
])
But I get an error:
pymongo.errors.OperationFailure: command SON([('aggregate', u'test'), ('pipeline', [{'$match': {'$regex': '.*world'}}])]) failed: exception: bad query: BadValue unknown top level operator: $regex
2) I am NOT trying to select only the words that match the pattern. I would like to select every document. I would like to make a field called 'matched' which says whether the pattern was matched for each element.
The $regex operator must be nested under the field you are matching on:
db.aggregate([
    {
        "$match": {
            "fieldName": {"$regex": ".*world"}
        }
    }
])
The "unknown top level operator" message means that the operator has to appear after the field to which it applies.
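The second part of the question (flag every document instead of filtering, then group by the flag) is not covered by the $match fix. One possible approach, which is not from the answer above and assumes MongoDB 4.2 or later, is the $regexMatch expression inside $addFields, followed by $group. In this sketch "fieldName" is the placeholder from the answer, "matched" is the field name from the question, and `group_by_regex_match` and "count" are made-up names.

```python
def group_by_regex_match(field, pattern):
    """Flag each document with `matched`, then count documents per flag."""
    return [
        # Keep every document; add a boolean instead of filtering.
        {
            "$addFields": {
                "matched": {"$regexMatch": {"input": f"${field}", "regex": pattern}}
            }
        },
        # One output document per distinct value of `matched`.
        {"$group": {"_id": "$matched", "count": {"$sum": 1}}},
    ]


pipeline = group_by_regex_match("fieldName", ".*world")
```

On servers older than 4.2 the flag would have to be computed differently, for example on the client after fetching the documents.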