MongoDB unwind output, grabbing one string - mongodb

{
"_id" : "Long_ID_Stuff",
"GFV" : "user001",
"hf" : "NA",
"h" : {
"totalSamples" : 16,
"hist" : [
["US", 16]
],
"newEvent" : [
["US", NumberLong("654654654654")]
]
}
}
I am trying to pull out just the "US" portion of this document in a query and so far it has been giving me nothing.
My query thus far is:
db.x_collection.aggregate([{$unwind :"$h.hist"},{$match : { m:"TOP_COUNTRIES"}},{$match: {"h.lastUpdate":{$gt:1446336000000}}},{$match: {"h.hist":"US"}}]).pretty()
Do I need to do a $unwind: $h, then $unwind: $h.hist?

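Without a little more information, this is my best guess at what you are looking for, assuming "h" is an embedded document rather than an array. In that case a single $unwind on "$h.hist" is enough:
db.x_collection.aggregate([
{ $unwind : "$h.hist" },
// All conditions go into one $match document; dotted field names must be quoted.
// "h.lastUpdate" follows the field path used in the question.
{ $match : {
"h.hist" : "US",
"h.lastUpdate" : { $gt : 1446336000000 },
m : "TOP_COUNTRIES"
} }
]).pretty();
If "h" is an array, you will need to add this stage before the other $unwind:
{ $unwind : "$h" },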
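To see what the unwind and the match do to the sample document, here is a minimal plain JavaScript sketch that runs without a database. The helpers unwindHist and histContains are hypothetical stand-ins for the two stages, not MongoDB APIs, and the document is a trimmed copy of the one in the question. As far as I can tell, the match only succeeds after the unwind, because before it "h.hist" is an array of arrays and none of its direct elements equals "US".

```javascript
// Trimmed copy of the document from the question.
const docs = [
  { GFV: "user001", h: { totalSamples: 16, hist: [["US", 16]] } },
];

// Rough stand-in for { $unwind: "$h.hist" }: one output document per array element.
function unwindHist(input) {
  return input.flatMap((doc) =>
    (doc.h.hist || []).map((entry) => ({ ...doc, h: { ...doc.h, hist: entry } }))
  );
}

// Rough stand-in for { $match: { "h.hist": value } }: an array field matches
// when any of its direct elements equals the value.
function histContains(doc, value) {
  const hist = doc.h.hist;
  return Array.isArray(hist) ? hist.includes(value) : hist === value;
}

const matchedBeforeUnwind = docs.filter((doc) => histContains(doc, "US"));
const unwound = unwindHist(docs);
const matchedAfterUnwind = unwound.filter((doc) => histContains(doc, "US"));

// After the unwind, h.hist is the pair ["US", 16]; its first element is the country code.
const countries = matchedAfterUnwind.map((doc) => doc.h.hist[0]);
console.log(matchedBeforeUnwind.length, matchedAfterUnwind.length, countries);
```

To return only the string from a real pipeline, a final projection on the first element of the unwound pair would play the role of the last map call here.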

Related

MongoDB - average of a feature after slicing the max of another feature in a group of documents

I am very new to MongoDB and am trying to work out a couple of queries, which I am not even sure are feasible.
The structure of each document is:
{
"_id" : {
"$oid": Text
},
"grade": Text,
"type" : Text,
"score": Integer,
"info" : {
"range" : NumericText,
"genre" : Text,
"special": {keys:values}
}
};
The first query would give me:
per grade (thinking I have to group by "grade")
the highest range (thinking I have to call $max:$range, it should work with a string)
the score average (thinking I have to call $avg:$score)
I tried something like the following, which apparently is wrong:
collection.aggregate([{
'$group': {'_id':'$grade',
'highest_range': {'$max':'$info',
'average_score': {'$avg':'$score'}}}
}])
The second query would give the distinct genre records.
Any help is valuable!
ADDITION - providing an example of the document and the output:
{
"_id" : {
"$oid": '60491ea71f8'
},
"grade": D,
"type" : Shop,
"score": 4,
"info" : {
"range" : "2",
"genre" : 'Pet shop',
"special": {'ClientsParking':True,
'AcceptsCreditCard':True,
'BikeParking':False}
}
};
And the output I am looking into is something within lines:
[{grade: A, "highest_range":"4", "average_score":3.5},
{grade: B, "highest_range":"7", "average_score":8.3},
{grade: C, "highest_range":"3", "average_score":2.4}]
I think you are looking for this:
db.collection.aggregate([
{
'$group': {
'_id': '$grade',
'highest_range': { '$max': '$info.range' },
'average_score': { '$avg': '$score' }
}
}
])
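However, note that $min and $max compare strings lexicographically, so with "range" stored as a string, "9" ranks above "10", and $avg ignores non-numeric values. If "highest range" means the numeric maximum, convert the value first (for example with $toInt, available from MongoDB 4.0).
Alternatively, you could try { '$first': '$info.range' } or { '$last': '$info.range' }, but these need a preceding $sort to give a meaningful result. It is not clear what you mean by "highest range".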
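The difference between a string maximum and a numeric maximum is easy to check in plain JavaScript. This is only an emulation of the grouping logic, not MongoDB code. The three sample documents are made up for illustration, and the field names highest_range_as_string and highest_range_as_number are hypothetical.

```javascript
// Made-up sample documents shaped like the ones in the question.
const docs = [
  { grade: "A", score: 3, info: { range: "4" } },
  { grade: "A", score: 4, info: { range: "10" } },
  { grade: "B", score: 8, info: { range: "7" } },
];

// Collect ranges and scores per grade, like a $group on "$grade".
const groups = new Map();
for (const doc of docs) {
  const group = groups.get(doc.grade) || { ranges: [], scores: [] };
  group.ranges.push(doc.info.range);
  group.scores.push(doc.score);
  groups.set(doc.grade, group);
}

const summary = [...groups].map(([grade, group]) => ({
  grade,
  // String comparison: "4" > "10" because "4" sorts after "1".
  highest_range_as_string: group.ranges.reduce((a, b) => (a > b ? a : b)),
  // Numeric comparison after converting each string to a number.
  highest_range_as_number: Math.max(...group.ranges.map(Number)),
  average_score: group.scores.reduce((a, b) => a + b, 0) / group.scores.length,
}));

console.log(summary);
```

For grade A the string maximum is "4" while the numeric maximum is 10, which is why the conversion matters.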

How to return Mongodb Aggregate pipeline docs to ONE document?

I know this has got to be simple, but for the life of me I can't seem to generate the correct final stage in my pipeline to get this working. Here are the documents output from a stage that I have in a mongo query:
{ "_id" : ObjectId("61435ceb233ce0118c1d93ec") }
{ "_id" : ObjectId("61435cf29598d31c17f0d839") }
{ "_id" : ObjectId("611e5cf953396d78985d222f") }
{ "_id" : ObjectId("61435cf773b8b06c848af83e") }
{ "_id" : ObjectId("61435cfd7ac204efa857e7ce") }
{ "_id" : ObjectId("611e5cf953396d78985d2237") }
I would like to get these documents into ONE single document with an array as such:
{
"_id" : [
ObjectId("61435ceb233ce0118c1d93ec"),
ObjectId("61435cf29598d31c17f0d839"),
ObjectId("611e5cf953396d78985d222f"),
ObjectId("61435cf773b8b06c848af83e"),
ObjectId("61435cfd7ac204efa857e7ce"),
ObjectId("611e5cf953396d78985d2237")
]
}
My last stage in the pipeline is simply:
{
$group:{_id:"$uniqueIds"}
}
I've tried everything from $push to $mergeObjects, but no matter what I do, it keeps returning the original 6 documents in some shape or form instead of ONE document. Any advice would be greatly appreciated! Thanks in advance.
Query
Grouping by null treats the whole input as a single group, so $push collects every _id into one array:
db.collection.aggregate([
{
"$group": {
"_id": null,
"ids": {
"$push": "$_id"
}
}
},
{
"$unset": "_id"
}
])
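What the two stages do can be mimicked in plain JavaScript. This is a sketch of the logic only, not MongoDB code: the ids are shortened to plain strings taken from the question, and the names stageOutput, grouped, and result are hypothetical.

```javascript
// Output of the previous stage, with ObjectIds written as plain strings.
const stageOutput = [
  { _id: "61435ceb233ce0118c1d93ec" },
  { _id: "61435cf29598d31c17f0d839" },
  { _id: "611e5cf953396d78985d222f" },
];

// Like { $group: { _id: null, ids: { $push: "$_id" } } }:
// a single accumulator document receives every incoming _id.
const grouped = stageOutput.reduce(
  (acc, doc) => {
    acc.ids.push(doc._id);
    return acc;
  },
  { _id: null, ids: [] }
);

// Like { $unset: "_id" }: drop the null group key from the result.
const { _id, ...result } = grouped;

console.log(result);
```

Grouping on a field such as "$uniqueIds" yields one document per distinct value, which would explain why the original attempt kept returning six documents.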

Project values of different columns into one field

{
"_id" : ObjectId("5ae84dd87f5b72618ba7a669"),
"main_sub" : "MATHS",
"reporting" : [
{
"teacher" : "ABC"
}
],
"subs" : [
{
"sub" : "GEOMETRIC",
"teacher" : "XYZ",
}
]
}
{
"_id" : ObjectId("5ae84dd87f5b72618ba7a669"),
"main_sub" : "SOCIAL SCIENCE",
"reporting" : [
{
"teacher" : "XYZ"
}
],
"subs" : [
{
"sub" : "CIVIL",
"teacher" : "ABC",
}
]
}
I have simplified the structure of the documents that I have.
The basic structure is a parent subject with an array of reporting teachers and an array of sub-subjects (each having a teacher).
I now want to extract all the subjects (parent or sub-subject) taught by a particular teacher, along with a flag saying whether each one is a parent subject.
For example, for teacher ABC I want the following structure:
[{'subject':'MATHS', 'is_parent':'True'}, {'subject':'CIVIL', 'is_parent':'FALSE'}]
-- What is the most efficient query possible? I have tried $project with $cond and $switch, but in both cases I had to repeat the conditional statement for 'subject' and 'is_parent'.
-- Is it advisable to do the computation in a query, or should I fetch the data and then reshape it in the server code? That is, I could $unwind to get a mapping of the parent subjects to each sub-subject and then use a for loop.
I have tried
db.collection.aggregate(
{$unwind:'$reporting'},
{$project:{
'result':{$cond:[
{$eq:['ABC', '$reporting.teacher']},
"$main_sub",
"$subs.sub"]}
}}
)
Then I realised that even if I turn the else branch into another query for the sub-subjects, I will have to write exactly the same condition for the is_parent property.
You have 2 arrays, so you need to unwind both - the reporting and the subs.
After that stage each document will have at most 1 parent teacher-subject pair and at most 1 sub teacher-subject pair.
Project both pairs into a single array and unwind it so that each document holds a single teacher-subject pair; that projection is where you define whether it is a parent or not.
Then you can group by teacher. No need for $conds, $filters, or $facets. E.g.:
db.collection.aggregate([
{ $unwind: "$reporting" },
{ $unwind: "$subs" },
{ $project: {
teachers: [
{ teacher: "$reporting.teacher", sub: "$main_sub", is_parent: true },
{ teacher: "$subs.teacher", sub: "$subs.sub", is_parent: false }
]
} },
{ $unwind: "$teachers" },
{ $group: {
_id: "$teachers.teacher",
subs: { $push: {
subject: "$teachers.sub",
is_parent: "$teachers.is_parent"
} }
} }
])
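On the question of reshaping the data in server code instead, here is a minimal plain JavaScript sketch of that alternative. It is not the pipeline above: it walks each array separately rather than unwinding both, so no cross product of reporting teachers and sub-subjects is formed. The function name subjectsByTeacher is hypothetical, and the documents are the two from the question with _id omitted.

```javascript
// The two documents from the question, without _id.
const docs = [
  {
    main_sub: "MATHS",
    reporting: [{ teacher: "ABC" }],
    subs: [{ sub: "GEOMETRIC", teacher: "XYZ" }],
  },
  {
    main_sub: "SOCIAL SCIENCE",
    reporting: [{ teacher: "XYZ" }],
    subs: [{ sub: "CIVIL", teacher: "ABC" }],
  },
];

// Build a map from teacher name to the list of subjects they teach.
function subjectsByTeacher(input) {
  const byTeacher = new Map();
  for (const doc of input) {
    const pairs = [
      // Reporting teachers teach the parent subject.
      ...doc.reporting.map((r) => ({ teacher: r.teacher, subject: doc.main_sub, is_parent: true })),
      // Each sub-subject carries its own teacher.
      ...doc.subs.map((s) => ({ teacher: s.teacher, subject: s.sub, is_parent: false })),
    ];
    for (const { teacher, ...entry } of pairs) {
      if (!byTeacher.has(teacher)) byTeacher.set(teacher, []);
      byTeacher.get(teacher).push(entry);
    }
  }
  return byTeacher;
}

const abcSubjects = subjectsByTeacher(docs).get("ABC");
console.log(abcSubjects);
```

Whether this or the aggregation is preferable depends on how much data has to leave the database; the sketch only shows that the condition does not need to be repeated for 'subject' and 'is_parent'.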

Mongo sort by string value that is actually number

I have collection that contains objects such as this:
{
"_id" : ObjectId("57f00cf47958af95dca29c0c"),
"id" : "...",
"threadId" : "...",
"ownerEmail" : "...#...",
"labelIds" : [
...
],
"snippet" : "...",
"historyId" : "35699995",
"internalDate" : "1422773000000",
"headers" : {
"from" : "...#...",
"subject" : "....",
"to" : "...#..."
},
"contents" : {
"html" : "...."
}
}
When accessing the objects, I want to sort them by the internalDate value, which was supposed to be an integer but is stored as a string. Is there a way to sort them when fetching even though they are strings? By alphabetical order? Or is there a way to convert them to integers painlessly?
Collation is what you need...
db.collection.find()
.sort({internalDate: 1})
.collation({locale: "en_US", numericOrdering: true})
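The effect of numericOrdering can be previewed outside the database with JavaScript's Intl.Collator, whose numeric option compares runs of digits by value. This is an analogy for illustration, not MongoDB's own code path, and the sample strings are made up apart from the internalDate value taken from the question.

```javascript
// Numeric strings of different lengths, as they might appear in internalDate.
const internalDates = ["1422773000000", "99", "1000", "250"];

// Default string sort: compares character by character.
const plainOrder = [...internalDates].sort();

// Numeric collation: digit sequences are compared by their numeric value.
const collator = new Intl.Collator("en-US", { numeric: true });
const numericOrder = [...internalDates].sort(collator.compare);

console.log(plainOrder);
console.log(numericOrder);
```

The plain sort places "1000" before "250", which is the behaviour the collation option is meant to avoid.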
It seems to me that the best solution here would be to parse it first as an integer. You could do it using a simple script in javascript like this, using the mongodb client for node:
db.collection.find({}, {internalDate: 1}).forEach(function(doc) {
db.collection.update(
{ _id: doc._id },
{ $set: { internalDate: parseInt(doc.internalDate, 10) } } // always pass the radix to parseInt
)
})
You can also use the aggregate method to sort numbers that are stored as strings.
db.collection.aggregate([{
$sort : {
internalDate : 1
}
}], {
collation: {
locale: "en_US",
numericOrdering: true
}
});
If you are using the mongoose-paginate package for server-side pagination in Node.js, do not use that package; use mongoose-paginate-v2 instead.
I was having this issue. I sort by string length first and then by the numeric value stored as a string, e.g. "1", "100", "20", "3", which should be sorted as 1, 3, 20, 100.
db.AllTours.aggregate([
{
$addFields : {
"MyStringValueSize" : { $strLenCP: "$MyValue" }
}
},
{
$sort : {
"MyStringValueSize" : 1,
"MyValue" : 1
}
}
]);
MongoDB 4.0 added the $toInt operator, which can be used to convert the string to a number and then sort on it. In my case I can't upgrade from 3.6.
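The length-then-value trick from this answer can be checked in plain JavaScript. The sketch below is an emulation of the two sort keys, not MongoDB code, and the function name byLengthThenValue is hypothetical. It assumes non-negative integers without leading zeros or signs; the second example shows how a leading zero breaks the ordering.

```javascript
// Compare by string length first, then by plain string comparison.
function byLengthThenValue(a, b) {
  if (a.length !== b.length) return a.length - b.length;
  return a < b ? -1 : a > b ? 1 : 0;
}

// The values from the answer.
const values = ["1", "100", "20", "3"];
const sorted = [...values].sort(byLengthThenValue);

// Caveat: "007" is longer than "10", so it sorts after it although 7 < 10.
const withLeadingZero = [...["007", "10"]].sort(byLengthThenValue);

console.log(sorted);
console.log(withLeadingZero);
```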
With aggregate, this works for me.
db.collection.aggregate([<pipeline>]).collation({locale:"en_US", numericOrdering:true})
This is my solution and it worked for me
db.getCollection('myCollection').aggregate([
{
$project: {
newMonth: {
$cond: { if: {
$and: [
{$ne: ['$month', '10']},
{$ne: ['$month', '11']},
{$ne: ['$month', '12']},
]
}, then: {$concat: ['0', '$month']}, else: '$month' }
}
}
},
{
$sort: {newMonth: -1}
}
])

MongoDB query using aggregation not returning expected results

I have a few documents that look like this example:
{
"_id": ObjectId("540f4b6496f35c16af001dc4"),
"groups": [
1,
46105,
46106,
53241,
55397,
55406,
62840
],
"vehicleid": 123,
"vehiclename": "123 - CAN BC",
"totaldistancetraveled": 472.0,
"date_num": 20140901
}
I need to find the total distance driven by all vehicles that belong to group 46105 and whose date_num is 20140901.
I tried the following aggregation query:
db.vehicle_performance_monthly.aggregate(
{ $unwind : "$groups"},
{$group:
{_id: "$groups",
totalMiles: { $sum: "$totaldistancetraveled"}}},
{$match:{_id: {$in:[46106]}},{"$date_num":{$in:20140901}}}
)
But multiple matches are not being returned. Any help is appreciated.
This should work.
db.vehicle_performance_monthly.aggregate([ {
$match : {
groups : 46106,
date_num : 20140901
}
}, {
$unwind : "$groups"
}, {
$match : {
groups : 46106
}
}, {
$group : {
_id : "$groups",
totalMiles : {
$sum : "$totaldistancetraveled"
}
}
} ]);
Analysis of your original query:
db.vehicle_performance_monthly.aggregate(
{ $unwind : "$groups"},
{$group:
{_id: "$groups",
totalMiles: { $sum: "$totaldistancetraveled"}}}, // $group outputs only _id and totalMiles, so date_num is lost after this stage
{$match:{_id: {$in:[46106]}},{"$date_num":{$in:20140901}}} // syntax error: $match takes a single document, the field name must not start with "$", and $in needs an array, i.e. {$match:{_id: {$in:[46106]}, date_num: {$in:[20140901]}}}; even then date_num no longer exists at this point
)
Put $match first: it filters on date_num while the field still exists, and it also improves performance.
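The intended result of the corrected pipeline can be sanity-checked with a plain JavaScript emulation: filter first, then sum. The function name totalMiles and all sample documents except the first (trimmed from the question) are assumptions made for illustration.

```javascript
// First document trimmed from the question; the others are made up.
const docs = [
  { vehicleid: 123, groups: [1, 46105, 46106], totaldistancetraveled: 472.0, date_num: 20140901 },
  { vehicleid: 456, groups: [46106], totaldistancetraveled: 100.5, date_num: 20140901 },
  { vehicleid: 789, groups: [46106], totaldistancetraveled: 50, date_num: 20140801 },
  { vehicleid: 321, groups: [1], totaldistancetraveled: 10, date_num: 20140901 },
];

// Filter on date and group membership first (the leading $match),
// then add up the distances (the $group with $sum).
function totalMiles(input, groupId, dateNum) {
  return input
    .filter((doc) => doc.date_num === dateNum && doc.groups.includes(groupId))
    .reduce((sum, doc) => sum + doc.totaldistancetraveled, 0);
}

const total = totalMiles(docs, 46106, 20140901);
console.log(total);
```

Only the first two documents pass the filter, so the vehicle with a different date_num and the vehicle outside the group do not contribute to the sum.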