I am trying to write a query that returns all the results of some survey data stored in MongoDB. The tricky part is that the questions come in different types: some are radio questions with a single answer, some are multi-select questions, and some hold numeric values that need to be averaged. I therefore want to perform a different aggregation depending on the type of question.
The results are stored in a schema like this, with each item in the array being a survey response.
[
{
metaData: {
survey: new ObjectId("62206ea0b31be3535abac547")
},
answers: {
'question1': 'a',
'question2': 'a',
'question3': ['a','c'],
'question4': 3
},
createdAt: 2022-03-03T07:30:40.517Z,
},
{
metaData: {
survey: new ObjectId("62206ea0b31be3535abac547"),
},
answers: {
'question1': 'a',
'question2': 'b',
'question3': ['a','c'],
'question4': 2
},
createdAt: 2022-03-03T07:30:40.518Z,
},
{
metaData: {
survey: new ObjectId("62206ea0b31be3535abac547"),
},
answers: {
'question1': 'b',
'question2': 'c',
'question3': ['b'],
'question4': 1
},
createdAt: 2022-03-03T07:30:40.518Z,
}
]
question1 and question2 are radio questions, so each has exactly one answer, whereas question3 is a multi-select, so a response can contain several answers. question4 is a numeric value that needs to be averaged.
I think there is some way to accomplish this in a single aggregation pipeline with some combination of facets, grouping, filters, projections, etc, but I am stuck.
I'd like to get a final result that looks like this
{
'question1' : {
'a' : 2,
'b' : 1
},
'question2' : {
'a' : 1,
'b' : 1,
'c' : 1,
},
'question3' : {
'a' : 2,
'b' : 1,
'c' : 2,
},
'question4' : 2 //avg (3+2+1)/3
}
OR even better:
{
'radio': {
'question1' : {
'a' : 2,
'b' : 1
},
'question2' : {
'a' : 1,
'b' : 1,
'c' : 1,
},
},
'multi': {
'question3' : {
'a' : 2,
'b' : 1,
'c' : 2,
}
},
'avg' : {
'question4' : 2
}
}
My pipeline would look something like this:
Response.aggregate([
{ $match: { 'metaData.survey': surveyId} }, // filter only for the specific survey
{ $project: { // I assume I have to turn the answers into an array
"answers": { $objectToArray: "$answers" },
"createdAt": "$createdAt"
}
},
// maybe facet here?
// conceptually, In the next stage I'd want to bucket the questions
// by type with something like below, then perform the right type of
// aggregation depending on the question type
// if $in [$$answers.k ['question1', 'question2']] group by k, v and count
// if $in [$$answers.k ['question3']] unwind and count each unique value?
// { $facet : { radio: [], multi:[]}}
])
Basically, I know which question IDs are radio questions and which are multi-selects. I'm just trying to figure out how to structure the pipeline to produce the desired output based on the question ID being in a known array.
Bonus points if I can also group the results by day or month based on the createdAt time.
db.collection.aggregate([
{
$match: {}
},
{
$project: { answers: { $objectToArray: "$answers" } }
},
{
$unwind: "$answers"
},
{
$unwind: "$answers.v"
},
{
$group: {
_id: "$answers",
c: { "$sum": 1 }
}
},
{
$group: {
_id: "$_id.k",
// $arrayToObject requires string keys; question4 values are numbers
v: { "$push": { k: { "$toString": "$_id.v" }, v: "$c" } }
}
},
{
$group: {
_id: null,
v: { "$push": { k: "$_id", v: { "$arrayToObject": "$v" } } }
}
},
{
$set: { v: { $arrayToObject: "$v" } }
},
{
$replaceWith: "$v"
}
])
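For the bonus part of the question (grouping by day or month), one possible approach, which is not part of the answer above, is to derive a period key from createdAt with $dateToString and carry it through each $group _id. A minimal sketch, assuming createdAt is stored as a BSON date; the helper name periodKey is hypothetical:

```javascript
// Hypothetical helper: builds a $dateToString expression that maps
// createdAt to a day ("2022-03-03") or month ("2022-03") bucket key.
function periodKey(unit) {
  const formats = { day: "%Y-%m-%d", month: "%Y-%m" };
  if (!(unit in formats)) {
    throw new Error("unsupported unit: " + unit);
  }
  return { $dateToString: { format: formats[unit], date: "$createdAt" } };
}

// First stages of the pipeline, with the period carried along.
const stages = [
  { $project: { answers: { $objectToArray: "$answers" }, period: periodKey("day") } },
  { $unwind: "$answers" },
  { $unwind: "$answers.v" },
  // Count each (period, question, answer) combination.
  { $group: { _id: { period: "$period", k: "$answers.k", v: "$answers.v" }, c: { $sum: 1 } } }
];
```

The later $group stages would then need "$_id.period" in their _id as well, so that the final documents are produced once per period rather than once overall.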
db.collection.aggregate([
{
$match: {}
},
{
$project: { answers: { $objectToArray: "$answers" } }
},
{
$unwind: "$answers"
},
{
$set: {
"answers.type": {
$switch: {
branches: [
{
case: { $isArray: "$answers.v" },
then: "multi"
},
{
case: { $eq: [ { $type: "$answers.v" }, "string" ] },
then: "radio"
},
{
case: { $isNumber: "$answers.v" },
then: "avg"
}
],
default: "other"
}
}
}
},
{
$unwind: "$answers.v"
},
{
$group: {
_id: "$answers",
c: { $sum: 1 }
}
},
{
$group: {
_id: "$_id.k",
type: { $first: "$_id.type" },
v: {
$push: {
k: { $toString: "$_id.v" },
v: "$c"
}
}
}
},
{
$group: {
_id: "$type",
v: {
$push: {
k: "$_id",
v: { $arrayToObject: "$v" }
}
}
}
},
{
$group: {
_id: null,
v: {
$push: {
k: "$_id",
v: { $arrayToObject: "$v" }
}
}
}
},
{
$set: { v: { $arrayToObject: "$v" } }
},
{
$replaceWith: "$v"
},
{
$set: {
avg: {
$arrayToObject: {
$map: {
input: { $objectToArray: "$avg" },
as: "s",
in: {
k: "$$s.k",
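// weighted mean: sum(value * count) / sum(count), so repeated values are not under-weighted
v: {
$divide: [
{ $sum: { $map: { input: { $objectToArray: "$$s.v" }, as: "x", in: { $multiply: [ { $toDouble: "$$x.k" }, "$$x.v" ] } } } },
{ $sum: { $map: { input: { $objectToArray: "$$s.v" }, as: "x", in: "$$x.v" } } }
]
}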
}
}
}
}
}
}
])
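The pipeline above infers the question type from the stored value. Since the question states that the radio and multi-select question IDs are known up front, the $switch could instead test membership with $in. A sketch under that assumption; typeStage is an illustrative name, not part of any driver API:

```javascript
// Hypothetical helper: tags each unwound answer with a type based on
// known question IDs instead of inspecting the value's BSON type.
function typeStage(idsByType) {
  const branches = Object.entries(idsByType).map(([type, ids]) => ({
    case: { $in: ["$answers.k", ids] },
    then: type
  }));
  return { $set: { "answers.type": { $switch: { branches, default: "other" } } } };
}

const stage = typeStage({
  radio: ["question1", "question2"],
  multi: ["question3"],
  avg: ["question4"]
});
```

The resulting stage would take the place of the $set stage that uses $isArray, $type and $isNumber; the rest of the pipeline would stay as it is.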
I've created an aggregate query but for some reason it doesn't seem to work for custom fields created in the aggregation pipeline.
return this.repository.mongo().aggregate([
{
$match: { q1_avg: { $regex: baseQuery['value'], $options: 'i' } }, // NOT WORKING
},
{
$group: {
_id: '$product_sku',
id: { $first: "$_id" },
product_name: { $first: '$product_name' },
product_category: { $first: '$product_category' },
product_sku: { $first: '$product_sku' },
q1_cnt: { $sum: 1 },
q1_votes: { $push: "$final_rating" }
},
},
{
$facet: {
pagination: [ { $count: 'total' } ],
data: [
{
$project: {
_id: 1,
id: 1,
product_name: 1,
product_category: 1,
product_sku: 1,
q1_cnt: 1,
q1_votes: {
$filter: {
input: '$q1_votes',
as: 'item',
cond: { $ne: ['$$item', null] }
}
},
},
},
{
$set: {
q1_avg: { $round: [ { $avg: '$q1_votes' }, 2 ] },
}
},
{ $unset: ['q1_votes'] },
{ $skip: skip },
{ $limit: limit },
{ $sort: sortList }
]
}
},
{ $unwind : "$pagination" },
]).next();
The q1_avg value is numeric and, as far as I know, regex only works with strings. Could that be the reason?
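Two things in the pipeline above probably contribute, offered here as a likely reading rather than a confirmed diagnosis: the $match is the first stage, so it runs before q1_avg has been computed, and $regex only matches string values. One possible workaround, assuming MongoDB 4.2 or later for $regexMatch, is to filter after the stage that creates q1_avg and to convert the number to a string first. The helper name avgMatchStage is hypothetical:

```javascript
// Hypothetical helper: matches documents whose numeric q1_avg, rendered
// as a string, matches the given pattern (case-insensitive).
function avgMatchStage(pattern) {
  return {
    $match: {
      $expr: {
        $regexMatch: {
          input: { $toString: "$q1_avg" },
          regex: pattern,
          options: "i"
        }
      }
    }
  };
}

// Must be placed after the stage that creates q1_avg, for example
// right after the $set inside the data facet.
const stage = avgMatchStage("^4");
```

If the stage sits inside the data facet, the pagination count will not reflect the filter; moving the $project, $set and this $match ahead of $facet would keep the two in step.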
Let's say I have the input docs below:
[
{
"_id": "6225ca4052e7c226e2dd836d",
"data": [
"07",
"07",
"12",
"19",
"07",
"32"
]
},
{
"_id": "6225ca4052e7c226e2dd888f",
"data": [
"99",
"97",
"52",
"99",
"58",
"92"
]
}
]
I want to count the occurrences of each element of the data string array, per document. In JS, I can use countBy. How can I achieve the same using the MongoDB Aggregation Framework?
I have tried $reduce, but MongoDB does not seem to support assigning a dynamically named field to an object.
{
$reduce: {
input: '$data',
initialValue: {},
in: { /* assign `$$this` with count to `$$value`, but failed! */ }
}
}
Below is the desired output.
[
{
"_id": "6225ca4052e7c226e2dd836d",
"freqs": {
"12": 1,
"19": 1,
"32": 1,
"07": 3
}
},
{
"_id": "6225ca4052e7c226e2dd888f",
"freqs": {
"52": 1,
"58": 1,
"92": 1,
"97": 1,
"99": 2
}
}
]
db.collection.aggregate([
{
$match: {}
},
{
$unwind: "$data"
},
{
$group: {
_id: { id: "$_id", v: "$data" }, // count per document and value, so equal values in different documents stay separate
c: { $sum: 1 }
}
},
{
$group: {
_id: "$_id.id",
data: { $push: { k: "$_id.v", v: "$c" } }
}
},
{
$set: {
data: { $arrayToObject: "$data" }
}
}
])
db.collection.aggregate([
{
$set: {
data: {
$function: {
body: "function(d) {let obj = {}; d.forEach(e => {if(obj[e]==null) { obj[e]=1; }else{ obj[e]++; }}); return obj;}",
args: [
"$data"
],
lang: "js"
}
}
}
}
])
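The body passed to $function is plain JavaScript, so the logic can be exercised outside MongoDB before it is embedded as a string. A standalone sketch of the same counting logic (the name countBy is illustrative):

```javascript
// Counts occurrences of each element, as the $function body does.
function countBy(values) {
  const freqs = {};
  for (const value of values) {
    freqs[value] = (freqs[value] || 0) + 1;
  }
  return freqs;
}

// Data taken from the first sample document in the question.
const freqs = countBy(["07", "07", "12", "19", "07", "32"]);
console.log(freqs["07"], freqs["12"]);
```

Note that $function runs server-side JavaScript, which has to be enabled on the deployment and is generally slower than native aggregation operators.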
My objective is to write an efficient query that, given the input below, produces the expected output. I have a working solution, but all the "types" are written out manually, so I'm looking for a way to get the same output without hard-coding them.
Input
| reportId | type   | weight |
| -------- | ------ | ------ |
| A        | "fish" | 4      |
| A        | "fish" | 2      |
| A        | "cow"  | 0      |
| B        | "fish" | 2      |
| B        | "tuna" | 1      |
| B        | "bird" |        |
Expected output
[
{
reportId: "A",
totalCount: 3,
totalWeight: 6,
fishCount: 2,
tunaCount: 0,
cowCount: 1,
birdCount: 0
},
{
reportId: "B",
totalCount: 3,
totalWeight: 2,
fishCount: 1,
tunaCount: 1,
cowCount: 0,
birdCount: 1
},
]
Partial "hard-coded" solution
What I have been doing so far is to create two $group stages. It kind of gets the job done, but in my real use case there are a lot of types, so the group stages become very long.
[
{
$group: {
_id: { reportId: "$reportId", type: "$type" },
count: { $sum: 1 },
totalWeight: { $sum: "$weight" }
}
},
{
$group: {
_id: "$_id.reportId",
totalCount: { $sum: "$count" }, // the first stage names this field count
totalWeight: { $sum: "$totalWeight" },
fishCount: {
$sum: {
$cond: {
"if": { $eq: ["$_id.type", "fish"] },
then: "$count",
else: 0
}
}
},
tunaCount: {
$sum: {
$cond: {
"if": { $eq: ["$_id.type", "tuna"] },
then: "$count",
else: 0
}
}
},
// <== And here I have a count block for each type. Can I get the same result in a better way?
}
}
]
I will focus on the second part, which is the difficult one. I don't know whether there is a shorter or better solution, but this one should work:
db.collection.aggregate([
{
$unset: "_id"
},
{
$set: {
data: {
"$objectToArray": "$$ROOT"
}
}
},
{
$group: {
_id: "$reportId",
data: {
$push: "$data"
}
}
},
{
$set: {
data: {
$reduce: {
input: "$data",
initialValue: [],
in: {
$concatArrays: [
"$$value",
"$$this"
]
}
}
}
}
},
{
$set: {
data: {
$filter: {
input: "$data",
cond: {
$not: {
$in: [
"$$this.k",
[
"totalCount",
"totalWeight"
]
]
}
}
}
}
}
},
{
$unwind: "$data"
},
{
$group: {
_id: "$_id",
data: {
$push: "$data"
}
}
},
{
$replaceRoot: {
newRoot: {
$arrayToObject: "$data"
}
}
}
])
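Independently of the pipeline above, if the list of types is known in application code, the repetitive $cond blocks from the question's hard-coded solution can be generated rather than typed out. A sketch, assuming the first $group stage from the question has already produced a count and a totalWeight per reportId and type; typeCounters is a hypothetical helper:

```javascript
// Hypothetical helper: builds one "<type>Count" accumulator per type,
// each summing the per-type count only when _id.type matches.
function typeCounters(types) {
  const counters = {};
  for (const type of types) {
    counters[type + "Count"] = {
      $sum: { $cond: [{ $eq: ["$_id.type", type] }, "$count", 0] }
    };
  }
  return counters;
}

// Second $group stage with the counters spread in.
const secondGroup = {
  $group: {
    _id: "$_id.reportId",
    totalCount: { $sum: "$count" },
    totalWeight: { $sum: "$totalWeight" },
    ...typeCounters(["fish", "tuna", "cow", "bird"])
  }
};
```

This keeps the output shape of the hard-coded version; the only thing that has to be maintained by hand is the array of type names.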
I have the following film collection structure:
{
"_id" : ObjectId,
"title" : "movie-1",
"actors" : [
"actor-1",
"actor-2",
"actor-3",
],
"categories" : [
"category-1",
"category-2"
]
}
I want to list every actor together with their movies grouped by category, as shown below:
{
"actor": "actor-1",
"result": {
"category-1": [ "movie-1", "movie-2" ],
"category-2": [ "movie-1", "movie-4" ]
}
}
I have tried the following aggregation:
db.film.aggregate([
{ $unwind: "$actors" },
{ $group: {
_id: "$actors",
data: { $push: { movie: "$title", categories: "$categories" } }
}
},
{
$project: {
_id: 0,
actor: "$_id",
result: {
$reduce: {
input: "$data",
initialValue: {},
in: {
$let: {
vars: { movie: "$$this.movie", categories: "$$this.categories" },
in: {
$arrayToObject: {
$map: {
input: "$$categories",
in: { k: "$$this", v: "$$movie" }
}
}
}
}
}
}
}
}
}
])
But for each actor I get only one movie per category, as shown below:
{
"actor" : "actor-1",
"result" : {
"category-1" : "movie-1",
"category-2" : "movie-2",
"category-3" : "movie-3"
}
}
How can I solve this problem? Thanks in advance.
You need a second $unwind on the categories array after flattening the actors array. Then group the flattened documents by both fields, actor and category, to build the list of movie titles.
A second $group is then required to shape the result field.
The following pipeline should give you the desired result:
db.film.aggregate([
{ "$unwind": "$actors" },
{ "$unwind": "$categories" },
{ "$group": {
"_id": { "actor": "$actors", "category": "$categories" },
"movies": { "$push": "$title" }
} },
{ "$group": {
"_id": "$_id.actor",
"result": {
"$push": {
"k": "$_id.category",
"v": "$movies"
}
}
} },
{ "$addFields": {
"result": { "$arrayToObject": "$result" }
} }
])
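To check the pipeline's output against a small fixture, the same unwind-and-group logic can be mirrored in plain JavaScript. This is only a reference sketch for tests, not a replacement for the aggregation; the function name and the sample films are made up:

```javascript
// Mirrors the two $unwind stages and the two $group stages in memory.
function moviesByActorAndCategory(films) {
  const byActor = new Map();
  for (const film of films) {
    for (const actor of film.actors) {          // first $unwind
      if (!byActor.has(actor)) byActor.set(actor, {});
      const result = byActor.get(actor);
      for (const category of film.categories) { // second $unwind
        (result[category] = result[category] || []).push(film.title);
      }
    }
  }
  return [...byActor].map(([actor, result]) => ({ actor, result }));
}

const films = [
  { title: "movie-1", actors: ["actor-1", "actor-2"], categories: ["category-1", "category-2"] },
  { title: "movie-2", actors: ["actor-1"], categories: ["category-1"] }
];
const grouped = moviesByActorAndCategory(films);
```

The sketch names the key actor, as in the desired output, whereas the pipeline above leaves the actor in _id; a final $project would be needed to rename it.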
I've used a sledgehammer to crack a nut (c)
Some stages could be replaced by a $reduce inside the $project stage (criticism and suggestions are welcome).
db.film.aggregate([
{
$unwind: "$actors"
},
{
$group: {
_id: "$actors",
data: {
$push: {
movie: "$title",
categories: "$categories"
}
}
}
},
{
$unwind: "$data"
},
{
$unwind: "$data.categories"
},
{
$group: {
_id: {
actors: "$_id",
categories: "$data.categories"
},
movies: {
$push: "$data.movie"
}
}
},
{
$project: {
_id: 0,
actor: "$_id.actors",
result: {
k: "$_id.categories",
v: "$movies"
}
}
},
{
$group: {
_id: "$actor",
result: {
$push: "$result"
}
}
},
{
$project: {
_id: 0,
actor: "$_id",
result: {
$arrayToObject: "$result"
}
}
},
{
$sort: {
actor: 1
}
}
])
I have a collection with documents similar to the following format:
{
departure:{name: "abe"},
arrival:{name: "tom"}
},
{
departure:{name: "bob"},
arrival:{name: "abe"}
}
I want to get output like so:
{
name: "abe",
departureCount: 1,
arrivalCount: 1
},
{
name: "bob",
departureCount: 1,
arrivalCount: 0
},
{
name: "tom",
departureCount: 0,
arrivalCount: 1
}
I'm able to get the counts individually by doing a query for the specific data like so:
db.sched.aggregate([
{
"$group":{
_id: "$departure.name",
departureCount: {$sum: 1}
}
}
])
But I haven't figured out how to merge the arrival and departure name into one document along with counts for both. Any suggestions on how to accomplish this?
You should use $map to split each document into two entries (one tagged as a departure, one as an arrival), then $unwind and $group:
[
{
$project: {
dep: '$departure.name',
arr: '$arrival.name'
}
},
{
$project: {
f: {
$map: {
input: {
$literal: ['dep', 'arr']
},
as: 'el',
in : {
type: '$$el',
name: {
$cond: [{
$eq: ['$$el', 'dep']
}, '$dep', '$arr']
}
}
}
}
}
},
{
$unwind: '$f'
}, {
$group: {
_id: {
'name': '$f.name'
},
departureCount: {
$sum: {
$cond: [{
$eq: ['$f.type', 'dep']
}, 1, 0]
}
},
arrivalCount: {
$sum: {
$cond: [{
$eq: ['$f.type', 'arr']
}, 1, 0]
}
}
}
}, {
$project: {
_id: 0,
name: '$_id.name',
departureCount: 1,
arrivalCount: 1
}
}
]
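The idea behind the pipeline above (turn every document into one departure record and one arrival record, then count per name) can be mirrored in plain JavaScript to produce expected values for a test. A reference sketch only; countTrips is a hypothetical name and the input is the sample from the question:

```javascript
// Counts how often each name appears as a departure and as an arrival.
function countTrips(docs) {
  const counts = new Map();
  const bump = (name, field) => {
    if (!counts.has(name)) {
      counts.set(name, { name, departureCount: 0, arrivalCount: 0 });
    }
    counts.get(name)[field] += 1;
  };
  for (const doc of docs) {
    bump(doc.departure.name, "departureCount");
    bump(doc.arrival.name, "arrivalCount");
  }
  // Sort by name for a stable order; the pipeline itself has no $sort.
  return [...counts.values()].sort((a, b) => a.name.localeCompare(b.name));
}

const result = countTrips([
  { departure: { name: "abe" }, arrival: { name: "tom" } },
  { departure: { name: "bob" }, arrival: { name: "abe" } }
]);
```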