Count occurrences of each array element across documents - mongodb

I have a collection with documents like these:
[
{p: [1, 2, 3, 4]},
{p: [1, 2, 7, 9, 10]},
{p: [3, 5]}
]
I want to know how many times each element of p appears across the p arrays of all documents. The expected result is a collection with elements like these:
[
{pElement: 1, count: 2},
{pElement: 2, count: 2},
{pElement: 3, count: 2},
{pElement: 4, count: 1},
{pElement: 7, count: 1},
{pElement: 9, count: 1},
{pElement: 10, count: 1},
{pElement: 5, count: 1}
]
How can I achieve that?

You should use an aggregation pipeline with the following stages:
1. Decompose the p arrays so there is one document per array element, using the $unwind stage.
2. Group the generated documents by the p value and count the occurrences of each one, using the $group stage with the $sum accumulator.
3. Reshape the previous stage's output to look like {pElement: p, count: c}, using the $project stage.
4. Sort the results by the count value, using the $sort stage.
The final aggregation looks like this:
db.collectionName.aggregate([
  { $unwind: "$p" },
  { $group: { _id: "$p", count: { $sum: 1 } } },
  { $project: { _id: 0, pElement: "$_id", count: 1 } },
  { $sort: { count: -1 } }
])
The result would be:
{ "count" : 2, "pElement" : 3 }
{ "count" : 2, "pElement" : 2 }
{ "count" : 2, "pElement" : 1 }
{ "count" : 1, "pElement" : 5 }
{ "count" : 1, "pElement" : 10 }
{ "count" : 1, "pElement" : 9 }
{ "count" : 1, "pElement" : 7 }
{ "count" : 1, "pElement" : 4 }

Related

Sum all named fields in aggregation

I am trying to calculate the sum of all the values below. I have tried googling the question in different ways but cannot find an answer. The data looks like this.
I don't care about the keys; I am just looking for a total of the values for monday:
"monday" : {
"a" : 5,
"b" : 2,
"c" : 1,
"d" : 2,
"e" : 3,
"f" : 9,
"g" : 2,
"h" : 16,
"h2" : 8,
"g" : 2
}
You can use $objectToArray to convert monday into an array of { k, v } documents and then use $reduce to sum the v values:
db.collection.aggregate([
  {
    $project: {
      sum: {
        $reduce: {
          input: { $objectToArray: "$monday" },
          initialValue: 0,
          in: { $add: [ "$$value", "$$this.v" ] }
        }
      }
    }
  }
])
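For intuition, a sketch of what { $objectToArray: "$monday" } feeds into $reduce for the sample document above:
[ { "k" : "a", "v" : 5 }, { "k" : "b", "v" : 2 }, { "k" : "c", "v" : 1 }, ... ]
$reduce then walks that array, adding each $$this.v to the running $$value, which starts at the initialValue of 0. Note that $objectToArray requires MongoDB 3.4.4 or newer.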

Mongodb update array of objects

I have a document like this:
{
"_id" : 5,
"quizzes" : [
{ "wk" : 1, "score" : 10 },
{ "wk" : 2, "score" : 8 },
{ "wk" : 5, "score" : 8 }
]
}
And I want to add a new field to each object in the quizzes array.
The expected result should be
{
"_id" : 5,
"quizzes" : [
{ "wk" : 1, "score" : 10, "day": 1 },
{ "wk" : 2, "score" : 8, "day": 2 },
{ "wk" : 5, "score" : 8, "day": 3 }
]
}
Any solution for this?
You can use the Aggregation Framework:
db.col.aggregate([
  {
    $unwind: {
      path: "$quizzes",
      includeArrayIndex: "quizzes.day"
    }
  },
  {
    $group: {
      _id: "$_id",
      quizzes: {
        $push: {
          "score": "$quizzes.score",
          "wk": "$quizzes.wk",
          "day": { $add: [ "$quizzes.day", 1 ] }
        }
      }
    }
  },
  {
    $out: "col"
  }
])
To assign an index to each element you can use $unwind with the includeArrayIndex option. Those indexes start at 0, so we use $add to start at 1. Then you can group back by the _id property and use $out to save the aggregation results to your collection.
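For intuition, a sketch of one document as it leaves the $unwind stage; the includeArrayIndex value is written into quizzes.day and is still zero-based at this point:
{ "_id" : 5, "quizzes" : { "wk" : 1, "score" : 10, "day" : NumberLong(0) } }
The $group stage then pushes these sub-documents back into an array, adding 1 to each day. Also note that $out replaces the contents of the target collection, so try this on a copy of the collection first if you want to be safe.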

Find documents whose array field contains at least n elements of a given array

It is basically what the title says.
Input: myArray = an array of words.
I have a model with a field wordsCollection, which is an array field.
How can I find all documents of that model whose wordsCollection contains at least n elements of myArray?
Let's say we have the following documents in our collection:
{ "_id" : ObjectId("5759658e654456bf4a014d01"), "a" : [ 1, 3, 9, 2, 9, 0 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d02"), "a" : [ 0, 8, 1 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d03"), "a" : [ 0, 8, 432, 9, 34, -3 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d04"), "a" : [ 0, 0, 4, 3, 2, 7 ] }
and the following input array, with n = 2:
var inputArray = [1, 3, 0];
We can return those documents where the array field contains at least n elements of a given array using the aggregation framework.
The $match stage selects only those documents whose array has at least n elements (here n = 2, checked by requiring that the element at index 1 exists). This reduces the amount of data to be processed further down the pipeline.
The $redact pipeline stage applies a logical condition built with the $cond operator and uses the special system variables $$KEEP to keep a document when the condition is true and $$PRUNE to discard it when the condition is false.
In our case, the condition is a $gte that returns true if the $size of the intersection of the two arrays, computed with the $setIntersection operator, is greater than or equal to 2.
db.collection.aggregate([
  { "$match": { "a.1": { "$exists": true } } },
  { "$redact": {
    "$cond": [
      { "$gte": [
        { "$size": { "$setIntersection": [ "$a", inputArray ] } },
        2
      ]},
      "$$KEEP",
      "$$PRUNE"
    ]
  }}
])
which produces:
{ "_id" : ObjectId("5759658e654456bf4a014d01"), "a" : [ 1, 3, 9, 2, 9, 0 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d02"), "a" : [ 0, 8, 1 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d04"), "a" : [ 0, 0, 4, 3, 2, 7 ] }
Alternatively, use aggregation: in a $match pipeline stage you can combine $size and $gte.
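A minimal sketch of that alternative, assuming MongoDB 3.6 or newer where $expr lets you use aggregation operators inside $match (same inputArray and n = 2 as above):
db.collection.aggregate([
  { "$match": {
    "$expr": {
      "$gte": [
        { "$size": { "$setIntersection": [ "$a", inputArray ] } },
        2
      ]
    }
  }}
])
This keeps the whole filter in a single $match stage instead of combining $match and $redact.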

How to group documents on index of array elements?

I'm looking for a way to take data such as this
{ "_id" : 5, "count" : 1, "arr" : [ "aga", "dd", "a" ] },
{ "_id" : 6, "count" : 4, "arr" : [ "aga", "ysdf" ] },
{ "_id" : 7, "count" : 4, "arr" : [ "sad", "aga" ] }
I would like to sum the count based on the 1st item (index) of arr. In another aggregation I would like to do the same with the 1st and 2nd items in the arr array.
I've tried using unwind, but that breaks up the data and the hierarchy is then lost.
I've also tried using
$group: {
_id: {
arr_0:'$arr.0'
},
total:{
$sum: '$count'
}
}
but the result is blank arrays
Actually you can't use dot notation to group your documents by the element at a specified index. To do that you have two options:
First, the optimal way: use the $arrayElemAt operator, new in MongoDB 3.2, which returns the element at a specified index in an array.
db.collection.aggregate([
  { "$group": {
    "_id": { "$arrayElemAt": [ "$arr", 0 ] },
    "count": { "$sum": 1 }
  }}
])
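The question also asks about grouping on the 1st and 2nd items together; a sketch of the same idea with a compound _id, this time summing the count field as in the original attempt (it assumes every arr has at least two elements):
db.collection.aggregate([
  { "$group": {
    "_id": {
      "arr_0": { "$arrayElemAt": [ "$arr", 0 ] },
      "arr_1": { "$arrayElemAt": [ "$arr", 1 ] }
    },
    "total": { "$sum": "$count" }
  }}
])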
On MongoDB 3.0 and earlier you will need to de-normalise your array with $unwind, then $group by _id and use the $first operator to return the first item in the array. From there you regroup your documents by that value and use $sum to get the count. But this only works for the first and last index, because MongoDB only provides the $first and $last accumulators.
db.collection.aggregate([
  { "$unwind": "$arr" },
  { "$group": {
    "_id": "$_id",
    "arr": { "$first": "$arr" }
  }},
  { "$group": {
    "_id": "$arr",
    "count": { "$sum": 1 }
  }}
])
which yields something like this:
{ "_id" : "sad", "count" : 1 }
{ "_id" : "aga", "count" : 2 }
To group on the element at an arbitrary position p in your array, your best option is the mapReduce function.
var mapFunction = function() { emit(this.arr[0], 1); };  // for position p, emit this.arr[p] instead
var reduceFunction = function(key, value) { return Array.sum(value); };
db.collection.mapReduce(mapFunction, reduceFunction, { "out": { "inline": 1 } } )
Which returns:
{
  "results" : [
    {
      "_id" : "aga",
      "value" : 2
    },
    {
      "_id" : "sad",
      "value" : 1
    }
  ],
  "timeMillis" : 27,
  "counts" : {
    "input" : 3,
    "emit" : 3,
    "reduce" : 1,
    "output" : 2
  },
  "ok" : 1
}
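On MongoDB 3.2 or newer you rarely need mapReduce for this, because $arrayElemAt already accepts any index; a sketch grouping on the element at index 1, for example:
db.collection.aggregate([
  { "$group": {
    "_id": { "$arrayElemAt": [ "$arr", 1 ] },
    "count": { "$sum": 1 }
  }}
])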

mongodb return fields optionally

I have documents like this:
{
"_id" : "abcdefg",
"status" : 1,
"myData": {
"a": "a",
"b": "b"
}
},
{
"_id" : "zxcvbnm",
"status" : 3,
"myData": {
"a": "a",
"b": "b"
}
}
if "status" === 3, field "myData" will return, like
myCollection.find({}, {"status": 1, "myData": 1})
if "status" !== 3, field "myData" will not return, like:
myCollection.find({}, {"status": 1, "myData": 0})
How could I do that in mongodb?
You could set it to null with aggregation:
db.collection.aggregate({
  $project: {
    status: 1,
    myData: {
      $cond: [
        { $eq: ['$status', 3] },
        '$myData',
        null
      ]
    }
  }
})
This does not remove the field, but it saves you some traffic.
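On MongoDB 3.6 or newer, a sketch of how to drop the field entirely instead of returning null, using the $$REMOVE system variable inside $cond:
db.collection.aggregate({
  $project: {
    status: 1,
    myData: {
      // keep myData only when status is 3; otherwise omit the field entirely
      $cond: [ { $eq: ['$status', 3] }, '$myData', '$$REMOVE' ]
    }
  }
})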