Find in tripple nested array mongodb [duplicate] - mongodb

I have this Collection in mongodb
{
"_id" : "777",
"someKey" : "someValue",
"someArray" : [
{
"name" : "name1",
"someNestedArray" : [
{
"name" : "value"
},
{
"name" : "delete me"
}
]
}
]
}
I want to find document based on someArray.someNestedArray.name
but i can't find any useful link all search result about update nested array
i am trying this but return nothing
db.mycollection.find({"someArray.$.someNestedArray":{"$elemMatch":{"name":"1"}}})
db.mycollection.find({"someArray.$.someNestedArray.$.name":"1"})
and Some thing else
how can i find by element in double nested array mongodb?

In the simplest sense this just follows the basic form of "dot notation" as used by MongoDB. That will work regardless of which array member the inner array member is in, as long as it matches a value:
db.mycollection.find({
"someArray.someNestedArray.name": "value"
})
That is fine for a "single field" value, for matching multiple-fields you would use $elemMatch:
db.mycollection.find({
"someArray": {
"$elemMatch": {
"name": "name1",
"someNestedArray": {
"$elemMatch": {
"name": "value",
"otherField": 1
}
}
}
}
})
That matches the document which would contain something with a a field at that "path" matching the value. If you intended to "match and filter" the result so only the matched element was returned, this is not possible with the positional operator projection, as quoted:
Nested Arrays
The positional $ operator cannot be used for queries which traverse more than one array, such as queries that traverse arrays nested within other arrays, because the replacement for the $ placeholder is a single value
Modern MongoDB
We can do this by applying $filter and $map here. The $map is really needed because the "inner" array can change as a result of the "filtering", and the "outer" array of course does not match the conditions when the "inner" was stripped of all elements.
Again following the example of actually having multiple properties to match within each array:
db.mycollection.aggregate([
{ "$match": {
"someArray": {
"$elemMatch": {
"name": "name1",
"someNestedArray": {
"$elemMatch": {
"name": "value",
"otherField": 1
}
}
}
}
}},
{ "$addFields": {
"someArray": {
"$filter": {
"input": {
"$map": {
"input": "$someArray",
"as": "sa",
"in": {
"name": "$$sa.name",
"someNestedArray": {
"$filter": {
"input": "$$sa.someNestedArray",
"as": "sn",
"cond": {
"$and": [
{ "$eq": [ "$$sn.name", "value" ] },
{ "$eq": [ "$$sn.otherField", 1 ] }
]
}
}
}
}
},
},
"as": "sa",
"cond": {
"$and": [
{ "$eq": [ "$$sa.name", "name1" ] },
{ "$gt": [ { "$size": "$$sa.someNestedArray" }, 0 ] }
]
}
}
}
}}
])
Therefore on the "outer" array the $filter actually looks at the $size of the "inner" array after it was "filtered" itself, so you can reject those results when the whole inner array does in fact match noting.
Older MongoDB
In order to "project" only the matched element, you need the .aggregate() method:
db.mycollection.aggregate([
// Match possible documents
{ "$match": {
"someArray.someNestedArray.name": "value"
}},
// Unwind each array
{ "$unwind": "$someArray" },
{ "$unwind": "$someArray.someNestedArray" },
// Filter just the matching elements
{ "$match": {
"someArray.someNestedArray.name": "value"
}},
// Group to inner array
{ "$group": {
"_id": {
"_id": "$_id",
"name": "$someArray.name"
},
"someKey": { "$first": "$someKey" },
"someNestedArray": { "$push": "$someArray.someNestedArray" }
}},
// Group to outer array
{ "$group": {
"_id": "$_id._id",
"someKey": { "$first": "$someKey" },
"someArray": { "$push": {
"name": "$_id.name",
"someNestedArray": "$someNestedArray"
}}
}}
])
That allows you to "filter" the matches in nested arrays for one or more results within the document.

You can also try something like below:
db.collection.aggregate(
{ $unwind: '$someArray' },
{
$project: {
'filteredValue': {
$filter: {
input: "$someArray.someNestedArray",
as: "someObj",
cond: { $eq: [ '$$someObj.name', 'delete me' ] }
}
}
}
}
)

Related

Querying MongoDB array of similar objects

I've to work with old MongoDB where objects in one collection are structured like this.
{
"_id": ObjectId("57fdfcc7a7c81fde38b79a3d"),
"parameters": [
{
"key": "key1",
"value": "value1"
},
{
"key": "key2",
"value": "value2"
}
]
}
The problem is that parameters is an array of objects, which makes efficient querying difficult. There can be about 50 different objects, which all have "key" and "value" properties. Is it possible to make a query, where the query targets "key" and "value" inside one object? I've tried
db.collection.find({$and:[{"parameters.key":"value"}, {"parameters.value":"another value"}]})
but this query hits all the objects in parameters array.
EDIT. Nikhil Jagtiani found solution to my original question, but actually I should be able query to target multiple objects inside parameters array. E.g. check keys and values in two different objects in parameters array.
Please refer below mongo shell aggregate query :
db.collection.aggregate([
{
$unwind:"$parameters"
},
{
$match:
{
"parameters.key":"key1",
"parameters.value":"value1"
}
}
])
1) Stage 1 - Unwind : Deconstructs an array field from the input documents to output a document for each element. Each output document is the input document with the value of the array field replaced by the element.
2) Stage 2 - Match : Filters the documents to pass only the documents that match the specified condition(s) to the next pipeline stage.
Without aggregation, queries will return the entire document even if one subdocument matches. This pipeline will only return the required subdocuments.
Edit: If you need to specify multiple key value pairs, what we need is $in for parameters field.
db.collection.aggregate([{$unwind:"$parameters"},{$match:{"parameters":{$in:[{ "key" : "key1", "value" : "value1"},{ "key" : "key2", "value" : "value2" }]}}}])
will match the following two pairs of key-values as subdocuments:
1) { "key" : "key1", "value" : "value1" }
2) { "key" : "key2", "value" : "value2" }
There is a $filter operator in the aggregation framework which is perfect for such queries. A bit verbose but very efficient, you can use it as follows:
db.surveys.aggregate([
{ "$match": {
"$and": [
{
"parameters.key": "key1",
"parameters.value": "val1"
},
{
"parameters.key": "key2",
"parameters.value": "val2"
}
]
}},
{
"$project": {
"parameters": {
"$filter": {
"input": "$parameters",
"as": "item",
"cond": {
"$or": [
{
"$and" : [
{ "$eq": ["$$item.key", "key1"] },
{ "$eq": ["$$item.value", "val1"] }
]
},
{
"$and" : [
{ "$eq": ["$$item.key", "key2"] },
{ "$eq": ["$$item.value", "val2"] }
]
}
]
}
}
}
}
}
])
You can also do this with more set operators in MongoDB 2.6 without using $unwind:
db.surveys.aggregate([
{ "$match": {
"$and": [
{
"parameters.key": "key1",
"parameters.value": "val1"
},
{
"parameters.key": "key2",
"parameters.value": "val2"
}
]
}},
{
"$project": {
"parameters": {
"$setDifference": [
{ "$map": {
"input": "$parameters",
"as": "item",
"in": {
"$cond": [
{ "$or": [
{
"$and" : [
{ "$eq": ["$$item.key", "key1"] },
{ "$eq": ["$$item.value", "val1"] }
]
},
{
"$and" : [
{ "$eq": ["$$item.key", "key2"] },
{ "$eq": ["$$item.value", "val2"] }
]
}
]},
"$$item",
false
]
}
}},
[false]
]
}
}
}
])
For a solution with MongoDB 2.4, you would need to use the $unwind operator unfortunately:
db.surveys.aggregate([
{ "$match": {
"$and": [
{
"parameters.key": "key1",
"parameters.value": "val1"
},
{
"parameters.key": "key2",
"parameters.value": "val2"
}
]
}},
{ "$unwind": "$parameters" },
{ "$match": {
"$and": [
{
"parameters.key": "key1",
"parameters.value": "val1"
},
{
"parameters.key": "key2",
"parameters.value": "val2"
}
]
}},
{
"$group": {
"_id": "$_id",
"parameters": { "$push": "$parameters" }
}
}
]);
Is it possible to make a query, where the query targets "key" and
"value" inside one object?
This is possible if you know which object(id) you are going to query upfront(to be given as input parameter in the find query). If that is not possible then we can try on the below approach for efficient querying.
Build an index on the parameters.key and if needed also on parameters.value. This would considerably improve the query performance.
Please see
https://docs.mongodb.com/manual/indexes/
https://docs.mongodb.com/manual/core/index-multikey/

Mongo Group and sum with two fields

I have documents like:
{
"from":"abc#sss.ddd",
"to" :"ssd#dff.dff",
"email": "Hi hello"
}
How can we calculate count of sum "from and to" or "to and from"?
Like communication counts between two people?
I am able to calculate one way sum. I want to have sum both ways.
db.test.aggregate([
{ $group: {
"_id":{ "from": "$from", "to":"$to"},
"count":{$sum:1}
}
},
{
"$sort" :{"count":-1}
}
])
Since you need to calculate number of emails exchanged between 2 addresses, it would be fair to project a unified between field as following:
db.a.aggregate([
{ $match: {
to: { $exists: true },
from: { $exists: true },
email: { $exists: true }
}},
{ $project: {
between: { $cond: {
if: { $lte: [ { $strcasecmp: [ "$to", "$from" ] }, 0 ] },
then: [ { $toLower: "$to" }, { $toLower: "$from" } ],
else: [ { $toLower: "$from" }, { $toLower: "$to" } ] }
}
}},
{ $group: {
"_id": "$between",
"count": { $sum: 1 }
}},
{ $sort :{ count: -1 } }
])
Unification logic should be quite clear from the example: it is an alphabetically sorted array of both emails. The $match and $toLower parts are optional if you trust your data.
Documentation for operators used in the example:
$match
$exists
$project
$cond
$lte
$strcasecmp
$toLower
$group
$sum
$sort
You basically need to consider the _id for grouping as an "array" of the possible "to" and "from" values, and then of course "sort" them, so that in every document the combination is always in the same order.
Just as a side note, I want to add that "typically" when I am dealing with messaging systems like this, the "to" and "from" sender/recipients are usually both arrays to begin with anyway, so it usally forms the base of where different variations on this statement come from.
First, the most optimal MongoDB 3.2 statement, for single addresses
db.collection.aggregate([
// Join in array
{ "$project": {
"people": [ "$to", "$from" ],
}},
// Unwind array
{ "$unwind": "$people" },
// Sort array
{ "$sort": { "_id": 1, "people": 1 } },
// Group document
{ "$group": {
"_id": "$_id",
"people": { "$push": "$people" }
}},
// Group people and count
{ "$group": {
"_id": "$people",
"count": { "$sum": 1 }
}}
]);
Thats the basics, and now the only variations are in construction of the "people" array ( stage 1 only above ).
MongoDB 3.x and 2.6.x - Arrays
{ "$project": {
"people": { "$setUnion": [ "$to", "$from" ] }
}}
MongoDB 3.x and 2.6.x - Fields to array
{ "$project": {
"people": {
"$map": {
"input": ["A","B"],
"as": "el",
"in": {
"$cond": [
{ "$eq": [ "A", "$$el" ] },
"$to",
"$from"
]
}
}
}
}}
MongoDB 2.4.x and 2.2.x - from fields
{ "$project": {
"to": 1,
"from": 1,
"type": { "$const": [ "A", "B" ] }
}},
{ "$unwind": "$type" },
{ "$group": {
"_id": "$_id",
"people": {
"$addToSet": {
"$cond": [
{ "$eq": [ "$type", "A" ] },
"$to",
"$from"
]
}
}
}}
But in all cases:
Get all recipients into a distinct array.
Order the array to a consistent order
Group on the "always in the same order" list of recipients.
Follow that and you cannot go wrong.

Get Distinct list of two properties using MongoDB 2.4

I have an article collection:
{
_id: 9999,
authorId: 12345,
coAuthors: [23456,34567],
title: 'My Article'
},
{
_id: 10000,
authorId: 78910,
title: 'My Second Article'
}
I'm trying to figure out how to get a list of distinct author and co-author ids out of the database. I have tried push, concat, and addToSet, but can't seem to find the right combination. I'm on 2.4.6 so I don't have access to setUnion.
Whilst $setUnion would be the "ideal" way to do this, there is another way that basically involved "switching" between a "type" to alternate which field is picked:
db.collection.aggregate([
{ "$project": {
"authorId": 1,
"coAuthors": { "$ifNull": [ "$coAuthors", [null] ] },
"type": { "$const": [ true,false ] }
}},
{ "$unwind": "$coAuthors" },
{ "$unwind": "$type" },
{ "$group": {
"_id": {
"$cond": [
"$type",
"$authorId",
"$coAuthors"
]
}
}},
{ "$match": { "_id": { "$ne": null } } }
])
And that is it. You may know the $const operation as the $literal operator from MongoDB 2.6. It has always been there, but was only documented and given an "alias" at the 2.6 release.
Of course the $unwind operations in both cases produce more "copies" of the data, but this is grouping for "distinct" values so it does not matter. Just depending on the true/false alternating value for the projected "type" field ( once unwound ) you just pick the field alternately.
Also this little mapReduce does much the same thing:
db.collection.mapReduce(
function() {
emit(this.authorId,null);
if ( this.hasOwnProperty("coAuthors"))
this.coAuthors.forEach(function(id) {
emit(id,null);
});
},
function(key,values) {
return null;
},
{ "out": { "inline": 1 } }
)
For the record, $setUnion is of course a lot cleaner and more performant:
db.collection.aggregate([
{ "$project": {
"combined": {
"$setUnion": [
{ "$map": {
"input": ["A"],
"as": "el",
"in": "$authorId"
}},
{ "$ifNull": [ "$coAuthors", [] ] }
]
}
}},
{ "$unwind": "$combined" },
{ "$group": {
"_id": "$combined"
}}
])
So there the only real concerns are converting the singular "authorId" to an array via $map and feeding an empty array where the "coAuthors" field is not present in the document.
Both output the same distinct values from the sample documents:
{ "_id" : 78910 }
{ "_id" : 23456 }
{ "_id" : 34567 }
{ "_id" : 12345 }

How to find document and single subdocument matching given criterias in MongoDB collection

I have collection of products. Each product contains array of items.
> db.products.find().pretty()
{
"_id" : ObjectId("54023e8bcef998273f36041d"),
"shop" : "shop1",
"name" : "product1",
"items" : [
{
"date" : "01.02.2100",
"purchasePrice" : 1,
"sellingPrice" : 10,
"count" : 15
},
{
"date" : "31.08.2014",
"purchasePrice" : 10,
"sellingPrice" : 1,
"count" : 5
}
]
}
So, can you please give me an advice, how I can query MongoDB to retrieve all products with only single item which date is equals to the date I pass to query as parameter.
The result for "31.08.2014" must be:
{
"_id" : ObjectId("54023e8bcef998273f36041d"),
"shop" : "shop1",
"name" : "product1",
"items" : [
{
"date" : "31.08.2014",
"purchasePrice" : 10,
"sellingPrice" : 1,
"count" : 5
}
]
}
What you are looking for is the positional $ operator and "projection". For a single field you need to match the required array element using "dot notation", for more than one field use $elemMatch:
db.products.find(
{ "items.date": "31.08.2014" },
{ "shop": 1, "name":1, "items.$": 1 }
)
Or the $elemMatch for more than one matching field:
db.products.find(
{ "items": {
"$elemMatch": { "date": "31.08.2014", "purchasePrice": 1 }
}},
{ "shop": 1, "name":1, "items.$": 1 }
)
These work for a single array element only though and only one will be returned. If you want more than one array element to be returned from your conditions then you need more advanced handling with the aggregation framework.
db.products.aggregate([
{ "$match": { "items.date": "31.08.2014" } },
{ "$unwind": "$items" },
{ "$match": { "items.date": "31.08.2014" } },
{ "$group": {
"_id": "$_id",
"shop": { "$first": "$shop" },
"name": { "$first": "$name" },
"items": { "$push": "$items" }
}}
])
Or possibly in shorter/faster form since MongoDB 2.6 where your array of items contains unique entries:
db.products.aggregate([
{ "$match": { "items.date": "31.08.2014" } },
{ "$project": {
"shop": 1,
"name": 1,
"items": {
"$setDifference": [
{ "$map": {
"input": "$items",
"as": "el",
"in": {
"$cond": [
{ "$eq": [ "$$el.date", "31.08.2014" ] },
"$$el",
false
]
}
}},
[false]
]
}
}}
])
Or possibly with $redact, but a little contrived:
db.products.aggregate([
{ "$match": { "items.date": "31.08.2014" } },
{ "$redact": {
"$cond": [
{ "$eq": [ { "$ifNull": [ "$date", "31.08.2014" ] }, "31.08.2014" ] },
"$$DESCEND",
"$$PRUNE"
]
}}
])
More modern, you would use $filter:
db.products.aggregate([
{ "$match": { "items.date": "31.08.2014" } },
{ "$addFields": {
"items": {
"input": "$items",
"cond": { "$eq": [ "$$this.date", "31.08.2014" ] }
}
}}
])
And with multiple conditions, the $elemMatch and $and within the $filter:
db.products.aggregate([
{ "$match": {
"$elemMatch": { "date": "31.08.2014", "purchasePrice": 1 }
}},
{ "$addFields": {
"items": {
"input": "$items",
"cond": {
"$and": [
{ "$eq": [ "$$this.date", "31.08.2014" ] },
{ "$eq": [ "$$this.purchasePrice", 1 ] }
]
}
}
}}
])
So it just depends on whether you always expect a single element to match or multiple elements, and then which approach is better. But where possible the .find() method will generally be faster since it lacks the overhead of the other operations, which in those last to forms does not lag that far behind at all.
As a side note, your "dates" are represented as strings which is not a very good idea going forward. Consider changing these to proper Date object types, which will greatly help you in the future.
Based on Neil Lunn's code I work with this solution, it includes automatically all first level keys (but you could also exclude keys if you want):
db.products.find(
{ "items.date": "31.08.2014" },
{ "shop": 1, "name":1, "items.$": 1 }
{ items: { $elemMatch: { date: "31.08.2014" } } },
)
With multiple requirements:
db.products.find(
{ "items": {
"$elemMatch": { "date": "31.08.2014", "purchasePrice": 1 }
}},
{ items: { $elemMatch: { "date": "31.08.2014", "purchasePrice": 1 } } },
)
Mongo supports dot notation for sub-queries.
See: http://docs.mongodb.org/manual/reference/glossary/#term-dot-notation
Depending on your driver, you want something like:
db.products.find({"items.date":"31.08.2014"});
Note that the attribute is in quotes for dot notation, even if usually your driver doesn't require this.

Mongodb aggregation, finding within an array of values

I have a schemea that creates documents using the following structure:
{
"_id" : "2014-07-16:52TEST",
"date" : ISODate("2014-07-16T23:52:59.811Z"),
"name" : "TEST"
"values" : [
[
1405471921000,
0.737121
],
[
1405471922000,
0.737142
],
[
1405471923000,
0.737142
],
[
1405471924000,
0.737142
]
]
}
In the values, the first index is a timestamp. What I'm trying to do is query a specific timestamp to find the closest value ($gte).
I've tried the following aggregate query:
[
{ "$match": {
"values": {
"$elemMatch": { "0": {"$gte": 1405471923000} }
},
"name" : 'TEST'
}},
{ "$project" : {
"name" : 1,
"values" : 1
}},
{ "$unwind": "$values" },
{ "$match": { "values.0": { "$gte": 1405471923000 } } },
{ "$limit" : 1 },
{ "$sort": { "values.0": -1 } },
{ "$group": {
"_id": "$name",
"values": { "$push": "$values" },
}}
]
This seems to work, but it doesn't pull the closest value. It seems to pull anything greater or equal to and the sort doesn't seem to get applied, so it will pull a timestamp that is far in the future.
Any suggestions would be great!
Thank you
There are a couple of things wrong with the approach here even though it is a fair effort. You are right that you need to $sort here, but the problem is that you cannot "sort" on an inner element with an array. In order to get a value that can be sorted you must $unwind the array first as it otherwise will not sort on an array position.
You also certainly do not want $limit in the pipeline. You might be testing this against a single document, but "limit" will actually act on the entire set of documents in the pipeline. So if more than one document was matching your condition then they would be thrown away.
The key thing you want to do here is use $first in your $group stage, which is applied once you have sorted to get the "closest" element that you want.
db.collection.aggregate([
// Documents that have an array element matching the condition
{ "$match": {
"values": { "$elemMatch": { "0": {"$gte": 1405471923000 } } }
}},
// Unwind the top level array
{ "$unwind": "$values" },
// Filter just the elements that match the condition
{ "$match": { "values.0": { "$gte": 1405471923000 } } },
// Take a copy of the inner array
{ "$project": {
"date": 1,
"name": 1,
"values": 1,
"valCopy": "$values"
}},
// Unwind the inner array copy
{ "$unwind": "$valCopy" },
// Filter the inner elements
{ "$match": { "valCopy": { "$gte": 1405471923000 } }},
// Sort on the now "timestamp" values ascending for nearest
{ "$sort": { "valCopy": 1 } },
// Take the "first" values
{ "$group": {
"_id": "$_id",
"date": { "$first": "$date" },
"name": { "$first": "$name" },
"values": { "$first": "$values" },
}},
// Optionally push back to array to match the original structure
{ "$group": {
"_id": "$_id",
"date": { "$first": "$date" },
"name": { "$first": "$name" },
"values": { "$push": "$values" },
}}
])
And this produces your document with just the "nearest" timestamp value matching the original document form:
{
"_id" : "2014-07-16:52TEST",
"date" : ISODate("2014-07-16T23:52:59.811Z"),
"name" : "TEST",
"values" : [
[
1405471923000,
0.737142
]
]
}