I have a large set of documents that may have two arrays or one of the two. I want to merge them in a $project.
I am currently using $concatArrays, but as the documentation says, it returns null when one of the arrays is null. I can't figure out how to add a condition in there that will return either the $concatArrays result or whichever array is present.
Example
I have:
{_id: 1, array1: ['a', 'b', 'b'], array2: ['e', 'e']}
{_id: 2, array1: ['a', 'b', 'b']}
{_id: 3, array2: ['e', 'e']}
I want:
{_id: 1, combinedArray: ['a','b', 'b', 'e', 'e']}
{_id: 2, combinedArray: ['a','b', 'b']}
{_id: 3, combinedArray: ['e', 'e']}
I tried:
$project: {
combinedArray: { '$concatArrays': [ '$array1', '$array2' ] }
}
//output (unexpected result):
{_id: 1, combinedArray: ['a','b', 'b', 'e', 'e']}
{_id: 2, combinedArray: null}
{_id: 3, combinedArray: null}
I also tried:
$project: {
combinedArray: { '$setUnion': [ '$array1', '$array2' ] }
}
//output (unexpected result):
{_id: 1, combinedArray: ['a','b', 'e']}
{_id: 2, combinedArray: ['a','b']}
{_id: 3, combinedArray: ['e']}
As the documentation for $concatArrays says:
If any argument resolves to a value of null or refers to a field that
is missing, $concatArrays returns null.
So we need to be sure that we are not passing arguments which refer to a missing field or null. You can do that with $ifNull operator:
Evaluates an expression and returns the value of the expression if the
expression evaluates to a non-null value. If the expression evaluates
to a null value, including instances of undefined values or missing
fields, returns the value of the replacement expression.
So just return an empty array whenever the field expression does not evaluate to a non-null value:
db.collection.aggregate([
{$project: {
combinedArray: { '$concatArrays': [
{$ifNull: ['$array1', []]},
{$ifNull: ['$array2', []]}
] }
}
}
])
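To make the fallback behaviour concrete, here is a minimal plain-JavaScript sketch of the same logic, run against the sample documents from the question. It does not touch MongoDB; `ifNull` and `combineArrays` are hypothetical helper names that only imitate what $ifNull and $concatArrays do in the pipeline above.

```javascript
// Plain-JavaScript imitation of the $ifNull + $concatArrays projection.
// No MongoDB involved; ifNull and combineArrays are hypothetical helpers.
const mergeDocs = [
  { _id: 1, array1: ["a", "b", "b"], array2: ["e", "e"] },
  { _id: 2, array1: ["a", "b", "b"] },
  { _id: 3, array2: ["e", "e"] },
];

// Like $ifNull: use the fallback when the value is null or missing (undefined).
const ifNull = (value, fallback) => (value == null ? fallback : value);

// Like the $project stage: concatenate both arrays, treating absent ones as [].
const combineArrays = (doc) => ({
  _id: doc._id,
  combinedArray: [...ifNull(doc.array1, []), ...ifNull(doc.array2, [])],
});

const combined = mergeDocs.map(combineArrays);
console.log(JSON.stringify(combined));
```

Because an absent array is replaced by an empty one before concatenation, duplicates inside each array are preserved, unlike with $setUnion.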
You can easily achieve this with the $ifNull operator:
db.arr.aggregate([
{
$project:{
combinedArray:{
$concatArrays:[
{
$ifNull:[
"$array1",
[]
]
},
{
$ifNull:[
"$array2",
[]
]
}
]
}
}
}
])
output:
{ "_id" : 1, "combinedArray" : [ "a", "b", "b", "e", "e" ] }
{ "_id" : 2, "combinedArray" : [ "a", "b", "b" ] }
{ "_id" : 3, "combinedArray" : [ "e", "e" ] }
I tried to do this with nested $cond. The answer with $ifNull is better, but I am still posting mine.
db.getCollection('test').aggregate( [{
$project: {
combinedArray: { $cond: [
{ $and: [ { $isArray: ["$array1"] }, { $isArray: ["$array2"] } ] },
{ '$concatArrays': [ '$array1', '$array2' ] },
{ $cond: [
{ $isArray: ["$array1"] },
"$array1",
"$array2"
] }
] }
}
}] )
I need to identify which documents in my collection have a wrong date string (longer than 10 characters):
{
"_id": ObjectId("5c05984246a0201286d4b57a"),
f: "x",
"_a": [
{
"_onlineStore": {}
},
{
"_p": {
"s": {
"a": {
"t": [
{
"dateP": "20200-09-20",
"l": "English",
"size": "XXL"
}
]
},
"c": {
"t": [
{
"dateP": "20300-09-20",
"l": "English",
"size": "XXL"
}
]
}
}
}
}
]
}
and the output needs to be as follows:
{f:"x",dateP:"20200-09-20", t:"a"}
{f:"x",dateP:"20300-09-20", t:"c"}
The last field in the output, "t", is not compulsory but desirable.
Please help.
We can use $objectToArray for this:
db.collection.aggregate([
{$unwind: "$_a"},
{$project: {_id: 0, f: 1, data: {$objectToArray: "$_a._p.s"}}},
{$unwind: "$data"},
{$unwind: "$data.v.t"},
{$match: {$expr: {$gt: [{$strLenCP: "$data.v.t.dateP"}, 10]}}},
{$project: {f: 1, dateP: "$data.v.t.dateP", t: "$data.k"}}
])
See how it works on the playground example
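As a rough illustration of what the stages do, the following plain-JavaScript sketch walks the same nested structure by hand. It is not MongoDB code; `findLongDates` is a hypothetical function name, and the sample data is the document from the question without its _id.

```javascript
// Plain-JavaScript imitation of the pipeline; findLongDates is a hypothetical name.
const products = [
  {
    f: "x",
    _a: [
      { _onlineStore: {} },
      {
        _p: {
          s: {
            a: { t: [{ dateP: "20200-09-20", l: "English", size: "XXL" }] },
            c: { t: [{ dateP: "20300-09-20", l: "English", size: "XXL" }] },
          },
        },
      },
    ],
  },
];

function findLongDates(docs, maxLength = 10) {
  const results = [];
  for (const doc of docs) {
    for (const entry of doc._a ?? []) {                 // like $unwind: "$_a"
      const s = entry._p?.s ?? {};                      // entries without _p.s yield nothing
      for (const [key, value] of Object.entries(s)) {   // like $objectToArray + $unwind
        for (const item of value.t ?? []) {             // like $unwind: "$data.v.t"
          // $strLenCP counts code points; spreading a string iterates by code point.
          if (typeof item.dateP === "string" && [...item.dateP].length > maxLength) {
            results.push({ f: doc.f, dateP: item.dateP, t: key });
          }
        }
      }
    }
  }
  return results;
}

const longDates = findLongDates(products);
console.log(JSON.stringify(longDates));
```

The key of each entry under "s" plays the role of "$data.k" in the pipeline, which is where the optional "t" field of the output comes from.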
Let's say I have those documents below:
[
{
array : ['a', 'b' , 'c'],
},
{
array : ['b', 'd' , 'e'],
},
{
array : ['d', 'e' , 'f'],
},
]
and input array for query:
["b","d","e","f"]
Expected output:
['b', 'd' , 'e'],['d', 'e' , 'f']
Which query can I use to do that?
And how to filter which element is not in the document?
Expected result:
[
{
array : ['b', 'd' , 'e'],
missingElement : ['f']
},
{
array : ['d', 'e' , 'f'],
missingElement : ['b']
},
]
1. $expr - Allows the use of aggregation operators in the query.
1.1. $eq - Checks whether the results of 1.1.1 and 1.1.2 are equal.
1.1.1. $size - Gets the size of the array field.
1.1.2. $size - Gets the size of the array returned by 1.1.2.1.
1.1.2.1. $setIntersection - Intersects the array field with the input array and returns the common value(s) as an array.
db.collection.find({
$expr: {
$eq: [
{
$size: "$array"
},
{
$size: {
$setIntersection: [
"$array",
[
"b",
"d",
"e",
"f"
]
]
}
}
]
}
})
Sample Mongo Playground
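The size comparison can be sketched in plain JavaScript as follows. This is only an imitation of the query logic, not MongoDB code; `setIntersection` is a hypothetical helper that returns distinct common values, which is how $setIntersection behaves.

```javascript
// Plain-JavaScript imitation of the $expr query; setIntersection is a hypothetical helper.
const arrayDocs = [
  { array: ["a", "b", "c"] },
  { array: ["b", "d", "e"] },
  { array: ["d", "e", "f"] },
];
const wanted = ["b", "d", "e", "f"];

// Like $setIntersection: distinct values present in both arrays.
const setIntersection = (left, right) => {
  const rightSet = new Set(right);
  return [...new Set(left)].filter((value) => rightSet.has(value));
};

// Like the $expr: keep a document when the intersection is as large as its array,
// i.e. every element of the document's array occurs in the input array.
const fullyCovered = arrayDocs.filter(
  (doc) => doc.array.length === setIntersection(doc.array, wanted).length
);
console.log(JSON.stringify(fullyCovered));
```

One caveat of comparing sizes: the intersection contains no duplicates, so a document whose array repeats a value would not match even if all its values are in the input. If that matters, $setIsSubset may be worth considering instead.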
Updated
For Aggregation query to find missing element(s):
1. $match - Filters the documents (as explained above for $expr).
2. $project - Shapes the output documents. For the missingElement field, use the $filter operator to keep each value of the input array that does not exist ($not and $in) in the document's array.
db.collection.aggregate([
{
$match: {
$expr: {
$eq: [
{
$size: "$array"
},
{
$size: {
$setIntersection: [
"$array",
[
"b",
"d",
"e",
"f"
]
]
}
}
]
}
}
},
{
$project: {
array: 1,
missingElement: {
$filter: {
input: [
"b",
"d",
"e",
"f"
],
cond: {
$not: {
$in: [
"$$this",
"$array"
]
}
}
}
}
}
}
])
Sample Mongo Playground (Aggregation query)
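The missingElement computation boils down to filtering the input array against each document's array. A minimal plain-JavaScript sketch of that step, assuming the two documents that survive the $match stage (the variable names are illustrative only):

```javascript
// Plain-JavaScript imitation of the $project stage; names are illustrative only.
const inputValues = ["b", "d", "e", "f"];
const matchedDocs = [
  { array: ["b", "d", "e"] },
  { array: ["d", "e", "f"] },
];

const withMissing = matchedDocs.map((doc) => ({
  array: doc.array,
  // Like $filter with $not + $in: keep input values absent from the document's array.
  missingElement: inputValues.filter((value) => !doc.array.includes(value)),
}));
console.log(JSON.stringify(withMissing));
```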
Here is the document
{
'_id': ObjectId('61a4262a53ddaa8b93374613'),
'userid': 'renyi',
'data1': [{'a': 1}, 1, 2, 3],
'data2': [{'c': 1}, {'b': 2}, {'c': 3}, {'c': 2}],
'data': {'0': {'a': 1}}
}
With this
coll.find_one({'userid':'renyi','data2.c':1},{'_id':0,"data2.$":1})
I can get
{'data2': [{'c': 1}]}
But how to get
{'data2': [{'c': 1},{'c': 2}]}
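You can use $filter in the projection to keep only the matching elements of the array. Note that aggregation expressions in a find() projection require MongoDB 4.4 or later.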
db.collection.find({
"userid": "renyi",
"data2.c": {
$in: [
1,
2
]
}
},
{
"_id": 0,
"data2": {
$filter: {
input: "$data2",
cond: {
$in: [
"$$this.c",
[
1,
2
]
]
}
}
}
})
Sample Mongo Playground
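For intuition, the projection behaves like an ordinary array filter. Below is a small plain-JavaScript sketch using the document from the question (without its _id); it is not MongoDB or pymongo code, and the variable names are illustrative.

```javascript
// Plain-JavaScript imitation of the $filter projection; names are illustrative only.
const userDoc = {
  userid: "renyi",
  data1: [{ a: 1 }, 1, 2, 3],
  data2: [{ c: 1 }, { b: 2 }, { c: 3 }, { c: 2 }],
  data: { 0: { a: 1 } },
};
const allowedValues = [1, 2];

// Like $filter with $in: keep elements of data2 whose "c" is one of the allowed values.
// Elements without a "c" field (such as { b: 2 }) are dropped.
const projected = {
  data2: userDoc.data2.filter((item) => allowedValues.includes(item.c)),
};
console.log(JSON.stringify(projected));
```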
Could you please help me write an aggregation query in MongoDB?
I have the following data structure:
[
{
id: 1,
shouldPay: true,
users: [
{
id: 100,
items: [{...}],
tags: ['a', 'b', 'c', 'd']
},
{
id: 100,
items: [{...}],
tags: ['b', 'c', 'd']
},
{
id: 100,
items: [{...}],
tags: ['c', 'd']
}
],
}
]
As a result, I want to get something like this:
[
{
id: 1,
shouldPay: true,
user: {
id: 100,
items: [{...}],
tags: ['a', 'b', 'c', 'd']
}
}
]
The main idea is to select a specific user that has "a" letter or list of letters ['a', 'b'] in tags.
You can use the aggregation below.
Use $match at the start of the pipeline to filter out documents in which no user has "a" or "b" in the tags array. Then use $filter with $setIsSubset to keep only the users whose tags contain both "a" and "b".
$arrayElemAt returns the element at the specified index of the array, here the first matching user.
db.collection.aggregate([
{ "$match": { "users.tags": { "$in": ["a", "b"] }}},
{ "$project": {
"users": {
"$arrayElemAt": [
{ "$filter": {
"input": "$users",
"cond": { "$setIsSubset": [["a", "b"], "$$this.tags"] }
}},
0
]
}
}}
])
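The selection logic can be sketched in plain JavaScript as below. This is an imitation, not MongoDB code: `isSubset` is a hypothetical helper standing in for $setIsSubset, the elided items arrays are left out, and the sketch additionally carries id and shouldPay through to resemble the desired output (the pipeline above would need those fields added to its $project to do the same).

```javascript
// Plain-JavaScript imitation of $match + $filter/$setIsSubset + $arrayElemAt.
// isSubset is a hypothetical helper; the elided "items" arrays are omitted.
const orders = [
  {
    id: 1,
    shouldPay: true,
    users: [
      { id: 100, tags: ["a", "b", "c", "d"] },
      { id: 100, tags: ["b", "c", "d"] },
      { id: 100, tags: ["c", "d"] },
    ],
  },
];
const requiredTags = ["a", "b"];

// Like $setIsSubset: true when every required value occurs in the candidate array.
const isSubset = (subset, superset) => subset.every((value) => superset.includes(value));

const selected = orders
  // Like the $match with $in: some user has at least one of the required tags.
  .filter((doc) => doc.users.some((user) => user.tags.some((tag) => requiredTags.includes(tag))))
  // Like $filter + $arrayElemAt 0: first user whose tags contain all required tags.
  .map((doc) => ({
    id: doc.id,
    shouldPay: doc.shouldPay,
    user: doc.users.filter((user) => isSubset(requiredTags, user.tags))[0],
  }));
console.log(JSON.stringify(selected));
```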
You need to use $unwind along with $match:
db.collection.aggregate([
{
$unwind: "$users"
},
{
$match: {
"users.tags": "a"
}
}
])
Result:
[
{
"_id": ObjectId("5a934e000102030405000000"),
"id": 1,
"shouldPay": true,
"users": {
"id": 100,
"tags": [
"a",
"b",
"c",
"d"
]
}
}
]
I have a query that I don't know how to write.
I have a tour collection with embedded points.
{
_id: "my tour",
points: [
{_id: '1', point: 'A'},
{_id: '2', point: 'B'},
{_id: '3', point: 'C'},
{_id: '4', point: 'D'}
]
}
When a user searches for a tour that starts at B and ends at C, I want this document to be listed.
But when a user searches for a tour that starts at C and ends at B, the tour should not be listed.
How can I write this query?
You use $elemMatch here since there are "two" conditions on the array entry selection:
var start = "B",
end = "C";
db.getCollection('nest').find({
"points": {
"$elemMatch": {
"point": { "$gte": start, "$lte": end }
}
}
})
This selects the document, since the conditions are true for both of the elements "B" and "C". But reversing the order:
var start = "C",
end = "B";
db.getCollection('nest').find({
"points": {
"$elemMatch": {
"point": { "$gte": start, "$lte": end }
}
}
})
Here the condition is not true for any array element, since no element has a value that is both greater than or equal to "C" and less than or equal to "B".
So that's how you enforce the order, and it's fine as long as the data itself is always ordered that way.
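A plain-JavaScript sketch of the range condition may help show why reversing start and end matches nothing. It is not MongoDB code; `matchesRange` is a hypothetical name, and the string comparison here is the ordinary JavaScript one, which agrees with a simple binary comparison for these single-letter values.

```javascript
// Plain-JavaScript imitation of the $elemMatch range condition; matchesRange is hypothetical.
const tour = {
  _id: "my tour",
  points: [
    { _id: "1", point: "A" },
    { _id: "2", point: "B" },
    { _id: "3", point: "C" },
    { _id: "4", point: "D" },
  ],
};

// Like $elemMatch with $gte/$lte: at least one element must satisfy both bounds at once.
const matchesRange = (doc, start, end) =>
  doc.points.some((p) => p.point >= start && p.point <= end);

console.log(matchesRange(tour, "B", "C")); // true: "B" and "C" both lie within [B, C]
console.log(matchesRange(tour, "C", "B")); // false: nothing is >= "C" and <= "B"
```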
If you have data that is itself presented in reverse:
{
_id: "my tour",
points: [
{_id: '4', point: 'D'},
{_id: '3', point: 'C'},
{_id: '2', point: 'B'},
{_id: '1', point: 'A'}
]
}
And you do want to "start" at "C" and end at "B", indicating forward progression by array index, then you need a calculated statement:
var start = "C",
end = "B";
db.getCollection('nest').aggregate([
// Only consider documents that contain both points
{ "$match": {
"points.point": { "$all": [start,end] }
}},
// Keep the document only when start comes before end in the array
{ "$redact": {
"$cond": {
"if": {
"$lt": [
{ "$indexOfArray": [ "$points.point", start ] },
{ "$indexOfArray": [ "$points.point", end ] }
]
},
"then": "$$KEEP",
"else": "$$PRUNE"
}
}}
])
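The idea of "forward progression by array index" can be sketched in plain JavaScript as follows. This is an imitation rather than MongoDB code, and it assumes that a tour matches when the start point occurs at a lower array index than the end point; `visitsInOrder` is a hypothetical name.

```javascript
// Plain-JavaScript imitation of an index-order check; visitsInOrder is hypothetical.
const reversedTour = {
  _id: "my tour",
  points: [
    { _id: "4", point: "D" },
    { _id: "3", point: "C" },
    { _id: "2", point: "B" },
    { _id: "1", point: "A" },
  ],
};

function visitsInOrder(doc, start, end) {
  const names = doc.points.map((p) => p.point); // like "$points.point"
  const startIndex = names.indexOf(start);      // like $indexOfArray, -1 when absent
  const endIndex = names.indexOf(end);
  // Both points must exist, and start must come first in the array.
  return startIndex !== -1 && endIndex !== -1 && startIndex < endIndex;
}

console.log(visitsInOrder(reversedTour, "C", "B")); // true: index 1 comes before index 2
console.log(visitsInOrder(reversedTour, "B", "C")); // false: index 2 comes after index 1
```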
So the best advice here is to keep the array ordered so that it can be searched with a lexical range. You can do it with a calculation, but the plain query is far more performant.