How to query Start and End Elements in Order from an Array? - mongodb

I have a query in mind but I don't know how to write it.
I have a tour collection with embedded points:
{
    _id: "my tour",
    points: [
        {_id: '1', point: 'A'},
        {_id: '2', point: 'B'},
        {_id: '3', point: 'C'},
        {_id: '4', point: 'D'}
    ]
}
When a user searches for a tour starting at B and ending at C, I want this document to be listed.
But when a user searches for a start at C and an end at B, the tour should not be listed.
How can I write this query?

You use $elemMatch here since there are "two" conditions on the array entry selection:
var start = "B",
    end = "C";

db.getCollection('nest').find({
    "points": {
        "$elemMatch": {
            "point": { "$gte": start, "$lte": end }
        }
    }
})
This would select the document, since the conditions are true for both the elements with "B" and "C". But reversing the order:
var start = "C",
    end = "B";

db.getCollection('nest').find({
    "points": {
        "$elemMatch": {
            "point": { "$gte": start, "$lte": end }
        }
    }
})
The condition is not true for any array element, since no element has a value that is both greater than or equal to "C" and less than or equal to "B".
So that's how you enforce the order, and it's fine as long as the data itself is always ordered that way.
If you have data that is itself presented in reverse:
{
    _id: "my tour",
    points: [
        {_id: '4', point: 'D'},
        {_id: '3', point: 'C'},
        {_id: '2', point: 'B'},
        {_id: '1', point: 'A'}
    ]
}
And you do want to "start" at "C" and end at "B" indicating forward progression by "array index", then you need a calculated statement:
var start = "C",
    end = "B";

db.getCollection('nest').aggregate([
    { "$match": {
        "points.point": { "$all": [start, end] }
    }},
    { "$redact": {
        "$cond": {
            "if": {
                "$lt": [
                    { "$indexOfArray": [ "$points.point", start ] },
                    { "$indexOfArray": [ "$points.point", end ] }
                ]
            },
            "then": "$$KEEP",
            "else": "$$PRUNE"
        }
    }}
])
So the best advice here is to keep the array ordered in a way that is searchable as a lexical range: you can do it with the calculated form, but the raw query is far more performant.

Related

How to add a sub-document field with a condition in MongoDB?

I am trying to add a new subField with a condition.
If the field already exists, I don't want to overwrite it.
If the condition is not fulfilled, I don't want to add the parent object.
Here is my collection:
{type: "A", object: {a: "", b: "foo"}},
{type: "A", object: {a: ""}},
{type: "A"},
{type: "B"}
Here is my aggregation stage:
{
    $addFields: {
        "object.b": {
            $cond: {
                if: { $eq: ["$type", "A"] },
                then: { $ifNull: ["$object.b", "bar"] },
                else: "$DROP"
            }
        }
    }
}
$DROP is not an aggregation operator, but in the else case I don't want to add the new field.
As written, it will not create the b field, but the parent object remains.
Here is my current result:
{type: "A", "object": {a: "", b: "foo"}},
{type: "A", "object": {a: "", b: "bar"}},
{type: "A", "object": {b: "bar"}},
{type: "B", "object": {}},
Here is what I want:
{type: "A", object: {a: "", b: "foo"}},
{type: "A", object: {a: "", b: "bar"}},
{type: "A", object: {b: "bar"}},
{type: "B"}
Your help will be highly appreciated.
This aggregate query will give you the desired result:
db.collection.aggregate([
    {
        $addFields: {
            object: {
                $cond: {
                    if: { $eq: [ "$type", "A" ] },
                    then: {
                        $mergeObjects: [
                            "$object",
                            { b: { $ifNull: [ "$object.b", "bar" ] } }
                        ]
                    },
                    else: "$$REMOVE"
                }
            }
        }
    }
])
Note that $$REMOVE is an aggregation system variable.
When $set adds a nested path, the whole path is created; even if the expression ends up as $$REMOVE, that only affects the last field on the path, and the rest of the path has already been added (see the example).
Query
A $set with a $switch, starting from object:
If object doesn't exist: if type is "A", add {object: {"b": "bar"}}, else $$REMOVE.
If type is "A" AND "$object.b" has no value: add {"b": "bar"} (this case also covers object: null).
Else: keep the old value (another type, or b already had a value).
Maybe it could be smaller, but we check many things (see the example for all the data cases):
object exists / null / not exists
type "A" / not
b exists / null / has a value
Test code here
db.collection.aggregate([
    {
        "$set": {
            "object": {
                "$switch": {
                    "branches": [
                        {
                            "case": { "$eq": [ { "$type": "$object" }, "missing" ] },
                            "then": {
                                "$cond": [
                                    { "$eq": [ "$type", "A" ] },
                                    { "b": "bar" },
                                    "$$REMOVE"
                                ]
                            }
                        },
                        {
                            "case": {
                                "$and": [
                                    { "$eq": [ "$type", "A" ] },
                                    { "$or": [
                                        { "$eq": [ "$object.b", null ] },
                                        { "$eq": [ { "$type": "$object.b" }, "missing" ] }
                                    ]}
                                ]
                            },
                            "then": { "$mergeObjects": [ "$object", { "b": "bar" } ] }
                        }
                    ],
                    "default": "$object"
                }
            }
        }
    }
])

Aggregate nested objects mongodb [duplicate]

This question already has answers here:
Retrieve only the queried element in an object array in MongoDB collection
(18 answers)
Closed 4 years ago.
Could you please help me write an aggregation query in MongoDB?
I have the following data structure:
[
    {
        id: 1,
        shouldPay: true,
        users: [
            {
                id: 100,
                items: [{...}],
                tags: ['a', 'b', 'c', 'd']
            },
            {
                id: 100,
                items: [{...}],
                tags: ['b', 'c', 'd']
            },
            {
                id: 100,
                items: [{...}],
                tags: ['c', 'd']
            }
        ]
    }
]
As a result, I want to get something like this:
[
    {
        id: 1,
        shouldPay: true,
        user: {
            id: 100,
            items: [{...}],
            tags: ['a', 'b', 'c', 'd']
        }
    }
]
The main idea is to select the specific user that has the letter 'a', or a list of letters like ['a', 'b'], in tags.
You can use the aggregation below.
Use $match at the start of the pipeline to filter out documents whose tags arrays don't contain "a" and "b". Then use $filter with $setIsSubset to filter the nested array, and $arrayElemAt to return the specified element from the array.
db.collection.aggregate([
    { "$match": { "users.tags": { "$in": ["a", "b"] }}},
    { "$project": {
        "users": {
            "$arrayElemAt": [
                { "$filter": {
                    "input": "$users",
                    "cond": { "$setIsSubset": [["a", "b"], "$$this.tags"] }
                }},
                0
            ]
        }
    }}
])
Alternatively, you can use $unwind along with $match:
db.collection.aggregate([
    {
        $unwind: "$users"
    },
    {
        $match: {
            "users.tags": "a"
        }
    }
])
Result:
[
    {
        "_id": ObjectId("5a934e000102030405000000"),
        "id": 1,
        "shouldPay": true,
        "users": {
            "id": 100,
            "tags": [ "a", "b", "c", "d" ]
        }
    }
]

Mongo $concatArrays even when null

I have a large set of documents that may have two arrays, or only one of the two. I want to merge them in a $project.
I am currently using $concatArrays, but as the documentation says, it returns null when one of the arrays is null. I can't figure out how to add a conditional statement that will return either the $concatArrays result or whatever single array is there.
Example
I have:
{_id: 1, array1: ['a', 'b', 'b'], array2: ['e', 'e']}
{_id: 2, array1: ['a', 'b', 'b']}
{_id: 3, array2: ['e', 'e']}
I want:
{_id: 1, combinedArray: ['a','b', 'b', 'e', 'e']}
{_id: 2, combinedArray: ['a','b', 'b']}
{_id: 3, combinedArray: ['e', 'e']}
I tried:
$project: {
combinedArray: { '$concatArrays': [ '$array1', '$array2' ] }
}
//output (unexpected result):
{_id: 1, combinedArray: ['a','b', 'b', 'e', 'e']}
{_id: 2, combinedArray: null}
{_id: 3, combinedArray: null}
I also tried:
$project: {
combinedArray: { '$setUnion': [ '$array1', '$array2' ] }
}
//output (unexpected result):
{_id: 1, combinedArray: ['a','b', 'e']}
{_id: 2, combinedArray: ['a','b']}
{_id: 3, combinedArray: ['e']}
As the documentation for $concatArrays says:
If any argument resolves to a value of null or refers to a field that
is missing, $concatArrays returns null.
So we need to be sure that we are not passing arguments which refer to a missing field or null. You can do that with the $ifNull operator:
Evaluates an expression and returns the value of the expression if the
expression evaluates to a non-null value. If the expression evaluates
to a null value, including instances of undefined values or missing
fields, returns the value of the replacement expression.
So just return an empty array when the field expression does not evaluate to a non-null value:
db.collection.aggregate([
    { $project: {
        combinedArray: { '$concatArrays': [
            { $ifNull: ['$array1', []] },
            { $ifNull: ['$array2', []] }
        ]}
    }}
])
You can easily achieve this with the $ifNull operator:
db.arr.aggregate([
    {
        $project: {
            combinedArray: {
                $concatArrays: [
                    { $ifNull: [ "$array1", [] ] },
                    { $ifNull: [ "$array2", [] ] }
                ]
            }
        }
    }
])
output:
{ "_id" : 1, "combinedArray" : [ "a", "b", "b", "e", "e" ] }
{ "_id" : 2, "combinedArray" : [ "a", "b", "b" ] }
{ "_id" : 3, "combinedArray" : [ "e", "e" ] }
I tried to do this with nested $cond; the answer with $ifNull is better, but I'm still posting mine.
db.getCollection('test').aggregate([{
    $project: {
        combinedArray: { $cond: [
            { $and: [ { $isArray: ["$array1"] }, { $isArray: ["$array2"] } ] },
            { '$concatArrays': [ '$array1', '$array2' ] },
            { $cond: [
                { $isArray: ["$array1"] },
                "$array1",
                "$array2"
            ]}
        ]}
    }
}])

How to select only not null values when aggregating with first or last in mongodb?

My data represents a dictionary that receives a bunch of updates and potentially new fields (metadata being added to a post). So something like:
> db.collection.find()
{ _id: ..., 'A': 'apple', 'B': 'banana' },
{ _id: ..., 'A': 'artichoke' },
{ _id: ..., 'B': 'blueberry' },
{ _id: ..., 'C': 'cranberry' }
The challenge - I want to find the first (or last) value for each key ignoring blank values (i.e. I want some kind of conditional group by that works at a field not document level). (Equivalent to the starting or ending version of the metadata after updates).
The problem is that:
db.collection.aggregate([
    { $group: {
        _id: null,
        A: { $last: '$A' },
        B: { $last: '$B' },
        C: { $last: '$C' }
    }}
])
fills in the blanks with nulls (rather than skipping them in the result), so I get:
{ '_id': ..., 'A': null, 'B': null, 'C': 'cranberry' }
when I want:
{ '_id': ..., 'A': 'artichoke', 'B': 'blueberry', 'C': 'cranberry' }
I don't think this is what you really want, but it does solve the problem you are asking. The aggregation framework cannot really do this, as you are asking for "last results" of different columns from different documents. There is really only one way to do this and it is pretty insane:
db.collection.aggregate([
    { "$group": {
        "_id": null,
        "A": { "$push": "$A" },
        "B": { "$push": "$B" },
        "C": { "$push": "$C" }
    }},
    { "$unwind": "$A" },
    { "$group": {
        "_id": null,
        "A": { "$last": "$A" },
        "B": { "$last": "$B" },
        "C": { "$last": "$C" }
    }},
    { "$unwind": "$B" },
    { "$group": {
        "_id": null,
        "A": { "$last": "$A" },
        "B": { "$last": "$B" },
        "C": { "$last": "$C" }
    }},
    { "$unwind": "$C" },
    { "$group": {
        "_id": null,
        "A": { "$last": "$A" },
        "B": { "$last": "$B" },
        "C": { "$last": "$C" }
    }}
])
Essentially you compact down the documents pushing all of the found elements into arrays. Then each array is unwound and the $last element is taken from there. You need to do this for each field in order to get the last element of each array, which was the last match for that field.
Not really good, and certain to blow up the 16MB BSON limit on any meaningful collection.
So what you are really after is looking for a "last seen" value for each field. You could brute force this by iterating the collection and keeping values that are not null. You can even do this on the server like this with mapReduce:
db.collection.mapReduce(
    function () {
        // Emit a single key so the inline output has something to finalize
        if (start == 0)
            emit( 1, "A" );
        start++;
        current = this;
        // Remember the latest value seen for each tracked key
        Object.keys(store).forEach(function(key) {
            if ( current.hasOwnProperty(key) )
                store[key] = current[key];
        });
    },
    function(){},
    {
        "scope": { "start": 0, "store": { "A": null, "B": null, "C": null } },
        "finalize": function(){ return store; },
        "out": { "inline": 1 }
    }
)
That will work as well, but iterating the whole collection is nearly as bad as mashing everything together with aggregate.
What you really want in this case is three queries, ideally run in parallel, to get the discrete last-seen value for each property:
> db.collection.find({ "A": { "$exists": true } }).sort({ "$natural": -1 }).limit(1)
{ "_id" : ObjectId("54b319cd6997a054ce4d71e7"), "A" : "artichoke" }
> db.collection.find({ "B": { "$exists": true } }).sort({ "$natural": -1 }).limit(1)
{ "_id" : ObjectId("54b319cd6997a054ce4d71e8"), "B" : "blueberry" }
> db.collection.find({ "C": { "$exists": true } }).sort({ "$natural": -1 }).limit(1)
{ "_id" : ObjectId("54b319cd6997a054ce4d71e9"), "C" : "cranberry" }
Actually, even better is to create a sparse index on each property and query via $gt and an empty string. This makes sure an index is used, and as a sparse index it will only contain documents where the property is present. You'll need to .hint() this, but you still want $natural ordering for the sort:
db.collection.ensureIndex({ "A": -1 },{ "sparse": 1 })
db.collection.ensureIndex({ "B": -1 },{ "sparse": 1 })
db.collection.ensureIndex({ "C": -1 },{ "sparse": 1 })
> db.collection.find({ "A": { "$gt": "" } }).hint({ "A": -1 }).sort({ "$natural": -1 }).limit(1)
{ "_id" : ObjectId("54b319cd6997a054ce4d71e7"), "A" : "artichoke" }
> db.collection.find({ "B": { "$gt": "" } }).hint({ "B": -1 }).sort({ "$natural": -1 }).limit(1)
{ "_id" : ObjectId("54b319cd6997a054ce4d71e8"), "B" : "blueberry" }
> db.collection.find({ "C": { "$gt": "" } }).hint({ "C": -1 }).sort({ "$natural": -1 }).limit(1)
{ "_id" : ObjectId("54b319cd6997a054ce4d71e9"), "C" : "cranberry" }
That's the best way to solve what you are saying here. But as I said, this is how you think you need to solve it. Your real problem likely has another way to approach both storing and querying.
Starting Mongo 3.6, for those using $first or $last as a way to get one value from grouped records (not necessarily the actual first or last), $group's $mergeObjects can be used as a way to find a non-null value from grouped items:
// { "A" : "apple", "B" : "banana" }
// { "A" : "artichoke" }
// { "B" : "blueberry" }
// { "C" : "cranberry" }
db.collection.aggregate([
    { $group: {
        _id: null,
        A: { $mergeObjects: { a: "$A" } },
        B: { $mergeObjects: { b: "$B" } },
        C: { $mergeObjects: { c: "$C" } }
    }}
])
// { _id: null, A: { a: "artichoke" }, B: { b: "blueberry" }, C: { c: "cranberry" } }
$mergeObjects accumulates an object based on each grouped record. The thing to note is that $mergeObjects gives priority to values that aren't null. But that requires modifying the accumulated field into an object, hence the "awkward" { a: "$A" }.
If the output format isn't exactly what you expect, one can always use an additional $project stage.
So I've just thought about how to answer this, but would be interested to hear people's opinions on how right/wrong this is. Based on the reply from @NeilLunn I guess I'll hit the BSON limit, making his version better for pulling the data, but it's important to my app that I can run this query in one go. (Perhaps my real problem is the data design.)
The problem we have is that in the "group by" we pull in a version of A, B, C for every document. So my solution is to tell the aggregation what fields it should pull in by changing (slightly) the original data structure to tell the engine which keys are in each document:
> db.collection.find()
{ _id: ..., 'A': 'apple', 'B': 'banana', 'Keys': ['A', 'B']},
{ _id: ..., 'A': 'artichoke', 'Keys': ['A']},
{ _id: ..., 'B': 'blueberry', 'Keys': ['B']},
{ _id: ..., 'C': 'cranberry', 'Keys': ['C']}
Now we can $unwind on 'Keys' and then group with 'Keys' as '_id'. Thus:
db.collection.aggregate([
    {'$unwind': '$Keys'},
    {'$group': {
        '_id': '$Keys',
        'A': {'$last': '$A'},
        'B': {'$last': '$B'},
        'C': {'$last': '$C'}
    }}
])
I get back a series of documents with _id equal to the key:
{_id: 'A', 'A': 'artichoke', 'B': null, 'C': null},
{_id: 'B', 'A': null, 'B': 'blueberry', 'C': null},
{_id: 'C', 'A': null, 'B': null, 'C': 'cranberry'}
You can then pull the results you want, knowing that the value for key X is only valid for the result where _id is X.
(Of course the next question is how to reduce this series of documents to one, taking the appropriate field each time)

Dynamic Sticky Sorting in Mongo for a simple value or list

I'm trying to dynamically sticky sort a collection of records with the value that is sticky being different with each query. Let me give an example. Here are some example docs:
{first_name: 'Joe', last_name: 'Blow', offices: ['GA', 'FL']}
{first_name: 'Joe', last_name: 'Johnson', offices: ['FL']}
{first_name: 'Daniel', last_name: 'Aiken', offices: ['TN', 'SC']}
{first_name: 'Daniel', last_name: 'Madison', offices: ['SC', 'GA']}
... a bunch more names ...
Now suppose I want to display the names in alphabetical order by last name but I want to peg all the records with the first name "Joe" at the top.
In SQL this is fairly straightforward:
SELECT * FROM people ORDER BY first_name = 'Joe' DESC, last_name
The ability to put expressions in the sort criteria makes this trivial. Using the aggregation framework I can do this:
[
    {$project: {
        first_name: 1,
        last_name: 1,
        offices: 1,
        sticky: {$cond: [{$eq: ['$first_name', 'Joe']}, 1, 0]}
    }},
    {$sort: {
        sticky: -1,
        last_name: 1
    }}
]
Basically, I create a dynamic field with the aggregation framework that is 1 if the name is Joe and 0 if it is not, then sort on it in reverse order. When building my aggregation pipeline I can easily change 'Joe' to 'Daniel', and now 'Daniel' will be pegged to the top. This is partially what I mean by dynamic sticky sorting: the value I am sticky sorting by changes query by query.
Now this works great for a basic value like a string. The problem comes when I try to do the same thing for a field that holds an array. Say I want to peg all users in 'FL' offices. With Mongo's native understanding of arrays, I would think I can do the same thing. So:
[
    {$project: {
        first_name: 1,
        last_name: 1,
        offices: 1,
        sticky: {$cond: [{$eq: ['$offices', 'FL']}, 1, 0]}
    }},
    {$sort: {
        sticky: -1,
        last_name: 1
    }}
]
But this doesn't work at all. I did figure out that if I changed it to the following it would put Joe Johnson (who is only in the FL office) at the top:
[
    {$project: {
        first_name: 1,
        last_name: 1,
        offices: 1,
        sticky: {$cond: [{$eq: ['$offices', ['FL']]}, 1, 0]}
    }},
    {$sort: {
        sticky: -1,
        last_name: 1
    }}
]
But it didn't put Joe Blow at the top (who is in FL and GA). I believe it is doing a simple match. So my first attempt doesn't work at all, since $eq returns false when comparing an array to a string. The second attempt works for Joe Johnson because we are comparing two identical arrays, but Joe Blow doesn't match since ['GA', 'FL'] != ['FL']. Also, if I want to peg both FL and SC at the top, I can't give it the value ['FL', 'SC'] to compare against.
Next I tried using a combination of $setUnion and $size.
[
    {$project: {
        first_name: 1,
        last_name: 1,
        offices: 1,
        sticky: {$size: {$setUnion: ['$offices', ['FL', 'SC']]}}
    }},
    {$sort: {
        sticky: -1,
        last_name: 1
    }}
]
I've tried using various combinations of $let and $literal but it always complains about me trying to pass a literal array into $setUnion's arguments. Specifically it says:
disallowed field type Array in object expression
Is there any way to do this?
Cannot reproduce your error but you have a few "typos" in your question so I cannot be sure what you actually have.
But presuming you actually are working with MongoDB 2.6 or above then you probably want the $setIntersection or $setIsSubset operators rather than $setUnion. Those operators imply "matching" contents of the array they are compared to, where $setUnion just combines the supplied array with the existing one:
db.people.aggregate([
    { "$project": {
        "first_name": 1,
        "last_name": 1,
        "sticky": {
            "$size": {
                "$setIntersection": [ "$offices", [ "FL", "SC" ] ]
            }
        },
        "offices": 1
    }},
    { "$sort": {
        "sticky": -1,
        "last_name": 1
    }}
])
In prior versions where you do not have those set operators you are just using $unwind to work with the array, and the same sort of $cond operation as before within a $group to bring it all back together:
db.people.aggregate([
    { "$unwind": "$offices" },
    { "$group": {
        "_id": "$_id",
        "first_name": { "$first": "$first_name" },
        "last_name": { "$first": "$last_name" },
        "sticky": { "$sum": { "$cond": [
            { "$or": [
                { "$eq": [ "$offices", "FL" ] },
                { "$eq": [ "$offices", "SC" ] }
            ]},
            1,
            0
        ]}},
        "offices": { "$push": "$offices" }
    }},
    { "$sort": {
        "sticky": -1,
        "last_name": 1
    }}
])
But you were certainly on the right track. Just choose the right set operation or other method in order to get your precise need.
Or since you have posted your way of getting what you want, a better way to write that kind of "ordered matching" is this:
db.people.aggregate([
    { "$project": {
        "first_name": 1,
        "last_name": 1,
        "sticky": { "$cond": [
            { "$anyElementTrue": {
                "$map": {
                    "input": "$offices",
                    "as": "o",
                    "in": { "$eq": [ "$$o", "FL" ] }
                }
            }},
            2,
            { "$cond": [
                { "$anyElementTrue": {
                    "$map": {
                        "input": "$offices",
                        "as": "o",
                        "in": { "$eq": [ "$$o", "SC" ] }
                    }
                }},
                1,
                0
            ]}
        ]},
        "offices": 1
    }},
    { "$sort": {
        "sticky": -1,
        "last_name": 1
    }}
])
That gives priority to documents with "offices" containing "FL", then "SC", and then all others, doing the operation within a single field. It should also be easy to see how to abstract this into the $unwind form for earlier versions without the set operators: you simply give a higher "weight" value to the items you want at the top by nesting the $cond statements.
I think I figured out the best way to do it.
[
    {$project: {
        first_name: 1,
        last_name: 1,
        offices: 1,
        sticky_0: {
            $cond: [{
                $anyElementTrue: {
                    $map: {
                        input: '$offices',
                        as: 'j',
                        in: {$eq: ['$$j', 'FL']}
                    }
                }
            }, 0, 1]
        },
        sticky_1: {
            $cond: [{
                $anyElementTrue: {
                    $map: {
                        input: '$offices',
                        as: 'j',
                        in: {$eq: ['$$j', 'SC']}
                    }
                }
            }, 0, 1]
        }
    }},
    {$sort: {
        sticky_0: 1,
        sticky_1: 1,
        last_name: 1
    }}
]
Basically, when building my pipeline I iterate through each item I want to make sticky, and for each one I create its own virtual field that checks just that one value, using a combination of $cond, $anyElementTrue and $map. It's a little convoluted, but it works. I'd love to hear if there is something simpler.