MongoDB projections and fields subset

I would like to use mongo projections in order to return less data to my application. I would like to know if it's possible.
Example:
user: {
  id: 123,
  some_list: [{x:1, y:2}, {x:3, y:4}],
  other_list: [{x:5, y:2}, {x:3, y:4}]
}
Given a query for user_id = 123 and some 'projection filter' like user.some_list.x = 1 and user.other_list.x = 1 is it possible to achieve the given result?
user: {
  id: 123,
  some_list: [{x:1, y:2}],
  other_list: []
}
The idea is to make Mongo work a little more and return less data to the application. In some cases, we are discarding 80% of the elements of the collections on the application's side, so it would be better not to return them at all.
Questions:
Is it possible?
How can I achieve this? $elemMatch doesn't seem to help me. I'm trying something with $unwind, but not getting there.
If it's possible, can this projection filtering benefit from an index on user.some_list.x, for example? Or not at all, once the user was already found by its id?
Thank you.

What you can do in MongoDB v3.0 is this:
db.collection.aggregate({
    $match: {
        "user.id": 123
    }
}, {
    $redact: {
        $cond: {
            if: {
                $or: [ // these are the conditions for when to include a (sub-)document
                    "$user",       // if it contains a "user" field (as is the case on the top level)
                    "$some_list",  // if it contains a "some_list" field (would be the case for the "user" sub-document)
                    "$other_list", // the same here for the "other_list" field
                    { $eq: [ "$x", 1 ] } // and lastly, when we're looking at the innermost sub-documents, we only want to include items where "x" is equal to 1
                ]
            },
            then: "$$DESCEND", // descend into sub-document
            else: "$$PRUNE"    // drop sub-document
        }
    }
})
Depending on your data setup, you could also simplify this query a little by saying: include everything that either does not have an "x" field or, if it is present, has it equal to 1, like so:
$redact: {
    $cond: {
        if: {
            $eq: [ { "$ifNull": [ "$x", 1 ] }, 1 ] // we only want to include items where "x" is equal to 1 or where "x" does not exist
        },
        then: "$$DESCEND", // descend into sub-document
        else: "$$PRUNE"    // drop sub-document
    }
}
The index you suggested won't do anything for the $redact stage. You can benefit from it, however, if you change the $match stage at the start to get rid of all documents which don't match anyway like so:
$match: {
"user.id": 123,
"user.some_list.x": 1 // this will use your index
}

Very possible.
With findOne, the query is the first argument and the projection is the second. In Node/JavaScript (the mongo shell syntax is similar):
db.collection('users').findOne({
    id: 123
}, {
    other_list: 0
})
This will return the whole object without the other_list field. OR you could specify { some_list: 1 } as the projection, and then ONLY the _id and some_list will be returned.
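As a small sketch of that inclusive variant (note that in the mongo shell the projection is the plain second argument, while recent Node.js drivers expect it under a projection key in the options object; adjust to whichever driver version you are on):
// mongo shell: return only _id and some_list
db.users.findOne({ id: 123 }, { some_list: 1 })
// recent Node.js drivers: projection goes inside the options object
db.collection('users').findOne({ id: 123 }, { projection: { some_list: 1 } })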

$filter is your friend here. Below produces the output you seek. Experiment with changing the $eq fields and target values to see more or less items in the array get picked up. Note how we $project the new fields (some_list and other_list) "on top of" the old ones, essentially replacing them with the filtered versions.
db.foo.aggregate([
    {$match: {"user.id": 123}},
    {$project: {
        "user.some_list": { $filter: {
            input: "$user.some_list",
            as: "z",
            cond: {$eq: [ "$$z.x", 1 ]}
        }},
        "user.other_list": { $filter: {
            input: "$user.other_list",
            as: "z",
            cond: {$eq: [ "$$z.x", 1 ]}
        }}
    }}
]);
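One caveat: because this $project runs in inclusion mode, only _id and the two projected paths come back, so user.id from the desired output would be dropped. If you need it, include it explicitly, for example:
{$project: {
    "user.id": 1,
    "user.some_list": { $filter: { input: "$user.some_list", as: "z", cond: {$eq: [ "$$z.x", 1 ]} }},
    "user.other_list": { $filter: { input: "$user.other_list", as: "z", cond: {$eq: [ "$$z.x", 1 ]} }}
}}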

Related

Find in nested array with compare on last field

I have a collection with documents like this one:
{
f1: {
firstArray: [
{
secondArray: [{status: "foo1"}, {status: "foo2"}, {status: "foo3"}]
}
]
}
}
My expected result includes documents that have at least one item in firstArray whose last object's status in secondArray is included in an input array of values (e.g. ["foo3"]).
I would prefer not to use aggregate.
I tried:
{
"f1.firstArray": {
$elemMatch: {
"secondArray.status": {
$in: ["foo3"],
},
otherField: "bar",
},
},
}
You can use an aggregation pipeline with $match and $filter to keep only documents whose count of matching last items is greater than zero:
db.collection.aggregate([
  {$match: {
    $expr: {
      $gt: [
        {$size: {
          $filter: {
            input: "$f1.firstArray",
            cond: {$in: [{$last: "$$this.secondArray.status"}, ["foo3"]]}
          }
        }},
        0
      ]
    }
  }}
])
See how it works on the playground example
If you know that secondArray always has 3 items you can do:
db.collection.find({
"f1.firstArray": {
$elemMatch: {
"secondArray.2.status": {
$in: ["foo3"]
}
}
}
})
But otherwise I don't think you can check only the last item without an aggregation. The idea is that a regular find only lets you write a query that does not use values that are specific to each document. If the size of the array can differ between documents, or even between items of the same document, you need to use an aggregation pipeline.
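One small caveat on that last point: since MongoDB 3.6, $expr is also accepted by a regular find(), so if the constraint is only about not calling aggregate(), the same condition should work as a plain query too (a sketch; the $last array operator used here still needs MongoDB 4.4 or newer):
db.collection.find({
  $expr: {
    $gt: [
      {$size: {
        $filter: {
          input: "$f1.firstArray",
          cond: {$in: [{$last: "$$this.secondArray.status"}, ["foo3"]]}
        }
      }},
      0
    ]
  }
})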

Query on all nested docs inside a nested doc in MongoDB

I have a collection with documents that look like this:
{ keyA1: "stringVal",
keyA2: "stringVal",
keyA3: { keyB1: { feild1: intVal,
feild2: intVal}
keyB2: { feild1: intVal,
feild2: intVal}
}
}
Currently the [keyB1, keyB2, ...] set is 7 keys, the same for all documents in the collection. I want to query the intVals on specific fields for all keyB's. So, for example, I might want to find all documents where field2 has a value greater than 100, regardless of which keyB it falls in.
For any one specific keyB, I simply use the dot notation: {"keyA3.keyB2.field2": {$gte: 100}}. Right now, I have the option of looping over all keyB's, but this may not be the case in the future when more keyB values can be added. I don't want to have to modify the code then, and would like to avoid hardcoding those values in any way. I also need the solution to be fairly fast, as the final deployment is expected to have over 20M documents.
How can I write a query that can "skip" the keyB field in the dot notation and just go through all the embedded docs?
FWIW, I'm implementing this in python using pymongo. Thanks.
First, convert the keyA3 object to an array and add it as a new field with $addFields.
Then filter the new array to keep elements whose feild2 value is greater than 100.
Then match documents where the size of the filtered array is greater than 0, and finally remove the extra fields we added.
db.collection.aggregate([
  {
    "$addFields": {
      "arr": {
        "$objectToArray": "$keyA3"
      }
    }
  },
  {
    "$addFields": {
      "matchArrSize": {
        $size: {
          "$filter": {
            "input": "$arr",
            "as": "z",
            "cond": {
              $gt: ["$$z.v.feild2", 100]
            }
          }
        }
      }
    }
  },
  {
    $match: {
      matchArrSize: { $gt: 0 }
    }
  },
  {
    $unset: ["arr", "matchArrSize"]
  }
])
https://mongoplayground.net/p/VumwL9y7Km1
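If you would rather not add and remove helper fields, the same check can in principle be collapsed into a single $match with $expr (a sketch along the lines of the pipeline above, not tested against your data):
db.collection.aggregate([
  {$match: {
    $expr: {
      $gt: [
        {$size: {
          $filter: {
            input: {$objectToArray: "$keyA3"},
            as: "z",
            cond: {$gt: ["$$z.v.feild2", 100]}
          }
        }},
        0
      ]
    }
  }}
])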

Looking for proper way to prioritize certain documents in Mongodb query

I was looking all over the place and I couldn't find a proper source for the problem I need to solve.
Given record data, I need to prioritize some documents over others when I query all of them.
For example: let's say I'm doing this search
db.users.find().limit(10)
and my documents have data with id = 1, 2, 3, ..., 50;
how can I prioritize id=12 or id=49 so they come back first?
What I would want to get back:
array({id=12}, {id=49} ... fill the rest until pager limit)
I tried using $or like this:
{
"$or": [
{'_id': {'$in': [id=12,id=49]}},
{}
]
}
But I don't think this is the proper way of doing it, and it's not working.
Any help would be greatly appreciated.
You can use the aggregate() method:
$addFields to add a new field hasId for sorting purposes: if the document's _id is in your input ids it returns 1, otherwise the field is removed
$sort by hasId in descending order
$limit the documents
db.collection.aggregate([
{
$addFields: {
hasId: {
$cond: [
{ $in: ["$_id", [8, 5]] },
1,
"$$REMOVE"
]
}
}
},
{ $sort: { hasId: -1 } },
{ $limit: 5 }
])
Playground
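Adapted to the ids and page size from the question (12, 49, limit 10), and assuming the id values are stored in _id, this would look roughly as follows; the non-prioritized documents still fill the rest of the page after the prioritized ones:
db.users.aggregate([
  {
    $addFields: {
      hasId: {
        $cond: [
          { $in: ["$_id", [12, 49]] },
          1,
          "$$REMOVE"
        ]
      }
    }
  },
  // secondary sort key keeps the remaining documents in a stable order
  { $sort: { hasId: -1, _id: 1 } },
  { $limit: 10 }
])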

MongoDB map filtered array inside another array with aggregate $project

I am using Azure Cosmos DB's API for MongoDB with Pymongo. My goal is to filter an array inside another array and return only the filtered results. The aggregation query works for the outer array, but returns the full inner array after using the $map and $filter operations. Please find a reproducible example in Mongo Playground: https://mongoplayground.net/p/zS8A7zDMrmK
The current query uses $project to filter and return the result by the selected options, but it still returns every object in Discount_Price although the query has an additional filter to check for a specific Sales_Week value.
Let me know in comments if my question is clear, many thanks for all possible help and suggestions.
It seems you are having trouble filtering the nested array.
options: {
  $filter: {
    input: {
      $map: {
        input: "$Sales_Options",
        as: "s",
        in: {
          City: "$$s.City",
          Country: "$$s.Country",
          Discount_Price: {
            $filter: {
              input: "$$s.Discount_Price",
              as: "d",
              cond: {
                $in: ["$$d.Sales_Week", [2, 7]]
              }
            }
          }
        }
      }
    },
    as: "pair",
    cond: {
      $and: [
        { $in: ["$$pair.Country", ["UK"]] },
        { $in: ["$$pair.City", ["London"]] }
      ]
    }
  }
}
Working Mongo playground. If you need price1, you can use $project in the next stage.
Note: if you carry the projection on from the upper stage, using 1 or 0 is good practice.
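For instance, a follow-up stage might look like this sketch (price1 is only the field name mentioned above; whether it is still present at this point depends on what the earlier stages projected):
// 1 keeps a field, 0 would exclude it
{ $project: { options: 1, price1: 1 } }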
I'd steer you towards the $unwind operator and everything becomes a lot simpler:
db.collection.aggregate([
{$match: {"Store": "AB"}},
{$unwind: "$Sales_Options"},
{$unwind: "$Sales_Options.Discount_Price"},
{$match: {"Sales_Options.Country": {$in: [ "UK" ]},
"Sales_Options.City": {$in: [ "London" ]},
"Sales_Options.Discount_Price.Sales_Week": {$in: [ 2, 7 ]}
}
}
])
Now just $project the fields as appropriate for your output.
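For example, a trailing stage along these lines (field names taken from the pipeline above; after the two $unwind stages each document carries exactly one Discount_Price entry, so reshape as needed):
{$project: {
    _id: 0,
    Store: 1,
    City: "$Sales_Options.City",
    Country: "$Sales_Options.Country",
    Discount_Price: "$Sales_Options.Discount_Price"
}}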

MongoDB select from array based on multiple conditions [duplicate]

This question already has answers here:
Retrieve only the queried element in an object array in MongoDB collection
(18 answers)
Closed 3 years ago.
I have the following structure (this can't be changed, it is what I have to work with):
{
  "_id" : ObjectId("abc123"),
  "notreallyusedfields" : "dontcare",
  "data" : [
    {
      "value" : "value1",
      "otherSometimesInterestingFields": 1,
      "type" : ObjectId("asd123=type1"),
    },
    {
      "value" : "value2",
      "otherSometimesInterestingFields": 1,
      "type" : ObjectId("asd1234=type2"),
    },
    ... many others
  ]
}
So basically the fields for a schema are inside an array and they can be identified based on the type field inside one array element (1 schema field and its value is 1 element in the array). For me this is pretty strange, but I'm new to NoSQL so this may be OK. (Also, for different data some fields may be missing, and the element order in the data array is not guaranteed.)
Maybe it's easier to understand like this:
Table a: type1 column | type2 column | and so on (and these are stored in the array like the above)
My question is: how can you select multiple fields with conditions? What I mean is (in SQL): SELECT * FROM table WHERE type1=value1 AND type2=value2
I can select 1 like this:
db.a.find( {"data.type":ObjectId("asd1234=type2"), "data.value":value2}).pretty()
But I don't know how I could include that type1=value1, or even more conditions. And I think this is not even good, because it can match any data.value field inside the array, so it doesn't guarantee that where the type is type2 the value is value2.
How would you solve this?
I was thinking of doing one query for one condition and then doing another based on the result. So something like the pipeline for aggregation, but as I see it, $match can't be used multiple times in an aggregation. I guess there is a way to pipeline these commands, but this is pretty ugly.
What am I missing? Or, because of the structure of the data, do I have to do these strange multiple queries?
I've also looked at $filter but the condition there also applies to any element of the array. (Or I'm doing it wrong)
Thanks!
Sorry if I'm not clear enough! Please ask and I can elaborate.
(Basically what I'm trying to do based on this structure ( Retrieve only the queried element in an object array in MongoDB collection ) is this: if shape is square then filter for blue colour, if shape is round then filter for red colour === if type is type1 value has to be value1, if type is type2 value has to be value2)
This can be done like:
db.document.find( { $and: [
{ type:ObjectId('abc') },
{ data: { $elemMatch: { type: a, value: DBRef(...)}}},
{ data: { $elemMatch: { type: b, value: "string"}}}
] } ).pretty()
So you can add any number of "conditions" using $and so you can specify that an element has to have type a and a value b, and another element type b and value c...
If you want to project only the matching elements then use aggregate with filter:
db.document.aggregate([
{$match: { $and: [
{ type:ObjectId('a') },
{ data: { $elemMatch: { Type: ObjectId("b"), value: DBRef(...)}}},
{ data: { $elemMatch: { Type: ObjectId("c"), value: "abc"}}}
] }
},
{$project: {
metadata: {
$filter: {
input: "$data",
as: "data",
cond: { $or: [
{$eq: [ "$$data.Type", ObjectId("a") ] },
{$eq: [ "$$data.Type", ObjectId("b") ] }]}
}
}
}
}
]).pretty()
This is pretty ugly so if there is a better way please post it! Thanks!
If you need to retrieve documents that have array elements matching multiple conditions, you have to use the $elemMatch query operator.
db.collection.find({
data: {
$elemMatch: {
type: "type1",
value: "value1"
}
}
})
This will output whole documents where an element matches.
To output only first matching element in array, you can combine it with $elemMatch projection operator.
db.collection.find({
data: {
$elemMatch: {
type: "type1",
value: "value1"
}
}
},
{
data: {
$elemMatch: {
type: "type1",
value: "value1"
}
}
})
Warning: don't forget to project all the other fields you need outside the data array.
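For instance, to also keep a field such as notreallyusedfields from the sample document, the projection could look like this sketch:
db.collection.find({
  data: { $elemMatch: { type: "type1", value: "value1" } }
},
{
  notreallyusedfields: 1,
  data: { $elemMatch: { type: "type1", value: "value1" } }
})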
And if you need to output all matching elements of the array, then you have to use $filter in an aggregation $project stage, like this:
db.collection.aggregate([
  {
    $project: {
      data: {
        $filter: {
          input: "$data",
          as: "data",
          cond: {
            $and: [
              { $eq: ["$$data.type", "type1"] },
              { $eq: ["$$data.value", "value1"] }
            ]
          }
        }
      }
    }
  }
])
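Against a document shaped like the one in the question, but where the matching element's type is literally the string "type1" (the ObjectId values in the sample are placeholders here), the result would look roughly like this; note that only _id and the filtered data array survive the inclusion projection, so add any other fields you need to the $project:
{
  "_id" : ObjectId("abc123"),
  "data" : [
    { "value" : "value1", "otherSometimesInterestingFields" : 1, "type" : "type1" }
  ]
}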