Find a value in multiple nested fields with the same name in MongoDB - mongodb

In some documents I have a property with a complex structure like:
{
  content: {
    foo: {
      id: 1,
      name: 'First',
      active: true
    },
    bar: {
      id: 2,
      name: 'Second',
      active: false
    },
    baz: {
      id: 3,
      name: 'Third',
      active: true
    }
  }
}
I'm trying to make a query that can find all documents with a given value in the field name across the different second-level objects foo, bar, and baz.
I guess that a solution could be:
db.getCollection('mycollection').find({ $or: [
{'content.foo.name': 'First'},
{'content.bar.name': 'First'},
{'content.baz.name': 'First'}
]})
But I want to do it dynamically, with no need to specify the key names of the nested fields, nor to repeat the value to find on every line.
If some regexp on field names were available, a solution could be:
db.getCollection('mycollection').find({'content.*.name': 'First'}) // Match
db.getCollection('mycollection').find({'content.*.name': 'Third'}) // Match
db.getCollection('mycollection').find({'content.*.name': 'Fourth'}) // Doesn't match
Is there any way to do it?

I would say this is a bad schema if you don't know your keys in advance. Personally, I'd recommend changing this to an array structure.
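For example, the same data could be stored as an array of sub-documents (a rough sketch based on the sample document above; the extra key field is my own illustrative addition):
// Array-based alternative: each nested object becomes an array element
{
  content: [
    { key: 'foo', id: 1, name: 'First', active: true },
    { key: 'bar', id: 2, name: 'Second', active: false },
    { key: 'baz', id: 3, name: 'Third', active: true }
  ]
}
// which allows a direct (and indexable) query without knowing the keys:
db.getCollection('mycollection').find({ 'content.name': 'First' })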
Regardless, what you can do is use the aggregation $objectToArray operator and then query that newly created array. Mind you, this approach requires a collection scan each time you execute the query.
db.collection.aggregate([
  {
    $addFields: {
      arr: {
        "$objectToArray": "$content"
      }
    }
  },
  {
    $match: {
      "arr.v.name": "First"
    }
  },
  {
    $project: {
      arr: 0
    }
  }
])
Another hacky approach you could take is creating a wildcard text index and then executing a $text query to search for the name. Obviously this comes with the usual text index/query limitations and might not be right for your use case.
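A minimal sketch of that approach, assuming the collection name from the question (note that a wildcard text index covers every string field in the document, not only those under content):
// Wildcard text index over all string fields
db.getCollection('mycollection').createIndex({ "$**": "text" })
// Search for the name value with a $text query
db.getCollection('mycollection').find({ $text: { $search: "First" } })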

Related

MongoDB paginate 2 collections together on common field

I have two Mongo collections - File and Folder.
Both have some common fields like name, createdAt, etc. I have a resources API that returns a response with items from both collections, with a type property added; type can be file or folder.
I want to support pagination and sorting in this list, for example sort by createdAt. Is it possible with aggregation, and how?
Moving them to a container collection is not a preferred option, as I would then have to maintain the container collection on each create/update/delete on either of the collections.
I'm using Mongoose too, in case it has a utility function for this, or a plugin.
In this case, you can use $unionWith. Something like:
Folder.aggregate([
  { $project: { name: 1, createdAt: 1 } },
  {
    $unionWith: {
      coll: "files", pipeline: [ { $project: { name: 1, createdAt: 1 } } ]
    }
  },
  ... // your sorting goes here
])
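For instance, a fuller sketch with a type field and pagination stages added (the $literal values, sort key, and skip/limit numbers are illustrative, not part of the original answer):
Folder.aggregate([
  { $project: { name: 1, createdAt: 1, type: { $literal: "folder" } } },
  {
    $unionWith: {
      coll: "files",
      pipeline: [ { $project: { name: 1, createdAt: 1, type: { $literal: "file" } } } ]
    }
  },
  { $sort: { createdAt: -1 } },  // sort the combined result set
  { $skip: 0 },                  // (page - 1) * pageSize
  { $limit: 20 }                 // pageSize
])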

MongoDB - Query nested objects in nested array with array of strings filter

So basically I need to filter my data with my own filter, which is an array of strings, but the problem is that the exact field is inside a nested object in an array in the DB. So, part of my schema looks like this:
members: [
  {
    _id: { type: Schema.Types.ObjectId, ref: "Users" },
    profilePicture: { type: String, required: true },
    profile: {
      firstName: { type: String },
      lastName: { type: String },
      occupation: { type: String },
      gender: { type: String }
    }
  }
]
and my filter looks like this
gender: ["male","female"]
The expected result with this filter is to get a team which has both male and female users; if it has only male, or only female, it should not give me that team. But everything I've tried was giving me every team that included males or females, even though some had only male members.
What I've tried:
db.teams.find({ members: { $elemMatch: { "profile.gender": { $in: gender } } } })
This works only when there is one gender specified in the filter, and well, I know it can't work for what I am trying to achieve, but I don't know how to achieve it. Any help will be appreciated.
Edit: I've tried to do it this way:
db.teams.find({
  $and: [
    { members: { $elemMatch: { "profile.gender": gender[0] } } },
    { members: { $elemMatch: { "profile.gender": gender[1] } } }
  ]
})
and this gives me a result only when both filters are specified; however, if there is only one filter (either "male" or "female"), it gives me nothing.
Use the $and operator instead of $in.
db.teams.find({ members: { $elemMatch: { $and: [{ 'profile.gender': 'male' }, { 'profile.gender': 'female' }] } } })
This query works no matter how many elements you want to compare
db.teams.find({$and: [{'members.profile.gender': 'male'}, {'members.profile.gender': 'female'}]})
You need to dynamically generate the query before passing it to find, if you want to cover more than one case.
You can do this with the $all operator, which finds docs where an array field contains all specified elements:
var gender = ['male', 'female'];
db.teams.find({'members.profile.gender': {$all: gender}});
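If you prefer the $and form shown above, a sketch of generating it dynamically from the filter array (variable names are illustrative):
// Build one 'members.profile.gender' condition per requested gender
var gender = ['male', 'female'];
var query = { $and: gender.map(g => ({ 'members.profile.gender': g })) };
db.teams.find(query);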

Create MongoDB fields with names based on sub-document values using aggregation pipeline?

Given a MongoDB collection with the following document structure:
{
  array_of_subdocs:
  [
    {
      animal: "cat",
      count: 10
    },
    {
      animal: "dog",
      count: 20
    },
    ...
  ]
}
where each document contains an array of sub-documents, I want to transform the collection into documents of the structure:
{
cat: { count: 10 },
dog: { count: 20 },
...
}
where each sub-document is now the value of a new field in the main document named after one of the values within the sub-document (in the example, the values of the animal field is used to create the name of the new fields, i.e. cat and dog).
I know how to do this with eval and a JavaScript snippet. It's slow. My question is: how can this be done using the aggregation pipeline?
According to this feature request and its solution, a new feature will be added for this functionality: an operator called $arrayToObject.
db.foo.aggregate([
  {$project: {
    array_of_subdocs: {
      $arrayToObject: {
        $map: {
          input: "$array_of_subdocs",
          as: "pair",
          // wrap the count in a sub-document so the output matches the desired { cat: { count: 10 } } shape
          in: ["$$pair.animal", { count: "$$pair.count" }]
        }
      }
    }
  }}
])
But at the moment, there is no solution. I suggest you change your data structure. As far as I can see, there are many feature requests labeled as Major that have not been done for years.

Pull and addtoset at the same time with mongo

I have a collection which elements can be simplified to this:
{tags : [1, 5, 8]}
where there would be at least one element in array and all of them should be different. I want to substitute one tag for another and I thought that there would not be a problem. So I came up with the following query:
db.colll.update({
  tags: 1
}, {
  $pull: { tags: 1 },
  $addToSet: { tags: 2 }
}, {
  multi: true
})
Cool, so it will find all elements which has a tag that I do not need (1), remove it and add another (2) if it is not there already. The problem is that I get an error:
"Cannot update 'tags' and 'tags' at the same time"
Which basically means that I cannot do $pull and $addToSet at the same time. Is there any other way I can do this?
Of course I can memorize all the IDs of the elements and then remove tag and add in separate queries, but this does not sound nice.
The error means pretty much what it says: you cannot act on two things on the same "path" in the same update operation. The two operators you are using do not process sequentially as you might think they do.
You can do this as "sequentially" as you can possibly get with the "bulk" operations API or another form of "bulk" update, though. Within reason of course, and also in reverse:
var bulk = db.coll.initializeOrderedBulkOp();
bulk.find({ "tags": 1 }).updateOne({ "$addToSet": { "tags": 2 } });
bulk.find({ "tags": 1 }).updateOne({ "$pull": { "tags": 1 } });
bulk.execute();
Not a guarantee that nothing else will try to modify the document in between, but it is as close as you will currently get.
Also see the raw "update" command with multiple documents.
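For reference, a sketch of that raw "update" command with two ordered statements (assuming the same collection name as the bulk example above):
// Two ordered statements: add the new tag first, then pull the old one
db.runCommand({
  update: "coll",
  updates: [
    { q: { tags: 1 }, u: { $addToSet: { tags: 2 } }, multi: true },
    { q: { tags: 1 }, u: { $pull: { tags: 1 } }, multi: true }
  ],
  ordered: true
})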
If you're removing and adding at the same time, you may be modeling a 'map', instead of a 'set'. If so, an object may be less work than an array.
Instead of data as an array:
{ _id: 'myobjectwithdata',
data: [{ id: 'data1', important: 'stuff'},
{ id: 'data2', important: 'more'}]
}
Use data as an object:
{ _id: 'myobjectwithdata',
data: { data1: { important: 'stuff'},
data2: { important: 'more'} }
}
The one-command update is then:
db.coll.update(
  { _id: 'myobjectwithdata' },
  { $set: { 'data.data1': { important: 'treasure' } } }
);
The hard brain work for this answer was done here and here.
Starting in Mongo 4.4, the $function aggregation operator allows applying a custom javascript function to implement behaviour not supported by the MongoDB Query Language.
Coupled with the improvements made to db.collection.update() in Mongo 4.2, which can accept an aggregation pipeline (allowing a field to be updated based on its own value),
we can manipulate and update an array in ways the language doesn't otherwise easily permit:
// { "tags" : [ 1, 5, 8 ] }
db.collection.updateMany(
  { tags: 1 },
  [{ $set:
    { "tags":
      { $function: {
        body: function(tags) { tags.push(2); return tags.filter(x => x != 1); },
        args: ["$tags"],
        lang: "js"
      }}
    }
  }]
)
// { "tags" : [ 5, 8, 2 ] }
$function takes 3 parameters:
body, which is the function to apply, whose parameter is the array to modify. The function here simply consists of pushing 2 to the array and filtering out 1.
args, which contains the fields from the record that the body function takes as parameters. In our case, "$tags".
lang, which is the language in which the body function is written. Only js is currently available.
In case you need to replace one value in an array with another, check this answer:
Replace array value using arrayFilters

Storing null vs not storing the key at all in MongoDB

It seems to me that when you are creating a Mongo document and have a field {key: value} which is sometimes not going to have a value, you have two options:
Write {key: null} i.e. write null value in the field
Don't store the key in that document at all
Both options are easily queryable: for one you query for {key: null}, and for the other you query for {key: {$exists: false}}.
I can't really think of any differences between the two options that would have any impact in an application scenario (except that option 2 has slightly less storage).
Can anyone tell me if there are any reasons one would prefer either of the two approaches over the other, and why?
EDIT
After asking the question, it also occurred to me that indexes may behave differently in the two cases, i.e. a sparse index can be created for option 2.
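For instance, a sparse index for option 2 might look like this (a minimal sketch; collection and field names are placeholders):
// Sparse index: documents that don't contain "key" are simply left out of the index
db.collection.createIndex({ key: 1 }, { sparse: true })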
Indeed, you also have a third possibility:
key: "" (empty value)
And you are forgetting a particular behaviour of the null value.
A query on key: null will retrieve all documents where key is null or where key doesn't exist, whereas a query on $exists: false will retrieve only documents where the field key doesn't exist.
To go back to your exact question, it depends on your queries and what the data represents.
If you need to keep track that, for example, a user set a value and then unset it, you should keep the field as null or empty. If you don't need that, you may remove the field.
Note that, since MongoDB doesn't use field name dictionary compression, field: null consumes disk space and RAM, while storing no key at all consumes no resources.
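A quick way to see the storage difference in the mongo shell (the sample documents are illustrative):
// Compare the BSON size of a document with an explicit null field vs. no field at all
Object.bsonsize({ _id: 1, key: null })  // slightly larger: stores the field name plus a null marker
Object.bsonsize({ _id: 1 })             // smaller: the field simply isn't there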
It really comes down to:
Your scenario
Your querying manner
Your index needs
Your language
I personally have chosen to store null keys. It makes it much easier to integrate into my app. I use PHP with Active Record, and using null values makes my life a lot easier since I don't have to put the stress of field dependency upon the app. Also, I do not need any complex code to deal with magics for setting non-existent variables.
I personally would not store an empty value like "", since if you're not careful you could end up with two empty values, null and "", and then you'll have a haphazard time querying for either specifically. So I personally prefer null for empty values.
As for space and indexes: it depends on how many rows might not have this column, but I doubt you will really notice the index size increase due to a few extra docs with null in them. I mean, the difference in storage is minute, especially if the corresponding key name is small as well. That goes for large setups too.
I am quite frankly unsure of the index usage between $exists and null; however, null could be a more standardised way to query for existence, since remember that MongoDB is schemaless, which means there is no requirement to have that field in the doc, which again produces two empty states: non-existent and null. So it is better to choose one or the other.
I choose null.
Another point you might want to consider is whether you use OGM tools like Hibernate OGM.
If you are using Java, Hibernate OGM supports the JPA standard. So if you can write a JPQL query, it would theoretically be easy to switch to an alternate NoSQL datastore supported by the OGM tool.
JPA does not define an equivalent for $exists in Mongo. So if you have optional attributes in your collection, you cannot write a proper JPQL query for them. In such a case, if the attribute's value is stored as NULL, it is still possible to write a valid JPQL query like the one below.
SELECT p FROM pppoe p where p.logout IS null;
I think in terms of disk space the difference is negligible. If you need to create an index on this field, then consider a partial index.
An index with { partialFilterExpression: { key: { $exists: true } } } can be much smaller than a normal index.
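For example, such a partial index could be created like this (a sketch; collection and field names are placeholders):
// Only documents where "key" exists are included in the index
db.collection.createIndex(
  { key: 1 },
  { partialFilterExpression: { key: { $exists: true } } }
)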
It should also be noted that queries behave differently; see values like this:
db.collection.insertMany([
  { _id: 1, a: 1 },
  { _id: 2, a: '' },
  { _id: 3, a: undefined },
  { _id: 4, a: null },
  { _id: 5 }
])
db.collection.aggregate([
  {
    $set: {
      type: { $type: "$a" },
      ifNull: { $ifNull: ["$a", true] },
      defined: { $ne: ["$a", undefined] },
      existing: { $ne: [{ $type: "$a" }, "missing"] }
    }
  }
])
{ _id: 1, a: 1, type: double, ifNull: 1, defined: true, existing: true }
{ _id: 2, a: "", type: string, ifNull: "", defined: true, existing: true }
{ _id: 3, a: undefined, type: undefined, ifNull: true, defined: false, existing: true }
{ _id: 4, a: null, type: null, ifNull: true, defined: true, existing: true }
{ _id: 5, type: missing, ifNull: true, defined: false, existing: false }
Or with db.collection.find():
db.collection.find({ a: { $exists: false } })
{ _id: 5 }
db.collection.find({ a: { $exists: true} })
{ _id: 1, a: 1 },
{ _id: 2, a: '' },
{ _id: 3, a: undefined },
{ _id: 4, a: null }
db.collection.find({ a: null })
{ _id: 3, a: undefined },
{ _id: 4, a: null },
{ _id: 5 }
db.collection.find({ a: {$ne: null} })
{ _id: 1, a: 1 },
{ _id: 2, a: '' }
db.collection.find({ a: {$type: "null"} })
{ _id: 4, a: null }