Matching documents with two values in nested array - mongodb

So, I have the following structure in some documents:
{
"Group": [
{
"data": {
"field1": "VALUE1",
"otherfield": "XXXX"
}
},
{
"data": {
"field1": "VALUE2",
"otherfield": "YYYYY"
}
}
]
}
The Group array can contain 0, 1 or 2 elements. What I need to do is match the documents that contain both VALUE1 and VALUE2 for field1. I couldn't find a suitable answer here for this specific case.
I tried using $elemMatch, but it does not return only the documents with both values. That is, it behaves like an OR, not an AND.

You can use the $all array query operator with dot notation for this:
db.test.find({'Group.data.field1': {$all: ['VALUE1', 'VALUE2']}})
The $all operator selects the documents where the value of a field is an array that contains all the specified elements.
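To make the matching rule concrete, here is a minimal sketch in plain JavaScript that models what the $all query with dot notation checks. It does not talk to MongoDB; the helper name hasAllField1Values and the sample documents are made up for illustration.

```javascript
// Plain-JavaScript model of the $all query above (no database involved).
// hasAllField1Values is a hypothetical helper, not a MongoDB API.
function hasAllField1Values(doc, wanted) {
  // Dot notation "Group.data.field1" gathers field1 from every array element.
  const present = new Set((doc.Group || []).map((g) => g.data && g.data.field1));
  // $all requires every wanted value to be present somewhere in that set.
  return wanted.every((v) => present.has(v));
}

const docs = [
  { Group: [{ data: { field1: "VALUE1" } }, { data: { field1: "VALUE2" } }] },
  { Group: [{ data: { field1: "VALUE1" } }] },
  { Group: [] },
];

const matched = docs.filter((d) => hasAllField1Values(d, ["VALUE1", "VALUE2"]));
console.log(matched.length);
```

Only the first sample document passes, because it is the only one where both values appear across the array elements.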

Related

Trying to fetch data from Nested MongoDB Database?

I am a beginner in MongoDB and I am stuck. I am trying to fetch data from a nested array, but it takes a long time because there are around 50K documents, and the results are not accurate either. The schema structure is below:
{
"_id": {
"$oid": "6001df3312ac8b33c9d26b86"
},
"City": "Los Angeles",
"State":"California",
"Details": [
{
"Name": "Shawn",
"age": "55",
"Gender": "Male",
"profession": " A science teacher with STEM",
"inDate": "2021-01-15 23:12:17",
"Cars": [
"BMW","Ford","Opel"
],
"language": "English"
},
{
"Name": "Nicole",
"age": "21",
"Gender": "Female",
"profession": "Law student",
"inDate": "2021-01-16 13:45:00",
"Cars": [
"Opel"
],
"language": "English"
}
],
"date": "2021-01-16"
}
Here I am trying to filter by date and Details.Cars like this:
db.getCollection('news').find({"Details.Cars":"BMW","date":"2021-01-16"})
It also returns the details of other persons who do not have a BMW. I only want to display the details of persons such as Shawn, who have a BMW (or another specific array value) and match the date. Nicole and the rest should not appear, but that is not what happens.
Any help is appreciated. :)
A combination of $match on the top-level fields and $filter on the array elements will do what you seek.
db.foo.aggregate([
{$match: {"date":"2021-01-16"}}
,{$addFields: {"Details": {$filter: {
input: "$Details",
as: "zz",
cond: { $in: ['BMW','$$zz.Cars'] }
}}
}}
,{$match: {$expr: { $gt:[{$size:"$Details"},0] } }}
]);
Notes:
$unwind is overly expensive for what is needed here and it likely means "reassembling" the data shape later.
We use $addFields where the new field to add (Details) already exists. This effectively means "overwrite in place" and is a common idiom when filtering an array.
The second $match will eliminate docs where the date matches but not a single entry in Details.Cars is a BMW i.e. the array has been filtered down to zero length. Sometimes you want to know this info so if this is the case, do not add the final $match.
I recommend you look into using real dates i.e. ISODate instead of strings so that you can easily take advantage of MongoDB date math and date formatting functions.
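As a rough illustration of what the three stages above do to the data, here is a plain JavaScript model of the same steps. It is only a sketch of the logic, not the aggregation framework itself; the function name bmwPipeline and the extra sample documents (San Diego, Austin) are assumptions for the example.

```javascript
// Plain-JavaScript model of the $match / $addFields+$filter / $match stages.
// bmwPipeline is a hypothetical name used only for this illustration.
function bmwPipeline(docs, date, car) {
  return docs
    // First $match: keep documents with the requested top-level date.
    .filter((d) => d.date === date)
    // $addFields + $filter: overwrite Details with only the matching persons.
    .map((d) => ({ ...d, Details: d.Details.filter((p) => p.Cars.includes(car)) }))
    // Final $match on $size: drop documents whose Details became empty.
    .filter((d) => d.Details.length > 0);
}

const news = [
  {
    City: "Los Angeles",
    date: "2021-01-16",
    Details: [
      { Name: "Shawn", Cars: ["BMW", "Ford", "Opel"] },
      { Name: "Nicole", Cars: ["Opel"] },
    ],
  },
  { City: "San Diego", date: "2021-01-16", Details: [{ Name: "Pat", Cars: ["Opel"] }] },
  { City: "Austin", date: "2021-01-15", Details: [{ Name: "Lee", Cars: ["BMW"] }] },
];

const result = bmwPipeline(news, "2021-01-16", "BMW");
console.log(JSON.stringify(result));
```

With this sample input only the Los Angeles document survives, and its Details array contains Shawn alone.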
It is a common mistake to think that find({"nested.array": value}) will return only the nested object. In fact, the query returns the whole document that has a nested object with the desired value.
The query is returning the whole document where the value BMW exists in the array Details.Cars. So, Nicole is returned too.
To solve this problem:
To get multiple elements that match the criteria, you can use an aggregation with $unwind to split the array into one document per element, and then match on the criteria you want.
db.collection.aggregate([
{
"$match": { "Details.Cars": "BMW", "date": "2021-01-16" }
},
{
"$unwind": "$Details"
},
{
"$match": { "Details.Cars": "BMW" }
}
])
This query first matches on the criteria so that $unwind does not run over the whole collection.
Then $unwind produces one document per array element, and the second $match keeps only the elements you want.
To get only one element (for example, if you match by _id and it is unique) you can use $elemMatch in the projection, like this:
db.collection.find({
"Details.Cars": "BMW",
"date": "2021-01-16"
},
{
"Details": {
"$elemMatch": {
"Cars": "BMW"
}
}
})
You can use $elemMatch in either the query or the projection.
Using $elemMatch in the query looks like this:
db.collection.find({
"Details": {
"$elemMatch": {
"Cars": "BMW"
}
},
"date": "2021-01-16"
},
{
"Details.$": 1
})
The result is the same. In the second case you are using the positional operator, which returns, as the docs say:
The first element that matches the query condition on the array.
That is, the first element where "Cars": "BMW".
You can choose the way you want.
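One difference worth keeping in mind: both projection forms return at most the first matching array element, while the $unwind approach returns every match. A small plain JavaScript sketch of that difference follows; it is a model, not MongoDB itself, and the third person ("Alex") is a made-up addition to the question's data.

```javascript
// Model of "first matching element" vs "all matching elements".
// "Alex" is a hypothetical extra entry added to show the difference.
const details = [
  { Name: "Shawn", Cars: ["BMW", "Ford", "Opel"] },
  { Name: "Nicole", Cars: ["Opel"] },
  { Name: "Alex", Cars: ["BMW"] },
];

// Roughly what an $elemMatch projection or the positional $ operator gives back.
const firstOnly = details.find((p) => p.Cars.includes("BMW"));

// Roughly what $unwind followed by $match gives back.
const allMatches = details.filter((p) => p.Cars.includes("BMW"));

console.log(firstOnly.Name, allMatches.map((p) => p.Name));
```

So if several people in the same document can own a BMW and you need all of them, the aggregation route is the safer choice.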

Converting some fields in Mongo from String to Array

I have a collection of documents where a "tags" field was switched over from being a space separated list of tags to an array of individual tags. I want to update the previous space-separated fields to all be arrays like the new incoming data.
I'm also having problems with the $type selector because it is applying the type operation to individual array elements, which are strings. So filtering by type just returns everything.
How can I get every document that looks like the first example into the format for the second example?
{
"_id" : ObjectId("12345"),
"tags" : "red blue green white"
}
{
"_id" : ObjectId("54321"),
"tags" : [
"red",
"orange",
"black"
]
}
We can't use the $type operator to filter our documents here because the type of the elements in our array is "string" and as mentioned in the documentation:
When applied to arrays, $type matches any inner element that is of the specified BSON type. For example, when matching for $type : 'array', the document will match if the field has a nested array. It will not return results where the field itself is an array.
But fortunately MongoDB also provides the $exists operator which can be used here with a numeric array index.
Now how can we update those documents?
Well, in MongoDB versions <= 3.2, the only option we have is mapReduce(), but first let's look at the alternative available in the upcoming release of MongoDB.
Starting from MongoDB 3.4, we can $project our documents and use the $split operator to split our string into an array of substrings.
Note that to split only those "tags" that are strings, we need conditional processing with $cond. The condition here is $eq, which evaluates to true when the $type of the field is equal to "string". By the way, the $type aggregation operator is new in 3.4.
Finally, we can overwrite the old collection using the $out pipeline stage operator. But we need to explicitly include any other fields in the $project stage.
db.collection.aggregate(
[
{ "$project": {
"tags": {
"$cond": [
{ "$eq": [
{ "$type": "$tags" },
"string"
]},
{ "$split": [ "$tags", " " ] },
"$tags"
]
}
}},
{ "$out": "collection" }
]
)
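For readers who want to see the $cond / $split logic in isolation, here is a plain JavaScript sketch of what the $project stage computes per document. It is a model under the assumption that tags is always either a string or an array; normalizeTags is a hypothetical name, and strings stand in for ObjectId values.

```javascript
// Plain-JavaScript model of the $project stage above.
// normalizeTags is a hypothetical helper; string ids stand in for ObjectId.
function normalizeTags(doc) {
  // $cond: if the type of "tags" is string, $split on a space, else keep as is.
  const tags = typeof doc.tags === "string" ? doc.tags.split(" ") : doc.tags;
  // Like the $project above, only _id and tags are kept; other fields are dropped.
  return { _id: doc._id, tags };
}

const before = [
  { _id: "12345", tags: "red blue green white" },
  { _id: "54321", tags: ["red", "orange", "black"] },
];

const after = before.map(normalizeTags);
console.log(JSON.stringify(after));
```

The second document passes through unchanged, which is why the pipeline can safely run over the whole collection.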
With mapReduce, we need to use Array.prototype.split() to emit the array of substrings in our map function. We also need to filter our documents using the "query" option. From there, we need to iterate over the "results" array and $set the new value for "tags" using bulk operations: the bulkWrite() method, new in 3.2, or the now deprecated Bulk() API if we are on 2.6 or 3.0.
db.collection.mapReduce(
function() { emit(this._id, this.tags.split(" ")); },
function(key, value) {},
{
"out": { "inline": 1 },
"query": {
"tags.0": { "$exists": false },
"tags": { "$type": 2 }
}
}
)['results']
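The iteration step is described above but not shown, so here is a minimal sketch of how the update operations could be built. It assumes each entry of the inline results array has the shape { _id, value }, where value is the emitted array; buildTagUpdates and the sample result are hypothetical.

```javascript
// Sketch: turn inline mapReduce results into bulkWrite() operations.
// Assumes each result looks like { _id: ..., value: [ ...tags ] }.
// buildTagUpdates is a hypothetical helper name.
function buildTagUpdates(results) {
  return results.map((r) => ({
    updateOne: {
      filter: { _id: r._id },
      update: { $set: { tags: r.value } },
    },
  }));
}

// Made-up sample of what the results array might contain.
const sampleResults = [{ _id: "12345", value: ["red", "blue", "green", "white"] }];
const ops = buildTagUpdates(sampleResults);
console.log(JSON.stringify(ops));

// In the mongo shell (3.2+), the operations would then be sent with:
// db.collection.bulkWrite(ops)
```

For a large collection you would probably want to send the operations in batches rather than in one call.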

MongoDB: Sort by field existing and then alphabetically

In my database I have a field of name. In some records it is an empty string, in others it has a name in it.
In my query, I'm currently doing:
db.users.find({}).sort({'name': 1})
However, this returns results with an empty name field first, then alphabetically returns results. As expected, doing .sort({'name': -1}) returns results with a name and then results with an empty string, but it's in reverse-alphabetical order.
Is there an elegant way to achieve this type of sorting?
How about:
db.users.find({ "name": { "$exists": true } }).sort({'name': 1})
Because, after all, when the field you want to sort on is not actually present, the returned value is null and therefore "lower" in the order than any positive result. So it makes sense to exclude those results if you really are only looking for something with a matching value.
If you really want all the results in there, regardless of null content, then I suggest you "weight" them via .aggregate():
db.users.aggregate([
{ "$project": {
"name": 1,
"score": {
"$cond": [
{ "$ifNull": [ "$name", false ] },
1,
10
]
}
}},
{ "$sort": { "score": 1, "name": 1 } }
])
And that moves all null results to the "end of the chain" by assigning a value as such.
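The weighting idea can be sketched in plain JavaScript as a two-key sort. Note one deliberate difference from the pipeline above: since the question's records hold empty strings rather than missing fields, this model also gives empty strings the heavier weight. That is my assumption about what the asker wants, and the names weight and sortNamedFirst are made up.

```javascript
// Plain-JavaScript model of "weight, then sort by name".
// Assumption: empty strings are weighted like missing/null names.
function weight(user) {
  return user.name === undefined || user.name === null || user.name === "" ? 10 : 1;
}

function sortNamedFirst(users) {
  // Copy first so the input array is left untouched.
  return [...users].sort((a, b) => {
    const byScore = weight(a) - weight(b);
    if (byScore !== 0) return byScore;
    const an = a.name ?? "";
    const bn = b.name ?? "";
    return an < bn ? -1 : an > bn ? 1 : 0;
  });
}

const users = [{ name: "" }, { name: "Carol" }, {}, { name: "Alice" }, { name: "Bob" }];
const sorted = sortNamedFirst(users);
console.log(sorted.map((u) => u.name));
```

Named users come out alphabetically, followed by the ones without a usable name.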
If you want to filter out documents with an empty "name" field, change your query: db.users.find({"name": {"$ne": ""}}).sort({"name": 1})

MongoDB distinct- returns only matched array elements

I was trying to fetch distinct tags from array for auto-complete module. The collection format is:
{
tags:["apple","mango","apple-pie"]
},
{
tags: ["man","lemon","lemon-lite"]
}
Now, I am interested in getting distinct tags, with prefix q.
The query that I triggered is:
db.portfolio.distinct("tags",{"tags":/app/});
However, this query returned entire array:
["apple","mango","apple-pie"].
My requirement is: ["apple", "apple-pie"].
How can I modify my query to get desired result?
You can do this with aggregation.
You $unwind the tags array.
You $match those tags you are looking for according to the regular expression given.
You $group the tags into a set using $addToSet.
The code looks something like this:
> db.portfolio.aggregate([
{ "$unwind": "$tags" },
{ "$match": { "tags": /app/ }},
{ "$group":
{
"_id": null,
"tags": { "$addToSet": "$tags" }
}
}
]);
{ "_id" : null, "tags" : [ "apple-pie", "apple" ] }
This is not possible with distinct alone, because the query returns whole documents that contain a matching tag (/app/), which means they contain non-matching tags as well. distinct then builds a distinct set from all of those tags.
So you will have to filter the returned array again (using the regex /app/).
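That client-side filter is a one-liner. Here is a small sketch in plain JavaScript, using the array from the question as a stand-in for what distinct() returned:

```javascript
// Stand-in for the array returned by db.portfolio.distinct("tags", {"tags": /app/}).
const distinctTags = ["apple", "mango", "apple-pie"];

// Keep only the tags that match the same regular expression.
const matchingTags = distinctTags.filter((t) => /app/.test(t));

console.log(matchingTags);
```

Since the goal is prefix matching for auto-complete, an anchored expression such as /^app/ may be closer to what is wanted, both in the query and in this filter.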

mongodb: return an array of document ids

Is it possible to query mongodb to return array of matching document id values, without the related keys?
Please consider the following 'parent' data structure:
{
"_id": ObjectId("52448e4697fb2b775cb5c3a7"),
"name": "Peter",
"children": [
{
"name": "joe"
}
]
},
{
"_id": ObjectId("52448e4697fb2b775cb5c3b6"),
"name": "Marry",
"children": [
{
"name": "joe"
}
]
}
I would like to query for an array of parent _ids whose children have the name "joe".
For provided sample data, I would like the following output returned from mongo:
[ObjectId("52448e4697fb2b775cb5c3a7"), ObjectId("52448e4697fb2b775cb5c3b6")]
I know that I can query for an output like this, which also contains the keys
[{"_id": ObjectId("52448e4697fb2b775cb5c3a7")}, {"_id": ObjectId("52448e4697fb2b775cb5c3b6")}]
However I need to push above array to another document with an update operation like this:
db.statistic.update({"date": today}, {$push: {"children": [ObjectId("52448e4697fb2b775cb5c3a7"), ObjectId("52448e4697fb2b775cb5c3b6")]}}, true, false)
I would like to avoid post-processing the document structure if it is possible to just return an array containing the appropriate values using mongo.
This should be possible with:
db.coll.distinct("_id", {"children.name": "joe"})
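If you already have the [{"_id": ...}] form from a regular find, flattening it on the client is also short. The sketch below uses plain JavaScript with strings standing in for ObjectId values. It also builds the update document with $each, on the assumption that you want the ids appended as separate elements; $push with a bare array appends the whole array as a single element.

```javascript
// Strings stand in for ObjectId values in this sketch.
const found = [
  { _id: "52448e4697fb2b775cb5c3a7" },
  { _id: "52448e4697fb2b775cb5c3b6" },
];

// Flatten [{_id: ...}, ...] into a plain array of ids.
const ids = found.map((d) => d._id);

// Assumption: each id should become its own element of "children",
// so $each is used instead of pushing the array as one value.
const update = { $push: { children: { $each: ids } } };

console.log(JSON.stringify(update));
```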