Find count of number of matched values in MongoDB document

In MongoDB, I want to find the documents that contain some or all of the values specified in a condition, and count how many of those values matched.
My pseudo target document:
{ "name":"room1",
"colors":["red","blue","green"],
"objects":["chair","bed"]
}
Now I want the documents that have any of the colors or any of the objects present in "room1". A document should be returned whether it shares just one value or all of them.
{ "name":"room2",
"colors":["blue","green"],
"objects":["chair","bed","sofa","fridge"]
}
{ "name":"room3",
"colors":["yellow","pink"],
"objects":["chair"]
}
So the result should be:
for room2: matchcount=4, as it shares 4 common values with room1
for room3: matchcount=1, as it shares 1 common value with room1
So far I have tried using $in and the aggregate function. This finds the documents with similar values in them, but working out the match count is still an issue.

Let's suppose your collection is named "room"; then the query is:
db.room.aggregate([{
$match: { colors: { $in: ["red","blue","green"] } }
}, {$project:{matchedColorCount:{
$size: {
"$setIntersection": [["red","blue","green"], '$colors' ]
}
}}}])
Here, $match: { colors: { $in: ["red","blue","green"] } } selects the documents that contain at least one of the listed colors,
and "$setIntersection": [["red","blue","green"], '$colors' ] extracts the colors that the document shares with the list.
$size then gives the number of matched colors. Note that this query counts colors only; objects are not included.
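The same idea can be extended to count both colors and objects, which is what the expected matchcount values in the question imply. The following is only a sketch: the helper names buildMatchCountPipeline and countShared are made up for illustration, excluding the target document by name is an assumption, and $or is used because room3 is expected in the result even though it shares no color.

```javascript
// Sketch: build a pipeline that counts the values a document shares with a
// target document across several array fields.
function buildMatchCountPipeline(target, fields) {
  // Keep documents that share at least one value in any of the fields.
  const anyShared = fields.map((f) => ({ [f]: { $in: target[f] } }));
  // One intersection size per field; $ifNull guards documents missing a field.
  const sizes = fields.map((f) => ({
    $size: { $setIntersection: [target[f], { $ifNull: ["$" + f, []] }] },
  }));
  return [
    // Assumption: the target itself is excluded by its name.
    { $match: { name: { $ne: target.name }, $or: anyShared } },
    { $project: { name: 1, matchcount: { $add: sizes } } },
  ];
}

// Plain-JavaScript mirror of the same count, for checking the logic offline.
function countShared(target, doc, fields) {
  return fields.reduce((total, f) => {
    const wanted = new Set(target[f]);
    const shared = new Set((doc[f] || []).filter((v) => wanted.has(v)));
    return total + shared.size;
  }, 0);
}

const room1 = {
  name: "room1",
  colors: ["red", "blue", "green"],
  objects: ["chair", "bed"],
};
const matchCountPipeline = buildMatchCountPipeline(room1, ["colors", "objects"]);
console.log(JSON.stringify(matchCountPipeline, null, 2));
// Usage in the shell: db.room.aggregate(matchCountPipeline)
```

Summing the per-field sizes with $add keeps a value that appears in both colors and objects from being merged into one match, which a single combined $setIntersection would do.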

Related

MongoDB $elemMatch comparison to field in same document

I'm wanting to create an aggregation step to match documents where the value of a field in a document exists within an array in the same document.
In a very worked example (note this is very simplified; this will be fitting into a larger existing pipeline), given documents:
{
"_id":{"$oid":"61a9085af9733d0274c41990"},
"myArray":[
{"$oid":"61a9085af9733d0274c41991"},
{"$oid":"61a9085af9733d0274c41992"},
{"$oid":"61a9085af9733d0274c41993"}
],
"myField":{"$oid":"61a9085af9733d0274c41991"} // < In 'myArray' collection
}
and
{
"_id":{"$oid":"61a9085af9733d0274c41990"},
"myArray":[
{"$oid":"61a9085af9733d0274c41991"},
{"$oid":"61a9085af9733d0274c41992"},
{"$oid":"61a9085af9733d0274c41993"}
],
"myField":{"$oid":"61a9085af9733d0274c41994"} // < Not in 'myArray' collection
}
I want to match the first one because the value of myField exists in myArray, but not the second document.
It feels like this should be a really simple $elemMatch operation with an $eq operator, but I can't make it work and every example I've found uses literals. What I've got currently is below, and I've tried with various combinations of quotes and dollar signs round myField.
[{
$match: {
myArray: {
$elemMatch: {
$eq: '$this.myField'
}
}
}
}]
Am I doing something very obviously wrong? Is it not possible to use the value of a field in the same document with an $eq?
Hoping that someone can come along and point out where I'm being stupid :)
Thanks
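$elemMatch compares array elements against literal values, so '$this.myField' is treated as a plain string rather than a field reference. You can instead use $in inside $expr in an aggregation pipeline: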
db.collection.aggregate([
{
"$match": {
$expr: {
"$in": [
"$myField",
"$myArray"
]
}
}
}
])
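One thing to watch for: as far as I know, the aggregation $in operator raises an error when its second argument is not an array, for example when myArray is missing from some documents. A possible guard is sketched below; the helper name fieldInOwnArrayStage is hypothetical.

```javascript
// Sketch: build a $match stage that keeps documents whose scalar field value
// appears in an array field of the same document.
function fieldInOwnArrayStage(fieldName, arrayName) {
  return {
    $match: {
      $expr: {
        // $ifNull substitutes [] when the array is missing or null, because
        // the aggregation $in operator expects an array as its second argument.
        $in: ["$" + fieldName, { $ifNull: ["$" + arrayName, []] }],
      },
    },
  };
}

const inOwnArrayStage = fieldInOwnArrayStage("myField", "myArray");
console.log(JSON.stringify(inOwnArrayStage));
// Usage (the collection name is a placeholder):
// db.collection.aggregate([inOwnArrayStage])
```

Documents without the array then simply fail the match instead of aborting the whole pipeline.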

Filter a find by the last element on an embedded array

In MongoDB how can I do a search that is filtered by a predicate applied on the last element of an embedded array?
I know if I wanted to do it on the first one I could do this:
db.inventory.find( { 'instock.0.qty': { $lte: 20 } } )
As specified on the documentation.
How do I write an analog query that looks at the last element, when I don't know the exact size of the embedded array?
We can use $arrayElemAt and pass -1 as its second argument to get the last element of the array.
For example, to match documents whose last instock.qty is greater than 10:
db.collection.find({
$expr: {
$gt: [
{
$arrayElemAt: ["$instock.qty", -1]
},
10
]
}
})
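To mirror the exact condition from the question ($lte: 20 on the last element), the filter could be built as in the sketch below. The helper names are made up. The "instock.0" existence check is my own addition: to my understanding, aggregation comparison operators compare across types, so a document with an empty or missing array (where $arrayElemAt yields no value) would otherwise satisfy $lte.

```javascript
// Sketch: build a find() filter that compares the last element of an embedded
// array against a value, analogous to { 'instock.0.qty': { $lte: 20 } }.
function lastElementFilter(arrayName, subField, operator, value) {
  return {
    // Require at least one element, so empty or missing arrays do not match.
    [arrayName + ".0"]: { $exists: true },
    $expr: {
      [operator]: [
        // "$instock.qty" resolves to the array of qty values; -1 picks the last.
        { $arrayElemAt: ["$" + arrayName + "." + subField, -1] },
        value,
      ],
    },
  };
}

// Plain-JavaScript mirror of the $lte case, for checking the logic offline.
function lastQtyAtMost(doc, limit) {
  const quantities = (doc.instock || []).map((item) => item.qty);
  return quantities.length > 0 && quantities[quantities.length - 1] <= limit;
}

const lastQtyFilter = lastElementFilter("instock", "qty", "$lte", 20);
console.log(JSON.stringify(lastQtyFilter));
// Usage in the shell: db.inventory.find(lastQtyFilter)
```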

Select only documents with id value not in the collection

I have a collection ft with a number of records in it. However, I want to exclude certain ids that I have in a comma-delimited list.
I am trying to do the following:
var ids = [ "RQcWthREHBTfkybMy", "jiPrzQQWxbN5a8pEC", "5oFxC68WEggYzY7ah" ]
db.collection.find( { _id: { $ne: ids } } )
but I can only manage to exclude RQcWthREHBTfkybMy, which is the first id in my list.
When the specified value is an array, the $ne query operator compares the field against the array as a whole rather than against each of its elements, so it does not test membership. To check that a field value is not in the specified array, you need to use the $nin query operator.
var ids = [ "RQcWthREHBTfkybMy", "jiPrzQQWxbN5a8pEC", "5oFxC68WEggYzY7ah" ]
db.collection.find( { "_id": { "$nin": ids } } )
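Since the question mentions that the ids start out as a comma-delimited list, here is a small sketch of turning such a string into the $nin filter. The helper name excludeIdsFilter is hypothetical, and it assumes the _id values are plain strings, as in the question.

```javascript
// Sketch: turn a comma-delimited string of ids into a $nin filter.
function excludeIdsFilter(commaDelimitedIds) {
  const ids = commaDelimitedIds
    .split(",")
    .map((id) => id.trim())
    .filter((id) => id.length > 0); // drop empty entries such as a trailing comma
  return { _id: { $nin: ids } };
}

const excludeFilter = excludeIdsFilter(
  "RQcWthREHBTfkybMy, jiPrzQQWxbN5a8pEC,5oFxC68WEggYzY7ah"
);
console.log(JSON.stringify(excludeFilter));
// Usage in the shell: db.ft.find(excludeFilter)
```

Trimming matters here because an id with a stray leading space would never equal a stored _id, so that document would silently stay in the result.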

mongodb: document with the maximum number of matched targets

I need help to solve the following issue. My collection has a "targets" field.
Each user can have 0 or more targets.
When I run my query I'd like to retrieve the document with the maximum number of matched targets.
Ex:
documents=[{
targets:{
"cluster":"01",
}
},{
targets:{
"cluster":"01",
"env":"DC",
"core":"PO"
}
},{
targets:{
"cluster":"01",
"env":"DC",
"core":"PO",
"platform":"IG"
}
}];
userTarget={
"cluster":"01",
"env":"DC",
"core":"PO"
}
You seem to be asking to return the document where the most conditions were met, and possibly not all conditions. The basic process is an $or query to return the documents that match any of the conditions. Then you need an expression to calculate how many terms were met in each document, and return the one that matched the most.
So the combination here is an .aggregate() statement that takes the initial results from $or, calculates a score and then sorts the results:
// initial targets object
var userTarget = {
"cluster":"01",
"env":"DC",
"core":"PO"
};
// Convert to an $or condition
// and the calculation condition to match.
// "targets" is the field name used in the question's documents.
var orCondition = [],
scoreCondition = [];
Object.keys(userTarget).forEach(function(key) {
var query = {},
cond = { "$cond": [{ "$eq": ["$targets." + key, userTarget[key]] },1,0] };
query["targets." + key] = userTarget[key];
orCondition.push(query);
scoreCondition.push(cond);
});
// Run aggregation
Model.aggregate(
[
// Match with condition
{ "$match": { "$or": orCondition } },
// Calculate a "score" based on matched fields
{ "$project": {
"targets": 1,
"score": {
"$add": scoreCondition
}
}},
// Sort on the greatest "score" (descending)
{ "$sort": { "score": -1 } },
// Return the first document
{ "$limit": 1 }
],
function(err,result) {
// check errors
// Remember that result is an array, even if limited to one document
console.log(result[0]);
}
)
So before processing the aggregate statement, we generate the dynamic parts of the pipeline operations based on the input in the userTarget object. This would produce a $match stage like this:
{ "$match": {
"$or": [
{ "targets.cluster" : "01" },
{ "targets.env" : "DC" },
{ "targets.core" : "PO" }
]
}}
And the scoreCondition would expand to an expression like this:
"score": {
"$add": [
{ "$cond": [{ "$eq": [ "$targets.cluster", "01" ] },1,0] },
{ "$cond": [{ "$eq": [ "$targets.env", "DC" ] },1,0] },
{ "$cond": [{ "$eq": [ "$targets.core", "PO" ] },1,0] }
]
}
Those are going to be used in the selection of possible documents and then for counting the terms that could match. In particular the "score" is made by evaluating each condition within the $cond ternary operator, and then either attributing a score of 1 where there was a match, or 0 where there was not a match on that field.
If desired, it would be simple to alter the logic to assign a higher "weight" to each field with a different value going towards the score depending on the deemed importance of the match. At any rate, you simply $add these score results together for each field for the overall "score".
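As a rough sketch of that weighting idea, the score terms could be generated with a per-field weight instead of a fixed 1. The helper name buildWeightedScore and the weight values are purely illustrative, and the field prefix is passed in as a parameter ("targets" in the question's sample documents).

```javascript
// Sketch: weighted variant of the score expression. Fields without an
// explicit weight fall back to a weight of 1.
function buildWeightedScore(userTarget, weights, prefix) {
  const terms = Object.keys(userTarget).map((key) => ({
    $cond: [
      { $eq: ["$" + prefix + "." + key, userTarget[key]] },
      weights[key] !== undefined ? weights[key] : 1,
      0,
    ],
  }));
  return { $add: terms };
}

// Illustrative weights: a match on "env" counts three times as much.
const weightedScore = buildWeightedScore(
  { cluster: "01", env: "DC", core: "PO" },
  { env: 3 },
  "targets"
);
console.log(JSON.stringify(weightedScore, null, 2));
// Usage: place it in the $project stage as { "score": weightedScore }
```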
Then it is just a simple matter of applying the $sort to the returned "score", and then using $limit to just return the top document.
It's not super efficient. Even if some document matches all three conditions, the query cannot presume that one does, so it has to look at every document where at least one condition matched and then work out the best match from those candidates.
Ideally, I would personally run an additional query first to see if all three conditions were met, and only if not then look for the other cases. That is still two separate queries, and it differs from simply pushing the "and" conditions for all fields as the first statement in $or.
So the preferred implementation, I think, should be:
1. Look for a document that matches all given field values; if there is none, then
2. Run the either/or on every field and count the condition matches.
That way, if all fields match, the first query is fastest, and you only need to fall back to the slower but required implementation shown in the listing when it returned no result.
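The two steps above could be prepared together, as in the following sketch. The helper name buildTwoStepQueries is hypothetical, and the field prefix is a parameter ("targets" in the question's sample documents). It assumes userTarget has at least one key, since $or does not accept an empty array.

```javascript
// Sketch: build the exact-match filter for step 1 and the scoring pipeline
// for the step 2 fallback from the same userTarget object.
function buildTwoStepQueries(userTarget, prefix) {
  const exactFilter = {};
  const orCondition = [];
  const scoreCondition = [];
  Object.keys(userTarget).forEach((key) => {
    const path = prefix + "." + key;
    exactFilter[path] = userTarget[key]; // all fields must match (implicit AND)
    orCondition.push({ [path]: userTarget[key] });
    scoreCondition.push({ $cond: [{ $eq: ["$" + path, userTarget[key]] }, 1, 0] });
  });
  const fallbackPipeline = [
    { $match: { $or: orCondition } },
    { $project: { [prefix]: 1, score: { $add: scoreCondition } } },
    { $sort: { score: -1 } },
    { $limit: 1 },
  ];
  return { exactFilter, fallbackPipeline };
}

const twoStep = buildTwoStepQueries(
  { cluster: "01", env: "DC", core: "PO" },
  "targets"
);
console.log(JSON.stringify(twoStep, null, 2));
// Usage with the Mongoose model from the listing, inside an async function:
//   const exact = await Model.findOne(twoStep.exactFilter);
//   const best = exact || (await Model.aggregate(twoStep.fallbackPipeline))[0];
```

Note that the exact filter also accepts documents carrying extra target fields (such as "platform" in the sample data), because it only requires the listed fields to match.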

How can I create an index on an array field in MongoDB?

I have a MongoDB collection with data in the format of:
[
{
"data1":1,
"data2":2,
"data3":3,
"data4":4,
"horses":[
{
"opponent":{
"jockey":"MyFirstName MyLastName",
"name":"MyHorseName",
"age":4,
"sex":"g",
"scratched":"false",
"id":"1"
},
"id":"1"
},
{
"opponent":{
"jockey":"YourFirstName YourLastName",
"name":"YourHorseName",
"age":4,
"sex":"m",
"scratched":"false",
"id":"2"
},
"id":"2"
}
]
},
...
]
Executing the following query returns exactly what I need:
db.race_results.find({ "$and": [ { "horses":
{ "$elemMatch": { "$and": [
{ "opponent.name": "MyHorseName" },
{ "opponent.jockey": "MyFirstName MyLastName"}
] } }
}
]})
However, this query takes 0.5 seconds to execute with my collection (there are a lot of records).
I am trying to find out how to create an index on the horses.opponent.name field of the data. I have read the docs about multikey indexes, but I'm not sure whether this is exactly what I need. What I need (I think) is an index on the elements of the horses array, but only on the name and jockey fields. Is this possible?
Is there a way to create an index to make my specific query (the one above) any faster?
Any pointers would be greatly appreciated. I am fairly new to MongoDB, but learning fast!
The index to create is:
db.race_results.createIndex({"horses.opponent.name":1, "horses.opponent.jockey":1})
After creating this index, the number of documents examined for your query should equal the number of documents returned (totalDocsExamined and nReturned in the executionStats output):
db.race_results.find( { horses: { $elemMatch: { "opponent.name": "MyHorseName", "opponent.jockey": "MyFirstName MyLastName" } } }
).explain("executionStats")
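To make that check less manual, the relevant counters can be pulled out of the explain result. This is only a sketch: summarizeExecutionStats is a made-up helper, and it assumes the result was produced with explain("executionStats"), so that it contains an executionStats section with nReturned, totalKeysExamined and totalDocsExamined.

```javascript
// Sketch: summarize how much work a query did, based on the executionStats
// section of an explain result.
function summarizeExecutionStats(explainOutput) {
  const stats = explainOutput.executionStats;
  return {
    returned: stats.nReturned,
    keysExamined: stats.totalKeysExamined,
    docsExamined: stats.totalDocsExamined,
    // True when every fetched document was also returned, i.e. the index
    // narrowed the candidates down to exactly the matching documents.
    noWastedDocumentReads: stats.totalDocsExamined === stats.nReturned,
  };
}

// Usage in the shell:
//   var result = db.race_results.find({ ... }).explain("executionStats")
//   summarizeExecutionStats(result)
```

If noWastedDocumentReads comes back false, or docsExamined is close to the collection size, the query is probably not using the compound index as intended.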