I'm trying to create a simple search algorithm that will try to match against a first name, last name, and/or set of tags, as an example:
[
{
"key": 1,
"fname": "Bob",
"lname": "Smith",
"tags": [
"a",
"b",
"c"
]
},
{
"key": 2,
"fname": "John",
"lname": "Jacob",
"tags": [
"c",
"d",
"e"
]
},
{
"key": 3,
"fname": "Will",
"lname": "Smith",
"tags": [
"a",
"b",
"c"
]
}
]
This works with the following query, but I can only get the count of matching tags. What I'm going for is to match on first name, last name, or tags, and to store a "point" for each match:
db.collection.aggregate([
{
$match: {
$or: [
{
"fname": "Will"
},
{
"lname": "Smith"
},
{
tags: {
$in: [
"b",
"c"
]
}
}
]
}
},
{
$project: {
tagsMatchCount: {
$size: {
"$setIntersection": [
[
"b",
"c"
],
"$tags"
]
}
}
}
},
{
"$sort": {
tagsMatchCount: -1
}
}
])
Here's the sandbox I'm playing with: https://mongoplayground.net/p/DFJQZY-dfb5
Query
create a points document that holds each kind of match in a separate field
add one extra field, total
keep only the documents with at least 1 match
you can also sort afterwards by any of the 3 types of matches or by total, for example
{"$sort":{"points.total":-1}}
if you have an index that can be used, remove my $match and add your own $match as the first stage, as in your example
Test code here
aggregate(
[{"$set":
{"points":
{"fname":{"$cond":[{"$eq":["$fname", "Will"]}, 1, 0]},
"lname":{"$cond":[{"$eq":["$lname", "Smith"]}, 1, 0]},
"tags":{"$size":{"$setIntersection":["$tags", ["b", "c"]]}}}}},
{"$set":
{"points.total":
{"$add":["$points.fname", "$points.lname", "$points.tags"]}}},
{"$match":{"$expr":{"$gt":["$points.total", 0]}}}])
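To make the point logic easier to check without a database, here is a minimal in-memory sketch of the same idea in plain JavaScript. It assumes the three sample documents from the question; scorePeople is a hypothetical helper name, not a MongoDB API, and the descending sort corresponds to the optional $sort on points.total mentioned above.

```javascript
// In-memory sketch only: `scorePeople` is a hypothetical helper, not part of MongoDB.
const people = [
  { key: 1, fname: "Bob", lname: "Smith", tags: ["a", "b", "c"] },
  { key: 2, fname: "John", lname: "Jacob", tags: ["c", "d", "e"] },
  { key: 3, fname: "Will", lname: "Smith", tags: ["a", "b", "c"] },
];

function scorePeople(docs, { fname, lname, tags }) {
  const wanted = new Set(tags);
  return docs
    .map((doc) => {
      const points = {
        // One point for an exact first-name or last-name match ($cond / $eq).
        fname: doc.fname === fname ? 1 : 0,
        lname: doc.lname === lname ? 1 : 0,
        // One point per distinct shared tag ($size of $setIntersection).
        tags: new Set(doc.tags.filter((t) => wanted.has(t))).size,
      };
      points.total = points.fname + points.lname + points.tags;
      return { ...doc, points };
    })
    // Keep only documents with at least one match, best score first.
    .filter((doc) => doc.points.total > 0)
    .sort((a, b) => b.points.total - a.points.total);
}

const ranked = scorePeople(people, { fname: "Will", lname: "Smith", tags: ["b", "c"] });
console.log(ranked.map((d) => [d.key, d.points.total]));
// prints [ [ 3, 4 ], [ 1, 3 ], [ 2, 1 ] ]
```

Will scores 4 (both names plus two tags), Bob scores 3 (last name plus two tags), and John scores 1 (one tag).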
I have set with strings like this: a,b
and a collection like this:
{
"owner": "anna",
"letters": ["c"]
},
{
"owner": "bob",
"letters": ["b", "c"]
},
{
"owner": "cai",
"letters": ["a", "b"]
},
{
"owner": "dora",
"letters": ["a", "d"]
},
{
"owner": "emil",
"letters": ["a"]
},
{
"owner": "fry",
"letters": ["b"]
},
I want to get a random set of documents whose "letters" arrays do not overlap with each other and whose combined letters contain the set a,b.
A valid solution would be:
{
"owner": "cai",
"letters": ["a", "b"]
}
Also valid:
{
"owner": "emil",
"letters": ["a"]
},
{
"owner": "fry",
"letters": ["b"]
},
Also valid, since the sets of "letters" are distinct:
{
"owner": "emil",
"letters": ["a"]
},
{
"owner": "bob",
"letters": ["b", "c"]
},
Also valid:
{
"owner": "bob",
"letters": ["b", "c"]
},
{
"owner": "dora",
"letters": ["a", "d"]
},
NOT valid would be the following, since "b" is in both documents (should only be in one):
{
"owner": "bob",
"letters": ["b", "c"]
},
{
"owner": "cai",
"letters": ["a", "b"]
}
Also NOT valid, since "a" is in both documents:
{
"owner": "emil",
"letters": ["a"]
},
{
"owner": "cai",
"letters": ["a", "b"]
},
With the aggregation pipeline, I tried to group the documents by the field "letters" and randomize the order of the documents. I am stuck on how to apply the set rules so that only documents without intersections between their "letters" arrays are returned to match the search set.
Thanks for your help!
Use a sub-pipeline inside $lookup to search by a combination where the set union contains your criteria array (i.e. ["a", "b"]). Use $sample at the end to pick 1 combination. The combination would be stored in the $lookup result and you can just pick the combination you want.
db.collection.aggregate([
{
"$lookup": {
"from": "collection",
"let": {
l: "$letters"
},
"pipeline": [
{
"$match": {
$expr: {
$eq: [
{
"$setIntersection": [
"$$l",
"$letters"
]
},
[]
]
}
}
},
{
"$match": {
$expr: {
"$setIsSubset": [
[
"a",
"b"
],
{
"$setUnion": [
"$$l",
"$letters"
]
}
]
}
}
},
{
// optional for performance boost
$limit: 10
}
],
"as": "candidates"
}
},
{
$unwind: "$candidates"
},
{
$sample: {
size: 1
}
}
])
Mongo Playground
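As a rough illustration of what the two $match stages inside the $lookup test, here is an in-memory sketch in plain JavaScript over the sample collection from the question. The helper names (isDisjoint, covers, findDisjointPairs, pickRandom) are made up for this example, and pickRandom is only a stand-in for $sample.

```javascript
// In-memory sketch only; all helper names here are hypothetical.
const collection = [
  { owner: "anna", letters: ["c"] },
  { owner: "bob", letters: ["b", "c"] },
  { owner: "cai", letters: ["a", "b"] },
  { owner: "dora", letters: ["a", "d"] },
  { owner: "emil", letters: ["a"] },
  { owner: "fry", letters: ["b"] },
];

// True when the two arrays share no element (an empty $setIntersection).
function isDisjoint(left, right) {
  return left.every((value) => !right.includes(value));
}

// True when every wanted letter is in the combined letters ($setIsSubset over $setUnion).
function covers(wanted, combined) {
  return wanted.every((value) => combined.includes(value));
}

// Pair every document with every candidate that passes both checks.
function findDisjointPairs(docs, wanted) {
  const pairs = [];
  for (const outer of docs) {
    for (const candidate of docs) {
      if (isDisjoint(outer.letters, candidate.letters) &&
          covers(wanted, [...outer.letters, ...candidate.letters])) {
        pairs.push([outer.owner, candidate.owner]);
      }
    }
  }
  return pairs;
}

// Stand-in for $sample: pick one pair at random.
function pickRandom(items) {
  return items[Math.floor(Math.random() * items.length)];
}

const pairs = findDisjointPairs(collection, ["a", "b"]);
console.log(pickRandom(pairs));
```

Note that this pairwise check only ever produces combinations of exactly two documents, so "cai" never appears alone; it shows up paired with a document it does not overlap with, such as "anna".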
Consider this dataset. For each name, we wish to find the average of x and the distinct set and count of game. For Steve, this is avg(x) ≈ 19.67, game A is 2, and game B is 1. For Bob, this is avg(x) = 60.5, game B is 4:
{"name":"Steve", "game": "A", x:7},
{"name":"Steve", "game": "A", x:21},
{"name":"Steve", "game": "B", x:31},
{"name":"Bob", "game": "B", x:41},
{"name":"Bob", "game": "B", x:51},
{"name":"Bob", "game": "B", x:71},
{"name":"Bob", "game": "B", x:79},
{"name":"Jill", "game": "A", x:61},
{"name":"Jill", "game": "B", x:71},
{"name":"Jill", "game": "C", x:81},
{"name":"Jill", "game": "D", x:91}
EDIT: Answer is below but leaving this incomplete solution as a stepping stone.
I am really close with this. Note we cannot use $addToSet because it is "lossy". So instead, we group by player and game to get the full list, then in a second group, capture list size:
db.foo2.aggregate([
{$group: {_id:{n:"$name",g:"$game"}, z:{$push: "$x"} }}
,{$group: {_id:"$_id.n",
avgx: {$avg: "$z"},
games: {$push: {name: "$_id.g", num: {$size:"$z"}}}
}}
]);
which yields:
{
"_id" : "Steve",
"avgx" : null,
"games" : [ {"name":"A", "num":2 },
{"name":"B", "num":1 }
]
}
{
"_id" : "Bob",
"avgx" : null,
"games" : [ {"name":"B", "num":4 } ]
}
but I just cannot seem to get the avgx working properly. If I needed the average within the game type that would be easy but I need it across the player. $avg in the $group context does not work with array inputs.
Try this:
db.collection.aggregate([
{
$group: {
_id: "$name",
avg: {
$avg: "$x"
},
gamesUnFiltered: {
$push: {
name: "$game",
num: "$x"
}
}
}
},
{
$addFields: {
games: {
$reduce: {
input: "$gamesUnFiltered",
initialValue: [],
in: {
$cond: [
{
$not: [
{
$in: [
"$$this.name",
"$$value.name"
]
}
]
},
{
$concatArrays: [
[
"$$this"
],
"$$value"
]
},
"$$value"
]
}
}
}
}
},
{
$project: {
gamesUnFiltered: 0
}
}
])
Output:
[
{
"_id": "Bob",
"avg": 60.5,
"games": [
{
"name": "B",
"num": 41
}
]
},
{
"_id": "Steve",
"avg": 19.666666666666668,
"games": [
{
"name": "B",
"num": 31
},
{
"name": "A",
"num": 7
}
]
},
{
"_id": "Jill",
"avg": 76,
"games": [
{
"name": "D",
"num": 91
},
{
"name": "C",
"num": 81
},
{
"name": "B",
"num": 71
},
{
"name": "A",
"num": 61
}
]
}
]
Got it! You need an extra $unwind and use $first to "carry" the a field from stage to stage. Threw in total_games for extra info. In general, the "group-unwind-first" pattern is a way to aggregate one or more things then "reset" to unaggregated state to perform additional operations with the aggregate values traveling along with each doc.
db.foo2.aggregate([
{$group: {_id:"$name", a:{$avg:"$x"}, g:{$push: "$game"} }}
,{$unwind: "$g"}
,{$group: {_id:{name:"$_id",game:"$g"}, a:{$first:"$a"}, n:{$sum:1}}}
,{$group: {_id:"$_id.name",
a:{$first:"$a"},
total_games: {$sum:"$n"},
games: {$push: {name:"$_id.game",n:"$n"}}
}}
]);
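For checking the expected numbers by hand, here is a small in-memory sketch in plain JavaScript that produces the same shape of result (average of x per player, total games, and a count per game). It is only a cross-check against the sample data above, not a replacement for the pipeline; summarize is a hypothetical helper name.

```javascript
// In-memory sketch only; `summarize` is a hypothetical helper.
const scores = [
  { name: "Steve", game: "A", x: 7 },
  { name: "Steve", game: "A", x: 21 },
  { name: "Steve", game: "B", x: 31 },
  { name: "Bob", game: "B", x: 41 },
  { name: "Bob", game: "B", x: 51 },
  { name: "Bob", game: "B", x: 71 },
  { name: "Bob", game: "B", x: 79 },
  { name: "Jill", game: "A", x: 61 },
  { name: "Jill", game: "B", x: 71 },
  { name: "Jill", game: "C", x: 81 },
  { name: "Jill", game: "D", x: 91 },
];

function summarize(rows) {
  const byName = new Map();
  for (const { name, game, x } of rows) {
    if (!byName.has(name)) {
      byName.set(name, { sum: 0, count: 0, games: new Map() });
    }
    const stats = byName.get(name);
    // Running sum and count give the per-player average across all games.
    stats.sum += x;
    stats.count += 1;
    // Per-game counter keeps duplicates, unlike a plain set of game names.
    stats.games.set(game, (stats.games.get(game) || 0) + 1);
  }
  return [...byName].map(([name, stats]) => ({
    _id: name,
    a: stats.sum / stats.count,
    total_games: stats.count,
    games: [...stats.games].map(([game, n]) => ({ name: game, n })),
  }));
}

const summary = summarize(scores);
console.log(JSON.stringify(summary, null, 2));
```

With the sample data this gives Steve an average of about 19.67 with A played twice and B once, Bob an average of 60.5 with B played four times, and Jill an average of 76 with one play each of A, B, C and D.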
I have documents like this:
[
{
"id": 1,
"active": true,
"key": []
},
{
"id": 2,
"active": true,
"key": [
{
"code": "fake_code",
"ids": [
""
],
"labels": [
"d"
]
}
]
},
{
"id": 3,
"active": true,
"key": [
{
"code": "fake_code",
"ids": [
""
],
"labels": [
"a",
"b",
"c"
]
}
]
}
]
I only want to get the id of the documents in which any value of a given array (say ["a", "b", "c", "d"]) is present in the labels field.
For example, with the given array ["a", "b", "c", "d"], the document with id = 2 has ["d"] in its labels field, and the document with id = 3 has ["a", "b", "c"] in its labels.
So the expected output is:
[
{
"id": 2
},
{
"id": 3
}
]
Currently, I've been using
db.collection.find({
"key": {
"$all": [
{
"$elemMatch": {
"ids": {
"$in": [
""
]
},
"code": "fake_code",
"labels": {
"$in": [
[
"a",
"b",
"c"
]
]
}
}
}
]
}
},
{
_id: 0,
id: 1
})
This query returns only the document with id = 3, because in this case I am passing the given array as ["a", "b", "c"]. Is it possible to get all matching documents for a given array (like ["a", "b", "c", "d"]), meaning that if a document has at least one of the given values, the query returns its id?
Thanks
You can use $in. If you don't have any other condition inside $elemMatch, you can query the field directly with "key.labels":{$in:[....]}
db.collection.find({
key: {
$elemMatch: {
labels: {
$in: [
"a",
"b",
"c",
"d"
]
}
}
}
},
{
_id: 0,
id: 1
})
Working Mongo playground
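To spell out the "at least one matching value" rule, here is a minimal in-memory sketch in plain JavaScript over the sample documents from the question. idsWithAnyLabel is a hypothetical helper name used only for illustration.

```javascript
// In-memory sketch only; `idsWithAnyLabel` is a hypothetical helper.
const docs = [
  { id: 1, active: true, key: [] },
  { id: 2, active: true, key: [{ code: "fake_code", ids: [""], labels: ["d"] }] },
  { id: 3, active: true, key: [{ code: "fake_code", ids: [""], labels: ["a", "b", "c"] }] },
];

function idsWithAnyLabel(documents, wanted) {
  return documents
    // Keep a document when some key entry has some label in the wanted list.
    .filter((doc) => doc.key.some((entry) => entry.labels.some((label) => wanted.includes(label))))
    // Project only the id, like the { _id: 0, id: 1 } projection.
    .map((doc) => ({ id: doc.id }));
}

const matches = idsWithAnyLabel(docs, ["a", "b", "c", "d"]);
console.log(matches);
// prints [ { id: 2 }, { id: 3 } ]
```

The document with id = 1 is dropped because its key array is empty, so there is no label to compare.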
Apologies for the confusing title, I am not sure how to summarize this.
Suppose I have the following list of documents in a collection:
{ "name": "Lorem", "source": "A" }
{ "name": "Lorem", "source": "B" }
{ "name": "Ipsum", "source": "A" }
{ "name": "Ipsum", "source": "B" }
{ "name": "Ipsum", "source": "C" }
{ "name": "Foo", "source": "B" }
as well as an ordered list of accepted sources, where lower indexes signify higher priority
sources = ["A", "B"]
My query should:
Take a list of available sources and a list of wanted names
Return a maximum of one document per name.
In case of multiple matches, the document with the most prioritized source should be chosen.
Example:
wanted_names = ['Lorem', 'Ipsum', 'Foo', 'NotThere']
Result:
{ "name": "Lorem", "source": "A" }
{ "name": "Ipsum", "source": "A" }
{ "name": "Foo", "source": "B" }
The results don't necessarily have to be ordered.
Is it possible to do this with a Mongo query alone? If so could someone point me towards a resource detailing how to accomplish it?
My current solution doesn't support a list of names, and instead relies on a Python script to execute multiple queries:
db.collection.aggregate([
{$match: {
"name": "Lorem",
"source": {
$in: sources
}}},
{$addFields: {
"order": {
$indexOfArray: [sources, "$source"]
}}},
{$sort: {
"order": 1
}},
{$limit: 1}
]);
Note: _id fields are omitted in this question for the sake of brevity
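How about this: in $group, the $min operator picks the lowest source value per name. This works here because the priority order ['A', 'B'] happens to match the alphabetical order of the source values.
Note: If you prioritize as ['B', 'A'], use $max instead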
db.collection.aggregate([
{
$match: {
"name": {
$in: [
"Lorem",
"Ipsum",
"Foo",
"NotThere"
]
},
"source": {
$in: [
"A",
"B"
]
}
}
},
{
$group: {
_id: "$name",
source: {
$min: "$source"
}
}
},
{
$project: {
_id: 0,
name: "$_id",
source: 1
}
}
])
MongoPlayground
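Since $min and $max only help when the priority order lines up with the sort order of the source values, here is a minimal in-memory sketch in plain JavaScript of the more general rule, ranking each document by the index of its source in the priority list (the same idea as the $indexOfArray step in the question). pickByPriority is a hypothetical helper name.

```javascript
// In-memory sketch only; `pickByPriority` is a hypothetical helper.
const records = [
  { name: "Lorem", source: "A" },
  { name: "Lorem", source: "B" },
  { name: "Ipsum", source: "A" },
  { name: "Ipsum", source: "B" },
  { name: "Ipsum", source: "C" },
  { name: "Foo", source: "B" },
];

function pickByPriority(docs, sources, wantedNames) {
  const best = new Map();
  for (const doc of docs) {
    // Lower index in `sources` means higher priority; -1 means not accepted.
    const rank = sources.indexOf(doc.source);
    if (rank === -1 || !wantedNames.includes(doc.name)) continue;
    const current = best.get(doc.name);
    if (!current || rank < current.rank) {
      best.set(doc.name, { rank, doc });
    }
  }
  // At most one document per wanted name.
  return [...best.values()].map((entry) => entry.doc);
}

const wantedNames = ["Lorem", "Ipsum", "Foo", "NotThere"];
console.log(pickByPriority(records, ["A", "B"], wantedNames));
console.log(pickByPriority(records, ["B", "A"], wantedNames));
```

With sources ["A", "B"] this picks source A for Lorem and Ipsum and source B for Foo; with ["B", "A"] all three names resolve to source B, without having to switch operators.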
I have an array as below:
const test = [{
"_id": 1,
"name": "apple",
"car": "ford"
},{
"_id": 2,
"name": "melon",
"car": "ferrari"
},{
"_id": 3,
"name": "perl",
"car": "Renaut"
}]
And there are documents in MongoDB as below:
[{
"name": "perl", "company": "A"
},{
"name": "melon", "company": "B"
},{
"name": "apple", "company": "C"
},{
"name": "apple", "company": "D"
},{
"name": "perl", "company": "E"
},{
"name": "apple", "company": "F"
}]
And I want to get this result using mongodb aggregate:
[{
"name": "perl", "company": "A", testInform: { "_id": 3, "name": "perl", "car": "Renaut"}
},{
"name": "melon", "company": "B", testInform: { "_id": 2, "name": "melon", "car": "ferrari"}
},{
"name": "apple", "company": "C", testInform: { "_id": 1, "name": "apple", "car": "ford"}
},{
"name": "apple", "company": "D", testInform: { "_id": 1, "name": "apple", "car": "ford"}
},{
"name": "perl", "company": "E", testInform: { "_id": 3, "name": "perl", "car": "Renaut"}
},{
"name": "apple", "company": "F", testInform: { "_id": 1, "name": "apple", "car": "ford"}
}]
I'm thinking of using aggregate with $match, $facet, etc., but I don't know exactly how to do this. Could you recommend a solution?
Thank you so much for reading this.
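Use $lookup with the pipeline option (this assumes the test array is stored in a collection, here demo1):
db.demo2.aggregate([
  {
    $lookup: {
      from: "demo1",
      // expose the outer document's name to the sub-pipeline
      let: { recordName: "$name" },
      pipeline: [
        // keep only the demo1 documents whose name equals the outer name
        { $match: { $expr: { $eq: [ "$$recordName", "$name" ] } } }
      ],
      // the joined documents are returned as an array in testInform
      as: "testInform"
    }
  }
])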
If the test array data is stored in a collection, then achieving the output is pretty straightforward: a $lookup followed by a $project.
Why $arrayElemAt? Because the lookup returns the joined documents as an array in testInform, and we only want the first element as an object.
db.maindocs.aggregate([
{
$lookup: {
from: "testdocs",
localField: "name",
foreignField: "name",
as: "testInform"
}
},
{
$project: {
_id: 0,
name: 1,
company: 1,
testInform: { $arrayElemAt: ["$testInform", 0] }
}
}
]);
Update based on comments:
The idea is to iterate the cursor over the documents stored in MongoDB, use Array.prototype.find() to get the object from test that matches the name field, and add it to the result.
const test = [
{
_id: 1,
name: "apple",
car: "ford"
},
{
_id: 2,
name: "melon",
car: "ferrari"
},
{
_id: 3,
name: "perl",
car: "Renaut"
}
];
const cursor = db.collection("maindocs").find();
const result = [];
while (await cursor.hasNext()) {
const doc = await cursor.next();
const found = test.find(e => e.name === doc.name);
if (found) {
doc["testInform"] = found;
}
result.push(doc);
}
console.info("RESULT::", result);
The aggregation has one stage: Iterate over the test array and get the array element as an object which matches the name field in both the document and the array (using the $reduce operator).
const test = [ { ... }, ... ]
db.test_coll.aggregate( [
{
$addFields: {
testInform: {
$reduce: {
input: test,
initialValue: { },
in: {
$cond: [ { $eq: [ "$$this.name", "$name" ] },
{ $mergeObjects: [ "$$this", "$$value" ] },
"$$value"
]
}
}
}
}
}
] )
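For comparison, here is a minimal in-memory sketch in plain JavaScript of the same join done on the application side, assuming both the test array and the fetched documents are already available as arrays. It builds a Map keyed by name so each document needs only one lookup; attachTestInform is a hypothetical helper name.

```javascript
// In-memory sketch only; `attachTestInform` is a hypothetical helper.
const test = [
  { _id: 1, name: "apple", car: "ford" },
  { _id: 2, name: "melon", car: "ferrari" },
  { _id: 3, name: "perl", car: "Renaut" },
];

const companies = [
  { name: "perl", company: "A" },
  { name: "melon", company: "B" },
  { name: "apple", company: "C" },
  { name: "apple", company: "D" },
  { name: "perl", company: "E" },
  { name: "apple", company: "F" },
];

function attachTestInform(docs, lookup) {
  // Index the lookup array by name once instead of scanning it per document.
  const byName = new Map(lookup.map((item) => [item.name, item]));
  return docs.map((doc) =>
    byName.has(doc.name)
      ? { ...doc, testInform: byName.get(doc.name) }
      : { ...doc }
  );
}

const joined = attachTestInform(companies, test);
console.log(joined);
```

Documents whose name has no entry in test are returned unchanged, without a testInform field.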