MongoDB: select documents whose field value exists in another collection

I have two collections. I need to select documents from the first one where a field value also exists in the second collection. For example:
The User_Item collection has documents like the one below:
{
'_id' ...,
'uid' : 123,
'iid' : 'a123',
'quantity' : 10
}
The second collection contains items like the one below:
{
'_id' ...,
'iid' : 'a456',
'name' : 'someItem'
}
I need the documents whose item ids ('iid') appear in both collections. The expected result is:
{
'_id' : ...,
'uid' : 123,
'iid': 'a123',
'name' : 'item123'
}
I've used $lookup from user_items into items, but it returns EVERY document in the first collection, and many of them have empty arrays. I want to avoid that.
A $lookup from items into user_items returns an array of users instead, which is not the desired result either.
Is there an option in $lookup, or maybe another solution to this issue?

You can use the aggregation below:
db.User_Item.aggregate([
{ "$lookup": {
"from": "second",
"localField": "iid",
"foreignField": "iid",
"as": "second"
}},
{ "$match": { "second": { "$ne": [] }}},
{ "$addFields": {
"name": { "$arrayElemAt": ["$second.name", 0] }
}},
{ "$project": { "second": 0 }}
])
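To make the stages concrete, here is a minimal plain-JavaScript sketch that emulates them in memory. It runs without MongoDB; the sample documents and the helper name selectExisting are made up for illustration and are not part of the answer above.

```javascript
// Hypothetical sample data modelled on the question's documents.
const userItems = [
  { uid: 123, iid: "a123", quantity: 10 },
  { uid: 456, iid: "zzz", quantity: 1 }, // no matching item
];
const items = [
  { iid: "a123", name: "item123" },
  { iid: "a456", name: "someItem" },
];

// Emulates: $lookup -> $match (non-empty) -> $addFields (first name) -> $project.
function selectExisting(left, right) {
  return left
    // $lookup: attach every right-hand document with the same iid
    .map((doc) => ({ ...doc, second: right.filter((r) => r.iid === doc.iid) }))
    // $match: drop documents whose joined array is empty
    .filter((doc) => doc.second.length > 0)
    // $addFields + $project: copy the first name, remove the joined array
    .map(({ second, ...rest }) => ({ ...rest, name: second[0].name }));
}

const existingItems = selectExisting(userItems, items);
console.log(existingItems);
```

Only the document with iid 'a123' survives, because it is the only one with a counterpart in the second collection.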

Related

mongodb aggregation pipeline solution for getting record from 2 collections based on the value from an array in one collection

I have two MongoDB collections: one contains data about cards, and the other, called list, contains data about a field of the cards.
Structure of firstCollection:
{
"cardType":"card",
"xyz":"XYZ",
"fields":[
{"abc":"abc", "xyz":"XYZ", "inputMethod" : "Entry", "xyz":"xyz"},
{"abc":"abc", "xyz":"XYZ", "inputMethod" : "List", "xyz":"xyz", "ListId":"1234"}
// ListId will only be present in case of inputMethod=List
]
}
Structure of secondCollection:
{ "abc":"abc", "xyz":"xyz", "itemId": "1234" }
What I want is:
all the documents from firstCollection where cardType = "card", as complete card objects
and
all the items from secondCollection where itemId in (select ListId from firstCollection where fields.inputMethod = "List").
I need to write a MongoDB pipeline for this. I am quite new to Mongo; I know it can be done using an aggregation pipeline with $lookup, but I can't write the pipeline.
The result I want:
{
firstCollection:{complete collection },
secondCollection:[
array of matching records from second collection where
secondelement.itemId in(records from array of firstcollection
where fields.inputmethod = "List" )
]
}
db.first.aggregate([
{
$match: {}
},
{
$project: {
firstCollection: "$$ROOT"
}
},
{
$lookup: {
"from": "second",
"localField": "firstCollection.fields.ListId",
"foreignField": "itemId",
"as": "secondCollection"
}
}
])
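The question asks only for cards with cardType = "card", while the pipeline above uses an empty $match. A possible variant, assuming the collection names first and second used above, adds that filter. When localField resolves to an array, $lookup matches each of its elements against foreignField, so no $unwind is needed. The variable name cardPipeline is illustrative only.

```javascript
// Sketch only: collection names "first" and "second" are taken from the answer above.
const cardPipeline = [
  // Keep only card documents, as the question requests.
  { $match: { cardType: "card" } },
  // Nest the whole card document under firstCollection.
  { $project: { firstCollection: "$$ROOT" } },
  // fields.ListId resolves to an array; each element is compared with itemId.
  {
    $lookup: {
      from: "second",
      localField: "firstCollection.fields.ListId",
      foreignField: "itemId",
      as: "secondCollection",
    },
  },
];

// In mongosh: db.first.aggregate(cardPipeline)
console.log(JSON.stringify(cardPipeline, null, 2));
```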

How to find documents according to a common field value from another collection in mongodb

Assume I have 2 collections:
student:
{name: Joe, school: A}
{name: Kelly, school: B}
{name: Mike, school: C}
{name: Tom, school: D}
schoolRank: (all the school ranks are stored in one document)
{rank: [{school: A, value: 1},{school: B, value: 2},{school: C, value: 3},{school: D, value: 4}]}
Now, my question is how I could find the students whose school rank is higher than 3. (I am a newbie to MongoDB. It seems like I need to use $lookup, but I am not sure how to do it exactly.) Thank you in advance!
You need to use $lookup. It is like a "join" in SQL.
But first of all, your schema could be much better. The schoolRank collection could store each school in its own document instead of one array with all the values.
Compare the query against your schema with the query against a schema where schoolRank is split into separate documents.
With the split schema, the query returns only the document whose school field matches. With your schema, it returns the entire array for each student, because every student's school also exists in the rank array.
So, with your schema you need extra stages. There may be a more efficient way, but I'm not used to doing $lookup with a bad schema (sorry).
I've tried this query:
First, a $lookup joins both collections (as I said before, the join basically adds the entire array to each document).
Then an extra stage uses $set to take the element at the first position of the $lookup result.
After that, a $project stage filters the rank_school field and overwrites it, keeping only the element whose school field equals the student's school.
Note that the above steps could be omitted with another schema.
Then, after the $project, a $match stage keeps the documents whose rank_school.value is greater than or equal to 3.
The last stage is another $project that removes the rank_school field.
This is the query:
db.student.aggregate([
{
"$lookup": {
"from": "schoolRank",
"localField": "school",
"foreignField": "rank.school",
"as": "rank_school"
}
},
{
"$set": { "rank_school": { "$arrayElemAt": [ "$rank_school", 0 ] } }
},
{
"$project": {
"_id": "$_id",
"name": "$name",
"school": "$school",
"rank_school": {
"$filter": {
"input": "$rank_school.rank",
"as": "rank_school_filter",
"cond": { "$eq": [ "$$rank_school_filter.school", "$school" ] }
}
}
}
},
{
"$match": { "rank_school.value": { "$gte": 3 } }
},
{
"$project": { "rank_school": 0 }
}
])
And the output is:
[
{
"_id": ObjectId("5a934e000102030405000003"),
"name": "Mike",
"school": "C"
},
{
"_id": ObjectId("5a934e000102030405000004"),
"name": "Tom",
"school": "D"
}
]
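As noted above, the extra stages disappear if each school's rank lives in its own document. A minimal sketch, assuming a hypothetical collection named schoolRankBySchool with documents such as { school: "A", value: 1 } (this collection and the variable name are not part of the question):

```javascript
// Sketch only: "schoolRankBySchool" is a hypothetical collection holding
// one document per school, e.g. { school: "C", value: 3 }.
const splitSchemaPipeline = [
  // Join each student with the single rank document of their school.
  {
    $lookup: {
      from: "schoolRankBySchool",
      localField: "school",
      foreignField: "school",
      as: "rank_school",
    },
  },
  // Keep students whose joined rank value is at least 3.
  { $match: { "rank_school.value": { $gte: 3 } } },
  // Remove the helper field from the output.
  { $project: { rank_school: 0 } },
];

// In mongosh: db.student.aggregate(splitSchemaPipeline)
console.log(JSON.stringify(splitSchemaPipeline, null, 2));
```

With this layout the $set and $filter stages are no longer needed, because $lookup already returns only the matching school.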

Mongo DB $lookup with no common value to match

I am new to MongoDB and am trying to join two collections that have no common value.
I have two collections.
collection 1 : Role
Fields : Role, UserName
collection 2 :mysite
Fields : userName ,userEmail
In collection 1:
eg :
{
'Role' : "admin",
'UserName' : "abc.efg"
}
In collection 2:
eg:
{
'userName' : "abc Mr, efg",
'userEmail' : "abc.efg#company.com"
}
The username values differ in format, so I am looking for a way to join these two collections.
Is there any way to merge these two collections?
Kindly help with this.
To perform uncorrelated subqueries between two collections as well as allow other join conditions besides a single equality match, the $lookup stage has the following syntax:
{
$lookup:
{
from: <collection to join>,
let: { <var_1>: <expression>, …, <var_n>: <expression> },
pipeline: [ <pipeline to execute on the collection to join> ],
as: <output array field>
}
}
You can use the following aggregation query using the above $lookup syntax:
db.Role.aggregate([
{
"$lookup": {
"from": "mysite",
let: {
"userName": "$UserName"
},
"pipeline": [
{
$match: {
"$expr": {
"$ne": [
{
"$indexOfCP": [
"$userEmail",
"$$userName"
]
},
-1
]
}
}
}
],
"as": "mysite"
}
},
{
"$unwind": "$mysite"
},
])
$indexOfCP searches a string for an occurrence of a substring and returns the index of the first occurrence. If the substring is not found, it returns -1.
So, in the stage below, it checks whether the UserName substring is present in userEmail: if not, it returns -1; if present, it returns the index at which the substring is located.
Hence, by using $expr with $ne -1, the stage matches all documents in which the UserName substring is present in userEmail and ignores the documents in which it is not.
{
$match: {
"$expr": {
"$ne": [
{
"$indexOfCP": [
"$userEmail",
"$$userName"
]
},
-1
]
}
}
}
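The same containment test can be tried in plain JavaScript, where String.prototype.indexOf also returns -1 when the substring is absent. This is only an in-memory sketch: the second mysite row is hypothetical, and indexOf counts UTF-16 code units rather than code points, which does not matter for a found/not-found check.

```javascript
const roles = [{ Role: "admin", UserName: "abc.efg" }];
const mysite = [
  { userName: "abc Mr, efg", userEmail: "abc.efg#company.com" },
  { userName: "xyz Ms, uvw", userEmail: "xyz.uvw#company.com" }, // hypothetical extra row
];

// For each role, keep the mysite documents whose userEmail contains UserName,
// and emit one output document per match (like $lookup followed by $unwind).
const joinedRoles = roles.flatMap((role) =>
  mysite
    .filter((site) => site.userEmail.indexOf(role.UserName) !== -1)
    .map((site) => ({ ...role, mysite: site }))
);

console.log(joinedRoles);
```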

Link each element of array in a document to the corresponding element in an array of another document with MongoDB

Using MongoDB 4.2 and MongoDB Atlas to test aggregation pipelines.
I've got this products collection, containing documents with this schema:
{
"name": "TestProduct",
"relatedList": [
{id:ObjectId("someId")},
{id:ObjectId("anotherId")}
]
}
Then there's this cities collection, containing documents with this schema :
{
"name": "TestCity",
"instructionList": [
{ related_id: ObjectId("anotherId"), foo: bar},
{ related_id: ObjectId("someId"), foo: bar}
{ related_id: ObjectId("notUsefulId"), foo: bar}
...
]
}
My objective is to join both collections to output something like this (the operation is picking each related object from the instructionList in the city document to put it into the relatedList of the product document) :
{
"name": "TestProduct",
"relatedList": [
{ related_id: ObjectId("someId"), foo: bar},
{ related_id: ObjectId("anotherId"), foo: bar},
]
}
I tried using the $lookup operator for aggregation like this :
$lookup:{
from: 'cities',
let: {rId:'$relatedList._id'},
pipeline: [
{
$match: {
$expr: {
$eq: ["$instructionList.related_id", "$$rId"]
}
}
},
]
}
But it's not working, I'm a bit lost with this complex pipeline syntax.
Edit
By using unwind on both arrays :
{
{$unwind: "$relatedList"},
{$lookup:{
from: "cities",
let: { "rId": "$relatedList.id" },
pipeline: [
{$unwind:"$instructionList"},
{$match:{$expr:{$eq:["$instructionList.related_id","$$rId"]}}},
],
as:"instructionList",
}},
{$group: {
_id: "$_id",
instructionList: {$addToSet:"$instructionList"}
}}
}
I am able to achieve what I want, however,
I'm not getting a clean result at all :
{
"name": "TestProduct",
instructionList: [
[
{
"name": "TestCity",
"instructionList": {
"related_id":ObjectId("someId")
}
}
],
[
{
"name": "TestCity",
"instructionList": {
"related_id":ObjectId("anotherId")
}
}
]
]
}
How can I group everything to be as clean as stated for my original question ?
Again, I'm completely lost with the Aggregation framework.
"The operation is picking each related object from the instructionList in the city document to put it into the relatedList of the product document."
Given an example document in the cities collection:
{"_id": ObjectId("5e4a22a08c54c8e2380b853b"),
"name": "TestCity",
"instructionList": [
{"related_id": "a", "foo": "x"},
{"related_id": "b", "foo": "y"},
{"related_id": "c", "foo": "z"}
]}
and an example document in the products collection:
{"_id": ObjectId("5e45cdd8e8d44a31a432a981"),
"name": "TestProduct",
"relatedList": [
{"id": "a"},
{"id": "b"}
]}
You can try the following aggregation pipeline:
db.products.aggregate([
{"$lookup":{
"from": "cities",
"let": { "rId": "$relatedList.id" },
"pipeline": [
{"$unwind":"$instructionList"},
{"$match":{
"$expr":{
"$in":["$instructionList.related_id", "$$rId"]
}
}
}],
"as":"relatedList",
}},
{"$project":{
"name":"$name",
"relatedList":{
"$map":{
"input":"$relatedList",
"as":"x",
"in":{
"related_id":"$$x.instructionList.related_id",
"foo":"$$x.instructionList.foo"
}
}
}
}}
]);
This gives a result like the following:
{ "_id": ObjectId("5e45cdd8e8d44a31a432a981"),
"name": "TestProduct",
"relatedList": [
{"related_id": "a", "foo": "x"},
{"related_id": "b", "foo": "y"}
]}
The above is tested in MongoDB v4.2.x.
"But it's not working, I'm a bit lost with this complex pipeline syntax."
The reason it's slightly complex here is that you have an array, relatedList, and also an array of subdocuments, instructionList. When you refer to instructionList.related_id (which could mean multiple values) with the $eq operator, the pipeline doesn't know which one to match.
In the pipeline above, I've added a $unwind stage to turn instructionList into multiple single documents. Afterward, $in expresses a match of a single instructionList.related_id value against the array relatedList.
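The effect of the $unwind, $in and $map steps can be sketched in plain JavaScript on the example documents above. This in-memory emulation only illustrates the logic; the variable names are hypothetical.

```javascript
const product = { name: "TestProduct", relatedList: [{ id: "a" }, { id: "b" }] };
const cities = [
  {
    name: "TestCity",
    instructionList: [
      { related_id: "a", foo: "x" },
      { related_id: "b", foo: "y" },
      { related_id: "c", foo: "z" },
    ],
  },
];

// Plays the role of $$rId: the array of ids found in relatedList.
const wantedIds = product.relatedList.map((r) => r.id);

const relatedList = cities
  .flatMap((city) => city.instructionList) // $unwind: one entry per instruction
  .filter((ins) => wantedIds.includes(ins.related_id)) // $in against the id array
  .map((ins) => ({ related_id: ins.related_id, foo: ins.foo })); // $map reshaping

const joinedProduct = { name: product.name, relatedList };
console.log(JSON.stringify(joinedProduct));
```

Note that the entries come out in instructionList order, not in relatedList order.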
I believe you just need to $unwind the arrays in order to lookup the relation, then $group to recollect them. Perhaps something like:
db.products.aggregate([
// One document per relatedList element
{$unwind:"$relatedList"},
{$lookup:{
from:"cities",
let:{rId:"$relatedList.id"},
pipeline:[
// related_id resolves to an array here, so test membership with $in
// (assumes every city document has an instructionList array)
{$match:{$expr:{$in:["$$rId", "$instructionList.related_id"]}}},
{$unwind:"$instructionList"},
{$match:{$expr:{$eq:["$instructionList.related_id", "$$rId"]}}},
{$project:{_id:0, instruction:"$instructionList"}}
],
as: "lookedup"
}},
// "$lookedup.0..." does not index an array in an expression; use $arrayElemAt
{$addFields: {"relatedList.foo":{$arrayElemAt:["$lookedup.instruction.foo", 0]}}},
{$group: {
_id:"$_id",
root: {$first:"$$ROOT"},
relatedList:{$push:"$relatedList"}
}},
{$addFields:{"root.relatedList":"$relatedList"}},
{$replaceRoot:{newRoot:"$root"}}
])
A little about each stage:
1. $unwind duplicates the entire document for each element of the array, replacing the array with the single element
2. $lookup can then consider each element separately. The stages in $lookup.pipeline:
a. $match so we only unwind the documents with a matching ID
b. $unwind the array so we can consider individual elements
c. repeat the $match so we are only left with matching elements (hopefully just 1)
3. $addFields assigns the foo field retrieved from the lookup to the object from relatedList
4. $group collects together all of the documents with the same _id (i.e. that were unwound from a single original document), stores the first as 'root', and pushes all of the relatedList elements back into an array
5. $addFields moves the relatedList into root
6. $replaceRoot returns the root, which should now be the original document with the matching foo added to each relatedList element
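The unwind-then-regroup round trip described above can be illustrated with a small plain-JavaScript sketch. The helpers unwind and regroup are hypothetical stand-ins for the $unwind stage and for the $group, $addFields and $replaceRoot tail; they are not MongoDB APIs.

```javascript
// Emulates $unwind: one copy of the document per array element,
// with the array replaced by that single element.
function unwind(doc, field) {
  return doc[field].map((element) => ({ ...doc, [field]: element }));
}

// Emulates the regrouping tail: take the first copy as the root and
// push every unwound element back into an array.
// Assumes all documents in docs came from the same original document.
function regroup(docs, field) {
  return { ...docs[0], [field]: docs.map((d) => d[field]) };
}

const original = { _id: 1, name: "TestProduct", relatedList: [{ id: "a" }, { id: "b" }] };
const unwound = unwind(original, "relatedList");
const restored = regroup(unwound, "relatedList");

console.log(unwound);
console.log(restored);
```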

How to sort by 'value' of a specific key within a property stored as an array with k-v pairs in mongodb

I have a mongodb collection, let's call it rows containing documents with the following general structure:
{
"setid" : 154421,
"date" : ISODate("2014-02-22T14:06:48.229Z"),
"version" : 2,
"data" : [
{
"k" : "name",
"v" : "ryan"
},
{
"k" : "points",
"v" : "375"
},
{
"k" : "email",
"v" : "ryan#123.com"
}
],
}
There is no guarantee what values of k and v might populate the "data" property for any particular document (e.g. other documents might have 5 k-v pairs with different key names). The only rule is that documents with the same setid have the same k-v pairs (i.e. the rows collection might hold 100 other documents with setid = 154421 that have the same set of 3 keys in the data property: "name", "points", "email", with their own respective values).
How would one, with this setup, construct a query to retrieve all rows with a particular setid sorted by points? I need, in effect, some way of saying 'sort by the field data.v where the value of k == points', or something like that.
Something like this:
db.rows.find({setid:154421},{$sort:{'data.v',-1}, {$where: k:'points'}}})
I know this is the incorrect syntax, but I'm just taking a stab at it to illustrate my point.
Is it possible?
Assuming that what you want is all the documents that have "points" as a "key" in the array, sorted on the "value" for that "key", then this is a little out of scope for the .find() method.
The reason is that if you did something like this:
db.collection.find({
"setid": 154421, "data.k": "points" }
).sort({ "data.v" : -1 })
The problem is that even though the matched documents do have the matching key of "points", there is no way of telling which data.v you are referring to for the sort. Also, a sort within .find() results will not do something like this:
db.collection.find({
"setid": 154421, "data.k": "points" }
).sort({ "data.$.v" : -1 })
That would be trying to use a positional operator within a sort, essentially telling it which element's v value to use. But this is not supported and is not likely to be; the most likely explanation is that the "index" value would probably be different in every document.
But this kind of selective sorting can be done with the use of .aggregate().
db.collection.aggregate([
// Actually shouldn't need the setid
{ "$match": { "data": {"$elemMatch": { "k": "points" } } } },
// Saving the original document before you filter
{ "$project": {
"doc": {
"_id": "$_id",
"setid": "$setid",
"date": "$date",
"version": "$version",
"data": "$data"
},
"data": "$data"
}},
// Unwind the array
{ "$unwind": "$data" },
// Match the "points" entries, so filtering to only these
{ "$match": { "data.k": "points" } },
// Sort on the value, presuming you want the highest
{ "$sort": { "data.v": -1 } },
// Restore the document
{ "$project": {
"setid": "$doc.setid",
"date": "$doc.date",
"version": "$doc.version",
"data": "$doc.data"
}}
])
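One caveat, offered as an observation rather than as part of the answer: in the sample document the points value is stored as the string "375", so sorting on data.v compares strings, not numbers. The plain-JavaScript sketch below shows the intended ordering in memory with an explicit numeric conversion; the sample rows and the helper name pointsOf are hypothetical.

```javascript
// Hypothetical rows sharing one setid; points are stored as strings.
const rows = [
  { setid: 154421, data: [{ k: "name", v: "ryan" }, { k: "points", v: "375" }] },
  { setid: 154421, data: [{ k: "name", v: "ann" }, { k: "points", v: "90" }] },
  { setid: 154421, data: [{ k: "name", v: "bob" }, { k: "points", v: "1200" }] },
];

// Numeric value of the "points" entry; callers filter out rows without one.
function pointsOf(row) {
  return Number(row.data.find((d) => d.k === "points").v);
}

const sortedRows = rows
  .filter((row) => row.data.some((d) => d.k === "points")) // like the $match stage
  .sort((a, b) => pointsOf(b) - pointsOf(a)); // descending, compared as numbers

const sortedNames = sortedRows.map((row) => row.data[0].v);
console.log(sortedNames);
```

Compared as strings, "90" would sort above "375" and "1200", which is why the conversion matters here.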
Of course that presumes the data array only has the one element that has the key points. If there were more than one, you would need to $group before the sort like this:
// Group to remove the duplicates and get highest
{ "$group": {
"_id": "$doc",
"value": { "$max": "$data.v" }
}},
// Sort on the value
{ "$sort": { "value": -1 } },
// Restore the document
{ "$project": {
"_id": "$_id._id",
"setid": "$_id.setid",
"date": "$_id.date",
"version": "$_id.version",
"data": "$_id.data"
}}
So there is one usage of .aggregate() in order to do some complex sorting on documents and still return the original document result in full.
Do some more reading on aggregation operators and the general framework. It's a useful tool to learn that takes you beyond .find().
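On servers that support $addFields and $filter, a shorter pipeline is possible that avoids $unwind by extracting the points entry into a temporary field. This is a sketch under that assumption, not the method of the answer above; the field name pointsEntry and the variable name are hypothetical, and the collection name rows comes from the question.

```javascript
const sortByPointsPipeline = [
  // Only documents of the wanted set that contain a "points" entry.
  { $match: { setid: 154421, "data.k": "points" } },
  // Copy the first element whose k is "points" into a temporary field.
  {
    $addFields: {
      pointsEntry: {
        $arrayElemAt: [
          { $filter: { input: "$data", as: "d", cond: { $eq: ["$$d.k", "points"] } } },
          0,
        ],
      },
    },
  },
  // Sort on that entry's value, highest first.
  { $sort: { "pointsEntry.v": -1 } },
  // Drop the temporary field so the original document shape is returned.
  { $project: { pointsEntry: 0 } },
];

// In mongosh: db.rows.aggregate(sortByPointsPipeline)
console.log(JSON.stringify(sortByPointsPipeline, null, 2));
```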