MongoDB - combining query across multiple collections - mongodb

I'm trying to figure out how to essentially do a join in MongoDB. I've read about doing aggregates: https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/, but that doesn't seem to be what I'm looking for.
I'm also very new to NoSQL, so I'm not exactly sure what I should be using here.
I have two collections in MongoDB, structured as follows:
db collection - employees:
{
_id: 1,
name: 'John Doe',
filesAccess: {
web: true
},
fileIds: [
'fileId1',
'fileId2'
]
},
{
_id: 2,
name: 'Bob Jones',
filesAccess: {
web: false
},
fileIds: [
'fileId3',
'fileId4'
]
}
db collection - files:
{
_id: fileId1,
fileMetaData: {
location: 'NE'
}
},
{
_id: fileId2,
fileMetaData: {
location: 'NE'
}
},
{
_id: fileId3,
fileMetaData: {
location: 'SW'
}
},
{
_id: fileId4,
fileMetaData: {
location: 'SW'
}
}
I want to be able to query for all employees who have filesAccess.web = true and get their employee ID, name, and fileMetaData.location. The location for all of an employee's files will be the same, so the query only needs to use the first fileId from an employee to get the location from the files collection.
I'd like my result to look like:
{
_id: 1,
name: 'John Doe',
location: 'NE'
}
How would you structure a query to accomplish this in MongoDB? I'm using Studio3T as my interface to the db. Any help is greatly appreciated

You can use this aggregation query:
First, $match keeps only the documents where filesAccess.web is true.
Then $lookup joins the two collections by matching the values in fileIds against _id in files. This produces an array called result.
Then $set with $arrayElemAt keeps only the first element of that array.
Finally, $project outputs the fields you want.
db.employees.aggregate([
{
"$match": {
"filesAccess.web": true
}
},
{
"$lookup": {
"from": "files",
"localField": "fileIds",
"foreignField": "_id",
"as": "result"
}
},
{
"$set": {
"result": {
"$arrayElemAt": [
"$result",
0
]
}
}
},
{
"$project": {
"_id": 1,
"name": 1,
"location": "$result.fileMetaData.location"
}
}
])
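To see what each stage contributes, the pipeline can be mimicked in plain JavaScript over in-memory copies of the two sample collections. This is only an illustrative sketch of the logic, not how MongoDB executes the query, and the helper name employeesWithLocation is made up for the example.

```javascript
// In-memory copies of the sample documents from the question.
const employees = [
  { _id: 1, name: 'John Doe', filesAccess: { web: true }, fileIds: ['fileId1', 'fileId2'] },
  { _id: 2, name: 'Bob Jones', filesAccess: { web: false }, fileIds: ['fileId3', 'fileId4'] },
];
const files = [
  { _id: 'fileId1', fileMetaData: { location: 'NE' } },
  { _id: 'fileId2', fileMetaData: { location: 'NE' } },
  { _id: 'fileId3', fileMetaData: { location: 'SW' } },
  { _id: 'fileId4', fileMetaData: { location: 'SW' } },
];

// Hypothetical helper mirroring $match -> $lookup -> $arrayElemAt -> $project.
function employeesWithLocation(employeeDocs, fileDocs) {
  return employeeDocs
    .filter((e) => e.filesAccess.web === true) // $match
    .map((e) => {
      // $lookup: every file whose _id appears in the employee's fileIds
      const result = fileDocs.filter((f) => e.fileIds.includes(f._id));
      // $set with $arrayElemAt: keep only the first joined file
      const first = result[0];
      // $project: pick the output fields; location is omitted if nothing joined
      const out = { _id: e._id, name: e.name };
      if (first !== undefined) out.location = first.fileMetaData.location;
      return out;
    });
}

console.log(employeesWithLocation(employees, files));
// [ { _id: 1, name: 'John Doe', location: 'NE' } ]
```

Because every file of an employee shares one location, it does not matter which joined file ends up first in the array.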

Related

Add number field in $project mongodb

I need to add an index number to each document when I fetch data. For example, I have this data:
[
{
_id : 616efd7e56c9530018e318ac
student : {
name: "Alpha"
email: null
nisn: "0408210001"
gender : "female"
}
},
{
_id : 616efd7e56c9530018e318af
student : {
name: "Beta"
email: null
nisn: "0408210001"
gender : "male"
}
}
]
and I need output like this:
[
{
no:1,
id:616efd7e56c9530018e318ac,
name: "Alpha",
nisn: "0408210001"
},
{
no:2,
id:616efd7e56c9530018e318af,
name: "Beta",
nisn: "0408210002"
}
]
I have tried this code, which gets almost what I expected:
{
'$project': {
'_id': 0,
'id': '$_id',
'name': '$student.name',
'nisn': '$student.nisn'
}
}
but I am still confused about how to add the index number. Is it possible to do it in $project, or do I have to do it another way? Thank you for taking the time to answer.
You can use $unwind which can return an index, like this:
db.collection.aggregate([
{
$group: {
_id: 0,
data: {
$push: {
_id: "$_id",
student: "$student"
}
}
}
},
{
$unwind: {path: "$data", includeArrayIndex: "no"}
},
{
"$project": {
"_id": 0,
"id": "$data._id",
"name": "$data.student.name",
"nisn": "$data.student.nisn",
"no": {"$add": ["$no", 1] }
}
}
])
I strongly suggest adding a $match stage before these stages; otherwise you will group your entire collection into one document.
Alternatively, on MongoDB 5.0 or later you can run a pipeline with a $setWindowFields stage, which lets you add a new field holding the position of each document (its document number) within a partition. The position is produced by the $documentNumber operator, which is only available in the $setWindowFields stage.
The partition can be an extra constant field that acts as the window partition.
The final stage in the pipeline is $replaceWith, which promotes the fields of the embedded student document to the top level and replaces each input document with the merged document.
Running the following aggregation will yield the desired results:
db.collection.aggregate([
{ $addFields: { _partition: 'students' }},
{ $setWindowFields: {
partitionBy: '$_partition',
sortBy: { _id: 1 },
output: { no: { $documentNumber: {} } }
} },
{ $replaceWith: {
$mergeObjects: [
{ id: '$_id', no: '$no' },
'$student'
]
} }
])
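The numbering and merging steps can be pictured in plain JavaScript. The sketch below is not MongoDB code; the helper name numberStudents is made up, and it assumes an ascending sort on _id so that Alpha receives number 1 as in the requested output.

```javascript
// Sample documents from the question, with _id kept as a plain hex string.
const students = [
  { _id: '616efd7e56c9530018e318ac', student: { name: 'Alpha', email: null, nisn: '0408210001', gender: 'female' } },
  { _id: '616efd7e56c9530018e318af', student: { name: 'Beta', email: null, nisn: '0408210001', gender: 'male' } },
];

// Hypothetical helper: sort, number from 1 (like $documentNumber), then merge
// the embedded student fields into the top level (like $replaceWith + $mergeObjects).
function numberStudents(docs) {
  return [...docs]
    .sort((a, b) => (a._id < b._id ? -1 : a._id > b._id ? 1 : 0)) // ascending _id (assumption)
    .map((doc, index) => ({ id: doc._id, no: index + 1, ...doc.student }));
}

for (const row of numberStudents(students)) {
  console.log(row.no, row.id, row.name, row.nisn);
}
```

As in the pipeline, merging the whole student object also carries email and gender into the output; a final $project would be needed to drop them.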

MongoDB - Unable to add timestamp fields to subdocuments in an array

I recently updated my subschemas (called Courses) to have timestamps and am trying to backfill existing documents to include createdAt/updatedAt fields.
Courses are stored in an array called courses in the user document.
// User document example
{
name: "Joe John",
age: 20,
courses: [
{
_id: <id here>,
name: "Intro to Geography",
units: 4
} // Trying to add timestamps to each course
]
}
I would also like to derive the createdAt field from the Course's Mongo ID.
This is the code I'm using to attempt adding the timestamps to the subdocuments:
db.collection('user').updateMany(
{
'courses.0': { $exists: true },
},
{
$set: {
'courses.$[elem].createdAt': { $toDate: 'courses.$[elem]._id' },
},
},
{ arrayFilters: [{ 'elem.createdAt': { $exists: false } }] }
);
However, after running the code, no fields are added to the Course subdocuments.
I'm using mongo ^4.1.1 and mongoose ^6.0.6.
Any help would be appreciated!
Using aggregation operators and referencing the value of another field in an update statement requires using the pipeline form of update, which is not available until MongoDB 4.2.
Once you upgrade, you could use an update like this:
db.collection.updateMany({
"courses": {$elemMatch: {
_id:{$exists:true},
createdAt: {$exists: false}
}}
},
[{$set: {
"courses": {
$map: {
input: "$courses",
in: {
$mergeObjects: [
{createdAt: {
$convert: {
input: "$$this._id",
to: "date",
onError: {"error": "$$this._id"}
}
}},
"$$this"
]
}
}
}
}
}
])
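Converting an ObjectId to a date works because its first 4 bytes store the creation time as seconds since the Unix epoch. The plain JavaScript sketch below shows that conversion and the merge order used by the update above; the function names and the sample hex id are illustrative assumptions, not part of the original code.

```javascript
// An ObjectId's first 4 bytes (8 hex characters) are a Unix timestamp in seconds.
function objectIdToDate(hexId) {
  return new Date(parseInt(hexId.slice(0, 8), 16) * 1000);
}

// Mirrors $map + $mergeObjects: createdAt is listed first, so a course that
// already has its own createdAt keeps that value.
function backfillCreatedAt(courseList) {
  return courseList.map((course) => ({ createdAt: objectIdToDate(course._id), ...course }));
}

// Example course with a made-up hex id.
const courses = [
  { _id: '616efd7e56c9530018e318ac', name: 'Intro to Geography', units: 4 },
];

console.log(backfillCreatedAt(courses)[0].createdAt.toISOString());
// 2021-10-19T17:16:46.000Z
```

Putting the derived createdAt before "$$this" in $mergeObjects is what makes the update safe to re-run: existing timestamps win over derived ones.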

Mongodb lookup like search: local field as array of objects

I have two collections, userProfile and skills.
E.g., userProfile:
{
"_id": "5f72c6d4e23732390c96b031",
"name":"name",
"other_skills": [
"1","2"
],
"primary_skills": [
{
"_id": "607ffd1549e13876fef7f2c5",
"years": 4.5,
"skill_id": "1"
},
{
"_id": "607ffd1549e13876fef7f2c6",
"years": 2,
"skill_id": "2"
},
{
"_id": "607ffd1549e13876fef7f2c7",
"years": 1,
"skill_id": "3"
}
]
}
E.g., skills:
{
"_id":1,
"name": "Ruby on Rails",
}
{
"_id":2,
"name": "PHP",
}
{
"_id":3,
"name": "php",
}
I want to retrieve the user profiles based on skills.
E.g., given the skill php as input, I want to retrieve the user profiles that match it in either primary_skills or other_skills.
I got confused about the implementation. I think it can be done with a pipeline in $lookup and $elemMatch. This is the query I have tried so far:
const skills = ['php','PHP']
userProfile.aggregate([{
$lookup:{
from:'skills',
let:{'primary_skills':'$primary_skills'},
pipeline:[
{
$match:{
primary_skills:{
$elemMatch:{
name:'' //not sure how to write match
}
}
}
}
]
}
}])
Can somebody help me with this? Thanks in advance.
I'll first show you how to correct your pipeline so that it works. However, this approach is very inefficient, because it has to run a $lookup for every single user in your database, which is a lot of overhead.
Here is how to properly match your condition:
const skills = ['php','PHP']
db.userProfile.aggregate([
{
$lookup: {
from: "skills",
let: {
"primary_skills": {
$map: {
input: "$primary_skills",
as: "skill",
in: "$$skill.skill_id"
}
},
"other_skills": "$other_skills"
},
pipeline: [
{
$match: {
$expr: {
"$in": [
"$_id",
{
"$concatArrays": [
"$$other_skills",
"$$primary_skills"
]
}
]
}
}
}
],
as: "skills"
}
},
{
$match: {
'skills.name': {$in: skills}
}
}
])
As I've said, I recommend you do not do this. Instead, split it into two calls: first fetch the relevant skill ids, then query the users.
By doing this you can also utilize indexes for much faster queries, like so:
const skillNames = ['php', 'PHP'];
// `skills` below is the skills collection handle, so the name list gets its own variable.
const matchedSkillIds = await skills.distinct('_id', {name: {$in: skillNames}});
const users = await userProfile.find({
$or: [
{
'primary_skills.skill_id': {$in: matchedSkillIds}
},
{
'other_skills': {$in: matchedSkillIds}
}
]
})
Finally, if you do insist on doing it in one query, at the very least start the pipeline from the skills collection.
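The two-call approach boils down to "names to ids, then ids to profiles". The following plain JavaScript sketch runs that logic over in-memory arrays; it is not driver code, the helper name findProfilesBySkillNames and the second profile are made up, and it assumes skill ids are stored as strings on both sides (in the sample data above, skills._id is a number while skill_id is a string, which would not match).

```javascript
// Sample data; ids are strings on both sides (an assumption for this sketch).
const skillDocs = [
  { _id: '1', name: 'Ruby on Rails' },
  { _id: '2', name: 'PHP' },
  { _id: '3', name: 'php' },
];
const userProfiles = [
  {
    _id: 'u1',
    name: 'name',
    other_skills: ['1', '2'],
    primary_skills: [
      { years: 4.5, skill_id: '1' },
      { years: 2, skill_id: '2' },
      { years: 1, skill_id: '3' },
    ],
  },
  { _id: 'u2', name: 'other', other_skills: ['1'], primary_skills: [] }, // hypothetical second user
];

function findProfilesBySkillNames(skillNames) {
  // Step 1: skill names -> skill ids (the role of distinct('_id', ...)).
  const ids = new Set(
    skillDocs.filter((s) => skillNames.includes(s.name)).map((s) => s._id)
  );
  // Step 2: the $or over primary_skills.skill_id and other_skills.
  return userProfiles.filter(
    (p) =>
      p.primary_skills.some((ps) => ids.has(ps.skill_id)) ||
      p.other_skills.some((id) => ids.has(id))
  );
}

console.log(findProfilesBySkillNames(['php', 'PHP']).map((p) => p._id));
// [ 'u1' ]
```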

grouping result from multiple queries and aggregating result mongo

I'm new to Mongo and I have a document that has an array with the ids of all its related documents. I need to fetch the document together with all its related documents in a single query. For the moment I fetch the document and then query each of its related documents separately by id.
All my documents are in the same collection, documents_nodes, and look like this:
{
"id": "document_1",
"name": "flask",
"relateds": [
"document_2",
"document_3",
"document_4"
],
"parents": [
"document1_1"
]
}
The first query gets the initial document:
db.documents_nodes.find({id: "document_1"})
And then I query its related documents with a second query:
db.documents_nodes.aggregate([{
$match: {
$and: [{
id: {
$in: ["document_2", "document_3", "document_4"]
}
}]
}
}])
Is there a way to combine the two queries? I tried this, but it doesn't work:
db.documents_nodes.aggregate([
{
$match: {
uri: "https://opus.adeo.com/LMFR_prod/3206"
}
},
{
$addFields: {
newRelateds:
{
$match: {
id: {
$in: [ "$relateds" ]
}
}
}
}
}
])
"errmsg" : "Unrecognized expression '$match'",
"code" : 168,
"codeName" : "InvalidPipelineOperator"
I have found a way to do it, in case someone has the same need.
I used $unwind to flatten the relateds array, then $lookup to fetch the related documents by their ids, and finally $group to collect the results under a new key.
[{
$match: {
id: "document_1"
}
}, {
$unwind: {
path: '$relateds',
}
}, {
$lookup: {
from: 'documents_nodes',
localField: 'relateds',
foreignField: 'id',
as: 'newRelateds'
}
}, {
$group: {
_id: '$id',
relateds: {
'$push': '$newRelateds'
}
}
}]
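The shape of the grouped result is easier to see with a plain JavaScript sketch of the same steps over an in-memory array. This is an illustration only; the helper name fetchWithRelateds and the names given to the related documents are made up.

```javascript
// In-memory stand-in for the documents_nodes collection.
const documentsNodes = [
  { id: 'document_1', name: 'flask', relateds: ['document_2', 'document_3', 'document_4'] },
  { id: 'document_2', name: 'doc two', relateds: [] },   // hypothetical names
  { id: 'document_3', name: 'doc three', relateds: [] },
  { id: 'document_4', name: 'doc four', relateds: [] },
];

function fetchWithRelateds(nodes, rootId) {
  const root = nodes.find((n) => n.id === rootId); // $match
  // $unwind drops documents whose array is empty, so nothing reaches $group.
  if (root === undefined || root.relateds.length === 0) return null;
  const rows = root.relateds.map((relatedId) => ({ // $unwind: one row per related id
    id: root.id,
    // $lookup: always yields an array, even for a single match
    newRelateds: nodes.filter((n) => n.id === relatedId),
  }));
  // $group with $push: collects those one-element arrays into an array of arrays
  return { _id: root.id, relateds: rows.map((r) => r.newRelateds) };
}

const grouped = fetchWithRelateds(documentsNodes, 'document_1');
console.log(grouped.relateds.length); // 3
console.log(grouped.relateds.flat().map((n) => n.id));
// [ 'document_2', 'document_3', 'document_4' ]
```

Since each $lookup result is itself an array, relateds ends up as an array of one-element arrays, which the caller may want to flatten.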

MongoDB aggregate - filter by subdocument

I have a MongoDB collection structured like this:
[
{
name: "name1",
instances: [{value:1, score:2}, {value:2, score:5}, {value:2.5, score:9}]
},
{
name: "name2",
instances: [{value:6, score:3}, {value:1, score:6}, {value:3.7, score:5.2}]
}
]
When I want to get all the data from a document, I use aggregate because I want each instance returned as a separate document:
db.myCollection.aggregate([{$match:{name:"name1"}}, {$unwind:"$instances"}, {$project:{name:1, value:"$instances.value", score:"$instances.score"}}])
And everything works like I want it to.
Now for my question: I want to filter the returned data by score or by value. For example, I want an array of all the subdocuments of name1 which have a value greater than or equal to 2.
I tried to add to the $match object 'instances.value':{$gte:2}, but it didn't filter anything, and I still get all 3 documents for this query.
Any ideas?
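A $match placed before $unwind selects whole documents that have at least one matching instance, so every instance of such a document still comes through. Add a second $match after unwinding instances:
db.collectionName.aggregate([{
"$match": {
"name": "name1"
}
}, {
"$unwind": "$instances"
}, {
"$match": {
"instances.value": {
"$gte": 2
}
}
}, {
$project: {
name: 1,
value: "$instances.value",
score: "$instances.score"
}
}])
Or, if you put the $match after $project, filter on the projected field name instead: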
db.collectionName.aggregate([{
$match: {
name: "name1"
}
}, {
$unwind: "$instances"
}, {
$project: {
name: 1,
value: "$instances.value",
score: "$instances.score"
}
}, {
"$match": {
"value": {
"$gte": 2
}
}
}])
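Why the filter has to come after $unwind can be seen in a plain JavaScript sketch of the same steps over the sample data. It only illustrates the logic; the helper name instancesAtLeast is made up.

```javascript
// Sample collection from the question.
const myCollection = [
  { name: 'name1', instances: [{ value: 1, score: 2 }, { value: 2, score: 5 }, { value: 2.5, score: 9 }] },
  { name: 'name2', instances: [{ value: 6, score: 3 }, { value: 1, score: 6 }, { value: 3.7, score: 5.2 }] },
];

// Hypothetical helper mirroring $match -> $unwind -> $project -> $match.
function instancesAtLeast(docs, name, minValue) {
  return docs
    .filter((d) => d.name === name) // first $match: picks whole documents
    .flatMap((d) =>
      // $unwind + $project: one flat row per instance
      d.instances.map((i) => ({ name: d.name, value: i.value, score: i.score }))
    )
    .filter((row) => row.value >= minValue); // second $match: now filters individual rows
}

console.log(instancesAtLeast(myCollection, 'name1', 2));
// [ { name: 'name1', value: 2, score: 5 }, { name: 'name1', value: 2.5, score: 9 } ]
```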