MongoDB: find documents in a collection from a field in another collection

I have two collections: Sharing and Material.
Sharing:
{
from_id: 2
to_id: 1
material_id: material1
}
Material:
{
_id: material1
organization_id: 2
},
{
_id: material2
organization_id: 1
},
{
_id: material3
organization_id: 1
},
--Edit:
There are three materials, 2 belong to organization_id(1) and 1 belongs to organization_id(2). The organization_id does not match 1 in material1 (and instead belongs to material2), but in the Sharing collection, the to_id does match 1. If the match exists, I'd like to find the Material document _id which is equal to the material_id of Sharing AND find the Material documents where the organization_id is equal to 1.
I'd like to check if a field in Sharing (to_id) has a value that is equal to a field in Material (organization_id) AND check if organization_id is equal to 1. If there is a document that exists from this, do another check to find whether the _id of Material is equal to the material_id of Sharing and return all documents & the total count.
If there is no equal value, I'd like to omit that result and send the object with only organization_id equal to 1 and get the total count of this result.
Right now, I do it in a very inefficient way using .map() to find this. Below is my code:
export const getMaterials = async (req, res) => {
const sharing = await Sharing.find({to_id: 1});
let doneLoad;
try {
if (sharing && sharing.length>0) {
const sharingTotal = await Material.find( {$or: [ {organization_id: 1}, {_id: sharing.map((item) => item.material_id)} ] } ).countDocuments();
const sharingMats = await Material.find( {$or: [ {organization_id: 1}, {_id: sharing.map((item) => item.material_id)} ] } );
res.status(200).json({data: sharingMats});
doneLoad= true;
}
else if (!doneLoad) {
const materialTotal = await Material.find({organization_id: 1}).countDocuments();
const materials = await Material.find({organization_id: 1});
res.status(200).json({data: materials});
}
} catch (error) {
res.status(404).json({ message: error.message });
}
}
I have tried using aggregation to get my desired result but I cannot find any solution that fits my requirements. Any help would be great as I am quite new to using mongodb. Thanks.
Edit (desired result):
Materials: [
{
_id: material1,
organization_id: 1
},
{
_id: material2,
organization_id: 1
},
{
_id: material3,
organization_id: 1
}
]

You can use a sub-pipeline in a $lookup to perform the filtering, then use $addFields with $size to add the count.
db.Sharing.aggregate([
{
"$match": {
to_id: 1
}
},
{
"$lookup": {
"from": "Material",
"let": {
to_id: "$to_id",
material_id: "$material_id"
},
"pipeline": [
{
"$match": {
$expr: {
$or: [
{
$eq: [
"$$to_id",
"$organization_id"
]
},
{
$eq: [
"$$material_id",
"$_id"
]
}
]
}
}
},
{
"$addFields": {
"organization_id": 1
}
}
],
"as": "materialLookup"
}
},
{
"$addFields": {
"materialCount": {
$size: "$materialLookup"
}
}
}
])
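If you would rather keep the two-query approach from the question, the two branches can probably be collapsed into one filter. The sketch below is only an illustration: `buildMaterialFilter` is a hypothetical helper (not part of Mongoose), and it assumes the shared ids should be matched with $in rather than by passing a bare array as the value of _id.

```javascript
// Hypothetical helper: builds one filter usable for both the query and the count.
function buildMaterialFilter(sharingDocs, organizationId) {
  // Collect the material ids that were shared with this organization.
  const sharedIds = sharingDocs.map((item) => item.material_id);
  const own = { organization_id: organizationId };
  // With nothing shared, the plain organization filter is enough.
  if (sharedIds.length === 0) return own;
  // $in matches any _id from the list.
  return { $or: [own, { _id: { $in: sharedIds } }] };
}

// Sample Sharing document taken from the question.
const filter = buildMaterialFilter(
  [{ from_id: 2, to_id: 1, material_id: "material1" }],
  1
);
console.log(JSON.stringify(filter));
```

The same filter object could then be passed to both Material.find(filter) and Material.countDocuments(filter), so the branch on sharing.length and the doneLoad flag would no longer be needed.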

Related

get document with same 3 fields in a collection

I have a collection with more than 1000 documents, and some documents have the same value in some fields. I need to get those.
The collection is:
[{_id,fields1,fields2,fields3,etc...}]
What query can I use to get all the elements that have the same 3 fields? For example:
[
{_id:1,fields1:'a',fields2:1,fields3:'z'},
{_id:2,fields1:'a',fields2:1,fields3:'z'},
{_id:3,fields1:'f',fields2:2,fields3:'g'},
{_id:4,fields1:'f',fields2:2,fields3:'g'},
{_id:5,fields1:'j',fields2:3,fields3:'g'},
]
I need to get:
[
{_id:2,fields1:'a',fields2:1,fields3:'z'},
{_id:4,fields1:'f',fields2:2,fields3:'g'},
]
This way I can easily get a list of "duplicates" that I can delete if needed. It doesn't really matter whether I get ids 2 and 4 or 1 and 3, but 5 should never be included, as it's not duplicated.
EDIT:
Sorry, I forgot to mention that some documents have null values in these fields; I need to exclude those.
This is a good use case for window functions. You can use $setWindowFields (available from MongoDB 5.0) to compute $rank within the partition you want, then keep the documents whose rank is not 1 to get the duplicates.
db.collection.aggregate([
{
$match: {
fields1: {
$ne: null
},
fields2: {
$ne: null
},
fields3: {
$ne: null
}
}
},
{
"$setWindowFields": {
"partitionBy": {
fields1: "$fields1",
fields2: "$fields2",
fields3: "$fields3"
},
"sortBy": {
"_id": 1
},
"output": {
"duplicateRank": {
"$rank": {}
}
}
}
},
{
$match: {
duplicateRank: {
$ne: 1
}
}
},
{
$unset: "duplicateRank"
}
])
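To make the ranking idea concrete, here is a plain JavaScript sketch of the same logic run on the sample data from the question. It does not talk to MongoDB; `findDuplicates` is a hypothetical helper, and it assumes numeric _id values so the sort mirrors sortBy: { _id: 1 }.

```javascript
// Hypothetical in-memory equivalent of "rank within partition, keep rank != 1".
function findDuplicates(docs, keys) {
  const seen = new Set();
  const duplicates = [];
  // Sort by _id so the first document of each partition gets "rank 1".
  const sorted = [...docs].sort((a, b) => a._id - b._id);
  for (const doc of sorted) {
    // Skip documents with a null or missing key, like the leading $match.
    if (keys.some((k) => doc[k] === null || doc[k] === undefined)) continue;
    const partition = JSON.stringify(keys.map((k) => doc[k]));
    if (seen.has(partition)) {
      duplicates.push(doc); // rank greater than 1: a duplicate
    } else {
      seen.add(partition); // rank 1: the document that is kept
    }
  }
  return duplicates;
}

const docs = [
  { _id: 1, fields1: "a", fields2: 1, fields3: "z" },
  { _id: 2, fields1: "a", fields2: 1, fields3: "z" },
  { _id: 3, fields1: "f", fields2: 2, fields3: "g" },
  { _id: 4, fields1: "f", fields2: 2, fields3: "g" },
  { _id: 5, fields1: "j", fields2: 3, fields3: "g" },
];
const duplicateIds = findDuplicates(docs, ["fields1", "fields2", "fields3"]).map((d) => d._id);
console.log(duplicateIds); // [ 2, 4 ]
```

If a group contained three identical documents, this approach would report the second and third, which is what you want when the result is used as a deletion list.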
I think you can try this aggregation query:
First, group by the fields you want to check for repeated values.
This creates an array with the _ids that share those values.
Then keep only the groups with more than one _id ($match).
Finally, project to get the desired output. I've used the first _id found.
db.collection.aggregate([
{
"$group": {
"_id": {
"fields1": "$fields1",
"fields2": "$fields2",
"fields3": "$fields3"
},
"duplicatesIds": {
"$push": "$_id"
}
}
},
{
"$match": {
"$expr": {
"$gt": [
{
"$size": "$duplicatesIds"
},
1
]
}
}
},
{
"$project": {
"_id": {
"$arrayElemAt": [
"$duplicatesIds",
0
]
},
"fields1": "$_id.fields1",
"fields2": "$_id.fields2",
"fields3": "$_id.fields3"
}
}
])

How to update a mongodb document depending on values of document referenced by its objectId

How can I update a MongoDB document depending on the values of a document referenced by its ObjectId? (I am using MongoDB via Mongoose.)
Let's assume I have two collections. One is called competitions and the other one is called games. A competition can have several games in it. See code example below
// competition documents
[
{
compeititionName:"myCompetition",
games:["617...b16", "617...b19", "617...b1c"],
competitionStatus:"notStarted",
},
{
compeititionName:"yourCompetition",
games:["617...b18", "617...b19", "617...b1c"],
competitionStatus:"playing",
},
{
compeititionName:"ourCompetition",
games:["617...b14", "617...b19", "617...b2b"],
competitionStatus:"ended",
}
]
The competitionStatus above is dependent on the status of the games in that competition.
If none of the games have started, the competition should have notStarted as its competitionStatus. However, if any of the games is being played, or if some games have not started while others are complete, the competition status should be playing. Finally, if all the games have ended, the competition status should be ended. An example of how the games collection would look is:
// game documents
[
{
_id:"617...b16",
gameStatus:"notStarted"
},
{
_id:"617...b18",
gameStatus:"playing"
},
{
_id:"617...b14",
gameStatus:"ended"
},
]
How can I update the competitionStatus given the _id of the game whose status has just changed?
Since you are using Mongoose, first fetch the competition you want to update:
const competition = await CompetitionModel.findOne({ games: _id_of_the_game });
Then aggregate the distinct statuses of all its games:
// awaiting Model.aggregate() resolves to a plain array, so no toArray() is needed
const statuses = await GameModel.aggregate([
{ $match: { _id: { $in: competition.games } } },
{ $group: { _id: "$gameStatus" } }
]);
Then apply your business logic to set the status:
if (statuses.length === 1) { // all games have the same status
if (statuses[0]._id === "notStarted") {
competition.competitionStatus = "notStarted";
} else if (statuses[0]._id === "ended") {
competition.competitionStatus = "ended";
} else {
competition.competitionStatus = "playing";
}
} else { // mixed statuses
competition.competitionStatus = "playing";
}
Then save it to the db:
await competition.save();
Please bear in mind that this pseudo-code is prone to race conditions: if games change status between aggregate() and save(), you may end up with a stale status in the competition document. You may want to add extra queries to ensure data consistency if required.
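The business rule itself can be kept in a small pure function, which makes it easy to test without a database. This is only a sketch: `deriveCompetitionStatus` is a hypothetical name, and returning "notStarted" for a competition with no games is an assumption, since the question does not cover that case.

```javascript
// Hypothetical helper: derives the competition status from its games' statuses.
function deriveCompetitionStatus(gameStatuses) {
  // Assumption: a competition without games counts as not started.
  if (gameStatuses.length === 0) return "notStarted";
  const distinct = new Set(gameStatuses);
  // All games share one status: the competition takes that status.
  if (distinct.size === 1) return gameStatuses[0];
  // Any mix of statuses means the competition is in progress.
  return "playing";
}

console.log(deriveCompetitionStatus(["notStarted", "notStarted"])); // notStarted
console.log(deriveCompetitionStatus(["notStarted", "ended"])); // playing
console.log(deriveCompetitionStatus(["ended", "ended"])); // ended
```

With the aggregation above, the input would be the list of _id values from the grouped statuses.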
UPDATE
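If a game can belong to more than one competition, doing this through Mongoose will be quite inefficient. Starting from v4.2 you can use the $merge aggregation stage to do all calculations on the database side and update the matched documents (merging back into the same collection that is being aggregated, as done here, requires v4.4+):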
db.competition.aggregate([
{
$match: {
games: "id_of_the_game"
}
},
{
"$lookup": {
from: "games",
let: {
g: "$games"
},
pipeline: [
{
$match: {
$expr: {
$in: [
"$_id",
"$$g"
]
}
}
},
{
$group: {
_id: "$gameStatus"
}
}
],
"as": "statuses"
}
},
{
$set: {
competitionStatus: {
"$cond": {
"if": {
"$gt": [
{
"$size": "$statuses"
},
1
]
},
"then": {
_id: "playing"
},
"else": {
"$arrayElemAt": [
"$statuses",
0
]
}
}
}
}
},
{
"$project": {
competitionStatus: "$competitionStatus._id"
}
},
{
"$merge": {
"into": "competition"
}
}
])

MongoDB: How to speed up my data reorganisation query/operation?

I'm trying to analyse some data and I thought my queries would be faster ultimately by storing a relationship between my collections instead. So I wrote something to do the data normalisation, which is as follows:
var count = 0;
db.Interest.find({'PersonID':{$exists: false}, 'Data.DateOfBirth': {$ne: null}})
.toArray()
.forEach(function (x) {
if (null != x.Data.DateOfBirth) {
var peep = { 'Name': x.Data.Name, 'BirthMonth' :x.Data.DateOfBirth.Month, 'BirthYear' :x.Data.DateOfBirth.Year};
var person = db.People.findOne(peep);
if (null == person) {
peep._id = db.People.insertOne(peep).insertedId;
//print(peep._id);
}
db.Interest.updateOne({ '_id': x._id }, {$set: { 'PersonID':peep._id }})
++count;
if ((count % 1000) == 0) {
print(count + ' updated');
}
}
})
This script is just passed to mongo.exe.
Basically, I attempt to find an existing person, if they don't exist create them. In either case, link the originating record with the individual person.
However this is very slow! There's about 10 million documents and at the current rate it will take about 5 days to complete.
Can I speed this up simply? I know I can multithread it to cut it down, but have I missed something?
To insert new persons into the People collection, use this pipeline:
db.Interest.aggregate([
{
$project: {
Name: "$Data.Name",
BirthMonth: "$Data.DateOfBirth.Month",
BirthYear: "$Data.DateOfBirth.Year",
_id: 0
}
},
{
$merge: {
into: "People",
// requires a unique index on {Name: 1, BirthMonth: 1, BirthYear: 1}
on: ["Name", "BirthMonth", "BirthYear"]
}
}
])
For updating PersonID in Interest collection use this pipeline:
db.Interest.aggregate([
{
$lookup: {
from: "People",
let: {
name: "$Data.Name",
month: "$Data.DateOfBirth.Month",
year: "$Data.DateOfBirth.Year"
},
pipeline: [
{
$match: {
$expr: {
$and: [
{ $eq: ["$Name", "$$name"] },
{ $eq: ["$BirthMonth", "$$month"] },
{ $eq: ["$BirthYear", "$$year"] }
]
}
}
},
{ $project: { _id: 1 } }
],
as: "interests"
}
},
{
$set: {
PersonID: { $first: "$interests._id" },
interests: "$$REMOVE"
}
},
{ $merge: { into: "Interest" } }
])
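If the server-side pipelines above are not an option, a different technique that may still help is batching: the original script pays one round trip per findOne, insertOne and updateOne, whereas bulkWrite sends many operations at once. The sketch below only builds the operations. `personKey` and `buildLinkOps` are hypothetical helpers, and the Map from person key to People _id is assumed to have been loaded by the caller beforehand.

```javascript
// Hypothetical helper: a key that identifies one person.
function personKey(interest) {
  const dob = interest.Data.DateOfBirth;
  return JSON.stringify([interest.Data.Name, dob.Month, dob.Year]);
}

// Builds bulkWrite operations for a batch of Interest documents, given a
// Map of personKey -> People _id that the caller has already loaded.
function buildLinkOps(interests, personIdByKey) {
  const ops = [];
  for (const interest of interests) {
    const personId = personIdByKey.get(personKey(interest));
    if (personId === undefined) continue; // person not created yet, skip for now
    ops.push({
      updateOne: {
        filter: { _id: interest._id },
        update: { $set: { PersonID: personId } },
      },
    });
  }
  return ops;
}

// Made-up sample data, for illustration only.
const sampleInterests = [
  { _id: 1, Data: { Name: "Ann", DateOfBirth: { Month: 5, Year: 1980 } } },
  { _id: 2, Data: { Name: "Bob", DateOfBirth: { Month: 7, Year: 1975 } } },
];
const knownPeople = new Map([[JSON.stringify(["Ann", 5, 1980]), "p1"]]);
const ops = buildLinkOps(sampleInterests, knownPeople);
console.log(ops.length); // 1
```

The resulting array could be passed to db.Interest.bulkWrite(ops, { ordered: false }) in batches of, say, a thousand documents.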

mongoose updateOne function: don't update if $pull didn't work

I'm using updateOne method like this:
Photo.updateOne(
{
"_id": photoId
},
{
"$pull": {
comments: {
_id: ObjectID(commentId),
"user.id": user.id
}
},
"$inc": { "commentCount": -1 },
},
)
The Photo model contains comments as an array and commentCount as a number. The code works, but if the photo doesn't contain the comment I'm trying to pull, it still decrements commentCount by 1. What I want is: if no comment is pulled from the photo's comments, commentCount should not be updated either. How can I add this rule to my code?
Thanks for the help.
You can also add both conditions, comments._id and comments.user.id, to the query part. If the comment is not present, the document will not match, and both the $pull and the $inc are skipped.
Photo.updateOne(
{
_id: photoId,
comments: {
$elemMatch: {
_id: ObjectID(commentId),
"user.id": user.id
}
}
},
{
"$pull": {
comments: {
_id: ObjectID(commentId),
"user.id": user.id
}
},
"$inc": { "commentCount": -1 }
}
)
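With this query, the result of updateOne also tells you whether anything was removed. The helper below is a sketch with a hypothetical name, `commentWasRemoved`; it assumes the result object exposes modifiedCount (as in Mongoose 6 and later) or nModified (as in Mongoose 5), so check which one your version returns.

```javascript
// Hypothetical helper: true when the update actually changed a document.
function commentWasRemoved(result) {
  // Newer results expose modifiedCount, older ones nModified (assumption: one of the two).
  const modified = result.modifiedCount ?? result.nModified ?? 0;
  return modified > 0;
}

console.log(commentWasRemoved({ matchedCount: 1, modifiedCount: 1 })); // true
console.log(commentWasRemoved({ matchedCount: 0, modifiedCount: 0 })); // false
```

A caller could use this to return a "comment not found" response when no document matched the photo id, comment id and user id together.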
There is no such feature in MongoDB. If you're using MongoDB v4.2+, you can use a pipelined update instead: as the name suggests, it lets you use an aggregation pipeline within an update, so later stages can depend on the results of earlier ones.
Photo.updateOne(
{ "_id": photoId },
[
{
$set: {
comments: {
$filter: {
input: "$comments",
as: "comment",
cond: {
// keep a comment unless it matches BOTH the comment id and the user id
$or: [
{$ne: ["$$comment._id", ObjectID(commentId)]},
{$ne: ["$$comment.user.id", user.id]}
]
}
}
}
}
},
{
$set: {
commentCount: {$size: "$comments"}
}
}
]
)
For earlier versions you'll have to split it into two calls; there is no way around it.
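When emulating $pull with $filter, the condition describes which comments to keep, so it has to be the negation of "same comment id and same user", which is "different id or different user". The plain JavaScript sketch below illustrates this on made-up data; `keepComment` and `removeComment` are hypothetical helpers, and ids are plain strings here for simplicity.

```javascript
// Keep a comment unless BOTH the comment id and the user id match.
function keepComment(comment, commentId, userId) {
  return comment._id !== commentId || comment.user.id !== userId;
}

// In-memory counterpart of the $filter stage.
function removeComment(comments, commentId, userId) {
  return comments.filter((c) => keepComment(c, commentId, userId));
}

const comments = [
  { _id: "c1", user: { id: "u1" } },
  { _id: "c2", user: { id: "u1" } },
  { _id: "c3", user: { id: "u2" } },
];

// User u1 deletes comment c1: c2 (same user) and c3 (other user) remain.
const remainingIds = removeComment(comments, "c1", "u1").map((c) => c._id);
console.log(remainingIds); // [ 'c2', 'c3' ]
```

Joining the two inequality checks with "and" instead would also drop c2, because it belongs to the same user.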
-------------- EDIT ---------------
You can update the query to find the document using $elemMatch. If no document is found, the comment does not exist or belongs to someone else, and you can throw an error in that case.
Photo.updateOne(
{
_id: photoId,
comments: {
$elemMatch: {
_id: ObjectID(commentId),
"user.id": user.id
}
}
},
{
"$pull": {
comments: {
_id: ObjectID(commentId),
"user.id": user.id
}
},
"$inc": { "commentCount": -1 }
}
)

Aggregate and reduce a nested array based upon an ObjectId

I have an Event document structured like so and I'm trying to query against the employeeResponses array to gather all responses (which may or may not exist) for a single employee:
[
{
...
eventDate: 2019-10-08T03:30:15.000+00:00,
employeeResponses: [
{
_id:"5d978d372f263f41cc624727",
response: "Available to work.",
notes: ""
},
...etc
]
}
];
My current mongoose aggregation is:
const eventResponses = await Event.aggregate([
{
// find all events for a selected month
$match: {
eventDate: {
$gte: startOfMonth,
$lte: endOfMonth,
},
},
},
{
// unwind the employeeResponses array
$unwind: {
path: "$employeeResponses",
preserveNullAndEmptyArrays: true,
},
},
{
$group: {
_id: null,
responses: {
$push: {
// if a response id matches the employee's id, then
// include their response; otherwise, it's a "No response."
$cond: [
{ $eq: ["$employeeResponses._id", existingMember._id] },
"$employeeResponses.response",
"No response.",
],
},
},
},
},
{ $project: { _id: 0, responses: 1 } },
]);
As you'll no doubt notice, the query above won't work after more than 1 employee records a response because it treats each individual response as a T/F condition, instead of all of the responses within the employeeResponses array as a single T/F condition.
As a result, I had to remove all stages after the initial $match and do a manual reduce:
const responses = eventResponses.reduce((acc, { employeeResponses }) => {
const foundResponse = employeeResponses.find(response => response._id.equals(existingMember._id));
return [...acc, foundResponse ? foundResponse.response : "No response."];
}, []);
I was wondering if it's possible to achieve the same reduce result above, but perhaps using mongo's $reduce function? Or refactor the aggregation query above to treat all responses within the employeeResponses as a single T/F condition?
The ultimate goal of this aggregation is to extract the employee's previously recorded responses (or lack of a response) from each Event found within the current month and place them into a single array:
["I want to work.", "Available to work.", "Not available to work.", "No response.", "No response." ...etc]
You can use $filter with $map to reshape your data and filter by _id. Then you can keep using $push with $ifNull to provide default value if an array is empty:
db.collection.aggregate([
{
$addFields: {
employeeResponses: {
$map: {
input: {
$filter: {
input: "$employeeResponses",
cond: {
$eq: [ "$$this._id", "5d978d372f263f41cc624727"]
}
}
},
in: "$$this.response"
}
}
}
},
{
$group: {
_id: null,
responses: { $push: { $ifNull: [ { $arrayElemAt: [ "$employeeResponses", 0 ] }, "No response." ] } }
}
}
])
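To plug this into the Mongoose code from the question, the stages can be combined with the original month filter. The following is a sketch only: `buildResponsesPipeline` is a hypothetical helper, and it assumes the employee id is passed in the same type as it is stored (Mongoose does not cast values inside aggregation pipelines, so a stored ObjectId needs an ObjectId here, not a string).

```javascript
// Hypothetical helper: builds the full pipeline for one employee and one month.
function buildResponsesPipeline(employeeId, startOfMonth, endOfMonth) {
  return [
    // Only events in the selected month, as in the question's $match.
    { $match: { eventDate: { $gte: startOfMonth, $lte: endOfMonth } } },
    {
      // Reduce each event's responses to this employee's response text(s).
      $addFields: {
        employeeResponses: {
          $map: {
            input: {
              $filter: {
                input: "$employeeResponses",
                cond: { $eq: ["$$this._id", employeeId] },
              },
            },
            in: "$$this.response",
          },
        },
      },
    },
    {
      // One entry per event: the response, or a default when there is none.
      $group: {
        _id: null,
        responses: {
          $push: {
            $ifNull: [{ $arrayElemAt: ["$employeeResponses", 0] }, "No response."],
          },
        },
      },
    },
    { $project: { _id: 0, responses: 1 } },
  ];
}

const pipeline = buildResponsesPipeline(
  "5d978d372f263f41cc624727",
  new Date("2019-10-01T00:00:00Z"),
  new Date("2019-10-31T23:59:59Z")
);
console.log(pipeline.length); // 4
```

It could then be run as Event.aggregate(buildResponsesPipeline(existingMember._id, startOfMonth, endOfMonth)).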