How to migrate value of nested subdocument in array in Mongoose - mongodb

I have a Mongoose collection called Track that has an array of fitnessPlan subdocuments, each of which currently has a month field that needs to be changed to week in production. I am using mongoose-migrate to migrate these values from the old month field to a new week field. Here's what I have got at the moment:
async function up () {
  await Track.updateMany({},
    {
      $set: {
        'fitnessPlans.$[elem].month': '$fitnessPlans.$[elem].week',
      },
    },
    { arrayFilters: [{ "elem.week": { $gte: 0 } }], strict: false });
  await Track.updateMany({},
    {
      $unset: {
        'fitnessPlans.$[elem].week': '',
      },
    },
    { arrayFilters: [{ "elem.week": { $gte: 0 } }], strict: false });
}
However, mongoose-migrate is throwing the following error:
Cast to number failed for value "$fitnessPlans.$[elem].week" at path "month"
I'm guessing this is because the string isn't evaluating correctly, but I'm not sure how else to reference that field's value in this setting.

Try an update with an aggregation pipeline, available starting from MongoDB 4.2:
$map to iterate over the fitnessPlans array, merging each element with a newly created week field using $mergeObjects
$unset to remove the old month field
async function up () {
  await Track.updateMany({},
    [
      {
        $set: {
          fitnessPlans: {
            $map: {
              input: "$fitnessPlans",
              in: {
                $mergeObjects: ["$$this", { week: "$$this.month" }]
              }
            }
          }
        }
      },
      { $unset: "fitnessPlans.month" }
    ],
    { strict: false });
}
Playground
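Since mongoose-migrate migrations also define a down step, here is a minimal sketch of the reverse migration, assuming it should simply copy week back to month and drop week:

async function down () {
  // Reverse of up(): recreate month from week, then remove week.
  await Track.updateMany({},
    [
      {
        $set: {
          fitnessPlans: {
            $map: {
              input: "$fitnessPlans",
              in: {
                $mergeObjects: ["$$this", { month: "$$this.week" }]
              }
            }
          }
        }
      },
      { $unset: "fitnessPlans.week" }
    ],
    { strict: false });
}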

Related

MongoDB: Get matched document from findOneAndUpdate before update

I'm performing a MongoDB query to update a document like below -
await this.activity.findOneAndUpdate(
  { _id: activityId },
  {
    $set: { isFlagged: //boolean_value },
  },
);
In the update part of this query, is there a way to get the matched document from the previous step?
Basically, to do something like this -
const data = await this.activity.findOne({ _id: activityId })
await this.activity.findOneAndUpdate(
  { _id: activityId },
  {
    $set: { isFlagged: !data.isFlagged }, // toggle the previous boolean value
  },
);
Is there a way to achieve this in a single query?
Use $not in an update with an aggregation pipeline.
db.collection.update({
  "activityId": 1
},
[
  {
    $set: {
      isFlagged: {
        $not: "$isFlagged"
      }
    }
  }
])
Mongo Playground
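Applied to the original Mongoose call it would look roughly like this (a sketch, assuming this.activity is the model from the question). Note that findOneAndUpdate resolves with the pre-update document by default, so the matched document comes back from the same query:

// Toggle isFlagged server-side with an update pipeline.
// By default findOneAndUpdate returns the document as it was before the
// update, so "previous" holds the matched (pre-update) document.
const previous = await this.activity.findOneAndUpdate(
  { _id: activityId },
  [
    { $set: { isFlagged: { $not: "$isFlagged" } } },
  ],
);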

MongoDB set of values with a limit size

I am updating a list of transactions by saving each transaction into a list in the database. I do not want duplicate entries in the list, so I use $addToSet.
This is because the request can be fired multiple times and we want any changes to the database to be idempotent. The only catch is that we want to store only the latest 20 transactions.
This could be done with $push, $sort and $slice, but I need to make sure duplicate entries cannot appear. There was a feature request to MongoDB back in 2015 for this to be added to $addToSet, but it was declined because 'sets' are not ordered...
which is exactly what the $sort modifier would have provided.
I thought I could simply append an empty push update to the update object, but from what I understand each update operator is potentially applied independently, which can lead to undesirable edits if the push/slice fires before the $addToSet.
Right now the values are a concatenated string of the form
timestamp:value, but I can easily change the structure to an object:
{ ts: timestamp, value: value }
Update:
Here is the current code; I am not sure it will work as intended, since each operation may be independent.
await historyDB
  .updateOne(
    { trxnId: txid },
    {
      $addToSet: {
        history: {
          ts: time,
          bid: bid.value,
          txid: trxn.txid,
        }
      },
      $push: {
        history: {
          $each: [{ ts: -1 }],
          $sort: { ts: 1 },
          $slice: -10,
        },
      },
    },
    { upsert: true },
  ).exec();
Your query doesn't work because you are trying to update history multiple times, which is not allowed in a simple update document and raises the error Updating the path 'history' would create a conflict at 'history'.
You can, however, update the history field multiple times in successive stages of an update with an aggregation pipeline.
await historyDB.updateOne(
  { trxnId: txid },
  [
    {
      $set: {
        history: {
          $let: {
            vars: {
              historyObj: {
                ts: time,
                bid: bid.value,
                txid: trxn.txid,
              },
              historySafe: { $ifNull: ["$history", []] }
            },
            in: {
              $cond: {
                if: { $in: ["$$historyObj", "$$historySafe"] },
                then: "$history",
                else: { $concatArrays: ["$$historySafe", ["$$historyObj"]] }
              }
            }
          }
        }
      },
    },
    {
      $set: {
        history: {
          $function: {
            body: function (entries) {
              entries.sort((a, b) => a.ts - b.ts);
              return entries;
            },
            args: [{ $ifNull: ["$history", []] }],
            lang: "js"
          }
        }
      },
    },
    {
      $set: {
        history: {
          $slice: ["$history", -10]
        }
      }
    }
  ],
  { upsert: true },
).exec()
As of MongoDB 5.2, the second $set stage, which provides the sorting, can be replaced with the $sortArray operator (see here).
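For reference, a sketch of that sorting stage rewritten with $sortArray, assuming MongoDB 5.2 or newer:

{
  $set: {
    history: {
      $sortArray: {
        input: { $ifNull: ["$history", []] },
        sortBy: { ts: 1 }
      }
    }
  }
}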

How do you consistently migrate a large MongoDB collection?

I was trying to migrate a large MongoDB collection of ~600k documents, like so:
for await (const doc of db.collection('collection').find({
  legacyProp: { $exists: true },
})) {
  // additional data fetching from separate collections here
  const newPropValue = await fetchNewPropValue(doc._id)
  await db.collection('collection').findOneAndUpdate({ _id: doc._id }, [{ $set: { newProp: newPropValue } }, { $unset: ['legacyProp'] }])
}
When the migration script finished, data was still being updated for about 30 minutes or so. I concluded this by computing the count of documents still containing the legacyProp property:
db.collection.countDocuments({ legacyProp: { $exists: true } })
which kept decreasing on subsequent calls. After a while the updates stopped, and the final count of documents containing the legacy prop was around 300k, so the update failed silently, resulting in data loss. I'm curious what exactly happened and, most importantly, how do you update large MongoDB collections without any data loss? Keep in mind that there is additional data fetching involved before every update operation.
My first attempt would be to rebuild the logic of fetchNewPropValue() inside an aggregation pipeline.
Have a look at the Aggregation Pipeline Operators.
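For example, if fetchNewPropValue() just reads a value from a related collection, the whole migration can run server-side with $lookup and $merge. The names otherCollection, refId and someValue below are assumptions purely for illustration, and $merge into the collection being aggregated requires MongoDB 4.4+:

db.getCollection('collection').aggregate([
  { $match: { legacyProp: { $exists: true } } },
  {
    // assumption: the new value lives in a related collection joined via refId
    $lookup: {
      from: "otherCollection",
      localField: "refId",
      foreignField: "_id",
      as: "related"
    }
  },
  { $set: { newProp: { $first: "$related.someValue" } } }, // assumption: field holding the new value
  { $unset: ["legacyProp", "related"] },
  // write the transformed documents back onto the same collection
  { $merge: { into: "collection", on: "_id", whenMatched: "merge", whenNotMatched: "discard" } }
])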
If this is not possible, then you can try to put all the newPropValue's into an array and use it like this. 600k values should fit easily into your RAM.
const newPropValues = await fetchNewPropValue() // all new properties as an array [{ _id: ..., val: ... }, { _id: ..., val: ... }, ...]
db.getCollection('collection').updateMany(
  { legacyProp: { $exists: true } },
  [
    {
      $set: {
        newProp: {
          $first: {
            $filter: { input: newPropValues, cond: { $eq: ["$_id", "$$this._id"] } }
          }
        }
      }
    },
    { $set: { legacyProp: "$$REMOVE", newProp: "$newProp.val" } }
  ]
)
Or you can try bulkWrite:
let bulkOperations = []
const cursor = db.getCollection('collection').find({ legacyProp: { $exists: true } })
// iterate with for await so each fetch and each bulkWrite is actually awaited
for await (const doc of cursor) {
  const newPropValue = await fetchNewPropValue(doc._id)
  bulkOperations.push({
    updateOne: {
      filter: { _id: doc._id },
      update: {
        $set: { newProp: newPropValue },
        $unset: { legacyProp: "" }
      }
    }
  })
  if (bulkOperations.length >= 10000) {
    await db.getCollection('collection').bulkWrite(bulkOperations, { ordered: false })
    bulkOperations = []
  }
}
if (bulkOperations.length > 0) {
  await db.getCollection('collection').bulkWrite(bulkOperations, { ordered: false })
}

Aggregate and reduce a nested array based upon an ObjectId

I have an Event document structured like so and I'm trying to query against the employeeResponses array to gather all responses (which may or may not exist) for a single employee:
[
  {
    ...
    eventDate: 2019-10-08T03:30:15.000+00:00,
    employeeResponses: [
      {
        _id: "5d978d372f263f41cc624727",
        response: "Available to work.",
        notes: ""
      },
      ...etc
    ]
  }
];
My current mongoose aggregation is:
const eventResponses = await Event.aggregate([
  {
    // find all events for a selected month
    $match: {
      eventDate: {
        $gte: startOfMonth,
        $lte: endOfMonth,
      },
    },
  },
  {
    // unwind the employeeResponses array
    $unwind: {
      path: "$employeeResponses",
      preserveNullAndEmptyArrays: true,
    },
  },
  {
    $group: {
      _id: null,
      responses: {
        $push: {
          // if a response id matches the employee's id, then
          // include their response; otherwise, it's a "No response."
          $cond: [
            { $eq: ["$employeeResponses._id", existingMember._id] },
            "$employeeResponses.response",
            "No response.",
          ],
        },
      },
    },
  },
  { $project: { _id: 0, responses: 1 } },
]);
As you'll no doubt notice, the query above won't work once more than one employee has recorded a response, because it treats each individual response as a true/false condition instead of treating all of the responses within the employeeResponses array as a single true/false condition.
As a result, I had to remove all stages after the initial $match and do a manual reduce:
const responses = eventResponses.reduce((acc, { employeeResponses }) => {
  const foundResponse = employeeResponses.find(response => response._id.equals(existingMember._id));
  return [...acc, foundResponse ? foundResponse.response : "No response."];
}, []);
I was wondering if it's possible to achieve the same reduce result above, but perhaps using mongo's $reduce function? Or refactor the aggregation query above to treat all responses within the employeeResponses as a single T/F condition?
The ultimate goal of this aggregation is to extract any previously recorded employee responses and/or the lack of a response from each Event found within the current month, and place the responses into a single array:
["I want to work.", "Available to work.", "Not available to work.", "No response.", "No response." ...etc]
You can use $filter with $map to reshape your data and filter by _id. Then you can keep using $push, with $ifNull to provide a default value when the array is empty:
db.collection.aggregate([
  {
    $addFields: {
      employeeResponses: {
        $map: {
          input: {
            $filter: {
              input: "$employeeResponses",
              cond: {
                $eq: ["$$this._id", "5d978d372f263f41cc624727"]
              }
            }
          },
          in: "$$this.response"
        }
      }
    }
  },
  {
    $group: {
      _id: null,
      responses: { $push: { $ifNull: [{ $arrayElemAt: ["$employeeResponses", 0] }, "No response"] } }
    }
  }
])
Mongo Playground
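In the original Mongoose aggregation this would look roughly like the following (a sketch: the hard-coded id is replaced with existingMember._id, and the original $match and $project stages are kept):

const eventResponses = await Event.aggregate([
  // find all events for the selected month
  { $match: { eventDate: { $gte: startOfMonth, $lte: endOfMonth } } },
  {
    // keep only the selected employee's responses, reshaped to plain strings
    $addFields: {
      employeeResponses: {
        $map: {
          input: {
            $filter: {
              input: "$employeeResponses",
              cond: { $eq: ["$$this._id", existingMember._id] }
            }
          },
          in: "$$this.response"
        }
      }
    }
  },
  {
    // one response (or "No response.") per matched event
    $group: {
      _id: null,
      responses: {
        $push: { $ifNull: [{ $arrayElemAt: ["$employeeResponses", 0] }, "No response."] }
      }
    }
  },
  { $project: { _id: 0, responses: 1 } }
]);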

Update Multiple Sub Doc By array of sub doc _id's in mongodb

I am trying to update multiple sub-documents given an array of sub-document ids. I tried multiple approaches but it's not working.
In my scenario I need to update multiple sub-documents by a given array of ids. Here is my query:
Approach 1 (no elements were updated):
var updated = await ModelName.update(
  {
    'subDocArray._id': { $in: req.body.elementId }
  },
  {
    $set: {
      'subDocArray.$[elem].abc': req.body.abcValue,
      'subDocArray.$[elem].xyz': req.body.xyzValue
    },
  },
  { "arrayFilters": [{ "elem._id": { $in: req.body.elementId } }], "multi": true, "upsert": true }
).lean().exec();
Approach 2 (only the first matching element is updated):
var updated = await ModelName.update(
  {
    'subDocArray._id': { $in: req.body.elementId }
  },
  {
    $set: {
      'subDocArray.$.abc': req.body.abcValue,
      'subDocArray.$.xyz': req.body.xyzValue
    },
  },
  { multi: true }
).exec();
Here req.body.elementId is an array of sub-document ids.
Approach 1 was almost right. I was passing the elementIds as strings, so I converted them to ObjectIds and then it worked.
var arrOfObjectId = [];
req.body.elementId.forEach(elem => {
  arrOfObjectId.push(Types.ObjectId(elem))
});
To see the difference between the two arrays, I printed both to the console:
console.log(req.body.elementId)
Result: ['xxxxxxxxxxxxxxxxxxxxxxxx', 'yyyyyyyyyyyyyyyyyyyyyyyy'] // WRONG
console.log(arrOfObjectId)
Result: [ObjectId('xxxxxxxxxxxxxxxxxxxxxxxx'), ObjectId('yyyyyyyyyyyyyyyyyyyyyyyy')] // RIGHT
var updated = await ModelName.update(
  {
    'subDocArray._id': { $in: arrOfObjectId }
  },
  {
    $set: {
      'subDocArray.$[elem].abc': req.body.abcValue,
      'subDocArray.$[elem].xyz': req.body.xyzValue
    },
  },
  { "arrayFilters": [{ "elem._id": { $in: arrOfObjectId } }], "multi": true, "upsert": true }
).lean().exec();