I have a Mongo 4.2.0 instance in my development environment with a simple collection of only 300 entries.
I've built some basic queue handling that juggles a few date fields.
To get a document that should be updated I use the following $expr query, which runs very slowly imho.
db.collection("myupdates").findOneAndUpdate({
$expr: {
$and: [
{ $gt: ["$shouldUpdate", "$updatedAt"] },
{ $gt: ["$shouldUpdate", "$isUpdatingAt"] },
{ $gt: ["$shouldUpdate", "$updateErroredAt"] },
]
},
}, {
$set: {
isUpdatingAt: new Date(),
},
});
This query takes ~120ms after warmup on my standard 2019 laptop, whereas my other simple queries only take ~3ms.
Although indexes hardly matter with 300 documents, I tried them all anyway, from single-field to compound indexes. None of them helped.
It's also not caused by findOneAndUpdate; with countDocuments I get the same slow speed.
Is this the normal speed of an $expr or aggregation syntax? What did I do wrong? Is there a better way to achieve this? Do I have to use Redis for this use case?
Possible solution
As Neil Lunn pointed out in the answers, calculated conditions do not use an index and should be the last resort.
So I got rid of the calculated condition by splitting the query into two queries. The first query fetches an actual value that the second one can match against.
Together these two queries take ~10ms, which is much better than 120ms.
const shouldUpdateDateResult = await mongo.db.collection("myupdates").findOne({
shouldUpdate: { $exists: true }
}, {
// the Node.js driver expects the projection inside the options object
projection: { shouldUpdate: 1 },
});
const shouldUpdateDate = shouldUpdateDateResult && shouldUpdateDateResult.shouldUpdate;
const result = await mongo.db.collection("myupdates").findOneAndUpdate({
$and: [
{ shouldUpdate: shouldUpdateDate },
{ $or: [
{ updatedAt: { $eq: null } },
{ updatedAt: { $exists: false } },
{ updatedAt: { $lte: shouldUpdateDate } }
] },
{ $or: [
{ isUpdatingAt: { $eq: null } },
{ isUpdatingAt: { $exists: false } },
{ isUpdatingAt: { $lte: shouldUpdateDate } }
] },
{ $or: [
{ updateErroredAt: { $eq: null } },
{ updateErroredAt: { $exists: false } },
{ updateErroredAt: { $lte: shouldUpdateDate } }
] },
],
}, {
$set: {
isUpdatingAt: new Date(),
},
});
The whole idea behind this is a processing queue usable by multiple workers.
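The three near-identical $or blocks above could be generated by a small helper. This is a minimal sketch, assuming the same field names as in the query; notAfter and buildClaimFilter are hypothetical names, not part of the driver. It relies on the fact that a query for { field: null } matches documents where the field is either null or missing, so the separate $exists: false clause is not needed.

```javascript
// Hypothetical helper: matches documents whose `field` is unset (null or
// missing) or not later than `date`. In a MongoDB query, { field: null }
// matches both null values and missing fields.
function notAfter(field, date) {
  return { $or: [{ [field]: null }, { [field]: { $lte: date } }] };
}

// Builds the plain (no $expr) filter for claiming the next queue item.
function buildClaimFilter(shouldUpdateDate) {
  return {
    $and: [
      { shouldUpdate: shouldUpdateDate },
      notAfter("updatedAt", shouldUpdateDate),
      notAfter("isUpdatingAt", shouldUpdateDate),
      notAfter("updateErroredAt", shouldUpdateDate),
    ],
  };
}
```

The result would be passed as the first argument of findOneAndUpdate, with the same $set: { isUpdatingAt: new Date() } update as before.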
Related
I am trying to update an array of objects with Mongoose methods. It works when I do it in vanilla JS, but not with Mongoose.
model:
const exampleSchema = new mongoose.Schema({
arrayOfObjects: [
{ name: String, id: mongoose.Schema.Types.ObjectId },
],
});
Find and update with vanilla JS:
const example = await Example.findById(req.body.propertyX);
const validIndex = example.arrayOfObjects.findIndex((v) => v.propertyY === req.body.Y);
if (validIndex === -1) {
example.arrayOfObjects.push({ propertyY: req.body.Y, propertyZ: req.body.Z });
} else {
example.arrayOfObjects[validIndex] = { propertyY: req.body.Y, propertyZ: req.body.Z };
console.log('update');
}
await example.save();
But when I try to use findByIdAndUpdate, the $set operator does not work (even $push does not work: it pushes a new object id without the req.body fields).
Mongoose findByIdAndUpdate:
const example = await Example.findByIdAndUpdate(req.body.x, {
// arrayOfObjects: { $push: { propertyY: req.body.Y, propertyX: req.body.X} },
$set: { 'arrayOfObjects.$.propertyY': req.body.Y, 'arrayOfObjects.$.propertyX': req.body.X },
});
The issue is with your understanding of the positional operator $. From the docs:
the positional $ operator acts as a placeholder for the first element that matches the query document, and
This means it expects to find a match in the array based on the query. In your case the query does not contain anything regarding the voted array, so you get the following error:
[The positional operator did not find the match needed from the query.]
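For comparison, the positional operator does resolve once the filter itself names the array field. The following is a minimal sketch using the question's placeholder names (arrayOfObjects, propertyY, propertyZ); buildPositionalUpdate is a hypothetical helper. It only covers changing an element that already exists, not inserting a missing one.

```javascript
// Hypothetical helper: returns the filter and update documents for changing
// one existing array element. The filter must reference the array field
// (arrayOfObjects.propertyY) so that `$` knows which element matched.
function buildPositionalUpdate(docId, y, z) {
  return {
    filter: { _id: docId, "arrayOfObjects.propertyY": y },
    update: { $set: { "arrayOfObjects.$.propertyZ": z } },
  };
}

const { filter, update } = buildPositionalUpdate("someId", "y-value", "z-value");
// With Mongoose these would be passed to findOneAndUpdate rather than
// findByIdAndUpdate, because the filter needs more than the _id:
//   await Example.findOneAndUpdate(filter, update);
```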
So what can we do? Doing the update you want is actually not trivial. It only became possible with the introduction of pipelined updates (MongoDB 4.2), which allow you to use aggregation operators in the update body. With them we can do what you want like so:
db.collection.findByIdAndUpdate(req.body.postId,
[
{
$set: {
voted: {
$ifNull: [
"$voted",
[]
]
}
}
},
{
$set: {
voted: {
$concatArrays: [
{
$filter: {
input: "$voted",
cond: {
$ne: [
"$$this.voterId",
req.body.userId
]
}
}
},
[
{
$mergeObjects: [
{
$ifNull: [
{
$arrayElemAt: [
{
$filter: {
input: "$voted",
cond: {
$eq: [
"$$this.voterId",
req.body.userId
]
}
}
},
0
]
},
{}
]
},
{
voteRank: req.body.rank,
voterId: req.body.userId
}
]
}
]
]
}
}
}
])
Mongo Playground
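To make the pipeline easier to follow, here is what its two stages compute, expressed in plain JavaScript. This only illustrates the logic under a hypothetical function name (upsertVote); it is not driver code, and the real work happens inside MongoDB.

```javascript
// Plain-JavaScript illustration of the two $set stages (not driver code).
function upsertVote(voted, userId, rank) {
  // Stage 1: $ifNull -> treat a missing array as empty.
  const current = voted || [];
  // Stage 2a: $filter with $ne -> every entry that belongs to other voters.
  const others = current.filter((v) => v.voterId !== userId);
  // Stage 2b: $filter with $eq, $arrayElemAt 0 and $ifNull -> the voter's
  // existing entry, or {} when there is none.
  const existing = current.find((v) => v.voterId === userId) || {};
  // Stage 2c: $mergeObjects keeps extra properties of the existing entry,
  // then $concatArrays appends the merged entry to the others.
  return [...others, { ...existing, voteRank: rank, voterId: userId }];
}
```

Calling upsertVote(undefined, "u1", 3) yields a one-element array, while calling it for a voter who already has an entry replaces that entry and moves it to the end.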
You can drop the $mergeObjects operator if you don't need it. I added it in case the object has additional properties that you want to preserve across an update, but that is probably not the case here.
It then simplifies the code a little:
db.collection.findByIdAndUpdate(req.body.postId,
[
{
$set: {
voted: {
$ifNull: [
'$voted',
[],
],
},
},
},
{
$set: {
voted: {
$concatArrays: [
{
$filter: {
input: '$voted',
cond: {
$ne: [
'$$this.voterId',
req.body.userId,
],
},
},
},
[
{
voteRank: req.body.rank,
voterId: req.body.userId
}
],
],
},
},
},
]);
I was trying to migrate a large MongoDB collection of ~600k documents, like so:
for await (const doc of db.collection('collection').find({
legacyProp: { $exists: true },
})) {
// additional data fetching from separate collections here
const newPropValue = await fetchNewPropValue(doc._id)
await db.collection('collection').findOneAndUpdate(
{ _id: doc._id },
[{ $set: { newProp: newPropValue } }, { $unset: ['legacyProp'] }]
)
}
When the migration script finished, data was still being updated for about 30 minutes or so. I concluded this from the count of documents that still contain the legacyProp property:
db.collection.countDocuments({ legacyProp: { $exists: true } })
which was decreasing on subsequent calls. After a while the updates stopped, and the final count of documents containing the legacy property was around 300k, so the update failed silently, resulting in data loss. I'm curious what exactly happened and, most importantly, how to update large MongoDB collections without any data loss. Keep in mind that additional data fetching is involved before every update operation.
My first attempt would be to implement the logic of fetchNewPropValue() inside an aggregation pipeline.
Have a look at Aggregation Pipeline Operators
If this is not possible, you can try to put all the newPropValue values into an array and use it like this. 600k properties should easily fit into your RAM.
const newPropValues = await fetchNewPropValue() // getting all new properties as array [{_id: ..., val: ...}, {_id: ..., val: ...}, ...]
db.getCollection('collection').updateMany(
{ legacyProp: { $exists: true } },
[
{
$set: {
newProp: {
$first: {
$filter: { input: newPropValues, cond: { $eq: ["$_id", "$$this._id"] } }
}
}
}
},
{ $set: { legacyProp: "$$REMOVE", newProp: "$newProp.val" } } // "$newProp" is a field path, so it takes a single $
]
)
Or you can try bulkWrite:
// Written for the Node.js driver, as in the question, so that await can be used.
const collection = db.collection('collection')
let bulkOperations = []
for await (const doc of collection.find({ legacyProp: { $exists: true } })) {
const newPropValue = await fetchNewPropValue(doc._id)
bulkOperations.push({
updateOne: {
filter: { _id: doc._id },
update: {
$set: { newProp: newPropValue },
$unset: { legacyProp: "" }
}
}
})
if (bulkOperations.length > 10000) {
// wait for each batch to finish before collecting the next one
await collection.bulkWrite(bulkOperations, { ordered: false })
bulkOperations = []
}
}
if (bulkOperations.length > 0)
await collection.bulkWrite(bulkOperations, { ordered: false })
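If all new values are already in memory as an array of { _id, val } objects, the operations can be built and split into batches without touching the database, which makes that part easy to check in isolation. This is a minimal sketch; buildUpdateOps and chunk are hypothetical helper names, and the batch size is an arbitrary choice.

```javascript
// Turns [{ _id, val }, ...] into updateOne operations for bulkWrite.
function buildUpdateOps(newPropValues) {
  return newPropValues.map(({ _id, val }) => ({
    updateOne: {
      filter: { _id },
      update: { $set: { newProp: val }, $unset: { legacyProp: "" } },
    },
  }));
}

// Splits a list into arrays of at most `size` elements.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Each batch returned by chunk(buildUpdateOps(values), 1000) would then be passed to an awaited bulkWrite call with { ordered: false }.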
I have written a find query that works. It returns records where name and level exist:
db.docs.find( { $and: [{name:{$exists:true}},{level:{ $exists:true}} ] },{_id:0, name:1}).sort({"name":1})
I now want to combine it with something like the code below, which also works but needs to be merged with the query above to pull the correct data:
db.docs.aggregate(
[
{
$project:
{
_id:0,
name: 1,
Honours:
{
$cond: { if: { $gte: [ "$level", 8 ] }, then: "True", else: "False" }
}
}
}
]
)
The find query returns records where name and level exist, but I need to enhance the result with a new column called Honours, showing True or False depending on whether the level is greater than or equal to 8 ($gte).
So I am basically trying to combine the find filter above with the $cond operator (I found and modified an example from the $cond documentation).
I tried the code below and a few other permutations to combine find and sort with the $project and $cond aggregate, but it returned errors. I am very new to constructing MongoDB syntax and making it all fit together. Can anyone please help?
db.docs.aggregate(
[{{ $and: [{name:{$exists:true}},{level:{ $exists:true}} ] },{_id:0, name:1}).sort({"name":1}
{
$project:
{
_id:0,
name: 1,
Honours:
{
$cond: { if: { $gte: [ "$level", 8 ] }, then: "True", else: "False" }
}
}
}
}
]
)
Try the aggregation pipeline below:
db.docs.aggregate([
/** $match filters documents much like .find() and reduces the dataset for the following stages */
{
$match: {
$and: [{ name: { $exists: true } }, { level: { $exists: true } }]
}
},
/** $project works as a projection: it reduces the size of each document for the following stages */
{
$project: {
_id: 0,
name: 1,
Honours: {
$cond: { if: { $gte: ["$level", 8] }, then: "True", else: "False" }
}
}
},
/** $sort works like .sort() */
{ $sort: { name: 1 } }
]);
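To see what the three stages do, here is the same logic in plain JavaScript, run against made-up sample documents. It only mirrors the pipeline for simple numeric levels and is not a substitute for running the aggregation.

```javascript
// Hypothetical sample data; the third document lacks `level`.
const docs = [
  { name: "Cara", level: 9 },
  { name: "Abe", level: 7 },
  { name: "Bo" },
];

const result = docs
  // $match: keep documents that have both fields
  .filter((d) => "name" in d && "level" in d)
  // $project: keep name, derive Honours from level
  .map((d) => ({ name: d.name, Honours: d.level >= 8 ? "True" : "False" }))
  // $sort: ascending by name
  .sort((a, b) => a.name.localeCompare(b.name));

// Abe gets "False", Cara gets "True"; Bo is dropped because it has no level.
console.log(result);
```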
How would we go about finding records where multiple conditions are true within the same sub-document while at least one of these conditions is negated?
db.getCollection('clients').find( {
data: { '$exists': true },
'data.updates': { '$elemMatch': {
name: { $not: /^KB3109103/i },
install_date: { $gt: 128573812 }
} }
});
This returns all records because $not doesn't seem to work inside $elemMatch.
Solution:
I found a workaround: adding an explicit $and (which, according to the documentation, should behave the same as listing the conditions without it) solved it.
db.getCollection('clients').find( {
data: { '$exists': true },
'data.updates': { '$elemMatch': {
$and: [
{name: { $not: /^KB3109103/i }},
{install_date: { $gt: 128573812 }}
]
} }
});
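What the query is meant to express can be written as a plain JavaScript predicate: a client matches when a single entry in data.updates satisfies both conditions at once. The sample documents and the function name hasMatchingUpdate are hypothetical.

```javascript
// Hypothetical sample documents shaped like the `clients` collection.
const clients = [
  { id: 1, data: { updates: [{ name: "KB3109103", install_date: 200000000 }] } },
  { id: 2, data: { updates: [{ name: "KB4000000", install_date: 200000000 }] } },
  { id: 3, data: { updates: [{ name: "KB4000000", install_date: 100 }] } },
];

// Mirrors $elemMatch: both conditions must hold for the same array element.
function hasMatchingUpdate(client) {
  const updates = (client.data && client.data.updates) || [];
  return updates.some(
    (u) => !/^KB3109103/i.test(u.name) && u.install_date > 128573812
  );
}

const matchedIds = clients.filter(hasMatchingUpdate).map((c) => c.id);
console.log(matchedIds); // [ 2 ]
```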
var otherLanguages=[ "English","Arabic","French"];
var first, second;
db.collection.find({ $and: [ { "Language" : { $nin : otherLanguages} },{"Language":{ $ne:null}} ]}).forEach(function(obj){
// This yields 341 docs one by one. Among these docs, I want to find the ones that satisfy the two if statements below and then count them.
if (obj.find({ $and: [{'POS': { $eq: "Past" } },{'Desp': { $ne: null } }] })) { first= first+1;}
if (obj.find({ $and: [{'POS': { $eq: "Past" } },{'Desp': { $eq: null } }] })) {second= second+1;}
});
print (first,second)
I know that I cannot use the find() function on obj, but is there a way to search this "bson obj" to get the counts?
If this is not feasible, then please suggest a way to get the desired result.
If I understand your question correctly you can achieve that by using the aggregation framework like so:
// the stages are passed as one array, which is the form all shells and drivers accept
db.collection.aggregate([{
// filter out all documents that you don't care about
$match: {
"Language": { $nin: otherLanguages, $ne: null },
"POS": "Past"
},
}, {
// then split into groups...
$group: {
_id: { $eq: [ "$Desp", null ] }, // ...one for the "eq: null" and one for the "ne: null"
"count": { $sum: 1 } // ...and count the number of documents in each group
}
}])
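The pipeline returns at most two groups, keyed by whether Desp equals null. Here is a small sketch for turning that result into the two counters from the question; toCounts is a hypothetical helper, and the sample input is made up.

```javascript
// Hypothetical helper: maps the $group output to { first, second }, where
// `first` counts documents with Desp != null and `second` those with Desp == null.
function toCounts(groups) {
  const out = { first: 0, second: 0 };
  for (const g of groups) {
    if (g._id === true) out.second = g.count;
    else out.first = g.count;
  }
  return out;
}

// Made-up example of what the aggregation might return.
const counts = toCounts([{ _id: false, count: 12 }, { _id: true, count: 5 }]);
console.log(counts.first, counts.second); // 12 5
```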