How to force MongoDB pullAll to disregard document order - mongodb

I have a mongoDB document that has the following structure:
{
user:user_name,
streams:[
{user:user_a, name:name_a},
{user:user_b, name:name_b},
{user:user_c, name:name_c}
]
}
I want to use $pullAll to remove from the streams array, passing it an array of streams (the size of the array varies from 1 to N):
var streamsA = [{user:"user_a", name:"name_a"},{user:"user_b", name:"name_b"}]
var streamsB = [{name:"name_a", user:"user_a"},{name:"name_b", user:"user_b"}]
I use the following mongoDB command to perform the update operation:
db.streams.update({user:"user_name"}, {"$pullAll":{streams:streamsA}})
db.streams.update({user:"user_name"}, {"$pullAll":{streams:streamsB}})
Removing streamsA succeeds, whereas removing streamsB fails. After digging through the mongoDB manuals, I saw that the order of fields in streamsA and streamsB records has to match the order of fields in the database. For streamsB the order does not match, that's why it was not removed.
I can reorder the streams to the database document order prior to performing an update operation, but is there an easier and cleaner way to do this? Is there some flag that can be set to update and/or pullAll to ignore the order?
Thank You,
Gary

The $pullAll operator is really a "special case" that was mostly intended for single "scalar" array elements and not for sub-documents in the way you are using it.
Instead use $pull which will inspect each element and use an $or condition for the document lists:
db.streams.update(
{ "user": "user_name" },
{ "$pull": { "streams": { "$or": streamsB } }}
)
That way the order of the fields does not matter, and the operation does not require the "exact match" that the current $pullAll operation is actually performing.
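A minimal sketch of the approach above, written as plain JavaScript so it can be tried outside the shell. The helper names buildPullUpdate and matchesAny are hypothetical, and matchesAny only models the idea that each $or clause is compared field by field rather than as a whole sub-document; it is not MongoDB's implementation.

```javascript
// Hypothetical helper: wrap a list of stream descriptors in a $pull/$or update.
function buildPullUpdate(streams) {
  return { $pull: { streams: { $or: streams } } };
}

// Illustrative model only: an element is pulled when, for at least one clause,
// every field named in that clause equals the element's field.
// The order of the keys plays no role in this comparison.
function matchesAny(element, clauses) {
  return clauses.some(clause =>
    Object.keys(clause).every(key => element[key] === clause[key]));
}

const streamsB = [
  { name: "name_a", user: "user_a" },
  { name: "name_b", user: "user_b" }
];
const stored = [
  { user: "user_a", name: "name_a" },
  { user: "user_b", name: "name_b" },
  { user: "user_c", name: "name_c" }
];

const update = buildPullUpdate(streamsB);
const remaining = stored.filter(s => !matchesAny(s, update.$pull.streams.$or));
console.log(JSON.stringify(remaining));

// In the shell the update document would be passed as:
// db.streams.update({ user: "user_name" }, update)
```

Building the update document in one place also means the caller never has to reorder the fields of the incoming stream list.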

Related

how can I make the "updated" of mongodb stop when updating a field of a nested array?

I have a database like this:
{
"universe":"comics",
"saga":[
{
"name":"x-men",
"characters":[
{
"character":"wolverine",
"picture":"618035022351.png"
},
{
"character":"wolverine",
"picture":"618035022352.png"
}
]
}
]
},
{
"universe":"dc",
"saga":[
{
"name":"spiderman",
"characters":[
{
"character":"venom",
"picture":"618035022353.png"
}
]
}
]
}
With this code, I update the elements where character is wolverine:
db.getCollection('collection').findOneAndUpdate(
{
"universe": "comics"
},
{
$set: {
"saga.$[outer].characters.$[inner].character": "lobezno",
"saga.$[outer].characters.$[inner].picture": "618035022354.png"
}
},
/*{
"saga.characters": 1
},*/
{
"arrayFilters": [
{
"outer.name": "x-men"
},
{
"inner.character": "wolverine"
}
],
"multi":false
}
)
I want to update only the first object that matches, and then stop.
For example, if I have an array of 100,000 elements and the matching object is in the tenth position, it will update that record but then continue going through the entire array, which seems inefficient to me since the update has already been done.
Note: if I did the update using an _id inside of universe.saga.characters instead of doing the update using the name, it would still loop through the rest of the elements.
How can I do it?
Update using arrayFilters conditions
I don't think the update loops in the way you describe, and it should not matter much that the document has 100,000 subdocuments. The documentation for $[<identifier>] has a nice explanation and mentions:
Use $[<identifier>] to define an identifier so that only those array elements that match the corresponding filter document in arrayFilters are updated.
In the update document, use the $[<identifier>] filtered positional operator to define an identifier, which you then reference in the array filter documents. Note that you cannot have an array filter document for an identifier that is not included in the update document.
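To make the behaviour concrete, here is a plain JavaScript model of the update from the question. It is only an illustration of the documented semantics (every element that satisfies its filter is updated), not how the server is implemented, and applyFilteredSet is a hypothetical name.

```javascript
// Illustrative model of "saga.$[outer].characters.$[inner]" with the two array filters.
function applyFilteredSet(doc, outerName, innerCharacter, changes) {
  let modified = 0;
  for (const saga of doc.saga) {
    if (saga.name !== outerName) continue;            // models { "outer.name": ... }
    for (const ch of saga.characters) {
      if (ch.character !== innerCharacter) continue;  // models { "inner.character": ... }
      Object.assign(ch, changes);
      modified += 1;
    }
  }
  return modified;
}

const doc = {
  universe: "comics",
  saga: [{
    name: "x-men",
    characters: [
      { character: "wolverine", picture: "618035022351.png" },
      { character: "wolverine", picture: "618035022352.png" }
    ]
  }]
};

const modifiedCount = applyFilteredSet(doc, "x-men", "wolverine", {
  character: "lobezno",
  picture: "618035022354.png"
});
console.log(modifiedCount, JSON.stringify(doc.saga[0].characters));
```

With the sample data both wolverine entries are changed, because the filter applies to every matching element and not only to the first one.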
Update using _id
Your point,
Note: if I did the update using an _id inside of universe.saga.characters instead of doing the update using the name, it would still loop through the rest of the elements.
MongoDB will certainly use the _id index when the query filters on the document's own _id. The answer to the question "MongoDB Update Query Performance" gives a better idea of this point.
Update using indexed fields
You can create an index that matches the query section of your update command. "MongoDB Indexes and Indexing Strategies" explains why indexes are important.
Let's look at two examples based on your data:
Example 1: If the document has 2 subdocuments and you run the update and check it with explain("executionStats"), assume it takes 1 second to update.
Quick demo: Mongo Playground (this platform does not support update queries)
Example 2: If the document has 1000 subdocuments and you run the update and check it with explain("executionStats"), it might take more than 1 second.
If you provide an index on the fields (universe, saga.characters.character and saga.characters.picture), it should take less time than without the index; the main benefit of an index is that it points directly to the indexed fields.
Quick demo: Mongo Playground (this platform does not support update queries)
Create an index for your fields:
db.maxData.createIndex({
"universe": 1,
"saga.characters.character": 1,
"saga.characters.picture": 1
})
To experiment further, run the two examples above with and without the index and compare the executionStats output; that will give you more clarity.

Use $not or $ne in update query

Should I use $not or $ne in the query:
Mytable.update(
  { TheThing: Thing, 'UpdatedInfo.NewInfo': { $ne: ThisNewestInfo } },
  {
    $push: {
      UpdatedInfo: {
        TheDate: ThisDate,
        NewInfo: ThisNewestInfo,
        Past: OriginalInfo
      }
    }
  },
  function (err, result) {
    if (err) {
      throw new Error(err.message);
    }
  }
);
I only want to update the document when ThisNewestInfo is not already present as the NewInfo of some element in the UpdatedInfo array. I am trying to understand the difference between $not and $ne.
And also:
What if the document does not contain the UpdatedInfo field in the beginning? How should I change the update query above? That is, if UpdatedInfo does not exist the update should add it, and later on, say the next day, it should check that ThisNewestInfo is not already present when updating the document again.
The main difference between $ne and $not in this scenario is what each one operates on. $not performs a logical negation of another operator expression (for example { $not: { $gt: 5 } }), while $ne compares the field directly against a value.
Both of them also match documents that do not contain the field at all, so with $ne the update above still runs on a document that has no UpdatedInfo field, and the $push creates the array.
So whether or not every document in your collection has an UpdatedInfo field, $ne is the simpler choice here.
Edit
Based on your edit, UpdatedInfo might not be present in the document. The query above already covers that case: the first update adds UpdatedInfo, and later updates are skipped when ThisNewestInfo is already present in the array.
Remember it like this: $ne means "not equal to this value", while $not negates an operator expression; neither requires the key to be present.
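As a small sketch, the guarded $push from the question can be assembled by a helper so that the filter and the pushed element always use the same value. The function name buildGuardedPush and the sample values are hypothetical; the field names come from the question.

```javascript
// Hypothetical helper: returns the filter and update documents for the guarded $push.
function buildGuardedPush(thing, newestInfo, date, originalInfo) {
  return {
    filter: {
      TheThing: thing,
      "UpdatedInfo.NewInfo": { $ne: newestInfo }  // skip if already recorded
    },
    update: {
      $push: {
        UpdatedInfo: { TheDate: date, NewInfo: newestInfo, Past: originalInfo }
      }
    }
  };
}

const op = buildGuardedPush("thing-1", "v2", "2014-05-01", "v1");
console.log(JSON.stringify(op, null, 2));

// Possible usage with the model from the question:
// Mytable.update(op.filter, op.update, callback)
```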

How to ignore a certain field during search in MongoDB

Suppose I have a MongoDB record like below:
{
name:"name",
streams: [
{user:"user0", name:"name0", locked:true},
{user:"user1", name:"name1", locked:true},
{user:"user2", name:"name2", locked:false}
]
}
I want to find all records that have user0 and name0 in the streams field, but I don't care about the locked field
find({streams:{user:"user0", name:"name0"}}) doesn't work, since the locked field is not specified.
Thank You,
Gary
You are looking for the $elemMatch operator, which matches documents whose array contains at least one element that satisfies all of your conditions, whatever other fields that element has:
db.collection.find({
"streams": { "$elemMatch": { "user": "user0", "name": "name0"} }
})
Take some time to go through the Query Operators in the manual. There are lots of useful operations there.
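The difference between the two queries can be modelled in a few lines of plain JavaScript. This is only an illustration with hypothetical helper names: exactMatch stands in for the whole sub-document comparison used by find({streams:{...}}), and elemMatch stands in for the per-field conditions of $elemMatch.

```javascript
// Model of an exact sub-document match: same fields, same values, same order.
function exactMatch(element, subdoc) {
  return JSON.stringify(element) === JSON.stringify(subdoc);
}

// Model of $elemMatch with equality conditions: only the named fields are checked.
function elemMatch(element, conditions) {
  return Object.keys(conditions).every(key => element[key] === conditions[key]);
}

const streams = [
  { user: "user0", name: "name0", locked: true },
  { user: "user1", name: "name1", locked: true },
  { user: "user2", name: "name2", locked: false }
];
const wanted = { user: "user0", name: "name0" };

const exactHits = streams.filter(s => exactMatch(s, wanted));
const elemHits = streams.filter(s => elemMatch(s, wanted));
console.log(exactHits.length, elemHits.length);
```

The exact comparison finds nothing because of the extra locked field, while the per-field comparison finds the user0 element.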

How to add a field to a document which contains the result of the comparison of two other fields

I would like to speed up a query on my MongoDB which uses $where to compare two fields in the document, which seems to be really slow.
My query looks like this:
db.mycollection.find({ $where : "this.lastCheckDate < this.modificationDate" })
What I would like to do is add a field to my document, i.e. isCheckDateLowerThenModDate, on which I could execute a probably much faster query:
db.mycollection.find({"isCheckDateLowerThenModDate":true})
I am quite new to MongoDB and have no idea how to do this. I would appreciate it if someone could give me some hints or examples on:
How to initialize such a field on an existing collection
How to maintain this field. Which means how to update this field when lastCheckDate or modificationDate changes.
Thanks in advance for your help!
You are thinking along the right lines!
1. How to initialize such a field on an existing collection.
The simplest way is to load each document (from your language), calculate this field, update and save.
Or you could perform an update via mongo shell:
db.mycollection.find().forEach(function(doc) {
if(doc.lastCheckDate < doc.modificationDate)
{
doc.isCheckDateLowerThenModDate = true;
}
else
{
doc.isCheckDateLowerThenModDate = false;
}
db.mycollection.save(doc);
});
2. How to maintain this field, which means how to update this field when lastCheckDate or modificationDate changes.
You have to do it yourself from your client code. Write a wrapper around your update and save operations and recalculate this value there each time. To be absolutely sure that this update works, write unit tests.
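One possible shape for such a wrapper, as a minimal sketch: the name withCheckFlag is hypothetical, and the function only recomputes the derived field on a copy of the document before it is written.

```javascript
// Hypothetical wrapper: recompute the derived flag before every write.
function withCheckFlag(doc) {
  return Object.assign({}, doc, {
    isCheckDateLowerThenModDate: doc.lastCheckDate < doc.modificationDate
  });
}

const stale = withCheckFlag({
  _id: 1,
  lastCheckDate: new Date("2012-01-01T00:00:00Z"),
  modificationDate: new Date("2012-02-01T00:00:00Z")
});
const fresh = withCheckFlag({
  _id: 2,
  lastCheckDate: new Date("2012-03-01T00:00:00Z"),
  modificationDate: new Date("2012-02-01T00:00:00Z")
});
console.log(stale.isCheckDateLowerThenModDate, fresh.isCheckDateLowerThenModDate);

// Possible shell usage, routing every write through the wrapper:
// db.mycollection.save(withCheckFlag(doc))
```

Because the wrapper is a pure function, it is easy to cover with the unit tests mentioned above.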
The $where clause is slow because it is evaluating each document using the JavaScript interpreter.
There are a few alternatives:
1) Assuming your use case is "look for records that need updating", take advantage of a sparse index:
add a boolean field like needsChecking and $set this whenever the modificationDate is updated
in your "check" procedure, find the documents that have this field set (should be fast due to the sparse index)
db.mycollection.find({'needsChecking':true});
after you've done whatever check is needed, $unset the needsChecking field.
2) A new (and faster) feature in MongoDB 2.2 is the Aggregation Framework.
Here is an example of adding a "isUpdated" field based on the date comparison, and then filtering the matching documents:
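db.mycollection.aggregate([
  // add a computed isUpdated field to each document
  { $project: {
    _id: 1,
    name: 1,
    type: 1,
    modificationDate: 1,
    lastCheckDate: 1,
    isUpdated: { $gt: ["$modificationDate", "$lastCheckDate"] }
  }},
  // keep only the documents modified after their last check
  { $match: {
    isUpdated: true
  }}
])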
Some current caveats of using the Aggregation Framework are:
you have to specify fields to include aside from _id
the result is limited to the current maximum BSON document size (16MB in MongoDB 2.2)
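The first alternative above (the needsChecking flag) could be sketched as follows. The helper names markModified and markChecked are hypothetical; they only build the update documents, and the shell calls are shown as comments.

```javascript
// Sparse index: only documents that actually have needsChecking are indexed.
const indexKeys = { needsChecking: 1 };
const indexOptions = { sparse: true };

// Hypothetical helper: update to apply whenever the document is modified.
function markModified(when) {
  return { $set: { modificationDate: when, needsChecking: true } };
}

// Hypothetical helper: update to apply once the check has been done.
function markChecked(when) {
  return { $set: { lastCheckDate: when }, $unset: { needsChecking: "" } };
}

const onModify = markModified(new Date("2012-09-01T00:00:00Z"));
const onCheck = markChecked(new Date("2012-09-02T00:00:00Z"));
console.log(JSON.stringify(onModify), JSON.stringify(onCheck));

// Possible shell usage:
// db.mycollection.createIndex(indexKeys, indexOptions)
// db.mycollection.update({ _id: id }, onModify)
// db.mycollection.find({ needsChecking: true })
// db.mycollection.update({ _id: id }, onCheck)
```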

MongoDB : use $ positional operator for querying

I have a collection with entries that look like :
{
"userid": 1,
"contents": [
{ "tag": "whatever", "value": 100 },
{"tag": "whatever2", "value": 110 }
]
}
I'd like to be able to query that collection and return only one part of the array: the one matching the query. I'm trying to use the $ positional operator to do so, but it hasn't worked so far.
Here is more precisely what I'd like to do :
collection.find({'contents.tag':"whatever"},{'contents.$.value':1})
As a result I expect something with only the value corresponding to the entry in the array that matched the query, which is 100 in this case.
Do you know what's wrong? I was thinking that maybe the $ operator can only be used for updating and not for querying. Does anyone know?
Thanks !
Yes, you are correct - the positional operator is used for updating an object.
The solution for now would be to return the array and pull the field out in your application.
There is an open enhancement request for this feature (in queries):
https://jira.mongodb.org/browse/SERVER-828
For more information on the positional operator, see:
http://www.mongodb.org/display/DOCS/Updating#Updating-The%24positionaloperator
What you are looking for is the $elemMatch operator.
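A sketch of what that could look like for the collection in the question, assuming a server version that accepts $elemMatch in the projection (MongoDB 2.2 or later). The helper projectFirstMatch is a hypothetical plain JavaScript model of the result, not the server's implementation.

```javascript
// Query and projection documents as they would be passed to find().
const query = { "contents.tag": "whatever" };
const projection = { _id: 0, contents: { $elemMatch: { tag: "whatever" } } };

// Illustrative model: $elemMatch in a projection keeps only the first matching element.
function projectFirstMatch(doc, field, conditions) {
  const hit = doc[field].find(el =>
    Object.keys(conditions).every(key => el[key] === conditions[key]));
  return hit === undefined ? {} : { [field]: [hit] };
}

const doc = {
  userid: 1,
  contents: [
    { tag: "whatever", value: 100 },
    { tag: "whatever2", value: 110 }
  ]
};

const result = projectFirstMatch(doc, "contents", projection.contents.$elemMatch);
console.log(JSON.stringify(result));

// Possible shell usage:
// db.collection.find(query, projection)
```

The application still has to read value from the single returned element, but the rest of the array no longer travels over the wire.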
It might be overkill, but I guess you can use map-reduce for this.
First, pre-filter with a query, emit all array elements in map, and filter out the ones that do not match, either when emitting or in reduce. If you don't set the out option, everything will happen in RAM.
If you have to run those kinds of queries often, it might be worthwhile to duplicate the data.
Nevertheless, I hope SERVER-828 will be implemented soon!
Create a query with $in instead and add the values you want to match to the array; this can solve your issue:
$users_array = array(xxxxxxxx,yyyyyy);
$user = Db::find('fb_users', array(
'facebook_id' => array(
'$in' => $users_array
)
));