MongoDB creating index on a new field

I need to create a TTL Index in MongoDB and for that, I'm adding a new field "last_modified". I'm using the latest Python and pymongo in case this makes any difference.
After reading information on sparse and non-sparse indexes I understand that all documents that do not have "last_modified" will have this field added with the null value.
Is there a way to set some default value instead of null for those documents?
Alternatively, I'll have to update all documents in all DB instances using some migration function, but I would really like to do it clean...
Thanks in advance for any links or ideas!

I understand that all documents that do not have "last_modified" will have this field added with the null value.
No, this is not true. Creating an index never adds fields to documents. A sparse index only indexes documents where the field exists; documents without this field are simply outside the index coverage.
Is there a way to set some default value instead of null for those documents? ... I'll have to update all documents in all DB instances ... to do it clean...
You have to run an update, there is no magic solution. Not sure why doing this is "not clean".
The update is super simple:
// this query will allow you to execute the update even if you started streaming new events with this field.
db.collection.updateMany({ last_modified: {$exists: false} }, { $set: { last_modified: defaultValue }})
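Since the question mentions pymongo, the same backfill can be sketched in Python. This is a minimal sketch, not the answerer's code: the function names and the `field` parameter are assumptions for illustration, and `collection` is assumed to be a pymongo `Collection`. Only `update_many` and `create_index` are real pymongo calls. Keep in mind that a TTL index only expires documents whose indexed field holds a date, so the default value should be a `datetime`.

```python
from datetime import datetime, timezone


def utc_now():
    """Timezone-aware current time, a plausible default for the backfill."""
    return datetime.now(timezone.utc)


def backfill_missing_field(collection, default_value, field="last_modified"):
    """Set `field` to `default_value` on every document that lacks it."""
    result = collection.update_many(
        {field: {"$exists": False}},        # only documents without the field
        {"$set": {field: default_value}},
    )
    return result.modified_count


def create_ttl_index(collection, ttl_seconds, field="last_modified"):
    """Create a TTL index: documents expire ttl_seconds after `field`."""
    return collection.create_index(field, expireAfterSeconds=ttl_seconds)


# Usage, assuming `coll` is a pymongo Collection:
#   backfill_missing_field(coll, utc_now())
#   create_ttl_index(coll, ttl_seconds=3600)
```

Because the filter uses `$exists: false`, the backfill can be re-run safely: documents that already carry the field are left alone.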

Related

Using object as _id in MongoDb causes collscan on queries

I'm having some issues with using a custom object as my _id value in MongoDb.
The objects I'm storing in _id looks like this:
"_id" : {
"EDIEL" : "1010101010101",
"StartDateTicks" : NumberLong(636081120000000000)
}
Now, when I'm performing the following query:
.find({
"_id.EDIEL": { $eq: "1010101010101" },
"_id.StartDateTicks": { $gte: 636082776000000000, $lt: 636108696000000000 }
}).explain()
It does a COLLSCAN. I can't figure out why exactly. Is it because I'm not querying against the _id object with an object?
Does anyone know what I'm doing wrong here? :-)
Edit:
I tried creating a compound index containing the EDIEL and StartDateTicks fields, ran the query again, and now it uses the index instead of a collection scan. While this works, it would still be nice to avoid the extra index and rely on _id alone (since it's basically a "free" index). So the question still stands: why can't I query against _id.EDIEL and _id.StartDateTicks and make use of the index?
Indexes are built on keys, not on the fields inside an object, so when you use an object for _id, the index on that object can't be used for a query on one of the object's fields.
This is true not only for _id but for any subdocument.
{
"name":"awesome book",
"detail" :{
"pages":375,
"alias" : "AB"
}
}
Now, when you have an index on detail and you query by detail.pages or detail.alias, the index on detail cannot be used, and certainly not for range queries. You need indexes on detail.pages and detail.alias.
When an index is applied to an object, it indexes the object as a whole and not per field. That's why queries on the object's fields are not able to use the object's index.
Hope that helps
You will need to index the two fields themselves, since an index on the embedded document as a whole can't serve queries on its fields. Your options are either a compound index on both fields, or separate indexes on each field, which MongoDB can then combine through index intersection.
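The compound-index option can be sketched with pymongo-style calls. This is an illustrative sketch under assumptions: the helper names are hypothetical, `collection` is assumed to be a pymongo `Collection`, and putting the equality field before the range field is the usual guidance rather than something stated in the answers above.

```python
ASCENDING = 1  # same value as pymongo.ASCENDING

# Equality field first, range field second.
ID_FIELDS_INDEX = [("_id.EDIEL", ASCENDING), ("_id.StartDateTicks", ASCENDING)]


def build_range_query(ediel, start_ticks, end_ticks):
    """Filter matching one EDIEL with ticks in [start_ticks, end_ticks)."""
    return {
        "_id.EDIEL": ediel,
        "_id.StartDateTicks": {"$gte": start_ticks, "$lt": end_ticks},
    }


def ensure_id_fields_index(collection):
    """Create the compound index on the two _id sub-fields."""
    return collection.create_index(ID_FIELDS_INDEX)


# Usage, assuming `coll` is a pymongo Collection:
#   ensure_id_fields_index(coll)
#   coll.find(build_range_query("1010101010101",
#                               636082776000000000,
#                               636108696000000000)).explain()
```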

Manually change MongoID

Because of the PHP problem when inserting into MongoDB, explained in the answer by Denver Matt in this question, I created duplicate IDs in a MongoDB dataset. Fixing the PHP code is easy, but to be able to keep using my dataset I wonder:
Can I change the MongoId manually without breaking something? Or can I just reset this ID somehow into a new one?
The _id field of a document is immutable, as discussed in this documentation. Attempting to modify its value would result in an error/exception, as in:
> db.foo.drop()
> db.foo.insert({ _id: 1 })
> db.foo.update({ _id: 1 }, { $set: { _id: 3 }})
Mod on _id not allowed
> db.foo.find()
{ "_id" : 1 }
If you do need to alter the identifier of a document, you could fetch it, modify the _id value, and then re-persist the document using insert() or save(). insert() may be safer on the off chance that your new _id value conflicts and you'd rather see a uniqueness error than overwrite the existing document (as save() would do). Afterwards, you'll need to go back and remove the original document.
Since you can't do all of this in a single atomic transaction, I would suggest the following order of operations:
1. findOne() the existing document by its _id
2. Modify the returned document's _id property
3. insert() the modified document back into the collection
4. If the insert() succeeded, remove() the old document by the original _id
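The order of operations above can be sketched in Python. This is a minimal sketch, assuming `collection` is a pymongo `Collection`; the function name is hypothetical. `find_one`, `insert_one` and `delete_one` are the pymongo counterparts of the shell calls named in the steps.

```python
def change_document_id(collection, old_id, new_id):
    """Copy the document under new_id, then delete the original.

    Not atomic: if the process stops between the insert and the delete,
    both copies exist until the delete is retried.
    """
    doc = collection.find_one({"_id": old_id})
    if doc is None:
        return False                      # nothing to move
    doc["_id"] = new_id
    # insert_one raises (DuplicateKeyError in pymongo) if new_id is taken,
    # so an existing document is never overwritten.
    collection.insert_one(doc)
    # Reached only if the insert succeeded.
    collection.delete_one({"_id": old_id})
    return True
```

Inserting before deleting means a failure leaves a surplus copy rather than a lost document, which is the easier state to recover from.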

Updating multiple documents in mongodb using _id field

I have a sample products table and would like to update multiple documents using the _id field. Every time I try this, it only updates the first document in the $in clause, not all of them.
db.products.update({_id:{$in:[ObjectId("507d95d5719dbef170f15bff"),
ObjectId("507d95d5719dbef170f15c01"), ObjectId("507d95d5719dbef170f15c00")]}},
{$set:{'monthly_price':7865}}, {multi:true})
You can first try running find on the products table to make sure that all the object ids actually exist.
You can also try the explain command.
Give this a try:
db.<collection>.update( { query }, {$set: {monthly_price:7865}}, false, true)
I think the object ids you have given don't exist in the collection.
I tried using the following query and it worked for me.
db.test.update({_id:{$in:[ObjectId("57b33483e5b9ce24f4910855"),
ObjectId("57b33483e5b9ce24f4910856"),
ObjectId("57b33489e5b9ce24f4910857"),
ObjectId("57b33491e5b9ce24f4910858")
]
}
},
{$set:{'isCurrentStatus':true}},
{multi:true}
)
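The same multi-document update, combined with the suggestion to check that the ids exist, can be sketched in Python. This is an illustrative sketch: the function name is hypothetical, `collection` is assumed to be a pymongo `Collection`, and the ids are assumed to be unique. `update_many` and the `matched_count` / `modified_count` result attributes are real pymongo API.

```python
def set_field_for_ids(collection, ids, field, value):
    """Set `field` to `value` on every document whose _id is in `ids`.

    Returns (matched, modified, missing), where `missing` counts the ids
    that matched no document (assuming `ids` has no duplicates).
    """
    ids = list(ids)
    result = collection.update_many(
        {"_id": {"$in": ids}},
        {"$set": {field: value}},
    )
    missing = len(ids) - result.matched_count
    return result.matched_count, result.modified_count, missing


# Usage, assuming `coll` is a pymongo Collection and `ids` holds ObjectIds:
#   matched, modified, missing = set_field_for_ids(coll, ids, "monthly_price", 7865)
```

A non-zero `missing` points at ids that are not in the collection, while `matched` greater than `modified` just means some documents already had the value.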

Doing an upsert in mongo, can I specify a custom query for the "insert" case? [duplicate]

I am trying to use upsert in MongoDB to update a single field in a document if found OR insert a whole new document with lots of fields. The problem is that it appears to me that MongoDB either replaces every field or inserts a subset of fields in its upsert operation, i.e. it can not insert more fields than it actually wants to update.
What I want to do is the following:
I query for a single unique value
If a document already exists, only a timestamp value (let's call it 'lastseen') is updated to a new value
If a document does not exist, I will add it with a long list of different key/value pairs that should remain static for the remainder of its lifespan.
Let's illustrate:
This example would from my understanding update the 'lastseen' date if 'name' is found, but if 'name' is not found it would only insert 'name' + 'lastseen'.
db.somecollection.update({name: "some name"},{ $set: {"lastseen": "2012-12-28"}}, {upsert:true})
If I added more fields (key/value pairs) to the second argument and drop the $set, then every field would be replaced on update, but would have the desired effect on insert. Is there anything like $insert or similar to perform operations only when inserting?
So it seems to me that I can only get one of the following:
The correct update behavior, but would insert a document with only a subset of the desired fields if document does not exist
The correct insert behavior, but would then overwrite all existing fields if document already exists
Is my understanding correct? If so, is this possible to solve with a single operation?
MongoDB 2.4 has $setOnInsert
db.somecollection.update(
{name: "some name"},
{
$set: {
"lastseen": "2012-12-28"
},
$setOnInsert: {
"firstseen": <TIMESTAMP> // set on insert, not on update
}
},
{upsert:true}
)
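For pymongo users, the `$setOnInsert` upsert can be sketched as follows. This is a minimal sketch under assumptions: the function name and the `static_fields` parameter are hypothetical, and `collection` is assumed to be a pymongo `Collection`. `update_one(..., upsert=True)` is the real pymongo call.

```python
def upsert_last_seen(collection, name, last_seen, static_fields=None):
    """Update `lastseen` if `name` exists, else insert with static_fields."""
    update = {"$set": {"lastseen": last_seen}}
    if static_fields:
        # Applied only when the upsert inserts a new document.
        # These keys must not overlap with the $set keys.
        update["$setOnInsert"] = dict(static_fields)
    return collection.update_one({"name": name}, update, upsert=True)


# Usage, assuming `coll` is a pymongo Collection:
#   upsert_last_seen(coll, "some name", "2012-12-28",
#                    {"firstseen": "2012-12-28"})
```

On insert, the new document contains the query field (`name`), the `$set` fields and the `$setOnInsert` fields; on update, only `lastseen` changes.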
There is a feature request for this ( https://jira.mongodb.org/browse/SERVER-340 ) which is resolved in 2.3. Odd-numbered releases are actually dev releases, so this will be in the 2.4 stable release.
So there is no real way to do this in the current stable versions yet. I am afraid the only method at the moment is to do it conditionally: one query to check for the row, then an if to either insert or update.
I suppose that if locking were a real problem here, you could do this solely in JS, but that's evil; it would, however, lock this update to a single thread.

MongoDB alter field

I have a collection "companies" and the documents have this structure:
id, name, address, branch, city
I want to add a keyword field that will have an index, so I can do a full-text search, but how can I add a field to each document?
Thanks for the help
There's no schema in MongoDB, so you don't have to add a field to every document.
Just start writing new documents with this field, or update old documents when you have this value for them.
As for indexing, now you can leverage sparse indexes, they will be more efficient if most of your documents don't have this field.
Also, you might want this keyword field to be an array. It can be handled more efficiently than a string.
If you want to add a field with the same value to all documents in a collection, you can use this:
db.collection.update({}, // update all documents
{$set : {keyword : []}}, // or another value
false, // is upsert?
true) // is multi-update?
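When you do a plain $set, you can't use values from other fields. So if this new value is going to be a function of other fields, you have no other option but to loop through the documents and update them one by one. (MongoDB 4.2 and later also accept an aggregation pipeline as the update, which can reference other fields.)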
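The one-by-one loop can be sketched in Python. This is an illustrative sketch only: `derive_keywords` is a hypothetical rule (lowercased words from `name` and `city`), the function names are invented for the example, and `collection` is assumed to be a pymongo `Collection`. `find` and `update_one` are real pymongo calls.

```python
def derive_keywords(doc):
    """Hypothetical rule: unique lowercase words from `name` and `city`."""
    words = []
    for key in ("name", "city"):
        words.extend(str(doc.get(key, "")).lower().split())
    return sorted(set(words))


def backfill_keywords(collection):
    """Compute and store `keyword` for every document that lacks it."""
    updated = 0
    for doc in collection.find({"keyword": {"$exists": False}}):
        collection.update_one(
            {"_id": doc["_id"]},
            {"$set": {"keyword": derive_keywords(doc)}},
        )
        updated += 1
    return updated
```

Filtering on `$exists: false` keeps the loop restartable: documents that already received a `keyword` value are skipped on the next run.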