I'm trying to atomically insert an empty document if the capped collection is empty or return the last naturally sorted document if not empty. Can I do this with findAndModify?
db.collection.findAndModify({
query: { _id: { $exists: true }},
sort: { $natural: -1 },
update: {},
upsert: true,
new: true
});
I would have expected this either to return the latest document (if the collection is non-empty) or to insert a new document if none exists. However, it inserts a blank document (without an _id) every single time it's called. Does findAndModify work with capped collections? I need the inserted document to have an _id.
Thanks.
-Scott
I'm trying to atomically insert an empty document if the capped
collection is empty or return the last naturally sorted document if
not empty. Can I do this with findAndModify?
There is a flaw in your query logic. A findAndModify() with:
query: { _id: { $exists: true }},
sort: { $natural: -1 },
update: {},
upsert: true,
new: true
... will:
do an update on the last inserted record with an _id set
OR
insert a new (empty) document if no existing document with an _id is found.
The update is going to replace your last inserted record with an empty one, which presumably is not the intended outcome :).
You are seeing a completely empty document (no _id field) because capped collections differ from standard collections in a few ways.
In particular:
there is no requirement for an _id field by default; you can have one generated on the server by including the autoIndexId:true option to createCollection()
there is no index on the _id field (note: you will want a unique index if using replication with a capped collection)
Also note that documents in a capped collection must not grow in size or the update will fail
Refer to the Capped Collection Usage & Restrictions on the wiki for more info.
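For reference, a minimal shell sketch of creating a capped collection with the autoIndexId option mentioned above (the collection name and size are placeholders; recent MongoDB versions create the _id index automatically, so this mainly applies to older servers):
db.createCollection("mycapped", { capped: true, size: 100000, autoIndexId: true })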
Related
I want to depict the following use case using MongoDb:
I want to read from a collection and memorize that particular point in time.
The next time I write to that collection, I want the write of a new document to be rejected if another document has been added to the collection in the meantime.
Using a timestamp property on the documents would be ok.
Is this possible?
One trick is to use findAndModify.
Assume that, at the time of reading, the most recent timestamp on any document is oldTimestamp:
db.collection.findAndModify({
query: {timestamp: {$gt: oldTimestamp}},
new: true, // Return modified / inserted document
upsert: true, // Update if match found, insert otherwise
update: {
$setOnInsert: {..your document...}
}
})
This will not insert your document if another document was inserted between your read and write operations.
However, it won't tell you directly whether your document was inserted.
You have to compare the returned document with your proposed document to find that out.
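For example, one way to make that comparison (a sketch; newTimestamp stands for the timestamp carried by your proposed document, and the rest of the document is elided):
var newTimestamp = new Date();
var result = db.collection.findAndModify({
  query: { timestamp: { $gt: oldTimestamp } },
  new: true,
  upsert: true,
  update: { $setOnInsert: { timestamp: newTimestamp /* ...your document... */ } }
});
if (result.timestamp.getTime() === newTimestamp.getTime()) {
  // no newer document existed, so our document was inserted
} else {
  // another document was written in between; result is that document
}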
If you're using the Node.js driver, the pattern is:
collection.findAndModify(criteria[, sort[, update[, options]]], callback)
Following that pattern, the call would be:
db.collection('test').findAndModify(
{timestamp: {$gt: oldTimestamp}}, // query, timestamp is a property of your document, often set as the created time
[['timestamp','desc']], // sort order
{$setOnInsert: {..your document..}}, // update; $setOnInsert only takes effect when a new document is inserted
{
new: true,
upsert: true
}, // options
function(err, object) {
if (err){
console.warn(err.message); // err is set if the operation fails
}else{
console.dir(object);
}
});
This can be achieved using a timestamp property in every document. You can take a look at the Mongoose pre-save path validation hook. Using this hook, you can write something like this:
YourSchema.path('timestamp').validate(function(value, done) {
this.model(YourSchemaModelName).count({ timestamp: {$gt : value} }, function(err, count) {
if (err) {
return done(err);
}
// a non-zero count means a document with a greater timestamp already exists, so fail validation
done(!count);
});
}, 'Greater timestamp already exists');
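A short usage sketch, assuming the schema has been compiled into a model (the names reuse YourSchemaModelName and YourSchema from the snippet above, and mongoose is assumed to be connected):
var YourModel = mongoose.model(YourSchemaModelName, YourSchema);
var doc = new YourModel({ timestamp: new Date() /* ...other fields... */ });
doc.save(function(err) {
  if (err) {
    // validation failed: 'Greater timestamp already exists'
    console.warn(err.message);
  }
});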
Sounds like you'll need to do some sort of optimistic locking at the collection level. I understand you are writing new documents but never updating existing ones in this collection?
You could add an index on the timestamp field, and your application would need to track the latest version of this value. Then, before attempting a new write you could lookup the latest value from the collection with a query like
db.collection.find({}, {timestamp: 1, _id:0}).sort({timestamp:-1}).limit(1)
which would project just the maximum timestamp value using a covered query, which is pretty efficient.
From that point on, it's up to your application logic to handle the 'conflict'.
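A sketch of that check-then-write flow (lastSeenTimestamp is the value the application memorized at read time; note that a race window remains between the check and the insert, which is exactly the 'conflict' your application logic has to handle):
var latest = db.collection.find({}, { timestamp: 1, _id: 0 })
                          .sort({ timestamp: -1 }).limit(1).toArray()[0];
if (!latest || latest.timestamp.getTime() <= lastSeenTimestamp.getTime()) {
  db.collection.insert({ timestamp: new Date() /* ...your document... */ });
} else {
  // conflict: another document has been written since we last read
}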
db.books.update({bookId:"123461"},{$set:{"bookPrice":"6.23"}})
I get the error:
update { q: { bookId: "123461" }, u: { $set: { bookPrice: "6.23" } }, multi: false, upsert: false } does not contain _id or shard key for pattern { _id: "hashed" }
But when I use below it works.
db.books.update({_id:ObjectId("54b88167498ec382221a82c2")},{$set: {"bookPrice":"6.23"}})
Why doesn't it work with the bookId?
The reason is that you are on a sharded cluster, and you are identifying the document you wish to update with an index that is neither unique nor the shard key, while specifying that you only want to update one document (multi: false).
Consider that when you make a query that does not include the shard key, mongodb has to scatter the query to ALL of the shards because there is no way for mongos to figure out which shard the document might be on.
So if mongos were to broadcast your query to all of the shards, two or more of them may find a document that matches your query and they would both update the document they found. This would violate your {multi: false}.
Now perhaps you know that bookId is a unique identifier, but your MongoDB cluster does not. Is there any chance you could replace bookId with the _id? That is, could you change the document so that instead of having a bookId field, it has _id: "123461"? Alternatively, if you know bookId is unique, you could just set multi: true, though that will not be an efficient operation, since the command has to be sent to all of the shards even though the document lives on only one of them.
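For illustration, the multi: true variant suggested above would look like this (shell syntax, reusing the field names from the question):
db.books.update({ bookId: "123461" }, { $set: { bookPrice: "6.23" } }, { multi: true })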
This is my MongoDB query:
db.events.update({date:{$gte: ISODate("2014-09-01T00:00:00Z")}},{$set:{"artists.$.soundcloud_toggle":false}},{multi:true,upsert:false})
Apparently I cannot use "artists.$.soundcloud_toggle" to update all artist documents within the artists array:
"The $ operator can update the first array element that matches
multiple query criteria specified with the $elemMatch() operator.
http://docs.mongodb.org/manual/reference/operator/update/positional/"
I'm happy to run the query a number of times, changing the array index each time, in order to set the soundcloud_toggle property of every artist in every event that matches the query, e.g.:
artists.0.soundcloud_toggle
artists.1.soundcloud_toggle
artists.2.soundcloud_toggle
artists.3.soundcloud_toggle
The problem is: when there is, say, only one artist document in the artists array and I run the query with "artists.1.soundcloud_toggle", it will insert an artist document into the artists array with a single property:
{
"soundcloud_toggle" : true
},
(I have declared "upsert: false", which should be false by default anyway.)
How do I stop the query from inserting a document and setting soundcloud_toggle:false when there is no existing document there? I only want it to update the property if an artist exists at the given artists array index.
If, like you said, you don't mind completing the operation with multiple queries, you can add an $exists condition to your filter.
E.g. in the 5th iteration, when updating index=4, add: "artists.4": {$exists: true}, like:
db.events.update(
{ date: {$gte: ISODate("2014-09-01T00:00:00Z")},
"artists.4": {$exists: true} },
{ $set:{ "artists.4.soundcloud_toggle" :false } },
{ multi: true, upsert: false }
)
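If you'd rather not write each index out by hand, a small shell loop can build the same updates (a sketch; maxArtists is an assumed upper bound on the array length):
var maxArtists = 10; // assumed upper bound on the number of artists per event
for (var i = 0; i < maxArtists; i++) {
  var query = { date: { $gte: ISODate("2014-09-01T00:00:00Z") } };
  query["artists." + i] = { $exists: true };
  var update = { $set: {} };
  update.$set["artists." + i + ".soundcloud_toggle"] = false;
  db.events.update(query, update, { multi: true, upsert: false });
}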
I have a huge MongoDB collection with 6 million records. Each document has two fields (latitude, longitude), and I would like to add a third field to the collection with the type of point (spatial). How can I do this from the command line or in PHP?
If you'd like to add a new field (with the same value) to all documents in a collection, that can be done easily with an update() operation. Consider the following shell example:
db.collection.update(
{},
{ $set: { type: "spatial" }},
{ multi: true }
);
This would set the type field to "spatial" for all documents matching empty criteria {} (i.e. everything), and the multi option allows the update to modify multiple documents instead of just the first document matched (default behavior).
If you only wanted to set the type field where it doesn't already exist, you could tweak the criteria like so:
db.collection.update(
{ type: { $exists: false }},
{ $set: { type: "spatial" }},
{ multi: true }
);
Since you're storing geospatial data, you may want to have a look at MongoDB's 2dsphere indexes. This would allow you to store and index well-formed GeoJSON objects in your document. See this previous answer from a related question for more introductory information on the subject.
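For example, a shell sketch of converting existing latitude/longitude fields into a GeoJSON point and indexing it (the loc field name is an assumption; note that GeoJSON coordinates are ordered [longitude, latitude]):
db.collection.find({ latitude: { $exists: true }, longitude: { $exists: true } }).forEach(function(doc) {
  db.collection.update(
    { _id: doc._id },
    { $set: { loc: { type: "Point", coordinates: [ doc.longitude, doc.latitude ] } } }
  );
});
db.collection.createIndex({ loc: "2dsphere" });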
Considering a simple mongo document structure:
{ _id, firstTime, lastTime }
The client needs to insert a document with a known ID, or update an existing document. 'lastTime' should always be set to the current (latest) time. For 'firstTime': if the document is being inserted, 'firstTime' should be set to the current time; if the document already exists, 'firstTime' should remain unchanged. I would like to do it purely with upserts (to avoid lookups).
I've crawled the http://www.mongodb.org/display/DOCS/Updating, but I just don't see how that particular operation can be done.
I don't believe this is something unreasonable, there are $push and $addToSet operations that effectively do that on array fields, just nothing that would do the same on simple fields. It's like there should be something like $setIf operation.
I ran into the exact same problem, and there was no simple solution for versions before 2.4; however, since 2.4 the $setOnInsert operator lets you do exactly that.
db.collection.update( <query>,
{ $setOnInsert: { "firstTime": <TIMESTAMP> } },
{ upsert: true }
)
See the 2.4 release notes for $setOnInsert for more info.
I ran into a very similar problem when attempting to upsert documents based on existing content; maybe this solution will work for you also:
Try removing the _id attribute from your record and only use it in the query portion of your update (you'll have to translate from pymongo speak...)
myid = doc.get('_id')
del doc['_id']
mycollection.update({'_id':myid}, {'$set':doc}, upsert=True)
If you run the following code twice in a row, the first run will set both firstVisit and lastVisit when the document is inserted (and will return upsertedId in the response); the second run will only update lastVisit (and will return modifiedCount: 1).
Tested with Mongo 4.0.5, though I believe it should work with older versions as well.
db.collection.updateOne(
{_id: 1},
{
$set: {
lastVisit: Date.now()
},
$setOnInsert: {
firstVisit: Date.now()
}
},
{ upsert: true }
);
There's no way to do this with just one upsert. You'd have to do it as two operations: first try to insert the document; if it already exists, the insert will fail due to a duplicate key violation on the _id index. Then do an update operation to set lastTime to now.
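A minimal shell sketch of that two-step approach (knownId stands in for the client's known _id; the duplicate key error from the first step is simply ignored):
// step 1: try to insert; this fails with a duplicate key error if the document already exists
db.collection.insert({ _id: knownId, firstTime: new Date(), lastTime: new Date() });
// step 2: unconditionally set lastTime to now
db.collection.update({ _id: knownId }, { $set: { lastTime: new Date() } });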