How can I add validation to adding email to my db? Upsert? - mongodb

I'm trying to understand the best way to do this.
I am getting the name and email and I want to add them to my collection.
However, if the email already exists, then I don't want to insert the name and email. Is there a way to do this using upsert? I'm trying to understand from the documentation but it's a bit confusing for me. http://docs.mongodb.org/manual/reference/method/db.collection.update/ Any help is greatly appreciated.

First of all, you should consider creating a unique index on the email field to ensure that there can be only one document for any particular email:
db.collection.createIndex({email: 1}, {unique: true})
You could also add the sparse option to allow documents without an email.
Then you'll have two options depending on your particular use case: to use upsert, or to use insert ignoring duplicate key errors.
Upsert
Using the following upsert operation
db.collection.update({email: email}, {$set: {name: name}}, {upsert: true})
you will:
create new document if there is no such email yet;
update existing document with new name if the email already exists.
Here is a quotation from the MongoDB documentation explaining upsert behavior when no document matches the query criteria:
The update creates a base document from the equality clauses in the <query> parameter, and then applies the update expressions from the <update> parameter.
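To make the quoted rule concrete, here is a toy model of it in plain JavaScript. It is only a sketch: `simulateUpsertInsert` is a hypothetical helper, not part of MongoDB, and it handles only simple equality clauses and `$set`.

```javascript
// Toy model only: builds the document an upsert would insert when nothing matches.
// Handles plain equality clauses and $set; real MongoDB supports far more.
function simulateUpsertInsert(query, update) {
  const doc = {};
  for (const [field, value] of Object.entries(query)) {
    // Skip operator clauses such as { age: { $gt: 5 } }; only equality contributes.
    const isOperatorClause =
      value !== null &&
      typeof value === "object" &&
      Object.keys(value).some((k) => k.startsWith("$"));
    if (!isOperatorClause) doc[field] = value;
  }
  // Then apply the update expressions on top of the base document.
  Object.assign(doc, update.$set || {});
  return doc;
}

const inserted = simulateUpsertInsert(
  { email: "ann@example.com" },
  { $set: { name: "Ann" } }
);
console.log(inserted); // { email: 'ann@example.com', name: 'Ann' }
```

This is why the upsert above stores both the email (from the query) and the name (from `$set`) when the email is new.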
Insert
If you don't want to update the name field of an existing document, you should use a basic insert operation instead:
db.collection.insert({email: email, name: name})
ignoring any E11000 duplicate key errors (error code 11000).
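As a rough sketch of the "insert and ignore duplicates" option from application code: the helper names below (`isDuplicateKeyError`, `insertIfNewEmail`) are hypothetical, and `collection` is assumed to be an object exposing `insertOne(doc)` that returns a Promise, as the official Node.js driver's collections do. It also assumes the unique index on email already exists.

```javascript
// Returns true when the error looks like a MongoDB duplicate key error (code 11000).
function isDuplicateKeyError(err) {
  return Boolean(err) && err.code === 11000;
}

// `collection` is assumed to expose insertOne(doc) returning a Promise.
async function insertIfNewEmail(collection, email, name) {
  try {
    await collection.insertOne({ email, name });
    return true; // a new document was inserted
  } catch (err) {
    if (isDuplicateKeyError(err)) return false; // email already present, nothing changed
    throw err; // anything else is a real failure
  }
}
```

The boolean result tells the caller whether the email was new, without a separate lookup query.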

Related

MongoDB updateOne with upsert failed: Duplicate Key

I have a collection with a compound unique index on two fields, uuid and id. I want to update a document if the collection already has one with the given uuid and id (the composite unique key), and I found in the documentation that updateOne with upsert=true can do this. So I use:
db.collection("messages").updateOne({uuid:this.uuid, id:new_message.id}, {$set: {uuid: this.uuid, ...new_message}}, {upsert:true})
and this always throws an error saying that there's already a document with uuid=xxx and id=yyy. I looked this up and found a post stating that a data race between update and insert can happen in a MongoDB upsert operation, so this will always happen. Is there another way to do this? How do I properly and efficiently upsert into a collection with 1 million documents?
EDIT:
I gave the wrong code for this question. The code should be:
db.collection("messages").updateOne({uuid:this.uuid, key:{id:new_message.key.id}}, {$set: {uuid: this.uuid, ...new_message}}, {upsert:true})
Since you have multiple threads, this is a common problem. Any of the supported operations in MongoDB can run into this issue, as it stems from your architecture.
You can catch the exception and retry the operation. In this case, one of the threads will succeed and the other will go through the exception handling. This is a feasible workaround.
But when do you expect both threads to update the same document at the same time? That is a serious design problem, as it can alter the desired document state.
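The catch-and-retry workaround described above could look roughly like this. It is a sketch under assumptions: `retryOnDuplicateKey` is a hypothetical helper, `operation` is any async function you supply, and the duplicate key error is assumed to carry `code === 11000`.

```javascript
// Runs `operation` (an async function, e.g. one that calls updateOne with upsert: true)
// and retries when it fails with a duplicate key error (code 11000).
async function retryOnDuplicateKey(operation, maxAttempts = 3) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      const isDuplicate = Boolean(err) && err.code === 11000;
      if (!isDuplicate || attempt >= maxAttempts) throw err;
      // Another writer inserted the document first; on the next attempt
      // the filter should match it and the upsert becomes a plain update.
    }
  }
}

// Usage (hypothetical `messages` collection from the Node.js driver):
// await retryOnDuplicateKey(() =>
//   messages.updateOne({ uuid, "key.id": id }, { $set: fields }, { upsert: true }));
```

Capping the attempts keeps a persistent failure, such as a filter that never matches the existing document, from looping forever.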
So, after trying out things I found out that I should use dot notation in the query, I changed it to:
db.collection("messages").updateOne({uuid:this.uuid, "key.id":new_message.key.id}, {$set: {uuid: this.uuid, ...new_message}}, {upsert:true})
and now it works.
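A plausible reading of why the dot notation fixed it, assuming the stored key subdocument has fields besides id: a filter like key:{id: ...} compares the whole subdocument, so it never matched the existing document and the upsert kept attempting an insert, which hit the unique index. The toy matchers below illustrate the difference; they are hypothetical helpers, not MongoDB's implementation, and the `remote` field is invented for the example.

```javascript
// Toy matchers, for illustration only (not MongoDB's implementation).

// { key: { id: 7 } } style: the whole subdocument must be equal,
// including having no extra fields (and the same field order).
function matchesEmbedded(doc, field, subdoc) {
  return JSON.stringify(doc[field]) === JSON.stringify(subdoc);
}

// { "key.id": 7 } style: only the named nested field is compared.
function matchesDotted(doc, path, value) {
  const found = path
    .split(".")
    .reduce((cur, part) => (cur == null ? undefined : cur[part]), doc);
  return found === value;
}

const stored = { uuid: "u1", key: { id: 7, remote: "x" } }; // hypothetical message
console.log(matchesEmbedded(stored, "key", { id: 7 })); // false
console.log(matchesDotted(stored, "key.id", 7)); // true
```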

What is the best way to delete duplicate records after a raw import of text files to MongoDB?

I imported a lot of records and need to be able to delete duplicates that might have been imported by mistake.
On a separate note, I would like to be able to query all records for specific keywords. I am new to MongoDB and was hoping someone could help with a query or two.
In order to remove duplicates based on a key, you can create a unique index on the collection and enable dropDups like this:
db.yourCollection.ensureIndex({'myKey' : 1}, {unique : true, dropDups : true})
This index will keep the first document for each key value and drop any duplicates that follow it.
Note: dropDups will not work in MongoDB 3.0 or above. If you're on a newer version, please follow this solution here instead.
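For versions where dropDups is unavailable, one common approach (not necessarily the one in the linked solution) is to group documents by the key with an aggregation, then delete all but one _id per group. The sketch below only builds the pipeline and picks the _ids to delete; the function names are hypothetical, and which document survives in each group is arbitrary unless you sort before grouping.

```javascript
// Aggregation pipeline that groups documents by `key` and keeps only groups
// containing more than one document.
function duplicateGroupsPipeline(key) {
  return [
    { $group: { _id: "$" + key, ids: { $push: "$_id" }, count: { $sum: 1 } } },
    { $match: { count: { $gt: 1 } } },
  ];
}

// Given the pipeline's output, returns the _ids to delete: everything except
// whichever _id comes first in each group.
function idsToDelete(groups) {
  return groups.flatMap((g) => g.ids.slice(1));
}

// Usage with the Node.js driver (assumed `coll` collection):
// const groups = await coll.aggregate(duplicateGroupsPipeline("myKey")).toArray();
// await coll.deleteMany({ _id: { $in: idsToDelete(groups) } });
```

Once the duplicates are gone, a plain unique index on the key can be created to stop new ones from appearing.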
As to query records for specific keywords, you can use both find (with or without regex) and MongoDB's text search.
You can find more information on MongoDB find here and on Text Search here.

Mongodb: get notified if exist, insert if not exist

I'm handling my user registration logic with MongoDB. I need to insert a user if it does not exist, but I also need to know whether it already exists before inserting, so I can notify the user that they have already registered.
The update method with upsert will not return the result of how many docs have inserted ( I do not . And findAndModify method will only find docs after insert. So neither way I'm not able to know if there is already such a doc before I insert.
Is there a way to do this?
Update
update and findAndModify are not good examples. I do not want to update my doc if the user already exists. I just want to know whether the username exists before inserting. If not, then insert it.
I'm not using _id with insert. Should I use username as _id and use it to insert?
Use a unique key (on e-mail address or whatever identifies a user) and check for corresponding error code when trying to insert. That's the way. Not just for MongoDB.
db.users.createIndex({email: 1}, {unique: true})
Now when inserting a duplicate e-mail, check for error codes 11000 and 11001.
If you're willing to use _id as your user ID, you can use the save() command.
This will create a new record if there isn't one already; otherwise it will update the existing record.
It's probably better to check for the user and then selectively update, but this is a quick and dirty way of doing things:
http://docs.mongodb.org/manual/reference/method/db.collection.save/
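Another way to learn whether the user was newly created, without modifying an existing document, is an upsert that uses only $setOnInsert and then inspects the result. This is a sketch, assuming the Node.js driver (whose updateOne result exposes `upsertedCount`), a unique index on username, and a non-empty `profile`; `registerUser` is a hypothetical name.

```javascript
// `users` is assumed to be a Node.js driver collection with a unique index on username.
async function registerUser(users, username, profile) {
  try {
    const result = await users.updateOne(
      { username },
      { $setOnInsert: { ...profile } }, // written only when a new document is created
      { upsert: true }
    );
    // upsertedCount is 1 when the upsert inserted a document, 0 when one already existed.
    return { created: result.upsertedCount === 1 };
  } catch (err) {
    // Two concurrent registrations can still race; the loser gets a duplicate key error.
    if (err && err.code === 11000) return { created: false };
    throw err;
  }
}
```

When `created` is false, the application can tell the user that the name is already registered, and the stored document is left untouched.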

Is there an "upsert" option in the mongodb insert command?

I know this may be a silly question, but I read on an e-book that there is an upsert option in MongoDB insert. I couldn't find proper documentation about this. Can someone educate me about this?
Since upsert is defined as an operation that "creates a new document when no document matches the query criteria", there is no place for upserts in the insert command. It is an option for the update command. If you execute a command like the one below, it works as an update if there is a document matching the query, or as an insert of the document described by the update argument.
db.collection.update(query, update, {upsert: true})
MongoDB 3.2 adds replaceOne:
db.collection.replaceOne(query, replacement, {upsert: true})
which has similar behavior, but its replacement cannot contain update operators.
As in the links provided by PKD, db.collection.insert() provides no upsert possibility. Instead, mongo insert inserts a new document into a collection. Upsert is only possible using db.collection.update() and db.collection.save().
If you happen to pass a document to db.collection.insert() which is already in the collection and thus has the same _id as an existing document, it will throw a duplicate key exception.
For upserting a single document using the Java driver:
FindOneAndReplaceOptions replaceOptions = new FindOneAndReplaceOptions();
replaceOptions.upsert(true);
collection.findOneAndReplace(
    Filters.eq("key", "value"),
    document,
    replaceOptions
);
Uniqueness should be ensured for the field used in Filters.eq("key", "value"), though; otherwise there is a possibility of adding multiple documents. See this for more

Doing an upsert in mongo, can I specify a custom query for the "insert" case? [duplicate]

I am trying to use upsert in MongoDB to update a single field in a document if found OR insert a whole new document with lots of fields. The problem is that it appears to me that MongoDB either replaces every field or inserts a subset of fields in its upsert operation, i.e. it can not insert more fields than it actually wants to update.
What I want to do is the following:
I query for a single unique value
If a document already exists, only a timestamp value (let's call it 'lastseen') is updated to a new value
If a document does not exist, I will add it with a long list of different key/value pairs that should remain static for the remainder of its lifespan.
Let's illustrate:
This example would, from my understanding, update the 'lastseen' date if 'name' is found, but if 'name' is not found it would only insert 'name' + 'lastseen'.
db.somecollection.update({name: "some name"},{ $set: {"lastseen": "2012-12-28"}}, {upsert:true})
If I added more fields (key/value pairs) to the second argument and drop the $set, then every field would be replaced on update, but would have the desired effect on insert. Is there anything like $insert or similar to perform operations only when inserting?
So it seems to me that I can only get one of the following:
The correct update behavior, but would insert a document with only a subset of the desired fields if document does not exist
The correct insert behavior, but would then overwrite all existing fields if document already exists
Is my understanding correct? If so, is this possible to solve with a single operation?
MongoDB 2.4 has $setOnInsert
db.somecollection.update(
    {name: "some name"},
    {
        $set: {
            "lastseen": "2012-12-28"
        },
        $setOnInsert: {
            "firstseen": <TIMESTAMP> // set on insert, not on update
        }
    },
    {upsert: true}
)
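When the list of insert-only fields is long, it can help to build the update document in code. The sketch below is one way to do that; `buildSeenUpdate` and the `source` field are hypothetical. It also guards against listing the same field under both $set and $setOnInsert, which MongoDB rejects as a conflict.

```javascript
// Builds an update for "touch lastseen, or create with static fields".
// `staticFields` is a hypothetical bag of values that should be written once.
function buildSeenUpdate(now, staticFields) {
  // A field may not appear in both $set and $setOnInsert, so drop lastseen here.
  const { lastseen, ...onInsert } = staticFields;
  return {
    $set: { lastseen: now },
    $setOnInsert: { firstseen: now, ...onInsert },
  };
}

const update = buildSeenUpdate("2012-12-28", { source: "import", lastseen: "ignored" });
console.log(JSON.stringify(update));
// {"$set":{"lastseen":"2012-12-28"},"$setOnInsert":{"firstseen":"2012-12-28","source":"import"}}
```

The resulting object can be passed as the second argument of an update call with {upsert: true}.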
There is a feature request for this ( https://jira.mongodb.org/browse/SERVER-340 ) which is resolved in 2.3. Odd releases are actually dev releases, so this will be in the 2.4 stable release.
So there is no real way to do this in the current stable versions yet. I am afraid the only method at the moment is to do 3 conditional queries: 1 to check for the document, then an if to either insert or update.
I suppose if you had real problems with locking here you could do this function with sole JS, but that's evil; however, it would lock this update to a single thread.