How to insert new document only if it doesn't already exist in MongoDB - mongodb

I have a collection of users with the following schema:
{
_id:ObjectId("123...."),
name:"user_name",
field1:"field1 value",
field2:"field2 value",
etc...
}
The users are looked up by user.name, which must be unique. When a new user is added, I first perform a search, and if no such user is found, I add the new user document to the collection. Searching for the user and adding a new one if not found are not atomic, so when multiple application servers are connected to the DB server, two add_user requests with the same user name can be received at the same time. Neither request finds an existing user, which results in two documents having the same "user.name". In fact this happened (due to a bug on the client) with just a single app server running NodeJS and using the Async library.
I was thinking of using findAndModify, but that doesn't work: I'm not simply updating a field (that may or may not exist) of an existing document, where I could use upsert. I want to insert a new document only if the search finds nothing. I can't make the query "not equal to user.name" either, since it would match other users.

First of all, you should maintain a unique index on the name field of the users collection. This can be specified in the schema if you are using Mongoose, or created with the statement below (ensureIndex is the older name; newer drivers call it createIndex):
collection.ensureIndex('name', {unique: true}, callback);
This guarantees that the name field remains unique and solves the problem of concurrent requests described in your question: the second insert fails with a duplicate key error instead of creating a second document. With the index in place you do not need to search first; just insert and handle the error.
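As a minimal sketch of the "just insert and handle the error" flow: it assumes `users` is a collection object from the official Node.js driver whose insertOne() returns a promise, and that a unique index on name already exists. The helper name insertUserIfAbsent is hypothetical; 11000 is MongoDB's duplicate key error code.

```javascript
// Sketch only: relies on a unique index on `name` having been created beforehand.
// `users` is assumed to expose insertOne(doc) returning a promise (Node.js driver style).
const DUPLICATE_KEY = 11000; // MongoDB duplicate key error code

// Hypothetical helper: resolves to true if the user was inserted,
// false if a user with the same name already exists.
async function insertUserIfAbsent(users, user) {
  try {
    await users.insertOne(user);
    return true; // the name was free and the document is now stored
  } catch (err) {
    if (err && err.code === DUPLICATE_KEY) {
      return false; // the unique index rejected the duplicate
    }
    throw err; // any other error is a real failure
  }
}
```

Because the index is enforced by the server, two concurrent calls with the same name cannot both return true, no matter how many app servers are involved.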

Related

Is it possible to run a "dummy" query to see how many documents _would_ be inserted

I am using MongoDB to track unique views of a resource.
Every time a user views a specific resource for the first time, a new view is logged in the db.
If that same user views the same resource again, the unique compound index on the collection blocks the insert of the duplicate.
For bulk inserts, with { ordered: false }, Mongo allows the new views through and blocks the duplicates. The return value of the insert is an object with an insertedCount property, telling me how many docs made it past the unique index.
In some cases, I want to know how many docs would be inserted before running the query. Then, based on the dummy insertedCount, I would choose to run the query, or not.
Is there a way to test a query and have it do everything except actually inserting the docs?
I could solve this by running some JS server-side to get the answer I need, but I would prefer to let the db do those checks.

Is this firebase security rule redundant?

I have a collection of users, and I have a separate collection of usernames. In my collection usernames I store different usernames as doc_ids. That is, under collection usernames I can have doc_ids as first, second, third, and so on. Under each doc_id I store the following info:
{
ownerId: id,
dateUpdated: someDate
}
When I change a user's username, I execute a batch write in which I first delete the oldUsername doc and then insert the newUsername doc with the appropriate fields. My question is about one of the security rules for the usernames collection: do I need to check whether such a username (that is, such a doc_id) already exists? In other words, do I need the following rule:
match /usernames/{username} {
allow create: if !exists(/databases/$(database)/documents/usernames/$(username))
}
I think this rule is redundant, since document IDs are already unique within a collection, but I have seen it in a few other posts, so I wanted to check other people's opinions.
Yup, that rule does nothing as the create will only be triggered when the document doesn't exist yet. If the document already exists, its .update will be triggered.
This type of check is common in a .write, but not needed when you're using the more granular .create.

Firestore transaction based on non existent document (Collection level locking not available)

Based on this SO answer I came to know that firestore does not have collection level locking in a transaction.
In my case, I have to ensure that the username field in users collection is unique before I write to a collection.
For that, I write a transaction that does this:
1. Executes a query on the users collection to check if a document exists where username=something.
2. If it does exist, fails and returns an error from the transaction.
3. If it does not exist, runs the write operation for the userId I want to update/create.
Now the issue here is that if two clients simultaneously try to run this transaction, both might query the collection, and since the collection is not locked, one client might insert/update a document in the collection while the other won't see it.
Is my assumption correct? And if yes, then how to deal with such scenarios?
What you're trying to do is actually not possible to do atomically, as it's not possible to transact safely on a document that you can't identify with an ID. The problem here is that a transaction is only "safe" if you can get() the specific document to add or modify. Since you can't get() a document using a field value in the document, you're at a loss.
If you want to ensure uniqueness of anything in Firestore, that uniqueness will need to be coded into the document ID itself. In the simplest case, you can use the username as the ID of a document in a new collection. If you do that, your transaction can simply get() the required document by username, check to see if it exists, then write the document if it doesn't. Else, the transaction can fail.
Bear in mind that because there are limitations to document IDs in Firestore, you might need to escape or encode that username if your usernames could possibly violate the rules.
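A minimal sketch of that transaction, assuming the namespaced Firebase JavaScript SDK (db = firebase.firestore()), where a snapshot's exists is a boolean property. The collection name "usernames", the ownerId field, and the helper claimUsername are illustrative assumptions, not part of any API.

```javascript
// Sketch only: the username itself is the document ID, so the transaction
// can get() exactly the document whose existence it depends on.
// `db` is assumed to be a Firestore instance from the namespaced SDK.
function claimUsername(db, username, userId) {
  const ref = db.collection('usernames').doc(username); // hypothetical collection
  return db.runTransaction(async (tx) => {
    const snap = await tx.get(ref);
    if (snap.exists) {
      // Throwing rejects the transaction promise and writes nothing.
      throw new Error('username already taken: ' + username);
    }
    tx.set(ref, { ownerId: userId }); // hypothetical field
  });
}
```

If two clients race for the same username, both transactions read the same document, so Firestore can detect the conflict and retry one of them, which then sees that the document exists and fails.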
An alternative to coding this data into the doc id is to use a separate collection as a sort of manual index. Security rules can then enforce uniqueness on the index. So something like this:
/docs/${documentId} => {uniqueField: "foo", ...}
/docmap/${uniqueField} => {docId: "doc2"}
The idea here is that one must first write the docmap entry containing the new doc id before they are allowed to write the doc. Since the docmap is keyed on our unique field, it enforces uniqueness.
Security rules would look roughly like so:
function getPath(childPath) {
  return path('/databases/' + database + '/documents/' + childPath);
}
// we can only write to our doc if the unique field exists in docmap/
// and matches our doc id
match /docs/{docId} {
  allow write: if get(getPath('docmap/' + request.resource.data.uniqueField)).data.docId == docId;
  // todo validate data schema
}
// It is only possible to add a uniqueField to the docmap
// if it doesn't already exist for another doc
// we also validate that the doc id matches our schema
match /docmap/{uniqueField} {
  allow write: if resource == null &&
    request.resource.data.docId is string &&
    request.resource.data.docId.size() < 100;
}
And a write would look roughly like so:
const db = firebase.firestore();
db.doc('docmap/foo').set({docId: 'doc2'})
.then(() => db.doc('docs/doc2').set({uniqueField: 'foo'}))
.then(() => console.log("success"))
.catch(e => console.error(e));
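You could also do this in a transaction or even a batch operation to make it atomic, but it's probably not necessary to add complexity to the process; the security rules will enforce the constraints. If you do batch the two writes, note that a rule can see a document written in the same batch only through getAfter(), not get().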

Firestore security rules: check if array contains strings different from user's ID

I know how to check if an array contains a given string (as explained for example here). My requirement however is different: I have a document with an array updatedByHistoryArray written at server side that contains the history of the ids of all users who updated such a document, for example [id1, id2, ..., idn].
I would like to allow a delete operation for this document only if the latter has been updated exclusively by the user who wants to delete it.
So, for example, if a user with id24 wants to delete a document, the updatedByHistoryArray of this document has to be [id24, id24, ..., id24].
Is it possible to implement this requirement in the security rules of Firestore?
It sounds possible. Try using hasOnly() to see if the list field contains only a single user ID.
resource.data.updatedByHistoryArray.hasOnly([request.auth.uid])

mongodb upsert with conditional field update

I have a script that populates a mongo db from daily server log files. Log files come from a number of servers so the chronological order of the data is not guaranteed. To make this simple, let's say that the document schema is this:
{
_id: <username>,
first_seen: <date>,
last_seen: <date>,
most_recent_ip: <string>
}
that is, documents are indexed by the name of the user who accessed the server. For each user, we keep track of the first time the user was seen and the ip from the last visit.
Right now I handle this very inefficiently: first try an insert. If it fails, retrieve a record by _id, then calculate updated values (e.g. first_seen and most_recent_ip), and finally update the record. This is 3 db calls per log entry, which makes the script's running time prohibitively long given the very high volume of data.
I'm wondering if I can replace this with an upsert instead. I can see how to handle first/last_seen: probably something like {$min: {'first_seen': <log_entry_date>}} (hope this works correctly when inserting a new doc). But how do I set most_recent_ip to the new value only when <log_entry_date> > $last_seen?
Is there generally a preferred pattern for my use case?
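You can just use $set to set the most_recent_ip, together with $min and $max for the dates. Note that $set overwrites most_recent_ip on every call, so this gives the right IP only when the entry being processed is the newest one seen so far. In a real script, pass the log entry's date where new Date() appears, e.g.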
db.logs.update(
{_id:"user1"},
{$set:{most_recent_ip:"2.2.2.2"}, $min:{first_seen:new Date()}, $max:{last_seen:new Date()}},
{upsert: true}
)
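To update the IP only when the entry is newer than last_seen, one option is a pipeline-style update, which MongoDB accepts from version 4.2 on. The sketch below only builds the update document; buildSeenUpdate is a hypothetical helper, and the field names come from the schema in the question. It relies on missing fields sorting below dates in comparisons and being ignored by the $min/$max expressions, which is worth verifying on your server version before depending on it.

```javascript
// Sketch only: builds an aggregation-pipeline update (assumes MongoDB 4.2+).
// All "$field" references read the document as it was BEFORE this update.
function buildSeenUpdate(entryDate, ip) {
  return [
    {
      $set: {
        // Keep the stored IP unless this entry is newer than the stored last_seen.
        most_recent_ip: {
          $cond: [
            { $gt: [entryDate, '$last_seen'] },
            { $literal: ip }, // $literal keeps the string from being read as a field path
            '$most_recent_ip',
          ],
        },
        first_seen: { $min: ['$first_seen', entryDate] },
        last_seen: { $max: ['$last_seen', entryDate] },
      },
    },
  ];
}

// Possible usage with the Node.js driver (`logs` is an assumed collection object):
// await logs.updateOne({ _id: 'user1' }, buildSeenUpdate(entryDate, '2.2.2.2'), { upsert: true });
```

This keeps the work to a single db call per log entry, regardless of the order in which the log files are processed.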