We need to retrieve an ID that uniquely identifies a document, so that when a user opens the same document in different sessions (even a year apart) we can identify this in the logs.
In the API I found DocumentURL but this could change (if the document is moved?) and it might even be empty (if the document is never stored online?). We could hash a combination of properties like Author and Date Created but these too can change and thus can't be fully relied upon.
How do we access the ID of a document? Ideally we're looking for a solution that works for any type of document, but if currently there is only such a property for a Word document then that is sufficient as well.
EDIT: Adding scenarios that need to work because otherwise my request seems too simple (hence the down-votes?):
The user can open, edit, save, etc. other documents and the ID should ALWAYS be the same PER document. Similarly, if a user shares a document with someone else, the ID read by the other user (when running our add-in) should be the same as for the owner of that document.
The add-in needs to be portable and usable on multiple platforms. When a user opens the same document on Word Online and Win 32, on different computers, etc. the ID must always be the same for that document.
To create a unique ID, it takes only a little JavaScript to create a GUID. See this SO post for example: Create GUID/UUID in JavaScript
To store the ID, you could use a custom setting or custom property. See Persist State and Settings
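A minimal sketch of the two steps above (generate once, then persist inside the document), assuming a small key/value interface. The names `SettingsStore`, `DOCUMENT_ID_KEY`, `getOrCreateDocumentId` and `InMemorySettings` are made up for illustration. In an add-in, `Office.context.document.settings` (`get`, `set`, then `saveAsync` to persist) could sit behind the interface, and `generateId` could be any GUID generator.

```typescript
// Minimal key/value shape (hypothetical). In an add-in this could be backed by
// Office.context.document.settings: get/set, followed by saveAsync to persist.
interface SettingsStore {
  get(name: string): unknown;
  set(name: string, value: string): void;
}

const DOCUMENT_ID_KEY = "documentId"; // hypothetical setting name

// Returns the ID already stored in the document, or creates and stores a new one.
function getOrCreateDocumentId(
  store: SettingsStore,
  generateId: () => string // any GUID/UUID generator
): string {
  const existing = store.get(DOCUMENT_ID_KEY);
  if (typeof existing === "string" && existing.length > 0) {
    return existing; // same document, same ID, on any platform
  }
  const id = generateId();
  store.set(DOCUMENT_ID_KEY, id);
  return id;
}

// In-memory stand-in for the document's settings, for illustration only.
class InMemorySettings implements SettingsStore {
  private values = new Map<string, string>();
  get(name: string): unknown {
    return this.values.get(name);
  }
  set(name: string, value: string): void {
    this.values.set(name, value);
  }
}
```

Because the ID lives inside the file, it travels with the document when it is shared or opened elsewhere. A copy made with "Save As" would carry the same ID too.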
I am building a small web app with MERN. I have a collection that holds "name, email, password, avatar url, and date", and I am going to add some info to the users, like "bio", "hobbies" (array), "visited countries" (array), and another array.
The question is: should I create a different model for the users' info and add an owner field that refers to the other model, or should I put all of it in the same collection?
Also, I might add a following and followers option in the future.
The user's info should be in the user collection; I see no reason to have a separate collection for it. If you want to reduce the size of the responses when listing users, you could use a projection (select in Mongoose) to exclude unnecessary fields.
Regarding following and followers, I think there are 2 approaches:
Add a new field to the existing collection that stores the ids and necessary metadata (name, avatar) of the related users.
Create a new collection that pairs users with the users they follow or are followed by. You could then use a Virtual to get this information from the User collection.
Personally, I prefer the first approach, although it takes more effort to keep the list accurate, e.g. removing an item from the list when a follower stops following you.
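A rough sketch of the first approach, using plain in-memory objects rather than a real Mongoose model. The names `FollowRef`, `SocialUser`, `follow` and `unfollow` are made up for illustration. It shows the maintenance cost mentioned above: both sides of the relationship have to be updated together.

```typescript
// Embedded reference: the id plus the metadata needed to render a list entry.
interface FollowRef {
  id: string;
  name: string;
  avatar: string;
}

interface SocialUser {
  id: string;
  name: string;
  avatar: string;
  following: FollowRef[];
  followers: FollowRef[];
}

function toRef(user: SocialUser): FollowRef {
  return { id: user.id, name: user.name, avatar: user.avatar };
}

// Adds the relationship on both sides, ignoring self-follows and duplicates.
function follow(follower: SocialUser, target: SocialUser): void {
  if (follower.id === target.id) return;
  if (!follower.following.some((r) => r.id === target.id)) {
    follower.following.push(toRef(target));
  }
  if (!target.followers.some((r) => r.id === follower.id)) {
    target.followers.push(toRef(follower));
  }
}

// Removes the relationship on both sides so the lists stay accurate.
function unfollow(follower: SocialUser, target: SocialUser): void {
  follower.following = follower.following.filter((r) => r.id !== target.id);
  target.followers = target.followers.filter((r) => r.id !== follower.id);
}
```

Since name and avatar are copied into the lists, a user who changes their avatar would also need those copies refreshed.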
I'm after some advice on a Firestore DB structure. I have an app that has a Firestore DB and allows a single user (under the one UID) to create a profile for each member of their family (each profile is a document within the collection). Each document holds the personal details of the family member as fields (for example, field1 = first name, field2 = last name, field3 = phone number, and so on). This works well, but there is one other detail I need to attach to every field within each profile: I need to be able to set a private or public flag against each individual field (for example, first name has a public flag, last name has a private flag, phone number has a private flag, and so on). It would be nice if each field could have nested fields underneath (such as a "private" bool field), but as far as I can tell that's not how Firestore works. It seems to be Collection/Document/Collection/Document/and so on...
If I didn't need the private/public flag, I would not have an issue. My data would fit the Firestore structure perfectly.
Does anyone have any suggestions on how I might best achieve this outcome?
Cheers and thanks in advance...
Family Profiles current structure without flags
You can use the structure above. With it you can fetch private data and public data separately whenever you need to. If you want to show only the first name to other users of your app, you can use queries to control what is shown. Also, always use unique IDs as document keys rather than hardcoded names such as JaneDoe or JoeDoe; otherwise you may run into problems fetching data from Firestore later.
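One way to picture fetching public and private data separately is a sketch with plain objects, not Firestore calls. It assumes each field is stored as a small map holding a value and a flag (Firestore documents can contain map-typed fields). The names `FlaggedField`, `FlaggedProfile` and `splitByVisibility` are made up for illustration.

```typescript
// One profile field together with its visibility flag (hypothetical shape).
interface FlaggedField {
  value: string;
  isPublic: boolean;
}

type FlaggedProfile = { [fieldName: string]: FlaggedField };

interface SplitProfile {
  publicData: { [fieldName: string]: string };
  privateData: { [fieldName: string]: string };
}

// Splits a flagged profile into two plain objects that could be stored
// (and therefore fetched) as separate documents.
function splitByVisibility(profile: FlaggedProfile): SplitProfile {
  const result: SplitProfile = { publicData: {}, privateData: {} };
  for (const name of Object.keys(profile)) {
    const field = profile[name];
    if (field.isPublic) {
      result.publicData[name] = field.value;
    } else {
      result.privateData[name] = field.value;
    }
  }
  return result;
}
```

Flipping a flag then means moving that field from one object to the other on the next write.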
If you have questions feel free to ask
Take a look at the official Firebase documentation. The information there will help you work out the most suitable data structure for this service. As for your question, it depends on your use case; it would be useful if you could give us more context about why your implementation needs to work the way you describe.
Also, since your concerns relate to managing the privacy of your data, check this document too.
I hope this information will help you
Let's say we have two models like this:
User:
- _id
- name
- email
Company:
- _id
- name
- slug
Now let's say I need to connect a user to the company. A user can have one company assigned. To do this, I can add a new field called companyID in the user model. But I'm not sending the _id field to the front end. All the requests that come to the API will have the slug only. There are two ways I can do this:
1) Add slug to relate the company: If I do this, I can take the slug sent from a request and directly query for the company.
2) Add the _id of the company: If I do this, I need to first use the slug to query for the company and then use the _id returned to query for the required data.
May I please know which way is the best? Is there any extra benefit when using the _id of a record for the relationship?
Agree with the 2nd approach. There are several issues to consider when deciding on which field to use as a join key (this is true of all DBs, not just Mongo):
The field must be unique. I'm not sure exactly what the 'slug' field in your schema represents, but if there is any chance this could be duplicated, then don't use it.
The field must not change. Strictly speaking, you can change a key field but the only way to safely do so is to simultaneously change it in all the child tables atomically. This is a difficult thing to do reliably because a) you have to know which tables are using the field (maybe some other developer added another table that you're not aware of) b) If you do it one at a time, you'll introduce race conditions c) If any of the updates fail, you'll have inconsistent data and corrupted parent-child links. Some SQL DBs have a cascading-update feature to solve this problem, but Mongo does not. It's a hard enough problem that you really, really don't want to change a key field if you don't have to.
The field must be indexed. Strictly speaking this isn't true, but if you're going to join on it, then you will be running a lot of queries on it, so you'll need to index it.
For these reasons, it's almost always recommended to use a key field that serves solely as a key field, with no actual information stored in it. Plenty of people have been burned using things like Social Security Numbers, driver's licenses, etc. as key fields, either because there can be duplicates (e.g. SSNs can be duplicated if people are using fake numbers, or if they don't have one), or the numbers can change (e.g. driver's licenses).
Plus, by doing so, you can format the key field to optimize for speed of unique generation and indexing. For example, if you use SSNs, you need to check the SSN against the rest of the DB to ensure it's unique. That takes time if you have millions of records. Similarly for slugs, which are text fields that need to be hashed and checked against an index. OTOH, MongoDB generates UUID-like ObjectIds as keys, so creating a new key doesn't require a lookup against existing records (the algorithm guarantees a high statistical likelihood of uniqueness).
The bottom line is that there are very good reasons not to use a "real" field as your key if you can help it. Fortunately for you, MongoDB already gives you a great key field which satisfies all the above criteria, the _id field. Therefore, you should use it. Even if slug is not a "real" field and you generate it the exact same way as an _id field, why bother? Why does a record have to have 2 unique identifiers?
The second issue in your situation is that you don't expose the company's _id field to the user. Intuitively, it seems like that should be a valuable piece of information that shouldn't be given out willy-nilly. But the truth is, it has no informational value by itself, because, as stated above, a key should have no actual information. The place to implement security is in the query, ensuring that the user doing the query has permission to access the record / specific fields that she's asking for. Hiding the key is a classic security-by-obscurity that doesn't actually improve security.
The only time to hide your primary key is if you're using a poorly thought-out key that does contain useful information. For example, an invoice Id that increments by 1 for each invoice can be used by someone to figure out how many orders you get in a day. Auto-increment Ids can also be easily guessed (if my invoice is #5, can I snoop on invoice #6?). Fortunately, Mongo uses UUID-like ObjectIds, so there's very little information leaking out (except for the creation timestamp embedded in each id, and maybe timing attacks on the generation algorithm? And if you're worried about that, you need far more in-depth security considerations than this post :-).
Look at it another way: if a slug reliably points to a specific company and user, then how is it more secure than just using the _id?
That said, there are some instances where exposing a secondary key (like slugs) is helpful, none of which have to do with security. For example, if in the future you need to migrate DB platforms and need to re-generate keys because the new platform can't use your old ones; or if users will be manually typing in identifiers, then it's helpful to give them something easier to remember like slugs. But even in those situations, you can use the slug as a handy identifier for users to use, but in your DB, you should still use the company ID to do the actual join (like in your option #2). Check out this discussion about the pros/cons of exposing _ids to users:
https://softwareengineering.stackexchange.com/questions/218306/why-not-expose-a-primary-key
So my recommendation would be to go ahead and give the user the company Id (along with the slug if you want a human-readable format e.g. for URLs, although mongo _ids can be used in a URL). They can send it back to you to get the user, and you can (after appropriate permission checks) do the join and send back the user data. If you don't want to expose the company Id, then I'd recommend your option #2, which is essentially the same thing except you're adding an additional query to first get the company Id. IMHO, that's a waste of cycles for no real improvement in security, but if there are other considerations, then it's still acceptable. And both of those options are better than using the slug as a primary key.
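Option #2 can be sketched with in-memory arrays standing in for the two collections. This is an illustration under that assumption, not MongoDB driver code, and `CompanyRecord`, `CompanyUser` and `findUsersByCompanySlug` are made-up names. The slug is used only to look the company up; the join itself runs on the stored company _id.

```typescript
interface CompanyRecord {
  _id: string;
  name: string;
  slug: string;
}

interface CompanyUser {
  _id: string;
  name: string;
  email: string;
  companyId: string; // stores the company's _id, never its slug
}

// Step 1: resolve the slug sent by the client to the company's _id.
// Step 2: join on that _id.
function findUsersByCompanySlug(
  slug: string,
  companies: CompanyRecord[],
  users: CompanyUser[]
): CompanyUser[] {
  const company = companies.find((c) => c.slug === slug);
  if (!company) return [];
  return users.filter((u) => u.companyId === company._id);
}
```

Because users reference the _id, renaming a company's slug changes one record and leaves every user link intact.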
The second approach is the best, that is, add the _id of the company.
Using _id is best practice for querying any kind of information; even complex queries can be solved using _id, as it is a unique ObjectId created by MongoDB. Population is the process of automatically replacing the specified paths in the document with document(s) from other collection(s). We may populate a single document, multiple documents, a plain object, multiple plain objects, or all objects returned from a query.
In most of my apps, I need to store ID on data attributes to perform CRUD operations on specific elements of the DOM.
Indeed, my elements don't necessarily match specific criteria, or they share multiple criteria, so the only way I have to delete them (for example when a user clicks on one) is to store their ID in a data-id attribute and then send it to my server.
I use socket.io a lot.
Is that a good practice?
This is good practice. I don't think there is a better attribute to store this identifying data than data-id. You need some unique identifier for the document so the server knows which document the user wants to interact with when performing update/delete operations.
As long as your document is properly validated on the server side, i.e. before deleting/updating you check to make sure that the user in the session has authority to perform valid actions, there is no security risk of exposing the document _ids.
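The server-side check described above might look roughly like this. It is a sketch that assumes each document records an owner and that the session provides a trusted user id. The names `StoredDocument`, `ownerId`, `DeleteResult` and `deleteDocument` are made up for illustration.

```typescript
interface StoredDocument {
  _id: string;
  ownerId: string; // hypothetical ownership field
}

type DeleteResult = "deleted" | "not_found" | "forbidden";

// The id comes from the client (e.g. read from a data-id attribute), so it is
// treated as untrusted: the session user must own the document to delete it.
function deleteDocument(
  docs: Map<string, StoredDocument>,
  sessionUserId: string,
  requestedId: string
): DeleteResult {
  const doc = docs.get(requestedId);
  if (!doc) return "not_found";
  if (doc.ownerId !== sessionUserId) return "forbidden";
  docs.delete(requestedId);
  return "deleted";
}
```

With this check in place, knowing or guessing another user's document id is not enough to act on it.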
document/show?id=4cf8ce8a8aad6957ff00005b
Generally I think you should be cautious about exposing internals (such as DB ids) to the client. The URL can easily be manipulated, and the user may gain access to objects you don't want them to have.
For MongoDB in particular, the object ID might even reveal some additional internals (see here), i.e. the IDs aren't completely random. That might be an issue too.
Besides that, I think there's no reason not to use the id.
I generally agree with @MartinStettner's reply. I wanted to add a few points, mostly elaborating on what he said. Yes, a small amount of information is decodable from the ObjectId. This is trivially accessible if someone recognizes it as a MongoDB ObjectId. The two downsides are:
It might allow someone to guess a different valid ObjectId, and request that object.
It might reveal info about the record (such as its creation date) or the server that you didn't want someone to have.
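To illustrate the second point: the first 4 bytes of an ObjectId are a big-endian count of seconds since the Unix epoch, so the creation time can be read straight from the hex string. A small sketch follows; the function name `objectIdCreationDate` is made up for illustration.

```typescript
// Reads the creation time embedded in the first 4 bytes (8 hex characters)
// of a MongoDB ObjectId given as a 24-character hex string.
function objectIdCreationDate(objectIdHex: string): Date {
  if (!/^[0-9a-fA-F]{24}$/.test(objectIdHex)) {
    throw new Error("expected a 24-character hex string");
  }
  const secondsSinceEpoch = parseInt(objectIdHex.slice(0, 8), 16);
  return new Date(secondsSinceEpoch * 1000);
}
```

Calling it on the id from the URL shown earlier, `objectIdCreationDate("4cf8ce8a8aad6957ff00005b")`, returns a date in early December 2010.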
The "right" fix for the first item is to implement some sort of real access control: 1) a user has to login with a username and password, 2) the object is associated with that username, 3) the app only serves objects to a user that are associated with that username.
MongoDB doesn't do that itself; you'll have to rely on other means. Perhaps your web-app framework, and/or some ad-hoc access control list (which itself could be in MongoDB).
But here is a "quick fix" that mostly solves both problems: create some other "id" for the record, based on a large, high-quality random number.
How large does "large" need to be? A 128-bit random number has 3.4 * 10^38 possible values. So if you have 10,000,000 objects in your database, someone guessing a valid value is a vanishingly small probability: 1 in 3.4 * 10^31. Not good enough? Use a 256-bit random number... or higher!
How to represent this number in the document? You could use a string (encoding the number as hex or base64), or MongoDB's binary type. (Consult your driver's API docs to figure out how to create a binary object as part of a document.)
You could add a new field to your document to hold this, but then you'd probably also want an index on it. So the document size is bigger, and you spend more memory on that index. Here's what you might not have thought of: simply USE that "truly random id" as your document's "_id" field. Thus the per-document size is only a little higher, and you use the index that you [probably] had there anyway.
I can set both the 128-character session string and the object ids of the other collection documents as cookies, and when the user visits, do an asynchronous fetch that retrieves the session, user and account all at once, instead of fetching the session first and then the user and account. If the session document is valid, I'll share the user and account documents.
If I do this, every request for a user or account document must also include the 128-character session cookie, which makes exposing the user and account object ids safer. Anyone guessing a user ID or account ID would also have to guess the 128-character string to get any answers from the system.
Another security measure is to wrap the id in some salt whose positioning only you know, such as
XXX4cf8ce8XXXXa8aad6957fXXXXXXXf00005bXXXX
Now you know exactly how to slice that up to get the ID.
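A sketch of the wrap-and-slice idea, assuming a fixed layout that only the server knows. The layout numbers and the names `SALT_LAYOUT`, `TRAILING_PAD`, `wrapId` and `unwrapId` are made up for illustration, loosely modeled on the example string above.

```typescript
// Hypothetical secret layout: before each chunk of the id, insert `pad`
// filler characters, then copy `take` characters of the real id.
const SALT_LAYOUT = [
  { pad: 3, take: 7 },
  { pad: 4, take: 10 },
  { pad: 7, take: 7 },
];
const TRAILING_PAD = 4;

// Interleaves filler characters with the id according to the layout.
function wrapId(id: string, padChar: string = "X"): string {
  let wrapped = "";
  let pos = 0;
  for (const { pad, take } of SALT_LAYOUT) {
    wrapped += padChar.repeat(pad) + id.slice(pos, pos + take);
    pos += take;
  }
  return wrapped + padChar.repeat(TRAILING_PAD);
}

// Skips the filler and concatenates the id chunks back together.
function unwrapId(wrapped: string): string {
  let id = "";
  let pos = 0;
  for (const { pad, take } of SALT_LAYOUT) {
    pos += pad;
    id += wrapped.slice(pos, pos + take);
    pos += take;
  }
  return id;
}
```

The layout above covers exactly 24 characters, the length of an ObjectId hex string; a different id length would need a different layout.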