Firestore Structure for public/private fields - google-cloud-firestore

I'm after some advice on a Firestore DB structure. I have an app with a Firestore database that allows a single user (under one UID) to create a profile for each member of their family (each profile is a document within the collection). Each document holds the personal details of the family member as fields (for example, field1 = first name, field2 = last name, field3 = phone number, and so on). This works well, but there is one other detail I need to attach to every field within each profile: a private or public flag (for example, first name is public, last name is private, phone number is private, and so on). It would be nice if each field could have nested fields underneath (such as a "private" bool field), but that doesn't seem to be how Firestore works. It seems to be Collection/Document/Collection/Document/and so on...
If I didn't need the private/public flag, I would not have an issue. My data would fit the Firestore structure perfectly.
Does anyone have any suggestions on how I might best achieve this outcome?
Cheers and thanks in advance...
Family Profiles current structure without flags

You can use the structure above. With it, you can fetch private data and public data separately whenever you need to. If you only want to show, say, the first name to other users in your app, you can use queries to control what is shown. Also, always use unique IDs to store data rather than hardcoded names such as JaneDoe or JoeDoe; otherwise you may run into problems fetching data from Firestore later.
If you have questions, feel free to ask.
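As a rough illustration of keeping public and private data separate, here is a minimal Python sketch that uses plain dictionaries in place of Firestore documents. The field names, the per-field {"value", "public"} shape, and the split_profile helper are all assumptions made for this example; none of them come from a Firestore API.

```python
from typing import Any, Dict, Tuple

# Hypothetical profile: each field carries its value and a visibility flag.
profile: Dict[str, Dict[str, Any]] = {
    "firstname": {"value": "Jane", "public": True},
    "lastname": {"value": "Doe", "public": False},
    "phone": {"value": "555-0100", "public": False},
}


def split_profile(
    fields: Dict[str, Dict[str, Any]]
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split flagged fields into a public dict and a private dict."""
    public: Dict[str, Any] = {}
    private: Dict[str, Any] = {}
    for name, entry in fields.items():
        target = public if entry["public"] else private
        target[name] = entry["value"]
    return public, private


public_doc, private_doc = split_profile(profile)
print(public_doc)
print(private_doc)
```

The two resulting dictionaries could then be stored as two separate documents per family member, so that each can be read on its own.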

Take a look at the official Firebase documentation. The information there will help you understand which solution is most suitable for working with the data structure on this service. As for your question, it depends on your use case; it would be useful if you could give us more context about why your implementation needs to work the way you describe.
Also, since your concern is how to manage the privacy of your data, check this document too.
I hope this information helps.

Related

How to access my subcollection the correct way?

I am struggling to find the right way in Flutter to get the values of my user's sub-collection in Cloud Firestore!
This is my user holding the collection "affiliations"
These are the contents of the sub-collection "affiliations". In this case there are two references to organizations.
At runtime I would like to identify if a specific user is connected to an organization (organization key appears in referenced document).
So I have the user id and the organization id. How would you solve this?
Cheers!
So if I understand you correctly, the path would be:
firestore.collection('users')
.doc(user_id)
.collection('affiliations')
.doc(organization_id)
.get().then((snapshot){
//do something with your snapshot
});
Above is the recommended way, but I see you would need to rename the documents in affiliations so that each document ID is the organization ID.
What you could do instead, though it is unnecessary:
firestore.collection('users')
.doc(user_id)
.collection('affiliations')
.where('organization', isEqualTo: 'organization/idstring')
.get().then((querySnapshot) {
// do something here
});
If a specific user can only be affiliated with a specific organization once, I recommend using the organization ID as the document ID for that affiliation too. That way you don't have to query, but can directly check if a document with that ID exists.
While it may not make a big performance difference, it will keep your data more readable and allow you to check whether the user is affiliated with an organization in your security rules - something that isn't possible in your current structure.
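To make the "check whether the document exists" idea concrete, here is a small Python sketch that models the users/{user_id}/affiliations/{organization_id} path with nested dictionaries. It is only an in-memory stand-in, not Firestore client code, and every ID and field in it is made up for illustration.

```python
from typing import Dict

# In-memory stand-in for users/{user_id}/affiliations/{organization_id}.
# The affiliation document ID is the organization ID (hypothetical data).
affiliations_by_user: Dict[str, Dict[str, dict]] = {
    "user_1": {
        "org_a": {"role": "member"},
        "org_b": {"role": "admin"},
    },
}


def is_affiliated(user_id: str, organization_id: str) -> bool:
    """Direct existence check by ID; no scan over the affiliations is needed."""
    return organization_id in affiliations_by_user.get(user_id, {})


print(is_affiliated("user_1", "org_a"))
print(is_affiliated("user_1", "org_z"))
```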

Should I create a separate model (collection) for this?

I am building a small web app with MERN. I have a users collection that holds "name, email, password, avatar url, and date", and I am going to add some info to each user: a bio, hobbies (array), visited countries (array), and another array.
The question is: should I create a different model for the user's info and add an owner field that refers to the other model, or should I put it all in the existing one?
Also, I might add a following and followers option in the future.
The user's info should be in the user collection; I see no reason to have a separate collection for it. If you want to reduce the size of responses when listing users, you could use a projection (for example, Mongoose's select) to leave out unnecessary fields.
Regarding following and followers, I think there are 2 approaches:
1. Add a new field to the existing collection that stores the id and necessary metadata (name, avatar) of the related users.
2. Create a new collection that pairs users with the users they follow or are followed by. You could then use a Virtual to get this information from the User collection.
Personally, I prefer the first approach, although it takes more effort to keep the list accurate, e.g. removing an item from the list when a follower stops following you.
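The maintenance effort of the first approach can be sketched as follows. This is a Python illustration over an in-memory dictionary rather than MongoDB code; the users data and the follow and unfollow helpers are hypothetical names introduced only for this example.

```python
from typing import Dict

# Hypothetical user documents with embedded following/followers lists.
users: Dict[str, dict] = {
    "u1": {"name": "Ann", "following": [], "followers": []},
    "u2": {"name": "Bob", "following": [], "followers": []},
}


def follow(follower_id: str, target_id: str) -> None:
    """Record the relationship on both documents, skipping duplicates."""
    follower, target = users[follower_id], users[target_id]
    if any(entry["id"] == target_id for entry in follower["following"]):
        return
    follower["following"].append({"id": target_id, "name": target["name"]})
    target["followers"].append({"id": follower_id, "name": follower["name"]})


def unfollow(follower_id: str, target_id: str) -> None:
    """Remove the relationship from both documents so the lists stay accurate."""
    follower, target = users[follower_id], users[target_id]
    follower["following"] = [e for e in follower["following"] if e["id"] != target_id]
    target["followers"] = [e for e in target["followers"] if e["id"] != follower_id]


follow("u1", "u2")
print(users["u2"]["followers"])
```

Both sides have to be updated on every change, which is the extra effort mentioned above.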

MongoDB ObjectId foreign key implementation recommendation

I'm looking for a recommendation on how best to implement MongoDB foreign key ObjectId fields. There seem to be two possible options, either containing the nested _id field or without.
Take a look at the fkUid field below.
{'_id':ObjectId('4ee12488f047051590000000'), 'fkUid':{'_id':ObjectId('4ee12488f047051590000001')} }
OR
{'_id':ObjectId('4ee12488f047051590000000'), 'fkUid':ObjectId('4ee12488f047051590000001') }
Any recommendations would be much appreciated.
I'm having a hard time coming up with any possible advantages for putting an extra field "layer" in there, so I would personally just store the ObjectId directly in fkUid.
I suggest using the default DBRef implementation, which is described here http://www.mongodb.org/display/DOCS/Database+References and is compatible with most language-specific drivers.
If your question is about the naming of the field (what you have in the title), usually the convention is to name it after the object to which it refers.
Both ways that you have mentioned mean the same thing, but they are used differently.
Storing fkUid as an object, like 'fkUid':{'_id':ObjectId('4ee12488f047051590000001')}, has its own advantages. Let me give an example. Suppose there is a website where users can post images and view images posted by other users. When showing an image, the website also shows the name/username of the user who posted it. Stored this way, the object can also hold details such as 'fkUid':{'_id':ObjectId('4ee12488f047051590000001'), username: 'SOME_X'}. When you read the document from the db, you don't have to send another request to get the username for that _id.
Whereas with the second way, 'fkUid':ObjectId('4ee12488f047051590000001'), you have to send another request to the server just to get the name/username, even if nothing else from that object is needed.
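The difference between the two shapes can be sketched in Python with plain dictionaries. Strings stand in for ObjectId values here, and the users lookup table and both helper functions are assumptions made for illustration rather than driver code.

```python
# Stand-in for the users collection, keyed by the user's id (as a string).
users = {"4ee12488f047051590000001": {"username": "SOME_X"}}

# Second way: fkUid holds only the id.
image_plain = {
    "_id": "4ee12488f047051590000000",
    "fkUid": "4ee12488f047051590000001",
}

# First way: fkUid is an object that also carries a copy of the username.
image_embedded = {
    "_id": "4ee12488f047051590000000",
    "fkUid": {"_id": "4ee12488f047051590000001", "username": "SOME_X"},
}


def username_plain(image: dict) -> str:
    """Needs a second lookup in the users collection."""
    return users[image["fkUid"]]["username"]


def username_embedded(image: dict) -> str:
    """Reads the username straight from the image document."""
    return image["fkUid"]["username"]


print(username_plain(image_plain), username_embedded(image_embedded))
```

The embedded copy saves a lookup on reads, at the cost of having to update it wherever it is stored if the username ever changes.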

Is it ok to turn the mongo ObjectId into a string and use it for URLs?

document/show?id=4cf8ce8a8aad6957ff00005b
Generally I think you should be cautious about exposing internals (such as DB ids) to the client. The URL can easily be manipulated, and the user may gain access to objects you don't want them to have.
For MongoDB in particular, the object ID might even reveal some additional internals (see here), i.e. the IDs aren't completely random. That might be an issue too.
Besides that, I think there's no reason not to use the id.
I generally agree with @MartinStettner's reply. I wanted to add a few points, mostly elaborating on what he said. Yes, a small amount of information is decodable from the ObjectId. This is trivially accessible if someone recognizes this as a MongoDB ObjectId. The two downsides are:
1. It might allow someone to guess a different valid ObjectId, and request that object.
2. It might reveal info about the record (such as its creation date) or the server that you didn't want someone to have.
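To show how little effort the creation date takes to recover, here is a short Python sketch. It relies on the ObjectId layout in which the first 4 bytes (8 hex characters) are a Unix timestamp in seconds; the objectid_created_at helper is a name invented for this example, and the id is the one from the URL in the question.

```python
from datetime import datetime, timezone


def objectid_created_at(object_id_hex: str) -> datetime:
    """Decode the leading 4-byte timestamp of an ObjectId hex string."""
    seconds = int(object_id_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


created = objectid_created_at("4cf8ce8a8aad6957ff00005b")
print(created.isoformat())
```

Official drivers expose the same information directly, so anyone who recognizes the id format can read the date.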
The "right" fix for the first item is to implement some sort of real access control: 1) a user has to log in with a username and password, 2) the object is associated with that username, 3) the app only serves objects to a user that are associated with that username.
MongoDB doesn't do that itself; you'll have to rely on other means. Perhaps your web-app framework, and/or some ad-hoc access control list (which itself could be in MongoDB).
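A minimal sketch of step 3 in Python, assuming the app already knows which user is logged in. The documents table, its owner field, and the fetch_document helper are hypothetical; a real app would take the username from its session or authentication layer.

```python
from typing import Optional

# Hypothetical stored objects, each associated with the username that owns it.
documents = {
    "4cf8ce8a8aad6957ff00005b": {"owner": "alice", "body": "hello"},
}


def fetch_document(doc_id: str, logged_in_user: str) -> Optional[dict]:
    """Serve the object only if it belongs to the logged-in user."""
    doc = documents.get(doc_id)
    if doc is None or doc["owner"] != logged_in_user:
        # Same response for "missing" and "not yours", so ids cannot be probed.
        return None
    return doc


print(fetch_document("4cf8ce8a8aad6957ff00005b", "alice"))
print(fetch_document("4cf8ce8a8aad6957ff00005b", "mallory"))
```

With a check like this in place, guessing a valid id in the URL is not enough to read someone else's object.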
But here is a "quick fix" that mostly solves both problems: create some other "id" for the record, based on a large, high-quality random number.
How large does "large" need to be? A 128-bit random number has 3.4 * 10^38 possible values. So if you have 10,000,000 objects in your database, the probability that a single guess hits a valid value is vanishingly small: 1 in 3.4 * 10^31. Not good enough? Use a 256-bit random number... or higher!
How to represent this number in the document? You could use a string (encoding the number as hex or base64), or MongoDB's binary type. (Consult your driver's API docs to figure out how to create a binary object as part of a document.)
While you could add a new field to your document to hold this, you'd probably also want an index on it. So the document size is bigger, and you spend more memory on that index. Here's what you might not have thought of: simply USE that "truly random id" as your document's "_id" field. Thus the per-document size is only a little higher, and you use the index that you [probably] had there anyway.
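The random-id idea and the arithmetic behind it can be sketched in Python with the standard secrets module. The new_random_id helper and the example document are illustrative assumptions; the id is hex-encoded here, which is one of the string representations mentioned above.

```python
import secrets


def new_random_id() -> str:
    """Return a 128-bit random id as 32 hex characters."""
    return secrets.token_hex(16)  # 16 bytes = 128 bits


# Use the random value as the document's _id instead of an extra field.
doc = {"_id": new_random_id(), "title": "example"}
print(doc["_id"])

# Odds that a single guess hits one of 10,000,000 stored ids.
total_values = 2 ** 128
stored_objects = 10_000_000
one_in = total_values / stored_objects
print(f"1 in {one_in:.1e}")
```

The secrets module draws from the operating system's cryptographic random source, which is the kind of high-quality randomness the approach depends on.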
I can set both the 128-character session string and the object ids of other collection documents as cookies, and when a user visits, do an asynchronous fetch of the session, user and account all at once, instead of fetching the session first and the user and account afterwards. If the session document is valid, I'll share the user and account documents.
If I do this, I'll have to make every single request for a user or account document also require the 128-character session cookie, which makes exposing the user and account object ids safer. It means that anyone guessing a user ID or account ID also has to guess the 128-character string to get any answer from the system.
Another security measure is to wrap the id in some salt whose positions only you know, such as
XXX4cf8ce8XXXXa8aad6957fXXXXXXXf00005bXXXX
Now you know exactly how to slice that up to get the ID.

Correct usage of Voldemort as key-value pair?

I am trying to understand how Voldemort can be used. Say I have this scenario:
Voldemort is a key-value store.
I need to fetch a value (say, some text) on the basis of 3 parameters.
So, what will the key be in this case? I cannot use 3 keys for 1 value, right? But that value should be searchable on the basis of those 3 parameters.
Am I making sense?
Thanks
EDIT1
eg: A blog system. A user posts a blog: User's data stored: Name, Age and Sex
The blog content (text) is stored.
Now, I need to use Voldemort here: if a user searches from the front end for all the blog posts by Sex: Male,
then my code should query Voldemort and return all the "blog content (text)" entries which have Sex as Male.
So, as per my understanding:
Key = Name, Age and Sex
Value = Text
I am using Java.
Edited answer to fit with example added to question:
The thing to understand about Voldemort is that it's a very simple key-value store. As far as I know about it, the only thing you can do is store a value under a key, and then fetch those values by key. So for your example case, if you really want to use Voldemort, you have a few options.
So, for example, you've said that you're storing data for users. So, you might have something like this:
Key = user-Chad
Value = Name:Chad Birch, Age:26, Sex:Male
Now, if you want to post a new blog post, you also need to store that under a key. So you could do something like this:
Key = blog-Chad1
Value = Here is my very first blog post.
Now, your problem is that you need some way to look up all the blog posts made by users with Sex:Male, but there's no way to get that data directly. At this point, you have to either:
1. Pull out every single user, check if they're male, and if they are, pull out their blog posts.
2. Start storing more stuff in other key-value pairs so that you can look this up.
To implement #2, you could add another pair like this:
Key = search-Sex:Male
Value = Chad1 Chad2 Steve1 ...
Then, when someone does a search for Sex:Male, you pull out the value for this, split it up, and then go fetch all those blog posts.
Does that make sense? Using a k-v store is quite a bit different from a database, because you lose all these relational abilities.
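The lookup described above can be sketched in a few lines of Python, with a dictionary standing in for the key-value store. This is not the Voldemort client API: the put, get, and posts_for helpers are hypothetical, and the second blog post is made-up sample data.

```python
from typing import Dict, List, Optional

# A plain dict stands in for the key-value store.
store: Dict[str, str] = {}


def put(key: str, value: str) -> None:
    store[key] = value


def get(key: str) -> Optional[str]:
    return store.get(key)


put("user-Chad", "Name:Chad Birch, Age:26, Sex:Male")
put("blog-Chad1", "Here is my very first blog post.")
put("blog-Chad2", "Here is a second post.")  # made-up sample data
# The extra "index" pair: search term -> space-separated blog post ids.
put("search-Sex:Male", "Chad1 Chad2")


def posts_for(search_key: str) -> List[str]:
    """Fetch the id list for a search key, then fetch each blog post by key."""
    post_ids = (get(search_key) or "").split()
    return [store["blog-" + post_id] for post_id in post_ids]


print(posts_for("search-Sex:Male"))
```

Every new blog post also has to be added to the matching search-... values, since the store itself will not maintain them.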
I don't think you can do that directly with a key-value store, but one way to work around is to store the user in multiple places.
For example, you have key-value mapping of a user to a list of blog posts. You also have a mapping of an age to a list of users. Also a gender to a list of users. Now if you want to search by age or gender you pull the corresponding list of users, and then pull all of their blog posts.
Part of the reason a key-value store like Voldemort can work is that storage and queries are cheap enough that you can do extra ones.
The problem with the above scheme, though, is that if you're using Voldemort in a distributed way, you're better off with lots of keys that each map to a short list of data (so the data can be distributed by key). Something like mapping gender to users violates this: there are only a few keys, each with a potentially very large list of data.