Firestore, should I use a single collection for all users, or rather create a different collection for each user role? - google-cloud-firestore

My app should manage different users, of different roles (user / moderator / admin etc.) in Firebase Firestore.
What are the pros and cons of each approach:
A single collection for all users, where each document includes a ROLE field (user / moderator / admin etc.)
A separate collection for all users of the same role (users collections / moderators collection / admins collections etc.). Three collection in this example
One drawback for approach 2, is that authentication becomes a bit complicated since I have to look for the userId in three different collections (vs a single one in approach 1).
Are there any significant advantages to approach 2?

If you never need to search across all three collections (so if you don't need a collection group index) then having multiple collections will give you better total write performance.
If you do need to search across collections, then the write performance will be the same.
For query performance it makes no different either way, as the performance for queries on Firestore is not dependent on the number of documents in the collection/collection group.

Related

Cloud Firestore splitting access levels for extra sensitive user data

I want to restrict access to sensitive attributes of my Users documents to a smaller set of clients. My current understanding is that there are two ways to split the data, so that we can make security rules for each part:
Create a Users collection and a top level SensitiveUserData collection that both use the same document ID, and only retrieve the SensitiveUserData for a user when needed and allowed.
Create a SensitiveUserData subcollection within the User document. This collection will always contain just a single document, but the ID won't matter.
Which of these (or a third) is preferred in general?
Neither of these approaches is pertinently better than the other, and both have valid use-cases. In the end it's a combination of personal preference, and a (typically evolving) insight into the use-cases of your app.
In many scenarios, using subcollections is preferred as it allows the data to be better spread out over the physical storage, which in turn helps throughput. But in this case I doubt that makes a difference, as you're likely to use the user ID as keys in both SensitiveUserData and Users collections, so they'll be similarly distributed anyway.
For me personally, I often end up with a top-level collection. But that may well be related to my long history of modeling data in the Firebase Realtime Database, where access permission is inherited, so you can't hide a subcollection there.

how to model many to many relation using google's firestore

I have being reading the docs about Cloud firestore Data Model
and about Security in Cloud firestore but was unable to find an answer on how to model my data.
I have groups and users and I want to implement a membership relation such that each user can be member in multiple groups and each group could obviously have multiple members.
I want to be able to efficiently be able to tell who are the members of each group and what groups a given user belongs to.
In a relational DB I would have 3 tables, users, groups and user_groups or group_membrs and I am guessing anyone reading this Q could tell how to handle this.
What collections / documents / sub collections should I use to implement the same using firestore?
I am especially concerned with the security setting that would allow a user to remove himself (and only himself) from a group and would allow a group owner/manager to add any user to the group he manages (and only to that group)
Can/Should this be implemented using only collections for users and groups with the memebrship being part of the document data on those collections? or should an additional memeber ship collection be used?
My initial take would be to do this fairly similar to how I recommended modeling many-to-many relationship on the Firebase Realtime Database. The biggest difference with your relational experience, is that you'll typically store the relations in both directions. So in addition to user_groups you'll also have group_users for a total of four collections per entity pair.

How to design my Mongo database

I got a collection Users, that is name, password, email etc.
Also i got a collection Groups, every group has it's members - array of users.
How should i design my database? I clearly see 2 ways of doing so:
Way 1 (MySQL-like): every user has an _id, so i just put it into the members array and so be it.
Way 2: copy a whole user document inside plus add some fields.
On the MongoDB site they are telling that duplicate data is nothing to worry bcs of the low price of storages. Also they say that we should avoid JOINs on data read.
duplicate data is nothing to worry about
This is something to worry about when it comes to updating. Suppose you have user details nested and duplicated in every document. What happens when a user changes their name? You'll have to update every instance of that user in every document.
Be careful to differentiate between data and entities. A user is an entity, think carefully before duplicating entities as fixing it later could be hard work.
Personally, I'd split them unless you find yourself in a situation where performance is too slow to do the joining in real time. Then, and only then, consider merging.
Actually answer to this question depends on what kind of screens you are designing and what kind of queries you are going to make to fetch data. Lets go through pros and cons of each option which will help you in weighing each option.
Way 1 :- Putting array of user_ids in group collection
Pros
1) If you have a screen which shows group details of a particular group and list of all members (users_ids) belonging to that group, then one query can fetch all the details needed for this screen and it would be faster too.
Cons
1) If in group detail screen, you have to show details of users along with group details, then since mongodb does not provide any joins, you would be fetching user details in a separate query and would be joining both on the client side. This can lead to a impact on performance.
2) If you have a screen which shows user details and all the groups he/she belongs, then you will be searching user_id in user array in group collection. If you are expecting number of members in a group to be very high(millions), then searching inside the array can lead to a huge performance impact.
Way 2 :- Copy user document inside inside group collection
Duplicating data is not a problem in Mongodb, but you should have a really good reason for that. Thumb rule should be duplicate data when relationship is 1:few and not 1:many.
Pros
1) This approach will save you from joining group and user collection at client side as one query can fetch all the details of group along with its users.
Cons
1) Suppose you have a million groups and user_id_1 belongs to 100,000 groups, then whenever you have an update on user_id_1, you will have to update 100,000 documents. This can again lead to huge performance impact.
2) Also if a large number of users subscribe to 1 group, then document size of this group keeps on increasing. In Mongodb The maximum BSON document size is 16 megabytes that means you cannot have a document greater than 16MB, so you cannot add users to a group infinitely. This will limit your functionality.
Way 3 :- Embed group details in user collection
Pros
1) One query can fetch user details along with all the details of all the groups this user belongs to.
2) If you have are expecting few users in a group, then you will have few group arrays in a user document. This will not exceed 16MB limit.
Cons
1) If you are expecting that a user can subscribe to a lot many groups(millions), then user document may exceed 16MB limit.
2) Also if you have very frequent updates in group details then you will have to update the same in many user documents.
You can also go through the following link to get more details about data model design :-
https://docs.mongodb.org/manual/core/data-model-design/
It depends on how you will use data in your application.
If you have more than 2 groups and you will have to search a user in all of the groups, embed the user document within the group (way 2) is not a good idea. So in this case I sugest to use the way 1.
If you have only 2 groups or the user group will be known before your application when doing the query, then use the way 2.
I guess that separating the data is the way to go, since it will be better to direct update, get and delete user data directly.

possible mongo architecture for SaaS

I'm creating a platform where customers (users) are from different organisations. So I would like to keep their data totally separated according to organisations they belong. How would you suggest to store such data in mongo db? On which level?
Are you keeping the data separate for security reasons (i.e. compliance or regulation) or simply for administration/ease-of-use?
If it's the former, I'd go with separate databases at the very least, if not separate MongoDB instances. Separate instances enables you to perform segregation at an IP level through something like iptables so that you can tie down different instances to different IP ranges, representing the different organisations presuming they will be accessing the data.
If it's the latter, I'd still go with separate databases because it gives you the ability to have different users on a database level and from version 2.2, concurrency will be on a database level (so there's no sharing of the write lock, for example, that you'd have if you split it out on collection level).
As a FYI, here's some additional information on schema design in MongoDB -
Schema Design
Schema Design Presentation by Kyle Banker
Schema Design Blogs from Customers
MongoSF2012: mongodb-schema-design-insights-and-tradeoffs
There was actually a schema introduction webinar held last week that you can now listen to.
You can create a document for each organization and put the user's details into sub-documents inside the root document.
If the overall users' profiles are so big that don't fit into MongoDB document size (16 mg), then you can use different approach by creating a document for every user and add a field referring to the organization.

Sharing a document with users

I have to choose a database for implementing a sharing system.
My system will have users and documents. I have to share a document with a few users.
Example:
There are 2 users, and there is one document.
So if I have to share that one document with both the users, I could do these possible solutions:
The current method I'm using is with MySQL (I don't want to use this):
Relational Databases (MySQL)
Users Table = user1, user2
Docs Table = doc1
Docs-User Relation Table = doc1, user1
doc1, user2
And I would like to use something like this:
NoSQL Document Stores (MongoDB)
Users Documents:
{
_id: user1,
docs_i_have_access_to: {doc1}
}
{
_id: user2,
docs_i_have_access_to: {doc1}
}
Document's Document:
{
_id: doc1
members_of_this_doc: {user1, user2}
}
And I don't yet know how I would implement in a key-value store like Redis.
So I just wanted to know, would the MongoDB way I have given above, the best solution?
And is there any other way I could implement this? Maybe with another database solution?
Should I try to implement it with Redis or not?
Which database and which method should I choose and will be the best to share the data and why?
Note: I want something highly scalable and persistent. :D
Thanks. :D
Actually, you need to represent a many-to-many relationship. One user can have several documents. One document can be shared among several users.
See my previous answer to this question: how to have relations many to many in redis
With Redis, representing relationship with the set datatype is a pretty common pattern. You can expect to get better performance than with MongoDB for this kind of data model. And as a bonus, you can easily and efficiently find which users have a given list of documents in common, or which documents are shared by a given set of users.
Considering only this simple example (you just need to keep who owns what) SQL seems to be the most appropriate, as it will give additional options for free, such as reporting who has how many docs, the most popular documents, most active user etc with almost zero cost + the data will be more consistent (no duplication, possibly foreign keys). This is valid unless you have millions of documents of course.
If I chose between document-oriented and relational DB, I'd make a decision based mostly on the structure of the document itself. Whether they're all uniform or may have different fields for different types, do you nested sub-documents or arrays with the ability to search by their contents.