Hierarchical structure in Firestore and subcollections - mongodb

I am trying to build a database with Cloud Firestore. Having read the documentation about hierarchical data, I found an alternative to the approach shown in the documentation example.
I have a collection of categories, and each category can have subcategories. I could use the structure from the Firestore documentation example, that is, collection/document/subcollection/document... However, I found an example for MongoDB that, instead of subcollections, uses nested data in a single document with parent-child relations, as described in the image below.
Which approach is better if I want the user to be able to see all the subcategories in order to categorize a certain event? By better I mean avoiding multiple connections to the DB (as I understand this is what Firestore charges for).

The Firestore documentation page "Choose a Data Structure" discusses nested data and its tradeoffs. I don't think the statement about pricing by number of connections is correct: Firestore charges for the number of operations, storage, and network bandwidth.
If all users share a single categories list, your current structure lets you retrieve the entire list with one operation. Note that there is a 1 MiB limit on document size.
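A minimal sketch of the single-document approach, in plain Python with no Firestore client: the whole category tree lives in one nested map, so a single document read would return every subcategory. The category names, the `flatten` helper, and the JSON-based size estimate are assumptions for illustration; Firestore computes document size by its own rules, so the estimate is only a rough guide.

```python
import json

# Hypothetical categories document: the whole tree is nested in one map.
categories_doc = {
    "Sports": {"Football": {}, "Tennis": {"Doubles": {}}},
    "Music": {"Concert": {}, "Festival": {}},
}

FIRESTORE_DOC_LIMIT_BYTES = 1024 * 1024  # 1 MiB limit mentioned above


def flatten(tree, prefix=()):
    """Return every category path in the tree, parents before children."""
    paths = []
    for name, children in tree.items():
        path = prefix + (name,)
        paths.append(" / ".join(path))
        paths.extend(flatten(children, path))
    return paths


def approx_size_bytes(doc):
    """Rough size estimate via JSON; not Firestore's exact accounting."""
    return len(json.dumps(doc).encode("utf-8"))


all_paths = flatten(categories_doc)
fits = approx_size_bytes(categories_doc) < FIRESTORE_DOC_LIMIT_BYTES
```

With subcollections, listing every subcategory would instead need one query per level of the hierarchy.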

Related

Flutter Firebase field string matching

I'm creating a Flutter project where I want to match two fields from different collections in Firebase. How should I do that? For example, if the data in the first field matches the data in the second field, return true.
Here is the first example image
- and here is the second example image
Firestore queries can only order/filter on data that is inside each document that they actually return. There is no way to order/filter on a value in another document, whether that document is present in the results or not.
Usually you'll want to adapt your data model to your use case, typically by adding more data that allows this specific query. This is quite common in NoSQL databases, where you often end up expanding your data model to support specific use cases. To learn more about data modeling in NoSQL databases, I recommend reading NoSQL data modeling and watching Todd's video series Get to know Cloud Firestore.
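One way to picture "adding more data" is to copy the value you need to compare into the document you query, so the filter only touches fields of the returned documents. The sketch below uses plain Python dicts in place of Firestore collections; the collection contents, the field names, and the `denormalize` helper are all hypothetical.

```python
# Hypothetical data: two collections, modeled as dicts keyed by document ID.
users = {
    "u1": {"name": "Ana", "city": "Lisbon"},
    "u2": {"name": "Ben", "city": "Oslo"},
}
events = {
    "e1": {"title": "Meetup", "owner_id": "u1", "city": "Lisbon"},
    "e2": {"title": "Workshop", "owner_id": "u2", "city": "Lisbon"},
}


def denormalize(events, users):
    """Copy each owner's city into the event and store the comparison."""
    result = {}
    for event_id, event in events.items():
        owner = users[event["owner_id"]]
        enriched = dict(event, owner_city=owner["city"])
        # Precompute the match, since a query cannot compare fields
        # that live in two different documents.
        enriched["city_matches_owner"] = enriched["city"] == owner["city"]
        result[event_id] = enriched
    return result


enriched_events = denormalize(events, users)
# Stand-in for a single-collection equality filter on the stored flag.
matching = [eid for eid, e in enriched_events.items() if e["city_matches_owner"]]
```

The cost of this design is that the copied field has to be updated whenever the source document changes.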

Designing mongodb data model: embedding vs. referencing

I'm writing an application that gathers statistics for users across multiple social network accounts. I have a collection of users and I would like to store the statistics for each user.
Now, I have two options:
Create a collection that stores user statistics documents, and add a reference to each user document that links it to the corresponding document in the statistics collection.
Embed a statistics document in each user document.
Aside from query performance (which I'm less concerned about):
What are the pros and cons of each of these approaches?
What should I take into account if I choose to use references rather than embedding the information inside the user document?
The shape of the data is determined by the application itself.
There's a good chance that whenever you work with the user data, you also need the statistics details.
The decision about what to put in a document is largely determined by how the application uses the data.
Data that is used together with the user documents is a good candidate to be pre-joined, that is, embedded.
One limitation of this approach is the size of the document: a MongoDB document can be at most 16 MB.
Another approach is to split the data between multiple collections.
One limitation of this approach is that MongoDB does not enforce foreign key constraints.
The database does not guarantee consistency of the data across collections. It is up to you as a programmer to make sure that your data has no orphans.
Data from multiple collections can be joined with the $lookup aggregation stage. But a collection is a separate file on disk, so seeking on multiple collections means seeking in multiple files, and that is, as you are probably guessing, slow.
Generally speaking, embedded data is the preferable approach.
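To make the trade-off concrete, here is a minimal sketch in plain Python, with lists of dicts standing in for MongoDB collections. The field names and the `find_orphans` and `join_user_statistics` helpers are hypothetical; the join only imitates in application code what a $lookup stage would do on the server.

```python
# Embedded: statistics live inside the user document.
embedded_user = {
    "_id": "u1",
    "name": "Ana",
    "statistics": {"followers": 120, "posts": 45},
}

# Referenced: statistics live in their own collection and point at the user.
users = [{"_id": "u1", "name": "Ana"}]
statistics = [
    {"_id": "s1", "user_id": "u1", "followers": 120, "posts": 45},
    {"_id": "s2", "user_id": "u9", "followers": 3, "posts": 1},  # u9 was deleted
]


def find_orphans(statistics, users):
    """Return statistics documents whose user no longer exists."""
    user_ids = {u["_id"] for u in users}
    return [s for s in statistics if s["user_id"] not in user_ids]


def join_user_statistics(users, statistics):
    """Application-side join, similar in spirit to a $lookup stage."""
    by_user = {s["user_id"]: s for s in statistics}
    return [dict(u, statistics=by_user.get(u["_id"])) for u in users]


orphans = find_orphans(statistics, users)
joined = join_user_statistics(users, statistics)
```

With the embedded layout there is nothing to join and no orphan can exist, because deleting the user deletes the statistics with it.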

Storing favorites list for each user in mongo?

I am working on a movie app using Node, Express, and MongoDB. How do I store the list of favorite movies for each user in the database?
Here you have two alternatives:
Embedding: include the list of favorite movies in the user document.
Split collections: use two separate collections for users and movies, referencing one from the other by ID.
Which one to choose?
You should design your MongoDB collections based on the access strategy of your application.
In general, you should embed movies in the user doc if:
the embedded data (movies) is always retrieved when user info is retrieved. If you always retrieve user info but rarely need the movie data as well, embedding is a poor fit.
the embedded data is not modified frequently.
the cardinality is not too big (remember that the maximum MongoDB document size is 16 MB). In this case I think the number of favorite movies per user is small enough for embedding, but if, for example, you had 100K movies per user, you would be forced to split the collections.
Data integrity can also be an issue. With movies embedded in user docs, you can take advantage of document-level locking (if you are using the WiredTiger storage engine). With the two-collection design in a concurrent system, you have to handle locking yourself at the application level.
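The cardinality criterion can be turned into a rough check. The sketch below builds a user document with embedded favorites and estimates its size against the 16 MB limit. The `should_embed` rule, its 50% safety margin, and the field names are assumptions for illustration, and the JSON-based estimate only approximates the real BSON size.

```python
import json

MONGODB_DOC_LIMIT_BYTES = 16 * 1024 * 1024  # 16 MB document limit


def build_user_doc(user_id, favorite_movies):
    """Embedding: the favorites list is part of the user document."""
    return {"_id": user_id, "favorites": list(favorite_movies)}


def approx_size_bytes(doc):
    """Rough estimate via JSON; real BSON sizes differ somewhat."""
    return len(json.dumps(doc).encode("utf-8"))


def should_embed(doc, safety_fraction=0.5):
    """Hypothetical rule: embed only while well under the size limit."""
    return approx_size_bytes(doc) < MONGODB_DOC_LIMIT_BYTES * safety_fraction


# A typical user: a few dozen favorites with short titles.
small = build_user_doc(
    "u1", [{"movie_id": i, "title": f"Movie {i}"} for i in range(50)]
)
# An extreme case: 100K favorites, each carrying a 200-character title.
huge = build_user_doc(
    "u2", [{"movie_id": i, "title": "x" * 200} for i in range(100_000)]
)
```

Here `should_embed(small)` holds while `should_embed(huge)` does not, which matches the advice to split collections once the list grows very large.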

Documents store database and connected domain

Consider this picture:
The book says document store databases struggle with highly connected domains because "relationships between aggregates aren't first-class citizens in the data model, most aggregate stores furnish only the insides of aggregates with structure, in the form of nested maps."
And besides: "Instead, the application that uses the database must build relationships from these flat, disconnected data structures."
I'm sorry, I don't understand what this means. Why do document store databases struggle with a domain built on many relationships?
Because document stores generally do not support joins. Each time you need to get more data, it is a separate query. Instead, document stores support the idea of nesting data within documents.
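A minimal sketch of what "a separate query each time" means in practice, using a Python dict as a stand-in for a document collection. The `people` data, the `fetch` helper, and the query counter are hypothetical; `fetch` represents one round trip to the database.

```python
# Hypothetical document store: each person document lists friend IDs only.
people = {
    "alice": {"name": "Alice", "friends": ["bob", "carol"]},
    "bob": {"name": "Bob", "friends": ["dave"]},
    "carol": {"name": "Carol", "friends": ["dave", "erin"]},
    "dave": {"name": "Dave", "friends": []},
    "erin": {"name": "Erin", "friends": []},
}

query_count = 0


def fetch(person_id):
    """Stand-in for one round trip to the database."""
    global query_count
    query_count += 1
    return people[person_id]


def friends_of_friends(person_id):
    """Build the relationship in application code, one fetch per document."""
    result = set()
    for friend_id in fetch(person_id)["friends"]:
        for fof_id in fetch(friend_id)["friends"]:
            if fof_id != person_id:
                result.add(fof_id)
    return sorted(result)


fof = friends_of_friends("alice")
```

Even this two-hop traversal needs one fetch for Alice plus one per friend, and the count grows with every extra hop, which is the difficulty the quoted passage describes.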

Mongodb model to store user/item specific data

The case:
There are users in the system, and there are static documents (like books). Each user may work with some documents and has specific state/settings (like current position/page in the document, bookmarks/notes) for each of their docs.
What is the better way to store that user- and document-specific information: a flat collection with two keys, userId and documentId, or a collection whose _id equals userId, with a nested array of subdocuments whose _id equals documentId (in that scenario the collection is also used for storing non-document-specific user data)?
1st scenario: find({userId: ..., documentId: ...})
2nd scenario: findOne({_id: ...}), then find the subdoc with _id equal to documentId
PROS of 1st scenario:
1) I believe find and save operations are quicker.
CONS of 1st scenario:
1) a greater number of documents
2) no way to store non-document-related, user-specific data in the collection
PROS of 2nd scenario:
1) better representation of data relations (subjective though)
2) makes it possible to use the same collection to store other user data that is not related to a particular document.
CONS of 2nd scenario:
1) more difficult search and save operations (I'm using Mongoose ODM, so the code would not be complex), and I think the operations are slower than in the 1st scenario.
Some things to consider:
1) In general, read operations would select the data for only one specific document
2) I would OFTEN need to save the data for one specific document (for example, periodically saving the position in the document that the user is working with).
3) User/document state may have some nested arrays (bookmarks, notes) that have to be changed (docs inserted/removed)
Taking these considerations into account, I would say that the 1st scenario is more suitable for the task, but I would like to hear some expert opinions on whether the two scenarios differ greatly.
What are your actual access paths? Do you start with a user id and then look for the documents the user reads? Or do you start with a document and search for the users that read it?
Is the document object lightweight (just title and author and suchlike information) or is it heavyweight (includes the contents)?
If documents are heavyweight, I'd keep them in a separate collection and go for scenario 2.
Basically, scenario 1 mimics a relational solution and scenario 2 looks like an object model.
I believe object models describe the reality better and are more efficient.
So I'd go for scenario 2, unless you frequently search the readers for a book.
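The two scenarios from the question can be sketched side by side in plain Python, with lists of dicts standing in for the collections. The field names beyond userId, documentId, and _id, and the `find_flat` and `find_nested` helpers, are hypothetical.

```python
# Scenario 1: flat collection keyed by (userId, documentId).
flat_states = [
    {"userId": "u1", "documentId": "d1", "position": 10, "bookmarks": [3]},
    {"userId": "u1", "documentId": "d2", "position": 0, "bookmarks": []},
]

# Scenario 2: one document per user with a nested array of per-document state.
nested_users = [
    {
        "_id": "u1",
        "language": "en",  # non-document-specific user data fits here
        "docs": [
            {"_id": "d1", "position": 10, "bookmarks": [3]},
            {"_id": "d2", "position": 0, "bookmarks": []},
        ],
    }
]


def find_flat(user_id, document_id):
    """Scenario 1: a single lookup on both keys."""
    for state in flat_states:
        if state["userId"] == user_id and state["documentId"] == document_id:
            return state
    return None


def find_nested(user_id, document_id):
    """Scenario 2: find the user, then search the nested array."""
    for user in nested_users:
        if user["_id"] == user_id:
            for sub in user["docs"]:
                if sub["_id"] == document_id:
                    return sub
    return None


# Periodic save of the reading position, done in each layout.
find_flat("u1", "d1")["position"] = 42
find_nested("u1", "d1")["position"] = 42
```

Both layouts support the frequent "save position" write; the difference is whether the state is addressed directly by two keys or reached through the user document.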