Firestore Document Size Limitations - google-cloud-firestore

I have a Google Cloud Firestore project. My database model is like this:
Each store has its own document. The sales and inventory collections have a lot of documents, and their size increases every day.
There is a maximum size limitation for documents in Firestore. The document named Store1 has sales and inventory collections that store every sale and item. Does the Store1 document have a maximum size limitation? Would the growing size of the sales and inventory collections be a problem? If it would be, my data model must be incorrect, and if it is incorrect, how should it be structured?

The document size limitation in Firestore is enforced per individual document, and does not include the size of the documents in subcollections of that document. It is relatively uncommon for folks to hit the document size limit.
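To illustrate, here is a minimal sketch with the Python google-cloud-firestore client (the stores collection name is an assumption; the rest mirrors the question): only the fields written directly on the Store1 document count toward its size limit, while every document added to the sales or inventory subcollections is measured against its own separate limit.

from google.cloud import firestore

db = firestore.Client()

# Fields set directly on Store1 count toward that document's ~1 MiB limit.
store_ref = db.collection("stores").document("Store1")
store_ref.set({"name": "Store 1", "city": "Example City"})

# Documents in subcollections have their own limits, so these collections
# can keep growing without affecting the size of Store1 itself.
store_ref.collection("sales").add({"item": "widget", "amount": 3})
store_ref.collection("inventory").add({"item": "widget", "stock": 120})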

Related

Is sharding necessary for subcollections in Cloud Firestore?

I have read in the documentation that writes can be limited to 500 per second if a collection has an indexed field with sequentially increasing values.
I am saving sequential timestamps in a subcollection.
I would like to know if I need a shard field in this specific case to increase the maximum writes per second?
I am only using "normal" collection indexes, no collection group index.
Some additional explanations:
I have a top-level collection called "companies", and under every document is a subcollection called "orders". Every order document has a timestamp "created". This field is indexed, and I need this index. These orders could be created very frequently, so I am sure the 500 writes per second limit would apply to this setup. But I wonder whether every "orders" subcollection would have its own limit of 500 writes per second, or whether all subcollections share one limit.

I could add a shard field to avoid the write limit as stated in the documentation, but if every subcollection gets its own limit, this would not be necessary for my specific case; 500 writes per second per subcollection should be more than enough. I think that every subcollection has its own index as long as I am not using a collection group index, and therefore the server should be able to split the data across multiple servers if necessary. But maybe I am wrong. I can't find any concrete information on that in the documentation or on the internet.
Screenshots from the database: the root collection companies and its subcollection orders.
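For reference, this is roughly what the shard field mentioned in the question could look like with the Python google-cloud-firestore client; the field name shard and the shard count are illustrative assumptions based on the pattern described in the documentation, and whether it is needed here depends on how the 500 writes per second limit is scoped.

import random
from google.cloud import firestore

db = firestore.Client()

def add_order(company_id: str, order: dict, num_shards: int = 5) -> None:
    # "created" is the sequential, indexed timestamp from the question.
    order["created"] = firestore.SERVER_TIMESTAMP
    # A small random shard value breaks up the monotonically increasing
    # index ranges, as described in the Firestore documentation.
    order["shard"] = random.randint(0, num_shards - 1)
    db.collection("companies").document(company_id).collection("orders").add(order)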

Is it better to have multiple collections with thousands of documents or one collection with 100 million documents?

I'm migrating a MySQL table with 100 million rows to a MongoDB database. This table stores companies' documents, and what differentiates them is the company_id column. I was wondering if having multiple collections in MongoDB would be faster than just one collection. For example, each company would have its own collection (collections: company_1, company_2, company_3...) storing only documents from that company, so I would not need to filter as I would with one big collection where every document has a company_id field used to filter documents.
Which method would perform best in this case?
EDIT:
Here's a JSON document example: https://pastebin.com/T5m2tbaY
{"_id":"5d8b8241ae0f000015006142","id_consulta":45254008,"company_id":7,"tipo_doc":"nfe","data_requisicao":"2019-09-25T15:05:35.155Z","xml":Object...
You could have one collection and one document per company, with company-specific details in the document, assuming the details do not exceed 16 MB in size. Place an index on company_id for performance reasons. If performance does not meet expectations, scale vertically, i.e., add memory, CPU, disk IO, and network capacity. If that does not suffice, consider sharding the collection across multiple hosts.
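A rough pymongo sketch of that single-collection approach; the connection string, database, and collection names are placeholders, and the field names follow the example document above.

from pymongo import MongoClient, ASCENDING

client = MongoClient("mongodb://localhost:27017")
coll = client["mydb"]["company_documents"]

# Index on company_id so per-company queries don't scan all 100M documents.
coll.create_index([("company_id", ASCENDING)])

# Fetch one company's documents; the filter is served by the index.
for doc in coll.find({"company_id": 7}).limit(10):
    print(doc["_id"], doc.get("tipo_doc"))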

Firestore subcollection document read pricing

In Firestore, if I read a collection that contains 100 documents, does Firebase count that as 100 read operations or 1 read operation?
If one of my documents contains another 2 collections and each subcollection contains 10 docs, what will the total read count be in this case?
If it counts the subcollections and their documents separately, then Firestore pricing is very, very high.
If you read all documents from a collection that contains 100 documents, then you're reading 100 documents. So you'll be charged for 100 document reads.
If you're reading documents from subcollections, then there too: you'll be charged for each document you read.
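As a rough sketch with the Python google-cloud-firestore client (the collection names here are made up), reads are billed per document actually returned, and subcollection documents are only read when you query them explicitly:

from google.cloud import firestore

db = firestore.Client()

# Streaming a 100-document collection returns 100 documents -> 100 reads.
docs = list(db.collection("items").stream())
print(len(docs), "reads for the top-level collection")

# Subcollections are not fetched automatically; querying one of them
# costs one read per document it returns (here, 10 more reads).
reviews = list(db.collection("items").document("item1").collection("reviews").stream())
print(len(reviews), "additional reads for the subcollection")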
If you're struggling to find a data model that strikes a balance between a flexible structure and limiting the number of reads you need, I recommend watching the Getting to know Cloud Firestore video series, specifically these episodes:
What is a NoSQL Database? How is Cloud Firestore structured?
Cloud Firestore Pricing
How to Structure Your Data

Maximum size of a document in firestore?

I want to create a document containing about 20 million objects.
The structure is like this:
documentID
---- key1
-------- object1
-------------name: "test1"
-------------score: 123
I don't know the size limit of a document in Firestore, so can you give me any reference or information about that?
Thanks!
The maximum size is roughly 1 Megabyte. Storing such a large number of objects (maps) inside a single document is generally bad design, and 20 million is beyond the size limit anyway.
You should reconsider why they need to be in a single document, rather than making each object its own document.
Cloud Firestore's limits are listed in the documentation.
Have you looked at Firestore sub-collections?
You can store the main item as a document in one top-level collection, and all of its underlying data could be in a sub-collection of that document.
There is no limit to how many documents a sub-collection can contain when each of those objects is stored as its own document in that sub-collection.
So 20M records should not be an issue.
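A hedged sketch of that layout with the Python google-cloud-firestore client, using made-up collection names: each object from the question becomes its own document under a subcollection, written in chunks rather than packed into one parent document.

from google.cloud import firestore

db = firestore.Client()
parent = db.collection("datasets").document("documentID")

# Stand-in for the 20M objects from the question.
objects = [{"name": f"test{i}", "score": i} for i in range(2000)]

# Write in chunks of 500 so no single batch gets too large.
for start in range(0, len(objects), 500):
    batch = db.batch()
    for obj in objects[start:start + 500]:
        # Auto-generated IDs; each object is a separate document with its own limit.
        batch.set(parent.collection("objects").document(), obj)
    batch.commit()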
If you want to save objects bigger than 1 MB, you should use Cloud Storage; the limit there is 5 TB per object:
There is a maximum size limit of 5 TB for individual objects stored in Cloud Storage. There is an update limit on each object of once per second, so rapid writes to a single object won't scale.
Google cloud storage
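For completeness, a small sketch with the Python google-cloud-storage client; the bucket and object names are placeholders, and a Firestore document would then only hold a reference to the uploaded object.

from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-bucket")                  # placeholder bucket name
blob = bucket.blob("payloads/large-object.json")
blob.upload_from_filename("large-object.json")       # objects may be up to 5 TB

# The Firestore document then only needs a small pointer, e.g.:
# {"payload_path": "gs://my-bucket/payloads/large-object.json"}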
If you want to check the size of a document against the maximum of 1 MiB (1,048,576 bytes), there is a library that can help you with that:
https://github.com/alexmamo/FirestoreDocument-Android/tree/master/firestore-document
In this way, you'll be able to always stay below the limit.

Mapping datasets to a NoSQL (MongoDB) collection

What I have:
I have data for 'n' departments.
Each department has more than 1,000 datasets.
Each dataset has more than 10,000 CSV files (each larger than 10 MB), each with a different schema.
This data will grow even more in the future.
What I want to do:
I want to map this data into MongoDB.
What approaches have I considered?
I can't map each dataset to a document in Mongo since there is a limit of 4-16 MB per document.
I cannot create a collection for each dataset, as the maximum number of collections is also limited (<24,000).
So finally I thought of creating a collection for each department, with one document for each record in the CSV files belonging to that department.
I want to know from you:
Will there be a performance issue if we map each record to a document?
Is there a maximum limit on the number of documents?
Is there any other design I can do?
Will there be a performance issue if we map each record to a document?
Mapping each record to a document in MongoDB is not a bad design. You can have a look at the FAQ on the MongoDB site:
http://docs.mongodb.org/manual/faq/fundamentals/#do-mongodb-databases-have-tables
It says,
...Instead of tables, a MongoDB database stores its data in collections,
which are the rough equivalent of RDBMS tables. A collection holds one
or more documents, which corresponds to a record or a row in a
relational database table....
Along with the BSON document size limit (16 MB), MongoDB also has a maximum of 100 levels of document nesting:
http://docs.mongodb.org/manual/reference/limits/#BSON Document Size
...Nested Depth for BSON Documents Changed in version 2.2.
MongoDB supports no more than 100 levels of nesting for BSON documents...
So it's better to go with one document for each record.
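A small pymongo sketch of that mapping, with placeholder file, database, and collection names: each CSV row becomes one document in the department's collection.

import csv
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
dept_coll = client["datasets"]["department_sales"]   # one collection per department

with open("dataset_001.csv", newline="") as f:
    rows = list(csv.DictReader(f))                   # one dict per CSV record
    if rows:
        dept_coll.insert_many(rows)                  # one document per record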
Is there a maximum limit on the number of documents?
No, as mentioned in the MongoDB reference manual:
...Maximum Number of Documents in a Capped Collection. Changed in version 2.4.
If you specify a maximum number of documents for a capped collection using the max parameter to create, the limit must be less than 2^32 documents. If you do not specify a maximum number of documents when creating a capped collection, there is no limit on the number of documents...
Is there any other design I can do?
If your documents are too large, you can consider partitioning them at the application level, but that will have a high computation requirement at the application layer.
Will there be a performance issue if we map each record to a document?
That depends entirely on how you search them. When you use a lot of queries that each affect only one document, it is likely even faster that way. When a higher document granularity results in a lot of document-spanning queries, it will get slower, because MongoDB then has to touch many documents instead of one.
Is there a maximum limit on the number of documents?
No.
Is there any other design I can do?
Maybe, but that depends on how you want to query your data. When you are content with treating files as BLOBs that are retrieved as a whole but not searched or analyzed at the database level, you could consider storing them in GridFS. It's a way to store files larger than 16 MB in MongoDB (see the sketch below).
In general, MongoDB database design doesn't depend so much on what and how much data you have, but rather on how you want to work with it.
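A brief sketch of the GridFS option with pymongo's gridfs module, using placeholder database and file names; the file is stored in chunks and retrieved as a whole, not queried row by row.

import gridfs
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
fs = gridfs.GridFS(client["datasets"])

# Store a CSV file that exceeds the 16 MB BSON document limit.
with open("large_dataset.csv", "rb") as f:
    file_id = fs.put(f, filename="large_dataset.csv")

# Retrieve it later as an opaque blob.
data = fs.get(file_id).read()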