We are using .NET Core and Node.js microservices, some of them with MongoDB.
Currently we have the following DB structure:
Every customer gets their own database.
So if we have a microservice for Invoices, every new customer adds one new DB for that microservice:
Invoice_customerA
Invoice_customerB
etc...
The collections in each such DB remain the same (usually we have 1-3 collections in each DB).
In terms of logic, we choose the right DB from the request input at runtime.
I am now thinking about changing this a bit and separating by collection instead.
Taking the same example as before, the Invoice service would have only one DB,
Invoice_allCustomers
and there would be one new collection in it for each customer (or more, if the service has more than one collection per customer).
collection_customerA
collection_customerB
What I am trying to understand is whether there is any difference performance-wise.
Or is it mostly a "cosmetic" change?
Or maybe there are some other considerations?
P.S.
If the change is mostly cosmetic, I think the new solution is better for us, since we usually have only 1-2 collections per microservice.
It will also be easier to navigate when there are significantly fewer databases.
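To make the two layouts concrete, here is a minimal sketch of how the runtime lookup could resolve a database and collection name under each scheme. The function names and the `Namespace` type are hypothetical; only the naming patterns (`Invoice_customerA`, `Invoice_allCustomers`, `collection_customerA`) come from the question.

```python
from typing import NamedTuple


class Namespace(NamedTuple):
    """A (database, collection) pair, as a driver would need it."""
    database: str
    collection: str


def db_per_customer(service: str, customer: str, collection: str) -> Namespace:
    # Current layout: one database per customer, same collection names in each.
    return Namespace(f"{service}_{customer}", collection)


def collection_per_customer(service: str, customer: str, collection: str) -> Namespace:
    # Proposed layout: one shared database, one collection per customer.
    return Namespace(f"{service}_allCustomers", f"{collection}_{customer}")


current = db_per_customer("Invoice", "customerA", "collection")
proposed = collection_per_customer("Invoice", "customerA", "collection")
print(current)   # database "Invoice_customerA", collection "collection"
print(proposed)  # database "Invoice_allCustomers", collection "collection_customerA"
```

With a driver such as pymongo the result would typically be used as `client[ns.database][ns.collection]`, so from the application's point of view the change is limited to this one mapping.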
As far as I know, in microservices each service should have its own database. If it is not a different service, then you can use one database with different collections in it. It is more of a cosmetic change, but I should also warn you that MongoDB still has its documented limits. It really depends on the amount of data that will be stored and retrieved.
Related
I'm currently experimenting with a test collection on a LAN-accessible MongoDB server and data in a Meteor (v1.6) application. The view layer of choice is React, and right now I'm using createContainer to bind the subscriptions to props.
The data that gets put in the MongoDB storage is updated on a daily basis and consists of a big set of data from several SQL databases, netting up to about 60000 lines of JSON per day. The data has been ever-so-slightly reshaped to be turned into a usable format whilst remaining as RAW as I'd like it to be.
The working solution right now is fetching all this data and doing further manipulations client-side to prepare the data for visualization. The issue should seem obvious: each client is fetching a set of documents that grows every day and repeats a lot of work on earlier entries before being ready to display. I want to do this manipulation on the server, through MongoDB's Aggregation Framework.
My initial idea is to do the aggregations on the server and to create new Collections containing smaller, more specific datasets without compromising the RAWness of the original Collection. That would mean the "reduced" Collections can still be reactive, as I've been able to confirm through testing in a Remote Desktop, subscribing to an aggregated Collection which I can update through Robo3T.
I don't know if this would be ideal. As far as storage goes, there's plenty of room for the extra Collections. But I have no idea how to set up an automated aggregation script on said server. And regarding Meteor, I've tried using meteorhacks:aggregate and jcbernack:reactive-aggregate but couldn't figure out how to deal with either one of them. If anyone is dealing, or has dealt, with something similar, I'd love to hear ideas / suggestions.
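As a rough sketch of the "aggregate on the server into a smaller collection" idea, the pipeline below groups the raw documents by one field and writes the result to a separate collection with the `$out` stage. The field name `day` and the collection name `daily_counts` are made up for illustration; the real grouping would depend on the shape of the raw data.

```python
from typing import Any, Dict, List


def summary_pipeline(group_field: str, target_collection: str) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline that counts documents per value of
    group_field and stores the result in target_collection."""
    return [
        # One output document per distinct value of the grouping field.
        {"$group": {"_id": f"${group_field}", "count": {"$sum": 1}}},
        {"$sort": {"_id": 1}},
        # $out must be the last stage; it replaces the target collection
        # with the pipeline output on every run.
        {"$out": target_collection},
    ]


pipeline = summary_pipeline("day", "daily_counts")  # hypothetical names
print(pipeline)
```

With pymongo this list could be passed as `db.raw.aggregate(pipeline)` (where `raw` is an assumed name for the original collection), and the script could be run once a day by whatever scheduler the server already has, for example cron. The raw collection is left untouched, since `$out` only writes to the target.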
I am pretty new to MongoDB. I am creating an application where I will have users and a lot of other data. I have already created a database where I am storing user information using MongoDB. Now I have to create a new database or a new collection to store the rest of the data. What are the pros and cons of creating a different database versus a different collection?
I use MongoDB in a very similar way and have already thought a lot about dividing my database. Here are some of the things we considered:
Using 2 databases is harder to maintain: your application will have to know which database to update. It can also increase costs (even more if you intend to monitor the databases and host them on different infrastructure).
Mongo 2 used to lock the entire database when updating, so back then it was better to separate, but Mongo 3 with WiredTiger locks only the document, so you won't have the problems we used to have in the past.
One good thing about splitting the database in two is that even if your data blows up one database, the other will still work.
IMHO, if you use a decent machine to store your databases and monitor it the right way, you won't have any troubles keeping just one until your system is giant with millions of active users. You can also use Replica Sets and Sharding to increase efficiency.
My team will deploy a new version of our app (it captures social media posts, hashtags, etc.). They create a different DB for each user, and we may have thousands of collections in each DB. I read all of the MongoDB sharding documentation and saw that I can only shard one collection or one DB at a time. Am I missing something?
We will start this new version fresh, without any databases, and grow from 0 again (for now we have 23k users), but this number will grow really quickly (100,000+ by the end of the year).
My question is: do I really need a sharded cluster? (My test setup has 3 shards with 3 microshards, 3 config servers and 2 mongos.) For now, in production, I have one large server doing all the hard work, but I don't want to keep scaling up; horizontal scaling is the best choice, I think.
Can I shard all my databases automatically, or do I really need to do it one by one, going through the shard key procedure and so on?
Thanks in advance
You are reading correctly. What you intend to do is so far away from what any sensible person would do that MongoDB doesn't offer any tools to support this. If you really want to go with this WTF solution, your application will be responsible for setting up sharding for each collection it creates. This forces you to give administrative permissions to the application (contrary to what any security guide recommends).
"Will you really need a sharded cluster?" That depends on how much data you will have and how often you query it with what kind of query. But it is unlikely to work anyway, because your sharded cluster will have to manage 100,000 databases × 1,000 collections = a hundred million collections. MongoDB is not designed for scaling in that direction. The cluster will likely be so busy with bookkeeping that you won't really see any notable performance gain.
It is also questionable whether clustering would make sense even in theory. Clustering is usually only useful when you have very large collections. But in your scenario, where your data is so heavily fragmented into millions of collections, each individual collection is unlikely to be very large.
If you really want to go this route, it might in fact be a better solution to separate the databases physically by assigning each user to a database server.
Or you could just build a database architecture like a normal team would, with one database for all users and one collection per type of document. You would then speed up lookups by creating a compound index on user and whatever criteria you used to tell which collection a document belonged to. This index might also be a good shard key.
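A minimal sketch of that last suggestion, assuming a field called `user_id` and a second field such as `hashtag` (both names are invented here; the question does not say what the collections were split on). Every document lives in one collection per document type, and the compound index covers the lookup that used to be done by picking a database and collection.

```python
from typing import Any, Dict, List, Tuple

ASCENDING = 1  # same value as pymongo.ASCENDING


def user_index_keys(criterion_field: str) -> List[Tuple[str, int]]:
    """Key specification for a compound index: user first, then the
    criterion that previously decided where a document was stored."""
    return [("user_id", ASCENDING), (criterion_field, ASCENDING)]


def user_filter(user_id: Any, criterion_field: str, criterion_value: Any) -> Dict[str, Any]:
    """Query filter that matches the index prefix, so the lookup can use it."""
    return {"user_id": user_id, criterion_field: criterion_value}


print(user_index_keys("hashtag"))
print(user_filter(42, "hashtag", "mongodb"))
```

With pymongo the key list would go to `collection.create_index(...)` and the filter to `collection.find(...)`. Whether the same key pair works well as a shard key depends on how evenly the data is spread across users, so treat that part as something to test rather than a given.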
I know this question has been asked a number of times here. However, I am unable to find a satisfying answer and reach a conclusion.
This question is specifically for MongoDB version 3.2. Should I have separate DBs and collections for different apps, or just one DB with all collections within it?
To simplify it further, let’s say I have about 15-20k apps on a server. Is it advisable to create a different database for each of these apps (with 10 collections/app), or create just one database and store all collections (20k apps * 10 collections = 200k collections) in it?
Also, this would be called from a single Node app, so I need to consider the performance impact of having multiple DB connections.
Should I have separate DBs and collection for different apps, or just one DB with all collections within it.
As I understand it, there is no concrete answer or algorithm for this question. It actually depends on the number of connections to the database.
The number of connections depends on the number of concurrent incoming requests, so first consider logically whether it is really worth creating a different database for each of these apps,
or going the other way (one database with 10 collections/app).
Second, you can measure the response time of each option with some load testing (if possible).
I'm planning to port from Entity Framework 4.0 to MongoDB. What are the best practices that can minimize the impact, given that the project has social networking functionality and hence maintains a complex relational database? Because of that, performance would become an issue if we kept using a relational database.
We have used a domain layer (using POCOs), the repository pattern and DTO mapping in the project.
Also, what are the advantages and disadvantages of this decision? And how would it affect my domain layer implementation?
If you want to 'minimize impact' you'll want to create a database in MongoDB that mirrors the one you have in SQL. Since there are no joins in the database you'll need to do multiple reads to complete your query. In itself that's not too bad because MongoDB is really fast, but obviously it has other issues (concurrency, etc.).
If, however, you want to move over fully to the NoSQL way of doing things, you'll likely not be able to 'minimize impact': you'll need to make substantial changes to the way you store content, the way you access it and the way you update it.
Storage: You'll likely create documents in your database that are denormalized and much closer to 'ViewModels' than 'Models'. You might for example store a count of child records in a parent record so that you can display it without having to load them or count them.
Access: You might end up using Map-Reduce for some queries to your database which is a very different mind-set from a traditional query.
Updates: In all likelihood your approach to updating will be different in order to take advantage of the many fine-grained MongoDB update features like $inc. Instead of posting back some large view model and then applying it to your model and then updating the database you might instead provide a much finer-grained Ajax call back that updates a single value. Take a look at CQRS for more ideas on how to think about models for updates vs queries.
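To illustrate the storage and update points together, here is a small sketch of a denormalized counter kept on a parent document and bumped with a `$inc` update. The field name `comment_count` is hypothetical, and `apply_inc` is only an in-memory stand-in for what the server does, so the effect can be seen without a running database.

```python
from typing import Any, Dict


def increment_update(field: str, amount: int = 1) -> Dict[str, Any]:
    """Build a fine-grained update document that changes a single counter."""
    return {"$inc": {field: amount}}


def apply_inc(document: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """In-memory imitation of $inc for top-level fields: a missing field
    is created with the increment amount as its value."""
    result = dict(document)
    for field, amount in update["$inc"].items():
        result[field] = result.get(field, 0) + amount
    return result


post = {"_id": 1, "title": "Hello", "comment_count": 2}
update = increment_update("comment_count")
print(apply_inc(post, update))
```

With a driver, the same update document would be sent as something like `posts.update_one({"_id": 1}, update)`, so only the counter travels over the wire instead of the whole view model.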