I am building an online platform where users can attend courses. Each course has a set of exercise groups (think exercise sessions), and each exercise group has multiple exercises.
An exercise can contain multiple modules. Furthermore, an exercise also has multiple hand-ins and one or more solutions. In short, I have the hierarchy:
Course <- {UserCourseRole, ExerciseGroup <- {Exercise <- module, handin, solution}}.
The problem is that I cannot determine which entities should be aggregate roots. For now, only Course is an aggregate root. Given the database design, I'm having trouble identifying further aggregate roots.
I cannot add a picture, but you can access the diagram online by following this link
I'm having some trouble correctly designing the domain that I'm working on.
My straightforward use case is the following:
The user (~5,000 users) can access a list of ads (~5 million).
He can choose to add or remove some of them as favorites.
He can decide to show/hide some of them.
I have a command which will mutate the aggregate state, to set Favorite to TRUE, let's say.
In terms of DDD, how should I design the aggregates?
How should I design the relationship between a user and his selection of favorite ads?
Considering the large number of ads, I cannot duplicate each ad inside a user aggregate root.
Can I design an Ad aggregate root containing a "collection" of users?
And finally, how should I handle the read models?
Thanks in advance
Cheers
Two concepts may help you understand how to model this:
1. Aggregates are Transaction Boundaries.
An aggregate is a cluster of associated objects that are considered as a single unit. All parts of the aggregate are loaded and persisted together.
If you have an aggregate that encloses 1,000 entities, then you have to load all of them into memory. It follows that you should prefer small aggregates whenever possible.
2. Aggregates are Distinct Concepts.
An Aggregate represents a distinct concept in the domain. Behavior associated with more than one Aggregate (like Favoriting, in your case) is usually an aggregate by itself with its own set of attributes, domain objects, and behavior.
From your example, User is a clear aggregate.
An Ad is a distinct concept in the domain, so it is an aggregate too. It may have attributes embedded within it, such as valid_until, description, is_active, etc.
The concept of favoriting an Ad links the User and the Ad aggregates. Your question seems to be centered around where this linkage should be preserved. Should it be in the User aggregate (a list of Ads), or should an Ad have a collection of User objects embedded within it?
While both are possible, IMHO FavoriteAd is yet another aggregate, which holds references to both the User aggregate and the Ad aggregate. This way, you don't burden the concepts of User or Ad with favoriting behavior.
Those aggregates will also not be required to load this additional data every time they are loaded into memory. For example, if you are loading an Ad object to edit its contents, you don't want the favorites collection to be loaded into memory by default.
These aggregate structures don't matter as far as read models are concerned. Aggregates only deal with the write side of the domain. You are free to rewire the data any way you want, in multiple forms, on the read side. You can have a subscriber just to listen to the Favorited event (raised after processing the Favorite command) and build a composite data structure containing data from both the User and the Ad aggregates.
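As a rough illustration of the idea above, here is a minimal Python sketch, assuming an in-memory dictionary as the aggregate store and a plain callback as the event publisher. The names `FavoriteAd`, `Favorited`, `FavoritesReadModel` and `handle_favorite` are hypothetical and only serve the example; a real system would use its own persistence and messaging.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass(frozen=True)
class Favorited:
    """Event raised after a Favorite command has been processed."""
    user_id: str
    ad_id: str


@dataclass
class FavoriteAd:
    """Small aggregate that references User and Ad by id only (assumed design)."""
    user_id: str
    ad_id: str
    favorite: bool = False

    def mark_favorite(self) -> List[Favorited]:
        # Favoriting twice raises no second event.
        if self.favorite:
            return []
        self.favorite = True
        return [Favorited(self.user_id, self.ad_id)]


@dataclass
class FavoritesReadModel:
    """Read-side projection: the ad ids each user has favorited."""
    by_user: Dict[str, Set[str]] = field(default_factory=dict)

    def on_favorited(self, event: Favorited) -> None:
        self.by_user.setdefault(event.user_id, set()).add(event.ad_id)


def handle_favorite(
    store: Dict[Tuple[str, str], FavoriteAd],
    user_id: str,
    ad_id: str,
    publish: Callable[[Favorited], None],
) -> None:
    """Load (or create) the small aggregate, mutate it, publish its events."""
    key = (user_id, ad_id)
    aggregate = store.get(key) or FavoriteAd(user_id, ad_id)
    for event in aggregate.mark_favorite():
        publish(event)
    store[key] = aggregate
```

Neither the User nor the Ad aggregate is loaded here; the read model is built purely from the published events.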
I really like the answer given by Subhash Bhushan and I want to add another approach for you to consider.
If you look closely at your question you will see that you've made the assumption that an aggregate can 'see' everything that the user does when they are interacting with the UI. This doesn't need to be so.
Depending on the requirements of the domain you don't need to hold a list of any Ads in the aggregate to favourite them. Here's what I mean:
For this example, it doesn't matter where the 'favourite' ad command sits. It could be on the user aggregate or on a specific aggregate for handling the concept of Favouriting. The command just needs to hold the id of the User and the id of the Ad they are favouriting.
You may need to handle what happens if a user or ad is deleted but that would just be a case of an event process manager listening to the appropriate events and issuing compensating commands.
This way you don't need to load up 5 million ads. That's a job for the read model and UI, not the domain.
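To make that concrete, here is a small sketch, assuming the commands carry only ids and that some lookup of "which users favourited which ad" is maintained outside the aggregates. All class names (`FavoriteAdCommand`, `UnfavoriteAdCommand`, `AdDeleted`, `FavoriteCleanupProcess`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class FavoriteAdCommand:
    """The command holds ids only, never the Ad or User objects."""
    user_id: str
    ad_id: str


@dataclass(frozen=True)
class UnfavoriteAdCommand:
    """Compensating command issued when a favourite must be undone."""
    user_id: str
    ad_id: str


@dataclass(frozen=True)
class AdDeleted:
    ad_id: str


class FavoriteCleanupProcess:
    """Hypothetical process manager reacting to AdDeleted events."""

    def __init__(self, users_by_ad: Dict[str, Set[str]]) -> None:
        # Assumption: this index is kept up to date elsewhere (e.g. a read model).
        self._users_by_ad = users_by_ad

    def on_ad_deleted(self, event: AdDeleted) -> List[UnfavoriteAdCommand]:
        # One compensating command per user who favourited the deleted ad.
        users = sorted(self._users_by_ad.get(event.ad_id, set()))
        return [UnfavoriteAdCommand(user_id, event.ad_id) for user_id in users]
```

The same shape would work for a deleted user; only the lookup direction changes.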
Just a thought.
Suppose I have a store such as Amazon that sells a variety of products such as computers and paintings. They are quite different from each other and have their own set of fields and logic.
In addition to the typical CRUD, I need to design a JSON API that allows me to:
A. Fetch an ungrouped list of paintings and computers. For example: [computer, painting, painting, computer, ...] ordered by date published (so with filtering capability).
B. Fetch only paintings
C. Fetch only computers
The RESTful approach will typically be something like /api/paintings and /api/computers, which works really well for segregated results.
But my main concern is operation A: getting an ungrouped list of paintings and computers sorted by date published. The way I see it, there are three approaches:
1) Create a new standalone resource called products such as /api/products which will have filtering capability and continue to use /api/resource for specific CRUD operations.
2) Create a parent products resource which will be used for filtering operations. So I can do something like /products?order_by=published_date And for more specific resources I can do something like /products/paintings or /products/computers
3) Do not have a resource for paintings or computers. Instead have one for a generic product. I will then have most logic in the api layer and reduce the complexity of the client.
I am leaning towards approach #3 but wanted to get feedback prior to implementing since this will be a core feature of the API.
I've always taken the approach that your API layer should match your object model. So the answer to your question is that it depends on the source data, or rather on the source data after it has been modeled as objects.
If you have an object model for Computer and for Painting, they should be resources like you've said. Do they share any data/functions? If so, you should have an object model for that too, perhaps Product. Then Computer and Painting extend the Product class.
With that in mind, design the API layer to mirror it. Since Computer and Painting both extend Product, having Product as the parent of the Computer and Painting resources makes sense.
In my opinion I would go for approach #3 and query the API with type of product if you search for it.
/products?type=computers&order_by=date
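A minimal sketch of how such a query could be served, assuming an in-memory list of products and that `order_by=date` maps to a `published_date` field. `Product`, `SORT_KEYS` and `list_products` are hypothetical names for this example, not part of any framework.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, List, Optional


@dataclass
class Product:
    id: int
    type: str              # e.g. "computers" or "paintings"
    published_date: date


# Whitelist of supported order_by values (assumed mapping).
SORT_KEYS: Dict[str, Callable[[Product], date]] = {
    "date": lambda product: product.published_date,
}


def list_products(
    products: List[Product],
    type_: Optional[str] = None,
    order_by: Optional[str] = None,
) -> List[Product]:
    """Filter by product type (if given) and sort by a whitelisted key."""
    result = [p for p in products if type_ is None or p.type == type_]
    if order_by is not None:
        if order_by not in SORT_KEYS:
            raise ValueError(f"unsupported order_by: {order_by}")
        result.sort(key=SORT_KEYS[order_by])
    return result
```

Omitting `type_` gives the mixed list from operation A; passing it gives operations B and C from the same endpoint.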
We recently started work on a big project and decided to use MongoDB as our database solution.
We wrote a lot of code, but the project has started to grow and we found out that we're trying to use joins instead of doing it the NoSQL way, which points to a bad database design.
What I'm asking for is a good design for our project, which at this point consists of the following:
More than 12,000 Products
More than 2,000 Sellers
Every seller should have their own private area that allows them to create a product catalog based on the "products template list" of 12,000+ items.
The seller should be able to set the price, stock and offers, which will then be reflected only in his public product listing. The template list of products will remain unchanged.
Currently we have two collections. One for the products (which holds the general product information, like name, description, photos, etc...) and one collection in which we store documents that contain the ID of the product from the first collection, an ID that is related to the seller and the stock, price and offers values.
We are using aggregate with $lookup to "emulate" SQL's left join to merge the two collections, but the process is not scaling as we'd like it to and we're hitting serious performance issues.
We're aware that using joins is not the way to go in NoSQL. What should we do? How should we refactor our database design? Should we embed the prices, offers and stock for each seller in each document?
The decision between "embedded documents" and "joins among two or more different collections" should depend on how you are going to retrieve the data. If you are going to fetch the sellers every time you fetch a product, then it makes sense to make them an embedded document instead of a separate collection. But if you plan to fetch these two entities separately, then the only option you are left with is to use a join.
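To illustrate the embedded shape, here is a sketch using plain Python dictionaries in place of MongoDB documents. The field names (`product_id`, `seller_id`, `price`, `stock`) and both helper functions are assumptions for the example, not the asker's actual schema.

```python
from typing import Any, Dict, List, Optional

Document = Dict[str, Any]


def embed_offers(product: Document, offers: List[Document]) -> Document:
    """Return a copy of the product with its seller offers embedded."""
    embedded = dict(product)
    embedded["offers"] = [
        # Drop the now-redundant back-reference to the product.
        {key: value for key, value in offer.items() if key != "product_id"}
        for offer in offers
        if offer["product_id"] == product["_id"]
    ]
    return embedded


def offer_for_seller(embedded_product: Document, seller_id: str) -> Optional[Document]:
    """Read one seller's price/stock from the embedded document, no join needed."""
    for offer in embedded_product["offers"]:
        if offer["seller_id"] == seller_id:
            return offer
    return None
```

One thing to weigh before embedding: with many sellers per product the `offers` array grows with every seller, and MongoDB limits a single document to 16 MB.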
We were looking for the most suitable database for our innovative “collaboration application”. Sorry, we don’t know how to name it in a way generally understood. In fact, highly complicated relationships among tenants, roles, users, tasks and bills need to be handled effectively.
After reading about 5 DBs (PostgreSQL, Mongo, Couch, Arango and Neo4J), when the words “… relationships among things are more important than things themselves” caught my eye, I made up my mind to dig into OrientDB. Both the design philosophy and the innovative features of OrientDB (multi-model, clusters, OO, native graph, full graph API, SQL-like, LiveQuery, multi-master, auditing, simple RID and version number ...) keep intensifying my enthusiasm.
OrientDB enlightens me to re-think and try to model from a totally different viewpoint!
We are now designing the data structure based on OrientDB. However, there are some questions puzzling me.
LINK vs. EDGE
Take the case where a CLIENT may place thousands of ORDERs: how do we choose between LINKs and EDGEs to store the relationships? I prefer EDGEs, but they seem to store thousands of RIDs of ORDERs in the CLIENT record.
Embedded records’ Security
Can an embedded record be authorized independently of its container record?
Record-level Security
How does activating Record-level Security affect the query performance?
I hope I have expressed myself clearly. Any advice will be truly appreciated.
LINK vs EDGE
If the relationship carries no properties, you can use a link; if it does, use edges. You really need edges if you need to traverse the relationship in both directions, whereas a linklist can only be followed in one direction (just like a hyperlink on the web), without the overhead of edges. Edges are the right choice if you need to walk through a graph, but they require more storage space than a linklist.
Another difference is that if two vertices are connected through a link, A --> (link) B, and you delete B, the link doesn't disappear: it remains but points to nothing. It is designed this way because, when you delete a document, finding all the other documents that link to it would mean doing a full scan of the database, which typically takes ages to complete. The Graph API, with bi-directional links, is specifically designed to resolve this problem, so in general we suggest that customers use it, or else be careful and manage link consistency at the application level.
RECORD - LEVEL SECURITY
With 1 million vertices and an admin user called Luke, a query like: select from where title = ? with a NOT_UNIQUE_HASH_INDEX had an execution time of 0.027 sec.
OrientDB has the concept of users and roles, as well as Record Level Security. It also supports token based authentication, so it's possible to use OrientDB as your primary means of authorizing/authenticating users.
EMBEDDED RECORD'S SECURITY
I made this example to try to answer your question.
I have this structure:
If I want to access the embedded data, I have to run this command: select prop from User
If I instead try to access it through the class that holds the car type, I get no results:
select from Car
UPDATE
OrientDB supports that kind of authorization/authentication, but it's a little bit different from your example. For example: if a user A, without admin permission, inserts a record, another user B without admin permission can't see the record inserted by user A. A user can see only the records that he has inserted.
Hope it helps
I'm more used to relational databases and am having a hard time thinking about how to design my database in MongoDB. I'm even more unclear when taking into account some of the special considerations of database design for Meteor.js, where I understand you often prefer separate collections over embedded documents/data in order to make better use of some of the benefits you get from collections.
Let's say I want to track students' progress in high school. They need to complete certain required classes each school year in order to progress to the next year (freshman, sophomore, junior, senior), and they can also complete some electives. I need to track when the students complete each requirement or elective. The requirements may change slightly from year to year, but I need to remember, for example, that Johnny completed all of the freshman requirements as they existed two years ago.
So I have:
Students
Requirements
Electives
Grades (frosh, etc.)
Years
Mostly, I'm trying to think about how to set up the requirements. In a relational DB, I'd have a table of requirements, with className, grade, and year, and a table of student_requirements that tracks the students as they complete each requirement. But I'm thinking that in MongoDB/Meteor.js, I'd have a model for each grade/level that gets stored with a studentID and is initially instantiated with false values for each requirement, like:
{
    student: [studentID],
    class: 'freshman',
    year: 2014,
    requirements: {
        class1: false,
        class2: false
    }
}
and as the student completes a requirement, it updates like:
{
    student: [studentID],
    class: 'freshman',
    year: 2014,
    requirements: {
        class1: false,
        class2: [completionDateTime]
    }
}
So in this way, each student will collect four Requirements documents, which are somewhat dictated by their initial instantiation values. And instead of the actual requirements for each grade/year living in the database, they would essentially live in the code itself.
Some of the actions I would like to be able to support are marking off requirements across a set of students at one time, and showing a grid of users/requirements to see who needs what.
Does this sound reasonable? Or is there a better way to approach this? I'm pretty early in this application and am hoping to avoid painting myself into a corner. Any help or suggestion is appreciated. Thanks! :-)
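For reference, the document shape and the two actions described above can be sketched in plain Python, with dictionaries standing in for MongoDB documents. The helper names (`new_requirements_doc`, `mark_completed`, `outstanding_grid`) are made up for this illustration and the data is not tied to any driver.

```python
from datetime import datetime
from typing import Any, Dict, Iterable, List

Document = Dict[str, Any]


def new_requirements_doc(
    student_id: str, grade: str, year: int, requirement_names: Iterable[str]
) -> Document:
    """Create the per-student document with every requirement set to False."""
    return {
        "student": student_id,
        "class": grade,
        "year": year,
        "requirements": {name: False for name in requirement_names},
    }


def mark_completed(
    docs: List[Document],
    student_ids: Iterable[str],
    requirement: str,
    completed_at: datetime,
) -> int:
    """Mark one requirement as completed for a set of students at once."""
    wanted = set(student_ids)
    updated = 0
    for doc in docs:
        if doc["student"] in wanted and requirement in doc["requirements"]:
            doc["requirements"][requirement] = completed_at
            updated += 1
    return updated


def outstanding_grid(docs: List[Document]) -> Dict[str, List[str]]:
    """Map each student to the requirements still marked False."""
    return {
        doc["student"]: sorted(
            name for name, done in doc["requirements"].items() if done is False
        )
        for doc in docs
    }
```

Note that in this shape the list of requirement names is supplied by the application code, which matches the point above that the requirements live in the code rather than in the database.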
I'm currently thinking about my application's data design too. I've read the examples in the MongoDB manual:
MongoDB manual, data model design - docs.mongodb.org/manual/core/data-model-design/
and MongoDB manual, one-to-one relationships - docs.mongodb.org/manual/tutorial/model-embedded-one-to-one-relationships-between-documents/
(sorry, I can't post more than one link in an answer at the moment)
They say:
In general, use embedded data models when:
you have “contains” relationships between entities.
you have one-to-many relationships between entities. In these relationships the “many” or child documents always appear with or are viewed in the context of the “one” or parent documents.
The normalized approach uses a reference in one document to another document, just like in the Meteor.js book. There they create a web app which shows posts, and each post has a set of comments. They use two collections, the posts and the comments. When a comment is added, it is submitted together with the post_id.
So in your example you have a students collection. And each student has to fulfill requirements? And each student has his own requirements, like a post has its own comments?
Then I would handle it like they did in the book, with two collections. I think that should be the normalized approach, not the embedded one.
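Here is roughly what I mean, as a small Python sketch with lists of dictionaries standing in for the two collections. The field names (`student_id`, `requirement`, `completed_at`) and the helpers are just my assumptions, modelled on how the book stores the post_id with each comment.

```python
from datetime import datetime
from typing import Any, Dict, List

Document = Dict[str, Any]


def add_completion(
    completions: List[Document],
    student_id: str,
    requirement: str,
    completed_at: datetime,
) -> Document:
    """Store a completion that references its student by id (like post_id)."""
    doc = {
        "student_id": student_id,
        "requirement": requirement,
        "completed_at": completed_at,
    }
    completions.append(doc)
    return doc


def completions_for(completions: List[Document], student_id: str) -> List[str]:
    """Application-side lookup: requirement names completed by one student."""
    return sorted(
        doc["requirement"] for doc in completions if doc["student_id"] == student_id
    )
```

The students collection stays small, and each completed requirement is its own document in the second collection.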
I'm a little confused myself, so maybe you can tell me, if my answer makes sense.
Maybe you can help me too? I'm trying to make an app that manages a flea market.
Users of the app create events.
The creator of the event invites users to be cashiers for that event.
Users create lists of stuff they want to sell. There is a max. number of lists/sellers per event and a max. number of positions on a list (25/50).
Cashiers type in the positions of those lists at the event, to track what is sold.
Event creators make billings for the sold stuff of each list, to hand out the money afterwards.
I'm confused how to set up the data design. I need Events and Lists. Do I use the normalized approach, or the embedded one?
Edit:
After reading percona.com/blog/2013/08/01/schema-design-in-mongodb-vs-schema-design-in-mysql/ I found the following advice:
If you read people information 99% of the time, having 2 separate collections can be a good solution: it avoids keeping in memory data is almost never used (passport information) and when you need to have all information for a given person, it may be acceptable to do the join in the application.
Same thing if you want to display the name of people on one screen and the passport information on another screen.
But if you want to display all information for a given person, storing everything in the same collection (with embedding or with a flat structure) is likely to be the best solution