I want to know the best way to design the domain objects for my project.
The domain objects should be ignorant of the ORM/framework and even of the database technology (MySQL, MSSQL, or even MongoDB).
The next question that comes to mind: let's say I have two objects, Post and Comment.
This is the code for the Post object:
class Post {
    public int Id { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
    public IList<Comment> Comments { get; set; }
}
Now, my question for the Comment object is should it include a Post Id or just a Post object?
class Comment {
    public int Id { get; set; }
    public string Content { get; set; }
    public Post Post { get; set; }
    public int PostId { get; set; } // Should I include it?
}
Is it good practice to include the Id of the related object, or just the domain object itself?
Another concern is NoSQL databases like MongoDB.
If I store my data in MongoDB, which Post object will the Comment object reference?
In MongoDB there is no relationship between posts and comments: each post includes a list of comments, but the comments themselves are ignorant of the post they are part of.
So what is the best way to make these domain classes compatible with both SQL-based and NoSQL databases?
Thank you
Object-oriented design
You designed your Post and Comment classes so that they reference each other. While I'd try to avoid this tight coupling unless absolutely necessary, it does make Comment.PostId redundant: you can just read Post.Id from within a Comment.
Also, your domain objects should probably try to protect their invariants. What you have now are just property bags (with public setters, no less), not domain objects, so currently they don't offer any meaningful advantage over the objects you'd use for persistence.
So if you want to create a domain model, ask yourself questions like these:
What can an actor do with a post in the system I'm building?
What data on a comment can an actor retrieve?
Then model your domain objects in a way that supports your business cases. For example, if users of your application are prohibited from changing the title of a post, you should remove the setter from Post.Title.
This answer may give you more information on the distinction between domain objects and simple property bags (aka POCOs), even though that answer is in the context of domain-driven design.
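As a minimal sketch of the difference between a property bag and a domain object, the classes below enforce a few rules themselves. The specific rules (a title is required and cannot change, comments cannot be empty) and the member names Edit and AddComment are assumptions made for illustration; your real rules come from your own business cases.

```csharp
using System;
using System.Collections.Generic;

public class Comment
{
    public string Content { get; }

    public Comment(string content)
    {
        // Assumed rule: a comment must have some text.
        if (string.IsNullOrWhiteSpace(content))
            throw new ArgumentException("A comment needs content.", nameof(content));
        Content = content;
    }
}

public class Post
{
    private readonly List<Comment> _comments = new List<Comment>();

    // No setter: the title cannot be changed after the post is created.
    public string Title { get; }
    public string Content { get; private set; }

    // Callers can read the comments but can only add one through AddComment.
    public IReadOnlyList<Comment> Comments => _comments;

    public Post(string title, string content)
    {
        if (string.IsNullOrWhiteSpace(title))
            throw new ArgumentException("A post needs a title.", nameof(title));
        Title = title;
        Content = content ?? string.Empty;
    }

    // Hypothetical operation: editing the body is allowed, editing the title is not.
    public void Edit(string newContent)
    {
        Content = newContent ?? string.Empty;
    }

    public Comment AddComment(string content)
    {
        var comment = new Comment(content);
        _comments.Add(comment);
        return comment;
    }
}
```

Note that Comment holds no reference back to Post here, which also sidesteps the PostId question at the domain level; whether that fits depends on how comments are used in your application.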
Document DB Persistence
To store these objects in a document-oriented DB, you have basically two choices:
Store them as separate documents and reference one from the other
Store comments as part of their post on the post document
Both are valid approaches; you have to decide which one fits your application. Note, however, that document boundaries are also consistency boundaries, so if you need atomic operations on a post and its comments, your only option is to put them in the same document (unless your database supports multi-document transactions).
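The two choices can be sketched as persistence shapes like the ones below. All class and property names are hypothetical, string ids are an assumption, and these are storage documents rather than the domain objects themselves.

```csharp
using System.Collections.Generic;

// Choice 1: separate documents. The comment document points at its post by id.
public class PostDocument
{
    public string Id { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
}

public class CommentDocument
{
    public string Id { get; set; }
    public string PostId { get; set; }   // reference to PostDocument.Id
    public string Content { get; set; }
}

// Choice 2: comments embedded in the post document.
// The whole post, including its comments, is read and written as one unit.
public class PostWithCommentsDocument
{
    public string Id { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
    public List<EmbeddedComment> Comments { get; set; } = new List<EmbeddedComment>();
}

public class EmbeddedComment
{
    // No PostId needed: the containing document is the post.
    public string Content { get; set; }
}
```

A mapping layer between these shapes and the domain objects is what keeps the domain model unaware of which of the two layouts is in use.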
Absolutely, you should include it. What if you want to get a list of comments but don't want to load the Post object with each comment, yet still need the PostId?
Related
Posting this here despite having found the solution already so people can find it if anyone has the same problem.
Entity Framework seemed to update entities that were only linked to the entity I had actually changed via navigation properties, which had me quite confused.
A similar example for clarification: let's take an order for 3 kg of sugar. That order has an ID, an amount, a product ID giving it a virtual product (sugar), and a customer ID giving it a virtual customer. While saving the order, the customer and the product were getting updated in the DB too.
I was trying to find answers to questions like "why are navigation properties set as modified", "why are unchanged child entities updated too", and so on.
Thorough debugging showed that the actual reason was that I was using the entity classes elsewhere in the same controller action. For example, I might want to send out a notification about the order by email, so I call a Mail action with a DTO (data transfer object) carrying the relevant information for the mail, looking somewhat like this:
public class MailDTO {
    public Order order { get; set; }
    public Customer customer { get; set; }
    public Product product { get; set; }
}
Since the actual customer entity has lots of information that the mail doesn't need, I created a new Customer object with just the relevant information to put into that DTO. That's where the problem was: that new Customer entity was being tracked by the framework, so when I called SaveChanges (or SaveChangesAsync), the framework tried to write it to the DB too.
Hope this helps someone else to not waste time looking for an error in all the wrong places ;)
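One way to avoid this kind of accident, sketched under assumptions, is to give the mail its own flat DTO that contains no entity types at all, so nothing placed in it can end up tracked. The entity shapes below and the names OrderMailDto and FromOrder are hypothetical; the post does not show the real classes.

```csharp
// Assumed, simplified entity shapes for illustration only.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Email { get; set; }
    public string Address { get; set; }
}

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Order
{
    public int Id { get; set; }
    public decimal Amount { get; set; }
    public virtual Customer Customer { get; set; }
    public virtual Product Product { get; set; }
}

// The DTO holds only primitive values copied from the entities.
public class OrderMailDto
{
    public int OrderId { get; set; }
    public decimal Amount { get; set; }
    public string CustomerName { get; set; }
    public string CustomerEmail { get; set; }
    public string ProductName { get; set; }

    public static OrderMailDto FromOrder(Order order)
    {
        return new OrderMailDto
        {
            OrderId = order.Id,
            Amount = order.Amount,
            CustomerName = order.Customer.Name,
            CustomerEmail = order.Customer.Email,
            ProductName = order.Product.Name
        };
    }
}
```

Because no new Customer or Product instance is created, there is no extra entity that could be attached to the order's object graph and saved along with it.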
I have the following POCO class being used in EF 6.x.
My question: Why is the navigation property 'Posts' of the 'Blog' entity declared as virtual?
public class Blog
{
    public int BlogId { get; set; }
    public string Name { get; set; }
    public string Url { get; set; }
    public string Tags { get; set; }
    public virtual ICollection<Post> Posts { get; set; }
}
If you define your navigation property as virtual, Entity Framework will create a new class at runtime (a dynamic proxy) derived from your class and use it instead of your original class. This dynamically created class contains logic to load the navigation property when it is accessed for the first time. This is referred to as "lazy loading". It lets Entity Framework avoid loading an entire tree of dependent objects from the database when they are not needed.
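To see why the property has to be virtual, here is a hand-written analogue of such a proxy. This is not the code Entity Framework generates; BlogProxy and the loadPosts delegate are hypothetical stand-ins that only illustrate the idea of overriding the property to load on first access.

```csharp
using System;
using System.Collections.Generic;

public class Post
{
    public int PostId { get; set; }
    public string Title { get; set; }
}

public class Blog
{
    public int BlogId { get; set; }
    public virtual ICollection<Post> Posts { get; set; }
}

// Derived class that intercepts access to Posts.
// This only works because Posts is virtual and can be overridden.
public class BlogProxy : Blog
{
    private readonly Func<int, ICollection<Post>> _loadPosts;
    private bool _postsLoaded;

    public BlogProxy(Func<int, ICollection<Post>> loadPosts)
    {
        _loadPosts = loadPosts;
    }

    public override ICollection<Post> Posts
    {
        get
        {
            if (!_postsLoaded)
            {
                // First access: fetch the related data (a database query in real life).
                base.Posts = _loadPosts(BlogId);
                _postsLoaded = true;
            }
            return base.Posts;
        }
        set
        {
            base.Posts = value;
            _postsLoaded = true;
        }
    }
}
```

Code that receives a Blog cannot tell it is holding a BlogProxy, which is why a simple mention of blog.Posts can trigger a database round trip.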
In some circumstances, it is best to use "Eager Loading" instead, especially if you know that you will be interacting with related objects at some point.
Julie Lerman really is the authority on all things Entity Framework, and she explains this process very well in her MSDN article "Demystifying Entity Framework Strategies: Loading Related Data":
Eager loading with Include is useful for scenarios where you know in advance that you want the related data for all of the core data being queried. But remember the two potential downsides. If you have too many Includes or navigation paths, the Entity Framework may generate a poorly performing query. And you should be careful about returning more related data than necessary thanks to the ease of coding with Include.
Lazy loading very conveniently retrieves related data behind the scenes for you in response to code that simply makes mention of that related data. It, too, makes coding simpler, but you should be conscientious about how much interaction it’s causing with the database. You may cause 40 trips to the database when only one or two were necessary.
If you are developing a web application where every communication with the server uses a new context anyway, lazy loading will just create unnecessary overhead to maintain the dynamic class for related objects that will never be loaded. Many people disable lazy loading in these scenarios. Ultimately, it's still best to examine the SQL queries that EF generates and determine which option performs best for the scenario you are developing for.
I'm just now learning MVC4 and Entity Framework. Some examples I have seen have all the DbSets in one class; others have a separate context class with its own DbSet for each model. Is there an advantage to one way or the other? I kind of like having ONE "MyDbContext" model that references all the other models, but I'm not sure which is better. Any thoughts and real-life issues with either way?
public class UsersContext : DbContext
{
    public DbSet<UserProfile> UserProfiles { get; set; }
}
public class UsersPostsContext : DbContext
{
    public DbSet<UserPost> UserPosts { get; set; }
}
Versus:
public class MyContext : DbContext
{
    public DbSet<UserProfile> UserProfiles { get; set; }
    public DbSet<UserPost> UserPosts { get; set; }
}
The first example is definitely not the way to go.
It defeats the power of EF to handle complex object graphs. What if you want to retrieve users and their posts and profiles from the database? (Just a random example.) You'd need three contexts and a lot of cunning to put the right objects together. And that's only the reading part. CUD actions are even more complex, if only because of the logic you need to do inserts/deletes in the right order and set FK associations.
That does not necessarily mean that you should always have exactly one context class. It can be beneficial to have several of them for parts of the database that logically belong together and are relatively isolated from other parts (like authorization tables, CRM tables, product tables, reporting, ...). In this case you may decide to use bounded contexts.
I use the second notation because that context is more flexible to use. You don't have to wonder which object to pass to the service, for example. You also don't have to manage a number of files, so it is easier to understand the database schema.
I'm working on a MVC4/EF5 project and I really want to invest a lot of time in the design of my application.
I have made one policy/principle for myself. My controllers need to be dedicated, which means they can only interact with one specific repo.
E.g. (I'm just writing some pseudo code)
objects: User, Blog, Post, Comment
UserController -> UserRepo -> Handles the User object
BlogController -> BlogRepo -> Handles the Blog object
etc...
Now I'm looking into the following dilemma: what's the most performant way to create/add new objects to the database with EF5?
1st Approach:
Add a function to User to add a Blog or add a Post.
addBlog(Blog b){this.Blog.addBlog(b);}
addPost(int blogid, Post p){this.Blog(blogid).addPost(p);}
For me this means that every post will initiate a usercontroller, inject a userrepo and will have a read operation in the User table to fetch the object. Then it will perform the addPost function and saves the changes.
2nd Approach:
Give the Blog and Post object some foreign key ids
Blog property: UserId
Post property: UserId, BlogId
This means whenever a blog is created, a blogcontroller will be initiated, a blogrepo will be injected and there is no 'LOOKUP' needed. The controller will just add the new Blog object to the context. (the UserId property is set from the websecurity context)
This also means that whenever a post is created, a postcontroller will be initiated, a postrepo will be injected and there is no 'LOOKUP' needed. The controller will just add the new Post object with the injected UserId and BlogId. (which are parameters of the controller action)
To me the second approach seems to divide the load over different tables instead of one. But the downside of this approach is that you can't really do model testing, because there would be no addBlog function on the User object to test.
Is my question somewhat clear? I really want to build the most performant framework for my application. Also, what is the impact of a User object with the following properties?
virtual ICollection
virtual ICollection
They can both be retrieved by fetching the User object. (Then there is lazy loading on the virtual objects.)
Or is it better to let the dedicated BlogController index the blogs that belong to a certain user id?
Thanks a lot for all the advice and comments!
Kr
After investigating the statements generated by Entity Framework with SQL Profiler, I came to the conclusion that it doesn't matter how you create a new object.
Adding a new object directly to the repo creates the SAME insert statement as adding the object through a collection of another object.
public class Blog
{
    public int Id { get; set; }
    public virtual ICollection<Article> Articles { get; set; }
}
public class Article
{
    public int Id { get; set; }
}
I'm still getting accustomed to EF Code First, having spent years working with the Ruby ORM, ActiveRecord. ActiveRecord used to have all sorts of callbacks like before_validation and before_save, where it was possible to modify the object before it would be sent off to the data layer. I am wondering if there is an equivalent technique in EF Code First object modeling.
I know how to set object members at the time of instantiation, of course, (to set default values and so forth) but sometimes you need to intervene at different moments in the object lifecycle.
To use a slightly contrived example, say I have a join table linking Authors and Plays, represented with a corresponding Authoring object:
public class Authoring
{
    public int ID { get; set; }
    [Required]
    public int Position { get; set; }
    [Required]
    public virtual Play Play { get; set; }
    [Required]
    public virtual Author Author { get; set; }
}
where Position represents a zero-indexed ordering of the Authors associated to a given Play. (You might have a single "South Pacific" Play with two authors: a "Rodgers" author with a Position 0 and a "Hammerstein" author with a Position 1.)
Let's say I wanted to create a method that, before saving an Authoring record, checks whether there are any existing authors for the Play it is associated with. If not, it sets the Position to 0. If so, it finds the highest Position associated with that Play and sets the new Position to that value plus one.
Where would I implement such logic within an EF code first model layer? And, in other cases, what if I wanted to massage data in code before it is checked for validation errors?
Basically, I'm looking for an equivalent to the Rails lifecycle hooks mentioned above, or some way to fake it at least. :)
You can override DbContext.SaveChanges, do the fix-up there, and then call base.SaveChanges(). If you do that, you may want to call DetectChanges before doing the fix-up. By the way, the very same issue is discussed in the book Programming Entity Framework: DbContext (ISBN 978-1-449-31296-1) on pages 192-194. And yes, this is in the context of validation...
You can implement IValidatableObject. That gives you a single hook:
IEnumerable<ValidationResult> Validate(ValidationContext validationContext)
And you can apply your validation logic there.
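A minimal sketch of that hook, using a cut-down Authoring class. The rule that Position cannot be negative is an assumption based on the zero-indexed ordering described in the question, and the AuthoringValidation helper is hypothetical; it simply runs the standard DataAnnotations Validator so the hook can be exercised without a database.

```csharp
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

public class Authoring : IValidatableObject
{
    public int ID { get; set; }
    public int Position { get; set; }

    // Called by the DataAnnotations validation pipeline.
    public IEnumerable<ValidationResult> Validate(ValidationContext validationContext)
    {
        if (Position < 0)
        {
            yield return new ValidationResult(
                "Position is zero-indexed and cannot be negative.",
                new[] { nameof(Position) });
        }
    }
}

public static class AuthoringValidation
{
    // Runs attribute-based checks first, then IValidatableObject.Validate.
    public static IList<ValidationResult> Check(Authoring authoring)
    {
        var results = new List<ValidationResult>();
        Validator.TryValidateObject(
            authoring,
            new ValidationContext(authoring),
            results,
            validateAllProperties: true);
        return results;
    }
}
```

This hook is a natural place for checks, but less so for the "massage the data before saving" part of the question, since a validation method is expected to report problems rather than change the object.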
There's also a SavingChanges event on ObjectContext (which you can obtain from the DbContext).
You could just create a custom IDoStuffOnSave interface and apply it to the entities that need to execute some logic on save (there's nothing out of the box).
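Here is a rough sketch of that idea applied to the Position example from the question. To keep it runnable without a database, InMemoryContext is a hypothetical stand-in for a real context, and the BeforeSave signature, PlayTitle and AuthorName are assumptions. With EF, the same loop would live in an overridden SaveChanges that walks the added entries and then calls base.SaveChanges().

```csharp
using System.Collections.Generic;
using System.Linq;

public interface IDoStuffOnSave
{
    // Receives what has already been saved so the entity can adjust itself.
    void BeforeSave(IEnumerable<object> alreadySaved);
}

public class Authoring : IDoStuffOnSave
{
    public string PlayTitle { get; set; }
    public string AuthorName { get; set; }
    public int Position { get; set; }

    public void BeforeSave(IEnumerable<object> alreadySaved)
    {
        // Positions already taken for the same play.
        var positions = alreadySaved.OfType<Authoring>()
            .Where(a => a.PlayTitle == PlayTitle)
            .Select(a => a.Position)
            .ToList();

        // First author gets 0, later authors get highest + 1.
        Position = positions.Count == 0 ? 0 : positions.Max() + 1;
    }
}

// Stand-in for a context: runs the hook on each pending entity, then "saves" it.
public class InMemoryContext
{
    private readonly List<object> _saved = new List<object>();
    private readonly List<object> _pending = new List<object>();

    public void Add(object entity)
    {
        _pending.Add(entity);
    }

    public int SaveChanges()
    {
        foreach (var entity in _pending)
        {
            var hook = entity as IDoStuffOnSave;
            if (hook != null)
            {
                hook.BeforeSave(_saved);
            }
            _saved.Add(entity);
        }

        int count = _pending.Count;
        _pending.Clear();
        return count;
    }
}
```

Entities that don't implement the interface pass through untouched, so the hook stays opt-in per entity.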