What is the overhead of Entity Framework tracking?


I've just been talking with a colleague about Entity Framework change tracking. We eventually figured out that my context interface should have
IDbSet<MyPoco> MyThings { get; }
rather than
IQueryable<MyPoco> MyThings { get; }
and that my POCO should also have all its properties as virtual.
Using the debugger we could then see the tracking objects and also that the results contained proxies to my actual POCOs.
If I don't have my POCO properties as virtual and have my context interface using IQueryable<> instead of IDbSet<> I don't get any of that.
In this instance I am only querying the database, but in the future will want to update the database via Entity Framework.
So, to make my life easier in the future when I come to look at this code as a reference, is there any performance penalty in having the tracking info/proxies there when I will never make use of them?

There is a performance penalty for tracking entities in EF. When you run a query, EF keeps a copy of the values loaded from the database. A single context instance also tracks only one instance of each entity, so EF has to check whether it already holds a copy of the entity before it creates a new instance (i.e. there are a lot of comparisons going on behind the scenes).
So avoid it if you don't need it. You can do so as follows.
IQueryable<MyPoco> MyThings { get { return db.MyThings.AsNoTracking(); } }
The MSDN page on Stages of Query Execution details the cost associated with each step of query execution.
Edit:
You should not expose IDbSet<MyPoco> MyThings, because that tells the consumer of your API that your entities can be added, updated and deleted when in fact you only intend the data to be queried.
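To picture the two costs described in this answer (a stored copy of the loaded values, plus a lookup per row to see whether the entity is already tracked), here is a toy model in plain C#. It is only an illustration and is not how Entity Framework is implemented; the names Poco, ToyTracker, Materialize and FindModified are made up for this sketch.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: this is NOT how Entity Framework is implemented.
public class Poco
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class ToyTracker
{
    // One tracked instance per key: this lookup happens for every row read.
    private readonly Dictionary<int, Poco> _instances = new Dictionary<int, Poco>();

    // Copy of the originally loaded values, kept for later comparison.
    private readonly Dictionary<int, string> _originalNames = new Dictionary<int, string>();

    // Called once per row returned by a "tracking" query.
    public Poco Materialize(int id, string name)
    {
        Poco existing;
        if (_instances.TryGetValue(id, out existing))
        {
            // Same key seen before: hand back the instance already tracked.
            return existing;
        }

        var entity = new Poco { Id = id, Name = name };
        _instances[id] = entity;
        _originalNames[id] = name; // extra memory: snapshot of the loaded value
        return entity;
    }

    // Finding changes means comparing every tracked entity with its snapshot.
    public List<Poco> FindModified()
    {
        var modified = new List<Poco>();
        foreach (var pair in _instances)
        {
            if (!string.Equals(pair.Value.Name, _originalNames[pair.Key]))
            {
                modified.Add(pair.Value);
            }
        }
        return modified;
    }
}
```

A no-tracking query skips both dictionaries: every row simply becomes a new object, which is why it is cheaper for read-only work.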

Navigation properties in the model classes are declared as virtual to enable lazy loading, which means a navigation property is only loaded when it is accessed. As far as the entity objects are concerned, their main aim is to load the specific table records from the database into the DbSet, which comes from the DbContext. You can't use IQueryable in this case, and it doesn't make sense with the DataContext either: IQueryable is an altogether different interface.

Related

Entity Framework Memory Management and Dispose?

I'm using EF (EF Core, actually, with ASP.NET Core on OSX, but I believe this is more of a general "newbie-style" EF question, so please read on...)
I built a little logging routine that uses EF to publish log entries to my database. Sort of like this, called from a repository class:
WebLog log = new WebLog(source, path, message);
Context.WebLogs.Add(log);
Context.SaveChanges();
Where WebLog is a simple model class, Context.WebLogs is a DbSet<WebLog> collection, and Context is obviously the DbContext. I believe this is quite straightforward.
But my question is this: if I continue to add new log entries to the Context.WebLogs collection and I never do anything like reboot my server, isn't the collection just going to grow without bounds? Is there some kind of "purge" or "flush" action I can take periodically to manage memory usage (without affecting the committed rows in the database, of course--I want those to persist). Or is DbSet some sort of a special collection that won't do this?

As mentioned by DevilSuichiro above, the recommended approach is to limit the lifetime of the instances of DbContext. E.g. in a Web application you typically use a DbContext instance per request, so an unbounded number of entities added doesn't become a problem.
The closest thing to a "flush" operation is SaveChanges(), but that method will not try to remove references to tracked entities, as DbContext is designed to be reused after SaveChanges().
In previous versions of EF we had a Detach() API that you could use to get rid of an individual tracked reference but we don't have that API in DbContext or anywhere in EF Core.
BTW, having an instance of DbContext that is shared between multiple requests is extremely problematic because DbContext is not thread safe.
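A minimal sketch of the short-lived context approach, assuming EF Core and a provider configured through DbContextOptions. The LogContext and WebLogWriter classes and the property-based shape of WebLog are hypothetical stand-ins for the classes in the question; in an ASP.NET Core application the same effect is usually obtained by registering the context with a per-request lifetime instead of creating it by hand.

```csharp
using System;
using Microsoft.EntityFrameworkCore;

// Hypothetical shape of the log entry from the question.
public class WebLog
{
    public int Id { get; set; }
    public string Source { get; set; }
    public string Path { get; set; }
    public string Message { get; set; }
}

// Hypothetical context; the provider is chosen by whoever builds the options.
public class LogContext : DbContext
{
    public LogContext(DbContextOptions<LogContext> options) : base(options) { }

    public DbSet<WebLog> WebLogs { get; set; }
}

public class WebLogWriter
{
    private readonly DbContextOptions<LogContext> _options;

    public WebLogWriter(DbContextOptions<LogContext> options)
    {
        _options = options;
    }

    public void Write(string source, string path, string message)
    {
        // A new context per write: the entities it tracks become
        // unreachable once the context is disposed, so nothing accumulates.
        using (var context = new LogContext(_options))
        {
            context.WebLogs.Add(new WebLog { Source = source, Path = path, Message = message });
            context.SaveChanges();
        }
    }
}
```

The rows stay in the database; only the in-memory tracking information goes away with each context.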

DbContext with dynamic DbSet

Is it possible to have a DbContext that has only one property of the generic type IDbSet and not a collection of concrete IDbSet properties, e.g. DbSet?
More specifically, I want to create only one generic DbSet where the actual type will be determined dynamically, e.g.
public new IDbSet<T> Set<T>() where T : class
{
return context.Set<T>();
}
I don't want to create multiple DbSets e.g.
DbSet<product> Products { get; set; }
...
I actually tried to use that generic DbSet, but there seems to be one problem: the DbContext doesn't create the corresponding tables in the database. So although I can work with the in-memory entity graph, when the time comes to store the entities in the DB an exception is thrown (Invalid object name 'dbo.Product'.)
Is there any way to force EF to create tables that correspond to dynamically created DbSets?

Yes, you can do this, using
modelBuilder.Configurations.Add
The DbSet entries will be derived.
If you plan to use POCOs and just build the model this way, that is fine: you save the manual DbSet<> declarations.
But if you plan on being more dynamic, without POCOs, there are a number of things to consider before you go down this route:
Have you selected the right ORM?
Do you plan on having POCOs?
Why is DbSet<product> Products { get; set; } so bad? You get a lot of action for that one line of code.
What data access approach do you plan to use without typed DbSets?
Do you plan to use LINQ to Entities statements?
Do you plan on creating expression trees for the dynamic data access, since the types aren't known at compile time?
Do you plan to use the DB model cache?
How will the cache be managed, especially in web/ASP.NET environments?
There are most likely other issues I didn't think of off the top of my head.
Constructing the model yourself is a big task. LINQ access is compromised when compile-time types/POCOs are not used, and the model cache and performance become critical management tasks.
Do not underestimate the practical side of this task.
Start here: DbContext.OnModelCreating
Typically, this method is called only once when the first instance of
a derived context is created. The model for that context is then
cached and is for all further instances of the context in the app
domain. This caching can be disabled by setting the ModelCaching
property on the given ModelBuilder, but this can seriously degrade
performance. More control over caching is provided through use of the
DbModelBuilder and DbContext classes directly.
The modelbuilder class
Good Luck
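As a rough sketch of registering entity types in OnModelCreating without declaring a DbSet property for each one, assuming EF6 (where DbModelBuilder.RegisterEntityType is available). DynamicContext, Product and the EntityTypes list are hypothetical names; the list could just as well be built by scanning an assembly. It is kept static because, as the quoted documentation says, the model is built once and cached for the context type.

```csharp
using System;
using System.Data.Entity;

// Hypothetical POCO; any class with a key that EF can discover would do.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class DynamicContext : DbContext
{
    // Hypothetical list of entity types. It must be the same for every
    // instance, because EF caches the model per context type.
    public static readonly Type[] EntityTypes = { typeof(Product) };

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        foreach (var entityType in EntityTypes)
        {
            // Makes the type part of the model without a DbSet property.
            modelBuilder.RegisterEntityType(entityType);
        }
    }
}
```

With the types registered, context.Set<Product>() returns a set that is backed by the model. Whether the tables are actually created still depends on the database initializer or migrations in use.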

EF 4.2 Code First and DDD Design Concerns

I have several concerns when trying to do DDD development with EF 4.2 (or EF 4.1) code first. I've done some extensive research but haven't come up with concrete answers for my specific concerns. Here are my concerns:
The domain cannot know about the persistence layer, or in other words the domain is completely separate from EF. However, to persist data to the database each entity must be attached to or added to the EF context. I know you are supposed to use factories to create instances of the aggregate roots so the factory could potentially register the created entity with the EF context. This appears to violate DDD rules since the factory is part of the domain and not part of the persistence layer. How should I go about creating and registering entities so that they correctly persist to the database when needed to?
Should an aggregate entity be the one to create its child entities? What I mean is, if I have an Organization and that Organization has a collection of Employee entities, should Organization have a method such as CreateEmployee or AddEmployee? If not, where does creating an Employee entity come in, keeping in mind that the Organization aggregate root 'owns' every Employee entity?
When working with EF code first, the IDs (in the form of identity columns in the database) of each entity are automatically handled and should generally never be changed by user code. Since DDD states that the domain is separate from persistence (persistence ignorance), it seems like exposing the IDs is an odd thing to do in the domain, because this implies that the domain should handle assigning unique IDs to newly created entities. Should I be concerned about exposing the ID properties of entities?
I realize these are kind of open ended design questions, but I am trying to do my best to stick to DDD design patterns while using EF as my persistence layer.
Thanks in advance!

On 1: I'm not all that familiar with EF but using the code-first/convention based mapping approach, I'd assume it's not too hard to map POCOs with getters and setters (even keeping that "DbContext with DbSet properties" class in another project shouldn't be that hard). I would not consider the POCOs to be the Aggregate Root. Rather they represent "the state inside an aggregate you want to persist". An example below:
// This is what gets persisted
public class TrainStationState {
    public Guid Id { get; set; }
    public string FullName { get; set; }
    public double Latitude { get; set; }
    public double Longitude { get; set; }
    // ... more state here
}
// This is what you work with
public class TrainStation : IExpose<TrainStationState> {
    TrainStationState _state;
    public TrainStation(TrainStationState state) {
        _state = state;
        // You can also copy into member variables
        // the state that's required to make this
        // object work (think memento pattern).
        // Alternatively you could have a parameter-less
        // constructor and an explicit method
        // to restore/install state.
    }
    TrainStationState IExpose<TrainStationState>.GetState() {
        return _state;
        // Again, nothing stopping you from
        // assembling this "state object"
        // manually.
    }
    public void IncludeInRoute(TrainRoute route) {
        route.AddStation(_state.Id, _state.Latitude, _state.Longitude);
    }
}
Now, with regard to aggregate life-cycle, there are two main scenarios:
Creating a new aggregate: You could use a factory, factory method, builder, constructor, ... whatever fits your needs. When you need to persist the aggregate, query for its state and persist it (typically this code doesn't reside inside your domain and is pretty generic).
Retrieving an existing aggregate: You could use a repository, a DAO, ... whatever fits your needs. It's important to understand that what you are retrieving from persistent storage is a state POCO, which you need to inject into a pristine aggregate (or use to populate its private members). This all happens behind the repository/DAO facade. Don't muddle your call-sites with this generic behavior.
On 2: Several things come to mind. Here's a list:
Aggregate Roots are consistency boundaries. What consistency requirements do you see between an Organization and an Employee?
Organization COULD act as a factory of Employee, without mutating the state of Organization.
"Ownership" is not what aggregates are about.
Aggregate Roots generally have methods that create entities within the aggregate. This makes sense because the roots are responsible for enforcing consistency within the aggregate.
On 3: Assign identifiers from the outside, get over it, move on. That does not imply exposing them, though (only in the state POCO).

The main problem with EF-DDD compatibility seems to be how to persist private properties. The solution proposed by Yves seems to be a workaround for the lack of EF power in some cases. For example, you can't really do DDD with the Fluent API, which requires the state properties to be public.
I've found that only mapping with .edmx files allows you to leave Domain Entities pure. It doesn't force you to make things public or add any EF-dependent attributes.
Entities should always be created by some aggregate root. See a great post of Udi Dahan: http://www.udidahan.com/2009/06/29/dont-create-aggregate-roots/
Always loading some aggregate and creating entities from there also solves a problem of attaching an entity to EF context. You don't need to attach anything manually in that case. It will get attached automatically because aggregate loaded from the repository is already attached and has a reference to a new entity. While repository interface belongs to the domain, repository implementation belongs to the infrastructure and is aware of EF, contexts, attaching etc.
I tend to treat autogenerated IDs as an implementation detail of the persistent store that has to be considered by the domain entity but shouldn't be exposed. So I have a private ID property that is mapped to the autogenerated column, and another, public ID which is meaningful for the Domain, like Identity Card ID or Passport Number for a Person class. If there is no such meaningful data, then I use the Guid type, which has the great feature of creating (almost) unique identifiers without a need for database calls.
So in this pattern I use the Guid/meaningful ID to load aggregates from a repository, while the autogenerated IDs are used internally by the database to make joins a bit faster (Guid is not good for that).
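A small sketch of the two-identifier idea in plain C#. The Person class, the DatabaseId and PersonKey names, and the choice to generate the Guid in the constructor are assumptions made for illustration; the mapping of the private property (for example through an .edmx file, as suggested above) is not shown.

```csharp
using System;

public class Person
{
    // Store-generated surrogate key. It is private, so the rest of the
    // domain never sees it; the mapping that fills it is not shown here.
    private int DatabaseId { get; set; }

    // Identifier used by the domain and by repositories to load the aggregate.
    public Guid PersonKey { get; private set; }

    // Example of a naturally meaningful identifier.
    public string PassportNumber { get; private set; }

    public Person(string passportNumber)
    {
        if (string.IsNullOrWhiteSpace(passportNumber))
        {
            throw new ArgumentException("A passport number is required.", nameof(passportNumber));
        }

        // Generated in memory, so no database round trip is needed.
        PersonKey = Guid.NewGuid();
        PassportNumber = passportNumber;
    }
}
```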

Entity Framework Service Layer Update POCO

I am using the Service Layer --> Repository --> Entity Framework (Code-First) w/POCO objects approach, and I am having a hard time with updating entities.
I am using AutoMapper to map my Domain Objects to my View Models and that works well for getting the data, but how do I get the changes back into the database?
Using pure POCO objects, I would assume that there is no sort of change tracking, so I see my only option is to handle it myself. Do you just make sure that your View Models have the EXACT same properties as your Domain Objects? What if I just change a field or two on the View Model? Won't the rest of the fields on the Domain Object get overwritten in the database with default values?
With that said, what is the best approach?
Thanks!
Edit
So what I am stumbling on is this. Let's take for example a simple Customer:
1) The Controller has a service, CustomerService, and calls the service's GetCustomerByID method.
2) The Service calls into the CustomerRepository and retrieves the Customer object.
3) Controller uses AutoMapper to map the Customer to the ViewModel.
4) Controller hands the model to the View. Everything is great!
Now in the view you do some modifications of the customer and post it back to the controller to persist the changes to the database.
I would assume at this point the object is detached. So should the model have the EXACT same properties as the Customer object? And do you have to make hidden fields for each item that you do not want to show, so they can persist back?
How do you handle saving the object back to the database? What happens if your view/model only deals with a couple of the fields on the object?

If you're using EF Code First, i.e. the DbContext API, then you still have change tracking, which is taken care of by your context class.
After making changes to your objects, all you have to do is call SaveChanges() on your context and that will persist the changes to your database.
EDIT:
Since you are creating a "copy" of the entity using AutoMapper, then it's no longer attached to your context.
I guess what you could do is something similar to what you would do in ASP.NET MVC (with UpdateModel). You can get the original entity from your context, take your ViewModel (which may contain changed properties) and update the old entity, either manually (just the modified properties) or using AutoMapper, and then persist the changes using context.SaveChanges().
Another solution would be to send the model entity as [part of] the ViewModel. This way, you'll have your entity attached to the container and change tracking will still work.
Hope this helps :)
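The "update the old entity manually" option can be sketched in plain C# as follows. Customer, CustomerEditModel and CustomerMapping.ApplyEdits are hypothetical names; the entity passed in is assumed to have just been loaded from the context, so that change tracking picks up the assignments and SaveChanges() only has to be called afterwards.

```csharp
using System;

// Hypothetical entity: CreditLimit is never shown on the edit screen.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Email { get; set; }
    public decimal CreditLimit { get; set; }
}

// Hypothetical view model: only the fields the view can edit.
public class CustomerEditModel
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Email { get; set; }
}

public static class CustomerMapping
{
    // Copies only the edited fields onto an entity that was loaded from
    // the context, leaving every other property as it was in the database.
    public static void ApplyEdits(Customer entity, CustomerEditModel model)
    {
        if (entity == null) throw new ArgumentNullException(nameof(entity));
        if (model == null) throw new ArgumentNullException(nameof(model));
        if (entity.Id != model.Id)
        {
            throw new InvalidOperationException("The view model belongs to a different customer.");
        }

        entity.Name = model.Name;
        entity.Email = model.Email;
        // CreditLimit is intentionally untouched.
    }
}
```

Because the entity starts from the stored values, fields that the view never dealt with keep their current values, so no hidden fields are needed for them.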

You are absolutely right that with a detached object you are responsible for informing the context about changes in your detached entity.
The basic approach is just to set the entity as modified. This works for scalar and complex properties, but it doesn't work for navigation properties (except FK relations) - for further reading about problems with navigation properties, check this answer (it is related to EFv4 and the ObjectContext API, but the same problems exist with the DbContext API). The disadvantage of this approach is that all fields in the DB will be modified. If you just want to modify a single field, you still have to fill the others correctly or your database record will be corrupted.
There is a way to explicitly define which fields have changed: you set the modified state per property instead of on the whole entity. It is a little harder to solve this generically, but I tried to show a way for EFv4 and for EFv4.1.
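A minimal sketch of the per-property approach with the DbContext API, assuming EF 4.1 or later. ShopContext, Customer and CustomerUpdater.UpdateEmail are hypothetical names, and the entity is assumed to have no validation attributes that a partially filled stub would violate when SaveChanges() validates it.

```csharp
using System.Data.Entity;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Email { get; set; }
}

// Hypothetical context for this sketch.
public class ShopContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }
}

public static class CustomerUpdater
{
    public static void UpdateEmail(ShopContext context, int customerId, string newEmail)
    {
        // Detached stub that carries only the key and the changed value.
        var stub = new Customer { Id = customerId, Email = newEmail };

        // Attach puts the stub in the Unchanged state.
        context.Customers.Attach(stub);

        // Flag a single property, so only that column should be updated.
        context.Entry(stub).Property(c => c.Email).IsModified = true;

        context.SaveChanges();
    }
}
```

Name is never flagged as modified here, so the empty value on the stub is not written over the stored one.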

I agree with @AbdouMoumen that it's much simpler to use the model entities at the view level. The service layer should provide an API to persist those entities in the data store (db). The service layer shouldn't dumbly duplicate the repository layer (i.e. Save(entity) for every entity) but rather provide a high-level save for an aggregate of entities. For instance, you could have a Save(order) in the service layer which results in updating more basic entities like inventory, customer, account.

Should I use partial classes as business layer when using entity framework?

I am working on a project using Entity Framework. Is it okay to use partial classes of the EF-generated classes as the business layer? I am beginning to think that this is how EF is intended to be used.
I have attempted to use a DTO pattern and soon realized that I am just creating a bunch of mapping classes that duplicate my effort and are also a cause for more maintenance work and an additional layer.
I want to use self-tracking-entities and pass the EF entities to all the layers. Please share your thoughts and ideas. Thanks

I had a look at using partial classes and found that exposing the database model up towards the UI layer would be restrictive.
For a few reasons:
The entity model created includes a deep relational object model which, depending on your schema, would get exposed to the UI layer (say the presenter of MVP or the ViewModel in MVVM).
The business logic layer typically exposes operations that you can code against. If you see a save method on the BLL, look at the parameters needed to do the save, and see a model that requires the construction of other entities (because of the relational nature of the entity model) just to do the save, it is not keeping the operation simple.
If you have a bunch of web services then the extra data will need to be sent across for no apparent gain.
You can create more immutable DTOs for your operation parameters rather than encountering side effects because the same instance was modified in some other part of the application.
If you do TDD and follow YAGNI, then you will tend to have a structure specifically designed for the operation you are writing, which is easier to construct tests against (no need to create other objects unrelated to the test just because they are on the model). In this case you might have...
public class Order
{ ...
public Guid CustomerID { get; set; }
... }
Instead of using the Entity model generated by the EF which have references exposed...
public class Order
{ ...
public Customer Customer { get; set; }
... }
This way the id of the customer is only needed for an operation that takes an order. Why would you need to construct a Customer (and potentially other objects as well) for an operation that is concerned with taking orders?
If you are worried about the duplication and mapping, then have a look at AutoMapper.

I would not do that, for the following reasons:
You lose the clear distinction between the data layer and the business layer
It makes the business layer more difficult to test
However, if you have some data-model-specific code, place that in a partial class to avoid it being lost when you regenerate the model.

I think a partial class will be a good idea. If the model is regenerated, then you will not lose the business logic in the partial classes.
As an alternative you can also look into EF4 Code only so that you don't need to generate your model from the database.

I would use partial classes. There is no such thing as data layer in DDD-ish code. There is a data tier and it resides on SQL Server. The application code should only contain business layer and some mappings which allow persisting business objects in the mentioned data tier.
Entity Framework is your data access code, so you shouldn't build your own. In most cases the database schema would be modified because the model has changed, not the opposite.
That being said, I would discourage you from sharing your entities across all the layers. I value separation of the UI and domain layers. I would use DTOs to transfer data in and out of the domain. If I had the necessary freedom, I would even use the CQRS pattern to get rid of mapping entities to DTOs -- I would simply create a second EF data access project meant only for reading data for the UI. It would be built on top of the same database. You read data through a read (anemic -- without business logic) model, but you modify it by issuing commands that are executed against the real model implemented using EF and partial methods.
Does this answer your question?

I wouldn't do that. Try to keep the layers as independent as possible, so that a tiny change in your database schema will not affect all your layers.
Entities can be used for data layer but they should not.
If at all, provide interfaces to be used and let your entities implement them (in the partial file); the BL should not know the entities, only the interfaces.
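The interface suggestion in this answer might look roughly like the sketch below. ICustomerInfo, DiscountPolicy and the discount rule are invented for illustration, and the first Customer declaration merely stands in for the code that EF would generate.

```csharp
using System;

// What the business layer is allowed to see.
public interface ICustomerInfo
{
    string Name { get; }
    decimal TotalSpent { get; }
}

// Stand-in for the generated half of the entity (normally produced by EF).
public partial class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal TotalSpent { get; set; }
}

// Hand-written half: survives regeneration and only adds the interface.
public partial class Customer : ICustomerInfo
{
}

// Business logic depends on the interface, not on the entity class.
public static class DiscountPolicy
{
    public static decimal RateFor(ICustomerInfo customer)
    {
        if (customer == null) throw new ArgumentNullException(nameof(customer));

        // Hypothetical rule: 10% discount from 1000 spent upwards.
        return customer.TotalSpent >= 1000m ? 0.10m : 0m;
    }
}
```

The business layer can then be tested with any small class that implements ICustomerInfo, without a database or generated entities.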