How does EntityFramework Core mange data internally? - entity-framework-core

I'm trying to understand how EntityFramework Core manages data internally because it influences how I call DbSets. Particularly, does it refer to in-memory data or re-query the database every time?
Example 1)
If I call _context.ToDo.Where(x => x.id == 123).First() and then in a different procedure call the same command again, will EF give me the in-memory value or re-query the DB?
Example 2)
If I call _context.ToDo.Where(x => x.id == 123).First() and then a few lines later call _context.ToDo.Find(123).Where(x => x.id == 123).Incude(x => x.Children).First(), will it use the in-memeory and then only query the DB for "Children" or does it recall the entire dataset?
I guess I'm wondering if it matters if I duplicate a call or not?
Is this affected by the AsNoTracking() switch?

What you really ask is how caching works in EF Core, not how DbContext manages data.
EF always offered 1st level caching - it kept the entities it loaded in memory, as long as the context remains alive. That's how it can track changes and save all of them when SaveChanges is called.
It doesn't cache the query itself, so it doesn't know that Where(....).First() is meant to return those specific entities. You'd have to use Find() instead. If tracking is disabled, no entities are kept around.
This is explained in Querying and Finding Entities, especially Finding entities using primary keys:
The Find method on DbSet uses the primary key value to attempt to find an entity tracked by the context. If the entity is not found in the context then a query will be sent to the database to find the entity there. Null is returned if the entity is not found in the context or in the database.
Find is different from using a query in two significant ways:
A round-trip to the database will only be made if the entity with the given key is not found in the context.
Find will return entities that are in the Added state. That is, Find will return entities that have been added to the context but have not yet been saved to the database.
In Example #2 the queries are different though. Include forces eager loading, so the results and entities returned are different. There's no need to call that a second time though, if the first entity and context are still around. You could just iterate over the Children property and EF would load the related entities one by one, using lazy loading.
EF will execute 1 query for each child item it loads. If you need to load all of them, this is slow. Slow enough to be have its own name, the N+1 selects problem. To avoid this you can load a related collection explicitly using explicit loading, eg. :
_context.Entry(todo).Collection(t=>t.Children).Load();
When you know you're going to use all children though, it's better to eagerly load all entities with Include().

Related

JPA First level cache and when its filled

working with Spring data JPA and reading it Hibernate first level cache is missed, the answer says "Hibernate does not cache queries and query results by default. The only thing the first level cache is used is when you call EntityManger.find() you will not see a SQL query executing. And the cache is used to avoid object creation if the entity is already loading."
So, if If get an entity not by its Id but other criteria, if I update some property I should not see an update sql inside a transactional methods because it has not been stored int the first level cache, right?
According to the above answer, if I get some list of entities, they will not be stored in first level cache not matter the criteria I use to find them, right?
When a Transactional(propagation= Propagation.NEVER) method loads the same entity by its id two times, is not supposed it will hit the database two times because each loading will run in its own "transaction" and will have its own persistent context? What is the expected behaviour in this case?
Thanks

How to properly use EFCore with SignalR Core (avoid caching entities)

I just found some really strange behaviour which turns out it is not so strange at all.
My select statement (query from database) worked only the first time. The second time, query from database was cached.
Inside Hub method I read something from database every 10 seconds and return result to all connected clients. But if some API change this data, Hub context does not read actual data.
In this thread I found this:
When you use EF it by default loads each entity only once per context. The first query creates entity instance and stores it internally. Any subsequent query which requires entity with the same key returns this stored instance. If values in the data store changed you still receive the entity with values from the initial query. This is called Identity map pattern. You can force the object context to reload the entity but it will reload a single shared instance.
So my question is how to properly use EFCore inside SignalR Core hub method?
I could use AsNoTracking, but I would like to use some global setting. Developer can easily forget to add AsNoTracking and this could mean serving outdated data to user.
I would like to write some code in my BaseHub class which will tell context do not track data. If I change entity properties, SaveChanges should update data. Can this be achieved? It is hard to think all the time to add AsNoTracking when querying from hub method.
I would like to write some code in my BaseHub class which will tell context do not track data.
The default query tracking behavior is controlled by the ChangeTracker.QueryTrackingBehavior property with default value of TrackAll (i.e. tracking).
You can change it to NoTracking and then use AsTracking() for queries that need tracking. It's a matter of which are more commonly needed.
If I change entity properties, SaveChanges should update data.
This is not possible if the entity is not tracked.
If you actually want tracking queries with "database wins" strategy, I'm afraid it's not possible currently in EF Core. I think EF6 object context services had an option for specifying the "client wins" vs "database wins" strategy, but EF Core currently does not provide such control and always implements "client wins" strategy.

Aggregate Root support in Entity Framework

How can we tell Entity Framework about Aggregates?
when saving an aggregate, save entities within the aggregate
when deleting an aggregate, delete entities within the aggregate
raise a concurrency error when two different users attempt to modify two different entities within the same aggreate
when loading an aggregate, provide a consistent point-in-time view of the aggregate even if there is some time delay before we access all entities within the aggregate
(Entity Framework 4.3.1 Code First)
EF provides features which allows you defining your aggregates and using them:
This is the most painful part. EF works with entity graphs. If you have an entity like Invoice and this entity has collection of related InvoiceLine entities you can approach it like aggregate. If you are in attached scenario everything works as expected but in detached scenario (either aggregate is not loaded by EF or it is loaded by different context instance) you must attach the aggregate to context instance and tell it exactly what did you changed = set state for every entity and independent association in object graph.
This is handled by cascade delete - if you have related entities loaded, EF will delete them but if you don't you must have cascade delete configured on the relation in the database.
This is handled by concurrency tokens in the database - most commonly either timestamp or rowversion columns.
You must either use eager loading and load all data together at the beginning (= consistent point of view) or you will use lazy loading and in such case you will not have consistent point of view because lazy loading will load current state of relations but it will not update other parts of aggregate you have already loaded (and I consider this as performance killer if you try to implement such refreshing with EF).
I wrote GraphDiff specifically for this purpose. It allows you to define an 'aggregate boundary' on update by providing a fluent mapping. I have used it in cases where I needed to pass detached entity graphs back and forth.
For example:
// Update method of repository
public void Update(Order order)
{
context.UpdateGraph(order, map => map
.OwnedCollection(p => p.OrderItems);
}
The above would tell the Entity Framework to update the order entity and also merge the collection of OrderItems. Mapping in this fashion allows us to ensure that the Entity Framework only manages the graph within the bounds that we define on the aggregate and ignores all other properties. It supports optimistic concurrency checking of all entities. It handles much more complicated scenarios and can also handle updating references in many to many scenarios (via AssociatedCollections).
Hope this can be of use.

copy records from between two databases using EF

I need to copy data from one database to another with EF. E.g. I have the following table relations: Forms->FormVersions->FormLayouts... We have different forms in both databases and we want to collect them to one DB. Basically I want to load Form object recursively from one DB and save it to another DB with all his references. Also I need to change IDs of the object and related objects if there are exists objects with the same ID in the second database.
Until now I have following code:
Form form = null;
using (var context = new FormEntities())
{
form = (from f in context.Forms
join fv in context.FormVersions on f.ID equals fv.FormID
where f.ID == 56
select f).First();
}
var context1 = new FormEntities("name=FormEntities1");
context1.AddObject("Forms", form);
context1.SaveChanges();
I'm receiving the error: "The EntityKey property can only be set when the current value of the property is null."
Can you help with implementation?
The simplest solution would be create copy of your Form (new object) and add that new object. Otherwise you can try:
Call context.Detach(form)
Set form's EntityKey to null
Call context1.AddObject(form)
I would first second E.J.'s answer. Assuming though that you are going to use Entity Framework, one of the main problem areas that you will face is relationship management. Your code should use the Include method to ensure that related objects are included in the results of a select operation. The join that you have will not have this effect.
http://msdn.microsoft.com/en-us/library/bb738708.aspx
Further, detaching an object will not automatically detach the related objects. You can detach them in the same way however the problem here is that as each object is detached, the relationships that it held to other objects within the context are broken.
Manually restoring the relationships may be an option for you however it may be worthwhile looking at EntityGraph. This framework allows you to define object graphs and then perform operations such as detach upon them. The entire graph is detached in a single operation with its relationships intact.
My experience with this framework has been in relation to RIA Services and Silverlight however I believe that these operations are also supported in .Net.
http://riaservicescontrib.codeplex.com/wikipage?title=EntityGraphs
Edit1: I just checked the EntityGraph docs and see that DetachEntityGraph is in the RIA specific layer which unfortunately rules it out as an option for you.
Edit2: Alex Jame's answer to the following question is a solution to your problem. Don't load the objects into the context to begin with - use the notracking option. That way you don't need to detach them which is what causes the problem.
Entity Framework - Detach and keep related object graph
If you are only doing a few records, Ladislav's suggestion will probably work, but if you are moving lots of data, you should/could consider doing this move in a stored procedure. The entire operation can be done at the server, with no need to move objects from the db server, to your front end and then back again. A single SP call would do it all.
The performance will be a lot better which may or may not not matter in your case.

Inheritance problems with Entity Framework (table per type)

For part of the project I'm currently working on, I have a set of four tables for syndicatable actions. One table is the abstract base for the other three, and each table is represented in my EF model like so:
EF Model -- Actions http://chris.charabaruk.com/system/files/images/EF+Model+Actions.png
There are two problems that I'm currently facing with this, however. The first problem is that Actor (a reference to a User) and Subject (a reference to an entity of the class associated with each type of action) are null in my subclasses, despite the associated database columns holding valid keys to rows in their associated tables. While I can get the keys via ActorReference and SubjectReference this of course requires setting up a new EF context and querying it for the referenced objects (as FooReference.Value is also null).
The second problem is that the reciprocal end of the relationship between the concrete action classes and their related entity classes always turn up nothing. For example, Task.RelatedActions, which should give me all TaskAction objects where Subject refers to the particular task object on which RelatedActions is called, is entirely devoid of objects. Again, valid rows exist in the database, Entity Framework just isn't putting them in objects and handing them to me.
Anyone know what it is I'm doing wrong, and what I should do to make it work?
Update: Seems that none of the relationship properties are working in my entity model any more, at all. WTF...
I think the issue you are experiencing here is that by default the EF does not automatically load related entities. If you load an entity, the collection or reference to related entities will be empty unless you do one of the following things:
1) Use eager loading in order to retrieve your main entity and your related entity in a single query. To do this, modify your query by adding a call to the Include method. In your sample above, you might use the following query:
from a in context.Actions.Include("Actor") select a
This would retrieve each of the actions with the related Actor method.
2) Use explicit lazy loading to retrieve the related entity when you need it:
action1.ActorReference.Load()
In the version of the EF which will ship with .Net 4.0, you will also have the following additional option:
3) Turn on implicit lazy loading so that related entities will automatically be retrieved when you reference the navigation property.
Danny