Entity Framework - "Attach()" is slow

I'm using EF5 and attaching a disconnected graph of POCO entities to my context, something like this:-
using (var context = new MyEntities())
{
context.Configuration.AutoDetectChangesEnabled = false;
context.MyEntities.Attach(myEntity);
// Code to walk the entity graph and set each entity's state
// using ObjectStateManager omitted for clarity ..
context.SaveChanges();
}
The entity "myEntity" is a large graph of entities, with many child collections, which in turn have their own child collections, and so on. The entire graph contains in the order of 10000 entities, but only a small number are usually changed.
The code to set the entity states and the actual SaveChanges() is fairly quick (<200ms). It's the Attach() that's the problem here, and takes 2.5 seconds, so I was wondering if this could be improved. I've seen articles that tell you to set AutoDetectChangesEnabled = false, which I'm doing above, but it makes no difference in my scenario. Why is this?

I am afraid that 2.5 seconds for attaching an object graph with 10000 entities is "normal". The time probably goes into the entity snapshot creation that takes place when you attach the graph.
If "only a small number are usually changed" - say 100 - you could consider loading the original entities from the database and changing their properties instead of attaching the whole graph, for example:
using (var context = new MyEntities())
{
// try with and without this line
// context.Configuration.AutoDetectChangesEnabled = false;
foreach (var child in myEntity.Children)
{
if (child.IsModified)
{
var childInDb = context.Children.Find(child.Id);
context.Entry(childInDb).CurrentValues.SetValues(child);
}
//... etc.
}
//... etc.
context.SaveChanges();
}
Although this will create a lot of single database queries, only "flat" entities without navigation properties will be loaded and attaching (that occurs when calling Find) won't consume much time. To reduce the number of queries you could also try to load entities of the same type as a "batch" using a Contains query:
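// materialize the ids once so the query below receives a plain in-memory list
var modifiedChildIds = myEntity.Children
    .Where(c => c.IsModified).Select(c => c.Id).ToList();
// one DB query (Load() is an extension method from System.Data.Entity)
context.Children.Where(c => modifiedChildIds.Contains(c.Id)).Load();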
foreach (var child in myEntity.Children)
{
if (child.IsModified)
{
// no DB query because the children are already loaded
var childInDb = context.Children.Find(child.Id);
context.Entry(childInDb).CurrentValues.SetValues(child);
}
}
It's just a simplified example under the assumption that you only have to change scalar properties of the entities. It can become arbitrarily more complex if modifications of relationships (children have been added to and/or deleted from the collections, etc.) are involved.
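A minimal sketch of one way to keep the Contains query manageable, assuming the list of modified ids can grow large enough that a single IN clause becomes unwieldy. The helper name InBatches and any batch size you pick are illustrative assumptions, not part of EF:

```csharp
using System;
using System.Collections.Generic;

public static class IdBatching
{
    // Splits a sequence into consecutive batches of at most batchSize items.
    // Hypothetical helper, not an EF API.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int batchSize)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));
        return Iterate(source, batchSize);
    }

    private static IEnumerable<List<T>> Iterate<T>(IEnumerable<T> source, int batchSize)
    {
        var batch = new List<T>(batchSize);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }
        // Emit the final, possibly shorter, batch.
        if (batch.Count > 0)
            yield return batch;
    }
}
```

Each batch would then feed one Contains query: loop over IdBatching.InBatches(modifiedChildIds, 500) and call context.Children.Where(c => batch.Contains(c.Id)).Load() per batch, so the number of round trips stays small while no single query carries thousands of ids.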

Related

Item creation child issue

I have an issue with a code-first app.
When I try to insert a test item into my database, I get an "already exists" error on a child entity.
Here is the code of my new-item operation:
public async Task<ActionResult<Event>> NewEvent(Event newEvent)
{
if (await _context.Events.CountAsync() > 0 && await _context.Events.FindAsync(newEvent.Id) is not null)
return BadRequest(new ConstraintException("Event Already Exist"));
if (newEvent.DoorPrize is not null && newEvent.DoorPrize.Count() > 0)
{
var doorPrizes = newEvent.DoorPrize.Where(d => _context.DoorPrizes.Contains(d)).ToList();
foreach (DoorPrize doorPrize in doorPrizes)
{
_context.Entry(doorPrize).State = EntityState.Detached;
}
foreach (DoorPrize doorPrize in newEvent.DoorPrize)
{
if (_context.FairlightUsers.Contains(doorPrize.Sponsor))
_context.Entry(doorPrize.Sponsor).State = EntityState.Detached;
}
}
if (newEvent.AttendeeDetails is not null && newEvent.AttendeeDetails.Count() > 0)
{
var attendeeDetails = _context.EventAttendeeDetails.Where(d => newEvent.AttendeeDetails.Contains(d)).ToList();
foreach (EventAttendeeDetail attendeeDetail in attendeeDetails)
{
_context.Entry(attendeeDetail).State = EntityState.Detached;
}
}
if (newEvent.VenueAddress is not null)
{
if (_context.Addresses.Contains(newEvent.VenueAddress))
_context.Entry(newEvent.VenueAddress).State = EntityState.Detached;
}
if (newEvent.Sponsor is not null)
{
if (_context.FairlightUsers.Contains(newEvent.Sponsor))
_context.Entry(newEvent.Sponsor).State = EntityState.Detached;
}
I don't know why, even when I set both Sponsors to Detached, the application still tries to add one.
Can someone see where my mistake is?
My goal is to avoid inserting any children, because the app needs to take them from the existing lists, but I have not managed to achieve this. The application always tries to create the children, even when I set them to Detached. Is there a way to avoid this?
Working with detached entities is a nuisance. Working with detached entity graphs is a complete pain. There are several issues with your approach. Namely, when working with detached entities, the database state should always be treated as the point of truth, and you apply changes from your detached state only after you validate that they are still relevant (i.e. that someone else hasn't modified the entity data since your current copy was taken).
First off,
await _context.Events.CountAsync() > 0
&& await _context.Events.FindAsync(newEvent.Id) is not null
is completely unnecessary. Why would you tell the DbContext to execute a count AND load the entity just to determine if the entity exists? If you want to know if an entity exists:
var doesExist = _context.Events.Any(x => x.Id == newEvent.Id);
if (doesExist)
return BadRequest(new ConstraintException("Event already exists"));
This executes an IF EXISTS SELECT against the database which is much faster.
Not every operation on the DbContext needs to be async. Asynchronous calls are useful for operations that will take some time to run. They come with a small extra performance cost, so any operation that can be done quickly, such as fetching individual entities or reasonably sized entity graphs, can just be done synchronously.
Next, when dealing with detached entities you generally do not want to detach existing tracked entities, and especially not references, so that they can be overwritten by the detached entities coming in. In short, you should never trust data coming into the domain to be current or safe from unexpected tampering, whether by bugs or by malicious consumers. For example, take a web application where the server sends a detached entity to be rendered, the client presents fields to be changed, and a form or Ajax POST (JavaScript) serializes the data into an entity class to send back to the server. It is very easy to miss values, resulting in nulls, and malicious users can use browser debugging tools to intercept the POST, view the entity data and make changes, which code like the above could unwittingly write over existing data. What gets passed in may look like an entity, but it is often not.
Instead, without changing the fact that a detached entity graph is being passed in, treat it like a DTO. The data in Event will serve as a new entity, but everything related to it you will need to decide whether those represent new entities or references to existing ones. So for instance if the relationship between an Event and a DoorPrize is one to many, where a DoorPrize entity would be created with the new event, and only ever associated with that entity, then it stands that it should be allowed to be inserted with that entity. If instead the DoorPrize is its own entity and merely associated with this Event (and others) then it needs to be re-associated with the data state.
The difference between the two shows in the schema. With 1-to-many (Event owns DoorPrizes), the DoorPrize table has an EventId column. With many-to-many (Event is associated with DoorPrizes), there is an EventDoorPrize linking table containing the EventId and DoorPrizeId.
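To make the two shapes concrete, here is a rough sketch as plain classes. All class names below (OwnedDoorPrize, OwningEvent, SharedDoorPrize, AssociatingEvent) are hypothetical and chosen only to keep the two variants side by side; the EventDoorPrize class mirrors the linking table described above:

```csharp
using System.Collections.Generic;

// Shape 1: 1-to-many. The Event owns its DoorPrizes,
// and each DoorPrize row carries the foreign key back to its Event.
public class OwnedDoorPrize
{
    public int Id { get; set; }
    public int EventId { get; set; }            // FK column on the DoorPrize table
    public OwningEvent Event { get; set; }
}

public class OwningEvent
{
    public int Id { get; set; }
    public List<OwnedDoorPrize> DoorPrizes { get; } = new List<OwnedDoorPrize>();
}

// Shape 2: many-to-many. A DoorPrize exists on its own and can be
// associated with several Events, so neither table holds the other's key.
public class SharedDoorPrize
{
    public int Id { get; set; }
    public List<AssociatingEvent> Events { get; } = new List<AssociatingEvent>();
}

public class AssociatingEvent
{
    public int Id { get; set; }
    public List<SharedDoorPrize> DoorPrizes { get; } = new List<SharedDoorPrize>();
}

// One row per (Event, DoorPrize) pair in the linking table.
public class EventDoorPrize
{
    public int EventId { get; set; }
    public int DoorPrizeId { get; set; }
}
```

In the first shape a new Event naturally brings new DoorPrize rows with it. In the second, adding an Event should only add linking rows for door prizes that already exist.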
In the first case, if the event is considered as New, the door prizes should all be new. However, the relationship between DoorPrize and Sponsor is most likely a many-to-many association where one sponsor will likely be associated with many different door prizes across different events.
With ownership, if a client consumer is generating new IDs for entities (not recommended, it's better to leverage things like Identity columns and let the database manage that) then you might need to check that new DoorPrize records are not in the DB. The point here wouldn't be to replace existing Door Prizes if found, but to throw a data exception since we expect to be adding these new children:
Example if DoorPrizes are "owned" by Events (1-to-many relationship) but DoorPrizeIds are set by the consumer, such as using a meaningful key or Guid.NewGuid():
// newEvent is a single Event, so project over its door prize collection
// (named DoorPrize in the question's model)
var doorPrizeIds = newEvent.DoorPrizes.Select(dp => dp.Id).ToList();
var doorPrizeExists = _context.DoorPrizes.Any(dp => doorPrizeIds.Contains(dp.Id));
if (doorPrizeExists)
    return BadRequest(new ConstraintException("One or more door prizes already exists"));
Dealing with associations requires a bit more attention. If DoorPrizes are expected to exist and are associated with a new Event then we need to locate those. If this request needs to handle that new DoorPrize entities might be created as part of this request, then that would need to be handled as well. As a general rule it is better to handle things more atomically where creating an Event that associates with door prizes would be responsible for just that. If there was an operation to create a new Door Prize then that would be handled by a separate call.
Example if DoorPrizes are "associated" to Events (many-to-many relationship)
var doorPrizeIds = newEvent.DoorPrizes
    .Select(dp => dp.Id)
    .ToList();
// tracked door prizes that already exist in the database
var existingDoorPrizes = await _context.DoorPrizes
    .Where(dp => doorPrizeIds.Contains(dp.Id))
    .ToListAsync();
var existingDoorPrizeIds = existingDoorPrizes.Select(dp => dp.Id).ToList();
// detached copies that came in with the request and match an existing row
var doorPrizesToExclude = newEvent.DoorPrizes
    .Where(dp => existingDoorPrizeIds.Contains(dp.Id))
    .ToList();
// swap the detached copies for the tracked instances
foreach (var doorPrize in doorPrizesToExclude)
    newEvent.DoorPrizes.Remove(doorPrize);
foreach (var doorPrize in existingDoorPrizes)
    newEvent.DoorPrizes.Add(doorPrize);
What this gives us is a list of matching real Door Prize entities to associate. We will want to associate these in place of the data that came in with the new event. Any door prizes that might be new would be added when the event is added. The final step here will apply to both scenarios which will be to associate the sponsors to any new DoorPrize. In the one-to-many scenario that would be every door prize, in the many-to-many that would just be the non-existing ones that might be added:
1-to-many example:
var sponsorIds = newEvent.DoorPrizes
    .Select(dp => dp.Sponsor.Id)
    .Distinct()
    .ToList();
var sponsors = await _context.Sponsors
    .Where(s => sponsorIds.Contains(s.Id))
    .ToListAsync();
foreach (var doorPrize in newEvent.DoorPrizes)
{
    // replace the detached sponsor with the tracked instance
    var sponsor = sponsors.SingleOrDefault(s => s.Id == doorPrize.Sponsor.Id);
    if (sponsor == null)
        return BadRequest(new ConstraintException("One or more door prizes was invalid."));
    doorPrize.Sponsor = sponsor;
}
Many-to-many example:
// only door prizes that do not exist yet still carry a detached sponsor
var newDoorPrizes = newEvent.DoorPrizes.Where(dp => !existingDoorPrizeIds.Contains(dp.Id)).ToList();
if (newDoorPrizes.Any())
{
    var sponsorIds = newDoorPrizes.Select(dp => dp.Sponsor.Id)
        .Distinct()
        .ToList();
    var sponsors = await _context.Sponsors
        .Where(s => sponsorIds.Contains(s.Id))
        .ToListAsync();
    foreach (var doorPrize in newDoorPrizes)
    {
        var sponsor = sponsors.SingleOrDefault(s => s.Id == doorPrize.Sponsor.Id);
        if (sponsor == null)
            return BadRequest(new ConstraintException("One or more door prizes was invalid."));
        doorPrize.Sponsor = sponsor;
    }
}
This is a similar operation to deal with associations for the Sponsor. The only difference is that if the DoorPrizes are associations, we only want to do the substitution for Sponsors on newly added door prizes. The door prizes we re-associated from context-tracked entities will already have valid sponsors.
Later, when you perform updates, it is a similar process, except that you fetch the existing entity and also pre-fetch the associated details with eager loading:
var existingEvent = _context.Events
    .Include(e => e.DoorPrizes)
    .Single(e => e.Id == eventId);
This will throw if the event isn't found, which you can catch, or you can call .SingleOrDefault and check for null to return your BadRequest if you prefer doing it inline. From there it is much the same process: you compare the details coming in with existingEvent to determine whether DoorPrizes need to be updated, added, or removed. Again, when adding DoorPrizes, use the same process to re-associate Sponsors with actual tracked instances.
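The comparison step can be sketched without any EF calls. This is a minimal illustration, assuming integer keys; the ChildDiff and GraphDiff names are hypothetical helpers introduced here, and the decision about what to do with each bucket stays with the caller:

```csharp
using System.Collections.Generic;
using System.Linq;

// Result buckets for comparing incoming child ids against tracked ones.
public sealed class ChildDiff
{
    public List<int> ToAdd { get; } = new List<int>();
    public List<int> ToUpdate { get; } = new List<int>();
    public List<int> ToRemove { get; } = new List<int>();
}

public static class GraphDiff
{
    // existingIds: ids of the children loaded from the database.
    // incomingIds: ids of the children on the detached graph passed in.
    public static ChildDiff Compare(IEnumerable<int> existingIds, IEnumerable<int> incomingIds)
    {
        var existing = new HashSet<int>(existingIds);
        var incoming = new HashSet<int>(incomingIds);
        var diff = new ChildDiff();

        // Present only in the request: candidates for insertion or association.
        diff.ToAdd.AddRange(incoming.Where(id => !existing.Contains(id)).OrderBy(id => id));
        // Present on both sides: copy values onto the tracked instance.
        diff.ToUpdate.AddRange(incoming.Where(id => existing.Contains(id)).OrderBy(id => id));
        // Present only in the database: remove from the tracked collection.
        diff.ToRemove.AddRange(existing.Where(id => !incoming.Contains(id)).OrderBy(id => id));
        return diff;
    }
}
```

The ids in ToUpdate and ToRemove are always resolved against the tracked entities, never against the objects that came in with the request, which keeps the database state as the point of truth.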
The important thing when updating entity graphs (parent-child relationships or associations) is to differentiate between whether the higher level entity "owns" the relationship, or if it is an association between entities that may already exist in the database. You will want to avoid code that detaches tracked entities and then does things like setting a passed-in entity's state to Modified to be saved. This will lead to all manner of problems where you overwrite data you don't intend to change, or get exceptions when EF/SQL is told to do something invalid.

Having a hard time with Entity Framework detached POCO objects

I want to use EF DbContext/POCO entities in a detached manner, i.e. retrieve a hierarchy of entities from my business tier, make some changes, then send the entire hierarchy back to the business tier to persist back to the database. Each BLL call uses a different instance of the DbContext. To test this I wrote some code to simulate such an environment.
First I retrieve a Customer plus related Orders and OrderLines:-
Customer customer;
using (var context = new TestContext())
{
customer = context.Customers.Include("Orders.OrderLines").SingleOrDefault(o => o.Id == 1);
}
Next I add a new Order with two OrderLines:-
var newOrder = new Order { OrderDate = DateTime.Now, OrderDescription = "Test" };
newOrder.OrderLines.Add(new OrderLine { ProductName = "foo", Order = newOrder, OrderId = newOrder.Id });
newOrder.OrderLines.Add(new OrderLine { ProductName = "bar", Order = newOrder, OrderId = newOrder.Id });
customer.Orders.Add(newOrder);
newOrder.Customer = customer;
newOrder.CustomerId = customer.Id;
Finally I persist the changes (using a new context):-
using (var context = new TestContext())
{
context.Customers.Attach(customer);
context.SaveChanges();
}
I realise this last part is incomplete, as no doubt I'll need to change the state of the new entities before calling SaveChanges(). Do I Add or Attach the customer? Which entities states will I have to change?
Before I can get to this stage, running the above code throws an Exception:
An object with the same key already exists in the ObjectStateManager.
It seems to stem from not explicitly setting the ID of the two OrderLine entities, so both default to 0. I thought it was fine to do this as EF would handle things automatically. Am I doing something wrong?
Also, working in this "detached" manner, there seems to be a lot of work required to set up the relationships - I have to add the new order entity to the customer.Orders collection, set the new order's Customer property, and its CustomerId property. Is this the correct approach or is there a simpler way?
Would I be better off looking at self-tracking entities? I'd read somewhere that they are being deprecated, or at least being discouraged in favour of POCOs.
You basically have 2 options:
A) Optimistic.
You can proceed pretty close to the way you're proceeding now, and just attach everything as Modified and hope. The code you're looking for instead of .Attach() is:
context.Entry(customer).State = EntityState.Modified;
Definitely not intuitive. This weird looking call attaches the detached (or newly constructed by you) object, as Modified. Source: http://blogs.msdn.com/b/adonet/archive/2011/01/29/using-dbcontext-in-ef-feature-ctp5-part-4-add-attach-and-entity-states.aspx
If you're unsure whether an object has been added or modified you can use the last segment's example:
context.Entry(customer).State = customer.Id == 0 ?
EntityState.Added :
EntityState.Modified;
You need to take these actions on all of the objects being added/modified, so if this object is complex and has other objects that need to be updated in the DB via FK relationships, you need to set their EntityState as well.
Depending on your scenario you can make these kinds of don't-care writes cheaper by using a different Context variation:
public class MyDb : DbContext
{
. . .
public static MyDb CheapWrites()
{
var db = new MyDb();
db.Configuration.AutoDetectChangesEnabled = false;
db.Configuration.ValidateOnSaveEnabled = false;
return db;
}
}
using(var db = MyDb.CheapWrites())
{
db.Entry(customer).State = customer.Id == 0 ?
EntityState.Added :
EntityState.Modified;
db.SaveChanges();
}
You're basically just disabling some extra calls EF makes on your behalf that you're ignoring the results of anyway.
B) Pessimistic. You can actually query the DB to verify the data hasn't changed/been added since you last picked it up, then update it if it's safe.
var existing = db.Customers.Find(customer.Id);
// Some logic here to decide whether updating is a good idea, like
// verifying selected values haven't changed, then
db.Entry(existing).CurrentValues.SetValues(customer);

DbContext.Entry performance issue

Following Julia Lerman's book 'DbContext' on an N-Tier solution for keeping track of changes, I provided each entity with a State property and an OriginalValues dictionary (through IObjectWithState). After the entity is constructed I copy the original values to this dictionary. See this sample (4-23) of the book:
public BreakAwayContext()
{
((IObjectContextAdapter)this).ObjectContext.ObjectMaterialized += (sender, args) =>
{
var entity = args.Entity as IObjectWithState;
if (entity != null)
{
entity.State = State.Unchanged;
entity.OriginalValues = BuildOriginalValues(this.Entry(entity).OriginalValues);
}
};
}
In the constructor of the BreakAwayContext (inherited from DbContext) the ObjectMaterialized event is caught. To retrieve the original values of the entity, the DbEntityEntry is retrieved from the context by the call to this.Entry(entity). This call is slowing the process down. 80% of the time of this event handler is spent on this call.
Is there a faster way to retrieve the original values or the entity's DbEntityEntry?
Context.Entry() calls DetectChanges(), whose cost depends on the number of objects in the context, so it can be very slow. In your case you could replace it with the faster version ((IObjectContextAdapter) ctx).ObjectContext.ObjectStateManager.GetObjectStateEntry(obj);

EF4.1 Code First - How to Assign/Remove from Many-To-Many?

I have a Many to Many relationship between Products and ProductGroups.
Using EF4.1 Code First, I'm getting strange and inconsistent results when adding/removing Products to/from a ProductGroup.
My view works perfectly; it returns a GroupId and a list of productIds. The problem is in the controller where I loop through the list of productIds and assign/remove to a ProductGroup.
Here's an extract from my code to remove Products from a ProductGroup:
ProductGroup productGroup = _Repository.GetProductGroup(groupId);
using (var db = GetDbContext())
{
foreach (var pId in productIds)
{
Product p = _Repository.GetProduct(Convert.ToInt32(pId));
productGroup.Products.Remove(p);
db.Entry(productGroup).State = System.Data.EntityState.Modified;
p.ProductGroups.Remove(productGroup);
db.Entry(p).State = System.Data.EntityState.Modified;
db.SaveChanges();
}
}
Basically, I have to affect both the ProductGroup and individual Products to get any result... and then the results are mixed. For example, when inspecting the DB table (ProductGroupProducts) only some records will get removed, but I can't figure out the pattern of which are and which aren't.
I have similar issues when assigning Products to a ProductGroup. The code is almost identical with the obvious .Add() instead of .Remove().
What am I missing here? Anybody know of a better, and hopefully more consistent, way of doing this?
Many thanks in advance!
Radu
What is GetDbContext()? If this creates a new context then db is obviously another context than the context you are using in _Repository. (If it returns the same context as _Repository is using then the using block is weird because it disposes the context at the end and therefore destroys also the context in _Repository).
You must attach the productGroup and p to the context in which you are doing the modifications:
ProductGroup productGroup = _Repository.GetProductGroup(groupId);
using (var db = GetDbContext())
{
db.ProductGroups.Attach(productGroup);
foreach (var pId in productIds)
{
Product p = _Repository.GetProduct(Convert.ToInt32(pId));
db.Products.Attach(p);
productGroup.Products.Remove(p);
}
db.SaveChanges();
}
I've removed p.ProductGroups.Remove(productGroup) because I think EF will do that automatically when you remove the product from the group (but I'm not sure; you can watch the collections in the debugger to see.)
If GetDbContext() indeed creates a new context rethink the design. You should only have one context for such operations (reading and updating).
Edit
This is possibly easier:
ProductGroup productGroup = _Repository.GetProductGroup(groupId);
using (var db = GetDbContext())
{
db.ProductGroups.Attach(productGroup);
foreach (var pId in productIds)
{
var p = productGroup.Products
    .SingleOrDefault(p1 => p1.ID == Convert.ToInt32(pId));
if (p != null)
productGroup.Products.Remove(p);
}
db.SaveChanges();
}
It saves you the database query for the product. I'm assuming that you are either using lazy loading or that _Repository.GetProductGroup has an Include for the products collection.
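The in-memory part of that loop can also be written so the ids are converted only once. This is a rough sketch under the assumption that productIds arrives as strings (the question converts each one with Convert.ToInt32); the Product class here is a stand-in for the real entity and RemoveByIds is a hypothetical helper:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the real Product entity; only the key matters here.
public class Product
{
    public int ID { get; set; }
}

public static class ProductRemoval
{
    // Removes every product whose ID appears in rawIds from the collection
    // and returns how many were removed. Ids with no match are ignored.
    public static int RemoveByIds(ICollection<Product> products, IEnumerable<string> rawIds)
    {
        // Convert once up front instead of once per comparison.
        var ids = new HashSet<int>(rawIds.Select(s => Convert.ToInt32(s)));

        // Materialize the matches first: removing while enumerating would throw.
        var toRemove = products.Where(p => ids.Contains(p.ID)).ToList();
        foreach (var p in toRemove)
            products.Remove(p);

        return toRemove.Count;
    }
}
```

With the group attached as in the code above, a call such as ProductRemoval.RemoveByIds(productGroup.Products, productIds) followed by db.SaveChanges() would replace the foreach loop, provided the Products collection is already loaded.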

Convince entity context (EF1) to populate entity references

I have an entity with self reference (generated by Entity Designer):
public class MyEntity : EntityObject
{
// only relevant stuff here
public int Id { get...; set...; }
public MyEntity Parent { get...; set...; }
public EntityCollection<MyEntity> Children { get...; set...; }
...
}
I've written a stored procedure that returns a subtree of nodes (not just immediate children) from the table and returns a list of MyEntity objects. I'm using a stored proc to avoid lazy loading of an arbitrary deep tree. This way I get relevant subtree nodes back from the DB in a single call.
List<MyEntity> nodes = context.GetSubtree(rootId).ToList();
All fine. But when I check nodes[0].Children, its Count equals 0. But if I debug and check the context.MyEntities.Results view, the Children enumerations get populated. Checking my result then reveals children under nodes[0].
How can I programmatically force my entity context to do in-memory magic and put correct references on Parent and Children properties?
UPDATE 1
I've tried calling
context.Refresh(ClientWins, nodes);
after my GetSubtree() call which does set relations properly, but fetches same nodes again from the DB. It's still just a workaround. But better than getting the whole set with context.MyEntities().ToList().
UPDATE 2
I've reliably solved this by using EF Extensions project. Check my answer below.
You need to assign one end of the relationship. First, divide the collection:
var root = nodes.Where(n => n.Id == rootId).First();
var children = nodes.Where(n => n.Id != rootId);
Now, fix up the relationship.
In your case, you'd do either:
foreach (var c in children)
{
c.Parent = root;
}
...or:
foreach (var c in children)
{
root.Children.Add(c);
}
It doesn't matter which.
Note that this marks the entities as modified. You'll need to change that if you intend to call SaveChanges on the context and don't want this saved.
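If the stored procedure returns more than one level of the tree, each node has to be linked to its own parent rather than to the root. Below is a minimal sketch with plain classes, assuming every returned row carries its parent's id. The Node class and its ParentId property are hypothetical (EF1 entities do not expose the foreign key as a property by default), and because these are not EntityObjects both ends are set by hand, whereas with tracked entities assigning one end is enough, as described above:

```csharp
using System.Collections.Generic;

// Plain stand-in for the self-referencing entity.
public class Node
{
    public int Id { get; set; }
    public int? ParentId { get; set; }   // hypothetical: parent key returned by the stored procedure
    public Node Parent { get; set; }
    public List<Node> Children { get; } = new List<Node>();
}

public static class TreeFixup
{
    // Links every node to its parent when that parent is part of the same result set.
    // Call it once per result set; calling it twice would add children twice.
    public static void Link(IEnumerable<Node> nodes)
    {
        var byId = new Dictionary<int, Node>();
        foreach (var n in nodes)
            byId[n.Id] = n;

        foreach (var n in byId.Values)
        {
            Node parent;
            // The subtree root's parent is outside the set, so the root stays unlinked.
            if (n.ParentId.HasValue && byId.TryGetValue(n.ParentId.Value, out parent))
            {
                n.Parent = parent;
                parent.Children.Add(n);
            }
        }
    }
}
```

The dictionary lookup keeps the fix-up linear in the number of nodes, which matters when the subtree is large.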
The REAL solution
Based on this article (read text under The problem), navigation properties are obviously not populated/updated when one uses stored procedures to return data.
But there's a nice manual solution to this. Use the EF Extensions project and write your own entity Materializer<EntityType> where you can correctly set navigation properties like this:
...
ParentReference = {
EntityKey = new EntityKey(
"EntityContextName.ParentEntitySetname",
new[] {
new EntityKeyMember(
"ParentEntityIdPropertyName",
reader.Field<int>("FKNameFromSP")
)
})
}
...
And that's it. Calling the stored procedure will return correct data, and entity object instances will be correctly related to each other. I advise you to check EF Extensions' samples, where you will find lots of nice things.