I've got a simple GET method in my Web API 2 project which queries my Microsoft SQL database via Entity Framework and always returns an error. If I step through it in the debugger, the exception is NOT hit; it actually looks like it's cleanly leaving the method. I'm very confused.
[Route("ar")]
public IHttpActionResult GetAuditArs(int auditId)
{
using (var context = new LabSOREntities()) {
try {
var ars = from r in context.SMBWA_Audit_AR
where r.SMBWA_Audit_Id == auditId
select r;
var ret = Ok(ars.ToArray());
return ret;
} catch (Exception ex) {
return BadRequest($"Something went wrong: {ex.Message}");
}
}
}
There's one row in the database, and I can see that ars.ToArray() contains a single element. How can I debug this, since it has already left my method when it blows up?
If I just hit that endpoint via the browser I get:
<Error>
<Message>An error has occurred.</Message>
</Error>
The issue will be that you are returning entities from your API call. Behind the scenes, Web API has to serialize the data being returned; as it does this, it will hit any lazy-load reference properties and attempt to load them. Since you are instantiating a scoped DB Context within a using block, the entities are orphaned from the context before serialization, so EF will throw exceptions that the DbContext is no longer available.
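For illustration, here is a minimal sketch of the kind of entity shape that trips the serializer (the AuditReference type and Reference1 name are hypothetical):
public class SMBWA_Audit_AR
{
    public int Id { get; set; }
    public int SMBWA_Audit_Id { get; set; }

    // A virtual navigation property causes EF to generate a lazy-loading
    // proxy. When the serializer touches it after the using block has
    // disposed the context, EF throws (typically an ObjectDisposedException),
    // which surfaces as the generic "An error has occurred." response.
    public virtual AuditReference Reference1 { get; set; }
}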
Option to verify the behaviour
- Eager-load all references in your "SMBWA_Audit_AR" class. This should eliminate the error and confirm the lazy load serialization issue.
var ars = context.SMBWA_Audit_AR
.Include(x => x.Reference1)
    .Include(x => x.Reference2) // etc., where Reference1/2 are entities related to your audit record. If you have a lot of references, that is a lot of Includes...
    .Where(x => x.SMBWA_Audit_Id == auditId)
.ToArray();
To avoid issues like this, and the cost/time of eager-loading everything, I recommend using a POCO DTO/ViewModel to return the details about these audit records. You can then .Select() just the fields the POCO needs. This avoids the lazy-load serialization issue, and it optimizes the queries EF generates to return only the data that is needed, not the entire object graph.
For example, if you need the audit #, auditor name, and a notes field to display in a list of audit summaries:
public class AuditSummary
{
public int AuditId {get; set;}
public string AuditorName {get; set;}
public string Notes {get; set;}
// You can add whatever fields are needed from the Audit or related entities, including collections of other DTOs for related entities, or summary details like counts, etc.
}
var ars = context.SMBWA_Audit_AR
    .Where(x => x.SMBWA_Audit_Id == auditId)
.Select(x => new AuditSummary
{
AuditId = x.AuditId,
AuditorName = x.AuditedBy.Name, //Example getting reference details..
Notes = x.Notes
}).ToArray();
Return models that reflect what the consumer will need. This avoids issues with EF and ensures your queries are efficient.
Scope the DbContext to the request using an IoC container (Unity/Autofac, etc.). This can seem like a viable option, but it isn't recommended. While it will avoid the error, as the serializer iterates over your entity your DbContext will be querying each individual dependency one at a time by ID. You can see this behaviour by running a profiler against the database while the application is running to detect the lazy-load calls. It will work in the end, but it will be quite slow.
As a general rule, don't return entities from Web API or MVC controller methods to avoid errors and performance issues with serializers.
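Putting this together, the original action rewritten with projection might look like the following sketch (the Notes column is hypothetical):
[Route("ar")]
public IHttpActionResult GetAuditArs(int auditId)
{
    using (var context = new LabSOREntities())
    {
        // Project inside the query: only the selected columns are read,
        // nothing is tracked, and nothing lazy-loads during serialization
        // because AuditSummary has no navigation properties.
        var ars = context.SMBWA_Audit_AR
            .Where(x => x.SMBWA_Audit_Id == auditId)
            .Select(x => new AuditSummary
            {
                AuditId = x.SMBWA_Audit_Id,
                Notes = x.Notes // hypothetical column
            })
            .ToArray();
        return Ok(ars);
    }
}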
Related
I am doing some research into memory use on our web servers. We hit just shy of 1 GB of memory used when loading just our Sign In page. After a lot of debugging, and removing as much as possible from the solution, I can see an explosion of memory use on our very first query against the database.
We have around 180 entities in our DbContext. We are using SQL Server as our storage mechanism and something like a repository pattern for reading from the database (we call it a Reader). The baseline memory use just before querying the database for the first time is around 100 MB.
Some of the things I have tried: not using the Reader and querying the DbContext directly without any additional extension methods, which only delayed the initial explosion of memory. I also wondered whether extra memory was being used to map the DbContext on the initial query, so I generated a mapping file and had it read from that instead of mapping at runtime, and didn't notice any real drop in memory use. I also ensured that we dispose after each read and have only one DbContext per unit of work (in this case, loading the Sign In page).
Finally, after quite a bit of effort, I stripped the solution down to only the entities needed to load the Sign In page (3-4 entities) and then saw a HUGE drop in memory use. Not quite as much as I wanted, but only a couple hundred MB jump in memory use.
So the question is: what is the correlation between entity count and the memory use of the DbContext? Perhaps we are doing something wrong with our entities.
By default, an EF DbContext will maintain a reference to every entity that it loads. This means that a DbContext that is left "alive" for a significant amount of time will continually consume memory as new entities are requested. This is one reason the general advice for EF Contexts is to keep them as short-lived as possible.
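You can watch this happen via the change tracker, and opt out of it for read-only queries with AsNoTracking(); a rough sketch (the MyContext type is hypothetical, the Orders set matches the example below):
using (var context = new MyContext())
{
    // Each materialized entity is added to the change tracker and
    // held in memory for the lifetime of the context...
    var tracked = context.Orders.ToList();
    var trackedCount = context.ChangeTracker.Entries().Count();

    // ...unless the query opts out of tracking entirely.
    var untracked = context.Orders.AsNoTracking().ToList();
    var unchangedCount = context.ChangeTracker.Entries().Count(); // same as before
}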
The number of entity definitions and relationships a context is configured to work with has some bearing on EF performance, including memory use, but the bulk of the memory usage will come from the entities being loaded and tracked. Breaking up a DbContext into smaller contexts that each track a subset of entities can be useful in larger systems to improve initialization and query performance. It also accommodates defining multiple entity definitions per table, such as detailed entities for things like updates and stripped-down summary entities for searches and other basic operations.
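As a sketch of that idea (all names hypothetical, using the EF6 fluent API), two slimmer contexts can map different entity shapes over the same table:
// Full entity set for the ordering workflow.
public class OrderingContext : DbContext
{
    public DbSet<Order> Orders { get; set; }
}

// Stripped-down summary entity mapped to the same Orders table,
// used for searches and other read-heavy operations.
public class OrderSearchContext : DbContext
{
    public DbSet<OrderSearchRow> Orders { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        modelBuilder.Entity<OrderSearchRow>()
            .HasKey(x => x.OrderId)
            .ToTable("Orders");
    }
}

public class OrderSearchRow
{
    public int OrderId { get; set; }
    public string Number { get; set; }
}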
The best tool to avoid excessive memory use in an application is to leverage projection wherever possible. This can be a bugbear when working with Repository patterns where the default behaviour is to return entities and specifically collections of entities.
For example, if I have a need to display a list of orders and I need to display the products in each order along with the quantity, as well as the customer name given a structure similar to:
Order (OrderId, Number, CreatedDate)
-> OrderStatus (OrderStatusId, Name)
-> Customer (CustomerId, Name)
-> OrderLines (OrderLineId, Quantity)
-> Product (ProductId, Name)
I might have a Repository that returns an IEnumerable<Order> and has some way of being told what to eager-load, maybe even a complex Expression<Func<Order, bool>> to pass in search criteria for a "New" order status, that ends up executing the following against the DbContext:
var orders = context.Orders
.Include(x => x.OrderStatus)
.Include(x => x.Customer)
.Include(x => x.OrderLines)
.ThenInclude(x => x.Product)
.Where(x => x.OrderStatus.OrderStatusId == OrderStatuses.New)
.ToList();
return orders;
Then this data would be passed to a view where we display the Order Number, Customer Name, and list the Product Names along with the Quantity from the associated OrderLine. The view can be provided these entities and extract the necessary fields, however we have loaded and tracked all fields of the related entities into the DbContext and into memory.
The alternative is to project to a ViewModel, a POCO containing just the fields we need:
[Serializable]
public class OrderSummaryViewModel
{
public int OrderId { get; set; }
public string OrderNumber { get; set; }
public int CustomerId { get; set; }
public string CustomerName { get; set; }
public IEnumerable<OrderLineSummaryViewModel> OrderLines { get; set; } = new List<OrderLineSummaryViewModel>();
}
[Serializable]
public class OrderLineSummaryViewModel
{
public int OrderLineId { get; set; }
public int ProductId { get; set; }
public string ProductName { get; set; }
public int Quantity { get; set; }
}
Now when we query the data, instead of returning IEnumerable<Order> we return IQueryable<Order>. We can also do away with all the complexity around passing in what to eager-load, filter, sort, and paginate: consumers can refine the query as they see fit. The repository method becomes a simple:
IQueryable<Order> GetOrdersByStatus(params OrderStatuses[] orderStatuses)
{
var query = context.Orders
    .Where(x => orderStatuses.Contains(x.OrderStatus.OrderStatusId));
return query;
}
Then the caller projects this result into the ViewModel using Select or Automapper's ProjectTo:
var orders = Repository.GetOrdersByStatus(OrderStatuses.New)
.Select(x => new OrderSummaryViewModel
{
OrderId = x.OrderId,
OrderNumber = x.Number,
CustomerId = x.Customer.CustomerId,
CustomerName = x.Customer.Name,
OrderLines = x.OrderLines
.Select(ol => new OrderLineSummaryViewModel
{
OrderLineId = ol.OrderLineId,
ProductId = ol.Product.ProductId,
ProductName = ol.Product.Name,
Quantity = ol.Quantity
}).ToList()
}).ToList();
The key difference here is that we no longer have the DbContext loading or tracking all of the associated orders, order lines, customers, products, etc. Instead it generates a query that pulls back just the fields projected into the view model. The memory footprint only grows by the amount of data we actually need, rather than all of the entities and details we don't need.
Loading entities should be reserved for cases where we want to update data. For cases where we just want to read data, projection is very useful for reducing the memory footprint. One motivation I believe many teams have for loading and passing around entities with longer-lived DbContexts, or for the hassle of detaching and reattaching entities, is to leverage caching and avoid DB round-trips. IMO this is a poor reason for loading and passing around entities: it results in more memory usage, leaves systems vulnerable to tampering, easily leads to performance issues with lazy loading, and can result in stale data overwrites, since cached entities in the DbContext are not automatically refreshed from the database.
I have an application in which I observe the following behavior: the first requests after a long period of inactivity take a long time, and sometimes time out.
Is it possible to control how Entity Framework manages the disposal of its objects? Is it possible to mark some entities to never be disposed?
...in order to avoid/improve the warmup time?
The reasons that similar queries will have an improved response time are manifold.
Most Database Management Systems cache parts of the fetched data, so that similar queries in the near future will be faster. If you query Teachers with their Students, the Teachers table will be joined with the Students table. This join result is quite often cached for a while, so the next query for Teachers with their Students will reuse it and thus be faster.
The DbContext caches queried objects. If you Single or Find one Teacher, it is kept in local memory so that the context can detect which items have changed when you call SaveChanges. If you Find the same Teacher again, that query will be faster (a sketch of this follows below). I'm not sure if the same happens if you query 1000 Teachers.
When you create a DbContext object, the database initializer is run to check whether the model has changed.
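The second point is easy to observe; a sketch (assuming dbContext is an open context and teacherId an existing key):
// First Find: nothing with this key is tracked yet,
// so EF issues a SELECT against the database.
var first = dbContext.Teachers.Find(teacherId);

// Second Find with the same key: served from the context's local
// cache, so no SQL is sent and the very same instance is returned.
var second = dbContext.Teachers.Find(teacherId);
// ReferenceEquals(first, second) == true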
So it might seem wise not to Dispose() a created DbContext, yet you see that most people keep the DbContext alive for a fairly short time:
using (var dbContext = new MyDbContext(...))
{
var fetchedTeacher = dbContext.Teachers
    .Where(teacher => teacher.Id == ...)
.Select(teacher => new
{
Id = teacher.Id,
Name = teacher.Name,
Students = teacher.Students.ToList(),
})
.FirstOrDefault();
return fetchedTeacher;
}
// DbContext is Disposed()
At first glance it would seem better to keep the DbContext alive: if someone asks for the same Teacher, the DbContext wouldn't have to ask the database for it; it could return the local Teacher.
However, keeping a DbContext alive might mean that you get stale data: if someone else changes the Teacher between your first and second query, you would get the old Teacher data.
Hence it is wise to keep the life time of a DbContext as short as possible.
Is there nothing I can do to improve the speed of the first query?
Yes you can!
One of the first things you could do is set the initializer of your database so that it doesn't check the existence and model of the database. Of course, you can only do this when you are fairly sure that your database exists and hasn't changed.
// constructor; disables initializer
public SchoolDBContext() : base(...)
{
//Disable initializer
Database.SetInitializer<SchoolDBContext>(null);
}
Another thing you could do: if you have already fetched the object you want to update, and you are sure that no one else has changed it, you can Attach it instead of fetching it again, as is shown in this question.
Normal usage:
// update the name of the teacher with teacherId
void ChangeTeacherName(int teacherId, string name)
{
using (var dbContext = new SchoolContext(...))
{
// fetch the teacher, change the name and save
Teacher fetchedTeacher = dbContext.Teachers.Find(teacherId);
fetchedTeacher.Name = name;
dbContext.SaveChanges();
}
}
Using Attach to update an earlier fetched Teacher:
void ChangeTeacherName (Teacher teacher, string name)
{
using (var dbContext = new SchoolContext(...))
{
dbContext.Teachers.Attach(teacher);
dbContext.Entry(teacher).Property(t => t.Name).IsModified = true;
dbContext.SaveChanges();
}
}
Using this method doesn't require fetching the Teacher again. During SaveChanges, the IsModified value of all properties of all attached items is checked; where needed, they will be updated.
I am not sure whether I am approaching this the wrong way or this is the default behaviour, but it is not working the way I expect...
Here are two sample classes ...
public class Person
{
public string FirstName { get; set; }
public string LastName { get; set; }
public Department Department { get; set; }
}
The second one is Department:
public class Department
{
public string Name { get; set; }
public List<Person> People { get; set; }
}
Context Configuration
public MyDbContext() : base("DefaultConnection")
{
this.Configuration.ProxyCreationEnabled = false;
this.Configuration.LazyLoadingEnabled = false;
}
public DbSet<Person> People { get; set; }
public DbSet<Department> Departments { get; set; }
I am trying to load people whose last name is 'Smith':
var foundPeople
= context
.People
.Where(p => p.LastName == "Smith");
The above query loads foundPeople with just FirstName and LastName, and no Department object. That is correct behaviour since lazy loading is off, and it is what I expected.
Now, in another query, I eagerly load Department:
var foundPeople
= context
.People
.Where(p => p.LastName == "Smith")
.Include(p => p.Department);
The above query loads foundPeople with FirstName, LastName, and Department, with Department->Name as well as Department->People (all people in that department), which I don't want; I just want to load the first level of the included property.
I don't know if this is intended behaviour or if I have made some mistake.
Is there any way to load just the first level of the included property rather than the complete graph, or all levels of the included property?
Using Include() to achieve eager loading only works if lazy loading is enabled on your objects--that is, your navigation properties must be declared as virtual, so that the EF proxies can override them with the lazy-loading behavior. Otherwise, they will eagerly load automatically and the Include() will have no effect.
Once you declare Person.Department and Department.People as virtual properties, your code should work as expected.
Very sorry, my original answer was wholly incorrect in the main. I didn't read your question closely enough and was incorrect in fact on the eager behavior. Not sure what I was thinking (or who upvoted?). Real answer below the fold:
Using the example model you posted (with necessary modifications: keys for the entities and removed "this" from context constructor) I was unable to exactly reproduce your issue. But I don't think it's doing what you think it's doing.
When you eagerly load the Department (or explicitly load, using context.Entry(...).Reference(...).Load()) inspect your results more closely: there are elements in the Department.People collections, but not all the Persons, only the Persons that were loaded in the query itself. I think you'll find, on your last snippet, that !foundPeople.SelectMany(p => p.Department.People).Any(p => p.LastName != "Smith") == true. That is, none of them are not "Smith".
I don't think there's any way around this. Entity Framework isn't explicitly or eagerly loading People collections (you could Include(p => p.Department.People) for that). It's just linking the ones that were loaded to their related object, because of the circular relationship in the model. Further, if there are multiple queries on the same context that load other Persons, they will also be linked into the object graph.
(An aside: in this simplified case, the proxy-creation and lazy-loading configurations are superfluous--neither are enabled on the entities by virtue of the fact that neither have lazy or proxy-able (virtual) properties--the one thing I did get right the first time around.)
By design, the DbContext does what is called "relationship fix-up". Since your model has information about the relations between your entities, whenever an entity is attached or modified in the context, EF will try to "fix up" the relations between entities.
For example, if you load an entity whose FK indicates that it's a child of another entity already attached to the context, it will be added to the children collection of the existing entity. If you make any changes (change the FK, delete the entity, etc.), the relationships will be automatically fixed up. That's what the other answer explains: even if you load the related entities separately, with a different query, they'll be attached to the children collection they belong to.
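A minimal sketch of fix-up in action, using the Person/Department model from the question:
using (var context = new MyDbContext())
{
    // Load a department on its own; nothing was included,
    // so its People collection starts out unpopulated.
    var department = context.Departments.First();

    // A separate query loads some people from that department.
    var smiths = context.People
        .Where(p => p.LastName == "Smith")
        .ToList();

    // Relationship fix-up has linked the tracked entities: the loaded
    // Smiths belonging to that department now appear in
    // department.People, even though it was never explicitly loaded.
}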
This functionality cannot be disabled. See other questions related to this:
AsNoTracking and Relationship Fix-Up
Is it possible to enable relationship fixup when change tracking is disabled but proxies are generated
How to get rid of the related entities
I don't know what you need to do, but with the current version of EF you have to detach the entity from the context and manually remove the related entities.
Another option is to map using AutoMapper or ValueInjecter, to get rid of the relationship fix-up.
You could try using a LINQ query so you can select only the fields that you need. I hope that helps.
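For example, a projection keeps the results as plain objects, so there are no tracked entities for the context to fix up; a sketch:
var people = context.People
    .Where(p => p.LastName == "Smith")
    .Select(p => new
    {
        p.FirstName,
        p.LastName,
        DepartmentName = p.Department.Name // one level, no back-reference
    })
    .ToList();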
This is my first post :) I'm new to MVC .NET, and I have some questions regarding Entity Framework functionality and performance. Questions inline...
class StudentContext : DbContext
{
public StudentContext() : base("myconnectionstring") {};
public DbSet<Student> Students {get; set; }
...
}
Question: Does DbSet read all the records from the database's Student table and store them in the Students collection (i.e. in memory)? Or does it simply hold a connection to this table, with record fetches done at the time the SQL is executed against the database?
For the following:
private StudentContext db = new StudentContext();
Student astudent = db.Students.Find(id);
or
var astudent = from s in db.Students
where s.StudentID == id
select s;
Question: Which of these is better for performance? I'm not sure how the Find method works under the hood for a collection.
Question: When are database connections closed? During the Dispose() method call?
If so, should I call the Dispose() method for a class that holds the database context instance? I've read here to use using blocks.
I'm guessing a Controller class gets instantiated, does its work (including database access), calls its associated View, and then goes out of scope and is unloaded from memory, or picked up by the garbage collector. But it's best to call Dispose() to do cleanup explicitly.
The Find method looks in the DbContext for an entity which has the specified key(s). If there is no matching entity already loaded, the DbContext will make a SELECT TOP 1 query to get the entity.
Running db.Students.Where(s => s.StudentID == id) will get you a sequence containing all the entities returned from a SQL query similar to SELECT * FROM Students WHERE StudentID = @id. That should be pretty fast; you can speed it up by using db.Students.FirstOrDefault(s => s.StudentID == id), which adds a TOP 1 to the SQL query.
Using Find is more efficient if you're loading the same entity more than once from the same DbContext. Other than that Find and FirstOrDefault are pretty much equivalent.
In neither case does the context load the entire table, nor does it hold open a connection. I believe the DbContext holds a connection until the DbContext is disposed, but it opens and closes the connection on demand when it needs to resolve a query.
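A short sketch of the difference, using the StudentContext from the question:
using (var db = new StudentContext())
{
    // First call: no Student with this key is tracked yet, so Find
    // sends a SELECT TOP 1 ... WHERE StudentID = @id to the database.
    var first = db.Students.Find(id);

    // Second call with the same key: Find checks the change tracker
    // first and returns the already-loaded instance without a query.
    var again = db.Students.Find(id);

    // FirstOrDefault always round-trips to the database, even though
    // the entity is already in memory.
    var viaQuery = db.Students.FirstOrDefault(s => s.StudentID == id);
}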
I'm using VS2010 RC with the POCO self-tracking T4 templates.
In my WCF update service method I am using something similar to the following:
using (var context = new MyContext())
{
context.MyObjects.ApplyChanges(myObject);
context.SaveChanges();
}
This works fine until I set ConcurrencyMode=Fixed on the entity, and then I get an exception. It appears the context does not know about the previous values, as the SQL statement is using the changed entity's values in the WHERE clause.
What is the correct approach when using ConcurrencyMode=Fixed?
The previous values need to be in your object.
Let's say you have a property ConcurrencyToken:
public class MyObject
{
public Guid Id { get; set; }
// stuff
public byte[] ConcurrencyToken { get; set; }
}
Now you can set ConcurrencyMode.Fixed on that property. You also need to configure your DB to automatically update it.
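For example, in a code-first mapping (EF 4.1 and later; an EDMX/STE model does the equivalent by setting the property's ConcurrencyMode to Fixed and marking it store-generated) the property can be flagged as a row version; a sketch:
public class MyObject
{
    public Guid Id { get; set; }
    // stuff

    // Maps to a SQL Server rowversion column: the database generates a
    // new value on every UPDATE, and EF adds the last-seen value to the
    // WHERE clause of its UPDATE statements.
    [Timestamp]
    public byte[] ConcurrencyToken { get; set; }
}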
When you query the DB, it will have some value:
var mo = Context.MyObjects.First();
Assert.IsNotNull(mo.ConcurrencyToken);
Now you can detach or serialize the object, but you need to include ConcurrencyToken. So if you're putting the object data on a web form, you'll need to serialize ConcurrencyToken to a string and put it in a hidden input.
When you ApplyChanges, you need to include the ConcurrencyToken:
Assert.IsNotNull(myObject.ConcurrencyToken);
using (var context = new MyContext())
{
context.MyObjects.ApplyChanges(myObject);
context.SaveChanges();
}
Having ConcurrencyMode.Fixed changes the UPDATE SQL. Normally it looks like:
UPDATE [dbo].[MyObject]
SET --stuff
WHERE [Id] = @0
With ConcurrencyMode.Fixed it looks like:
UPDATE [dbo].[MyObject]
SET --stuff
WHERE [Id] = @0 AND [ConcurrencyToken] = @1
...so if someone has updated the row between the time you read the original concurrency token and the time you saved, the UPDATE will affect 0 rows instead of 1. The EF throws a concurrency error in this case.
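When that mismatch happens, SaveChanges throws; with the ObjectContext-based STE template this surfaces as an OptimisticConcurrencyException, so a sketch of handling it might be:
try
{
    using (var context = new MyContext())
    {
        context.MyObjects.ApplyChanges(myObject);
        context.SaveChanges();
    }
}
catch (OptimisticConcurrencyException)
{
    // Someone else updated the row after this object's ConcurrencyToken
    // was read. Re-query the current values and let the caller resolve
    // the conflict, or retry with the refreshed token.
}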
Therefore, if any of this isn't working for you, the first step is to use SQL Profiler to look at the generated UPDATE.
Mark,
The objects created as "self-tracking entities" cannot be considered pure POCOs.
Here's the reason: STEs only work well if your client uses the proxies generated from the STE T4 template. Change tracking, and thus your service, will only work with these generated proxies.
In a pure POCO world (interoperability, not all clients being .NET 4.0, ...), you cannot put constraints on your client. For instance, Facebook will not be writing a service that can only handle .NET 4.0 clients.
STEs may be a good choice in some environments; it all depends on your requirements.