Entity Framework Memory Management and Dispose? - entity-framework

I'm using EF (EF Core, actually, with ASP.NET Core on OSX, but I believe this is more of a general "newbie-style" EF question, so please read on...)
I built a little logging routine that uses EF to publish log entries to my database. Sort of like this, called from a repository class:
WebLog log = new WebLog(source, path, message);
Context.WebLogs.Add(log);
Context.SaveChanges();
Where WebLog is a simple model class, Context.WebLogs is a DbSet<WebLog> collection, and Context is obviously the DbContext. I believe this is quite straightforward.
But my question is this: if I continue to add new log entries to the Context.WebLogs collection and I never do anything like reboot my server, isn't the collection just going to grow without bounds? Is there some kind of "purge" or "flush" action I can take periodically to manage memory usage (without affecting the committed rows in the database, of course--I want those to persist). Or is DbSet some sort of a special collection that won't do this?

As mentioned by DevilSuichiro above, the recommended approach is to limit the lifetime of the instances of DbContext. E.g. in a Web application you typically use a DbContext instance per request, so an unbounded number of entities added doesn't become a problem.
The closest thing to a "flush" operation is SaveChanges(), but that method will not try to remove references to tracked entities, because DbContext is designed to be reused after SaveChanges().
In previous versions of EF we had a Detach() API that you could use to get rid of an individual tracked reference but we don't have that API in DbContext or anywhere in EF Core.
BTW, having an instance of DbContext that is shared between multiple requests is extremely problematic because DbContext is not thread safe.
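To make the short-lifetime advice concrete, here is a minimal sketch of the logging routine from the question, rewritten so that each write uses its own context. It assumes EF Core (the Microsoft.EntityFrameworkCore package) and a configured provider. LogContext, WebLogWriter, and the WebLog property names are hypothetical names invented for illustration, not the asker's actual code. The second method shows an alternative for a context that has to live longer: there is no Detach() method, but as far as I know setting an entry's state to EntityState.Detached makes the context stop tracking that entity.

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical model and context, modeled on the question's WebLog example.
public class WebLog
{
    public int Id { get; set; }
    public string Source { get; set; } = "";
    public string Path { get; set; } = "";
    public string Message { get; set; } = "";
}

public class LogContext : DbContext
{
    public LogContext(DbContextOptions<LogContext> options) : base(options) { }

    public DbSet<WebLog> WebLogs => Set<WebLog>();
}

public class WebLogWriter
{
    private readonly DbContextOptions<LogContext> _options;

    public WebLogWriter(DbContextOptions<LogContext> options) => _options = options;

    // One context per write: the change tracker and everything it references
    // become eligible for garbage collection once the context is disposed.
    public void Write(string source, string path, string message)
    {
        using (var context = new LogContext(_options))
        {
            context.WebLogs.Add(new WebLog { Source = source, Path = path, Message = message });
            context.SaveChanges();
        }
    }

    // Alternative for a context you do not control the lifetime of:
    // save, then stop tracking the entity that was just written.
    public static void WriteAndDetach(LogContext context, WebLog log)
    {
        context.WebLogs.Add(log);
        context.SaveChanges();
        context.Entry(log).State = EntityState.Detached;
    }
}
```

In an ASP.NET Core application you would normally let the built-in container create a scoped context per request instead of constructing one by hand. Newer EF Core releases (5.0 and later) also offer ChangeTracker.Clear() to drop all tracked entities at once.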

Related

EF Core 2.2: use same DbContext instance get record from database multiple times, always the same result

I am facing an issue: we have an ASP.NET Core 2.2 Web API project with EF Core 2.2. We are using the default IoC container to create the DbContext with a scoped lifetime. We also have a socket pipeline connected to our ASP.NET Web API service.
I find that when we change the data in the web frontend, the socket pipeline always gets the old result (we are using .FirstOrDefault() to fetch the data, so I assumed the first-level cache would not be the problem).
So I inferred that the scoped lifetime of the DbContext might be the cause, and I changed it to a transient lifetime. And it works! We get the modified record.
I have two questions:
Is that behavior of DbContext by design? Or maybe I have some tricky issue in my code.
How much performance will the transient lifetime DbContext cost? Since maybe I will make every DbContext transient
1) Is that behavior of DbContext by design?
Yes
For each item in the result set:
- If this is a tracking query, EF checks if the data represents an entity already in the change tracker for the context instance.
  - If so, the existing entity is returned.
  - If not, a new entity is created, change tracking is set up, and the new entity is returned.
How Queries Work
2) How much performance will the transient lifetime DbContext cost?
Very little, especially if you use the DbContext pooling that EF Core offers for ASP.NET Core applications.
Since maybe I will make every DbContext transient
But you shouldn't do that. Using a request-scoped DbContext is very useful. For instance you can use the DbContext in various layers of your application without having to pass one around, and you can manage transactions more easily.
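If the goal is only to see current database values while keeping a scoped context, two options are worth sketching: run the query without tracking, or reload the tracked entity. The code below assumes EF Core. Device, AppDbContext, and FreshReads are hypothetical names, since the question does not show its model.

```csharp
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity and context for illustration only.
public class Device
{
    public int Id { get; set; }
    public string Status { get; set; } = "";
}

public class AppDbContext : DbContext
{
    public AppDbContext(DbContextOptions<AppDbContext> options) : base(options) { }

    public DbSet<Device> Devices => Set<Device>();
}

public static class FreshReads
{
    // Option 1: a no-tracking query bypasses the change tracker, so the
    // returned object is materialized from the current database row.
    public static Device ReadUntracked(AppDbContext context, int id)
    {
        return context.Devices.AsNoTracking().FirstOrDefault(d => d.Id == id);
    }

    // Option 2: keep the tracked instance, but overwrite its property
    // values with what is currently stored in the database.
    public static Device ReadAndReload(AppDbContext context, int id)
    {
        var device = context.Devices.FirstOrDefault(d => d.Id == id);
        if (device != null)
        {
            context.Entry(device).Reload();
        }
        return device;
    }
}
```

Option 1 is usually the simpler choice for read-only paths such as the socket pipeline described in the question. Option 2 discards any unsaved changes on the reloaded entity, so it only fits when that is acceptable.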

A static DbContext object for read-only purposes in ASP.NET MVC WebAPI

I'm refactoring my ASP.NET MVC 4 WebAPI project for performance optimization reasons.
Within my controller code, I'm searching for entities in a context (DbContext, EF6). There are a few thousands of such entities, new ones are added on an hourly basis (i.e. "slowly"), they are rarely deleted (and I don't care if deleted entities are still found on the context's cache!) and are never modified.
After reading the answers to this question, to this one and a few more discussions, I'm still not sure it's a bad idea to use a single static DbContext for the purpose described above - a DbContext which never updates the database.
Performance-wise, I'm not worried about the instantiation cost, but rather about the uselessness of caching requested entities if the DbContext is created for each request. I'm also using a 2nd level caching, which makes the persistence of the context even more acute.
My questions are:
1. Regardless of the specific implementation, is a "static" DbContext a valid solution in my case?
2. If so, what would be the most appropriate way of implementing such a DbContext?
3. Should I periodically "flush" the context to clear the cache in order to prevent it from growing too big?
DbContext caches entity instances when you get/query the data. It ensures different queries that return the same data map to the same entity (based on type and id). Otherwise, if you modify the same entity in different object instances, the context would not know which one has the correct data. Therefore a static DbContext would blow up over time until the process crashes.
DbContexts should be short lived. Request.Properties is a good place to save it in Web API (maps to HttpContext.Items in IIS).

How entity framework track the loaded entities? what are their life cycle?

I am relatively new to Entity Framework. All the documents and books I can find talk about how to use the framework, or which model should be used, but fall short of explaining how the framework works in depth.
For instance, when I load entities from the database via either a LINQ query or framework methods, are those entities thread safe? In other words, can they be shared with other threads? If so, how does EF control consistency?
When control goes out of context, are those entities gone or still in memory? After .SaveChanges are those entities gone? What is the life cycle?
Can an expert in EF explain the above points in details please.
Thanks in advance.
The life cycle of loaded entities is more-or-less tied to that of the Entity Context which loaded them. Hence in many examples you will see:
using (var ctx = new Context())
{
    // ... do work
} // The context gets disposed here.
Once the context is disposed (at the end of the using statement, e.g.), you should no longer treat entities that were loaded inside the context as if you can load additional information from them. For example, don't try accessing navigation properties on them. To avoid problems, I usually find it best to create a DTO that has only the exact data that I expect people to be able to use, and have that be the only value that leaves the using statement.
using (var ctx = new Context())
{
    var q = from p in ctx.People
            select new PersonSummary { Name = p.Name, Email = p.Email };
    return q.ToList(); // This will fully evaluate the query,
                       // leaving you with plain PersonSummary objects.
}
Entity Contexts are not thread-safe, so you shouldn't be trying to load navigation properties and such from multiple threads for objects tied to the same context, even within the context's lifecycle.
For instance, when I load the entities from the database via either LINQ query or framework methods, are those entities thread safe? In other words can they be shared with other threads? If so how EF controls the consistency?
The ObjectContext class is not thread safe. You must have one object context per thread, or create your own thread synchronization process. This way consistency is managed by the ObjectContext, since it tracks the state of all the objects.
When control goes out of context, are those entities gone or still in memory? After .SaveChanges are those entities gone? What is the life cycle?
The ObjectContext class implements the IDisposable interface, so you can, and should, wrap it in a using statement when working with Entity Framework. This way the context stops tracking the entities once the using block ends. If you do NOT dispose the context, the entities keep being tracked after SaveChanges; only their states change. Disposing ObjectContext instances also makes sure that the database connection is properly disposed and that you are not leaking database connections.
So, the big question is:
Where and when should EF live?
These ORM contexts should be treated as an implementation of the Unit of Work pattern; that is, the context object should live until the business task is done.
In my specific scenarios I use an IoC container like Windsor that does the heavy lifting for me. In an ASP.NET MVC app for example, Windsor can create a Context per Web Request. With this you don't have to write a lot of using statements throughout your code. You can read more about it here:
Windsor Tutorial - Part Seven - Lifestyles
Here's a link that explains it in more details directly from the guy that helps build the framework at Microsoft:
Entity Framework Object Context Life Cycle compared to Linq to Sql Data Context Life Cycle
You can write a test application to observe the behavior of the context tracker.
If you retrieve an entity from a context, then dispose of that context, then create a new instance of the context and attempt to save a change to the entity you retrieved earlier, it will complain that it is already tracking an entity with that ID.

What is the overhead of Entity Framework tracking?

I've just been talking with a colleague about Entity Framework change tracking. We eventually figured out that my context interface should have
IDbSet<MyPoco> MyThings { get; }
rather than
IQueryable<MyPoco> MyThings { get; }
and that my POCO should also have all its properties as virtual.
Using the debugger we could then see the tracking objects and also that the results contained proxies to my actual POCOs.
If I don't have my POCO properties as virtual and have my context interface using IQueryable<> instead of IDbSet<> I don't get any of that.
In this instance I am only querying the database, but in the future will want to update the database via Entity Framework.
So, to make my life easier in the future when I come to look at this code as a reference, is there any performance penalty in having the tracking info/proxies there when I will never make use of them?
There is a performance penalty for tracking entities in EF. When you query using Entity Framework, EF keeps a copy of the values loaded from the database. Also, a single context instance keeps track of only a single instance of each entity, so EF has to check whether it already has a copy of the entity before it creates a new instance (i.e. there are a lot of comparisons going on behind the scenes).
So avoid it if you don't need it. You can do so as follows.
IQueryable<MyPoco> MyThings { get { return db.MyThings.AsNoTracking(); } }
MSDN page on Stages of Query Execution details the cost associated with each step of query execution.
Edit:
You should not expose IDbSet<MyPoco> MyThings, because that tells the consumer of your API that your entities can be added, updated and deleted when in fact you only intend to query the data.
Navigation properties in the model classes are declared as virtual to enable lazy loading, which means the navigation property will only be loaded when it is required. As far as the entity objects are concerned, their main aim is to load the specific table records from the database into the DbSet, which comes from DbContext. You can't use IQueryable in this case. Also, it doesn't make any sense with the DataContext. IQueryable is an altogether different interface.

Ado Entity Framework when should you use attach/detach

In ADO.net EF, when should you call the context.Attach() and the context.Detach() methods and how do these calls affect the data being returned or being inserted?
This is one of those questions where, "If you have to ask, you probably should not be doing it." The Entity Framework will implicitly attach entities in cases where it is obvious that this needs to happen. You really only ever need to explicitly attach and detach entities in cases where you are using more than one ObjectContext at once. Because this can be quite confusing, due to the implicit attachment which happens in the course of normal Entity Framework operations, I strongly recommend that people new to the Entity Framework use only one ObjectContext at a time. If you do this, you should never need to explicitly call Attach or Detach.
Calling, say, Attach does not really affect the data returned, insofar as the entity's scalar properties are concerned. But if the entity refers to other entities which are already loaded into the context it is being attached to, then those navigation properties will be pre-populated without explicit loading. That said, entities returned from a query are already attached, so you cannot attach them again.
Attaching Objects (Entity Framework)
http://msdn.microsoft.com/en-us/library/bb896271.aspx
Detaching Objects (Entity Framework)
http://msdn.microsoft.com/en-us/library/bb738611.aspx
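As a rough illustration of the "more than one context" case mentioned above, here is a sketch of a disconnected update: an entity is loaded and detached by one context, modified elsewhere, then attached to a second context and saved. Note that it uses the later EF6 DbContext API (System.Data.Entity) rather than the ObjectContext methods the question asks about. Person, PeopleContext, and DisconnectedUpdate are hypothetical names, and the context is assumed to find its connection by convention.

```csharp
using System.Data.Entity; // EF6 (EntityFramework package), assumed here

// Hypothetical entity and context for illustration only.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class PeopleContext : DbContext
{
    public DbSet<Person> People { get; set; }
}

public static class DisconnectedUpdate
{
    // Load an entity and explicitly detach it before the context goes away.
    public static Person LoadDetached(int id)
    {
        using (var readContext = new PeopleContext())
        {
            var person = readContext.People.Find(id);
            if (person != null)
            {
                readContext.Entry(person).State = EntityState.Detached;
            }
            return person;
        }
    }

    // Attach the entity to a different context and mark it as modified,
    // so that SaveChanges issues an UPDATE for it.
    public static void Save(Person person)
    {
        using (var writeContext = new PeopleContext())
        {
            writeContext.People.Attach(person);
            writeContext.Entry(person).State = EntityState.Modified;
            writeContext.SaveChanges();
        }
    }
}
```

Marking the whole entity as Modified updates every column. If that is too coarse, loading the entity in the second context and copying values over is a common alternative.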