Is it a bad practice to open a new context or send one as a parameter into sub-methods?

A colleague asked for my opinion regarding the following setup. It's based on a declaration of a context in a method and then providing it into the called submethods. Schematically, it looks like so.
public void SuperMethod()
{
    using (Context context = new Context())
    {
        ...
        SetMethod(context, ...);
        ...
        GetMethod(context, ...);
        ...
    }
}

public void SetMethod(Context context, ...) { ... }
public Some GetMethod(Context context, ...) { ... }
I advised him against it, motivating my answer by the idea of opening/closing access to the database as near the actual operations as possible. But now that I think about it, I'm uncertain if that was the best advice in all circumstances.
Question 1: Was the suggestion of mine correct in the general case or should I consider altering it?
I also noticed that the super-method calling the sub-methods used the context itself. My suggestion was to move the part that talked to the database into a new sub-method, hence freeing the super-method from any references to the context. I felt that it made sense to make the super-method a controller, while performing all the database-related operations in the workers.
Question 2: Does it make sense to have a controlling method that calls a (possibly large) number of sub-methods carrying out the actual work?
Please note that both questions are related to the usage of the context while working with Entity Framework and not a general class structure.

IMO, the context should be opened and disposed with every unit of work (so for example, within a simple function) - which doesn't mean you shouldn't pass your context to underlying functions. This can be exactly what you want, especially considering connection pooling and context entry lifetime.
This has a few pretty simple reasons:
It's pretty cheap to open a new context. It takes almost no time relative to the main performance costs in EF, such as materializing values (DataSet to object and vice versa) and creating the queries, and those two have to be done with an already open context as well.
One main argument against opening and disposing a context every time is the opening and disposing of connections (some DBMSs, SQL CE in particular, have serious problems with creating connections to certain databases, and EF will create a new connection based on the provided connection string whenever it needs one). However, you can easily work around this by keeping a connection open (or letting it time out, which isn't too bad most of the time either) and passing it to your context on creation, using the DbContext(DbConnection, bool) overload with contextOwnsConnection set to false.
When you keep the context open over the whole lifetime, you can't possibly know which objects are already in the change tracker, materialized or present in another form, and which aren't. For me, this was a problem when rewriting the BL of my project. I tried to modify an object that I had added earlier. It was in the context (unchanged) but not in the change tracker, so I couldn't set its state. And I couldn't attach it again, because it was already in the context. This kind of behavior is pretty hard to control.
Another form of this is as follows. Whenever a new object enters the context, EF will try to set its navigation properties with respect to the other objects in the context. This is called relationship fixup and is one of the main reasons Include() works so well. This means that most of the time, you'll have a huge object tree in your context. Then, upon adding or deleting an object (or any other operation), EF will try to carry the operation out on the whole tree (well... sometimes ;) ), which can cause a lot of trouble, especially when trying to add a new entry with an FK to already existing items.
A database context is, as already mentioned, basically an object tree, which can be gigantic depending on its lifetime. And here, EF has to do a few things. It has to check whether an item is already there, for obvious reasons, with a best-case complexity of O(n*log(n)+m), where m is the number of object types and n the number of objects of this type in the context. It also has to check whether an object has been modified since retrieval. Since EF has to do this for every single object in every single call, you can imagine that this can slow things down considerably.
This is related to the last issue. What do you really want when calling SaveChanges()? Most likely, you want to be able to say: "OK, these are the actions I performed, so EF should now issue these particular calls to the db", right? Well... But since EF has been tracking the entities, and maybe you modified some values, or another thread did something here and there... How can you be sure these are the only things SaveChanges() will do? How can you be sure that over the whole lifetime of the context, nothing fishy has ended up in it (which would cancel the transaction, which can be pretty big)?
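The connection-reuse workaround from the list above could be sketched roughly as follows. This is a minimal sketch, assuming Entity Framework 6 (passing an already open connection to a context is, as far as I know, only supported from EF6 on) and SQL Server via SqlConnection; the Context class and the ConnectionReuse helper are hypothetical names used only for illustration.

```csharp
using System.Data.Common;
using System.Data.Entity;      // EntityFramework NuGet package (EF6 assumed)
using System.Data.SqlClient;

// Hypothetical context that borrows a connection instead of owning it.
public class Context : DbContext
{
    public Context(DbConnection existingConnection)
        : base(existingConnection, contextOwnsConnection: false)
    {
    }
}

public static class ConnectionReuse
{
    public static void Run(string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            using (var context = new Context(connection))
            {
                // first unit of work
            }

            // Disposing the context above did not dispose the connection,
            // because the context does not own it.
            using (var context = new Context(connection))
            {
                // second unit of work, same connection
            }
        }
    }
}
```

Each unit of work still gets a fresh, short-lived context (and therefore a fresh change tracker); only the connection outlives it.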
But yeah, of course, there are a few cases where you need to keep the context open (well, you don't need to - you could just pass it). For me, this was mostly in a few cases where FK correction was hard to maintain (but still within one function, while sometimes within one function I just had to dispose and re-create the context for the sake of simplicity) and whenever sub-functions are called from multiple places in the code. I had the issue that I had a context open within a calling function, which called another function, which still needed the context. Usually, that's not a problem, but my connection handling is kind of... advanced. That led to performance loss, which I dealt with by passing the already open context through an optional additional context parameter to the sub-function - just like what you already mentioned. However, it shouldn't really be necessary.
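The optional context parameter mentioned above could look something like this. It is only a sketch: Context here is a hypothetical stand-in that records disposal (not a real EF context), and Worker.SetMethod is an assumed name borrowed from the question.

```csharp
using System;

// Hypothetical stand-in for an EF context; it only records disposal so the
// ownership logic can be followed without a database.
public sealed class Context : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() { IsDisposed = true; }
}

public static class Worker
{
    // Returns true when the method had to create (and dispose) its own context.
    public static bool SetMethod(string value, Context context = null)
    {
        bool ownsContext = context == null;
        if (ownsContext)
        {
            context = new Context();
        }

        try
        {
            // ... the actual work against the context would go here ...
            return ownsContext;
        }
        finally
        {
            // Only dispose what this method created; a caller-supplied context
            // stays open until the caller's own using block ends.
            if (ownsContext)
            {
                context.Dispose();
            }
        }
    }
}
```

Called as Worker.SetMethod("x"), the method opens and disposes its own context; called as Worker.SetMethod("x", context) from inside a using block, it leaves the caller's context alone.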
For additional reference, here are some links that might be helpful in this regard. One's straight from MSDN and the other from a blog.

As @DevilSuichiro mentioned, the DbContext is meant as a Unit of Work container. By default, DbContext stores all loaded objects in memory and tracks their changes. When the SaveChanges method is called, all changes are sent to the DB in a single transaction.
So if your SuperMethod handles some kind of logical unit of work (e.g. an HTTP request in a web application), I would instantiate the context only once and pass it as a parameter to submethods.
Regarding your second question - if you instantiate the context only once, it's IMO better to have more methods that are simple, easier to maintain and have meaningful names. If you want to create a new instance of the context in every submethod, it depends on what "possibly large" number means :-)

Related

Correct way to persist an existing JPA entity in database

In one application I am working on, I found that the JPA objects that are modified are not previously loaded. I mean, instead of doing it like this:
Enterprise obj = getEntityManager().find(Enterprise.class, idEnterprise);
// Some modifying operations
they do it like this (and it works!):
Enterprise obj = new Enterprise(idEnterprise);
// Some modifying operations
getEntityManager().persist(obj);
This last solution doesn't load the object from the database, and it modifies the object correctly.
How is this possible?
Is it a good practice? At least you avoid the query that loads from the database, right?
Thanks

It depends. If you are writing code in a controller class (or any application code) you shouldn't be worried about JPA stuff, so the second approach is bad and redundant.
If, instead, you are working in infrastructure code, maybe you can manually persist your entities to enable some performance optimization or simply because you want the data to persist even if the transaction fails.
I strongly suspect the second bit of code is someone creating an entity from scratch, but mixing application, domain and infrastructure code in the same method: an extremely evil practice. (Very evil, darth father evil, never do that.)

What is the point of the Update function in the Repository EF pattern?

I am using the repository pattern within EF using an Update function I found online
public class Repository<T> : IRepository<T> where T : class
{
    public virtual void Update(T entity)
    {
        var entry = this.context.Entry(entity);
        this.dbset.Attach(entity);
        entry.State = System.Data.Entity.EntityState.Modified;
    }
}
I then use it within a DeviceService like so:
public void UpdateDevice(Device device)
{
    this.serviceCollection.Update(device);
    this.uow.Save();
}
I have realised that what this actually does is update ALL of the device's information rather than just the property that changed. This means that in a multi-threaded environment changes can be lost.
After testing I realised I could just change the Device and then call uow.Save(), which both saved the data and didn't overwrite any existing changes.
So my question really is - What is the point in the Update() function? It appears in almost every Repository pattern I find online yet it seems destructive.

I wouldn't call this generic Update method generally "destructive" but I agree that it has limited use cases that are rarely discussed in those repository implementations. If the method is useful or not depends on the scenario where you want to apply it.
In an "attached scenario" (a Windows Forms application, for instance), where you load entities from the database, change some properties while they are still attached to the EF context and then save the changes, the method is useless, because the context tracks all changes anyway and knows at the end which columns have to be updated. You don't need an Update method at all in this scenario (hint: DbSet<T>, which is a generic repository, does not have an Update method for this reason). And in a concurrency situation it is destructive, yes.
However, it is not clear that a "change tracked update" isn't sometimes destructive as well. If two users change the same property to different values, the change tracked update for both users would save the new column value, and the last one wins. Whether this is OK depends on the application and on how strictly it wants changes to be controlled. If the application never allows editing an object that is not the latest version in the database at the time the change is saved, it cannot allow the last save to win. It would have to stop, force the user to reload the latest version and have him look at the latest values before he enters his changes. To handle this situation, concurrency tokens are necessary that detect that someone else changed the record in the meantime. But those concurrency checks work the same way with change tracked updates and with setting the entity state to Modified. The destructive potential of both methods is stopped by concurrency exceptions. However, setting the state to Modified still produces unnecessary overhead in that it writes unchanged column values to the database.
In a "detached scenario" (a Web application, for example) the change tracked update is not available. If you don't want to set the whole entity to Modified, you have to load the latest version from the database (in a new context), copy the properties that came from the UI and save the changes again. However, this doesn't prevent changes that another user has made in the meantime from being overwritten, even if they are changes to different properties. Imagine two users load the same customer entity into a web form at the same time. User 1 edits the customer name and saves. User 2 edits the customer's bank account number and saves a few seconds later. If the entity gets loaded into the new context to perform the update for User 2, EF would just see that the customer name in the database (which already includes the change of User 1) is different from the customer name that User 2 sent back (which is still the old customer name). If you copy the customer name value, the property will be marked as Modified and the old name will be written to the database, overwriting the change of User 1. This update would be just as destructive as setting the whole entity state to Modified. In order to avoid this problem, you would either have to implement some custom change tracking on the client side that recognizes whether User 2 changed the customer name and, if not, doesn't copy the value to the loaded entity, or you would have to work with concurrency tokens again.
You didn't mention the biggest limitation of this Update method in your question - namely that it doesn't update any related entities. For example, if your Device entity had a related Parts collection and you edited this collection in a detached UI (add/remove/modify items), setting the state of the parent Device to Modified won't save any of those changes to the database. It will only affect the scalar (and complex) properties of the parent Device itself. At the time when I used repos of this kind I named the update method FlatUpdate to indicate that limitation better in the method name. I've never seen a generic "DeepUpdate". Dealing with complex object graphs is always a non-generic thing that has to be written individually per entity type and depending on the situation. (Fortunately, a library like GraphDiff can limit the amount of code that has to be written for such graph updates.)
To cut a long story short:
For attached scenarios the Update method is redundant, as EF's automatic change tracking does all the necessary work to write correct UPDATE statements to the database - including changes in related object graphs.
For detached scenarios it is a comfortable way to perform updates of simple entities without relationships.
Updating object graphs with parent and child entities in a detached scenario can't be done with such a simplified Update method and requires significantly more (non-generic) work.
Safe concurrency control needs more sophisticated tools, like enabling the optimistic concurrency checks that EF provides and handling the resulting concurrency exceptions in a user-friendly way.
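To make the concurrency-token idea concrete, here is a small in-memory model of the two-user customer scenario described above. It deliberately does not use EF: CustomerStore, ConcurrencyException and the Version field are hypothetical names standing in for the database table, the exception EF would raise on a conflict, and a concurrency-token column.

```csharp
using System;
using System.Collections.Generic;

public sealed class Customer
{
    public int Id;
    public string Name;
    public string BankAccount;
    public int Version;   // plays the role of the concurrency token
}

public sealed class ConcurrencyException : Exception
{
    public ConcurrencyException(string message) : base(message) { }
}

// Hypothetical in-memory store, standing in for the database table.
public sealed class CustomerStore
{
    private readonly Dictionary<int, Customer> rows = new Dictionary<int, Customer>();

    public void Insert(Customer customer) { rows[customer.Id] = Copy(customer); }

    // Hands out a detached copy, like an entity sent to a web form.
    public Customer Load(int id) { return Copy(rows[id]); }

    // Flat update of all scalar values, accepted only if the caller still
    // holds the version that is currently stored.
    public void Update(Customer edited)
    {
        Customer current = rows[edited.Id];
        if (current.Version != edited.Version)
        {
            throw new ConcurrencyException("Row was changed by someone else; reload first.");
        }

        Customer saved = Copy(edited);
        saved.Version = current.Version + 1;
        rows[edited.Id] = saved;
    }

    private static Customer Copy(Customer c)
    {
        return new Customer { Id = c.Id, Name = c.Name, BankAccount = c.BankAccount, Version = c.Version };
    }
}
```

If User 1 and User 2 both load customer 1, User 1 changes the name and saves, and User 2 then saves a bank account change from the stale copy, the second Update throws instead of silently writing the old name back.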

After Slauma's very profound and practical answer I'd like to zoom in on some basic principles.
In this MSDN article there is one important sentence:
"A repository separates the business logic from the interactions with the underlying data source or Web service."
Simple question. What has the business logic to do with Update?
Fowler defines a repository pattern as follows:
"Mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects."
So as far as the business logic is concerned a repository is just a collection. Collection semantics are about adding and removing objects, or checking whether an object exists. The main operations are Add, Remove, and Contains. Check out the ICollection<T> interface: no Update method there.
It's not the business logic's concern whether objects should be marked as 'modified'. It just modifies objects and relies on other layers to detect and persist changes. Exposing an Update method
makes the business layer responsible for tracking and reporting its changes. Soon all kinds of if constructs will creep in to check whether values have changed or not.
breaks persistence ignorance, because the mere fact that storing updates is something other than storing new objects is a data layer detail.
prevents the data access layer from doing its job properly. Indeed, the implementation you show is destructive. While the Data Access Layer may be perfectly capable of perceiving and persisting granular changes, this method marks a whole object as modified and forces a sweeping UPDATE.
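A collection-like repository in the sense described above might be sketched as follows. The interface shape and the InMemoryRepository implementation are assumptions for illustration only; a real implementation would sit on top of the EF context and rely on its change tracking.

```csharp
using System.Collections.Generic;

// Collection semantics only: add, remove, check membership. No Update.
public interface IRepository<T> where T : class
{
    void Add(T entity);
    bool Remove(T entity);
    bool Contains(T entity);
}

// Hypothetical in-memory implementation, just to show the interface in use.
public sealed class InMemoryRepository<T> : IRepository<T> where T : class
{
    private readonly HashSet<T> items = new HashSet<T>();

    public void Add(T entity) { items.Add(entity); }
    public bool Remove(T entity) { return items.Remove(entity); }
    public bool Contains(T entity) { return items.Contains(entity); }
}

public sealed class Device
{
    public string Name { get; set; }
}
```

Business code that changes device.Name after adding the device never has to report the change to the repository; detecting and persisting it is left to the layer behind the interface.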

What is the correct way to manage dependency injection for Entity Framework ObjectContext in ASP.NET MVC controllers?

In my MVC controllers, I'm using an IoC container (Ninject), but am not sure how best to use it when it comes to the Entity Framework ObjectContext.
Currently, I'm doing something like:
using (var context = new MyObjectContext())
{
    var stuff = m_repository.GetStuff(context);
}
This is the best way to manage from the point of view of keeping the database connection open for the shortest time possible.
If I were to create the ObjectContext via Ninject on a per request basis, this obviously keeps the database connection open for too long.
Also the above code would become...
var stuff = m_repository.GetStuff(m_myObjectContext);
(And when would I dispose of the context...?)
Should I be creating a factory for the ObjectContext and pass that in via DI? This would loosen the coupling, but does this really help with testability if there is no easy means of maintaining an interface for the ObjectContext (that I know of)?
Is there a better way? Thanks

"This is the best way to manage from the point of view of keeping the database connection open for the shortest time possible. If I were to create the ObjectContext via Ninject on a per request basis, this obviously keeps the database connection open for too long."
Entity Framework will close the connection directly after the execution of each query (except when supplying an open connection from the outside), so your argument for doing things like this does not hold.
In the past I used to have my business logic (my command handlers, to be precise) control the context (create, commit, and dispose it), but the downside is that you need to pass this context on to all other methods and all dependencies. When the application logic gets more complex, this results in less readable, less maintainable code.
For that reason I moved to a model where the unit of work (your MyObjectContext) is created, committed, and disposed outside the control of the business logic. This allows you to inject the unit of work into all dependencies and reuse the same unit of work for all objects. The downside is that this makes your DI configuration a bit harder. Some things you need to make sure of:
The unit of work must be created as per web request or within a certain scope.
The unit of work must be disposed at the end of the request or scope (although it is probably not a problem when the DbContext is not disposed, since the underlying connection is closed and DbContext does not implement a finalizer).
You need to explicitly commit the unit of work, but you can't do this at the end of the web request, since at that point you have no idea whether it is safe to commit (since you don't want to commit when your business logic threw an exception, but at the end of the request there is no way to correctly detect if this actually happened).
One tip I can give you is to model the business logic in the system around command handlers, since this allows you to define a single decorator that handles the transactional behavior (committing the unit of work and perhaps even running everything in a database transaction) at a single point. This decorator can be wrapped around each handler in the system.
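The decorator idea could be sketched like this. It is a container-agnostic sketch, not the author's actual code: ICommandHandler<TCommand>, IUnitOfWork, and the two small helper classes are assumed names, and IUnitOfWork.Commit would typically wrap the context's SaveChanges call.

```csharp
using System;

public interface ICommandHandler<TCommand>
{
    void Handle(TCommand command);
}

// Hypothetical abstraction over the context; Commit would call SaveChanges.
public interface IUnitOfWork
{
    void Commit();
}

// Wraps any handler and commits the unit of work once, after the handler succeeded.
public sealed class TransactionCommandHandlerDecorator<TCommand> : ICommandHandler<TCommand>
{
    private readonly ICommandHandler<TCommand> decorated;
    private readonly IUnitOfWork unitOfWork;

    public TransactionCommandHandlerDecorator(
        ICommandHandler<TCommand> decorated, IUnitOfWork unitOfWork)
    {
        this.decorated = decorated;
        this.unitOfWork = unitOfWork;
    }

    public void Handle(TCommand command)
    {
        decorated.Handle(command);

        // Reached only if the business logic did not throw.
        unitOfWork.Commit();
    }
}

// Helpers for trying the decorator out without a database or a DI container.
public sealed class CountingUnitOfWork : IUnitOfWork
{
    public int Commits { get; private set; }
    public void Commit() { Commits++; }
}

public sealed class DelegateHandler<TCommand> : ICommandHandler<TCommand>
{
    private readonly Action<TCommand> action;
    public DelegateHandler(Action<TCommand> action) { this.action = action; }
    public void Handle(TCommand command) { action(command); }
}
```

Because the commit sits after the call to the decorated handler, an exception in the business logic skips it, which is exactly the information that is no longer available at the end of the web request.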
I must admit that I have no idea how to register generic types and generic decorators with Ninject, but you'll probably get an answer quickly when asking here at Stack Overflow.

Callbacks on entity on created/updated

I would like to know when entities in a certain database table are either created or updated. The application is essentially a CMS, and I need to know when changes are made to the content so that I can reindex them for searches.
I know that the autogenerated LINQ to EF class has overridable methods for when certain fields change, but I need to know when the whole object is created/updated, not just a single field. I tried putting it in OnCreated, only to find that meant OnObjectInitialized and not OnObjectInsertedIntoDBTable xD
I did some searching and came across this link. The "Entity State" section looks like it's what I want, but I'm not sure how to use this information. Where do I override those methods?
Or perhaps there is a another/better way?
(I also need to know this for another part of the system, which will send notifications when certain content is changed. I would prefer this code to execute automatically when the insert/update occurs instead of placing it in a controller and hoping I always call that method.)

You need to get the ObjectStateEntry objects from the ObjectStateManager property of the ObjectContext.
var objectStateEntries = this.ObjectStateManager.GetObjectStateEntries(EntityState.Added | EntityState.Modified);
These entries cover the objects you've pulled down into the context that are in the requested states, and tell you what kind of action was performed on each of them.
If you are using EF4 you can override the SaveChanges method to include this functionality. I've used this technique to audit every change that occurs in the database instead of triggers.
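A rough sketch of such a SaveChanges override, assuming EF4's ObjectContext API (System.Data.Objects). CmsContext and the OnEntitySaved hook are hypothetical names; in a generated model you would more likely put the override in a partial class of the generated context rather than in a separate class with its own constructor.

```csharp
using System.Collections.Generic;
using System.Data;           // EntityState (EF4, System.Data.Entity.dll)
using System.Data.Objects;   // ObjectContext, SaveOptions
using System.Linq;

public class CmsContext : ObjectContext
{
    public CmsContext(string connectionString) : base(connectionString) { }

    public override int SaveChanges(SaveOptions options)
    {
        // Make sure changes to POCO entities are reflected in the state manager.
        DetectChanges();

        // Capture the entities before saving: after a successful save their
        // state is normally reset to Unchanged.
        List<object> changed = ObjectStateManager
            .GetObjectStateEntries(EntityState.Added | EntityState.Modified)
            .Where(entry => !entry.IsRelationship)
            .Select(entry => entry.Entity)
            .ToList();

        int affected = base.SaveChanges(options);

        // Runs only if the save did not throw.
        foreach (object entity in changed)
        {
            OnEntitySaved(entity);
        }

        return affected;
    }

    // Hypothetical hook: reindex the content, send notifications, write an audit row.
    protected virtual void OnEntitySaved(object entity) { }
}
```

Since the parameterless SaveChanges() delegates to this overload, every save that goes through the context passes this point, whichever controller triggered it.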

What's the point of setRetainsRegisteredObjects:?

Why would I want to set that to YES? What difference would that make? Must I worry about this?

Setting setRetainsRegisteredObjects: to YES makes your context maintain a strong reference to managed objects that it would otherwise maintain a weak relationship with. When you perform a fetch request, the objects returned have a weak reference (by default) to the respective managed object context. Only when an object is modified (added, changed, deleted) does the managed object context (MOC) maintain a strong relationship to the object.
Setting setRetainsRegisteredObjects: to YES ensures that strong pointers will be maintained between all fetched objects.
I don't know what @TechZen is talking about - this can be the cause of a sneaky bug if you're not careful. It's a useful method to invoke on the MOC when you find yourself in a situation that calls for it.

Worry? I don't know, are you interested in wasting time?
You only fiddle with this particular context attribute when you want to do custom memory management within Core Data (which you almost never do.) I had to go look this up just to remember what it was because I haven't used it in years.
The rule of thumb with Core Data is that if you have an attribute with a default value then you use the default value in the vast majority of cases. That's why it's the default.
Unless you see a context attribute changed in virtually every example, e.g. the store name, then it is not necessary to change it in 90% of the uses. It is certainly not necessary for a novice to try and change it.
Core Data is intended to be relatively simple once you understand it abstractly. Using bindings, it is possible to use Core Data on the Mac without writing any code at all. Everything just works with the default configuration.
