Intercepting database accesses with EF Core - entity-framework-core

I'm looking for a way to intercept database queries as a means of getting some more in-depth performance stats. I'm after number of queries, query duration, the result data (for an idea of data magnitude) and ideally some access to any LINQ expression.
I can fall back on extending a base context class, creating another method to get DbSet, and returning a wrapper there, but 1) that seems hackier than it ought to be, and 2) any code there won't be able to distinguish cached results from actual database accesses.
I've looked through the code and feel like wrapping ExecutionStrategyFactory or Database is the way to go - and while I can create an extension method on RelationalDbContextOptionsBuilder for the former, or /replace/ services, I can't see a way of wrapping either, such that the underlying provider's implementation is still used.
(See also: https://github.com/aspnet/EntityFramework/issues/6967)
Is there a decent place to hook into this?

For the ref of anyone else:
Ok, the trick is:
Replace the IRelationalConnection that EF Core's DI system uses via ReplaceService<IRelationalConnection, MyRc>(). Our new MyRc will wrap the existing connection to add hooks.
In our MyRc, have a constructor param of IDatabaseProviderServices. The EF DI system will populate it. Cast that to IRelationalDatabaseProviderServices and then grab the RelationalConnection property from that.
Wrap all methods, but have public DbConnection DbConnection { get; } return an instance of a DbConnection-wrapping class.
In that DbConnection wrapper, have CreateDbCommand() return an instance of a DbCommand-wrapping class.
In that DbCommand wrapper, have ExecuteDbDataReader() return an instance of a DbDataReader-wrapping class.
Then, in the DbCommand and DbDataReader wrappers, we can see what commands and parameters are being sent to the database, and the results coming back.
This only works with relational models.
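The DbCommand-wrapping step above might look roughly like the following. This is only a sketch under stated assumptions: the class name TimedDbCommand and the onExecuted callback are hypothetical, and the code uses nothing beyond the abstract members of System.Data.Common.DbCommand.

```csharp
#nullable disable
using System;
using System.Data;
using System.Data.Common;
using System.Diagnostics;

// Hypothetical wrapper: forwards every member to the provider's own command
// and reports the command text and elapsed time after each execution.
public sealed class TimedDbCommand : DbCommand
{
    private readonly DbCommand _inner;
    private readonly Action<string, TimeSpan> _onExecuted;

    public TimedDbCommand(DbCommand inner, Action<string, TimeSpan> onExecuted)
    {
        _inner = inner;
        _onExecuted = onExecuted;
    }

    public override string CommandText { get => _inner.CommandText; set => _inner.CommandText = value; }
    public override int CommandTimeout { get => _inner.CommandTimeout; set => _inner.CommandTimeout = value; }
    public override CommandType CommandType { get => _inner.CommandType; set => _inner.CommandType = value; }
    public override bool DesignTimeVisible { get => _inner.DesignTimeVisible; set => _inner.DesignTimeVisible = value; }
    public override UpdateRowSource UpdatedRowSource { get => _inner.UpdatedRowSource; set => _inner.UpdatedRowSource = value; }

    // Note: if the connection is wrapped too, the setters below would have to
    // unwrap it before handing it to the provider's command.
    protected override DbConnection DbConnection { get => _inner.Connection; set => _inner.Connection = value; }
    protected override DbTransaction DbTransaction { get => _inner.Transaction; set => _inner.Transaction = value; }
    protected override DbParameterCollection DbParameterCollection => _inner.Parameters;

    public override void Cancel() => _inner.Cancel();
    public override void Prepare() => _inner.Prepare();
    protected override DbParameter CreateDbParameter() => _inner.CreateParameter();

    public override int ExecuteNonQuery() => Timed(() => _inner.ExecuteNonQuery());
    public override object ExecuteScalar() => Timed(() => _inner.ExecuteScalar());

    // A full version would return a DbDataReader wrapper here to observe rows.
    protected override DbDataReader ExecuteDbDataReader(CommandBehavior behavior)
        => Timed(() => _inner.ExecuteReader(behavior));

    protected override void Dispose(bool disposing)
    {
        if (disposing) _inner.Dispose();
        base.Dispose(disposing);
    }

    // Runs the call and reports its duration even when it throws.
    private T Timed<T>(Func<T> execute)
    {
        var stopwatch = Stopwatch.StartNew();
        try { return execute(); }
        finally { _onExecuted(_inner.CommandText, stopwatch.Elapsed); }
    }
}
```

For readers, the measured time only covers getting the reader back, not iterating the rows, which is why the DbDataReader wrapper is still needed for result sizes. The sketch also leaves out the async overrides, so async calls would fall back to the base class behaviour.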

Related

Kotlin immutable entities changing unexpectedly when using it with JPA

In our project we are using Kotlin with JPA. All of our entities are immutable, so it is not possible to set fields of our entities directly. You have to create a new instance by using the copy method. If you want these changes to be reflected in the database, you must persist this newly created entity with an explicit function call.
In the beginning, this approach looked perfect to us. However, we are now having problems: some of our instances are changing unexpectedly in memory.
val instance1 = repository.findById(entityId)
repository.save(instance1.copy(deletedAt = Instant.now()))
..
..
assertNull(instance1.deletedAt)
In the code snippet above, instance1 is retrieved from the database, its deletedAt field is set with the copy method, and the new instance created by copy is passed to the save method of the repository. We don't set any field of instance1; we create a new instance to make these changes. However, the value on the assert line is unexpectedly not null.
It seems there is a conflict between the JPA persistence context (first-level cache) and Kotlin's immutability and copy method logic.
Has anyone faced this problem, or are there any suggestions or best practices for using JPA with immutable Kotlin entities?
I suspect the problem is that you're ignoring the return value from save().  Its docs say:
Saves a given entity. Use the returned instance for further operations as the save operation might have changed the entity instance completely.
But you're not doing that; you're instead continuing to use the original instance which (as that says) may have changed.
Instead, store the return value from save(), and use that thereafter.  (Either by making instance1 a var, or creating a new val and not referring to instance1 afterward.)
(This isn't a Kotlin-specific problem, and is exactly the same in Java. JPA, Spring, &c work their magic by futzing with the bytecode, so can do things your code can't, such as changing immutable values. Most of the time you can ignore it, but this case makes it obvious.)
Immutable types are not compatible with how JPA works.
JPA is built around the concept of a unit of work, which means objects retrieved from the database live in a persistence context (first-level cache) and get discarded once the EntityManager is closed (in a web application, at the end of the HTTP request).
When you use the copy method on an entity you just retrieved from the database, the copied object is considered detached from the current session, meaning that changes to it cannot be tracked by JPA, and the underlying implementation (Hibernate / EclipseLink) has a hard time figuring out which SQL statement needs to be fired (insert, update or delete?).
Things get far more complex when you have a complex object graph with OneToMany associations and cascading options.
So my recommendation, unfortunately, is to avoid immutable types when using JPA.

What is the overhead of Entity Framework tracking?

I've just been talking with a colleague about Entity Framework change tracking. We eventually figured out that my context interface should have
IDbSet<MyPoco> MyThings { get; }
rather than
IQueryable<MyPoco> MyThings { get; }
and that my POCO should also have all its properties as virtual.
Using the debugger we could then see the tracking objects and also that the results contained proxies to my actual POCOs.
If I don't have my POCO properties as virtual and have my context interface using IQueryable<> instead of IDbSet<> I don't get any of that.
In this instance I am only querying the database, but in the future will want to update the database via Entity Framework.
So, to make my life easier in the future when I come to look at this code as a reference, is there any performance penalty in having the tracking info/proxies there when I will never make use of them?
There is a performance penalty for tracking entities in EF. When you query using Entity Framework, EF keeps a copy of the values loaded from the database. Also, a single context instance keeps track of only a single instance of each entity, so EF has to check whether it already has a copy of the entity before it creates an instance (i.e. there will be a lot of comparisons going on behind the scenes).
So avoid it if you don't need it. You can do so as follows.
IQueryable<MyPoco> MyThings { get { return db.MyThings.AsNoTracking(); } }
MSDN page on Stages of Query Execution details the cost associated with each step of query execution.
Edit:
You should not expose IDbSet<MyPoco> MyThings because that tells the consumer of your API that your entities can be added, updated and deleted when in fact you only intend to query the data.
Navigation properties in the model classes are declared as virtual to enable lazy loading, which means a navigation property is loaded only when it is needed. As far as the entity objects are concerned, their main aim is to load the specific table records from the database into the DbSet, which comes from DbContext. You can't use IQueryable in this case. Also, it doesn't make any sense with the DataContext. IQueryable is an altogether different interface.

when using code first, accessing association does not account for .Take(x)

2 entities: Member and Comment
Member has an ICollection<Comment> Comments
Whenever I use member.Comments.Take(x) EF produces a query that gets all the comments from database.
Is it supposed to be like that?
Is it because property is ICollection?
Is there a way to tell EF to factor in my Take(x), or should I refactor my code to use context.Comments.Where(c=>c.MemberId==member.Id).Take(x) and live with it?
As described by @J. Tihon, this is how EF works. When you access a lazy-loaded property, EF always loads the whole collection, and any LINQ expression is evaluated on the loaded collection. If you want to avoid that, you must use the query as you described, but the result of the query will not be loaded into your navigation property. To solve this you can use explicit loading instead of lazy loading:
context.Entry(member)
    .Collection(m => m.Comments)
    .Query()
    .OrderBy(...) // Take requires some sorting
    .Take(2)
    .Load();
This should fill your Comments property with two comments.
The proxy classes generated by EF only provide lazy loading for navigation properties; they do not evaluate queries. Once you access the member.Comments property, the Comment entities are loaded from the database and your query is applied in memory. To avoid this, you must get your comments in a query that is executed directly on the object set (like the example you've already given).
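The difference comes down to which Take method the compiler binds to, and that can be seen without EF at all. The following is a minimal sketch using plain LINQ; the class and method names are hypothetical, and a list of ints stands in for the Comments collection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class TakeBindingDemo
{
    // Static type ICollection<T>: this binds to Enumerable.Take, which simply
    // iterates whatever has already been loaded into memory.
    public static IEnumerable<int> TakeInMemory(ICollection<int> comments)
        => comments.Take(2);

    // Static type IQueryable<T>: this binds to Queryable.Take, which records the
    // call in an expression tree that a provider such as EF can translate to SQL.
    public static IQueryable<int> TakeComposed(IQueryable<int> comments)
        => comments.Take(2);

    // Returns the class that declares the outermost method call of the query.
    public static Type DeclaringTypeOfLastCall(IQueryable<int> query)
        => ((MethodCallExpression)query.Expression).Method.DeclaringType;
}
```

Calling TakeInMemory on a list returns a sequence that is not an IQueryable, while DeclaringTypeOfLastCall(TakeComposed(list.AsQueryable())) returns typeof(Queryable). Because member.Comments is typed as ICollection<Comment>, only the first form is available on the navigation property, so the provider never sees the Take.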
I believe this is by design, since you would have to return an IQueryable from the navigation property in order for EF to intercept access to this property, but I suppose this isn't covered as well.
You've already described a way to handle this, although it isn't pretty. Another option would be to somehow tell EF to partially load the property when you make the original query for the Member object. I will look into that, but I can already think of one or two things that might go wrong with that approach.
Edit
After some research and trial and error, I couldn't come up with another approach that could be executed directly on the DbSet<Member> rather than the DbSet<Comment> and that returns a Member object. It is possible using an anonymous object:
var query = from m in catalog.Members
            select new
            {
                Id = m.Id,
                Name = m.Name,
                Comments = m.Comments.Take(1)
            };
This could then be translated into a Member object in memory, but of course it wouldn't be connected to the context in any way (= no change tracking). In the sample query above I cannot create an instance of Member instead of an anonymous type, because EF can only create non-complex types (I'm guessing because the context knows that "Member" is an entity).
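Translating the anonymous projection into Member objects in memory could be sketched as below. The Member and Comment class shapes and the helper name are assumptions made for illustration; the AsEnumerable() call marks the point where the query stops being translated and the rest runs in memory.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed entity shapes, for illustration only.
public class Comment
{
    public int Id { get; set; }
    public string Text { get; set; }
}

public class Member
{
    public int Id { get; set; }
    public string Name { get; set; }
    public ICollection<Comment> Comments { get; set; } = new List<Comment>();
}

public static class MemberProjection
{
    // Hypothetical helper: projects to an anonymous type inside the query, then
    // rebuilds plain (untracked) Member objects from the results.
    public static List<Member> LoadWithFirstComments(IQueryable<Member> members, int commentsPerMember)
    {
        return members
            .Select(m => new { m.Id, m.Name, Comments = m.Comments.Take(commentsPerMember) })
            .AsEnumerable() // everything after this line runs in memory
            .Select(x => new Member { Id = x.Id, Name = x.Name, Comments = x.Comments.ToList() })
            .ToList();
    }
}
```

With a context this would be called as MemberProjection.LoadWithFirstComments(catalog.Members, 1). The returned objects are new instances that the context knows nothing about, so changes to them are not tracked.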

How to do filtering for many entities of many DataServices in one common class?

Tier database and every single table has a DataSetId and I absolutely want to be sure that the data is always partitioned correctly.
Currently I'm using the QueryInterceptor attribute, but it's messy, overly repetitive and prone to errors. Some new dev could add a new table and forget to filter by DataSetId, or just rename a table. So I've put this in a base class, but the IQueryable properties of my repository are never called.
I have a "CoreRepository" class that inherits from ObjectContext, and each of my IQueryable collections uses "CoreObjectSet". CoreObjectSet extends ObjectSet by always adding an expression to filter by DataSetId. When used directly this works fine. But when used for a DataService the Get accessor for the collections on the Repository are never called by the DataService. It appears to be cheating and not using them at all and accessing the data directly.
Is there a way to get the DataService to access through the repository class correctly (And still get the efficiency of passing through the query as SQL)?
If this is the behaviour why even make DataService of T anyway if it's not even going to use the class? For the ADO team to just ignore it and use the edmx directly seems like a hack.
Thanks
Aaron
Looks like the only way around it is to use a T4 template to generate the DataService. I much prefer a base class or some kind of reusable handler but ADO has given me no choice here.
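For reference, the kind of shared DataSetId filter described in the question can be written once for any entity type by building the predicate as an expression tree. This is a sketch only: the names DataSetFilter, ForDataSet and Invoice are hypothetical, DataSetId is assumed to be an int property, and the sketch does not change the fact that the DataService bypasses the repository properties.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity used only to show the call.
public class Invoice
{
    public int Id { get; set; }
    public int DataSetId { get; set; }
}

public static class DataSetFilter
{
    // Builds the predicate e => e.DataSetId == dataSetId for any T that has an
    // int DataSetId property, and appends it to the query as a Where clause.
    // Expression.Property throws ArgumentException if T has no such property.
    public static IQueryable<T> ForDataSet<T>(this IQueryable<T> source, int dataSetId)
    {
        ParameterExpression entity = Expression.Parameter(typeof(T), "e");
        Expression body = Expression.Equal(
            Expression.Property(entity, "DataSetId"),
            Expression.Constant(dataSetId));
        return source.Where(Expression.Lambda<Func<T, bool>>(body, entity));
    }
}
```

A query such as invoices.ForDataSet(42) stays an IQueryable, so a LINQ provider can still turn the filter into SQL rather than applying it in memory.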

Are protected entity properties in EF 4 a bad idea with data services?

After some NullReferenceException headaches from deep down in System.Data.Services, I found out that data services don't like properties in the EF model that are marked as protected.
The exception occurs when the data service is initializing and tries to generate metadata for the EF model. Apparently Microsoft decided to throw a NullReferenceException when reflecting protected properties here rather than a meaningful message.
I use a few protected properties to wrap custom types not available in the database. This is a convenient way of representing for example enums in the database. The enum can be represented as a protected string property with a public enum wrapper that converts to and from the string value. This works very well in all other EF usage scenarios and I would rather not abandon the protected properties pattern.
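The enum wrapper pattern described above might look like this. It is a minimal sketch with hypothetical names (Order, OrderStatus, StatusValue); in a generated EF model the public wrapper would normally live in a partial class next to the mapped property.

```csharp
using System;

public enum OrderStatus { Pending, Shipped, Cancelled }

public class Order
{
    // Mapped property: stored as a string column, hidden from outside callers.
    protected string StatusValue { get; set; } = OrderStatus.Pending.ToString();

    // Unmapped wrapper: the only view of the value that other code sees.
    public OrderStatus Status
    {
        get => (OrderStatus)Enum.Parse(typeof(OrderStatus), StatusValue);
        set => StatusValue = value.ToString();
    }
}
```

Callers work only with order.Status, while the persistence layer reads and writes the string behind it.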
I use self-tracking entities that flow around nicely with WCF. Protected properties work well since I have checked "Reuse types in referenced assemblies", which makes my wrapper properties available in all assemblies. I was hoping that I could do something similar with a data service. I realize that I might run into trouble if I want to construct a query where the protected properties are part of the expression, but that is a different problem that could be addressed otherwise.
Is there a (practical) way to use protected entity properties with a data service?
If not, how should I best represent non-db types if I want to keep my set of public properties nice and clean?
Obviously I could just make everything public, but that would be a practice that rhymes with globals.
What about calling SetEntitySetAccessRule?