update audit fields in a many-to-many Entity Framework model - entity-framework

I have a legacy database with a many-to-many relationship like the following:
public class Post
{
    public ICollection<Tag> Tags { get; set; }
    ...
}
public class Tag
{
    public ICollection<Post> Posts { get; set; }
    ...
}
with the many-to-many relationship tracked in a 'PostTagLink' table.
Normally it is easy to use Code First to express the many-to-many relationship more or less implicitly, i.e. update the 'PostTagLink' table when a relationship is added or removed, but without actually having a 'PostTagLink' entity explicitly defined.
Audit fields on Tags and Posts can be updated easily by the DbContext when changes are saved:
public abstract class MyAuditableEntityContext : DbContext
{
    public override int SaveChanges()
    {
        string currentUser = Thread.CurrentPrincipal.Identity.Name;
        foreach (DbEntityEntry<IAuditableEntity> changeEntry in base.ChangeTracker.Entries<IAuditableEntity>())
        {
            if (changeEntry.State == EntityState.Added)
            {
                changeEntry.Entity.CreatedBy = currentUser;
                changeEntry.Entity.RevisedBy = currentUser;
            }
            else if (changeEntry.State == EntityState.Modified)
            {
                changeEntry.Entity.RevisedBy = currentUser;
            }
        }
        return base.SaveChanges();
    }
}
But what if the 'PostTagLink' table also includes audit fields?
The only solution I can see is to include a PostTagLink entity in the model (with many-to-one relationships back to Tag and Post) so I can access the audit fields in the DbContext SaveChanges method.
But adding these extra entities makes working with the model awkward. Clients and queries have to work with the 'link' entities directly instead of Entity Framework handling the relationship changes automatically.
The question: Is there some Entity-Framework-ninja technique where I could intercept the changes to many-to-many relationships and update the link table audit fields as necessary, without having to explicitly include 'link' entities in the model?
(Again - this is a legacy database and there is little I can do to change it, so I'd like to avoid adding stored procedures or any other logic to the database.)
Thanks for your time!

So if I understand this correctly, when you add an item to one of the collections and save the entity, you need to set a CreatedBy field in the many-to-many table. You can execute raw SQL using DbContext.Database.ExecuteSqlCommand to update the link table audit fields (see "How to execute raw sql").
So how to intercept the changes?
These answers might help: "EF4 Audit changes of many to many relationships" and "Entity Framework: Tracking changes to FK associations".
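A rough sketch of what that interception could look like under EF6, in case it helps. Everything specific here is an assumption rather than something taken from the question: the Id key properties on Post and Tag, the PostId, TagId and CreatedBy column names, and the LinkAuditingContext class name. The idea is that changes to a many-to-many association show up as relationship entries in the underlying ObjectStateManager rather than in DbContext.ChangeTracker:

```csharp
using System.Collections.Generic;
using System.Data.Entity;
using System.Data.Entity.Core;
using System.Data.Entity.Core.Objects;
using System.Data.Entity.Infrastructure;
using System.Linq;
using System.Threading;

// Minimal stand-ins for the question's entities; the Id key properties are assumed.
public class Post
{
    public int Id { get; set; }
    public virtual ICollection<Tag> Tags { get; set; }
}

public class Tag
{
    public int Id { get; set; }
    public virtual ICollection<Post> Posts { get; set; }
}

public abstract class LinkAuditingContext : DbContext
{
    public override int SaveChanges()
    {
        string currentUser = Thread.CurrentPrincipal.Identity.Name;
        ObjectContext objectContext = ((IObjectContextAdapter)this).ObjectContext;
        objectContext.DetectChanges();

        // Resolve both ends to entity instances now; keys of new rows are
        // only temporary until the save has happened.
        List<object[]> addedLinks = objectContext.ObjectStateManager
            .GetObjectStateEntries(EntityState.Added)
            .Where(entry => entry.IsRelationship)
            .Select(entry => new[]
            {
                objectContext.GetObjectByKey((EntityKey)entry.CurrentValues[0]),
                objectContext.GetObjectByKey((EntityKey)entry.CurrentValues[1])
            })
            .ToList();

        int result = base.SaveChanges();

        foreach (object[] ends in addedLinks)
        {
            Post post = ends.OfType<Post>().FirstOrDefault();
            Tag tag = ends.OfType<Tag>().FirstOrDefault();
            if (post == null || tag == null) continue; // some other association

            // Hypothetical column names; adjust to the legacy schema.
            Database.ExecuteSqlCommand(
                "UPDATE PostTagLink SET CreatedBy = @p0 WHERE PostId = @p1 AND TagId = @p2",
                currentUser, post.Id, tag.Id);
        }
        return result;
    }
}
```

Note that the UPDATE statements run after base.SaveChanges() and are not part of its transaction, so a real implementation would probably want to wrap both in one. The IAuditableEntity loop from the question would sit in the same override.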
EDIT:
For reference here is the original example I posted that may have led you to believe that you had to put sql into your model

Patient: "Doctor, it hurts when I do this. What's the cure?"
Doctor: "Don't do that."
I'm going to answer my own question as 'there is no answer'. As I've seen others advise in comments on related StackOverflow questions - I think it will ultimately be better to just include the 'link' entities explicitly in the model.

Related

EF Core update untracked entity with many to many collection

I'm facing a problem with EF Core and collections; I have persons who read books, the books can be read by multiple people and people can read multiple books (it's a many-to-many relationship). My EF generates the 3 tables Books, Persons and BookPersons.
When I insert new persons with a set of books they read, there is no problem. But when I recreate one of the persons outside the db context (same id, but a mutated collection of read books) and try to save it, it fails on the many-to-many relation, because the relation between the existing entities already exists (a unique constraint violation).
I've tried:
to attach the book collection to the context (same error)
the person (no error but no change either)
only change person details not the collection (the untracked entity is saved but my books read is not saved)
I'm not very fond of managing the BookPersons table or doing queries first to get existing entities. My goal is to do an update of a person and its read books in one go. I do know how to write it in SQL but it seems EF is quite a challenge.
If you want to view my code, visit: https://github.com/CasperCBroeren/EfCollectionsProblem/blob/master/Program.cs
Thanks for explaining what I'm missing or not getting
I would create a PersonBook model to handle that.
Book model
public class Book
{
    [DatabaseGenerated(DatabaseGeneratedOption.None)]
    public int Id { get; set; }
    public string Title { get; set; }
    [ForeignKey("BookId")]
    public virtual ICollection<PersonBook> PersonBooks { get; set; }
}
Person Model
public class Person
{
    [DatabaseGenerated(DatabaseGeneratedOption.None)]
    public int Id { get; set; }
    public string Name { get; set; }
    [ForeignKey("PersonId")]
    public virtual ICollection<PersonBook> PersonBooks { get; set; }
}
PersonBook Model
public class PersonBook
{
    public int Id { get; set; }
    public int PersonId { get; set; }
    public int BookId { get; set; }
    public virtual Person Person { get; set; }
    public virtual Book Book { get; set; }
}
Then you can get the IDs of all books read by a person using
var personId = 15; // whatever you want
db.PersonBooks.Where(a => a.PersonId == personId);
or get the IDs of all persons who read a given book
var bookId = 11; // whatever you want
db.PersonBooks.Where(a => a.BookId == bookId);
Note:
you can reach the Book entity by using, for example,
db.PersonBooks.Where(a => a.PersonId == personId).FirstOrDefault().Book;
A key factor in EF is dealing with object references. Any reference that a DbContext isn't tracking will be treated as a new entity. The Update method on DbSets should actually be avoided as it can lead to inefficient and potentially dangerous data changes.
This option: "to attach the book collection to the context" works with singular references, but doesn't work with collections. The trouble is that what you want to say is "add any book the person isn't already associated with" however, the DbContext has no knowledge of what books that person is already associated with unless you fetch that information first.
... or doing queries first to get existing entities.
This is actually what you should do in most cases. In the case of a simple console application to test out ideas and learn how EF works it may look like overkill, but in real-world systems this is the recommended approach for a number of reasons.
Keeping payloads small. Take an API or web site where you allow a user to associate books to people. Sending entire representation of people, their books, etc. back and forth between server and client can get potentially expensive in terms of data size. If I have an API that allows me to associate books to a person, if those books already reflect known data state (already exist in the db) then all I need to pass are IDs. When passing data to views the idea is to only pass what the view needs rather than entire entity graphs.
Keeping payloads safe. Passing entire entities around and using methods like Update can make your systems prone to tampering. Update will update all columns in an entity whether you expect, or allow them to change or not. By minimizing the data coming back you ensure only the expected details can change, and you by definition validate that the provided values are safe.
For example, say I have a service that updates the books associated with a person. In the UI I had loaded that John had "Jungle Book (ID: 1)", and I wanted to update the associations so John now had "Jungle Book" and "Tom Sawyer". While my UI might not allow it, it is certainly possible for the client browser to intercept the call to my controller / web service and, seeing a Book { ID: 1, Name: "Jungle Book" }, tamper with that data to send Book { ID: 1, Name: "Hitchhiker's Guide to the Galaxy" }. Provided you did solve this issue in a way that resulted in attaching entities and doing an Update or such, the consequence of this tampering would be that an attacker could rename a book. That would have a flow-on effect on every Person that referenced Book ID #1.
Instead if I want to have something like an "UpdateBooks" method that can reassign books for a person, I would have a method something like this:
private void UpdateBooks(int personId, params int[] bookIds)
{
    using (var context = new AppDbContext())
    {
        // Load the person together with the books currently associated with them.
        var person = context.Persons
            .Include(x => x.Books)
            .Single(x => x.PersonId == personId);
        var existingBookIds = person.Books.Select(x => x.BookId).ToList();
        var bookIdsToAdd = bookIds.Except(existingBookIds).ToList();
        var bookIdsToRemove = existingBookIds.Except(bookIds).ToList();
        foreach (var bookId in bookIdsToRemove)
        {
            var book = person.Books.Single(x => x.BookId == bookId);
            person.Books.Remove(book);
        }
        if (bookIdsToAdd.Any())
        {
            // Fetch tracked Book entities that already exist in the database.
            var booksToAdd = context.Books
                .Where(x => bookIdsToAdd.Contains(x.BookId))
                .ToList();
            if (booksToAdd.Count != bookIdsToAdd.Count)
            {
                // Handle scenario where one or more book IDs provided weren't found.
            }
            // Add one at a time: ICollection<T> has no AddRange (only List<T> does).
            foreach (var book in booksToAdd)
            {
                person.Books.Add(book);
            }
        }
        context.SaveChanges();
    }
}
This assumes that EF is handling PersonBooks entirely behind the scenes, where PersonBook consists of just PersonId and BookId, so that Person can have a collection of Books rather than PersonBooks.
This example runs up to two SELECT queries: one to get the Person and its current books, and one to get any new books if any need to be added. There is no risk of tampering with books, and we can easily validate scenarios such as passing an unknown book ID. The temptation might be to avoid querying, seeing it as expensive, but in most cases EF can provide data quite quickly and efficiently. It is the exception rather than the norm that you might need to get creative to get around possible performance bottlenecks with data access.
A third consideration is to focus on keeping operations atomic, especially for things like web services / web applications. This doesn't apply when just getting familiar with the workings of EF, entities, and such, but it is a consideration for more real-world applications. Rather than having more complex methods like UpdateBooks(), using actions like "AddBook" and "RemoveBook" can keep operations faster and simpler. One argument for a larger method is that you might expect all of the operations to be committed (or not) as one operation, such as when UpdateBooks gets called as part of one big "SavePerson" method reflecting changes to the person and all of its associated details. In these cases having atomic actions is still recommended, except that instead of updating data state, they can update server (session) state, waiting for a "Save" call to come through to persist the changes as one operation, or discarding the changes. Add/Remove methods can still provide the validation checks, ultimately setting things up for entities to be loaded, modified, and persisted.
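To make the last point a bit more concrete, here is a minimal sketch of holding "AddBook" / "RemoveBook" actions as pending state until a Save arrives. PendingBookChanges is a made-up name, not anything from EF; it works purely on IDs and would live in whatever session or cache mechanism the application already has:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical holder for small add/remove actions that are applied in one go on "Save".
public sealed class PendingBookChanges
{
    private readonly HashSet<int> _toAdd = new HashSet<int>();
    private readonly HashSet<int> _toRemove = new HashSet<int>();

    public void AddBook(int bookId)
    {
        _toRemove.Remove(bookId); // a later add cancels an earlier remove
        _toAdd.Add(bookId);
    }

    public void RemoveBook(int bookId)
    {
        _toAdd.Remove(bookId);    // a later remove cancels an earlier add
        _toRemove.Add(bookId);
    }

    // Returns the final set of book IDs given the IDs currently stored for the person.
    public ISet<int> ApplyTo(IEnumerable<int> existingBookIds)
    {
        var result = new HashSet<int>(existingBookIds);
        result.ExceptWith(_toRemove);
        result.UnionWith(_toAdd);
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        var pending = new PendingBookChanges();
        pending.AddBook(2);
        pending.AddBook(3);
        pending.RemoveBook(3); // the user changed their mind

        ISet<int> finalIds = pending.ApplyTo(new[] { 1 });
        Console.WriteLine(string.Join(", ", finalIds.OrderBy(id => id))); // prints "1, 2"
    }
}
```

On Save, the resulting IDs could be passed to something like the UpdateBooks method above, so the load, validate and persist steps still happen in one place.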

What is Owned Entity? When and why to use Owned Entity in Entity Framework Core?

I'm learning Entity Framework Core. I came across the term "Owned Entity" in almost all tutorials.
Here is one example on using an Owned Entity in Entity Framework Core
Job Entity:
public class Job : Entity
{
    public HiringManagerName HiringManagerName { get; private set; }
}
HiringManagerName Value Object:
public class HiringManagerName : ValueObject
{
    public string First { get; }
    public string Last { get; }
    protected HiringManagerName()
    {
    }
    private HiringManagerName(string first, string last)
        : this()
    {
        First = first;
        Last = last;
    }
    public static Result<HiringManagerName> Create(string firstName, string lastName)
    {
        if (string.IsNullOrWhiteSpace(firstName))
            return Result.Failure<HiringManagerName>("First name should not be empty");
        if (string.IsNullOrWhiteSpace(lastName))
            return Result.Failure<HiringManagerName>("Last name should not be empty");
        firstName = firstName.Trim();
        lastName = lastName.Trim();
        if (firstName.Length > 200)
            return Result.Failure<HiringManagerName>("First name is too long");
        if (lastName.Length > 200)
            return Result.Failure<HiringManagerName>("Last name is too long");
        return Result.Success(new HiringManagerName(firstName, lastName));
    }
    protected override IEnumerable<object> GetEqualityComponents()
    {
        yield return First;
        yield return Last;
    }
}
Entity Configuration:
public class JobConfiguration : IEntityTypeConfiguration<Job>
{
    public void Configure(EntityTypeBuilder<Job> builder)
    {
        builder.OwnsOne(p => p.HiringManagerName, p =>
        {
            p.Property(pp => pp.First)
                .IsRequired()
                .HasColumnName("HiringManagerFirstName")
                .HasMaxLength(200);
            p.Property(pp => pp.Last)
                .IsRequired()
                .HasColumnName("HiringManagerLastName")
                .HasMaxLength(200);
        });
    }
}
This gets created as two columns in the table, just like the other columns of the Job entity.
Since these end up as ordinary columns, they could just as well be added as normal properties on the Job entity. Why does this need to be an owned entity?
Please can anyone help me understand,
What is owned entity?
Why we need to use owned entity?
When to use owned entity?
What does this look like without owned entities?
If you create an entity, Job, in EF Core that points to a complex object, HiringManagerName, in one of the properties, EF Core will expect that each will reside in a separate table and will expect you to define some sort of relationship between them (e.g. one-to-one, one-to-many, etc.).
When retrieving Job, if you want to explicitly load the values of HiringManagerName as well, you'd have to use an explicit Include statement in the query or it will not be populated.
var a = await dbContext.Jobs
    .Include(b => b.HiringManagerName) // Necessary to populate
    .ToListAsync();
But because each is thought to be a separate entity, they will be required to expose keys and you'll have to configure foreign keys between each.
What is an owned entity?
That's where [Owned] types come in (see docs). By marking the child class with the [Owned] attribute, you leave the explicit handling of that relationship to EF Core to manage and no longer have a need to define the key(s)/foreign key(s) on the owned type. Same if you point to a collection of your owned type - you no longer need to deal with navigation properties on either class to describe the relationship.
EF Core also supports queries against these owned types, as in:
var job = await context.Jobs.Where(a => a.HiringManagerName.First == "fingers10").FirstOrDefaultAsync();
Now, it comes with two important design restrictions described in the docs (but elaborated on here):
You cannot create a DbSet for the owned type
This means that you cannot subsequently do a DB call with:
dbContext.HiringManagerNames.ToListAsync();
This will throw because you are expected to simply retrieve the value as part of a call to:
dbContext.Jobs.ToListAsync();
Unlike the first example I gave, HiringManagerName no longer needs to be explicitly included and will instead be returned with a call to the Jobs DbSet<T>.
Cannot call Entity<T> with an owned type on ModelBuilder
Similarly, you cannot reference your owned type in the ModelBuilder to configure it. Rather, if you must configure it, do so through the configuration against your Jobs entity and against the owned property, e.g.:
modelBuilder.Entity<Job>().OwnsOne(a => a.HiringManagerName) // remaining configuration goes here
So when should I use owned entities?
If you've got a type that's only ever going to appear as a navigation property of another type (e.g. you're never querying against it itself as the root entity of the query), use owned types in order to save yourself some relationship boilerplate.
If you ever anticipate querying the child entity independent of the parent, don't make it owned - it will need to be defined with its own DbSet<T> in order to be called from the context.
While @Whit Waldo's explanation is great with respect to the technical side of EF Core, we should also try to understand this from a Domain-Driven Design perspective.
Let's look at the classes mentioned in the question itself:
public class Job : Entity
and
public class HiringManagerName : ValueObject
Take a note at Entity and ValueObject. Both of them are DDD concepts.
Identity matters for entities, but does not matter for value objects.
Take a look at this write up from Vladimir Khorikov for a more extensive explanation.
I paste the summary bullets here.
Entities have their own intrinsic identity, value objects don’t.
The notion of identity equality refers to entities; the notion of structural equality refers to value objects; the notion of reference equality refers to both.
Entities have a history; value objects have a zero lifespan.
A value object should always belong to one or several entities, it can’t live by its own.
Value objects should be immutable; entities are almost always mutable.
To recognize a value object in your domain model, mentally replace it with an integer.
Value objects shouldn’t have their own tables in the database.
Always prefer value objects over entities in your domain model.
So a value object is owned by an entity. How do we achieve that using EF Core? This is where owned entities come in. Now go back and read @Whit Waldo's answer.
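To see the identity versus structural equality distinction from the bullets above in code, here is a small sketch without EF. The ValueObject base class is a minimal stand-in written for this example (the question does not show the real one), and Job is simplified to compare by an assumed Id:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for the ValueObject base class used in the question (assumed implementation).
public abstract class ValueObject
{
    protected abstract IEnumerable<object> GetEqualityComponents();

    // Structural equality: same type and the same component values.
    public override bool Equals(object obj)
    {
        if (obj == null || obj.GetType() != GetType()) return false;
        return GetEqualityComponents().SequenceEqual(((ValueObject)obj).GetEqualityComponents());
    }

    public override int GetHashCode()
    {
        // Combine component hashes; null components contribute 0.
        return GetEqualityComponents()
            .Aggregate(17, (hash, part) => unchecked(hash * 31 + (part?.GetHashCode() ?? 0)));
    }
}

public sealed class HiringManagerName : ValueObject
{
    public string First { get; }
    public string Last { get; }

    public HiringManagerName(string first, string last)
    {
        First = first;
        Last = last;
    }

    protected override IEnumerable<object> GetEqualityComponents()
    {
        yield return First;
        yield return Last;
    }
}

// An entity compares by identity (its Id), not by the data it carries.
public sealed class Job
{
    public int Id { get; }
    public HiringManagerName HiringManagerName { get; }

    public Job(int id, HiringManagerName name)
    {
        Id = id;
        HiringManagerName = name;
    }

    public override bool Equals(object obj) => obj is Job other && other.Id == Id;
    public override int GetHashCode() => Id.GetHashCode();
}

public static class Program
{
    public static void Main()
    {
        var a = new HiringManagerName("Ada", "Lovelace");
        var b = new HiringManagerName("Ada", "Lovelace");

        Console.WriteLine(a.Equals(b));                         // True: same values
        Console.WriteLine(ReferenceEquals(a, b));               // False: different instances
        Console.WriteLine(new Job(1, a).Equals(new Job(2, a))); // False: different identity
    }
}
```

Two names with the same parts are interchangeable, while two jobs with the same hiring manager are still different jobs. That is the property that makes HiringManagerName a natural fit for an owned type stored in the Job table.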

Want Entity Framework 6.1 eager loading to load only first level

I am not sure whether I am approaching this the wrong way or whether it is the default behaviour, but it is not working the way I expect ...
Here are two sample classes ...
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public Department Department { get; set; }
}
Second one is Department
public class Department
{
    public string Name { get; set; }
    public List<Person> People { get; set; }
}
Context Configuration
public MyDbContext() : base("DefaultConnection")
{
    this.Configuration.ProxyCreationEnabled = false;
    this.Configuration.LazyLoadingEnabled = false;
}
public DbSet<Person> People { get; set; }
public DbSet<Department> Departments { get; set; }
I am trying to load people whose last name is 'Smith':
var foundPeople
    = context
        .People
        .Where(p => p.LastName == "Smith");
The query above loads foundPeople with just FirstName and LastName and no Department object. That is correct, and expected, as lazy loading is off.
Now another query, eager loading Department:
var foundPeople
    = context
        .People
        .Where(p => p.LastName == "Smith")
        .Include(p => p.Department);
The query above loads foundPeople with FirstName, LastName and Department, including Department->Name as well as Department->People (all people in that department), which I don't want. I just want to load the first level of the included property.
I don't know whether this is intended behaviour or whether I have made a mistake.
Is there any way to load just the first level of an included property rather than the complete graph or all levels of the included property?
Using Include() to achieve eager loading only works if lazy loading is enabled on your objects--that is, your navigation properties must be declared as virtual, so that the EF proxies can override them with the lazy-loading behavior. Otherwise, they will eagerly load automatically and the Include() will have no effect.
Once you declare Person.Department and Department.People as virtual properties, your code should work as expected.
Very sorry, my original answer was wholly incorrect in the main. I didn't read your question closely enough and was incorrect in fact on the eager behavior. Not sure what I was thinking (or who upvoted?). Real answer below the fold:
Using the example model you posted (with necessary modifications: keys for the entities and removed "this" from context constructor) I was unable to exactly reproduce your issue. But I don't think it's doing what you think it's doing.
When you eagerly load the Department (or explicitly load, using context.Entry(...).Reference(...).Load()) inspect your results more closely: there are elements in the Department.People collections, but not all the Persons, only the Persons that were loaded in the query itself. I think you'll find, on your last snippet, that !foundPeople.SelectMany(p => p.Department.People).Any(p => p.LastName != "Smith") == true. That is, none of them are not "Smith".
I don't think there's any way around this. Entity Framework isn't explicitly or eagerly loading People collections (you could Include(p => p.Department.People) for that). It's just linking the ones that were loaded to their related object, because of the circular relationship in the model. Further, if there are multiple queries on the same context that load other Persons, they will also be linked into the object graph.
(An aside: in this simplified case, the proxy-creation and lazy-loading configurations are superfluous--neither are enabled on the entities by virtue of the fact that neither have lazy or proxy-able (virtual) properties--the one thing I did get right the first time around.)
By design, DbContext performs what is called "relationship fix-up". Because your model describes the relationships between your entities, whenever an entity is attached or modified in the context, EF will try to "fix up" the relationships between entities.
For example, if you load into the context an entity with an FK indicating that it is a child of another entity already attached to the context, it will be added to the children collection of the existing entity. If you make any changes (change the FK, delete the entity, etc.) the relationships will be automatically fixed up. That's what the other answer explains: even if you load the related entities separately, with a different query, they'll be attached to the children collection they belong to.
This functionality cannot be disabled. See other questions related to this:
AsNoTracking and Relationship Fix-Up
Is it possible to enable relationship fixup when change tracking is disabled but proxies are generated
How to get rid of the related entities
I don't know what you need to do, but with the current version of EF you have to detach the entity from the context and manually remove the related entities.
Another option is to map using AutoMapper or ValueInjecter, to get rid of the relationship fix-up.
You could try using a LINQ query so you can select only the fields that you need. I hope that helps.
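As an illustration of that last suggestion, a projection into a flat shape sidesteps the fixed-up object graph, because the result no longer holds Department or Person references at all. This is only a sketch: PersonSummary is a made-up type, and an in-memory list stands in for context.People so the snippet runs without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Department
{
    public string Name { get; set; }
    public List<Person> People { get; set; } = new List<Person>();
}

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public Department Department { get; set; }
}

// Hypothetical flat result type carrying first-level Department data only.
public class PersonSummary
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string DepartmentName { get; set; }
}

public static class Program
{
    // With EF the source would be context.People; any IEnumerable<Person> works here.
    public static List<PersonSummary> FindByLastName(IEnumerable<Person> people, string lastName)
    {
        return people
            .Where(p => p.LastName == lastName)
            .Select(p => new PersonSummary
            {
                FirstName = p.FirstName,
                LastName = p.LastName,
                DepartmentName = p.Department.Name
            })
            .ToList();
    }

    public static void Main()
    {
        var sales = new Department { Name = "Sales" };
        var people = new List<Person>
        {
            new Person { FirstName = "Ann", LastName = "Smith", Department = sales },
            new Person { FirstName = "Bob", LastName = "Jones", Department = sales },
        };
        sales.People.AddRange(people); // circular references, as fix-up would produce

        foreach (PersonSummary p in FindByLastName(people, "Smith"))
            Console.WriteLine($"{p.FirstName} {p.LastName} ({p.DepartmentName})"); // prints "Ann Smith (Sales)"
    }
}
```

The trade-off is that the results are plain read-only data rather than tracked entities, which is usually fine for display or serialization.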

Maintain Many to Many References

I have a product that I am trying to associate categories to. The list of categories is static. I have set up a bi-directional many-to-many relationship up between Product and Category using Set<?> properties like so:
class Product {
    @ManyToMany
    public Set<Category> categories;
}
class Category {
    @ManyToMany(mappedBy = "categories")
    public Set<Product> products;
}
I would like certain users to maintain this relationship, but the only previous way I have seen is to just use a List<Long> to pass back to the controller and add appropriately. This works fine until the user needs to edit these mappings. I have tried clearing the relationship, but that doesn't prove to be simple either.
Is there a decent way to maintain this relationship? If my only option is to "loop and delete" the references, can someone point me in the right direction how to do so appropriately? So far my failed attempts look like this:
for (Category category : product.categories) {
    category.products.remove(product);
}
and
Category.delete("categories.id = ?", product.id)
Maintaining the relationship: Yes, passing the IDs to the controller and fetching the entities there is okay.
The relationship proper, there are some things to note:
First, you need to set the cascade annotation; without it nothing in the association will get deleted:
@ManyToMany(cascade=CascadeType.ALL)
public Set<Category> categories;
Second, one entity is the owner of the relation. In your case it's correctly set as the Product class (as the Category class uses mappedBy). Updates are only reflected when made on the owner, so to remove all categories from a product you would do
product.categories = new HashSet<Category>();
If you want to remove a single category, just remove it from product.categories.

Why does Entity Framework make certain fields EntityKeys when they are not even PK's in the source DB?

Starting out on an Entity Framework project.
Imported the Db I am going to use and right away noticed that many table fields were made into EntityKey types and the source fields are not even Keys. Doesn't seem to be a pattern as to which fields were made EntityKeys and which were not.
Is this normal? There were no options for this in the wizard. I don't want to have to go through and remove this property for all the fields where it was added.
Thanks for your advice!
Each entity on your model requires a unique key, so EF can track and retrieve/persist these entities based on their unique identifier.
If your tables in your database don't have primary keys, then your database is not relational and therefore should not be used by an ORM like EF which is predominantly designed for RDBMS.
If you had an entity like this:
public class Orders
{
    public string Name { get; set; }
    public double Price { get; set; }
}
How would you retrieve a single order? How would you save a single order?
Crucial LINQ methods such as SingleOrDefault() would be useless, as there is no guarantee that this won't throw an exception:
var singleOrder = ctx.Orders.SingleOrDefault(x => x.Name == "Foo");
Whilst if you had an EntityKey and PK called "OrderId", this is guaranteed to not throw an exception:
var singleOrder = ctx.Orders.SingleOrDefault(x => x.OrderId == 1);
http://msdn.microsoft.com/en-us/library/dd283139.aspx
I think as soon as you read the first paragraph you will understand the role of entity keys in Entity Framework.
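The SingleOrDefault point can be checked without a database. The sketch below uses LINQ to Objects on an in-memory list; the Order class with an OrderId property is an assumed variant of the Orders class above, added only for this illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Order
{
    public int OrderId { get; set; } // unique key, assumed for this example
    public string Name { get; set; }
    public double Price { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var orders = new List<Order>
        {
            new Order { OrderId = 1, Name = "Foo", Price = 9.99 },
            new Order { OrderId = 2, Name = "Foo", Price = 4.50 },
        };

        try
        {
            // Name is not unique, so more than one element matches.
            orders.SingleOrDefault(x => x.Name == "Foo");
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("Name matched more than one order");
        }

        // A unique key matches at most one element, so this cannot throw for that reason.
        Order byKey = orders.SingleOrDefault(x => x.OrderId == 1);
        Console.WriteLine(byKey.OrderId); // prints 1
    }
}
```

SingleOrDefault returns null when nothing matches and throws InvalidOperationException when several elements match, which is why a lookup on a non-unique column is unsafe.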
