EntityFramework Core Lazy Loading returns all related tables - entity-framework

I have 3 tables in my DB:
Countries
States
Cities
I have an API that should return one country with its states.
However, when I attempt this, I end up getting a JSON object with the countries, their states AND all the cities under each state.
My code is something like this (both Eager and Lazy return the same):
//Eager Loading
var countries = await _context.Countries.Include(s=>s.States).ToListAsync(cancellationToken);
//Lazy Loading
var countries = await _context.Countries.ToListAsync(cancellationToken);
How can I only load the country with the states only and leave the cities?

My advice is to never return entities. Entities should exist only as long as their DbContext, as a representation of the data model. Models used by a view or API serve a different purpose and should be simple, serializable POCOs that EF can populate. This lets them carry solely the data your view/consumer is concerned with. In your case you only care about country and state, not city or other related bits. You may not even need all the data about a country or a state. Let EF build a query for just the data needed. This improves query performance, reduces memory use on server and client, and avoids serialization pitfalls (e.g. circular references).
Entities should always represent the complete state of an entity. Turning off lazy loading and passing incomplete entity graphs around can easily lead to errors: a method that accepts an entity and finds a null/empty reference cannot tell whether that reference was simply not loaded or does not exist.
[Serializable]
public class CountryViewModel
{
    public int CountryId { get; set; }
    public string CountryName { get; set; }
    public IEnumerable<StateViewModel> States { get; set; } = new List<StateViewModel>();
}
[Serializable]
public class StateViewModel
{
    public int StateId { get; set; }
    public string StateName { get; set; }
}
Then when fetching the countries and states:
var countries = await _context.Countries
.Select(x => new CountryViewModel
{
CountryId = x.CountryId,
CountryName = x.Name,
States = x.States.Select(s => new StateViewModel
{
StateId = s.StateId,
StateName = s.Name
}).ToList()
}).ToListAsync(cancellationToken);
Leveraging Automapper, this can be simplified fairly easily down to:
var countries = await _context.Countries
.ProjectTo<CountryViewModel>()
.ToListAsync(cancellationToken);

Related

Returning Domain Objects from Repositories on Joining tables

I have been reading that Repositories should return domain objects only. I am having difficulty implementing this. I currently have an API with a Service Layer and a Repository, and I am using EF Core to access a SQL database.
Consider User (Id, Name, Address, PhoneNumber, Email, Username) and Orders (Id, OrderDetails, UserId) as two domain objects. One Customer can have multiple Orders. I have created a navigation property
public virtual User User{ get; set; }
and foreign Key.
The Service layer needs to return a DTO with OrderId, OrderDetails, CustomerId, CustomerName. What should the Repository return in this case? This is what I was trying:
public IEnumerable<Orders> GetOrders(int orderId)
{
var result = _context.Orders.Where(or => or.Id == orderId)
.Include(u => u.User)
.ToList();
return result;
}
I am having trouble with eager loading. I have tried to use Include. I am using Database First. In the case above, the navigation properties are always returned as NULL. The only way I was able to get data into the navigation properties was to enable lazy loading with proxies for the context. I think this will be a performance issue.
Can anyone help with what I should return and why .Include is not working?
Repositories can return other types of objects, even primitive types like integers if you want to count some number of objects based on a criteria.
This is from the Domain Driven Design book:
They (Repositories) can also return summary information, such as a
count of how many instances (of Domain Object) meet some criteria.
They can even return summary calculations, such as the total across
all matching objects of some numerical attribute.
If you return something that isn't a Domain Object, it's because you need some information about the Domain Objects, so you should only return immutable objects and primitive data types like integers.
If you query for an object with the intention of changing it after you get it, it should be a Domain Object.
If you need to do that, place boundaries around your Domain Objects and organize them into Aggregates.
Here's a good article that explains how to decompose your model into aggregates: https://dddcommunity.org/library/vernon_2011/
In your case you can either compose the User and the Order entities in a single Aggregate or have them in separate Aggregates.
EDIT:
Example:
Here we will use Reference By Id and all Entities from different Aggregates will reference other entities from different Aggregates by Id.
We will have three Aggregates: User, Product and Order with one ValueObject OrderLineItem.
public class User {
public Guid Id{ get; private set; }
public string FirstName { get; private set; }
public string LastName { get; private set; }
}
public class Product {
public Guid Id { get; private set; }
public string Name { get; private set; }
public Money Price { get; private set; }
}
public class OrderLineItem {
public Guid ProductId { get; private set; }
public Quantity Quantity { get; private set; }
// Copy the current price of the product here so future changes don't affect old orders
public Money Price { get; private set; }
}
public class Order {
public Guid Id { get; private set; }
public IEnumerable<OrderLineItem> LineItems { get; private set; }
}
Now if you do have to do heavy querying in your app you can create a ReadModel that will be created from the model above
public class OrderLineItemWithProductDetails {
public Guid ProductId { get; private set; }
public string ProductName { get; private set; }
// other stuff quantity, price etc.
}
public class OrderWithUserDetails {
public Guid Id { get; private set; }
public string UserFirstName { get; private set; }
public string UserLastName { get; private set; }
public IEnumerable<OrderLineItemWithProductDetails > LineItems { get; private set; }
// other stuff you will need
}
How you fill the ReadModel is a whole topic, so I can't cover all of it, but here are some pointers.
You said you will do a Join, so you're probably using an RDBMS of some kind, like PostgreSQL or MySQL. You can do the Join in a special ReadModel Repository. If your data is in a single database, you can just use a ReadModel Repository.
// SQL Repository, No ORM here
public class OrderReadModelRepository {
public OrderWithUserDetails FindForUser(Guid userId) {
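// this is supposed to be an example; my SQL is a bit rusty, so...
// The l.OrderId join column is an assumption; adjust it to the real schema.
string sql = @"SELECT * FROM orders AS o
JOIN orderlineitems AS l ON l.OrderId = o.Id
JOIN users AS u ON o.UserId = u.Id
JOIN products AS p ON p.Id = l.ProductId
WHERE u.Id = @userId";
var resultSet = DB.Execute(sql, userId); // pass userId as a query parameter, never by string concatenation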
return CreateOrderWithDetailsFromResultSet(resultSet);
}
}
// ORM based repository
public class OrderReadModelRepository {
public IEnumerable<OrderWithUserDetails> FindForUser(Guid userId) {
return ctx.Orders.Where(o => o.UserId == userId)
.Include("OrderLineItems")
.Include("Products")
.ToList();
}
}
If your data is not in a single database, you will have to build the ReadModel and keep it in a separate database. You can use DomainEvents to do that, but I won't go that far if you have a single SQL database.
The advice I give around the repository pattern is that repositories should return IQueryable<TEntity>, not IEnumerable<TEntity>.
The purpose of a repository is to:
Make code easier to test.
Centralize common business rules.
The purpose of a repository should not be to:
Abstract EF away from your project.
Hide knowledge of your domain. (Entities)
If you're introducing a repository to hide the fact that the solution is depending on EF or hide the domain then you are sacrificing much of what EF can bring to the table for managing the interaction with your data or you are introducing a lot of unnecessary complexity into your solution to try and keep that capability. (filtering, sorting, paginating, selective eager loading, etc.)
Instead, by leveraging IQueryable and treating EF as a first-class citizen to your domain you can leverage EF to produce flexible and fast queries to get the data you need.
Given a Service where you want to " return DTO with OrderId, OrderDetails, CustomerId, CustomerName."
Step 1: Raw example, no repository...
Service code:
public OrderDto GetOrderById(int orderId)
{
using (var context = new AppDbContext())
{
var order = context.Orders
.Select(x => new OrderDto
{
OrderId = x.OrderId,
OrderDetails = x.OrderDetails,
CustomerId = x.Customer.CustomerId,
CustomerName = x.Customer.Name
}).Single(x => x.OrderId == orderId);
return order;
}
}
This code can work perfectly fine, but it is coupled to the DbContext, so it is hard to unit test. We may also have business logic that applies to pretty much all queries, such as Orders having an "IsActive" state (soft delete) or the database serving multiple clients (multi-tenant). With many queries spread across our controllers, that would mean adding things like .Where(x => x.IsActive) everywhere.
With the Repository pattern (IQueryable), unit of work:
public OrderDto GetOrderById(int orderId)
{
using (var context = ContextScopeFactory.CreateReadOnly())
{
var order = OrderRepository.GetOrders()
.Select(x => new OrderDto
{
OrderId = x.OrderId,
OrderDetails = x.OrderDetails,
CustomerId = x.Customer.CustomerId,
CustomerName = x.Customer.Name
}).Single(x => x.OrderId == orderId);
return order;
}
}
Now at face value in the controller code above, this doesn't really look much different to the first raw example, but there are a few bits that make this testable and can help manage things like common criteria.
The repository code:
public class OrderRepository : IOrderRepository
{
private readonly IAmbientContextScopeLocator _contextScopeLocator = null;
public OrderRepository(IAmbientContextScopeLocator contextScopeLocator)
{
_contextScopeLocator = contextScopeLocator ?? throw new ArgumentNullException("contextScopeLocator");
}
private AppDbContext Context => _contextScopeLocator.Get<AppDbContext>();
IQueryable<Order> IOrderRepository.GetOrders()
{
return Context.Orders.Where(x => x.IsActive);
}
}
This example uses Mehdime's DbContextScope for the unit of work, but can be adapted to others or an injected DbContext as long as it is lifetime scoped to the request. It also demonstrates a case with a very common filter criteria ("IsActive") that we might want to centralize across all queries.
In the above example we use a repository to return the orders as an IQueryable. The repository method is fully mock-able where the DbContextScopeFactory.CreateReadOnly call can be stubbed out, and the repository call can be mocked to return whatever data you want using a List<Order>().AsQueryable() for example. By returning IQueryable the calling code has full control over how the data will be consumed. Note that there is no need to worry about eager-loading the customer/user data. The query will not be executed until you perform the Single (or ToList etc.) call which results in very efficient queries. The repository class itself is kept very simple as there is no complexity about telling it what records and related data to include. We can adjust our query to add sorting, pagination, (Skip/Take) or get a Count or simply check if any data exists (Any) without adding functions etc. to the repository or having the overhead of loading the data just to do a simple check.
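To make the mocking point concrete, here is a minimal sketch of a hand-rolled fake that serves a List<Order> through AsQueryable(). The Order, Customer and OrderDto shapes and the FakeOrderRepository and OrderService names are assumptions for illustration, and the unit-of-work scope is left out to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entities and DTO discussed above.
public class Customer
{
    public int CustomerId { get; set; }
    public string Name { get; set; }
}

public class Order
{
    public int OrderId { get; set; }
    public string OrderDetails { get; set; }
    public bool IsActive { get; set; }
    public Customer Customer { get; set; }
}

public class OrderDto
{
    public int OrderId { get; set; }
    public string OrderDetails { get; set; }
    public int CustomerId { get; set; }
    public string CustomerName { get; set; }
}

public interface IOrderRepository
{
    IQueryable<Order> GetOrders();
}

// Test double: exposes an in-memory list through the same IQueryable contract,
// applying the same core-level IsActive filter as the real repository.
public class FakeOrderRepository : IOrderRepository
{
    private readonly List<Order> _orders;

    public FakeOrderRepository(IEnumerable<Order> orders)
    {
        _orders = orders.ToList();
    }

    public IQueryable<Order> GetOrders() => _orders.Where(o => o.IsActive).AsQueryable();
}

// The service decides how the data is consumed: filter, project, then execute.
public class OrderService
{
    private readonly IOrderRepository _repository;

    public OrderService(IOrderRepository repository)
    {
        _repository = repository;
    }

    public OrderDto GetOrderById(int orderId) =>
        _repository.GetOrders()
            .Where(o => o.OrderId == orderId)
            .Select(o => new OrderDto
            {
                OrderId = o.OrderId,
                OrderDetails = o.OrderDetails,
                CustomerId = o.Customer.CustomerId,
                CustomerName = o.Customer.Name
            })
            .Single();
}

public static class Demo
{
    public static void Main()
    {
        var customer = new Customer { CustomerId = 7, Name = "Ada" };
        var repository = new FakeOrderRepository(new[]
        {
            new Order { OrderId = 1, OrderDetails = "Books", IsActive = true, Customer = customer },
            new Order { OrderId = 2, OrderDetails = "Pens", IsActive = false, Customer = customer }
        });

        var dto = new OrderService(repository).GetOrderById(1);
        Console.WriteLine($"{dto.OrderId}: {dto.OrderDetails} for {dto.CustomerName}");
    }
}
```

One caveat with this kind of fake: the query runs as LINQ to Objects, so a null Customer throws here even though EF would translate the same projection into a join. Keep the test data fully populated, or cover those cases with integration tests.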
The most common objections I hear to having repositories return IQueryable are:
"It leaks. The callers need to know about EF and the entity structure." Yes, the callers need to know about EF limitations and the entity structure. However, many alternative approaches such as injecting expression trees for managing filtering, sorting, and eager loading require the same knowledge of the limitations of EF and the entity structure. For instance, injecting an expression to perform filtering still cannot include details that EF cannot execute. Completely abstracting away EF will result in a lot of similar but crippled methods in the repository and/or giving up a lot of the performance and capability that EF brings. If you adopt EF into your project it works a lot better when it is trusted as a first-class citizen within the project.
"As a maintainer of the domain layer, I can optimize code when the repositories are responsible for the criteria." I put this down to premature optimization. The repositories can enforce core-level filtering such as active state or tenancy, leaving the desired querying and retrieval up to the implementing code. It's true that you cannot predict or control how these resulting queries will look against your data source, but query optimization is something that is best done when considering real-world data use. The queries that EF generates reflect the data that is needed which can be further refined and the basis for what indexes will be most effective. The alternative is trying to predict what queries will be used and giving those limited selections to the services to consume with the intention of them requesting further refined "flavours". This often reverts to services running less efficient queries more often to get their data when it's more trouble to introduce new queries into the repositories.

How do you map strings in the database to enums in your model without introducing a second property?

My database has a table like this:
Cats
- CatId INT PK
- Name VARCHAR(100)
- FavoriteToy VARCHAR(100)
And my code looks like this:
Cat.cs
public int CatId { get; set; }
public string Name { get; set; }
public Toy FavoriteToy {get; set; }
StaticVariables.cs
public enum Toy { Box, Ball, StuffedAnimal }
In a normalized database I would use a lookup table in the database to store all the toys and then the Cats table would just store a ToyId. But for this situation it's a lot easier to just store the FavoriteToy as a string even though it will be redundant.
The problem is I don't know how to convert a string in the database to an enum in code without creating a second FavoriteToyString property and having FavoriteToy just be a computed that returns the enum derived from FavoriteToyString.
I've heard this might be possible in the current version of entity framework. Is that true? Can you please show me how to do this?
You may use DTO class and automapper to solve your issue :)
Generally, yes, a lookup table reference is a better option since your data can then comply with referential integrity. That is, there can be no cat records with toys that your enum doesn't contain. (Though your enum would need to be kept in sync with the Toys table.) You can configure EF to store enumerations as a string using a bit of a trick with the mapping:
public class Cat
{
[Key]
public int CatId { get; set; }
public string Name { get; set; }
[Column("FavoriteToy")]
public string FavoriteToyMapped { get; set; }
[NotMapped]
public Toy FavoriteToy
{
get { return (Toy)Enum.Parse(typeof(Toy), FavoriteToyMapped); }
set { FavoriteToyMapped = value.ToString(); }
}
}
The caveat of this approach is that where you might use Linq to Entity to filter on your cat's favorite toy, you need to reference the FavoriteToyMapped value in the query expression because EF/DB won't know what FavoriteToy is.
I.e.
Cats with a favorite toy of "Yarn"
var catsThatLoveYarn = context.Cats.Where(c => c.FavoriteToyMapped == Toy.Yarn.ToString()).ToList();
// not
var catsThatLoveYarn = context.Cats.Where(c => c.FavoriteToy == Toy.Yarn).ToList();
// Will error because EF doesn't map that property.
Once the set of entities has been pulled back from the database and you are working with entity instances, you can further filter them using FavoriteToy. Just be careful not to use it too early, while EF is still composing the query, or you will hit the unmapped-property error.
var threeYearOldCats = context.Cats.Where(c => c.Age == 3).ToList();
var threeYearOldCatsThatLoveYarn = threeYearOldCats.Where(c => c.FavoriteToy == Toy.Yarn).ToList();
This is Ok because the .ToList() in the first query executed the EF-to-SQL, so threeYearOldCats is now a local List<Cat> of cat entities, not an IQueryable<Cat>.
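If the project happens to be on EF Core 2.1 or later (an assumption, since the question doesn't say which version is in use), value conversions let you skip the second property altogether. A minimal configuration sketch, where CatContext is a hypothetical context name:

```csharp
using Microsoft.EntityFrameworkCore;

public enum Toy { Box, Ball, StuffedAnimal }

public class Cat
{
    public int CatId { get; set; }
    public string Name { get; set; }
    public Toy FavoriteToy { get; set; }
}

// Hypothetical context; provider configuration is omitted.
public class CatContext : DbContext
{
    public DbSet<Cat> Cats { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Store the enum by name in the existing VARCHAR(100) column
        // and convert it back to Toy when rows are materialized.
        modelBuilder.Entity<Cat>()
            .Property(c => c.FavoriteToy)
            .HasConversion<string>()
            .HasMaxLength(100);
    }
}
```

With the conversion in place the property is mapped, so a filter such as context.Cats.Where(c => c.FavoriteToy == Toy.Ball) can be composed into the query directly. A stored string that matches no enum member will still fail when the row is read, so the values need to stay in sync either way.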

EF Core - unincluding binary fields

I have an EF Core model that has a binary field
class SomeModel {
string Id;
string otherProperty;
byte[] blob;
};
Usually, when I query the DB, I want to return a list of this Model - and then, on subsequent calls, query just a single entity, but return the blob.
I can't see a way in either data or code first to prevent EF Core paying the cost of retrieving the blob field always.
I really want to be able to say something like:
var list = await Context.SomeModels.ToListAsync();
// later
var item = await Context.SomeModels
.Where(m=>m.Id==someId)
.Include(m=>m.blob)
.FirstOrDefaultAsync();
I think I might have to put the blobs into a 2nd table so I can force an optional join.
The only way you could get separate loading is to move the data to a separate entity with a one-to-one relationship.
It doesn't need to be a separate table though. The most natural choice would seem to be an owned entity, but owned entities are always loaded with their owners, so it has to be a regular entity configured with table splitting. In simple words, it shares the same table with the principal entity.
Applying it to your sample:
Model:
public class SomeModel
{
public string Id { get; set; }
public string OtherProperty { get; set; }
public SomeModelBlob Blob { get; set; }
};
public class SomeModelBlob
{
public string Id { get; set; }
public byte[] Data { get; set; }
}
Configuration:
modelBuilder.Entity<SomeModelBlob>(builder =>
{
builder.HasOne<SomeModel>().WithOne(e => e.Blob)
.HasForeignKey<SomeModelBlob>(e => e.Id);
builder.Property(e => e.Data).HasColumnName("Blob");
builder.ToTable(modelBuilder.Entity<SomeModel>().Metadata.Relational().TableName);
});
Usage:
Code:
var test = context.Set<SomeModel>().ToList();
SQL:
SELECT [s].[Id], [s].[OtherProperty]
FROM [SomeModel] AS [s]
Code:
var test = context.Set<SomeModel>().Include(e => e.Blob).ToList();
SQL:
SELECT [e].[Id], [e].[OtherProperty], [e].[Id], [e].[Blob]
FROM [SomeModel] AS [e]
(the second e.Id in the select looks strange, but I guess we can live with that)

Entity Framework Navigation Property preload/reuse

Why is Entity Framework executing queries when I expect objects can be grabbed from EF cache?
With these simple model classes:
public class Blog
{
public int Id { get; set; }
public string Name { get; set; }
public virtual ICollection<Post> Posts { get; set; }
}
public class Post
{
public int Id { get; set; }
public string Content { get; set; }
public virtual Blog Blog { get; set; }
}
public class BlogDbContext : DbContext
{
public BlogDbContext() : base("BlogDbContext") {}
public DbSet<Blog> Blogs { get; set; }
public DbSet<Post> Posts { get; set; }
}
I profile the queries of following action
public class HomeController : Controller
{
public ActionResult Index()
{
var ctx = new BlogDbContext();
// expecting posts are retrieved and cached by EF
var posts = ctx.Posts.ToList();
var blogs = ctx.Blogs.ToList();
var wholeContent = "";
foreach (var blog in blogs)
foreach (var post in blog.Posts) // <- query is executed
wholeContent += post.Content;
return Content(wholeContent);
}
}
Why doesn't EF re-use the Post entities which I had already grabbed with the var posts = ctx.Posts.ToList(); statement?
Further explanation:
An existing application has an Excel export report. The data is grabbed via a main Linq2Sql query with a tree of includes (~20). Then it is mapped via automapper and additional data from manual caches (which previously slowed down the execution if added to the main query) is added.
Now the data has grown and SQL Server crashes when trying to execute the query, with the error:
The query processor ran out of internal resources and could not produce a query plan.
Lazy loading would result in more than 100,000 queries. So I thought I could preload all the required data with a few simple queries and let EF use the objects automatically from cache during lazy loading.
Initially I also had problems with the limits of the T-SQL IN() clause, which I solved with MoreLinq's Batch extension.
When you have Lazy Loading enabled, EF will still reload the collection navigation properties, probably because EF doesn't know whether you have really loaded all the Posts. E.g., code like
var post = db.Posts.First();
var relatedPosts = post.Blog.Posts.ToList();
Would be tricky, as the Blog would have one Post already loaded, but obviously the others need to be fetched.
In any case, when relying on the Change Tracker to fix up your navigation properties, you should disable Lazy Loading anyway. E.g.:
using (var db = new BlogDbContext())
{
db.Configuration.LazyLoadingEnabled = false;
. . .
Given you have the navigation properties, look at leveraging them in your query to feed Automapper a dynamic object to map to your ViewModel/DTO rather than a top-level entity which you'd be relying on eager loading or waiting on lazy loading.
This is done by issuing a .Select() on your query. To use a simple example of extracting order details including the customer name, list of product names and quantities from order lines, and the delivery address where an Order has a reference to customer, and that customer has a delivery address, a collection of order lines, each with a product...
var orderDetails = dbContext.Orders
    .Where(o => /* Insert criteria */)
    .Select(o => new
    {
        o.OrderId,
        o.OrderNumber,
        o.Customer.CustomerId,
        CustomerName = o.Customer.FullName,
        o.Customer.DeliveryAddress, // Address entity if no further dependencies, or extract fields/relations from the Address.
        OrderLines = o.OrderLines.Select(ol => new
        {
            ol.OrderLineId,
            ProductName = ol.Product.Name,
            ol.Quantity
        })
    }).ToList(); // Ready to feed into Automapper.
With ~20 includes your Select will undoubtedly be a bit more involved, but the idea is to feed SQL Server a query to retrieve just the data you want that you can then feed into Automapper to navigate through where any child relationships can either be flattened by EF or simplified and returned for your mapper to flesh out into the resulting models.
With growing systems you will also want to consider paging with Skip and Take rather than a bare ToList, or at least using Take to ensure there is a cap on the amount of data you return. ToList is a primary performance troll that I look for in EF code because its misuse can kill applications.
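As a rough sketch of the Skip/Take idea, a small extension method can cap every result set. The ToPage name and its 1-based page numbering are assumptions for illustration. It runs here against an in-memory IQueryable, but the same call composes onto an EF query, where a stable OrderBy before paging matters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingExtensions
{
    // Hypothetical helper: returns one page of an already ordered query.
    // pageNumber is 1-based; pageSize caps how many rows are materialized.
    public static List<T> ToPage<T>(this IQueryable<T> orderedQuery, int pageNumber, int pageSize)
    {
        if (pageNumber < 1) throw new ArgumentOutOfRangeException(nameof(pageNumber));
        if (pageSize < 1) throw new ArgumentOutOfRangeException(nameof(pageSize));

        return orderedQuery
            .Skip((pageNumber - 1) * pageSize)
            .Take(pageSize)
            .ToList();
    }
}

public static class Demo
{
    public static void Main()
    {
        // Stand-in for something like dbContext.Orders.OrderBy(o => o.OrderId).
        var query = Enumerable.Range(1, 25).AsQueryable().OrderBy(n => n);

        var secondPage = query.ToPage(pageNumber: 2, pageSize: 10);
        Console.WriteLine(string.Join(", ", secondPage)); // 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
    }
}
```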

EF6 best practice for loading not mapped fields of an entity tracked by the context

Considering the blogging data model:
Blog:
int Id
ICollection<Post> Posts
Post:
int Id
int BlogId
DateTime Date
I want to load Blogs with the date of their latest post (LatestPostDate) and bind them to the UI while they are tracked by the context.
There are some solutions, such as using a DTO, but then the resulting entities are not tracked by the context.
I could also mark LatestPostDate as NotMapped, define a table-valued function, and apply SqlQuery on the DbSet. However, NotMapped fields are not loaded this way.
What are the best practices?
I would rather not add a column to the table, and I also want to avoid calculating the values after loading.
Best practice would be to handle display concerns in a ViewModel.
But as you do not want to map the Entity to another class, let's first take a look at the [NotMapped] variant, using LINQ to calculate the latest post date instead of plain SQL.
using System.ComponentModel.DataAnnotations.Schema; // for [NotMapped]
using System.Linq;
public class Blog {
public int Id { get; set; }
public virtual ICollection<Post> Posts { get; set; }
[NotMapped]
public DateTime? LatestPostDate {
get {
return Posts.OrderBy(p => p.Date).LastOrDefault()?.Date;
}
}
}
This way, the value is calculated only when you access the property LatestPostDate (probably during UI rendering). You can reduce the number of DB accesses by eager loading the Posts, although this will increase the size of the data set you are working with.
var blogs = _dbContext.Blogs.Include(b => b.Posts).ToArray();
But if you use a ViewModel, you can fill the LatestPostDate in one go:
public class BlogViewModel {
public int Id { get; set; }
public DateTime? LatestPostDate { get; set; }
}
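var viewModels = _dbContext.Blogs.Select(b => new BlogViewModel {
    Id = b.Id,
    // Max over a nullable cast yields null for blogs without posts; the ?. operator is not allowed in expression trees
    LatestPostDate = b.Posts.Max(p => (DateTime?)p.Date)
}).ToArray();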
Regarding your concerns that the ViewModel is not tracked by the context: in the edit usecase, load the Entity again using the Id provided by the ViewModel and map the updated properties. This gives you full control over the properties that should be editable. As a bonus, the ViewModel is a good place to do input validation, formatting etc.
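For the latest-post-date projection itself, here is a small self-contained sketch that runs against in-memory data. The BlogQueries helper and the sample dates are made up for illustration; against a real context you would pass the DbSet instead of a list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Post
{
    public int Id { get; set; }
    public int BlogId { get; set; }
    public DateTime Date { get; set; }
}

public class Blog
{
    public int Id { get; set; }
    public List<Post> Posts { get; set; } = new List<Post>();
}

public class BlogViewModel
{
    public int Id { get; set; }
    public DateTime? LatestPostDate { get; set; }
}

public static class BlogQueries
{
    // Hypothetical helper: projects blogs straight into view models.
    // Casting to DateTime? makes Max return null for a blog without posts
    // instead of throwing on an empty sequence.
    public static List<BlogViewModel> ToViewModels(IQueryable<Blog> blogs) =>
        blogs.Select(b => new BlogViewModel
        {
            Id = b.Id,
            LatestPostDate = b.Posts.Max(p => (DateTime?)p.Date)
        }).ToList();
}

public static class Demo
{
    public static void Main()
    {
        var blogs = new List<Blog>
        {
            new Blog
            {
                Id = 1,
                Posts =
                {
                    new Post { Id = 1, BlogId = 1, Date = new DateTime(2020, 1, 1) },
                    new Post { Id = 2, BlogId = 1, Date = new DateTime(2021, 6, 15) }
                }
            },
            new Blog { Id = 2 } // no posts yet
        };

        foreach (var vm in BlogQueries.ToViewModels(blogs.AsQueryable()))
        {
            var latest = vm.LatestPostDate.HasValue
                ? vm.LatestPostDate.Value.ToString("yyyy-MM-dd")
                : "none";
            Console.WriteLine($"Blog {vm.Id}: {latest}");
        }
    }
}
```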