Find most related post in many to many relationship - entity-framework

I have a many to many relationship as following in code first:
public class Post
{
public int Id { get; set; }
public ICollection<Tag> Tags { get; set; }
}
public class Tag
{
public int Id { get; set; }
public ICollection<Post> Posts { get; set; }
}
modelBuilder.Entity<Post>().HasMany(c => c.Tags).WithMany(a => a.Posts);
If I have a Post with its Tags, how can I get the most related posts based on those Tags?

Several ways. My suggestion:
var postsAndTagCounts = context.Posts
.Select(p => new {
PostId = p.Id,
TagCount = p.Tags.Count()
})
.OrderByDescending(p => p.TagCount) // the projected anonymous type exposes TagCount, not Tags
.ToList();

First of all, you need to define what "related" means.
The easiest way is to look for the posts that share the most tags with the given post, so we need to compare the given post against every other post. I won't dwell on the theory; the query would look something like this:
IQueryable<Post> posts = context.Posts.Where(x => x.Id != currentPost.Id)
.OrderByDescending(x => x.Tags.Count(y => currentPost.Tags.Any(z => z.Id == y.Id)));
Here currentPost is the Post instance whose related posts we want to retrieve. The calculation happens in the x.Tags.Count(y => currentPost.Tags.Any(z => z.Id == y.Id)) part, which counts how many tags x shares with currentPost.
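For something a bit more sophisticated, you could use Jaccard similarity: the number of tags the two posts share, divided by the number of distinct tags across the pair (the two tag counts added together, minus the shared tags so they are not counted twice). The cast to double matters, because integer division would round every score below 1 down to 0.
int currentPostTagsCount = currentPost.Tags.Count();
IQueryable<Post> posts = context.Posts.Where(x => x.Id != currentPost.Id)
.OrderByDescending(x => (double)x.Tags.Count(y => currentPost.Tags.Any(z => z.Id == y.Id))
/ (currentPostTagsCount + x.Tags.Count()
- x.Tags.Count(y => currentPost.Tags.Any(z => z.Id == y.Id))));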
Disclaimer: I have not tested the query, so take it with a grain of salt. I will edit it if I get time to test and it turns out not to work.
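To check the ranking logic independently of EF's query translation, both scores can be computed in memory with LINQ to Objects. This is only a sketch, not the answerer's tested query: the RelatedPosts class, the Rank helper and the sample tag IDs are made up for illustration, and each post is reduced to an ID plus a set of tag IDs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RelatedPosts
{
    // Hypothetical helper: scores every other post against the current post's tags.
    // Results are ordered by Jaccard score, then by raw shared-tag count.
    public static List<(int PostId, int Shared, double Jaccard)> Rank(
        int currentPostId,
        IDictionary<int, HashSet<int>> tagIdsByPost)
    {
        var currentTags = tagIdsByPost[currentPostId];

        return tagIdsByPost
            .Where(kv => kv.Key != currentPostId)
            .Select(kv =>
            {
                int shared = kv.Value.Count(t => currentTags.Contains(t));
                int union = currentTags.Count + kv.Value.Count - shared;
                // Guard against two posts that both have no tags.
                double jaccard = union == 0 ? 0.0 : (double)shared / union;
                return (PostId: kv.Key, Shared: shared, Jaccard: jaccard);
            })
            .OrderByDescending(r => r.Jaccard)
            .ThenByDescending(r => r.Shared)
            .ToList();
    }

    public static void Main()
    {
        // Invented sample data: post ID -> tag IDs.
        var tags = new Dictionary<int, HashSet<int>>
        {
            [1] = new HashSet<int> { 10, 20, 30 },             // current post
            [2] = new HashSet<int> { 10, 20 },                 // 2 shared, 3 distinct
            [3] = new HashSet<int> { 10, 20, 30, 40, 50, 60 }, // 3 shared, 6 distinct
            [4] = new HashSet<int> { 70 },                     // nothing shared
        };

        foreach (var r in Rank(1, tags))
            Console.WriteLine($"Post {r.PostId}: shared={r.Shared}, jaccard={r.Jaccard:F2}");
    }
}
```

With this sample data, post 3 shares the most tags with post 1, but post 2 gets the higher Jaccard score because it carries fewer unrelated tags. Which of the two orderings counts as "most related" is the definition you have to pick.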

Related

GroupBy Expression failed to translate

//Model
public class Application
{
[Key]
public int ApplicationId { get; set; }
public DateTime CreatedAt { get; set; }
public DateTime ConfirmedDate { get; set; }
public DateTime IssuedDate { get; set; }
public int? AddedByUserId { get; set; }
public virtual User AddedByUser { get; set; }
public int? UpdatedByUserId { get; set; }
public virtual User UpdatedByuser { get; set; }
public string FirstName { get; set; }
public string MiddleName { get; set; }
public string LastName { get; set; }
public string TRN { get; set; }
public string EmailAddress { get; set; }
public string Address { get; set; }
public int ParishId { get; set; }
public Parish Parish { get; set; }
public int? BranchIssuedId { get; set; }
public BranchLocation BranchIssued { get; set; }
public int? BranchReceivedId { get; set; }
public BranchLocation BranchReceived {get; set; }
}
public async Task<List<Application>> GetApplicationsByNameAsync(string name)
{
if (string.IsNullOrEmpty(name))
return null;
return await _context.Application
.AsNoTracking()
.Include(app => app.BranchIssued)
.Include(app => app.BranchReceived)
.Include(app => app.Parish)
.Where(app => app.LastName.ToLower().Contains(name.ToLower()) || app.FirstName.ToLower()
.Contains(name.ToLower()))
.GroupBy(app => new { app.TRN, app })
.Select(x => x.Key.app)
.ToListAsync()
.ConfigureAwait(false);
}
The above GroupBy expression fails at runtime when I run it from Visual Studio. My objective is to filter the results by names containing a user-supplied string, then group the results by matching TRN numbers and return a list of those applications to the view. I think I am really close but just can't figure out this last bit of the query. Any guidance is appreciated.
Error being presented
InvalidOperationException: The LINQ expression 'DbSet<Application>
.Where(a => a.LastName.ToLower().Contains(__ToLower_0) || a.FirstName.ToLower().Contains(__ToLower_0))
.GroupBy(
source: a => new {
TRN = a.TRN,
app = a
},
keySelector: a => a)' could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to either AsEnumerable(), AsAsyncEnumerable(), ToList(), or ToListAsync()
UPDATE
It seems this is definitely due to a change in how .NET Core 3.x and EF Core work together since recent updates. I had to switch to client evaluation by using AsEnumerable() instead of ToListAsync(). The rest of the query given by Steve py works with this method. Even after reading the docs I had not understood how GroupBy really works in LINQ, so that has helped me a lot. Moving the query to client-side evaluation may cause performance issues, however.
The GroupBy support in EF core is a joke.
This worked perfectly on the server in EF6
var nonUniqueGroups2 = db.Transactions.GroupBy(e => new { e.AccountId, e.OpeningDate })
.Where(grp => grp.Count() > 1).ToList();
In EF core it causes an exception "Unable to translate the given 'GroupBy' pattern. Call 'AsEnumerable' before 'GroupBy' to evaluate it client-side." The message is misleading, do not call AsEnumerable because this should be handled on the server.
I have found a workaround here. An additional Select will help.
var nonUniqueGroups = db.Transactions.GroupBy(e => new { e.AccountId, e.OpeningDate })
.Select(x => new { x.Key, Count = x.Count() })
.Where(x => x.Count > 1)
.ToList();
The drawback of the workaround is that the result set does not contain the items in the groups.
There is an EF Core issue. Please vote on it so they actually fix this.
Based on this:
I want to group by TRN which is a repeating set of numbers eg.12345, in the Application table there may be many records with that same sequence and I only want the very latest row within each set of TRN sequences.
I believe this should satisfy what you are looking for:
return await _context.Application
.AsNoTracking()
.Include(app => app.BranchIssued)
.Include(app => app.BranchReceived)
.Include(app => app.Parish)
.Where(app => app.LastName.ToLower().Contains(name.ToLower()) || app.FirstName.ToLower()
.Contains(name.ToLower()))
.GroupBy(app => app.TRN)
.Select(x => x.OrderByDescending(y => y.CreatedAt).First())
.ToListAsync()
.ConfigureAwait(false);
The GroupBy expression should represent what you want to group by, in your case the TRN. In the Select that follows, x represents each group, which contains the enumerable set of Applications that share a TRN. We order those by CreatedAt descending and pick the newest one using First.
Give that a shot. If it's not quite what you're after, consider adding an example data set to your question, along with the desired output and the output or error this query produces.
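The grouping logic itself can be checked in memory, regardless of whether a given EF Core version manages to translate it to SQL. A minimal sketch with LINQ to Objects follows; the App class is a cut-down stand-in for Application, and the LatestPerTrn helper and sample rows are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cut-down stand-in for the Application entity (illustration only).
public class App
{
    public int ApplicationId { get; set; }
    public string TRN { get; set; }
    public DateTime CreatedAt { get; set; }
}

public static class LatestPerTrn
{
    // Groups by TRN and keeps only the newest row of each group.
    public static List<App> Latest(IEnumerable<App> apps) =>
        apps.GroupBy(a => a.TRN)
            .Select(g => g.OrderByDescending(a => a.CreatedAt).First())
            .ToList();

    public static void Main()
    {
        var apps = new List<App>
        {
            new App { ApplicationId = 1, TRN = "12345", CreatedAt = new DateTime(2019, 1, 1) },
            new App { ApplicationId = 2, TRN = "12345", CreatedAt = new DateTime(2020, 1, 1) },
            new App { ApplicationId = 3, TRN = "67890", CreatedAt = new DateTime(2018, 6, 1) },
        };

        foreach (var a in Latest(apps))
            Console.WriteLine($"{a.TRN}: application {a.ApplicationId}");
    }
}
```

Here the two rows with TRN 12345 collapse into the newer one (application 2), and the single 67890 row is kept as is. Running the same GroupBy after AsEnumerable() gives this behavior on the client, at the cost of loading every matching row into memory first.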
I ran into a similar issue, which I find interesting and frustrating at the same time. It seems the EF team does not support a Where before a GroupBy, so this does not work. I don't understand why it cannot be done, but that seems to be the way it is, and it is forcing me to write procedures instead of building the query in code.
Let me know if you find a way.
Note: GroupBy seems to be supported only when you group first and then filter (a Where on the grouped elements of the complete table, which does not make sense to me).

EF core - parent.InverseParent returns null for some rows

I have a Category table in which each category can have a parent category. I try to iterate over all the categories and get each parent category together with its InverseParent collection, but some parents come back without their InverseParent children for an unknown reason.
Categories.cs
public partial class Categories
{
public Categories()
{
InverseParent = new HashSet<Categories>();
}
public int Id { get; set; }
public int? ParentId { get; set; }
public DateTime CreateDate { get; set; }
public bool? Status { get; set; }
public virtual Categories Parent { get; set; }
public virtual ICollection<Categories> InverseParent { get; set; }
}
This is how I try to iterate them to create a select list items:
var parentCategories = await _context.Categories.
Include(x => x.Parent).
Where(x => x.Status == true).
Where(x => x.Parent != null).
Select(x => x.Parent).
Distinct().
ToListAsync();
foreach (var parent in parentCategories)
{
SelectListGroup group = new SelectListGroup() { Name = parent.Id.ToString() };
foreach (var category in parent.InverseParent)
{
categories.Add(new SelectListItem { Text = category.Id.ToString(), Value = category.Id.ToString(), Group = group });
}
}
So the problem is that some of my parent categories return all their child categories and some don't, and I don't know why.
There are several issues with that code, all of them explained in the Loading Related Data section of the documentation.
First, you didn't ask EF Core to include InverseParent, so it would be more logical to expect it to always be null.
What you get is a result of the following Eager Loading behavior:
Tip
Entity Framework Core will automatically fix-up navigation properties to any other entities that were previously loaded into the context instance. So even if you don't explicitly include the data for a navigation property, the property may still be populated if some or all of the related entities were previously loaded.
Second, since the query changes its initial shape (Select, Distinct), it falls into the Ignored Includes category.
With that being said, you should build the query the other way around, starting directly with the parent categories and including InverseParent:
var parentCategories = await _context.Categories
.Include(x => x.InverseParent)
.Where(x => x.InverseParent.Any(c => c.Status == true)) // to match your query filter
.ToListAsync();
While you are including Include(x => x.Parent), you don't seem to do the same for InverseParent. This might affect your results exactly the way you describe. Would including it fix it?
parentCategories = await _context.Categories.
Include(x => x.Parent).
Include(x => x.InverseParent).
Where(x => x.Status == true).
Where(x => x.Parent != null).
Select(x => x.Parent).
Distinct().
ToListAsync();
foreach (var parent in parentCategories)
{
SelectListGroup group = new SelectListGroup() { Name = parent.Id.ToString() };
foreach (var category in parent.InverseParent)
{
categories.Add(new SelectListItem { Text = category.Id.ToString(), Value = category.Id.ToString(), Group = group });
}
}
UPD: Since you are selecting x => x.Parent anyway it might be necessary to use ThenInclude() method instead.
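If the goal is only to build the grouped select list, another option is to avoid depending on which navigation properties happen to be loaded and group the children by ParentId yourself. The sketch below is a swapped-in technique, not what either answer proposes: it uses plain LINQ to Objects on a flat list (for example, the result of a single unfiltered query), and the Category class and ActiveChildrenByParent helper are simplified, hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the Categories entity (illustration only).
public class Category
{
    public int Id { get; set; }
    public int? ParentId { get; set; }
    public bool? Status { get; set; }
}

public static class CategoryGroups
{
    // Maps each parent ID to the IDs of its active children.
    public static ILookup<int, int> ActiveChildrenByParent(IEnumerable<Category> all) =>
        all.Where(c => c.Status == true && c.ParentId != null)
           .ToLookup(c => c.ParentId.Value, c => c.Id);

    public static void Main()
    {
        // Invented sample rows.
        var all = new List<Category>
        {
            new Category { Id = 1, ParentId = null, Status = true },
            new Category { Id = 2, ParentId = 1, Status = true },
            new Category { Id = 3, ParentId = 1, Status = false },
            new Category { Id = 4, ParentId = null, Status = true },
            new Category { Id = 5, ParentId = 4, Status = true },
        };

        foreach (var group in ActiveChildrenByParent(all))
        {
            string children = string.Join(", ", group);
            Console.WriteLine($"Parent {group.Key}: {children}");
        }
    }
}
```

Each lookup group corresponds to one SelectListGroup, and each element in it to one SelectListItem. Inactive children such as category 3 are filtered out before grouping.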

Entity Framework include collection/reference where

I'm trying to select the latest version of all my clients and load each object with the latest version of their respective payments and the payments respective segment name.
It's a .net Core 2.0 project.
The controller is using:
using System;
using System.Linq;
using Microsoft.AspNetCore.Mvc;
using CBFU.Data;
using CBFU.Models;
using Microsoft.EntityFrameworkCore;
A client is created with no foreign keys.
A payment is created with the foreign keys ClientId and SegmentId.
A segment is created with no foreign keys.
I'm thinking something like:
var clients = _context.Clients
.Where(client => client.IsLatest == 1)
.Include(client => client.Payments
.Select(payment => payment.IsLatest == 1)
.Select(payment => payment.Segment))
.ToList();
But that does not work. Below I've listed a few of the things I tried and whether they worked. I have no examples with .ThenInclude because my IntelliSense does not respond to it.
// 1 This works: Loading payments into clients
var clients = _context.Clients
.Where(client => client.IsLatest == 1)
.Include(client => client.Payments)
.ToList();
// 2 This does NOT work: Loading payment with IsLatest == 0 into clients
var clients = _context.Clients
.Where(client => client.IsLatest == 1)
.Include(client => client.Payments
.Select(p => p.IsLatest == 1))
.ToList();
// 3 This does NOT work: Loading segment into payments into clients
var clients = _context.Clients
.Where(client => client.IsLatest == 1)
.Include(client => client.Payments
.Select(p => p.Segment == 1))
.ToList();
Both 2 and 3 give the same error:
The property expression 'client => {from Payment payment in client.Payments select ([payment].IsLatest == 1)}' is not valid. The expression should represent a property access: 't => t.MyProperty'.For more information on including related data, see http://go.microsoft.com/fwlink/?LinkID=746393.
Source =< Cannot evaluate the exception source>
My classes look as follows:
public class Client
{
public int Id { get; set; }
public int IsLatest { get; set; }
public ICollection<Payment> Payments { get; set; }
}
public class Payment
{
public int Id { get; set; }
public Client Client { get; set; }
[Required]
[Display(Name = "Client")]
public int ClientId { get; set; }
public Segment Segment { get; set; }
[Required]
[Display(Name = "Segment")]
public int SegmentId { get; set; }
public int IsLatest { get; set; }
}
public class Segment
{
public int Id { get; set; }
public string Name { get; set; }
}
Filtered Includes were never supported before EF Core, and (as of the current v2.0) are still not supported by EF Core. EF Core 2.0 introduced model-level query filters, but they apply to all queries and have to be explicitly turned off when not needed. They are also not flexible enough to handle dynamic query-level filtering.
What you can do, though, is combine the so-called navigation property fix-up, eager loading and filtering of loaded entities, all described in the Loading Related Data section of the documentation:
var clientQuery = _context.Clients
.Where(client => client.IsLatest == 1);
var clients = clientQuery.ToList();
clientQuery
.SelectMany(c => c.Payments)
.Include(p => p.Segment)
.Where(p => p.IsLatest == 1)
.Load();

Entity framework 6 join on a groupjoin using lambda

I need to put a Join on a GroupJoin. I have googled a lot to find the answer to my problem, but I cannot find it.
In the database I have templates. I select a template joined to a table of items. There is also a table with one or more rows of files linked to each item; that is the GroupJoin I use, and it works. The problem is that I now also need the table linked to the files table (always exactly one row, never more).
So far I have this, with a Join inside the GroupJoin, but that Join isn't working at all:
DataBundle = _context.DataTemplates.Join(_context.DataItems, DataTemplates => DataTemplates.Id, DataItems => DataItems.DataTemplateId, (DataTemplates, DataItems) => new { DataTemplates, DataItems })
.GroupJoin(_context.DataItemFiles.Join(_context.DataTemplateUploads, DataItemFiles => DataItemFiles.DataTemplateUploadId, DataTemplateUploads => DataTemplateUploads.Id, (DataItemFiles, DataTemplateUploads) => new { DataItemFiles, DataTemplateUploads }), x => x.DataItems.Id, x => x.DataItemFiles.DataItemId, (x, DataItemFiles) => new { x.DataItems, x.DataTemplates, DataItemFiles })
.Where(x => x.DataTemplates.CallName == CallName).Where(x => x.DataItems.WebsiteLanguageId == WebsiteLanguageId)
.Select(x => new DataBundle()
{
DataItemFiles = x.DataItemFiles, //error
DataItemResources = null,
DataItems = x.DataItems,
DataTemplateFields = null,
DataTemplates = x.DataTemplates,
DataTemplateUploads = x.DataTemplateUploads, //can't find, because DataTemplateUploads is linked to DataItemFiles
}).ToList();
public class DataBundle
{
public IEnumerable<DataItemFiles> DataItemFiles { get; set; }
public IEnumerable<DataItemResources> DataItemResources { get; set; }
public DataItems DataItems { get; set; }
public IEnumerable<DataTemplateFields> DataTemplateFields { get; set; }
public DataTemplates DataTemplates { get; set; }
public IEnumerable<DataTemplateUploads> DataTemplateUploads { get; set; }
}
Does anyone know how to solve this?
The DataItemFiles variable here
(x, DataItemFiles) => new { x.DataItems, x.DataTemplates, DataItemFiles }
is actually IEnumerable<anonymous_type> where anonymous_type is the result of the previous Join operator new { DataItemFiles, DataTemplateUploads } (btw, you should use singular form for most of the names, it's really hard to follow which one is single and which one is sequence).
Hence to get the individual parts you need to use projection (Select):
.Select(x => new DataBundle()
{
DataItemFiles = x.DataItemFiles.Select(y => y.DataItemFiles),
// ...
DataTemplateUploads = x.DataItemFiles.Select(y => y.DataTemplateUploads),
// ...
}
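The same shape can be reproduced in memory to see why the extra Select is needed. This is a minimal LINQ to Objects sketch, not the asker's EF model: Item, ItemFile, Upload, the Bundle helper and the sample rows are simplified, hypothetical stand-ins for the question's tables.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupJoinDemo
{
    // Simplified stand-ins for the question's tables (illustration only).
    public class Item { public int Id; public string Name; }
    public class ItemFile { public int ItemId; public int UploadId; public string Path; }
    public class Upload { public int Id; public string Label; }

    public static List<(string ItemName, List<string> Paths, List<string> Labels)> Bundle(
        IEnumerable<Item> items, IEnumerable<ItemFile> files, IEnumerable<Upload> uploads)
    {
        // Inner Join: each file is paired with its single upload.
        var filesWithUpload = files.Join(uploads,
            f => f.UploadId, u => u.Id,
            (f, u) => new { File = f, Upload = u });

        // GroupJoin: each item gets a sequence of those (file, upload) pairs.
        return items.GroupJoin(filesWithUpload,
                i => i.Id, p => p.File.ItemId,
                (i, pairs) => new { Item = i, Pairs = pairs })
            // Split the pairs back into their two halves with Select.
            .Select(x => (
                ItemName: x.Item.Name,
                Paths: x.Pairs.Select(p => p.File.Path).ToList(),
                Labels: x.Pairs.Select(p => p.Upload.Label).ToList()))
            .ToList();
    }

    public static void Main()
    {
        var items = new[] { new Item { Id = 1, Name = "A" }, new Item { Id = 2, Name = "B" } };
        var uploads = new[] { new Upload { Id = 10, Label = "image" }, new Upload { Id = 11, Label = "pdf" } };
        var files = new[]
        {
            new ItemFile { ItemId = 1, UploadId = 10, Path = "a.png" },
            new ItemFile { ItemId = 1, UploadId = 11, Path = "a.pdf" },
        };

        foreach (var b in Bundle(items, files, uploads))
        {
            string paths = string.Join(", ", b.Paths);
            string labels = string.Join(", ", b.Labels);
            Console.WriteLine($"{b.ItemName}: files=[{paths}] uploads=[{labels}]");
        }
    }
}
```

Item B has no files, so GroupJoin still returns it with two empty lists, which is the left-join behavior the original GroupJoin relies on.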

What is right way to deal with "N+1" + Count problem?

Suppose the model is as below:
public class Post
{
public int Id {get; set;}
public virtual ICollection<Comment> Comments {get;set;}
}
In the Posts/Index page, I want to show a list of posts with the comment count of each post (not the total number of comments across all posts).
1: If I use
context.Posts.Include("Comments")
it will load the whole entity for every related comment, when in fact I only need the count of comments.
2: If I get the count of each post one by one:
var commentCount = context.Entry(post)
.Collection(p => p.Comments)
.Query()
.Count();
that is an N+1 problem.
Does anyone know the right way?
Thank you!
Do you need this for your presentation layer / view model? In that case, create a specialized ViewModel:
public class PostListView
{
public Post Post { get; set; }
public int CommentsCount { get; set; }
}
And use query with projection:
var data = context.Posts
.Select(p => new PostListView
{
Post = p,
CommentsCount = p.Comments.Count()
});
And you are done. If you need it you can flatten your PostListView so that it contains Post's properties instead of Post entity.
What about something like this:
public class PostView
{
public String PostName { get; set; }
public Int32 PostCount { get; set; }
}
public static IEnumerable<PostView> GetPosts()
{
var context = new PostsEntities();
IQueryable<PostView> query = from posts in context.Posts
select new PostView
{
PostName = posts.Title,
PostCount = posts.PostComments.Count()
};
return query;
}
Then use something like this:
foreach (PostView post in GetPosts())
{
Console.WriteLine(String.Format("Post Name: {0}, Post Count: {1}", post.PostName, post.PostCount));
}
Should display the list as so:
Post name (12)
Post name (1)
Etc etc