Using HashSet in Entity Framework - entity-framework

I want to know the difference between creating classes with and without a HashSet in the constructor.
Using the Code First approach (4.3), one can create models like this:
public class Blog
{
public int Id { get; set; }
public string Title { get; set; }
public string BloggerName { get; set;}
public virtual ICollection<Post> Posts { get; set; }
}
public class Post
{
public int Id { get; set; }
public string Title { get; set; }
public DateTime DateCreated { get; set; }
public string Content { get; set; }
public int BlogId { get; set; }
public ICollection<Comment> Comments { get; set; }
}
or like this:
public class Customer
{
public Customer()
{
BrokerageAccounts = new HashSet<BrokerageAccount>();
}
public int Id { get; set; }
public string FirstName { get; set; }
public ICollection<BrokerageAccount> BrokerageAccounts { get; set; }
}
public class BrokerageAccount
{
public int Id { get; set; }
public string AccountNumber { get; set; }
public int CustomerId { get; set; }
}
What is the HashSet doing here?
Should I use a HashSet in the first two models as well?
Is there any article that shows how HashSet is applied?

Generally speaking, it is best to use the collection that best expresses your intentions. If you do not specifically intend to use the HashSet's unique characteristics, I would not use it.
It is unordered and does not support lookups by index. Furthermore, it is not as well suited for sequential reads as other collections, and the fact that it allows you to add the same item multiple times without creating duplicates is only useful if you have a reason to use it for that. If that is not your intention, it can hide misbehaving code and make problems difficult to isolate.
The HashSet is mostly useful in situations where insertion and removal times are very important, such as when processing data. It is also extremely useful for comparing sets of data (again when processing) using operations like intersect, except, and union. In any other situation, the cons generally outweigh the pros.
Consider that when working with blog posts, inserts and removes are quite rare compared to reads, and you generally want to read the data in a specific order, anyway. That is more or less the exact opposite of what the HashSet is good at. It is highly doubtful that you would ever intend to add the same post twice, for any reason, and I see no reason why you would use set-based operations on posts in a class like that.
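To make the trade-offs above concrete, here is a minimal sketch using only the base class library (no Entity Framework). The HashSetDemo class and its method names are made up for illustration. It shows how a HashSet silently swallows a duplicate Add made through ICollection<T>, while a List keeps both copies, and how a set operation such as IntersectWith is used.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
public static class HashSetDemo
{
    // Adds the same value twice through the ICollection<T> interface
    // and returns how many items the collection holds afterwards.
    public static int AddTwice(ICollection<string> target, string value)
    {
        target.Add(value);
        target.Add(value); // ignored by HashSet<T>, stored again by List<T>
        return target.Count;
    }

    // Returns the values present in both sequences using a set operation.
    public static HashSet<int> Common(IEnumerable<int> left, IEnumerable<int> right)
    {
        var result = new HashSet<int>(left);
        result.IntersectWith(right);
        return result;
    }
}
```

Passing a new HashSet<string>() and a new List<string>() to AddTwice gives different counts for the same calls, which is exactly the kind of silent difference that can hide misbehaving code.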

The HashSet does not define the type of collection that will be generated when you actually fetch data. This will always be of type ICollection as declared.
The HashSet created in the constructor is to help you avoid NullReferenceExceptions when no records are fetched or exist in the many side of the relationship. It is in no way required.
For example, based on your question, when you use a relationship like
var myCollection = Blog.Posts;
and nothing has ever assigned the collection (for instance on a newly constructed Blog, or when lazy loading does not populate it), myCollection will be null. That is fine until you chain onto it and do something like
var myCollectionCount = Blog.Posts.Count;
which fails with a NullReferenceException.
Whereas
var myCollection = Customer.BrokerageAccounts;
var myCollectionCount = Customer.BrokerageAccounts.Count;
will result in an empty ICollection and a zero count. No exceptions :-)
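The difference can be reproduced without Entity Framework at all. The sketch below uses made-up class names (DemoPost, BlogWithoutInit, BlogWithInit, NullCollectionDemo) and only shows what happens to a plain object whose collection property is, or is not, assigned in the constructor. What EF itself assigns when it materializes an entity is not modelled here.

```csharp
using System;
using System.Collections.Generic;

public class DemoPost
{
    public int Id { get; set; }
}

// No constructor: Posts stays null until something assigns it.
public class BlogWithoutInit
{
    public ICollection<DemoPost> Posts { get; set; }
}

// The constructor assigns an empty HashSet, so Posts is never null.
public class BlogWithInit
{
    public BlogWithInit()
    {
        Posts = new HashSet<DemoPost>();
    }

    public ICollection<DemoPost> Posts { get; set; }
}

public static class NullCollectionDemo
{
    // Returns the item count, or -1 when reading Count throws
    // because the collection reference is null.
    public static int SafeCount(ICollection<DemoPost> posts)
    {
        try
        {
            return posts.Count;
        }
        catch (NullReferenceException)
        {
            return -1;
        }
    }
}
```

Any ICollection<T> implementation assigned in the constructor, such as a List<T>, would avoid the exception equally well. HashSet is just one choice.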

I'm fairly new to Entity Framework but this is my understanding. The collection types can be any type that implements ICollection<T>. In my opinion a HashSet is usually the semantically correct collection type. Most collections should only have one instance of a member (no duplicates) and HashSet best expresses this. I have been writing my classes as shown below and this has worked well so far. Note that the collection is typed as ISet<T> and the setter is private.
public class Customer
{
public Customer()
{
BrokerageAccounts = new HashSet<BrokerageAccount>();
}
public int Id { get; set; }
public string FirstName { get; set; }
public ISet<BrokerageAccount> BrokerageAccounts { get; private set; }
}

Related

Entity framework one foreign key toward two tables - code first

All,
Is it possible to use the same FK for two tables?
It is probably not good practice, but I have two different classes that can both be booked:
public class Course {
public Course() {
BookingRefs = new HashSet<BookingRef>();
}
public long Id { get; set; }
public string Title { get; set; }
// other props ...
[InverseProperty(nameof(BookingRef.Course))]
public virtual ICollection<BookingRef> BookingRefs { get; set; }
}
public class GiftCard {
public GiftCard() {
BookingRefs = new HashSet<BookingRef>();
}
public long Id { get; set; }
public string Prop1 { get; set; }
public int Prop2 { get; set; }
// other props ...
[InverseProperty(nameof(BookingRef.GiftCard))]
public virtual ICollection<BookingRef> BookingRefs { get; set; }
}
// this is the booking reference for a Course or a GiftCard
public class BookingRef {
public BookingRef() {
}
public long Id { get; set; }
// other props ...
/// <summary>The item (usually the course but theoretically anything with a long id)</summary>
public long? ItemId { get; set; }
// maybe a generic Object?
[ForeignKey(nameof(ItemId))]
public Object GiftCard { get; set; }
// maybe 2 items possibly null?
[ForeignKey(nameof(ItemId))]
public Course Course { get; set; }
// maybe 2 items possibly null?
[ForeignKey(nameof(ItemId))]
public GiftCard GiftCard { get; set; }
}
Is it possible to use the same FK for two tables?
No. The relational model doesn't allow that. You can introduce a superclass of all your bookable things and have a FK to that, but you shouldn't do that just to get a single collection rather than multiple.
Think of it from the relational data perspective. How would the database know what table an "Item ID" pointed at? How would it index it?
This would be a case for using a nullable FK to each related table on the booking. These FKs do not need to be exposed as properties on the entity; only the navigation properties do. You can use .Map(x => x.MapKey(...)) in EF6, or .HasForeignKey("") in EF Core to get a shadow property.
Nothing in this schema enforces that a booking is associated with either a course or a gift card but not both. That rule would need to be handled at the application level, and I would recommend a scheduled maintenance task that checks the data for violations (for example, bookings holding both a course ID and a gift card ID).
Alternatively, you can keep the joins "loose" and have the application evaluate them based on a discriminator, similar to an inheritance model (ItemId + ItemType). However, you then have to resolve the relationship load separately in your application based on the ItemType, and you lose all FK, indexing, and data integrity checks in the database. That could be a significant performance and maintenance cost just to save adding a couple of FKs.
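The nullable-FK approach could look roughly like the sketch below. For simplicity it exposes the two keys as ordinary properties rather than shadow properties. The names CourseId, GiftCardId and HasExactlyOneItem are assumptions that do not appear in the question, and the EF mapping itself is left out.

```csharp
using System;

public class Course
{
    public long Id { get; set; }
    public string Title { get; set; }
}

public class GiftCard
{
    public long Id { get; set; }
}

public class BookingRef
{
    public long Id { get; set; }

    // One nullable key per bookable table (hypothetical column names).
    public long? CourseId { get; set; }
    public virtual Course Course { get; set; }

    public long? GiftCardId { get; set; }
    public virtual GiftCard GiftCard { get; set; }

    // Application-level rule: exactly one of the two keys must be set.
    // The database does not enforce this on its own.
    public bool HasExactlyOneItem()
    {
        return CourseId.HasValue ^ GiftCardId.HasValue;
    }
}
```

A check like HasExactlyOneItem could run before saving, with the scheduled maintenance task mentioned above acting as a second line of defence.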

Vertical partitioning in Entity Framework Code First

I use Entity Framework Code First to access my SQL Server database. The "Client" table currently has about 90 columns:
[Table("Clients")]
public class Client
{
public int Id { get; set; }
public string Property1 { get; set; }
...
public string Property90 { get; set; }
}
I have decided to vertically partition this table into 3 tables, because often not all the properties are used. However, I still have legacy code (that I can't change right now) that expects the full Client object with all 90 columns.
My solution so far is to split the Client class into 3 classes corresponding with the new tables, and then use Table Per Type inheritance to allow the legacy code to access the Client object as though the original Clients table is still there:
[Table("Clients")]
public class Client: Client1
{
public int Id { get; set; }
public string Property1 { get; set; }
...
public string Property30 { get; set; }
}
[Table("Client1s")]
public class Client1: Client2
{
public int Id { get; set; }
public string Property31 { get; set; }
...
public string Property60 { get; set; }
}
[Table("Client2s")]
public class Client2
{
public int Id { get; set; }
public string Property61 { get; set; }
...
public string Property90 { get; set; }
}
However, this somehow seems a bit clunky to me.
Is there a more elegant way to achieve vertical partitioning with Entity Framework Code First?
So, considering you refer to the existing approach as being used by "legacy" systems and your new partitioned approach is most likely intended to be the new "correct" way going forwards, my advice would be to keep them as separated as possible.
What you could do is replace the existing monolithic Clients table with a database view that joins the 3 separate, partitioned tables back together. Then you can hook up the existing Client class in all its former glory to the view, leaving your legacy systems relatively untouched, in theory.
I'd also recommend ditching the inheritance idea and leaving the 3 new partitioned classes completely independent of one another. Otherwise, both legacy and new systems will be extremely sensitive to any changes made to classes and properties anywhere in that inheritance chain.
By doing it this way you are free to change and evolve the new classes independently and to modify the underlying table structures however you see fit in the future. Provided you maintain the view's integrity and consistency, your legacy systems should continue to function as normal without any repercussions or regressions, mostly :-)
In my humble experience, shielding the old from changes in the new far outweighs the slight inconvenience of some code duplication and stricter boundaries.
To answer your elegance question more directly I'd say that insulating your classes against unnecessary coupling and avoiding the "ripple effect" yields the more elegant solution.
Anyway, I hope it helps.
I'm going to assume your actual classes have much more meaningful property names, right?
// LEGACY SYSTEMS
[Obsolete("Use the newer partitioned classes going forwards")]
[Table("vClients")]
public class Client
{
public int Id { get; set; }
public string Property1 { get; set; }
...
public string Property90 { get; set; }
}
// NEW STRUCTURES
[Table("Client1s")]
public class Client1
{
public int Id { get; set; }
public string Property1 { get; set; }
...
public string Property30 { get; set; }
}
[Table("Client2s")]
public class Client2
{
public int Id { get; set; }
public string Property31 { get; set; }
...
public string Property60 { get; set; }
}
[Table("Client3s")]
public class Client3
{
public int Id { get; set; }
public string Property61 { get; set; }
...
public string Property90 { get; set; }
}
Updating the base tables via the view can be done using INSTEAD OF triggers, like so:
CREATE TRIGGER ClientsLegacyInsertAdapter on vClients
INSTEAD OF INSERT
AS
BEGIN
BEGIN TRANSACTION
INSERT INTO Client1s
SELECT Id, Property1, Property2, ..., Property30
FROM inserted;
INSERT INTO Client2s
SELECT Id, Property31, Property32, ..., Property60
FROM inserted;
INSERT INTO Client3s
SELECT Id, Property61, Property62, ..., Property90
FROM inserted;
COMMIT TRANSACTION
END
You should be able to use the same technique for UPDATE and DELETE commands also.

Entity Framework Eager Loading Loads Everything

We are using Entity Framework and the Repository pattern in a web-based application to fetch data from the database. Because our business is complex, our models sometimes get complex too, and this causes strange behaviour in Entity Framework's eager loading.
Please imagine our real model like this: we have tables, boxes that are on a table, pencil cases that can be on a table or in a box, and pencils that can be on a table, in a box, or in a pencil case.
We modelled this in our application like this.
public class Table
{
public int TableID{ get; set; }
public virtual ICollection<Box> Boxes{ get; set; }
public virtual ICollection<PencilCases> PencilCases{ get; set; }
public virtual ICollection<Pencils> Pencils{ get; set; }
}
public class Box
{
public int BoxID{ get; set; }
public int TableID{ get; set; }
[ForeignKey("TableID")]
public virtual Table Table{ get; set; }
public virtual ICollection<PencilCases> PencilCases{ get; set; }
public virtual ICollection<Pencils> Pencils{ get; set; }
}
public class PencilCases
{
public int PencilCaseID{ get; set; }
public int? BoxID{ get; set; }
public int TableID{ get; set; }
[ForeignKey("TableID")]
public virtual Table Table{ get; set; }
[ForeignKey("BoxID")]
public virtual Box Box{ get; set; }
public virtual ICollection<Pencils> Pencils{ get; set; }
}
public class Pencils
{
public int PencilID{ get; set; }
public int? PencilCaseID{ get; set; }
public int? BoxID{ get; set; }
public int TableID{ get; set; }
[ForeignKey("TableID")]
public virtual Table Table{ get; set; }
[ForeignKey("BoxID")]
public virtual Box Box{ get; set; }
[ForeignKey("PencilCaseID")]
public virtual PencilCases PencilCase{ get; set; }
}
Our repository pattern implementation is similar to the one in this tutorial: http://www.asp.net/mvc/tutorials/getting-started-with-ef-5-using-mvc-4/implementing-the-repository-and-unit-of-work-patterns-in-an-asp-net-mvc-application
So we call the Get method like this.
var tables = unitOfWork.TableRepository.Get(includeProperties: "Boxes, PencilCases, Boxes.Pencils");
The problem is that the result is very different from what I expected. I expected only the Boxes, PencilCases and Boxes.Pencils collections to be fetched, but all the Pencil entities are fetched from the database, including Pencils, PencilCases.Pencils and Boxes.PencilCases.Pencils. This recursive fetch causes an OutOfMemoryException because of the amount of data.
I can't understand why Entity Framework fetches all Pencils rather than only Boxes.Pencils. I also tried specifying the include list with an Expression instead of a query path, but the result didn't change.
First off: I'm fairly new to EF myself, so please excuse me if the following is not 100% accurate. However, I dealt with this exact problem just a couple of days ago, so hopefully this will help.
The problem is that when EF loads a specific entity, it adds that entity to every part of the data model in which it appears, not just the parts that were explicitly loaded.
This means that every Pencil in Boxes.Pencils that is also in the ICollection of Table.Pencils will be automatically resolved even though you did not specifically ask for it.
By itself that does not present a problem, and it can even be helpful in a user-driven MVC application.
Where it all goes wrong is when you try to do anything that recurses through the data entity, such as mapping the self-recursing data entity to a business model or turning it into JSON/XML.
Now, there are several solutions to this problem:
Implement a mapper / encoder that hashes / remembers each object and only adds it once:
The problem with this one is that it can lead to some hard-to-predict results, especially when you want / need the object in multiple places. Additionally, hashing and comparing every object could be costly.
Implement a mapper / encoder that can be configured to ignore some properties
Relatively simple: if you can specify that you don't want to map or encode Pencil at all, you won't have any issues. The downside is of course that you could still run into a stack overflow if you are not vigilant about specifying the ignored properties.
Implement a mapper / encoder with configurable recursion depth
This is a very simple and pretty decent solution: set a hard limit on recursion depth, either globally or per type, and you won't get any more stack overflows. The downside is that you would still end up with elements that you don't want, and thus an unnecessarily bloated return object.
Implement custom business entities
This is probably the best solution - simply create a new business entity with the offending navigational properties removed. The primary downside is that it would require you to create different business entities for different purposes.
Here is an example:
// Removed Pencils
public class BusinessTable
{
public int TableID{ get; set; }
public IEnumerable<BusinessBox> Boxes{ get; set; }
public IEnumerable<BusinessPencilCases> PencilCases{ get; set; }
}
// Removed Table & PencilCases
public class BusinessBox
{
public int BoxID{ get; set; }
public int TableID{ get; set; }
public IEnumerable<BusinessPencils> Pencils{ get; set; }
}
// Removed Table & Box & Pencils
public class BusinessPencilCases
{
public int PencilCaseID{ get; set; }
public int? BoxID{ get; set; }
public int TableID{ get; set; }
}
// Removed Table, Box, PencilCase
public class BusinessPencils
{
public int PencilID{ get; set; }
public int? PencilCaseID{ get; set; }
public int? BoxID{ get; set; }
public int TableID{ get; set; }
}
Now when you map your Data Entity to this set of Business Entities, you won't get any more errors.
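As a rough illustration of the manual route, here is a cut-down sketch of a hand-written mapper. It uses simplified stand-ins for the Table and Box entities, and the ModelFactory class is a made-up name. The business classes simply have no back-reference, so walking the mapped result cannot loop back on itself.

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified data entities: Box points back at its Table, forming a cycle.
public class Table
{
    public Table() { Boxes = new List<Box>(); }
    public int TableID { get; set; }
    public ICollection<Box> Boxes { get; set; }
}

public class Box
{
    public int BoxID { get; set; }
    public int TableID { get; set; }
    public Table Table { get; set; }
}

// Business entities: same data, but no navigation back to the parent.
public class BusinessBox
{
    public int BoxID { get; set; }
    public int TableID { get; set; }
}

public class BusinessTable
{
    public int TableID { get; set; }
    public List<BusinessBox> Boxes { get; set; }
}

// Hypothetical hand-written mapper.
public static class ModelFactory
{
    public static BusinessBox Create(Box box)
    {
        return new BusinessBox { BoxID = box.BoxID, TableID = box.TableID };
    }

    public static BusinessTable Create(Table table)
    {
        return new BusinessTable
        {
            TableID = table.TableID,
            // Copy only the children; the Table back-reference is dropped.
            Boxes = table.Boxes.Select(b => Create(b)).ToList()
        };
    }
}
```

The cost is one small method per entity type, which is the duplication the libraries discussed in this answer are meant to remove.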
For the mapping itself, there are two options: doing it manually with a mapping (model) factory, or using a library such as ValueInjecter or AutoMapper, both of which are available as NuGet packages.
For AutoMapper:
I don't use AutoMapper, but you'd have to create a config file that looks something like this:
Mapper.CreateMap<Table, BusinessTable>();
Mapper.CreateMap<Box, BusinessBox>();
Mapper.CreateMap<PencilCases, BusinessPencilCases>();
Mapper.CreateMap<Pencils, BusinessPencils>();
And then in your query:
var tables = unitOfWork.TableRepository.Get(includeProperties: "Boxes, PencilCases, Boxes.Pencils");
var result = Mapper.Map<IEnumerable<Table>, IEnumerable<BusinessTable>>(tables);
Or
var tables = unitOfWork.TableRepository.Get(includeProperties: "Boxes, PencilCases, Boxes.Pencils").Project().To<BusinessTable>();
For more info pertaining AutoMapper ( like how to set up a config file ): https://github.com/AutoMapper/AutoMapper/wiki/Getting-started
For ValueInjecter:
var tables = unitOfWork.TableRepository.Get(includeProperties: "Boxes, PencilCases, Boxes.Pencils");
var result = new List<BusinessTable>().InjectFrom(tables);
Or:
var tables = unitOfWork.TableRepository.Get(includeProperties: "Boxes, PencilCases, Boxes.Pencils");
var result = tables.Select(x => (BusinessTable)new BusinessTable().InjectFrom(x));
It might also be worthwhile to look at additional ValueInjecter Injections, like SmartConventionInjection, Deep Cloning, Useful Injections and a ORM with ValueInjecter guide.
I also made a few injections for my own project that may be of use to you, which you can find on my GitHub.
With MaxDepthCloneInjector for example, you can supply a dictionary of (property names, max recursion depth) and it will only map values included in the dictionary, and only until the specified level.
Two more pieces of advice:
If you want a bit more freedom with your queries, you should consider using the Query Expression Syntax for some of your more complex needs. There is also some good information in this answer on SO: How to limit number of related data with Include
If you are planning to run queries that include navigational properties, like the one in your example: STICK WITH EAGER LOADING. With lazy loading, a query like that would lead to the N + 1 problem. As a rule of thumb:
Use lazy loading if you don't need the entire result set right away, for example if you are developing an application where data requirements naturally expand based on the user's interaction with the application.
Use eager loading if you need the entire result set right away, for example in a Web API, or an application that needs to work with the complete entity.
Best of luck,
Felix

EF 5 Code First using inheritance in the class

I am getting this error when trying to run the code below.
Unable to determine the principal end of an association between the
types 'AddressBook.DAL.Models.User' and 'AddressBook.DAL.Models.User'.
The principal end of this association must be explicitly configured
using either the relationship fluent API or data annotations.
The objective is to create a base class that holds the fields common to all the tables.
If I don't use the base class, everything works fine.
namespace AddressBook.DAL.Models
{
public class BaseTable
{
[Required]
public DateTime DateCreated { get; set; }
[Required]
public DateTime DateLastUpdatedOn { get; set; }
[Required]
public virtual int CreatedByUserId { get; set; }
[ForeignKey("CreatedByUserId")]
public virtual User CreatedByUser { get; set; }
[Required]
public virtual int UpdatedByUserId { get; set; }
[ForeignKey("UpdatedByUserId")]
public virtual User UpdatedByUser { get; set; }
[Required]
public RowStatus RowStatus { get; set; }
}
public enum RowStatus
{
NewlyCreated,
Modified,
Deleted
}
}
namespace AddressBook.DAL.Models
{
public class User : BaseTable
{
[Key]
public int UserID { get; set; }
public string UserName { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
public string MiddleName { get; set; }
public string Password { get; set; }
}
}
You need to provide mapping information to EF. The following article describes code-first strategies for different EF entity inheritance models (table-per-type, table-per-hierarchy, etc.). Not all the scenarios are directly what you are doing here, but pay attention to the mapping code because that's what you need to consider (and it's good info in case you want to use inheritance for other scenarios). Note that inheritance does have limitations and costs when it comes to ORMs, particularly with polymorphic associations (which makes the TPC scenario somewhat difficult to manage). http://weblogs.asp.net/manavi/archive/2010/12/24/inheritance-mapping-strategies-with-entity-framework-code-first-ctp5-part-1-table-per-hierarchy-tph.aspx
The other way EF can handle this kind of scenario is by aggregating a complex type into a "fake" compositional relationship. In other words, even though your audit fields are part of some transactional entity table, you can split them out into a common complex type that can be associated with any other entity containing those same fields. The difference here is that you'd actually be encapsulating those fields in another type. So for example, if you moved your audit fields into an "Audit" complex type, you would have something like:
User.Audit.DateCreated
instead of
User.DateCreated
In any case, you still need to provide the appropriate mapping information.
This article here explains how to do this: http://weblogs.asp.net/manavi/archive/2010/12/11/entity-association-mapping-with-code-first-part-1-one-to-one-associations.aspx
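A bare-bones sketch of the complex-type shape described above might look like this. It is only the class layout, with the EF mapping left out. The Audit class here is an assumption about how the fields would be grouped, and it carries only the user IDs rather than the User navigation properties, because as far as I know EF complex types cannot contain navigation properties.

```csharp
using System;

// Hypothetical complex type grouping the shared audit columns.
public class Audit
{
    public DateTime DateCreated { get; set; }
    public DateTime DateLastUpdatedOn { get; set; }
    public int CreatedByUserId { get; set; }
    public int UpdatedByUserId { get; set; }
}

public class User
{
    public User()
    {
        // Instantiate the complex type up front so User.Audit is never null.
        Audit = new Audit();
    }

    public int UserID { get; set; }
    public string UserName { get; set; }

    // Accessed as User.Audit.DateCreated instead of User.DateCreated.
    public Audit Audit { get; set; }
}
```

Any other entity that needs the same columns would declare its own Audit property in the same way, with no base class involved.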

Entity framework data-first: Auto-load associated data

I am trying to use Entity Framework with an existing database. Using the code-first approach, I have gotten the following auto-created model (among others - I've tried to shorten the code to get to the essence of the question):
namespace Fidd.Models
{
using System;
using System.Collections.Generic;
public partial class Movies
{
public Movies()
{
...
this.MoviesPictures = new HashSet<MoviesPictures>();
...
}
public int MovieID { get; set; }
public string MovieName { get; set; }
...
public virtual ICollection<MoviesPictures> MoviesPictures { get; set; }
}
}
So basically this is a 1-n relationship between Movies and MoviesPictures. I am still in the process of learning EF.
If I want to load a single Movie with
var movie = from m in dbContext.Movies
where m.MovieID == 5
select m;
How do I get the MoviesPictures collection to be loaded automatically? Either eager or lazy.
UPDATE: There is actually an extra association:
Movies 1..n MoviesPictures n..1 Pictures
The MoviesPictures model is defined like this:
public partial class MoviesPictures
{
public int MoviePictureID { get; set; }
public int MoviePictureMovieID { get; set; }
public int MoviePicturePictureID { get; set; }
public System.DateTime MoviePictureAddDatetime { get; set; }
public bool MoviePictureRemoved { get; set; }
public Nullable<System.DateTime> MoviePictureRemovedDatetime { get; set; }
public virtual Movies Movies { get; set; }
public virtual Pictures Pictures { get; set; }
}
Is there any way to eager load this second level of association within the same query? I've tried this:
var model = from m in db.Movies.Include("MoviesPictures").Include("Pictures")
where m.MovieID == id
select m;
which does not work: I get a runtime exception saying that Pictures is not declared as a navigation property on Movies. That makes sense, of course; I just don't know how else to specify the query.
Another thing worries me: the Include() call above does not catch any errors at compile time. Is there a way to specify this in a type-safe fashion?
/Carsten
You would want to do:
var movie = from m in dbContext.Movies.Include("MoviesPictures")
where m.MovieID == 5
select m;
This eagerly fetches the related records from the "MoviesPictures" table. You can read a bit more about it here: http://msdn.microsoft.com/en-us/magazine/cc507640.aspx. Searching for "entity framework include" will also turn up a lot more information.
UPDATE
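You should be able to use .Include("MoviesPictures.Pictures"); the dotted path follows the navigation properties from Movies through MoviesPictures to Pictures. If that does not work with your setup, you will need joins; there is a good blog post on joins here: http://weblogs.asp.net/salimfayad/archive/2008/07/09/linq-to-entities-join-queries.aspx
As for type safety: the string-based Include is not checked at compile time. EF 4.1 and later also provide a lambda-based Include extension method in the System.Data.Entity namespace, so with a using directive for that namespace you can write Include(m => m.MoviesPictures.Select(mp => mp.Pictures)), which the compiler does check. Joins, as mentioned, are another way to get closer to type safety.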