Vertical partitioning in Entity Framework Code First - entity-framework

I use Entity Framework Code First to access my SQL Server database. The "Client" table currently has about 90 columns:
[Table("Clients")]
public class Client
{
public int Id { get; set; }
public string Property1 { get; set; }
...
public string Property90 { get; set; }
}
I have decided to vertically partition this table into 3 tables, because often not all the properties are used. However, I still have legacy code (that I can't change right now) that expects the full Client object with all 90 columns.
My solution so far is to split the Client class into 3 classes corresponding with the new tables, and then use Table Per Type inheritance to allow the legacy code to access the Client object as though the original Clients table is still there:
[Table("Clients")]
public class Client: Client1
{
public int Id { get; set; }
public string Property1 { get; set; }
...
public string Property30 { get; set; }
}
[Table("Client1s")]
public class Client1: Client2
{
public int Id { get; set; }
public string Property31 { get; set; }
...
public string Property60 { get; set; }
}
[Table("Client2s")]
public class Client2
{
public int Id { get; set; }
public string Property61 { get; set; }
...
public string Property90 { get; set; }
}
However, this somehow seems a bit clunky to me.
Is there a more elegant way to achieve vertical partitioning with Entity Framework Code First?

So, considering you refer to the existing approach as being used by "legacy" systems and your new partitioned approach is most likely intended to be the new "correct" way going forwards, my advice would be to keep them as separated as possible.
What you could do is replace the existing monolithic Clients table with a database view that joins the 3 separate, partitioned tables back together. Then you can hook up the existing Client class in all its former glory to the view, leaving your legacy systems relatively untouched, in theory.
I'd also recommend ditching the inheritance idea and leaving the 3 new partitioned classes completely independent of one another. Otherwise, both legacy and new systems will be extremely sensitive to any changes being made to classes and properties within that entire inheritance chain.
By doing it this way you are free to change and evolve the new classes independently and modify any underlying table structures however you see fit in the future. Provided you maintain the view's integrity and consistency, your legacy systems should continue to function as normal without any repercussions or regressions, mostly :-)
In my humble experience, the benefit of shielding the old from changes in the new far outweighs the slight inconvenience of having some code duplication and stricter boundaries.
To answer your elegance question more directly I'd say that insulating your classes against unnecessary coupling and avoiding the "ripple effect" yields the more elegant solution.
Anyway, I hope it helps.
I'm going to assume your actual classes have much more meaningful property names, right?
// LEGACY SYSTEMS
[Obsolete("Use the newer partitioned classes going forwards")]
[Table("vClients")]
public class Client
{
public int Id { get; set; }
public string Property1 { get; set; }
...
public string Property90 { get; set; }
}
// NEW STRUCTURES
[Table("Client1s")]
public class Client1
{
public int Id { get; set; }
public string Property1 { get; set; }
...
public string Property30 { get; set; }
}
[Table("Client2s")]
public class Client2
{
public int Id { get; set; }
public string Property31 { get; set; }
...
public string Property60 { get; set; }
}
[Table("Client3s")]
public class Client3
{
public int Id { get; set; }
public string Property61 { get; set; }
...
public string Property90 { get; set; }
}
Updating the base tables via the view can be done using INSTEAD OF triggers, like so:
CREATE TRIGGER ClientsLegacyInsertAdapter on vClients
INSTEAD OF INSERT
AS
BEGIN
BEGIN TRANSACTION
INSERT INTO Client1s
SELECT Id, Property1, Property2, ..., Property30
FROM inserted;
INSERT INTO Client2s
SELECT Id, Property31, Property32, ..., Property60
FROM inserted;
INSERT INTO Client3s
SELECT Id, Property61, Property62, ..., Property90
FROM inserted;
COMMIT TRANSACTION
END
You should be able to use the same technique for UPDATE and DELETE commands also.
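To make the role of the view concrete, here is a minimal sketch that reproduces the same join over in-memory lists with LINQ. The classes are trimmed-down, hypothetical stand-ins (one property per partition), and LegacyClient and LegacyClientView are names made up for this illustration; in the approach described above the join would live in the database view, not in application code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down stand-ins for the three partitioned classes.
public class Client1 { public int Id { get; set; } public string Property1 { get; set; } }
public class Client2 { public int Id { get; set; } public string Property31 { get; set; } }
public class Client3 { public int Id { get; set; } public string Property61 { get; set; } }

// The legacy shape that the view is expected to expose.
public class LegacyClient
{
    public int Id { get; set; }
    public string Property1 { get; set; }
    public string Property31 { get; set; }
    public string Property61 { get; set; }
}

public static class LegacyClientView
{
    // Inner-joins the three partitions on Id, as the view's SELECT would.
    public static List<LegacyClient> Join(
        IEnumerable<Client1> part1,
        IEnumerable<Client2> part2,
        IEnumerable<Client3> part3)
    {
        var query =
            from a in part1
            join b in part2 on a.Id equals b.Id
            join c in part3 on a.Id equals c.Id
            select new LegacyClient
            {
                Id = a.Id,
                Property1 = a.Property1,
                Property31 = b.Property31,
                Property61 = c.Property61
            };
        return query.ToList();
    }
}
```

Because this is an inner join, a client that is missing from any one partition drops out of the result. The INSTEAD OF INSERT trigger above writes to all three tables for exactly that reason.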

Related

EF6 Code first - skip binary (or any other) columns during load()

I have the ReportingActivity entity class.
public class ReportingActivity
{
public int ReportingActivityID { get; set; }
public DateTime ReportingActivitySend { get; set; }
public string Remark { get; set; }
public string SendersCSV { get; set; }
public string MailSenderStatus { get; set; }
public long RptGenerationCostMiliseconds { get; set; }
public DateTime RptGeneratedDateTime { get; set; }
public string RptGeneratedByWinUser { get; set; }
public string RptGeneratedOnMachine { get; set; }
public Int64 Run { get; set; }
public byte[] AttachmentFile { get; set; }
public virtual Report Report { get; set; }
public virtual Employee Employee { get; set; }
public virtual ReportingTask ReportingTask { get; set; }
}
I use this code to load data:
ctxDetail = new ReportingContext();
ctxDetail.ReportingActivity
.Where(x => x.Employee.EmployeeID == currentEmployee.EmployeeID)
.Load();
My code gets all the columns in (like SELECT * FROM... )
My question is how to skip the byte[] column. Ideally, please recommend a way to improve my code so that I can specify the exact list of columns to load.
Normally when dealing with a schema where records have large, seldom accessed details, it is beneficial to split those details off into a separate table, as David mentions, with a one-to-one relationship back to the main record. These can be eager or lazy loaded as desired when they are needed.
If changing the schema is not practical then another option when retrieving data from the table is to utilize Projection via Select to populate a view model containing just the fields you need, excluding the larger fields. This will help speed up things like reads for views, however for things like performing updates you will still need to load the entire entity including the large fields to ensure you don't accidentally overwrite/erase data. It is possible to perform updates without loading this data, but it will add a bit of complexity and risk of introducing bugs if mismanaged later.
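As a rough sketch of the projection idea, the helper below selects only two fields into a view model. ReportingActivityRow and ReportingQueries are hypothetical names introduced for this example, and the entity is trimmed to three properties. The code runs against any IQueryable; when the source is an EF query, the expectation is that the generated SQL lists only the columns named in the Select, so AttachmentFile is never read.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down copy of the entity from the question.
public class ReportingActivity
{
    public int ReportingActivityID { get; set; }
    public string Remark { get; set; }
    public byte[] AttachmentFile { get; set; }
}

// Hypothetical view model holding only the fields a list screen needs.
public class ReportingActivityRow
{
    public int ReportingActivityID { get; set; }
    public string Remark { get; set; }
}

public static class ReportingQueries
{
    // Projects each activity to the slim view model, leaving out AttachmentFile.
    public static IQueryable<ReportingActivityRow> WithoutAttachment(
        IQueryable<ReportingActivity> source)
    {
        return source.Select(x => new ReportingActivityRow
        {
            ReportingActivityID = x.ReportingActivityID,
            Remark = x.Remark
        });
    }
}
```

With the context from the question this would be called as ReportingQueries.WithoutAttachment(ctxDetail.ReportingActivity.Where(...)).ToList(). Note that the result is a list of view models, not tracked entities, so it replaces Load() only for read scenarios.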
You can use Table Splitting, and optionally Lazy Loading to have only commonly needed columns loaded.
See
https://learn.microsoft.com/en-us/ef/core/modeling/table-splitting
This is for EF Core but it works the same on EF6 code-first.

Nested Properties with Inheritance

Online shop I am working on has entity Order that has member DeliveryDetails.
The purpose of DeliveryDetails is to contain data which is specific to delivery method selected by user (e.g. Shipping or Pick Up From Store), while some details are common for all methods (e.g. Firstname, Lastname, PhoneNumber). I was thinking about structure similar to the following using inheritance:
public class Order {
// ....other props...
public DeliveryMethodType DeliveryMethodType { get; set; }
public DeliveryDetailsBase DeliveryDetails { get; set; }
}
public class DeliveryDetailsBase
{
public int Id { get; set; }
public string CustomerId { get; set; }
public Order Order { get; set; }
public int OrderId { get; set; }
public string Firstname { get; set; }
public string Lastname { get; set; }
public string PhoneNumber { get; set; }
}
public class DeliveryDetailsShipping : DeliveryDetailsBase
{
public string Street { get; set; }
public string Building { get; set; }
public string Appartment { get; set; }
public string PostalCode { get; set; }
public string City { get; set; }
public string Country { get; set; }
}
public class DeliveryDetailsPickupFromStore : DeliveryDetailsBase
{
public string StoreCode { get; set; }
}
However, I can't figure out how to assign a different type of delivery details to the DeliveryDetails prop depending on which method the customer selected, or how to fit this into Entity Framework on ASP.NET Core.
Workarounds I have already tried:
-> (1). Creating a "super class" containing props for ALL delivery methods and populating in the db only those that are needed for the selected delivery method (selection via setting enum DeliveryMethodType). OUTCOME: works, but with 1 big and ugly table featuring multiple nulls.
-> (2). In Order, creating prop DeliveryDetails which in turn embraces DeliveryDetailsPickupFromStoreDATA & DeliveryDetailsShippingDATA. OUTCOME: works, but with several related tables and quite a lot of ugly code checking selected type from enum, instantiating specific subclass for chosen delivery method and setting to null other unused subclasses.
TO SUM UP: Is there any more elegant and feasible way to organize this?
Is there any more elegant and feasible way to organize this?
Keep it simple, and inheritance isn't usually simple. :)
As a general rule I opt for composition over inheritance. It's easier to work with. Given an order that needs to be delivered to an address or to a store:
public class Order
{
public DeliveryMethod DeliveryMethod { get; set; } = DeliveryMethod.None;
public virtual OrderDeliveryAddress OrderDeliveryAddress { get; set; } // should never be null.
public virtual OrderDeliveryStore OrderDeliveryStore { get; set; } // not null if delivery mode = store.
}
public class Address
{
public string Street { get; set; }
public string Building { get; set; }
public string Appartment { get; set; }
public string PostalCode { get; set; }
public string City { get; set; }
public string Country { get; set; }
}
public class OrderDeliveryAddress
{
public virtual Order Order { get; set; }
public virtual Address Address { get; set; }
}
public class Store
{
public int StoreId { get; set; }
public virtual Address Address { get; set; }
}
public class OrderDeliveryStore
{
public virtual Order Order { get; set; }
public virtual Store Store { get; set; }
}
Where DeliveryMethod is an Enum. { None = 0, ToAddress, ToStore }
When an order is placed the operator can choose to deliver it to an address, selecting the address of the customer, or entering a new address record; or they can deliver it to a store which can also set the OrderDeliveryAddress to the address of the store. You can establish checks in the database/system to ensure that the data integrity for the delivery method and referenced OrderDeliveryAddress/OrderDeliveryStore are in sync and raise any mismatches that might appear.
One consideration would be that when it comes to deliveries, you will probably want to clone a new Address record based on the customer address, or store address as applicable at the time of ordering rather than referencing their current address record by ID. The reason would be for historical integrity. An order will have been delivered to the address at that point in time, and if a customer address or store address changes in the future, past orders should still show the address that order was delivered.
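The two points above (keeping the delivery method and the references in sync, and cloning the address at order time) might look roughly like this at the application level. OrderDeliveryRules, IsConsistent and Snapshot are hypothetical names, the classes are trimmed copies of the ones above, and the exact rules are an assumption that you would adjust to your own requirements.

```csharp
using System;

public enum DeliveryMethod { None = 0, ToAddress, ToStore }

public class Address
{
    public string Street { get; set; }
    public string PostalCode { get; set; }
    public string City { get; set; }
    public string Country { get; set; }
}

public class Store { public int StoreId { get; set; } public Address Address { get; set; } }
public class OrderDeliveryAddress { public Address Address { get; set; } }
public class OrderDeliveryStore { public Store Store { get; set; } }

public class Order
{
    public DeliveryMethod DeliveryMethod { get; set; } = DeliveryMethod.None;
    public OrderDeliveryAddress OrderDeliveryAddress { get; set; }
    public OrderDeliveryStore OrderDeliveryStore { get; set; }
}

public static class OrderDeliveryRules
{
    // Assumed rule set: a store delivery needs a store reference; an address
    // delivery needs an address and no store; otherwise no store may be set.
    public static bool IsConsistent(Order order)
    {
        switch (order.DeliveryMethod)
        {
            case DeliveryMethod.ToStore:
                return order.OrderDeliveryStore != null;
            case DeliveryMethod.ToAddress:
                return order.OrderDeliveryAddress != null && order.OrderDeliveryStore == null;
            default:
                return order.OrderDeliveryStore == null;
        }
    }

    // Copies the address values into a new object, so later edits to the
    // customer's or store's address do not alter what the order recorded.
    public static Address Snapshot(Address source)
    {
        return new Address
        {
            Street = source.Street,
            PostalCode = source.PostalCode,
            City = source.City,
            Country = source.Country
        };
    }
}
```

A check like IsConsistent could run before saving an order, while Snapshot would be called when the order is placed so the order keeps its own address record.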
At the time of writing, EF Core had only implemented Table Per Hierarchy (TPH) inheritance.
Table Per Type (TPT) was still an open ticket then; it has since been added in EF Core 5.0.
Table Per Concrete Type (TPC) was also still an open ticket; it has since been added in EF Core 7.0.
So, if TPH meets your requirements, you can follow this guide.
Basically, one table will be used and an extra column called Discriminator will be used to determine which implementation the record corresponds to.
If you are just getting started with Entity, my recommendation would be to not use inheritance and just use nullable columns for data that may or may not be needed depending on the type.

Entity framework one foreign key toward two tables - code first

All,
Is it possible to use the same FK for two tables?
It is probably not good practice, but I have two different classes that can both be booked:
public class Course {
public Course() {
BookingRefs = new HashSet<BookingRef>();
}
public long Id { get; set; }
public string Title { get; set; }
// other props ...
[InverseProperty(nameof(BookingRef.Course))]
public virtual ICollection<BookingRef> BookingRefs { get; set; }
}
public class GiftCard {
public GiftCard() {
BookingRefs = new HashSet<BookingRef>();
}
public long Id { get; set; }
public string Prop1 { get; set; }
public int Prop2 { get; set; }
// other props ...
[InverseProperty(nameof(BookingRef.Course))]
public virtual ICollection<BookingRef> BookingRefs { get; set; }
}
// this is the booking reference for a Course or a GiftCard
public class BookingRef {
public BookingRef() {
}
public long Id { get; set; }
// other props ...
/// <summary>The item (usually the course but theoretically anything with a long id)</summary>
public long? ItemId { get; set; }
// maybe a generic Object?
[ForeignKey(nameof(ItemId))]
public Object GiftCard { get; set; }
// maybe 2 items possibly null?
[ForeignKey(nameof(ItemId))]
public Course Course { get; set; }
// maybe 2 items possibly null?
[ForeignKey(nameof(ItemId))]
public GiftCard GiftCard { get; set; }
}
Is it possible to use the same FK for two tables
No. The relational model doesn't allow that. You can introduce a superclass of all your bookable things and have a FK to that, but you shouldn't do that just to get a single collection rather than multiple.
Think of it from the relational data perspective. How would the database know what table an "Item ID" pointed at? How would it index it?
This would be a case for using a nullable FK to each related table on the booking. These FKs do not need to be exposed as properties on the entity; only the navigation properties do. You can use .Map(x => x.MapKey(...)) in EF6, or .HasForeignKey("...") in EF Core, where naming a column that has no matching property creates a shadow property.
This does not enforce if you want a booking to only be associated to a course or a gift card but not both. That would need to be catered for at the application level, and I would recommend using a scheduled maintenance task to evaluate the data for violations to that rule. (Look for bookings holding both a course ID and a gift card ID for example)
You can alternatively keep the joins "loose" and evaluated by the application based on a discriminator similar to an inheritance model. (ItemId + ItemType) However you have to resolve the relationship load separately in your application based on the ItemType and lose out on any FK, indexing, and data integrity checks in the database. This could be a significant performance & maintenance cost to save adding a couple FKs.
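The application-level rule mentioned above (a booking points at a course or a gift card, but not both) could be checked along these lines. For simplicity this sketch exposes the two nullable keys as ordinary properties named CourseId and GiftCardId, which is an assumption; with shadow properties the check would have to read the key values another way. BookingRules is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down booking with one nullable FK per bookable type (assumed names).
public class BookingRef
{
    public long Id { get; set; }
    public long? CourseId { get; set; }
    public long? GiftCardId { get; set; }
}

public static class BookingRules
{
    // True when exactly one of the two keys is set.
    public static bool HasExactlyOneTarget(BookingRef booking)
    {
        return booking.CourseId.HasValue ^ booking.GiftCardId.HasValue;
    }

    // Returns the bookings that reference both targets or neither,
    // which is what a scheduled maintenance task would report.
    public static List<BookingRef> FindViolations(IEnumerable<BookingRef> bookings)
    {
        return bookings.Where(b => !HasExactlyOneTarget(b)).ToList();
    }
}
```

HasExactlyOneTarget could be called before saving a booking, and FindViolations shows the shape of the periodic scan for rows that slipped through.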

EF 5 Code First using Inheritance in the class

I am getting the following error when trying to run this code:
Unable to determine the principal end of an association between the
types 'AddressBook.DAL.Models.User' and 'AddressBook.DAL.Models.User'.
The principal end of this association must be explicitly configured
using either the relationship fluent API or data annotations.
The objective is to create a base class that holds the fields common to all the tables.
If I don't use the base class, everything works fine.
namespace AddressBook.DAL.Models
{
public class BaseTable
{
[Required]
public DateTime DateCreated { get; set; }
[Required]
public DateTime DateLastUpdatedOn { get; set; }
[Required]
public virtual int CreatedByUserId { get; set; }
[ForeignKey("CreatedByUserId")]
public virtual User CreatedByUser { get; set; }
[Required]
public virtual int UpdatedByUserId { get; set; }
[ForeignKey("UpdatedByUserId")]
public virtual User UpdatedByUser { get; set; }
[Required]
public RowStatus RowStatus { get; set; }
}
public enum RowStatus
{
NewlyCreated,
Modified,
Deleted
}
}
namespace AddressBook.DAL.Models
{
public class User : BaseTable
{
[Key]
public int UserID { get; set; }
public string UserName { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
public string MiddleName { get; set; }
public string Password { get; set; }
}
}
You need to provide mapping information to EF. The following article describes code-first strategies for different EF entity inheritance models (table-per-type, table-per-hierarchy, etc.). Not all the scenarios are directly what you are doing here, but pay attention to the mapping code because that's what you need to consider (and it's good info in case you want to use inheritance for other scenarios). Note that inheritance does have limitations and costs when it comes to ORMs, particularly with polymorphic associations (which makes the TPC scenario somewhat difficult to manage). http://weblogs.asp.net/manavi/archive/2010/12/24/inheritance-mapping-strategies-with-entity-framework-code-first-ctp5-part-1-table-per-hierarchy-tph.aspx
The other way EF can handle this kind of scenario is by aggregating a complex type into a "fake" compositional relationship. In other words, even though your audit fields are part of some transactional entity table, you can split them out into a common complex type which can be associated with any other entity that contains those same fields. The difference here is that you'd actually be encapsulating those fields in another type. So for example, if you moved your audit fields into an "Audit" complex type, you would have something like:
User.Audit.DateCreated
instead of
User.DateCreated
In any case, you still need to provide the appropriate mapping information.
This article here explains how to do this: http://weblogs.asp.net/manavi/archive/2010/12/11/entity-association-mapping-with-code-first-part-1-one-to-one-associations.aspx
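A minimal sketch of the complex-type variant could look like this. The Audit class is hypothetical and keeps only scalar fields; the CreatedByUser and UpdatedByUser navigation properties are left out because, as far as I know, EF complex types cannot contain navigation properties. The columns are expected to stay in the owning entity's table, typically named Audit_DateCreated and so on by convention, but treat that naming as an assumption to verify against your EF version.

```csharp
using System;
using System.ComponentModel.DataAnnotations.Schema;

// Hypothetical complex type holding the shared audit fields.
// It has no key of its own and is stored in the owning entity's table.
[ComplexType]
public class Audit
{
    public DateTime DateCreated { get; set; }
    public DateTime DateLastUpdatedOn { get; set; }
    public int CreatedByUserId { get; set; }
    public int UpdatedByUserId { get; set; }
}

public class User
{
    public User()
    {
        // Initialized here so the complex property is never null.
        Audit = new Audit();
    }

    public int UserID { get; set; }
    public string UserName { get; set; }
    public Audit Audit { get; set; }
}
```

With this shape the fields are reached as user.Audit.DateCreated, and any other entity can reuse the same Audit type without sharing a base class.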

using hashset in entity framework

I want to know what is the difference between creating classes with or without using "hashset" in constructor.
Using the code first approach (4.3) one can create models like this:
public class Blog
{
public int Id { get; set; }
public string Title { get; set; }
public string BloggerName { get; set;}
public virtual ICollection<Post> Posts { get; set; }
}
public class Post
{
public int Id { get; set; }
public string Title { get; set; }
public DateTime DateCreated { get; set; }
public string Content { get; set; }
public int BlogId { get; set; }
public ICollection<Comment> Comments { get; set; }
}
or can create models like this :
public class Customer
{
public Customer()
{
BrokerageAccounts = new HashSet<BrokerageAccount>();
}
public int Id { get; set; }
public string FirstName { get; set; }
public ICollection<BrokerageAccount> BrokerageAccounts { get; set; }
}
public class BrokerageAccount
{
public int Id { get; set; }
public string AccountNumber { get; set; }
public int CustomerId { get; set; }
}
What is hashset doing here?
Should I use hashset in the first two models also?
Is there any article which shows the application of hashset?
Generally speaking, it is best to use the collection that best expresses your intentions. If you do not specifically intend to use the HashSet's unique characteristics, I would not use it.
It is unordered and does not support lookups by index. Furthermore, it is not as well suited for sequential reads as other collections, and the fact that it allows you to add the same item multiple times without creating duplicates is only useful if you have a reason to use it for that. If that is not your intention, it can hide misbehaving code and make problems difficult to isolate.
The HashSet is mostly useful in situations where insertion and removal times are very important, such as when processing data. It is also extremely useful for comparing sets of data (again when processing) using operations like intersect, except, and union. In any other situation, the cons generally outweigh the pros.
Consider that when working with blog posts, inserts and removes are quite rare compared to reads, and you generally want to read the data in a specific order, anyway. That is more or less the exact opposite of what the HashSet is good at. It is highly doubtful that you would ever intend to add the same post twice, for any reason, and I see no reason why you would use set-based operations on posts in a class like that.
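To illustrate the two HashSet characteristics discussed above, here is a small sketch using only the standard library. HashSetDemo and its method names are made up for this example.

```csharp
using System;
using System.Collections.Generic;

public static class HashSetDemo
{
    // Adds the same value twice and returns how many items the set then holds.
    public static int CountAfterAddingTwice(string value)
    {
        var set = new HashSet<string>();
        bool addedFirst = set.Add(value);   // true: the value was not yet present
        bool addedSecond = set.Add(value);  // false: silently ignored, no exception
        if (!addedFirst || addedSecond)
        {
            throw new InvalidOperationException("Unexpected HashSet.Add result.");
        }
        return set.Count;
    }

    // Returns the values present in both sequences, using a set operation.
    public static HashSet<int> Common(IEnumerable<int> left, IEnumerable<int> right)
    {
        var result = new HashSet<int>(left);
        result.IntersectWith(right);
        return result;
    }
}
```

The silent second Add is the behavior that can mask a bug if duplicates were never supposed to happen, while IntersectWith is the kind of set operation that rarely applies to something like a blog's posts.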
The HashSet does not define the type of collection that will be generated when you actually fetch data. This will always be of type ICollection as declared.
The HashSet created in the constructor is to help you avoid NullReferenceExceptions when no records are fetched or exist in the many side of the relationship. It is in no way required.
For example, based on your question, when you try to use a relationship like this (where blog is a Blog instance)...
var myCollection = blog.Posts;
If no Posts were fetched or exist, then myCollection will be null. Which is OK, until you chain things and do something like
var myCollectionCount = blog.Posts.Count;
which will error with a NullReferenceException.
Whereas
var myCollection = customer.BrokerageAccounts;
var myCollectionCount = customer.BrokerageAccounts.Count;
will result in an empty ICollection and a zero count. No exceptions :-)
I'm fairly new to Entity Framework but this is my understanding. The collection types can be any type that implements ICollection<T>. In my opinion a HashSet is usually the semantically correct collection type. Most collections should only have one instance of a member (no duplicates) and HashSet best expresses this. I have been writing my classes as shown below and this has worked well so far. Note that the collection is typed as ISet<T> and the setter is private.
public class Customer
{
public Customer()
{
BrokerageAccounts = new HashSet<BrokerageAccount>();
}
public int Id { get; set; }
public string FirstName { get; set; }
public ISet<BrokerageAccount> BrokerageAccounts { get; private set; }
}