Best practice for Ids Entity framework code first - entity-framework

So I stumbled upon this article earlier today
https://blogs.msdn.microsoft.com/azuremobile/2014/05/22/tables-with-integer-keys-and-the-net-backend/
In the article, the author makes a comment that got my attention. He said:
Note: Typically, when starting from an Entity Framework Code-First model in your .NET Backend, you would use string ids
From what I've read, using string Ids can be a performance issue as your table grows. So I would just like to know whether this was just the author's opinion or whether it is a standard. If it is the latter, I would like to know the reasons behind it.

IMHO the identity field should be numeric, for performance reasons: matching an int is much faster than matching a string, and a numeric field takes far less space than a string one.

Technically, yes, you can use a string as the primary key, and if a string genuinely makes sense as the primary key then you should probably use it. But take a few considerations into account:
Numeric comparison is faster than string comparison.
The longer the string, the more expensive the comparison.
If you must use a string as the primary key, set its length explicitly, e.g. MaxLength = 20, which maps to nvarchar(20):
public class User
{
    [Key, DatabaseGenerated(DatabaseGeneratedOption.None), MaxLength(20)]
    public string UserId { get; set; }
    ....
}
This will help you avoid some performance issues.
You can also change the generated key from nvarchar to varchar by running a command through DbContext.Database.ExecuteSqlCommand; this saves space, since each character then takes one byte instead of two.
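A minimal sketch of that approach (assuming EF6, a table named dbo.Users, and a context class called AppDbContext; none of these names come from the original post, and if the column is already the primary key you may need to drop and re-create the PK constraint around the statement):
using (var db = new AppDbContext())
{
    // Re-type the generated key column as single-byte varchar(20).
    db.Database.ExecuteSqlCommand(
        "ALTER TABLE dbo.Users ALTER COLUMN UserId varchar(20) NOT NULL");
}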
Alternatively, with code first you can change the column data type as follows:
[Key, DatabaseGenerated(DatabaseGeneratedOption.None), Column(TypeName = "varchar"), MaxLength(20)]
public string UserId { get; set; }
....

Related

EF Code-First Design: Common Collection Type on different parent types/tables

I need help with a design question using Entity Framework 6.1.3 from a Code-First perspective.
We have tables called "Businesses", "People", and "PhoneNumbers". A business can have one-to-many phone numbers and a person can have one-to-many phone numbers. I'm struggling to determine the best way to implement this. The following solutions have been explored, but neither strikes me as the "obvious" solution. Can you please offer advice?
Use a common Phone table to hold numbers for the Business and People.
In this solution, the Phone table would have RI to the People table and to the Business table. The ID fields would be nullable so that when it is a business phone, the participant ID would be null and vice-versa:
public Nullable<int> ParticipantID { get; set; }
public Participant Participant { get; set; }
public Nullable<int> BusinessID { get; set; }
public Business Business { get; set; }
Create separate tables for the Business (BusinessPhone) and Person (PersonPhone) phone numbers. Both phone tables could inherit from the original phone table but each would have separate RI statements to the corresponding Business or Person. This way, neither table would need a nullable key.
For example, the PersonPhone table would look something like:
public class PersonPhone : Phone
{
    public int ParticipantID { get; set; }
    public Participant Participant { get; set; }
}
Are either of these solutions best practice? Is there a better solution? What do you recommend?
I will suggest that the best option is to use separate tables for this. Use a common abstract base class (ABC) called PhoneNumber and derive subclasses for each parent collection type, each mapped to its own table (the TPC strategy). Or map the ABC to a common table for the common fields, with referenced tables holding the references to Person or Business (the TPT strategy).
Coming from an object-oriented background, where we often optimize for code reuse, with inheritance as a strategy for reusing code and TPH as EF's default mapping of inheritance structures, it feels right to have both of these in a single table. But looking from the perspective of a DBA, mashing these two concerns together is an abuse of the data structures in the database. Call it "schema-smell". It might work, but you are giving up a lot of what the database can do for you. You're painting yourself into a corner for no good reason.
There are three inheritance strategies; this EF tutorial site has a good rundown. The single-table is TPH. Usually I will use this even though it violates 3rd normal form, because it is simple and performant. In the case where types differ by the other types they reference, though, the denormalization compromises offer diminishing returns.
The main problem as I see it: With one table, you have to figure out a way to mash separate FKs in. With one column and a discriminator (which may not even be legal with EF) you lose the ability to do DRI and cascading deletes. With two columns, you'll have to monitor that one and only one is non-null at any given time. With either solution, you'll be giving up storage space. If the two objects ever diverge--say, you have to add an EmployeeName to BusinessPhone--the problem will only be exacerbated.
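If you do go with the single-table, two-nullable-FK option anyway, the "one and only one is non-null" rule can at least be enforced by the database rather than monitored in application code. A hypothetical EF6 migration sketch (table and column names are illustrative, not from the original post):
using System.Data.Entity.Migrations;

public partial class EnforceSinglePhoneOwner : DbMigration
{
    public override void Up()
    {
        // Exactly one of the two owner FKs must be set on every row.
        Sql(@"ALTER TABLE dbo.Phones ADD CONSTRAINT CK_Phones_OneOwner
              CHECK ((ParticipantID IS NULL AND BusinessID IS NOT NULL)
                  OR (ParticipantID IS NOT NULL AND BusinessID IS NULL))");
    }

    public override void Down()
    {
        Sql("ALTER TABLE dbo.Phones DROP CONSTRAINT CK_Phones_OneOwner");
    }
}
With that constraint in place, SQL Server rejects any row where both or neither owner FK is set.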
So, in this case I would recommend either the table-per-type or table-per-concrete-type strategy over table-per-hierarchy.
The only remaining note is that all three involve compromises. I won't get into all the trade-offs (you should be able to find many discussions of these). There may still be cases where TPH makes the most sense because of a use-case that must perform very well.
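For concreteness, here is a minimal sketch of the TPT variant described above (type, property, and table names are illustrative and follow the question's model; the Table attribute comes from System.ComponentModel.DataAnnotations.Schema):
// Common fields live in one base table; each subclass gets its own table
// holding just its FK, joined to the base table by primary key (TPT).
public abstract class PhoneNumber
{
    public int PhoneNumberId { get; set; }
    public string Number { get; set; }
}

[Table("PersonPhones")]
public class PersonPhone : PhoneNumber
{
    public int ParticipantID { get; set; }
    public virtual Participant Participant { get; set; }
}

[Table("BusinessPhones")]
public class BusinessPhone : PhoneNumber
{
    public int BusinessID { get; set; }
    public virtual Business Business { get; set; }
}
For the TPC variant you would instead map each concrete type with MapInheritedProperties() in OnModelCreating via the fluent API.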
I think the best way to go about it is to have a table named 'PhoneNumber'. Both Business and People can have a list of PhoneNumbers.
Of course, this only holds if a PhoneNumber looks the same for both Business and People. If you would like to add extra properties to a PhoneNumber for People, I would suggest going with your second option.
Having a single table for phone numbers seems like the better option. Could you use some sort of discriminator on your PhoneNumbers entity to avoid having to have two nullable columns?
So you'd have a PhoneNumber class, and an enum representing the phone number type. Then your Businesses and People each have a list of PhoneNumber, as you've mentioned. Example:
public class PhoneNumber
{
    public int PhoneNumberId { get; set; }
    // Stored as a string: a member can't share its enclosing type's name,
    // and a string keeps leading zeros and formatting intact.
    public string Number { get; set; }
    public PhoneNumberType PhoneNumberType { get; set; }
    public Participant Participant { get; set; }
}
public enum PhoneNumberType
{
    Person,
    Business
}
This answer is to a very similar question and seems like an even better option if you want to give it a look.

Adding Navigation property breaks breeze client-side mappings (but not Server Side EF6)

I have an application that I developed standalone and now am trying to integrate into a much larger model. Currently, on the server side, there are 11 tables and an average of three navigation properties per table. This is working well and stable.
The larger model has 55 entities and 180+ relationships and includes most of my model (less the relationships to tables in the larger model). Once integrated, a very strange thing happens: the server sends the same data, the same number of entities are returned, but the exportEntities function returns a string of about 150KB (rather than the 1.48 MB it was returning before) and all queries show a tenth of the data they were showing before.
I followed the troubleshooting information on the Breeze website. I looked through the Breeze metadata, and the entities and relationships seem defined correctly. I looked at the data that was returned, and nine out of ten entities did not appear as an object but as a function: function (){return e.refMap[t]} which, when I expand it, has an 'arguments' property: Exception: TypeError: 'caller', 'callee', and 'arguments' properties may not be accessed on strict mode functions or the arguments objects for calls to them.
For reference, here are the two entities involved in the breaking change.
The Repayments Entity
public class Repayment
{
    [Key, Column(Order = 0)]
    public int DistrictId { get; set; }
    [Key, Column(Order = 1)]
    public int RepaymentId { get; set; }
    public int ClientId { get; set; }
    public int SeasonId { get; set; }
    ...
    #region Navigation Properties
    [InverseProperty("Repayments")]
    [ForeignKey("DistrictId")]
    public virtual District District { get; set; }
    // The navigation properties below are the ones I added to break the results.
    // If I remove them again, the results are correct again.
    [InverseProperty("Repayments")]
    [ForeignKey("DistrictId,ClientId")]
    public virtual Client Client { get; set; }
    [InverseProperty("Repayments")]
    [ForeignKey("DistrictId,SeasonId,ClientId")]
    public virtual SeasonClient SeasonClient { get; set; }
    #endregion
}
The Client Entity
public class Client : IClient
{
    [Key, Column(Order = 0)]
    public int DistrictId { get; set; }
    [Key, Column(Order = 1)]
    public int ClientId { get; set; }
    ....
    // These lines were in the original (working) model
    [InverseProperty("Client")]
    public virtual ICollection<Repayment> Repayments { get; set; }
    ....
}
The relationship that I restored was simply the inverse of a relationship that was already there, which is one of the really weird things about it. I'm sure I'm doing something terribly wrong, but I'm not even sure at this point what information might be helpful in debugging this.
For defining foreign keys and inverse properties, I assume I must use either data annotations or the Fluent API, even if the tables follow all the EF conventions. Is either one better than the other? Is it necessary to choose one approach consistently and stay with it? Does the error above provide any insight into what I might be doing wrong? Is there any other information I could post that might be helpful?
Breeze is an excellent framework and has the potential to really increase our reach providing assistance to small farmers in rural East Africa, and I'd love to get this prototype working.
Thanks
OK, some of what you are describing can be explained by Breeze's default behavior of compressing the payload of any query result that returns multiple instances of the same entity. If you are using something like the default 'json.net' assembly for serialization, then each entity is sent with an extra '$id' property, and if the same entity is seen again it gets serialized via a simple '$ref' property whose value is the previously mentioned '$id'.
On the Breeze client, during deserialization, these '$refs' get resolved back into full entities. However, because the order in which deserialization is performed may not be the same as the order in which serialization was performed, Breeze internally creates deferred closure functions (with no arguments) that allow deferred resolution of the compressed results regardless of serialization order. This is the
function (){return e.refMap[t]}
that you are seeing.
If you are seeing this value as part of the actual top-level query result, then we have a bug; but if you are seeing this value while debugging the results returned from your server, before they have been returned to the calling function, then this is completely expected (especially if you are viewing the contents of the closure before it should be executed).
So, a couple of questions and suggestions:
Are you actually seeing an error processing the result of your query, or are you simply surprised that the results are so small? If it's just a size issue, check whether you can identify data that should have been sent to the client and is missing. It is possible that the reference compression is simply very effective in your case.
Take a look at the 'raw' data returned from your web service. It should look something like this, with '$id' and '$ref' properties:
[{
    '$id': '1',
    'Name': 'James',
    'BirthDate': '1983-03-08T00:00Z'
},
{
    '$ref': '1'
}]
If so, then look at the data and make sure an '$id' exists corresponding to each of your '$refs'. If not, something is wrong with your server-side serialization code. If the data does not look like this, then please post back with a small example of what the 'raw' data does look like.
After looking at your Gist, I think I see the issue. Your metadata is out of sync with the actual results returned by your query. In particular, if you look for the '$id' value of "17" in your actual results, you'll notice that it is first found in the 'Client' property of the 'Repayment' type, but your metadata doesn't have a 'Client' navigation property defined for the 'Repayment' type (there is a 'ClientId'). My guess is that you are reusing an 'older' version of your metadata.
The reason this results in incomplete results is that once Breeze determines that it is deserializing an 'entity' (i.e. a JSON object that has a $type property mapping to an actual entityType), it only attempts to deserialize the 'known' properties of that type, i.e. those found in the metadata. In your case, the 'Client' navigation property on the 'Repayment' type was never being deserialized, so any refs to the '$id' defined there are not available.

EF Code First accessing HasMaxLength value

I would like to get the max length of a column I have defined using EF code first. I need to ensure that the value inserted does not exceed the max length:
this.Property(t => t.COMPANY_ID)
    .HasMaxLength(30);
Any suggestions?
The way I understood your question, your real need seems to be that you want to make sure that a property of an entity (in this case the COMPANY_ID) does not exceed a certain maximum length (in this case 30).
Instead of performing manual checks like that, you can consider making use of Data Annotations (System.ComponentModel.DataAnnotations and System.ComponentModel.DataAnnotations.Schema), especially since you're using code first anyway. Something like this:
public class MyEntity
{
    [MaxLength(30)]
    public string MyProperty { get; set; }

    [Column(TypeName = "Date")]
    public DateTime MyDate { get; set; }
}
You can set more than just the maximum length. As shown above, you can specify which data type a property maps to in your database, mark a property as required, and much more. EF manages this for you automatically and raises exceptions when your entities do not meet the criteria set by your data annotations. If you use MVC scaffolding, it can also generate validations consistent with the annotations you've specified on your entities.
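And if you do still need to read the configured maximum length back at runtime (say, to pre-validate a value yourself before saving), one possible approach in EF6 is to query the model metadata. A sketch, assuming EF6 and an entity/property actually mapped in your context (the helper class name is mine, not part of EF):
using System.Data.Entity;
using System.Data.Entity.Core.Metadata.Edm;
using System.Data.Entity.Infrastructure;
using System.Linq;

public static class ModelInfo
{
    // Finds the conceptual-model property and returns its MaxLength facet,
    // e.g. 30 for a property configured with HasMaxLength(30).
    public static int? GetMaxLength(DbContext context, string entityName, string propertyName)
    {
        var workspace = ((IObjectContextAdapter)context).ObjectContext.MetadataWorkspace;
        var entityType = workspace.GetItems<EntityType>(DataSpace.CSpace)
                                  .Single(t => t.Name == entityName);
        return entityType.Properties
                         .Single(p => p.Name == propertyName)
                         .MaxLength;
    }
}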

Using Spring-Data and mongodb, natural vs. artificial id?

I'm using Spring-data to map pojos to mongo json documents.
The mongo Object Id reference says "If your document has a natural primary key that is immutable we recommend you use that in _id instead of the automatically generated ids." My question is, if my document has a natural primary key but it is some combination of the object's attributes, should I combine them to create the natural primary key?
Assume that neither of the values can ever change and that, when concatenated together, the result is guaranteed to be unique. Note that whatever type you declare for id, Spring converts it to an ObjectId (unless it has no converter for that type, in which case it converts it to a String).
Here is an example:
@Document
public class HomeworkAssignment {
    @Id
    private String id;
    private final String yyyymmdd;
    private final String uniqueStudentName;
    private double homeworkGrade;

    public HomeworkAssignment(String yyyymmdd, String uniqueStudentName) {
        this.yyyymmdd = yyyymmdd;
        this.uniqueStudentName = uniqueStudentName;
        // can either set the 'id' here, or let Spring give me an artificial one
    }

    // setter provided for the homeworkGrade
}
There is guaranteed to be no more than one homework assignment per student per day. Both yyyymmdd and uniqueStudentName are given to me as Strings.
For example, "20120601bobsmith" uniquely identifies Bob Smith's homework on June 1, 2012. (If there is more than one Bob Smith, it is already handled in the uniqueName I'm given).
Assume that I want to follow the mongo reference advice and use a natural primary key if there is one. There is one, but it is a combination of 2 fields. Is this a case where I should combine them like so?
this.id = yyyymmdd + uniqueStudentName.toLowerCase();
It is certainly reasonable to use a combination of attributes as a primary key. However, rather than concatenating them, it is probably more logically intuitive to place them into a subdocument with two fields (uniqueStudentName and yyyymmdd) that is used as the _id.
Take a look at this question, which involves using a compound primary key:
MongoDB Composite Key

Why does code first/EF use 'nvarchar(4000)' for strings in the raw SQL command?

Essentially I have a table with zip codes in it. The zipcode field is defined as 'char(5)'. I'm using code first, so I've put these attributes on my ZipCode property:
[Key, Column( Order = 0, TypeName = "nchar"), StringLength(5)]
public string ZipCode { get; set; }
Now if I query against this in EF:
var zc = db.ZipCodes.FirstOrDefault(zip => zip.ZipCode == "12345");
The generated SQL uses nvarchar(4000) to inject the parameters. Huh? Is it because "12345" is technically a string of unknown length? Shouldn't EF be smart enough to just use the proper "nchar(5)" when querying that table?
I ask because the nvarchar(4000) query takes half a second, whereas the properly scoped query is much faster (and uses fewer reads).
Any assistance/advice would be appreciated.
This is done to take advantage of auto-parameterization. The following article explains the general concept, as well as why nvarchar(4000) specifically is used.
http://msdn.microsoft.com/en-us/magazine/ee236412.aspx
Have you tried using the MaxLength attribute? This article gives a decent summary of how the various data annotation attributes are interpreted by EF.
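For reference, a sketch of that mapping with MaxLength in place of StringLength (the entity class name here is illustrative; whether this changes the generated parameter type can depend on your EF version, so treat it as something to test rather than a guaranteed fix):
using System.ComponentModel.DataAnnotations;
using System.ComponentModel.DataAnnotations.Schema;

public class ZipCodeEntity
{
    // MaxLength is the EF-specific sizing attribute; StringLength is
    // primarily a validation attribute that EF also honors for columns.
    [Key, Column(Order = 0, TypeName = "nchar"), MaxLength(5)]
    public string ZipCode { get; set; }
}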