Are there drawbacks to going around Entity Framework? - entity-framework

I have some design requirements that are not supported by Entity Framework, but are easily met by a simple SQL Query.
Essentially I need to do an insert that sets an Identity value.
Are there drawbacks to making a sproc that does my insert and then having EF call that sproc?
Are there caching concerns I need to be worried about? (Because I will be updating data "behind EF's back".)
Are there concurrency issues?
Anything else I need to be worried about?

If you don't keep your db context around, i.e. you dispose of it after every unit of work (which covers most web scenarios), this should work just fine. The exception is when you operate concurrently on the same table: in that case you might want to use a lock to synchronize the SQL and the EF queries, or catch the OptimisticConcurrencyException thrown by EF.
If, on the other hand, you do keep a context around, make sure that you refresh it with RefreshMode.StoreWins.
Also see "Saving Changes and Managing Concurrency"
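A minimal sketch of both pieces, assuming the EF6 ObjectContext API (on EF4 the namespace is System.Data.Objects instead). The procedure name dbo.InsertWidget, its parameters, and the helper names are made up for illustration.

```csharp
using System.Data.Entity.Core.Objects; // EF6; on EF4 use System.Data.Objects

public static class BehindEfsBack
{
    // Calls a (hypothetical) stored procedure that inserts a row with an
    // explicit identity value. Returns whatever row count the store reports.
    public static int InsertWithExplicitId(ObjectContext context, int id, string name)
    {
        return context.ExecuteStoreCommand(
            "EXEC dbo.InsertWidget @p0, @p1", id, name);
    }

    // For a long-lived context: overwrite the cached values of an entity
    // with whatever is currently in the database.
    public static void ReloadFromStore(ObjectContext context, object entity)
    {
        context.Refresh(RefreshMode.StoreWins, entity);
    }
}
```

With a context that is disposed after each unit of work, only the first helper is needed, since nothing stale survives the dispose.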

Related

Use Custom SQL OnModelCreating and/or immediately after

I wish to add custom SQL to my model creation.
(Right now I want to do that because I have used strongly typed ids in my domain model, so now EF Core won't let me use .UseIdentityAlwaysColumn() on them; as of 2021 this is a still-open issue. Even if it did, I also want to add specific Postgres sequence options.)
A simple workaround is just a single line of Alter Table Alter Column... sql straight after the model creation.
I can see that MigrationBuilder.Sql() can do custom SQL. So:
Can ModelBuilder do custom Sql? I can't find it.
Alternatively, can I shoehorn a short Migration into the OnModelCreating()?
I wish to keep all the data definition code in sync in one place, not have most of it in OnModelCreating but bits of it elsewhere.
The short answer to both your questions is no. Or if I can use your phraseology, as of 2021 this is still not possible.
Seriously, EF Core is an ORM, thus the main focus is on M(apping). Physical database attributes are not a priority, given that one can use EF Core just to map to an existing database (a.k.a. database first). There is some limited support for indexes (not used by EF Core) and a small set of other physical attributes, but no views, synonyms, triggers etc. The only SQL supported is in narrow hooks such as HasDefaultValueSql.
"I wish to keep all the data definition code in sync in one place, not have most of it in OnModelCreating but bits of it elsewhere."
OnModelCreating is creating the mappings. At the time it is called, there is no real database involved. The model could be created for generating a migration, but that's only one (and a completely optional one) of the many usage scenarios. That's why you can't "execute" anything there. All you can do is specify metadata (a.k.a. annotations), which is eventually processed by the services responsible for different functionalities. The migration SQL generator is one of them, but it needs to understand these annotations when processing the corresponding operations, which basically is the definition of supporting something or not.
In theory you could create your own annotations and provide custom metadata/fluent API for specifying them, but then you also have to implement them for every database provider you want to support. This is a lot of work, and practically impossible, as every database provider implements the migration SQL generator for its specific attributes and DDL dialect.
So, whether what you wish for is better or not, the practical approach would be to use what you get from the ORM, which currently is MigrationBuilder.Sql(). No more, no less. That's all. Period.
To recap briefly: if the question is whether there is some hidden "magic" way which you can't find, there isn't.
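For what it's worth, the MigrationBuilder.Sql() route could look roughly like the sketch below. It assumes an otherwise empty migration scaffolded by the EF Core tooling and a PostgreSQL database; the class name, the "Orders" table, the "Id" column and the sequence options are all placeholders, not anything from the question.

```csharp
using Microsoft.EntityFrameworkCore.Migrations;

// Hand-edited body of an empty, tool-scaffolded migration.
// Table, column and sequence options are hypothetical.
public partial class OrderIdIdentity : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        // Raw PostgreSQL DDL; EF Core passes it through untouched.
        migrationBuilder.Sql(
            @"ALTER TABLE ""Orders"" ALTER COLUMN ""Id""
              ADD GENERATED ALWAYS AS IDENTITY (START WITH 1000 INCREMENT BY 1);");
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        // Undo the change so the migration can be reverted.
        migrationBuilder.Sql(
            @"ALTER TABLE ""Orders"" ALTER COLUMN ""Id"" DROP IDENTITY IF EXISTS;");
    }
}
```

The data definition still ends up split between OnModelCreating and the migration, but at least the custom SQL is versioned together with the schema change it belongs to.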

Entity Framework code first - development strategies

Working on a brand new project from the ground up. That means the data model is in a constant flux, doubly so because things are, inevitably, not as well planned as they should be. Model classes are being created and changed fairly regularly.
The plan was to use the latest version of EF with all the neat code-first stuff in it. But we're constantly tripping over the limitations the framework has in terms of adding or updating tables. The initialization options seem to allow only the complete deletion and re-creation of the database, which isn't really ideal.
I've had a look at the migrations. But this seems a sledgehammer to crack a nut: we don't need to detail every single small change and update with a new migration scaffold.
Are there some better strategies to deal with this? For instance, I started writing some unit tests to pre-populate one of the contexts with some test data, but because this causes the whole Db to drop and re-create, it causes problems with all the other contexts. Or perhaps making use of a custom initialiser to seed the data for us? How can we easily exclude these in production code?
We're also wondering about perhaps abandoning code-first and going back to EDMX diagrams. At least that way changes result in updated SQL commands which can be run directly against the database.
Any suggestions gratefully received.
I think, imho, that:
as the database schema must at least match your model, you should (indeed must) record every single change, and code first migrations let you do that and trace the changes over time
code first migrations can also migrate the database schema for you
code first migrations can also produce the SQL script that migrates the schema
For these reasons code first is as good as (if not better than) the EDMX approach.
Please take a few minutes to work through http://msdn.microsoft.com/en-us/data/jj591621.aspx
One other point, always imho and in a perfect world: if you unit test the business logic of your model you should not need the DAL; use generic collections instead. Be aware of the different behavior of LINQ to Objects vs LINQ to Entities, for example concerning case sensitivity.
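To illustrate the case-sensitivity point, here is a small sketch using plain collections only (the class and method names are made up). In LINQ to Objects, == on strings is an ordinal, case-sensitive comparison; the same predicate sent through LINQ to Entities is evaluated by the database, whose collation may well be case-insensitive, so a test against an in-memory list can disagree with production.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CaseSensitivityDemo
{
    // LINQ to Objects: == on strings is ordinal and case-sensitive.
    public static int CountExact(IEnumerable<string> names, string value)
    {
        return names.Count(n => n == value);
    }

    // Explicitly case-insensitive, closer to what a case-insensitive
    // database collation would return for the same filter.
    public static int CountIgnoringCase(IEnumerable<string> names, string value)
    {
        return names.Count(
            n => string.Equals(n, value, StringComparison.OrdinalIgnoreCase));
    }
}
```

For the list "Alice", "alice", "ALICE", CountExact(names, "alice") gives 1 while CountIgnoringCase(names, "alice") gives 3.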

What are the benefits of ORM lazy loading?

I'm researching data layer underpinnings for a new web-based reporting system and have spent a lot of time evaluating ORMs over the last few days. That said, I've never dealt with "lazy loading" before and am confused as to why it's the default setting for LINQ queries in the Entity Framework. It seems like it creates a lot of network traffic and unnecessarily tasks the database with additional queries that could otherwise be resolved with joins.
Can someone describe a scenario in which lazy loading would be beneficial?
Some meta:
The new system will be working against a database with hundreds of tables and many terabytes of data in a production environment with over 3,000 concurrent users on the system 24 hours a day. They will be retrieving large datasets continuously. Is it possible that an ORM just isn't the right solution for our needs, especially since the app will be web-based?
When we talk about lazy loading we are talking about navigation properties (how we follow foreign keys). What lazy loading does for us is populate the related entity from its table at the moment we first access it. For example, if we have a model like this:
public class TestEntity
{
    public int Id { get; set; }
    // virtual so that EF can generate a lazy-loading proxy for this property
    public virtual AnotherEntity RemoteEntity { get; set; }
}
and call the following:
var something = WhateverContext.TestEntities.First().RemoteEntity;
we will get 2 database calls: one for WhateverContext.TestEntities.First() and one for loading the remote entity.
I'm a web guy (and more specifically an MVC guy), and for web stuff I don't think there is ever a good reason for wanting to do this. One database call is always going to be quicker than two if we require the same set of data.
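As a rough sketch of the single-call alternative, assuming EF6 and a DbContext-based WhateverContext (the entity shapes mirror the model above, everything else is an assumption), eager loading with Include fetches both entities in one round trip:

```csharp
using System.Data.Entity; // EF6: DbContext, DbSet and the lambda Include extension
using System.Linq;

public class AnotherEntity
{
    public int Id { get; set; }
}

public class TestEntity
{
    public int Id { get; set; }
    public virtual AnotherEntity RemoteEntity { get; set; }
}

// Assumed context shape; the original answer does not show it.
public class WhateverContext : DbContext
{
    public DbSet<TestEntity> TestEntities { get; set; }
}

public static class EagerLoadingSketch
{
    // One query with a join instead of two separate queries.
    public static AnotherEntity LoadInOneQuery(WhateverContext context)
    {
        return context.TestEntities
            .Include(e => e.RemoteEntity)
            .First()
            .RemoteEntity;
    }
}
```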
The situation where I think that lazy loading is actually worth considering is when you don't know when you do your first query if you will need the second entity at all. In my opinion this is much more relevant for windows applications where we have a user who is performing actions in real time (rather than stateless MVC where users are requesting whole pages at once). For example I think lazy loading shines when we have a list of data with a details link, then we don't load the details until the user decides they want to see them.
I don't feel this extends to paging, sorting and filtering. IMO there should be one specifically crafted database query per page of data you are displaying, which returns exactly the data set required to display that page.
In terms of your performance question, I feel that EF (or another ORM) can probably meet your needs here but you want to be careful with how you are retrieving large datasets due to the way EF tracks entities. Check out my EF performance tuning cheat sheet, and read up on DetectChanges and AsNoTracking if you do decide to use EF with large queries.
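A minimal sketch of the AsNoTracking suggestion, assuming EF6; the ReportRow entity, the ReportingContext and the filter are hypothetical. The query returns plain objects that the context does not track, which avoids the change-tracking overhead for large read-only result sets.

```csharp
using System.Collections.Generic;
using System.Data.Entity; // EF6
using System.Linq;

// Hypothetical entity and context for a read-only report.
public class ReportRow
{
    public int Id { get; set; }
    public decimal Amount { get; set; }
}

public class ReportingContext : DbContext
{
    public DbSet<ReportRow> ReportRows { get; set; }
}

public static class ReadOnlyQueries
{
    public static List<ReportRow> LoadLargeRows(ReportingContext context, decimal minAmount)
    {
        return context.ReportRows
            .AsNoTracking()                     // results are not attached to the context
            .Where(r => r.Amount >= minAmount)  // still translated to SQL
            .ToList();
    }
}
```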
Most ORMs will give you the option, when you're building up your object selections, to say "don't be lazy, go ahead and join", so if you're worried about it from an efficiency perspective, don't be. You can make it work (usually).
There are 2 particular cases I know of where lazy loading helps:
Chaining commands
What if you want to create a basic select, but then you want to run it through a sort and a filter function that's based on user input. You can simply pass the ORM object in, and attach the sort and filtering functionality to it. Instead of evaluating it each time, it only evaluates when it's actually used.
Avoiding huge, deep, highly-relational queries
What if you just need the IDs of some related fields? If it loads lazily, you don't have to worry about it joining a whole bunch of data and tables that you don't need, potentially slowing down the query and overusing bandwidth. Of course, if you DID want everything else, then you'll need to be explicit, or you may run into a problem where it lazily runs a query for each detail record. Like I mentioned at the outset, that's easily overcome in any ORM worth using.
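The "chaining commands" case is what LINQ usually calls deferred execution. A small sketch of it, using an in-memory IQueryable so it runs without a database (Product and the helper are made up): filters and sorts are attached step by step, and nothing is evaluated until the result is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public static class QueryComposition
{
    // Attaches optional filter and sort steps chosen by the user.
    // Only the query definition is built here; nothing runs yet.
    public static IQueryable<Product> ApplyUserOptions(
        IQueryable<Product> source, decimal? maxPrice, bool sortByName)
    {
        IQueryable<Product> query = source;

        if (maxPrice.HasValue)
        {
            decimal limit = maxPrice.Value;
            query = query.Where(p => p.Price <= limit);
        }

        if (sortByName)
        {
            query = query.OrderBy(p => p.Name);
        }

        return query; // evaluated when the caller enumerates it, e.g. via ToList()
    }
}
```

Against a real context the same method would receive something like a DbSet, and the combined filter and sort would be sent to the database as a single query at the point of enumeration.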
A simple case is a result set of N records which you do not want to bring to the client at once. The benefit is that you are able to lazily load only what is needed for the client's demands, such as sorting, filtering, etc. An example would be a paging view where one could page through records and sort them accordingly, so the client only needs N records at a given time.
When you perform the LINQ query it translates that to SQL commands on the server side to provide only what is needed in the given context. It boils down to offloading work to the database and minimizing what you need to send back to the client.
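A sketch of the paging case under the same idea (the helper name and zero-based page index are my own choices): order first, then Skip and Take, so only one page of rows is ever requested. Note that LINQ to Entities insists on an ordering before Skip.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class Paging
{
    // Returns the query for one page; pageIndex is zero-based.
    public static IQueryable<T> Page<T, TKey>(
        IQueryable<T> source,
        Expression<Func<T, TKey>> orderBy,
        int pageIndex,
        int pageSize)
    {
        return source
            .OrderBy(orderBy)            // a stable order is needed for paging
            .Skip(pageIndex * pageSize)  // skip the earlier pages
            .Take(pageSize);             // keep just this page
    }
}
```

For example, Paging.Page(Enumerable.Range(1, 10).AsQueryable(), x => x, 1, 3) yields 4, 5, 6.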
Some will argue that ORM-based lazy loading is wrong; however, that starts to move into semantics fairly quickly, and it should be more about the approach to design than about what is right and wrong.

Entity framework - what to do if SaveChanges fails and I don't want some changes to be made?

I think I'm running into a common problem:
I would like to try to insert an object into the database. If the primary key is violated then I would like to abort the insert. (This is an example; the question really applies to any kind of error and any of the CRUD operations.)
How can I discard changes made to the EF context?
I can't afford recreating it every time something goes wrong.
PS. I know that perhaps I could check if everything is ok eg. by querying the db, but I don't like the idea. Db constraints are there for some reason and this way it's faster and I have to write less code.
You can detach inserted entity from ObjectContext. You can also use ObjectStateManager and its method GetObjectStateEntries. In ObjectStateEntry you can modify its state.
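Roughly, for the failed-insert case, something like the sketch below could work. It assumes the EF6 namespaces (on EF4, ObjectContext lives in System.Data.Objects and EntityState in System.Data); the helper name is made up, and it only deals with pending inserts, not with modified or deleted entities.

```csharp
using System.Data.Entity;              // EF6: EntityState
using System.Data.Entity.Core.Objects; // EF6: ObjectContext, ObjectStateEntry
using System.Linq;

public static class PendingChanges
{
    // Detaches every entity that is waiting to be inserted, so the next
    // SaveChanges() no longer tries to insert it.
    public static void DiscardPendingInserts(ObjectContext context)
    {
        var added = context.ObjectStateManager
            .GetObjectStateEntries(EntityState.Added)
            .Where(entry => !entry.IsRelationship) // relationship entries carry no entity
            .ToList(); // snapshot first: detaching changes the state manager

        foreach (ObjectStateEntry entry in added)
        {
            context.Detach(entry.Entity);
        }
    }
}
```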
The problem is that you are not using the technology the way it is supposed to be used:
"I can't afford recreating it every time something goes wrong."
Sure you should, because your code doesn't prevent such situations.
"PS. I know that perhaps I could check if everything is ok eg. by querying the db, but I don't like the idea. Db constraints are there for some reason and this way it's faster and I have to write less code."
Yes, indeed you should check if everything is OK. Calling the database to "validate" your data is something that DBAs really like (sarcasm). It is your responsibility to achieve the highest possible validity of your data before you call SaveChanges. I can imagine that many senior developers / team leaders would simply not pass your code through their code review. And by the way, in most cases it is not faster, because of inter-process or network communication.
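Try using a DbTransaction: commit after SaveChanges() succeeds, and roll back if there's an exception.
// BeginTransaction needs an open connection
if (_ent.Connection.State != System.Data.ConnectionState.Open) _ent.Connection.Open();
System.Data.Common.DbTransaction _tran = _ent.Connection.BeginTransaction();
try
{
    _ent.SaveChanges();
    _tran.Commit(); // only after SaveChanges() has succeeded
}
catch
{
    _tran.Rollback(); // undo the database work on any exception
    throw;            // let the caller see the original failure
}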

Rules of thumb for writing "queries" using ADO.NET Entity Framework

I’m currently working on a prototype of a medium size web application, and I thought that it would be good to also experiment with Entity Framework. The problem is that the major part of the application is not the data layer and logic, so I don't have much time to play with Entity Framework. On the other hand, the database schema is quite simple.
One of the problems I’m facing is that I cannot find a consistent way to "write queries". As far as I can tell, there are four "interfaces" for the job:
LINQ to Entities
LINQ to Entities using LINQ extension methods
Entity SQL
Query builder
OK, the first two are essentially the same, but it’s good to use just one for maintenance and consistency.
I’m mostly puzzled by the fact that none of them seems to be complete and the most general. I often find myself cornered and using some ugly looking combination of several of them. My guess is that Entity SQL is the most general one, but writing queries using strings feels like a step back. The main reason I’m experimenting with something like Entity Framework is that I like the compile time checking.
Some other random thought / issues:
I often also use the ObjectQuery.Include() method, but again it takes a string. Is this the only way?
When to use ObjectQuery.Execute() (vs. ToList())? Does it actually execute the query?
Should I execute queries as soon as possible (e.g. using ToList()), or should I not care and just leave the execution to the first enumeration that comes along?
Are ObjectQuery.Skip() and ObjectQuery.Take() available only as extension methods? Is there a better way to do paging? It’s 2009 and almost every web application deals with paging.
Overall, I understand there are many difficulties when implementing an ORM, and often one has to compromise. On the other hand, direct database access (e.g. ADO.NET) is plain and simple and has a well defined interface (tabular results, data readers), so all code, no matter who writes it or when, is consistent. I don’t want to be faced with too many choices whenever I write a database query. It’s too tedious, and more than likely different developers will come up with different ways.
What are your rules of thumb?
I use LINQ-to-Entities as much as possible. I also try and formalise to the lambda-form, as opposed to the extended SQL-style syntax. I have to admit to have had problems enforcing relationships and making compromises on efficiency just to expedite my coding of our application (eg. Master->Child tables may need to be manually loaded) but all in all, EF is a good product.
I do use EF's .Include() method for eager loading, which, as you say, does require a string input. I find no problem with this, other than identifying the string to use, which is relatively simple. I guess if you're keen on compile-time checking of such relations, a model similar to Parent.GetChildren() might be more appropriate.
My application does require some "dynamic" queries to be performed, though. I have two ways of meeting this:
a) I create a mediator object, eg. ClientSearchMediator, which "knows" how to search for clients by name, etc. I can then put this through a SearchHandler.Search(ISearchMediator[] mediators) call (for example). This can be used to target specific data structures and sort results accordingly using LINQ-to-Entities.
b) For a looser experience, possibly as a result of a user designing their own query (using high level tools our application provides), eSQL is ideal for this purpose. It can be made to be injection-safe.
I don't have enough knowledge to address all of this, but I'll at least take a few stabs.
I don't know why you think ADO.NET is more consistent than Entity Framework. There are many different ways to use ADO.NET and I've definitely seen inconsistency within a single code base.
Entity Framework is currently a 1.0 release and it suffers from many 1.0 type problems (incomplete & inconsistent API, missing features, etc.).
In regards to Include, I assume you are referring to eager loading. Multiple people (outside of Microsoft) have developed solutions for getting "type safe" includes (try googling something like: Entity Framework ObjectQueryExtension Include). That said, Include is more of a hint than anything. You can't force eager loading, and you always have to remember to check the IsLoaded property to see if your request was fulfilled. As far as I know, the way "Include" works is not changing at all in the next version of Entity Framework (4.0 - to ship with VS 2010).
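The "type safe" include helpers people have written generally boil down to something like the sketch below. This is my own minimal version, not any particular published one: it uses the EF 1.0/4.0 namespace, handles only a single-level property, and simply forwards to the built-in string-based Include.

```csharp
using System;
using System.Data.Objects; // EF 1.0/4.0; EF6 moved this to System.Data.Entity.Core.Objects
using System.Linq.Expressions;

public static class ObjectQueryIncludeExtensions
{
    // Hypothetical helper: query.Include(o => o.Customer) becomes
    // query.Include("Customer"), so a renamed property breaks the build
    // instead of failing at run time.
    public static ObjectQuery<T> Include<T, TProperty>(
        this ObjectQuery<T> query, Expression<Func<T, TProperty>> selector)
    {
        return query.Include(GetMemberName(selector));
    }

    // Extracts the property name from a simple lambda such as o => o.Customer.
    public static string GetMemberName<T, TProperty>(
        Expression<Func<T, TProperty>> selector)
    {
        var member = selector.Body as MemberExpression;
        if (member == null || !(member.Expression is ParameterExpression))
        {
            throw new ArgumentException(
                "Expected a single-level property access such as o => o.Customer.",
                "selector");
        }
        return member.Member.Name;
    }
}
```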
As far as executing the LINQ query as soon as it's built vs. at the last possible moment, that decision is situational. Personally, I would probably execute it as soon as it's built for the most part unless there was a compelling reason not to, but I can see other people going the opposite direction.
There are more mature ORMs on the market and Entity Framework isn't necessarily your best option. For the most part, you can bend Entity Framework to your will, but you may end up rolling your own implementation of features that come out of the box with other ORMs.