should I use Entity Framework instead of raw ADO.NET - entity-framework

I am new to CSLA and Entity Framework. I am creating a new CSLA / Silverlight application that will replace a 12 year old Win32 C++ system. The old system uses a custom DCOM business object library and uses ODBC to get to SQL Server. The new system will not immediately replace the old system -- they must coexist against the same database for years to come.
At first I thought EF was the way to go since it is the latest and greatest. After making a small EF model and only 2 CSLA editable root objects (I will eventually have hundreds of objects as my DB has 800+ tables) I am seriously questioning the use of EF.
In the current system I have the need many times to do fine detail performance tuning of the queries which I can do because of 100% control of generated SQL. But it seems in EF that so much happens behind the scenes that I lose that control. Article like http://toomanylayers.blogspot.com/2009/01/entity-framework-and-linq-to-sql.html don't help my impression of EF.
People seem to like EF because of LINQ to EF but since my criteria is passed between client and server as criteria object it seems like I could build queries just as easily without LINQ. I understand in WCF RIA that there is query projection (or something like that) where I can do client side LINQ which does move to the server before translation into actual SQL so in that case I can see the benefit of EF, but not in CSLA.
If I use raw ADO.NET, will I regret my decision 5 years from now?
Has anyone else made this choice recently and which way did you go?

In your case, I would still choose EF over doing it all by hand.
Why? EF - especially in .NET 4 - has matured considerably. It will allow you to do most of your database operations a lot easier and with a lot less code than if you have to all hand-code your data access code.
And in cases where you do need the absolute maximum performance, you can always plug in stored procedures for insert, update, delete which EF4 will then use instead of the default behavior of creating the SQL statements on the fly.
EF4 has a much better stored proc integration, and this really opens up the best of both worlds:
use the high productivity of EF for the 80% cases where performance isn't paramount
fine tune and handcraft stored procs for the remaining 20% and plug them into EF4
See some resources:
Using Stored Procedures for Insert, Update and Delete in an Entity Data Model
Practical Entity Framework for C#: Stored Procedures (video)

You seem to have a mix of requirements and a mix of solutions.
I normally rate each requirement with an essential, nice to have, not essential. And then see what works.
I agree with what #marc_s has said, you can have the best of both worlds.
The only other thing I would say is that if this solution is to be around for the next 5 years, have you considered Unit Testing?
There's plenty of examples on how to set this up using EF. (I personally avoid ADO.Net just because the seperating of concerns is so complicated for Unit Tests.)
There is no easy solution. I would pick a feature in your project that would take you a day or so to do. Try the different methods (raw sql, EF, EF + Stored Procs) and see what works!

Take an objective look at CSLA - invoke the 'DataPortal' and check out the call stack.
Next, put those classes on a CI build server that stores runtime data and provides a scatter plot over a series of runs.
Next, look at the code that gets created. Ask yourself how you can use things like dependecy injection in light of classes that rely on static creators with protected/private constructors.
Next, take a look at how many responsibilities the 'CSLA' classes take on.
Finally ask yourself if creating objects with different constructors per environment make sense, and ask yourself how you will unit test those.

Related

Entity framework vs NHibernate - Performance

I am looking to implement an ORM into our system. We currently have many tables with lots of horrible data and stored procedures. I've heard using an ORM can slow the system down. Does anyone know which ORM is better on speed and performance using Queries created in the C# code and mapping to stored procedures?
Thanks
EDIT:
The project will use existing tables that are large and contain a lot of data, it will also use existing stored procedures that carry out complex tasks in a SQL Server DB. The ORM must be able to carry out transactions and have high performance when running the existing stored procedures and querying the current tables. The project is web based and will use WCF web services with DDD. I can see that EF is a lot easier to use and has greater support but if NH the most suitable option?
The Entity Framework is constantly implementing new features and everything is automated in your project. It is very easy to use Entity Framework, extend and refactor your code.
Visual Studio integrates it (Code-First, Database-first, Entity-First...) like a charm.
Windows Azure makes easy to deploy and change it.
Moreover Visual Studio can generate all your CRUD pages in 3 clicks.
I would suggest you use EF but it depends of your project. Can you give us more details about it?
You can find lots of comparison charts on Google, for example this one explains performance and this one differences.
EDIT:
Can you quantify the number of users in your application?
When you use Database-First in EntityFramework it is very easy to import and use stored procedures, for NHibernate it quite simple as well.
Note that if you use a lot of stored procedures and not a lot of simultaneous users the choice of a ORM between those two may not be so crucial.
Also don't forget the performance of a tool is often based on the way to use it. If you misuse the ORM (e.g. async, lazy/eager loading, base classes, ...) the performance will drastically drop.
Perhaps you can instal both of them, look at how they work and check at their roadmap (e.g. Entity framework) to check at the evolution and interest.
check this artical
Some Advantages of NH over EF
NH has 10+ id generator strategies (IDENTITY, sequence, HiLo, manual, increment, several GUIDs, etc), EF only has manual or SQL Server's IDENTITY;
NH has lazy property support (note: not entities, this is for string, XML, etc properties), EF doesn't;
NH has second level cache support (and, believe me, enterprise developers are using it) and EF doesn't;
NH has support for custom types, even complex, with "virtual" properties, which includes querying for these virtual properties, EF doesn't;
NH has formula properties, which can be any SQL, EF doesn't;
NH has automatic filters for entities and collections, EF doesn't;
NH supports collections of primitive types (strings, integers, etc) as well as components (complex types without identity), EF doesn't;
NH supports 6 kinds of collections (list, set, bag, map, array, id array), EF only supports one;
NH includes a proxy generator that can be used to customize the generated proxies, EF doesn't allow that;
NH has 3 mapping options (.HBM.XML, by code, by attributes) and EF only two (by attributes, by code);
NH allows query and insertion batching (this is because EF only really supports IDENTITY keys), EF doesn't;
NH has several optimistic control strategies (column on the db, including Oracle's ORA_ROWSCN, timestamp, integer version, all, dirty columns), EF only supports SQL Server' TIMESTAMP or all
Both are great solutions, although I personally think NHibernate is better for an inherited database.
There are some things that are clearly better in NHibernate, such as second-level caching support. Documentation is probably a bit sparser than EF, but if you're willing to go through the learning curve, NHibernate gives you a lot more power.
FluentNHibernate is great for typed mapping of classes to underlying tables but there are some places you will just have to revert to XML mappings. There is a new competing API from NHibernate itself however, and I have not checked it out yet (the above blog post mentions it).
If you want to rely on VS tooling support, EF is better. However there will be some magic sometimes (for e.g. EF can use reflection to even populate private properties of an object, NHibernate does not do that; this is a strength or a weakness depending on how you see it). EF also works well with other Microsoft supplied-frameworks (for e.g. RIA services). I also like EF-auto migrations (when you use code-first).
If you want more power in your hands and want to be able to fine-tune how things work with clear separation of concerns (ORM does only what the ORM should do), NH seems to be better. However, it is a bit irritating to make all the properties virtual for NH to be able to access them.
I have used both and it can get a bit clunky either ways sometimes to generate the sql you want; in those 5-10% cases, drop down one more level and use a micro-orm like Dapper, Massive or Petapoco.
EDIT:
NHibernate too can populate private properties, seemingly, so this was just ignorance on my part.
EF 5 or 6 on .NET 4.5 is one sure way to increase performance, don't expect any speed improvement with EF 5 on .NET 4.0 this is documented by Microsoft. (We have another issue with poorly written LINQ statements too).
In General, if performance is high priority, you can't beat ADO.NET with Stored Procedures. You simply cannot. Added ORM, adding and IOC container... how much testing for performance do you want to do?
Spin up a few VM's with JMETER and hit a server with EF and hit another with NHibernate, You just force JMETER to make enough calls to URLS and test the amount of concurrent users and you should see where your bottleneck are.
It may be worth adding here that Entity Framework has issues when attaching disconnected graphs and is missing functionality such as merge in Nhibernate. You can address it with other plugins, but not there out of the box. Here is a link to the feature on Codeplex requesting better support for working with disconnected entities, and there is some further discussion.

How would you use EF in a typical Business Layer/Data Access Layer/Stored Procedures set up?

Whenever I watch a demo regarding the Entity Framework the demonstrator simply sets up some tables and performs Inserts, Updates and Deletes using automatically created code stubs but never shows any use of stored procedures. It seems to me that this is executing SQL from the client.
In my experience this is not particular good practice so I am presuming that my understanding of the Entity Framework is wrong.
Similarly WCF RIA Services demos use the EF and the demos are always the same. Can anyone shed any light on how you would use EF in a typical Business Layer/Data Access Layer/Stored Procedures set up.
I think I am confused and shouldn't be!!?
There's nothing wrong with executing SQL from the client. Most (if not all) of the problems that it might cause are in fact not there when using something like EF. For instance:
Client generated SQL might cause runtime syntax errors. This is not unlikely since the description of your query is mostly checked on compile time (assuming that the generator itself doesn't generate invalid SQL, which is also unlikely)
Client generated SQL might be inefficient. This is not true with modern database software which have query caches. EF works in a way that's compatible with query caches, i.e. it generates the same SQL consistently (as long as you use the same code consistently) and uses parameters for varying data.
Client generated SQL might be insecure (SQL injections and whatnot). This is all handled by the generator, which uses parameters for your values and does not interpolate user input into the query itself.
Back in the old Client / Server days, it used to be considered good practice to do all db updates using stored procedures.
But now, it's perfectly acceptable to have an O/RM generate SQL and run directly against DB.
Well, part of the reason why executing sql in stored procedures is a good idea is that it gives you a level of abstraction - when db changes inevitably occur, you make a change in a single place (the proc) rather than a dozen places (all the places where you were calling the client sql). Entity Framework provides this layer of abstraction through the data model, and you have the same advantage.
There are some other reasons why you might want to look at procs, like security granularity (only allowing certain users the right to execute), and some minor performance differences. Ultimately, you have to decide for yourself what the right trade-off is. EF is an attempt to dramatically reduce the developer time spent creating a data layer, with the trade-offs listed above.
never shows any use of stored procedures
Take a look at this video: Using Your Own Stored Procedures to Insert, Update and Delete Entities in Entity Framework.
Note that there are a lot of other videos on that topic there that are certainly worth watching!
The legend is that Scott Hanselman once said "It's not a real demo unless someone drags a datagrid" (pg 478 Silverlight 4 In Action, Pete Brown)
You have to remember that demos, are all about selling software, and not at all about communicating best practice. So your observations about the demos are absolutely correct, they cover the basics, and leave it to the observer to fill in the blanks.
As to your comment about Stored Procedures, and various answers to your question about the generator. The generator is good, and getting better. Howerver there are certain circumstances when it will generate completely unusable queries. (see my SO question here and discussed on the ADO.NET team blog)
Therefore there are occasions when hand crafted queries are your only recourse (either by way of stored proc, table value functions, views etc)

DAL Layer : EF 4.0 or Normal Data access layer with Stored Procedure

Application :
I am working on one mid-large size application which will be used as a product, we need to decide on our DAL layer. Application UI is in Silverlight and DAL layer is going to be behind service layer. We are also moving ahead with domain model, so our DB tables and domain classes are not having same structure. So patterns like Data Mapper and Repository will definitely come into picture.
I need to design DAL Layer considering below mentioned factors in priority manner
Speed of Development with above average performance
Maintenance
Future support and stability of the technology
Performance
Limitation :
1) As we need to strictly go ahead with microsoft, we can not use NHibernate or any other ORM except EF 4.0
2) We can use any code generation tool (Should be Open source or very cheap) but it should only generate code in .Net, so there would not be any licensing issue on per copy basis.
Questions
I read so many articles about EF 4.0, on outset it looks like that it is still lacking in features from NHibernate but it is considerably better then EF 1.0
So, Do you people feel that we should go ahead with EF 4.0 or we should stick to ADO .Net and use any code geneartion tool like code smith or any other you feel best
Also i need to answer questions like what time it will take to port application from EF 4.0 to ADO .Net if in future we stuck up with EF 4.0 for some features or we are having serious performance issue.
In reverse case if we go ahead and choose ADO .Net then what time it will take to swith to EF 4.0
Lastly..as i was going through the article i found the code only approach (with POCO classes) seems to be best suited for our requirement as switching is really easy from one technology to other.
Please share your thoughts on the same and please guide on the above questions
Using EF4 will save you a lot of manual coding - or code generation. That alone seems to justify using it - the best code and the only code guaranteed to be bug-free is the code you don't need to write.
EF4 and NHibernate and the like are very powerful tools that can handle the most demanding and complicated business requirements - things like inheritance in database tables and a lot more. They do add some overhead, however - and they're not always easy to learn and pick up.
So my provocative question would be: why not use Linq-to-SQL? If you only need SQL Server as a backend, if your mappings are almost always a 1:1 mapping between a table and a class in your domain model, this could be a lot easier than using either EF4 or NHibernate.
Linq-to-SQL is definitely easier and faster than EF4 - it's just a rather thin layer on top of your database. It's a lot easier than learning NHibernate - no messy mapping syntax to learn, you have support through a visual designer. And it's simple enough that you can even get in there and manipulate the code generation using e.g. T4 templates (see Damien Guard's Linq-to-SQL templates).
Linq-to-SQL is 100% supported even in .NET 4 / VS 2010, so there's a really good chance it will still be around in VS2012 (or whatever the next one will be called). So all those doomsday predictions about its demise are largely exaggerated - I wouldn't worry about it's future proofness for a second.
UPDATE: Some performance comparisons between Linq-to-SQL and EF:
Entity Framework and Linq-to-SQL Performance Comparison
Looking at EF Performance - some surprises
Entity Framework and LINQ to SQL
In general, Linq-to-SQL seems to have an edge on query performance, but EF seems to be doing better on updates.
Where do you get the idea from that a stored procedure bound DAL is normal? I personally said goodbye to them like 15 years ago going THEN for systems like any ORM on the market (hing: Entity Framework is ap retty bad one compared to stuff like nhibernate). I am personally always hughely amused that everyone except MS "fans" thing ORM's are something new.
Even given the limitations Entity Framework has - it is vastly better than anything you can manually come up with with stored procedures.
1: normally about 40% to 60% of you code deals with database interaction. This is eliminated by any non-stupid ORM -> make the math.
2: see 1.
3: good question. MS has a history of doing really stupid things in the area of data access, sadly.
4: most likely the same like your handwritten one - within 5%. Sadly your anemic ORM (Entity Framework) does not implement any of the really interesting features (caching etc.), so hugh gains are not expected automatically.

What problems have you had with Entity Framework?

We have used Entity Framework on 2 projects both with several 100 tables.
Our experiance is mainly positive. We have had large productivity gains, compare with using Enterprise Library and stored procedures.
However, when I suggest using EF on stackoverflow, I often get negative comments.
On the negative side we have found that there is a steep learning curve for certain functionality.
Finally, to the question: What problems have people had with EF, why do they prefer other ORMS?
Like you, my experience with the EF is mostly positive. The biggest problem I've had is that very complex queries can take a long time to compile. The visual designer is also much less stable and has fewer features than the framework itself. I wish the framework would put the GeneratedCode attribute on code it generates.
I recently used EF and had a relatively good experience with it. I too see a lot of negative feedback around EF, which I think is unfortunate considering all that it offers.
One issue that surprised me was the performance difference between two strategies of fetching data. Initially, I figured that doing eager loading would be more efficient since it would pull the data via a single query. In this case, the data was an order and I was doing an eager load on 5-8 related tables. During development, we found this query to be unreasonably slow. Using SQL profiler, we watched the traffic and analyzed the resulting queries. The generated SQL statement was huge and SQL Server didn't seem to like it all that much.
To work around the issue, I reverted to a lazy-loading / on-demand mode, which resulted in more queries to the server, but a significant boost in performance. This was not what I initially expected. My take-away, which IMHO holds true for all data access implementations, is that I really need to perf test the data access. This is true regardless of whether I use an ORM or SQL procs or parameterized SQL, etc.
I use Entity Framework too and have found the following disadvantages:
I can't work with Oracle that is really necessary for me.
Model Designer for Entity Framework. During update model from database storage part is regenerated too. It is very uncomfortably.
Doesn't have support for instead of triggers in Entity framework.

Manual DAL & BLL vs. ORM

Which approach is better: 1) to use a third-party ORM system or 2) manually write DAL and BLL code to work with the database?
1) In one of our projects, we decided using the DevExpress XPO ORM system, and we ran across lots of slight problems that wasted a lot of our time. Amd still from time to time we encounter problems and exceptions that come from this ORM, and we do not have full understanding and control of this "black box".
2) In another project, we decided to write the DAL and BLL from scratch. Although this implied writing boring code many, many times, but this approach proved to be more versatile and flexible: we had full control over the way data was held in the database, how it was obtained from it, etc. And all the bugs could be fixed in a direct and easy way.
Which approach is generally better? Maybe the problem is just with the ORM that we used (DevExpress XPO), and maybe other ORMs are better (such as NHibernate)?
Is it worth using ADO Entiry Framework?
I found that the DotNetNuke CMS uses its own DAL and BLL code. What about other projects?
I would like to get info on your personal experience: which approach do you use in your projects, which is preferable?
Thank you.
My personal experience has been that ORM is usually a complete waste of time.
First, consider the history behind this. Back in the 60s and early 70s, we had these DBMSes using the hierarchical and network models. These were a bit of a pain to use, since when querying them you had to deal with all of the mechanics of retrieval: follow links between records all over the place and deal with the situation when the links weren't the links you wanted (e.g., were pointing in the wrong direction for your particular query). So Codd thought up the idea of a relational DBMS: specify the relationships between things, say in your query only what you want, and let the DBMS deal with figuring out the mechanics of retrieving it. Once we had a couple of good implementations of this, the database guys were overjoyed, everybody switched to it, and the world was happy.
Until the OO guys came along into the business world.
The OO guys found this impedance mismatch: the DBMSes used in business programming were relational, but internally the OO guys stored things with links (references), and found things by figuring out the details of which links they had to follow and following them. Yup: this is essentially the hierarchical or network DBMS model. So they put a lot of (often ingenious) effort into layering that hierarchical/network model back on to relational databases, incidently throwing out many of the advantages given to us by RDBMSes.
My advice is to learn the relational model, design your system around it if it's suitable (it very frequently is), and use the power of your RDBMS. You'll avoid the impedance mismatch, you'll generally find the queries easy to write, and you'll avoid performance problems (such as your ORM layer taking hundreds of queries to do what it ought to be doing in one).
There is a certain amount of "mapping" to be done when it comes to processing the results of a query, but this goes pretty easily if you think about it in the right way: the heading of the result relation maps to a class, and each tuple in the relation is an object. Depending on what further logic you need, it may or may not be worth defining an actual class for this; it may be easy enough just to work through a list of hashes generated from the result. Just go through and process the list, doing what you need to do, and you're done.
Perhaps a little of both is the right fit. You could use a product like SubSonic. That way, you can design your database, generate your DAL code (removing all that boring stuff), use partial classes to extend it with your own code, use Stored Procedures if you want to, and generally get more stuff done.
That's what I do. I find it's the right balance between automation and control.
I'd also point out that I think you're on the right path by trying out different approaches and seeing what works best for you. I think that's ultimately the source for your answer.
Recently I made the decision to use Linq to SQL on a new project, and I really like it. It is lightweight, high-performance, intuitive, and has many gurus at microsoft (and others) that blog about it.
Linq to SQL works by creating a data layer of c# objects from your database. DevExpress XPO works in the opposite direction, creating tables for your C# business objects. The Entity Framework is supposed to work either way. I am a database guy, so the idea of a framework designing the database for me doesn't make much sense, although I can see the attractiveness of that.
My Linq to SQL project is a medium-sized project (hundreds, maybe thousands of users). For smaller projects sometimes I just use SQLCommand and SQLConnection objects, and talk directly to the database, with good results. I have also used SQLDataSource objects as containers for my CRUD, but these seem clunky.
DALs make more sense the larger your project is. If it is a web application, I always use some sort of DAL because they have built-in protections against things like SQL injection attacks, better handling of null values, etc.
I debated whether to use the Entity Framework for my project, since Microsoft says this will be their go-to solution for data access in the future. But EF feels immature to me, and if you search StackOverflow for Entity Framework, you will find several people who are struggling with small, obtuse problems. I suspect version 2 will be much better.
I don't know anything about nHibernate, but there are people out there who love it and would not use anything else.
You might try using NHibernate. Since it's open source, it's not exactly a black box. It is very versatile, and it has many extensibility points for you to plug in your own additional or replacement functionality.
Comment 1:
NHibernate is a true ORM, in that it permits you to create a mapping between your arbitrary domain model (classes) and your arbitrary data model (tables, views, functions, and procedures). You tell it how you want your classes to be mapped to tables, for example, whether this class maps to two joined tables or two classes map to the same table, whether this class property maps to a many-to-many relation, etc. NHibernate expects your data model to be mostly normalized, but it does not require that your data model correspond precisely to your domain model, nor does it generate your data model.
Comment 2:
NHibernate's approach is to permit you to write any classes you like, and then after that to tell NHibernate how to map those classes to tables. There's no special base class to inherit from, no special list class that all your one-to-many properties have to be, etc. NHibernate can do its magic without them. In fact, your business object classes are not supposed to have any dependencies on NHibernate at all. Your business object classes, by themselves, have absolutely no persistence or database code in them.
You will most likely find that you can exercise very fine-grained control over the data-access strategies that NHibernate uses, so much so that NHibernate is likely to be an excellent choice for your complex cases as well. However, in any given context, you are free to use NHibernate or not to use it (in favor of more customized DAL code), as you like, because NHibernate tries not to get in your way when you don't need it. So you can use a custom DAL or DevExpress XPO in one BLL class (or method), and you can use NHibernate in another.
I recently took part in sufficiently large interesting project. I didn't join it from the beginning and we had to support already implemented architecture. Data access to all objects was implemented through stored procedures and automatically generated wrapper-methods on .NET that returned DataTable objects. The development process in such system was really slow and inefficient. We had to write huge stored procedure on PL/SQL for every query, that could be expressed in simple LINQ query. If we had used ORM, we would have implement project several times faster. And I don't see any advantage of such architecture.
I admit, that it is just particular not very successful project, but I made following conclusion: Before refusing to use ORM think twice, do you really need such flexibility and control over database? I think in most cases it isn't worth wasted time and money.
As others explain, there is a fundamental difficulty with ORM's that make it such that no existing solution does a very good job of doing the right thing, most of the time. This Blog Post: The Vietnam Of Computer Science echoes some of my feelings about it.
The executive summary is something along the lines of the assumptions and optimizations that are incompatible between object and relational models. although early returns are good, as the project progresses, the abstractions of the ORM fall short, and the extra overhead of working around it tends to cancel out the successes.
I have used Bold for Delphi four years now. It is great but it is not available anymore for sale and it lacks some features like databinding. ECO the successor has all that.
No I'm not selling ECO-licenses or something but I just think it is a pity that so few people realize what MDD (Model Driven Development) can do. Ability to solve more complex problems in less time and fewer bugs. This is very hard to measure but I have heard something like 5-10 more efficient development. And as I work with it every day I know this is true.
Maybe some traditional developer that is centered around data and SQL say:
"But what about performance?"
"I may loose control of what SQL is run!"
Well...
If you want to load 10000 instances of a table as fast as possible it may be better to use stored procedures, but most application don't do this. Both Bold and ECO use simple SQL queries to load data. Performance is highly dependent of the number of queries to the database to load a certain amount of data. Developer can help by saying this data belong to each other. Load them as effiecent as possible.
The actual queries that is executed can of course be logged to catch any performance problems. And if you really want to use your hyper optimized SQL query this is no problem as long as it don't update the database.
There is many ORM system to choose from, specially if you use dot.net. But honestly it is very very hard to do a good ORM framework. It should be concentrated around the model. If the model change, it should be an easy task to change database and the code dependent of the model. This make it easy to maintain. The cost for making small but many changes is very low. Many developers do the mistake to center around the database and adapt everthing around that. In my opinion this is not the best way to work.
More should try ECO. It is free to use an unlimited time as long the model is no more than 12 classes. You can do a lot with 12 classes!
I suggest you to use Code Smith Tool for creating Nettiers, that is a good option for developer.