Using ADO.NET Entities LINQ Provider to model complex SQL Queries? - tsql

What I find really powerful in ADO.NET Entities or LINQ to SQL, is the ability to model complex queries. I really don't need the mappings that Entities or LINQ to SQL are doing for me - I just need the ability to model complex expressions that can be translated into T-SQL.
My question is - am I abusing too much? Can I use the Entity Framework for modeling queries and just that? Should I? I know I can write my own custom LINQ to SQL provider, but that is just not possible to handle in the time spans I have.
What is the best approach to model complex T-SQL queries? How do you handle conditional group byes, orders, joins, unions etc in the OOP world? Using StringBuilders for this kind of job feels too ugly and harder to maintain given the possibilities we have with Expression Trees.
When I use StringBuilder to model a complex SQL Query I feel kind of guilty! I feel the same way as when I have to hard code any number into my code that is different than 0 or 1. Feeling that makes you ask yourself if there is a better and cleaner way of doing it...
I must mention that I am using C# 4.0, but I am not specifically looking for an answer in this language, but rather in the domain of CLR 4. Are there any real enterprise libraries that can help with constructing complex SQL at runtime?

If you find it easier (not to mention less error prone) to compose a query in LINQ rather than T-SQL, then why not?
There is no law written that says you can't use EF or L2S to help you write T-SQL, and then just go off and execute the T-SQL directly yourself.
In fact I think this makes a lot of sense.
Don't feel guilty!

Related

Best way to build complex reports using Entity Framework

I am trying to do some dashboards and I am using LINQ to SQL to get a better performance than pull all data to memory.
The problem is that to do this, I cannot use methods (because linq to sql will not recognize methods to translate), so when I need a complex report, I need to repeat a lot of code and the code is becoming confuse.
One example:
this is just to get the 5 more profit service orders
So, I think that probably I am doing this in a wrong way. Do you have suggestions to do a better code?
Thank you

How to escape from ORMs limitations or should I avoid them?

In short, ORMs like Entity Framework provides a fast solution but with many limitations, When should they (ORMs) be avoided?
I want to create an engine of a DMS system, I wonder that how could I create the Business Logic Layer.
I'll discuss the following options:
Using Entity Framework and provides it as a Business later for the engine's clients.
The problem is that missing the control on the properties and the validation because it's generated code.
Create my own business layer classes manually without using Entity Framework or any ORM:
The problem is that it's a hard mission and something like reinvent the weel.
Create my own business layer classes up on the Entitiy Framework (use it)
The problem Seems to be code repeating by creating new classes with the same names and every property will cover the opposite one which is generated by the ORM.
Am I discuss the problem in a right way?
In short, ORMs should be avoided when:
your program will perform bulk inserts/updates/deletes (such as insert-selects, and updates/deletes that are conditional on something non-unique). ORMs are not designed to do these kinds of bulk operations efficiently; you will end up deleting each record one at a time.
you are using highly custom data types or conversions. ORMs are generally bad at dealing with BLOBs, and there are limits to how they can be told how to "map" objects.
you need the absolute highest performance in your communication with SQL Server. ORMs can suffer from N+1 problems and other query inefficiencies, and overall they add a layer of (usually reflective) translation between your request for an object and a SQL statement which will slow you down.
ORMs should instead be used in most cases of application-based record maintenance, where the user is viewing aggregated results and/or updating individual records, consisting of simple data types, one at a time. ORMs have the extreme advantage over raw SQL in their ability to provide compiler-checked queries using Linq providers; virtually all of the popular ORMs (Linq2SQL, EF, NHibernate, Azure) have a Linq query interface that can catch a lot of "fat fingers" and other common mistakes in queries that you don't catch when using "magic strings" to form SQLCommands. ORMs also generally provide database independence. Classic NHibernate HBM mappings are XML files, which can be swapped out as necessary to point the repository at MSS, Oracle, SQLite, Postgres, and other RDBMSes. Even "fluent" mappings, which are classes in code files, can be swapped out if correctly architected. EF has similar functionality.
So are you asking how to do "X" without doing "X"? ORM is an abstraction and as any other abstraction it has disadvantages but not those you mentioned.
Code (EFv4) can be generated by T4 template and T4 template is a code that can be modified
Generated code is partial class which can be combined with your partial part containing your logic
Writing classes manually is very common case - using designer as available in Entity framework is more rare
Disclaimer: I work at Mindscape that builds the LightSpeed ORM for .NET
As you don't ask about a specific issue, but about approaches to solving the flexibility problem with an ORM I thought I'd chime in with some views from a vendor perspective. It may or may not be of use to you but might give some food for thought :-)
When designing an O/R Mapper it's important to take into consideration what we call "escape hatches". An ORM will inevitably push a certain set of default behaviours which is one way that developer gain productivity gains.
One of the lessons we have learned with LightSpeed has been where developers need those escape hatches. For example, KeithS here states that ORMs are not good for bulk operations - and in most cases this is true. We had this scenario come up with some customers and added an overload to our Remove() operation that allowed you to pass in a query that removes all records that match. This saved having to load entities into memory and delete them. Listening to where developers are having pain and helping solve those problems quickly is important for helping build solid solutions.
All ORMs should efficiently batch queries. Having said that, we have been surprised to see that many ORMs don't. This is strange given that often batching can be done rather easily and several queries can be bundled up and sent to the database at once to save round trips. This is something we've done since day 1 for any database that supports it. That's just an aside to the point of batching made in this thread. The quality of those batches queries is the real challenge and, frankly, there are some TERRIBLE SQL statements being generated by some ORMs.
Overall you should select an ORM that gives you immediate productivity gains (almost demo-ware styled 'see I queried data in 30s!') but has also paid attention to larger scale solutions which is where escape hatches and some of the less demoed, but hugely useful features are needed.
I hope this post hasn't come across too salesy, but I wanted to draw attention to taking into account the thought process that goes behind any product when selecting it. If the philosophy matches the way you need to work then you're probably going to be happier than selecting one that does not.
If you're interested, you can learn about our LightSpeed ORM for .NET.
in my experience you should avoid use ORM when your application do the following data manipulation:
1)Bulk deletes: most of the ORM tools wont truly delete the data, they will mark it with a garbage collect ID (GC record) to keep the database consistency. The worst thing is that the ORM collect all the data you want to delete before it mark it as deleted. That means that if you want to delete 1000000 rows the ORM will first fetch the data, load it in your application, mark it as GC and then update the database;. which I believe is a huge waist of resources.
2)bulk inserts and data import:most of the ORM tools will create business layer validations on the business classes, this is good if you want to validate 1 record but if you are going to insert/import hundreds or even millions of records the process could take days.
3)Report generation: the ORM tools are good to create simple list reports or simple table joins like in a order-order_details scenario. but it most cases the ORM will only slow down the retrieve of the data and will add more joins that you need for a report. that translate in a give more work to the DB engine than you usually do with a SQL approach

With EF, does it make sense to learn DataSet

I'm doing some exercises on Data Access Layer in ASP.NET webforms. It looks to me as if everything I'm trying to do with DataSets, I can do with the Entity Framework. So today, does it make sense to learn DataSets? So many steps with DataSets just to get the same result as EF.
Do you need to know how to ride a bicycle in order to drive a car? Nope. Does it hurt to know both? Nope.
If you're going to use EF you don't need to know datasets; EF is an abstraction built on top of traditional ADO.NET. At the same time, knowing how the underlying ADO connections/readers/etc work doesn't hurt, it helps you understand what is going on underneath.
However, knowing SQL Server, TSQL, and a bit (or more) about how the SQL optimizer works is more important than knowing ADO.NET if you're going to use EF (or any other OR mapper). Although much of that is also "hidden" by OR mappers it is still important to know what goes on "behind the scenes" to avoid making mistakes that can hurt performance etc.

Philosophy of correct working with ORM (Entity Framework )

I'm an old-school database programmer. And all my life i've working with database via DAL and stored procedures. Now i got a requirement to use Entity Framework.
Could you tell me your expirience and architecture best practicies how to work with it ?
As I know ORM was made for programmers who don't know SQL expression. And this is only benefit of ORM. Am I right ?
I got architecture document and I don't know clearly what I shoud do with ORM. I think that my steps should be:
1) Create complete database
2) Create high-level entities in model such "Price" which is realy consists from few database tables
3) Map database tables on entities.
An ORM does a lot more than just allow non-SQL programmers to talk to databases!
Instead of having to deal with loads of handwritten DAL code, and getting back a row/column representation of your data, an ORM turns each row of a table into a strongly-typed object.
So you end up with e.g. a Customer, and you can access its phone number as a strongly-typed property:
string customerPhone = MyCustomer.PhoneNumber;
That is a lot better than:
string customerPhone = MyCustomerTable.Rows[5].Column["PhoneNumber"].ToString();
You get no support whatsoever from the IDE in making this work - be aware of mistyping the column name! You won't find out 'til runtime - either you get no data back, or you get an exception.... no very pleasant.
It's first of all much easier to use that Customer object you get back, the properties are nicely available, strongly-typed, and discoverable in Intellisense, and so forth.
So besides possibly saving you from having to hand-craft a lot of boring SQL and DAL code, an ORM also brings a lot of benefits in using the data from the database - discoverability in your code editor, type safety and more.
I agree - the thought of an ORM generating SQL statements on the fly, and executing those, can be scary. But at least in Entity Framework v4 (.NET 4), Microsoft has done an admirable job of optimizing the SQL being used. It might not be perfect in 100% of the cases, but in a large percentage of the time, it's a lot better than any SQL any non-expert SQL programmer would write...
Plus: in EF4, if you really want to and see a need to, you can always define and use your own Stored procs for INSERT, UPDATE, DELETE on any entity.
I can relate to your sentiment of wanting to have complete control over your SQL. I have been researching ORM usage myself, and while I can't state a case nearly as well as marc_s has, I thought I might chime in with a couple more points.
I think the point of ORM is to shift the focus away from writing SQL and DAL code, and instead focus more on the business logic. You can be more agile with an ORM tool, because you don't have to refactor your data model or stored procedures every time you change your object model. In fact, ORM essentially give you a layer of abstraction, so you can potentially make changes to your schema without affecting your code, and vice-versa. ORM might not always generate the most efficient SQL, but you may benefit in faster development time. For small projects however, the benefits of ORM might not be worth the extra time spent configuring the ORM.
I know that doesn't answer your questions though.
To your 2nd question, it seems to me that many developers on S.O. here who are very skilled in SQL still advocate the use of and themselves use ORM tools such as Hibernate, LINQ to SQL, and Entity Framework. In fact, you still need to know SQL sometimes even if you use ORM, and it's typically the more complicated queries, so your theory about ORM being mainly "for programmers who don't know SQL" might be wrong. Plus you get caching from your ORM layer.
Furthermore, Jeff Atwood, who is the lead developer of S.O. (this site here), claims that he loves SQL (and I'd bet he's very good at it), and he also strives to avoid adding extra tenchnologies to his stack, but yet he choose to use LINQ to SQL to build S.O. Years ago already he claimed that, "Stored Procedures should be considered database assembly language: for use in only the most performance critical situations."
To your 1st question, here's another article from Jeff Atwood's blog that talks about varies ways (including using ORM) to deal with the object-relational impedance mistmatch problem, which helped me put things in perspective. It's also interesting because his opinion of ORM must have changed since then. In the article he said you should, "either abandon relational databases, or abandon objects," as well as, "I tend to err on the side of the database-as-model camp." But as I said, some of the bullet points helped put things into perspective for me.

Manual DAL & BLL vs. ORM

Which approach is better: 1) to use a third-party ORM system or 2) manually write DAL and BLL code to work with the database?
1) In one of our projects, we decided using the DevExpress XPO ORM system, and we ran across lots of slight problems that wasted a lot of our time. Amd still from time to time we encounter problems and exceptions that come from this ORM, and we do not have full understanding and control of this "black box".
2) In another project, we decided to write the DAL and BLL from scratch. Although this implied writing boring code many, many times, but this approach proved to be more versatile and flexible: we had full control over the way data was held in the database, how it was obtained from it, etc. And all the bugs could be fixed in a direct and easy way.
Which approach is generally better? Maybe the problem is just with the ORM that we used (DevExpress XPO), and maybe other ORMs are better (such as NHibernate)?
Is it worth using ADO Entiry Framework?
I found that the DotNetNuke CMS uses its own DAL and BLL code. What about other projects?
I would like to get info on your personal experience: which approach do you use in your projects, which is preferable?
Thank you.
My personal experience has been that ORM is usually a complete waste of time.
First, consider the history behind this. Back in the 60s and early 70s, we had these DBMSes using the hierarchical and network models. These were a bit of a pain to use, since when querying them you had to deal with all of the mechanics of retrieval: follow links between records all over the place and deal with the situation when the links weren't the links you wanted (e.g., were pointing in the wrong direction for your particular query). So Codd thought up the idea of a relational DBMS: specify the relationships between things, say in your query only what you want, and let the DBMS deal with figuring out the mechanics of retrieving it. Once we had a couple of good implementations of this, the database guys were overjoyed, everybody switched to it, and the world was happy.
Until the OO guys came along into the business world.
The OO guys found this impedance mismatch: the DBMSes used in business programming were relational, but internally the OO guys stored things with links (references), and found things by figuring out the details of which links they had to follow and following them. Yup: this is essentially the hierarchical or network DBMS model. So they put a lot of (often ingenious) effort into layering that hierarchical/network model back on to relational databases, incidently throwing out many of the advantages given to us by RDBMSes.
My advice is to learn the relational model, design your system around it if it's suitable (it very frequently is), and use the power of your RDBMS. You'll avoid the impedance mismatch, you'll generally find the queries easy to write, and you'll avoid performance problems (such as your ORM layer taking hundreds of queries to do what it ought to be doing in one).
There is a certain amount of "mapping" to be done when it comes to processing the results of a query, but this goes pretty easily if you think about it in the right way: the heading of the result relation maps to a class, and each tuple in the relation is an object. Depending on what further logic you need, it may or may not be worth defining an actual class for this; it may be easy enough just to work through a list of hashes generated from the result. Just go through and process the list, doing what you need to do, and you're done.
Perhaps a little of both is the right fit. You could use a product like SubSonic. That way, you can design your database, generate your DAL code (removing all that boring stuff), use partial classes to extend it with your own code, use Stored Procedures if you want to, and generally get more stuff done.
That's what I do. I find it's the right balance between automation and control.
I'd also point out that I think you're on the right path by trying out different approaches and seeing what works best for you. I think that's ultimately the source for your answer.
Recently I made the decision to use Linq to SQL on a new project, and I really like it. It is lightweight, high-performance, intuitive, and has many gurus at microsoft (and others) that blog about it.
Linq to SQL works by creating a data layer of c# objects from your database. DevExpress XPO works in the opposite direction, creating tables for your C# business objects. The Entity Framework is supposed to work either way. I am a database guy, so the idea of a framework designing the database for me doesn't make much sense, although I can see the attractiveness of that.
My Linq to SQL project is a medium-sized project (hundreds, maybe thousands of users). For smaller projects sometimes I just use SQLCommand and SQLConnection objects, and talk directly to the database, with good results. I have also used SQLDataSource objects as containers for my CRUD, but these seem clunky.
DALs make more sense the larger your project is. If it is a web application, I always use some sort of DAL because they have built-in protections against things like SQL injection attacks, better handling of null values, etc.
I debated whether to use the Entity Framework for my project, since Microsoft says this will be their go-to solution for data access in the future. But EF feels immature to me, and if you search StackOverflow for Entity Framework, you will find several people who are struggling with small, obtuse problems. I suspect version 2 will be much better.
I don't know anything about nHibernate, but there are people out there who love it and would not use anything else.
You might try using NHibernate. Since it's open source, it's not exactly a black box. It is very versatile, and it has many extensibility points for you to plug in your own additional or replacement functionality.
Comment 1:
NHibernate is a true ORM, in that it permits you to create a mapping between your arbitrary domain model (classes) and your arbitrary data model (tables, views, functions, and procedures). You tell it how you want your classes to be mapped to tables, for example, whether this class maps to two joined tables or two classes map to the same table, whether this class property maps to a many-to-many relation, etc. NHibernate expects your data model to be mostly normalized, but it does not require that your data model correspond precisely to your domain model, nor does it generate your data model.
Comment 2:
NHibernate's approach is to permit you to write any classes you like, and then after that to tell NHibernate how to map those classes to tables. There's no special base class to inherit from, no special list class that all your one-to-many properties have to be, etc. NHibernate can do its magic without them. In fact, your business object classes are not supposed to have any dependencies on NHibernate at all. Your business object classes, by themselves, have absolutely no persistence or database code in them.
You will most likely find that you can exercise very fine-grained control over the data-access strategies that NHibernate uses, so much so that NHibernate is likely to be an excellent choice for your complex cases as well. However, in any given context, you are free to use NHibernate or not to use it (in favor of more customized DAL code), as you like, because NHibernate tries not to get in your way when you don't need it. So you can use a custom DAL or DevExpress XPO in one BLL class (or method), and you can use NHibernate in another.
I recently took part in sufficiently large interesting project. I didn't join it from the beginning and we had to support already implemented architecture. Data access to all objects was implemented through stored procedures and automatically generated wrapper-methods on .NET that returned DataTable objects. The development process in such system was really slow and inefficient. We had to write huge stored procedure on PL/SQL for every query, that could be expressed in simple LINQ query. If we had used ORM, we would have implement project several times faster. And I don't see any advantage of such architecture.
I admit, that it is just particular not very successful project, but I made following conclusion: Before refusing to use ORM think twice, do you really need such flexibility and control over database? I think in most cases it isn't worth wasted time and money.
As others explain, there is a fundamental difficulty with ORM's that make it such that no existing solution does a very good job of doing the right thing, most of the time. This Blog Post: The Vietnam Of Computer Science echoes some of my feelings about it.
The executive summary is something along the lines of the assumptions and optimizations that are incompatible between object and relational models. although early returns are good, as the project progresses, the abstractions of the ORM fall short, and the extra overhead of working around it tends to cancel out the successes.
I have used Bold for Delphi four years now. It is great but it is not available anymore for sale and it lacks some features like databinding. ECO the successor has all that.
No I'm not selling ECO-licenses or something but I just think it is a pity that so few people realize what MDD (Model Driven Development) can do. Ability to solve more complex problems in less time and fewer bugs. This is very hard to measure but I have heard something like 5-10 more efficient development. And as I work with it every day I know this is true.
Maybe some traditional developer that is centered around data and SQL say:
"But what about performance?"
"I may loose control of what SQL is run!"
Well...
If you want to load 10000 instances of a table as fast as possible it may be better to use stored procedures, but most application don't do this. Both Bold and ECO use simple SQL queries to load data. Performance is highly dependent of the number of queries to the database to load a certain amount of data. Developer can help by saying this data belong to each other. Load them as effiecent as possible.
The actual queries that is executed can of course be logged to catch any performance problems. And if you really want to use your hyper optimized SQL query this is no problem as long as it don't update the database.
There is many ORM system to choose from, specially if you use dot.net. But honestly it is very very hard to do a good ORM framework. It should be concentrated around the model. If the model change, it should be an easy task to change database and the code dependent of the model. This make it easy to maintain. The cost for making small but many changes is very low. Many developers do the mistake to center around the database and adapt everthing around that. In my opinion this is not the best way to work.
More should try ECO. It is free to use an unlimited time as long the model is no more than 12 classes. You can do a lot with 12 classes!
I suggest you to use Code Smith Tool for creating Nettiers, that is a good option for developer.