Need to understand IN, SELECT, VALUE, FROM, in Entity Framework

Need to understand IN, SELECT, VALUE, FROM, in Entity Framework - entity-framework

Working on this EF tutorial, I've difficult to understand the meaning of the following markups
<asp:EntityDataSource ID="CoursesEntityDataSource" runat="server"
...
Where="#PersonID IN (SELECT VALUE instructor.PersonID FROM it.People AS instructor)">
...
</asp:EntityDataSource>
So what's this, pure SQL or Linq 2 Entity?
For what I understand,
it.people is people object that comes out of the query
FROM and AS makes sense in a pure SQL point of view
Taken together, they don't really make sense to me, and the tutorial didn't give much information.
Thanks for helping

This is another part of Entity Framework called Object Services. It allows you to perform the same object queries but without using LINQ to Entities. Generally it's used when you want to stream query results in DataReaders or in very limited scenarios where LINQ to Entities can't give you what you need.

Related

Why does EF 4.1 not support complex queries as well as linq-to-sql?

I am in the process of converting our internal web application from Linq-To-Sql to EF CodeFirst from an existing database. I have been getting annoyed with Linq-To-Sql's limitations more and more lately, and having to update the edmx after updating a very intertwined database table finally frustrated me enough to switch to EF.
However, I am encountering several situations where using linq with Linq-To-Sql is more powerful than the latest Entity Framework, and I am wondering if anyone knows the reasoning for it? Most of this seems to deal with transformations. For example, the following query works in L2S but not in EF:
var client = (from c in _context.Clients
where c.id == id
select ClientViewModel.ConvertFromEntity(c)).First();
In L2S, this correctly retrieves a client from the database and converts it into a ClientViewModel type, but in EF this exceptions saying that Linq to Entities does not recognize the method (which makes sense as I wrote it.
In order to get this working in EF I have to move the select to after the First() call.
Another example is my query to retrieve a list of clients. In my query I transform it into an anonymous structure to be converted into JSON:
var clients = (from c in _context.Clients
orderby c.name ascending
select new
{
id = c.id,
name = c.name,
versionString = Utils.GetVersionString(c.ProdVersion),
versionName = c.ProdVersion.name,
date = c.prod_deploy_date.ToString()
})
.ToList();
Not only does my Utils.GetVersionString() method cause an unsupported method exception in EF, the c.prod_deploy_date.ToString() causes one too and it's a simple DateTime. Like previously, in order to fix it I had to do my select transformation after ToList().
Edit: Another case I just came across is that EF cannot handle where clauses that compare entities where as L2S has no issues for with it. For example the query
context.TfsWorkItemTags.Where(x => x.TfsWorkItem == TfsWorkItemEntity).ToList()
throws an exception and instead I have to do
context.TfsWorkItemTags.Where(x => x.TfsWorkItem.id == tfsWorkItemEntity.id).ToList()
Edit 2: I wanted to add another issue that I found. Apparently you can't use arrays in EF Linq queries, and this probably annoys me more than anything. So for example, right now I convert an entity that denotes a version into an int[4] and try to query on it. In Linq-to-Sql I used the following query:
return context.ReleaseVersions.Where(x => x.major_version == ver[0] && x.minor_version == ver[1]
&& x.build_version == ver[2] && x.revision_version == ver[3])
.Count() > 0;
This fails with the following exception:
The LINQ expression node type 'ArrayIndex' is not supported in LINQ to Entities.
Edit 3: I found another instance of EF's bad Linq implementation. The following is a query that works in L2S but doesn't in EF 4.1:
DateTime curDate = DateTime.Now.Date;
var reqs = _context.TestRequests.Where(x => DateTime.Now > (curDate + x.scheduled_time.Value)).ToList();
This throws an ArgumentException with the message DbArithmeticExpression arguments must have a numeric common type.
Why does it seem like they downgraded the ability for Linq queries in EF than in L2S?

Edit (9/2/2012): Updated to reflect .NET 4.5 and added few more missing features
This is not the answer - it cannot be because the only qualified person who can answer your question is probably a product manager from ADO.NET team.
If you check feature set of old datasets then linq-to-sql and then EF you will find that critical features are removed in newer APIs because newer APIs are developed in much shorter times with big effort to deliver new fancy features.
Just list of some critical features available in DataSets but not available in later APIs:
Batch processing
Unique keys
Features available in Linq-to-Sql but not supported in EF (perhaps the list is not fully correct, I haven't used L2S for a long time):
Logging database activity
Lazy loaded properties
Left outer join (DefaultIfEmpty) since the first version (EF has it since EFv4)
Global eager loading definitions
AssociateWith - for example conditions for eager loaded data
Code first since the first version
IMultipleResults supporting stored procedures returning multiple result sets (EF has it in .NET 4.5 but there is no designer support for this feature)
Support for table valued functions (EF has this in .NET 4.5)
And some others
Now we can list features available in EF ObjectContext API (EFv4) and missing in DbContext API (EFv4.1):
Mapping stored procedures
Conditional mapping
Mapping database functions
Defining queries, QueryViews, Model defined functions
ESQL is not available unless you convert DbContext back to ObjectContext
Manipulating state of independent relationships is not possible unless you convert DbContext back to ObjectContext
Using MergeOption.OverwriteChanges and MergeOption.PreserveChanges is not possible unless you convert DbContext back to ObjectContext
And some others
My personal feeling about this is only big sadness. Core features are missing and features existing in previous APIs are removed because ADO.NET team obviously doesn't have enough resources to reimplement them - this makes migration path in many cases almost impossible. The whole situation is even worse because missing features or migration obstacles are not directly listed (I'm afraid even ADO.NET team doesn't know about them until somebody reports them).
Because of that I think that whole idea of DbContext API was management failure. At the moment ADO.NET team must maintain two APIs - DbContext is not mature to replace ObjectContext and it actually can't because it is just a wrapper and because of that ObjectContext cannot die. Resources available for EF development was most probably halved.
There are more problems related. Once we leave ADO.NET team and look on the problem from the perspective of MS product suite we will see so many discrepancies that I sometimes even wonder if there is any global strategy.
Simply live with the fact that EF's provider works in different way and queries which worked in Linq-to-sql don't have to work with EF.

A little late to the game, but I found this post while searching for something else, and figured that I'd post an answer to the fundamental questions in the original post, which mostly boil down to "LINQ to SQL allows Expression [x], but EF doesn't".
The answer is that the query provider (the code that translates your LINQ expression tree into something that actually executes and returns an enumerable set of stuff) is fundamentally different between L2S and EF. To understand why, you have to realize that another fundamental difference between L2S and EF is that L2S is table-based and EF is entity-model-based. In other words, EF works with conceptual entity models, and knows that the underlying physical model (the DB tables) don't necessarily reflect conceptual entities. This is because tables are normalized/denormalized, and have weird ways of dealing with entity type generalization (inheritance). So in EF, you have a picture of the conceptual model (which is the objects you code against in VB/C#, etc.) and a mapping to the physical underlying table(s) that comprise your conceptual entities. L2S does not do this. Every "model entity" in L2S is strictly a single table, with exactly the table fields as-is.
So far, that in and of itself doesn't really explain the problems in the original post, but hopefully, you can begin to appreciate that fundamentally, EF is not L2S+ or L2S v4.0. It's a very different kind of product (a real ORM) even though there is some coincidental overlap in the fact that both use LINQ to get at database data.
One other interesting difference is that EF was built from the ground up to be DB-agnostic, whereas L2S only works against MS SQL Server (although anyone who's sniffed around the L2S code deep enough will see that there are some underpinnings to allow different DBs, but in the end, it was tied to MS SQL Server only). This difference also plays a big role in why some expressions work in L2S LINQ, but not in EF LINQ. EF's query provider deals with canonical DB expressions, which in plain english means LINQ expressions that have SQL query equivalents in nearly all relational databases out there. The underlying EF engine (query provider) translates the LINQ expressions to these canonical DB expressions, then hands the canonical expression tree off to a specific DB provider (say Oracle's or MySQL's EF provider) where it is translated to product-specific SQL. You can see here how these canonical expressions are supposed to be translated by the individual product-specific providers: http://msdn.microsoft.com/en-us/library/ee789836.aspx
Additionally, EF allows some product-specific DB functions (store functions) as expressions through extensions. The underlying product-specific providers are responsible for both providing and translating these.
That being the case, EF only allows expressions that are DB canonical expressions, or store-specific functions, becuase all expressions in the tree are converted to SQL for execution against the DB.
The difference with L2S is that L2S passes off any expressions that it can to the DB from its limited SQL generator, and then executes any expressions it can't translate to SQL on the materialized object set that is returned. While this makes it look simpler to use L2S, what you don't see is that half your expressions don't actually make it to the DB as SQL and this can cause some really inefficient queries bringing back larger sets of data which are then iterated again in CLR memory with regular object LINQ acting against them for the other expressions which L2S can't turn into SQL.
You get the exact same effects in EF by using EF to return the materialized data to object sets in memory, and then using additional LINQ statements on that set in memory - just as L2S does, but in this case, you just have to do it explicitly, just like when you say you have to call .First() before using a non-DB-canonical expression. Similarly, you can call .ToArray() or .ToList() before using additional expressions that can't be turned into SQL.
One other big difference is that in EF, entities must be used whole. Real model entities represent conceptual objects that are transacted on in whole. You never have half a User, for example. The User is an object whose state depends on all fields. If you want to return partial entities, or a flattened join of multiple entities, you have to define a projection (what EF calls a Complex Type), or you can use some of the new 4.1/4.2/4.3 POCO features.

Now that Entity Framework is open source, it's easy enough to see, especially from the comments on Issues from the team, that one of the goals is to provide EF as an open layer on top of multiple databases. To be fair, Microsoft has unsurprisingly only implemented that layer on top of their SQL Server, but there are other implementations, like DevArt's MySql EF Connector.
As part of that goal, it's wise to keep the public interface somewhat limited, and attempting to add an additional layer that asks - well, some of this might be done in memory, some of this might be done in SQL, who knows - definitely complicates the job for other implementers trying to tie EF into this or that database.
So, I agree with the other answer here - you'd have to ask the team - but you can also get a lot of info about that team's direction on the public bug tracker and their other publications, and this seems like a clear motivation.
That said the main difference between LINQ to SQL and EF is the way EF throws an exception on code that has to be run in memory, and if you're an Expressions ninja there's nothing stopping you from going the next step to wrap the DbContext class and make that work just like LINQ to SQL. On the other hand, what you gain there is a mixed bag - you make it implicit rather than explicit when the SQL is generated, and when it fires, and that can be viewed as a loss of performance and control in exchange for flexibility/ease of authoring.

Way to make EF include method generate inner joins?

Using EF4. I have a situation where a linq to EF query with explicit joins won't work due to a many-to-many table. I'm not going to break the database design to suit EF, so I want to use the include method. However, that always seems to generate left outer joins and I need inner joins (simple context.Table1s.Include("Table2") where tables are 1-to-1 or 1-to-many will demonstrate the problem).
Any way to force inner joins?

There is no way to make EF generate queries a certain way unfortunately.
However when you add the 3 entities you should only see 2 of them in the designer, the 3rd one is a "relation". So if you have for example the entities people, books and loans, you should only see people and books and people should have the navigation property called Books(), which returns all books related (loans) to that person.
And if you wish you could do an .Include("Books") on that navigation property aswell.

You can control it by writing better LINQ
I've asked a very similar question that got me a usable answer that provided a solution so I was able to translate a query to an inner join hence speeding up query execution time considerably. Especially the second solution in the accepted answer is the one I used afterwards, because I didn't really want to use eSQL since I don't really like magic strings. It would put me back 10 years in the past.

What are the pros/cons of returning POCO objects from a Repository rathen than EF Entities?

Following the way Rob does it, I have the classes that are generated by the Linq to SQL wizard, and then a copy of those classes that are POCOs. In my repositories I return these POCOs rather than the Linq to SQL models:
return from c in DataContext.Customer
where c.ID == id
select new MyPocoModels.Customer { ID = c.ID, Name = c.Name }
I understand that the benefit of this is that the POCO models can be instantiated easier so this will make my code more testable.
I'm now moving from Linq to SQL over to Entity Framework and I'm about half way through an EF book. It seems there's a lot of goodness I'm going to lose out on by returning POCOs from my repositories rather than the EF entities.
I still haven't really embraced unit testing, so I feel like I'm wasting a lot of time creating these extra POCOs and writing the code to populate them, when all I appear to be gaining is testable code, yet I'm also gonna lose out on a lot of the benefits of the EF by not being able to track my objects.
Does anyone have any advice for a relative newb to all this ORM/Repository stuff?
Anthony

Another reason people don't like the auto-generated objects (in LINQ to SQL for example) is because of their built-in "magic".
Usually the magic is invisible and you never notice it, but when you try to do things like serialize one of those objects and then deserialize it (for example when using web services) its internal connection to the data source is broken and special hacks need to be employed to "put the magic back in".
With POCOs, you don't have to worry about those sorts of things and can get a better separation between your data and service layers. The downside of course is that you have to write lots of boring POCO -> magic object and magic object -> POCO conversion code. But in the end I think it's usually worth it, especially for large or complex projects.

The main reason is that a lot of people like to develop their model with a specific mindset: like DDD for instance. They might want to use a specific pattern (like Spec or State) for things like statuses (instead of enums) - or you might want to use a Factory for instantiation.
OO breaks when you try to use Tables as Objects when things get more complex. Simple sites work OK - but when you get to big big things, it gets ugly.
So - as always - it depends what you think your project will turn into.

My experience is that when you start writting some complex queries .Include method is worthless and you will find yourself either:
a) Writting a lot of queries to get the data you want or
b) abusing of annonymous types to load the data in a single query and then writting a lot of code just to pass that data to your entities.
POCOs are the way to go, IMHO.

Challenge: Getting Linq-to-Entities to generate decent SQL without unnecessary joins

I recently came across a question in the Entity Framework forum on msdn:
http://social.msdn.microsoft.com/Forums/en-US/adodotnetentityframework/thread/bb72fae4-0709-48f2-8f85-31d0b6a85f68
The person who asked the question tried to do a relatively simple query, involving two tables, a grouping, order by, and an aggregation using Linq-to-Entities. A pretty straightforward Linq query, and straightforward to do in SQL as well - the kind of stuff people try to do every day.
However, when using Linq-to-Entities the outcome is a complex query with lots of unnecessary joins etc. I tried it and wasn't able to get Linq-to-Entities to generate a decent SQL query from it if using just pure Linq against the EF entities.
Having seen a fair share of monster queries from EF I thought maybe the OP (and me, and others) are doing something wrong. Maybe there is a better way to do this?
So here's my challenge: using the example from the EF forum and using just Linq-to-Entities against the two entities, is it possible to get EF to generate a SQL query without unnecessary joins and other complexities?
I'd like to see EF generate something a little bit closer to what Linq-to-SQL does for the same kind of queries, while still using Linq against a EF model.
Restrictions: use EFv1 .net 3.5 SP1 or EFv4 (beta 1 is part of the VS2010/.net4 beta available for download from Microsoft). No CSDL->SSDL mapping tricks, model 'definingqueries', stored procs, db-side functions, or views allowed. Just plain 1:1 mapping between the model and the db and a pure L2E query that does what the original thread on MSDN asked. An association must exist between the two entities (i.e. my "workaround #1" answer to the original thread is not a valid workaround)
Update: 500pt bounty added. Have fun.
Update: As mentioned above, a solution that uses EFv4 / .net 4 (β1 or later) is of course eligible for the bounty. If you're using .net 4 post β1, please include build number (e.g. 4.0.20605), the L2E query you used, and the SQL it generated and sent to the DB.
Update: This issue has been fixed in VS2010 / .net 4 beta 2. Although the generated SQL still has a couple of [relatively harmless] extra levels of nesting, it doesn't do any of the nutty stuff it used to. The final execution plan after SQL Server's optimizer has had a go at it is now as good as it can be. +++ for the dudes and dudettes responsible for the SQL generating part of EFv4...

If I was that worried about the crazy SQL, I just wouldn't do any of the grouping in the database. I would first query all of the data I needed by finishing it off with a ToList() while using the Include function to load all the data in a single select.
Here's my final result:
var list = from o in _entities.orderT.Include("personT")
.Where(p => p.personT.person_id == person_id &&
p.personT.created >= fromTime &&
p.personT.created <= toTime).ToList()
group o by new { o.name, o.personT.created.Year, o.personT.created.Month, o.personT.created.Day } into g
orderby g.Key.name
select new { g.Key, count = g.Sum(x => x.price) };
This results in a much simpler select:
SELECT
1 AS [C1],
[Extent1].[order_id] AS [order_id],
[Extent1].[name] AS [name],
[Extent1].[created] AS [created],
[Extent1].[price] AS [price],
[Extent4].[person_id] AS [person_id],
[Extent4].[first_name] AS [first_name],
[Extent4].[last_name] AS [last_name],
[Extent4].[created] AS [created1]
FROM [dbo].[orderT] AS [Extent1]
LEFT OUTER JOIN [dbo].[personT] AS [Extent2] ON [Extent1].[person_id] = [Extent2].[person_id]
INNER JOIN [dbo].[personT] AS [Extent3] ON [Extent1].[person_id] = [Extent3].[person_id]
LEFT OUTER JOIN [dbo].[personT] AS [Extent4] ON [Extent1].[person_id] = [Extent4].[person_id]
WHERE ([Extent1].[person_id] = #p__linq__1) AND ([Extent2].[created] >= #p__linq__2) AND ([Extent3].[created] <= #p__linq__3)
Additionally, with the example data provided, SQL Profiler only notices a 3 ms increase in duration of the SQL call.
Personally, I think that anyone that whines about not liking the output SQL of an ORM layer should go back to using Stored Procedures and Datasets. They simply aren't ready to evolve yet, and need to spend a few more years in the proverbial oven. :)

Interesting discussion. I have used 2 ORM models so far (NHibernate and LINQ-to-Entities). In my experience, there is always a line where you have to give up on ORM to generated SQL and resort back to stored procedures or views to achieve best scalable queries. Having said that, I personally think that LINQ works better on more normalized databases and all the nested queries/joins are not a major issue. There are some cases where, in order to increase performance or scalability, you have to use DB server features (indexed views for example on SQL 2008 SE works only with query hints) and you simply cannot use an ORM (except iBatis?).
Granted that you won't get the best performance or scalability by using these nested joins/queries generated by linq but please don't forget the advantages and development benefits given by LINQ (or NHibernate) in any project. Surely there must be some merit to it.
Finally, although I risk comparing apple and oranges but isn't think more like asking: Do you want rapid website development (asp.net webforms, swing) or more control on your HTML (asp.net mvc, RoR)? pick the thing that best suits your requirements.
My 2 cents!

The SQL that linq generates is very efficient. It may look bulky but it takes into account relations on tables and constraints etc. In my opinion you should just blindly use the linq commands and not worry about scale. There are benefits of the large queries as its automatically generated. It avoids any slip ups in relational constraints and adds its own wrappers for faults/exceptions.
If however you want to write the SQL's yourself and still want to work behind the confines of an ORM, then try iBatishttp://ibatis.apache.org/ You have to write the SQL's and joins yourself, so it gives you complete control over the backend model.
Personally, just use SQLMetal and linq. Dont worry about performance and scale, unless you need to.

Performance of Linq to Entities vs ESQL

When using the Entity Framework, does ESQL perform better than Linq to Entities?
I'd prefer to use Linq to Entities (mainly because of the strong-type checking), but some of my other team members are citing performance as a reason to use ESQL. I would like to get a full idea of the pro's/con's of using either method.

The most obvious differences are:
Linq to Entities is strongly typed code including nice query comprehension syntax. The fact that the “from” comes before the “select” allows IntelliSense to help you.
Entity SQL uses traditional string based queries with a more familiar SQL like syntax where the SELECT statement comes before the FROM. Because eSQL is string based, dynamic queries may be composed in a traditional way at run time using string manipulation.
The less obvious key difference is:
Linq to Entities allows you to change the shape or "project" the results of your query into any shape you require with the “select new{... }” syntax. Anonymous types, new to C# 3.0, has allowed this.
Projection is not possible using Entity SQL as you must always return an ObjectQuery<T>. In some scenarios it is possible use ObjectQuery<object> however you must work around the fact that .Select always returns ObjectQuery<DbDataRecord>. See code below...
ObjectQuery<DbDataRecord> query = DynamicQuery(context,
"Products",
"it.ProductName = 'Chai'",
"it.ProductName, it.QuantityPerUnit");
public static ObjectQuery<DbDataRecord> DynamicQuery(MyContext context, string root, string selection, string projection)
{
ObjectQuery<object> rootQuery = context.CreateQuery<object>(root);
ObjectQuery<object> filteredQuery = rootQuery.Where(selection);
ObjectQuery<DbDataRecord> result = filteredQuery.Select(projection);
return result;
}
There are other more subtle differences described by one of the team members in detail here and here.

ESQL can also generate some particularly vicious sql. I had to track a problem with such a query that was using inherited classes and I found out that my pidly-little ESQL of 4 lines got translated in a 100000 characters monster SQL statetement.
Did the same thing with Linq and the compiled code was much more managable, let's say 20 lines of SQL.
Plus, what other people mentioned, Linq is strongly type, although very annoying to debug without the edit and continue feature.
AD

Entity-SQL (eSQL) allows you to do things such as dynamic queries more easily than LINQ to Entities. However, if you don't have a scenario that requires eSQL, I would be hesitant to rely on it over LINQ because it will be much harder to maintain (e.g. no more compile-time checking, etc).
I believe LINQ allows you to precompile your queries as well, which might give you better performance. Rico Mariani blogged about LINQ performance a while back and discusses compiled queries.

nice graph showing performance comparisons here:
Entity Framework Performance Explored
not much difference seen between ESQL and Entities
but overall differences significant in using Entities over direct Queries
Entity Framework uses two layers of object mapping (compared to a single layer in LINQ to SQL), and the additional mapping has performance costs. At least in EF version 1, application designers should choose Entity Framework only if the modeling and ORM mapping capabilities can justify that cost.

The more code you can cover with compile time checking for me is something that I'd place a higher premium on than performance. Having said that at this stage I'd probably lean towards ESQL not just because of the performance, but it's also (at present) a lot more flexible in what it can do. There's nothing worse than using a technology stack that doesn't have a feature you really really need.
The entity framework doesn't support things like custom properties, custom queries (for when you need to really tune performance) and does not function the same as linq-to-sql (i.e. there are features that simply don't work in the entity framework).
My personal impression of the Entity Framework is that there is a lot of potential, but it's probably a bit to "rigid" in it's implementation to use in a production environment in its current state.

For direct queries I'm using linq to entities, for dynamic queries I'm using ESQL. Maybe the answer isn't either/or, but and/also.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse