I know some of the differences between LINQ to Entities and LINQ to Objects: the first implements IQueryable and the second implements IEnumerable. My question is scoped to EF 5.
My question is: what are the technical differences between these three methods (AsEnumerable, AsQueryable and ToList)? I see that in many situations all of them work. I also see combinations of them, like .ToList().AsQueryable().
What do those methods mean, exactly?
Is there any performance issue or something that would lead to the use of one over the other?
Why would one use, for example, .ToList().AsQueryable() instead of .AsQueryable()?
There is a lot to say about this. Let me focus on AsEnumerable and AsQueryable and mention ToList() along the way.
What do these methods do?
AsEnumerable and AsQueryable cast or convert to IEnumerable or IQueryable, respectively. I say cast or convert with a reason:
When the source object already implements the target interface, the source object itself is returned but cast to the target interface. In other words: the type is not changed, but the compile-time type is.
When the source object does not implement the target interface, the source object is converted into an object that implements the target interface. So both the type and the compile-time type are changed.
Let me show this with some examples. I've got this little method that reports the compile-time type and the actual type of an object (courtesy Jon Skeet):
void ReportTypeProperties<T>(T obj)
{
Console.WriteLine("Compile-time type: {0}", typeof(T).Name);
Console.WriteLine("Actual type: {0}", obj.GetType().Name);
}
Let's try an arbitrary LINQ to SQL Table<T>, which implements IQueryable:
ReportTypeProperties(context.Observations);
ReportTypeProperties(context.Observations.AsEnumerable());
ReportTypeProperties(context.Observations.AsQueryable());
The result:
Compile-time type: Table`1
Actual type: Table`1
Compile-time type: IEnumerable`1
Actual type: Table`1
Compile-time type: IQueryable`1
Actual type: Table`1
You see that the table class itself is always returned, but its representation changes.
Now an object that implements IEnumerable, not IQueryable:
var ints = new[] { 1, 2 };
ReportTypeProperties(ints);
ReportTypeProperties(ints.AsEnumerable());
ReportTypeProperties(ints.AsQueryable());
The results:
Compile-time type: Int32[]
Actual type: Int32[]
Compile-time type: IEnumerable`1
Actual type: Int32[]
Compile-time type: IQueryable`1
Actual type: EnumerableQuery`1
There it is. AsQueryable() has converted the array into an EnumerableQuery, which "represents an IEnumerable<T> collection as an IQueryable<T> data source." (MSDN).
What's the use?
AsEnumerable is frequently used to switch from any IQueryable implementation to LINQ to objects (L2O), mostly because the former does not support functions that L2O has. For more details see What is the effect of AsEnumerable() on a LINQ Entity?.
For example, in an Entity Framework query we can only use a restricted number of methods. So if, for example, we need to use one of our own methods in a query we would typically write something like
var query = context.Observations.Select(o => o.Id)
.AsEnumerable().Select(x => MySuperSmartMethod(x));
ToList – which converts an IEnumerable<T> to a List<T> – is often used for this purpose as well. The advantage of using AsEnumerable vs. ToList is that AsEnumerable does not execute the query. AsEnumerable preserves deferred execution and does not build an often useless intermediate list.
On the other hand, when forced execution of a LINQ query is desired, ToList can be a way to do that.
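The difference in timing can be sketched without a database. In the snippet below, Source() is a hypothetical stand-in for a query (it is not part of any library), and the runs counter is only there to make each enumeration visible. This is a rough illustration, not a measurement of Entity Framework itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int runs = 0; // how many times the source has been enumerated

// Hypothetical stand-in for a database query: one enumeration = one "round trip".
IEnumerable<int> Source()
{
    runs++; // executes when enumeration starts, not when Source() is called
    yield return 1;
    yield return 2;
    yield return 3;
}

// AsEnumerable only changes the compile-time type; nothing is enumerated yet.
IEnumerable<int> lazy = Source().AsEnumerable().Select(x => x * 10);
int runsAfterAsEnumerable = runs;

// ToList enumerates the source right away and builds a list.
List<int> list = Source().ToList();
int runsAfterToList = runs;

// Only now does the lazy query run.
int lazySum = lazy.Sum();
int runsAfterSum = runs;

Console.WriteLine($"{runsAfterAsEnumerable} {runsAfterToList} {runsAfterSum} {lazySum}");
// prints "0 1 2 60"
```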
AsQueryable can be used to make an enumerable collection accept expressions in LINQ statements. See here for more details: Do i really need use AsQueryable() on collection?.
Note on substance abuse!
AsEnumerable works like a drug. It's a quick fix, but at a cost and it doesn't address the underlying problem.
In many Stack Overflow answers, I see people applying AsEnumerable to fix just about any problem with unsupported methods in LINQ expressions. But the price isn't always clear. For instance, if you do this:
context.MyLongWideTable // A table with many records and columns
.Where(x => x.Type == "type")
.Select(x => new { x.Name, x.CreateDate })
...everything is neatly translated into a SQL statement that filters (Where) and projects (Select). That is, both the length and the width, respectively, of the SQL result set are reduced.
Now suppose users only want to see the date part of CreateDate. In Entity Framework you'll quickly discover that...
.Select(x => new { x.Name, x.CreateDate.Date })
...is not supported (at the time of writing). Ah, fortunately there's the AsEnumerable fix:
context.MyLongWideTable.AsEnumerable()
.Where(x => x.Type == "type")
.Select(x => new { x.Name, x.CreateDate.Date })
Sure, it runs, probably. But it pulls the entire table into memory and then applies the filter and the projections. Well, most people are smart enough to do the Where first:
context.MyLongWideTable
.Where(x => x.Type == "type").AsEnumerable()
.Select(x => new { x.Name, x.CreateDate.Date })
But still all columns are fetched first and the projection is done in memory.
The real fix is:
context.MyLongWideTable
.Where(x => x.Type == "type")
// In EF 5 the same method lives on EntityFunctions; DbFunctions is its EF 6 name.
.Select(x => new { x.Name, CreateDate = DbFunctions.TruncateTime(x.CreateDate) })
(But that requires just a little bit more knowledge...)
What do these methods NOT do?
Restore IQueryable capabilities
Now an important caveat. When you do
context.Observations.AsEnumerable()
.AsQueryable()
you will end up with the source object represented as IQueryable. (Because both methods only cast and don't convert).
But when you do
context.Observations.AsEnumerable().Select(x => x)
.AsQueryable()
what will the result be?
The Select produces a WhereSelectEnumerableIterator. This is an internal .NET class that implements IEnumerable, not IQueryable. So a conversion to another type has taken place and the subsequent AsQueryable can never return the original source anymore.
The implication of this is that using AsQueryable is not a way to magically inject a query provider with its specific features into an enumerable. Suppose you do
var query = context.Observations.Select(o => o.Id)
.AsEnumerable().Select(x => x.ToString())
.AsQueryable()
.Where(...)
The Where condition will never be translated into SQL. AsEnumerable() followed by LINQ statements definitively cuts the connection with the Entity Framework query provider.
I deliberately show this example because I've seen questions here where people for instance try to 'inject' Include capabilities into a collection by calling AsQueryable. It compiles and runs, but it does nothing because the underlying object does not have an Include implementation anymore.
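The loss of the original source can be checked by reference comparison. This is a minimal sketch that uses an in-memory IQueryable as a stand-in for a provider-backed source such as context.Observations (an assumption made so the code runs without a database); the variable names are illustrative only.

```csharp
using System;
using System.Linq;

// In-memory stand-in for a provider-backed queryable.
IQueryable<int> source = new[] { 1, 2, 3 }.AsQueryable();

// Cast only: AsEnumerable and AsQueryable hand back the very same object.
IQueryable<int> roundTrip = source.AsEnumerable().AsQueryable();
bool sameObject = ReferenceEquals(source, roundTrip);

// A LINQ to Objects operator in between creates a new iterator object,
// so AsQueryable has to wrap that iterator in a fresh EnumerableQuery<T>.
IQueryable<int> rewrapped = source.AsEnumerable().Select(x => x).AsQueryable();
bool stillSameObject = ReferenceEquals(source, rewrapped);

Console.WriteLine(sameObject);               // True
Console.WriteLine(stillSameObject);          // False
Console.WriteLine(rewrapped.GetType().Name); // EnumerableQuery`1
```

With a real DbSet the second case is where the Entity Framework provider gets replaced by the in-memory one.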
Execute
Neither AsQueryable nor AsEnumerable executes (or enumerates) the source object. They only change its type or representation. Both interfaces involved, IQueryable and IEnumerable, are nothing but "an enumeration waiting to happen". They are not executed before they're forced to do so, for example, as mentioned above, by calling ToList().
That means that executing an IEnumerable obtained by calling AsEnumerable on an IQueryable object, will execute the underlying IQueryable. A subsequent execution of the IEnumerable will again execute the IQueryable. Which may be very expensive.
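Repeated execution is easy to reproduce in memory. In this sketch Query() is a hypothetical stand-in for the underlying IQueryable and executions counts how often it actually runs; caching with ToList is shown as one possible way to pay the cost only once, assuming the result fits in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int executions = 0;

// Hypothetical stand-in for an expensive underlying query.
IEnumerable<int> Query()
{
    executions++;
    yield return 1;
    yield return 2;
}

// Deferred: every enumeration runs the source again.
IEnumerable<int> deferred = Query().AsEnumerable();
int count = deferred.Count();
int max = deferred.Max();
int executionsDeferred = executions;

// Materialized once: later operations work on the list only.
List<int> cached = Query().ToList();
int cachedCount = cached.Count;
int cachedMax = cached.Max();
int executionsTotal = executions;

Console.WriteLine($"{executionsDeferred} {executionsTotal}"); // prints "2 3"
```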
Specific Implementations
So far, this was only about the Queryable.AsQueryable and Enumerable.AsEnumerable extension methods. But of course anybody can write instance methods or extension methods with the same names (and functions).
In fact, a common example of a specific AsEnumerable extension method is DataTableExtensions.AsEnumerable. DataTable does not implement IQueryable or IEnumerable, so the regular extension methods don't apply.
ToList()
Execute the query immediately
AsEnumerable()
lazy (execute the query later)
Parameter: Func<TSource, bool>
Loads EVERY record into application memory and then handles/filters them there. (E.g. with Where/Take/Skip, it will select * from table1 into memory and then select the first X elements.) (In this case, what happens is LINQ to SQL + LINQ to Objects.)
AsQueryable()
lazy (execute the query later)
Parameter: Expression<Func<TSource, bool>>
Converts the expression into T-SQL (with the specific provider), runs the query remotely and loads the result into application memory.
That's why DbSet (in Entity Framework) also implements IQueryable: to get the efficient query.
Does not load every record; e.g. with Take(5), it will generate select top 5 * SQL in the background. This makes the type friendlier to a SQL database, which is why it usually performs better and is recommended when dealing with a database.
So a query that stays IQueryable usually runs much faster than one switched over with AsEnumerable(), because the T-SQL is generated first and includes all the Where conditions in your LINQ.
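The two parameter types listed above can be put side by side. The sketch below runs purely in memory, so no SQL is generated; it only illustrates that a delegate is opaque compiled code, while an expression tree is data that a provider could inspect and translate. All variable names are illustrative.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

Func<int, bool> compiled = x => x > 1;         // delegate: can only be invoked
Expression<Func<int, bool>> tree = x => x > 1; // expression tree: can be inspected

// The tree exposes its structure, which is what a query provider translates.
string treeText = tree.Body.ToString();
ExpressionType nodeType = tree.Body.NodeType;

int[] numbers = { 1, 2, 3 };

// Enumerable.Where takes the delegate and filters in memory.
int viaEnumerable = numbers.Where(compiled).Count();

// Queryable.Where takes the tree and records the call in the query's Expression.
IQueryable<int> query = numbers.AsQueryable().Where(tree);
bool whereRecorded = query.Expression.ToString().Contains("Where");

Console.WriteLine(treeText);      // (x > 1)
Console.WriteLine(nodeType);      // GreaterThan
Console.WriteLine(viaEnumerable); // 2
Console.WriteLine(whereRecorded); // True
```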
ToList() will bring everything into memory, and then you will be working on it there.
So ToList().Where(apply some filter) is executed locally.
With AsQueryable() on a database-backed query, everything is executed remotely, i.e. a filter on it is sent to the database to be applied.
A queryable doesn't do anything until you execute it. ToList, however, executes immediately.
Also, look at this answer Why use AsQueryable() instead of List()?.
EDIT :
Also, in your case, once you do ToList(), every subsequent operation is local, including AsQueryable(). You can't switch to remote once you start executing locally.
Hope this makes it a little clearer.
I encountered bad performance with the code below.
void DoSomething<T>(IEnumerable<T> objects){
var single = objects.First(); // binds to Enumerable.First: the whole query is executed before the first item is taken
...
}
Fixed with
void DoSomething<T>(IEnumerable<T> objects){
T single;
if (objects is IQueryable<T>)
single = objects.AsQueryable().First(); // SELECT TOP (1) ... is used
else
single = objects.First();
}
For an IQueryable, stay in IQueryable when possible; try not to use it like an IEnumerable.
Update. It can be further simplified in one expression, thanks Gert Arnold.
T single = objects is IQueryable<T> q
? q.First()
: objects.First();
I am trying to duplicate the following SQL statement as a LINQ to Entities query (where "PRODUCTS" is the table mapped to the entity) ... NOTE IQueryable ... most of what I have seen posted as solutions either convert the search parameters or dump the results into an IEnumerable and then proceed to convert from there. I am dealing with hundreds of millions of records and cannot afford to load 200 million records into memory, only to have to filter through them again. I would like, if possible, to do this in a single query to the database.
select *
from PRODUCTS
where
MODEL_CODE = '65' and
CAST(SERIAL_NUMBER as int) > 927000 and
CAST(SERIAL_NUMBER as int) < 928000
I have tried the following ...
int startSN, endSN;
startSN = 9500;
endSN = 9500;
if (!int.TryParse(startSerialNumber, out startSN))
throw new InvalidCastException("The start serial number was not a valid value");
if (!int.TryParse(endSerialNumber, out endSN))
throw new InvalidCastException("The end serial number was not a valid value");
IQueryable<PRODUCT> resultList = base.Context.PRODUCTS.Where(b =>
(Convert.ToInt32(b.SERIAL_NUMBER) > startSN) &&
(Convert.ToInt32(b.SERIAL_NUMBER) < endSN)).AsQueryable();
I have tried a couple of other versions of things similar to this with no luck. I have also looked at the following posts, with no luck.
Convert string to int in an Entity Framework linq query and handling the parsing exception - the solution converts query to a list before converting the entity properties.
Convert string to Int in LINQ to Entities ? - This problem was just with converting the parameters, which can easily be done outside the LINQ to Entities statement. I am already doing this for the parameters.
LINQ to Entities StringConvert(double)' cannot be translated to convert int to string - This problem is actually the reverse of mine, trying to convert an int to a string. 1) SqlFunctions does not provide a function for converting TO an int. 2) Ultimately the solution is to, again convert to an IEnumerable before converting/casting the values.
Anybody got any other ideas? I am little stumped on this one!
Thank you,
G
If you don't use code-first but an EDMX-based approach, model-defined functions are probably the best solution: Convert String to Int in EF 4.0
Alternatively you can use...
base.Context.PRODUCTS.SqlQuery(string sql, params object[] parameters)
...and then pass in the raw SQL statement from your question.
DbSet<T>.SqlQuery(...) returns a DbSqlQuery<T> as result. It is important to keep in mind that this type does not implement IQueryable<T>, but only IEnumerable<T>. Its signature is:
public class DbSqlQuery<TEntity> : IEnumerable<TEntity>, IEnumerable, IListSource
where TEntity : class
So you can extend this result with further LINQ methods, but it is only LINQ to Objects that will be executed in memory with the returned result set from the SQL query. You can not extend it with LINQ to Entities that would be executed in the database. Hence, adding .Where filters to DbSqlQuery<T> does not have any influence on the database query and the set of data that is loaded from the DB into memory.
That's actually not surprising as it would mean otherwise that a partial expression tree (from a Where method) had to be translated into SQL and then merged into a hand-written SQL statement so that a correct new composed SQL statement results and could be sent to the database. Sounds like a pretty hard task to me.
(as advised re-posting this question here... originally posted in msdn forum)
I am striving to write a "generic" routine for some simple CRUD operations using EF/Linq to Entities. I'm working in ASP.NET (C# or VB).
I have looked at:
Getting a reference to a dynamically selected table with "GetObjectByKey" (But I don't want anything from cache. I want data from database. Seems like not what this function is intended for).
CRM Dynamic Entities (here you can pass a tablename string to query) looked like the approach I am looking for but I don't get the idea that this CRM effort is necessarily staying current (?) and/or has much assurance for the future??
I looked at various ways of drilling thru Namespaces/Objects to get to where I could pass a TableName parameter into the oft used query syntax var query = (from c in context.C_Contacts select c); (for example) where somehow I could swap out the "C_Contacts" TEntity depending on which table I want to work with. But not finding a way to do this ??
Slightly over-simplifying, I just want to be able to pass a tablename parameter and in some cases some associated fieldnames and values (perhaps in a generic object?) to my routine and then let that routine dynamically plug into the LINQ to Entities data context/model and do some standard "select all" operations for the parameter table, or do a delete on the parameter table based on a generic record id. I'm trying to avoid calling the various automatically generated L2E methods based on tablename etc.; instead I'm trying to drill into the data context and ultimately the L2E query syntax for dynamically passed table/field names.
Has anyone found any successful/efficient approaches for doing this? Any ideas, links, examples?
The DbContext object has a generic Set() method. This will give you
from c in context.Set<Contact>() select c
Here's a method for when you start from a string:
public void Test()
{
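dynamic entity = null;
// Type.GetType needs a namespace-qualified name (assembly-qualified if the type lives in another assembly)
Type type = Type.GetType("Contract");
entity = Activator.CreateInstance(type);
// dynamic dispatch binds TEntity to the run-time type of entity
ProcessType(entity);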
}
public void ProcessType<TEntity>(TEntity instance)
where TEntity : class
{
var result =
from item in this.Set<TEntity>()
select item;
//do stuff with the result
//passing back to the caller can get more complicated
//but passing it on will be fine ...
}
I'm looking at the SQL generated when performing simple select queries. I'm using code first with the sample blog context from NuGet.
If the following is run:
BlogContext _context = new BlogContext();
var comments = _context.Comments.Select(c => new CommentReadOnly {Author = c.Author});
var count = comments.Count();
The following SQL is produced:
SELECT
[GroupBy1].[A1] AS [C1]
FROM ( SELECT
COUNT(1) AS [A1]
FROM [dbo].[Comments] AS [Extent1]
) AS [GroupBy1]
Where the count is performed in the SQL which is expected.
However if I change the code to look like this:
BlogContext _context = new BlogContext();
var comments = _context.Comments.Select(c => new CommentReadOnly {Author = c.Author});
var count = comments.Count();
private CommentReadOnly ToCommentReadOnly(Comment comment)
{
return new CommentReadOnly
{
Author = comment.Author,
};
}
The following SQL is produced:
SELECT
[Extent1].[ID] AS [ID],
[Extent1].[PostID] AS [PostID],
[Extent1].[Text] AS [Text],
[Extent1].[Author] AS [Author]
FROM [dbo].[Comments] AS [Extent1]
With the count done in code.
The reason (I think) is that the first is returned as IQueryable whereas the second is IEnumerable.
Is it possible to return the second query as IQueryable without executing the SQL?
The reason I ask is that I'm creating a generic repository layer that can query my entities and convert them to the required type (in the example above comment might have a couple of different 'readonly' objects). I don't want the SQL executing so early as paging may be done or other filtering in different situations.
I don't see any difference in those two queries. However, I guess you want to return a IQueryable object to the client so the client can perform further filtering and get the count from there.
You can simply return the object without doing the select and let the client do the rest.
return _context.Comments;
The client can perform additional filtering on this IQueryable object
I think in your second query you execute the function ToCommentReadOnly(), so this can't be done entirely in SQL and you end up with LINQ to Objects (IEnumerable).
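One way this can happen, sketched in memory and only as an assumption about what the second query looked like: if a compiled helper is passed to Select as a method group, the compiler cannot turn it into an expression tree, so it picks Enumerable.Select instead of Queryable.Select. Increment below is a hypothetical helper standing in for a mapping method such as ToCommentReadOnly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical compiled helper, standing in for a mapping method.
int Increment(int x) => x + 1;

IQueryable<int> ids = new[] { 1, 2 }.AsQueryable();

// A lambda converts to an expression tree, so Queryable.Select is chosen.
object viaLambda = ids.Select(x => x + 1);

// A method group converts only to a delegate, so Enumerable.Select is chosen.
object viaMethodGroup = ids.Select(Increment);

bool lambdaIsQueryable = viaLambda is IQueryable<int>;
bool methodGroupIsQueryable = viaMethodGroup is IQueryable<int>;

Console.WriteLine(lambdaIsQueryable);      // True
Console.WriteLine(methodGroupIsQueryable); // False
```

Once the result is a plain IEnumerable, a later Count() is evaluated in memory rather than by the database.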
But you state that you want to return an IQueryable from your Repository. This is not a recommended practice! The code to access the data should be hidden inside your repository, otherwise you will run into problems.
Say for example that your repository (which encapsulates your ObjectContext) goes out of scope after which you try to enumerate the IQueryable result the repository gave you. This will throw an error because the IQueryable can't be executed anymore.
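The out-of-scope problem can be imitated without Entity Framework. In this sketch the contextDisposed flag and QueryContext() are hypothetical stand-ins for an ObjectContext and a query against it; the point is only that a deferred result fails after disposal while a materialized one keeps working.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

bool contextDisposed = false;

// Hypothetical stand-in for a context-backed query: enumerating after disposal fails.
IEnumerable<int> QueryContext()
{
    if (contextDisposed) throw new ObjectDisposedException("context");
    yield return 1;
    yield return 2;
}

// Returned to the caller without being executed.
IEnumerable<int> deferred = QueryContext().Where(x => x > 1);

// Executed while the "context" is still alive.
List<int> materialized = QueryContext().Where(x => x > 1).ToList();

contextDisposed = true; // the repository and its context go out of scope

bool deferredFailed = false;
try { deferred.ToList(); }
catch (ObjectDisposedException) { deferredFailed = true; }

Console.WriteLine(deferredFailed);     // True
Console.WriteLine(materialized.Count); // 1
```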
If you expose an IQueryable from your Repository you give the end user of your Repository all the freedom they want in building their own queries, which is the thing you want to avoid by adding a repository!
So returning an IEnumerable from your Repository is a good thing :)
In a repository, I do this:
public AgenciesDonor FindPrimary(Guid donorId) {
return db.AgenciesDonorSet.Include("DonorPanels").Include("PriceAdjustments").Include("Donors").First(x => x.Donors.DonorId == donorId && x.IsPrimary);
}
then down in another method in the same repository, this:
AgenciesDonor oldPrimary = this.FindPrimary(donorId);
In the debugger, the resultsview shows all records in that table, but:
oldPrimary.Count();
is 1 (which it should be).
Why am I seeing all table entries retrieved, and not just 1? I thought row filtering was done in the DB.
If db.EntitySet really does fetch everything to the client, what's the right way to keep the client data-lite using EF? Fetching all rows won't scale for what I'm doing.
You will see everything if you hover over the AgenciesDonorSet because LINQ to Entities (or SQL) uses delayed execution. When the query is actually executed, it is just retrieving the count.
If you want to view the SQL being generated for any query, you can add this bit of code:
var query = queryObj as ObjectQuery; //assign your query to queryObj rather than returning it immediately
if (query != null)
{
System.Diagnostics.Trace.WriteLine(context);
System.Diagnostics.Trace.WriteLine(query.ToTraceString());
}
Entity Set does not implement IQueryable, so the extension methods that you're using are IEnumerable extension methods. See here:
http://social.msdn.microsoft.com/forums/en-US/linqprojectgeneral/thread/121ec4e8-ce40-49e0-b715-75a5bd0063dc/
I agree that this is stupid, and I'm surprised that more people haven't complained about it. The official reason:
The design reason for not making EntitySet IQueryable is because there's not a clean way to reconcile Add\Remove on EntitySet with IQueryable's filtering and transformation ability.