Entity framework get customers by zip code - entity-framework

I have Entity Framework 5 set up, with a Customers table in the database. What would be the most efficient way to get the customers in a given zip code, for example 94023? I have this:
var customersOfLosAltos =
(myDbContext.CreateObjectSet<Customer>()).Where(c=>c.Zip == "94023");
But, intuitively, that seems pretty inefficient because, as I understand it, it basically retrieves all customers from the data source and then filters them by the given zip. That might be OK if I only have a few hundred customers, but what if I have a million customers?
Any thoughts? Thanks.

as I understand it, it basically retrieves all customers from the data source, and then filter it out by the given zip.
Your understanding is wrong. Entity Framework turns your code into a SQL query, so what the server actually returns is the result of the query
select * from Customer where Zip = '94023'
If you changed your code to
var customers = myDbContext.CreateObjectSet<Customer>().ToList();
var customersOfLosAltos= customers.Where(c=>c.Zip == "94023");
then, because of that .ToList(), it now runs an unfiltered query against the database and filters the results in memory on the client down to just the customers you want. This is why you want to keep your query as an IQueryable for as long as possible before you get the results: any tweaks or changes you make to the query propagate back to the query performed on the server.
To make your query even more efficient you could add a Select clause
var lastNamesOfCustomersOfLosAltos = (myDbContext.CreateObjectSet<Customer>())
.Where(c=>c.Zip == "94023")
.Select(c=>c.LastName);
The SQL server now performs this query (when you retrieve the results via a ToList(), in a foreach, via .AsEnumerable(), etc.)
select LastName from Customer where Zip = '94023'
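The difference between filtering on the server and filtering after a ToList() can be sketched outside of EF. The following is a minimal Python analogue using an in-memory SQLite database, not Entity Framework itself; the table contents and the helper names (make_db, filter_on_server, filter_on_client) are made up for illustration.

```python
import sqlite3

def make_db():
    # Hypothetical data: 1000 customers, 10 of them in zip 94023.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Customer (Id INTEGER PRIMARY KEY, LastName TEXT, Zip TEXT)")
    rows = [(i, f"Name{i}", "94023" if i % 100 == 0 else "10001") for i in range(1, 1001)]
    conn.executemany("INSERT INTO Customer VALUES (?, ?, ?)", rows)
    return conn

def filter_on_server(conn, zip_code):
    # The WHERE clause runs inside the database; only matching rows come back.
    return conn.execute(
        "SELECT LastName FROM Customer WHERE Zip = ?", (zip_code,)
    ).fetchall()

def filter_on_client(conn, zip_code):
    # Analogue of calling ToList() first: every row is transferred,
    # then the filter runs in application memory.
    all_rows = conn.execute("SELECT Id, LastName, Zip FROM Customer").fetchall()
    matching = [(last_name,) for (_id, last_name, z) in all_rows if z == zip_code]
    return matching, len(all_rows)

conn = make_db()
server_rows = filter_on_server(conn, "94023")
client_rows, transferred = filter_on_client(conn, "94023")
print(len(server_rows), "rows fetched when filtering on the server")
print(transferred, "rows fetched when filtering on the client")
```

Both approaches end up with the same customers; what differs is how many rows cross the wire before the filter is applied.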

Related

Why would LINQ group by results be fewer from Visual Studio compared to SQL Server and Linqpad?

There are other questions similar to mine but they didn't help me. I'm performing what should be a simple Linq group by operation, and in SQL Server Management Studio and Linqpad I get 23,859 results from a table containing 36,102 total records. This is what I believe to be the correct result.
For some reason, when I move my query into my Visual Studio application code, I get 22,463 groups - and I cannot for the life of me figure out why.
I need to group this table's rows based on unique combinations of 8 columns. The columns contain account IDs, person IDs, device IDs, premise IDs, and address columns. Basically, a person can have multiple accounts, multiple premises, multiple devices, and each premise can have its own address. I know the table design is lacking... it's customer provided and there are other columns that necessitate the format - it should not be relevant to the grouping though.
SQL Server: 23859 groups:
SELECT acct_id, per_id, dev_id, prem_id, address, city, state, postal
FROM z_AccountInfo GROUP BY acct_id, per_id, dev_id, prem_id, address, city, state, postal
ORDER BY per_id
Linqpad: 23859 groups:
//Get all rows...
List<z_AccountInfo> zAccounts = z_AccountInfo.ToList();
//Group them...
var zAccountGroups = (from za in zAccounts
group za by new { za.acct_id, za.per_id, za.dev_id, za.prem_id, za.address, za.city, za.state, za.postal } into zaGroups
select zaGroups).OrderBy(zag => zag.Key.per_id).ToList();
Visual Studio: 22463 groups - WRONG?:
//Instantiate a list I can use outside of the Entity Framework context...
List<z_AccountInfo> zAccounts = new List<z_AccountInfo>();
using (Entities db = Entities.CreateEntitiesForSpecificDatabaseName(implementation))
{
//Get all rows. Count verified to be correct...
zAccounts = db.z_AccountInfo.OrderBy(z => z.per_id).ToList();
}
// Group the rows. Doesn't work??? 22463 groups?
var zAccountGroups = (from z_AccountInfo za in zAccounts
group za by new { za.acct_id, za.per_id, za.dev_id, za.prem_id, za.address, za.city, za.state, za.postal } into zag
select zag).ToList();
I'm hoping someone can spot a syntax issue or something else I'm missing. Seems like Visual Studio is grouping something.. but it's off by 1396 groups... that's pretty significant.
UPDATE:
sgmoore's comment below put me on the track of making sure the zAccounts list from Linqpad and Visual Studio match. They do not!?! Querying the table in SQL Server shows this data (account / device / premise)
Inspecting the Visual Studio output in Beyond Compare shows the device ID 6106471 being erroneously repeated / duplicated for the 4 bottom rows... meaning there should be 2 groups here, but my query will only see 1...
Since I'm using Entity Framework to query the data in the table in Visual Studio, this makes me think something is wrong with my model but I have no idea what it could be. Beyond compare shows this same issue happening multiple times and explains why the group numbers are off. It's like EF knows there are 8 rows (in this case) - but the field that differentiates them doesn't come through.
I tried truncating the table and re-adding all of the data into it and re-running and the bad behavior persists. Quite confused here - I've never had this kind of issue with Entity Framework before.
I even ran SQL Profiler when VS was executing and trapped the query Entity Framework is firing to populate zAccounts. That query when fired by itself in SQL Server correctly shows the four 7066550 rows. This seems to be squarely on Entity Framework and the ToList() call that populates the full collection - ideas anyone?
Short answer - make sure the table in the Entity Framework model has an Entity Key on a column where the values of the column are unique.
Longer answer - to troubleshoot I ran SQL Profiler to ensure that the query EF was sending to SQL Server was correct - and it was. I ran that query and inspected the results to see the data I was wanting. The problem was my model. I had an Entity Key set on a field that did not contain unique values. My guess is that EF assumes that since the field is set as the Entity Key, the values must be unique. Based on that it somehow indexes or caches the first row where the "id" is and then projects that row's values into query results. That is a bad assumption in my view if there is not a validation check of the field marked as the Entity Key. I realize I'm to blame here for telling it to use a non-unique field as the Entity Key - but I don't see the case where this would be a good idea without it throwing at least a warning.
Anyway, to resolve, I added a proper id column to the table and set its Identity spec and auto-increment so that any rows in the table would have a unique id. After that, I updated my edmx to use my new column as the Entity Key and re-ran my code and then everything magically started working.
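The effect described above can be modeled with a small sketch. This is plain Python that imitates the answer's guess (one cached object per key value, reused for every later row with the same key); it is not EF's actual implementation, and Row, materialize and count_groups are hypothetical names.

```python
from collections import namedtuple

Row = namedtuple("Row", ["key", "dev_id", "prem_id"])

def materialize(rows, key_of):
    """Mimic an identity map: the first row seen for a key is cached,
    and every later row with the same key is replaced by that cached row."""
    tracked = {}
    result = []
    for row in rows:
        k = key_of(row)
        if k not in tracked:
            tracked[k] = row
        result.append(tracked[k])
    return result

def count_groups(rows):
    # Group on the columns that are supposed to distinguish the rows.
    return len({(r.dev_id, r.prem_id) for r in rows})

raw = [
    Row(key="A", dev_id=1, prem_id=10),
    Row(key="A", dev_id=2, prem_id=10),  # same non-unique key, different device
    Row(key="B", dev_id=3, prem_id=20),
]

# Non-unique key: the row count is still right, but the second row's data is lost.
bad = materialize(raw, key_of=lambda r: r.key)
# Unique key (assumed here to be key + dev_id): every row survives intact.
good = materialize(raw, key_of=lambda r: (r.key, r.dev_id))

print(len(bad), count_groups(bad))
print(len(good), count_groups(good))
```

This matches the symptoms in the question: the total row count is correct, yet a distinguishing column is repeated from an earlier row, so the grouping finds fewer groups than SQL Server does.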

Crystal Reports: If/Else formula giving wrong results

I am writing a Crystal report with 2 separate queries - one using DB2 database and one using a SQL database. I linked the queries together using a field, though I did have to convert the data type on one so it could be linked (not sure if this is relevant).
The DB2 query is giving me all of my customer data displayed on the report, including a customer ID. The only thing I'm pulling from the SQL DB is the same customer ID, just from another table. If the customer ID from the DB2 query is found in the customer ID list from SQL, I do not want to display that customer record.
First, I created a formula called "CustID" like this:
if {Command.Customer_ID} = {Command1.Customer_ID} then "True" else "False"
I then wrote a record selection formula in the Select Expert like this:
{@CustID} = "False"
This is giving me no records returned at all. I feel like I have a lapse in my logic somewhere but I can't figure out where. Thanks in advance.

@BatchFetch type JOIN

I'm confused about this annotation for an entity field that is of type of another entity:
@BatchFetch(value = BatchFetchType.JOIN)
In the docs of EclipseLink for BatchFetch they explain it as following:
For example, consider an object with an EMPLOYEE and PHONE table in
which PHONE has a foreign key to EMPLOYEE. By default, reading a list
of employees' addresses by default requires n queries, for each
employee's address. With batch fetching, you use one query for all the
addresses.
But I'm confused about the meaning of specifying BatchFetchType.JOIN. Doesn't BatchFetch already do a join at the moment it retrieves the list of records associated with an employee? The records of address/phone type are retrieved using the foreign key, so it is a join itself, right?
The BatchFetch type is an optional parameter, and for join it is said:
JOIN – The original query's selection criteria is joined with the
batch query
What does this mean? Isn't the batch query a join itself?
Joining the relationship and returning the referenced data with the main data is a fetch join. So a query that brings in 1 Employee that has 5 phones results in 5 rows being returned, with the data in Employee being duplicated for each row. When that is less than ideal, say for a query over 1000 employees, you resort to a separate batch query for these phone numbers. Such a query would run once to return 1000 employee rows, and then run a second query to return all the employee phones needed to build the employees that were read in.
The three batch query types listed here then determine how this second batch query gets built. These will perform differently based on the data and database tuning.
JOIN - Works much the same way a fetch join would, except it only returns the Phone data.
EXISTS - This causes the DB to execute the initial query on Employees, but uses the data in an Exists subquery to then fetch the Phones.
IN - EclipseLink aggregates all the Employee IDs or values used to reference Phones, and uses them to filter Phones directly.
Best way to find out is always to try it out with SQL logging turned on to see what it generates for your mapping and query. Since these are performance options, you should test them out and record the metrics to determine which works best for your application as its dataset grows.
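As a rough illustration of the three shapes described above, here is a Python sketch against an in-memory SQLite database. The SQL strings are hand-written approximations of what each batch type means, not the SQL EclipseLink actually generates, and the DEPT column, the sample rows and the variable names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (ID INTEGER PRIMARY KEY, NAME TEXT, DEPT TEXT);
CREATE TABLE PHONE (ID INTEGER PRIMARY KEY, EMP_ID INTEGER REFERENCES EMPLOYEE(ID), NUMBER TEXT);
INSERT INTO EMPLOYEE VALUES (1, 'Ann', 'ENG'), (2, 'Bob', 'ENG'), (3, 'Cy', 'OPS');
INSERT INTO PHONE VALUES (1, 1, '555-0001'), (2, 1, '555-0002'),
                         (3, 2, '555-0003'), (4, 3, '555-0004');
""")

# First query: the employees themselves, one row per employee.
employees = conn.execute(
    "SELECT ID, NAME FROM EMPLOYEE WHERE DEPT = ?", ("ENG",)
).fetchall()

# JOIN: the original selection criteria is joined into the phone query,
# but only Phone columns are returned.
join_sql = """
SELECT p.ID, p.EMP_ID, p.NUMBER
FROM PHONE p JOIN EMPLOYEE e ON p.EMP_ID = e.ID
WHERE e.DEPT = ? ORDER BY p.ID
"""

# EXISTS: the original criteria moves into an EXISTS subquery.
exists_sql = """
SELECT p.ID, p.EMP_ID, p.NUMBER
FROM PHONE p
WHERE EXISTS (SELECT 1 FROM EMPLOYEE e WHERE e.ID = p.EMP_ID AND e.DEPT = ?)
ORDER BY p.ID
"""

# IN: the IDs already read by the first query are passed back as parameters.
ids = [emp_id for emp_id, _name in employees]
placeholders = ",".join("?" * len(ids))
in_sql = f"SELECT ID, EMP_ID, NUMBER FROM PHONE WHERE EMP_ID IN ({placeholders}) ORDER BY ID"

phones_join = conn.execute(join_sql, ("ENG",)).fetchall()
phones_exists = conn.execute(exists_sql, ("ENG",)).fetchall()
phones_in = conn.execute(in_sql, ids).fetchall()
print(phones_join == phones_exists == phones_in)
```

All three return the same phone rows in a single second query; they differ only in how the database is asked to find them, which is why their relative performance depends on the data and the database.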

SSRS - Data Driven Subscription -

What I'm trying to do:
I have a report already created that looks for something existing in one database and not in another. 99% of the time the report comes up empty. We do not need to know when there are no results to show. I only want to know when the query returns a result.
What I've done so far:
I have the Data Source created and a table (view) created where I can query for Subscriber information.
What I hope can be answered:
Is it at all possible to have this report run and email my selected Subscribers only when there is data in the output?
I see you've already looked into Data-Driven subscriptions. You should be able to write your query in the data-driven subscription to test if the report should return results, and if not, send it to a dummy address, and only send it to your subscriber list if there will be data in it.
If you put the dummy address in your table with an IsDummy flag column, you could do something like this:
SELECT [EmailTo]
FROM SubscriptionTable
WHERE IsDummy=0
AND (SELECT COUNT(*) FROM SomeTable)>0 --report should have results
UNION ALL
SELECT [EmailTo]
FROM SubscriptionTable
WHERE IsDummy=1
AND (SELECT COUNT(*) FROM SomeTable)=0 --report should not have results
And that's only one way, there are probably lots of other ways that might suit your needs as well or better.
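To see how that query switches between the two recipient lists, here is a small Python sketch that runs the same statement against an in-memory SQLite database. SQLite stands in for SQL Server, the email addresses are placeholders, and the recipients helper is hypothetical.

```python
import sqlite3

QUERY = """
SELECT EmailTo FROM SubscriptionTable
WHERE IsDummy = 0 AND (SELECT COUNT(*) FROM SomeTable) > 0
UNION ALL
SELECT EmailTo FROM SubscriptionTable
WHERE IsDummy = 1 AND (SELECT COUNT(*) FROM SomeTable) = 0
"""

def recipients(report_rows):
    """Return who would be emailed when the report source holds `report_rows` rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE SubscriptionTable (EmailTo TEXT, IsDummy INTEGER)")
    conn.execute("CREATE TABLE SomeTable (Id INTEGER)")
    conn.executemany(
        "INSERT INTO SubscriptionTable VALUES (?, ?)",
        [("alice@example.com", 0), ("bob@example.com", 0), ("nobody@example.com", 1)],
    )
    conn.executemany("INSERT INTO SomeTable VALUES (?)", [(i,) for i in range(report_rows)])
    return sorted(row[0] for row in conn.execute(QUERY))

print(recipients(0))  # only the dummy address
print(recipients(3))  # only the real subscribers
```

Only one branch of the UNION ALL can ever return rows for a given run, so the subscription either goes to the real subscribers or to the dummy address, never both.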

Linq To Entities - How to create a query where the table name is a parameter

Dynamic queries are not dynamic enough. I have seen solutions like this but still I have to indicate which table to use as the basis:
var query = db.Customers.Where("...").OrderBy("...").Select("...");
I want to create a simple query tool where the user selects from the available tables using a drop-down list. As a result, I want to show the first few records. Therefore, I need to change the table too! That is, I need something like this:
string selectedTable = "Customers";
var [tableName] = SomeTypecastingOperations(selectedTable);
var query = db.[tableName].Where("...").OrderBy("...").Select("...");
Is EF dynamic enough to handle this?
Linq-to-entities doesn't support that. You can achieve that with Entity SQL or some ugly code which will have conditional logic for every set you want to query (like a big switch for table names).
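The "conditional logic for every set" option amounts to a fixed lookup from the selected name to a query written in advance. Below is a minimal sketch of that idea in Python with SQLite rather than EF; the Orders table, the QUERIES mapping and the first_rows helper are hypothetical names introduced for the example.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Orders (Id INTEGER PRIMARY KEY, Total REAL);
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Orders VALUES (1, 9.5), (2, 20.0);
    """)
    return conn

# One pre-written query per table the user may pick: the "big switch".
# The table name from the drop-down is only ever used as a lookup key,
# never pasted into the SQL text.
QUERIES = {
    "Customers": "SELECT Id, Name FROM Customers ORDER BY Id LIMIT ?",
    "Orders": "SELECT Id, Total FROM Orders ORDER BY Id LIMIT ?",
}

def first_rows(conn, selected_table, limit=2):
    try:
        sql = QUERIES[selected_table]
    except KeyError:
        raise ValueError(f"unknown table: {selected_table!r}") from None
    return conn.execute(sql, (limit,)).fetchall()

conn = make_db()
print(first_rows(conn, "Customers"))
print(first_rows(conn, "Orders"))
```

The cost is exactly what the answer calls ugly: every table the tool supports needs its own entry, but nothing outside the mapping can be queried.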