Get Objects from an ObjectSet by specifying a Range in EF - entity-framework

I am trying out EF 4.0. I have an Employee object, and using EF I get the Employee ObjectSet by simply calling
Context.Employees
This call produces the following SQL query:
select * from Employees
The query works and I have no complaints about it, but as you know it will not perform well if the table holds a few million records.
So I am trying to find a way to give a range to my ObjectSet, something like: get me records 30 to 60 from the Employee ObjectSet.
Is there a way to do something like this?
Any suggestions will be deeply appreciated.
Update:
I am trying to do this to get 20 Employees (page size) back based on the page index.
Thanks in advance.
NiK...

Ok, I finally figured out a way to do this which I think is pretty decent. And here is what I did.
I used the Skip and Take methods in IQueryable to skip and take objects based on the page index.
So I used the following code:
var empList = context.Employees.OrderBy("it.CreatedDate").Skip(pageIndex * 20 - 20).Take(20);
This is one way.
If anyone feels this is not a good solution, you are more than welcome to suggest something I can replace it with.
Updated Code
As per Yury Tarabanko's suggestion I changed my code as follows:
var empList = context.Employees.OrderBy(x=>x.CreatedDate).Skip(pageIndex * 20 - 20).Take(20);
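The offset arithmetic in the line above (pageIndex * 20 - 20) is the same as (pageIndex - 1) * pageSize for a 1-based page index. Here is a minimal sketch of that arithmetic in Python, with list slicing standing in for Skip and Take. The names page_window and paginate are made up for illustration and are not part of EF.

```python
def page_window(page_index, page_size=20):
    """Return (skip, take) for a 1-based page index."""
    if page_index < 1:
        raise ValueError("page_index is 1-based")
    return (page_index - 1) * page_size, page_size


def paginate(items, page_index, page_size=20):
    """Slice an already ordered list the way Skip(skip).Take(take) would."""
    skip, take = page_window(page_index, page_size)
    return items[skip:skip + take]


if __name__ == "__main__":
    employees = list(range(1, 101))   # stand-in for an ordered result set
    print(paginate(employees, 2))     # second page of 20 items
```

As in the EF query, the input has to be ordered first, otherwise the pages are not stable between calls.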
Thanks to those who took the time to read my question.
Thanks,
NiK...

Related

Converting SQL query with FORMAT command to use in entity framework core

I have an SQL query:
SELECT
FORMAT(datetime_scrapped, 'MMMM-yy') [date],
count(FORMAT(datetime_scrapped, 'MMMM-yy')) as quantity
FROM scrap_log
GROUP BY FORMAT(datetime_scrapped, 'MMMM-yy')
It summarises all the entries in the scrap_log table by month/year and counts how many entries fall in each month/year. It returns two columns (date and quantity). But I need to execute this in an ASP.NET Core API using Entity Framework Core. I tried using FromSqlRaw(), but this expects all columns of the entity to be returned and so doesn't work.
I can find plenty of info on how to implement group by, count, etc. in EF, but I cannot find anything for the FORMAT(datetime, "MMMM-yy") part. Could somebody please explain how to do this?
EDIT: It seems I am going about this the wrong way in terms of efficiency. I will look into alternative solutions based on the comments already made. Thanks for the fast response.

Shuffle ID's on table rails 4 [duplicate]

User.find(:all, :order => "RANDOM()", :limit => 10) was the way I did it in Rails 3.
User.all(:order => "RANDOM()", :limit => 10) is how I thought Rails 4 would do it, but this is still giving me a Deprecation warning:
DEPRECATION WARNING: Relation#all is deprecated. If you want to eager-load a relation, you can call #load (e.g. `Post.where(published: true).load`). If you want to get an array of records from a relation, you can call #to_a (e.g. `Post.where(published: true).to_a`).
You'll want to use the order and limit methods instead. You can get rid of the all.
For PostgreSQL and SQLite:
User.order("RANDOM()").limit(10)
Or for MySQL:
User.order("RAND()").limit(10)
As the random function differs between databases, I would recommend using the following code:
User.offset(rand(User.count)).first
Of course, this is useful only if you're looking for only one record.
If you want to get more than one, you could do something like:
User.offset(rand(User.count) - 10).limit(10)
The - 10 is to ensure you get 10 records in case rand returns a number greater than count - 10.
Keep in mind you'll always get 10 consecutive records.
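One thing to watch with the subtraction above: when rand returns a value below 10, the offset goes negative, and some databases reject a negative OFFSET. A small sketch of picking the offset from a clamped range instead, written in Python rather than Ruby. The function name random_window_offset is hypothetical.

```python
import random


def random_window_offset(total, window, rng=random):
    """Pick an offset so that offset + window <= total whenever possible.

    The result is never negative, even when total < window.
    """
    max_offset = max(total - window, 0)
    return rng.randint(0, max_offset)   # randint is inclusive on both ends


if __name__ == "__main__":
    # With 1000 rows and a window of 10, the offset lies in 0..990.
    print(random_window_offset(1000, 10))
```

The rows returned are still consecutive, so this only fixes the bounds, not the randomness.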
I think the best solution really is to order randomly in the database.
But if you need to avoid a database-specific random function, you can use the pluck and shuffle approach.
For one record:
User.find(User.pluck(:id).shuffle.first)
For more than one record:
User.where(id: User.pluck(:id).sample(10))
I would suggest making this a scope as you can then chain it:
class User < ActiveRecord::Base
  scope :random, -> { order(Arel::Nodes::NamedFunction.new('RANDOM', [])) }
end
User.random.limit(10)
User.active.random.limit(10)
While not the fastest solution, I like the brevity of:
User.ids.sample(10)
The .ids method yields an array of User IDs and .sample(10) picks 10 random values from this array.
I strongly recommend this gem for random records; it is specially designed for tables with lots of rows:
https://github.com/haopingfan/quick_random_records
All other answers perform badly with a large database, except this gem:
quick_random_records takes only 4.6ms in total.
the accepted answer User.order('RAND()').limit(10) takes 733.0ms.
the offset approach takes 245.4ms in total.
the User.all.sample(10) approach takes 573.4ms.
Note: My table only has 120,000 users. The more records you have, the bigger the performance difference will be.
UPDATE:
Performance on a table with 550,000 rows:
Model.where(id: Model.pluck(:id).sample(10)) takes 1384.0ms
gem: quick_random_records takes only 6.4ms in total
For MySQL this worked for me:
User.order("RAND()").limit(10)
You could call .sample on the records, like: User.all.sample(10)
The answer from @maurimiranda, User.offset(rand(User.count)).first, is not good if we need 10 random records, because User.offset(rand(User.count) - 10).limit(10) returns a sequence of 10 consecutive records starting at the random position; they are not "totally random", right? So we would need to call that function 10 times to get 10 "totally random" records.
Besides that, offset is also not good if the random function returns a high value. If your query looks like offset: 10000 and limit: 20, it generates 10,020 rows and throws away the first 10,000 of them, which is very expensive. So calling offset.limit 10 times is not efficient.
So I think that if we just want one random user, User.offset(rand(User.count)).first may be better (at least we can improve it by caching User.count).
But if we want 10 random users or more, User.order("RAND()").limit(10) should be better.
Here's a quick solution.. currently using it with over 1.5 million records and getting decent performance. The best solution would be to cache one or more random record sets, and then refresh them with a background worker at a desired interval.
I created a random_records_helper.rb file:
module RandomRecordsHelper
  def random_user_ids(n)
    user_ids = []
    user_count = User.count
    n.times { user_ids << rand(1..user_count) }
    user_ids
  end
end
In the controller:
@users = User.where(id: random_user_ids(10))
This is much quicker than the .order("RANDOM()").limit(10) method - I went from a 13 sec load time down to 500ms.
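Two caveats about the helper above: drawing each id independently can produce the same id twice (so fewer than n users come back), and it assumes ids run from 1 to User.count with no gaps. A rough sketch of the first point in Python, sampling without replacement. Both function names are made up for illustration, and the gap-free id assumption still applies.

```python
import random


def random_ids_with_repeats(n, count, rng=random):
    """Mirror of the helper above: n independent draws, repeats possible."""
    return [rng.randint(1, count) for _ in range(n)]


def random_distinct_ids(n, count, rng=random):
    """Draw up to n distinct ids from 1..count (assumes ids have no gaps)."""
    return rng.sample(range(1, count + 1), min(n, count))


if __name__ == "__main__":
    print(random_ids_with_repeats(10, 20))   # may contain duplicates
    print(random_distinct_ids(10, 20))       # always 10 different ids
```

If rows have been deleted, some of the sampled ids will not exist, so the lookup can still return fewer rows than requested.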

Tableau rawsqlagg_real

Could somebody please give me some guidance on the RAWSQLAGG_REAL function in Tableau? What is the right syntax for it when it is used to get data from MySQL?
I used it as I understood it, but I am getting the error "No such column [__measure__3]".
Code:
RAWSQLAGG_REAL("select count(Film Id) from flavia.TableforThe_top_10percent_of_the_user where count(distinct(User Id)) = %1",[it sucks])
I see a few issues here:
Use HAVING instead of WHERE.
You have column names with spaces, like Film Id; in MySQL you should quote them with backticks, as `Film Id`.
Though I must say that it is better to do with LOD calculations as Tableau will be able to do better query optimizations that way. Plus it is less error prone and much easier to write.
I find another issue here in addition to using having instead of where. The filter value should be numeric, or the operator should be like and not =.
where count(distinct(User Id)) = **%1**

Entity Framework: Convert.ToDecimal not supported, any ideas? EF gives an error

I have been doing queries in EF and everything has been working great, but now I have 2 fields in the db that are actually CHAR. They hold a date, but in the form of a number. In SQL Management Studio I can do date1 >= date2, for example, and I can also check whether a number I have falls between these 2 dates.
It's nothing unusual, just a field that represents a date (the number grows as the date does).
Now in EF, when I try to use >=, it says you can't do this on a string. OK, I understand, it's C#, so I tried Convert.ToDecimal(date1), but it gives me an error saying that it's not supported.
I have no option of changing the db fields; they are set in stone :-(
The way I got it to work was to request the details, do a .ToList, and then use .ToDecimal. It works, but of course this does the work in memory, which defeats the object of EF, e.g. being able to add to the query using IQueryable.
Another way I got it to work was to pass the SQL query to SqlQuery of the DbContext, but again I lose a lot of EF functionality.
Can anyone help?
I am really stuck
As you say that you tried >= I assume that it would work for you if you could do that in plain SQL. And that is possible by doing
String.Compare(date1, date2) >= 0
EF is smart enough to translate that into a >= operator.
The advantage is that you do not need to compare converted values, so indexes can be used in execution plans.
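Why comparing the strings directly works: when every value has the same width and is zero-padded, character-by-character order matches numeric order. Here is a minimal sketch in Python. The yyyymmdd format is only an assumption for illustration (the question does not say what the stored format is), and a database compares strings according to its collation, which may differ from Python's code point order.

```python
def in_range(value, start, end):
    """Inclusive range check using plain string comparison."""
    return start <= value <= end


def agrees_with_numeric(a, b):
    """True when comparing a and b as strings gives the same answer as
    comparing them as numbers."""
    return (a >= b) == (int(a) >= int(b))


if __name__ == "__main__":
    # Same width: string order and numeric order agree.
    print(in_range("20240115", "20240101", "20240131"))
    print(agrees_with_numeric("20240115", "20231231"))
    # Different widths: "9" sorts after "10" as a string.
    print(agrees_with_numeric("9", "10"))
```

So the trick relies on the fixed-width format; if the column can hold values of different lengths, the two orderings can disagree.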
First of all, you can at least enable deferred execution of the query by using AsEnumerable() instead of ToList(). This won't change the fact that the database would need to return all the records when you do in fact execute the query, however.
To let the database perform the filtering, you need your query to be compatible with SQL. Since you can't do ToDecimal() in SQL, you need to work with strings directly by converting your myvar to a string that is in the same format as dateStart and dateEnd, then form your query.

Hand made queries vs findDependentRowset

I have built a pretty big application with Zend and I was wondering which would be better: building queries by hand (using the Zend object model)
$db->select()
   ->from('table')
   ->join('table2',
          'table.id = table2.table_id')
or going with the findDependentRowset method (Zend doc for findDependentRowSet).
I ask because I ran a test that fetched data across multiple tables and displayed all the information from a table, and findDependentRowset seemed to run slower. I might be wrong, but I guess it issues a new query every time findDependentRowset is called, as in:
$table1 = new Model_Table1;
$rowset = $table1->fetchAll();
foreach ($rowset as $row) {
    $table2data = $row->findDependentRowset('Model_Table2', 'Map');
    echo $row['field'] . ' ' . $table2data['field'];
}
So, which one is better, and is there a way to use findDependentRowset to build complex queries spanning 5 tables that run as fast as a hand-made query?
Thanks
Generally, building your own query is the best way to go, because Zend will create just one object (or set of objects) and run just one query.
If you use findDependentRowset, Zend will perform another query and build another object (or set) with the result for each call.
You should use this only in very specific cases.
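To make the difference in query count concrete, here is a rough sketch using Python's sqlite3 module instead of Zend. The table and column names follow the question, the sample rows are invented, and both functions are hypothetical stand-ins: one runs a lookup per parent row (what the findDependentRowset loop does), the other runs a single join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, field TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, table_id INTEGER, field TEXT);
    INSERT INTO table1 (id, field) VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO table2 (table_id, field) VALUES (1, 'x'), (1, 'y'), (2, 'z');
""")


def per_row_lookup():
    """One query for the parents, then one more query per parent row.

    Returns (rows, number_of_queries).
    """
    queries = 1
    parents = conn.execute("SELECT id, field FROM table1 ORDER BY id").fetchall()
    rows = []
    for parent_id, parent_field in parents:
        queries += 1
        children = conn.execute(
            "SELECT field FROM table2 WHERE table_id = ? ORDER BY id",
            (parent_id,),
        ).fetchall()
        rows.extend((parent_field, child) for (child,) in children)
    return rows, queries


def single_join():
    """The same rows from a single hand-written join.

    Returns (rows, number_of_queries).
    """
    rows = conn.execute(
        "SELECT table1.field, table2.field FROM table1 "
        "JOIN table2 ON table1.id = table2.table_id "
        "ORDER BY table1.id, table2.id"
    ).fetchall()
    return rows, 1


if __name__ == "__main__":
    print(per_row_lookup())   # 1 + 3 queries for 3 parent rows
    print(single_join())      # 1 query
```

The per-row version issues one extra query for every parent row, so its cost grows with the size of the first table, while the join stays at a single round trip.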
See this question: PHP - Query single value per iteration or fetch all at start and retrieve from array?