I'm using MVC5 along with EF6 to develop an application. I'm using SQL Server Express for database. I have two tables/entities in the database.
Vehicle - Contains Information about the vehicles
GsmDeviceLog - Contains data received from GPS Tracker Fit in the vehicle.
My GsmDeviceLogs table currently has around 20K records, and the code below takes around 90 seconds to execute, just to fetch one record (the last one).
Here is the code:
var dlog = db.Vehicles.Find(2).GsmDeviceLogs.LastOrDefault();
When I open the table in Server Explorer, it shows ALL the data within 5-10 seconds. Can anyone help me get the details loaded quickly on the page as well?
Thanks for reading and paying attention to the question.
Can anyone suggest any means to reduce the time?
Your query should look like this:
var dlog = db.Vehicles
    .Where(v => v.Id == 2)
    .SelectMany(v => v.GsmDeviceLogs)
    .OrderByDescending(gdl => gdl.Id) // or order by some datetime
    .FirstOrDefault();
In your original query you are loading the Vehicle with Find. Then accessing the GsmDeviceLogs collection loads all logs of that vehicle into memory by lazy loading and then you pick the last one from the loaded collection in memory. It's probably the loading of all logs that consumes too much time.
The query above is executed completely in the database and returns only one GsmDeviceLog record. Side note: You must use OrderByDescending(...).FirstOrDefault here because LastOrDefault is not supported with LINQ to Entities.
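The difference between the two access patterns is not specific to EF. As a rough illustration only, here is a sketch in Python using the standard-library sqlite3 module (so not EF and not SQL Server); the table name, column names and helper functions are made up for the example. The first function mirrors what lazy loading followed by LastOrDefault does, the second mirrors the OrderByDescending(...).FirstOrDefault() query.

```python
import sqlite3


def make_db(n_logs=20000):
    """Build an in-memory database holding one vehicle's logs (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE gsm_device_logs ("
        "id INTEGER PRIMARY KEY, vehicle_id INTEGER, payload TEXT)"
    )
    conn.executemany(
        "INSERT INTO gsm_device_logs (vehicle_id, payload) VALUES (?, ?)",
        ((2, "log %d" % i) for i in range(n_logs)),
    )
    conn.commit()
    return conn


def last_log_load_all(conn, vehicle_id):
    """Like lazy loading: pull every row to the client, then pick the last one."""
    rows = conn.execute(
        "SELECT id, payload FROM gsm_device_logs WHERE vehicle_id = ? ORDER BY id",
        (vehicle_id,),
    ).fetchall()
    return rows[-1] if rows else None


def last_log_in_database(conn, vehicle_id):
    """Like OrderByDescending(...).FirstOrDefault(): the database returns one row."""
    return conn.execute(
        "SELECT id, payload FROM gsm_device_logs WHERE vehicle_id = ? "
        "ORDER BY id DESC LIMIT 1",
        (vehicle_id,),
    ).fetchone()


if __name__ == "__main__":
    conn = make_db()
    print(last_log_load_all(conn, 2))
    print(last_log_in_database(conn, 2))
```

Both functions return the same row; the second one simply never transfers the other rows to the application.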
Your LINQ query is inefficient. Try db.Vehicles.Single(x => x.Id == 2).GsmDeviceLogs.OrderByDescending(x => x.Id).FirstOrDefault(), or whatever your primary keys are.
I have a simple query
var count = await _context.ExchangeRate.AsNoTracking().CountAsync(u => u.Currency == "GBP");
The table has only 3 columns and 10 rows of data.
When I execute the query from a .NET 5 project, it takes around 2.3 seconds the first time and 500 ms (+- 100) for subsequent requests. When I run the same query in SSMS, it returns in almost no time (45 ms as seen in SQL Profiler).
I have implemented ARITHABORT ON in EF from here
In SQL Profiler I can see that ARITHABORT ON is being set, but the query still takes the same time for the first request and for subsequent requests.
How do I achieve the same speed as the SSMS query? I need the query to run really fast, as my project has a requirement to return the response within 1 second (I need to make at least 5 simple DB calls; if 1 call takes 500 ms, that already exceeds the 1 second requirement).
Edit
I even tried plain ADO.NET. The execution time seen in SQL Profiler is 40 ms, whereas measured in the code it is almost 400 ms. That is a big difference.
using (var conn = new SqlConnection(connectionString))
{
    var sql = "select count(ExchangeRate) as cnt from ExchangeRate where Currency = 'GBP'";
    SqlCommand cmd = new SqlCommand();
    cmd.CommandText = "SET ARITHABORT ON; " + sql;
    cmd.CommandType = CommandType.Text;
    cmd.Connection = conn;
    conn.Open();
    var t1 = DateTime.Now;
    var rd = cmd.ExecuteReader();
    var t2 = DateTime.Now;
    TimeSpan diff = t2 - t1;
    Console.WriteLine((int)diff.TotalMilliseconds);
    while (rd.Read())
    {
        Console.WriteLine(rd["cnt"].ToString());
    }
    conn.Close();
}
Your "first run" scenario is generally the one-off static initialization of the DbContext. This is where the DbContext works out its mappings for the first time, and it occurs when the first query is executed. The typical approach to avoid this occurring for a user is to have a simple "warm up" query that runs when the service starts up. For instance, after your service initializes, simply put something like the following:
// Warm up the DbContext
using (var context = new AppDbContext())
{
    var hasUser = context.Users.Any();
}
This also serves as a quick start-up check that the database is reachable and responding. The query itself will do a very quick operation, but the DbContext will resolve its mappings at this time so any newly generated DbContext instances will respond without incurring that cost during a request.
As for raw performance, if it isn't a query that is expected to take a while and tie up a request, don't make it async. Asynchronous requests are not faster, they are actually a bit slower. Using async requests against the DbContext is about ensuring your web server / application thread is responsive while potentially expensive database operations are processing. If you want a response as quickly as possible, use a synchronous call.
Next, ensure that any fields you are filtering against, in this case Currency, are indexed. Having a field called Currency in your entity as a String rather than a CurrencyId FK (int) pointing to a Currency record is already an extra indexing expense as indexes on integers are smaller/faster than those on strings.
You also don't need to bother with AsNoTracking when using a Count query. AsNoTracking applies solely when you are returning entities (ToList/ToArray/Single/First, etc.) to avoid having the DbContext hold onto a reference to the returned entity. When you use Count/Any, or a projection that returns properties from entities using Select, there is no entity returned to track.
Also consider network latency between where your application code is running and the database server. Are they on the same machine, or is there a network connection in play? How does this compare with where you run the SSMS query? Using a profiler you can see what SQL EF is actually sending to the database. Everything else in terms of time is the cost of getting the request to the DB, getting the resulting data back to the requester, and parsing that response. (If you are returning entities: allocating, populating, checking against existing references, etc. In the case of counts etc.: checking existing references.)
Lastly, to get peak performance, ensure that your DbContext lifetimes are kept short. If a DbContext is kept open and has had a number of tracking queries run against it (selecting entities without AsNoTracking), those tracked entity references accumulate and can have a negative performance impact on future queries, even if you use AsNoTracking, as EF checks through its tracked references for entities that might be applicable/related to your new queries. Many times I see developers assume DbContexts are "expensive", so they opt to instantiate them as little as possible to avoid those costs, only to end up making operations more expensive over time.
With all that considered, EF will never be as fast as raw SQL. It is an ORM designed to provide convenience to .NET applications when it comes to working with data. That convenience of working with entity classes, rather than sanitizing and writing your own raw SQL every time, comes with a cost.
I have some data stored in a database (MongoDB) and in a distributed cache (Redis).
When querying the repository, I use a lazy loading approach: it first looks for the data in the cache; if the data is not there, it looks it up in the database and updates the cache as well, so that the next request for it is served from the cache.
Sample Model Used:
Person ( id, name, age, address (Reference))
Address (id, place)
PersonCacheModel extends Person with addressId.
I am not storing the parent object together with the child object in the cache. That is why I created PersonCacheModel with addressId and store this object in the cache. When getting the data, the PersonCacheModel is converted to a Person, and a call goes through the address repository to the address cache to fill in the address details of the Person object.
As far as I understand:
personRepository.findPersonByName(NAME + randomNumber);
Access Data from Cache = network time + cache access time + deserialize time
Access Data from database = network time + database query time + object mapping time
When I ran the above approach for 1000 rows, accessing the data from the database was faster than accessing it from the cache. I believed the cache access time should be smaller than the MongoDB access time.
Please let me know if there is an issue with the approach or if this is the expected behavior.
To have a valid benchmark we need to consider the hardware side and the data processing side:
hardware - do both have the same configuration: RAM, CPU count, OS, etc.
process - how the data is transformed (on a single thread, on multiple threads, per object, per request)
Performing a load test on your data set will give you a good overview of which process is faster in a particular use case scenario.
It is hard to judge what the result should be until the points mentioned above are known.
The other thing is to have more than one test scenario and to run each under stress for, say, 10 seconds, a minute, 5 minutes, an hour, so that you have numbers that tell you the truth.
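As a starting point for such a comparison, here is a minimal timing harness sketched in Python. It is only an illustration: the names benchmark and compare are made up, and the two lambdas in the usage part are stand-ins for the real cache path and database path (for example the two branches behind personRepository.findPersonByName), which are not shown in the question.

```python
import statistics
import time


def benchmark(fn, runs=1000, warmup=50):
    """Call fn repeatedly and return per-call latency statistics in milliseconds."""
    for _ in range(warmup):  # warm-up calls (connections, JIT, caches) are not measured
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {
        "runs": runs,
        "median_ms": statistics.median(samples),
        "p95_ms": samples[int(0.95 * (len(samples) - 1))],
        "max_ms": samples[-1],
    }


def compare(paths, runs=1000, warmup=50):
    """Benchmark several named callables under identical settings."""
    return {name: benchmark(fn, runs, warmup) for name, fn in paths.items()}


if __name__ == "__main__":
    # Stand-ins only: replace with calls that hit the cache and the database.
    results = compare(
        {
            "cache": lambda: time.sleep(0.001),
            "database": lambda: time.sleep(0.002),
        },
        runs=100,
    )
    for name, stats in results.items():
        print(name, stats)
```

Looking at the median and the tail (p95, max) rather than a single total helps separate a consistently slow path from one that is usually fast but occasionally stalls, for instance on deserialization or on the extra address lookup.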
To make things short, I have to make a script in Second Life communicating with an AppEngine app updating records in an ndb database. Records extracted from the database are sent as a batch (a page) to the LSL script, which updates customers, then asks the web app to mark these customers as updated in the database.
To create the batch I use a query on an (integer) property, update_ver==0, and use fetch_page() to produce a cursor to the next batch. This cursor is also sent as a urlsafe()-encoded parameter to the LSL script.
To mark the customer as updated, update_ver is set to some other value like 2, and the entity is updated via put_async(). Then the LSL script fetches the next batch thanks to the cursor sent earlier.
My rather simple question is: in the web app, since the queried property update_ver no longer satisfies the filter, is my cursor still valid? Or do I have to use another strategy?
Stripping out irrelevant parts (including authentication), my code currently looks like this (Customer is the entity in my database).
class GetCustomers(webapp2.RequestHandler): # handler that sends batches to the update script in SL
    def get(self):
        cursor=self.request.get("next",default_value=None)
        query=Customer.query(Customer.update_ver==0,ancestor=customerset_key(),projection=[Customer.customer_name,Customer.customer_key]).order(Customer._key)
        if cursor:
            results,cursor,more=query.fetch_page(batchsize,start_cursor=ndb.Cursor(urlsafe=cursor))
        else:
            results,cursor,more=query.fetch_page(batchsize)
        if more:
            self.response.write("more=1\n")
            self.response.write("next={}\n".format(cursor.urlsafe()))
        else:
            self.response.write("more=0\n")
        self.response.write("n={}\n".format(len(results)))
        for c in results:
            self.response.write("c={},{},{}\n".format(c.customer_key,c.customer_name,c.key.urlsafe()))
        self.response.set_status(200)
The handler that updates Customer entities in the database is the following. The c= parameters are urlsafe()-encoded entity keys of the records to update and the nv= parameter is the new version number for their update_ver property.
class UpdateCustomer(webapp2.RequestHandler):
    @ndb.toplevel # don't exit until all async operations are finished
    def post(self):
        # (authentication check stripped from this excerpt; its failure branch set status 403)
        updatever=int(self.request.get("nv")) # update_ver is an integer property
        customers=self.request.get_all("c")
        for ckey in customers:
            cust=ndb.Key(urlsafe=ckey).get()
            cust.update_ver=updatever # filter in the query used to produce the cursor was using this property!
            cust.update_date=datetime.datetime.utcnow()
            cust.put_async()
Will this work as expected? Thanks for any help!
Your strategy will work and that's the whole point for using these cursors, because they are efficient and you can get the next batch as it was intended regardless of what happened with the previous one.
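To see why updating the filtered property does not disturb the paging, it can help to think of the cursor as a position in the ordered results rather than as a snapshot of them. The following is a toy in-memory model in Python, not the datastore's actual implementation: entities is a plain dict from key to update_ver, and fetch_page and run_update are hypothetical helpers that only loosely mirror the (results, cursor, more) shape of ndb's fetch_page().

```python
def fetch_page(entities, batch_size, start_cursor=None):
    """Toy fetch_page: the cursor is simply the last key returned so far.

    entities maps key -> update_ver. Returns (keys, cursor, more).
    """
    matching = sorted(
        key
        for key, ver in entities.items()
        if ver == 0 and (start_cursor is None or key > start_cursor)
    )
    page = matching[:batch_size]
    cursor = page[-1] if page else start_cursor
    more = len(matching) > batch_size
    return page, cursor, more


def run_update(entities, batch_size, new_ver=2):
    """Page through the entities, marking each batch as updated before fetching the next."""
    seen, cursor, more = [], None, True
    while more:
        page, cursor, more = fetch_page(entities, batch_size, cursor)
        for key in page:
            entities[key] = new_ver  # entity no longer matches update_ver == 0
        seen.extend(page)
    return seen


if __name__ == "__main__":
    customers = {"c%02d" % i: 0 for i in range(7)}
    print(run_update(customers, batch_size=3))
```

In this model every entity is visited exactly once even though each batch drops out of the filter before the next page is requested, because the next page starts after the cursor position and never looks at earlier keys again.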
On a side note you could also optimise your UpdateCustomer and instead of retrieving/saving one by one you can do things in batches using for example the ndb.put_multi_async.
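A batched version of the update could look roughly like the sketch below. To keep it runnable outside App Engine, the two datastore calls are passed in as parameters; in the handler they would be ndb.get_multi and ndb.put_multi_async, and keys would be built as [ndb.Key(urlsafe=c) for c in customers]. The function name mark_updated is made up for the example.

```python
import datetime


def mark_updated(keys, new_ver, get_multi, put_multi_async):
    """Fetch all entities in one call, modify them, and write them back in one call.

    get_multi and put_multi_async are injected so the sketch runs without
    App Engine; with ndb they would be ndb.get_multi and ndb.put_multi_async.
    """
    # get_multi returns None for keys that do not exist; skip those.
    entities = [e for e in get_multi(keys) if e is not None]
    # Naive UTC timestamp, equivalent to utcnow() in the question's code.
    now = datetime.datetime.now(datetime.timezone.utc).replace(tzinfo=None)
    for entity in entities:
        entity.update_ver = new_ver
        entity.update_date = now
    # With ndb this returns a list of futures, one per entity.
    return put_multi_async(entities)
```

This replaces one datastore round trip per customer with one batched read and one batched write for the whole request. The handler still has to make sure the returned futures complete before the request ends.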
I have two apps: one app is asp.net and another is a windows service running in background.
The Windows service running in the background performs some tasks (reads and updates) on the database while users can perform other operations on the database through the ASP.NET app. What worries me is this: in the Windows service I collect some records that satisfy a condition and then iterate over them, something like:
IQueryable<EntityA> collection = context.EntitiesA.Where(<condition>);
foreach (EntityA entity in collection)
{
    // do some stuff
}
So, if a user modifies a record that is used later in the loop, which value for that record does EF take into account? The original one retrieved when performing:
context.EntitiesA.Where(<condition>)
or the new one modified by the user and stored in the database?
As far as I know, during iteration EF fetches each record on demand, I mean one by one. So when reading the next record for the next iteration, does this record correspond to the one collected from:
context.EntitiesA.Where(<condition>)
or to the one in the database (the one the user has just modified)?
Thanks!
There are a couple of processes that come into play here in terms of how this works in EF.
Queries are only performed on enumeration (this is sometimes referred to as query materialisation); at that point the whole query is performed.
Lazy loading only affects navigation properties. In your example above, the result set of the Where statement will be pulled down in one go.
So what does this mean in your case:
// Nothing happens here; you are just describing what will happen later.
// To make the query execute here, call .ToArray() or similar. To prevent people
// adding to the SQL resulting from this, use .AsEnumerable().
IQueryable<EntityA> collection = context.EntitiesA.Where(<condition>);
// When it first hits this foreach, a
// SELECT {cols} FROM [YourTable] WHERE [YourCondition] will be performed.
foreach (EntityA entity in collection)
{
    // Data here is from the point in time the foreach started (e.g. if the database
    // was updated during the enumeration, you will have out-of-date data).
    // do some stuff
}
If you're truly concerned that this can happen, then get a list of IDs up front and process them individually with a new DbContext for each (or, say, after each batch of 10). Something like:
IList<int> ids = context.EntitiesA.Where(...).Select(k => k.id).ToList();
foreach (int entityId in ids)
{
    // A fresh context per item, so each entity is read from the database when it is processed.
    using (Context itemContext = new Context())
    {
        EntityA entity = itemContext.EntitiesA.Find(entityId);
        // do some stuff
        itemContext.SaveChanges();
    }
}
I think the answer to your question is "it depends". The problem you are describing is called "non-repeatable reads" and can be prevented by setting a proper transaction isolation level. But that comes at a cost in performance and potential deadlocks.
For more details you can read this
I have been developing some single-user desktop apps using Entity Framework and SQL Server CE 3.5. I thought I had read somewhere that once records are in an EF cache for one context, if they are deleted using a different context, they are not removed from the cache of the first context even when a new query is executed. Hence, I've been writing really inefficient and obfuscatory code so I can dispose of the context and instantiate a new one whenever another method modifies the database using its own context.
I recently discovered some code where I had not re-instantiated the first context under these conditions, but it worked anyway. I wrote a simple test method to see what was going on:
using (UnitsDefinitionEntities context1 = new UnitsDefinitionEntities())
{
    List<RealmDef> rdl1 = (from RealmDef rd in context1.RealmDefs
                           select rd).ToList();
    RealmDef rd1 = RealmDef.CreateRealmDef(100, "TestRealm1", MeasurementSystem.Unknown, 0);
    context1.RealmDefs.AddObject(rd1);
    context1.SaveChanges();
    int rd1ID = rd1.RealmID;
    using (UnitsDefinitionEntities context2
        = new UnitsDefinitionEntities())
    {
        RealmDef rd2 = (from RealmDef r in context2.RealmDefs
                        where r.RealmID == rd1ID select r).Single();
        context2.RealmDefs.DeleteObject(rd2);
        context2.SaveChanges();
        rd2 = null;
    }
    rdl1 = (from RealmDef rd in context1.RealmDefs select rd).ToList();
}
Setting a breakpoint at the last line I was amazed to find that the added and deleted entity was in fact not returned by the second query on the first context!
I see several possible explanations:
1. I am totally mistaken in my understanding that the cached records are not removed upon requerying.
2. EF is capricious in its caching and it's a matter of luck.
3. Caching has changed in EF 4.1.
4. The issue does not arise when the two contexts are instantiated in the same process.
5. Caching works differently for SQL CE 3.5 than for other versions of SQL Server.
I suspect the answer may be one of the last two options. I would really rather not have to deal with all the hassles in constantly re-instantiating contexts for single-user desktop apps if I don't have to do so.
Can I rely on this discovered behavior for single-user desktop apps using SQL CE (3.5 and 4)?
When you run the 2nd query on the ObjectSet, it's requerying the database, which is why it reflects the change made by your 2nd context. Before we go too far into this, are you sure you want to have 2 contexts like you're describing? Contexts should be short-lived, so it might be better to cache your list in memory or do something else of that nature.
That being said, you can access the local store by calling ObjectStateManager.GetObjectStateEntries and viewing what is in the store there. However, what you're probably looking for is the .Local storage that's provided by DbSets in EF 4.2 and beyond. See this blog post for more information about that.
Judging by your class names, it looks like you're using an edmx, so you'll need to make some changes to your file to have your context expose DbSets rather than ObjectSets. This post can show you how
Apparently Explanation #1 was closer to the fact. Inserting the following statement at the end of the example:
var cached = context1.ObjectStateManager.GetObjectStateEntries(System.Data.EntityState.Unchanged);
revealed that the record was in fact still in the cache. Mark Oreta was essentially correct in that the database is actually re-queried in the above example.
However, navigation properties apparently behave differently, e.g.:
RealmDef distance = (from RealmDef rd in context1.RealmDefs
                     where rd.Name == "Distance"
                     select rd).Single();
SystemDef metric = (from SystemDef sd in context1.SystemDefs
                    where sd.Name == "Metric"
                    select sd).Single();
RealmSystem rs1 = (from RealmSystem rs in distance.RealmSystems
                   where rs.SystemID == metric.SystemID
                   select rs).Single();
UnitDef ud1 = UnitDef.CreateUnitDef(distance.RealmID, metric.SystemID, 100, "testunit");
rs1.UnitDefs.Add(ud1);
context1.SaveChanges();
using (UnitsDefinitionEntities context2 = new UnitsDefinitionEntities())
{
    UnitDef ud2 = (from UnitDef ud in context2.UnitDefs
                   where ud.Name == "testunit"
                   select ud).Single();
    context2.UnitDefs.DeleteObject(ud2);
    context2.SaveChanges();
}
udList = (from UnitDef ud in rs1.UnitDefs select ud).ToList();
In this case, breaking after the last statement reveals that the last query returns the deleted entry from the cache. This was my source of confusion.
I think I now have a better understanding of what Julia Lerman meant by "Query the model, not the database." As I understand it, in the previous example I was querying the database. In this case I am querying the model. Querying the database in the previous situation happened to do what I wanted, whereas in the latter situation querying the model would not have the desired effect. (This is clearly a problem with my understanding, not with Julia's advice.)