Is it possible to improve EF6 WarmUp time? - entity-framework

I have an application in which I verify the following behavior: the first requests after a long period of inactivity take a long time, and timeout sometimes.
Is it possible to control how the entity framework manages dispose of the objects? Is it possible mark some Entities to never be disposed?
...in order to avoid/improve the warmup time?
Regards,

The reasons that similar queries will have an improved response time are manifold.
Most Database Management Systems cache parts of the fetched data, so that similar queries in the near future will be faster. If you do query Teachers with their Students, then the Teachers table will be joined with the Students table. This join result is quite often cached for a while. The next query for Teachers with their Students will reuse this join result and thus become faster
DbContext caches queried object. If you select a Single teacher, or Find one, it is kept in local memory. This is to be able to detect which items are changed when you call SaveChanges. If you Find the same Teacher again, this query will be faster. I'm not sure if the same happens if you query 1000 Teachers.
When you create a DbContext object, the initializer is checked to see if the model has been changed or not.
So it might seem wise not to Dispose() a created DbContext, yet you see that most people keep the DbContext alive for a fairly short time:
using (var dbContext = new MyDbContext(...))
{
var fetchedTeacher = dbContext.Teachers
.Where(teacher => teacher.Id = ...)
.Select(teacher => new
{
Id = teacher.Id,
Name = teacher.Name,
Students = teacher.Students.ToList(),
})
.FirstOrDefault();
return fetchedTeacher;
}
// DbContext is Disposed()
At first glance it would seem that it would be better to keep the DbContext alive. If someone asks for the same Teacher, the DbContext wouldn't have to ask the database for it, it could return the local Teacher..
However, keeping a DbContext alive might cause that you get the wrong data. If someone else changes the Teacher between your first and second query for this Teacher, you would get the old Teacher data.
Hence it is wise to keep the life time of a DbContext as short as possible.
Is there nothing I can do to improve the speed of the first query?
Yes you can!
One of the first things you could do is to set the initialize of your database such that it doesn't check the existence and model of the database. Of course you can only do this when you are fairly sure that your database exists and hasn't changed.
// constructor; disables initializer
public SchoolDBContext() : base(...)
{
//Disable initializer
Database.SetInitializer<SchoolDBContext>(null);
}
Another thing could be, if you already have fetched your object to update the database, and you are sure that no one else changed the object, you can Attach it, instead of fetching it again, as is shown in this question
Normal usage:
// update the name of the teacher with teacherId
void ChangeTeacherName(int teacherId, string name)
{
using (var dbContext = new SchoolContext(...))
{
// fetch the teacher, change the name and save
Teacher fetchedTeacher = dbContext.Teachers.Find(teacherId);
fetchedTeader.Name = name;
dbContext.SaveChanges();
}
}
Using Attach to update an earlier fetched Teacher:
void ChangeTeacherName (Teacher teacher, string name)
{
using (var dbContext = new SchoolContext(...))
{
dbContext.Teachers.Attach(teacher);
dbContext.Entry(teacher).Property(t => t.Name).IsModified = true;
dbContext.SaveChanges();
}
}
Using this method doesn't require to fetch the Teacher again. During SaveChanges the value of IsModified of all properties of all Attached items is checked. If needed they will be updated.

Related

ExecuteStoreCommand "Delete" returning different record count to "DeleteObject" post delete, why?

Got a strange one here. I am using EF 6 over SQL Server 2012 and C#.
If I delete a record, using DeleteObject, I get:
//order.orderitem count = 11
db.OrderItem.DeleteObject(orderitem);
db.SaveChanges();
var order = db.order.First(r => r.Id == order.id);
//Order.OrderItem count = 10, CORRECT
If I delete an Order Item, using ExecuteStoreCmd inline DML, I get:
//order.orderitem count = 11
db.ExecuteStoreCommand("DELETE FROM ORDERITEM WHERE ID ={0}", orderitem.Id);
var order = db.Order.First(r => r.Id == order.id);
//order.orderitem count = 11, INCORRECT, should be 10
So the ExecuteStoreCommand version reports 11, however the OrderItem is definitely deleted from the DB, so it should report 10. Also I would have thought First() does an Eager search thus repopulating the "order.orderitem" collection.
Any ideas why this is happening? Thanks.
EDIT: I am using ObjectContext
EDIT2: This is the closest working solution I have using "detach". Interestingly the "detach" actually takes about 2 secs ! Not sure what it is doing, but it works.
db.ExecuteStoreCommand("DELETE FROM ORDERITEM WHERE ID ={0}", orderitem.Id);
db.detach(orderitem);
It would be quicker to requery and repopulate the dataset. How can I force a requery? I thought the following would do it:
var order = db.order.First(r => r.Id == order.id);
EDIT3: This seems to work to force a refresh post delete, but still take about 2 secs:
db.Refresh(RefreshMode.StoreWins,Order.OrderItem);
I am still not really understanding why one cannot just requery as a Order.First(r=>r.id==id) type query oftens take much less than 2 secs.
This would likely be because the Order and it's order items are already known to the context when you perform the ExecuteStoredCommand. EF doesn't know that the command relates to any cached copy of Order, so the command will be sent to the database, but not update any loaded entity state. WHere-as the first one would look for any loaded OrderItem, and when told to remove it from the DbSet, it would look for any loaded entities that reference that order item.
If you don't want to ensure the entity(ies) are loaded prior to deleting, then you will need to check if any are loaded and refresh or detach their associated references.
If orderitem represents an entity should just be able to use:
db.OrderItems.Remove(orderitem);
If the order is loaded, the order item should be removed automatically. If the order isn't loaded, no loss, it will be loaded from the database when requested later on and load the set of order items from the DB.
However, if you want to use the SQL execute approach, detaching any local instance should remove it from the local cache.
db.ExecuteStoreCommand("DELETE FROM ORDERITEM WHERE ID ={0}", orderitem.Id);
var existingOrderItem = db.OrderItems.Local.SingleOrDefault(x => x.Id == orderItem.Id);
if(existingOrderItem != null)
db.Entity(existingOrderItem).State = EntityState.Detached;
I don't believe you will need to check for the orderItem's Order to refresh anything beyond this, but I'm not 100% sure on that. Generally though when it comes to modifying data state I opt to load the applicable top-level entity and remove it's child.
So if I had a command to remove an order item from an order:
public void RemoveOrderItem(int orderId, int orderItemId)
{
using (var context = new MyDbContext())
{
// TODO: Validate that the current user session has access to this order ID
var order = context.Orders.Include(x => x.OrderItems).Single(x => x.OrderId == orderId);
var orderItem = order.OrderItems.SingleOrDefault(x => x.OrderItemId == orderItemId);
if (orderItem != null)
order.OrderItems.Remove(orderItem);
context.SaveChanges();
}
}
The key points to this approach.
While it does mean loading the data state again for the operation, this load is by ID so it's fast.
We can/should validate that the data requested is applicable for the user. Any command for an order they should not access should be logged and the session ended.
We know we will be dealing with the current data state, not basing decisions on values/data from the point in time that data was first read.

Entity Framework, DbSet<T>, Dispose() performance questions

This is my first post :) I'm new to MVC .NET. And have some questions in regards to Entity Framework functionality and performance. Questions inline...
class StudentContext : DbContext
{
public StudentContext() : base("myconnectionstring") {};
public DbSet<Student> Students {get; set; }
...
}
Question: Does DbSet read all the records from the database Student table, and store it in collection Students (i.e. in memory)? Or does it simply hold a connection to this table, and (record) fetches are done at the time SQL is executed against the database?
For the following:
private StudentContext db = new StudentContext();
Student astudent = db.Students.Find(id);
or
var astudent = from s in db.Students
where s.StudentID == id)
select s;
Question: Which of these are better for performance? I'm not sure how the Find method works under-the-hood for a collection?
Question: When are database connections closed? During the Dispose() method call?
If so, should I call the Dispose() method for a class that has the database context instance? I've read here to use Using blocks.
I'm guessing for a Controller class get's instantiated, does work including database access, calls it's associated View, and then (the Controller) gets out of scope and is unloaded from memory. Or the garbase collector. But best call Dispose() to do cleanup explicitly.
The Find method looks in the DbContext for an entity which has the specified key(s). If there is no matching entity already loaded, the DbContext will makes a SELECT TOP 1 query to get the entity.
Running db.Students.Where(s => s.StudentID == id) will get you a sequence containing all the entities returned from a SQL query similar to SELECT * FROM Students WHERE StudentID = #id. That should be pretty fast; you can speed it up by using db.Students.FirstOrDefault(s => s.StudentID == id), which adds a TOP 1 to the SQL query.
Using Find is more efficient if you're loading the same entity more than once from the same DbContext. Other than that Find and FirstOrDefault are pretty much equivalent.
In neither case does the context load the entire table, nor does it hold open a connection. I believe the DbContext holds a connection until the DbContext is disposed, but it opens and closes the connection on demand when it needs to resolve a query.

JPA bulk insert if not exists

I have 3 entities - Player, Hand and PlayerHandStats. First two are regular tables, with ID as an PK. PlayerHandStats on the other hand has a composite PK (player_id, hand_id). Together they form some sort of a "package", that's why I am trying to persist them in one block. ParsedHand is just a class which bundles the entities - it contains 1 Hand, 2..10 Player and 2..10 PlayerHandStats. Below is my current naive approach, which doesn't really work.
public static void persist(ParsedHand parsedHand) {
EntityManager em = emf.createEntityManager();
em.getTransaction().begin();
em.persist(parsedHand.getHand());
Collection<Player> players = parsedHand.getPlayers().values();
for (Player player : players) {
em.persist(player);
}
Collection<PlayerHandStats> stats = parsedHand.getStats().values();
for (PlayerHandStats phs : stats) {
em.persist(stats);
}
em.getTransaction().commit();
em.close();
}
Problem is, that a specific Player entity may already exists in the DB - in that case the whole process terminates. I would like to keep it going, not perform any merge or update upon the entity, but retrieve its' ID (since at application level it has no ID assigned).
Quick example (NOTE: columns NAME and POKER_SITE form an UNIQUE CONSTRAINT):
ID |NAME |POKER_SITE
----+------------+------------
0 |neverlimp92 |PokerStars
1 |player01 |PokerStars
Now, let's say I have a Player entity at application level with fields (null, 'neverlimp92', 'PokerStars'). Obviously, it will return java.sql.SQLIntegrityConstraintViolationException, and the whole process terminates. How can I avoid this? Should I override the hashCode() and equals() methods, and perform
for (Player player : playes) {
if (!em.contains(player)) {
em.persist(player);
}
}
I am not sure if this is a smart thing to do, considering there may be potentially 10k's, even 100k's rows.
And also, if the entity does exists, what is the proper way of retrieving its ID and assigning it to the existing instance at the application level?
I have very little experience with JPA, and any kind of help, or pointing me in the right direction is much appreciated. Thank you.
If the players may be existing players, then for each player you need to query the database for the player with that name and site, if one exists use it, if it does not then persist it.
This depends on how likely it is that their is such a player. If it is very unlikely, then you could just persist it, and if the commit fails, then retry with the second algorithm.
In general it would be better if your player had its id, then you would know if it is new or existing.

Delete a child from an aggregate root

I have a common Repository with Add, Update, Delete.
We'll name it CustomerRepository.
I have a entity (POCO) named Customer, which is an aggregate root, with Addresses.
public class Customer
{
public Address Addresses { get; set; }
}
I am in a detached entity framework 5 scenario.
Now, let's say that after getting the customer, I choose to delete a client address.
I submit the Customer aggregate root to the repository, by the Update method.
How can I save the modifications made on the addresses ?
If the address id is 0, I can suppose that the address is new.
For the rest of the address, I can chose to attach all the addresses, and mark it as updated no matter what.
For deleted addresses I can see no workaround...
We could say this solution is incomplete and inefficient.
So how the updates of aggregate root childs should be done ?
Do I have to complete the CustomerRepository with methods like AddAddress, UpdateAddress, DeleteAddress ?
It seems like it would kind of break the pattern though...
Do I put a Persistence state on each POCO:
public enum PersistanceState
{
Unchanged,
New,
Updated,
Deleted
}
And then have only one method in my CustomerRepository, Save ?
In this case it seems that I am reinventing the Entity "Non-POCO" objects, and adding data access related attribute to a business object...
First, you should keep your repository with Add, Update, and Delete methods, although I personally prefer Add, indexer set, and Remove so that the repository looks like an in memory collection to the application code.
Secondly, the repository should be responsible for tracking persistence states. I don't even clutter up my domain objects with
object ID { get; }
like some people do. Instead, my repositories look like this:
public class ConcreteRepository : List<AggregateRootDataModel>, IAggregateRootRepository
The AggregateRootDataModel class is what I use to track the IDs of my in-memory objects as well as track any persistence information. In your case, I would put a property of
List<AddressDataModel> Addresses { get; }
on my CustomerDataModel class which would also hold the Customer domain object as well as the database ID for the customer. Then, when a customer is updated, I would have code like:
public class ConcreteRepository : List<AggregateRootDataModel>, IAggregateRootRepository
{
public Customer this[int index]
{
set
{
//Lookup the data model
AggregateRootDataModel model = (from AggregateRootDataModel dm in this
where dm.Customer == value
select dm).SingleOrDefault();
//Inside the setter for this property, run your comparison
//and mark addresses as needing to be added, updated, or deleted.
model.Customer = value;
SaveModel(model); //Run your EF code to save the model back to the database.
}
}
}
The main caveat with this approach is that your Domain Model must be a reference type and you shouldn't be overriding GetHashCode(). The main reason for this is that when you perform the lookup for the matching data model, the hash code can't be dependent upon the values of any changeable properties because it needs to remain the same even if the application code has modified the values of properties on the instance of the domain model. Using this approach, the application code becomes:
IAggregateRootRepository rep = new ConcreteRepository([arguments that load the repository from the db]);
Customer customer = rep[0]; //or however you choose to select your Customer.
customer.Addresses = newAddresses; //change the addresses
rep[0] = customer;
The easy way is using Self Tracking entities What is the purpose of self tracking entities? (I don't like it, because tracking is different responsability).
The hard way, you take the original collection and you compare :-/
Update relationships when saving changes of EF4 POCO objects
Other way may be, event tracking ?

Entity Framework 5 Foreign Key New Record on SaveChanges

I'm using .NET4.5/EF5 and have created the model from an existing database.
I'm using the following code:
Order currentOrder = new Order();
using (var db = new ILSEntities())
{
try
{
Event currentEvent = db.Events.OrderByDescending(u => u.EventID).FirstOrDefault();
currentOrder.Event = currentEvent;
db.Orders.Add(currentOrder);
db.SaveChanges();
And I'm seeing that a duplicate record is being created of the Event object I find, which is not what I wanted to happen.
I've read a lot of posts relating to similar problems, but where the context of the two participants in the foreign key relationships are different. Here, I'm saving with the same context I use to find one, and the other object is new.
I've also tried:
currentOrder.Event.EventID = currentEvent.EventID;
but that fails as well as I get an EF validation error telling me it needs values for the other members of the Event object.
I've also tried specifically setting the EntityState of the object being duplicated to Detached, Modified etc. after adding the Order object but before SaveChanges without success.
I'm sure this is a basic problem, but it's got me baffled
In my understanding, both parent and child objects have to be in the context before you assign any relationship between them to convince the entity framework that an entity exists in the database already. I guess you are trying to add new Order object to Database, to add new object you should be using AddObject method, Add() method is used to establish relation between entitties. In your code, currentOrder is not in the context. Try to hook it in the same context and then assign a relation. Your code should look like this :
Order currentOrder = new Order();
using (var db = new ILSEntities())
{
try
{
Event currentEvent = db.Events.OrderByDescending(u => u.EventID).FirstOrDefault();
db.Orders.Attach(currentOrder); //attach currentOrder to context as it was not loaded from the context
currentOrder.Events.Add(currentEvent);//establish relationship
db.ObjectStateManager.ChangeObjectState(currentOrder, EntityState.Added);
db.SaveChanges();
}
}
OK, I did in the end figure this out, and it was my fault.
The problem was that the Order object is FK'd into another table, Shipments, which is also FK'd into Events. The problem was that it was the Event reference in the Shipment object that was causing the new record. The solution was to let EF know about these relationships by adding them all within the same context.
The code assembling the object graph was spread over a number of webforms and the responses here made me take a step back and look at the whole thing critically so whilst no one of these answers is correct, I'm voting everybody who replied up