Entity Framework Query Execution Results

We are contemplating the use of Entity Framework for a bulk upload job. Because we are interested in tracking the individual result of each record, I am trying to find out whether all inserted items are done as part of a single transaction, whether a single failure would cause a rollback, and whether there is a way to track the result of each individual insert.
I was contemplating just looping through the items, calling SaveChanges() on each, and wrapping that in a try/catch block, but a single SaveChanges() call on the context seems more efficient if I can tap into the individual results.
Thanks!

By default, each time you call SaveChanges all pending records are saved at once. All operations within a SaveChanges call run inside a transaction (source), so if a particular SaveChanges call fails it will roll back the statements in that call, but obviously not previously successful calls.
Ultimately it depends on where you call SaveChanges as to how much of a rollback you get. You can extend the length of a transaction with TransactionScope; see this article for some examples of how to extend the scope.
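As a rough illustration, a TransactionScope that makes several SaveChanges calls commit or roll back together might look like this (BulkContext, Items, and batches are names invented for the sketch, not part of the question):
using (var scope = new System.Transactions.TransactionScope())
using (var context = new BulkContext())
{
    foreach (var batch in batches)
    {
        context.Items.AddRange(batch); // EF6 DbSet.AddRange; each batch is a chunk of rows
        context.SaveChanges();         // enlists in the ambient transaction
    }
    scope.Complete();                  // commit only if every batch succeeded
}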
My own experience is that calling SaveChanges too often can really hurt performance and slow the entire process down. But obviously waiting too long between calls means a greater memory footprint. I found that EF can easily handle several thousand rows without much of a memory problem.
Placing the code in a try...catch will help, but you may also want to look at entity validation: you can place restrictions on the objects, and validation will then run against them before any SQL is generated.
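If per-record results matter more than raw throughput, the loop the question describes could be sketched like this (Item, BulkContext, and itemsToInsert are assumed names):
var failures = new Dictionary<Item, Exception>(); // Item is an invented entity type
foreach (var item in itemsToInsert)
{
    try
    {
        using (var context = new BulkContext()) // fresh context so one bad row doesn't poison later saves
        {
            context.Items.Add(item);
            context.SaveChanges();              // runs in its own implicit transaction, per the answer above
        }
    }
    catch (Exception ex)
    {
        failures[item] = ex;                    // the individual result the question asks about
    }
}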

Related

What's the difference between DbUpdateConcurrencyException and 40001 postgres code?

I have two transactions.
In the first, I select an entity, do validations, upload a file provided by the client to S3, and then update this entity with info about the S3 file.
The second transaction simply deletes this entity.
Now, assume that someone called the first transaction and immediately afterwards the second. The second one proceeds faster, and the first one throws DbUpdateConcurrencyException, as the selected entity no longer exists at the time of the update query.
I get DbUpdateConcurrencyException when my transaction has IsolationLevel.ReadCommitted. But if I set IsolationLevel.Serializable, it throws InvalidOperationException with the 40001 Postgres code. Could someone explain why I get different errors? It seems to me that the outcome should be the same, as both errors are triggered by updating a non-existing entity.
The 40001 error corresponds to the SQLSTATE serialization_failure (see the table of error codes).
It's generated by the database engine at the serializable isolation level when it detects that there are concurrent transactions, and that this transaction may have produced a result that could not have been obtained if the concurrent transactions had run serially.
When using IsolationLevel.ReadCommitted, it's impossible to get this error, because choosing that isolation level precisely means the client side doesn't want the database to perform these isolation checks.
On the other hand, the DbUpdateConcurrencyException is probably not generated by the database engine; it's generated by Entity Framework. The database itself is fine with an UPDATE touching zero rows; that's not an error at the SQL level.
I think you get the serialization failure if the database errors out first, and the DbUpdateConcurrencyException if the database doesn't error out but the next layer up (EF) does.
The typical way to deal with serialization failures at the serializable isolation level is for the client side to retry the transaction when it gets a 40001 error. The retried transaction will have a fresh view of the data and will hopefully pass (otherwise, keep retrying in a loop).
The typical way to deal with concurrency at weaker isolation levels like Read Committed is to explicitly lock objects before accessing them, forcing the serialization of concurrent transactions.
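A minimal sketch of such a retry loop, assuming EF6/EF Core's Database.BeginTransaction and Npgsql's PostgresException (AppDbContext and DoWork are placeholders for the question's own context and logic):
const int maxAttempts = 3;
for (var attempt = 1; ; attempt++)
{
    try
    {
        using (var db = new AppDbContext())
        using (var tx = db.Database.BeginTransaction(System.Data.IsolationLevel.Serializable))
        {
            DoWork(db);        // placeholder for the select/validate/update logic
            db.SaveChanges();
            tx.Commit();
        }
        break; // success
    }
    catch (Exception ex) when (attempt < maxAttempts &&
        (ex.GetBaseException() as Npgsql.PostgresException)?.SqlState == "40001")
    {
        // serialization failure: another transaction won; retry with a fresh view of the data
    }
}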

Performance Entity framework 6 startup and update

I'm using lazy loading and pre-generated views.
I create the context.
I load all my objects, about 6,000, with navigations filled in, in about 3 seconds. That's OK.
The first time I update all my objects it takes about 4 minutes.
The second time, it takes about 6 seconds.
I suspect lazy loading is running in the background and the update is doing something that makes it loop over everything again, or EF startup is still initializing.
I'm on EF 6.1 and the data is hierarchical.
The database is about 6,000 rows across 30 tables.
The EF model is database-first.
Any workaround?
If your concern is about the time difference between the two updates, I suspect that the second one runs faster because fewer objects have been modified.
Why do you need all the objects loaded into your context? What is the lifecycle of your context?
The general recommendation is that you create single-use contexts - for a single web request or a single Windows form.
You'll also see a lot of examples with a using statement, where the life of the context is purposely kept very short. Letting the context live too long can increase memory usage and the likelihood of concurrency problems; your database and other layers also do their own caching.
Lastly, lazy loading is fine as long as you don't know whether you'll need things. If you know you're going to need related records and want to avoid multiple round trips, consider explicitly loading that data.
http://msdn.microsoft.com/en-us/data/jj574232.aspx
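A minimal sketch of that short-lived pattern with eager loading (OrdersContext, Orders, Lines, and IsOpen are invented names; the lambda Include needs a using System.Data.Entity; directive in EF6):
using (var db = new OrdersContext())
{
    var orders = db.Orders
        .Include(o => o.Lines)   // one round trip for the related records instead of many lazy loads
        .Where(o => o.IsOpen)
        .ToList();
    // ... work with orders ...
} // context disposed here; nothing lingers to grow memory or go stale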

Entity Framework concurrency: transactions or ConcurrencyMode.Fixed?

I need to implement stock control, so I need to ensure that when I modify the quantity of a product it is done correctly. I am using Entity Framework 4.0.
For example, if I use a transaction, when I load the record from the database the record is locked, so I can subtract from or add to the loaded amount the number of items I need. However, this locks the record in the database, and for performance reasons that is perhaps not the best approach. This makes me ask when to use transactions with EF.
The other option is to use Entity Framework's fixed concurrency mode, using a timestamp column to detect whether the record has changed. In this case, if the record has been modified between my load and my update, I get a concurrency exception. But it could happen that in my exception handler, after I refresh my context with the database data, the record changes yet again between my refresh and SaveChanges.
The other problem is what happens when I finally can save the changes. For example, I have 10 units and I need to subtract 8, but between my load and my update another person subtracts 5 units. If I subtract 8, the stock ends up at -3 units, which is not possible. With a transaction, I load the record, it is locked, and I can check whether I have enough units: if yes, I subtract; if not, I throw an exception.
So my question is: I know that each SaveChanges call in EF is a transaction by itself, but explicit transactions also exist in EF, so they should be useful in some cases. When should I use fixed concurrency mode and when should I use transactions?
Thanks.
Daimroc.
Transactions should not be used to solve concurrency. You should keep transactions as short as possible so they don't block your database. The way to go here is optimistic concurrency: in the database (SQL Server) you create a rowversion column that changes automatically each time a row is modified, and you use it as the concurrency token. When saving changes, EF checks it against the value on the entity, and if they don't match it throws an exception.
Note that for SaveChanges EF always creates a transaction, since you typically save more than one entity and, if something goes wrong, the database needs to be reverted to its original state - otherwise it would be left in a corrupt state.
To prevent the stock from going below zero with optimistic concurrency: if the value in the database has changed since you read it, the save will fail because the concurrency token no longer matches, so a check on the client before saving is sufficient. Alternatively, you could map saving the entity to a stored procedure that checks the value before saving and returns an error if the result would be invalid.
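A minimal sketch of that rowversion approach with the DbContext API (EF 4.1+; with the question's EF 4.0 ObjectContext the exception is OptimisticConcurrencyException instead). StockItem, db, item, and quantity are invented for the example:
public class StockItem
{
    public int Id { get; set; }
    public int UnitsInStock { get; set; }
    [Timestamp]                        // maps to a SQL Server rowversion column used as the concurrency token
    public byte[] RowVersion { get; set; }
}
// caller side: client check plus optimistic concurrency
try
{
    if (item.UnitsInStock < quantity)
        throw new InvalidOperationException("Not enough stock.");
    item.UnitsInStock -= quantity;
    db.SaveChanges();                  // fails if RowVersion no longer matches the database
}
catch (DbUpdateConcurrencyException)
{
    // someone else changed the row first: reload the entity and re-run the check
}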

Are there drawbacks to going around Entity Framework?

I have some design requirements that are not supported by Entity Framework, but are easily met by a simple SQL Query.
Essentially I need to do an insert that sets an Identity value.
Are there drawbacks to making a sproc that does my insert and then having EF call that sproc?
Are there caching concerns I need to be worried about? (Because I will be updating data "behind EF's back".)
Are there concurrency issues?
Anything else I need to be worried about?
If you don't keep your db context around, i.e. you dispose of it after every unit of work (this covers most web scenarios), this should work just fine - unless you operate concurrently on the same table. If that is the case, you might want to use a lock to synchronize the SQL and the EF queries, or catch the OptimisticConcurrencyException thrown by EF.
If, on the other hand, you do keep a context around, make sure that you refresh it with RefreshMode.StoreWins.
Also see "Saving Changes and Managing Concurrency"
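With the EF4 ObjectContext API, that refresh is a one-liner (context and affectedEntities are assumed names):
// pull the store's values back in so EF doesn't later overwrite what the sproc wrote
context.Refresh(System.Data.Objects.RefreshMode.StoreWins, affectedEntities);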

Entity framework - what to do if SaveChanges fails and I don't want some changes to be made?

I think I'm running into a common problem:
I would like to try to insert an object into the database. If the primary key is violated, I would like to abort the insert. (This is just an example; the question really applies to any kind of error and any of the CRUD operations.)
How can I discard changes made to EF context?
I can't afford recreating it every time something goes wrong.
PS. I know that perhaps I could check whether everything is OK, e.g. by querying the db, but I don't like the idea. Db constraints are there for a reason, and this way it's faster and I have to write less code.
You can detach the inserted entity from the ObjectContext. You can also use the ObjectStateManager and its GetObjectStateEntries method; through an ObjectStateEntry you can modify an entity's state.
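A minimal sketch of that cleanup with the ObjectContext API, detaching everything still pending as Added after a failed SaveChanges (context is an assumed name; Where/ToList need System.Linq):
var added = context.ObjectStateManager
    .GetObjectStateEntries(System.Data.EntityState.Added);
foreach (var entry in added.Where(e => !e.IsRelationship).ToList())
{
    context.Detach(entry.Entity);  // or entry.ChangeState(EntityState.Detached) in EF4+
}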
The problem is that you are not using the technology the way it is meant to be used:
"I can't afford recreating it every time something goes wrong."
Sure you should, because your code doesn't prevent such situations.
"PS. I know that perhaps I could check if everything is ok eg. by querying the db, but I don't like the idea. Db constraints are there for some reason and this way it's faster and I have to write less code."
Yes, indeed you should check that everything is OK. Calling the database to "validate" your data is something that DBAs really like (sarcasm). It is your responsibility to achieve the highest possible validity of your data before you call SaveChanges. I can imagine that many senior developers / team leads would simply not let your code through code review. And by the way, in most cases it is not faster, because of inter-process or network communication.
Try using DbTransaction. Assuming an EF4 ObjectContext named _ent, the shape is roughly:
System.Data.Common.DbTransaction tran = null;
try
{
    _ent.Connection.Open();                 // the connection must be open before BeginTransaction
    tran = _ent.Connection.BeginTransaction();
    _ent.SaveChanges();
    tran.Commit();                          // commit after SaveChanges succeeds
}
catch
{
    if (tran != null) tran.Rollback();      // and if there's an exception, do a rollback
    throw;
}