I created simple WinForms application that connect to Firebird server using EntityFramework. Database contains only one table with 4 fields (Id, FirstName, LastName, Email).
When I run in parallel thre different query for update different field I've got exception with message "lock conflict on no wait transaction".
Is it EF specific behavior or I need to tune firebird server for usage field level locking?
You get this error if you are updating the same row in multiple transactions at the same time. There is no such thing as field level locking in Firebird, as the row as a whole is versioned.
The only solutions available to you are: don't do this, update all fields in a single query, add a retry mechanism, or don't fire updates of different fields in parallel.
Related
We're trying to stream data to postgres 11, using the following query:
INSERT INTO identifier_to_item
values (:id, :identifier_value, :identifier_type, :identifier_manufacturer, :delivery_timestamp_utc, :item)
ON CONFLICT (identifier_value, manufacturer, type) DO UPDATE
SET item = :item, delivery_timestamp_utc = :delivery_timestamp_utc
WHERE identifier_to_item.delivery_timestamp_utc < :delivery_timestamp_utc
Basically "insert record in the table, if it already exists -> optionally override some fields based on the data already stored in the database".
We would like to hook this query to message queue and run it in high concurrent environment within several instances. It is possible that the same row will be accessed from different connections using this query. For us it's critical that only items with highest delivery timestamp will eventually make it to the table
According to documentation:
ON CONFLICT DO UPDATE guarantees an atomic INSERT or UPDATE outcome; provided there is no independent error, one of those two outcomes is guaranteed, even under high concurrency.
but is also accessing the fields in UPDATE WHERE part atomic and thread safe? Is this statement using some kind of pessimistic row/table locking?
PostgreSQL is not using threads on the server side.
PostgreSQL does not implement pessimistic/optimistic row level locking : it is the left to the application to decide to implement pessimistic or optimistic locking.
PostgreSQL does not escalate row level locks to table lock.
From the documenation:
ON CONFLICT DO UPDATE guarantees an atomic INSERT or UPDATE outcome; provided there is no independent error, one of those two outcomes is guaranteed, even under high concurrency.
It does not mention what happens on ON CONFLICT DO NOTHING.
As a test, I did an INSERT ... ON CONFLICT DO NOTHING with 10 threads thousands of times and did not see any errors.
I am working on my first project using an ORM (currently using Entiry Framework, although that's not set in stone) and am unsure what is the best practice when I need to add or subtract a given amount from a database field, when I am not interested in the new value and I know the field in question is frequently updated, so concurrency conflicts are a concern.
For example, in a retail system where I am recording a sale, as well as creating records for the sale and each of the line items, I need to update the quantity on hand of the items sold. It seems unnecessary to query the database for the existing quantity on hand, just so that I can populate the entity model before saving the updated quantity - and in the time taken for that round-trip, there is a chance that the same item will have been sold through another checkout or the website, so I either have a conflict or (if using a transaction) the other sale is blocked until I complete my update.
In SQL I would simply write
UPDATE Item SET Quantity=Quantity-1 WHERE ...
It seems the best option in this case is to fall back to ADO.NET + stored procedure for this one update, but is there a better way within Entity Framework?
You're right. ORMs are specialized in tracking changes to each individual entity, and applying those changes to the DB individually. Some ORMs support sending thechanges in btaches, but, even so, to modify all the records in a table implies reading them all, modifyng each one, and sending the changes back to the DB as individual UPDATEs.
And that's a big no-no! as you have corectly thought. It implies loading all the rows into memory, modifying all of them, track their changes, and send them back to the DB as indivudal updates, which is way more expensive that running a single UPDATE on the DB.
As to the final question, to run a SQL command you don't need to use traditional ADO.NET. You can run SQL queries directly from an EF DbContext using ExecuteSqlCommand like this:
MyDbContext.Database.ExecuteSqlCommand('Your SQL here!!');
I recommend you to look at the MSDN docs for Database class, to learn all the things that can be done, for example managing transactions, executing commands that return no data (as the previous example) or executing queries that return data, and even mapping them to entities (classes) in your model: SqlQuery().
So you can run SQL commands and queries without using a different technology.
I have multiple processes accessing the same database table. The table holds "TakenBy" column that is supposed to hold the ID of the taker process.
Entity Framework is my data access layer.
My question would be how can I use my DataContext object so I can retrieve rows from the above table, and have the "TakenBy" column updated at the same time.
This would allow me to overcome race-condition with the other processes, who also try to get the same records.
EF will not handle that for you. You must either use stored procedure or you must perform update once you load the record through your application and handle concurrency (either by optimistic way which means to use times tamp or row version column or by pessimistic way which means manual SQL query).
I am working on EJB 3.0 where entity beans are managed by JPA.My question is if two or more user will try to insert in same table using same form same time, how JPA will handle that situation.
It will manage it just fine, by using database transactions. If two threads try to create the same row (i.e. with the same primary key) at the same time, one will succeed, and the other will get an exception from the database, which will cause a rollback of its transaction. That means that all the other inserts, updates and deletes made in the same transaction will also be rollbacked, or cancelled if you prefer, leaving the database in a coherent state. That's the A in ACID.
If two threads insert two different rows at the same time in the same table, then the database will handle that just fine, and both rows will be inserted.
I have a PostgreSQL 9.2.2 database that serves orders to my ERP system. The database tables contain boolean columns indicating if a customer is added or not among other records. The code I use extracts the rows from the database and sends them to our ERP system one at a time (single threaded). My code works perfectly in this regard; however over the past year our volume has grown enough to require a multi-threaded solution.
I don't think the MVCC modes will work for me because the added_customer column is only updated once a customer has been successfully added. The default MVCC modes could cause the same row to be worked on at the same time resulting in duplicate web service calls. What I want to avoid is duplicate web service calls to our ERP system as they can be rather heavy, although admittedly I am not an expert on MVCC nor the other modes that PostgreSQL provides.
My question is: How can I be sure that a row, or series of rows returned in one select statement are excluded from other queries to the database in separate threads?
You will need to record the fact that the rows are being processed somehow. You will also need to deal with concurrent attempts to mark them as being processed and handle failures with sending them to your ERP system.
You may find SELECT ... FOR UPDATE useful to get a set of rows and simultaneously lock them against updates. One approach might be for each thread to select a target row, try to add it's ID to a "processing" table, then remove it in the same transaction you update added_customer.
If a thread fetches no candidate rows, or fails to insert then it just needs to sleep briefly and try again. If anything goes badly wrong then you should have rows left in the "processing" table that you can inspect/correct.
Of course the other option is to just grab a set of candidate rows and spawn a separate process/thread for each that communicates with the ERP. That keeps the database fetching single-threaded while allowing multiple channels to the ERP.
You can add a column user_is_proccesed to the table. It can hold the process id for the back end, that updates the record.
Then use a small serializable transaction to set the user_is_proccesed to "lock row for proccesing".
Something like:
START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
UPDATE user_table
SET user_is_proccesed = pg_backend_pid()
WHERE <some condition>
AND user_is_proccesed IS NULL; -- no one is proccesing it now
COMMIT;
The key thing here - with SERIALIZABLE only one transaction can successfully update the record (all other concurrent SERIALIZABLE updates will fail with ERROR: could not serialize access due to concurrent update).