I am using EF code first. Recently I had to replace the following code:
User user = userRepository.GetByEmail("some@email.com");
if (user == null)
{
user = new User { Email = email, CreatedAt = DateTime.Now };
userRepository.Add(user);
unitOfWork.Commit();
}
with
Context.ExecuteSqlCommand("IF NOT EXISTS(SELECT 1 FROM Users WHERE Email = '{0}')
INSERT INTO Users(Email, CreatedAt)
VALUES ('email', GETDATE())");
The reason behind this is that it took EF a very long time to run the first piece of code when adding thousands of rows. Changing it to an ExecuteSqlCommand cut the time to handle that many rows dramatically.
The problem I am seeing now (only occurred twice so far) is the following message from the database: Transaction (Process ID 52) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
How would I go about resolving this? Most of my data access is done through EF, with a few exceptions like the one above. I have never seen a deadlock in my logs before, so I assume this has something to do with the query.
My questions are:
1) Is there a way to write the query using NOLOCK? How would that query look?
2) Is there a way to tell EF to use NOLOCK for certain queries?
What you actually want is to take the lock earlier, not to prevent locking. Locking is necessary to ensure that, between the two statements in your command, some other process hasn't come along and inserted the same user. (Locking is always necessary when inserting data, because the physical storage is being modified.)
Assuming that this is actually the command that is causing the deadlock, the following should resolve it, because it asks for an update lock up front and holds it for the duration of the statement, so two concurrent executions cannot both pass the existence check:
Context.ExecuteSqlCommand(@"IF NOT EXISTS (SELECT 1 FROM Users WITH (UPDLOCK, HOLDLOCK) WHERE Email = {0})
    INSERT INTO Users (Email, CreatedAt)
    VALUES ({0}, GETDATE())", email);
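Independently of the locking, a unique index on Email is worth having as a second line of defense, so a duplicate can never slip in even if some code path skips the existence check (the index name here is arbitrary):

CREATE UNIQUE INDEX UX_Users_Email ON Users (Email);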
Related
Well, sorry if you find this question weird, but let me ask it anyway.
Imagine the following situation. There are two clients, A and B. Client A decides to create a profile, and the transaction takes, say, 2 minutes to complete.
After 1 minute, client B decides to create a profile with THE SAME username and password (but the first transaction is still in progress, and the unique constraint cannot catch it, because no user with this username exists quite yet).
So it will eventually end up with a UNIQUE CONSTRAINT exception, and we'll need to roll back.
The question is: how do we avoid this situation?
I've heard about LOCK in PostgreSQL (which allows locking an EXISTING row so that others can't change or read it), but I haven't found anything similar for this sort of case.
Is there any feature that provides some sort of functionality to block potentially conflicting transactions?
Start the transaction like this:
BEGIN;
SET lock_timeout = 1;  -- milliseconds: fail immediately instead of waiting
INSERT INTO users (username, password) VALUES (...);
RESET lock_timeout;    -- restore the default for the rest of the transaction
/* the rest of the transaction */
COMMIT;
The second transaction that tries to create the same user won't block, but will fail right away and can be rolled back.
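Note that this hinges on a unique constraint or index on username: the second INSERT only waits (and then hits the lock timeout) because Postgres must wait for the first transaction to commit or roll back before it can decide whether the constraint is violated. If the table doesn't have one yet:

ALTER TABLE users ADD CONSTRAINT users_username_key UNIQUE (username);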
EDIT: in case someone else stumbles across this, Laurenz Albe's post is a better solution. Use that instead.
The question is: how do we avoid this situation?
A simple way would be to split the commit into two parts, probably using a savepoint:
// Rewritten as runnable Go using database/sql against PostgreSQL.
// doTwoMinuteWork is a hypothetical stand-in for the long-running part.
func createUser(db *sql.DB, user User) error {
	// Insert the row and commit immediately: the username is now reserved.
	if _, err := db.Exec(`INSERT INTO users VALUES ($1, $2)`, user.Username, user.HashedPassword); err != nil {
		return err
	}
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback() // a no-op once Commit has succeeded
	// Delete the row, set a savepoint, then insert the row again.
	if _, err := tx.Exec(`DELETE FROM users WHERE username = $1`, user.Username); err != nil {
		return err
	}
	if _, err := tx.Exec(`SAVEPOINT sp`); err != nil {
		return err
	}
	if _, err := tx.Exec(`INSERT INTO users VALUES ($1, $2)`, user.Username, user.HashedPassword); err != nil {
		return err
	}
	if err := doTwoMinuteWork(tx); err != nil { // your code that takes two minutes
		// Undo only the re-insert; the commit below then applies the DELETE,
		// releasing the username reserved in the first step.
		if _, spErr := tx.Exec(`ROLLBACK TO SAVEPOINT sp`); spErr != nil {
			return spErr
		}
	}
	return tx.Commit()
}
You first insert the row, immediately committing the change. Now no other new user can take that username.
Then, start a transaction and delete the user. Create a savepoint, then create the user again. Now, instead of rolling back the entire transaction if something fails, roll back to the savepoint and commit: the user was created and then deleted, so overall it is effectively a no-op. If everything works, the delete followed by the re-insert cancel out, and the committed row stays in place.
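For reference, here is the same dance in plain SQL (with placeholder values):

INSERT INTO users VALUES ('bob', 'hash');   -- autocommit: committed immediately

BEGIN;
DELETE FROM users WHERE username = 'bob';
SAVEPOINT sp;
INSERT INTO users VALUES ('bob', 'hash');
-- ... the work that takes two minutes ...
-- on failure: ROLLBACK TO SAVEPOINT sp;    -- undoes only the re-insert
COMMIT;  -- success keeps the row; after rolling back to sp, this commits the DELETE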
I am unsure how to design security policies for the following system, which uses counters, in Postgres/Supabase. My database includes two tables:
Users:
uuid | name | follower_counter
-----|------|-----------------
xyz  | tobi | 1

Following-Relationship:
follower | following
---------|----------
uuid_1   | uuid_2
Once a user follows a different user, I would like to use a Postgres function/transaction to
1) insert a new following-follower relationship, and
2) update the followed user's counter:
BEGIN;
SELECT create_follower_relationship(:follower_id, :following_id);
SELECT increment_counter_of_followed_person(:following_id);
END;
The constraint should be that the users table (e.g. the name column) can only be altered by the user owning the row. However, the follower_counter should be open to changes from users who start following that user.
What is the best security policy design here? Should I add column-level security, or should I move the counters to a different table?
Do I have to pass parameters to the "block transaction" to ensure that the update and insert functions are called with the needed rights? With which rights should I call the block function?
It might be better to take a different approach to solve this problem. Instead of having a column dedicated to counting the followers, I would recommend actually counting the number of followers when you query the users. Since you already have the Following-Relationship table, we just need to count the rows within that table where following or follower is the user being queried.
When you have a counter, it might be hard to keep the counter accurate. You have to make sure the number gets decremented when someone unfollows. What if someone blocks a user? What if a user was deleted? There could be a lot of situations that could throw off the counter.
If you count the number of followings/followers on the fly, you don't need to worry about those situations at all.
Now, the obvious concern you might have with this approach is performance, but you should not worry too much about it. Postgres is a powerful database that has been battle-tested for decades, and with proper indexes in place it can easily perform these queries on the fly.
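For example, assuming the column names used in the view below, two indexes like these keep those counts cheap:

create index on following_relationship (followed_user_id);
create index on following_relationship (following_user_id);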
The easiest way of doing this in Supabase would be to create a view like the following. Once you create a view, you can query it from your Supabase client just like a typical table!
create or replace view profiles as
select
id,
name,
(select count(*) from following_relationship where followed_user_id = id) as follower_count,
(select count(*) from following_relationship where following_user_id = id) as following_count
from users;
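You can then read the counts like any other column, e.g.:

select name, follower_count, following_count from profiles where id = 'xyz';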
I am creating a data flow task which will be extracting data from a source table and will be updating a destination table as follows:
1) Use the unique id in the source record to find the record you want to update in the destination table.
2) If the ID does not exist in the destination table, check whether the email of the source record exists in the destination table instead.
a) If the email exists, update the destination record through the email. Also update the unique id of that destination record.
b) If the email does not exist, insert a new record to the destination table.
So, with simple words, I am creating a task that will be updating a table on its unique id and if it does not have a match, it will be attempting to update on its email. If it still does not find a match, it will be inserting a new record.
This means that I will have two updates running in parallel, as you can see in the image (the two circled components):
[image: SSIS data flow task with the two UPDATE components circled]
Now, this generates a deadlock issue because of those two updates.
I have tried using WITH (NOLOCK), but this hint is for reading data, not updating it. I have also searched for delay tasks to delay one of the two data pipelines until the other is finished.
Any ideas? Could I maybe design my data flow task differently in order to avoid having multiple parallel updates in the first place?
Any help will be greatly appreciated.
With these types of flows I always work with a work table (destination table id, work type ('U' or 'I'), ...). In a first step I fill the table with the work that needs to be done; then I apply the work, as sketched below.
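For example, a minimal T-SQL sketch of that pattern; every table and column name here is illustrative:

-- Step 1: record the work to do.
INSERT INTO dbo.WorkTable (SourceId, DestId, WorkType)
SELECT s.SourceId,
       COALESCE(d_id.DestId, d_email.DestId),
       CASE WHEN COALESCE(d_id.DestId, d_email.DestId) IS NULL THEN 'I' ELSE 'U' END
FROM dbo.SourceTable s
LEFT JOIN dbo.DestTable d_id ON d_id.UniqueId = s.UniqueId
LEFT JOIN dbo.DestTable d_email ON d_email.Email = s.Email;

-- Step 2: apply the work serially: one set-based UPDATE, then one INSERT,
-- so the two paths never run in parallel and cannot deadlock each other.
UPDATE d
SET    d.UniqueId = s.UniqueId, d.Email = s.Email
FROM   dbo.DestTable d
JOIN   dbo.WorkTable w ON w.DestId = d.DestId
JOIN   dbo.SourceTable s ON s.SourceId = w.SourceId
WHERE  w.WorkType = 'U';

INSERT INTO dbo.DestTable (UniqueId, Email)
SELECT s.UniqueId, s.Email
FROM   dbo.WorkTable w
JOIN   dbo.SourceTable s ON s.SourceId = w.SourceId
WHERE  w.WorkType = 'I';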
I am referencing the 2-step newsletter example at http://agiletoolkit.org/codepad/newsletter. I modified the example into a 4-step process. The following page class is step 1, and it works to insert a new record and get the new record's id. The problem is that I don't want to insert this record into the database until the final step. I am not sure how to retrieve this id without using the save() function. Any ideas would be helpful.
class page_Ssp_Step1 extends Page {
    function init(){
        parent::init();
        $p = $this;
        $m = $p->add('Model_Publishers');
        $form = $p->add('Form');
        $form->setModel($m);
        $form->addSubmit();
        if($form->isSubmitted()){
            $m->save();                               // inserts new record into db
            $new_id = $m->get('id');                  // gets id of new record
            $this->api->memorize('new_id', $new_id);  // carries id across pages
            $this->js()->atk4_load($this->api->url('./Step2'))->execute();
        }
    }
}
There are several ways you could do this, using atk4 functionality, MySQL transactions, or the design of your application itself.
1) Manage the id column yourself
I assume you are using an auto increment column in MySQL, so one option would be to not make this column auto increment, but instead use a sequence: select the next value, save it in your memorize statement, and add it to the model as a default value using ->defaultValue($this->api->recall('new_id')).
2) Turn off autocommit and create a transaction around the inserts
I'm from an Oracle background rather than MySQL, but MySQL also allows you to wrap several statements in a transaction that either saves everything or rolls back. This would also be an option if you can create a transaction: you might still call save(), but only a complete transaction populating several tables would be committed, and only once all steps complete.
In atk 4.1, the DBlite/mysql.php class contains some functions for transaction support, but the documentation on agiletoolkit.org is incomplete, and it's unclear how you would change the dbConnect being used: currently you connect to a database in lib/Frontend.php using $this->dbConnect(), and there is no option to pass a parameter.
It looks like you may be able to issue the needed transaction commands by running this at the start of the first page:
$this->api->db->query('SET AUTOCOMMIT=0');
$this->api->db->query('START TRANSACTION');
then do inserts in various pages as needed. Note that everything done will be contained in one transaction, so if the user doesn't complete the process, nothing will be saved.
On the last insert,
$this->api->db->query('COMMIT');
Then, if you want to, turn autocommit back on so each SQL statement is committed again:
$this->api->db->query('SET AUTOCOMMIT=1');
I haven't tried this, but hopefully that helps.
3) use beforeInsert or afterInsert
You can also look at overriding the beforeInsert function on your model, which receives an array of the data. However, if your id is an auto increment column, it won't have a value until the afterInsert function, which receives the inserted id as a parameter.
4) use a status to indicate complete record
Finally, you could use a status column on your record to indicate that it is only at the first stage, and only update it to a complete status when the final stage is completed. You can then have a housekeeping job that runs at intervals to remove records that didn't complete all stages (sketched below). Any grid or CRUD where you display these records would be limited with AddCondition('status','C'), in the model or added in the page, so that incomplete records never get shown.
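A minimal sketch of such a housekeeping job, assuming a status column plus a created_at timestamp (both names are illustrative):

DELETE FROM publisher
WHERE status <> 'C'
  AND created_at < NOW() - INTERVAL 1 DAY;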
5) Manage the transaction as non-SQL
As suggested by Romans, you could store the result of the form processing in session variables instead of writing directly to the database, and then insert everything with SQL once the last step is completed.
We have a table called Contracts. These contract records are created by users on an external site and must be approved or rejected by staff on an internal site. When a contract is rejected, it's simply deleted from the db. When it's accepted, however, a new record is generated called Contract Acceptance which is written to its own table and is derived from data that exists on the contract.
The problem is that two internal staff members may each end up opening the same contract. The first user accepts and a contract acceptance record is generated. Then, with the same contract record still open on the page, the second user accepts the contract again, creating a duplicate acceptance record.
The quick and dirty way to get past this is to retrieve the contract from the db just before it's accepted, check the status, and produce an error message saying that it's already been accepted. This would probably work for most circumstances, but the users could still click the Accept button at the exact same time and sneak by this validation code.
I've also considered a thread lock deep in the data layer that prevents two threads from entering the same region of code at the same time, but the app exists on two load-balanced servers, so the users could be on separate servers which would render this approach useless.
The only method I can think of would have to exist at the database. Conceptually, I would like to somehow lock the stored procedure or table so that it can't be updated twice at the same time, but perhaps I don't understand Oracle well enough here. How do updates work? Are update requests somehow queued up so that they do not occur at the exact same time? If so, I could check the status of the record in the SQL and return a value in an out parameter stating that it has already been accepted. But if update requests aren't queued, then two people could still get into the update SQL at the exact same time.
Looking for good suggestions on how to go about this.
First, if there can only be one Contract Acceptance per Contract, then Contract Acceptance should have the Contract ID as its own primary (or unique) key: that will make duplicates impossible.
Second, to prevent the second user from trying to accept the contract while the first user is accepting it, you can make the acceptance process lock the Contract row:
select ...
from Contract
where contract_id = :the_contract
for update nowait;
insert into Contract_Acceptance ...
The second user's attempt to accept will then fail with an exception:
ORA-00054: resource busy and acquire with nowait specified
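If the acceptance logic runs in PL/SQL, the busy-row error can be caught and turned into a friendly message. A sketch, with illustrative column names:

DECLARE
  row_locked EXCEPTION;
  PRAGMA EXCEPTION_INIT(row_locked, -54);
  v_status contract.status%TYPE;
BEGIN
  SELECT status INTO v_status
  FROM   contract
  WHERE  contract_id = :the_contract
  FOR UPDATE NOWAIT;

  INSERT INTO contract_acceptance (contract_id, accepted_at)
  VALUES (:the_contract, SYSDATE);
EXCEPTION
  WHEN row_locked THEN
    raise_application_error(-20001, 'This contract is being accepted by another user.');
END;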
In general, there are two approaches to the problem.
Option 1: Pessimistic Locking
In this scenario, you're pessimistic so you lock the row in the table when you select it. When a user queries the Contracts table, they'd do something like
SELECT *
FROM contracts
WHERE contract_id = <<some contract ID>>
FOR UPDATE NOWAIT;
Whoever selects the record first will lock it. Whoever selects the record second will get an ORA-00054 error that the application will then catch and let them know that another user has already locked the record. When the first user completes their work, they issue their INSERT into the Contract_Acceptance table and commit their transaction. This releases the lock on the row in the Contracts table.
Option 2: Optimistic Locking
In this scenario, you're being optimistic that the two users won't conflict so you don't lock the record initially. Instead, you select the data you need along with a Last_Updated_Timestamp column that you add to the table if it doesn't already exist. Something like
SELECT <<list of columns>>, Last_Updated_Timestamp
FROM Contracts
WHERE contract_id = <<some contract ID>>
When a user accepts the contract, before doing the INSERT into Contract_Acceptance, they issue an UPDATE on Contracts
UPDATE Contracts
SET last_updated_timestamp = systimestamp
WHERE contract_id = <<some contract ID>>
AND last_updated_timestamp = <<timestamp from the initial SELECT>>;
The first person to do this update will succeed (the statement will update 1 row). The second person to do this will update 0 rows. The application detects the fact that the update didn't modify any rows and tells the second user that someone else has already processed the row.
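In PL/SQL, that application-side check can use SQL%ROWCOUNT; a sketch, using the same bind-variable style as the first answer:

BEGIN
  UPDATE Contracts
  SET    last_updated_timestamp = systimestamp
  WHERE  contract_id = :the_contract
  AND    last_updated_timestamp = :timestamp_from_select;

  IF SQL%ROWCOUNT = 0 THEN
    raise_application_error(-20002, 'Contract already processed by another user.');
  END IF;
END;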
In Either Case
In either case, you probably want to add a UNIQUE constraint to the Contract_Acceptance table. This will ensure that there is only one row in the Contract_Acceptance table for any given Contract_ID.
ALTER TABLE Contract_Acceptance
ADD CONSTRAINT unique_contract_id UNIQUE (Contract_ID)
This is a second line of defense that should never be needed but protects you in case the application doesn't implement its logic correctly.