Race condition in Amplify DataStore - Flutter

When updating an object, how can I handle a race condition?
final object = (await Amplify.DataStore.query(Object.classType, where: Object.ID.eq('aa'))).first;
await Amplify.DataStore.save(object.copyWith(count: object.count + 1));
user A : execute first statement
user B : execute first statement
user A : execute second statement
user B : execute second statement
=> the count ends up incremented only once instead of twice

Apparently the way to resolve this is to either:
1 - Use conflict resolution, available from DataStore 0.5.0.
One of your users (whichever is slowest) gets sent back the rejected version plus the latest version from the server; with both objects in hand, you resolve the discrepancy locally and retry the update.
2 - Use a custom resolver and check ADD expressions.
You save versions locally, and your VTL is configured to provide additive values to the pipeline instead of set values.
There is also a nice article that helps in understanding this approach.
Neither really worked for me: one of my devices can be offline for days at a time, and I need multiple updates to an object to be applied in order, not just the latest version of the local object.
What really confuses me is that there is no built-in way to simply increment values and keep every incremented update in the outbox (instead of only the latest version of the object), then apply them in order when a connection is made.
I basically wrote to a separate table to do exactly that, which solved my problem, but of course more tables and rows mean more reads and writes, and therefore more expense.
Have a look at my attempts if you want the full code, and let me know.
Beyond that, I guess we hope for an update to Amplify that includes increment logic to update values atomically out of the box and avoid these common race conditions.

Related

Change a particular node in a branch that each user has in Firebase

How can I update a particular node that all users have in their own branch?
I am trying to reset each user's points to 15 after a certain number of days with the push of a button. I know how to update one user's node, but I can't seem to figure out how to create a function so that when I push a button it resets all users that have that branch/variable back to 15.
Otherwise, I would have to do it individually, which is obviously not efficient.
This is my code to reset one student's node with a click of a button:
Database.database().reference().root.child("students").child(userId).updateChildValues(["points": self.newPoints])
How do I write a method that resets all students who have the "points" key with the press of a button?
Any help would be appreciated.
Thanks
You can't use wildcards in place of userId to apply the update to every node. You would probably need to fetch the entire students node and update the users one by one in a for loop, which would be very inefficient.
You probably want to use a different data structure in your database that is more appropriate for your situation. That said, if you still want to stick with your current structure, you can try having your client update a single boolean flag in the database, and then program a trigger function in Node.js so that the update is handled on the server side directly. Check out: Firebase Cloud Functions

What will happen if two processes modify data in two transactions at the same time and there is a unique constraint on the table?

I am thinking about a race condition in a production system I am working on. Database is PostgreSQL. Application is written in Java, but this is not relevant.
There is a table called "versions", which contains columns "entity_ID" and "version" (and some other fields). This table contains versions of a certain entity.
There is an application where user can modify those entities.
Every modification of an entity adds a new version to the table "versions" (via a trigger). The trigger finds the last version of the same entity in "versions" and inserts a new row with the same entity_ID but version = (last version + 1).
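In simplified form, the trigger does roughly the following (an illustrative sketch; the names are simplified and the real trigger touches more columns):

CREATE OR REPLACE FUNCTION add_version() RETURNS trigger AS $$
BEGIN
    -- Two concurrent, uncommitted transactions both see only the committed rows,
    -- so both can compute the same "last version + 1".
    INSERT INTO versions (entity_ID, version)
    SELECT NEW.entity_ID, COALESCE(MAX(version), 0) + 1
    FROM   versions
    WHERE  entity_ID = NEW.entity_ID;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;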
There is a nightly job that runs in PostgreSQL at 4:00 every day; it also changes those entities and therefore writes to the table "versions". The procedure was designed to finish its work by the morning (before users of the application start to use it), but unfortunately it runs into the day. Because this procedure is run in a function, it is one big transaction. Therefore the changes done by it are not visible to the application.
The nightly job uses the following workflow:
Set "failed_counter" = 0
Iterate over entities that need to be modified
Do modifications to the entity inside a BEGIN .. EXCEPTION .. END block
If there is an EXCEPTION, increase the "failed_counter". Log the exception and the failed entity to a log table.
If "failed_counter" > 10, cancel work.
End work
This has caused the following race condition to happen a few times (let's assume that X is the last version of entity A):
Nightly job starts
Nightly job modifies entity A, creating version X+1
The application is also used to modify entity A, also creating version X+1 (because the nightly job's transaction has not COMMITted, version X+1 is not visible to the application)
Nightly job ends, causing COMMIT
There are now two versions with version number X+1, which causes the application to break.
I thought that I could just solve the problem by using a UNIQUE CONSTRAINT over the fields (entity_ID, version). I thought that it would cause the application to receive an error (due to violating the UNIQUE CONSTRAINT) at race condition step 3. But I am not sure how the unique constraint works in this situation. In race condition step 3, when the application adds a version, does the database check the UNIQUE CONSTRAINT? I suppose not, since the transaction of the nightly process has not been completed. If I am correct and the UNIQUE CONSTRAINT is checked only at race condition step 4, when the COMMIT is made, then this causes the whole nightly procedure to fail, which is not the desired result.
So, the question is the following.
When is the UNIQUE CONSTRAINT checked: at race condition step 3 or at step 4?
If the answer is "step 4", how could I change the design of the system to avoid the problems mentioned above?
By default, unique constraints in PostgreSQL are checked at the end of each statement. It's easy to test the behavior using psql.
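For example, a two-session test along these lines (table and constraint names are illustrative; say X = 10, so both sessions try to write version 11) shows how a non-deferrable unique constraint behaves:

-- Session 1 (the nightly job), not yet committed:
BEGIN;
INSERT INTO versions (entity_ID, version) VALUES (1, 11);

-- Session 2 (the application):
BEGIN;
INSERT INTO versions (entity_ID, version) VALUES (1, 11);
-- ...blocks here, waiting for session 1's transaction to finish...

-- Session 1:
COMMIT;
-- Session 2 now fails immediately:
-- ERROR:  duplicate key value violates unique constraint "versions_entity_id_version_key"

So with an ordinary (non-deferred) unique constraint, the application's insert at race condition step 3 waits on the nightly job's uncommitted row and receives the unique violation when the nightly job commits at step 4; the error lands in the application, not in the nightly procedure, although the application is blocked until that commit happens.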
Some big, red flags . . .
Because this procedure is run in a function, it is one big transaction.
It's not one big transaction because you're running a function. It's one big transaction because you haven't run the function several times over smaller subsets of the data. Whether you can run the function over subsets is application-dependent.
Iterate over entities that need to be modified
Rough rule of thumb for SQL databases: iteration is always a mistake.
SQL is a set-oriented language. Dealing with sets is usually faster than iteration, and often by several orders of magnitude.
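For illustration, a per-entity loop can often be collapsed into a single set-based statement; the table and columns below are made up and won't match the real schema:

-- One statement over the whole set instead of a loop over entities:
UPDATE entities
SET    status = 'processed',
       processed_at = now()
WHERE  status = 'pending';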
If "failed_counter" > 10, cancel work.
This looks suspicious. Why are nine failures ok? Why are any failures ok?
I thought that I could just solve the problem by using a UNIQUE CONSTRAINT over the fields (entity_ID, version).
That you don't already have a unique constraint on those two columns is a big, waving red flag. Fix this first.
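Adding it is a single statement (the constraint name is illustrative):

ALTER TABLE versions
  ADD CONSTRAINT versions_entity_id_version_key UNIQUE (entity_ID, version);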
The fact that an application should apparently be waiting for a batch job to finish, but isn't waiting, might or might not be a system design issue. (It smells like a system design issue.)
There is a nightly job that runs in PostgreSQL at 4:00 every day ...
Did you think of starting at 3:00?
Test this, but not on your production server.
Drop the trigger.
Add a column of type timestamp with time zone.
Set that column's default value. Most applications will use current_timestamp, but you might want clock_timestamp() instead (see the PostgreSQL documentation).
Add a unique constraint on {entity_id, new timestamp column}.
Eliminating the trigger might speed things up enough for you.
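A rough sketch of those steps (the trigger, table, column, and constraint names are guesses; adapt them to the real schema):

DROP TRIGGER create_version ON entities;   -- whatever the versioning trigger is actually called

ALTER TABLE versions
  ADD COLUMN created_at timestamp with time zone NOT NULL DEFAULT clock_timestamp();

ALTER TABLE versions
  ADD CONSTRAINT versions_entity_id_created_at_key UNIQUE (entity_ID, created_at);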

Salesforce.com: UNABLE_TO_LOCK_ROW, unable to obtain exclusive access to this record

In our production org, we have a system for uploading sales data into Salesforce using the command-line Data Loader. The data is loaded into a temporary object Temp. We have created a formula field (which combines three fields) to form a unique key. The purpose of the object is to save users the effort of creating the key manually.
There is an after insert trigger on Temp that calls an asynchronous method, which upserts the data to another object SalesData using the key. The insert/update trigger on SalesData checks the various fields and creates/updates the records in another object SalesRecords. After the insert/update is complete, all the records in the temporary object Temp are deleted. The SalesRecords object does not have any trigger on it and is a child of another object Sales. The Sales object has some roll-up summary fields that sum up fields from the SalesRecords object.
Lately, we have been getting the error below for some of the records that are updated.
UNABLE_TO_LOCK_ROW, unable to obtain exclusive access to this record
Please provide some pointers to resolve the issue.
This could be caused either by conflicting DML operations across the various trigger executions or by some recursive trigger execution. I would assume that the async executions cause multiple subsequent updates on the same records, probably on the SalesRecords object. I would recommend trying to simplify the process to avoid too many related trigger executions.
I'm a little surprised you were able to get this to work in the first place. After triggers should be used with caution, and only when before triggers can't be. One reason is that in before triggers you don't need to perform additional DML to make changes to records: you simply change the values and the insert/update is committed automatically. But recursive trigger firing is the main problem with after triggers.
One quick way to avoid trigger re-entry is to use a public static Boolean in a class that states whether you're already in this trigger from the same thread of execution.
Something like:
public static Boolean isExecuting = false;
Once set to true, any trigger code that is a re-fire can be avoided with:
if (Class.isExecuting == false)
{
    Class.isExecuting = true;
    // Perform trigger logic
    // ...
}
Additionally, since the order of trigger execution cannot be determined up front, you might be seeing an issue with deletions or other data changes that depend on other parts of your flow to finish first.
Also, without knowing the details of your custom unique 3-part key, I'd wonder if there's a problem there too such as whether it's truly unique or not. Case insensitivity is a common mistake and it's the reason there are 15 AND 18 character Ids in Salesforce. For example, when people export to Excel (a case-insensitive environment) and do VLOOKUPs, they would occasionally find the wrong record. The 3-digit calculated suffix was added to disambiguate for case-insensitive environments.
Googling for this same error led me to this post:
http://boards.developerforce.com/t5/General-Development/Unable-to-obtain-exclusive-access-to-this-record/td-p/345319
Which points out some common causes for this to happen:
Sharing Rules are being calculated.
A picklist value has been replaced and replacement is in progress.
A custom index creation/removal is in progress.
Most unlikely one - someone else is already editing the same record that you are trying to access at the same time.
Posting here in case somebody else needs it.
I got this error multiple times today. It turned out one of our vendors was updating their installed package in the same org at the time. All kinds of other things were going wrong as well: some object validation exceptions were being thrown on DML, without any error message content.
Resolution
The error is shown when a field update, such as a roll-up summary field recalculation, is attempted on a parent object that already had a field update causing the roll-up summary field to recalculate. It can also occur if a trigger or another Apex job is running on the master object and is also attempting an update.
You can either reduce the batch size and try again or create separate smaller files to be imported if this issue occurs.

How to safely increment a counter in Entity Framework

Let's say I have a table that tracks the number of times a file was downloaded, and I expose that table to my code via EF. When the file is downloaded I want to update the count by one. At first, I wrote something like this:
var fileRecord = (from r in context.Files where r.FileId == 3 select r).Single();
fileRecord.Count++;
context.SaveChanges();
But then when I examined the actual SQL that is generated by these statements I noticed that the incrementing isn't happening on the DB side but instead in my memory. So my program reads the value of the counter in the database (say 2003), performs the calculation (new value is 2004) and then explicitly updates the row with the new Count value of 2004. Clearly this isn't safe from a concurrency perspective.
I was hoping the query would end up looking instead like:
UPDATE Files SET Count = Count + 1 WHERE FileId=3
Can anyone suggest how I might accomplish this? I'd prefer not to lock the row before the read and then unlock it after the update, because I'm afraid of blocking reads by other users (unless there is some way to lock a row only for writes without blocking reads).
I also looked at issuing an Entity SQL command, but it appears Entity SQL doesn't support updates.
Thanks
You're certainly welcome to call a stored procedure with EF. Write a sproc with the SQL you show then create a function import in your EF model mapped to said sproc.
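For example, a sproc along these lines (SQL Server syntax; the procedure name is made up):

CREATE PROCEDURE dbo.IncrementDownloadCount
    @FileId int
AS
BEGIN
    SET NOCOUNT ON;
    -- The increment happens inside the database, so concurrent callers cannot lose updates.
    UPDATE Files
    SET    [Count] = [Count] + 1
    WHERE  FileId = @FileId;
END

Once it is imported as a function, you call it from EF instead of loading the entity, incrementing the value in memory, and saving it back.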
You will need to do some locking in order to get this to work. But you can minimise the amount of locking.
When you read the count and you want to update it, you must lock it; this can be done by placing the read and the update inside a transaction scope. This will protect you from race conditions.
When you just want to read the value, you can do so with a transaction isolation level of ReadUncommitted; that read will then not be blocked by the read/write lock above.
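Expressed directly in SQL Server terms, the same idea looks roughly like this (a sketch, not the TransactionScope code itself):

BEGIN TRANSACTION;

DECLARE @current int;

-- UPDLOCK/HOLDLOCK stop a second writer from reading the same value before we update it.
SELECT @current = [Count]
FROM   Files WITH (UPDLOCK, HOLDLOCK)
WHERE  FileId = 3;

-- (in the real application this calculation happens in your code, inside the transaction scope)
UPDATE Files
SET    [Count] = @current + 1
WHERE  FileId = 3;

COMMIT;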

How do you manage concurrent access to forms?

We've got a set of forms in our web application that are managed by multiple staff members. The forms are common to all staff members. Right now, we've implemented a locking mechanism, but the issue is that there's no reliable way of knowing when a user has logged out of the system so that the form can be unlocked. I was wondering if there is a better way to manage concurrent users editing the same data.
You can use optimistic concurrency, which is how the .NET data libraries are designed. Effectively, you assume that usually no one will edit a row concurrently. When it does occur, you can either throw away the changes made, or build some nicer retry logic for when two users edit the same row.
If you keep a copy of what was in the row when you started editing it and then write your update as:
UPDATE Table SET column = changedvalue
WHERE column1 = column1prev
  AND column2 = column2prev ...
If this updates zero rows, then you know that the row changed during the edit and you can then deal with it, or simply throw an error and tell the user to try again.
You could also create some retry logic: re-read the row from the database and check whether the change made by your user and the change made in the database can safely be combined; if so, do it automatically. Or you could present a choice to the user as to whether they still wish to make their change, given the values now in the database.
Do something similar to what is done in many version control systems. Allow anyone to edit the data. When the user submits the form, the database is checked for changes. If the record has not been changed prior to this submission, allow it as usual. If both changes are the same, ignore the incoming (now redundant) change.
If the second change is different from the first, the record is now in conflict. The user is presented with a new form, which indicates which fields were changed by the conflicting update. It is then the user's responsibility to resolve the conflict (by updating both sets of changes), or to allow the existing update to stand.
As Spence suggested, what you need is optimistic concurrency. A standard website that does no accounting for whether the data has changed uses what I call "last write wins". Simply put, whichever connection saves to the database last, that version of the data is the one that sticks. In optimistic concurrency, you use a "first write wins" logic such that if two connections try to save the same row at the same time, the first one that commits wins and the second is rejected.
There are two pieces to this mechanism:
The rules by which you fail the second commit
How the system or the user handles the rejected commit.
Determining whether to reject the commit
Two approaches:
Comparison column that changes each time a commit happens
Compare the data with its committed version in the database.
The first one entails using something like SQL Server's rowversion data type, which is guaranteed to change each time the row changes. The upside is that it makes it simple to roll your own logic to determine whether something has changed. When you get the data, you pull the rowversion column's value, and when you commit, you compare that value with what is currently in the database. If they are different, the data has changed since you last retrieved it and you should reject the commit; otherwise, proceed to save the data.
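In SQL terms, that check is roughly the following (a sketch; the table, columns, and parameters are illustrative, and @originalRowVersion is the rowversion value read when the data was fetched):

UPDATE Documents
SET    Title = @newTitle
WHERE  DocumentId = @documentId
  AND  RowVer = @originalRowVersion;
-- 0 rows affected means the row changed since it was read, so reject the commit.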
The second one entails comparing the columns you pulled with their existing committed values in the database. As Spence suggested, if you attempt the update and no rows were updated, then clearly one of the criteria failed. This logic can get tricky when some of the values are null. Many object relational mappers and even .NET's DataTable and DataAdapter technology can help you handle this.
Handling the rejected commit
If you do not leave it up to the user, then the form would show some message stating that the data has changed since they last edited it, and you would simply re-retrieve the data, overwriting their changes. As you can imagine, users aren't particularly fond of this solution, especially in a high-volume system where it might happen frequently.
A more sophisticated (and also more complicated) approach is to show the user what has changed and allow them to choose which items to try to re-commit. Behind the scenes, you would retrieve the data again, overwrite the values picked by the user with their entries, and try to commit again. In a high-volume system this will still be problematic, because by the time the user has tried to re-commit, the data may have changed yet again.
The checkout concept is effectively pessimistic concurrency, where users "lock" rows. As you have discovered, it is difficult to implement in a stateless environment. Users are notorious for simply closing their browser while they have something checked out, or for using the Back button to return to a set that was checked out and trying to recommit it. IMO, it is more trouble than it is worth to go this route in a web-based solution. Assuming you record the user name that last changed a given row, with optimistic concurrency you can tell the user whose changes are rejected who saved the data before them.
I have seen this done two ways. The first is to have a "checked out" column in your database table associated with that data. Your service would have to look for this flag to see if it is being edited. You can have this expire after a time threshold is met (with a trigger) if the user doesn't commit changes. The second way is having a dedicated "checked out" table that stores id's and object names (probably the table name). It would work the same way and you would have less lookup time, theoretically. I see concurrency issues using the second method, however.
Why do you need to look for session timeout? Just synchronize access to your data (forms or whatever) and that's it.
UPDATE: If you mean you have "long transactions", where a form is locked as soon as the user opens the editor (or whatever) and remains locked until the user commits changes, then:
either use optimistic locking, implemented by versioning the forms data table;
or, since optimistic locking can cause loss of work (a user who has been away for a long time may try to commit their changes and discover that someone else has already updated the form), implement explicit "locking" of the form, where a user "locks" the form as soon as they start working on it. Another user will notice that the form is "locked" and can either communicate with the lock owner to resolve the issue, or "relock" the form for themselves, losing all of the first user's updates in the process.
We put in a very simple optimistic locking scheme that works like this:
every table has a last_update_date field in it
when the form is created, the last_update_date for the record is stored in a hidden input field
when the form is POSTed, the server checks the last_update_date in the database against the date in the hidden input field
If they match, then no one else has changed the record since the form was created, so the system updates the data.
If they don't match, then someone else has changed the record since the form was created. The system sends the user back to the form edit page and tells the user that someone else edited the record and they must reapply their changes.
It is very simple and works well enough.
You can use "timestamp" column on your table. Refer: What is the mysterious 'timestamp' datatype in Sybase?
I understand that you want to avoid overwriting existing data with subsequent updates.
If so, when the user opens a screen, you have to send the last "timestamp" column value to the client.
Just before the update, after the data has been changed, you should compare the "timestamp" values (yours and the database's) to make sure no one has changed the data while the user was editing.
If it has changed, raise an error and the user has to start over. If it has not, update the data. Timestamp columns are updated automatically.
The simplest method is to format your update statement to include the datetime when the record was last updated. For example:
UPDATE my_table SET my_column = new_val WHERE last_updated = <datetime when record was pulled from the db>
This way the update only succeeds if no one else has changed the record since the last read.
You can message the user about the conflict by checking whether the update succeeded via a SELECT after the UPDATE.