Akka Persistence: Where does the execution of a command go when it is not simply a state update? - scala

Just for clarification: where does the execution of a command go when the execution is not simply a state update (as in most examples found online)?
For instance, in my case:
The command is FetchLastHistoryChangeSet, which consists of fetching the latest history changeset from an external service based on where we left off last time, in other words the time of the newest change in the previously fetched history changeset.
The event would be HistoryChangeSetFetched(changeSet, time). Following on from the above, the time should be that of the newest change in the newly fetched history changeset (the one fetched by the command currently being handled).
Now, in all the examples that I see, it is always: (i) validating the command, then (ii) persisting the event, and finally (iii) handling the event.
It is in handling the event that I have seen custom code added in addition to the update-state logic, usually after the update-state function. But this custom code is most of the time about sending a message back to the sender or broadcasting it to the event bus.
In my example, it is clear that I need to do quite a few operations before I can actually call persist(HistoryChangeSetFetched(changeSet, time)). Indeed, I need the new changeset and the time of its newest change.
The only way I see to make this possible is to do the fetch while validating the command.
That is:
case FetchLastHistoryChangeSet =>
  val changeTuple = ValidateCommand(FetchLastHistoryChangeSet)
  persist(HistoryChangeSetFetched(changeTuple._1, changeTuple._2)) { historyChangeSetFetched =>
    updateState(historyChangeSetFetched)
  }
Here ValidateCommand(FetchLastHistoryChangeSet) would read the time of the last changeset (its newest change), fetch a new changeset based on it, if one exists, get the time of its newest change, and return the tuple.
My question is: is that how it is supposed to work? Can validating a command be something as complex as that, i.e. actually executing the command?

As it says in the documentation: "validation can mean anything, from simple inspection of a command message's fields up to a conversation with several external services"
So I think what you're trying to do is exactly right. Any interaction with an external service must be done at the command validation stage.
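As a rough, library-free illustration of that ordering (this is not the Akka Persistence API; HistoryService, HistoryEntity and the in-memory journal list are hypothetical stand-ins): the external fetch happens while the command is handled, the resulting event is stored, and only the event handler touches state, so recovery can replay events without calling the external service again.

```java
import java.util.List;
import java.util.Optional;

// Library-free sketch; none of these types are Akka APIs.
final class HistoryChangeSetFetched {
    final List<String> changeSet;
    final long newestChangeTime;

    HistoryChangeSetFetched(List<String> changeSet, long newestChangeTime) {
        this.changeSet = changeSet;
        this.newestChangeTime = newestChangeTime;
    }
}

// Hypothetical external service: returns the changes newer than `since`, if any.
interface HistoryService {
    Optional<HistoryChangeSetFetched> fetchSince(long since);
}

final class HistoryEntity {
    private final List<HistoryChangeSetFetched> journal; // stands in for the event journal
    private final HistoryService service;
    private long lastChangeTime = 0L; // state, derived only from events

    HistoryEntity(List<HistoryChangeSetFetched> journal, HistoryService service) {
        this.journal = journal;
        this.service = service;
        // Recovery: replay stored events; the external service is never called here.
        for (HistoryChangeSetFetched event : journal) {
            updateState(event);
        }
    }

    // Command handler: the side-effecting fetch happens before anything is stored.
    boolean fetchLastHistoryChangeSet() {
        Optional<HistoryChangeSetFetched> fetched = service.fetchSince(lastChangeTime);
        if (!fetched.isPresent()) {
            return false; // nothing new: no event, no state change
        }
        HistoryChangeSetFetched event = fetched.get();
        journal.add(event);  // "persist"
        updateState(event);  // event handler, run after the event is stored
        return true;
    }

    // Event handler: plain state update, safe to run again during recovery.
    private void updateState(HistoryChangeSetFetched event) {
        lastChangeTime = event.newestChangeTime;
    }

    long lastChangeTime() {
        return lastChangeTime;
    }
}
```

In a real persistent actor the fetch would normally be asynchronous, so this only shows where each responsibility sits, not how to wire it up.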

Related

Race condition in amplify datastore

When updating an object, how can I handle a race condition?
final object = (await Amplify.DataStore.query(Object.classType, where: Object.ID.eq('aa'))).first;
await Amplify.DataStore.save(object.copyWith(count: object.count + 1));
user A : execute first statement
user B : execute first statement
user A : execute second statement
user B : execute second statement
Result: count is only incremented by 1, not 2
Apparently the way to resolve this is to either:
1 - Use conflict resolution, available from Datastore 0.5.0.
One of your users (whichever is slowest) gets sent back the rejected version plus the latest version from the server; you get both objects back so you can resolve the discrepancies locally and retry the update.
2 - Use a custom resolver and check ADD expressions.
You save versions locally, and your VTL is configured to provide additive values to the pipeline instead of set values.
This nice article might also help to understand that
Neither really worked for me: one of my devices could be offline for days at a time, and I would need multiple updates to an object to be performed in order, not just the latest version of the local object.
What really confuses me is that there is no immediate way to just increment values and keep all of an object's incremental updates in the outbox, instead of just the latest object, and then apply them in order when the connection is made.
I basically wrote to a separate table to do just that and solve my problem, but of course more tables and rows mean more reads and writes, and therefore more expense.
Have a look at my attempts here; if you want the full code, let me know.
And then, I guess, hope for an update to Amplify that includes increment logic to update values atomically out of the box, to avoid these common race conditions.
Here is some more context
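To make the difference concrete, here is a small in-memory sketch (not the Amplify DataStore API; CounterSketch and its methods are hypothetical). It contrasts the read-modify-write sequence from the question with replaying queued increments in order, which is roughly what the separate-table workaround does.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory illustration only; this is not the Amplify DataStore API.
final class CounterSketch {
    // Read-modify-write: each client saves "the value I read + 1".
    static int readModifyWrite() {
        int stored = 0;
        int readByA = stored;  // user A: first statement
        int readByB = stored;  // user B: first statement
        stored = readByA + 1;  // user A: second statement
        stored = readByB + 1;  // user B: second statement overwrites A's save
        return stored;
    }

    // Each client records deltas in an outbox; they are applied in order on sync.
    static int replayIncrements(int stored, List<Integer> outbox) {
        for (int delta : outbox) {
            stored += delta;
        }
        return stored;
    }

    public static void main(String[] args) {
        List<Integer> outbox = new ArrayList<>();
        outbox.add(1); // queued by user A
        outbox.add(1); // queued by user B, possibly while offline
        System.out.println(readModifyWrite());           // prints 1
        System.out.println(replayIncrements(0, outbox)); // prints 2
    }
}
```

The cost mentioned above shows up here as well: every increment becomes its own stored record instead of one overwritten value.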

Complicated job aggregate

I have a very complicated job process and it's not 100% clear to me where to handle what.
I don't need code; the question is just who is responsible for what.
Given the following:
There is a root directory "C:\server".
Inside it are two directories, "ftp" and "backup".
Imagine the following process:
An external customer sends a file into the ftp directory.
An importer application gets the file, and now the fun starts.
A job aggregate has to be created for this file.
The command "CreateJob(string file)" is fired.
?. The file has to be moved from ftp to backup. Inside the CommandHandler, inside the Aggregate, or on the JobCreated event?
StartJob(Guid jobId) gets called. A third folder, "in-progress", has to be created, and the file has to be copied from backup to in-progress. Who does it?
So it's unclear to me where file system operations have to be handled if the Aggregate cannot work correctly without the correct file system state.
My first approach was to do that inside an Infrastructure layer/lib which listens to the events from the job layer. But it seems not 100% correct?!
And on top of this, what about replaying?
You can't replay things/files that were moved; you would have to somehow simulate the customer sending the file to the ftp folder...
Thankful for answers
The file has to be moved from ftp to backup. Inside the CommandHandler, inside the Aggregate, or on the JobCreated event?
In situations like this, I move the file to the destination folder in the Application service that sends the command to the Aggregate (or that calls a command-like method on the Aggregate, it's the same) before the command is sent to the Aggregate. In this way, if there are problems with the file system (insufficient permissions, no space available, etc.), the command is not sent. These kinds of problems should not reach our Aggregate. We must protect it from the infrastructure. In fact, we should keep the Aggregate isolated from everything else; it must contain only pure business logic that is used to decide which events get generated.
My first approach was to do that inside an Infrastructure layer/lib which listens to the events from the job layer. But it seems not 100% correct?!
Indeed, this seems like over-engineering to me. You must KISS.
StartJob(Guid jobId) gets called. A third folder, "in-progress", has to be created, and the file has to be copied from backup to in-progress. Who does it?
Whoever is calling StartJob could do the copying before StartJob gets called. Again, keep the Aggregate pure. In this case it depends on your framework/domain details.
And on top of this, what about replaying? You can't replay things/files that were moved; you would have to somehow simulate the customer sending the file to the ftp folder...
The events are loaded from the event store and replayed in two situations:
Before every command gets sent to the Aggregate, the Aggregate Repository loads all the events from the event store and then applies every one of them to the Aggregate, probably by calling some applyThisEvent(TheEvent) method on the Aggregate. So these methods should have no side effects (be pure); otherwise you change the outside world again and again at every command execution, and you don't want that.
The read-models (the projections, the query-models) that present data to the user listen to those events and update the database tables that hold the data that the users see. The events are sent to those read-models after they are generated and every time the read-models are recreated. When you invent a new read-model, you must pass it all the events that were previously generated by the aggregates in order to build the correct/complete state in it. If your read-model's event listeners have side effects, what do you think happens when you replay those long-past events? The outside world is modified again and again, and you don't want that! The read-models only interpret the events; they don't generate other events and they don't change the outside world.
There is a special third case, when events reach another type of model, a Saga. A Saga must receive an event only once! This is the case that you thought to use in "My first approach was to do that inside an Infrastructure layer/lib which listens to the events from the job layer." You could do this in your case, but it is not KISS.
I have a very complicated job process and it's not 100% clear to me where to handle what. I don't need code; the question is just who is responsible for what.
The usual answer is that the domain model -- aka the "aggregate" -- makes decisions and saves them. Observing those decisions, some event handler induces side effects.
And on top of this, what about replaying? You can't replay things/files that were moved; you would have to somehow simulate the customer sending the file to the ftp folder...
You replay the events to the aggregate, so that it is restored to the state where it made the last decision. That's a separate concern from replaying the side effects -- which is part of the motivation for handling the side effects elsewhere.
Where possible, of course, you prefer to have the side effects be idempotent, so that a duplicated message doesn't create a problem. But notice that from the point of view of the model, it doesn't actually matter whether the side effect succeeds or not.
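A minimal sketch of that separation, with hypothetical names throughout (JobEvent, JobAggregate, FileMover); it is not any framework's API, and the "file move" is only recorded as a string so the example stays self-contained. Replaying runs apply() only, while the handler that performs side effects remembers which jobs it has already handled, so a redelivered event does nothing.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical names throughout; a sketch of the separation, not a framework API.
final class JobEvent {
    final String jobId;
    final String type; // "JobCreated" or "JobStarted"

    JobEvent(String jobId, String type) {
        this.jobId = jobId;
        this.type = type;
    }
}

final class JobAggregate {
    private boolean created;
    private boolean started;

    // Rebuild state from history: only apply() runs, so nothing outside changes.
    static JobAggregate replay(List<JobEvent> history) {
        JobAggregate job = new JobAggregate();
        for (JobEvent event : history) {
            job.apply(event);
        }
        return job;
    }

    // Decision: a pure business rule that yields an event or rejects the command.
    JobEvent start(String jobId) {
        if (!created || started) {
            throw new IllegalStateException("job cannot be started");
        }
        return new JobEvent(jobId, "JobStarted");
    }

    // State transition: no file system access, no messaging.
    void apply(JobEvent event) {
        if (event.type.equals("JobCreated")) {
            created = true;
        } else if (event.type.equals("JobStarted")) {
            started = true;
        }
    }

    boolean isStarted() {
        return started;
    }
}

// Side effects live here; remembering handled job ids makes redelivery harmless.
final class FileMover {
    private final Set<String> handled = new HashSet<>();
    final List<String> performedMoves = new ArrayList<>();

    void on(JobEvent event) {
        if (event.type.equals("JobStarted") && handled.add(event.jobId)) {
            // A real handler would copy the file here; this sketch only records it.
            performedMoves.add("copy backup -> in-progress for " + event.jobId);
        }
    }
}
```

In a real system the set of handled ids would have to be stored durably; keeping it in memory is an assumption made only to keep the sketch short.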

Salesforce.com: UNABLE_TO_LOCK_ROW, unable to obtain exclusive access to this record

In our production org, we have a system of uploading sales data into Salesforce using command line data loader. This data is loaded into a temporary object Temp. We have created a formula field (which combines three fields) to form a unique key. The purpose of the object is to reduce user efforts for creating the key manually.
There is an after insert trigger on Temp which calls an asynchronous method that upserts the data into another object, SalesData, using the key. The insert/update trigger on SalesData checks various fields and creates or updates records in another object, SalesRecords. After the insert/update is complete, all the records in the temporary object Temp are deleted. The SalesRecords object does not have any trigger on it and is a child of another object, Sales. The Sales object has some roll-up fields which sum fields from the SalesRecords object.
Lately, we are getting the below error for some of the records which are updated.
UNABLE_TO_LOCK_ROW, unable to obtain exclusive access to this record
Please provide some pointers to resolve the issue.
This could be caused either by conflicting DML operations in the various trigger executions or by some recursive trigger execution. I would assume that the async executions cause multiple subsequent updates on the same records, probably on the SalesRecords object. I would recommend trying to simplify the process to avoid too many related trigger executions.
I'm a little surprised you were able to get this to work in the first place. After triggers should be used with caution and only when before triggers can't be. One reason for this is that you don't need to perform additional DML to make changes to records, since in before triggers you simply change the values and the insert/update commit happens automatically. But recursive trigger firing is the main problem with after triggers.
One quick way to avoid trigger re-entry is to use a public static Boolean in a class that states whether you're already in this trigger from the same thread of execution.
Something like:
public static Boolean isExecuting = false;
Once set to true, any trigger code that is a re-fire can be avoided with:
if (Class.isExecuting == false)
{
    Class.isExecuting = true;
    // Perform trigger logic
    // ...
}
Additionally, since the order of trigger execution cannot be determined up front, you might be seeing an issue with deletions or other data changes that depend on other parts of your flow to finish first.
Also, without knowing the details of your custom unique 3-part key, I'd wonder if there's a problem there too, such as whether it's truly unique or not. Case insensitivity is a common mistake, and it's the reason there are 15 AND 18 character Ids in Salesforce. For example, when people export to Excel (a case-insensitive environment) and do VLOOKUPs, they would occasionally find the wrong record. The 3-character calculated suffix was added to disambiguate for case-insensitive environments.
Googling for this same error led me to this post:
http://boards.developerforce.com/t5/General-Development/Unable-to-obtain-exclusive-access-to-this-record/td-p/345319
Which points out some common causes for this to happen:
Sharing Rules are being calculated.
A picklist value has been replaced and replacement is in progress.
A custom index creation/removal is in progress.
Most unlikely one - someone else is already editing the same record that you are trying to access at the same time.
Posting here in case somebody else needs it.
I got this error multiple times today. It turned out one of our vendors was updating their installed package in the same org during that time. All kinds of other things were going wrong too: some object validation exceptions were being thrown on DML operations, without any error message content.
Resolution
The error is shown when a field update, such as a roll-up summary field, is being attempted on a parent object that already had a field update that caused the roll-up summary field to recalculate. This can also occur if a trigger or another Apex job is running on the master object and is also attempting to do an update.
You can either reduce the batch size and try again or create separate smaller files to be imported if this issue occurs.

Play Model save function isn't actually writing to the database

I have a Play model called "JobStatus" and it has just one property, an enum JobState (Running/NotRunning).
The class extends Model and is implemented as a singleton. You call its getInstance() method to get the only record in the underlying table.
I have a job that runs every month, and in the job I toggle the state of the JobStatus object back and forth at various times and call .save().
I've noticed it isn't actually saving.
When the job starts off, its first lines of code are
JobStatus thisJobStatus = JobStatus.getInstance();
...// exit if already running
thisJobStatus.JobState = JobState.Running;
thisJobStatus.save();
Then, when the job is done, it changes the status back to NotRunning and saves again.
The issue is that when I look in the MySQL database, the actual record value is never changed.
This causes a catastrophic failure, because when other nodes try to run the job they check the state, and since they see it as NotRunning, they all try to run the job as well.
So my clever scheme for managing job state is failing because the actual value isn't getting committed to the DB.
How do I force Play to write to the DB right away when I call .save() on a model?
Thanks
Josh
Try adding this to your JobStatus and call it after save.
public static void commit() {
    JobStatus.em().getTransaction().commit();
    JobStatus.em().getTransaction().begin();
    JobStatus.em().flush();
    JobStatus.em().clear();
}
I suppose you want to mark your job as "running" pretty much as the first thing when the job starts? In that case, you shouldn't have any other ongoing database statements yet...
To commit your changes in the database immediately (instead of after the job has ended), add the following commands after the thisJobStatus.save(); method call:
JPA.em().flush();
JPA.em().getTransaction().commit();
Additionally, since you're using MySQL, you might want to lock the row immediately upon retrieval using the SELECT ... FOR UPDATE clause. (See the MySQL Reference Manual for more information.) Of course, you wouldn't want to have that in your getInstance() method, otherwise every fetch operation would lock the record.
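The reason for locking is that "check the state, then set it" has to happen as one atomic step, otherwise two nodes can both see NotRunning. The following is only an in-memory analogue of that idea, not JPA or MySQL code: the hypothetical JobStatusRow class uses a ReentrantLock where the database would use the row lock taken by SELECT ... FOR UPDATE.

```java
import java.util.concurrent.locks.ReentrantLock;

// In-memory analogue only: the lock plays the role of the row lock,
// and the boolean field plays the role of the single JobStatus row.
final class JobStatusRow {
    private final ReentrantLock rowLock = new ReentrantLock();
    private boolean running = false;

    // Returns true only for the single caller that flips NotRunning -> Running.
    boolean tryClaim() {
        rowLock.lock();          // like SELECT ... FOR UPDATE: other callers wait here
        try {
            if (running) {
                return false;    // another node already claimed the job
            }
            running = true;      // like save() followed by commit
            return true;
        } finally {
            rowLock.unlock();    // like ending the transaction
        }
    }

    // Marks the job as NotRunning again once it has finished.
    void release() {
        rowLock.lock();
        try {
            running = false;
        } finally {
            rowLock.unlock();
        }
    }
}
```

A node would call tryClaim() first and exit if it returns false, which corresponds to the "exit if already running" step in the question.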

CQRS/EventStore - changing two aggregates

I have a command that updates two aggregates. Since aggregate roots are transactional boundaries, I have a command that does a repository.Save() on the first aggregate, and then I fire another command (from within the first command) which acts on the second aggregate. Each Save() action starts its own Event-Store transaction, commits the changes, and then publishes them.
First, is this correct, i.e. letting one command notify another aggregate via another command?
I noticed in Mark Nijhof's code that he uses event handlers, which is nice as you can register several event handlers for the same event. I tried doing this using J Oliver's Event-Store, but my commits.events in IDispatchCommit were referencing the first aggregate's values when processing the second. This caused some weird errors.
So should I find a way of making this work with event handlers, or is firing off commands within commands okay?
JD
Edit - I have switched my wire-up to use .UsingAsynchronousDispatchScheduler() and am now allowing registered events to fire more than one event handler, which in turn fires a command on the other aggregate, and it seems to work. So, is this the correct way to do it, rather than having commands fire commands?
I think there are a million and one ways to skin this cat. I'm not sure firing a command from an event handler is the way to go; I have two command handlers respond to the same command in this instance.
I do find Documently good as a reference app. Have you looked at that?