Resequencing of workflow events - cadence-workflow

Looking for a suggestion or solution for the following use case:
The application receives messages ordered by change time and identified by a functional key (e.g. employee id). There can be multiple messages for a functional key.
Each message triggers a workflow. If there is a pending workflow for an employee, we would like to queue the new messages until the pending workflow is complete.
Is there any way in Cadence to resequence the messages so that they are processed as a group identified by the functional key in the message?

I would have a single workflow per employee which receives a message (possibly with SignalWithStart) and queues it up in a variable if processing is already going on for another message. The processing can be implemented as a child workflow or directly as part of the employee workflow. When the processing is done, the next one is kicked off if there is a buffered request. If there are no buffered requests and processing is done, the employee workflow can exit.
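A minimal sketch of that pattern, assuming the Cadence Java client; EmployeeWorkflow, Message and MessageActivities are illustrative names (and the timeout is arbitrary), not part of any existing API:

```java
import com.uber.cadence.activity.ActivityOptions;
import com.uber.cadence.workflow.SignalMethod;
import com.uber.cadence.workflow.Workflow;
import com.uber.cadence.workflow.WorkflowMethod;

import java.time.Duration;
import java.util.ArrayDeque;
import java.util.Queue;

public interface EmployeeWorkflow {
    @WorkflowMethod
    void run(Message first);      // started per employee id, e.g. via SignalWithStart

    @SignalMethod
    void enqueue(Message next);   // later messages for the same employee land here
}

class EmployeeWorkflowImpl implements EmployeeWorkflow {

    private final Queue<Message> buffer = new ArrayDeque<>();

    private final MessageActivities activities = Workflow.newActivityStub(
            MessageActivities.class,
            new ActivityOptions.Builder()
                    .setScheduleToCloseTimeout(Duration.ofMinutes(5))
                    .build());

    @Override
    public void run(Message first) {
        buffer.add(first);
        while (!buffer.isEmpty()) {
            // Process one message at a time; this could also start a child workflow instead.
            activities.process(buffer.poll());
        }
        // No buffered requests left, so the employee workflow exits; the next message
        // for this employee starts a fresh run through SignalWithStart.
    }

    @Override
    public void enqueue(Message next) {
        buffer.add(next);
    }
}
```

The client side would then use SignalWithStart with the employee id as the workflow id, so a new message either starts a fresh run or is buffered into the one that is already processing.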

Related

Kogito - wait until data from multiple endpoints is received

I am using Kogito with Quarkus. I have set up one DRL rule and am using a BPMN configuration. As can be seen below, currently one endpoint is exposed that starts the process. All needed data is received from the initial request; it is then evaluated and the process goes on.
I would like to extend the workflow to have two separate endpoints. One to provide the age of the person and another to provide the name. The process must wait until all needed data is gathered before it proceeds with evaluation.
Has anybody come across a similar solution?
Technically you could use a signal or message to add more data into a process instance before you execute the rules over the entire data; see https://docs.kogito.kie.org/latest/html_single/#ref-bpmn-intermediate-events_kogito-developing-process-services.
In order to do that you need some sort of correlation between these events; otherwise, how do you know that event "name 1" should be matched to event "age 1"? If you can keep the process instance id, the second event can either hit a REST endpoint for that specific process instance or send it a message via a message broker.
You can also have your own custom logic to aggregate the events and only fire a new process instance once your criteria for complete data are met. There are also plans in Kogito to extend how correlation is done, for instance allowing process variables to be used as the identifier: if you have person.id as the correlation, events for the name and age of the same id would signal the same process instance. Hope this info helps.
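As a sketch of that last point, this is roughly what client-side aggregation keyed by person id could look like before starting the process; the PersonData holder and the startProcess callback are illustrative, not Kogito API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

public class PersonDataAggregator {

    // Illustrative holder for the correlated data; not a Kogito type.
    record PersonData(String personId, String name, Integer age) {
        boolean isComplete() {
            return name != null && age != null;
        }
    }

    private final Map<String, PersonData> pending = new ConcurrentHashMap<>();
    private final Consumer<PersonData> startProcess;  // e.g. a call to the generated process endpoint

    public PersonDataAggregator(Consumer<PersonData> startProcess) {
        this.startProcess = startProcess;
    }

    public void onName(String personId, String name) {
        fireIfComplete(pending.merge(personId,
                new PersonData(personId, name, null),
                (old, incoming) -> new PersonData(personId, name, old.age())));
    }

    public void onAge(String personId, int age) {
        fireIfComplete(pending.merge(personId,
                new PersonData(personId, null, age),
                (old, incoming) -> new PersonData(personId, old.name(), age)));
    }

    private void fireIfComplete(PersonData data) {
        // Only fire the process once both name and age for the same person id have arrived.
        if (data.isComplete() && pending.remove(data.personId(), data)) {
            startProcess.accept(data);
        }
    }
}
```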

Resume Cadence Workflow based on signal without blocking the thread

We want to build a workflow which contains the below steps, in that order:
1. Execute some synchronous activities.
2. Trigger an external operation via a Kafka event.
3. Listen to the Kafka events for the result of the operation.
4. Execute some other activities based on the result.
Kafka may contain events not related to this workflow, so we need a separate workflow to filter the events for that particular workflow.
Using Cadence, I'm planning to split it into two workflows:
Workflow1 : 1 -> 2 -> wait for signal -> 4
Workflow2 : 3 -> Call workflow1.signal
Is it possible to wait for a signal in workflow1 without actually blocking the thread, so that the thread can process another workflow in the meantime?
I think there is some misunderstanding of how Temporal/Cadence works. There is no requirement that a thread avoid blocking in order for other workflows to make progress. The worker instance will have no problem dealing with such a situation.
So I would recommend blocking the thread in workflow1 to wait for the signal, as that is the simplest way to solve your business requirements.
As a side note, I don't understand why you need a second workflow. There is no need for a workflow to filter Kafka events. You can do it directly in a Kafka consumer that signals the first workflow.
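For illustration, a minimal sketch of workflow 1 waiting on the signal, assuming the Cadence Java client; OrderWorkflow and its method names are names chosen here, not from the question:

```java
import com.uber.cadence.workflow.SignalMethod;
import com.uber.cadence.workflow.Workflow;
import com.uber.cadence.workflow.WorkflowMethod;

public interface OrderWorkflow {
    @WorkflowMethod
    String execute(String input);

    @SignalMethod
    void operationCompleted(String result);
}

class OrderWorkflowImpl implements OrderWorkflow {

    private String operationResult;

    @Override
    public String execute(String input) {
        // Steps 1 and 2: run the synchronous activities and publish the Kafka event (omitted).
        // Waiting here "blocks" only this workflow execution; the worker keeps making
        // progress on other workflows in the meantime.
        Workflow.await(() -> operationResult != null);
        // Step 4: execute the remaining activities based on the result (omitted).
        return operationResult;
    }

    @Override
    public void operationCompleted(String result) {
        this.operationResult = result;
    }
}
```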
I have some experience writing Kafka/Kinesis consumers (not with Cadence yet, but I plan to use it soon). My feeling is that you only need one consumer thread blocked and waiting for new events from the Kafka stream. This consumer can live anywhere as long as it can talk to your Cadence system to send a signal to a workflow. For each Kafka message (after filtering out unrelated ones), if it can be designed to contain all the information the consumer needs to decide which workflow to signal, it will be very simple. If you have no control over what is in the message (it sounds like you have an existing stream), it is a little tricky: your consumer may need to look up which workflow to call based on some other identifier in the message.
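And a sketch of the consumer side, assuming a plain Kafka consumer and an already-configured Cadence WorkflowClient; deriving the workflow id from the record key is an assumption, and OrderWorkflow refers to the interface sketched above:

```java
import com.uber.cadence.client.WorkflowClient;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;

public class OperationResultConsumer {

    private final WorkflowClient cadenceClient;            // assumed to be configured elsewhere
    private final KafkaConsumer<String, String> consumer;  // assumed to be subscribed already

    public OperationResultConsumer(WorkflowClient cadenceClient,
                                   KafkaConsumer<String, String> consumer) {
        this.cadenceClient = cadenceClient;
        this.consumer = consumer;
    }

    public void run() {
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> record : records) {
                if (!isRelevant(record)) {
                    continue;  // filter out events unrelated to these workflows
                }
                // Assumes the record key carries enough information to identify the workflow.
                String workflowId = record.key();
                OrderWorkflow workflow =
                        cadenceClient.newWorkflowStub(OrderWorkflow.class, workflowId);
                workflow.operationCompleted(record.value());  // sends the signal
            }
        }
    }

    private boolean isRelevant(ConsumerRecord<String, String> record) {
        return true;  // replace with the actual filtering logic
    }
}
```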

Implementing a pub-sub pattern with Axon

We have a multi-step process we'd like to implement using a pub-sub pattern, and we're considering Axon for a big part of the solution.
Simply, the goal is to generate risk scores for insurance companies. These steps would apply generally to a pub-sub application:
A client begins the process by putting a StartRiskScore message on a bus, specifying the customer ID. The client subscribes to RiskScorePart3 messages for the customer ID.
Actor A, who subscribes to StartRiskScore messages, receives the message, generates part 1 of the risk score, and puts it on the bus as a RiskScorePart1 message, including the customer ID.
Actor B, who subscribes to RiskScorePart1 messages, receives the message, generates part 2 of the risk score, and puts it on the bus as a RiskScorePart2 message, including the customer ID.
Actor C, who subscribes to RiskScorePart2 messages, receives the message, generates part 3 of the risk score, and puts it on the bus as a RiskScorePart3 message, including the customer ID.
The original client, who already subscribed to RiskScorePart3 messages for the customer ID, receives the message and the process is complete.
I considered the following Axon implementation:
A. Make an aggregate called RiskScore
B. StartRiskScore becomes a command associated with the RiskScore aggregate.
C. The command handler for StartRiskScore becomes Actor A. It processes some data and puts a RiskScorePart1 event on the bus.
Now, here's the part I'm concerned about...
D. I'd create a RiskScorePart1 event handler in a separate PubSub object, which would do nothing but put a CreateRiskScorePart2 command on the command bus using the data from the event.
E. In the RiskScore aggregate, a command handler for CreateRiskScorePart2 (Actor B) would do some processing, then put a RiskScorePart2 event on the bus.
F. Similar to step D, a PubSub event handler for RiskScorePart2 would put a CreateRiskScorePart3 command on the command bus.
G. Similar to step E, a RiskScore aggregate command handler for CreateRiskScorePart3 (Actor C) would do some processing, then put a RiskScorePart3 event on the bus.
H. In the aggregate and the RiskScoreProjection query module, a RiskScorePart3 event handler would update the aggregate and projection, respectively.
I. The client is updated by a subscribed query to the projection.
I understand that replay occurs when a service is restarted. That's bad for old events because I don't want to re-fire commands from the PubSub handlers. It's good news for new events that occurred while the PubSub service was down.
EDIT #1:
I've considered using an Axon saga, which would be great. However, the same questions still exist even if PubSub is a saga:
How to ensure PubSub event handlers process each event exactly once, even after a restart?
Is there a different approach I should be taking to implement a pub-sub pattern in Axon?
Thanks for your help!
I think I can give some guidance in this area.
In your update you've pointed out that you're envisioning the use of a Saga for this set-up.
I would, however, like to point out that a Saga is meant to orchestrate a complex business transaction between bounded contexts/aggregates. The scenario you're describing is not a transaction between other contexts and/or aggregates; it's all contained in a single aggregate root, the RiskScore.
I'd thus suggest against using a Saga for this situation, as the tool (read: Saga) is relatively heavyweight for what you're describing.
Secondly, from the steps you describe from A to I, it looks as if the components described in steps D and F are there purely to react to an event by dispatching a command; under that assumption, they perform zero business functionality.
Taking my initial point that the transaction is contained in a single aggregate root, and the fact that no business functionality occurs when the command is dispatched back into the aggregate, why not contain the entirety of the operation within the RiskScore aggregate?
You can very easily handle the events an aggregate publishes with an @EventSourcingHandler and, in that method, apply another event. Or, if you would like to be 'pure' about segregating state updates and applying events, you could simply apply further events for the separate risk-score steps from there.
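A minimal sketch of keeping everything inside the aggregate, assuming Axon 4 annotations; the command and event classes mirror the names from the question, and the computePartN methods are placeholders:

```java
import org.axonframework.commandhandling.CommandHandler;
import org.axonframework.eventsourcing.EventSourcingHandler;
import org.axonframework.modelling.command.AggregateIdentifier;
import org.axonframework.spring.stereotype.Aggregate;

import static org.axonframework.modelling.command.AggregateLifecycle.apply;

@Aggregate
public class RiskScore {

    @AggregateIdentifier
    private String customerId;

    protected RiskScore() {
        // Required by Axon for event-sourced reconstruction.
    }

    @CommandHandler
    public RiskScore(StartRiskScoreCommand cmd) {
        // Actor A: compute part 1 and publish it.
        apply(new RiskScorePart1Event(cmd.getCustomerId(), computePart1(cmd)));
    }

    @EventSourcingHandler
    public void on(RiskScorePart1Event event) {
        this.customerId = event.getCustomerId();
        // Actor B: events applied inside an event sourcing handler are only published
        // while the aggregate is live, not while it is being replayed after a restart.
        apply(new RiskScorePart2Event(event.getCustomerId(), computePart2(event)));
    }

    @EventSourcingHandler
    public void on(RiskScorePart2Event event) {
        // Actor C
        apply(new RiskScorePart3Event(event.getCustomerId(), computePart3(event)));
    }

    @EventSourcingHandler
    public void on(RiskScorePart3Event event) {
        // Final state update; a separate projection can serve the subscribed query.
    }

    // Placeholders for the actual scoring logic.
    private double computePart1(StartRiskScoreCommand cmd) { return 0.0; }
    private double computePart2(RiskScorePart1Event e) { return 0.0; }
    private double computePart3(RiskScorePart2Event e) { return 0.0; }
}
```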
Anyhow, I don't see why you would need to hold tightly to the pub-sub pattern. I'd take the solution which resolves the business needs as well as possible. That might be an existing pattern, but it could just as well be any other approach you can think of.
That's my two cents on the situation; hope it helps!

Synchronous event triggering

I want to trigger, at the exact same time and through message receipt, some processes in different actors. Considering that my actors' mailboxes may be heavily stacked, what would be the best method to implement this?
I'm assuming you want the actors to read the messages at the same time. This, of course, is not possible (while an actor is processing a message it cannot be disturbed).
But you can make sure that your trigger message is the next message they will take from the mailbox. This can be achieved by using a priority mailbox, for example this one: http://doc.akka.io/api/akka/snapshot/index.html#akka.dispatch.UnboundedStablePriorityMailbox
The messages in the mailbox will be sorted by priority. If you give your trigger messages the highest priority, they will be processed first.
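A minimal sketch of such a mailbox in Java, assuming a hypothetical TriggerMessage type; the class is then bound to the relevant actors through configuration (a mailbox entry whose mailbox-type points at this class):

```java
import akka.actor.ActorSystem;
import akka.dispatch.PriorityGenerator;
import akka.dispatch.UnboundedStablePriorityMailbox;
import com.typesafe.config.Config;

public class TriggerPriorityMailbox extends UnboundedStablePriorityMailbox {

    // This constructor signature is what Akka expects when the mailbox is created from configuration.
    public TriggerPriorityMailbox(ActorSystem.Settings settings, Config config) {
        super(new PriorityGenerator() {
            @Override
            public int gen(Object message) {
                if (message instanceof TriggerMessage) {
                    return 0;  // lowest value = highest priority: handled before queued messages
                }
                return 1;      // everything else keeps its relative (stable) order
            }
        });
    }
}
```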

Process messages from Azure in LIFO

I am using the Azure REST API to read messages from an Azure Queue using Peek-Lock Message. Is there any way I can read the last message that was posted in the queue rather than reading from a queue based mechanism (FIFO)?
Also, is there a faster way to process messages from Azure other than using the Peek-Lock Message REST API?
Thanks!
Is there any way I can read the last message that was posted in the queue rather than reading from a queue based mechanism (FIFO)?
Using the REST API, unfortunately there's no way to process the last message first. You would have to implement something on your own. If you know that your queue can't have more than 32 messages at a time, you could possibly get all 32 messages in one go and sort them on the client side based on the message insertion time. Yet another (crazy) idea would be to create a new queue for each message and name the queue using the pattern "q" + (DateTime.MaxValue.Ticks - DateTime.UtcNow.Ticks). Now list the queues and take only the first one: it will contain the message you inserted last.
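A minimal sketch of that client-side sort, assuming the batch of up to 32 messages and their insertion times have already been fetched; QueueMessage is a hypothetical holder, not an SDK or REST type:

```java
import java.time.OffsetDateTime;
import java.util.Comparator;
import java.util.List;

public class LifoOrdering {

    // Hypothetical holder for a fetched message and its insertion time.
    record QueueMessage(String id, String body, OffsetDateTime insertionTime) {}

    // Order the fetched batch newest-first so the most recently posted message is processed first.
    static List<QueueMessage> newestFirst(List<QueueMessage> batch) {
        return batch.stream()
                .sorted(Comparator.comparing(QueueMessage::insertionTime).reversed())
                .toList();
    }
}
```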
Also, is there a faster way to process messages from Azure other than using the Peek-Lock Message REST API?
One possibility could be to fetch more than one message from the queue at a time and process them in parallel on the client side.