Windows Workflow: Persistence and Polling - workflow

I'm currently learning the WF framework, so bear with me; mostly I'm looking for where to start looking, not necessarily a direct answer. I just can't seem to figure out how to begin researching what I'd like in The Google.
Let's say I have a simple one-step workflow (much more complicated than that, but for simplicity's sake). This workflow needs to watch a certain record in the database to see when it changes. I don't have the capability to "push" via a trigger from the database when the row changes, so I need to poll for it every so often.
This workflow needs to be persisted to the database to be durable against restarts and whatnot as this is a long-running workflow. I'm trying to figure out the best way to get it to check every 3 minutes or so and also persist to the database. Do the persistence capabilities of the framework allow for that? It seems to be time-based. And since the workflow won't be reawakened by an external event, how does it reload from the database and check the same step it did previously again? Does it attempt the last unfulfilled activity automatically upon reloading?
Do "while" activities with a delay attached to it work at all, or can it be handled solely through the persistence services?

I'm not sure what you mean by "handled soley through persistence services"? Persistence refers only to the storing of an idle workflow.
You could have a Delay and a Code activity in a Sequence in a While loop. When in the Delay the workflow will go idle and may be persisted if necessary. However depending on how much state is needed when persisting the workflow and/or how many such workflows you would have running at any one time may mean that a leaner approach is necessary.
A leaner approach would be to externalise the DB watching and have some "DB watching" workflow service raise an event when the desired change has occured. This service would be added to Workflow runtime.
To that end you need a service contract which is defined by an Inteface with the [ExternalDataExchange] attribute. This interface in turn defines an event that the service will raise when the desired DB change is detected. It also defines a method that a Workflow can call to specify what what change this service should be looking for. The method should accept an instance GUID so that the requesting instance can be found when the DB change is detected.
In the workflow you use a CallExternalMethodActivity to call this services method. You then flow to a HandleExternalEventActivity which listen for the event. At this point the workflow will go idle and can be persisted. It will remain there until the service raises the event.

Related

Should I use child workflow or use activity to start new workflow

Like the title. Seems like both ways should work but child workflow seems easier.
It’s strongly recommended to always use activity to start new workflow and never use ChildWorkflow until the reset feature is working with Child Workflow https://github.com/uber/cadence/issues/3914
https://github.com/temporalio/temporal/issues/3141
To get result back to parent from child workflow, use signal. To link the two workflows, use search attributes when starting new workflows.
As Quanzheng said, if you need to use Reset, then Child Workflows are not currently an option.
Apart from that issue, the semantics of Child Workflows are quite different from starting a new workflow via an Activity.
The primary differences are that:
By default, Terminations and Cancellations are propagated to Child Workflows, although this can be overridden at Child Workflow creation time. This behavior is possible to implement with co-equal Workflows, but requires a careful Workflow implementation which never terminates without terminating its children.
Waiting for a Child Workflow to complete is directly supported in the Temporal API, whereas waiting for an arbitrary Workflow is not. See this issue.
Whether you need either of those capabilities, and whether you use Reset, should tell you if Child Workflows are appropriate to your use-case.

What should I consider when using Cadence/Temporal to design a new project?

I am new to Cadence/Temporal and was wondering what the design review process is like. My team is ready to have a formal design review out but was wondering if there is a template available to capture Cadence/Temporal specific information?
This is something I try to call as "workflow-oriented-architecture". I would suggest to think more about the below aspects:
Different options/alternatives of “what part of the process” in the design that can be modeled as workflow. Based on that,
What will be the workflowID with which IDReusePolicy? It's usually recommended to use some business ID to guarantee the uniqueness so that there is only one workflow executing for a business entity
How is the Workflow started with what information as input parameters?
What Cadence/Temporal concepts you are planning to use, and how does a workflow interact with other system?
Regular/local/long-running activity is for making an action to external system
Durable timer (use workflow.Sleep or Workflow.Await) is to wait for certain time then wake up. Unlike using sleep in native language, durable timer is reliable that whatever host restart won't impact the firing
signal is to receive an event from external system
query is to let external system to get some workflow states
search attributes can do two things: a) letting application searching for workflows with some conditions using ListWorkflowExecutions API, and letting application to get the basic status by DescribeWorkflowExecution API
How do you handle failure, especially using Cadence/Temporal concepts: activityRetry, workflowRetry, reset

What are the best practices when working with data from multiple sources in Flutter/Bloc?

The Bloc manual describes the example of a simple Todos app. It works as an example, but I get stuck when trying to make it into a more realistic app. Clearly, a more realistic Todos app needs to keep working when the user temporarily loses network connection, and also needs to occasionally check the server for updates that the user might have added from another device.
So as a basic data model I have:
dataFromServer, which is refreshed every five minutes, and
localData, that describes what changes have been made locally but haven't been synchronized to the server yet.
My current idea is to have three kinds of events:
on<GetTodosFromServer>() which runs every few minutes to check the server for updates and only changes the dataFromServer,
on<TodoAdded>() (and its friends TodoDeleted, TodoChecked, and so on) which get triggered when the user changes the data, and only change the localData, and
on<SyncTodoToServer>() which runs whenever the user changes the todo list, or when network connectivity is restored, and tries to send the changes to the server, retrieves the new value from the server, and then sets the new dataFromServer and localData.
So obviously there's a lot of interaction between these three methods. When a new todo is added after the synchronization to the server starts, but before synchronization is finished, it needs to stay in the local changes object. When GetTodosFromServer and SyncTodoToServer both return server data, they need to find out who has the latest data and keep that. And so on.
Coming from a Redux background, I'm used to having two reducers (one for local data, one for server data) that would only respond to simple actions. E.g. an action { "type": "TodoSuccessfullySyncedToServer", uploadedData: [...], serverResponse: [...] } would be straightforward to parse for both the localData and the dataFromServer reducer. The reducer doesn't contain any of the business logic, it receives actions one by one and all you need to think about inside the reducer is the state before the action, the action itself, and the state after the action. Anything you rely on to handle the action will be in the action itself, not in the context. So different pieces of code that generate those actions can just fire these actions without thinking, knowing that the reducer will handle them correctly.
Bloc on the other hand seems to mix business logic and updating the state. API calls are made within the event handlers, which will emit a value possibly many seconds later. So every time you return from an asynchronous call in an event handler, you need to think about how the state might have changed while that call was happening and the consequences this has on what you're currently doing. Also, an object in the state can be updated by different events that need to coordinate among themselves how to avoid conflicts while doing so.
Is there a best practice on how to avoid the complexity that brings? Is it best practice to split large events into "StartSyncToServer" and "SuccessfullySyncedToServer" events where the second behaves a lot like a Redux reducer? I don't see any of that in the examples, so is there another way this complexity is typically avoided in Bloc? Or is Bloc entirely unopinionated on these things?
I'm not looking for personal opinions here, only if there's something I missed in the Bloc manual (or other authoritative source) about how this was intended to work.

Is there a way to rely on Postgres Notify/Listen mechanism?

I have implemented a Notify/Listen mechanism, so when a special request is sent to the web server, using notify I can notify the workers (in Python) that there's a pending request waiting to be processed.
The implementation works fine, but the problem is that if the workers server is restarting, the notification gets lost, since at that particular time there's no listener.
I can implement a service like MQRabbit or similar, but my needs are so simple that implement such a monster is too much.
Is there any way, a configuration variable perhaps, that can give some persistence to the notification mechanism?
Thanks in advance
I don't think there is a way to persist notification channels, but you can simply store the pending requests to a table, and have the worker check for any missed work on startup.
Either a timestamp or a pending/completed flag would work, depending on what kind of work it's doing.
For consistency, you can have the NOTIFY fire from an INSERT trigger on the queue table, and have the worker always check for any remaining work (not just a specific request) when notified.

What triggers UI refresh in CQRS client app?

I am attempting to learn and apply the CQRS design approach (pattern and architecture) to a new project but seem to be missing a key piece.
My client application executes a query and retrieves a list of light-weight, read-only DTOs from the read model. The user selects an item and clicks a button to initiate some action. The action is performed by creating and sending the corresponding command object to the write model (where the command handler carries out the action, updates the data store, etc.) At some point, however, I need to update the UI to reflect changes to the state of the application resulting from the action.
How does the UI know when it is time to refresh the original list?
Additional Info
I have noticed that most articles/blogs discussing CQRS use MVC client apps in their examples. I am working on a Silverlight client right now and am beginning to wonder if the pattern simply doesn't work in that case.
Follow-Up Question
After thinking more about Bartlomiej's response and subsequent discussion, I am wondering about error handling in CQRS. Given that commands are basically fire-and-forget asynchronous operations, how do we report an error condition to the UI?
I see 'refreshing the UI' to take one of two forms:
The operation succeeds, data has changed and the UI should be updated to reflect these changes
The operation fails, data has not changed but the user should be notified of the failure and potential corrective actions.
Even with a Post-Redirect-Get pattern in an MVC, you can't really Redirect until you know the outcome of the operation. None of the examples I've seen thus far address these real-world concerns.
I've been struggling with similar issues for a WPF client. The re-query trigger for any data is dependent on the data your updating, commands tend to fall into categories:
The command is a true fire and forget method, it informs the back-end of a state change but this change does not need to be reflected in the UI, or the change simply isn't important to the UI.
The command will alter the result of a single query
The command will alter the result of multiple queries, usually (in my domain at least) in a cascading fashion, that is, changing the state of a single "high level" piece of data will likely affect many "low level" caches.
My first trigger is the page load, very few items are exempt from this as most pages must assume data has been updated since it was last visited. Though some systems may be able to escape with only updating financial and other critical data in this way.
For short commands I also update data when 'success' is returned from a command. Though this is mostly laziness as IMHO all CQRS commands should be fired asynchronously. It's still an option I couldn't live without but one you may have to if your implementation expects high latency between command and query.
One pattern I'm starting to make use of is the mediator (most MVVM frameworks come with one). When I fire a command, I also fire a message to the mediator specifying which command was launched. Each Cache (A view model property Retriever<T>) listens for commands which affect it and then updates appropriately. I try to minimise the number of messages while still minimising the number of caches that update unnecessary from a single message so I'll (hopefully) eventually end up with a shortlist of update reasons, with each 'reason' updating a list of caches.
Another approach is simple honesty, I find that by exposing graphically how the system updates itself makes users more willing to be patient with it. On firing a command show some UI indicating you're waiting for the successful response, on error you could offer to retry / show the error, on success you start the update of the relevant fields. Baring in mind that this command could have been fired from another terminal (of which you have no knowledge) so data will need to timeout eventually to avoid missing state changes invoked by other machines also.
Noting the irony that the only efficient method of updating cache's and values on a client is to un-separate the commands and queries again, be it through hardcoding or something like a hashmap.
My two cents.
I think MVVM actually fits into CQRS quite well. The ViewModel simply becomes an observable ReadModel.
1 - You initialize your ViewModel state via a query on the ReadModel.
2 - Changes on your ViewModel are automatically reflected on any Views that are bound to it.
3 - Certain changes on your ViewModel trigger a command to propegate to a message queue, an object responsible for sending those commands to the server takes those messages off the queue and sends them to the WriteModel.
4 - Clients should be well formed, meaning the ViewModel should have performed appropriate validation before it ever triggered the command. Once the command has been triggered, any event notifications can be published onto an event bus for the client to communicate changes to other ViewModels or components in the system interested in those changes. These events should carry the relevant information necessary. Typically, this means that other view models usually don't have to re-query the read model as a result of the change unless they are dependent on other data that needs to be retrieved.
5 - There is an object that connects to the message bus on the server for real-time push notifications when other clients make changes that this client is interested in knowing about, falling back to long-polling if necessary. It propagates those to the internal message bus that ties the components on the client together.
6 - The last part to handle is the fact that clients can be occasionally connected, which should be the only reason a command fails (they don't have internet access at the moment), which is when the client should be notified of problems.
In my ASP.NET MVC 3 I use 2 techniques depending on use case:
already well-known Post-Redirect-Get pattern which fits nicely with CQRS. Your MVC action that triggers the command returns a redirection to action that performs a query.
in some cases, like real-time updates of other clients, I rely on domain events/messages. I create an event handler that uses singlarR to push changes to all connected and interested clients.
There are two major ways you can take as far as I know :
1) design your UI , so that the user does not see its changes right away. Like for instance a message to tell him his action is a success, and offering him different choices to continue his work. this should buy you enough time to have updated your readmodel.
2) more complex, but you might keep the information you have send to the server and shows them in the interface.
The most important I guess, educate your user if you can so that they know why the data is not here... yet!
I am thinking about it only now, but these are for sync command handling, not async, in async things go really harder on the brain...the client interface becomes an event eater too..