Should I use a child workflow or an activity to start a new workflow? - cadence-workflow

As the title says. Both ways seem like they should work, but a child workflow seems easier.

It’s strongly recommended to always use an activity to start a new workflow, and never to use ChildWorkflow, until the reset feature works with child workflows: https://github.com/uber/cadence/issues/3914
https://github.com/temporalio/temporal/issues/3141
To get the result back to the parent from the child workflow, use a signal. To link the two workflows, use search attributes when starting the new workflow.

As Quanzheng said, if you need to use Reset, then Child Workflows are not currently an option.
Apart from that issue, the semantics of Child Workflows are quite different from starting a new workflow via an Activity.
The primary differences are that:
By default, Terminations and Cancellations are propagated to Child Workflows, although this can be overridden at Child Workflow creation time. This behavior is possible to implement with co-equal Workflows, but requires a careful Workflow implementation which never terminates without terminating its children.
Waiting for a Child Workflow to complete is directly supported in the Temporal API, whereas waiting for an arbitrary Workflow is not. See this issue.
Whether you need either of those capabilities, and whether you use Reset, should tell you if Child Workflows are appropriate to your use-case.

Related

What should I consider when using Cadence/Temporal to design a new project?

I am new to Cadence/Temporal and was wondering what the design review process is like. My team is ready to have a formal design review out but was wondering if there is a template available to capture Cadence/Temporal specific information?
This is something I like to call "workflow-oriented architecture". I would suggest thinking more about the aspects below:
Different options/alternatives of “what part of the process” in the design that can be modeled as workflow. Based on that,
What will the workflowID be, and with which IDReusePolicy? It's usually recommended to use a business ID to guarantee uniqueness, so that only one workflow is executing for a business entity at a time.
How is the workflow started, and with what information as input parameters?
Which Cadence/Temporal concepts are you planning to use, and how does a workflow interact with other systems?
A regular, local, or long-running activity performs an action against an external system.
A durable timer (use workflow.Sleep or Workflow.Await) waits for a certain time and then wakes up. Unlike a sleep call in the native language, a durable timer is reliable: a host restart does not affect when it fires.
A signal receives an event from an external system.
A query lets an external system read some workflow state.
Search attributes do two things: a) let an application search for workflows matching some conditions using the ListWorkflowExecutions API, and b) let an application get the basic status through the DescribeWorkflowExecution API.
How do you handle failure, especially using Cadence/Temporal concepts: activityRetry, workflowRetry, reset?
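The point about using a business ID as the workflowID can be illustrated with a small, self-contained sketch. It does not use the Cadence or Temporal SDK: `InMemoryEngine`, `workflow_id_for`, and `WorkflowAlreadyRunning` are hypothetical names that only mimic the idea that a second start with the same ID is rejected while the first run is still open.

```python
class WorkflowAlreadyRunning(Exception):
    """Raised when a start is attempted for an ID that is still running."""


def workflow_id_for(entity_type: str, business_id: str) -> str:
    # Derive the workflow ID from the business entity, not from a random value,
    # so that the same entity always maps to the same workflow ID.
    return f"{entity_type}-{business_id}"


class InMemoryEngine:
    """Toy stand-in for a workflow service (an assumption, not a real API)."""

    def __init__(self) -> None:
        self._running = set()

    def start(self, workflow_id: str) -> None:
        # Reject a duplicate start while a run with this ID is still open.
        if workflow_id in self._running:
            raise WorkflowAlreadyRunning(workflow_id)
        self._running.add(workflow_id)

    def complete(self, workflow_id: str) -> None:
        # Once the run is closed, the ID may be started again.
        self._running.discard(workflow_id)
```

With this shape, two requests for the same order produce the same ID, and the second start fails instead of creating a parallel workflow for that order. Whether a closed ID may be reused is what the IDReusePolicy decides in the real systems; the sketch simply allows it.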

Reuse Jobs in GitHub Actions Workflow

I’m migrating a pipeline from CircleCI to GitHub Actions and am finding it a bit weird that I can only run a job once, instead of defining a job and then calling it from the workflow section, which would let me call a job multiple times without duplicating its commands/scripts.
My pipeline pushes code out to three environments, then runs a Lighthouse scan for each of them. In CircleCI I have one job to push the code to my environments and one job to run Lighthouse. Then, from my workflow section, I just call the jobs three times, passing the environment as a parameter. Am I missing something, or is there no way to do this in GitHub Actions? Do I just have to write out my commands three times, once in each job?
There are 3 main approaches for code reusing in GitHub Actions:
Reusing workflows
The obvious option is using the "Reusable workflows" feature that allows you to extract some steps into a separate "reusable" workflow and call this workflow as a job in other workflows.
Takeaways:
Reusable workflows can't call other reusable workflows.
The strategy property is not supported in any job that calls a reusable workflow.
Env variables and secrets are not inherited.
It's not convenient if you need to extract and reuse several steps inside one job.
Since it runs as a separate job, you have to use build artifacts to share files between a reusable workflow and your main workflow.
You can call a reusable workflow in synchronous or asynchronous manner (managing it by jobs ordering using needs keys).
A reusable workflow can define outputs that extract outputs/outcomes from executed steps. They can be easily used to pass data to the "main" workflow.
Dispatched workflows
Another possibility that GitHub gives us is workflow_dispatch event that can trigger a workflow run. Simply put, you can trigger a workflow manually or through GitHub API and provide its inputs.
There are actions available on the Marketplace which allow you to trigger a "dispatched" workflow as a step of "main" workflow.
Some of them also allow doing it in a synchronous manner (waiting until the dispatched workflow has finished). It is worth noting that this feature is implemented by polling the statuses of the repo's workflows, which is not very reliable, especially in a concurrent environment. It is also bounded by GitHub API usage limits, so there is a delay in finding out the status of a dispatched workflow.
Takeaways
You can have multiple nested calls, triggering a workflow from another triggered workflow. If done carelessly, this can lead to an infinite loop.
You need a special token with "workflows" permission; your usual secrets.GITHUB_TOKEN doesn't allow you to dispatch a workflow.
You can trigger multiple dispatched workflows inside one job.
There is no easy way to get some data back from dispatched workflows to the main one.
Works better in a "fire and forget" scenario. Waiting for a dispatched workflow to finish has some limitations.
You can observe dispatched workflows runs and cancel them manually.
Composite Actions
In this approach we extract steps to a distinct composite action, that can be located in the same or separate repository.
From your "main" workflow it looks like a usual action (a single step), but internally it consists of multiple steps, each of which can call its own actions.
Takeaways:
Supports nesting: each step of a composite action can use another composite action.
Bad visualisation of internal steps run: in the "main" workflow it's displayed as a usual step run. In raw logs you can find details of internal steps execution, but it doesn't look very friendly.
Shares environment variables with a parent job, but doesn't share secrets, which should be passed explicitly via inputs.
Supports inputs and outputs. Outputs are prepared from outputs/outcomes of internal steps and can be easily used to pass data from composite action to the "main" workflow.
A composite action runs inside the job of the "main" workflow. Since they share a common file system, there is no need to use build artifacts to transfer files from the composite action to the "main" workflow.
You can't use the continue-on-error option inside a composite action.
Source: my "DRY: reusing code in GitHub Actions" article
I'm currently in the exact same boat and just found an answer. You're looking for a Composite Action, as suggested in this answer.
"Reusable workflows can't call other reusable workflows."
Actually, they can, since Aug. 2022:
GitHub Actions: Improvements to reusable workflows
Reusable workflows can now be called from a matrix and other reusable workflows.
You can now nest up to 4 levels of reusable workflows giving you greater flexibility and better code reuse.
Calling a reusable workflow from a matrix allows you to create richer parameterized builds and deployments.
Learn more about nesting reusable workflows.
Learn more about using reusable workflows with the matrix strategy.

What is the exact use-case for ContinueAsNew

Team,
What is the exact use case to use continueAsNew?
As we have support for CronSchedule to do periodic activities, I don't know the scenario to use this.
Is it there only for backward compatibility?
There are many scenarios besides cron that require always-running workflows. For example, a workflow that listens for external events and keeps some aggregated state. Such a workflow will eventually hit the history size limit. To let such a workflow process an unlimited number of events, it has to call continue-as-new periodically.
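The aggregating-listener case can be sketched as a plain Python simulation. Nothing here is the Cadence or Temporal API: `HISTORY_LIMIT`, `ContinueAsNew`, `run_once`, and `run_until_done` are hypothetical names, and the "history" is just a counter. The sketch only shows the shape of the technique: when a run gets close to its history limit, it ends and hands its aggregated state to a fresh run.

```python
from collections import deque
from dataclasses import dataclass

HISTORY_LIMIT = 3  # hypothetical per-run limit, kept tiny for illustration


@dataclass
class ContinueAsNew:
    """Marker returned by a run that wants to restart with carried state."""
    total: int
    pending: deque


def run_once(total: int, pending: deque):
    """Process events until the queue is empty or the history limit is hit."""
    history = 0
    while pending:
        if history >= HISTORY_LIMIT:
            # History is full: stop this run and carry the state forward.
            return ContinueAsNew(total, pending)
        total += pending.popleft()
        history += 1
    return total


def run_until_done(events):
    """Drive successive runs; returns (final_total, number_of_runs)."""
    total, pending, runs = 0, deque(events), 0
    while True:
        runs += 1
        result = run_once(total, pending)
        if isinstance(result, ContinueAsNew):
            total, pending = result.total, result.pending
        else:
            return result, runs
```

Each run starts with an empty history but the same aggregated total, which is why the number of events that can be processed is no longer bounded by a single run's history size.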

Alfresco, recognize when a workflow is started

I use Alfresco Community 5.2 and need to perform some work when one of Alfresco's default workflows is started.
I could override all the workflow definitions, but I wonder if there is a better and quicker way to do that. The ideal would be a behavior that triggers when a workflow is started.
Is there something like that?
Any other approach is accepted. Thanks.
There isn't anything similar to a behavior for workflows that I know of, although if your workflows will always have documents attached you could consider binding a behavior to the workflow package type (I don't recall off-hand what that type is--it might just be cm:folder which wouldn't be that useful).
This is kind of a hack suggestion, but you could implement a quartz job that would run every 30 seconds or every minute or so that would use the workflow service to check to see if any new workflows have started since the last check. If so, your code could be notified and passed the workflow ID, process ID, etc.
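The "check what started since the last check" idea can be sketched independently of Alfresco and Quartz. This is a minimal illustration in Python, not Alfresco code: `WorkflowInfo`, `NewWorkflowPoller`, `list_workflows`, and `on_started` are hypothetical names, and `list_workflows` stands in for whatever query the workflow service offers. A scheduler would call `poll()` every 30 seconds or so.

```python
from datetime import datetime
from typing import Callable, Iterable, NamedTuple


class WorkflowInfo(NamedTuple):
    workflow_id: str
    started_at: datetime


class NewWorkflowPoller:
    """Reports workflows whose start time is later than the last one seen."""

    def __init__(
        self,
        list_workflows: Callable[[], Iterable[WorkflowInfo]],  # hypothetical query
        on_started: Callable[[WorkflowInfo], None],            # hypothetical callback
        since: datetime,
    ) -> None:
        self._list_workflows = list_workflows
        self._on_started = on_started
        self._last_seen = since

    def poll(self) -> int:
        """Notify about newly started workflows; returns how many were found."""
        new = sorted(
            (w for w in self._list_workflows() if w.started_at > self._last_seen),
            key=lambda w: w.started_at,
        )
        for workflow in new:
            self._on_started(workflow)
        if new:
            # Advance the watermark to the newest start time actually observed.
            self._last_seen = new[-1].started_at
        return len(new)
```

The watermark is taken from the workflows' own start times rather than from the wall clock at poll time, so a slow poll does not skip anything. One caveat of this sketch: a workflow that starts with exactly the same timestamp as the watermark after a poll would be missed, so a real job might track IDs as well.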
The straightforward solution is as you suggested in your original post--just modify the out-of-the-box processes with a task listener that fires when the workflow starts.
Following Jeff's suggestion and this tutorial, I managed to implement a task creation/completion listener and put my logic inside those blocks, which resolved the problem.

Windows Workflow: Persistence and Polling

I'm currently learning the WF framework, so bear with me; mostly I'm looking for where to start looking, not necessarily a direct answer. I just can't seem to figure out how to begin researching what I'd like in The Google.
Let's say I have a simple one-step workflow (much more complicated than that, but for simplicity's sake). This workflow needs to watch a certain record in the database to see when it changes. I don't have the capability to "push" via a trigger from the database when the row changes, so I need to poll for it every so often.
This workflow needs to be persisted to the database to be durable against restarts and whatnot as this is a long-running workflow. I'm trying to figure out the best way to get it to check every 3 minutes or so and also persist to the database. Do the persistence capabilities of the framework allow for that? It seems to be time-based. And since the workflow won't be reawakened by an external event, how does it reload from the database and check the same step it did previously again? Does it attempt the last unfulfilled activity automatically upon reloading?
Do "while" activities with a delay attached to them work at all, or can it be handled solely through the persistence services?
I'm not sure what you mean by "handled solely through persistence services". Persistence refers only to the storing of an idle workflow.
You could have a Delay and a Code activity in a Sequence in a While loop. During the Delay the workflow will go idle and may be persisted if necessary. However, depending on how much state needs to be persisted and how many such workflows are running at any one time, a leaner approach may be necessary.
A leaner approach would be to externalise the DB watching and have a "DB watching" workflow service raise an event when the desired change has occurred. This service would be added to the Workflow runtime.
To that end you need a service contract, which is defined by an interface with the [ExternalDataExchange] attribute. This interface in turn defines an event that the service will raise when the desired DB change is detected. It also defines a method that a workflow can call to specify what change the service should be looking for. The method should accept an instance GUID so that the requesting instance can be found when the DB change is detected.
In the workflow you use a CallExternalMethodActivity to call this service's method. You then flow to a HandleExternalEventActivity, which listens for the event. At this point the workflow will go idle and can be persisted. It will remain there until the service raises the event.
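The division of labour described above (one shared watcher, many idle workflows, matched by instance GUID) can be sketched in a language-neutral way. The following Python is not WF code and does not use [ExternalDataExchange]: `DbWatchService`, `read_value`, and `raise_event` are hypothetical names standing in for the service contract, the database read, and the event delivered to the waiting instance.

```python
import uuid
from typing import Any, Callable, Dict, Tuple


class DbWatchService:
    """One shared poller that notifies waiting instances when a record changes."""

    def __init__(
        self,
        read_value: Callable[[str], Any],              # hypothetical DB read
        raise_event: Callable[[uuid.UUID, Any], None],  # hypothetical event sink
    ) -> None:
        self._read_value = read_value
        self._raise_event = raise_event
        self._watches: Dict[uuid.UUID, Tuple[str, Any]] = {}

    def watch_record(self, instance_id: uuid.UUID, record_key: str) -> None:
        # Called by a workflow instance: remember the value seen at registration.
        self._watches[instance_id] = (record_key, self._read_value(record_key))

    def check_once(self) -> int:
        """Poll every watched record once; returns the number of events raised."""
        fired = 0
        for instance_id, (key, baseline) in list(self._watches.items()):
            current = self._read_value(key)
            if current != baseline:
                # Fire once, then drop the watch so the instance is not re-notified.
                del self._watches[instance_id]
                self._raise_event(instance_id, current)
                fired += 1
        return fired
```

The polling still happens, but it happens once in the service on a timer rather than once per workflow, and each workflow instance stays idle (and persistable) until its event arrives.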