Can a child workflow be executed asynchronously? - cadence-workflow

I'm trying to implement a perpetual workflow that commences with an activity that blocks until a message is delivered (namely, Redis' BLPOP). Once it completes, I want to start a new workflow asynchronously to do some sort of processing and return ContinueAsNew immediately.
I've tried to start the processing workflow as a child workflow. What I've observed is that my parent workflow completes before the child is executed, unless I process the returned future, which I don't really want to do.
What would be the right way to do this? Is it possible to start a new regular workflow within a workflow? Would such action be implemented as part of the workflow or within an activity?
Thank you in advance!

The solution is to wait for the child workflow to start before completing the parent or continuing it as new.
If you are using the Go Cadence Client, workflow.ExecuteChildWorkflow returns a ChildWorkflowFuture, which extends a Future that returns the child workflow result. It also has a GetChildWorkflowExecution method that returns a Future that becomes ready as soon as the child is started. So to wait for a child workflow to start, the following code can be used:
f := workflow.ExecuteChildWorkflow(ctx, childFunc)
var childWE workflow.Execution
// The following line unblocks as soon as the child is started.
if err := f.GetChildWorkflowExecution().Get(ctx, &childWE); err != nil {
    return err
}
At this point the child workflow has started, with its Workflow ID in childWE.ID and its Run ID in childWE.RunID.
The Java equivalent is:
ChildType child = Workflow.newChildWorkflowStub(ChildType.class);
// result promise becomes ready when the child completes
Promise<String> result = Async.function(child::executeMethod);
// childWE promise becomes ready as soon as the child is started
Promise<WorkflowExecution> childWE = Workflow.getWorkflowExecution(child);
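To make the "two futures" idea concrete without depending on Cadence, here is a minimal Python asyncio sketch. It is only an analogy, not Cadence API: ChildHandle, execute_child and the run ID "child-run-1" are hypothetical names made up for illustration. The parent waits only for the "started" future and returns while the child is still running.

```python
import asyncio


class ChildHandle:
    """Hypothetical stand-in for a child-workflow future with two stages."""

    def __init__(self):
        loop = asyncio.get_running_loop()
        self.started = loop.create_future()  # ready as soon as the child starts
        self.result = loop.create_future()   # ready when the child completes
        self.task = None


async def _run_child(handle, work_seconds):
    handle.started.set_result("child-run-1")  # hypothetical run ID
    await asyncio.sleep(work_seconds)         # simulated processing
    handle.result.set_result("processed")


def execute_child(work_seconds):
    """Schedule the child and return immediately with its handle."""
    handle = ChildHandle()
    handle.task = asyncio.ensure_future(_run_child(handle, work_seconds))
    return handle


async def parent():
    handle = execute_child(0.05)
    run_id = await handle.started           # wait only until the child has started
    child_done = handle.result.done()       # the child is still working here
    return handle, run_id, child_done


async def demo():
    handle, run_id, child_done = await parent()
    result = await handle.result            # only the demo waits for completion
    return run_id, child_done, result


run_id, child_done_when_parent_returned, result = asyncio.run(demo())
```

In this sketch the parent never touches handle.result, which mirrors ignoring the result future while still waiting on the execution future.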

Related

Talend ETL - running child job in tLoop

I am trying to run a child job in a tLoop. The child job connects to Salesforce and downloads the "Account" object to a local SQL Server table. There are problems with the connection to Salesforce: it takes a few attempts to connect. Hence, I put the connection logic in a child job and am now trying to call the child job in a loop. Below is the image of my parent job.
As you can see in the image, tRunJob_1 has an error because of the Salesforce connection problem in the child job. This is correct behaviour.
The setRetryConnect that is connected to OnComponentError has this code: context.retryConnect = true;
The setRetryConnect that is connected to OnComponentOk has this code: context.retryConnect = false;
So, I am tripping this context variable depending on whether child job succeeds or fails.
My tLoop looks as below:
I want the tLoop to keep iterating for as long as the condition remains true, that is, for as long as the child job keeps failing. However, it iterates just once and then stops. Could anyone please let me know what needs to be corrected to make the tLoop work?
I couldn't reproduce your issue with Salesforce, but looking at your job, I think that what you describe ("it just iterates once and then stops") is the expected behavior.
As per your job flow, after the tRunJob you are using an OnComponentOk/OnComponentError trigger, which runs its component and then stops the job run because the job execution has completed. The ideal approach would have been to keep everything in a subjob after the tLoop, so that it iterates until the condition is met.
Sample job for explanation -
Here I used tSetGlobalVar to define a global variable (in place of your context variable). Then use the globalMap variable as ((Boolean)globalMap.get("tLoop")) in the "Condition" of your tLoop.
And then finally run some code in the tJava component that does something and conditionally sets the global variable to false to mark the ending of loop.
tRunJob provides a return code: ((Integer)globalMap.get("tRunJob_1_CHILD_RETURN_CODE"))
If you're running your child Job a number of times and want your Job to exit with a non-zero code if one of these iterations fails, then after each iteration you should test this return code and, if it is non-zero, record that in your own globalMap object:
int returnCode = ((Integer)globalMap.get("tRunJob_1_CHILD_RETURN_CODE"));
if (returnCode > 0) {
    // The child job failed: set the loop flag to false so the tLoop ends.
    globalMap.put("tLoop", false);
} else {
    System.out.println(returnCode);
}
I found the answer myself and am posting it here so that it may help others. It appears that OnComponentError breaks the tLoop. I disabled the OnComponentError flow and un-checked the 'Die on Child Error' checkbox in tRunJob.
The tLoop remains as it is. No changes here.
The retryConnect will use the below code. It uses CHILD_RETURN_CODE to check whether the child job threw an error. In case of success, its value is 0. I am flipping the variable when the child job succeeds, so the loop will stop. As you can see, the tLoop shows 2 iterations; it is working as expected now. Thanks.

Celery Task Chaining/Worker Release

If I write a celery task that calls other celery tasks, can I release the parent task/worker without waiting for the downstream tasks to finish?
The situation:
I am working with an API that returns some data and the arguments for the next API call. I want to put all the data behind the API into a database. My current method is to query the API for the batch to work on, start some downstream processors, then recursively re-call the API+processing chain. I fear this will lock up workers waiting for all the recursive API calls to finish, when the workers do not care about the results of their children.
Pseudocode:
@task
def apiPing(start=None):
    """ Returns a dict of 5 elements, starting at the *start* element, or the
    beginning of the list if start is not specified. Also present in the dict is 'remaining',
    indicating how many elements are left in the API's list"""
    return json.loads(api(start))
@task
def processList(data):
    """ Takes a result from API ping, starts a task to store each element and a
    chain to recall the API and process that."""
    for element in data:
        store.delay(element)
    if data['remaining'] != 0:
        # Use a separate name so the celery chain() helper is not shadowed.
        next_batch = chain(apiPing.s(data['last']), processList.s())
        next_batch.delay()
I understand from here that the above is very close to bad; I do not want workers handling processList() to be locked up until all of the data in the API is handled. Is there a way to start the downstream tasks and release the parent worker, or refactor the above to not lock up workers?
Testing reveals that workers are in fact locked this way:
from celery import task
from time import sleep
@task
def parent():
    print("In parent")
    child.apply_async()
    print("Out of parent")
@task
def child():
    print("In child")
    sleep(10)
    print("Out of child")
[2013-08-05 18:37:29,264: WARNING/PoolWorker-4] In parent
[2013-08-05 18:37:31,278: WARNING/PoolWorker-2] In child
[2013-08-05 18:37:41,285: WARNING/PoolWorker-2] Out of child
[2013-08-05 18:37:41,298: WARNING/PoolWorker-4] Out of parent

Get executed jobs from the scheduler in Quartz

I want to retrieve scheduled but already executed jobs from the scheduler in Quartz. Is there any way to do so?
Well, first you need to retrieve a list of the jobs the scheduler is currently executing:
Scheduler sched = new StdSchedulerFactory().getScheduler();
List jobsList = sched.getCurrentlyExecutingJobs();
Then, it's a matter of iterating the list to retrieve the context for each job. Each context has a getPreviousFireTime().
Iterator jobsIterator = jobsList.iterator();
List<JobExecutionContext> executedJobs = new ArrayList<JobExecutionContext>();
while (jobsIterator.hasNext())
{
    JobExecutionContext context = (JobExecutionContext) jobsIterator.next();
    Date previous = context.getPreviousFireTime();
    if (previous == null) continue; // your job has not been executed yet
    executedJobs.add(context); // there's your list!
}
The implementation may be slightly different depending on which Quartz you use (Java or .NET), but the principle is the same.
Set the property JobDetail.setDurability(true) - which instructs Quartz not to delete the Job when it becomes an "orphan" (when the Job no longer has a Trigger referencing it).

How to log/dump/output DOM HTML

I've tried:
console.log(element('.users').html());
but the only thing I get is
LOG: { name: 'element \'.users\' html', fulfilled: false }
I assume you are using the Angular scenario runner.
The element().html() DSL returns a future (see Wikipedia). You are logging the future that will eventually contain the element's HTML, but at the point when you call console.log, the future is not resolved yet - there is no value in there.
Try this:
element('.users').query(function(elm, done) {
  console.log(elm.html());
  done();
});
The whole scenario runner works as a queue. The test code is executed (synchronously) and each command (e.g. element().html in your case) adds an action to this queue. Then, these actions are executed asynchronously: once the first action finishes (by calling done()), the second action is executed, and so on. Therefore the individual actions can be asynchronous, but the test code is synchronous, which is more readable.
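To illustrate the queue idea outside of Angular, here is a minimal Python sketch. It is not the scenario runner's real implementation; Runner, Future and the sample HTML string are hypothetical names and values chosen for the example. Commands only enqueue an action and hand back an unfulfilled future, and values exist only once the queue is drained.

```python
from collections import deque


class Future:
    """Placeholder whose value is filled in when its queued action runs."""

    def __init__(self, name):
        self.name = name
        self.fulfilled = False
        self.value = None

    def __repr__(self):
        return "Future(name=%r, fulfilled=%r)" % (self.name, self.fulfilled)


class Runner:
    """Hypothetical runner: commands enqueue actions, run() executes them in order."""

    def __init__(self):
        self.queue = deque()

    def add(self, name, action):
        future = Future(name)
        self.queue.append((future, action))
        return future  # returned immediately, not yet fulfilled

    def run(self):
        while self.queue:
            future, action = self.queue.popleft()
            future.value = action()
            future.fulfilled = True


runner = Runner()
log = []

html = runner.add("element '.users' html", lambda: "<li>alice</li>")
log.append(repr(html))  # too early: only the unfulfilled future is logged

# Logging from inside a queued action sees the resolved value instead.
runner.add("log html", lambda: log.append(html.value))
runner.run()
```

The first log entry corresponds to the question's output (a future that is not fulfilled), while the second entry, written by a queued action, contains the actual HTML.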

Display progress when running long operation?

In my ASP.NET MVC3 project, I've got an action which runs for a certain amount of time.
It would be nice if it could send partial responses back to the view.
The goal would be to show the user some progress information.
Does anybody have a clue how to make that work?
I tried writing output directly to the response, but it's not being sent to the client in parts but all in one block:
[HttpPost]
public string DoTimeConsumingThings(int someId)
{
    for (int i = 0; i < 10; i++)
    {
        this.Response.Write(i.ToString());
        this.Response.Flush();
        Thread.Sleep(500); // Simulate time-consuming action
    }
    return "Done";
}
In the view:
@Ajax.ActionLink("TestLink", "Create", new AjaxOptions()
    { HttpMethod = "POST", UpdateTargetId="ProgressTarget" })<br />
<div id="ProgressTarget"></div>
Can anybody help me make progressive action results?
Thanks!!
Here's how you could implement this: start by defining a class which will hold the state of the long-running operation; you will need properties such as the id, progress, result, ... Then you will need two controller actions: one which will start the task and another one which will return the progress. The Start action will spawn a new thread to execute the long-running operation and return immediately. Once a task is started, you could store the state of this operation in some common storage such as the Application, keyed by the task id.
The second controller action would be passed the task id and it will query the Application to fetch the progress of the given task. During that time the background thread will execute and every time it progresses it will update the progress of the task in the Application.
The last part is the client: you could poll the progress controller action at regular intervals using AJAX and update the progress.
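The start/poll pattern described above can be sketched in a few lines. This is a language-neutral illustration in Python rather than ASP.NET MVC code: the dictionary _tasks stands in for the shared Application storage, and start_task, get_progress and poll_until_done are hypothetical names for the Start action, the progress action and the polling client.

```python
import threading
import time
import uuid

_tasks = {}                 # shared storage: task id -> state
_lock = threading.Lock()    # protects _tasks across threads


def _work(task_id, steps):
    """Background thread: do the slow work and publish progress as it goes."""
    for i in range(1, steps + 1):
        time.sleep(0.01)  # simulate a time-consuming step
        with _lock:
            _tasks[task_id]["progress"] = i * 100 // steps
    with _lock:
        _tasks[task_id]["result"] = "Done"


def start_task(steps=10):
    """'Start' action: register the task, spawn the thread, return immediately."""
    task_id = str(uuid.uuid4())
    with _lock:
        _tasks[task_id] = {"progress": 0, "result": None}
    threading.Thread(target=_work, args=(task_id, steps), daemon=True).start()
    return task_id


def get_progress(task_id):
    """'Progress' action: return a snapshot of the task state."""
    with _lock:
        return dict(_tasks[task_id])


def poll_until_done(task_id, interval=0.005):
    """Client side: poll at regular intervals until a result is available."""
    seen = []
    while True:
        state = get_progress(task_id)
        seen.append(state["progress"])
        if state["result"] is not None:
            return seen, state["result"]
        time.sleep(interval)


task_id = start_task()
seen, result = poll_until_done(task_id)
```

In a web application the polling loop would live in the browser as periodic AJAX calls, and finished tasks would need to be removed from the shared storage at some point, which this sketch leaves out.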