Setting up Task Constraints - azure-batch

How can I go about setting the max task retry count and the max task wall clock time for a task in Azure Batch?
I tried the following:
task.Constraints.MaxTaskRetryCount = 3;
task.Constraints.MaxWallClockTime = TimeSpan.FromMinutes(10);
But this didn't work; it breaks the code and the task is never created. What could the problem be? What am I missing here?
Thanks!

I was able to fix it using a TaskConstraints object. For example:
CloudTask task = new CloudTask(taskId, taskCommandLine);
TaskConstraints taskConstraints = new TaskConstraints();
taskConstraints.MaxTaskRetryCount = 3;
taskConstraints.MaxWallClockTime = TimeSpan.FromMinutes(10);
task.Constraints = taskConstraints;
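The original snippet most likely fails because Constraints is null on a newly created CloudTask, so setting properties on it directly throws before the task is ever added. A slightly fuller sketch for context (the batchClient, job ID, and command line below are illustrative assumptions, not from the original post):
// Minimal sketch, assuming an existing BatchClient; the job ID and command line are placeholders.
CloudTask task = new CloudTask("myTask", "cmd /c echo hello");
task.Constraints = new TaskConstraints
{
    MaxTaskRetryCount = 3,
    MaxWallClockTime = TimeSpan.FromMinutes(10)
};
batchClient.JobOperations.AddTask("myJob", task);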

Schedule Celery task to run after other task(s) complete

I want to accomplish something like this:
results = []
for i in range(N):
    data = generate_data_slowly()
    res = tasks.process_data.apply_async(data)
    results.append(res)
celery.collect(results).then(tasks.combine_processed_data())
i.e. launch asynchronous tasks over a long period of time, then schedule a dependent task that will only be executed once all of the earlier tasks are complete.
I've looked at things like chain and chord, but it seems like they only work if you can construct your task graph completely upfront.
For anyone interested, I ended up using this snippet:
@app.task(bind=True, max_retries=None)
def wait_for(self, task_id_or_ids):
    try:
        ready = app.AsyncResult(task_id_or_ids).ready()
    except TypeError:
        ready = all(app.AsyncResult(task_id).ready()
                    for task_id in task_id_or_ids)
    if not ready:
        self.retry(countdown=2 ** self.request.retries)
And writing the workflow something like this:
task_ids = []
for i in range(N):
    task = (generate_data_slowly.si(i) |
            process_data.si(i))
    task_id = task.delay().task_id
    task_ids.append(task_id)

# wait_for must be passed as a signature (.si) so it runs as part of the chain
final_task = (wait_for.si(task_ids) |
              combine_processed_data.si())
final_task.delay()
That way you would be running your tasks synchronously.
The solution depends entirely on how and where data are collected. Roughly, given that generate_data_slowly and tasks.process_data are synchronized, a better approach would be to join both in one task (or a chain) and to group them.
chord will allow you to add a callback to that group.
The simplest example would be:
from celery import chord

@app.task
def getnprocess_data():
    data = generate_data_slowly()
    return whatever_process_data_does(data)

header = [getnprocess_data.s() for i in range(N)]
callback = combine_processed_data.s()
chord(header)(callback).get()

simulink - GetSet Custom Storage Class

I have a model which takes two inputs, multiplies them, and gives the output.
output_1 = input_1 * input_2
I have declared my Simulink signals with CustomStorageClass = 'GetSet':
input_1 = Simulink.Signal;
input_1.CoderInfo.StorageClass = 'Custom';
input_1.CoderInfo.CustomStorageClass = 'GetSet';
input_1.CoderInfo.CustomAttributes.GetFunction = 'Get_input_1';
input_1.CoderInfo.CustomAttributes.SetFunction = 'Set_input_1';
input_1.CoderInfo.CustomAttributes.HeaderFile = 'signals.h';
input_2 = Simulink.Signal;
input_2.CoderInfo.StorageClass = 'Custom';
input_2.CoderInfo.CustomStorageClass = 'GetSet';
input_2.CoderInfo.CustomAttributes.GetFunction = 'Get_input_2';
input_2.CoderInfo.CustomAttributes.SetFunction = 'Set_input_2';
input_2.CoderInfo.CustomAttributes.HeaderFile = 'signals.h';
output_1 = Simulink.Signal;
output_1.CoderInfo.StorageClass = 'Custom';
output_1.CoderInfo.CustomStorageClass = 'GetSet';
output_1.CoderInfo.CustomAttributes.GetFunction = 'Get_output_1';
output_1.CoderInfo.CustomAttributes.SetFunction = 'Set_output_1';
output_1.CoderInfo.CustomAttributes.HeaderFile = 'signals.h';
Now I am trying to generate code from my model using Simulink Coder.
In the code generation settings of the model I have selected ert.tlc as the system target file.
But the generated code does not have Get_input_1() or Get_input_2() calls like those shown in this link:
http://www.mathworks.com/help/ecoder/ug/getset-custom-storage-classes.html
What have I missed in the settings? Please suggest.
I know you probably already solved this issue, but I have also seen this behavior before.
Sometimes MATLAB does not update the header files correctly. If you had set a different configuration for your variable and then made a change involving the header files, I would recommend erasing the *_ert_rtw and slprj folders (they will appear again). It is similar to doing a "Make clean" operation, to ensure that everything is brand new.

How to ensure only one job fires at a time in Quartz.NET?

I have a Windows Service that uses Quartz.NET to execute jobs that are scheduled. I only want it to pick up a single job at a time. However, occasionally I am seeing behavior that indicates that it has picked up two jobs at once.
There are two log files (the regular one and one automatically generated when the regular one is in use) with jobs that start at the exact same time. I can see both jobs executing in the QRTZ_FIRED_TRIGGERS table, but only one has the correct instance ID, which is odd.
I have configured Quartz to use only a single thread. Is this not how you tell it to only pick up a single job at a time?
Here is my quartz.config file with sensitive values hashed out:
quartz.scheduler.instanceName = DefaultQuartzJobScheduler
quartz.scheduler.instanceId = ######################
quartz.jobstore.clustered = true
quartz.jobstore.clusterCheckinInterval = 15000
quartz.threadPool.type = Quartz.Simpl.SimpleThreadPool, Quartz
quartz.jobStore.useProperties = false
quartz.jobStore.type = Quartz.Impl.AdoJobStore.JobStoreTX, Quartz
quartz.jobStore.driverDelegateType = Quartz.Impl.AdoJobStore.OracleDelegate, Quartz
quartz.jobStore.tablePrefix = QRTZ_
quartz.jobStore.lockHandler.type = Quartz.Impl.AdoJobStore.UpdateLockRowSemaphore, Quartz
quartz.jobStore.misfireThreshold = 60000
quartz.jobStore.dataSource = default
quartz.dataSource.default.connectionString = ######################
quartz.dataSource.default.provider = OracleClient-20
# Customizable values per Node
quartz.threadPool.threadCount = 1
quartz.threadPool.threadPriority = Normal
Make the threadcount = 1.
<add key="quartz.threadPool.threadCount" value="1"/>
<add key="quartz.threadPool.threadPriority" value="Normal"/>
(as you have done)
Make each of your jobs "Stateful"
[PersistJobDataAfterExecution]
[DisallowConcurrentExecution]
public class StatefulDoesNotRunConcurrentlyJob : IJob // formerly ": IStatefulJob"
{
    // Error 43: 'Quartz.IStatefulJob' is obsolete: 'Use DisallowConcurrentExecutionAttribute
    // and/or PersistJobDataAfterExecutionAttribute annotations instead.'
    public void Execute(IJobExecutionContext context) { /* job logic goes here */ }
}
I've left in the name of the older version of how to do this (namely, "IStatefulJob") and the error message that is generated when you code against the outdated "IStatefulJob" interface. The error message gives the hint.
Basically, if you have 1 thread AND every job is marked with "DisallowConcurrentExecution", it should result in only 1 job running at any given time, in "serial mode".
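For context, a minimal sketch of wiring the two together in code, assuming the Quartz.NET 2.x synchronous API (the job identity and schedule below are illustrative, not from the original config):
using System.Collections.Specialized;
using Quartz;
using Quartz.Impl;

// Single-threaded pool, as in the question's quartz.config
var props = new NameValueCollection();
props["quartz.threadPool.threadCount"] = "1";
props["quartz.threadPool.threadPriority"] = "Normal";

IScheduler scheduler = new StdSchedulerFactory(props).GetScheduler();
scheduler.Start();

// The job class carries [DisallowConcurrentExecution]; identity and interval are placeholders
IJobDetail job = JobBuilder.Create<StatefulDoesNotRunConcurrentlyJob>()
    .WithIdentity("onlyJob")
    .Build();
ITrigger trigger = TriggerBuilder.Create()
    .StartNow()
    .WithSimpleSchedule(x => x.WithIntervalInSeconds(30).RepeatForever())
    .Build();
scheduler.ScheduleJob(job, trigger);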

CQ Workflow - comment not transferring between participants with a process step in between

When there is a process step between two participant steps, the comment is not passed between participants. My workflow is like this:
ParticipantA ---> Process step X (ecma script) ----> Process step Y (ecma script) -----> ParticipantB
When I add a comment at the ParticipantA step it does not carry forward to ParticipantB. It seems the OOB functionality has limitations here.
As a workaround, I am trying to get it at Process step X and pass it on to Process step Y. I am able to get it, but not able to set it for the next step.
Below is my code-
log.info("Noop process called for: " + workItem.getWorkflowData().getPayload());
var comment = workItem.getMetaDataMap().get("comment");
log.info("Comment in approval process-----------" + comment);
var workflowData = workItem.getWorkflowData();
if (workflowData.getPayloadType() == "JCR_PATH") {
    log.info("setting comment in meta data----------------");
    workflowData.getMetaDataMap().put("comment", comment);
}
Can you help with how to set the comment for the next step?
Thanks in advance.
Regards,
Vivek
You would need to actually store your comment within the workflow metadata map; this should help. Once you have successfully stored your comment, you can access it later.
Hope this helps.
I guess it is a session change within the workflow. The WorkflowData instance will be newly set. You can easily check it in the debugger of your IDE. You have to iterate over the HistoryItems as illustrated here:
final List<HistoryItem> history = workflowSession.getHistory(workItem.getWorkflow());
final List<String> comments = new ArrayList<>();
if (history.size() > 0) {
    HistoryItem current = history.get(history.size() - 1);
    do {
        comments.add(current.getComment());
        current = current.getPreviousHistoryItem();
    } while (current != null);
}
Comments are empty strings if not set, if I'm not mistaken.

Simplest way to process a list of items in a multi-threaded manner

I've got a piece of code that opens a data reader and for each record (which contains a url) downloads & processes that page.
What's the simplest way to make it multi-threaded so that, let's say, there are 10 slots which can be used to download and process pages simultaneously, and as slots become available the next rows are read, etc.?
I can't use WebClient.DownloadDataAsync
Here's what I have tried to do, but it hasn't worked (i.e. the "worker" is never run):
using (IDataReader dr = q.ExecuteReader())
{
    ThreadPool.SetMaxThreads(10, 10);
    int workerThreads = 0;
    int completionPortThreads = 0;
    while (dr.Read())
    {
        do
        {
            ThreadPool.GetAvailableThreads(out workerThreads, out completionPortThreads);
            if (workerThreads == 0)
            {
                Thread.Sleep(100);
            }
        } while (workerThreads == 0);
        Database.Log l = new Database.Log();
        l.Load(dr);
        ThreadPool.QueueUserWorkItem(delegate(object threadContext)
        {
            Database.Log log = threadContext as Database.Log;
            Scraper scraper = new Scraper();
            dc.Product p = scraper.GetProduct(log, log.Url, true);
            ManualResetEvent done = new ManualResetEvent(false);
            done.Set();
        }, l);
    }
}
You do not normally need to play with the Max threads (I believe it defaults to something like 25 per proc for worker, 1000 for IO). You might consider setting the Min threads to ensure you have a nice number always available.
You don't need to call GetAvailableThreads either. You can just start calling QueueUserWorkItem and let it do all the work. Can you repro your problem by simply calling QueueUserWorkItem?
You could also look into the Task Parallel Library, which has helper methods to make this kind of thing more manageable and easier.
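For example, a minimal sketch using Parallel.ForEach with the degree of parallelism capped at 10 (the Database.Log, Scraper, and dc.Product types are carried over from the question; reading all rows up front before processing is an assumption made for brevity):
// Requires System.Threading.Tasks (Parallel) and System.Collections.Generic.
// Read the rows first, then process them with at most 10 in flight at a time.
var logs = new List<Database.Log>();
using (IDataReader dr = q.ExecuteReader())
{
    while (dr.Read())
    {
        var l = new Database.Log();
        l.Load(dr);
        logs.Add(l);
    }
}

Parallel.ForEach(
    logs,
    new ParallelOptions { MaxDegreeOfParallelism = 10 },
    log =>
    {
        Scraper scraper = new Scraper();
        dc.Product p = scraper.GetProduct(log, log.Url, true);
        // further processing of p goes here
    });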