Using Atlasboard Job without a scheduler - atlasboard

While building an Atlasboard Job, I would like to control where and when data is pushed to the widget.
I could not find where the "interval" config parameter is documented - my understanding is that the job is scheduled each interval ms.
I would like to take control over when my job updates the widget. I therefore did a small test:
setInterval(function() {
x = x+1;
jobCallback(null, {title: config.widgetTitle + " - "+x});
}, 10000);
At first I was happy, as it seemed to work, but then I noticed the log messages:
[dashboard: xxx] [job: xxx] 14:04:27.93 <warn> WARNING!!!!: job_callback executed more than once for job xxx in dashboard xxx (scheduler.js)
[dashboard: xxx] [job: xxx] 14:04:27.96 <warn> WARNING!!!!: job_callback executed more than once for job xxx in dashboard xxx (scheduler.js)
[dashboard: xxx] [job: xxx] 14:04:37.94 <warn> WARNING!!!!: job_callback executed more than once for job xxx in dashboard xxx (scheduler.js)
I might add that after a few more minutes the job seems to replicate itself, and global variables do not seem to keep their values between job instances, so my single 'recurring job' soon becomes ten jobs, then 100, and so on.
Is there a way to gain better control over when the data is pushed to the widget?
Is there better documentation than https://bitbucket.org/atlassian/atlasboard and http://atlasboard.bitbucket.org/ ?

I found out the following during my search and looking into the jobWorker code (I could not find any documentation anywhere).
There is a pushUpdate member function that was added to the jobWorker - it appears as if it is best used during the Job onInit() - in this way we assure that it is activated only once!
onInit: function (config, dependencies) {
  var jobWorker = this;
  var x = 0;
  // onInit runs once per job, so only one timer is ever registered
  setInterval(function() {
    x = x + 1;
    jobWorker.pushUpdate({title: config.widgetTitle + " - " + x});
  }, 10000);
}
Note that if you take control over the scheduling of the updates, there is no need to implement a Job onRun() - meaning that no scheduler will be activated for your Job.
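The replication described in the question is what you would expect if a new setInterval were registered every time the scheduler invoked the job. As a minimal sketch of guarding against that (the names startPushingOnce, stopPushing and the push argument are hypothetical and not part of Atlasboard; push stands in for something like jobWorker.pushUpdate), a module-level handle can make sure only one timer exists:

```javascript
// Hypothetical helper, not an Atlasboard API: starts a repeating push
// at most once per process. `push` stands in for whatever delivers data
// to the widget (the answer above uses jobWorker.pushUpdate).
let timer = null;

function startPushingOnce(push, intervalMs) {
  if (timer !== null) {
    return false; // already running, so do not register a second timer
  }
  let counter = 0;
  timer = setInterval(function () {
    counter += 1;
    push({ title: 'tick ' + counter });
  }, intervalMs);
  return true;
}

function stopPushing() {
  if (timer !== null) {
    clearInterval(timer);
    timer = null;
  }
}
```

With a guard like this, calling startPushingOnce repeatedly is harmless, because every call after the first returns false without creating another timer.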

(Laravel 5) Monitor and optionally cancel an ALREADY RUNNING job on queue

I need to achieve the ability to monitor and be able to cancel an ALREADY RUNNING job on queue.
There's a lot of answers about deleting QUEUED jobs, but not on an already running one.
This is the situation: I have a "job", which consists of HUNDREDS OF THOUSANDS rows on a database, that need to be queried ONE BY ONE against a web service.
Every row needs to be picked up and queried against a web service, then the response stored and the row's status updated.
I had that already working as a Command (launching from / outputting to console), but now I need to implement queues in order to allow piling up more jobs from more users.
So far I've seen Horizon (which doesn't run on Windows due to missing process control libs). However, judging from some demos I've seen, it lacks (I believe) a couple of things I need:
Dynamically configurable timeout (the whole job may take more than 12 hours, depending on the number of rows to process on the selected job)
Ability to CANCEL an ALREADY RUNNING job.
I also considered generating EACH REQUEST as a new job instead of treating the whole collection of rows as one "job" (this would overcome the timeout issue), but that would give me a Horizon "pending jobs" list of hundreds of thousands of records per job, and that would kill the browser (I know Redis can handle this without any trouble). Further, I guess it is not possible to cancel "all jobs belonging to X tag".
I've been thinking about hitting an API route, fire the job and decouple it from the app, but I'm seeing that this requires forking processes.
For the ability to cancel, I would implement a database with job_id, and when the user hits an API to cancel a job, I'd mark it as "halted". On every loop I would check its status and if it finds "halted" then kill itself.
If I've missed any aspect just holler and I'll add it or clarify about it.
So I'm asking for advice here since I'm new to Laravel: how could I achieve this?
So I finally came up with this (a bit clunky) solution:
In Controller:
public function cancelJob()
{
$jobs = DB::table('jobs')->get();
# I could use a specific ID and user owner filter, etc.
foreach ($jobs as $job) {
DB::table('jobs')->delete($job->id);
}
# This is a file that... well, it's self explaining
touch(base_path(config('files.halt_process_signal')));
return "Job cancelled - It will stop soon";
}
In job class (inside model::chunk() function)
# CHECK FOR HALT SIGNAL AND [OPTIONALLY] STOP THE PROCESS
if ($this->service->shouldHaltProcess()) {
# build stats, do some cleanup, log, etc...
$this->halted = true;
$this->service->stopProcess();
# This FALSE is what it makes the chunk() method to stop looping
return false;
}
In service class:
/**
* Checks the existence of the 'Halt Process Signal' file
*
* @return bool
*/
public function shouldHaltProcess() :bool
{
return file_exists($this->config['files.halt_process_signal']);
}
/**
* Stop the batch process
*
* @return void
*/
public function stopProcess() :void
{
logger()->info("=== HALT PROCESS SIGNAL FOUND - STOPPING THE PROCESS ===");
$this->deleteHaltProcessSignalFile();
return ;
}
It doesn't look very elegant, but it works.
I've searched the whole web, and most answers go for Horizon or other tools that don't fit my case.
If anyone has a better way to achieve this, you're welcome to share it.
Laravel queues have 3 important config options:
1. retry_after
2. timeout
3. tries
See more: https://laravel.com/docs/5.8/queues
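Dynamically configurable timeout (the whole job may take more than 12 hours, depending on the number of rows to process on the selected job)
I think you can set timeout and retry_after to about 24 hours, keeping timeout a few seconds shorter than retry_after, as the queue documentation recommends.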
Ability to CANCEL an ALREADY RUNNING job.
Delete the job from the jobs table.
Kill the worker process by its process ID on your server.
Hope it helps you :)

Talend ETL - running child job in tLoop

I am trying to run a child job in tLoop. The child job connects to salesforce and downloads "Account" object to local SQL Server table. There are problems with connection to Salesforce, it takes few attempts to connect. Hence, I put the connection stuff in child job and now trying to call the child job in a loop. Below is the image of my parent job.
As you can see in image the tRunJob_1 has error because of Salesforce connection problem in child job. This is correct behaviour.
The setRetryConnect that is connected to OnComponentError has this code: context.retryConnect = true;
The setRetryConnect that is connected to OnComponentOk has this code: context.retryConnect = false;
So, I am tripping this context variable depending on whether child job succeeds or fails.
My tLoop looks as below:
I want the tLoop to keep running as long as the condition remains true, that is, as long as the child job continues to error out. However, it just iterates once and then stops. Could anyone please let me know what correction needs to be made here to make the tLoop work?
I couldn't re-pro your issue with SalesForce but by looking at your job what I feel is that when you say - "it just iterates once and then stops" is the expected behavior.
As per your job flow, after the tRunJob you are using the OnComponentOk/OnComponentError triggers, which process their components and then stop the run because the job execution has completed. Ideally, you would keep everything in a subjob after the tLoop so that it iterates until the condition is met.
Sample job for explanation -
Here used tSetGlobalVar to define a global variable (in place of your context variable). Then use the globalMap variable as ((Boolean)globalMap.get("tLoop")) in your "Condition" for the tLoop.
And then finally run some code in the tJava component that does something and conditionally sets the global variable to false to mark the ending of loop.
tRunJob provides a return code: ((Integer)globalMap.get("tRunJob_1_CHILD_RETURN_CODE"))
If you're running your child Job a number of times and want your Job to exit with non-zero if one of these iterations fails, then after each iteration you should test this return code and, if it is non-zero, store it in your own globalMap object:
int returnCode = ((Integer)globalMap.get("tRunJob_1_CHILD_RETURN_CODE"));
if (returnCode > 0) {
globalMap.put("tLoop", false);
}
else {
System.out.println(returnCode);
};
Found the answer myself, posting it here so that it may help others. It appears like OnComponentError breaks the tLoop. Disabled the OnComponentError flow and un-checked the 'Die on Child Error' checkbox in tRunJob.
The tLoop remains as it is. No changes here.
The retryConnect will use the below code. It uses CHILD_RETURN_CODE to check whether the child job threw error. In case of success, its value is 0. I am tripping the variable when the child job succeeds, so the loop will stop. As you can see, the tLoop shows 2 iterations, it is working as expected now. Thanks.

Quartz Scheduler Administration Page: Information about misfired triggers

We have a Java application (ESB) that uses quartz 2.2.1 and we use it to schedule hundreds of user jobs.
I want to build monitoring page (or scheduler administration page) in my application for our users so that they can see if quartz scheduler is running fine or there is any issue in this component.
Does Quartz provide any monitoring API for this purpose? Based on your experience, can anyone please tell us which data points we should show on this monitoring (or administration) page? Some of the points that I can think of:
Scheduler Status (Running | Paused | Shutdown).
Number of jobs running with "Previous Fire Time" and "Next FireTime" information.
Thread pool implementation and its size.
JDBCJobStore configuration details.
Is there a way to show the information about triggers that were misfired? I don't see any API that provides me information about misfired triggers. Can anyone tell me how to get this information from scheduler?
Any help in this regard shall be appreciated.
You named here many different issues
Scheduler Status(1) :
Scheduler sched =...
sched.isInStandbyMode();
sched.isStarted();
sched.isShutdown();
Number of jobs running with...(2) see here
Scheduler scheduler = new StdSchedulerFactory().getScheduler();
for (String groupName : scheduler.getJobGroupNames()) {
for (JobKey jobKey : scheduler.getJobKeys(GroupMatcher.jobGroupEquals(groupName))) {
String jobName = jobKey.getName();
String jobGroup = jobKey.getGroup();
//get job's trigger
List<Trigger> triggers = (List<Trigger>) scheduler.getTriggersOfJob(jobKey);
Date nextFireTime = triggers.get(0).getNextFireTime();
System.out.println("[jobName] : " + jobName + " [groupName] : "
+ jobGroup + " - " + nextFireTime);
}
}
Thread pool implementation and its size.(3)
Scheduler sched =...
sched.getMetaData().getThreadPoolClass()
sched.getMetaData().getThreadPoolSize()
Regarding "triggers that were misfired", you can use listeners; specifically, the triggerMisfired method of TriggerListener may be helpful for you.

Schedule Node.js job every five minutes

I'm new to node.js. I need node.js to query a mongodb every five mins, get specific data, then using socket.io, allow subscribed web clients to access this data. I already have the socket.io part set up and of course mongo, I just need to know how to have node.js run every five minutes then post to socket.io.
What's the best solution for this?
Thanks
var minutes = 5, the_interval = minutes * 60 * 1000;
setInterval(function() {
console.log("I am doing my 5 minutes check");
// do your stuff here
}, the_interval);
Save that code as node_regular_job.js and run it :)
You can use this package
var cron = require('node-cron');
cron.schedule('*/5 * * * *', () => {
console.log('running a task 5 minutes');
});
This is how you should do if you had some async tasks to manage:
(function schedule() {
background.asyncStuff().then(function() {
console.log('Process finished, waiting 5 minutes');
setTimeout(function() {
console.log('Going to restart');
schedule();
}, 1000 * 60 * 5);
}).catch(err => console.error('error in scheduler', err));
})();
You cannot guarantee exactly when each run will start, but at least the job will never run multiple times concurrently, even if it takes more than 5 minutes to execute.
You may still use setInterval for scheduling an async job, but if you do so, you should at least flag the tasks as "being processed", so that if the job is scheduled a second time before the previous run finishes, your logic can decide not to process the tasks that are still in progress.
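As a rough sketch of that flagging idea, here is a simpler variation that flags the whole job rather than individual tasks: a wrapper that skips a run while the previous one is still in flight. The names skipIfBusy and queryAndEmit are hypothetical, and task is assumed to be any function that returns a Promise.

```javascript
// Wraps an async task so that a new run is skipped while the previous
// run has not finished yet. `task` is a hypothetical function that
// returns a Promise (for example, query MongoDB and emit via socket.io).
function skipIfBusy(task) {
  let busy = false;
  return async function run() {
    if (busy) {
      return 'skipped'; // previous run still in progress
    }
    busy = true;
    try {
      await task();
      return 'done';
    } finally {
      busy = false; // always clear the flag, even if the task throws
    }
  };
}

// Usage sketch (not started here; queryAndEmit is a placeholder):
// setInterval(skipIfBusy(queryAndEmit), 5 * 60 * 1000);
```

The try/finally clears the flag even when the task rejects, so one failed run does not block every later one.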
@alessioalex has the right answer when controlling a job from the code, but others might stumble over here looking for a CLI solution. You can't beat sloth-cli.
Just run, for example, sloth 5 "npm start" to run npm start every 5 minutes.
This project has an example package.json usage.
There are lots of scheduling packages that can help you do this in Node.js. Just choose one of them based on your needs.
Here is a list of packages:
Agenda,
Node-schedule,
Node-cron,
Bree,
Cron,
Bull

Quartz.Net - delay a simple trigger to start

I have a few jobs setup in Quartz to run at set intervals. The problem is though that when the service starts it tries to start all the jobs at once... is there a way to add a delay to each job using the .xml config?
Here are 2 job trigger examples:
<simple>
<name>ProductSaleInTrigger</name>
<group>Jobs</group>
<description>Triggers the ProductSaleIn job</description>
<misfire-instruction>SmartPolicy</misfire-instruction>
<volatile>false</volatile>
<job-name>ProductSaleIn</job-name>
<job-group>Jobs</job-group>
<repeat-count>RepeatIndefinitely</repeat-count>
<repeat-interval>86400000</repeat-interval>
</simple>
<simple>
<name>CustomersOutTrigger</name>
<group>Jobs</group>
<description>Triggers the CustomersOut job</description>
<misfire-instruction>SmartPolicy</misfire-instruction>
<volatile>false</volatile>
<job-name>CustomersOut</job-name>
<job-group>Jobs</job-group>
<repeat-count>RepeatIndefinitely</repeat-count>
<repeat-interval>43200000</repeat-interval>
</simple>
As you see there are 2 triggers, the first repeats every day, the next repeats twice a day.
My issue is that I want either the first or second job to start a few minutes after the other... (because they are both in the end, accessing the same API and I don't want to overload the request)
Is there a repeat-delay or priority property? I can't find any documentation saying so..
I know you are doing this via XML but in code you can set the StartTimeUtc to delay say 30 seconds like this...
trigger.StartTimeUtc = DateTime.UtcNow.AddSeconds(30);
This isn't exactly a perfect answer for your XML file - but via code you can use the StartAt extension method when building your trigger.
/* calculate the next time you want your job to run - in this case top of the next hour */
var hourFromNow = DateTime.UtcNow.AddHours(1);
/* mark the value as UTC; otherwise DateTimeOffset would interpret it as local time */
var topOfNextHour = new DateTime(hourFromNow.Year, hourFromNow.Month, hourFromNow.Day, hourFromNow.Hour, 0, 0, DateTimeKind.Utc);
/* build your trigger and call 'StartAt' */
var trigger = TriggerBuilder.Create()
    .WithIdentity("Delayed Job")
    .WithSimpleSchedule(x => x.WithIntervalInSeconds(60).RepeatForever())
    .StartAt(new DateTimeOffset(topOfNextHour))
    .Build();
You've probably already seen this by now, but it's possible to chain jobs, though it's not supported out of the box.
http://quartznet.sourceforge.net/faq.html#howtochainjobs