Microsoft HPC Default Job Email Settings / Notifications - hpc

I've got a Microsoft HPC cluster setup and we're using an engineering application to submit jobs to the compute cluster queue. Users remote desktop to the head-node of the cluster to submit jobs to avoid file-transfer delays across our WAN, this leads to completed or failed jobs sitting for review for a long time.
Unfortunately we can't use the HPC Job Manager to submit jobs to the queue, we have to use the engineering application submission tool. That application doesn't have an option to send an email on the job status (completed, failed etc.)
In the HPC Job Manager > New Job there are options for "Send a notification when this job" starts / completes etc. which would be great if we could use that to submit.
All jobs submitted use the Default job template but on that template I don't options to edit or add email settings to it.
Is there an option I'm missing or is this even a possible configuration?

Related

Datastage Director issue while scheduling

I am trying to schedule datastage job using datastage director client 11.7.
But facing issue:
Error adding to schedule:sh:/usr/bin/at:permission denied
Make sure that the datastage engine user is able to schedule cron and at jobs. cron is used to schedule recurring jobs wheras at is used to schedule "one-shots". In some environments, these services are not activated by default anymore and must be enabled.
Looks like dsadm is unable to perform scheduling.
DataStage uses Linux internal scheduling utility such as cron/at to schedule the jobs, check whether dsadm has the privilege to do it.

Prevent Cron jobs from Single Point of Failure

I have many cron jobs running on server, which include like DB Backup (daily), sending Notification to users(hourly).
Currently i have 5 API Servers, and cron jobs is setup on one of the API Server.
I want to prevent Cron jobs from Single Point of failure. What if the machine crashed in which cron jobs has been setup.
Any suggestions please.

A Workload Scheduler's step doesn't proceed and remains queued status

I created a simple process by Applicaiton Lab interface in Bluemix Workload Scheduler. I ran my process, but the step didn't proceed and remained in queued status.
How can I proceed the step?
I executed the process by "Run now". The process doesn't have triggers
The step remains "Queued status".
The Process information
There is only one step. The step is "ping www.ibm.com"
Process doesn't have trigger. It is an on-demand process.
There might be a problem with the agent as I can successfully run a simple workload process without any issues. If you are using the Workload Automation Agent that is created for you then you will need to open a support ticket to have the Workload team look at that agent.
reviewing your question I think that a process submitted to the Workload Scheduler service should be a process which will complete: a ping command like the one you are trying to submit, will never complete if not 'killed' using CTRL+C (or called with [-c count] option)

Automatically handling job and trigger changes in Quartz.net using AdoJobStore

I am writing a Quartz.net application using AdoJobStore to allow automated report scheduling.
In my scenario, users will define custom reports to be scheduled in one application which will add the required jobs and triggers to the database (using the AdoJobStore routines).
A separate Quartz.net application then reads these settings from the database (also using the AdoJobStore routines) and emails the reports as necessary.
Is there a way to get the quartz scheduler to automatically start scheduling new jobs and triggers that have been added to the database after the scheduler last started, or will I need to write a routine that periodically checks for database changes, and if found restart the Quartz scheduler instance?
You can handle all of this directly with Quartz.Net. Here's one way to do it:
Set up a Quartz.Net server as a windows service. The distribution comes with a Windows Service implementation, or you can build your own. Enable remoting on the quartz server.
From the application where users will configure their reports and schedules, connect to the Quartz.Net server using the Quartz.Net library and directly schedule the jobs and triggers as necessary.
You'll probably want to store the user's report configuration elsewhere in case the user wants to look at it later or change/copy it. Store this data somewhere else other than Quartz.Net. If the user changes the stored report configuration, connect again to the Quartz.Net server and update/reschedule the jobs using the Quartz.Net library. Alternatively, you could create a job that runs on the Quartz.Net server and periodically checks whether there have been any report configuration changes.
You'll have to create the actual jobs that will generate your reports in a generic enough fashion so that any report can be built by passing in data to job via the JobDataMap, instead of having to create a job for each report.

Open Source Job Scheduler with REST API

Are there any open source Job Scheduler with REST API for commercial use which will support features like:
Tree like Job dependency
Hold & Release
Rerun failed steps
Parallelism
Help would be appreciated :)
NOTE: we are looking for open source alternative for TWS,Control-M,AutoSys.
JobScheduler would seem to meet your requirements:
Open Source see: Open Source and Commercial Licenses
Rest API see: Web Service Integration
Parallelism see: Organisation of Jobs and Job Chains
I think that these areas are also covered (I downloaded and trialled the application): See here
Tree like Job dependency
Hold & Release
Rerun failed steps
I'm not affiliated with SOS GmbH
ProActive Scheduler is an open source job scheduler.
It is part of OW2 organization
It is written in Java so it comes with a Java and a REST API
It provides workflows that are set of tasks with dependencies and more (loop,replicate, branch), upon failures you can control if the task should be cancelled or restarted
Parallelism and distribution is at the heart of it, with features like for instance
Commercial Support is provided by Activeeon, the company behind ProActive (full disclosure: I work for Activeeon).
You might be interested in DKron
Dkron is a system service that runs scheduled jobs at given intervals or times, just like the cron unix service but distributed in several machines in a cluster. If a machine fails (the leader), a follower will take over and keep running the scheduled jobs without human intervention. Dkron is Open Source and freely available.
http://dkron.io/
While not open source, a more cost effective job scheduler solution with REST API and support for the features listed is ActiveBatch workload automation and job scheduling. I do work for the company (being up-front) but our customers love how they can easily extend their automated processes to connect to any application, any service, any server with our REST API adapter. You can get more information here: https://www.advsyscon.com/en-us/activebatch/rest-api-adapter