Is it possible to register jobs from existing batch metadata tables - spring-batch

We're trying to build a UI screen that can trigger Spring Batch jobs and that uses our existing database of job execution records.
I was able to get all existing job names via jobExplorer, but I now get an error on jobRegistry.getJob(jobName). It seems the jobs are not registered in jobRegistry.
The actual job configuration lives in another application. I am trying to trigger the jobs from a separate application that exists solely for batch-related functions (runJob, stopJob, view executions, etc.).
EDIT:
Is it possible to register the jobs in JobRegistry from the existing batch metadata tables? What I mean is that the job and step beans would be recreated from the existing metadata tables.
As a workaround, we retrieve execution records from the metadata tables, but the runJob and stopJob functions have to be redirected to endpoints exposed by the batch job applications.
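The redirect part of that workaround might look roughly like the sketch below, which only builds the HTTP request for the application that owns the job. The "/jobs/{jobName}/run" path and the class name are hypothetical; the real endpoints are whatever the batch applications expose, and the sketch assumes job names are URL-safe.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class RemoteJobClient {
    private RemoteJobClient() {}

    // Builds a POST to a hypothetical "/jobs/{jobName}/run" endpoint
    // on the application that actually defines the job beans.
    static HttpRequest runRequest(URI appBaseUrl, String jobName) {
        // Assumption: jobName contains only URL-safe characters.
        URI target = appBaseUrl.resolve("/jobs/" + jobName + "/run");
        return HttpRequest.newBuilder(target)
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }
}
```

The request can then be sent with java.net.http.HttpClient; a stopJob call would follow the same pattern against a different (equally hypothetical) path.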

Related

ETL Spring batch, Spring cloud data flow (SCDF)

We have a use case where data can be sourced from different sources (DB, file, etc.), transformed, and stored to various sinks (Cassandra, DB or file). We would want the ability to split the jobs and do parallel loads - looks like Spring Batch RemoteChunking provides that ability.
I am new to SCDF and Spring Batch and am wondering what the best way to use them is.
Is there a way to provide configuration for these jobs (source connection details, table and query), and can this be done through a UI (SCDF Server UI?). Is it possible to compose the flow?
This will run on Kubernetes, and our applications are deployed through a Jenkins pipeline.
We would want the ability to split the jobs and do parallel loads - looks like Spring Batch RemoteChunking provides that ability.
I don't think you need remote chunking. You can instead run jobs in parallel, where each job handles one ETL process (for a particular file or database table).
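As a rough illustration of "one independent job per source, run in parallel", here is a sketch using plain java.util.concurrent. The EtlJob interface and the returned row counts are hypothetical stand-ins; in practice each unit would be a Spring Batch job or an SCDF task launch.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical unit of work: one ETL process for one file or table.
interface EtlJob {
    String source();
    int run() throws Exception; // number of rows loaded
}

final class ParallelEtl {
    private ParallelEtl() {}

    // Runs every job on its own pool thread and collects rows loaded per source.
    static Map<String, Integer> runAll(List<EtlJob> jobs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<Integer>> pending = new LinkedHashMap<>();
            for (EtlJob job : jobs) {
                Callable<Integer> task = job::run;
                pending.put(job.source(), pool.submit(task));
            }
            Map<String, Integer> loaded = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Integer>> e : pending.entrySet()) {
                loaded.put(e.getKey(), e.getValue().get()); // waits for that job
            }
            return loaded;
        } catch (InterruptedException | ExecutionException ex) {
            throw new IllegalStateException("ETL job failed or was interrupted", ex);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the jobs share nothing, a failure in one source does not require coordinating workers the way remote chunking does; here the first failure simply surfaces as an exception.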
Is there a way to provide configuration for these jobs (source connection details, table and query)
Yes, those can be configured like any regular Spring Batch job is configured.
and can this be done through a UI (SCDF Server UI)?
If you make them configurable through properties of your job, you can specify them through the UI when you run the task.
Is it possible to compose the flow?
Yes, this is possible with a Composed Task.

Spring batch jobOperator - how are multiple concurrent instances of a job from the same XML file controlled?

When we run multiple concurrent jobs with different parameters, how can we control (stop, restart) the appropriate jobs? Our internal code provides the jobExecution object, but under the covers the jobOperator uses the job name to get the job instance.
In our case all of the jobs are from "do-stuff.xml" (okay, it's sanitized and not very original). After looking at the spring-batch source code, our concern is that if more than one job is running and we stop a job, it will take the most recently submitted job and stop that one.
The JobOperator will allow you to fetch all running executions of the job using getRunningExecutions(String jobName). You should be able to iterate over that list to find the one you want. Then, just call stop(long executionId) on the one you want.
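The lookup described above could be sketched roughly as follows. To keep the example compilable without Spring on the classpath, RunningJobs is a hypothetical stand-in interface that mirrors three JobOperator methods (getRunningExecutions, getParameters, stop); matching on a fragment of the parameter string is also just an assumption about how you tell the executions apart.

```java
import java.util.Set;

// Hypothetical stand-in for the JobOperator methods used here. The real
// JobOperator methods declare checked exceptions, so an adapter around an
// injected JobOperator would need to handle or wrap those.
interface RunningJobs {
    Set<Long> getRunningExecutions(String jobName);
    String getParameters(long executionId);
    boolean stop(long executionId);
}

final class ExecutionStopper {
    private ExecutionStopper() {}

    // Stops the first running execution of jobName whose parameter string
    // contains paramFragment. Returns the stopped execution id, or -1 if
    // no running execution matched.
    static long stopMatching(RunningJobs ops, String jobName, String paramFragment) {
        for (Long id : ops.getRunningExecutions(jobName)) {
            if (ops.getParameters(id).contains(paramFragment) && ops.stop(id)) {
                return id;
            }
        }
        return -1L;
    }
}
```

Since the stop call takes an execution id rather than a job name, only the chosen execution is signalled and other concurrent runs of the same job keep going.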
Alternatively, we've also implemented listeners (at both step and chunk level) that check an outage status table. When we want to impose a system-wide outage, we add the outage there and have our listener throw an exception to bring our jobs down. Once the outage is lifted, all "failed" executions can be restarted.
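The core of such a listener might look something like this sketch. OutageGuard and its outageActive check are hypothetical; in a real job the beforeChunk logic would sit inside a Spring Batch chunk or step listener, and the check would query the outage status table.

```java
import java.util.function.BooleanSupplier;

final class OutageGuard {
    // Hypothetical check, e.g. a query against the outage status table.
    private final BooleanSupplier outageActive;

    OutageGuard(BooleanSupplier outageActive) {
        this.outageActive = outageActive;
    }

    // Intended to be called before each chunk: throwing here fails the
    // execution, which leaves it eligible for restart after the outage.
    void beforeChunk() {
        if (outageActive.getAsBoolean()) {
            throw new IllegalStateException("System-wide outage in effect; failing execution for later restart");
        }
    }
}
```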

Automatically handling job and trigger changes in Quartz.net using AdoJobStore

I am writing a Quartz.net application using AdoJobStore to allow automated report scheduling.
In my scenario, users will define custom reports to be scheduled in one application which will add the required jobs and triggers to the database (using the AdoJobStore routines).
A separate Quartz.net application then reads these settings from the database (also using the AdoJobStore routines) and emails the reports as necessary.
Is there a way to get the quartz scheduler to automatically start scheduling new jobs and triggers that have been added to the database after the scheduler last started, or will I need to write a routine that periodically checks for database changes, and if found restart the Quartz scheduler instance?
You can handle all of this directly with Quartz.Net. Here's one way to do it:
Set up a Quartz.Net server as a Windows service. The distribution comes with a Windows service implementation, or you can build your own. Enable remoting on the Quartz server.
From the application where users will configure their reports and schedules, connect to the Quartz.Net server using the Quartz.Net library and directly schedule the jobs and triggers as necessary.
You'll probably want to store the user's report configuration somewhere other than Quartz.Net, in case the user wants to look at it later or change/copy it. If the user changes the stored report configuration, connect again to the Quartz.Net server and update/reschedule the jobs using the Quartz.Net library. Alternatively, you could create a job that runs on the Quartz.Net server and periodically checks whether there have been any report configuration changes.
You'll have to write the jobs that generate your reports generically enough that any report can be built by passing data to the job via the JobDataMap, instead of having to create a separate job for each report.

Disable Spring Batch Jobs

I was wondering if there's a way to enable/disable all of the defined spring-batch jobs programmatically? For instance, when I deploy my app the database is empty, and at that moment my jobs are running and throwing exceptions. I would like to have the jobs disabled until some data is populated in the database (until certain tables appear). Is this possible?
Have you taken a look at this question? How Spring Boot run batch jobs
You can disable the jobs at startup by adding spring.batch.job.enabled=false to the application.properties file.
Then you can use the JobLauncher to run the job when your database is initialised.
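One way to wire the "launch later" part, as a minimal sketch: a small gate that is polled periodically and fires the job exactly once when the database looks ready. DeferredJobStarter, databaseReady and launchJob are all hypothetical names; launchJob would typically wrap a JobLauncher call, and databaseReady would check for the tables you are waiting on.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

final class DeferredJobStarter {
    private final BooleanSupplier databaseReady; // hypothetical readiness check
    private final Runnable launchJob;            // hypothetical hook around the job launch
    private final AtomicBoolean launched = new AtomicBoolean(false);

    DeferredJobStarter(BooleanSupplier databaseReady, Runnable launchJob) {
        this.databaseReady = databaseReady;
        this.launchJob = launchJob;
    }

    // Call periodically (for example from a scheduled method). Launches at
    // most once, and only after the readiness check passes.
    boolean tryLaunch() {
        if (!databaseReady.getAsBoolean()) {
            return false;
        }
        if (!launched.compareAndSet(false, true)) {
            return false; // already launched by an earlier call
        }
        launchJob.run();
        return true;
    }
}
```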

Quartz Scheduler using database

I am using Quartz to schedule cron jobs in my web application. I am using an Oracle database to store jobs and related info. When I add jobs in the database, I need to restart the server/application (Tomcat server) for these new jobs to get scheduled. How can I add jobs in the database and make them work without restarting the server?
I assume you mean you are using JDBCJobStore? In that case it is not ideal to make direct changes in the database tables storing the job data. However, I suppose you could set up a separate job that runs every X minutes / hours, checks whether there are new jobs in the database (that need to be scheduled), and schedule them as usual.
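The periodic check could be sketched along these lines. NewJobPoller is hypothetical: loadJobIds stands for a query against your own table of requested jobs (not the Quartz tables), and schedule stands for whatever code registers the job through the Scheduler API.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Consumer;
import java.util.function.Supplier;

final class NewJobPoller {
    private final Supplier<Set<String>> loadJobIds; // hypothetical: reads requested job ids
    private final Consumer<String> schedule;        // hypothetical: schedules one job by id
    private final Set<String> known = new HashSet<>();

    NewJobPoller(Supplier<Set<String>> loadJobIds, Consumer<String> schedule) {
        this.loadJobIds = loadJobIds;
        this.schedule = schedule;
    }

    // Run every X minutes: schedules ids not seen before and returns them.
    synchronized Set<String> syncOnce() {
        Set<String> fresh = new TreeSet<>(loadJobIds.get());
        fresh.removeAll(known);
        for (String id : fresh) {
            schedule.accept(id);
            known.add(id);
        }
        return fresh;
    }
}
```

Note that the known set lives in memory, so after a restart every id is seen as new again; the schedule hook would need to tolerate jobs that already exist in the job store.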
Add jobs via the Scheduler API.
http://www.quartz-scheduler.org/docs/best_practices.html