Setting Job name for Spring batch job - spring-batch

I have a requirement to run the same job multiple times in a day. For example: the job name is myHourlyJob. This job needs to run every hour, and there is a validation I need to do to check that all 24 jobs for a day ran successfully. My requirement is to add a dynamic parameter to the job name so I can identify each run by name.
For example :
the job running on 1st Jan at 9am, the job name should be: myHourlyJob_20220101_9.
the job running on 1st Jan at 2pm, the job name should be: myHourlyJob_20220101_14.
In general, job name = myHourlyJob_YYYYMMDD_jobRunHour
Is it possible to define the job name like this in Spring Batch?

I believe the job name should not be changed; it is the same job, so it should have the same name. What you are describing is exactly the concept of a job instance in Spring Batch: one instance for each hour, in your case. It is the same job doing the same thing over and over again, so why should it have 24 names a day?
I suggest you keep a single name for your job and launch a different instance every hour. In your case, the run time (the hour of the day) should be passed as an identifying job parameter, giving you 24 job instances of the same job per day. Please refer to the reference documentation for more details about these concepts.
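For illustration, here is a minimal sketch of launching one instance per hour with an identifying parameter (the jobLauncher and myHourlyJob references, and the runHour parameter name, are assumptions for the example):

// Launch the same job once per hour; the identifying "runHour" parameter
// makes each launch a distinct JobInstance of the single job "myHourlyJob".
JobParameters params = new JobParametersBuilder()
        .addString("runHour", "20220101_09", true) // true = identifying parameter
        .toJobParameters();
jobLauncher.run(myHourlyJob, params);

Your daily validation then becomes a check for 24 completed instances of that one job name.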

Related

How to concatenate kubernetes jobs?

I want to create a chained process: when a parent job ends, a child job (or multiple children) starts. The main problem is that I want only the current job's pods deployed at any time.
Something like: job 1 starts --> job 1 finishes --> job 2 starts --> job 2 finishes --> job 3 starts --> job 3 finishes.
How can I do this? I thought I could create job 1, job 2, and job 3 in disabled mode at the same time and enable each one when its parent job ends (maybe with a service?).
I recently read about a tool called Argo Workflows, but I am not sure if it will achieve the chaining effect I am looking for.
Yes, a workflow engine like Argo Workflows is the way to go. You can check this example of how to execute different tasks one by one using steps here.
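For illustration, a minimal Argo Workflows sketch of sequential steps might look like this (the image and all names are placeholders):

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: job-chain-
spec:
  entrypoint: chain
  templates:
  - name: chain
    steps:                  # each outer list item starts only after the previous one finishes
    - - name: job1
        template: run-step
        arguments:
          parameters: [{name: msg, value: job1}]
    - - name: job2
        template: run-step
        arguments:
          parameters: [{name: msg, value: job2}]
    - - name: job3
        template: run-step
        arguments:
          parameters: [{name: msg, value: job3}]
  - name: run-step
    inputs:
      parameters:
      - name: msg
    container:
      image: busybox
      command: [sh, -c]
      args: ["echo {{inputs.parameters.msg}}"]

Because each step's pod is created only when that step runs, only the current job's pods are deployed at any time.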

Getting kubernetes cronjob history

I have a CronJob that runs every 15 minutes. Say it has been running for the last year.
Is it possible to get the complete history using the Kubernetes API? Or is it possible to control the maximum history that is stored?
Also, can we get the status (success/failure) of each run along with the total completion time?
Does the Pod die after completing the Job?
A CronJob creates a Job object for each execution.
For regular Jobs you can configure .spec.ttlSecondsAfterFinished (along with the TTLAfterFinished feature gate) to control how long finished Job instances are retained.
For CronJob you can specify the .spec.successfulJobsHistoryLimit to configure the number of managed Job instances to be retained.
You can get the desired information from these objects.
The pod does not die when the job completes; it is the other way around: if the pod terminates without an error, the job is considered completed.
The .spec.successfulJobsHistoryLimit and .spec.failedJobsHistoryLimit fields are optional.
These fields specify how many completed and failed jobs should be kept.
By default, they are set to 3 and 1 respectively.
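For illustration, a minimal CronJob manifest with these retention fields might look like this (apiVersion batch/v1 assumes Kubernetes 1.21+; all names and values are illustrative):

apiVersion: batch/v1
kind: CronJob
metadata:
  name: every-15-min
spec:
  schedule: "*/15 * * * *"
  successfulJobsHistoryLimit: 10   # keep the last 10 successful Jobs
  failedJobsHistoryLimit: 5        # keep the last 5 failed Jobs
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: task
            image: busybox
            command: [sh, -c, "echo run"]

The retained Job objects carry the status and timing you are after: kubectl get jobs shows completions and duration, and each Job's status records startTime and completionTime.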

configuring multiple versions of job in spring batch

Spring Batch seems to lack metadata for the job definition in the database.
To create a job instance in the database, the only things it considers are jobName and jobParameters: "JobInstance createJobInstance(String jobName, JobParameters jobParameters);"
But the object model of Job is rich enough to consider steps and listeners. So if I create a new version of an existing job by adding a few additional steps, Spring Batch does not distinguish it from the previous version. Hence, if I ran the previous version today and then run the updated version, Spring Batch does not run the updated version, because it sees that the previous run was successful. At present it seems like the version number of the job should be part of the name. Is this a correct understanding?
You are correct that the framework identifies each job instance by a unique combination of job name and (identifying) job parameters.
In general, if a job fails, you should be able to re-run with the same parameters to restart the failed instance. However, you cannot restart a completed instance. From the documentation:
JobInstance can be restarted multiple times in case of execution failure, and its lifecycle ends with the first successful execution. Trying to execute an existing JobInstance that has already completed successfully will result in an error. An error will also be raised for an attempt to restart a failed JobInstance if the Job is not restartable.
So you're right that the same job name and identifying parameters cannot be run multiple times; the framework design prevents this, regardless of what business steps the job performs. Ignoring what your job actually does, here's how it would work:
1) jobName=myJob, parm1=foo, parm2=bar -> runs and fails (assume some exception)
2) jobName=myJob, parm1=foo, parm2=bar -> restarts the failed instance and completes
3) jobName=myJob, parm1=foo, parm2=bar -> fails on startup (as expected)
4) jobName=myJob, parm1=foobar, parm2=bar -> new params, runs and completes
The "best practices" we use are the following:
Each job instance (usually defined by the run date or the filename we are processing) must define a unique set of parameters (otherwise it will fail, per the framework design)
Jobs that run multiple times a day but just scan a work table or something similar use an incrementer to pass an integer parameter, which we increase by 1 upon each successful completion (see the sketch after this list)
Any failed job instances must be either restarted or abandoned before pushing code changes that affect how the job will function
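For illustration of the incrementer approach, here is a minimal sketch using Spring Batch's RunIdIncrementer (the Spring Batch 5 JobBuilder API is assumed; bean and step names are illustrative):

// RunIdIncrementer adds an identifying "run.id" parameter and increments it
// on each launch, so every run becomes a new JobInstance of the same job.
@Bean
public Job workTableJob(JobRepository jobRepository, Step scanWorkTableStep) {
    return new JobBuilder("workTableJob", jobRepository)
            .incrementer(new RunIdIncrementer())
            .start(scanWorkTableStep)
            .build();
}

The incrementer only takes effect when the job is launched through a mechanism that applies it, e.g. JobOperator.startNextInstance("workTableJob") or Spring Boot's default job runner.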

Spring and Quartz integration in cluster mode is not overwriting existing jobs

I am using Spring 3 and Quartz 1.8.5 to schedule jobs in clustered mode. I have set overwriteExistingJobs=true in Spring's scheduler configuration.
There is a requirement for me to create dynamic Quartz jobs programmatically, apart from the jobs that are part of the configuration. Everything works fine until I restart the server. At that point, there is a problem with overwriteExistingJobs=true.
Say I have a dynamic job created to execute every two minutes, and I stop the server and start it again after ten minutes: the job executes five times as soon as the server starts. But a job that is part of the Spring configuration, like the one given in the Spring documentation, is overwritten when the server restarts.
My observation has been that for jobs configured in the Spring configuration file and added to the org.springframework.scheduling.quartz.SchedulerFactoryBean, the PREV_FIRE_TIME column in the QRTZ_TRIGGERS table gets updated to '-1', but for dynamically created jobs it is not overwritten.
The fix is as follows:
a) I have CronTriggers associated with the dynamic jobs, so what I did was provide a misfire instruction:
// Build the job and its cron trigger programmatically (Quartz 1.x API)
JobDetail jobDetail = new JobDetail(job.getDescription(), job.getName(), job.getClass());
CronTrigger crTrigger = new CronTrigger("cronTrigger", job.getName(), cronExpression);
crTrigger.setStartTime(firstFireTime);
// On a misfire, skip the missed executions instead of firing them all at once
crTrigger.setMisfireInstruction(CronTrigger.MISFIRE_INSTRUCTION_DO_NOTHING);
scheduler.scheduleJob(jobDetail, crTrigger);
b) The misfire threshold was pretty high (6000000 ms). So what I did was reduce the misfire threshold, and it worked like a charm.
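For reference, the threshold is the org.quartz.jobStore.misfireThreshold property (in milliseconds); a minimal sketch of setting it through Spring's factory bean (the chosen value is illustrative):

SchedulerFactoryBean factoryBean = new SchedulerFactoryBean();
Properties quartzProps = new Properties();
// Triggers that are more than 60 seconds late are treated as misfired
quartzProps.setProperty("org.quartz.jobStore.misfireThreshold", "60000");
factoryBean.setQuartzProperties(quartzProps);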

Problem in submitting jobs in oracle

A job has been submitted and an entry for it is present in dba_jobs, but the job never enters the running state, so there is no entry for it in dba_jobs_running. The parameter 'JOB_QUEUE_PROCESS' has the value 10, and there are no jobs in the running state. Please suggest how to solve this problem.
SELECT NEXT_DATE, NEXT_SEC, BROKEN, FAILURES, WHAT
FROM DBA_JOBS
WHERE JOB = :JOB_ID
What does that return? A BROKEN job won't kick off, and if the NEXT_DATE/NEXT_SEC is in the past, it won't kick off either.
I hope you labeled that database parameter correctly, i.e. 'JOB_QUEUE_PROCESSES=10'.
This is typically why a job won't run.
Also check that the user/schema running the job is correct.
An alternative is to use a different scheduling tool to run the job (e.g. cron on Linux)
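If the query shows the job is BROKEN, a sketch of un-breaking it and forcing a run with DBMS_JOB (reusing the same :JOB_ID placeholder):

-- Mark the job as not broken so the job queue picks it up again
EXEC DBMS_JOB.BROKEN(:JOB_ID, FALSE);
COMMIT;
-- Or force an immediate run in the current session
EXEC DBMS_JOB.RUN(:JOB_ID);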