If I have a web application, with an application context that loads everything for my webapp and all my job configuration files, and if a job has a simple ItemReader without scope="step", the reader is a singleton, right? So if I launch my job twice from a controller via a SimpleJobLauncher, I will use the same bean, right? Unless I put scope="step", in order to have one bean per job execution?
On the other hand, if I launch the job from a CommandLineJobRunner, I will have two distinct application contexts, so two different beans, right?
Are my assertions valid?
Thanks
Yes, that is correct. Basically, every bean instance in a Spring context is a singleton by default.
However, most readers and writers have state. For instance, a FlatFileItemReader can only run once; after that it points to the end of the file and its "close" method has been called. Therefore, if you simply start the job again, it will not work, since the FlatFileItemReader is closed.
For such cases, you will need to define them with scope="step".
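For completeness, here is a minimal sketch of the step-scoped variant in Java config (the inputFile job parameter and class name are illustrative, not from the question). With @StepScope, each step execution gets its own freshly opened reader, so repeated launches don't share state:

```java
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.mapping.PassThroughLineMapper;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;

@Configuration
public class ReaderConfig {

    // Step-scoped: a fresh reader instance is created (and opened) for each
    // step execution, so concurrent or repeated launches don't share state.
    @Bean
    @StepScope
    public FlatFileItemReader<String> reader(
            @Value("#{jobParameters['inputFile']}") String inputFile) {
        FlatFileItemReader<String> reader = new FlatFileItemReader<>();
        reader.setResource(new FileSystemResource(inputFile));
        reader.setLineMapper(new PassThroughLineMapper());
        return reader;
    }
}
```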
I am starting with Spring Batch, and I have a code structure with a list of objects; for each object I need to call an external ant process. I understand I can use the SystemCommandTasklet for this, however I am not sure how to set it up for multiple values (so far it seems it can only make one execution?). Thanks in advance
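A minimal sketch of one way to do this, assuming the values arrive as a plain list of ant targets (the class name and command line are illustrative): a custom Tasklet that launches one external process per value, instead of configuring a single SystemCommandTasklet per command:

```java
import java.util.List;

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;

// Hypothetical tasklet: runs the external ant process once per value.
public class MultiCommandTasklet implements Tasklet {

    private final List<String> targets; // e.g. one ant target per object

    public MultiCommandTasklet(List<String> targets) {
        this.targets = targets;
    }

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext)
            throws Exception {
        for (String target : targets) {
            // Launch the external process for this value and wait for it.
            Process process = new ProcessBuilder("ant", target).inheritIO().start();
            if (process.waitFor() != 0) {
                throw new IllegalStateException("ant failed for target: " + target);
            }
        }
        return RepeatStatus.FINISHED;
    }
}
```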
I have a requirement where I have to prepare a file using one job, and another job, which runs once a day, sends the file to an external system and deletes or moves it from the location. When this second job tries to delete or move the file, it can't access it.
I tried setting writable to true when the file is created, running the jobs at separate times (one job at a time), and adding "delete" as a step to the same job. Nothing worked.
I am using file.delete(). Also tried Files.deleteIfExists().
I suspect the first job is not assigning proper permissions, but I don't know a way to set permissions in Spring Batch.
Are these jobs run by the same user, i.e. with the same user and permissions?
Also, what is the actual error message? Does it say permission denied? If so, then it is likely an OS restriction, not a Spring Batch/Java limitation.
An easier solution would be to just add a step to the first job that sends the files as part of that job, and drop the job that only transfers the files.
Answering my own question 😀. Hope it helps someone.
The issue was that the last ItemWriter was holding on to the resource because I was using a composite writer. When using CompositeItemWriter, the beforeStep and afterStep methods of the delegates are "hidden"; you have to call them explicitly. I chose the approach of writing a custom writer which explicitly calls writer.close().
Adding an afterStep method and calling super.close() should also work, though I have not tried that out.
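For illustration, a minimal sketch of such a wrapper (assuming the pre-Spring Batch 5, List-based ItemWriter signature; the class name is illustrative):

```java
import java.util.List;

import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemStreamWriter;
import org.springframework.batch.item.file.FlatFileItemWriter;

// Wrapper that delegates all stream callbacks and makes sure close() actually
// reaches the file-based delegate, releasing its file handle.
public class ClosingDelegateWriter<T> implements ItemStreamWriter<T> {

    private final FlatFileItemWriter<T> delegate;

    public ClosingDelegateWriter(FlatFileItemWriter<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void open(ExecutionContext executionContext) {
        delegate.open(executionContext);
    }

    @Override
    public void update(ExecutionContext executionContext) {
        delegate.update(executionContext);
    }

    @Override
    public void write(List<? extends T> items) throws Exception {
        delegate.write(items);
    }

    @Override
    public void close() {
        delegate.close(); // explicitly release the underlying file resource
    }
}
```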
Hi
I am a novice in the Spring Batch world, and over the last few days I've spent time watching Michael Minella's YouTube videos, reading some documentation, and successfully running some demo projects I found on the internet. I think Spring Batch is a hot candidate for our needs. But here is our story.
I am working in a company that developed its own scheduling and batch framework, more than a decade ago, for its business department. The framework is capable of running DB stored procedures, DB functions and dynamic SQL. Needless to say, it is very challenging to maintain, since too many people with various development skills did the coding and they don't work here anymore. Our framework can run jobs and steps sequentially as well as asynchronously (as Spring Batch can). We also have a Job Repository where we store whole job definitions (users create new jobs via a GUI) and job instances with their context (in case the server goes down; when the server is back up, it will resume the running job).
My questions are following:
Can we create new Spring Batch jobs dynamically (either via XML or code) and, via standard SB interfaces, store them in the JobRepository DB?
Today, at certain times of day, we have up to a hundred job executions running simultaneously. They also reuse a connection pool to the DB. Older Spring Batch reference documentation states that JobFactory will create a fresh ApplicationContext for each job execution. How can we achieve connection pool reuse if this is the case in Spring Batch?
I know there is support for continuing failed steps, but what if the server/app goes down? Will I be able to restart my app and retrieve the job instance with its context from the JobRepository in order to continue from the failed step?
Can "step1.1" in "job1" be dependent on "step2.1" from "job2" finishing within the last hour? In such scenarios, could I use a step listener on "step1.1" to accomplish this?
Kind regards
Toto
You have a lot of material here to cover, so let me respond one point at a time:
Can we create new Spring Batch jobs dynamically (either via XML or code) and, via standard SB interfaces, store them in the JobRepository DB?
Can you generate a job definition dynamically? Yes. We do it in Spring XD with regard to the job orchestration piece (the composed job DSL is used to generate an XML file, for example).
Does Spring Batch provide facilities to do this? No. You'd have to code it yourself.
Also note that you'd have to store the definition in your own table (the schema defined by Spring Batch doesn't have a table for this).
Today, at certain times of day, we have up to a hundred job executions running simultaneously. They also reuse a connection pool to the DB. Older Spring Batch reference documentation states that JobFactory will create a fresh ApplicationContext for each job execution. How can we achieve connection pool reuse if this is the case in Spring Batch?
You can use parent/child context configurations to reuse beans including a DataSource. Define the DataSource in the parent and then the jobs that depend on it in child contexts.
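A minimal sketch of that layout (the XML file names are illustrative): the pooled DataSource lives in the parent context, and each job definition is loaded in a child context that can see the parent's beans.

```java
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class JobContextBootstrap {

    public static void main(String[] args) {
        // Parent context: shared infrastructure, including the pooled DataSource.
        ClassPathXmlApplicationContext parent =
                new ClassPathXmlApplicationContext("shared-infrastructure.xml");

        // Child context: job definitions; beans here can reference the
        // DataSource (and any other bean) defined in the parent.
        ClassPathXmlApplicationContext jobContext =
                new ClassPathXmlApplicationContext(
                        new String[] { "jobs/job1.xml" }, parent);

        jobContext.close();
        parent.close();
    }
}
```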
I know there is support for continuing failed steps, but what if the server/app goes down? Will I be able to restart my app and retrieve the job instance with its context from the JobRepository in order to continue from the failed step?
This is really an orchestration concern. Spring Batch, by design, does not take the orchestration of jobs into consideration. This allows you to orchestrate them however you want.
The way I'd recommend handling this is via Spring XD or (depending on your timelines) Spring Cloud Data Flow. These tools provide orchestration capabilities, including the redeployment of a job if it goes down. That being said, they won't restart a job that was running when it failed, because that typically requires some form of human decision based on the use case. However, Spring XD currently has (and Spring Cloud Data Flow will have) the capabilities to implement something like this in a pretty straightforward way.
Can a "step1.1" in "job1" be dependent on "step 2.1" from "job2" finishing within last hour? In such scenarios I may be using a step listener on "step1.1" to accomplish this?
In cases like this, I'd start to question how your jobs are configured. You can use a JobExecutionDecider to decide whether a step should be executed, if that still makes sense.
All things considered, while you can accomplish most of what you're looking for with Spring Batch, using something like Spring XD or Spring Cloud Data Flow will make your life a lot easier.
Can we create new Spring Batch jobs dynamically (either via XML or code) and, via standard SB interfaces, store them in the JobRepository DB?
It is easy to use StepBuilderFactory, FlowBuilder etc. to programmatically build the Spring Batch artifacts, as sketched below. You'll probably want to back those artifacts with Spring beans (to get nice facilities like the step/job Spring scopes, injection and so on), and for that you can use prototype, execution-scoped and job-scoped beans, or even use facilities such as BeanDefinitionBuilder to dynamically create beans.
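A minimal sketch, assuming Spring Batch 4's builder factories and an already-configured JobRepository (class, job and step names are illustrative):

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.repeat.RepeatStatus;

public class DynamicJobAssembler {

    private final JobBuilderFactory jobs;
    private final StepBuilderFactory steps;

    public DynamicJobAssembler(JobBuilderFactory jobs, StepBuilderFactory steps) {
        this.jobs = jobs;
        this.steps = steps;
    }

    // Build a job from a name decided at runtime rather than a static definition.
    public Job buildJob(String jobName) {
        Step step = steps.get(jobName + ".step1")
                .tasklet((contribution, chunkContext) -> {
                    // dynamically generated work would go here
                    return RepeatStatus.FINISHED;
                })
                .build();
        return jobs.get(jobName).start(step).build();
    }
}
```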
Older Spring Batch reference documentation states that JobFactory will create a fresh ApplicationContext for each job execution. How can we achieve connection pool reuse if this is the case in Spring Batch?
The GenericApplicationContextFactory creates a child application context. You can have the "global" beans in the parent application context.
I know there is support for continuing failed steps, but what if the server/app goes down? Will I be able to restart my app and retrieve the job instance with its context from the JobRepository in order to continue from the failed step?
Yes, but not that easily.
Can a "step1.1" in "job1" be dependent on "step 2.1" from "job2" finishing within last hour? In such scenarios I may be using a step listener on "step1.1" to accomplish this?
A JobExecutionDecider will likely be the best option there.
I need to execute a sequence of steps a specific number of times. Any pointers on the best way to do this in Spring Batch? I am able to implement executing a single step 'x' times, but my requirement is to execute a set of steps, based on a condition, 'x' times. Any pointers will help.
Thanks
Lakshmi
You could put all the steps in one job and start the whole job several times. There are different ways a job can actually be launched in Spring Batch; have a look at JobOperator and JobLauncher, and then simply implement a loop around the launching of the job, as sketched below.
You can do this after the whole Spring context is initialized, so there will be no overhead on that front. But you must pay attention to the scope of your beans, especially the readers and writers.
Depending on your needs concerning failure handling and restart, you also have to pay attention to how you manage the execution context of your job and steps.
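A minimal sketch of such a loop, assuming a JobLauncher and Job are at hand (the parameter keys are illustrative). Giving each launch distinct JobParameters makes every run a new JobInstance:

```java
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;

public class RepeatedLauncher {

    public void launchRepeatedly(JobLauncher jobLauncher, Job job, int repeatCount)
            throws Exception {
        for (int i = 0; i < repeatCount; i++) {
            // Distinct parameters per iteration -> a new JobInstance each time.
            JobParameters params = new JobParametersBuilder()
                    .addLong("iteration", (long) i)
                    .addLong("launchedAt", System.currentTimeMillis())
                    .toJobParameters();
            JobExecution execution = jobLauncher.run(job, params);
            if (execution.getStatus() != BatchStatus.COMPLETED) {
                break; // stop on failure; adapt to your restart strategy
            }
        }
    }
}
```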
You can simulate a loop with SB using a JobExecutionDecider:
Put it in front of all steps.
Store x in the job execution context and check its value in the decider: move to 'END' if x equals the desired value, or increment it and move to the first step of the set.
After the last step, move back to the start (the decider).
I'm working on a project with Quartz, and there has been a problem with dependencies between jobs.
We have a setup where A and B aren't dependent on each other, though C is:
A and B can run at the same time, but C can only run when both A and B are complete.
Is there a way to set this kind of scenario up in Quartz, so that C will only trigger when A and B finish?
Not directly, AFAIK, but it should not be too hard to use a TriggerListener to implement such functionality (a TriggerListener runs at both the start and end of jobs, and you can set one up for individual triggers or trigger groups).
EDIT: there is even a specific FAQ topic about this problem:
There currently is no "direct" or "free" way to chain triggers with Quartz. However there are several ways you can accomplish it without much effort. Below is an outline of a couple approaches:

One way is to use a listener (i.e. a TriggerListener, JobListener or SchedulerListener) that can notice the completion of a job/trigger and then immediately schedule a new trigger to fire. This approach can get a bit involved, since you'll have to inform the listener which job follows which - and you may need to worry about persistence of this information. See the listener org.quartz.listeners.JobChainingJobListener which ships with Quartz - as it already has some of this functionality.

Another way is to build a Job that contains within its JobDataMap the name of the next job to fire, and as the job completes (the last step in its execute() method) have the job schedule the next job. Several people are doing this and have had good luck. Most have made a base (abstract) class that is a Job that knows how to get the job name and group out of the JobDataMap using pre-defined keys (constants) and contains code to schedule the identified job. This abstract Job's implementation of execute() delegates to an abstract template method such as "doWork()" (where the extending Job class's real work goes) and then it contains the code for scheduling the follow-up job. Then they simply make extensions of this class that include the work the job should do. The usage of 'durable' jobs, or the overloaded addJob(JobDetail, boolean, boolean) method (added in Quartz 2.2), helps the application define all the jobs at once with their proper data, without yet creating triggers to fire them (other than one trigger to fire the first job in the chain).

In the future, Quartz will provide a much cleaner way to do this, but until then, you'll have to use one of the above approaches, or think of yet another that works better for you.
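For the specific "C waits for both A and B" case, here is a minimal sketch adapting the listener approach (job names are illustrative; this is a one-shot barrier that would need re-arming for recurring schedules):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;

import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;
import org.quartz.JobKey;
import org.quartz.SchedulerException;
import org.quartz.listeners.JobListenerSupport;

// One-shot barrier: triggers jobC once jobA and jobB have both completed
// successfully. The job names are placeholders for this sketch.
public class BarrierJobListener extends JobListenerSupport {

    private final Set<JobKey> pending = Collections.synchronizedSet(new HashSet<>(
            Arrays.asList(JobKey.jobKey("jobA"), JobKey.jobKey("jobB"))));
    private final JobKey target = JobKey.jobKey("jobC");
    private final AtomicBoolean fired = new AtomicBoolean(false);

    @Override
    public String getName() {
        return "barrier-for-jobC";
    }

    @Override
    public void jobWasExecuted(JobExecutionContext context, JobExecutionException jobException) {
        if (jobException == null) {
            pending.remove(context.getJobDetail().getKey()); // mark prerequisite done
        }
        if (pending.isEmpty() && fired.compareAndSet(false, true)) {
            try {
                context.getScheduler().triggerJob(target); // fire C now
            } catch (SchedulerException e) {
                throw new IllegalStateException(e);
            }
        }
    }
}
```

You would register it via scheduler.getListenerManager().addJobListener(new BarrierJobListener()); jobC itself would be stored as a durable job with no trigger of its own.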