How to use Spring Cloud Dataflow to get Spring Batch Status - spring-batch

I have been using Spring Batch and my metadata is in DB2. I have been using the Spring Batch Admin API (jars) to look at the current status of various jobs and to get details about a job, like the number of items read, commit count, etc. Now that Spring Batch Admin has been moved to Spring Cloud Data Flow, how do I look at this information? Is there a good set of APIs I could use?

Basically, in Spring Cloud Data Flow, you first need to create a Spring Cloud Task that wraps your batch application: see the example [here][1]
With Spring Cloud Task's @EnableTaskLauncher you can get the current status of a job, run the job, stop the job, etc.
You need to send a TaskLauncherRequest for that.
See the APIs of TaskLauncher
Edit:
To get the Spring Batch status, you first need the task execution id of the Spring Cloud Task. Then use the `Set<Long> getJobExecutionIdsByTaskExecutionId(long taskExecutionId)` method of [TaskExplorer][3]
See TaskExplorer for all the APIs. With the returned job execution ids, use JobExplorer to get the status of the jobs
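As a rough sketch (assuming the `TaskExplorer` and `JobExplorer` beans are wired into your application and the task and batch metadata tables live in the same database), the lookup could look like this:

```java
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.explore.JobExplorer;
import org.springframework.cloud.task.repository.TaskExplorer;

public class BatchStatusLookup {

    private final TaskExplorer taskExplorer;
    private final JobExplorer jobExplorer;

    public BatchStatusLookup(TaskExplorer taskExplorer, JobExplorer jobExplorer) {
        this.taskExplorer = taskExplorer;
        this.jobExplorer = jobExplorer;
    }

    // Resolve the batch job status(es) behind a given task execution id
    public void printJobStatuses(long taskExecutionId) {
        for (Long jobExecutionId : taskExplorer.getJobExecutionIdsByTaskExecutionId(taskExecutionId)) {
            JobExecution jobExecution = jobExplorer.getJobExecution(jobExecutionId);
            BatchStatus status = jobExecution.getStatus();
            System.out.println("jobExecutionId=" + jobExecutionId + " status=" + status);
            // Step-level details: number of items read, commit count, etc.
            jobExecution.getStepExecutions().forEach(step ->
                System.out.println("  step=" + step.getStepName()
                    + " read=" + step.getReadCount()
                    + " commits=" + step.getCommitCount()));
        }
    }
}
```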

Related

How to see the details of Spring Batch on UI?

I am using Spring Batch and my org is not willing to use Spring Cloud Data Flow. Is there any way we can create a UI to show the details of a batch job and somehow also restart it?

Using cosmos db for Spring batch job repository

Is it possible to use CosmosDB as a job repository for Spring Batch?
If that is not possible, can we go with an in-memory DB to handle our Spring batch jobs?
The job itself is triggered on message arrival in a remote queue. We use a variation of the process indicator pattern in our current Spring Batch job to keep track of the "chunks" being processed. The `saveState` attribute is also disabled, and the reader always uses a DB query to avoid picking up the same chunks and to prevent duplicate processing.
We don't commit the message on the queue until all records for that job are processed. So if the node dies and comes back up in the middle of processing, the same message is redelivered, which takes care of job restarts. Given all this, we have a choice of either coming up with a way to implement a Cosmos job repository, or simply using an in-memory one and plugging in an "afterJob" listener to clean up the in-memory job data, to ensure that Java heap memory is not exhausted in production. Any recommendations?
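If you go the in-memory route, the "afterJob" cleanup mentioned above could be sketched roughly like this (a sketch only, assuming the map-based `MapJobRepositoryFactoryBean` in-memory repository; the listener name is hypothetical):

```java
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.listener.JobExecutionListenerSupport;
import org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean;

// Hypothetical cleanup listener: after each job, drop the in-memory
// execution metadata so heap usage does not grow in production.
public class InMemoryCleanupListener extends JobExecutionListenerSupport {

    private final MapJobRepositoryFactoryBean repositoryFactory;

    public InMemoryCleanupListener(MapJobRepositoryFactoryBean repositoryFactory) {
        this.repositoryFactory = repositoryFactory;
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        // Clears the in-memory equivalents of the BATCH_* tables
        repositoryFactory.clear();
    }
}
```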
Wanted to provide information that Azure Cosmos DB just released v3 of the Spring Data connector for the SQL API:
The Spring on Azure team, in partnership with the Azure Cosmos DB team, are proud to have just made the Spring Data Azure Cosmos DB v3 generally available. This is the latest version of Azure Cosmos DB’s SQL API Spring Data connector.
Also, Spring.io has an example microservices solution (Spring Cloud Data Flow) based on batch that could be used as an example for your solution.
Additional Information:
Spring Data Azure Cosmos DB v3 for Core (SQL) API: Release notes and resources (link)
A well written 3rd party blog that is super helpful:
Introduction to Spring Data Azure Cosmos DB (link)

ETL Spring batch, Spring cloud data flow (SCDF)

We have a use case where data can be sourced from different sources (DB, file, etc.), transformed, and stored to various sinks (Cassandra, DB, or file). We would want the ability to split the jobs and do parallel loads - it looks like Spring Batch RemoteChunking provides that ability.
I am new to SCDF and Spring batch and wondering what is the best way to use it.
Is there a way to provide configuration for these jobs (source connection details, table, and query), and can this be done through a UI (the SCDF Server UI)? Is it possible to compose the flow?
This will run on Kubernetes and our applications are deployed through Jenkins pipeline.
We would want the ability to split the jobs and do parallel loads - looks like Spring Batch RemoteChunking provides that ability.
I don't think you need remote chunking, you can rather run parallel jobs, where each job handles an ETL process (for a particular file, db table).
Is there a way to provide configuration for these jobs (source connection details, table and query)
Yes, those can be configured like any regular Spring Batch job is configured.
and can this be done through a UI (SCDF Server UI)?
If you make them configurable through properties of your job, you can specify them through the UI when you run the task.
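For example, the source details could be bound from properties (all names here are hypothetical) and then overridden per launch as task properties or arguments from the SCDF UI:

```java
import org.springframework.boot.context.properties.ConfigurationProperties;

// Hypothetical property holder: values can be supplied per task launch
// (e.g. --etl.source-url=..., --etl.table=..., --etl.query=...) via the SCDF UI.
@ConfigurationProperties(prefix = "etl")
public class EtlJobProperties {

    private String sourceUrl;   // source connection details
    private String table;       // table to extract from
    private String query;       // extraction query

    public String getSourceUrl() { return sourceUrl; }
    public void setSourceUrl(String sourceUrl) { this.sourceUrl = sourceUrl; }
    public String getTable() { return table; }
    public void setTable(String table) { this.table = table; }
    public String getQuery() { return query; }
    public void setQuery(String query) { this.query = query; }
}
```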
Is it possible to compose the flow?
Yes, this is possible with Composed Task.
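As an illustration, a composed task can be defined with the task DSL from the SCDF shell (or the UI's task creation page); the task names below are hypothetical. `&&` runs tasks sequentially on success, and `<a || b>` runs them in parallel:

```shell
dataflow:> task create etl-flow --definition "extract && <load-cassandra || load-file>"
dataflow:> task launch etl-flow
```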

Batch Job exit status using Spring Cloud Task

I'm trying to set up a Spring Batch project to be deployed on a Spring Cloud Data Flow server, but first I must "wrap" it in a Spring Cloud Task application.
Spring Batch generates metadata (start/end, status, parameters, etc.) in the BATCH_ tables. Cloud Task does the same, but in the TASK_ tables.
Reading the documentation of Spring Cloud Task, it says that in order to pass the batch information to the task, you must set
spring.cloud.task.batch.failOnJobFailure=true and also
To have your task return the exit code based on the result of the
batch job execution, you will need to write your own
CommandLineRunner.
So, any indications on how I should write my own CommandLineRunner?
For now, having only set the property, if I force the task to fail I'm getting Failed to execute CommandLineRunner .... Job UsersJob failed during execution for jobId 3 with jobExecutionId of 6
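A minimal sketch of such a runner (an assumption about what the docs intend, not the definitive implementation): launch the job yourself, remember the result, and expose it through Spring Boot's `ExitCodeGenerator`:

```java
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.ExitCodeGenerator;

// Hypothetical runner: launches the job and maps its BatchStatus
// to the process exit code via ExitCodeGenerator.
public class JobExitCodeRunner implements CommandLineRunner, ExitCodeGenerator {

    private final JobLauncher jobLauncher;
    private final Job job;
    private JobExecution jobExecution;

    public JobExitCodeRunner(JobLauncher jobLauncher, Job job) {
        this.jobLauncher = jobLauncher;
        this.job = job;
    }

    @Override
    public void run(String... args) throws Exception {
        jobExecution = jobLauncher.run(job, new JobParameters());
    }

    @Override
    public int getExitCode() {
        // 0 only when the batch job really completed
        return (jobExecution != null && jobExecution.getStatus() == BatchStatus.COMPLETED) ? 0 : 1;
    }
}
```

The exit code is then propagated in `main` with `System.exit(SpringApplication.exit(context))`, which collects all registered `ExitCodeGenerator` beans.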

Convert non-launchable job to launchable job in Spring Batch Admin

I have a Spring Batch job developed with Spring Boot (1.4.1.RELEASE).
It successfully runs from command line and writes job execution data to MySQL. It shows up as non-launchable job in Spring Batch Admin (2.0.0.M1, pointing to MySQL) and I can see job execution metrics.
Now I'd like to turn it into a launchable job so I can run it within Spring Batch Admin.
I wonder if anyone has done that before. The documentation has a section "Add your Own Jobs For Launching", but it does not specify where to add the implementation jar(s) for the job.
Is it spring-batch-admin/WEB-INF/lib?
With Spring Boot, the non-launchable job is one big, all-in-one executable jar. Its dependencies overlap with Spring Batch Admin's. For example, they both have spring-batch*.jar and spring*.jar, but in different versions.
Is there a way, like the job definition xml file, to keep them in separate contexts? Thank you.
Spring Batch Admin looks for your job definitions in the src/main/resources/META-INF/spring/batch/jobs folder. You can add your job-definition.xml file in that folder and define your batch jobs in that XML.
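A minimal job-definition.xml placed in that folder might look like the following sketch (the job, step, and tasklet names are hypothetical, and the referenced `myTasklet` bean must be defined on the classpath):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:batch="http://www.springframework.org/schema/batch"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans.xsd
           http://www.springframework.org/schema/batch
           http://www.springframework.org/schema/batch/spring-batch.xsd">

    <!-- Hypothetical launchable job -->
    <batch:job id="myJob">
        <batch:step id="myStep">
            <batch:tasklet ref="myTasklet"/>
        </batch:step>
    </batch:job>
</beans>
```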