I am trying to do a simple integration between our control-M batch environment and our kafka environment. What i want to have is to be able to public where certain jobs or jobnets in Kafka are complete, have an issue with extra information like start and end times.
So we can implement a stream processor that would abstract the details away and tell the our event processing system that the daily end of day processing is complete. Financial/banking.
I looked to see if there is API of some sorts but i only see maintenance of reports not reporting the actual running of the batch
Related
I want to make something to monitor some Kafka topic continuously, and then execute some batch job when a message comes in (hitting some REST api and storing response). I set something up with KafkaItemReader, however, it turns off if it doesn't receive a message for 30 seconds based on pollTimeout. How can I make it run indefinitely? Since this is not an obvious option I'm wondering if I am using the right tool for the job.
Likely answer: you are not supposed to do this.
That's correct. Batch processing is about processing finite data sets. If your data source is an infinite stream of records and you want to monitor it continuously, then a streaming solution is more appropriate for your use case.
Is it possible to use spring batch as a regular job framework?
I want to create a device service (microservice) that has the responsibility
to get events and trigger jobs on devices. The devices are remote so it will take time for the job to be complete, but it is not a batch job (not periodically running or partitioning large data set).
I am wondering whether spring batch can still be used a job framework, or if it is only for batch processing. If the answer is no, what jobs framework (besides writing your own) are famous?
Job Description:
I need to execute against a specific device a job that will contain several steps. Each step will communicate with a device and wait for a device to confirm it executed the former command given to it.
I need retry, recovery and scheduling features (thought of combining spring batch with quartz)
Regarding read-process-write, I am basically getting a command request regarding a device, I do a little DB reads and then start long waiting periods that all need to pass in order for the job/task to be successful.
Also, I can choose (justify) relevant IMDG/DB. Concurrency is outside the scope (will be outside the job mechanism). An alternative that came to mind was akka actors. (job for a device will create children actors as steps)
As far as I know - not periodically running or partitioning large data set are not primary requirements for usage of Spring Batch.
Spring Batch is basically a read - process - write framework where reading & processing happens item by item and writing happens in chunks ( for chunk oriented processing ) .
So you can use Spring Batch if your job logic fits into - read - process - write paradigm and rest of the things seem secondary to me.
Also, with Spring Batch , you should also evaluate the part about Job Repository . Spring Batch needs a database ( either in memory or on disk ) to store job meta data and its not optional.
I think, you should put more explanation as why you need a Job Framework and what kind of logic you are running that you are calling it a Job so I will revise my answer accordingly.
I have created a NiFi workflow as shown below:
GenerateFlowFile --> Custom Processor --> LogAttribute
My custom processor has a property as start date. But the start date should change in each scheduled run based on the maximum end date from previous run. Basically looking for incremental data fetch from the server.
Could you please help, how this can be achieved in Apache NiFi?
Processor scheduling is usually left to the data flow manager configuring the processor into their flow. I recommend you let them schedule the processor, expecting it to run on a periodic basis.
But you can use Apache NiFi's State Manager feature to store data that tracks your incremental progress. You could then decide what action to take, if any, when the processor is triggered. If there is nothing to do, don't do anything.
The best examples of this are List* processors like ListFile. These processors typically store a timestamp of the file they last read, the use that timestamp to determine which newer files should be acted on, regardless of how frequently they are asked to check. It is likely that most executions of a List* processor will result in no output.
There are some examples of reading and persisting state data in the AbstractListProcessor class.
Am invoking spring jobs based on event, however i hv couple jobs to execute on specific event which could execute in parallel, Is there any utility class which can execute multiple jobs in parallel? Thanks
We don't offer anything specific for launching multiple jobs based on a single message out of the box with Spring Batch. However, writing a message handler that can handle that scenario should be pretty trivial.
We have job which polls for file and db every M-F between 1PM-5PM using cron expression. During this time if file arrives it downloads the file and invoke a job. This is working fine and we have used spring integration and batch.
Now we need some customization where we have multiple job where job1 one should poll like above once file is processed successfully, it should stop polling.
Second requirement is, in case if file does not come during polling period we want to send some notification to ops team so that they can take some actions.
Would that help ? Exit Spring Integration when no more messages
You would be able to implement custom behavior in that advice, based on polling result and the time of the day.
Garry is also mentionning that conditional pollers are coming in next versions :
http://docs.spring.io/spring-integration/docs/4.2.0.BUILD-SNAPSHOT/reference/html/messaging-channels-section.html#conditional-pollers