Running batch jobs in Pivotal Cloud Foundry - spring-batch

We have a requirement to migrate mainframe batch jobs to PCF. However, under the 3 R's of security (Rotate, Repave, Repair), it is possible that an instance where a Spring Batch job is running will be repaved or repaired, and our running jobs will be terminated. In that scenario, how can we ensure that our jobs are not impacted while a PCF instance is being repaved or repaired? We are looking for the best way to migrate these jobs to PCF; any help or suggestions would be appreciated.
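
One Spring Batch capability directly relevant here is restartability: with a database-backed JobRepository, the framework records every commit point, so a job killed mid-run by a repave can be re-launched with the same JobParameters and resume from its last committed chunk rather than starting over. A minimal sketch, assuming Spring Batch 4.x; the job name, step, and item types are illustrative, not from the question:

import java.util.List;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.support.ListItemReader;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing // backs the JobRepository with the application's DataSource, so execution state survives the instance
public class RepaveTolerantJobConfig {

    @Bean
    public Job migratedMainframeJob(JobBuilderFactory jobs, Step chunkStep) {
        // Restart is enabled by default: re-launching with the same JobParameters
        // resumes from the last committed chunk recorded in the job repository.
        return jobs.get("migratedMainframeJob").start(chunkStep).build();
    }

    @Bean
    public Step chunkStep(StepBuilderFactory steps,
                          ItemReader<String> reader,
                          ItemWriter<String> writer) {
        // A small commit interval limits the amount of re-work after a repave.
        return steps.get("chunkStep")
                .<String, String>chunk(100)
                .reader(reader)
                .writer(writer)
                .build();
    }

    @Bean
    public ItemReader<String> reader() {
        // Placeholder reader. For a restart to resume mid-step, use a stateful
        // ItemStream reader (e.g. FlatFileItemReader, JdbcPagingItemReader) that
        // checkpoints its position in the ExecutionContext.
        return new ListItemReader<>(List.of("rec-1", "rec-2", "rec-3"));
    }

    @Bean
    public ItemWriter<String> writer() {
        return items -> items.forEach(System.out::println);
    }
}

Restartability covers recovery after a repave; the re-launch itself still has to be triggered, for example by a scheduler or an orchestration layer such as Spring Cloud Data Flow, which the related questions below discuss.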

Related

Implementation of a batch job on MuleSoft Runtime Fabric on Azure Kubernetes Service (specific to the MuleSoft implementation)

Has anyone used MuleSoft's batch process on Runtime Fabric on Azure/AWS? How was your experience with that implementation? Any best practices? I am working on an example where we need to push 100 million messages to Cosmos DB, and the solution is supposed to be deployed on RTF on Azure. The batch process supports persistent queues, but I don't see any settings for configuring external queues for persistence; pods may crash, and the persisted files will be lost.
Are there any other alternatives to a batch job? A parallel for-each works as a splitter and aggregator, but it is not efficient.
Any suggestions are appreciated.
Persistent queues are a feature only of Anypoint Platform's CloudHub deployments and are not available on Anypoint Runtime Fabric. Even in CloudHub they are not guaranteed to provide reliability for Mule batch processes (see this KB article for more information). Assume that a crash will restart the worker or pod, respectively, and that the batch queues and stores may be lost.

Conditionally launch a Spring Cloud Task on a specific node of a Kubernetes cluster

I am building a data pipeline for batch processing, and I find Spring Cloud Data Flow to be quite an attractive framework for this. Without much knowledge of SCDF and Kubernetes, I am not sure whether it is possible to conditionally launch a Spring Cloud Task on a specific machine.
Suppose I have two physical servers for running the batch process (Server A and Server B). By default, I would like my Spring Cloud Task to be launched on Server A. If Server A is shut down, the task should be deployed on Server B. Can Kubernetes / SCDF handle this kind of mechanism? I am wondering whether nodeSelector is the thing I should look into.
Yes, you can pass deployment.nodeSelector as a deployment property when launching the task.
Since deployment.nodeSelector is a Kubernetes deployer property, you need to pass it like this:
task launch mytask --properties "deployer.<taskAppName>.kubernetes.deployment.nodeSelector=foo1:bar1,foo2:bar2"
You can check the list of supported Kubernetes deployer properties here.
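
If you launch tasks programmatically instead of from the shell, the same deployer property can be passed through the SCDF REST client. A sketch, assuming an SCDF 2.x server on localhost:9393 and an already-registered task named mytask; the exact launch(...) signature varies slightly between client versions:

import java.net.URI;
import java.util.Collections;
import java.util.Map;

import org.springframework.cloud.dataflow.rest.client.DataFlowTemplate;

public class LaunchOnNode {
    public static void main(String[] args) {
        DataFlowTemplate dataFlow = new DataFlowTemplate(URI.create("http://localhost:9393"));
        // Same deployer property as the shell example above: pin the task pod
        // to nodes labelled foo1=bar1 and foo2=bar2.
        Map<String, String> deploymentProps = Map.of(
                "deployer.mytask.kubernetes.deployment.nodeSelector", "foo1:bar1,foo2:bar2");
        dataFlow.taskOperations().launch("mytask", deploymentProps, Collections.emptyList());
    }
}

One caveat for the Server A / Server B scenario: a nodeSelector is a hard constraint, so if the selected node is down the pod stays unscheduled rather than falling back to another node. The preferred-with-fallback behaviour is expressed in Kubernetes through node affinity (preferredDuringSchedulingIgnoredDuringExecution); check the deployer property list linked above for the affinity-related properties your deployer version supports.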

How to set a scheduler for Spring Batch jobs in Spring Cloud Data Flow?

I'm setting up new Spring Batch jobs and want to deploy them using SCDF. However, I have found that SCDF does not support the scheduling feature on the local platform.
I have 3 questions to ask you:
Can someone explain how the scheduler of SCDF works?
Are there any ways to schedule one job using SCDF?
Can I use my local server as a Cloud Foundry? And how?
That is correct: Spring Cloud Data Flow does not support scheduling on the local platform. Please note that the local SCDF server is for development purposes only and, by design, scheduling support relies on the underlying platform. Hence, the SCDF scheduling feature is supported on Cloud Foundry and Kubernetes using the CF and K8s schedulers.
1) Can someone explain how the scheduler of SCDF works?
Sure. Similar to how the deployer is used for launching tasks and deploying streams, there is an SPI for scheduling tasks under the spring-cloud-deployer project, which the platform-specific schedulers implement. Currently, there are CF and K8s scheduler implementations, in spring-cloud-deployer-cloudfoundry and spring-cloud-deployer-kubernetes.
As a user, you can configure a schedule for a task (batch) application (via the SCDF dashboard, shell, etc.) by specifying a cron expression. Once configured, SCDF delegates the schedule request to the platform scheduler through the above-mentioned implementations. From then on, it is the platform (PCF Scheduler on CF, the K8s scheduler on K8s) that launches the task according to the schedule.
2) Are there any ways to schedule one job using SCDF?
Yes; see the answer to 1) above.
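For instance, once a task definition exists, a schedule can be created from the SCDF shell; the definition name, schedule name, and nightly cron expression here are illustrative:

task schedule create --definitionName mytask --name mytask-nightly --expression "0 0 * * *"

The expression is handed to the platform scheduler (PCF Scheduler on CF, a CronJob on K8s), so it must follow that platform's cron format.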
3) Can I use my local server as a Cloud Foundry? And how?
To run SCDF locally pointing to a CF instance, you can set the necessary CF deployer properties and start the SCDF server instance. It is similar to how you configure multiple platforms in the SCDF server. You can find more documentation on this here.
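As a sketch, a Cloud Foundry task platform account can be configured on the local server with properties along these lines. The key names follow the SCDF 2.x multi-platform documentation and should be verified against your server version; the account name "default" and all values are placeholders:

spring.cloud.dataflow.task.platform.cloudfoundry.accounts[default].connection.url=https://api.mycf.example.com
spring.cloud.dataflow.task.platform.cloudfoundry.accounts[default].connection.org=my-org
spring.cloud.dataflow.task.platform.cloudfoundry.accounts[default].connection.space=my-space
spring.cloud.dataflow.task.platform.cloudfoundry.accounts[default].connection.domain=apps.mycf.example.com
spring.cloud.dataflow.task.platform.cloudfoundry.accounts[default].connection.username=admin
spring.cloud.dataflow.task.platform.cloudfoundry.accounts[default].connection.password=secret
spring.cloud.dataflow.task.platform.cloudfoundry.accounts[default].connection.skipSslValidation=true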

Orchestration of batch jobs in a microservices architecture - SCDF

I have a microservice with 5 embedded batch jobs that run every night at 00:00, and I want to move those batches out of the service using Spring Cloud Data Flow. My questions are:
- How can I connect SCDF to the actual microservice for local deployment?
- Is there an alternative way to get a scheduler in SCDF for local deployment?
Spring Cloud Data Flow uses Spring Cloud Skipper to deploy and launch applications.
This question seems similar to your query: Does spring-cloud-dataflow provide support for scheduling applications defined as tasks?
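In practice, connecting SCDF to those jobs means extracting each embedded batch job into its own Spring Cloud Task application, then registering it and creating a task definition. A hypothetical SCDF shell session against a local server; the app name and Maven coordinates are made up:

app register --name nightly-job --type task --uri maven://com.example:nightly-job:1.0.0
task create --name nightly-job-task --definition nightly-job
task launch nightly-job-task

As for the second question, on the local platform the launch still has to be triggered externally (cron, CI, or a call to the SCDF REST API), since, as discussed above, SCDF's scheduling feature is only available on Cloud Foundry and Kubernetes.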

Spring Batch Admin and starting master/slave jobs

Is it possible to configure Spring Batch Admin to start master and slave jobs? We have one process as the master and 3-4 slave nodes.
Spring Batch Admin is running in a separate JVM process, but all Spring Batch jobs are using the same batch DB schema.
Spring Batch Admin can only launch locally deployed jobs. So while you can launch a job that has a master/slave configuration, the job that owns the master must be deployed locally. You could wire things up to launch remote jobs, but you would have to do that wiring yourself.
That being said, Spring XD (http://projects.spring.io/spring-xd/) is a distributed runtime that is able to launch jobs that are remotely deployed.
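
For reference, a master/slave job in Spring Batch terms is usually a partitioned step. A minimal sketch of the master side; the bean names are illustrative, and the Partitioner and PartitionHandler beans are assumed to be defined elsewhere:

import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.partition.PartitionHandler;
import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class MasterStepConfig {

    @Bean
    public Step masterStep(StepBuilderFactory steps,
                           Partitioner partitioner,
                           PartitionHandler partitionHandler) {
        // The master splits the input into partitions; the PartitionHandler
        // decides where the "slaveStep" executions run. For remote slaves, a
        // MessageChannelPartitionHandler from spring-batch-integration sends
        // partition requests to the worker JVMs over messaging middleware.
        return steps.get("masterStep")
                .partitioner("slaveStep", partitioner)
                .partitionHandler(partitionHandler)
                .build();
    }
}

Spring Batch Admin can launch the job that contains this master step as long as that job is deployed in the same application; only the slave executions run remotely.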