How to use static spring cloud stream url for launching spring cloud tasks? - kubernetes

Platform used : Kubernetes.
I have an issue with the Spring Cloud Stream URL. I launch my Spring Cloud tasks through Spring Cloud Stream, and the streams are deployed on Kubernetes. The stream has http-kafka as its source and taskLauncerKafka as its sink. I use the Kubernetes service URL of http-kafka to launch tasks. The service name, and therefore the service URL, changes after each stream deployment, which is difficult to manage. I also tried enabling a load balancer, but in that case the external IP address changed after each stream roll-out as well.
I am using skipper for managing the deployments. Every time stream is deployed stream version changes which also changes stream url.
In my case, I have multiple instances from which I can launch Spring Cloud tasks. If the stream URL changes, I need to update the ConfigMap of the deployment project for every instance and redeploy all of them.
Is there a solution? I am thinking of centralised configuration management using Spring Cloud Config Server or ZooKeeper. With that I would still need to update the URL, but I could avoid redeploying multiple instances.
Skipper server version : 2.4.1.RELEASE
Dataflow server version : 2.5.1.RELEASE

Which version of SCDF/Skipper are you running?
This looks similar to the issue https://github.com/spring-cloud/spring-cloud-skipper/issues/953 which was subsequently addressed in Skipper 2.6.0.

Related

Is it possible to make auto-refresh properties for Spring Cloud clients in a multi-pod environment

Is it possible to make auto-refresh properties for Spring Cloud clients in a multi-pod environment (Google Kubernetes Engine)?
I found several workarounds:
Using Spring Cloud Bus (too heavy a solution).
Triggering a refresh from code using RefreshEvent and @Scheduled (not recommended by Spring).
Creating a new endpoint in Config Server to perform a refresh on all Spring Cloud clients.

Conditionally launch Spring Cloud Task on a specific node of Kubernetes cluster

I am building a data pipeline for batch processing, and Spring Cloud Data Flow looks like an attractive framework for it. I do not know SCDF and Kubernetes well, so I am not sure whether it is possible to conditionally launch a Spring Cloud Task on a specific machine.
Suppose I have two physical servers for running the batch process (Server A and Server B). By default, I would like my Spring Cloud Task to be launched on Server A. If Server A is shut down, the task should be deployed on Server B. Can Kubernetes / SCDF handle this kind of mechanism? I am wondering whether nodeSelector is what I should look into.
Yes, you can pass deployment.nodeSelector as a deployment property when launching the task.
Since deployment.nodeSelector is a Kubernetes deployer property, you need to pass something like this:
task launch mytask --properties "deployer.<taskAppName>.kubernetes.deployment.nodeSelector=foo1:bar1,foo2:bar2"
You can check the list of supported Kubernetes deployer properties here
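If the node labels come from code rather than being typed by hand, the property string above can be assembled programmatically. The following is a minimal sketch in plain Java, not part of SCDF: the class name NodeSelectorProperty, the method build, and the app name "timestamp" are hypothetical and exist only for illustration. It produces the same key:value,key:value format shown in the command above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper (not an SCDF API): formats node labels into the
// deployer property string used with "task launch ... --properties".
public class NodeSelectorProperty {

    // Joins labels as "key:value" pairs separated by commas and prefixes
    // them with the deployer property name for the given task app.
    static String build(String taskAppName, Map<String, String> labels) {
        String selector = labels.entrySet().stream()
                .map(e -> e.getKey() + ":" + e.getValue())
                .collect(Collectors.joining(","));
        return "deployer." + taskAppName
                + ".kubernetes.deployment.nodeSelector=" + selector;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the labels in insertion order.
        Map<String, String> labels = new LinkedHashMap<>();
        labels.put("foo1", "bar1");
        labels.put("foo2", "bar2");

        // "timestamp" is an assumed app name; "mytask" is the task definition.
        String property = build("timestamp", labels);
        System.out.println("task launch mytask --properties \"" + property + "\"");
    }
}
```

The printed line can be pasted into the SCDF shell. Note that a nodeSelector only restricts where the pod may be scheduled; whether it gives the Server A then Server B fallback described in the question depends on how the nodes are labelled.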

Running a spring batch with partitions in cloud foundry

I have created a Spring Batch application (with partitioning) based on this example: https://github.com/mminella/S3JDBC. My app reads some files from an object store, does some processing, and writes back to the object store. With local partitioning, the app works fine on my machine.
To run it on Cloud Foundry, I changed the Maven configuration, made the changes for the deployer partition handler and the step execution listener, and deployed to PCF.
But when I push and run the app on PCF, I get this error:
Failing URI /v2/info. When I logged the error, I found that a call is made to a URL such as https://mypcf.com:443/v2/info, and after that the error occurs. I can't provide full logs because of some restrictions. So I want to know:
1. To deploy a Spring Batch app on PCF, is any extra configuration needed beyond the code changes for DeployerPartitionHandler and StepExecutionListener (and #cloudtask) and this Maven dependency: org.springframework.cloud:spring-cloud-deployer-cloudfoundry:1.1.0.M1?
2. Is it mandatory to have a separate database service like MySQL for the partitioned job? Can't I use H2 (the default one, if I don't configure anything)?
3. Do I need to do any configuration in PCF to support running multiple partitions?
4. As I am using remote partitioning, can I run the app locally in STS or IntelliJ (not on PCF-DEV) so that it launches the workers on PCF (remote)? (Sorry for the stupid question, I am new to PCF.)
Thanks for checking out my example. To answer your questions:
1. You should be able to use the latest deployer release (instead of that rather old version).
2. Yes. Partitioned steps need to all be able to share the same job repository data store so an in memory database like H2 will not work for that use case.
3. Besides defining your datasource, that's all that is required to live in PCF. That being said, there are other things that need to be configured, but you can use other mechanisms to do so (Spring Cloud Config Server, application.properties/yml, etc).
4. Yes, you should be able to run the master locally and have it deploy the workers onto PCF if you're using the CF deployer.

Spring cloud data flow deployment

I want to deploy Spring Cloud Data Flow on several hosts.
I will deploy the Spring Cloud Data Flow server on one host (host-A) and deploy the agents on the other hosts (these hosts are in charge of executing the tasks).
Except the host-A, all the other hosts run the same tasks.
Shall I modify on the basis of the Spring Cloud Data Flow Local Server or on the Spring Cloud Data Flow Apache Yarn Server or other better choice?
Do you mean how the apps are deployed on several hosts? If so, the apps are deployed using the underlying deployer implementation. For instance, with the local deployer, each app is deployed by spawning a new process. You can scale out the number of app instances using the count property when you deploy the stream. I am not sure what you mean by the agents here.

Using logstash, config server and eureka with spring cloud task and dataflow

We have an existing microservice environment with logstash, config and eureka servers. We are now setting up a Spring Cloud Dataflow (Kubernetes) environment (primarily, at least initially, to run tasks/batch jobs).
Ideally we would like the tasks to use the existing logstash, config and eureka servers via the standard spring boot configuration (annotations etc) to support the following scenarios:
Logstash: When a task runs its logs are output to logstash and viewable from Kibana
Config Server: To support changing configuration properties for tasks. eg a periodic task's configuration can be tweaked by altering the values on the configuration server and next time the task runs it will use the new values.
My understanding is that config server properties will override properties in the task definition which override properties in the internal application.properties.
Eureka: Each task would register itself in Eureka. The main reason for this is that our tasks have web actuator endpoints exposed, and we can then use Spring Boot Admin (which can discover services via Eureka) to access the actuator endpoints and information while a task is running.
(Some of our tasks can take hours to run and this would enable us to monitor them, adjust logging etc)
Is this a sensible approach, or are there any potential issues to look out for here (e.g. short-lived tasks with Eureka)? I can't find any discussion of this in the existing Spring Cloud Data Flow or Spring Cloud Task documentation.
You may try logstash-logback-encoder for SCDF integration with ELK stack. It works fine for our SCDF on Yarn stream application.
Config Server should work for any Spring Boot application.
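The ordering the question assumes (config server values over task definition properties over the packaged application.properties) can be pictured as a lookup through layered sources. The sketch below is plain Java and does not use Spring: the class PropertyPrecedence, the method resolve, and the property names are hypothetical. It only illustrates the assumed ordering and does not confirm how a given SCDF and Config Server setup actually resolves properties.

```java
import java.util.List;
import java.util.Map;

// Illustration only: models the precedence described in the question.
// It is not how Spring's Environment is implemented.
public class PropertyPrecedence {

    // Returns the value from the first source that defines the key,
    // scanning from highest to lowest precedence; null if none defines it.
    static String resolve(String key, List<Map<String, String>> sourcesHighestFirst) {
        for (Map<String, String> source : sourcesHighestFirst) {
            if (source.containsKey(key)) {
                return source.get(key);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical property names and values.
        Map<String, String> configServer = Map.of("job.chunk-size", "500");
        Map<String, String> taskDefinition = Map.of("job.chunk-size", "100", "job.name", "nightly");
        Map<String, String> applicationProperties = Map.of("job.chunk-size", "10", "job.retries", "3");

        List<Map<String, String>> sources =
                List.of(configServer, taskDefinition, applicationProperties);

        System.out.println(resolve("job.chunk-size", sources)); // taken from the config server layer
        System.out.println(resolve("job.name", sources));       // taken from the task definition layer
        System.out.println(resolve("job.retries", sources));    // taken from application.properties
    }
}
```

Under that assumption, changing job.chunk-size on the config server would affect the next task run without touching the task definition, which is the behaviour the question is after.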