Launching task in Spring Cloud Dataflow with application properties - spring-cloud

I have a Spring Cloud Task in SCDF that launches successfully with the task definition:
some-task --some.property=test
I'd like to set some.property at task launch instead, though. I thought I could do this by setting the deployment property app.*.some.property=test, but this doesn't work with either the local or Cloud Foundry task launchers/deployers.
The above deployment property works with streams, but not tasks. Is it supposed to work with tasks? If not, why?

Yes, we can pass properties while launching a task.
Task applications require the same database connection that the Data Flow server uses to log the steps and executions. I deployed the task below in local SCDF.
task create --definition "timestmp_custm --timestamp.format=\"dd.MM.yyyy\"" --name taskTimestmp2
task launch taskTimestmp2 --arguments "--spring.datasource.url=jdbc:mysql://localhost:3306/mydb --spring.datasource.username=root --spring.datasource.driver-class-name=org.mariadb.jdbc.Driver"
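If you prefer application properties over command-line arguments, the SCDF shell's task launch command also accepts a --properties option. A minimal sketch, assuming the registered app name is timestmp_custm and that the per-app prefix app.<appName>. is used instead of the app.*. wildcard:
task launch taskTimestmp2 --properties "app.timestmp_custm.timestamp.format=dd.MM.yyyy"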

Related

How to force fargate service to be launched only from AWS Lambda

I've created a simple task to print a hello world. I've created an ECR image, a docker-compose file and an ecs-params.yml.
I get the CloudWatch log for the print, but the task keeps launching every minute, which I guess is due to the REPLICA service type.
How can I stop this from happening? I want to launch this Fargate task ONLY from a Lambda, and when it finishes I don't want it to be relaunched.
Thanks in advance
If you want a one-shot / one-off / standalone task to be launched by ECS and have it run until it finishes, you wouldn't use an ECS service definition but merely a task.
You can run tasks on their own without packaging as an ECS service.
See: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs_run_task.html
If you are using the ECS CLI, then there is also ecs-cli compose create. So you would use that command rather than the one that also creates an ECS service along with it.
You can then use AWS Lambda and send an ecs:RunTask AWS API call to invoke/start the ECS task.
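For reference, the equivalent call from the AWS CLI looks roughly like the sketch below; the cluster name, task definition, subnet and security group IDs are placeholders, and a Lambda would make the same ecs:RunTask call through the AWS SDK instead:
aws ecs run-task \
  --cluster my-cluster \
  --launch-type FARGATE \
  --task-definition hello-world-task \
  --count 1 \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-0123456789abcdef0],securityGroups=[sg-0123456789abcdef0],assignPublicIp=ENABLED}"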

Conditionally launch Spring Cloud Task on a specific node of Kubernetes cluster

I am building a data pipeline for batch processing. And I find that Spring Cloud Data Flow is a quite attractive framework to use. Without much knowledge in SCDF and Kubernetes, I am not sure whether it is possible to conditionally launch a Spring Cloud Task on a specific machine.
Suppose I have two physical servers that are meant for running the batch process (Server A and Server B). By default, I would like my Spring Cloud Task to be launched on Server A. If Server A is shut down, the task should be deployed on Server B. Can Kubernetes / SCDF handle this kind of mechanism? I am wondering whether nodeSelector is the thing I should look into.
Yes, you can pass deployment.nodeSelector as a deployment property when launching the task.
The deployment.nodeSelector is a Kubernetes deployment property, hence you need to pass something like this:
task launch mytask --properties "deployer.<taskAppName>.kubernetes.deployment.nodeSelector=foo1:bar1,foo2:bar2"
You can check the list of supported Kubernetes deployer properties here
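Note that nodeSelector only matches labels that already exist on the target node, so the node would have to be labeled beforehand. A sketch, with a hypothetical node name:
kubectl label nodes server-a foo1=bar1 foo2=bar2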

Spring Cloud Data Flow UI

We have a Spring Batch application that is triggered periodically by a Task Command Line Runner. We are looking for a UI to view the job execution status. Can we use the Spring Cloud Data Flow UI dependency to get a UI view of these job executions?
You cannot use the SCDF GUI on its own without the SCDF server; they are tightly coupled.
When tasks/batch jobs are launched from SCDF, the task/job executions are automatically tracked in the common datasource, and the SCDF GUI will likewise show the task and batch-job details automatically [see task executions / job executions].
Whether you use a scheduler or launch the jobs manually, as long as the launch goes through SCDF, everything should just work.
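As a rough sketch of what launching through SCDF looks like from the shell, with a hypothetical app name and Maven coordinates:
app register --name my-batch-job --type task --uri maven://com.example:my-batch-job:1.0.0
task create --name my-batch-job-task --definition "my-batch-job"
task launch my-batch-job-task
Once the launch goes through SCDF this way, the execution details appear under the task executions and job executions views in the dashboard.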

How to set scheduler for Spring Batch jobs in Spring Cloud Data Flow?

I'm setting up new Spring Batch jobs and want to deploy them using SCDF. However, I have found that SCDF does not support the scheduling feature on the local platform.
I have 3 questions to ask you:
Can someone explain how the scheduler of SCDF works?
Are there any ways to schedule a job using SCDF?
Can I use my local server as a Cloud Foundry, and how?
Correct: Spring Cloud Data Flow does not support scheduling on the local platform. Please note that the local SCDF server is for development purposes only; by design, scheduling support is intended to rely on the platform. Hence, the SCDF scheduling feature is supported on Cloud Foundry and Kubernetes using the CF and K8s schedulers.
1) Can someone explain how the scheduler of SCDF works?
Sure. Similar to how the deployer is used for launching tasks and deploying streams, there is an SPI for scheduling tasks under the spring-cloud-deployer project, which the underlying scheduler implementations can implement. Currently, we have CF and K8s scheduler implementations in spring-cloud-deployer-cloudfoundry and spring-cloud-deployer-kubernetes.
As a user, you can configure a scheduler for a task (batch) application (via the SCDF dashboard, shell, etc.). You can specify a cron expression to schedule the task. Once configured, SCDF delegates the schedule request to the platform scheduler using the above-mentioned scheduler implementations. Once scheduled, it is the platform (PCF Scheduler on CF, K8s scheduler on K8s) that takes care of running the task on the schedule.
2) Are there any ways to schedule 1 job using SCDF?
Yes, based on the answer to question 1.
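For example, on Cloud Foundry or Kubernetes a schedule can be created from the SCDF shell with a cron expression; the task definition and schedule names below are placeholders, and the option names may vary slightly between SCDF versions:
task schedule create --definitionName my-batch-task --name my-nightly-schedule --expression "0 0 * * *"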
3) Can I use my local server as a Cloud Foundry, and how?
To run SCDF locally while pointing at a CF instance, you can set the necessary CF deployer properties and start the SCDF server instance. It is similar to how you configure multiple platforms in the SCDF server. You can find more documentation on this here.
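As a minimal sketch, the CF deployer properties for the task platform follow the multi-platform account convention shown below; the API URL, org, space and credentials are placeholders, and the exact property names can differ between SCDF versions:
spring.cloud.dataflow.task.platform.cloudfoundry.accounts[default].connection.url=https://api.my-cf.example.com
spring.cloud.dataflow.task.platform.cloudfoundry.accounts[default].connection.org=my-org
spring.cloud.dataflow.task.platform.cloudfoundry.accounts[default].connection.space=my-space
spring.cloud.dataflow.task.platform.cloudfoundry.accounts[default].connection.username=admin
spring.cloud.dataflow.task.platform.cloudfoundry.accounts[default].connection.password=secret
spring.cloud.dataflow.task.platform.cloudfoundry.accounts[default].connection.skipSslValidation=false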

Spring Cloud Dataflow - how to pass credentials to task

I use Spring Cloud Data Flow deployed to Pivotal Cloud Foundry to run Spring Batch jobs as Spring Cloud Tasks, and the jobs require AWS credentials to access an S3 bucket.
I've tried passing the AWS credentials as task properties, but the credentials show up in the task's log files as arguments or properties. (https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#spring-cloud-dataflow-global-properties)
For now, I am manually setting the credentials as environment variables in PCF after each deployment, but I'm trying to automate this. The tasks aren't deployed until they are actually launched, so on a deployment I have to launch the task, wait for it to fail due to missing credentials, then set the credentials as environment variables with the cf CLI. How do I provide these credentials without them showing up in the PCF app's logs?
I've also explored using Vault and Spring Cloud Config, but again, I would need to pass credentials to the task to access Spring Cloud Config.
Thanks!
Here's a Task/Batch-Job example.
This app uses spring-cloud-starter-aws, and the starter already provides the Boot autoconfiguration and the ability to override AWS credentials as Boot properties.
You'd override the properties while launching from SCDF like:
task launch --name S3LoaderJob --arguments "--cloud.aws.credentials.accessKey= --cloud.aws.credentials.secretKey= --cloud.aws.region.static= --cloud.aws.region.auto=false"
You can also control the log level of the task so that it doesn't log them in plain text.
Secure credentials for tasks should be configured either via environment variables in your task definition or by using something like Spring Cloud Config Server to provide them (and store them encrypted at rest). Spring Cloud Task stores all command line arguments in the database in clear text which is why they should not be passed that way.
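If you take the environment-variable route, the values can be set on the task application with the cf CLI and picked up through Boot's relaxed binding; the app name below is hypothetical:
cf set-env s3-loader-job CLOUD_AWS_CREDENTIALS_ACCESSKEY <access-key>
cf set-env s3-loader-job CLOUD_AWS_CREDENTIALS_SECRETKEY <secret-key>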
After considering the approaches included in the provided answers, I continued testing and researching and concluded that the best approach is to use a Cloud Foundry "User Provided Service" to supply AWS credentials to the task.
https://docs.cloudfoundry.org/devguide/services/user-provided.html
Spring Boot auto-processes the VCAP_SERVICES environment variable included in each app's container.
http://engineering.pivotal.io/post/spring-boot-injecting-credentials/
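As a sketch, the user-provided service referenced by the placeholders below might be created like this (the aws-s3 name and credential keys match the properties that follow; the values are placeholders):
cf create-user-provided-service aws-s3 -p '{"aws_access_key_id":"<access-key>","aws_secret_access_key":"<secret-key>"}'
The service then needs to be bound to the task application so that its credentials show up in VCAP_SERVICES, for example with cf bind-service.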
I then used property placeholders in application-cloud.properties to map the processed VCAP properties onto the spring-cloud-aws properties:
cloud.aws.credentials.accessKey=${vcap.services.aws-s3.credentials.aws_access_key_id}
cloud.aws.credentials.secretKey=${vcap.services.aws-s3.credentials.aws_secret_access_key}