What is the format for passing additional arguments or environment variables to the Data Flow Server when SCDF runs on Kubernetes? When running locally in Docker Compose, I can do something like below, but I'm not sure what the equivalent is when deploying to Kubernetes using the Helm chart.
dataflow-server:
  image: springcloud/spring-cloud-dataflow-server:${DATAFLOW_VERSION:-2.9.0-SNAPSHOT}
  container_name: dataflow-server
  ports:
    - "9393:9393"
  environment:
    - spring.cloud.dataflow.applicationProperties.stream.spring.cloud.stream.kafka.binder.brokers=pkc...:9092
    - spring.cloud.dataflow.applicationProperties.stream.spring.cloud.stream.kafka.binder.configuration.ssl.endpoint.identification.algorithm=https
    - ...
I see that there's a parameter for the Helm chart, server.extraEnvVars, but I'm not exactly sure how to express the spring.cloud.dataflow.applicationProperties settings above in a format that the Data Flow Server will pick up.
The properties you are looking for might be here, under Kafka Chart Parameters -> externalKafka.brokers. So in your case I would try
helm install my-release --set externalKafka.brokers=pkc...:9092 bitnami/spring-cloud-dataflow
But I don't see a chart parameter for the ssl.endpoint.identification.algorithm property.
You could try running the deployment from the SCDF shell with something like:
stream deploy yourstream --properties "spring.cloud.dataflow.applicationProperties.stream.spring.cloud.stream.kafka.binder.configuration.ssl.endpoint.identification.algorithm=https..., spring.cloud.dataflow.applicationProperties.stream.spring.cloud.stream.kafka.binder.brokers=pkc...:9092"
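Alternatively, to keep the applicationProperties approach from your Compose file, you could pass them through server.extraEnvVars. A minimal values.yaml sketch, assuming the chart forwards these entries verbatim to the server container in the usual Bitnami name/value list format (SPRING_APPLICATION_JSON is a standard Spring Boot mechanism for injecting properties whose keys contain dots):

server:
  extraEnvVars:
    - name: SPRING_APPLICATION_JSON
      value: >-
        {
          "spring.cloud.dataflow.applicationProperties.stream.spring.cloud.stream.kafka.binder.brokers": "pkc...:9092",
          "spring.cloud.dataflow.applicationProperties.stream.spring.cloud.stream.kafka.binder.configuration.ssl.endpoint.identification.algorithm": "https"
        }

helm install my-release -f values.yaml bitnami/spring-cloud-dataflow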
I am trying to read Airflow variables into my ETL job to populate variables in the curation script. I am using the KubernetesPodOperator. How do I access the metadata database from my k8s pod?
Error I am getting from airflow:
ERROR - Unable to retrieve variable from secrets backend (MetastoreBackend). Checking subsequent secrets backend.
This is what I have in main.py for outputting into the console log. I have a key in airflow variables named "AIRFLOW_URL".
import logging
from airflow.models import Variable
logger = logging.getLogger(__name__)
AIRFLOW_VAR_AIRFLOW_URL = Variable.get("AIRFLOW_URL")
logger.info("AIRFLOW_VAR_AIRFLOW_URL: %s", AIRFLOW_VAR_AIRFLOW_URL)
Can anyone point me in the right direction?
Your DAG can pass them as environment variables to your Pod, using a template (e.g. KubernetesPodOperator(... env_vars={"MY_VAR": "{{var.value.my_var}}"}, ...)).
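For example, a minimal DAG sketch (the image name is a placeholder, and the provider import path may differ between cncf-kubernetes provider versions):

from datetime import datetime
from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator

with DAG("curation_etl", start_date=datetime(2023, 1, 1), schedule_interval=None) as dag:
    curation = KubernetesPodOperator(
        task_id="curation",
        name="curation",
        image="my-registry/curation:latest",  # placeholder image
        # The Jinja template is rendered by the worker before the pod starts,
        # so the pod itself never needs access to the metadata database.
        env_vars={"AIRFLOW_URL": "{{ var.value.AIRFLOW_URL }}"},
    )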
It looks like you have a secrets backend set in your config without that backend actually being set up, so Airflow is trying to go there to fetch your variable. See this link.
Alter your config to remove the backend and backend_kwargs keys, and it should look at your Airflow variables first:
[secrets]
backend =
backend_kwargs =
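If you drive the Airflow configuration through environment variables instead of airflow.cfg, the same fix applies there; Airflow maps [secrets] backend to AIRFLOW__SECRETS__BACKEND, so make sure the equivalent variables are not set in the scheduler and worker environments, e.g.:

# remove any secrets-backend overrides from the environment as well
unset AIRFLOW__SECRETS__BACKEND
unset AIRFLOW__SECRETS__BACKEND_KWARGS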
Use Case
The docker-compose.yml defines multiple services which represent the full application stack. In development mode, we'd like to dynamically exclude certain services, so that we can run them from an IDE.
As of Docker Compose 1.28 it is possible to assign profiles to services, as documented here, but as far as I understand, it only lets you specify which services shall be started, not which ones shall be excluded.
Another way I could imagine is to split "excludable" services into their own docker-compose.yml file but all of this seems kind of tedious to me.
Do you have a better way to exclude services?
It seems we both overlooked a certain very important thing about profiles and that is:
Services without a profiles attribute will always be enabled
So if you run docker-compose up with a file like this:
version: "3.9"
services:
  database:
    image: debian:buster
    command: echo database
  frontend:
    image: debian:buster
    command: echo frontend
    profiles: ['frontend']
  backend:
    image: debian:buster
    command: echo backend
    profiles: ['backend']
It will start the database container only. If you run it with docker-compose --profile backend up, it will bring up the database and backend containers. To start everything, you need docker-compose --profile backend --profile frontend up, i.e. one --profile flag per profile you want enabled.
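Spelled out against the file above:

# only services without a profiles attribute (database)
docker-compose up

# database + backend
docker-compose --profile backend up

# the full stack
docker-compose --profile backend --profile frontend up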
That seems to me the best way to keep docker-compose from running certain containers: you just mark them with a profile and you're done. I suggest you give the profiles reference a second look as well; apart from some good examples, it explains how the feature interacts with service dependencies.
I have to deploy one of the Docker images using a PostgreSQL DB. The connection string looks like below; what is the best method I can use?
"postgresql://username#host.name.svc.cluster.local?sslmode=require"
I have used an env entry like below, but it is not working:
- name: DB_ADDRESS
  value: "postgresql://username#tcp(host.name.svc.cluster.local)?sslmode=require"
In the past I had to create a PostgreSQL DB. I would suggest you use the PostgreSQL Helm chart (link).
It gives you a lot of flexibility in configuration.
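If you only need to hand the connection string to the container, passing it through as a plain env entry is usually enough. A minimal Deployment container snippet (the container name and image are placeholders; note that a standard PostgreSQL URI uses '@' between user and host, with no tcp(...) wrapper):

containers:
  - name: myapp            # placeholder container name
    image: myapp:latest    # placeholder image
    env:
      - name: DB_ADDRESS
        value: "postgresql://username@host.name.svc.cluster.local?sslmode=require"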
I am working on a Spring Cloud Data Flow stream app. I am able to run the Spring Cloud Data Flow server locally with Skipper running in Cloud Foundry, using the configuration below. Now I am trying to do the same with Skipper running in a Kubernetes cluster. How can I specify that?
manifest.yml
---
applications:
  - name: poc-scdf-server
    memory: 1G
    instances: 1
    path: ../target/scdf-server-1.0.0-SNAPSHOT.jar
    buildpacks:
      - java_buildpack
    env:
      JAVA_VERSION: 1.8.0_+
      JBP_CONFIG_SPRING_AUTO_RECONFIGURATION: '{enabled: false}'
      SPRING_CLOUD_DATAFLOW_TASK_PLATFORM_CLOUDFOUNDRY_ACCOUNTS[default]_CONNECTION_URL:
      SPRING_CLOUD_DATAFLOW_TASK_PLATFORM_CLOUDFOUNDRY_ACCOUNTS[default]_CONNECTION_ORG: <org>
      SPRING_CLOUD_DATAFLOW_TASK_PLATFORM_CLOUDFOUNDRY_ACCOUNTS[default]_CONNECTION_SPACE: <space>
      SPRING_CLOUD_DATAFLOW_TASK_PLATFORM_CLOUDFOUNDRY_ACCOUNTS[default]_CONNECTION_DOMAIN: <url>
      SPRING_CLOUD_DATAFLOW_TASK_PLATFORM_CLOUDFOUNDRY_ACCOUNTS[default]_CONNECTION_USERNAME: <user>
      SPRING_CLOUD_DATAFLOW_TASK_PLATFORM_CLOUDFOUNDRY_ACCOUNTS[default]_CONNECTION_PASSWORD: <pwd>
      SPRING_CLOUD_DATAFLOW_TASK_PLATFORM_CLOUDFOUNDRY_ACCOUNTS[default]_CONNECTION_SKIPSSLVALIDATION: true
      SPRING_CLOUD_SKIPPER_CLIENT_SERVER_URI: <skipper_url>
      SPRING_CLOUD_GAIA_SERVICES_ENV_KEY_PREFIX: spring.cloud.dataflow.task.platform.cloudfoundry.accounts[default].connection.
In v2.3, we have recently added the platform-specific docker-compose.yml experience for the Local mode. You can find the new files here.
With this infrastructure, you could start SCDF locally, but also bring the platform accounts for CF, K8s, or even both! See docs.
You can also use DockerComposeIT.java to bring things up and running with automation, as well.
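If you want the server to target a Kubernetes platform directly, the environment variables follow the same naming pattern as the Cloud Foundry ones in your manifest, just with the kubernetes account keys. A hedged sketch (namespace and Skipper URL are placeholders; the full property list is described in the SCDF Kubernetes platform documentation):

env:
  SPRING_CLOUD_DATAFLOW_TASK_PLATFORM_KUBERNETES_ACCOUNTS[default]_NAMESPACE: <namespace>
  SPRING_CLOUD_SKIPPER_CLIENT_SERVER_URI: <skipper_url_in_k8s>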
We are using Helm charts for deploying a service to several environments on a Kubernetes cluster. For each environment there is a list of variables such as the database URL, Docker image tag, etc. What is the most obvious and correct way of defining the Helm values.yaml in such a case, where all the Helm template files remain the same for every environment except for the parameters stated above?
One way to do this would be to use multiple values files, which Helm now allows. Assume you have the following values files:
values1.yaml:
image:
  repository: myimage
  tag: 1.3
values2.yaml:
image:
  pullPolicy: Always
These can both be used on the command line with Helm as:
$ helm install -f values1.yaml,values2.yaml <mychart>
In this case, these values will be merged into
image:
  repository: myimage
  tag: 1.3
  pullPolicy: Always
You can see the values that will be used by giving the "--dry-run --debug" options to the "helm install" command.
Order is important. If the same value appears in both files, the values from values2.yaml will take precedence, as it was specified last. Each chart also comes with a values file. Those values will be used for anything not specified in your own values files, as if it were first in the list of values files you provided.
In your case, you could specify all the common settings in values1.yaml and override them as necessary with values2.yaml.
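Applied to your multi-environment setup, a common pattern (file and key names here are just illustrative) is one shared values.yaml plus a small override file per environment, passed in order so the environment file wins:

# values.yaml - shared defaults
image:
  repository: myservice
  tag: 1.0.0

# values-prod.yaml - per-environment overrides
database:
  url: jdbc:postgresql://prod-db:5432/app
image:
  tag: 1.0.3

# deploy to production
helm upgrade --install myservice ./mychart -f values.yaml -f values-prod.yaml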