Kafka broker.id: env variable vs config file precedence - apache-kafka

I'm setting up a Kafka cluster in which I'm setting broker.id=-1 so that broker ids are automatically generated, but in some cases I want to set them using an environment variable (i.e. KAFKA_BROKER_ID).
If I do so, will the nodes with the KAFKA_BROKER_ID environment variable use it, or auto-generate an id?

It depends on how you are deploying your Kafka installation.
Out of the box, Kafka does not read the broker id from environment variables or system properties, so you need to put the value into the .properties file.
(Among other evidence: grepping for KAFKA_BROKER_ID in the Kafka source returns nothing.)
KAFKA_BROKER_ID appears to be added by multiple Docker images, so you'd need to contact the author of the one you are using.
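Concretely, the only place the stock broker reads its id from is the properties file; an image that honors KAFKA_BROKER_ID has to rewrite that line in its entrypoint before launching the broker. A minimal sketch of the relevant line (the rewrite behavior is entirely image-specific):

```properties
# server.properties
# -1 asks Kafka to auto-generate a unique id at startup; an entrypoint that
# honors KAFKA_BROKER_ID would overwrite this value before the broker starts.
broker.id=-1
```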

Related

How to provide custom ActiveMQ Artemis configuration in kubernetes

I am using the Artemis Cloud operator for deploying ActiveMQ Artemis in a k8s cluster. I wanted to change some broker properties that are not available in the ActiveMQ Artemis custom resources. Specifically, I wanted to change the log level from INFO to WARN. Below are the options I came across.
Create a custom broker init image and have a script modify the logging.properties file
Add the properties to the broker.properties config map (which I am not able to do because the config map is immutable)
My questions are:
Are my observations above correct?
Is there an environment variable for this configuration?
Is there a better way to change this specific configuration?
Creating a custom broker init image with a script that modifies the logging.properties file is the only supported way at the moment.
Soon ActiveMQ Artemis will move to SLF4J, and after that ArtemisCloud will provide an easy way to change the default logging properties. The idea of using a config map is great; feel free to raise an issue for the ArtemisCloud operator: https://github.com/artemiscloud/activemq-artemis-operator/issues
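For the init-image route, the change itself is a one-line edit. A sketch, assuming the broker's etc/logging.properties uses the JBoss Logging format of pre-SLF4J Artemis (exact keys and paths vary by version):

```properties
# etc/logging.properties (pre-SLF4J Artemis, JBoss Logging format)
# The init-image script would flip the root level from INFO to WARN:
logger.level=WARN
```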

Is there a way to add variables to Kafka server.properties?

I don't have any experience with Kafka yet and need to automate a task. Is there a way I can use environment variables in the configuration file?
To be more specific:
advertised.listeners=INSIDE://:9092,OUTSIDE://<hostname>:29092
I'd like to extract and use the hostname from my env variables.
Property files offer no variable interpolation.
If you start Kafka via Docker, or write your own shell script that generates the property file before starting the broker, then you can inject values.
Some templating tools that help with this include confd, consul-template, and dockerize.
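The substitution step such tools perform can be sketched in a few lines. Below is a minimal, hypothetical Java version: render a ${VAR} placeholder from an environment map into the property line, then write the result to server.properties before starting the broker. The class name, the HOSTNAME variable, and the hard-coded value are illustrative assumptions.

```java
import java.util.Map;

public class RenderListeners {
    // Substitute ${VAR} placeholders in a template line with values from the
    // given map (in a real wrapper this would be System.getenv()).
    // Unknown placeholders are left untouched.
    static String render(String template, Map<String, String> env) {
        String result = template;
        for (Map.Entry<String, String> e : env.entrySet()) {
            result = result.replace("${" + e.getKey() + "}", e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String template = "advertised.listeners=INSIDE://:9092,OUTSIDE://${HOSTNAME}:29092";
        String line = render(template, Map.of("HOSTNAME", "broker-1.example.com"));
        // A wrapper would append this line to server.properties before
        // invoking kafka-server-start.sh.
        System.out.println(line);
    }
}
```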

environmental variables in bosh deployment

I would like a job J from a release R in a BOSH deployment to start with a certain environment variable E set, which is not available in the job's configurable properties.
Can this be specified in the deployment manifest or when calling the BOSH CLI?
Unfortunately, I am pretty sure this is not possible. BOSH does not understand environment variables. Instead, it executes an ERB template with the properties configured in the manifest. For example, this job template from log-cache is executed with the properties from a manifest along with defaults from the job spec.
If you need to have a particular environment variable set for testing/development, you can bosh ssh onto an instance where you are going to run the job and then mutate the generated file. Given the CF deployment example: bosh ssh doppler/0 and then modify the generated bpm.yml at /var/vcap/jobs/log-cache/config/bpm.yml. This is a workaround for debugging and development only; if you need to set the variable via the manifest, reach out to the release author and open an issue, or submit a PR that adds the environment variable as a property in the job spec.
(Note: the versions used in the example are just from HEAD and may not actually work.)
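If the job runs under bpm (as log-cache does), the generated bpm.yml already has a place for this, so the ssh workaround amounts to adding the variable to the process's env block. A hypothetical sketch; the process name and paths follow the log-cache example above, and E stands in for the variable you need:

```yaml
# /var/vcap/jobs/log-cache/config/bpm.yml (hand-edited for debugging only;
# BOSH will regenerate this file on the next deploy)
processes:
- name: log-cache
  executable: /var/vcap/packages/log-cache/log-cache
  env:
    E: some-value   # the environment variable the job should see
```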

Kafka-connect. Add environment variable to custom converter config

I'm using kafka-connect-elasticsearch with a custom converter, which extends the standard org.apache.kafka.connect.json.JsonConverter.
In my custom converter I need to access an environment variable.
Let's assume, I need to append to every message the name of the cluster, which is written to environment variable CLUSTER.
How can I access my environment variable in the converter?
Maybe read it during the converter configuration phase (in the configure(Map<String, ?> configs) method)?
How can I forward CLUSTER env variable value to this configs map?
You can't get it from that map; it only carries the converter properties from the connector configuration, not the process environment.
You would need to use System.getenv("CLUSTER") directly.

spark-jobserver - managing multiple EMR clusters

I have a production environment that consists of several (persistent and ad-hoc) EMR Spark clusters.
I would like to use one instance of spark-jobserver to manage the job JARs for this environment in general, and to be able to specify the intended master right when I POST /jobs, rather than permanently in the config file (via the master = "local[4]" configuration key).
Obviously I would prefer to have spark-jobserver running on a standalone machine, and not on any of the masters.
Is this somehow possible?
You can write a SparkMasterProvider
https://github.com/spark-jobserver/spark-jobserver/blob/master/job-server/src/spark.jobserver/util/SparkMasterProvider.scala
A complex example is here https://github.com/spark-jobserver/jobserver-cassandra/blob/master/src/main/scala/spark.jobserver/masterLocators/dse/DseSparkMasterProvider.scala
I think all you have to do is write one that returns the configured input as the Spark master; that way you can pass it as part of the job config.