App properties in Spring Cloud Data Flow application

Based on the documentation for Spring Cloud Data Flow (SCDF), only properties prefixed with either "deployer." or "app." are considered when deploying an application (be it a source, processor, or sink) as part of a stream.
However, I've noticed that besides having the prefix, all the properties must be provided as Strings, no matter what their original type is; otherwise, they are simply discarded by SCDF, as per this line of code:
propertiesToUse = DeploymentPropertiesUtils.convert(props);
which does this:
public static Map<String, String> convert(Properties properties) {
    Map<String, String> result = new HashMap<>(properties.size());
    for (String key : properties.stringPropertyNames()) {
        result.put(key, properties.getProperty(key));
    }
    return result;
}
As you can see from the snippet above, it only considers stringPropertyNames(), which filters out any property whose key or value is not a String.
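A quick illustration of that java.util.Properties behaviour (the property names here are made up):

import java.util.Properties;

// stringPropertyNames() only returns keys whose key AND value are both Strings,
// so any value stored as a non-String is silently dropped by convert().
public class PropertiesFilterDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("app.time.fixed-delay", "5"); // String value -> kept
        props.put("app.time.max-messages", 10);         // Integer value -> filtered out
        System.out.println(props.stringPropertyNames()); // prints [app.time.fixed-delay]
    }
}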
I presume this behaviour is intentional, but why? Why not just pick up all the properties defined by the user with the proper prefix?
Thanks for your support.

All the deployment properties are expected to be of type Map<String, String>, based on the contract set by the deployer SPI.
I believe one of the reasons is to pass String keys and values to the target deployment platform without any serialization/deserialization hurdle. Also, using String values mirrors how one would set those key/value pairs as, for example, environment variables in the target deployment platform.
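As a rough illustration of the environment-variable angle (a sketch, not actual deployer code; the property names are made up):

import java.util.Map;

// A String-only map translates directly into environment variables,
// with no type information to serialize or deserialize.
public class EnvVarDemo {
    public static void main(String[] args) {
        Map<String, String> deploymentProperties = Map.of(
                "app.time.fixed-delay", "5",
                "deployer.time.memory", "512m");
        deploymentProperties.forEach((key, value) -> System.out.println(
                key.toUpperCase().replace('.', '_').replace('-', '_') + "=" + value));
    }
}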

Related

Cannot get custom store connected to a Transformer with Spring Cloud Stream Binder Kafka 3.x

Cannot get a custom store connected to my Transformer in Spring Cloud Stream Binder Kafka 3.x (functional style), following the examples from here.
I am defining a KeyValueStore as a @Bean of type StoreBuilder<KeyValueStore<String, Long>>:
@Bean
public StoreBuilder<KeyValueStore<String, Long>> myStore() {
    return Stores.keyValueStoreBuilder(
            Stores.persistentKeyValueStore("my-store"), Serdes.String(),
            Serdes.Long());
}

@Bean
@DependsOn({"myStore"})
public MyTransformer myTransformer() {
    return new MyTransformer("my-store");
}
In the debugger I can see that the beans get initialised.
Then, in my stream processor function:
return myStream -> {
    return myStream
            .peek(..)
            .transform(() -> myTransformer())
            ...
MyTransformer is declared as
public class MyTransformer implements Transformer<String, MyEvent, KeyValue<KeyValue<String, Long>, MyEvent>> {
    ...
    @Override
    public void init(final ProcessorContext context) {
        this.context = context;
        this.myStore = context.getStateStore(storeName);
    }
I am getting the following error when the application context starts up from my unit test:
Caused by: org.apache.kafka.streams.errors.StreamsException: Processor KSTREAM-TRANSFORM-0000000002 has no access to StateStore my-store as the store is not connected to the processor. If you add stores manually via '.addStateStore()' make sure to connect the added store to the processor by providing the processor name to '.addStateStore()' or connect them via '.connectProcessorAndStateStores()'. DSL users need to provide the store name to '.process()', '.transform()', or '.transformValues()' to connect the store to the corresponding operator, or they can provide a StoreBuilder by implementing the stores() method on the Supplier itself. If you do not add stores manually, please file a bug report at https://issues.apache.org/jira/projects/KAFKA.
In the application startup logs when running my unit test, I can see that the store seems to get created:
2021-04-06 00:44:43.806 INFO [ main] .k.s.AbstractKafkaStreamsBinderProcessor : state store my-store added to topology
I'm already using pretty much every feature of the Spring Cloud Stream Binder Kafka in my app, and everything works very well from my unit tests. Unexpectedly, I got stuck adding the custom KeyValueStore to my Transformer. It would be great if you could spot an error in my setup.
The versions I'm using right now:
org.springframework.boot:spring-boot:jar:2.4.4
org.springframework.kafka:spring-kafka:jar:2.6.7
org.springframework.kafka:spring-kafka-test:jar:2.6.7
org.springframework.cloud:spring-cloud-stream-binder-kafka-streams:jar:3.0.4.RELEASE
org.apache.kafka:kafka-streams:jar:2.7.0
I've just tried with
org.springframework.cloud:spring-cloud-stream-binder-kafka-streams:jar:3.1.3-SNAPSHOT
and the issue seems to persist.
In your processor function, when you call .transform(() -> myTransformer()), you also need to provide the state store names in order for the store to be connected to that transformer. There are overloaded transform methods in the KStream API that take state store names as a vararg. I wonder if this is the issue you are running into. You may want to change that call to .transform(() -> myTransformer(), "my-store") (note that this is the store's name as given to the StoreBuilder, not the bean name).
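Concretely, a minimal sketch of the corrected processor function, reusing MyEvent, myTransformer(), and the "my-store" name from the question:

import java.util.function.Function;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.KStream;

@Bean
public Function<KStream<String, MyEvent>, KStream<KeyValue<String, Long>, MyEvent>> process() {
    return myStream -> myStream
            // the vararg after the TransformerSupplier connects the store to this processor
            .transform(() -> myTransformer(), "my-store");
}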

Kafka Streams - ACLs for Internal Topics

I'm trying to set up a secure Kafka cluster and am having a bit of difficulty with ACLs.
The Confluent security guide for Kafka Streams (https://docs.confluent.io/current/streams/developer-guide/security.html) simply states that the Cluster Create ACL has to be given to the principal... but it doesn't say anything about how to actually handle the internal topics.
Through research and experimentation, I've determined (for Kafka version 1.0.0):
Wildcards cannot be used along with text for topic names in ACLs. For example, since all internal topics are prefixed with the application id, my first thought was to apply an ACL to topics matching '<application.id>-*'. This doesn't work.
Topics created by the Streams API do not get read/write access granted to the creator automatically.
Are the exact names of the internal topics predictable and consistent? In other words, if I run my application on a dev server, will the exact same topics be created on the production server when run? If so, then I can just add ACLs derived from dev before deploying. If not, how should the ACLs be added?
Are the exact names of the internal topics predictable and consistent? In other words, if I run my application on a dev server, will the exact same topics be created on the production server when run?
Yes, you'll get the same exact topics names from run to run. The DSL generates processor names with a function that looks like this:
public String newProcessorName(final String prefix) {
    return prefix + String.format("%010d", index.getAndIncrement());
}
(where index is just an incrementing integer). Those processor names are then used to create repartition topics with a function that looks like this (the parameter name is a processor name generated as above):
static <K1, V1> String createReparitionedSource(final InternalStreamsBuilder builder,
                                                final Serde<K1> keySerde,
                                                final Serde<V1> valSerde,
                                                final String topicNamePrefix,
                                                final String name) {
    Serializer<K1> keySerializer = keySerde != null ? keySerde.serializer() : null;
    Serializer<V1> valSerializer = valSerde != null ? valSerde.serializer() : null;
    Deserializer<K1> keyDeserializer = keySerde != null ? keySerde.deserializer() : null;
    Deserializer<V1> valDeserializer = valSerde != null ? valSerde.deserializer() : null;
    String baseName = topicNamePrefix != null ? topicNamePrefix : name;

    String repartitionTopic = baseName + REPARTITION_TOPIC_SUFFIX;
    String sinkName = builder.newProcessorName(SINK_NAME);
    String filterName = builder.newProcessorName(FILTER_NAME);
    String sourceName = builder.newProcessorName(SOURCE_NAME);

    builder.internalTopologyBuilder.addInternalTopic(repartitionTopic);
    builder.internalTopologyBuilder.addProcessor(filterName, new KStreamFilter<>(new Predicate<K1, V1>() {
        @Override
        public boolean test(final K1 key, final V1 value) {
            return key != null;
        }
    }, false), name);

    builder.internalTopologyBuilder.addSink(sinkName, repartitionTopic, keySerializer, valSerializer,
            null, filterName);
    builder.internalTopologyBuilder.addSource(null, sourceName, new FailOnInvalidTimestamp(),
            keyDeserializer, valDeserializer, repartitionTopic);

    return sourceName;
}
If you don't change your topology (for example, the order in which it's built), you'll get the same results no matter where the topology is constructed, presuming you're using the same version of Kafka Streams. So a repartition topic that comes out as, say, <application.id>-KSTREAM-KEY-SELECT-0000000001-repartition in dev will have exactly the same name in production.
If so, then I can just add ACLs derived from dev before deploying. If not, how should the ACLs be added?
I have not used ACLs, but I imagine that since these are just regular topics, then yeah, you can apply ACLs to them. The security guide does mention:
When applications are run against a secured Kafka cluster, the principal running the application must have the ACL --cluster --operation Create set so that the application has the permissions to create internal topics.
I've been wondering about this myself, though, so if I am wrong I am guessing someone from Confluent will correct me.
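For what it's worth, granting that cluster-level Create ACL with the stock CLI would look something like this (a sketch; the principal name and ZooKeeper address are made up):

bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 \
    --add --allow-principal User:streams-app \
    --cluster --operation Create

Since the internal topic names are stable across environments (see above), per-topic read/write ACLs can then be added using the exact names observed in dev.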

How to pass any number of headers to Feign client without knowing all the names?

I have a use case where I need to pass all headers that start with a certain prefix to the Feign client. I don't know the number or exact names of these headers in advance. There doesn't seem to be a way to do this easily, as the Feign client expects all headers to be specified using @RequestHeader("name"). It doesn't seem to support something like @RequestHeader HttpHeaders, which would be very useful.
Any suggestions?
As of this writing, Feign doesn't support dynamic headers or query parameters using a Map. The Spring Cloud Feign client relies on the Spring annotations instead of Feign annotations, and the implementations of AnnotatedParameterProcessor have a bug such that they don't do what the documentation states they should be doing.
RequestHeader doc:
If the method parameter is Map, MultiValueMap, or HttpHeaders then the
map is populated with all header names and values.
RequestParam doc:
If the method parameter is Map or MultiValueMap and a parameter name
is not specified, then the map parameter is populated with all request
parameter names and values.
I submitted a pull request that will fix this. Until then, I'm using an extension of SpringMvcContract that uses my own AnnotatedParameterProcessor implementations. I set the custom SpringMvcContract using a Feign.Builder as follows:
@Autowired
FormattingConversionService feignConversionService;

@Bean
@Scope(SCOPE_PROTOTYPE)
public Feign.Builder feignBuilder() {
    return HystrixFeign.builder()
            .contract(feignContract());
}

@Bean
public Contract feignContract() {
    return new EnhancedSpringMvcContract(feignConversionService);
}
From the documentation, you should be able to specify a header map for dynamic headers.
In cases where both the header field keys and values are dynamic and the range of possible keys cannot be known ahead of time and may vary between different method calls in the same api/client (e.g. custom metadata header fields such as "x-amz-meta-" or "x-goog-meta-"), a Map parameter can be annotated with HeaderMap to construct a query that uses the contents of the map as its header parameters.
@RequestLine("POST /")
void post(@HeaderMap Map<String, Object> headerMap);
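To cover the asker's prefix use case, the map could be built from the incoming request; a hedged sketch using the plain Servlet API (the method name and prefix handling are illustrative):

import java.util.Enumeration;
import java.util.HashMap;
import java.util.Map;
import javax.servlet.http.HttpServletRequest;

// Collect all request headers whose name starts with the given prefix,
// ready to be passed to the @HeaderMap parameter above.
public static Map<String, Object> headersWithPrefix(HttpServletRequest request, String prefix) {
    Map<String, Object> headers = new HashMap<>();
    Enumeration<String> names = request.getHeaderNames();
    while (names.hasMoreElements()) {
        String name = names.nextElement();
        if (name.toLowerCase().startsWith(prefix.toLowerCase())) {
            headers.put(name, request.getHeader(name));
        }
    }
    return headers;
}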

spring cloud programmatic metadata generation

Is there any way to generate some metadata and add it to the service when it registers?
We are moving from Eureka to Consul, and I need to add a UUID value to the registered metadata when a service starts, so that later I can read that metadata value back when I retrieve the service instances by name.
Some background: we were using this excellent front-end UI from https://github.com/VanRoy/spring-cloud-dashboard. It is built around the Eureka model for services, in which you have an application with a name, and each application has multiple instances, each with an instance id.
So with the Eureka model there is a two-level service description, whereas the Spring Cloud model is a flat one: n instances, each of which has a service id.
The flat model won't work with the UI referenced above, since it relies on the distinction between application name and instance id; in the Spring model these are the same.
So if I generate my own instance id and expose it through metadata, I can preserve some of the behaviour without rewriting the UI.
See the documentation on metadata and tags in Spring Cloud Consul. Consul doesn't support metadata on service discovery yet, but Spring Cloud has a metadata abstraction (just a map of strings). Consul tags created in key=value style are parsed into that metadata map.
For example, in application.yml:
spring:
  cloud:
    consul:
      discovery:
        tags: foo=bar, baz
The above configuration will result in a map with foo→bar and baz→baz.
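On the consumer side, the parsed metadata can then be read back through the standard ServiceInstance abstraction; a small sketch (the method name is illustrative):

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cloud.client.ServiceInstance;
import org.springframework.cloud.client.discovery.DiscoveryClient;

@Autowired
private DiscoveryClient discoveryClient;

// Look up the "foo" metadata entry of the first instance of a service.
public String lookupFoo(String serviceId) {
    ServiceInstance instance = discoveryClient.getInstances(serviceId).get(0);
    return instance.getMetadata().get("foo"); // "bar" with the configuration above
}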
Based on Spencer's answer, I added an EnvironmentPostProcessor to my code.
It works and I am able to add the metadata tag I want programmatically, but rather than complementing the "tags: foo=bar, baz" element, it overrides it. I will probably figure out a way around that in the next day or so, but I thought I would add what I did for others who look at this answer and wonder: so, what did you do?
First, add a class as follows:
@Slf4j
public class MetaDataEnvProcessor implements EnvironmentPostProcessor, Ordered {

    // Run before ConfigFileApplicationListener
    private int order = ConfigFileApplicationListener.DEFAULT_ORDER - 1;

    private UUID instanceId = UUID.randomUUID();

    @Override
    public int getOrder() {
        return this.order;
    }

    @Override
    public void postProcessEnvironment(ConfigurableEnvironment environment,
            SpringApplication application) {
        LinkedHashMap<String, Object> map = new LinkedHashMap<>();
        map.put("spring.cloud.consul.discovery.tags", "instanceId=" + instanceId.toString());
        MapPropertySource propertySource = new MapPropertySource("springCloudConsulTags", map);
        environment.getPropertySources().addLast(propertySource);
    }
}
Then add a spring.factories file in resources/META-INF with the following line to register this processor:
org.springframework.boot.env.EnvironmentPostProcessor=com.example.consul.MetaDataEnvProcessor
This works fine, except that it overrides whatever tags are set in your application.yml file.
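One possible way around the override (an untested sketch): order the processor after ConfigFileApplicationListener instead of before it, so the tags from the configuration files are already visible, then merge rather than replace:

// private int order = ConfigFileApplicationListener.DEFAULT_ORDER + 1;

// Inside postProcessEnvironment, append to any tags already configured:
String existing = environment.getProperty("spring.cloud.consul.discovery.tags");
String merged = (existing == null || existing.isEmpty())
        ? "instanceId=" + instanceId
        : existing + ", instanceId=" + instanceId;
map.put("spring.cloud.consul.discovery.tags", merged);

// and register the source with higher precedence than the config files:
environment.getPropertySources().addFirst(propertySource);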

Alternate way of configuring data sources in quartz scheduler properties file

We are configuring the Quartz Scheduler data sources as specified in the documentation, that is, by providing all the details in plain text, without encrypting the database credentials. As a result, the database details are exposed, and anyone with access to the file system can easily get hold of them.
So, are there other ways to provide the data source details via an API, or to encrypt the database details and supply them as part of the quartz.properties file?
On class "StdSchedulerFactory" you can call the method "initialize(Properties props)" to set needed propertries by API. Then you don't need a property-file. (See: StdSchedulerFactory API)
Example:
public Scheduler createSchedulerWithProperties(Properties props)
        throws SchedulerException {
    StdSchedulerFactory factory = new StdSchedulerFactory(props);
    return factory.getScheduler();
}
But then you have to set all the properties of the SchedulerFactory yourself, including those that would otherwise get default values from the default constructor. (Search for 'quartz.properties' inside quartz-2.2.X.jar to find Quartz's default property values.)
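Putting this together for the original question, a hedged sketch of building the data source details in code so the credentials can be decrypted at runtime rather than stored in plain text. Here decrypt(), encryptedUser, and encryptedPassword are placeholders for whatever encryption utility you use, and the JDBC settings are made up:

import java.util.Properties;
import org.quartz.Scheduler;
import org.quartz.SchedulerException;
import org.quartz.impl.StdSchedulerFactory;

public Scheduler createSecureScheduler() throws SchedulerException {
    Properties props = new Properties();
    props.setProperty("org.quartz.scheduler.instanceName", "MyScheduler");
    props.setProperty("org.quartz.threadPool.class", "org.quartz.simpl.SimpleThreadPool");
    props.setProperty("org.quartz.threadPool.threadCount", "5");
    props.setProperty("org.quartz.jobStore.class", "org.quartz.impl.jdbcjobstore.JobStoreTX");
    props.setProperty("org.quartz.jobStore.dataSource", "myDS");
    props.setProperty("org.quartz.dataSource.myDS.driver", "org.postgresql.Driver");
    props.setProperty("org.quartz.dataSource.myDS.URL", "jdbc:postgresql://localhost/quartz");
    // decrypt(...) is a placeholder; only decrypted values live in memory, nothing sensitive on disk
    props.setProperty("org.quartz.dataSource.myDS.user", decrypt(encryptedUser));
    props.setProperty("org.quartz.dataSource.myDS.password", decrypt(encryptedPassword));
    return new StdSchedulerFactory(props).getScheduler();
}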