Migrate Spring Cloud Stream listener (Kafka) from declarative to functional model

I'm trying to migrate an implementation of Spring Cloud Stream (Kafka) from the declarative style to the recommended functional model.
In this blog post they say:
...a functional programming model in Spring Cloud Stream (SCSt). It’s
less code, less configuration. Most importantly, though, your code is
completely decoupled and independent from the internals of SCSt
My current implementation:
Declaring the MessageChannel:
@Input(PRODUCT_INPUT_TOPIC)
MessageChannel productInputChannel();
Using @StreamListener, which is deprecated now:
@StreamListener(StreamConfig.PRODUCT_INPUT_TOPIC)
public void addProduct(@Payload Product product, @Header Long header1, @Header String header2)

Here it is:
@Bean
public Consumer<Product> addProduct() {
    return product -> {
        // your code
    };
}
I am not sure what the value of PRODUCT_INPUT_TOPIC is, but let's assume it is input.
Spring Cloud Stream will then automatically create a binding for you named addProduct-in-0 (see the reference documentation for the details). You can use it as is, but if you still want to use the custom name, you can set spring.cloud.stream.function.bindings.addProduct-in-0=input.
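For example, a minimal application.properties sketch (the destination and group values below are assumptions, not taken from the question):
spring.cloud.stream.function.bindings.addProduct-in-0=input
spring.cloud.stream.bindings.input.destination=product-topic
spring.cloud.stream.bindings.input.group=product-service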
If you need access to headers, you can just take a Message as the input argument.
Here it is:
@Bean
public Consumer<Message<Product>> addProduct() {
    return message -> {
        Product product = message.getPayload();
        // your code
    };
}
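The individual header values from your original @StreamListener signature are then available from the MessageHeaders. A minimal sketch (the header keys "header1" and "header2" are assumptions, since the original @Header annotations don't show the actual key names):
@Bean
public Consumer<Message<Product>> addProduct() {
    return message -> {
        Product product = message.getPayload();
        // header keys are assumptions; use whatever keys your producer actually sets
        Long header1 = message.getHeaders().get("header1", Long.class);
        String header2 = message.getHeaders().get("header2", String.class);
        // your code
    };
}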

Related

Error handling in Spring Cloud Kafka Streams

I'm using Spring Cloud Stream with Kafka Streams. Let's say I have a processor which is a Function that converts a KStream of Strings to a KStream of CityProgrammes. It invokes an API to find the city by name, and another transformation finds any events near that city.
Now the problem is that if any error happens during the transformation, the whole application stops. I want to send that one particular message to a DLQ and move along. I've been reading for days and everyone suggests handling errors within the called services, but that is nonsense in my opinion, plus I still need to return a KStream: how do I do that within a catch?
I also looked at UncaughtExceptionHandler but it is not aware of the message and is only able to restart the processing, which won't skip this invalid message.
This might sound like an A-B problem so the question rephrased: how do I maintain the flow in a KStream when an exception occurs and send the invalid item to the DLQ?
When it comes to the application-level errors you have, it is up to the application itself how the error is handled. Kafka Streams and the Spring Cloud Stream binder mainly support deserialization and serialization errors at the framework level. Although that is the case, I think your scenario can be handled. If you are using a Kafka client prior to 2.8, here is an SO answer I gave before on something similar: https://stackoverflow.com/a/66749750/2070861
If you are using Kafka/Streams 2.8, here is an idea that you can use. However, the code below should only be used as a starting point. Adjust it according to your use case. Read more on how branching works in Kafka Streams 2.8; the branching API was significantly refactored in 2.8 compared to prior versions.
public Function<KStream<?, String>, KStream<?, Foo>> convert() {
    // single-element holder so the branching lambda can capture the converted value
    Foo[] foo = new Foo[1];
    return input -> {
        final Map<String, ? extends KStream<?, String>> branches =
                input.split(Named.as("foo-")).branch((key, value) -> {
                    try {
                        foo[0] = new Foo(); // your API call for the CityProgramme conversion here, possibly.
                        return true;
                    }
                    catch (Exception e) {
                        // send the failing record to the DLT programmatically
                        Message<?> message = MessageBuilder.withPayload(value).build();
                        streamBridge.send("to-my-dlt", message);
                        return false;
                    }
                }, Branched.as("bar"))
                .defaultBranch();
        final KStream<?, String> kStream = branches.get("foo-bar");
        return kStream.map((key, value) -> new KeyValue<>("", foo[0]));
    };
}
The default branch is ignored in this code because it only contains the records that threw exceptions; those were handled in the catch block above, where we send them to a DLT programmatically. Finally, we take the good records, map them to a new KStream, and send it through the outbound binding.
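If you want the to-my-dlt binding used by StreamBridge to point at a concrete topic, you can configure it like any other binding. A minimal sketch (the topic name is an assumption):
spring.cloud.stream.bindings.to-my-dlt.destination=my-dlt-topic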

Cannot get custom store connected to a Transformer with Spring Cloud Stream Binder Kafka 3.x

Cannot get custom store connected to my Transformer in Spring Cloud Stream Binder Kafka 3.x (functional style) following examples from here.
I am defining a KeyValueStore as a @Bean with type StoreBuilder<KeyValueStore<String,Long>>:
@Bean
public StoreBuilder<KeyValueStore<String, Long>> myStore() {
    return Stores.keyValueStoreBuilder(
            Stores.persistentKeyValueStore("my-store"), Serdes.String(),
            Serdes.Long());
}
@Bean
@DependsOn({"myStore"})
public MyTransformer myTransformer() {
    return new MyTransformer("my-store");
}
In debugger I can see that the beans get initialised.
In my stream processor function then:
return myStream -> {
return myStream
.peek(..)
.transform(() -> myTransformer())
...
MyTransformer is declared as
public class MyTransformer implements Transformer<String, MyEvent, KeyValue<KeyValue<String,Long>, MyEvent>> {
...
@Override
public void init(final ProcessorContext context) {
    this.context = context;
    this.myStore = context.getStateStore(storeName);
}
Getting the following error when application context starts up from my unit test:
Caused by: org.apache.kafka.streams.errors.StreamsException: Processor KSTREAM-TRANSFORM-0000000002 has no access to StateStore my-store as the store is not connected to the processor. If you add stores manually via '.addStateStore()' make sure to connect the added store to the processor by providing the processor name to '.addStateStore()' or connect them via '.connectProcessorAndStateStores()'. DSL users need to provide the store name to '.process()', '.transform()', or '.transformValues()' to connect the store to the corresponding operator, or they can provide a StoreBuilder by implementing the stores() method on the Supplier itself. If you do not add stores manually, please file a bug report at https://issues.apache.org/jira/projects/KAFKA.
In the application startup logs when running my unit test, I can see that the store seems to get created:
2021-04-06 00:44:43.806 INFO [ main] .k.s.AbstractKafkaStreamsBinderProcessor : state store my-store added to topology
I'm already using pretty much every feature of the Spring Cloud Stream Binder Kafka in my app and from my unit test, everything works very well. Unexpectedly, I got stuck at adding the custom KeyValueStore to my Transformer. It would be great, if you could spot an error in my setup.
The versions I'm using right now:
org.springframework.boot:spring-boot:jar:2.4.4
org.springframework.kafka:spring-kafka:jar:2.6.7
org.springframework.kafka:spring-kafka-test:jar:2.6.7
org.springframework.cloud:spring-cloud-stream-binder-kafka-streams:jar:3.0.4.RELEASE
org.apache.kafka:kafka-streams:jar:2.7.0
I've just tried with
org.springframework.cloud:spring-cloud-stream-binder-kafka-streams:jar:3.1.3-SNAPSHOT
and the issue seems to persist.
In your processor function, when you call .transform(() -> myTransformer()), you also need to provide the state store names in order for the store to be connected to that transformer. There are overloaded transform methods in the KStream API that take state store names as a vararg. I wonder if this is the issue that you are running into. You may want to change that call to .transform(() -> myTransformer(), "my-store").
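A minimal sketch of the processor function with the store name supplied (the rest of your topology stays as it was; the peek body is just a placeholder):
return myStream -> {
    return myStream
            .peek((key, value) -> { /* your existing peek logic */ })
            // passing the store name here connects "my-store" to the transformer
            .transform(() -> myTransformer(), "my-store");
};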

Reactive end-points design

I'm learning WebFlux.
Wikipedia says that reactive programming is:
For example, in an imperative programming setting, a:=b+c would mean that a is being assigned the result of b+c in the instant the expression is evaluated, and later, the values of b and/or c can be changed with no effect on the value of a.
However, in reactive programming, the value of a is
automatically updated whenever the values of b and/or c change; without the program having to re-execute the sentence
a:=b+c to determine the presently assigned value of a.
OK. When I'm reproducing an example like:
@RestController
public class PersonController {

    private final PersonRepository repository;

    public PersonController(PersonRepository repository) {
        this.repository = repository;
    }

    @PostMapping("/person")
    Mono<Void> create(@RequestBody Publisher<Person> personStream) {
        return this.repository.save(personStream).then();
    }

    @GetMapping("/person")
    Flux<Person> list() {
        return this.repository.findAll();
    }

    @GetMapping("/person/{id}")
    Mono<Person> findById(@PathVariable String id) {
        return this.repository.findOne(id);
    }
}
I'm POSTing 2 persons (in Chrome tab 1).
Then getting the list of all persons (in Chrome tab 2).
Then adding one more person (in Chrome tab 3).
Then when I go back to tab 2 (without refreshing), I don't see an updated list of persons; should I?
Also, how should UPDATE/DELETE operations work here?
I guess you're referring to the reactive programming wikipedia page and maybe reading too much into that example.
This example (and the famous spreadsheet one) usually points to UI-rich applications that listen to user events and publish application events to update the UI.
Reactive programming and Reactive Streams by themselves aren't enough to set up such an infrastructure.
In your Controller, operations are performed and values are published in a reactive way: with backpressure support and access to a reactive API to compose them. Once the JSON response is rendered, the client doesn't receive new elements from the server.
You can create such a system though, by publishing events and having a persistent connection (SSE, for example) between server and browser.
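For illustration, a minimal server-sent events endpoint sketch (the /person/stream path and the method name are assumptions; note that with a finite repository query this only streams the rows that exist at request time, so a truly live view needs an infinite source such as a tailable cursor or an event publisher):
@GetMapping(value = "/person/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
Flux<Person> streamPersons() {
    // the browser keeps the connection open and receives each Person as a server-sent event
    return this.repository.findAll();
}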

Streaming custom objects

I am building a SOAP webservice. I am using JAX-WS to create this service and deploying it on a Glassfish 3.1.2 server.
I have no problem having this service return a String build with the XML representation of what I want. I can also get it to return a specific object. What I am having issues with is streaming this resource.
This is what I have so far:
Interface:
@MTOM
@WebService
@XmlRootElement(name="root.element.class.location")
@SOAPBinding(style = Style.RPC, use = Use.LITERAL)
public interface ResultsServer {

    @WebMethod
    @XmlMimeType("text/xml")
    public Test getResultDataAsXML(@WebParam(name="Id") Integer id) throws Exception;
}
Implementation:
---- Edit ----
This is where I would like to stream my resource. Let's say my results object becomes extremely large; I don't want to hold it all in memory and would like to start sending it without holding it (see the comment in the code).
@WebService(endpointInterface = "my.endpoint.class")
@StreamingAttachment(parseEagerly = true, memoryThreshold = 4000000L)
public class ResultsServerImpl implements ResultsServer {

    @Override
    public Test getResultDataAsXML(Integer id) throws Exception {
        Test results = new Test();
        for (int i = 0; i < [very large number]; i++) {
            results.getResults().add("here : " + i);
            /* at one point, this is too large to hold in memory;
               I would like to be able to start returning the object here
               so it is not taking up all available memory */
        }
        return results; // or close the stream
    }
}
---- End Edit ----
And my Test class is a simple class looking like this:
public class Test {

    private ArrayList<String> results;

    public Test() {
        results = new ArrayList<String>();
    }

    public ArrayList<String> getResults() {
        return results;
    }

    public void setResults(ArrayList<String> results) {
        this.results = results;
    }
}
Let's assume that this Test object becomes very big (and more complex). I need to be able to stream this object. How would I proceed in streaming it?
Ideally, I would like to keep the structure of this object.
From what I have read so far, I would need to convert this object into some sort of DataHandler and return that.
Any help is welcome! Thank you.
The JAX-WS implementation will leverage a JAXB implementation to marshal the object (most likely to a StAX XMLStreamWriter) so the output will be streamed (there won't be an XML document created in memory).
@BlaiseDoughan I think you've worded this the way I was looking for.
Yes, that would be to prevent the instance of Test from being fully held
in memory. Is there a way to do this?
If you want the data to appear in the messages as XML (as opposed to a SOAP attachment), then you could leverage JAXB's marshal events. In the beforeMarshal event you could load data into the object and then clear it in the afterMarshal method. Ultimately all the data will be pulled in, but it won't all be referenced at the same time.
http://docs.oracle.com/javase/6/docs/api/javax/xml/bind/Unmarshaller.html#unmarshalEventCallback
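A minimal sketch of what such callbacks could look like on the Test class (the lazy-loading helper is hypothetical; JAXB discovers these methods by their names, no annotation is required):
public class Test {

    private ArrayList<String> results = new ArrayList<String>();

    // invoked by JAXB just before this instance is marshalled
    void beforeMarshal(Marshaller marshaller) {
        // hypothetical: fetch the large result set only when it is about to be written
        results.addAll(loadResultsFromSomewhere());
    }

    // invoked by JAXB right after this instance has been marshalled
    void afterMarshal(Marshaller marshaller) {
        // release the data so the full list is not kept referenced in memory
        results.clear();
    }

    public ArrayList<String> getResults() {
        return results;
    }

    private List<String> loadResultsFromSomewhere() {
        // hypothetical helper; replace with your actual data source
        return Collections.emptyList();
    }
}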
I would recommend using the XStream library (http://x-stream.github.io/) from ThoughtWorks for your streaming, as it is bindable on both sides of your service and is compatible with SOAP envelopes. In fact there is even an integration with ActiveSOAP.
An example of a SOAP envelope wrapped xstream object can be seen at http://jira.codehaus.org/secure/attachment/19097/SoapEnvelopeTestCase.java. A full usage from jboss can be seen at https://issues.jboss.org/secure/attachment/12325534/SOAPClient.java?_sscc=t.
XStream has been used for some very large streaming processes (I've used it for some large 100+ MB text objects without issue).
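A minimal XStream usage sketch (the alias, the target file, and the testInstance variable are placeholders):
Test testInstance = new Test(); // placeholder instance
XStream xstream = new XStream();
xstream.alias("test", Test.class);

// write the object graph directly to an OutputStream instead of building the XML in memory
try (OutputStream out = new FileOutputStream("results.xml")) {
    xstream.toXML(testInstance, out);
}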

Spring DefaultMessageListenerContainer/SimpleMessageListenerContainer (JMS/AMQP) Annotation configuration

So I'm working on a project where many teams are using common services and following a common architecture. One of the services in use is messaging, currently JMS with ActiveMQ. Pretty much all teams are required to follow a strict set of rules for creating and sending messages, namely, everything is pub-subscribe and the messages that are sent are somewhat like the following:
public class WorkDTO {
    private String type;
    private String subtype;
    private String category;
    private String jsonPayload; // converted custom Java object
}
The 'jsonPayload' comes from a base class that all teams extend from so it has common attributes.
So basically in JMS, everyone is always sending the same kind of message, but to different ActiveMQ topics. When the message (WorkDTO) is sent via JMS, it is first converted into JSON and then sent in a TextMessage.
Whenever a team wishes to create a subscriber for a topic, they create a DefaultMessageListenerContainer and configure it appropriately to receive messages (We are using Java-based Spring configuration). Basically every DefaultMessageListenerContainer that a team defines is pretty much the same except for maybe the destination from which to receive messages and the message handler.
I was wondering how anyone would approach further abstracting the messaging configuration via annotations in such a case? Meaning, since everyone is pretty much required to follow the same requirements, could something like the following be useful:
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
public @interface Listener {
    String destination();
    boolean durable() default false;
    long receiveTimeout() default -1; // -1 = use JMS default
    String defaultListenerMethod() default "handleMessage";
    // more config details here
}
@Listener(destination = "PX.Foo", durable = true)
public class FooListener {

    private ObjectMapper mapper = new ObjectMapper(); // converts JSON Strings to Java classes

    public void handleMessage(TextMessage message) throws Exception {
        String text = message.getText();
        WorkDTO dto = mapper.readValue(text, WorkDTO.class);
        String payload = dto.getPayload();
        String type = dto.getType();
        String subType = dto.getSubType();
        String category = dto.getCategory();
    }
}
Of course I left out the part on how to configure the DefaultMessageListenerContainer by use of the #Listener annotation. I started looking into a BeanFactoryPostProcessor to create the necessary classes and add them to the application context, but I don't know how to do all that.
The reason I ask the question is that we are switching to AMQP/RabbitMQ from JMS/ActiveMQ and would like to abstract the messaging configuration even further by use of annotations. I know AMQP is not like JMS so the configuration details would be slightly different. I don't believe we will be switching from AMQP to something else.
Here teams only need to know the name of the destination and whether they want to make their subscription durable.
This is just something that popped into my head just recently. Any thoughts on this?
I don't want to do something overly complicated though so the other alternative is to create a convenience method that returns a pre-configured DefaultMessageListenerContainer given a destination and a message handler:
@Configuration
public class MyConfig {

    @Autowired
    private MessageConfigFactory configFactory;

    @Bean
    public DefaultMessageListenerContainer fooListenerContainer() {
        return configFactory.getListenerContainer("PX.Foo", new FooListener(), true);
    }
}

class MessageConfigFactory {

    public DefaultMessageListenerContainer getListenerContainer(String destination, Object listener, boolean durable) {
        DefaultMessageListenerContainer l = new DefaultMessageListenerContainer();
        // configuration details here
        return l;
    }
}
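For what it's worth, a sketch of what those configuration details might look like (the ConnectionFactory wiring, the durable subscription naming, and the use of MessageListenerAdapter are assumptions; adjust to your setup):
class MessageConfigFactory {

    private final ConnectionFactory connectionFactory; // assumed to be injected/configured elsewhere

    MessageConfigFactory(ConnectionFactory connectionFactory) {
        this.connectionFactory = connectionFactory;
    }

    public DefaultMessageListenerContainer getListenerContainer(String destination, Object listener, boolean durable) {
        DefaultMessageListenerContainer l = new DefaultMessageListenerContainer();
        l.setConnectionFactory(connectionFactory);
        l.setDestinationName(destination);
        l.setPubSubDomain(true); // everything is pub-sub (topics)
        if (durable) {
            // durable subscriptions also require a client id on the underlying connection
            l.setSubscriptionDurable(true);
            l.setDurableSubscriptionName(destination + "-subscription"); // naming scheme is an assumption
        }
        // MessageListenerAdapter invokes "handleMessage" on the delegate by default
        l.setMessageListener(new MessageListenerAdapter(listener));
        return l;
    }
}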