Kafka and Microservices using Producer Consumer - apache-kafka

I need to use Apache Kafka in my microservices project. I need one microservice to produce data and another to consume that same data. How can I use Kafka to pass the data between the two services?

I would recommend taking a look at Spring Cloud Stream, as it does exactly what you need.
From the docs:
Framework for building message-driven microservices. Spring Cloud Stream builds upon Spring Boot to create DevOps friendly microservice applications and Spring Integration to provide connectivity to message brokers. Spring Cloud Stream provides an opinionated configuration of message brokers, introducing the concepts of persistent pub/sub semantics, consumer groups and partitions across several middleware vendors. This opinionated configuration provides the basis to create stream processing applications.
By adding @EnableBinding to your main application, you get immediate connectivity to a message broker, and by adding @StreamListener to a method, you will receive events for stream processing.
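For illustration, a minimal consumer-side sketch of that annotation model (class name and topic are placeholders; note that later Spring Cloud Stream releases replaced this annotation model with a functional style):

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.cloud.stream.messaging.Sink;

@SpringBootApplication
@EnableBinding(Sink.class) // binds the Sink.INPUT channel to a broker destination
public class ConsumerApplication {

    public static void main(String[] args) {
        SpringApplication.run(ConsumerApplication.class, args);
    }

    // Invoked for every message arriving on the bound topic; the topic itself is
    // configured externally, e.g. spring.cloud.stream.bindings.input.destination=my-topic
    @StreamListener(Sink.INPUT)
    public void handle(String payload) {
        System.out.println("Received: " + payload);
    }
}
```

The producing microservice would do the mirror image with `Source.class` and send messages through the bound output channel; the broker (Kafka here) is selected just by putting the Kafka binder on the classpath.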

Related

What is the difference between Apache Kafka and Kafka Streams on Spring Cloud Stream?

On the Spring Cloud website (https://spring.io/projects/spring-cloud-stream), the available binder options are listed, and among them are Apache Kafka and Kafka Streams.
What's the difference between them?
For what purpose should we choose one over the other?
The Apache Kafka binder is used for basic Kafka client usage: the consumer/producer APIs.
The Kafka Streams binder is built on top of the base Apache Kafka binder and adds the ability to use the Kafka Streams API.
The Kafka Streams API is a lightweight library that lets you take data from one or more Kafka topics and write it to other Kafka topics, allowing you to transform, enhance, filter, join, aggregate, and more...
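To make that concrete, here is a minimal sketch of a plain Kafka Streams topology of the kind described (topic names and the transformation are made up for illustration):

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class UppercaseStream {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> input = builder.stream("input-topic");
        input.filter((key, value) -> value != null)   // drop null records
             .mapValues(String::toUpperCase)          // transform each value
             .to("output-topic");                     // write back to Kafka

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```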
The Apache Kafka Binder implementation maps each destination to an Apache Kafka topic. The consumer group maps directly to the same Apache Kafka concept. Partitioning also maps directly to Apache Kafka partitions as well.
The binder currently uses the Apache Kafka kafka-clients version 2.3.1. This client can communicate with older brokers (see the Kafka documentation), but certain features may not be available. For example, with versions earlier than 0.11.x.x, native headers are not supported. Also, 0.11.x.x does not support the autoAddPartitions property
https://docs.spring.io/spring-cloud-stream-binder-kafka/docs/3.1.3/reference/html/spring-cloud-stream-binder-kafka.html#_apache_kafka_binder
Spring Cloud Stream includes a binder implementation designed explicitly for Apache Kafka Streams binding. With this native integration, a Spring Cloud Stream "processor" application can directly use the Apache Kafka Streams APIs in the core business logic.
Kafka Streams binder implementation builds on the foundations provided by the Spring for Apache Kafka project.
Kafka Streams binder provides binding capabilities for the three major types in Kafka Streams - KStream, KTable and GlobalKTable.
Kafka Streams applications typically follow a model in which the records are read from an inbound topic, apply business logic, and then write the transformed records to an outbound topic. Alternatively, a Processor application with no outbound destination can be defined as well.
https://docs.spring.io/spring-cloud-stream-binder-kafka/docs/3.1.3/reference/html/spring-cloud-stream-binder-kafka.html#_kafka_streams_binder
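As a rough sketch of the processor model those docs describe, in the functional style the Kafka Streams logic is exposed as a Function bean that the binder wires to Kafka topics through configuration such as spring.cloud.stream.bindings.process-in-0.destination and spring.cloud.stream.bindings.process-out-0.destination (bean and topic names here are illustrative):

```java
import java.util.function.Function;
import org.apache.kafka.streams.kstream.KStream;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ProcessorConfig {

    @Bean
    public Function<KStream<String, String>, KStream<String, String>> process() {
        // read from the inbound topic, apply business logic,
        // write the transformed records to the outbound topic
        return input -> input.mapValues(String::trim);
    }
}
```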

Kafka Streams without Sink

I'm currently planning the architecture for an application that reads from a Kafka topic and, after some conversion, pushes the data to RabbitMQ.
I'm kind of new to Kafka Streams, and it looks like a good choice for my task. But the problem is that the Kafka server is hosted at another vendor's site, so I can't even install the Kafka Connect RabbitMQ sink plugin.
Is it possible to write a Kafka Streams application that doesn't have any sink points but just processes the input stream? I could just push to RabbitMQ in a foreach operation, but I'm not sure whether the stream will even work without a sink point.
foreach is a sink action, so to answer your question directly: no.
However, Kafka Streams is intended to be limited to Kafka-to-Kafka communication only.
Kafka Connect can be installed and run anywhere, if that is what you wanted to use... You can also use other Apache tools like Camel, Spark, NiFi, Flink, etc. to write to RabbitMQ after consuming from Kafka, or write any application in a language of your choice. For example, the Spring Integration or Spring Cloud Stream frameworks allow a single contract across many communication channels.
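If you do go the foreach route anyway, a minimal sketch might look like the following, assuming the plain RabbitMQ Java client (host, topic, and queue names are placeholders, and note that delivery guarantees to RabbitMQ are entirely on you, since Kafka Streams' processing guarantees stop at the foreach):

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class KafkaToRabbit {
    public static void main(String[] args) throws Exception {
        // RabbitMQ side: open a channel and declare a durable queue
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();
        channel.queueDeclare("converted-data", true, false, false, null);

        // Kafka Streams side: the topology ends in foreach, its only sink action
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "kafka-to-rabbit");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        builder.<String, String>stream("source-topic")
               .mapValues(String::trim) // stand-in for the actual conversion
               .foreach((key, value) -> {
                   try {
                       channel.basicPublish("", "converted-data", null, value.getBytes());
                   } catch (Exception e) {
                       throw new RuntimeException(e); // crude; real code needs retry/error handling
                   }
               });

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```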

Build a data transformation service using Kafka Connect

Kafka Streams is good, but I have to do all the configuration very manually. Kafka Connect, on the other hand, provides a REST API interface, which is very useful for handling configuration, as well as tasks, workers, etc...
Thus, I'm thinking of using Kafka Connect for my simple data-transformation service. Basically, the service will read the data from a topic and send the transformed data to another topic. To do that, I would have to build a custom sink connector that sends the transformed data to a Kafka topic; however, those interface functions don't seem to be available in SinkConnector. If I could do it, that would be great, since I could manage tasks and workers via the REST API and run the tasks in distributed mode (multiple instances).
I have two options in mind:
Figuring out how to send the message from SinkConnector to a kafka topic
Figuring out how to build a REST interface API like Kafka Connect which wraps up the Kafka Streams app
Any ideas?
Figuring out how to send the message from SinkConnector to a kafka topic
A sink connector consumes data/messages from a Kafka topic. If you want to send data to a Kafka topic, you are likely talking about a source connector.
Figuring out how to build a REST interface API like Kafka Connect which wraps up the Kafka Streams app.
Using the kafka-connect-archtype, you can get a template to create your own Kafka connector (source or sink). In your case, since you want to build a stream-processing pipeline after the connector, you are mostly talking about a connector for another stream-processing engine that is not Kafka Streams. There are connectors for Kafka <-> Spark, Kafka <-> Flink, ...
But you can build your own using the kafka-connect-archtype template if you want. Use the MySourceTask List<SourceRecord> poll() method or the MySinkTask put(Collection<SinkRecord> records) method to process the records as a stream; these classes extend org.apache.kafka.connect.source.SourceTask and org.apache.kafka.connect.sink.SinkTask from Kafka Connect.
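For reference, a bare skeleton of that SinkTask contract (the class name and the transformation are placeholders, not from an existing connector):

```java
import java.util.Collection;
import java.util.Map;
import org.apache.kafka.connect.sink.SinkRecord;
import org.apache.kafka.connect.sink.SinkTask;

public class MySinkTask extends SinkTask {

    @Override
    public void start(Map<String, String> props) {
        // read connector configuration (e.g. target-system coordinates) here
    }

    @Override
    public void put(Collection<SinkRecord> records) {
        // the framework hands over batches of records consumed from the
        // configured topics; transform and deliver them downstream here
        for (SinkRecord record : records) {
            String transformed = String.valueOf(record.value()).trim();
            // ... write `transformed` to the external system ...
        }
    }

    @Override
    public void stop() {
        // release any resources opened in start()
    }

    @Override
    public String version() {
        return "1.0";
    }
}
```

Note how this confirms the answer's point: put() only receives records; writing back to a Kafka topic is the job of a source connector (or a Streams app), not a sink.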
a REST interface API like Kafka Connect which wraps up the Kafka Streams app
This is exactly what ksqlDB allows you to do.
Beyond creating streams and tables with SQL queries, it offers a REST API and can interact with Connect endpoints (or embed a Connect worker itself).
https://docs.ksqldb.io/en/latest/concepts/connectors/

What's the difference between Spring Cloud Bus and Spring for Apache Kafka?

Using Spring for Apache Kafka or Spring AMQP, I can achieve message pub/sub. Spring Cloud Bus uses Kafka/RabbitMQ to do approximately the same thing. What's the difference between them?
Spring Cloud Bus is an abstraction built on top of Spring Cloud Stream (and hence Kafka and RabbitMQ). It is not general-purpose; it is built for sending administrative commands to multiple nodes of a service at once, for example, sending a refresh (from spring-cloud-commons) to all nodes of the user service. There is only one channel, whereas in Spring Cloud Stream there are many. Think of it as a distributed Spring Boot Actuator.

Kafka Producer API vs JMS Producer API

High-level design of the application:
An upstream system sends a stream of data, which is received by a Java application. Kafka is used as the data store; Logstash publishes the stored data to an Elasticsearch index, and all applications use Elasticsearch queries to get the data.
Problem: To publish data from the Java application to Kafka, which API should be used: the Kafka JMS client or the Java Kafka Producer/Consumer API?
As per the Kafka documentation, if you are interested in writing new Java applications, you are encouraged to use the Java Kafka Producer/Consumer APIs, as they provide advanced features not available when using the kafka-jms-client: https://docs.confluent.io/current/clients/kafka-jms-client/docs/index.html
Also, as per the Kafka documentation, Kafka is not a typical messaging broker, and not all JMS concepts map 1:1 to Kafka.
Is there any benefit to using the JMS API with Kafka, given that Kafka is not a typical messaging broker (and the application will still be tightly coupled to Kafka) and not all JMS concepts can be mapped to Kafka?
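For comparison, producing with the plain Java Producer API the documentation recommends is only a few lines (broker address and topic name are placeholders):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // try-with-resources closes the producer, flushing pending records
        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("upstream-data", "key-1", "payload"));
        }
    }
}
```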