I have just started learning Kafka and am trying to build a prototype in which a REST API acts as the producer and sends data to a Kafka consumer. I have gone through quite a lot of documentation looking for a standard way to do this.
I couldn't tell whether there is a ready-made connector I could simply use, like the file connector or the JDBC connectors provided for Apache Kafka. Should I be writing a custom connector for this?
I am confused about where to start. I am looking for structured documentation or ideas on how to get this done.
It sounds like you're talking about the functionality that already exists in the REST Proxy. This provides a REST API for producing data into Kafka, or consuming data from Kafka.
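As a rough sketch of what producing through the REST Proxy can look like from Java, the snippet below builds (but does not send) a request against the proxy's v2 `POST /topics/{topic}` endpoint using only the JDK's HTTP client. The proxy address `http://localhost:8082`, the topic name, and the class and method names are assumptions for illustration, not something from the question.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class RestProxyProduceSketch {
    // Assumed address of a locally running REST Proxy; adjust for your setup.
    static final String PROXY_URL = "http://localhost:8082";
    // Content type the REST Proxy v2 API uses for JSON-embedded records.
    static final String CONTENT_TYPE = "application/vnd.kafka.json.v2+json";

    /** Wraps already-serialized JSON values in the v2 "records" envelope. */
    static String recordsBody(String... jsonValues) {
        StringBuilder sb = new StringBuilder("{\"records\":[");
        for (int i = 0; i < jsonValues.length; i++) {
            if (i > 0) sb.append(',');
            sb.append("{\"value\":").append(jsonValues[i]).append('}');
        }
        return sb.append("]}").toString();
    }

    /** Builds the POST request for one topic; nothing is sent here. */
    static HttpRequest produceRequest(String topic, String body) {
        return HttpRequest.newBuilder(URI.create(PROXY_URL + "/topics/" + topic))
                .header("Content-Type", CONTENT_TYPE)
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }
}
```

To actually send it you would pass the request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and check the status code in the response.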
Edit: From your comment I understand your question to be different. If you want to pull data from a REST endpoint into Kafka you can use Kafka Connect and the kafka-connect-rest plugin. There's an example of it in use here.
There's no need to write a connector; one already exists for HTTP. (Besides, Kafka Connect connectors run on the JVM, so writing one in Python isn't possible.)
https://github.com/llofberg/kafka-connect-rest
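Once the plugin is installed on a Kafka Connect worker, a connector instance is normally created by POSTing a JSON config to the worker's own REST API (`POST /connectors`). Here is a minimal sketch of building that request in Java. The worker address `http://localhost:8083`, the connector name, and the helper names are assumptions; the plugin-specific settings (including the real `connector.class` value) have to come from the plugin's README and are deliberately left as a placeholder.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Map;
import java.util.stream.Collectors;

final class ConnectorRegistrationSketch {
    // Assumed address of a Kafka Connect worker's REST interface.
    static final String CONNECT_URL = "http://localhost:8083";

    /** Naive JSON rendering; keys and values must not contain quotes or backslashes. */
    static String toJson(String name, Map<String, String> config) {
        String entries = config.entrySet().stream()
                .map(e -> "\"" + e.getKey() + "\":\"" + e.getValue() + "\"")
                .collect(Collectors.joining(","));
        return "{\"name\":\"" + name + "\",\"config\":{" + entries + "}}";
    }

    /** Builds the request that would register the connector; nothing is sent here. */
    static HttpRequest registerRequest(String name, Map<String, String> config) {
        return HttpRequest.newBuilder(URI.create(CONNECT_URL + "/connectors"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(toJson(name, config)))
                .build();
    }
}
```

Only `connector.class` and `tasks.max` are generic Kafka Connect settings; everything else in the config map depends on the connector you pick.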
OpenAPI is good for RESTful services, and at the moment I'm hacking it to describe an asynchronous messaging system (specifically Kafka) by modelling each topic as a POST to /topic, so that I can use redoc to create a website for the API.
I am trying to find out whether there is already an established way of documenting this, especially since GET /events, which is used for event sourcing, is getting larger by the day.
It seems AsyncAPI is basically what you are looking for: OpenAPI, but for topics instead of REST endpoints.
https://www.asyncapi.com/docs/getting-started/coming-from-openapi/
CloudEvents is a CNCF-backed specification for describing event data in a common format; one of its protocol bindings is for Kafka.
https://github.com/cloudevents/sdk-java/blob/master/kafka/README.md
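To give an idea of what the Kafka binding amounts to: in its binary content mode, the event attributes travel as Kafka record headers prefixed with `ce_`, and the payload stays in the record value. The sketch below only builds that header map with the JDK, without the CloudEvents SDK or a Kafka client; the class and method names are made up for illustration, and the SDK linked above does this (and more) for you.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class CloudEventHeadersSketch {
    /**
     * Builds the record headers for a binary-mode CloudEvent.
     * The four ce_ attributes below are the required ones; the data
     * content type is carried in the plain "content-type" header.
     */
    static Map<String, byte[]> binaryModeHeaders(
            String id, String source, String type, String contentType) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("ce_specversion", "1.0");
        attrs.put("ce_id", id);
        attrs.put("ce_source", source);
        attrs.put("ce_type", type);
        attrs.put("content-type", contentType);

        // Kafka header values are byte arrays, so encode each attribute as UTF-8.
        Map<String, byte[]> headers = new LinkedHashMap<>();
        attrs.forEach((k, v) -> headers.put(k, v.getBytes(StandardCharsets.UTF_8)));
        return headers;
    }
}
```

With a real producer you would copy these entries onto the `ProducerRecord` headers and put the event data in the record value.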
If you want a REST API, look at the Kafka REST Proxy.
Consider using Protocol Buffers within Kafka.
https://developers.google.com/protocol-buffers/
Protocol Buffers require an API contract (".proto" file) if you want to call or implement a service. The contracts are both human and machine readable.
Protocol Buffers can also be used with other messaging systems and other protocols like HTTP (check out "gRPC" for that). So your documentation / contract is more portable.
Of course this only works for projects having the flexibility to change their payload format.
I need to develop a REST API that can read published messages from a Kafka cluster and deliver them to a data warehouse application.
Materials available on the internet say to use POST/GET commands, but I don't think this is meant for production use; it seems more useful for testing.
How can this be implemented in Scala or Java?
Materials available on the internet say to use POST/GET commands, but I don't think this is meant for production use; it seems more useful for testing
Please link to where you read this... All production web services operate over these two HTTP methods (and more), hundreds of thousands of times a day...
If you really want to use Kafka for throughput, though, you wouldn't "hide it" behind a REST interface. You would distribute SSL certs plus usernames and passwords to remote clients, for example.
Need to develop a REST API which can read published messages from Kafka
REST is not meant to keep an open connection, primarily because it is stateless (it shouldn't maintain where you are reading from in Kafka)... It would make more sense to forward a websocket from a Kafka consumer, which is different from a REST API.
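If you do put reads behind REST anyway, one way to keep the endpoint stateless is to make the client carry its own position: every request names the offset to read from, and every response returns the offset to ask for next. The sketch below shows just that contract, with an in-memory list standing in for a Kafka partition; it is not a Kafka consumer, and all names in it are hypothetical.

```java
import java.util.List;

final class StatelessReadSketch {
    /** One response page: the records plus the offset to send with the next request. */
    static final class Page {
        final List<String> records;
        final long nextOffset;

        Page(List<String> records, long nextOffset) {
            this.records = records;
            this.nextOffset = nextOffset;
        }
    }

    /**
     * Reads up to maxRecords entries starting at offset. The server keeps no
     * per-client state: everything it needs arrives with the request.
     */
    static Page read(List<String> log, long offset, int maxRecords) {
        // Clamp the requested range to what the log actually contains.
        int from = (int) Math.max(0, Math.min(offset, log.size()));
        int to = (int) Math.min((long) log.size(), (long) from + Math.max(0, maxRecords));
        return new Page(List.copyOf(log.subList(from, to)), to);
    }
}
```

A client would call it in a loop, feeding each `nextOffset` back in, which is polling rather than the open connection a websocket would give you.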
how to implement it in scala/ Java Programming
The Confluent REST Proxy is already written in Java, and it is open source (and used in production at several companies, I believe). If you need inspiration, you can start there. Otherwise, you can find examples for Spring and Vert.x, for example, with their Kafka integrations in their respective documentation, but you'll be re-implementing a lot of existing functionality.
For a Java/Kotlin Spring Boot app that needs to send messages to Kafka or consume messages from Kafka, would you recommend using the Spring Kafka library or just the plain Kafka Java API?
I'm not sure whether Spring provides any real benefits or is just a wrapper. Spring offers a lot of annotations, which can feel like magic when you hit a runtime error.
I'd like to hear some opinions.
Full disclosure: I am the project lead for Spring for Apache Kafka.
It's entirely up to you and your colleagues.
It's somewhat comparable to writing assembly code vs. using a high-level language and a compiler.
For an existing Spring shop that is familiar with spring-messaging (JMS, RabbitMQ etc), it's a natural fit, the APIs will be very familiar (POJO listeners, MessageConverters, KafkaTemplate, etc, etc).
When using the simplest APIs, Spring takes care of the low-level stuff like committing offsets, transaction synchronization, error handling, etc, etc.
If you have very basic requirements and/or want to write all that code yourself, then use the native APIs.
I've been looking into Apache Camel and Kafka over the past two days in hopes of learning about messaging frameworks/brokers. Is a possible use case of Camel/Kafka using Kafka as the message broker while implementing the producers and consumers with Camel? I saw a brief example of something similar, but can't seem to find it again. If not, what is the point of the Camel:Kafka component?
Yes, Apache Camel makes using Kafka easier because it hides a lot of the complexity, which is the main point of Camel components. However, if you need to do something really advanced or want to be in full control yourself, a Camel component may sometimes lack the functionality, hooks or APIs you need. When that happens, people ask for it and we improve the components over time to meet the community's requirements. And if you cannot wait for that, you do NOT have to use a Camel component and can use the Kafka API directly; after all, it's all just Java.
There is also a Camel example here: https://github.com/apache/camel/tree/master/examples/camel-example-kafka.
And the Camel in Action 2nd edition book covers Camel with Kafka in its cluster chapter.
I am trying to understand the differences between something like Kafka and something like Camel. To my understanding Camel would provide much more abstraction for developers without having to worry about changing protocols/systems to some extent. How would Kafka not be able to handle most of what Camel can do now? I am reading through the documentation and it seems like Kafka has been updated/upgraded enough to slightly break away from being a message broker only. I guess my question would really come down to how does Kafka compare to Camel in regards to future proofing systems and where does Kafka fall short of Camel? I am under the impression that Kafka doesn't scale as well as a system grows.
Edit: This is strictly about messages. The Camel documentation makes it very clear that it is built around Enterprise Integration Patterns, but the deeper I dive into the Kafka documentation, the more it seems the same patterns can be implemented there. Am I missing something?
Apache Kafka: a stream-processing platform. It is based on a massively scalable publish-subscribe messaging architecture. Many other brokers also offer publish-subscribe messaging, some of them through JMS, and could do the same (with some exceptions). Some of the most popular are Apache ActiveMQ and RabbitMQ.
Apache Camel: message-oriented integration middleware. It implements almost all of the Enterprise Integration Patterns.
You can use Apache Camel with Apache Kafka. Or you can use Apache Kafka without Apache Camel also.
They are two totally different things.
Think about Camel as an interface definition tool where you can define endpoints or channels where messages fly in. But they are abstract. Compare Camel with Spring Integration for instance.
Kafka can provide those messages, so it can implement those abstract channels or endpoints. But so can ActiveMQ and others.
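A toy illustration of that split in plain Java, with no Camel or Kafka on the classpath: the routing logic (here a content-based router, one of the Enterprise Integration Patterns) is written against an abstract channel, and whatever sits behind the channel is interchangeable. All names are hypothetical, and the in-memory channel merely stands in for a Kafka topic or an ActiveMQ queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class ChannelSketch {
    /** The abstract endpoint: routing code only ever sees this interface. */
    interface MessageChannel {
        void send(String message);
    }

    /** Stand-in for a real broker-backed channel (Kafka topic, ActiveMQ queue, ...). */
    static final class InMemoryChannel implements MessageChannel {
        final List<String> received = new ArrayList<>();

        @Override
        public void send(String message) {
            received.add(message);
        }
    }

    /** Content-based router: forwards each message to one of two channels. */
    static MessageChannel router(
            Predicate<String> condition, MessageChannel whenTrue, MessageChannel otherwise) {
        return message -> (condition.test(message) ? whenTrue : otherwise).send(message);
    }
}
```

Swapping `InMemoryChannel` for a class that wraps a Kafka producer would not change `router` at all, which is roughly the role Camel's Kafka component plays inside a Camel route.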
Kafka is a message broker. It is comparable with other message brokers like ActiveMQ, RabbitMQ, Azure Service Bus etc. Camel is an integration middleware. It is more comparable to Apache ServiceMix.
Taking a look at the theory of an Event-Driven Architecture https://www.oreilly.com/library/view/software-architecture-patterns/9781491971437/ch02.html we could differentiate two different kinds of Event-driven topologies depending on whether we need an event mediator or not.
Message broker. In this category we find Kafka, as it doesn't rely on a message mediator. Of course, as noted in previous answers, we could use Kafka together with a mediator depending on our needs.
Message mediator. In this category we find products like Camel. You may see it as a message controller.