Kafka Streams - can I use it in production?

I wonder if I can use Kafka Streams in production.
Is it really open source, or do we need to buy a license?
I have been reading the Kafka Streams documentation and the license FAQs, but it isn't clear to me.

Kafka Streams is licensed under Apache 2.0; I'm not sure why you're looking at the Confluent license pages.
Its source code lives in the same repository as the Kafka broker, the JVM clients, and the Kafka Connect sources.

Related

Is Kafka Streams library v1.1.0 forward compatible with Kafka cluster v2.8.1?

I'm running a Kafka Streams application. Here's the current version compatibility:
kafka-streams v1.1.0 (with custom changes on top of it)
kafka cluster v2.1.1
I'm planning to upgrade my Kafka cluster to v2.8.1. Is the same kafka-streams library compatible with the newer kafka version? I'm not able to find any official compatibility matrix for streams. I could find one from Confluent, but none from the official Kafka website.
I ran my Kafka Streams app against Kafka v2.8.1 and it does seem to run. I'd like to avoid any surprises at a later point in time, hence I'm looking for any pointers on compatibility or support.
The Kafka Streams API uses the Kafka consumer and producer APIs underneath. The Kafka protocol is backward compatible, meaning a newer Kafka cluster works with older Kafka clients, so you should not have any problem.
Just take into account that starting with Kafka 4.0.0 (not released yet), they are planning to remove part of this backward compatibility. See KIP-896 for more details: https://cwiki.apache.org/confluence/display/KAFKA/KIP-896%3A+Remove+old+client+protocol+API+versions+in+Kafka+4.0
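As a rough illustration of that rule, here is a minimal Python sketch of the check. It assumes, based on KIP-896, that brokers from 4.0 onward only accept clients at or above a 2.1 baseline; the function names and constants are made up for this example and are not part of any Kafka API.

```python
def parse_version(version):
    """Turn a dotted version string such as "2.8.1" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# Assumption taken from KIP-896: brokers from 4.0 onward drop the protocol
# API versions that only pre-2.1 clients rely on.
BROKER_CUTOFF = (4, 0)
CLIENT_BASELINE = (2, 1)


def old_client_supported(client_version, broker_version):
    """Return True if an older client is expected to work with a newer broker.

    Deliberately simplified: it only looks at major.minor versions and
    ignores custom patches or feature-level differences.
    """
    client = parse_version(client_version)[:2]
    broker = parse_version(broker_version)[:2]
    if broker < BROKER_CUTOFF:
        # Brokers before 4.0 keep the old protocol versions around.
        return True
    return client >= CLIENT_BASELINE


print(old_client_supported("1.1.0", "2.8.1"))  # True
print(old_client_supported("1.1.0", "4.0.0"))  # False
```

Under these assumptions, kafka-streams 1.1.0 against a 2.8.1 cluster is fine, but the same library would need to be upgraded before moving the brokers to 4.0.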

Confluent platform vs Debezium

I'm trying to use the Debezium platform to set up CDC with Kafka, but I'm confused.
What is the real difference between Confluent Platform and Debezium?
Confluent (https://www.confluent.io/) is a platform which mainly integrates Apache Kafka (https://kafka.apache.org/) and its ecosystem. So, let's say, the basic Confluent Platform has ZooKeeper, Apache Kafka, KSQL and their Control Center.
Debezium is another platform, which focuses on database streaming.
So you can think of Confluent as the general streaming platform, while Debezium provides connectors (https://debezium.io/documentation/reference/stable/connectors/index.html) that can be integrated into Confluent, as in https://www.confluent.io/hub/debezium/debezium-connector-postgresql
At the time of writing, Confluent Platform does not have any CDC connectors of its own, and you don't really need them. Apache Kafka Connect, which is bundled as part of the Confluent Platform, is all that's needed, and it can be downloaded directly from the Apache Kafka site instead.
Debezium is built on the Kafka Connect API and is provided as a plugin.
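To make the "plugin" point concrete, here is a minimal Python sketch that builds the JSON a Kafka Connect worker expects when you register a Debezium PostgreSQL connector. The connector name, host, credentials and database name are placeholders, and `topic.prefix` assumes a Debezium 2.x connector (older releases used `database.server.name` instead).

```python
import json

# Placeholder connection details, made up for this example.
connector = {
    "name": "inventory-connector",
    "config": {
        # The Debezium plugin class that Kafka Connect loads.
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "localhost",
        "database.port": "5432",
        "database.user": "postgres",
        "database.password": "postgres",
        "database.dbname": "inventory",
        # Assumes Debezium 2.x; this prefixes the change-event topic names.
        "topic.prefix": "inventory",
    },
}

# Serialize to the JSON body used to register the connector.
payload = json.dumps(connector, indent=2)
print(payload)
```

The resulting JSON would be sent as a POST to the `/connectors` endpoint of the Kafka Connect REST API. The same request works whether Kafka Connect came from the Apache Kafka download or from Confluent Platform, as long as the Debezium plugin is on the worker's plugin path.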

MirrorMaker2 compatibility with Kafka 2.0.0

I'm trying to understand the compatibility of the MM2 libraries packaged as part of the core Kafka 2.4 release with a Kafka cluster running Kafka core version 2.0.0.
The major improvements in MM2 were implemented as part of KIP-382 and merged via this PR.
In this PR I don’t see any changes being made to core Kafka. The changes are only in the Kafka Connect and MirrorMaker related code.
As far as I know, Kafka Connect is a client application to Kafka (similar to any standard consumer and producer application), and MM2 is built on top of the Kafka Connect architecture.
Though the MirrorMaker2 libraries are bundled with the core Kafka libraries (i.e., the 2.4 Kafka version) as part of the release cycles and packages, I think MM2 doesn't have an absolute dependency on the same version of the Kafka broker (i.e., the one built with Kafka 2.4).
The reason is that MM2 is a client-side component.
Is it valid to assume that we can set up MM2 from the Kafka core 2.4 release packages to work with a Kafka broker running Kafka 2.0.0?
Any responses, comments or materials in this regard are highly appreciated. Thank you.

Upgrading Kafka client from 0.8.2.0 to 0.11.0.0

Currently, my company is migrating from Kafka 0.8 to 0.11. The broker migration steps are clearly stated in the Kafka documentation here
What I am stuck on is upgrading the Kafka clients (producers, consumers, spark-streaming). I can't find any documentation or articles that clearly list the required changes or steps to follow to upgrade the clients; all I found is the javadoc for the Producer Client
What I did so far is change the Kafka client version in my Gradle build to kafka-clients-0.11.0.0, and everything went fine from the compilation point of view, with no code changes at all.
What I need help with is this: are there any expected problems I should take care of, or any pointers for client changes other than the kafka-client version?
I went through lots of experiments to get this done.
For the consumers and producers, I just used the Kafka consumers and producers 0.11.0.
The tricky part was replacing spark-streaming: the latest spark-streaming version only supports up to Kafka 0.10.X, which doesn't contain any updates related to the new broker.
What I recommend here: if you are about to write an application from scratch and your main goal is realtime streaming, go for the kafka-streaming API, it is just AWESOME! If you already have a Spark Streaming app (which was my case), you have to judge which is more important to you: moving off spark-streaming, or staying stuck with Kafka broker version 0.10.X and spark-streaming, which was experimental, btw.
The benefits of having the streaming inside Kafka rather than Spark are the following:
Kafka streaming is a normal jar that can be included in any Java application, so you don't have to care that much about deployment and environment.
Auto-scaling is easy with kafka-streaming using any scale set provided by any cloud service provider, unlike scaling an HDP cluster.
Monitoring using something like Prometheus would be much easier.

Monitor Kafka using Opensource tools

Is there any open source tool to monitor Confluent Kafka? Most of the open source tools available are specific to Apache Kafka, not Confluent Kafka.
We want to monitor at least the connectors, streams and cluster health.
The Kafka that is distributed in the Confluent Platform is Apache Kafka. There really is no such thing as "Confluent Kafka". Any tools that work with the latest version of Apache Kafka (including Kafka Connect and Kafka Streams) will work with the same versions of Kafka included with Confluent Open Source.
Confluent 3.3 includes Apache Kafka 0.11
Confluent 3.2 includes Apache Kafka 0.10.2
Confluent 3.1 includes Apache Kafka 0.10.1
Confluent 3.0 includes Apache Kafka 0.10.0
Confluent 2.0 includes Apache Kafka 0.9
Confluent 1.0 includes Apache Kafka 0.8.2
Note: Confluent Enterprise includes its own monitoring and management GUI called Control Center. Control Center is a separate process, so Apache Kafka itself is still the same as the open source version.
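If you need to check which tool versions apply to a given Confluent release, the version list above can be captured as a small lookup table. This is only a sketch in Python; the dictionary and function names are made up for this example, and it only covers the releases listed in this answer.

```python
# Mapping taken from the list above: Confluent release -> bundled Apache Kafka.
CONFLUENT_TO_APACHE_KAFKA = {
    "3.3": "0.11",
    "3.2": "0.10.2",
    "3.1": "0.10.1",
    "3.0": "0.10.0",
    "2.0": "0.9",
    "1.0": "0.8.2",
}


def apache_kafka_version(confluent_version):
    """Return the bundled Apache Kafka version, or None if it is not listed.

    Patch releases such as "3.3.1" are matched on their major.minor part.
    """
    major_minor = ".".join(confluent_version.split(".")[:2])
    return CONFLUENT_TO_APACHE_KAFKA.get(major_minor)


print(apache_kafka_version("3.3.1"))  # 0.11
print(apache_kafka_version("1.0"))    # 0.8.2
```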
You can use the updated version of KafkaOffsetMonitor. It supports SSL/TLS and Kerberos, and it uses the Kafka 1.1.0 library.
You should be able to use kafka-monitor for monitoring your cluster's health as well as Burrow and KafkaOffsetMonitor for monitoring your consumer application lag. Also, you should definitely use something like jmxtrans for collecting your Kafka broker metrics.
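For context on what "consumer lag" means in tools like Burrow and KafkaOffsetMonitor: per partition, it is the log end offset minus the consumer group's committed offset. The following Python sketch shows just that arithmetic on made-up numbers. The function name and sample offsets are invented for this example, it does not talk to a real cluster, and treating a partition without a committed offset as offset 0 is a simplifying assumption.

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Compute per-partition lag as log end offset minus committed offset."""
    lag = {}
    for partition, end_offset in log_end_offsets.items():
        # Simplifying assumption: no committed offset means nothing consumed.
        committed = committed_offsets.get(partition, 0)
        # Clamp at zero in case the two readings were taken at different times.
        lag[partition] = max(end_offset - committed, 0)
    return lag


# Made-up sample offsets for a topic with three partitions.
end_offsets = {0: 1500, 1: 980, 2: 40}
committed = {0: 1450, 1: 980}

lag = consumer_lag(end_offsets, committed)
print(lag)                # {0: 50, 1: 0, 2: 40}
print(sum(lag.values()))  # 90
```

A steadily growing total is usually the signal worth alerting on, rather than any single snapshot.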