Strimzi Kafka operator supported Kafka versions - Kubernetes

Why does the Strimzi Kafka operator have a list of supported Kafka versions? Why do I care about this if the version of Kafka is being managed by the operator?
Is this only mentioned for client support?

The Apache Kafka versions supported by the different Strimzi versions are listed on the Strimzi website. Supported in this case means the versions for which we ship container images and which we have tested. There are several reasons why we don't support more versions:
While you might not care about this if the version of Kafka is being managed by the operator, the operator still cares: it encodes the operational knowledge, so it needs to understand what it operates.
Like any other software, Apache Kafka evolves. APIs change (for example around the Admin APIs), configurations change (e.g. new options are added in different versions and the operator needs to understand them to validate or update them - see the sketch below), and so on. So supporting old versions is not always easy without code complexity.
We have limited resources to build and test the software, both in terms of contributors and in terms of CI resources to run the build and test pipelines.
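To make the API-drift point concrete, here is a minimal sketch using the Java Admin client (the bootstrap address and topic name are placeholders): incrementalAlterConfigs (KIP-339, added in Kafka 2.3) supersedes the older alterConfigs call, and an operator has to know which variants the Kafka versions it manages understand.

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class AdminApiDrift {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // placeholder bootstrap address
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (Admin admin = Admin.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-topic");
            // incrementalAlterConfigs (KIP-339, Kafka 2.3) replaced the older
            // alterConfigs call; software that operates mixed Kafka versions has
            // to know which of the two a given cluster understands.
            AlterConfigOp op = new AlterConfigOp(
                    new ConfigEntry("retention.ms", "604800000"), AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(Map.of(topic, List.of(op))).all().get();
        }
    }
}
```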
The current Strimzi commitment to which Kafka versions it supports is listed here. If you are interested, you can always join the project and help make things better. Since Strimzi is open source, you can also always try to add another Kafka version yourself and build and test it.
Kafka consumers and producers normally have very good backwards/forwards compatibility, so you do not necessarily need to use the same version of the clients as the brokers.
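As a minimal sketch of that client compatibility, assuming a placeholder bootstrap address and topic name: a producer built against a newer client library typically works against older brokers, because the client negotiates the protocol version at connect time.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class CompatProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // placeholder bootstrap address; a newer client can talk to older brokers
        // because the protocol version is negotiated per connection
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "key", "value"));
        }
    }
}
```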

Related

What is the difference between Confluent kafka-rest & Strimzi Kafka Bridge?

I would like to use an HTTP proxy on top of Kafka. I see two projects with the same purpose:
https://github.com/confluentinc/kafka-rest
https://github.com/strimzi/strimzi-kafka-bridge
I use the Strimzi operator to spin up Kafka on Kubernetes.
Both are open source.
Both can be used in commercial, self-hosted cloud applications.
Both provide a REST proxy on top of Kafka.
How do they differ? When should I use which one?
The licensing and nature of the projects are a bit different. Strimzi is an independent project under the Cloud Native Computing Foundation, and the Strimzi Bridge (like all other Strimzi components) is licensed under the Apache License 2.0, a recognized open-source license. The Confluent REST Proxy, on the other hand, uses the proprietary Confluent Community License - whether that works for you or not depends on how you use it.
On the other hand, if I remember correctly, the Confluent REST Proxy has more features around things such as topic and cluster management. For many use cases, both will do the job. But it is good to check which exact features you need, as they might not support the same things.
The Strimzi Bridge is also directly supported by the Strimzi Operator. So if you already use the operator, it might be easier for you to stick with the Bridge. Similarly, if you are already using the Confluent Platform, it might be easier to go with the Confluent REST Proxy.
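To give a feel for what using the Strimzi Bridge looks like, here is a hedged sketch that posts a record over HTTP using the Bridge's v2 JSON content type; the bridge address and topic name are placeholders for your own deployment.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BridgeProducerSketch {
    public static void main(String[] args) throws Exception {
        // placeholder bridge address and topic; adjust to your deployment
        String url = "http://my-bridge:8080/topics/my-topic";
        String body = "{\"records\":[{\"key\":\"key1\",\"value\":\"hello\"}]}";

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(url))
                // the Bridge's JSON embedded-format content type
                .header("Content-Type", "application/vnd.kafka.json.v2+json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```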
Disclaimer: I'm one of the Strimzi maintainers.

Kafka 1.x Obsolete Vulnerability

As we all know, Kafka is still using log4j 1.x jar files even though log4j 1.x reached end of life in 2015 and is no longer supported. So it has become an obsolete-dependency vulnerability for Kafka.
Is there any way to replace log4j 1.x in current Kafka (Docker images), or is there any development work going on to replace log4j 1.x? If yes, is there an ETA provided by the Kafka team?
Yes, there is active work to replace log4j, if it has not already been done. Please search the Kafka JIRA.
Also refer to the Kafka CVE list - https://kafka.apache.org/cve-list
And Confluent's own announcements - https://support.confluent.io/hc/en-us/articles/4412615410580-December-2021-Log4j-Vulnerabilities-Advisory
Regarding out-of-support versions of Kafka that you're not willing to upgrade: you can try replacing log4j directly with reload4j (I've not tried it), but of course this will not handle nested dependencies very accurately.
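For what it's worth, reload4j is designed as a drop-in binary replacement that keeps the original org.apache.log4j package names, so code written against the log4j 1.x API, like this small sketch, should compile and run unchanged after swapping the jar on the classpath.

```java
// the org.apache.log4j package is provided by both log4j 1.x and reload4j,
// so this code runs unchanged after swapping the jar on the classpath
import org.apache.log4j.BasicConfigurator;
import org.apache.log4j.Logger;

public class ReloadedLogging {
    private static final Logger LOG = Logger.getLogger(ReloadedLogging.class);

    public static void main(String[] args) {
        BasicConfigurator.configure(); // console appender, for the sketch only
        LOG.info("Logging through the log4j 1.x API, backed by reload4j");
    }
}
```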

Kafka Streams client library compatibility with kafka broker version

I am using the Kafka client & Streams library version 2.7.0 to build my application. However, the Kafka brokers (2 different clusters) are at older versions (2.4.1 & 2.6.0).
As I understand it, we can use the latest clients & Streams library and they should run fine with older versions of the Kafka brokers. Am I correct? Is there any compatibility matrix between the client & Streams library and the Kafka brokers?
I tried running my application (with the 2.7.0 client library) in a local environment (with Kafka version 2.6.0) and it worked fine, but I wanted to confirm the supported compatibility between them.
Update: As onecricketeer has helpfully pointed out, you can refer to the Kafka Compatibility Matrix. He also notes:
There is a general answer. Clients above 0.10.2 work with brokers down to that version for all basic functionality until stated otherwise. Extra functionality includes transactional/idempotence and record headers, which Spring may depend on, but Kafka Streams natively has no dependency on.
Additionally, the upgrade section of the Kafka Documentation provides guidance on upgrade order for various Kafka versions.
The compatibility matrix provided by the spring-cloud-stream project may also be of assistance.
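As a minimal sketch of the scenario in the question (a newer Streams library against older brokers), assuming placeholder topic names and a local bootstrap address:

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class StreamsCompatSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "compat-demo");
        // placeholder address; e.g. a 2.7.0 Streams app against a 2.4.x/2.6.x broker
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // copy records from one topic to another, uppercased
        builder.<String, String>stream("input-topic")
               .mapValues(v -> v.toUpperCase())
               .to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```

Nothing in this topology requires broker features newer than basic produce/consume, which is consistent with the compatibility statement quoted above.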

Kafka - Confluent Hub - Use only part of it

I already saw a similar question on SO, but it doesn't clearly answer my doubts.
We have different Kafka clusters and a lot of operational habits around them. We have our own way to start/stop the clusters, lots of operational scripts that help maintain them, etc.
Now we would like to use Kafka Connect connectors for new needs, but from what I saw, Kafka Connect seems extremely coupled to Confluent Hub.
It's as if I can't even use the connectors without having to install a fully operational Confluent Hub.
This makes it very difficult for us to use Kafka Connect connectors. I understand that Confluent Hub might be a framework that helps run those connectors, but it seems we can't even use a dissociated Kafka cluster (one not operated via Confluent Hub).
But maybe I am missing something.
Do you know if there is any way to properly use Kafka connectors on an already existing Kafka cluster (completely independent of Confluent Hub)?
EDIT:
It's more a question about the tight coupling between Confluent Hub and Kafka Connect. All the features that come with Kafka Connect (distributed workers to handle different failover scenarios, etc.) appear unusable without Confluent Hub, hence a "need" to run the Kafka cluster exclusively via Confluent Hub, which is not an easy task when you already have an existing big Kafka cluster with lots of ops habits around it.
Kafka Connect is part of Apache Kafka. It's a pluggable framework for streaming integration between systems in and out of Kafka.
To use Kafka Connect you need connectors for the specific technology with which you want to integrate. For example, S3 sink, Elasticsearch sink, JDBC source or sink, and so on.
The connector API is part of Apache Kafka, and available for anyone who wants to develop a connector.
Connectors are written by various people and organisations, and are available in various different ways. How you obtain a connector depends on which connector you want, how it's licensed, and how the author has made it available for distribution. It could be that you go to GitHub, clone the repo, and build the JAR. It could be that you can download the JAR directly.
All that Confluent Hub does is make lots of these connectors available for you in one place, easily searchable, and with an optional CLI tool that will install them for you.
Do you have to use Confluent Hub? No, not at all. Might it make your life easier in locating connectors that you want to use, and make it easier to install them? Hopefully :)
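As a small illustration that Kafka Connect works with nothing but the Apache Kafka distribution, here is a sketch that registers the FileStreamSource connector (bundled with Kafka itself) through the Connect REST API; the worker address, file path, and topic name are placeholders.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ConnectRegisterSketch {
    public static void main(String[] args) throws Exception {
        // placeholder worker address; the Connect REST API ships with Apache Kafka
        String url = "http://localhost:8083/connectors";
        // FileStreamSource is bundled with the Apache Kafka distribution,
        // so no Confluent Hub download is needed for this example
        String config = """
                {
                  "name": "file-source-demo",
                  "config": {
                    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
                    "tasks.max": "1",
                    "file": "/tmp/input.txt",
                    "topic": "connect-demo"
                  }
                }
                """;

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(url))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(config))
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

Third-party connector JARs obtained from GitHub or elsewhere can be dropped onto the worker's plugin.path and registered the same way.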
Disclaimer: I work for Confluent.

Apache Kafka and supported platforms

Basic question, which platforms and languages does Apache Kafka currently support?
Kafka is written in Scala, which means it runs on the JVM, so you can effectively run it on any OS that supports the JVM. However, the brokers extract a huge performance boost by using the OS's kernel buffer cache. I'm not sure how good this is on a non-Unix system like Windows. The Kafka source code base provides first-class support for Scala and Java clients. You can also find producer and consumer clients in languages like PHP, C++, Python, etc. under the contrib directory.
Apache Kafka runs well and is most stable and performant on Linux (either bare-metal Linux, Linux VMs in private or public clouds, or Linux-based Docker containers). Kafka has been known to run on Windows, but most vendors that commercially support Kafka do not extend their support to Windows for production servers, so it's "community supported" by the Kafka community. Kafka also runs quite well on macOS for development.
The Apache Kafka distribution includes support for Java and Scala clients only, but the larger Kafka community has created a long list of clients for other languages. A good list of the available options is on the Apache Kafka wiki here: https://cwiki.apache.org/confluence/display/KAFKA/Clients
You will find that for some languages (like C#/.NET, Python, or Go) there are 2, 3, or even more options for client libraries. Some are up to date with the newest Kafka wire protocol changes, such as exactly-once semantics and message headers (added in Apache Kafka 0.11), timestamps (added in 0.10), or the security enhancements and new consumer API (added in 0.9), and others are not. Some have the full set of functions/methods provided in Java (like seek(), consumer group management, or interceptors) but others do not. Some are written purely in the target language and others are wrappers around the librdkafka C/C++ library. Some are commercially supported by a vendor and others are not, so choose based on your needs in terms of functionality, stability, execution environment, and supportability.
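For reference, this is roughly what the Java client included in the Apache Kafka distribution looks like; the bootstrap address, group id, and topic name are placeholders.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class JavaConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // placeholder address and topic
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "demo-group"); // consumer group management is built in
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("my-topic"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("%s -> %s%n", record.key(), record.value());
            }
        }
    }
}
```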