Kafka-connect, Bootstrap broker disconnected

I'm trying to set up Kafka Connect with the intent of running an ElasticsearchSinkConnector.
The Kafka setup consists of 3 brokers secured using Kerberos, SSL and ACLs.
So far I've been experimenting with running the Connect framework and the Elasticsearch server locally using docker/docker-compose (Confluent docker image 5.4 with Kafka 2.4), connecting to the remote Kafka installation (Kafka 2.0.1 - actually our production environment).
KAFKA_OPTS: -Djava.security.krb5.conf=/etc/kafka-connect/secrets/krb5.conf
CONNECT_BOOTSTRAP_SERVERS: srv-kafka-1.XXX.com:9093,srv-kafka-2.XXX.com:9093,srv-kafka-3.XXX.com:9093
CONNECT_REST_ADVERTISED_HOST_NAME: kafka-connect
CONNECT_REST_PORT: 8083
CONNECT_GROUP_ID: user-grp
CONNECT_CONFIG_STORAGE_TOPIC: test.internal.connect.configs
CONNECT_OFFSET_STORAGE_TOPIC: test.internal.connect.offsets
CONNECT_STATUS_STORAGE_TOPIC: test.internal.connect.status
CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
CONNECT_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
CONNECT_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
CONNECT_INTERNAL_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
CONNECT_INTERNAL_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
CONNECT_ZOOKEEPER_CONNECT: srv-kafka-1.XXX.com:2181,srv-kafka-2.XXX.com:2181,srv-kafka-3.XXX.com:2181
CONNECT_SECURITY_PROTOCOL: SASL_SSL
CONNECT_SASL_KERBEROS_SERVICE_NAME: "kafka"
CONNECT_SASL_JAAS_CONFIG: com.sun.security.auth.module.Krb5LoginModule required \
useKeyTab=true \
storeKey=true \
keyTab="/etc/kafka-connect/secrets/kafka-connect.keytab" \
principal="<principal>;
CONNECT_SASL_MECHANISM: GSSAPI
CONNECT_SSL_TRUSTSTORE_LOCATION: <path_to_truststore.jks>
CONNECT_SSL_TRUSTSTORE_PASSWORD: <PWD>
When starting the Connect framework everything seems to work fine; I can see logs claiming that the Kerberos authentication is successful, etc.
The problem comes when I try to start a connect-job using curl.
curl -X POST -H "Content-Type: application/json" --data '{ "name": "kafka-connect", "config": { "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector", "tasks.max": 1, "topics": "test.output.outage", "key.ignore": true, "connection.url": "http://elasticsearch1:9200", "type.name": "kafka-connect" } }' http://localhost:8083/connectors
The job seems to start up without issues, but as soon as it is about to start consuming from the Kafka topic I get:
kafka-connect | [2020-04-06 10:35:33,482] WARN [Consumer clientId=connector-consumer-user-grp-2-0, groupId=connect-user-2] Bootstrap broker srv-kafka-1.XXX.com:9093 (id: -1 rack: null) disconnected (org.apache.kafka.clients.NetworkClient)
repeated in the connect log for all brokers.
What is the nature of this issue? Communication with the brokers seems to work well - the connect job is communicated back to Kafka as intended, and when the Connect framework is restarted the job seems to resume as intended (even though still faulty).
Does anyone have an idea what might be causing this, or how I should go about debugging it?
Since it is our production environment I have only a limited possibility to change the server configuration, but from what I can tell nothing in the logs indicates that something is wrong.
Thanks in advance

Per the docs, you also need to configure security on the consumer/producer for the connector(s) that Kafka Connect runs. You do this by adding a consumer/producer prefix. So since you're using Docker, and the error suggests that you were creating a sink connector (i.e. requiring a consumer), add to your config:
CONNECT_CONSUMER_SECURITY_PROTOCOL: SASL_SSL
CONNECT_CONSUMER_SASL_KERBEROS_SERVICE_NAME: "kafka"
CONNECT_CONSUMER_SASL_JAAS_CONFIG: com.sun.security.auth.module.Krb5LoginModule required \
useKeyTab=true \
storeKey=true \
keyTab="/etc/kafka-connect/secrets/kafka-connect.keytab" \
principal="<principal>;
CONNECT_CONSUMER_SASL_MECHANISM: GSSAPI
CONNECT_CONSUMER_SSL_TRUSTSTORE_LOCATION: <path_to_truststore.jks>
CONNECT_CONSUMER_SSL_TRUSTSTORE_PASSWORD: <PWD>
If you're also creating a source connector you'll need to replicate the above, but with the PRODUCER_ prefix instead, as sketched below.
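For reference, a producer-prefixed variant would simply mirror the consumer block above (a sketch only; the keytab, principal and truststore values are the same placeholders as before):
CONNECT_PRODUCER_SECURITY_PROTOCOL: SASL_SSL
CONNECT_PRODUCER_SASL_KERBEROS_SERVICE_NAME: "kafka"
CONNECT_PRODUCER_SASL_JAAS_CONFIG: com.sun.security.auth.module.Krb5LoginModule required \
useKeyTab=true \
storeKey=true \
keyTab="/etc/kafka-connect/secrets/kafka-connect.keytab" \
principal="<principal>";
CONNECT_PRODUCER_SASL_MECHANISM: GSSAPI
CONNECT_PRODUCER_SSL_TRUSTSTORE_LOCATION: <path_to_truststore.jks>
CONNECT_PRODUCER_SSL_TRUSTSTORE_PASSWORD: <PWD>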

Related

Need Kafka KRAFT SASL_PLAINTEXT

I'm trying to get Kafka in KRaft mode up and running with SASL_PLAINTEXT.
I've got a functioning Kafka broker/controller up and running locally, without SASL, using this server.properties:
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
inter.broker.listener.name=PLAINTEXT
advertised.listeners=PLAINTEXT://:9092
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
I’ve bound ports from the kafka docker container 9092 to 9092 on the host
kafka-topics.sh --list --bootstrap-server localhost:9092
kafka-topics.sh --bootstrap-server localhost:9092 --topic test --create --partitions 2 --replication-factor 1
Works like a charm, and I can produce and consume. The Docker container logs look good as well.
I need some users to handle ACLs on our topics, so I thought it would be easy to just replace all PLAINTEXT fields with SASL_PLAINTEXT. I was wrong!
We handle encryption on another level, so SASL_PLAINTEXT is sufficient; we don't need SASL_SSL.
This is the config/kraft/sasl_server.properties I've been trying out so far, with no luck.
I've constructed this properties file by following https://docs.confluent.io/platform/current/kafka/authentication_sasl/authentication_sasl_plain.html
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@localhost:9094
listeners=SASL_PLAINTEXT://:9092,CONTROLLER://:9094
advertised.listeners=SASL_PLAINTEXT://:9092
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:SASL_PLAINTEXT,SASL_PLAINTEXT:SASL_PLAINTEXT
sasl.enabled.mechanisms=PLAIN
sasl.mechanism.inter.broker.protocol=PLAIN
security.inter.broker.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN
security.protocol=SASL_PLAINTEXT
listener.name.controller.plain.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
username="admin" \
password="admin-secret" \
user_admin="admin-secret";
plain.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
username="admin" \
password="admin-secret";
I’m getting this error
java.lang.IllegalArgumentException: Could not find a 'KafkaServer' or 'controller.KafkaServer' entry in the JAAS configuration. System property 'java.security.auth.login.config' is not set
What am I doing wrong here?
process.roles=$KAFKA_PROCESS_ROLES
node.id=$KAFKA_NODE_ID
controller.quorum.voters=$KAFKA_CONTROLLER_QUORUM_VOTERS
listeners=BROKER://:9092,CONTROLLER://:9093
advertised.listeners=BROKER://:9092
listener.security.protocol.map=BROKER:SASL_PLAINTEXT,CONTROLLER:SASL_PLAINTEXT
inter.broker.listener.name=BROKER
controller.listener.names=CONTROLLER
sasl.enabled.mechanisms=PLAIN
sasl.mechanism.controller.protocol=PLAIN
sasl.mechanism.inter.broker.protocol=PLAIN
listener.name.broker.plain.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
username="admin" \
password="$KAFKA_ADMIN_PASSWORD" \
user_admin="$KAFKA_ADMIN_PASSWORD";
listener.name.controller.plain.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
username="admin" \
password="$KAFKA_ADMIN_PASSWORD" \
user_admin="$KAFKA_ADMIN_PASSWORD";
The configuration above is the working one I ended up with; sasl.mechanism.controller.protocol=PLAIN was the important setting.
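To sanity-check the SASL_PLAINTEXT listener from a client, a minimal client properties file along these lines can be used (a sketch only; the admin credentials are the ones defined in the broker JAAS entries above, and client-sasl.properties is just a placeholder file name):
security.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
username="admin" \
password="<admin-password>";
and then, for example:
kafka-topics.sh --list --bootstrap-server localhost:9092 --command-config client-sasl.properties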

Kafka sink connector not committing offsets but consuming messages

I am using the Kafka Snowflake sink connector to sink messages to Snowflake.
Docker image: cp-kafka-connect-base:6.2.0
I have two worker pods running in distributed mode. Please find the connector config below:
connector.class: "com.snowflake.kafka.connector.SnowflakeSinkConnector"
tasks.max: "2"
topics: "test-topic"
snowflake.topic2table.map: "test-topic:table1"
buffer.count.records: "500000"
buffer.flush.time: "240"
buffer.size.bytes: "100000000"
snowflake.url.name: "<url>"
snowflake.warehouse.name: "name"
snowflake.user.name: "username"
snowflake.private.key: "key"
snowflake.private.key.passphrase: "pass"
snowflake.database.name: "db-name"
snowflake.schema.name: "schema-name"
key.converter: "com.snowflake.kafka.connector.records.SnowflakeJsonConverter"
value.converter: "com.snowflake.kafka.connector.records.SnowflakeJsonConverter"
envs:
CONNECT_GROUP_ID: "testgroup"
CONNECT_CONFIG_STORAGE_TOPIC: "snowflakesync-config"
CONNECT_STATUS_STORAGE_TOPIC: "snowflakesync-status"
CONNECT_OFFSET_STORAGE_TOPIC: "snowflakesync-offset"
CONNECT_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
CONNECT_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
CONNECT_REST_ADVERTISED_HOST_NAME: "localhost"
CONNECT_REST_PORT: "8083"
CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: "3"
CONNECT_OFFSET_FLUSH_INTERVAL_MS: "5000"
CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: "3"
CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: "3"
CONNECT_INTERNAL_KEY_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
CONNECT_INTERNAL_VALUE_CONVERTER: "org.apache.kafka.connect.json.JsonConverter"
CONNECTOR_NAME: "test-conn"
I am running two pods with the above config. Each pod is properly assigned one partition and starts consuming.
Question:
Whenever I deploy or restart the pods, the offsets get committed (CURRENT-OFFSET is updated) only ONCE; after that the sink connector keeps consuming messages from the topic, but the current offset is not updated at all (offsets are not getting committed).
kafka-consumer-groups --bootstrap-server <server> --describe --group connect-test-conn
This is the command used to check whether the current offset is getting updated or not. Since the current offset is only updated once, it always shows a lag, and the lag keeps increasing.
But I can see in the logs (put records) and in Snowflake that the events are getting persisted.
I would like to know why the offsets are not being committed continuously.
Example case (output of the consumer-groups command):
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
events-sync 0 6408022 25524319 19116297 connector-consumer-events-sync-0-b9142c5f-3bb7-47b1-bd44-a169a7984952 /xx.xx.xx.xx connector-consumer-events-sync-0
events-sync 1 25521059 25521202 143 connector-consumer-events-sync-1-107f2aa8-969c-4d7e-87f8-fdb2be2480b3 /xx.xx.xx.xx connector-consumer-events-sync-1
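One way to watch this over time is simply to re-run the describe command in a loop (a sketch; the bootstrap server and group name are the same placeholders as above):
while true; do
  kafka-consumer-groups --bootstrap-server <server> --describe --group connect-test-conn
  sleep 60
done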

Problems starting new Kafka Connect SSL-based connector

I'm trying to set up a new ElasticsearchSink job on our Kafka Connect cluster. The cluster has been working smoothly for a couple of months with a SASL_SSL-secured connection to Kafka and HTTPS to an Elastic instance on host A. The KC cluster normally runs in Kubernetes, but for testing purposes I also run it locally using Docker (an image based on Confluent's KC image v6.0.0); the Kafka resides in a test environment and the job is started using REST calls.
The docker-compose file used for running it locally looks like this:
version: '3.7'
services:
  connect:
    build:
      dockerfile: Dockerfile.local
      context: ./
    container_name: kafka-connect
    ports:
      - "8083:8083"
    environment:
      KAFKA_OPTS: -Djava.security.krb5.conf=/<path-to>/secrets/krb5.conf
        -Djava.security.auth.login.config=/<path-to>/rest-basicauth-jaas.conf
      CONNECT_BOOTSTRAP_SERVERS: <KAFKA-INSTANCE-1>:2181,<KAFKA-INSTANCE-2>:2181,<KAFKA-INSTANCE-3>:2181
      CONNECT_REST_ADVERTISED_HOST_NAME: kafka-connect
      CONNECT_REST_PORT: 8083
      CONNECT_REST_EXTENSION_CLASSES: org.apache.kafka.connect.rest.basic.auth.extension.BasicAuthSecurityRestExtension
      CONNECT_GROUP_ID: <kc-group>
      CONNECT_CONFIG_STORAGE_TOPIC: service-assurance.test.internal.connect.configs
      CONNECT_OFFSET_STORAGE_TOPIC: service-assurance.test.internal.connect.offsets
      CONNECT_STATUS_STORAGE_TOPIC: service-assurance.test.internal.connect.status
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_KEY_CONVERTER: org.apache.kafka.connect.converters.IntegerConverter
      CONNECT_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_INTERNAL_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_INTERNAL_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_ZOOKEEPER_CONNECT: <KAFKA-INSTANCE-1>:2181,<KAFKA-INSTANCE-2>:2181,<KAFKA-INSTANCE-3>:2181
      CONNECT_SECURITY_PROTOCOL: SASL_SSL
      CONNECT_SASL_KERBEROS_SERVICE_NAME: "kafka"
      CONNECT_SASL_JAAS_CONFIG: com.sun.security.auth.module.Krb5LoginModule required \
        useKeyTab=true \
        storeKey=true \
        keyTab="/<path-to>/kafka-connect.keytab" \
        principal="<AD-USER>";
      CONNECT_SASL_MECHANISM: GSSAPI
      CONNECT_SSL_TRUSTSTORE_LOCATION: "/<path-to>/truststore.jks"
      CONNECT_SSL_TRUSTSTORE_PASSWORD: <pwd>
      CONNECT_CONSUMER_SECURITY_PROTOCOL: SASL_SSL
      CONNECT_CONSUMER_SASL_KERBEROS_SERVICE_NAME: "kafka"
      CONNECT_CONSUMER_SASL_JAAS_CONFIG: com.sun.security.auth.module.Krb5LoginModule required \
        useKeyTab=true \
        storeKey=true \
        keyTab="/<path-to>/kafka-connect.keytab" \
        principal="<AD-USER>";
      CONNECT_CONSUMER_SASL_MECHANISM: GSSAPI
      CONNECT_CONSUMER_SSL_TRUSTSTORE_LOCATION: "/<path-to>/truststore.jks"
      CONNECT_CONSUMER_SSL_TRUSTSTORE_PASSWORD: <pwd>
      CONNECT_PLUGIN_PATH: "/usr/share/java,/etc/kafka-connect/jars"
With a similar Kubernetes configuration.
The connector is started using something like:
curl -X POST -H "Content-Type: application/json" --data '{
  "name": "connector-name",
  "config": {
    "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
    "tasks.max": 2,
    "batch.size": 200,
    "max.buffered.records": 1500,
    "flush.timeout.ms": 120000,
    "topics": "topic.connector",
    "auto.create.indices.at.start": false,
    "key.ignore": true,
    "value.converter.schemas.enable": false,
    "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "schema.ignore": true,
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "behavior.on.malformed.documents": "ignore",
    "behavior.on.null.values": "ignore",
    "connection.url": "https://<elastic-host>",
    "connection.username": "<user>",
    "connection.password": "<pwd>",
    "type.name": "_doc"
  }
}' <host>/connectors/
Now, I've been tasked to set up yet another connector, this time hosted on host B. The problem I am experiencing is the infamous:
sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
I have modified the working truststore to include the CA root certificate for host B as well. I believe the truststore is working, as I am able to use it from a Java code snippet (SSLPoke.class, actually found on an Atlassian page) to connect to both A and B successfully.
The connectors connecting to host A still work with the newly updated truststore but not the connector connecting to host B.
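As a side note, the contents of the updated truststore can be listed with keytool to confirm that both CA root certificates are present (the path is a placeholder):
keytool -list -v -keystore /<path-to>/truststore.jks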
I have scanned the internet for clues on how to solve this and came across suggestions to explicitly add:
"elastic.https.ssl.truststore.location": "/<pathto>/truststore.jks",
"elastic.https.ssl.truststore.password": "<pwd>",
to the connector configuration. Some other page suggested adding the truststore to the KC configuration's KAFKA_OPTS, like so:
KAFKA_OPTS: -Djava.security.krb5.conf=/<path-to>/secrets/krb5.conf
-Djava.security.auth.login.config=/<path-to>/rest-basicauth-jaas.conf
-Djavax.net.ssl.trustStore=/<path-to>/truststore.jks
Following these suggestions I can actually get the connector connecting to host B to start successfully. But now comes the annoying part: with the extra param added to KAFKA_OPTS, my old connectors connecting to A stop working - with the exact same error! So now I have a case of either the connectors connecting to A OR the connectors connecting to B working, but not both at the same time.
Please, if anyone could give me some pointers or ideas on how to fix this it would be much appreciated, because this is driving me nuts.

Cannot setup kerberized kafka broker: Error in authenticating with a Zookeeper Quorum member: the quorum member's saslToken is null

So I have been trying to set up a Kafka broker and ZooKeeper on a single node with Kerberos enabled.
Most of it is based on this tutorial: https://qiita.com/visualskyrim/items/8f48ff107232f0befa5a
System: Ubuntu 18.04
Setup: a ZooKeeper instance and a Kafka broker process in one EC2 box, and a KDC in another EC2 box. Both are in the same security group, with UDP port 88 open.
Here is what I have done so far.
Downloaded the Kafka broker from here: https://kafka.apache.org/downloads
Created a KDC and correctly generated a keytab (verified via kinit -t). Then defined the krb5 config file and host entries for the KDC in /etc/hosts.
Created two JAAS configs:
cat zookeeper_jaas.conf
Server {
com.sun.security.auth.module.Krb5LoginModule required debug=true
useKeyTab=true
keyTab="/etc/kafka/zookeeper.keytab"
storeKey=true
useTicketCache=false
principal="zookeeper";
};
cat /etc/kafka/kafka_jaas.conf
KafkaServer {
com.sun.security.auth.module.Krb5LoginModule required debug=true
useKeyTab=true
useTicketCache=false
storeKey=true
keyTab="/etc/kafka/kafka.keytab"
principal="kafka";
};
Client {
com.sun.security.auth.module.Krb5LoginModule required debug=true
useKeyTab=true
storeKey=true
useTicketCache=false
keyTab="/etc/kafka/kafka.keytab"
principal="kafka";
};
Added some lines to the Kafka broker and ZooKeeper configs.
The config/zookeeper.properties file has these extra lines added:
authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
jaasLoginRenew=3600000
kerberos.removeHostFromPrincipal=true
kerberos.removeRealmFromPrincipal=true
and config/server.properties (the broker config file) has these extra lines added:
listeners=SASL_PLAINTEXT://kafka.com:9092
security.inter.broker.protocol=SASL_PLAINTEXT
sasl.mechanism.inter.broker.protocol=GSSAPI
sasl.enabled.mechanisms=GSSAPI
sasl.kerberos.service.name=kafka
In one screen session, I run
export KAFKA_OPTS="-Djava.security.krb5.conf=/etc/krb5.conf -Djava.security.auth.login.config=/etc/kafka/zookeeper_jaas.conf -Dsun.security.krb5.debug=true"
and then run
bin/zookeeper-server-start.sh config/zookeeper.properties
This runs correctly, and ZooKeeper starts up.
In another screen session I run
export KAFKA_OPTS="-Djava.security.krb5.conf=/etc/krb5.conf -Djava.security.auth.login.config=/etc/kafka/kafka_jaas.conf -Dsun.security.krb5.debug=true"
and then run
bin/kafka-server-start.sh config/server.properties
But this one fails, with this exception
[2020-02-27 22:56:04,724] ERROR SASL authentication failed using login context 'Client' with
exception: {} (org.apache.zookeeper.client.ZooKeeperSaslClient)
javax.security.sasl.SaslException: Error in authenticating with a Zookeeper Quorum member:
the quorum member's saslToken is null.
at org.apache.zookeeper.client.ZooKeeperSaslClient.createSaslToken(ZooKeeperSaslClient.java:279)
at org.apache.zookeeper.client.ZooKeeperSaslClient.respondToServer(ZooKeeperSaslClient.java:242)
at org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:805)
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:94)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
[2020-02-27 22:56:04,726] ERROR [ZooKeeperClient Kafka server] Auth failed.
(kafka.zookeeper.ZooKeeperClient)
[2020-02-27 22:56:04,842] ERROR Fatal error during KafkaServer startup. Prepare to shutdown
(kafka.server.KafkaServer)
org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = AuthFailed for
/consumers
at org.apache.zookeeper.KeeperException.create(KeeperException.java:126)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
at kafka.zookeeper.AsyncResponse.maybeThrow(ZooKeeperClient.scala:560)
at kafka.zk.KafkaZkClient.createRecursive(KafkaZkClient.scala:1610)
at kafka.zk.KafkaZkClient.makeSurePersistentPathExists(KafkaZkClient.scala:1532)
at kafka.zk.KafkaZkClient.$anonfun$createTopLevelPaths$1(KafkaZkClient.scala:1524)
at kafka.zk.KafkaZkClient.$anonfun$createTopLevelPaths$1$adapted(KafkaZkClient.scala:1524)
at scala.collection.immutable.List.foreach(List.scala:392)
at kafka.zk.KafkaZkClient.createTopLevelPaths(KafkaZkClient.scala:1524)
at kafka.server.KafkaServer.initZkClient(KafkaServer.scala:388)
at kafka.server.KafkaServer.startup(KafkaServer.scala:207)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
at kafka.Kafka$.main(Kafka.scala:84)
at kafka.Kafka.main(Kafka.scala)
I also enabled the Kerberos debug logs.
This was the credentials log from Kerberos:
DEBUG: ----Credentials----
client: kafka@VISUALSKYRIM
server: zookeeper/localhost@VISUALSKYRIM
ticket: sname: zookeeper/localhost@VISUALSKYRIM
endTime: 1582881662000
----Credentials end----
This implies the Client JAAS config is somehow an issue, and the issue arises from this line of code: https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/client/ZooKeeperSaslClient.java#L310, but I cannot for the life of me figure out why. I cross-referenced it with the Confluent docs (https://docs.confluent.io/2.0.0/kafka/sasl.html) and it seems I am doing the right thing. So what gives?
Can anyone help me out with this? Thanks.
Well, it turns out Kafka implicitly believes ZooKeeper's principal is
zookeeper/localhost
In order to make progress I:
Created a zookeeper/localhost principal in the KDC (see the kadmin sketch below the updated config).
Created a keytab for it called zookeeper-server.keytab.
Updated the ZooKeeper JAAS config to be:
Server {
com.sun.security.auth.module.Krb5LoginModule required debug=true
useKeyTab=true
keyTab="/etc/kafka/zookeeper-server.keytab"
storeKey=true
useTicketCache=false
principal="zookeeper/localhost";
};
Which now no longer shows this error.
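For reference, the zookeeper/localhost principal and its keytab can be created with kadmin along these lines (a sketch only; the realm is a placeholder):
kadmin -q "addprinc -randkey zookeeper/localhost@<REALM>"
kadmin -q "ktadd -k /etc/kafka/zookeeper-server.keytab zookeeper/localhost@<REALM>"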
The kafka producer seems to be picking up the SPNs based on my /etc/hosts config
# Replace the Kerberos KDC server IP with the appropriate IP address
172.31.40.220 kerberos.com
127.0.0.1 localhost
Maybe try to look at KAFKA_HOME/config/server.properties and change the default localhost to your host in
zookeeper.connect=localhost:2181
since the principal cname was not the same as the sname. Example:
cname zk/myhost@REALM.MY
sname zookeeper/localhost@REALM.MY
I also ended up using the EXTRA_ARGS option -Dzookeeper.sasl.client.username=zk as stated in the docs, as shown below.
Worked for me. It seems the code (1, 2) that should take care of this ignores it and uses this property instead.
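For example, the property can be passed to the broker JVM together with the other flags via KAFKA_OPTS, as used earlier in this post (a sketch; adjust the paths to your setup):
export KAFKA_OPTS="-Djava.security.krb5.conf=/etc/krb5.conf -Djava.security.auth.login.config=/etc/kafka/kafka_jaas.conf -Dzookeeper.sasl.client.username=zk"
bin/kafka-server-start.sh config/server.properties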

Kafka-connect issue

I installed Apache Kafka (Confluent distribution) on CentOS 7 and am trying to run the FileStream Kafka Connect connector in distributed mode, but I was getting the error below:
[2017-08-10 05:26:27,355] INFO Added alias 'ValueToKey' to plugin 'org.apache.kafka.connect.transforms.ValueToKey' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:290)
Exception in thread "main" org.apache.kafka.common.config.ConfigException: Missing required configuration "internal.key.converter" which has no default value.
at org.apache.kafka.common.config.ConfigDef.parseValue(ConfigDef.java:463)
at org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:453)
at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:62)
at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:75)
at org.apache.kafka.connect.runtime.WorkerConfig.<init>(WorkerConfig.java:197)
at org.apache.kafka.connect.runtime.distributed.DistributedConfig.<init>(DistributedConfig.java:289)
at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:65)
This is now resolved by updating workers.properties as mentioned in http://docs.confluent.io/current/connect/userguide.html#connect-userguide-distributed-config
Command used:
/home/arun/kafka/confluent-3.3.0/bin/connect-distributed.sh ../../../properties/file-stream-demo-distributed.properties
Filestream properties file (workers.properties):
name=file-stream-demo-distributed
connector.class=org.apache.kafka.connect.file.FileStreamSourceConnector
tasks.max=1
file=/tmp/demo-file.txt
bootstrap.servers=localhost:9092,localhost:9093,localhost:9094
config.storage.topic=demo-2-distributed
offset.storage.topic=demo-2-distributed
status.storage.topic=demo-2-distributed
key.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=true
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter.schemas.enable=false
group.id=""
I added the properties below and the command went through without any errors.
bootstrap.servers=localhost:9092,localhost:9093,localhost:9094
config.storage.topic=demo-2-distributed
offset.storage.topic=demo-2-distributed
status.storage.topic=demo-2-distributed
group.id=""
But now when I run the consumer command, I am unable to see the messages from /tmp/demo-file.txt. Please let me know if there is a way I can check whether the messages are published to the Kafka topic and its partitions.
kafka-console-consumer --zookeeper localhost:2181 --topic demo-2-distributed --from-beginning
I believe I am missing something really basic here. Can someone please help?
You need to define unique topics for the Kafka Connect framework to store its config, offsets, and status.
In your workers.properties file change these parameters to something like the following:
config.storage.topic=demo-2-distributed-config
offset.storage.topic=demo-2-distributed-offset
status.storage.topic=demo-2-distributed-status
These topics are used to store the state and configuration metadata of Connect, not the messages of any of the connectors that run on top of Connect. Do not use a console consumer on any of these three topics and expect to see the messages.
The messages are stored in the topic configured in the connector configuration JSON with the parameter called "topics".
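To see the actual data, point a console consumer at that data topic instead (a sketch, assuming the mytopic topic from the example below and a broker on localhost:9092):
kafka-console-consumer --bootstrap-server localhost:9092 --topic mytopic --from-beginning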
Example file-sink-config.json file
{
  "name": "MyFileSink",
  "config": {
    "topics": "mytopic",
    "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
    "tasks.max": 1,
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.storage.StringConverter",
    "file": "/tmp/demo-file.txt"
  }
}
Once the distributed worker is running you need to apply the config file to it using curl like so:
curl -X POST -H "Content-Type: application/json" --data @file-sink-config.json http://localhost:8083/connectors
After that the config will be safely stored in the config topic you created, for all distributed workers to use. Make sure the config topic (and the status and offset topics) will not expire messages, or you will lose your connector configuration when that happens; see the sketch below for creating them as compacted topics.
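For example, the three internal topics can be created up front as compacted topics so their messages never expire (a sketch using the older --zookeeper style tooling from this post; topic names, partition counts and replication factors are placeholders to adapt, and note the config topic must have a single partition):
kafka-topics --zookeeper localhost:2181 --create --topic demo-2-distributed-config --partitions 1 --replication-factor 3 --config cleanup.policy=compact
kafka-topics --zookeeper localhost:2181 --create --topic demo-2-distributed-offset --partitions 25 --replication-factor 3 --config cleanup.policy=compact
kafka-topics --zookeeper localhost:2181 --create --topic demo-2-distributed-status --partitions 5 --replication-factor 3 --config cleanup.policy=compact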