MSK, IAM, and the Kafka Java API - Scala

For some reason I can't get my connection to MSK working via the Kafka Java API. I can get producers/consumers to work with MSK using Conduktor and the Kafka CLI tools, but when I hook up my Scala code I can't get it to work. This is the config I use to connect via Conduktor and the Kafka CLI tools:
security.protocol=SASL_SSL
sasl.mechanism=AWS_MSK_IAM
sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler
For my Scala application I set up producers/consumers with a similar pattern:
import java.util.Properties
import org.apache.kafka.clients.consumer.KafkaConsumer

def props: Properties = {
  val p = new Properties()
  ....
  p.setProperty("security.protocol", "SASL_SSL")
  p.setProperty("sasl.mechanism", "AWS_MSK_IAM")
  p.setProperty("sasl.jaas.config", "software.amazon.msk.auth.iam.IAMLoginModule required;")
  p.setProperty("sasl.client.callback.handler.class", "software.amazon.msk.auth.iam.IAMClientCallbackHandler")
  p
}
val CONSUMER = new KafkaConsumer[AnyRef, AnyRef](props)
The code works when I omit the security config lines and run against a local Kafka instance, but when I try to hit MSK it seems the consumer isn't set up correctly and I get the following error.
java.lang.IllegalStateException: You can only check the position for partitions assigned to this consumer.
Since the locally running instance works, this makes me think I'm not setting something up correctly in the config to connect to MSK.
I am trying to follow this tutorial, and I am using Scala 2.11 and Kafka version 2.4.1. I also added aws-msk-iam-auth (1.1.0) to my build.sbt. Any thoughts or solutions?
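For reference, an sbt dependency for that library would look roughly like this (coordinates assumed from mvnrepository.com, version taken from above):

// build.sbt -- assumed Maven coordinates for the MSK IAM auth library
libraryDependencies += "software.amazon.msk" % "aws-msk-iam-auth" % "1.1.0"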

Update: this turned out not to be a problem with my AWS connection, which I confirmed by adding some logging as explained here. The problem lies in a difference between my locally running version of Kafka and MSK. I am still trying to understand what that difference is.
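For anyone who hits the same IllegalStateException: KafkaConsumer.position() throws it when asked about a partition the consumer has not been assigned, which is one possible reason the same code can work locally but fail against a remote cluster where the group join never completes. A rough Scala sketch of checking the assignment first (the topic name is a placeholder, and the elided part of props is assumed to set bootstrap.servers and deserializers):

import java.time.Duration
import java.util.Collections
import scala.collection.JavaConverters._
import org.apache.kafka.clients.consumer.KafkaConsumer

val consumer = new KafkaConsumer[AnyRef, AnyRef](props)
consumer.subscribe(Collections.singletonList("my-topic")) // placeholder topic

// Before the first successful poll the assignment is empty, and calling
// position() on an unassigned partition throws exactly this exception.
consumer.poll(Duration.ofSeconds(5))
val assigned = consumer.assignment().asScala
if (assigned.isEmpty)
  println("No partitions assigned yet - check broker connectivity/authentication")
else
  assigned.foreach(tp => println(s"$tp -> position ${consumer.position(tp)}"))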

Related

What is the equivalent of enable.ssl.certificate.verification from librdkafka in the Kafka client Scala library

We are testing the new TLS configuration in our Kafka clusters in the test environment, and we have two types of consumers: one using librdkafka and the other using Kafka consumers in Scala. The librdkafka configuration is:
security.protocol=SSL
ssl.endpoint.identification.algorithm=none
enable.ssl.certificate.verification=false
This works fine with kafkacat, and also with our libraries that use librdkafka.
But if I try to connect with our Scala consumer, the configuration enable.ssl.certificate.verification doesn't exist in the documentation.
I would like to know what the equivalent of enable.ssl.certificate.verification is for Kafka consumers in Scala or Java; this is just to proceed with our testing. Is there any way to connect with SSL without the certificate using the Scala library?
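For what it's worth, the first two properties map directly onto the Java client that the Scala code sits on top of; as far as I know there is no direct enable.ssl.certificate.verification equivalent there, only ssl.endpoint.identification.algorithm, which disables hostname verification when set to an empty string. A rough sketch, with a placeholder broker address and String deserializers assumed:

import java.util.Properties
import org.apache.kafka.clients.consumer.KafkaConsumer

val p = new Properties()
p.setProperty("bootstrap.servers", "broker:9093") // placeholder address
p.setProperty("security.protocol", "SSL")
// Empty value disables hostname verification; full certificate verification
// is still governed by the (default or configured) truststore.
p.setProperty("ssl.endpoint.identification.algorithm", "")
p.setProperty("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
p.setProperty("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

val consumer = new KafkaConsumer[String, String](p)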

NimbusLeaderNotFoundException in Apache Storm UI

I am trying to launch the Storm UI for a streaming application, but I constantly get this error:
org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [localhost]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?
at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:250)
at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:179)
at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:138)
at org.apache.storm.daemon.ui.resources.StormApiResource.getClusterConfiguration(StormApiResource.java:116)
I have launched Storm locally using the storm script to start Nimbus, submit the jar, and poll the UI. What could be the reason for this?
Here is the code with the connection setup:
val cluster = new LocalCluster()
val bootstrapServers = "localhost:9092"
val spoutConfig = KafkaTridentSpoutConfig.builder(bootstrapServers, "tweets")
  .setProp(props)
  .setFirstPollOffsetStrategy(FirstPollOffsetStrategy.LATEST)
  .build()
val config = new Config()
cluster.submitTopology("kafkaTest", config, tridentTopology.build())
When you submit to a real cluster using storm jar, you should not use LocalCluster; use the StormSubmitter class instead (sketched after the next paragraph).
The error you're getting is saying that it can't find Nimbus at localhost. Are you sure Nimbus is running on the machine you're running storm jar from? If so, please post the commands you're running, and maybe also check the Nimbus log.
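A minimal sketch of the submission path with StormSubmitter instead of LocalCluster, reusing the topology and config names from the question (the jar and main class names mentioned afterwards are placeholders):

import org.apache.storm.{Config, StormSubmitter}

// Build the topology exactly as before, but hand it to StormSubmitter so it
// runs on the real cluster instead of in-process.
val config = new Config()
StormSubmitter.submitTopology("kafkaTest", config, tridentTopology.build())

Then run it with something like storm jar my-topology.jar com.example.Main so the client picks up the cluster's storm.yaml (and therefore the real nimbus.seeds).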

Kafka Connect configuration and the "consumer." prefix

I was hoping to get some clarification on the Kafka Connect configuration properties here: https://docs.confluent.io/current/connect/userguide.html
We were having issues connecting our Kafka Connect instance to our Confluent cluster. We had all our settings configured correctly as far as I could tell, but didn't have any luck.
After extensive googling we discovered that prefixing the configuration properties with "consumer." seems to fix the issue. There is a mention of that prefix here: https://docs.confluent.io/current/connect/userguide.html#overriding-producer-and-consumer-settings
I am having a hard time wrapping my head around the prefix and how the properties are picked up by Connect and used. My assumption was that the Java client used by Kafka Connect picks up the connection properties from the properties file, and that it might have some hard-coded configuration properties that can be overridden by specifying values in the properties file. Is this not correct? The doc linked above mentions
All new producer configs and new consumer configs can be overridden by prefixing them with producer. or consumer.
What are the new configs? The link on that page just takes me to the list of all the configs. The doc mentions
Occasionally, you may have an application that needs to adjust the default settings. One example is a standalone process that runs a log file connector
as the use case for using the prefix override, but this is a Connect cluster, so how does that use case apply? I appreciate your time if you have read this far.
The new prefix is probably misleading. Apache Kafka is currently at version 2.3, and back in 0.8 and 0.9 a "new" producer and consumer API was added. These are now just the standard producer and consumer, but the new prefix has hung around.
In terms of overriding configuration, it is as you say; you can prefix any of the standard consumer/producer configs in the Kafka Connect worker with consumer. (for a sink) or producer. (for a source).
Note that as of Apache Kafka 2.3 you can also override these per connector, as detailed in this post: https://www.confluent.io/blog/kafka-connect-improvements-in-apache-kafka-2-3
This post is old, but I'll answer it for people who face the same difficulty:
"New" properties here just means any custom consumer or producer configs.
There are two levels:
Worker side: the worker has a consumer to read the configs, status, and offsets of each connector, and a producer to write status and offsets (not to be confused with the __consumer_offsets topic; the Connect offsets topic is only used for source connectors). To override those configs, use:
consumer.* (example: consumer.max.poll.records=10)
producer.* (example: producer.batch.size=10000)
Connector side: a connector inherits the worker config by default; to override its consumer/producer configs, use:
consumer.override.* (example: consumer.override.max.poll.records=100)
producer.override.* (example: producer.override.batch.size=20000)

Kafka broker is not available from localhost

I have installed kafka_2.11-1.1.0 and set the advertised listener to advertised.listeners=PLAINTEXT://<my-ip>:9092 (in $KAFKA_HOME/config/server.properties).
I can connect and write to Kafka using Java code and see my cluster via kafka-tool from another server, but I can't write messages to my topic from my local machine (the one the Kafka cluster is installed on).
I have also tried setting the listeners value to listeners=PLAINTEXT://:9092, but nothing changed. What should I do to make Kafka reachable and writable from both outside and inside localhost?
In server.properties, use the following two properties:
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://<your ip>:9092
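Once the broker advertises an address that resolves from both places, a plain producer pointed at that same address should work from inside and outside the machine. A minimal Scala sketch (the broker address and topic are placeholders):

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

val props = new Properties()
// Must be an address that resolves both locally and remotely, i.e. the one
// set in advertised.listeners above.
props.setProperty("bootstrap.servers", "your-ip:9092") // placeholder
props.setProperty("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.setProperty("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

val producer = new KafkaProducer[String, String](props)
producer.send(new ProducerRecord[String, String]("test-topic", "key", "hello")) // placeholder topic
producer.close()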
I finally solved the issue by changing my code's org.apache.kafka library from version 1.1.0 to version 2.1.0.
Note that all of these libraries were downloaded and used via mvnrepository.com.
Also, our Kafka producer and consumer code was written following this article:
https://dzone.com/articles/kafka-producer-and-consumer-example.
Have a look at the following links; they may be helpful for your scenario:
Kafka access inside and outside docker
Kafka Listeners - Explained

Consume from Kafka 0.10.x topic using Storm 0.10.x (KafkaSpout)

I am not sure if this is the right question to ask in this forum. We have been consuming from a Kafka topic with Storm using the Storm KafkaSpout connector, and it was working fine until now. Now we are supposed to connect to a new Kafka cluster running the upgraded version 0.10.x, from the same Storm environment, which is running version 0.10.x.
From the Storm documentation (http://storm.apache.org/releases/1.1.0/storm-kafka-client.html) I can see that Storm 1.1.0 is compatible with Kafka 0.10.x onwards and supports the new Kafka consumer API, but in that case I won't be able to run the topology on my end (please correct me if I am wrong).
Is there any work around for this?
I have seen that even though the new Kafka consumer API removes the ZooKeeper dependency, we can still consume messages using the old kafka-console-consumer.sh by passing the --zookeeper flag instead of the new --bootstrap-server flag (recommended). I ran this command using Kafka 0.9 and was able to consume from a topic hosted on Kafka 0.10.x.
When we try to connect we get the exception below:
java.lang.RuntimeException: java.lang.RuntimeException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /brokers/topics/mytopic/partitions
at storm.kafka.DynamicBrokersReader.getBrokerInfo(DynamicBrokersReader.java:81) ~[stormjar.jar:?]
at storm.kafka.trident.ZkBrokerReader.<init>(ZkBrokerReader.java:42) ~[stormjar.jar:?]
But we are able to connect to the remote ZK server and validated that the path exists:
./zkCli.sh -server remoteZKServer:2181
[zk: remoteZKServer:2181(CONNECTED) 5] ls /brokers/topics/mytopic/partitions
[3, 2, 1, 0]
As we can see above, it gives the expected output, since the topic has 4 partitions.
At this point I have the following questions:
1) Is it at all possible to connect to Kafka 0.10.x using Storm version 0.10.x? Has anyone tried this?
2) Even if we are able to consume, do we need to make any code changes in order to retrieve the message offset in case of topology shutdown/restart? I am asking because we will be passing the ZK cluster details instead of the broker info, as supported in the old KafkaSpout version.
Running out of options here; any pointers would be highly appreciated.
UPDATE:
We are able to connect to and consume from the remote Kafka topic while running it locally using Eclipse. To make sure Storm does not use the in-memory ZK we used the overloaded constructor LocalCluster("zkServer", port); it works fine and we can see the data coming in. This led us to conclude that version compatibility might not be the issue here.
However, we still have no luck when the topology is deployed to the cluster.
We have verified connectivity from the Storm box to the ZK servers.
The znode seems fine as well.
At this point we really need some pointers: what could possibly be wrong, and how do we debug it? We have never worked with Kafka 0.10.x before, so we are not sure what exactly we are missing.
We would really appreciate some help and suggestions.
Storm 0.10.x is compatible with Kafka 0.10.x. We can still use the old KafkaSpout, which depends on the ZooKeeper-based offset storage mechanism (a sketch is below).
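For reference, a rough sketch of wiring up the old ZooKeeper-based spout from Scala; the ZK address and topic are taken from the question above, while the zkRoot and spout ids are placeholders:

import backtype.storm.topology.TopologyBuilder // org.apache.storm.topology in Storm 1.x+
import storm.kafka.{KafkaSpout, SpoutConfig, ZkHosts}

// The old spout reads broker metadata from ZooKeeper and stores its consumed
// offsets back into ZooKeeper under zkRoot/id.
val hosts = new ZkHosts("remoteZKServer:2181")
val spoutConfig = new SpoutConfig(hosts, "mytopic", "/kafka-offsets", "my-spout-id")
val builder = new TopologyBuilder()
builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig), 1)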
The connection loss exception was happening because we were trying to reach a remote Kafka cluster that did not accept connections from our end. We needed to open a specific firewall port so that the connection could be established. It seems that when running a topology in cluster mode, all the supervisor nodes need to be able to talk to ZooKeeper, so the firewall should be open for each of them.