Changing the Dynamic Default Broker Config using Java AdminClient - apache-kafka

Currently I am changing the default broker configurations in my kafka cluster using the kafka-configs.sh script.
./kafka-configs.sh --bootstrap-server <bootstrap_server> --entity-type brokers --entity-default --alter --add-config max.connections=100
The above command sets the default value of the max.connections configuration to 100 on all brokers in the cluster. I would like to achieve the same through Java.
I tried using the alterConfigs method of the AdminClient class. With this method I am able to set the configuration value, but the value is applied at the level of an individual broker.
Because of this I would have to execute alterConfigs for each and every broker in the cluster, which is not scalable.
Could anyone help me change the default broker configuration using the AdminClient class, similar to what I was doing with the shell script?
Thank you.

You could use the code below to set configs at broker-default level:
Properties props = new Properties();
props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// An empty resource name targets the cluster-wide broker default (the equivalent of --entity-default)
ConfigResource configResource = new ConfigResource(ConfigResource.Type.BROKER, "");
ConfigEntry entry = new ConfigEntry("max.connections", String.valueOf(100));
AlterConfigOp op = new AlterConfigOp(entry, AlterConfigOp.OpType.SET);
Map<ConfigResource, Collection<AlterConfigOp>> configs = new HashMap<>(1);
configs.put(configResource, Arrays.asList(op));
try (Admin admin = AdminClient.create(props)) {
    admin.incrementalAlterConfigs(configs).all().get();
}
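To verify the change, the same empty-name ConfigResource can be passed to describeConfigs. A minimal sketch reusing the props above (the localhost address is illustrative):
try (Admin admin = AdminClient.create(props)) {
    ConfigResource defaultBrokers = new ConfigResource(ConfigResource.Type.BROKER, "");
    // The empty broker name should describe the dynamic cluster-wide defaults, e.g. max.connections=100
    Config defaults = admin.describeConfigs(Collections.singletonList(defaultBrokers)).all().get().get(defaultBrokers);
    System.out.println(defaults.get("max.connections"));
}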

Related

Kafka Stream Exception: GroupAuthorizationException

I'm developing a Kafka Streams application that reads messages from an input Kafka topic, filters out unwanted data, and pushes the result to an output Kafka topic.
Kafka Stream Configuration:
@Bean(name = KafkaStreamsDefaultConfiguration.DEFAULT_STREAMS_CONFIG_BEAN_NAME)
public KafkaStreamsConfiguration kStreamsConfigs() {
    Map<String, Object> streamsConfiguration = new HashMap<>();
    streamsConfiguration.put(ConsumerConfig.GROUP_ID_CONFIG, "abcd");
    streamsConfiguration.put(StreamsConfig.APPLICATION_ID_CONFIG, "QC-NormalizedEventProcessor-v1.0.0");
    streamsConfiguration.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "test:9072");
    streamsConfiguration.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
    streamsConfiguration.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
    streamsConfiguration.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);
    streamsConfiguration.put(StreamsConfig.producerPrefix(ProducerConfig.ACKS_CONFIG), -1);
    streamsConfiguration.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, 3);
    streamsConfiguration.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
    streamsConfiguration.put(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG, kafkaConsumerProperties.getConsumerJKSFileLocation());
    streamsConfiguration.put(SslConfigs.SSL_TRUSTSTORE_PASSWORD_CONFIG, kafkaConsumerProperties.getConsumerJKSPwd());
    streamsConfiguration.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
    streamsConfiguration.put(SASL_MECHANISM, "PLAIN");
    return new KafkaStreamsConfiguration(streamsConfiguration);
}
KStream Filter Logic:
@Bean
public KStream<String, String> kStreamJson(StreamsBuilder builder) {
    KStream<String, String> stream = builder.stream(kafkaConsumerProperties.getConsumerTopic(), Consumed.with(Serdes.String(), Serdes.String()));
    /** Printing the source message */
    stream.foreach((key, value) -> LOGGER.info(THREAD_NO + Thread.currentThread().getId() + " *****Message From Input Topic: " + key + ": " + value));
    KStream<String, String> filteredDocument = stream.filter((k, v) -> filterCondition.test(k, v));
    filteredDocument.to(kafkaConsumerProperties.getProducerTopic(), Produced.with(Serdes.String(), Serdes.String()));
    /** After filtering printing the same message */
    filteredDocument.foreach((key, value) -> LOGGER.info(THREAD_NO + Thread.currentThread().getId() + " #####Filtered Document: " + key + ": " + value));
    return stream;
}
While starting the above Spring-based Kafka Streams application, I was getting the exception below.
2019-05-27T07:58:36.018-0500 ERROR stream-thread [QC-NormalizedEventProcessor-v1.0.0-e9cb1bed-3d90-41f1-957a-4fc7efc12a02-StreamThread-1] Encountered the following unexpected Kafka exception during processing, this usually indicate Streams internal errors:
org.apache.kafka.common.errors.GroupAuthorizationException: Not authorized to access group: QC-NormalizedEventProcessor-v1.0.0
Our Kafka infra team has given the necessary permissions for the "group.id"; using this same group id I can consume messages with other Kafka consumer applications, and I chose the "application.id" name as I wished. We are not adding/updating the "application.id" in the Kafka Access Control List.
I am really not sure whether we need to give any permission for the "application.id" or whether I am missing something in the Kafka Streams configuration. Please advise.
Please note: I have tried both with and without "group.id" in the Kafka Streams configuration, and I get the same exception every time.
Thanks!
Bharathiraja Shanmugam
I am not at my desk but I think Streams sets the group.id to application.id.
We need to set access for application.id as well.
For more information please refer -> https://docs.confluent.io/current/streams/developer-guide/security.html
Required ACL setting for secure Kafka clusters
Kafka clusters can use ACLs to control access to resources (like the ability to
create topics), and for such clusters each client, including Kafka
Streams, is required to authenticate as a particular user in order to
be authorized with appropriate access. In particular, when Streams
applications are run against a secured Kafka cluster, the principal
running the application must have the ACL set so that the application
has the permissions to create internal topics.
Since all internal topics as well as the embedded consumer group name are prefixed with the application ID, it is recommended to use ACLs on prefixed resource pattern to configure control lists to allow client to manage all topics and consumer groups started with this prefix as --resource-pattern-type prefixed --topic --operation All (see KIP-277 and KIP-290 for details).
For example, given the following setup of your Streams application:
• Config application.id value is team1-streams-app1.
• Authenticating with the Kafka cluster as a team1 user.
• The application's coded topology reads from input topics input-topic1 and input-topic2.
• The application's topology writes to output topics output-topic1 and output-topic2.
Then the following commands would create the necessary ACLs in the
Kafka cluster to allow your application to operate:
# Allow Streams to read the input topics:
bin/kafka-acls ... --add --allow-principal User:team1 --operation Read --topic input-topic1 --topic input-topic2
# Allow Streams to write to the output topics:
bin/kafka-acls ... --add --allow-principal User:team1 --operation Write --topic output-topic1 --topic output-topic2
# Allow Streams to manage its own internal topics and consumer groups:
bin/kafka-acls ... --add --allow-principal User:team1 --operation All --resource-pattern-type prefixed --topic team1-streams-app1 --group team1-streams-app1
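If you would rather create those ACLs from Java than from the CLI, the AdminClient createAcls API can express the same prefixed rules. A rough sketch under the same assumptions as the example above (the principal User:team1, the team1-streams-app1 prefix, and the bootstrap address are illustrative):
Properties adminProps = new Properties();
adminProps.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
ResourcePattern topics = new ResourcePattern(ResourceType.TOPIC, "team1-streams-app1", PatternType.PREFIXED);
ResourcePattern groups = new ResourcePattern(ResourceType.GROUP, "team1-streams-app1", PatternType.PREFIXED);
AccessControlEntry allowAll = new AccessControlEntry("User:team1", "*", AclOperation.ALL, AclPermissionType.ALLOW);
try (AdminClient admin = AdminClient.create(adminProps)) {
    // One prefixed ACL for the topics and one for the consumer groups, mirroring the kafka-acls command above
    admin.createAcls(Arrays.asList(new AclBinding(topics, allowAll), new AclBinding(groups, allowAll))).all().get();
}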

Kafka spout integration

I am using Kafka 0.10.1.1 and Storm 1.0.2. In the Storm documentation for Kafka integration, I can see that offsets are still maintained using ZooKeeper, as we are initializing the Kafka spout using ZooKeeper servers.
How can I bootstrap the spout using Kafka servers? Is there any example for this?
Example from storm docs
BrokerHosts hosts = new ZkHosts(zkConnString);
SpoutConfig spoutConfig = new SpoutConfig(hosts, topicName, "/" + topicName, UUID.randomUUID().toString());
spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());
KafkaSpout kafkaSpout = new KafkaSpout(spoutConfig);
This option using ZooKeeper is working fine and is consuming the messages, but I was not able to see the consumer group or the Storm nodes as consumers in the kafka-manager UI.
The alternate approach I tried is this.
KafkaSpoutConfig<String, String> kafkaSpoutConfig = newKafkaSpoutConfig();
KafkaSpout<String, String> spout = new KafkaSpout<>(kafkaSpoutConfig);
private static KafkaSpoutConfig<String, String> newKafkaSpoutConfig() {
    Map<String, Object> props = new HashMap<>();
    props.put(KafkaSpoutConfig.Consumer.BOOTSTRAP_SERVERS, bootstrapServers);
    props.put(KafkaSpoutConfig.Consumer.GROUP_ID, GROUP_ID);
    props.put(KafkaSpoutConfig.Consumer.KEY_DESERIALIZER, "org.apache.kafka.common.serialization.StringDeserializer");
    props.put(KafkaSpoutConfig.Consumer.VALUE_DESERIALIZER, "org.apache.kafka.common.serialization.StringDeserializer");
    props.put(KafkaSpoutConfig.Consumer.ENABLE_AUTO_COMMIT, "true");
    String[] topics = new String[1];
    topics[0] = topicName;
    KafkaSpoutStreams kafkaSpoutStreams =
            new KafkaSpoutStreamsNamedTopics.Builder(new Fields("message"), topics).build();
    KafkaSpoutTuplesBuilder<String, String> tuplesBuilder =
            new KafkaSpoutTuplesBuilderNamedTopics.Builder<>(new TuplesBuilder(topicName)).build();
    KafkaSpoutConfig<String, String> spoutConf =
            new KafkaSpoutConfig.Builder<>(props, kafkaSpoutStreams, tuplesBuilder).build();
    return spoutConf;
}
But this solution throws a CommitFailedException after reading a few messages from Kafka.
storm-kafka writes consumer information to a different location, and in a different format, in ZooKeeper than the common Kafka client does, so you can't see it in the kafka-manager UI.
You can find other monitoring tools, like
https://github.com/keenlabs/capillary.
With your alternate approach, you're likely getting the CommitFailedException because of this line:
props.put(KafkaSpoutConfig.Consumer.ENABLE_AUTO_COMMIT, "true");
As of Storm 1.0.6 (and up to 2.0.0-SNAPSHOT), KafkaConsumer autocommit is unsupported.
From the docs:
Note that KafkaConsumer autocommit is unsupported. The
KafkaSpoutConfig constructor will throw an exception if the
"enable.auto.commit" property is set, and the consumer used by the
spout will always have that property set to false. You can configure
similar behavior to autocommit through the setProcessingGuarantee
method on the KafkaSpoutConfig builder.
References:
http://storm.apache.org/releases/2.0.0-SNAPSHOT/storm-kafka-client.html
http://storm.apache.org/releases/1.0.6/storm-kafka-client.html
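As a rough sketch of what the suggested alternative could look like (the broker address, topic, and group id are illustrative, and this assumes a storm-kafka-client version where the KafkaSpoutConfig.builder, setProp, and setProcessingGuarantee methods described in the linked docs are available):
KafkaSpoutConfig<String, String> spoutConf = KafkaSpoutConfig
        .builder("broker1:9092", "my-topic")
        .setProp(ConsumerConfig.GROUP_ID_CONFIG, "my-group")
        // Per the linked docs, NO_GUARANTEE commits offsets periodically, which is the closest behaviour to autocommit
        .setProcessingGuarantee(KafkaSpoutConfig.ProcessingGuarantee.NO_GUARANTEE)
        .build();
KafkaSpout<String, String> spout = new KafkaSpout<>(spoutConf);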

Kafka: dynamically query configurations

Is there a way to access the configuration values in server.properties without direct access to that file itself?
I thought that:
kafka-configs.sh --describe --entity-type topics --zookeeper localhost:2181
might give me what I want, but I did not see the values set in server.properties, just the following (I created the 'ddos' topic myself with kafka-topics.sh):
Configs for topics:ddos are
Configs for topics:__consumer_offsets are segment.bytes=104857600,cleanup.policy=compact
I was thinking I'd also see globally configured options, like this from the default configuration I have:
log.retention.hours=168
Thanks in advance.
Since Kafka 0.11, you can use the AdminClient describeConfigs() API to retrieve the configuration of brokers.
For example, here is skeleton code to retrieve the configuration of broker 0:
Properties adminProps = new Properties();
adminProps.load(new FileInputStream("admin.properties"));
AdminClient admin = KafkaAdminClient.create(adminProps);
// Describe the configuration of broker 0
Collection<ConfigResource> resources = new ArrayList<>();
ConfigResource cr = new ConfigResource(ConfigResource.Type.BROKER, "0");
resources.add(cr);
DescribeConfigsResult dcr = admin.describeConfigs(resources);
System.out.println(dcr.all().get());
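If you only need a single value, such as the log.retention.hours from the question, you can pull the corresponding entry out of the result (continuing from the skeleton above):
Config brokerConfig = dcr.all().get().get(cr);
ConfigEntry retention = brokerConfig.get("log.retention.hours");
System.out.println(retention.name() + " = " + retention.value());
// Close the client when done
admin.close();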

How can I get the group.id of a topic in command line in Kafka?

I installed Kafka on my server and want to learn how to use it.
I found some sample code written in Scala; below is part of it:
def createConsumerConfig(zookeeper: String, groupId: String): ConsumerConfig = {
  val props = new Properties()
  props.put("zookeeper.connect", zookeeper)
  props.put("group.id", groupId)
  props.put("auto.offset.reset", "largest")
  props.put("zookeeper.session.timeout.ms", "400")
  props.put("zookeeper.sync.time.ms", "200")
  props.put("auto.commit.interval.ms", "1000")
  val config = new ConsumerConfig(props)
  config
}
but I don't know how to find the group id on my server.
The group id is something you define yourself for your consumer by providing a string id for it. All consumers started with the same id will "cooperate" and read topics in a coordinated way, where each consumer instance handles a subset of the messages in a topic. Providing a non-existent group id is treated as starting a new consumer group and creates a new entry in ZooKeeper where committed offsets will be stored.
You could open a ZooKeeper shell and list the path where Kafka stores consumers' offsets, like this:
./bin/zookeeper-shell.sh localhost:2181
ls /consumers
You'll get a list of all groups.
EDIT: I missed the part where you said that you're setting this up yourself, so I thought you wanted to list the consumer groups of an existing cluster.
Lundahl is right: this is a property that you define yourself, which is used to coordinate consumer threads so that they don't consume "each other's" messages (each consumes a subset). If you, for example, use 2 consumers with different groups, they'll each consume the whole topic.
/kafkadir/kafka-consumer-groups.sh --all-topics --bootstrap-server hostname:port --list
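On newer clusters there is also a programmatic equivalent of that command. A small sketch using the Java AdminClient (the bootstrap address is illustrative):
Properties props = new Properties();
props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "hostname:9092");
try (AdminClient admin = AdminClient.create(props)) {
    // Prints every consumer group id known to the cluster
    admin.listConsumerGroups().all().get().forEach(g -> System.out.println(g.groupId()));
}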

Kafka 0.8, is it possible to create topic with partition and replication using java code?

In Kafka 0.8 beta a topic can be created using a command like the one below, as mentioned here:
bin/kafka-create-topic.sh --zookeeper localhost:2181 --replica 2 --partition 3 --topic test
The above command will create a topic named "test" with 3 partitions and 2 replicas per partition.
Can I do the same thing using Java?
So far what I have found is that using Java we can create a producer, as seen below:
Producer<String, String> producer = new Producer<String, String>(config);
producer.send(new KeyedMessage<String, String>("mytopic", msg));
This will create a topic named "mytopic" with the number of partitions specified by the "num.partitions" attribute, and start producing.
But is there a way to define the partitions and replication as well? I couldn't find any such example. If we can't, does that mean we always need to create a topic with the required partitions and replication beforehand and then use the producer to produce messages to that topic? For example, would it be possible to create "mytopic" the same way but with a different number of partitions (overriding the num.partitions attribute)?
Note: My answer covers Kafka 0.8.1+, i.e. the latest stable version available as of April 2014.
Yes, you can create a topic programmatically via the Kafka API. And yes, you can specify the desired number of partitions as well as the replication factor for the topic.
Note that the recently released Kafka 0.8.1+ provides a slightly different API than Kafka 0.8.0 (which was used by Biks in his linked reply). I added a code example for creating a topic in Kafka 0.8.1+ to my reply to the question "How Can we create a topic in Kafka from the IDE using API" that Biks was referring to above.
import kafka.admin.AdminUtils;
import kafka.utils.ZKStringSerializer$;
import kafka.utils.ZkUtils;
import org.I0Itec.zkclient.ZkClient;
import org.I0Itec.zkclient.ZkConnection;
import java.util.Properties;
String zkConnect = "localhost:2181";
ZkClient zkClient = new ZkClient(zkConnect, 10 * 1000, 8 * 1000, ZKStringSerializer$.MODULE$);
ZkUtils zkUtils = new ZkUtils(zkClient, new ZkConnection(zkConnect), false);
Properties topicConfig = new Properties();
AdminUtils.createTopic(zkUtils, topic.getTopicName(), topic.getPartitionCount(), topic.getReplicationFactor(), topicConfig);
zkClient.close();
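On newer Kafka versions (0.11 and later) the ZkClient/AdminUtils route is no longer necessary; the same topic can be created through the AdminClient API. A minimal sketch (the broker address, topic name, partition count, and replication factor below are illustrative):
Properties props = new Properties();
props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
try (AdminClient admin = AdminClient.create(props)) {
    // Topic name, number of partitions, replication factor
    NewTopic newTopic = new NewTopic("mytopic", 3, (short) 2);
    admin.createTopics(Collections.singletonList(newTopic)).all().get();
}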