Logstash Kafka output/input is not working

I am trying to use Logstash with a Kafka broker, but they will not integrate. The versions are:
logstash 2.4
kafka 0.8.2.2 with scala 2.10
My Logstash config file is:
input {
  stdin {
  }
}
output {
  stdout {
    codec => rubydebug
  }
  kafka {
    bootstrap_servers => "10.120.16.202:6667,10.120.16.203:6667,10.120.16.204:6667"
    topic_id => "cephosd1"
  }
}
I can list the topic cephosd1 from Kafka, and the stdout output prints content as expected. But I cannot read anything from kafka-console-consumer.sh.

I think you have a compatibility issue. If you check the version compatibility matrix between Logstash, Kafka and the kafka output plugin, you'll see that the kafka output plugin in Logstash 2.4 uses the Kafka 0.9 client version.
If you have a Kafka broker 0.8.2.2, it is not compatible with the client version 0.9 (the other way around would be ok). You can either downgrade to Logstash 2.0 or upgrade your Kafka broker to 0.9.

Related

kafka producer api 0.8.2.1 is not compatible with 1.0.1 broker?

I was using a Kafka producer (version 0.8.2.1) to write asynchronously to a Kafka broker (version 1.0.1).
My code is like below:
KafkaProducer<String, String> producer = new KafkaProducer<>(configs);
ProducerRecord<String, String> producerRecord =
        new ProducerRecord<>("topic", "key", "value");
producer.send(producerRecord, new Callback() {
    @Override
    public void onCompletion(RecordMetadata metadata, Exception exception) {
        if (metadata != null) {
            System.out.println(metadata.partition() + "|" + metadata.offset());
        }
    }
});
I found that the partition offset printed in my producer app's log in the "onCompletion" method was bigger than the broker's offset queried with the shell command "./kafka-run-class.sh kafka.tools.GetOffsetShell ".
My producer was configured with "acks=all".
For example, partition 0's offset is 30000 in the log, but 10000 when queried via the shell command.
Is this caused by a version compatibility problem?
The client APIs were rewritten around Kafka 0.9 such that offsets are stored in Kafka, not Zookeeper. It's not clear whether you used GetOffsetShell with the Zookeeper option or not.
Newer brokers are mostly backwards compatible with clients down to version 0.10.2, but you shouldn't expect clients older than that to work correctly with newer broker versions:
https://cwiki.apache.org/confluence/display/KAFKA/Compatibility+Matrix
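If you can run a 0.10.1+ client against this broker (which a 1.0.1 broker supports), a quick way to cross-check the broker's actual log-end offsets is KafkaConsumer#endOffsets instead of GetOffsetShell. A minimal sketch; the topic name and bootstrap address are placeholders:
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class EndOffsetCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("topic", 0); // partition 0 of the topic above
            // endOffsets() asks the broker directly for the log-end offset of each partition
            Map<TopicPartition, Long> end = consumer.endOffsets(Collections.singletonList(tp));
            System.out.println("log-end offset: " + end.get(tp));
        }
    }
}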

Flink Kafka connector to eventhub

I am using Apache Flink and trying to connect to an Azure Event Hub through its Apache Kafka protocol endpoint to receive messages. I managed to connect and receive messages, but I can't use the Flink feature "setStartFromTimestamp(...)" as described here (https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html#kafka-consumers-start-position-configuration).
When I try to fetch messages starting from a timestamp, the Kafka client reports that the message format on the broker side is older than 0.10.0.
Has anybody faced this?
Apache Kafka client version is 2.0.1
Apache Flink version is 1.7.2
UPDATE: I tried the Azure Event Hubs quickstart examples (https://github.com/Azure/azure-event-hubs-for-kafka/tree/master/quickstart/java); in the consumer package I added code to look up offsets by timestamp, and it returns null, as expected when the broker-side message format version is below Kafka 0.10.0:
// Look up every partition of the topic
List<PartitionInfo> partitionInfos = consumer.partitionsFor(TOPIC);
List<TopicPartition> topicPartitions = partitionInfos.stream()
        .map(pi -> new TopicPartition(pi.topic(), pi.partition()))
        .collect(Collectors.toList());
// Request, for each partition, the offset of the first record at or after timestamp 0
Map<TopicPartition, Long> topicPartitionToTimestampMap = topicPartitions.stream()
        .collect(Collectors.toMap(tp -> tp, tp -> 0L));
Map<TopicPartition, OffsetAndTimestamp> offsetAndTimestamp = consumer.offsetsForTimes(topicPartitionToTimestampMap);
System.out.println(offsetAndTimestamp);
Sorry we missed this. Kafka's offsetsForTimes() is now supported in Event Hubs (it was previously unsupported).
Feel free to open an issue against our GitHub in the future: https://github.com/Azure/azure-event-hubs-for-kafka
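Once offsetsForTimes() returns non-null entries, the usual next step is to seek each partition to the returned offset before polling. A sketch building on the variables in the snippet above, assuming the consumer uses manual assignment rather than subscribe():
// seek() only works on partitions assigned to this consumer
consumer.assign(topicPartitions);
for (Map.Entry<TopicPartition, OffsetAndTimestamp> entry : offsetAndTimestamp.entrySet()) {
    OffsetAndTimestamp oat = entry.getValue();
    if (oat != null) { // null when no record exists at/after the timestamp, or lookup is unsupported
        consumer.seek(entry.getKey(), oat.offset());
    }
}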

Logstash: Kafka Output Plugin - Issues with Bootstrap_Server

I am trying to use the logstash-output-kafka plugin in Logstash:
Logstash Configuration File
input {
  stdin {}
}
output {
  kafka {
    topic_id => "mytopic"
    bootstrap_server => "[Kafka Hostname]:9092"
  }
}
However, when executing this configuration, I am getting this error:
[ERROR][logstash.agent ] Failed to execute action
{:action=>LogStash::PipelineAction::Create/pipeline_id:main,
:exception=>"LogStash::ConfigurationError", :message=>"Something is wrong
with your configuration."
I tried changing "[Kafka Hostname]:9092" to "localhost:9092", but that also fails to connect to Kafka. Only when I remove the bootstrap_server setting entirely (so it defaults to localhost:9092) does the Kafka connection get established.
Is there something wrong with the bootstrap_server configuration of the kafka output plugin? I am using Logstash v6.4.1 and logstash-output-kafka v7.1.3.
I think there is a typo in your configuration. Instead of bootstrap_server you need to define bootstrap_servers.
input {
  stdin {}
}
output {
  kafka {
    topic_id => "mytopic"
    bootstrap_servers => "your_Kafka_host:9092"
  }
}
According to the Logstash docs:
bootstrap_servers
Value type is string
Default value is "localhost:9092"
This is for bootstrapping and the producer will only use it for getting metadata (topics, partitions and replicas). The socket connections for sending the actual data will be established based on the broker information returned in the metadata. The format is host1:port1,host2:port2, and the list can be a subset of brokers or a VIP pointing to a subset of brokers.
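For context, the output plugin wraps the Kafka Java producer, so bootstrap_servers maps directly onto the producer's bootstrap.servers property. A rough sketch of the equivalent plain-Java configuration; the hostnames are placeholders:
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

Properties props = new Properties();
// Equivalent of the plugin's bootstrap_servers option; used only to fetch cluster metadata
props.put("bootstrap.servers", "host1:9092,host2:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
KafkaProducer<String, String> producer = new KafkaProducer<>(props);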

How can Flink read the newest data from Kafka

In my scenario, Flink should read only the newest data from Kafka each time.
For example, the Kafka topic contains:
log1
log2
log3
When reading, only log3 is needed.
The plain Kafka consumer API can do this with seekToEnd().
Does FlinkKafkaConsumer have an equivalent?
Yes, Flink 1.3 added this function: setStartFromLatest().
FlinkKafkaConsumer09<Row> flinkKafkaConsumer09 = new FlinkKafkaConsumer09<>(
        properties.getProperty("topic"),
        new RowDeserializationSchema(properties.getProperty("separator"), resultType),
        properties);
flinkKafkaConsumer09.setStartFromLatest();
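For completeness, the FlinkKafkaConsumer base class exposes the other start positions too; a sketch of the available calls (pick exactly one before passing the consumer to env.addSource(...)):
flinkKafkaConsumer09.setStartFromLatest();       // ignore history, read only records arriving from now on
flinkKafkaConsumer09.setStartFromEarliest();     // reread the topic from the beginning
flinkKafkaConsumer09.setStartFromGroupOffsets(); // default: resume from committed group offsets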

Kafka won't connect with Logstash

I am trying to connect Kafka to Logstash using https://www.elastic.co/guide/en/logstash/current/plugins-inputs-kafka.html
I have Kafka and Zookeeper running (I've verified this by creating a producer and a consumer in Python), but Logstash won't detect Kafka.
I have installed the kafka input plugin; this is what my conf file looks like:
input {
  kafka {
    bootstrap_servers => "localhost:9092"
    topics => ["divolte-data"]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "divolte-data"
  }
}
Any help would be appreciated.
I guess the problem is the version. Since you're running the older 2.x stack (ES 2.3 and the matching Logstash release), your kafka input plugin cannot use bootstrap_servers, which was only introduced in version 5.0 of the plugin.
As per the doc, you should be using zk_connect instead of bootstrap_servers. Note that zk_connect points at Zookeeper (default port 2181) rather than at the Kafka broker:
kafka {
  zk_connect => "localhost:2181"
}