I'm trying to ensure that a consumer on a queue (with message grouping) only receives one message at a time from each queue it's handling, until the consumer acknowledges that message.
For a test I've set up ActiveMQ Artemis with 3 consumers on a wildcard EXAMPLE.*, and one publisher posting 10 messages to each of 5 queues: EXAMPLE.1 - EXAMPLE.5. What I'm seeing is that each of the consumers receives messages from the queues immediately. I've tried setting the consumer window size to 0, as I thought that would limit delivery to one message at a time from each queue, but that doesn't seem to work.
Have I misunderstood that setting? If so, are there any other settings I should be looking at to help me get this working?
The particular use case I'm trying to achieve is that I'll possibly have many queues and a couple of consumers. And it's important that messages in each of the queues are handled sequentially, but all queues can be handled in parallel.
Thanks!
The message grouping supported by ActiveMQ Artemis ensures that all messages with the same group id are processed serially by the same consumer. If all messages in a queue share the same group id, they will all be processed serially by a single consumer.
However, grouped messages can impact concurrent processing. For example, if a chunk of 100 messages belonging to the groups of one client sits at the head of a queue, followed by messages belonging to the groups of other clients, all of the first 100 messages must be sent to that client (which consumes its grouped messages serially) before any other messages can be consumed.
The consumer window size only affects how many messages the consumer buffers from the server. If the consumer window size is set to 0, the consumer doesn't buffer any messages, so messages can be delivered to another consumer.
In your case the producer could use the queue name to set the group id, so the consumers of EXAMPLE.* would process the messages of each queue sequentially. However, I would not set the consumer window size to 0, because it could limit the consumers' parallelism.
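For illustration, a minimal sketch of that producer-side idea (the class and helper names are mine, and it assumes the same Artemis JMS client classes used in the demo below): the queue name itself is used as the JMSXGroupID, so every message published to a given queue belongs to one group and is therefore handled serially by one consumer.
import javax.jms.JMSException;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;

import org.apache.activemq.artemis.api.jms.ActiveMQJMSClient;

public class GroupedSender {

    // Tag each message with its destination name as the group id, so all
    // messages on that queue are delivered serially to a single consumer.
    static void sendGrouped(Session session, MessageProducer producer,
                            String queueName, String body) throws JMSException {
        TextMessage message = session.createTextMessage(body);
        message.setStringProperty("JMSXGroupID", queueName); // group id == queue name
        producer.send(ActiveMQJMSClient.createTopic(queueName), message);
    }
}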
The following demo of message grouping in a topic hierarchy, with the consumer window size set to 0, shows the messages being consumed sequentially from each queue and limited parallelism among the consumers.
public static void main(final String[] args) throws Exception {
    Connection connection = null;
    try {
        ActiveMQConnectionFactory cf = new ActiveMQConnectionFactory();
        cf.setConsumerWindowSize(0);
        connection = cf.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Topic topicSubscribe = ActiveMQJMSClient.createTopic("EXAMPLE.*");
        // Three shared consumers on the EXAMPLE.* wildcard
        MessageConsumer[] messageConsumers = new MessageConsumer[] {
            session.createSharedConsumer(topicSubscribe, "EXAMPLE"),
            session.createSharedConsumer(topicSubscribe, "EXAMPLE"),
            session.createSharedConsumer(topicSubscribe, "EXAMPLE")
        };
        MessageProducer producer = session.createProducer(null);
        // Send 10 messages to each of the 5 EXAMPLE.<t> destinations, one group per destination
        for (int i = 0; i < 10; i++) {
            for (int t = 0; t < 5; t++) {
                TextMessage groupMessage = session.createTextMessage("Group-" + t + " message " + i);
                groupMessage.setStringProperty("JMSXGroupID", "Group-" + t);
                producer.send(ActiveMQJMSClient.createTopic("EXAMPLE." + t), groupMessage);
            }
        }
        connection.start();
        // Drain whatever has been delivered to each consumer in turn
        TextMessage messageReceived;
        for (int i = 0; i < 100; i++) {
            for (int c = 0; c < 3; c++) {
                while ((messageReceived = (TextMessage) messageConsumers[c].receive(500)) != null) {
                    System.out.println("Consumer" + c + " received message: " + messageReceived.getText());
                }
                System.out.println("Consumer" + c + " received message: null");
            }
        }
    } finally {
        // Be sure to close our resources!
        if (connection != null) {
            connection.close();
        }
    }
}
The output of the demo is:
Consumer0 received message: Group-0 message 0
Consumer0 received message: Group-3 message 0
Consumer0 received message: Group-4 message 0
Consumer0 received message: Group-0 message 1
Consumer0 received message: null
Consumer1 received message: Group-1 message 0
Consumer1 received message: Group-1 message 1
Consumer1 received message: null
Consumer2 received message: Group-2 message 0
Consumer2 received message: Group-2 message 1
Consumer2 received message: null
Consumer0 received message: Group-3 message 1
Consumer0 received message: Group-4 message 1
Consumer0 received message: Group-0 message 2
Consumer0 received message: Group-3 message 2
Consumer0 received message: Group-4 message 2
Consumer0 received message: Group-0 message 3
Consumer0 received message: null
Consumer1 received message: Group-1 message 2
Consumer1 received message: Group-1 message 3
Consumer1 received message: null
Consumer2 received message: Group-2 message 2
Consumer2 received message: Group-2 message 3
Consumer2 received message: null
...
Related
I have a topic first_topic with 3 partitions.
Here is my code. I send 55 messages to a consumer (which is running in cmd), and the code below logs the partition to which each message was sent. Every time I launch the code, all the messages go to one partition only (which seems to be picked at random); it may be partition 0, 1 or 2.
Why doesn't round-robin work here? I do not specify a key, so I expected round-robin to be used.
Logger logger = LoggerFactory.getLogger(Producer.class);

Properties properties = new Properties();
properties.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092");
properties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
properties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

KafkaProducer<String, String> producer = new KafkaProducer<>(properties);

for (int i = 0; i < 55; i++) {
    producer.send(new ProducerRecord<>("first_topic", "Hello world partitionCheck " + i), (recordMetadata, e) -> {
        // executes every time a record is sent or an exception is thrown
        if (e == null) {
            // record was successfully sent
            logger.info("metaData: " +
                    "topic " + recordMetadata.topic() +
                    " Offset " + recordMetadata.offset() +
                    " TimeStamp " + recordMetadata.timestamp() +
                    " Partition " + recordMetadata.partition());
        } else {
            logger.error(e.toString());
        }
    });
}
producer.flush();
producer.close();
In theory, if you do not specify a custom partitioner, the default partitioner is used, which follows these rules:
If a partition is specified in the record, use that partition when publishing.
If no partition is specified but a key is present, choose a partition based on a hash of the key.
If no partition or key is present, choose a partition in a round-robin fashion.
Can you confirm how you are checking this? You said: "Every time I launch the code all the messages go to one partition only (that is picked randomly), it may be partition 0, 1 or 2."
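As a rough way to check the distribution (the topic name first_topic and the broker address are taken from your snippet; the class name and everything else are illustrative), you could send a few records with and without keys and log the partition reported in the callback metadata:
import java.util.Properties;

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class PartitionCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092");
        props.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            Callback logPartition = (metadata, e) -> {
                if (e == null) {
                    System.out.println("partition " + metadata.partition() + ", offset " + metadata.offset());
                }
            };
            for (int i = 0; i < 10; i++) {
                // Keyed record: the partition is chosen by hashing the key (rule 2)
                producer.send(new ProducerRecord<>("first_topic", "key-" + i, "keyed " + i), logPartition);
                // Keyless record: the partition is chosen by the default partitioner (rule 3)
                producer.send(new ProducerRecord<>("first_topic", "keyless " + i), logPartition);
            }
            producer.flush();
        }
    }
}
Comparing the two outputs makes it easy to see whether the keyless sends are actually being spread across partitions.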
I am facing difficulty with KafkaConsumer.poll(Duration timeout): it runs indefinitely and never returns from the method. I understand this could be related to the connection, and I have seen it behave a bit inconsistently at times. How do I handle this if poll stops responding? Given below is the snippet from KafkaConsumer.poll():
public ConsumerRecords<K, V> poll(final Duration timeout) {
    return poll(time.timer(timeout), true);
}
and I am calling the above from here:
Duration timeout = Duration.ofSeconds(30);

while (true) {
    final ConsumerRecords<recordID, topicName> records = consumer.poll(timeout);
    System.out.println("record count is " + records.count());
}
I am getting the below error:
org.apache.kafka.common.errors.SerializationException: Error
deserializing key/value for partition at offset 2. If
needed, please seek past the record to continue consumption.
I stumbled upon some useful information while trying to fix the problem described above. I will provide the piece of code that should be able to handle this, but first it is important to understand what causes it.
When producing or consuming messages with Apache Kafka, we need a schema for that message or data; in my case an Avro schema. If a message produced to Kafka conflicts with that schema, consumption will be affected.
Add the code below to your consumer, in the method where it consumes records.
Remember to import the packages below:
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.errors.SerializationException;
try {
    while (true) {
        ConsumerRecords<String, GenericRecord> records = null;
        try {
            records = consumer.poll(10000);
        } catch (SerializationException e) {
            // Parse the topic, partition and offset of the bad record out of the exception message
            String s = e.getMessage().split("Error deserializing key/value for partition ")[1]
                    .split(". If needed, please seek past the record to continue consumption.")[0];
            String topic = s.split("-")[0];
            int offset = Integer.valueOf(s.split("offset ")[1]);
            int partition = Integer.valueOf(s.split("-")[1].split(" at")[0]);

            TopicPartition topicPartition = new TopicPartition(topic, partition);
            // log.info("Skipping " + topic + "-" + partition + " offset " + offset);

            // Seek past the record that could not be deserialized, then poll again
            consumer.seek(topicPartition, offset + 1);
            continue;
        }

        for (ConsumerRecord<String, GenericRecord> record : records) {
            System.out.printf("value = %s \n", record.value());
        }
    }
} finally {
    consumer.close();
}
I ran into this while setting up a test environment.
Running the following command on the broker printed out the stored records as one would expect:
bin/kafka-console-consumer.sh --bootstrap-server="localhost:9092" --topic="foo" --from-beginning
It turned out that the Kafka server was misconfigured. To connect from an external
IP address, listeners must have a valid value in kafka/config/server.properties, e.g.
# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
# FORMAT:
# listeners = listener_name://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
listeners=PLAINTEXT://:9092
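Once listeners points at an address reachable from outside, a client on another machine should be able to connect in the usual way. A minimal consumer sketch, assuming kafka-host is a placeholder for the broker's externally reachable address and reusing the topic foo from the console-consumer command above:
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ExternalConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Must point at a host/port the broker actually listens on (see listeners above)
        props.setProperty(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka-host:9092");
        props.setProperty(ConsumerConfig.GROUP_ID_CONFIG, "external-test");
        props.setProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        props.setProperty(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.setProperty(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("foo"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            for (ConsumerRecord<String, String> record : records) {
                System.out.println(record.value());
            }
        }
    }
}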
I have 1000 messages in my topic. I need to write a Kafka consumer in Scala to fetch just those 1000 messages, so that I can start processing them.
var recordList = new ListBuffer[ConsumerRecord[String, String]]()

while (true) {
  val records: ConsumerRecords[String, String] = consumer.poll(100)
  records.asScala.foreach(record => recordList += record)
  recordList.toList
}
But what happens is the loop never ends, and I get the messages below in the log.
Fetch READ_UNCOMMITTED at offset 1000 for partition test-0 returned fetch data (error=NONE, highWaterMark=1000, lastStableOffset = -1, logStartOffset = 0, abortedTransactions = null, recordsSizeInBytes=10486)
Added READ_UNCOMMITTED fetch request for partition test-0 at offset 1000 to node localhost:9092 (id: 0 rack: null)
Sending READ_UNCOMMITTED fetch for partitions [test-0] to broker localhost:9092 (id: 0 rack: null)
Why don't you quit when records.size() is zero?
Another way would be to close the consumer once records.size() is zero.
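A rough Java sketch of that idea (the topic name test is taken from the log above; the group id and timeout are placeholders, and a Scala version would look much the same): keep polling until a poll comes back empty, then process whatever was collected.
import java.time.Duration;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class DrainTopic {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.setProperty(ConsumerConfig.GROUP_ID_CONFIG, "drain-demo");
        props.setProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        props.setProperty(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.setProperty(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        List<ConsumerRecord<String, String>> buffered = new ArrayList<>();
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                if (records.isEmpty()) {
                    break; // nothing left to fetch, stop polling
                }
                records.forEach(buffered::add);
            }
        }
        System.out.println("Fetched " + buffered.size() + " records, ready to process");
    }
}
Note that an empty poll can also simply mean nothing arrived within the timeout, so in practice you may want to require a couple of consecutive empty polls before giving up.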
I am new to Kafka. I want to know why CommitAsync() does not seem to make any difference in the AdvancedConsumer example.
That is, the result is the same whether CommitAsync is enabled or disabled, in terms of remembering the committed offset.
Test 1: Disable CommitAsync()
1 Disable the CommitAsync() call in Run_Consume() like below:
public static void Run_Consume(string brokerList, List<string> topics)
{
    ...
            //if (msg.Offset % 5 == 0)
            //{
            //    Console.WriteLine($"Committing offset");
            //    var committedOffsets = consumer.CommitAsync(msg).Result;
            //    Console.WriteLine($"Committed offset: {committedOffsets}");
            //}
        }
    }
}
2 When the Consumer starts, it reads all messages from offset 0 to the last offset N that were already read in the previous session
3 Post new messages via the Producer, and the Consumer displays the new message with offset N+1
4 Kill the Consumer
5 Start the Consumer again; it displays all messages from offset 0 to N+1
Test 2: Enable CommitAsync()
Step 1 Enable the CommitAsync() call in Run_Consume()
Then follow the same steps 2 - 5 as above.
In both tests, the newly started Consumer still remembers the last offset. So why is CommitAsync() still called?
Please note the similar issue below:
https://github.com/confluentinc/confluent-kafka-dotnet/issues/470#issuecomment-375634009
I'm new to Kafka and working on a prototype to connect a proprietary streaming service into Kafka.
I'm looking to get the key of the last message sent on a topic, as our in-house stream consumer needs to log on with the ID of the last message it received when connecting.
Is it possible to do this using either the KafkaProducer or a KafkaConsumer?
I've attempted to do the following using a Consumer, but when also running the console consumer I see messages replayed.
// Poll so we know we're connected
consumer.poll(100);
// Get the assigned partitions
Set<TopicPartition> assignedPartitions = consumer.assignment();
// Seek to the end of those partitions
consumer.seekToEnd(assignedPartitions);
for (TopicPartition partition : assignedPartitions) {
    final long offset = consumer.committed(partition).offset();
    // Seek to the previous message
    consumer.seek(partition, offset - 1);
}
// Now get the last message
ConsumerRecords<String, String> records = consumer.poll(100);
for (ConsumerRecord<String, String> record : records) {
    lastKey = record.key();
}
consumer.close();
Is this expected behaviour or am I on the wrong path?
The problem is the line final long offset = consumer.committed(partition).offset(). As the API docs state, the committed method returns the last committed offset for the given partition, i.e. the last offset your consumer told the Kafka server it had already read.
So you will definitely get messages replayed, because you always start reading from that committed offset.
I think I only have to remove the first for block.
Check the record count and get the last message:
// Poll so we know we're connected
consumer.poll(100);
// Get the assigned partitions
Set<TopicPartition> assignedPartitions = consumer.assignment();
// Seek to the end of those partitions
consumer.seekToEnd(assignedPartitions);
for (TopicPartition partition : assignedPartitions) {
    final long offset = consumer.committed(partition).offset();
    // Seek to the previous message
    consumer.seek(partition, offset - 1);
}
// Now get the last message
ConsumerRecords<String, String> records = consumer.poll(100);
int size = records.count();
int index = 0;
for (ConsumerRecord<String, String> record : records) {
    index = index + 1;
    if (index == size) {
        String value = record.value();
        System.out.println("Last Message = " + value);
    }
}
consumer.close();
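Not taken from the answers above, but as a hedged sketch of an alternative: seek relative to consumer.position() after seekToEnd (instead of the committed offset), so that only the final record of each assigned partition is fetched. It assumes the consumer is already subscribed as in the question, and, like the original code, it glosses over the fact that with several partitions each one has its own "last" record.
import java.util.Set;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class LastMessageKey {

    static String lastKey(KafkaConsumer<String, String> consumer) {
        // Poll once so the partition assignment exists
        consumer.poll(100);
        Set<TopicPartition> assignedPartitions = consumer.assignment();
        // Jump to the end, then step back one record on each partition
        consumer.seekToEnd(assignedPartitions);
        for (TopicPartition partition : assignedPartitions) {
            long endOffset = consumer.position(partition); // offset of the next record to be written
            if (endOffset > 0) {
                consumer.seek(partition, endOffset - 1);
            }
        }
        String lastKey = null;
        for (ConsumerRecord<String, String> record : consumer.poll(1000)) {
            lastKey = record.key();
        }
        return lastKey;
    }
}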