Kafka: Consumer API: Regression test fails if run in a group (sequentially) - apache-kafka

I have implemented a Kafka application using the consumer API, and I have 2 regression tests implemented with the streams API:
To test the happy path: the test produces data (into the input topic that the application is listening to), the application consumes it and produces data into the output topic, and the test then consumes that and validates it against the expected output data.
To test the error path: the behavior is the same as above, except that this time the application produces data into its error topic, and the test consumes from the application's error topic and validates against the expected error output.
My code and the regression-test code reside in the same project under the expected directory structure. Both times (for both tests) the data should have been picked up by the same listener on the application side.
The problem is:
When I execute the tests individually (manually), each test passes. However, if I execute them together but sequentially (for example: gradle clean build), only the first test passes. The 2nd test fails after the test-side consumer polls for data and, after some time, gives up without finding any.
Observation:
From debugging, it looks like the 1st time everything works perfectly (test-side and application-side producers and consumers). However, during the 2nd test it seems that the application-side consumer is not receiving any data (the test-side producer seems to be producing data, but I cannot say that for sure), and hence no data is produced into the error topic.
What I have tried so far:
After investigating, my understanding is that we are running into race conditions, and to avoid that I found suggestions like:
use @DirtiesContext(classMode = DirtiesContext.ClassMode.AFTER_EACH_TEST_METHOD)
tear down the broker after each test (please see the ".destroy()" on the brokers)
use different topic names for each test
I applied all of them and still could not recover from my issue.
I am providing the code here for perusal. Any insight is appreciated.
Code for 1st test (Testing error path):
@DirtiesContext(classMode = DirtiesContext.ClassMode.AFTER_EACH_TEST_METHOD)
@EmbeddedKafka(
partitions = 1,
controlledShutdown = false,
topics = {
AdapterStreamProperties.Constants.INPUT_TOPIC,
AdapterStreamProperties.Constants.ERROR_TOPIC
},
brokerProperties = {
"listeners=PLAINTEXT://localhost:9092",
"port=9092",
"log.dir=/tmp/data/logs",
"auto.create.topics.enable=true",
"delete.topic.enable=true"
}
)
public class AbstractIntegrationFailurePathTest {
private final int retryLimit = 0;
@Autowired
protected EmbeddedKafkaBroker embeddedFailurePathKafkaBroker;
//To produce data
@Autowired
protected KafkaTemplate<PreferredMediaMsgKey, SendEmailCmd> inputProducerTemplate;
//To read from output error
@Autowired
protected Consumer<PreferredMediaMsgKey, ErrorCmd> outputErrorConsumer;
//Service to execute notification-preference
@Autowired
protected AdapterStreamProperties projectProerties;
protected void subscribe(Consumer consumer, String topic, int attempt) {
try {
embeddedFailurePathKafkaBroker.consumeFromAnEmbeddedTopic(consumer, topic);
} catch (ComparisonFailure ex) {
if (attempt < retryLimit) {
subscribe(consumer, topic, attempt + 1);
}
}
}
}
.
@TestConfiguration
public class AdapterStreamFailurePathTestConfig {
@Autowired
private EmbeddedKafkaBroker embeddedKafkaBroker;
@Value("${spring.kafka.adapter.application-id}")
private String applicationId;
@Value("${spring.kafka.adapter.group-id}")
private String groupId;
//Producer of records that the program consumes
@Bean
public Map<String, Object> sendEmailCmdProducerConfigs() {
Map<String, Object> results = KafkaTestUtils.producerProps(embeddedKafkaBroker);
results.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
AdapterStreamProperties.Constants.KEY_SERDE.serializer().getClass());
results.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
AdapterStreamProperties.Constants.INPUT_VALUE_SERDE.serializer().getClass());
return results;
}
@Bean
public ProducerFactory<PreferredMediaMsgKey, SendEmailCmd> inputProducerFactory() {
return new DefaultKafkaProducerFactory<>(sendEmailCmdProducerConfigs());
}
@Bean
public KafkaTemplate<PreferredMediaMsgKey, SendEmailCmd> inputProducerTemplate() {
return new KafkaTemplate<>(inputProducerFactory());
}
//Consumer of the error output, generated by the program
@Bean
public Map<String, Object> outputErrorConsumerConfig() {
Map<String, Object> props = KafkaTestUtils.consumerProps(
applicationId, Boolean.TRUE.toString(), embeddedKafkaBroker);
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
AdapterStreamProperties.Constants.KEY_SERDE.deserializer().getClass()
.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
AdapterStreamProperties.Constants.ERROR_VALUE_SERDE.deserializer().getClass()
.getName());
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
return props;
}
@Bean
public Consumer<PreferredMediaMsgKey, ErrorCmd> outputErrorConsumer() {
DefaultKafkaConsumerFactory<PreferredMediaMsgKey, ErrorCmd> rpf =
new DefaultKafkaConsumerFactory<>(outputErrorConsumerConfig());
return rpf.createConsumer(groupId, "notification-failure");
}
}
.
@RunWith(SpringRunner.class)
@SpringBootTest(classes = AdapterStreamFailurePathTestConfig.class)
@ActiveProfiles(profiles = "errtest")
public class ErrorPath400Test extends AbstractIntegrationFailurePathTest {
@Autowired
private DataGenaratorForErrorPath400Test datagen;
@Mock
private AdapterHttpClient httpClient;
@Autowired
private ErroredEmailCmdDeserializer erroredEmailCmdDeserializer;
@Before
public void setup() throws InterruptedException {
Mockito.when(httpClient.callApi(Mockito.any()))
.thenReturn(
new GenericResponse(
400,
TestConstants.ERROR_MSG_TO_CHK));
Mockito.when(httpClient.createURI(Mockito.any(),Mockito.any(),Mockito.any())).thenCallRealMethod();
inputProducerTemplate.send(
projectProerties.getInputTopic(),
datagen.getKey(),
datagen.getEmailCmdToProduce());
System.out.println("producer: "+ projectProerties.getInputTopic());
subscribe(outputErrorConsumer , projectProerties.getErrorTopic(), 0);
}
@Test
public void testWithError() throws InterruptedException, InvalidProtocolBufferException, TextFormat.ParseException {
ConsumerRecords<PreferredMediaMsgKeyBuf.PreferredMediaMsgKey, ErrorCommandBuf.ErrorCmd> records;
List<ConsumerRecord<PreferredMediaMsgKeyBuf.PreferredMediaMsgKey, ErrorCommandBuf.ErrorCmd>> outputListOfErrors = new ArrayList<>();
int attempt = 0;
int expectedRecords = 1;
do {
records = KafkaTestUtils.getRecords(outputErrorConsumer);
records.forEach(outputListOfErrors::add);
attempt++;
} while (attempt < expectedRecords && outputListOfErrors.size() < expectedRecords);
//Verify the recipient event stream size
Assert.assertEquals(expectedRecords, outputListOfErrors.size());
//Validate output
}
@After
public void tearDown() {
outputErrorConsumer.close();
embeddedFailurePathKafkaBroker.destroy();
}
}
The 2nd test is almost the same in structure, although this time the test-side consumer consumes from the application-side output topic (instead of the error topic). And I named the consumers, broker, producer, and topics differently. Like:
@DirtiesContext(classMode = DirtiesContext.ClassMode.AFTER_EACH_TEST_METHOD)
@EmbeddedKafka(
partitions = 1,
controlledShutdown = false,
topics = {
AdapterStreamProperties.Constants.INPUT_TOPIC,
AdapterStreamProperties.Constants.OUTPUT_TOPIC
},
brokerProperties = {
"listeners=PLAINTEXT://localhost:9092",
"port=9092",
"log.dir=/tmp/data/logs",
"auto.create.topics.enable=true",
"delete.topic.enable=true"
}
)
public class AbstractIntegrationSuccessPathTest {
private final int retryLimit = 0;
@Autowired
protected EmbeddedKafkaBroker embeddedKafkaBroker;
//To produce data
@Autowired
protected KafkaTemplate<PreferredMediaMsgKey,SendEmailCmd> sendEmailCmdProducerTemplate;
//To read from output regular topic
@Autowired
protected Consumer<PreferredMediaMsgKey, NotifiedEmailCmd> ouputConsumer;
//Service to execute notification-preference
@Autowired
protected AdapterStreamProperties projectProerties;
protected void subscribe(Consumer consumer, String topic, int attempt) {
try {
embeddedKafkaBroker.consumeFromAnEmbeddedTopic(consumer, topic);
} catch (ComparisonFailure ex) {
if (attempt < retryLimit) {
subscribe(consumer, topic, attempt + 1);
}
}
}
}
Please let me know if I should provide any more information.

"port=9092"
Don't use a fixed port; leave that out and the embedded broker will use a random port; the consumer configs are set up in KafkaTestUtils to point to the random port.
You shouldn't need to dirty the context after each test method - use a different group.id for each test and a different topic.
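A minimal sketch of how those two suggestions might look together (the topic and group names here are hypothetical, not taken from the question):
@EmbeddedKafka(
    partitions = 1,
    topics = { "failure-path-input", "failure-path-error" }) // per-test topic names, no fixed "listeners"/"port"
public class AbstractIntegrationFailurePathTest {
    // ...
}
// and in the @TestConfiguration, a group.id unique to this test class:
Map<String, Object> props = KafkaTestUtils.consumerProps(
    "failure-path-group",          // unique per test class
    Boolean.TRUE.toString(),
    embeddedKafkaBroker);          // already points at the embedded broker's random port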

In my case the consumer was not closed properly. I had to do :
@After
public void tearDown() {
// shutdown hook to correctly close the streams application
Runtime.getRuntime().addShutdownHook(new Thread(ouputConsumer::close));
}
to resolve.

Related

Reactive program exiting early before sending all messages to Kafka

This is a subsequent question to a previous reactive kafka issue (Issue while sending the Flux of data to the reactive kafka).
I am trying to send some log records to Kafka using the reactive approach. Here is the reactive code sending messages using reactive kafka.
public class LogProducer {
private final KafkaSender<String, String> sender;
public LogProducer(String bootstrapServers) {
Map<String, Object> props = new HashMap<>();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
props.put(ProducerConfig.CLIENT_ID_CONFIG, "log-producer");
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
SenderOptions<String, String> senderOptions = SenderOptions.create(props);
sender = KafkaSender.create(senderOptions);
}
public void sendMessages(String topic, Flux<Logs.Data> records) throws InterruptedException {
AtomicInteger sentCount = new AtomicInteger(0);
sender.send(records
.map(record -> {
LogRecord lrec = record.getRecords().get(0);
String id = lrec.getId();
Thread.sleep(0, 5); // sleep for 5 ns
return SenderRecord.create(new ProducerRecord<>(topic, id,
lrec.toString()), id);
})).doOnNext(res -> sentCount.incrementAndGet()).then()
.doOnError(e -> {
log.error("[FAIL]: Send to the topic: '{}' failed. "
+ e, topic);
})
.doOnSuccess(s -> {
log.info("[SUCCESS]: {} records sent to the topic: '{}'", sentCount, topic);
})
.subscribe();
}
}
public class ExecuteQuery implements Runnable {
private LogProducer producer = new LogProducer("localhost:9092");
@Override
public void run() {
Flux<Logs.Data> records = ...
producer.sendMessages(kafkaTopic, records);
.....
.....
// processing related to the messages sent
}
}
So even when the Thread.sleep(0, 5); is there, sometimes it does not send all messages to Kafka and the program exits early, printing the SUCCESS message (log.info("[SUCCESS]: {} records sent to the topic: '{}'", sentCount, topic);). Is there a more concrete way to solve this problem, for example using some kind of callback, so that the thread will wait for all messages to be sent successfully?
I have a Spring console application and run ExecuteQuery through a scheduler at a fixed rate, something like this:
public class Main {
private ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
private ExecutorService executor = Executors.newFixedThreadPool(POOL_SIZE);
public static void main(String[] args) {
QueryScheduler scheduledQuery = new QueryScheduler();
scheduler.scheduleAtFixedRate(scheduledQuery, 0, 5, TimeUnit.MINUTES);
}
class QueryScheduler implements Runnable {
@Override
public void run() {
// preprocessing related to time
executor.execute(new ExecuteQuery());
// postprocessing related to time
}
}
}
Your Thread.sleep(0, 5); // sleep for 5 ns does nothing to keep the main thread blocked, so the JVM exits whenever it is ready, and your ExecuteQuery may not have finished its job yet.
It is not clear how you start your application, but I recommend blocking in the main thread itself, to be precise in the public static void main(String[] args) method implementation.
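If the goal is to wait until every record has actually been acknowledged, one option is to block on completion of the send pipeline instead of calling subscribe(). This is only a sketch, not the poster's code; it reuses the sender, log, Logs.Data and LogRecord types from the question:
public void sendMessagesAndWait(String topic, Flux<Logs.Data> records) {
    sender.send(records
            .map(record -> {
                LogRecord lrec = record.getRecords().get(0);
                return SenderRecord.create(new ProducerRecord<>(topic, lrec.getId(), lrec.toString()), lrec.getId());
            }))
        .doOnError(e -> log.error("[FAIL]: Send to the topic: '{}' failed.", topic, e))
        .then()                                   // completes only after every SenderRecord is acknowledged
        .block(java.time.Duration.ofMinutes(5));  // block the calling thread instead of returning immediately
}
With block(), the executor thread that runs ExecuteQuery does not move on to the post-processing until the sends have completed (or an error or the timeout is raised).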

How to commit the offsets when using KafkaItemReader in spring batch job, once all the messages are processed and written to the .dat file?

I have developed a Spring Batch job which reads from a Kafka topic using the KafkaItemReader class. I want to commit the offsets only when the messages read in the defined chunk are processed and written successfully to an output .dat file.
@Bean
public Job kafkaEventReformatjob(
@Qualifier("MaintStep") Step MainStep,
@Qualifier("moveFileToFolder") Step moveFileToFolder,
@Qualifier("compressFile") Step compressFile,
JobExecutionListener listener)
{
return jobBuilderFactory.get("kafkaEventReformatJob")
.listener(listener)
.incrementer(new RunIdIncrementer())
.flow(MainStep)
.next(moveFileToFolder)
.next(compressFile)
.end()
.build();
}
@Bean
Step MainStep(
ItemProcessor<IncomingRecord, List<Record>> flatFileItemProcessor,
ItemWriter<List<Record>> flatFileWriter)
{
return stepBuilderFactory.get("mainStep")
.<InputRecord, List<Record>> chunk(5000)
.reader(kafkaItemReader())
.processor(flatFileItemProcessor)
.writer(writer())
.listener(basicStepListener)
.build();
}
//Reader reads all the messages from the Kafka topic and returns them as IncomingRecord objects.
@Bean
KafkaItemReader<String, IncomingRecord> kafkaItemReader() {
Properties props = new Properties();
props.putAll(this.properties.buildConsumerProperties());
List<Integer> partitions = new ArrayList<>();
partitions.add(0);
partitions.add(1);
return new KafkaItemReaderBuilder<String, IncomingRecord>()
.partitions(partitions)
.consumerProperties(props)
.name("records")
.saveState(true)
.topic(topic)
.pollTimeout(Duration.ofSeconds(40L))
.build();
}
@Bean
public ItemWriter<List<Record>> writer() {
ListUnpackingItemWriter<Record> listUnpackingItemWriter = new ListUnpackingItemWriter<>();
listUnpackingItemWriter.setDelegate(flatWriter());
return listUnpackingItemWriter;
}
public ItemWriter<Record> flatWriter() {
FlatFileItemWriter<Record> fileWriter = new FlatFileItemWriter<>();
String tempFileName = "abc";
LOGGER.info("Output File name " + tempFileName + " is in working directory ");
String workingDir = service.getWorkingDir().toAbsolutePath().toString();
Path outputFile = Paths.get(workingDir, tempFileName);
fileWriter.setName("fileWriter");
fileWriter.setResource(new FileSystemResource(outputFile.toString()));
fileWriter.setLineAggregator(lineAggregator());
fileWriter.setForceSync(true);
fileWriter.setFooterCallback(customFooterCallback());
fileWriter.close();
LOGGER.info("Successfully created the file writer");
return fileWriter;
}
@StepScope
@Bean
public TransformProcessor processor() {
return new TransformProcessor();
}
==============================================================================
Writer Class
@BeforeStep
public void beforeStep(StepExecution stepExecution) {
this.stepExecution = stepExecution;
}
@AfterStep
public void afterStep(StepExecution stepExecution) {
this.stepExecution.setWriteCount(count);
}
@Override
public void write(final List<? extends List<Record>> lists) throws Exception {
List<Record> consolidatedList = new ArrayList<>();
for (List<Record> list : lists) {
if (null != list && !list.isEmpty())
consolidatedList.addAll(list);
}
delegate.write(consolidatedList);
count += consolidatedList.size(); // to count Trailer record count
}
===============================================================
Item Processor
@Override
public List<Record> process(IncomingRecord record) {
List<Record> recordList = new ArrayList<>();
if (null != record.getEventName() /* and a few other conditions inside this section */) {
// setting values of the Record class by extracting them from the IncomingRecord,
// then adding the valid records matching the condition to recordList
return recordList;
} else {
return null;
}
}
Synchronizing a read operation and a write operation between two transactional resources (a queue and a database for instance)
is possible by using a JTA transaction manager that coordinates both transaction managers (2PC protocol).
However, this approach is not possible if one of the resources is not transactional (like the majority of file systems). So unless you use
a transactional file system and a JTA transaction manager that coordinates a Kafka transaction manager and a file-system transaction manager,
you need another approach, like the Compensating Transaction pattern. In your case, the "undo" operation (compensating action) would be rewinding the offset to where it was before the failed chunk.
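As a standalone illustration of that compensating action (a sketch outside Spring Batch, assuming an already-subscribed KafkaConsumer<String, String> consumer and a hypothetical writeChunkToFile method), the idea is to remember the positions before a chunk and seek back to them if the write fails:
Map<TopicPartition, Long> positionsBeforeChunk = new HashMap<>();
for (TopicPartition tp : consumer.assignment()) {
    positionsBeforeChunk.put(tp, consumer.position(tp));
}
ConsumerRecords<String, String> chunk = consumer.poll(Duration.ofSeconds(5));
try {
    writeChunkToFile(chunk);                        // hypothetical write step
    consumer.commitSync();                          // commit only after the chunk was written successfully
} catch (IOException e) {
    positionsBeforeChunk.forEach(consumer::seek);   // "undo": rewind so the chunk is re-read
}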

Unable to send message with KafkaNull as Value

I am building a Kafka application using log compaction on a topic, but I am not able to send a tombstone value (KafkaNull).
I have tried using the default configuration for a serializer, and when that did not work I used the suggested changes from "Publish null/tombstone message with raw headers" to set application.properties to:
spring.cloud.stream.output.producer.useNativeEncoding=true
spring.cloud.stream.kafka.binder.configuration.value.serializer=org.springframework.kafka.support.serializer.JsonSerializer
The code I have to send a message to a stream is
this.stockTopics.compactedStocks().send(MessageBuilder
.withPayload(KafkaNull.INSTANCE)
.setHeader(KafkaHeaders.MESSAGE_KEY,company.getBytes())
.build())
this.stockTopics.compactedStocks() returns a message stream that I can send messages to.
Every time I try and send that message with a KafkaNull instance as a payload I get the error Failed to convert message: 'GenericMessage [payload=org.springframework.kafka.support.KafkaNull#1c2d8163, headers={id=f81857e7-fbd0-56f5-8418-6a1944e7f2b1, kafka_messageKey=[B#36ec022a, contentType=application/json, timestamp=1547827957485}]' to outbound message.
I expect the message to simply be sent to the consumer with a null value but obviously it errors.
I opened a GitHub issue for this.
EDIT
Workaround - this works...
@SpringBootApplication
@EnableBinding(Source.class)
public class So54257687Application {
public static void main(String[] args) {
SpringApplication.run(So54257687Application.class, args);
}
@Bean
public ApplicationRunner runner(MessageChannel output) {
return args -> output.send(new GenericMessage<>(KafkaNull.INSTANCE));
}
@KafkaListener(id = "foo", topics = "output")
public void listen(@Payload(required = false) byte[] in) {
System.out.println(in);
}
@Bean
@StreamMessageConverter
public MessageConverter kafkaNullConverter() {
class KafkaNullConverter extends AbstractMessageConverter {
KafkaNullConverter() {
super(Collections.emptyList());
}
@Override
protected boolean supports(Class<?> clazz) {
return KafkaNull.class.equals(clazz);
}
@Override
protected Object convertFromInternal(Message<?> message, Class<?> targetClass, Object conversionHint) {
return message.getPayload();
}
@Override
protected Object convertToInternal(Object payload, MessageHeaders headers, Object conversionHint) {
return payload;
}
}
return new KafkaNullConverter();
}
}

Kafka + Spring Batch Listener Flush Batch

Using Kafka Broker: 1.0.1
spring-kafka: 2.1.6.RELEASE
I'm using a batched consumer with the following settings:
// Other settings are not shown..
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100");
I use spring listener in the following way:
@KafkaListener(topics = "${topics}", groupId = "${consumer.group.id}")
public void receive(final List<String> data,
@Header(KafkaHeaders.RECEIVED_PARTITION_ID) final List<Integer> partitions,
@Header(KafkaHeaders.RECEIVED_TOPIC) Set<String> topics,
@Header(KafkaHeaders.OFFSET) final List<Long> offsets) { // ......code... }
I always find that a few messages remain in the batch and are not received by my listener. It appears that if the remaining messages are fewer than the batch size, they aren't consumed (maybe they stay in memory and are never published to my listener). Is there any setting to auto-flush the batch after a time interval so as to avoid the messages not being flushed?
What's the best way to deal with this kind of situation with a batch consumer?
I just ran a test without any problems...
@SpringBootApplication
public class So50370851Application {
public static void main(String[] args) {
SpringApplication.run(So50370851Application.class, args);
}
@Bean
public ApplicationRunner runner(KafkaTemplate<String, String> template) {
return args -> {
for (int i = 0; i < 230; i++) {
template.send("so50370851", "foo" + i);
}
};
}
@KafkaListener(id = "foo", topics = "so50370851")
public void listen(List<String> in) {
System.out.println(in.size());
}
@Bean
public NewTopic topic() {
return new NewTopic("so50370851", 1, (short) 1);
}
}
and
spring.kafka.consumer.auto-offset-reset=earliest
spring.kafka.consumer.enable-auto-commit=false
spring.kafka.consumer.max-poll-records=100
spring.kafka.listener.type=batch
and
100
100
30
Also, the debug logs show after a while that it is polling and fetching 0 records (and this gets repeated over and over).
That implies the problem is on the sending side.
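If the producer side is suspect, one way to make send failures visible (an assumption about where to look, not part of the original answer) is to wait on each send's future and flush before exiting, e.g. as a tweak to the runner above using spring-kafka's KafkaTemplate:
@Bean
public ApplicationRunner runner(KafkaTemplate<String, String> template) {
    return args -> {
        for (int i = 0; i < 230; i++) {
            template.send("so50370851", "foo" + i).get(10, TimeUnit.SECONDS); // fail fast if a send is not acknowledged
        }
        template.flush(); // push out anything still buffered before the producing application exits
    };
}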

Why does using the Kryo serialization framework in Apache Storm overwrite data when the bolt gets the values?

Most developers probably use Avro as the serialization framework in their Kafka and Apache Storm schemes. But I need to handle more complex data, so I chose the Kryo serialization framework and successfully integrated it into our project, which runs on Kafka and Apache Storm. Going further, though, I ran into a strange situation.
I sent 5 messages to Kafka; the Storm job can read the 5 messages and deserialize them successfully. But the next bolt gets the wrong data values: it prints out the same value as the last message. Then I added a print-out right after the deserialization code completes, and there it actually prints 5 different messages. Why can't the next bolt get the values? See my code below:
KryoScheme.java
public abstract class KryoScheme<T> implements Scheme {
private static final long serialVersionUID = 6923985190833960706L;
private static final Logger logger = LoggerFactory.getLogger(KryoScheme.class);
private Class<T> clazz;
private Serializer<T> serializer;
public KryoScheme(Class<T> clazz, Serializer<T> serializer) {
this.clazz = clazz;
this.serializer = serializer;
}
@Override
public List<Object> deserialize(byte[] buffer) {
Kryo kryo = new Kryo();
kryo.register(clazz, serializer);
T scheme = null;
try {
scheme = kryo.readObject(new Input(new ByteArrayInputStream(buffer)), this.clazz);
logger.info("{}", scheme);
} catch (Exception e) {
String errMsg = String.format("Kryo Scheme failed to deserialize data from Kafka to %s. Raw: %s",
clazz.getName(),
new String(buffer));
logger.error(errMsg, e);
throw new FailedException(errMsg, e);
}
return new Values(scheme);
}}
PrintFunction.java
public class PrintFunction extends BaseFunction {
private static final Logger logger = LoggerFactory.getLogger(PrintFunction.class);
@Override
public void execute(TridentTuple tuple, TridentCollector collector) {
List<Object> data = tuple.getValues();
if (data != null) {
logger.info("Scheme data size: {}", data.size());
for (Object value : data) {
PrintOut out = (PrintOut) value;
logger.info("{}.{}--value: {}",
Thread.currentThread().getName(),
Thread.currentThread().getId(),
out.toString());
collector.emit(new Values(out));
}
}
}}
StormLocalTopology.java
public class StormLocalTopology {
public static void main(String[] args) {
........
BrokerHosts zk = new ZkHosts("xxxxxx");
Config stormConf = new Config();
stormConf.put(Config.TOPOLOGY_DEBUG, false);
stormConf.put(Config.TOPOLOGY_TRIDENT_BATCH_EMIT_INTERVAL_MILLIS, 1000 * 5);
stormConf.put(Config.TOPOLOGY_WORKERS, 1);
stormConf.put(Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS, 5);
stormConf.put(Config.TOPOLOGY_TASKS, 1);
TridentKafkaConfig actSpoutConf = new TridentKafkaConfig(zk, topic);
actSpoutConf.fetchSizeBytes = 5 * 1024 * 1024 ;
actSpoutConf.bufferSizeBytes = 5 * 1024 * 1024 ;
actSpoutConf.scheme = new SchemeAsMultiScheme(scheme);
actSpoutConf.startOffsetTime = kafka.api.OffsetRequest.LatestTime();
TridentTopology topology = new TridentTopology();
TransactionalTridentKafkaSpout actSpout = new TransactionalTridentKafkaSpout(actSpoutConf);
topology.newStream(topic, actSpout).parallelismHint(4).shuffle()
.each(new Fields("act"), new PrintFunction(), new Fields());
LocalCluster cluster = new LocalCluster();
cluster.submitTopology(topic+"Topology", stormConf, topology.build());
}}
There is also another problem: why can the Kryo scheme only read one message buffer? Is there another way to get multiple message buffers so that data can be sent in batches to the next bolt?
Also, if I send 1 message the full flow seems to succeed.
Sending 2 messages then goes wrong. The printed output looks like this:
56157 [Thread-18-spout0] INFO s.s.a.s.s.c.KryoScheme - 2016-02-05T17:20:48.122+0800,T6mdfEW#N5pEtNBW
56160 [Thread-20-b-0] INFO s.s.a.s.s.PrintFunction - Scheme data size: 1
56160 [Thread-18-spout0] INFO s.s.a.s.s.c.KryoScheme - 2016-02-05T17:20:48.282+0800,T(o2KnFxtGB0Tlp8
56161 [Thread-20-b-0] INFO s.s.a.s.s.PrintFunction - Thread-20-b-0.99--value: 2016-02-05T17:20:48.282+0800,T(o2KnFxtGB0Tlp8
56162 [Thread-20-b-0] INFO s.s.a.s.s.PrintFunction - Scheme data size: 1
56162 [Thread-20-b-0] INFO s.s.a.s.s.PrintFunction - Thread-20-b-0.99--value: 2016-02-05T17:20:48.282+0800,T(o2KnFxtGB0Tlp8
I'm sorry, this was my mistake. I just found a bug in my Kryo serializer class: it kept an instance-level field, so it could be overwritten by another thread in a multi-threaded environment. Once that shared state is removed and the value is kept local to each call, the code runs well.
See the reference code below:
public class KryoSerializer<T extends BasicEvent> extends Serializer<T> implements Serializable {
private static final long serialVersionUID = -4684340809824908270L;
// This was the wrong part: a shared instance field that concurrent threads overwrite
// private T event;
public KryoSerializer(T event) {
// this.event = event; // no longer stored as shared state
}
@Override
public void write(Kryo kryo, Output output, T event) {
event.write(output);
}
@Override
public T read(Kryo kryo, Input input, Class<T> type) {
T event = kryo.newInstance(type); // a fresh instance per call instead of reusing a field
event.read(input);
return event;
}
}