Mongock does not run ChangeUnit in Kotlin project - MongoDB

I have a Java + Maven project - https://github.com/petersuchy/mongock-test-java - based on the Mongock reactive example (https://github.com/mongock/mongock-examples/tree/master/mongodb/springboot-reactive),
and everything works well.
I tried to migrate that project to Kotlin + Gradle - https://github.com/petersuchy/mongock-test-kotlin
I am able to run it, but my ChangeUnit is ignored. Mongock itself is set up properly, because in the end the 2 collections are created - mongockLock and mongockChangeLog.
I tried to get rid of the @Value annotations in MongockConfig and MongoClientConfig, but there was no change in behaviour.
Can you please point out why this is happening? I think it could be related to Reflections, because that is the only difference in the logs.
Kotlin:
2023-02-12T00:49:58.455+01:00 INFO 80854 --- [ main] i.m.r.c.e.system.SystemUpdateExecutor : Mongock has finished the system update execution
2023-02-12T00:49:58.457+01:00 INFO 80854 --- [ main] org.reflections.Reflections : Reflections took 0 ms to scan 0 urls, producing 0 keys and 0 values
2023-02-12T00:49:58.458+01:00 INFO 80854 --- [ main] org.reflections.Reflections : Reflections took 1 ms to scan 0 urls, producing 0 keys and 0 values
2023-02-12T00:49:58.458+01:00 INFO 80854 --- [ main] i.m.r.c.e.o.migrate.MigrateExecutorBase : Mongock skipping the data migration. There is no change set item.
Java:
2023-02-12T00:29:48.064+01:00 INFO 78548 --- [ main] i.m.r.c.e.system.SystemUpdateExecutor : Mongock has finished the system update execution
2023-02-12T00:29:48.072+01:00 INFO 78548 --- [ main] org.reflections.Reflections : Reflections took 6 ms to scan 1 urls, producing 1 keys and 2 values
2023-02-12T00:29:48.075+01:00 INFO 78548 --- [ main] org.reflections.Reflections : Reflections took 3 ms to scan 1 urls, producing 1 keys and 2 values
2023-02-12T00:29:48.081+01:00 INFO 78548 --- [ main] i.m.driver.core.lock.LockManagerDefault : Mongock trying to acquire the lock
Here is the full log from the Kotlin project:
2023-02-12T00:49:55.863+01:00 INFO 80854 --- [ main] c.e.m.MongockTestKotlinApplicationKt : No active profile set, falling back to 1 default profile: "default"
2023-02-12T00:49:56.764+01:00 INFO 80854 --- [ main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data Reactive MongoDB repositories in DEFAULT mode.
2023-02-12T00:49:57.019+01:00 INFO 80854 --- [ main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 246 ms. Found 1 Reactive MongoDB repository interfaces.
2023-02-12T00:49:57.919+01:00 INFO 80854 --- [ main] org.mongodb.driver.client : MongoClient with metadata {"driver": {"name": "mongo-java-driver|reactive-streams|spring-boot", "version": "4.8.2"}, "os": {"type": "Linux", "name": "Linux", "architecture": "amd64", "version": "5.15.0-60-generic"}, "platform": "Java/Private Build/17.0.5+8-Ubuntu-2ubuntu122.04"} created with settings MongoClientSettings{readPreference=primary, writeConcern=WriteConcern{w=null, wTimeout=null ms, journal=null}, retryWrites=true, retryReads=true, readConcern=ReadConcern{level=null}, credential=null, streamFactoryFactory=NettyStreamFactoryFactory{eventLoopGroup=io.netty.channel.nio.NioEventLoopGroup#631cb129, socketChannelClass=class io.netty.channel.socket.nio.NioSocketChannel, allocator=PooledByteBufAllocator(directByDefault: true), sslContext=null}, commandListeners=[], codecRegistry=ProvidersCodecRegistry{codecProviders=[ValueCodecProvider{}, BsonValueCodecProvider{}, DBRefCodecProvider{}, DBObjectCodecProvider{}, DocumentCodecProvider{}, CollectionCodecProvider{}, IterableCodecProvider{}, MapCodecProvider{}, GeoJsonCodecProvider{}, GridFSFileCodecProvider{}, Jsr310CodecProvider{}, JsonObjectCodecProvider{}, BsonCodecProvider{}, EnumCodecProvider{}, com.mongodb.Jep395RecordCodecProvider#3d20e575]}, clusterSettings={hosts=[localhost:27017], srvServiceName=mongodb, mode=SINGLE, requiredClusterType=UNKNOWN, requiredReplicaSetName='null', serverSelector='null', clusterListeners='[]', serverSelectionTimeout='30000 ms', localThreshold='30000 ms'}, socketSettings=SocketSettings{connectTimeoutMS=10000, readTimeoutMS=0, receiveBufferSize=0, sendBufferSize=0}, heartbeatSocketSettings=SocketSettings{connectTimeoutMS=10000, readTimeoutMS=10000, receiveBufferSize=0, sendBufferSize=0}, connectionPoolSettings=ConnectionPoolSettings{maxSize=100, minSize=0, maxWaitTimeMS=120000, maxConnectionLifeTimeMS=0, maxConnectionIdleTimeMS=0, maintenanceInitialDelayMS=0, maintenanceFrequencyMS=60000, connectionPoolListeners=[], maxConnecting=2}, serverSettings=ServerSettings{heartbeatFrequencyMS=10000, minHeartbeatFrequencyMS=500, serverListeners='[]', serverMonitorListeners='[]'}, sslSettings=SslSettings{enabled=false, invalidHostNameAllowed=false, context=null}, applicationName='null', compressorList=[], uuidRepresentation=JAVA_LEGACY, serverApi=null, autoEncryptionSettings=null, contextProvider=null}
2023-02-12T00:49:57.968+01:00 INFO 80854 --- [ main] i.m.r.core.builder.RunnerBuilderBase : Mongock runner COMMUNITY version[5.2.2]
2023-02-12T00:49:57.970+01:00 INFO 80854 --- [ main] i.m.r.core.builder.RunnerBuilderBase : Running Mongock with NO metadata
2023-02-12T00:49:58.034+01:00 INFO 80854 --- [localhost:27017] org.mongodb.driver.cluster : Monitor thread successfully connected to server with description ServerDescription{address=localhost:27017, type=REPLICA_SET_PRIMARY, state=CONNECTED, ok=true, minWireVersion=0, maxWireVersion=17, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=62515465, setName='myReplicaSet', canonicalAddress=mongo1:27017, hosts=[mongo3:27017, mongo2:27017, mongo1:27017], passives=[], arbiters=[], primary='mongo1:27017', tagSet=TagSet{[]}, electionId=7fffffff0000000000000002, setVersion=1, topologyVersion=TopologyVersion{processId=63e7c5a7d11b71e048698dab, counter=6}, lastWriteDate=Sun Feb 12 00:49:53 CET 2023, lastUpdateTimeNanos=45870970528894}
2023-02-12T00:49:58.336+01:00 INFO 80854 --- [ main] org.reflections.Reflections : Reflections took 33 ms to scan 1 urls, producing 2 keys and 2 values
2023-02-12T00:49:58.343+01:00 INFO 80854 --- [ main] org.reflections.Reflections : Reflections took 2 ms to scan 1 urls, producing 2 keys and 2 values
2023-02-12T00:49:58.367+01:00 INFO 80854 --- [ main] i.m.driver.core.lock.LockManagerDefault : Mongock trying to acquire the lock
2023-02-12T00:49:58.400+01:00 INFO 80854 --- [ main] i.m.driver.core.lock.LockManagerDefault : Mongock acquired the lock until: Sun Feb 12 00:50:58 CET 2023
2023-02-12T00:49:58.401+01:00 INFO 80854 --- [ Thread-1] i.m.driver.core.lock.LockManagerDefault : Starting mongock lock daemon...
2023-02-12T00:49:58.404+01:00 INFO 80854 --- [ main] i.m.r.c.e.system.SystemUpdateExecutor : Mongock starting the system update execution id[2023-02-12T00:49:57.955733372-712]...
2023-02-12T00:49:58.408+01:00 INFO 80854 --- [ main] i.m.r.c.executor.ChangeLogRuntimeImpl : method[io.mongock.runner.core.executor.system.changes.SystemChangeUnit00001] with arguments: []
2023-02-12T00:49:58.411+01:00 INFO 80854 --- [ main] i.m.r.c.executor.ChangeLogRuntimeImpl : method[beforeExecution] with arguments: [io.mongock.driver.mongodb.reactive.repository.MongoReactiveChangeEntryRepository]
2023-02-12T00:49:58.413+01:00 INFO 80854 --- [ main] i.m.r.core.executor.ChangeExecutorBase : APPLIED - {"id"="system-change-00001_before", "type"="before-execution", "author"="mongock", "class"="SystemChangeUnit00001", "method"="beforeExecution"}
2023-02-12T00:49:58.425+01:00 INFO 80854 --- [ main] i.m.r.c.executor.ChangeLogRuntimeImpl : method[execution] with arguments: [io.mongock.driver.mongodb.reactive.repository.MongoReactiveChangeEntryRepository]
2023-02-12T00:49:58.429+01:00 INFO 80854 --- [ main] i.m.r.core.executor.ChangeExecutorBase : APPLIED - {"id"="system-change-00001", "type"="execution", "author"="mongock", "class"="SystemChangeUnit00001", "method"="execution"}
2023-02-12T00:49:58.447+01:00 INFO 80854 --- [ main] i.m.driver.core.lock.LockManagerDefault : Mongock waiting to release the lock
2023-02-12T00:49:58.447+01:00 INFO 80854 --- [ main] i.m.driver.core.lock.LockManagerDefault : Mongock releasing the lock
2023-02-12T00:49:58.455+01:00 INFO 80854 --- [ main] i.m.driver.core.lock.LockManagerDefault : Mongock released the lock
2023-02-12T00:49:58.455+01:00 INFO 80854 --- [ main] i.m.r.c.e.system.SystemUpdateExecutor : Mongock has finished the system update execution
2023-02-12T00:49:58.457+01:00 INFO 80854 --- [ main] org.reflections.Reflections : Reflections took 0 ms to scan 0 urls, producing 0 keys and 0 values
2023-02-12T00:49:58.458+01:00 INFO 80854 --- [ main] org.reflections.Reflections : Reflections took 1 ms to scan 0 urls, producing 0 keys and 0 values
2023-02-12T00:49:58.458+01:00 INFO 80854 --- [ main] i.m.r.c.e.o.migrate.MigrateExecutorBase : Mongock skipping the data migration. There is no change set item.
2023-02-12T00:49:58.458+01:00 INFO 80854 --- [ main] i.m.r.c.e.o.migrate.MigrateExecutorBase : Mongock has finished
2023-02-12T00:49:59.190+01:00 INFO 80854 --- [ main] o.s.b.web.embedded.netty.NettyWebServer : Netty started on port 8080
2023-02-12T00:49:59.201+01:00 INFO 80854 --- [ main] c.e.m.MongockTestKotlinApplicationKt : Started MongockTestKotlinApplicationKt in 4.086 seconds (process running for 4.773)

The problem is in your application.yaml. The migration-scan-package name is wrong; that's the reason Mongock doesn't find any ChangeUnit.
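For reference, a minimal sketch of the relevant application.yaml entry, assuming the standard Mongock Spring Boot property name; the package shown is hypothetical and must be replaced with the Kotlin package that actually contains your ChangeUnits:

mongock:
  migration-scan-package:
    - com.example.mongocktestkotlin.changeunits   # hypothetical; point this at the real package of your ChangeUnits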


Debezium server error: java.lang.OutOfMemoryError: Java heap space

Debezium server: v 1.9.0.Final
MongoDB Atlas: v 4.2.20
Running on AWS ECS with Fargate w/ 1GB CPU & 4GB MEMORY
Overview:
Debezium starts an initial snapshot and sends some data to Kinesis, but it runs into the error below before it finishes the snapshot. I've tried increasing the memory of the container to 4GB, but I'm not sure that's the issue. The one collection I'm testing this with is 28GB total and 11M documents.
Debezium config (in Terraform):
environment = [
  {
    "name" : "DEBEZIUM_SINK_TYPE",
    "value" : "kinesis"
  },
  {
    "name" : "DEBEZIUM_SINK_KINESIS_REGION",
    "value" : "us-east-1"
  },
  {
    "name" : "DEBEZIUM_SINK_KINESIS_CREDENTIALS_PROFILE",
    "value" : "default"
  },
  {
    "name" : "DEBEZIUM_SINK_KINESIS_ENDPOINT",
    "value" : "https://kinesis.us-east-1.amazonaws.com"
  },
  {
    "name" : "DEBEZIUM_SOURCE_CONNECTOR_CLASS",
    "value" : "io.debezium.connector.mongodb.MongoDbConnector"
  },
  {
    "name" : "DEBEZIUM_SOURCE_OFFSET_STORAGE_FILE_FILENAME",
    "value" : "data/offsets.dat"
  },
  {
    "name" : "DEBEZIUM_SOURCE_OFFSET_FLUSH_INTERVAL_MS",
    "value" : "0"
  },
  {
    "name" : "DEBEZIUM_SOURCE_MONGODB_NAME",
    "value" : "test"
  },
  {
    "name" : "DEBEZIUM_SOURCE_MONGODB_HOSTS",
    "value" : "test-mongodb-shard-00-00.test.mongodb.net:27017,test-mongodb-shard-00-01.test.mongodb.net:27017,test-mongodb-shard-00-02.test.mongodb.net:27017,test-mongodb-i-00-00.test.mongodb.net:27017"
  },
  {
    "name" : "DEBEZIUM_SOURCE_MONGODB_SSL_ENABLED",
    "value" : "true"
  },
  {
    "name" : "DEBEZIUM_SOURCE_MONGODB_MEMBERS_AUTO_DISCOVER",
    "value" : "true"
  },
  {
    "name" : "DEBEZIUM_SOURCE_DATABASE_INCLUDE_LIST",
    "value" : "test"
  },
  {
    "name" : "DEBEZIUM_SOURCE_COLLECTION_INCLUDE_LIST",
    "value" : "test.testCollection"
  },
  {
    "name" : "DEBEZIUM_SOURCE_CAPTURE_MODE",
    "value" : "change_streams_update_full"
  },
  {
    "name" : "DEBEZIUM_SOURCE_DATABASE_HISTORY",
    "value" : "io.debezium.relational.history.FileDatabaseHistory"
  },
  {
    "name" : "DEBEZIUM_SOURCE_DATABASE_HISTORY_FILE_FILENAME",
    "value" : "history.dat"
  },
  {
    "name" : "QUARKUS_LOG_CONSOLE_JSON",
    "value" : "false"
  }
]
secrets = [
  {
    "name" : "DEBEZIUM_SOURCE_MONGODB_USER",
    "valueFrom" : "${data.aws_secretsmanager_secret.test-debezium-read.arn}:username::"
  },
  {
    "name" : "DEBEZIUM_SOURCE_MONGODB_PASSWORD",
    "valueFrom" : "${data.aws_secretsmanager_secret.test-debezium-read.arn}:password::"
  }
]
Stacktrace:
2022-06-01 18:22:23,976 ERROR [io.deb.con.mon.MongoDbSnapshotChangeEventSource] (debezium-mongodbconnector-test-replicator-snapshot-0) Error while attempting to sync 'test-mongodb-shard-0.test.testCollection': : java.lang.OutOfMemoryError: Java heap space
at java.base/java.util.Arrays.copyOf(Arrays.java:3745)
at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:172)
at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:538)
at java.base/java.lang.StringBuffer.append(StringBuffer.java:317)
at java.base/java.io.StringWriter.write(StringWriter.java:106)
at org.bson.json.StrictCharacterStreamJsonWriter.write(StrictCharacterStreamJsonWriter.java:368)
at org.bson.json.StrictCharacterStreamJsonWriter.writeStartObject(StrictCharacterStreamJsonWriter.java:204)
at org.bson.json.LegacyExtendedJsonDateTimeConverter.convert(LegacyExtendedJsonDateTimeConverter.java:22)
at org.bson.json.LegacyExtendedJsonDateTimeConverter.convert(LegacyExtendedJsonDateTimeConverter.java:19)
at org.bson.json.JsonWriter.doWriteDateTime(JsonWriter.java:129)
at org.bson.AbstractBsonWriter.writeDateTime(AbstractBsonWriter.java:394)
at org.bson.codecs.DateCodec.encode(DateCodec.java:32)
at org.bson.codecs.DateCodec.encode(DateCodec.java:29)
at org.bson.codecs.EncoderContext.encodeWithChildContext(EncoderContext.java:91)
at org.bson.codecs.DocumentCodec.writeValue(DocumentCodec.java:203)
at org.bson.codecs.DocumentCodec.writeMap(DocumentCodec.java:217)
at org.bson.codecs.DocumentCodec.writeValue(DocumentCodec.java:200)
at org.bson.codecs.DocumentCodec.writeMap(DocumentCodec.java:217)
at org.bson.codecs.DocumentCodec.writeValue(DocumentCodec.java:200)
at org.bson.codecs.DocumentCodec.writeMap(DocumentCodec.java:217)
at org.bson.codecs.DocumentCodec.encode(DocumentCodec.java:159)
at org.bson.codecs.DocumentCodec.encode(DocumentCodec.java:46)
at org.bson.Document.toJson(Document.java:453)
at io.debezium.connector.mongodb.JsonSerialization.lambda$new$0(JsonSerialization.java:57)
at io.debezium.connector.mongodb.JsonSerialization$$Lambda$521/0x0000000840448840.apply(Unknown Source)
at io.debezium.connector.mongodb.JsonSerialization.getDocumentValue(JsonSerialization.java:89)
at io.debezium.connector.mongodb.MongoDbSchema$$Lambda$580/0x00000008404ce840.apply(Unknown Source)
at io.debezium.connector.mongodb.MongoDbCollectionSchema.valueFromDocumentOplog(MongoDbCollectionSchema.java:90)
at io.debezium.connector.mongodb.MongoDbChangeSnapshotOplogRecordEmitter.emitReadRecord(MongoDbChangeSnapshotOplogRecordEmitter.java:68)
at io.debezium.connector.mongodb.MongoDbChangeSnapshotOplogRecordEmitter.emitReadRecord(MongoDbChangeSnapshotOplogRecordEmitter.java:27)
at io.debezium.pipeline.AbstractChangeRecordEmitter.emitChangeRecords(AbstractChangeRecordEmitter.java:42)
at io.debezium.pipeline.EventDispatcher.dispatchSnapshotEvent(EventDispatcher.java:163)
I noticed that during the snapshot, the number of records sent and the last recorded offset don't seem to change, while the time elapsed between those messages gets longer and longer. This looks like some kind of exponential backoff, but I'm not entirely sure.
Example:
2022-06-01 16:20:37,789 INFO [io.deb.con.mon.MongoDbSnapshotChangeEventSource] (debezium-mongodbconnector-test-replicator-snapshot-0) Beginning snapshot of 'test-mongodb-shard-0' at {sec=1654100437, ord=138, initsync=true, h=0}
2022-06-01 16:20:37,804 INFO [io.deb.con.mon.MongoDbSnapshotChangeEventSource] (debezium-mongodbconnector-test-replicator-snapshot-0) Exporting data for collection 'test-mongodb-shard-0.test.testCollection'
2022-06-01 16:20:42,983 INFO [io.deb.con.com.BaseSourceTask] (pool-7-thread-1) 717 records sent during previous 00:00:06.159, last recorded offset: {sec=1654100437, ord=138, initsync=true, h=0}
2022-06-01 16:20:57,417 INFO [io.deb.con.com.BaseSourceTask] (pool-7-thread-1) 2048 records sent during previous 00:00:14.434, last recorded offset: {sec=1654100437, ord=138, initsync=true, h=0}
2022-06-01 16:21:05,107 INFO [io.deb.con.mon.ReplicaSetDiscovery] (debezium-mongodbconnector-test-replica-set-monitor) Checking current members of replica set at test-mongodb-shard-00-00.test.mongodb.net:27017,test-mongodb-shard-00-01.test.mongodb.net:27017,test-mongodb-shard-00-02.test.mongodb.net:27017,test-mongodb-i-00-00.test.mongodb.net:27017
2022-06-01 16:21:16,624 INFO [io.deb.con.com.BaseSourceTask] (pool-7-thread-1) 2048 records sent during previous 00:00:19.207, last recorded offset: {sec=1654100437, ord=138, initsync=true, h=0}
2022-06-01 16:21:35,107 INFO [io.deb.con.mon.ReplicaSetDiscovery] (debezium-mongodbconnector-test-replica-set-monitor) Checking current members of replica set at test-mongodb-shard-00-00.test.mongodb.net:27017,test-mongodb-shard-00-01.test.mongodb.net:27017,test-mongodb-shard-00-02.test.mongodb.net:27017,test-mongodb-i-00-00.test.mongodb.net:27017
2022-06-01 16:21:53,130 INFO [io.deb.con.com.BaseSourceTask] (pool-7-thread-1) 2048 records sent during previous 00:00:36.505, last recorded offset: {sec=1654100437, ord=138, initsync=true, h=0}
2022-06-01 16:22:05,107 INFO [io.deb.con.mon.ReplicaSetDiscovery] (debezium-mongodbconnector-test-replica-set-monitor) Checking current members of replica set at test-mongodb-shard-00-00.test.mongodb.net:27017,test-mongodb-shard-00-01.test.mongodb.net:27017,test-mongodb-shard-00-02.test.mongodb.net:27017,test-mongodb-i-00-00.test.mongodb.net:27017
...
2022-06-01 16:23:17,521 INFO [io.deb.con.com.BaseSourceTask] (pool-7-thread-1) 2048 records sent during previous 00:01:24.391, last recorded offset: {sec=1654100437, ord=138, initsync=true, h=0}
2022-06-01 16:23:35,106 INFO [io.deb.con.mon.ReplicaSetDiscovery] (debezium-mongodbconnector-test-replica-set-monitor) Checking current members of replica set at test-mongodb-shard-00-00.test.mongodb.net:27017,test-mongodb-shard-00-01.test.mongodb.net:27017,test-mongodb-shard-00-02.test.mongodb.net:27017,test-mongodb-i-00-00.test.mongodb.net:27017
...
2022-06-01 16:26:06,523 INFO [io.deb.con.com.BaseSourceTask] (pool-7-thread-1) 2048 records sent during previous 00:02:49.003, last recorded offset: {sec=1654100437, ord=138, initsync=true, h=0}
2022-06-01 16:26:35,107 INFO [io.deb.con.mon.ReplicaSetDiscovery] (debezium-mongodbconnector-test-replica-set-monitor) Checking current members of replica set at test-mongodb-shard-00-00.test.mongodb.net:27017,test-mongodb-shard-00-01.test.mongodb.net:27017,test-mongodb-shard-00-02.test.mongodb.net:27017,test-mongodb-i-00-00.test.mongodb.net:27017
...
2022-06-01 16:31:18,075 INFO [io.deb.con.com.BaseSourceTask] (pool-7-thread-1) 2048 records sent during previous 00:05:11.552, last recorded offset: {sec=1654100437, ord=138, initsync=true, h=0}
2022-06-01 16:31:35,106 INFO [io.deb.con.mon.ReplicaSetDiscovery] (debezium-mongodbconnector-test-replica-set-monitor) Checking current members of replica set at test-mongodb-shard-00-00.test.mongodb.net:27017,test-mongodb-shard-00-01.test.mongodb.net:27017,test-mongodb-shard-00-02.test.mongodb.net:27017,test-mongodb-i-00-00.test.mongodb.net:27017
...
2022-06-01 16:42:07,711 INFO [io.deb.con.com.BaseSourceTask] (pool-7-thread-1) 2048 records sent during previous 00:10:49.636, last recorded offset: {sec=1654100437, ord=138, initsync=true, h=0}
2022-06-01 16:42:35,106 INFO [io.deb.con.mon.ReplicaSetDiscovery] (debezium-mongodbconnector-test-replica-set-monitor) Checking current members of replica set at test-mongodb-shard-00-00.test.mongodb.net:27017,test-mongodb-shard-00-01.test.mongodb.net:27017,test-mongodb-shard-00-02.test.mongodb.net:27017,test-mongodb-i-00-00.test.mongodb.net:27017
...
2022-06-01 17:03:12,872 INFO [io.deb.con.com.BaseSourceTask] (pool-7-thread-1) 2048 records sent during previous 00:21:05.161, last recorded offset: {sec=1654100437, ord=138, initsync=true, h=0}
2022-06-01 17:03:35,117 INFO [io.deb.con.mon.ReplicaSetDiscovery] (debezium-mongodbconnector-test-replica-set-monitor) Checking current members of replica set at test-mongodb-shard-00-00.test.mongodb.net:27017,test-mongodb-shard-00-01.test.mongodb.net:27017,test-mongodb-shard-00-02.test.mongodb.net:27017,test-mongodb-i-00-00.test.mongodb.net:27017
...
2022-06-01 17:45:58,637 INFO [io.deb.con.com.BaseSourceTask] (pool-7-thread-1) 2048 records sent during previous 00:42:45.765, last recorded offset: {sec=1654100437, ord=138, initsync=true, h=0}
2022-06-01 17:46:05,106 INFO [io.deb.con.mon.ReplicaSetDiscovery] (debezium-mongodbconnector-test-replica-set-monitor) Checking current members of replica set at test-mongodb-shard-00-00.test.mongodb.net:27017,test-mongodb-shard-00-01.test.mongodb.net:27017,test-mongodb-shard-00-02.test.mongodb.net:27017,test-mongodb-i-00-00.test.mongodb.net:27017
...
2022-06-01 18:22:23,976 ERROR [io.deb.con.mon.MongoDbSnapshotChangeEventSource] (debezium-mongodbconnector-test-replicator-snapshot-0) Error while attempting to sync 'test-mongodb-shard-0.test.testCollection': : java.lang.OutOfMemoryError: Java heap space
Besides increasing the container memory to 4GB, you can also set a bigger heap size; the initial and maximum heap size can be set, for example, to 2GB:
-Xms2048m -Xmx2048m
If the issue continues, follow these steps:
Start the JVM with the argument -XX:+HeapDumpOnOutOfMemoryError; this will give you a heap dump when the program runs into OOM.
Use a tool like VisualVM to analyze the heap dump you obtain. That will help in identifying the memory leak.
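On ECS, those JVM flags can be passed through the same Terraform environment list as the Debezium settings. A minimal sketch, assuming the Debezium Server image's entrypoint honours a JAVA_OPTS environment variable (verify that against your image; the variable name is an assumption here, not something from the original post):

environment = [
  {
    # hypothetical extra entry: 2GB heap plus a heap dump on OOM for later analysis
    "name" : "JAVA_OPTS",
    "value" : "-Xms2048m -Xmx2048m -XX:+HeapDumpOnOutOfMemoryError"
  }
]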
Not really answering the OP, just wanted to share my experience.
I too occasionally receive java.lang.OutOfMemoryError and would like to find out what's causing it.
My setup:
Debezium 1.9.5
Kafka 2.8
Docker container memory - 6Gi
Java heap - 4Gi both min and max
max.queue.size.in.bytes - 512Mi
max.batch.size - 16384
The errors from stdout:
2022-07-20T16:47:16.348943181Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "KafkaBasedLog Work Thread - cdc.config"
2022-07-20T16:47:27.628395682Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "KafkaBasedLog Work Thread - cdc.status"
2022-07-20T16:47:28.970536167Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "kafka-producer-network-thread | connector-producer-REDACTED-202207200823-0"
2022-07-20T16:47:33.787361085Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "kafka-producer-network-thread | producer-3"
2022-07-20T16:47:45.067373810Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "server-timer"
2022-07-20T16:47:46.987669188Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "kafka-producer-network-thread | REDACTED-dbhistory"
2022-07-20T16:48:03.396881812Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "kafka-producer-network-thread | producer-2"
2022-07-20T16:48:04.017710798Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "kafka-coordinator-heartbeat-thread | production"
2022-07-20T16:48:09.709036280Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "prometheus-http-1-3"
2022-07-20T16:48:14.667691706Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "mysql-cj-abandoned-connection-cleanup"
2022-07-20T16:48:17.182623196Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "qtp1890777616-62"
2022-07-20T16:48:25.227925660Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "qtp1890777616-58"
2022-07-20T16:48:43.598026645Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "HTTP-Dispatcher"
2022-07-20T16:48:45.543984655Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "kafka-producer-network-thread | producer-1"
2022-07-20T16:48:52.284810255Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "SourceTaskOffsetCommitter-1"
2022-07-20T16:48:56.992674380Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "DistributedHerder-connect-1-1"
2022-07-20T16:49:18.691603140Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Session-HouseKeeper-47a3d56a-1"
2022-07-20T16:49:19.350459393Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "KafkaBasedLog Work Thread - cdc.offset"
2022-07-20T16:49:26.256350455Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "kafka-admin-client-thread | adminclient-8"
2022-07-20T16:49:33.154845201Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "qtp1890777616-59"
2022-07-20T16:49:34.414951745Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "qtp1890777616-60"
2022-07-20T16:49:40.871967276Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "qtp1890777616-61"
2022-07-20T16:49:56.007111292Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "debezium-sqlserverconnector-REDACTED-change-event-source-coordinator"
2022-07-20T16:50:00.410800756Z Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "kafka-admin-client-thread | connector-adminclient-REDACTED-0"
Some context:
I transfer a table with rows of ~1000 bytes each. There is a useful max.queue.size.in.bytes setting which I expect to act as an upper bound for the connector, but my dashboard shows that the queue size only goes as high as 122Mi, fluctuating between 30Mi and 60Mi most of the time. By my calculation, 1000 bytes * 16384 records is roughly 16Mi, which should never come anywhere near 512Mi.
In this particular case, the connector had been working normally for about 3 hours, and then there was no new data to stream for a few minutes. Soon after that, the OutOfMemoryError appeared.
I also noticed that every time it happens, the container takes the maximum CPU allowed (which is 4 cores in my case).

Does cascade happen in 1 transaction?

I save the Product, which cascade-persists the ProductMaterial. However, when the ProductMaterial insert throws a DataIntegrityViolationException, the Product is rolled back, which looks like the cascade is done in one transaction, but I can't find any docs saying that it is. Can someone clarify this for me?
NOTE: I DO NOT use @Transactional
Material material = new Material();
material.setId(1);
Product newProduct = new Product();
ProductMaterial productMaterial = new ProductMaterial();
newProduct.setName("bàn chải");
newProduct.setPrice(1000);
newProduct.setCreatedAt(new Date());
newProduct.setProductMaterials(Collections.singletonList(productMaterial));
productMaterial.setProduct(newProduct);
productMaterial.setMaterial(material);
productRepository.save(newProduct);
Here is the hibernate execution:
Hibernate:
/* insert com.vietnam.hanghandmade.entities.Product
*/ insert
into
product
(created_at, name, price, id)
values
(?, ?, ?, ?)
2020-11-10 14:55:38.281 TRACE 65729 --- [nio-8080-exec-2] o.h.type.descriptor.sql.BasicBinder : binding parameter [1] as [TIMESTAMP] - [Tue Nov 10 14:55:38 JST 2020]
2020-11-10 14:55:38.281 TRACE 65729 --- [nio-8080-exec-2] o.h.type.descriptor.sql.BasicBinder : binding parameter [2] as [VARCHAR] - [bàn chải]
2020-11-10 14:55:38.281 TRACE 65729 --- [nio-8080-exec-2] o.h.type.descriptor.sql.BasicBinder : binding parameter [3] as [INTEGER] - [1000]
2020-11-10 14:55:38.281 TRACE 65729 --- [nio-8080-exec-2] o.h.type.descriptor.sql.BasicBinder : binding parameter [4] as [OTHER] - [e5729490-a0f8-48e7-9600-eeeba8b8f279]
Hibernate:
/* insert com.vietnam.hanghandmade.entities.ProductMaterial
*/ insert
into
product_material
(material_id, product_id)
values
(?, ?)
2020-11-10 14:55:38.324 TRACE 65729 --- [nio-8080-exec-2] o.h.type.descriptor.sql.BasicBinder : binding parameter [1] as [INTEGER] - [1]
2020-11-10 14:55:38.324 TRACE 65729 --- [nio-8080-exec-2] o.h.type.descriptor.sql.BasicBinder : binding parameter [2] as [OTHER] - [e5729490-a0f8-48e7-9600-eeeba8b8f279]
2020-11-10 14:55:38.328 WARN 65729 --- [nio-8080-exec-2] o.h.engine.jdbc.spi.SqlExceptionHelper : SQL Error: 0, SQLState: 23503
2020-11-10 14:55:38.328 ERROR 65729 --- [nio-8080-exec-2] o.h.engine.jdbc.spi.SqlExceptionHelper : ERROR: insert or update on table "product_material" violates foreign key constraint "product_material_material_id_fkey"
Detail: Key (material_id)=(1) is not present in table "material".
NOTE: This answer missed the point of the question, which is about “cascading persist” – it talks about “cascading delete” for foreign keys.
The cascading delete or update is part of the action of the system trigger that implements foreign key constraints, and as such it runs in the same transaction as the triggering statement.
I cannot find a place in the fine manual that spells this out, but it is obvious if you think about it: if the cascading delete were run in a separate transaction, it would be possible that the delete succeeds and the cascading delete fails, which would render the database inconsistent and is consequently not an option.
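For completeness, a minimal sketch (not from the original question) that makes the transaction boundary explicit. It reuses the Product, ProductMaterial, Material and ProductRepository types from the question; the ProductService class itself is hypothetical. Note that even without this wrapper, Spring Data JPA's SimpleJpaRepository.save() is annotated with @Transactional, so the parent INSERT and the cascaded child INSERT already run in one transaction and roll back together.

import java.util.Collections;
import java.util.Date;

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class ProductService {

    private final ProductRepository productRepository;

    public ProductService(ProductRepository productRepository) {
        this.productRepository = productRepository;
    }

    // Declaring the transaction here just makes the boundary explicit;
    // repository.save() would open one on its own otherwise.
    @Transactional
    public Product createProduct(Material material) {
        Product newProduct = new Product();
        ProductMaterial productMaterial = new ProductMaterial();
        newProduct.setName("bàn chải");
        newProduct.setPrice(1000);
        newProduct.setCreatedAt(new Date());
        newProduct.setProductMaterials(Collections.singletonList(productMaterial));
        productMaterial.setProduct(newProduct);
        productMaterial.setMaterial(material);
        // If the child INSERT violates the foreign key constraint, the whole
        // transaction, including the already-executed parent INSERT, is rolled back.
        return productRepository.save(newProduct);
    }
}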

CREATE TABLESPACE problem in Db2 HADR environment

We have Db2 10.5.0.7 on CentOS 6.9 with TSAMP 3.2 as our high-availability solution. When we create a tablespace in the primary database, we encounter the following errors on the standby:
2019-08-31-08.47.32.164952+270 I87056E2779          LEVEL: Error (OS)
PID     : 4046                 TID : 47669095425792 PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : SAMDB
APPHDL  : 0-8                  APPID: *LOCAL.DB2.190725231126
HOSTNAME: samdb-b
EDUID   : 155                  EDUNAME: db2redom (SAMDB) 0
FUNCTION: DB2 Common, OSSe, ossGetDiskInfo, probe:130
MESSAGE : ECF=0x90000001=-1879048191=ECF_ACCESS_DENIED
          Access denied
CALLED  : OS, -, fopen                              OSERR: EACCES (13)
DATA #1 : String, 12 bytes
          /proc/mounts
DATA #2 : String, 25 bytes
          /dbdata1/samdbTsContainer
DATA #3 : unsigned integer, 8 bytes

2019-08-31-08.47.32.185625+270 E89836E494           LEVEL: Error
PID     : 4046                 TID : 47669095425792 PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : SAMDB
APPHDL  : 0-8                  APPID: *LOCAL.DB2.190725231126
HOSTNAME: samdb-b
EDUID   : 155                  EDUNAME: db2redom (SAMDB) 0
FUNCTION: DB2 UDB, high avail services, sqlhaGetLocalDiskInfo, probe:9433
MESSAGE : ECF=0x90000001=-1879048191=ECF_ACCESS_DENIED
          Access denied

2019-08-31-08.47.32.186258+270 E90331E484           LEVEL: Error
PID     : 4046                 TID : 47669095425792 PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : SAMDB
APPHDL  : 0-8                  APPID: *LOCAL.DB2.190725231126
HOSTNAME: samdb-b
EDUID   : 155                  EDUNAME: db2redom (SAMDB) 0
FUNCTION: DB2 UDB, high avail services, sqlhaCreateMount, probe:9746
RETCODE : ZRC=0x827300AA=-2106392406=HA_ZRC_FAILED "SQLHA API call error"

2019-08-31-08.47.32.186910+270 I90816E658           LEVEL: Error
PID     : 4046                 TID : 47669095425792 PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : SAMDB
APPHDL  : 0-8                  APPID: *LOCAL.DB2.190725231126
HOSTNAME: samdb-b
EDUID   : 155                  EDUNAME: db2redom (SAMDB) 0
FUNCTION: DB2 UDB, buffer pool services, sqlbDMSAddContainerRequest, probe:812
MESSAGE : ZRC=0x827300AA=-2106392406=HA_ZRC_FAILED "SQLHA API call error"
DATA #1 : String, 36 bytes
          Cluster add mount operation failed:
DATA #2 : String, 37 bytes
          /dbdata1/samdbTsContainer/TSPKGCACH.1
DATA #3 : String, 8 bytes
          SAMDB

2019-08-31-08.47.32.190537+270 E113909E951          LEVEL: Error
PID     : 4046                 TID : 47669095425792 PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : SAMDB
APPHDL  : 0-8                  APPID: *LOCAL.DB2.190725231126
HOSTNAME: samdb-b
EDUID   : 155                  EDUNAME: db2redom (SAMDB) 0
FUNCTION: DB2 UDB, buffer pool services, sqlblog_reCreatePool, probe:3134
MESSAGE : ADM6106E Table space "TSPKGCACH" (ID = "49") could not be created
          during the rollforward operation. The most likely cause is that there
          is not enough space to create the containers associated with the
          table space. Connect to the database after the rollforward operation
          completes and use the SET TABLESPACE CONTAINERS command to assign
          containers to the table space. Then, issue another ROLLFORWARD
          DATABASE command to complete recovery of this table space.

2019-08-31-08.47.32.200949+270 E114861E592          LEVEL: Error
PID     : 4046                 TID : 47669095425792 PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000           DB   : SAMDB
APPHDL  : 0-8                  APPID: *LOCAL.DB2.190725231126
HOSTNAME: samdb-b
EDUID   : 155                  EDUNAME: db2redom (SAMDB) 0
FUNCTION: DB2 UDB, buffer pool services, sqlbIncPoolState, probe:4628
MESSAGE : ADM12512W Log replay on the HADR standby has stopped on table space
          "TSPKGCACH" (ID "49") because it has been put into "ROLLFORWARD PENDING" state.
There is free space available for the database, the specified path (/dbdata1/samdbTsContainer) exists on the server, and we can create files on it manually.
All settings are equivalent on the primary and standby. db2inst1 is the owner of /dbdata1/samdbTsContainer and the permissions are drwxr-xr-x, the result of su - db2inst1 "ulimit -Hf" is unlimited, the file system type is ext3, and the CREATE TABLESPACE statement is as follows:
CREATE LARGE TABLESPACE TSPKGCACH IN DATABASE PARTITION GROUP IBMDEFAULTGROUP PAGESIZE 8 K MANAGED BY DATABASE USING (FILE '/dbdata1/samdbTsContainer/TSPKGCACH.1' 5120) ON DBPARTITIONNUM (0) EXTENTSIZE 64 PREFETCHSIZE 64 BUFFERPOOL BP8KPKGCACH OVERHEAD 10.5 TRANSFERRATE 0.14 DATA TAG NONE NO FILE SYSTEM CACHING;
SELinux is disabled and the sector size is 512 bytes. The mount options are as follows:
/dev/sdf1 /dbdata1 ext3 rw,relatime,errors=continue,barrier=1,data=ordered 0 0
We cannot reproduce the problem on demand; it only occurs sometimes and we don't know the reason for it, but once it happens it persists until the server is rebooted.
When we restart the standby server the problem goes away, but we then need to drop the tablespace and recreate it. Is there any idea what causes this problem?
From the error it looks to me that the problem is not with access to the container path itself but rather with /proc/mounts, which Db2 uses to map containers to filesystems (to know, e.g., the FS type). Hence I suggest testing whether all of:
cat /proc/mounts
cat /proc/self/mounts
mount
work OK when run as the Db2 instance owner ID (db2inst1). If not, this implies some odd OS issue that Db2 is a victim of, and we would need more OS diagnostics (e.g. strace of the cat /proc/mounts command) to understand it.
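For example, each command can be run under the instance owner with su (the same approach as the ulimit check mentioned above):

su - db2inst1 -c 'cat /proc/mounts'
su - db2inst1 -c 'cat /proc/self/mounts'
su - db2inst1 -c 'mount'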
Edit:
To confirm this theory I've run a quick test with Db2 11.1. Note this must be a TSA-controlled environment for Db2 to follow the sqlhaCreateMount code path (because if the container is on a separate mount, Db2 will add it to the TSA resource model).
On both primary and standby:
mkdir /db2data
chown db2v111:db2iadm /db2data
then on standby:
chmod o-rx /proc
(couldn't find a "smarter" way to hit EACCES on mount info).
When I then run on the primary:
db2 "create tablespace test managed by database using (file '/db2data/testts' 100 M)"
it completes fine on the primary, but the standby hits exactly the error you are seeing:
2019-06-21-03.00.37.087693+120 I1774E2661 LEVEL: Error (OS)
PID : 10379 TID : 46912992438016 PROC : db2sysc 0
INSTANCE: db2v111 NODE : 000 DB : SAMPLE
APPHDL : 0-4492 APPID: *LOCAL.DB2.190621005919
HOSTNAME: rhel-hadrs.kkuduk.com
EDUID : 61 EDUNAME: db2redom (SAMPLE) 0
FUNCTION: DB2 Common, OSSe, ossGetDiskInfo, probe:130
MESSAGE : ECF=0x90000001=-1879048191=ECF_ACCESS_DENIED
Access denied
CALLED : OS, -, fopen OSERR: EACCES (13)
DATA #1 : String, 12 bytes
/proc/mounts
DATA #2 : String, 8 bytes
/db2data
DATA #3 : unsigned integer, 8 bytes
1
CALLSTCK: (Static functions may not be resolved correctly, as they are resolved to the nearest symbol)
[0] 0x00002AAAB9CFD84B /home/db2v111/sqllib/lib64/libdb2osse.so.1 + 0x23F84B
[1] 0x00002AAAB9CFED51 ossLogSysRC + 0x101
[2] 0x00002AAAB9D19647 ossGetDiskInfo + 0xF07
[3] 0x00002AAAAC52402C _Z21sqlhaGetLocalDiskInfoPKcjPcjS1_jS1_ + 0x26C
[4] 0x00002AAAAC523C5F _Z16sqlhaGetDiskInfoPKcS0_jPcjS1_jS1_ + 0x29F
[5] 0x00002AAAAC521CA0 _Z16sqlhaCreateMountPKcS0_m + 0x350
[6] 0x00002AAAACDE8D5D _Z26sqlbDMSAddContainerRequestP12SQLB_POOL_CBP16SQLB_POOLCONT_CBP12SQLB_GLOBALSP14SQLB_pfParIoCbbm + 0x90D
[7] 0x00002AAAACE14FF9 _Z29sqlbDoDMSAddContainerRequestsP12SQLB_POOL_CBP16SQLB_POOLCONT_CBjP26SQLB_AS_CONT_AND_PATH_INFOP12SQLB_GLOBALS + 0x2D9
[8] 0x00002AAAACE0C20F _Z17sqlbDMSCreatePoolP12SQLB_POOL_CBiP16SQLB_POOLCONT_CBbP12SQLB_GLOBALS + 0x103F
[9] 0x00002AAAACDB1EAC _Z13sqlbSetupPoolP12SQLB_GLOBALSP12SQLB_POOL_CBPKciiiihiP19SQLB_CONTAINER_SPECllblsib + 0xE4C
-> it is an issue with /proc/mounts access, not the target path itself, to which I can write with no issues:
[db2v111@rhel-hadrs ~]$ echo "test" > /db2data/testfile
If it were a path access issue:
chmod o+rx /proc
chmod a-rw /db2data
then the error during the CREATE TABLESPACE redo on the standby will be different:
2019-06-21-03.07.29.175486+120 I35023E592 LEVEL: Error
PID : 10379 TID : 46912992438016 PROC : db2sysc 0
INSTANCE: db2v111 NODE : 000 DB : SAMPLE
APPHDL : 0-4492 APPID: *LOCAL.DB2.190621005919
HOSTNAME: rhel-hadrs.kkuduk.com
EDUID : 61 EDUNAME: db2redom (SAMPLE) 0
FUNCTION: DB2 UDB, buffer pool services, sqlbCreateAndLockParent, probe:918
MESSAGE : ZRC=0x8402001E=-2080243682=SQLB_CONTAINER_NOT_ACCESSIBLE
"Container not accessible"
DATA #1 : <preformatted>
Failed at directory /db2data.
2019-06-21-03.07.29.175799+120 I35616E619 LEVEL: Severe
PID : 10379 TID : 46912992438016 PROC : db2sysc 0
INSTANCE: db2v111 NODE : 000 DB : SAMPLE
APPHDL : 0-4492 APPID: *LOCAL.DB2.190621005919
HOSTNAME: rhel-hadrs.kkuduk.com
EDUID : 61 EDUNAME: db2redom (SAMPLE) 0
FUNCTION: DB2 UDB, buffer pool services, sqlbCreateAndLockParent, probe:722
MESSAGE : ZRC=0x8402001E=-2080243682=SQLB_CONTAINER_NOT_ACCESSIBLE
"Container not accessible"
DATA #1 : <preformatted>
Failed to create a portion of the path /db2data/testts2
(a few more errors follow, pointing directly to the permissions on /db2data)
This proves it is the /proc access issue, and you need to debug it with your OS team. Perhaps /proc gets completely unmounted?
In any case, the actual issue is the db2sysc process hitting EACCES when running fopen on /proc/mounts, and you need to debug it further with your OS team.
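When the problem is occurring, a quick sanity check of /proc itself (a hedged suggestion, not part of the original answer) could be:

ls -ld /proc          # permissions on /proc; normally dr-xr-xr-x owned by root:root
mount | grep '^proc'  # confirm the proc filesystem is still mounted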
Edit:
When it comes to debugging and proving that the error is returned by the OS, we would have to trace the open() syscalls done by Db2. strace can do that, but its overhead is too high for a production system. If you can get SystemTap installed on the system, I suggest a script like this (this is a basic version):
probe nd_syscall.open.return
{
    if ( user_string( @entry( pointer_arg(1) ) ) =~ ".*mounts")
    {
        printf("exec: %s pid: %d uid: %d (euid: %d) gid: %d (egid: %d) run open(%s) rc: %d\n",
               execname(), pid(), uid(), euid(), gid(), egid(),
               user_string(@entry(pointer_arg(1)), "-"), returnval() )
    }
}
it uses nd_syscall probe, so it will work even without kernel debuginfo package. You can run it like this:
$ stap open.stap
exec: cat pid: 24159 uid: 0 (euid: 0) gid: 0 (egid: 0) run open(/proc/mounts) rc: 3
exec: cat pid: 24210 uid: 0 (euid: 0) gid: 0 (egid: 0) run open(/proc/mounts) rc: 3
exec: cat pid: 24669 uid: 1111 (euid: 1111) gid: 1001 (egid: 1001) run open(/proc/mounts) rc: 3
exec: cat pid: 24734 uid: 1111 (euid: 1111) gid: 1001 (egid: 1001) run open(/proc/mounts) rc: -13
exec: cat pid: 24891 uid: 1111 (euid: 1111) gid: 1001 (egid: 1001) run open(/proc/self/mounts) rc: -13
exec: ls pid: 24971 uid: 1111 (euid: 1111) gid: 1001 (egid: 1001) run open(/proc/mounts) rc: -13
-> at some point I revoked access to /proc and the open attempt failed with -13 (EACCES). You just need to enable the script on the system when you see the error and check whether something similar is logged when Db2 fails.

Hibernate second-level cache can get lazy-loading entity when I set a breakpoint

I use Spring Data JPA and the Hibernate second-level cache via hibernate-redis in my project. I use @Transactional for lazy loading, but it misses the cache when I run the application. If I debug it, set a breakpoint, and wait for some time, it works and retrieves the entity from the Redis cache. Here is the code:
Entity ItemCategory:
@Entity
@Cacheable
public class ItemCategory extends BaseModel {

    @NotNull
    @Column(updatable = false)
    private String name;

    @JsonBackReference
    @ManyToOne(fetch = FetchType.LAZY)
    private ItemCategory root;
}
Entity Item:
@Entity
@Cacheable
public class Item extends BaseModel {

    @ManyToOne(fetch = FetchType.EAGER)
    private ItemCategory category;
}
Repository:
@Repository
public interface ItemCategoryRepository extends JpaRepository<ItemCategory, Long> {

    @QueryHints(value = {
        @QueryHint(name = "org.hibernate.cacheable", value = "true")
    })
    @Query("select distinct i.category.root from Item i where i.store.id = :id and i.category.parent.id = i.category.root.id")
    List<ItemCategory> findByStoreId(@Param("id") Long id);
}
Cache miss:
2017-03-06 14:49:30.105 TRACE 30295 --- [nio-8080-exec-2] o.h.cache.redis.client.RedisClient : retrieve cache item. region=hibernate.org.hibernate.cache.internal.StandardQueryCache, key=sql: select distinct itemcatego2_.id as id1_21_, itemcatego2_.create_by_id as create_b8_21_, itemcatego2_.create_date as create_d2_21_, itemcatego2_.last_modified_by_id as last_mod9_21_, itemcatego2_.last_modified_date as last_mod3_21_, itemcatego2_.background_id as backgro10_21_, itemcatego2_.enabled as enabled4_21_, itemcatego2_.name as name5_21_, itemcatego2_.parent_id as parent_11_21_, itemcatego2_.root_id as root_id12_21_, itemcatego2_.slide as slide6_21_, itemcatego2_.son_number as son_numb7_21_ from item item0_ inner join item_category itemcatego1_ on item0_.category_id=itemcatego1_.id inner join item_category itemcatego2_ on itemcatego1_.root_id=itemcatego2_.id where item0_.store_id=? and itemcatego1_.parent_id=itemcatego1_.root_id; parameters: ; named parameters: {id=4}; transformer: org.hibernate.transform.CacheableResultTransformer#110f2, value=[6098054966726656, 3, 1]
2017-03-06 14:49:30.116 TRACE 30295 --- [nio-8080-exec-2] o.h.cache.redis.client.RedisClient : retrieve cache item. region=hibernate.org.hibernate.cache.spi.UpdateTimestampsCache, key=item, value=null
2017-03-06 14:49:30.127 TRACE 30295 --- [nio-8080-exec-2] o.h.cache.redis.client.RedisClient : retrieve cache item. region=hibernate.org.hibernate.cache.spi.UpdateTimestampsCache, key=item_category, value=null
2017-03-06 14:49:41.971 INFO 30295 --- [nio-8080-exec-2] i.StatisticalLoggingSessionEventListener : Session Metrics {
974551 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
0 nanoseconds spent preparing 0 JDBC statements;
0 nanoseconds spent executing 0 JDBC statements;
0 nanoseconds spent executing 0 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
19881210 nanoseconds spent performing 1 L2C hits;
24082571 nanoseconds spent performing 2 L2C misses;
0 nanoseconds spent executing 0 flushes (flushing a total of 0 entities and 0 collections);
26331 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections)
}
If I debug it, set a breakpoint, and wait for some time (this does not work every time):
2017-03-06 14:50:00.565 TRACE 30295 --- [nio-8080-exec-3] o.h.cache.redis.client.RedisClient : retrieve cache item. region=hibernate.org.hibernate.cache.internal.StandardQueryCache, key=sql: select distinct itemcatego2_.id as id1_21_, itemcatego2_.create_by_id as create_b8_21_, itemcatego2_.create_date as create_d2_21_, itemcatego2_.last_modified_by_id as last_mod9_21_, itemcatego2_.last_modified_date as last_mod3_21_, itemcatego2_.background_id as backgro10_21_, itemcatego2_.enabled as enabled4_21_, itemcatego2_.name as name5_21_, itemcatego2_.parent_id as parent_11_21_, itemcatego2_.root_id as root_id12_21_, itemcatego2_.slide as slide6_21_, itemcatego2_.son_number as son_numb7_21_ from item item0_ inner join item_category itemcatego1_ on item0_.category_id=itemcatego1_.id inner join item_category itemcatego2_ on itemcatego1_.root_id=itemcatego2_.id where item0_.store_id=? and itemcatego1_.parent_id=itemcatego1_.root_id; parameters: ; named parameters: {id=4}; transformer: org.hibernate.transform.CacheableResultTransformer#110f2, value=[6098054966726656, 3, 1]
2017-03-06 14:50:00.584 TRACE 30295 --- [nio-8080-exec-3] o.h.cache.redis.client.RedisClient : retrieve cache item. region=hibernate.org.hibernate.cache.spi.UpdateTimestampsCache, key=item, value=null
2017-03-06 14:50:00.595 TRACE 30295 --- [nio-8080-exec-3] o.h.cache.redis.client.RedisClient : retrieve cache item. region=hibernate.org.hibernate.cache.spi.UpdateTimestampsCache, key=item_category, value=null
2017-03-06 14:50:01.805 TRACE 30295 --- [nio-8080-exec-3] o.h.cache.redis.client.RedisClient : retrieve cache item. region=hibernate.com.foo.bar.model.item.ItemCategory, key=com.foo.bar.model.item.ItemCategory#3, value={parent=null, lastModifiedDate=2016-12-14 09:30:48.0, lastModifiedBy=1, enabled=true, sonNumber=2, _subclass=com.foo.bar.model.item.ItemCategory, createBy=1, children=3, background=1, slide=0, root=3, name=foo, _lazyPropertiesUnfetched=false, _version=null, createDate=2016-12-14 09:29:56.0}
Hibernate: select user0_.id as id1_59_0_, user0_.create_by_id as create_11_59_0_, user0_.create_date as create_d2_59_0_, user0_.last_modified_by_id as last_mo12_59_0_, user0_.last_modified_date as last_mod3_59_0_, user0_.avatar_id as avatar_13_59_0_, user0_.email as email4_59_0_, user0_.enabled as enabled5_59_0_, user0_.gender as gender6_59_0_, user0_.nickname as nickname7_59_0_, user0_.phone as phone8_59_0_, user0_.seller_auth_info_id as seller_14_59_0_, user0_.seller_auth_status as seller_a9_59_0_, user0_.user_ext_id as user_ex15_59_0_, user0_.user_group_id as user_gr16_59_0_, user0_.username as usernam10_59_0_, user1_.id as id1_59_1_, user1_.create_by_id as create_11_59_1_, user1_.create_date as create_d2_59_1_, user1_.last_modified_by_id as last_mo12_59_1_, user1_.last_modified_date as last_mod3_59_1_, user1_.avatar_id as avatar_13_59_1_, user1_.email as email4_59_1_, user1_.enabled as enabled5_59_1_, user1_.gender as gender6_59_1_, user1_.nickname as nickname7_59_1_, user1_.phone as phone8_59_1_, user1_.seller_auth_info_id as seller_14_59_1_, user1_.seller_auth_status as seller_a9_59_1_, user1_.user_ext_id as user_ex15_59_1_, user1_.user_group_id as user_gr16_59_1_, user1_.username as usernam10_59_1_, user2_.id as id1_59_2_, user2_.create_by_id as create_11_59_2_, user2_.create_date as create_d2_59_2_, user2_.last_modified_by_id as last_mo12_59_2_, user2_.last_modified_date as last_mod3_59_2_, user2_.avatar_id as avatar_13_59_2_, user2_.email as email4_59_2_, user2_.enabled as enabled5_59_2_, user2_.gender as gender6_59_2_, user2_.nickname as nickname7_59_2_, user2_.phone as phone8_59_2_, user2_.seller_auth_info_id as seller_14_59_2_, user2_.seller_auth_status as seller_a9_59_2_, user2_.user_ext_id as user_ex15_59_2_, user2_.user_group_id as user_gr16_59_2_, user2_.username as usernam10_59_2_, usergroup3_.id as id1_65_3_, usergroup3_.create_by_id as create_b5_65_3_, usergroup3_.create_date as create_d2_65_3_, usergroup3_.last_modified_by_id as last_mod6_65_3_, usergroup3_.last_modified_date as last_mod3_65_3_, usergroup3_.name as name4_65_3_ from user user0_ left outer join user user1_ on user0_.create_by_id=user1_.id left outer join user user2_ on user1_.last_modified_by_id=user2_.id left outer join user_group usergroup3_ on user1_.user_group_id=usergroup3_.id where user0_.id=?
Hibernate: select usergroup0_.id as id1_65_0_, usergroup0_.create_by_id as create_b5_65_0_, usergroup0_.create_date as create_d2_65_0_, usergroup0_.last_modified_by_id as last_mod6_65_0_, usergroup0_.last_modified_date as last_mod3_65_0_, usergroup0_.name as name4_65_0_, user1_.id as id1_59_1_, user1_.create_by_id as create_11_59_1_, user1_.create_date as create_d2_59_1_, user1_.last_modified_by_id as last_mo12_59_1_, user1_.last_modified_date as last_mod3_59_1_, user1_.avatar_id as avatar_13_59_1_, user1_.email as email4_59_1_, user1_.enabled as enabled5_59_1_, user1_.gender as gender6_59_1_, user1_.nickname as nickname7_59_1_, user1_.phone as phone8_59_1_, user1_.seller_auth_info_id as seller_14_59_1_, user1_.seller_auth_status as seller_a9_59_1_, user1_.user_ext_id as user_ex15_59_1_, user1_.user_group_id as user_gr16_59_1_, user1_.username as usernam10_59_1_, user2_.id as id1_59_2_, user2_.create_by_id as create_11_59_2_, user2_.create_date as create_d2_59_2_, user2_.last_modified_by_id as last_mo12_59_2_, user2_.last_modified_date as last_mod3_59_2_, user2_.avatar_id as avatar_13_59_2_, user2_.email as email4_59_2_, user2_.enabled as enabled5_59_2_, user2_.gender as gender6_59_2_, user2_.nickname as nickname7_59_2_, user2_.phone as phone8_59_2_, user2_.seller_auth_info_id as seller_14_59_2_, user2_.seller_auth_status as seller_a9_59_2_, user2_.user_ext_id as user_ex15_59_2_, user2_.user_group_id as user_gr16_59_2_, user2_.username as usernam10_59_2_, user3_.id as id1_59_3_, user3_.create_by_id as create_11_59_3_, user3_.create_date as create_d2_59_3_, user3_.last_modified_by_id as last_mo12_59_3_, user3_.last_modified_date as last_mod3_59_3_, user3_.avatar_id as avatar_13_59_3_, user3_.email as email4_59_3_, user3_.enabled as enabled5_59_3_, user3_.gender as gender6_59_3_, user3_.nickname as nickname7_59_3_, user3_.phone as phone8_59_3_, user3_.seller_auth_info_id as seller_14_59_3_, user3_.seller_auth_status as seller_a9_59_3_, user3_.user_ext_id as user_ex15_59_3_, user3_.user_group_id as user_gr16_59_3_, user3_.username as usernam10_59_3_, usergroup4_.id as id1_65_4_, usergroup4_.create_by_id as create_b5_65_4_, usergroup4_.create_date as create_d2_65_4_, usergroup4_.last_modified_by_id as last_mod6_65_4_, usergroup4_.last_modified_date as last_mod3_65_4_, usergroup4_.name as name4_65_4_, user5_.id as id1_59_5_, user5_.create_by_id as create_11_59_5_, user5_.create_date as create_d2_59_5_, user5_.last_modified_by_id as last_mo12_59_5_, user5_.last_modified_date as last_mod3_59_5_, user5_.avatar_id as avatar_13_59_5_, user5_.email as email4_59_5_, user5_.enabled as enabled5_59_5_, user5_.gender as gender6_59_5_, user5_.nickname as nickname7_59_5_, user5_.phone as phone8_59_5_, user5_.seller_auth_info_id as seller_14_59_5_, user5_.seller_auth_status as seller_a9_59_5_, user5_.user_ext_id as user_ex15_59_5_, user5_.user_group_id as user_gr16_59_5_, user5_.username as usernam10_59_5_, authoritie6_.user_group_id as user_gro1_66_6_, authoritie6_.authorities as authorit2_66_6_ from user_group usergroup0_ left outer join user user1_ on usergroup0_.create_by_id=user1_.id left outer join user user2_ on user1_.create_by_id=user2_.id left outer join user user3_ on user1_.last_modified_by_id=user3_.id left outer join user_group usergroup4_ on user1_.user_group_id=usergroup4_.id left outer join user user5_ on usergroup0_.last_modified_by_id=user5_.id left outer join user_group_authorities authoritie6_ on usergroup0_.id=authoritie6_.user_group_id where usergroup0_.id=?
2017-03-06 14:50:01.830 TRACE 30295 --- [nio-8080-exec-3] o.h.cache.redis.client.RedisClient : retrieve cache item. region=hibernate.com.foo.bar.model.item.ItemCategory, key=com.foo.bar.model.item.ItemCategory#1, value={parent=null, lastModifiedDate=2016-12-05 09:31:51.0, lastModifiedBy=1, enabled=true, sonNumber=1, _subclass=com.foo.bar.model.item.ItemCategory, createBy=1, children=1, background=1, slide=0, root=1, name=bar, _lazyPropertiesUnfetched=false, _version=null, createDate=2016-12-05 09:31:28.0}
2017-03-06 14:51:02.165 INFO 30295 --- [nio-8080-exec-3] i.StatisticalLoggingSessionEventListener : Session Metrics {
15435533 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
1405433 nanoseconds spent preparing 2 JDBC statements;
2301936 nanoseconds spent executing 2 JDBC statements;
0 nanoseconds spent executing 0 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
64020073 nanoseconds spent performing 3 L2C hits;
27037450 nanoseconds spent performing 2 L2C misses;
1247578 nanoseconds spent executing 1 flushes (flushing a total of 4 entities and 3 collections);
24403 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections)
}
application.yml:
spring:
  profiles: development
  jpa:
    show-sql: true
    properties:
      hibernate.cache.use_second_level_cache: true
      hibernate.cache.region.factory_class: org.hibernate.cache.redis.hibernate5.SingletonRedisRegionFactory
      hibernate.cache.use_query_cache: true
      hibernate.cache.region_prefix: hibernate
      hibernate.generate_statistics: true
      hibernate.cache.use_structured_entries: true
      redisson-config: classpath:redisson.yml
      hibernate.cache.use_reference_entries: true
      javax.persistence.sharedCache.mode: ENABLE_SELECTIVE

Kafka Consumer Marking the coordinator 2147483647 dead

I am using Kafka server 0.9 with the consumer kafka-client version 0.9 and the kafka-producer 0.8.2.
Everything is working great, except that I am getting a lot of INFO messages on the consumer saying that the coordinator is dead:
2016-02-25 19:30:45.046 INFO 10263 --- [ cdrServer] o.a.k.c.c.internals.AbstractCoordinator : Marking the coordinator 2147483647 dead.
2016-02-25 19:30:45.048 INFO 10263 --- [ cdrServer] o.a.k.c.c.internals.AbstractCoordinator : Marking the coordinator 2147483647 dead.
2016-02-25 19:30:45.049 INFO 10263 --- [ cdrServer] o.a.k.c.c.internals.AbstractCoordinator : Marking the coordinator 2147483647 dead.
2016-02-25 19:30:45.050 INFO 10263 --- [ cdrServer] o.a.k.c.c.internals.AbstractCoordinator : Marking the coordinator 2147483647 dead.
2016-02-25 19:30:45.051 INFO 10263 --- [ cdrServer] o.a.k.c.c.internals.AbstractCoordinator : Marking the coordinator 2147483647 dead.
2016-02-25 19:30:45.052 INFO 10263 --- [ cdrServer] o.a.k.c.c.internals.AbstractCoordinator : Marking the coordinator 2147483647 dead.
2016-02-25 19:30:45.053 INFO 10263 --- [ cdrServer] o.a.k.c.c.internals.AbstractCoordinator : Marking the coordinator 2147483647 dead.
... (the identical line repeats dozens more times within the same second, up to 19:30:45.094)
I also noticed that the producer disconnects and reconnects every 10 minutes, as shown below:
2016-03-12 15:55:36 INFO [pool-1-thread-1] - Fetching metadata from broker id:0,host:192.168.72.30,port:9092 with correlation id 41675 for 1 topic(s) Set(act)
2016-03-12 15:55:36 INFO [pool-1-thread-1] - Connected to 192.168.72.30:9092 for producing
2016-03-12 15:55:36 INFO [pool-1-thread-1] - Disconnecting from 192.168.72.30:9092
2016-03-12 15:55:36 INFO [pool-1-thread-1] - Disconnecting from kafkauk.XXXXXXXXXX.co:9092
2016-03-12 15:55:36 INFO [pool-1-thread-1] - Connected to kafkauk.XXXXXXXXXX.co:9092 for producing
This is my producer configuration:
metadata.broker.list=192.168.72.30:9092
serializer.class=kafka.serializer.StringEncoder
request.required.acks=1
linger.ms=2000
batch.size=500
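For context, keys like metadata.broker.list and serializer.class belong to the old Scala producer API. The sketch below shows how such a producer is typically built from these properties; it is only an illustration under that assumption, and the topic name act plus the message content are placeholders taken from the logs:
import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

Properties producerProps = new Properties();
producerProps.put("metadata.broker.list", "192.168.72.30:9092");
producerProps.put("serializer.class", "kafka.serializer.StringEncoder");
producerProps.put("request.required.acks", "1");

// Build the legacy (pre-0.9) producer and send one record to the "act" topic
Producer<String, String> producer = new Producer<>(new ProducerConfig(producerProps));
producer.send(new KeyedMessage<String, String>("act", "test message"));
producer.close();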
And this is my consumer config:
bootstrap.servers: kafkauk.xxxxxxxx.co:9092
group.id: cdrServer
client.id: cdrServer
enable.auto.commit: true
auto.commit.interval.ms: 1000
session.timeout.ms: 30000
key.deserializer: org.apache.kafka.common.serialization.StringDeserializer
value.deserializer: org.apache.kafka.common.serialization.StringDeserializer
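For reference, a minimal sketch of a new-consumer (0.9+) setup built from these properties; the topic name act comes from the producer logs above, and the polling loop is an assumption about how the consumer is driven:
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

Properties props = new Properties();
props.put("bootstrap.servers", "kafkauk.xxxxxxxx.co:9092");
props.put("group.id", "cdrServer");
props.put("client.id", "cdrServer");
props.put("enable.auto.commit", "true");
props.put("auto.commit.interval.ms", "1000");
props.put("session.timeout.ms", "30000");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

// subscribe() (rather than assign()) keeps group coordination active
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("act"));
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(1000);
    for (ConsumerRecord<String, String> record : records) {
        System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
    }
}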
I could not figure out what these messages mean, and whether I should ignore them or I am missing something in the configuration.
After I changed Kafka to DEBUG level on the consumer, I found the following:
2016-03-13 18:21:55.586 DEBUG 5469 --- [ cdrServer] org.apache.kafka.clients.NetworkClient : Node 2147483647 disconnected.
2016-03-13 18:21:55.586 INFO 5469 --- [ cdrServer] o.a.k.c.c.internals.AbstractCoordinator : Marking the coordinator 2147483647 dead.
2016-03-13 18:21:55.586 DEBUG 5469 --- [ cdrServer] o.a.k.c.c.internals.AbstractCoordinator : Issuing group metadata request to broker 0
2016-03-13 18:21:55.586 DEBUG 5469 --- [ cdrServer] org.apache.kafka.clients.NetworkClient : Sending metadata request ClientRequest(expectResponse=true, callback=null, request=RequestSend(header={api_key=3,api_version=0,correlation_id=183025,client_id=cdrServer}, body={topics=[act]}), isInitiatedByNetworkClient, createdTimeMs=1457893315586, sendTimeMs=0) to node 0
2016-03-13 18:21:55.591 DEBUG 5469 --- [ cdrServer] org.apache.kafka.clients.Metadata : Updated cluster metadata version 296 to Cluster(nodes = [Node(0, kafkauk.xxxxxxxxx.co, 9092)], partitions = [Partition(topic = act, partition = 0, leader = 0, replicas = [0,], isr = [0,]])
2016-03-13 18:21:55.592 DEBUG 5469 --- [ cdrServer] o.a.k.c.c.internals.AbstractCoordinator : Group metadata response ClientResponse(receivedTimeMs=1457893315592, disconnected=false, request=ClientRequest(expectResponse=true, callback=org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler@1e2de777, request=RequestSend(header={api_key=10,api_version=0,correlation_id=183024,client_id=cdrServer}, body={group_id=cdrServer}), createdTimeMs=1457893315586, sendTimeMs=1457893315586), responseBody={error_code=0,coordinator={node_id=0,host=kafkauk.xxxxxxxx.co,port=9092}})
I am not sure it is a network problem, because it happens exactly every 9 minutes.
Update
I found that it is directly related to
connections.max.idle.ms: 300000
Whatever value I set there, the disconnect happens after exactly that interval.
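For what it's worth, the consumer's own default for connections.max.idle.ms is 540000 ms (9 minutes), which matches the interval observed above. If the periodic idle disconnect itself is the concern, one option (a sketch, with an arbitrary value, not a verified fix) is to raise it on the consumer side:
// Hypothetical tweak: raise the consumer's idle-connection timeout (default 540000 ms = 9 minutes)
props.put("connections.max.idle.ms", "900000");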
Marking the coordinator dead happens when there is a network communication error between the consumer client and the coordinator (it can also happen when the coordinator dies and the group needs to rebalance). A variety of situations (offset commit request, fetch offset, etc.) can trigger this issue. I suggest you investigate what is causing these situations.
I have faced the same issue. Finally, after following Shannon's recommendation about TRACE logging, I used:
logging.level.org.apache.kafka=TRACE
and found out that my client was trying to resolve Euler:9092 as the coordinator address... a local hostname!
So I uncommented and changed the listeners and advertised.listeners values in the server.properties file.
It is working now! :-)
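For reference, a rough sketch of the kind of server.properties change involved (the hostname is just the masked name from the logs, used here as a placeholder for the broker's externally resolvable address):
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://kafkauk.xxxxxxxx.co:9092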
In my case the message appeared in the logs when I tried to assign partitions manually. I then read the following notice in the API docs of the new consumer:
It is also possible for the consumer to manually assign specific partitions (similar to the older "simple" consumer) using assign(Collection). In this case, dynamic partition assignment and consumer group coordination will be disabled.
That is, if you have code like this:
import java.util.Arrays;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

// Manual assignment: group coordination is disabled for this consumer
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.assign(Arrays.asList(
        new TopicPartition("topic", 0),
        new TopicPartition("topic", 1)
));
then the message "Marking the coordinator 2147483647 dead" will always appear in your logs.
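In other words, with assign() the message is expected noise. If group-managed assignment is what you actually want, the alternative is to call subscribe() instead of assign() on a fresh consumer; a minimal sketch with the same placeholder topic name:
// Alternative to the snippet above: let the group coordinator assign partitions
KafkaConsumer<String, String> groupConsumer = new KafkaConsumer<>(props);
groupConsumer.subscribe(Arrays.asList("topic"));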
This basically means you are not able to reach Kafka.
In my case I was running Kafka in a Vagrant box, and when I started a VPN it refreshed the Vagrant IP, so the client was no longer able to connect to it.
Possible solution: in this case, stop the VPN and start your Vagrant box again.
This may also be related to a long garbage collection stop-the-world phase. In my case I encountered this message after GC pauses longer than 10 seconds.
This error mostly occurs when there is a conflict between the coordinator and the consumer. First, expose the listener port in server.properties; second, remove all the logs under kafka-logs. Don't forget to restart the broker and ZooKeeper after these steps. That resolved the issue for me.
I faced this issue today and solved it (temporarily, might I add). I've posted an answer here on how I did it.