PyFlink 1.11 not connecting to Confluent Cloud (Kafka) cluster

I am configuring PyFlink to connect to a Confluent Cloud Kafka cluster, using SASL/PLAIN. Below is the code snippet:
""" CREATE TABLE {0} (
`transaction_amt` BIGINT NOT NULL,
`event_id` VARCHAR(64) NOT NULL,
`event_time` TIMESTAMP(6) NOT NULL
)
WITH (
'connector' = 'kafka',
'topic' = '{1}',
'properties.bootstrap.servers' = '{2}',
'properties.group.id' = 'testGroupTFI',
'format' = 'json',
'json.timestamp-format.standard' = 'ISO-8601',
'properties.security.protocol' = 'SASL_SSL',
'properties.sasl.mechanism' = 'PLAIN',
'properties.sasl.jaas.config' = 'org.apache.kafka.common.security.plain.PlainLoginModule required username=\"{3}\" password=\"{4}\";'
) """.format(table_name, stream_name, broker, user, secret)
I am getting this error:
{
"applicationARN": "arn:aws:kinesisanalytics:us-east-2:xxxxxxxxxxx:application/sentiment",
"applicationVersionId": "13",
"locationInformation": "org.apache.flink.runtime.taskmanager.Task.transitionState(Task.java:973)",
"logger": "org.apache.flink.runtime.taskmanager.Task",
"message": "Source: TableSourceScan(table=[[default_catalog, default_database, input_table]], fields=[transaction_amt, event_id, event_time]) -> Sink: Sink(table=[default_catalog.default_database.output_table_msk], fields=[transaction_amt, event_id, event_time]) (3/12) (25a905455865731943be6aa60927a49c) switched from RUNNING to FAILED.",
"messageSchemaVersion": "1",
"messageType": "WARN",
"threadName": "Source: TableSourceScan(table=[[default_catalog, default_database, input_table]], fields=[transaction_amt, event_id, event_time]) -> Sink: Sink(table=[default_catalog.default_database.output_table_msk], fields=[transaction_amt, event_id, event_time]) (3/12)",
"throwableInformation": "org.apache.flink.kafka.shaded.org.apache.kafka.common.KafkaException: Failed to construct kafka producer\n\tat org.apache.flink.kafka.shaded.org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:432)\n\tat org.apache.flink.kafka.shaded.org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:298)\n\tat org.apache.flink.streaming.connectors.kafka.internal.FlinkKafkaInternalProducer.<init>(FlinkKafkaInternalProducer.java:78)\n\tat org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.createProducer(FlinkKafkaProducer.java:1141)\n\tat org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.initProducer(FlinkKafkaProducer.java:1242)\n\tat org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.initNonTransactionalProducer(FlinkKafkaProducer.java:1238)\n\tat org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.beginTransaction(FlinkKafkaProducer.java:940)\n\tat org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.beginTransaction(FlinkKafkaProducer.java:99)\n\tat org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction.beginTransactionInternal(TwoPhaseCommitSinkFunction.java:398)\n\tat org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction.initializeState(TwoPhaseCommitSinkFunction.java:389)\n\tat org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.initializeState(FlinkKafkaProducer.java:1111)\n\tat org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:185)\n\tat org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:167)\n\tat org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)\n\tat org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.initializeOperatorState(StreamOperatorStateHandler.java:106)\n\tat org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:258)\n\tat org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:290)\n\tat org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$0(StreamTask.java:474)\n\tat org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:92)\n\tat org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:470)\n\tat org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:529)\n\tat org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:724)\n\tat org.apache.flink.runtime.taskmanager.Task.run(Task.java:549)\n\tat java.base/java.lang.Thread.run(Thread.java:829)\nCaused by: org.apache.flink.kafka.shaded.org.apache.kafka.common.KafkaException: javax.security.auth.login.LoginException: No LoginModule found for org.apache.kafka.common.security.plain.PlainLoginModule\n\tat org.apache.flink.kafka.shaded.org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:158)\n\tat org.apache.flink.kafka.shaded.org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:146)\n\tat org.apache.flink.kafka.shaded.org.apache.kafka.common.network.ChannelBuilders.clientChannelBuilder(ChannelBuilders.java:67)\n\tat org.apache.flink.kafka.shaded.org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:99)\n\tat 
org.apache.flink.kafka.shaded.org.apache.kafka.clients.producer.KafkaProducer.newSender(KafkaProducer.java:450)\n\tat org.apache.flink.kafka.shaded.org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:421)\n\t... 23 more\nCaused by: javax.security.auth.login.LoginException: No LoginModule found for org.apache.kafka.common.security.plain.PlainLoginModule\n\tat java.base/javax.security.auth.login.LoginContext.invoke(LoginContext.java:731)\n\tat java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:672)\n\tat java.base/javax.security.auth.login.LoginContext$4.run(LoginContext.java:670)\n\tat java.base/java.security.AccessController.doPrivileged(Native Method)\n\tat java.base/javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:670)\n\tat java.base/javax.security.auth.login.LoginContext.login(LoginContext.java:581)\n\tat org.apache.flink.kafka.shaded.org.apache.kafka.common.security.authenticator.AbstractLogin.login(AbstractLogin.java:60)\n\tat org.apache.flink.kafka.shaded.org.apache.kafka.common.security.authenticator.LoginManager.<init>(LoginManager.java:62)\n\tat org.apache.flink.kafka.shaded.org.apache.kafka.common.security.authenticator.LoginManager.acquireLoginManager(LoginManager.java:105)\n\tat org.apache.flink.kafka.shaded.org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:147)\n\t... 28 more\n"
}
I suspect that SASL is not supported by the PyFlink SQL connector for 1.11 or 1.13. Is this correct? Is there a workaround I can use?
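One thing worth checking (an assumption based on the shaded package names visible in the stack trace, not something confirmed in this thread): the bundled SQL connector relocates the Kafka client classes under org.apache.flink.kafka.shaded, so the login module named in sasl.jaas.config may need the shaded class name, e.g.:
'properties.sasl.jaas.config' = 'org.apache.flink.kafka.shaded.org.apache.kafka.common.security.plain.PlainLoginModule required username=\"{3}\" password=\"{4}\";'
With that spelling, the JAAS LoginContext looks up a class that actually exists inside the shaded connector jar, which would explain the "No LoginModule found" failure with the unshaded name.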

Related

PyFlink CDC Connectors Postgres failure

Trying to follow the Flink CDC Connectors Postgres tutorial using PyFlink:
https://ververica.github.io/flink-cdc-connectors/master/content/quickstart/mysql-postgres-tutorial.html
Failing Code
ddl = """
CREATE TABLE shipments (
shipment_id INT,
order_id INT,
origin STRING,
destination STRING,
is_arrived BOOLEAN
) WITH (
'connector' = 'postgres-cdc',
'hostname' = 'localhost',
'port' = '5432',
'username' = 'postgres',
'password' = 'postgres',
'database-name' = 'postgres',
'schema-name' = 'public',
'slot.name' = 'slot2',
'table-name' = 'shipments'
);
"""
table_env.execute_sql(ddl)
table2: Table = table_env.sql_query("SELECT * FROM shipments")
table2.execute().print()
Main Stacktrace
Caused by: io.debezium.DebeziumException: Creation of replication slot failed
at io.debezium.connector.postgresql.PostgresConnectorTask.start(PostgresConnectorTask.java:141)
at io.debezium.connector.common.BaseSourceTask.start(BaseSourceTask.java:130)
at io.debezium.embedded.EmbeddedEngine.run(EmbeddedEngine.java:759)
at io.debezium.embedded.ConvertingEngineBuilder$2.run(ConvertingEngineBuilder.java:188)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.postgresql.util.PSQLException: ERROR: syntax error at or near "CREATE_REPLICATION_SLOT"
Position: 1
Troubleshooting
CDC needs access to Postgres' write-ahead log (WAL). To read the WAL, Postgres needs a replication slot, so you need to set one up in Postgres. I got that done with these commands:
ALTER SYSTEM SET wal_level = logical;
SELECT * FROM pg_create_logical_replication_slot('slot2', 'test_decoding');
Then restart Postgres. (Note that the wal_level change only takes effect after a restart, so if the slot creation fails, restart first and then create the slot.)
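To confirm the slot was created, a quick sanity check (this view is standard Postgres, not specific to the tutorial):
SELECT slot_name, plugin, slot_type, active FROM pg_replication_slots;
This should list slot2 with the test_decoding plugin before you start the Flink job.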
CDC code that gets the replication slot:
jdbcConnection.getReplicationSlotState(connectorConfig.slotName(), connectorConfig.plugin().getPostgresPluginName());
called from:
https://github.com/debezium/debezium/blob/main/debezium-connector-postgres/src/main/java/io/debezium/connector/postgresql/PostgresConnectorTask.java
When no existing slot is found, it tries to create the replication slot, and that is where the exception is thrown.
Possible version issue
Maybe the problem is that I am running Postgres 13 and Ververica CDC might only support up to Postgres 12.

Getting Ignite Exception in Kafka Connector

I am running a topology of 3 servers and 1 client.
One of the server nodes is run by the Kafka connector process.
The client is not able to send any message to that Kafka connector Ignite node.
This is the exception:
SEVERE: Failed to read message [msg=GridIoMessage [plc=0, topic=null, topicOrd=-1, ordered=false, timeout=0, skipOnTimeout=false, msg=null], buf=java.nio.DirectByteBuffer[pos=4 lim=251 cap=32768], reader=DirectMessageReader [state=DirectMessageState [pos=0, stack=[StateItem [stream=DirectByteBufferStreamImplV2 [baseOff=140323292482432, arrOff=-1, tmpArrOff=0, valReadBytes=0, tmpArrBytes=0, msgTypeDone=false, msg=null, mapIt=null, it=null, arrPos=-1, keyDone=false, readSize=-1, readItems=0, prim=0, primShift=0, uuidState=0, uuidMost=0, uuidLeast=0, uuidLocId=0], state=0], null, null, null, null, null, null, null, null, null]], protoVer=3, lastRead=false], ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=3, bytesRcvd=251, bytesSent=0, bytesRcvd0=251, bytesSent0=0, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-3, igniteInstanceName=null, finished=false, heartbeatTs=1604581761046, hashCode=1782557810, interrupted=false, runner=grid-nio-worker-tcp-comm-3-#139]]], writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=4 lim=251 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=9, resendCnt=0, rcvCnt=7, sentCnt=9, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode [id=3f63a8d7-8964-4a4b-89c1-124d8eaba14a, consistentId=3f63a8d7-8964-4a4b-89c1-124d8eaba14a, addrs=ArrayList [127.0.0.1, 172.20.50.222], sockAddrs=HashSet [/127.0.0.1:0, /172.20.50.222:0], discPort=0, order=4, intOrder=4, lastExchangeTime=1604581749120, loc=false, ver=8.7.10#20191227-sha1:c481441d, isClient=true], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=2, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=9, resendCnt=0, rcvCnt=7, sentCnt=9, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode [id=3f63a8d7-8964-4a4b-89c1-124d8eaba14a, consistentId=3f63a8d7-8964-4a4b-89c1-124d8eaba14a, addrs=ArrayList [127.0.0.1, 172.20.50.222], sockAddrs=HashSet [/127.0.0.1:0, /172.20.50.222:0], discPort=0, order=4, intOrder=4, lastExchangeTime=1604581749120, loc=false, ver=8.7.10#20191227-sha1:c481441d, isClient=true], connected=false, connectCnt=1, queueLimit=4096, reserveCnt=2, pairedConnections=false], outboundMessagesQueueSizeMetric=o.a.i.i.processors.metric.impl.AtomicLongMetric#69a257d1, super=GridNioSessionImpl [locAddr=/172.20.52.38:54412, rmtAddr=/172.20.50.222:47100, createTime=1604581761046, closeTime=0, bytesSent=0, bytesRcvd=251, bytesSent0=0, bytesRcvd0=251, sndSchedTime=1604581761046, lastSndTime=1604581761046, lastRcvTime=1604581761046, readsPaused=false, filterChain=FilterChain[filters=[GridNioTracerFilter [tracer=GridProcessorAdapter []], GridNioCodecFilter [parser=o.a.i.i.util.nio.GridDirectParser#1766eecd, directMode=true], GridConnectionBytesVerifyFilter], accepted=false, markedForClose=false]]]
class org.apache.ignite.IgniteException: Invalid message type: -33
at org.apache.ignite.internal.managers.communication.GridIoMessageFactory.create(GridIoMessageFactory.java:1106)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$5.create(TcpCommunicationSpi.java:2407)
at org.apache.ignite.internal.direct.stream.v2.DirectByteBufferStreamImplV2.readMessage(DirectByteBufferStreamImplV2.java:1175)
at org.apache.ignite.internal.direct.DirectMessageReader.readMessage(DirectMessageReader.java:335)
at org.apache.ignite.internal.managers.communication.GridIoMessage.readFrom(GridIoMessage.java:270)
at org.apache.ignite.internal.util.nio.GridDirectParser.decode(GridDirectParser.java:89)
at org.apache.ignite.internal.util.nio.GridNioCodecFilter.onMessageReceived(GridNioCodecFilter.java:112)
at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:108)
at org.apache.ignite.internal.util.nio.GridConnectionBytesVerifyFilter.onMessageReceived(GridConnectionBytesVerifyFilter.java:87)
at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:108)
at org.apache.ignite.internal.util.nio.GridNioServer$HeadFilter.onMessageReceived(GridNioServer.java:3681)
at org.apache.ignite.internal.util.nio.GridNioFilterChain.onMessageReceived(GridNioFilterChain.java:174)
at org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processRead(GridNioServer.java:1360)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2472)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2239)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1880)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
at java.lang.Thread.run(Thread.java:748)
Direct type -33 is GridH2QueryRequest.
Are you sure that your Kafka connector node has ignite-indexing in its classpath? Try adding it explicitly.
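For a binary Ignite distribution, enabling the module is typically just a matter of moving it out of the optional folder before starting the connector node (the path below assumes a standard $IGNITE_HOME layout):
cp -r $IGNITE_HOME/libs/optional/ignite-indexing $IGNITE_HOME/libs/
If the connector node runs from a Maven project instead, add the ignite-indexing artifact with the same version as the rest of your Ignite dependencies.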

Field does not exist on transformations to extract key with Debezium

I am trying to create a Debezium MySQL connector with a transformation to extract the key.
Before key transformations:
create source connector mysql with(
"connector.class" = 'io.debezium.connector.mysql.MySqlConnector',
"database.hostname" = 'mysql',
"tasks.max" = '1',
"database.port" = '3306',
"database.user" = 'debezium',
"database.password" = 'dbz',
"database.server.id" = '42',
"database.server.name" = 'before',
"table.whitelist" = 'deepprices.deepprices',
"database.history.kafka.bootstrap.servers" = 'kafka:29092',
"database.history.kafka.topic" = 'dbz.deepprices',
"include.schema.changes" = 'true',
"transforms" = 'unwrap',
"transforms.unwrap.type" = 'io.debezium.transforms.UnwrapFromEnvelope');
Topic results are:
> rowtime: 2020/05/20 16:47:23.354 Z, key: [St#5778462697648631933/8247607644536792125], value: {"id": "P195910", "price": "1511.64"}
When the key.converter is set to JSON, Key becomes {"id": "P195910"}
So, I want to extract id from key and make it a string key:
Expected results :
rowtime: 2020/05/20 16:47:23.354 Z,
key: 'P195910',
value: {"id": "P195910", "price": "1511.64"}
While trying to use a transformation with ExtractField or ValueToKey I get:
DataException: Field does not exist: id
My attempt, with the instruction containing ValueToKey:
create source connector mysql with(
"connector.class" = 'io.debezium.connector.mysql.MySqlConnector',
"database.hostname" = 'mysql',
"tasks.max" = '1',
"database.port" = '3306',
"database.user" = 'debezium',
"database.password" = 'dbz',
"database.server.id" = '42',
"database.server.name" = 'after',
"table.whitelist" = 'deepprices.deepprices',
"database.history.kafka.bootstrap.servers" = 'kafka:29092',
"database.history.kafka.topic" = 'dbz.deepprices',
"include.schema.changes" = 'true',
"key.converter" = 'org.apache.kafka.connect.json.JsonConverter',
"key.converter.schemas.enable" = 'TRUE',
"value.converter" = 'org.apache.kafka.connect.json.JsonConverter',
"value.converter.schemas.enable" = 'TRUE',
"transforms" = 'unwrap,createkey',
"transforms.unwrap.type" = 'io.debezium.transforms.UnwrapFromEnvelope',
"transforms.createkey.type" = 'org.apache.kafka.connect.transforms.ValueToKey',
"transforms.createkey.fields" = 'id'
);
This causes the following error in my Kafka Connect log:
Caused by: org.apache.kafka.connect.errors.DataException: Field does not exist: id
at org.apache.kafka.connect.transforms.ValueToKey.applyWithSchema(ValueToKey.java:89)
at org.apache.kafka.connect.transforms.ValueToKey.apply(ValueToKey.java:67)
Changing the transformation type from UnwrapFromEnvelope to ExtractNewRecordState solved the issue on the Debezium MySQL CDC connector, version 1.1.0:
"transforms.unwrap.type" = 'io.debezium.transforms.ExtractNewRecordState'
Since you're using ksqlDB here you'll want to set your source connector to write the key as a String:
key.converter=org.apache.kafka.connect.storage.StringConverter
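Putting the two fixes together, a sketch of the relevant connector settings (the transform names createkey and extractkey are illustrative, not from the original question):
"key.converter" = 'org.apache.kafka.connect.storage.StringConverter',
"transforms" = 'unwrap,createkey,extractkey',
"transforms.unwrap.type" = 'io.debezium.transforms.ExtractNewRecordState',
"transforms.createkey.type" = 'org.apache.kafka.connect.transforms.ValueToKey',
"transforms.createkey.fields" = 'id',
"transforms.extractkey.type" = 'org.apache.kafka.connect.transforms.ExtractField$Key',
"transforms.extractkey.field" = 'id'
ValueToKey copies the id field from the unwrapped value into the key as a struct, and ExtractField$Key then pulls out that single field, leaving the plain string key 'P195910' shown in the expected results.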

Kafka jdbc connect sink: Is it possible to use pk.fields for fields in value and key?

The issue I'm having is that when the JDBC sink connector consumes a Kafka message, the key fields are null when writing to the DB.
However, when I consume directly through kafka-avro-console-consumer, I can see the key and value fields with their values, because I use this config: --property print.key=true.
ASK: is there a way to make sure that the JDBC connector processes the message key field values?
console kafka-avro-console-consumer config:
/opt/confluent-5.4.1/bin/kafka-avro-console-consumer \
--bootstrap-server "localhost:9092" \
--topic equipmentidentifier.persist \
--property parse.key=true \
--property key.separator=~ \
--property print.key=true \
--property schema.registry.url="http://localhost:8081" \
--property key.schema=[$KEY_SCHEMA] \
--property value.schema=[$IDENTIFIER_SCHEMA,$VALUE_SCHEMA]
error:
org.apache.kafka.connect.errors.RetriableException: java.sql.SQLException: java.sql.BatchUpdateException: Batch entry 0 INSERT INTO "assignment_table" ("created_date","customer","id_type","id_value") VALUES('1970-01-01 03:25:44.567+00'::timestamp,123,'BILL_OF_LADING','BOL-123') was aborted: ERROR: null value in column "equipment_identifier_type" violates not-null constraint
Detail: Failing row contains (null, null, null, null, 1970-01-01 03:25:44.567, 123, id, 56). Call getNextException to see other errors in the batch.
org.postgresql.util.PSQLException: ERROR: null value in column "equipment_identifier_type" violates not-null constraint
Sink config:
tasks.max=1
topics=assignment
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
connection.url=jdbc:postgresql://localhost:5432/db
connection.user=test
connection.password=test
table.name.format=assignment_table
auto.create=false
insert.mode=insert
pk.fields=customer,equip_Type,equip_Value,id_Type,id_Value,cpId
transforms=flattenKey,flattenValue
transforms.flattenKey.type=org.apache.kafka.connect.transforms.Flatten$Key
transforms.flattenKey.delimiter=_
transforms.flattenValue.type=org.apache.kafka.connect.transforms.Flatten$Value
transforms.flattenValue.delimiter=_
Kafka key:
{
"assignmentKey": {
"cpId": {
"long": 1001
},
"equip": {
"Identifier": {
"type": "eq",
"value": "eq_45"
}
},
"vendorId": {
"string": "vendor"
}
}
}
Kafka value:
{
"assigmentValue": {
"id": {
"Identifier": {
"type": "id",
"value": "56"
}
},
"timestamp": {
"long": 1234456756
},
"customer": {
"long": 123
}
}
}
You need to tell the connector to use fields from the key, because by default it won't.
pk.mode=record_key
However, you need to use fields from either the Key or the Value, not both, as you have in your config currently:
pk.fields=customer,equip_Type,equip_Value,id_Type,id_Value,cpId
If you set pk.mode=record_key then pk.fields will refer to the fields in the message key.
Ref: https://docs.confluent.io/current/connect/kafka-connect-jdbc/sink-connector/sink_config_options.html#sink-pk-config-options
See also https://rmoff.dev/kafka-jdbc-video
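As a concrete sketch (the field names below are an assumption about what Flatten$Key with delimiter _ produces from the question's nested key schema):
pk.mode=record_key
pk.fields=assignmentKey_cpId,assignmentKey_vendorId
Any columns that must come from the value (customer, id_type, id_value, ...) then have to be present in the value schema itself, since record_key only reads fields from the key.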

CDC with WSO2 Streaming Integrator and Postgres DB

I am trying to set up Change Data Capture (CDC) between WSO2 Streaming Integrator and a local Postgres DB.
I have added the Postgres Driver (v42.2.5) to SI_HOME/lib and I am able to read data from the database from a Siddhi application.
I am following the CDCWithListeningMode example to implement CDC and I am using pgoutput as the logical decoding plugin. But when I run the application I get the following log.
[2020-04-23_19-02-37_460] INFO {org.apache.kafka.connect.json.JsonConverterConfig} - JsonConverterConfig values:
converter.type = key
schemas.cache.size = 1000
schemas.enable = true
[2020-04-23_19-02-37_461] INFO {org.apache.kafka.connect.json.JsonConverterConfig} - JsonConverterConfig values:
converter.type = value
schemas.cache.size = 1000
schemas.enable = false
[2020-04-23_19-02-37_461] INFO {io.debezium.embedded.EmbeddedEngine$EmbeddedConfig} - EmbeddedConfig values:
access.control.allow.methods =
access.control.allow.origin =
bootstrap.servers = [localhost:9092]
header.converter = class org.apache.kafka.connect.storage.SimpleHeaderConverter
internal.key.converter = class org.apache.kafka.connect.json.JsonConverter
internal.value.converter = class org.apache.kafka.connect.json.JsonConverter
key.converter = class org.apache.kafka.connect.json.JsonConverter
listeners = null
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
offset.flush.interval.ms = 60000
offset.flush.timeout.ms = 5000
offset.storage.file.filename =
offset.storage.partitions = null
offset.storage.replication.factor = null
offset.storage.topic =
plugin.path = null
rest.advertised.host.name = null
rest.advertised.listener = null
rest.advertised.port = null
rest.host.name = null
rest.port = 8083
ssl.client.auth = none
task.shutdown.graceful.timeout.ms = 5000
value.converter = class org.apache.kafka.connect.json.JsonConverter
[2020-04-23_19-02-37_516] INFO {io.debezium.connector.common.BaseSourceTask} - offset.storage = io.siddhi.extension.io.cdc.source.listening.InMemoryOffsetBackingStore
[2020-04-23_19-02-37_517] INFO {io.debezium.connector.common.BaseSourceTask} - database.server.name = localhost_5432
[2020-04-23_19-02-37_517] INFO {io.debezium.connector.common.BaseSourceTask} - database.port = 5432
[2020-04-23_19-02-37_517] INFO {io.debezium.connector.common.BaseSourceTask} - table.whitelist = SweetProductionTable
[2020-04-23_19-02-37_517] INFO {io.debezium.connector.common.BaseSourceTask} - cdc.source.object = 1716717434
[2020-04-23_19-02-37_517] INFO {io.debezium.connector.common.BaseSourceTask} - database.hostname = localhost
[2020-04-23_19-02-37_518] INFO {io.debezium.connector.common.BaseSourceTask} - database.password = ********
[2020-04-23_19-02-37_518] INFO {io.debezium.connector.common.BaseSourceTask} - name = CDCWithListeningModeinsertSweetProductionStream
[2020-04-23_19-02-37_518] INFO {io.debezium.connector.common.BaseSourceTask} - server.id = 6140
[2020-04-23_19-02-37_519] INFO {io.debezium.connector.common.BaseSourceTask} - database.history = io.debezium.relational.history.FileDatabaseHistory
[2020-04-23_19-02-38_103] INFO {io.debezium.connector.postgresql.PostgresConnectorTask} - user 'user_name' connected to database 'db_name' on PostgreSQL 11.5, compiled by Visual C++ build 1914, 64-bit with roles:
role 'user_name' [superuser: false, replication: true, inherit: true, create role: false, create db: false, can log in: true] (Encoded)
[2020-04-23_19-02-38_104] INFO {io.debezium.connector.postgresql.PostgresConnectorTask} - No previous offset found
[2020-04-23_19-02-38_104] INFO {io.debezium.connector.postgresql.PostgresConnectorTask} - Taking a new snapshot of the DB and streaming logical changes once the snapshot is finished...
[2020-04-23_19-02-38_105] INFO {io.debezium.util.Threads} - Requested thread factory for connector PostgresConnector, id = localhost_5432 named = records-snapshot-producer
[2020-04-23_19-02-38_105] INFO {io.debezium.util.Threads} - Requested thread factory for connector PostgresConnector, id = localhost_5432 named = records-stream-producer
[2020-04-23_19-02-38_293] INFO {io.debezium.connector.postgresql.connection.PostgresConnection} - Obtained valid replication slot ReplicationSlot [active=false, latestFlushedLSN=null]
[2020-04-23_19-02-38_704] ERROR {io.siddhi.core.stream.input.source.Source} - Error on 'CDCWithListeningMode'. Connection to the database lost. Error while connecting at Source 'cdc' at 'insertSweetProductionStream'. Will retry in '5 sec'. (Encoded)
io.siddhi.core.exception.ConnectionUnavailableException: Connection to the database lost.
at io.siddhi.extension.io.cdc.source.CDCSource.lambda$connect$1(CDCSource.java:424)
at io.debezium.embedded.EmbeddedEngine.run(EmbeddedEngine.java:793)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.kafka.connect.errors.ConnectException: Cannot create replication connection
at io.debezium.connector.postgresql.connection.PostgresReplicationConnection.<init>(PostgresReplicationConnection.java:87)
at io.debezium.connector.postgresql.connection.PostgresReplicationConnection.<init>(PostgresReplicationConnection.java:38)
at io.debezium.connector.postgresql.connection.PostgresReplicationConnection$ReplicationConnectionBuilder.build(PostgresReplicationConnection.java:362)
at io.debezium.connector.postgresql.PostgresTaskContext.createReplicationConnection(PostgresTaskContext.java:65)
at io.debezium.connector.postgresql.RecordsStreamProducer.<init>(RecordsStreamProducer.java:81)
at io.debezium.connector.postgresql.RecordsSnapshotProducer.<init>(RecordsSnapshotProducer.java:70)
at io.debezium.connector.postgresql.PostgresConnectorTask.createSnapshotProducer(PostgresConnectorTask.java:133)
at io.debezium.connector.postgresql.PostgresConnectorTask.start(PostgresConnectorTask.java:86)
at io.debezium.connector.common.BaseSourceTask.start(BaseSourceTask.java:45)
at io.debezium.embedded.EmbeddedEngine.run(EmbeddedEngine.java:677)
... 3 more
Caused by: io.debezium.jdbc.JdbcConnectionException: ERROR: could not access file "decoderbufs": No such file or directory
at io.debezium.connector.postgresql.connection.PostgresReplicationConnection.initReplicationSlot(PostgresReplicationConnection.java:145)
at io.debezium.connector.postgresql.connection.PostgresReplicationConnection.<init>(PostgresReplicationConnection.java:79)
... 12 more
Caused by: org.postgresql.util.PSQLException: ERROR: could not access file "decoderbufs": No such file or directory
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2440)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2183)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:308)
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365)
at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:307)
at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:293)
at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:270)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:266)
at org.postgresql.replication.fluent.logical.LogicalCreateSlotBuilder.make(LogicalCreateSlotBuilder.java:48)
at io.debezium.connector.postgresql.connection.PostgresReplicationConnection.initReplicationSlot(PostgresReplicationConnection.java:108)
... 13 more
Debezium defaults to the decoderbufs plugin, hence "could not access file "decoderbufs": No such file or directory".
According to this answer, the issue is due to the configuration of the decoderbufs plugin.
Details
Postgres - 11.4
siddhi-cdc-io - 2.0.3
Debezium - 0.8.3
How do I configure the embedded debezium engine to use the pgoutput plugin? Will changing this configuration fix the error?
Please help me with this issue. I have not found any resources that can help me.
You either need to update Debezium to the latest 1.1 version - this will enable you to use the pgoutput plugin via the plugin.name config option - or you need to deploy (and maybe build) the decoderbufs.so library on your PostgreSQL server.
I'd recommend the former, as 0.8.3 is a very old version.
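If you do upgrade, the relevant setting is the connector's plugin.name property (shown here as a plain Debezium property; how to pass it through the Siddhi extension depends on the extension version):
plugin.name=pgoutput
With pgoutput there is no native library to install on the database server, since that plugin ships with Postgres 10 and later.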
I observed this behavior with PostgreSQL 12 when I tried to do CDC with pgoutput logical decoding output plug-in. It seems like even though I configured the database with pgoutput, the siddhi extension is trying to make the connection using "decoderbufs" as decoding plug-in.
When I tried configuring decoderbufs as the logical decoding output plug-in at the database level, I was able to use the Siddhi io extension without any issue.
It seems like for now, Siddhi io CDC only supports decoderbufs logical decoding output plug-in with PostgreSQL.