Debezium Postgres ERROR: parameter "include-unchanged-toast" was deprecated

I am currently load testing the Debezium Postgres connector to find out whether it can handle a very large volume (billions) of change log records in Aurora Postgres.
When I inserted 1 million records into the Postgres table, the Debezium Postgres connector failed with the following error:
org.apache.kafka.connect.errors.ConnectException: An exception ocurred in the change event producer. This connector will be stopped.
at io.debezium.connector.base.ChangeEventQueue.throwProducerFailureIfPresent(ChangeEventQueue.java:170)
at io.debezium.connector.base.ChangeEventQueue.poll(ChangeEventQueue.java:151)
at io.debezium.connector.postgresql.PostgresConnectorTask.poll(PostgresConnectorTask.java:188)
at org.apache.kafka.connect.runtime.WorkerSourceTask.poll(WorkerSourceTask.java:259)
at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:226)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.postgresql.util.PSQLException: ERROR: parameter "include-unchanged-toast" was deprecated
Where: slot "wal2json_dbz5", output plugin "wal2json", in the startup callback
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2440)
at org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:1116)
at org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1035)
at org.postgresql.core.v3.CopyDualImpl.readFromCopy(CopyDualImpl.java:41)
at org.postgresql.core.v3.replication.V3PGReplicationStream.receiveNextData(V3PGReplicationStream.java:155)
at org.postgresql.core.v3.replication.V3PGReplicationStream.readInternal(V3PGReplicationStream.java:124)
at org.postgresql.core.v3.replication.V3PGReplicationStream.readPending(V3PGReplicationStream.java:78)
at io.debezium.connector.postgresql.connection.PostgresReplicationConnection$1.readPending(PostgresReplicationConnection.java:401)
at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.execute(PostgresStreamingChangeEventSource.java:94)
at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:91)
It seems that include-unchanged-toast is no longer supported. Is there a workaround for this issue?

You can either get Debezium fixed or use an older version of wal2json from before the option was removed.
The Git snapshot of wal2json from just before the option was removed is here.
Be warned, though, that the option was removed for a good reason.

Related

Running multiple Debezium connectors on the same source MariaDB

We have multiple MariaDB schemas and run two Debezium connectors for each of them. Everything runs fine for a while, but every 1-2 weeks or so a random connector fails with this error:
2022-10-31 06:18:55,106 ERROR MySQL|scheme_1|binlog Error during binlog processing. Last offset stored = {transaction_id=null, ts_sec=1667155787, file=mysql-bin.075628, pos=104509320, server_id=1, event=32}, binlog reader near position = mysql-bin.075628/300573885 [io.debezium.connector.mysql.MySqlStreamingChangeEventSource]
2022-10-31 06:18:55,107 ERROR MySQL|scheme_1|binlog Producer failure [io.debezium.pipeline.ErrorHandler]
io.debezium.DebeziumException: Connection reset
at io.debezium.connector.mysql.MySqlStreamingChangeEventSource.wrap(MySqlStreamingChangeEventSource.java:1189)
at io.debezium.connector.mysql.MySqlStreamingChangeEventSource$ReaderThreadLifecycleListener.onCommunicationFailure(MySqlStreamingChangeEventSource.java:1234)
at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:980)
at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:599)
at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:857)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.net.SocketException: Connection reset
at java.base/java.net.SocketInputStream.read(SocketInputStream.java:186)
at java.base/java.net.SocketInputStream.read(SocketInputStream.java:140)
at com.github.shyiko.mysql.binlog.io.BufferedSocketInputStream.read(BufferedSocketInputStream.java:59)
at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.readWithinBlockBoundaries(ByteArrayInputStream.java:261)
at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.read(ByteArrayInputStream.java:245)
at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.fill(ByteArrayInputStream.java:112)
at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.read(ByteArrayInputStream.java:105)
at com.github.shyiko.mysql.binlog.BinaryLogClient.readPacketSplitInChunks(BinaryLogClient.java:995)
at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:953)
... 3 more
2022-10-31 06:18:55,113 INFO MySQL|scheme_1|binlog Stopped reading binlog after 0 events, last recorded offset: {transaction_id=null, ts_sec=1667155787, file=mysql-bin.075628, pos=104509320, server_id=1, event=32} [io.debezium.connector.mysql.MySqlStreamingChangeEventSource]
2022-10-31 06:18:55,123 ERROR || WorkerSourceTask{id=scheme_1-connector-1666100046785939106-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted [org.apache.kafka.connect.runtime.WorkerTask]
org.apache.kafka.connect.errors.ConnectException: An exception occurred in the change event producer. This connector will be stopped.
at io.debezium.pipeline.ErrorHandler.setProducerThrowable(ErrorHandler.java:50)
at io.debezium.connector.mysql.MySqlStreamingChangeEventSource$ReaderThreadLifecycleListener.onCommunicationFailure(MySqlStreamingChangeEventSource.java:1234)
at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:980)
at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:599)
at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:857)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: io.debezium.DebeziumException: Connection reset
at io.debezium.connector.mysql.MySqlStreamingChangeEventSource.wrap(MySqlStreamingChangeEventSource.java:1189)
... 5 more
Caused by: java.net.SocketException: Connection reset
at java.base/java.net.SocketInputStream.read(SocketInputStream.java:186)
at java.base/java.net.SocketInputStream.read(SocketInputStream.java:140)
at com.github.shyiko.mysql.binlog.io.BufferedSocketInputStream.read(BufferedSocketInputStream.java:59)
at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.readWithinBlockBoundaries(ByteArrayInputStream.java:261)
at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.read(ByteArrayInputStream.java:245)
at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.fill(ByteArrayInputStream.java:112)
at com.github.shyiko.mysql.binlog.io.ByteArrayInputStream.read(ByteArrayInputStream.java:105)
at com.github.shyiko.mysql.binlog.BinaryLogClient.readPacketSplitInChunks(BinaryLogClient.java:995)
at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:953)
... 3 more
2022-10-31 06:18:55,132 INFO || Stopping down connector [io.debezium.connector.common.BaseSourceTask]
This must be related to the fact that we have two connectors attached, because there are no problems when there is one connector per schema.
The MariaDB server didn't go down: we have another connector on the same server and it wasn't affected.

It seems unlikely that two independent connectors would crash at exactly the same binlog position because of each other's presence.
ts_sec=1667155787, file=mysql-bin.075628, pos=104509320, server_id=1, event=32
Take the output of mariadb-binlog --start-position=104509320 mysql-bin.075628 from that position (just one full entry is probably sufficient) and raise a bug report (if one doesn't already exist).
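The command above can be derived mechanically from the offset that Debezium logs. The following is a minimal sketch, assuming the offset is available as the comma-separated key=value string shown in the log; the helper name binlog_dump_command is hypothetical and not part of Debezium or MariaDB.

```python
import shlex


def binlog_dump_command(offset: str, tool: str = "mariadb-binlog") -> list:
    """Build the binlog dump command from a logged Debezium offset string.

    `offset` is expected to look like the text in the log, for example
    "ts_sec=..., file=mysql-bin.075628, pos=104509320, server_id=1, event=32".
    Surrounding braces, if present, are ignored.
    """
    cleaned = offset.strip().strip("{}")
    # Split "k=v, k=v" into a dict; values stay strings.
    fields = dict(part.strip().split("=", 1) for part in cleaned.split(","))
    # int() validates that the position is numeric before it goes on the command line.
    return [tool, f"--start-position={int(fields['pos'])}", fields["file"]]


if __name__ == "__main__":
    logged = "ts_sec=1667155787, file=mysql-bin.075628, pos=104509320, server_id=1, event=32"
    print(shlex.join(binlog_dump_command(logged)))
```

The sketch only assembles the command; it has to be run on a host that can read the binlog file, and the output should be trimmed to the first full entry before it is attached to a bug report.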

Kafka Connect 'ExtractField$Key' SMT results in 'Unknown Field' error

I have a Debezium connector (running on ksqlDB-server) that streams values from SQL Server CDC tables to Kafka topics. I'm trying to transform the key of my message from JSON to an integer value. An example key looks like this: {"InternalID":11117}, and I want to represent it as just the number 11117. According to the Kafka Connect documentation this should be fairly easy with the ExtractField SMT. However, when I configure my connector to use this transform, I get the error Caused by: java.lang.IllegalArgumentException: Unknown field: InternalID.
Connector config:
CREATE SOURCE CONNECTOR properties_sql_connector WITH (
'connector.class'= 'io.debezium.connector.sqlserver.SqlServerConnector',
'database.hostname'= 'propertiessql',
'database.port'= '1433',
'database.user'= 'XXX',
'database.password'= 'XXX',
'database.dbname'= 'Properties',
'database.server.name'= 'properties',
'table.exclude.list'= 'dbo.__EFMigrationsHistory',
'database.history.kafka.bootstrap.servers'= 'kafka:9091',
'database.history.kafka.topic'= 'dbhistory.properties',
'key.converter.schemas.enable'= 'false',
'transforms'= 'unwrap,extractField',
'transforms.unwrap.type'= 'io.debezium.transforms.ExtractNewRecordState',
'transforms.unwrap.delete.handling.mode'= 'none',
'transforms.extractField.type'= 'org.apache.kafka.connect.transforms.ExtractField$Key',
'transforms.extractField.field'= 'InternalID',
'key.converter'= 'org.apache.kafka.connect.json.JsonConverter');
Error details:
--------------------------------------------------------------------------------------------------------------------------------------
0 | FAILED | org.apache.kafka.connect.errors.ConnectException: Tolerance exceeded in error handler
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:223)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:149)
at org.apache.kafka.connect.runtime.TransformationChain.apply(TransformationChain.java:50)
at org.apache.kafka.connect.runtime.WorkerSourceTask.sendRecords(WorkerSourceTask.java:355)
at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:258)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:188)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:243)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.IllegalArgumentException: Unknown field: InternalID
at org.apache.kafka.connect.transforms.ExtractField.apply(ExtractField.java:65)
at org.apache.kafka.connect.runtime.TransformationChain.lambda$apply$0(TransformationChain.java:50)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:173)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:207)
... 11 more
Any ideas why this transform is failing? Am I missing some configuration?
When the extractField transform is removed, the key of my message looks like the one above: {"InternalID":11117}

In order to extract a named field from JSON, you'll need schemas.enable = 'true' for that converter (for the key, that is 'key.converter.schemas.enable').
For any data that's not sourced from Debezium, that requires the JSON to carry a schema as part of the event.
Or, if you're using the Schema Registry, switch to a converter that uses it, and it should work.
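To make "a schema as part of the event" concrete: with schemas.enable set to true, the JSON converter wraps each key or value in an envelope with a "schema" part and a "payload" part. The sketch below is plain Python, not Kafka Connect code; the schema contents are an illustrative assumption for the key in the question, and extract_key_field is a hypothetical helper that only mimics what pulling one named field out of such a key amounts to.

```python
import json

# Envelope shape used by the JSON converter when schemas.enable=true.
# The field list is an assumption based on the example key {"InternalID":11117}.
key_with_schema = json.dumps({
    "schema": {
        "type": "struct",
        "optional": False,
        "fields": [{"field": "InternalID", "type": "int32", "optional": False}],
    },
    "payload": {"InternalID": 11117},
})


def extract_key_field(raw_key: str, field: str):
    """Return one named field from a schema/payload envelope (illustration only)."""
    envelope = json.loads(raw_key)
    declared = {f["field"] for f in envelope["schema"]["fields"]}
    if field not in declared:
        # Mirrors the wording of the error in the question.
        raise ValueError(f"Unknown field: {field}")
    return envelope["payload"][field]


if __name__ == "__main__":
    print(extract_key_field(key_with_schema, "InternalID"))
```

The point of the envelope is that the field names and types travel with the data, so a consumer or transform can look a field up by name instead of guessing at the structure.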

debezium SERVER app could not access file "decoderbufs" using postgres 11

I'm trying to set up a Debezium Server with Postgres and AWS Kinesis following these instructions: https://debezium.io/documentation/reference/stable/operations/debezium-server.html
and I hit the following issue when executing sh run.sh:
io.debezium.DebeziumException: Creation of replication slot failed
at io.debezium.connector.postgresql.PostgresConnectorTask.start(PostgresConnectorTask.java:143)
at io.debezium.connector.common.BaseSourceTask.start(BaseSourceTask.java:130)
at io.debezium.embedded.EmbeddedEngine.run(EmbeddedEngine.java:759)
at io.debezium.embedded.ConvertingEngineBuilder$2.run(ConvertingEngineBuilder.java:188)
at io.debezium.server.DebeziumServer.lambda$start$1(DebeziumServer.java:147)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:830)
Caused by: org.postgresql.util.PSQLException: ERROR: could not access file "decoderbufs": No such file or directory
I found a suggestion to add the plugin.name=pgoutput property to the configuration file of the Postgres connector, but that works only for the Debezium connector, not for the Debezium Server app.
My Debezium Server application.properties file:
debezium.sink.type=kinesis
debezium.sink.kinesis.region=us-east-1
debezium.source.connector.class=io.debezium.connector.postgresql.PostgresConnector
debezium.source.offset.storage.file.filename=data/offsets.dat
debezium.source.offset.flush.interval.ms=0
debezium.source.database.hostname=localhost
debezium.source.database.port=5432
debezium.source.database.user=postgres
debezium.source.database.password=postgres
debezium.source.database.dbname=dbzm
debezium.source.database.server.name=debezium_cdc
debezium.source.schema.include.list=business_view
debezium.source.table.include.list=inventory
quarkus.log.console.json=false
debezium.snapshot.new.tables=parallel

This variant of the property works fine with Debezium Server:
debezium.source.plugin.name=pgoutput
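The pattern here is that connector properties are passed to Debezium Server under the debezium.source. prefix, as in the application.properties file in the question. A minimal sketch of that translation follows; the function names are hypothetical helpers, not part of Debezium.

```python
SOURCE_PREFIX = "debezium.source."


def to_server_properties(connector_props: dict) -> dict:
    """Prefix plain connector properties so Debezium Server hands them to the source connector.

    Keys that already carry the prefix are left unchanged.
    """
    return {
        (key if key.startswith(SOURCE_PREFIX) else SOURCE_PREFIX + key): value
        for key, value in connector_props.items()
    }


def render_properties(props: dict) -> str:
    """Render a dict as key=value lines suitable for application.properties."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    print(render_properties(to_server_properties({"plugin.name": "pgoutput"})))
```

Sink settings and Quarkus settings use their own prefixes (debezium.sink. and quarkus. in the file above), so this helper should only be applied to properties meant for the source connector.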

io.debezium.DebeziumException: The db history topic or its content is fully or partially missing

I am facing frequent issues with the db history topic, which is created by the connector itself. I tried a temporary workaround (changing the name of the db history topic), but that is not a good way to handle it. Also, the topic's retention bytes setting is -1.
This is the stack trace:
ERROR WorkerSourceTask{id=cdcit.ventures.sandbox.streamdomain.streamsubdomain.order-filter-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask)
io.debezium.DebeziumException: The db history topic or its content is fully or partially missing. Please check database history topic configuration and re-execute the snapshot.
at io.debezium.relational.HistorizedRelationalDatabaseSchema.recover(HistorizedRelationalDatabaseSchema.java:47)
at io.debezium.connector.sqlserver.SqlServerConnectorTask.start(SqlServerConnectorTask.java:87)
at io.debezium.connector.common.BaseSourceTask.start(BaseSourceTask.java:101)
at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:213)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:184)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:234)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2020-09-04 19:12:26,445] ERROR WorkerSourceTask{id=cdcit.ventures.sandbox.streamdomain.streamsubdomain.order-filter-0} Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask)

You must use a dedicated database history topic for each connector; the topic must not be shared by more than one connector.
Please change the value of the parameter "name" in the config "connector.properties" to a new name.
Thanks.
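Whether two connectors share a history topic can be checked from their configurations before anything is deployed. This is a minimal sketch, assuming the configurations are available as plain dictionaries and use the database.history.kafka.topic property seen in the SQL Server connector config earlier on this page; the connector names and the helper find_shared_history_topics are made up for illustration.

```python
from collections import defaultdict

HISTORY_TOPIC_KEY = "database.history.kafka.topic"


def find_shared_history_topics(connectors: dict) -> dict:
    """Map each history topic used by more than one connector to the connectors using it.

    `connectors` maps a connector name to its configuration dict.
    """
    users = defaultdict(list)
    for name, config in connectors.items():
        topic = config.get(HISTORY_TOPIC_KEY)
        if topic is not None:
            users[topic].append(name)
    return {topic: sorted(names) for topic, names in users.items() if len(names) > 1}


if __name__ == "__main__":
    # Hypothetical connector configs, for illustration only.
    configs = {
        "orders": {HISTORY_TOPIC_KEY: "dbhistory.shared"},
        "payments": {HISTORY_TOPIC_KEY: "dbhistory.shared"},
        "inventory": {HISTORY_TOPIC_KEY: "dbhistory.inventory"},
    }
    print(find_shared_history_topics(configs))
```

An empty result means every connector has its own history topic; any entry in the result names a topic that should be split.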

JDBC Sink connector throwing java.sql.BatchUpdateException

I started a JDBC sink connector some weeks ago. Everything was fine until the logs started to throw this error:
[2019-06-27 11:35:44,121] WARN Write of 500 records failed, remainingRetries=10 (io.confluent.connect.jdbc.sink.JdbcSinkTask:68)
java.sql.BatchUpdateException: [Teradata JDBC Driver] [TeraJDBC 16.20.00.10] [Error 1338] [SQLState HY000] A failure occurred while executing a PreparedStatement batch request. Details of the failure can be found in the exception chain that is accessible with getNextException.
at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeBatchUpdateException(ErrorFactory.java:149)
at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeBatchUpdateException(ErrorFactory.java:138)
at com.teradata.jdbc.jdbc_4.TDPreparedStatement.executeBatchDMLArray(TDPreparedStatement.java:276)
at com.teradata.jdbc.jdbc_4.TDPreparedStatement.executeBatch(TDPreparedStatement.java:2754)
at io.confluent.connect.jdbc.sink.BufferedRecords.flush(BufferedRecords.java:99)
at io.confluent.connect.jdbc.sink.BufferedRecords.add(BufferedRecords.java:78)
at io.confluent.connect.jdbc.sink.JdbcDbWriter.write(JdbcDbWriter.java:62)
at io.confluent.connect.jdbc.sink.JdbcSinkTask.put(JdbcSinkTask.java:66)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:429)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:250)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:179)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:139)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:182)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I have already tried lowering the batch.size property, even as low as 100, and it still fails.
Connector status:
{"name":"teradata-sink-K_C_OSUSR_DGL_DFORM_I1-V2",
"connector":{
"state":"RUNNING",
"worker_id":"10.28.148.64:41029"},
"tasks":[{"state":"FAILED","trace":"org.apache.kafka.connect.errors.ConnectException: Exiting WorkerSinkTask due to unrecoverable exception.
org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:451)\n\tat org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:250)
org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:179)\n\tat org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148)
org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:139)
org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:182)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)\n","id":0,"worker_id":"10.28.148.64:41029"}]}

I faced a similar issue while trying to write to Teradata from Spark using JDBC. Apparently this happens only with Teradata: it occurs when you write to one table over multiple JDBC connections in parallel. In my case, Spark was opening multiple JDBC connections to Teradata while writing the data.
Your options are:
Restrict your application so that it does not open multiple JDBC connections; you can monitor this from Viewpoint if you have access.
Try specifying the JDBC type=FASTLOAD and see whether it works.
Or go for TPT, which handles this issue internally.
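For the Kafka Connect case in the question, the first two options might translate into a sink configuration along these lines. This is an untested sketch under stated assumptions: the host and database names are placeholders, tasks.max=1 is used as one way to keep the sink to a single writer, and TYPE=FASTLOAD is appended as a Teradata JDBC URL parameter. Whether FastLoad is compatible with the statements the sink issues would need to be verified.

```python
def teradata_jdbc_url(host: str, database: str, fastload: bool = True) -> str:
    """Build a Teradata JDBC URL; parameters follow the host as comma-separated KEY=value pairs."""
    params = [f"DATABASE={database}"]
    if fastload:
        # Option 2 from the answer: ask the driver for a FastLoad connection.
        params.append("TYPE=FASTLOAD")
    return f"jdbc:teradata://{host}/" + ",".join(params)


def single_writer_sink_config(host: str, database: str) -> dict:
    """Sketch of a JDBC sink config that avoids parallel writers to the same table."""
    return {
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
        # Option 1 from the answer: a single task, so the sink does not write in parallel.
        "tasks.max": "1",
        "connection.url": teradata_jdbc_url(host, database),
    }


if __name__ == "__main__":
    # "tdhost" and "mydb" are placeholder names.
    for key, value in single_writer_sink_config("tdhost", "mydb").items():
        print(f"{key}={value}")
```

Trying each option separately makes it easier to tell which one actually removes the BatchUpdateException.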