Debezium Server app could not access file "decoderbufs" using Postgres 11 - postgresql

I'm trying to set up Debezium Server with Postgres and AWS Kinesis using the following instructions: https://debezium.io/documentation/reference/stable/operations/debezium-server.html
and I am facing the following issue while executing sh run.sh:
io.debezium.DebeziumException: Creation of replication slot failed
at io.debezium.connector.postgresql.PostgresConnectorTask.start(PostgresConnectorTask.java:143)
at io.debezium.connector.common.BaseSourceTask.start(BaseSourceTask.java:130)
at io.debezium.embedded.EmbeddedEngine.run(EmbeddedEngine.java:759)
at io.debezium.embedded.ConvertingEngineBuilder$2.run(ConvertingEngineBuilder.java:188)
at io.debezium.server.DebeziumServer.lambda$start$1(DebeziumServer.java:147)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:830)
Caused by: org.postgresql.util.PSQLException: ERROR: could not access file "decoderbufs": No such file or directory
I found a suggested solution: adding the plugin.name=pgoutput property to the Postgres connector configuration file. But that works only for the Debezium connector (on Kafka Connect), not for the Debezium Server app.
My Debezium Server application.properties file:
debezium.sink.type=kinesis
debezium.sink.kinesis.region=us-east-1
debezium.source.connector.class=io.debezium.connector.postgresql.PostgresConnector
debezium.source.offset.storage.file.filename=data/offsets.dat
debezium.source.offset.flush.interval.ms=0
debezium.source.database.hostname=localhost
debezium.source.database.port=5432
debezium.source.database.user=postgres
debezium.source.database.password=postgres
debezium.source.database.dbname=dbzm
debezium.source.database.server.name=debezium_cdc
debezium.source.schema.include.list=business_view
debezium.source.table.include.list=inventory
quarkus.log.console.json=false
debezium.snapshot.new.tables=parallel

This variant of the property works fine with Debezium Server:
debezium.source.plugin.name=pgoutput
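Background: the error above shows the connector trying to load the decoderbufs output plugin (its default here), which must be installed on the database server, whereas pgoutput is built into Postgres 10 and later. Debezium Server forwards every property under the debezium.source. prefix to the connector, so the connector's plugin.name becomes debezium.source.plugin.name. A minimal sketch of the relevant lines of application.properties with that property added (the rest of the file from the question stays unchanged):
debezium.source.connector.class=io.debezium.connector.postgresql.PostgresConnector
# use the built-in pgoutput plugin instead of the default decoderbufs
debezium.source.plugin.name=pgoutput
debezium.source.database.hostname=localhost
debezium.source.database.port=5432
debezium.source.database.dbname=dbzm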

Related

Kafka Connect S3 source failing with read-only registry

I am trying to read Avro records stored in S3 in order to put them back into a Kafka topic using the S3 source connector provided by Confluent.
I already have the topics and the registry set up with the right schemas, but when the Connect S3 source tries to serialize my records to the topics I get this error:
Caused by: org.apache.kafka.common.errors.SerializationException: Error registering Avro schema: ...
at io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:121)
at io.confluent.connect.avro.AvroConverter$Serializer.serialize(AvroConverter.java:143)
at io.confluent.connect.avro.AvroConverter.fromConnectData(AvroConverter.java:84)
... 15 more
Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Subject com-row-count-value is in read-only mode; error code: 42205
at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:292)
at io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:352)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:495)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:486)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:459)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.registerAndGetId(CachedSchemaRegistryClient.java:214)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:276)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:252)
at io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:75)
It seems that the Connect producer does not try to fetch the schema_id when the schema already exists, but instead tries to register it, and my registry is read-only.
Does anyone know if this is an issue or whether there is some configuration I am missing?
If you're sure the correct schema for that subject is already registered by some other means, you can try to set auto.register.schemas to false in the serializer configuration.
See here for more details: https://docs.confluent.io/platform/current/schema-registry/serdes-develop/index.html#handling-differences-between-preregistered-and-client-derived-schemas
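For example, when the data is written through the Confluent AvroConverter, the flag can be passed via the converter prefix in the connector or worker configuration. A minimal sketch, with a placeholder registry URL:
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://schema-registry:8081
# look up the existing schema for the subject instead of trying to register it
value.converter.auto.register.schemas=false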

NoClassDefFoundError while running Kafka Connect for MongoDB in standalone mode

I am trying to set up the Debezium connector with MongoDB in standalone mode on my local machine by following this.
I have set up a MongoDB replica set with 3 nodes, 1 primary and 2 replicas (host = localhost, ports = 27017, 27018, 27019). I have started Kafka and ZooKeeper on the local machine as per the quickstart guide.
After this, I downloaded the MongoDB connector plugin jar from here.
Set the plugin.path variable to the MongoDB plugin jars as:
plugin.path=dbz_connector_mongodb_jar_path,KAFKA_HOME/libs
Created a connector config for standalone mode.
KAFKA_HOME/config/my_mongo_connector.properties:
{
"name": "my-mongo-connector",
"config": {
"connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
"mongodb.hosts": "rc0/127.0.0.1:27017",
"mongodb.name": "myMongoConnceter",
"collection.include.list": "dbname.collectionName"
}
}
Now when I run Kafka Connect with the command below:
cd KAFKA_HOME
bin/connect-standalone.sh config/connect-standalone.properties config/my_mongo_connector.properties
I see the error below:
java.lang.NoClassDefFoundError: com/mongodb/MongoException
at java.base/java.lang.Class.getDeclaredConstructors0(Native Method)
at java.base/java.lang.Class.privateGetDeclaredConstructors(Class.java:3137)
at java.base/java.lang.Class.getConstructor0(Class.java:3342)
at java.base/java.lang.Class.newInstance(Class.java:556)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.versionFor(DelegatingClassLoader.java:392)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.getPluginDesc(DelegatingClassLoader.java:362)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanPluginPath(DelegatingClassLoader.java:334)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanUrlsAndAddPlugins(DelegatingClassLoader.java:268)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.registerPlugin(DelegatingClassLoader.java:260)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initPluginLoader(DelegatingClassLoader.java:229)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:206)
at org.apache.kafka.connect.runtime.isolation.Plugins.<init>(Plugins.java:61)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:78)
Caused by: java.lang.ClassNotFoundException: com.mongodb.MongoException
at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:471)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:588)
at org.apache.kafka.connect.runtime.isolation.PluginClassLoader.loadClass(PluginClassLoader.java:104)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
... 13 more
Then I added the mongo-java-driver-3.12.10.jar file, which contains the class com.mongodb.MongoException, to the Kafka Connect plugins folder. But I still face the same error.
I have also found that this jar is loaded while running the Kafka Connect command:
INFO Loading plugin from: debezium-connector-mongodb_path/mongo-java-driver-3.12.10.jar (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:246)
INFO Registered loader: PluginClassLoader{pluginLocation=file:debezium-connector-mongodb_path/mongo-java-driver-3.12.10.jar} (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:269)
EDIT:
My NoClassDefFoundError problem was solved by correcting plugin.path.
Earlier it was /opt/debezium-connector-mongodb and I changed it to /opt, as suggested in the comment.
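For reference, a minimal sketch of how that looks in the worker config (config/connect-standalone.properties), assuming the Debezium MongoDB jars live under /opt/debezium-connector-mongodb:
# point plugin.path at the parent directory; Connect then discovers the
# /opt/debezium-connector-mongodb plugin directory and the jars inside it
plugin.path=/opt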
My other mistake: I had written the connector configuration file in JSON format, so I changed it to the properties file format:
name:my-mongo-connector
connector.class:io.debezium.connector.mongodb.MongoDbConnector
mongodb.hosts=rc0/127.0.0.1:27017
mongodb.name=myMongoConnector
collection.include.list=dbname.collectionName
topic.creation.default.replication.factor=1
topic.creation.default.partitions=2
mongodb.members.auto.discover=false
Now Kafka Connect starts, but when a new entry is inserted into the Mongo collection, the Kafka topic is not created.
So the remaining problem is that the Kafka topic is not created even after data is inserted into the DB.

Kafka Sink: ERROR Stopping due to error (org.apache.kafka.connect.cli.ConnectStandalone:130)

I am trying to stream data from one file to another file. It was working earlier, and suddenly it started producing the error ERROR Stopping due to error (org.apache.kafka.connect.cli.ConnectStandalone:130). I have restarted ZooKeeper, the Kafka server, Schema Registry, and the source and sink connectors, but I am still facing the same issue and unable to resolve it. Any suggestion would be helpful.
Source connector:
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=/home/jimmacaulay/Desktop/ETL/Kafka/confluent-5.5.1/data/data/Jim_Source.csv
topic=Jim
Sink connector:
name=local-file-sink
connector.class=FileStreamSink
tasks.max=1
file=/home/jimmacaulay/Desktop/ETL/Kafka/confluent-5.5.1/data/data/Jim_Sink.csv
topics=Jim
Error:
[2020-08-18 06:25:50,482] ERROR Stopping due to error (org.apache.kafka.connect.cli.ConnectStandalone:130)
org.apache.kafka.connect.errors.ConnectException: Unable to initialize REST server
at org.apache.kafka.connect.runtime.rest.RestServer.initializeServer(RestServer.java:217)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:87)
Caused by: java.io.IOException: Failed to bind to 0.0.0.0/0.0.0.0:8083
at org.eclipse.jetty.server.ServerConnector.openAcceptChannel(ServerConnector.java:346)
at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:307)
at org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80)
at org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:231)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
at org.eclipse.jetty.server.Server.doStart(Server.java:385)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
at org.apache.kafka.connect.runtime.rest.RestServer.initializeServer(RestServer.java:215)
... 1 more
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:220)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:85)
at org.eclipse.jetty.server.ServerConnector.openAcceptChannel(ServerConnector.java:342)
... 8 more
Resolved the error by starting connect-standalone with the source and sink properties together.
sh connect-standalone ../config/connect-avro-standalone.properties ../config/connect-file-source_Topic_Jim.properties ../config/connect-file-sink_Topic_Jim.properties
Earlier I was starting them separately as below:
sh connect-standalone ../config/connect-avro-standalone.properties ../config/connect-file-source_Topic_Jim.properties
sh connect-standalone ../config/connect-avro-standalone.properties ../config/connect-file-sink_Topic_Jim.properties
Cause of the issue:
When starting them separately, connect-standalone starts first for the source properties, using port 8083. When I then start it for the sink properties, it tries to use the same port and fails.
Solutions:
Pass both the source and sink properties when starting connect-standalone, so they share the same worker and port.
Or define different port numbers in the worker properties files and start them separately, as sketched below.
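A minimal sketch of the second option, assuming a separate copy of the worker config is used for the sink; rest.port is the older worker setting and listeners the newer one, so use whichever your Connect version supports:
# connect-avro-standalone-sink.properties (second worker instance)
# older Connect versions:
rest.port=8084
# newer Connect versions:
listeners=http://0.0.0.0:8084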

Vertica integration with Apache Kafka error

Could you please help me with the following issue?
I have Vertica 9.3 installed on one host and Apache Kafka (2.4.0) installed on a second host. I want to integrate Vertica with Apache Kafka. I configured everything following the official Vertica guide, but when I try to create a source:
vkconfig source --create --cluster kafka_weblog --source test --partitions 1 --conf /opt/vertica/packages/kafka/config/my.conf
I get this error:
Exception in thread "main" com.vertica.solutions.kafka.exception.ConfigurationException: ERROR: [[Vertica][VJDBC](5861) ERROR: Error calling processPartition() in User Function KafkaListTopics at [/data/qb_workspaces/jenkins2/ReleaseBuilds/Grader/REL-9_3_1-x_grader/build/udx/supported/kafka/KafkaUtil.cpp:163], error code: 0, message: Error getting metadata: [Local: Broker transport failure]]
at com.vertica.solutions.kafka.model.StreamSource.validateConfiguration(StreamSource.java:248)
at com.vertica.solutions.kafka.model.StreamSource.setFromMapAndValidate(StreamSource.java:194)
at com.vertica.solutions.kafka.model.StreamModel.<init>(StreamModel.java:93)
at com.vertica.solutions.kafka.model.StreamSource.<init>(StreamSource.java:44)
at com.vertica.solutions.kafka.cli.SourceCLI.getNewModel(SourceCLI.java:62)
at com.vertica.solutions.kafka.cli.SourceCLI.getNewModel(SourceCLI.java:13)
at com.vertica.solutions.kafka.cli.CLI.run(CLI.java:59)
at com.vertica.solutions.kafka.cli.CLI._main(CLI.java:141)
at com.vertica.solutions.kafka.cli.SourceCLI.main(SourceCLI.java:29)
Caused by: java.sql.SQLNonTransientException: [Vertica][VJDBC](5861) ERROR: Error calling processPartition() in User Function KafkaListTopics at [/data/qb_workspaces/jenkins2/ReleaseBuilds/Grader/REL-9_3_1-x_grader/build/udx/supported/kafka/KafkaUtil.cpp:163], error code: 0, message: Error getting metadata: [Local: Broker transport failure]
at com.vertica.util.ServerErrorData.buildException(Unknown Source)
at com.vertica.dataengine.VResultSet.fetchChunk(Unknown Source)
at com.vertica.dataengine.VResultSet.initialize(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.readExecuteResponse(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.handleExecuteResponse(Unknown Source)
at com.vertica.dataengine.VQueryExecutor.execute(Unknown Source)
at com.vertica.jdbc.common.SPreparedStatement.executeWithParams(Unknown Source)
at com.vertica.jdbc.common.SPreparedStatement.executeQuery(Unknown Source)
at com.vertica.solutions.kafka.model.StreamSource.validateConfiguration(StreamSource.java:227)
... 8 more
Caused by: com.vertica.support.exceptions.NonTransientException: [Vertica][VJDBC](5861) ERROR: Error calling processPartition() in User Function KafkaListTopics at [/data/qb_workspaces/jenkins2/ReleaseBuilds/Grader/REL-9_3_1-x_grader/build/udx/supported/kafka/KafkaUtil.cpp:163], error code: 0, message: Error getting metadata: [Local: Broker transport failure]
If I look at the table weblog_sched.stream_clusters, the hosts column contains localhost:9092 rather than the IP address of my Kafka server (192.168.0.8), although when I created the cluster I specified the Kafka server's address:
vkconfig cluster --create --cluster kafka_weblog --hosts 192.168.0.8:9092 --conf /opt/vertica/packages/kafka/config/my.conf
Why is that? (I think this error is associated with an incorrect entry in weblog_sched.stream_clusters.)
There are a few ways of debugging this:
Try telnet 192.168.0.8 9092. If you are able to telnet, then the port is open, Kafka is running, and it is accessible from the place where you are executing the telnet command. If you are not able to connect via telnet, then try the following:
Check your firewall or iptables rules and allow port 9092 on that machine.
Ex: iptables -A INPUT -p tcp --dport 9092 -j ACCEPT
If you are still not able to access it, try disabling your system firewall (such as ufw or firewalld) or allowing that port in it.
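For example, rough sketches of the corresponding commands for ufw and firewalld (adjust to your distribution):
# ufw (Debian/Ubuntu)
sudo ufw allow 9092/tcp
# firewalld (RHEL/CentOS)
sudo firewall-cmd --permanent --add-port=9092/tcp
sudo firewall-cmd --reload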
Check your advertised.listeners broker configuration. Ensure that you have set it to PLAINTEXT://192.168.0.8:9092 (if you are using the PLAINTEXT listener). This will most likely solve your problem, as sketched below.
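A minimal sketch of the broker-side settings in server.properties, assuming a single PLAINTEXT listener and the IP address from the question:
# bind on all interfaces
listeners=PLAINTEXT://0.0.0.0:9092
# address that clients (including Vertica) are told to connect to
advertised.listeners=PLAINTEXT://192.168.0.8:9092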
Make sure to add the Kafka hosts to the /etc/hosts file of the Vertica cluster. The issue should then be resolved; if so, it means that your advertised.listeners has been configured incorrectly.
Hardcoding your hosts in /etc/hosts is definitely not the best approach. You should fix your advertised.listeners as a more permanent solution.

Debezium Postgres ERROR: parameter "include-unchanged-toast" was deprecated

I am currently load testing the Debezium Postgres connector to find out whether it can support very large volumes (billions) of change logs in Aurora Postgres.
When I insert 1 million records into the Postgres table, the Debezium Postgres connector fails with the following error messages:
org.apache.kafka.connect.errors.ConnectException: An exception ocurred in the change event producer. This connector will be stopped.
at io.debezium.connector.base.ChangeEventQueue.throwProducerFailureIfPresent(ChangeEventQueue.java:170)
at io.debezium.connector.base.ChangeEventQueue.poll(ChangeEventQueue.java:151)
at io.debezium.connector.postgresql.PostgresConnectorTask.poll(PostgresConnectorTask.java:188)
at org.apache.kafka.connect.runtime.WorkerSourceTask.poll(WorkerSourceTask.java:259)
at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:226)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.postgresql.util.PSQLException: ERROR: parameter "include-unchanged-toast" was deprecated
Where: slot "wal2json_dbz5", output plugin "wal2json", in the startup callback
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2440)
at org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:1116)
at org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1035)
at org.postgresql.core.v3.CopyDualImpl.readFromCopy(CopyDualImpl.java:41)
at org.postgresql.core.v3.replication.V3PGReplicationStream.receiveNextData(V3PGReplicationStream.java:155)
at org.postgresql.core.v3.replication.V3PGReplicationStream.readInternal(V3PGReplicationStream.java:124)
at org.postgresql.core.v3.replication.V3PGReplicationStream.readPending(V3PGReplicationStream.java:78)
at io.debezium.connector.postgresql.connection.PostgresReplicationConnection$1.readPending(PostgresReplicationConnection.java:401)
at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.execute(PostgresStreamingChangeEventSource.java:94)
at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:91)
It seems the parameter include-unchanged-toast is no longer supported. Is there any workaround for this issue?
You can either get Debezium fixed or use an old version of wal2json from before the option was removed.
The Git snapshot of wal2json from just before the option was removed is here.
Be warned, though, that the option was removed for a good reason.