I followed the Debezium tutorial (https://github.com/debezium/debezium-examples/tree/master/tutorial#using-postgres), and all CDC data received from Postgres is sent to the Kafka topic as JSON with the schema embedded. How do I get rid of the schema?
Here is the config of the connector (launched in a Docker container):
{
"name": "inventory-connector",
"config": {
"connector.class": "io.debezium.connector.postgresql.PostgresConnector",
"tasks.max": "1",
"key.converter.schemas.enable": "false",
"value.converter.schemas.enable": "false",
"database.hostname": "postgres",
"database.port": "5432",
"database.user": "postgres",
"database.password": "postgres",
"database.dbname" : "postgres",
"database.server.name": "dbserver1",
"schema.include": "inventory"
}
}
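For reference, I register the connector with the Kafka Connect REST API roughly like this (assuming the config above is saved as register-postgres.json and the worker is reachable on localhost:8083, as in the tutorial):
curl -i -X POST -H "Accept:application/json" -H "Content-Type:application/json" http://localhost:8083/connectors/ -d @register-postgres.json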
The JSON schema is still included in every message.
I managed to get rid of it only by launching the Docker container with the following environment variables:
- CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE=false
- CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE=false
Why can't I achieve exactly the same thing from the connector configuration?
Example of a Kafka message with the schema:
{"schema":{"type":"struct","fields":[{"type":"int32","optional":false,"field":"id"}],"optional":false,"name":"dbserver1.inventory.customers.Key"},"payload":{"id":1001}} {"schema":{"type":"struct","fields":[{"type":"struct","fields":[{"type":"int32","optional":false,"field":"id"},{"type":"string","optional":false,"field":"first_name"},{"type":"string","optional":false,"field":"last_name"},{"type":"string","optional":false,"field":"email"}],"optional":true,"name":"dbserver1.inventory.customers.Value","field":"before"},{"type":"struct","fields":[{"type":"int32","optional":false,"field":"id"},{"type":"string","optional":false,"field":"first_name"},{"type":"string","optional":false,"field":"last_name"},{"type":"string","optional":false,"field":"email"}],"optional":true,"name":"dbserver1.inventory.customers.Value","field":"after"},{"type":"struct","fields":[{"type":"string","optional":false,"field":"version"},{"type":"string","optional":false,"field":"connector"},{"type":"string","optional":false,"field":"name"},{"type":"int64","optional":false,"field":"ts_ms"},{"type":"string","optional":true,"name":"io.debezium.data.Enum","version":1,"parameters":{"allowed":"true,last,false"},"default":"false","field":"snapshot"},{"type":"string","optional":false,"field":"db"},{"type":"string","optional":false,"field":"schema"},{"type":"string","optional":false,"field":"table"},{"type":"int64","optional":true,"field":"txId"},{"type":"int64","optional":true,"field":"lsn"},{"type":"int64","optional":true,"field":"xmin"}],"optional":false,"name":"io.debezium.connector.postgresql.Source","field":"source"},{"type":"string","optional":false,"field":"op"},{"type":"int64","optional":true,"field":"ts_ms"},{"type":"struct","fields":[{"type":"string","optional":false,"field":"id"},{"type":"int64","optional":false,"field":"total_order"},{"type":"int64","optional":false,"field":"data_collection_order"}],"optional":true,"field":"transaction"}],"optional":false,"name":"dbserver1.inventory.customers.Envelope"},"payload":{"before":null,"after":{"id":1001,"first_name":"Sally","last_name":"Thomas","email":"sally.thomas#acme.com"},"source":{"version":"1.4.1.Final","connector":"postgresql","name":"dbserver1","ts_ms":1611918971029,"snapshot":"true","db":"postgres","schema":"inventory","table":"customers","txId":602,"lsn":34078720,"xmin":null},"op":"r","ts_ms":1611918971032,"transaction":null}}
Example of what I want, without the schema:
{"id":1001} {"before":null,"after":{"id":1001,"first_name":"Sally","last_name":"Thomas","email":"sally.thomas#acme.com"},"source":{"version":"1.4.1.Final","connector":"postgresql","name":"dbserver1","ts_ms":1611920304594,"snapshot":"true","db":"postgres","schema":"inventory","table":"customers","txId":597,"lsn":33809448,"xmin":null},"op":"r","ts_ms":1611920304596,"transaction":null}
The Debezium Connect container is run with the following command:
docker run -it --name connect -p 8083:8083 -e GROUP_ID=1 -e CONFIG_STORAGE_TOPIC=my_connect_configs -e OFFSET_STORAGE_TOPIC=my_connect_offsets -e STATUS_STORAGE_TOPIC=my_connect_statuses -e CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE=false -e CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE=false --link zookeeper:zookeeper --link kafka:kafka --link mysql:mysql debezium/connect:1.3
or via docker-compose:
connect:
image: debezium/connect:${DEBEZIUM_VERSION}
ports:
- 8083:8083
links:
- kafka
- postgres
environment:
- BOOTSTRAP_SERVERS=kafka:9092
- GROUP_ID=1
- CONFIG_STORAGE_TOPIC=my_connect_configs
- OFFSET_STORAGE_TOPIC=my_connect_offsets
- STATUS_STORAGE_TOPIC=my_connect_statuses
- CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE=false
- CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE=false
I added CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE=false and CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE=false later; without them I cannot get rid of the schema.
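As far as I understand, the connect image turns CONNECT_-prefixed environment variables into worker-level properties (prefix stripped, lowercased, underscores replaced by dots), so the two variables above should end up in the worker configuration roughly as:
key.converter.schemas.enable=false
value.converter.schemas.enable=false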
The connect Docker container (the Kafka Connect worker cluster, if I understand it correctly) is started without any connector.
I create the connector manually.
Logs from docker-compose for connect when the connector is created:
connect_1 | 2021-01-29 18:04:57,395 INFO || JsonConverterConfig values:
connect_1 | converter.type = key
connect_1 | decimal.format = BASE64
connect_1 | schemas.cache.size = 1000
connect_1 | schemas.enable = true
connect_1 | [org.apache.kafka.connect.json.JsonConverterConfig]
connect_1 | 2021-01-29 18:04:57,396 INFO || Set up the key converter class org.apache.kafka.connect.json.JsonConverter for task inventory-connector-0 using the worker config [org.apache.kafka.connect.runtime.Worker]
connect_1 | 2021-01-29 18:04:57,396 INFO || JsonConverterConfig values:
connect_1 | converter.type = value
connect_1 | decimal.format = BASE64
connect_1 | schemas.cache.size = 1000
connect_1 | schemas.enable = true
connect_1 | [org.apache.kafka.connect.json.JsonConverterConfig]
...
connect_1 | 2021-01-29 18:04:57,458 INFO || Starting PostgresConnectorTask with configuration: [io.debezium.connector.common.BaseSourceTask]
connect_1 | 2021-01-29 18:04:57,460 INFO || key.converter.schemas.enable = false [io.debezium.connector.common.BaseSourceTask]
connect_1 | 2021-01-29 18:04:57,460 INFO || value.converter.schemas.enable = false [io.debezium.connector.common.BaseSourceTask]
Here is the output of the GET connector command (note that key.converter.schemas.enable and value.converter.schemas.enable are present):
$ curl -i http://localhost:8083/connectors/inventory-connector
{"name":"inventory-connector","config":{"connector.class":"io.debezium.connector.postgresql.PostgresConnector",**"key.converter.schemas.enable":"false"**,"database.user":"postgres","database.dbname":"postgres","tasks.max":"1","database.hostname":"postgres","database.password":"postgres",**"value.converter.schemas.enable":"false"**,"name":"inventory-connector","database.server.name":"dbserver1","database.port":"5432","schema.include":"inventory"},"tasks":[{"connector":"inventory-connector","task":0}],"type":"source"}
I reproduced this example. As @OneCricketeer mentioned in the comments, you have to explicitly add the JsonConverter:
{
"name": "inventory-connector",
"config": {
"connector.class": "io.debezium.connector.postgresql.PostgresConnector",
"tasks.max": "1",
"key.converter": "org.apache.kafka.connect.json.JsonConverter",
"key.converter.schemas.enable": "false",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter.schemas.enable": "false",
"database.hostname": "postgres",
"database.port": "5432",
"database.user": "postgres",
"database.password": "postgres",
"database.dbname" : "postgres",
"database.server.name": "dbserver1",
"schema.include": "inventory"
}
}
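If the connector is already registered, you can also apply just the config object with PUT instead of deleting and re-creating it (a sketch, same host and port as above):
curl -X PUT -H "Content-Type: application/json" \
  http://localhost:8083/connectors/inventory-connector/config \
  -d '{
  "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
  "tasks.max": "1",
  "key.converter": "org.apache.kafka.connect.json.JsonConverter",
  "key.converter.schemas.enable": "false",
  "value.converter": "org.apache.kafka.connect.json.JsonConverter",
  "value.converter.schemas.enable": "false",
  "database.hostname": "postgres",
  "database.port": "5432",
  "database.user": "postgres",
  "database.password": "postgres",
  "database.dbname": "postgres",
  "database.server.name": "dbserver1",
  "schema.include": "inventory"
}'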
I need to create a Kafka source connector for a REST API with header authentication, like:
curl -H "Authorization: Basic " -H "clientID: " "https:< url for source> " .
I am using Apache Kafka, and I used the connector class com.github.castorm.kafka.connect.http.HttpSourceConnector.
Here is my JSON file for the connector:
{
"name": "rest_data6",
"config": {
"key.converter":"org.apache.kafka.connect.json.JsonConverter",
"value.converter":"org.apache.kafka.connect.json.JsonConverter",
"key.converter.schemas.enable":"true",
"value.converter.schemas.enable":"true",
"connector.class": "com.github.castorm.kafka.connect.http.HttpSourceConnector",
"tasks.max": "1",
"http.request.headers": "Authorization: Basic <key1>",
"http.request.headers": "clientID: <key>",
"http.request.url": "https:<url for source ?",
"kafka.topic": "mysqltopic2"
}
}
I also tried with "connector.class": "com.tm.kafka.connect.rest.RestSourceConnector". My JSON file is below:
{
"name": "rest_data2",
"config": {
"key.converter":"org.apache.kafka.connect.json.JsonConverter",
"value.converter":"org.apache.kafka.connect.json.JsonConverter",
"key.converter.schemas.enable":"true",
"value.converter.schemas.enable":"true",
"connector.class": "com.tm.kafka.connect.rest.RestSourceConnector",
"rest.source.poll.interval.ms": "900",
"rest.source.method": "GET",
"rest.source.url":"URL of source ",
"tasks.max": "1",
"rest.source.headers": "Authorization: Basic <key> , clientId :<key2>",
"rest.source.topic.selector": "com.tm.kafka.connect.rest.selector.SimpleTopicSelector",
"rest.source.destination.topics": "mysql1"
}
}
But no luck. Any idea how to GET REST API data with authentication? My authentication parameters are
Authorization: Basic <key> and clientID: <key>.
Just to mention: both files work with the REST API without authentication; once I add the authentication parameters, either the connector status is FAILED or it produces the message "Cannot route. Codebase/company is invalid" in the topic.
Can anyone suggest a way to solve this?
I emailed the original developer, Cástor Rodríguez. Following his suggestion, I modified my JSON:
putting the headers into a single property makes it work:
{
"name": "rest_data6",
"config": {
"key.converter":"org.apache.kafka.connect.json.JsonConverter",
"value.converter":"org.apache.kafka.connect.json.JsonConverter",
"key.converter.schemas.enable":"true",
"value.converter.schemas.enable":"true",
"connector.class": "com.github.castorm.kafka.connect.http.HttpSourceConnector",
"tasks.max": "1",
"http.request.headers": "Authorization: Basic <key1>, clientID: <key>"
"http.request.url": "https:<url for source ?",
"kafka.topic": "mysqltopic2"
}
}
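Once the connector is created, you can verify that it and its task are RUNNING via the status endpoint (assuming the default Connect REST port 8083):
curl -s http://localhost:8083/connectors/rest_data6/status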
I am trying to run Kafka Connect workers in distributed mode. Unlike standalone mode, we cannot pass a connector properties file when starting the worker in distributed mode. In distributed mode, workers are started separately, and we deploy and manage the connectors on those workers using the REST API.
Reference Link - https://docs.confluent.io/current/connect/managing/configuring.html#connect-managing-distributed-mode
I tried building a connector by passing the values below in a curl command and executed it:
curl -X POST -H "Content-Type: application/json" --data '{"name":"sailpointdb","connector.class":"io.confluent.connect.jdbc.JdbcSourceConnector","tasks.max":"1","connection.password " : " abc","connection.url " : "jdbc:mysql://localhost:3306/db","connection.user " : "abc" ,"query" : " SELECT * FROM (SELECT NAME, FROM_UNIXTIME(completed/1000) AS
TASKFAILEDON FROM abc WHERE COMPLETION_STATUS = 'Error') as A","mode" : " timestamp","timestamp.column.name" : "TASKFAILEDON","topic.prefix" : "dbevents","validate.non.null" : "false" }}' http://localhost:8089/connectors/
I am getting the error below: curl: (3) URL using bad/illegal format or missing URL
Please let me know what is wrong with the above curl statement. Am I missing anything here?
You had an extra closing curly brace in your JSON, which won't help.
If you're POSTing to /connectors you need the name and config root-level elements. But I recommend using PUT /connectors/{name}/config, because you can re-run it to update the config if you need to.
Try this:
curl -X PUT -H "Content-Type:application/json" \
http://localhost:8089/connectors/source-jdbc-sailpointdb-00/config \
-d '{
"connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
"tasks.max": "1",
"connection.password ": " abc",
"connection.url ": "jdbc:mysql://localhost:3306/db",
"connection.user ": "abc",
"query": " SELECT * FROM (SELECT NAME, FROM_UNIXTIME(completed/1000) AS TASKFAILEDON FROM abc WHERE COMPLETION_STATUS = 'Error') as A",
"mode": " timestamp",
"timestamp.column.name": "TASKFAILEDON",
"topic.prefix": "dbevents",
"validate.non.null": "false"
}'
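After that, you can confirm the connector and task state with the status endpoint (same host and port as above):
curl -s http://localhost:8089/connectors/source-jdbc-sailpointdb-00/status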
I want to publish data from multiple tables to the same Kafka topic using the connector config below, but I am seeing the exception below.
Exception
Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Schema being registered is incompatible with an earlier schema; error code: 409
The connector seems to ignore the subject strategy properties set and keeps using the old ${topic}-key and ${topic}-value subjects.
[2019-04-25 22:43:45,590] INFO AvroConverterConfig values:
schema.registry.url = [http://schema-registry:8081]
basic.auth.user.info = [hidden]
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
schema.registry.basic.auth.user.info = [hidden]
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
Connector configuration
curl -i -X POST -H "Accept:application/json" -H "Content-Type:application/json" localhost:8083/connectors/ -d '{
"name": "two-in-one-connector",
"config": {
"connector.class": "io.debezium.connector.mysql.MySqlConnector",
"tasks.max": "1",
"database.hostname": "xxxxxxx",
"database.port": "3306",
"database.user": "xxxxxxx",
"database.password": "xxxxxxxxx",
"database.server.id": "18405457",
"database.server.name": "xxxxxxxxxx",
"table.whitelist": "customers,phone_book",
"database.history.kafka.bootstrap.servers": "broker:9092",
"database.history.kafka.topic": "dbhistory.customer",
"transforms": "dropPrefix",
"transforms.dropPrefix.type":"org.apache.kafka.connect.transforms.RegexRouter",
"transforms.dropPrefix.regex":"(.*)",
"transforms.dropPrefix.replacement":"customer",
"key.converter.key.subject.name.strategy": "io.confluent.kafka.serializers.subject.TopicRecordNameStrategy",
"value.converter.value.subject.name.strategy": "io.confluent.kafka.serializers.subject.TopicRecordNameStrategy"
}
}'
Try setting the strategy classes on the parameters below in your connector configuration (JSON) instead of "key.converter.key.subject.name.strategy" and "value.converter.value.subject.name.strategy":
"key.subject.name.strategy"
"value.subject.name.strategy"
I am dockerizing a Sensu infrastructure. Everything goes fine except the execution of checks.
I am using docker-compose according to this structure (docker-compose.yml):
sensu-core:
build: sensu-core/
links:
- redis
- rabbitmq
sensors-production:
build: sensors-production/
links:
- rabbitmq
uchiwa:
build: sensu-uchiwa
links:
- sensu-core
ports:
- "3000:3000"
rabbitmq:
build: rabbitmq/
redis:
image: redis
command: redis-server
My rabbitmq Dockerfile is pretty straightforward:
FROM ubuntu:latest
RUN apt-get -y install wget
RUN wget http://packages.erlang-solutions.com/erlang-solutions_1.0_all.deb
RUN dpkg -i erlang-solutions_1.0_all.deb
RUN wget http://www.rabbitmq.com/rabbitmq-signing-key-public.asc
RUN apt-key add rabbitmq-signing-key-public.asc
RUN echo "deb http://www.rabbitmq.com/debian/ testing main" | sudo tee /etc/apt/sources.list.d/rabbitmq.list
RUN apt-get update
RUN apt-get -y install erlang rabbitmq-server
CMD /etc/init.d/rabbitmq-server start && \
rabbitmqctl add_vhost /sensu && \
rabbitmqctl add_user sensu secret && \
rabbitmqctl set_permissions -p /sensu sensu ".*" ".*" ".*" && \
cd /var/log/rabbitmq/ && \
ls -1 * | xargs tail -f
So does the uchiwa Dockerfile:
FROM podbox/sensu
RUN apt-get -y install uchiwa
RUN echo ' \
{ \
"sensu": [ \
{ \
"name": "Sensu", \
"host": "sensu-core", \
"port": 4567, \
"timeout": 5 \
} \
], \
"uchiwa": { \
"host": "0.0.0.0", \
"port": 3000, \
"interval": 5 \
} \
}' > /etc/sensu/uchiwa.json
EXPOSE 3000
CMD /etc/init.d/uchiwa start && \
tail -f /var/log/uchiwa.log
Sensu core runs sensu-server and sensu-api. Here is its Dockerfile:
FROM podbox/sensu
RUN apt-get -y install sensu
RUN echo '{ \
"rabbitmq": { \
"host": "rabbitmq", \
"vhost": "/sensu", \
"user": "sensu", \
"password": "secret" \
}, \
"redis": { \
"host": "redis", \
"port": 6379 \
}, \
"api": { \
"host": "localhost", \
"port": 4567 \
} \
}' >> /etc/sensu/config.json
CMD /etc/init.d/sensu-server start && \
/etc/init.d/sensu-api start && \
tail -f /var/log/sensu/sensu-server.log -f /var/log/sensu-api.log
sensors-production runs sensu-client along with a dumb check; here is its Dockerfile:
FROM podbox/sensu
RUN apt-get -y install sensu
RUN echo '{ \
"rabbitmq": { \
"host": "rabbitmq", \
"vhost": "/sensu", \
"user": "sensu", \
"password": "secret" \
} \
}' >> /etc/sensu/config.json
RUN mkdir -p /etc/sensu/conf.d
RUN echo '{ \
"client": { \
"name": "wise_oracle", \
"address": "prod_sensors", \
"subscriptions": [ \
"web", "aws" \
] \
} \
}' >> /etc/sensu/conf.d/client.json
RUN echo '{ \
"checks": { \
"dumb": { \
"command": "ls", \
"subscribers": [ \
"web" \
], \
"interval": 10 \
} \
} \
}' >> /etc/sensu/conf.d/dumb.json
CMD /etc/init.d/sensu-client start && \
tail -f /var/log/sensu/sensu-client.log
Running
docker-compose up -d
Everything goes OK: no errors in the logs, and I can access the Uchiwa dashboard, which shows the defined client (keepalive requests seem to be OK). However, no check is available.
I noticed that no check request / check result is present in the logs, as if the Sensu server considers there are no checks to run, and I have no idea why that is.
Could someone tell me more precisely what's going on? Thank you.
Check requests/results are delivered via RabbitMQ; you can go to http://yourrabbitmqserver:15672 to see the queues and subscribed consumers.
Also make sure your server has some check JSON files placed in /etc/sensu/conf.d so it can schedule checks based on their interval.
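For example, you could bake the same dumb check into the sensu-core image so the server schedules it (a sketch that reuses the check definition from the question; paths follow the Dockerfiles above):
RUN mkdir -p /etc/sensu/conf.d
RUN echo '{ \
"checks": { \
"dumb": { \
"command": "ls", \
"subscribers": [ \
"web" \
], \
"interval": 10 \
} \
} \
}' >> /etc/sensu/conf.d/dumb.json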