Get task IDs from the Kafka Connect API to print in logs - apache-kafka

I have Kafka Connect sink code for which the JSON below is passed as a curl command to register the tasks.
Please let me know if anyone has any idea how to get the task IDs of my connector. For example, in the config below, tasks.max is 3, so I need to know
the names of the 3 tasks for logging, i.e. I need to know which line of my log belongs to which task.
In this example, I know I have 3 tasks - TestCheck-1, TestCheck-2 and TestCheck-3 - based on the Kafka Connect logs. I want to know how to get the task names so that I can print them in my Kafka Connect log lines.
{
  "name": "TestCheck",
  "config": {
    "topics": "topic1",
    "connector.class": "ApplicationSinkTask Class package",
    "tasks.max": "3",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.storage.StringConverter",
    "connector.url": "jdbc connection url",
    "driver.name": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    "username": "myusername",
    "password": "mypassword",
    "table.name": "test_table",
    "database.name": "test"
  }
}
When I register it, I get the details below.
curl -X POST -H "Content-Type: application/json" --data @myjson.json http://service:8082/connectors
{"name":"TestCheck","config":{"topics":"topic1","connector.class":"ApplicationSinkTask Class package","tasks.max":"3","key.converter":"org.apache.kafka.connect.storage.StringConverter","value.converter":"org.apache.kafka.connect.storage.StringConverter","connector.url":"jdbc:sqlserver://datahubprod.database.windows.net:1433;","driver.name":"jdbc connection url","username":"myuser","password":"mypassword","table.name":"test_table","database.name":"test","name":"TestCheck"},"tasks":[{"connector":"TestCheck","task":0},{"connector":"TestCheck","task":1},{"connector":"TestCheck","task":2}],"type":null}

You can manage the connectors with the Kafka Connect REST API. There's a whole heap of commands, which you can find here.
The example given in the above link shows you can retrieve all tasks for a given connector using the command:
$ curl localhost:8083/connectors/local-file-sink/tasks
[
  {
    "id": {
      "connector": "local-file-sink",
      "task": 0
    },
    "config": {
      "task.class": "org.apache.kafka.connect.file.FileStreamSinkTask",
      "topics": "connect-test",
      "file": "test.sink.txt"
    }
  }
]
You can use a language of your choice to send the request and load the JSON response into a variable/dictionary for further use, such as printing to a log. Here's a very simple example using Python that assigns the whole output to a variable.
import requests

# list every connector registered on this worker
connectors = 'http://localhost:8083/connectors'
p = requests.get(connectors)
data = p.json()
If you parse the data variable into a dictionary, you can then access each element, i.e. the task ID.
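A minimal sketch of that idea (assuming the Connect REST API is on localhost:8083 and the connector is named TestCheck, as in the question):

import requests

connect_url = 'http://localhost:8083'
connector = 'TestCheck'

# GET /connectors/{name}/tasks returns one entry per running task
tasks = requests.get(f'{connect_url}/connectors/{connector}/tasks').json()

for t in tasks:
    # each entry carries an "id" object with the connector name and the task number
    task_id = f"{t['id']['connector']}-{t['id']['task']}"
    print(task_id)  # e.g. TestCheck-0, TestCheck-1, TestCheck-2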
I hope this helps!

Related

Kafka connect RabbitMQ unable to use insert field transform: Only Struct objects supported for [field insertion], found: [B

I'm trying to use the InsertField Kafka Connect transformation with the RabbitMQ connector.
My configuration:
"config": {
"connector.class": "io.confluent.connect.rabbitmq.RabbitMQSourceConnector",
"confluent.topic.bootstrap.servers": "kafka:29092",
"topic.creation.default.replication.factor": 1,
"topic.creation.default.partitions": 1,
"tasks.max": "2",
"kafka.topic": "test",
"rabbitmq.queue": "events",
"rabbitmq.host": "rabbitmq",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"transforms": "InsertField",
"transforms.InsertField.type": "org.apache.kafka.connect.transforms.InsertField$Value",
"transforms.InsertField.static.field": "MessageSource",
"transforms.InsertField.static.value": "Kafka Connect framework"
}
I have also tried using ByteArrayConverter as the value converter. Using Python, I send a message as follows:
msg = json.dumps(body)
self.channel.basic_publish(exchange="", routing_key="events", body=msg)
Using encode() to turn it into a byte array does not work either.
The exception I'm receiving is:
Caused by: org.apache.kafka.connect.errors.DataException: Only Struct objects supported for [field insertion], found: [B
at org.apache.kafka.connect.transforms.util.Requirements.requireStruct(Requirements.java:52)
at org.apache.kafka.connect.transforms.InsertField.applyWithSchema(InsertField.java:162)
at org.apache.kafka.connect.transforms.InsertField.apply(InsertField.java:133)
at org.apache.kafka.connect.runtime.TransformationChain.lambda$apply$0(TransformationChain.java:50)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)
... 11 more
I understand the error and thought that using JsonConverter would solve it, but I was wrong. I've also used "value.converter.schemas.enable": "false" to no avail.
I would appreciate any help. I don't mind sending the data in JSON form or bytes form, I just want a key:value pair to be added to the event.
Thanks
As the error indicates, you can only insert fields into Structs. To get a Struct from the RabbitMQ String/Bytes schemas, you must chain a HoistField transform before the InsertField one.
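A sketch of what that chain could look like in your config (the hoisted field name "payload" is only an illustrative choice):

"transforms": "HoistField,InsertField",
"transforms.HoistField.type": "org.apache.kafka.connect.transforms.HoistField$Value",
"transforms.HoistField.field": "payload",
"transforms.InsertField.type": "org.apache.kafka.connect.transforms.InsertField$Value",
"transforms.InsertField.static.field": "MessageSource",
"transforms.InsertField.static.value": "Kafka Connect framework"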
To get a Struct from JsonConverter, your JSON needs two top-level fields named schema and payload, and the connector needs:
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter.schemas.enable": "true"
https://www.confluent.io/blog/kafka-connect-deep-dive-converters-serialization-explained/
Alternatively, use Kafka headers for "source" information, rather than trying to inject it into the value.
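If you go the header route, a sketch of that configuration (assuming a Connect version that ships the InsertHeader SMT, Apache Kafka 3.0+):

"transforms": "InsertSource",
"transforms.InsertSource.type": "org.apache.kafka.connect.transforms.InsertHeader",
"transforms.InsertSource.header": "MessageSource",
"transforms.InsertSource.value.literal": "Kafka Connect framework"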

Receiving 404 when trying to edit a datasource despite being able to retrieve its details

I have a datasource created in Grafana and am attempting to update it to refresh the bearer token used for auth access.
However, I'm receiving a 404 Not Found error from the Grafana API when making a request to localhost:3000/api/datasources/uid/:uid with a uid just received from the datasources/name API. I'm attempting to update it as per the documentation: https://grafana.com/docs/grafana/latest/developers/http_api/data_source/#update-an-existing-data-source
I'm using the Grafana open-source Docker container with the Infinity plugin.
docker run -d -p 3000:3000 --name=grafana -e "GF_INSTALL_PLUGINS=yesoreyeram-infinity-datasource" grafana/grafana-oss
I'm able to create a datasource via the API, I just can't update an existing one.
My code is:
from json import loads, dumps
from requests import get, post

grafana_api_token = '<my api token>'
new_access_token = '<my new bearer token>'
my_data_source = 'my_data_source'
grafana_header = {"authorization": f"Bearer {grafana_api_token}", "content-type": "application/json;charset=UTF-8"}

# look up the datasource by name to get its uid and current settings
grafana_datasource_url = f"http://localhost:3000/api/datasources/name/{my_data_source}"
firebolt_datasource_resp = get(url=grafana_datasource_url, headers=grafana_header)
full_datasource = loads(firebolt_datasource_resp.content.decode("utf-8"))

datasource_uid = full_datasource["uid"]
update_token_url = f"http://localhost:3000/api/datasources/uid/{datasource_uid}"
new_data = {"id": full_datasource["id"],
            "uid": full_datasource["uid"],
            "orgId": full_datasource["orgId"],
            "name": "new_data_source",
            "type": full_datasource["type"],
            "access": full_datasource["access"],
            "url": full_datasource["url"],
            "user": full_datasource["user"],
            "database": full_datasource["database"],
            "basicAuth": full_datasource["basicAuth"],
            "basicAuthUser": full_datasource["basicAuthUser"],
            "withCredentials": full_datasource["withCredentials"],
            "isDefault": full_datasource["isDefault"],
            "jsonData": full_datasource["jsonData"],
            "secureJsonData": {
                "bearerToken": new_access_token
            }}
update_bearer_token_resp = post(url=update_token_url, data=dumps(new_data), headers=grafana_header)
Oh, oh, oh, idiot mode. Using post rather than put. Doh.
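For reference, the fix only swaps the HTTP method; a minimal sketch reusing the variables from the question:

from requests import put

# PUT /api/datasources/uid/:uid updates an existing datasource; the POST above is what triggered the 404
update_bearer_token_resp = put(url=update_token_url, data=dumps(new_data), headers=grafana_header)
update_bearer_token_resp.raise_for_status()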

Kafka REST API source connector with authentication header

I need to create a Kafka source connector for a REST API with header authentication, like:
curl -H "Authorization: Basic " -H "clientID: " "https:< url for source> "
I am using Apache Kafka, and I used the connector class com.github.castorm.kafka.connect.http.HttpSourceConnector.
Here is my JSON file for the connector:
{
  "name": "rest_data6",
  "config": {
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "key.converter.schemas.enable": "true",
    "value.converter.schemas.enable": "true",
    "connector.class": "com.github.castorm.kafka.connect.http.HttpSourceConnector",
    "tasks.max": "1",
    "http.request.headers": "Authorization: Basic <key1>",
    "http.request.headers": "clientID: <key>",
    "http.request.url": "https:<url for source ?",
    "kafka.topic": "mysqltopic2"
  }
}
I also tried with "connector.class": "com.tm.kafka.connect.rest.RestSourceConnector". My JSON file is below:
"name": "rest_data2",
"config": {
"key.converter":"org.apache.kafka.connect.json.JsonConverter",
"value.converter":"org.apache.kafka.connect.json.JsonConverter",
"key.converter.schemas.enable":"true",
"value.converter.schemas.enable":"true",
"connector.class": "com.tm.kafka.connect.rest.RestSourceConnector",
"rest.source.poll.interval.ms": "900",
"rest.source.method": "GET",
"rest.source.url":"URL of source ",
"tasks.max": "1",
"rest.source.headers": "Authorization: Basic <key> , clientId :<key2>",
"rest.source.topic.selector": "com.tm.kafka.connect.rest.selector.SimpleTopicSelector",
"rest.source.destination.topics": "mysql1"
}
}
But no luck. Any idea how to GET REST API data with authentication? My authentication parameters are the
Authorization: Basic <key1> and clientID: <key> headers shown above.
Just to mention, both files work with the REST API without authentication; once I add the authentication parameters, either the connector status is failed or it produces a "Cannot route. Codebase/company is invalid" message in the topic.
Can anyone suggest a way to solve this?
I emailed the original developer, Cástor Rodríguez. As per his solution I modified my JSON:
putting the headers into a single property makes it work.
{
  "name": "rest_data6",
  "config": {
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "key.converter.schemas.enable": "true",
    "value.converter.schemas.enable": "true",
    "connector.class": "com.github.castorm.kafka.connect.http.HttpSourceConnector",
    "tasks.max": "1",
    "http.request.headers": "Authorization: Basic <key1>, clientID: <key>",
    "http.request.url": "https:<url for source ?",
    "kafka.topic": "mysqltopic2"
  }
}
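To (re)register the corrected connector, the usual POST against the Connect REST API works; a sketch assuming the worker listens on localhost:8083 and the JSON above is saved as rest_data6.json:

curl -X POST -H "Content-Type: application/json" --data @rest_data6.json http://localhost:8083/connectors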

Setting up MirrorMaker 2 using the Kafka Connect REST API: PUT method not allowed

I am trying to set up MirrorMaker 2 using my current Connect cluster.
Based on this documentation, it can be done via the Connect REST API:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0#KIP-382:MirrorMaker2.0-RunningMirrorMakerinaConnectcluster
I followed the sample, sending this PUT request:
PUT /connectors/us-west-source/config HTTP/1.1
{
  "name": "us-west-source",
  "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
  "source.cluster.alias": "us-west",
  "target.cluster.alias": "us-east",
  "source.cluster.bootstrap.servers": "us-west-host1:9091",
  "topics": ".*"
}
but I am getting a Method Not Allowed error response:
{
  "error_code": 405,
  "message": "HTTP 405 Method Not Allowed"
}
The API looks OK if I do a simple GET on /, which returns the version:
{
  "version": "2.1.0-cp1",
  "commit": "bda8715f42a1a3db",
  "kafka_cluster_id": "VBo-j1OAQZSN8tO4lMJ0Gg"
}
The PUT method doesn't work; using POST works, as the API's documentation shows:
https://docs.confluent.io/current/connect/references/restapi.html#get--connectors
Remove the name of the connector from the URL as @cricket_007 suggested, and wrap the config in a "config" element like this:
curl --noproxy "*" -XPOST -H 'Content-Type: application/json' -H 'Accept: application/json' http://localhost:8083/connectors -d '{
  "name": "dc-west-source",
  "config": {
    "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
    "source.cluster.alias": "dc-west",
    "target.cluster.alias": "dc-east",
    "source.cluster.bootstrap.servers": "dc-west-cp-kafka-0.domain:32721,dc-west-cp-kafka-1.domain:32722,dc-west-cp-kafka-2.dc.domain:32723",
    "topics": ".*"
  }
}' | jq .
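Once the POST succeeds, you can confirm the connector and its tasks are running with the status endpoint; a sketch using the same host and port as above:

curl http://localhost:8083/connectors/dc-west-source/status | jq .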

Unable to create kafka connector using REST API

I am trying to run Kafka workers in distributed mode. Unlike standalone mode, we cannot pass the connector property file while starting the worker in distributed mode. In distributed mode, workers are started separately, and we deploy and manage the connectors on those workers using the REST API.
Reference Link - https://docs.confluent.io/current/connect/managing/configuring.html#connect-managing-distributed-mode
I tried building a connector by passing the values below in a curl command and executed it:
curl -X POST -H "Content-Type: application/json" --data '{"name":"sailpointdb","connector.class":"io.confluent.connect.jdbc.JdbcSourceConnector","tasks.max":"1","connection.password " : " abc","connection.url " : "jdbc:mysql://localhost:3306/db","connection.user " : "abc" ,"query" : " SELECT * FROM (SELECT NAME, FROM_UNIXTIME(completed/1000) AS
TASKFAILEDON FROM abc WHERE COMPLETION_STATUS = 'Error') as A","mode" : " timestamp","timestamp.column.name" : "TASKFAILEDON","topic.prefix" : "dbevents","validate.non.null" : "false" }}' http://localhost:8089/connectors/
I am getting the error below: curl: (3) URL using bad/illegal format or missing URL
Please let me know what is wrong with the above curl statement. Am I missing anything here?
You had an extra closing curly brace in your JSON, which won't help.
If you're POSTing to /connectors you need name and config root-level elements. But I recommend using PUT /connectors/{name}/config because you can re-run it to update the config if you need to.
Try this:
curl -X PUT -H "Content-Type:application/json" \
  http://localhost:8089/connectors/source-jdbc-sailpointdb-00/config \
  -d '{
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "tasks.max": "1",
    "connection.password": "abc",
    "connection.url": "jdbc:mysql://localhost:3306/db",
    "connection.user": "abc",
    "query": "SELECT * FROM (SELECT NAME, FROM_UNIXTIME(completed/1000) AS TASKFAILEDON FROM abc WHERE COMPLETION_STATUS = '\''Error'\'') as A",
    "mode": "timestamp",
    "timestamp.column.name": "TASKFAILEDON",
    "topic.prefix": "dbevents",
    "validate.non.null": "false"
  }'