Unable to create Kafka connector using REST API - apache-kafka

I am trying to run Kafka Connect workers in distributed mode. Unlike in standalone mode, we cannot pass a connector properties file when starting the worker. In distributed mode, workers are started separately, and we deploy and manage connectors on them using the REST API.
Reference Link - https://docs.confluent.io/current/connect/managing/configuring.html#connect-managing-distributed-mode
I tried building a connector by passing the below values in a curl command and executing it:
curl -X POST -H "Content-Type: application/json" --data '{"name":"sailpointdb","connector.class":"io.confluent.connect.jdbc.JdbcSourceConnector","tasks.max":"1","connection.password " : " abc","connection.url " : "jdbc:mysql://localhost:3306/db","connection.user " : "abc" ,"query" : " SELECT * FROM (SELECT NAME, FROM_UNIXTIME(completed/1000) AS TASKFAILEDON FROM abc WHERE COMPLETION_STATUS = 'Error') as A","mode" : " timestamp","timestamp.column.name" : "TASKFAILEDON","topic.prefix" : "dbevents","validate.non.null" : "false" }}' http://localhost:8089/connectors/
I am getting the below error: curl: (3) URL using bad/illegal format or missing URL
Please let me know what is wrong with the above curl statement. Am I missing anything here?

You had an extra closing curly brace in your JSON, which won't help. The curl: (3) error itself most likely comes from the unescaped single quotes around 'Error' inside your single-quoted --data argument: the shell quoting breaks at that point, so curl treats the trailing fragments as URLs.
If you're POSTing to /connectors, you need name and config as the root-level elements. But I recommend using PUT /connectors/{name}/config instead, because it is idempotent: you can re-run it to update the config if you need to.
Try this (I've stripped the stray spaces from your key names and values, since "connection.url " with a trailing space is a different property from "connection.url", and escaped the single quotes around Error as '\''Error'\'' because the whole JSON body is single-quoted):
curl -X PUT -H "Content-Type:application/json" \
http://localhost:8089/connectors/source-jdbc-sailpointdb-00/config \
-d '{
"connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
"tasks.max": "1",
"connection.password ": " abc",
"connection.url ": "jdbc:mysql://localhost:3306/db",
"connection.user ": "abc",
"query": " SELECT * FROM (SELECT NAME, FROM_UNIXTIME(completed/1000) AS TASKFAILEDON FROM abc WHERE COMPLETION_STATUS = 'Error') as A",
"mode": " timestamp",
"timestamp.column.name": "TASKFAILEDON",
"topic.prefix": "dbevents",
"validate.non.null": "false"
}'
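Since the shell-quoting rules above are easy to trip over, here is a minimal Python sketch of the same PUT (worker URL and connector name taken from the answer above; requests builds the JSON, so the embedded quotes around Error need no escaping):
import requests

# Worker URL and connector name from the answer above.
url = "http://localhost:8089/connectors/source-jdbc-sailpointdb-00/config"

config = {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "tasks.max": "1",
    "connection.password": "abc",
    "connection.url": "jdbc:mysql://localhost:3306/db",
    "connection.user": "abc",
    # No shell-quoting gymnastics needed around 'Error' here.
    "query": "SELECT * FROM (SELECT NAME, FROM_UNIXTIME(completed/1000) AS TASKFAILEDON FROM abc WHERE COMPLETION_STATUS = 'Error') as A",
    "mode": "timestamp",
    "timestamp.column.name": "TASKFAILEDON",
    "topic.prefix": "dbevents",
    "validate.non.null": "false",
}

# PUT is idempotent: re-running it creates or updates the connector.
resp = requests.put(url, json=config)
resp.raise_for_status()
print(resp.json())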

Related

Setting up MirrorMaker 2 using the Kafka Connect REST API: PUT method not allowed

I am trying to set up MirrorMaker 2 using my current Connect cluster.
Based on this documentation, it can be done via the Connect REST API:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0#KIP-382:MirrorMaker2.0-RunningMirrorMakerinaConnectcluster
I followed the sample, sending this PUT request:
PUT /connectors/us-west-source/config HTTP/1.1
{
  "name": "us-west-source",
  "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
  "source.cluster.alias": "us-west",
  "target.cluster.alias": "us-east",
  "source.cluster.bootstrap.servers": "us-west-host1:9091",
  "topics": ".*"
}
but I am getting a Method Not Allowed error response:
{
"error_code": 405,
"message": "HTTP 405 Method Not Allowed"
}
The API looks OK if I do a simple GET on /, which returns the version:
{
"version": "2.1.0-cp1",
"commit": "bda8715f42a1a3db",
"kafka_cluster_id": "VBo-j1OAQZSN8tO4lMJ0Gg"
}
The PUT method doesn't work, but using POST works, as the API's documentation shows:
https://docs.confluent.io/current/connect/references/restapi.html#get--connectors
Remove the name of the connector from the URL, as @cricket_007 suggested, and wrap the config in a new config element, like this:
curl --noproxy "*" -XPOST -H 'Content-Type: application/json' -H 'Accept: application/json' http://localhost:8083/connectors -d '{
  "name": "dc-west-source",
  "config": {
    "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
    "source.cluster.alias": "dc-west",
    "target.cluster.alias": "dc-east",
    "source.cluster.bootstrap.servers": "dc-west-cp-kafka-0.domain:32721,dc-west-cp-kafka-1.domain:32722,dc-west-cp-kafka-2.dc.domain:32723",
    "topics": ".*"
  }
}' | jq .
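If you are scripting this rather than using raw curl, a minimal Python sketch of the same POST might look like this (host, aliases, and connector class taken from the example above; the bootstrap server list is trimmed to one host for brevity):
import requests

payload = {
    "name": "dc-west-source",
    "config": {
        "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
        "source.cluster.alias": "dc-west",
        "target.cluster.alias": "dc-east",
        "source.cluster.bootstrap.servers": "dc-west-cp-kafka-0.domain:32721",
        "topics": ".*",
    },
}

# POST /connectors expects name and config at the root, and the connector
# name must NOT appear in the URL (unlike PUT /connectors/{name}/config).
resp = requests.post("http://localhost:8083/connectors", json=payload)
print(resp.status_code, resp.json())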

Why does DESCRIBE EXTENDED in Kafka KSQL return error ShowColumns not supported?

I have a simple KTABLE in KSQL called DIMAGE
When I run the following code
{
"ksql": "DESCRIBE EXTENDED DIMAGE ;"
}
I receive the following error
{
"#type": "generic_error",
"error_code": 40000,
"message": "Statement type `io.confluent.ksql.parser.tree.ShowColumns' not supported for this resource",
"stackTrace": []
}
I also receive a similar error message when trying to describe a stream, and the same error if I remove the EXTENDED keyword.
You're using the wrong REST endpoint. If you use the query endpoint, /query, you'll get this error:
$ curl -s -X "POST" "http://localhost:8088/query" \
-H "Content-Type: application/vnd.ksql.v1+json; charset=utf-8" \
-d '{
"ksql": "DESCRIBE EXTENDED COMPUTER_T;"
}'
{"#type":"generic_error","error_code":40000,"message":"Statement type `io.confluent.ksql.parser.tree.ShowColumns' not supported for this resource","stackTrace":[]}⏎
If you use the statement endpoint, /ksql, it works fine:
$ curl -s -X "POST" "http://localhost:8088/ksql" \
-H "Content-Type: application/vnd.ksql.v1+json; charset=utf-8" \
-d '{
"ksql": "DESCRIBE EXTENDED COMPUTER_T;"
}'|jq '.'
[
  {
    "#type": "sourceDescription",
    "statementText": "DESCRIBE EXTENDED COMPUTER_T;",
    "sourceDescription": {
      "name": "COMPUTER_T",
      "readQueries": [
        {
          "sinks": [
            "COMP_WATCH_BY_EMP_ID_T"
          ],
          "id": "CTAS_COMP_WATCH_BY_EMP_ID_T_0",
[...]
I've logged #2362 so that we can improve the UX of this.
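For reference, a small Python sketch of the two endpoints (assuming a local KSQL server on port 8088 and the COMPUTER_T table from above): statements such as DESCRIBE belong on /ksql, while /query is for streaming SELECT queries.
import requests

KSQL = "http://localhost:8088"
HEADERS = {"Content-Type": "application/vnd.ksql.v1+json; charset=utf-8"}
stmt = {"ksql": "DESCRIBE EXTENDED COMPUTER_T;"}

# DESCRIBE is a statement, so it goes to the /ksql endpoint...
ok = requests.post(f"{KSQL}/ksql", headers=HEADERS, json=stmt)
print(ok.json()[0]["sourceDescription"]["name"])  # COMPUTER_T

# ...whereas sending it to /query reproduces the ShowColumns error.
bad = requests.post(f"{KSQL}/query", headers=HEADERS, json=stmt)
print(bad.json()["message"])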

Get task IDs from the Kafka Connect API to print in logs

I have Kafka Connect sink code for which the below JSON is passed in a curl command to register the connector and its tasks.
Please let me know if anyone has any idea how to get the task IDs of my connector. For example, in the below example, tasks.max is 3, so I need to know the names of the 3 tasks for logging, i.e. which line of my log belongs to which task.
In the below example, I know I have 3 tasks - TestCheck-1, TestCheck-2 and TestCheck-3 - based on the Kafka Connect logs. I want to know how to get the task names so that I can print them in my Kafka Connect log lines.
{
  "name": "TestCheck",
  "config": {
    "topics": "topic1",
    "connector.class": "ApplicationSinkTask Class package",
    "tasks.max": "3",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.storage.StringConverter",
    "connector.url": "jdbc connection url",
    "driver.name": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    "username": "myusername",
    "password": "mypassword",
    "table.name": "test_table",
    "database.name": "test"
  }
}
When I register it, I get the below response:
curl -X POST -H "Content-Type: application/json" --data @myjson.json http://service:8082/connectors
{"name":"TestCheck","config":{"topics":"topic1","connector.class":"ApplicationSinkTask Class package","tasks.max":"3","key.converter":"org.apache.kafka.connect.storage.StringConverter","value.converter":"org.apache.kafka.connect.storage.StringConverter","connector.url":"jdbc:sqlserver://datahubprod.database.windows.net:1433;","driver.name":"jdbc connection url","username":"myuser","password":"mypassword","table.name":"test_table","database.name":"test","name":"TestCheck"},"tasks":[{"connector":"TestCheck","task":0},{"connector":"TestCheck","task":1},{"connector":"TestCheck","task":2}],"type":null}
You can manage connectors with the Kafka Connect REST API. There's a whole heap of commands, which you can find in the REST API documentation.
The example given there shows that you can retrieve all tasks for a given connector using the command:
$ curl localhost:8083/connectors/local-file-sink/tasks
[
  {
    "id": {
      "connector": "local-file-sink",
      "task": 0
    },
    "config": {
      "task.class": "org.apache.kafka.connect.file.FileStreamSinkTask",
      "topics": "connect-test",
      "file": "test.sink.txt"
    }
  }
]
You can use a language of your choice to send the request and load the JSON response into a variable/dictionary for further use, such as printing to a log. Here's a very simple example using Python, which assigns the whole response to a variable:
import requests

# Ask the Connect REST API for the list of deployed connectors.
connectors = 'http://localhost:8083/connectors'
p = requests.get(connectors)
data = p.json()  # parse the JSON response into a Python object
If you parse the data variable into a dictionary, you can then access each element, i.e. the task ID, as shown below.
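A minimal sketch along those lines, using the TestCheck connector from the question (worker host and port assumed to be localhost:8083, as in the curl example above):
import requests

# Fetch the task list for one connector.
tasks = requests.get("http://localhost:8083/connectors/TestCheck/tasks").json()

for t in tasks:
    # Each entry holds the connector name and a numeric task id; combining
    # them gives names like TestCheck-0, TestCheck-1, TestCheck-2.
    print(f'{t["id"]["connector"]}-{t["id"]["task"]}')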
I hope this helps!

HBase REST call - getting junk characters "\x0A"

I'm trying to insert values into an HBase table using the HBase REST API. Below is the curl command I'm using:
curl -v -XPUT 'http://localhost:8080/emp/1/pers:name' -H "Accept: application/json" -H "Content-Type: application/json" --data '{ "Row": [ { "Cell": [ { "column": "cGVyczpuYW1lCg==", "$": "TXlOYW1lCg==" } ], "key": "MQo=" } ] }'
The call works fine and I get an "HTTP/1.1 200 OK". But when I look at the HBase table, instead of updating the value in row "1", the call creates a new row "1\x0A" and inserts the new value with the same junk characters:
1\x0A column=pers:name\x0A, timestamp=1437596697507, value=MyName\x0A
Has anyone seen something like this? Thanks in advance.
OK, I sorted this out.
\x0A is the escaped hexadecimal line feed, the equivalent of \n.
echo appends a trailing newline by default, and base64 encodes that newline along with the value, so we need to pass -n to echo when we base64-encode the row key, column, and value.
For example:
echo -n MyName | base64
TXlOYW1l
echo MyName | base64
TXlOYW1lCg==
There is a difference between the two, and that's what caused my problem.
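The same pitfall disappears in a script, since (for example) Python's base64 module encodes exactly the bytes you give it; a quick sketch:
import base64

# No trailing newline is added unless you include one yourself:
print(base64.b64encode(b"MyName").decode())    # TXlOYW1l
print(base64.b64encode(b"MyName\n").decode())  # TXlOYW1lCg== (the \n gets encoded too)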

Parameterize collection: IN Operator WHERE clause Cypher REST

How do I specify parameters in something like this: WHERE a.name IN ["Peter", "Tobias"]? I am trying to pass the collection after the IN operator as a parameter in Cypher. I am using Cypher through the REST API.
This is my example:
curl -X POST http://localhost:7474/db/data/ext/CypherPlugin/graphdb/execute_query -H "Content-Type: application/json" --data-binary '{
  "query": "start ca=node:ca({search_ca_query}) MATCH ca_club-[:has]-ca WHERE (ca_club.CA_CLUB IN {CA_CLUB}) RETURN distinct ca.NUM_OFC_CA, ca.NME_CA, ca_club.CA_CLUB",
  "params": {
    "search_ca_query": "NUM_OFC_CA:(\"000333\", \"111033\", \"222197\")",
    "CA_CLUB": "[\"Driad\", \"No-Club\"]"
  }
}'
I have also tried swapping the square brackets into the query itself, but even that didn't work (i.e. I am not getting any error, just an empty list: "data" : [ ]).
Any suggestions on how to do this?
Your IN parameter needs to be an actual list, not a string:
curl -X POST http://localhost:7474/db/data/ext/CypherPlugin/graphdb/execute_query -H "Content-Type: application/json" --data-binary '{
  "query": "start ca=node:ca({search_ca_query}) MATCH ca_club-[:has]-ca WHERE (ca_club.CA_CLUB IN {CA_CLUB}) RETURN distinct ca.NUM_OFC_CA, ca.NME_CA, ca_club.CA_CLUB",
  "params": {
    "search_ca_query": "NUM_OFC_CA:(\"000333\", \"111033\", \"222197\")",
    "CA_CLUB": ["Driad", "No-Club"]
  }
}'
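Equivalently in Python, where the parameter is simply a native list (endpoint, index, and property names as in the question; the CypherPlugin endpoint applies to the old Neo4j REST API):
import requests

url = "http://localhost:7474/db/data/ext/CypherPlugin/graphdb/execute_query"
payload = {
    "query": ("start ca=node:ca({search_ca_query}) "
              "MATCH ca_club-[:has]-ca "
              "WHERE (ca_club.CA_CLUB IN {CA_CLUB}) "
              "RETURN distinct ca.NUM_OFC_CA, ca.NME_CA, ca_club.CA_CLUB"),
    "params": {
        "search_ca_query": 'NUM_OFC_CA:("000333", "111033", "222197")',
        # A real list, not a string that looks like one.
        "CA_CLUB": ["Driad", "No-Club"],
    },
}

resp = requests.post(url, json=payload)
print(resp.json())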