Kafka REST API source connector with authentication header - apache-kafka

I need to create a Kafka source connector for a REST API that requires header authentication, like:
curl -H "Authorization: Basic <key>" -H "clientID: <key>" "https://<url for source>"
I am using Apache Kafka, and I used the connector class com.github.castorm.kafka.connect.http.HttpSourceConnector.
Here is my JSON file for the connector:
{
  "name": "rest_data6",
  "config": {
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "key.converter.schemas.enable": "true",
    "value.converter.schemas.enable": "true",
    "connector.class": "com.github.castorm.kafka.connect.http.HttpSourceConnector",
    "tasks.max": "1",
    "http.request.headers": "Authorization: Basic <key1>",
    "http.request.headers": "clientID: <key>",
    "http.request.url": "https://<url for source>",
    "kafka.topic": "mysqltopic2"
  }
}
I also tried with "connector.class": "com.tm.kafka.connect.rest.RestSourceConnector". My JSON file is as below:
"name": "rest_data2",
"config": {
"key.converter":"org.apache.kafka.connect.json.JsonConverter",
"value.converter":"org.apache.kafka.connect.json.JsonConverter",
"key.converter.schemas.enable":"true",
"value.converter.schemas.enable":"true",
"connector.class": "com.tm.kafka.connect.rest.RestSourceConnector",
"rest.source.poll.interval.ms": "900",
"rest.source.method": "GET",
"rest.source.url":"URL of source ",
"tasks.max": "1",
"rest.source.headers": "Authorization: Basic <key> , clientId :<key2>",
"rest.source.topic.selector": "com.tm.kafka.connect.rest.selector.SimpleTopicSelector",
"rest.source.destination.topics": "mysql1"
}
}
But no luck. Any idea how to GET REST API data with authentication? My authentication headers are Authorization: Basic <key> and clientID: <key>, as in the curl command above.
Just to mention, both files work with a REST API that has no authentication; once I add the authentication parameters, either the connector status is FAILED or it produces a "Cannot route. Codebase/company is invalid" message in the topic.
Can anyone suggest a way to solve this?

I emailed the original developer, Cástor Rodríguez. As per his solution, I modified my JSON.
Putting the headers into a single property works:
{
  "name": "rest_data6",
  "config": {
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "key.converter.schemas.enable": "true",
    "value.converter.schemas.enable": "true",
    "connector.class": "com.github.castorm.kafka.connect.http.HttpSourceConnector",
    "tasks.max": "1",
    "http.request.headers": "Authorization: Basic <key1>, clientID: <key>",
    "http.request.url": "https://<url for source>",
    "kafka.topic": "mysqltopic2"
  }
}
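To register it, I just POST this JSON to the Connect worker's REST endpoint; a minimal sketch (the rest_data6.json file name and the localhost:8083 address are placeholders for my setup):

curl -X POST -H "Content-Type: application/json" \
  --data @rest_data6.json \
  http://localhost:8083/connectors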

Related

How to migrate consumer offsets using MirrorMaker 2.0?

With Kafka 2.7.0, I am using MirrorMaker 2.0 as a Kafka Connect connector to replicate all the topics from the primary Kafka cluster to the backup cluster.
All the topics are being replicated perfectly except __consumer_offsets. Below is the connector configuration:
{
  "name": "test-connector",
  "config": {
    "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
    "topics.blacklist": "some-random-topic",
    "replication.policy.separator": "",
    "source.cluster.alias": "",
    "target.cluster.alias": "",
    "exclude.internal.topics": "false",
    "tasks.max": "10",
    "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "source.cluster.bootstrap.servers": "xx.xx.xxx.xx:9094",
    "target.cluster.bootstrap.servers": "yy.yy.yyy.yy:9094",
    "topics": "test-topic-from-primary,primary-kafka-connect-offset,primary-kafka-connect-config,primary-kafka-connect-status,__consumer_offsets"
  }
}
In a similar question here, the accepted answer says the following:
Add this in your consumer.config:
exclude.internal.topics=false
And add this in your producer.config:
client.id=__admin_client
Where do I add these in my configuration?
The Connector Configuration Properties do not include a property named client.id; I have set exclude.internal.topics to false, though.
Is there something I am missing here?
UPDATE
I learned that Kafka 2.7 and above support automated consumer offset sync using MirrorCheckpointTask, as mentioned here.
I have created a connector for this with the below configuration:
{
  "name": "mirror-checkpoint-connector",
  "config": {
    "connector.class": "org.apache.kafka.connect.mirror.MirrorCheckpointConnector",
    "sync.group.offsets.enabled": "true",
    "source.cluster.alias": "",
    "target.cluster.alias": "",
    "exclude.internal.topics": "false",
    "tasks.max": "10",
    "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "source.cluster.bootstrap.servers": "xx.xx.xxx.xx:9094",
    "target.cluster.bootstrap.servers": "yy.yy.yyy.yy:9094",
    "topics": "__consumer_offsets"
  }
}
Still no luck.
Is this the correct approach? Is there something else needed?
You do not want to replicate __consumer_offsets. The offsets in the destination cluster will not be the same as in the source cluster for various reasons.
MirrorMaker 2 instead provides the ability to do offset translation: it populates the destination cluster with translated offsets generated from the source cluster. https://cwiki.apache.org/confluence/display/KAFKA/KIP-545%3A+support+automated+consumer+offset+sync+across+clusters+in+MM+2.0
Also note that __consumer_offsets is ignored by default:
topics.exclude = [.*[\-\.]internal, .*\.replica, __.*]
You'll need to override this config.
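If you do decide to replicate it anyway, the override is a connector-level property. A minimal sketch (assuming the Connect REST API is on localhost:8083 and reusing the test-connector name and cluster settings from the question) that drops the __.* pattern from the default topics.exclude so __consumer_offsets is no longer filtered out; whether replicating it is actually useful is a separate question, per the point above:

curl -X PUT -H "Content-Type: application/json" \
  http://localhost:8083/connectors/test-connector/config \
  -d '{
    "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
    "source.cluster.alias": "",
    "target.cluster.alias": "",
    "replication.policy.separator": "",
    "source.cluster.bootstrap.servers": "xx.xx.xxx.xx:9094",
    "target.cluster.bootstrap.servers": "yy.yy.yyy.yy:9094",
    "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "topics": "__consumer_offsets",
    "topics.exclude": ".*[-.]internal, .*\\.replica"
  }'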

Setting mirror maker 2 using kafka connect rest api put method not allowed

I am trying to set up MirrorMaker 2 using my current Connect cluster.
Based on this documentation, it can be done via the Connect REST API:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0#KIP-382:MirrorMaker2.0-RunningMirrorMakerinaConnectcluster
I followed the sample, sending this PUT request:
PUT /connectors/us-west-source/config HTTP/1.1
{
  "name": "us-west-source",
  "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
  "source.cluster.alias": "us-west",
  "target.cluster.alias": "us-east",
  "source.cluster.bootstrap.servers": "us-west-host1:9091",
  "topics": ".*"
}
but I am getting a Method Not Allowed error response:
{
  "error_code": 405,
  "message": "HTTP 405 Method Not Allowed"
}
The API looks fine if I do a simple GET on /, which returns the version:
{
  "version": "2.1.0-cp1",
  "commit": "bda8715f42a1a3db",
  "kafka_cluster_id": "VBo-j1OAQZSN8tO4lMJ0Gg"
}
The PUT method doesn't work; using POST works, as the API documentation shows:
https://docs.confluent.io/current/connect/references/restapi.html#get--connectors
Remove the name of the connector from the URL, as @cricket_007 suggested, and wrap the config in a "config" element like this:
curl --noproxy "*" -XPOST -H 'Content-Type: application/json' -H 'Accept: application/json' http://localhost:8083/connectors -d '{
  "name": "dc-west-source",
  "config": {
    "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
    "source.cluster.alias": "dc-west",
    "target.cluster.alias": "dc-east",
    "source.cluster.bootstrap.servers": "dc-west-cp-kafka-0.domain:32721,dc-west-cp-kafka-1.domain:32722,dc-west-cp-kafka-2.dc.domain:32723",
    "topics": ".*"
  }
}' | jq .

Unable to create kafka connector using REST API

I am trying to run Kafka Connect workers in distributed mode. Unlike standalone mode, we cannot pass the connector property file while starting the worker in distributed mode. In distributed mode, workers are started separately, and we deploy and manage the connectors on those workers using the REST API.
Reference Link - https://docs.confluent.io/current/connect/managing/configuring.html#connect-managing-distributed-mode
I tried building a connector by passing the below values in a curl command and executed it:
curl -X POST -H "Content-Type: application/json" --data '{"name":"sailpointdb","connector.class":"io.confluent.connect.jdbc.JdbcSourceConnector","tasks.max":"1","connection.password " : " abc","connection.url " : "jdbc:mysql://localhost:3306/db","connection.user " : "abc" ,"query" : " SELECT * FROM (SELECT NAME, FROM_UNIXTIME(completed/1000) AS TASKFAILEDON FROM abc WHERE COMPLETION_STATUS = 'Error') as A","mode" : " timestamp","timestamp.column.name" : "TASKFAILEDON","topic.prefix" : "dbevents","validate.non.null" : "false" }}' http://localhost:8089/connectors/
I am getting the below error: curl: (3) URL using bad/illegal format or missing URL
Please let me know what is wrong with the above curl statement. Am I missing anything here?
You had an extra closing curly brace in your JSON, which won't help.
If you're POSTing to /connectors, you need the name and config root-level elements. But I recommend using PUT /connectors/{name}/config because you can re-run it to update the config if you need to.
Try this:
curl -X PUT -H "Content-Type:application/json" \
  http://localhost:8089/connectors/source-jdbc-sailpointdb-00/config \
  -d '{
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "tasks.max": "1",
    "connection.password": "abc",
    "connection.url": "jdbc:mysql://localhost:3306/db",
    "connection.user": "abc",
    "query": "SELECT * FROM (SELECT NAME, FROM_UNIXTIME(completed/1000) AS TASKFAILEDON FROM abc WHERE COMPLETION_STATUS = '\''Error'\'') as A",
    "mode": "timestamp",
    "timestamp.column.name": "TASKFAILEDON",
    "topic.prefix": "dbevents",
    "validate.non.null": "false"
  }'
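Once the PUT succeeds, you can check the connector and its task state via the standard status endpoint (same host and port as above; piping to jq is optional and just pretty-prints the response):

curl -s http://localhost:8089/connectors/source-jdbc-sailpointdb-00/status | jq .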

get task id's from kafka connect API to print in logs

I have Kafka Connect sink code for which the below JSON is passed as a curl command to register the connector.
Please let me know if anyone has an idea of how to get the task IDs of my connector. For example, in the config below tasks.max is 3, so I need to know
the names of the 3 tasks for my logs, i.e. which line of my log belongs to which task.
In the example below, I know I have 3 tasks - TestCheck-1, TestCheck-2 and TestCheck-3 - based on the Kafka Connect logs. I want to know how to get the task names so that I can print them in my Kafka Connect log lines.
{
  "name": "TestCheck",
  "config": {
    "topics": "topic1",
    "connector.class": "ApplicationSinkTask Class package",
    "tasks.max": "3",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.storage.StringConverter",
    "connector.url": "jdbc connection url",
    "driver.name": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    "username": "myusername",
    "password": "mypassword",
    "table.name": "test_table",
    "database.name": "test"
  }
}
When I register it, I get the below details:
curl -X POST -H "Content-Type: application/json" --data @myjson.json http://service:8082/connectors
{"name":"TestCheck","config":{"topics":"topic1","connector.class":"ApplicationSinkTask Class package","tasks.max":"3","key.converter":"org.apache.kafka.connect.storage.StringConverter","value.converter":"org.apache.kafka.connect.storage.StringConverter","connector.url":"jdbc:sqlserver://datahubprod.database.windows.net:1433;","driver.name":"jdbc connection url","username":"myuser","password":"mypassword","table.name":"test_table","database.name":"test","name":"TestCheck"},"tasks":[{"connector":"TestCheck","task":0},{"connector":"TestCheck","task":1},{"connector":"TestCheck","task":2}],"type":null}
You can manage the connectors with the Kafka Connect REST API. There's a whole heap of commands, which you can find here.
The example given in the above link shows that you can retrieve all tasks for a given connector using the command:
$ curl localhost:8083/connectors/local-file-sink/tasks
[
  {
    "id": {
      "connector": "local-file-sink",
      "task": 0
    },
    "config": {
      "task.class": "org.apache.kafka.connect.file.FileStreamSinkTask",
      "topics": "connect-test",
      "file": "test.sink.txt"
    }
  }
]
You can use a language of your choice to send the request and import the JSON response into a variable/dictionary for further use, such as printing to a log. Here's a very simple example using Python which assigns the whole output to a variable:
import requests
import json

# GET the task list for a connector (the local-file-sink example above)
tasks_url = 'http://localhost:8083/connectors/local-file-sink/tasks'
p = requests.get(tasks_url)
data = p.json()  # a list of task entries, each with an "id" and a "config"
If you parse the data variable into a dictionary, you can access each element, i.e. the task id.
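The same thing is easy from the shell as well; a quick sketch (assuming the worker is on localhost:8083 and jq is installed) that prints just the task ids, ready to drop into a log line:

curl -s http://localhost:8083/connectors/local-file-sink/tasks | jq -c '.[].id'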
I hope this helps!

wso2am API manager 2.1 publisher change-lifecycle issue

I deployed API Manager 2.1.0 and set up the api-import-export-2.1.0 WAR file described here. After importing my API endpoint by uploading a zip file, the status is CREATED.
To actually publish the API, I am calling the Publisher's change-lifecycle API, but I am getting this exception:
TID: [-1234] [] [2017-07-06 11:11:57,289] ERROR
{org.wso2.carbon.apimgt.rest.api.util.exception.GlobalThrowableMapper}
- An Unknown exception has been captured by global exception mapper.
{org.wso2.carbon.apimgt.rest.api.util.exception.GlobalThrowableMapper}
java.lang.NoSuchMethodError:
org.wso2.carbon.apimgt.api.APIProvider.changeLifeCycleStatus(Lorg/wso2/carbon/apimgt/api/model/APIIdentifier;Ljava/lang/String;)Z
Any ideas on why?
I can get an access token (scope apim:api_view) and call
:9443/api/am/publisher/v0.10/apis
to list the APIs just fine.
I get a different access_token (for scope apim:api_publish) and then call
:9443/api/am/publisher/v0.10/apis/change-lifecycle
but get the above exception. Here's the example:
[root@localhost] ./publish.sh
View APIs (token dc0c1497-6c27-3a10-87d7-b2abc7190da5 scope: apim:api_view)
curl -k -s -H "Authorization: Bearer dc0c1497-6c27-3a10-87d7-b2abc7190da5" https://gw-node:9443/api/am/publisher/v0.10/apis
{
  "count": 1,
  "next": "",
  "previous": "",
  "list": [
    {
      "id": "d214f784-ee16-4067-9588-0898a948bb17",
      "name": "Health",
      "description": "health check",
      "context": "/api",
      "version": "v1",
      "provider": "admin",
      "status": "CREATED"
    }
  ]
}
Publish API (token b9a31369-8ea3-3bf2-ba3c-7f2a4883de7d scope: apim:api_publish)
curl -k -H "Authorization: Bearer b9a31369-8ea3-3bf2-ba3c-7f2a4883de7d" -X POST https://gw-node:9443/api/am/publisher/v0.10/apis/change-lifecycle?apiId=d214f784-ee16-4067-9588-0898a948bb17&action=Publish
{
  "code": 500,
  "message": "Internal server error",
  "description": "The server encountered an internal error. Please contact administrator.",
  "moreInfo": "",
  "error": []
}
Issue resolved. In APIM 2.1 the Publisher and Store API versions changed.
In APIM 2.0 I was using:
:9443/api/am/publisher/v0.10/apis
:9443/api/am/store/v0.10/apis
but in APIM 2.1 they are:
:9443/api/am/publisher/v0.11/apis
:9443/api/am/store/v0.11/apis
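With the new version in the path, the publish call from above becomes (a sketch reusing the apiId and host from the question; the bearer token is a placeholder for a fresh apim:api_publish-scope token):

curl -k -H "Authorization: Bearer <api_publish-scope-token>" -X POST \
  "https://gw-node:9443/api/am/publisher/v0.11/apis/change-lifecycle?apiId=d214f784-ee16-4067-9588-0898a948bb17&action=Publish"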