I'm trying to use AWS DocumentDB as a sink for storing data received from Kafka, and I was wondering whether the MongoDB Kafka connector works with DocumentDB, since the DocumentDB documentation mentions that it is compatible with MongoDB drivers.
https://www.mongodb.com/docs/kafka-connector/current/
https://aws.amazon.com/documentdb/
If this connector does not work, what is the alternative, other than building a custom Kafka Connect connector?
You can use the MongoDB Kafka connector with DocumentDB as a source as well as a sink.
The Kafka Connect worker (with the MongoDB Kafka connector installed) can be run in distributed mode on containers as well as on EC2 hosts.
You can refer to this blog post, which has step-by-step details:
https://aws.amazon.com/blogs/database/stream-data-with-amazon-documentdb-and-amazon-msk-using-a-kafka-connector/
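As a rough sketch of what a sink configuration for this setup might look like, the Python snippet below builds the JSON payload for the MongoDB sink connector. The connector class and the property names (connection.uri, database, collection, topics) are those of the MongoDB Kafka connector; the host name, credentials, topic, database and collection names are placeholders, and the tls=true and retryWrites=false options are assumptions based on DocumentDB's usual defaults (TLS enabled, retryable writes not supported).

```python
import json
from urllib.parse import quote_plus


def documentdb_sink_config(host, user, password, topic, database, collection):
    """Build a Kafka Connect payload for the MongoDB sink connector.

    Every argument is a placeholder supplied by the caller.
    """
    # Assumption: TLS is enabled on the cluster and retryable writes are
    # not supported, so both options are set explicitly in the URI.
    uri = (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}@{host}:27017/"
        "?tls=true&retryWrites=false"
    )
    return {
        "name": f"documentdb-sink-{topic}",  # hypothetical connector name
        "config": {
            "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
            "connection.uri": uri,
            "topics": topic,
            "database": database,
            "collection": collection,
            "tasks.max": "1",
        },
    }


payload = documentdb_sink_config(
    host="docdb-cluster.example.us-east-1.docdb.amazonaws.com",  # placeholder
    user="connect_user",
    password="p@ss/word",
    topic="orders",
    database="shop",
    collection="orders",
)
print(json.dumps(payload, indent=2))
```

With TLS enabled, the worker's JVM will most likely also need a truststore containing the Amazon CA certificate; the blog post above covers that part of the setup.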
We are planning to create an MSK cluster in AWS with MongoDB as the source. We do not see any pre-defined connectors for connecting to MongoDB as a source.
How do we set up a connector?
Do we need to build a connector from scratch, or are there built-in connectors available?
You can use MSK Connect to install and provision whatever connectors you want via plugins.
This includes Debezium's MongoDB connector, or the source connector provided by MongoDB themselves, both of which are available to download; I'm not sure what you mean by "pre-defined".
After this, consult the official documentation for the connector of your choice for its own config properties, and use HTTP requests to submit the configuration to the cluster.
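For illustration, here is a minimal sketch of a source configuration for the connector provided by MongoDB, rendered as flat key=value lines. The connector class and property names (connection.uri, database, collection, topic.prefix) come from the MongoDB Kafka connector; the URI, database, collection and prefix values are placeholders, and the helper name is hypothetical.

```python
def mongo_source_properties(connection_uri, database, collection, topic_prefix):
    """Render a MongoDB source connector config as key=value lines."""
    config = {
        "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
        "connection.uri": connection_uri,
        "database": database,
        "collection": collection,
        # Change events are published to <topic.prefix>.<database>.<collection>.
        "topic.prefix": topic_prefix,
        "tasks.max": "1",
    }
    return "\n".join(f"{key}={value}" for key, value in config.items())


source_properties = mongo_source_properties(
    connection_uri="mongodb://mongo.example.internal:27017",  # placeholder
    database="shop",
    collection="orders",
    topic_prefix="mongo",
)
print(source_properties)
```

The same keys and values can be wrapped in a JSON object instead if the worker you use expects the configuration in that form.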
I am using Kafka on MSK in AWS, so we don't have a native Kafka Connect with all the required connectors, as on Confluent.
I work with the Kafka MongoDB connector, and I want to find a way to push the connector jar to an instance of the MSK cluster.
The path to which the jar will be pushed is the plugin.path defined in the properties used by the connector.
Is there any way to do this, please?
MSK doesn't give you a hosted Kafka Connect worker. You'd need to provision and run this yourself, e.g. on EC2. This worker would then connect to your Kafka cluster (MSK in this case).
To be clear: MSK is only the hosted Kafka brokers (and Zookeeper). It does not include Kafka Connect, which is what you need in order to run connectors.
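To make the self-managed option more concrete, the sketch below generates a properties file for a Kafka Connect worker in distributed mode. The property names are standard Kafka Connect worker settings; the broker address, group id, topic names and plugin directory are placeholders, and the plaintext listener on port 9092 is an assumption (a TLS or IAM-secured MSK cluster would need additional security settings).

```python
def connect_worker_properties(bootstrap_servers, plugin_path, group_id="connect-cluster"):
    """Render a minimal distributed-mode Kafka Connect worker config."""
    props = {
        # The MSK bootstrap brokers the worker connects to.
        "bootstrap.servers": bootstrap_servers,
        # Workers sharing the same group.id form one Connect cluster.
        "group.id": group_id,
        "key.converter": "org.apache.kafka.connect.json.JsonConverter",
        "value.converter": "org.apache.kafka.connect.json.JsonConverter",
        # Internal topics where Connect stores configs, offsets and status.
        "config.storage.topic": "connect-configs",
        "offset.storage.topic": "connect-offsets",
        "status.storage.topic": "connect-status",
        # Directory on the worker host that holds the connector jars.
        "plugin.path": plugin_path,
    }
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


worker_properties = connect_worker_properties(
    bootstrap_servers="b-1.example.kafka.us-east-1.amazonaws.com:9092",  # placeholder
    plugin_path="/opt/connectors",  # placeholder directory
)
print(worker_properties)
```

The output can be saved to a file and passed to bin/connect-distributed.sh from the Apache Kafka distribution, with the MongoDB connector jar placed in a directory under the plugin.path on the worker host (not on the MSK brokers).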
I want to connect MongoDB to Kafka using Kafka Connect,
but I am using Kafka on MSK, so no Confluent Kafka connectors can be used. Do you have any idea how to do it, please?
Thanks.
You would need to run the Kafka Connect worker yourself, and install the appropriate MongoDB connector in the worker. You can find a list of connectors at https://hub.confluent.io.
How do I use Kafka Connect adapters with Amazon MSK?
According to the AWS documentation, MSK supports Kafka Connect, but how to set up adapters and use them is not documented.
Edit Oct 2021: MSK Connect has been launched, see https://aws.amazon.com/blogs/aws/introducing-amazon-msk-connect-stream-data-to-and-from-your-apache-kafka-clusters-using-managed-connectors/
AFAIK Amazon MSK does not provide managed connectors, so you have to run them yourself. This is done by running the Kafka Connect worker process (a JVM) and then providing it one or more connector configurations to run.
From the point of view of a Kafka Connect worker it just needs a Kafka cluster to connect to; it shouldn't matter whether it's MSK or on-premises, since it's ultimately 'just' a consumer/producer underneath.
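As a sketch of what "providing it a connector configuration" can look like with a self-managed worker, the snippet below builds the HTTP request for the Kafka Connect REST API (POST /connectors with a JSON body containing name and config). The worker URL assumes the default REST port 8083 on localhost, and the connector name and config values are placeholders; the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_create_connector_request(worker_url, name, config):
    """Build (but do not send) a POST /connectors request for Kafka Connect."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{worker_url.rstrip('/')}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_connector_request(
    worker_url="http://localhost:8083",  # assumed default REST listener
    name="example-connector",  # placeholder name
    config={
        "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
        "topics": "orders",
        "connection.uri": "mongodb://mongo.example.internal:27017",
        "database": "shop",
        "collection": "orders",
    },
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen(request) against a running worker would submit the connector; the same endpoint works whether the brokers behind the worker are MSK or on-premises.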
You can see more, including a live demo, here: https://rmoff.dev/bbuzz19-kafka-connect
For an example of configuring Kafka Connect to use a cloud-hosted Kafka platform (in this case, Confluent Cloud), see this article.
If you are interested in managed connectors in the Cloud, check out the connectors that are provided in Confluent Cloud.
Disclaimer: I work for Confluent :)
AWS now supports MSK Connect, a new feature of the MSK service based on Kafka Connect that allows you to deploy managed connectors built for Kafka Connect.
Check the announcement here: https://aws.amazon.com/blogs/aws/introducing-amazon-msk-connect-stream-data-to-and-from-your-apache-kafka-clusters-using-managed-connectors/
There are two aspects to this:
Kafka Connect is a framework which should be deployed separately from the Kafka brokers. MSK only provides Kafka brokers. If you want to use Kafka Connect with MSK, you would need to use EC2 instances and deploy the Kafka binaries. The Kafka Connect framework is bundled along with Kafka.
Coming to connectors: if you do not have a Confluent subscription or similar, I am afraid your choices get very limited. Having said that, you can always write your own connectors. Writing new connectors is not that difficult; you can apply your business-specific logic and be on your way quite quickly.
I am trying to use Kafka Connect with HBase, and there are no Confluent-supported connectors available for HBase, though there are some community connectors available. We are not really ready to take the risk of running connectors in production without support. Is there any other workaround for HBase connectivity from Kafka Connect? Can we use the Kafka JDBC connector for Kafka Connect?