How to connect an already set up Kafka cluster to MongoDB?

How can I connect Kafka events to a MongoDB sink?
The resources I found online use Confluent, which creates a cluster for you. I couldn't find how to connect my already existing cluster.

You need to install the MongoDB connector into a directory listed in the plugin.path setting of your Connect worker properties file, then start Kafka Connect using one of the bin/connect-* scripts in your Kafka installation.
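Once the worker is running with the connector on its plugin.path, the sink itself is just a connector configuration. A minimal sketch of one, assuming the official MongoDB Kafka connector; the connector name, topic, URI, database and collection below are made-up placeholders, so check the connector's documentation before relying on them:

```python
import json


def mongo_sink_config(name, topics, connection_uri, database, collection):
    """Build a connector definition for the MongoDB sink connector.

    Returns the {"name": ..., "config": ...} shape that Kafka Connect
    accepts for a new connector.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
            # Kafka Connect expects a comma-separated list of topic names.
            "topics": ",".join(topics),
            "connection.uri": connection_uri,
            "database": database,
            "collection": collection,
        },
    }


# Hypothetical values, for illustration only.
sink = mongo_sink_config(
    name="mongo-sink",
    topics=["events"],
    connection_uri="mongodb://mongo.example.internal:27017",
    database="analytics",
    collection="events",
)
print(json.dumps(sink, indent=2))
```

In standalone mode the same keys go into a connector .properties file instead of JSON.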

Related

AWS MSK Connect with MongoDB as the Source

We are planning to create an MSK cluster in AWS with MongoDB as the source. We do not see any pre-defined connectors for connecting to MongoDB as a source.
How do we set up a connector?
Do we need to write a connector from scratch, or are there built-in connectors available?
You can use MSK Connect to install and provision whatever connectors you want via plugins.
This includes Debezium's MongoDB connector and the source connector provided by MongoDB themselves, both of which are available to download. I'm not sure what you mean by "pre-defined".
After this, consult the official documentation for the connector of your choice for its own config properties, and use HTTP requests to submit the configuration to the cluster.
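As a rough sketch of the "submit it over HTTP" step: the snippet below builds a POST to the Kafka Connect REST API's /connectors endpoint. It assumes a Connect worker whose REST API is reachable at http://localhost:8083 (as far as I know, MSK Connect takes the same connector configuration through the AWS console or API rather than this endpoint). The host names, connector name, database and collection are placeholders, and the property names are those of MongoDB's own source connector.

```python
import json
import urllib.request


def build_create_connector_request(connect_url, name, config):
    """Build (but do not send) a POST /connectors request for Kafka Connect."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=connect_url.rstrip("/") + "/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical source connector settings; adjust to your deployment.
source_config = {
    "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
    "connection.uri": "mongodb://mongo.example.internal:27017",
    "database": "inventory",
    "collection": "orders",
    "topic.prefix": "mongo",
}

request = build_create_connector_request(
    "http://localhost:8083", "mongo-source", source_config
)
print(request.get_method(), request.full_url)

# To actually submit it, uncomment the next line (needs a running worker):
# urllib.request.urlopen(request)
```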

Can Kafka Connect consume data from a separate kerberized Kafka instance and then route to Splunk?

My pipeline is:
Kerberized Kafka --> Logstash (hosted on a different server) --> Splunk.
Can I replace the Logstash component with Kafka Connect?
Could you point me to a resource/guide on using kerberized Kafka as a source for Kafka Connect (which is hosted separately)?
From the documentation, what I understood is that if Kafka Connect is hosted on the same cluster as that of Kafka, that's quite possible. But I don't have that option right now, as our Kafka cluster is multi-tenant and hence not approved for additional processes on the cluster.
Kerberos keytabs aren't commonly machine/JVM specific, so yes, Kafka Connect can be configured very similarly to Logstash, since both are JVM processes using the native Kafka protocol.
You shouldn't run Connect on the brokers anyway.
If you can't add Kafka Connect to an existing Kafka cluster, you will have to spin up a separate Kafka Connect deployment (a distributed cluster or a standalone worker).
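To make the Kerberos part concrete, here is a sketch that generates the SASL/GSSAPI client settings for a separately hosted Connect worker. The broker address, keytab path and principal are invented for the example, and SASL_PLAINTEXT is an assumption (use SASL_SSL if your brokers require TLS). The settings are repeated with the producer. and consumer. prefixes because the worker's connectors use their own clients.

```python
def kerberos_worker_properties(bootstrap_servers, keytab_path, principal,
                               security_protocol="SASL_PLAINTEXT",
                               service_name="kafka"):
    """Return Connect worker properties for a Kerberos-secured Kafka cluster."""
    jaas = (
        "com.sun.security.auth.module.Krb5LoginModule required "
        "useKeyTab=true storeKey=true "
        f'keyTab="{keytab_path}" principal="{principal}";'
    )
    sasl = {
        "security.protocol": security_protocol,
        "sasl.mechanism": "GSSAPI",
        "sasl.kerberos.service.name": service_name,
        "sasl.jaas.config": jaas,
    }
    props = {"bootstrap.servers": bootstrap_servers}
    # Worker-level settings, then the same for connector producers/consumers.
    for prefix in ("", "producer.", "consumer."):
        for key, value in sasl.items():
            props[prefix + key] = value
    return props


def render_properties(props):
    """Render a dict as Java-style key=value properties text."""
    return "".join(f"{key}={value}\n" for key, value in props.items())


# Hypothetical broker, keytab and principal.
print(render_properties(kerberos_worker_properties(
    "broker1.example.internal:9092",
    "/etc/security/keytabs/connect.keytab",
    "connect@EXAMPLE.COM",
)))
```

The output can be appended to connect-distributed.properties or connect-standalone.properties on the Connect host, alongside a krb5.conf that points at your KDC.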

Kafka Connect Mongo on Kafka MSK

I am using Kafka MSK in AWS, so we don't have a native Kafka Connect with all the required connectors as on Confluent.
I am working with the Kafka Mongo connector and I want to find a way to push the connector jar to an instance of the Kafka MSK cluster.
The path to which the jar will be pushed is the plugin.path defined in the Connect properties.
Is there any way to do this, please?
MSK doesn't give you a hosted Kafka Connect worker. You'd need to provision and run this yourself, e.g. on EC2. This worker would then connect to your Kafka cluster (MSK in this case).
To be clear: MSK is only the hosted Kafka brokers (and Zookeeper). It does not include Kafka Connect, which is what you need in order to run connectors.
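A self-managed worker mostly comes down to a properties file on the EC2 host. The sketch below writes out a minimal distributed-mode configuration; the broker addresses, plugin directory and topic names are placeholders, and it assumes a plaintext listener (a cluster using TLS or IAM auth needs extra security settings that are not shown).

```python
def distributed_worker_properties(bootstrap_servers, plugin_path,
                                  group_id="connect-cluster"):
    """Render a minimal connect-distributed.properties as text."""
    props = {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        "key.converter": "org.apache.kafka.connect.json.JsonConverter",
        "value.converter": "org.apache.kafka.connect.json.JsonConverter",
        # Internal topics where the worker keeps offsets, configs and status.
        "offset.storage.topic": "connect-offsets",
        "config.storage.topic": "connect-configs",
        "status.storage.topic": "connect-status",
        # Directory that contains the unpacked connector jars.
        "plugin.path": plugin_path,
    }
    return "".join(f"{key}={value}\n" for key, value in props.items())


# Hypothetical MSK bootstrap brokers and plugin directory.
print(distributed_worker_properties(
    "b-1.example.kafka.amazonaws.com:9092,b-2.example.kafka.amazonaws.com:9092",
    "/opt/connect/plugins",
))
```

Save the output as a properties file, put the MongoDB connector jar under the plugin directory, and start the worker with bin/connect-distributed.sh and that file.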

Connect Kafka MSK to MongoDB

I want to connect MongoDB to Kafka using Kafka Connect,
but I am using Kafka MSK, so the Confluent-hosted Kafka connectors can't be used. Do you have any idea how to do it, please?
Thanks.
You would need to run the Kafka Connect worker yourself, and install the appropriate MongoDB connector in the worker. You can find a list of connectors at https://hub.confluent.io.

Kafka and Kafka Connect deployment environment

If I already have Kafka running on premises, is Kafka Connect just a configuration on top of my existing Kafka, or does Kafka Connect require its own server/environment separate from that of my existing Kafka?
Kafka Connect is part of Apache Kafka, but it runs as a separate process, called a Kafka Connect Worker. Except in a sandbox environment, you would usually deploy it on a separate machine/node from your Kafka brokers.
This diagram shows conceptually how it runs, separate from your brokers:
You can run Kafka Connect on a single node, or as part of a cluster (for throughput and redundancy).
You can read more here about installation and configuration and architecture of Kafka Connect.
Kafka Connect has its own configuration, which points at your existing brokers through the bootstrap.servers setting.
For Kafka Connect you can choose between a standalone server or distributed connect servers and you'll have to update the corresponding properties file to point to your currently running Kafka server(s).
Look under {kafka-root}/config and you'll see the connect-standalone.properties and connect-distributed.properties files.
Update whichever of the two matches the mode you need.
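The edit itself is usually a one-line change to bootstrap.servers. As an illustration only, here is a small helper that rewrites a key in properties-style text; the sample file contents and broker host names are invented.

```python
def set_property(properties_text, key, value):
    """Replace key=value in properties-style text, or append it if missing."""
    lines = properties_text.splitlines()
    updated = False
    for index, line in enumerate(lines):
        stripped = line.strip()
        # Skip comments and anything that is not a key=value pair.
        if stripped.startswith("#") or "=" not in stripped:
            continue
        if stripped.split("=", 1)[0].strip() == key:
            lines[index] = f"{key}={value}"
            updated = True
    if not updated:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Hypothetical starting file and broker addresses.
sample = (
    "# sample worker file\n"
    "bootstrap.servers=localhost:9092\n"
    "group.id=connect-cluster\n"
)
print(set_property(
    sample,
    "bootstrap.servers",
    "kafka1.example.internal:9092,kafka2.example.internal:9092",
))
```

After the file points at your running brokers, start the worker with the matching script: bin/connect-standalone.sh or bin/connect-distributed.sh.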