How to Connect Kafka to Postgres in Heroku

I have some Kafka consumers and producers running through my Kafka instance on my Heroku cluster. I'm looking to create a data sink connector to connect Kafka to PostgreSQL, to put data FROM Kafka TO my Heroku PostgreSQL instance. Pretty much like the Heroku docs, but one way.
I can't figure out the steps I need to take to achieve this.
The docs say to look at the Gitlab or Confluence Ecosystem page, but I can't find any mention of Postgres in either.
Looking in the Confluent Kafka Connectors library, there seems to be something from Debezium, but I'm not running Confluent.
The diagram in the Heroku docs mentions a JDBC connector. I found this Postgres JDBC driver - should I be using this?
I'm happy to create a consumer and update Postgres manually as the data comes in, if that's what's needed, but Kafka to Postgres must be a common enough interface that there should already be something out there to manage this?
I'm just looking for some high level help or examples to set me on the right path.
Thanks

You're almost there :)
Bear in mind that Kafka Connect is part of Apache Kafka, and you get a variety of connectors. Some (e.g. Debezium) are community projects from Red Hat, others (e.g. JDBC Sink) are community projects from Confluent.
The JDBC Sink connector will let you stream data from Kafka to a database with a JDBC driver - such as Postgres.
Here's an example configuration:
{
  "connector.class"    : "io.confluent.connect.jdbc.JdbcSinkConnector",
  "key.converter"      : "org.apache.kafka.connect.storage.StringConverter",
  "connection.url"     : "jdbc:postgresql://postgres:5432/",
  "connection.user"    : "postgres",
  "connection.password": "postgres",
  "auto.create"        : true,
  "auto.evolve"        : true,
  "insert.mode"        : "upsert",
  "pk.mode"            : "record_key",
  "pk.fields"          : "MESSAGE_KEY"
}
Here's a walkthrough and a couple of videos that you might find useful:
Kafka Connect in Action: JDBC Sink
ksqlDB and the Kafka Connect JDBC Sink
Do I actually need to install anything?
Kafka Connect comes with Apache Kafka. You need to install the JDBC connector.
Do I actually need to write any code?
No, just the configuration, similar to what I quoted above.
Can I just call the Connect endpoint, which comes with Kafka?
Once you've installed the connector, you run Kafka Connect (a binary that ships with Apache Kafka) and then use its REST API to create the connector from that configuration.
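For example - just a sketch, assuming Kafka Connect's REST API is on its default port of 8083, and with a made-up connector name and topic - you'd POST a payload like this to http://localhost:8083/connectors:
{
  "name": "jdbc-postgres-sink",
  "config": {
    "connector.class"    : "io.confluent.connect.jdbc.JdbcSinkConnector",
    "topics"             : "my_topic",
    "key.converter"      : "org.apache.kafka.connect.storage.StringConverter",
    "connection.url"     : "jdbc:postgresql://postgres:5432/",
    "connection.user"    : "postgres",
    "connection.password": "postgres",
    "auto.create"        : true,
    "auto.evolve"        : true,
    "insert.mode"        : "upsert",
    "pk.mode"            : "record_key",
    "pk.fields"          : "MESSAGE_KEY"
  }
}
You can then check how it's doing with a GET against /connectors/jdbc-postgres-sink/status.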

Related

PLC4X OPCUA - Kafka Connector

I want to use the PLC4X Connector (https://www.confluent.io/hub/apache/kafka-connect-plc4x-plc4j) to connect OPC UA (Prosys Simulation Server) with Kafka.
However, I can't really find any website that describes the Kafka Connect configuration options.
I tried to connect to the Prosys OPC UA simulation server and then stream the data to a Kafka topic.
I managed to simply send the data and consume it; however, I want to use a schema and the Avro converter.
The output from my Python sink connector looks like this - that seems a bit strange to me too:
b'Struct{fields=Struct{ff=-5.4470555688606E8,hhh=Sean Ray MD},timestamp=1651838599206}'
How can I use the PLC4X connector with the Avro converter and a Schema?
Thanks!
{
  "connector.class": "org.apache.plc4x.kafka.Plc4xSourceConnector",
  "default.topic": "plcTestTopic",
  "connectionString": "opcua.tcp://127.0.0.1:12345",
  "tasks.max": "2",
  "sources": "machineA",
  "sources.machineA.connectionString": "opcua:tcp://127.0.0.1:12345",
  "sources.machineA.jobReferences": "jobA",
  "jobs": "jobA",
  "jobs.jobA.interval": "5000",
  "jobs.jobA.fields": "job1,job2",
  "jobs.jobA.fields.job1": "ns=2;i=2",
  "jobs.jobA.fields.job2": "ns=2;i=3"
}
When using a schema with Avro and the Confluent Schema Registry, the following settings should be used. You can also choose to use different settings for keys and values.
key.converter=io.confluent.connect.avro.AvroConverter
value.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://127.0.0.1:8081
value.converter.schema.registry.url=http://127.0.0.1:8081
key.converter.schemas.enable=true
value.converter.schemas.enable=true
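If you'd rather set this per connector than for the whole Connect worker, the same settings can also go into the connector's own configuration alongside the PLC4X source settings shown in the question - a sketch, reusing the registry URL assumed above:
  "key.converter": "io.confluent.connect.avro.AvroConverter",
  "key.converter.schema.registry.url": "http://127.0.0.1:8081",
  "value.converter": "io.confluent.connect.avro.AvroConverter",
  "value.converter.schema.registry.url": "http://127.0.0.1:8081"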
Sample configuration files are also available in the PLC4X Github repository.
https://github.com/apache/plc4x/tree/develop/plc4j/integrations/apache-kafka/config

How to use Kafka with Neo4j community edition

I installed Neo4j and I can access the server. I can create nodes through Cypher.
Now I want to use it for data streams, but I'm not sure how to do so. I just started with Neo4j and I'm struggling with installing the 'Streams' plugin.
Any help is highly appreciated.
You should copy the jar files for the Neo4j Streams plugin directly into your /plugins folder and configure the connections to Kafka and Zookeeper, as well as other Neo4j property values, in the neo4j.conf file as described here. For example:
kafka.zookeeper.connect=zookeeper-host:2181
kafka.bootstrap.servers=kafka-host:9092
Alternatively, if you are looking only for a sink connection from Kafka (i.e. moving records from Kafka topics into Neo4j), you can also use Kafka Connect with the supported Kafka Connect Neo4j Sink. More at https://www.confluent.io/hub/neo4j/kafka-connect-neo4j
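If you stay with the plugin and want Neo4j to consume from a topic (a sink), that is also configured in neo4j.conf, with a Cypher template per topic. A minimal sketch, assuming a hypothetical topic my-topic whose messages carry id and name fields:
streams.sink.enabled=true
streams.sink.topic.cypher.my-topic=MERGE (p:Person {id: event.id}) SET p.name = event.name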

How to load data from Kafka into CrateDB?

From the following issue on the CrateDB GitHub page, it seems it is not possible, i.e. the Kafka protocol is not supported by CrateDB.
https://github.com/crate/crate/issues/7459
Is there another way to load data from Kafka into CrateDB?
Usually you'd use Kafka Connect for integrating Kafka with target (and source) systems, using the appropriate connector for the destination technology.
I can't find a Kafka Connect connector for CrateDB, but there is a JDBC sink connector for Kafka Connect, and a JDBC driver for CrateDB, so this may be worth a try.
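As a rough, untested sketch of what that could look like - I believe the CrateDB JDBC driver uses a jdbc:crate:// URL, and the topic and host names here are just placeholders:
{
  "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
  "topics"         : "my_topic",
  "connection.url" : "jdbc:crate://crate-host:5432/",
  "auto.create"    : true
}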
You can read more about Kafka Connect here, and see it in action in this blog series:
https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-1/
https://www.confluent.io/blog/the-simplest-useful-kafka-connect-data-pipeline-in-the-world-or-thereabouts-part-2/
https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-3/
Disclaimer: I work for Confluent, and I wrote the above blog posts.

Kafka connect to HBase

I am trying to use Kafka Connect with HBase, and there are no Confluent-supported connectors available for HBase, though there are some community connectors available. We are not really ready to take the risk in production without support for the connectors. Is there any other workaround for HBase connectivity from Kafka Connect? Can we use the Kafka JDBC connector for Kafka Connect?

Connect kafka to jdbc database

Can someone show me how to connect my Kafka server to a JDBC PostgreSQL database and retrieve data from it? All the tutorials on the internet got me more confused!
You've not said which tutorials you've tried, or in what way you got confused…but the short answer is to use the Kafka Connect JDBC Connector.
You can find examples here and here.
Another option to explore is a different Kafka Connect connector, Debezium, which implements proper Change Data Capture (CDC) against Postgres.
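As a sketch of a Debezium Postgres source configuration - the host, credentials, database name, and server name below are placeholders, and on newer Debezium versions I believe database.server.name has been replaced by topic.prefix:
{
  "connector.class"     : "io.debezium.connector.postgresql.PostgresConnector",
  "database.hostname"   : "postgres-host",
  "database.port"       : "5432",
  "database.user"       : "postgres",
  "database.password"   : "postgres",
  "database.dbname"     : "my_database",
  "database.server.name": "my_server"
}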