PLC4X OPC UA - Kafka Connector - apache-kafka

I want to use the PLC4X Connector (https://www.confluent.io/hub/apache/kafka-connect-plc4x-plc4j) to connect OPC UA (Prosys Simulation Server) with Kafka.
However, I can't find any documentation that describes the Kafka Connect configuration options.
I tried to connect to the Prosys OPC UA Simulation Server and then stream the data to a Kafka topic.
I managed to simply send the data and consume it; however, I want to use a schema and the Avro converter.
The output from my Python sink connector looks like this, which also seems a bit strange to me:
b'Struct{fields=Struct{ff=-5.4470555688606E8,hhh=Sean Ray MD},timestamp=1651838599206}'
How can I use the PLC4X connector with the Avro converter and a Schema?
Thanks!
{
  "connector.class": "org.apache.plc4x.kafka.Plc4xSourceConnector",
  "default.topic": "plcTestTopic",
  "connectionString": "opcua:tcp://127.0.0.1:12345",
  "tasks.max": "2",
  "sources": "machineA",
  "sources.machineA.connectionString": "opcua:tcp://127.0.0.1:12345",
  "sources.machineA.jobReferences": "jobA",
  "jobs": "jobA",
  "jobs.jobA.interval": "5000",
  "jobs.jobA.fields": "job1,job2",
  "jobs.jobA.fields.job1": "ns=2;i=2",
  "jobs.jobA.fields.job2": "ns=2;i=3"
}

When using a schema with Avro and the Confluent Schema Registry, the following settings should be used. You can also choose to use different settings for keys and values.
key.converter=io.confluent.connect.avro.AvroConverter
value.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://127.0.0.1:8081
value.converter.schema.registry.url=http://127.0.0.1:8081
key.converter.schemas.enable=true
value.converter.schemas.enable=true
Sample configuration files are also available in the PLC4X Github repository.
https://github.com/apache/plc4x/tree/develop/plc4j/integrations/apache-kafka/config
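For example, a complete source connector configuration that combines the question's settings with the Avro converter could look roughly like the sketch below. This is untested; the topic name, OPC UA address, and Schema Registry URL are placeholders taken from the question, and the converter overrides are the standard per-connector Kafka Connect properties.
{
  "connector.class": "org.apache.plc4x.kafka.Plc4xSourceConnector",
  "default.topic": "plcTestTopic",
  "tasks.max": "1",
  "sources": "machineA",
  "sources.machineA.connectionString": "opcua:tcp://127.0.0.1:12345",
  "sources.machineA.jobReferences": "jobA",
  "jobs": "jobA",
  "jobs.jobA.interval": "5000",
  "jobs.jobA.fields": "job1,job2",
  "jobs.jobA.fields.job1": "ns=2;i=2",
  "jobs.jobA.fields.job2": "ns=2;i=3",
  "key.converter": "io.confluent.connect.avro.AvroConverter",
  "key.converter.schema.registry.url": "http://127.0.0.1:8081",
  "value.converter": "io.confluent.connect.avro.AvroConverter",
  "value.converter.schema.registry.url": "http://127.0.0.1:8081"
}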

Related

"Production grade" File Sink connector in kafka connect

I am using a file system source connector to ingest data into Kafka, but I am not able to find any file sink connector. I checked pulse and spooldir; both only offer source connectors. I tried the FileStream sink connector, but as mentioned on the official website it is not production grade.
Could anyone please suggest a solution or a suitable connector?
Note: I don't want to use a consumer application.

Apache NiFi to/from Confluent Cloud

I'm trying to publish custom DB data (derived from Microsoft SQL CDC tables, with a join on other tables; how it is derived is a topic for another day) to a Kafka cluster.
I'm able to publish and consume messages from Apache NiFi to/from Apache Kafka.
But I'm unable to publish messages from Apache NiFi to Kafka in Confluent Cloud.
Is it possible to publish/consume messages from Apache NiFi (server-A) to Confluent Cloud using the API Key that's created there?
If yes, what are the corresponding properties in Apache NiFi's PublishKafkaRecord and ConsumeKafkaRecord processors?
If no, please share any other idea to overcome the constraint.
Yes, NiFi uses the plain Kafka Clients Java API; it can work with any Kafka environment.
Confluent Cloud gives you all the client properties you will need, such as SASL configs for username + password.
Using the PublishKafka_2_6 processor as an example:
Obviously, "Kafka Brokers" is the bootstrap servers list; then you have "Username" and "Password" settings for the SASL connection.
Set "Security Protocol" to SASL_SSL and "SASL Mechanism" to PLAIN.
"Delivery Guarantee" sets the producer acks.
For any extra properties, use the + button above the properties to set "Dynamic Properties" (see the NiFi documentation).
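As a rough sketch, the processor settings for Confluent Cloud would look something like the following. The property names follow the PublishKafka_2_6 processor; the broker endpoint is a placeholder, and the Confluent Cloud API key and secret go into the Username and Password fields.
Kafka Brokers       = pkc-xxxxx.us-east-1.aws.confluent.cloud:9092
Security Protocol   = SASL_SSL
SASL Mechanism      = PLAIN
Username            = <Confluent Cloud API key>
Password            = <Confluent Cloud API secret>
Delivery Guarantee  = Guarantee Replicated Delivery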
share any other idea to overcome the constraint
Use Debezium (Kafka Connect) instead.
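For reference, a Debezium SQL Server source connector configuration is sketched below. This is only a sketch: the host, credentials, database, table, and topic names are placeholders, and the property names shown are the Debezium 2.x ones (earlier versions use database.dbname, database.server.name, and database.history.* instead).
{
  "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
  "database.hostname": "mssql-host",
  "database.port": "1433",
  "database.user": "cdc_user",
  "database.password": "cdc_password",
  "database.names": "mydb",
  "topic.prefix": "mssql",
  "table.include.list": "dbo.my_table",
  "schema.history.internal.kafka.bootstrap.servers": "broker:9092",
  "schema.history.internal.kafka.topic": "schema-changes.mydb"
}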

Sending Avro messages to Kafka

I have an app that periodically produces an array of messages in raw JSON. I was able to convert that to Avro using avro-tools. I did that because I needed the messages to include a schema due to the limitations of the Kafka Connect JDBC sink. I can open this file in Notepad++ and see that it includes the schema and a few lines of data.
Now I would like to send this to my central Kafka broker and then use the Kafka Connect JDBC sink to put the data in a database. I am having a hard time understanding how I should send these Avro files to my Kafka broker. Do I need a schema registry for my purposes? I believe kafkacat does not support Avro, so I suppose I will have to stick with the kafka-console-producer.sh that comes with the Kafka installation (please correct me if I am wrong).
My question is: can someone please share the steps to produce my Avro file to a Kafka broker without getting Confluent involved?
Thanks,
To use the Kafka Connect JDBC Sink, your data needs an explicit schema. The converter that you specify in your connector configuration determines where the schema is held. This can either be embedded within the JSON message (org.apache.kafka.connect.json.JsonConverter with schemas.enable=true) or held in the Schema Registry (one of io.confluent.connect.avro.AvroConverter, io.confluent.connect.protobuf.ProtobufConverter, or io.confluent.connect.json.JsonSchemaConverter).
To learn more about this see https://www.confluent.io/blog/kafka-connect-deep-dive-converters-serialization-explained
To write an Avro message to Kafka you should serialise it as Avro and store the schema in the Schema Registry. There is a Go client library with examples that you can use.
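For example, if you do have the Schema Registry available, a rough sketch with the kafka-avro-console-producer tool (which ships with the Confluent Platform; older versions use --broker-list instead of --bootstrap-server, and the topic name and schema below are placeholders) looks like this:
kafka-avro-console-producer \
  --bootstrap-server localhost:9092 \
  --topic my_topic \
  --property schema.registry.url=http://localhost:8081 \
  --property value.schema='{"type":"record","name":"MyRecord","fields":[{"name":"id","type":"long"},{"name":"name","type":"string"}]}'
You then type one JSON record per line, e.g. {"id": 1, "name": "test"}, and each line is serialised as Avro and produced to the topic with the schema registered for you.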
without getting Confluent involved
It's not entirely clear what you mean by this. The Kafka Connect JDBC Sink is written by Confluent. The best way to manage schemas is with the Schema Registry. If you don't want to use the Schema Registry then you can embed the schema in your JSON message, but it's a suboptimal way of doing things.
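To illustrate that embedded-schema option, a message for org.apache.kafka.connect.json.JsonConverter with schemas.enable=true looks roughly like this (the field names and types here are just an example):
{
  "schema": {
    "type": "struct",
    "name": "example.Record",
    "optional": false,
    "fields": [
      { "field": "id",   "type": "int64",  "optional": false },
      { "field": "name", "type": "string", "optional": true }
    ]
  },
  "payload": {
    "id": 42,
    "name": "example"
  }
}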

How to Connect Kafka to Postgres in Heroku

I have some Kafka consumers and producers running through my Kafka instance on my Heroku cluster. I'm looking to create a data sink connector to connect Kafka to PostgreSQL, to put data FROM Kafka TO my Heroku PostgreSQL instance, pretty much like the Heroku docs describe but one way only.
I can't figure out the steps I need to take to achieve this.
The docs say to look at the Gitlab or Confluence Ecosystem page, but I can't find any mention of Postgres in these.
Looking in the Confluent Kafka connectors library there seems to be something from Debezium, but I'm not running Confluent.
The diagram in the Heroku docs mentions a JDBC connector. I found this Postgres JDBC driver; should I be using that?
I'm happy to create a consumer and update Postgres manually as the data comes in if that's what's needed, but I feel that Kafka to Postgres must be a common enough interface that there should be something out there to manage this.
I'm just looking for some high level help or examples to set me on the right path.
Thanks
You're almost there :)
Bear in mind that Kafka Connect is part of Apache Kafka, and you get a variety of connectors. Some (e.g. Debezium) are community projects from Red Hat, others (e.g. JDBC Sink) are community projects from Confluent.
The JDBC Sink connector will let you stream data from Kafka to a database with a JDBC driver - such as Postgres.
Here's an example configuration:
{
  "connector.class"    : "io.confluent.connect.jdbc.JdbcSinkConnector",
  "key.converter"      : "org.apache.kafka.connect.storage.StringConverter",
  "connection.url"     : "jdbc:postgresql://postgres:5432/",
  "connection.user"    : "postgres",
  "connection.password": "postgres",
  "auto.create"        : true,
  "auto.evolve"        : true,
  "insert.mode"        : "upsert",
  "pk.mode"            : "record_key",
  "pk.fields"          : "MESSAGE_KEY"
}
Here's a walkthrough and a couple of videos that you might find useful:
Kafka Connect in Action: JDBC Sink
ksqlDB and the Kafka Connect JDBC Sink
Do I actually need to install anything?
Kafka Connect comes with Apache Kafka. You need to install the JDBC connector.
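For example, with the Confluent Hub client (assuming it is installed; you can also download the connector ZIP manually and add it to the worker's plugin.path):
confluent-hub install confluentinc/kafka-connect-jdbc:latest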
Do I actually need to write any code?
No, just the configuration, similar to what I quoted above.
Can I just call the Connect endpoint, which comes with Kafka?
Once you've installed the connector, you run Kafka Connect (a binary that ships with Apache Kafka) and then use the REST endpoint to create the connector using the configuration.
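As a sketch, assuming a Connect worker running with its REST interface on the default port 8083 (the connector name and topic below are placeholders):
curl -X POST -H "Content-Type: application/json" http://localhost:8083/connectors \
  --data '{
    "name": "jdbc-sink-postgres",
    "config": {
      "connector.class"    : "io.confluent.connect.jdbc.JdbcSinkConnector",
      "topics"             : "my_topic",
      "key.converter"      : "org.apache.kafka.connect.storage.StringConverter",
      "connection.url"     : "jdbc:postgresql://postgres:5432/",
      "connection.user"    : "postgres",
      "connection.password": "postgres",
      "auto.create"        : true,
      "insert.mode"        : "upsert",
      "pk.mode"            : "record_key",
      "pk.fields"          : "MESSAGE_KEY"
    }
  }'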

How to load data from Kafka into CrateDB?

From the following issue on the CrateDB GitHub page it seems this is not possible, i.e., the Kafka protocol is not supported by CrateDB:
https://github.com/crate/crate/issues/7459
Is there another way to load data from Kafka into CrateDB?
Usually you'd use Kafka Connect for integrating Kafka with target (and source) systems, using the appropriate connector for the destination technology.
I can't find a Kafka Connect connector for CrateDB, but there is a JDBC sink connector for Kafka Connect, and a JDBC driver for CrateDB, so this may be worth a try.
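As an untested sketch of that idea (assuming the CrateDB JDBC driver is on the Connect worker's classpath and that its jdbc:crate:// URL format applies; the host, topic, and user here are placeholders):
{
  "connector.class" : "io.confluent.connect.jdbc.JdbcSinkConnector",
  "topics"          : "my_topic",
  "connection.url"  : "jdbc:crate://crate-host:5432/",
  "connection.user" : "crate",
  "auto.create"     : true,
  "insert.mode"     : "insert"
}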
You can read more about Kafka Connect here, and see it in action in this blog series:
https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-1/
https://www.confluent.io/blog/the-simplest-useful-kafka-connect-data-pipeline-in-the-world-or-thereabouts-part-2/
https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-3/
Disclaimer: I work for Confluent, and I wrote the above blog posts.