How can I connect JBoss Fuse to a database? - mongodb

I would like to directly connect a database to Fuse. My goal is to save all messages received by one or more topics inside a database (MySQL, postgreSQL, MongoDB,...).
I don't need a failover database; basically I would like to "subscribe" a database to topics and save all messages for future analysis.
What's the easiest way to do it?

At a high level, the easiest thing to do would be to set up a Camel route that consumes from the topic using the JMS component (or the ActiveMQ component if you're using ActiveMQ as your broker), and then writes the message body into the database using the JDBC component. You could use PIDs to control the topic (or topics) that are consumed.
To create the JDBC connection, you could either set that up as part of the bundle containing the Camel route (via Blueprint/Spring), or you could create a separate bundle that creates a JDBC connection/datasource via Blueprint/Spring and exposes it as an OSGi service for the Camel route.
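As a rough sketch, such a route in the Java DSL could look like the following; the topic name, the "myDataSource" registry entry, and the "messages" table are placeholders, and the naive string substitution would need proper escaping or a prepared statement (e.g. via the SQL component) in real code.

import org.apache.camel.builder.RouteBuilder;

public class TopicToDatabaseRoute extends RouteBuilder {

    @Override
    public void configure() {
        // Consume from the JMS/ActiveMQ topic.
        from("activemq:topic:incoming.events")
            // Turn the message text into an INSERT statement; the JDBC component
            // executes whatever SQL it finds in the message body.
            .setBody(simple("INSERT INTO messages (payload) VALUES ('${bodyAs(String)}')"))
            // "myDataSource" must be a javax.sql.DataSource bound in the registry,
            // e.g. exposed as an OSGi service from a Blueprint bundle.
            .to("jdbc:myDataSource");
    }
}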

Related

Custom Connector for Apache Kafka

I am looking to write a custom connector for Apache Kafka to connect to a SQL database to get CDC data. I would like to write a custom connector so I can connect to multiple databases using one connector, because all the marketplace connectors only offer one database per connector.
First question: Is it possible to connect to multiple databases using one custom connector? Also, in that custom connector, can I define which topics the data should go to?
Second question: Can I write a custom connector in .NET, or does it have to be Java? Is there an example I can look at of a custom CDC connector for a database in .NET?
There are no .NET examples. The Kafka Connect API is Java only, and not specific to Confluent.
Source is here - https://github.com/apache/kafka/tree/trunk/connect
Dependency here - https://search.maven.org/artifact/org.apache.kafka/connect-api
looking to write a custom connector ... to connect to SQL database to get CDC data
You could extend or contribute to Debezium, if you really wanted this feature.
connect to multiple databases using one custom connector
If you mean database servers, then not really, no. Your URL would have to be unique per connector task, and there isn't an API to map a task number to a config value. If you mean one server and multiple database schemas, then I also don't think that is really possible to properly "distribute" within a single connector with multiple tasks (which is why the database.names config in Debezium currently supports only one name).
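To make the shape of the (Java-only) Connect API concrete, here is a minimal, hypothetical source connector sketch; the class names, the "tables" / "connection.url" config keys, and the round-robin fan-out are illustrative only, and the task does no real CDC work. Note how the fan-out happens across the tasks of one connector config, which all share a single connection URL.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.Task;
import org.apache.kafka.connect.source.SourceConnector;
import org.apache.kafka.connect.source.SourceRecord;
import org.apache.kafka.connect.source.SourceTask;

public class MultiTableSourceConnector extends SourceConnector {

    private Map<String, String> props;

    @Override
    public void start(Map<String, String> props) {
        this.props = props;
    }

    @Override
    public Class<? extends Task> taskClass() {
        return MultiTableSourceTask.class;
    }

    @Override
    public List<Map<String, String>> taskConfigs(int maxTasks) {
        // Fan-out happens inside one connector config: the tables behind a single
        // connection.url are split round-robin across tasks. There is no hook to
        // give each task its own server URL, which is the limitation described above.
        List<String> tables = Arrays.asList(props.getOrDefault("tables", "").split(","));
        int groups = Math.min(maxTasks, tables.size());
        List<Map<String, String>> taskConfigs = new ArrayList<>();
        for (int i = 0; i < groups; i++) {
            taskConfigs.add(new HashMap<>(props));
            taskConfigs.get(i).put("task.tables", "");
        }
        for (int i = 0; i < tables.size(); i++) {
            Map<String, String> config = taskConfigs.get(i % groups);
            String assigned = config.get("task.tables");
            config.put("task.tables", assigned.isEmpty() ? tables.get(i) : assigned + "," + tables.get(i));
        }
        return taskConfigs;
    }

    @Override
    public void stop() {
    }

    @Override
    public ConfigDef config() {
        return new ConfigDef()
                .define("connection.url", ConfigDef.Type.STRING, ConfigDef.Importance.HIGH,
                        "Single JDBC URL shared by all tasks of this connector")
                .define("tables", ConfigDef.Type.STRING, ConfigDef.Importance.HIGH,
                        "Comma-separated list of tables to split across tasks");
    }

    @Override
    public String version() {
        return "0.0.1";
    }

    // Placeholder task: a real implementation would poll its assigned tables for
    // CDC rows and turn them into SourceRecords.
    public static class MultiTableSourceTask extends SourceTask {

        @Override
        public void start(Map<String, String> props) {
        }

        @Override
        public List<SourceRecord> poll() throws InterruptedException {
            return null; // no data in this sketch
        }

        @Override
        public void stop() {
        }

        @Override
        public String version() {
            return "0.0.1";
        }
    }
}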
We explored Debezium, but it won't work for us because we have a microservices architecture with more than 1,000 databases across many clients, and Debezium creates one topic for each table, which would mean a massive architecture.
Kafka can handle thousands of topics fine. If you run the connector processes in Kubernetes, as an example, then they're centrally deployable, scalable, and configurable from there.
However, I still have concerns about you needing to capture CDC events from all of those databases.
Maxwell was also suggested previously as an alternative.

Can we use another database like MariaDB or MongoDB for storing state in Kafka Streams instead of RocksDB? Is there any way to configure it?

I have a Spring Boot Kafka Streams application which processes all incoming events, stores them in the state store that Kafka Streams provides internally, and queries them using the interactive query service. Internally Kafka Streams uses "RocksDB"; I want to replace RocksDB with another configurable database such as MariaDB or MongoDB. Is there a way to do it? If not,
how can I configure a Kafka Streams application to use MongoDB for creating the state stores?
StateStore / KeyValueStore are open interfaces in Kafka Streams which can be used with TopologyBuilder.addStateStore
Yes, you can materialize values to your own store implementation with a database of your choice, but it'll affect processing semantics should there be any database connection issues, particularly with remote databases.
Instead, using a topic more as a log of transactions and then following that up with Kafka Connect is the proper approach for external systems.
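As a sketch of that approach (the topic names, the aggregation, and the bootstrap address are placeholders), the state stays in Kafka Streams' built-in stores while the processed data is published to a topic that a sink connector, e.g. a MongoDB sink connector, can then copy into the external database:

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.state.KeyValueStore;

public class CountsToSinkTopic {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "counts-to-sink-topic");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Keep the aggregation in the built-in store (RocksDB plus its changelog
        // topic), which stays queryable through interactive queries.
        KTable<String, Long> counts = builder
                .stream("events", Consumed.with(Serdes.String(), Serdes.String()))
                .groupByKey()
                .count(Materialized.<String, Long, KeyValueStore<Bytes, byte[]>>as("event-counts"));

        // Publish the same data to an output topic; a sink connector (e.g. a
        // MongoDB sink connector) can then copy that topic into the external database.
        counts.toStream().to("event-counts-out", Produced.with(Serdes.String(), Serdes.Long()));

        new KafkaStreams(builder.build(), props).start();
    }
}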

Listen to a topic continuously, fetch data, perform some basic cleansing

I'm to build a Java-based Kafka streaming application that will listen to a topic X continuously, fetch data, perform some basic cleansing, and write to an Oracle database. The Kafka cluster is outside my domain and I have no ability to deploy any code or configuration to it.
What is the best way to design such a solution? I came across Kafka Streams but was confused as to whether it can be used for 'Topic > Process > Topic' scenarios.
I came across Kafka Streams but was confused as to whether it can be used for 'Topic > Process > Topic' scenarios?
Absolutely.
For example, excluding the "process" step, it's two lines outside of the configuration setup.
final StreamsBuilder builder = new StreamsBuilder();
builder.stream("streams-plaintext-input").to("streams-pipe-output");
This code is straight from the documentation
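Adding the "process" step in between just means chaining more operators on the same stream; for example (the cleansing logic here is a placeholder):
final StreamsBuilder builder = new StreamsBuilder();
builder.stream("streams-plaintext-input")
       // "process" step: drop null records and trim whitespace (placeholder cleansing)
       .filter((key, value) -> value != null)
       .mapValues(value -> value.toString().trim())
       .to("streams-pipe-output");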
If you want to write to any database, you should first check if there is a Kafka Connect plugin to do that for you. Kafka Streams shouldn't really be used to read/write from/to external systems outside of Kafka, as it is latency-sensitive.
In your case, the JDBC Sink Connector would work well.
The Kafka cluster is outside my domain and I have no ability to deploy any code or configurations in it.
Using either solution above, you don't need to, but you will need some machine with Java installed to run a continuously running Kafka Streams application and/or a Kafka Connect worker.
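For reference, a sink connector configuration along these lines is all you would need on the Connect side, assuming the Confluent JDBC sink connector plugin is installed on the worker; the name, topic, and connection details below are placeholders:
name=oracle-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=streams-pipe-output
connection.url=jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1
connection.user=app_user
connection.password=app_password
auto.create=true
insert.mode=insert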

Spring Cloud Dataflow on a different message broker than Redis?

From what I can see in the documentation, a Redis instance is always needed for Spring Cloud Dataflow to work.
Is it also possible to work with different message broker, e.g. RabbitMQ?
How would one specify a different message broker during startup?
With the recent 1.0.0.M3 release, when using the Local server, we load Kafka-based OOTB applications by default.
If you'd like to switch from Kafka to RabbitMQ, you can do so by passing --binder=rabbit as a command-line argument when starting the spring-cloud-dataflow-server-local JAR (and be sure to start the RabbitMQ server).
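For example (the exact JAR file name and version depend on the release you downloaded):
java -jar spring-cloud-dataflow-server-local-1.0.0.M3.jar --binder=rabbit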

What is the best practice with JMS servers? Should they be deployed on the consumer or the producer side?

From an architecture point of view, I was wondering what the best practice would be for an integration scenario with two applications and OSB as middleware: the JMS consumer runs on JBoss while the OSB application encapsulates a service provider. Should the JMS queues reside on JBoss (as a foreign server) or on the WebLogic Server? That is, if I get to choose, should the JMS server be on the consumer or the producer side? What would be the pros and cons?
Thanks in advance.
It depends on your needs. You can create a foreign destination in your WebLogic server that connects to the producer queue on the producer end. In this arrangement your consumer will listen on the local end of the foreign destination that connects to the producer queue.
I can think of the following benefits:
A> Foreign destinations are mapped to the WebLogic JNDI tree, so any MDB that you deploy to the server can simply reference the remote destination using its local JNDI name (see the sketch after this list).
B> As you are communicating directly with the remote resource, there is no extra lag/latency in delivery, etc.
C> One issue may be that you will not be able to produce messages on the consuming end, because that user may not have enqueue access to the queue. But it all depends on your setup; this may be required in some cases, such as testing.
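The MDB side of point A could look roughly like this; "jms/RemoteOrdersQueue" is a hypothetical local JNDI name for the foreign destination, and depending on your WebLogic version you may prefer configuring destination-jndi-name in weblogic-ejb-jar.xml instead of mappedName.

import javax.ejb.MessageDriven;
import javax.jms.Message;
import javax.jms.MessageListener;

@MessageDriven(mappedName = "jms/RemoteOrdersQueue")
public class RemoteOrdersListener implements MessageListener {

    @Override
    public void onMessage(Message message) {
        // process the message that arrived via the foreign destination
    }
}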