Can Kafka be implemented with Spring Boot + PostgreSQL? Is there an additional process involved? For example, if we insert data into the database, how do we use Kafka? And when reading data from the database, should I put it into Kafka first so that the GET API reads from Kafka rather than from the database? Is this the correct way to use Kafka?
Also, when the user calls the save API, do I have to read from the database at the same time so that the data in Kafka is also updated?
Kafka consumption would be separate from any database action.
You can use the Kafka Connect framework, separately from any Spring app, to move data between Postgres and Kafka using a combination of Debezium and/or Confluent's JDBC connector.
Actions from the Spring app can use KafkaTemplate or JdbcTemplate, depending on your needs.
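For illustration, here is a minimal sketch of a Spring service that writes to Postgres with JdbcTemplate and publishes the same payload to Kafka with KafkaTemplate. The orders table, topic name, and service shape are assumptions, not a prescribed design:

// Hypothetical sketch: persist to Postgres and publish an event to Kafka in one service call.
// Table name, topic name, and method signature are illustrative only.
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class OrderService {

    private final JdbcTemplate jdbcTemplate;
    private final KafkaTemplate<String, String> kafkaTemplate;

    public OrderService(JdbcTemplate jdbcTemplate, KafkaTemplate<String, String> kafkaTemplate) {
        this.jdbcTemplate = jdbcTemplate;
        this.kafkaTemplate = kafkaTemplate;
    }

    public void saveOrder(String orderId, String payload) {
        // The database stays the source of truth for the GET API
        jdbcTemplate.update("INSERT INTO orders (id, payload) VALUES (?, ?)", orderId, payload);
        // The Kafka event is for downstream consumers, not a replacement for database reads
        kafkaTemplate.send("order-events", orderId, payload);
    }
}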
How can we stream schema and data changes, along with some kind of transformations, into another MySQL instance using a Kafka Connect source connector?
Is there a way to also propagate schema changes if I use Kafka's Python library (confluent_kafka) to consume and transform messages before loading them into the target DB?
You can use Debezium to stream MySQL binlogs into Kafka. Debezium is built on the Kafka Connect framework.
From there, you can use whatever client you want, including Python, to consume and transform the data.
If you want to write to MySQL, you can use the Kafka Connect JDBC sink connector.
Here is an old post on this topic - https://debezium.io/blog/2017/09/25/streaming-to-another-database/
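For illustration, a bare-bones consumer of a Debezium change-event topic; the broker address and the topic name dbserver1.inventory.customers are assumptions, and the same loop looks much the same with confluent_kafka in Python:

// Hypothetical sketch: consume Debezium change events and transform them before loading
// into the target DB. Broker address and topic name are assumptions.
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ChangeEventConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "transform-and-load");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("dbserver1.inventory.customers"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Each value is a Debezium change event (JSON by default);
                    // transform it here, or leave writing to MySQL to the JDBC sink connector.
                    System.out.println(record.value());
                }
            }
        }
    }
}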
I need to read data from a MongoDB collection periodically and publish it to a Kafka topic using Spring Boot. I have created a collection in MongoDB and inserted a few records. Now I want to read that data from MongoDB periodically and publish it to a Kafka topic using Spring Boot. I'm very new to the Spring Batch scheduler. Can you please suggest an approach to achieve this?
Thanks in advance.
What you are talking about is more relevant to Spring Integration: https://spring.io/projects/spring-integration#overview
So, you configure a MongoDbMessageSource with a Poller to read the collection periodically.
And then you have a service activator based on the KafkaProducerMessageHandler to dump the data into a Kafka topic.
See more in docs:
https://docs.spring.io/spring-integration/docs/5.3.2.RELEASE/reference/html/mongodb.html#mongodb
https://docs.spring.io/spring-integration/docs/5.4.0-M3/reference/html/kafka.html#kafka
Not sure though how to do that with Spring Batch...
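For reference, a rough sketch of that wiring in Java configuration; the collection name, topic, query and poll interval are assumptions, and it presumes Spring Boot's Spring Integration auto-configuration (or @EnableIntegration):

// Hypothetical sketch of MongoDbMessageSource + KafkaProducerMessageHandler wiring.
// Collection name, topic, query and polling interval are assumptions.
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.mongodb.core.MongoOperations;
import org.springframework.expression.common.LiteralExpression;
import org.springframework.integration.annotation.InboundChannelAdapter;
import org.springframework.integration.annotation.Poller;
import org.springframework.integration.annotation.ServiceActivator;
import org.springframework.integration.core.MessageSource;
import org.springframework.integration.kafka.outbound.KafkaProducerMessageHandler;
import org.springframework.integration.mongodb.inbound.MongoDbMessageSource;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.messaging.MessageHandler;

@Configuration
public class MongoToKafkaConfig {

    // Poll the collection every 10 seconds and emit its documents as messages
    @Bean
    @InboundChannelAdapter(channel = "mongoToKafka", poller = @Poller(fixedDelay = "10000"))
    public MessageSource<Object> mongoSource(MongoOperations mongoTemplate) {
        MongoDbMessageSource source = new MongoDbMessageSource(mongoTemplate, new LiteralExpression("{}"));
        source.setCollectionNameExpression(new LiteralExpression("myCollection"));
        return source;
    }

    // Dump each polled result onto a Kafka topic
    @Bean
    @ServiceActivator(inputChannel = "mongoToKafka")
    public MessageHandler kafkaHandler(KafkaTemplate<String, Object> kafkaTemplate) {
        KafkaProducerMessageHandler<String, Object> handler = new KafkaProducerMessageHandler<>(kafkaTemplate);
        handler.setTopicExpression(new LiteralExpression("my-topic"));
        return handler;
    }
}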
I have a Spring Boot Kafka Streams application which processes all the incoming events, stores them in the state store that Kafka Streams provides internally, and queries them using the interactive query service. Internally, Kafka Streams backs these state stores with RocksDB. I want to replace RocksDB with another configurable database such as MariaDB or MongoDB. Is there a way to do it? If not,
how can I configure a Kafka Streams application to use MongoDB for creating the state stores?
StateStore / KeyValueStore are open interfaces in Kafka Streams that can be used with TopologyBuilder.addStateStore.
Yes, you can materialize values to your own store implementation with a database of your choice, but it'll affect processing semantics should there be any database connection issues, particularly with remote databases.
Instead, treating a topic more as a log of transactions and then following that up with Kafka Connect is the proper approach for external systems.
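For reference, here is a minimal sketch of registering a state store and using it from a topology, written against the current StreamsBuilder API rather than the older TopologyBuilder; the store name, topics and serdes are assumptions. The store below is the default RocksDB-backed one; swapping in MongoDB would mean providing your own StoreBuilder whose store implements the KeyValueStore interface, with the caveats above.

// Minimal sketch: register a state store and use it from a transformer.
// Store name, topics and serdes are assumptions; this store is RocksDB-backed by default.
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.StoreBuilder;
import org.apache.kafka.streams.state.Stores;

public class StateStoreTopology {

    public static StreamsBuilder buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();

        StoreBuilder<KeyValueStore<String, Long>> storeBuilder =
                Stores.keyValueStoreBuilder(
                        Stores.persistentKeyValueStore("event-counts"),
                        Serdes.String(),
                        Serdes.Long());
        builder.addStateStore(storeBuilder);

        builder.stream("events")
               .transform(() -> new Transformer<Object, Object, KeyValue<Object, Object>>() {
                   private KeyValueStore<String, Long> store;

                   @Override
                   @SuppressWarnings("unchecked")
                   public void init(ProcessorContext context) {
                       store = (KeyValueStore<String, Long>) context.getStateStore("event-counts");
                   }

                   @Override
                   public KeyValue<Object, Object> transform(Object key, Object value) {
                       // Count events per key; the named store is reachable via interactive queries
                       Long current = store.get(String.valueOf(key));
                       store.put(String.valueOf(key), current == null ? 1L : current + 1);
                       return KeyValue.pair(key, value);
                   }

                   @Override
                   public void close() { }
               }, "event-counts")
               .to("events-enriched");

        return builder;
    }
}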
I need to build an app that reads from Kafka and writes the data to MongoDB.
Most of the times, the data will be written as is, but there will be cases where some processing on the data will be needed.
I wonder what to do -
Use the Kafka Connect MongoDB Sink, or use our "old and familiar" approach of building an app with a Kafka consumer that writes the data to Mongo using the MongoDB client (running on K8s).
What are the advantages/disadvantages of using Kafka Connect in terms of monitoring, scaling, debugging and pre-processing of the data?
Thanks
I want to do some analytics using Flink on the data in PostgreSQL. How and where should I give the port address, username and password? I was trying with the table source as mentioned in this link: https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/table/common.html#register-tables-in-the-catalog.
final static ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
final static TableSource csvSource = new CsvTableSource("localhost", port);
Actually, I am unable to even get started. I went through all the documentation but could not find a detailed explanation of this.
The tables and catalog referred to in the link you've shared are part of Flink's SQL support, wherein you can use SQL to express computations (queries) to be performed on data ingested into Flink. This is not about connecting Flink to a database, but rather about having Flink behave somewhat like a database.
To the best of my knowledge, there is no Postgres source connector for Flink. There is a JDBC table sink, but it only supports append mode (via INSERTs).
The CsvTableSource is for reading data from CSV files, which can then be processed by Flink.
If you want to operate on your data in batches, one approach you could take would be to export the data from Postgres to CSV, and then use a CsvTableSource to load it into Flink. On the other hand, if you wish to establish a streaming connection, you could connect Postgres to Kafka and then use one of Flink's Kafka connectors.
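For the batch route, here is a rough sketch against the Flink 1.4-era Table API referenced in the question; the file path, field names and field types are assumptions:

// Hypothetical sketch for the batch/CSV route, using the Flink 1.4-era Table API.
// File path, field names and field types are assumptions.
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.java.BatchTableEnvironment;
import org.apache.flink.table.sources.CsvTableSource;
import org.apache.flink.types.Row;

public class PostgresCsvToFlink {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        BatchTableEnvironment tableEnv = TableEnvironment.getTableEnvironment(env);

        // CSV produced on the Postgres side, e.g. COPY mytable TO '/tmp/mytable.csv' CSV
        CsvTableSource csvSource = new CsvTableSource(
                "/tmp/mytable.csv",
                new String[]{"id", "name"},
                new TypeInformation<?>[]{BasicTypeInfo.INT_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO});

        tableEnv.registerTableSource("mytable", csvSource);
        Table result = tableEnv.scan("mytable").select("id, name");
        tableEnv.toDataSet(result, Row.class).print();
    }
}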
Reading a Postgres instance directly isn't supported as far as I know. However, you can get real-time streaming of Postgres changes by using a Kafka server and a Debezium instance that replicates from Postgres to Kafka.
Debezium connects using the native Postgres replication mechanism on the DB side and emits all record inserts, updates or deletes as a message on the Kafka side. You can then use the Kafka topic(s) as your input in Flink.
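A minimal sketch of that last step, reading the Debezium topic into a Flink DataStream; the topic name dbserver1.public.mytable, broker address and group id are assumptions, and the connector class name varies by Flink version (FlinkKafkaConsumer011 in the 1.4 era, FlinkKafkaConsumer in newer releases):

// Hypothetical sketch: read Debezium change events from Kafka into a Flink DataStream.
// Topic name, broker address and group id are assumptions.
import java.util.Properties;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class PostgresCdcToFlink {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "flink-postgres-analytics");

        DataStream<String> changeEvents = env.addSource(
                new FlinkKafkaConsumer011<>("dbserver1.public.mytable", new SimpleStringSchema(), props));

        // Each element is a Debezium change event (JSON); parse and analyze it here
        changeEvents.print();

        env.execute("postgres-cdc-analytics");
    }
}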