How does Oracle NoSQL DB Streams handle failures? - nosql

I set up several NoSQL streams this morning. Around mid-day, there were issues with network connectivity in the environment where Oracle NoSQL Database was deployed, and I saw some anomalies in my applications.
How does Oracle NoSQL DB Streams API handle failures like unstable network connection, master transfer, and rebalance at the server?
Please advise.

Thanks for trying Oracle NoSQL Database Streams API.
Oracle NoSQL Streams handles failures in different ways depending on the nature of the failure. In the case of unstable network connectivity, the Streams API reconnects to the master node of each shard at the host and port cached in the Streams API; when network connectivity is restored, the Streams API reconnects successfully and resumes streaming from the last streamed operation. If three reconnect attempts fail, the Streams API refreshes the topology from the Oracle NoSQL Database and reconnects to the master node found in the latest store topology. This covers the case where the master has transferred and the old master is no longer accessible.
Handling store rebalancing is similar to handling master transfer: the Streams API pulls the new topology from the store and locates the new master of each shard to reconnect to. After the rebalancing is complete and the new topology is ready, the Streams API is able to reconnect and resume streaming.
The description above applies to the latest version, Oracle NoSQL Database 21.1. In previous versions, there is a limit that caps the number of reconnect attempts in the Streams API, and the Streams API terminates the stream and signals NoSQLSubscriber.onError() once the maximum number of attempts is exhausted. In 21.1 this limit is removed by default, and the Streams API keeps reconnecting until the connection is restored. Users can override the default behavior with NoSQLSubscriptionConfig.Builder.setMaxReconnect(long maxReconnect).
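
For reference, a minimal sketch of overriding that default, assuming the builder is constructed with a checkpoint table name and the subscribed table is set via setSubscribedTables as in the Streams API examples; the table names here are made up, and only setMaxReconnect comes from the answer above:

    // Hedged sketch: cap the number of reconnect attempts on a stream subscription.
    // Table names and the builder calls other than setMaxReconnect() are assumptions.
    import oracle.kv.pubsub.NoSQLSubscriptionConfig;

    public class StreamConfigExample {
        public static NoSQLSubscriptionConfig buildConfig() {
            return new NoSQLSubscriptionConfig.Builder("streamCheckpoint") // checkpoint table (assumed name)
                    .setSubscribedTables("users")   // table to stream (assumed name)
                    .setMaxReconnect(16)            // override the 21.1 default of unlimited reconnects
                    .build();
        }
    }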

Related

How to manage a Flink application when the Kafka broker is unavailable?

I have a Flink application running in production which writes data to a Kafka topic owned by an external vendor.
We were notified by the vendor that they would be migrating their cluster and hence there will be downtime where the Kafka brokers will not be available.
My question is, what will happen to the Flink application data when the topic is not available to write data into? Can I allow my Flink application to continue running or should I stop it and wait for the brokers to be up and running?
The task will fail if it can't connect to the Kafka Sink. What it does after failing will depend on your Task Failure Recovery strategy.
If you don't want to keep an eye on when Kafka will be available again, a fixed-delay strategy with effectively unlimited retries and a long delay, or an exponential-delay strategy, may be your best option so you don't overload your infrastructure with unnecessary restarts.
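
A minimal sketch of setting an exponential-delay strategy programmatically; the specific backoff values are illustrative assumptions, not recommendations:

    // Hedged sketch: exponential-delay restart strategy in Flink (values are illustrative).
    import org.apache.flink.api.common.restartstrategy.RestartStrategies;
    import org.apache.flink.api.common.time.Time;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class RestartStrategyExample {
        public static void main(String[] args) {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.setRestartStrategy(RestartStrategies.exponentialDelayRestart(
                    Time.seconds(10),   // initial backoff
                    Time.minutes(10),   // maximum backoff between restarts
                    2.0,                // backoff multiplier
                    Time.hours(1),      // reset backoff after this long without failure
                    0.1));              // jitter factor
            // ... build the rest of the job (Kafka source/sink) and call env.execute()
        }
    }

The equivalent can also be set in the cluster configuration (restart-strategy: exponential-delay) so that jobs pick it up without code changes.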

Kafka sink from multiple independent brokers

I want to aggregate changes from multiple databases into one, so I thought of running a Debezium connector and a Kafka server/broker next to each database, and using a Kafka sink connector to consume from all those Kafka brokers and write into one database.
The question is: can I use a single instance of a Kafka sink connector to consume, at the same time, from multiple Kafka brokers which are independent (not a cluster)?
Running a Kafka broker next to each database sounds very complicated, and a single Kafka Connect worker that connects to different Kafka clusters does not seem to be supported, as far as I can see.
If you go down this path, it may make more sense to use something like Kafka MirrorMaker to copy your local topics to a single main Kafka cluster, and then use a Kafka Connect Sink to read all the copied topics from one worker and write to a central DB.
Ultimately, running a broker next to each source database is pretty complicated. From what you described, it sounds like you have some connectivity between your different databases, but it is limited and possibly prone to disconnects. Have you considered alternative designs:
DB Replication: Use your DB vendor's native async replication to just copy the data to a single target DB. The remote region is always read-only, and replication should not slow down your source DB (depending on the DB, of course). Async DB replication can usually handle some network disconnections and latency.
Local Debezium: Run a process with Debezium next to each DB and save all events to a file. Copy the files to some central server or to a cloud storage service like S3. Finally, import these files into a central DB. This basically skips Kafka completely (a sketch follows below).
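
For the "Local Debezium" option, a rough sketch of running Debezium as an embedded engine that appends JSON change events to a local file instead of producing to Kafka; the connector settings, credentials, and file paths are illustrative assumptions, and depending on the Debezium version more settings (e.g. schema history) are required:

    // Hedged sketch: embedded Debezium engine writing JSON change events to a local file.
    // Connector properties and the output path are illustrative assumptions.
    import io.debezium.engine.ChangeEvent;
    import io.debezium.engine.DebeziumEngine;
    import io.debezium.engine.format.Json;

    import java.io.FileWriter;
    import java.io.IOException;
    import java.util.Properties;
    import java.util.concurrent.Executors;

    public class LocalDebeziumToFile {
        public static void main(String[] args) throws IOException {
            Properties props = new Properties();
            props.setProperty("name", "local-engine");
            props.setProperty("connector.class", "io.debezium.connector.mysql.MySqlConnector");
            props.setProperty("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore");
            props.setProperty("offset.storage.file.filename", "/tmp/offsets.dat");
            props.setProperty("database.hostname", "localhost");
            props.setProperty("database.port", "3306");
            props.setProperty("database.user", "debezium");
            props.setProperty("database.password", "secret");
            props.setProperty("database.server.id", "1");
            props.setProperty("topic.prefix", "local");
            // ... additional required settings vary by Debezium version

            FileWriter out = new FileWriter("/tmp/changes.jsonl", true);
            DebeziumEngine<ChangeEvent<String, String>> engine = DebeziumEngine.create(Json.class)
                    .using(props)
                    .notifying(record -> {
                        try {
                            if (record.value() != null) {
                                out.write(record.value() + System.lineSeparator()); // one JSON event per line
                            }
                        } catch (IOException e) {
                            throw new RuntimeException(e);
                        }
                    })
                    .build();
            Executors.newSingleThreadExecutor().execute(engine);
        }
    }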
You can point the Connect worker property files at whatever bootstrap.servers you want.
The property itself, however, must refer to a single "cluster" (even if that cluster is a single broker), which is determined by the brokers' zookeeper.connect property.

Running Source Connector on Demand and Not Based on poll.interval.ms

I have a table that is updated once or twice a day, but I want the data to be pushed to Kafka immediately after the table is updated. Is it possible to avoid running the connector every poll.interval.ms and instead run it only after the table is updated (sync on demand, or trigger the sync in some other way after the table update)?
I apologize if this question is stupid... Can a sink connector run on one Kafka cluster but pull messages from another Kafka cluster and insert them into Postgres? I'm not talking about replicating messages from Cluster A to Cluster B and then inserting messages from Cluster B into Postgres. I'm talking about a connector running on Cluster B but pulling messages from Cluster A and writing them to Postgres.
Thanks!
If you use log-based change data capture (Debezium, etc.) then you capture changes as soon as they occur, without needing to re-query the database. If you use query-based CDC then you do have to query the database on a polling interval. For query-based vs log-based CDC see this blog or talk.
One option would be to use the Kafka Connect REST API to control the connector - but you're kind of going against the streaming paradigm here and will start to find awkward edges in doing this. For example, when do you decide to pause the connector? How do you determine that it's ingested all the changes? etc.
Using log-based CDC is low-impact on the source system and commonly the route that people go.
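
If you do go down the REST API route, a rough sketch of resuming (or pausing) a connector via Kafka Connect's REST interface using Java's built-in HTTP client; the worker URL and connector name are illustrative assumptions:

    // Hedged sketch: pause/resume a connector via the Kafka Connect REST API.
    // The worker URL and connector name are illustrative assumptions.
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class ConnectorControl {
        private static final String CONNECT_URL = "http://localhost:8083";

        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            // PUT /connectors/{name}/resume starts polling again; /pause stops it.
            HttpRequest resume = HttpRequest.newBuilder()
                    .uri(URI.create(CONNECT_URL + "/connectors/my-jdbc-source/resume"))
                    .PUT(HttpRequest.BodyPublishers.noBody())
                    .build();
            HttpResponse<String> response = client.send(resume, HttpResponse.BodyHandlers.ofString());
            System.out.println("Resume returned HTTP " + response.statusCode());
        }
    }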
Kafka Connect does not run on your Kafka cluster. Kafka Connect runs as its own cluster. Physically, it can be co-located for the purposes of a dev/sandbox environment (this ref arch is useful for production). See also this talk "Running Kafka Connect".
So in your example, "Cluster B" is actually a Kafka Connect cluster - and it would be configured to read from Kafka cluster "A", and that is fine.

Can we use any other database like MariaDB or MongoDB for storing state in Kafka Streams instead of RocksDB? Is there any way to configure it?

I have a Spring Boot Kafka Streams application which processes all the incoming events, stores them in the state store which Kafka Streams provides internally, and queries them using the interactive query service. Internally, Kafka Streams uses "RocksDB" for these state stores, and I want to replace RocksDB with another configurable database like MariaDB or MongoDB. Is there a way to do it?
If not, how can I configure the Kafka Streams application to use MongoDB for creating the state stores?
StateStore / KeyValueStore are open interfaces in Kafka Streams which can be used with TopologyBuilder.addStateStore.
Yes, you can materialize values to your own store implementation with a database of your choice, but it'll affect processing semantics should there be any database connection issues, particularly with remote databases.
Instead, using the topic more as a log of transactions and then following that up with Kafka Connect is the proper approach for external systems.
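
A minimal sketch of that approach: keep the aggregation in the built-in RocksDB store and emit the results to an output topic, which a Kafka Connect sink (for example a MongoDB or JDBC sink connector) can then write to the external database. Topic and store names are illustrative assumptions:

    // Hedged sketch: aggregate in Kafka Streams, emit results to a topic for a
    // Kafka Connect sink to write to MongoDB/MariaDB. Topic and store names are assumptions.
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.Topology;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.Grouped;
    import org.apache.kafka.streams.kstream.Materialized;
    import org.apache.kafka.streams.kstream.Produced;

    public class CountsTopology {
        public static Topology build() {
            StreamsBuilder builder = new StreamsBuilder();
            builder.stream("incoming-events", Consumed.with(Serdes.String(), Serdes.String()))
                   .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
                   .count(Materialized.as("event-counts"))   // RocksDB-backed store, still queryable
                   .toStream()
                   .to("event-counts-output", Produced.with(Serdes.String(), Serdes.Long()));
            return builder.build();
        }
    }

The sink connector then keeps the external database up to date from event-counts-output, without putting that database in the critical path of stream processing.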

How to add a health check for topics in the KafkaStreams API

I have a critical Kafka application that needs to be up and running all the time. The source topics are created by Debezium Kafka Connect for the MySQL binlog. Unfortunately, many things can go wrong with this setup. A lot of the time the Debezium connectors fail and need to be restarted, and so do my apps (without throwing any exception, they just hang and stop consuming). My manual way of testing and discovering the failure is to check the Kibana logs and then consume the suspicious topic through the terminal. I could mimic this in code, but that is obviously not best practice. I wonder if there is anything in the KafkaStreams API that allows me to do such a health check, and to check other parts of the Kafka cluster?
Another point that bothers me is whether I can keep the stream alive and rejoin the topics when the connectors are up again.
You can check the Kafka Streams State to see if it is rebalancing/running, which would indicate healthy operation. However, if no data is getting into the topology, I would assume there would be no errors happening, so you then need to look up the health of your upstream dependencies.
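
A minimal sketch of exposing that state check as a simple health probe, assuming you have a handle to the running KafkaStreams instance (how you expose it, e.g. via an HTTP endpoint or a Spring actuator health indicator, depends on your setup):

    // Hedged sketch: report healthy only when the Kafka Streams instance is
    // RUNNING or REBALANCING. How this is exposed is up to your application.
    import org.apache.kafka.streams.KafkaStreams;

    public class StreamsHealthCheck {
        private final KafkaStreams streams;

        public StreamsHealthCheck(KafkaStreams streams) {
            this.streams = streams;
        }

        public boolean isHealthy() {
            KafkaStreams.State state = streams.state();
            return state == KafkaStreams.State.RUNNING
                    || state == KafkaStreams.State.REBALANCING;
        }
    }

You can also register a listener with streams.setStateListener(...) to be notified of state transitions instead of polling.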
Overall, it sounds like you might want to invest some time into monitoring tools like Consul or Sensu, which can run local service health checks and send out alerts when services go down, or at the very least Elasticsearch alerting.
As far as Kafka health checking goes, you can do that in several ways:
Are the broker and ZooKeeper processes running? (SSH to the node and check the processes)
Are the broker and ZooKeeper ports open? (use a socket connection)
Are there important JMX metrics you can track? (Metricbeat)
Can you find an active controller broker? (use AdminClient#describeCluster)
Do the minimum required number of brokers respond as part of the cluster metadata? (which can also be obtained from AdminClient)
Do the topics you use have the proper configuration (retention, min-isr, replication factor, partition count, etc.)? (again, use AdminClient; a sketch follows below)
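
A rough sketch of the AdminClient checks from the last few points; the bootstrap address, topic name, and the minimum broker count are illustrative assumptions:

    // Hedged sketch: cluster/topic health checks with the Kafka AdminClient.
    // Bootstrap servers, topic name, and the "3 brokers" threshold are assumptions.
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.DescribeClusterResult;
    import org.apache.kafka.clients.admin.TopicDescription;
    import org.apache.kafka.common.Node;

    import java.util.Collection;
    import java.util.Collections;
    import java.util.Properties;

    public class ClusterHealthCheck {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            try (AdminClient admin = AdminClient.create(props)) {
                // Is there an active controller, and are enough brokers registered?
                DescribeClusterResult cluster = admin.describeCluster();
                Node controller = cluster.controller().get();
                Collection<Node> brokers = cluster.nodes().get();
                System.out.println("Controller: " + controller + ", brokers: " + brokers.size());
                if (brokers.size() < 3) {
                    System.out.println("WARNING: fewer brokers than expected");
                }

                // Does the topic have the expected partition count and replication factor?
                TopicDescription topic = admin.describeTopics(Collections.singleton("my-topic"))
                        .all().get().get("my-topic");
                int partitions = topic.partitions().size();
                int replicationFactor = topic.partitions().get(0).replicas().size();
                System.out.println("my-topic: partitions=" + partitions
                        + ", replicationFactor=" + replicationFactor);
            }
        }
    }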