Kafka producer posting messages to secondary cluster - apache-kafka

Description of proposed cluster setup
2 data centres, each with a 5-node Kafka cluster
Both clusters have the same topics and the same producer/consumer instances working with them
There is no data replication across the clusters, so the data in Cluster 1 and Cluster 2 is distinct
There is no message affinity required. [It will not make any functional difference if Producer 1 were to start posting messages to Cluster 2, and vice versa]
What we want to achieve is this: let's say Producer 1 posts a message asynchronously to Cluster 1 but receives a negative acknowledgment (after all retries have timed out). This is easily detected in the producer callback method
On receiving this failure, we use another KafkaTemplate (configured with the details of Cluster 2) so that the producer tries posting the same message to Cluster 2 [it works the other way round as well: if Producer 2 is unable to post locally, it sends the message to Cluster 1]
The advantages we get here are
the message is not lost and is posted automatically to the other cluster
since this happens per message, once Cluster 1 is back up, Producer 1 automatically resumes sending messages to Cluster 1
One downside we see is that we are handling the failover logic ourselves, by producing to the secondary cluster in the exception handling block for either a metadata fetch timeout or a negative acknowledgment
I could not find anything on the net describing a similar setup. Is there something fundamentally wrong with this approach?

Sure; just configure 2 sets of infrastructure beans - producer and consumer factories, container factories, templates.
You can't use Boot's auto configuration for that, but you can define the beans yourself.
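For illustration, a minimal sketch of what those beans and the failover callback might look like, assuming Spring Boot with spring-kafka 2.x (where send() returns a ListenableFuture; in 3.x it returns a CompletableFuture, so the callback wiring differs). The broker addresses, bean names and topic are placeholders:

import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.core.ProducerFactory;
import org.springframework.stereotype.Service;

@Configuration
class DualClusterConfig {

    // Template bound to the local (primary) cluster
    @Bean
    KafkaTemplate<String, String> primaryTemplate() {
        return new KafkaTemplate<>(factoryFor("cluster1-broker:9092"));
    }

    // Template bound to the remote (secondary) cluster
    @Bean
    KafkaTemplate<String, String> secondaryTemplate() {
        return new KafkaTemplate<>(factoryFor("cluster2-broker:9092"));
    }

    private ProducerFactory<String, String> factoryFor(String bootstrapServers) {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        return new DefaultKafkaProducerFactory<>(props);
    }
}

@Service
class FailoverSender {

    private final KafkaTemplate<String, String> primary;
    private final KafkaTemplate<String, String> secondary;

    FailoverSender(@Qualifier("primaryTemplate") KafkaTemplate<String, String> primary,
                   @Qualifier("secondaryTemplate") KafkaTemplate<String, String> secondary) {
        this.primary = primary;
        this.secondary = secondary;
    }

    // Send to cluster 1; on a negative ack (after the producer's own retries), fall back to cluster 2
    void send(String topic, String payload) {
        primary.send(topic, payload).addCallback(
                ok -> { /* delivered to cluster 1 */ },
                ex -> secondary.send(topic, payload));
    }
}

The consumer side needs the same duplication: one ConsumerFactory and one listener container factory per cluster, with each listener pointed at the corresponding container factory.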

Related

How to simulate KEY_BUSY hot key error code 14 in Aerospike

How can I test/produce an Aerospike exception code 14?
I have a simple one-node environment with Aerospike in it, and a Java application on K8s.
There are 3 pods of the application, all consuming messages from a Kafka topic with 3 partitions, all in the same consumer group.
With the Kafka producer driver, we inject 200 messages at once, with no Kafka message key (so that Kafka will round-robin over the 3 topic partitions).
All messages relate to the same Aerospike key, so the 3 application pods are supposed to update the same record in parallel, resulting in an Aerospike hot-key exception (KEY_BUSY, error code 14).
But that's not happening and all 200 messages are processed successfully.
The configuration parameter "transaction-pending-limit" is set to 1 in aerospike.conf.
Many thanks.
Try adding one more node to the Aerospike cluster. With a one-node Aerospike cluster, you are not replicating to another node, so the transaction completes before you can encounter "key busy". Adding another node to the Aerospike cluster with replication factor 2 will cause the current transaction to wait in the queue for the replication ack, and then, I believe, you will be able to simulate the key-busy error with transaction-pending-limit set to 1. Let us know if that works for you.
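If you also want to take the Kafka plumbing out of the picture while testing, one option is to hammer the same record from several threads in a single process. A rough sketch with the Aerospike Java client; the host, namespace, set and key are placeholders, and whether KEY_BUSY actually appears still depends on the transaction-pending-limit and replication settings described above:

import com.aerospike.client.AerospikeClient;
import com.aerospike.client.AerospikeException;
import com.aerospike.client.Bin;
import com.aerospike.client.Key;
import com.aerospike.client.ResultCode;

public class HotKeyTest {
    public static void main(String[] args) throws InterruptedException {
        AerospikeClient client = new AerospikeClient("127.0.0.1", 3000);
        Key hotKey = new Key("test", "demo", "hot-key");

        // Several threads updating the same record; with transaction-pending-limit=1
        // and replication in play, some of these writes may fail with KEY_BUSY (code 14)
        Runnable writer = () -> {
            for (int i = 0; i < 200; i++) {
                try {
                    client.put(null, hotKey, new Bin("counter", i));
                } catch (AerospikeException e) {
                    if (e.getResultCode() == ResultCode.KEY_BUSY) {
                        System.out.println("Got KEY_BUSY (14)");
                    }
                }
            }
        };
        Thread t1 = new Thread(writer);
        Thread t2 = new Thread(writer);
        Thread t3 = new Thread(writer);
        t1.start(); t2.start(); t3.start();
        t1.join(); t2.join(); t3.join();
        client.close();
    }
}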

Create Producer when the first broker in the list of brokers is down

I have a multi-node Kafka cluster which I use for consuming and producing.
In my application, I use confluent-kafka-go(1.6.1) to create producers and consumers. Everything works great when I produce and consume messages.
This is how I configure my bootstrap server list
"bootstrap.servers":"localhost:9092,localhost:9093,localhost:9094"
But the moment I start using the brokers' IP addresses in bootstrap.servers, if the first broker in the list is down, producer creation repeatedly fails with
Failed to initialize Producer ID: Local: Timed out
If I remove the IP of the failed node, producing and consuming messages work.
If the broker is down after I create the producer/consumer, they continue to be usable by switching over to other nodes.
How should I configure bootstrap.servers in such a way that the producer will be created using the available nodes?
You shouldn't really be running 3 brokers on the same machine anyway, but using multiple unique servers works fine for me when the first is down (and the cluster elects a different leader if it needs to), so it sounds like you have either lost the leader of your topic partitions or lost the Controller. Enabling retries on the producer should allow it to fix itself (by making a new metadata request for partition leaders).
Overall, it's just a CSV; there's no other way to configure that property itself. You could stick a reverse proxy in front of the brokers that resolves only to healthy nodes, but then you might run into problems with DNS caching.
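For reference, a minimal sketch of a producer given the full broker list with retries enabled. It is shown with the Java client; the equivalent bootstrap.servers and retry settings go in confluent-kafka-go's ConfigMap. The addresses are placeholders:

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class MultiBrokerProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // List every broker; the client only needs one of them to be reachable
        // to fetch the full cluster metadata
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
                "10.0.0.1:9092,10.0.0.2:9092,10.0.0.3:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Retries let the producer refresh metadata and pick up a new partition leader
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120000);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "key", "value"));
        }
    }
}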

Why does my Kafka Connect sink cluster only have one worker processing messages?

I've recently setup a local Kafka on my computer for testing and development purposes:
3 brokers
One input topic
Kafka connect sink between the topic and elastic search
I managed to configure it in standalone mode, so everything is on localhost, and Kafka Connect was started using the ./connect-standalone.sh script.
What I'm trying to do now is to run my connectors in distributed mode, so the Kafka messages can be split across both workers.
I've started the two workers (still everything on the same machine), but when I send messages to my Kafka topic, only one worker (the last one started) processes them.
So my question is: why is only one worker processing Kafka messages instead of both?
When I kill one of the workers, the other one takes over the message flow, so I think the cluster itself is set up correctly.
What I think:
I don't put keys inside my Kafka messages; could it be related to this?
I'm running everything on localhost; can distributed mode work this way? (I have correctly configured the worker-specific fields such as rest.port)
Resolved:
From Kafka documentation:
The division of work between tasks is shown by the partitions that each task is assigned
If you don't use partitions (i.e. push all messages to the same partition), the workers won't be able to divide the messages between them.
You don't need to use message keys; you can just push your messages to different partitions in a round-robin fashion.
See: https://docs.confluent.io/current/connect/concepts.html#distributed-workers
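A sketch of the two ingredients this implies: an input topic with more than one partition, and a producer that lets records spread across those partitions (null keys, so the producer's partitioner distributes them). The topic name, partition count and replication factor are placeholders, and the sink connector's tasks.max should be at least the number of partitions you want consumed in parallel:

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class PartitionedInput {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        // Give the input topic as many partitions as you want parallel sink tasks
        try (AdminClient admin = AdminClient.create(props)) {
            admin.createTopics(Collections.singletonList(new NewTopic("input-topic", 3, (short) 1)))
                 .all().get();
        }

        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // Records with a null key are spread across partitions by the producer's partitioner
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 10; i++) {
                producer.send(new ProducerRecord<>("input-topic", null, "message-" + i));
            }
        }
    }
}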

What is the real use of a Kafka-based multi-ordering service

I am new to Fabric technologies. I read some articles about the Kafka-based ordering service and its advantages. Some articles say that a Kafka-based multi-orderer setup provides fault tolerance. So I set up 3 Kafka-based orderers (orderer0, orderer1, orderer2) and then stopped 2 of them using the following commands
docker stop orderer1.example.com
docker stop orderer2.example.com
The REST API was still working correctly. Then I stopped orderer0 using
docker stop orderer0.example.com
Now my REST API is not working; it is facing a network connection problem. Then I started orderer1 and orderer2 using the following commands
docker start orderer1.example.com
docker start orderer2.example.com
But my REST API is still not working; it is facing the same network connection problem.
And finally I started orderer0 using
docker start orderer0.example.com
Now the network is working fine.
My questions are
What is the actual use of a Kafka-based ordering service?
How can we implement a Kafka-based ordering service so that one orderer going down does not bring the network down?
Fabric:1.1.0
Composer:0.19.16
Node:8.11.3
OS: Ubuntu 16.04
I had the same problem when I wanted to set up several orderers. To solve it, I see 2 solutions:
Change the SDK. Currently your SDK tries to contact orderer0 and returns an error if that fails; it needs to be changed so that the request loops over a list of orderers and only returns an error if none of them is reachable.
Easier: set up a load balancer in front of the orderers.
To answer your question: the advantage of setting up a Kafka-based ordering service is that the data of the proposed blocks is spread over several servers. There is fault tolerance because if an orderer crashes and reconnects to the Kafka cluster, it will be able to resynchronize. Performance is also better (that is theoretical; I did not test this point).
As per Kafka Ordering Services
Each channel maps to a separate single-partition topic in Kafka
This means that all messages in the topic are totally-ordered in the order in which they were sent.
and
At a minimum, [the number of brokers] should be set to 4. (As we will explain in Step 4 below, this is the minimum number of nodes necessary in order to exhibit crash fault tolerance, i.e. with 4 brokers, you can have 1 broker go down, all channels will continue to be writeable and readable, and new channels can be created.)
The above assumes a Kafka replication factor of 3 and min.insync.replicas set (ideally to 2) to make sure that all writes are replicated to at least two servers.
Based on your network issues, it sounds to me like you did not actually configure all three brokers correctly (we would need to see your entire Docker setup and what the Dockerfile is actually doing). Even assuming you did configure all three brokers for this "REST API", each channel maps to a single-partition Kafka topic, and by default topics are auto-created with a replication factor of 1. So I suggest you clean everything up, start the three brokers, manually create the topic with 1 partition and 3 replicas, and then start Hyperledger.
If the REST API is the actual problem, not the Kafka connection, then you would need a load-balancer, I guess
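If you do create the channel topic by hand as suggested, a minimal sketch with the Kafka AdminClient; the topic name and broker addresses are placeholders, and min.insync.replicas=2 follows the recommendation quoted above:

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateChannelTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka0:9092,kafka1:9092,kafka2:9092,kafka3:9092");

        // One partition (total ordering per channel), 3 replicas, and writes must
        // reach at least 2 in-sync replicas before they are acknowledged
        NewTopic channelTopic = new NewTopic("mychannel", 1, (short) 3)
                .configs(Collections.singletonMap("min.insync.replicas", "2"));

        try (AdminClient admin = AdminClient.create(props)) {
            admin.createTopics(Collections.singletonList(channelTopic)).all().get();
        }
    }
}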

Kafka cluster failure handling explanation

I would like to set up an Apache Kafka cluster to use in a new project. Unfortunately I can't find any detailed explanation of how Kafka handles broker failures and network partitioning.
For example, if I have a cluster made of 2 or more brokers and 1 node fails, does the only node left up keep accepting messages?
If yes, when the failed node comes back up, how does it resync its missing data?
Have a look here and here at the description of the replication protocol that Kafka uses. Each partition in a Kafka topic has a 'leader', and messages are sent to the leader. Messages are replicated to 'followers'.
So to answer your questions specifically, my understanding is:
if I have a cluster made of 2 or more brokers and 1 node fails, does the only node left up keep accepting messages?
Only one node accepts messages anyway; the leader node. If a follower fails, the leader continues to accept messages.
If the leader fails a new leader is elected from those followers that are up to date.
If yes, when the failed node comes back up, how does it resync its missing data?
'Followers' act as consumers of the 'leader', so a follower, once brought back up, will continue to consume messages from the leader to get back into sync.
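One way to watch this behaviour is to describe the topic and look at the leader and in-sync replica (ISR) set before and after killing a broker. A small sketch using the Java AdminClient; the topic name and address are placeholders:

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.TopicDescription;

public class DescribeLeaders {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription desc = admin.describeTopics(Collections.singletonList("my-topic"))
                    .all().get().get("my-topic");

            // For each partition: who the current leader is, and which replicas are in sync.
            // Kill the leader's broker and re-run: a follower from the ISR takes over as leader,
            // and the restarted broker rejoins the ISR once it has caught up.
            desc.partitions().forEach(p ->
                    System.out.printf("partition %d leader=%s isr=%s%n",
                            p.partition(), p.leader(), p.isr()));
        }
    }
}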