What is the real use of a Kafka-based multi-orderer ordering service - apache-kafka

I am new to Fabric technologies. I read some articles about the Kafka-based ordering service and its advantages. Some of the articles say that a Kafka-based multi-orderer setup provides fault tolerance. I have now deployed 3 Kafka-based orderers (orderer0, orderer1, orderer2). Then I stopped 2 of the orderers using the following commands:
docker stop orderer1.example.com
docker stop orderer2.example.com
The REST API still worked correctly. Then I stopped orderer0 using:
docker stop orderer0.example.com
Now my REST API is not working; it reports a network connection problem. Then I started orderer1 and orderer2 using the following commands:
docker start orderer1.example.com
docker start orderer2.example.com
But my REST API is still not working; it reports the same network connection problem.
Finally, I started orderer0 using:
docker start orderer0.example.com
Now the network is working fine.
My questions are:
What is the actual use of a Kafka-based ordering service?
How can we implement a Kafka-based ordering service so that the network keeps working when an orderer goes down?
Fabric:1.1.0
Composer:0.19.16
Node:8.11.3
OS: Ubuntu 16.04

I had the same problem as you when I wanted to set up several orderers. To solve it, I see 2 solutions:
Change the SDK code: currently your SDK tries to contact orderer0 and returns an error if that fails. Change this so that the request loops over a list of orderers and only returns an error if none of them responds.
Easier: set up a load balancer in front of the orderers (see the sketch below).
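For the load-balancer option, here is a minimal HAProxy sketch (just an illustration on my part; any TCP load balancer works, 7050 is the default orderer port, and all names and ports are placeholders to adjust to your network):
frontend orderers
    bind *:7050
    mode tcp
    default_backend orderer_pool
backend orderer_pool
    mode tcp
    balance roundrobin
    server orderer0 orderer0.example.com:7050 check
    server orderer1 orderer1.example.com:7050 check
    server orderer2 orderer2.example.com:7050 check
The SDK (or the Composer connection profile) then points at the load balancer's address instead of a single orderer.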
To answer your question: the advantage of setting up a Kafka-based ordering service is that the data of the proposed blocks is spread over several servers. There is fault tolerance because if an orderer crashes and later reconnects to the Kafka cluster, it will be able to resynchronize. Performance should also be better (that part is theoretical, I have not tested it).

As per the Kafka Ordering Services documentation:
Each channel maps to a separate single-partition topic in Kafka
This means that all messages in the topic are totally-ordered in the order in which they were sent.
and
At a minimum, [the number of brokers] should be set to 4. (As we will explain in Step 4 below, this is the minimum number of nodes necessary in order to exhibit crash fault tolerance, i.e. with 4 brokers, you can have 1 broker go down, all channels will continue to be writeable and readable, and new channels can be created.)
The above assumes a Kafka replication factor of 3, with min.insync.replicas ideally set to 2 (and the producing client using acks=all) to make sure that every write is replicated to at least two servers.
Based on your network issues, it sounds to me like you did not actually configure all three brokers correctly (I would need to see your entire Docker setup and what the Dockerfile is actually doing). Even assuming you did configure all three brokers for this "REST API", the channel's single-partition Kafka topic may still have only 1 replica, because the default replication factor is 1 and auto-created topics use that default. So I suggest you clean it all up, start three brokers, manually create the topic with 1 partition and 3 replicas, and then start Hyperledger.
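For example (a sketch only: the topic name must match your channel name, the ZooKeeper address is a placeholder, and the old Kafka versions shipped with Fabric 1.1 take --zookeeper rather than --bootstrap-server):
kafka-topics.sh --create --zookeeper zookeeper0:2181 --topic mychannel --partitions 1 --replication-factor 3 --config min.insync.replicas=2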
If the REST API is the actual problem, not the Kafka connection, then you would need a load balancer, I guess.

Related

Messages are stuck in ActiveMQ Artemis cluster queues

We have a problem with Apache ActiveMQ Artemis cluster queues. Sometimes messages begin to pile up in particular cluster queues. It usually happens 1-4 times per day, mostly on production (it has happened only once in the last 90 days on one of the test environments).
These messages are not delivered to consumers on other cluster brokers until we restart the cluster connector (or the entire broker).
The problem looks related to ARTEMIS-3809.
Our setup is 6 servers in one environment (3 pairs of master/backup servers). The operating system is Linux (Red Hat).
We have tried to:
upgrade from 2.22.0 to 2.23.1
increase minLargeMessageSize on the cluster connectors to 1024000
The messages still get stuck in the cluster queues.
Another problem: I tried to configure min-large-message-size as described in the documentation (in cluster-connection), but it caused errors at startup (broker.xml did not pass XSD validation), so the only option was to specify minLargeMessageSize in the URL parameters of the connector for each cluster broker. I don't know if this setting has any effect.
So we had to write a script that checks whether messages are stuck in the cluster queues and restarts the cluster connector.
How can we debug this situation?
When the messages are stuck, nothing wrong is written to the log (no errors, no stacktraces etc.).
Which loggers (for which classes) should we set to debug or trace level to find out what happens with the cluster connectors?
I believe you can remedy the situation by setting this on your cluster-connection:
<producer-window-size>-1</producer-window-size>
See ARTEMIS-3805 for more details.
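For reference, a sketch of where that element sits in broker.xml (the connection, connector, and discovery-group names here are placeholders of mine, and the element order must follow the XSD of your Artemis version):
<cluster-connections>
   <cluster-connection name="my-cluster">
      <connector-ref>netty-connector</connector-ref>
      <message-load-balancing>ON_DEMAND</message-load-balancing>
      <max-hops>1</max-hops>
      <producer-window-size>-1</producer-window-size>
      <discovery-group-ref discovery-group-name="dg-group1"/>
   </cluster-connection>
</cluster-connections>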
Generally speaking, moving messages around the cluster via the cluster-connection, while convenient, isn't terribly efficient (much less so for "large" messages). Ideally you would have a sufficient number of clients on each node to consume the messages that were originally produced there. If you don't have that many clients then you may want to re-evaluate the size of your cluster, as it may actually decrease overall message throughput rather than increase it.
If you're just using 3 HA pairs in order to establish a quorum for replication, then you should investigate the recently added pluggable quorum voting, which allows integration with a 3rd-party component (e.g. ZooKeeper) for leader election, eliminating the need for a quorum of brokers.

How to handle failure scenarios for Kafka and ZooKeeper in Kubernetes

What I have: a ZooKeeper setup running on server1, server2 and server3, and similarly Kafka also running on server1, server2 and server3.
The setup is running in Kubernetes.
Problem statement:
If one ZooKeeper node goes down, will the entire setup go down, because Kafka depends on ZooKeeper? Am I right?
If Q1 is correct - is there any way to set things up so that if one ZooKeeper server goes down, Kafka keeps running as is?
How to expose the Kafka port in a Kubernetes setup?
What is the recommended way to persist data in Kubernetes for a production server?
I fail to see how the ZooKeeper questions are related to k8s... But you definitely should set affinity rules such that ZooKeeper and Kafka are not on the same physical servers or sharing the same disks.
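For example, a minimal podAntiAffinity sketch (assuming the Kafka pods carry a label like app: kafka, which is my placeholder) added to the pod template keeps two Kafka pods off the same node; a similar rule with a broader label selector keeps Kafka and ZooKeeper apart:
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: kafka
        topologyKey: kubernetes.io/hostname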
If you lose the ZooKeeper quorum (with three nodes, that means two of them down), no leader can be elected and the ensemble stops serving requests. This effectively can crash or corrupt Kafka, yes. Losing only one of three still leaves a quorum, but you can no longer tolerate another failure.
To mitigate that risk, you can choose to run 5 ZooKeepers, in which case you can lose up to 2 servers and still keep a quorum. The Definitive Guide book covers these concepts in the first few chapters.
Regarding the other questions - NodePorts and PVCs, generally speaking.
Use one of the popular Kafka Operators on GitHub and you won't need to think too hard about setting those properties.
You still must manually perform Kafka admin tasks in any installation... You can use extra services like Cruise Control if you want to reduce that workload, though

Building a Kafka Cluster using two servers only

I'm planning to build a Kafka Cluster using two servers, and host Zookeeper on these two servers as well.
The question is: since Kafka requires ZooKeeper to run, what is the best ZooKeeper cluster layout to implement a Kafka cluster on two servers?
For example, I'm currently running a ZooKeeper instance on each of the two servers and one Kafka broker on each server, and in the Kafka configuration they point to all ZooKeepers.
Is there a better way to do this?
First of all, you don't have to set up ZooKeeper and Kafka on the same servers. One of the roles of ZooKeeper is electing the controller (the broker responsible for maintaining the leader/follower relationship for all the partitions). For the election, a majority of ZooKeeper nodes must be alive. In your case, if even one ZooKeeper instance is down, you cannot elect a controller, so there is no difference between having one ZooKeeper or two. That's why it is recommended to have at least 3 nodes in a ZooKeeper cluster; that way you can handle the failure of one ZooKeeper node.
In addition to this, it is highly recommended to have at least three brokers in your Kafka cluster to maintain both consistency and high availability. (link1, link2)
UPDATE:
As long as you are limited to only two servers, you can consider sacrificing high availability by setting min.insync.replicas=2 on your brokers and creating topics with replication.factor=2. If HA is more important than avoiding data loss, then you can use the min.insync.replicas=1 (default) broker config, again with topic replication.factor=2. In this situation, those are your options IMHO. (Having one or two ZooKeepers is not important, as I mentioned above.)
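As a sketch, the two options boil down to these settings (the property names are standard Kafka configs; nothing else here is specific to your setup):
# consistency-first option (server.properties on both brokers):
min.insync.replicas=2
# availability-first option (the broker default):
min.insync.replicas=1
# in both cases create topics with --replication-factor 2; note that producers
# must use acks=all for min.insync.replicas to actually be enforced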
I am often faced with the same problem as you, #frisky5, where I would like to achieve a "suboptimal" HA system using only 2 nodes, and thus workarounds are always needed with cloud-native frameworks that rely on the assumption that clusters will have a lot of nodes available.
That ain't always the case in real life, is it ;) ?
That being said, I see you as essentially having 2 options:
Externalize the ZooKeeper configuration on a replicated storage system shared by the 2 nodes (e.g. DRBD)
Replicate the Kafka data volumes entirely onto the second node and use 2 one-node Kafka clusters that you switch on and off depending on which is the current master node.
I would go for the first option. In that case you would have 2 Kafka servers and one ZooKeeper server whose IP needs to be static (a virtual IP). When the ZooKeeper node goes down, it is restarted on the second node with the same VIP, but it needs access to the synchronized data folder.
I am not too familiar with ZooKeeper's internals and I can't tell you whether it will run into conflicts when starting up on a data store that "wasn't its own", but I would guess it makes sense for you to test it using a simple rsync setup.
Another way to achieve consensus, if you are using a k3s-based Kubernetes cluster, would be to rely on the internal k8s distributed consensus mechanics to "tell Kafka" which node is the leader. This works for the postgres operator by Crunchy Data because Patroni is cool ( https://patroni.readthedocs.io/en/latest/kubernetes.html ) 😎, but I am not sure whether Kafka/ZooKeeper are that flexible and can communicate with a REST API to set their locks ...
Once you have achieved this intermediate step, you can use a PostgreSQL db as an external source of truth for k3s, and then it is as simple as syncing the Postgres data folder between the machines (easily done with rsync). The beauty of this approach is that it is way more generic and could be used for other systems too.
Let me know what you think about these two approaches and whether you manage to set up a test environment. If you do it on GitHub, I can help you out with the implementation.

During rolling upgrade/restart, how to detect when a kafka broker is "done"?

I need to automate a rolling restart of a kafka cluster (3 kafka brokers). I can easily do it manually - restart one after the other, while checking the log to see when it's fine (e.g., when the new process has joined the cluster).
What is a good way to automate this check? How can I ask the broker whether it's up and running, connected to its peers, all topics up-to-date and such? In my restart script I have access to the metrics, but to be frank, I did not really see one that gives me a clear picture.
Another approach would be to ask what a good "readiness" probe would be that does not simply check some TCP/IP port, but looks at the actual server...
I would suggest exposing JMX metrics and tracking the following for cluster health:
the controller count (must be 1 over the whole cluster)
under replicated partitions (should be zero for healthy cluster)
unclean leader elections (if you don't disable this in server.properties, make sure there are none in the metric counts)
ISR shrinks within a reasonable time period, like a 10-minute window (should be none)
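These correspond to the standard broker MBeans (the usual object names; verify them against your broker version):
kafka.controller:type=KafkaController,name=ActiveControllerCount
kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions
kafka.controller:type=ControllerStats,name=UncleanLeaderElectionsPerSec
kafka.server:type=ReplicaManager,name=IsrShrinksPerSec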
Also, Yelp has tooling for rolling restarts implemented in Python, which requires Jolokia JMX Agents installed on the brokers, and it polls the metrics to make sure some of the above conditions are true
Assuming your cluster was healthy at the beginning of the restart operation, at a minimum, after each broker restart, you should ensure that the under-replicated partition count returns to zero before restarting the next broker.
As the previous responders mentioned, there is existing code out there to automate this. I don't use Jolokia myself, but my solution (which I'm working on now) also uses JMX metrics.
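A minimal sketch of that wait between restarts (assuming kafka-topics.sh is on the PATH and ZooKeeper is reachable at zk1:2181, both placeholders; newer brokers take --bootstrap-server instead of --zookeeper):
while [ -n "$(kafka-topics.sh --zookeeper zk1:2181 --describe --under-replicated-partitions)" ]; do
  sleep 10   # keep polling until no partition is reported as under-replicated
done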
Kafka Utils by Yelp is one of the best tools that can be used to detect when a Kafka broker is "done". Specifically, kafka_rolling_restart is the tool, which gets broker details from ZooKeeper and URP (Under Replicated Partitions) metrics from each broker. When a broker is restarted, the total URP count across the Kafka cluster is collected periodically, and when it drops to zero the next broker is restarted. The controller broker is restarted last.

Kafka in distributed system

I am new to Kafka; I am running Kafka on a single machine as of now. I want to run Kafka in a distributed environment on multiple machines. There is no proper documentation for this. Any documentation or suggestions on this will be really helpful.
Adding on to the previous answer by user2720864
Let us assume that Kafka system with below configuration is needed.
7 Kafka nodes
3 ZooKeeper nodes
To achieve this, install 7 Kafka instances on 7 different servers/VMs, and in each of these instances set a different broker id; this lets ZooKeeper identify the different Kafka nodes for bookkeeping and maintenance.
broker.id=X (/config/server.properties)
To start the ZooKeepers, you can use 3 of the servers already running Kafka or use new servers. Once the servers on which the ZooKeepers run are decided, change /config/server.properties to point to them:
zookeeper.connect=hostname1:port1,hostname2:port2
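Putting it together, a minimal config/server.properties sketch for one of the brokers (hostnames, paths, and the listener line are placeholders of mine):
broker.id=3
listeners=PLAINTEXT://kafka3:9092
log.dirs=/var/lib/kafka-logs
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181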
In a distributed environment it's good to have 3 ZooKeepers. Only one ZooKeeper acts as the true master at a time, and the other 2 act as failovers; when the master fails, one of the two remaining ZKs takes over as master.
I found this link to be very useful; it helped me clarify a lot of things about Kafka architecture.
This is a good reference for all the configurations on the property files in kafka.
Hope this helps!
Basically you need to do the following:
1) Set up Kafka on all the machines
2) Configure the config/server1.properties file to specify a unique id for each machine. You can do that by setting the broker.id property in the config file, e.g. broker.id=1, broker.id=2. For every broker this id should be unique; this is how every node is identified in a Kafka cluster.
3) Start Kafka on all the nodes
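For example, on each node (paths as in a standard Kafka download; if the node also runs ZooKeeper, start that first):
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties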
You can refer to Step 6: Setting up a multi broker cluster on their official quickstart page.
Also, here is a nice article worth taking a look at.