Starting RabbitMQ in K8s with Pre-defined Topics - kubernetes

This might be a dumb question but I've never used RabbitMQ in K8s before and am trying to figure out the best way to accomplish what I'm about to ask. I also haven't been able to find as part of my searching any results that address this particular question.
I have a series of deployments/pods, of which a few need to communicate using RabbitMQ via a topic (or topics). Is there any way to spin up a RabbitMQ cluster such that, when it starts up, it auto-creates the specified topics, exchanges, etc.?
I've yet to see any way to actually do this, which leads me to believe that the "best" way to accomplish this is to use an init container (or some other startup script elsewhere) that programmatically uses the RabbitMQ API to create and configure the topics. I could use pre-determined topic names to avoid having to update things at runtime.

Related

Monitor if Kafka is up?

I need to simply monitor if my Kafka cluster is up. Occasionally the machines running Kafka were shutdown. I want to send an email alert if the cluster is not available.
I can create a producer and consumer to send and receive dummy messages periodically. Is there a simpler way to do it?
You can use https://github.com/obsidiandynamics/kafdrop
It won't send you emails, but it much easier than send dummy messages
Actually knowing if cluster is up is not so easy at all, there is discussion with community what is the best practice to decide if kafka cluster is up and active but there is no current good way to get this information, as kafka architecture is distributed system, you might have big clusters and while one or more brokers are down , still having your cluster to give high available service, not effecting the integrity of data. Also you might have problems with one topic while on other topics it might work fine.
One suggestion I read which might give you the most certain approach is to produce "dummy" msgs to your applicative topics, and "skip" these msgs on consumption, that guarantee you that your application would work. I don't like this approach very much as it requires to "send junk to your main topics"
Other approaches are like you say "produce/consume to/from test/healthcheck topic" but it is might not give full guarantee that your application would work, this is a lot like select from dummy in other db approaches... if for them is good enough....
Another suggestion is to use AdminClient to read the metrics of cluster, if metrics are provided that usually means the cluster is healthy , also not very good guarantee...
I asked in comment which language are you using, maybe you are using something like spring which has HealthIndicator to check component status, but for your case it would be little different.
First of all, you should know that Kafka by default should be High
Available, so while building the cluster you should follow the bold
lines of best practices, you should ensure that you have replicas of
machines. This is good assumption that will make you satisfied over implementing all of this.
But, if you want to check health of a cluster, you can use admin process, you can use AdminClient, with help of some utilities; you can check list of topics, groups, etc that you have. But this not 100% guarantee for you although it is good workaround.
You can do that using as you mentioned periodic scheduler, and send email based on the findings you get. But again this is not the ideal solution, and HA cluster infrastructure should save lots of time for you if you build it correctly from the beginning.

Mapping out a Kafka+Zookeeper cluster

Background
I inherited a Kafka/Zookeeper installation. I have a passing knowledge of those - I know the general architecture, how clients work, about topics, etc., have been involved in programming Java clients etc.
But the installation is somewhat dubious. They are three instances of Kafka and Zookeeper each (in their separate docker containers). Supposedly they should work, but what I am seeing is all processes spout immense amount of log output with loads and loads of (diverse) warnings and errors. I have the impression that some of these seem to be quite normal (or are being self-healed all the time), and am having a very hard time figuring if everything works as intended or not, and set up correctly.
Some of these are - according to Google - related to unclean shutdowns of the brokers; corrupted individual topics and such. As this is a test environment, I can easily delete such files.
I know about some commands which help me check topics etc. (basic stuff, like listing them, displaying their individual configuration etc.).
However...
Question
Is there an online ressource/documentation which can be used as a systematic walkthrough to check whether everything is basically setup OK; for example to clear up these questions:
Do the three Zookeepers and the three Kafka instances correctly talk to each other for high-availability purposes? Do they have a correct "leader" etc.?
Are the servers generally "healthy", i.e., easily able to accept connections etc.?
How are the topics working (what's in there, how many messages, etc.)?
I am aware that one may very quickly dismiss this question as too generic; I am not asking you to solve my problems. I am looking for a ressource to systematically walk through such an installation - it may or may not cover the examples I have given, but it definitely should give a systematic way to find out if things are fundamentally wrong.
Rather than looking solely at logs, you might want to familiarize yourself with JMX metrics and how you can gather them across the cluster.
If you want to actually collect and analyze logs, you'll likely need to separately use something like Elasticsearch.
You won't see "how many messages" in a topic, and you'll need even more monitoring to know if a port is actually open and the Kafka process is running, the disks are filling up, etc.
My point here is that, Kafka needs fed and watered, if you plan to productionalize it, you can't just set up a small cluster and forget about it. Even if you think it's setup correctly at the beginning, increasing the load on it will cause it to fall in a bad state eventually.
For a limited trial for your dev environment to get a full look at your cluster health, Confluent Control Center can assist with that.
To solve the "what's in there" problem, I suggest you setup a Schema Registry, and convince Kafka producers to use it.
This packtpub tutorial/training by Stéphane Maarek is wonderful resource for setting kafka in cluster mode. However he did that in AWS cloud in ubuntu VM.
I have followed the same steps and installed in Vagrant VMs in cent OS. You can find the code here.
The VM has yahoo kafka manager to monitor the kafka internal details. list of broker available, healthy , partitions, leaders etc.,
kafka manager can help you with high level monitoring.
Please provide your comments.

How to notify POD in statefull set about other PODS in Kubernetes

I was reading the tutorial on deploying a Cassandra ring and zookeeper with statefulsets. What I don't understand is if I decide to add another replica into the statefulset, how do I notify the other PODS that there is another one. What are best practices for it? I want to be able for one POD to redirect request to another POD in my custom application in case the request doesn't belong to it (ie. it doesn't have the data)
Well, seems like you want to run a clustered application inside kubernetes. It is not something that kubernetes is directly responsible for. The cluster coordination for given solution should be handled within it, and a response to a "how to" question can not be generic.
Most of the softwares out there will have some kind of coordination, discovery and registration mechanism. Be it preconfigured members, external dioscovery catalog/db or some networ broadcasting.
StatefulSet helps a lot in it by retaining network identity under service/pod, or helping to keep storage, so you can ie. always point your new replicas to register with first replica (or preferably one of the first two, cause what if your no.1 is the one that restarted), but as a wrote above, this is pretty much depending on capabilities available on the solution you want to deploy.

kafka Multi-Datacenter with high availability

I'm setting up 2 kafka v0.10.1.0 clusters on different DCs and planning to use mirror-maker to keep one as source and the other one as target, what I'm not sure is how to ensure high availability when my source/main cluster goes down (complete DC where source kafka cluster goes down) do I need to make my application switch to produce messages to the target kafka and what will happen when source kafka is back? how to bring it back in sync with the possible lost messages?
Thanks
From reading your question I don't think, that MirrorMaker will be a suitable tool for your needs I am afraid.
Basically MirrorMaker is simply a Consumer and a Producer tied together to replicate messages from one cluster to another. It is not a tool to tie two Kafka clusters together in an active-active configuration, which sounds a lot like what you are looking for.
But to answer your questions in order:
Do I need to make my application switch to produce messages to the
target kafka?
Yes, there is currently no failover function, you would need to implement logic in your producers to try the target cluster after x amount of failed messages or no messages sent in y minutes or something like that.
What will happen when source kafka is back?
Pretty much nothing that you don't implement yourself :)
MirrorMaker will start replicating data from your source cluster to your target cluster again, but since your producers now switched over to the target cluster, the source cluster is not getting any data, so they will idle along.
Your producers will keep producing into the target cluster, unless you implemented a regular check whether the source came back online and have them switch back.
How to bring it back in sync with the possible lost messages?
When your source cluster is back online and assuming all the things I mentioned above have happened you effectively switched your clusters around, depending on whether you want your source as primary cluster that gets written to or are happy to reverse roles when this happens you have two options that I can come up with off the top of my head:
reverse the direction of mirrormaker and set the consumer group offsets manually so that it picks up at the point where the source cluster died
stop producing new data for a while, recover missing data to the source cluster, switch back your producers and start everything up again.
Both options require you to figure out, what data is missing on the source cluster manually though, I don't think there is a way around this.
Bottom line is, that this in not an easy thing to do with MirrorMaker and it might be worth having another think about whether you really want to switch producers over to the target cluster if the source goes down.
You could also have a look at Confluent's Replicator, which might better suit what you are looking for and is part of their corporate offering. Information is a bit sparse on that, let me know if you are interested in it and I can make an introduction to someone who can tell you more about it (or of course just send a mail to Confluent, that'll reach the right person as well).

Learning Zookeeper - Help me with example

I'm trying to wrap my head around Zookeeper and what it does. To this point, my experience with Zookeeper has been through other libraries that require Zookeeper (Solr and Kafka) and so my basic understand is the very vague "you better use Zookeeper to keep your configuration straight".
So help me think through a simple example problem. Let's say that I build my own service that does "stuff". There are two things that I want to protect:
I want to have as little downtime as possible (gotta keep doing stuff).
I can not have more than one server doing stuff because bad things would happen.
So, how would I set this up in Zookeeper? Is Zookeeper responsible for starting another stuff server if one goes down? Or do I subscribe to a Zookeeper "stuff doer status" callback? If I erroneously start up two stuff servers, how does Zookeeper help me keep bad things from happening?
Zookeeper is a distributed lock manager. These systems provide features like coordinator election (aka "master election" or "leader election") for a distributed system, as well as provide a consistent, distributed access to small amounts of critical information which is frequently used for configuration (i.e., don't treat it like a database or a general file system).
Note that Zookeeper does not manage your service, but you can use Zookeeper to keep a hot standby (or several) such that in case of one master failing, another one will take over, so you would run N replicas of your servers, such that one of the working instances can take over immediately if the current leader goes down or becomes unavailable for any reason.
Using master election, you can choose to have two (or more) servers, but only one of them will be able to take the master lock, so only that one will be able to take action. As soon as it goes away, it will lose its claim to the lock, and your hot standby will pick up the lock and start doing work that you need it to do. Look at Zookeeper recipes for code samples. However, properly handing off work, checkpointing, and general service resilience is still up to you to design and implement.
That said, Zookeeper and similar systems provide a solid foundation to enable you to build robust distributed systems.
Other systems similar to Zookeeper include (alphabetically):
Chubby
doozerd
etcd
Several of these have detailed comparisons written up on their respective websites to show how they differ from the others in the list.