Does a Mesos slave need to be contacted by a Mesos master? - apache-zookeeper

Can Apache Mesos 'slave' nodes be located on a separate network from the Mesos 'master' nodes? Similarly (for high-availability (HA) deployments), can the Apache ZooKeeper nodes used in Mesos 'master' election be deployed on a separate network from the Mesos 'slave' nodes?
Currently, I have 3 master+slave nodes in the cloud, and I want to add a slave installed on my local subnet.
If such a setup is feasible, what are the pros/cons of such a setup?

I think https://www.stratio.com/blog/mesos-multi-data-center-architecture-for-disaster-recovery/ is a nice read on several of the things you need to make this work. It also covers some scenarios for handling things when a DC is down.
Pros:
You can fail over if a DC goes down or becomes unreachable
Cons:
Each DC must be able to run the environment by itself (actively, or you should be able to scale up fast), which creates overhead costs
Complexity increases (network, mesos/application configuration)
About the networks: they must be able to connect somehow, either over the public internet (encrypted and firewalled; I also think every node would need a public IP) or via an IPsec tunnel or another option the link mentions.
I don't think going over the internet without tunneling (the first option I mention) is a very good idea.
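To make the title question concrete: as far as I understand, the masters do send messages back to the agent (health checks, task launches), so a local agent behind NAT must advertise an address the masters can actually route to, e.g. the tunnel or public endpoint. A minimal sketch, with hypothetical hostnames/addresses, assuming a Mesos version that supports --advertise_ip:
# Start a local agent that registers with the cloud masters via ZooKeeper.
# --ip binds the local interface; --advertise_ip is the address the masters
# will use to reach this agent (public or tunnel endpoint).
mesos-slave \
    --master=zk://zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181/mesos \
    --ip=192.168.1.50 \
    --advertise_ip=203.0.113.10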

Related

Consume multiple network interfaces of a single machine for a Kafka cluster

I have a Linux machine with 3 network interfaces;
let's say the IPs are 192.168.1.101, 192.168.1.102, and 192.168.1.103.
I want to use all 3 IPs of this single node to create a Kafka cluster with other nodes. Should all 3 IPs have their own separate brokers?
Also, NIC bonding is not recommended; all IPs need to be utilized.
Overall, I'm not sure why you'd want to do this... If you are using separate volumes (log.dirs) for each address, then maybe you'd want separate Java processes, sure, but you'd still be sharing the same memory, and that machine would still be a single point of failure.
In any case, you can set one process to have advertised.listeners list out each of those addresses for clients to communicate with; however, you'd still have to deal with port allocations in the OS, so you might need to set listeners like so:
listeners=PLAINTEXT_1://0.0.0.0:9092,PLAINTEXT_2://0.0.0.0:9093,PLAINTEXT_3://0.0.0.0:9094
And make sure listener.security.protocol.map is set up as well, using those names.
Note that clients will only communicate with the leader of a topic-partition at any time, so if you have one broker JVM process and 3 addresses for it, then really only one address is going to be utilized. One optimization could be to put intra-cluster replication on a separate NIC.
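Putting the answer together, a server.properties sketch for a single broker exposing all three addresses (the listener names are arbitrary labels, and the replication hint assumes the broker-side setting inter.broker.listener.name):
# Bind one listener per port; advertise each on a different interface.
listeners=PLAINTEXT_1://0.0.0.0:9092,PLAINTEXT_2://0.0.0.0:9093,PLAINTEXT_3://0.0.0.0:9094
advertised.listeners=PLAINTEXT_1://192.168.1.101:9092,PLAINTEXT_2://192.168.1.102:9093,PLAINTEXT_3://192.168.1.103:9094
# Map each custom listener name to a security protocol.
listener.security.protocol.map=PLAINTEXT_1:PLAINTEXT,PLAINTEXT_2:PLAINTEXT,PLAINTEXT_3:PLAINTEXT
# Optionally pin intra-cluster replication to one NIC, as suggested above.
inter.broker.listener.name=PLAINTEXT_3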

MongoDB sharding: mongos and configuration servers together?

We want to create a MongoDB shard (v2.4). The official documentation recommends having 3 config servers.
However, our company's policies won't allow us to get 3 extra servers for this purpose. Since we already have 3 application servers (1 web node, 2 process nodes), we are considering putting the config servers on the same application servers, alongside the mongos instances. Availability is not critical for us.
What do you think about this configuration? Can we face some problem or is it discouraged for some reason?
Given that availability is not critical for your use case, it should be fine to place the config servers on the same application servers as the mongos instances.
If one of the process nodes goes down, you will lose 1 mongos, 1 application server, and 1 config server. During this downtime, the other two config servers will be read-only, which means there will be no balancing of shards, no modifications to the cluster config, etc., although your other two mongos instances should still be operational (CRUD-wise). If your web node goes down, you have a bigger problem to deal with.
If two of the nodes are down (2 process nodes, or 1 web node and 1 process node), again you have a bigger problem to deal with, i.e. your applications are probably not going to work anyway.
Having said that, please check that these nodes have the capacity (CPU, RAM, network connections, etc.) to handle a mongos, an application server, and a config server at the same time.
I would recommend testing the deployment architecture in a development/staging cluster first, under your typical workload and use case.
Also see Sharded Cluster High Availability for more info.
Lastly, I would recommend checking out MongoDB v3.2, which is the current stable release. The config servers in v3.2 are modelled as a replica set; see Sharded Cluster Config Servers for more info.
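For the v2.4-era layout discussed above, a rough sketch of what would run on each application server (hostnames are hypothetical; in v2.4 every mongos should be given the same --configdb string, in the same order):
# On each of the three application servers: one config server, one mongos.
mongod --configsvr --dbpath /data/configdb --port 27019
mongos --configdb app1.example.com:27019,app2.example.com:27019,app3.example.com:27019 --port 27017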

Is it possible to set up an ejabberd cluster in master-slave mode where data is not being replicated?

I am trying to set up a very simple cluster of 2 ejabberd nodes. However, when following the official ejabberd documentation and using the join_cluster argument that comes with the ejabberdctl script, I always end up with a multi-master cluster where both mnesia databases hold replicated data.
Is it possible to set up an ejabberd cluster in master-slave mode? And if yes, what am I missing?
In my understanding, a slave gets the data replicated but is simply not active. The slave needs the data to be able to take over the master's job at some point.
That seems to mean the core of the setup you describe is not about disabling replication but about not sending traffic to the slave, no?
In that case, it is just a matter of configuring your load balancing mechanism to route traffic according to your preference, for example as sketched below.
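As a sketch of that load-balancing approach (addresses are hypothetical), an HAProxy frontend could keep the second node as a pure standby: it only receives XMPP traffic when the first node fails its health check.
listen xmpp
    bind *:5222
    mode tcp
    server node1 10.0.0.1:5222 check
    server node2 10.0.0.2:5222 check backup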

ejabberd cluster: Multi-master or Master-slave

So far, what I've come across is this:
Setting up an ejabberd cluster in a master-slave configuration leaves a single point of failure, and people have experienced issues where, even after fixing the master (if it goes down), the cluster doesn't become operable again. Sometimes the ejabberd instance on every slave has to be revisited to get it working properly, or mnesia commands have to be entered again to make the master communicate with the slaves.
Setting up an ejabberd cluster in a multi-master configuration means any node can be taken out of the cluster without bringing the whole cluster down. Basically, there is no single point of failure, and this is also the way the official ejabberd documentation tells you to do it, via the join_cluster argument exposed in the ejabberdctl script. However, in this case all the data is replicated across both nodes, which in my opinion is a big performance overhead.
So it boils down to this.
What is the best/recommended/popular mode in which an ejabberd cluster of 2 nodes should be set up, primarily with respect to performance, but keeping other critical factors (fault tolerance, load balancing) in mind as well?
There is only a single mode in ejabberd. Basically, it works like what you describe as multi-master; master-slave would be the same setup with the load balancing mechanism simply sending no traffic to the second node.
So case 2 is the way to go.
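For reference, the standard clustering procedure is roughly the following (node names are hypothetical; both nodes must share the same Erlang cookie). Run on the second node:
ejabberdctl start
ejabberdctl join_cluster ejabberd@node1
ejabberdctl list_cluster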

Is my RabbitMQ cluster Active Active or Active Passive?

I have created a cluster consisting of three RabbitMQ nodes using the join_cluster command,
i.e.
rabbitmqctl -n rabbit2@MYPC1 join_cluster rabbit1@MYPC1
(currently the cluster runs on a single computer)
Questions:
The documentation says there is one implementation for active/passive and one for active/active.
What did I configure?
How do I know?
How can it be changed?
Is there a big performance trade off between Active Active & Active Passive?
What is the best practice to interact with active/active?
i.e. install a load balancer? Apache doing round robin?
What is the best practice to interact with active/passive?
if I interact with only the active node - this is a single point of failure
Thanks.
I have been doing some research into availability options with RabbitMQ and while I am still fairly new, I'll attempt to answer your questions with the knowledge I do have. Please understand that these answers are not intended to be comprehensive.
Before getting to the questions and answers, I think it's worth pointing out that the terms Active/Active and Active/Passive don't really apply to a cluster running on a single computer. They are typically used to describe highly available clusters where you have a system of more than one logical server (in your case, multiple RabbitMQ clusters), plus shared/redundant storage, network capabilities, power, etc.
What did I configure?
Without any load balancing across the nodes in your cluster or queue mirroring, you have neither; that is, you do not have a highly available cluster.
How do I know?
RabbitMQ does not provide any connection management, so traffic to a failed node will not automatically be passed on to a different node, which is required for an active/active cluster. Without queue mirroring you do not have fully redundant nodes in your cluster, which is required for active/passive.
How can it be changed?
Even if you implement load balancing and/or queue mirroring, you are still missing a number of requirements for a highly available RabbitMQ cluster. Primarily, a RabbitMQ cluster gives you only a single logical broker (at least two are required for an HA cluster). Queue mirroring itself can be enabled with a policy, as sketched below.
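A minimal sketch of such a policy (the policy name and pattern are arbitrary; "ha-mode":"all" mirrors every queue to every node in the cluster):
rabbitmqctl set_policy ha-all "^" '{"ha-mode":"all"}'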
Is there a big performance trade off between Active Active & Active Passive?
I think you will start seeing performance penalties as you introduce data replication and/or redundancy, which affects both Active/Active and Active/Passive. Synchronous data replication carries a bigger performance hit than asynchronous replication. There's a lot more to it, but my feeling is that Active/Active may take the bigger hit, though this depends heavily on how well all of the pieces work together. In Active/Passive, where you may be replicating asynchronously across servers, performance may appear better, but in a failover situation you would need to wait for that replication to complete before you can switch to your secondary server.
What is the best practice to interact with active/active? i.e. install a load balancer? apache that will round robin
RabbitMQ recommends using a load balancer so that you do not have to leak details about the nodes in your cluster to the clients.
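For example, a minimal HAProxy sketch (hypothetical addresses) that round-robins client connections across the AMQP nodes, so clients only ever see one endpoint:
listen rabbitmq
    bind *:5672
    mode tcp
    balance roundrobin
    server rabbit1 10.0.0.1:5672 check
    server rabbit2 10.0.0.2:5672 check
    server rabbit3 10.0.0.3:5672 check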
What is the best practice to interact with active/passive? if I interact with only the active - this is a single point of failure
It is a point of failure, but with Active/Passive you can implement a failover strategy that retries the next available server or all remaining servers. With such strategies in place, the capabilities of your cluster are merely degraded during a failover instead of totally unavailable. Also, you can interact with the passive side, but the types of interaction may be very different (i.e. read-only access), since there may be fewer resources available on the passive side and there may be delays in data replication.
Here are some references used to gather this information:
High-Availability Cluster on Wikipedia
Clustering with RabbitMQ
Highly Available Queues in a RabbitMQ Cluster
High Availability in RabbitMQ