Is quorum needed in Keycloak Standalone Clustered Configuration? - wildfly

The documentation states that Keycloak is built on top of the WildFly application server and its sub-projects, such as Infinispan (for caching) and Hibernate (for persistence).
Keycloak recommends consulting the WildFly Documentation and its High Availability Guide.
If I understand correctly, the Standalone Clustered Configuration allows session replication and the transmission of SSO contexts around the cluster.
What I don't understand is whether an odd number of Keycloak nodes is required so that there is a quorum.
The Singleton subsystem documentation states (section 10.1.3, Quorum):
Network partitions are particularly problematic for singleton services, since they can trigger multiple singleton providers for the same service to run at the same time. To defend against this scenario, a singleton policy may define a quorum that requires a minimum number of nodes to be present before a singleton provider election can take place. A typical deployment scenario uses a quorum of N/2 + 1, where N is the anticipated cluster size (e.g. a quorum of 5/2 + 1 = 3 for an anticipated cluster of 5 nodes). This value can be updated at runtime, and will immediately affect any active singleton services.
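For reference, that runtime update goes through the WildFly management CLI. A minimal sketch, assuming a singleton policy named default (the policy name in your configuration may differ):

/subsystem=singleton/singleton-policy=default:write-attribute(name=quorum, value=3)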
Is it somehow related to Keycloak and its Standalone Clustered Configuration?

Response from Keycloak mailing list:
No, Keycloak uses Infinispan for caching and Infinispan uses JGroups for
clustering. JGroups doesn't need consensus.
As for an odd number of nodes: no, it is not strictly required. As in almost all distributed systems, though, having an odd number of nodes helps with recovering from split-brain scenarios.
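For context, the Standalone Clustered Configuration discussed above corresponds to Keycloak's standalone-ha server profile, which enables the Infinispan/JGroups clustering mentioned in the answer. Assuming a default Keycloak distribution, a node is typically started with:

bin/standalone.sh --server-config=standalone-ha.xml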

Related

Implementing multi-datacenter Cassandra with Phantom driver

I'm working with Cassandra 3.x and the Phantom driver (Scala),
and I'm modifying my Cassandra deployment from a simple three-node cluster to a multi-datacenter deployment that consists of two datacenters:
Transactional - the "main" datacenter, to which all reads/writes occur (except for reads/writes done by some analytics job).
Analytics - a datacenter used for analytics purposes only. The analytics job should operate on (i.e. read from and write to) this datacenter.
Both datacenters are configured with the proper snitch and replication factor strategies.
Based on this article (the "Workload Separation" section), I'm supposed to be able to read/write from the "Transactional" datacenter and run analytics jobs on the "Analytics" datacenter. However, I'm not sure how to get this to work with the Phantom driver.
How can I configure the driver to read/write from the proper datacenter?
Will setting the hosts in the ContactPoints class to nodes from the Transactional datacenter only do the trick?
By default, the Java driver 3.x uses the so-called DCAware load balancing policy combined with the TokenAware policy. The local datacenter can be configured explicitly by using the withLocalDc function of the policy builder, but it can also be omitted, in which case the driver uses the datacenter of the first contact point reached at initialization. So you can simply point Phantom only at servers in the transactional DC, and it will work with that DC alone (unless you use non-local consistency levels, such as QUORUM, SERIAL, EACH_QUORUM, etc.).
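To make that concrete, here is a minimal sketch of pinning the underlying Java driver to the transactional DC. The datacenter name "Transactional" and the contact point address are assumptions; Phantom is built on top of the DataStax Java driver, so the same policy can be supplied through Phantom's connector, though the exact hook depends on the Phantom version:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
import com.datastax.driver.core.policies.TokenAwarePolicy;

public class TransactionalDcClient {
    public static void main(String[] args) {
        // Token-aware routing layered over a DC-aware policy pinned to the local DC
        Cluster cluster = Cluster.builder()
                .addContactPoint("10.0.0.1") // assumed address of a node in the Transactional DC
                .withLoadBalancingPolicy(new TokenAwarePolicy(
                        DCAwareRoundRobinPolicy.builder()
                                .withLocalDc("Transactional") // must match the DC name used by the snitch
                                .build()))
                .build();
        // Pair this with LOCAL_ONE / LOCAL_QUORUM consistency so requests never leave the DC
        cluster.close();
    }
}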

difference between dcos-kafka-service and mesos-kafka

I'm doing a POC to deploy Kafka as an application on a Mesos cluster. I came across these two codebases on GitHub: one developed by apache-mesos (GitHub page) and the other developed by Mesosphere, which can run only on DC/OS (GitHub page).
Question: I would like to know if there are any differences between DCOS-Kafka and mesos-kafka in terms of features and extended functionality.
Regarding Mesos-Kafka:
I don't see active participation on GitHub for mesos-kafka in the past months (and there are some open issues). Can I assume that the service is robust enough to use in a production environment? Any inputs on this would be helpful.
kafka-mesos is a package that includes a release of Kafka and a custom Mesos scheduler that was meant to work around issues with running Kafka as a stateful service on Marathon. I think this post by Confluent is useful. It also includes a RESTful API for doing ops tasks and aims to include these features in the future (this list is pulled from the article I linked):
Integrating Kafka commands (e.g. kafka-topics, etc) into the Scheduler so it can be used through the CLI and REST API.
Auto-scaling clusters (including auto reassignment of partitions) so that the resources (CPU, RAM, etc.) that brokers are using can be used elsewhere in known valleys of traffic.
Rack-aware partition assignment for fault tolerance.
Hooks so that producers and consumers can also be launched from the Scheduler and managed with the cluster.
Automated partition reassignment based on load and traffic
I haven't used it in a production environment myself, but it has the support of Confluent, which is a good sign.
DC/OS Kafka, on the other hand, is a DC/OS service which will probably only be useful if you are already running, or plan on running, services through Mesosphere's DC/OS. It also includes an API and a CLI management tool but is less ambitious with additional features. Its current feature set includes:
Single-command installation for rapid provisioning (see the example command after this list)
Multiple clusters for multiple tenancy with DC/OS
High availability runtime configuration and software updates
Storage volumes for enhanced data durability, known as Mesos Dynamic Reservations and Persistent Volumes
Integration with syslog-compatible logging services for diagnostics and troubleshooting
Integration with statsd-compatible metrics services for capacity and performance monitoring
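To give a sense of the single-command installation from the first bullet, installing the service through the DC/OS CLI is typically just (assuming the package is published in the DC/OS Universe under the name kafka):

dcos package install kafka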

How to make the Eureka server strong?

I am new to Spring Cloud. Currently, I want to build a new microservice based on Spring Cloud. It is very easy to build a new Eureka server, but my question is how to make it highly available. For example, I create two Eureka servers and a load balancer, so that when one of the Eureka servers goes down, the system still works well. But I don't know how to keep the registered information consistent between the two Eureka servers.
I have already asked something similar in the spring cloud gitter channel.
Because of the CAP theorem, a distributed service discovery system has to decide whether to provide availability or stronger consistency, with a trade-off against the other.
in short, by quoting Spencer Gibb:
Eureka favors availability over consistency
so it is very available, while registered service entries may no longer be accurate.
As Spencer suggested, if consistency is something you need more than availability, try Consul together with Spring Cloud Consul instead.
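On the original question of keeping registration information consistent between the two Eureka servers: Eureka replicates registrations between peers that point at each other. A minimal sketch, assuming the hostnames eureka-1 and eureka-2 and the default port (all of which are assumptions):

# application.yml on eureka-1
eureka:
  client:
    serviceUrl:
      defaultZone: http://eureka-2:8761/eureka/

# application.yml on eureka-2 mirrors this, with defaultZone pointing at eureka-1

Each server registers with its peer, so a registration received by one is replicated to the other (eventually, in keeping with Eureka's availability-over-consistency design).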

Is my RabbitMQ cluster Active Active or Active Passive?

I have created a cluster consisting of three RabbitMQ nodes using the join_cluster command.
i.e.
rabbitmqctl -n rabbit2@MYPC1 join_cluster rabbit2@MYPC1
(currently the cluster runs on a single computer)
Questions:
In the documents it says there is one implementation for active/passive and one for active/active.
What did I configure?
How do I know?
How can it be changed?
Is there a big performance trade-off between Active/Active and Active/Passive?
What is the best practice to interact with active/active?
e.g. install a load balancer? Apache that will round-robin?
What is the best practice to interact with active/passive?
if I interact with only the active node, this is a single point of failure
Thanks.
I have been doing some research into availability options with RabbitMQ and while I am still fairly new, I'll attempt to answer your questions with the knowledge I do have. Please understand that these answers are not intended to be comprehensive.
Before getting to the questions and answers, I think it's worth pointing out that I think using the terms Active/Active and Active/Passive in the context of a cluster running on a single computer does not really apply. Active/Active and Active/Passive are typically terms used to describe highly available clusters where you have a system of more than one logical server (in your case, multiple RabbitMQ clusters), shared/redundant storage, network capabilities, power, etc.
What did I configure?
Without any load balancing for the nodes in your cluster or queue mirroring you have neither, meaning you do not have a highly available cluster.
How do I know?
RabbitMQ does not provide any connection management so traffic with a failed node will not automatically be passed on to a different node, which is required for an active/active cluster. Without queue mirroring you do not have fully redundant nodes in your cluster, which is required for active/passive.
How can it be changed?
Even if you implement load balancing and/or queue mirroring you are missing a number of requirements to offer a highly-available RabbitMQ cluster. Primarily, with a RabbitMQ cluster you only have a single logical broker (at least two are required for an HA cluster).
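On the queue-mirroring piece specifically, mirroring in RabbitMQ is enabled through a policy rather than in queue declarations. A minimal sketch that mirrors every queue across all nodes in the cluster (the policy name ha-all is an assumption):

rabbitmqctl set_policy ha-all "^" '{"ha-mode":"all"}'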
Is there a big performance trade-off between Active/Active and Active/Passive?
I think you will start seeing performance penalties as you introduce data replication and/or redundancy, which would affect both Active/Active and Active/Passive. If you use synchronous data replication, you will see a bigger performance hit than if you replicate data asynchronously. There's a lot more to it, but to me it feels like Active/Active may carry the bigger performance hit, though this depends heavily on how fast all of the pieces work together. In Active/Passive, where you may be using asynchronous replication across servers, your performance may appear better, but in a failover situation you would need to wait for that replication to complete before you can switch to your secondary server.
What is the best practice to interact with active/active? e.g. install a load balancer? Apache that will round-robin?
RabbitMQ recommends using a load balancer so that you do not have to leak details about the nodes in your cluster to the clients.
What is the best practice to interact with active/passive? If I interact with only the active node, this is a single point of failure.
It is a point of failure but with Active/Passive you can implement a failure strategy to retry the next available server or all remaining servers. With these strategies in place you can establish a scenario where the capabilities of your cluster are merely degraded while a failover is happening instead of totally unavailable. Also, you can interact with the passive side but the types of interactions may be very different (i.e. read-only access) since there may be fewer resources available on the passive side and there may be delays in data replication.
Here are some references used to gather this information:
High-Availability Cluster on Wikipedia
Clustering with RabbitMQ
Highly Available Queues in a RabbitMQ Cluster
High Availability in RabbitMQ

with memcache, can you add/remove nodes on the fly?

With memcache, can you add/remove nodes on the fly?
if a node goes down, does it redistribute automatically?
Memcached daemons themselves do not have any knowledge of one another. Node management is handled completely at the client level. Most client implementations rely on consistent hashing of keys to determine which server in a ring the values reside on. Many of the client libraries will failover to other nodes in the ring when a node becomes unavailable.
I am not aware of any memcached clients that attempt to provide clustering or high availability.
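As an illustration of that client-side node management, here is a minimal sketch using the spymemcached Java client with its consistent (ketama-style) key locator; the server addresses are assumptions:

import java.io.IOException;
import net.spy.memcached.AddrUtil;
import net.spy.memcached.ConnectionFactoryBuilder;
import net.spy.memcached.MemcachedClient;

public class ConsistentHashingExample {
    public static void main(String[] args) throws IOException {
        // The client hashes keys onto a ring of servers; removing one node
        // only remaps the keys that lived on that node, not the whole keyspace.
        MemcachedClient client = new MemcachedClient(
                new ConnectionFactoryBuilder()
                        .setLocatorType(ConnectionFactoryBuilder.Locator.CONSISTENT)
                        .build(),
                AddrUtil.getAddresses("cache1:11211 cache2:11211 cache3:11211"));
        client.set("greeting", 3600, "hello"); // routed to exactly one node on the ring
        System.out.println(client.get("greeting"));
        client.shutdown();
    }
}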
No. But you can try Hazelcast: although it isn't documented yet, as of version 1.8.5 it does support the memcache protocol, and the next release will have full documentation about it.
So you can replace your memcached servers with Hazelcast, and Hazelcast does support adding and removing nodes on the fly.