WildFly Singleton Service election in cluster gives Out of Memory/Metaspace issues - wildfly

WildFly runs its singleton service election in the cluster at midnight or midday, like:
2022-03-18 00:00:07,151 INFO [org.wildfly.clustering.server] (LegacyDistributedSingletonService - 1) WFLYCLSV0003: alp-esb-app02:masterdata-batch-02 elected as the singleton provider of the jboss.deployment.unit."masterdata-emp-org-powerdata-1.4.war".installer service
In our 3-node cluster, many integrations jump from node 02 to node 03 and vice versa, and in between we run into Metaspace errors. Basically it undeploys one integration and deploys another integration from the other server.
Why does this happen, why does it always end in Metaspace errors, and how could it be fixed?

This would happen if one of your cluster members is removed from the cluster (based on the criteria of your JGroups failure detection configuration). Your logs should indicate the reason.
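Two things are usually worth checking here (a minimal sketch, assuming a default standalone installation; the 512m value and the use of standalone.conf are illustrative, not a recommendation): inspect the JGroups failure-detection settings that decide when a member is dropped, and give the JVM more Metaspace headroom so the undeploy/redeploy cycle triggered by a fail-over does not exhaust it.

    # Inspect the JGroups stack, including the failure-detection protocols and their timeouts
    $JBOSS_HOME/bin/jboss-cli.sh --connect \
      --command="/subsystem=jgroups:read-resource(recursive=true,include-defaults=true)"

    # bin/standalone.conf - raise the Metaspace ceiling so repeated redeploys on fail-over
    # do not hit java.lang.OutOfMemoryError: Metaspace (512m is an example value only)
    JAVA_OPTS="$JAVA_OPTS -XX:MaxMetaspaceSize=512m"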

Related

Is quorum needed in Keycloak Standalone Clustered Configuration?

It's stated that Keycloak is built on top of the WildFly application server and its sub-projects like Infinispan (for caching) and Hibernate (for persistence).
Keycloak recommends looking at the WildFly Documentation and the High Availability Guide.
If I understood correctly, the Standalone Clustered Configuration allows session replication and transmission of SSO contexts around the cluster.
I don't understand, though, whether an odd number of Keycloak nodes is required so that there will be a quorum.
The Singleton subsystem documentation states:
10.1.3. Quorum
Network partitions are particularly problematic for singleton services, since they can trigger multiple singleton providers for the same service to run at the same time. To defend against this scenario, a singleton policy may define a quorum that requires a minimum number of nodes to be present before a singleton provider election can take place. A typical deployment scenario uses a quorum of N/2 + 1, where N is the anticipated cluster size. This value can be updated at runtime, and will immediately affect any active singleton services. e.g.
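The CLI example the quote alludes to would look roughly like this (a sketch; the policy name "default" and the value 2 are placeholders for your own policy and cluster size):

    /subsystem=singleton/singleton-policy=default:write-attribute(name=quorum, value=2)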
Is it somehow related to Keycloak and its Standalone Clustered Configuration?
Response from Keycloak mailing list:
No, Keycloak uses Infinispan for caching and Infinispan uses JGroups for clustering. JGroups doesn't need consensus.
No, an odd number of nodes is not strictly required. As in almost all distributed systems, having an odd number of nodes helps recover from split-brain scenarios.

Failsafe Kubernetes with preemptible VMs in Google Cloud

Google has an offer called preemptible VMs, which are VMs that are not guaranteed to be available all the time and that are shut down once every 24 hours.
Our goal is to deploy a failsafe (to a certain degree) Kubernetes cluster with those VMs by having enough backup VMs to handle the case that one VM is shut down. This article describes a simple scenario where preemptible VMs are used to run an image service. This scenario is simple because there is no database or message broker involved running on preemptible VMs.
Is it possible to run a whole (microservice-based) application including databases and message brokers with only preemptible VMs?
Further questions that we have:
When do the preemptible VMs usually get shut down? Is it usually the case that if one VM gets shut down, all the others are too (at the same time)?
How long is the downtime of a preemptible VM that is getting restarted?
Any guidance that helps answering those questions and/or helps us configure such a cluster is appreciated.
Regarding your questions:
1. When do the preemptible VMs usually get shut down? Is it usually the case that if one VM gets shut down, all the others are too (at the same time)?
A: The lifetime of a preemptible VM is no more than 24 hours; it can be shut down whenever Google needs the resources within that lifetime. Find more information about the limitations here. Resetting the counter means that you manually stop and start the instances; however, keep in mind that the selection process will preempt instances that were launched most recently.
2. How long is the downtime of a preemptible VM that is getting restarted?
A: If you mean where you can see the Compute Engine logs that notify you when an instance was terminated, you can use Stackdriver Logging.
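If you do try this, one common pattern (a sketch assuming GKE; the cluster name, pool name and node counts are placeholders) is to keep a small non-preemptible node pool for the stateful parts such as databases and message brokers, and add a preemptible pool for the stateless services, then steer pods with node selectors or taints:

    # Regular (non-preemptible) nodes for stateful components
    gcloud container clusters create my-cluster --num-nodes=3

    # Cheaper preemptible pool for the stateless microservices
    gcloud container node-pools create preemptible-pool \
      --cluster=my-cluster --preemptible --num-nodes=3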

Is there downtime when a partition is moved to a new node?

Service Fabric offers the capability to rebalance partitions whenever a node is removed or added to the cluster. The Service Fabric Cluster Resource Manager will move one or more partitions to this node so more work can be done.
Imagine a reliable actor service which has thousands of actors running who are distributed across multiple partitions. If the Resource Manager decides to move one or more partitions, will this cause any downtime? Or does rebalancing partitions work the same as upgrading a service?
They act pretty much the same way. The main difference I can point out is that upgrades might affect only the services being updated, while re-balancing might affect multiple services at once. During an upgrade, the cluster might re-balance the services as well, to fit the new service instance on a node.
Adding or removing nodes I would compare more with node failures. In any of these cases the services will be rebalanced because the cluster capacity changed, not because the service metric/load changed.
The main difference between a node failure and cluster scaling (adding/removing a node) is that rebalancing takes the state of the services into account during the process: when an infrastructure notification comes in telling that a node is being shut down (for updates, maintenance, or scaling down), Service Fabric will ask the infrastructure to wait so it can prepare for this announced 'failure', and then start re-balancing the services.
Even though re-balancing takes the service state into account for a scale-down, it should not be considered more reliable than a node failure, because the infrastructure will only wait for a while before shutting down the node (how long it can wait depends on the reliability tier you defined for your cluster) while Service Fabric checks whether the services meet the health conditions, such as shutting down services and creating new ones and checking that they run fine without errors. If this process takes too long, these services might be killed once the timeout is reached and the infrastructure proceeds with the changes. Also, the new instances of the services might fail on the new nodes, forcing the services to move again.
When you design your services, it is safer to treat re-balancing as a node failure, because in the end it is not much different: your services will move around, data stored in memory will be lost if not persisted, the service address will change, and so on. The services should have replicated data, and the clients should always use retry logic and refresh the service location to reduce the downtime.
The main difference between a service upgrade and service rebalancing is that during an upgrade all replicas from all partitions get turned off on a particular node. According to the documentation here, balancing is done on a per-replica basis, i.e. only some replicas from some partitions will get moved, so there shouldn't be any outage.
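As a rough illustration of the "retry and refresh the service location" advice above, a client-side helper could look like this (a sketch in plain Java; ServiceEndpointResolver and EndpointCall are hypothetical interfaces standing in for whatever naming/resolution mechanism you use, not a Service Fabric API):

    // Hypothetical helper illustrating "retry and re-resolve" on failure.
    public final class RetryingCaller {

        public interface ServiceEndpointResolver {
            // forceRefresh=true asks the resolver to bypass its cached address
            String resolveEndpoint(boolean forceRefresh) throws Exception;
        }

        public interface EndpointCall<T> {
            T invoke(String endpoint) throws Exception;
        }

        public static <T> T callWithRetry(ServiceEndpointResolver resolver,
                                          EndpointCall<T> call,
                                          int maxAttempts) throws Exception {
            Exception last = null;
            boolean refresh = false;                  // first attempt uses the cached address
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return call.invoke(resolver.resolveEndpoint(refresh));
                } catch (Exception e) {
                    last = e;
                    refresh = true;                   // the partition may have moved: re-resolve
                    Thread.sleep(200L * attempt);     // simple linear back-off
                }
            }
            throw last;
        }
    }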

Is it possible to control Service Fabric hosted service restart behaviour?

I can't find much documentation on the action that Service Fabric takes when a service it is hosting fails. I have performed some experimentation (using a stateless service in a local cluster), the results of which are below. My question is: is it possible to change this behaviour?
There are two distinct scenarios that I tested.
1. An exception thrown from the RunAsync() method.
The hosted service is restarted immediately on another cluster node. If no other node is available, then it is restarted on the same node. There does not appear to be any limit to the number of times the restart will be attempted or any kind of back-off in terms of the interval between attempts.
2. The hosted service fails to start (e.g. an exception is thrown before RunAsync() is called).
The hosted service is restarted on the same node. In my test environment there appears to be a fixed interval between restart attempts (15 seconds) but no limit to the number of attempts.
I can see in the cluster configuration that there are some parameters in the Hosting section that look like they might be relevant (ActivationMaxRetryInterval, ActivationRetryBackoffInterval, ActivationMaxFailureCount), and I am guessing that these cover scenario (2) above (assuming that Activation == service start). These appear to affect the entire cluster.
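For reference, those Hosting parameters live in the cluster's fabric settings; in a cluster manifest they look roughly like this (the values below are placeholders, not recommendations, and changing them affects every application on the cluster):

    <FabricSettings>
      <Section Name="Hosting">
        <Parameter Name="ActivationMaxFailureCount" Value="10" />
        <Parameter Name="ActivationRetryBackoffInterval" Value="5" />
        <Parameter Name="ActivationMaxRetryInterval" Value="300" />
      </Section>
    </FabricSettings>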

Prevent deployment to entry node, only deploy to other nodes

I have a free OpenShift account with the default 3 gears. On this I have installed the WildFly 8.1 image using the OpenShift web console. I set the minimum and maximum scaling to 3.
What happens now is that OpenShift will create 3 JBoss WildFly instances:
One on the entry node (which is also running HAProxy)
One on an auxiliary node
One on another auxiliary node
The weird thing is that the JBoss WildFly instance on the entry node is by default disabled in the load balancer config (haproxy.conf). BUT, OpenShift still deploys the war archive to it whenever I commit to the associated git repo.
What's extra problematic here is that, because of the incredibly low limit on max user processes (250 via ulimit -u), this JBoss WildFly instance on the entry node cannot even start up. During startup, JBoss WildFly will throw random 'java.lang.OutOfMemoryError: unable to create new native thread' errors (and no, memory is fine; it's the OS process limit).
As a result, the deployment process will hang.
So to summarize:
A JBoss WildFly instance is created on the entry node, but disabled in the load balancer
JBoss WildFly in its default configuration cannot start up on the entry node, not even with a trivial war.
The deployer process attempts to deploy to JBoss WildFly on the entry node, despite it being disabled in the load balancer
Now my question:
How can I modify the deployer process (including the gear start command) to not attempt to deploy to the JBoss WildFly instance on the entry node?
When an app scales from 2 gears to 3, HAProxy stops routing traffic to your application on the head gear and routes it to the two other gears. This ensures that HAProxy gets as much CPU as possible, as the application on your head gear (where HAProxy is running) is no longer serving requests.
The out-of-memory message you're seeing might not be an actual out-of-memory issue but a bug relating to ulimit: https://bugzilla.redhat.com/show_bug.cgi?id=1090092.
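"Disabled in the load balancer" corresponds to the local gear's server line in haproxy.cfg carrying HAProxy's disabled keyword, roughly like this (backend and server names are illustrative, not the exact ones OpenShift generates):

    backend express
        # the gear co-located with HAProxy is kept out of rotation
        server local-gear 127.0.0.1:8080 check disabled
        server gear-2 node2.example.com:35531 check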