WebLogic distributed queue - load balancing doesn't work

I have set up a WebLogic 9.2M3 cluster with two nodes on separate VMs.
On the cluster I have set up Uniform Distributed Queues shared across the cluster.
Unfortunately, load balancing on the distributed queues doesn't work: all messages are processed by the clients on the same node as the producer.
I have already checked the following:
Turned off Server affinity in the factory
Checked that multicast works between nodes in the cluster
Checked different ways of setting up targets for factory and queue by:
Setting the Factory and Queue to be deployed to Cluster and
Setting the Factory to Cluster and Queue to two JMSServers
Setting both Factory and Queue to be targeted to JMSServers
Any suggestions why the load balancing might not work with the configuration below?
This is the configuration I am using (a portion of config.xml):
<cluster>
<name>TestCluster</name>
<multicast-address>239.192.0.1</multicast-address>
<multicast-port>17001</multicast-port>
<number-of-servers-in-cluster-address>2</number-of-servers-in-cluster-address>
</cluster>
<jms-server>
<name>JMSServer1</name>
<target>server1</target>
<persistent-store xsi:nil="true"></persistent-store>
<temporary-template-resource xsi:nil="true"></temporary-template-resource>
<temporary-template-name xsi:nil="true"></temporary-template-name>
</jms-server>
<jms-server>
<name>JMSServer2</name>
<target>server2</target>
<persistent-store xsi:nil="true"></persistent-store>
<temporary-template-resource xsi:nil="true"></temporary-template-resource>
<temporary-template-name xsi:nil="true"></temporary-template-name>
</jms-server>
<jdbc-store>
<name>PersistentStore1</name>
<prefix-name>sas1_</prefix-name>
<data-source>QueueDataSource</data-source>
<target>sas1</target>
</jdbc-store>
<jdbc-store>
<name>PersistentStore2</name>
<prefix-name>sas2_</prefix-name>
<data-source>QueueDataSource</data-source>
<target>sas2</target>
</jdbc-store>
<jms-system-resource>
<name>ClusterJMSModule</name>
<target>TestCluster</target>
<sub-deployment>
<name>ClusterSubDeployment</name>
<target>TestCluster</target>
</sub-deployment>
<descriptor-file-name>jms/clusterjmsmodule-jms.xml</descriptor-file-name>
</jms-system-resource>
The definitions of destinations:
<connection-factory name="jms/levelsInputConnectionFactory">
<sub-deployment-name>ClusterSubDeployment</sub-deployment-name>
<jndi-name>jms/levelsInputConnectionFactory</jndi-name>
<load-balancing-params>
<server-affinity-enabled>false</server-affinity-enabled>
</load-balancing-params>
</connection-factory>
<uniform-distributed-queue name="jms/levelsInputQueue">
<sub-deployment-name>ClusterSubDeployment</sub-deployment-name>
<jndi-name>jms/levelsInputQueue</jndi-name>
<forward-delay>10</forward-delay>
</uniform-distributed-queue>

I followed the steps in the article http://middlewaremagic.com/weblogic/?p=3747, and it helped me set up a distributed queue for the scenario below:
(1 Admin Server (AS), 2 Managed Servers (MS), 2 boxes)
Box-A
MS-1 under Cluster
JMSServer-1 and Store-1 => MS-1 (Migratable)
Box-B
MS-2 under Cluster
JMSServer-2 and Store-2 => MS-2 (Migratable)
Admin Server
JMS_Module => Cluster
SubDeployment_UDQ => JMS Server-1, JMS Server-2
ConnectionFactory (with “affinity disabled”) => Cluster
UDQ (Distributed Queue) => SubDeployment_UDQ
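
In config.xml, the targeting described above might look roughly like the sketch below. It reuses the module, cluster, and JMS server names from the question (ClusterJMSModule, TestCluster, JMSServer1, JMSServer2) together with the SubDeployment_UDQ name from this answer. It assumes the comma-separated target list that WebLogic normally writes for a sub-deployment with several targets, so verify it against a config.xml generated by the console.

```xml
<!-- config.xml (sketch): the module is targeted to the cluster,
     the sub-deployment is targeted to the two JMS servers -->
<jms-system-resource>
  <name>ClusterJMSModule</name>
  <target>TestCluster</target>
  <sub-deployment>
    <!-- assumed name, taken from the layout in this answer -->
    <name>SubDeployment_UDQ</name>
    <!-- the JMS servers themselves, not the cluster -->
    <target>JMSServer1,JMSServer2</target>
  </sub-deployment>
  <descriptor-file-name>jms/clusterjmsmodule-jms.xml</descriptor-file-name>
</jms-system-resource>
```

The uniform distributed queue in the module descriptor would then name SubDeployment_UDQ in its sub-deployment-name element, while the connection factory stays targeted at the cluster.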

"SchemaRegistryException: Failed to get Kafka cluster ID" for LOCAL setup

I downloaded the .tz archive (I am on a Mac) for Confluent version 7.0.0 from the official Confluent site and followed the setup for LOCAL (1 node). Kafka and ZooKeeper start fine, but the Schema Registry keeps failing. (Note: I am behind a corporate VPN.)
The exception message in the SchemaRegistry logs is:
[2021-11-04 00:34:22,492] INFO Logging initialized #1403ms to org.eclipse.jetty.util.log.Slf4jLog (org.eclipse.jetty.util.log)
[2021-11-04 00:34:22,543] INFO Initial capacity 128, increased by 64, maximum capacity 2147483647. (io.confluent.rest.ApplicationServer)
[2021-11-04 00:34:22,614] INFO Adding listener: http://0.0.0.0:8081 (io.confluent.rest.ApplicationServer)
[2021-11-04 00:35:23,007] ERROR Error starting the schema registry (io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication)
io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryException: Failed to get Kafka cluster ID
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.kafkaClusterId(KafkaSchemaRegistry.java:1488)
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.<init>(KafkaSchemaRegistry.java:166)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.initSchemaRegistry(SchemaRegistryRestApplication.java:71)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.configureBaseApplication(SchemaRegistryRestApplication.java:90)
at io.confluent.rest.Application.configureHandler(Application.java:271)
at io.confluent.rest.ApplicationServer.doStart(ApplicationServer.java:245)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:44)
Caused by: java.util.concurrent.TimeoutException
at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928)
at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:180)
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.kafkaClusterId(KafkaSchemaRegistry.java:1486)
... 7 more
My schema-registry.properties file has bootstrap URL set to
kafkastore.bootstrap.servers=PLAINTEXT://localhost:9092
I saw some posts saying it's the Schema Registry being unable to connect to the Kafka cluster URL, potentially because of the localhost address. I am fairly new to Kafka and basically just need this local setup to run a git repo that uses some Kafka topics, so my questions are:
How can I fix this? (I am behind a corporate VPN, but I figured this shouldn't affect it.)
Do I even need the Schema Registry?
I ended up just going with the Docker local setup instead. The only change I had to make to the Docker Compose YAML was the schema-registry port (I changed it to 8082 or 8084, I don't remember exactly, just a port not used by any other Confluent service listed in docker-compose.yaml). My local setup is working fine now.

Wildfly : Singleton Deployment on Cluster | Elects two servers in Server Group

This does not happen every time, but it happens often.
3 Clusters of Server Group
Wildfly 16
Deploying the .war from the UI elects one server correctly:
2020-02-26 07:21:12,951 INFO [org.wildfly.clustering.server] (LegacyDistributedSingletonService - 1) WFLYCLSV0003: alp-esb-app02:servicedesk-02 elected as the singleton provider of the jboss.deployment.unit."Now-1.11-SNAPSHOT.war".installer service
2020-02-26 07:21:13,115 INFO [org.jboss.as.server] (ServerService Thread Pool -- 26) WFLYSRV0010: Deployed "Now-1.11-SNAPSHOT.war" (runtime-name : "Now-1.11-SNAPSHOT.war")
2020-02-26 07:21:14,133 INFO [org.wildfly.clustering.server] (LegacyDistributedSingletonService - 1) WFLYCLSV0001: This node will now operate as the singleton provider of the jboss.deployment.unit."Now-1.11-SNAPSHOT.war".installer service
But if I disable and re-enable it, or deploy again, the same log lines appear on two servers.
A scheduler then runs twice, which corrupts the database with duplicates.
I have to redeploy repeatedly and check the logs until only one server is elected.
Project Structure:
webapp -> META-INF -> singleton-deployment.xml
<?xml version="1.0" encoding="UTF-8"?>
<singleton-deployment xmlns="urn:jboss:singleton-deployment:1.0"/>
The scheduler starts like this:
@Startup
@Singleton
@AccessTimeout(value = 30, unit = TimeUnit.MINUTES)
public class SnowPollerNew {
Any suggestions why it sometimes works and often does not?
Is it linked to JGroups, or to communication between the two clusters?
You need to ensure that the servers are forming the cluster correctly.
I also remember some issues with singleton election (WFLY-11619).
I would expect this not to be reproducible on WildFly 18.

Keycloak cluster fails on Amazon ECS (org.infinispan.commons.CacheException: Initial state transfer timed out for cache)

I am trying to deploy a cluster of 2 Keycloak Docker images (6.0.1) on Amazon ECS (Fargate) using the built-in ECS Service Discovery mechanism (using DNS_PING).
Environment:
JGROUPS_DISCOVERY_PROTOCOL=dns.DNS_PING
JGROUPS_DISCOVERY_PROPERTIES=dns_query=my.services.internal,dns_record_type=A
JGROUPS_TRANSPORT_STACK=tcp <---(also tried udp)
The instance IPs are correctly resolved from the Route53 private namespace, and the instances discover each other without any problem (x.x.x.138 is started first, then x.x.x.76).
Second instance:
[org.jgroups.protocols.dns.DNS_PING] (ServerService Thread Pool -- 58) ip-x.x.x.76: entries collected from DNS (in 3 ms): [x.x.x.76:0, x.x.x.138:0]
[org.jgroups.protocols.dns.DNS_PING] (ServerService Thread Pool -- 58) ip-x.x.x.76: sending discovery requests to hosts [x.x.x.76:0, x.x.x.138:0] on ports [55200 .. 55200]
[org.jgroups.protocols.pbcast.GMS] (ServerService Thread Pool -- 58) ip-x.x.x.76: sending JOIN(ip-x-x-x-76) to ip-x-x-x-138
And on the first instance:
[org.infinispan.CLUSTER] (thread-8,ejb,ip-x-x-x-138) ISPN000094: Received new cluster view for channel ejb: [ip-x-x-x-138|1] (2) [ip-x-x-x-138, ip-172-x-x-x-76]
[org.infinispan.remoting.transport.jgroups.JGroupsTransport] (thread-8,ejb,ip-x-x-x-138) Joined: [ip-x-x-x-76], Left: []
[org.infinispan.CLUSTER] (thread-8,ejb,ip-x-x-x-138) ISPN100000: Node ip-x-x-x-76 joined the cluster
[org.jgroups.protocols.FD_SOCK] (FD_SOCK pinger-12,ejb,ip-x-x-x-76) ip-x-x-x-76: pingable_mbrs=[ip-x-x-x-138, ip-x-x-x-76], ping_dest=ip-x-x-x-138
So it seems we have a working cluster. Unfortunately, the second instance ends up failing with the following exception:
Caused by: org.infinispan.commons.CacheException: Initial state transfer timed out for cache work on ip-x-x-x-76
Before this occurs, I see a series of failure detection tasks suspecting and then unsuspecting the opposite instance:
[org.jgroups.protocols.FD_ALL] (Timer runner-1,null,null) haven't received a heartbeat from ip-x-x-x-76 for 60016 ms, adding it to suspect list
[org.jgroups.protocols.FD_ALL] (Timer runner-1,null,null) ip-x-x-x-138: suspecting [ip-x-x-x-76]
[org.jgroups.protocols.FD_ALL] (thread-9,ejb,ip-x-x-x-138) Unsuspecting ip-x-x-x-76
[org.jgroups.protocols.FD_SOCK] (thread-9,ejb,ip-x-x-x-138) ip-x-x-x-138: broadcasting unsuspect(ip-x-x-x-76)
On the Infinispan side (cache), everything seems to occur correctly but I am not sure. Every cache is "rebalanced" and each "rebalance" seems to end up with, for example:
[org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p24-t2) Finished receiving of segments for cache offlineSessions for topology 2.
It feels like it's a connectivity issue, but all ports are wide open between these 2 instances, for both TCP and UDP.
Any ideas? Has anyone successfully configured this on ECS (Fargate)?
UPDATE 1
The second instance was initially shutting down not because of the "Initial state transfer timed out .." error but because the health check was taking longer than the configured grace period. Nonetheless, with 2 healthy instances, I receive "404 - Not Found" once every 2 queries, telling me that there is indeed a cache problem.
In the current Keycloak Docker image (6.0.1), the default stack is UDP. According to this, version 7.0.0 will default to TCP and will also introduce a variable to toggle the stack (JGROUPS_TRANSPORT_STACK).
Using the UDP stack in Amazon ECS will "partially" work: discovery works and the cluster forms, but the Infinispan cache cannot sync between instances, which produces erratic errors. There is probably a way to make it work "as-is", but I don't see anything blocked between the instances when checking the VPC Flow logs.
A workaround is to switch to TCP by modifying the JGroups stack directly in the image in file /opt/jboss/keycloak/standalone/configuration/standalone-ha.xml:
<subsystem xmlns="urn:jboss:domain:jgroups:6.0">
<channels default="ee">
<channel name="ee" stack="tcp" cluster="ejb"/> <!-- set stack to tcp -->
</channels>
Then commit the new image:
docker commit -m="TCP cluster stack" CONTAINER_ID jboss/keycloak:6.0.1-tcp-cluster
Tag and push the image to Amazon ECR, and make sure port 7600 is allowed in the security group between your Amazon ECS tasks.

How to check cluster working between two different JBoss Server

I configured a cluster between two different JBoss servers using multicast.
Both servers connect to each other when I start them.
After one day, errors like the following start to appear for clustering in server.log:
05:28:17,447 ERROR [org.hornetq.core.server] (Thread-11905 (HornetQ-client-global-threads-377807954)) HQ224037:
cluster connection Failed to handle message: java.lang.IllegalStateException:
Cannot find binding for d7c1004f-b1a1-4160-8888-c38175ac45d599cf0dfe-5f30-11e4-bd7e-556a35fb9ec6 on
ClusterConnectionImpl#538608327[nodeUUID=930dee51-5f30-11e4-9695-ef52e2a27831, connector=TransportConfiguration(name=netty,
factory=org-hornetq-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&host=172-29-250-191, address=jms,
server=HornetQServerImpl::serverUUID=930dee51-5f30-11e4-9695-ef52e2a27831]
at org.hornetq.core.server.cluster.impl.ClusterConnectionImpl$MessageFlowRecordImpl.doConsumerCreat
05:28:17,411 ERROR [org.hornetq.core.server] (Thread-11439
(HornetQ-remoting-threads-HornetQServerImpl::serverUUID=99cf0dfe-5f30-11e4-bd7e-556a35fb9ec6-136247994-702467456))
HQ224016: Caught exception: HornetQException[errorType=QUEUE_EXISTS message=HQ119019:
Queue already exists 7a8b46d5-a038-4efd-900e-4c041c2c121f]
At org.hornetq.core.server.impl.HornetQServerImpl.createQueue(HornetQServerImpl.java:1811)
[hornetq-server-2.3.1.Final-redhat-1.jar:2.3.1.Final-redhat-1]
How can I verify that the cluster between the two servers is working? Is there a procedure or workaround available?
Red Hat provides a McastReceiverTest Java client test utility; further information on its use can be found at https://access.redhat.com/solutions/123073

Embedded ActiveMQ: jdbcPersistenceAdapter using kahaDB?

I have the following Spring configuration for the ActiveMQ broker:
<broker:broker id="activemqbroker" useJmx="false" persistent="true" brokerName="activemqbroker">
<broker:transportConnectors>
<broker:transportConnector name="vm" uri="vm://activemqbroker"/>
</broker:transportConnectors>
<broker:persistenceAdapter>
<broker:jdbcPersistenceAdapter dataSource="#oracle-ds" transactionIsolation="2">
<broker:statements>
<broker:statements tablePrefix="IAG_PROC_"/>
</broker:statements>
</broker:jdbcPersistenceAdapter>
</broker:persistenceAdapter>
</broker:broker>
The problem is that the active-mq directory with KahaDB is still being created and used. I don't understand why, because I'm using jdbcPersistenceAdapter, not journaledJDBC. How can I set this up to use only JDBC?
The scheduler feature in ActiveMQ uses its own KahaDB persistent store. Try disabling it on the broker element with schedulerSupport="false".
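
Applied to the broker definition from the question, the suggestion might look like the sketch below. Only the schedulerSupport attribute is new; everything else is copied from the question. Whether this removes the KahaDB directory entirely on a given ActiveMQ version is an assumption worth verifying.

```xml
<!-- schedulerSupport="false" turns off the scheduler, which keeps its own KahaDB store -->
<broker:broker id="activemqbroker" useJmx="false" persistent="true"
               brokerName="activemqbroker" schedulerSupport="false">
  <broker:transportConnectors>
    <broker:transportConnector name="vm" uri="vm://activemqbroker"/>
  </broker:transportConnectors>
  <broker:persistenceAdapter>
    <!-- messages are still persisted through the JDBC adapter, as in the question -->
    <broker:jdbcPersistenceAdapter dataSource="#oracle-ds" transactionIsolation="2">
      <broker:statements>
        <broker:statements tablePrefix="IAG_PROC_"/>
      </broker:statements>
    </broker:jdbcPersistenceAdapter>
  </broker:persistenceAdapter>
</broker:broker>
```

If the application relies on delayed or scheduled message delivery, disabling the scheduler would also disable that feature.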