Wildfly 9 - mod_cluster on TCP - wildfly

We are currently testing a move from WildFly 8.2.0 to WildFly 9.0.0.CR1 (or CR2 built from a snapshot). The system is a cluster using mod_cluster and runs on VPS hosts, which in practice prevents it from using multicast.
On 8.2.0 we have been using the following modcluster configuration, which works well:
<mod-cluster-config proxy-list="1.2.3.4:10001,1.2.3.5:10001" advertise="false" connector="ajp">
<dynamic-load-provider>
<load-metric type="cpu"/>
</dynamic-load-provider>
</mod-cluster-config>
Unfortunately, on 9.0.0 proxy-list was deprecated and the server fails to start with an error. Documentation is scarce; however, after a couple of tries I discovered that proxy-list was replaced with proxies, which is a list of outbound-socket-binding names. Hence, the configuration looks like the following:
<mod-cluster-config proxies="mc-prox1 mc-prox2" advertise="false" connector="ajp">
<dynamic-load-provider>
<load-metric type="cpu"/>
</dynamic-load-provider>
</mod-cluster-config>
And the following should be added into the appropriate socket-binding-group (full-ha in my case):
<outbound-socket-binding name="mc-prox1">
<remote-destination host="1.2.3.4" port="10001"/>
</outbound-socket-binding>
<outbound-socket-binding name="mc-prox2">
<remote-destination host="1.2.3.5" port="10001"/>
</outbound-socket-binding>
So far so good. After this, the httpd cluster starts registering the nodes. However, I am getting errors from the load balancer. When I look into /mod_cluster-manager, I see a couple of Node REMOVED lines, and there are also many errors like:
ERROR [org.jboss.modcluster] (UndertowEventHandlerAdapter - 1) MODCLUSTER000042: Error MEM sending STATUS command to node1/1.2.3.4:10001, configuration will be reset: MEM: Can't read node
In the log of mod_cluster there are the equivalent warnings:
manager_handler STATUS error: MEM: Can't read node
As far as I understand, the problem is that although wildfly/modcluster is able to connect to httpd/mod_cluster, it does not work the other way. Unfortunately, even after an extensive effort I am stuck.
Could someone help with setting up mod_cluster for WildFly 9.0.0 without advertising? Thanks a lot.

I ran into the Node REMOVED issue too.
I managed to solve it by setting the following instance-id:
<subsystem xmlns="urn:jboss:domain:undertow:2.0" instance-id="${jboss.server.name}">
I hope this helps someone else too ;)

There is no need for guesswork about the static proxy configuration. Each WildFly distribution comes with XSD schemas that describe the XML subsystem configuration. For instance, with WildFly 9.x, it's:
WILDFLY_DIRECTORY/docs/schema/jboss-as-mod-cluster_2_0.xsd
It says:
<xs:attribute name="proxies" use="optional">
<xs:annotation>
<xs:documentation>List of proxies for mod_cluster to register with defined by outbound-socket-binding in socket-binding-group.</xs:documentation>
</xs:annotation>
<xs:simpleType>
<xs:list itemType="xs:string"/>
</xs:simpleType>
</xs:attribute>
The following setup works out of the box:
Download wildfly-9.0.0.CR1.zip or build from sources with ./build.sh.
Let's assume you have 2 boxes: Apache HTTP Server with mod_cluster acting as a load-balancing proxy, and your WildFly server acting as a worker. Make sure both servers can reach each other: the worker must reach the address and port of the MCMP-enabled VirtualHost (Apache HTTP Server side), and the proxy must reach the WildFly AJP and HTTP connectors. A common mistake is to bind WildFly to localhost; it then reports its address as localhost to the Apache HTTP Server residing on a different box, which makes it impossible for the proxy to contact the WildFly server back. The communication is bidirectional.
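To illustrate the binding point above, here is a minimal sketch of the public interface definition in the standalone configuration. The address 10.10.2.5 is a hypothetical worker IP, not taken from the setup described here; use whatever address the proxy box can actually route to.

```xml
<!-- Sketch only: bind the "public" interface to a routable address
     instead of the default 127.0.0.1.
     10.10.2.5 is a hypothetical worker IP. -->
<interfaces>
    <interface name="public">
        <inet-address value="${jboss.bind.address:10.10.2.5}"/>
    </interface>
</interfaces>
```

Passing -b with the same address when starting the server sets jboss.bind.address and should have the same effect without editing the file.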
This is my configuration diff from the default wildfly-9.0.0.CR1.zip.
328c328
< <mod-cluster-config advertise-socket="modcluster" connector="ajp" advertise="false" proxies="my-proxy-one">
---
> <mod-cluster-config advertise-socket="modcluster" connector="ajp">
384c384
< <subsystem xmlns="urn:jboss:domain:undertow:2.0" instance-id="worker-1">
---
> <subsystem xmlns="urn:jboss:domain:undertow:2.0">
435c435
< <socket-binding-group name="standard-sockets" default-interface="public" port-offset="${jboss.socket.binding.port-offset:102}">
---
> <socket-binding-group name="standard-sockets" default-interface="public" port-offset="${jboss.socket.binding.port-offset:0}">
452,454d451
< <outbound-socket-binding name="my-proxy-one">
< <remote-destination host="10.10.2.4" port="6666"/>
< </outbound-socket-binding>
456c453
< </server>
---
> </server>
Changes explanation
proxies="my-proxy-one", outbound socket binding name; could be more of them here.
instance-id="worker-1", the name of the worker, a.k.a. JVMRoute.
offset -- you could ignore, it's just for my test setup. Offset does not apply to outbound socket bindings.
<outbound-socket-binding name="my-proxy-one"> - IP and port of the VirtualHost in Apache HTTP Server containing EnableMCPMReceive directive.
Conclusion
Generally, these MEM read / node error messages are related to network problems, e.g. WildFly can contact Apache, but Apache cannot contact WildFly back. Last but not least, it could happen that the Apache HTTP Server's configuration uses the PersistSlots directive and some substantial environment configuration change took place, e.g. a switch from mpm_prefork to mpm_worker. In this case, MEM read error messages are not related to WildFly, but to the cached slotmem files in HTTPD/cache/mod_cluster, which need to be deleted.
I'm certain it's network in your case though.

After a couple of weeks I got back to the problem and found the solution. The problem was, of course, in the configuration and had nothing to do with the particular version of WildFly. More specifically:
There were three nodes in the domain and three servers in each node. All nodes were launched with the following property:
-Djboss.node.name=nodeX
...where nodeX is the name of a particular node. However, it meant that all three servers in the node got the same name, which is exactly what confused the load balancer.
As soon as I removed this property, everything started to work.


Infinispan (Red Hat Data Grid) in Openshift, with WebSphere Liberty

We're trying to use Red Hat Data Grid (RHDG)/Infinispan in our OCP (4.5.36) cluster. We have the latest official RHDG Operator installed and a Cache type cluster defined. (Which is apparently a k8s StatefulSet.)
I've then configured a WebSphere Liberty container/Deployment to try to use that Infinispan cluster for its sessions, as described in https://github.com/WASdev/ci.docker#session-caching.
Both the Infinispan cluster and the Liberty Deployment are in the same Project/namespace.
However, the Liberty container fails to connect, and the Infinispan containers are reporting several warnings of their own.
The Liberty container "client" log:
INFINISPAN_SERVICE_NAME(original): session-infinispan
INFINISPAN_SERVICE_NAME(normalized): SESSION_INFINISPAN
INFINISPAN_HOST: 172.30.137.86
INFINISPAN_PORT: 11222
INFINISPAN_USER: developer
INFINISPAN_PASS: <redacted>
Launching defaultServer (WebSphere Application Server 21.0.0.3/wlp-1.0.50.cl210320210309-1101) on Eclipse OpenJ9 VM, version 1.8.0_282-b08 (en_US)
[AUDIT ] CWWKE0001I: The server defaultServer has been launched.
[AUDIT ] CWWKE0100I: This product is licensed for development, and limited production use. The full license terms can be viewed here: https://public.dhe.ibm.com/ibmdl/export/pub/software/websphere/wasdev/license/base_ilan/ilan/21.0.0.3/lafiles/en.html
[AUDIT ] CWWKG0093A: Processing configuration drop-ins resource: /opt/ibm/wlp/usr/servers/defaultServer/configDropins/defaults/keystore.xml
[AUDIT ] CWWKG0093A: Processing configuration drop-ins resource: /opt/ibm/wlp/usr/servers/defaultServer/configDropins/overrides/infinispan-client-sessioncache.xml
[AUDIT ] CWWKZ0058I: Monitoring dropins for applications.
[AUDIT ] CWWKT0016I: Web application available (default_host): http://payment-engine-6dcc5b6d5-jclx2:9080/payment/
[ERROR ] ISPN004007: Exception encountered. Retry 10 out of 10
org.infinispan.client.hotrod.exceptions.TransportException:: ISPN004071: Connection to 172.30.137.86/172.30.137.86:11222 was closed while waiting for response.
[ERROR ] SESN0307E: An exception occurred when initializing the cache. The exception is: org.infinispan.client.hotrod.exceptions.TransportException:: org.infinispan.client.hotrod.exceptions.TransportException:: ISPN004071: Connection to 172.30.137.86/172.30.137.86:11222 was closed while waiting for response.
at org.infinispan.client.hotrod.impl.transport.netty.ActivationHandler.exceptionCaught(ActivationHandler.java:53)
at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:300)
...
What looks like the relevant part of the Infinispan container log:
03:40:18,628 WARN (SINGLE_PORT-ServerIO-4-2) [io.netty.handler.ssl.ApplicationProtocolNegotiationHandler] [id: 0xc39380c8, L:/10.254.0.248:11222 ! R:/10.254.2.65:32986] TLS handshake failed: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: a0061e21000003ffffffff0f0000
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1254)
(Actually, there are several Infinispan startup WARNs, mostly about deprecated capabilities. But this is the only one with a stack trace, so I'm jumping to the conclusion that it might be "the culprit")
Also, this is the Infinispan Service, so you can see the IP and port match what the Liberty container is using:
Working through this on the Infinispan chat service, it does appear that there's incorrect or incomplete setup of SSL/TLS.
I had attempted to remove encryption in the Infinispan cluster, but I either didn't sufficiently restart components or you can't change it after the fact. Removing the cluster and recreating with it disabled, though, enabled the Liberty communication to work.
The following CR YAML works:
apiVersion: infinispan.org/v1
kind: Infinispan
metadata:
  name: session-infinispan
spec:
  replicas: 1
  service:
    type: Cache
  security:
    endpointEncryption:
      type: None
Now to pursue what's missing from the Liberty setup to make use of SSL correctly. The Infinispan chat conversation says that this Liberty XML setup from the official image:
<server>
<featureManager>
<feature>sessionCache-1.0</feature>
</featureManager>
<httpSessionCache libraryRef="InfinispanLib">
<properties infinispan.client.hotrod.server_list="${INFINISPAN_HOST}:${INFINISPAN_PORT}"/>
<properties infinispan.client.hotrod.marshaller="org.infinispan.commons.marshall.JavaSerializationMarshaller"/>
<properties infinispan.client.hotrod.java_serial_whitelist=".*"/>
<properties infinispan.client.hotrod.auth_username="${INFINISPAN_USER}"/>
<properties infinispan.client.hotrod.auth_password="${INFINISPAN_PASS}"/>
<properties infinispan.client.hotrod.auth_realm="default"/>
<properties infinispan.client.hotrod.sasl_mechanism="DIGEST-MD5"/>
<properties infinispan.client.hotrod.auth_server_name="infinispan"/>
</httpSessionCache>
<httpSessionCache enableBetaSupportForInfinispan="true"/> <!-- TODO remove once no longer gated -->
<library id="InfinispanLib">
<fileset dir="${shared.resource.dir}/infinispan" includes="*.jar"/>
</library>
</server>
Needs the following properties added:
# Encryption
infinispan.client.hotrod.sni_host_name=$SERVICE_HOSTNAME
# Path to the TLS certificate.
# Clients automatically generate trust stores from certificates.
infinispan.client.hotrod.trust_store_path=/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
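As a rough sketch, these client properties could be expressed in the same style the Liberty configuration above already uses, one attribute per properties element. Two things here are assumptions rather than part of the original advice: ${SERVICE_HOSTNAME} stands in for the $SERVICE_HOSTNAME placeholder and would have to be defined as a Liberty variable or environment entry, and the infinispan.client.hotrod.use_ssl line is the standard Hot Rod switch for enabling TLS, which may or may not be needed in addition to the two properties listed.

```xml
<httpSessionCache libraryRef="InfinispanLib">
    <!-- The existing server_list, marshaller and auth properties stay unchanged. -->

    <!-- Assumption: explicitly enable TLS on the Hot Rod client. -->
    <properties infinispan.client.hotrod.use_ssl="true"/>
    <!-- SERVICE_HOSTNAME is a placeholder variable that must be defined separately. -->
    <properties infinispan.client.hotrod.sni_host_name="${SERVICE_HOSTNAME}"/>
    <!-- Service CA certificate mounted into the pod by OpenShift. -->
    <properties infinispan.client.hotrod.trust_store_path="/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt"/>
</httpSessionCache>
```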

JDWP Transport dt_socket failed to initialize, TRANSPORT_INIT(510)

I want to run two deployed applications (.ear) in two instances of JBoss 6.0 at the same time.
I have changed all the ports used in both standalone.xml files, including http, management-http, etc.
Like this:
application1.ear : socket-binding name="http" port="8080"
application2.ear : socket-binding name="http" port="8081"
application1.ear : name="management-http" port="9990"
application2.ear : name="management-http" port="9991"
Any help is appreciated.
The following are two ways to run multiple JBoss instances on the same server.
Bind each instance to a different IP address
This is the easiest and most recommended way to solve this problem. If the server has multiple NICs then this is simple. If not, then one must "multi-home" the server. In other words, assign the server more than one IP address through OS configuration. Start the instances like so:
$JBOSS_HOME1/bin/run.sh -b <ip-addr-1>
$JBOSS_HOME2/bin/run.sh -b <ip-addr-2>
The same $JBOSS_HOME can be used with multiple "profiles" in $JBOSS_HOME/server. For example:
$JBOSS_HOME/bin/run.sh -b <ip-addr-1> -c node1
$JBOSS_HOME/bin/run.sh -b <ip-addr-2> -c node2
Service Binding Manager
Configure the "Service Binding Manager" to tell the JBoss instances which ports to use.
Uncomment the "jboss.system:service=ServiceBindingManager" MBean in $JBOSS_HOME/server/$PROFILE/conf/jboss-service.xml.
<mbean code="org.jboss.services.binding.ServiceBindingManager"
name="jboss.system:service=ServiceBindingManager">
<attribute name="ServerName">ports-01</attribute>
<attribute name="StoreURL">${jboss.home.url}/docs/examples/binding-manager/sample-bindings.xml</attribute>
<attribute name="StoreFactoryClassName">
org.jboss.services.binding.XMLServicesStoreFactory
</attribute>
</mbean>
This tells JBoss to use the port numbering scheme defined by "ports-01" in $JBOSS_HOME/docs/examples/binding-manager/sample-bindings.xml. This scheme adds 100 to every default port. For example, the JNDI port is 1099 by default but 1199 using the ports-01 scheme; the HTTP port is 8080 by default but 8180 using the ports-01 scheme. The sample-bindings.xml file contains 4 port schemes:
ports-default
ports-01
ports-02
ports-03
You may want to configure the port set used at start up from the command line or through a system property. If so, adjust the MBean's ServerName to refer to a system property, for example:
<mbean code="org.jboss.services.binding.ServiceBindingManager"
name="jboss.system:service=ServiceBindingManager">
<attribute name="ServerName">${jboss.service.binding.set:ports-default}</attribute>
<attribute name="StoreURL">${jboss.home.url}/docs/examples/binding-manager/sample-bindings.xml</attribute>
<attribute name="StoreFactoryClassName">
org.jboss.services.binding.XMLServicesStoreFactory
</attribute>
</mbean>
Now change it through the following property directly on run.sh/run.bat or add it to your run.conf options:
-Djboss.service.binding.set=ports-01
If you need more than the 4 port sets defined in sample-bindings.xml by default, please refer to the following article for JBoss EAP 6:
https://access.redhat.com/site/solutions/237933
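For a standalone.xml-based installation like the one in the question, a port offset is another way to separate two instances. The fragment below is a minimal sketch, assuming the default socket-binding-group layout; the offset of 100 is an arbitrary example value.

```xml
<!-- Sketch for the second instance's standalone.xml:
     every socket binding in the group is shifted by the offset. -->
<socket-binding-group name="standard-sockets" default-interface="public"
                      port-offset="${jboss.socket.binding.port-offset:100}">
    <!-- effective port 8180 -->
    <socket-binding name="http" port="8080"/>
    <!-- effective port 10090 -->
    <socket-binding name="management-http" interface="management" port="9990"/>
</socket-binding-group>
```

The same offset can be supplied at startup with -Djboss.socket.binding.port-offset=100. Note that a JDWP debug address configured through JAVA_OPTS is not a socket binding, so it is likely unaffected by the offset and has to differ between the two instances as well.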

Database connection issue - App unable to recover

I have a Java EE Web App running on JBoss AS 7.2 connecting to a Postgresql 9.4 database (hosted on RDS).
The App is quite large and does a mixture of web page serving, API calls and Scheduled Tasks
More and more frequently I have to reboot the application server because the whole app has ground to a halt. Checking the DB stats, I can see that the number of connections has gone through the roof, along with database CPU
(a big spike as the app stops responding; as soon as I restart JBoss it drops back).
The database logs show that the connection to the client has been lost:
LOG: could not send data to client: Broken pipe
FATAL: connection to client lost
The jboss logs start filling up as transactions time-out...
Caused by: javax.transaction.RollbackException: ARJUNA016063: The transaction is not active!
The only way to fix is to restart JBoss and the number of connections goes back to normal.
My DB datasource configuration looks like this..
<datasource jta="false" jndi-name="java:/appWebDatasource" pool-name="jdbc/appWebDatasource" enabled="true" use-java-context="true" use-ccm="false">
<connection-url>jdbc:postgresql://${web.db.url}/MyApp</connection-url>
<driver>postgresql</driver>
<security>
<user-name>jboss</user-name>
<password>******</password>
</security>
<validation>
<check-valid-connection-sql>select 1</check-valid-connection-sql>
<validate-on-match>false</validate-on-match>
<background-validation>true</background-validation>
</validation>
<statement>
<share-prepared-statements>false</share-prepared-statements>
</statement>
</datasource>
I have been checking the pg_stat_activity table as soon as the issue occurs, and there are no idle-in-transaction connections; they are all either idle or active.
So my question is: how can I configure JBoss or PostgreSQL to stop this increase in the number of connections that crashes the app?
You can cap the number of connections by declaring the maximum pool size you want to allow with the <max-pool-size> parameter.
You have to consider your application's needs and choose an appropriate value for <max-pool-size>.
You should also use the connection validation mechanism, along with the parameter already mentioned by DaveB, in the datasource configuration, as described in the documentation.
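A minimal sketch of how the datasource from the question might look with a bounded pool and the PostgreSQL validation classes shipped with the JBoss JCA adapters. All numeric values are illustrative assumptions, not recommendations; they need to be sized against the application's load and the database's connection limit.

```xml
<datasource jta="false" jndi-name="java:/appWebDatasource" pool-name="jdbc/appWebDatasource" enabled="true" use-java-context="true" use-ccm="false">
    <connection-url>jdbc:postgresql://${web.db.url}/MyApp</connection-url>
    <driver>postgresql</driver>
    <pool>
        <!-- Illustrative sizes: the pool never opens more than max-pool-size connections. -->
        <min-pool-size>5</min-pool-size>
        <max-pool-size>30</max-pool-size>
    </pool>
    <security>
        <user-name>jboss</user-name>
        <password>******</password>
    </security>
    <validation>
        <!-- Driver-specific checker used in place of the plain "select 1" statement. -->
        <valid-connection-checker class-name="org.jboss.jca.adapters.jdbc.extensions.postgres.PostgreSQLValidConnectionChecker"/>
        <validate-on-match>false</validate-on-match>
        <background-validation>true</background-validation>
        <background-validation-millis>60000</background-validation-millis>
        <!-- Marks connections as broken on fatal PostgreSQL errors so they are evicted. -->
        <exception-sorter class-name="org.jboss.jca.adapters.jdbc.extensions.postgres.PostgreSQLExceptionSorter"/>
    </validation>
    <timeout>
        <!-- Fail fast instead of queueing forever when the pool is exhausted. -->
        <blocking-timeout-millis>5000</blocking-timeout-millis>
        <idle-timeout-minutes>5</idle-timeout-minutes>
    </timeout>
    <statement>
        <share-prepared-statements>false</share-prepared-statements>
    </statement>
</datasource>
```

With a cap in place, a connection leak in the application shows up as blocking-timeout errors in the JBoss log rather than as an unbounded number of sessions on the database.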

Restcomm - Solving SMSC GW 7.2 configuration failures

We configured the latest version (7.2) of SMSC-GW to work on our server with its environment (Cassandra and such). However, after setting everything up, some failures appear that did not appear in previous versions.
First, when connecting the simulators and the gateway using the default settings (JSS7 <-> SMSCGW <-> SMPP):
JSS7 is connected and sending, but no response is received.
SMPP is connected to SMSC-GW and the ESME is bound. SMPP tries to send to SS7 but receives a failure response PDU from the SMSC-GW.
I tried configuring DB routing rules, but that did not work.
Also, the log in the SMSC-GW server is frequently displaying the following message:
16:00:28,504 INFO [SchedulerResourceAdaptor] (pool-56-thread-1) Not all SBB are running now: ServicesDownList=[smscTxSmppServerServiceState, smscRxSmppServerServiceState, smscTxSipServerServiceState, smscRxSipServerServiceState, smscTxHttpServerServiceState, moServiceState, homeRoutingServiceState, mtServiceState, alertServiceState, chargingServiceState, ]
And the JSS7 management console GUI is displaying this (which looks wrong):
So are these the source of the SMSC-GW failures?
UPDATE: I found this error in the server.log
2017-02-02 10:57:42,005 WARN [org.mobicents.slee.container.deployment.jboss.SleeContainerDeployerImpl] (SLEE-InternalDeployer-thread-1) SLEE DUs not deployed, due to missing dependencies: file:/home/coreteam/kitchensink/restcomm-smsc-7.2.109/jboss-5.1.0.GA/server/simulator/deploy/smsc-services-du-7.2.109.jar/
Followed by:
EventTypeID[name=org.mobicents.smsc.slee.services.smpp.server.events.SS7_SEND_MT,vendor=org.mobicents,version=1.0]
ResourceAdaptorTypeID[name=PersistenceResourceAdaptorType,vendor=org.mobicents,version=1.0]
ResourceAdaptorTypeID[name=SchedulerResourceAdaptorType,vendor=org.mobicents,version=1.0]
SipRA
EventTypeID[name=org.mobicents.smsc.slee.services.smpp.server.events.SS7_SEND_RSDS,vendor=org.mobicents,version=1.0]
SchedulerResourceAdaptor
PersistenceResourceAdaptor
EventTypeID[name=org.mobicents.smsc.slee.services.smpp.server.events.SMPP_SM,vendor=org.mobicents,version=1.0]
EventTypeID[name=org.mobicents.smsc.slee.services.smpp.server.events.SS7_SM,vendor=org.mobicents,version=1.0]
EventTypeID[name=org.mobicents.smsc.slee.services.smpp.server.events.SIP_SM,vendor=org.mobicents,version=1.0]
2017-02-02 14:41:17,450 WARN [org.mobicents.slee.container.deployment.jboss.DeploymentManager] (main) Unable to INSTALL smsc-services-du-7.3.0-SNAPSHOT.jar right now. Waiting for dependencies to be resolved.
I solved this quite a while ago, but thought I would share. I simply installed the missing SipRA dependency by adding the following to the deploy-config.xml file in the $JBOSS_HOME/server/profile_name/deploy/restcomm-slee directory:
<ra-entity
resource-adaptor-id="ResourceAdaptorID[name=JainSipResourceAdaptor,vendor=net.java.slee.sip,version=1.2]"
entity-name="SipRA">
<properties>
<property name="javax.sip.PORT" type="java.lang.Integer" value="5060" />
</properties>
<ra-link name="SipRA" />
</ra-entity>
I set the port to a different value, since 5060 was already taken by another service.
The smsc-services-du-7.2.109.jar then installed automatically the next time I ran the SMSC-GW.

Glassfish Server Adapter in Eclipse (java.lang.IllegalArgumentException: port out of range:1118080)

I deploy my web app to a GlassFish 3.1.2 server. I use Eclipse Juno SR1 with the GlassFish Server adapter.
But as of today, if I click on the server I get an error.
java.lang.IllegalArgumentException: port out of range:1118080
at java.net.InetSocketAddress.<init>(InetSocketAddress.java:118)
at com.sun.enterprise.jst.server.sunappsrv.SunAppServer.isRunning(SunAppServer.java:590)
at com.sun.enterprise.jst.server.sunappsrv.SunAppServer.isRunning(SunAppServer.java:583)
at com.sun.enterprise.jst.server.sunappsrv.actions.AppServerContextAction.acceptIfServerRunning(AppServerContextAction.java:171)
at com.sun.enterprise.jst.server.sunappsrv.actions.OpenBrowserAction.accept(OpenBrowserAction.java:44)
at com.sun.enterprise.jst.server.sunappsrv.actions.AppServerContextAction.selectionChanged(AppServerContextAction.java:234)
at org.eclipse.ui.actions.SelectionProviderAction.selectionChanged(SelectionProviderAction.java:143)
at org.eclipse.jface.viewers.Viewer$2.run(Viewer.java:164)
at org.eclipse.core.runtime.SafeRunner.run(SafeRunner.java:42)
at org.eclipse.ui.internal.JFaceUtil$1.run(JFaceUtil.java:49)
at org.eclipse.jface.util.SafeRunnable.run(SafeRunnable.java:175)
at org.eclipse.jface.viewers.Viewer.fireSelectionChanged(Viewer.java:162)
at org.eclipse.jface.viewers.StructuredViewer.updateSelection(StructuredViewer.java:2188)
So I can't start or stop the server, or deploy my web apps.
Any ideas?
Thx Tim
The maximum port number is 65535, so check which TCP port your server is configured to listen on and set it to a number no greater than 65535.
I had the same error. In my case the config/domain.xml of my domain was empty (0 bytes, no content). Deleting the domain with
asadmin delete-domain --domaindir glassfish3/glassfish/domains domain1
didn't work either. So I had to delete the domain directory and recreate the domain (with asadmin).
That worked for me.
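For orientation, the listener ports live in the network-listeners section of the domain's config/domain.xml. The fragment below is a sketch based on the stock GlassFish 3.1.2 defaults; your file may differ, and each port value must be an integer between 1 and 65535.

```xml
<!-- Sketch of the default listener definitions in domain.xml.
     A port value such as 1118080 here would be out of range. -->
<network-listeners>
    <network-listener port="8080" protocol="http-listener-1" transport="tcp" name="http-listener-1" thread-pool="http-thread-pool"/>
    <network-listener port="8181" protocol="http-listener-2" transport="tcp" name="http-listener-2" thread-pool="http-thread-pool"/>
    <network-listener port="4848" protocol="admin-listener" transport="tcp" name="admin-listener" thread-pool="admin-thread-pool"/>
</network-listeners>
```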
Steffen
I got this error, and looking at domain.xml I realized that my connection pool description contained unescaped special characters. Removing those characters was enough to get GlassFish working again.
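If the characters are needed, escaping them should also keep domain.xml well-formed. A hypothetical sketch follows: the pool name, description text, and datasource class are invented for illustration and are not taken from the poster's configuration.

```xml
<!-- Hypothetical pool entry: "&" and "<" inside an attribute value
     must be written as &amp; and &lt; for the file to parse. -->
<jdbc-connection-pool name="OrdersPool" res-type="javax.sql.DataSource"
                      datasource-classname="org.postgresql.ds.PGSimpleDataSource"
                      description="Orders &amp; billing pool (size &lt; 32)">
    <property name="serverName" value="localhost"/>
</jdbc-connection-pool>
```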