Hibernate Search +infinispan +jgroups back end slave lock issue - hibernate-search

I am new to hibernate search. We decided to user hibernate search for my application. We choose jgroups as a backend. Here is my configuration file.
<?xml version="1.0" encoding="UTF-8"?>
<infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="urn:infinispan:config:7.0
http://www.infinispan.org/schemas/infinispan-config-7.0.xsd
urn:infinispan:config:store:jdbc:7.0 http://www.infinispan.org/schemas/infinispan-cachestore-jdbc-config-7.0.xsd"
xmlns="urn:infinispan:config:7.0"
xmlns:jdbc="urn:infinispan:config:store:jdbc:7.0">
<!-- *************************** -->
<!-- System-wide global settings -->
<!-- *************************** -->
<jgroups>
<!-- Note that the JGroups transport uses sensible defaults if no configuration
property is defined. See the JGroupsTransport javadocs for more flags.
jgroups-udp.xml is the default stack bundled in the Infinispan core jar: integration
and tuning are tested by Infinispan. -->
<stack-file name="default-jgroups-tcp" path="proform-jgroups.xml" />
</jgroups>
<cache-container name="HibernateSearch" default-cache="default" statistics="false" shutdown-hook="DONT_REGISTER">
<transport stack="default-jgroups-tcp" cluster="venkatcluster"/>
<!-- Duplicate domains are allowed so that multiple deployments with default configuration
of Hibernate Search applications work - if possible it would be better to use JNDI to share
the CacheManager across applications -->
<jmx duplicate-domains="true" />
<!-- *************************************** -->
<!-- Cache to store Lucene's file metadata -->
<!-- *************************************** -->
<replicated-cache name="LuceneIndexesMetadata" mode="SYNC" remote-timeout="25000">
<transaction mode="NONE"/>
<state-transfer enabled="true" timeout="480000" await-initial-transfer="true" />
<indexing index="NONE" />
<eviction max-entries="-1" strategy="NONE"/>
<expiration max-idle="-1"/>
<persistence passivation="false">
<jdbc:string-keyed-jdbc-store preload="true" fetch-state="true" read-only="false" purge="false">
<property name="key2StringMapper">org.infinispan.lucene.LuceneKey2StringMapper</property>
<jdbc:connection-pool connection-url="jdbc:mysql://localhost:3306/entityindex" driver="com.mysql.jdbc.Driver" password="pf_user1!" username="pf_user"></jdbc:connection-pool>
<jdbc:string-keyed-table drop-on-exit="false" create-on-start="true" prefix="ISPN_STRING_TABLE">
<jdbc:id-column name="ID" type="VARCHAR(255)"/>
<jdbc:data-column name="DATA" type="BLOB"/>
<jdbc:timestamp-column name="TIMESTAMP" type="BIGINT"/>
</jdbc:string-keyed-table>
</jdbc:string-keyed-jdbc-store>
</persistence>
</replicated-cache>
<!-- **************************** -->
<!-- Cache to store Lucene data -->
<!-- **************************** -->
<distributed-cache name="LuceneIndexesData" mode="SYNC" remote-timeout="25000">
<transaction mode="NONE"/>
<state-transfer enabled="true" timeout="480000" await-initial-transfer="true" />
<indexing index="NONE" />
<eviction max-entries="-1" strategy="NONE"/>
<expiration max-idle="-1"/>
<persistence passivation="false">
<jdbc:string-keyed-jdbc-store preload="true" fetch-state="true" read-only="false" purge="false">
<property name="key2StringMapper">org.infinispan.lucene.LuceneKey2StringMapper</property>
<jdbc:connection-pool connection-url="jdbc:mysql://localhost:3306/entityindex" driver="com.mysql.jdbc.Driver" password="pf_user1!" username="pf_user"></jdbc:connection-pool>
<jdbc:string-keyed-table drop-on-exit="false" create-on-start="true" prefix="ISPN_STRING_TABLE">
<jdbc:id-column name="ID" type="VARCHAR(255)"/>
<jdbc:data-column name="DATA" type="BLOB"/>
<jdbc:timestamp-column name="TIMESTAMP" type="BIGINT"/>
</jdbc:string-keyed-table>
</jdbc:string-keyed-jdbc-store>
</persistence>
</distributed-cache>
<!-- ***************************** -->
<!-- Cache to store Lucene locks -->
<!-- ***************************** -->
<replicated-cache name="LuceneIndexesLocking" mode="SYNC" remote-timeout="25000">
<transaction mode="NONE"/>
<state-transfer enabled="true" timeout="480000" await-initial-transfer="true" />
<indexing index="NONE" />
<eviction max-entries="-1" strategy="NONE"/>
<expiration max-idle="-1"/>
<persistence passivation="false">
<jdbc:string-keyed-jdbc-store preload="true" fetch-state="true" read-only="false" purge="false">
<property name="key2StringMapper">org.infinispan.lucene.LuceneKey2StringMapper</property>
<jdbc:connection-pool connection-url="jdbc:mysql://localhost:3306/entityindex" driver="com.mysql.jdbc.Driver" password="pf_user1!" username="pf_user"></jdbc:connection-pool>
<jdbc:string-keyed-table drop-on-exit="false" create-on-start="true" prefix="ISPN_STRING_TABLE">
<jdbc:id-column name="ID" type="VARCHAR(255)"/>
<jdbc:data-column name="DATA" type="BLOB"/>
<jdbc:timestamp-column name="TIMESTAMP" type="BIGINT"/>
</jdbc:string-keyed-table>
</jdbc:string-keyed-jdbc-store>
</persistence>
</replicated-cache>
</cache-container>
This is my jgroups-file:
<config xmlns="urn:org:jgroups"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="urn:org:jgroups
http://www.jgroups.org/schema/JGroups-3.6.xsd">
<TCP bind_addr="${jgroups.tcp.address:127.0.0.1}"
bind_port="${jgroups.tcp.port:7801}"
enable_diagnostics="false"
thread_naming_pattern="pl"
send_buf_size="640k"
sock_conn_timeout="300"
thread_pool.min_threads="${jgroups.thread_pool.min_threads:2}"
thread_pool.max_threads="${jgroups.thread_pool.max_threads:30}"
thread_pool.keep_alive_time="60000"
thread_pool.queue_enabled="false"
internal_thread_pool.min_threads=
"${jgroups.internal_thread_pool.min_threads:5}"
internal_thread_pool.max_threads=
"${jgroups.internal_thread_pool.max_threads:20}"
internal_thread_pool.keep_alive_time="60000"
internal_thread_pool.queue_enabled="true"
internal_thread_pool.queue_max_size="500"
oob_thread_pool.min_threads="${jgroups.oob_thread_pool.min_threads:20}"
oob_thread_pool.max_threads="${jgroups.oob_thread_pool.max_threads:200}"
oob_thread_pool.keep_alive_time="60000"
oob_thread_pool.queue_enabled="false"
/>
<S3_PING access_key=""
secret_access_key=""
location="mybucket"
/>
<MERGE3 min_interval="10000"
max_interval="30000"
/>
<FD_SOCK />
<FD_ALL timeout="60000"
interval="15000"
timeout_check_interval="5000"
/>
<VERIFY_SUSPECT timeout="5000" />
<pbcast.NAKACK2 use_mcast_xmit="false"
xmit_interval="1000"
xmit_table_num_rows="50"
xmit_table_msgs_per_row="1024"
xmit_table_max_compaction_time="30000"
max_msg_batch_size="100"
resend_last_seqno="true"
/>
<UNICAST3 xmit_interval="500"
xmit_table_num_rows="50"
xmit_table_msgs_per_row="1024"
xmit_table_max_compaction_time="30000"
max_msg_batch_size="100"
conn_expiry_timeout="0"
/>
<pbcast.STABLE stability_delay="500"
desired_avg_gossip="5000"
max_bytes="1M"
/>
<pbcast.GMS print_local_addr="false"
join_timeout="15000"
/>
<MFC max_credits="2m"
min_threshold="0.40"
/>
<FRAG2 />
</config>
This is my flush-tcp file:-
<config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="urn:org:jgroups"
xsi:schemaLocation="urn:org:jgroups
http://www.jgroups.org/schema/jgroups.xsd">
<TCP bind_port="7801"/>
<S3_PING access_key=""
secret_access_key=""
location=""
/>
<MERGE3/>
<FD_SOCK/>
<FD/>
<VERIFY_SUSPECT/>
<pbcast.NAKACK2 use_mcast_xmit="false"/>
<UNICAST3/>
<pbcast.STABLE/>
<pbcast.GMS/>
<MFC/>
<FRAG2/>
<pbcast.STATE_TRANSFER/>
<pbcast.FLUSH timeout="0"/>
</config>
These are hibernate settings:
propertyMap.put("hibernate.search.default.directory_provider",
"infinispan");
propertyMap.put("hibernate.search.lucene_version",
KeywordUtil.LUCENE_4_10_4);
propertyMap.put("hibernate.search.infinispan.configuration_resourcename",
"hibernate-search-infinispan-config.xml");
propertyMap.put("hibernate.search.default.​worker.execution","sync");
propertyMap.put("hibernate.search.default.​worker.backend","jgroups");
propertyMap.put("hibernate.search.services.jgroups.configurationFile",
"flush-tcp.xml");
propertyMap.put("hibernate.search.default.exclusive_index_use","true");
Initially we start the cluster with one node with the above configuration. Depends on the load we will add nodes to the cluster. This is our architecture.
Assume that 10-00 AM we started the cluster. only node will become master node. and everthing is fine.
10-10 Am we added one more node to the cluster with slight config change. Here is the change
propertyMap.put("hibernate.search.default.exclusive_index_use","false");
I created an index through second node. Then the locking error comes up. here is the error.
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
org.infinispan.lucene.locking.BaseLuceneLock#46578a74
Problem:- In theory second node should become slave and it should never acquire lock on the index. It should indicate the master node to create the index through jgroups channel. But its not happening. Can one of you please help me on this. Our production system is in problem. Please help me on this.

Problem:- In theory second node should become slave and it should
never acquire lock on the index. It should indicate the master node to
create the index through jgroups channel.
There may be two problems here.
1. Using different values for exclusive_index_use
Maybe someone else can confirm, but unless your new node only deals with a completely separate persistent unit with completely different indexes, I doubt it's a good idea to use a different value for exclusive_index_use on different nodes.
exclusive_index_use is not about not acquiring locks, it's more about releasing them as soon as possible (after each changeset). If your other nodes work in exclusive node, they will never release locks, and your new node will time out waiting for the lock.
Also note that disabling exclusive_index_use is a sure way to decrease write performance, because it requires to constantly close and open index writers. Use with caution.
And finally, as you pointed out, only one node should write to the index at any given time (the JGroups master), so disabling exclusive_index_use should not be needed in your case. There must be another problem...
2. Master/slave election
If I remember correctly, the default master/node election strategy elect a new master when you add a new node. Also, we fixed a few bugs related to dynamic master election in the latest Hibernate Search version (not released yet), so you may be affected by one of those.
You can try using the jgroupsMaster backend on your first node and jgroupsSlave on your second node. There won't be any automatic master election anymore, so you'll lose the ability to maintain service when the master node fails, but from what I understand your main concern is scaling up, so it might give you a temporary solution.
On the master node:
propertyMap.put("hibernate.search.default.​worker.backend","jgroupsMaster");
On the slave node:
propertyMap.put("hibernate.search.default.​worker.backend","jgroupsSlave");
WARNING: you will need a full restart! Keeping the current jgroups backend on your master while adding another node with the jgroupsSlave backend will result in trouble!
You may also need some configuration changes regarding your Infinispan directory, but I'm not familiar with this directory. You can check the documentation: https://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#jgroups-backend

Related

Getting 502 http status code on a Service Fabric stateless service deployed on lesser node than configured VM Scaleset nodes

We have deployed various stateless services on a 5 node cluster with -1 as instance count as Singleton partition scheme. Recently, we decided to deploy the few stateless services only on 3 nodes out of 5 by defining instance count as 3.
After deployment, the stateless services with -1 as instance count are working and responding with HttpStatus 200 Ok. however, a stateless service deployed with 3 instance node count are intermittently responding with HttpStatus 502 with following error (from fiddler):
The connection to 'someservername.centralus.cloudapp.azure.com' failed.
System.Security.SecurityException Failed to negotiate HTTPS connection with server.fiddler.network.https> HTTPS handshake to someservername.centralus.cloudapp.azure.com failed. System.IO.IOException Authentication failed because the remote party has closed the transport stream.
Below is the application manifest of deployed application for reference
<ApplicationManifest xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" ApplicationTypeName="MyService.ServiceFabricType" ApplicationTypeVersion="1.0.0.1.1" ManifestId="8747c387-a7fc-4b05-b189-b1c01958f066" xmlns="http://schemas.microsoft.com/2011/01/fabric">
<Parameters>
<Parameter Name="My_Service_ASPNETCORE_ENVIRONMENT" DefaultValue="" />
<Parameter Name="My_Service_InstanceCount" DefaultValue="3" />
</Parameters>
<ServiceManifestImport>
<ServiceManifestRef ServiceManifestName="MyServicePkg" ServiceManifestVersion="1.0.0.1.1" />
<ConfigOverrides />
<EnvironmentOverrides CodePackageRef="code">
<EnvironmentVariable Name="ASPNETCORE_ENVIRONMENT" Value="[My_Service_ASPNETCORE_ENVIRONMENT]" />
</EnvironmentOverrides>
</ServiceManifestImport>
<DefaultServices>
<Service Name="MyService" ServicePackageActivationMode="ExclusiveProcess">
<StatelessService ServiceTypeName="MyServiceType" InstanceCount="[My_Service_InstanceCount]">
<SingletonPartition />
</StatelessService>
</Service>
</DefaultServices>
</ApplicationManifest>
and service manifest :
<ServiceManifest xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" ManifestId="59ea463b-5e4c-44f5-8982-5658b35d6c89" Name="MyServicePkg" Version="1.0.0.1.1" xmlns="http://schemas.microsoft.com/2011/01/fabric">
<ServiceTypes>
<StatelessServiceType ServiceTypeName="MyService" />
</ServiceTypes>
<CodePackage Name="Code" Version="1.0.0.1.1">
<EntryPoint>
<ExeHost>
<Program>MyService.exe</Program>
<WorkingFolder>CodePackage</WorkingFolder>
</ExeHost>
</EntryPoint>
<EnvironmentVariables>
<EnvironmentVariable Name="ASPNETCORE_ENVIRONMENT" Value="" />
</EnvironmentVariables>
</CodePackage>
<ConfigPackage Name="Config" Version="1.0.0.1.1" />
<Resources>
<Endpoints>
<Endpoint Name="ServiceEndpoint" Protocol="https" Type="Input" Port="9226" />
</Endpoints>
</Resources>
</ServiceManifest>
Is it mandatory to deploy a stateless service all nodes in service fabric?
If no, how the above scenario can be configured?
Note - Currently Service Fabric is configured with Silver durability tier and with reverse proxy in disabled state. Also did not get any relevant solution from this azure documentation.

Service Fabric Explorer Health State Unknown

The node in my partition keeps switching between Health State = OK and Health State = unknown.
Sometimes the node disappears.
I have tried deleting the service, the app and unprovisioning the type, then redeploying, however I get the same problem.
It is a Service Fabric stateful service, and it's running fine locally, the issue I'm having is only in my dev environment.
I'm using 5 nodes.
ServiceManifest.xml:
<?xml version="1.0" encoding="utf-8"?>
<ServiceManifest Name="Integration.Optical.ServicePkg"
Version="1.0.0"
xmlns="http://schemas.microsoft.com/2011/01/fabric"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ServiceTypes>
<!-- This is the name of your ServiceType.
This name must match the string used in the RegisterServiceAsync call in Program.cs. -->
<StatefulServiceType ServiceTypeName="Integration.Optical.ServiceType" />
</ServiceTypes>
<!-- Code package is your service executable. -->
<CodePackage Name="Code" Version="1.0.0">
<EntryPoint>
<ExeHost>
<Program>Integration.Optical.Service.exe</Program>
<WorkingFolder>CodePackage</WorkingFolder>
</ExeHost>
</EntryPoint>
<EnvironmentVariables>
<EnvironmentVariable Name="ASPNETCORE_ENVIRONMENT" Value=""/>
<EnvironmentVariable Name="KEYVAULT_ENDPOINT" Value=""/>
</EnvironmentVariables>
</CodePackage>
<!-- Config package is the contents of the Config directoy under PackageRoot that contains an
independently-updateable and versioned set of custom configuration settings for your service. -->
<ConfigPackage Name="Config" Version="1.0.0" />
<Resources>
<Endpoints>
<!-- This endpoint is used by the communication listener to obtain the port on which to
listen. Please note that if your service is partitioned, this port is shared with
replicas of different partitions that are placed in your code. -->
<Endpoint Name="ServiceEndpoint" />
<!-- This endpoint is used by the replicator for replicating the state of your service.
This endpoint is configured through a ReplicatorSettings config section in the Settings.xml
file under the ConfigPackage. -->
<Endpoint Name="ReplicatorEndpoint" />
</Endpoints>
</Resources>
</ServiceManifest>
ApplicationManifest.xml:
<?xml version="1.0" encoding="utf-8"?>
<ApplicationManifest xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" ApplicationTypeName="Integration.OpticalType" ApplicationTypeVersion="1.0.0" xmlns="http://schemas.microsoft.com/2011/01/fabric">
<Parameters>
<Parameter Name="Integration.Optical.Service_ASPNETCORE_ENVIRONMENT" DefaultValue="" />
<Parameter Name="Integration.Optical.Service_KEYVAULT_ENDPOINT" DefaultValue="" />
<Parameter Name="Integration.Optical.Service_MinReplicaSetSize" DefaultValue="3" />
<Parameter Name="Integration.Optical.Service_PartitionCount" DefaultValue="1" />
<Parameter Name="Integration.Optical.Service_TargetReplicaSetSize" DefaultValue="3" />
</Parameters>
<!-- Import the ServiceManifest from the ServicePackage. The ServiceManifestName and ServiceManifestVersion
should match the Name and Version attributes of the ServiceManifest element defined in the
ServiceManifest.xml file. -->
<ServiceManifestImport>
<ServiceManifestRef ServiceManifestName="Integration.Optical.ServicePkg" ServiceManifestVersion="1.0.0" />
<ConfigOverrides />
<EnvironmentOverrides CodePackageRef="code">
<EnvironmentVariable Name="ASPNETCORE_ENVIRONMENT" Value="[Integration.Optical.Service_ASPNETCORE_ENVIRONMENT]" />
<EnvironmentVariable Name="KEYVAULT_ENDPOINT" Value="[Integration.Optical.Service_KEYVAULT_ENDPOINT]" />
</EnvironmentOverrides>
</ServiceManifestImport>
<DefaultServices>
<!-- The section below creates instances of service types, when an instance of this
application type is created. You can also create one or more instances of service type using the
ServiceFabric PowerShell module.
The attribute ServiceTypeName below must match the name defined in the imported ServiceManifest.xml file. -->
<Service Name="Integration.Optical.Service" ServicePackageActivationMode="ExclusiveProcess">
<StatefulService ServiceTypeName="Integration.Optical.ServiceType" TargetReplicaSetSize="[Integration.Optical.Service_TargetReplicaSetSize]" MinReplicaSetSize="[Integration.Optical.Service_MinReplicaSetSize]">
<UniformInt64Partition PartitionCount="[Integration.Optical.Service_PartitionCount]" LowKey="-9223372036854775808" HighKey="9223372036854775807" />
</StatefulService>
</Service>
</DefaultServices>
</ApplicationManifest>
ApplicationParameters/Cloud.xml:
<?xml version="1.0" encoding="utf-8"?>
<Application xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" Name="fabric:/Integration.Optical" xmlns="http://schemas.microsoft.com/2011/01/fabric">
<Parameters>
<Parameter Name="Integration.Optical.Service_ASPNETCORE_ENVIRONMENT" Value="" />
<Parameter Name="Integration.Optical.Service_KEYVAULT_ENDPOINT" Value="" />
<Parameter Name="Integration.Optical.Service_PartitionCount" Value="1" />
<Parameter Name="Integration.Optical.Service_MinReplicaSetSize" Value="1" />
<Parameter Name="Integration.Optical.Service_TargetReplicaSetSize" Value="1" />
</Parameters>
</Application>
Not sure what part of this fixed it. But this is what I did and it's now working:
In ServiceManifest.xml I added HasPersistedState = true:
<StatefulServiceType ServiceTypeName="Integration.Optical.ServiceType" HasPersistedState="true" />
I moved the app configuration code
ServiceRuntime.RegisterServiceAsync...
from Service.RunAsync() to Program.Main()

jgroups-kubernetes BindException

I used infinispan in the spring boot project, and I know that jgroups complete node communication and discovery in infinispan. I can already do multi-instance mutual discovery on the dockers. Now the problem is on Kubernetes.
The jgroups configuration file default-jgroups-kubernetes.xml I used was found in the official package of infinispan. I only modified tcp.port and the KUBE_PING tag:
<config xmlns="urn:org:jgroups"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups-4.0.xsd">
<TCP bind_addr="${jgroups.tcp.address:match-interface:eth.*}"
bind_port="${jgroups.tcp.port:30001}"
enable_diagnostics="false"
thread_naming_pattern="pl"
send_buf_size="640k"
sock_conn_timeout="300"
bundler_type="no-bundler"
logical_addr_cache_expiration="360000"
thread_pool.min_threads="${jgroups.thread_pool.min_threads:0}"
thread_pool.max_threads="${jgroups.thread_pool.max_threads:200}"
thread_pool.keep_alive_time="60000"
/>
<kubernetes.KUBE_PING
port_range="3000"
namespace="default"
masterProtocol="https"
masterHost="https://192.1.5.110:32305"
masterPort="32305"
/>
<MERGE3 min_interval="10000"
max_interval="30000"
/>
<FD_SOCK />
<!-- Suspect node `timeout` to `timeout + timeout_check_interval` millis after the last heartbeat -->
<FD_ALL timeout="10000"
interval="2000"
timeout_check_interval="1000"
/>
<VERIFY_SUSPECT timeout="1000"/>
<pbcast.NAKACK2 use_mcast_xmit="false"
xmit_interval="100"
xmit_table_num_rows="50"
xmit_table_msgs_per_row="1024"
xmit_table_max_compaction_time="30000"
resend_last_seqno="true"
/>
<UNICAST3 xmit_interval="100"
xmit_table_num_rows="50"
xmit_table_msgs_per_row="1024"
xmit_table_max_compaction_time="30000"
/>
<pbcast.STABLE stability_delay="500"
desired_avg_gossip="5000"
max_bytes="1M"
/>
<pbcast.GMS print_local_addr="false"
join_timeout="${jgroups.join_timeout:5000}"
/>
<MFC max_credits="2m"
min_threshold="0.40"
/>
<FRAG3/>
</config>
By the way, I have introduced the dependencies of jgroups-kubernetes
I used the above configuration, and then the following exception:
...
Caused by: java.net.BindException: no port available in range [30001 .. 30051] (bind_addr=xxxxx%eth0)
at org.jgroups.util.Util.createServerSocket(Util.java:3512)
...
Where xxxxx represents an IPv6 address
I don't know how to solve this exception. I tried to change the TCP port to 7800, and then the range is unchanged. The result is still the above exception, but the information becomes [7800..7850]
Looking forward to your help.Thanks!
Try to use either IPv4 or IPv6, but not a mix of both. IPv4 can be forced by setting system property java.net.preferIPv4Stack to true.
A port_range of 3000 doesn't make any sense, use a smaller number such as 10, depending on how many containers you're running in the same pod (1 might work, too).
In masterHost you're also setting the port, that's not needed and may actually fail. As a matter of fact, none of the 3 master* properties need to be set, as Kubernetes sets these properties itself.

"Failed to create endpoint [XXX] on network because of a duplicate name" on local 5-node Service Fabric cluster

I have a local Service Fabric cluster of 5 nodes and I have a problem deploying my application on all nodes. When set to "1 Node", the cluster works fine. When set to "5 Nodes" it gives an error on all nodes but one. This is the error/warning message:
Error event: SourceId='System.Hosting', Property='CodePackageActivation:Code:EntryPoint:131919316927686034'.
There was an error during CodePackage activation.System.Fabric.FabricException (-2147017731)
Failed to start Container. ContainerName=sf-0-28e0002f-fd7d-412c-81b8-b78ca5339ce4_865991cc-9c36-493f-9b3d-95f6eba43851, ApplicationId=Proton.SFType_App0, ApplicationName=fabric:/Proton.SF.
DockerRequest returned StatusCode=InternalServerError with ResponseBody={"message":"failed to create endpoint sf-0-28e0002f-fd7d-412c-81b8-b78ca5339ce4_865991cc-9c36-493f-9b3d-95f6eba43851 on network nat: HNS failed with error : You were not connected because a duplicate name
The error message looks truncated. The application loads up fine, but on one node only. Am I missing something in 5-node configuration? The application we are deploying is a container which runs a .NET Core app. I've attached a screenshot of the error in Service Fabric Explorer.
Error Screenshot
ServiceManifest.xml
<?xml version="1.0" encoding="utf-8"?>
<ServiceManifest Name="Proton.TestingPkg"
Version="1.0.0"
xmlns="http://schemas.microsoft.com/2011/01/fabric"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ServiceTypes>
<!-- This is the name of your ServiceType.
The UseImplicitHost attribute indicates this is a guest service. -->
<StatelessServiceType ServiceTypeName="Proton.TestingType" UseImplicitHost="true" />
</ServiceTypes>
<!-- Code package is your service executable. -->
<CodePackage Name="Code" Version="1.0.0">
<EntryPoint>
<!-- Follow this link for more information about deploying Windows containers to Service Fabric: https://aka.ms/sfguestcontainers -->
<ContainerHost>
<ImageName>proton.azurecr.io/protontesting:latest</ImageName>
</ContainerHost>
</EntryPoint>
<!-- Pass environment variables to your container: -->
<!--
<EnvironmentVariables>
<EnvironmentVariable Name="VariableName" Value="VariableValue"/>
</EnvironmentVariables>
-->
</CodePackage>
<!-- Config package is the contents of the Config directoy under PackageRoot that contains an
independently-updateable and versioned set of custom configuration settings for your service. -->
<ConfigPackage Name="Config" Version="1.0.0" />
<Resources>
<Endpoints>
<!-- This endpoint is used by the communication listener to obtain the port on which to
listen. Please note that if your service is partitioned, this port is shared with
replicas of different partitions that are placed in your code. -->
<Endpoint Name="Proton.TestingTypeEndpoint" Port="8001" />
</Endpoints>
</Resources>
</ServiceManifest>
ApplicationManifest.xml
<?xml version="1.0" encoding="utf-8"?>
<ApplicationManifest ApplicationTypeName="Proton.SFType"
ApplicationTypeVersion="1.0.0"
xmlns="http://schemas.microsoft.com/2011/01/fabric"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Parameters>
<Parameter Name="Proton.Testing_InstanceCount" DefaultValue="-1" />
</Parameters>
<!-- Import the ServiceManifest from the ServicePackage. The ServiceManifestName and ServiceManifestVersion
should match the Name and Version attributes of the ServiceManifest element defined in the
ServiceManifest.xml file. -->
<ServiceManifestImport>
<ServiceManifestRef ServiceManifestName="Proton.TestingPkg" ServiceManifestVersion="1.0.0" />
<ConfigOverrides />
<Policies>
<ContainerHostPolicies CodePackageRef="Code">
<!-- See https://aka.ms/I7z0p9 for how to encrypt your repository password -->
<RepositoryCredentials AccountName="ProtonCluster" Password="XXX" PasswordEncrypted="false" />
<PortBinding ContainerPort="80" EndpointRef="Proton.TestingTypeEndpoint" />
</ContainerHostPolicies>
</Policies>
</ServiceManifestImport>
<DefaultServices>
<!-- The section below creates instances of service types, when an instance of this
application type is created. You can also create one or more instances of service type using the
ServiceFabric PowerShell module.
The attribute ServiceTypeName below must match the name defined in the imported ServiceManifest.xml file. -->
<Service Name="Proton.Testing" ServicePackageActivationMode="ExclusiveProcess">
<StatelessService ServiceTypeName="Proton.TestingType" InstanceCount="[Proton.Testing_InstanceCount]">
<SingletonPartition />
</StatelessService>
</Service>
</DefaultServices>
</ApplicationManifest>

Can't make JTA work on jboss AS7.1 with spring 3.1

We're trying to configure a Spring application to work with JTA transactions. It is not like it is failing, but everything we tried just do the select and ignores our persistence operations.
As you can see in the following logs, even though the save statement run there’s no insert statement, neither an exception, neither an error/warning logs
server.log
DEBUG [xxxx.xxxx.persistence] (http--0.0.0.0-8080-1) execute $Proxy363.save(whatever.core.entities.XxxAssignedTime#51a8c542)
DEBUG [org.springframework.data.repository.core.support.TransactionalRepositoryProxyPostProcessor$CustomAnnotationTransactionAttributeSource] (http--0.0.0.0-8080-1) Adding transactional method 'save' with attribute: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
DEBUG [org.springframework.beans.factory.support.DefaultListableBeanFactory] (http--0.0.0.0-8080-1) Returning cached instance of singleton bean 'transactionManager'
DEBUG [org.springframework.transaction.jta.JtaTransactionManager] (http--0.0.0.0-8080-1) Participating in existing transaction
INFO [xxxx.xxxx.impl.BpmnUserBusinessImpl] (http--0.0.0.0-8080-1) Last Assigned Time Updated to *******, Thu Sep 06 15:48:48 ICT 2012
It is like thought the application server thinks everything went right. Whereas if you check the table no insertion or update was made.
The datasource that we intend to use with a spring application is successfully used with JTA transactions by a Java EE war application running in the same application server.
Since we have no idea where the problem might be I'd like to put it all in full context.
spring 3.1.2-RELEASE
hibernate 4.1.5.Final
spring-data-jpa 1.1.1.Final
apache axis 1.4
jboss AS7.1
DB: oracle 10g
We've been trying all sorts of crazy configuration in order to try to make it work, so I will just copy here the simplest one.
standalone.xml
<datasource jta="true" jndi-name="java:jboss/datasources/EngineDS" pool-name="EngineDS" enabled="true" use-java-context="true" use-ccm="true">
<connection-url>jdbc:oracle:thin:#server:port:****</connection-url>
<driver>oracle</driver>
<transaction-isolation>TRANSACTION_READ_COMMITTED</transaction-isolation>
<pool>
<min-pool-size>10</min-pool-size>
<max-pool-size>100</max-pool-size>
<prefill>true</prefill>
</pool>
<security>
<user-name>****</user-name>
<password>****</password>
</security>
<statement>
<prepared-statement-cache-size>32</prepared-statement-cache-size>
<share-prepared-statements>true</share-prepared-statements>
</statement>
</datasource>
<drivers>
<driver name="oracle" module="com.oracle.ojdbc6">
<xa-datasource-class>oracle.jdbc.xa.client.OracleXADataSource</xa-datasource-class>
</driver>
</drivers>
web.xml
[...]
<resource-ref id="DS">
<res-ref-name>EngineDS</res-ref-name>
<res-type>javax.sql.DataSource</res-type>
<res-auth>Container</res-auth>
<res-sharing-scope>Shareable</res-sharing-scope>
<mapped-name>java:jboss/datasources/EngineDS</mapped-name>
</resource-ref>
[...]
spring-jpa-config.xml
<jpa:repositories base-package="whatever.core.repositories" />
<jee:jndi-lookup id="dataSource" jndi-name="EngineDS" cache="true" resource-ref="true" expected-type="javax.sql.DataSource"/>
<bean id="entityManagerFactory"
class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="dataSource" ref="dataSource" />
<property name="packagesToScan" value="whatever.core.entities" />
<property name="jpaVendorAdapter">
<bean class="org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter">
<property name="showSql" value="${hibernate.show_sql}" />
<property name="generateDdl" value="${jpa.generateDdl}" />
<property name="databasePlatform" value="${jpa.dialect}" />
</bean>
</property>
</bean>
<tx:annotation-driven />
<tx:jta-transaction-manager />
<bean class="org.springframework.orm.jpa.support.PersistenceAnnotationBeanPostProcessor"/>
<bean class="org.springframework.orm.hibernate4.HibernateExceptionTranslator"/>
We had made in the past jboss AS5/Spring 3.0 applications to work with JTA transactions, so it is not like is my first time doing this, and we've been looking around all possible blogs and open source projects I could find. Yet anything that seem to work smoothly to everybody seems to be ignored in my application. I'm sure it should be that we're missing something really stupid somewhere but we've tried so far 70+ different configurations and none seem do a simple insert that otherwise work when not trying JTA.
(The fact that we use axis 1.4 might be relevant or not, but I'd like to tell because our application only triggers actions after a web service call). At this point we are starting to believe in configuration paranormal activity...
Any clue anyone?
It turned out that the configuration above wasn't working due to the fact that I was using org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean.
To have JTA I should lookup for the JBoss container-managed entity manager spring-jpa.config.xml:
<!-- lookup the container-managed JPA-EMF -->
<!-- the JNDI name is specified in META-INF/persistence.xml -->
<!-- SEE: https://docs.jboss.org/author/display/AS71/JPA+Reference+Guide#JPAReferenceGuide-BindingEntityManagerFactorytoJNDI -->
<jee:jndi-lookup id="defaultPu" jndi-name="java:jboss/defaultPu" />
And in the META-INF/persistence.xml you bind the Entity Manager Factory into JNDI:
<!-- bind the EMF in JNDI -->
<property name="jboss.entity.manager.factory.jndi.name" value="java:jboss/defaultPu" />
So, I let Jboss bootstrap JPA, and in such way the JTA transactions work!