We followed the article https://ignite.apache.org/docs/latest/installation/kubernetes/amazon-eks-deployment#creating-service and setup the Ignite Cluster in the Kubernetes.
We were unable to establish the connection from the client application which is deployed in a different POD to this Ignite Cluster using TcpDiscoveryKubernetesIpFinder.
We saw the below errors in the Ignite Cluster node,
[SEVERE][main][TcpDiscoverySpi] Failed to get registered addresses from IP finder (retrying every 2000ms; change 'reconnectDelay' to configure the frequency of retries) [maxTimeout=0] class org.apache.ignite.spi.IgniteSpiException:
Failed to retrieve Ignite pods IP addresses. at org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder.getRegisteredAddresses(TcpDiscoveryKubernetesIpFinder.java:80)
Here is the spring.xml configuration:
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd">
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
<property name="discoverySpi">
<bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
<property name="ipFinder">
<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder">
<property name="namespace" value="ignite-namespace" />
<property name="serviceName" value="ignite-service" />
</bean>
</property>
</bean>
</property>
</bean>
</beans>
We specified the namespace and service names in the deployment files,
cluster-role.yaml:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: ignite
namespace: ignite-namespace
rules:
- apiGroups:
- ""
resources: # Here are the resources you can access
- pods
- endpoints
verbs: # That is what you can do with them
- get
- list
- watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: ignite
roleRef:
kind: ClusterRole
name: ignite
apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
name: ignite
namespace: ignite-namespace
ignite-service.yaml:
apiVersion: v1
kind: Service
metadata:
# The name must be equal to KubernetesConnectionConfiguration.serviceName
name: ignite-service
# The name must be equal to KubernetesConnectionConfiguration.namespace
namespace: ignite-namespace
labels:
app: ignite
spec:
type: LoadBalancer
ports:
- name: rest
port: 8080
targetPort: 8080
- name: thinclients
port: 10800
targetPort: 10800
# The pod-to-service routing is required for apps that are not deployed in K8
sessionAffinity: ClientIP
selector:
# Must be equal to the label set for pods.
app: ignite
status:
loadBalancer: {}
any idea about this error:
No available Hazelcast instance. Please specify your Hazelcast configuration file path via "HazelcastCachingProvider.HAZELCAST_CONFIG_LOCATION"
Working fine with this cfg in a local kubernetes, but always getting this error when i set Kubernetes = true and multicast to false. I'm trying to use it for Liberty in IBMCloud Kubernetes.
<hazelcast xmlns="http://www.hazelcast.com/schema/config"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.hazelcast.com/schema/config
https://hazelcast.com/schema/config/hazelcast-config-3.12.xsd">
<group>
<name>cluster</name>
</group>
<network>
<join>
<multicast enabled="true"/>
<kubernetes enabled="false"/>
</join>
</network>
</hazelcast>
server.xml
<httpSessionCache libraryRef="jCacheVendorLib"
uri="file:${server.config.dir}hazelcast-config.xml" />
<library id="jCacheVendorLib">
<file name="${shared.config.dir}/lib/global/hazelcast-3.12.6.jar" />
</library>
This is what I done:
I have a docker image using liberty, in the liberty configuration I set the following configuration to use hazelcast:
<server>
<featureManager>
...
<feature>sessionCache-1.0</feature>
...
</featureManager>
...
<httpSessionCache libraryRef="jCacheVendorLib"
uri="file:${server.config.dir}hazelcast-config.xml" />
<library id="jCacheVendorLib">
<file name="${shared.config.dir}/lib/global/hazelcast-3.12.6.jar" />
</library>
...
</server>
Then I set the configuration in hazelcast-config.xml. I only get the error when I set kubernetes=true and multicast=false. If I left kubernetes = false and multicast = true works fine on my local kubernetes, but hazelcast can't find other pods when I deployed it on cloud (looks like ips are on a different network)
<hazelcast xmlns="http://www.hazelcast.com/schema/config"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.hazelcast.com/schema/config
https://hazelcast.com/schema/config/hazelcast-config-3.12.xsd">
<group>
<name>cluster</name>
</group>
<network>
<join>
<multicast enabled="true"/>
<kubernetes enabled="false"/>
</join>
</network>
</hazelcast>
Also I ran the RBAC yaml.
And run the following yaml to deploy it:
apiVersion: apps/v1
kind: Deployment
metadata:
name: employee-service
labels:
app: employee-service
spec:
replicas: 3
selector:
matchLabels:
app: employee-service
template:
metadata:
labels:
app: employee-service
spec:
containers:
- name: myapp
image: myapp
ports:
- name: http
containerPort: 8080
- name: multicast
containerPort: 5701
------------------------------
apiVersion: v1
kind: Service
metadata:
name: service
spec:
type: NodePort
selector:
app: employee-service
ports:
- protocol: TCP
port: 9080
targetPort: 9080
nodePort: 31234
If you use the Hazelcast Kubernetes plugin for the discovery, please make sure that you configured RBAC, for example with the following command.
kubectl apply -f https://raw.githubusercontent.com/hazelcast/hazelcast-kubernetes/master/rbac.yaml
Please also make sure that the default parameters work for you (you run your Hazelcast in the same namespace, etc.).
If that does not help, please share the full StackTrace logs.
I am trying to deploy a custom nifi instances working with external zookeeper, on Kubernetes (begineer).
Every thing works except for the state management within Nifi.
I understood that I have to update the state-management.xml file, with the right connection string :
<cluster-provider>
<id>zk-provider</id>
<class>org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider</class>
<property name="Connect String"></property>
<property name="Root Node">/nifi</property>
<property name="Session Timeout">10 seconds</property>
<property name="Access Control">Open</property>
</cluster-provider>
I do not know how I get access to this connection string within Kubernetes, this is my service.yml for zookeeper:
apiVersion: v1
kind: Service
metadata:
name: zk-hs
labels:
app: zk
spec:
selector:
app: zk
ports:
- port: 2888
name: server
- port: 3888
name: leader-election
clusterIP: None
---
apiVersion: v1
kind: Service
metadata:
name: zk-cs
labels:
app: zk
spec:
selector:
app: zk
ports:
- port: 2181
name: client
For Zookeeper leader election and so on, I used the following address :
zk-0.zk-hs.default.svc.cluster.local:2888:3888
But how to access to the port 2181?
You can just access zk-cs.default.svc.cluster.local:2181
Readiness Probe keeps the application in at a non-ready state. While being in this state the application cannot connect to any kubernetes service.
I'm using Ubuntu 18 for both master and nodes for my kubernetes cluster. (The problem still appeared when I used only master in the cluster, so I don't think this is a master node kind of problem).
I set up my kubernetes cluster with an Spring application, which uses hazelcast in order to manage cache. So, while using readiness probe, the application can't access a kubernetes service I created in order to connect the applications via hazelcast using the hazelcast-kubernetes plugin.
When I take out the readiness-probe, the application connects as soon as it can to the service creating hazelcast clusters successfully and everything works properly.
The readiness probe will connect to a rest api which its only response is a 200 code. However, while the application is going up, in the middle of the process it will start the hazelcast cluster, and as such, it will try to connect to the kubernetes hazelcast service which connects the app's cache with other pods, while the readiness probe hasn't been cleared and the pod is in a non-ready state due to the probe. This is when the application will not be able to connect to the kubernetes service and it will either fail or get stuck as a consequence of the configuration I add.
service.yaml:
apiVersion: v1
kind: Service
metadata:
name: my-app-cluster-hazelcast
spec:
selector:
app: my-app
ports:
- name: hazelcast
port: 5701
deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-app-deployment
labels:
app: my-app-deployment
spec:
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
maxSurge: 2
replicas: 2
selector:
matchLabels:
app: my-app
template:
metadata:
labels:
app: my-app
spec:
terminationGracePeriodSeconds: 180
containers:
- name: my-app
image: my-repo:5000/my-app-container
imagePullPolicy: Always
ports:
- containerPort: 5701
- containerPort: 9080
readinessProbe:
httpGet:
path: /app/api/excluded/sample
port: 9080
initialDelaySeconds: 120
periodSeconds: 15
securityContext:
capabilities:
add:
- SYS_ADMIN
env:
- name: container
value: docker
hazelcast.xml:
<?xml version="1.0" encoding="UTF-8"?>
<hazelcast
xsi:schemaLocation="http://www.hazelcast.com/schema/config http://www.hazelcast.com/schema/config/hazelcast-config-3.11.xsd"
xmlns="http://www.hazelcast.com/schema/config"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<properties>
<property name="hazelcast.jmx">false</property>
<property name="hazelcast.logging.type">slf4j</property>
</properties>
<network>
<port auto-increment="false">5701</port>
<outbound-ports>
<ports>49000,49001,49002,49003</ports>
</outbound-ports>
<join>
<multicast enabled="false"/>
<kubernetes enabled="true">
<namespace>default</namespace>
<service-name>my-app-cluster-hazelcast</service-name>
</kubernetes>
</join>
</network>
</hazelcast>
hazelcast-client.xml:
<?xml version="1.0" encoding="UTF-8"?>
<hazelcast-client
xsi:schemaLocation="http://www.hazelcast.com/schema/client-config http://www.hazelcast.com/schema/client-config/hazelcast-client-config-3.11.xsd"
xmlns="http://www.hazelcast.com/schema/client-config"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<properties>
<property name="hazelcast.logging.type">slf4j</property>
</properties>
<connection-strategy async-start="false" reconnect-mode="ON">
<connection-retry enabled="true">
<initial-backoff-millis>1000</initial-backoff-millis>
<max-backoff-millis>60000</max-backoff-millis>
</connection-retry>
</connection-strategy>
<network>
<kubernetes enabled="true">
<namespace>default</namespace>
<service-name>my-app-cluster-hazelcast</service-name>
</kubernetes>
</network>
</hazelcast-client>
Expected result:
The service is able to connect to the pods, creating endpoints in its description.
$ kubectl describe service my-app-cluster-hazelcast
Name: my-app-cluster-hazelcast
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"my-app-cluster-hazelcast","namespace":"default"},"spec":{"ports...
Selector: app=my-app
Type: ClusterIP
IP: 10.244.28.132
Port: hazelcast 5701/TCP
TargetPort: 5701/TCP
Endpoints: 10.244.4.10:5701,10.244.4.9:5701
Session Affinity: None
Events: <none>
The application runs properly and shows two members in its hazelcast cluster and the deployment is shown as ready, the application can be fully accessed:
logs:
2019-08-26 23:07:36,614 TRACE [hz._hzInstance_1_dev.InvocationMonitorThread] (com.hazelcast.spi.impl.operationservice.impl.InvocationMonitor): [10.244.4.10]:5701 [dev] [3.11] Broadcasting operation control packets to: 2 members
$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
my-app-deployment 2/2 2 2 2m27s
Actual Result:
The service doesn't get any endpoint.
$ kubectl describe service my-app-cluster-hazelcast
Name: my-app-cluster-hazelcast
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"my-app-cluster-hazelcast","namespace":"default"},"spec":{"ports...
Selector: app=my-app
Type: ClusterIP
IP: 10.244.28.132
Port: hazelcast 5701/TCP
TargetPort: 5701/TCP
Endpoints:
Session Affinity: None
Events: <none>
The application gets stuck with the connection-strategy enabled in hazelcast-client.xml with the following logs, keeping its own cluster with no communication and the deployment in a non-ready state forever:
logs:
22:54:11.236 [hz.client_0.cluster-] WARN com.hazelcast.client.connection.ClientConnectionManager - hz.client_0 [dev] [3.11] Unable to get alive cluster connection, try in 57686 ms later, attempt 52 , cap retrytimeout millis 60000
22:55:02.036 [hz._hzInstance_1_dev.cached.thread-4] DEBUG com.hazelcast.internal.cluster.impl.MembershipManager - [10.244.4.8]:5701 [dev] [3.11] Sending member list to the non-master nodes:
Members {size:1, ver:1} [
Member [10.244.4.8]:5701 - 6a4c7184-8003-4d24-8023-6087d68e9709 this
]
22:55:08.968 [hz.client_0.cluster-] WARN com.hazelcast.client.connection.ClientConnectionManager - hz.client_0 [dev] [3.11] Unable to get alive cluster connection, try in 51173 ms later, attempt 53 , cap retrytimeout millis 60000
22:56:00.184 [hz.client_0.cluster-] WARN com.hazelcast.client.connection.ClientConnectionManager - hz.client_0 [dev] [3.11] Unable to get alive cluster connection, try in 55583 ms later, attempt 54 , cap retrytimeout millis 60000
$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
my-app-deployment 0/2 2 0 45m
Just to clarify:
As described by OP with reference to readiness probe:
The kubelet uses readiness probes to know when a Container is ready to start accepting traffic. A Pod is considered ready when all of its Containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers
In your service yaml you have
spec:
selector:
app: my-app
but in deployment yaml the labels value is different
metadata:
name: my-app-deployment
labels:
app: my-app-deployment
Is there any reason for this?
I setup the hazelcast kubernetes configuration as per the explanations given in the below link,
https://vertx.io/docs/vertx-hazelcast/java/#_configuring_for_kubernetes
But, hazelcast can identify all members on one node and not able to find across all the nodes in the cluster.
Please help us to solve this issue
Following is the service file for hazelcast of type ClusterIP,
apiVersion: v1
kind: Service
metadata:
name: cb-hazelcast-service
spec:
selector:
component: cb-hazelcast-service
type: ClusterIP
clusterIP: None
ports:
- name: hz-port-name
port: 5701
protocol: TCP
Following is the deployment file for microservice 1,
apiVersion: apps/v1
kind: Deployment
metadata:
name: cb-agent-service
spec:
replicas: 1
minReadySeconds: 30
strategy:
rollingUpdate:
maxSurge: 1
maxUnavailable: 1
type: RollingUpdate
selector:
matchLabels:
app: cb-agent-service
template:
metadata:
labels:
app: cb-agent-service
component: cb-hazelcast-service
spec:
containers:
- name: cb-agent-service
image: <docker-image-hub>/agent-service:hz-dns-001
#imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /usr/data/logs
name: shared-logs
ports:
- containerPort: 8085
name: cbport
ports:
- name: hazelcast
containerPort: 5701
volumes:
- name: shared-logs
hostPath:
path: /usr/data/logs
---
apiVersion: v1
kind: Service
metadata:
name: cb-agent-service
labels:
vertx-cluster: "true"
spec:
type: NodePort
ports:
- port: 80
targetPort: 8085
selector:
app: cb-agent-service
following is deployment for another microservice,
apiVersion: apps/v1
kind: Deployment
metadata:
name: cb-transaction-service
spec:
replicas: 1
minReadySeconds: 30
strategy:
rollingUpdate:
maxSurge: 1
maxUnavailable: 1
type: RollingUpdate
selector:
matchLabels:
app: cb-transaction-service
template:
metadata:
labels:
app: cb-transaction-service
component: cb-hazelcast-service
spec:
containers:
- name: cb-transaction-service
image: <docker-iamge-hub>/transaction-service:hz-dns-001
#imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /usr/data/logs
name: shared-logs
ports:
- containerPort: 8085
name: cbport
ports:
- name: hazelcast
containerPort: 5701
nodeSelector:
service: transaction
volumes:
- name: shared-logs
hostPath:
path: /usr/data/logs
---
apiVersion: v1
kind: Service
metadata:
name: cb-transaction-service
labels:
vertx-cluster: "true"
spec:
type: NodePort
ports:
- port: 80
targetPort: 8085
selector:
app: cb-transaction-service
Following is the cluster.xml file for all the microservices
<?xml version="1.0" encoding="UTF-8"?>
<hazelcast xsi:schemaLocation="http://www.hazelcast.com/schema/config hazelcast-config-3.6.xsd"
xmlns="http://www.hazelcast.com/schema/config"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<properties>
<property name="hazelcast.memcache.enabled">false</property>
<property name="hazelcast.wait.seconds.before.join">0</property>
<property name="hazelcast.logging.type">slf4j</property>
<property name="hazelcast.health.monitoring.delay.seconds">2</property>
<property name="hazelcast.max.no.heartbeat.seconds">5</property>
<property name="hazelcast.max.no.master.confirmation.seconds">10</property>
<property name="hazelcast.master.confirmation.interval.seconds">10</property>
<property name="hazelcast.member.list.publish.interval.seconds">10</property>
<property name="hazelcast.connection.monitor.interval">10</property>
<property name="hazelcast.connection.monitor.max.faults">2</property>
<property name="hazelcast.partition.migration.timeout">10</property>
<property name="hazelcast.migration.min.delay.on.member.removed.seconds">3</property>
<!-- at the moment the discovery needs to be activated explicitly -->
<property name="hazelcast.discovery.enabled">true</property>
<property name="hazelcast.rest.enabled">false</property>
</properties>
<network>
<port auto-increment="true" port-count="10000">5701</port>
<outbound-ports>
<ports>0</ports>
</outbound-ports>
<join>
<multicast enabled="false"/>
<tcp-ip enabled="false"/>
<discovery-strategies>
<discovery-strategy enabled="true"
class="com.hazelcast.kubernetes.HazelcastKubernetesDiscoveryStrategy">
<properties>
<property name="service-dns">cb-hazelcast-service.default.svc.cluster.local</property>
</properties>
</discovery-strategy>
</discovery-strategies>
</join>
</network>
<partition-group enabled="false"/>
<executor-service name="default">
<pool-size>16</pool-size>
<!--Queue capacity. 0 means Integer.MAX_VALUE.-->
<queue-capacity>0</queue-capacity>
</executor-service>
<map name="__vertx.subs">
<backup-count>1</backup-count>
<time-to-live-seconds>0</time-to-live-seconds>
<max-idle-seconds>0</max-idle-seconds>
<max-size policy="PER_NODE">0</max-size>
<eviction-percentage>25</eviction-percentage>
<merge-policy>com.hazelcast.map.merge.LatestUpdateMapMergePolicy</merge-policy>
</map>
<semaphore name="__vertx.*">
<initial-permits>1</initial-permits>
</semaphore>
</hazelcast>