Hazelcast map on Kubernetes not in sync

I am writing a Spring-based application in Kotlin that uses Hazelcast, and I run into an issue when deploying it to Kubernetes.
For Hazelcast on Kubernetes I use DNS lookup mode for discovery.
I have the following Hazelcast configuration:
hazelcast:
  network:
    join:
      multicast:
        enabled: false
      kubernetes:
        enabled: true
        service-dns: my-application-hs
And the following service.yaml for the Kubernetes deployment:
apiVersion: v1
kind: Service
metadata:
  name: my-application
spec:
  type: ClusterIP
  selector:
    component: my-application
  ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: http
---
apiVersion: v1
kind: Service
metadata:
  name: my-application-hs
spec:
  type: ClusterIP
  clusterIP: None
  selector:
    component: my-application
  ports:
    - name: hazelcast
      port: 5701
A Hazelcast map is used like this:
import com.hazelcast.config.Config
import com.hazelcast.config.SerializerConfig
import com.hazelcast.core.Hazelcast
import com.hazelcast.core.HazelcastInstance
import com.hazelcast.map.IMap
import org.springframework.stereotype.Component

@Component
class CacheClientImplHazelcast {

    private val hz: HazelcastInstance

    init {
        val serializer = SerializerConfig()
            .setTypeClass(MyDto::class.java)
            .setImplementation(MyDtoSerializer())
        val config = Config()
        config.serializationConfig.addSerializerConfig(serializer)
        hz = Hazelcast.newHazelcastInstance(config)
    }

    fun getAllData(): List<MyDto> {
        val map: IMap<String, MyDto> = hz.getMap("my-map")
        return map.values.toList()
    }

    fun putData(key: String, myDto: MyDto) {
        val map: IMap<String, MyDto> = hz.getMap("my-map")
        map.put(key, myDto)
    }

    fun clear() {
        val map: IMap<String, MyDto> = hz.getMap("my-map")
        map.clear()
    }
}
When running 3 instances on Kubernetes, the Hazelcast logs always show me 4 members, something like this:
Members {size:4, ver:53} [
Member [10.4.2.32]:5701 - c1e70d6f-a62d-4924-9815-36bb1f98f141
Member [10.4.3.25]:5702 - be96c292-8847-4f56-ae32-f27f380d7c5b
Member [10.4.2.32]:5702 - 6ca96bfd-eb74-4149-8630-a2488e76d97d
Member [10.4.11.41]:5702 - 7e8b0bc9-ad2b-41eb-afbf-b7af9ed497bd this
]
(Side question 1: why do I see 4 here instead of 3?)
Now, even though the members seem to be connected (at least the logs of the nodes all show the same member UUIDs), when I write data on one node it is not available on the other nodes. Calls to getAllData only show data that has been put into the Hazelcast map on that node. When I send requests to the individual nodes (curl on the shell), I only see a fraction of the data. When I send a request to the normal URL of the pod, round-robin gives me data from the different nodes, which is not synchronized.
If I run the very same application locally with the following hazelcast.yaml:
hazelcast:
  network:
    join:
      multicast:
        enabled: false
Then it all works correctly and the cache really seems to be "synchronized" across different server instances running locally. For this test I start the application on different ports.
However, what is also strange: even if I run only 2 instances locally, the Hazelcast logs indicate there are 4 members:
Members {size:4, ver:4} [
Member [192.168.68.106]:5701 - 01713c9f-7718-4ed4-b532-aaf62443c425
Member [192.168.68.106]:5702 - 09921004-88ef-4fe5-9376-b88869fde2bc
Member [192.168.68.106]:5703 - cea4b13f-d538-48f1-b0f2-6c53678c5823 this
Member [192.168.68.106]:5704 - 44d84e70-5b68-4c69-a45b-fee39bd75554
]
(Side question 2: Why do I see 4 members even though I only started 2 locally?)
The main question now is: why does this setup not work on Kubernetes? Why does each node have a separate map that is not in sync with the other nodes?
Here are some log messages that could be relevant, but in which I was not able to identify an issue:
2022-04-19T12:29:31.414588357Z2022-04-19 12:29:31.414 INFO 1 --- [cached.thread-7] c.h.i.server.tcp.TcpServerConnector : [10.4.11.41]:5702 [dev] [5.1.1] Connecting to /10.4.3.25:5702, timeout: 10000, bind-any: true
2022-04-19T12:29:31.414806473Z2022-04-19 12:29:31.414 INFO 1 --- [.IO.thread-in-2] c.h.i.server.tcp.TcpServerConnection : [10.4.11.41]:5702 [dev] [5.1.1] Initialized new cluster connection between /10.4.11.41:5702 and /10.4.3.25:46475
2022-04-19T12:29:31.414905573Z2022-04-19 12:29:31.414 INFO 1 --- [cached.thread-4] c.h.i.server.tcp.TcpServerConnector : [10.4.11.41]:5702 [dev] [5.1.1] Connecting to /10.4.2.32:5702, timeout: 10000, bind-any: true
2022-04-19T12:29:31.416520854Z2022-04-19 12:29:31.416 INFO 1 --- [.IO.thread-in-0] c.h.i.server.tcp.TcpServerConnection : [10.4.3.25]:5702 [dev] [5.1.1] Initialized new cluster connection between /10.4.3.25:5702 and /10.4.11.41:40455
2022-04-19T12:29:31.416833551Z2022-04-19 12:29:31.416 INFO 1 --- [.IO.thread-in-1] c.h.i.server.tcp.TcpServerConnection : [10.4.2.32]:5702 [dev] [5.1.1] Initialized new cluster connection between /10.4.2.32:54433 and /10.4.11.41:5702
2022-04-19T12:29:31.417377114Z2022-04-19 12:29:31.417 INFO 1 --- [.IO.thread-in-0] c.h.i.server.tcp.TcpServerConnection : [10.4.11.41]:5702 [dev] [5.1.1] Initialized new cluster connection between /10.4.11.41:40455 and /10.4.3.25:5702
2022-04-19T12:29:31.417545174Z2022-04-19 12:29:31.417 INFO 1 --- [.IO.thread-in-2] c.h.i.server.tcp.TcpServerConnection : [10.4.2.32]:5702 [dev] [5.1.1] Initialized new cluster connection between /10.4.2.32:5702 and /10.4.11.41:53547
2022-04-19T12:29:31.418541840Z2022-04-19 12:29:31.418 INFO 1 --- [.IO.thread-in-1] c.h.i.server.tcp.TcpServerConnection : [10.4.11.41]:5702 [dev] [5.1.1] Initialized new cluster connection between /10.4.11.41:53547 and /10.4.2.32:5702
2022-04-19T12:29:31.419763311Z2022-04-19 12:29:31.419 INFO 1 --- [.IO.thread-in-2] c.h.i.server.tcp.TcpServerConnection : [10.4.3.25]:5702 [dev] [5.1.1] Initialized new cluster connection between /10.4.3.25:46475 and /10.4.11.41:5702
2022-04-19T12:29:31.676218042Z2022-04-19 12:29:31.675 INFO 1 --- [gulis.migration] c.h.i.partition.impl.MigrationManager : [10.4.2.32]:5701 [dev] [5.1.1] Repartitioning cluster data. Migration tasks count: 271

There are two issues here but only one problem.
Why the extra instances
Spring (Boot) will create a Hazelcast instance for you if it finds a Hazelcast config file and there is no HazelcastInstance @Bean. You can fix this by excluding HazelcastAutoConfiguration.class or by exposing the instance you create in your component class as a @Bean.
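For example, a minimal sketch of the @Bean option, reusing MyDto/MyDtoSerializer from the question (the configuration class and bean names here are just assumptions):

import com.hazelcast.config.Config
import com.hazelcast.config.SerializerConfig
import com.hazelcast.core.Hazelcast
import com.hazelcast.core.HazelcastInstance
import org.springframework.context.annotation.Bean
import org.springframework.context.annotation.Configuration

@Configuration
class HazelcastInstanceConfig {

    // Spring Boot's Hazelcast auto-configuration backs off when a HazelcastInstance
    // bean already exists, so only one embedded member is started per JVM.
    @Bean
    fun hazelcastInstance(): HazelcastInstance {
        val serializer = SerializerConfig()
            .setTypeClass(MyDto::class.java)
            .setImplementation(MyDtoSerializer())
        // Config.load() picks up hazelcast.yaml (with the kubernetes join settings)
        // from the classpath instead of starting with an empty programmatic config.
        val config = Config.load()
        config.serializationConfig.addSerializerConfig(serializer)
        return Hazelcast.newHazelcastInstance(config)
    }
}

CacheClientImplHazelcast can then take the HazelcastInstance as a constructor parameter instead of calling Hazelcast.newHazelcastInstance in its init block. The other route would be excluding the auto-configuration, e.g. @SpringBootApplication(exclude = [HazelcastAutoConfiguration::class]).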
Why the data sync issue on Kubernetes
Each pod accidentally runs 2 Hazelcast members, one on 5701 and one on 5702, but your service only exposes port 5701, so some of the members cannot be reached from outside their pod. This will go away when you fix the first issue.

Related

Access kafka on cloud from onprem K8

I am trying to connect to a Kafka broker set up on Azure AKS from an on-prem Rancher K8s cluster over the internet. I have created a loadbalancer listener on the Azure Kafka; it creates 4 public IPs using the Azure load balancer service.
- name: external
  port: 9093
  type: loadbalancer
  tls: true
  authentication:
    type: tls
  configuration:
    bootstrap:
      loadBalancerIP: 172.22.199.99
      annotations:
        external-dns.alpha.kubernetes.io/hostname: bootstrap.example.com
    brokers:
      - broker: 0
        loadBalancerIP: 172.22.199.100
        annotations:
          external-dns.alpha.kubernetes.io/hostname: kafka-0.example.com
      - broker: 1
        loadBalancerIP: 172.22.199.101
        annotations:
          external-dns.alpha.kubernetes.io/hostname: kafka-1.example.com
      - broker: 2
        loadBalancerIP: 172.22.199.102
        annotations:
          external-dns.alpha.kubernetes.io/hostname: kafka-2.example.com
    brokerCertChainAndKey:
      secretName: source-kafka-listener-cert
      certificate: tls.crt
      key: tls.key
Now, to connect from on-prem, I have opened the firewall only for the bootstrap LB IP. My understanding is that the bootstrap will in turn route the requests to the individual brokers, but that is not happening. When I try to connect, a connection is established with the bootstrap load balancer IP, but after that I get a timeout error.
2022-08-22 08:14:04,659 INFO Metrics reporters closed (org.apache.kafka.common.metrics.Metrics) [kafka-admin-client-thread | adminclient-1]
2022-08-22 08:14:04,659 ERROR Stopping due to error (org.apache.kafka.connect.cli.ConnectDistributed) [main] org.apache.kafka.connect.errors.ConnectException: Failed to connect to and describe Kafka cluster. Check worker's broker connection and security properties.
at org.apache.kafka.connect.util.ConnectUtils.lookupKafkaClusterId(ConnectUtils.java:70)
at org.apache.kafka.connect.util.ConnectUtils.lookupKafkaClusterId(ConnectUtils.java:51)
Please let me know if I have to open the firewall for the individual brokers as well.
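For background on why that matters: the Kafka client only uses the bootstrap address for the initial metadata request; it then opens direct connections to each broker's advertised address (kafka-0/1/2.example.com above), so those broker load balancer IPs need to be reachable through the firewall as well. A rough sketch that prints the addresses the client will actually try to connect to (the truststore/keystore paths and passwords are placeholders):

import org.apache.kafka.clients.admin.AdminClient
import java.util.Properties

fun main() {
    val props = Properties().apply {
        put("bootstrap.servers", "bootstrap.example.com:9093")
        put("security.protocol", "SSL")
        put("ssl.truststore.location", "/path/to/truststore.jks") // placeholder
        put("ssl.truststore.password", "changeit")                // placeholder
        put("ssl.keystore.location", "/path/to/keystore.jks")     // placeholder
        put("ssl.keystore.password", "changeit")                  // placeholder
    }
    AdminClient.create(props).use { admin ->
        // These are the advertised broker addresses the client connects to directly.
        admin.describeCluster().nodes().get().forEach { node ->
            println("${node.host()}:${node.port()}")
        }
    }
}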

How to configure application.server dynamically for Kafka Streams remote interactive queries in a Spring Boot app running on Kubernetes

We have a Kubernetes cluster with three pods running, and I want to know which RPC endpoints we need to provide in application.server to make interactive queries work.
So we have a use case where we need to query the state store via a gRPC server.
While creating the gRPC server we are using 50052 as the port,
but I am not able to figure out what the value of application.server should be, as it takes host:port.
For the host, do we need to give the endpoint IP of each pod, with 50052 as the port?
For example:
$> kubectl get ep
NAME    ENDPOINTS                       AGE
myapp   10.8.2.85:8080,10.8.2.88:8080   10d
Pod1 -> 10.8.2.85:8080
Pod2 -> 10.8.2.88:8080
So what will the value of application.server be?
1. 10.8.2.85:50052 (the port is what I am giving to the gRPC server)
2. 10.8.2.88:50052 (the port is what I am giving to the gRPC server)
If the above application.server values are correct, then how do I get this pod IP dynamically?
You can make your pod's IP address available as an environment variable and then reference that environment variable in your application.yml.
See: https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
pod.yml
apiVersion: v1
kind: Pod
metadata:
  name: kafka-stream-app
spec:
  containers:
    - name: kafka-stream-app
      # ... some more configs
      env:
        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
application.yml
spring:
  cloud:
    stream:
      kafka:
        streams:
          binder:
            configuration:
              application.server: ${MY_POD_IP}:50052
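On the query side, this is roughly how those host:port values come back out of Kafka Streams to decide which pod to call over gRPC (a sketch; the store name "my-store", the String key and the serde are assumptions, and the actual gRPC call is omitted):

import org.apache.kafka.common.serialization.Serdes
import org.apache.kafka.streams.KafkaStreams
import org.apache.kafka.streams.KeyQueryMetadata

fun findOwner(streams: KafkaStreams, key: String): String {
    val metadata: KeyQueryMetadata =
        streams.queryMetadataForKey("my-store", key, Serdes.String().serializer())
    val host = metadata.activeHost()
    // host() and port() are exactly what was configured in application.server,
    // i.e. the pod IP and 50052 - the address to call the gRPC server on.
    return "${host.host()}:${host.port()}"
}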

How to access Kafka installed outside a Kubernetes cluster from a service provisioned inside the cluster

My setup is as follows: I have a producer service provisioned as part of a minikube cluster, and it is trying to publish messages to a Kafka instance running on the host machine.
I have written a Kafka Service and Endpoints YAML as follows:
kind: Service
apiVersion: v1
metadata:
  name: kafka
spec:
  ports:
    - name: "broker"
      protocol: "TCP"
      port: 9092
      targetPort: 9092
      nodePort: 0
---
kind: Endpoints
apiVersion: v1
metadata:
  name: kafka
  namespace: default
subsets:
  - addresses:
      - ip: 10.0.2.2
    ports:
      - name: "broker"
        port: 9092
The IP address of the host machine, as seen from inside the minikube cluster and used in the Endpoints, was acquired with the following command:
minikube ssh "route -n | grep ^0.0.0.0 | awk '{ print \$2 }'"
The problem I am facing is that the topic is getting created when the producer tries to publish a message for the first time, but no messages are getting written to that topic.
Digging into the pod logs, I found that the producer is trying to connect to a Kafka instance on localhost or something similar (not really sure about it).
2020-05-17T19:09:43.021Z [warn] org.apache.kafka.clients.NetworkClient [] -
[Producer clientId=/system/sharding/kafkaProducer-greetings/singleton/singleton/producer]
Connection to node 0 (omkara/127.0.1.1:9092) could not be established. Broker may not be available.
Following this, I suspected that I probably need to modify server.properties with the following change:
listeners=PLAINTEXT://localhost:9092
This, however, only resulted in a change of the IP address in the log:
2020-05-17T19:09:43.021Z [warn] org.apache.kafka.clients.NetworkClient [] -
[Producer clientId=/system/sharding/kafkaProducer-greetings/singleton/singleton/producer]
Connection to node 0 (omkara/127.0.0.1:9092) could not be established. Broker may not be available.
I am not sure which IP address must be mentioned here, or what an alternative solution would be, and whether it is even possible to connect from inside the Kubernetes cluster to a Kafka instance installed outside of it.
Since the producer's Kafka client is not running on the same host as the broker, we need to configure an additional listener like so:
listeners=INTERNAL://0.0.0.0:9093,EXTERNAL://0.0.0.0:9092
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
advertised.listeners=INTERNAL://localhost:9093,EXTERNAL://10.0.2.2:9092
inter.broker.listener.name=INTERNAL
We can verify the messages in the topic like so:
kafka-console-consumer.sh --bootstrap-server localhost:9093 --topic greetings --from-beginning
{"name":"Alice","message":"Namastey"}
You can find a detailed explanation on understanding and provisioning Kafka listeners here.
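For completeness, a producer inside the cluster can then bootstrap through the kafka Service from the question; the broker answers with the advertised EXTERNAL address 10.0.2.2:9092, which is reachable from the pods. A sketch (the topic and message are taken from the console-consumer output above):

import org.apache.kafka.clients.producer.KafkaProducer
import org.apache.kafka.clients.producer.ProducerRecord
import org.apache.kafka.common.serialization.StringSerializer
import java.util.Properties

fun main() {
    val props = Properties().apply {
        put("bootstrap.servers", "kafka:9092") // the ClusterIP Service backed by the 10.0.2.2:9092 Endpoints
        put("key.serializer", StringSerializer::class.java.name)
        put("value.serializer", StringSerializer::class.java.name)
    }
    KafkaProducer<String, String>(props).use { producer ->
        producer.send(ProducerRecord("greetings", "Alice", """{"name":"Alice","message":"Namastey"}""")).get()
    }
}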

Access SQL Server database from Kubernetes Pod

My deployed Spring Boot application is trying to connect to an external SQL Server database from a Kubernetes pod, but every time it fails with the error:
Failed to initialize pool: The TCP/IP connection to the host <>, port 1443 has failed.
Error: "Connection timed out: no further information.
Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.
I have tried to exec into the pod and can successfully ping the DB server without any issues.
Below are the solutions I have tried:
Created a Service and Endpoints, provided the DB IP in the configuration file, and tried to bring up the application in the pod
Tried using the internal IP from the Endpoints instead of the DB IP in the configuration, to see whether the internal IP is resolved to the DB IP
But both cases gave the same result. Below is the YAML I am using to create the Service and Endpoints.
---
apiVersion: v1
kind: Service
metadata:
  name: mssql
  namespace: cattle
spec:
  type: ClusterIP
  ports:
    - port: 1433
---
apiVersion: v1
kind: Endpoints
metadata:
  name: mssql
  namespace: cattle
subsets:
  - addresses:
      - ip: <<DB IP>>
    ports:
      - port: 1433
Please let me know if I am wrong or missing something in this setup.
Additional information about the K8s setup:
It is a clustered master with an external etcd cluster topology
The OS on the nodes is CentOS
I am able to ping the server from all nodes and from the pods that are created
For this scenario an ExternalName service is very useful. It redirects traffic to the external name without you having to define an Endpoints object.
kind: "Service"
apiVersion: "v1"
metadata:
  namespace: "your-namespace"
  name: "ftp"
spec:
  type: ExternalName
  externalName: your-ip
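As a usage note (a sketch only; the service and namespace names are taken from the question's mssql Service, and the database name and credentials are placeholders), the application then points its JDBC URL at the in-cluster DNS name of whichever Service you create:

# application.yml sketch - database name and credentials are placeholders
spring:
  datasource:
    url: jdbc:sqlserver://mssql.cattle.svc.cluster.local:1433;databaseName=mydb
    username: ${DB_USER}
    password: ${DB_PASSWORD}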
The issue was resolved by updating the deployment YAML with the IP address. Since all the servers were in the same subnet, I did not need to create a Service or Endpoints to access the DB. Thank you for all the input on the post.

Multiple external names in a Kubernetes Service to access a remotely hosted MongoDB with a connection string

I would like to connect my Kubernetes Deployment to a remotely hosted database with a URI.
I am able to connect to the remotely hosted database with a URI using Docker. Now I'd like to understand how I can specify multiple external names in a Kubernetes Service file.
I have a MongoDB cluster with the below URL:
mongodb://username:password@mngd-new-pr1-01:27017,mngd-new-pr2-02:27017,mngd-new-pr3-03:27017/
I have followed Kubernetes best practices: mapping external services. When I set up a single external name, it works.
How can I specify all 3 cluster members in the external name?
kind: Service
apiVersion: v1
metadata:
  name: mongo
spec:
  type: ExternalName
  externalName: mngd-new-pr1-01,mngd-new-pr2-02,mngd-new-pr3-03
  ports:
    - port: 27017
Since I was unable to create multiple external names, I went with creating a headless service and then created the Endpoints for the service, as described in "Scenario 1: Database outside cluster with IP address".
From the logs, I think the connectivity is being established, but later there was an exception like the one below and it was disconnected.
2019-03-20 11:26:13.941 INFO 1 --- [38.200.19:27038] org.mongodb.driver.connection : Opened connection [connectionId{localValue:1, serverValue:386066}] to .38.200.19:27038
2019-03-20 11:26:13.953 INFO 1 --- [.164.29.4:27038] org.mongodb.driver.connection : Opened connection [connectionId{localValue:2, serverValue:458254}] to .164.29.4:27038
2019-03-20 11:26:13.988 INFO 1 --- [38.200.19:27038] org.mongodb.driver.cluster : Monitor thread successfully connected to server with description ServerDescription{address=.38.200.19:27038, type=REPLICA_SET_PRIMARY, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 6, 8]}, minWireVersion=0, maxWireVersion=6, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=45440955, setName='no-prd-rep', canonicalAddress=mngd-new-pr1-01:27038, hosts=[mngd-new-pr1-01:27038, mngd-new-pr1-02:27038, mngd-new-pr1-03:27038], passives=[], arbiters=[], primary='mngd-new-pr1-01:27038'
2019-03-20 11:26:13.990 INFO 1 --- [38.200.19:27038] org.mongodb.driver.cluster : Adding discovered server mngd-new-pr1-01:27038 to client view of cluster
2019-03-20 11:26:13.992 INFO 1 --- [38.200.19:27038] org.mongodb.driver.cluster : Adding discovered server mngd-new-pr1-02:27038 to client view of cluster
2019-03-20 11:26:13.993 INFO 1 --- [38.200.19:27038] org.mongodb.driver.cluster : Adding discovered server mngd-new-pr1-03:27038 to client view of cluster
2019-03-20 11:26:13.997 INFO 1 --- [38.200.19:27038] org.mongodb.driver.cluster : Server 102.227.4:27038 is no longer a member of the replica set. Removing from client view of cluster.
2019-03-20 11:26:14.001 INFO 1 --- [.164.29.4:27038] org.mongodb.driver.cluster : Monitor thread successfully connected to server with description ServerDescription{address=.164.29.4:27038, type=REPLICA_SET_SECONDARY, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 6, 8]}, minWireVersion=0, maxWireVersion=6, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=47581993, setName='no-prd-rep', canonicalAddress=mngd-new-pr1-01:27038, hosts=[mngd-new-pr1-01:27038, mngd-new-pr1-02:27038, mngd-new-pr1-03:27038], passives=[], arbiters=[], primary='mngd-new-pr1-01:27038',
2019-03-20 11:26:14.001 INFO 1 --- [38.200.19:27038] org.mongodb.driver.cluster : Server 38.200.19:27038 is no longer a member of the replica set. Removing from client view of cluster.
2019-03-20 11:26:14.001 INFO 1 --- [38.200.19:27038] org.mongodb.driver.cluster : Server 164.29.4:27038 is no longer a member of the replica set. Removing from client view of cluster.
2019-03-20 11:26:14.001 INFO 1 --- [38.200.19:27038] org.mongodb.driver.cluster : Canonical address mngd-new-pr1-01:27038 does not match server address. Removing .38.200.19:27038 from client view of cluster
2019-03-20 11:26:34.012 INFO 1 --- [2-prd2-01:27038] org.mongodb.driver.cluster : Exception in monitor thread while connecting to server mngd-new-pr1-01:27038
com.mongodb.MongoSocketException: mngd-new-pr1-01: Name or service not known
at com.mongodb.ServerAddress.getSocketAddress(ServerAddress.java:188) ~[mongodb-driver-core-3.6.4.jar!/:na]
So, since we are using IP addresses in the Endpoints and they do not match the hostnames in the connection string specified in the deployment YAML, it might be failing.
This is really confusing me a lot :)
PS: to check the connectivity to the external Mongo cluster I have launched a single pod:
apiVersion: v1
kind: Pod
metadata:
  name: proxy-chk
spec:
  containers:
    - name: centos
      image: centos
      command: ["/bin/sh", "-c", "while : ;do curl -L http://${MONGODBendpointipaddress}:27038/; sleep 10; done"]
In the logs I can see that it is able to establish connectivity:
"It looks like you are trying to access MongoDB over HTTP on the native driver port."
So I think the headless service which I created earlier is able to route the traffic.
I need your advice.
One alternative could be to create one Service per Mongo host, but that defeats the abstraction if you need to add more hosts in the future.
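For illustration, a rough sketch of that alternative: one Service per replica set member, each named exactly like the hostname in the replica set configuration so the driver can resolve it in-cluster. The target DNS names are placeholders, and since externalName must be a DNS name, raw IPs would instead need a headless Service plus Endpoints per host:

apiVersion: v1
kind: Service
metadata:
  name: mngd-new-pr1-01
spec:
  type: ExternalName
  externalName: mngd-new-pr1-01.example.com   # placeholder
---
apiVersion: v1
kind: Service
metadata:
  name: mngd-new-pr2-02
spec:
  type: ExternalName
  externalName: mngd-new-pr2-02.example.com   # placeholder
---
apiVersion: v1
kind: Service
metadata:
  name: mngd-new-pr3-03
spec:
  type: ExternalName
  externalName: mngd-new-pr3-03.example.com   # placeholder

The connection string from the question can then stay unchanged, since each hostname now resolves inside the cluster.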