Connect to Cassandra on Kubernetes using java-driver - kubernetes

We are bringing up a Cassandra cluster, using k8ssandra helm chart, it exposes several services, our client applications are using the datastax Java-Driver and running at the same k8s cluster as the Cassandra cluster (this is testing phase)
CqlSessionBuilder builder = CqlSession.builder();
What is the recommended way to connect the application (via the Driver) to Cassandra?
Adding all nodes?
for (String node :nodes) {
builder.addContactPoint(new InetSocketAddress(node, 9042));
}
Adding just the service address?
builder.addContactPoint(new InetSocketAddress(service-dns-name , 9042))
Adding the service address as unresolved? (would that even work?)
builder.addContactPoint(InetSocketAddress.createUnresolved(service-dns-name , 9042))

The k8ssandra Helm chart deploys a CassandraDatacenter object and cass-operator in addition to a number of other resources. cass-operator is responsible for managing the CassandraDatacenter. It creates the StatefulSet(s) and creates several headless services including:
datacenter service
seeds service
all pods service
The seeds service only resolves to pods that are seeds. Its name is of the form <cluster-name>-seed-service. Because of the ephemeral nature of pods cass-operator may designate different C* nodes as seed nodes. Do not use the seed service for connecting client applications.
The all pods service resolves to all Cassandra pods, regardless of whether they are readiness. Its name is of the form <cluster-name>-<dc-name>-all-pods-service. This service is intended to facilitate with monitoring. Do not use the all pods service for connecting client applications.
The datacenter service resolves to ready pods. Its name is of the form <cluster-name>-<dc-name>-service This is the service that you should use for connecting client applications. Do not directly use pod IPs as they will change over time.

Adding all nodes?
You definitely do not need to add all of the nodes as contact points. Even in vanilla Cassandra, only adding a few is fine as the driver will gossip and find the rest.
Adding just the service address?
Your second option of binding on the service address is all you should need to do. The nice thing about the service address, is that it will account for changing/removing of IPs in the cluster.

Related

Cross cluster communication in GKE Multi-cluster Service

I’m using GKE multi-cluster service and have configured two clusters.
On one cluster I have an endpoint I want to consume and it's hard-coded on address:
redpanda-0.redpanda.processing.svc.cluster.local.
Does anyone know how I can reach this from the other cluster?
EDIT:
I have exported the service, which is then automatically imported into the other cluster. Previously, I have been able to connect to the other cluster using SERVICE_EXPORT_NAME.NAMESPACE.svc.clusterset.local, but then I had to change the endpoint address manually to this exact address. In my new case, the endpoint address is not configurable.

Communication between Pods in Kubernetes. Service object or Cluster Networking?

I'm a beginner in Kubernetes and I have a situation as following: I have two differents Pods: PodA and PodB. Firstly, I want to expose PodA to the outside world, so I create a Service (type NodePort or LoadBalancer) for PodA, which is not difficult to understand for me.
Then I want PodA communicate to PodB, and after several hours googling, I found the answer is that I also need to create a Service (type ClusterIP if I want to keep PodB only visible inside the cluster) for PodB, and if I do so, I can let PodA and PodB comminucate to each other. But the problem is I also found this article. According to this webpage, they say that the communication between pods on the same node can be done via cbr0, a Network Bridge, or the communication between pods on different nodes can be done via a route table of the cluster, and they don't mention anything to the Service object (which means we don't need Service object ???).
In fact, I also read the documents of K8s and I found in the Cluster Networking
Cluster Networking
...
2. Pod-to-Pod communications: this is the primary focus of this document.
...
where they also focus on to the Pod-to-Pod communications, but there is no stuff relevant to the Service object.
So, I'm really confusing right now and my question is: Could you please explain to me the connection between these stuff in the article and the Service object? The Service object is a high-level abstract of the cbr0 and route table? And in the end, how can the Pods can communicate to each other?
If I misunderstand something, please, point it out for me, I really appreciate that.
Thank you guys !!!
Motivation behind using a service in a Kubernetes cluster.
Kubernetes Pods are mortal. They are born and when they die, they are not resurrected. If you use a Deployment to run your app, it can create and destroy Pods dynamically.
Each Pod gets its own IP address, however in a Deployment, the set of Pods running in one moment in time could be different from the set of Pods running that application a moment later.
This leads to a problem: if some set of Pods (call them “backends”) provides functionality to other Pods (call them “frontends”) inside your cluster, how do the frontends find out and keep track of which IP address to connect to, so that the frontend can use the backend part of the workload?
That being said, a service is handy when your deployments (podA and podB) are dynamically managed.
Your PodA can always communicate with PodB if it knows the address or the DNS name of PodB. In a cluster environment, there may be multiple replicas of PodB, or an instance of PodB may die and be replaced by another instance with a different address and different name. A Service is an abstraction to deal with this situation. If you use a Service to expose your PodB, then all pods in the cluster can talk to an instance of PodB using that service, which has a fixed name and fixed address no matter how many instances of PodB exists and what their addresses are.
First, I read it as you are dealing with two applications, e.g. ApplicationA and ApplicationB. Don't use the Pod abstraction when you reason about your architecture. On Kubernetes, you are dealing with a distributed system, and it is designed so that you should have multiple instances of your Application, e.g. for High Availability. Each instance of your application is a Pod.
Deploy your applications ApplicationA and ApplicationB as a Deployment resource. Then it is easy do do rolling upgrades without downtime, and Kubernetes will restart any instance of your application if it crash.
For every Deployment or for you, application, create one Service resource, (e.g. ServiceA and ServiceB). When you communicate from ApplicationA to another application, use the Service, e.g. ServiceB. The service will load balance your requests to the instances of the other application, and you can upgrade your Deployment without downtime.
1.Cluster networking : As the name suggests, all the pods deployed in the cluster will be connected by implementing any kubernetes network model like DANM, flannel
Check this link to see how to create a cluster network.
Creating cluster network
With the CNI installed (by implementing cluster network), every pod will get an IP.
2.Service objects created with type ClusterIP, points to the this IPs (via endpoint) created internally to communicate.
Answering your question, Yes, The Service object is a high-level abstract of the cbr0 and route table.
You can use service object to communicate between pods.
You can also implement service mesh like envoy / Istio if the network is complex.

How DNS service works in the Kubernetes?

I am new to the Kubernetes, and I'm trying to understand that how can I apply it for my use-case scenario.
I managed to install a 3-node cluster on VMs within the same network. Searching about K8S's concepts and reading related articles, still I couldn't find answer for my below question. Please let me know if you have knowledge on this:
I've noticed that internal DNS service of K8S applies on the pods and this way services can find each other with hostnames instead of IPs.
Is this applicable for communication between pods of different nodes or this is only within the services inside a single node? (In other words, do we have a dns service on the node level in the K8S, or its only about pods?)
The reason for this question is the scenario that I have in mind:
I need to deploy a micro-service application (written in Java) with K8S. I made docker images from each service in my application and its working locally. Currently, these services are connected via pre-defined IP addresses.
Is there a way to run each of these services within a separate K8S node and use from its DNS service to connect the nodes without pre-defining IPs?
A service serves as an internal endpoint and (depending on the configuration) load balancer to one or several pods behind it. All communication typically is done between services, not between pods. Pods run on nodes, services don't really run anything, they are just routing traffic to the appropriate pods.
A service is a cluster-wide configuration that does not depend on a node, thus you can use a service name in the whole cluster, completely independent from where a pod is located.
So yes, your use case of running pods on different nodes and communicate between service names is a typical setup.

How to install Kubernetes dashboard on external IP address?

How to install Kubernetes dashboard on external IP address?
Is there any tutorial for this?
You can expose services and pods in several ways:
expose the internal ClusterIP service through Ingress, if you have that set up.
change the service type to use 'type: LoadBalancer', which will try to create an external load balancer.
If you have external IP addresses on your kubernetes nodes, you can also expose the ports directly on the node hosts; however, I would avoid these unless it's a small, test cluster.
change the service type to 'type: NodePort', which will utilize a port above 30000 on all cluster machines.
expose the pod directly using 'type: HostPort' in the pod spec.
Depending on your cluster type (Kops-created, GKE, EKS, AKS and so on), different variants may not be setup. Hosted clusters typically support and recommend LoadBalancers, which they charge for, but may or may not have support for NodePort/HostPort.
Another, more important note is that you must ensure you protect the dashboard. Running an unprotected dashboard is a sure way of getting your cluster compromised; this recently happened to Tesla. A decent writeup on various way to protect yourself was written by Jo Beda of Heptio

OpenShift and hostnetwork=true

I have deployed two POD-s with hostnetwork set to true. When the POD-s are deployed on same OpenShfit node then everything works fine since they can discover each other using node IP.
When the POD-s are deployed on different OpenShift nodes then they cant discover each other, I get no route to host if I want to point one POD to another using node IP. How to fix this?
The uswitch/kiam (https://github.com/uswitch/kiam) service is a good example of a use case.
it has an agent process that runs on the hostnetwork of all worker nodes because it modifies a firewall rule to intercept API requests (from containers running on the host) to the AWS api.
it also has a server process that runs on the hostnetwork to access the AWS api since the AWS api is on a subnet that is only available to the host network.
finally... the agent talks to the server using GRPC which connects directly to one of the IP addresses that are returned when looking up the kiam-server.
so you have pods of the agent deployment running on the hostnetwork of node A trying to connect to kiam server running on the hostnetwork of node B.... which just does not work.
furthermore, this is a private service... it should not be available from outside the network.
If you want the two containers to be share the same physical machine and take advantage of loopback for quick communications, then you would be better off defining them together as a single Pod with two containers.
If the two containers are meant to float over a larger cluster and be more loosely coupled, then I'd recommend taking advantage of the Service construct within Kubernetes (under OpenShift) and using that for the appropriate discovery.
Services are documented at https://kubernetes.io/docs/concepts/services-networking/service/, and along with an internal DNS service (if implemented - common in Kubernetes 1.4 and later) they provide a means to let Kubernetes manage where things are, updating an internal DNS entry in the form of <servicename>.<namespace>.svc.cluster.local. So for example, if you set up a Pod with a service named "backend" in the default namespace, the other Pod could reference it as backend.default.svc.cluster.local. The Kubernetes documentation on the DNS portion of this is available at https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/
This also avoids the "hostnetwork=true" complication, and lets OpenShift (or specifically Kubernetes) manage the networking.
If you have to absolutely use hostnetwork, you should be creating router and then use those routers to have the communication between pods. You can create ha proxy based router in opeshift, reference here --https://docs.openshift.com/enterprise/3.0/install_config/install/deploy_router.html