How does Cassandra driver update contactPoints if all pods are restarted in Kubernetes without restarting the client application? - kubernetes

We have created a StatefulSet & headless service. There are 2 ways we can define the peer IPs in the application:
Use 'cassandra-headless-service-name' in contactPoints
Fetch the peer IPs from the headless service, externalize them, and read them when initializing the connection.
So far so good.
The above works if one or some pods are restarted, but not all. In that case, the driver will update to the new IPs automatically.
But how will this work in case of a complete outage? If all pods go down and, when they come back, all the pod IPs have changed (IPs can change in Kubernetes), how will the application connect to Cassandra?

In a complete outage, you're right, the application will not have any valid endpoints for the cluster. Those will need to be refreshed (and the app restarted) before the app will connect to Cassandra.
We actually wrote a RESTful API that we can use to query the current, valid endpoints by cluster. That way, the app teams can find the current IPs for their cluster at any time. I recommend doing something similar.
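A minimal sketch of that idea, assuming a hypothetical registry endpoint that returns the cluster's current contact points one host:port per line (the URL, response format, and datacenter name below are illustrative, not part of the original setup):

import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.CqlSessionBuilder;

import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class EndpointLookup {

    // Hypothetical internal API that returns the current endpoints for a cluster,
    // one "host:port" entry per line.
    private static final String ENDPOINT_API =
            "http://cluster-registry.internal/clusters/my-cluster/endpoints";

    public static CqlSession connect() throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient().send(
                HttpRequest.newBuilder(URI.create(ENDPOINT_API)).GET().build(),
                HttpResponse.BodyHandlers.ofString());

        CqlSessionBuilder builder = CqlSession.builder()
                .withLocalDatacenter("dc1"); // placeholder datacenter name

        // Use the freshly looked-up IPs as contact points instead of stale, externalized ones.
        for (String line : response.body().trim().split("\\R")) {
            String[] hostPort = line.split(":");
            builder.addContactPoint(
                    new InetSocketAddress(hostPort[0], Integer.parseInt(hostPort[1])));
        }
        return builder.build();
    }
}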

Related

Mongo connection URI for statefulSet in K8s, each replica (pod) or the (headless) service?

I'm a little unsure what the correct connection URI would be for my applications to the MongoDB StatefulSet. I have three replicas running in my cluster, each one on a separate node.
Should I configure the pods OR the headless service (load balancer for the pods)?
The documentation directs me to use the pods, like this (Running MongoDB on Kubernetes with StatefulSets | Kubernetes):
mongodb://user:pwd@mongo-0.mongo,mongo-1.mongo,mongo-2.mongo:27017/dbname_?
But I have got it working with the service as well:
mongodb://user:pwd@mongodb-headless.svc.cluster.local:27017/dbname_?authSource=admin&replicaSet=rs0
But I don't know which is the correct URI. The problem I'm having is that when some of the replicas go down, for some reason, the application crashes as the database connection is lost. I thought this is where the headless service comes into the picture, but no, the documentation says to configure the pods. And if I scale the replicas I need to reconfigure the URI. This does not sound very dynamic.
I'm also facing some issues with the headless service: if it is in a different namespace I cannot get the connection to work with the namespace specified, like:
mongodb-headless.namespace.svc.cluster.local:27017
Have I missed something?
Thank you in advance!
EDIT: added replicaSet to the service/LB URI example (I had this configured...)
I think your way of referencing the headless service will result in MongoDB only using the first host in the set.
Another way is to use MongoDB's DNS seed list connection format together with Kubernetes' support for DNS SRV records. If you named your Service's port mongodb, then the following connection string ought to work:
mongodb+srv://user:pwd@mongodb-headless.namespace.svc.cluster.local/dbname_?
MongoDB clients will use DNS to get a seed list on connection, which stays up to date with the actual Pods running.
Note that this enables TLS by default, which you probably do not want.
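For illustration, a minimal Java sketch with the synchronous MongoDB driver using the SRV URI from above (the Service and namespace names are the question's placeholders; tls=false is added only because of the default just mentioned):

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;

public class MongoSrvExample {
    public static void main(String[] args) {
        // DNS seed list format: the driver resolves SRV records for the headless
        // Service, so the member list tracks the Pods that actually back it.
        // Note: +srv URIs must not include a port, and TLS is on by default,
        // hence the explicit tls=false for a plain in-cluster setup.
        MongoClient client = MongoClients.create(
                "mongodb+srv://user:pwd@mongodb-headless.namespace.svc.cluster.local/dbname_"
                        + "?authSource=admin&replicaSet=rs0&tls=false");
        client.listDatabaseNames().forEach(System.out::println);
    }
}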

Connect to Cassandra on Kubernetes using java-driver

We are bringing up a Cassandra cluster using the k8ssandra Helm chart. It exposes several services. Our client applications use the DataStax Java driver and run in the same k8s cluster as the Cassandra cluster (this is the testing phase).
CqlSessionBuilder builder = CqlSession.builder();
What is the recommended way to connect the application (via the Driver) to Cassandra?
Adding all nodes?
for (String node : nodes) {
builder.addContactPoint(new InetSocketAddress(node, 9042));
}
Adding just the service address?
builder.addContactPoint(new InetSocketAddress(service-dns-name , 9042))
Adding the service address as unresolved? (would that even work?)
builder.addContactPoint(InetSocketAddress.createUnresolved(service-dns-name , 9042))
The k8ssandra Helm chart deploys a CassandraDatacenter object and cass-operator, in addition to a number of other resources. cass-operator is responsible for managing the CassandraDatacenter. It creates the StatefulSet(s) and several headless services, including:
datacenter service
seeds service
all pods service
The seeds service only resolves to pods that are seeds. Its name is of the form <cluster-name>-seed-service. Because of the ephemeral nature of pods, cass-operator may designate different C* nodes as seed nodes. Do not use the seeds service for connecting client applications.
The all pods service resolves to all Cassandra pods, regardless of whether they are ready. Its name is of the form <cluster-name>-<dc-name>-all-pods-service. This service is intended to facilitate monitoring. Do not use the all pods service for connecting client applications.
The datacenter service resolves to ready pods. Its name is of the form <cluster-name>-<dc-name>-service. This is the service that you should use for connecting client applications. Do not use pod IPs directly, as they will change over time.
Adding all nodes?
You definitely do not need to add all of the nodes as contact points. Even in vanilla Cassandra, adding only a few is fine, as the driver will discover the rest of the cluster from them.
Adding just the service address?
Your second option of using the service address is all you should need. The nice thing about the service address is that it accounts for IPs changing or being removed from the cluster.
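As a rough sketch (assuming a cluster named mycluster with a datacenter dc1, so the datacenter service would be mycluster-dc1-service - substitute your own names), connecting with the 4.x driver could look like:

import com.datastax.oss.driver.api.core.CqlSession;

import java.net.InetSocketAddress;

// "mycluster-dc1-service" and "dc1" are placeholders for your actual
// <cluster-name>-<dc-name>-service name and datacenter name.
CqlSession session = CqlSession.builder()
        .addContactPoint(new InetSocketAddress("mycluster-dc1-service", 9042))
        .withLocalDatacenter("dc1") // driver 4.x requires this when contact points are set
        .build();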

Within a Kubernetes cluster catch outgoing requests from a Pod and redirect to a different target

I have a cluster with 3 nodes. On each node I have a frontend application running in a Pod and a backend application running in a separate Pod.
I send data from the frontend application to the backend application; to do this I utilise the ClusterIP Service and k8s DNS.
I also have a function in my frontend where I send data to a separate service unrelated to my k8s cluster. I send this data using a standard AJAX request to a URL with a payload, i.e. http://my-seperate-service-unrelated-tok8.com.
All of this works correctly and the cluster operates as I want - I have this cluster deployed to GKE.

I now want to run this cluster locally using minikube, which I have been able to do. However, when I am running locally I do not want to send data to my external service - instead I want to forward it either to a new Pod I will create, or just not send it at all.


The problem here is that I need a proxy to intercept outgoing network traffic, check if the outgoing request is the request I am looking for, and if it is, redirect it.
I understand each node running in a cluster has a kube-proxy service running within the node, which is used to forward traffic to the relevant services in the cluster.

I would like to either extend this service or create a new proxy service where I can listen for outgoing traffic to a specific URL and redirect it.

Is this possible to do in a k8s cluster? I assume there is a Service I can create to listen for all outgoing requests and redirect specific requests based on rules I set.

I wasn't sure if k8s clusters have a Service already configured that I can simply add to - that's why I thought of kube-proxy. Would anyone be able to advise on this?

I wanted to add this proxy so I don't have to change my code when it's run locally in minikube or deployed to GKE.


Any help is greatly appreciated. Thanks!
I made a tool that helps you forward a service to another service, a local port, a service from another cluster, etc...
This way you can keep exactly the same URLs, ports and code... but the underlying services get "replaced" - if I understand correctly, this is what you are looking for.
Here is a quick example of a staging service being replaced with my local port 3000.
This is the repository with more info and examples: linker-tool
If you are interested, let me know if you need help or have any questions.

Proxy outgoing traffic of Kubernetes cluster through a static IP

I am trying to build a service that needs to be connected to a socket over the internet without downtime. The service will be reading and publishing info to a message queue, messages should be published only once and in the order received.
For this reason I thought of deploying it into Kubernetes, where I can automatically have multiple replicas in case one process fails - i.e. just one process (pod) should be running at any time, not multiple pods publishing the same messages to the queue.
These requests need to be routed through a proxy with a static IP; otherwise I cannot connect to the socket. I understand this may not be a standard use case, as a reverse proxy is normally used with load balancers such as Nginx.
How is it possible to build this kind of forward proxy in Kubernetes?
I will be deploying this on Google Container Engine.
Assuming you're happy to use Terraform, you can use this:
https://github.com/GoogleCloudPlatform/terraform-google-nat-gateway
However, there's one caveat, and that is it may also affect traffic to other clusters in that same region/zone.
Is a LoadBalancer what you need?
Kubernetes can create an external load balancer; you can see this doc.

Fixed number of pods with a specific purpose (socket connection)

We are planning to use Kubernetes, and I am validating if and how it fits our requirements.
One concern is the following:
I want to build an application/pod which connects to a certain service on the internet (host and port) and keeps the socket alive as long as we need it (usually forever). The number of sockets the application will connect to may vary.
For the inter pod communication we are going to use RabbitMQ.
What is the correct/best-practice approach for that purpose?
One pod handling all/multiple sockets?
Replicated pods handling multiple socket?
One Socket per pod?
How do I react if the number of sockets changes?
At the moment we want to use GitLab CI and Helm for our CI pipeline.
Kubernetes deploys Pods and has two abstractions for them: Deployments and StatefulSets. The former deploys ephemeral Pods whose hostnames and IPs change; the latter retains state and gives Pods stable identities across restarts.
If you're deploying Kubernetes only for this application, it's overkill IMHO. I'd rather use plain Docker or a simpler-than-Kubernetes orchestrator such as Docker Swarm Mode or Kontena.
If Kubernetes is your only option, you could deploy the app as a StatefulSet. That way its hostname will remain stable between restarts. Have the app monitor its hostname and connect to the appropriate endpoint. For example, the app-1 Pod connects to endpoint:10001, the app-2 Pod connects to endpoint:10002, and so on...
When more Pods are needed to connect to more sockets, either increase the StatefulSet's replicas manually, or write a sidecar application that monitors the number of sockets and scales the replicas up or down automatically.
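A minimal Java sketch of that hostname-to-endpoint mapping, with a hypothetical target host and the base-port convention from the example above (Pod ordinal N connects to port 10000 + N):

import java.net.InetAddress;
import java.net.Socket;

public class SocketWorker {
    public static void main(String[] args) throws Exception {
        // In a StatefulSet the Pod's hostname ends with its ordinal, e.g. "app-2".
        String hostname = InetAddress.getLocalHost().getHostName();
        int ordinal = Integer.parseInt(hostname.substring(hostname.lastIndexOf('-') + 1));

        // Hypothetical mapping: Pod N keeps the socket on port 10000 + N open.
        int port = 10000 + ordinal;
        try (Socket socket = new Socket("endpoint.example.com", port)) {
            // keep the connection alive here and publish received messages to RabbitMQ
        }
    }
}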