Communication between two Cassandra nodes

Assume two Cassandra nodes running on hosts A and B respectively. Which TCP and/or UDP ports need to be open between hosts A and B for Cassandra to operate properly?

That depends on how you have configured storage-conf.xml on your two nodes.
Hint: take a look at <StoragePort>7000</StoragePort> in storage-conf.xml.
(TCP port 7000 is the standard/default port used by Cassandra for internal communication, i.e. the address it binds to and tells other nodes to connect to.)
UDP (port 7001 by default) was previously used for gossip, but that was removed in 0.6.0.
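So with a default 0.6-era config, only the storage port (TCP 7000) has to be open between hosts A and B; the Thrift port (TCP 9160) only matters for clients. The relevant elements of storage-conf.xml look roughly like this (a sketch; your addresses and values may differ):

    <!-- Port for inter-node communication: TCP 7000 by default -->
    <StoragePort>7000</StoragePort>
    <!-- Address to bind to and tell other nodes to connect to -->
    <ListenAddress>10.0.0.1</ListenAddress>
    <!-- Thrift client RPC: TCP 9160 by default; only clients need this open -->
    <ThriftAddress>0.0.0.0</ThriftAddress>
    <ThriftPort>9160</ThriftPort>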

Related

How can I have more than 64K connections per node in Kubernetes?

I have an EKS Kubernetes cluster. At a high level, the setup is:
a) There is an EC2 instance, let's call it "VM" or "Host"
b) In the VM, there is a Pod running 2 containers: a sidecar HAProxy container + a MyApp container
What happens is that when external requests come in, inside the HAProxy container I can see that the source IP is the "Host" IP. As the Host has a single IP, there can be a maximum of 64K connections to HAProxy.
I'm curious how to work around this problem, as I want to be able to make something like 256K connections per Host.
I'm not sure you understand the reason for the 64K limit, so I'll try to explain it.
First of all, that is a good answer about the 64K limitation.
Let's say that HAProxy (192.168.100.100) is listening on port 8080 and the free ports on the Host (192.168.1.1) are 1,353~65,353, so you have combinations of:
source 192.168.1.1:1353~65353 → destination 192.168.100.100:8080
That is 64K simultaneous connections. I don't know how often the NAT table is updated, but once it is, unused ports will be reused. So "simultaneous" is the important word.
If your only problem is the limit of connections per IP, here are a couple of solutions:
Run multiple HAProxies. Three containers raise the limit to 64,000 × 3 = 192,000.
Listen on multiple ports in HAProxy (check SO_REUSEPORT). Three ports (8080, 8081, 8082) raise the maximum number of connections to 192,000; see the sketch after this list.
The Host interface IP acts like a gateway for the Docker internal network, so I'm not sure whether it is possible to assign several IPs to the Host or to HAProxy. At least I didn't find any information about it.
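For the multiple-ports option, a minimal haproxy.cfg sketch (frontend/backend names and the app address are made up; global and defaults sections omitted):

    frontend fe_multi
        mode tcp
        # each extra bind line adds another ~64K of source-port combinations
        bind *:8080
        bind *:8081
        bind *:8082
        default_backend be_myapp

    backend be_myapp
        mode tcp
        server app1 127.0.0.1:9000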
It turns out that in Kubernetes one can configure how clients access the service, and the choice we had was nodePort. When we changed it to hostPort, the source IP was seen in the HAProxy container, and hence the limitation I was hitting was removed, as sketched below.
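For reference, hostPort is set per container port in the Pod spec, roughly like this (a sketch; names and image are made up):

    apiVersion: v1
    kind: Pod
    metadata:
      name: myapp-pod
    spec:
      containers:
      - name: haproxy
        image: haproxy:2.4
        ports:
        - containerPort: 8080
          hostPort: 8080   # bind on the node itself instead of going through a nodePort Service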
If this option had failed, my next option was to try the recommendation in the other response, which was to have HAProxy listen on multiple ports. Thankfully that was not needed.
Thanks!

Image Store Port Selection

How does the Image Store Service choose its ports? Right now, it seems to choose different ports every time Service Fabric starts, sometimes clashing with our application ports.
The ephemeral endpoint port range is set, but the Image Store Service is still using ports inside the application port range. Our services have fixed ports.
I have not found a lot of documentation regarding this topic.
SF system services rely on the ephemeral port range to communicate. Make sure there's no overlap between application port ranges and ephemeral port ranges.
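Both ranges are declared per node type in ClusterManifest.xml, roughly like this (a sketch; your port numbers will differ):

    <NodeType Name="NodeType0">
      <Endpoints>
        <!-- ports your own services may claim -->
        <ApplicationEndpoints StartPort="20000" EndPort="30000" />
        <!-- ports system services draw from; must not overlap the range above -->
        <EphemeralEndpoints StartPort="49152" EndPort="65534" />
      </Endpoints>
    </NodeType>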

Mapping ports in Kubernetes

I'm trying to wrap my head around how kubernetes (k8s) utilises ports. Having read the API documentation as well as the available docs, I'm not sure how the port mapping and port flow work.
Let's say I have three containers with an externally hosted database, my k8s cluster is three on-prem CoreOS nodes, and there is a software-defined load balancer in front of all three nodes to forward traffic to all three nodes on ports 3306 and 10082.
Container A utilises incoming port 8080, needs to talk to Container B and C, but does not need external access. It is defined with Replication Controller A that has 1 replica.
Container B utilises incoming port 8081 to talk to Container A and C, but needs to access the external database on port 3306. It is defined with Replication Controller B that has 2 replicas.
Container C utilises incoming port 8082, needs to talk to Container A and B, but also needs external access on port 10082 for end users. It is defined with Replication Controller C that has 3 replicas.
I have three services to abstract the replication controllers.
Service A selects Replication Controller A and needs to forward incoming traffic on port 9080 to port 8080.
Service B selects Replication Controller B and needs to forward incoming traffic on ports 9081 and 3306 to ports 8081 and 3306.
Service C selects Replication Controller C and needs to forward incoming traffic on port 9082 to port 8082.
I have one endpoint for the external database, configured on port 3306 with an IPv4 address.
Goals:
Services need to abstract Replication Controller ports.
Service B needs to be reachable from an external system on port 3306 on all nodes.
Service C needs to be reachable from an external system on port 10082 on all nodes.
With that:
When would I use each of the types of ports; i.e. port, targetPort, nodePort, etc.?
Thanks for the very detailed setup, but I still have some questions.
1) When you say "Container" {A,B,C} do you mean Pod? Or are A, B, C containers in the same Pod?
2) "Container B utilises incoming port 8081 to talk to Container A and C" - What do you mean that it uses an INcoming port to talk to other containers? Who opens the connection, to whom, and on what destination port?
3) "needs to access the external database on port 3306" but later "needs to be able to be reached from an external system on port 3306" - Does B access an external database or is it serving a database on 3306?
I'm confused on where traffic is coming in and where it is going out in this explanation.
In general, you should avoid thinking in terms of nodes and you should avoid thinking about pods talking to pods (or containers to containers). You have some number of Services, each of which is backed by some number of Pods. Client pods (usually) talk to Services. Services receive traffic on a port and send that traffic to the corresponding targetPort on Pods. Pods receive traffic on a containerPort.
None of that requires hostPorts or nodePorts. The last question is which of these Services need to be accessed from outside the cluster, and what is your environment capable of wrt load-balancing.
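To make that concrete, here is a minimal sketch for the Service A case (names and image are made up; the Pod would normally be created by your Replication Controller):

    # Service: receives traffic on port 9080, sends it to targetPort 8080 on matching Pods
    apiVersion: v1
    kind: Service
    metadata:
      name: service-a
    spec:
      selector:
        app: a
      ports:
      - port: 9080
        targetPort: 8080
    ---
    # Pod: receives that traffic on containerPort 8080
    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-a
      labels:
        app: a
    spec:
      containers:
      - name: container-a
        image: myapp:latest
        ports:
        - containerPort: 8080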
If you answer this far, then I can come back for round 2 :)

Maximum TCP connections (with different IPs/containers)

I was reading this answer: Increasing the maximum number of tcp/ip connections in linux
My setup is as follows. I have multiple containers, each with their own IP, and I communicate with a process inside each of them over TCP (it's all on the same machine). As such, all the processes use the same port but have their own unique IP. Does that mean I will not run into the TCP limitations described there (since those seem to be port-based)?
Controller Process -> container 1 [with unique IP, port X]
-> container 2 [with unique IP, port X]
-> container 3 [with unique IP, port X]
-> container 5 [with unique IP, port X]
Apologies if this question is phrased in a rudimentary way, I'm new to some of this stuff. Happy to provide any additional info.
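For what it's worth, a TCP connection is identified by the 4-tuple (source IP, source port, destination IP, destination port), so each distinct container IP gives you a separate space of tuples even when the destination port repeats. A small Python sketch (the container IPs are made up, and it assumes something is actually listening on them):

    import socket

    CONTAINER_IPS = ["172.17.0.2", "172.17.0.3", "172.17.0.4"]  # hypothetical
    PORT_X = 9000  # the shared port

    conns = []
    for ip in CONTAINER_IPS:
        s = socket.create_connection((ip, PORT_X))
        # each connection is a distinct 4-tuple even though PORT_X repeats
        print("4-tuple:", s.getsockname(), "->", s.getpeername())
        conns.append(s)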

OpenMq clustering not supported for loopback addresses

If I start up a single instance of the broker on a loopback address I get the following:
[05/Sep/2014:16:45:11 BST] WARNING [B3236]: Bad bind address of portmapper service for cluster, please change imq.portmapper.hostname: Loopback IP address is not allowed in broker address localhost[localhost/127.0.0.1] for cluster
[05/Sep/2014:16:45:11 BST] WARNING [B1137]: Cluster initialization failed. Disabling the cluster service.
I have a setup (actually the Azure Compute Emulator) which allows multiple VMs/processes to be started up with their own unique IP addresses of the form 127.X.X.X, which are actually loopback addresses as far as java.net.InetAddress is concerned. So despite the fact that I am successfully using these addresses for socket-to-socket communication between those VMs/processes, I cannot use them to run an OpenMq cluster.
As a workaround I have set up the brokers to bind to a SINGLE non-loopback address and use different ports, and that works. So it's not the case that you can't cluster on one IP address.
Why was loopback disallowed?
If it is theoretically possible, is there a setting to enable it for clustering?
According to Amy Kang on the Oracle openmq users mailing list this is by design, since clustering is intended to be across multiple servers. You can however bind several brokers to one non-loopback address and use different ports.
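For example, a sketch of two brokers clustered on one non-loopback address using different ports (assuming imqbrokerd's -name, -port and -D options; the address is made up):

    imqbrokerd -name broker1 -port 7676 \
        -Dimq.cluster.brokerlist=192.168.1.10:7676,192.168.1.10:7677 &
    imqbrokerd -name broker2 -port 7677 \
        -Dimq.cluster.brokerlist=192.168.1.10:7676,192.168.1.10:7677 &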