Akka-cluster discovering other machines in local network - scala

I'm trying to run http://typesafe.com/activator/template/akka-distributed-workers on a few machines connected to a local network.
I want the host configuration to be as transparent as possible, so in my project configuration I set just linux.local (as netty.tcp.hostname and in the seed nodes), and on each machine an Avahi daemon resolves linux.local to the appropriate IP address.
Should akka-cluster/akka-remote discover the other machines automatically via the gossip protocol, or will the above configuration not work, so that I need to explicitly set the IP address on each machine, e.g. by passing it as an argument?
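The relevant part of the configuration looks roughly like this (the actor system name and port here are placeholders, not the exact values from my project):

akka {
  actor.provider = "akka.cluster.ClusterActorRefProvider"
  remote.netty.tcp {
    hostname = "linux.local"   # resolved by Avahi on each machine
    port = 2552
  }
  cluster.seed-nodes = [
    "akka.tcp://ClusterSystem@linux.local:2552"
  ]
}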

You need to set the hostname configuration on each machine to be an address where that machine can be contacted by the other nodes in the cluster.
So unfortunately, the configuration does need to be different on each node. One way to do this is to override the host configuration programmatically in your application code.
The seed nodes list, however, should be the same for all the nodes, and also should be the externally accessible addresses.
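A minimal sketch of such a programmatic override (the actor system name, default address, and command-line handling are assumptions; the config path matches the classic netty.tcp transport used by that template):

import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

object ClusterMain extends App {
  // The externally reachable address of *this* node, e.g. passed as the first
  // command-line argument; it must be an address the other nodes can contact.
  val host = args.headOption.getOrElse("192.168.1.10")

  val config = ConfigFactory
    .parseString(s"akka.remote.netty.tcp.hostname = $host")
    .withFallback(ConfigFactory.load()) // seed nodes etc. stay in application.conf

  val system = ActorSystem("ClusterSystem", config)
}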

Related

External IP of Google Cloud Dataproc cluster changes after cluster restart

Google Cloud Dataproc has an option to stop (not delete) a cluster (master + worker nodes) and start it again, but when we do so the external IP addresses of the master and worker nodes change, which causes problems for Hue and other IP-based web UIs running on it.
Is there any option to keep the same IP after a restart?
Though Dataproc doesn't currently provide a direct option for using static IP addresses, you can use the underlying Compute Engine interfaces to add a static IP address to your master node, possibly removing the previous "ephemeral IP address".
That said, if you're accessing your UIs through external IP addresses, that presumably means you also had to manage your firewall rules to carefully limit the inbound IP ranges. Depending on what UIs you're using, if they're not using HTTPS/SSL then that's still not ideal even if you have firewall rules limiting access from other external sources.
The recommended way to access your Dataproc UIs is through SSH tunnels; you can even add the gcloud compute ssh and browser-launching commands to a shell script for convenience if you don't want to re-type all the SSH flags each time. This approach would also ensure that links work in pages like the YARN ResourceManager, since those will be using GCE internal hostnames which your external IP address would not work for.
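For example, with a hypothetical cluster master named my-cluster-m in zone us-central1-f, the tunnel-plus-browser pattern is roughly a SOCKS proxy over SSH and a browser pointed at it:

gcloud compute ssh my-cluster-m --zone=us-central1-f -- -D 1080 -N

google-chrome --proxy-server="socks5://localhost:1080" --user-data-dir=/tmp/my-cluster-m http://my-cluster-m:8088

Here 8088 is the YARN ResourceManager port; the internal hostname resolves because the browser's requests go through the proxy running on the master.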

Joining an external Node to an existing Kubernetes Cluster

I have a custom Kubernetes Cluster (deployed using kubeadm) running on Virtual Machines from an IaaS provider. The Kubernetes Nodes have no Internet-facing IP Addresses (except for the Master Node, which I also use for Ingress).
I'm now trying to join a Machine to this Cluster that is not hosted by my main IaaS provider. I want to do this because I need specialized computing resources for my application that are not offered by the IaaS.
What is the best way to do this?
Here's what I've tried already:
Run the Cluster on Internet-facing IP Addresses
I have no trouble joining the Node when I tell kube-apiserver on the Master Node to listen on 0.0.0.0 and use public IP Addresses for every Node. However, this approach is non-ideal from a security perspective and also leads to higher cost, because public IP Addresses have to be leased for Nodes that normally don't need them.
Create a Tunnel to the Master Node using sshuttle
I've had moderate success by creating a tunnel from the external Machine to the Kubernetes Master Node using sshuttle, which is configured on my external Machine to route 10.0.0.0/8 through the tunnel. This works in principle, but it seems way too hacky and is also a bit unstable (sometimes the external machine can't get a route to the other nodes; I have yet to investigate this problem further).
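The invocation on the external Machine is roughly the following (the user and master address are placeholders):

sshuttle -r user@master-public-ip 10.0.0.0/8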
Here are some ideas that could work, but I haven't tried yet because I don't favor these approaches:
Use a proper VPN
I could try to use a proper VPN tunnel to connect the Machine. I don't favor this solution because it would add an (admittedly quite small) overhead to the Cluster.
Use a cluster federation
It looks like kubefed was made specifically for this purpose. However, I think this is overkill in my case: I'm only trying to join a single external Machine to the Cluster. Using Kubefed would add a ton of overhead (Federation Control Plane on my Main Cluster + Single Host Kubernetes Deployment on the external machine).
I can't think of a better solution than a VPN here. Especially since you have only one isolated node, it should be relatively easy to make the handshake happen between this node and your master.
Routing the traffic from "internal" nodes to this isolated node is also trivial. Because all nodes already use the master as their default gateway, modifying the route table on the master is enough to forward the traffic from internal nodes to the isolated node through the tunnel.
You have to be careful with the configuration of your container network though. Depending on the solution you use to deploy it, you may have to assign a different subnet to the Docker bridge on the other side of the VPN.
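As a rough sketch of the master-side part (the interface name and subnet are hypothetical, assuming the VPN comes up as tun0 on the master and the isolated node's pod subnet is 10.244.9.0/24), it amounts to little more than:

# on the master, as root
sysctl -w net.ipv4.ip_forward=1
ip route add 10.244.9.0/24 dev tun0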

Google Container Engine: assign static IP to nodes for outbound traffic

I am using Google Container Engine to launch a cluster that connects to remote services (in a different data center / provider). The containers that connect out may not have a Kubernetes service associated with them and don't need external inbound IP addresses. However, I want to set up firewall rules on the remote machines and have a known subnet that the nodes will fall within when I expand/reduce the cluster or when a node goes down and is rebuilt.
In looking at Google Networks, they appear to be related to internal networks (e.g. 10.128.0.0, etc.). The external IP option lets me set up single static IP addresses but not a range, and I don't see how to apply that to a node; applying it to a load balancer won't change the outbound IP address.
Is there a way I can reserve a block of IP addresses for my cluster to use in my firewall rules on my remote servers? Or is there some other solution I'm missing for this kind of thing?
The proper solution for this is to use a VPN to connect the two networks. Google Cloud VPN allows you to create this on the Google side.

Getting ZooKeeper to run on Google's Compute Engine using external IPs

I have been trying to setup a ZooKeeper cluster on the Google Compute Engine and have run into some issues when using the external IPs of the machines. My cluster consists of 3 nodes on their own separate instances on GCE.
Now, when I configure each node to use the external IP of the instance they seem to be unable to communicate with each other.
zoo.cfg
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=externalIp1:2888:3888
server.2=externalIp2:2888:3888
server.3=externalIp3:2888:3888
If I configure them with their internal IP, however, everything works perfectly fine. My guess is that when ZooKeeper starts up, it binds itself to the internal IP of the instance regardless of the configurations. Because of this, when each node tries to look for the other 2 using the external IPs that they were configured, they're unable to find them.
So my question is, is there any way to make it so that ZooKeeper uses the external IP of the machine instead of the internal one? I'm relatively new to the Google Cloud Platform and to setting up hardware in general, so I'm not really sure if something like ip forwarding, firewall rules, or something else would achieve what I'm trying to do (assuming it's even possible).
According to the ZooKeeper 3.4.5 docs, you need to specify the following option:
clientPortAddress
New in 3.3.0: the address (ipv4, ipv6 or hostname) to listen for client connections; that is, the address that clients attempt to connect to. This is optional, by default we bind in such a way that any connection to the clientPort for any address/interface/nic on the server will be accepted.
Although it appears that by default, it will bind to all available IPs on the server, so theoretically, it should have worked as you have set it up.
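If you do want to set it explicitly, it is a per-node entry in zoo.cfg next to clientPort, e.g. on the first node (the address is a placeholder for that node's external IP):

clientPort=2181
clientPortAddress=externalIp1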
Important note: if ZooKeeper instances talk to each other using external IPs rather than internal IPs, you will be charged for data egress, whereas if all communication stays on the internal network (using internal IPs) within the same zone, you won't.

OpenMq clustering not supported for loopback addresses

If I start up a single instance of the broker on a loopback address I get the following:
[05/Sep/2014:16:45:11 BST] WARNING [B3236]: Bad bind address of portmapper service for cluster, please change imq.portmapper.hostname: Loopback IP address is not allowed in broker address localhost[localhost/127.0.0.1] for cluster
[05/Sep/2014:16:45:11 BST] WARNING [B1137]: Cluster initialization failed. Disabling the cluster service.
I have a setup (actually the Azure Compute Emulator) which allows multiple VMs/processes to be started up with their own unique IP addresses of the form 127.X.X.X, which are actually loopback addresses as far as java.net.InetAddress is concerned. Therefore, despite the fact that I am successfully using these addresses for socket-to-socket communication between those VMs/processes, I cannot use them to run an OpenMq cluster.
As a workaround I have set up the brokers to bind to a SINGLE non-loopback address and use different ports, and that works. So it's not the case that you can't cluster on one IP address.
Why was loopback disallowed?
If it is theoretically possible, is there a setting to enable it for clustering?
According to Amy Kang on the Oracle OpenMQ users mailing list, this is by design, since clustering is intended to run across multiple servers. You can, however, bind several brokers to one non-loopback address and use different ports.
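A sketch of that workaround with two brokers on the same host (the address, ports, and file layout are placeholders drawn from standard OpenMQ broker configuration, not from the mailing-list post):

# broker 1 config.properties
imq.portmapper.hostname=192.168.0.10
imq.portmapper.port=7676
imq.cluster.brokerlist=192.168.0.10:7676,192.168.0.10:7677

# broker 2 config.properties
imq.portmapper.hostname=192.168.0.10
imq.portmapper.port=7677
imq.cluster.brokerlist=192.168.0.10:7676,192.168.0.10:7677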