Phoenix application on a fleet of machines - deployment

I'm developing a real-time Phoenix app using its Channel and Socket modules. The app consists of a few processes and GenServers. I have a use case where, on an event (an API call from a microservice), I need to broadcast messages to all the different topics on my channel at different timestamps. I have achieved this with Process.send_after(..) on my local machine for now. My doubt is this:
on a fleet of machines, the API call will hit only a single machine in the cluster,
so the other machines wouldn't be able to send the broadcast messages, which would lead to inaccuracy. How can I let all the machines know of this particular event? Or am I doing it wrong?

Assuming you know the names of the nodes in the cluster, you might loop over the nodes, calling Node.spawn/2 on each:
defmodule Broadcaster do
  def broadcast(msg) do
    Process.send_after ...
  end

  def broadcast_everywhere(msg) do
    # `nodes` stands for the list of known node names, e.g. Node.list()
    Enum.each(nodes, fn node ->
      # if node != Node.self() do
      Node.spawn(node, fn ->
        Broadcaster.broadcast(msg)
      end)
      # end
    end)
  end
end
Uncomment the commented lines if the current node was already handled locally, and (probably) somehow ensure upfront that the nodes are connected and alive.
Also, Node.spawn_link/* might be worth a glance.

Why don't Receptionist.Subscribe's first messages in Akka contain all cluster members registered with the specified key?

I have two actors: the first already in the cluster (everything on localhost) at port 25457, and the second to be started at port 25458.
Both have the following behaviour:
val addressBookKey = ServiceKey[Message]("address_book")
val listingResponseAdapter = ctx.messageAdapter[Receptionist.Listing] {
  case addressBookKey.Listing(p) => OnlinePlayers(p)
}

Cluster(ctx.system).manager ! Join(address)
ctx.system.receptionist ! Register(addressBookKey, ctx.self)
ctx.system.receptionist ! Subscribe(addressBookKey, listingResponseAdapter)

Behaviors.receiveMessagePartial {
  case m =>
    System.err.println(m)
    Behaviors.same
}
When the second actor joins, stderr prints Set(), then Set(Actor[akka://system/user#0]), and then Set(Actor[akka://system/user#0], Actor[akka://system#localhost:27457/user#0]).
When the second actor leaves, the first actor prints Set(Actor[akka://system/user#0]) twice.
How can the second actor directly receive all cluster participants?
Why does the first actor print the set twice after the second leaves?
Thanks
Joining the cluster is an asynchronous process: you have only triggered joining by sending Join to the manager, and actually joining the cluster happens at some point after that. The receptionist can only know about registered services on nodes that have completed joining the cluster.
This means that when you subscribe to the receptionist, joining the cluster has likely not completed yet, so you get the locally registered services (because of ordering guarantees the receptionist will always get the local Register message before it receives the Subscribe). Then, once joining the cluster completes, the receptionist learns about services on other nodes and the subscriber is updated.
To be sure other nodes are known, you would have to delay the subscription until the node has joined the cluster. This can be achieved by subscribing to cluster state and subscribing to the receptionist only after the node itself has been marked as Up (see the sketch below).
In general it is often good to make something that works regardless of when cluster nodes join, as it makes it easier to test and also to run the same component without a cluster. For example, switch the behaviour of the subscribing actor between "no registered services" and "at least one", or require a minimum count of services for the service key.
I'm not entirely sure why you see the duplicated update when the actor on the other node "leaves", but there are some specifics around the CRDT used for the receptionist registry, where it may have to re-remove a service for consistency; that could perhaps explain it.
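For illustration, here is a minimal sketch of that "wait for Up" approach, reusing the names from the question (Message, OnlinePlayers, addressBookKey). NodeUp is an extra message assumed here, and the event and import names are from Akka Typed's cluster API, so check them against your Akka version:

import akka.actor.Address
import akka.actor.typed.{ActorRef, Behavior}
import akka.actor.typed.scaladsl.Behaviors
import akka.actor.typed.receptionist.{Receptionist, ServiceKey}
import akka.cluster.typed.{Cluster, Join, SelfUp, Subscribe}

sealed trait Message
final case class OnlinePlayers(players: Set[ActorRef[Message]]) extends Message
case object NodeUp extends Message // assumed extra message, not in the question

object AddressBookActor {
  val addressBookKey: ServiceKey[Message] = ServiceKey[Message]("address_book")

  def apply(address: Address): Behavior[Message] = Behaviors.setup[Message] { ctx =>
    val listingResponseAdapter = ctx.messageAdapter[Receptionist.Listing] {
      case addressBookKey.Listing(p) => OnlinePlayers(p)
    }
    // Translate the cluster's SelfUp event into our own protocol
    val selfUpAdapter = ctx.messageAdapter[SelfUp](_ => NodeUp)

    val cluster = Cluster(ctx.system)
    cluster.manager ! Join(address)
    cluster.subscriptions ! Subscribe(selfUpAdapter, classOf[SelfUp])
    ctx.system.receptionist ! Receptionist.Register(addressBookKey, ctx.self)

    Behaviors.receiveMessagePartial {
      case NodeUp =>
        // This node has completed joining, so listings will now include remote registrations
        ctx.system.receptionist ! Receptionist.Subscribe(addressBookKey, listingResponseAdapter)
        Behaviors.same
      case m =>
        System.err.println(m)
        Behaviors.same
    }
  }
}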

Invoking CloudRun endpoint from within itself

Assuming there is a Flask web server that has two routes, deployed as a CloudRun service over GKE.
@app.route('/cpu_intensive', methods=['POST'], endpoint='cpu_intensive')
def cpu_intensive():
    ...  # TODO: some actions, CPU intensive

@app.route('/batch_request', methods=['POST'], endpoint='batch_request')
def batch_request():
    ...  # TODO: invoke cpu_intensive
A "batch_request" is a batch of many same structured requests - each one is highly CPU intensive and handled by the function "cpu_intensive". No reasonable machine can handle a large batch and thus it needs to be paralleled across multiple replicas.
The deployment is configured that every instance can handle only 1 request at a time, so when multiple requests arrive CloudRun will replicate the instance.
I would like to have a service with these two endpoints: one to accept "batch_request"s and only break them down into smaller requests, and another to actually handle a single "cpu_intensive" request. What is the best way for "batch_request" to break down the batch into smaller requests and invoke "cpu_intensive" so that Cloud Run will scale the number of instances?
Making an HTTP request to localhost doesn't work, since the load balancer is not aware of these calls.
Should I keep the deployment URL in a conf file and make a network call to it?
Other suggestions?
With more detail, it's now clearer!
You have 2 responsibilities:
One to split -> many requests that can be handled in parallel, not compute intensive.
One to process -> each request must be processed on a dedicated instance because of the compute-intensive work.
If your split performs internal calls (to localhost, for example) you will stay on the same instance and parallelize nothing (you just multi-thread the same request on the same instance).
So, for this, you need 2 services:
one to split, which can accept several concurrent requests
one to process, this time with the concurrency parameter set to 1, to be sure to accept only one request at a time.
To improve your design, and if the batch processing can be asynchronous (I mean, the split process doesn't need to know when the batch process is over), you can add Pub/Sub or Cloud Tasks in the middle to decouple the two parts.
And if the processing requires more than 4 CPUs / 4 GB of memory, or takes more than 1 hour, use Cloud Run on GKE and not Cloud Run managed.
Last word: if you don't use Pub/Sub, the best way is to set the batch-process URL in an env var of your split service so it knows where to call (see the sketch below).
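The question is Python/Flask, but the pattern is language-agnostic. Here is a rough sketch of the split side, written in Scala on top of the JDK HTTP client; CPU_INTENSIVE_URL is an assumed env var name holding the processing service's URL:

import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

object Splitter {
  // CPU_INTENSIVE_URL is an assumed variable name holding the processing service's URL
  private val cpuIntensiveUrl = sys.env("CPU_INTENSIVE_URL")
  private val client = HttpClient.newHttpClient()

  // Send every sub-request concurrently so they land on separate Cloud Run instances
  def fanOut(items: Seq[String]): Seq[Int] = {
    val inFlight = items.map { body =>
      val request = HttpRequest.newBuilder(URI.create(cpuIntensiveUrl))
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build()
      client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
    }
    // Each call goes through the load balancer, which is what triggers scale-out
    inFlight.map(_.join().statusCode())
  }
}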
I believe that for this use case it's much better to use GKE rather than Cloud Run. You can create two Kubernetes deployments, one for the batch_request app and one for the cpu_intensive app. The second one will be used as a worker for the batch_request app and will scale on demand when there are more requests to the batch_request app. I believe this is called a master-worker architecture, in which you separate your app's frontend from intensive work or batch jobs.

Communication protocol

I'm developing a distributed system that consists of master and worker servers. There should be 2 kinds of messages:
Heartbeat
The master gets the state of a worker and responds immediately with an appropriate command. For instance:
Message from Worker to Master: "Hey there! I have data a, b, c"
Response from Master to Worker: "All OK, but throw away c - we don't need it anymore"
The participants exchange these messages at interval T.
Direct master command
Let's say a client asks the master to kill job #123. Here is the conversation:
Message from Master to Worker: "Alarm! We need to kill job #123"
Message from Worker to Master: "No problem! Done."
Obviously, we can't predict when this message will appear.
The simplest solution is for the master to initiate all communication for both message types (in the case of the heartbeat, we would add one more message from the master to start the exchange). But let's assume that it is expensive to do all the heartbeat housekeeping on the master side for N workers. And we don't want to waste resources keeping several TCP connections to the worker servers, so we have just one.
Is there any solution under these constraints?
First off, you have to do some bookkeeping somewhere. Otherwise, who's going to realize that a worker has died? The natural place to put that data is on the master, if you're building a master/worker system. Otherwise, the workers could be asked to keep track of each other in a long circle, or a randomized graph. If a worker notices that their accountabilibuddy is not responding anymore, it can alert the master.
The same thing applies to the list of jobs currently running; who keeps track of that? It also scales O(n), so presumably the master doesn't have space for that either. Sharding that data out among the workers (e.g. by having them keep track of what their accountabilibuddy is supposed to be doing) only works so far; if a and b crash, and a was the only one looking after b, you have just lost the list of jobs running on b (and possibly the alert that was supposed to notify you that b crashed).
I'd recommend a distributed consensus algorithm for this kind of task. For production, use something someone else has already written; they probably know what they're doing. If it's for learning purposes, which I presume, have a look at the raft consensus algorithm. It's not too hard to understand, but still highlights a lot of the complexity in distributed systems. The simulator is gold for proper understanding.
A master/worker system will never properly work with less than O(n) resources for n workers in the face of crashing workers. By definition, the master needs to control the workers, which is an O(n) job, even if some workers manage other workers. Also, what happens if the master crashes?
As Filip Haglund said, read the Raft paper; you should also implement it yourself. In a nutshell, here is what you would need to extract from it with regard to membership management.
You need to keep membership lists and the master's identity on all nodes.
Raft does its heartbeating from the master's end; it is not very expensive network-wise, and you don't need to keep connections open. Every 200 ms to a second the master sends a heartbeat; if a node doesn't reply, the master tells the slaves to remove member x from the list.
But what do you do if the master dies? Basically, you need preset candidate nodes. If a candidate hasn't received a heartbeat within the timeout, it requests votes from the rest of the cluster. If it gets even the slightest majority, it becomes the new leader (a minimal sketch of this election timeout follows).
If you want to join an existing cluster, it's basically the same as above: any node that is not the leader responds "not leader" along with the leader's address.
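As a tiny illustration of the election timeout described above (a sketch only, nowhere near a full Raft implementation), the follower-side timer could look something like this in Scala:

import scala.concurrent.duration._
import scala.util.Random

// Each follower resets this timer whenever a heartbeat arrives; if it ever
// expires, the node becomes a candidate and requests votes from the cluster.
final class ElectionTimer(min: FiniteDuration = 150.millis, max: FiniteDuration = 300.millis) {
  private var deadline: Deadline = next()

  private def next(): Deadline = {
    val spreadMs = (max - min).toMillis.toInt
    (min.toMillis + Random.nextInt(spreadMs + 1)).millis.fromNow // randomised to avoid split votes
  }

  def onHeartbeat(): Unit = deadline = next() // leader is alive: restart the countdown
  def expired: Boolean = deadline.isOverdue() // true => start an election (request votes)
}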

How to establish IP communication between two forked parts of a Perl script

I have to write a program that serves multiple clients that access multiple resources (webcams) at the same time.
Example: clients A and B both ask for the current position of the two pan-tilt cameras A and B.
I have to avoid the clients talking directly to the cameras (as there can be many clients).
So my idea was to have a process for each client (which connects through a socket) and a process for each cam.
If a client requests the position of cam A, the program forks a new process for that cam, and that process polls the cam position repeatedly for 10 seconds and then exits. Within that 10-second period, each position request from any client should be served by this cam-A process.
The problem is: how can the cam processes communicate with the client processes?
My naive approach is to use global variables (camA-posX, camA-posY, camB-posX, camB-posY, ...) that the cam processes write to and the client processes read from. I don't even know whether globals shared between forked processes are possible at all.
My second approach is to use pipes as in perlipc/Safe Pipe Opens, but this only covers parent-child communication.
Another problem: there must be someone (the parent process?) who decides whether I have to fork a new cam process or whether one is still running.
Maybe it's even better to write two programs (using the second approach), one for the clients and one for the cameras, that communicate with each other through a single socket.
If the number of cams and clients rises, there might even be a need to scale the whole thing to distribute the load.
You can't use global variables. Once the processes are forked, they no longer share memory space, and therefore global variables are distinct between them. You can only do this with threads, and using shared memory for communication needs to be done very carefully (as does anything in concurrent multithreaded programming :)
For lower-level IPC, use IPC::Msg.
To be honest, if you need to worry about scaling, I would seriously recommend looking outside the IPC box and using a real database to manage your communication.
It can be either a relational database or a NoSQL one, as long as it guarantees transaction atomicity. MySQL should work perfectly fine.
Another similar approach (if a DB is a bit of an overkill) is to use message queues, as discussed here: "A queueing system for Perl"
Some other solutions discussed:
What's the fastest Perl IPC/message queue for a single machine?
Message queues in Perl, PHP, Python lists several options for message queues.

Select remote actor on random port

In Scala, if I register a remote actor using alive(0), the actor is registered at a random port.
I can do the registration like this: register('fooParActor, self) in the act method.
Now, on the master/server side, I can select an actor by supplying the port. Do I need to manually scan the ports in order to use random ports?
The problem I am trying to solve is to create n actors on a node and then select them all from a master/server program, e.g. start 10 slaves on node x and get a list of 10 remote actors on node y.
How is this done?
There's no need to register various ports for the actors. Instead you need one port for the whole actor system - more precisely, for the Akka kernel (which the server needs to know too). See this page of the documentation for how all of this works in detail.
In order to select a remote actor, you can then look it up via its path in the remote actor system, with something like this:
context.actorFor("akka://actorSystemName#10.0.0.1:2552/user/someActorName/1")
In that case, you would have created the n actors as children of the someActorName actor and given them the names 1 to n (so you could get the others via .../someActorName/2, .../someActorName/3 and so on).
There's no need to randomize anything at all here; given how you described the problem, you simply start the 10 actors and number them from 1 to 10. Random numbers would just unnecessarily complicate things (see the sketch below).
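A minimal sketch of that layout on the worker node's side (classic Akka; Worker and SomeActorName are placeholder names for illustration):

import akka.actor.{Actor, ActorSystem, Props}

// Placeholder worker; your real actor goes here
class Worker extends Actor {
  def receive = { case msg => println(s"${self.path.name} got $msg") }
}

// Parent named "someActorName" whose children "1".."n" are then reachable remotely
// under /user/someActorName/1, /user/someActorName/2, ... as in the path above
class SomeActorName(n: Int) extends Actor {
  (1 to n).foreach(i => context.actorOf(Props[Worker](), name = i.toString))
  def receive = { case msg => context.children.foreach(_ forward msg) }
}

object WorkerNode extends App {
  val system = ActorSystem("actorSystemName")
  system.actorOf(Props(new SomeActorName(10)), name = "someActorName")
}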
As for truly random ports, I can only agree with sourcedelica: you need a fixed port to communicate the random ones, or some other means of communication. If nobody knows where to send messages because of the random port, it simply won't work.
You need to have at least one ActorSystem with a well-known port. Then the other ActorSystems can use port 0 to have Akka assign a random port. The slave ActorSystems have their actors register with an actor on the master, so the master knows about all of the remote systems.
If you absolutely need your master to use a random port as well, it will need to communicate that port out of band (using a shared filesystem or database).
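For example, a sketch using classic Netty remoting config keys (newer Akka versions use akka.remote.artery.canonical.port instead, so adjust for your version):

import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

object RemotingSetup extends App {
  // The master binds a well-known port; slaves pass 0 so Akka picks a free port for them
  def remotingConfig(port: Int) = ConfigFactory.parseString(
    s"""
       |akka.actor.provider = remote
       |akka.remote.netty.tcp.hostname = "127.0.0.1"
       |akka.remote.netty.tcp.port = $port
       |""".stripMargin).withFallback(ConfigFactory.load())

  val masterSystem = ActorSystem("master", remotingConfig(2552)) // fixed, known to every slave
  val slaveSystem  = ActorSystem("slave", remotingConfig(0))     // random free port
  // Each slave then sends a "register" message to a well-known actor on the master,
  // so the master learns the slaves' (random) addresses without anyone scanning ports
}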