I'm investigating the use of Scala/Play/Akka on Heroku and I'm curious about something. Suppose I have an application structured as a network of Akka actors. Some of the actors will be in-process with the web application itself, but I may want to set aside one or more nodes as dedicated Akka actors: for example, a group of cache manager actors.
To configure Akka remoting, I need to supply IP addresses in akka.conf. But since Heroku nodes are somewhat ephemeral, I won't know each node's address at the time I write the config file.
It might simplify things to have a central "registration" node, but even there, I won't know the IP address of that node in advance.
So how do my Akka nodes refer to each other on Heroku?
Heroku dynos can't talk directly to each other, so you will have to use an external messaging system like RabbitMQ. There is a great article on the Heroku Dev Center about how to coordinate Akka actors in this way:
https://devcenter.heroku.com/articles/scaling-out-with-scala-and-akka
My company is interested in using a stand-alone Service Fabric cluster to manage communications with robots. In our scenario, each robot would host its own rosbridge server, and our Service Fabric application would maintain WebSocket clients to each robot. I envision a stateful service partitioned along device ids which opens connections on startup. It should monitor connection health via heartbeats, pass messages from the robots to some protocol gateway service, and listen to other services for messages to pass to the robots.
I have not seen discussion of this style of external communications in the Service Fabric documentation - I cannot tell if this is because:
There are no special considerations for managing WebSockets (or any two-way network protocol) this way from Service Fabric. I've seen no discussion of restrictions and see no reason, conceptually, why I can't do this. I originally thought replication would be problematic (duplicate messages?), but since only one replica can be primary at any time this appears to be a non-issue.
Service Fabric is not well-suited to bi-directional communication with external devices
I would appreciate some guidance on whether this architecture is feasible. If not, discussion on why it won't work will be helpful. General discussion of limitations around bi-directional communication between Service Fabric services and external devices is welcome. I would prefer if we could keep discussion to stand-alone clusters - we have no plans to use Azure services at this time.
Any particular reason you want SF to host the client and not the other way around?
Doing it the way you suggest, I think you will face big challenges getting SF to find these devices on your network and keep track of them (for example firewalls, IPs, NAT, planned maintenance, failures and connection issues), unless you are planning to do it by hand.
From the brief description I saw in the docs you provided about the rosbridge server, I understood that you have to host it on a server (like you would with a Service Fabric service) and your devices would connect to it; in this case, your devices would have ROS installed to make this communication.
Regarding your concerns about the communication: Service Fabric services are just executable programs you would normally run on your local machine. If it works there, it will likely work in an on-premises Service Fabric environment. The only extra things you have to take care of are external access to the cluster (if in Azure, or network configuration) and service discovery.
In my point of view, you should use SF as the central point of communication, and each device would connect to SF services.
The other approach would be using Azure IoT Hub to bridge the communication between both. There is a nice IoT Hub + Service Fabric sample that might be suitable for your needs.
Because you want to avoid Azure, you could in this case replace IoT Hub with another messaging platform or implement the rosbridge in your service to handle the calls.
I hope I understood everything right.
About the obstacles:
I think the major issue here is the bi-directional connection to be established between the service replica and the robot.
This has two major problems:
Only the primary replica has write access, i.e. only one replica would be able to modify state. This issue could be mitigated by creating a separate partition for each robot (but please remember that you can't change the partition count after the service was created) or by creating a separate service instance for each robot (this would allow you to dynamically add or remove robots but would require additional logic related to service discoverability).
The replica can be shut down (terminated), moved to another node (shutdown and start of a new replica) or even demoted (the primary replica gets demoted to secondary and another secondary replica gets promoted to primary) for various reasons. So the service code and robot communication code should be able to handle this.
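A minimal sketch of the second point, written in plain JavaScript purely for illustration (this is not the Service Fabric API). The idea is that the robot connection follows the replica's role: open it on promotion to primary, close it on demotion or shutdown, so that a newly promoted replica elsewhere can take over. `RobotLink`, `onRoleChange` and the `connect` callback are hypothetical names made up for this example.

```javascript
// Hypothetical sketch: plain JavaScript, not the Service Fabric API.
class RobotLink {
  // `connect` is a caller-supplied function returning an object with close().
  constructor(connect) {
    this.connect = connect;
    this.connection = null;
  }

  // Call this whenever the hosting replica is told its role changed.
  onRoleChange(role) {
    if (role === 'primary') {
      // Promoted: open the connection if this replica does not hold one yet.
      if (this.connection === null) this.connection = this.connect();
    } else {
      // Demoted or otherwise not primary: give the connection up.
      this.close();
    }
  }

  // Also call this on shutdown so the robot sees a clean disconnect.
  close() {
    if (this.connection !== null) {
      this.connection.close();
      this.connection = null;
    }
  }
}
```

The robot side would still need to tolerate the gap between one replica closing the connection and the next one opening it.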
About WebSockets
This looks possible by implementing a custom ICommunicationListener and other things using WebSockets.
I could not find any broadcast or pub/sub pattern between Reliable Services in any documentation. Did I miss anything?
My use case is that we need to send a custom event to all the SF stateful service replicas in the cluster if there is any state change in any primary replica.
I am aware of the Reliable State Manager events, which are triggered when anything changes in the Reliable Collections.
Are there any other broadcast or pub/sub events for communicating between the service replicas of the cluster?
Thanks,
Ashish
Did you see this oss project and package? It allows pub/sub messaging between services.
Why reinvent the wheel?
Service Fabric does not contain a brokered messaging engine because:
There are lots of options already available on the market for this.
It would make your system tightly coupled with the Service Fabric runtime.
Why not just use Service Bus Pub/Sub Topics?
If the concern is latency, why not run RabbitMQ, ActiveMQ or any other messaging system as a guest executable service or maybe inside a container?
If SF had this feature, you would have to write your services to depend on it. Once you start adding external dependencies, you are going to face an integration challenge forwarding these events to systems outside your cluster: you would have to create a service that listens to these events just to forward them to another queue/topic.
It will just add extra work, complexity and maintenance to your solution.
My use case is that I want to set up a cluster of nodes which run Akka Actors. Each actor would be an instance of the same actor to handle a WebSocket connection to a certain user.
Each actor would register itself with a unique path. On a non-clustered setup I can simply call an actor by its path like system.actorSelection(s"user/$client") where $client is a unique name to an actor instance. I have to pass messages to these actors so they can then send it back to their respective WebSocket client.
Apparently Akka Cluster offers a variety of setups: http://doc.akka.io/docs/akka/current/scala/cluster-usage.html
I want to run my nodes on Kubernetes, where I can't reliably configure instance names/domains as instances will be coming and going.
What is the simplest set up for Akka Cluster in this scenario?
I don't see Kubernetes having any impact here. For your case, I think Akka Cluster Sharding is a good fit: use the shard region to reach the right sharded actor and send the message to it directly. Each Docker instance just needs to join the cluster as a node; then you don't need a fixed address to find an actor, and instances can join and leave dynamically.
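As a rough illustration of why sharding removes the need for fixed addresses, here is a sketch in plain JavaScript (not Akka code) of the routing idea: the sender only knows the client id, and a stable hash maps that id to a shard, which the cluster is then free to place on whichever node is currently alive. The names `javaStyleHash`, `shardFor` and `envelope`, and the shard count used in the usage note, are assumptions made up for this example; in Akka you supply the equivalent id-to-shard mapping when you start the shard region.

```javascript
// Conceptual sketch only: plain JavaScript, not the Akka API.
// All names here are hypothetical.

// 32-bit string hash in the style of Java's String.hashCode.
function javaStyleHash(s) {
  let h = 0;
  for (let i = 0; i < s.length; i++) {
    h = (Math.imul(31, h) + s.charCodeAt(i)) | 0;
  }
  return h;
}

// Map an entity id (for example a WebSocket client name) to a shard id.
function shardFor(entityId, numberOfShards) {
  return Math.abs(javaStyleHash(entityId)) % numberOfShards;
}

// The sender addresses a message by entity id, never by node address.
function envelope(clientId, payload, numberOfShards) {
  return { shard: shardFor(clientId, numberOfShards), entityId: clientId, payload };
}
```

For example, `envelope('a', 'hi', 10)` yields `{ shard: 7, entityId: 'a', payload: 'hi' }` no matter which node computes it, which is what lets any node forward the message without knowing where the target actor lives.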
I'm fairly new to Akka and new to distributed programming in general. Using Akka's Mist component, I've created supervised actors to handle HTTP requests asynchronously. Everything is currently running on one physical machine with local actors. What I don't understand is how to build a truly fault-tolerant system with more than one box. As stated in the Akka docs:
Also, you (usually) need to know if one box is down and/or the service you are talking to on the other box is down. Here actor supervision/linking is a critical tool for not only monitoring the health of remote services, but to actually manage the service, do something about the problem if the actor or node is down. Such as restarting actors on the same node or on another node.
How do I do this? I'm looking for an example or pointers on how to begin making my application distributed. Other services in our group use Apache gateways in front of multiple Tomcat instances, so the event of a Tomcat server going down is transparent to the user. I'm deploying my service to the Akka microkernel and need to achieve a similar level of high availability across more than one physical box.
I'm using Akka 1.1.3.
Remote supervision works only with client-managed remote actors for the Akka 1.x series.
Akka 2.0, which is currently under development, will support transparent clustering, cluster-wide supervision and cluster-wide lifecycle monitoring.
You might consider putting an HTTP load balancer in front of Akka Microkernel instances running Mist; this would match what your group does with 'Apache gateways'.
Another approach would be to expose remote actors on a number of instances and then use Akka's LoadBalancer or Actor Pool to send messages around, see here
The second approach is a bit of a pain if you have a dynamic pool of machines, because the pool of machines has to be specified programmatically. Akka 2.0 addresses this with cluster support that is set up in the akka.conf file.
As far as the release date of 2.0 goes, for what it's worth, 1.2 was just recently released on 2011-Sept-19.
Running a web server on node.js is a simple thing to do (as seen by its excellent examples and documentation) but I wonder how you can fully use the CPU resources of a dedicated server?
Since node.js is single-threaded the only way to take advantage of multiple processors is via multiple processes. Of course, only one process can bind to a port so it seems there would have to be a master/worker pattern wherein the master forks children, binds to the incoming port, and delegates incoming connections (and the actual processing work) to the children. (Perhaps via a hungry-consumer pattern?)
Is this the best way to scale a web server running node.js? If so, are there libraries to simplify the master/worker pattern? If not, what patterns or deployment setups are recommended to best use the entire resources of a dedicated machine?
(Is this a better question for ServerFault?)
Multi-node is a library that provides the master/worker pattern.
If the server processes don't need to be able to talk to each other, and you aren't using Socket.IO, a simple option would be to just start one process per core, bind each to its own local port, and use something like nginx or HAProxy to load balance between them.
If you're using express, I'd use tj's Cluster: http://learnboost.github.com/cluster/
It provides 'transparent' CPU-based load balancing, which is nice because you can use your existing express app, and it scales it across cores relatively painlessly.
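For reference, Node itself ships a `cluster` module in its standard library that follows the same master/worker idea. Below is a minimal sketch using that built-in module; note that it is a different mechanism from the Cluster library linked above, swapped in here only because it needs no extra dependency. The helper names `workerCount` and `start` and the port number are assumptions for this example.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Hypothetical helper: use the requested worker count if it is a positive
// integer, otherwise one worker per CPU core, and never fewer than one.
function workerCount(requested, cpuCount) {
  const n = Number.isInteger(requested) && requested > 0 ? requested : cpuCount;
  return Math.max(1, n);
}

// Hypothetical entry point. The forked workers run this same file, so the
// same call ends up in the worker branch inside each child process.
function start(port, requested) {
  // Newer Node versions expose isPrimary; older ones only have isMaster.
  const isPrimary =
    cluster.isPrimary === undefined ? cluster.isMaster : cluster.isPrimary;

  if (isPrimary) {
    const count = workerCount(requested, os.cpus().length);
    for (let i = 0; i < count; i++) cluster.fork();

    // Replace a worker that dies so capacity stays constant.
    cluster.on('exit', (worker) => {
      console.log(`worker ${worker.process.pid} exited, forking a replacement`);
      cluster.fork();
    });
  } else {
    // Workers all listen on the same port; the cluster module takes care of
    // distributing incoming connections between them.
    http
      .createServer((req, res) => {
        res.end(`handled by pid ${process.pid}\n`);
      })
      .listen(port);
  }
}

// start(8000); // uncomment to run one worker per core on port 8000
```

An existing express app can be dropped into the worker branch in place of the bare `http.createServer` call.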