Apache Thrift Service Auto Discovery

I want to develop some local network services using apache thrift. There should be multiple services waiting for ONE master to connect to them and use them exclusively until the master releases them. The services are written in multiple languages.
I chose Thrift because I need a simple remote procedure call mechanism for communication between the services that is fast and supports multiple languages. While Thrift is good for RPC, I need some mechanism to locate the services' TCP addresses and ports via auto-discovery, so that I can connect the Thrift servers and clients to each other without hardwiring the addresses.
What possibilities do I have for auto-discovering services of this sort?
Thanks!

There is nothing you can just plug into your scheme of things. You can build something like this using Apache ZooKeeper. Netflix's Curator provides a good set of tools for building it on top of ZooKeeper. See https://github.com/Netflix/curator
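As a rough illustration, here is a minimal sketch of how a service could register itself in ZooKeeper and how the master could look it up, using Curator's service discovery recipe. The package names below are from the later Apache Curator releases, and the ensemble address, service name, address, and port are made-up placeholders:

import java.util.Collection;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.curator.x.discovery.ServiceDiscovery;
import org.apache.curator.x.discovery.ServiceDiscoveryBuilder;
import org.apache.curator.x.discovery.ServiceInstance;

public class ThriftServiceRegistry {
    public static void main(String[] args) throws Exception {
        // Connect to the ZooKeeper ensemble (address is an assumption).
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "zk1:2181,zk2:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        // Each Thrift service registers itself under a well-known base path.
        ServiceInstance<Void> instance = ServiceInstance.<Void>builder()
                .name("image-processor")   // hypothetical service name
                .address("192.168.0.17")   // where this Thrift server listens
                .port(9090)                // the Thrift server port
                .build();

        ServiceDiscovery<Void> discovery = ServiceDiscoveryBuilder.builder(Void.class)
                .client(client)
                .basePath("/thrift-services")
                .thisInstance(instance)
                .build();
        discovery.start();

        // The master queries ZooKeeper for addresses instead of hardwiring them.
        Collection<ServiceInstance<Void>> found =
                discovery.queryForInstances("image-processor");
        found.forEach(s -> System.out.println(s.getAddress() + ":" + s.getPort()));
    }
}

The "exclusive use until the master releases them" requirement could be layered on top of the same ensemble with one of Curator's lock recipes, e.g. InterProcessMutex.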

Related

Service Fabric Strategies for Bi-Directional Communication with External Devices

My company is interested in using a stand-alone Service Fabric cluster to manage communications with robots. In our scenario, each robot would host its own rosbridge server, and our Service Fabric application would maintain WebSocket clients to each robot. I envision a stateful service partitioned along device ids which opens connections on startup. It should monitor connection health via heartbeats, pass messages from the robots to some protocol gateway service, and listen to other services for messages to pass to the robots.
I have not seen discussion of this style of external communications in the Service Fabric documentation - I cannot tell if this is because:
There are no special considerations for managing WebSockets (or any two-way network protocol) this way from Service Fabric. I've seen no discussion of restrictions and see no reason, conceptually, why I can't do this. I originally thought replication would be problematic (duplicate messages?), but since only one replica can be primary at any time this appears to be a non-issue.
Service Fabric is not well-suited to bi-directional communication with external devices
I would appreciate some guidance on whether this architecture is feasible. If not, discussion on why it won't work will be helpful. General discussion of limitations around bi-directional communication between Service Fabric services and external devices is welcome. I would prefer if we could keep discussion to stand-alone clusters - we have no plans to use Azure services at this time.
Any particular reason you want SF to host the client and not the other way around?
Doing it the way you suggest, I think you will face big challenges making SF find these devices on your network and keep track of them: firewalls, IP assignment, NAT, planned maintenance, failures, and connection issues, unless you are planning to manage all of that by hand.
From the brief description in the rosbridge docs you provided, I understand that you would normally host it on a server (like you would a Service Fabric service) and have your devices connect to it; in that case, your devices would need ROS installed to handle this communication.
Regarding your concerns about the communication: Service Fabric services are just executable programs like the ones you would normally run on your local machine. If it works there, it will likely work in an on-premises Service Fabric environment; the only extra care you need is around external access to the cluster (Azure or network configuration) and service discovery.
From my point of view, you should use SF as the central point of communication and have each device connect to the SF services.
The other approach would be using Azure IoT Hub to bridge the communication between the two. There is a nice IoT Hub + Service Fabric sample that might be suitable for your needs.
Since you want to avoid Azure, you could replace IoT Hub with another messaging platform, or host the rosbridge in your service to handle the calls.
I hope I understood everything right.
About the obstacles:
I think the major issue here is that the bi-directional connection is established between a specific service replica and the robot.
This has two major problems:
Only the primary replica has write access, i.e. only one replica is able to modify state. This could be mitigated by creating a separate partition for each robot (but please remember that you can't change the partition count after the service has been created) or by creating a separate service instance for each robot (this would allow you to dynamically add or remove robots but would require additional logic around service discoverability).
A replica can be shut down (terminated), moved to another node (shutdown plus start of a new replica), or even demoted (the primary replica gets demoted to secondary and another secondary replica gets promoted to primary) for various reasons. So the service code and the robot communication code should be able to handle this.
About WebSockets
This looks possible by implementing a custom ICommunicationListener and handling the WebSocket communication yourself.
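The connection handling itself is not Service Fabric specific. Here is a minimal, SF-agnostic sketch of the heartbeat-and-reconnect behavior described above, using the JDK 11 java.net.http.WebSocket client; the robot URI, the heartbeat interval, and the forwarding logic are assumptions:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.WebSocket;
import java.nio.ByteBuffer;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class RobotConnection implements WebSocket.Listener {
    private final ScheduledExecutorService heartbeats =
            Executors.newSingleThreadScheduledExecutor();

    public void connect(URI robotUri) {
        HttpClient.newHttpClient()
                .newWebSocketBuilder()
                .buildAsync(robotUri, this)
                .thenAccept(ws -> heartbeats.scheduleAtFixedRate(
                        // Ping frames double as the connection-health heartbeat.
                        () -> ws.sendPing(ByteBuffer.allocate(0)),
                        5, 5, TimeUnit.SECONDS));
    }

    @Override
    public CompletionStage<?> onText(WebSocket ws, CharSequence data, boolean last) {
        // Forward the rosbridge message to the protocol gateway service here.
        ws.request(1); // ask for the next incoming message
        return null;
    }

    @Override
    public CompletionStage<?> onClose(WebSocket ws, int statusCode, String reason) {
        // On replica move/demotion or network failure, reconnect from the new primary.
        return null;
    }
}

In a stateful service, connections like this would be started when the replica becomes primary and torn down when it is demoted or closed.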

How do config tools like Consul "push" config updates to clients?

There is an emerging trend of ripping global state out of traditional "static" config management tools like Chef/Puppet/Ansible, and instead storing configurations in some centralized/distributed tool, of which the main players appear to be:
ZooKeeper (Apache)
Consul (Hashicorp)
Eureka (Netflix)
Each of these tools works differently, but the principle is the same:
Store your env vars and other dynamic configurations (that is, stuff that is subject to change) in these tools as key/value pairs
Connect to these tools/services via clients at startup and pull down your config KV pairs. This typically requires the client to supply a service name ("MY_APP"), and an environment ("DEV", "PROD", etc.).
There is an excellent Consul Java client which explains all of this beautifully and provides ample code examples.
My understanding of these tools is that they are built on top of consensus algorithms such as Zab, Paxos and Gossip that allow config updates to spread almost virally, with eventual consistency, throughout your nodes. So the idea there is that if you have a myapp app that has 20 nodes, say myapp01 through myapp20, if you make a config change to one of them, that change will naturally "spread" throughout the 20 nodes over a period of seconds/minutes.
My problem is: how do these updates actually deploy to each node? In none of the client APIs (the one I linked to above, the ZooKeeper API, or the Eureka API) do I see some kind of callback functionality that can be set up and used to notify the client when the centralized service (e.g. the Consul cluster) wants to push and reload config updates.
So I ask: how is this supposed to work (dynamic config deployment and reload on clients)? I'm interested in any viable answer for any of those 3 tools, though Consul's API seems to be the most advanced IMHO.
You could use cfg4j for that. It's a Java configuration library for distributed services. It supports Consul as one of the configuration sources.
That's a nice question. I can tell you how the Consul HTTP client works.
I also initially thought it worked as a push mechanism, but while recently exploring Consul I found that all Consul clients poll the server for the changes they want to watch. It is a slightly different polling mechanism, though: Consul supports blocking queries. These are HTTP requests with a maximum timeout of 10 minutes. Such a query waits until there is some change on the watched key/folder and then returns with the latest index. If the index has changed, the client reloads the configuration. For more info: Consul Blocking Query
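As a minimal sketch of that long-polling loop against Consul's KV HTTP API (the key name, the local agent address, and the 5-minute wait are assumptions; a real client would also need error handling and backoff):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ConsulWatch {
    public static void main(String[] args) throws Exception {
        HttpClient http = HttpClient.newHttpClient();
        long index = 0;

        while (true) {
            // The request blocks server-side (up to `wait`) until the key's
            // ModifyIndex passes the index we already know about.
            HttpRequest request = HttpRequest.newBuilder(URI.create(
                    "http://localhost:8500/v1/kv/config/MY_APP?index=" + index + "&wait=5m"))
                    .GET().build();
            HttpResponse<String> response =
                    http.send(request, HttpResponse.BodyHandlers.ofString());

            long newIndex = Long.parseLong(
                    response.headers().firstValue("X-Consul-Index").orElse("0"));
            if (newIndex != index) {
                index = newIndex;
                // The index moved: the value changed, so reload the configuration.
                System.out.println("config changed: " + response.body());
            }
        }
    }
}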

Spring Data-Couchbase Client Configuration with moxi Client Possible?

We are running client-side MOXI on the same machine as our Tomcat servers, with MOXI currently talking to a cluster of membase servers on 3 different machines. The Java clients talk to MOXI using spymemcached, communicating with MOXI through data port 11211.
We are going to migrate to Couchbase now and, from a development perspective, we'd like to use Spring Data with Couchbase, but our infrastructure team wants to keep MOXI on the client machines and communicate only via port 11211. It seems that when configuring the Couchbase client this won't work, as MOXI doesn't proxy port 8091 (the admin port) that the CouchbaseClient class uses to discover the Couchbase cluster. Does this mean that if we keep our current infrastructure, Spring Data is off the table?
I am new to this and have gone through the Couchbase documentation, and it seems like what I want to do isn't possible, but I would like to confirm this. Currently, to configure spring-data I am using this:
<couchbase:couchbase bucket="appsbucket" password="" host="localhost"/>
<couchbase:repositories base-package="com.pathto.myrepositories"/>
Localhost is where MOXI is running, but the assumption made by the couchbase bean (the CouchbaseClient configuration) is that the Couchbase admin port is available on port 8091. Of course, if I instead point it toward one of the servers hosting Couchbase, I don't have an issue, other than our infrastructure team taking umbrage at this configuration.
Once you move to Couchbase with a smart client there isn't really much value in moxi; and in fact you'll be introducing an additional network hop (client -> moxi; moxi -> cluster).
You can think of the smart clients as conceptually having an embedded moxi - as the smart clients are aware of the cluster topology and know which node to communicate with to access a given document.
I suggest you take a look at the Deployment strategies section in the Couchbase admin guide, which explains all this in more detail.
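For comparison, a minimal sketch of bootstrapping the 1.x smart client (CouchbaseClient) directly against the cluster's REST port, which is what replaces the moxi hop; the node host names are placeholders, and the bucket matches the question's config:

import java.net.URI;
import java.util.Arrays;
import java.util.List;
import com.couchbase.client.CouchbaseClient;

public class DirectBootstrap {
    public static void main(String[] args) throws Exception {
        // The smart client bootstraps from the cluster's REST interface (port 8091)
        // and learns the topology itself, so no moxi hop is needed.
        List<URI> nodes = Arrays.asList(
                URI.create("http://couchbase01:8091/pools"),
                URI.create("http://couchbase02:8091/pools"));
        CouchbaseClient client = new CouchbaseClient(nodes, "appsbucket", "");

        // vBucket-aware: the write goes straight to the node owning the key.
        client.set("greeting", 0, "hello").get();
        System.out.println(client.get("greeting"));
        client.shutdown();
    }
}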

Can I use a ES client that creates a local node and joins a cluster over TCP on Heroku?

I'm hoping to connect to a remote ElasticSearch cluster from a Scala app running on Heroku. I've never used ES before, but as I understand it the most efficient way to connect from Java/Scala is to create a local data-less node that joins the cluster you want to query and talks to it over its native TCP interface. Is it possible (and / or allowed) to use such a client on Heroku?
Not at the time of writing; you have to use the HTTP interface. This will restrict your choice of client libraries too.
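As a minimal sketch of what the HTTP route looks like from the JVM (shown in Java, but the same call works from Scala; the host, index, and query are made up):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class EsHttpSearch {
    public static void main(String[] args) throws Exception {
        // A plain REST call over port 9200 -- no cluster membership required,
        // which is why it works from a sandboxed platform like Heroku.
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://my-es-host:9200/articles/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        "{\"query\": {\"match\": {\"title\": \"elasticsearch\"}}}"))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}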

Horizontal scaling of node.js server instances on a single machine

Running a web server on node.js is a simple thing to do (as seen in its excellent examples and documentation), but I wonder how you can fully use the CPU resources of a dedicated server.
Since node.js is single-threaded the only way to take advantage of multiple processors is via multiple processes. Of course, only one process can bind to a port so it seems there would have to be a master/worker pattern wherein the master forks children, binds to the incoming port, and delegates incoming connections (and the actual processing work) to the children. (Perhaps via a hungry-consumer pattern?)
Is this the best way to scale a web server running node.js? If so, are there libraries to simplify the master/worker pattern? If not, what patterns or deployment setups are recommended to best use the entire resources of a dedicated machine?
(Is this a better question for ServerFault?)
Multi-node is a library that provides the master/worker pattern.
If the server processes don't need to talk to each other, and you aren't using Socket.IO, a simple option would be to start one process per core, bind each to a local port, and use something like nginx or HAProxy to load balance between them.
If you're using express, I'd use tj's Cluster: http://learnboost.github.com/cluster/
It provides 'transparent' cpu based load balancing, which is nice because you can use your existing express app, and it scales it across cores relatively painlessly.