Clustering of locators on three different nodes in apache geode - cluster-analysis

I want to clustering a locator in three different nodes or boxes in Apache Geode and if its possible, is there any way we can make Master slave feature or HA feature. Client can connect to any locator which is less occupied at a time.

Geode locators are already highly available. Clients will automatically retry to another locator if one is not available, without impacting your application.
Clients will automatically discover new locators. So it's technically possible to start all of your client point only at a single locator and let it discover the rest. However it's better practice to pass in the list of known locators when starting your client in case one locator is not available at startup.

Related

architecture pattern for microservices

I have a microservices architecture whose logs have to be sent to a remote Kafka topic.
Next to it, the consumer of this topic will send the logs to an ELK stack (an other team)
I want to have a dedicated microservice (fwk-proxy-elasticsearch) whose responsability is to collec the logs from the others one and send them to the remote kafka topic.
what's the best protocol to dispatch all the logs aggregated from my microservices to the fwk-proxy-elasticsearch microservice ?
I want this pattern to not duplicate the security configuration of the remote kafka topic. I want to centralize it in a single place.
May I use vertx event bus for that ? or kafka is beter ? or someother tool ?
May I use vertx to send message from jvm to jvm ?
Moreover, in a microservice architecture, is it a good pattern to centralize a use case in a dedicated microservice? (remote http connection for example)
On my point of view, it allows business microservices to focus on a business issue and not to worry over the protocol that the result has to be sent.
Thanks!
I believe you can use both Vert.x event bus and Kafka to propagate the logs, there are pros and cons on each approach.
While I understand the reasoning behind this decision, I would still consider a dedicated solution built for this purpose, like Fluentd, which is able to aggregate the logs and push them into multiple sources (including Kafka, via the dedicated plugin). I'm sure there are other similar solutions.
There are a couple of important benefits that I see if you use a dedicated solution, instead of building it yourself:
The level of configurability, which is definitely useful in the future (in a dedicated solution, you need to write code each time you want to build something new)
The number of destinations where you can export the logs
Support for a hybrid architecture - with a few config updates, you will be able to grab logs from non-JVM microservices

Sample REST Observable service and a remote subscriber client in Java 9/RxJava 2

Here is the background:
We have a cluster (of 3) different services deployed on various containers (like Tomcat, TomEE, JBoss) etc. Each of the services does one thing. Like one service manages a common DB and provides REST services to CRUD the db. One service puts some data into a JMS Queue, Another service reads from the Queue and updates the DB. There is a client app that makes a REST service call to one of the service that sets off creating a row in the db, pushing that row into a queue etc.
Question: We need to implement the client app so that we know at any given point in time where the processing is. How do I implement this in RcJava 2/Java 9?
First, you need to determine what functionality in RxJava 2 will benefit you.
Coordination between asynchronous sources. Since you have a) event-driven requests from one side, and b) network queries on the other sides, this is a good fit so far.
Managing a stream of data, transforming and combining from one or more sources. You have given no indication that this is required.
Second, you need to determine what RxJava 2 does not provide:
Network connections. This is provided by your existing libraries.
Database management. Again, this is provided in your existing solutions.
Now, you have to decide whether the firstlies add up to something you can benefit from, given the up-front costs of learning a new library.

Service Fabric dynamic partitioning

So I am doing some research into using Service Fabric for a very large application. One thing I need to have is a service that is partitioned by name, which seems fairly trivial at the application manifest level.
However, I really would like to be able to add and remove named partitions on the fly without having to republish the application.
Each partition represents our equivalent of a tenant, and we want to have a backend management app to add new tenants.
Each partition will be a long-running application that fires up a TCP server that uses a custom protocol, and I'll need to be able to query for the address by name from the cluster.
Is this possible with Service Fabric, and if so is there any documentation on this, or something I should be looking for?
Each partition represents our equivalent of a tenant, and we want to have a backend management app to add new tenants.
You need to rethink your model. Partitioning is for distributing data so it accessible fast, for read and write. But within the same logical container.
If you want to do some multitenant in Service Fabric you can deploy an Application multiple times to the cluster.
From Visual Studio it seems you can only have one instance of an Application. This is because in the ApplicationManifest.xml there are DefaultServices defined. This is okay for developing on the local Service Fabric cluster. For production you might want to consider deploying the application with powershell, this will open up the possibility to deploy the same application multiple times with settings for each instance(like: tenant name, security, ... )
And not only Applications can be deployed multiple times, stateful/stateless services as well. So you could have one application and for each tenant you deploy a service of a certain type. Services are findable via the naming service inside Service Fabric, see the FabricClient class for more info on that.
It is not possible to change the partition count for an existing application.
From https://azure.microsoft.com/en-us/documentation/articles/service-fabric-concepts-partitioning/#plan-for-partitioning (emphasis mine):
In rare cases, you may end up needing more partitions than you have initially chosen. As you cannot change the partition count after the fact, you would need to apply some advanced partition approaches, such as creating a new service instance of the same service type. You would also need to implement some client-side logic that routes the requests to the correct service instance, based on client-side knowledge that your client code must maintain.
You are encouraged to do up-front capacity planning to determine the maximum number of partitions you will need - and if you end up needing more, you'll need to implement some special client side handling to cope.
We had the same problem and ended up creating an instance of the service for each tenant. This is pretty easy to do and will scale to any number of tenants.

Microservice, amqp and service registry / discovery

I m studying Microservices architecture and I m actually wondering something.
I m quite okay with the fact of using (back) service discovery to make request able on REST based microservices. I need to know where's the service (or at least the front of the server cluster) to make requests. So it make sense to be able to discover an ip:port in that case.
But I was wondering what could be the aim of using service registry / discovery when dealing with AMQP (based only, without HTTP possible calls) ?
I mean, using AMQP is just like "I need that, and I expect somebody to answer me", I dont have to know who's the server that sent me back the response.
So what is the aim of using service registry / discovery with AMQP based microservice ?
Thanks for your help
AMQP (any MOM, actually) provides a way for processes to communicate without having to mind about actual IP addresses, communication security, routing, among other concerns. That does not necessarily means that any process can trust or even has any information about the processes it communicates with.
Message queues do solve half of the process: how to reach the remote service. But they do not solve the other half: which service is the right one for me. In other words, which service:
has the resources I need
can be trusted (is hosted on a reliable server, has a satisfactory service implementation, is located in a country where the local laws are compatible with your requirements, etc)
charges what you want to pay (although people rarely discuss cost when it comes to microservices)
will be there during the whole time window needed to process your service -- keep in mind that servers are becoming more and more volatile. Some servers are actually containers that can last for a couple minutes.
Those two problems are almost linearly independent. To solve the second kind of problems, you have resource brokers in Grid computing. There is also resource allocation in order to make sure that the last item above is correctly managed.
There are some alternative strategies such as multicasting the intention to use a service and waiting for replies with offers. You may have reverse auction in such a case, for instance.
In short, the rule of thumb is that if you do not have an a priori knowledge about which service you are going to use (hardcoded or in some configuration file), your agent will have to negotiate, which includes dynamic service discovery.

Rolling Over Streaming Connections During Upgrades

I am working on an application that uses Amazon Kinesis, and one of the things I was wondering about is how you can roll over an application during an upgrade without data loss on streams. I have heard about things like blue/green deployments and such, but I was wondering what is the best practice for upgrading a data streaming service so you don't loose data from your streams.
For example, my application has an HTTP endpoint that ingests data as a series of POST operations. If I want to replace the service with a newer version, how do I manage existing application streaming to my endpoint?
One common method is having a software load balancer (LB) with a virtual IP; behind this LB there would be at least two HTTP ingestion endpoints during normal operation. During upgrade, each endpoint is announced out and upgraded in turn. The LB ensures that no traffic is forwarded to an announced out endpoint.
(The endpoints themselves can be on separate VMs, Docker containers or physical nodes).
Of course, the stream needs to be finite; the TCP socket/HTTP stream is owned by one of the endpoints. However, as long as the stream can be stopped gracefully, the following flow works, assuming endpoint A owns the current ingestion:
Tell endpoint A not to accept new streams. All new streams will be redirected only to endpoint B by the LB.
Gracefully stop existing streams on endpoint A.
Upgrade A.
Announce A back in.
Rinse and repeat with endpoint B.
As a side point, you would need two endpoints with a load balanced (or master/slave) set-up if you require any reasonable uptime and reliability guarantees.
There are more bespoke methods which allow hot code swap on the same endpoint, but they are more bespoke and rely on specific internal design (e.g. separate process between networking and processing stack connected by IPC).