Can Log4j's SocketAppenders support a many-to-one (or 3-to-2) pipeline?

I would like to implement the following logging architecture:
whereby I have three clients, each logging via a <SocketAppender> to a single load balancer (Nginx in this case), and just two log listeners (LogServer1, LogServer2) consuming the TCP traffic from all three clients.
Can all three apps' messages be consumed simultaneously like this? I have been prototyping, and I repeatedly find that each client gets exclusive access to one listener, while a latecomer client's messages are dropped completely. It seems there must be N servers to handle N clients. Is this true?
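For context on the behaviour you are seeing: a log4j 1.x SocketAppender opens a single long-lived TCP connection and streams serialized LoggingEvents over it, so a TCP load balancer such as Nginx's stream module can only distribute whole connections, not individual messages. A minimal client-side sketch (the LB hostname and port here are assumptions):

    import org.apache.log4j.Logger;
    import org.apache.log4j.net.SocketAppender;

    public class ClientLogging {
        public static void main(String[] args) {
            // One long-lived TCP connection to the LB; log4j streams
            // serialized LoggingEvent objects over this single socket.
            SocketAppender appender =
                    new SocketAppender("nginx-lb.example.com", 4560); // assumed LB address
            appender.setReconnectionDelay(10000); // ms between reconnect attempts
            Logger.getRootLogger().addAppender(appender);

            Logger.getLogger(ClientLogging.class).info("hello from client 1");
        }
    }

With per-connection balancing, each of the three clients stays pinned to whichever listener its connection landed on, which would match the exclusivity you observed; messages would only be dropped while a client is reconnecting.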

Related

What are the options for routing HTTP connections to one specific instance out of many behind a load balancer?

Assume there is a system that accepts millions of simultaneous WebSocket connections from client applications. I was wondering if there is a way to route WebSocket connections to a specific instance behind a load balancer (or IP/Domain/etc) if clients provide some form of metadata, such as hash key, instance name, etc.
For instance, let's say each WebSocket client of the above system will always belong to a group (e.g. a max group size of 100), and it will attempt to communicate with the 99 other clients using the above system as a message gateway.
So the system's responsibility is to relay messages sent from clients in a group to the other 99 clients in the same group. Clients won't ever need to communicate with clients who belong to different groups.
Of course, one way to tackle this problem is to use a pub/sub system: regardless of which instance clients are connected to, the server can simply publish the message to the pub/sub system with a group identifier, and other clients can subscribe to messages with that group identifier.
However, the pub/sub system can potentially encounter scaling challenges, excessive resource usage (a single message getting published to thousands of instances), management overhead, increased latency, cost, and so on.
If it were possible to guarantee that the WebSocket clients in a group all connect to the same instance behind the LB, we could skip the pub/sub system, making things simpler and lowering latency.
Would this be something that is possible to do, and if it isn't, what would be the best option?
(I am using Kubernetes in one of the cloud service providers if that matters.)
Routing in HTTP is generally based on the hostname and/or URL path, and sometimes, to a lesser degree, on other headers like cookies. But in this case it would mean that each group should have its own unique URL.
But that part is easy; what I think you're really asking is "given arbitrary URLs, how can I get consistent routing?", which is much, much more complicated. The base concept is "consistent hashing": you hash the URL and use that to pick which endpoint to talk to. But then how do you deal with adding or removing replicas without scrambling the mapping entirely? That usually means using a hash ring and assigning portions of the hash space to specific replicas. Unfortunately, this is the point where off-the-shelf tools aren't enough. These kinds of systems require deep knowledge of your protocol and system specifics, so you'll probably need to rig this up yourself.
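To make the hash-ring idea concrete, here is a minimal sketch in Java; the class and method names, the use of MD5, and the virtual-node count are illustrative assumptions, not a production design:

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;
    import java.util.Map;
    import java.util.TreeMap;

    // Each replica is hashed onto the ring at several "virtual node"
    // positions; a key is served by the first replica clockwise from the
    // key's own hash, so adding or removing a replica only remaps the
    // slices adjacent to its virtual nodes.
    public class HashRing {
        private final TreeMap<Long, String> ring = new TreeMap<>();
        private final int virtualNodes;

        public HashRing(int virtualNodes) { this.virtualNodes = virtualNodes; }

        public void addReplica(String name) {
            for (int i = 0; i < virtualNodes; i++) ring.put(hash(name + "#" + i), name);
        }

        public void removeReplica(String name) {
            for (int i = 0; i < virtualNodes; i++) ring.remove(hash(name + "#" + i));
        }

        public String replicaFor(String key) {
            if (ring.isEmpty()) throw new IllegalStateException("no replicas");
            Map.Entry<Long, String> e = ring.ceilingEntry(hash(key));
            return (e != null ? e : ring.firstEntry()).getValue(); // wrap around the ring
        }

        private static long hash(String s) {
            try {
                byte[] d = MessageDigest.getInstance("MD5")
                        .digest(s.getBytes(StandardCharsets.UTF_8));
                long h = 0;
                for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xFF); // first 8 bytes
                return h;
            } catch (NoSuchAlgorithmException ex) {
                throw new AssertionError(ex); // MD5 is always present in the JDK
            }
        }
    }

A router would call ring.addReplica("node-3") when a replica joins and ring.replicaFor(url) per request; only the hash slices adjacent to the new replica's virtual nodes change owners.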

Communication between microservices - request data

I am dealing with communication between microservices.
For example (fictive example, just for the illustration):
Microservice A - Store Users (getUser, etc.)
Microservice B - Store Orders (createOrder, etc.)
Now if I want to add a new order from the client app, I need to know the user's address. So the request would be like this:
Client -> Microservice B (createOrder for userId 5) -> Microservice A (getUser with id 5)
Microservice B will create the order with details (the address) from the user microservice.
PROBLEM TO SOLVE: How do we deal effectively with the communication between microservice A and microservice B, given that we have to wait until the response comes back?
OPTIONS:
Use a REST API,
Use AMQP (e.g. RabbitMQ) and deal with this via RPC. (https://www.rabbitmq.com/tutorials/tutorial-six-dotnet.html)
I don't know which will be better for performance. Is a call faster via RabbitMQ or via a REST API? What is the best solution for a microservice architecture?
In your case using direct REST calls should be fine.
Option 1: Use a REST API
When you need synchronous communication. For example, your case: this option is suitable (a small sketch follows below).
Option 2: Use AMQP
When you need asynchronous communication. For example, when your order service creates an order, you may want to notify the product service to reduce the product quantity, or notify the user service that the order was successfully placed.
I highly recommend having a look at http://microservices.io/patterns/index.html
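To illustrate Option 1, a minimal sketch of the synchronous call from the Orders service to the Users service using Java 11's built-in HttpClient; the service URL and endpoint shape are assumptions:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.time.Duration;

    public class UserClient {
        private final HttpClient http = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();

        // Blocking call: createOrder cannot proceed until the Users service answers,
        // so a tight timeout keeps a slow Users service from stalling order creation.
        public String getUserJson(int userId) throws Exception {
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://users-service/users/" + userId)) // assumed URL
                    .timeout(Duration.ofSeconds(2))
                    .GET()
                    .build();
            return http.send(request, HttpResponse.BodyHandlers.ofString()).body();
        }
    }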
Choosing between REST APIs and event-based design, or both, depends entirely on your services' communication behaviour.
Based on your requirements, you can choose REST APIs where you see synchronous behaviour between services,
and go with event-based design where services need asynchronous behaviour; there is no harm in combining the two.
Ideally, messaging is the better fit for inter-service communication, while REST APIs are best suited to client-to-service communication.
Check the communication styles on microservices.io.
REST-based Architecture
Advantage
Request/response is simple and best fitted to synchronous interactions.
A simpler system, since there is no intermediate broker.
Promotes orchestration, i.e. a service can take action based on another service's response.
Drawback
Services need to discover the locations of service instances.
One-to-one mapping between services.
REST uses HTTP, a general-purpose protocol built on top of TCP/IP, which adds considerable overhead when used to pass messages.
Event-Driven Architecture
Advantage
Event-driven architectures are appealing to API developers because they function very well in asynchronous environments.
Loose coupling: on an event from one service, multiple services can take action according to application requirements, and it is easy to plug a new consumer into an existing producer (see the publish sketch after this list).
Improved availability, since the message broker buffers messages until the consumer is able to process them.
Drawback
Additional complexity of a message broker, which must be highly available.
Debugging an event-driven request flow is not easy.
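To make the asynchronous option concrete, a hedged sketch of the Users service publishing a UserCreated event with the RabbitMQ Java client; the exchange name and JSON payload are assumptions:

    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;
    import java.nio.charset.StandardCharsets;

    public class UserEventPublisher {
        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost"); // assumed broker location
            try (Connection conn = factory.newConnection();
                 Channel channel = conn.createChannel()) {
                // Durable fanout exchange: every bound consumer gets a copy of each event.
                channel.exchangeDeclare("user-events", "fanout", true);
                String event = "{\"type\":\"UserCreated\",\"id\":5,\"address\":\"1 Main St\"}";
                channel.basicPublish("user-events", "",
                        null, event.getBytes(StandardCharsets.UTF_8));
            }
        }
    }

The Users service fires the event and moves on; whether zero or five services consume it is invisible to the publisher, which is exactly the loose coupling listed above.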
Personally I am not a fan of using a message broker for RPC. It adds unnecessary complexity and overhead.
How do you host your long-lived RabbitMQ consumer in your Users web service? If you make it some static singleton, how do you deal with scaling and concurrency in your web service? Or do you make it a stand-alone daemon process? Now you have two Users applications instead of one. And what happens if your Users consumer slows down? By the time it consumes the request message, the Orders service context might have timed out and sent another message, or given up.
For RPC I would suggest simple HTTP.
There is a pattern involving a message broker that can avoid the need for a synchronous network call. The pattern is for services to consume events from other services and store that data locally in their own database. Then, when the Orders service needs a user record, it can read it from its own database.
In your case, your Users app doesn't need to know anything about orders, but your Orders app needs to know some details about your users. So every time a user is added, modified, removed etc., the Users service emits an event (UserCreated, UserModified, UserRemoved). The Orders service can subscribe to those events and store only the data it needs, such as the user address.
The benefit is that at request time, your Orders service has one less synchronous dependency on another service. Testing the service is easier, as you have fewer request-time dependencies. There are also drawbacks, however, such as some latency between user record changes occurring and being received by the Orders app. Something to consider.
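A hedged sketch of the Orders-side consumer for this pattern, again with the RabbitMQ Java client; the exchange name matches the publisher sketch above, and the in-memory map stands in for the Orders service's own database:

    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;
    import java.nio.charset.StandardCharsets;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class OrdersUserCache {
        // Stand-in for the Orders service's own table of user addresses.
        static final Map<String, String> addressByUserId = new ConcurrentHashMap<>();

        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost"); // assumed broker location
            Connection conn = factory.newConnection();   // long-lived: not closed here
            Channel channel = conn.createChannel();
            channel.exchangeDeclare("user-events", "fanout", true);
            String queue = channel.queueDeclare().getQueue(); // private queue per consumer
            channel.queueBind(queue, "user-events", "");
            channel.basicConsume(queue, true, (tag, delivery) -> {
                String json = new String(delivery.getBody(), StandardCharsets.UTF_8);
                // A real consumer would parse the JSON and, per event type, insert,
                // update or delete the locally replicated user row.
                System.out.println("replicating user data locally: " + json);
            }, tag -> { });
        }
    }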
UPDATE
If you do go with RabbitMQ for RPC, then remember to make use of the message TTL feature. If the client will time out, then set the message expiration to that period. This will help avoid wasted work on the part of the consumer and avoid a queue getting backed up under load. One issue with RPC over a message broker is that once a queue fills up, it can add long latencies that take a while to recover from. Setting your message expiration to your client timeout helps avoid that.
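For instance, with the RabbitMQ Java client, per-message expiration is set via BasicProperties; the queue name, 5-second timeout, and payload here are assumptions:

    import com.rabbitmq.client.AMQP;
    import com.rabbitmq.client.Channel;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;

    public class RpcWithTtl {
        // Publish an RPC request whose broker-side lifetime matches the client timeout,
        // so the broker drops the request at the same moment the caller gives up.
        static void sendRequest(Channel channel, String replyQueue, String corrId)
                throws IOException {
            AMQP.BasicProperties props = new AMQP.BasicProperties.Builder()
                    .expiration("5000")      // per-message TTL in ms (a string, per AMQP)
                    .replyTo(replyQueue)     // standard RPC reply-queue wiring
                    .correlationId(corrId)
                    .build();
            channel.basicPublish("", "rpc_queue", props,
                    "getUser:5".getBytes(StandardCharsets.UTF_8));
        }
    }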
Regarding RabbitMQ for RPC: normally we use a message broker for decoupling and durability. Since RPC is synchronous communication, that is, we are waiting for a response, durability is not a consideration. That leaves decoupling. The question is whether that decoupling buys you anything over the decoupling you can get with HTTP via a gateway or Docker service names.

Rolling Over Streaming Connections During Upgrades

I am working on an application that uses Amazon Kinesis, and one of the things I was wondering about is how you can roll over an application during an upgrade without data loss on the streams. I have heard about things like blue/green deployments, but I was wondering what the best practice is for upgrading a data streaming service so you don't lose data from your streams.
For example, my application has an HTTP endpoint that ingests data as a series of POST operations. If I want to replace the service with a newer version, how do I manage existing application streaming to my endpoint?
One common method is having a software load balancer (LB) with a virtual IP; behind this LB there would be at least two HTTP ingestion endpoints during normal operation. During an upgrade, each endpoint is announced out (taken out of service) and upgraded in turn. The LB ensures that no traffic is forwarded to an announced-out endpoint.
(The endpoints themselves can be on separate VMs, Docker containers or physical nodes).
Of course, each stream needs to be finite; the TCP socket/HTTP stream is owned by one of the endpoints. However, as long as a stream can be stopped gracefully, the following flow works, assuming endpoint A owns the current ingestion (a drain sketch follows below):
Tell endpoint A not to accept new streams. All new streams will be redirected only to endpoint B by the LB.
Gracefully stop existing streams on endpoint A.
Upgrade A.
Announce A back in.
Rinse and repeat with endpoint B.
As a side point, you would need two endpoints in a load-balanced (or master/slave) set-up anyway if you require any reasonable uptime and reliability guarantees.
There are also more bespoke methods that allow hot code swapping on the same endpoint, but they rely on specific internal design (e.g. a separate process between the networking and processing stacks, connected by IPC).
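To make step 1 of the flow above concrete, here is a minimal sketch using the JDK's built-in HttpServer: flipping a flag makes the health check fail so the LB stops routing new streams to this endpoint, and a delayed stop gives existing streams time to finish. The paths, port, and grace periods are assumptions:

    import com.sun.net.httpserver.HttpServer;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.util.concurrent.atomic.AtomicBoolean;

    public class DrainableIngest {
        static final AtomicBoolean draining = new AtomicBoolean(false);

        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
            server.createContext("/health", exchange -> {
                // 503 tells the LB this endpoint is "announced out".
                exchange.sendResponseHeaders(draining.get() ? 503 : 200, -1);
                exchange.close();
            });
            server.createContext("/ingest", exchange -> {
                byte[] ok = "accepted".getBytes();
                exchange.sendResponseHeaders(200, ok.length);
                try (OutputStream os = exchange.getResponseBody()) { os.write(ok); }
            });
            server.start();

            Runtime.getRuntime().addShutdownHook(new Thread(() -> {
                draining.set(true);             // step 1: stop accepting new streams
                try { Thread.sleep(10_000); }   // let the LB notice the failing checks
                catch (InterruptedException ignored) { }
                server.stop(30);                // step 2: 30 s grace for existing streams
            }));
        }
    }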

Scaling WebSockets with a Message Queue

I have built a WebSockets server that acts as a chat message router (i.e. receiving messages from clients and pushing them to other clients according to a client ID).
It is a requirement that the service be able to scale to handle many millions of concurrent open socket connections, and I wish to be able to horizontally scale the server.
The architecture I have had in mind is to put the WebSocket server nodes behind a load balancer, which creates a problem: clients connected to different nodes won't know about each other. While both clients A and B enter via the load balancer, client A might have an open connection with node 1 while client B is connected to node 2, and each node holds its own dictionary of open socket connections.
To solve this problem, I was thinking of using an MQ system like ZeroMQ or RabbitMQ. All of the WebSocket server nodes would be subscribers of the MQ server, and when a node gets a request to route a message to a client that is not in its local connections dictionary, it would publish a message to the MQ server, which would tell all the subscriber nodes to look for this client and deliver the message if it is connected to that node.
Q1: Does this architecture make sense?
Q2: Is the pub-sub pattern described here really what I am looking for?
ZeroMQ would be my option - both architecture-wise & performance-wise
-- fast & low latency ( you can measure your implementation's performance & overheads, down to sub-[usec] scale )
-- broker-less ( does not introduce another point-of-failure, while itself can have { N+1 | N+M } self-healing architecture )
-- smart Formal Communication Pattern primitives ready to be used ( PUB / SUB is the least cardinal one )
-- fair-queue & load balancing architectures built-in ( invisible for external observer )
-- many transport Classes for server-side internal multi-process / multi-threading distributed / parallel processing
-- ready for almost linear scalability
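A minimal PUB/SUB sketch with JeroMQ (a pure-Java ZeroMQ implementation) to show the shape of the pattern; the topic name and port are assumptions, and in a real deployment the PUB and SUB sockets would live on different nodes rather than in one process:

    import org.zeromq.SocketType;
    import org.zeromq.ZContext;
    import org.zeromq.ZMQ;

    public class NodeBus {
        public static void main(String[] args) throws Exception {
            try (ZContext context = new ZContext()) {
                ZMQ.Socket pub = context.createSocket(SocketType.PUB);
                pub.bind("tcp://*:5556"); // assumed internal fabric port

                // Each WS node subscribes to the client IDs it hosts; the
                // subscription is a simple prefix filter on the message.
                ZMQ.Socket sub = context.createSocket(SocketType.SUB);
                sub.connect("tcp://localhost:5556");
                sub.subscribe("client-42".getBytes());
                Thread.sleep(100); // let the subscription propagate ("slow joiner")

                pub.send("client-42 hello"); // topic prefix + payload in one frame
                System.out.println(sub.recvStr());
            }
        }
    }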
Adaptive node re-discovery
This is a more complex subject. To arrive at a feasible architecture, you will have to drill down into more detail on topics such as:
Node authentication vs. peer-to-peer messaging
Node (re)-discovery vs. legal & privacy issues
Node based autonomous self-organising Agents vs. needs for central policy enforcement
To update this for 2021: we just solved this problem, where we needed to design a system that could handle millions of simultaneous WS connections from IoT devices. The WS server just relays messages to our serverless API backend that handles the actual logic. We chose Docker and the Node ws package on an auto-scaling AWS ECS Fargate cluster with an ALB in front of it.
This solved the main problem of routing inbound messages, but then we had the reverse issue of how to route response messages from the server. We initially thought of just keeping a central DB of connections, but routing messages to a specific Fargate instance behind an ALB didn't seem feasible.
Instead, we set up a simple pub/sub pattern using AWS SNS (https://aws.amazon.com/pub-sub-messaging/). Every WS server receives the response and then searches its own WS connections. Since each Fargate instance handles just routing (no logic), it can handle a lot of connections when scaled vertically.
Update: To make this even more performant, you can use a persistent connection like Redis Pub/Sub to allow the response message to only go to one single server instead of every server.
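A hedged sketch of that refinement with the Jedis client: a shared registry maps each client ID to the node holding its connection, each node subscribes only to its own channel, and a response is published to exactly one node. The key names, channel naming, and node ID are assumptions:

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.JedisPubSub;

    public class TargetedRouting {
        public static void main(String[] args) throws Exception {
            String nodeId = "ws-node-1"; // hypothetical identity of this instance

            // On connect: record which node owns client-42's socket.
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                jedis.hset("ws:connections", "client-42", nodeId);
            }

            // Each node listens only on its own channel (subscribe blocks its thread).
            new Thread(() -> {
                try (Jedis jedis = new Jedis("localhost", 6379)) {
                    jedis.subscribe(new JedisPubSub() {
                        @Override
                        public void onMessage(String channel, String message) {
                            // Look up the target in this node's local connection
                            // dictionary and push the message down its socket.
                            System.out.println("deliver locally: " + message);
                        }
                    }, "node:" + nodeId);
                }
            }).start();
            Thread.sleep(200); // demo only: let the subscriber attach first

            // Responder side: route the response only to the owning node's channel.
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                String owner = jedis.hget("ws:connections", "client-42");
                jedis.publish("node:" + owner, "response for client-42");
            }
        }
    }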

Heartbeat Protocols/Algorithms or best practices

Recently I've added some load-balancing capabilities to a piece of software that I wrote. It is a networked application that does some data crunching based on input coming from a SQL database. Since the crunching can be pretty intensive, I've added the capability to have multiple instances of this application running on different servers to split the load, but as it is now the load balancing is a manual act: a user must specify which instances take which portion of the input domain.
I would like to take that to the next level and program the instances to automatically negotiate the dividing up of the input data, and to recognize if one of them "disappears" (has crashed or has been powered down), so that the remaining instances can take on the failed instance's workload.
In order to implement this I'm considering using a simple heartbeat protocol between the instances to determine who's online and who isn't and while this is not terribly complicated I'd like to know if there are any established heartbeat network protocols (based on UDP, TCP or both).
Obviously this happens a lot in the networking world with clustering, fail-over and high-availability technologies so I guess in the end I'd like to know if maybe there are any established protocols or algorithms that I should be aware of or implement.
EDIT
It seems, based on the answers, that either there are no well-established heartbeat protocols or that nobody knows about them (which would imply that they aren't so well established after all), in which case I'm just going to roll my own.
While none of the answers offered what I was looking for specifically, I'm going to vote for Matt Davis's answer since it was the closest, and he pointed out the good idea of using multicast.
Thank you all for your time~
Distributed Interactive Simulation (DIS), which is defined under IEEE Standard 1278, uses a default heartbeat of 5 seconds via UDP broadcast. A DIS heartbeat is essentially an Entity State PDU, which fully defines the state, including the position, of the given entity. Due to its application within the simulation community, DIS also uses a concept referred to as dead reckoning to provide higher-frequency heartbeats when the actual position, for example, is outside a given threshold of its predicted position.
In your case, a DIS Entity State PDU would be overkill. I only mention it to make note of the fact that heartbeats can vary in frequency depending on the circumstances. I don't know that you'd need something like this for the application you described, but you never know.
For heartbeats, use UDP, not TCP. A heartbeat is, by nature, a connectionless contrivance, so it goes that UDP (connectionless) is more relevant here than TCP (connection-oriented).
The thing to keep in mind about UDP broadcasts is that a broadcast message is confined to the broadcast domain. In short, if you have computers that are separated by a layer 3 device, e.g., a router, then broadcasts are not going to work because the router will not transmit broadcast messages from one broadcast domain to another. In this case, I would recommend using multicast since it will span the broadcast domains, providing the time-to-live (TTL) value is set high enough. It's also a more automated approach than directed unicast, which would require the sender to know the IP address of the receiver in order to send the message.
Broadcast a heartbeat every t using UDP; if you haven't heard from a machine in more than k*t, then it's assumed down. Be careful that the aggregate bandwidth used isn't a drain on resources. You can use IP broadcast addresses, or keep a list of specific IPs you're doing work for.
Make sure the heartbeat includes a "reboot count" as well as "machine ID" so that you know previous server state isn't around.
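A minimal sender sketch in Java using UDP multicast, as suggested in the accepted answer; the group address, port, and intervals are assumptions, and a receiver would join the same group and track the last-seen time and reboot count per machine ID:

    import java.net.DatagramPacket;
    import java.net.InetAddress;
    import java.net.MulticastSocket;
    import java.nio.charset.StandardCharsets;

    public class HeartbeatSender {
        static final String GROUP = "239.255.0.42"; // assumed multicast group
        static final int PORT = 9999;               // assumed port

        public static void main(String[] args) throws Exception {
            String machineId = "node-1";
            long rebootCount = System.currentTimeMillis(); // changes on every restart
            try (MulticastSocket socket = new MulticastSocket()) {
                socket.setTimeToLive(4); // raise if peers sit behind routers
                InetAddress group = InetAddress.getByName(GROUP);
                long seq = 0;
                while (true) {
                    // "machineId:rebootCount:seq" -- a changed rebootCount tells
                    // receivers that any previous state for this peer is stale.
                    byte[] payload = (machineId + ":" + rebootCount + ":" + seq++)
                            .getBytes(StandardCharsets.UTF_8);
                    socket.send(new DatagramPacket(payload, payload.length, group, PORT));
                    Thread.sleep(5_000); // t = 5 s; mark down after ~k*t with no beat
                }
            }
        }
    }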
I'd recommend using MapReduce if it fits. It would save a lot of work.
I'm not sure this will answer the question, but you might be interested in the way WebLogic Server clustering works under the hood. From the book Mastering BEA WebLogic Server:
[...] WebLogic Server clustering provides a loose coupling of the servers in the cluster. Each server in the cluster is independent and does not rely on any other server for any fundamental operations. Even if contact with every other server is lost, each server will continue to run and be able to process the requests it receives. Each server in the cluster maintains its own list of other servers in the cluster through periodic heartbeat messages. Every 10 seconds, each server sends a heartbeat message to the other servers in the cluster to let them know it is still alive. Heartbeat messages are sent using IP multicast technology built into the JVM, making this mechanism efficient and scalable as the number of servers in the cluster gets large. Each server receives these heartbeat messages from other servers and uses them to maintain its current cluster membership list. If a server misses receiving three heartbeat messages in a row from any other server, it takes that server out of its membership list until it receives another heartbeat message from that server. This heartbeat technology allows servers to be dynamically added and dropped from the cluster with no impact on the existing servers’ configurations.
Cisco content switches are a hardware solution for this problem. They implement a virtual IP address as a front end to multiple real servers, whose real IP addresses are known to the switch. The switch periodically sends HTTP HEAD requests to the web servers, to verify they are still running (which the switch software calls a "keepalive", although this doesn't keep the server itself alive). The Cisco switch accepts traffic on the virtual IP and forwards it to the actual web servers, using configurable load balancing such as round-robin, or user-defined load balancing.
These switches retail in the $3-10K range, although my business partner picked one up on eBay for about $300 a year ago. If you can afford one, they do represent a proven hardware solution to the question of how to spread a service transparently across multiple servers. Red Hat includes a built-in port configuration so that you could implement your own Cisco switch using a cheap Red Hat box. Google for "virtual ip address" and "cisco content router" for more information.
In addition to hardware load balancers, you can also try free, open-source load-balancing software such as HAProxy, available for Linux and the BSDs.