I am currently designing the architecture for an HA WebRTC installation (using Liveswitch Server). Among the requirements is the setup of an automatic failover scenario for a PostgreSQL database.
Since I already intend to deploy nginx as a load balancer in a different part of the system, I was wondering whether the above PostgreSQL scenario can be accomplished using nginx as well.
IMPORTANT: notification of the failover case to the admin via email or similar is a must (of course).
Is this possible using nginx and which setup should be chosen for the database instances (hot-standby, warm-standby, etc.)?
If not: what would be the solution of choice?
I’ve been looking into Keycloak as an on-prem IAM and SSO solution for my company. One thing that I’m unclear on from reading the documentation is if Keycloak’s clustered mode can handle our requirements for instance federation across sites.
We have some remote manned sites that occasionally run critical telemetry-gathering processes. Our AD domain is replicated to those sites.
The issue is that there is a single internet link to the sites. If we had Keycloak at the main office, and the internet link went down for a day, any software at the remote site that relies on Keycloak to authenticate wouldn’t work (which would be a big problem).
Can we set up Keycloak in a cluster mode (i.e., putting an instance at each site), so that if this link went out, remote users are able to connect to their local instance automatically and authenticate with local apps? What happens when the connection is restored and the databases are out of sync: does Keycloak automatically repair this?
Cheers
In general the answer is "yes": you can set up two Keycloak instances in different locations and link them into a cluster (under the hood this is Infinispan cache replication). But it depends on the details of your infrastructure.
The main goal of a Keycloak cluster is to replicate the session cache between nodes. In the simplest case you can set up two nodes that point to the same DB instance; when the first node goes down, the second handles the whole load, but if the DB also goes down, the second node is useless. To cover that case, each site should have both its own Keycloak node and a DB replica (how to achieve DB replication is out of scope for this topic). A third option is to use the multi-tenancy feature of the Keycloak application adapter: in that case you secure the application with two separate Keycloak instances that know nothing about each other.
Try starting with this documentation article:
https://www.keycloak.org/docs/latest/server_installation/index.html#crossdc-mode
My company is interested in using a stand-alone Service Fabric cluster to manage communications with robots. In our scenario, each robot would host its own rosbridge server, and our Service Fabric application would maintain WebSocket clients to each robot. I envision a stateful service partitioned along device ids which opens connections on startup. It should monitor connection health via heartbeats, pass messages from the robots to some protocol gateway service, and listen to other services for messages to pass to the robots.
I have not seen discussion of this style of external communications in the Service Fabric documentation - I cannot tell if this is because:
There are no special considerations for managing WebSockets (or any two-way network protocol) this way from Service Fabric. I've seen no discussion of restrictions and see no reason, conceptually, why I can't do this. I originally thought replication would be problematic (duplicate messages?), but since only one replica can be primary at any time this appears to be a non-issue.
Service Fabric is not well-suited to bi-directional communication with external devices
I would appreciate some guidance on whether this architecture is feasible. If not, discussion on why it won't work will be helpful. General discussion of limitations around bi-directional communication between Service Fabric services and external devices is welcome. I would prefer if we could keep discussion to stand-alone clusters - we have no plans to use Azure services at this time.
Any particular reason you want SF to host the client and not the other way around?
If you do it the way you suggest, I think you will face big challenges in making SF find these devices on your network and keep track of them (firewalls, IPs, NAT, planned maintenance, failures, connection issues), unless you are planning to do it by hand.
From the brief description in the docs you provided about the rosbridge server, I understand that you have to host it on a server (as you would a Service Fabric service) and your devices would connect to it; in that case, your devices would have ROS installed to make this communication.
Regarding your concerns about the communication: Service Fabric services are just executable programs you would normally run on your local machine. If it works there, it will likely work in an on-premises Service Fabric environment. The only extra things you have to take care of are external access to the cluster (Azure or network configuration) and service discovery.
In my view, you should use SF as the central point of communication, and each device would connect to the SF services.
The other approach would be to use Azure IoT Hub to bridge the communication between the two. There is a nice IoT Hub + Service Fabric sample that might be suitable for your needs.
Because you want to avoid Azure, you could in this case replace IoT Hub with another messaging platform or implement the rosbridge in your service to handle the calls.
I hope I understood everything right.
About the obstacles:
I think the major issue here is that the bi-directional connection would be established between a service replica and the robot.
This has two major problems:
Only the primary replica has write access, i.e. only one replica would be able to modify state. This could be mitigated by creating a separate partition for each robot (but please remember that you can't change the partition count after the service has been created) or by creating a separate service instance for each robot (this would allow you to dynamically add or remove robots but would require additional logic related to service discoverability).
The replica can be shut down (terminated), moved to another node (shutdown and start of a new replica) or even demoted (the primary replica gets demoted to secondary and another secondary replica gets promoted to primary) for various reasons. So the service code and the robot communication code should be able to handle this.
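To make the second point a bit more concrete, here is a rough sketch of a connection loop that tolerates dropped links and shuts down cleanly when it is cancelled. It is plain Python asyncio rather than Service Fabric code, and everything in it is an assumption for illustration: `connect` is a hypothetical coroutine function returning an object with async `recv()` and `close()` methods, `on_message` is a hypothetical callback, and cancelling the task stands in for whatever signal the platform gives you when the replica is demoted or closed.

```python
import asyncio


async def run_robot_link(connect, on_message, retry_delay=1.0):
    """Keep one connection to a robot alive until this task is cancelled.

    connect:     hypothetical coroutine function returning a link object
                 with async recv() and async close() methods.
    on_message:  hypothetical callback invoked with every received message.
    retry_delay: seconds to wait before reconnecting after a failure.
    """
    while True:
        link = None
        try:
            link = await connect()
            while True:
                on_message(await link.recv())
        except ConnectionError:
            # Connecting failed or the link dropped: wait, then try again.
            await asyncio.sleep(retry_delay)
        finally:
            # Runs on reconnect and on cancellation (demotion / shutdown),
            # so the robot never keeps a connection to a stale replica.
            if link is not None:
                await link.close()
```

The owner of the task would call `task.cancel()` when the replica stops being primary; the `finally` block then closes the open link before the cancellation propagates.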
About WebSockets
This looks possible by implementing a custom ICommunicationListener (and the related pieces) on top of WebSockets.
I'm trying to implement an architecture that's similar to CoreOS's production architecture (shown below)
Should I run the database as a central service or on one or more of the workers?
I figured the database needs some kind of replication, which makes me think that putting it in the worker cluster makes more sense, but I'm just not sure.
This should be run as a worker. The central services are the basic things that come with CoreOS (mainly etcd). The workers host your applications, the database being one of them. You do have a persistence issue, because your database has state to remember between restarts, so the bigger question is how to provide that persistence. One way to do it is to use a host file, give the database an affinity to that host, and mount the host file. Another option is to run more than one database instance (if your DB technology supports that) and replicate the database so you have two (or more) copies on different workers (no affinity). If your database creates transaction logs that can be applied to a backup, you can manage those transaction logs in a worker.
Another thing to consider is not using a container for your database. The database is a weird animal, its care and feeding is not like the rest of the applications. So it is reasonable (in my opinion) to have your database managed and maintained outside the scope of your cluster (but still reachable by the cluster).
So I am a little confused by reading the documents.
I want to setup AppFabric caching and hosting.
Can I do the following?
DC
SQL Server
AppFabric1
AppFabric2
All these computers are joined to the DC.
I want to be able to have AppFabric1 be the mainhost but also part of the cache cluster?
What about AppFabric2? or AppFabricX? How can I make them part of the cache cluster?
Do I have to make AppFabric1 and AppFabric2 configured in Windows as part of a cluster (i.e setup the entire environment as a cluster)?
Can I install AppFabric independently on AppFabric1 and 2 and have them cluster together and "make it work"? If so - how?
I see documentation about setting it up in a webfarm but also a workgroup... and that's it. nothing about computers joined to a domain.
I want to setup AppFabric caching and hosting.
Caching and Hosting are two totally different things and generally don't share the same use cases.
AppFabric Caching provides an in-memory, distributed cache platform for Windows Server, previously named Velocity. The cache cluster is a collection of one or more instances of the Caching Service working together. You can easily add a new cache host, without restarting the cluster, through the cluster configuration "storage location" (XML file or SQL Server).
Can I install AppFabric independently on AppFabric1 and 2 and have
them cluster together and "make it work"? If so - how?
Don't worry... this can be done easily during installation. In addition, there is a powerful PowerShell module to do the same thing.
AppFabric Hosting enhances the hosting of WCF and Workflow Foundation services in WAS (autostart, monitoring of hosted services, workflow persistence, ...). There is no cluster here; basically you just have to configure the monitoring/persistence DB for each server.
Just try it !
When you add the second node to the AppFabric cluster, make sure to choose the option Join Cluster (instead of New Cluster) and point to the path of the share where you stored the configuration (assuming that you used FILE SHARE to store the configuration of the cluster). The share that you used must be accessible from AppFabric2.
Scenario
Multiple application servers host web services written in Java, running in SpringSource dm Server. To implement a new requirement, they will need to query a read-only PostgreSQL database.
Issue
To support redundancy, at least two PostgreSQL instances will be running. Access to PostgreSQL must be load balanced and must auto-fail over to currently running instances if an instance should go down. Auto-discovery of newly running instances is desirable but not required.
Research
I have reviewed the official PostgreSQL documentation on this issue. However, that focuses on the more general case of read/write access to the database. Top Google results tend to lead to older newsgroup messages or dead projects such as Sequoia or DB Balancer, as well as one active project, PG Pool II.
Question
What are your real-world experiences with PG Pool II? What other simple and reliable alternatives are available?
PostgreSQL's wiki also lists clustering solutions, and the page on Replication, Clustering, and Connection Pooling has a table showing which solutions are suitable for load balancing.
I'm looking forward to PostgreSQL 9.0's combination of Hot Standby and Streaming Replication.
Have you looked at SQL Relay?
The standard solution for something like this is to look at Slony, Londiste or Bucardo. They all provide async replication to many slaves, where the slaves are read-only.
You then implement the load-balancing independently of this, on the TCP layer with something like HAProxy. Such a solution will be able to do failover of the read connections (though you'll still lose transaction visibility on a failover, and have to start a new transaction on the new slave - but that's fine for most people).
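To illustrate what the load balancer is doing for the read connections, here is a minimal sketch of the idea in Python: round-robin over the read-only slaves, skipping any that fail a plain TCP health check. This is not how HAProxy is configured or implemented; the class name, the host names in the usage note and the injectable `is_alive` check are all made up for the example.

```python
import itertools
import socket
from typing import Callable, Iterable, Optional, Tuple

Backend = Tuple[str, int]


def tcp_alive(backend: Backend, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection(backend, timeout=timeout):
            return True
    except OSError:
        return False


class ReadReplicaPicker:
    """Round-robin over replicas, skipping those that fail the health check."""

    def __init__(
        self,
        backends: Iterable[Backend],
        is_alive: Callable[[Backend], bool] = tcp_alive,
    ) -> None:
        self._backends = list(backends)
        self._is_alive = is_alive
        self._cycle = itertools.cycle(self._backends)

    def pick(self) -> Optional[Backend]:
        # Try each backend at most once per call; None means all are down.
        for _ in range(len(self._backends)):
            candidate = next(self._cycle)
            if self._is_alive(candidate):
                return candidate
        return None
```

A client would call something like `ReadReplicaPicker([("replica-a", 5432), ("replica-b", 5432)]).pick()` before opening each new connection. A TCP check only proves that the port answers, not that replication is caught up, so treat it as the bare minimum.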
Then all you have left is failover of the master role. All of these systems have supported ways of doing it. None of them is automatic by default (because automatic failover of a database master role is really dangerous: consider the situation you are in once you've got split brain), but they can be automated easily if your requirements call for this for the master as well.
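If you do automate it, one common way to reduce the split-brain risk is to promote only when a majority of independent observers agree that the master is unreachable. The sketch below shows just that decision rule; it is my own illustration, not something Slony, Londiste or Bucardo provide, and the function name and the shape of `observations` are assumptions.

```python
from typing import Mapping


def should_promote(observations: Mapping[str, bool]) -> bool:
    """Decide whether a standby may be promoted.

    observations maps an observer name to True if that observer can still
    reach the current master. Promotion is allowed only when a strict
    majority of observers report the master as unreachable, so a single
    node with a broken network link cannot trigger a failover on its own.
    """
    if not observations:
        return False
    unreachable = sum(1 for reachable in observations.values() if not reachable)
    return unreachable * 2 > len(observations)
```

A majority vote alone does not stop an old master that is still running from accepting writes, so you would still want to fence it off before promoting a slave.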