Issues with failback for logical replication using postgres - postgresql

I am using PostgreSQL 13 to set up logical replication. When the original active node (A) becomes the secondary, the prior secondary(B) which is now active node needs to sync to the (A) node.
To summarize the issue:
Node A is active and fails at some point of time. Node B takes over and now running is active and accepting I/O from application. Now when Node A is recovered from failure and ready to become active again. In order to happen this Node A is trying to get the data which may be have been added while Node A was down. To get the this data Node A is creating a subscription to Node B which is now acting as a publisher. Issue is that this subscription on Node A fails as Node A already has some data before it went down and this data results in conflicts.
So what are my options here?

Related

Redis master wipes out Redis slave data on restart

Sorry this is my first time working with Redis. I have a redis master deployment and a redis slave deployment (via K8s). The replication from master to slave is working as expected. However, when I kill the master altogether and bring it back up again, the sync wipes out the data of slave as well.
I have tried enabling appendonly on either and both but had no luck.
Question # 1: How can I preserve the data in slave when the master node comes back to life?
Question # 2: Is it a practice to sync data back from slave into master?
Yes, the correct practice would be to promote the slave to master and then slave the restarted node to it to sync the state. If you bring up an empty node that is declared as the master, the slave will faithfully replicate whatever is - or isn't - on it.
You can configure periodic saving to disk, so that you can restart a master node and have it load the state as of the last save to disk. You can also manually cause a save to disk via the SAVE command. See the persistence chapter in the manual. If you SAVE to disk, then immediately restart the master node, the state as saved to disk will be loaded back up. Any writes that occur between the last SAVE and node shutdown will be lost.
Along these lines, Redis HA is often done with Redis Sentinel, which manages auto-promotion and discovery of master nodes within a replicated cluster, so that the cluster can survive and auto-heal from the loss of the current master. This lets slaves replicate from the active master, and on the loss of the master (or a network partition that causes a quorum of sentinels to lose visibility to the master), the Sentinel quorum will elect a new master and coordinate the re-slaving of other nodes to it for ensure uptime. This is an AP system, as Redis replication is eventually consistent, and therefore does have the potential to lose writes which are not replicated to a slave or flushed to disk before node shutdown.

How to prevent data inconsistency when one node lose network connectivity in kubernetes

I have a situation where I have a cluster with a service (we named it A1) and its data which is on a remote storage like cephFS in my case. the number of replica for my service is 1. Assume I have 5 node in my cluster and service A1 reside in node 1. something happens with node 1 network and it lose the connectivity with cephFS cluster and my Kubernetes cluster as well (or docker-swarm). cluster mark it as unreachable and start a new service (we named it A2) on node 2 to keep replica as 1. after for example 15 min node 1 network fixed and node 1 get back to cluster and have service A1 running already (assume it didn't crash while it loses its connectivity with remote storage).
I worked with docker-swarm and recently switched to Kubernetes. I see Kuber has a feature call StatefulSet but when I read about it. it doesn't answer my question. (or I may miss something when I read about it)
Question A: what does cluster do. does it keep A2 and shutdown A1 or let A1 keeps working and shutdown A2 (Logically it should shutdown A1)
Question B (and my primary question as well!): Assume that the cluster wants to shutdown on of these services (for example A1). This service does some save on storage when it wants to shutdown. in this case state A1 save to disk and A2 with newer state saved something before A1 network get fixed.
There must be some locks when we mount the volume to the container in which when it attached to one container other container cant write to that (let A1 failed when want to save its old state data on disk)
The way it works - using docker swarm terminology -
You have a service. A service is a description of some image you'd like to run, how many replicas and so on. Assuming the service specifies at least 1 replica should be running it will create a task that will schedule a container on a swarm node.
So the service is associated with 0 to many tasks, where each task has 0 - if its still starting or 1 container - if the task is running or stopped - which is on a node.
So, when swarm (the orcestrator) detects a node go offline, it principally sees that a number of tasks associated with a service have lost their containers, and so the replication (in terms of running tasks) is no longer correct for the service, and it creates new tasks which in turn will schedule new containers on the available nodes.
On the disconnected node, the swarm worker notices that it has lost connection to the swarm managers so it cleans up all the tasks it is holding onto as it no longer has current information about them. In the process of cleaning the tasks up, the associated containers get stopped.
This is good because when the node finally reconnects there is no race condition where there are two tasks running. Only "A2" is running and "A1" has been shut down.
This is bad if you have a situation where nodes can lose connectivity to the managers frequently, but you need the services to keep running on those nodes regardless, as they will be shut down each time the workers detach.
The process on K8s is pretty much the same just change the terminology.

Mongo cluster, safe reboot for secondary

Let's say we have a Mongo cluster (3 or more nodes). Realized that even quick restart of Secondary node affect Primary. We need to shutdown Secondary for some reason for short time. What is the best/correct procedure (with particular commands examples please)? Should we remove the node from cluster or just set it to maintenance is enough, like?:
mongocluster:SECONDARY> db.adminCommand({"replSetMaintenance":true})
does this command affect only one particular Secondary node where it was applied?
do we need to switch the particular node to hidden mode for maintenance also?
do we need to switch the particular node to delayed replica mode for maintenance also?

Zooker Failover Strategies

We are young team building an applicaiton using Storm and Kafka.
We have common Zookeeper ensemble of 3 nodes which is used by both Storm and Kafka.
I wrote a test case to test zooker Failovers
1) Check all the three nodes are running and confirm one is elected as a Leader.
2) Using Zookeeper unix client, created a znode and set a value. Verify the values are reflected on other nodes.
3) Modify the znode. set value in one node and verify other nodes have the change reflected.
4) Kill one of the worker nodes and make sure the master/leader is notified about the crash.
5) Kill the leader node. Verify out of other two nodes, one is elected as a leader.
Do i need i add any more test case? additional ideas/suggestion/pointers to add?
From the documentation
Verifying automatic failover
Once automatic failover has been set up, you should test its operation. To do so, first locate the active NameNode. You can tell which node is active by visiting the NameNode web interfaces -- each node reports its HA state at the top of the page.
Once you have located your active NameNode, you may cause a failure on that node. For example, you can use kill -9 to simulate a JVM crash. Or, you could power cycle the machine or unplug its network interface to simulate a different kind of outage. After triggering the outage you wish to test, the other NameNode should automatically become active within several seconds. The amount of time required to detect a failure and trigger a fail-over depends on the configuration of ha.zookeeper.session-timeout.ms, but defaults to 5 seconds.
If the test does not succeed, you may have a misconfiguration. Check the logs for the zkfc daemons as well as the NameNode daemons in order to further diagnose the issue.
more on setting up automatic failover

MongoDB share-nothing slaves

I'd like to use mongodb to distribute a cached database to some distributed worker nodes I'll be firing up in EC2 on demand. When a node goes up, a local copy of mongo should connect to a master copy of the database (say, mongomaster.mycompany.com) and pull down a fresh copy of the database. It should continue to replicate changes from the master until the node is shut down and released from the pool.
The requirements are that the master need not know about each individual slave being fired up, nor should the slave have any knowledge of other nodes outside the master (mongomaster.mycompany.com).
The slave should be read only, the master will be the only node accepting writes (and never from one of these ec2 nodes).
I've looked into replica sets, and this doesn't seem to be possible. I've done something similar to this before with a master/slave setup, but it was unreliable. The master/slave replication was prone to sudden catastrophic failure.
Regarding replicasets: While I don't imagine you could have a set member invisible to the primary (and other nodes), due to the need for replication, you can tailor a particular node to come pretty close to what you want:
Set the newly-launched node to priority 0 (meaning it cannot become primary)
Set the newly-launched node to 'hidden'
Here are links to more info on priority 0 and hidden nodes.