DRP for Postgres-XL - postgresql

I have installed and set up a two-node cluster of Postgres-XL 9.2, where the coordinator and GTM run on node1 and the datanode runs on node2.
Now, before I use it in production, I have to deliver a DRP (disaster recovery plan) solution.
Does anyone have a DR plan for a Postgres-XL 9.2 architecture?
Best Regards,
Aviel B.

So from what you described you only have one of each node... What are you expecting to recover to?
Postgres-XL is a clustered solution. If you only have one of each node then you have no cluster: not only do you get no scaling advantage, it will actually run slower than standalone Postgres. Worse, you have nothing to recover to; if you lose either node you have completely lost the database.
Also, the docs recommend you put the coordinator and datanodes on the same server if you are going to combine nodes.
So the simplest solution in replication mode would need something like:
Server1 GTM
Server2 GTM Proxy
Server3 Coordinator 1 & DataNode 1
Server4 Coordinator 2 & DataNode 2
Postgres-XL has no failover support, so any failure will require manual intervention.
If you use DISTRIBUTE BY REPLICATION, you would just remove the failing node from the cluster and restart everything.
If you use any other DISTRIBUTE BY option, data is spread over multiple nodes, which means that losing any node loses everything. For those options you will need a slave instance of every datanode and coordinator node you have. If one of the nodes fails, you would remove that node from the cluster, replace it with its slave backup node, and then restart it all.
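For reference, the distribution mode is chosen per table at creation time in Postgres-XL; a sketch with hypothetical table and column names:

```sql
-- Every datanode holds a full copy; losing one node does not lose this data.
CREATE TABLE accounts (id int, name text) DISTRIBUTE BY REPLICATION;

-- Rows are sharded across datanodes by hash of the column;
-- losing a datanode loses that shard of the data.
CREATE TABLE events (id int, payload text) DISTRIBUTE BY HASH (id);
```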

Related

ProxySQL vs MaxScale on Kubernetes

I'm looking to set up a write proxy for our MariaDB database on Kubernetes. The problem we currently have is that we only have one write master in our 3-master Galera cluster setup. So even though our pods replicate properly, if the first node goes down our other two masters end up failing because they cannot be written to.
I saw that either ProxySQL or MaxScale could be an option for write proxying, but I'm not sure if I'm reading their uses properly. Do I have the right idea in looking to deploy either of these two applications/services on Kubernetes to fix my problem? Would I be able to write to any of the masters in the cluster?
MaxScale will handle selecting which server to write to as long as you use the readwritesplit router and the galeramon monitor.
Here's an example configuration for MaxScale that does load balancing of reads but sends writes to one node:
[maxscale]
threads=auto
[node1]
type=server
address=node1-address
port=3306
[node2]
type=server
address=node2-address
port=3306
[node3]
type=server
address=node3-address
port=3306
[Galera-Cluster]
type=monitor
module=galeramon
servers=node1,node2,node3
user=my-user
password=my-password
[RW-Split-Router]
type=service
router=readwritesplit
cluster=Galera-Cluster
user=my-user
password=my-password
[RW-Split-Listener]
type=listener
service=RW-Split-Router
protocol=mariadbclient
port=4006
Writes are only sent to one node at a time because writing to multiple Galera nodes won't improve write performance, and it causes conflicts when transactions are committed (conflicts that applications rarely seem to handle).
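With this configuration, applications connect to the listener port rather than to any Galera node directly (the host name here is a placeholder for wherever MaxScale runs):

```shell
# Connect through the RW-Split-Listener defined above; MaxScale routes
# reads across the nodes and sends all writes to the chosen master.
mysql -h maxscale-host -P 4006 -u my-user -p
```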

Postgres Patroni and etcd on the Same Machine

Assuming I have 2 postgres servers (1 master and 1 slave) and I'm using Patroni for high availability
1) I intend to have a three-machine etcd cluster. Is it OK to also use the 2 Postgres machines for etcd, plus another server, or is it preferable to use machines that are not used by Postgres?
2) What are my options of directing the read request to the slave and the write requests to the master without using pgpool?
Thanks!
1) Yes, it is best practice to run etcd on the two PostgreSQL machines plus a third one.
2) The only safe way to do that is in your application. The application has to be taught to read from one database connection and write to another.
There is no safe way to distinguish a writing query from a non-writing one; consider
SELECT delete_some_rows();
The application also has to be aware that changes will not be visible immediately on the replica.
Streaming replication is of limited use when it comes to scaling...
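The explicit split described above can be sketched in application code. This is a minimal Python sketch; the DSN strings and the connect() factory are assumptions, not the API of any particular driver (with psycopg2 you would pass each DSN to psycopg2.connect()):

```python
class RoutedSession:
    """Holds one writable (primary) and one read-only (replica) connection source.

    The application must choose explicitly: queries are NOT parsed, because a
    statement like "SELECT delete_some_rows()" writes despite looking like a read.
    """

    def __init__(self, primary_dsn, replica_dsn, connect):
        self._primary_dsn = primary_dsn
        self._replica_dsn = replica_dsn
        self._connect = connect  # callable: dsn -> connection

    def read_conn(self):
        # Reads may see slightly stale data: streaming replication is asynchronous.
        return self._connect(self._replica_dsn)

    def write_conn(self):
        return self._connect(self._primary_dsn)


# Example with a stub connect function standing in for a real driver:
session = RoutedSession("host=primary", "host=replica", connect=lambda dsn: dsn)
print(session.write_conn())  # host=primary
print(session.read_conn())   # host=replica
```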

Is it possible to set up an ejabberd cluster in master-slave mode where data is not being replicated?

I am trying to set up a very simple cluster of 2 ejabberd nodes. However, while following the official ejabberd documentation and using the join_cluster argument that comes with the ejabberdctl script, I always end up with a multi-master cluster where both mnesia databases have replicated data.
Is it possible to set up an ejabberd cluster in master-slave mode? And if yes, what am I missing?
In my understanding, a slave gets the data replicated to it but would simply not be active. The slave needs the data to be able to take over the master's task at some point.
That seems to mean that the core of the setup you describe is not about disabling replication but about not sending traffic to the slave, no?
In that case, it is just a matter of configuring your load-balancing mechanism to route the traffic according to your preference.

ZooKeeper Failover Strategies

We are a young team building an application using Storm and Kafka.
We have a common ZooKeeper ensemble of 3 nodes which is used by both Storm and Kafka.
I wrote a test case to test ZooKeeper failover:
1) Check all the three nodes are running and confirm one is elected as a Leader.
2) Using the ZooKeeper unix client, create a znode and set a value. Verify the value is reflected on the other nodes.
3) Modify the znode: set a value on one node and verify the other nodes have the change reflected.
4) Kill one of the worker nodes and make sure the master/leader is notified about the crash.
5) Kill the leader node. Verify out of other two nodes, one is elected as a leader.
Do I need to add any more test cases? Any additional ideas/suggestions/pointers?
From the documentation:
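Steps 2) and 3) above can be exercised with the bundled CLI, running each command against a different ensemble member to verify propagation (znode path, values, and host names are examples):

```shell
# on node1: create a znode with an initial value
zkCli.sh -server node1:2181 create /failover-test "v1"

# on node2: read it back to confirm it replicated
zkCli.sh -server node2:2181 get /failover-test

# on node1: modify it, then re-check from node3
zkCli.sh -server node1:2181 set /failover-test "v2"
zkCli.sh -server node3:2181 get /failover-test
```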
Verifying automatic failover
Once automatic failover has been set up, you should test its operation. To do so, first locate the active NameNode. You can tell which node is active by visiting the NameNode web interfaces -- each node reports its HA state at the top of the page.
Once you have located your active NameNode, you may cause a failure on that node. For example, you can use kill -9 to simulate a JVM crash. Or, you could power cycle the machine or unplug its network interface to simulate a different kind of outage. After triggering the outage you wish to test, the other NameNode should automatically become active within several seconds. The amount of time required to detect a failure and trigger a fail-over depends on the configuration of ha.zookeeper.session-timeout.ms, but defaults to 5 seconds.
If the test does not succeed, you may have a misconfiguration. Check the logs for the zkfc daemons as well as the NameNode daemons in order to further diagnose the issue.
more on setting up automatic failover

MongoDB share-nothing slaves

I'd like to use mongodb to distribute a cached database to some distributed worker nodes I'll be firing up in EC2 on demand. When a node goes up, a local copy of mongo should connect to a master copy of the database (say, mongomaster.mycompany.com) and pull down a fresh copy of the database. It should continue to replicate changes from the master until the node is shut down and released from the pool.
The requirements are that the master need not know about each individual slave being fired up, nor should the slave have any knowledge of other nodes outside the master (mongomaster.mycompany.com).
The slave should be read only, the master will be the only node accepting writes (and never from one of these ec2 nodes).
I've looked into replica sets, and this doesn't seem to be possible. I've done something similar to this before with a master/slave setup, but it was unreliable. The master/slave replication was prone to sudden catastrophic failure.
Regarding replica sets: while I don't imagine you could have a set member invisible to the primary (and the other nodes), due to the need for replication, you can tailor a particular node to come pretty close to what you want:
Set the newly-launched node to priority 0 (meaning it cannot become primary)
Set the newly-launched node to 'hidden'
Here are links to more info on priority 0 and hidden nodes.
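In the mongo shell, adding such a member looks roughly like this (run against the primary; the host name and _id are hypothetical and the _id must be unique within the set):

```javascript
// Append a priority-0, hidden member to the replica set configuration.
cfg = rs.conf()
cfg.members.push({ _id: 3, host: "ec2-worker-1.mycompany.com:27017",
                   priority: 0, hidden: true })
rs.reconfig(cfg)
```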