I am thinking of using open-source Citus for a two-node cluster. My questions are basically two:
- If this kind of clustering is available: in the case of a failover, is the slave node promoted to master? If so, how? Does it use WAL?
- If this kind of clustering is not possible, what is an alternative other than pgpool?
Thank you.
Citus isn't a high-availability solution for single-node PostgreSQL. Citus shards/partitions your data across multiple servers, and can thus use multiple CPU cores in parallel for your queries or transactions.
Citus is suitable for a variety of use-cases, and you can find more information on those here.
For high availability, Citus can replicate data across multiple nodes, or you can set up streaming replication for each worker node. Citus Cloud uses streaming replication for each node, and you can find more information on how Citus Cloud manages HA in our documentation.
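For illustration, here is a minimal sketch of the shard-replication approach (the table name events and the distribution column tenant_id are placeholders):

    -- Keep two copies of every shard on different workers;
    -- this must be set before the table is distributed.
    SET citus.shard_replication_factor = 2;

    -- Shard the table across the worker nodes by a distribution column.
    SELECT create_distributed_table('events', 'tenant_id');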
I have a use-case where I have to replicate Apache Ignite data (persistent + non-persistent) from one cluster (hosted on AWS) to another (hosted on GCP) and eventually shut down the original cluster.
I have come across GridGain. They have Data Center Replication and Kafka Connect integration methods available. Though this looks promising, it is only available in the enterprise edition.
I am more inclined towards using open source, so I need guidance on whether there is a good way of doing this replication. Please point me to the relevant resources.
Also, please advise which GridGain replication method should be considered.
I have an Aurora PostgreSQL cluster in AWS with one DB instance in it. The PostgreSQL DB is only used for writes, not for reads; I use it as a backup database. I know Aurora has a read scaling policy and I can create multiple DB instances in the cluster to improve read performance, but that doesn't benefit my case (write only). My question is: is there any benefit for me in spinning up multiple DB instances in the cluster? Aurora PostgreSQL is single-master, which means only one instance can take writes. If I deploy multiple instances/replicas, they are basically useless. Do I understand this correctly?
Yes, you are correct. In your case there is no reason to launch additional instances in the cluster.
In the future, you may be able to use Aurora multi-master to give you more performance for writes. This is only available for MySQL 5.6 at the moment. See https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-multi-master.html
I have created a replicated PostgreSQL database (master - slave). I did this with an already existing Ansible playbook (role), which I don't fully understand yet. The cluster currently consists of only 2 databases on different VMs.
So I want to test this replication now.
Unfortunately I have little experience with PostgreSQL.
How can I check whether the two nodes are connected and replicating reliably?
And does the slave really take over if the master fails?
Many thanks for any information, tips & tricks.
PostgreSQL v9.6
Official PostgreSQL does not yet support automatic failover (although there are multiple third-party projects which support this feature). Therefore, if the deployment you have mentioned is plain official PostgreSQL, after a master failure none of the replicas will take over the write role. But they can answer read queries if they are configured as hot_standby.
If you want to check the state of replication, you can query the pg_stat_replication view on the master.
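For example (column names are the 9.6 ones; they were renamed from *_location to *_lsn in PostgreSQL 10):

    -- On the master: one row per connected standby;
    -- state = 'streaming' means replication is healthy.
    SELECT client_addr, state, sent_location, replay_location,
           pg_xlog_location_diff(pg_current_xlog_location(), replay_location) AS lag_bytes
    FROM pg_stat_replication;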
Also, these official docs will help you understand Postgres streaming replication and failover better:
https://www.postgresql.org/docs/9.6/warm-standby.html#STREAMING-REPLICATION
https://www.postgresql.org/docs/9.6/warm-standby-failover.html
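As a quick end-to-end smoke test (a sketch only; the table name is made up), write on the master and read on the slave:

    -- On the master:
    CREATE TABLE replication_smoke_test (id serial PRIMARY KEY, noted_at timestamptz DEFAULT now());
    INSERT INTO replication_smoke_test DEFAULT VALUES;

    -- On the slave, the row should show up almost immediately:
    SELECT * FROM replication_smoke_test;

    -- The slave must refuse writes while in recovery:
    INSERT INTO replication_smoke_test DEFAULT VALUES;
    -- ERROR:  cannot execute INSERT in a read-only transaction

    -- Confirm each node's role:
    SELECT pg_is_in_recovery();  -- false on the master, true on the slave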
For database directories for MongoDB, Cassandra, or Elasticsearch clusters with high availability, should I use EBS or EFS? MongoDB, Cassandra, and Elasticsearch clusters take care of replicating data across nodes if they are configured with a replication factor > 1, so the EFS replication feature may not be needed, I guess.
EBS - for databases
EFS - for file sharing across applications, VMs, etc.
Here is a good article that differentiates between the storage types
https://dzone.com/articles/confused-by-aws-storage-options-s3-ebs-amp-efs-explained
EFS is for multiple servers having access to the same set of files. Cassandra has replication built in, so it has no use for that feature. You would not want multiple Cassandra nodes accessing the same files anyway as each node manages its own sstables.
Not to mention Cassandra is disk intensive and gets angry if there is latency. Cassandra connections time out really easily. So, using an NFS mount (EFS) instead of a “local” disk is just a bad idea.
Read this if you haven’t already: https://aws.amazon.com/blogs/big-data/best-practices-for-running-apache-cassandra-on-amazon-ec2/
(Can’t speak for other databases like MongoDB.)
I need a proper failover mechanism for MongoDB on AWS EC2. I know failover can be accomplished by replica sets, but what is the best way to fire up a new mongo-installed Ubuntu EC2 AMI node, add it back to the replica set automatically (with zero manual operation), and return the replica set to its proper state?
EBS has some problems, but if I use local instance storage, I will lose the dead node's data. Does the replica get all of the master's data, so that the replica alone is enough to recover everything (on mongo 1.8 with journaling), or do I have to use only EBS?
How should I start the mongo instances? If I should start them with the repair option, how can I distinguish a node's first run from a failover restart?
Regards,
The easiest way to bring up a new node is to start it from a recent backup.
So now it's a question of how you do your backup and how you restore from the backup quickly.
The MongoDB site has a write up for backups (in general) and backups on EC2 specifically. There's also a write-up for adding a new set member.
You can do this with instance storage or EBS drives, but you'll need different strategies for each. There's really no single way to do this, so I would check out the docs I've linked to for a primer.
Highly recommend reading Sean Coates' article on multi-node MongoDB elections, failover and AWS - specifically, the subtlety around distributed arbiter nodes (e.g., make sure to give yourself a voting majority when an AZ goes down). A similar recommendation can be found in a comment on this (now-closed) MongoDB vs. Cassandra thread.