What are the deployment forms of opengauss database? What are their advantages and disadvantages? - frameworks

openGauss is an open source relational database management system, which deeply integrates Huawei's years of experience in the database field. Combined with the needs of enterprise scenarios, what deployment forms does it support in the enterprise architecture?
Can single node installation be used for personal learning?
What is the difference with PostgreSQL?

Deployment Solutions:
The openGauss can be deployed in standalone mode or with one primary node and multiple standby nodes.
Common Concepts:
1.Standalone :There is only one database instance.
2.Primary/Standby
There are primary and standby database instances in the system. The primary instance supports read and write, and the standby instance supports read-only.
3.One primary and multiple standbys
The system has one primary node and multiple standby nodes. Up to 8 standby nodes are supported.
4.Hot/Cold backup
Cold backup: There is only a simple backup set that cannot provide services.
Hot backup: Backup databases can provide services for external systems.

Related

PostgreSQL in multi region architecture in azure

I am designing a multi-region architecture from my current architecture with the PostgreSQL database. I want to use the Azure front door to enhance the performance based on the source region the user is running the software from. So The architecture has to be active/active or active/ standby? and depending on this what I can do for the database because from my research the replica of PostgreSQL is only read-only. When the user connects from one region the database in that region became the primary database and allow write/read mode?

Kubernetes - application-consistent backup

Is it possible to perform a backup on Kubernetes in application-consistent manner?
Some of the backup solutions I found mainly base on freezing the pod and then launching the backup to maintain consistency (Heptio's Ark for example.)
The idea of application-consistent backups is to capture all data in memory and all transactions in process. This is performed by using some type of client software co-resident with the database application to quiesce the database application, flush its memory cache, complete all its writes in order and then perform the backup.
In its turn, Kubernetes operates with specifications of resources (e.g., Deployments, Services, etc.) and their statuses, and in any given time the resource status must be the same as defined in the specification. For storing any important data in Kubernetes, persistent volumes are used. In other words, you cannot perform a backup in an application-consistent manner on Kubernetes, because the main idea of it is different.
It is possible that a specific application for a specific database exists and allows implementing such type of backup. But it is related to that application but not to Kubernetes itself.

Will "PostgreSQL Streaming Replication" fit this use case?

I am designing an application for public organizations.
The purpose is to record data (text and video streams) which will be produced in "local" offices, where connectivity is not guaranteed, and where the power will be available only during the occurrences of meetings.
One of the requisites of the project is the "locality" of the data storage, since data is considered "sensitive" and "important".
One second requisite of the project is to publish to a web server a portion of the data produced during the meetings.
The database server shall be PostgreSQL.
I plan to set up a second PostgreSQL database server on the web infrastructure hosting the web server, and synchronize it with the "local" database.
The "public" database will be accessed only by *selection queryes" (no writes).
I see PostgreSQL does implement "Streaming Replication" PostgreSQL Streaming Replication since version 9.0.
The question(s):
Is PostgreSQL Streaming Replication ready for primetime?
Does it fit the use case I describe above?
Should I expect any major problem?
Could you suggest alternative, better solutions?
Yes it is the best solution for your case you should know that
the master database and standby database will be 100% identiques
standby database will not allow to write (read only)
If you have the configuration of master - standby you will not have problems , but if you use master - master configuration , it may cause some problems .

How to setup cross region replica of AWS RDS for PostgreSQL

I have a RDS for PostgreSQL setup in ASIA and would like to have a read copy in US.
But unfortunately just found from the official site that only RDS for MySQL has cross-region replica but not for PostgreSQL.
And I saw this page introduced other ways to migrate data in to and out of RDS for PostgreSQL.
If not buy an EC2 to install a PostgreSQL by myself in US, is there any way the synchronize data from ASIA RDS to US RDS?
It all depends on the purpose of your replication. Is it to provide a local data source and avoid network latencies ?
Assuming that your goal is to have cross-region replication, you have a couple of options.
Custom EC2 Instances
You can create your own EC2 instances and install PostgreSQL so you can customize replication behavior.
I've documented configuring master-slave replication with PostgreSQL on my blog: http://thedulinreport.com/2015/01/31/configuring-master-slave-replication-with-postgresql/
Of course, you lose some of the benefits of AWS RDS, namely automated multi-AZ redundancy, etc., and now all of a sudden you have to be responsible for maintaining your configuration. This is far from perfect.
Two-Phase Commit
Alternate option is to build replication into your application. One approach is to use a database driver that can do this, or to do your own two-phase commit. If you are using Java, some ideas are described here: JDBC - Connect Multiple Databases
Use SQS to uncouple database writes
Ok, so this one is the one I would personally prefer. For all of your database writes you should use SQS and have background writer processes that take messages off the queue.
You will need to have a writer in Asia and a writer in the US regions. To publish on SQS across regions you can utilize SNS configuration that publishes messages onto multiple queues: http://docs.aws.amazon.com/sns/latest/dg/SendMessageToSQS.html
Of course, unlike a two phase commit, this approach is subject to bugs and it is possible for your US database to get out of sync. You will need to implement a reconciliation process -- a simple one can be a pg_dump from Asian and pg_restore into US on a weekly basis to re-sync it, for instance. Another approach can do something like a Cassandra read-repair: every 10 reads out of your US database, spin up a background process to run the same query against Asian database and if they return different results you can kick off a process to replay some messages.
This approach is common, actually, and I've seen it used on Wall St.
So, pick your battle: either you create your own EC2 instances and take ownership of configuration and devops (yuck), implement a two-phase commit that guarantees consistency, or relax consistency requirements and use SQS and asynchronous writers.
This is now directly supported by RDS.
Example of creating a cross region replica using the CLI:
aws rds create-db-instance-read-replica \
--db-instance-identifier DBInstanceIdentifier \
--region us-west-2 \
--source-db-instance-identifier arn:aws:rds:us-east-1:123456789012:db:my-postgres-instance

Load Balancing and Failover for Read-Only PostgreSQL Database

Scenario
Multiple application servers host web services written in Java, running in SpringSource dm Server. To implement a new requirement, they will need to query a read-only PostgreSQL database.
Issue
To support redundancy, at least two PostgreSQL instances will be running. Access to PostgreSQL must be load balanced and must auto-fail over to currently running instances if an instance should go down. Auto-discovery of newly running instances is desirable but not required.
Research
I have reviewed the official PostgreSQL documentation on this issue. However, that focuses on the more general case of read/write access to the database. Top google results tend to lead to older newsgroup messages or dead projects such as Sequoia or DB Balancer, as well as one active project PG Pool II
Question
What are your real-world experiences with PG Pool II? What other simple and reliable alternatives are available?
PostgreSQL's wiki also lists clustering solutions, and the page on Replication, Clustering, and Connection Pooling has a table showing which solutions are suitable for load balancing.
I'm looking forward to PostgreSQL 9.0's combination of Hot Standby and Streaming Replication.
Have you looked at SQL Relay?
The standard solution for something like this is to look at Slony, Londiste or Bucardo. They all provide async replication to many slaves, where the slaves are read-only.
You then implement the load-balancing independent of this - on the TCP layer with something like HAProxy. Such a solution will be able to do failover of the read connections (though you'll still loose transaction visibility on a failover, and have to start new transaction on the new slave - but that's fine for most people)
Then all you have left is failover of the master role. There are supported ways of doing it on all these systems. None of them are automatic by default (because automatic failover of a database master role is really dangerous - consider the situation you are in once you've got split brain), but they can be automated easily if the requirement needs this for the master as well.