I created an RDS Proxy for an existing Aurora PostgreSQL cluster.
But I want to pair the proxy with a specific read replica instance of the cluster. Is that possible?
According to the AWS documentation for RDS Proxy:
The same consideration applies for RDS DB instances in replication configurations. You can associate a proxy only with the writer DB instance, not a read replica.
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/rds-proxy.html
This should be possible now, as per https://aws.amazon.com/about-aws/whats-new/2021/03/amazon-rds-proxy-adds-read-only-endpoints-for-amazon-aurora-replicas/
Try an RDS Proxy endpoint, which lets you make use of read replicas:
You can create and connect to read-only endpoints called reader endpoints when you use RDS Proxy with Aurora clusters. These reader endpoints help to improve the read scalability of your query-intensive applications. Reader endpoints also help to improve the availability of your connections if a reader DB instance in your cluster becomes unavailable.
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/rds-proxy.html#rds-proxy-endpoints
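As a rough sketch of the reader-endpoint approach quoted above, the snippet below builds the request for boto3's `create_db_proxy_endpoint` call with `TargetRole` set to `READ_ONLY`. The proxy name, endpoint name and subnet IDs are placeholders (assumptions), not values from the question. Note that, as far as I know, a reader endpoint routes to the cluster's reader instances as a group rather than to one specific replica.

```python
from typing import Any, Dict, List


def reader_endpoint_request(proxy_name: str, endpoint_name: str,
                            subnet_ids: List[str]) -> Dict[str, Any]:
    """Build keyword arguments for the RDS create_db_proxy_endpoint call."""
    if not subnet_ids:
        raise ValueError("at least one subnet ID is required")
    return {
        "DBProxyName": proxy_name,
        "DBProxyEndpointName": endpoint_name,
        "VpcSubnetIds": list(subnet_ids),
        "TargetRole": "READ_ONLY",  # send connections to reader instances only
    }


def create_reader_endpoint(rds_client: Any, proxy_name: str,
                           endpoint_name: str,
                           subnet_ids: List[str]) -> Dict[str, Any]:
    """Send the request; rds_client is expected to be boto3.client("rds")."""
    request = reader_endpoint_request(proxy_name, endpoint_name, subnet_ids)
    return rds_client.create_db_proxy_endpoint(**request)


# Hypothetical usage (all names and IDs are placeholders):
#   import boto3
#   create_reader_endpoint(boto3.client("rds"), "my-proxy",
#                          "my-proxy-readonly", ["subnet-aaaa", "subnet-bbbb"])
```

Your application would then connect to the hostname of the new proxy endpoint for read-only traffic and keep using the proxy's default endpoint for writes.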
RDS Postgres supports up to 5 read replicas. But when I create a replica, it is created as a single instance, not as part of a cluster.
I want to use RDS Postgres read replicas as a cluster so that my single application can handle high TPS, with the load shared across multiple replicas.
I know this is possible with Aurora replicas, since Aurora creates a cluster of replicas that has a single endpoint and can scale in or out. But all normal RDS Postgres replicas are created as single instances with different endpoints.
Is it possible to make RDS Postgres replicas into a cluster with 1 endpoint?
Clusters are for Aurora, not for RDS. So you have to make sure you choose Aurora when you create your database in the AWS Console.
@Marin is correct.
RDS does not provide auto load balancing between running reader instances.
You have to manage load balancing between replica instances yourself.
In Aurora, there is auto load balancing as well as auto scaling amongst different reader instances.
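To illustrate what "managing load balancing yourself" could look like on the client side, here is a minimal round-robin sketch. The replica hostnames are made up for the example, and a real setup would also need health checks and handling for replication lag, which this sketch leaves out.

```python
import itertools
from typing import Iterator, List


def round_robin(endpoints: List[str]) -> Iterator[str]:
    """Return an endless iterator that cycles through the replica endpoints."""
    if not endpoints:
        raise ValueError("at least one replica endpoint is required")
    return itertools.cycle(endpoints)


# Hypothetical replica endpoints; each RDS read replica has its own hostname.
replicas = [
    "replica-1.example.internal",
    "replica-2.example.internal",
    "replica-3.example.internal",
]

picker = round_robin(replicas)
# Each new read-only connection takes the next endpoint in turn.
first_four = [next(picker) for _ in range(4)]
```

After the third replica, the picker wraps around to the first one again. Other options in the same spirit are a DNS record that lists all replicas or a TCP load balancer in front of them.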
I have been learning Kubernetes for a few weeks and now I am trying to figure out the right way to connect a web server to a statefulset.
Let's say I deployed a master-slave Postgres statefulset and now want to connect my web server to it. With a ClusterIP service, requests will be load balanced across the master and the slaves for both reads (SELECT) and writes (UPDATE, INSERT, DELETE), right? But I can't do that, because write requests should be handled by the master. However, if I point my web server at the master using the headless service, which gives us a DNS entry for each pod, I won't get any load balancing across the slave replicas, and all requests will be handled by a single instance, the master. So how am I supposed to connect them the right way, so that reads are load balanced across all replicas while writes are forwarded to the master?
Should I use two endpoints in the web server, one configured for writes and one for reads?
Or maybe I am using headless services and statefulsets the wrong way, since I am new to Kubernetes?
Well, your thinking is correct - the master should be read-write and the replicas should be read-only. How do you configure it properly? There are different possible approaches.
The first approach is what you are thinking about: set up two headless services, one for accessing the primary instance and a second one for accessing the replica instances. A good example is Kubegres:
In this example, Kubegres created 2 Kubernetes Headless services (of default type ClusterIP) using the name defined in YAML (e.g. "mypostgres"):
a Kubernetes service "mypostgres" allowing to access to the Primary PostgreSql instances
a Kubernetes service "mypostgres-replica" allowing to access to the Replica PostgreSql instances
Then you will have two endpoints:
Consequently, a client app running inside a Kubernetes cluster, would use the hostname "mypostgres" to connect to the Primary PostgreSql for read and write requests, and optionally it can also use the hostname "mypostgres-replica" to connect to any of the available Replica PostgreSql for read requests.
Check this starting guide for more details.
It's worth noting that many database solutions use this approach - another example is MySQL. Here is a good article in the Kubernetes documentation about setting up MySQL using a StatefulSet.
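On the application side, the two-service approach means the web server has to decide which hostname to use for each statement. The sketch below is one naive way to do that, using the "mypostgres" and "mypostgres-replica" service names from the Kubegres example; the `host_for` helper is hypothetical, and treating every statement that starts with SELECT as read-only is a simplifying assumption.

```python
PRIMARY_HOST = "mypostgres"          # read-write service (primary)
REPLICA_HOST = "mypostgres-replica"  # read-only service (replicas)


def host_for(sql: str) -> str:
    """Pick the service hostname for a statement based on its first keyword.

    Only statements starting with SELECT go to the replicas; everything
    else, including empty input, goes to the primary. This is deliberately
    naive: SELECT ... FOR UPDATE or a SELECT that calls a writing function
    would still need the primary.
    """
    words = sql.lstrip().split(None, 1)
    keyword = words[0].upper() if words else ""
    return REPLICA_HOST if keyword == "SELECT" else PRIMARY_HOST
```

In practice many applications skip statement parsing and simply keep two connection pools, one per hostname, and let the calling code choose the read-only pool explicitly.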
Another approach is to use a middleware component that acts as a gatekeeper to the cluster, for example Pg-Pool:
Pg pool is a middleware component that sits in front of the Postgres servers and acts as a gatekeeper to the cluster.
It mainly serves two purposes: Load balancing & Limiting the requests.
Load Balancing: Pg pool takes connection requests and queries. It analyzes the query to decide where the query should be sent.
Read-only queries can be handled by read-replicas. Write operations can only be handled by the primary server. In this way, it loads balances the cluster.
Limits the requests: Like any other system, Postgres has a limit on no. of concurrent connections it can handle gracefully.
Pg-pool limits the no. of connections it takes up and queues up the remaining. Thus, gracefully handling the overload.
Then you will have one endpoint for all operations - the Pg-Pool service. Check this article for more details, including the whole setup process.
I have an application (AWS API Gateway) using an Aurora PostgreSQL cluster.
The cluster has 1 read/write (primary) and one reader endpoint.
At the moment, my application connects to the specific writer instance for all operations:
rds-instance-1.xxx.ap-southeast-2.rds.amazonaws.com
But I have the following endpoints available:
rds.cluster-xxx.ap-southeast-2.rds.amazonaws.com
rds.cluster-ro-xxx.ap-southeast-2.rds.amazonaws.com
rds-instance-1.xxx.ap-southeast-2.rds.amazonaws.com
rds-instance-1-ap-southeast-2c.xxx.ap-southeast-2.rds.amazonaws.com
If I am doing read and write operations, should I be connecting to the instance endpoint I'm using? Or should I use rds.cluster-xxx.ap-southeast-2.rds.amazonaws.com? What are the benefits of using the different endpoints? I understand that if I connect to a read-only endpoint I can only do reads, but for reads and writes, what's the difference between connecting to:
rds.cluster-xxx.ap-southeast-2.rds.amazonaws.com
Or
rds-instance-1.xxx.ap-southeast-2.rds.amazonaws.com
?
What is the right / best endpoint to use for general workloads, and why?
You should use the cluster writer and reader endpoints:
rds.cluster-xxx.ap-southeast-2.rds.amazonaws.com
rds.cluster-ro-xxx.ap-southeast-2.rds.amazonaws.com
The main benefit of using the cluster endpoint is that if a failover occurs, you do not have to worry about which instance the endpoint points to, and you can expect minimal interruption of service.
And if you have 3 read replicas, how would you manage connections to the readers yourself? So it is better to use the cluster reader/writer endpoints.
Using the Reader Endpoint
You use the reader endpoint for read-only connections for your Aurora
cluster. This endpoint uses a load-balancing mechanism to help your
cluster handle a query-intensive workload. The reader endpoint is the
endpoint that you supply to applications that do reporting or other
read-only operations on the cluster.
Using the Cluster Endpoint
You use the cluster endpoint when you administer your cluster, perform
extract, transform, load (ETL) operations, or develop and test
applications. The cluster endpoint connects to the primary instance of
the cluster. The primary instance is the only DB instance where you
can create tables and indexes, run INSERT statements, and perform
other DDL and DML operations.
Instance endpoint
The instance endpoint provides direct control over connections to the
DB cluster, for scenarios where using the cluster endpoint or reader
endpoint might not be appropriate. For example, your client
application might require more fine-grained load balancing based on
workload type. In this case, you can configure multiple clients to
connect to different Aurora Replicas in a DB cluster to distribute
read workloads. For an example that uses instance endpoints to improve
connection speed after a failover for Aurora PostgreSQL
You can check further details at AWS RDS Endpoints.
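To make the three endpoint types above easier to tell apart, here is a small sketch that classifies an Aurora hostname by its second DNS label. It assumes the usual naming pattern (`cluster-` for the cluster endpoint, `cluster-ro-` for the reader endpoint, `cluster-custom-` for custom endpoints, anything else an instance endpoint); the `endpoint_kind` helper is hypothetical.

```python
def endpoint_kind(hostname: str) -> str:
    """Classify an Aurora endpoint hostname by its second DNS label."""
    labels = hostname.split(".")
    if len(labels) < 2:
        raise ValueError(f"not an Aurora endpoint hostname: {hostname!r}")
    second = labels[1]
    # Check the more specific prefixes first; all of them start with "cluster-".
    if second.startswith("cluster-ro-"):
        return "reader"
    if second.startswith("cluster-custom-"):
        return "custom"
    if second.startswith("cluster-"):
        return "cluster"
    return "instance"


# The endpoints from the question:
kinds = {
    host: endpoint_kind(host)
    for host in (
        "rds.cluster-xxx.ap-southeast-2.rds.amazonaws.com",
        "rds.cluster-ro-xxx.ap-southeast-2.rds.amazonaws.com",
        "rds-instance-1.xxx.ap-southeast-2.rds.amazonaws.com",
    )
}
```

A check like this could be used at startup to warn when an application is configured with an instance endpoint for its general workload, since that connection will not follow a failover.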
Is it possible to establish a connection from my localhost app to a Postgres replica set running in Kubernetes? Or what solution do I need in order to have a mirror of my production database?
Thanks in advance
What you need is a so-called PostgreSQL Kubernetes operator that will be responsible for building Kubernetes objects based on your requests.
You can have a look at OperatorHub.io, they have some PostgreSQL operators.
Maybe an easier solution is KubeDB and the KubeDB PostgreSQL implementation.
The operator will also create a Kubernetes Service that provides a resolvable name linked to the Kubernetes Pods of your PostgreSQL cluster. The KubeDB documentation explains how to connect to the database.
Now coming to your question:
Is it possible to establish connection from my localhost app [...]
You can access the Kubernetes service from outside, but you will have to create a Kubernetes LoadBalancer service. See this blog article, which explains it in detail.
We have an instance of AWS RDS Aurora PostgreSQL Serverless with a VPC security group associated that allows connections from anywhere on any port, but we are unable to connect.
We always get the error "could not connect to server: Connection timed out".
We have found references to a "public accessibility" parameter to solve the problem, but we are unable to find where to make the change.
Any help?
Thanks
Aurora Serverless does not support publicly accessible endpoints at this time. It must be accessed from inside the VPC. Make sure you are attempting to connect to Aurora from within the VPC, and that the security group assigned to the Aurora cluster has the appropriate rules to allow access.
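If you want to confirm where the timeout comes from, a plain TCP probe can help. The sketch below is a generic check, not anything Aurora-specific; the `can_reach` helper is hypothetical and port 5432 is assumed as the default PostgreSQL port. Run from a machine outside the VPC it should fail, and run from an instance inside the VPC (with the security group allowing it) it should succeed.

```python
import socket


def can_reach(host: str, port: int = 5432, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS resolution failures.
        return False


# Hypothetical usage; replace the hostname with your cluster endpoint:
#   can_reach("my-cluster.cluster-xxx.us-east-1.rds.amazonaws.com")
```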