My Kubernetes cluster runs an n-tier web application in dev and test environments on AWS. For the production environment, Postgres RDS was chosen to get periodic backups. While creating the Postgres RDS instance, kubernetes-vpc was selected for the DB subnet to keep the networking simple during the pilot run.
The security group selected is also the same one used by the kubernetes-minions.
Following are the Service and Endpoints YAML:
apiVersion: v1
kind: Service
metadata:
  labels:
    name: pgsql-rds
  name: pgsql-rds
spec:
  ports:
    - port: 5432
      protocol: TCP
      targetPort: 5432
---
apiVersion: v1
kind: Endpoints
metadata:
  name: pgsql-rds
subsets:
  - addresses:
      - ip: 52.123.44.55
    ports:
      - port: 5432
        name: pgsql-rds
        protocol: TCP
When the web-app Service and Deployment are created, the application is unable to connect to the RDS instance.
The log is as follows:
java.sql.SQLException: Error in allocating a connection. Cause: Connection could not be allocated because: Connection to pgsql-rds:5432 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
What am I missing? Any pointers to resolve the issue are appreciated.
This has to do with DNS resolution. When you use the RDS DNS name INSIDE the same VPC, it resolves to a private IP. When you use the same DNS name from the internet or from another VPC, you get the public IP of the RDS instance.
This is a problem because from another VPC you cannot make use of the load balancing feature unless you expose the RDS instance to the public internet.
It's been a while since the issue was resolved.
I don't exactly remember now which step I missed that caused the connection problem.
But below are the steps that did work for me.
Prerequisite: the Kubernetes cluster is set up with a VPC ('k8s-vpc').
Create VPC SUBNET
Go to the VPC dashboard and make sure you are in the same AWS region as the k8s minions (you will see the existing 'k8s-vpc').
Create a subnet in each availability zone.
Select 'k8s-vpc' as the VPC from the drop-down.
The CIDR could be 172.20.16.0/24 or 172.20.32.0/24, for example (see the CLI sketch below).
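If you prefer the CLI, a hedged sketch of the equivalent aws command (the VPC ID and availability zone below are placeholders; repeat per AZ):

# hedged sketch: create one subnet per availability zone in k8s-vpc
aws ec2 create-subnet \
    --vpc-id vpc-0123456789abcdef0 \
    --cidr-block 172.20.16.0/24 \
    --availability-zone ap-northeast-1a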
Create a DB SUBNET and SUBNET GROUP for the VPC of the k8s minions, if not already available.
Go to the RDS Dashboard.
Create a subnet group (e.g. my-db-subnet-group) for the DB and add all the subnets from step 1 to it.
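The same step via the AWS CLI could look roughly like this (the subnet IDs are placeholders for the subnets created in step 1):

# hedged sketch: group the subnets from step 1 into a DB subnet group
aws rds create-db-subnet-group \
    --db-subnet-group-name my-db-subnet-group \
    --db-subnet-group-description "DB subnets in k8s-vpc" \
    --subnet-ids subnet-aaaa1111 subnet-bbbb2222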
From the RDS Dashboard, create a Parameter Group
(e.g. my-db-param-group) for Postgres (version 9.5 in this example).
Copy the value of max_connections to the max_prepared_transactions field and save.
Create the RDS instance for the Postgres DB
Launch DB instance -> select Engine Postgres -> choose stage (Production or Dev/Test)
-> give the instance specs (instance type, disk space, etc.) and specify DB settings (user/password)
-> Configure Advanced Settings
VPC selection: 'k8s-vpc'
DB subnet group should be the one created in the previous step (my-db-subnet-group)
VPC security group should be the one used by the Kubernetes minions - so that no additional configuration is required for access from the minions
Select Publicly Accessible - to connect to Postgres from the internet
Select Parameter Group 'my-db-param-group'.
Specify database options, backup and maintenance options, and finally launch the instance.
Also check the security group of the VPC and add an inbound rule to allow connections to the Postgres port.
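If you prefer the CLI, a rough sketch of such an inbound rule (the security group ID and CIDR are placeholders; use the security group of the Kubernetes minions and the VPC CIDR you actually use):

# hedged sketch: allow Postgres traffic from inside the VPC
aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp \
    --port 5432 \
    --cidr 172.20.0.0/16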
You can test the connection from one of the k8s pods (kubectl exec -it) where a Postgres client is installed.
Make sure to change the user to postgres.
Connect to RDS using psql as shown below:
$ psql --host=my-rds-dev.cyhi3va0y2or.ap-northeast-1.rds.amazonaws.com --port=5432 --username=<masterUserName> --password --dbname=<masterDB>
If everything is set up correctly, it should prompt you for the password of the DB user.
Providing the correct password will finally connect you to RDS.
This article was of great help.
Your IP is of the form 52.123.44.55. This is a public IP; see the official RFC on private address ranges (RFC 1918).
Since you said both are in the same VPC, you could have used the internal IP address instead.
That said, the error "Connection to pgsql-rds:5432 refused" means that the address was resolved; otherwise you would get "psql: error: could not translate host name "pgsql-rds" to address: Name or service not known". Therefore, it is not a DNS issue as cited in another answer.
The likely cause of the block is that the security group was not configured to accept requests from the EC2 instance's external IP address. This is the official AWS documentation on scenarios for connecting to RDS.
You might have already whitelisted all connections from the VPC; however, double-check the security groups. I would not recommend whitelisting the external IP address, although it works if you put the external IP address there. It is a security concern when you don't have an Elastic IP address, and there are data transfer costs unless you have a more complex setup.
That said, you could have avoided the Kubernetes resources and used the DNS address of the RDS instance.
If you had to avoid using the DNS address of the RDS instance directly, you could have used the following:
apiVersion: v1
kind: Service
metadata:
  name: pgsql-rds
spec:
  externalName: my-rds-dev.cyhi3va0y2or.ap-northeast-1.rds.amazonaws.com
  ports:
    - port: 5432
      protocol: TCP
      targetPort: 5432
  sessionAffinity: None
  type: ExternalName
With the setup above, you don't need a Kubernetes Endpoints object. You can just use pgsql-rds, or any variation using the namespace as the domain, or the fully qualified version, such as pgsql-rds.default. This is the documentation for ExternalName.
I see that the original poster mentioned the problem was solved; however, it is not clearly documented whether the problem was a combination of using the external IP address and the security group rules.
I have installed PostgreSQL and microk8s on Ubuntu 18.
One of my microservices, which is inside the microk8s single-node cluster, needs to access the PostgreSQL installed on the same VM.
Some articles suggest that I should create a service.yml and endpoint.yml like this:
apiVersion: v1
kind: Service
metadata:
  name: postgresql
spec:
  type: ClusterIP
  ports:
    - port: 5432
      targetPort: 5432
---
kind: Endpoints
apiVersion: v1
metadata:
  name: postgresql
subsets:
  - addresses:
      - ip: ?????
    ports:
      - port: 5432
Now I am not getting what I should put in the subsets.addresses.ip field?
First you need to configure your PostgreSQL to listen not only on your VM's localhost. Let's assume you have a network interface with IP address 10.1.2.3 configured on the node on which your PostgreSQL instance is installed.
Add the following entry in your /etc/postgresql/10/main/postgresql.conf:
listen_addresses = 'localhost,10.1.2.3'
and restart your postgres service:
sudo systemctl restart postgresql
You can check if it listens on the desired address by running:
sudo ss -ntlp | grep postgres
From Pods deployed within your Microk8s cluster you should be able to reach your node's IP addresses, e.g. you should be able to ping the aforementioned 10.1.2.3 from your Pods.
As it doesn't require any load balancing, you can reach your PostgreSQL directly from your Pods without needing to configure an additional Service that exposes it to your cluster.
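For example, a quick connectivity check could be done with a throwaway client Pod (a sketch; the image, user and database names are just placeholders):

# hedged sketch: run a one-off psql client Pod and connect to the node's IP
microk8s.kubectl run pg-client -it --rm --restart=Never --image=postgres:10 -- \
    psql -h 10.1.2.3 -U postgres -d postgres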
If you don't want to refer to your PostgreSQL instance in your application by its IP address, you can edit your Deployment (which manages the set of Pods that connect to your Postgres DB) to modify the default content of the /etc/hosts file used by your Pods.
Edit your app Deployment by running:
microk8s.kubectl edit deployment your-app
and add the following section under Pod template spec:
hostAliases: # it should be on the same indentation level as "containers:"
- hostnames:
  - postgres
  - postgresql
  ip: 10.1.2.3
After saving it, all your Pods managed by this Deployment will be recreated according to the new specification. When you exec into your Pod by running:
microk8s.kubectl exec -ti pod-name -- /bin/bash
you should see additional section in your /etc/hosts file:
# Entries added by HostAliases.
10.1.2.3 postgres postgresql
From now on you can refer to your Postgres instance in your app by the names postgres:5432 or postgresql:5432, and it will be resolved to your VM's IP address.
I hope it helps.
UPDATE:
I almost forgot that some time ago I posted an answer on a very similar topic. You can find it here. It describes the usage of a Service without a selector, which is basically what you mentioned in your question. And yes, it can also be used for configuring access to your PostgreSQL instance running on the same host. As this kind of Service doesn't have a selector by definition, no Endpoints object is automatically created by Kubernetes and you need to create one yourself. Once you have the IP address of your Postgres instance (in our example it is 10.1.2.3) you can use it in your Endpoints definition, as shown below.
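Putting it together, the manifests from your question could look like this, with 10.1.2.3 standing in for your node's IP:

apiVersion: v1
kind: Service
metadata:
  name: postgresql
spec:
  ports:
    - port: 5432
      targetPort: 5432
---
apiVersion: v1
kind: Endpoints
metadata:
  name: postgresql
subsets:
  - addresses:
      - ip: 10.1.2.3
    ports:
      - port: 5432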
Once you configure everything on the Kubernetes side, you may still encounter an issue with Postgres. In the Pod that is trying to connect to the Postgres instance you may see the following error message:
org.postgresql.util.PSQLException: FATAL: no pg_hba.conf entry for host 10.1.7.151
It basically means that your pg_hba.conf file lacks the entry required to allow your Pod to access your PostgreSQL database. Authentication is host-based; in other words, only hosts with certain IPs, or with IPs within a certain range, are allowed to authenticate.
Client authentication is controlled by a configuration file, which
traditionally is named pg_hba.conf and is stored in the database
cluster's data directory. (HBA stands for host-based authentication.)
So now you probably wonder which network you should allow in your pg_hba.conf. To handle cluster networking Microk8s uses flannel. Take a look at the content of your /var/snap/microk8s/common/run/flannel/subnet.env file. Mine looks as follows:
FLANNEL_NETWORK=10.1.0.0/16
FLANNEL_SUBNET=10.1.53.1/24
FLANNEL_MTU=1410
FLANNEL_IPMASQ=false
Adding only the flannel subnet to your pg_hba.conf should be enough to ensure that all your Pods can connect to PostgreSQL.
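A rough sketch of such an entry, using the FLANNEL_NETWORK range from the file above so that every Pod IP is covered (adjust the authentication method to whatever you actually use):

# /etc/postgresql/10/main/pg_hba.conf
# TYPE  DATABASE  USER  ADDRESS       METHOD
host    all       all   10.1.0.0/16   md5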
I am trying to access a Postgres DB deployed in Kubernetes (kubeadm) on CentOS VMs from another application running on another CentOS VM. I have deployed the Postgres service as 'NodePort' type. My understanding is that we can deploy it as LoadBalancer type only on cloud providers like AWS/Azure and not on bare-metal VMs. So now I am trying to configure 'ingress' with the NodePort type service. But I am still unable to access my DB other than by using kubectl exec $Pod-Name on the Kubernetes master.
My ingress.yaml is
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: postgres-ingress
spec:
  backend:
    serviceName: postgres
    servicePort: 5432
which does not show any address, as below:
NAME HOSTS ADDRESS PORTS AGE
postgres-ingress * 80 4m19s
I am not even able to access it from pgAdmin on my local Mac. Am I missing something?
Any help is highly appreciated.
Ingress won't work; it's only designed for HTTP traffic, and the Postgres protocol is not HTTP. You want solutions that deal with raw TCP traffic:
A NodePort service alone should be enough. It's probably the simplest solution. Find out the port by doing kubectl describe on the service, and then connect your Postgres client to the IP of the node VM (not the pod or service) on that port.
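For example (a sketch; the service name and the assigned port are placeholders):

# find the randomly assigned NodePort (look for the "NodePort:" line)
kubectl describe service postgres
# then, from the other VM, connect to any node's IP on that port
psql -h <node-ip> -p 31234 -U postgres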
You can use port-forwarding: kubectl port-forward pod/your-postgres-pod 5432:5432, and then connect your Postgres client to localhost:5432. This is my preferred way for accessing the database from your local machine (it's very handy and secure) but I wouldn't use it for production workloads (kubectl must be always running so it's somewhat fragile and you don't get the best performance).
If you do special networking configuration, it is possible to directly access the service or pod IPs from outside the cluster. You have to route traffic for the pod and service CIDR ranges to the k8s nodes; this will probably involve configuring your VM hypervisors, routers and firewalls, and is highly dependent on what networking (CNI) plugin you are using for your Kubernetes cluster.
I want to connect to my Postgres DB. I use the deployment's NodePort IP for the host field and also the data from the config file:
data:
  POSTGRES_DB: postgresdb
  POSTGRES_PASSWORD: my_password
  POSTGRES_USER: postgresadmin
But I get an error. What am I doing wrong? If you need more info, let me know.
Unless you are connected to your cluster through a VPN (or Direct Connect), you can't access 10.121.8.109. It's a private IP address and only available to apps and services within your VPC.
You need to create public access for your NodePort service. Try kubectl get service to find out the External IP of your service, then try to connect using that External IP.
Rather than using a NodePort service, you are better off using a LoadBalancer type service, which might give you better flexibility in managing this, especially in a production environment, but it will cost a little more. The likelihood of a node IP address changing is high, whereas a load balancer or ingress service would manage this for you automatically through a fixed DNS name. So you need to weigh the pros and cons of each service type based on your workload.
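A minimal sketch of such a LoadBalancer Service (the name and labels are illustrative, not from your setup; you can additionally restrict who may connect with spec.loadBalancerSourceRanges):

apiVersion: v1
kind: Service
metadata:
  name: postgres-lb
spec:
  type: LoadBalancer
  selector:
    app: postgres
  ports:
    - port: 5432
      targetPort: 5432
      protocol: TCP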
I need to access an internal application running behind a GKE Nginx Ingress service riding on an Internal Load Balancer, from another GCP region.
I am fully aware that it is not possible using direct Google networking and that it is a huge limitation (GCP Feature Request).
The Internal Load Balancer can be accessed perfectly well via a VPN tunnel from AWS, but I am not sure that creating such a tunnel between GCP regions under the same network is a good idea.
Workarounds are welcome!
In the release notes from GCP, it is stated that:
Global access is an optional parameter for internal LoadBalancer Services that allows clients from any region in your VPC to access the internal TCP/UDP Load Balancer IP address.
Global access is enabled per-Service using the following annotation:
networking.gke.io/internal-load-balancer-allow-global-access: "true".
UPDATE: The Service below works for GKE v1.16.x and newer versions:
apiVersion: v1
kind: Service
metadata:
  name: ilb-global
  annotations:
    # Required to assign internal IP address
    cloud.google.com/load-balancer-type: "Internal"
    # Required to enable global access
    networking.gke.io/internal-load-balancer-allow-global-access: "true"
  labels:
    app: hello
spec:
  type: LoadBalancer
  selector:
    app: hello
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
For GKE v1.15.x and older versions:
Accessing the internal load balancer IP from a VM sitting in a different region will not work, but the following helped me make the internal load balancer global.
As we know, an internal load balancer is nothing but a forwarding rule, so we can use gcloud to enable global access on it.
First, get the internal IP address of the Load Balancer using kubectl and note it down, like below:
# COMMAND:
kubectl get services/ilb-global
# OUTPUT:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ilb-global LoadBalancer 10.0.12.12 10.123.4.5 80:32400/TCP 18m
Note the value of "EXTERNAL-IP" or simply run the below command to make it even simpler:
# COMMAND:
kubectl get service/ilb-global \
-o jsonpath='{.status.loadBalancer.ingress[].ip}'
# OUTPUT:
10.123.4.5
GCP gives a randomly generated ID to the forwarding rule created for this Load Balancer. If you have multiple forwarding rules, use the following command to figure out which one is the internal load balancer you just created:
# COMMAND:
gcloud compute forwarding-rules list | grep 10.123.4.5
# OUTPUT
NAME REGION IP_ADDRESS IP_PROTOCOL TARGET
a26cmodifiedb3f8252484ed9d0192 asia-south1 10.123.4.5 TCP asia-south1/backendServices/a26cmodified44904b3f8252484ed9d019
NOTE: If you are not working on Linux or grep is not installed, simply run gcloud compute forwarding-rules list and manually look for the forwarding rule with the IP address we are looking for.
Note the name of the forwarding rule and run the following command to update it with --allow-global-access (remember to add beta, as it is still a beta feature):
# COMMAND:
gcloud beta compute forwarding-rules update a26cmodified904b3f8252484ed9d0192 \
--region asia-south1 --allow-global-access
# OUTPUT:
Updated [https://www.googleapis.com/compute/beta/projects/PROJECT/regions/REGION/forwardingRules/a26hehemodifiedhehe490252484ed9d0192].
And it's done. Now you can access this internal IP (10.123.4.5) from any instance in any region (within the same VPC network).
Another possible way is to implement an nginx reverse proxy server on a Compute Engine instance in the same region as the GKE cluster, and use the internal IP of the Compute Engine instance to communicate with the services of the GKE cluster.
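A very rough sketch of what the nginx configuration on that Compute Engine instance could look like (the upstream address stands for the internal load balancer IP used in the example above):

# /etc/nginx/conf.d/gke-proxy.conf (hedged sketch)
server {
    listen 80;
    location / {
        # forward everything to the GKE internal load balancer
        proxy_pass http://10.123.4.5;
    }
}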
First of all, note that the only way to connect to any GCP resource (in this case your GKE cluster) from an on-premises location is through either a Cloud Interconnect or a VPN setup, which must actually be in the same region and VPC to be able to communicate with each other.
Having said that, I see you would rather not do that under the same VPC, so a workaround for your scenario could be:
Creating a Service of type LoadBalancer, so your cluster can be reachable through an external (public) IP by exposing this service. If you are worried about security, you can use Istio to enforce access policies, for example.
Or creating HTTP(S) load balancing with an Ingress, so your cluster can be reachable through its external (public) IP. Here, again, for security purposes you can use GCP Cloud Armor, which so far works only for HTTP(S) Load Balancing.
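As an illustration of the second option, a minimal Ingress sketch (the backend service name and port are placeholders; on GKE this provisions an external HTTP(S) load balancer by default):

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: hello-ingress
spec:
  backend:
    serviceName: hello
    servicePort: 80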
How can I expose a service of type NodePort to the internet without using type LoadBalancer? Every resource I have found does it by using a load balancer. But I don't want load balancing; it is expensive and unnecessary for my use case, because I am running one instance of a Postgres image mounted to a persistent disk, and I would like to be able to connect to my database from my PC using pgAdmin. If it is possible, could you please provide a bit more detailed answer, as I am new to Kubernetes, GCE and networking.
Just for the record and a bit more context: I have a deployment running 3 replicas of my API server, to which I am connecting through a load balancer with a set loadBalancerIP, and another deployment running one instance of Postgres with a NodePort service through which my API servers communicate with my DB. My problem is that maintaining the DB without public access is hard.
Using NodePort as the Service type works straight away, e.g. like this:
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  type: NodePort
  ports:
    - port: 80
      nodePort: 30080
      name: http
    - port: 443
      nodePort: 30443
      name: https
  selector:
    name: nginx
More details can be found in the documentation.
The drawback of using NodePort is that you have to take care of integrating with your provider's firewall yourself. A starting point for that can also be found in the Configuring Your Cloud Provider's Firewalls section of the official documentation.
For GCE, opening up the above to the public on all nodes could look like:
gcloud compute firewall-rules create myservice --allow tcp:30080,tcp:30443
Once this is in place your services should be accessible through any of the public IPs of your nodes. You'll find them with:
gcloud compute instances list
You can run kubectl in a terminal window (Command Prompt or PowerShell on Windows) to port-forward the PostgreSQL deployment to your localhost.
kubectl port-forward deployment/my-pg-deployment 5432:5432
While this command is running (it runs in the foreground), you can use pgAdmin to point to localhost:5432 to access your pod on GKE. Simply close the terminal once you are done using pgAdmin.
For the sake of improved security: if in doubt about exposing a service like a database to the public internet, you might like the idea of hiding it behind a simple Linux VM called a jump host (also called a bastion host in the official GCP documentation), which is the recommended approach. This way your database instance continues to be open only towards the internal network. You can then remove the external IP address so that it stops being exposed to the internet.
The high level concept:
public internet <- SSH:22 -> bastion host <- db:5432 -> database service
After setting up your SSH connection and establishing it, you can reach the database by forwarding the database port (see the example below).
The Procedure Overview
Create the GCE VM
Specific requirements:
Pick the image of a Linux distribution you are familiar with
VM Connectivity to internet: Attach a public IP to the VM (you can do this during or after the installation)
Security: Go to Firewall rules and add a new rule opening port 22 on the VM. Restrict the incoming connections to your home public IP (see the gcloud sketch below).
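A hedged gcloud sketch of such a firewall rule (the rule name, source range and target tag are placeholders; the VM must carry the matching network tag):

gcloud compute firewall-rules create allow-ssh-from-home \
    --allow tcp:22 \
    --source-ranges <your-home-public-ip>/32 \
    --target-tags bastion-host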
Go to your local machine, from which you will connect, and set up the connection as in the example below.
SSH Connect to the bastion host VM
An example setup for your SSH connection, located at $HOME/.ssh/config (if this file doesn't exist, just create it):
Host bastion-host-vm
  Hostname external-vm-ip
  User root
  IdentityFile ~/.ssh/id_ed25519
  LocalForward 5432 internal-vm-ip:5432
Now you are ready to connect from your local machine's terminal with this command:
ssh bastion-host-vm
Once connected, you can pick your favorite database client and connect to localhost:5432 (the port forwarded through the SSH connection from the remote database instance behind the SSH host).
CAUTION: The port forwarding only works as long as the SSH connection is established. If you disconnect or close the terminal window, the SSH connection will close, and so will the database port forwarding. So keep the terminal open and the connection to your bastion host established for as long as you are using the database connection.
Pro tip for cost saving on the GCE VM
You could use the free tier offer for creating the bastion host VM, which means increased protection for free.
Search for "Compute Engine" in the official table.
You could check this thread for more details on GCE free limits.