Multiple databases in a Postgres cluster on Kubernetes? - postgresql

Is it possible to have multiple databases in a cluster with Crunchy Data (Postgres)?
When I create a cluster with "pgo create cluster" command I can specify only one database.
-d, --database string If specified, sets the name of the initial database that is created for the user. Defaults to the value set in the PostgreSQL Operator configuration, or if that is not present, the name of the cluster
But I need multiple database per cluster, and I can't find any official way to create them.
Another question: how can I find the superuser username and password to log in to the pgAdmin web interface?
Thanks a lot.

This could be useful; however, it states:
"It may make more sense to have each of your databases in its own cluster if you want to have them spread out over your Kubernetes topology."
https://github.com/CrunchyData/postgres-operator/issues/2655
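If you do need more than one database in a single cluster, one workaround is to create the additional databases yourself once the cluster is up, for example by running psql against the primary. A rough sketch (the namespace, cluster, database, and user names are all hypothetical; the postgres password lives in the cluster's secret if you are prompted for one):
# Create a second database and grant a user access to it, via the primary deployment.
kubectl -n pgo exec deployment/mycluster -- psql -U postgres -c "CREATE DATABASE seconddb"
kubectl -n pgo exec deployment/mycluster -- psql -U postgres -c "GRANT ALL PRIVILEGES ON DATABASE seconddb TO myuser"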

Related

What is the process to clean up orphaned dynamic creds in Hashicorp Vault?

I am using Hashicorp Vault to generate dynamic creds in multiple database clusters. We have one database cluster that is somewhat ephemeral, so on occasion it will be refreshed from another database cluster. This database cluster will be connected to via Vault dynamic creds, just like the other database clusters.
I have a process to clean up the database users brought over by the backup from the source system when this cluster is refreshed, but I don't know how I should handle the Vault cleanup. The database config will be the same (same host/user), but all the existing database user accounts recently created by Vault will be gone after the refresh, so I don't know what I need to do to reset/clean up Vault for that database. The database system I'm using (Redshift) doesn't seem to have DROP USER ... IF EXISTS type of syntax; otherwise I would simply use that in the dynamic role's revocation_statements and let it cycle out naturally that way.
So my main question is how do I reset or delete all the dynamic creds that were created for a specific database cluster in Vault if the database cluster is refreshed or no longer exists?
I figured out the answer to this and I wanted to share here in case anyone else encounters this.
The "lease revoke" documentation explains that you can use the -prefix switch to revoke leases based on a partial or full prefix match.
Using this information you can run a command similar to the following in order to force revoke existing leases for a specific role:
vault lease revoke -force -prefix database/creds/ROLE_NAME
Using the -force switch will remove the lease even if the revocation_statements fail to run (e.g. when the database user no longer exists).
As an aside, the following command can be used to list leases and is useful to check before and after that all the leases are, in fact, revoked:
vault list sys/leases/lookup/database/creds/ROLE_NAME
This solves my problem of "how do I remove leases for orphaned Vault dynamic credentials" in cases where the target database is refreshed from a backup, which is the case I am using this for.
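Putting it together, a typical cleanup pass looks like this (the role name below is just a placeholder): list the leases, force-revoke them, then confirm the list is empty:
# List leases, force-revoke everything under the role's prefix, then verify.
vault list sys/leases/lookup/database/creds/my-redshift-role
vault lease revoke -force -prefix database/creds/my-redshift-role
vault list sys/leases/lookup/database/creds/my-redshift-role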

If I declare 2 replicas of PostgreSQL StatefulSet pods in k8s, are they the same database or they just share the volume?

After creating 2 replicas of PostgreSQL StatefulSet pods in k8s, are they the same database?
If they are, why can I create a DB and user in one pod but not find them in the other?
If they are not, is there any point in creating replicas?
There isn't one simple answer here; it depends on how you configured things. Postgres doesn't support multiple instances sharing the same underlying volume without massive corruption, so if you did set things up that way, it's definitely a mistake. More common would be to use the volumeClaimTemplate system so each pod gets its own distinct storage. Then you set up Postgres streaming replication yourself.
Or look at using an operator which handles that setup (and probably more) for you.
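As a quick illustration of why two plain replicas are not automatically the same database (the pod and user names here are hypothetical), a database created on one pod will not show up on the other unless replication has been configured:
kubectl exec postgres-0 -- psql -U postgres -c "CREATE DATABASE demo"
kubectl exec postgres-1 -- psql -U postgres -c "\l"   # without replication, 'demo' is not listed here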
To add to coderanger's answer: as he said, it's hard to say how Postgres will behave with multiple replicas, and how data will be replicated across the cluster, without looking at the setup in more depth. Setting multiple replicas directly, without reading the documentation on data replication, can lead to big issues.
Here is a nice example from Google for reference: https://cloud.google.com/architecture/deploying-highly-available-postgresql-with-gke
For examples of Postgres database replication and clustering config files: https://github.com/CrunchyData/crunchy-containers/tree/master/examples/kube

Why do you need to reference the IAM role in the COPY command on Redshift?

Regarding using the COPY command to populate a Redshift table with data from S3: I'm wondering whether there is a reason why you have to specify, via its ARN, the role that provides the permissions, even though the Redshift cluster is already associated with that role. This seems redundant to me, but there is probably a reason for it. Hence my question.
This question arose upon reading the Redshift getting started guide, specifically steps 2, 3 and 6.
It's not mandatory to reference an IAM role when using the COPY command. This is one of several authorization methods available for the cluster to access external resources (e.g. files stored on S3). The reason for specifying the IAM_ROLE clause is to tell Redshift that this is the authorization method to use; you could alternatively specify ACCESS_KEY_ID/SECRET_ACCESS_KEY or CREDENTIALS.
https://docs.aws.amazon.com/redshift/latest/dg/copy-usage_notes-access-permissions.html
The reason you need to add the ARN for a specific IAM role is that it's possible to add more than one role to a cluster.
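For illustration, here is the same load expressed with both authorization methods (the cluster endpoint, table, bucket, and role ARN are all hypothetical):
# Role-based authorization via IAM_ROLE.
psql -h example-cluster.abc123.us-east-1.redshift.amazonaws.com -p 5439 -U awsuser -d dev \
  -c "COPY sales FROM 's3://example-bucket/sales/' IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleRedshiftRole' CSV;"
# Key-based authorization with access key and secret.
psql -h example-cluster.abc123.us-east-1.redshift.amazonaws.com -p 5439 -U awsuser -d dev \
  -c "COPY sales FROM 's3://example-bucket/sales/' ACCESS_KEY_ID '<access-key-id>' SECRET_ACCESS_KEY '<secret-access-key>' CSV;"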

Kubernetes: Databases & DB Users

We are planning to use Kube for Postgres deployments. Our applications will be microservices, each with a separate schema (or logical database). For security's sake, we'd like to have a separate user for each schema/logical_db.
I suppose the DB/schema and user should be created by Kube, so the application itself does not need access to a DB admin account.
In Stolon it seems it is only possible to create a single user and a single database, and this seems to be the case for other HA Postgres charts as well.
Question: What is the preferred way in Microservices in Kube to create DB users?
When it comes to creating users, as you said, most charts and containers have environment variables for creating a user at boot time. However, most of them do not consider the possibility of creating multiple users at boot time.
What other containers do is, as you said, keep the root credentials in k8s secrets so they can access the database and create the proper schemas and users. This does not necessarily need to be done in the application logic; it can be done, for example, with an init container that sets up the proper database for your application to run.
https://kubernetes.io/docs/concepts/workloads/pods/init-containers
This way you would have a pod with two containers: one for your application and an init container for setting up the DB.
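As a rough sketch of that approach (the host, user, database names, and environment variables here are all hypothetical; the admin credentials would be mounted from a k8s Secret), the init container could simply run psql against the admin account:
# Create a dedicated user and database for one microservice.
psql "host=$DB_HOST user=$POSTGRES_ADMIN_USER password=$POSTGRES_ADMIN_PASSWORD dbname=postgres" <<'SQL'
CREATE USER svc_orders WITH PASSWORD 'change-me';
CREATE DATABASE orders OWNER svc_orders;
SQL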

Deploying MongoDB on Google Cloud Platform?

Hello all. For my startup I am using Google Cloud Platform. I am using App Engine with Node.js and that part is working fine, but now I need a database. Since I am using MongoDB, I found this click-to-deploy option: https://console.cloud.google.com/launcher/details/click-to-deploy-images/mongodb?q=mongo. When I launched it, it created three instances in Compute Engine, but now I don't know which is the primary instance and which are the secondaries. I have also read that the primary instance should be used for writing data and the secondaries for reading. So when I query my database, should I provide a secondary instance URL, and for updating/inserting data should I provide the primary instance URL? Otherwise, which URL should I use for CRUD operations on my MongoDB database? Also, after launching this, do I have to make any changes in any conf file manually, or is that already done for me? And do I have to put all three instances in an instance group?
Please, if any of you think I have not done enough research on this or that it's not a valid Stack Overflow question, I am sorry; Google Cloud Platform is very new, so there is not much documentation on it, and this is also my first time deploying my code to servers, so I am a complete noob in this field. Thanks anyway, and please help me out.
but now I don't know which is the primary instance and which are the secondaries,
Generally the Cloud Launcher will name the primary with the suffix -1 (dash one). For example, by default it would create the mongodb-1-server-1 instance as the primary.
You can also discover which one is the primary by running rs.status() on any of the instances via the mongo shell. For example:
mongo --host <External instance IP> --port <Port Number>
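Once connected, a quick way to see which member is PRIMARY is to print each member's state (the IP below is just a placeholder):
mongo --host 203.0.113.10 --port 27017 --eval "rs.status().members.forEach(function(m) { print(m.name, m.stateStr); })"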
You can get the list of external IPs of the instances using gcloud. For example:
gcloud compute instances list
By default you won't be able to connect straight away; you need to create a firewall rule to open the relevant port(s) on the Compute Engine instances. For example:
gcloud compute firewall-rules create default-allow-mongo --allow tcp:<PORT NUMBER> --source-ranges 0.0.0.0/0 --target-tags mongodb --description "Allow mongodb access to all IPs"
Insert a sensible port number and please avoid using the default value. You may also want to limit the source IP ranges, e.g. to your office IP. See also Cloud Platform: Networking.
I read that the primary instance should be used for writing data and the secondaries for reading,
Generally, replication is there to provide redundancy and high availability: the primary instance is used for both reads and writes, and the secondaries act as replicas to provide a level of fault tolerance, e.g. against the loss of the primary server.
See also:
MongoDB Replication.
Replication Read Preference.
MongoDB Sharding.
when I query my database, should I provide a secondary instance URL, and for updating/inserting data should I provide the primary instance URL? Otherwise, which URL should I use for CRUD operations on my MongoDB database?
You can provide both in the MongoDB URI and the driver will figure out where to read and write. For example, in your Node.js app you could have:
mongodb://<instance 1>:<port 1>,<instance 2>:<port 2>/<database name>?replicaSet=<replica set name>
The default replica set name set by Cloud Launcher is rs0. Also see:
Node Driver: URI.
Node Driver: Read Preference.
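As a concrete, hypothetical example, assuming the three launcher-created hosts are reachable by name and the default replica set name rs0, the same style of connection string also works from the mongo shell:
# Connect through the replica set so reads/writes are routed appropriately.
mongo "mongodb://mongodb-1-server-1:27017,mongodb-1-server-2:27017,mongodb-1-server-3:27017/mydb?replicaSet=rs0"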
after launching this, do I have to make any changes in any conf file manually, or is that already done for me? Also, do I have to put all three instances in an instance group?
This depends on your application use case, but if you are launching through click-to-deploy, the MongoDB config should all be taken care of.
For a complete guide, please follow the tutorial: Deploy MongoDB with Node.js. I would also recommend checking out the MongoDB security checklist.
Hope that helps.