User limitation on Postgresql synthetic monitoring using Airflow

User limitation on Postgresql synthetic monitoring using Airflow - postgresql

I am trying to write a synthetic monitoring for my on-prem postgresql service, using airflow. The monitoring should return if a cluster is available for creating tables, writing and reading data, and deleting tables.
The clusters on my service are using SSL certificates for authentication, which means a client is required to provide a suitable client certificate in order to connect to the cluster.
Currently, I have implemented my monitoring by creating a global user which will have a certificate with permissions to all the cluster. The user will have permissions to create, write and read only on one schema, dedicated to this monitoring. Using airflow, I will connect with this user each of my postgresql clusters and try to create a table, write to it, read, and then delete it. If one of the actions fails - the DAG will write a log describing the reason for failure.
My main problem with this solution it not being able to limit such a powerful user with accessibility to all of my clusters. In case an intruder will get the user's client certificate, he would be able to explode the DB storage by writing huge amount of data or overload queries and fail the cluster.
I am looking for some ideas for limiting this user so it will be able to act only for it's purpose- the simple actions required for this monitoring, and could not be exploit by an attacker. Alternatively, I would appreciate any suggestions for different implementation for this monitoring.
I searched for build in postgresql configurations that will allow me to limit the dedicated monitoring schema / limiting the amount of queries performed by the user.

Related

Deploying Strapi on Kubernetes (GKE)

I want to deploy Strapi on GKE (Kubernetes), I have a docker-compose file, and I think I can use kcompose to create the deployment.
My questions is, has anyone used Mongodb Atlas + GKE or should I deploy Mongo on my own?

The question is more opinion based. It all depends on your needs.
If your needs match one of below you should stay with MongoDB:
Your app runs on-prem and contracts or privacy statements dont allow you to store data with a 3rd party.
You need large storage but not much query power.
There is other privacy/compliance issues.
Your app does not have internet access (firewalls, isolated environments)
You are running 3rd party applications that require a very old version of MongoDB
Here are some MongoDB Altas advantages:
Easily deploy, modify, and elastically scale their database clusters with a few clicks or an API call
Gain complete visibility into the performance the database and the underlying instances
Focus more on development, with built-in operational and security best practices such as geographically distributed, auto-healing clusters, and always-on authentication and encryption.
The best way would be if you will check how work with MongoDB Atlas on GCP looks alike. You can check this tutorial.

Application Performance monitoring on Swisscom Application Cloud

I am investigating options for monitoring our installation in Swisscom's cloud-foundry. My objectives are the following:
monitor performance indicators for deployed application (such as cpu, disk, memory)
monitor performance indicators for services (slow queries, number of queries, ideally also some metrics on hitting quotas)
So far, I understand the options are the following (including some BUTs):
I used a very nice TOP cf-plugin (github)
This works very well. It seems that it registers itself to get the required firehose nozzles and consume data.
That is very useful for tracing / ad-hoc monitoring, but not very good for a serious infrastructure monitoring.
Another way I found is to use firehose-syslog solution.
This can be deployed as an app to (as far as I understand) do the job in similar way, as the TOP cf plugin.
The problem is, that it requires registered client, so it can authenticate with the doppler endpoint. For some reason, the top-cf-plugin does that automatically / in another way.
Last option i am considering is to build the monitoring itself to the App (using a special buildpack)
That can be for example done with Datadog. But it seems to also require a dedicated uaa client to register the Nozzle.
I would like to check, if somebody is (was) on the similar road, has some findings.
Eventually I would like to raise the following questions towards the swisscom community support:
is it possible to register uaac client to be able to ingest events through the firehose nozzle from external service? (this requires admin credentials if I was reading correctly)
is there an alternative way to authenticate with the nozzle (for example using a special user and his authentication token?)
is there any alternative to monitor the CF deployments in Swisscom? Eventually, is there a paper, blogpost or other form of documentation, that would be helpful in this respect (also for other users of AppCloud)?

Since it requires admin permissions, we can not give out UAA clients for the firehose.
However, there are different ways to get metrics in context of a user.
CF API
You can obtain basic metrics of a specific app by polling the CF API:
https://apidocs.cloudfoundry.org/5.0.0/apps/get_detailed_stats_for_a_started_app.html
However, since you have to poll (and for each app), it's not the recommended way.
Metrics in syslog drain
CF allows devs to forward their logs to syslog drains; in more recent versions, CF also sends metrics to this syslog drain (see https://docs.cloudfoundry.org/devguide/deploy-apps/streaming-logs.html#container-metrics).
For example, you could use Swisscom's Elasticsearch service to store these metrics and then analyze it using Kibana.
Metrics using loggregator (firehose)
The firehose allows streaming logs to clients for two types of roles:
Streaming all logs to admins (which requires a UAA client with admin permissions) and streaming app logs and metrics to devs with permissions in the app's space. This is also what the cf logs command uses. cf top also works this way (it enumerates all apps and streams the logs of each app).
However, you will find out that most open source tools that leverage the firehose only work in admin mode, since they're written for the platform operator.
Of course you also have the possibility to monitor your app by instrumenting it (white box approach), for example by configuring Spring actuator in a Spring boot app or by including an agent of your favourite APM vendor (Dynatrace, AppDynamics, ...)
I guess this is the most common approach; we've seen a lot of teams having success by instrumenting their applications. Especially since advanced monitoring anyway requires you to create your own metrics as the firehose provided cpu/memory metrics are not that powerful in a microservice world.
However, option 2. would be worth a try as well, especially since the ELK's stack metric support is getting better and better.

Kubernetes - application-consistent backup

Is it possible to perform a backup on Kubernetes in application-consistent manner?
Some of the backup solutions I found mainly base on freezing the pod and then launching the backup to maintain consistency (Heptio's Ark for example.)

The idea of application-consistent backups is to capture all data in memory and all transactions in process. This is performed by using some type of client software co-resident with the database application to quiesce the database application, flush its memory cache, complete all its writes in order and then perform the backup.
In its turn, Kubernetes operates with specifications of resources (e.g., Deployments, Services, etc.) and their statuses, and in any given time the resource status must be the same as defined in the specification. For storing any important data in Kubernetes, persistent volumes are used. In other words, you cannot perform a backup in an application-consistent manner on Kubernetes, because the main idea of it is different.
It is possible that a specific application for a specific database exists and allows implementing such type of backup. But it is related to that application but not to Kubernetes itself.

Does AWS Aurura Postgresql do automatic query splitting

I am trying to work out if Aurora Postgres cluster endpoint does automatic query read write splitting between the reader and writer/s? Or to I need to use something like pgpool2 for the splitting?
I have attempted pushing read/writes at it, but it looks like only the reader is being hit?

but it looks like only the reader is being hit?
Only the writer should be hit via the cluster endpoint.
The cluster endpoint connects you to the primary instance for the DB cluster. You can perform both read and write operations using the cluster endpoint.
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora.Connecting.html
Aurora does not do read/write splitting. It provides you with a cluster endpoint, which automatically points to the cluster's current writer, and a read-only endpoint, which connects you to any one of the readers.
If the cluster only has one instance (Aurora doesn't technically require you to have any readers, but recovery in the event of a failure would require much more time, since the one instance would have to be rebuilt by the system if it failed -- but no data would be lost) then both endpoints take you to the writer.

PostgreSQL Multi master Synchronisation

I have a scenario as follows,
One cloud server is running an application with PGSQL as DB
Multiple local servers are running with same application with PGSQL as DB
User may access the cloud server for read/write data
User may access any of the local server to read/write data
What I need is synchronisation between all these databases. The synchronisation can be done live if connectivity is available, or immediately when connectivity is available.
Please guide me with some inputs, where can i start from.

Rethink your requirements.
Multimaster replication is full of pitfalls, and it is easy to get your databases out of sync unless you plan carefully. You'd probably be better off with a single master node.
That said, you could look at BDR by 2ndQuadrant which provides such functionality.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse