I have databases running on AWS RDS. I keep getting event notifications saying "Storage size 3630 GB is approaching the maximum storage threshold 4000 GB. Increase the maximum storage threshold." I've checked the size of all my databases, and it's only around 150 GB. What made my DB instance auto-scale that high, to 3630 GB? How can I check what is inside my allocated storage?
I have already run queries to check my database sizes, checked the autoscaling settings, and checked automatic backups. I don't know what is making my allocated storage so high, and I want to find out the cause.
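To see how much of that 3630 GB is actually occupied versus merely allocated, one thing you can do is compare the instance's allocated and maximum storage with the FreeStorageSpace metric. A rough sketch with the AWS CLI; "mydb" is a placeholder instance identifier, and the date arithmetic uses GNU date syntax:
# allocated storage (GB) and the autoscaling ceiling (GB) for the instance
aws rds describe-db-instances \
  --db-instance-identifier mydb \
  --query 'DBInstances[0].[AllocatedStorage,MaxAllocatedStorage]'
# FreeStorageSpace (in bytes) over the last day shows how much of the allocated volume is really unused
aws cloudwatch get-metric-statistics \
  --namespace AWS/RDS \
  --metric-name FreeStorageSpace \
  --dimensions Name=DBInstanceIdentifier,Value=mydb \
  --start-time "$(date -u -d '1 day ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --period 3600 \
  --statistics Minimum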
Related
We have a cluster with 100 GB storage, per the configuration for the cluster in MongoDB Atlas.
On the overview page for the cluster, it shows that 43.3 GB out of a 100 GB max are used.
Since the cluster's configuration also has 100 GB storage selected, I am assuming the 100 GB of disk space is the same as the 100 GB of available storage?
When we click into our database, it shows the database size is 66.64 GB plus 3.21 GB of indexes, for a total size of about 70 GB.
What is the difference between the 100 GB of available storage and disk, and the database size plus index size of 70 GB? Should we be concerned that the 70 GB is approaching 100 GB, or is it only the 43.3 GB of disk usage that matters?
Edit: Since I posted this, MongoDB has removed database size and replaced it with both storage size and logical data size, which further complicates this. In most instances, the logical data size is 3-4x the storage size.
By default your MongoDB database uses the WiredTiger storage engine with snappy compression, which means that your data most probably occupies 43.3 GB on disk, while the actual (uncompressed) data size is ~70 GB. So there is no reason to worry, since you have used only 43.3% of your 100 GB storage. Of course, you need to monitor your data growth, and if it is increasing quickly you may need to allocate more space.
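If you want to check those numbers yourself from the shell, db.stats() reports both the logical and the on-disk sizes. A rough sketch, assuming mongosh is installed and $MONGO_URI is your Atlas connection string:
# dataSize is the uncompressed (logical) size, storageSize is what actually sits on disk
mongosh "$MONGO_URI" --quiet --eval '
  const s = db.stats(1024 * 1024 * 1024);   // scale the results to GB
  print("logical data size (GB):    " + s.dataSize);
  print("storage size on disk (GB): " + s.storageSize);
  print("index size (GB):           " + s.indexSize);
'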
I have a stored procedure in GCP Cloud SQL (PostgreSQL v9.0.23). It works fine in lower environments, but when it runs in Production (with significantly more volume), it crashes the DB itself, which results in a failover.
When we checked the metrics, we found that memory usage is above 90% just before it crashes (15 GB out of the 16 GB of instance memory). The reads/writes are also very high, over 1000 ops per second.
The SP does some SELECT and INSERT statements. Any suggestions to improve this situation would help.
Thanks in advance.
Since you mention that the Cloud SQL instance runs smoothly with a small workload but crashes in the Production environment where the workload is much more intensive, the issue seems to be the instance size. So I would suggest you increase the instance size as needed.
You also mention that memory usage is 15 GB out of 16 GB, which amounts to nearly 94%. As per this document, your Cloud SQL instance will not be covered by the Cloud SQL SLA if memory usage stays over 90% for more than 6 hours. So I would suggest keeping memory usage below 90%. I would also suggest keeping CPU utilization within the limits mentioned in this document. To know when your instance reaches any threshold, I suggest setting up a monitoring alert for those metrics as mentioned here.
If increasing the instance size doesn't help, I would recommend creating a support ticket with Google Cloud Support so that they can investigate in detail.
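For what it's worth, the resize itself can be done with gcloud rather than the console. A minimal sketch; the instance name and target tier are placeholders, and note that changing the machine type restarts the instance:
# check the current machine type, then patch to a larger one (e.g. 8 vCPUs / 32 GB RAM)
gcloud sql instances describe my-instance --format='value(settings.tier)'
gcloud sql instances patch my-instance --tier=db-custom-8-32768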
I have a question regarding the freeable memory for AWS Aurora Postgres.
We recently wanted to create an index on one of our DBs; the DB died and failed over to the replica, which all worked fine. It looks like the freeable memory dropped by the configured 500 MB of maintenance_work_mem and thereby went down to around 800 MB; right after that, the 32 GB instance died.
1) I am wondering whether the freeable memory is the overall system memory, and whether low memory here could invoke the system OOM killer on the AWS Aurora instance? If so, we may want to plan in more headroom for operational tasks and for running autovacuum jobs, so we don't encounter this issue again.
2) The actual work of the index creation should then have used the free local storage, as far as I understood, so the size of the index shouldn't have mattered, right?
Thanks in advance,
Chris
Regarding 1)
Freeable Memory (from https://forums.aws.amazon.com/thread.jspa?threadID=209720):
The freeable memory includes the amount of physical memory left unused by the system plus the total amount of buffer or page cache memory that is free and available.
So it's freeable memory across the entire system. While MySQL is the main consumer of memory on the host, we do have internal processes in addition to the OS that use up a small amount of additional memory.
If you see your freeable memory near 0, or also start seeing swap usage, then you may need to scale up to a larger instance class or adjust MySQL memory settings. For example, decreasing innodb_buffer_pool_size (by default set to 75% of physical memory) is one way of adjusting MySQL memory settings.
That also means that if memory gets low, it's likely to impact the process in some form. In this thread (https://forums.aws.amazon.com/thread.jspa?messageID=881320), for example, it was mentioned that it caused the MySQL server to restart.
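To catch a drop like that before it becomes fatal, you can put a CloudWatch alarm on the FreeableMemory metric. A rough sketch; the instance identifier and the threshold are placeholders, and no notification action is attached here (add --alarm-actions with an SNS topic ARN to actually get notified):
# alarm when minimum freeable memory stays below ~2 GB (value is in bytes) for 3 x 5 minutes
aws cloudwatch put-metric-alarm \
  --alarm-name aurora-freeable-memory-low \
  --namespace AWS/RDS \
  --metric-name FreeableMemory \
  --dimensions Name=DBInstanceIdentifier,Value=my-aurora-instance \
  --statistic Minimum \
  --period 300 \
  --evaluation-periods 3 \
  --threshold 2000000000 \
  --comparison-operator LessThanThreshold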
Regarding 2)
This is how it is described in the documentation (https://aws.amazon.com/premiumsupport/knowledge-center/postgresql-aurora-storage-issue/), so I guess it's right and the size shouldn't have mattered.
Storage used for temporary data and logs (local storage). All DB temporary files (for example, logs and temporary tables) are stored in the instance local storage. This includes sorting operations, hash tables, and grouping operations that are required by queries.
Each Aurora instance contains a limited amount of local storage that is determined by the instance class. Typically, the amount of local storage is twice the amount of memory on the instance. If you perform a sort or index creation operation that requires more memory than is available on your instance, Aurora uses the local storage to fulfill the operation.
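Given that, one precaution for next time is to cap maintenance_work_mem for the session that builds the index, so that more of the work spills to local storage instead of consuming the last of the instance's RAM. A sketch under assumptions: $DATABASE_URL, the table, and the index name are placeholders, and 256MB is just an example value.
psql "$DATABASE_URL" <<'SQL'
-- limit how much memory this one index build may use;
-- anything beyond that spills to the instance's local (temporary) storage
SET maintenance_work_mem = '256MB';
CREATE INDEX orders_created_at_idx ON orders (created_at);
SQL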
One of our Google Cloud SQL instances filled up to 10 TB of disk space (thank you, auto expansion ;) with a lot of
[Note] InnoDB: page_cleaner: 1000ms intended loop took 4546ms. The settings might not be optimal. (flushed=9360 and evicted=13008, during the time.)
log entries.
We restarted it, and now disk usage is back to ~250 GB. However, the instance still has 10 TB of SSD attached, and I could not find a way to reduce it back to 500 GB in the UI (it seems only storage increases are supported). Any pointers on how to reduce the disk size?
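Not an authoritative answer, but since Cloud SQL storage can only be increased, the usual workaround is to export the data and import it into a new, smaller instance. A rough sketch with gcloud; the instance names, bucket, database name, version, and tier are all placeholders, and the instance's service account needs access to the bucket:
# dump to Cloud Storage, create a smaller instance, then import into it
gcloud sql export sql big-instance gs://my-bucket/dump.sql.gz --database=mydb
gcloud sql instances create small-instance \
  --database-version=MYSQL_5_7 --tier=db-n1-standard-2 --storage-size=500GB
gcloud sql import sql small-instance gs://my-bucket/dump.sql.gz --database=mydb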
We have a PostgreSQL instance serving tens of read/write queries per second.
Instance type: db.m3.2xlarge
Instance Provisioned IOPS (SSD): 1000
Instance storage size: 100 GB; the database size is about 5-10 GB.
It is serving hundreds of simultaneous clients with read/write queries. Yet when we look at CloudWatch monitoring, it shows IOPS in the range of 20-60.
And read IOPS is around 0!
That can't be right with hundreds of connections and clients performing read/write queries all the time?
The Postgres configuration is standard; we did not turn off fsync.
Is the cache so effective that IOPS are not a factor with a database size of 5 GB?
Or is the AWS monitoring console wrong?
Paying for 1000 IOPS costs an extra $300 for this DB instance.
And the minimum provisioned IOPS you can buy is 1000.
I am wondering whether we can do without provisioned IOPS?
Or is AWS monitoring not correct?
Or will the 20 IOPS we're seeing now kill server performance if we move to a non-provisioned-IOPS server?
Or, with a 5 GB database, does it mostly fit in cache so that IOPS are not a factor?
@CraigRinger is correct. If your dataset is small enough to fit entirely in memory, you won't need provisioned IOPS, since insert/update traffic and logs are the only things consuming IOPS.
But in case someone finds this topic, here's what CloudWatch looks like when you've exhausted your GP2 credits. As you can see there, the read and write IOPS charts don't tell us much, but the read/write latency charts show massive spikes.
For context, these are two weeks of a PostgreSQL read replica used for analytics. The switch from 100 GB GP2 (300 base IOPS, $11.50/mo) to 100 GB io1 (1000 IOPS, $112.50/mo) happens about two-thirds of the way through these charts (no more latency spikes). The cheaper option would have been to just increase the amount of GP2 storage. Provisioned IOPS are outrageously overpriced, but predictable behavior during heavy workloads made sense in this case.
Your DB is almost entirely cached in RAM. (You can confirm this using the pg_buffercache extension.) Those IOPS numbers are entirely to be expected. I would expect this server to be just fine without provisioned IOPS.
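For reference, a quick way to run that check; a sketch in which $DATABASE_URL is a placeholder, the role must be allowed to create the extension, and the default 8 kB block size is assumed:
# show which relations occupy the most space in shared_buffers
psql "$DATABASE_URL" <<'SQL'
CREATE EXTENSION IF NOT EXISTS pg_buffercache;
SELECT c.relname,
       pg_size_pretty(count(*) * 8192) AS buffered   -- 8 kB per buffer
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY count(*) DESC
LIMIT 10;
SQL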
If you restart the instance, it'll be slow for a little while as it builds the cache back up, but 5 GB isn't much for that. Also, having provisioned IOPS actually makes this worse, because as well as setting a minimum I/O rate, PIOPS sets the maximum too. It's a target rate, not a minimum.
By contrast, regular volumes can burst to much higher read rates than PIOPS volumes, so they'll perform better while you're warming the cache back up after a restart.
BTW:
Restarting the database won't slow it down much, as it only has to read data from the OS's disk cache back into shared_buffers. It's only if you restart the whole machine that you'll see a slowdown for a while. If you want to simulate this without a restart, you can use Linux's drop_caches feature:
echo 1 | sudo tee -a /proc/sys/vm/drop_caches
This is actually worse than the situation after a restart, because it evicts binaries and libraries from memory too. The system will chug very heavily at first as it reads the most frequently accessed binaries and libraries it's executing back into RAM. Then you'll start to see cache-recovery behaviour like you would after a restart.
Also, you have too many connections configured. Install PgBouncer, put it in front of the database, and reduce your max_connections. You'll get better performance.
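A minimal sketch of what that could look like; the paths, host, and pool sizes are illustrative rather than a tuned configuration. Clients then connect to PgBouncer on port 6432 instead of straight to Postgres:
# write an example pgbouncer.ini on the host that fronts the database
sudo tee /etc/pgbouncer/pgbouncer.ini > /dev/null <<'INI'
[databases]
mydb = host=mydb.xxxxxxxx.us-east-1.rds.amazonaws.com port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; many client connections fan in to a small pool of server connections
pool_mode = transaction
default_pool_size = 20
max_client_conn = 500
INI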