I’m using RDS PostgreSQL with the free tier configuration (t2.micro instance, 20 GB of general purpose (SSD) database storage, and 20 GB of backup storage).
I’ve created a database with a few tables and the current usage is below 200 MB.
The issue is that the storage consumption has been increasing at a rate of about 1 GB per day ever since I created the database. As a workaround I tried to stop this behavior by decreasing the backup retention period from 7 days to 0. Since yesterday no more backups have been stored; however, the storage has kept growing at almost the same rate.
I have no clue what's going on and I don't know what to do to stop the consumption and avoid exceeding the free tier storage.
Apart from that, I also don't know which storage has been increasing (data or backup storage), because the AWS console only reports the increase as "RDS Storage" and doesn't distinguish between the two. As I'm not storing much data in the database, I suspect the problem is in the backup and snapshot storage rather than the data storage itself.
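A quick way to confirm that the data itself is not what's growing is to check the sizes from psql; these are standard PostgreSQL catalog functions, so they should behave the same on RDS:

-- total size of the current database (data only; backups and snapshots are not included here)
SELECT pg_size_pretty(pg_database_size(current_database()));

-- the largest relations, in case something unexpected (e.g. an index or a log table) is growing
SELECT relname,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relkind IN ('r', 'i', 'm')
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 10;

If these numbers stay small while the CloudWatch metric keeps climbing, the growth is indeed happening outside the data files (backups, snapshots, logs, or WAL).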
Thanks in advance!
Update 2022-09-13
The curious fact is that I have a budget threshold of 8 GB/month, and the threshold is exceeded on the same day every month (in my case, the 13th). But at the end of the month the usage seems to reset by itself (it returns to zero). I don't know what is happening. Does anyone know where that temporary GB usage comes from?
I have a stored procedure in GCP Cloud SQL (PostgreSQL v9.0.23). It works fine in lower environments, but when it runs in Production (with significantly more volume), it crashes the DB itself, which results in a failover.
When we checked the metrics, we found that memory usage is above 90% just before the crash (15 GB out of the 16 GB instance memory). The reads/writes are also very high, over 1000 ops per second.
The SP does some SELECT and INSERT statements. Any suggestions to improve this situation would help.
Thanks in advance.
Since you mention that the Cloud SQL instance runs smoothly with a small workload but crashes in the Production environment, where the workload is much more intensive, it seems the issue is the instance size. So I would suggest increasing the instance size to match your workload.
You also mention that memory usage is 15 GB out of 16 GB, which is nearly 94%. As per this document, your Cloud SQL instance is not covered by the Cloud SQL SLA if memory usage stays above 90% for more than 6 hours, so I would suggest keeping memory usage below 90%. I would also suggest keeping CPU utilization within the limits mentioned in the same document. To know when your instance reaches any of these thresholds, set up a monitoring alert for those metrics as mentioned here.
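It may also be worth a quick look from the SQL side before resizing, since memory pressure in PostgreSQL usually comes from shared_buffers plus work_mem multiplied by the number of concurrent sorts and hashes. This is only a sketch using standard PostgreSQL views (the temp columns need PostgreSQL 9.2 or later), nothing Cloud SQL specific:

-- current memory-related settings
SHOW shared_buffers;
SHOW work_mem;
SHOW max_connections;

-- temporary file usage per database since the last stats reset
SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_written
FROM pg_stat_database
ORDER BY temp_bytes DESC;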
If increasing the instance size doesn't help, I would recommend creating a support ticket with Google Cloud Support so that they can investigate in detail.
We are using an RDS instance of type m4.xlarge in the ap-south-1 region with the PostgreSQL 9.6 engine.
Yesterday, the write IOPS skyrocketed suddenly. Here are some details from around that time: roughly 20 GB of space filled up within an hour and a half and then was freed automatically, and the number of connections also shot up to around 1,500.
Write IOPS graph
Free storage space graph
There is no way our application could have written that much data (20 GB). And even if it had, how did the space get freed automatically?
Is there any way of knowing what happened here? Has anyone else faced this issue?
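For anyone investigating something similar, one possibility worth ruling out (a guess, not a confirmed diagnosis): large sorts, hash joins, and index builds spill to temporary files on the instance's storage, and those files are deleted automatically as soon as the queries finish, which would match space filling up and then being freed. PostgreSQL 9.6 tracks this per database:

SELECT datname,
       temp_files,
       pg_size_pretty(temp_bytes) AS temp_bytes_written
FROM pg_stat_database
ORDER BY temp_bytes DESC;

Setting log_temp_files = 0 in the parameter group additionally logs every temporary file with its size, which makes the responsible queries visible in the PostgreSQL log.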
We have created an RDS PostgreSQL instance (m4.xlarge) with 200 GB of Provisioned IOPS storage. We are trying to upload data from the company data mart into 23 tables in RDS using DataStage. However, the uploads are quite slow: it takes about 6 hours to load 400K records.
Then I started tuning the following parameters according to Best Practices for Working with PostgreSQL:
autovacuum 0
checkpoint_completion_target 0.9
checkpoint_timeout 3600
maintenance_work_mem {DBInstanceClassMemory/16384}
max_wal_size 3145728
synchronous_commit off
Besides these, I also turned off Multi-AZ and backups. SSL is still enabled, though I'm not sure that changes anything. However, after all of these changes there is still not much improvement. DataStage is already uploading data in parallel with about 12 threads. Write IOPS is around 40/sec. Is this value normal? Is there anything else I can do to speed up the data transfer?
In PostgreSQL, you have to wait one full network round trip for each INSERT statement executed. That latency is measured between the database and the machine the data is being loaded from.
In AWS you have many options to improve performance.
For starters, you can copy your raw data onto an EC2 instance and import from there; however, you will likely not be able to use your DataStage tool unless it can be installed directly on the EC2 instance.
You can configure DataStage to use batch processing, where each INSERT statement actually contains many rows; generally, the more rows per statement, the faster the load (see the sketch below).
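For illustration only (my_table and its columns are made up), the difference looks like this:

-- one row per statement: pays the full network round trip for every row
INSERT INTO my_table (id, name) VALUES (1, 'a');
INSERT INTO my_table (id, name) VALUES (2, 'b');

-- many rows per statement: one round trip for the whole batch
INSERT INTO my_table (id, name) VALUES
    (1, 'a'),
    (2, 'b'),
    (3, 'c');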
You can also disable data compression and make sure you've done everything you can to minimize the latency between the two endpoints.
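If DataStage can hand you flat files (or you can stage the extracts yourself), PostgreSQL's COPY is generally faster still than batched INSERTs for bulk loading. The table and file path below are hypothetical:

-- psql's client-side \copy wraps COPY ... FROM STDIN, so no access to the server's filesystem is needed (RDS gives you none)
\copy my_table (id, name) FROM '/tmp/my_table.csv' WITH (FORMAT csv, HEADER)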
Since April 1st, my DB storage usage has been growing by 32 GB a day. This is very unusual, and with a 500 GB disk it cannot go on for much longer.
Why is the DB growing by 32GB a day?
For context, I've allocated a 500GB disk; binary logs are enabled; automated backups are enabled.
I tested further. The reason the DB grows so dramatically every night is the binary logs: every night the Magento indexers run and produce 32 GB of binary log data. Not all Magento stores will behave the same, but large Magento stores beware.
The solution, temporarily at least, is to disable binary logging. Have a look at the image to see the reclaimed disk space after disabling the option.
This will make it a challenge to set up read/failover replicas. It would be nice if the MySQL instance could be configured to purge binary logs after a set amount of time, or at least once the operations have been replicated to the replicas. Maybe it already does this, but I haven't investigated; given current time constraints, I wasn't going to wait to find out whether a purge would ever happen.
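For what it's worth, MySQL itself can expire binary logs automatically (expire_logs_days, or binlog_expire_logs_seconds on MySQL 8.0). Whether a managed instance lets you change those settings or run a manual purge depends on the provider's flags and privileges, so treat the following as a sketch of plain MySQL rather than a recipe for a specific cloud:

-- see how many binary log files exist and how large they are
SHOW BINARY LOGS;

-- MySQL 8.0: expire binary logs automatically after roughly 3 days (requires permission to set globals)
SET GLOBAL binlog_expire_logs_seconds = 259200;

-- one-off manual purge of logs older than 3 days (replicas must already have consumed them)
PURGE BINARY LOGS BEFORE NOW() - INTERVAL 3 DAY;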
Could it be your DB log that is growing at a rapid pace?
I have had this issue in the past and ended up creating a job for the SQL agent that runs once a week and purges the log.
We have a PostgreSQL instance serving tens of read/write queries per second.
Instance type: db.m3.2xlarge
Instance Provisioned IOPS (SSD): 1000
Instance storage size: 100 GB; the database size is about 5-10 GB.
It serves hundreds of simultaneous clients with read/write queries. Yet when we look at CloudWatch monitoring, it shows IOPS in the range of 20-60.
And read IOPS is around 0!
That can't be right with hundreds of connections and clients performing read/write queries all the time, can it?
The Postgres configuration is standard; we did not turn off fsync.
Is the cache so effective that IOPS is not a factor with a 5 GB database?
Or is the AWS monitoring console wrong?
Paying for 1000 Provisioned IOPS costs an extra $300 for this DB instance.
And the minimum IOPS you can buy is 1000.
I am wondering if we can do without Provisioned IOPS.
Or is the AWS monitoring simply not correct?
Or will the 20 IOPS we're seeing now kill server performance if we move to a non-Provisioned-IOPS volume?
Or, with a 5 GB database, does it mostly fit in cache, so IOPS is not a factor?
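For anyone weighing the same trade-off, the shared-buffers hit ratio can be read straight from pg_stat_database, and a value close to 1 is consistent with read IOPS sitting near zero:

SELECT datname,
       blks_hit,
       blks_read,
       round(blks_hit::numeric / nullif(blks_hit + blks_read, 0), 4) AS cache_hit_ratio
FROM pg_stat_database
WHERE datname = current_database();

Note that this only counts hits in shared_buffers; reads served from the OS page cache still show up as blks_read, so the real amount of disk avoidance is even higher than this ratio suggests.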
@CraigRinger is correct. If your dataset is small enough to fit entirely in memory, you won't need Provisioned IOPS, since insert/update traffic and the transaction logs are the only things consuming IOPS.
But in case someone finds this topic: here's what CloudWatch looks like when you've exhausted your GP2 credits. As you can see, the Read and Write IOPS charts don't tell us much, but the read/write latency charts show massive spikes.
For context, these are two weeks of a PostgreSQL read replica used for analytics. The switch from 100 GB GP2 (300 base IOPS, $11.50/mo) to 100 GB io1 (1000 IOPS, $112.50/mo) happens about two-thirds of the way through these charts (no more latency spikes). The cheaper option would have been to simply buy more GP2 storage; Provisioned IOPS are outrageously overpriced, but the predictable behavior during heavy workloads made sense in this case.
Your DB is almost entirely cached in RAM. (You can confirm this with the pg_buffercache extension.) Those IOPS numbers are entirely to be expected. I would expect this server to be just fine without provisioned IOPS.
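A minimal sketch of that check (the extension ships with PostgreSQL and, as far as I know, is available on RDS; the 8192 assumes the default 8 kB block size):

CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- which relations currently occupy shared_buffers, and how much of each
SELECT c.relname,
       pg_size_pretty(count(*) * 8192) AS buffered
FROM pg_buffercache b
JOIN pg_class c
  ON b.relfilenode = pg_relation_filenode(c.oid)
 AND b.reldatabase IN (0, (SELECT oid FROM pg_database WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY count(*) DESC
LIMIT 10;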
If you restart the instance it'll be slow for a little while as it builds the cache back up, but 5 GB isn't much for that. Also, having Provisioned IOPS actually makes this worse, because as well as setting a minimum I/O rate, PIOPS sets the maximum too: it's a target rate, not a minimum.
By contrast, regular volumes can burst to much higher read rates than PIOPS volumes, so they'll perform better when you're warming the cache back up after a restart.
BTW:
Restarting the database won't slow it much, as it only has to read data from the OS's disk cache back into shared_buffers. It's only if you restart the whole machine that you'll see a slowdown for a while. If you want to simulate this without a restart, you can use Linux's drop_caches feature:
echo 1 | sudo tee -a /proc/sys/vm/drop_caches
This is actually worse than the situation after a restart because it evicts binaries and libraries from memory too. The system will chug very heavily at first, as it reads the very frequently accessed binaries and libraries it's executing back into RAM. Then you'll start to see cache recovery behaviour like you would after a restart.
Also, you have too many connections configured. Install pgbouncer, put it in front of the database, and reduce your max_connections. You'll get better performance.
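A rough way to see how many of those connections are actually doing work at any given moment, which helps when sizing the pgbouncer pool (plain pg_stat_activity, nothing pgbouncer specific):

SELECT state, count(*) AS connections
FROM pg_stat_activity
GROUP BY state
ORDER BY count(*) DESC;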