VACUUM is taking a lot of time - PostgreSQL

The following command, VACUUM my_table, has already been running for 24 hours on Postgres (v11.5).
The table has around:
112 million rows
Table Space: 193 GB
6 indexes on 6 different fields + Primary Key index
Is this normal?
More information if it helps...
AWS RDS instance
16GB memory + 4 vCPU (db.m5.xlarge)
800GB allocated storage (Database is taking 495GB of that so far)
Provisioned IOPS - 10000
Adding more info here -
SELECT relname, n_dead_tup FROM pg_stat_user_tables; reports 163,441,017 dead tuples for this table
We are not running any application queries against the DB; we wanted to let it finish the vacuum process.
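For reference, the per-table figure above comes from filtering that view for this table; a minimal sketch:

SELECT relname, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'my_table';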

It can be. Your 16 GB of RAM may be too little to work effectively with a table this big (193 GB). A very rough rule of thumb says RAM should be about 1/10 of the database size.
What you can check:
a) Look at pg_stat_activity for the related process and check whether the vacuum is stuck waiting on a lock.
b) If you can, check I/O-related metrics. High I/O waits there are a signal that your I/O is overloaded, and then vacuum can be very slow. A 193 GB table is really big.
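A minimal sketch of check (a); and since this is v11, the pg_stat_progress_vacuum view (available since 9.6) can also show how far the vacuum has gotten:

-- (a) is the vacuum backend waiting on a lock?
SELECT pid, wait_event_type, wait_event, state, query
FROM pg_stat_activity
WHERE query ILIKE '%vacuum%'
  AND pid <> pg_backend_pid();

-- progress of the running vacuum
SELECT pid, phase, heap_blks_total, heap_blks_scanned, index_vacuum_count
FROM pg_stat_progress_vacuum;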

Related

slow query fetch time

I am using GCP Cloud SQL (MySQL 8.0.18) and I am trying to execute a query that fetches only 5,000 rows:
SELECT * FROM client_1079.c_crmleads ORDER BY LeadID DESC LIMIT 5000;
but the execution is taking a long time to fetch the data. Here are the timing details:
Affected rows: 0 Found rows: 5,000 Warnings: 0 Duration for 1 query: 0.797 sec. (+ 117.609 sec. network)
Instance configuration is vCPU: 8, RAM: 20 GB, SSD: 410 GB
[screenshot of GCP Cloud SQL instance]
I am also facing issues with a high table_open_cache and high RAM utilization.
How do I reduce table_open_cache, and how do I increase instance performance?
It looks like the size of the data retrieved is quite large, and the time spent sending it from the SQL instance to your app is the reason for the observed latency.
You may review your use case and retrieve less information, try to parallelize queries, or improve the SQL instance's I/O performance (which is related to the DB disk size).
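As a sketch of "retrieve less information": select only the columns the app actually needs instead of SELECT * (all column names here except LeadID are hypothetical):

-- fetch only what the app displays, not every column
SELECT LeadID, LeadName, CreatedAt
FROM client_1079.c_crmleads
ORDER BY LeadID DESC
LIMIT 5000;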

Postgres 11 vacuum taking days

I have set maintenance_work_mem to 50GB and am running a manual VACUUM on a 200GB non-partitioned table with 10 indexes. It has been running for 3 days and still has not completed.
I know this many indexes is bad, but I don't understand why the vacuum hasn't finished.
Note: no blocking observed
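For reference, a sketch of what is being run here (the table name is hypothetical). Note that in v11, VACUUM's dead-tuple list is capped at 1 GB regardless of a larger maintenance_work_mem, so with 10 indexes, each fill of that list triggers another full scan of every index, which is usually where the days go:

SET maintenance_work_mem = '50GB';  -- vacuum itself uses at most 1 GB of this for dead tuples
VACUUM (VERBOSE) big_table;         -- VERBOSE logs each pass over the indexes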

PostgreSQL suddenly takes all the disk space

I am facing a very strange issue on my server. My configuration is very straightforward:
Small VPS, 500 MiB RAM, 40 GiB disk
Debian stable at install time, now probably oldstable
PostgreSQL v11.11
The data is very small, the use of a database for my purpose is probably overkill, but handy:
7 tables
7 views, one of which is a little bit scary
The biggest table has a few hundred records
The full dump of the database gives me a file of 93 KiB
Everything was very fast for 1.5 years. Yesterday, the database suddenly became very slow. My investigations showed that the size of the data on disk was 34 GiB and I had no disk space available anymore.
After more investigation, I ran VACUUM FULL, which freed the useless 34 GiB. Disk usage went from 100% to 10% and performance came back immediately. One day later, the system is slow again, and disk usage is now around 50%.
I have no clue about what is going on, any suggestion?
I'd recommend reading Optimize and Improve PostgreSQL Performance with VACUUM, ANALYZE, and REINDEX and Routine Vacuuming. Here are some relevant bits.
In normal PostgreSQL operation, tuples that are deleted or obsoleted by an update are not physically removed from their table
You must have done a lot of deletes and updates, so Postgres consumed a lot of disk space. VACUUM recovers this space. VACUUM FULL isn't normally necessary, and it takes an exclusive lock on each table while rewriting it.
Normally there is an autovacuum daemon running which will vacuum periodically. It probably isn't running. Check with SHOW autovacuum and SHOW track_counts. Both need to be on for autovacuum to run.
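A minimal sketch of those checks, plus a standard query against pg_stat_user_tables to see when autovacuum last touched each table:

SHOW autovacuum;     -- should be "on"
SHOW track_counts;   -- should be "on"

-- which tables are accumulating dead rows, and when they were last autovacuumed
SELECT relname, last_autovacuum, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;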
You can see what is "bloating" your database with the check_postgres tool.

Postgres count(*) extremely slow

I know that count(*) in Postgres is generally slow, but I have a database where it's super slow. I'm talking minutes, even hours.
There are approximately 40M rows in the table, which consists of 29 columns (most of them text, 4 double precision). There is an index on one column, which should be unique, and I've already run VACUUM FULL. It took around one hour to complete, but with no observable result.
The database uses a dedicated server with 32GB of RAM. I set shared_buffers to 8GB and work_mem to 80MB, but saw no speed improvement. I'm aware there are techniques to get an approximate count or to use an external table to keep the count, but I'm not interested in the count specifically; I'm more concerned about performance in general, since it's now awful. When I run the count there are no CPU spikes or anything. Could someone point me to where to look? Can it be that the data is structured so badly that 40M rows are too much for Postgres to handle?
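A first step here would be to see where the time actually goes and how big the table is on disk; a minimal sketch (the table name is hypothetical):

-- shows whether the count is reading everything from disk and how long each step takes
EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM my_table;

-- on-disk size of the table; if it is far larger than 40M rows warrant, the table is bloated
SELECT pg_size_pretty(pg_table_size('my_table'));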

SELECT performance issues on PostgreSQL 9.1

I am building a large Postgres 9.1 database on Ubuntu 12.04, with one table that holds about 80 million rows. Whenever I run a SELECT statement:
SELECT * FROM db WHERE ID=1;
It takes almost 2.5 minutes to execute the query, which returns only a few thousand rows. After running a few diagnostics on the disk I/O, I think that is not the problem, but just in case, below is the output from a diagnostic. (I have 2GB of RAM.) I am not exactly sure what good output looks like here, but it seems in the ballpark given stats found for other servers on the internet.
time sh -c "dd if=/dev/zero of=bigfile bs=8k count=500000 && sync"
500000+0 records in
500000+0 records out
4096000000 bytes (4.1 GB) copied, 106.969 s, 38.3 MB/s
real 1m49.091s
user 0m0.248s
sys 0m9.369s
I have modified postgresql.conf considerably, boosting effective_cache_size to 75% of RAM, shared_buffers to 25%, checkpoint_segments to 15, work_mem to 256MB, tuning autovacuum, raising SHMMAX in the kernel, etc. I have had some performance increases, but not more than 5%. Networking shouldn't be an issue, since it still takes a long time even when running on localhost. I am planning to add even more data, and the query time seems to be growing quickly with the number of rows.
It seems like I should be able to run these SELECT statements in a few seconds, not a few minutes. Any suggestions on where this bottleneck could be?
Sorry if this is inexcusably obvious, but do you have an index on the ID column?
Also, though I'm not blaming the disk, you merely tested sequential bandwidth, which tells you very little about latency. Though I have to say that 38 MB/s is underwhelming even for that measure...
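If the index turns out to be missing, a minimal sketch of the fix, using the table and column names from the question (the index name is made up):

-- build the index without blocking writes; a plain CREATE INDEX also works if the table can be locked
CREATE INDEX CONCURRENTLY idx_db_id ON db (id);

-- then confirm the planner uses it
EXPLAIN ANALYZE SELECT * FROM db WHERE ID = 1;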