Slow PostgreSQL sequential scans on RDS? - postgresql

I have an RDS PostgreSQL instance that's running simple queries much slower than I would expect - particularly sequential scans, like copying a table or counting its rows.
E.g. create table copied_table as (select * from original_table) or select count(*) from some_table
Running count(*) on a 30GB table takes ~15 minutes (with indexes, immediately following a vacuum).
It's an RDS db.r3.large, 15 GB memory, 400GB SSD. Watching the metrics logs, I've never seen Read IOPS exceed 1,400, and it's usually around 500, well below my expected baseline.
Configuration:
work_mem: 2GB
shared_buffers: 3GB
effective_cache_size: 8GB
wal_buffers: 16MB
checkpoint_segments: 16
Is this the expected timing? Should I be seeing higher IOPS?

There is not much you can do about plain count queries like that in Postgres, except in 9.6, which introduced parallel sequential scans (not yet available on RDS).
Even so, there are some tips you can find here. Generally, it's recommended to try to make Postgres use an Index Only Scan by creating indexes on the columns in the projection.
SELECT id FROM table WHERE id > 6 AND id < 100;
-- or
SELECT count(id) FROM table ...
The table should have an index on that column.
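For example, a minimal sketch (table and column names here are only illustrative): with an index covering the projected column, and a reasonably fresh visibility map, Postgres can answer such queries with an index-only scan.
-- Illustrative names; "table" is quoted only because it is a reserved word.
CREATE INDEX idx_table_id ON "table" (id);
-- VACUUM keeps the visibility map current, which index-only scans rely on.
VACUUM ANALYZE "table";
EXPLAIN SELECT count(id) FROM "table" WHERE id > 6 AND id < 100;
-- The plan should show: Index Only Scan using idx_table_id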
The queries you gave as examples won't avoid the sequential scan. For the CREATE TABLE, if you don't care about the row order in the new table, you can open a few backends and import in parallel, filtering by a key range (see the sketch below). Other than that, the only way to speed this up on RDS is to increase the IOPS.
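A minimal sketch of that parallel import idea, assuming original_table has an integer key id (the range boundaries below are made up): create the empty copy once, then run one INSERT per key range, each in its own session.
-- Run once: create an empty table with the same column definitions.
CREATE TABLE copied_table AS SELECT * FROM original_table WITH NO DATA;
-- Session 1:
INSERT INTO copied_table SELECT * FROM original_table WHERE id < 1000000;
-- Session 2:
INSERT INTO copied_table SELECT * FROM original_table WHERE id >= 1000000 AND id < 2000000;
-- Session 3:
INSERT INTO copied_table SELECT * FROM original_table WHERE id >= 2000000;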

Related

Insert into timescaledb filling up ram - wrong chunk_time_interval?

I have a postgres table which I would like to migrate to a timescaledb hypertable. I am using the faster method from this tutorial https://docs.timescale.com/timescaledb/latest/how-to-guides/migrate-data/same-db/#convert-the-new-table-to-a-hypertable to do so.
The command I am using is: INSERT INTO new_table SELECT * FROM old_table; where new_table is a hypertable
Is the problem that I have set chunk_time_interval incorrectly? I used 1h, which really should be fine. The total dataset is about 650GB in the original postgres table and spans about 5 months. That means the average chunk is about 200MB in size, which is well below the recommended 25% of the 32GB of RAM. I actually purposefully chose a number I thought much too low because of additional data I will load into other hypertables in the future.
If this is not the problem then what is?
Is there a way to limit postgres or timescaledb to not go over a set amount of ram to protect other processes?
I have experienced this problem before when using a space partition together with the time partition. Check that the number of chunks you have is not too high, and make sure to always include a time range in your query. What's the output of
select * from timescaledb_information.hypertables;
Does it show a high number of chunks in your hypertable?
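In TimescaleDB 2.x the chunk metadata is also exposed per chunk, so something along these lines (a sketch; the view layout differs in 1.x) shows how many chunks each hypertable has:
-- Chunks per hypertable (TimescaleDB 2.x information views).
SELECT hypertable_name, count(*) AS chunk_count
FROM timescaledb_information.chunks
GROUP BY hypertable_name
ORDER BY chunk_count DESC;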

How to decrease size of a large postgresql table

I have a postgresql table that is "frozen", i.e. no new data is coming into it. The table is strictly used for reading purposes. The table contains about 17M records. The table has 130 columns and can be queried in multiple different ways. To make the queries faster, I created indexes for all combinations of filters that can be used. So I have a total of about 265 indexes on the table. Each index is about 1.1 GB. This brings the total table size to around 265 GB. I have vacuumed the table as well.
Question
Is there a way to further bring down the disk usage of this table?
Is there a better way to handle queries for "frozen" tables that never get any data entered into them?
If your table or indexes are bloated, then VACUUM FULL tablename could shrink them. But if they aren't bloated, then this won't do any good. This is not a benign operation: it will lock the table for a period of time (probably a long one, since it needs to rebuild hundreds of indexes) and generate large amounts of IO and WAL, the latter of which will be especially troublesome for replicas. So I would test it on a non-production clone to see whether it actually shrinks things, and to get an idea of how long a maintenance window you will need to declare.
Other than that, be more judicious in your choice of indexes. How did you get the list of "all combinations of filters that can be used"? Was it by inspecting your source code, or just by tackling slow queries one by one until you ran out of slow queries? Maybe you can look at snapshots of pg_stat_user_indexes taken a few days apart to see if all of them are actually being used.
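For example, a query along these lines (the table name is hypothetical) lists indexes on the table that have recorded zero scans since the statistics were last reset, together with their size:
-- Never-used indexes on one table; compare snapshots a few days apart
-- before deciding to drop anything.
SELECT indexrelname,
       idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'frozen_table'   -- hypothetical table name
  AND idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;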
Are these mostly two-column indexes?

PostgreSQL V12 create temp table runs out of shared memory, can I create on disk?

Running on RHEL 7
PostgreSQL Version 12
The system has 28 GB of memory and 12 GB of shared memory.
The DB uses over 6 TB on disk.
Some tables have around 300 million rows.
I moved my DB from version 9 to version 12 and am running tests on the new DB. We have a process that generates summary data in a temporary table, then queries the temporary table for different things, and then deletes the temporary table; this was done because it is much faster than running very similar queries multiple times.
The query is similar to this:
CREATE TEMPORARY TABLE XXX AS
SELECT
    COUNT(t.id) AS count,
    t.tagged AS tagged,
    t.tag_state AS tag_state,
    t.error AS error,
    td.duplicate AS duplicate
FROM
    ttt t
    INNER JOIN tweet_data td ON (td.tweet_id = t.id)
GROUP BY
    t.tagged,
    t.tag_state,
    t.error,
    td.duplicate;
Note that this works fine on V9, but I have not watched it very carefully on V9 to see what it does. On V12, shared memory usage grows slowly, then after about 15 minutes it kicks into high gear, grows to about 12G, and then tries to get bigger and fails:
The error is:
ERROR: could not resize shared memory segment "/PostgreSQL.868719775" to 2147483648 bytes: No space left on device
On a whim, we ran just the select statement without creating the temporary table, and it also failed while shared memory was increasing, but the error message said that it was killed by the admin.
I am currently running vacuum against the DB to see if that helps.
The largest concern is that this does work with V9 but fails on V12. I also know that the query engine is very different and new in V12 compared to V9.
I had some crazy hope that running vacuum in stages would make a difference. The data was migrated using pg_upgrade.
vacuumdb -U postgres -p 5431 --all --analyze-in-stages
I don't know if the temporary table is created or not, but after running vacuum we ran the full query again, creating the temp table, and it also failed.
Any thoughts? Is my only choice to try more shared memory?
These shared memory segments are used for communication between worker processes during parallel query execution.
PostgreSQL seems to be tight on resources, and while the error is a symptom rather than the cause of the problem, you can improve the situation by disabling parallel query for this statement:
SET max_parallel_workers_per_gather = 0;
Then your query will take more time but use fewer resources, which might be enough to get rid of the problem.
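For example, scoped to just this statement (a sketch using the table names from the question; the setting reverts at the end of the session, or earlier with RESET):
-- Disable parallel query for this one statement, then restore the default.
SET max_parallel_workers_per_gather = 0;
CREATE TEMPORARY TABLE xxx AS
SELECT COUNT(t.id) AS count,
       t.tagged, t.tag_state, t.error, td.duplicate
FROM ttt t
    INNER JOIN tweet_data td ON (td.tweet_id = t.id)
GROUP BY t.tagged, t.tag_state, t.error, td.duplicate;
RESET max_parallel_workers_per_gather;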
In the long run, you should review your configuration, which might be too generous with memory or the number of connections, but I cannot diagnose that from here.

PostgreSQL - long running SELECT on Big XML - Data in TOAST

I am currently analyzing why the application we are using, installed on top of PostgreSQL, is sometimes so slow. The log files show that queries to a specific table have extremely long execution times.
I further found out that it is one column in the table, which contains XML documents (ranging from a few bytes to one entry with ~7MB of XML data), that is the cause of the slow query.
There are 1100 rows in the table, and a
SELECT * FROM mytable
has the same query execution time of 5 seconds as
SELECT [XML-column-only] FROM mytable
But in contrast, a
SELECT [id-only] FROM mytable
has a query execution time of only 0.2s!
I couldn't produce any noticeable differences depending on the settings (the usual ones, work_mem, shared_buffers,...), there is even almost no difference in comparison between our production server (PostgreSQL 9.3) and running it in a VM on PostgreSQL 9.4 on my workstation PC.
Disk monitoring shows almost no I/O activity for the query.
So the last thing I went to analyze was the Network I/O.
Of course, as mentioned before, there is a lot of data in that XML column. The total size for the 1100 rows (XML column only) is 36 MB. Divided by the 5 seconds of running time, that is a mere 7.2 MB/s of network transfer, which equals around 60 Mbit/s. Which is a little bit slow, as we are all on Gbit Ethernet, aren't we? :D The Windows Task Manager also shows a utilization of 6% for networking during the runtime of the query, which is in concordance with the manual calculation from before.
Furthermore, the query execution time is almost linear in the amount of XML data in the table. For testing, I deleted the 10% of rows with the largest amount of data in the XML column, and the execution time dropped to 2.5s instead of 5s (now ~18 instead of 36MB to transfer).
So, to get to the point: what are my options on the database administration side (we cannot touch or change the application itself) to make this simple SELECT for the XML data noticeably faster? Is there any bottleneck I didn't take into account yet? Or is this normal PostgreSQL behaviour?
EDIT: I use pgAdmin III. The execution plan (explain (analyze, verbose) select * from gdb_items) shows a much shorter total runtime than the actual query and the statement duration entry in the log:
Seq Scan on sde.gdb_items (cost=0.00..181.51 rows=1151 width=1399) (actual time=0.005..0.193 rows=1151 loops=1)
Output: objectid, uuid, type, name, physicalname, path, url, properties, defaults, datasetsubtype1, datasetsubtype2, datasetinfo1, datasetinfo2, definition, documentation, iteminfo, shape
Total runtime: 0.243 ms
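For reference, one way to confirm that the bulk of those 36 MB lives in the TOAST table rather than the heap (a sketch using the table name from the plan above):
-- Heap only vs. heap + TOAST + indexes for the table in question.
SELECT pg_size_pretty(pg_relation_size('sde.gdb_items'))       AS heap_size,
       pg_size_pretty(pg_total_relation_size('sde.gdb_items')) AS total_size;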

Why doesn't the restore end?

At first I thought the restore was too big, so instead of a single 2GB (compressed) DB backup I split it into several backups, one per schema. This schema's backup is 600 MB. The next step would be to split per table.
This one has some spatial data from my country's map; I'm not sure if that is relevant.
As you can see, it has been almost 2 hours. The disk isn't really in use anymore. When the restore started, the disk reached 100% several times, but the last hour has been flat at 0%.
And as you can see here, I can access the data in all the restored tables, so it looks like it is already done.
Is this normal?
Is there anything I can check to see what the restore is doing?
Hardware Setup:
Core i7 @ 3.4 GHz - 24 GB RAM
DB on a 250 GB SSD, backup files on a SATA disk
EDIT
SELECT application_name, query, *
FROM pg_stat_activity
ORDER BY application_name, query;
Yes, that seems perfectly normal.
Most likely you are observing index or constraint creation. Look at the output of
SELECT * FROM pg_stat_activity;
to confirm that (it should contain CREATE INDEX or ALTER TABLE).
It is too late now, but increasing maintenance_work_mem will speed up index creation.
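For a future restore, the setting can be raised cluster-wide before running pg_restore and reverted afterwards; a sketch (the 2GB value is just an example):
-- Before the restore: raise maintenance_work_mem and reload the configuration.
ALTER SYSTEM SET maintenance_work_mem = '2GB';
SELECT pg_reload_conf();

-- ... run pg_restore ...

-- After the restore: revert to the value from postgresql.conf.
ALTER SYSTEM RESET maintenance_work_mem;
SELECT pg_reload_conf();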