Postgres: truncate / load causes basic queries to take seconds

I have a Postgres table that gets truncated nightly and then reloaded with a bulk insert (a million or so records).
This table is behaving very strangely: basic queries such as "SELECT * FROM mytable LIMIT 10" are taking 40+ seconds. The records are narrow, just a couple of integer columns.
Perplexed... I'd very much appreciate your advice.
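For reference, a minimal diagnostic sketch, assuming the table really is named mytable as in the question: compare the on-disk size with the live row count (a table file far larger than ~1 million narrow rows would justify points to bloat), and refresh statistics after the nightly load.
-- Hypothetical diagnostics; mytable is the name used in the question.
SELECT pg_size_pretty(pg_total_relation_size('mytable'));  -- on-disk size including indexes
SELECT count(*) FROM mytable;                              -- live row count
-- Reclaim dead space and refresh planner statistics after the nightly reload.
VACUUM ANALYZE mytable;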

Related

Postgres query optimization: merge index

I am experienced in fine-tuning Oracle, but in Postgres I am unable to improve performance.
Problem statement: I need to aggregate rows from one Postgres table that has a large number of columns (110) and 175 million rows for a one-month range. Apart from the aggregation, the query has a very simple WHERE clause:
where time between '2019-03-15' and '2019-04-15'
and org_name in ('xxx','yyy'.. 15 elements)
There are individual btree indexes on the table, idx_time on "time" and idx_org_name on "org_name", but no composite index.
I tried creating a new index on ('org_name','time'), but my manager does not want to change anything.
How can I make it run faster? It takes 15 minutes now (with a smaller set of org_name values it takes 6 minutes). Most of the time is spent on data access from the table.
Is parallel execution possible?
thanks, Jay
QUERY EXPLAIN ANALYZE :
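The EXPLAIN ANALYZE output itself is not shown. For illustration, a hedged sketch of the two options raised above (a composite index matching the WHERE clause, and parallel scans available since PostgreSQL 9.6); the table name, index name, and aggregate are stand-ins, since only the filter columns were given:
-- Hypothetical composite index: equality column first, range column second.
CREATE INDEX idx_org_name_time ON mytable (org_name, time);
-- Parallel scans (PostgreSQL 9.6+); the worker count here is only an example.
SET max_parallel_workers_per_gather = 4;
EXPLAIN (ANALYZE, BUFFERS)
SELECT org_name, count(*)            -- stand-in aggregate
FROM mytable
WHERE time BETWEEN '2019-03-15' AND '2019-04-15'
  AND org_name IN ('xxx', 'yyy')
GROUP BY org_name;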

Postgres multi-column index is taking forever to complete

I have a table with around 270,000,000 rows and this is how I created it.
CREATE TABLE init_package_details AS
SELECT pcont.package_content_id as package_content_id,
pcont.activity_id as activity_id,
pc.org_id as org_id,
pc.bed_type as bed_type,
pc.is_override as is_override,
pmmap.package_id as package_id,
pcont.activity_qty as activity_qty,
pcont.charge_head as charge_head,
pcont.activity_charge as charge,
COALESCE(pc.charge,0) - COALESCE(pc.discount,0) as package_charge
FROM a pc
JOIN b od ON
(od.org_id = pc.org_id AND od.status='A')
JOIN c pm ON
(pc.package_id=pm.package_id)
JOIN d pmmap ON
(pmmap.pack_master_id=pm.package_id)
JOIN e pcont ON
(pcont.package_id=pmmap.package_id);
I need to build index on the init_package_details table.
This table gets created in around 5-6 minutes.
I have created a btree index like this:
CREATE INDEX init_package_details_package_content_id_idx
ON init_package_details(package_content_id);
which takes 10 minutes (more than the time it takes to create and populate the table itself).
And when I create another index like this:
CREATE INDEX init_package_details_package_act_org_bt_id_idx
ON init_package_details(activity_id,org_id,bed_type);
it just freezes and takes forever to complete. I waited for around 30 minutes before I manually cancelled it.
Below are stats from iotop -o, if it helps:
When I created the table, it was averaging around 110-120 MB/s (this is how 270 million rows got inserted in 5-6 minutes).
When I created the first index, it was averaging around 70 MB/s.
On the second index, it is crawling at 5-7 MB/s.
Could someone explain why this is happening? Is there any way I can speed up the index creation here?
EDIT 1: There are no other connections accessing the table, and pg_stat_activity shows "active" as the status throughout the running time. This happens inside a transaction (between BEGIN and COMMIT; the same .sql file contains many other scripts).
EDIT 2:
postgres=# show work_mem ;
work_mem
----------
5MB
(1 row)
postgres=# show maintenance_work_mem;
maintenance_work_mem
----------------------
16MB
Building indexes takes a long time; that's normal.
If you are not bottlenecked on I/O, you are probably bottlenecked on CPU.
There are a few things you can do to improve performance, sketched below:
Set maintenance_work_mem very high.
Use PostgreSQL v11 or later, where several parallel workers can be used to build the index.
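A minimal sketch of those two suggestions, run in the same session before the index build; the values are illustrative, not tuned recommendations:
-- Give the sort phase of the index build far more memory than the 16MB default shown above.
SET maintenance_work_mem = '2GB';
-- PostgreSQL 11+: allow parallel workers for CREATE INDEX.
SET max_parallel_maintenance_workers = 4;
CREATE INDEX init_package_details_package_act_org_bt_id_idx
ON init_package_details(activity_id, org_id, bed_type);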

Query on large, indexed table times out

I am relatively new to using Postgres, but I am wondering what the workaround could be here.
I have a table with about 20 columns and 250 million rows, and an index created for the timestamp column time (but no partitions).
Queries sent to the table run endlessly and fail, even simple SELECT * queries (although the "view first/last 100 rows" function in pgAdmin works).
For example, if I want to LIMIT a selection of the data to 10 rows:
SELECT * from mytable
WHERE time::timestamp < '2019-01-01'
LIMIT 10;
Such a query hangs. What can be done to optimize queries on a table this large? When the table was smaller (~100 million rows), queries would always complete. What should one do in this case?
If time is of data type timestamp, or the index is created on (time::timestamp), the query should be as fast as lightning.
Please show the CREATE TABLE and CREATE INDEX statements, and the EXPLAIN output for the query, for more details.
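A minimal sketch of that advice, assuming time is already a timestamp column so the cast in the WHERE clause is unnecessary; alternatively, as the answer notes, an index on the cast expression itself would match the query (whether that is possible depends on the column's actual type):
-- With the cast removed, the existing index on "time" can be used directly.
EXPLAIN
SELECT * FROM mytable
WHERE time < '2019-01-01'
LIMIT 10;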
"Query that doesn't complete" usually means that it does disk swaps. Especially when you mention the fact that with 100M rows it manages to complete. That's because index for 100M rows still fits in your memory. But index twice this size doesn't.
Limit won't help you here, as database probably decides to read the index first, and that's what kills it.
You could try and increase available memory, but partitioning would actually be the best solution here.
Partitioning means smaller tables. Smaller tables means smaller indexes. Smaller indexes have better chances to fit into your memory.
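A minimal sketch of range partitioning on time, assuming PostgreSQL 10+ declarative partitioning; the table and partition names, and the yearly ranges, are only examples:
-- Hypothetical partitioned replacement for the 250-million-row table.
CREATE TABLE mytable_part (
    id   bigint,
    time timestamp NOT NULL
    -- ... the other columns from the original table ...
) PARTITION BY RANGE (time);
CREATE TABLE mytable_2018 PARTITION OF mytable_part
    FOR VALUES FROM ('2018-01-01') TO ('2019-01-01');
CREATE TABLE mytable_2019 PARTITION OF mytable_part
    FOR VALUES FROM ('2019-01-01') TO ('2020-01-01');
-- Each partition gets its own, much smaller index.
CREATE INDEX ON mytable_2018 (time);
CREATE INDEX ON mytable_2019 (time);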

Postgresql select count(*) takes too long

I have a table in my PostgreSQL database. The table has about 9,100,000 rows. When I execute the query select count(*) from table, the execution time is about 1.5 minutes. Is this normal? And what can I do to decrease this time?
If you want an estimate of the row count, you can use a count estimate instead. It is much faster.
https://wiki.postgresql.org/wiki/Count_estimate
Another workaround is to use a statistics field that you increment every time a new row is added.
Also please read https://www.citusdata.com/blog/2016/10/12/count-performance/
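A minimal sketch of the estimate approach from the wiki page above, reading the planner's row estimate from pg_class (the table name is a placeholder):
-- Fast approximate count from planner statistics, kept up to date by autovacuum/ANALYZE.
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE oid = 'public.mytable'::regclass;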

Postgres, operations are slow

My Postgres queries on the records table are slow.
A simple query like this one can take 15 seconds!
The result: 32k rows (out of 1.5 million).
SELECT COUNT(*)
FROM project.records
WHERE created_at > NOW() - INTERVAL '1 day'
I have an index on created_at (which is a timestamp)
What can I do to manage this? Is my table just too big?
As suggested by Andomar, I moved the large columns to another table.
I made sure to do a VACUUM ANALYZE to really clean the table.
Now the query takes 400ms.
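For completeness, a minimal sketch of verifying that fix with the schema from the question; the EXPLAIN is only there to confirm that the index on created_at is now used:
-- Clean up dead rows and refresh planner statistics, as described in the edit above.
VACUUM ANALYZE project.records;
-- Confirm the one-day count now uses the created_at index.
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*)
FROM project.records
WHERE created_at > NOW() - INTERVAL '1 day';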