PostgreSQL: problem with a large number of partitions when using hash partitioning

I have a very large database with more than 1.5 billion records for device data and growing.
I manage this by having a separate table for each device, about 1000 devices (tables) with an index table for daily stats. Some devices produce much more data than others, so I have tables with more than 20 million rows and others with less than 1 million.
I use indexes, but queries and data processing get very slow on the large tables.
I just upgraded from PostgreSQL 9.6 to 13 and tried to consolidate everything into a single hash-partitioned table with at least 3600 partitions, importing all the per-device tables into it to speed up processing.
As soon as I did this I was able to insert some rows, but when I try to query or count rows I get "out of shared memory" errors and hit max_locks_per_transaction issues.
I tried to fine-tune the settings but didn't succeed. I dropped the partition count to 1000, but on certain operations I get the error again. Just for testing I dropped down to 100 and it works, but queries are slower than on a standalone table with the same amount of data.
I also tried range-partitioning each individual table by year, which improved things, but it will be very messy to maintain thousands of tables with yearly ranges (note: I am running on a server with 24 virtual processors and 32 GB of RAM).
The question is: is it possible to have a hash-partitioned table with more than 1000 partitions? If so, what am I doing wrong?
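For reference, a minimal sketch of the kind of setup described above; the table and column names are placeholders, not my real schema:

CREATE TABLE device_data (
    device_id int         NOT NULL,
    ts        timestamptz NOT NULL,
    value     double precision
) PARTITION BY HASH (device_id);

-- one partition per (MODULUS, REMAINDER) pair, repeated for 0 .. N-1
CREATE TABLE device_data_p0 PARTITION OF device_data
    FOR VALUES WITH (MODULUS 3600, REMAINDER 0);

-- postgresql.conf: each partition touched by a statement needs its own lock,
-- so with thousands of partitions the default of 64 is quickly exhausted
-- (changing it requires a server restart)
-- max_locks_per_transaction = 512

Presumably any query that cannot prune partitions has to lock every partition, and the lock table is sized from max_locks_per_transaction * (max_connections + max_prepared_transactions), which would explain the out-of-shared-memory errors.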

Related

Postgres count(*) extremely slow

I know that count(*) in Postgres is generally slow; however, I have a database where it's extremely slow. I'm talking about minutes, even hours.
There are approximately 40M rows in the table and it has 29 columns (most of them are text, 4 are double precision). There is an index on one column which should be unique, and I have already run VACUUM FULL. It took around one hour to complete but with no observable results.
The database runs on a dedicated server with 32 GB of RAM. I set shared_buffers to 8GB and work_mem to 80MB but saw no speed improvement. I'm aware there are techniques to get an approximate count or to use an external table to keep the count, but I'm not interested in the count specifically; I'm more concerned about performance in general, since right now it's awful. When I run the count there are no CPU spikes or anything like that. Could someone point out where to look? Can it be that the data is structured so badly that 40M rows are too much for Postgres to handle?
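For what it's worth, the approximate-count technique mentioned above is just a lookup of the planner statistics; a minimal sketch (the table name is a placeholder):

-- reltuples is an estimate maintained by VACUUM and ANALYZE,
-- so it is only as fresh as the last of those runs
SELECT reltuples::bigint AS approximate_rows
FROM pg_class
WHERE relname = 'my_big_table';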

Postgres Partitioning Query Performance when Partitioned for Delete

We are on PostgreSQL 12 and looking to partition a group of tables that are all related by Data Source Name. A source can have tens of millions of records, and the whole dataset makes up about 900GB of space across the 2000 data sources. We don't have a good way to update these records, so we are looking at a full dump and reload any time we need to update data for a source. This is why we are looking at partitioning: we can load the new data into a new partition, detach (and later drop) the partition that currently houses the data, and then attach the new partition with the latest data. Queries will be performed via a single ID field. My concern is that, since we are partitioning by source name but querying by an ID that isn't part of the partition key, we won't be able to use any partition pruning and our queries will suffer for it.
How concerned should we be with query performance for this use case? There will be an index defined on the ID that is being queried, but based on the Postgres documentation it can add a lot of planning time and use a lot of memory to service queries that look at many partitions.
Performance will suffer, but how much depends on the number of partitions. The more partitions you have, the slower both planning and execution time will get, so keep the number low.
You can save on query planning time by defining a prepared statement and reusing it.
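A sketch of both parts of this, with made-up table, partition and column names (a parent table list-partitioned by source_name and queried by id):

-- swap in freshly loaded data for one source
BEGIN;
ALTER TABLE measurements DETACH PARTITION measurements_source_42;
ALTER TABLE measurements ATTACH PARTITION measurements_source_42_new
    FOR VALUES IN ('source_42');
COMMIT;

-- reuse a prepared statement so the query is not re-planned on every call
PREPARE get_by_id (bigint) AS
    SELECT * FROM measurements WHERE id = $1;
EXECUTE get_by_id(12345);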

Slow bulk read from Postgres Read replica while updating the rows we read

We have a main Postgres server and a read replica on RDS.
We constantly write and update new data for the last couple of days.
Reading from the read replica works fine when looking at older data, but reading data from the last couple of days, where we keep updating the data on the main server, is painfully slow.
Queries that take 2-3 minutes on old data can time out after 20 minutes when querying data from the last day or two.
Looking at monitoring metrics like CPU, I don't see any extra load on the read replica.
Is there a solution for this?
You are accessing over 65 buffers for every visible row found in the index scan (and over 500 buffers for each row actually returned by the index scan, since 90% are filtered out by the mmsi criterion).
One issue is that your index is not as selective as it could be. If you had the index on (day, mmsi) rather than just (day), it should be about 10 times faster.
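Roughly like this (assuming the table name from the setting below and that the existing index is on day alone):

-- lets the index scan apply the mmsi filter directly instead of
-- discarding ~90% of the fetched rows afterwards
CREATE INDEX CONCURRENTLY simplified_blips_day_mmsi_idx
    ON simplified_blips (day, mmsi);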
But it also looks like you have a massive amount of bloat.
You are probably not vacuuming the table often enough. With the UPDATE pattern you describe, all the vacuuming needs accumulate in the newest data, but the activity counters are evaluated against the full table size, so autovacuum does not run often enough to keep up with the new data. You could lower the scale factor for this table:
alter table simplified_blips set (autovacuum_vacuum_scale_factor = 0.01)
Or if you partition the data based on "day", the partitions for newer days will naturally get vacuumed more often, because the frequency of updates is judged against the size of each partition and won't get diluted by the size of all the older, inactive partitions. Also, each vacuum run will take less work, as it won't have to scan all of the indexes of the entire table, just the indexes of the active partitions.
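A sketch of that layout, assuming the columns mentioned above (the remaining columns and the partition names are placeholders):

CREATE TABLE simplified_blips_part (
    day  date   NOT NULL,
    mmsi bigint NOT NULL
    -- plus the remaining columns of simplified_blips
) PARTITION BY RANGE (day);

-- one partition per day; autovacuum then judges update activity
-- against the size of each daily partition, not the whole table
CREATE TABLE simplified_blips_2024_06_01
    PARTITION OF simplified_blips_part
    FOR VALUES FROM ('2024-06-01') TO ('2024-06-02');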
As suggested, the problem was bloat.
When you update a record in PostgreSQL (which uses MVCC to provide its ACID guarantees), the database creates a new version of the row containing the updated data.
After the update, the old version is left behind as a "dead record" (a.k.a. dead tuple).
Every once in a while autovacuum runs and cleans the dead tuples out of the table.
Usually the default autovacuum behaviour is fine, but if your table is really large and updated often, you should consider tuning the autovacuum thresholds (its analyze and vacuum scale factors) to be more aggressive.
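To check whether dead tuples are actually piling up (and when autovacuum last ran on a table), the standard statistics view can be queried; a minimal sketch:

SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;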

Bulk Insert and Update in MySQL Cluster

Currently we are using MySQL Cluster with MyBatis. When we do bulk inserts or updates into a particular table, it takes more than 120 seconds, but the expectation is below 30 secs.
For example, with 10k records: first we tried to update all 10k rows at once, which took more than 180 to 240 minutes. So we moved to splitting them into batches like 4k, 4k, 2k; this also took 120 to 180 minutes. Finally we split the records into batches of 2k, 2k, ..., which took 90 to 120 seconds, but CPU usage went very high.
There are no relationships on that table.
Are there any solutions for these cases? Should we move to NoSQL, or is there optimization we can do at the DB level?
Cluster is very efficient when batching, as network round trips are avoided. But your inserts sound terribly slow; even serial inserts without batching should be much faster.
When I insert 20k batched records into a cluster table it takes about 0.18 sec on my laptop. It obviously depends on the schema and the amount of data.
Make sure you are not using e.g. auto-commit after each record. Also use
INSERT ... VALUES (), (), () ... type batched inserts
rather than
INSERT ... VALUES ()
INSERT ... VALUES ()
You can also increase the ndb-batch-size depending on the amount of data to insert in one transaction.
Details about your setup, how you insert, whether there are blobs, and what the schema and data look like would help to give a more specific answer.
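To make the batching advice concrete, a rough sketch with a made-up table (the point is one multi-row statement inside one explicit transaction instead of an auto-committed statement per row):

-- slow: one statement and one implicit transaction per row (autocommit)
INSERT INTO device_log (id, payload) VALUES (1, 'a');
INSERT INTO device_log (id, payload) VALUES (2, 'b');

-- faster: one batched statement, one commit
START TRANSACTION;
INSERT INTO device_log (id, payload) VALUES
    (1, 'a'), (2, 'b'), (3, 'c');
COMMIT;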

How many DB2 multi-row fetch cursors (maximum) can be maintained in a PL/I or COBOL program?

How many DB2 multi-row fetch cursors (at maximum) can be maintained in a PL/I or COBOL program while still getting good performance?
I have a requirement to maintain 4 cursors in a PL/I program, but I am concerned about the number of multi-row fetch cursors in a single program.
Is there any other way to check whether a multi-row fetch is more effective than a normal cursor? I tried with 1000 records but couldn't see any difference in running time.
IBM published some information (PDF) about multi-row fetch performance when this feature first became available in DB2 8 in 2005. Their data mentions nothing about the number of cursors in a given program, only the number of rows fetched.
From this I infer the number of multi-row fetch cursors itself is not of concern from a performance standpoint -- within reason. If someone pushes the limits of reason with 10,000 such cursors I will not be responsible for their anguish.
The IBM Redbook linked to earlier indicates there is a 40% CPU time improvement retrieving 10 rows per fetch, and a 50% CPU time improvement retrieving 100+ rows per fetch. Caveats:
The performance improvement using multi-row fetch in general depends on:
Number of rows fetched in one fetch
Number of columns fetched (more improvement with fewer columns), data type and size of the columns
Complexity of the fetch. The fixed overhead saved for not having to go between the database engine and the application program has a lower percentage impact for complex SQL that has longer path lengths.
If the multi-row fetch reads more rows per statement, it results in CPU time improvement, but after 10 to 100 rows per multi-row fetch, the benefit is decreased. The benefit decreases because, if the cost of one API overhead per row is 100% in a single-row statement, it gets divided by the number of rows processed in one SQL statement. So it becomes 10% with 10 rows, 1% with 100 rows, 0.1% for 1000 rows, and then the benefit becomes negligible.
The Redbook also has some discussion of how they did their testing to arrive at their performance figures. In short, they varied the number of rows retrieved and reran their program several times, pretty much what you'd expect.
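For anyone comparing the two styles, a bare-bones multi-row fetch in embedded SQL looks roughly like this (PL/I-style statement terminators; the cursor, table and host-variable arrays are placeholders, and the host arrays must be dimensioned for the rowset size):

EXEC SQL DECLARE C1 CURSOR WITH ROWSET POSITIONING FOR
    SELECT ACCT_ID, BALANCE FROM ACCOUNTS;
EXEC SQL OPEN C1;
/* one call returns up to 100 rows into the host arrays */
EXEC SQL FETCH NEXT ROWSET FROM C1 FOR 100 ROWS
    INTO :ACCT_ID_ARR, :BALANCE_ARR;
EXEC SQL CLOSE C1;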