I am trying to learn PostgreSQL MVCC architecture. It says that MVCC creates a separate snapshot for each concurrent query. Isn't this approach memory inefficient?
For example, suppose there are 1000 concurrent queries and the table size is huge. This will create multiple instances of the table.
Is my understanding correct?
It says that MVCC creates a separate snapshot for each concurrent query. Isn't this approach memory inefficient?
You could argue it is memory-inefficient. It usually isn't a big problem in practice.
For example, suppose there are 1000 concurrent queries and the table size is huge.
Why would you have or want 1000 concurrent queries? Do you have 1000 CPUs? If there is a risk that you will try to establish 1000 concurrent queries, you should deploy some admission control mechanism (like a connection pooler) that prevents this from happening, with max_connections as a backstop.
This will create multiple instances of the table.
A snapshot is not a copy of the table. It is just a set of information that gets applied to the base table rows dynamically to decide which rows are visible in that snapshot. The size of a snapshot is proportional to the number of concurrent transactions (one reason not to have 1000 of them), not to the size of the table.
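A minimal sketch of how this plays out, assuming a hypothetical accounts table and two concurrent psql sessions:

-- Session 1: open a transaction whose snapshot stays fixed after its first query
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT xmin, xmax, balance FROM accounts WHERE id = 1;

-- Session 2: update the same row; this writes a new row version,
-- it does not copy the table
UPDATE accounts SET balance = balance + 100 WHERE id = 1;

-- Session 1: still sees the old row version, because its snapshot
-- (essentially a pair of transaction-id boundaries plus a list of
-- in-progress transaction ids) filters out the newer version
SELECT xmin, xmax, balance FROM accounts WHERE id = 1;
COMMIT;

Both sessions read the same physical table; the snapshot only decides which row versions each of them is allowed to see.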
As I understand it, when you perform a query that doesn't filter by the partition key, you perform a cross-partition query. For this to be executed, the query is sent to all physical partitions of your CDB collection, executed in parallel in each of them, and the results are then merged and returned.
As you scale to tens of thousands of requests per second, that means that each of the tens of thousands of requests is executed on each physical partition.
Does this mean that eventually each partition will reach the limit of requests per second it can serve, and horizontal scaling will no longer give any benefit? Because every new physical partition CDB adds will need to serve all incoming requests, so it's not adding new throughput capacity, only storage.
The downstream implication is that even if, at a small scale, you're OK with incurring the increased RU cost of cross-partition queries, to truly be able to scale indefinitely your data model should ensure queries hit only one partition (possibly by denormalizing the data).
Yes, cross-partition queries will not allow a database like Cosmos DB (or any horizontally scalable database) to scale.
Databases like Cosmos DB provide unlimited scale because they scale horizontally. The objective for your partition strategy should be to answer your high-volume queries with one partition or, at a minimum, a bounded set of partitions. The effort around partition strategy is to choose a property that is nearly always passed in queries. Denormalization is generally more a function of modeling data around requests; it has less to do with partitioning directly.
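For example, assuming a hypothetical container partitioned on /customerId, the difference looks like this in Cosmos DB's SQL syntax. The first query includes the partition key, so it is routed to a single physical partition:

SELECT * FROM c WHERE c.customerId = 'cust-42' AND c.orderDate >= '2023-01-01'

The second omits the partition key, so it fans out to every physical partition and the results are merged, which is the pattern that stops scaling:

SELECT * FROM c WHERE c.orderDate >= '2023-01-01'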
If you would like to learn more about partitioning and modeling with Cosmos DB, I highly recommend watching the video Data modeling & partitioning: What every relational database dev needs to know. It presents the topics very well.
We are on PostgreSQL 12 and looking to partition a group of tables that are all related by Data Source Name. A source can have tens of millions of records, and the whole dataset makes up about 900GB of space across the 2000 data sources.

We don't have a good way to update these records, so we are looking at a full dump and reload any time we need to update the data for a source. This is why we are looking at partitioning: we can load the new data into a new partition, detach (and later drop) the partition that currently houses the data, and then attach the new partition with the latest data.

Queries will be performed via a single ID field. My concern is that since we are partitioning by source name and querying by an ID that isn't used in the partition definition, we won't be able to utilize any partition pruning and our queries will suffer for it.
How concerned should we be with query performance for this use case? There will be an index defined on the ID that is being queried, but based on the Postgres documentation it can add a lot of planning time and use a lot of memory to service queries that look at many partitions.
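For reference, the swap workflow we have in mind looks roughly like the sketch below; the table, column, and partition names are placeholders, not our real schema:

-- Parent table, list-partitioned by the data source name
CREATE TABLE source_data (
    id          bigint NOT NULL,
    source_name text   NOT NULL,
    payload     jsonb
) PARTITION BY LIST (source_name);

-- Index on the column we actually query by; created on every partition
CREATE INDEX ON source_data (id);

-- Reloading one source: bulk-load a standalone table, then swap it in
CREATE TABLE source_data_src_a_new (
    id          bigint NOT NULL,
    source_name text   NOT NULL,
    payload     jsonb
);
-- ... COPY the fresh data for 'source_a' into source_data_src_a_new ...
CREATE INDEX ON source_data_src_a_new (id);

BEGIN;
ALTER TABLE source_data DETACH PARTITION source_data_src_a;
ALTER TABLE source_data ATTACH PARTITION source_data_src_a_new
    FOR VALUES IN ('source_a');
COMMIT;
DROP TABLE source_data_src_a;  -- later, once nothing depends on the old data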
Performance will suffer, but how much will depend on the number of partitions. The more partitions you have, the slower both planning and execution will get, so keep the number low.
You can save on query planning time by defining a prepared statement and reusing it.
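A minimal sketch of that, assuming a table queried by an id column as described in the question (names are placeholders):

-- Planned once per session; subsequent executions can reuse a cached
-- (generic) plan instead of re-planning against every partition
PREPARE get_by_id (bigint) AS
    SELECT * FROM source_data WHERE id = $1;

EXECUTE get_by_id (123456);

By default PostgreSQL switches to a generic plan only after a few executions; on PostgreSQL 12 the plan_cache_mode setting can force that behaviour if needed.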
We have on RDS a main Postgres server and a read replica.
We constantly write and update new data for the last couple of days.
Reading from the read replica works fine when looking at older data, but reading data from the last couple of days, where we keep updating the data on the main server, is painfully slow.
Queries that take 2-3 minutes on old data can timeout after 20 minutes when querying data from the last day or two.
Looking at monitors like CPU usage, I don't see any extra load on the read replica.
Is there a solution for this?
You are accessing over 65 buffers for every visible row found in the index scan (and over 500 buffers for each row that is actually returned by the index scan, since 90% of them are filtered out by the mmsi criterion).
One issue is that your index is not as selective as it could be. If you had the index on (day, mmsi) rather than just (day), it should be about 10 times faster.
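For example, using the table and column names that appear in this thread (treat it as a sketch):

-- Lets the index satisfy both the day and the mmsi criteria, instead of
-- fetching rows by day and discarding ~90% of them on the mmsi filter
CREATE INDEX CONCURRENTLY simplified_blips_day_mmsi_idx
    ON simplified_blips (day, mmsi);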
But it also looks like you have a massive amount of bloat.
You are probably not vacuuming the table often enough. With your described UPDATE pattern, all the vacuum needs are accumulating in the newest data, but the activity counters are evaluated based on the full table size, so autovacuum is not done often enough to suit the needs of the new data. You could lower the scale factor for this table:
alter table simplified_blips set (autovacuum_vacuum_scale_factor = 0.01)
Or if you partition the data based on "day", then the partitions for newer days will naturally get vacuumed more often because the occurrence of updates will be judged against the size of each partition, it won't get diluted out by the size of all the older inactive partitions. Also, each vacuum run will take less work, as it won't have to scan all of the indexes of the entire table, just the indexes of the active partitions.
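A rough sketch of that layout, with made-up column definitions:

-- Each day's partition is vacuumed based on its own update activity
CREATE TABLE simplified_blips_part (
    day  date   NOT NULL,
    mmsi bigint NOT NULL
    -- ... remaining columns ...
) PARTITION BY RANGE (day);

CREATE TABLE simplified_blips_2024_05_01 PARTITION OF simplified_blips_part
    FOR VALUES FROM ('2024-05-01') TO ('2024-05-02');

CREATE INDEX ON simplified_blips_part (day, mmsi);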
As suggested, the problem was bloat.
When you update a record in an MVCC database such as PostgreSQL, the database creates a new version of the record containing the updated values instead of overwriting it in place.
After the update you end up with a "dead record" (AKA a dead tuple).
Once in a while the database runs autovacuum and cleans the dead tuples out of the table.
Usually autovacuum is fine, but if your table is really large and updated often, you should consider making its autovacuum thresholds and scale factors more aggressive.
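To check whether dead tuples are piling up, and to make autovacuum more aggressive for just one table, something along these lines can be used (the table name is taken from the answer above):

-- How many dead tuples have accumulated, and when autovacuum last ran
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'simplified_blips';

-- Trigger autovacuum after a much smaller fraction of the table changes
ALTER TABLE simplified_blips SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_analyze_scale_factor = 0.01
);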
I created multi-threaded connections from Java to KDB and then had records inserted into a single table concurrently.
But it seems that the overall duration is almost the same as the sum of the individual durations, as if no concurrent insertion happened.
Would you know if KDB supports parallel insertion?
If so, is there any setting I should do?
Does it have record-level or table-level locking?
kdb does not support parallel inserts into in-memory tables. In fact updates to in-memory data may only be made from the q main thread. This means that tables are 'locked' (can't be amended) essentially to all clients if a q server is started with a negative port, and the issue is irrelevant if the q session is in single threaded mode (as most sessions tend to be). The situation is a little different for tables stored on disk (I can expand on that later if required).
In order to accelerate your inserts I would suggest looking at the following:
a) Are the inserts batched, rather than sent as a series of single inserts? One insert of 1k rows will take much less time than 1k inserts of one row.
b) Are the inserts sent async or sync? Changing between these two may speed up insertion rates, but at the cost of knowing whether the inserts executed correctly.
Can you share more about your use case? Is your Java client sending market data? If so, would a TP-style setup be more appropriate? See kdb+ tick and its derivatives such as TorQ (note that TorQ is developed by my employer).
A KDB process is a single-threaded process in general (except when running in multiple slave thread/process mode) https://code.kx.com/q/ref/cmdline/#-s-slaves
Though you have multiple Java threads writing data to the q process, the data is written to KDB sequentially, so the concurrency gives no performance benefit. For the same reason, it does not need table-level or row-level locking.
I would, however, recommend that you stream the data in async mode (negative handle). This lets your Java threads return quickly rather than waiting for KDB to complete the operation, which will definitely improve performance on the writing side.
When using parallel processing mode (slave threads, a positive number), the slave threads are not allowed to write to global tables/variables; you would need to use multi-process mode to achieve that (a negative number when launching the q process).
What is the maximum number of DB2 multi-row fetch cursors that can be maintained in a PL/I or COBOL program while keeping good performance?
I have a requirement to maintain 4 cursors in a PL/I program, but I am concerned about the number of multi-row fetch cursors in a single program.
Is there any other way to check whether multi-row fetch is more effective than a normal cursor? I tried with 1000 records but couldn't see any difference in running time.
IBM published some information (PDF) about multi-row fetch performance when this feature first became available in DB2 8 in 2005. Their data mentions nothing about the number of cursors in a given program, only the number of rows fetched.
From this I infer the number of multi-row fetch cursors itself is not of concern from a performance standpoint -- within reason. If someone pushes the limits of reason with 10,000 such cursors I will not be responsible for their anguish.
The IBM Redbook linked to earlier indicates there is a 40% CPU time improvement retrieving 10 rows per fetch, and a 50% CPU time improvement retrieving 100+ rows per fetch. Caveats:
The performance improvement using multi-row fetch in general depends on:

- Number of rows fetched in one fetch
- Number of columns fetched (more improvement with fewer columns), data type and size of the columns
- Complexity of the fetch. The fixed overhead saved for not having to go between the database engine and the application program has a lower percentage impact for complex SQL that has longer path lengths.

If the multi-row fetch reads more rows per statement, it results in CPU time improvement, but after 10 to 100 rows per multi-row fetch, the benefit is decreased. The benefit decreases because, if the cost of one API overhead per row is 100% in a single-row statement, it gets divided by the number of rows processed in one SQL statement. So it becomes 10% with 10 rows, 1% with 100 rows, 0.1% for 1000 rows, and then the benefit becomes negligible.
The Redbook also has some discussion of how they did their testing to arrive at their performance figures. In short, they varied the number of rows retrieved and reran their program several times, pretty much what you'd expect.
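For illustration, a rowset (multi-row) fetch in DB2 embedded SQL looks roughly like the sketch below; the cursor, table, and host-variable names are invented, and in a real program each statement would be wrapped in EXEC SQL (with END-EXEC in COBOL):

-- Declare a cursor that positions on rowsets instead of single rows
DECLARE C1 CURSOR WITH ROWSET POSITIONING FOR
    SELECT ACCT_ID, ACCT_BAL
    FROM   ACCOUNTS
    ORDER BY ACCT_ID;

OPEN C1;

-- Fetch 100 rows per call into host-variable arrays; SQLERRD(3) in the
-- SQLCA reports how many rows the rowset actually contains
FETCH NEXT ROWSET FROM C1 FOR 100 ROWS
    INTO :ACCT-ID-ARR, :ACCT-BAL-ARR;

CLOSE C1;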