I'm trying to use the Postgres JDBC driver to query data from a table where each row can be up to about 50MB. Unfortunately, without any memory restrictions, the Postgres driver can use too much memory and cause OOMs (even with a very healthy Xmx) because it buffers so much data locally.
I've tried to restrict the driver to using less memory (for example 1GB) and to buffer less. Since no single row is bigger than 50MB this should work fine, but unfortunately I'm now getting exceptions thrown from the Postgres driver itself, because it is trying to allocate more memory than I have configured it with.
If I use this configuration:
"jdbc:postgresql://localhost/dbname?maxResultBuffer=1G&adaptiveFetch=true&adaptiveFetchMaximum=2&defaultRowFetchSize=1"
I'll get an Exception thrown here, in PGStream
if (resultBufferByteCount > maxResultBuffer) {
  throw new PSQLException(GT.tr(
      "Result set exceeded maxResultBuffer limit. Received: {0}; Current limit: {1}",
      String.valueOf(resultBufferByteCount), String.valueOf(maxResultBuffer)),
      PSQLState.COMMUNICATION_ERROR);
}
If I set a breakpoint there I can see:
value = 41155480
resultBufferByteCount = 1021091718
maxResultBuffer = 1000000000
Which shows it's picking up the config fine. I've also inspected it to make sure it's getting the fetch size config and it is.
Is there some other config I'm missing? Clearly the Postgres driver is reading more rows than I've allowed it to.
thanks
(PostgreSQL JDBC driver 42.5.1, Java 17.0.5, HikariCP 5.0.1 with max connections of 1)
The adaptive buffer, like setFetchSize, only works if autocommit is off. If autocommit is on, they are silently ignored. I don't know if there is a way to turn autocommit off through the JDBC connect string; I haven't found one.
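HikariCP can turn autocommit off at the pool level, so it doesn't need to be in the connect string. Here's a minimal sketch of a streaming read with autocommit off and a small fetch size; the big_table/payload names are hypothetical, everything else mirrors the configuration above:

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class StreamingReadExample {
  public static void main(String[] args) throws Exception {
    HikariConfig config = new HikariConfig();
    config.setJdbcUrl("jdbc:postgresql://localhost/dbname"
        + "?maxResultBuffer=1G&adaptiveFetch=true&adaptiveFetchMaximum=2&defaultRowFetchSize=1");
    config.setMaximumPoolSize(1);
    // Autocommit must be off for fetch size / adaptive fetch to take effect;
    // HikariCP lets you set this once for the whole pool.
    config.setAutoCommit(false);

    try (HikariDataSource ds = new HikariDataSource(config);
         Connection conn = ds.getConnection();
         PreparedStatement ps = conn.prepareStatement(
             "select payload from big_table")) {      // hypothetical table/column
      ps.setFetchSize(1);                             // stream a row at a time
      try (ResultSet rs = ps.executeQuery()) {
        while (rs.next()) {
          byte[] payload = rs.getBytes(1);            // each row may be ~50MB
          // process payload ...
        }
      }
      conn.commit();                                  // end the cursor's transaction
    }
  }
}

With autocommit off and a non-zero fetch size, the driver reads through a cursor, so it only holds one batch of rows (bounded by maxResultBuffer when adaptive fetch is on) in memory at a time.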
Related
What I'm trying to do:
I'm trying to move about 2m records from one table into another. To do this, I'm doing an insert statement which is fed by a select query.
insert into my_table (
select a, b, c
from my_other_table
where (condition)
)
However, while running this, I keep running out of memory.
What I expected (and why I'm confused):
If the working set was larger than could fit in memory, I totally thought Postgres would buffer pages onto the disk and do the write iteratively behind the scenes.
However, what's happening is that it apparently tries to read all of the selected content into memory prior to stuffing it into the other table.
Even on our chunky r5.2xl instance, it consumes all of the memory until eventually the OOM killer fires and Aurora reboots the instance.
This graph shows freeable memory dipping down to zero every time I run the query. The memory shooting back up is due to the instance automatically being killed and rebooted after the OOM.
My main question:
Why is Postgresql not being smarter and doing its own pagination behind the scenes?
Is there some setting I need to enable to get it to be aware of its memory limitations?
What I've tried:
Adjusting shared_buffers and work_mem parameters.
Aurora's default shared_buffers value allocates 20gb to our instance. I've tried dialing this down to 10gb, and then 6.5gb (restarting each time), but to no avail. The only effect was to make the query take ages and still ultimately consume all available memory after running for about 30min.
I similarly tried setting work_mem all the way down to the allowable minimum, but this seemingly had no effect on the end result either.
What I can do as a work around:
I could, of course, do the pagination / batching from the client:
computeBatchOffsets(context).forEach(batchOffset ->
    context.insertInto(BLAH)
        .select(DSL.select(DSL.asterisk())
            .from(FOO)
            .orderBy(FOO.ID)       // stable ordering so offset batches don't skip or repeat rows
            .limit(BATCH_SIZE)     // rows per batch, not the offset
            .offset(batchOffset))
        .execute()
);
But, in addition to it being slower than just letting the database do it, it "feels" like something the database should surely be able to do internally. So, I'm confused why I'd need to handle it at the client level.
I recently upgraded a Postgres 9.6 instance to 11.1 on Google Cloud SQL. Since then I've begun to notice a large number of the following error across multiple queries:
org.postgresql.util.PSQLException: ERROR: could not resize shared
memory segment "/PostgreSQL.78044234" to 2097152 bytes: No space left
on device
From what I've read, this is probably due to changes that came in PG10, and the typical solution involves increasing the instance's shared memory. To my knowledge this isn't possible on Google Cloud SQL though. I've also tried adjusting work_mem with no positive effect.
This may not matter, but for completeness, the instance is configured with 30 gigs of RAM, 120 gigs of SSD space, and 8 CPUs. I'd assume that Google would provide an appropriate shared memory setting for those specs, but perhaps not? Any ideas?
UPDATE
Setting the database flag random_page_cost to 1 appears to have reduced the impact of the issue. This isn't a full solution though, so I'd still love to get a proper fix if one is out there.
Credit goes to this blog post for the idea.
UPDATE 2
The original issue report was closed and a new internal issue that isn't viewable by the public was created. According to a GCP Account Manager's email reply, however, a fix was rolled out by Google on 8/11/2019.
This worked for me. I think Google needs to change a flag on how they're starting the Postgres container on their end, which we can't influence from inside Postgres.
https://www.postgresql.org/message-id/CAEepm%3D2wXSfmS601nUVCftJKRPF%3DPRX%2BDYZxMeT8M2WwLSanVQ%40mail.gmail.com
Bingo. Somehow your container tech is limiting shared memory. That
error is working as designed. You could figure out how to fix the
mount options, or you could disable parallelism with
max_parallel_workers_per_gather = 0.
show max_parallel_workers_per_gather;
-- 2
-- Run your query
-- Query fails
alter user ${MY_PROD_USER} set max_parallel_workers_per_gather=0;
-- Run query again -- query should work
alter user ${MY_PROD_USER} set max_parallel_workers_per_gather=2;
-- Run query again -- fails
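If altering the user isn't an option, the same workaround can be applied per session from the application just before the failing query. A minimal JDBC sketch, assuming a hypothetical DataSource and query string:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import javax.sql.DataSource;

public class DisableParallelismExample {
  // Runs the problematic query with parallelism disabled for this session only.
  static void runWithoutParallelism(DataSource ds, String failingQuery) throws Exception {
    try (Connection conn = ds.getConnection();
         Statement st = conn.createStatement()) {
      // SET is session-scoped; it doesn't affect other connections.
      // With a connection pool, SET LOCAL inside a transaction is safer,
      // since it reverts automatically at commit/rollback.
      st.execute("SET max_parallel_workers_per_gather = 0");
      try (ResultSet rs = st.executeQuery(failingQuery)) {
        while (rs.next()) {
          // process rows ...
        }
      }
    }
  }
}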
You may consider increasing the tier of the instance; that will influence the machine memory, vCPU cores, and resources available to your Cloud SQL instance. Check the available machine types.
In Google Cloud SQL for PostgreSQL it is also possible to change database flags that influence memory consumption:
max_connections: some memory resources can be allocated per-client, so the maximum number of clients suggests the maximum possible memory use
shared_buffers: determines how much memory is dedicated to PostgreSQL to use for caching data
autovacuum: should be on.
I recommend lowering these limits to reduce memory consumption.
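To see what those flags are currently set to (and in which units), you can query pg_settings from the application. A small sketch with plain JDBC and a hypothetical DataSource:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import javax.sql.DataSource;

public class ShowMemoryFlags {
  // Prints the memory-related settings discussed above.
  static void printFlags(DataSource ds) throws Exception {
    String sql = "SELECT name, setting, unit FROM pg_settings "
               + "WHERE name IN ('max_connections', 'shared_buffers', 'work_mem', 'autovacuum')";
    try (Connection conn = ds.getConnection();
         Statement st = conn.createStatement();
         ResultSet rs = st.executeQuery(sql)) {
      while (rs.next()) {
        String unit = rs.getString("unit");
        System.out.printf("%s = %s %s%n",
            rs.getString("name"), rs.getString("setting"), unit == null ? "" : unit);
      }
    }
  }
}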
I tried issuing a read-all request (select * from tblName) using JDBC cursors (setFetchSize). The temp_file_limit property in postgresql.conf is 500 KB. When I execute the prepared statement, I get a PSQLException:
org.postgresql.util.PSQLException: ERROR: temporary file size exceeds temp_file_limit (500kB)
The comment in postgresql.conf says "# limits per-session temp file space".
As per https://jdbc.postgresql.org/documentation/head/query.html#query-with-cursor, the connection must have auto-commit disabled, and I had not disabled it. I did that as well, yet I ended up with the same issue. My understanding is that such large read operations are written to a temporary file before being loaded into the result set. If that is the case, then with a low temp_file_limit I will never be able to read very large data, even with a cursor, which makes me wonder why it is configurable in the first place.
Setting the temp_file_limit to -1 (unbounded file size) solved this issue for me. Am I correct in taking this approach?
I have seen queries that would use a lot of temp space (sometimes over 1 TiB) and kept going until all free space was used and other queries started crashing. Setting it to, let's say, a reasonable (in our case) 100 GiB would have protected the other sessions.
Setting it to 500kB seems indeed pointless.
FYI temp files are used not only for cursors.
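One way to pick a reasonable limit rather than -1 is to watch how much temp file space the workload actually uses. A small JDBC sketch (again with a hypothetical DataSource) reading the cumulative counters from pg_stat_database:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import javax.sql.DataSource;

public class TempFileUsage {
  // Prints the cumulative temp file count and bytes since the last stats reset.
  static void printTempFileUsage(DataSource ds) throws Exception {
    String sql = "SELECT datname, temp_files, temp_bytes "
               + "FROM pg_stat_database WHERE datname = current_database()";
    try (Connection conn = ds.getConnection();
         Statement st = conn.createStatement();
         ResultSet rs = st.executeQuery(sql)) {
      while (rs.next()) {
        System.out.printf("%s: %d temp files, %d bytes%n",
            rs.getString("datname"), rs.getLong("temp_files"), rs.getLong("temp_bytes"));
      }
    }
  }
}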
Having a Postgres DB on AWS RDS, the swap usage is constantly rising.
Why is it rising? I tried rebooting but it does not go back down. AWS writes that high swap usage is "indicative of performance issues".
I am writing data to this DB. CPU and Memory do look healthy:
To be precise, I have a db.t2.micro instance with, at the moment, ~30/100 GB of data in 5 tables on General Purpose SSD, with the default postgresql.conf.
The swap-graph looks as follows:
Swap Usage warning:
Well, it seems that your queries are using more memory than you have available, so you should look at your queries' execution plans and find the largest loads. Those queries exceed the memory available to PostgreSQL. Excessive joining (i.e. a bad database structure that would be better denormalized, if applicable), lots of nested queries, or queries with IN clauses are the typical suspects. I guess Amazon delivered as much as possible in postgresql.conf and those default values are quite good for this tiny machine.
But once again, as long as your swap usage does not exceed your available memory and you are on an SSD, there is not that much harm in it.
Check the output of
select * from pg_stat_activity;
to see which processes are taking long and how many are sleeping, then try to change your RDS DB parameters according to your needs.
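For example (a sketch in the same JDBC style as the questions above, with a hypothetical DataSource), you can narrow that down to the longest-running statements and a count of backends per state:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import javax.sql.DataSource;

public class ActivityCheck {
  static void printActivity(DataSource ds) throws Exception {
    try (Connection conn = ds.getConnection();
         Statement st = conn.createStatement()) {
      // Longest-running statements first.
      try (ResultSet rs = st.executeQuery(
          "SELECT pid, state, now() - query_start AS runtime, query "
        + "FROM pg_stat_activity ORDER BY runtime DESC NULLS LAST")) {
        while (rs.next()) {
          System.out.printf("%d %s %s %s%n", rs.getInt("pid"),
              rs.getString("state"), rs.getString("runtime"), rs.getString("query"));
        }
      }
      // How many backends are idle ("sleeping") vs. active.
      try (ResultSet rs = st.executeQuery(
          "SELECT state, count(*) FROM pg_stat_activity GROUP BY state")) {
        while (rs.next()) {
          System.out.printf("%s: %d%n", rs.getString(1), rs.getLong(2));
        }
      }
    }
  }
}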
Obviously you ran out of memory. db.t2.micro has only 1GB of RAM. You should look at htop output to see which processes take most of the memory and try to optimize memory usage. Also, there is a nice utility called pgtop (http://ptop.projects.pgfoundry.org/) which shows current queries, number of rows read, etc. You can use it to view your Postgres state in real time. By the way, if you cannot install pgtop you can get much the same information from Postgres's internal tools - check out the documentation of the Postgres stats collector: https://www.postgresql.org/docs/9.6/static/monitoring-stats.html
Actually it is difficult to say exactly what the problem is, but db.t2.micro is a very limited instance. You should consider moving to a bigger instance, especially if you are using Postgres in production.
I use PostgreSQL 9.5 and have huge_pages left at its default value of try. How can I determine whether Postgres is using huge pages while the server is running?
huge_pages (enum)
Enables/disables the use of huge memory pages. Valid values are try (the default), on, and off.
At present, this feature is supported only on Linux. The setting is ignored on other systems when set to try.
The use of huge pages results in smaller page tables and less CPU time spent on memory management, increasing performance. For more details, see Section 17.4.4.
With huge_pages set to try, the server will try to use huge pages, but fall back to using normal allocation if that fails. With on, failure to use huge pages will prevent the server from starting up. With off, huge pages will not be used.
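The documentation quote doesn't directly answer the runtime question. One indirect check, sketched below under the assumption of a Linux host (the only platform where huge_pages is supported, per the quote above), is to look at the kernel's huge page counters in /proc/meminfo: if HugePages_Total is non-zero and HugePages_Free drops (or HugePages_Rsvd rises) once PostgreSQL is started, the try setting succeeded.

import java.nio.file.Files;
import java.nio.file.Path;

public class HugePagesCheck {
  // Prints the kernel huge page counters; compare them with the server stopped
  // vs. running to see whether Postgres actually grabbed huge pages.
  public static void main(String[] args) throws Exception {
    for (String line : Files.readAllLines(Path.of("/proc/meminfo"))) {
      if (line.startsWith("HugePages_") || line.startsWith("Hugepagesize")) {
        System.out.println(line);
      }
    }
  }
}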