I run nightly jobs with quite a few long-running, memory-heavy queries on a Cloud SQL Postgres instance (PostgreSQL 12.11 with 12 CPUs and 40 GB of memory).
The workload and the amount of data have increased lately, and I've started seeing issues with the DB more and more often: the nightly jobs (when the DB is under the most load) run forever and never succeed, or time out. As I understand it, this is because of memory usage (the Total memory usage section also shows memory reaching 100% of capacity during peak hours).
The only thing that helps is a restart, which frees the memory, but that is only an emergency short-term fix.
From the database configs, I have these set:
work_mem increased to 400MB = 1% of RAM (1-5% recommended). Should I increase or decrease it?
maintenance_work_mem increased to 4GB = 10% of RAM (10-20% recommended)
shared_buffers = 13.5GB (default)
How can I configure the instance to handle the load without having to increase resources? Maybe there is a way to free RAM without restarting the instance?
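For illustration, instead of raising work_mem globally, the heavy nightly queries could get a larger limit only for their own role or session (just a sketch; "nightly_batch" is a hypothetical role name and 256MB is an arbitrary value, not something from my setup):

-- Give only the batch role a larger work_mem; other connections keep the global default.
ALTER ROLE nightly_batch SET work_mem = '256MB';

-- Or override it per session, right before the heavy query:
SET work_mem = '256MB';
-- ... run the memory-heavy query ...
RESET work_mem;

Each sort or hash node in a plan can use up to work_mem, so many concurrent nightly queries multiply the total memory footprint.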
Thank you so much in advance!
I have an RDS Aurora Postgres instance crashing with an OOM error when the freeable memory gets close to 0. The error triggers a database restart, and once the freeable memory drops again over time, another restart happens. After each restart, the freeable memory goes back to around 16,500 MB.
Does anyone have any idea why this is happening?
Some information:
Number of ongoing idle connections: around 450
Instance class = db.r6g.2xlarge
vCPU = 8
RAM = 64 GB
Engine version = 11.9
shared_buffers = 43926128kB
work_mem = 4MB
temp_buffers = 8MB
wal_buffers = 16MB
max_connections = 5000
maintenance_work_mem = 1042MB
autovacuum_work_mem = -1
These connections are kept alive for reuse by the application; when new queries go to the database, one of these connections is used. This is not a connection-pool implementation; there are simply 100 instances of my application connected to the database.
It seems some of these connections/processes are "eating" memory over time. Checking the OS processes, I saw that some of them keep increasing their RES memory. For example, one idle process had a RES of 920.9 MB, but now it is at 3.96 GiB.
The RES metric refers to the physical memory used by the process, as per this AWS doc: https://docs.amazonaws.cn/en_us/AmazonRDS/latest/AuroraUserGuide/USER_Monitoring.OS.Viewing.html
I'm wondering if this issue is related to these idle connections as described here: https://aws.amazon.com/blogs/database/resources-consumed-by-idle-postgresql-connections/
Maybe I should reduce the number of connections to the database.
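Before cutting connections, it might help to see how long the sessions actually sit idle and, if needed, close the oldest ones server-side. A rough sketch (the 1-hour threshold is only an assumption, and the application must be able to reconnect):

-- Diagnostic: how many client sessions are in each state, and the longest time spent in that state.
SELECT state,
       count(*) AS sessions,
       max(now() - state_change) AS longest_in_state
FROM pg_stat_activity
WHERE backend_type = 'client backend'
GROUP BY state;

-- If long-idle sessions turn out to be the culprit, they can be closed server-side
-- (the 1-hour cutoff is an assumed threshold).
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle'
  AND now() - state_change > interval '1 hour';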
Freeable Memory on CloudWatch graph:
General metrics on CloudWatch:
Enhanced monitoring metrics:
OS processes (around 100 ongoing processes):
I have a cloud server on Cloudways. The CPU load is very high even after I upgraded my server by two levels, but the strange thing is that the RAM is almost free (the server has 16 GB of RAM and 6 cores). Is there anything we can do to take advantage of that free RAM to reduce the CPU load?
Regards
No, CPU and RAM are different things.
Check the reason why your CPU is so highly loaded.
Maybe the host your VM runs on is overloaded. Did you try to contact your cloud provider?
I have installed Nominatim on a server dedicated just to OSM data, with the following configuration: CentOS 7 operating system, 2x Intel Xeon CPU L5420 @ 2.50GHz (8 CPU cores in total), 16 GB of RAM, and 2x 2TB SATA hard drives.
I've configured PostgreSQL based on the recommendations in the Nominatim install wiki (http://wiki.openstreetmap.org/wiki/Nominatim/Installation#PostgreSQL_Tuning), taking into account that my machine has only 16 GB instead of the 32 GB those settings were written for. I've used the following settings:
shared_buffers = 1GB # recommended for a 32GB machine was 2 GB
maintenance_work_mem = 4GB # recommended for a 32GB machine was 8 GB
work_mem = 20MB # recommended for a 32GB machine was 50 MB
effective_cache_size = 10GB # recommended for a 32GB machine was 24 GB
synchronous_commit = off
checkpoint_segments = 100
checkpoint_timeout = 10min
checkpoint_completion_target = 0.9
fsync = off
full_page_writes = off
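As a quick sanity check (sketch only), the values the running server actually uses can be compared against the file, in case a missed restart left defaults in place:

-- Show the effective value and its source for the tuned parameters.
SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('shared_buffers', 'maintenance_work_mem', 'work_mem',
               'effective_cache_size', 'synchronous_commit', 'fsync');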
First, I tried importing a small country extract (Luxembourg), setting a cache size of 6000 and using the setup.php file from utils; it imported successfully in under an hour.
Secondly, I deleted the Luxembourg data and, as another test, imported the country extract of Great Britain with a cache size of 8000; it imported successfully as well, in around 2-3 hours.
Today I decided to try importing the whole planet.pbf file, so I deleted the PostgreSQL database, downloaded a PBF of the planet from one of the official mirror sites, and ran the setup with a cache size of 10000. Beforehand, I read some benchmarks to get a rough idea of how much time and space the operation would take.
When the import started, I was very surprised. The import of the nodes ran at a whopping 1095.6k/s; in the benchmark I analyzed (a machine with 32 GB of RAM), it was only 311.7k/s.
But when the import of the nodes finished and the import of the ways started, the speed dropped significantly. It was importing the ways at 0.16k/s (although it was slowly rising: it started at 0.05k/s and over 4 hours rose to the value mentioned above).
I stopped the import and tried to tweak the settings. I allocated a higher cache size first (12000), but with no success: the nodes imported very quickly, but the ways stayed at 0.10-0.13k/s. I then tried allocating a new swap file (the original was 8 GB; I added another 32 GB of swap), but that didn't change anything either. Lastly, I edited setup.php, changed --number-processes from 1 to 6, and included the --slim keyword where osm2pgsql is started from there, but nothing changed.
Right now I am out of ideas. Is this speed decrease normal? Should I upgrade my machine to the recommended amount of memory? I thought 16 GB of RAM would be enough for the planet PBF; I was aware it could take more time on this machine than with 32 GB, but this seems like a lot. If the whole planet import took no more than 12-15 days, I would be fine with that, but as things look now, with these settings the import would take around 2 months, and that is just too much, considering an error could occur anywhere and I would have to start the whole import over again.
Any ideas what could cause this problem, or what other tweaks I could try to speed up the import process?
Thanks
I had a similar performance problem using SATA drives; when I replaced the SATA drives with SSDs, the ways import sped up from 0.02k/s to 8.29k/s. Now I have a very slow relations import, which is running at 0.01/s, so I believe memory is also an important factor for a full-planet import, but I have not tested that again.
I'm still fighting with MongoDB, and I think this war will not end any time soon.
My database has a size of 15.95 GB;
Objects - 9963099;
Data Size - 4.65g;
Storage Size - 7.21g;
Extents - 269;
Indexes - 19;
Index Size - 1.68g;
Powered by:
Quad Xeon E3-1220, 4 × 3.10 GHz / 8 GB RAM
A dedicated server is too expensive for me.
On a VPS with 6 GB of memory, the database won't import.
Should I migrate to a cloud service?
https://www.dotcloud.com/pricing.html
I tried to pick a plan there, but the maximum for MongoDB is 4 GB of memory (USD 552.96/month o_0), and I can't even import my database with that; there isn't enough memory.
Or is there something I don't know about cloud services (I have no experience with them)?
Are cloud services simply not suited to a large MongoDB database?
2 x Xeon 3.60 GHz, 2M Cache, 800 MHz FSB / 12Gb
http://support.dell.com/support/edocs/systems/pe1850/en/UG/p1295aa.htm
Will my database work on that server?
Of course, this is all fun and good development experience, but it's already starting to get tiresome... =]
You shouldn't have an issue with a DB of this size. We were running a MongoDB instance on Dotcloud with hundreds of GB of data. It may just be that Dotcloud only allows 10 GB of HDD space per service by default.
We were able to back up and restore that instance on 4 GB of RAM, albeit that it took several hours.
I would suggest you email them directly at support@dotcloud.com to get help increasing the HDD allocation of your instance.
You can also consider using ObjectRocket, which is MongoDB as a service. For a 20 GB database the price is $149 per month - http://www.objectrocket.com/pricing