Identify source of increased PostgreSQL shared memory - postgresql

After upgrading from PostgreSQL 9.6 to 10 and also updating the overlying application (Trend Micro Deep Security), we see an increase in the overall shared memory utilization by more than 300%.
Currently the shared memory is around 3GB, which aligns with the shared_buffer parameter, set to 3072MB. Using ps, top and pmap I was able to tell that almost the entire shared memory is used by postgres-related processes.
However, I would like to know what's the actual cause of that increase. Is there anyway to identify the real root cause?

Related

Postgresql Aurora DB freeable_memory

I have a question regarding the freeable memory for AWS Aurora Postgres.
We recently wanted to create an index on one of our dbs and the db died and made a failover to the slave which all worked fine. It looks like the freeable memory dropped by the configured 500mb of maintenance_work_mem and by that went to around 800mb of memory - right after that the 32gig instance died.
1) I am wondering if the memory that is freeable is the overall system memory and if a low memory here could invoke the system oom killer on the AWS Aurora instance? So we may want to plan in more head room for operational tasks and the running of autovacuum jobs to not encounter this issue again?
2) The actual work of the index creation should then have used the free local storage as far as I understood, so the size of the index shouldn't have mattered, right?
Thanks in advance,
Chris
Regarding 1)
Freeable Memory from (https://forums.aws.amazon.com/thread.jspa?threadID=209720)
The freeable memory includes the amount of physical memory left unused
by the system plus the total amount of buffer or page cache memory
that are free and available.
So it's freeable memory across the entire system. While MySQL is the
main consumer of memory on the host we do have internal processes in
addition to the OS that use up a small amount of additional memory.
If you see your freeable memory near 0 or also start seeing swap usage
then you may need to scale up to a larger instance class or adjust
MySQL memory settings. For example decreasing the
innodb_buffer_pool_size (by default set to 75% of physical memory) is
one way example of adjusting MySQL memory settings.
That also means that if the memory gets low its likely to impact the process in some form. In this thread (https://forums.aws.amazon.com/thread.jspa?messageID=881320&#881320) e.g. it was mentioned that it caused the mysql server to restart.
Regarding 2)
This is like it is described in the documentation (https://aws.amazon.com/premiumsupport/knowledge-center/postgresql-aurora-storage-issue/) so I guess its right and the size shouldn't have mattered.
Storage used for temporary data and logs (local storage). All DB
temporary files (for example, logs and temporary tables) are stored in
the instance local storage. This includes sorting operations, hash
tables, and grouping operations that are required by queries.
Each Aurora instance contains a limited amount of local storage that
is determined by the instance class. Typically, the amount of local
storage is twice the amount of memory on the instance. If you perform
a sort or index creation operation that requires more memory than is
available on your instance, Aurora uses the local storage to fulfill
the operation.

mongodb limit server memory consumption

I'm testing running monbodb on the kubernetes platform where I can limit the resources used by the running container.
Say I set the memory limit to 256Mb. The problem is that for example while making backup memory consumption increases to the limit and container gets restarted by kubernetes.
So the question is is there a way to limit mongodb memory consumption for my case so that it would not cause the crush by exeeding memory limit set by platform.
I could of course increase the limit but I'm interested in a principal solution and would like to understand this process better because I don't really now how memory consumed by mongodb and container os. Is it possible to tune mongodb/underlying linux os to work inside existing limits.
The limits that you have set are good enough for a monogodb pod, these are the limits used by the community as well.
The only way I think you can get around this for backups is to increase the memory limits, but still it might fail, because in other places on stackoverflow people have experienced OOM killing on VMs with memory of giga bytes. MongoDB basically tries to eat any and every memory that is made available to it.
Also there are other ways to backup mongodb: https://dba.stackexchange.com/questions/76130/how-to-backup-large-mongodb-database
I am not sure how this aligns in the k8s world.

Mongo Server Status - "Resident" memory

After starting Mongo via mongod, I ran a Mongo query that took 300 seconds. Calling db.serverStatus() on my "admin" db showed Mongo having resident memory of 1 GB. The docs explain that "resident" memory is the amount of physical disk/RAM that Mongo uses.
Then, I re-ran the same query, but it took 8 seconds. Looking at the resident memory this time, I saw 5 GB.
The large increase in RAM, I believe, helps to explain why the query time shrank from 300 to 8 seconds, but why did the resident memory jump so quickly?
Is there some type of "warming" step recommended to prepare Mongo so as to avoid 300 second queries?
There reason behind that MongoDB uses mmap functionality of the operating system. This means, at least on Linux systems That the memory handling of the mongodb is based on some functionality of the operating system called memory mapped files.
The memory in Linux systems is addressed in several levels basically any program will see an address space on 32-bit systems of 2GB over all, on 64-bit systems 128TB. This is a virtual address space which means on 32/64bit that amount of memory can be addressed with 4kb memory pages(page is the individually handled part of the memory). That is why if you start mongoDB on a 32 bit system it will rise a warning that the database on such system can handle only 2GB of data. Obviously this virtual address space is bigger than the amount of physical memory so there is a mapping between these virtual addresses and the physical ones. Some of the virtual addresses reside in really physical memory so they are in the real memory,but the algorithm which ensures this is on the side of the kernel. Programs running on Linux systems can deal only with virtual addresses, if one tries to access a virtual memory address which is not in physical memory, a pagefault occurs (you can track this on the serverStatus commands extra info field). (You can find short explanation of this here)
Accessing memory in case if the virtual address reside in physical memory is as fast as the memory, accessing a virtual address which has no physical currently means a paging from disk to memory and read the memory so as fast as the disks random read. (This makes the different in your case)
There is a command in mongoDB which with you can enforce the caching of a collection or an index this command is the touch
If you use this command to load the data into memory before the first query you will get the results in 8sec at first try. Unfortunately you cannot really force the OS to keep always this in memory, so if you have others things using up the memory OS will page out this data in some time.
IF you have enough physical memory mongoDB will keep everything the data and indexes in memory. This not always needed. There is a portion of data which need to be in memory to avoid extensive amount of pagefaults this is the workingset. You can check the size of the working set with the db.runCommand( { serverStatus: 1, workingSet: 1 } ) command.
You cannot handle the paging while it is OS level, but if you have enough memory usually the kernel likes to keep as much stuff cached as it can. If the workingset fits in memory you are more or less ok. If some documents really rarely accessed and there is not enough memory to keep everything there they will be paged out anyway.
When you run a query several things can happen. An index can cover which means no documents will be touched at all, if your query is selective in some notion only a part of the index will be touched. unfortunately it is really hard to define memory is sufficient and the only thing what you can do is to monitor (the workingset metric is an estimation). The symptom of running out of memory can be identified check this presentation. And use MMS.

How to keep 32 bit mongodb memory usage down on changing dataset

I'm using MongoDB on a 32 bit production system, which sucks but it's out of my control right now. The challenge is to keep the memory usage under ~2.5GB since going over this will cause 32 bit systems to crash.
According to the mongoDB team, the best way to track the memory usage is to use your operating system's process tracking system (i.e. ps or htop on Unix systems; Process Explorer on Windows.) for virtual memory size.
The DB mainly consists of one table which is continually cycling data, i.e. receiving data at regular intervals from sensors, and every day a cron job wipes all data from before the last 3 days. Over a period of time, the memory usage slowly increases. I took some notes over time using db.serverStats(), db.lectura.totalSize() and ps, shown in the chart below. Note that the size of the table in question has reduced in the last month but the memory usage increased nonetheless.
Now, there is some scope for adjustment in how many days of data I store. Today I deleted basically half of the data, and then restarted mongodb, and yet the mem virtual / mem mapped and most importantly memory usage according to ps have hardly changed! Why do these not reduce when I wipe data (and restart)? I read some other questions where people said that mongo isn't really using all the memory that it might appear to be using, and that you can't clear the cache or limit memory use. But then how can I ensure I stay under the 2.5GB limit?
Unless there is a way to stem this dataset-size-irrespective gradual increase in memory usage, it seems to me that the 32-bit version of Mongo is unuseable. Note: I don't mind losing a bit of performance if it solves the problem.
To answer regarding why the mapped and virtual memory usage does not decrease with the deletes, the mapped number is actually what you get when you mmap() the entire set of data files. This does not shrink when you delete records, because although the space is freed up inside the data files, they are not themselves reduced in size - the files are just more empty afterwards.
Virtual will include journal files, and connections, and other non-data related memory usage also, but the same principle applies there. This, and more, is described here:
http://www.mongodb.org/display/DOCS/Checking+Server+Memory+Usage
So, the 2GB storage size limitation on 32-bit will actually apply to the data files whether or not there is data in them. To reclaim deleted space, you will have to run a repair. This is a blocking operation and will require the database to be offline/unavailable while it was run. It will also need up to 2x the original size in terms of free disk space to be able to run the repair, since it essentially represents writing out the files again from scratch.
This limitation, and the problems it causes, is why the 32-bit version should not be run in production, it is just not suitable. I would recommend getting onto a 64-bit version as soon as possible.
By the way, neither of these figures (mapped or virtual) actually represents your resident memory usage, which is what you really want to look at. The best way to do this over time is via MMS, which is the free monitoring service provided by 10gen - it will graph virtual, mapped and resident memory for you over time as well as plenty of other stats.
If you want an immediate view, run mongostat and check out the corresponding memory columns (res, mapped, virtual).
In general, when using 64-bit builds with essentially unlimited storage, the data will usually greatly exceed the available memory. Therefore, mongod will use all of the available memory it can in terms of resident memory (which is why you should always have swap configured to the OOM Killer does not come into play).
Once that is used, the OS does not stop allocating memory, it will just have the oldest items paged out to make room for the new data (LRU). In other words, the recycling of memory will be done for you, and the resident memory level will remain fairly constant.
Your options for stretching 32-bit are limited, but you can try some things. The thing that you run out of is address space, and the increases in the sizes of additional database files mean that you would like to avoid crossing over the boundary from "n" files to "n+1". It may be worth structuring your data into more or fewer databases so that you can get the maximum amount of actual data into memory and as little as possible "dead space".
For example, if your database named "mydatabase" consists of the files mydatabase.ns (the namespace file) at 16 MB, mydatabase.0 at 64 MB, mydatabase.1 at 128 MB and mydatabase.2 at 256 MB, then the next file created for this database will be mydatabase.3 at 512 MB. If instead of adding to mydatabase you instead created an additional database "mynewdatabase" it would start life with mynewdatabase.ns at 16 MB and mynewdatabase.0 at 64 MB ... quite a bit smaller than the 512 MB that adding to the original database would be. In fact, you could create 4 new databases for less space than would be consumed by adding a new file to the original database, and because the files are smaller they would be easier to fit into contiguous blocks of memory.
It is a well-known message that 32-bit should not be used for production.
Use 64-bit systems.
Point.

Postgres causing swapping on CentOS

All,
I am running CentOS 6.0 with Postgresql 8.4 and can't seem to figure out how to prevent so much disc swap from occurring. I have 12 gigs of RAM and 4 processors and I am doing some simple updates (1 table at a time). I thought for a minute that the inserts happening in parallel from a script I wrong was causing the large memory usage but when I saw the simple update causing it too I basically threw in the towel and decided to ask for help.
I pasted the conf file here. http://pastebin.com/e0jdBu0J
You can see that I set the buffers relatively low and the connection amounts high. The DB service will not start if I set the shared buffers any higher than 64 megs. Anyone have an idea what may be causing this for me?
Thanks,
Adam
If you're going into swap, increasing shared_buffers will make the problem worse; you'll be taking RAM away from the part that's running out and swapping, instead dedicating memory to the database caching. It's worth fixing SHMMAX etc. just on general principle and for later tuning work, but that's not going to help with this problem.
Guessing at the identify of your memory gobbling source is a crapshoot. Far better to look at data from "top -c" and ps to find which processes are using a lot of it. It's possible for a really bad query to consume way more memory than it should. If you see memory use spike up for a PostgreSQL process running something, check the process ID against the information in pg_stat_tables to see what it's doing.
There are a couple of things that can cause this sort of issue that often surprise people. If you are doing a large number of row updates in a single transaction, and there are foreign key checks or triggers involved, that can run out of memory. The queue of things to check in each of those cases is kept in RAM, and can be surprisingly big.
There are two problems with your PostgreSQL settings that might be related. Databases don't actually work very well if you have a lot more active connections than cores in the server; best performance is normally 2 to 3 active clients per core. And all sorts of things go wrong once you've got more than a few hundred connection. There is some connections^2 behavior that gets ugly there performance wise, and there are some memory issues too. If you really need 1250 connections, you should be using a connection pooler such as pgBouncer or pgpool-II.
And effective_io_concurrency = 1000 is way too high for any hardware on the planet. Useful values for that in a small multiple of how many disks you have in the server. I have no idea what happens as far as memory usage goes when you set it that high, but it's not been tested very well at that range. Normal settings more like 1 to 25. The parameters outlined at Tuning Your PostgreSQL Server are much more important than it is; the concurrency value only impacts one particular type of table scan.
Centos 6 seems to have a very conservative shmmax as a default
Set your shared buffers to that recommended by postgres tuning resources
see for explanation and how to set.
To experiment you can (as root) use sysctl -w kernel.shmmax = n
where n is the value that the startup error message that postgres is trying to allocate on startup. When you identify the value you wish to use permanently then set that in /etc/sysctl.conf