Postgres Checkpointer process always running

I am running PostgreSQL server and limited the shared_buffers to 4GB.
When I insert a large number of records into the database, the checkpointer process starts consuming RAM. The process neither exits nor reduces its RAM consumption, even after a day.
Any idea why this is happening?

That is perfectly fine. The memory reported as allocated to the process is in fact the shared memory of shared_buffers that remains allocated for the entire life of the PostgreSQL server process.
Since it is the job of the checkpointer to read dirty pages from shared buffers and write them to disk, it is to be expected that the process reads a lot of that memory.
If you want to reduce the amount of disk I/O the checkpointer has to perform during bulk inserts, increase max_wal_size.
To see how much RAM is still free on your machine, consult the “available” field of the free output.
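On recent Linux systems the "available" column of free corresponds to the MemAvailable line in /proc/meminfo, so the same number can be read from a script. A minimal sketch, assuming a Linux host; the function name and the sample values are made up for illustration:

```python
def mem_available_kib(meminfo_text):
    """Return the MemAvailable value (in KiB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # The line looks like: "MemAvailable:    6000000 kB"
            return int(line.split()[1])
    raise ValueError("no MemAvailable line found")


# Invented sample in the /proc/meminfo layout, not taken from the question.
SAMPLE = """MemTotal:        8000000 kB
MemFree:          500000 kB
MemAvailable:    6000000 kB
"""

print(mem_available_kib(SAMPLE))
```

On a real machine you would pass in the contents of /proc/meminfo, for example open("/proc/meminfo").read().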

The checkpointer is a permanent process. It can nap for long periods of time when there is nothing to do, but it doesn't go away while the server is still running. Most of the memory it uses is the same memory being shared by everyone else. The top command (which I think is what you are showing there) includes all the shared memory a process has ever touched, and doesn't prorate it for the number of other processes it is being shared with.
Indeed if shared_buffers is 4GB, then the numbers you report here are suspiciously low, not high.
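To see the prorated figure that top does not show, newer Linux kernels expose /proc/<pid>/smaps_rollup, where Rss counts every shared page in full and Pss divides each shared page by the number of processes mapping it. A rough sketch, assuming that file is available; the function name and the sample numbers are invented for illustration:

```python
def rss_and_pss_kib(smaps_rollup_text):
    """Return (Rss, Pss) in KiB from /proc/<pid>/smaps_rollup-style text."""
    fields = {}
    for line in smaps_rollup_text.splitlines():
        parts = line.split()
        # Data lines look like "Rss:   4200000 kB"; the header line does not.
        if len(parts) == 3 and parts[0].endswith(":") and parts[2] == "kB":
            fields[parts[0][:-1]] = int(parts[1])
    return fields["Rss"], fields["Pss"]


# Invented sample: a process that has touched about 4 GB of shared buffers
# which are also mapped by many other backends.
SAMPLE = """00400000-7ffd5a3f1000 ---p 00000000 00:00 0    [rollup]
Rss:             4200000 kB
Pss:              350000 kB
"""

rss, pss = rss_and_pss_kib(SAMPLE)
print(rss, pss)
```

A large gap between the two values is what you would expect for a checkpointer whose memory is almost entirely shared_buffers.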

Related

PostgreSQL suddenly takes all the disk space

I am facing a very strange issue on my server, my configuration is very straight-forward:
Small VPS, 500 MiB RAM, 40 GiB disk
Debian stable at install time, now probably old_stable
PostgreSQL v11.11
The data is very small, the use of a database for my purpose is probably overkill, but handy:
7 tables
7 views, including one of them which is a little bit scary
The biggest table has a few hundred records
The full dump of the database gives me a file of 93 KiB
Everything was very fast for 1.5 years. Yesterday, the database suddenly became very slow. My investigations showed that the size of the data on the disk was 34 GiB and I had no disk space available anymore.
After more investigation, I tried the command "vacuum full", which freed the useless 34 GiB. Disk usage dropped from 100% to 10% and performance came back immediately. One day later, the system is slow again, and disk usage is now around 50%.
I have no clue about what is going on, any suggestion?
I'd recommend reading Optimize and Improve PostgreSQL Performance with VACUUM, ANALYZE, and REINDEX and Routine Vacuuming. Here are some relevant bits.
In normal PostgreSQL operation, tuples that are deleted or obsoleted by an update are not physically removed from their table
You must have done a lot of deletes and updates, so Postgres consumed a lot of disk space. vacuum makes this space reusable. vacuum full isn't normally necessary and takes an exclusive lock on each table while it rewrites it.
Normally there is an autovacuum daemon running which will vacuum periodically. It probably isn't running. Check with show autovacuum and show track_counts. Both need to be on for autovacuum to run.
You can see what is "bloating" your database with the check_postgres tool.
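As a rough first check before reaching for a dedicated tool, the dead-tuple counters in pg_stat_user_tables can point at the tables that are accumulating garbage. A minimal sketch, assuming you fetch the rows yourself with whatever client you use; the function name, thresholds, and sample table names are hypothetical:

```python
# Query whose result rows the function below expects (run it with any client).
STATS_QUERY = """
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
"""


def bloat_suspects(rows, min_dead=1000, max_dead_ratio=0.2):
    """Return names of tables whose dead-tuple share exceeds max_dead_ratio.

    The thresholds are arbitrary illustration values, not recommendations.
    """
    suspects = []
    for row in rows:
        live, dead = row["n_live_tup"], row["n_dead_tup"]
        total = live + dead
        if dead >= min_dead and total > 0 and dead / total > max_dead_ratio:
            suspects.append(row["relname"])
    return suspects


# Invented sample rows; the table names are hypothetical.
SAMPLE_ROWS = [
    {"relname": "events", "n_live_tup": 300, "n_dead_tup": 900000,
     "last_autovacuum": None},
    {"relname": "users", "n_live_tup": 200, "n_dead_tup": 3,
     "last_autovacuum": None},
]

print(bloat_suspects(SAMPLE_ROWS))
```

A table with a few hundred live rows, a huge dead-tuple count, and no last_autovacuum timestamp would fit the symptoms described in the question.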

why is kdb process showing high memory usage on system?

I am running into serious memory issues with my kdb process. Here is the architecture in brief.
The process runs in slave mode (4 slaves). It loads a ton of data from database into memory initially (total size of all variables loaded in memory calculated from -22! is approx 11G). Initially this matches .Q.w[] and close to unix process memory usage. This data set increases by very little in incremental operations. However, after a long operation, although the kdb internal memory stats (.Q.w[]) show expected memory usage (both used and heap) ~ 13 G, the process is consuming close to 25G on the system (unix /proc, top) eventually running out of physical memory.
Now, when I run garbage collection manually (.Q.gc[]), it frees up memory and brings unix process usage close to heap number displayed by .Q.w[].
I am running Q 2.7 version with -g 1 option to run garbage collection in immediate mode.
Why is unix process usage so significantly different from the kdb internal statistics -- where is the difference coming from? Why is the "-g 1" option not working? When I run a simple example, it works fine. But in this case, it seems to leak a lot of memory.
I tried version 2.6, which is supposed to have automated garbage collection. Surprisingly, there is still a huge difference between the used and heap numbers from .Q.w when running with version 2.6, both in single-threaded (each) and multi-threaded (peach) modes. Any ideas?
I am not sure of the concrete answer, but this is my deduction based on the following information from the wiki (and some practical experiments):
http://code.kx.com/q/ref/control/#peach
It says:
Memory Usage
Each slave thread has its own heap, a minimum of 64MB.
Since kdb 2.7 2011.09.21, .Q.gc[] in the main thread executes gc in the slave threads too.
Automatic garbage collection within each thread (triggered by a wsful, or hitting the artificial heap limit as specified with -w on the command line) is only executed for that particular thread, not across all threads.
Symbols are internalized from a single memory area common to all threads.
My observations:
Thread Specific Memory:
.Q.w[] only shows stats for the main thread, not the sum over all threads (total process memory). This can be tested by starting 'q' with 2 threads. Total memory in that case should be at least 128MB as per point 1, but .Q.w[] still shows 64 MB.
That's why, in your case, the memory stats were close to the unix stats at the start: all the data was in the main thread and nothing was on the other threads. After some operations, some threads might have taken memory (used or garbage) which is not shown by .Q.w[].
Garbage collector call
As mentioned on the wiki, calling the garbage collector on the main thread calls GC on all threads. So that might have collected the garbage memory from the threads and reduced the total memory usage, which was reflected in the reduced unix memory stats.

PostgreSQL Table in memory

I created a database containing a total of 3 tables for a specific purpose. The total size of all tables is about 850 MB - very lean... out of which one single table contains about 800 MB (including index) of data and 5 million records (daily addition of about 6000 records).
The system is PG-Windows with 8 GB RAM Windows 7 laptop with SSD.
I allocated 2048MB as shared_buffers, 256MB as temp_buffers and 128MB as work_mem.
I execute a single query multiple times against the single table - hoping that the table stays in RAM (hence the above parameters).
But, although I see a spike in memory usage during execution (by about 200 MB), I do not see memory consumption staying at 500 MB or more (for the data to stay in memory). All running postgres.exe processes show 2-6 MB in Task Manager. Hence, I suspect the LRU does not keep the data in memory.
Average query execution time is about 2 seconds (very simple single table query)... but I need to get it down to about 10-20 ms or even less if possible, purely because the same query is going to be executed so many times, and that can be achieved only by keeping stuff in memory.
Any advice?
Regards,
Kapil
You should not expect postgres processes to show large memory use, even if the whole database is cached in RAM.
That is because PostgreSQL relies on buffered reads from the operating system buffer cache. In simplified terms, when PostgreSQL does a read(), the OS looks to see whether the requested blocks are cached in the "free" RAM that it uses for disk cache. If the block is in cache, the OS returns it almost instantly. If the block is not in cache the OS reads it from disk, adds it to the disk cache, and returns the block. Subsequent reads will fetch it from the cache unless it's displaced from the cache by other blocks.
That means that if you have enough free memory to fit the whole database in "free" operating system memory, you won't tend to hit the disk for reads.
Depending on the OS, behaviour for disk writes may differ. Linux will write-back cache "dirty" buffers, and will still return blocks from cache even if they've been written to. It'll write these back to the disk lazily unless forced to write them immediately by an fsync() as Pg uses at COMMIT time. When it does that it marks the cached blocks clean, but doesn't flush them. I don't know how Windows behaves here.
The point is that PostgreSQL can be running entirely out of RAM with a 1GB database, even though no PostgreSQL process seems to be using much RAM. Having shared_buffers too high just leads to double-caching and can reduce the amount of RAM available for the OS to cache blocks.
It isn't easy to see exactly what's cached in RAM because Pg relies on the OS cache. That's why I referred you to pg_fincore.
If you're on Windows and this won't work, you really just have to rely on observing disk activity. Does performance monitor show lots of uncached disk reads? Does operating system memory monitoring show lots of memory used for disk cache in the OS?
Make sure that effective_cache_size correctly reflects the RAM used for disk cache. It will help PostgreSQL choose appropriate query plans.
You are making the assumption, without apparent evidence, that the query performance you are experiencing is explained by disk read delays, and that it can be improved by in-memory caching. This may not be the case at all. You need to look at explain analyze output and system performance metrics to see what's going on.
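One concrete way to gather that evidence is EXPLAIN (ANALYZE, BUFFERS), whose text output includes lines such as "Buffers: shared hit=... read=...". Here "hit" means the block was found in shared_buffers, while "read" means PostgreSQL asked the operating system for it, which may still be served from the OS cache rather than the disk. A small sketch that pulls those two counters out of a plan; the function name and the sample plan (including its table name and timings) are invented for illustration:

```python
import re

# Matches e.g. "Buffers: shared hit=2048 read=98000"; either counter may be absent.
BUFFERS_RE = re.compile(r"Buffers: shared(?: hit=(\d+))?(?: read=(\d+))?")


def shared_buffer_counts(plan_text):
    """Return (hit, read) from the first 'Buffers: shared' line in plan_text."""
    match = BUFFERS_RE.search(plan_text)
    if match is None:
        raise ValueError("no 'Buffers: shared' line; was BUFFERS enabled?")
    hit = int(match.group(1) or 0)
    read = int(match.group(2) or 0)
    return hit, read


# Invented sample plan text, not output from the poster's system.
SAMPLE_PLAN = """Seq Scan on big_table  (cost=0.00..150000.00 rows=5000000 width=64) (actual time=0.020..1800.000 rows=5000000 loops=1)
  Buffers: shared hit=2048 read=98000
Execution Time: 2000.000 ms
"""

hit, read = shared_buffer_counts(SAMPLE_PLAN)
print(hit, read)
```

If "read" is near zero and the query still takes 2 seconds, caching is not the bottleneck and the plan itself (for example a sequential scan over 5 million rows) deserves the attention.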

MongoDB Stops Responding During Background Flush

Mongodb Background Flushing blocks all the requests:
Server: Windows server 2008 R2
CPU Usage: 10 %
Memory: 64G, Used 7%, 250MB for Mongod
Disk % Read/Write Time: less than 5% (According to Perfmon)
Mongodb Version: 2.4.6
Mongostat Normally:
insert:509 query:608 update:331 delete:*0 command:852|0 flushes:0 mapped:63.1g vsize:127g faults:6449 locked db:Radius:12.0%
Mongostat Before(maybe while) Flushing:
insert:1 query:4 update:3 delete:*0 command:7|0 flushes:0 mapped:63.1g vsize:127g faults:313 locked db:local:0.0%
And Mongostat After Flushing:
insert:1572 query:1849 update:1028 delete:*0 command:2673|0 flushes:1 mapped:63.1g vsize:127g faults:21065 locked db:.:99.0%
As you can see, when a flush happens the lock is at 99%, and at that point mongod stops responding to any read/write operation (mongotop and mongostat also stop). The flush takes about 7 to 8 seconds to complete and does not increase disk load beyond 10%.
Are there any suggestions?
Under Windows server 2008 R2 (and other versions of Windows I would suspect, although I don't know for sure), MongoDB's (2.4 and older) background flush process imposes a global lock, doing substantial blocking of reads and writes, and the length of the flush time tends to be proportional to the amount of memory MongoDB is using (both resident and system cache for memory-mapped files), even if very little actual write activity is going on. This is a phenomenon we ran into at our shop.
In one replica set where we were using MongoDB version 2.2.2, on a host with some 128 GBs of RAM, when most of the RAM was in use either as resident memory or as standby system cache, the flush time was reliably between 10 and 15 seconds under almost no load and could go as high as 30 to 40 seconds under load. This could cause Mongo to go into long pauses of unresponsiveness every minute. Our storage did not show signs of being stressed.
The basic problem, it seems, is that Windows handles flushing to memory-mapped files differently than Linux. Apparently, the process is synchronous under Windows and this has a number of side effects, although I don't understand the technical details well enough to comment.
MongoDB, Inc., is aware of this issue and is working on optimizations to address it. The problem is documented in a couple of tickets:
https://jira.mongodb.org/browse/SERVER-13444
https://jira.mongodb.org/browse/SERVER-12401
What to do?
The phenomenon is tied, to some degree, to the minimum latency of the disk subsystem as measured under low stress, so you might try experimenting with faster disks, if you can. Some improvements have been reported with this approach.
A strategy that worked for us in some limited degree is avoiding provisioning too much RAM. It happened that we really didn't need 128 GBs of RAM, so by dialing back on the RAM, we were able to reduce the flush time. Naturally, that wouldn't work for everyone.
The latest versions of MongoDB (2.6.0 and later) seem to handle the situation better in that writes are still blocked during the long flush but reads are able to proceed.
If you are working with a sharded cluster, you could try dividing the RAM by putting multiple shards on the same host. We didn't try this ourselves, but it seems like it might have worked. On the other hand, careful design and testing would be highly recommended in any such scenario to avoid compromising performance and/or high availability.
We tried playing with syncdelay. Reducing it didn't help (the long flush times just happened more frequently). Increasing it helped a little (there was more time between flushes to get work done), but increasing it too much can exacerbate the problem severely. We boosted the syncdelay to five minutes (300 seconds), at one point, and were rewarded with a background flush of 20 minutes.
Some optimizations are in the works at MongoDB, Inc. These may be available soon.
In our case, to relieve the pressure on the primary host, we periodically rebooted one of the secondaries (clearing all memory) and then failed over to it. Naturally, there is some performance hit due to re-caching, and I think this only worked for us because our workload is write-heavy. Moreover, this technique is not in any sense a solution. But if high flush times are causing serious disruption, this may be one way to "reduce the fever" so to speak.
Consider running on Linux... :-)
Background flush by default does not block reads or writes. mongod flushes every 60 seconds, unless otherwise specified with the --syncdelay parameter. syncdelay uses an fsync() operation, which can be set to block writes while in-memory pages are flushed to disk. A blocked write can potentially block reads as well. Read more: http://docs.mongodb.org/manual/reference/command/fsync/
However, normally a flush should not take more than 1000ms (1 second). If it does, it is likely the amount of data flushing to disk is too large for your disk to handle.
Solution: upgrade to a faster disk like SSD, or decrease flush interval (try 30s, rather than the default 60s).
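To check flush durations against that 1000 ms rule of thumb, MongoDB versions of this era report a backgroundFlushing section in serverStatus, with fields such as last_ms and average_ms. A minimal sketch that works on an already-fetched status document; the function name, the default threshold, and the sample values are assumptions for illustration:

```python
def flush_report(server_status, threshold_ms=1000):
    """Summarise background flush timings from a serverStatus document."""
    flushing = server_status["backgroundFlushing"]
    return {
        "last_ms": flushing["last_ms"],
        "average_ms": flushing["average_ms"],
        # True when the most recent flush exceeded the threshold.
        "slow": flushing["last_ms"] > threshold_ms,
    }


# Invented sample shaped like the backgroundFlushing section; the numbers
# loosely mirror the 7 to 8 second flushes described in the question.
SAMPLE_STATUS = {
    "backgroundFlushing": {
        "flushes": 120,
        "total_ms": 900000,
        "average_ms": 7500.0,
        "last_ms": 7800,
    }
}

print(flush_report(SAMPLE_STATUS))
```

With pymongo the document could be obtained with something like client.admin.command("serverStatus"), assuming a client connected to the affected mongod.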

What happens when your Postgres shared_buffers is too small?

In the Postgres documentation, it says the parameter "shared_buffers" sets the amount of memory the database server uses for shared memory buffers. I know that if this value is too high, the database server might use more memory than is available, which may cause paging to occur.
However, what happens if this value is too low? Would the database just crash if it didn't have enough memory for an intensive query? Specifically, what would it lead to? High IO wait times? High CPU usage?
It won't crash; it may perform poorly.