MongoDB pod consuming memory even though it is in idle state

When inserting data into MongoDB, its memory usage increases. The database is then dropped and the connections are closed, but memory usage still continues to increase.
I have already configured the WiredTiger cache to 700 MB.
As you can see in the graph in the screenshot attached below, data insertion and deletion take place every 30 minutes and take at most 10 minutes, after which the connection is closed. Yet, as the graph shows, memory usage keeps increasing until it reaches its maximum limit, and then the Kubernetes pod starts showing trouble.
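One way to see whether that growth is really the WiredTiger cache or something else is to poll serverStatus and compare the cache size against the pod's RSS. Below is a minimal sketch with the MongoDB Java driver; the connection string is a placeholder, and the 700 MB figure is the cache limit mentioned above:

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;

public class CacheUsageCheck {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoDatabase admin = client.getDatabase("admin");
            Document status = admin.runCommand(new Document("serverStatus", 1));

            Document wiredTiger = status.get("wiredTiger", Document.class);
            Document cache = wiredTiger.get("cache", Document.class);

            // If this number stays near the configured 700 MB while the pod's RSS keeps
            // growing, the extra memory is coming from outside the WiredTiger cache
            // (connections, in-memory sort buffers, allocator fragmentation, ...).
            System.out.println("bytes currently in the cache: "
                    + cache.get("bytes currently in the cache"));
        }
    }
}
```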

Related

Aurora PostgreSQL Reader Performance

I am using an AWS Aurora PostgreSQL cluster (one reader instance, one writer instance).
Until yesterday, my application used only the writer instance, and so far the CPU metric of the writer instance has never exceeded 20%.
Today the read-only operations were changed to use the reader instance, but the CPU metric of the reader instance kept increasing by around 5% every 30 minutes. (In the first 30 minutes after deployment it went up to 20% and then decreased again, after 60 minutes it rose to 25% and then decreased again ... up to 65%.)
For now, it has been changed back to use only the writer instance.
Replication was still ongoing, and no other monitoring indicators looked unusual.
Why is this happening?
Is it the reader instance warming up its memory (buffer cache)?
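If the warm-up theory is what you want to verify, one rough approach is to watch the buffer cache hit ratio on the reader while the CPU climbs; a cold cache shows a low hit ratio that improves over time. A sketch using JDBC against pg_stat_database (the reader endpoint and credentials are placeholders):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ReaderCacheHitRatio {
    public static void main(String[] args) throws Exception {
        // Point the URL at the cluster's reader endpoint (placeholder values below).
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://my-cluster-reader-endpoint:5432/mydb", "user", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT blks_hit, blks_read, "
                   + "round(blks_hit * 100.0 / nullif(blks_hit + blks_read, 0), 2) AS hit_pct "
                   + "FROM pg_stat_database WHERE datname = current_database()")) {
            while (rs.next()) {
                // A hit percentage that starts low and rises over the first hour would
                // be consistent with the reader's cache warming up.
                System.out.printf("hits=%d reads=%d hit_pct=%s%n",
                        rs.getLong("blks_hit"), rs.getLong("blks_read"), rs.getString("hit_pct"));
            }
        }
    }
}
```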

Flink Incremental Checkpointing Compaction

We have a forever-running Flink job that reads from Kafka and creates sliding time windows (window sizes: 1 hr, 2 hr, up to 24 hr; slide intervals: 1 min, 10 min, up to 1 hr).
Basically it is: KafkaSource.keyBy(keyId).SlidingWindow(stream, slide).reduce.sink
I recently enabled checkpointing with the RocksDB backend, incremental=true, and HDFS as persistent storage.
For the last 4-5 days I have been monitoring the job and it is running fine, but I am concerned about the checkpoint size. Since RocksDB does compaction and merging, the size is not growing forever, but it still grows and has reached 100 GB so far.
So, what is the best way to checkpoint forever-running jobs?
The job will have millions of unique keyIds. Will there be one state per key for each operator when checkpointing?
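For reference, the checkpointing setup described above (RocksDB backend, incremental mode, HDFS storage) would typically look something like the following sketch; the 10-minute interval, the HDFS path, and the Flink 1.13+ style API are assumptions:

```java
import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointSetup {
    // Applies the checkpointing configuration to an existing environment; the
    // Kafka -> keyBy -> sliding window -> reduce -> sink pipeline is built elsewhere.
    public static void configure(StreamExecutionEnvironment env) {
        env.enableCheckpointing(10 * 60 * 1000L);                    // e.g. checkpoint every 10 minutes
        env.setStateBackend(new EmbeddedRocksDBStateBackend(true));  // true = incremental checkpoints
        env.getCheckpointConfig().setCheckpointStorage("hdfs:///flink/checkpoints");
    }
}
```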
If the total number of your keys is under control, you don't need to worry about the checkpoint size growing; it will eventually converge.
If you still want to cut the size of the checkpoints, you can set a TTL for your state, provided that state which has not been touched for a period of time can be regarded as expired.
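A minimal sketch of enabling such a TTL on keyed state (the 25-hour TTL and the state name are made up for illustration; note that TTL applies to user-defined keyed state, e.g. in a ProcessFunction, while built-in window state is purged automatically when windows expire):

```java
import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.time.Time;

public class TtlStateExample {
    // Builds a descriptor for keyed state that expires 25 hours after the last write,
    // so entries for keys that stop arriving are eventually dropped from RocksDB.
    public static ValueStateDescriptor<Long> buildDescriptor() {
        StateTtlConfig ttlConfig = StateTtlConfig
                .newBuilder(Time.hours(25)) // slightly longer than the largest 24 h window
                .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
                .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
                .cleanupInRocksdbCompactFilter(1000) // purge expired entries during RocksDB compaction
                .build();

        ValueStateDescriptor<Long> descriptor =
                new ValueStateDescriptor<>("perKeyState", Long.class);
        descriptor.enableTimeToLive(ttlConfig);
        return descriptor;
    }
}
```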
Flink state is associated with a key-group, which is a group of keys; the key-group is the unit of Flink state. Each key's state is included in a completed checkpoint. However, with the incremental mode, some checkpoints share .sst files, so the incremental checkpoint size is not as large as the total checkpoint size. If some keys are not updated during the last checkpoint interval, those keys' state won't be uploaded this time.

Kafka streams changelog consumption rate drops during state rebuilding

I recently started working with Kafka, and I'm having a hard time debugging the drop in changelog consumption rate during the state rebuild.
TL;DR: After deleting the PVC and the pod and waiting for the pod to start running again, the Grafana graph of the changelog lag looks like this, and this shape is not what I'd expect:
The graph indicates that the lag on the changelog topic is consumed pretty fast at the beginning, but the consumption slows down over time.
The process stretches over 30 minutes for a changelog of 14 GB.
More information about the most recent config:
Provider: AWS
storageClass: io1
storageSize: 3TB
podMemory: 25GB
JVM memory: 16GB
UPD: 24 partitions, no data skew
RocksDB params:
writeBufferSize: 2 MB
blockSize: 32 KB
maxWriteBufferNumber: 4
minWriteBufferNumberToMerge: 2
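For context, RocksDB parameters like these are usually applied in Kafka Streams through a RocksDBConfigSetter; a sketch with the values listed above (the class name is a placeholder):

```java
import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.Options;

public class CustomRocksDBConfig implements RocksDBConfigSetter {

    @Override
    public void setConfig(final String storeName, final Options options,
                          final Map<String, Object> configs) {
        options.setWriteBufferSize(2 * 1024 * 1024L);   // 2 MB memtable
        options.setMaxWriteBufferNumber(4);
        options.setMinWriteBufferNumberToMerge(2);

        // Reuse the existing table config so other defaults (e.g. block cache) are kept.
        BlockBasedTableConfig tableConfig = (BlockBasedTableConfig) options.tableFormatConfig();
        tableConfig.setBlockSize(32 * 1024L);           // 32 KB data blocks
        options.setTableFormatConfig(tableConfig);
    }

    @Override
    public void close(final String storeName, final Options options) {
        // Nothing allocated here, so nothing to close.
    }
}
```

It is registered with props.put(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG, CustomRocksDBConfig.class).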
The process I follow is simply deleting the PVCs and the pods, and measuring the time it takes for the pod to start running and for the changelog topic's lag to go back to 0.
Results of my tuning sessions:
increased the storage size from 750 GB to 3 TB; result: rebuilding the state for the 14 GB topic went from 68 min to 50 min, no change in the graph shape;
changed the storage class from gp2 to io1; result: rebuilding the state for the 14 GB topic went from 50 min to 30 min, no change in the graph shape;
changed RocksDB maxWriteBufferNumber from 2 to 4 and minWriteBufferNumberToMerge from 1 to 2; result: no change in speed or in the graph shape;
changed pod memory from 14 GB to 25 GB and JVM memory from 9 GB to 16 GB; result: no change in speed or in the graph shape.
The situation looks to me like memory saturation, but garbage collection time stays under 5%, and increasing the memory didn't help at all. So where else should I look? Thank you!

MongoDB TTL doesn't delete documents if under load

Use case
I am using MongoDB to persist messages from a message queue system (e.g. RabbitMQ / Kafka). Each message has a timestamp, and based on that timestamp I want to expire the documents 1 hour afterwards. Therefore I have a deleteAt field which is indexed with expireAfterSeconds: 0. Everything works fine, except when MongoDB is under heavy load.
We are inserting roughly 5-7k messages per second into a single replica set. The TTL thread seems to be way slower than the rate of messages coming in, so storage is growing quickly (which is exactly what we wanted to avoid with TTLs).
To describe the behaviour more precisely: when I sort the messages by deleteAt ascending (oldest date first), I can see that it sometimes does not delete any of those messages for hours. Because of this observation I believe that the TTL thread is sometimes stuck or not active at all.
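For reference, the TTL index described above would be created roughly like this with the MongoDB Java driver (connection string, database, and collection names are assumptions; the deleteAt field and expireAfterSeconds: 0 come from the question):

```java
import java.util.concurrent.TimeUnit;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.IndexOptions;
import com.mongodb.client.model.Indexes;
import org.bson.Document;

public class TtlIndexSetup {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> messages =
                    client.getDatabase("queue").getCollection("messages");

            // Documents become eligible for removal as soon as the wall clock passes
            // their deleteAt value, because expireAfter is 0 relative to the indexed field.
            messages.createIndex(
                    Indexes.ascending("deleteAt"),
                    new IndexOptions().expireAfter(0L, TimeUnit.SECONDS));
        }
    }
}
```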
My question
What can I do to ensure that the TTL thread is not negatively impacted by the rate of messages coming in? According to our metrics, our only bottleneck seems to be CPU, even though SSD disk I/O is pretty high too.
Do I need to tune something (e.g. give MongoDB more threads for document deletion) so that the TTL thread can keep up with the write rate?
I believe I am facing a known bug as described in MongoDB's Jira Dashboard: https://jira.mongodb.org/browse/SERVER-19334
From https://docs.mongodb.com/manual/core/index-ttl/:
The background task that removes expired documents runs every 60 seconds. As a result, documents may remain in a collection during the period between the expiration of the document and the running of the background task.
Because the duration of the removal operation depends on the workload of your mongod instance, expired data may exist for some time beyond the 60 second period between runs of the background task.
I'm not aware of any way to tune that TTL thread, and I suspect you'll need to run your own cron job to do batched deletes.
The other thing to look at might be what is taking up CPU and I/O, and whether there is any way of reducing that load.
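If you do end up running your own periodic clean-up as suggested above, a minimal sketch could look like this (collection and field names follow the question; the 60-second interval and connection details are assumptions):

```java
import java.util.Date;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import org.bson.Document;

public class ManualTtlSweeper {
    public static void main(String[] args) {
        MongoClient client = MongoClients.create("mongodb://localhost:27017");
        MongoCollection<Document> messages =
                client.getDatabase("queue").getCollection("messages");

        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Every 60 seconds, remove everything whose deleteAt is already in the past.
        scheduler.scheduleAtFixedRate(
                () -> messages.deleteMany(Filters.lt("deleteAt", new Date())),
                0, 60, TimeUnit.SECONDS);
    }
}
```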
You can create the index with "sparse"; this should perform the clean-up on a separate thread in the background.

PostgreSQL autovacuum causing significant performance degradation

Our Postgres DB (hosted on Google Cloud SQL with 1 CPU, 3.7 GB of RAM, see below) consists mostly of one big ~90GB table with about ~60 million rows. The usage pattern consists almost exclusively of appends and a few indexed reads near the end of the table. From time to time a few users get deleted, deleting a small percentage of rows scattered across the table.
This all works fine, but every few months an autovacuum gets triggered on that table, which significantly impacts our service's performance for ~8 hours:
Storage usage increases by ~1GB for the duration of the autovacuum (several hours), then slowly returns to the previous value (might eventually drop below it, due to the autovacuum freeing pages)
Database CPU utilization jumps from <10% to ~20%
Disk Read/Write Ops increases from near zero to ~50/second
Database Memory increases slightly, but stays below 2GB
Transaction/sec and ingress/egress bytes are also fairly unaffected, as would be expected
This has the effect of increasing our service's 95th latency percentile from ~100ms to ~0.5-1s during the autovacuum, which in turn triggers our monitoring. The service serves around ten requests per second, with each request consisting of a few simple DB reads/writes that normally have a latency of 2-3ms each.
Here are some monitoring screenshots illustrating the issue:
The DB configuration is fairly vanilla:
The log entry documenting this autovacuum process reads as follows:
automatic vacuum of table "XXX": index scans: 1
pages: 0 removed, 6482261 remain, 0 skipped due to pins, 0 skipped frozen
tuples: 5959839 removed, 57732135 remain, 4574 are dead but not yet removable
buffer usage: 8480213 hits, 12117505 misses, 10930449 dirtied
avg read rate: 2.491 MB/s, avg write rate: 2.247 MB/s
system usage: CPU 470.10s/358.74u sec elapsed 38004.58 sec
Any suggestions on what we could tune to reduce the impact of future autovacuum runs on our service? Or are we doing something wrong?
If you increase autovacuum_vacuum_cost_delay, your autovacuum will run slower and be less invasive.
However, it is usually a better solution to make it faster by setting autovacuum_vacuum_cost_limit to 2000 or so; then it finishes sooner.
You could also try to schedule VACUUMs of the table yourself at times when it hurts least.
But frankly, if a single innocuous autovacuum is enough to disturb your operation, you need more I/O bandwidth.
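As a rough illustration of the faster-autovacuum option, the per-table settings and an off-peak manual VACUUM could be applied like this (the table name, connection details, and the 2 ms cost delay are placeholders; the cost limit of 2000 is the value suggested above):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class AutovacuumTuning {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://localhost:5432/mydb", "user", "password");
             Statement stmt = conn.createStatement()) {

            // Per-table storage parameters: allow autovacuum to do more work per cost
            // round (cost_limit 2000) with only a short pause between rounds (2 ms).
            stmt.execute("ALTER TABLE big_table SET ("
                    + "autovacuum_vacuum_cost_limit = 2000, "
                    + "autovacuum_vacuum_cost_delay = 2)");

            // Optionally, run a manual VACUUM at a quiet time instead of waiting
            // for autovacuum to trigger on its own.
            stmt.execute("VACUUM (VERBOSE) big_table");
        }
    }
}
```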