Couchbase stores data on disk and also retains it in RAM. Once the high watermark is reached, I guess it starts the eviction process. I assume the data would also be on disk at this point. So does eviction really mean deleting the document from RAM? Or does it mean deleting the data from RAM and writing it to disk? If it also includes writing it to disk, why should data already present on disk be overwritten?
Couchbase only evicts documents that have been persisted to disk. As you say, eviction means clearing the document data from RAM. When using the value eviction strategy, which is the default, Couchbase keeps the key and metadata in RAM and only evicts the document value. With the full eviction strategy, it removes the key, the metadata, and the value from RAM.
Couchbase first writes to RAM and then asynchronously writes to disk as well. Depending on the configuration, you can have the document value along with its metadata in RAM. And yes, when the threshold (the high water mark) is reached, Couchbase will start evicting data (values and/or metadata, per the configuration) from RAM until it gets back down to the low water mark.
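To make the watermark mechanics concrete, here is a deliberately simplified JavaScript sketch of that loop. It is not Couchbase's implementation; the quota, thresholds, and cache structure are all invented for illustration, and every cached entry is assumed to have already been persisted to disk.

// Toy model of watermark-based eviction (illustration only, not the real engine).
const RAM_QUOTA = 100;        // arbitrary units
const HIGH_WATER_MARK = 0.85; // fraction of the quota (example value)
const LOW_WATER_MARK  = 0.75; // fraction of the quota (example value)

const cache = new Map();      // key -> { value, meta, size, lastUsed }

function usedFraction() {
  let used = 0;
  for (const e of cache.values()) used += e.value ? e.size : 0;
  return used / RAM_QUOTA;
}

function maybeEvict(fullEviction = false) {
  if (usedFraction() < HIGH_WATER_MARK) return;   // below the high water mark: nothing to do
  // Drop the least recently used values until usage falls under the low water mark.
  const oldestFirst = [...cache.entries()].sort((a, b) => a[1].lastUsed - b[1].lastUsed);
  for (const [key, entry] of oldestFirst) {
    if (usedFraction() <= LOW_WATER_MARK) break;
    if (fullEviction) {
      cache.delete(key);      // full eviction: key, metadata and value all leave RAM
    } else {
      entry.value = null;     // value eviction: only the value leaves RAM; key + metadata stay
    }
  }
}

The real engine also honours NRU reference bits and only touches items that have already been written to disk, but the high/low watermark loop is the core idea.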
Related
As far as I know, MongoDB caches the working set in RAM.
So if I increase wiredTigerCacheSizeGB enough to hold all of the data on disk, will it work as fast as an in-memory DB?
If not, what is the difference?
See In-Memory Storage Engine and WiredTiger Storage Engine
(In-memory) By avoiding disk I/O, the in-memory storage engine allows for more predictable latency of database operations.
Keep in mind that you are limited to 10000 GB when setting wiredTigerCacheSizeGB. You should also disable journaling and set storage.syncPeriodSecs to 0 in order to increase the performance of WiredTiger. But even then, WiredTiger still has to create WiredTiger.wt and WiredTiger.turtle at least...
PS. I think this link might answer your question
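As a small, hedged illustration (the 2 GB figure is just an example), you can see the configured WiredTiger cache size and even change it at runtime from the mongo shell, instead of restarting with a different wiredTigerCacheSizeGB:

// Currently configured WiredTiger cache size, in bytes
db.serverStatus().wiredTiger.cache["maximum bytes configured"]

// Change it at runtime without a restart (normally you would set
// storage.wiredTiger.engineConfig.cacheSizeGB in the config file instead)
db.adminCommand({ setParameter: 1, wiredTigerEngineRuntimeConfig: "cache_size=2G" })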
I cannot answer all your questions.
A cache reads data from disk and keeps it in RAM. When you access that data again, you read it from RAM instead of reading it from disk again, which would be much slower.
So a cache is useless if you have to read the data only once. Some applications anticipate the data you may read in the future and put it into the cache in advance.
The MongoDB in-memory DB keeps all data in RAM only; it does not read or write anything to or from disk, apart from some logging data. When you stop an in-memory MongoDB process, all data is lost.
The wiredTiger storage engine is a data format used by MongoDB to store data persistently on disk.
If you set wiredTigerCacheSizeGB high enough to hold all of your data, then all of your reads will be satisfied from the cache. Writes will update the cache and also be written to storage.
If you use the in-memory configuration then all of your reads will be satisfied from memory. Writes will only go to memory and will not be stored on disk.
So if your workload is mostly reads, then the large cache will behave similarly to an in-memory DB. If your workload has a lot of writes, then the large cache configuration may be slower because it needs to write to disk.
Also, the in-memory DB will not preserve your data in the event of a crash, since it only holds data in memory.
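One rough way to check whether your reads are actually being served from the cache is to watch the WiredTiger cache counters in serverStatus (a sketch; the field names below are as reported by recent WiredTiger builds, so verify them on your version):

// How full is the cache, and how often do reads have to go to disk?
const c = db.serverStatus().wiredTiger.cache;
print("bytes currently in cache:", c["bytes currently in the cache"]);
print("maximum bytes configured:", c["maximum bytes configured"]);
print("pages requested from cache:", c["pages requested from the cache"]);
print("pages read into cache:", c["pages read into cache"]); // steadily climbing => reads are missing the cache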
I understand that the working set is cached every time a query is run in MongoDB.
What happens if the working set for a single query exceeds the cache memory? As far as I know, the least recently used data is removed from the cache so that the new data can be cached.
Example:
cacheSizeGB: 0.5 GB
total size of the documents matched by the query: about 1 GB
And does it help to reduce the size of what gets cached by using projection?
The cache is managed by WiredTiger, the mongod will request documents it needs, and if WT doesn't find it in the cache, it will read it in from disk. If this makes the cache exceed the threshold (default 80% of maximum size), the background eviction workers will start removing the least recently used items from the cache.
The result set is constructed/sorted in heap memory, not cache. If your result set is too large, mongod may be able to use disk for temporary space while sorting.
Large results may also cause the operating system to use swap, and in extreme cases it might run out of memory and get killed by the OOM killer.
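On the projection part of the question: projecting only the fields you need does not change what WiredTiger has to load into its cache (it still reads whole documents, unless the query is covered by an index), but it does shrink the result set that has to be built in heap memory and sent to the client. A hedged sketch with invented collection and field names:

// Only return the small fields; the large "payload" field never enters the result set
db.events.find(
  { type: "click" },
  { _id: 0, ts: 1, type: 1 }
)

// For big sorts/aggregations, let the server spill to temporary disk space
db.events.aggregate(
  [ { $match: { type: "click" } }, { $sort: { ts: -1 } } ],
  { allowDiskUse: true }
)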
Recently I got this alert from MongoDB Atlas:
Disk I/O % utilization on Data Partition has gone above 70 on nvme2n1
But I have no idea how to localize the problematic query / index / part of the code / collection.
How can I analyze this to find the root cause of the problem?
Not a full answer, but I have seen that many people face a similar problem.
In my case the root cause was this: we had a collection of huge documents, each containing an array of data (in fact, a list of coordinates with some metadata), and we updated each document as many times as it had coordinates (every time new coordinates were added), plus some additional operations.
As far as I know, MongoDB cannot fetch just part of a document; it fetches the full document. When we fetch many different, big documents, they do not fit into MongoDB's in-memory cache, so every access goes to disk, and that led to this issue.
So we simply split each of these documents up into several smaller ones, and that fixed the issue. While we need frequent access to update/add this data, we keep it in separate documents, and finally, once the process is done, we gather all these documents back into one big document for "history check" purposes.
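For what it's worth, the split we did is essentially the "bucket" pattern. A minimal sketch with invented names and an arbitrary 1000-coordinate cap per bucket, not our exact code:

// Append a coordinate to the current bucket for this track; once a bucket
// reaches 1000 entries the filter no longer matches and the upsert starts a new one.
db.trackBuckets.updateOne(
  { trackId: "truck-42", count: { $lt: 1000 } },
  {
    $push: { coords: { lat: 52.52, lon: 13.40, ts: new Date() } },
    $inc:  { count: 1 }
  },
  { upsert: true }
)

// Once the process is done, gather the buckets back into one "history" document
db.trackBuckets.aggregate([
  { $match: { trackId: "truck-42" } },
  { $unwind: "$coords" },
  { $group: { _id: "$trackId", coords: { $push: "$coords" } } },
  { $merge: { into: "trackHistory" } }
])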
Recently, we hit this alert on MongoDB Atlas: Disk I/O % utilization on Data Partition has gone above 90, right after an instance reboot for maintenance. After a discussion with the Atlas support folks, we now understand this metric clearly.
Understanding Disk I/O % Utilization
The definitions of Disk I/O % Utilization and Disk I/O % utilization on Data Partition, per the docs:
Disk I/O % Utilization alerts indicate that the percentage of time during which requests are being issued reaches a specified threshold.
Disk I/O % utilization on Data Partition occurs if the percentage of time during which requests are being issued to any partition that contains the MongoDB collection data meets or exceeds the threshold.
Two traps in iostat: %util and svctm
Device saturation occurs when this value is close to 100% for devices serving requests serially. But for devices serving requests in parallel, such as RAID arrays and modern SSDs, this number does not reflect their performance limits.
This means if there was even just one I/O operation in progress for a given time period, the operating system would report 100% Disk Util, as the disk was in use 100% of that time.
Thus, the disk utilization percentage by itself is NOT an indicator of stress on the disk relative to its maximum IOPS capacity.
Having disk utilization at 100% does not in itself imply there is an issue. Disk utilization is the percentage of time requests are issued to any partition containing the MongoDB collection data. This includes requests from any process, not just MongoDB processes. Modern disk storage can sustain multiple I/O operations simultaneously, so having a ~100% utilization is not unusual, because it just means that the disk is constantly processing at least one operation during the 100% interval.
Conclusion
We should look at a combination of all the available disk-related metrics, as well as IOWait in the System CPU when diagnosing potential disk performance-related issues.
Possible actions to help resolve Disk Utilization % alerts
Optimize your queries
Create an Index to Support Read Operations
Pay attention to Query Selectivity and Covered Query
Use the Atlas Performance Advisor to view slow queries and suggested indexes.
Review Indexing Strategies for possible further indexing improvements.
Analyze Query Performance to review how your queries are using your indexes (see the explain() sketch below).
Use the Profiler to analyze and optimize queries with long execution times.
Increase hardware resources, such as instance size and IOPS on Atlas
Source: Mongo Doc
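To check query targeting concretely, explain("executionStats") on a suspect query shows how many index keys and documents were examined for each document returned (the collection and field names below are invented); a totalDocsExamined far above nReturned usually points to a missing or poorly chosen index:

const stats = db.orders.find({ status: "open" }).sort({ created: -1 })
                       .explain("executionStats").executionStats;
print("returned:      ", stats.nReturned);
print("keys examined: ", stats.totalKeysExamined);
print("docs examined: ", stats.totalDocsExamined); // should be close to nReturned

// If the ratio is poor, an index that supports both the filter and the sort may help
db.orders.createIndex({ status: 1, created: -1 })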
As the alert says, it is due to the high utilization of the disk. The most common cause of it is unoptimized queries with poor Query Targeting Ratio, or simply reading/writing a lot of documents from/to the disk in a relatively shorter time window.
In order to identify these queries, start with the Profiler and look for the operations with a poor Examined:Returned ratio. You can also refer to the Performance Advisor to see if it suggests any indexes on the inefficient operations. Since Profiler's window is limited to the last 24 hours, you can also refer to your logs to identify the Slow Queries.
Ultimately, the effort to solve this goes in three directions:
Optimize query execution with efficient indexing and filtering strategies.
Keep a check on the volume of data being read/written in one go.
Increase the IOPS of the cluster.
For an official reference, check out the documentation here.
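A hedged sketch of the Profiler workflow described above (the 100 ms threshold and the projection are just examples):

// Record operations slower than 100 ms (profiling level 1 = slow operations only)
db.setProfilingLevel(1, { slowms: 100 })

// Later: pull out the slowest recent operations and their examined vs. returned counts
db.system.profile.find(
    {},
    { op: 1, ns: 1, millis: 1, keysExamined: 1, docsExamined: 1, nreturned: 1 }
  )
  .sort({ millis: -1 })
  .limit(5)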
How advantageous is value eviction over full eviction? In the case of value eviction, I assume the metadata is kept in RAM. How does the presence of metadata help with quicker retrieval of content? Are the NRU (not recently used) documents evicted once the high watermark is reached? Are there any aspects we need to consider before changing the eviction policy from value eviction to full eviction?
Value eviction keeps all document metadata in memory and full eviction does not.
Let's say you do a get on a key that does not exist. In value eviction mode you instantly know that the key is not there, since it is a memory-only operation. In full eviction mode, if the metadata for that key is not in memory, then you have to do a disk fetch to be sure that the key does not exist.
Basically, any operation that requires knowledge of a key's metadata may require a background fetch. Some other operations that may be slow if the metadata is not in memory are CAS sets (check and set only if the value has not changed), appends, prepends, increments, and decrements. Also keep in mind that the extra disk activity may cause disk contention that affects other parts of Couchbase.
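To make the miss case concrete, here is a rough Node.js SDK sketch (connection details and the bucket name are placeholders, so treat it as an outline rather than a recipe). The client call is identical in both modes; the difference is server-side, where value eviction can answer "not found" from resident metadata while full eviction may first have to check disk:

const couchbase = require('couchbase');

async function lookupMissingKey() {
  const cluster = await couchbase.connect('couchbase://localhost', {
    username: 'Administrator',
    password: 'password',
  });
  const collection = cluster.bucket('example-bucket').defaultCollection();

  try {
    const res = await collection.get('no-such-key');
    console.log('found:', res.content);
  } catch (err) {
    if (err instanceof couchbase.DocumentNotFoundError) {
      // Value eviction: the miss is detected from in-memory metadata.
      // Full eviction: the server may need a disk fetch before it can be sure.
      console.log('key does not exist');
    } else {
      throw err;
    }
  }
}

lookupMissingKey();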
The NRU is the same across both full eviction and value eviction and Couchbase will do its best to keep your working set in memory.
I would recommend trying to get an idea of what your workload looks like before switching modes, and testing it out with full eviction, because you may see performance issues that will vary depending on your workload.
In addition to Mike's answer, it's worth mentioning bloom filters here, which are a very powerful feature of Couchbase and can decrease trips to disk significantly. Bloom filters are also enabled with value-only ejection, but Couchbase really leverages their functionality in full ejection mode. I was in a situation where the system had reached its limit with value-only ejection buckets; I tested the two eviction modes, and in the end full ejection turned out to be far superior, at least for my case.
Say I have a single collection in MongoDB with only one index, and I need that index for the entire life cycle of the application using the collection.
I would like to know how MongoDB behaves here.
In this case, once the index is loaded into memory, will MongoDB keep it in RAM?
Thanks
The first thing MongoDB will knock out of RAM will be the LRU (least recently used) piece of data. So if you only have one index, chances are it will continue to be used pretty regularly and it should stay in memory.
Source
Unfortunately you cannot currently pin a collection or index in memory. MongoDB uses memory mapped files to load collections and indexes into memory. As your activities touch various pieces of your database through queries, updates, insertions and deletions, that data will get loaded into memory. This is referred to as the working set. If the total memory required to load the working set is less than available memory, no problem.
If not, MongoDB is going to use an LRU algorithm to pick what to unload from memory. This is why it's so important to understand the concept of the working set and how it relates to your available memory.
This writeup from the documentation should be helpful:
How do I calculate how much RAM I need for my application?
The amount of RAM you need depends on several factors, including but not limited to:
The relationship between database storage and working set.
The operating system’s cache strategy for LRU (Least Recently Used)
The impact of journaling
The number or rate of page faults and other MMS gauges to detect when you need more RAM
Each database connection thread will need up to 1 MB of RAM. MongoDB defers to the operating system when loading data into memory from disk. It simply memory maps all its data files and relies on the operating system to cache data. The OS typically evicts the least-recently-used data from RAM when it runs low on memory. For example if clients access indexes more frequently than documents, then indexes will more likely stay in RAM, but it depends on your particular usage.
To calculate how much RAM you need, you must calculate your working set size, or the portion of your data that clients use most often. This depends on your access patterns, what indexes you have, and the size of your documents. Because MongoDB uses a thread per connection model, each database connection also will need up to 1MB of RAM, whether active or idle.
If page faults are infrequent, your working set fits in RAM. If fault rates rise higher than that, you risk performance degradation. This is less critical with SSD drives than with spinning disks.
http://docs.mongodb.org/manual/faq/diagnostics/
You can use the serverStatus command to get an estimate of your current working set:
db.runCommand( { serverStatus: 1, workingSet: 1 } )
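You can also compare the size of your indexes with the memory available to mongod; if the total index size fits comfortably in RAM, a regularly used index is very likely to stay resident (the collection name below is invented):

db.events.totalIndexSize()    // total size of all indexes on the collection, in bytes
db.events.stats().indexSizes  // per-index breakdown, in bytes
db.serverStatus().mem         // resident/virtual memory of the mongod process, in MB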