I recently ran a test with Membase, incrementing 60 million keys, each key 20-30 bytes in size, with values no larger than an integer. The cluster spanned three 16 GB boxes, with 15 GB dedicated to a single bucket (replication=1) in Membase. The build is membase-server-community_x86_64_1.7.1.1 on 64-bit Ubuntu Lucid boxes.
Results:
Initially, 10 million keys resided in 3 GB of memory (roughly 3 million keys/GB).
At 60 million keys, the data resided in 45 GB of memory (roughly 1.33 million keys/GB).
In comparison, Redis handles 9-10 million keys/GB at 60 million keys, and this keys-per-GB ratio stays consistent regardless of dataset size.
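Put differently, the per-key memory cost can be worked out directly from those figures; this is just a back-of-the-envelope restatement of the reported totals (decimal GB assumed):

```python
# Memory cost per key, derived from the figures above.
GB = 10**9

membase_initial = 3  * GB / 10_000_000    # ~300 bytes/key at 10M keys
membase_final   = 45 * GB / 60_000_000    # ~750 bytes/key at 60M keys
redis           = 1  * GB / 9_500_000     # ~105 bytes/key at 9-10M keys/GB

print(f"Membase: {membase_initial:.0f} -> {membase_final:.0f} bytes/key")
print(f"Redis:   ~{redis:.0f} bytes/key")
```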
Question:
Membase does not seem to scale well with key-heavy datasets. Is there any tuning or configuration that could help Membase in this use case?
Thanks.
P.S. I migrated from Redis to Membase because the latter seemed to offer more resilience against cache failure. However, this degradation in memory efficiency with large datasets is a bit too painful.
I have a MongoDB replica set with one primary, one secondary, and one arbiter.
The primary and secondary have the same hardware specification: 8 cores, 32 GB RAM, and a 700 GB SSD.
I recently moved the database from the MMAPv1 storage engine to WiredTiger.
According to the MongoDB documentation, page eviction starts when:
the WiredTiger cache is 80% used, or
the dirty cache percentage is more than 5%.
My resident memory is 13 GB. I can see the dirty cache percentage above 5% (around 7%) all the time, and WiredTiger cache usage is more than 11 GB, which is around 80% of the WiredTiger cache size.
I can see an increase in CPU usage because application threads are going into cache eviction all the time.
I want to know: if I increase the box size to 16 cores and 64 GB, is that going to fix the issue?
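To double-check those numbers against what WiredTiger itself reports, something like the following pymongo sketch can be used; the serverStatus cache field names are the ones exposed by recent MongoDB versions, while the connection URI is an assumption:

```python
from pymongo import MongoClient

# Assumed local connection; adjust the URI for your replica set.
client = MongoClient("mongodb://localhost:27017")
cache = client.admin.command("serverStatus")["wiredTiger"]["cache"]

configured = cache["maximum bytes configured"]           # configured WT cache size
in_cache   = cache["bytes currently in the cache"]       # total bytes held in cache
dirty      = cache["tracked dirty bytes in the cache"]   # dirty bytes in cache

print(f"cache used : {in_cache / configured:.1%} (eviction kicks in around 80%)")
print(f"dirty cache: {dirty / configured:.1%} (eviction kicks in around 5%)")
```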
We are using GCE for a MongoDB replica set with three members. As our data is quite large, the initial sync for a new member takes quite a long time. In our case the initial sync takes 7 hours to copy the records and then 30 hours to create the indexes.
The database is stored on a separate disk with these properties (copy-paste from the GCE console):
Type: Standard persistent disk
Size: 2000 GB
Zone: us-central1-c
Sustained random IOPS limit - estimated (R/W): 1,500 / 3,000
Sustained throughput limit (MB/s) - estimated (R/W): 180 / 120
To speed things up, we tried adding an SSD disk:
Type: SSD persistent disk
Size: 1000 GB
Zone: us-central1-c
Sustained random IOPS limit - estimated (R/W): 15,000 / 15,000
Sustained throughput limit (MB/s) - estimated: 240 / 240
One would expect the SSD disk to be considerably faster than the standard disk, but our results were different. During the initial MongoDB sync the standard disk was several times faster than the SSD. While it took 7 hours for the standard disk to copy all the data, after 12 hours the SSD disk had copied only half of it. Using the Linux tool iostat we measured the standard disk at around 80,000 kB_wrtn/s while the SSD disk was at around 8,000 kB_wrtn/s. How is it possible that the SSD disk is 10 times slower than the standard disk?
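For reference, a rough equivalent of that iostat measurement can be reproduced from the kernel's per-device counters; this is a minimal sketch assuming a Linux box, and the device name and sample interval are placeholders:

```python
import time

def write_throughput_kb_s(device: str = "sdb", interval: float = 10.0) -> float:
    """Approximate kB written per second for a block device, using the
    'sectors written' counter (512-byte units) in /sys/block/<dev>/stat."""
    def sectors_written() -> int:
        with open(f"/sys/block/{device}/stat") as f:
            return int(f.read().split()[6])  # field 7: sectors written

    before = sectors_written()
    time.sleep(interval)
    after = sectors_written()
    return (after - before) * 512 / 1024 / interval

print(f"{write_throughput_kb_s():,.0f} kB/s written")
```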
I am wondering whether there is any size limit on Spark executor memory.
Consider the case of running a badass job doing collect, unions, count, etc.
Just for a bit of context, let's say I have these resources (2 machines):
Cores: 40 per machine, 80 in total
Memory: 156 GB per machine, 312 GB in total
What's the recommendation: bigger or smaller executors?
The suggestion from the Spark development team is not to have an executor larger than about 64 GB (this is often mentioned in training videos by Databricks). The idea is that a larger JVM has a larger heap, which can result in really slow garbage-collection cycles.
I think it is good practice to keep your executors at 32 GB, or even 24 GB or 16 GB. So instead of one large executor you have 2-4 smaller ones.
This may add some coordination overhead, but it should be fine for the vast majority of applications.
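As an illustration only, sizing along those lines for the 2-machine / 80-core / 312 GB setup above could look like the PySpark sketch below; the specific instance count, core count, and memory values are my own assumptions, not numbers from the answer:

```python
from pyspark.sql import SparkSession

# Hypothetical split: several mid-sized executors instead of a few huge JVMs,
# e.g. 12 executors x 5 cores x 24 GB heap (plus overhead) across the 2 machines.
spark = (
    SparkSession.builder
    .appName("right-sized-executors")
    .config("spark.executor.instances", "12")
    .config("spark.executor.cores", "5")
    .config("spark.executor.memory", "24g")
    .config("spark.executor.memoryOverhead", "2g")
    .getOrCreate()
)
```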
If you have not read this post, please do.
I would like to know the maximum size of a collection in MongoDB.
The MongoDB limitations documentation mentions that a single MMAPv1 database has a maximum size of 32 TB.
Does this mean the maximum size of a collection is 32 TB?
If I want to store more than 32 TB in one collection, what is the solution?
There are theoretical limits, as I will show below, but even the lower bound is pretty high. It is not easy to calculate the limits correctly, but the order of magnitude should be sufficient.
mmapv1
The actual limit depends on a few things, like the length of shard names and the like (that adds up if you have a couple of hundred thousand of them), but here is a rough calculation with real-life data.
Each shard needs some space in the config database, which is limited like any other database to 32 TB on a single machine or in a replica set. On the servers I administer, the average size of an entry in config.shards is 112 bytes. Furthermore, each chunk needs about 250 bytes of metadata. Let us assume optimal chunk sizes of close to 64 MB.
We can have at most 500,000 chunks per shard (32 TB / 64 MB). 500,000 * 250 bytes equals 125 MB of chunk information per shard. So, per shard, we have 125.000112 MB (125 MB of chunk metadata plus the 112-byte shard entry) if we max everything out. Dividing 32 TB by that value shows that we can have a maximum of slightly under 256,000 shards in a cluster.
Each shard in turn can hold 32 TB worth of data. 256,000 * 32 TB is 8.192 exabytes, or 8,192,000 terabytes. That would be the limit for our example.
Let's call it 8 exabytes. As of now, this easily translates to "enough for all practical purposes". To give you an impression: all the data held by the Library of Congress (arguably one of the biggest libraries in the world in terms of collection size) is estimated at around 20 TB, including audio, video, and digital materials. You could fit that into our theoretical MongoDB cluster some 400,000 times. Note that this is the lower bound of the maximum size, using conservative values.
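The arithmetic above can be reproduced in a few lines; this is just a restatement of the figures used in the answer, with decimal units assumed throughout:

```python
# Rough upper-bound calculation for an MMAPv1 sharded cluster,
# using the figures from the answer (decimal TB/MB for simplicity).
TB = 10**12
MB = 10**6

config_db_limit = 32 * TB      # config database capped like any other database
shard_entry     = 112          # bytes per config.shards entry (observed average)
chunk_metadata  = 250          # bytes of metadata per chunk
chunk_size      = 64 * MB      # assumed (optimal) chunk size
shard_capacity  = 32 * TB      # data each shard can hold

chunks_per_shard   = shard_capacity // chunk_size                 # 500,000
metadata_per_shard = chunks_per_shard * chunk_metadata + shard_entry
max_shards         = config_db_limit // metadata_per_shard        # just under 256,000
total_capacity_tb  = max_shards * shard_capacity / TB             # ~8.192 exabytes

print(f"{chunks_per_shard:,} chunks/shard, {max_shards:,} shards, "
      f"~{total_capacity_tb:,.0f} TB total")
```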
WiredTiger
Now for the good part: the WiredTiger storage engine does not have this limitation. The database size is not limited (since there is no limit on how many data files can be used), so we can have an unlimited number of shards. Even if those shards run on MMAPv1 and only our config servers run on WiredTiger, the size of a cluster becomes nearly unlimited. The 16.8M TB RAM limit on a 64-bit system might cause problems somewhere and force the indexes of the config.shards collection to be swapped to disk, stalling the system. I can only guess, since my calculator refuses to work with numbers in that area (and I am too lazy to do it by hand), but I estimate the limit here to be in the two-digit yottabyte range (and the space needed to host that to be somewhere around the size of Texas).
Conclusion
Do not worry about the maximum data size in a sharded environment. No matter what, it is by far enough, even with the most conservative approach. Use sharding, and you are done. By the way: even 32 TB is a hell of a lot of data; most clusters I know of hold less data and shard because IOPS and RAM utilization exceeded a single node's capacity.
We are going to use MongoDB for an automated alert notification system. It will also report various server statistics and business statistics. We would like to have a separate server for this and need to assess the hardware (RAM, hard disk, and any other configuration).
Could someone shed some light on this, please?
What are the things to consider?
How do we proceed once we collect that information (is there any standard)?
Currently I have only the below information.
Writes per second: 400
Average record size per write: 5 KB
Data retention policy: 30 days
MongoDB buffers writes in memory and flushes them to disk once in a while (every 60 seconds by default; configurable with --syncdelay), so writing 400 5 KB docs per second is not going to be a problem if Mongo can quickly update all indexes (it would help if you could give some info on the type and number of indexes you're going to have).
You're going to have 1,036,800,000 documents, or about 5 TB of raw data, each month. Mongo will need more than 5 TB to store that (for each doc it repeats all key names, plus there are the indexes). To estimate index size:
2 * [ n * ( 18 bytes overhead + avg size of indexed field + 5 or so bytes of conversion fudge factor ) ]
Where n is the number of documents you have.
And then you can estimate the amount of RAM (you need to fit your indices there if you care about query performance).
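Plugging the numbers from the question into that estimate gives a feel for the scale involved; this is a quick sketch using the answer's own formula, with an assumed 8-byte indexed field (e.g. a timestamp) purely as an example:

```python
# Back-of-the-envelope sizing from the question's figures (decimal units).
writes_per_sec = 400
doc_size_kb    = 5
retention_days = 30

docs = writes_per_sec * 86_400 * retention_days        # 1,036,800,000 docs
raw_data_tb = docs * doc_size_kb / 10**9                # ~5.2 TB of raw data

# Index size estimate: 2 * [ n * (18 bytes overhead + field size + ~5 bytes fudge) ]
indexed_field_size = 8                                  # assumed, e.g. a timestamp
index_gb = 2 * docs * (18 + indexed_field_size + 5) / 10**9

print(f"{docs:,} docs, ~{raw_data_tb:.1f} TB raw, ~{index_gb:.0f} GB per index")
```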