The memory use of the Swift Storage Node is too high - openstack-swift

I am testing the performance of Swift. The environment is composed of 1 Swift Proxy Node and 3 Swift Storage Nodes. Each storage node has 2 GB of RAM and a 100 GB partition mounted for storage.
At first the throughput is acceptable, but after several days of testing the performance has dropped a lot, and I find that the memory use of the Storage Nodes is very high (more than 95%).
Is there any configuration in Swift to control the memory use of a node? Or is the only solution to increase the RAM (e.g. to 8 GB)? Would a node with 8 GB of RAM run out of memory as well?

I think the high memory usage is OK. Linux borrows otherwise free memory for disk caching, so most of that 95% is likely page cache rather than memory held by Swift itself. You can find more information here.
I also experienced a performance drop after I had uploaded a lot of files. I believe it was because many background daemons (replicators, updaters) were busy working.
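A quick way to check this on a storage node is to look at /proc/meminfo and subtract the cache from the "used" figure. Below is a minimal sketch (my own, not part of Swift) that does exactly that:

    # Minimal sketch: report how much "used" memory on a Linux node is really
    # reclaimable page cache. Assumes a Linux /proc filesystem; not Swift-specific.
    def meminfo_kb():
        info = {}
        with open("/proc/meminfo") as f:
            for line in f:
                key, rest = line.split(":", 1)
                info[key] = int(rest.split()[0])  # all values are reported in kB
        return info

    m = meminfo_kb()
    cache = m.get("Cached", 0) + m.get("Buffers", 0)
    used = m["MemTotal"] - m["MemFree"]
    print(f"used: {used} kB, of which cache/buffers: {cache} kB")
    print(f"memory actually held by processes: {used - cache} kB")

If the "actually held by processes" number is small, the node is healthy and the 95% figure is just the kernel making good use of idle RAM.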

Related

mongodb limit server memory consumption

I'm testing running mongodb on the Kubernetes platform, where I can limit the resources used by the running container.
Say I set the memory limit to 256 MB. The problem is that, for example, while making a backup, memory consumption increases to the limit and the container gets restarted by Kubernetes.
So the question is: is there a way to limit mongodb's memory consumption in my case so that it does not crash by exceeding the memory limit set by the platform?
I could of course increase the limit, but I'm interested in a principled solution and would like to understand this process better, because I don't really know how memory is consumed by mongodb and the container OS. Is it possible to tune mongodb/the underlying Linux OS to work inside the existing limits?
The limits that you have set are good enough for a mongodb pod; these are the limits used by the community as well.
The only way I think you can get around this for backups is to increase the memory limits, but it might still fail, because elsewhere on Stack Overflow people have experienced OOM kills even on VMs with gigabytes of memory. MongoDB basically tries to eat any and all memory that is made available to it.
Also there are other ways to backup mongodb: https://dba.stackexchange.com/questions/76130/how-to-backup-large-mongodb-database
I am not sure how this aligns in the k8s world.
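If you want to see where that memory is going before raising the limit, you can inspect the WiredTiger cache figures in serverStatus. A minimal sketch, assuming pymongo is installed and mongod is reachable on localhost (the connection string is just an example):

    # Minimal sketch: read the WiredTiger cache counters from serverStatus to see
    # how much memory the storage engine is holding inside the container limit.
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")  # example connection string
    status = client.admin.command("serverStatus")
    cache = status["wiredTiger"]["cache"]
    in_cache = cache["bytes currently in the cache"]
    max_cache = cache["maximum bytes configured"]
    print(f"WiredTiger cache: {in_cache} bytes used of {max_cache} bytes configured")

Comparing the configured maximum against your 256 MB pod limit should make it clearer how much headroom is left for the backup process itself.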

Is this disk read speed to be expected (Amazon EBS)?

Our Amazon EBS-backed instance has slowed down considerably (maybe it shifted physical host?).
I've checked the instance using top, and the CPU use is very low when the process is active (around 1%). Using iotop I have monitored the disk read speed of postgresql. When there is only one postgresql thread running, it reports about a 5 MB/s read speed. Is this rather slow, or is this within the range of usual disk read speeds?
Thanks
5 MB/s is more or less typical for a single hard drive, for sequential access of course. If you have only one hard disk then your CPU is fine, since one hard disk is probably not enough to stress your CPU. If you are not reaching any more speed than that even with constant queries, then your hard disk is the bottleneck.
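If you want a rough point of comparison for the iotop figure, you can time a sequential read of a large file yourself. A minimal sketch, assuming a pre-created test file (the path below is just an example) that is larger than RAM so the page cache doesn't skew the result:

    # Minimal sketch: measure sequential read throughput of a file in MB/s.
    # The file should be bigger than RAM (or caches dropped) for a fair number.
    import time

    CHUNK = 1024 * 1024           # read in 1 MiB chunks
    PATH = "/var/tmp/testfile"    # example path to a large pre-created test file

    def sequential_read_mb_per_s(path):
        total = 0
        start = time.time()
        with open(path, "rb", buffering=0) as f:
            while True:
                chunk = f.read(CHUNK)
                if not chunk:
                    break
                total += len(chunk)
        return (total / (1024 * 1024)) / (time.time() - start)

    print(f"{sequential_read_mb_per_s(PATH):.1f} MB/s")

If this sequential number is much higher than what postgresql achieves under load, the workload is probably dominated by random IO rather than by a slow volume.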

Cassandra storage vs memory sizing

I am considering developing an application with a Cassandra backend. I am hoping that I will be able to run each cassandra node on commodity hardware with the following specs:
Quad Core 2GHz i7 CPU
2x 750GB disk drives
16 GB installed RAM
Now, I have been reading online that the available disk-space for Cassandra should be double the amount that is stored on the disks, which would mean that each node (set up in a RAID-1 configuration) would be able to store 375 GB of data, which is acceptable.
My question is whether 16 GB of RAM is enough to efficiently serve 375 GB of data per node. The data in the application will also be fairly time-dependent, such that recent data will be the data most read from the database. In fact, most of the data will be deleted after about 6 months.
Also, should I assign Cassandra a heap (-Xmx) close to 16 GB, or does Cassandra utilize off-heap memory?
You should not set the Cassandra heap to more than 8GB; bigger than that, and garbage collection will kill you with large pauses. Cassandra will use the buffer cache (like other applications) so the remaining memory isn't wasted.
16 GB of RAM will be enough to serve the data if your hot set fits in RAM, or if the read rate you need can be served from disk. Disks can do about 100 random IOs per second, so with your setup, if you need more than 200 reads/second you will need to make sure the data is in cache. Cassandra exports good cache statistics (cassandra-cli show keyspaces), so you should easily be able to tell how effective your cache is being.
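To make that arithmetic concrete, here is a small back-of-the-envelope sketch (the read rate is a made-up example, not a measurement of your workload):

    # Back-of-the-envelope sketch: how high a cache hit rate is needed so that
    # the cache misses fit within the disks' random-IO budget. Numbers are assumptions.
    disks = 2                 # RAID-1 pair; both disks can serve reads
    iops_per_disk = 100       # rough figure for a spinning disk
    read_rate = 500           # hypothetical application reads per second

    disk_budget = disks * iops_per_disk
    required_hit_rate = max(0.0, 1.0 - disk_budget / read_rate)
    print(f"required cache hit rate: {required_hit_rate:.0%}")   # 60% at 500 reads/s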
Do bear in mind, with only two disks in RAID-1, you will not have a dedicated commit log. This could hamper write performance quite badly. You may want to consider turning off the commit log if it does affect performance, and forgo durable writes.
Although it is probably wise not to use a really huge heap with Cassandra, at my company we have used 10GB to 12GB heaps without any issues so far. Our servers typically have at least 48 GB of memory (RAM is cheap -- so why not :-)) and so we may try expanding the heap a bit more and see what happens.

virtual memory on compute nodes in a grid

When I run a job on a node using PBS, I finally get this in the job report:
resources_used.mem=1616024kb
resources_used.vmem=2350176kb
resources_used.walltime=00:06:32
What does the virtual memory really mean? I don't think there is a hard drive connected to each node.
Which kind of memory should I take into account when I try to increase the size of the problem so that I don't hit the 16 GB capacity of the node memory: the normal memory (mem) or the virtual memory (vmem)?
Thanks
The vmem indicates how much memory your job was using in total. It used all of the available physical memory (see the mem value), and more. An operating system allows programs to allocate more memory than there is physical memory available.
If you're actively using more memory than there is physical memory available, you'll start seeing swap activity (data that was swapped out to disk being brought back into memory, and other stuff being put to disk). That's bad, it will basically kill your performance if it happens a lot.
So, as long as you're not actively using more than 16 GB, you're fine. But the mem or vmem values alone won't tell you this; it depends on what the application is actually doing.
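If you want to watch the two numbers for a process on a Linux node yourself, the resident size (roughly what PBS reports as mem) and the virtual size (roughly vmem) are both exposed in /proc. A minimal sketch, not PBS-specific:

    # Minimal sketch: compare resident memory (counts against physical RAM) with
    # virtual size (includes mappings that may never touch RAM) for a process.
    def status_kb(field, pid="self"):
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                if line.startswith(field + ":"):
                    return int(line.split()[1])  # reported in kB
        return None

    print("VmRSS  (resident, like 'mem'): ", status_kb("VmRSS"), "kB")
    print("VmSize (virtual, like 'vmem'):", status_kb("VmSize"), "kB")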

Reduce Membase quota per bucket to 5 MB

On Heroku, I notice that they limit my free Memcached bucket (actually Membase) to 5 MB. However, I tried it on my own server and cannot set the bucket quota to less than 64 MB (per node, for the Memcached bucket type). For the Membase bucket type it's even more: 100 MB.
Hmm, my server has a humble amount of RAM, and I only need to allocate a very small amount to Memcached. Please advise.
Heroku is running a slightly modified version of our memcached software that lets them keep the bucket overhead very low. Unfortunately the "productized" version has some limits imposed to prevent the software from getting itself into trouble.
Especially for Membase buckets, we need at least 100 MB in order to run safely.
You may be able to reduce/eliminate these limits if you recompile the source, but that wouldn't be a supported configuration.
Perry
Sorry for the delay in getting back to this...
As with any piece of software, there are internal data structures that need RAM to run...that's what gets allocated immediately with Membase.
If you install memcached, it will use as much RAM as you configure it to use...no more, no less.
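If it helps, you can confirm that ceiling from a client by reading the server's stats: limit_maxbytes reflects whatever -m was set to. A minimal sketch, assuming pymemcache is installed and a memcached-compatible bucket is listening on localhost:11211 (both are just examples):

    # Minimal sketch: read memcached stats to see the configured memory ceiling
    # (limit_maxbytes) and how much is currently used for item storage (bytes).
    from pymemcache.client.base import Client

    client = Client(("localhost", 11211))   # example host/port
    stats = client.stats()
    print("configured memory limit (bytes):", int(stats[b"limit_maxbytes"]))
    print("bytes currently used for items:", int(stats[b"bytes"]))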