Memcached and virtual memory - memcached

According to this thread (not very reliable, I know) memcached does not use the disk, not even virtual memory.
My questions are:
Is this true?
If so, how does memcached ensure that the memory it is assigned never overflows to disk?

memcached avoids going to swap through two mechanisms:
Informing the system administrators that the machines should never go to swap. The admins can then either not configure swap space for the machine at all (which seems like a bad idea to me) or configure the memory limits of the running applications so that nothing ever goes into swap. (Not just memcached, but all applications.)
The mlockall(2) system call can be used (the -k flag) to ensure that all of the process's memory is always locked in RAM. This is mediated via the setrlimit(2) RLIMIT_MEMLOCK control, so admins would need to modify e.g. /etc/security/limits.conf to allow the memcached user account to lock far more memory than is normal. (Locked memory is limited to prevent untrusted user accounts from starving the rest of the system of free memory.)
Both these steps are reasonable assuming the point of the machine is to run memcached and perhaps very little else. This is often a fair assumption, as larger deployments will dedicate several (or many) machines to memcached.
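As a sketch of what the -k flag does under the hood, the mlockall(2) call, and the RLIMIT_MEMLOCK limit that governs it, can be exercised from Python via ctypes. The constant values and error handling below assume Linux; on a machine with a low memlock limit the call will simply fail with an error such as ENOMEM:

```python
import ctypes
import os
import resource

# Flag values from <sys/mman.h> on Linux.
MCL_CURRENT = 1  # lock all pages currently mapped
MCL_FUTURE = 2   # lock all pages mapped from now on

# Inspect the memlock limit the admin configured (soft, hard), in bytes.
soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
print("RLIMIT_MEMLOCK soft/hard (bytes):", soft, hard)

libc = ctypes.CDLL(None, use_errno=True)
if libc.mlockall(MCL_CURRENT | MCL_FUTURE) != 0:
    # Typical outcome for an unprivileged process with the default limit.
    print("mlockall failed:", os.strerror(ctypes.get_errno()))
else:
    print("all current and future pages locked in RAM")
```

Running this as an unprivileged user usually hits the failure branch, which is exactly why admins raise the limit in /etc/security/limits.conf for the memcached account.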

You configure memcached to use a fixed amount of memory. When that memory is full, memcached deletes old data to stay under the limit. It is that simple.
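The eviction policy is essentially LRU. Here is a minimal sketch of that behavior, simplified to an item count (memcached actually enforces a byte limit, tracked per slab class):

```python
from collections import OrderedDict

class LRUCache:
    """Toy memcached-style cache: fixed capacity, LRU eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # insertion order doubles as recency order

    def set(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used

    def get(self, key):
        if key not in self.items:
            return None  # cache miss
        self.items.move_to_end(key)  # mark as recently used
        return self.items[key]

cache = LRUCache(2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")         # "a" is now the most recently used
cache.set("c", 3)      # over capacity: evicts "b", the least recently used
print(cache.get("b"))  # None
print(cache.get("a"))  # 1
```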

Related

Can an application running on a system with a paging memory management system directly access the physical memory?

In the context of operating systems, does an application have direct access to primary memory?
Speaking from a Linux perspective, assuming we're talking about user-level (non-root) processes, then no, they can't directly access physical memory. Nor should they, for security and functionality reasons. The whole point of paging is to abstract away physical memory from applications, so that each application thinks it has the entire address space to itself while, behind the scenes, its memory may or may not be resident in physical memory (see: page faults, non-contiguous allocation, page replacement policies).
For root processes though, there is at least one way I know of: through /dev/mem. This discussion mentions how to mmap /dev/mem to get access to specific physical addresses. Use at your own risk though.
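Mapping /dev/mem itself needs root (and often a kernel without CONFIG_STRICT_DEVMEM), so as a safe stand-in, here is the same mmap(2) mechanism applied to an ordinary temporary file; for the /dev/mem trick, only the file descriptor and offset would differ:

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"hello, physical world")
    with mmap.mmap(fd, 0) as m:   # map the whole file, shared read/write
        print(m[:5])              # b'hello'
        m[0:5] = b"HELLO"         # writes go straight through the mapping
    with open(path, "rb") as f:
        print(f.read())           # the file now reflects the mapped write
finally:
    os.close(fd)
    os.unlink(path)
```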

Can we ever have 0 page fault rate with an infinite, or an absurd amount of it?

I have an assignment for my Operating Systems course. One of the questions asks me to explain why it is or is not possible to have a zero page-fault rate. Can a real system have enough RAM that it has no page faults at all?
I was thinking that maybe if we had an infinite amount of RAM, there would be no need for virtual memory, and thus no page faults. I came to this conclusion because page faults happen when a process requests a memory page that is in virtual memory but not in physical memory. Maybe with an infinite amount of RAM, all the memory a process needs would be in physical memory, and there would be no need for paging.
Yes, you can. There are situations where we do not tolerate page faults, where any page fault is fatal. For starters, interrupt handlers may not page fault because they may not wait.
Besides that, sometimes the specification reads "must respond in 1/60th of a second" where the consequence of not responding is bad things happen. Depending on the severity of the consequences, we may go way out of our way to ensure page faults do not happen once initialized.
Yes, this means having enough RAM, but that alone will not suffice. There are system calls for locking pages into RAM so that they can never be evicted, because otherwise the OS would reclaim idle RAM in favor of disk cache. When we can't tolerate that behavior ...
Some embedded operating systems can't even page.
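On Linux you can watch a process's own page-fault counters through getrusage(2); a steady-state workload with enough RAM (and locked pages) can indeed show zero major faults, which is the situation the question asks about. A small sketch:

```python
import resource

# ru_minflt: faults satisfied without disk I/O (e.g. demand-zero pages).
# ru_majflt: faults that required reading from disk -- the expensive kind.
usage = resource.getrusage(resource.RUSAGE_SELF)
print("minor faults (no disk I/O):", usage.ru_minflt)
print("major faults (disk I/O):   ", usage.ru_majflt)
```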

Optimized environment for mongo

I have a RHEL Linux server (VM) with a 4-core processor and 8 GB RAM, running the applications below:
- an Apache Karaf container
- an Apache Tomcat server
- an ActiveMQ server
- and the mongod server (either primary or secondary).
Often I see that mongo consumes nearly 80% of the CPU. My CPU and memory usage are overshooting most of the time, which has caused me to doubt whether my hardware configuration is too low for running this many components.
Please let me know if it is OK to run mongo like this on a shared server.
The question is too broad and the answer depends on too many variables, but I'll try to give you an overall sense of it.
Can you use all these services together on the same machine at minimum load? For sure. It's not clear where the other shards reside, but it will work either way. You didn't provide your HDD specs, which are quite important for a DB server, but again, it will work at minimum load.
Can you use this setup under heavy load? Not the best idea. It's probably better to have separate servers handling these services.
Monitor overall server load: CPU, memory, IO. Check the mongo logs for slow queries. If your queries are supposed to run fast and they don't, you'll need more hardware.
Nobody can really tell you how much load a specific server configuration can handle. You need at least 512 MB RAM and 1 CPU to get going these days, but you will hit the limits very soon. It all depends on how many users you have, what kinds of queries they run and how much data they cover.
Can you run MongoDB alongside other applications on a single server? It would appear that if you are having memory or CPU issues in your current configuration, then you will likely need to address something. But "can you?": if it is not going to affect you, then of course you can.
Should you do this? Most people would firmly agree that you should not, and that also stands for most of the other applications you are running on the one machine.
There are various reasons: process isolation, resource allocation, security, and far too many others for a short answer to go into. Where it becomes a problem, you should address the issue by seeking a new configuration.
As for Mongo alone: most people would not think twice about running their SQL database on dedicated hardware, and the choice for Mongo should be no different.
I have also suggested this be moved to Server Fault, as it is not a programming question suited to Stack Overflow.

Need of Virtual memory? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
I was recently asked: in a computer system, if the primary memory (RAM) is comparable in size to the secondary memory (HDD), is there a need for virtual memory to be implemented in such a system?
Since paging and segmentation require context switching, which is pure processing overhead, would the benefits of virtual memory outweigh the processing overhead it requires?
Can someone help me with this question ?
Thank you
It is true that with virtual memory, your programs can commit (i.e. allocate) a total of more memory than is physically available. However, this is only one of many benefits of having virtual memory, and it's not even the most important one. Personally, when I use a PC, I periodically check the task manager to see how close I come to using my actual RAM. If I constantly go over, I go and buy more RAM.
The key attribute of all OSes that use virtual memory is that every process has its own isolated address space. That means you can have a machine with 1 GB of RAM and 50 running processes, but each one will still have 4 GB of addressable memory space (assuming a 32-bit OS). Why is that important? It's not that you can "fake things out" and use RAM that isn't there; as soon as you go over and swapping starts, the virtual memory manager will begin thrashing and performance will come to a halt. A much more important implication is that if each program has its own address space, there's no way it can write to a random memory location and affect another program.
That's the main advantage: stability/reliability. In Windows 95, you could write an application that would crash the entire operating system. In W2K and later, it is simply impossible to write a program that paves all over its own address space and crashes anything other than itself.
There are a few other advantages as well. When executables and DLLs are loaded into RAM, the virtual memory manager can detect when the same binary is loaded more than once and make multiple processes share the same physical RAM. At the virtual memory level, it appears as if each process has its own copy, but at a lower level it all gets mapped to one spot. This speeds up program startup and also optimizes memory usage, since each DLL is only loaded once.
Virtual memory managers also allow you to perform file I/O by simply mapping files to pages in the virtual address space. Besides being an interesting alternative way of working with files, this also allows for shared memory segments, where physical RAM with read/write pages is intentionally shared between processes for extremely efficient inter-process communication (IPC).
With all these benefits, if we consider that most of the time you still want more physical RAM than your total commit size, and that modern CPUs have support for virtual address mapping built directly into the hardware, the overhead of a virtual memory manager is actually very minimal. On the other hand, in environments where many applications from many different vendors run concurrently, process address space isolation is priceless.
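The shared-memory IPC mentioned above can be sketched with Python's multiprocessing.shared_memory module (3.8+): two handles end up mapped onto the same physical pages. In real IPC the second attach would happen in another process, looked up by the segment's name:

```python
from multiprocessing import shared_memory

# Create a named shared segment; the OS backs it with real pages.
shm = shared_memory.SharedMemory(create=True, size=32)
try:
    shm.buf[:5] = b"hello"

    # A second attachment to the same segment, as another process would do.
    other = shared_memory.SharedMemory(name=shm.name)
    print(bytes(other.buf[:5]))  # b'hello' -- same physical pages
    other.close()
finally:
    shm.close()
    shm.unlink()  # remove the segment once all attachments are done
```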
I'm going to dump my understanding of this matter, with absolutely no background credentials to back it up. Gonna get downvoted? :)
First up, by saying primary memory is comparable to secondary memory, I assume you mean in terms of space. (After all, accessing RAM is faster than accessing storage.)
Now, as I understand it,
Random access memory is limited by address space, the range of addresses in which the operating system can store things. A 32-bit operating system is limited to roughly 4 GB of RAM, while 64-bit operating systems can theoretically address 16 exabytes of RAM, although Windows 7 limits it to 192 GB for the Ultimate edition and Windows Server 2008 R2 supports up to 2 TB.
Of course, there are still multiple factors, such as
the cost to manufacture RAM (8 GB on a single DIMM is still in the hundreds)
the number of DIMM slots on motherboards (I've seen boards with 4 slots)
But for the purpose of this discussion let us ignore these limitations, and talk just about space.
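The address-space arithmetic above can be sanity-checked quickly:

```python
# 32-bit and 64-bit address-space sizes, in familiar units.
GiB = 2**30
EiB = 2**60
print(2**32 // GiB)  # 4  -> a 32-bit address space spans 4 GiB
print(2**64 // EiB)  # 16 -> a 64-bit address space spans 16 EiB in theory
```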
Let us talk about how applications deal with memory nowadays. An application does not know how much memory exists; for the most part, it simply requisitions memory from the operating system. The operating system is responsible for managing which address ranges have been allocated to each running application. If it does not have enough, well, bad things happen.
But, surely with theoretical 2EXABYTES of RAM, you'd never run out?
Well, a famous person long ago supposedly said we'd never need more than 640 KB of RAM.
Because most applications nowadays are greedy (they take as much as the operating system is willing to give), if you ran enough applications on a powerful enough computer, you could theoretically exceed the limits of physical memory. In that case, virtual memory would be required to make up the extra required memory.
So, to answer your question: (in my humble opinion, formed from limited knowledge of the matter) yes, you'd still need to implement virtual memory.
Obviously take all this and do your own research. I'm turning this into a community wiki so others can edit it or just delete it if it is plain wrong :)
Virtual memory working
It may not answer your whole question, but it seems like the answer to me.

Is Virtual memory really useful all the time?

Virtual memory is a good concept, used by modern operating systems. But I got stuck answering a question and was not sure enough about it. Here is the question:
Suppose there are only a few applications running on a machine, such that the physical memory of the system is more than the memory required by all the applications. To support virtual memory, the OS needs to do a lot of work. So if the running applications all fit in physical memory, is virtual memory really needed?
(Furthermore, the applications running together will always fit in RAM.)
Even when the memory usage of all applications fits in physical memory, virtual memory is still useful. VM can provide these features:
Privileged memory isolation (no app can touch the kernel or memory-mapped hardware devices)
Interprocess memory isolation (one app can't see another app's memory)
Static memory addresses (e.g. every app has main() at address 0x08000000)
Lazy memory (e.g. pages in the stack are allocated and set to zero when first accessed)
Redirected memory (e.g. memory-mapped files)
Shared program code (if more than one instance of a program or library is running, its code only needs to be stored in memory once)
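The "lazy memory" point can be seen with an anonymous mapping: the kernel hands out address space immediately but only backs a page with RAM, zero-filled, when it is first touched. A Linux-flavored sketch:

```python
import mmap

size = 64 * 1024 * 1024  # 64 MiB of address space; almost no RAM used yet
m = mmap.mmap(-1, size)  # anonymous, demand-zero mapping

print(m[0])   # 0  -- first touch faults in a fresh zero page
m[0] = 42     # now the first page is truly resident
print(m[0])   # 42
m.close()
```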
While not strictly needed in this scenario, virtual memory is about more than just providing "more" memory than is physically available (swapping). For example, it helps avoid memory fragmentation (from an application's point of view) and, depending on how dynamic/shared libraries are implemented, it can avoid relocation (relocation is when the dynamic linker has to adapt pointers in a library or executable that was just loaded).
A few more points to consider:
Buggy apps that don't handle failures in the memory allocation code
Buggy apps that leak allocated memory
Virtual memory reduces the severity of these bugs.
The other replies list valid reasons why virtual memory is useful, but I would like to answer the question more directly: no, virtual memory is not needed in the situation you describe, and not using virtual memory can be the right trade-off in such situations.
Seymour Cray took the position that "virtual memory leads to virtual performance," and most (all?) Cray vector machines lacked virtual memory. This usually leads to higher performance at the process level (no translations needed, processes are contiguous in RAM) but can lead to poorer resource usage at the system level (the OS cannot utilize RAM fully, since it gets fragmented at the process level).
So if a system is targeting maximum performance (as opposed to maximum resource utilization), skipping virtual memory can make sense.
When you experience the severe performance (and stability) problems often seen on modern Unix-based HPC cluster nodes when users oversubscribe RAM and the system starts to page to disk, there is a certain sympathy for the Cray model, where a process either starts and runs at maximum performance, or it doesn't start at all.