Total memory consumption of the system - windows-xp

Is it correct to assume that the total memory consumption (virtual + physical) of a system is sum of "Memory Usage" and "VM Size" columns shown by the task manager in windows?

Read these posts by Mark Russinovich:
http://blogs.technet.com/markrussinovich/archive/2008/07/21/3092070.aspx
http://blogs.technet.com/markrussinovich/archive/2008/11/17/3155406.aspx

In modern Windows there really is no single truth about "Total Memory Consumption". It depends of course on the definition, but the real question is what you want to do with the answer.
Some processes like SQL-Server tend to use every byte of memory they can get their hands on, if you let them. The .NET CLR garbage collector monitors memory use and acts accordingly, trying to free more memory when it gets scarce.
So for instance you can have a system with 8 GB of physical memory, of which 90% is "used". How much of that memory is actually needed, is very hard to say. The same system may run on a 4 GB machine with no noticeable performance loss or any other issues.
If you want to explore some of the complexities of memory management under Windows, download "VMMap v2.0" from the former sysinternals site. It shows very detailed memory usage per process and may aid you in your quest.
To quote from VMMaps Help:
VMMap categorizes memory into one of several types:
Image
The memory represents an executable file such as a .exe or .dll. The Details column shows the file's path.
Private
Private memory cannot be shared with other processes, is charged against the system commit limit, and typically contains application data.
Shareable
Shareable memory can be shared with other processes, is charged against the system commit limit and typically contains data shared between DLLs in different processes or inter-process communication messages. The Windows APIs refer to this type of memory as pagefile-backed sections.
Mapped File
The memory represents a file on disk and the Details column shows the file's path. Mapped files typically contain application data.
Heap
Heaps represent memory managed by the user-mode heap manager and, like Private memory, is charged against the system commit limit and contains application data.
Managed Heap
Managed heap represents memory that's allocated and used by the .NET garbage collector.
Stack
Stacks are memory used to store function parameters, local function variables and function invocation records for individual threads. Stacks are charged agains the commit limit and typically grow on demand.
System
System memory is kernel-mode physical memory associated with the process. The vast majority of System memory consists of the process page tables.
Free
Free memory regions are spaces in the process address space that are not allocated.
Now you just need to define what types of memory you consider as "used", add these up for all processes, remove multiple duplicates and look at the number... There is a reason why in task manager or other tools, there is no single number labeled "Total Memory Consumption" :-)

No, physical memory and virtual memory may overlap. If a page of memory is in virtual memory and then paged in to physical memory the virtual memory is not necessarily freed, it may be reserved for when the page gets paged out again.

Related

Memory Address Translation in OS

Is Memory address translation only useful when the total size of virtual memory
(summed over all processes) needs to be larger than physical memory?
Basically, the size of virtual memory depends on what you call "virtual memory". If you call virtual memory the virtual memory of one process then virtual memory has the same size than physical memory. If you call virtual memory the whole virtual memory of all processes than virtual memory can (technically) have an infinite size. This is because every process can have a whole address space. The virtual address space of one process cannot be bigger than physical memory because the processor has limited bits to address this memory. In modern long mode the processor has only 48 bits to address RAM at the byte level. This gives a very big amount of RAM but most systems will have 8GB to 32GB.
Technically on a 8GB RAM computer, every process could have 8GB allocated. I say technically because eventually, the computer will constantly be removing page frames from RAM and that will put too much overhead on the OS and on the computer which will make your system freeze. In the end, the size of the sum of the virtual memory of every process is limited by the capacity of your system (and OS) to have an efficient page swapping algorithm (and on your willingness to have a slow system).
Now to answer your question, paging (virtual memory) is used also to avoid fragmentation and for securing the system. With the old segmentation model, fragmentation was an issue because you had to run a complex algorithm to determine which part of memory a process gets. With paging, the smallest granularity of memory is 4KB. This makes everything much easier because a small process just gets a 4KB page and the process can work in that page the way it wants. While a bigger process will get several pages and can allocate more pages by doing a system call. There is still the issue of external fragmentation but it is mostly due to latency of accessing high memory vs low memory. Basically, paging solves the issue of external fragmentation because a process can get a page anywhere (where it's available) and it will not make a difference (except for high vs low memory). There is still the issue of internal fragmentation with paging.
Paging also secures the system. With segmentation you had several levels of ring protection. With paging you have only user or supervisor. With segmentation, the memory is not well protected because one process can access the memory of another process in the same segment. With paging, there are 2 different protections. The first protection is the ring itself (user vs supervisor) the second are the page tables. The page tables isolate one process from another because the memory accesses are translated to other positions in RAM. It is the job of the OS to fill the page tables properly so that one process doesn't have access to the physical memory allocated to another process. The user vs supervisor bit in the page tables, prevent one process from accessing the kernel except via a system call interface (the instruction syscall in assembly for x86).

Are pages only secondary memory like hard drives, or are they used for RAM too?

I'm confused by this.
Are pages only memory units that exist in secondary memory or do they also exist in RAM too?
A memory page is the smallest unit of memory used by a virtual memory manager. A page can be backed by physical RAM, or by swap space or a page file on a hard drive. Pages backed by RAM have much faster IO, but as RAM gets full the OS may have to swap out pages to the hard drive.
Pages do not exist [physically] at all. A page is simply a redirection mechanism.
The operating system sets up of linear, logical address space for each process. The logical address space is organized into pages that in turn may map to:
A physical page frame of memory
No where
Somewhere on disk and managed by the operating system.
Paging is a memory management scheme by which a computer stores and retrieves data from secondary storage for use in main memory. Pages are used in RAM too, as a solution of external fragmentation.External fragmentation is a situation when total free space is enough to hold another process but space available is not contiguous. Compaction is one of the solution but for processes which are run-time loaded only. So, Paging is the true solution for external fragmentation where we implement page table which gives illusion that process has been given contiguous memory. Every address from CPU is broken down to page number and offset.

Where are multiple stacks and heaps put in virtual memory?

I'm writing a kernel and need (and want) to put multiple stacks and heaps into virtual memory, but I can't figure out how to place them efficiently. How do normal programs do it?
How (or where) are stacks and heaps placed into the limited virtual memory provided by a 32-bit system, such that they have as much growing space as possible?
For example, when a trivial program is loaded into memory, the layout of its address space might look like this:
[ Code Data BSS Heap-> ... <-Stack ]
In this case the heap can grow as big as virtual memory allows (e.g. up to the stack), and I believe this is how the heap works for most programs. There is no predefined upper bound.
Many programs have shared libraries that are put somewhere in the virtual address space.
Then there are multi-threaded programs that have multiple stacks, one for each thread. And .NET programs have multiple heaps, all of which have to be able to grow one way or another.
I just don't see how this is done reasonably efficient without putting a predefined limit on the size of all heaps and stacks.
I'll assume you have the basics in your kernel done, a trap handler for page faults that can map a virtual memory page to RAM. Next level up, you need a virtual memory address space manager from which usermode code can request address space. Pick a segment granularity that prevents excessive fragmentation, 64KB (16 pages) is a good number. Allow usermode code to both reserve space and commit space. A simple bitmap of 4GB/64KB = 64K x 2 bits to keep track of segment state gets the job done. The page fault trap handler also needs to consult this bitmap to know whether the page request is valid or not.
A stack is a fixed size VM allocation, typically 1 megabyte. A thread usually only needs a handful of pages of it, depending on function nesting level, so reserve the 1MB and commit only the top few pages. When the thread nests deeper, it will trip a page fault and the kernel can simply map the extra page to RAM to allow the thread to continue. You'll want to mark the bottom few pages as special, when the thread page faults on those, you declare this website's name.
The most important job of the heap manager is to prevent fragmentation. The best way to do that is to create a lookaside list that partitions heap requests by size. Everything less than 8 bytes comes from the first list of segments. 8 to 16 from the second, 16 to 32 from the third, etcetera. Increasing the size bucket as you go up. You'll have to play with the bucket sizes to get the best balance. Very large allocations come directly from the VM address manager.
The first time an entry in the lookaside list is hit, you allocate a new VM segment. You subdivide the segment into smaller blocks with a linked list. When such an allocation is released, you add the block to the list of free blocks. All blocks have the same size regardless of the program request so there won't be any fragmentation. When the segment is fully used and no free blocks are available you allocate a new segment. When a segment contains nothing but free blocks you can return it to the VM manager.
This scheme allows you to create any number of stacks and heaps.
Simply put, as your system resources are always finite, you can't go limitless.
Memory management always consists of several layers each having its well defined responsibility. From the perspective of the program, the application-level manager is visible that is usually concerned only with its own single allocated heap. A level above could deal with creating the multiple heaps if needed out of (its) one global heap and assigning them to subprograms (each with its own memory manager). Above that could be the standard malloc()/free() that it uses and above those the operating system dealing with pages and actual memory allocation per process (it is basically not concerned not only about multiple heaps, but even user-level heaps in general).
Memory management is costly and so is trapping into the kernel. Combining the two could impose severe performance hit, so what seems to be the actual heap management from the application's point of view is actually implemented in user space (the C runtime library) for the sake of performance (and other reason out of scope for now).
When loading a shared (DLL) library, if it is loaded at program startup, it will of course be most probably loaded to CODE/DATA/etc so no heap fragmentation occurs. On the other hand, if it is loaded at runtime, there's pretty much no other chance than using up heap space.
Static libraries are, of course, simply linked into the CODE/DATA/BSS/etc sections.
At the end of the day, you'll need to impose limits to heaps and stacks so that they're not likely to overflow, but you can allocate others.
If one needs to grow beyond that limit, you can either
Terminate the application with error
Have the memory manager allocate/resize/move the memory block for that stack/heap and most probably defragment the heap (its own level) afterwards; that's why free() usually performs poorly.
Considering a pretty large, 1KB stack frame on every call as an average (might happen if the application developer is unexperienced) a 10MB stack would be sufficient for 10240 nested call -s. BTW, besides that, there's pretty much no need for more than one stack and heap per thread.

How to keep 32 bit mongodb memory usage down on changing dataset

I'm using MongoDB on a 32 bit production system, which sucks but it's out of my control right now. The challenge is to keep the memory usage under ~2.5GB since going over this will cause 32 bit systems to crash.
According to the mongoDB team, the best way to track the memory usage is to use your operating system's process tracking system (i.e. ps or htop on Unix systems; Process Explorer on Windows.) for virtual memory size.
The DB mainly consists of one table which is continually cycling data, i.e. receiving data at regular intervals from sensors, and every day a cron job wipes all data from before the last 3 days. Over a period of time, the memory usage slowly increases. I took some notes over time using db.serverStats(), db.lectura.totalSize() and ps, shown in the chart below. Note that the size of the table in question has reduced in the last month but the memory usage increased nonetheless.
Now, there is some scope for adjustment in how many days of data I store. Today I deleted basically half of the data, and then restarted mongodb, and yet the mem virtual / mem mapped and most importantly memory usage according to ps have hardly changed! Why do these not reduce when I wipe data (and restart)? I read some other questions where people said that mongo isn't really using all the memory that it might appear to be using, and that you can't clear the cache or limit memory use. But then how can I ensure I stay under the 2.5GB limit?
Unless there is a way to stem this dataset-size-irrespective gradual increase in memory usage, it seems to me that the 32-bit version of Mongo is unuseable. Note: I don't mind losing a bit of performance if it solves the problem.
To answer regarding why the mapped and virtual memory usage does not decrease with the deletes, the mapped number is actually what you get when you mmap() the entire set of data files. This does not shrink when you delete records, because although the space is freed up inside the data files, they are not themselves reduced in size - the files are just more empty afterwards.
Virtual will include journal files, and connections, and other non-data related memory usage also, but the same principle applies there. This, and more, is described here:
http://www.mongodb.org/display/DOCS/Checking+Server+Memory+Usage
So, the 2GB storage size limitation on 32-bit will actually apply to the data files whether or not there is data in them. To reclaim deleted space, you will have to run a repair. This is a blocking operation and will require the database to be offline/unavailable while it was run. It will also need up to 2x the original size in terms of free disk space to be able to run the repair, since it essentially represents writing out the files again from scratch.
This limitation, and the problems it causes, is why the 32-bit version should not be run in production, it is just not suitable. I would recommend getting onto a 64-bit version as soon as possible.
By the way, neither of these figures (mapped or virtual) actually represents your resident memory usage, which is what you really want to look at. The best way to do this over time is via MMS, which is the free monitoring service provided by 10gen - it will graph virtual, mapped and resident memory for you over time as well as plenty of other stats.
If you want an immediate view, run mongostat and check out the corresponding memory columns (res, mapped, virtual).
In general, when using 64-bit builds with essentially unlimited storage, the data will usually greatly exceed the available memory. Therefore, mongod will use all of the available memory it can in terms of resident memory (which is why you should always have swap configured to the OOM Killer does not come into play).
Once that is used, the OS does not stop allocating memory, it will just have the oldest items paged out to make room for the new data (LRU). In other words, the recycling of memory will be done for you, and the resident memory level will remain fairly constant.
Your options for stretching 32-bit are limited, but you can try some things. The thing that you run out of is address space, and the increases in the sizes of additional database files mean that you would like to avoid crossing over the boundary from "n" files to "n+1". It may be worth structuring your data into more or fewer databases so that you can get the maximum amount of actual data into memory and as little as possible "dead space".
For example, if your database named "mydatabase" consists of the files mydatabase.ns (the namespace file) at 16 MB, mydatabase.0 at 64 MB, mydatabase.1 at 128 MB and mydatabase.2 at 256 MB, then the next file created for this database will be mydatabase.3 at 512 MB. If instead of adding to mydatabase you instead created an additional database "mynewdatabase" it would start life with mynewdatabase.ns at 16 MB and mynewdatabase.0 at 64 MB ... quite a bit smaller than the 512 MB that adding to the original database would be. In fact, you could create 4 new databases for less space than would be consumed by adding a new file to the original database, and because the files are smaller they would be easier to fit into contiguous blocks of memory.
It is a well-known message that 32-bit should not be used for production.
Use 64-bit systems.
Point.

What's the difference between "virtual memory" and "swap space"?

Can any one please make me clear what is the difference between virtual memory and swap space?
And why do we say that for a 32-bit machine the maximum virtual memory accessible is 4 GB only?
There's an excellent explantation of virtual memory over on superuser.
Simply put, virtual memory is a combination of RAM and disk space that running processes can use.
Swap space is the portion of virtual memory that is on the hard disk, used when RAM is full.
As for why 32bit CPU is limited to 4gb virtual memory, it's addressed well here:
By definition, a 32-bit processor uses
32 bits to refer to the location of
each byte of memory. 2^32 = 4.2
billion, which means a memory address
that's 32 bits long can only refer to
4.2 billion unique locations (i.e. 4 GB).
There is some confusion regarding the term Virtual Memory, and it actually refers to the following two very different concepts
Using disk pages to extend the conceptual amount of physical memory a computer has - The correct term for this is actually Paging
An abstraction used by various OS/CPUs to create the illusion of each process running in a separate contiguous address space.
Swap space, OTOH, is the name of the portion of disk used to store additional RAM pages when not in use.
An important realization to make is that the former is transparently possible due to the hardware and OS support of the latter.
In order to make better sense of all this, you should consider how the "Virtual Memory" (as in definition 2) is supported by the CPU and OS.
Suppose you have a 32 bit pointer (64 bit points are similar, but use slightly different mechanisms). Once "Virtual Memory" has been enabled, the processor considers this pointer to be made as three parts.
The highest 10 bits are a Page Directory Entry
The following 10 bits are a Page Table Entry
The last 12 bits make up the Page Offset
Now, when the CPU tries to access the contents of a pointer, it first consults the Page Directory table - a table consisting of 1024 entries (in the X86 architecture the location of which is pointed to by the CR3 register). The 10 bits Page Directory Entry is an index in this table, which points to the physical location of the Page Table. This, in turn, is another table of 1024 entries each of which is a pointer in physical memory, and several important control bits. (We'll get back to these later). Once a page has been found, the last 12 bits are used to find an address within that page.
There are many more details (TLBs, Large Pages, PAE, Selectors, Page Protection) but the short explanation above captures the gist of things.
Using this translation mechanism, an OS can use a different set of physical pages for each process, thus giving each process the illusion of having all the memory for itself (as each process gets its own Page Directory)
On top of this Virtual Memory the OS may also add the concept of Paging. One of the control bits discussed earlier allows to specify whether an entry is "Present". If it isn't present, an attempt to access that entry would result in a Page Fault exception. The OS can capture this exception and act accordingly. OSs supporting swapping/paging can thus decide to load a page from the Swap Space, fix the translation tables, and then issue the memory access again.
This is where the two terms combine, an OS supporting Virtual Memory and Paging can give processes the illusion of having more memory than actually present by paging (swapping) pages in and out of the swap area.
As to your last question (Why is it said 32 bit CPU is limited to 4GB Virtual Memory). This refers to the "Virtual Memory" of definition 2, and is an immediate result of the pointer size. If the CPU can only use 32 bit pointers, you have only 32 bit to express different addresses, this gives you 2^32 = 4GB of addressable memory.
Hope this makes things a bit clearer.
IMHO it is terribly misleading to use the concept of swap space as equivalent to virtual memory. VM is a concept much more general than swap space. Among other things, VM allows processes to reference virtual addresses during execution, which are translated into physical addresses with the support of hardware and page tables. Thus processes do not concern about how much physical memory the system has, or where the instruction or data is actually resident in the physical memory hierarchy. VM allows this mapping. The referenced item (instruction or data) may be resident in L1, or L2, or RAM, or finally on disk, in which case it is loaded into main memory.
Swap space it is just a place on secondary memory where pages are stored when they are inactive. If there is no sufficient RAM, the OS may decide to swap-out pages of a process, to make room for other process pages. The processor never ever executes instruction or read/write data directly from swap space.
Notice that it would be possible to have swap space in a system with no VM. That is, processes that directly access physical addresses, still could have portions of it on
disk.
Though the thread is quite old and has already been answered. Still would like to share this link as this is the simplest explanation I have found so far. Below link has got diagrams for better visualization.
Key Difference: Virtual memory is an abstraction of the main memory. It extends the available memory of the computer by storing the inactive parts of the content RAM on a disk. Whenever the content is required, it fetches it back to the RAM. Swap memory or swap space is a part of the hard disk drive that is used for virtual memory. Thus, both are also used interchangeably.
Virtual memory is quiet different from the physical memory. Programmers get direct access to the virtual memory rather than physical memory. Virtual memory is an abstraction of the main memory. It is used to hide the information of the real physical memory of the system. It extends the available memory of the computer by storing the inactive parts of the RAM's content on a disk. When the content is required, it fetches it back to the RAM. Virtual memory creates an illusion of a whole address space with addresses beginning with zero. It is mainly preferred for its optimization feature by which it reduces the space requirements. It is composed of the available RAM and disk space.
Swap memory is generally called as swap space. Swap space refers to the portion of the virtual memory which is reserved as a temporary storage location. Swap space is utilized when available RAM is not able to meet the requirement of the system’s memory. For example, in Linux memory system, the kernel locates each page in the physical memory or in the swap space. The kernel also maintains a table in which the information regarding the swapped out pages and pages in physical memory is kept.
The pages that have not been accessed since a long time are sent to the swap space area. The process is referred to as swapping out. In case the same page is required, it is swapped in physical memory by swapping out a different page. Thus, one can conclude that swap memory and virtual memory are interconnected as swap memory is used for the technique of virtual memory.
difference-between-virtual-memory-and-swap-memory
"Virtual memory" is a generic term. In Windows, it is called as Paging or pagination. In Linux, it is called as Swap.