Why was I able to load 4.19 GB of memory when `perl -V:ptrsize` returned 4? - perl

I have this output:
root#hostname:/home/admin# perl -V:ptrsize
ptrsize='4';
According to this answer, ptrsize='4' means that perl is able to address 4GB of memory.
However, while loading a huge amount of data into memory, I was consistently able to load exactly 4,190,924 KiB (about 4.19 GB) before hitting an Out of memory error.
Why did it not fail at 4,000,000 KiB (4 GB) as expected?
For the sake of completeness, I checked the amount of memory used by running qx{ grep VmSize /proc/$$/status };

The limit for a 32-bit pointer is 2^32 = 4,294,967,296 bytes, properly expressed as 4 GiB but commonly called 4 GB. This is 4,194,304 KiB (the unit that VmSize reports in). You came within about 3,380 KiB (roughly 3.3 MiB) of that limit.
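The arithmetic is easy to verify; here is a quick sanity check in Python (illustrative only, with the numbers taken from the question above):

limit_kib = 2 ** 32 // 1024                # 4,194,304 KiB = 4 GiB, the most a 4-byte pointer can address
vmsize_kib = 4190924                       # VmSize reported via /proc/$$/status just before the OOM
print(limit_kib, limit_kib - vmsize_kib)   # 4194304 3380 -> only about 3.3 MiB of headroom left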

Related

Page table entries in Virtual Memory

I have a simple question regarding page table entries. Suppose that we are given a 32-bit virtual address with 4 KiB pages and a physical memory size of 2^28 bytes.
Since the page offset is 12 bits, we would have 2^20 page table entries, which would be mapped to 2^16 physical frames. But how is it possible for 2^20 entries to map to 2^16 frames? We would run out of physical frame addresses. Suppose that the process uses the full 2^20 pages; then, assuming that the whole RAM consists of memory from only this process, all 2^16 frames in RAM would contain this process's memory. Am I right to say that 2^4 of the page table entries would show that they map to disk?
Also, if the process uses only one page, then the remaining 2^20 - 1 page table entries would be invalid?
There are a lot of assumptions in your question. To begin with, you assume the page table entries are 32 bits. They could be 64 or even 128 bits.
You also assume that there are 20 bits available for indicating into page frames. Any real system is going to need some of those bits for control and protection purposes.
But how is it possible for 2^20 entries to map to 2^16 frames? We would run out of physical frame addresses.
That is the whole point of a virtual memory system. Assuming you have 2^20 pages mapped into a process but only 2^16 physical pages, then not all of the process's pages are going to be mapped to page frames at the same time.
Am I right to say that 2^4 of the page table entries would show that they map to disk?
A rationally designed virtual memory system maintains a copy of all process pages on disk somewhere. The pages get copied from disk into memory and mapped into the address space as needed. (Note that 2^4 is only the ratio 2^20 / 2^16; at any given moment up to 2^20 - 2^16 pages could be resident only on disk rather than in a page frame.)
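To make the arithmetic in this answer concrete, here is a small illustrative sketch in Python (the figures are the ones assumed in the question: byte-addressed memory, 4 KiB pages, 2^28 bytes of RAM):

PAGE_SIZE  = 4 * 1024                 # 2**12 bytes per page
VA_BITS    = 32                       # 32-bit virtual addresses
PHYS_BYTES = 2 ** 28                  # physical memory assumed in the question

offset_bits = 12                      # log2(PAGE_SIZE)
vpn_bits    = VA_BITS - offset_bits   # 20 bits of virtual page number
entries     = 2 ** vpn_bits           # 1,048,576 page table entries
frames      = PHYS_BYTES // PAGE_SIZE # 65,536 physical frames

print(entries, frames)                # far more virtual pages than frames; the rest are
                                      # unmapped or backed by disk at any given moment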
But how is it possible for 2^20 entries to map to 2^16 frames? We would run out of physical frame addresses.
The important thing to understand about virtual memory is that it's "virtual" - it's an illusion that doesn't actually (physically) exist. This allows the OS to do various tricks, like:
marking some/lots of virtual pages as "unused/not present" so that RAM isn't wasted when it's not needed (and so that programs get an error when they try to access something that doesn't exist - e.g. SIGSEGV signal).
move pages between RAM and swap space to pretend that there's more RAM than there actually is. Note that this isn't limited to "swap space on disk" - e.g. it could be memory built into some kind of device (e.g. unused memory in a video card), memory that isn't in the current machine (e.g. using a network to store data in the RAM of a different computer), and it could even be RAM in the same computer (e.g. if half the data can be compressed to half its size, then "compression as swap space" would let you store 4 MiB of data in 3 MiB of RAM).
pretend RAM was allocated by mapping the same page full of zeros everywhere, and then allocate it later (if and only if the page is written to); so that you can have a large area of zeros (e.g. a program's ".bss" section) that costs almost nothing (until/unless it's written to).
pretend that a file is mapped/loaded into memory without allocating memory and without loading the file; then allocate page(s) and load data into them if/when the data is accessed later; and also (to get more free RAM if the OS needs it for other things) free the memory if it wasn't modified, knowing that the data can simply be read from disk again if it's accessed later.
mapping the same pages of RAM into multiple processes. For example, if you have 10 processes that are all executing the same executable file; then the executable file might be stored in RAM once and then mapped into the virtual file system's caches plus mapped into 10 different processes.
"copy on write" tricks; where the same page of RAM is mapped into lots of processes (so any process can read from the page), and then if a process writes to the page the OS can allocate a new page and make a copy of the old page and replace the original (shared, read only) page with the new (not shared, writeable) copy.
If each virtual address space is 1 MiB, then you could have 100 processes (with 100 virtual address spaces and 100 MiB of total address space) where the processes only use (on average) 512 KiB of virtual address space each, so that it looks like a total of 51200 KiB of virtual memory is being used; but the computer might only have 64 KiB of RAM, where the remaining 51136 KiB of virtual memory is just trickery.
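The "pretend RAM was allocated" trick above is easy to observe. A minimal Linux-only sketch in Python (sizes are purely illustrative): reserving anonymous memory makes VmSize jump immediately, while VmRSS only grows once the pages are actually written to.

import mmap, os

def vm_stats():
    # Read VmSize/VmRSS for this process from /proc, the same way the Perl question above does
    with open(f"/proc/{os.getpid()}/status") as f:
        return [line.strip() for line in f if line.startswith(("VmSize", "VmRSS"))]

buf = mmap.mmap(-1, 512 * 1024 * 1024)   # reserve 512 MiB of anonymous, demand-zero memory
print(vm_stats())                        # VmSize is large, VmRSS is still small

for off in range(0, len(buf), 4096):
    buf[off] = 1                         # touch each page, forcing a real frame to be allocated
print(vm_stats())                        # VmRSS has now grown to roughly match the mapping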

Will all binary codes stay in memory during runtime?

Let's say if I execute a binary:
./a.out
Then will all of the code in a.out be loaded into memory every time? What if the binary is very large (say, several GB) while the process address space only has 4 GB?
The reason I'm asking is that I came across a claim that code bloat degrades performance in a virtual-memory-based system. However, I have never seen a case where a program needs to page its code in from disk. When can this situation happen?

Mongo Server Status - "Resident" memory

After starting Mongo via mongod, I ran a Mongo query that took 300 seconds. Calling db.serverStatus() on my "admin" db showed Mongo having resident memory of 1 GB. The docs explain that "resident" memory is the amount of physical RAM that Mongo uses.
Then, I re-ran the same query, but it took 8 seconds. Looking at the resident memory this time, I saw 5 GB.
The large increase in RAM, I believe, helps to explain why the query time shrank from 300 to 8 seconds, but why did the resident memory jump so quickly?
Is there some type of "warming" step recommended to prepare Mongo so as to avoid 300 second queries?
The reason behind this is that MongoDB uses the mmap functionality of the operating system. This means, at least on Linux systems, that MongoDB's memory handling is based on an operating-system facility called memory-mapped files.
Memory on Linux systems is addressed at several levels. Basically, any program sees a virtual address space of about 2 GB overall on 32-bit systems and 128 TB on 64-bit systems. This is a virtual address space, which means that this amount of memory can be addressed in 4 KB pages (a page is the individually handled unit of memory). That is why, if you start MongoDB on a 32-bit system, it will raise a warning that the database can only handle about 2 GB of data on such a system. Obviously this virtual address space is bigger than the amount of physical memory, so there is a mapping between the virtual addresses and the physical ones. Some of the virtual addresses are backed by physical memory at any given time, but the algorithm that decides which ones is on the kernel's side. Programs running on Linux can deal only with virtual addresses; if one tries to access a virtual address that is not in physical memory, a page fault occurs (you can track this in the extra_info field of the serverStatus output). (You can find a short explanation of this here.)
Accessing memory whose virtual address currently resides in physical memory is as fast as RAM; accessing a virtual address that has no physical backing at the moment means paging the data from disk into memory first, so it is only as fast as the disk's random reads. (This is what makes the difference in your case.)
There is a command in MongoDB with which you can force a collection or an index into the cache: the touch command.
If you use this command to load the data into memory before the first query, you will get the result in about 8 seconds even on the first try. Unfortunately, you cannot really force the OS to keep this data in memory forever; if other things use up the memory, the OS will page this data out again after a while.
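As an illustration, a minimal warm-up sketch using pymongo (database and collection names are hypothetical; note that touch belongs to the old MMAPv1 storage engine and was removed from later MongoDB versions, so this only applies to servers of that era):

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["mydb"]                                        # hypothetical database name

# Pull both the documents and the indexes of the collection into memory
# before running the first heavy query.
db.command("touch", "mycollection", data=True, index=True)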
If you have enough physical memory, MongoDB will keep everything (data and indexes) in memory. This is not always needed: there is a portion of the data that has to be in memory to avoid an excessive number of page faults, and that portion is the working set. You can check the size of the working set with the db.runCommand( { serverStatus: 1, workingSet: 1 } ) command.
You cannot control the paging yourself, since it happens at the OS level, but if you have enough memory the kernel will usually keep as much cached as it can. If the working set fits in memory, you are more or less fine. If some documents are accessed really rarely and there is not enough memory to keep everything resident, they will be paged out anyway.
When you run a query, several things can happen. An index can cover the query, which means no documents will be touched at all; if your query is selective in some notion, only part of the index will be touched. Unfortunately, it is really hard to define how much memory is sufficient, and the only thing you can do is monitor (the workingSet metric is an estimate). The symptoms of running out of memory can be identified; check this presentation, and use MMS.

How to keep 32 bit mongodb memory usage down on changing dataset

I'm using MongoDB on a 32 bit production system, which sucks but it's out of my control right now. The challenge is to keep the memory usage under ~2.5GB since going over this will cause 32 bit systems to crash.
According to the mongoDB team, the best way to track the memory usage is to use your operating system's process tracking tools (i.e. ps or htop on Unix systems, Process Explorer on Windows) and watch the virtual memory size.
The DB mainly consists of one table which is continually cycling data, i.e. receiving data at regular intervals from sensors, and every day a cron job wipes all data from before the last 3 days. Over a period of time, the memory usage slowly increases. I took some notes over time using db.serverStats(), db.lectura.totalSize() and ps, shown in the chart below. Note that the size of the table in question has reduced in the last month but the memory usage increased nonetheless.
Now, there is some scope for adjustment in how many days of data I store. Today I deleted basically half of the data and then restarted mongodb, and yet the mem virtual / mem mapped figures and, most importantly, the memory usage according to ps have hardly changed! Why do these not reduce when I wipe data (and restart)? I read some other questions where people said that Mongo isn't really using all the memory that it might appear to be using, and that you can't clear the cache or limit memory use. But then how can I ensure I stay under the 2.5GB limit?
Unless there is a way to stem this gradual increase in memory usage, irrespective of dataset size, it seems to me that the 32-bit version of Mongo is unusable. Note: I don't mind losing a bit of performance if it solves the problem.
To answer why the mapped and virtual memory usage does not decrease with the deletes: the mapped number is actually what you get when you mmap() the entire set of data files. This does not shrink when you delete records because, although the space is freed up inside the data files, the files themselves are not reduced in size - they are just emptier afterwards.
Virtual will include journal files, and connections, and other non-data related memory usage also, but the same principle applies there. This, and more, is described here:
http://www.mongodb.org/display/DOCS/Checking+Server+Memory+Usage
So, the 2GB storage size limitation on 32-bit will actually apply to the data files whether or not there is data in them. To reclaim deleted space, you will have to run a repair. This is a blocking operation and will require the database to be offline/unavailable while it is run. It will also need up to 2x the original size in terms of free disk space to be able to run the repair, since it essentially represents writing out the files again from scratch.
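For completeness, a minimal sketch of triggering such a repair via pymongo (the database name is hypothetical; repairDatabase existed for the MMAPv1 engine of that era and has since been removed from newer MongoDB versions):

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")

# Rewrites the data files from scratch; blocks the database while it runs and
# needs up to 2x the data size in free disk space, as described above.
client["mydb"].command("repairDatabase")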
This limitation, and the problems it causes, are why the 32-bit version should not be run in production; it is just not suitable. I would recommend getting onto a 64-bit version as soon as possible.
By the way, neither of these figures (mapped or virtual) actually represents your resident memory usage, which is what you really want to look at. The best way to do this over time is via MMS, which is the free monitoring service provided by 10gen - it will graph virtual, mapped and resident memory for you over time as well as plenty of other stats.
If you want an immediate view, run mongostat and check out the corresponding memory columns (res, mapped, virtual).
In general, when using 64-bit builds with essentially unlimited storage, the data will usually greatly exceed the available memory. Therefore, mongod will use all of the available memory it can in terms of resident memory (which is why you should always have swap configured, so that the OOM killer does not come into play).
Once that is used, the OS does not stop allocating memory, it will just have the oldest items paged out to make room for the new data (LRU). In other words, the recycling of memory will be done for you, and the resident memory level will remain fairly constant.
Your options for stretching 32-bit are limited, but you can try some things. The thing that you run out of is address space, and the increasing sizes of successive database files mean that you would like to avoid crossing over the boundary from "n" files to "n+1". It may be worth structuring your data into more, smaller databases so that you can get the maximum amount of actual data into memory and as little "dead space" as possible.
For example, if your database named "mydatabase" consists of the files mydatabase.ns (the namespace file) at 16 MB, mydatabase.0 at 64 MB, mydatabase.1 at 128 MB and mydatabase.2 at 256 MB, then the next file created for this database will be mydatabase.3 at 512 MB. If instead of adding to mydatabase you instead created an additional database "mynewdatabase" it would start life with mynewdatabase.ns at 16 MB and mynewdatabase.0 at 64 MB ... quite a bit smaller than the 512 MB that adding to the original database would be. In fact, you could create 4 new databases for less space than would be consumed by adding a new file to the original database, and because the files are smaller they would be easier to fit into contiguous blocks of memory.
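To make the space accounting above concrete, here is a small illustrative sketch in Python of the MMAPv1 file-size progression this answer describes (16 MB namespace file, then data files doubling from 64 MB; the helper name is made up):

def database_file_sizes_mb(num_data_files, ns_mb=16, first_mb=64):
    # Sizes for one database: the .ns file plus data files that double in size
    sizes = [ns_mb]
    size = first_mb
    for _ in range(num_data_files):
        sizes.append(size)
        size *= 2
    return sizes

print(database_file_sizes_mb(3))   # [16, 64, 128, 256] -> the next file for this database is 512 MB
print(database_file_sizes_mb(1))   # [16, 64]           -> a brand-new database costs only 80 MB of address space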
It is a well-known recommendation that 32-bit should not be used for production.
Use 64-bit systems.
Period.

what is the suggested number of bytes each time for files too large to be memory mapped at one time?

I am opening files using memory mapping. The files are apparently too big (6 GB on a 32-bit PC) to be mapped in one go, so I am thinking of mapping part of the file each time and adjusting the offset for the next mapping.
Is there an optimal number of bytes for each mapping or is there a way to determine such a figure?
Thanks.
There is no optimal size. With a 32-bit process, there is only 4 GB of address space total, and usually only 2 GB is available for user-mode processes. This 2 GB is then fragmented by code and data from the EXE and DLLs, heap allocations, thread stacks, and so on. Given this, you will probably not find more than 1 GB of contiguous space to map a file into memory.
The optimal number depends on your app, but I would be wary of mapping more than 512 MB into a 32-bit process. Even if you limit yourself to 512 MB, you might run into some issues depending on your application. Alternatively, if you can go 64-bit, there should be no issues mapping multiple gigabytes of a file into memory - your address space is so large that this shouldn't cause any problems.
You could use an API like VirtualQuery to find the largest contiguous space - but then you're actually forcing out-of-memory errors to occur, since you are consuming large amounts of address space.
EDIT: I just realized my answer is Windows-specific, but you didn't say which platform you are discussing. I presume other platforms have similar limiting factors for memory-mapped files.
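A minimal sketch of the windowed-mapping approach (in Python for portability; the 512 MB window size and helper names are illustrative, and on a 32-bit process you may need a smaller window as discussed above):

import mmap, os

WINDOW = 512 * 1024 * 1024            # map this much at a time; must be a multiple of
                                      # mmap.ALLOCATIONGRANULARITY so the offset stays aligned

def process_in_windows(path, consume):
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        offset = 0
        while offset < size:
            length = min(WINDOW, size - offset)
            with mmap.mmap(f.fileno(), length, access=mmap.ACCESS_READ, offset=offset) as view:
                consume(view)         # do whatever work you need on this window
            offset += length

# Example usage: just measure the bytes in each window
process_in_windows("big.file", lambda view: len(view))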
Does the file need to be memory mapped?
I've edited 8 GB video files on a 733 MHz PIII (not pleasant, but doable).