Meaning of SIZE and RSS values in prstat output - solaris

Can somebody give some clear explanation of the meaning of the SIZE and RSS values we get from prstat in Solaris?
I wrote a testing C++ application that allocates memory with new[], fills it and frees it with delete[].
As I understood, the SIZE value should be related to how much virtual memory has been "reserved" by the process, that is memory "malloced" or "newed".
That memory doesn't sum up in the RSS value unless I really use it (filling with some values). But then even if I free the memory, the RSS doesn't drop.
I don't understand what semantic I can correctly assign to those 2 values.

RSS (AFAIK reliably) represents how much physical memory a process is using. With the default Solaris memory allocator, freeing memory does nothing to RSS: free just updates some pointers and bookkeeping to mark the memory as reusable.
If you don't reuse that memory by allocating it again, it will eventually be paged out and the RSS will drop.
If you want freed memory to be returned to the OS immediately after a free, you can use the libumem allocator with its mmap backend, like this:
export LD_PRELOAD=libumem.so
export UMEM_OPTIONS=backend=mmap

Size is the total virtual memory size of the process, including all mapped files and devices. RSS should be the resident set size, but it is unreliable; you should try to get that information from pmap instead.

As a general rule, once memory is allocated to a process it is never given back to the operating system. On Unix systems, the sbrk() call is used to extend the process's address space, and there is no analogous call to go in the other direction.

Related

How does mmap() help read information at a specific offset versus regular Posix I/O

I'm trying to understand something a bit better about mmap. I recently read this portion of the accepted answer to the related Stack Overflow question "mmap and memory usage" (quoted below):
Let's say you read a 100MB chunk of data, and according to the initial
1MB of header data, the information that you want is located at offset
75MB, so you don't need anything between 1~74.9MB! You have read it
for nothing but to make your code simpler. With mmap, you will only
read the data you have actually accessed (rounded 4kb, or the OS page
size, which is mostly 4kb), so it would only read the first and the
75th MB.
I understand most of the benefits of mmap (no need for context-switches, no need to swap contents out, etc), but I don't quite understand this offset. If we don't mmap and we need information at the 75th MB offset, can't we do that with standard POSIX file I/O calls without having to use mmap? Why does mmap exactly help here?
Of course you could. You can always open a file and read just the portions you need.
mmap() can be convenient when you don't want to write said code or you need sparse access to the contents and don't want to have to write a bunch of caching logic.
With mmap(), you're "mapping" the entire contents of the file to offsets in memory. Most implementations of mmap() do this lazily, so each ~4K block of the file is read on demand, as you access those memory locations.
All you have to do is access the data in your file as if it were a huge array of chars (e.g. char c = map[750000000];), and let the OS worry about which portions of the file have been read, when to read them, how much to read, writing dirty blocks back to the file, and purging the memory to free up RAM.

iPhone/Instruments: what's about the "malloc" entries in object summary?

I'm performance-tuning my iPhone/iPad app, and it seems like not all the memory that should be freed is being freed. In Instruments, after I simulate a memory warning in the simulator, there are lots of "Malloc" entries left; what about them? Can I get rid of them? What do they mean/stand for?
Thanks a lot,
Stefan
At any time, your app will have a (huge) number of live objects, even after getting a memory warning (and the subsequent memory recovery by the operating system). So it is pretty common to see many of those malloc entries.
They are not in themselves a sign that something is wrong with memory allocation, but possibly only of the fact that your program is running.
Also have a look at this S.O. topic to learn more about the object allocation tool.
Furthermore, there are many advanced techniques you can use to detect memory allocation problems.
Here you can find a great tutorial that will let you go well beyond what the Leaks tool allows.
EDIT:
About the exact meaning of those mallocs, you have to think that you can allocate two broad classes of objects (to put it roughly): Objective-C objects that are created through the Obj-C runtime system, and "normal" C objects, that are allocated through malloc.
Many objects of the second class are allocated (without you calling malloc directly) by system libraries and by the C runtime library (think, e.g., of sockets or file handles). Those (C) objects have no type information associated with them, so Instruments simply shows you the size of the allocated memory block, without any further information available.
Often, malloc blocks are created by higher-level classes, so that when you recover the memory associated with their instances, the memory allocated through malloc is freed as well.
You should not worry specifically about them unless you see their overall size grow indefinitely over the program's execution. In that case, you first need to investigate the way you alloc/release your higher-level objects and understand where in your code things get stuck.

Is it true that CPU never fetches anything from memory directly?

I hear that the CPU just fetches instructions from the EIP register, never from memory directly.
But AFAIK, EIP just stores the address of the next instruction; the instruction itself is still in memory. If the CPU never fetches from memory, how can it know what the next instruction actually is?
UPDATE
BTW, I know there are x86, x64 and x87 architectures, but which does x86-64 belong to, x86 or x64?
The simple answer to your question is "no, it's not true".
The picture isn't very simple due to caching, instruction pipeline, branch prediction etc. However, the instruction pointer is just that, a pointer. It doesn't store opcodes.
EIP (Extended Instruction Pointer) holds the address of the instruction. It's just a way to keep track of which instruction is currently being processed (or, sometimes, which instruction to process next).
The instructions themselves are stored in memory (HDD, RAM, cache) and need to be fetched by the CPU.
Maybe what you heard meant that, since several levels of cache are generally used, it's quite rare for a fetch to need to go all the way to RAM.
Well, I'm not sure what the point of your question is.
Yes, the CPU (in a broad sense of the word) does fetch from memory. It has a number of memory-management devices (for cache-line handling and pipelining). In fact, the pipeline pulls instructions in via the L1 cache, and the instruction processor itself only fetches from there. In reality, the processor probably never even looks at EIP (unless an instruction uses it directly as an operand).
So the real answer would be: find yourself the Wikipedia article on x86 processor design and have a ball. You'll be able to know exactly what happens where.
Cheers
Not true in that way. The CPU accesses memory through the cache, so you could say it doesn't do it directly. (Also, a DMA channel can transfer data between memory and I/O without ever touching the CPU.)
Yes, CS:EIP points into memory, at the next instruction to execute, but you can also use direct addresses; for example, to load the contents of address 0x0800 into the AX register (by default this is relative to the DS segment):
MOV AX,[0x0800]

Memory leak tool tells me zero leaks but memory footprint keeps rising

I'm running through some memory profiling for my application in SDK 3.2. I used the 'Leaks' profiler to find all my memory leaks, and I plugged all of them up. This is a scrollView/navigationController application where there are tiles, and clicking a tile goes to a new view of tiles, and so on; I can go many levels deep and come all the way back to the top, and the 'Leaks' profiler says everything is cool.
However, if I watch the memory footprint in the 'ObjectAlloc' profiler the memory footprint goes up and up as I go deeper (which seems reasonable) but as I back out of the views the memory footprint doesn't go down as I'd expect.
I know this is a vague description of the app but I can't exactly post a gillion lines of code :) Also it should be noted I'm using coreData to store image data as I go so the database is growing in size as more nodes are chosen, dunno if/when that is released from memory.
What gives?
This sounds like it could be one of a few things:
Memory not given back to the OS after deallocation. This is a common design for C runtimes. When you do an allocation, the C runtime allocates more memory for its own use and returns a chunk of it for you to use. When you do a free, the C runtime simply marks it as deallocated but doesn't return it to the OS. Thus, if the leak tool is reading OS-level statistics rather than C-runtime statistics, it will fail to report a corresponding decrease in memory usage.
Misleading values reported by the leak tool. The leak tool could be looking at different values than the C runtime and reporting values that cause concern even though nothing is wrong (just as people who try to use Task Manager on Windows for detecting leaks get very confused with the results, because it is a very poor tool indeed for that job).
Fragmentation. It is possible that your application suffers from memory fragmentation: when you allocate, then deallocate, then allocate again, subsequent allocations may be larger than the "holes" left by the deallocations. When this happens you fragment the memory space, leaving unusable holes, preventing large contiguous memory blocks, and forcing the use of more and more memory space until you run out. This is a pathological condition, and the fix is typically application-specific.
I think the first of these three suggestions is most likely what is happening.
Depending on how you have your object graph constructed in Core Data, it's memory use can grow unexpectedly large.
A common mistake is to store big blobs inside a complex and often-faulted (loaded into memory) entity. This causes the big blob to be loaded into, and remain in, memory whenever any other part of the entity is referenced. As your object graph grows, it eats more and more memory unless you actively delete objects and then save the graph.
For example: you have a person entity with lots of text info, e.g. name, address, etc., as well as a large photo. If you make the photo an attribute of the person entity, it will be in memory any time the person entity is faulted. If you fetch the name attribute, the photo attribute is in memory as well.
To avoid this, blobs should go in their own entity, linked to other entities through relationships. Since relationship objects are not faulted until they are accessed directly, they can stay out of memory until needed.
Just because there are no refcount-based leaks, doesn't mean that you're not stuffing something off in a Dictionary "cache" and forgetting about it; those won't show up as leaks because there are valid references to it (the dict is still valid, and so are the refs to all its children). You also need to look for valid, yet unnecessary refs to objects.
The easiest way is to just let it run for too long, then sort object counts by type and see who has a gigantic number - then, track down the reference graph (might be hard in Obj-C?). If Instruments doesn't do this directly, you can definitely write a DTrace script to do so.
To reiterate:
char *str1 = malloc(1000);
char *str2 = malloc(1000);
.
.
.
char *str1000 = malloc(1000);
is not a memory leak, but
char *str1 = malloc(1000);
str1 = malloc(1000); /* Note! No free(str1) in between: the first block is now unreachable */
is a memory leak.
The information on Core Data memory management is good, and technically the answer by Arthur Kalliokoski is a good answer re: the difference between a leak and an object allocation. My particular problem here is related to an apparently known bug with setBackgroundImage on a button in the simulator: it creates a memory 'leak' in that it never releases the memory for the UIImage.
You can have a continuously growing program without necessarily leaking memory. Suppose you read words from the input and store them in dynamically allocated blocks of memory in a linked list. As you read more words, the list keeps growing, but all the memory is still reachable via the list, so there is no memory leak.

mmap() internals

It's widely known that the most significant feature of mmap() is that a file mapping is shared between many processes. But it's no less widely known that every process has its own address space.
The question is where memory-mapped files (more specifically, their data) are truly kept, and how processes can get access to that memory.
I mean not *(pa+i) and other high-level stuff, but I mean the internals of the process.
This happens at the virtual memory management layer in the operating system. When you memory map a file, the memory manager basically treats the file as if it were swap space for the process. As you access pages in your virtual memory address space, the memory mapper has to interpret them and map them to physical memory. When you cross a page boundary, this may cause a page fault, at which time the OS must map a chunk of disk space to a chunk of physical memory and resolve the memory mapping. With mmap, it simply does so from your file instead of its own swap space.
If you want lots of details of how this happens, you'll have to tell us which operating system you're using, as implementation details vary.
This is very implementation-dependent, but the following is one possible implementation:
When a file is first memory-mapped, the data isn't stored anywhere at first; it's still on disk. The virtual memory manager (VMM) allocates a range of virtual memory addresses in the process for the file, but those addresses aren't immediately added to the page table.
When the program first tries to read or write to one of those addresses, a page fault occurs. The OS catches the page fault, figures out that that address corresponds to a memory-mapped file, and reads the appropriate disk sector into an internal kernel buffer. Then, it maps the kernel buffer into the process's address space, and restarts the user instruction that caused the page fault. If the faulting instruction was a read, we're all done for now. If it was a write, the data is written to memory, and the page is marked as dirty. Subsequent reads or writes to data within the same page do not require reading/writing to/from disk, since the data is in memory.
When the file is flushed or closed, any pages which have been marked dirty are written back to disk.
Using memory-mapped files is advantageous for programs which read or write disk sectors in a very haphazard manner. You only read disk sectors which are actually used, instead of reading the entire file.
I'm not really sure what you are asking, but mmap() sets aside a chunk of virtual memory to hold a given amount of data (which is sometimes backed by a file).
A process is an OS entity, and it gains access to memory-mapped areas through the OS-prescribed method: calling mmap().
The kernel has internal buffers representing chunks of memory. Any given process is assigned a memory mapping in its own address space which refers to that buffer. A number of processes may have their own mappings, but they all end up resolving to the same chunk (via the kernel buffer).
This is a simple enough concept, but it can get a little tricky when processes write. To keep things simple in the read-only case there's usually a copy-on-write functionality that's only used as needed.
Any data will live in some form of memory or other: on an HDD, or, in embedded systems, possibly in flash memory or even RAM (initramfs). Barring the last case, data from storage is frequently cached in RAM. RAM is logically divided into pages, and the kernel maintains a list of descriptors that uniquely identify each page.
So at best, accessing the data means accessing physical pages. Each process gets its own address space, which consists of many vm_area_struct entries, each identifying a mapped section of the address space. In a call to mmap, a new vm_area_struct may be created, or it may be merged with an existing one if the addresses are adjacent.
A new virtual address is returned by the call to mmap. New page tables are also created, containing the mapping from the newly created virtual addresses to the physical addresses where the real data resides. The mapping can be backed by a file, or anonymous as with malloc. The process address-space structure mm_struct uses a pgd_t (page global directory) pointer to reach the physical page and access the data.