Does OS save memory values in context switch? - operating-system

I have a question about which data is saved during a context switch.
I have searched, but all the links only talk about registers.
My question is: does the OS also save the memory contents of a process?
Assume a process has defined an array starting at address 0x80000 of RAM.
When a context switch occurs, what happens to this array?
Maybe the new process overwrites address 0x80000 in memory, and when the old process resumes it has lost the array!
Can anyone explain?

In general, an operating system does not save memory in a context switch. It just changes register values. The old process's memory just stays there until the system needs it. If that happens, the memory will be paged out.
In the olde days of swapping, yes, the memory was frequently saved when a new process came in.
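To make this concrete, here is a minimal sketch of why a second process cannot clobber the first one's array. It assumes a Unix-like system with virtual memory, where each process has its own address space; the function name and the byte strings are made up for illustration. The child overwrites its copy of the array in place, and the parent's copy is unaffected even though many context switches happen in between.

```python
import os


def child_overwrite_leaves_parent_intact():
    """Fork a child that overwrites the array; report what each process sees."""
    array = bytearray(b"parent data")      # lives in this process's address space
    read_fd, write_fd = os.pipe()

    pid = os.fork()                        # child starts with the same virtual layout
    if pid == 0:                           # child process
        try:
            os.close(read_fd)
            array[:] = b"child wrote"      # in-place write to the child's private copy
            os.write(write_fd, bytes(array))
        finally:
            os._exit(0)                    # leave without running parent cleanup

    os.close(write_fd)
    child_view = os.read(read_fd, 64)      # what the child saw after its write
    os.close(read_fd)
    os.waitpid(pid, 0)                     # the child has run and exited by now
    return bytes(array), child_view
```

Calling `child_overwrite_leaves_parent_intact()` returns the parent's view of the array and the child's view. The two differ because the write happened in the child's address space only.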


"Out of memory" error for standalone matlab applications - memory fragmentation

I have to deliver an application as a standalone Matlab executable to a client. The code includes a series of calls to a function that internally creates several cell arrays.
My problem is that an out-of-memory error happens when the number of calls to this function increases in response to the increase in the user load. I guess this is low-level memory fragmentation as the workspace variables are independent from the number of loops.
As mentioned here, quitting and restarting Matlab is the only solution for this type of out-of-memory errors at the moment.
My question is how I can implement such a mechanism in a standalone application, so that it saves its data, quits and restarts itself when an out-of-memory error occurs (or when a high likelihood of such an error is somehow predicted).
Is there any best practice available?
This is a bit of a tough one. Instead of looking to restart to clear things out, could you change the code to break the work into chunks to make it more efficient? Fragmentation is mostly proportional to the peak cell-related memory usage and to how much the size of data items varies, and less to the total usage over time. If you can break a large piece of work into smaller pieces done in sequence, this can lower the "high water mark" of your fragmented memory usage. You can also save on memory usage by using "flyweight" data structures that share their backing data values, or sometimes by converting cell-based structures to reference objects or numeric codes. Can you share an example of your code and data structure with us?
In theory, you could get a clean slate by saving your workspace and relevant state out to a mat file and having the executable launch another instance of itself with an option to reload that state and proceed, and then having the original executable exit. But that's going to be pretty ugly in terms of user experience and your ability to debug it.
Another option would be to offload the high-fragmentation code into another worker process which could be killed and restarted, while the main executable process survives. If you have the Parallel Computing Toolbox (PCT), which can now be compiled into standalone Matlab executables, this would be pretty straightforward: open a worker pool of one or two workers, and run the fraggy code inside them using synchronous calls, periodically killing the workers and bringing up new ones. The workers are independent processes which start out with non-fragmented memory spaces. If you don't have PCT, you could roll your own by compiling your application as two separate apps, a driver app and a worker app, and have the driver spin up a worker and control it via IPC, passing your data back and forth as MAT files or bytestreams. That's not going to be a lot of fun to code, though.
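The driver/worker split can be sketched outside Matlab to show its shape. The following is a minimal Python illustration, not Matlab code: the worker source, the JSON-over-pipes exchange and the function names are all assumptions chosen for the example, standing in for a compiled worker app and MAT-file exchange. Each chunk runs in a brand-new process, so whatever fragmentation the chunk causes disappears when that process exits.

```python
import json
import subprocess
import sys

# Hypothetical worker: reads a JSON list of numbers on stdin and prints its sum
# and its own process ID as JSON. It stands in for the fragmentation-heavy code.
WORKER_SOURCE = """
import json, os, sys
chunk = json.load(sys.stdin)
json.dump({"result": sum(chunk), "pid": os.getpid()}, sys.stdout)
"""


def run_chunk_in_fresh_process(chunk):
    """Run one chunk in a new process that exits as soon as the chunk is done."""
    completed = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        input=json.dumps(chunk),
        capture_output=True,
        text=True,
        check=True,            # raise if the worker crashed (e.g. out of memory)
    )
    return json.loads(completed.stdout)


def process_in_chunks(data, chunk_size):
    """Driver: split the work and hand each piece to a disposable worker."""
    replies = []
    for start in range(0, len(data), chunk_size):
        replies.append(run_chunk_in_fresh_process(data[start:start + chunk_size]))
    return replies
```

For example, `process_in_chunks(list(range(10)), 4)` runs three workers in sequence. The driver keeps only the small replies, and a worker that fails can be retried without taking the driver down.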
Perhaps you could also push some of the fraggy code down into the Java layer, which handles cell-like data structures more gracefully.
Changing the code to be less fraggy in the first place is probably the simpler and easier approach, and results in a less complicated application design. In my experience it's often possible. If you share some code and data structure details, maybe we can help.
Another option is to periodically check for memory fragmentation with a function like chkmem.
You could have this function called silently from your code every couple of iterations, or use a timer object to have it called every X minutes...
The idea is to use the undocumented functions feature memstats and feature dumpmem to get the largest free memory blocks available, in addition to the largest variables currently allocated. Using that, you could make a guess as to whether there is a sign of memory fragmentation.
When fragmentation is detected, you would warn the user and instruct them how to save their current session (export to a MAT-file), restart the app, and restore the session upon restart.
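One way to turn "make a guess" into a rule is sketched below in Python. It assumes you have already extracted a list of free block sizes from the output of those undocumented functions; that parsing is not shown. The function names, the ratio and the 0.5 threshold are illustrative assumptions, not part of chkmem.

```python
def fragmentation_ratio(free_block_sizes):
    """Return 1 - largest_free_block / total_free (0.0 means one contiguous block)."""
    total_free = sum(free_block_sizes)
    if total_free == 0:
        return 0.0
    return 1.0 - max(free_block_sizes) / total_free


def should_restart(free_block_sizes, next_allocation, threshold=0.5):
    """Hypothetical policy: restart if the next allocation cannot fit in any single
    free block, or if the free memory is scattered beyond the threshold."""
    largest = max(free_block_sizes, default=0)
    too_small = largest < next_allocation
    too_scattered = fragmentation_ratio(free_block_sizes) > threshold
    return too_small or too_scattered
```

The first condition is the one that actually predicts an out-of-memory error: an array needs one contiguous block, so the total amount of free memory is irrelevant if no single block is large enough.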

Is it true that CPU never fetches anything from memory directly?

I hear that the CPU just fetches instructions from the EIP register and never fetches from memory directly.
But AFAIK, EIP just stores the address of the next instruction; the instruction itself is still in memory. If the CPU never fetches from memory, how can it know what the next instruction actually is?
BTW, I know there are x86, x64 and x87 architectures, but which one does x86-64 belong to, x86 or x64?
The simple answer to your question is "no, it's not true".
The picture isn't very simple due to caching, instruction pipeline, branch prediction etc. However, the instruction pointer is just that, a pointer. It doesn't store opcodes.
EIP (Extended Instruction Pointer) holds the address of an instruction. It's just a way to keep track of which instruction is currently being processed (or sometimes, which instruction to process next).
The instructions themselves are stored in memory (HDD, RAM, cache) and need to be fetched by the CPU.
Maybe what you heard meant that, since so many levels of cache are generally used, it's quite rare for a fetch to need to access RAM.
Well I don't know the point to your question.
Yes the CPU (in a broad sense of the word) does fetch from memory. It has a number of memory management devices (for cache line handling and pipelining). In fact, the 'pipeline' puts the instructions in L1 cache. Indeed, the instruction processor itself only fetches from there. The processor in reality probably never even looks at EIP (unless an instruction uses it directly as an operand).
So the real answer would be: find yourself a Wikipedia article on x86 processor design, and have a ball. You'll be able to know exactly what happens where.
Not true in that way. The CPU accesses memory through the cache, so you can kind of say that it does not do it directly. (Also, a DMA channel can transfer data between memory and I/O without ever touching the CPU.)
Yes, CS:EIP points to memory, to the next instruction to execute, but you can also use direct addresses. For example, this loads the contents of address 0x0800 into the AX register (by default the address is relative to the DS segment):
MOV AX,[0x0800]
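
The answers above agree that the instruction pointer holds only an address. A toy fetch-execute loop makes this concrete. The sketch below is a made-up three-instruction machine in Python, not a model of a real x86 core: the opcodes, the memory layout and the names are all assumptions for illustration. The instruction pointer is just an index, and every opcode and operand is read from memory.

```python
LOAD, ADD, HALT = 1, 2, 0    # hypothetical opcodes for the toy machine


def run(memory):
    """Execute the toy program stored in `memory` and return the accumulator."""
    ip = 0                           # instruction pointer: an address, not an instruction
    acc = 0                          # accumulator register
    while True:
        opcode = memory[ip]          # fetch the opcode from memory at address ip
        operand = memory[ip + 1]     # fetch its operand from the next cell
        ip += 2                      # ip now holds the address of the next instruction
        if opcode == LOAD:
            acc = memory[operand]    # data read through a direct address
        elif opcode == ADD:
            acc += memory[operand]
        elif opcode == HALT:
            return acc
        else:
            raise ValueError(f"unknown opcode {opcode}")


program = [
    LOAD, 6,    # addresses 0-1: acc = memory[6]
    ADD, 7,     # addresses 2-3: acc += memory[7]
    HALT, 0,    # addresses 4-5: stop
    40, 2,      # addresses 6-7: data
]
```

Here `run(program)` returns 42. Code and data sit in the same memory, and `ip` never contains an opcode. A real CPU adds caches, prefetching and pipelining between the core and RAM, but the pointer-plus-fetch principle is the same.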

What do those question marks mean in Windbg?

I'm getting an access violation in a program. Windbg shows that the program is trying to read at 0x09015000. It shows question marks (??) next to the address. My question is: what do these question marks indicate? Do they mean the memory location was never allocated, i.e. it's not backed by any physical memory (or page file)? Or is it something else?
It means that the virtual address is bad. Possibly a bogus pointer (i.e. uninitialized garbage), freed memory, etc.
Do they mean the memory location was never allocated
That's one possibility. Other options:
it was allocated before, but has been freed (VirtualFree())
it's not included in the crash dump you analyze. This may depend on the MINIDUMP_TYPE. Also, Procdump has an option ( -mp) to exclude memory regions larger than 512 MB.

mmap() internals

It's widely known that the most significant mmap() feature is that a file mapping is shared between many processes. But it's no less widely known that every process has its own address space.
The question is: where are memory-mapped files (more specifically, their data) actually kept, and how can processes get access to this memory?
I don't mean *(pa+i) and other high-level stuff; I mean the internals of the process.
This happens at the virtual memory management layer in the operating system. When you memory map a file, the memory manager basically treats the file as if it were swap space for the process. As you access pages in your virtual memory address space, the memory mapper has to interpret them and map them to physical memory. When you cross a page boundary, this may cause a page fault, at which time the OS must map a chunk of disk space to a chunk of physical memory and resolve the memory mapping. With mmap, it simply does so from your file instead of its own swap space.
If you want lots of details of how this happens, you'll have to tell us which operating system you're using, as implementation details vary.
This is very implementation-dependent, but the following is one possible implementation:
When a file is first memory-mapped, the data isn't stored anywhere in memory; it's still on disk. The virtual memory manager (VMM) allocates a range of virtual memory addresses to the process for the file, but those addresses aren't immediately added to the page table.
When the program first tries to read or write to one of those addresses, a page fault occurs. The OS catches the page fault, figures out that that address corresponds to a memory-mapped file, and reads the appropriate disk sector into an internal kernel buffer. Then, it maps the kernel buffer into the process's address space, and restarts the user instruction that caused the page fault. If the faulting instruction was a read, we're all done for now. If it was a write, the data is written to memory, and the page is marked as dirty. Subsequent reads or writes to data within the same page do not require reading/writing to/from disk, since the data is in memory.
When the file is flushed or closed, any pages which have been marked dirty are written back to disk.
Using memory-mapped files is advantageous for programs which read or write disk sectors in a very haphazard manner. You only read disk sectors which are actually used, instead of reading the entire file.
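From user space, the sequence described above looks like the following minimal sketch, written with Python's standard mmap module. The temporary file and the byte strings are made up for illustration. The slice assignment is an ordinary memory write; the page-fault handling and dirty-page tracking happen in the kernel and are not visible in the code.

```python
import mmap
import os
import tempfile


def edit_file_through_mapping():
    """Change a file's bytes by writing to memory, then read it back normally."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello world")
        with mmap.mmap(fd, 0) as mapping:   # length 0 maps the whole file
            mapping[0:5] = b"HELLO"         # plain memory write; the page becomes dirty
            mapping.flush()                 # ask the OS to write dirty pages back
        with open(path, "rb") as f:         # ordinary file I/O sees the change
            return f.read()
    finally:
        os.close(fd)
        os.remove(path)
```

No write() call ever touches the modified bytes: the program stores into memory, and the operating system carries the change back to the file.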
I'm not really sure what you are asking, but mmap() sets aside a chunk of virtual memory to hold the given amount of data. The mapping may be anonymous or backed by a file.
A process is an OS entity, and it gains access to memory-mapped areas through the OS-prescribed method: calling mmap().
The kernel has internal buffers representing chunks of memory. Any given process is assigned a memory mapping in its own address space which refers to that buffer. A number of processes may have their own mappings, but they all end up resolving to the same chunk (via the kernel buffer).
This is a simple enough concept, but it can get a little tricky when processes write. To keep things simple in the read-only case there's usually a copy-on-write functionality that's only used as needed.
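The "many mappings, one chunk" idea can be observed directly. The sketch below assumes a Unix-like system, where Python's mmap creates shared mappings by default; the function name and file contents are made up for illustration. The child opens and maps the file on its own, writes through its mapping, and the parent then reads the change through a different mapping in a different address space.

```python
import mmap
import os
import tempfile


def write_in_child_read_in_parent():
    """Two processes map the same file; a write in one is visible in the other."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"-" * mmap.PAGESIZE)
        parent_map = mmap.mmap(fd, mmap.PAGESIZE)     # shared mapping by default on Unix

        pid = os.fork()
        if pid == 0:                                  # child process
            try:
                child_fd = os.open(path, os.O_RDWR)   # opens the file independently
                child_map = mmap.mmap(child_fd, mmap.PAGESIZE)
                child_map[0:5] = b"child"             # write through the child's mapping
            finally:
                os._exit(0)                           # skip the parent's cleanup code

        os.waitpid(pid, 0)
        seen = bytes(parent_map[0:5])                 # read through the parent's mapping
        parent_map.close()
        return seen
    finally:
        os.close(fd)
        os.remove(path)
```

Neither process calls read() or write() on the file after setup. The parent sees the child's bytes because both mappings resolve to the same kernel-managed pages.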
Any data will live in some form of storage: in some cases an HDD, in embedded systems perhaps flash memory or even RAM (initramfs). Except in the last case, the stored data is frequently cached in RAM. RAM is logically divided into pages, and the kernel maintains a list of descriptors, each of which uniquely identifies a page.
So, at best, accessing data means accessing physical pages. Each process gets its own address space, which consists of many vm_area_struct structures, each identifying a mapped region of the address space. In a call to mmap, a new vm_area_struct may be created, or the mapping may be merged with an existing one if the addresses are adjacent.
The call to mmap returns a new virtual address. New page tables are also created, containing the mapping from the newly created virtual addresses to the physical addresses where the real data resides. A mapping can be backed by a file, or it can be anonymous, as with malloc. The process address space structure mm_struct uses its pgd_t pointer (page global directory) to reach the physical page and access the data.

Meaning of SIZE and RSS values in prstat output

Can somebody give some clear explanation of the meaning of the SIZE and RSS values we get from prstat in Solaris?
I wrote a testing C++ application that allocates memory with new[], fills it and frees it with delete[].
As I understood, the SIZE value should be related to how much virtual memory has been "reserved" by the process, that is memory "malloced" or "newed".
That memory doesn't count toward the RSS value unless I really use it (by filling it with some values). But then, even if I free the memory, the RSS doesn't drop.
I don't understand what meaning I can correctly assign to those two values.
RSS represents (AFAIK reliably) how much physical memory a process is using. With the default Solaris memory allocator, freeing memory doesn't change the RSS, because it just updates some pointers and values to record that the memory is free to be reused.
If you don't use that memory again by reallocating it, it will eventually be paged out and the RSS will drop.
If you want freed memory to be returned immediately after a free, you can use the Solaris mmap allocator like this:
export UMEM_OPTIONS=backend=mmap
SIZE is the total virtual memory size of the process, including all mapped files and devices. RSS should be the resident set size, but it is completely unreliable; you should try to get that information from pmap instead.
As a general rule, once memory is allocated to a process it is not given back to the operating system. On Unix systems the sbrk() call is used to extend the process's address space, and although sbrk() accepts a negative increment, allocators rarely use it to shrink the address space again.
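The difference between reserving memory and actually using it, which the question observed in SIZE and RSS, can be reproduced with a short experiment. The sketch below is an illustration under assumptions, not a Solaris tool: it reads the Linux-specific /proc/self/statm file instead of prstat, and the function names and the 32 MB figure are arbitrary. It maps anonymous memory, samples both values, touches every page, and samples again.

```python
import mmap
import os

STATM = "/proc/self/statm"   # Linux-specific; Solaris exposes this data differently


def size_and_rss_bytes():
    """Return (virtual size, resident set size) of this process in bytes."""
    with open(STATM) as f:
        size_pages, rss_pages = f.read().split()[:2]
    return int(size_pages) * mmap.PAGESIZE, int(rss_pages) * mmap.PAGESIZE


def reserve_then_touch(nbytes=32 * 1024 * 1024):
    """Map anonymous memory, then touch every page; sample SIZE and RSS each time."""
    if not os.path.exists(STATM):
        return None                          # not Linux: nothing to measure
    before = size_and_rss_bytes()
    region = mmap.mmap(-1, nbytes, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    reserved = size_and_rss_bytes()          # sampled after mapping, before any access
    for offset in range(0, nbytes, mmap.PAGESIZE):
        region[offset] = 1                   # first write to a page faults it in
    touched = size_and_rss_bytes()           # sampled after every page was written
    region.close()
    return before, reserved, touched
```

Comparing the three samples shows the virtual size growing as soon as the region is mapped, while the resident size grows only once the pages have been written. That matches the behaviour described in the question for memory that is allocated but not yet filled.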