What are the stack and heap, and when are they actually created? - operating-system

I know the basics, like how running a process loads the environment variables and parameters onto the stack, or what the rsp and rbp registers do when a function is called. But what exactly are the stack and heap, and when are they actually created when we run a process?

The stack is allocated when the process launches, with a default size. It also has a maximum size. If the stack grows past its default size, the kernel checks whether it has exceeded the maximum size: it allocates more pages to the stack if not, or kills the process if it has.
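On Linux you can observe these two sizes for yourself: the soft limit is the size the stack may grow to by default, and the hard limit is the maximum. A minimal sketch using the POSIX getrlimit() call:

```cpp
#include <cstdio>
#include <sys/resource.h>

int main() {
    rlimit rl{};
    if (getrlimit(RLIMIT_STACK, &rl) == 0) {
        // rlim_cur is the soft (default) limit, rlim_max the hard maximum;
        // either may be RLIM_INFINITY on some configurations.
        std::printf("soft stack limit: %llu bytes\n",
                    (unsigned long long)rl.rlim_cur);
        std::printf("hard stack limit: %llu bytes\n",
                    (unsigned long long)rl.rlim_max);
    }
}
```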
The heap is allocated a region of virtual memory but not of physical memory: the heap has a reserved region of virtual address space so that it can grow freely within it. When you ask for dynamic memory using the new operator in C++, the allocator may make a system call asking the kernel to back part of that reserved virtual address range with physical memory. Once a portion of physical memory has been mapped into your process, you can use it freely in your application.
Most C++ standard library implementations include code that tracks how the heap grows and tries to avoid internal fragmentation. The smallest unit of allocation in most OSes is a 4KB page. If you ask for, say, 1024 bytes, one whole page is allocated to your process, so some memory is lost to internal fragmentation. The standard library implementation therefore keeps track of the allocated pages for you, so that when you ask for more memory it can serve the request from the same page instead of requesting new pages for nothing.
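A rough way to see this reuse in action (addresses are allocator- and platform-dependent, so treat this only as an illustration): two small requests typically come back from the same already-allocated region rather than triggering separate kernel requests.

```cpp
#include <cstdio>
#include <cstdlib>

int main() {
    // Two small requests usually land close together: the allocator carves
    // them out of pages it already obtained from the OS rather than asking
    // the kernel for fresh pages each time.
    char *a = static_cast<char *>(std::malloc(1024));
    char *b = static_cast<char *>(std::malloc(1024));
    std::printf("a = %p\nb = %p\n", static_cast<void *>(a),
                static_cast<void *>(b));
    std::free(a);
    std::free(b);
}
```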

There are four components in RAM: the code itself, global variables, the stack, and the heap.
In C++, the first three are laid out when the program is built and loaded; a heap allocation is created when you request it (e.g. int *ptr = new int[10];).
In Python, everything is a reference (the stack holds only the pointers; the actual values live on the heap), meaning all four exist as soon as the program starts being interpreted.

The stack is created (grown) in every scope.
The heap is created when you allocate from it.

Related

How are the heap and stack sizes decided in a process image?

I am reading about virtual memory, in which a process's image has text, data, stack, and heap.
The heap and stack sizes grow dynamically. My question is how their sizes are decided, or whether all processes have a fixed size.
Thanks
You have not specified an operating system so I will speak in terms of Linux, as you can actually go check the source code.
For the heap, people typically think of the heap growing when they use malloc, but malloc itself does not affect the virtual memory mapping. What really happens is that malloc uses the system calls sbrk or mmap to grow the virtual memory region we generally refer to as the heap, and then malloc manages the memory that sbrk and mmap have set up. So the key thing to remember is that the virtual memory for the heap is increased by system calls such as sbrk and mmap.
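On Linux you can watch the program break move. A minimal sketch (whether malloc uses brk/sbrk or mmap for a given request depends on the allocator and the request size):

```cpp
#include <cstdio>
#include <cstdlib>
#include <unistd.h>

int main() {
    void *before = sbrk(0);       // current program break (top of the "heap" region)
    void *p = std::malloc(4096);  // may grow the break via brk/sbrk
    void *after = sbrk(0);
    std::printf("break before malloc: %p\n", before);
    std::printf("break after  malloc: %p\n", after);
    // Note: a modern malloc may satisfy the request with mmap instead,
    // in which case the break does not move at all.
    std::free(p);
}
```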
For the stack, as you make function calls your stack grows down. Eventually you will hit a page that is not mapped and get a page fault. If the OS determines this is because the stack needs more memory, and you have not exceeded your stack limit (or overall memory limit), it will map more pages to the stack; otherwise it raises an exception. An alternative would be a fixed-size stack, where reaching the stack limit simply causes an exception.
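You can see the downward growth directly. A small sketch (compile without optimizations so the frames are not collapsed):

```cpp
#include <cstdio>

// Each call pushes a new frame, so the address of a local variable
// decreases with depth: the stack grows downward.
void probe(int depth) {
    int local = depth;
    std::printf("depth %d: local at %p\n", depth, (void *)&local);
    if (depth < 3)
        probe(depth + 1);
}

int main() { probe(0); }
```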
I think your division of memory into text, data, stack, and heap is confusing. User memory is usually divided into:
Read-only, executable memory (code; what you call text)
Read-only, non-executable memory (static data; what you call data)
Read/write, non-executable memory
The stack and heap are regions of read/write, non-executable memory. There is nothing special about them; it is the same kind of memory used for static read/write variables in your application.
The stack is a bit special because the operating system generally allocates one for each thread. The initial size of a stack can be set either in the executable (some systems have this as a linker option) or through system configuration. The size is generally not fixed: if a stack grows beyond its current allocation, most systems will try to map the next pages into the process address space and expand the stack.
The heap is usually not automatically managed. A heap is just some area (or areas) of memory that your library manages. When you call a library function such as malloc for memory, it will try to satisfy the request from its available heap. If no free memory fits the request, it will call an operating system service to map additional pages into the process address space.
There can be multiple heaps created by multiple memory managers. Heaps generally start at zero size and grow as needed.

How are global variables stored differently in memory than local variables?

I'm reading a book on operating systems and it says, "we must set things up so that the portion of the address space allocated to global variables remains allocated for the lifetime of the program, but the portion allocated for a local variable remains allocated only while the thread is in the variable's scope."
What I don't understand is: when a program is loaded into memory, aren't the addresses reserved for it static? It's not as if, when a variable goes out of scope, the operating system decides the address space where the variable was can now be used by another program. Or is it?
The book says that global variables are stored in memory and local variables are stored on the runtime stack. First of all, the runtime stack is in memory, so I don't see the distinction. Also, how does the operating system know how much space to allocate for the runtime stack, since the number of items pushed onto it can't be determined before runtime?
When the book says "global variables are stored in memory", what it is referring to is statically allocated memory (the data segment), which is different from the stack. Both are memory, but of different kinds.
See here for more details: http://www-ee.eng.hawaii.edu/~tep/EE160/Book/chap14/subsection2.1.1.8.html
Disregard, for the moment, the virtual memory and memory-management machinery required by modern operating systems.
Embedded in the executable file is the amount of memory needed for fixed/static data, which includes your global variables. The data segment (DS) register (hardware) keeps track of these fixed data elements. The DS size does not change, and the DS points to a contiguous chunk of memory. If there is not enough contiguous memory, you get an insufficient-memory error and the program does not load. As you said, this memory does not "go out of scope" during program execution.
Local variables are allocated from another part of memory, managed via the stack segment (SS) register. The amount of memory managed through the SS grows and shrinks during the program's execution. Maximum and minimum sizes are determined by the operating system (OS). (These sizes are not based on physical memory size in today's complex OSs.)
If you exceed the max amount during execution you get a stack-overflow error.
The program does not execute if the minimum size is not available, so the program starts with at least the minimum amount. If more SS memory is needed during execution, the OS will grow the memory block if possible; otherwise you get a stack-overflow error.
When a function is invoked, its local variables are allocated from the SS (much as global variables were allocated in the DS). There are other housekeeping items in the SS, such as the return address (where execution resumes in the code that invoked the function). As functions invoke other functions in a nested fashion, the stack grows and grows. As each function completes, control returns to the invoking function, the completed function's local variables are released, and the stack shrinks.
The "heap" memory is another block of memory that grows as needed to hold items that are "created" or "allocated memory space" during program execution (using "malloc" or "new" or some verb like that depending on the language). Like the SS, the OS determines min and max sizes and each pgm starts with at least the min amount.
There are layers of protection built into today's modern CPUs, OSs, and languages to ensure that the contents of physical memory are treated properly, i.e. data blocks cannot be executed and code blocks cannot be written.
Virtual memory, the need to move memory blocks around so multiple programs can run (memory management), and the reclamation of out-of-scope heap items add further layers, but the rudiments detailed above still hold true.

Where are multiple stacks and heaps put in virtual memory?

I'm writing a kernel and need (and want) to put multiple stacks and heaps into virtual memory, but I can't figure out how to place them efficiently. How do normal programs do it?
How (or where) are stacks and heaps placed into the limited virtual memory provided by a 32-bit system, such that they have as much growing space as possible?
For example, when a trivial program is loaded into memory, the layout of its address space might look like this:
[ Code Data BSS Heap-> ... <-Stack ]
In this case the heap can grow as big as virtual memory allows (e.g. up to the stack), and I believe this is how the heap works for most programs. There is no predefined upper bound.
Many programs have shared libraries that are put somewhere in the virtual address space.
Then there are multi-threaded programs that have multiple stacks, one for each thread. And .NET programs have multiple heaps, all of which have to be able to grow one way or another.
I just don't see how this can be done reasonably efficiently without putting a predefined limit on the size of every heap and stack.
I'll assume you have the basics of your kernel done: a trap handler for page faults that can map a virtual memory page to RAM. One level up, you need a virtual memory address-space manager from which usermode code can request address space. Pick a segment granularity that prevents excessive fragmentation; 64KB (16 pages) is a good number. Allow usermode code to both reserve and commit space. A simple bitmap of 4GB/64KB = 64K segments x 2 bits each keeps track of segment state and gets the job done. The page-fault trap handler also consults this bitmap to know whether a page request is valid or not.
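A minimal sketch of that 2-bits-per-segment bookkeeping, for a 4GB address space at 64KB granularity (the names and the exact state encoding are illustrative, not from any particular kernel):

```cpp
#include <cstdint>
#include <vector>

// Segment states: free, reserved (address space claimed, no RAM yet),
// and committed (backed by RAM).
enum class SegState : uint8_t { Free = 0, Reserved = 1, Committed = 2 };

class SegmentMap {
    static constexpr size_t kSegmentSize = 64 * 1024;
    static constexpr size_t kSegments = (4096ull * 1024 * 1024) / kSegmentSize;
    std::vector<uint8_t> bits_;  // 2 bits per segment, 4 segments per byte

public:
    SegmentMap() : bits_(kSegments / 4, 0) {}

    SegState get(size_t seg) const {
        return SegState((bits_[seg / 4] >> ((seg % 4) * 2)) & 3u);
    }
    void set(size_t seg, SegState s) {
        unsigned shift = (seg % 4) * 2;
        bits_[seg / 4] = uint8_t((bits_[seg / 4] & ~(3u << shift)) |
                                 (unsigned(s) << shift));
    }
    // The page-fault trap handler consults this to decide whether a
    // faulting address falls in reserved/committed space or is an error.
    bool fault_is_valid(uintptr_t addr) const {
        return get(addr / kSegmentSize) != SegState::Free;
    }
};
```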
A stack is a fixed-size VM allocation, typically 1 megabyte. A thread usually needs only a handful of its pages, depending on function nesting level, so reserve the 1MB and commit only the top few pages. When the thread nests deeper, it trips a page fault and the kernel can simply map an extra page to RAM to let the thread continue. You'll want to mark the bottom few pages as special: when the thread page-faults on those, you declare this website's name.
The most important job of the heap manager is to prevent fragmentation. The best way to do that is to create a lookaside list that partitions heap requests by size: everything up to 8 bytes comes from the first list of segments, 8 to 16 bytes from the second, 16 to 32 from the third, and so on, increasing the bucket size as you go up. You'll have to tune the bucket sizes to get the best balance. Very large allocations come directly from the VM address manager.
The first time an entry in the lookaside list is hit, you allocate a new VM segment. You subdivide the segment into smaller blocks with a linked list. When such an allocation is released, you add the block to the list of free blocks. All blocks have the same size regardless of the program request so there won't be any fragmentation. When the segment is fully used and no free blocks are available you allocate a new segment. When a segment contains nothing but free blocks you can return it to the VM manager.
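A user-space sketch of one such bucket (assuming Linux mmap as a stand-in for the kernel's own VM manager; the segment size is illustrative and block_size must be at least sizeof(FreeBlock)):

```cpp
#include <cstddef>
#include <sys/mman.h>

struct FreeBlock { FreeBlock *next; };

class Bucket {
    static constexpr size_t kSegmentSize = 64 * 1024;
    size_t block_size_;
    FreeBlock *free_list_ = nullptr;

public:
    explicit Bucket(size_t block_size) : block_size_(block_size) {}

    void *allocate() {
        if (!free_list_) {
            // First hit (or all segments exhausted): get a fresh segment.
            void *mem = mmap(nullptr, kSegmentSize, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (mem == MAP_FAILED) return nullptr;
            char *seg = static_cast<char *>(mem);
            // Thread every fixed-size block in the segment onto the free list.
            for (size_t off = 0; off + block_size_ <= kSegmentSize;
                 off += block_size_) {
                auto *b = reinterpret_cast<FreeBlock *>(seg + off);
                b->next = free_list_;
                free_list_ = b;
            }
        }
        FreeBlock *b = free_list_;
        free_list_ = b->next;
        return b;
    }

    void release(void *p) {  // all blocks are the same size: no fragmentation
        auto *b = static_cast<FreeBlock *>(p);
        b->next = free_list_;
        free_list_ = b;
    }
};
```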
This scheme allows you to create any number of stacks and heaps.
Simply put, as your system resources are always finite, you can't go limitless.
Memory management always consists of several layers, each with its own well-defined responsibility. From the program's perspective, the visible layer is the application-level manager, which is usually concerned only with its own single allocated heap. A level above that could create multiple heaps, if needed, out of its one global heap and assign them to subprograms (each with its own memory manager). Above that sits the standard malloc()/free(), and above those the operating system, dealing with pages and actual per-process memory allocation (it is basically not concerned with user-level heaps at all, let alone multiple ones).
Memory management is costly, and so is trapping into the kernel. Combining the two could impose a severe performance hit, so what looks like the actual heap management from the application's point of view is implemented in user space (the C runtime library) for the sake of performance (and other reasons out of scope for now).
When a shared (DLL) library is loaded at program startup, it will most probably be loaded alongside CODE/DATA/etc., so no heap fragmentation occurs. On the other hand, if it is loaded at runtime, there is pretty much no choice but to consume heap address space.
Static libraries are, of course, simply linked into the CODE/DATA/BSS/etc sections.
At the end of the day, you'll need to impose limits on heaps and stacks so that they're not likely to overflow, but you can allocate additional ones.
If one needs to grow beyond that limit, you can either
Terminate the application with an error, or
Have the memory manager allocate/resize/move the memory block for that stack/heap, and most probably defragment the heap (at its own level) afterwards; that's why free() usually performs poorly.
Assuming a fairly large 1KB stack frame per call as an average (which might happen if the application developer is inexperienced), a 10MB stack would be sufficient for 10,240 nested calls. BTW, besides that, there's pretty much no need for more than one stack and heap per thread.

What's the difference between "virtual memory" and "swap space"?

Can anyone please clarify the difference between virtual memory and swap space?
And why do we say that for a 32-bit machine the maximum virtual memory accessible is 4 GB only?
There's an excellent explanation of virtual memory over on Super User.
Simply put, virtual memory is a combination of RAM and disk space that running processes can use.
Swap space is the portion of virtual memory that is on the hard disk, used when RAM is full.
As for why a 32-bit CPU is limited to 4GB of virtual memory, it's addressed well here:
By definition, a 32-bit processor uses 32 bits to refer to the location of each byte of memory. 2^32 = 4.2 billion, which means a memory address that's 32 bits long can only refer to 4.2 billion unique locations (i.e. 4 GB).
There is some confusion regarding the term "virtual memory"; it actually refers to the following two very different concepts:
Using disk pages to extend the conceptual amount of physical memory a computer has; the correct term for this is actually paging
An abstraction used by various OS/CPUs to create the illusion of each process running in a separate contiguous address space.
Swap space, OTOH, is the name of the portion of disk used to store additional RAM pages when not in use.
An important realization to make is that the former is transparently possible due to the hardware and OS support of the latter.
In order to make better sense of all this, you should consider how the "Virtual Memory" (as in definition 2) is supported by the CPU and OS.
Suppose you have a 32-bit pointer (64-bit pointers are similar but use slightly different mechanisms). Once "virtual memory" has been enabled, the processor treats this pointer as having three parts:
The highest 10 bits are a Page Directory Entry
The following 10 bits are a Page Table Entry
The last 12 bits make up the Page Offset
Now, when the CPU tries to access the contents of a pointer, it first consults the Page Directory, a table of 1024 entries (on the x86 architecture, its location is pointed to by the CR3 register). The 10-bit Page Directory Entry index selects an entry in this table, which points to the physical location of a Page Table. This, in turn, is another table of 1024 entries, each of which is a pointer into physical memory plus several important control bits (we'll get back to these later). Once a page has been found, the last 12 bits are used to find an address within that page.
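In code, the split looks like this (a small sketch for the classic 32-bit non-PAE x86 layout; the example address is arbitrary):

```cpp
#include <cstdint>
#include <cstdio>

// Decompose a 32-bit virtual address the way the x86 (non-PAE) MMU does:
// 10 bits page-directory index, 10 bits page-table index, 12 bits offset.
int main() {
    uint32_t vaddr = 0xBFDFF123;             // arbitrary example address
    uint32_t pde   = (vaddr >> 22) & 0x3FF;  // top 10 bits
    uint32_t pte   = (vaddr >> 12) & 0x3FF;  // next 10 bits
    uint32_t off   =  vaddr        & 0xFFF;  // low 12 bits
    std::printf("PDE index %u, PTE index %u, offset 0x%X\n", pde, pte, off);
}
```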
There are many more details (TLBs, Large Pages, PAE, Selectors, Page Protection) but the short explanation above captures the gist of things.
Using this translation mechanism, an OS can use a different set of physical pages for each process, thus giving each process the illusion of having all the memory for itself (as each process gets its own Page Directory)
On top of this virtual memory, the OS may also add the concept of paging. One of the control bits discussed earlier specifies whether an entry is "Present". If it isn't present, an attempt to access that entry raises a Page Fault exception. The OS can capture this exception and act accordingly: an OS that supports swapping/paging can decide to load the page from swap space, fix the translation tables, and then retry the memory access.
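For concreteness, here is roughly what such an entry looks like (a sketch of the 32-bit x86 layout; only the bits relevant to the discussion are shown, and the accessor names are illustrative):

```cpp
#include <cstdint>

// 32-bit x86 (non-PAE) page-table entry: bit 0 is the "Present" control
// bit discussed above; the top 20 bits hold the physical frame address.
struct Pte {
    uint32_t raw;
    bool present() const { return raw & 1u; }
    uint32_t frame_base() const { return raw & 0xFFFFF000u; }
};
```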
This is where the two terms combine, an OS supporting Virtual Memory and Paging can give processes the illusion of having more memory than actually present by paging (swapping) pages in and out of the swap area.
As to your last question (why a 32-bit CPU is said to be limited to 4GB of virtual memory): this refers to "virtual memory" in the sense of definition 2, and it is an immediate result of the pointer size. If the CPU can only use 32-bit pointers, you have only 32 bits to express distinct addresses, which gives you 2^32 bytes = 4GB of addressable memory.
Hope this makes things a bit clearer.
IMHO it is terribly misleading to treat swap space as equivalent to virtual memory. VM is a much more general concept than swap space. Among other things, VM allows processes to reference virtual addresses during execution, which are translated into physical addresses with the support of hardware and page tables. Thus processes need not be concerned with how much physical memory the system has, or where an instruction or datum actually resides in the physical memory hierarchy; VM provides this mapping. The referenced item (instruction or data) may be resident in L1, or L2, or RAM, or finally on disk, in which case it is loaded into main memory.
Swap space is just a place in secondary storage where pages are kept when they are inactive. If there is not sufficient RAM, the OS may decide to swap out pages of a process to make room for another process's pages. The processor never executes instructions or reads/writes data directly from swap space.
Notice that it would be possible to have swap space in a system with no VM: processes that directly access physical addresses could still have portions of their memory kept on disk.
Though the thread is quite old and has already been answered, I would still like to share this link, as it is the simplest explanation I have found so far. The link below has diagrams for better visualization.
Key difference: Virtual memory is an abstraction of main memory. It extends the available memory of the computer by storing the inactive parts of the RAM's content on disk. Whenever that content is required, it is fetched back into RAM. Swap memory, or swap space, is the part of the hard disk drive used for virtual memory. Because of this, the two terms are sometimes used interchangeably.
Virtual memory is quite different from physical memory. Programmers get direct access to virtual memory rather than physical memory. Virtual memory is an abstraction of main memory, used to hide the details of the system's real physical memory. It extends the available memory of the computer by storing the inactive parts of the RAM's content on disk; when the content is required, it is fetched back into RAM. Virtual memory creates the illusion of a whole address space beginning at address zero. It is mainly preferred for the optimization by which it reduces physical-space requirements, and it is composed of the available RAM plus disk space.
Swap memory is generally called swap space. Swap space refers to the portion of virtual memory reserved as a temporary storage location. It is used when the available RAM cannot meet the system's memory requirements. For example, in the Linux memory system, the kernel locates each page either in physical memory or in swap space, and it maintains a table tracking which pages have been swapped out and which are in physical memory.
Pages that have not been accessed for a long time are sent to the swap space area; this process is referred to as swapping out. If a swapped-out page is needed again, it is swapped back into physical memory, swapping out a different page if necessary. Thus one can conclude that swap memory and virtual memory are interconnected, since swap space is one mechanism used to implement virtual memory.
difference-between-virtual-memory-and-swap-memory
"Virtual memory" is a generic term. In Windows, it is called as Paging or pagination. In Linux, it is called as Swap.

Where does dynamically allocated memory reside?

We know that malloc() and the new operator allocate memory dynamically from the heap, but where does the heap reside? Does each process have its own private heap in its address space, or does the OS have a global one shared by all processes? What's more, I read in a textbook that once a memory leak occurs, the leaked memory cannot be reused until the next time we restart our computer. Is this claim right? If so, how can we explain it?
Thanks for your reply.
Regards.
The memory is allocated from the user address space of your process's virtual memory. And all the memory is reclaimed by the OS when the process terminates, so there is no need to restart the computer.
Typically the C runtime will use the various OS APIs to allocate memory, which becomes part of its process address space. Then, within that allocated memory, it will create a heap and hand out memory from that heap itself via calls to malloc or new.
The reason for this is that OS APIs are often coarse-grained and require you to allocate memory in large chunks (such as a page), whereas your application typically wants to allocate small amounts of memory at any one time.
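A toy illustration of that layering (assuming POSIX mmap; a real C runtime is far more sophisticated): grab one coarse, page-granular chunk from the OS, then hand out small pieces from it entirely in user space.

```cpp
#include <cstddef>
#include <sys/mman.h>

// Toy bump allocator: one coarse chunk from the OS, many small allocations
// carved out of it without further system calls. No free() support.
class Bump {
    char *cur_ = nullptr;
    char *end_ = nullptr;

public:
    bool init(size_t bytes) {
        void *mem = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (mem == MAP_FAILED) return false;
        cur_ = static_cast<char *>(mem);
        end_ = cur_ + bytes;
        return true;
    }
    void *alloc(size_t n) {
        n = (n + 15) & ~size_t(15);                      // 16-byte alignment
        if (end_ - cur_ < (ptrdiff_t)n) return nullptr;  // chunk exhausted
        char *p = cur_;
        cur_ += n;
        return p;
    }
};
```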
You do not mention which OS you are interested in, which means there is no single direct answer. Try looking into a book about OSes, e.g. Tanenbaum's.