Are the CPU cycles used up in doing a context switch accounted for in the process context? - context-switch

If a process causes a lot of context switches, will the CPU cycles used in the context switch be shown in the process CPU utilization?
In other words, if I run a process that essentially repeatedly executes a system call, then should the output of top show an increase in CPU utilization for the process because of the increase in context switching from user to kernel space and vice versa?

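Yes, I think it should. Time the kernel spends executing system calls on behalf of a process is charged to that process as system time.
Look at the man pages for top and time on Linux and possibly other *nix systems: both report user and system CPU time separately.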
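One way to check this yourself is a minimal sketch like the one below, assuming a Unix-like system where Python's resource module is available. The function name cpu_times_for_syscall_loop is made up for illustration, and os.getppid is just a convenient cheap system call. It reads the process's own user and system CPU time before and after a syscall-heavy loop.

```python
import os
import resource


def cpu_times_for_syscall_loop(iterations):
    """Return (user_seconds, system_seconds) consumed by a syscall-heavy loop."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    for _ in range(iterations):
        os.getppid()  # cheap system call: enters and leaves the kernel each time
    after = resource.getrusage(resource.RUSAGE_SELF)
    return (after.ru_utime - before.ru_utime,
            after.ru_stime - before.ru_stime)


if __name__ == "__main__":
    user, system = cpu_times_for_syscall_loop(1_000_000)
    print(f"user={user:.3f}s system={system:.3f}s")
```

If the system figure grows with the iteration count, the kernel-side cost of the calls is being billed to the process, which is the same number top reports for it.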

Related

Can Multiprocessor CPUs avoid context-switching?

Today's computer architectures try to maximize the number of registers. It is faster to access a register (which is memory integrated into the CPU itself) than to access the first-level cache. The problem is that each context switch has to save all registers to cache or memory, because the next thread needs different register values. What a modern CPU does is cycle through, say, 100 tasks in one second; each time, it saves the registers and restores the old ones before the task can resume.
IMHO it would be nice to use one CPU per task, so that no context switching happens. That means we would get 100 CPUs, each with 1000 registers that never have to be saved. Is that possible, or have I ignored an important detail?
The only way to completely avoid context switching is to have at least as many cores as there are tasks. Generally, there is no guarantee regarding the maximum number of tasks that may run. Current GPUs, manycore processors, and co-processors contain hundreds of small cores. If you put several of these in the same system or in a cluster of systems, you can have thousands of cores or more. Still, even if you could avoid context switching with such a design, these cores are much slower than traditional high-end CPU cores, so the net effect might be negative.
But let's take a step back here. The number of context switches is not primarily determined by the number of tasks and cores. Tasks don't just perform computations, they also need to interact with I/O devices and wait for things to happen such as results from other tasks or user input. So some tasks would be in a wait state. The overhead of context switching depends on not only the number of tasks but also the behavior of these tasks.
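To see how task behavior drives the switch count, here is a rough sketch, assuming a Unix-like system where getrusage reports ru_nvcsw (voluntary switches, e.g. when blocking) and ru_nivcsw (involuntary switches, e.g. preemption). The helper names are hypothetical.

```python
import resource
import time


def context_switches_during(func):
    """Return (voluntary, involuntary) context switches incurred while func runs."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    func()
    after = resource.getrusage(resource.RUSAGE_SELF)
    return (after.ru_nvcsw - before.ru_nvcsw,
            after.ru_nivcsw - before.ru_nivcsw)


def waiting_task():
    # Blocks repeatedly, giving up the CPU each time.
    for _ in range(20):
        time.sleep(0.001)


def computing_task():
    # Pure computation: only leaves the CPU if the scheduler preempts it.
    total = 0
    for i in range(200_000):
        total += i * i
    return total


if __name__ == "__main__":
    print("waiting  :", context_switches_during(waiting_task))
    print("computing:", context_switches_during(computing_task))
```

On a typical Linux machine the waiting task should show noticeably more voluntary switches than the computing one, even though it does far less work.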
Both processor architects and OS developers are aware of context switching overhead and employ a variety of techniques to alleviate it. For example, x86 provides a number of instructions that are tuned for saving (part of) the context of the current task. The OS thread scheduler uses techniques such as priorities, preemption (with possibly large time slices on servers), and priority boosting. All of these help reduce the number of context switches and therefore their overall overhead. That said, reducing the overhead of context switching is not the only thing that matters. In particular, the responsiveness of the system is very important as well, and it is at odds with reducing that overhead.

How can kernel run all the time?

How can the kernel run all the time when the CPU can execute only one process at a time?
That is, if the kernel is occupying the CPU all the time, then how do other processes get to run?
Please explain
Thank You
In the same way that you can run multiple userspace processes at the same time: Only one of them is actually using the CPU at any given time. You have some interrupts that force them to give it up.
Code that is part of the operating system is no different here (except that it is in control of setting up this scheduling in the first place).
You also have to distinguish between processes run by the OS in the background (I suppose that is what you are talking about here), and system calls (which are being run as part of "normal" processes that temporarily switch into supervisor mode).

Context Switch questions: What part of the OS is involved in managing the Context Switch?

I was asked to answer these questions about the OS context switch. The questions are pretty tricky and I cannot find any answers in my textbook:
1. How many PCBs exist in a system at a particular time?
2. What are two situations that could cause a context switch to occur? (I think they are an interrupt and the termination of a process, but I am not sure.)
3. Hardware support can make a difference in the amount of time it takes to do the switch. What are two different approaches?
4. What part of the OS is involved in managing the context switch?
1. There can be any number of PCBs in the system at a given moment in time. Each PCB is linked to a process.
2. Timer interrupts in preemptive kernels, or a process giving up control of the processor in cooperative kernels. And, of course, process termination and blocking on I/O operations.
3. I don't know the answer here, but see Marko's answer.
4. One of the schedulers in the kernel.
3: A whole number of possible hardware optimisations
Small register sets (therefore less to save and restore on context switch)
'Dirty' flags for floating point/vector processor register set - allows the kernel to avoid saving the context if nothing has happened to it since it was switched in. FP/VP contexts are usually very large and a great many threads never use them. Some RTOSs provide an API to tell the kernel that a thread never uses FP/VP at all eliminating even more context restores and some saves - particularly when a thread handling an ISR pre-empts another, and then quickly completes, with the kernel immediately rescheduling the original thread.
Shadow register banks: Seen on small embedded CPUs with on-board single-cycle SRAM. CPU registers are memory backed. As a result, switching banks is merely a case of switching the base address of the registers. This is usually achieved in a few instructions and is very cheap. Usually the number of contexts is severely limited in these systems.
Shadow interrupt registers: Shadow register banks for use in ISRs. An example is the classic 32-bit ARM architecture, which banks seven registers (r8 to r14) for its fast interrupt (FIQ) handler and fewer (r13 and r14) for the regular IRQ one.
Whilst not strictly a performance increase for context switching, this can help with the cost of context switching on the back of an ISR.
Physically rather than virtually mapped caches. A virtually mapped cache has to be flushed on context switch if the MMU is changed - which it will be in any multi-process environment with memory protection. However, a physically mapped cache means that virtual-physical address translation is a critical-path activity on load and store operations, and a lot of gates are expended on caching to improve performance. Virtually mapped caches were therefore a design choice on some CPUs designed for embedded systems.
The scheduler is the part of the operating system that manages context switching. It performs a context switch in one of the following situations:
1. Multitasking
2. Interrupt handling
3. User and kernel mode switching
Each process has its own PCB.
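To make the PCB and scheduler roles concrete, here is a toy simulation, not how any real kernel is written. The PCB fields, the round_robin function, and the single "pc" register are all invented for illustration. Each process has one PCB, and a round-robin scheduler counts a context switch whenever it hands the CPU to a different process.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PCB:
    pid: int
    remaining: int                                  # work units left to run
    state: str = "ready"
    registers: dict = field(default_factory=dict)   # saved context


def round_robin(pcbs, quantum):
    """Run toy processes to completion; return (finish_order, context_switches)."""
    ready = deque(pcbs)
    finished = []
    switches = 0
    current = None
    while ready:
        nxt = ready.popleft()
        if current is not None and current is not nxt:
            switches += 1                           # CPU handed to a different process
        current = nxt
        current.state = "running"
        ran = min(quantum, current.remaining)
        current.remaining -= ran
        # "Save" the context into the PCB when the time slice ends.
        current.registers["pc"] = current.registers.get("pc", 0) + ran
        if current.remaining == 0:
            current.state = "terminated"
            finished.append(current.pid)
        else:
            current.state = "ready"
            ready.append(current)
    return finished, switches


if __name__ == "__main__":
    print(round_robin([PCB(1, 3), PCB(2, 5), PCB(3, 2)], quantum=2))
```

With these three processes and a quantum of 2, the run finishes in the order 3, 1, 2 after 4 switches. A larger quantum gives fewer switches, at the cost of each process waiting longer for its turn.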

Medium term scheduler

I have read about the medium term scheduler in Galvin's operating systems book.
It says that:
Sometimes it is advantageous to swap out a process when it is not executing [waiting for I/O or waiting for the CPU] in order to decrease the degree of multiprogramming.
Also, we free up physical memory, which makes the execution of other processes faster by decreasing the number of page faults [as we have more memory].
So it is the job of the medium term scheduler to swap partially executed processes out and back in.
But my question is: is the work of the medium term scheduler really important in scenarios where we have plenty of available physical/main memory?
The use of medium term scheduler is to improve multiprogramming by allowing multiple processes to reside in main memory by swapping out processes that are waiting (need I/O) or low priority processes and swapping in other processes that were in ready queue.
So you can see that we require the medium term scheduler when we have limited memory. This swapping in and out does not take place when we are running a single small program and have a large memory.
Similarly, if we are running multiple programs and we have a very large memory (larger than the size of all processes plus additional space for other requirements), then the medium term scheduler is not needed. Modern operating systems use paging, so instead of swapping whole processes they swap pages in and out of memory. The same reasoning applies: a system with a very large memory would suffer almost no page faults caused by memory pressure.
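The effect of memory size on page faults can be sketched with a small simulation. This assumes LRU replacement, which is an illustrative choice and not something the answer specifies, and the function name and reference string are made up.

```python
from collections import OrderedDict


def count_page_faults(reference_string, frames):
    """Count page faults for a page reference string under LRU replacement."""
    resident = OrderedDict()              # pages in memory, least recently used first
    faults = 0
    for page in reference_string:
        if page in resident:
            resident.move_to_end(page)    # hit: mark as most recently used
        else:
            faults += 1
            if len(resident) >= frames:
                resident.popitem(last=False)   # evict the least recently used page
            resident[page] = True
    return faults


if __name__ == "__main__":
    refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
    for frames in (3, 4, 5):
        print(frames, "frames:", count_page_faults(refs, frames))
```

Once there are as many frames as distinct pages, nothing is ever evicted. The only faults left are the first touch of each page, which is the situation where a medium term scheduler has nothing to do.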
Medium term scheduling is part of the swapping. It removes the processes from the memory. It reduces the degree of multiprogramming. The medium term scheduler is in-charge of handling the swapped out-processes.
(Source: Tutorials Point, Simply Easy Learning, page 28)
Running process may become suspended if it makes an I/O request. Suspended processes cannot make any progress towards completion. In this condition, to remove the process from memory and make space for other process, the suspended process is moved to the secondary storage. This process is called swapping, and the process is said to be swapped out or rolled out. Swapping may be necessary to improve the process mix.

How do we reduce Context Switch time

We all know that context switch time is pure overhead and does no useful work. But I would like to know how one can reduce context switch time. Would using more registers help?
Are you writing an operating system? The context switch time depends on the registers you have to save and restore. One way you can possibly save time is via the XSAVE/XRSTOR instructions on newer x86 processors (the ones used to handle AVX state), which let you save and restore the extended register state (x87, SSE, AVX) as one block of memory.
Minimize the context size and/or avoid context switches. How exactly you do that depends on the context (not the context that you're switching but the context of the problem, the CPU, the OS, etc).
On the x86 CPU you can avoid unnecessarily saving and restoring the state of the floating point unit if it doesn't change. This is done by setting the task switched bit (TS) in CR0 to 1 during a context switch and then waiting for the device-not-available exception (#NM) raised by the first FPU instruction of the new thread. When it occurs, you save the old thread's FPU state, load the current thread's FPU state, clear CR0.TS and resume execution at that FPU instruction. If threads come and go but the exception doesn't occur, that means the threads aren't doing FPU work and you skip the FPU part of the context switch.
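The saving from this lazy scheme can be illustrated with a toy model. It only counts FPU state saves for a given schedule, and the function name and the (thread, uses_fpu) schedule format are invented for this sketch. The eager policy saves FPU state on every switch to a different thread. The lazy policy models CR0.TS: work happens only when a thread actually touches the FPU while another thread's state is loaded.

```python
def count_fpu_saves(schedule, lazy):
    """schedule: list of (thread_id, uses_fpu) time slices, in run order."""
    saves = 0
    fpu_owner = None   # thread whose state currently lives in the FPU
    for thread, uses_fpu in schedule:
        if not lazy:
            # Eager: save the outgoing FPU state on every switch to another thread.
            if fpu_owner is not None and fpu_owner != thread:
                saves += 1
            fpu_owner = thread
        else:
            # Lazy: the switch only sets TS. The trap fires on the first FPU
            # instruction, and state is saved only if another thread owns the FPU.
            if uses_fpu and fpu_owner != thread:
                if fpu_owner is not None:
                    saves += 1
                fpu_owner = thread
    return saves


if __name__ == "__main__":
    schedule = [("A", True), ("B", False), ("C", False),
                ("A", True), ("B", False), ("D", True)]
    print("eager:", count_fpu_saves(schedule, lazy=False))
    print("lazy :", count_fpu_saves(schedule, lazy=True))
```

For this schedule the eager policy performs 5 saves and the lazy one performs 1, because threads B and C never touch the FPU and thread A finds its own state still loaded when it runs again.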
It would be up to the programmer to implement a threading policy, synchronization mechanisms, and data structures that minimize lock contention. When a thread tries to acquire a lock that is already acquired by another thread, it has little choice but to poll several times, hoping they will release it within a very short time, then give up and do a context switch.
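The poll-then-give-up policy can be sketched as follows. This is only an illustration of the idea using Python's threading.Lock: the function name and the max_spins default are arbitrary, and in CPython spinning like this is rarely a win because of the global interpreter lock.

```python
import threading


def acquire_spin_then_block(lock, max_spins=100):
    """Acquire lock; return "spin" if polling succeeded, otherwise "block"."""
    for _ in range(max_spins):
        if lock.acquire(blocking=False):   # poll: no context switch on success
            return "spin"
    lock.acquire()                         # give up polling and block in the OS
    return "block"


if __name__ == "__main__":
    lock = threading.Lock()
    print(acquire_spin_then_block(lock))   # lock is free, so polling succeeds
    lock.release()
```

The shorter the critical sections are, the more often the polling phase succeeds and the blocking path, with its context switch, is avoided.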
If this question is from the perspective of a Linux administrator, you can reduce the time spent in context switches by increasing the minimum timeslice (see sched_latency_ns and sched_min_granularity_ns), or by ensuring that the demand for processors is less than or equal to the number of available processors. The context switch rate is significantly lower when you have spare processors: the scheduler doesn't need to switch out a running task, it can use an idle processor.
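A rough way to check the second condition is to compare the load average with the processor count, as in the sketch below. It assumes a Unix-like system where os.getloadavg exists, and the function names are hypothetical. On Linux the load average also counts tasks in uninterruptible sleep, so treat it as a coarse proxy for processor demand.

```python
import os


def cpu_demand_exceeds_supply(load_average, cpu_count):
    """True if more tasks want a CPU, on average, than there are CPUs."""
    return load_average > cpu_count


def current_demand_exceeds_supply():
    """Check the 1-minute load average; return None if it is unavailable."""
    try:
        one_minute, _, _ = os.getloadavg()
    except OSError:
        return None
    return cpu_demand_exceeds_supply(one_minute, os.cpu_count() or 1)


if __name__ == "__main__":
    print("demand exceeds supply:", current_demand_exceeds_supply())
```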