Consider an operating system with a non-preemptive SJF scheduler. If it is given a workload of, say, 10 processes, and each process performs a CPU burst which ranges from 10 ms to 20 ms followed by a 500 ms I/O burst, will any of the processes experience starvation?
Working through this, I know that the shortest process is scheduled first and that whichever process is running will run to completion. What I don't understand is how to determine, from this information alone, whether any process will be postponed indefinitely because a resource is never allocated to it. How can I tell, given the workload and the type of scheduler?
If you define "starvation" as "perpetually not getting any CPU time", then, with a "shortest job first" algorithm:
a) longer jobs will starve when shorter jobs are created faster than they complete (regardless of how many CPUs there are, the longer jobs literally can't get a turn because new shorter jobs keep being created too often).
b1) if the number of tasks that take an infinite amount of time exceeds the number of CPUs and none of those tasks block (e.g. wait for I/O), then one or more processes will be starved of CPU time (unless you augment SJF with some form of time sharing to avoid starvation among "always equal length" jobs).
b2) if the number of tasks that take an infinite amount of time exceeds the number of CPUs and some of those tasks do block (e.g. wait for I/O), then whether starvation happens or not depends on the sum of the time each process spends not blocked.
If an SJF scheduler is given a workload of 10 processes, none of them are "infinite length", and no additional processes are ever created, then all 10 tasks must complete sooner or later and none of them will be perpetually waiting for a CPU.
Of course this doesn't mean some tasks won't have to wait (temporarily, briefly) for a CPU.
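To make that concrete, here is a minimal event-driven sketch of non-preemptive SJF on a single CPU for a workload like the one in the question (the cycle count and random burst lengths are illustrative assumptions, not part of the question): every process eventually completes, so none starves, although longer bursts tend to wait a little more.

```python
import heapq
import random

# Non-preemptive SJF, one CPU. Assumed workload: each of 10 processes repeats
# "10-20 ms CPU burst, then 500 ms I/O burst" for a fixed number of cycles.
random.seed(0)
N_PROCESSES, CYCLES, IO_TIME = 10, 5, 500

ready = []                                   # (next_burst_ms, pid): SJF picks the shortest
io_done = []                                 # (time_io_completes, pid)
remaining = {pid: CYCLES for pid in range(N_PROCESSES)}
finished = {}

for pid in range(N_PROCESSES):
    heapq.heappush(ready, (random.randint(10, 20), pid))

clock = 0
while ready or io_done:
    while io_done and io_done[0][0] <= clock:        # processes returning from I/O
        _, pid = heapq.heappop(io_done)
        heapq.heappush(ready, (random.randint(10, 20), pid))
    if not ready:                                    # CPU idle until next I/O completion
        clock = io_done[0][0]
        continue
    burst, pid = heapq.heappop(ready)                # non-preemptive: burst runs to completion
    clock += burst
    remaining[pid] -= 1
    if remaining[pid] == 0:
        finished[pid] = clock
    else:
        heapq.heappush(io_done, (clock + IO_TIME, pid))

print("all 10 finished:", len(finished) == N_PROCESSES)
print("completion times (ms):", sorted(finished.values()))
```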
Note: for real systems, typically there are lots of infinite-length tasks that do block (e.g. for both Windows and Linux, there are often over 100 processes running as services/daemons and for the GUI); nobody knows how long any task will take (and not just because the speed of each CPU keeps changing due to power management - e.g. how long will the web browser you're using run for?); often nobody can know whether a process will take an infinite amount of time or not (halting problem); and sometimes a process will accidentally loop forever due to a bug. In other words, "shortest job first" is almost always impossible to implement.
I've come across articles on "through-put vs latency" in contexts like networking, e.g. https://homepage.cs.uri.edu/~thenry/resources/unix_art/ch12s04.html But in the context of computer architecture / operating systems, I'm not able to understand why there would be a trade-off between latency (response time of a program) and through-put (how many programs we're able to complete in a unit of time, say per hour). Is this solely due to the fact that we can choose to parallelize processing of multiple programs / requests, leading to overheads like context switches & sharing of caches, which make the start-to-end response time per process worse? Or am I missing something here?
In terms of single instructions in a superscalar pipelined out-of-order exec CPU, throughput vs. latency is very important because the CPU is trying to extract parallelism from an instruction stream that has to be executed as if in serial program order. See Assembly - How to score a CPU instruction by latency and throughput and the bottom of my answer on latency vs throughput in intel intrinsics for example.
In terms of OS decisions that affect throughput vs. latency on a much longer timescale than a few clock cycles, that's a totally separate question.
One of the major factors there is choosing how to use the available physical RAM, and whether to page out (to a swap file) infrequently used code / data to make more room to cache disk files. (e.g. Linux's vm.swappiness is widely considered a key tunable, and is typically set differently on servers vs. desktops. https://unix.stackexchange.com/questions/88693/why-is-swappiness-set-to-60-by-default).
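As a trivial illustration of that tunable (Linux-specific; this sketch only reads the current value via the usual procfs path and changes nothing):

```python
# Read the current vm.swappiness value on Linux; higher values make the kernel
# more willing to swap out process pages in favour of file cache (throughput),
# lower values keep process pages resident (interactive latency).
with open("/proc/sys/vm/swappiness") as f:
    print("vm.swappiness =", f.read().strip())
```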
If you alt-tab to a window when many pages of that process have been paged out, it will take some time before the process can redraw its window. (Multiple hard page faults can be quite slow, especially if paging to a rotational disk rather than an SSD.) So to optimize for latency, you want the kernel to not aggressively swap out pages from running processes, even if they've been idle for a few hours. Those pages, if they'd been freed, could have improved throughput for other processes by acting as buffers / cache.
A related factor is I/O scheduling: trying to group IO requests together to minimize HD seek times (for higher throughput and lower average latency), but sometimes at the expense of delaying a few requests for a longer time (higher worst-case latency). Linux for example has many to choose from, including deadline, Completely Fair Queuing (CFQ), and the original elevator (just grouping requests by locality without consideration of fairness or latency). https://wiki.archlinux.org/title/improving_performance#Input/output_schedulers
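As a toy illustration of the "elevator" idea (grouping requests by locality), here is a minimal sketch; the block numbers and head position are made up, and real Linux I/O schedulers are far more sophisticated:

```python
# Toy elevator (LOOK-style) ordering of pending disk block requests: service
# everything at or above the head going up, then sweep back down for the rest.
# Total seek distance shrinks (throughput), but a request just "behind" the
# head waits for the whole upward sweep (worse worst-case latency).
def elevator_order(requests, head):
    up = sorted(r for r in requests if r >= head)
    down = sorted((r for r in requests if r < head), reverse=True)
    return up + down

pending = [98, 183, 37, 122, 14, 124, 65, 67]    # an arbitrary request queue
print(elevator_order(pending, head=53))
# [65, 67, 98, 122, 124, 183, 37, 14] - compare with FIFO order, which would
# seek back and forth across the disk.
```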
CPU scheduling is also a factor: a context-switch hurts throughput, as it takes time itself and caches will likely be cold for the new task on this CPU. You also have to run the kernel's schedule() function to decide which task to run next, so that takes away some time from real work.
To minimize latency (for example between a socket message being sent to a process and it waking up when its poll or select system call returns), you want a short timeslice, like Linux HZ=1000. (Timer interrupts every 1 ms to run the scheduler). And you want to be able to pre-empt even the kernel itself, instead of waiting until the kernel is ready to return to the old user-space to consider the possibility of running a different user-space task.
But neither of these helps throughput, and in fact they hurt it (assuming the workload has enough parallelism to not bottleneck on latency). So HZ=100 was the default for "server" Linux builds, vs. 1000 on "desktop" builds tuned for interactive use. (Modern Linux can be "tickless", not using a fixed timer interrupt on every core at all, instead deciding when to schedule the next interrupt on a case-by-case basis.)
Real-time kernels take this even further, spending more time on finer-grained locking and stuff like that to enable pausing work and coming back to it later to minimize interrupt latency and other latencies between it being time to do something and actually starting to do that thing. (There are real-time patches for Linux, and there are also totally separate kernels built from the ground up for real-time operation.)
If you have an embedded system controlling a motor or something, you absolutely need hard real-time latency guarantees that it will never take longer than say 1 millisecond from an interrupt pin being asserted to the interrupt handler starting to run.
(Designing the system to make these guarantees possible often comes at the cost of throughput. e.g. obviously you have to pin some memory to make it not swappable, if we're talking about user-space, making it unavailable for cache even if it goes untouched for days.)
Does anyone have any idea about my question?
I need to know how Task Manager assigns priorities in Windows.
It is all about the scheduler. In an OS you have a lot of things running "at the same time", but actually there is a scheduler giving access to the CPU to one process at a time. While a process is waiting for the CPU it gains "points", and the scheduler then gives the CPU to the process with the highest number of points (that is what I was taught).
The priority determines how quickly the process gains points while it is waiting.
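A tiny sketch of that "points while waiting" idea (this is a generic aging scheme to illustrate the description above, not the actual Windows scheduler; the process names and point values are made up):

```python
# Generic "aging" sketch: waiting processes accumulate points in proportion to
# their priority, and the scheduler picks whichever process has the most points.
procs = {"editor": {"priority": 3, "points": 0},
         "backup": {"priority": 1, "points": 0},
         "player": {"priority": 2, "points": 0}}

def tick_and_pick(procs):
    for p in procs.values():           # every waiting process gains points...
        p["points"] += p["priority"]   # ...faster when its priority is higher
    chosen = max(procs, key=lambda name: procs[name]["points"])
    procs[chosen]["points"] = 0        # the chosen process "spends" its points
    return chosen

print(" ".join(tick_and_pick(procs) for _ in range(6)))
# High-priority "editor" runs most often, but low-priority "backup" still gets
# a turn eventually, so nothing starves.
```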
How can you ensure that interrupt latency will not exceed a certain value when there may be other variables and factors involved, like the hardware?
Hardware latency is predictable. It doesn't have to be constant, but it definitely is bounded - for example interrupt entry is usually 12 cycles, but sometimes it may take 15 cycles.
RTOS latency is predictable. It also is not constant, but for example you can be certain that the RTOS does not block interrupts for longer than 1000 cycles at any time. Usually it will block them for much shorter periods of time, but never longer than stated.
As long as your application doesn't do something strange (like a while (1); in the thread with the highest possible priority), the latency of the whole system will be the sum of the hardware latency and the RTOS latency.
The important fact here is that using a real-time operating system is not, on its own, enough for your application to also be real-time. In your application you have to ensure that the real-time constraints are not violated. The main job of the RTOS is to NOT get in your way while you do that, so it must not introduce random/unpredictable delays.
Generally the most important of the "predictable" things in an RTOS is that the highest-priority thread that is not blocked is executing. Period. In a GPOS (like the one on your desktop computer, in tablets or in smartphones), this is not true, because the scheduler actively prevents low-priority threads from starving by allowing them to run for some time, even if there are more important things to do right now. This makes the behaviour of the application unpredictable, because one day it may react within 10 us, while on another day it may react within 10 s, because the scheduler decided it was a great moment to save the logs to the hard drive or maybe do some garbage collection.
Alternatively you can think that for RTOS the latency is in the range of microseconds, maybe single milliseconds. For a GPOS the max latency would probably be something like dozens of seconds.
I'm unsure how Round Robin scheduling works with I/O Operations. I've learned that CPU bound processes are favoured by Round Robin scheduling, but what happens if a process finishes its time slice early?
Say we neglect the dispatching overhead itself and a process finishes its time slice early: will the scheduler schedule another process if it's CPU-bound, or will the current process start its I/O operation and, since that isn't CPU-bound, will the scheduler immediately switch to another (CPU-bound) process afterwards? And if CPU-bound processes are favoured, will the scheduler schedule ALL CPU-bound processes until they are finished and only afterwards schedule the I/O-bound processes?
Please help me understand.
There are two distinct schedulers: the CPU (process/thread ...) scheduler, and the I/O scheduler(s).
CPU schedulers typically employ some hybrid algorithms, because they certainly do regularly encounter both pre-emption and processes which voluntarily give up part of their time-slice. They must service higher-priority work quickly, while not "starving" anyone. (A study of the current Linux scheduler is most interesting. There have been several.)
CPU schedulers identify processes as being either "primarily 'I/O-bound'" or "primarily 'CPU-bound'" at this particular time, knowing that their characteristics can and do change. If your process repeatedly consumes full time slices, it is seen as CPU-bound.
I/O schedulers seek to order and re-order the I/O request queues for maximum efficiency. For instance, to keep the read/write head of a physical disk-drive moving efficiently in a single direction. (The two components of disk-drive delay are "seek time" and "rotational latency," with "seek time" being by-far the worst of the two. Per contra, solid-state drives have very different timing.) I/O-schedulers also have to be aware of the channels (disk interface cards, cabling, etc.) that provide access to each device: they can't simply watch what any one drive is doing. As with the CPU-scheduler, requests must be efficiently handled but never "starved." Linux's I/O-schedulers are also readily available for your study.
"Pure round-robin," as a scheduling discipline, simply means that all requests have equal priority and will be serviced sequentially in the order that they were originally submitted. Very pretty birds though they are, you rarely encounter Pure Robins in real life.
I have read about the medium-term scheduler in Galvin's operating systems book.
It was written that:
Sometimes, it is advantageous to swap out the process when it is not executing [waiting for I/O or waiting for the CPU] in order to decrease the degree of multiprogramming.
Also, we get a greater amount of free physical memory, which makes the execution of other processes faster by decreasing the number of page faults [as we have more memory].
So, it is the job of the medium-term scheduler to swap out and swap in partially executed processes.
But my question is: is the work of the medium-term scheduler really important in scenarios where we have plenty of available physical/main memory?
The purpose of the medium-term scheduler is to improve multiprogramming: it lets more processes effectively share main memory by swapping out processes that are waiting (e.g. for I/O) or have low priority, and swapping in other processes that were in the ready queue.
So you can see that we need the medium-term scheduler when we have limited memory. This swapping in and out does not take place when we are running a single small program and have a large amount of memory.
Similarly, if we are running multiple programs and we have very large memory (larger than the size of all processes plus additional space for other requirements), then the medium-term scheduler is not needed. Modern operating systems use paging, so instead of swapping whole processes they swap pages in and out of memory. In the same way, a system with very large (effectively infinite) memory would not suffer from page faults.
Medium-term scheduling is part of swapping. It removes processes from memory and reduces the degree of multiprogramming. The medium-term scheduler is in charge of handling the swapped-out processes.
A running process may become suspended if it makes an I/O request. A suspended process cannot make any progress towards completion. In this condition, to remove the process from memory and make space for other processes, the suspended process is moved to secondary storage. This is called swapping, and the process is said to be swapped out or rolled out. Swapping may be necessary to improve the process mix.