Degree of multiprogramming definition

What is the degree of multiprogramming in OS?
Is it the number of processes in the ready queue or the number of processes in the memory?

In a multiprogramming-capable system, jobs to be executed are placed in a job pool. Some number of those jobs are loaded into main memory, and one is selected for execution by the CPU. If at some point the running program terminates or has to wait for a peripheral device, control of the CPU is given to the next job in memory.
An important concept in multiprogramming is the degree of multiprogramming: the maximum number of processes that a single-processor system can accommodate efficiently.
These are some of the factors affecting the degree of multiprogramming:
Memory - The primary factor is the amount of memory available to be allocated to executing processes. If memory is too limited, the degree of multiprogramming is limited, because fewer processes will fit in memory (see the toy sketch after this list).
Operating system - The means by which resources are allocated to processes. If the operating system cannot allocate resources to executing processes in a fair and orderly fashion, the system will waste time in reallocation, or process execution could enter a deadlock state as programs wait for allocated resources to be freed by other blocked processes.
Other factors - Program I/O needs, program CPU needs, and memory and disk access speeds also affect the degree of multiprogramming.
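To make the memory factor concrete, here is a toy sketch. The job sizes, the memory budget, and the greedy admission policy are all made up for illustration; real systems are far more involved:

```python
# Toy sketch: admit jobs into a fixed amount of memory. The number of
# jobs resident at once is the degree of multiprogramming. All sizes
# here are invented for the example.
def admit(jobs, memory_mb):
    """jobs: list of (name, size_mb) in arrival order.
    Greedily load jobs until memory runs out; return the resident set."""
    resident, free = [], memory_mb
    for name, size in jobs:
        if size <= free:
            resident.append(name)
            free -= size
    return resident

jobs = [("A", 300), ("B", 500), ("C", 200), ("D", 400)]
print(admit(jobs, 1024))  # ['A', 'B', 'C'] -> degree of multiprogramming is 3
```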
Hope this answers your question. :)
If not, you can find more detail here: http://www.tcnj.edu/~coburn/os

For a system with a single CPU core, there will never be more than one
process running at a time, whereas a multicore system can run multiple
processes at one time. If there are more processes than cores, excess
processes will have to wait until a core is free and can be
rescheduled. The number of processes currently in memory is known as
the degree of multiprogramming.
Excerpt from: Operating System Concepts, 10th Edition, Abraham Silberschatz

Related

Is there a way to set worker weight?

I have two machines to do the load test. One machine has worse CPU performance, and it reaches high CPU usage as the number of users keeps increasing, while the other machine still has low CPU usage. Locust complains:
[2022-07-28 11:22:15,529] PF1YW96X-MUO/WARNING/root: CPU usage above 90%! This may constrain your throughput and may even give inconsistent response time measurements! See https://docs.locust.io/en/stable/running-locust-distributed.html for how to distribute the load over multiple CPU cores or machines
[2022-07-28 11:25:06,766] PF1YW96X-MUO/WARNING/locust.runners: CPU usage was too high at some point during the test! See https://docs.locust.io/en/stable/running-distributed.html for how to distribute the load over multiple CPU cores or machines
I want to set a lower weight for the machine that has worse CPU performance. Is there a way to do that?
You can run fewer worker processes on the weak machine. If necessary, you could run more than one process per core on the strong machine, just to make it take more Users.
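As the answer suggests, the lever is the number of worker processes per machine rather than any built-in weight. A minimal sketch of that idea, assuming a master is already running and a `locustfile.py` exists; the master address and the counts are placeholders:

```python
# Hypothetical launcher: start a chosen number of Locust worker
# processes on this machine, so a weaker box simply runs fewer of them.
# Assumes `locust` is on PATH and a master is already running.
import subprocess
import sys

MASTER_HOST = "192.168.0.10"  # placeholder: your master's address

def start_workers(count):
    """Start `count` locust worker processes pointing at the master."""
    return [
        subprocess.Popen(
            ["locust", "--worker", "--master-host", MASTER_HOST,
             "-f", "locustfile.py"]
        )
        for _ in range(count)
    ]

if __name__ == "__main__":
    # e.g. `python start_workers.py 2` on the weak machine,
    #      `python start_workers.py 8` on the strong one
    for proc in start_workers(int(sys.argv[1])):
        proc.wait()
```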

How are CPU resource units (millicore/millicpu) calculated under the hood?

Let's take this processor as an example: a CPU with 2 cores and 4 threads (2 threads per core).
From what I've read, such a CPU has 2 physical cores but can process 4 threads simultaneously through hyper-threading. In reality, one physical core can only truly run one thread at a time; with hyper-threading, the CPU exploits idle stages in the pipeline to make progress on another thread.
Now, here are Kubernetes with Prometheus and Grafana and their CPU resource unit of measurement: the millicore/millicpu. So they virtually slice a core into 1000 millicores.
Taking hyper-threading into account, I can't understand how they calculate those millicores under the hood.
How can a process, for example, use 100 millicores (a tenth of a core)? How is this technically possible?
PS: I incidentally found a really descriptive explanation here: Multi threading with Millicores in Kubernetes
This gets very complicated. Kubernetes doesn't actually manage this itself; it just provides a layer on top of the underlying container runtime (Docker, containerd, etc.). When you configure a container to use 100 millicores, Kubernetes hands that down to the underlying container runtime, and the runtime deals with it.
Once you go down to this level, you have to look at the Linux kernel and how it does CPU scheduling and rate-limiting with cgroups, which becomes incredibly interesting and complicated. In a nutshell, though: Linux CFS Bandwidth Control is the thing that manages how much CPU a process (container) can use. By setting the quota and period parameters of the scheduler, you control how long a process can run before being paused and how often it gets to run. As you correctly identified, you can't use only a tenth of a core at any instant; but you can use a tenth of the time, and by doing that you use only a tenth of the core over time.
For example:
If I set the quota to 250ms and the period to 250ms, that tells the kernel that this cgroup can use 250ms of CPU cycle time every 250ms, which means it can use 100% of the CPU.
If I set the quota to 500ms and keep the period at 250ms, that tells the kernel that this cgroup can use 500ms of CPU cycle time every 250ms, which means it can use 200% of the CPU (2 cores).
If I set the quota to 125ms and keep the period at 250ms, that tells the kernel that this cgroup can use 125ms of CPU cycle time every 250ms, which means it can use 50% of the CPU.
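As a rough illustration of those knobs, here is a minimal sketch that writes the quota and period directly to a cgroup v2 `cpu.max` file. It assumes a cgroup v2 hierarchy mounted at /sys/fs/cgroup, root privileges, and a hypothetical group named `demo`; Kubernetes and the container runtime normally do this for you:

```python
# Minimal sketch of CFS bandwidth control via cgroup v2. cpu.max takes
# "<quota> <period>" in microseconds: the group may use quota_us of CPU
# time in every period_us window. Requires root; the "demo" group name
# is hypothetical.
import os

CGROUP = "/sys/fs/cgroup/demo"

def set_cpu_limit(quota_us, period_us):
    """Allow this cgroup quota_us of CPU time per period_us window."""
    os.makedirs(CGROUP, exist_ok=True)
    with open(os.path.join(CGROUP, "cpu.max"), "w") as f:
        f.write(f"{quota_us} {period_us}")

# 100 millicores ~ a tenth of a core over time:
# 10ms of CPU time every 100ms window.
set_cpu_limit(10_000, 100_000)
```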
This is a very brief explanation. Here is some further reading:
https://blog.krybot.com/a?ID=00750-cfae57ed-c7dd-45a2-9dfa-09d42b7bd2d7
https://www.kernel.org/doc/html/latest/scheduler/sched-bwc.html

About CPU operation and I/O processing

My question is: why do we want the CPU's operation to overlap with I/O processing? I have been thinking about optimization and such, but have yet to arrive at a conclusion.
If anyone is able to answer this question, that would be great. :D
I/O is generally very slow compared to the operating frequency of the CPU.
Suppose you have a 1GHz CPU that's capable of executing one instruction every clock cycle. That means the CPU is able to execute one instruction every nanosecond.
Now let's assume you want to fetch some data from your hard drive. Disk operations often take place on the millisecond scale, and we'll assume your drive is fast enough to fetch the data in only 1ms.
If the CPU just sits around and waits for the disk to fetch the data, it wastes 1 million nanoseconds doing nothing, when it could have been executing 1 million instructions for another task. When a program does a lot of I/O, those wasted cycles stack up and become noticeable if you let the CPU wait and do nothing. This is why it's a good idea to overlap computation with I/O, so that CPU cycles aren't wasted.
This is also why your computer becomes super unresponsive when your main memory is full and the CPU has to page to and from the disk frequently: the CPU cannot perform any useful work until the data it needs has been retrieved from disk into main memory, so it must sit around and wait for the I/O to complete.
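A toy demonstration of the payoff: time.sleep stands in for a slow device, a busy loop stands in for useful computation, and the numbers are made up. This is an illustration, not a benchmark:

```python
# Overlapping computation with I/O: the overlapped version finishes in
# roughly max(io, cpu) time instead of io + cpu.
import threading
import time

def fake_io():
    time.sleep(0.05)  # pretend this is a slow disk access

def cpu_work(n=1_000_000):
    total = 0
    for i in range(n):  # pretend this is useful computation
        total += i
    return total

# Sequential: the CPU sits idle while the "disk" works.
start = time.perf_counter()
fake_io()
cpu_work()
sequential = time.perf_counter() - start

# Overlapped: kick off the I/O, then compute while it is in flight.
start = time.perf_counter()
io_thread = threading.Thread(target=fake_io)
io_thread.start()
cpu_work()
io_thread.join()
overlapped = time.perf_counter() - start

print(f"sequential: {sequential*1000:.1f}ms, overlapped: {overlapped*1000:.1f}ms")
```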

FIFO (FCFS) in multiprogramming

I have a question:
In which scenario is FIFO (First In, First Out) scheduling possible in multiprogramming when we have only one processor?
Multiprogramming is the ability of the operating system to keep multiple programs in memory.
Multiprogramming really means switching between processes, interleaving the I/O time and CPU time of different processes.
So it is independent of the number of processors: even if there is only one processor, the system may be working on multiple programs.
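To make that concrete, here is a minimal FCFS sketch for one processor. The process names and burst times are made up for illustration:

```python
# Minimal FCFS (FIFO) sketch on a single CPU: processes run to
# completion in arrival order; the next one starts only when the
# current one gives up the CPU.
def fcfs(bursts):
    """bursts: list of (name, cpu_burst) in arrival order.
    Returns (name, start, finish) for each process."""
    clock = 0
    schedule = []
    for name, burst in bursts:
        start = clock
        clock += burst
        schedule.append((name, start, clock))
    return schedule

for name, start, finish in fcfs([("P1", 24), ("P2", 3), ("P3", 3)]):
    print(f"{name}: start={start}, finish={finish}")
```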

Can multiple cores simultaneously read the same RAM location?

Can multiple cores simultaneously read the same RAM location? I am interested in x86 architecture CPU's in particular. Also can the internal caches of two different cores on the same CPU get filled at the same time from the same RAM locations?
In short, they can read independently, and the caches will be filled independently, though the location may already be preloaded into the shared L3 cache. Synchronisation is not guaranteed down to the precise clock tick, but memory state remains coherent and transparent to the application. There is an excellent article on memory by Ulrich Drepper which is a must-read: http://lwn.net/Articles/250967/
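For a purely software-level illustration, here is a sketch of several processes reading the same shared-memory region at once; it says nothing about the hardware's cache behaviour, which the article above covers:

```python
# Several processes reading the same shared-memory location at once.
# The hardware cache coherence protocol keeps each core's cached copy
# consistent; pure readers need no extra synchronisation.
from multiprocessing import Process, shared_memory

def reader(name):
    shm = shared_memory.SharedMemory(name=name)
    print(bytes(shm.buf[:5]))  # concurrent read of the same bytes
    shm.close()

if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=5)
    shm.buf[:5] = b"hello"
    procs = [Process(target=reader, args=(shm.name,)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    shm.close()
    shm.unlink()
```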