Why do my Perl threads execute randomly on the first run but in order on subsequent runs?

In the course of testing the code for the question How can I store per-thread state between calls in Perl? I noticed that the first time I execute the script, the threads' execution is fairly well interleaved. But on all subsequent executions of the script, the threads run almost perfectly in the order of their creation, with very little interleaving. This is Perl ithreads on Ubuntu 9.04.
Maybe someone could enlighten me about what's going on?

Your threads are running in creation order largely due to implementation details in Perl and the operating system (mainly because each thread's individual execution time is shorter than the operating system's smallest scheduling time slice). To interleave the threads, you can use sleep rather than yield (or make the threads do some real work).
Keep in mind that yield in Perl threads is just a suggestion, unlike the way yield works in some other languages. Since Perl threads are concurrent and largely managed by the operating system's scheduler, unless you use some sort of mutex to block execution, it's not really possible to predict their execution order.
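For example, a minimal sketch along these lines (the exact interleaving still depends on the scheduler, so output will vary) usually shows the threads overlapping once each one sleeps between steps:

    use strict;
    use warnings;
    use threads;
    use Time::HiRes qw(sleep);    # allows fractional sleeps

    $| = 1;    # unbuffered output so the interleaving is visible

    sub worker {
        my $id = shift;
        for my $step (1 .. 3) {
            print "thread $id, step $step\n";
            sleep 0.01;    # unlike yield, this reliably lets another thread run
        }
    }

    my @threads = map { threads->create(\&worker, $_) } 1 .. 3;
    $_->join for @threads;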

Thread scheduling is a complex implementation detail of the OS. Trying to figure out how it works by observing how threads are scheduled is next to impossible. And you shouldn't really make any assumptions based on these observations. On different hardware the OS may schedule the threads differently.


Does each system call create a process?
Are all functions (e.g. interrupts) of programs and operating systems executed in the form of processes?
I feel that such a large number of process control blocks and so much process scheduling would waste a lot of resources.
Or are the kernel instructions of a system call regarded as part of the current process?
The short answer is - not exactly. But we have to agree on what we are going to call a "process". A process is more of an abstract idea, which encapsulates multiple instructions, each sequentially executed.
So let's start from the first question.
Does each system call create a process?
No. Each system call is issued by the currently running process, which tells the OS - "Hey OS, I need you to open this file for me, or read these here bits". In this case, the process is a bag of sequentially executed instructions, some of which are system calls and some of which are not.
Then we have:
Are all functions (e.g. interrupts) of programs and operating systems executed in the form of processes?
Well, this kind of goes back to the first question. We do not consider a system call (an operation that tells the OS to do something, and that works under very strict conditions) to be a separate process. We will NOT see that system call's execution get its OWN process id (pid).
Then we have:
I feel that such a large number of process control blocks and so much process scheduling would waste a lot of resources.
Well, I would say: do not underestimate your OS and the capabilities of your hardware. A modern processor with a modern OS on it is VERY, VERY fast, more than capable of executing billions of instructions per second. We can't really imagine how fast that is. I wouldn't worry about optimizations at such a micro level.
Okay, but let's dig deeper into this. What is a process exactly?
Informally, a process is a program in execution. The status of the current activity of a process is represented by a value, called the program counter, and the contents of the processor’s registers. The memory layout of a process is typically divided into multiple sections.
These sections include:
Text section.
Data section.
Heap section.
Stack section.
As a process executes, it changes state. The state of a process is defined in part by the current activity of that process. Each process is represented in the OS by a process control block (PCB), as you already mentioned.
So we can see that we treat a process as a very complicated structure that is MORE than just occupying CPU time. It has a state, storage, timing, and so on.
But because you are interested in system calls, then what are they?
For us, system calls provide an interface to the services made available by an OS. They are the way we tell the OS to do things FOR US. Systems execute thousands of system calls per second. So does each of those calls create a process?
No, they don't.
The operating system uses a software interrupt to execute the system call within the same process.
You can think of a system call as a function call, except that it is executed with kernel privileges.
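As a small illustration in Perl (reading the script file itself, so nothing needs to be set up): the open/read/close system calls all execute inside the one running process - $$ never changes - while fork really does create a new process with its own pid:

    use strict;
    use warnings;

    print "before syscalls: pid $$\n";
    open my $fh, '<', $0 or die "open: $!";    # open(2) under the hood
    my $line = <$fh>;                          # read(2) under the hood
    close $fh;                                 # close(2)
    print "after syscalls:  pid $$\n";         # same pid as before

    my $child = fork();                        # fork(2) DOES create a process
    die "fork: $!" unless defined $child;
    if ($child == 0) {
        print "child:          pid $$\n";      # a different pid
        exit 0;
    }
    waitpid($child, 0);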

UVM shared variables

I have a question regarding UVM. Suppose I have a DUT with two interfaces, each one with its own agent, generating transactions on the same clock. These transactions are handled by analysis imps (and their write functions) in the scoreboard. My problem is that both of these transaction streams read/modify shared variables of the scoreboard.
My questions are:
1) Do I have to guarantee mutual exclusion explicitly through a semaphore? (I suppose yes.)
2) Is this, in general, a correct way to proceed?
3) And, the main problem: can the order of execution somehow be fixed?
Depending on that order, the values of the shared variables can change, generating inconsistency. Moreover, that order is fixed by the specifications.
Thanks in advance.
While SystemVerilog tasks and functions do run concurrently, they do not run in parallel. It is important to understand the difference between parallelism and concurrency and it has been explained well here.
So while a SystemVerilog task or function could be executing concurrently with another task or function, in reality it does not actually run at the same time (in a run-time context). The SystemVerilog scheduler keeps a list of all the tasks and functions that need to run at the same simulation time, and at that time it executes them one by one (sequentially) on the same processor (concurrency), not together on multiple processors (parallelism). As a result, mutual exclusion is implicit and you do not need to use semaphores on that account.
The sequence in which two such concurrent functions are executed is not deterministic, but it is repeatable. So when you execute a testbench multiple times on the same simulator, the sequence of execution will be the same. But two different simulators (or different versions of the same simulator) could execute these functions in a different order.
If the specifications require a certain order of execution, you need to ensure that order by making one of these tasks/functions wait on the other. In your scoreboard example, since you are using analysis ports, you will have two "write" functions (perhaps declared using the uvm_analysis_imp_decl macro) executing concurrently. To ensure an order (since functions cannot wait), you can fork off join_none threads and make one of the threads wait on the other, by introducing an event that gets triggered at the conclusion of the first thread and having the other thread wait for this event at its start.
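The wait-on-event pattern itself is language-agnostic; as a rough sketch of the idea (written in Perl, the language used elsewhere on this page, with a shared condition variable standing in for the SystemVerilog event), the second handler blocks until the first signals that it has finished:

    use strict;
    use warnings;
    use threads;
    use threads::shared;

    my $a_done :shared = 0;

    my $handler_a = threads->create(sub {
        # ... handler A updates the shared state first ...
        lock($a_done);
        $a_done = 1;
        cond_signal($a_done);    # the "event trigger" at the end of thread A
    });

    my $handler_b = threads->create(sub {
        {
            lock($a_done);
            cond_wait($a_done) until $a_done;    # the "event wait" at the start of thread B
        }
        # ... handler B's updates run only after A has finished ...
    });

    $_->join for ($handler_a, $handler_b);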
This is a pretty difficult problem to address. If you get 2 transactions in the same time step, you have to be able to process them regardless of the order in which they get sent to your scoreboard. You can't know for sure which monitor will get triggered first. The only thing you can do is collect the transactions and at the end of the time step do your modeling/checking/etc.
Semaphores only help you if you have concurrent threads that take (simulation) time that are trying to access a shared resource. If you get things from an analysis port, then you get them in 0 time, so semaphores won't help you here.
So to my understanding, the answer is: the compiler/vendor/UVM cannot ensure the order of execution. If you need to ensure an order between things that actually happen in the same time step, you need to use a semaphore correctly to make it work the way you want.
Another thing is, only you yourself know which one must execute after the other if they occur at the same simulation time.
This is a classical race condition, where the result depends upon the actual thread order...
First of all, you have to decide whether the write race is problematic for you and/or whether there is a priority order in this case. If you don't care, the last access wins.
If the access isn't atomic, you might need a semaphore to ensure that only one access is handled at a time and the next waits until the first has finished.
You can also try to control the order by changing the structure, by introducing thread ordering (wait_order), or, if possible, by removing the timing dependency altogether (instead of operating directly on the data as it arrives, you simply store the data for some time and operate on it later).

Multicores and multithreads

How is process-based multitasking achieved by using multi-threading in each process?
For example, consider an operating system running two background processes, where each process internally supports multithreading. Now, how does time slicing happen between and inside these processes, and how does time slicing happen between threads?
The scheduler typically works at the thread level. In simplest terms the scheduler gives each runnable thread its timeslice in turn.
So a process with two threads will get twice as much CPU time as a process with one thread (all else being equal).
From:
http://msdn.microsoft.com/en-us/library/ms684259(VS.85).aspx
"A multitasking operating system divides the available processor time among the processes or threads that need it. The system is designed for preemptive multitasking; it allocates a processor time slice to each thread it executes. The currently executing thread is suspended when its time slice elapses, allowing another thread to run. When the system switches from one thread to another, it saves the context of the preempted thread and restores the saved context of the next thread in the queue.
The length of the time slice depends on the operating system and the processor. Because each time slice is small (approximately 20 milliseconds), multiple threads appear to be executing at the same time. This is actually the case on multiprocessor systems, where the executable threads are distributed among the available processors. However, you must use caution when using multiple threads in an application, because system performance can decrease if there are too many threads."
Also check out this link for when to use multitasking.
The operating system decides when and for how long each thread executes. For Microsoft operating systems, there is no way to determine or predict which thread in which process will execute next. Each thread also has a priority that it runs at. Higher-priority threads tend to get more time than lower-priority ones. This priority can be changed by the user or by a program. See this link for more info.
"Now, how does time slicing happen between and inside these processes, and how does time slicing happen between threads?"
That's entirely up to the operating system to decide, really. A really basic OS might not do time slicing at all, and just let each process run through to completion on a first-come, first-served basis.
However, most modern operating systems will use some flavor of scheduling algorithm to decide which thread gets to execute on which core and for how long, and perform the context-switching necessary to save and restore per-thread state when swapping out one thread for another.

job, task and process, what's the difference

What is the difference between these concepts?
“Process” is well-defined; “job” and “task” are ambiguous.
Fundamentally a job/task is what work is done, while a process is how it is done, usually anthropomorphised as who does it. A job is an overall unit of work, and is composed of tasks. In practice usage is very inconsistent, and often “task” == “process”, though formally a process performs a task.
Process is a well-defined operating systems concept, as is thread: a process is an instance of a program that is being executed, and is the basic unit of resources: a process consists of or “owns” its image, execution context, memory, files, etc.; etymologically a process is the steps done by a processor. A process consists of one or more threads, which are the unit of scheduling, and consist of some subset of a process (possibly shared with other threads): execution context and perhaps more. Traditionally a thread is the unit of execution on a processor (a thread is “what is executing”), but with multi-core processors and hardware threads, some scheduling is done even at the level of a single core. There are various kinds of processes and threads, and the exact definition varies between platforms.
Job and task are today vague, ambiguous terms, especially task. A “job” often means a set of processes, while a “task” may mean a process, a thread, a process or thread, or, distinctly, a unit of work done by a process or thread.
To give an idea how confused the naming is,
Windows Task Manager manages (running) processes, while
Windows Task Scheduler schedules programs to execute in future, what is traditionally known as a job scheduler, and uses the .job extension!
The term “job” traditionally means a “piece of work” (as opposed to “occupation”), and is used as such in manufacturing, in the phrase “job production”, meaning “custom production”, where it is contrasted with batch production (many items at once, one step at a time) and flow production (many items at once, all steps at the same time, by item). Note that these distinctions have become blurred in computing, notably in the oxymoronic term “batch job”.
In computing, “job” originates in non-interactive processing on mainframes, notably in IBM’s Job Control Language for the DOS/360 and OS/360 of the mid-1960s, and formally means a “unit of work for an operating system”, which consists of steps, each of which is a request to execute a specific program. Early computers primarily did batch processing (running the same program over many input data), like census or billing, and a standard type of one-off job was compiling a program from source, which could then process batches of data. Later batch came to be applied to all non-interactive computing, whether one-off or multiple items.
In Unix shells, a “job” is the shell’s representation for a process group – a set of processes that can all be sent a signal – concretely a pipeline and its descendent processes; note that running a script starts a job, exactly as in mainframes. The job is not done until the processes complete, and a job can be stopped, resumed, or terminated, which corresponds to suspending, resuming, or terminating the processes. Thus while formally a job is distinct from the process group, this is a subtle distinction and thus people often use “job” to mean “set of processes”.
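As a rough sketch of that "set of processes that can all be sent a signal" idea, in Perl: setpgrp puts the child in its own process group, and kill with a negative pid signals every member of the group at once (the sleeps merely stand in for real work and for startup time):

    use strict;
    use warnings;

    my $pid = fork();
    die "fork: $!" unless defined $pid;

    if ($pid == 0) {
        setpgrp(0, 0);    # child becomes leader of a new process group
        sleep 60;         # stand-in for the job's real work
        exit 0;
    }

    sleep 1;              # crude: give the child time to call setpgrp
    kill 'TERM', -$pid;   # negative pid => signal the whole process group
    waitpid($pid, 0);
    print "job (process group $pid) terminated\n";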
Traditional jobs (and batches) have finite input data and should complete processing, successfully or not. By contrast, when running a server, such as a web server, the input, such as a stream of requests, is unlimited (formally codata). This is analogous to flow production, and the process (or “job”) never completes, though it can be terminated or “canceled”. In a quip, “a server’s job is never done” (formally, exit status will be CANCELED, not COMPLETED/SUCCESS).
The term “step” makes sense for sequential computing – one step follows another – but once you have concurrent computing, you have a set of tasks, which do not necessarily run in a particular order, rather than a sequence of steps. The term “task” was popularized by OS/360, which featured “Multiprogramming with a Fixed number of Tasks (MFT)” and “Multiprogramming with a Variable number of Tasks (MVT)”, though in this case “task” was used synonymously with “process” or “thread”, as the basic task is “execute this program” (so the resulting process/thread performs the task), which is probably the source of the ambiguity.
Formally “multitasking” means “working on multiple tasks concurrently”, but in practice means an operating system (or virtual machine, or runtime, or individual process) “running multiple processes/threads concurrently”.
A clear distinction between tasks as work and process/threads as how the work is done is given in a task queue, as in this diagram of a thread pool: there is a (big, potentially unlimited) queue of incoming tasks (pending), which are performed by a (small, often fixed) set of threads, each task being performed by a single thread, and each thread performing a single task at a time: the active tasks correspond to the active threads. Concretely, consider a multithreaded web server, where the tasks are "service this web page request", and each thread fetches (from disk or memory) or renders the web page (say by a template or PHP), then returns the result.
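A minimal sketch of that picture, using Perl's core threads and Thread::Queue modules (many pending tasks, a small fixed set of worker threads, each performing one task at a time):

    use strict;
    use warnings;
    use threads;
    use Thread::Queue;

    my $tasks = Thread::Queue->new(1 .. 20);    # big (potentially unlimited) queue of pending tasks

    my @workers = map {
        threads->create(sub {
            while (defined(my $task = $tasks->dequeue())) {
                print "worker ", threads->tid(), " handling task $task\n";
            }
        });
    } 1 .. 4;                                   # small, fixed set of threads

    $tasks->end();       # no more tasks; dequeue() returns undef once drained
    $_->join for @workers;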
As you can see from this last example, it is often useful to distinguish tasks from threads or processes, and in particular contexts “job” and “task” have specific meanings, though in general they are ambiguous.
Clearest is thus to avoid using “job” or “task” and instead refer to a “set of processes”, “process”, or “thread”, and for servers to refer to requests (or queries) rather than tasks.
They can be all considered the same thing, really depends on the context. A process though is usually an isolated entity that's managed by the operating system. A job is often more of an application level term or just some script that's executed to do a specific set of task(s). A task is often a part of a job - sometimes the only part.
A job is a unit of work that has been submitted by user. It is usually associated with batch systems. A batch job might be a request to run multiple programs in succession [pg 144]. However, it can be assumed that a job is a request to run a single program. Hence, depending on the context, a job can be a program (we usually assume this), or a set of programs (e.g. batch systems) [pg 8].
A process is an active entity, which requires a set of resources, including a processor and special registers to perform its function. It is a single instance of an executable program. So from here, you can see the connection between a process and a program, hence, a job.
The Linux kernel internally represents processes as tasks [pg 742].
Source: Modern Operating Systems (3rd edition) by Tanenbaum, published by Pearson Education, Inc, 2009
A task represents the execution of a single process or multiple processes on a compute node. A collection of tasks that is used to perform a computation is known as a job. Jobs are used to reserve the resources required by tasks.
source: jobs and tasks http://msdn.microsoft.com/en-us/library/bb525214%28v=vs.85%29.aspx
Well...
This might not be as clear-cut as described here. It may very well depend on the operating system someone is dealing with.
For example, when compiling a DIGITAL Equipment OSF/1 kernel (the OS later known as Tru64 UNIX) -- when that Unix still existed, at the end of the nineties and the beginning of this century -- the term TASK was dedicated to the number of parallel tasks the kernel was able to handle.
It was a fixed array of tasks the kernel could perform at a given moment.
Thus it was the sum of the processes it could spawn plus the internal tasks it had to do, even those not seen as processes by ps. It was a very low-level count of actions allowed to the kernel on each NUMA node, not something accessible outside the kernel.
On the other hand, an earlier operating system like DEC VMS was known for having the job as its base OS unit (you interactively logged in under a job), which could execute (depending on system and account parameters and privileges) many processes at a time. An image (an executable) then occupied a process and, most of the time, multiple threads at a time (the OS took care of multithreading by itself).
So, then, the job was not application related but really OS related.
Somewhat similarly, Windows, which does not natively support fork() as a lightweight process creator, tends to create processes (using a spawn primitive - CreateProcess - that looks very much like the one that existed on VMS/OpenVMS 40 years ago) that are heavier than the Unix ones. Here, we have the same word (process) describing, in OS terms, two realities that are quite different: a Windows process tends to be closer to a VMS job than to a true Unix process.
As I have not configured/built any Unix kernel since Tru64 UNIX, I am not able to discuss the TASK kernel parameter of a Debian or other Linux OS, if any. It would be interesting if someone with inner knowledge of the task limits of those kinds of OS could explain this concept in those systems further.
To conclude: task, process, job, spawn, fork, thread... the more you dig into different OSes, the more varieties and possibly contradictory definitions you face.
Gilles
[non native English speaker, pardon my English].

Can I make Perl ithreads in Windows run concurrently?

I have a Perl script that I'm attempting to set up using Perl Threads (use threads). When I run simple tests everything works, but when I do my actual script (which has the threads running multiple SQLPlus sessions), each SQLPlus session runs in order (i.e., thread 1's sqlplus runs steps 1-5, then thread 2's sqlplus runs steps 6-11, etc.).
I thought I understood that threads would do concurrent processing, but something's amiss. Any ideas, or should I be doing some other Perl magic?
A few possible explanations:
Are you running this script on a multi-core processor or multi-processor machine? If you only have one CPU, only one thread can use it at any given time.
Are there transactions or locks involved with the steps that would prevent them from being run concurrently?
Are you certain you are using multiple connections to the database and not sharing a single one between threads?
Actually, you have no way of guaranteeing in which order threads will execute. So the behavior (if not what you expect) is not really wrong.
I suspect you have some kind of synchronization going on here. Possibly SQL*Plus only lets itself be called once? Some programs do that...
Other possibilities:
Thread creation and process creation (you are creating subprocesses for SQL*Plus, aren't you?) take longer than the thread's actual work, so thread 1 is finished before thread 2 even starts.
You are using transactions in your SQL scripts that force synchronization of database updates.
Check your database settings. You may find that it is set up in a conservative manner. That would cause even minor reads to block all access to that information.
You may also need to call threads::yield.
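One structural thing worth checking is that every thread gets created before any of them is joined, so the SQL*Plus subprocesses can actually overlap. A sketch of that shape (the connection string and script names are placeholders for your own):

    use strict;
    use warnings;
    use threads;

    sub run_session {
        my $script = shift;
        # Each thread blocks in its own system() call independently,
        # so the SQL*Plus sessions run concurrently.
        system('sqlplus', '-S', 'user/pass@db', "\@$script") == 0
            or warn "sqlplus $script failed: $?";
    }

    # Create every thread first ...
    my @threads = map { threads->create(\&run_session, $_) }
                  qw(first_steps.sql second_steps.sql);

    # ... and only then wait for them all.
    $_->join for @threads;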