Windows message pump - operating-system

This is just a technical question to improve my understanding of OS architecture.
I understand when the Application.Run() method is executed, a new form with its message pump is created. From MSDN and other online articles, I understand its thread safe nature and even understand that the Windows OS components like HAL layer, core OS services and applications on the top of the hierarchy all communicate between one another using messaging too.
Is this custom only to Windows or does this happen in the Linux environment too?
Can this be thought of as a semaphore? Or does the definition and context of a semaphore only make sense in a multi-threaded environment?
Please advice.
Thanks,
Subbu

There are many ways how processes can communicate, together called IPC - inter-process communication. From historical reasons, in UNIX-like systems use other mechanisms for communicating between processes than the message loop. UNIX processes are usually communicating through pipes (one can think about them as temporary files which can be only written in one process and read in another one), signals (code preempting the actual execution of some process) or process return values (similar to function returning). There are many other ways how to communicate (sockets, shared memory, files) but these are the most usual.
As for the semaphores: I am not sure how should these be related to message passing, semaphores objects designed for allowing programmers to create critical sections of code. Because in UNIX can be semaphore shared even between different processes (not only different threads in one process), they make sense in any multi-process OS (which is almost every today's OS), even with no threading support.
Well, semaphores can be used even with fibrils - userspace threads which are not preempted by exhausting their time quantum, as threads do, but which yield control to another fibril manually (for example when the fibril is about to begin a long blocking operation such as reading data from harddisk, it may request the data and instead of blocking switch to another fibril which wants CPU).

Unix systems have the message queues:
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
int msgsnd(int msqid, const void *msgp, size_t msgsz, int msgflg);
ssize_t msgrcv(int msqid, void *msgp, size_t msgsz, long msgtyp, int msgflg);
which are much less used than Windows messages but operate in a very similar fashion. Also a very similar concept, the Go language nicely implements the CSV (communicating sequential processes), which is an excellent multitasking paradigm, because does not suffer from exponential complexity growth. I would recommend Unix system programmers to use message queues more.
Windows messages are also somewhat similar to Unix signals, but Unix signals (usually) don't have arguments, are very limited in number (often only 32, compared to thousands of Windows messages) and the signal handlers have to execute in a weird suspended environment, which makes them much less practical. Nonetheless, signals are much more popular in Unix programming than message queues.
Regarding semaphores
Rather than using semaphores (which have an attached counter), you should first try to use mutexes, which are more lightweight and usable for synchronizing threads inside the same process.

Related

The reason(s)/benefit(s) to use realtime operating system instead of while-loop on MCU

I'm working on a wheeled-robot platform. My team is implementing some algorithms on MCU to
keep getting sensors reading (sonar array, IR array, motor encoders, IMU)
receive user command (via a serial port connected to a tablet)
control actuators (motors) to execute user commands.
keep sending sensor readings to the tablet for more complicated algorithms.
We currently implement everything inside a global while-loop, while I know most of the other use-cases do the very same things with a real-time operating system.
Please tell me the benefits and reasons to use a real-time os instead of a simple while-loop.
Thanks.
The RTOS will provide priority based preemption. If your code is off parsing a serial command and executing it, it can't respond to anything else until it returns to your beastly loop. An RTOS will provide the abstractions needed for an instant context switch based on an interrupt event. Otherwise the worst-case latency of an event response is going to be that of the longest possible excursion out of the main loop, and sometimes you really do need long-running processes. Such as, for example, to update an LCD panel or respond to USB device enumeration. Preemption permits you to go off and do these things safe in the knowledge that a 16-bit timer running at the CPU clock isn't going to roll over several times before it's done. While for simple control jobs a loop is sufficient, the problem starting with it is when you get into something like USB device enumeration it's no longer practical and will need a full rewrite. By starting with a preemptive framework like an RTOS provides, you have a lot more future flexibility. But there's definitely a bit more up-front work, and definitely a learning curve.
"Real Time" OS ensures your task periodicity. If you want to read sensors data precisely at every 100msec, simple while loop will not guarantee that. On other hand, RTOS can easily take care of that.
RTOS gives you predictibility. An operation will be executed at given time and it will not be missed.
RTOS gives you Semaphores/Mutex so that your memory will not be corrupted or multiple sources will not access buffers.
RTOS provides message queues which can be useful for communication between tasks.
Yes, you can implement all these features in While loop, but then that's the advantage! You get everything ready and tested.
If your while loop works (i.e. it fulfills the real-time requirements of your system), and it's robust, maintainable, and somewhat extensible, then there probably is no benefit to using a real-time operating system.
But if your while-loop can't fulfill the real-time requirements or is overly complex or over-extended (i.e., any change requires further tuning and tweaking to restore real-time performance), then you should consider another architecture.
An RTOS architecture is one popular next step beyond the super-loop. An RTOS is basically a tool for managing software complexity. It allows you to divide a complex set of software requirements into multiple threads of execution. When done properly, each individual thread has a relatively simple set of requirements and becomes easier to implement. And thread prioritization can make it easier to fulfill the real-time requirements of the application. These are basically the benefits of employing an RTOS.
An RTOS is not a panacea, however. An RTOS increases the overall system complexity and opens you up to new types of bugs (such as deadlocks). It requires knowledge and experience to design and implement an RTOS based program effectively. Consider alternatives such as Multi-Rate Main Loop Tasking or an event-based state machine architecture (such as QP). These alternatives might be simpler, easier to understand, or more compatible with your way of designing software.
There are a couple huge advantage that a RTOS multitasker has:
Very good I/O performance: an RTOS can set a waiting thread ready as soon as an I/O action that it requested has completed, so latency in handling reads/writes/whatever can be low. A looped, polled design cannot respond to I/O completions until it gets round to checking the I/O status, (either directly or my polling some volatile flag set by an interrupt-handler).
Indpendent functionality: The ease of implementing isolated subsystems for serial comms, actuators etc. may well be one for you, as suggested in the other answers. It's very reassuring to know that any extra delay in, say some serial exchange, will not adversely affect timing elsewhere. You need to wait a second? No problem: sleep(1000) and you're done - no effect on the other subsystems:) It's the difference between 'no, I cannot add a network server, it would change all the timing and I would have to retest everything' and 'sure, there's plenty of CPU free, I already have the code from another job and I just need another thread to run it'.
There ae other advanatges that help offset the added annoyance of having to program a preemptive multitasker with its critical sections, semaphores and condvars.
Obviously, if your hardware has multiple cores, the RTOS helps - it is designed to share out available CPU execution cycles just like any other resource, and adding cores means more cycles.
In the end, though, it's the I/O performance and functional isolation that's the big win.
Some of the suggestions in other answers may help, either instead of, or together with, an RTOS. When controlling multiple I/O hardware, eg. sensors and actuators, an event-driven state-machine is a very good idea indeed. I often implement that by queueing all events into one thread, (producer-consumer queue), that has sole access to the state data and implements the state-machine, so serializing actions.
Whether the advantages are worth the candle is between you and your requirements:)
RTOS is not instead of while loop - it is while loop + tools which organize your tasks. How do they organize your tasks? Assign priorities to them, decide how much time each one have for a job and/or at what time it should start/end. RTOS also layers your software, i.e harwdare related stuff, application tasks, etc. Aside of that gives you data structures, containers, ready to use interfaces to handle common tasks so you don't have to implement your own i.e allocate some memory for you, lock access for some resources and so on.

How did the apples fell for threads to be conceived

I was going through the following lecture notes on OS :
http://williamstallings.com/Extras/OS-Notes/h2.html
What I could draw is that "A process is a stream of execution ,i.e.basically a sequence of statements and so is a thread .However , the register states of one process are independent of the register states of another process but the register states of another thread can be accessed inside a thread. For every process at least one thread is allotted or dedicated ,when a process is started the OS activities for that process are taken over by the thread ( or a thread)"
What was the rationale behind conceiving the idea of threads ? When the OS is running a particular process why do we need some intermediate like a thread between them ?
"However , the register states of one process are independent of the register states of another process but the register states of another thread can be accessed inside a thread".
Can the above statement be taken as in the code for a process we cannot access the register states of a another process but in a code for a thread we can access the register states of another thread ?
(The above question did have the substitution of process and thread by their definition as codes or sequences of streams )
P.S : The title of the question is a metaphoric one .Please forgive if it misleads in any way . :P Could I take the liberty to broaden up and ask that if
the processor generates a thread for every process what does it write in the code for a thread ?(How does the code for a thread look like ? )
Terminology - for a system with virtual memory, threads share the same virtual memory address space, while each process has it's own address space. Processes can share physical memory by having a portion of that memory shared into their virtual address spaces (but the virtual address for each process may be different even though it is the same physical memory block).
Early (1960's) instances of multi-processing were mainframes that ran multiple processes that usually did not communicate with each other. Most of this activity was for batch oriented jobs, with a stream of jobs to be run, often from a punched card reader, or in more advanced situations, from remote job entry sites, which were other computers with a few peripherals (card readers, tape drives, line printers, ... ) that communicated with the mainframe to run jobs. There were also time sharing applications, similar to servers, except in many cases, relatively dumb terminals were used to communicate with the main frame. By the 1970's, APL/SV (A Programmming Language / Share Variables) was a time sharing application / programming language that could share variables between users.
For multi-process / multi-threaded operating systems, the device drivers operate from a queue of requests (such as a file read or write). Each request to be added to a device driver queue is done similar to a context switch so there won't be conflicts between process or thread requests for I/O. Some peripherals, such as mainframe, SCSI, or ... disk drives also operated from an internal queue, and could process I/O requests out of order to reduce random access overhead.
The basic problem that drove thread was how can an application handle multiple tasks at the same time and do it in a system-independent manner?
In classical eunuchs, a process could only do one thing at a time. If you needed to handle multiple things you kicked off multiple processes.
In the olde RSX and VMS systems (and Windoze under the covers), programmers relied on software interrupts. A process could queue I/O requests to multiple devices and receive a software interrupt when the request completed, thus allowing the application to do multiple things at once.
Another approach to the multiple things at once problem was to use event queues (Windoze, X Windows).
The ADA programming language was the first (and still really the only) mainstream programming language to support threads (tasks) as a system independent way to handle these kinds of problems. DOD compliance mandates drove the creation of threads.
Originally, threads were implemented through libraries ("use threads", "many to one model"). With the rise of multiprocessor systems, there became an increased demand to be able to have threads execute in parallel on different processors. This drove the creates of kernel threads in operating systems. (Many operating systems still do not support kernel threads).

Which kind of inter process communication (ipc) mechanism should I use at which moment?

I know that there are several methods of inter-process communication (ipc), like:
File
Signal
Socket
Message Queue
Pipe
Named pipe
Semaphore
Shared memory
Message passing
Memory-mapped file
However I was unable to find a list or a paper comparing these mechanism to each other and pointing out the benefits of them in different environemnts.
E.g I know that if I use a file which gets written by process A and process B reads it out it will work on any OS and is pretty robust, on the other hand - why shouldn't I use TCP Socket ? Has anyone a kind of overview in which cases which methods are the most suitable ?
Long story short:
Use lock files, mutexes, semaphores and barriers when processes compete for a scarce resource. They operate in a similar manner: several process try to acquire a synchronisation primitive, some of them acquire it, others are put in sleeping state until the primitive is available again. Use semaphores to limit the amount of processes working with a resource. Use a mutex to limit the amount to 1.
You can partially avoid using synchronisation primitives by using non-blocking thread-safe data structures.
Use signals, queues, pipes, events, messages, unix sockets when processes need to exchange data. Signals and events are usually used for notifying a process of something (for instance, ctrl+c in unix terminal sends a SIGINT signal to a process). Pipes, shared memory and unix sockets are for transmitting data.
Use sockets for networking (or, speaking formally, for exchanging data between processes located on different machines).
Long story long: take a look at Modern Operating Systems book by Tanenbaum & Bos, namely IPC chapter. The topic is vast and can't be completely covered within a list or a paper.

Event or polled based embedded MCU system architecture?

I have prior experience in writing both event and poll based embedded systems (for tiny MCU's with no preemptive OS).
In an event based system, tasks usually receives events (messages) on a queue and handles them in turn.
In a polled based system, tasks polls status with a certain interval and responds to change.
Which architecture do you prefer? Can both co-exist?
UPDATE: POINTS MADE
POLL BASED
- Tight coupling related to timing aspects (#Lundin)
* Can co-exist alongside event system using queues (#embedded.kyle)
* Fine for smaller programs (#Lundin)
EVENT BASED
+ More flexible system in the long run (#embedded.kyle)
- RTOS edition adds complexity (#Lundin)
* Small programs = state-machine controlled (#Lundin)
* Can be implemented using queues and a "super-loop" (inside controller/main) (#embedded.kyle)
* Only true "events" are hw interrupts ones (#Lundin)
RELATED QUESTIONS
* Looking for a comparison of different scheduling algorithms for a Finite State Machine (#embedded.kyle)
RELATED INFO
* "Prefer Using Active Objects Instead of Naked Threads" (#Miro)
http://www.drdobbs.com/parallel/prefer-using-active-objects-instead-of-n/225700095
* "Use Threads Correctly = Isolation + Asynchronous Messages" (#Miro)
http://www.drdobbs.com/parallel/use-threads-correctly-isolation-asynch/215900465
There is really no such thing as "event-driven" on a bare bone MCU platform, despite what the buzzword-spitters are trying to tell you. The only kind of true events you can receive are hardware interrupts.
Depending on the nature of the application and its real time requirements, interrupts may or may not be suitable. Generally, it is far easier to achieve deterministic real time with a polling system. However, systems relying solely on polling are very hard to maintain, because you get tight coupling between all timing aspects.
Suppose you try to start up a LCD, which is slow. Instead of polling some timer repeatedly while burning CPU cycles in an empty loop, you would perhaps decide to receive some data over a bus in the meantime. And then you want to print the data received on the LCD. Such a design has created a tight coupling between the LCD startup time and the serial bus, and another tight coupling between the serial bus and the printing of data. From an object-oriented point-of-view these things are not related to each other at all. If you were to speed up the serial bus at some point in the future, then suddenly you could encounter LCD printing bugs, because it has not finished starting up when you try to print on it.
In a small program, it is perfectly fine to use polling like in the above example. But if the program has potential of growing, polling will make it very complex and the tight coupling will ultimately lead to many strange and fatal bugs.
On the other hand, multi-threading and RTOS adds quite a lot of extra complexity which in turn can lead to bugs as well. Where to draw the line isn't simple to determine.
Out of personal experience I'd say that any program smaller than 20-30k LOC will not benefit from scheduling and multitasking, beyond simple state machines. If the program gets larger than that, I'd consider a multitasking RTOS.
Also, low-end MCUs (8- and 16-bitters) are far from suitable to run an OS. If you find that you need an OS to handle complexity on a 8- or 16-bit platform, you probably picked the wrong MCU to begin with. I'd be sceptical against any attempts to introduce an OS on anything smaller than a 32-bitter.
Actually, event-driven programming and threads can be combined and the resulting pattern is widely known as "active objects" or "actors".
Active objects (actors) are encapsulated, event-driven state machines, which communicate with one another asynchronously by posting events to each other. Active objects process all events in their own thread of execution (at least conceptually, if a cooperative scheduler is used), so they avoid by design most concurrency hazards.
Actors and active objects are all the rage (again) in the general-purpose computing (you can search for Erlang, Scala, Akka). Herb Sutter has written a couple of good articles that explain the "active object" pattern: "Prefer Using Active Objects Instead of Naked Threads" (http://www.drdobbs.com/parallel/prefer-using-active-objects-instead-of-n/225700095) and "Use Threads Correctly = Isolation + Asynchronous Messages" (http://www.drdobbs.com/parallel/use-threads-correctly-isolation-asynch/215900465)
Here is what Herb says in the first of these articles:
"Using raw threads directly is trouble for a number of reasons ...
Active objects dramatically improve our ability to reason about our thread's code and operation by giving us higher-level abstractions and idioms that raise the semantic level of our program and let us express our intent more directly. As with all good patterns, we also get better vocabulary to talk about our design. Note that active objects aren't a novelty: UML and various libraries have provided support for active classes"
So, all this is really not new. But what's perhaps less known, especially in the embedded systems community, is that active objects are not only fully applicable to the embedded systems, but they are actually a perfect match for embedded and they are lighter than a traditional RTOS.
I've been using the event-driven active objects for over a decade now and have created the QP family of active object frameworks for embedded systems (see http://www.state-machine.com/). I would never go back to the polling "superloop" or the raw RTOS.
I prefer whichever architecture is best suited to the application at hand.
Both can co-exist in a multilevel queue architecture. One queue works on a poll basis running in the main loop. While another, most likely tasked with higher priority events, works by using interrupt based preemption.
See my answer to this SO question for a more detailed explanation and comparison of the different scheduling algorithms.

Are nonblocking I/O operations in Perl limited to one thread? Good design?

I am attempting to develop a service that contains numerous client and server sockets (a server service as well as clients that connect out to managed components and persist) that are synchronously polled through IO::Select. The idea was to handle the I/O and/or request processing needs that arise through pools of worker threads.
The shared keyword that makes data shareable across threads in Perl (threads::shared) has its limits--handle references are not among the primitives that can be made shared.
Before I figured out that handles and/or handle references cannot be shared, the plan was to have a select() thread that takes care of the polling, and then puts the relevant handles in certain ThreadQueues spread across a thread pool to actually do the reading and writing. (I was, of course, designing this so that modification to the actual descriptor sets used by select would be thread-safe and take place in one thread only--the same one that runs select(), and therefore never while it's running, obviously.)
That doesn't seem like it's going to happen now because the handles themselves can't be shared, so the polling as well as the reading and writing is all going to need to happen from one thread. Is there any workaround for this? I am referring to the decomposition of the actual system calls across threads; clearly, there are ways to use queues and buffers to have data produced in other threads and actually sent in others.
One problem that arises from this situation is that I have to give select() a timeout, and expect that it'll be high enough to not cause any issues with polling a rather large set of descriptors while low enough not to introduce too much latency into my timing event loop - although, I do understand that if there is actual I/O set membership detected in the polling process, select() will return early, which partly mitigates the problem. I'd rather have some way of waking select() up from another thread, but since handles can't be shared, I cannot easily think of a way of doing that nor see the value in doing so; what is the other thread going to know about when it's appropriate to wake select() anyway?
If no workaround, what is a good design pattern for this type of service in Perl? I have a requirement for a rather high amount of scalability and concurrent I/O, and for that reason went the nonblocking route rather than just spawning threads for each listening socket and/or client and/or server process, as many folks using higher-level languages these days are wont to do when dealing with sockets - it seems to be kind of a standard practice in Java land, and nobody seems to care about java.nio.* outside the narrow realm of systems-oriented programming. Maybe that's just my impression. Anyway, I don't want to do it that way.
So, from the point of view of an experienced Perl systems programmer, how should this stuff be organised? Monolithic I/O thread + pure worker (non-I/O) threads + lots of queues? Some sort of clever hack? Any thread safety gotchas to look out for beyond what I have already enumerated? Is there a Better Way? I have extensive experience architecting this sort of program in C, but not with Perl idioms or runtime characteristics.
EDIT: P.S. It has definitely occurred to me that perhaps a program with these performance requirements and this design should simply not be written in Perl. But I see an awful lot of very sophisticated services produced in Perl, so I am not sure about that.
Bracketing out your several, larger design questions, I can offer a few approaches to sharing filehandles across perl threads.
One may pass $client to a thread start routine or simply reference it in a new thread:
$client = $server_socket->accept();
threads->new(\&handle_client, $client);
async { handle_client($client) };
# $client will be closed only when all threads' references
# to it pass out of scope.
For a Thread::Queue design, one may enqueue() the underlying fd:
$q->enqueue( POSIX::dup(fileno $client) );
# we dup(2) so that $client may safely go out of scope,
# closing its underlying fd but not the duplicate thereof
async {
my $client = IO::Handle->new_from_fd( $q->dequeue, "r+" );
handle_client($client);
};
Or one may just use fds exclusively, and the bit vector form of Perl's select.