Is there a syscall for test_and_set() instruction? - system-calls

I'm reading Silberschatz's book "Operating System". The author talks about the test_and_set() instruction and says that it's a hardware instruction.
I'm curious to understand whether there exists a system call (in Linux, for example) that allows you to use test_and_set().

Related

Compiling an OS and defining the system calls

I'm trying to better understand operating systems, not the theory behind them but how real people write real OS code.
I know most OSes are written in C. I know the source code for these OSes includes calls to functions like malloc, calloc, etc., to allocate memory for a process.
Under normal conditions, i.e., when compiling code destined to run on an OS, I know that the C compiler will use the underlying OS's system calls to implement these functions. But when compiling the source code of these OSes themselves, how does the compiler know what to do? The system calls don't exist yet, because they're defined by the OS. Does the compiler just call some assembly routine, which will eventually become a system call?
It is complex because you need to understand several things about OS development to get the big picture. Overall, the OS isn't like a process that executes in order the way average user-mode code does. When the computer boots, the OS executes in order to set up its environment. After that, the OS is basically system calls and interrupts.
Each CPU works differently but most CPUs will have an interrupt table mechanism and a syscall mechanism. The interrupt table specifies where to jump for a certain interrupt number and the syscall mechanism is probably one register containing the address of the entry point for a syscall. It works like this on x86-64 (most desktop/laptop computers). x86-64 has the IDT (Interrupt Descriptor Table) and the syscall register is IA32_LSTAR.
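As an illustration, installing the syscall entry point comes down to writing its address into the IA32_LSTAR MSR with the wrmsr instruction. A minimal sketch, assuming ring 0, GCC-style inline assembly, and a hypothetical entry stub named syscall_entry:

#include <stdint.h>

#define IA32_LSTAR 0xC0000082  /* MSR holding the 64-bit syscall entry point */

/* Write a 64-bit value to a model-specific register (only legal in ring 0). */
static inline void wrmsr(uint32_t msr, uint64_t value)
{
    uint32_t lo = (uint32_t)value;
    uint32_t hi = (uint32_t)(value >> 32);
    __asm__ volatile("wrmsr" : : "c"(msr), "a"(lo), "d"(hi));
}

extern void syscall_entry(void);  /* hypothetical assembly entry stub */

void install_syscall_handler(void)
{
    wrmsr(IA32_LSTAR, (uint64_t)syscall_entry);
}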
The OS isn't written in C so that you can call malloc() or the like. The OS is written in C because C code can be made static and freestanding (all the code needed is in the executable, and it doesn't rely on any software external to the executable). Actually, when writing an OS, you cannot call malloc(). You need to avoid any standard library function implementations and use static and freestanding code (the base of C: structs, pointers, arithmetic, variables, etc.). C is also used because you can modify arbitrary memory locations with pointers. For example,
/* Treat the fixed address 0x1234 as a 32-bit memory-mapped location. */
volatile unsigned int* ptr = (volatile unsigned int*)0x1234;
*ptr = 0x87654321;
writes the value 0x87654321 to address 0x1234 (volatile keeps the compiler from optimizing the store away). You can also use bitwise operators (AND, OR, XOR, shifts, etc.) to modify memory at the bit level.
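A minimal sketch of such bit-level manipulation, with a made-up register address and bit positions:

volatile unsigned int* reg = (volatile unsigned int*)0x3F20;  /* hypothetical device register */
*reg |=  (1u << 3);   /* set bit 3 */
*reg &= ~(1u << 5);   /* clear bit 5 */
*reg ^=  (1u << 0);   /* toggle bit 0 */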
To answer your question: if you want to define the system calls in an OS that you write yourself, you simply decide what your syscalls are. When you write your syscall handler, you rely on the convention that someone using your OS knows a certain syscall number requests a certain operation, and you oblige by performing it. For example, Linux follows the System V ABI conventions for which registers carry the syscall number and its arguments. On x86-64 Linux, from user mode, you put your syscall number in RAX and execute the syscall instruction. The processor then looks in IA32_LSTAR for the address of the syscall handler and jumps to it. The processor (the core) is now in kernel mode (in the kernel). The kernel looks at RAX for the syscall number and answers the request by doing the associated operation (after several checks and a bunch of other things).
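To make that concrete, here is a minimal sketch of issuing the write syscall directly on x86-64 Linux, bypassing the C library (assuming GCC-style inline assembly; syscall number 1 is write, and the arguments travel in RDI, RSI and RDX as described above):

#include <stddef.h>

/* Invoke write(fd, buf, len) directly via the syscall instruction. */
static long raw_write(int fd, const void* buf, size_t len)
{
    long ret;
    __asm__ volatile("syscall"
                     : "=a"(ret)         /* RAX: return value */
                     : "a"(1L),          /* RAX: syscall number 1 = write */
                       "D"((long)fd),    /* RDI: first argument */
                       "S"(buf),         /* RSI: second argument */
                       "d"(len)          /* RDX: third argument */
                     : "rcx", "r11", "memory");  /* syscall clobbers RCX and R11 */
    return ret;
}

int main(void)
{
    raw_write(1, "hello\n", 6);  /* fd 1 is stdout */
    return 0;
}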

What happens to a program that contains instructions that are not supported by the CPU in question?

Say, for example, in the case of x86-64 you have a program that makes use of AVX-512 instructions or some other special-purpose instruction, and the CPU in question, such as a Haswell-based CPU, does not support the instruction in hardware. How does this situation get resolved?
Is the program simply not able to run (my first thought), or is there a way for the CPU to still run the program using the instructions it does support, merely without the performance advantage offered by the instruction in question?
If the program checks CPU features and avoids actually running unsupported instructions, you just miss out on performance. Libraries like x264 and x265 do this, setting function pointers at startup to the best-supported versions of their functions for the current CPU.
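In C with GCC or Clang, that dispatch can be sketched like this (the transform functions are made up; __builtin_cpu_supports is a real builtin):

#include <stdio.h>

static void transform_scalar(void)  { puts("scalar fallback"); }
static void transform_avx512(void)  { puts("AVX-512 path"); }  /* would hold AVX-512 intrinsics */

static void (*transform)(void);  /* chosen once at startup */

int main(void)
{
    __builtin_cpu_init();  /* required before __builtin_cpu_supports on some GCC versions */
    transform = __builtin_cpu_supports("avx512f") ? transform_avx512
                                                  : transform_scalar;
    transform();  /* safe on any CPU: the unsupported path is never executed */
    return 0;
}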
If you do run an unsupported instruction (e.g. because you compiled with gcc -O3 -march=skylake-avx512 to tell the compiler it could assume AVX-512 support) there are 2 possibilities:
It happens to decode as something else (which won't be the case for AVX-512, but does happen, for example, with lzcnt decoding as bsr on CPUs that don't support it).
The CPU raises a #UD exception (UnDefined instruction). The OS handles it and delivers a SIGILL (Illegal Instruction) to the process, or equivalent on non-POSIX OSes like Windows.
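You can watch this happen from user space by installing a SIGILL handler and executing ud2, an instruction architecturally defined to raise #UD. A sketch for POSIX systems:

#include <signal.h>
#include <unistd.h>

static void on_sigill(int sig)
{
    (void)sig;
    /* We get here after the kernel turns the CPU's #UD exception into SIGILL. */
    write(STDERR_FILENO, "caught SIGILL\n", 14);
    _exit(1);  /* returning would re-execute the faulting instruction */
}

int main(void)
{
    signal(SIGILL, on_sigill);
    __asm__ volatile("ud2");  /* deliberately undefined instruction */
    return 0;  /* never reached */
}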

Exactly to what extent are kernel structure and design dependent on the file system being used?

For instance, let's say hypothetically that we have access to the Windows source code.
Could we modify the source code to operate entirely on the ext4 file system, instead of NTFS, just by changing the code modules that depend on the exact file system being used? Or would major changes in the way the kernel works be needed?
To what extent does the file system being used affect the kernel design?
(Note: you can swap the above example for the case of ReactOS, an open-source clone of Windows 2000 that supports only the FAT file system, and ext4.
Moreover, I know that the Windows source code is not available to the public, so a definitive opinion cannot be given. I'm asking based on whatever is known about Windows internals for my given example, and as per general principles of kernel design.)
Generally, the OS kernel is not dependent upon any particular file system; most operating systems support multiple file systems at once. The usual design is a virtual file system (VFS) layer: the rest of the kernel talks to an abstract interface, and each concrete file system (NTFS, ext4, FAT, ...) plugs in by implementing that interface.
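Conceptually, the interface each file system implements is just a table of function pointers, loosely modeled on Linux's file_operations. A simplified sketch, not any real kernel's API:

#include <stddef.h>

struct file;  /* opaque per-open-file state */

/* Operations every concrete file system driver must provide. */
struct fs_operations {
    int  (*open)(struct file* f, const char* path);
    long (*read)(struct file* f, void* buf, size_t len);
    long (*write)(struct file* f, const void* buf, size_t len);
    int  (*close)(struct file* f);
};

/* The rest of the kernel only ever calls through such a table, so
   supporting ext4 next to NTFS means supplying another table, not
   redesigning the kernel. (These two are hypothetical declarations.) */
extern const struct fs_operations ext4_ops;
extern const struct fs_operations ntfs_ops;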

Is there runtime flow chart for Perl?

I am trying to better understand the logic and flow of exceptions. In doing so, I realized I really lack an understanding of how Perl interprets and runs programs, which phases are involved, and what happens in each phase.
For example, I'd like to understand when the STD* I/O handles are bound and when they are released, what happens with the $SIG{*} entries, how they relate to exceptions, how a program dies, and so on. I'd like better insight into the internal mechanics.
I am looking for links or books. I'd prefer material that also includes visual charts, but this is not mandatory. I'd like to see a "big picture" of the whole process first; then I can dig further if I find it necessary.
I found that Chapter 18 of Programming Perl gives an overview of the compile phase, and I'm trying to work through it, but I'd appreciate other good sources too.
Some alternative sources (there are not very many):
Manning's Extending and Embedding Perl, which is the go-to reference on Perl's internals outside of the source
The chapter on the Perl internals in Advanced Perl Programming, which may be exactly what you want
Simon Cozens's Perl internals FAQ
Those may be more focused on what you're looking for. I'm not sure any of them explicitly spells out the interpreter's runtime execution order, though. The first one is a better "I want to work with this stuff" book; the other two are probably good introductory references.
Some of the questions you ask are not, as far as I know, explicitly documented - the I/O question being one I can't think of a good source for in particular. Exception handling is documented very well in Try::Tiny's documentation, and it's what we use for exceptions. Signal handling is messy, but perlipc documents it pretty well. With threads, you may be stuck with unsafe signals - I generally avoid threads in favor of multiple processes unless I must have shared memory.
You might start with these topics accessible via the perldoc program:
Internals and C Language Interface
perlembed Perl ways to embed perl in your C or C++ application
perldebguts Perl debugging guts and tips
perlxstut Perl XS tutorial
perlxs Perl XS application programming interface
perlxstypemap Perl XS C/Perl type conversion tools
perlclib Internal replacements for standard C library functions
perlguts Perl internal functions for those doing extensions
perlcall Perl calling conventions from C
perlmroapi Perl method resolution plugin interface
perlreapi Perl regular expression plugin interface
perlreguts Perl regular expression engine internals
perlapi Perl API listing (autogenerated)
perlintern Perl internal functions (autogenerated)
perliol C API for Perl's implementation of IO in Layers
perlapio Perl internal IO abstraction interface
perlhack Perl hackers guide
perlsource Guide to the Perl source tree
perlinterp Overview of the Perl interpreter source and how it works
perlhacktut Walk through the creation of a simple C code patch
perlhacktips Tips for Perl core C code hacking
perlpolicy Perl development policies
perlgit Using git with the Perl repository

What is INT 21h?

Inspired by this question
How can I force GDB to disassemble?
I wondered about INT 21h as a concept. Now, I have some very rusty knowledge of the internals, but not many details. I remember that on the C64 you had regular interrupts and non-maskable interrupts, but my knowledge stops there. Could you please give me some clue? Is it a DOS-related strategy?
From here:
A multipurpose DOS interrupt used for various functions including reading the keyboard and writing to the console and printer. It was also used to read and write disks using the earlier File Control Block (FCB) method.
DOS can be thought of as a library used to provide a files/directories abstraction for the PC (and a bit more). int 21h is a simple hardware "trick" that makes it easy to call code from this library without knowing in advance where it will be located in memory. Alternatively, you can think of this as the way to utilise the DOS API.
Now, the topic of software interrupts is a complex one, partly because the concepts evolved over time as Intel added features to the x86 family, while trying to remain compatible with old software. A proper explanation would take a few pages, but I'll try to be brief.
The main question is whether you are in real mode or protected mode.
Real mode is the simple, "original" mode of operation for the x86 processor. This is the mode that DOS runs in (when you run DOS programs under Windows, a real mode processor is virtualised, so within it the same rules apply). The currently running program has full control over the processor.
In real mode, there is a vector table that tells the processor which address to jump to for every interrupt from 0 to 255. This table is populated by the BIOS and DOS, as well as device drivers, and sometimes programs with special needs. Some of these interrupts can be generated by hardware (e.g. by a keypress). Others are generated by certain software conditions (e.g. divide by 0). Any of them can be generated by executing the int n instruction.
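Each IVT entry is four bytes, a 16-bit offset followed by a 16-bit segment, stored at physical address vector*4. Looking one up can be sketched in C, assuming a 16-bit DOS compiler such as Turbo C (MK_FP comes from its dos.h):

#include <dos.h>  /* MK_FP: build a far pointer from segment and offset */

/* Return the segment:offset of the handler for a given interrupt vector. */
void far* ivt_handler(unsigned char vector)
{
    unsigned short far* entry = (unsigned short far*)MK_FP(0, vector * 4);
    return MK_FP(entry[1], entry[0]);  /* entry[0] = offset, entry[1] = segment */
}
/* The DOS handler therefore lives at ivt_handler(0x21). */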
Programs can set/clear the "enable interrupts" flag; this flag affects hardware interrupts only and does not affect int instructions.
The DOS designers chose to use interrupt number 21h to handle DOS requests - the number is of no real significance: it was just an unused entry at the time. There are many others (number 10h is a BIOS-installed interrupt routine that deals with graphics, for instance). Also note that all this applies to IBM PC compatibles only. x86 processors in, say, embedded systems may have their software and interrupt tables arranged quite differently!
Protected mode is the complex, "security-aware" mode that was introduced in the 286 processor and much extended on the 386. It provides multiple privilege levels. The OS must configure all of this (and if the OS gets it wrong, you have a potential security exploit). User programs are generally confined to a "minimal privilege" mode of operation, where trying to access hardware ports, or changing the interrupt flag, or accessing certain memory regions, halts the program and allows the OS to decide what to do (be it terminate the program or give the program what it seems to want).
Interrupt handling is made more complex. Suffice it to say that generally, if a user program issues a software interrupt, the interrupt number is not used as a vector into the interrupt table. Rather, a general protection exception is generated, and the OS handler for said exception may (if the OS is designed this way) work out what the process wants and service the request. I'm pretty sure Linux and Windows have in the past (if not currently) used this sort of mechanism for their system calls. But there are other ways to achieve this, such as the SYSENTER instruction.
Ralf Brown's interrupt list contains a lot of information on which interrupt does what. int 21, like all others, supports a wide range of functionality depending on register values.
A non-HTML version of Ralf Brown's list is also available.
The INT instruction is a software interrupt. It causes a jump to a routine pointed to by an interrupt vector, which is at a fixed location in memory. The advantage of the INT instruction is that it is only 2 bytes long, as opposed to maybe 6 for a JMP, and that it can easily be redirected by modifying the contents of the interrupt vector.
Int 0x21 is an x86 software interrupt - basically that means there is an interrupt table at a fixed point in memory listing the addresses of software interrupt functions. When an x86 CPU receives the interrupt opcode (or otherwise decides that a particular software interrupt should be executed), it references that table to execute a call to that point (the function at that point must use iret instead of ret to return).
It is possible to remap Int 0x21 and other software interrupts (even inside DOS, though this can have negative side effects). One interesting software interrupt to map or chain is Int 0x1C (or 0x08 if you are careful), which is the system tick interrupt, called 18.2 times every second. This can be used to create "background" processes, even in single-threaded real mode (the real-mode process will be interrupted 18.2 times a second to call your interrupt function).
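Chaining the tick interrupt can be sketched with a 16-bit DOS compiler such as Turbo C (getvect/setvect and the interrupt keyword are Borland extensions):

#include <dos.h>

static void interrupt (*old_tick)(void);   /* previous Int 0x1C handler */
static volatile unsigned long ticks;

static void interrupt my_tick(void)
{
    ticks++;        /* our "background" work, ~18.2 times per second */
    old_tick();     /* chain to the previous handler */
}

void install(void)
{
    old_tick = getvect(0x1C);
    setvect(0x1C, my_tick);
}

void uninstall(void)   /* restore the old vector before the program exits! */
{
    setvect(0x1C, old_tick);
}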
On the DOS operating system (or a system that is providing some DOS emulation, such as Windows console) Int 0x21 is mapped to what is effectively the DOS operating systems main "API". By providing different values to the AH register, different DOS functions can be executed such as opening a file (AH=0x3D) or printing to the screen (AH=0x09).
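Calling DOS function 09h from C can be sketched like this (again assuming a 16-bit DOS compiler; intdos comes from dos.h, and DOS expects the string to end with '$'):

#include <dos.h>

int main(void)
{
    union REGS regs;
    static const char msg[] = "Hello via INT 21h$";  /* DOS strings end with '$' */

    regs.h.ah = 0x09;           /* DOS function 09h: print string */
    regs.x.dx = (unsigned)msg;  /* DS:DX points at the string (small memory model) */
    intdos(&regs, &regs);       /* issues INT 21h */
    return 0;
}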
This is from the great The Art of Assembly Language Programming about interrupts:
On the 80x86, there are three types of events commonly known as interrupts: traps, exceptions, and interrupts (hardware interrupts). This chapter will describe each of these forms and discuss their support on the 80x86 CPUs and PC compatible machines.
Although the terms trap and exception are often used synonymously, we will use the term trap to denote a programmer initiated and expected transfer of control to a special handler routine. In many respects, a trap is nothing more than a specialized subroutine call. Many texts refer to traps as software interrupts. The 80x86 int instruction is the main vehicle for executing a trap. Note that traps are usually unconditional; that is, when you execute an int instruction, control always transfers to the procedure associated with the trap. Since traps execute via an explicit instruction, it is easy to determine exactly which instructions in a program will invoke a trap handling routine.
Chapter 17 - Interrupt Structure and Interrupt Service Routines
(Almost) the whole DOS interface was made available as INT 21h commands, with parameters in the various registers. It's a little trick, using a built-in hardware table to jump to the right code. Also, INT 33h was for the mouse.
It's a "software interrupt"; so not a hardware interrupt at all.
When an application invokes a software interrupt, that's essentially the same as making a subroutine call, except that (unlike a subroutine call) the caller doesn't need to know the exact memory address of the code it's invoking.
System software (e.g. DOS and the BIOS) expose their APIs to the application as software interrupts.
The software interrupt is therefore a kind of dynamic linking.
Actually, there are a lot of concepts here. Let's start with the basics.
An interrupt is a means to request attention from the CPU: interrupt the current program flow, jump to an interrupt handler (ISR, Interrupt Service Routine), do some work (usually in the OS kernel or a device driver), and then return.
What are some typical uses for interrupts?
Hardware interrupts: A device requests attention from the CPU by issuing an interrupt request.
CPU Exceptions: If some abnormal CPU condition happens, such as a division by zero, a page fault, ... the CPU jumps to the corresponding interrupt handler so the OS can do whatever it has to do (send a signal to a process, load a page from swap and update the TLB/page table, ...).
Software interrupts: Since an interrupt ends up calling the OS kernel, a simple way to implement system calls is to use interrupts. But you don't need to: on x86 you could instead use a far call through a call gate, and on newer x86 there are the SYSCALL / SYSENTER instructions.
CPUs decide where to jump to looking at a table (exception vectors, interrupt vectors, IVT in x86 real mode, IDT in x86 protected mode, ...). Some CPUs have a single vector for hardware interrupts, another one for exceptions and so on, and the ISR has to do some work to identify the originator of the interrupt. Others have lots of vectors, and jump directly to very specific ISRs.
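For example, each entry in the 32-bit protected-mode IDT is an 8-byte gate descriptor. Laid out as a C struct (a sketch of the documented layout, not an OS API):

#include <stdint.h>

/* One 32-bit protected-mode IDT gate descriptor (8 bytes). */
struct idt_gate {
    uint16_t offset_low;   /* handler address, bits 0..15 */
    uint16_t selector;     /* code segment selector the handler runs in */
    uint8_t  zero;         /* reserved */
    uint8_t  type_attr;    /* gate type, DPL, present bit (0x8E = ring-0 interrupt gate) */
    uint16_t offset_high;  /* handler address, bits 16..31 */
} __attribute__((packed));

void set_gate(struct idt_gate* g, uint32_t handler, uint16_t sel, uint8_t attr)
{
    g->offset_low  = handler & 0xFFFF;
    g->selector    = sel;
    g->zero        = 0;
    g->type_attr   = attr;    /* e.g. 0x8E */
    g->offset_high = handler >> 16;
}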
x86 has 256 interrupt vectors. On original PCs, these were divided into several groups:
00-04 CPU exceptions, including NMI. With later CPUs (80186, 286, ...), this range expanded, overlapping with the following ranges.
08-0F These are hardware interrupts, usually referred to as IRQ0-7. The PC-AT added IRQ8-15 (at vectors 70h-77h)
10-1F BIOS calls. Conceptually, these can be considered system calls, since the BIOS is the part of DOS that depends on the concrete machine (that's how it was defined in CP/M).
20-2F DOS calls. Some of these are multiplexed, and offer multitude of functions. The main one is INT 21h, which offers most of DOS services.
30-FF The rest, for use by external drivers and user programs.