What are the main differences between eBPF and LTTng?

I read that LTTng uses instrumentation: “Linux Trace Toolkit Next Generation (LTTng) is a tracer able to extract information from the Linux kernel, user space libraries and from programs. It is based on instrumentation of the executables”
https://lttng.org/files/papers/desnoyers-codebreakers.pdf
Does this mean you have to rebuild the kernel or is this also about instrumentation when working with kprobes?

I am more familiar with eBPF than I am with LTTng, based on skimming the LTTng docs I can see the following differences:
LTTng requires loading a kernel module, whereas eBPF is a native part of the Linux kernel. The kernel gives a number of guarantees with regard to eBPF programs, mostly to protect the user from panicking their kernel or getting it stuck in infinite loops, etc. You don't have that protection with kernel modules.
LTTng seems to be specifically targeted towards tracing. eBPF, on the other hand, has a much broader scope, which includes tracing, networking, security, infrared drivers, and potentially even CPU scheduling.

Related

Build an RT-application using PREEMPT_RT

I would like to write real-time Linux programs using the PREEMPT_RT patch. I found the official wiki (https://rt.wiki.kernel.org/index.php/HOWTO:_Build_an_RT-application). There are some code examples, but I would like an explanation of the available RT functions.
Thank you,
It is important to underline that PREEMPT_RT is a patch that changes the internal code of the Linux kernel to reduce the maximum latency experienced by a user-level process. This is done by, e.g., changing spinlocks to real-time preemptible mutexes, using threaded interrupts (i.e., hardware interrupt handlers run in separate kernel threads), and so on. Therefore, it doesn't provide any API for user-level programming; you still rely on the standard libc and system call primitives. Just patch, configure, and reinstall the kernel (or, alternatively, install a pre-built PREEMPT_RT kernel).
You can still, of course, follow good real-time programming practice to avoid delays and contention. The page you mention concerns how to configure the kernel and how to write your code to get the full benefit of the patch.
If you are looking for specific real-time APIs, you may want to have a look at Xenomai 3.0.1, which provides a specific API for running your user-level process on top of either the standard Linux kernel or the Xenomai hypervisor (a layer below the Linux kernel).

Is it possible or a common technique to sandbox kernel mode drivers?

I see that kernel mode drivers are risky as they run in privileged mode, but are there any monolithic kernel's that do any form of driver/loadable module sandboxing or is this really the domain of microkernels?
Yes, there are platforms with "monolithic" (for some definition of monolithic) kernels that do driver sandboxing for some drivers. Windows does this in more recent versions with the User-Mode Driver Framework. There are two reasons for doing this:
It allows isolation. A failure in a user mode driver need not bring down the whole system. This is important for drivers for hardware which is not considered system critical. An example of this might be your printer, or your soundcard. In those cases if the driver fails, it can often simply be restarted and the user often won't even notice this happened.
It makes writing and debugging drivers much easier. Driver writers can use regular user mode libraries and regular user mode debuggers, without having to worry about things like IRQL and DPCs.
The other poster said there is no sense to this; hopefully the above explains why you might want to do it. Additionally, the other poster said there is a performance concern. Again, this depends on the type of the driver. In Windows this is typically used for USB drivers. In the case of USB drivers, the driver is not talking directly to the hardware anyway, regardless of the mode the driver operates in: it is talking to another driver which talks to the USB host controller, so there is much less overhead to user mode communication than there would be if you were writing a driver that had to bit-bang IO ports from user mode. Also, you would avoid writing user mode drivers for hardware which is performance critical. In the case of printers and audio hardware, the user mode transitions are so much faster than the hardware itself that the performance cost of the one or two additional context switches is probably irrelevant.
So sometimes it is worth doing simply because the additional robustness and ease of development make the small and often unnoticeable performance reduction worthwhile.
There is no sense in this sandboxing; the OS fully trusts driver code. These drivers basically become part of the kernel. You can't fail over after a file system crash, or after a crash of any major kernel subsystem. Basically it's bad (imagine what you could do after the storage driver for the boot disk crashes), because it can lead to data loss, for example.
And second: sandboxing leads to a performance hit across all kernel code.

BSP vs Device-Drivers

While I understand each by itself (or maybe not), I'm far from understanding the practical differences between the two.
Per my understanding, a BSP is a package of drivers and configuration settings that allows a kernel image to boot up a board (and is part of it).
The individual device driver, operates on a specific component (hardware), interfacing on one side with the core kernel and on the other side with the device itself.
Looking at the Linux kernel, it is unclear to me where the BSP role starts and the device driver role ends. Specifically, I am used to seeing one BSP per board per image; however, the generic Linux kernel may be loaded on any board of an architecture family with the same image (clearly, different families need different images: x86, amd64, ARM, etc.), where the specific board and peripheral drivers are loaded as needed from the initrd.
Is there a BSP for the common Linux kernel distributions? Or is a BSP relevant only for special-case boards?
Is this behavior similar on other kernels? VxWorks?
And lastly, is it common to merge different BSPs in order to generate a single image that fits different boards?
I see the relationship between BSPs and devices drivers as "has-a". Board support packages include device drivers.
The difference between BSPs and kernels isn't easy to distinguish. A kernel translates instructions to the hardware. Kernels are often written for particular families of hardware, so they're not as portable or generic as they seem. It amounts to different permutations of the code for each architecture family.
The BSP acts as sort of the inverse: it provides the tools & instructions to work with that board's specific set of hardware. In specific, controlled situations, the kernel could do this work. But the BSP enables any compatible kernel/OS/application stack to use that board, by following its configuration instructions.
If you just need access to CPU cycles and memory, and maybe a few protocols (USB, Ethernet, and a couple of video types), a kernel with wide architecture support is fantastic, and there was a time when the breadth of that hardware abstraction was valued above all. But now consider that a board may have a suite of sensors (accelerometer, magnetometer, gyroscope, light, proximity, atmospheric pressure, etc.), telephony, multiple CPUs, multiple GPUs, and so on. A kernel can be written to support VGA, DVI, HDMI, DisplayPort, and several permutations of CPU/GPU combinations, if and when someone uses those particular hardware packages, but it's not practical to write support for all theoretical configurations, compared to using a BSP that's built for a specific board. And even then, that would be for one kernel; the board is capable of supporting Linux, Windows, Android, Symbian, whatever.
That's why efforts like Yocto exist, to further decouple kernel and hardware. BSPs make hardware sets extensible beyond a kernel, OS, and application stack or two, while kernels make a particular OS or application stack portable over multiple hardware architectures.
Based on my experience, a BSP has a much larger scope. It includes the bootloader, rootfs, kernel, drivers, etc., which means having a BSP makes your board capable of booting itself up. Drivers make devices work and are just one part of the BSP.
Drivers are not equal to a BSP.
Today things are modular to increase reusability, and software development for embedded systems normally breaks down into three layers:
Kernel (which contains task handling, scheduling, and memory management)
Stack (which is the upper layer on the device drivers and provides protocol implementations for I²C, SPI, Ethernet, SDIO, serial, file system, networking, etc.)
BSP = device drivers (which provide access to the controller registers on the hardware, such as the registers of I²C, SDIO, SPI, the Ethernet MAC, and the UART (serial), plus interrupt handling (ISRs)).
Board support package (device driver) is a software layer which changes with every board, keeping the other two software layers unchanged.
There is a conceptual link between board support packages and a HAL (Hardware Abstraction Layer) in the sense that the device drivers / kernel modules perform the hardware abstraction and the board support package provides an interface to the device drivers or is the hardware abstraction layer itself.
So basically a BSP has functionality similar to the BIOS in the DOS era.
Additionally, the BSP is supposed to perform the following operations:
Initialize the processor
Initialize the bus
Initialize the interrupt controller
Initialize the clock
Initialize the RAM settings
Configure the segments
Load and run bootloader from flash
From: Board support package (Wikipedia)
The BIOS in modern PCs initializes and tests the system hardware
components, and loads a boot loader from a mass memory device which
then initializes an operating system. In the era of DOS, the BIOS
provided a hardware abstraction layer for the keyboard, display, and
other input/output (I/O) devices [device drivers] that standardized an interface to
application programs and the operating system. More recent operating
systems do not use the BIOS after loading, instead accessing the
hardware components directly.
Source: BIOS (Wikipedia)
Another aspect is the usage of device trees in BSPs, the device tree is a unifying or standardizing concept to describe the hardware of a machine:
U-boot boot loader and getting ready to ship
Doug Abbott, in Linux for Embedded and Real-Time Applications (Fourth
Edition), 2018
Device Trees
One of the biggest problems with porting an operating system such as
Linux to a new platform is describing the hardware. That is because
the hardware description is scattered over perhaps several dozen or so
device drivers, the kernel, and the boot loader, just to name a few.
The ultimate result is that these various pieces of software become
unique to each platform, the number of configuration options grows,
and every board requires a unique kernel image.
There have been a number of approaches to addressing this problem. The
notion of a “board support package” or BSP attempts to gather all of
the hardware-dependent code in a few files in one place. It could be
argued that the entire arch/ subtree of the Linux kernel source tree
is a gigantic board support package.
Take a look at the arch/arm/ subtree of the kernel. In there you will
find a large number of directories of the form mach-* and plat-*,
presumably short for “machine” and “platform,” respectively. Most of
the files in these directories provide configuration information for a
specific implementation of the ARM architecture. And of course, each
implementation describes its configuration differently.
Would not it be nice to have a single language that could be used to
unambiguously describe the hardware of a computer system? That is the
premise, and promise, of device trees.
The peripheral devices in a system can be characterized along a number
of dimensions. There are, for example, character vs block devices.
There are memory mapped devices, and those that connect through an
external bus such as I2C or USB. Then there are platform devices and
discoverable devices.
Discoverable devices are those that live on external busses, such as
PCI and USB, that can tell the system what they are and how they are
configured. That is, they can be “discovered” by the kernel. Having
identified a device, it is a fairly simple matter to load the
corresponding driver, which then interrogates the device to determine
its precise configuration.
Platform devices, on the other hand, lack any mechanism to identify
themselves. System on Chip (SoC) implementations, such as the Sitara,
are rife with these platform devices—system clocks, interrupt
controllers, GPIO, serial ports, to name a few. The device tree
mechanism is particularly useful for managing platform devices.
The device tree concept evolved in the PowerPC branch of the kernel,
and that is where it seems to be used the most. In fact, it is now a
requirement that all PowerPC platforms pass a device tree to the
kernel at boot time. The text representation of a device tree is a
file with the extension .dts. These .dts files are typically found in
the kernel source tree at arch/$ARCH/boot/dts.
A device tree is a hierarchical data structure that describes the
collection of devices and interconnecting busses of a computer system.
It is organized as nodes that begin at a root represented by “/,” just
like the root file system. Every node has a name and consists of
“properties” that are name-value pairs. It may also contain “child”
nodes.
Listing 15.1 is a sample device tree taken from the devicetree.org
website. It does nothing beyond illustrating the structure. Here we
have two nodes named node1 and node2. node1 has two child nodes, and
node2 has one child. Properties are represented by name=value. Values
can be strings, lists of strings, one or more numeric values enclosed
by square brackets, or one or more “cells” enclosed in angle brackets.
The value can also be empty if the property conveys a Boolean value by
its presence or absence.
Source: Board Support Package (ScienceDirect)
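For reference, the skeleton that listing describes can be reconstructed from the description above; the node and property names here are illustrative, not the exact published listing:

```dts
/ {
    node1 {
        a-string-property = "A string";
        a-string-list-property = "first string", "second string";
        a-byte-data-property = [01 23 34 56];
        child-node1 {
            a-cell-property = <1 2 3>;
        };
        child-node2 {
            an-empty-boolean-property;
        };
    };
    node2 {
        child-node1 {
        };
    };
};
```

It shows the forms mentioned: string and string-list values, byte data in square brackets, cells in angle brackets, and an empty property whose mere presence conveys a Boolean.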
Via device tree overlays, kernel modules can be loaded at boot time; e.g., on a Raspberry Pi, adding dtoverlay=lirc-rpi to /boot/config.txt loads the lirc-rpi kernel module/device driver:
A future default config.txt may contain a section like this:
# Uncomment some or all of these to enable the optional hardware interfaces
#dtparam=i2c_arm=on
#dtparam=i2s=on
#dtparam=spi=on
If you have an overlay that defines some parameters, they can be
specified either on subsequent lines like this:
dtoverlay=lirc-rpi
dtparam=gpio_out_pin=16
dtparam=gpio_in_pin=17
dtparam=gpio_in_pull=down
Source: Configuration (Raspberry Pi Documentation)
When building BSPs with Yocto, all the hardware information that is scattered over device drivers, the kernel, the boot loader, etc., is gathered in one place. Here is the developer's guide on how this can be done in Yocto: Yocto Project Board Support Package Developer's Guide
[Yocto documentation]... states that the BSP “…is concerned with the
hardware-specific components only. At the end-distribution point, you
can ship the BSP combined with a build system and other tools.
However, it is important to maintain the distinction that these are
separate components that happen to be combined in certain end
products.”
Source: Board Support Package: what is it?
A board support package includes everything that is needed to use the board by an application. These include device drivers for the devices on the board and utility software for application programmers. A windowing environment is also available on multi-media boards. System engineers can further add extensions to the board. Some applications require reimplementing some part of the BSP for enhancements. Here the BSP plays a role of a reference implementation or a starting point for such requirements.
The confusion lies in the business model. The reference or development board is not an end/consumer product like a mobile device. It plays an important role to design and develop a product like iPhone or Samsung Galaxy.
A generic BSP will lack optimization in most cases, so you can only expect a generic BSP for entry-level boards, or where the optimization is left for you to do. In the case of cheap boards the BSP is quite generic, because the producer puts less investment into it.
Don't stress too much over the terms kernel and user space, as there are also microkernels, where the drivers are part of user space! Or think of a low-power board which only runs one piece of code without any kernel at all. So it boils down to: software that supports the board in doing its job.
A driver is a program that describes the behavior of a device to the kernel. The device may be a USB device, a camera, Bluetooth, or anything else.
Based on the mode of operation, we classify drivers into three types: char, block, and network. But a driver only gives access to its own device; it configures just that device, not the memory or the CPU speed, and it gives no instructions to the processor or controller itself - it merely runs on that processor or controller. So who brings up the microcontroller, who defines its functionality, who provides its starting point and its instructions? That is where the BSP comes in.
The BSP is the board support package that enables the bootloader. It defines the behavior of the system.
Consider two scenarios.
First, I have a PC, and the PC has a USB connector; everything is already in place.
Second, I have a bare board, and the board has a USB port. The board should talk to USB. What can I do?
In the first case, I have a PC with an OS, so I do not need to think about the behavior of the system; I just enable the device's behavior through the OS.
In the second case, the board means the processor with all its peripherals. There is no OS, so we need to implement the behavior of the device ourselves.

How does the NX flag work?

Could you please explain what the NX flag is and how it works (please be technical)?
It marks a memory page as non-executable in the page tables of the virtual memory system (and hence in the TLB, the cache the CPU uses for resolving virtual memory mappings). If program code is about to be executed from such a page, the CPU faults and transfers control to the operating system for error handling.
Programs normally have their binary code and static data in a read-only memory section, and if they ever try to write there, the CPU will fault and the operating system will normally kill the application (this is known as a segmentation fault or access violation).
For security reasons, the read/write data memory of a program is usually NX-protected by default. This prevents an attacker from supplying malicious code to an application as data, making the application write it to its data area, and then having that code executed somehow, usually by exploiting a buffer overflow/underflow vulnerability in the application to overwrite the return address of a function on the stack with the location of the malicious code in the data area.
Some legitimate applications (most notably high-performance emulators and JIT compilers) also need to execute their data, as they compile code at runtime, but they specifically allocate memory without the NX flag set for that purpose.
From Wikipedia
The NX bit, which stands for No
eXecute, is a technology used in CPUs
to segregate areas of memory for use
by either storage of processor
instructions (or code) or for storage
of data, a feature normally only found
in Harvard architecture processors.
However, the NX bit is being
increasingly used in conventional von
Neumann architecture processors, for
security reasons.
An operating system with support for
the NX bit may mark certain areas of
memory as non-executable. The
processor will then refuse to execute
any code residing in these areas of
memory. The general technique, known
as executable space protection, is
used to prevent certain types of
malicious software from taking over
computers by inserting their code into
another program's data storage area
and running their own code from within
this section; this is known as a
buffer overflow attack.
Have a look at 'DEP' (Data Execution Prevention) on Wikipedia, which uses the NX bit. As for supplying the technical answer, sorry, I do not know enough about this, but to quote:
Data Execution Prevention (DEP) is a security feature included in modern
Microsoft Windows operating systems that is intended to prevent an
application or service from executing code from a non-executable memory region.
....
DEP was introduced in Windows XP Service Pack 2 and is included in Windows XP
Tablet PC Edition 2005, Windows Server 2003 Service Pack 1 and later, Windows
Vista, and Windows Server 2008, and all newer versions of Windows.
...
Hardware-enforced DEP enables the NX bit on compatible CPUs, through the
automatic use of PAE kernel in 32-bit Windows and the native support on 64-bit
kernels.
Windows Vista DEP works by marking certain parts of memory as being intended to
hold only data, which the NX or XD bit enabled processor then understands as
non-executable.
This helps prevent buffer overflow attacks from succeeding. In Windows Vista,
the DEP status for a process, that is, whether DEP is enabled or disabled for a
particular process can be viewed on the Processes tab in the Windows Task
Manager.
See also here from the MSDN's knowledge base about DEP. There is a very detailed explanation here on how this works.
Hope this helps,
Best regards,
Tom.

Why does syscall need to switch into kernel mode?

I'm studying for my operating systems final and was wondering if someone could tell me why the OS needs to switch into kernel mode for syscalls?
A syscall is used specifically to perform an operation in kernel mode, since normal user code is not allowed to do this for security reasons.
For example, if you want to allocate memory, the operating system is privileged to do it (since it manages the page tables and is allowed to access the memory of other processes), but you as a user program should not be allowed to peek at or ruin the memory of other processes.
It's a way of sandboxing you. So you send a syscall requesting the operating system to allocate memory, and that happens at the kernel level.
Edit: I see now that the Wikipedia article is surprisingly useful on this
Since this is tagged "homework", I won't just give the answer away but will provide a hint:
The kernel is responsible for accessing the hardware of the computer and ensuring that applications don't step on one another. What would happen if any application could access a hardware device (say, the hard drive) without the cooperation of the kernel?