Linux not answered to the point yet - linux-device-driver

ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
int (*open) (struct inode *, struct file *);
Where does this struct file * point to? If we pass pointer variables of our own as the struct file * and struct inode * arguments, what do they point to? What is going on? I find the declarations in the file_operations struct and the definitions in the driver program. But in the driver program the struct file * and struct inode * arguments seem to come out of nowhere. Where do those argument values come from? Or can we pass whatever variable we like?
If so, how does that arbitrary variable serve the purpose?
I think the person who first raised the question needs an answer to the same thing.
Rather than making it complex, can someone explain it in simple terms?

read() and open() are userspace functions which operate on file descriptors. When an application calls them, the kernel looks up the struct file (and, for open(), the struct inode) behind the file descriptor or path, and then calls the read() and open() callbacks that the driver registered in its file_operations struct, with those pointers already filled in. The kernel sort of "redirects" the userspace call to the driver. The driver never creates these arguments itself; it only receives them.
I'd recommend reading about driver file operations in LDD3, Chapter 3.
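To make the "redirect" concrete, here is a minimal userspace sketch, not kernel code. The struct inode, struct file and struct file_operations below are tiny stand-ins invented for illustration (the real kernel types are much larger and read() takes a __user buffer), and fake_sys_open_and_read is a hypothetical name for the kernel side. The point it tries to show is that the driver only supplies callbacks, while the caller of those callbacks owns and fills in the arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins for the kernel types; the real ones are far larger. */
struct inode { int minor; };
struct file  { struct inode *f_inode; long f_pos; };

/* Table of callbacks, modelled loosely on struct file_operations. */
struct file_operations {
    int  (*open)(struct inode *, struct file *);
    long (*read)(struct file *, char *, size_t, long *);
};

/* "Driver" side: defines the callbacks, never creates their arguments. */
static int demo_open(struct inode *inode, struct file *filp)
{
    (void)filp;
    return inode->minor == 0 ? 0 : -1;   /* accept only minor number 0 */
}

static long demo_read(struct file *filp, char *buf, size_t len, long *ppos)
{
    static const char msg[] = "hello";
    size_t left;

    (void)filp;
    if (*ppos < 0 || (size_t)*ppos >= sizeof(msg) - 1)
        return 0;                        /* end of data */
    left = sizeof(msg) - 1 - (size_t)*ppos;
    if (len > left)
        len = left;
    memcpy(buf, msg + *ppos, len);
    *ppos += (long)len;                  /* advance the file position */
    return (long)len;
}

static const struct file_operations demo_fops = {
    .open = demo_open,
    .read = demo_read,
};

/* "Kernel" side (hypothetical): builds the objects, then calls through
 * the table that the driver registered. */
static long fake_sys_open_and_read(const struct file_operations *fops,
                                   char *buf, size_t len)
{
    struct inode ino = { .minor = 0 };
    struct file filp = { .f_inode = &ino, .f_pos = 0 };

    if (fops->open(&ino, &filp) != 0)
        return -1;
    return fops->read(&filp, buf, len, &filp.f_pos);
}
```

Calling fake_sys_open_and_read(&demo_fops, buf, sizeof(buf)) copies "hello" into buf. The parameter names inside demo_open and demo_read are arbitrary; what matters is that whoever calls through the table passes valid objects.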

share information between function(BPF/XDP)

Objective: If process id/name = xxx then drop the packet
So, I am a bit confused. As far as I know, you can't extract process information in XDP, but BPF tracing lets you trace it.
Here's my probable solution: use BPF hash maps to share information between the two functions. If process name == xx then XDP_DROP. (This may be wrong, but it is what I was trying.)
But I am confused about how to use BPF hash maps (BPF_HASH in bcc), even though I read the bcc documentation.
Example: from this hello function I can trace events:
struct data_t {
    u32 pid;
    u64 ts;
    char comm[TASK_COMM_LEN];
};
BPF_PERF_OUTPUT(events);
int hello(struct pt_regs *ctx) {
    struct data_t data = {};
    data.pid = bpf_get_current_pid_tgid();
    data.ts = bpf_ktime_get_ns();
    bpf_get_current_comm(&data.comm, sizeof(data.comm));
    events.perf_submit(ctx, &data, sizeof(data));
    return 0;
}
XDP function to drop the packet:
int udpfilter(struct xdp_md *ctx) {
    bpf_trace_printk("got a packet\n");
    //u32 cpu = bpf_get_smp_processor_id();
    //bpf_trace_printk("%s looking\n",cpu);
    //u32 pid = bpf_get_current_pid_tgid();
    return XDP_DROP;
}
Now how do I fetch the pid value and use it in the XDP function? And does this solution even make sense? Thanks for the help, really appreciated.
So, as you know eBPF programs can be loaded into the kernel at different locations. XDP programs are loaded just after the network driver and just before the network stack. At this point the kernel doesn't know for which process a packet might be since it will figure all of that out in the network stack.
The hello program you are showing is an example of a kprobe (kernel probe). It attaches to whatever kernel function you specify, but it is a tracing tool and can't drop or modify packets.
Also, some helper functions like bpf_get_current_pid_tgid are program-type dependent. bpf_get_current_pid_tgid only works in kprobe, uprobe and tracepoint programs (perf programs); it may actually also work in socket and cgroup programs. The issue is that there is no very clear list or overview of which helpers work where. These are two good but non-comprehensive links:
https://blogs.oracle.com/linux/post/bpf-in-depth-bpf-helper-functions
https://github.com/iovisor/bcc/blob/master/docs/kernel-versions.md#program-types
In the end it comes down to logic. The kernel can only give you access to data and actions it has access to itself. So if you want to do network-related things based on process IDs, you might need to use an eBPF program attached at a location where such info is available (keep in mind that this is obviously also slower).
So depending on what exactly you want to do, you have a few options:
Attach an eBPF program to a network socket (BPF_PROG_TYPE_SOCKET_FILTER) so you can filter packets at the socket level. This does require the program that creates the socket to attach the eBPF program to it.
Use a cgroup and a BPF_PROG_TYPE_CGROUP_SKB program to block packets. Since you attach the program to the cgroup, this doesn't require cooperation from the target program.
Use a TC program (BPF_PROG_TYPE_SCHED_CLS or BPF_PROG_TYPE_SCHED_ACT). At this level the packet is already parsed, but you still need to match it to a process.
An XDP program (BPF_PROG_TYPE_XDP) can still be used, but it requires you to parse all layers of the network packet (Ethernet, VLAN, IP, UDP/TCP) and manually extract the protocol, destination IP and destination port. Just like in the TC program, you then need to match these to a PID using a lookup table.
When going the XDP or TC route you need to create this lookup table yourself. As far as I know you can't access the kernel's own table via helper functions. A few approaches are:
Parsing the output of netstat -lpn (protocol, destination IP, destination port and PID) and storing the data in a map to be used by the program.
Getting the same data directly from /sys or /proc (I don't know where the data is stored exactly).
Recording which PIDs own which sockets at creation time (using a second kprobe/tracepoint program) and storing this data in a map shared by both the XDP/TC program and the trace program. (I'm not quite sure how to share maps between programs in BCC, but it is certainly possible when using libbpf.)
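The parse-then-lookup idea behind the XDP/TC route can be sketched in plain userspace C. This is not an eBPF program: struct flow_key, struct pid_entry, parse_packet and lookup_pid are hypothetical names made up for illustration, the linear search merely stands in for a BPF hash map lookup, and VLAN tags and IPv6 are ignored to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup key: what an XDP/TC program could extract per packet. */
struct flow_key {
    uint8_t  proto;     /* IP protocol number: 6 = TCP, 17 = UDP */
    uint32_t dst_ip;    /* destination IPv4 address, host byte order */
    uint16_t dst_port;  /* destination port, host byte order */
};

/* Hypothetical table entry, e.g. filled from `netstat -lpn` output. */
struct pid_entry {
    struct flow_key key;
    int pid;
};

/* Read big-endian (network order) integers from a byte buffer. */
static uint16_t rd16(const uint8_t *p) { return (uint16_t)((p[0] << 8) | p[1]); }
static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Parse Ethernet (no VLAN) + IPv4 + TCP/UDP. Returns 0 on success, -1
 * otherwise. Lengths are checked before each access, which is also what
 * the eBPF verifier insists on. */
static int parse_packet(const uint8_t *pkt, size_t len, struct flow_key *out)
{
    const uint8_t *ip;
    size_t ihl;

    if (len < 14 + 20)                  /* Ethernet + minimal IPv4 header */
        return -1;
    if (rd16(pkt + 12) != 0x0800)       /* EtherType must be IPv4 */
        return -1;
    ip = pkt + 14;
    if ((ip[0] >> 4) != 4)              /* IP version field */
        return -1;
    ihl = (size_t)(ip[0] & 0x0f) * 4;   /* IPv4 header length in bytes */
    if (ihl < 20 || len < 14 + ihl + 4)
        return -1;
    if (ip[9] != 6 && ip[9] != 17)      /* only TCP and UDP carry ports */
        return -1;
    out->proto    = ip[9];
    out->dst_ip   = rd32(ip + 16);
    out->dst_port = rd16(ip + ihl + 2); /* bytes 2..3 of the L4 header */
    return 0;
}

/* Linear search standing in for a map lookup; returns -1 if unknown. */
static int lookup_pid(const struct pid_entry *tbl, size_t n,
                      const struct flow_key *k)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].key.proto == k->proto &&
            tbl[i].key.dst_ip == k->dst_ip &&
            tbl[i].key.dst_port == k->dst_port)
            return tbl[i].pid;
    return -1;
}
```

In a real setup the table would live in a BPF map that userspace (or a second tracing program) keeps up to date, and the XDP program would return XDP_DROP when the looked-up PID matches the one to block.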

Behaviour of passing struct as a parameter to a CUDA kernel

I'm relatively new to CUDA programming, so I want to clarify the behaviour of a struct when I pass it into a kernel. I've defined the following struct to somewhat imitate the behavior of a 3D array that knows its own size:
struct protoarray {
size_t dim1;
size_t dim2;
size_t dim3;
float* data;
};
I create two variables of type protoarray, dynamically allocate space to data via malloc and cudaMalloc on the host and device side, and update dim1, dim2 and dim3 to reflect the size of array I want this struct to represent. I read in this thread that the struct should be passed via copy. So this is what I do in my kernel
__global__ void kernel(curandState_t *state, protoarray arr_device){
    const size_t dim1 = arr_device.dim1;
    const size_t dim2 = arr_device.dim2;
    for(size_t j(0); j < dim2; j++){
        for(size_t i(0); i < dim1; i++){
            // Do something
        }
    }
}
The struct is passed by copy, so all its contents are copied into shared memory of each block. This is where I'm getting bizarre behaviour, which I'm hoping you could help me with. Suppose I had set arr_device.dim1 = 2 on the host side. While debugging inside the kernel and setting a breakpoint at one of the for loops, checking the value of arr_device.dim1 yields something like 16776576, nowhere near large enough to cause overflow, but this value copies correctly into dim1 as 2, which means that the for loops execute as I intended them to. As a side question, is using size_t, which is essentially unsigned long long int, bad practice, seeing as GPUs are made of 32-bit cores?
Generally, how safe is it to pass struct and class into kernels as arguments; is it bad practice that should be avoided at all cost? I imagine that passing pointers to classes to kernels is difficult in case they contain members which point to dynamically allocated memory, and that they should be very lightweight if I want to pass them by value.
This is a partial answer, since without a proper program to look into, it is difficult/impossible to guess why you would see an invalid value in your arr_device.dim1.
The struct is passed by copy, so all its contents are copied into shared memory of each block.
Incorrect. Kernel arguments are stored in constant memory, which is device-global and not block-specific. They are not stored in shared memory (which is block-specific).
When a thread runs, it typically reads arguments from constant memory into registers (and again, not shared memory).
Generally, how safe is it to pass struct and class into kernels as arguments
My personal rule of thumb on this matter is: If the struct/class...
is trivially-copyable; and
all members of the struct/class are defined for both the host and the device side, or at least designed with GPU use in mind;
then it should be safe to pass to a kernel.
passing struct and class into kernels as arguments [ - ] is [it] bad practice that should be avoided at all cost?
No. But remember that most C++ libraries only provide host-side code and were not written with GPU use in mind. So I'd be wary of using non-trivial classes without a lot of scrutiny.
I imagine that passing pointers to classes to kernels is difficult in case they contain members which point to dynamically allocated memory
Yes, this can be problematic. However - if you used cuda::memory::managed::allocate(), cuda::memory::managed::make_unique() or cudaMallocManaged() - then this should "just work", i.e. the relevant memory pages will be fetched to the GPU or the CPU as necessary when accessed. See:
Unified Memory in CUDA for beginners
Beyond GPU Memory Limits with Unified Memory on Pascal
and that they should be very lightweight if I want to pass [objects to kernels] by value.
Yes, because each and every thread has to read each argument from constant memory before it can use that argument. And while constant memory allows this to happen relatively quickly, it's still a bunch of overhead that you want to minimize.
Also remember that you can't pass anything to kernels by (C++) reference; it's all "by-value" - the object itself or a pointer to it.
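The "by value, but the pointer member is still just a pointer" behaviour is the same as in ordinary C, so a host-only analogy may help. The sketch below is plain C, not CUDA: fill_by_value plays the role of the kernel, flat_index is a hypothetical helper, and the layout with i varying fastest is an assumption, since the question does not say how the data is laid out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct protoarray {
    size_t dim1, dim2, dim3;
    float *data;   /* in the CUDA case: a device pointer from cudaMalloc */
};

/* Flat index of element (i, j, k), with i varying fastest (assumed layout). */
static size_t flat_index(const struct protoarray *a,
                         size_t i, size_t j, size_t k)
{
    return i + a->dim1 * (j + a->dim2 * k);
}

/* Receives a copy of the struct, much like a kernel parameter would. */
static void fill_by_value(struct protoarray arr, float value)
{
    size_t n = arr.dim1 * arr.dim2 * arr.dim3;

    for (size_t idx = 0; idx < n; idx++)
        arr.data[idx] = value;  /* writes through the copied pointer */
    arr.dim1 = 0;               /* changes only the local copy */
}
```

After fill_by_value(a, 1.5f) the caller's a.dim1 is unchanged while the buffer behind a.data is filled. For a kernel this means the struct copy is cheap and safe, but data must point to memory the device can actually dereference.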

"nfds" parameter in select() in sockets programming

I never really got the idea behind this parameter, what is it good for? I also noticed this parameter is ignored in WinSock2, why is that?
Do Unix systems use this parameter or do they ignore it as well?
On Windows, fd_set is not a bitmask: Winsock defines it as a count plus an array of SOCKET handles, so select() knows exactly which sockets are in the set and doesn't need the nfds parameter for anything.
On other OS's, however, the fd_set struct is implemented to hold an array of bits (one bit per socket). For example, here is how it is declared (in sys/_types/_fd_def.h) under MacOS/X:
typedef struct fd_set {
    __int32_t fds_bits[__DARWIN_howmany(__DARWIN_FD_SETSIZE, __DARWIN_NFDBITS)];
} fd_set;
... and in order to do the right thing, the select() call will have to loop over the bits in the array to see what they contain. By supplying select() with the nfds parameter, we tell the select() implementation that it only needs to iterate over the first (nfds) bits of the array, rather than always having to iterate over the entire array on every call. This allows select() to be more efficient than it would otherwise be.
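As a rough illustration of how nfds is usually derived on POSIX systems, the sketch below builds a read set from a list of descriptors and passes the highest descriptor plus one. build_read_set and wait_any_readable are hypothetical helper names, not part of any API.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Build a read set from fds[0..n-1] and return the matching nfds
 * (highest descriptor + 1), or -1 if a descriptor cannot be stored. */
static int build_read_set(const int *fds, size_t n, fd_set *set)
{
    int max_fd = -1;

    FD_ZERO(set);
    for (size_t i = 0; i < n; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -1;          /* does not fit into the bit array */
        FD_SET(fds[i], set);
        if (fds[i] > max_fd)
            max_fd = fds[i];
    }
    return max_fd + 1;
}

/* Block until at least one descriptor is readable; returns select()'s
 * result, i.e. the number of ready descriptors, or -1 on error. */
static int wait_any_readable(const int *fds, size_t n, fd_set *set)
{
    int nfds = build_read_set(fds, n, set);

    if (nfds <= 0)
        return -1;
    return select(nfds, set, NULL, NULL, NULL);
}
```

With two pipes where only the second has data pending, wait_any_readable returns 1 and only that pipe's read end remains set in the fd_set.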

New to socket programming, questions regarding "select()"

Currently in my degree we're starting to work with sockets.
I have a couple of questions regarding polling for input from sockets
using the select() function.
int select(int nfds,
           fd_set *readfds,
           fd_set *writefds,
           fd_set *exceptfds,
           struct timeval *timeout);
We give select the "nfds" param, which would normally be the maximum socket number we would like to monitor. How can I watch only one specific socket instead of the range of 0 to nfds_val sockets?
What are the file descriptors objects that we use? what is their purpose,
and why can't we just point "select" to the relevant socket structure?
I've read over the forum regarding Blocking and Non-Blocking mode of select, but couldn't understand the meaning or uses of each, nor how to implement such, would be glad if someone could explain.
Last but not least (only for the time being :D ) - When binding a sockaddr_in to a socket, why does one need to cast to sockaddr * and not leave it as sockaddr_in *?
I mean, except for the fact that the bind function expects this kind of pointer type ;)
Would appreciate some of the experts answers here :)
Thank you guys and have a great week!
We give select the "nfds" param, which would normally be the maximum socket number we would like to monitor. How can I watch only one specific socket instead of the range of 0 to nfds_val sockets?
Edit. (sorry, the previous text here was wrong) Just provide your socket descriptor + 1. I'm pretty sure that doesn't mean the OS will check all the descriptors in the [0, 1, ..., descriptor] range; only descriptors you added to the fd_set are examined.
What are the file descriptors objects that we use? what is their purpose, and why can't we just point "select" to the relevant socket structure?
File descriptors are usually integer values given to the user by the OS. The OS uses descriptors to control physical and logical resources: one file descriptor means the OS has given you something file-like to control. Since Berkeley sockets have read and write operations defined, they are file-like, and socket objects are essentially plain file descriptors.
Answering "why can't we just point select to the relevant socket structure?": we actually can. What exactly to pass to select depends on the OS and language. In C you place your socket descriptor (most probably a plain int value) into an fd_set, which is then passed to select.
Edit.
A tiny example for Linux:
fd_set set;
FD_ZERO(&set);
FD_SET(socket_fd, &set);
// block until socket_fd is ready for reading
int result = select(socket_fd + 1, &set, NULL, NULL, NULL);
if (result == -1) report_error(errno);
Docs.
Windows has similar code.
I've read over the forum regarding Blocking and Non-Blocking mode of select, but couldn't understand the meaning or uses of each, nor how to implement such, would be glad if someone could explain.
A blocking operation makes your thread wait until it is done. That covers 99% of the functions you use. If there are sockets ready for some I/O, a blocking select will return immediately. If there are no such sockets, the thread will wait for them. A non-blocking select (one called with a zero timeout), in the latter case, won't wait and will return 0, meaning no descriptors are ready; -1 is reserved for errors.
As an example, try to implement single threaded server that is capable of working with multiple clients, including long operations like file transfer happening simultaneously. You definitely don't want to use blocking socket operations in this case.
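The difference between the two modes comes down to the timeout argument. Here is a small sketch, assuming a POSIX system; readable_within is a hypothetical helper name. A NULL timeout blocks indefinitely, while a zeroed struct timeval makes select() return at once.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Check fd for readability.
 *   timeout_ms <  0 : block until fd becomes readable
 *   timeout_ms >= 0 : wait at most that long (0 = return immediately)
 * Returns 1 if readable, 0 if the timeout expired, -1 on error. */
static int readable_within(int fd, int timeout_ms)
{
    fd_set set;
    struct timeval tv, *tvp = NULL;   /* NULL timeout means "block" */

    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;
    FD_ZERO(&set);
    FD_SET(fd, &set);
    if (timeout_ms >= 0) {
        tv.tv_sec  = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        tvp = &tv;
    }
    return select(fd + 1, &set, NULL, NULL, tvp);
}
```

A single-threaded server would typically call something like this with a short or zero timeout inside its main loop, so that one slow client cannot stall all the others.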
Last but not least (only for the time being :D ) - When binding a sockaddr_in to a socket, why does one need to cast to sockaddr * and not leave it as sockaddr_in *? I mean, except for the fact that the bind function expects this kind of pointer type ;)
Probably due to historical reasons, but I'm not sure. And there seems to be a fine answer on SO already.
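One way to see why the cast exists: bind() has to accept addresses of every family (IPv4, IPv6, Unix domain and so on), and C has no overloading, so the API takes a generic struct sockaddr * plus a length and inspects sa_family to learn what it really got. The sketch below mimics that pattern without touching the network; describe_addr and describe_ipv4_any are hypothetical functions written only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* A function in the style of bind(): it accepts any address family via
 * the generic pointer and dispatches on sa_family.
 * Returns 4 for IPv4, 6 for IPv6, 0 for other families, -1 if too short. */
static int describe_addr(const struct sockaddr *sa, socklen_t len)
{
    if (len < offsetof(struct sockaddr, sa_family) + sizeof(sa->sa_family))
        return -1;
    switch (sa->sa_family) {
    case AF_INET:  return len >= sizeof(struct sockaddr_in)  ? 4 : -1;
    case AF_INET6: return len >= sizeof(struct sockaddr_in6) ? 6 : -1;
    default:       return 0;
    }
}

/* Caller side: fill in the family-specific struct, then cast, exactly
 * as one does when calling bind(). */
static int describe_ipv4_any(unsigned short port)
{
    struct sockaddr_in sin;

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(port);
    sin.sin_addr.s_addr = htonl(INADDR_ANY);
    return describe_addr((const struct sockaddr *)&sin, sizeof(sin));
}
```

The cast does not convert anything; it only satisfies the declared parameter type. All the sockaddr_* structs keep the family field in a compatible position, which is what lets the callee read it before deciding how to interpret the rest.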

How to use the select() function in socket programming?

The prototype is:
int select(int nfds,
           fd_set *readfds,
           fd_set *writefds,
           fd_set *exceptfds,
           struct timeval *timeout);
I've been struggling to understand this function for quite some time. My question is: if it checks all the file descriptors from 0 to nfds-1 and modifies readfds, writefds and exceptfds on return, why do I need to use FD_SET to add file descriptors to the set at the beginning? It will check all the file descriptors anyway, or not?
It won't check from 0 to nfds-1. The first argument just provides an upper bound on how large, numerically, the file descriptors used are. This is because the set itself might be represented as a bitvector, without a way to know how many bits are actually used. Specifying this as a separate argument helps select() avoid checking file descriptors that are not in use.
Also, a descriptor that is not in e.g. the read set when you call select() is not being checked at all, so it cannot appear in the set when the call returns, either.
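This can be checked with a small experiment, sketched here under the assumption of a POSIX system. other_reported is a hypothetical helper: it puts only one descriptor into the read set, lets nfds cover both, and then reports whether the descriptor that was left out shows up after select() returns.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Ask select() about `watched` only (zero timeout), then report whether
 * `other` appears in the resulting set. Returns 1 if it does, 0 if it
 * does not, -1 on error. Both descriptors must be below FD_SETSIZE. */
static int other_reported(int watched, int other)
{
    fd_set rset;
    struct timeval tv = { 0, 0 };
    int nfds = (watched > other ? watched : other) + 1;

    FD_ZERO(&rset);
    FD_SET(watched, &rset);          /* `other` is deliberately left out */
    if (select(nfds, &rset, NULL, NULL, &tv) < 0)
        return -1;
    return FD_ISSET(other, &rset) ? 1 : 0;
}
```

Even when both descriptors have data pending and nfds is large enough to cover both, the one that was never added with FD_SET is not reported. Because select() overwrites the sets on return, they also have to be rebuilt with FD_ZERO and FD_SET before every call.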
I once had the same doubt as you. You can look at the following question and answers:
Query on Select System Call