Objective: If process id/name = xxx then drop the packet
So, I am a bit confused. As far as I know you can't extract process information from XDP, but BPF tracing tools let you trace it.
Here's my probable solution: use BPF hash maps to share information between the two functions. If process name == xx, then XDP_DROP. (This may be wrong, but it is what I was trying.)
But I am confused about how to use BPF hash maps, even though I have read the bcc documentation.
Example: From this hello function I can trace events
struct data_t {
u32 pid;
u64 ts;
char comm[TASK_COMM_LEN];
};
BPF_PERF_OUTPUT(events);
int hello(struct pt_regs *ctx) {
struct data_t data = {};
data.pid = bpf_get_current_pid_tgid();
data.ts = bpf_ktime_get_ns();
bpf_get_current_comm(&data.comm, sizeof(data.comm));
events.perf_submit(ctx, &data, sizeof(data));
return 0;
}
XDP function to drop a packet:
int udpfilter(struct xdp_md *ctx) {
bpf_trace_printk("got a packet\n");
//u32 cpu = bpf_get_smp_processor_id();
//bpf_trace_printk("%s looking\n",cpu);
//u32 pid = bpf_get_current_pid_tgid();
return XDP_DROP;
}
Now how do I fetch the PID value and use it in the XDP function? And does this solution even make any sense? Thanks for the help, really appreciated.
So, as you know, eBPF programs can be attached in the kernel at different locations. XDP programs run just after the network driver and just before the network stack. At that point the kernel doesn't yet know which process a packet is for, since it only figures that out later in the network stack.
The hello program you are showing is an example of a kprobe (kernel probe). It attaches to whatever kernel function you specify, but it is a tracing tool; it can't change what happens to a packet.
Also, some helper functions like bpf_get_current_pid_tgid are program-type dependent. bpf_get_current_pid_tgid only works in kprobe, uprobe and tracepoint programs (perf programs); it may actually also work in socket and cgroup programs. The issue is that there is no very clear list or overview of which helpers work where; these are two good but non-comprehensive links:
https://blogs.oracle.com/linux/post/bpf-in-depth-bpf-helper-functions
https://github.com/iovisor/bcc/blob/master/docs/kernel-versions.md#program-types
In the end it comes down to logic. The kernel can only give you access to data and actions it has access to itself. So if you want to do network-related things based on process IDs, you might need to use an eBPF program attached at a location where such info is available (keep in mind that this is obviously also slower).
So depending on what exactly you want to do you have a few options:
Attach an eBPF program to a network socket (BPF_PROG_TYPE_SOCKET_FILTER) so you can filter packets at the socket level. This does require the program that creates the socket to attach the eBPF program to it.
Use a cgroup and a BPF_PROG_TYPE_CGROUP_SKB program to block packets. Since you attach the program to the cgroup, this doesn't require cooperation from the process.
Use a TC program (BPF_PROG_TYPE_SCHED_ACT); at this level the packet is already parsed, but you still need to match it to a process.
Use an XDP program (BPF_PROG_TYPE_XDP); this requires you to parse all layers of the network packet yourself (Ethernet, VLAN, IP, UDP/TCP) and manually extract the protocol, destination IP and destination port. Just like in the TC program, you then need to match the packet to a PID using a lookup table (a sketch of such an XDP program follows below).
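To make the XDP option concrete, here is a rough, untested sketch in BCC-style restricted C of that parsing plus the lookup. The port_to_pid map (UDP destination port -> owning PID) is an assumption made up for the example, and VLAN tags, IPv6 and IP options are not handled:

#include <uapi/linux/bpf.h>
#include <linux/in.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/udp.h>

// Hypothetical lookup table: UDP destination port -> PID that owns the socket.
// It has to be filled in from user space or by a tracing program (see below).
BPF_HASH(port_to_pid, u16, u32);

int udpfilter(struct xdp_md *ctx) {
    void *data = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;

    struct ethhdr *eth = data;                       // Ethernet header
    if ((void *)(eth + 1) > data_end || eth->h_proto != htons(ETH_P_IP))
        return XDP_PASS;

    struct iphdr *ip = (void *)(eth + 1);            // IPv4 header
    if ((void *)(ip + 1) > data_end || ip->protocol != IPPROTO_UDP)
        return XDP_PASS;

    struct udphdr *udp = (void *)(ip + 1);           // UDP header (assumes no IP options)
    if ((void *)(udp + 1) > data_end)
        return XDP_PASS;

    u16 dport = ntohs(udp->dest);
    u32 *pid = port_to_pid.lookup(&dport);           // do we know the owner of this port?
    if (pid)
        return XDP_DROP;                             // owned by a process we want to block
    return XDP_PASS;
}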
When going the XDP or TC route, you need to build this lookup table yourself; as far as I know you can't access the kernel's own socket tables via helper functions. A few approaches are:
Parsing the output of netstat -lpn (protocol, destination IP, destination port and PID) and setting the data in a map used by the program.
Getting the same data directly from /sys or /proc (I don't know where exactly the data is stored).
Recording which PIDs own which sockets at socket creation (using a second program, a kprobe or tracepoint) and putting this data in a map shared by both the XDP/TC program and the trace program; a sketch of this approach follows below. (I am not quite sure how to share maps between separately loaded programs in BCC, but it is certainly possible when using libbpf.)
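If both BPF functions are defined in the same BCC program source, they automatically share any map declared in it, which sidesteps that problem. A rough, untested sketch of the tracing side, hooking inet_bind() with a kprobe, could look like this (the choice of hook and the field handling are assumptions made for illustration):

#include <uapi/linux/ptrace.h>
#include <linux/socket.h>
#include <linux/in.h>
#include <net/sock.h>

// The same port_to_pid hash as in the XDP sketch above
// (declare it only once if both functions live in the same source).
BPF_HASH(port_to_pid, u16, u32);

// inet_bind() runs when a process binds an IPv4 socket, so we can record
// "this port now belongs to this PID" here.
int kprobe__inet_bind(struct pt_regs *ctx, struct socket *sock,
                      struct sockaddr *uaddr, int addr_len) {
    struct sockaddr_in addr = {};
    bpf_probe_read(&addr, sizeof(addr), uaddr);
    if (addr.sin_family != AF_INET)
        return 0;

    u16 port = ntohs(addr.sin_port);
    u32 pid = bpf_get_current_pid_tgid() >> 32;   // upper 32 bits = tgid (process PID)
    port_to_pid.update(&port, &pid);
    return 0;
}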
Related
Is it possible to share an eBPF map between two network interfaces?
I want to write an XDP program and hook it on two devices, namely eth0 and eth1. The implementation requires that they both use the same map.
Is it possible to load the same program, hook it at both eth0 and eth1, and use the same map?
Thank you all!
Yes, this is entirely possible. An eBPF map is not attached to an interface; it is created in the kernel and then referenced by one or several of:
A file descriptor, from the application that created the map,
An eBPF program that uses the map,
A reference in another map of a dedicated type (array-of-maps, hash-of-maps),
A pinned path in the eBPF virtual file system.
eBPF maps can be shared between consecutive program runs, between kernel and user space, or - as for your use case - between different eBPF programs, no matter what interfaces they are attached to. Note that a given program could even be detached and later re-attached to another interface, anyway.
Reusing a map for several programs is done by pointing to the same map when you load your programs: when you load one, you get a handle to the map (its id or its pinned path, for example), obtain a file descriptor to the map from that handle, and place this file descriptor in your eBPF bytecode before loading it. The kernel then translates the file descriptor into the relevant memory address.
What happens most of the time in practice is that this “relocation” step (placing the file descriptor in the bytecode) is handled for you by the framework you are using to load your programs, such as libbpf or bcc tools. For example, libbpf has a bpf_map__reuse_fd(struct bpf_map *map, int fd) function to explicitly reuse a given file descriptor for a specific map used by a program, after parsing an object file to extract the bytecode but before loading it.
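As a rough illustration only (the object file name, map name and pin path are made up, and it assumes a recent libbpf where these calls return NULL or a negative value on failure), reusing an already pinned map when loading a second program could look like this:

#include <bpf/bpf.h>
#include <bpf/libbpf.h>

int load_with_shared_map(void)
{
    /* "filter.o" and the map name/pin path are hypothetical examples. */
    struct bpf_object *obj = bpf_object__open_file("filter.o", NULL);
    if (!obj)
        return -1;

    /* File descriptor of the map the first program already created and pinned. */
    int map_fd = bpf_obj_get("/sys/fs/bpf/port_to_pid");
    if (map_fd < 0)
        return -1;

    /* Tell libbpf to reuse that map instead of creating a fresh one. */
    struct bpf_map *map = bpf_object__find_map_by_name(obj, "port_to_pid");
    if (!map || bpf_map__reuse_fd(map, map_fd) < 0)
        return -1;

    /* The bytecode is now relocated against the shared map and can be loaded. */
    return bpf_object__load(obj);
}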
I want to write to the holding registers of an actuator of mine, but I've only recently started using pymodbus and I'm a little uncertain about some of its commands. As far as I understand, the primary write command looks like this:
write_register(address, value, **kwargs)
But I'm uncertain about how I "define" which kind of register I write to. Unlike other Modbus libraries I've used, in pymodbus I can't specify which kind of register I write to. As far as I understand Modbus, there are coils, input registers and holding registers. I've read that you can't write to input registers, but how can I be certain I write to holding registers? If I pass 1 as the address to write_register, will that always be the holding register with that address?
Thank you in advance.
You need to know a little bit more about your actuator. Do you have a datasheet for it? If so, you need to find the register map, which gives the addresses of the registers where the actuator's information is stored. Here is an example of this kind of register map: SIMEAS Pxxx Com-Modbus Datasheet.
Be careful: sometimes you need to add or subtract 1 from your register address, because that is how the vendor implemented it. Read your datasheet carefully.
So if you need to write something to a particular register, you need the correct address of the register, the value you want to write in the correct format, and a ModbusClient instance from the library.
You should have something like this :
client = ModbusClient(host='YOUR_IP_ADDRESS', port=xxx)
client.connect()
# 'unit' is the Modbus slave/unit id; write_register(address, value, **kwargs)
# only accepts it as a keyword argument (it is called 'slave' in pymodbus 3.x)
client.write_register(REGISTER_ADDRESS, PAYLOAD, unit=UNIT)
Hope this will help a little. You can also go further by reading the Pymodbus documentation.
So BPF programs are kernel entities, as they run within kernel space. On the other hand, Linux namespaces (aka containers) provide application-level isolation, in which case they all share the host's kernel, kernel modules, etc.
So I guess it doesn't make sense to load a bpf program per container, as it will become visible on the host as well?
Therefore I would guess that bpf programs would load on the host and monitor/mangle/etc. packets to/from namespaces. Given that struct sock has the information about namespace id, I think only certain types of bpf programs would be able to do that?
So I guess it doesn't make sense to load a bpf program per container, as it will become visible on the host as well?
If you mean loading a BPF program from a container, then yes it would be visible on the host as well (and you would need a privileged container in order to do that).
Given that struct sock has the information about namespace id, I think only certain types of bpf programs would be able to do that?
I couldn't find any BPF program type that has direct access to struct sock. BPF programs of type sockops have access to struct bpf_sock, but it contains little actual information.
You could use BPF programs of type cgroup/skb though. Those are attached to cgroups and can act on both ingress and egress packets. They receive a struct __sk_buff object as argument, a mirror of the sk_buff for the packet received/sent. They can only use a few helpers (in addition to the common base), and don't seem to have write access to packets.
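For illustration, a minimal libbpf-style cgroup/skb egress program could look like the sketch below (the program and section names are just examples); returning 1 lets the packet through, returning 0 drops it:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Bare-bones cgroup/skb egress program: real filtering logic would
 * inspect fields of the __sk_buff mirror here. */
SEC("cgroup_skb/egress")
int egress_filter(struct __sk_buff *skb)
{
    return 1;   /* 1 = allow the packet, 0 = drop it */
}

char LICENSE[] SEC("license") = "GPL";

Once pinned, it can be attached with something like bpftool cgroup attach /sys/fs/cgroup/<your-group> egress pinned /sys/fs/bpf/egress_filter.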
kprobe BPF programs have access to any kernel function kprobes can be attached to. So you could retrieve the namespace information by probing the appropriate function and then send it to your monitor/mangle/etc. program through a bpf map. Not the simplest option though.
Almost every definition of socket that I've seen relates it very closely to the term endpoint:
wikipedia:
A network socket is an internal endpoint for sending or receiving data
at a single node in a computer network. Concretely, it is a
representation of this endpoint in networking software
This answer:
a socket is an endpoint in a (bidirectional) communication
Oracle's definition:
A socket is one endpoint of a two-way communication link between two
programs running on the network
Even stackoverflow's definition of the tag 'sockets' is:
An endpoint of a bidirectional inter-process communication flow
This other answer goes a bit further:
A TCP socket is an endpoint instance
Although I don't understand what "instance" means in this case. If an endpoint is, according to this answer, a URL, I don't see how that can be instantiated.
"Endpoint" is a general term, including pipes, interfaces, nodes and such, while "socket" is a specific term in networking.
IMHO, logically (emphasis added) "socket" and "endpoint" are the same, because both are the combination of an Internet address with a TCP port. Strictly technically speaking, in core networking there is nothing called an "endpoint"; there is only a "socket". Go on, read more below...
As #Zac67 highlighted, "socket" is a very specific term in networking: if you read the TCP RFC (https://www.rfc-editor.org/rfc/rfc793) you won't find a single reference to "endpoint"; it only talks about "socket". But when you come out of the RFC world, you will hear a lot about "endpoint".
Now, both refer to the combination of an IP address and a TCP port, but you wouldn't ask someone "please give me the socket of your application"; you would say "please give me the endpoint of your application". So, IMHO, the way to understand the difference between socket and endpoint is this: even though both refer to the combination of an IP address and a TCP port, you use the term "socket" when talking in the context of computer processes or the OS; otherwise, when talking in general, you use "endpoint".
I am a guy coming from the embedded systems world and low-level things.
An endpoint is a hardware buffer constructed at the far end from your machine. What does that mean?
YourMachine <---------------> Device
[Socket] ----------------> [Endpoint]
[Endpoint] <---------------- [Socket]
Both sockets and endpoints are endpoints, but a socket is an endpoint that resides on the sender, which here is your machine. ("Socket" is a word used to distinguish between sender and receiver.)
OK, now that we know it is a buffer, what is the relation between buffers and networking?
Windows
When you create a socket on Windows, the OS returns a handle to that socket; a socket is in fact a kernel object. In Windows, when you create a kernel object, the returned value is a handle used to access that object; handles are usually void* values that are then cast into a numerical value Windows can understand. Now that you have access to the socket kernel object, all IO operations are handled in the OS kernel. Since you want to communicate with an external device, you have to reach the kernel first, and the socket does exactly that. In other words, creating a socket creates a socket object in the kernel = creates an endpoint in the kernel = creates a buffer in the kernel. That buffer is later used to stream data through the wires via the OS HAL (hardware abstraction layer), and you can talk to other devices and you are happy.
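To make that concrete, here is a tiny C/Winsock sketch (purely illustrative) showing that creating a socket just hands you back a handle-like value referring to a kernel object:

#include <winsock2.h>   /* link with ws2_32 */
#include <stdio.h>

int main(void)
{
    WSADATA wsa;
    if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0)
        return 1;

    /* The returned SOCKET is essentially a handle to a kernel object. */
    SOCKET s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (s == INVALID_SOCKET) {
        printf("socket() failed: %d\n", WSAGetLastError());
        WSACleanup();
        return 1;
    }

    printf("got socket handle %llu\n", (unsigned long long)s);
    closesocket(s);    /* tells the kernel to release the object */
    WSACleanup();
    return 0;
}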
Now, if the other device doesn't have a communication buffer (= endpoint), then you can't communicate with it even if you open a socket on your end; it has to be two-way data communication = send and receive.
Another example of accessing an IO peripheral is accessing RAM (main memory). There are two ways of accessing RAM: either you access the process stack or the process heap. The stack is not a kernel object; in fact you can access the stack directly without reaching the OS kernel, simply by subtracting a value from RSP (the stack pointer register). Example:
; This example demonstrates how to reserve 32 contiguous bytes on the stack on Windows
; (MASM, Intel syntax)
StackAllocate proc
    sub rsp, 20h        ; reserve 32 bytes - no kernel call involved
    ; ... the 32 bytes at [rsp] could be used here ...
    add rsp, 20h        ; release them again so ret finds the return address
    ret
StackAllocate endp
Accessing the heap is different: the heap is a kernel object, so when you call malloc() or the new operator in your code, a long call chain runs through Windows code. The point is that reaching RAM requires kernel help. The stack allocation above does not actually reach RAM; all I did was subtract a number from an existing value in RSP, which is inside the CPU, so I never went outside it. The heap object in the kernel returns a handle that Windows uses to manage fragmented memory and, in the end, gives you a void* to that memory.
Hope that helped
Currently in my degree we're starting to work with sockets.
I have a couple of questions regarding polling for input from sockets using the select() function.
int select( int nfds,
            fd_set *readfds,
            fd_set *writefds,
            fd_set *exceptfds,
            struct timeval *timeout);
We give select the "nfds" param, which would normally be the maximum socket number we would like to monitor. How can I watch only one specific socket instead of the range of 0 to nfds_val sockets?
What are the file descriptor objects that we use? What is their purpose, and why can't we just point "select" to the relevant socket structure?
I've read around the forum regarding blocking and non-blocking modes of select, but I couldn't understand the meaning or use of each, nor how to implement them; I'd be glad if someone could explain.
Last but not least (only for the time being :D): when binding a sockaddr_in to a socket, why does one need to cast it to sockaddr * and not leave it as sockaddr_in *?
I mean, except for the fact that the bind function expects this pointer type ;)
Would appreciate some of the experts' answers here :)
Thank you guys and have a great week!
We give select the "nfds" param, which would normally be the maximum socket number we would like to monitor. How can I watch only one specific socket instead of the range of 0 to nfds_val sockets?
Edit (sorry, the previous text here was wrong): just provide your socket descriptor + 1. nfds is only an upper bound on the descriptor numbers to examine; select only monitors the descriptors you actually put into the fd_sets, so it won't report on every descriptor in the [0, 1, ..., descriptor] range.
What are the file descriptor objects that we use? What is their purpose, and why can't we just point "select" to the relevant socket structure?
File descriptors are usually integer values given to the user by the OS. The OS uses descriptors to manage physical and logical resources; having a file descriptor means the OS has given you something file-like to control. Since Berkeley sockets have read and write operations defined, they are file-like, and socket objects are essentially plain file descriptors.
Answering "why can't we just point select to the relevant socket structure?": we actually can. What exactly to pass to select depends on the OS and language. In C you place your socket descriptor (most probably a plain int value) into an fd_set, and the fd_set is then passed to select.
Edit.
A tiny example for Linux:
fd_set set;
FD_ZERO(&set);
FD_SET(socket_fd, &set);
// check if socket_fd is ready for reading
int result = select(socket_fd + 1, &set, NULL, NULL, NULL);
if (result == -1) perror("select");
Docs.
Windows has similar code.
I've read around the forum regarding blocking and non-blocking modes of select, but I couldn't understand the meaning or use of each, nor how to implement them; I'd be glad if someone could explain.
A blocking operation makes your thread wait until it is done. That's 99% of the functions you use. If there are sockets ready for some IO, a blocking select will return immediately. If there are no such sockets, the thread will wait for them. A non-blocking select (one called with a zero timeout) won't wait in the latter case; it returns immediately with 0, meaning no descriptors were ready.
As an example, try to implement a single-threaded server that is capable of working with multiple clients, including long operations like file transfers happening simultaneously. You definitely don't want to use blocking socket operations in this case (a skeleton of the select-based approach is sketched below).
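For illustration, a bare-bones skeleton of that select()-based pattern could look like this (the listening socket is assumed to be created, bound and listening elsewhere, and most error handling is left out):

#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

void serve(int listen_fd)
{
    int clients[FD_SETSIZE];
    int nclients = 0;
    char buf[512];

    for (;;) {
        /* Rebuild the read set on every iteration. */
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(listen_fd, &readfds);
        int maxfd = listen_fd;

        for (int i = 0; i < nclients; i++) {
            FD_SET(clients[i], &readfds);
            if (clients[i] > maxfd)
                maxfd = clients[i];
        }

        /* Blocks until at least one descriptor is readable. */
        if (select(maxfd + 1, &readfds, NULL, NULL, NULL) < 0)
            break;

        /* New connection? */
        if (FD_ISSET(listen_fd, &readfds) && nclients < FD_SETSIZE) {
            int fd = accept(listen_fd, NULL, NULL);
            if (fd >= 0)
                clients[nclients++] = fd;
        }

        /* Data (or EOF) from existing clients. */
        for (int i = 0; i < nclients; i++) {
            if (!FD_ISSET(clients[i], &readfds))
                continue;
            ssize_t n = recv(clients[i], buf, sizeof(buf), 0);
            if (n <= 0) {                              /* closed or error */
                close(clients[i]);
                clients[i--] = clients[--nclients];    /* compact the list */
            } else {
                send(clients[i], buf, (size_t)n, 0);   /* echo it back */
            }
        }
    }
}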
Last but not least (only for the time being :D): when binding a sockaddr_in to a socket, why does one need to cast it to sockaddr * and not leave it as sockaddr_in *? I mean, except for the fact that the bind function expects this pointer type ;)
Probably due to historical reasons, but I'm not sure. And there seems to be a fine answer on SO already.