How can I read a UDP segment in Kernel Space?

I created a module in kernel space that sends a UDP segment using a raw socket, but my problem is reading the UDP segment from kernel space.
I can read the UDP segment from user space, but when I try to use "sock_recvmsg" from kernel space, I get -512 as the result.
Please, help me!

I don't know why you feel the need to use a raw socket to send/receive UDP - just use a UDP socket instead.
It may be that the structure you're supplying to sock_recvmsg for the address isn't right.
In general using networking from inside the kernel is a bad idea and should be avoided (not least, it ties your code to a specific kernel version). If you tell us what you're trying to do (ideally in the form of another question) maybe someone can suggest a better way.
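If you do end up needing this, the usual in-kernel pattern is sock_create_kern() plus the kernel_* socket helpers rather than calling sock_recvmsg() directly. Incidentally, -512 is -ERESTARTSYS, which normally means the call was interrupted by a signal rather than that your arguments were wrong. A minimal sketch, assuming the older four-argument sock_create_kern() (the signature varies across kernel versions, and the port number is just an example):

/* Minimal sketch: receive one UDP datagram from kernel space.
 * Assumes the older 4-argument sock_create_kern(); newer kernels
 * take a struct net * as the first argument. Port is illustrative. */
#include <linux/net.h>
#include <linux/in.h>
#include <net/sock.h>

static int udp_recv_once(void)
{
    struct socket *sock;
    struct sockaddr_in addr = {
        .sin_family      = AF_INET,
        .sin_addr.s_addr = htonl(INADDR_ANY),
        .sin_port        = htons(12345),   /* example port */
    };
    struct msghdr msg = { .msg_flags = 0 };
    struct kvec iov;
    char buf[256];
    int ret;

    /* a plain UDP socket; SOCK_RAW is not needed just to read UDP */
    ret = sock_create_kern(PF_INET, SOCK_DGRAM, IPPROTO_UDP, &sock);
    if (ret < 0)
        return ret;

    ret = kernel_bind(sock, (struct sockaddr *)&addr, sizeof(addr));
    if (ret < 0)
        goto out;

    iov.iov_base = buf;
    iov.iov_len  = sizeof(buf);

    /* kernel_recvmsg reads into kernel memory, avoiding the
     * user-space buffer handling that sock_recvmsg expects */
    ret = kernel_recvmsg(sock, &msg, &iov, 1, sizeof(buf), 0);
out:
    sock_release(sock);
    return ret;
}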

Related

What's the difference between endpoint and socket?

Almost every definition of socket that I've seen, relates it very closely to the term endpoint:
Wikipedia:
A network socket is an internal endpoint for sending or receiving data
at a single node in a computer network. Concretely, it is a
representation of this endpoint in networking software
This answer:
a socket is an endpoint in a (bidirectional) communication
Oracle's definition:
A socket is one endpoint of a two-way communication link between two
programs running on the network
Even Stack Overflow's definition of the tag 'sockets' is:
An endpoint of a bidirectional inter-process communication flow
This other answer goes a bit further:
A TCP socket is an endpoint instance
Although I don't understand what "instance" means in this case. If an endpoint is, according to this answer, a URL, I don't see how that can be instantiated.
"Endpoint" is a general term, including pipes, interfaces, nodes and such, while "socket" is a specific term in networking.
IMHO, logically (emphasis added) "socket" and "endpoint" are the same, because both are the concatenation of an Internet address with a TCP port. Strictly technically speaking, in core networking there is nothing called an "endpoint"; there is only a "socket". Go on, read more below...
As @Zac67 highlighted, "socket" is a very specific term in networking - if you read the TCP RFC (https://www.rfc-editor.org/rfc/rfc793) you won't find even a single reference to "endpoint"; it only talks about "socket". But when you come out of the RFC world, you will hear a lot about "endpoint".
Now, both terms talk about the combination of an IP address and a TCP port, but you wouldn't ask someone "please give me the socket of your application"; you would say "please give me the endpoint of your application". So, IMHO, the way to understand the difference between socket and endpoint is this: even though both refer to the combination of an IP address and a TCP port, you use the term "socket" when you are talking in the context of computer processes or of the OS; otherwise, when talking with someone in general, you use "endpoint".
I am a guy coming from the embedded systems world and low-level things.
An endpoint is a hardware buffer constructed at the far end of your machine. What does that mean?
YourMachine <---------------> Device
[Socket] ----------------> [Endpoint]
[Endpoint] <---------------- [Socket]
Both sockets and endpoints are endpoints, but a socket is an endpoint that resides on the sender, which here is your machine ("socket" is the word used to distinguish between sender and receiver).
OK, now that we know it is a buffer, what is the relation between buffers and networking?
Windows
When you create a socket on Windows, the OS returns a handle to that socket; a socket is in fact a kernel object, and in Windows, when you create a kernel object, the returned value is a handle used to access that object. Handles are usually void* values which are then cast into numerical values that Windows can understand. Once you have access to the socket kernel object, all IO operations are handled in the OS kernel, and since you want to communicate with an external device you have to reach the kernel first; the socket is exactly what does this. In other words, creating a socket creates a socket object in the kernel = creates an endpoint in the kernel = creates a buffer in the kernel. That buffer is later used to stream data through the wires via the OS HAL (hardware abstraction layer), you can talk to other devices, and you are happy.
Now, if the other device doesn't have a communication buffer (an endpoint), then you can't communicate with it, even if you open a socket on your end; it has to be two-way data communication, i.e. send and receive.
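To make the Windows part concrete, here is a minimal Winsock sketch; the SOCKET value you get back is exactly the kind of handle to a kernel object described above:

/* Minimal Winsock sketch: the SOCKET returned by socket() is a handle
   to a kernel object, as described above. */
#include <stdio.h>
#include <winsock2.h>
#pragma comment(lib, "ws2_32.lib")

int main(void)
{
    WSADATA wsa;
    if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0)
        return 1;

    /* the kernel creates the socket object and hands back a handle */
    SOCKET s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (s == INVALID_SOCKET) {
        WSACleanup();
        return 1;
    }
    printf("handle to the socket kernel object: %llu\n",
           (unsigned long long)s);

    closesocket(s);    /* release our reference to the kernel object */
    WSACleanup();
    return 0;
}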
Another example of accessing an IO peripheral is accessing RAM (main memory). There are two ways to access RAM: either through the process stack or through the process heap. The stack is not a kernel object; in fact, you can access the stack directly without reaching the OS kernel, simply by subtracting a value from RSP (the stack pointer register). Example:
; This example demonstrates how to allocate 32 contiguous bytes from stack on Windows OS
; Intel syntax
StackAllocate proc
sub rsp, 20h
ret
StackAllocate endp
Accessing the heap is different: the heap is a kernel object, so when you call malloc()/the new operator in your code, a long call chain runs through Windows code. The point is that reaching RAM requires the kernel's help. The stack allocation above does not actually reach RAM; all I did was subtract a number from an existing value in RSP, which lives inside the CPU, so I never went outside it. The heap object in the kernel returns a handle that Windows uses to manage fragmented memory, and in the end it returns a void* to that memory.
Hope that helped

How does socketcan handle arbitration?

I pretty much understand how the CAN protocol works -- when two nodes attempt to use the network at the same time, the lower-ID CAN frame gets priority and the other node detects this and halts.
This seems to get abstracted away when using socketcan - we simply write and read like we would any file descriptor. I may be misunderstanding something but I've gone through most of the docs (http://lxr.free-electrons.com/source/Documentation/networking/can.txt) and I don't think it's described unambiguously.
Does write() block until our frame is the lowest-ID frame, or does socketcan buffer the frame until the network is ready? If so, is the user notified when this occurs, or do we use the loopback for this?
write() does not block for channel contention. It could block for the same reasons a TCP socket write would (very unlikely).
The CAN peripheral will receive a frame to be transmitted from the kernel and perform the Medium Access Control Protocol (MAC protocol) to send it over the wire. SocketCAN knows nothing about this layer of the protocol.
Where the frame is buffered is peripheral/driver dependent: the kernel-driver-peripheral chain behaves as 3 chained FIFOs, each with its own flow-control mechanism, but usually it is the driver that does most of the buffering (when needed), since the peripheral has less memory available.
It is possible to subscribe to errors in the CAN protocol stack (signaled by the so-called "error frames") by providing certain flags through the SocketCAN interface (see 4.1.2 in your link): this is the way to get error information at the application layer.
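As an illustration, a minimal sketch of subscribing to error frames with the CAN_RAW_ERR_FILTER socket option (the interface name "can0" and the all-errors mask are just examples):

/* Sketch: subscribing to SocketCAN error frames (Linux). */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <net/if.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/can.h>
#include <linux/can/raw.h>
#include <linux/can/error.h>

int main(void)
{
    int s = socket(PF_CAN, SOCK_RAW, CAN_RAW);

    struct ifreq ifr;
    strcpy(ifr.ifr_name, "can0");
    ioctl(s, SIOCGIFINDEX, &ifr);

    struct sockaddr_can addr = {
        .can_family  = AF_CAN,
        .can_ifindex = ifr.ifr_ifindex,
    };
    bind(s, (struct sockaddr *)&addr, sizeof(addr));

    /* ask for every error class; error frames then arrive via read()
       with CAN_ERR_FLAG set in can_id (see section 4.1.2 of the docs) */
    can_err_mask_t err_mask = CAN_ERR_MASK;
    setsockopt(s, SOL_CAN_RAW, CAN_RAW_ERR_FILTER,
               &err_mask, sizeof(err_mask));

    struct can_frame frame;
    if (read(s, &frame, sizeof(frame)) > 0 && (frame.can_id & CAN_ERR_FLAG))
        printf("error frame, class bits 0x%x\n",
               (unsigned)(frame.can_id & CAN_ERR_MASK));

    close(s);
    return 0;
}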
Of course you can check for a correctly transmitted frame by reading it back on the loopback interface, but that is overkill; the error-reporting mechanism described above should be used instead, and it is easier to use.

netlink connector sockets

I have worked with network programming before. But this is my first foray into netlink sockets.
I have chosen to study the 'connector' type of netlink sockets. As with any kernel component, it has a user counterpart as well. The Linux kernel has a sample program called ucon.c which can be used to build userspace programs based on the aforementioned connector netlink sockets.
So here I wish to pin-point parts of the program that I want to confirm my understanding of and of parts of the program that I do not follow the logic of. Enough talking. Here we go. Please correct me wherever I go astray.
As far as I have understood, netlink sockets are an IPC method used to connect processes on the same machine, and hence a process ID is used as an identifier. And since netlink messages can be multicast, another identifier needed by the netlink socket is the message group. All components that are connected to the same message group are in fact related. So while in the case of IPv4 we use a sockaddr_in in place of the sockaddr, here we use a sockaddr_nl, which contains the above-mentioned identifiers.
Now, since we are not going to use the TCP/IP stack of the kernel, netlink packets can be considered to be raw (please correct me here if I am wrong). Hence the only encapsulation that a netlink packet goes through is the netlink message header, defined as nlmsghdr.
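For reference, this is roughly the setup that ucon.c's main() performs, sketched in C (the multicast group value here is illustrative, not the one from the sample):

/* Sketch: opening and binding a connector netlink socket, roughly as
   ucon.c does; the group value is illustrative. */
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/connector.h>

int open_connector_socket(void)
{
    int s = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR);
    if (s < 0)
        return -1;

    struct sockaddr_nl l_local = {
        .nl_family = AF_NETLINK,
        .nl_pid    = getpid(),      /* process ID as the unicast address */
        .nl_groups = CN_IDX_PROC,   /* example multicast group */
    };
    if (bind(s, (struct sockaddr *)&l_local, sizeof(l_local)) < 0) {
        close(s);
        return -1;
    }
    return s;
}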
Now coming to our program ucon: main() first creates a NETLINK family socket with the connector protocol. Then it fills in the aforementioned netlink socket address structure with the relevant information. In order to be a little experimental here, I have added an entry in the connector.h file. Now here comes my first question.
A connector message has a certain type defined in connector.h. Now this connector message structure is something that is completely internal to netlink, right? As in, as far as netlink is concerned, this is nothing but payload. Right?
Moving on, what exactly does the nl-group field mean within the netlink message header structure? The definition does not really contain an element of this name. So are we using overlay techniques to fill certain fields of the netlink message header? And if so, what exactly is the correspondence? I cannot seem to find it anywhere.
So after binding the socket address to the socket, it is sending 10,000 unique pieces of connector-based data, which, as far as netlink is concerned, are pure payload. But what is strange about these messages is that all of them seem to have the same sequence number.
Moving on, we find ourselves in the netlink_send subroutine, which sends these packets via the socket that we bound above. This subroutine uses a variety of netlink helper macros to manipulate the data to send. As we said above, the main() function sends 10,000 pieces of data, each of which is zero-length and requires no acknowledgement, since the ack field is 0 (please correct me if I am wrong here). So each 'packet' is nothing but a connector message header without anything in it. Right?
Now what is surprising is that the netlink_send function uses the same sequence number as main(), since it is a global variable. However, after the post-increment in main(), it is now '1'. So basically our netlink talk is starting with a sequence number of '1'. Is that fine?
Looking into some of the helper macros defined in linux/netlink.h, I will try to summarize my understanding of the ones that are directly or indirectly being used in this program.
#define NLMSG_LENGTH(len) ((len)+NLMSG_ALIGN(NLMSG_HDRLEN))
So this macro will first align the netlink message header length and then add the payload length to it. For our case the netlink payload is a connector header without any payload of its own.
In our case, this macro is used like so:
nlh->nlmsg_len = NLMSG_LENGTH(size - sizeof(*nlh));
Here, what I do not understand is the actual payload of the netlink message. In the above case, it is the size of the connector message header (since the connector message itself contains no payload of its own) minus the pointer (which is pointing to the first byte of the netlink message and thereby the netlink message header). And this pointer is (like any other pointer variable) equal to the machine word size, which in my case is 4 bytes. Why are we subtracting this from the connector message header?
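For reference, here is a sketch of the buffer layout being manipulated. One note on the arithmetic: sizeof(*nlh) is the size of the struct nlmsghdr that nlh points to, not the 4- or 8-byte size of the pointer variable itself:

/* Sketch of the buffer layout under discussion (ucon.c style): a netlink
   header immediately followed by an empty connector message. */
#include <string.h>
#include <linux/netlink.h>
#include <linux/connector.h>

void fill_empty_cn_msg(char *buf, unsigned int size)
{
    struct nlmsghdr *nlh = (struct nlmsghdr *)buf;
    struct cn_msg *msg;

    /* sizeof(*nlh) is sizeof(struct nlmsghdr) - the size of the structure
       the pointer points to, not the size of the pointer itself - so the
       payload length handed to NLMSG_LENGTH is everything after the
       netlink header: here, just the connector header. */
    nlh->nlmsg_len = NLMSG_LENGTH(size - sizeof(*nlh));

    msg = (struct cn_msg *)NLMSG_DATA(nlh);
    memset(msg, 0, sizeof(*msg));
    msg->len = 0;    /* zero-length connector payload, as in ucon.c */
}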
After that, we send the message over this netlink socket just like any other IPv4 socket. I hope to hear from you folks out there with regard to the above-mentioned questions. Quoting some sentences before the actual question you are addressing would help, as my post is rather long. But I hope it will be useful to more people than just myself.
Regards.

How to set a timeout in connect/send ? ( as400 iseries v5r4, rpg )

From this RPG socket tutorial we created a socket client in RPG that calls a Java server socket.
The problem is that the connect()/send() operations block, and we have a requirement that if the connect/send can't be completed within, say, a second, we just log it and finish.
If I set the socket to non-blocking mode (I think with fcntl), we don't fully understand how to proceed, and we can't find any useful documentation with examples for it.
I think that if I connect with a non-blocking socket, I have to do select(..., timeout), which tells us whether the connect succeeded and/or whether we are able to send(bytes). But if we send(bytes) afterwards, as it is now a non-blocking socket (which returns immediately after the call), how do I know that send() actually sent the bytes to the server before closing the socket?
I can fall back to have the client socket in AS400 as a Java or C procedure, but I really want to just keep it in a simple RPG program.
Would somebody help me understand how to do that please ?
Thanks !
In my opinion, that RPG tutorial you mention has a slight defect. What I believe is causing your confusion is the following section's code:
...
Consequently, we typically call the
send() API like this:
D miscdata S 25A
D rc S 10I 0
C eval miscdata = 'The data to send goes here'
C eval rc = send(s: %addr(miscdata): 25: 0)
c if rc < 25
C* for some reason we weren't able to send all 25 bytes!
C endif
...
If you read the documentation of send() you will see that the return value does not indicate an error if it is greater than -1, yet in the code above it is treated as if an error had occurred. In fact, the sum of the return values must equal the size of the buffer, assuming that you keep moving the pointer into the buffer to reflect what has already been sent. Look here in Beej's Guide to Network Programming. You might also like to look at Richard Stevens' book UNIX Network Programming, Volume 1, for really detailed explanations.
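In code, the "keep moving the pointer" loop looks roughly like this (essentially the sendall() from Beej's guide, sketched in C rather than RPG):

/* Keep advancing the pointer until the whole buffer has gone out. */
#include <sys/types.h>
#include <sys/socket.h>

int send_all(int s, const char *buf, int len)
{
    int total = 0;                 /* bytes sent so far */
    while (total < len) {
        ssize_t n = send(s, buf + total, len - total, 0);
        if (n == -1)
            return -1;             /* a real error, unlike a short send */
        total += (int)n;           /* advance past what was actually sent */
    }
    return total;                  /* equals len on success */
}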
As to the problem of determining if the last send before close() did the actual send ... well the paragraph above explains how to determine what portion of the data was sent. However, calling close() will attempt to send all unsent data unless SO_LINGER is set.
fcntl() is used to control blocking, while setsockopt() is used to set SO_LINGER.
The abstraction of network communications being used is BSD sockets. There are some slight differences in implementations across OS's but it is generally quite homogeneous. This means that one can generally use documentation written for other OS's for the broad overview. Most of the time.
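For the original timeout question, the non-blocking connect + select() pattern looks roughly like this in C (the RPG version calls the same APIs through its prototypes; error handling is trimmed for brevity):

/* Sketch: connect with a timeout using fcntl() + select().
   The same idea applies before calling send(). */
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>

int connect_with_timeout(int s, struct sockaddr *addr, socklen_t alen,
                         int seconds)
{
    fd_set wfds;
    struct timeval tv = { .tv_sec = seconds, .tv_usec = 0 };
    int err = 0;
    socklen_t elen = sizeof(err);

    /* switch the socket to non-blocking mode */
    fcntl(s, F_SETFL, fcntl(s, F_GETFL, 0) | O_NONBLOCK);

    if (connect(s, addr, alen) == 0)
        return 0;                  /* connected immediately */
    if (errno != EINPROGRESS)
        return -1;                 /* immediate, real failure */

    FD_ZERO(&wfds);
    FD_SET(s, &wfds);
    if (select(s + 1, NULL, &wfds, NULL, &tv) <= 0)
        return -1;                 /* timed out (or error): log and quit */

    /* writable does not mean connected; fetch the deferred result */
    getsockopt(s, SOL_SOCKET, SO_ERROR, &err, &elen);
    return err == 0 ? 0 : -1;
}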

Socket Protocol Fundamentals

Recently, while reading a Socket Programming HOWTO the following section jumped out at me:
But if you plan to reuse your socket for further transfers, you need to realize that there is no "EOT" (End of Transfer) on a socket. I repeat: if a socket send or recv returns after handling 0 bytes, the connection has been broken. If the connection has not been broken, you may wait on a recv forever, because the socket will not tell you that there's nothing more to read (for now). Now if you think about that a bit, you'll come to realize a fundamental truth of sockets: messages must either be fixed length (yuck), or be delimited (shrug), or indicate how long they are (much better), or end by shutting down the connection. The choice is entirely yours, (but some ways are righter than others).
This section highlights 4 possibilities for how a socket "protocol" may be written to pass messages. My question is, what is the preferred method to use for real applications?
Is it generally best to include message size with each message (presumably in a header), as the article more or less asserts? Are there any situations where another method would be preferable?
The common protocols either specify length in the header, or are delimited (like HTTP, for instance).
Keep in mind that this also depends on whether you use TCP or UDP sockets. Since TCP sockets are reliable you can be sure that you get everything you shoved into them. With UDP the story is different and more complex.
These are indeed our choices with TCP. HTTP, for example, uses a mix of the second, third, and fourth options (a double newline ends the request/response headers, which might contain a Content-Length header or indicate chunked encoding, or might say Connection: close and not give you the content length, expecting you to rely on reading until EOF).
I prefer the third option, i.e. self-describing messages, though fixed-length is plain easy when suitable.
If you're designing your own protocol then look at other people's work first; there might already be something similar out there that you could either use 'as is' or repurpose and adjust. For example, ISO-8583 for financial transactions, HTTP and POP3 all do things differently but in ways that are proven to work... In fact it's worth looking at these things anyway, as you'll learn a lot about how real-world protocols are put together.
If you need to write your own protocol then, IMHO, prefer length prefixed messages where possible. They're easy and efficient to parse for the receiver but possibly harder to generate if it is costly to determine the length of the data before you begin sending it.
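As a sketch of what the receiver side of a length-prefixed protocol looks like (assuming, purely as a convention, a 4-byte big-endian length header):

/* Receiving one length-prefixed message over TCP. */
#include <stdint.h>
#include <stdlib.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <sys/types.h>

/* read exactly n bytes, looping over short reads */
static int recv_exact(int s, void *buf, size_t n)
{
    char *p = buf;
    while (n > 0) {
        ssize_t r = recv(s, p, n, 0);
        if (r <= 0)
            return -1;             /* 0 = peer closed, <0 = error */
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

char *recv_message(int s, uint32_t *out_len)
{
    uint32_t netlen;
    if (recv_exact(s, &netlen, sizeof(netlen)) < 0)
        return NULL;
    *out_len = ntohl(netlen);      /* the header says how long the body is */

    char *body = malloc(*out_len);
    if (body == NULL || recv_exact(s, body, *out_len) < 0) {
        free(body);
        return NULL;
    }
    return body;                   /* caller frees */
}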
The decision should depend on the data you want to send (what it is, how it is gathered). If the data is fixed-length, then fixed-length packets will probably be best. If the data can easily be split into delimited entities (no escaping needed), then delimiting may be good. If you know the data size when you start sending the piece of data, then length-prefixing may be even better. If the data sent is always single characters, or even single bits (e.g. "on"/"off"), then anything other than fixed-size one-character messages would be too much.
Also think how the protocol may evolve. EOL-delimited strings are good as long as they do not contain EOL characters themselves. Fixed length may be good until the data may be extended with some optional parts, etc.
I do not know if there is a preferred option. In our real-world situation (client-server application), we use the option of sending the total message length as one of the first pieces of data. It is simple and works for both our TCP and UDP implementations. It makes the logic reasonably "simple" when reading data in both situations. With TCP, the amount of code is fairly small (by comparison). The UDP version is a bit (understatement) more complex but still relies on the size that is passed in the initial packet to know when all data has been sent.