Application-layer protocol - different packet sizes - sockets

Assume I have defined my own application-layer protocol on top of TCP for instant messaging, using a packet structure for the messages. Because I use both symmetric (AES) and asymmetric (RSA) encryption, different message types end up with different packet sizes. Now to my questions:
How do I read from the socket so that I receive exactly one application-layer packet?
What size should I specify for the read?
Thanks in advance.
I have two approaches in mind:
1) Read a fixed number of bytes from the TCP stream that encodes the actual packet size, then read exactly that many bytes from the stream to obtain the packet itself.
2) Always read the maximum packet size from the stream, check how many bytes were actually obtained, and decide from that which message type it was.
Now, a more general question: should I provide metadata like the packet size, encryption method, receiver, sender, etc.? If so, should I encrypt this metadata as well?

Remember that with TCP, when reading from the network, there is no guarantee about the number of bytes received at that point in time. That is, a client might send a full packet in its write(), but that does not mean that your read() will receive the same number of bytes. Thus your code will always need to read some number of bytes from the network, then verify (based on the accumulated data) that you have received the necessary number of bytes, and then you can verify the packet (type, contents, etc) from there.
Some applications use state machine encoders/decoders and fixed size buffers for reading/writing their network data; other applications dynamically allocate buffers large enough for the "full packet", then continue reading bytes from the network until the "full packet" buffer is full. Which approach you take depends on your application. Thus the size you use for reading is not as important as how your code ensures that it has received a full packet.
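For the first approach (a length prefix), a minimal sketch of the read side might look like the following, assuming POSIX sockets and a 4-byte big-endian length field; read_exact and read_packet are illustrative names, not part of any particular library.

// Sketch of length-prefix framing over a POSIX TCP socket.
// Assumes a 4-byte big-endian length field precedes every packet.
#include <arpa/inet.h>   // ntohl
#include <unistd.h>      // read
#include <cstdint>
#include <stdexcept>
#include <vector>

// Keep reading until exactly `len` bytes have been received (or the peer closes).
static void read_exact(int fd, uint8_t* buf, size_t len) {
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, buf + got, len - got);
        if (n == 0) throw std::runtime_error("peer closed connection");
        if (n < 0)  throw std::runtime_error("read() failed");
        got += static_cast<size_t>(n);
    }
}

// Read one application-layer packet: 4-byte length prefix, then the payload.
std::vector<uint8_t> read_packet(int fd) {
    uint32_t len_be = 0;
    read_exact(fd, reinterpret_cast<uint8_t*>(&len_be), sizeof(len_be));
    uint32_t len = ntohl(len_be);            // convert from network byte order
    std::vector<uint8_t> payload(len);
    if (len > 0) read_exact(fd, payload.data(), len);
    return payload;                           // decrypt / parse the message type from here
}

The same accumulate-and-check idea applies if you instead read into a large buffer and carve packets out of it; the loop is just expressed differently.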
As for whether you should encrypt additional metadata, that depends very much on your threat model (i.e. what threats your protocol wants to guard against, what assurances your protocol needs to provide to its clients/users). There's no easy way to answer that question without more context/details.
Hope this helps!

Related

Why do QUIC packets require a sequence number?

This question was closed on the Networking SE because "questions about protocols above OSI layer-4 are off-topic here" so I'm trying here.
This may be a silly question, but if in QUIC we maintain separate sliding windows for each stream, why is there a need for sequencing even below the stream level?
It seems to me that an application will not receive the same data twice, because we already sequence the streams themselves, and we can also acknowledge each stream separately without sequencing the packets.
I presume you mean the packet number, as there isn't a sequence number as such in QUIC.
If so, the packet number in QUIC performs a number of roles: it is used within the ACK process, it provides sequencing at the packet level, and it forms part of the nonce input to the AEAD encryption process.
From https://datatracker.ietf.org/doc/html/rfc9001, Section 5.3 (AEAD Usage):
The nonce, N, is formed by combining the packet protection IV with the packet number.
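As a rough illustration of that combination (not actual QUIC library code), the packet number is left-padded to the length of the IV and XORed with it, roughly like this:

// Illustrative sketch of the RFC 9001 Section 5.3 nonce construction:
// the packet number is left-padded to the IV length and XORed with the IV.
#include <array>
#include <cstdint>

// AEAD IVs in QUIC are 12 bytes (96 bits).
std::array<uint8_t, 12> make_nonce(const std::array<uint8_t, 12>& iv,
                                   uint64_t packet_number) {
    std::array<uint8_t, 12> nonce = iv;
    // Write the packet number big-endian into the last 8 bytes, XORing as we go.
    for (int i = 0; i < 8; ++i) {
        uint8_t byte = static_cast<uint8_t>(packet_number >> (8 * (7 - i)));
        nonce[4 + i] ^= byte;
    }
    return nonce;
}

Because the packet number feeds the nonce, it must be unique per packet, which is one reason it cannot simply be dropped in favor of per-stream offsets.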
From https://datatracker.ietf.org/doc/html/rfc9000, Section 13.2.3 (Managing ACK Ranges):
A receiver SHOULD include an ACK Range containing the largest received packet number in every ACK frame.

Is there a guideline for the maximum buffer size I send over a socket?

Sorry if this question has been asked before (I could not find any similar questions), but is there a "maximum" buffer size that I should be sending over a socket at one time? If I were to, for example, send data with a buffer size equal to the maximum allowed by sockets, would there be anything bad about that? Thanks in advance for any help.
It depends on the kind of sockets. With TCP a connection is a byte stream and the OS will take care of how best to split the bytes into packets and concatenate these together at the other side.
With UDP, instead, each send results in a single UDP message (datagram), and there is an upper limit of 64k for the size of a datagram. But even though UDP supports 64k datagrams in theory, it is not a good idea to use such large messages. Since the maximum transmission unit of the underlying layer is much smaller (around 1500 bytes for Ethernet), the message needs to be fragmented, and it is easy to lose a single fragment - in which case the whole message is considered lost.
You can send as much data as you want - that's the point of the abstraction. The underlying layers will do what they want, breaking things into chunks as necessary, but you shouldn't have to care about that as a user of the interface. Both the read and write interfaces return the number of bytes actually transferred in the case of a partial read or write, so a simple loop should be sufficient to do a large transfer.
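A minimal sketch of such a loop for a TCP (stream) socket, assuming POSIX write(); send_all is just an illustrative name:

// write() may send fewer bytes than requested, so keep going until done.
#include <unistd.h>   // write
#include <cstddef>

bool send_all(int fd, const char* data, size_t len) {
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = write(fd, data + sent, len - sent);
        if (n <= 0) return false;   // error (or nothing written); handle as needed
        sent += static_cast<size_t>(n);
    }
    return true;
}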

TCP File Transfer - Kernel copy mechanisms

I am wondering if the speed of the copy operations between user and kernel space, and in general of the whole TCP send/receive process, depends on the type of file (.txt, .mp4).
I don't mean the file size, but the "structure" of the bytes or anything like that. I searched for quite a while but did not find anything related. Are there helpful phrases or terms I could look up?
Thanks in advance!
TCP has no idea of the application-level structure of a file and thus cannot make any decisions based on it. All it cares about is a byte stream.
But it is not uncommon for the way an application interacts with the kernel to depend on the specific protocol, and this can have noticeable effects on performance. For example, one application might write 1000 bytes to the kernel at once, while another writes a 500-byte HTTP header followed by a 500-byte HTTP body. These might be exactly the same bytes in both cases, but more context switches are involved in the second case due to the extra syscall, and depending on the socket options it might also result in two TCP packets instead of one, where each TCP packet has a noticeable overhead in bytes and in processing time.
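As an illustration of that difference (assuming POSIX sockets; the function name and parameters are made up for the example), writev() lets the application hand both pieces to the kernel in one syscall:

// The same bytes sent with two syscalls versus one. writev() passes the header
// and body together, avoiding the extra user/kernel transition (and, depending
// on socket options such as TCP_NODELAY, possibly a second TCP segment).
#include <sys/uio.h>   // writev
#include <unistd.h>    // write

void send_response(int fd, const char* header, size_t header_len,
                   const char* body, size_t body_len) {
    // Variant 1: two syscalls, two user/kernel transitions.
    // write(fd, header, header_len);
    // write(fd, body, body_len);

    // Variant 2: one syscall, one transition, same bytes on the wire.
    struct iovec iov[2];
    iov[0].iov_base = const_cast<char*>(header);
    iov[0].iov_len  = header_len;
    iov[1].iov_base = const_cast<char*>(body);
    iov[1].iov_len  = body_len;
    writev(fd, iov, 2);   // return value / partial writes ignored for brevity
}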

How bad is IP fragmentation?

I understand that when sending IP packets around, each hop in the network path between me and my packet's destination checks whether the next hop's MTU is smaller than the size of the packet I sent. If so, the packet is fragmented and the fragments are sent separately to the next hop, only to be reassembled at the destination (or, in some cases, at the first NAT router encountered).
As far as I understand, this thing can be pretty bad, but I don't really understand why.
I understand that if the connection tends to drop a lot of packets, losing a single fragment means I have to resend the whole packet (this is actually the only thing I figured out myself)
Is there a chance that instead of being fragmented my packet will just be dropped?
How are packet fragments identified? Can I be 100% sure that they will be reassembled correctly? For example, if I send two IP packets of the same length nearly simultaneously to the same destination, how likely is it that fragments of the two will be swapped, e.g. AAA, BBB reassembled into ABA, BAB?
In principle, if packets aren't dropped and fragments are reassembled correctly, actually using packet fragmentation seems like a good idea to save on local bandwidth and avoid sending more and more headers instead of just one big packet.
Thank you
IP fragmentation can cause several problems:
1) Application layer loss is increased
As you mentioned, if a single fragment is dropped, the entire layer 4 packet will be lost. Thus, for a network with a small random packet loss rate, the application layer loss rate is increased by a factor approximately equal to the number of fragments for each layer 4 packet.
2) Not all networks handle fragmented packets
Some systems, such as Google's Compute Engine, do not reassemble fragmented packets.
3) Fragmentation can cause re-ordering
When routers split traffic down parallel paths, they may try to keep packets from the same flow on a single path. Because only the first fragment has layer 4 information like UDP/TCP port number, subsequent fragments may be routed down a different path, delaying assembly of the layer 4 packet and causing re-ordering.
4) Fragmentation can cause confusing behavior that is hard to debug
For example, if you send two UDP streams, A and B, from one source to a destination running Linux, the destination may discard packets from one of the streams. This is because by default, Linux "times out" fragment queues if more than 64 other fragments have been received from the same source. If stream A has a much higher data rate than stream B, 64 fragments from stream A may arrive in between the fragments from stream B, causing the B fragment to be dropped.
Thus, while IP fragmentation can reduce overhead by minimizing user headers, it may cause more trouble than it is worth.
To my knowledge, the only case where packets will be dropped rather than fragmented (barring cases where they would be dropped anyway) is packets which are marked "don't fragment". These packets are discarded rather than fragmented.
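For what it's worth, on Linux an application can opt into setting the DF bit itself; a minimal, Linux-specific sketch for an IPv4 UDP socket (error handling kept minimal):

// Linux-specific: IP_MTU_DISCOVER with IP_PMTUDISC_DO sets the DF bit on
// outgoing packets, so oversized packets fail rather than being fragmented.
#include <netinet/in.h>   // IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO
#include <sys/socket.h>   // socket, setsockopt
#include <cstdio>         // perror

int open_df_udp_socket() {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    int val = IP_PMTUDISC_DO;   // always set DF; never fragment locally
    if (setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &val, sizeof(val)) != 0)
        perror("setsockopt(IP_MTU_DISCOVER)");
    return fd;
}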
Fragmented packets have Identification, Fragment Offset, and More Fragments fields in their headers that, when combined, allow the destination host to reliably reassemble the packet upon receipt of all the fragments. The first fragment's offset is zero, and the last fragment has the More Fragments flag set to zero. It is still possible (although very unlikely) to reassemble an incorrect packet if two packets' headers are mutated so that their fragment offsets are exchanged but their checksums are still valid; the probability of this happening is essentially zero. Bear in mind that IP does not provide any mechanism for ensuring the integrity of the data payload, only the integrity of the control information in the header.
Packet fragmentation necessarily wastes bandwidth because each fragment carries a copy of [most of] the original datagram's header. Packets can be fragmented down to only 8 bytes of payload per fragment, so a maximum-sized packet of 60 + 65536 bytes could be fragmented into 60 * 8192 + 65536 bytes, an increase of about 750% in the worst case. The only example I can come up with where you would come out ahead is if you fragmented a packet in order to send its fragments in parallel using some kind of frequency-division multiplexing scheme, with the knowledge that the other channels are free. At that point, it still seems like it would require more work to detect that circumstance and divide the packet than would be saved compared to just sending it.
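To spell out that worst-case arithmetic (illustrative numbers only, matching the figures above):

// Worked example of the worst-case fragmentation overhead described above.
#include <cstdio>

int main() {
    const double header   = 60;            // maximum IPv4 header size in bytes
    const double payload  = 65536;         // roughly the maximum datagram payload
    const double frags    = payload / 8;   // 8-byte fragments -> 8192 fragments
    const double original = header + payload;
    const double worst    = header * frags + payload;
    std::printf("increase: %.0f%%\n", (worst - original) / original * 100);  // ~750%
    return 0;
}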
All the basic details about the mechanics of packet fragmentation in IP can be found in IETF RFC 791, if you're hungry for more information.

How to split united datagrams received with boost::asio UDP sockets

I've made my UDP server and client with boost::asio UDP sockets. Everything looked good until I started sending more datagrams. They arrive correctly from client to server, but they are united into one message in my buffer.
I use
udp::socket::async_receive with a std::array<char, 1 << 18> buffer
to make the async request, and I receive the data through the callback
void on_receive(const error_code& code, size_t bytes_transferred)
If I send data too often (every 10 milliseconds), I receive several datagrams at once into my buffer via the callback above. The question is: how do I separate them? Note: my UDP datagrams have variable length. I don't want to use an additional header with the size, because it would make my code useless for third-party datagrams.
I believe this is a limitation in the way boost::asio handles stateless data streams. I noticed exactly the same behavior when using boost::asio for a serial interface. When I was sending packets with relatively large gaps between them, I received each one in a separate callback. As the packet size grew and the gap between packets therefore decreased, it reached a point where the callback was executed only when the buffer was full, not after receipt of a single packet.
If you know exactly the size of the expected datagrams, then your solution of limiting the input buffer size is a perfectly sensible one, as you know a priori exactly how large the buffer needs to be (see the sketch below).
If your congestion comes from having multiple different packet types being transmitted, so that you can't pre-allocate a buffer of the correct size, then you could potentially create different sockets on different ports for each type of transaction. It's a little more "hacky", but given the virtually unlimited nature of ephemeral port availability, as long as you're not using 20,000 different packet types, that would probably help you out as well.
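For the first case (a known datagram size), a rough sketch of what that might look like with boost::asio; EXPECTED_SIZE, Receiver, and the handler body are illustrative assumptions, not taken from the question's code:

// Fixed-size-buffer receive loop: the buffer is sized for one expected datagram,
// and bytes_transferred reports how many bytes the completed receive delivered.
#include <boost/asio.hpp>
#include <array>
#include <iostream>

using boost::asio::ip::udp;

constexpr std::size_t EXPECTED_SIZE = 512;   // assumed known datagram size

class Receiver {
public:
    Receiver(boost::asio::io_context& io, unsigned short port)
        : socket_(io, udp::endpoint(udp::v4(), port)) { start_receive(); }

private:
    void start_receive() {
        socket_.async_receive_from(
            boost::asio::buffer(buffer_), sender_,
            [this](const boost::system::error_code& ec, std::size_t bytes_transferred) {
                if (!ec) {
                    // Process buffer_[0 .. bytes_transferred); with the buffer sized
                    // to one datagram, each completion maps onto one message.
                    std::cout << "received " << bytes_transferred << " bytes\n";
                }
                start_receive();   // re-arm for the next datagram
            });
    }

    udp::socket socket_;
    udp::endpoint sender_;
    std::array<char, EXPECTED_SIZE> buffer_;
};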