Why do QUIC packets require a sequence number? - quic

This question was closed on the Networking SE because "questions about protocols above OSI layer-4 are off-topic here" so I'm trying here.
This may be a silly question, but if QUIC maintains a separate sliding window for each stream, why is there any need for sequencing below the stream level?
It seems to me that an application will not receive the same data twice, because the streams are already sequenced individually, and each stream can be acknowledged separately without sequencing the packets themselves.

I presume you mean the packet number, as there isn't a sequence number as such in QUIC.
If so, the packet number in QUIC performs several roles: it is used in the ACK process, it provides sequencing at the packet level, and it forms part of the nonce input to the AEAD encryption process.
From https://datatracker.ietf.org/doc/html/rfc9001, Section 5.3 (AEAD Usage):
The nonce, N, is formed by combining the packet protection IV with the packet number.
From https://datatracker.ietf.org/doc/html/rfc9000, Section 13.2.3 (Managing ACK Ranges):
A receiver SHOULD include an ACK Range containing the largest received packet number in every ACK frame.
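To make the nonce role concrete, here is a minimal sketch of the construction described in RFC 9001 Section 5.3, assuming a 12-byte IV as used with AES-128-GCM (the function name is mine, for illustration): the packet number, in network byte order, is left-padded with zeros to the IV length and XORed with the IV.

    #include <array>
    #include <cstdint>

    // Sketch of RFC 9001 Section 5.3: the 62-bit packet number, in network
    // byte order and left-padded with zeros to the IV length, is XORed with
    // the packet protection IV to form the AEAD nonce.
    std::array<uint8_t, 12> make_nonce(const std::array<uint8_t, 12>& iv,
                                       uint64_t packet_number) {
        std::array<uint8_t, 12> nonce = iv;
        for (int i = 0; i < 8; ++i) {
            // XOR the big-endian packet number into the last 8 bytes of the IV.
            nonce[11 - i] ^= static_cast<uint8_t>(packet_number >> (8 * i));
        }
        return nonce;
    }

Because a packet number is never reused within a packet number space under the same key, each packet gets a unique nonce, which is exactly what the AEAD requires.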

Related

application layer protocol - different size of packets

Assume I have defined my own application layer protocol on top of TCP for Instant Messaging. I have used a packet structure for the messages. As I am using symmetric (AES) and asymmetric (RSA) encryption, I obtain a different
packet size for different message types. Now to my questions.
How do I read from the socket so that I receive a single application-layer packet?
What size should I specify?
Thanks in advance.
I have two approaches in mind.
1. Read a fixed number of bytes from the TCP stream that encodes the actual packet size, then read that many bytes of payload from the stream.
2. Read up to the maximal packet size from the stream, check how many bytes were actually received, and decide from that which message type it was.
Now, a more general question. Should I provide metadata like the packet size, encryption method, receiver, sender, etc.? If so, should I encrypt this metadata as well?
Remember that with TCP, when reading from the network, there is no guarantee about the number of bytes received at that point in time. That is, a client might send a full packet in its write(), but that does not mean that your read() will receive the same number of bytes. Thus your code will always need to read some number of bytes from the network, then verify (based on the accumulated data) that you have received the necessary number of bytes, and then you can verify the packet (type, contents, etc) from there.
Some applications use state machine encoders/decoders and fixed size buffers for reading/writing their network data; other applications dynamically allocate buffers large enough for the "full packet", then continue reading bytes from the network until the "full packet" buffer is full. Which approach you take depends on your application. Thus the size you use for reading is not as important as how your code ensures that it has received a full packet.
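As a minimal sketch of the first approach (a fixed-size length prefix followed by the payload), assuming POSIX sockets; the helper names are illustrative and error handling is reduced to an exception:

    #include <cstddef>
    #include <cstdint>
    #include <stdexcept>
    #include <vector>
    #include <sys/socket.h>   // recv() (POSIX)

    // Keep calling recv() until exactly `len` bytes have been accumulated,
    // because one recv() may return any number of bytes up to `len`.
    static void recv_exact(int fd, uint8_t* buf, size_t len) {
        size_t got = 0;
        while (got < len) {
            ssize_t n = recv(fd, buf + got, len - got, 0);
            if (n <= 0) throw std::runtime_error("connection closed or error");
            got += static_cast<size_t>(n);
        }
    }

    // Read a 4-byte big-endian length prefix, then read that many payload bytes.
    std::vector<uint8_t> read_message(int fd) {
        uint8_t hdr[4];
        recv_exact(fd, hdr, sizeof hdr);
        uint32_t len = (uint32_t(hdr[0]) << 24) | (uint32_t(hdr[1]) << 16) |
                       (uint32_t(hdr[2]) << 8)  |  uint32_t(hdr[3]);

        std::vector<uint8_t> payload(len);
        if (len > 0) recv_exact(fd, payload.data(), len);
        return payload;
    }

A real implementation would also cap the decoded length to a sane maximum before allocating, so a corrupted or hostile length prefix cannot trigger a huge allocation.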
As for whether you should encrypt additional metadata, that depends very much on your threat model (i.e. what threats your protocol wants to guard against, what assurances your protocol needs to provide to its clients/users). There's no easy way to answer that question without more context/details.
Hope this helps!

Message delimitation in TCP communication

I am a newbie to networks and in particular TCP (I have been fooling around a bit with UDP, but that's it).
I am developing a simple protocol based on exchanging messages between two endpoints. Those messages need to be certified, so I implemented a cryptographic layer that takes care of that. However, while UDP has a sound definition of a packet that constitutes the minimum unit that can get transferred at a time, the TCP protocol (as far as my understanding goes) is completely stream oriented.
Now, this puzzles me a bit. When exchanging messages, how can I tell where one starts and the other one ends? In principle, I can obviously communicate fixed length messages or first communicate the size of each message in some header. However, this can be subject to attacks: while of course it is going to be impossible to distort or determine the content of the communication, the above technique would make it easy to completely disrupt my communication just by adding a single byte in the middle.
Say that I need to transfer a message 1234567 bytes long. First of all, I communicate 4 bytes with an integer representing the size of the message. Okay. Then I start sending out the actual message. That message gets split in several packets, which get separately received. Now, an attacker just sends in an additional packet, faking it as if it was part of the conversation. It can just be one byte long: this completely destroys any synchronization mechanism I have implemented! The message has a spurious byte in the middle, and it doesn't successfully get decoded. Not only that, the last byte of the first message disrupts the alignment of the second message and so on: the connection is destroyed, and with a simple, simple attack! How likely and feasible is this attack anyway?
So I am wondering: what is the maximum data unit that can be transferred at once? I understand that one call to send does not necessarily correspond to one call to receive: the message can be split into several chunks. How can I group the packets together in some way so that I know they belong together? Is there a way to define a higher-level message that gets reconstructed and aligned as a whole and triggers a single call to a receive-like function? If not, what other solutions can I find to keep my communication re-alignable even in the presence of an attacker?
Basically, it is difficult to control the way the OS divides the stream into TCP packets (the RFC defining the TCP protocol states that the TCP stack should allow clients to force it to send buffered data using the push function, but it does not define how many packets this should generate; and in any case the attacker can modify any of them).
And these TCP packets can be divided even further into IP fragments on their way through the network (which can be prevented with the 'Don't Fragment' IP flag -- but that flag can cause your packets not to be delivered at all).
I think that your problem is not about introducing packets into a stream protocol, but about securing it.
IPSec could be very beneficial in your scenario, as it operates on the network layer.
It provides integrity for every packet sent, so any on-the-wire modification is detected and the invalid packets are dropped. In the case of TCP, the dropped packets are retransmitted automatically.
(Almost) everything is done automatically by the OS -- so you do not need to worry about it (and make mistakes doing so).
Confidentiality can be assured as well (with the same advantage of not re-inventing the wheel).
IPSec should give you a reliable transport on top of which you can use whatever framing format you like.
Another alternative is using SSL/TLS on top of the TCP session, which is less robust (as it closes the whole connection on an integrity error).
Now, an attacker just sends in an additional packet, faking it as if it was part of the conversation. It can just be one byte long: this completely destroys any synchronization mechanism I have implemented!
Thwarting such an injection problem is dealt with by securing the stream. Create an encrypted stream and send your packets through that.
Of course the encrypted stream itself then has this problem; its messages can be corrupted. But those messages have secure integrity checks. The problem is detected, and the connection can be torn down and re-established to resynchronize it.
Also, a fixed-length synchronizing/framing bit sequence can be used between messages: some specific bit pattern. It doesn't matter if that pattern occurs inside messages by accident, because we only look for it when things go wrong (a corrupt message is received); otherwise we simply skip over it. If a corrupt message is received, we read bytes until we see the synchronizing pattern and assume that whatever follows it is the start of a message (length followed by payload). If that fails, we repeat the process. Once we receive a correct message, we reply to the peer, which will retransmit anything we didn't get.
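As a rough sketch of that resynchronization step, assuming POSIX sockets and an arbitrary 4-byte marker (both the marker value and the function name are illustrative):

    #include <cstddef>
    #include <cstdint>
    #include <sys/socket.h>

    // Illustrative 4-byte framing marker inserted between messages.
    static const uint8_t SYNC[4] = {0xDE, 0xAD, 0xBE, 0xEF};

    // After a corrupt message, discard bytes until the full marker has been
    // seen; the next bytes should then be a length prefix followed by payload.
    // Sketch only: real code would bound the scan and handle EOF more carefully.
    bool resynchronize(int fd) {
        size_t matched = 0;
        while (matched < sizeof SYNC) {
            uint8_t b;
            ssize_t n = recv(fd, &b, 1, 0);
            if (n <= 0) return false;               // connection closed or error
            if (b == SYNC[matched]) ++matched;      // marker continues
            else matched = (b == SYNC[0]) ? 1 : 0;  // restart the partial match
        }
        return true;
    }

Reading one byte at a time is inefficient but keeps the sketch simple; a buffered scan works the same way.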
How likely and feasible is this attack anyway?
TCP connections are identified by four items: the source and destination IP, and source and destination port number. The attacker has to fake a packet which matches your stream in these four identifiers, and sneak that packet past all the routers and firewalls between that attacker and the receiving machine. The attacker also has to be in the right ballpark with regard to the TCP sequence number.
Basically, this is next to impossible for an attacker C to perpetrate against endpoints A and B which are both distant from C on the network. The fake source IP will be rejected long before C is able to reach its destination. It's more plausible as an inside job (which includes malware): C is close to A and B.

Does UDP allow repacketization?

I know that for TCP you can have for example Nagle's Algorithm enabled. However, can you have something similar for UDP?
Practical question (assume a UDP socket):
If I call send() twice in a short period of time with 1 byte of data in each send() call, is it possible that the transport layer decides to send only one UDP packet with 1 byte + 1 byte = 2 bytes of data?
Thanks in advance!
No. UDP datagrams are delivered intact exactly as sent, or not at all.
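A quick way to see this, assuming POSIX sockets on one machine (the port number is arbitrary and error checking is omitted for brevity): the two 1-byte send() calls from the question arrive as two separate 1-byte datagrams, never as one 2-byte datagram.

    #include <cstdio>
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main() {
        int rx = socket(AF_INET, SOCK_DGRAM, 0);
        int tx = socket(AF_INET, SOCK_DGRAM, 0);

        sockaddr_in addr{};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        addr.sin_port = htons(50000);                 // arbitrary test port
        bind(rx, reinterpret_cast<sockaddr*>(&addr), sizeof addr);
        connect(tx, reinterpret_cast<sockaddr*>(&addr), sizeof addr);

        send(tx, "A", 1, 0);                          // first 1-byte datagram
        send(tx, "B", 1, 0);                          // second 1-byte datagram

        char buf[64];
        for (int i = 0; i < 2; ++i) {
            ssize_t n = recv(rx, buf, sizeof buf, 0); // one datagram per recv()
            std::printf("datagram %d: %zd byte(s)\n", i + 1, n);
        }
        close(tx);
        close(rx);
        return 0;
    }

On a conforming stack this prints two lines, each reporting 1 byte; the two writes are never merged.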
Not according to the RFC (RFC 768). Beyond the facilities of IP itself, UDP really only provides, as extras, port-based routing and a little bit of extra detection of corruption or misrouting.
That means there's no facility to combine datagrams. In fact, since it's meant to be transaction oriented, I would say that combining two transactions into one may well be a bad idea in terms of keeping those transactions disparate.
Otherwise, you would need a layer above UDP which could figure out how to extract these transactions from a datagram. At the moment, that's not necessary since the datagram is the transaction.
As added support (though not, of course, definitive) for this contention, see the UDP wikipedia page:
Datagrams – Packets are sent individually and are checked for integrity only if they arrive. Packets have definite boundaries which are honored upon receipt, meaning a read operation at the receiver socket will yield an entire message as it was originally sent.
However, the best support for it comes from one of its clients. UDP was specially engineered for TFTP (among other things) and that protocol breaks down if you cannot distinguish a transaction.
Specifically, one of the TFTP transaction types is the data transaction which consists of an opcode, block number and up to 512 bytes of data. Without a length indication at the start or a sentinel value at the end, there is no way to work out where the next transaction would start unless there is a one-to-one mapping between transaction and datagram.
As an aside, the other four TFTP transaction types have either a fixed length or end-of-string sentinel values but the data transaction is the decider here.
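To illustrate why that one-to-one mapping matters, here is a sketch of parsing a TFTP DATA packet (RFC 1350): the data length is not carried anywhere in the packet, it is simply whatever remains of the datagram after the 4-byte header (the struct and function names are mine):

    #include <cstddef>
    #include <cstdint>

    struct TftpData {
        uint16_t       block;      // block number
        const uint8_t* data;       // 0..512 bytes of payload
        size_t         data_len;
    };

    // A DATA packet is: 2-byte opcode (3), 2-byte block number, then data.
    bool parse_tftp_data(const uint8_t* dgram, size_t dgram_len, TftpData* out) {
        if (dgram_len < 4) return false;                    // header incomplete
        uint16_t opcode = (uint16_t(dgram[0]) << 8) | dgram[1];
        if (opcode != 3) return false;                      // not a DATA packet
        out->block    = (uint16_t(dgram[2]) << 8) | dgram[3];
        out->data     = dgram + 4;
        out->data_len = dgram_len - 4;                      // inferred from the datagram size
        return out->data_len <= 512;
    }

If two DATA transactions were ever packed into one datagram, the "remaining bytes" calculation would silently swallow the second one, which is exactly the ambiguity the one-datagram-per-transaction rule avoids.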

How to split datagrams received together with boost::asio UDP sockets

I've made my UDP server and client with boost::asio UDP sockets. Everything looked fine until I started sending more datagrams. They arrive correctly from the client at the server, but they are combined in my buffer into one message.
I use
udp::socket::async_receive with a std::array<char, 1 << 18> buffer
to make the async request, and I receive the data through the callback
void on_receive(const error_code& code, size_t bytes_transferred)
If I send data too often (every 10 milliseconds), I receive several datagrams at once into my buffer via the callback above. The question is: how do I separate them? Note: my UDP datagrams have variable length. I don't want to use an additional size header, because that would make my code useless for third-party datagrams.
I believe this is a limitation in the way boost::asio handles stateless data streams. I noticed exactly the same behavior when using boost::asio for a serial interface. When I was sending packets with relatively large gaps between them I was receiving each one in a separate callback. As the packet size grew and the gap between the packets therefore decreased, it reached a stage when it would execute the callback only when the buffer was full, not after receipt of a single packet.
If you know exactly the size of the expected datagrams, then your solution of limiting the input buffer size is a perfectly sensible one, as you know a priori exactly how large the buffer needs to be.
If your congestion comes from having multiple different packet types being transmitted, so you can't pre-allocate a buffer of the correct size, then you could potentially create different sockets on different ports for each type of transaction. It's a little more "hacky", but given the virtually unlimited availability of ephemeral ports, as long as you're not using 20,000 different packet types that would probably help you out as well.
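If it helps, here is a rough boost::asio sketch of the first suggestion: a buffer bounded to the largest expected datagram, with the asynchronous receive re-armed after every completion. The class, MAX_DGRAM value and port handling are illustrative assumptions, not a drop-in fix.

    #include <array>
    #include <cstddef>
    #include <boost/asio.hpp>

    using boost::asio::ip::udp;

    class Receiver {
    public:
        Receiver(boost::asio::io_context& io, unsigned short port)
            : socket_(io, udp::endpoint(udp::v4(), port)) { start_receive(); }

    private:
        static constexpr std::size_t MAX_DGRAM = 1500;      // assumed upper bound

        void start_receive() {
            socket_.async_receive_from(
                boost::asio::buffer(buffer_), sender_,
                [this](const boost::system::error_code& ec, std::size_t n) {
                    if (!ec) handle_datagram(buffer_.data(), n);
                    start_receive();                         // re-arm for the next completion
                });
        }

        void handle_datagram(const char* /*data*/, std::size_t /*len*/) {
            // process one buffer's worth of data here
        }

        udp::socket socket_;
        udp::endpoint sender_;
        std::array<char, MAX_DGRAM> buffer_;
    };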

Sending And Receiving Sockets (TCP/IP)

I know that it is possible for multiple packets to be stacked in the buffer waiting to be read, and that a long packet might require a loop of multiple send attempts to be fully sent. But I have some questions about packaging in these cases:
If I call recv (or any alternative (low-level) function) when there are multiple packets waiting to be read, will it return them all stacked into my buffer, or only one of them (or part of the first one if my buffer is too small)?
If I send a long packet which requires multiple iterations to be sent fully, does it count as a single packet or as multiple packets? Essentially, is the data marked somehow to indicate that the packet sent is not yet complete?
These questions came to my mind when I thought about how WebSockets does its packaging. Special characters are used to mark the beginning and end of a packet, which sort of leads to the conclusion that it's otherwise not possible to separate multiple packets.
P.S. All the questions are about TCP/IP, but you are welcome to share information (answers) about UDP as well.
TCP sockets are stream based. The order is guaranteed but the number of bytes you receive with each recv/read could be any chunk of the pending bytes from the sender. You can layer a message based transport on top of TCP by adding framing information to indicate the way that the payload should be chunked into messages. This is what WebSockets does. Each WebSocket message/frame starts with at least 2 bytes of header information which contains the length of the payload to follow. This allows the receiver to wait for and re-assemble complete messages.
For example, in libraries/interfaces that implement the standard WebSocket API or something similar (such as a browser), the onmessage event will fire once for each message received, and the data attribute of the event will contain the entire message.
Note that in the older Hixie version of WebSockets, each frame was started with '\x00' and terminated with '\xff'. The current standardized IETF 6455 (HyBi) version of the protocol uses the header information that contains the length which allows much easier processing of the frames (but note that both the old and new are still message based and have basically the same API).
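As a hedged sketch of reading that length information from an RFC 6455 base header (the function name and out-parameters are mine; the 4-byte masking key that follows client-to-server frames is noted but not handled):

    #include <cstddef>
    #include <cstdint>

    // Decode the RFC 6455 base header: byte 0 = FIN/RSV/opcode, byte 1 = MASK
    // bit plus a 7-bit length, where 126/127 mean the real length follows in
    // the next 2 or 8 bytes. Returns header bytes consumed, or 0 if more bytes
    // are needed before the header can be decoded.
    size_t parse_ws_header(const uint8_t* buf, size_t len,
                           bool* fin, uint8_t* opcode,
                           bool* masked, uint64_t* payload_len) {
        if (len < 2) return 0;
        *fin    = (buf[0] & 0x80) != 0;
        *opcode =  buf[0] & 0x0F;
        *masked = (buf[1] & 0x80) != 0;
        uint8_t len7 = buf[1] & 0x7F;

        size_t hdr = 2;
        if (len7 < 126) {
            *payload_len = len7;
        } else if (len7 == 126) {                          // 16-bit extended length
            if (len < 4) return 0;
            *payload_len = (uint64_t(buf[2]) << 8) | buf[3];
            hdr = 4;
        } else {                                           // 64-bit extended length
            if (len < 10) return 0;
            *payload_len = 0;
            for (int i = 0; i < 8; ++i)
                *payload_len = (*payload_len << 8) | buf[2 + i];
            hdr = 10;
        }
        // If *masked is set, a 4-byte masking key follows before the payload.
        return hdr;
    }

The receiver then buffers bytes until payload_len bytes (plus the masking key, if present) are available, which is how complete messages get re-assembled from the stream.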
A TCP connection provides a stream of bytes, so treat it as such. No application message boundaries are preserved -- one send can correspond to multiple receives and the other way around. You need loops on both sides.
UDP, on the other hand, is datagram (i.e. message) based. Here one read will always dequeue a single datagram (unless you mess with low-level flags on the socket). Even if your application buffer is smaller than the pending datagram and you read only a part of it, the rest is lost. The way around this is to limit the size of the datagrams you send to something below the normal MTU of 1500 (less the IP and UDP headers, so actually 1472).
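For the "loops on both sides" point on the TCP side, a minimal send-loop sketch, assuming POSIX sockets (the receive side needs the matching recv loop, like the one sketched earlier):

    #include <cstddef>
    #include <sys/socket.h>

    // A single send() on a TCP socket may accept fewer bytes than requested,
    // so keep sending until the whole application message has been queued.
    bool send_all(int fd, const void* data, size_t len) {
        const char* p = static_cast<const char*>(data);
        while (len > 0) {
            ssize_t n = send(fd, p, len, 0);
            if (n <= 0) return false;      // error; real code would inspect errno
            p   += n;
            len -= static_cast<size_t>(n);
        }
        return true;
    }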