Where to get number of bytes received in TCP payload - sockets

I'm working on a TCP/IP stack for a networks course, and am currently working on implementing a rudimentary TCP system. With IP and UDP you get fields in the header that allow you to know the number of bytes in the payload, but as far as I can see, there is nothing like this in TCP. The client will send a value in the "window size" field during the opening handshake to indicate the maximum number of bytes it will accept, but there is no field that actually says how many bytes of data come after the header in a given packet.
Sequence and acknowledgment numbers are used to give the offset of bytes being sent and the next expected offset, but they cannot tell you how many bytes have come in the current packet.
I'm assuming I can just pass the length from the IP header down to the TCP handler, but I'd just like to verify that that's the proper way to do it.

The length of the TCP payload is given by the length field of the IP packet less the IP header (8 bytes) less the TCP header (given by the TCP header length field times 4).

Related

Demultiplexing udp streams from different sources

My server is using a single udp socket to receive udp streams from different ip addresses. (All senders send to the same port).
When recv returns on the server with a chunk of data, might that chunk contain bytes from different sources?
Assuming not, is there a reliable way to determine which sender sent that entire chunk?
In UDP, each chunk received will be exactly what a sender previously passed to ‘send()’ or ‘sendto()’ — unlike TCP, UDP maintains message boundaries.
You can find out the IP address and port the received packet was sent from by calling ‘recvfrom()’ instead of ‘recv()’. Those values will be written into the ‘struct inaddr_in’ that you provide a pointer to.

socket programming, what happen if I write data more than one TCP/UDP packet can carry?

I have a question about socket programming. When I use socket to send the data, we can use the API such as sendto() to send using TCP or UDP.
For sendto(), we give a array pointer and the byte number we want to send.
In this case, if I gave a large byte number (e.g.: 20000 bytes), based on my understanding, MTU of the network will not be that big so socket actually send mutiple packets instead of one big packet. Since these 20000 bytes are split into several UDP/TCP packets, but will these 20000 bytes be seen as one packet at beginning? Is this process UDP/TCP fragmentation ?
My another question is if I put the data size smaller than MTU into sendto(), then I can gurantee call sendto() once, socket only sends one TCP/UDP packet?
Thanks in advance.
will these 20000 bytes be seen as one packet at beginning? Is this process UDP/TCP fragmentation?
UDP will send it as one datagram if your socket send buffer is large enough to hold it. Otherwise you will get EMSGSIZE. It may subsequently get fragmented at the IP layer, and if a fragment gets lost so does the whole datagram, but if all the fragments arrive the entire datagram will be received intact.
TCP wil send it all, segmenting and fragmenting it however it sees fit. It will all arrive, intact and in order, unless there is a long enough network outage.
My another question is if I put the data size smaller than MTU into sendto(), then I can guarantee call sendto() once, socket only sends one TCP/UDP packet?
UDP: yes.
TCP: no.

Can UDP packets be partially sent like TCP ones?

I've created some type of client/server application that has its own data ACK system. It was originally written in TCP because of some limitations, but the base was written thinking about UDP.
The packets that I sent to the server had their own encapsulation (packet id and packet size headers. I know that UDP has also a checksum so I didn't add a header for that), but how TCP works, I know that the server may not receive the entire packet, so I gathered and buffered the data received until a full valid packet was received.
Now I have the chance to change my client/server program to UDP, and I know that one difference with TCP is that data is not received in the same order as sent (which is why I added a packet id header).
The thing that I want to know is: If I send multiple packets, will they be received with no guaranteed order but with guaranteed encapsulation? I mean, if I send a packet sized 1000 bytes of data and another packet sized 400 bytes of data later, will the server receive 2 packets, one of 1000 bytes and another one of 400 bytes, or is there a chance to receive 200 of that 1000 bytes, then 400 bytes of that 1000 bytes and later the rest of the bytes like TCP does?
UDP is a datagram service. Datagrams may be split for transport, but they will be reassembled before being passed up to the application layer.
With small packet sizes you should have no concern that packets will be broken into multiple packets. That generally is only an issue when the packet gets over an Ethernet network.
You ask" will the server receive 2 packets, one of 1000 bytes and another one of 400 bytes, or there's a chance to receive 200 of that 1000 bytes, then 400 bytes of that 1000 bytes and later the rest of the bytes like TCP can do?
With a packet size of under 1492 bytes there is not going to be any partial packets.
UPDATE:
Apparently I see a need to clarify why I say UDP packet lengths 1492 bytes or less will not affect transport robustness.
The maximum UDP length as implicitly specified in RFC 768 is 65535 including the 8 byte Header. Max Payload Frame Length is 65527 bytes.
While this should not be disputed number the UDP data length is often reported incorrectly. This is exemplified in a previous post:
What is the largest Safe UDP Packet Size on the Internet
A data packet is not constrained by the MTU of the underlying network ToS or communications protocol's Frame length (e.g. IP and Ethernet respectively). Discrepancies between MTU and Protocol Lengths are remedied by Fragmentation and Reassembly
At the Transport Layer each network Type of Service (ToS) has a specific Maximum Transmission Unit (MTU). UDP is encapsulated within IP Packets and IP Packets are encapsulated by the transporting Network's ToS. IP packets are often transmitted through networks of various ToS which include Ethernet, PPP, HDLC, and ADCCP.
When the MTU for a receiving Network ToS is less than the sending ToS then the receiving network must Fragment the received packet. When the Network sends a packet to a network with a higher MTU, the receiving Network must reassemble any Fragmented packets.
Ethernet is the defacto mainstream protocol with the lowest MTU. Non-Mainsteam Arcnet the MTU is 507 bytes. The practical lowest MTU is Ethernet's 1500 bytes, minus the overhead makes maximum payload length 1492 bytes.
If the UDP packet has more than 1492 bytes the data packet will likely be Fragmented and Reassembled. The Fragment and Reassembly adds complexity to the already complex process coupling UDP and IP, and therefore should be avoided.
Because UDP is a non-guaranteed datagram delivery protocol it boosts the transport performance. Robustness is left to the originating and terminating Application. RFC 1166 sets the standards for the communication protocol link layer, IP layer, and transport layer, the UDP Application is responsible for packetization, reassembly, and flow control.
The maximum UDP packet size can also be lowered by a Communication Host's Application Layer. The packet length is a balance between performance and robustness.
The Communications Host's Application Layer may set a maximum UDP packet size. The typical UDP max data length at the Application layer will use the maximum allowed by the IP protocol or the Host Data Link Layer, typically Ethernet.
It is the Application's programer that chooses to use the Host Application Layer or the Host Data Link layer. The Host Application Layer will detect UDP packet errors and discard the packet if necessary. When the application communicates directly with the Host Data Link, the application then has the responsibility of detecting packet errors.
Using maximum UDP data packet length of Ethernet's max payload length of 1492 bytes will eliminate the issues of Fragmentation and Delivery Order of multiple Frames.
That is why I said packet length is not a Fragmentation issue with packet lengths of 1000 and 400 bytes.
###
I do not know what you mean by "guaranteed encapsulation", it makes no sense to me.
With IP there is no guarantee of packet delivery of the order whether UDP or TCP.
As long as you control both sides of the conversation, you can work out your own protocol within the data packet to handle ordering and post packets. Reserve the first x bytes of the packet for a sequential order number and total number of packets. (e.g. 1 of 3, 2 of 2, 3 of 3). If the client side is missing a packet then the client must send a request for retransmission. You need to determine to what level you are going to go for data integrity. like maybe the re-transmission packet is lost.
That may be what you meant by "guaranteed encapsulation", Where there is other information within your datagram packet to ensure some integrity. You should add your own CRC for the total data being sent if broken into multiple datagrams. the checksum is not very robust and is only for the one packet.
UDP is much faster then TCP but TCP has flow control and guaranteed delivery.
UDP is good for streaming content like voice where a lost packet is not going to matter.
Network reliability has improved a lot since the days when these issues were a major concern.

Will TCP connection lose packets?

Say Server S have a successful TCP connection with Client C.
C is keep sending 256-byte-long packets to S.
Is it possible that one of packets only receive part of it, but the connection does not break (Can continue receive new packets correctly)?
I thought TCP protocol itself will guarantee not lose any bytes while connecting. But seems not?
P.S. I'm using Python's socketserver library.
The TCP protocol does guarantee delivery. Thus (assuming there are no bugs in your code and in the TCP stack), the scenario you describe is impossible.
Do bear in mind that TCP is stream- rather than packet-oriented. This means that you may need to call recv() multiple times to read the entire 256-byte packet.
As #NPE said, TCP is a stream oriented protocol, that means that there is no guarantee on how much data bytes are sent in each TCP packet nor how many bytes are available for reading in the receiving socket. What TCP ensures is that the receiving socket will be provided with the data bytes in the same order that they were sent.
Consider a communication through a TCP connection socket between two hosts A and B.
When the application in A requests to send 256 bytes, for example, the A's TCP stack can send them in one, or several individual packets, or even wait before sending them. So, B may receive one or several packets with all or part of the bytes requested to be sent by A, and so, when the application in B is notified of the availability of received bytes, it's not sure that it could read at once the 256 bytes.
The only guaranteed thing is that the bytes B reads are in the same order that A sent them.

Does TCP keepalive refresh the timeout on a NAT?

I've read that NAT routers "assume a connection has been terminated if no data has been sent for a certain time period."
I've also read that TCP keepalive packets usually shouldn't contain any data.
So my questions are:
Are the above statements true?
Do NAT routers consider empty TCP keepalive packets when reordering/cleaning their tables?
I'm asking this because I need a reliable connection between two endpoints where both of them have to be able to detect and react to connection problems. I know that I might just implement a keepalive mechanism myself but I want to know whether the TCP implementation could be used for that.
I do believe the second statement refers to payload (The shortest possible TCP/IP packet is 40 bytes long - 20 bytes for TCP header + 20 bytes for IPv4 header).
Regarding the first, here's a quote from RFC 2663:
End of session for TCP, UDP and others
The end of a TCP session is detected when FIN is acknowledged by
both halves of the session or when either half receives a segment with
the RST bit in TCP flags field. However, because it is impossible for
a NAT device to know whether the packets it sees will actually be
delivered to the destination [...] the NAT device cannot safely assume
that the segments containing FINs or SYNs will be the last packets of
the session [...] Consequently, a session can be assumed to have been
terminated only after a period of 4 minutes subsequent to this
detection. The need for this extended wait period is described in RFC
793 [Ref 7], which suggests a TIME-WAIT duration of 2 * MSL (Maximum
Segment Lifetime) or 4 minutes.
Reference: https://www.rfc-editor.org/rfc/rfc2663
To my understanding, any packets that identifies a session would reset the TTL counter - but that depends heavily on implementation, since 'data' can be understood as 'packet' (minimum 40 bytes) or 'packet payload'. Nonetheless, #CodeCaster is spot-on; never assume that a connection is alive, make sure it is before sending (and, if possible and depending on criticality, acknowledge receipt.)