Is it possible to send variable-size segments with an unacknowledged sequence number in TCP? - sockets

Just assume we are sending a packet with TCP and we find that the packet has been dropped by the network. After the timer expires we try to resend the packet. In the meanwhile we get a new segment from the application layer, and we now try to send both segments, the older one and the new one, with the sequence number that has not been acknowledged. Now the packet size is greater than the older size. Just assume the older packet was delivered successfully but its acknowledgement was lost.
Let me explain this in steps:
1. (from sender) packet[SEQ=100,SEG_LEN=3,SEG="ABC"]----(To receiver)--->Receiver got it
2. (from receiver) packet[ACK=103]-----(To sender)---->Packet lost (Sender couldn't receive it)
3. We get a new segment from the application, SEG="XYZ", and the timer expires for the previous packet
4. (from sender) packet[SEQ=100,SEG_LEN=6,SEG="ABCXYZ"]----(To receiver)--->Receiver got it
So now I want to know what will happen at the receiver side:
Will it drop the packet, assuming it is a duplicate? or
Will it accept the extra ("XYZ") part or the total ("ABCXYZ") segment?

After the timer expires we try to resend the packet.
Not necessarily, see below.
In the meanwhile we get a new segment from the application layer, and we now try to send both segments, the older one and the new one, with the sequence number that has not been acknowledged.
Not necessarily. It is entirely possible that both segments have now been coalesced and that you're only trying to send one segment at this point.
Now the packet size is greater than the older size.
The segment size may be larger than the old size, if what I said above is true. If it isn't, the sizes of the two packets you originally talked about haven't changed at all.
Just assume the older packet was delivered successfully but its acknowledgement was lost. So now I want to know what will happen at the receiver side
Will it drop the packet, assuming it is a duplicate?
It will drop the original packet if it was resent according to your postulate. If it was coalesced according to my postulate, it will accept the new part of the packet, from the currently accepted point onwards.
Will it accept the extra segment or the total segment?
This question is too confused to answer. You need to make up your mind whether you're talking about two packets or a single coalesced packet. You're talking about both at the same time.
So now I want to know what will happen at the receiver side:
Will it drop the packet, assuming it is a duplicate?
No.
Will it accept the extra ("XYZ") part?
Yes.
Or the total ("ABCXYZ") segment?
No.
You need to think about this. TCP/IP works. Seriously. For 25 years at least. If any of the alternative scenarios you've posted were true it wouldn't be usable.
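To make the accepted behaviour concrete, here is a toy sketch (entirely hypothetical, nothing like real kernel code, and handling only the in-order case) of how a receiver trims the already-received prefix of an overlapping segment. rcv_nxt is the next byte the receiver expects:

#include <cstdint>
#include <iostream>
#include <string>

struct Segment { uint32_t seq; std::string data; };

// In-order case only: drop what was already received, keep the rest.
std::string accept(uint32_t& rcv_nxt, const Segment& s) {
    uint32_t end = s.seq + static_cast<uint32_t>(s.data.size());
    if (end <= rcv_nxt) return "";                    // pure duplicate: drop it
    uint32_t skip = (s.seq < rcv_nxt) ? rcv_nxt - s.seq : 0;
    std::string fresh = s.data.substr(skip);          // only the new bytes
    rcv_nxt = end;
    return fresh;
}

int main() {
    uint32_t rcv_nxt = 103;                           // "ABC" (bytes 100-102) already received
    Segment coalesced{100, "ABCXYZ"};                 // the coalesced retransmission
    std::cout << accept(rcv_nxt, coalesced) << "\n";  // prints "XYZ"
}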

Related

Packets lost in Zmq_Push and Zmq_Pull connection

I am working on a project wherein I have multiple push sockets (created dynamically) connected to a single pull socket. Each push socket sends a unique kind of data to the pull socket, the data size for each message is less than 1000 bytes, and each push socket sends up to 20,000 to 30,000 such messages. During peak load the number of push sockets can scale up to 100. So here there are 100 push sockets connecting to one PULL receiver, each push socket sending 20-30K messages of almost 1000 bytes each.
Now, on the PULL side I process each of these unique packets and put it into the respective queue. For example: if I have 50 PUSH sockets sending in data, then on my receiver side there will be 50 queues processing the data. This is required as each queue holds a unique set of data which needs to be processed uniquely. I cannot have multiple PULL receivers here, as the data would then be routed in a round-robin fashion and one set of unique data might end up in another queue, which I don't want.
Here, the problem I am facing is that sometimes I don't receive all the data sent by the PUSH sockets. I believe there is some data loss happening. In order to control this I have set up the HWM on both sides. I tried the below combinations:
1) SNDHWM set to 100 and RCVHWM set to 100
2) SNDHWM set to 10 and RCVHWM set to 10
3) SNDHWM set to 10 and RCVHWM set to 100
Apart from the above configs I have also set the context threads on the sender side to 3 and on the receiver side to 1. On the sender side there is only one context, from which multiple push sockets are created dynamically as and when data arrives.
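For reference, this is roughly how one of the senders would be set up with the libzmq C API (the endpoint is made up, and the values shown are combination 1 above). One thing worth noting: a PUSH socket that hits its send HWM blocks in zmq_send by default rather than dropping, so loss is often caused elsewhere, e.g. tearing the context down before queues have drained (see ZMQ_LINGER):

#include <zmq.h>

int main() {
    void* ctx = zmq_ctx_new();
    zmq_ctx_set(ctx, ZMQ_IO_THREADS, 3);                // sender side uses 3 I/O threads
    void* push = zmq_socket(ctx, ZMQ_PUSH);
    int hwm = 100;
    zmq_setsockopt(push, ZMQ_SNDHWM, &hwm, sizeof hwm); // must be set before connect
    zmq_connect(push, "tcp://receiver-host:5555");      // hypothetical endpoint
    zmq_send(push, "payload", 7, 0);                    // blocks (not drops) once the HWM is hit
    zmq_close(push);                                    // default linger waits for the queue to drain
    zmq_ctx_term(ctx);
}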
But in all these combinations I still find that some packets are missing.
Can someone guide me as to which parameters I need to configure in ZMQ so that I don't end up with packet loss and I receive all the packets on my receiver side? Also, I suspect my receiver-side processing might be slow, which could be contributing to this.
Please pen your thoughts here and guide me if possible.

Message delimitation in TCP communication

I am a newbie to networks and in particular TCP (I have been fooling around a bit with UDP, but that's it).
I am developing a simple protocol based on exchanging messages between two endpoints. Those messages need to be certified, so I implemented a cryptographic layer that takes care of that. However, while UDP has a sound definition of a packet that constitutes the minimum unit that can get transferred at a time, the TCP protocol (as far as my understanding goes) is completely stream oriented.
Now, this puzzles me a bit. When exchanging messages, how can I tell where one starts and the other ends? In principle, I can obviously communicate fixed-length messages, or first communicate the size of each message in some header. However, this can be subject to attacks: while of course it is going to be impossible to distort or determine the content of the communication, the above technique would make it easy to completely disrupt my communication just by adding a single byte in the middle.
Say that I need to transfer a message 1234567 bytes long. First of all, I communicate 4 bytes with an integer representing the size of the message. Okay. Then I start sending out the actual message. That message gets split in several packets, which get separately received. Now, an attacker just sends in an additional packet, faking it as if it was part of the conversation. It can just be one byte long: this completely destroys any synchronization mechanism I have implemented! The message has a spurious byte in the middle, and it doesn't successfully get decoded. Not only that, the last byte of the first message disrupts the alignment of the second message and so on: the connection is destroyed, and with a simple, simple attack! How likely and feasible is this attack anyway?
So I am wondering: what is the maximum data unit that can be transferred at once? I understand that a call to send does not correspond to a single call to receive: the message can be split into different chunks. How can I group the packets together in some way so that I know that they get packed together? Is there a way to define a higher-level message that gets reconstructed and aligned all together and triggers a single call to a receive-like function? If not, what other solutions can I find to keep my communication re-alignable even in the presence of an attacker?
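For concreteness, the length-prefix scheme described above might be sketched like this with plain POSIX sockets (send_all and send_message are made-up helper names):

#include <sys/socket.h>
#include <arpa/inet.h>
#include <cstdint>

// Loop until the whole buffer has been handed to the kernel.
bool send_all(int fd, const void* p, size_t len) {
    const char* c = static_cast<const char*>(p);
    while (len > 0) {
        ssize_t n = send(fd, c, len, 0);
        if (n <= 0) return false;          // error or connection closed
        c += n;
        len -= static_cast<size_t>(n);
    }
    return true;
}

// A 4-byte big-endian length header, then the payload itself.
bool send_message(int fd, const void* msg, uint32_t len) {
    uint32_t hdr = htonl(len);
    return send_all(fd, &hdr, sizeof hdr) && send_all(fd, msg, len);
}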
Basically, it is difficult to control the way the OS divides the stream into TCP packets. (The RFC defining the TCP protocol states that the TCP stack should allow the client to force it to send buffered data by using the push function, but it does not define how many packets this should generate. After all, the attacker can modify any of them.)
And these TCP packets can get divided even further into IP fragments on their way through the network (which can be opted out of with the 'Don't fragment' IP flag -- but that flag can cause your packets not to be delivered at all).
I think that your problem is not about introducing packets into a stream protocol, but about securing it.
IPSec could be very beneficial in your scenario, as it operates on the network layer.
It provides integrity for every packet sent, so any modification on the wire gets detected and the invalid packets are dropped. In the case of TCP, the dropped packets get retransmitted automatically.
(Almost) everything is done automatically by the OS -- so you do not need to worry about it (and make mistakes doing so).
The confidentiality can be assured as well (with the same advantage of not re-inventing the wheel).
IPSec should provide you with a reliable transport protocol on top of which you can use whatever framing format you like.
Another alternative is using SSL/TLS on top of the TCP session, which is less robust (as it closes the whole connection on an integrity error).
Now, an attacker just sends in an additional packet, faking it as if it was part of the conversation. It can just be one byte long: this completely destroys any synchronization mechanism I have implemented!
Thwarting such an injection problem is dealt with by securing the stream. Create an encrypted stream and send your packets through that.
Of course the encrypted stream itself then has this problem; its messages can be corrupted. But those messages have secure integrity checks. The problem is detected, and the connection can be torn down and re-established to resynchronize it.
Also, some fixed-length synchronizing/framing bit sequence can be used between messages: some specific bit pattern. It doesn't matter if that pattern occurs inside messages by accident, because we only ever specifically look for that pattern when things go wrong (a corrupt message is received), otherwise we skip that sequence. If a corrupt message is received, we then receive bytes until we see the synchronizing pattern, and assume that whatever follows it is the start of a message (length followed by payload). If that fails, we repeat the process. When we receive a correct message, we reply to the peer, which will re-transmit anything we didn't get.
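A rough sketch of that resynchronization scan (the sync pattern and the helper name are made up; a real protocol would pick its own):

#include <cstdint>
#include <cstring>
#include <vector>

static const uint8_t SYNC[4] = {0xDE, 0xAD, 0xBE, 0xEF};   // made-up framing pattern

// Returns the index just past the first sync pattern, or -1 if none found.
long find_sync(const std::vector<uint8_t>& buf) {
    for (size_t i = 0; i + sizeof SYNC <= buf.size(); ++i)
        if (std::memcmp(&buf[i], SYNC, sizeof SYNC) == 0)
            return static_cast<long>(i + sizeof SYNC);
    return -1;
}

// After a corrupt message, drop everything up to the sync point and
// resume parsing "length + payload" from there:
//   long p = find_sync(rxbuf);
//   if (p >= 0) rxbuf.erase(rxbuf.begin(), rxbuf.begin() + p);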
How likely and feasible is this attack anyway?
TCP connections are identified by four items: the source and destination IP, and source and destination port number. The attacker has to fake a packet which matches your stream in these four identifiers, and sneak that packet past all the routers and firewalls between that attacker and the receiving machine. The attacker also has to be in the right ballpark with regard to the TCP sequence number.
Basically, this is next to impossible for an attacker C to perpetrate against endpoints A and B which are both distant from C on the network. The fake packet will be rejected long before it is able to reach its destination. It's more plausible as an inside job (which includes malware): C is close to A or B.

TCP Socket Read Variable Length Data w/o Framing or Size Indicators

I am currently writing code to transfer data to a remote vendor. The transfer will take place over a TCP socket. The problem I have is that the data is variable length and there are no framing or size markers. Sending the data is no problem, but I am unsure of the best way to handle the returned data.
The data is comprised of distinct "messages" but they do not have a fixed size. Each message has an 8 or 16 byte bitmap that indicates what components are included in this message. Some components are fixed length and some are variable. Each variable length component has a size prefix for that portion of the overall message.
When I first open the socket I will send over messages and each one should receive a response. When I begin reading data I should be at the start of a message. I will need to interpret the bitmap to know what message fields are included. As the data arrives I will have to validate that each field indicated by the bitmap is present and of the correct size.
Once I have read all of the first message, the next one starts. My concern is: if the transmission gets cut partway through a message, how can I recover and correctly find the start of the next message?
I will have to simulate a connection failure, and my code needs to automatically retry a set number of times before canceling that message.
I have no control over the code on the remote end and cannot get framing bytes or size prefixes added to the messages.
Best practices, design patterns, or ideas on the best way to handle this are all welcomed.
From a user's point of view, TCP is a stream of data, just like you might receive over a serial port. There are no packets and no markers.
A non-blocking read/recv call will return whatever has currently arrived, at which point you can parse it. If, while parsing, you run out of data before reaching the end of the message, read/recv more data and continue parsing. Rinse. Repeat. Note that you could get more bytes than needed for a specific message if another has followed on its heels.
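As a sketch of that loop (POSIX sockets, blocking recv for simplicity; parse_one is a placeholder for your bitmap-driven parser, returning how many bytes one complete message consumed, or 0 if more data is needed):

#include <sys/socket.h>
#include <unistd.h>
#include <cstdint>
#include <vector>

void read_messages(int fd, size_t (*parse_one)(const std::vector<uint8_t>&)) {
    std::vector<uint8_t> buf;
    uint8_t chunk[4096];
    for (;;) {
        ssize_t n = recv(fd, chunk, sizeof chunk, 0);
        if (n <= 0) break;                         // connection closed or error
        buf.insert(buf.end(), chunk, chunk + n);   // may now hold part of one message, or several
        size_t used;
        while ((used = parse_one(buf)) > 0)        // consume every complete message
            buf.erase(buf.begin(), buf.begin() + used);
    }
}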
A TCP stream will not lose or re-order bytes. A message will not get truncated unless the connection gets broken or the sender has a bug (e.g. was only able to write/send part and then never tried to write/send the rest). You cannot continue a TCP stream that is broken. You can only open a new one and start fresh.
A TCP stream cannot be "cut" mid-message and then resumed.
If there is a short enough break in transmission then the O/S at each end will cope, and packets will be retransmitted as necessary, but that is invisible to the end-user application - as far as it's concerned, the stream is contiguous.
If the TCP connection does drop completely, both ends will have to re-open the connection. At that point, the transmitting system ought to start over at a new message boundary.
For something like this you would probably have a much easier time using a networking framework (like Netty), or a different IO mechanism entirely, like Iteratee IO with Play 2.0.

How does packet interaction with TCP Selective Acknowledgement work?

Can anybody explain how the packet interaction with TCP Selective Acknowledgment works?
I found the definition on Wikipedia, but I cannot get a clear picture of what Selective Acknowledgment really does compared to Positive Acknowledgment and Negative Acknowledgment.
TCP breaks the information it sends into segments... essentially segments are chunks of data no larger than the current value of the TCP MSS (maximum segment size) received from the other end. Those chunks have incrementing sequence numbers (based on the total count of data bytes sent in the TCP session) that allow TCP to know when something got lost in flight; the first TCP sequence number is chosen at random, and for security purposes it should not be predictable. Since the receive window is normally much larger than the MSS, the other end can send multiple segments to you before you ACK.
It is helpful to think of these things in the time sequence they got standardized...
First came Positive Acknowledgement, which is the mechanism of telling the sender you got the data; the sequence number you ACK with is the highest contiguous byte sequence received across the TCP chunks (a.k.a. segments) sent so far.
I will demonstrate below, but in my examples you will see small TCP segment numbers like 1,2,3,4,5... in reality these byte-sequence numbers will be large, incrementing, and have gaps between them (but that's normal... TCP typically sends data in chunks at least 500 bytes long).
So, let's suppose the sender xmits segment numbers 1,2,3,4,5 before you send your first ACK. If all goes well, you send an ACK for 1,2,3,4,5 and life is good. If 2 gets lost, everything is on hold till the sender realizes that 2 has never been ACK'd; he knows because you send duplicate ACKs for 1. Upon the proper timeout, the sender xmits 2,3,4,5 again.
Then Selective Acknowledgement was proposed as a way to make this more efficient. In the same example, you ACK 1 and SACK segments 3 through 5 along with it (if you use a sniffer, you'll see something like "ACK:1, SACK:3-5" in the ACK packets from you). This way, the sender knows it only has to retransmit TCP segment 2... so life is better. Also, note that a SACK defines the edges of a contiguous run of data you have received; multiple non-contiguous runs can be SACK'd at the same time.
Negative Acknowledgement is the mechanism of telling the sender only about missing data. If you don't tell them something is missing, they keep sending the data 'till you cry uncle.
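To make the comparison concrete, here is a toy sketch (not a TCP implementation; the byte ranges are made up) that computes the cumulative ACK and the SACK blocks a receiver would report after bytes 500-999 were lost:

#include <algorithm>
#include <cstdint>
#include <iostream>
#include <map>

int main() {
    // Received segments as [start, end) byte ranges; bytes 500-999 were lost.
    std::map<uint32_t, uint32_t> rcvd = {{0, 500}, {1000, 1500}, {1500, 2500}};

    // Cumulative (positive) ACK: the highest in-order byte received.
    uint32_t ack = 0;
    auto it = rcvd.begin();
    while (it != rcvd.end() && it->first <= ack) { ack = std::max(ack, it->second); ++it; }
    std::cout << "ACK=" << ack << "\n";                  // prints ACK=500

    // SACK blocks: the edges of each contiguous out-of-order run.
    while (it != rcvd.end()) {
        uint32_t lo = it->first, hi = it->second;
        ++it;
        while (it != rcvd.end() && it->first <= hi) { hi = std::max(hi, it->second); ++it; }
        std::cout << "SACK " << lo << "-" << hi << "\n"; // prints SACK 1000-2500
    }
}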

Boost Asio UDP retrieve last packet in socket buffer

I have been messing around with Boost Asio for some days now, but I am stuck on this weird behavior. Please let me explain.
Computer A is sending continuous UDP packets every 500 ms to computer B. Computer B wants to read A's packets at its own pace, but only wants A's last packet, obviously the most up-to-date one.
It has come to my attention that when I do a:
mSocket.receive_from(boost::asio::buffer(mBuffer), mEndPoint);
I can get OLD packets that were not processed (almost every time).
Does this make any sense? A friend of mine told me that sockets maintain a buffer of packets, and therefore if I read at a lower frequency than the sender this could happen. !?
So, the first question is: how is it possible to receive the last packet and discard the ones I missed?
Later I tried using the async example of the Boost documentation but found it did not do what I wanted.
http://www.boost.org/doc/libs/1_36_0/doc/html/boost_asio/tutorial/tutdaytime6.html
From what I could tell, async_receive_from should call the method "handle_receive" when a packet arrives, and that works for the first packet after the service is "run".
If I want to keep listening on the port, I should call async_receive_from again in the handler code, right?
BUT what I found is that I end up in an infinite loop: it doesn't wait for the next packet, it just enters "handle_receive" again and again.
I'm not writing a server application, and a lot of things are going on (it's a game), so my second question is: do I have to use threads to use the async receive method properly? Is there some example with threads and async receive?
One option is to take advantage of the fact that when the local receive buffer for your UDP socket fills up, subsequently received packets will push older ones out of the buffer. You can set the local receive buffer size to be large enough for one packet, but not two. This way, the newest packet to arrive will always cause the previous one to be discarded. When you then ask for a packet using receive_from, you'll get the latest (and only) one.
Here are the API docs for changing the receive buffer size with Boost:
http://www.boost.org/doc/libs/1_37_0/doc/html/boost_asio/reference/basic_datagram_socket/receive_buffer_size.html
The example appears to be wrong in that it shows a TCP socket rather than a UDP socket, but changing it to UDP should be easy (the trivially obvious change should be the right one).
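A minimal sketch of that suggestion with the UDP change made (assuming datagrams of at most 512 bytes; the port is made up, and note the OS may round the requested buffer size up to some minimum):

#include <boost/asio.hpp>

int main() {
    boost::asio::io_service io;   // io_context in newer Boost versions
    boost::asio::ip::udp::socket sock(io,
        boost::asio::ip::udp::endpoint(boost::asio::ip::udp::v4(), 9000));
    boost::asio::socket_base::receive_buffer_size option(512);   // room for ~1 datagram
    sock.set_option(option);
    boost::asio::socket_base::receive_buffer_size current;
    sock.get_option(current);     // check what the OS actually granted
    // ... receive_from() here should now tend to return the most recent datagram ...
}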
With Windows (certainly XP, Vista, and 7), if you set your recv buffer size to zero you'll only receive datagrams if you have a recv pending when the datagram arrives. This MAY do what you want, but you'll have to sit and wait for the next one if you post your recv just after the last datagram arrives...
Since you're doing a game, it would be far better, IMHO, to use something built on UDP rather than UDP itself. Take a look at ENet, which supports reliable data over UDP and also unreliable 'sequenced' data over UDP. With unreliable sequenced data you only ever get the 'latest' data. Or something like RakNet might be useful to you, as it does a lot of games stuff and also includes something similar to ENet's sequenced data.
Something else you should bear in mind is that with raw UDP you may get datagrams out of order and you may get them more than once. So you're likely going to need your own sequence number in there anyway if you don't use something which sequences the data for you.
P2engine is a flexible and efficient platform for making p2p system development easier. Reliable UDP, Message Transport, Message Dispatcher, Fast and Safe Signal/Slot...
You're going about it the wrong way. The receiving end has a FIFO queue. Once the queue fills up, newly arriving packets are discarded.
So what you need to do on the receiver is just to keep reading the packets as fast as possible and process them as they arrive.
Your receiving end should easily be able to handle receiving a packet every 500 ms. If you're seeing stale packets, I'd say you've got a bug in your code, and from what you describe you do.
It could be this: make sure in handle_receive that you only call async_receive_from if there is no error.
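For example, a minimal complete sketch of that pattern (the names and port are made up; assumes a C++11-capable compiler and a reasonably recent Boost):

#include <boost/asio.hpp>
#include <cstddef>
#include <iostream>

using boost::asio::ip::udp;

struct Receiver {
    udp::socket socket_;
    udp::endpoint sender_;
    char buf_[1500];

    explicit Receiver(boost::asio::io_service& io)
        : socket_(io, udp::endpoint(udp::v4(), 9000)) { start_receive(); }

    void start_receive() {
        socket_.async_receive_from(boost::asio::buffer(buf_), sender_,
            [this](const boost::system::error_code& ec, std::size_t n) {
                if (!ec) {                     // re-arm only on success; otherwise
                    std::cout.write(buf_, n);  // you can spin on the same error
                    start_receive();           // forever, which looks like a busy loop
                }
            });
    }
};

int main() {
    boost::asio::io_service io;
    Receiver r(io);
    io.run();   // blocks; a game would call io.poll() once per frame instead of using a thread
}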
I think I had the same problem. To solve it, I read the number of readable bytes and compare it with the packet size, repeating until I have read the last packet:
boost::asio::socket_base::bytes_readable command(true);   // FIONREAD-style query
socket_server.io_control(command);
std::size_t bytes_readable = command.get();               // bytes queued on the socket right now
Here is the documentation.
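Building on the same idea, a possible drain loop (socket.available() reports the same count as the bytes_readable command; kPacketSize is a made-up constant for your fixed datagram width, and socket_server/mBuffer/mEndPoint are the names used earlier in this thread):

// Keep reading while a full datagram is still queued, so the last
// receive_from() call returns the newest packet.
const std::size_t kPacketSize = 512;   // hypothetical datagram width
while (socket_server.available() >= kPacketSize)
    socket_server.receive_from(boost::asio::buffer(mBuffer), mEndPoint);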