TCP connection establishment: how to measure the time based on ping RTT? (sockets)

I would be grateful for help understanding how long it takes to establish a TCP connection when I know the ping round-trip time (RTT).
According to Wikipedia, a TCP connection is established in three steps:
1. SYN-SENT (client => server)
2. SYN/ACK-RECEIVED (server => client)
3. ACK-SENT (client => server)
My Questions:
Is it correct that the third transmission (ACK-SENT) does not yet carry any payload (my data) but is only used for the connection establishment? (This leads to the conclusion that the fourth packet will be the first packet to carry any payload.)
Is it correct to assume that, when my ping round-trip time is 20 milliseconds, the TCP connection establishment in the example above would take at least 30 milliseconds before any data can be transmitted between the client and server?
Thank you very much
Tom

Those things are basically correct, though #2 assumes that the round-trip time is symmetric.

To measure this, called the "time to SYN/ACK" (which is NOT the time to establish a connection - the connection is only half-open in that state; you need the third packet acknowledging the establishment to consider it established), you usually need professional tools that include their own TCP stack to enable that kind of measurement. The most widely used one is the Spirent Avalanche, but there are also Ixia's IxLoad and the BreakingPoint Systems box (BPS has since been acquired by Ixia, by the way).
Note that, yes, the third packet won't carry any data, and that is also true of the first two. They only carry the SYN and SYN+ACK flags (these are TCP flags) and contain no application data. This initial exchange, called the three-way handshake, therefore adds some overhead, which is why TCP is typically not used in real-time applications (voice, live video, etc.).
Also, as stated, you can't assume that latency = RTT/2. It is in fact very difficult to measure one-way latency above layer 3 (IP) - and you are already at layer 4 (TCP) here. This blog post covers the challenges in detail: http://synsynack.wordpress.com/2012/04/09/realistic-latency-measurement-in-the-application-layers/
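If you only need a rough real-world number rather than a lab-grade measurement, you can simply time connect() yourself and compare it with your ping RTT. A minimal sketch in Python (host and port are placeholders; note that connect() returns once the SYN/ACK has arrived and the final ACK has been handed to the kernel, so you are measuring roughly one RTT plus stack overhead, not the full one-and-a-half RTTs before your first payload byte reaches the server):

    import socket
    import time

    HOST, PORT = "example.com", 80    # placeholders - point this at your own server

    samples = []
    for _ in range(5):
        start = time.perf_counter()
        # create_connection() performs the three-way handshake and returns
        # as soon as the socket is connected (SYN -> SYN/ACK -> ACK).
        with socket.create_connection((HOST, PORT), timeout=5):
            pass
        samples.append((time.perf_counter() - start) * 1000)

    print("connect() times (ms):", ["%.1f" % s for s in samples])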

Related

How much higher is the packet delivery rate of MQTT compared to CoAP?

I would like to know how the packet delivery rates of MQTT and CoAP transmission compare. I know that TCP is more reliable than UDP, so MQTT should have a higher packet delivery rate. I just want to know: if 2000 packets are sent using both protocols separately, what would the approximate percentage be in the two cases?
Please help with an example if possible.
If you dig a little, you will find that both TCP and UDP are ultimately sending IP packets, and some of those packets may be lost. For TCP, retransmission is handled by the protocol without your involvement, and that works reasonably well in many cases. For CoAP, when you use CON (confirmable) messages, CoAP does the retransmission for you, so there is also not too much to lose.
When it comes to transmissions with more message loss (e.g. bad connectivity), reliability also depends on the amount of data. If it fits into one IP packet, the probability that it reaches the destination is higher than the probability of 4 packets all reaching their destination. That is where the difference starts:
TCP requires that all segments reach the destination without gaps (e.g. 1, 2, (drop 3), 4 will not work - delivery stalls until 3 is retransmitted). CoAP will deliver the messages even with gaps. It depends on your application whether you can benefit from that or not.
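To make the single-packet versus multi-packet point concrete, here is a tiny sketch with an assumed per-packet loss rate of 5% (the number is purely illustrative): if each IP packet is lost independently with probability p, a payload split across four packets arrives completely with probability (1-p)^4, noticeably less than for a payload that fits in one packet.

    p = 0.05                        # assumed per-packet loss probability
    one_packet = 1 - p              # payload fits in a single IP packet
    four_packets = (1 - p) ** 4     # all four fragments must arrive
    print(f"one packet:   {one_packet:.3f}")    # 0.950
    print(f"four packets: {four_packets:.3f}")  # ~0.815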
I've been testing CoAP over DTLS 1.2 (with Connection ID) for almost a year now, using an Android phone and just moving around sending requests (about 400 bytes each) over WiFi and mobile networks. It works very reliably. Current statistics: 2000 requests, 143 retransmissions, 4 lost. Please note: "4 lost" mainly means "no connection at all", so don't assume that TCP would do better than that, especially when moving around and new TCP/TLS handshakes are frequently required.
So my conclusion:
If you have a stable network connection, both should work.
If you have a less stable network connection and gaps are not acceptable by your application, both have trouble.
If you have a less stable network connection and gaps are acceptable by your application, then CoAP will do a better job.

Low latency two-phase protocol

I'm looking for recommendations on how to achieve low latency for the following network protocol:
1. Alice sends out a request for information to many peers selected at random from a very large pool.
2. Each peer responds with a small packet (<20 KB).
3. Alice aggregates the responses and selects a peer accordingly.
4. Alice and the selected peer then continue to the second phase of the protocol, in which a sequence of 2 requests and responses is performed.
5. Repeat from 1.
Given that steps 1 and 2 do not need to be reliable (as long as a percentage of responses arrive back we proceed to step 3) and that step 1 is essentially a multicast, this part of the protocol seems to suit UDP - setting up a TCP connection to these peers would add an additional round trip.
However step 4 needs to be reliable - we can't tolerate packet loss during the subsequent requests/responses.
The conundrum I'm facing is that UDP suits steps 1 and 2 while TCP suits step 4. Connecting to every peer selected in step 1 is slow, especially since we aim to transmit just 20 KB, yet UDP cannot be tolerated for step 4. Handshaking with the peer selected for step 4 would require an additional round trip, which compared to the 3 round trips of the rest of the protocol is still a considerable increase in total time.
Is there some hybrid scheme whereby you can do a TCP handshake while transmitting a small amount of data? (The handshake could be merged into 1 and 2 and hence doesn't add any additional round trip time.)
Is there a name for such protocols? What should I read to become more acquainted with such problems?
Additional info:
Participants are assumed to be randomly distributed around the globe and connected via the internet.
The pool selected from in step 1 is on the order of 1000 addresses, and the sample taken from the pool is on the order of 10 to 100.
There's not enough detail here to do a well-informed criticism. If you were hiring me for advice, I'd want to know a lot more about the proposal, but since I'm doing this for free, I'll just answer the question as asked, and try to make it practical rather than ideal.
I'd argue that UDP is not suitable for the early part of your protocol. You can't just multicast a single packet to a large number of hosts on the Internet (although you can do it on typical LANs). A 20KB payload is not the sort of thing you can generally transmit in a single datagram in any case, and the moment messages fail to fit in a single datagram, UDP loses most of its attraction, because you start reinventing TCP (badly).
Probably the simplest thing you can do is base your system on HTTP, and work with implementations which incorporate all the various speed-ups that Google (mostly) has been putting into HTTP development. This includes TCP Fast Open, and things like it. Initiate connections out to your chosen servers; some will respond faster than others: use that to your advantage by going with the quickest ones. Don't underestimate the importance of efficient implementation relative to theoretical round-trip time, by the way.
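If you want to experiment with merging the handshake and the first request yourself, TCP Fast Open is exposed on Linux through a send flag on the client side. A minimal, Linux-only client sketch (the peer address and payload are placeholders, the fallback constant is the Linux value in case your Python build doesn't export it, and both the kernel and the server need Fast Open enabled):

    import socket

    HOST, PORT = "peer.example.net", 8080    # placeholder peer
    MSG_FASTOPEN = getattr(socket, "MSG_FASTOPEN", 0x20000000)   # Linux value

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # sendto() with MSG_FASTOPEN puts the request data on the SYN itself;
    # if the server previously handed out a Fast Open cookie, the request
    # rides along with the handshake instead of waiting for it to finish.
    s.sendto(b"phase-1 request", MSG_FASTOPEN, (HOST, PORT))
    reply = s.recv(4096)
    s.close()

On the server side the listening socket needs setsockopt(IPPROTO_TCP, TCP_FASTOPEN, queue_len), and the very first connection to a given server still pays a full handshake to obtain the cookie.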
For stage two, continue with HTTP as before. For efficiency, you could hold all the connections open at the end of phase one and then close all the ones except your chosen phase two partner. It's not clear from your description that the phase two exchange lends itself to the HTTP model, though, so I have to hand-wave this a bit.
It's also possible that you can simply hold TCP connections open to all available peers more or less permanently, thus dodging the cost of connection establishment nearly all the time. A thousand simultaneous open connections is large, but not outrageous in most contexts (although you may need to tweak OS settings to allow it). If you do that, you can just talk whatever protocol you like over TCP. If it's a truly peer-to-peer protocol, you only need one TCP connection per pair. Implementing this kind of thing is tricky, though: an average programmer will do a terrible job of it, in my experience.
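As a rough illustration of the hold-everything-open idea, here is a minimal sketch using Python's selectors module to keep a small pool of idle TCP connections and collect whichever replies come back first (the peer addresses are placeholders; a real implementation also needs reconnection, message framing, and timeouts):

    import selectors
    import socket

    PEERS = [("10.0.0.%d" % i, 9000) for i in range(1, 4)]   # placeholder peers

    sel = selectors.DefaultSelector()
    conns = {}
    for addr in PEERS:
        s = socket.create_connection(addr)    # pay the handshake cost once, up front
        s.setblocking(False)
        sel.register(s, selectors.EVENT_READ, data=addr)
        conns[addr] = s

    # Phase one later reuses the already-open sockets - no new handshakes.
    for s in conns.values():
        s.send(b"phase-1 request\n")

    # Take whichever replies arrive within the window and pick the quickest peer.
    for key, _ in sel.select(timeout=0.2):
        print("reply from", key.data, key.fileobj.recv(4096))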

tcp or udp for a game server?

I know, I know. This question has been asked many times before. But I've spent an hour googling now without finding what I am looking for so I will ask it again and mention my context along with what makes the decision hard for me:
I am writing the server for a game where the response time is very important and a packet loss every now and then isn't a problem.
Judging by this and the fact that I as a server mostly have to send the same data to many different clients, the obvious answer would be UDP.
I had already started writing the code when I came across this:
In some applications TCP is faster (better throughput) than UDP.
This is the case when doing lots of small writes relative to the MTU size. For example, I read an experiment in which a stream of 300 byte packets was being sent over Ethernet (1500 byte MTU) and TCP was 50% faster than UDP.
In my case the information units I'm sending are <100 bytes, which means each one fits into a single UDP packet (which is quite pleasant for me because I don't have to deal with fragmentation), and UDP seems much easier to implement for my purpose because I don't have to deal with a huge number of individual connections. But my top priority is to minimize the time between
client sends something to server
and
client receives response from server
So I am willing to pick TCP if that's the faster way.
Unfortunately I couldn't find more information about the above quoted case, which is why I am asking: Which protocol will be faster in my case?
UDP is still going to be better for your use case.
The main problem with TCP and games is what happens when a packet is dropped. In UDP, that's the end of the story; the packet is dropped and life continues exactly as before with the next packet. With TCP, data transfer across the TCP stream will stop until the dropped packet is successfully retransmitted, which means that not only will the receiver not receive the dropped packet on time, but subsequent packets will be delayed also -- most likely they will all be received in a burst immediately after the resend of the dropped packet is completed.
Another feature of TCP that might work against you is its automatic bandwidth control -- i.e. TCP will interpret dropped packets as an indication of network congestion, and will dial back its transmission rate in response; potentially to the point of dialing it down to near zero, in cases where lots of packets are being lost. That might be useful if the cause really was network congestion, but dropped packets can also happen due to transient network errors (e.g. user pulled out his Ethernet cable for a couple of seconds), and you might not want to handle those problems that way; but with TCP you have no choice.
One downside of UDP is that it often takes special handling to get incoming UDP packets through the user's firewall, as firewalls are often configured to block incoming UDP packets by default. For an action game it's probably worth dealing with that issue, though.
Note that it's not a strict either/or option; you can always write your game to work over both TCP and UDP, and either use them simultaneously, or let the program and/or the user decide which one to use. That way if one method isn't working well, you can simply use the other one, and it only takes twice as much effort to implement. :)
In some applications TCP is faster (better throughput) than UDP. This is the case when doing lots of small writes relative to the MTU size. For example, I read an experiment in which a stream of 300 byte packets was being sent over Ethernet (1500 byte MTU) and TCP was 50% faster than UDP.
If this turns out to be an issue for you, you can obtain the same efficiency gain in your UDP protocol by placing multiple messages together into a single larger UDP packet, i.e. instead of sending 3 100-byte packets, you'd place those 3 100-byte messages together in 1 300-byte packet. (You'd need to make sure the receiving program is able to correctly interpret this larger packet, of course.) That's really all the TCP layer is doing here anyway: placing as much data into the outgoing packets as it has available and can fit, before sending them out.
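For example, a simple way to do that batching yourself is to length-prefix each message and pack several into one datagram. A minimal sketch (the message contents and the 1400-byte budget are assumed values, chosen to stay under a typical 1500-byte Ethernet MTU):

    import socket
    import struct

    MAX_DATAGRAM = 1400    # assumed budget, kept under a typical Ethernet MTU

    def pack(messages):
        """Prefix each message with a 2-byte length and concatenate them."""
        out = b""
        for m in messages:
            out += struct.pack("!H", len(m)) + m
        assert len(out) <= MAX_DATAGRAM
        return out

    def unpack(datagram):
        """Split a received datagram back into the original messages."""
        msgs, i = [], 0
        while i < len(datagram):
            (n,) = struct.unpack_from("!H", datagram, i)
            msgs.append(datagram[i + 2:i + 2 + n])
            i += 2 + n
        return msgs

    # Usage: three ~100-byte updates travel in a single UDP datagram.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(pack([b"update-1" * 12, b"update-2" * 12, b"update-3" * 12]),
                ("127.0.0.1", 5005))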

General overhead of creating a TCP connection

I'd like to know the general cost of creating a new connection, compared to UDP. I know TCP requires an initial exchange of packets (the 3 way handshake). What would be other costs? For instance is there some sort of magic in the kernel needed for setting up buffers etc?
The reason I'm asking is I can keep an existing connection open and reuse it as needed. However if there is little overhead reconnecting it would reduce complexity.
Once a UDP packet has been dumped onto the wire, the UDP protocol stack is free to completely forget about it. With TCP, there's at bare minimum the connection state (source/destination port and source/destination IP), the sequence numbers, the window size for the connection, and so on. It's not a huge amount of data, but it adds up quickly on a busy server with many connections.
And then there's the 3-way handshake as well. Some braindead (and/or malicious) systems can abuse the process (look up "SYN flood"), or simply drop the connection on their end, leaving your system waiting for a response or close notification that will never come. The plus side is that with TCP the system will do its best to make sure the packet gets where it has to go. With UDP, there are no guarantees at all.
Compared to the latency of the packet exchange, all other costs such as kernel setup times are insignificant.
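If you want to see how much of a difference reusing a connection makes in your own environment, a quick and admittedly unscientific sketch like the following can help decide whether the extra complexity of connection reuse is worth it (host, port, and request are placeholders):

    import socket
    import time

    HOST, PORT = "example.com", 80                 # placeholder server
    REQUEST = b"HEAD / HTTP/1.1\r\nHost: example.com\r\n\r\n"

    # Fresh connection per request: one handshake every time.
    t0 = time.perf_counter()
    for _ in range(10):
        with socket.create_connection((HOST, PORT), timeout=5) as s:
            s.sendall(REQUEST)
            s.recv(1024)
    fresh = time.perf_counter() - t0

    # One connection kept open and reused for all requests.
    t0 = time.perf_counter()
    with socket.create_connection((HOST, PORT), timeout=5) as s:
        for _ in range(10):
            s.sendall(REQUEST)
            s.recv(1024)
    reused = time.perf_counter() - t0

    print(f"fresh connections: {fresh:.3f} s, reused connection: {reused:.3f} s")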
OPTION 1: The general costs of creating a TCP connection are:
Create socket connection
Send data
Tear down socket connection
Step 1: Requires an exchange of packets, so it's delayed by the to-and-from network latency plus the destination server's service time. No significant CPU usage on either box is involved.
Step 2: Depends on the size of the message.
Step 3: IIRC, this just sends a "closing now" packet, with no wait for the destination's ack, so no latency is involved.
OPTION 2: The costs of UDP are:
Create UDP object
Send data
Close UDP object
Step 1: Requires minimal setup, no latency worries, very fast.
Step 2: BE CAREFUL OF SIZE - there is no retransmit in UDP, since it doesn't care whether the packet was received by anyone or not. I've heard that the larger the message, the greater the probability of the data arriving corrupted, and that as a rule of thumb you'll lose a certain percentage of messages over 20 MB.
Step 3: Minimal work, minimal time.
OPTION 3: Use ZeroMQ Instead
You're comparing TCP to UDP with a goal of reducing reconnection time. THERE IS A NICE COMPROMISE: ZeroMQ sockets.
ZMQ allows you to set up a publishing socket where you don't care if anyone is listening (like UDP), and have multiple listeners on that socket. This is NOT a UDP socket - it's an alternative to both of these protocols.
See: ZeroMQ.org for details.
It's very high speed and fault tolerant, and is in increasing use in the financial industry for those reasons.
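For a flavour of what that looks like, here is a minimal PUB/SUB sketch using the pyzmq bindings (the endpoint is a placeholder; note that, much like UDP, a PUB socket silently drops messages for subscribers that aren't connected yet, hence the short sleep):

    import time
    import zmq

    ctx = zmq.Context()

    # Publisher: sends regardless of whether anyone is listening.
    pub = ctx.socket(zmq.PUB)
    pub.bind("tcp://*:5556")                  # placeholder endpoint

    # Subscriber: can join or leave at any time; empty prefix = all messages.
    sub = ctx.socket(zmq.SUB)
    sub.connect("tcp://localhost:5556")
    sub.setsockopt_string(zmq.SUBSCRIBE, "")
    time.sleep(0.2)                           # give the subscription time to propagate

    pub.send_string("tick 42")
    print(sub.recv_string())                  # -> "tick 42"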

What's the difference between streams and datagrams in network programming?

What's the difference between sockets (stream) vs sockets (datagrams)? Why use one over the other?
A long time ago I read a great analogy for explaining the difference between the two. I don't remember where I read it so unfortunately I can't credit the author for the idea, but I've also added a lot of my own knowledge to the core analogy anyway. So here goes:
A stream socket is like a phone call -- one side places the call, the other answers, you say hello to each other (SYN/ACK in TCP), and then you exchange information. Once you are done, you say goodbye (FIN/ACK in TCP). If one side doesn't hear a goodbye, they will usually call the other back since this is an unexpected event; usually the client will reconnect to the server. There is a guarantee that data will not arrive in a different order than you sent it, and there is a reasonable guarantee that data will not be damaged.
A datagram socket is like passing a note in class. Consider the case where you are not directly next to the person you are passing the note to; the note will travel from person to person. It may not reach its destination, and it may be modified by the time it gets there. If you pass two notes to the same person, they may arrive in an order you didn't intend, since the route the notes take through the classroom may not be the same, one person might not pass a note as fast as another, etc.
So you use a stream socket when having information in order and intact is important. File transfer protocols are a good example here. You don't want to download some file with its contents randomly shuffled around and damaged!
You'd use a datagram socket when order is less important than timely delivery (think VoIP or game protocols), when you don't want the higher overhead of a stream (this is why DNS is primarily a datagram protocol, so that servers can respond to many, many requests at once very quickly), or when you don't care too much if the data ever reaches its destination.
To expand on the VoIP/game case, such protocols include their own data-ordering mechanism. But if one packet is damaged or lost, you don't want to wait for the stream protocol (usually TCP) to issue a re-send request - you need to recover quickly. TCP can take up to several minutes to recover in the worst case, and for realtime protocols like gaming or VoIP even three seconds may be unacceptable! Using a datagram protocol like UDP allows the software to recover from such an event extremely quickly, by simply ignoring the lost data or re-requesting it sooner than TCP would.
VoIP is a good candidate for simply ignoring the lost data -- one party would just hear a short gap, similar to what happens when talking to someone on a cell phone with poor reception. Gaming protocols are often a little more complex, but the actions taken will usually be either to ignore the missing data (if subsequently received data supersedes it), to re-request the missing data, or to request a complete state update to ensure that the client's state stays in sync with the server's.
Stream Socket:
Dedicated, end-to-end channel between server and client.
Uses the TCP protocol for data transmission.
Reliable and lossless.
Data is sent and received in the same order.
Recovering lost or corrupted data takes a long time.
Datagram Socket:
No dedicated end-to-end channel between server and client.
Uses UDP for data transmission.
Not 100% reliable; data may be lost.
Data may be sent and received in a different order.
Lost or corrupted data is either ignored or recovered quickly.
If it is network programming, I think starting from sockets would be a good start.
socket = IP + port
There are three types of sockets:
stream (TCP; order and delivery guaranteed, no duplication, no length or character boundaries on the data, connection-oriented, reliable, concurrency)
datagram (UDP; packet-based, connectionless, datagram size limit, data can be lost or duplicated, order not guaranteed, not reliable)
raw (direct access to lower-layer protocols such as IP and ICMP)
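As a bare-bones illustration of the first two types (addresses and ports are placeholders, and the stream example assumes something is listening on that port):

    import socket

    # Stream (TCP): connection-oriented - connect first, then exchange bytes.
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tcp.connect(("127.0.0.1", 7000))          # placeholder server
    tcp.sendall(b"hello over a stream")
    tcp.close()

    # Datagram (UDP): connectionless - each sendto() is an independent packet.
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp.sendto(b"hello in a datagram", ("127.0.0.1", 7001))
    udp.close()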
I do not see any strict rule for which socket type has to use which transport protocol, and reliability should not be mistaken here, because UDP can be reliable in practice when both ends are active.
Reliability refers more to reliability of delivery, since TCP performs sequence-number checks that simply do not exist in UDP. It is better to use a network protocol analyzer like Wireshark or tcpdump to see what your software is actually doing - a kind of verification, or merging the theory on paper with your work in action.