What are the required mechanisms for a reliable layer over UDP? - sockets

I've been working on writing my own networking engine for my own game development side projects. This requires the options of having unreliable, reliable, and ordered reliable messages. I have not, however, been able to identify all of the mechanisms necessary for reliable and ordered reliable protocols.
What are the required mechanisms for a reliable layer over UDP? Additional details are appreciated.
So far, I gather that these are requirements:
Acknowledge received messages with a sequence number.
Resend unacknowledged messages after a retransmission time expires.
Track round trip times for each destination in order to calculate an appropriate retransmission time.
Identify and remove duplicate packets.
Handle overflowing sequence numbers looping around.
This has influenced my architecture to have reliable message headers with sequences and timestamps, acknowledge messages that echo a received sequence and timestamp, a system for tracking appropriate retransmission times based on address, and a thread that a) receives messages and queues them for user receipt, b) acknowledges reliable messages, and c) retransmits unacknowledged messages with expired retransmission timers.
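Two of the mechanisms listed above can be sketched briefly. This is a minimal Python sketch under stated assumptions, not a definitive implementation: `seq_newer` compares sequence numbers so that wraparound is handled, and `RtoEstimator` derives a retransmission timeout from round-trip samples using the smoothing constants from RFC 6298 (the ones TCP uses). The 16-bit sequence width, the names, and the 0.2 s floor are all assumptions made for illustration.

```python
SEQ_BITS = 16                 # assumed width of the sequence field
SEQ_MOD = 1 << SEQ_BITS
HALF = SEQ_MOD // 2


def seq_newer(a: int, b: int) -> bool:
    """True if sequence number a is newer than b, allowing for wraparound."""
    return a != b and ((a - b) % SEQ_MOD) < HALF


class RtoEstimator:
    """Smoothed RTT and retransmission timeout for one destination."""

    def __init__(self, initial_rto: float = 1.0, min_rto: float = 0.2):
        self.srtt = None          # smoothed round-trip time
        self.rttvar = None        # smoothed round-trip variation
        self.rto = initial_rto
        self.min_rto = min_rto    # assumed floor, tune for your game

    def on_sample(self, rtt: float) -> float:
        """Feed one measured round trip (seconds); returns the new timeout."""
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - rtt)
            self.srtt = 0.875 * self.srtt + 0.125 * rtt
        self.rto = max(self.min_rto, self.srtt + 4 * self.rttvar)
        return self.rto
```

Echoing the sender's timestamp in the acknowledgement, as described above, means a sample can be attributed to the right transmission even when a message was resent.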
NOTE:
Reliable UDP is not the same as TCP. Even ordered reliable UDP is not the same as TCP. I am not secretly unaware that I really want TCP. Also, before someone plays the semantics games, yes... reliable UDP is an "oxymoron". This is a layer over UDP that makes for reliable delivery.

You might like to take a look at the answers to this question: What do you use when you need reliable UDP?
I'd add 'flow control' to your list. You want to be able to control the amount of data you're sending on a particular link depending on the round trip times you're getting, or you'll flood the link and just be throwing datagrams away.
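One simple form of flow control is a fixed send window that caps how many reliable messages may be unacknowledged at once. The sketch below assumes that approach; the class name, the default window of 32, and the 16-bit sequence space are hypothetical choices, and a real engine would probably adjust the window from measured round trip times and loss.

```python
from collections import deque


class SendWindow:
    """Caps the number of reliable messages in flight at any one time."""

    def __init__(self, max_in_flight: int = 32):
        self.max_in_flight = max_in_flight
        self.pending = deque()    # payloads waiting for window space
        self.in_flight = {}       # seq -> payload awaiting an ack
        self.next_seq = 0

    def queue(self, payload: bytes) -> None:
        self.pending.append(payload)

    def take_sendable(self):
        """Return the (seq, payload) pairs that may be transmitted now."""
        out = []
        while self.pending and len(self.in_flight) < self.max_in_flight:
            seq = self.next_seq
            self.next_seq = (self.next_seq + 1) % 65536
            payload = self.pending.popleft()
            self.in_flight[seq] = payload
            out.append((seq, payload))
        return out

    def on_ack(self, seq: int) -> None:
        """An ack frees one slot; unknown or duplicate acks are ignored."""
        self.in_flight.pop(seq, None)
```

Messages that do not fit in the window stay queued until an ack arrives, so the sender never has more than `max_in_flight` datagrams outstanding on that link.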

Note that depending on the overall protocol, it might be possible to dispense with retransmission timers. See, for example, the Quake 3 network protocol.
In Q3 reliable packets are simply sent until an ack is seen.
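A rough sketch of that idea, in Python: every outgoing packet carries all reliable messages that have not been acknowledged yet, plus an ack for the newest reliable message received from the peer. This is only in the spirit of the approach described, not Quake 3's actual wire format; the class, the dict-based "packet", and the unbounded sequence numbers are assumptions made to keep the example short.

```python
class ReliableChannel:
    """Resends every unacked reliable message in each outgoing packet."""

    def __init__(self):
        self.unacked = []         # (seq, payload), oldest first
        self.next_seq = 1
        self.last_received = 0    # newest reliable seq delivered from the peer

    def send_reliable(self, payload):
        self.unacked.append((self.next_seq, payload))
        self.next_seq += 1

    def build_packet(self):
        """Called once per outgoing packet, e.g. once per game tick."""
        return {"ack": self.last_received, "reliable": list(self.unacked)}

    def receive_packet(self, packet):
        # Anything the peer has acknowledged no longer needs resending.
        self.unacked = [(s, p) for s, p in self.unacked if s > packet["ack"]]
        delivered = []
        for seq, payload in packet["reliable"]:
            # Deliver in order; copies already delivered are skipped.
            if seq == self.last_received + 1:
                self.last_received = seq
                delivered.append(payload)
        return delivered
```

No timer is needed because packets are sent at a steady rate anyway; a lost packet is repaired by the next one.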

Why are you trying to re-invent TCP? It provides all of the features you originally stated, and has been shown to work well.
EDIT - Since your comments show that you have additional requirements not originally stated, you should consider whether a hybrid model using multiple sockets would be better than trying to fulfill all of those criteria in a single application-layer protocol.
Actually it seems that what you really need is SCTP.
SCTP supports:
message based (rather than byte stream) transmissions
multiple streams over a single netsock socket
ordered or unordered receipt of packets
... message ordering is optional in SCTP; a receiving application may choose to process messages in the order they are received instead of the order they were sent

Related

Does it make sense to use RTP protocol for multiple streamers and single receiver?

I am in the process of learning and trying to use the RTP/RTCP protocol. My situation is that there are 1 to n streamers and 1 (or potentially 1 to m if needed) receiver(s), but the streamers themselves do not know about each other (they cannot, for technical reasons such as different networks, limited bandwidth, etc.). So it is more like multiple unicast sessions, where the receiver knows about all the senders and collects data from all of them, but the senders do not know about each other.
Now reading about the protocol, it seems to me that a huge portion of it is related to sending feedback, collision detection, and so on. So I have doubts: is RTP really applicable in this case? Is it already used in this way somewhere?
It seems to me it is still beneficial to collect the statistics about data transfer that RTP provides (data sent, loss, times, etc.), it just feels like most of the protocol is sort of left out...
Also I have one additional question, going through the various RTP libraries, they all assume that sender will also open ports for receiving RTP/RTCP data, does RTP forbid use of one way communication? I mean application that would only stream the data, not expecting to receive anything back. The libraries (e.g. ccRTP) seem to assume both way communication only...
RTCP is the protocol that provides statistics. The stream receiver (client) will send stats to the sender (server) via RTCP. I don't believe the client will get any statistic reports from the server.
There's nothing wrong with a single client receiving multiple unicast sessions from various servers.
RTP requires two-way communication during the setup process. Once setup is complete and the play command is sent, it is mostly one-way. The exception is the "keep alive" packets that must be sent to the server periodically (usually every 60 seconds or so) to keep the stream going. The exact timeout value is sent to the client during the setup process.
But if you implement your own RTP, there's nothing stopping you from having the server send the stream continuously without any feedback from the client. Basically it would be implementing an infinite timeout value.
You can read about all the details in the spec: RTP: A Transport Protocol for Real-Time Applications
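If you do go the "implement your own" route, a send-only streamer needs little more than the 12-byte fixed RTP header in front of each payload, and the receiver can tell the independent senders apart by their SSRC field. The Python sketch below packs and parses that fixed header as laid out in the spec; the function names are hypothetical, and it deliberately ignores padding and header extensions.

```python
import struct

RTP_VERSION = 2
FIXED_HEADER = struct.Struct("!BBHII")   # 12 bytes, network byte order


def pack_rtp(payload_type, seq, timestamp, ssrc, payload, marker=False):
    """Build an RTP packet with no padding, no extension and no CSRCs."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = FIXED_HEADER.pack(byte0, byte1, seq & 0xFFFF,
                               timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload


def unpack_rtp(packet):
    """Parse the fixed header; header extensions are not handled here."""
    byte0, byte1, seq, timestamp, ssrc = FIXED_HEADER.unpack_from(packet)
    csrc_count = byte0 & 0x0F
    offset = FIXED_HEADER.size + 4 * csrc_count   # skip any CSRC entries
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 & 0x80),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[offset:],
    }
```

On the receiving side, keying per-sender state (last sequence number, loss counters) on `ssrc` gives you the transfer statistics without the senders needing to hear anything back.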

Is UDP always unreliable?

I'm about to re-architect a real-time system that has been prototyped on a single node and specify how it should be scaled up to multiple nodes (probably never more than 20 of them in any one LAN). Some of the functionality will multiply on a per-node basis, and some of it will remain centralised on a one-per-system basis. There is going to be a need for communication between each node and that central unit (possibly a master node), but not between individual nodes.
Due to the real-time demands of the system, UDP is something that should be considered for that communication. But... it is almost always described as unreliable. Is this always the case? Does it not depend on the scale of the network, the data load on the network and the way the protocol is used?
For example, suppose I have a central unit which regularly polls through each node by addressing a UDP message to it, and each node immediately responds with its data via UDP. There is no other communication on the (isolated) network. Suppose there is also some mechanism to ensure there are never any collisions (e.g. all nodes have a maximum transmission length for their responses to a poll message, and the latencies are nailed down to known levels). Is there any (hidden) reason in a simple and structured network like this that you would ever fail to transmit/receive every last UDP packet and have near 100% reliability?
EDIT: the detail of this question suffers from confusion around what "unreliable" means, and whether it is intended to apply only to UDP, or to the system in which UDP is employed. I have chosen to leave this confusion in the question, because looking back over a lot of material on UDP, I can see that this confusion might be very common, and that answers which highlight that confusion and overcome it might be valuable.
The key is, UDP does not make any guarantees. There are many reasons why datagrams might go undelivered:
Sender host buffers fill up
Cosmic rays flip bits somewhere along the way, causing a checksum mismatch and the datagram to be discarded
Electromagnetic interference corrupts the signal momentarily
A network cable gets unplugged for a moment
A hub or switch loses power for a moment
A switch's buffers fill up
Receiving host buffers fill up
If any of these things (or many others) occurs, a datagram may go undelivered. UDP will make no attempt to detect this or to re-deliver it.
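Detection therefore has to happen in the application. For a polled setup like the one in the question, a minimal Python sketch might look like this: the central unit tags each poll with a number, waits a bounded time for a reply carrying the same number, and retries a couple of times before reporting the node as silent. The function name, the 4-byte poll number, and the timeout and retry values are assumptions for illustration only.

```python
import socket
import struct
import time


def poll_node(sock, node_addr, seq, timeout=0.05, retries=2):
    """Poll one node over UDP; return its reply payload, or None if silent."""
    request = struct.pack("!I", seq)
    for _ in range(retries + 1):
        sock.sendto(request, node_addr)
        deadline = time.monotonic() + timeout
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break                      # this attempt timed out
            sock.settimeout(remaining)
            try:
                data, addr = sock.recvfrom(2048)
            except socket.timeout:
                break
            # Accept only a reply from this node that echoes this poll number.
            if (addr == node_addr and len(data) >= 4
                    and struct.unpack("!I", data[:4])[0] == seq):
                return data[4:]
    return None
```

A `None` result tells the central unit that a datagram went missing somewhere, which is exactly the information UDP itself never provides.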
Yes. Every layer is potentially unreliable, starting with the electrical signalling across your Ethernet cable. (Ever jostled one of those plugs? You can see it in Wireshark logs.) Collisions are virtually impossible to avoid. And in case of congestion, your protocol stack may decide to drop UDP packets.
But all that's rather beside the point. UDP is unreliable, but that doesn't mean it can't be relied on. Plenty of mission-critical applications run over UDP. You just need to understand the unreliability and account for it.
Unreliable does not mean it will definitely fail. It only means that it does not care about transport problems and thus will not make any guarantees that transmission will be successful. Let's compare some aspects of UDP against TCP.
UDP is packet based, TCP stream based. This has not much to do with reliability.
Packets may arrive in a different order than they were sent. UDP does not care and will deliver the packets in this order to the application. In TCP, data carries a sequence number, so the receiver's operating system will detect reordering and forward the data to the application in the correct order. Reordering usually does not occur when you have a direct connection between client and server, but it might happen in wide networks like the internet.
Packets may get lost due to router or switch congestion, overload of the sending or receiving system, or other causes. This might also happen in local networks with heavy traffic or if the receiving system is unable to cope with the amount of data, even for a short time. With UDP the data will be lost. TCP instead will detect lost packets and retransmit them, and will even slow down the traffic to adapt to the speed the network and endpoints can handle and thus lose fewer packets in the future.
Packets might get duplicated. Again TCP will detect this due to the sequence number but UDP will not and thus transmit the duplicate packet to the application.
Packets might get corrupted. Both TCP and UDP have the same kind of checksum to detect small errors, but will not detect larger errors.
Applications using UDP usually do not need the reliability of TCP, or do not need all of it. For instance, with real-time audio and video, packet loss is acceptable but duplicates and reordering are not. Thus the RTP protocol contains its own sequence number (and timestamp) to detect this case. Also, RTP is often accompanied by the RTCP protocol to send statistics about packet loss back to the peer and thus make adaptation of the connection speed possible.
If you want reliable UDP, try looking at ENet library.
http://enet.bespin.org/
Unreliability with regard to UDP is different from unreliability in general. Also, UDP and alternatives to it (e.g. TCP) are always only ever components or single layers in a wider system. This can lead to some confusion about what "unreliable" means.
UDP is a transport layer network protocol. The transport layer is responsible for getting data from one point on the network to another specific point on the network. In that context, UDP is described as an "unreliable" protocol because it makes no guarantees about whether the data sent will actually arrive. In contrast, TCP is a "reliable" transport layer protocol because if data goes missing or is corrupted the first time it is sent, the protocol itself has mechanisms to resend the data and ensure it arrives... eventually.
But UDP is not some sloppy "maybe, maybe not - let me think about it and screw you around" protocol. It does what it is specified to do, and is reliable (general sense) at doing it... as well as reliable (general sense) in failing in predictable ways. If you take these failure modes into account elsewhere, UDP can be a component of an overall very reliable system.
For example, by restricting network topology and using UDP to transport higher level protocols, the GigE Vision standard specifies a highly reliable system with high data transfer rates and real-time response whose transport level communications is dominated by UDP traffic.
Historically, the major source of unreliable packet transport was packet collisions due to two sources attempting to transmit simultaneously on a single channel. In modern networks, each node is typically connected on a full duplex link to a network switch, making collisions impossible on that link, and consequently making modern networks much more reliable (in all senses) than was the case when UDP was first designed.
No networking technology currently available can be made 100% reliable... but let's be practical rather than pedantic, because potential unreliability and actual unreliability are a lot like shark attacks - they tend to occur far more in people's minds than in reality.
Some material on UDP makes it sound almost like the people who designed UDP did it just to annoy people - that unreliability was deliberately engineered in. This is not the case, and it is unhelpful to think of it in these terms. It is far better to focus on what UDP does and does not do in comparison to alternatives (e.g. see this comparison between TCP and UDP... which nonetheless lists "unreliability" as a key feature of UDP).
In reality, when there is data to be transmitted, that can be transmitted, it is transmitted; when there is data that can be received, it is received. Likewise, if you transmit packets 1, 2 then 3 directly to an endpoint, they will almost certainly be received as packets 1, 2 and 3 in order (assuming no failures in lower network layers, and that incoming data is buffered in a FIFO as is customary, but not mandatory). You can get a lot of reliability out of this, depending on how you use it.
However, if you transmit packets via multiple routes, all bets are off - "unreliability" of packet order can occur. And if you flood the available buffers, unreliability via dropping packets will occur. And if you allow nodes to transmit at any time (asynchronous), then you will get unreliability through packet collisions. But in the "simple and structured" (and also small and synchronous) LAN described, you may be able to either avoid this, or detect its occurrence (e.g. by sending an incrementing counter value in each packet), which will let you compensate in an application-specific way.
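The incrementing-counter idea can be sketched in a few lines of Python. The receiver below remembers which counter value it expects next and reports how many values were skipped when a packet arrives; the class name and the 32-bit counter width are assumptions. Note that a packet which merely arrives late is counted as out of order and is not subtracted from the loss total again.

```python
class GapDetector:
    """Detects missing and late packets from a per-packet counter."""

    def __init__(self, modulus: int = 1 << 32):
        self.modulus = modulus    # assumed counter width: 32 bits
        self.expected = None      # next counter value we expect to see
        self.lost = 0
        self.out_of_order = 0

    def on_packet(self, counter: int) -> int:
        """Record one packet; returns how many packets were skipped before it."""
        if self.expected is None:
            self.expected = (counter + 1) % self.modulus
            return 0
        ahead = (counter - self.expected) % self.modulus
        if ahead < self.modulus // 2:
            # At or beyond the expected value: 'ahead' packets were skipped.
            self.lost += ahead
            self.expected = (counter + 1) % self.modulus
            return ahead
        # Older than expected: a late or duplicated packet.
        self.out_of_order += 1
        return 0
```

What to do with a reported gap (re-poll, interpolate, raise an alarm) is the application-specific part.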
In cases where the power goes off (perhaps momentarily), or cosmic rays strike, or people trip on loose cables causing an unacceptable level of "unreliability"... then don't blame UDP - blame the engineer(s) whose design left the system susceptible to these things.
All things considered, in the LAN described, you might reasonably expect to be able to engineer a system based on UDP so as to never lose more than one packet in every few million, or billion, or even astronomically better than this - but it will depend on specifics, and only you can know if your application can tolerate the quantity and quality of unreliable comms that results in your case.

Game server TCP networking sockets - fairness

I'm writing a game server for a turn-based game. One criterion is that the game needs to be as fair for all players as possible.
So far it works like this:
Each client has a TCP connection. (If relevant, the connection is opened via WebSockets)
While running, continually check for incoming socket messages via epoll.
Iterate through clients with sockets ready to read:
Read all messages from the client.
Update the internal game state for each message.
Queue outgoing messages to affected clients.
At the end of each "window" (turn):
Iterate through clients and write all queued outgoing messages to their sockets
My concern for fairness raises the following questions:
Does it matter in which order I send messages to the clients?
Calling write() on all the sockets takes only a fraction of a second for my program, but somewhere in the underlying OS or networking would it make a difference if I sorted the client list?
Perhaps I should be sending to the highest-latency clients first?
Does it matter how I write the outgoing messages to the sockets?
Currently I'm writing them as one large chunk. The size can exceed a single packet.
Would it be faster for the client to begin its processing if I sent messages in smaller chunks than 1 packet?
Would it be better to write 1 packet worth to each client at a time, and iterate over the clients multiple times?
Are there any linux/networking configurations that would bear impact here?
Thanks in advance for your feedback and tips.
Does it matter in which order I send messages to the clients?
Yes, by fractions of milliseconds. If the network interface is available for sending the OS will immediately start sending. Why would it wait?
Perhaps I should be sending to the highest-latency clients first?
I think you should be sending in random order. Shuffle the list prior to sending. This makes it fair. I think your question is valid and this should be addressed.
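A minimal sketch of the shuffle in Python (the question does not say which language the server uses, so treat this as pseudocode for the idea). The `send` callback stands in for whatever writes a client's queued messages to its socket, and is hypothetical.

```python
import random


def flush_in_random_order(clients, send, rng=random):
    """Write queued output to every client, in a fresh random order."""
    order = list(clients)     # copy, so the caller's list is left untouched
    rng.shuffle(order)
    for client in order:
        send(client)
    return order
```

Shuffling once per turn means no client is systematically first or last over the course of a game.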
Currently I'm writing them as one large chunk. [...]
First, realize that TCP is stream-based and that there are no packets/messages at the protocol level. On a physical level data is indeed packetized.
It is not necessary to manually split off packets because clients will read data as it arrives anyway. If a client issues a read, that read will complete immediately once the first packet has arrived. There is no artificial waiting in the OS.
Are there any linux/networking configurations that would bear impact here?
I don't know of any. Be sure to disable Nagle's algorithm (the TCP_NODELAY socket option).
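For reference, Nagle's algorithm is turned off per socket with the `TCP_NODELAY` option. In Python that looks like the following; the same option exists under the same name in the C sockets API. The helper name is made up for the example.

```python
import socket


def disable_nagle(sock: socket.socket) -> None:
    """Send small writes immediately instead of coalescing them."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
```

With the option set, each write is handed to the network right away, so it pairs well with writing a client's whole turn update in a single call.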

How'd I determine where one packet ends and where another one starts

While sending packets across a network, how can one determine where one packet ends and where another starts?
Is sending/receiving acknowledgment one of the ways of doing so?
TCP is a stream-based protocol. That is, it provides a stream interface to the application rather than a packet or message-based one. If using TCP, an application must implement its own method of determining packets or messages. For example, (a) all messages are a fixed size, or (b) each message is prefixed with its size, or (c) there is a special "end-of-record" sequence in the data stream to indicate a message boundary. Search Google for lots of information on how one can implement message boundaries in TCP.
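Option (b) is probably the most common. Here is a minimal Python sketch of it, assuming a 4-byte big-endian length prefix; the prefix size and the names are choices made for the example. The decoder accepts whatever chunks the socket happens to return and hands back only complete messages.

```python
import struct

HEADER = struct.Struct("!I")    # assumed: 4-byte big-endian length prefix


def frame(message: bytes) -> bytes:
    """Prefix a message with its length before writing it to the stream."""
    return HEADER.pack(len(message)) + message


class FrameDecoder:
    """Reassembles length-prefixed messages from arbitrary stream chunks."""

    def __init__(self):
        self.buffer = bytearray()

    def feed(self, data: bytes):
        """Add bytes read from the socket; returns the completed messages."""
        self.buffer += data
        messages = []
        while len(self.buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self.buffer)
            end = HEADER.size + length
            if len(self.buffer) < end:
                break             # message body has not fully arrived yet
            messages.append(bytes(self.buffer[HEADER.size:end]))
            del self.buffer[:end]
        return messages
```

A real server would also want to reject absurdly large length values so that a bad peer cannot make it buffer without limit.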
I assume here that you mean application-level 'packets'.
If you use UDP, you don't need to since it's a message protocol. TCP is a byte streaming protocol, so it cannot send packets, just bytes. If you need to send anything more complex than a byte-stream across TCP, you have to add another protocol on top - HTTP is one such protocol. Text is fairly easy since lines have terminating characters, usually CR/LF/CRLF. Sending non-text messages will require a different protocol.
One approach that is often used with TCP is to connect, stream a protocol-unit, disconnect. This works OK, but slowly because of the huge latency of continually opening and closing TCP connections. HTTP usually works like this in order to serve up web pages to large numbers of users who, if left permanently connected while they viewed pages, would needlessly use up all the server sockets.
Waiting for an application-level ACK from the peer is sometimes necessary if it is absolutely essential that peer receipt is known before the next message is sent, but again, this is slow because of the connection latency. TCP was not designed with this approach in mind.
If the commonly available IP protocols cannot directly provide what you need, you will have to resort to implementing your own.
What sort of 'packet' are you sending?
Rgds,
Martin
With TCP sockets, you just see the datastream where you can receive and send bytes. You have no way of knowing where a packet ends and another begins.
This is a feature (and a problem) of TCP. Most people just read data into a buffer until a linefeed (\n) is seen. Then process the data and wait for the next line. If transferring chunks of binary data, one can first inform the receiver of how many bytes of data are coming.
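The read-until-linefeed approach can be sketched like this in Python. The class name is made up, and stripping a trailing `\r` so that CRLF-terminated lines also work is an assumption about what the peer sends.

```python
class LineDecoder:
    """Splits a byte stream into linefeed-terminated lines."""

    def __init__(self):
        self.buffer = b""

    def feed(self, data: bytes):
        """Add bytes read from the socket; returns the complete lines."""
        self.buffer += data
        # Everything after the last linefeed is an unfinished line: keep it.
        *lines, self.buffer = self.buffer.split(b"\n")
        return [line.rstrip(b"\r") for line in lines]
```

Each call returns only the lines that are complete so far, however the stream happened to be chopped up in transit.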
If packet boundaries are important, you could use UDP but then the packet order might change or some packets might be lost on the way without you knowing.
The newer SCTP protocol behaves much like TCP (lost packets are resent, packet ordering is retained), but with SCTP sockets you can send packets so that the receiver gets exactly the same packet.

how to make SIP protocol more reliable using UDP

We are doing thesis work in which we need to connect 10 SIP-based VoIP phones with each other so that they can call and talk to each other. We also want to add video calling. So another question: are video calls possible with SIP?
SIP already has built in reliability measures, most of which are specifically to cope with unreliable transports such as UDP. You should read the section in the SIP RFC on Transactions to gain an understanding of how it works. One aspect missing from the SIP RFC is reliability for provisional responses and the supplementary RFC3262 deals with that.
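To give a feel for those measures: as I read section 17.1.1.2 of RFC 3261, an INVITE sent over UDP is retransmitted at intervals that start at T1 (500 ms by default) and double each time, and the transaction gives up when 64*T1 has elapsed without any response. The Python sketch below only computes that schedule; the function name is made up, and the RFC itself is the authority on the timers.

```python
def invite_retransmit_times(t1: float = 0.5):
    """Times (seconds after the first send) at which an INVITE is resent.

    Returns (retransmit_times, give_up_time). The interval starts at T1
    and doubles; the transaction times out at 64 * T1.
    """
    give_up = 64 * t1
    times = []
    interval = t1
    elapsed = 0.0
    while elapsed + interval < give_up:
        elapsed += interval
        times.append(elapsed)
        interval *= 2
    return times, give_up
```

So over UDP a lost INVITE is repaired by the SIP stack itself, without the application doing anything.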
SIP is agnostic to the type of sessions, such as voice or video, it sets up so yes it can be used to set up video calls. There are heaps of readily available SIP softphones around that already provide video, one example being x-lite.
To make it reliable you need to emulate the following two features:
For Calls
You need to sequence the packets.
One end needs to tell the other end when a sequenced packet is missing, and you probably want to take jitter into account, i.e., wait a small amount of time before you request a missing packet.
For protocol commands
You need to acknowledge command packets: if a command is not acknowledged, it has to be sent again.