Situation
Imagine a real-time, fast-paced online game server. There are two types of messages: urgent (such as movement and shooting; about 20 per second, small) and normal (such as chat messages; about 1 per minute, large).
I want my server to receive urgent messages separately from normal messages in order to process them faster.
Handlers for urgent and normal messages can be deployed on one physical machine or on separate machines.
I use UDP.
Question
As I understand it, if the handlers are on separate physical machines there is no problem, because there are two independent physical network streams.
But what if the handlers are on one physical machine? Should I create a separate socket for each message type, or is there no way to avoid the problem on one machine?
As I understand it, at the hardware level there is one network buffer that receives all incoming packets from one cable, so multiple sockets don't solve the problem because the streams can't be separated at the hardware level. Or can there be several hardware network buffers on one physical machine?
Multiplexing sockets only works for TCP, since UDP has no concept of connection and packets are treated in a unified form.
If you make multiple TCP sockets, it may make sense depending on the situation. For example, if you create a channel for small data and another for large data, they would be processed differently and large data wouldn't jam the other. (e.g. sending an image doesn't stop other urgent data)
Anyway, when designing this type of communication over UDP, you will end up building your own congestion control. That lets you prioritize the urgent packets by rescheduling them. Please also consider looking for a library that handles this for you.
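A minimal sketch of the rescheduling idea, assuming a single UDP socket and a hypothetical one-byte type tag at the start of each datagram (b"U" for urgent, b"N" for normal; the tags and the class name are illustrative, not from the post). Datagrams are moved out of the socket buffer as soon as possible and placed in a priority queue, so urgent ones are handled before any backlog of normal ones:

```python
import heapq
import itertools
import socket

URGENT, NORMAL = 0, 1                      # lower number = handled first
TAGS = {b"U": URGENT, b"N": NORMAL}        # hypothetical one-byte type tags


class PriorityInbox:
    """Holds received datagrams and hands back urgent ones first."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()    # tie-breaker: FIFO within one priority

    def push(self, datagram: bytes) -> None:
        priority = TAGS.get(datagram[:1], NORMAL)   # unknown tags count as normal
        heapq.heappush(self._heap, (priority, next(self._order), datagram[1:]))

    def pop(self) -> bytes:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


def drain(sock: socket.socket, inbox: PriorityInbox, max_size: int = 65535) -> None:
    """Move every datagram currently waiting in the socket buffer into the inbox."""
    sock.setblocking(False)
    while True:
        try:
            data, _addr = sock.recvfrom(max_size)
        except BlockingIOError:            # nothing left to read right now
            return
        inbox.push(data)
```

A server loop would call drain() and then pop() until the inbox is empty. This only reorders work inside the application; it does not change how the network card or the OS buffers the packets.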
Related
I teach students to develop network applications, both clients and servers. At this moment, we have not yet touched existing protocols such as HTTP, SMTP, etc. The students write very simple programs on top of the plain socket API. Currently I check the students' work manually, but I want to automate this task and create an automated test bench for networking applications. The most interesting topics for testing are:
Breaking TCP segments into small parts and delivering them with a noticeable delay. The reason I need such a test is that students usually just issue a single read/recv call and process the received data without checking that all the necessary data was received. TCP doesn't preserve message boundaries, so in some circumstances several read/recv calls are necessary. The problem is that in most simple network applications (for example, a chat application) messages are small and fit into a single TCP segment, so the issue doesn't appear. My idea is to artificially break messages into several small TCP segments (i.e. a few bytes of data each) so that the problem does appear.
Pausing the data transfer for some time to simulate multiple slow clients and check that the multithreading/async sockets are implemented properly in the students' servers.
Resetting a connection in random moments of time.
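The pitfall behind the first item can be sketched as follows. This is only an illustration, assuming a hypothetical framing in which every message is preceded by a 4-byte big-endian length (the framing and the function names are assumptions, not part of the students' assignments). A correct receiver keeps calling recv() until the whole message has arrived:

```python
import socket
import struct


def recv_exact(sock: socket.socket, count: int) -> bytes:
    """Call recv() repeatedly until exactly `count` bytes have arrived."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:                      # peer closed before the message was complete
            raise ConnectionError("connection closed mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(sock: socket.socket) -> bytes:
    """Read one length-prefixed message (hypothetical 4-byte big-endian header)."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)
```

A program that calls recv() once and assumes it got the whole message works with this code's peer only by luck; a test bench that delivers the bytes a few at a time makes the difference visible.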
I've found several systems that simulate a bad network (dummynet, clumsy, netem). However, they all work at the IP level of the stack, so the OS and its TCP implementation will compensate for the data loss. Such systems can solve task 2, but not tasks 1 and 3. So I think I need to develop my own solution, which will act as a TCP proxy. My questions are:
Are there any libraries or applications that can (at least partially) solve these tasks, so that I can use them as a base for my own solution?
If there are no suitable existing software projects, are there any ideas or approaches for how to do this properly?
From WireShark mailing list - Creating and Modifying Packets:
...There's a "Tools" page on the Wireshark Wiki:
http://wiki.wireshark.org/Tools
which has a "Traffic generators" section:
https://wiki.wireshark.org/Tools#Traffic_generators
which lists some tools that might be useful...
The "Traffic generators" chapter also mentions another collection of traffic generators
If you write your own socket code, you can address all 3 tasks.
Enable the socket's TCP_NODELAY option (which disables the Nagle algorithm for send coalescing) via setsockopt(); then you can send() small fragments of data as you wish, optionally with a delay in between (see #2).
Simply put a delay between your send() calls.
Use setsockopt() to adjust the socket's SO_LINGER and SO_DONTLINGER options to control whether closing the socket performs an abortive or graceful closure, then simply close the socket at some random interval after the connection is established.
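The three steps above can be sketched in Python roughly like this. It is a minimal sketch, not a complete test proxy; the function names, fragment size and delay are illustrative assumptions, and the SO_LINGER structure is packed in the POSIX layout (two C ints), which is an assumption on other platforms:

```python
import socket
import struct
import time


def send_fragmented(sock: socket.socket, data: bytes,
                    fragment_size: int = 3, delay: float = 0.05) -> None:
    """Send `data` in tiny pieces with a pause after each piece."""
    # Disable Nagle's algorithm so the sender does not coalesce small writes.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    for start in range(0, len(data), fragment_size):
        sock.sendall(data[start:start + fragment_size])
        time.sleep(delay)                  # also simulates a slow client


def abort_connection(sock: socket.socket) -> None:
    """Close abortively: linger on with a zero timeout normally sends a RST."""
    # POSIX layout of struct linger: l_onoff=1, l_linger=0 (assumed layout).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()
```

Note that the receiver may still see several fragments in one recv() if it reads slowly, so the delay between sends matters as much as TCP_NODELAY does.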
I am trying to run a simulation to test packet loss in an environment where packet collision is happening. My current setup includes several discrete machines each with their own network interface to send/receive packets. These machines are connected by wifi through an AP. I'm currently using UDP for its ability to broadcast packets on a single address. All machines are listening on a shared IP address, something like 192.168.1.255.
This answer mentions that UDP packets are unreliable, but will they fail because of a collision? Here, I use collision to refer to interference caused by multiple simultaneous transmission. That is, will the simultaneous broadcast of two UDP nodes in the network induce the unreliability I am looking to test? If it's not, will I have to look into changing my network configuration or even start tinkering with kernel code?
If the question is vague, I will say that my end goal involves writing some distributed algorithm that may or may not be resistant to collisions.
I am trying to run a simulation to test packet loss in an environment where packet collision is happening.
You might want to include in your question what you mean by the word collision. I'm going to assume in my answer that you mean it in the traditional sense (i.e. two network endpoints transmitting at approximately the same time and thereby "talking over each other" and garbling each other's transmissions such that neither transmission is successful), and not in any broader sense of "a packet got dropped due to network congestion".
This answer mentions that UDP packets are unreliable, but will they fail because of a collision?
The answer is going to depend entirely on what sort of network hardware you are running your UDP packets over. The UDP protocol itself is hardware-independent, so it's not going to specify anything about whether collisions can occur or not, since there's no way for it to know.
That said, most low-level networking hardware these days has provisions for avoiding collisions (in the sense I mentioned above) -- for example, modern Ethernet switches do a limited amount of active queueing/buffering of packets when necessary (which is much more efficient and reliable than the old 10Mb/sec Ethernet hubs, which basically just electrically connected the Ethernet RX and TX leads of all the endpoints into one big "shared wire", and hoped for the best)
The other commonly used networking-hardware type, Wi-Fi, also has mechanisms to reduce collisions, but that doesn't mean that UDP broadcast over Wi-Fi is a good idea, because it suffers from other issues -- for one thing, the Wi-Fi router has to receive your broadcast packet and rebroadcast it to make sure all other clients can receive it, and worse, it will typically be set to retransmit it at a very slow "legacy" rate, in order to make sure that any ancient Wi-Fi cards out there can still receive the broadcast data. My advice is that if you're going to be using Wi-Fi, keep your broadcast (and multicast) transmissions to an absolute minimum; even sending separate/identical unicast packets to every other client is usually more efficient(!) -- not to avoid collisions, but rather because even a modest amount of broadcast/multicast traffic can bring your Wi-Fi network to a crawl.
UDP is said to be unreliable because it does not guarantee packet delivery and provides no retransmission, flow control, or congestion control. So sending or receiving UDP packets can fail for many reasons: collision, an unreliable physical medium, interference, packets dropped due to router queue overflow, etc.
My goal is to drop as few UDP datagrams as possible. Shocker, I know, ;-)
Here is my circumstance which is a bit different from the general network server/client optimization questions for which I see a lot of discussion:
I am writing socket code for a process which has one singular goal: grab UDP packets received by my Gigabit Ethernet NIC and get them into application RAM with as high a bandwidth as possible (i.e. minimize packet drops/loss).
The network is point-to-point without any firewalls, switches, routers, etc - just a single Cat6 cable connecting the UDP datagram generator/server (an embedded system) with my Windows 7 PC, the datagram receiver/client. I can control the transmitted datagrams-per-second via some controls on the datagram generator. The datagrams are sent to the broadcast address (FF.FF.FF.FF).
I've successfully achieved about 250-300Mbits/sec (30% of the theoretical 1G Ethernet bandwidth) without any packets getting dropped or order-scrambled by using lean-and-mean code based on the built-in Winsock2 commands: select() and recvfrom() as outlined in the sample code for those commands on MSDN.
(I've already adjusted the receive buffer to be very large using the setsockopt() command, and this helped considerably.) But I am still wanting to maximize performance and eager to hear thoughts from this community on whether or not I should expect noticeable gains from trying the following:
Asynchronous I/O, such as boost::asio. From what I gather, this library appears to be more for optimizing applications which have to serve a lot of different sockets to different machines. Should I expect much in terms of single-socket UDP receive performance from switching from Winsock to an asynchronous I/O architecture?
Packet size: If I make the effort to change the packet size by modifying the embedded code that is generating the packets, would it be likely to improve performance by having lots of smaller packets or fewer large/jumbo packets?
Broadcast/multicast/unicast: is one destination address type likely to perform better than others?
Or is 300Mbps about the limit that I should be expecting for actual throughput on a 1G physical link?
Any other recommendations on low-hanging fruit to improve performance, or expectations on what type of performance is feasible.
Thanks all!
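For reference, the receive-buffer adjustment and a lean receive loop look roughly like this when sketched in Python (the post itself uses Winsock from native code; the function names, the 4 MiB request and the 2048-byte datagram limit are illustrative assumptions). The OS may grant a different buffer size than requested, so the sketch reads the value back:

```python
import socket


def open_receiver(port: int, requested_buffer: int = 4 * 1024 * 1024) -> socket.socket:
    """Create a UDP socket and ask the OS for a large receive buffer."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_buffer)
    sock.bind(("", port))                  # listen on all local interfaces
    return sock


def granted_buffer(sock: socket.socket) -> int:
    """Return the receive buffer size the OS actually reports for this socket."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


def receive_datagrams(sock: socket.socket, count: int, max_datagram: int = 2048) -> list:
    """Receive `count` datagrams into one preallocated buffer; return their sizes."""
    buffer = bytearray(max_datagram)       # reused, so no allocation per packet
    view = memoryview(buffer)
    sizes = []
    for _ in range(count):
        sizes.append(sock.recv_into(view)) # blocks until a datagram arrives
    return sizes
```

A real receiver would copy or process each datagram before the buffer is reused; the sketch only records sizes to keep the loop short.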
I'm about to re-architect a real-time system that has been prototyped on a single node and specify how it should be scaled up to multiple nodes (probably never more than 20 of them in any one LAN). Some of the functionality will multiply on a per-node basis, and some of it will remain centralised on a one-per-system basis. There is going to be a need for communication between each node and that central unit (possibly a master node), but not between individual nodes.
Due to the real-time demands of the system, UDP is something that should be considered for that communication. But... it is almost always described as unreliable. Is this always the case? Does it not depend on the scale of the network, the data load on the network and the way the protocol is used?
For example, suppose I have a central unit which regularly polls through each node by addressing a UDP message to it, and each node immediately responds with its data via UDP. There is no other communication on the (isolated) network. Suppose there is also some mechanism to ensure there are never any collisions (e.g. all nodes have a maximum transmission length for their responses to a poll message, and the latencies are nailed down to known levels). Is there any (hidden) reason in a simple and structured network like this that you would ever fail to transmit/receive every last UDP packet and have near 100% reliability?
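The polling scheme described above could be sketched like this. It is a minimal sketch under stated assumptions: the "POLL <n>" request format, the function name and the 50 ms timeout are all hypothetical. The point is that the central unit never assumes a reply arrived; a missing reply is recorded as None:

```python
import socket


def poll_nodes(sock: socket.socket, nodes, timeout: float = 0.05,
               max_size: int = 2048) -> dict:
    """Poll each node in turn; map its address to the reply, or None if none came."""
    sock.settimeout(timeout)
    replies = {}
    for seq, node in enumerate(nodes):
        sock.sendto(b"POLL %d" % seq, node)         # hypothetical request format
        try:
            data, sender = sock.recvfrom(max_size)
        except (socket.timeout, ConnectionResetError):
            replies[node] = None                    # lost request or lost reply
            continue
        # Ignore datagrams that did not come from the node just polled.
        replies[node] = data if sender == node else None
    return replies
```

What to do with a None entry (poll again, reuse the last value, raise an alarm) is application-specific.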
EDIT: the detail of this question suffers from confusion around what "unreliable" means, and whether it is intended to apply only to UDP, or to the system in which UDP is employed. I have chosen to leave this confusion in the question, because looking back over a lot of material on UDP, I can see that this confusion might be very common, and that answers which highlight that confusion and overcome it might be valuable.
The key is, UDP does not make any guarantees. There are many reasons why datagrams might go undelivered:
Sender host buffers fill up
Cosmic rays flip bits somewhere along the way, causing a checksum mismatch and the datagram to be discarded
Electromagnetic interference corrupts the signal momentarily
A network cable gets unplugged for a moment
A hub or switch loses power for a moment
A switch's buffers fill up
Receiving host buffers fill up
If any of these things (or many others) occurs, a datagram may go undelivered. UDP will make no attempt to detect this or to re-deliver it.
Yes. Every layer is potentially unreliable, starting with the electrical signalling across your Ethernet cable. (Ever jostled one of those plugs? You can see it in Wireshark logs.) Collisions are virtually impossible to avoid. And in case of congestion, your protocol stack may decide to drop UDP packets.
But all that's rather beside the point. UDP is unreliable, but that doesn't mean it can't be relied on. Plenty of mission-critical applications run over UDP. You just need to understand the unreliability and account for it.
Unreliable does not mean it will definitely fail. It only means that it does not care about transport problems and thus will not make any guarantees that transmission will be successful. Let's compare some aspects of UDP against TCP.
UDP is packet based, TCP stream based. This has not much to do with reliability.
Packets may arrive in a different order than they were sent. UDP does not care and will deliver the packets to the application in the order they arrived. In TCP, data carries a sequence number, so the receiver's operating system will detect reordering and forward the data to the application in the correct order. Reordering is usually not an issue when you have a direct connection between client and server, but it can happen in wide networks like the internet.
Packets may get lost due to router or switch congestion, overload of the sending or receiving system, or other causes. This can also happen in local networks with heavy traffic, or if the receiving system is unable to cope with the amount of data, even for a short time. With UDP the data will be lost. TCP instead will detect lost packets and retransmit them, and will even slow down the traffic to match the speed the network and endpoints can handle, and thus lose fewer packets in the future.
Packets might get duplicated. Again TCP will detect this due to the sequence number but UDP will not and thus transmit the duplicate packet to the application.
Packets might get corrupted. Both TCP and UDP use the same kind of 16-bit checksum, which catches small errors but can miss larger ones.
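To illustrate the last point, here is a sketch of the 16-bit one's-complement checksum described in RFC 1071, which is the kind used by TCP and UDP. The sketch covers only a byte string; the real protocols also include a pseudo-header and their own header fields, which are left out here:

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                    # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # add each 16-bit big-endian word
    while total >> 16:                     # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

Because the sum does not depend on the order of the words, internet_checksum(b"ABCD") equals internet_checksum(b"CDAB"): two 16-bit words swapped in transit would go undetected, while a single flipped bit changes the result.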
Applications using UDP usually do not need the reliability of TCP, or do not need all of it. For instance, with real-time audio and video, packet loss is acceptable but duplicates and reordering are not. Thus the RTP protocol carries its own sequence number (and a timestamp) to detect these cases. Also, RTP is often accompanied by the RTCP protocol, which sends statistics about packet loss back to the peer and thus makes adaptation of the connection speed possible.
If you want reliable UDP, try looking at the ENet library.
http://enet.bespin.org/
Unreliability with regard to UDP is different from unreliability in general. Also, UDP and alternatives to it (e.g. TCP) are always only ever components or single layers in a wider system. This can lead to some confusion about what "unreliable" means.
UDP is a transport layer network protocol. The transport layer is responsible for getting data from one point on the network to another specific point on the network. In that context, UDP is described as an "unreliable" protocol because it makes no guarantees about whether the data sent will actually arrive. In contrast, TCP is a "reliable" transport layer protocol because if data goes missing or is corrupted the first time it is sent, the protocol itself has mechanisms to resend the data and ensure it arrives... eventually.
But UDP is not some sloppy "maybe, maybe not - let me think about it and screw you around" protocol. It does what it is specified to do, and is reliable (general sense) at doing it... as well as reliable (general sense) in failing in predictable ways. If you take these failure modes into account elsewhere, UDP can be a component of an overall very reliable system.
For example, by restricting network topology and using UDP to transport higher level protocols, the GigE Vision standard specifies a highly reliable system with high data transfer rates and real-time response whose transport level communications is dominated by UDP traffic.
Historically, the major source of unreliable packet transport was packet collisions due to two sources attempting to transmit simultaneously on a single channel. In modern networks, each node is typically connected on a full duplex link to a network switch, making collisions impossible on that link, and consequently making modern networks much more reliable (in all senses) than was the case when UDP was first designed.
No networking technology currently available can be made 100% reliable... but let's be practical rather than pedantic, because potential unreliability and actual unreliability are a lot like shark attacks - they tend to occur far more in people's minds than in reality.
Some material on UDP makes it sound almost like the people who designed UDP did it just to annoy people - that unreliability was deliberately engineered in. This is not the case, and it is unhelpful to think of it in these terms. It is far better to focus on what UDP does and does not do in comparison to alternatives (e.g. see this comparison between TCP and UDP... which nonetheless lists "unreliability" as a key feature of UDP).
In reality, when there is data to be transmitted, that can be transmitted, it is transmitted; when there is data that can be received, it is received. Likewise, if you transmit packets 1, 2 then 3 directly to an endpoint, they will almost certainly be received as packets 1, 2 and 3 in order (assuming no failures in lower network layers, and that incoming data is buffered in a FIFO as is customary, but not mandatory). You can get a lot of reliability out of this, depending on how you use it.
However, if you transmit packets via multiple routes, all bets are off - "unreliability" of packet order can occur. And if you flood the available buffers, unreliability via dropping packets will occur. And if you allow nodes to transmit at any time (asynchronous), then you will get unreliability through packet collisions. But in the "simple and structured" (and also small and synchronous) LAN described, you may be able to either avoid this, or detect its occurrence (e.g. by sending an incrementing counter value in each packet), which will let you compensate in an application-specific way.
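The incrementing-counter idea can be sketched as follows, assuming each datagram starts with a hypothetical 4-byte big-endian counter (the layout and the class name are illustrative). Counter wraparound is not handled in this sketch:

```python
import struct


class SequenceTracker:
    """Classifies datagrams by the counter carried in their first 4 bytes."""

    def __init__(self):
        self.expected = 0                  # next counter value we hope to see
        self.lost = 0                      # counter values skipped so far
        self.late = 0                      # reordered or duplicated datagrams

    def observe(self, datagram: bytes) -> str:
        (seq,) = struct.unpack("!I", datagram[:4])
        if seq == self.expected:
            self.expected += 1
            return "in-order"
        if seq > self.expected:            # one or more datagrams are missing
            self.lost += seq - self.expected
            self.expected = seq + 1
            return "gap"
        self.late += 1                     # older than expected: late or duplicate
        return "late-or-duplicate"
```

The receiver calls observe() on every datagram; how it compensates for a "gap" (ignore it, re-poll, interpolate) depends on the application.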
In cases where the power goes off (perhaps momentarily), or cosmic rays strike, or people trip on loose cables causing an unacceptable level of "unreliability"... then don't blame UDP - blame the engineer(s) whose design left the system susceptible to these things.
All things considered, in the LAN described, you might reasonably expect to be able to engineer a system based on UDP so as to never lose more than one packet in every few million, or billion, or even astronomically better than this - but it will depend on specifics, and only you can know if your application can tolerate the quantity and quality of unreliable comms that results in your case.
I'm writing a game server for a turn-based game. One criterion is that the game needs to be as fair as possible for all players.
So far it works like this:
Each client has a TCP connection. (If relevant, the connection is opened via WebSockets)
While running, continually check for incoming socket messages via epoll.
Iterate through clients with sockets ready to read:
Read all messages from the client.
Update the internal game state for each message.
Queue outgoing messages to affected clients.
At the end of each "window" (turn):
Iterate through clients and write all queued outgoing messages to their sockets
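For concreteness, the loop described above looks roughly like this when sketched in Python with the selectors module (which uses epoll on Linux). This is not the actual server code: handle_message is a hypothetical callback that updates the game state and returns (recipient, payload) pairs, the sockets are assumed to be blocking, and WebSocket framing is left out. It needs Python 3.8 or later:

```python
import selectors
import socket
import time


def run_window(sel, outboxes, handle_message, window_seconds):
    """Collect input for one turn window, then flush every client's queued output."""
    deadline = time.monotonic() + window_seconds
    while (remaining := deadline - time.monotonic()) > 0:
        for key, _events in sel.select(timeout=remaining):
            client = key.fileobj
            data = client.recv(4096)
            if not data:                   # client disconnected
                sel.unregister(client)
                outboxes.pop(client, None)
                client.close()
                continue
            # Queue whatever the (hypothetical) handler wants to send out.
            for recipient, payload in handle_message(client, data):
                if recipient in outboxes:
                    outboxes[recipient].append(payload)
    # End of the window: write each client's queued messages as one chunk.
    for client, queue in outboxes.items():
        if queue:
            client.sendall(b"".join(queue))
            queue.clear()
```

Here outboxes maps each client socket to a list of pending payloads, and every socket in it is registered with the selector for EVENT_READ.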
My concern for fairness raises the following questions:
Does it matter in which order I send messages to the clients?
Calling write() on all the sockets takes only a fraction of a second for my program, but somewhere in the underlying OS or networking would it make a difference if I sorted the client list?
Perhaps I should be sending to the highest-latency clients first?
Does it matter how I write the outgoing messages to the sockets?
Currently I'm writing them as one large chunk. The size can exceed a single packet.
Would it be faster for the client to begin its processing if I sent messages in smaller chunks than 1 packet?
Would it be better to write 1 packet worth to each client at a time, and iterate over the clients multiple times?
Are there any linux/networking configurations that would bear impact here?
Thanks in advance for your feedback and tips.
Does it matter in which order I send messages to the clients?
Yes, by fractions of a millisecond. If the network interface is available for sending, the OS will start sending immediately. Why would it wait?
Perhaps I should be sending to the highest-latency clients first?
I think you should be sending in random order. Shuffle the list prior to sending. This makes it fair. I think your question is valid and this should be addressed.
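A minimal sketch of the shuffle, assuming the clients are kept in some iterable (the function name and the optional rng parameter are illustrative). A fresh random order is drawn for every window, so no client is systematically first or last:

```python
import random


def fair_send_order(clients, rng=random):
    """Return the clients in a fresh random order without modifying the input."""
    order = list(clients)
    rng.shuffle(order)                     # in-place shuffle of the copy
    return order
```

At the end of each window the server would iterate over fair_send_order(clients) instead of over the client list itself when writing to the sockets.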
Currently I'm writing them as one large chunk. [...]
First, realize that TCP is stream-based and that there are no packets/messages at the protocol level. On a physical level data is indeed packetized.
It is not necessary to manually split off packets because clients will read data as it arrives anyway. If a client issues a read, that read will complete immediately once the first packet has arrived. There is no artificial waiting in the OS.
Are there any linux/networking configurations that would bear impact here?
I don't know. Be sure to disable Nagle's algorithm (the TCP_NODELAY socket option).