Would we see any speedup using ZeroMQ instead of TCP sockets if the two processes communicating are on the same machine?

I understand that 0MQ is supposed to be faster than TCP Sockets in a clustered environment and I can see where that would be the case (I think that's what they're referring to when they say "Faster than TCP, for clustered products and supercomputing" on the 0MQ website). However, will I see any kind of speedup using 0MQ instead of TCP sockets to communicate between two processes running on the same machine?

Well, the short version is give it a try.
The slightly longer version is that writing TCP socket code correctly is hard; there are many subtle ways to get it wrong, whereas 0MQ guarantees that a message is delivered in its entirety or not at all. It is also written by experts in network sockets, which, with the best will in the world, you probably are not, and they use a few advanced tricks to speed things along.

You are not actually running on one machine, because the VM is treated as a separate machine. This means TCP traffic has to run through the whole network stack and cannot take the shortcuts it takes when you communicate between processes on a single machine.
However, you could try UDP multicast under ZeroMQ to see if that speeds up your application. UDP is less reliable on a wide-area network, but in the closed environment of a VM talking to its host, you can safely skip all the TCP reliability machinery.

I guess IPC should be faster than TCP. If you are willing to move to a single process, INPROC is definitely going to be much faster.
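For what it's worth, switching ZeroMQ transports only changes the endpoint string, so it is cheap to benchmark all three yourself. A minimal libzmq sketch (the endpoint names are made up):

#include <assert.h>
#include <zmq.h>

int main(void)
{
    void *ctx = zmq_ctx_new();
    void *rep = zmq_socket(ctx, ZMQ_REP);

    /* Same API for every transport; only the endpoint string changes. */
    int rc = zmq_bind(rep, "ipc:///tmp/example.sock"); /* Unix domain socket */
    /* rc = zmq_bind(rep, "tcp://127.0.0.1:5555"); */  /* loopback TCP */
    /* rc = zmq_bind(rep, "inproc://example"); */      /* same process only */
    assert(rc == 0);

    zmq_close(rep);
    zmq_ctx_destroy(ctx);
    return 0;
}

ipc:// is typically implemented on top of Unix domain sockets, so it skips the TCP/IP stack; inproc:// skips the kernel entirely, but both ends must share one ZMQ context in a single process.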

I think (I have not tested this) that the answer is no: ZMQ most likely uses the same standard C library calls underneath and adds some message headers of its own, so at best it can match raw TCP.
The same applies to UDP.
The same applies to IPC pipes.
ZMQ could be just as fast, but since it adds headers it is not likely to be faster.
It could be a different story if you actually need some sort of header, for message size or type say, and ZMQ has implemented it better than you would. But I digress.

Related

ZMQ performance in comparison to UDP multicast

What is the performance (I mean latency while sending all messages, and maximum fan-out rate for many messages to many receivers) of ZMQ in comparison to "simple" UDP and its multicast implementation?
Assume I have one static 'sender', which has to send messages to many, many 'receivers'. The PUB/SUB pattern with the simple TCP transport seems very comfortable for such a task: ZMQ does many things without our effort, and one ZMQ socket is enough to handle even numerous connections.
But what I am afraid of is this: ZMQ could create many TCP sockets in the background, even if we don't "see" that, and that could add latency. If instead I create a "common" UDP socket and transmit all my messages with multicast, there is only one socket (the multicast one), so I would expect no such latency problem. To be honest, I would like to stay with ZMQ and PUB/SUB over TCP. Are my concerns valid?
I don't think you can really compare them in that way. It depends on what is important to you.
TCP provides reliability, and as the sender you can choose whether loss matters more than latency by setting your block/retry options on send.
Multicast saves network bandwidth, especially if your network has multiple segments/routers.
Other options in ZeroMQ:
Use zmq_proxy instances to split/share the load of TCP connections
Use PUB/SUB with pgm/epgm, which is just a layer over multicast (I use this; see the sketch after this list)
Use the new radio-dish pattern (with this you have limited subscription options)
http://api.zeromq.org/4-2:zmq-udp
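As a rough illustration of the pgm/epgm option above, here is a minimal publisher sketch; the interface name, group address, port and rate are placeholders, and libzmq must be built with OpenPGM support for this to work:

#include <assert.h>
#include <zmq.h>

int main(void)
{
    void *ctx = zmq_ctx_new();
    void *pub = zmq_socket(ctx, ZMQ_PUB);

    /* One multicast stream serves every subscriber in the group. */
    int rate = 10000; /* multicast rate limit in kbits/s (placeholder) */
    zmq_setsockopt(pub, ZMQ_RATE, &rate, sizeof(rate));

    int rc = zmq_connect(pub, "epgm://eth0;239.192.1.1:5555");
    assert(rc == 0);

    /* A real publisher would give subscribers time to join before
       sending (the usual PUB/SUB slow-joiner caveat applies). */
    zmq_send(pub, "tick", 4, 0);

    zmq_close(pub);
    zmq_ctx_destroy(ctx);
    return 0;
}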
Behind the scenes, a TCP "socket" is identified (simplified) by both the "source" and the "destination", so there will be a socket for each peer you're communicating with, in each direction (for a fuller description of how a socket is set up and identified, see here and here). This has nothing to do with ZMQ; ZMQ will set up exactly as many sockets as TCP requires. You can optimize this if you choose to use multicast, but ZMQ does not optimize this for you by using multicast behind the scenes, except for PUB/SUB; see here.
Per James Harvey's answer, ZMQ has added support for UDP, so you can use that if you do not need or want the overhead of TCP. But based on what you've said, you'd like to stay with TCP, which is likely the better choice for most applications (i.e. we often unconsciously expect the reliability of TCP when we're designing our applications, and should only choose UDP if we know what we're doing).
Given that, you should assume that ZMQ is as efficient as TCP allows it to be with regard to low-level socket management. In your case, with PUB/SUB, you're already using multicast. I think you're good.

Using a socket to communicate between processes on the same host: is it OK to go with UDP?

I want to make sure: if I use UDP within a single host, should I care about packet loss?
Yes, you should care about reliability when using UDP. Even if you use it on localhost, there is no guarantee that packets are not lost, because the protocol specification does not ensure this. It also depends on the implementation of UDP in the operating system; it may behave differently on different operating systems as far as reliability is concerned, because the UDP specification defines no rule here.
Also, the order of delivery in UDP is not ensured, so you should take care of that as well while using UDP for IPC.
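Incidentally, on localhost the usual loss mechanism is the socket receive buffer overflowing when the reader falls behind, and enlarging it is a common mitigation. A sketch, where fd is assumed to be an existing UDP socket and the size is an arbitrary example:

#include <stdio.h>
#include <sys/socket.h>

/* fd is an already-created UDP socket descriptor. */
int bufsize = 4 * 1024 * 1024; /* 4 MB; arbitrary example value */
if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bufsize, sizeof(bufsize)) < 0)
    perror("setsockopt(SO_RCVBUF)");

Note that the kernel may cap the value (net.core.rmem_max on Linux), and a bigger buffer only delays, not prevents, drops if the reader is persistently slower than the writer.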
I hope it helps.

UDP for interprocess communications

I have to implement an IPC mechanism (sending short messages) between Java/C++/Python processes running on the same system. One way to implement it is with sockets over TCP. This requires maintaining connections and other associated activities.
Instead, I am thinking of using UDP, which does not require a connection, so I can just send messages.
My question is: does UDP on the same machine (for IPC) still have the same disadvantages it has when communicating across machines (like unreliable packet delivery and out-of-order packets)?
Yes, it is still unreliable. For local communication, try named pipes or shared memory.
Edit:
I don't know the requirements of your application, but did you consider something like MPI (although Java is not well supported...) or Thrift? ( http://thrift.apache.org/ )
Local UDP is still unreliable, but the major advantage is UDP multicast. You can have one data publisher and many data subscribers. The kernel does the job of delivering a copy of the datagram to each subscriber for you.
Unix local datagram sockets, on the other hand, are required to be reliable but they do not support multicast.
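To make the multicast point concrete, here is a minimal subscriber sketch; the group address and port are placeholders. Every socket that joins the group receives its own copy of each datagram from the kernel:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(30001); /* placeholder port */
    bind(fd, (struct sockaddr *)&addr, sizeof(addr));

    /* Join the multicast group; the kernel then delivers a copy of
       every group datagram to this socket. */
    struct ip_mreq mreq;
    mreq.imr_multiaddr.s_addr = inet_addr("239.0.0.1"); /* placeholder group */
    mreq.imr_interface.s_addr = htonl(INADDR_ANY);
    setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));

    char buf[1500];
    ssize_t n = recv(fd, buf, sizeof(buf), 0);
    if (n > 0)
        printf("got %zd bytes\n", n);

    close(fd);
    return 0;
}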
Local UDP is more unreliable than UDP on a network, like 50%+ packet-drop unreliable. It is a terrible choice; kernel developers have attributed the poor quality to lack of demand.
I would recommend investigating message-based middleware, preferably with a BSD-socket-compatible interface for an easy learning curve. A suggestion would be ZeroMQ, which includes C++, Java and Python bindings.
Local UDP is both still unreliable and sometimes blocked by firewalls. We faced this in our MsgConnect product, which uses local UDP for inter-thread communication. BTW, MsgConnect could be an option for your task so that you don't need to deal with sockets. Unfortunately there's no Python binding, but "native" C++ and Java implementations exist.

How efficient are BSD sockets for writing a server-client application on the iPhone?

I am creating a server-client application for the iPhone. I want two applications on the same network to communicate.
For this functionality I am planning to use sockets. How efficient are BSD sockets on the iPhone?
Is there any other option available to implement the same functionality?
Thanks,
Jim.
See this thread on the iPhone Dev SDK website.
The CF networking stuff is a bit confusing and hard to wrap your head around. But, it's just a set of functions that use BSD sockets and integrate them with the run loop so you don't have to create threads. You can still use BSD sockets yourself.
Basically, the thread points out multiple libraries/frameworks which integrate well with the iPhone environment, and using any of them instead of straight BSD sockets probably won't make any significant performance difference. Unless you're really comfortable with low-level socket programming, you're probably better off with one of the libraries.
Don't do premature optimization - use whatever socket interface you are most comfortable with and which will help you get the job done quickly and produce clear, maintainable code.
EDIT
In response to Jim's question below:
Yes. There are a few factors that determine the system-wide and per-process socket limits. Take a look at this article for a discussion of these issues. iPhone OS and Linux are both Unix-based OSs, so they probably share some of these system-admin-related socket limitations, but you'll have to look up the system-specific admin details.
Second, there are limits imposed by the architecture of UDP and TCP. Basically, UDP and TCP are both limited to 2^16 listening ports per machine IP address, since a listening socket is defined by a fixed 32-bit IP address and a 16-bit port number. However, since a connected socket is defined by the four-tuple [src IP, src port, dst IP, dst port], the number of connected sockets you can theoretically have on a single machine IP is significantly higher: 2^32 source addresses × 2^16 source ports × 2^16 destination ports gives on the order of 2^64 combinations, although practically your OS would barf way before you hit that limit.

What do you use when you need reliable UDP?

If you have a situation where a TCP connection is potentially too slow and a UDP 'connection' is potentially too unreliable what do you use? There are various standard reliable UDP protocols out there, what experiences do you have with them?
Please discuss one protocol per reply and if someone else has already mentioned the one you use then consider voting them up and using a comment to elaborate if required.
I'm interested in the various options here, of which TCP is at one end of the scale and UDP is at the other. Various reliable UDP options are available and each brings some elements of TCP to UDP.
I know that TCP is often the correct choice, but having a list of the alternatives is useful in helping one come to that conclusion. Things like ENet, RUDP, etc. that are built on UDP have various pros and cons; have you used them, and what are your experiences?
For the avoidance of doubt there is no more information, this is a hypothetical question and one that I hoped would elicit a list of responses that detailed the various options and alternatives available to someone who needs to make a decision.
What about SCTP? It's a standard protocol from the IETF (RFC 4960).
It has a chunking capability, which could help with speed.
Update: a comparison between TCP and SCTP shows that the performance is comparable unless two interfaces can be used.
Update: a nice introductory article.
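For a sense of how little changes at the API level, a minimal sketch of opening a one-to-one style SCTP socket in C; this assumes kernel SCTP support (e.g. the sctp module on Linux):

#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    /* One-to-one style SCTP socket, used much like a TCP socket. */
    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
    if (fd < 0) {
        perror("socket(IPPROTO_SCTP)");
        return 1;
    }
    /* From here the familiar bind/listen/accept or connect calls apply,
       with SCTP's message boundaries and multi-streaming underneath. */
    close(fd);
    return 0;
}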
It's difficult to answer this question without some additional information on the domain of the problem.
For example, what volume of data are you using? How often? What is the nature of the data? (e.g. is it unique, one-off data, or a stream of sampled data?)
What platform are you developing for? (e.g. desktop/server/embedded)
To determine what you mean by "too slow", what network medium are you using?
But in (very!) general terms, I think you're going to have to try really hard to beat TCP for speed, unless you can make some hard assumptions about the data that you're trying to send.
For example, if the data you're trying to send is such that you can tolerate the loss of a single packet (e.g. regularly sampled data where the sampling rate is many times higher than the bandwidth of the signal), then you can probably sacrifice some reliability of transmission by ensuring that you can detect data corruption (e.g. through the use of a good CRC).
But if you cannot tolerate the loss of a single packet, then you're going to have to start introducing the kinds of reliability techniques that TCP already has. And, without putting in a reasonable amount of work, you may find that you're starting to build those elements into a user-space solution, with all of the inherent speed issues that go with it.
ENET - http://enet.bespin.org/
I've worked with ENet as a reliable UDP protocol and written an asynchronous-sockets-friendly version for a client of mine who is using it in their servers. It works quite nicely, but I don't like the overhead that the peer-to-peer ping adds to otherwise idle connections; when you have lots of connections, pinging all of them regularly is a lot of busy work.
ENet gives you the option to send multiple 'channels' of data and to mark the data sent as unreliable, reliable or sequenced. It also includes the aforementioned peer-to-peer ping, which acts as a keep-alive.
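For flavour, a minimal ENet client sketch; the address, port and channel layout are made up, and error handling is omitted:

#include <enet/enet.h>
#include <stdio.h>

int main(void)
{
    if (enet_initialize() != 0)
        return 1;

    /* Client-side host: at most 1 peer, 2 channels, no bandwidth limits. */
    ENetHost *client = enet_host_create(NULL, 1, 2, 0, 0);

    ENetAddress address;
    enet_address_set_host(&address, "127.0.0.1"); /* placeholder server */
    address.port = 30003;                         /* placeholder port */
    ENetPeer *peer = enet_host_connect(client, &address, 2, 0);

    /* Wait for the connection handshake to complete. */
    ENetEvent event;
    if (enet_host_service(client, &event, 5000) > 0 &&
        event.type == ENET_EVENT_TYPE_CONNECT) {
        /* Per-packet flags choose reliable vs unreliable delivery. */
        ENetPacket *packet =
            enet_packet_create("state", 6, ENET_PACKET_FLAG_RELIABLE);
        enet_peer_send(peer, 0, packet); /* send on channel 0 */
        enet_host_flush(client);
    }

    enet_host_destroy(client);
    enet_deinitialize();
    return 0;
}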
We have some defense-industry customers that use UDT (UDP-based Data Transfer) (see http://udt.sourceforge.net/) and are very happy with it. I see that it has a friendly BSD license as well.
Anyone who decides that the list above isn't enough and that they want to develop their OWN reliable UDP should definitely take a look at the Google QUIC spec as this covers lots of complicated corner cases and potential denial of service attacks. I haven't played with an implementation of this yet, and you may not want or need everything that it provides, but the document is well worth reading before embarking on a new "reliable" UDP design.
A good jumping off point for QUIC is here, over at the Chromium Blog.
The current QUIC design document can be found here.
RUDP - Reliable User Datagram Protocol
This provides:
Acknowledgment of received packets
Windowing and congestion control
Retransmission of lost packets
Overbuffering (Faster than real-time streaming)
It seems slightly more configurable with regard to keep-alives than ENet, but it doesn't give you as many options (i.e. all data is reliable and sequenced, not just the bits that you decide should be). It looks fairly straightforward to implement.
As others have pointed out, your question is very general, and whether or not something is 'faster' than TCP depends a lot on the type of application.
TCP is generally as fast as it gets for reliable streaming of data from one host to another. However, if your application sends a lot of small bursts of traffic and waits for responses, UDP may be more appropriate for minimizing latency.
There is an easy middle ground. Nagle's algorithm is the part of TCP that coalesces many small outgoing writes into fewer, larger packets, reducing per-packet overhead (and congestion from floods of tiny packets) at the cost of some added latency.
If you need the reliable, in-order delivery of TCP, and also the fast response of UDP for small messages, and don't need the batching of small writes, you can disable Nagle's algorithm:
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/tcp.h>  /* TCP_NODELAY */

/* sock_fd is an existing, connected TCP socket. */
int opt = 1;  /* any nonzero value enables TCP_NODELAY */
if (setsockopt(sock_fd, IPPROTO_TCP, TCP_NODELAY, &opt, sizeof(opt)) < 0)
    perror("Error disabling Nagle's algorithm");
If you have a situation where a TCP connection is potentially too slow and a UDP 'connection' is potentially too unreliable what do you use? There are various standard reliable UDP protocols out there, what experiences do you have with them?
The key word in your sentence is 'potentially'. I think you really need to prove to yourself that TCP is, in fact, too slow for your needs if you need reliability in your protocol.
If you want to get reliability out of UDP then you're basically going to be re-implementing some of TCP's features on top of UDP which will probably make things slower than just using TCP in the first place.
The DCCP protocol, standardized in RFC 4340, "Datagram Congestion Control Protocol", may be what you are looking for.
It appears to be implemented in Linux.
RFC 5405, "Unicast UDP Usage Guidelines for Application Designers", may also be useful to you.
RUDP. Many socket servers for games implement something similar.
Did you consider compressing your data?
As stated above, we lack information about the exact nature of your problem, but compressing the data before transport could help.
It is hard to give a universal answer to the question but the best way is probably not to stay on the line "between TCP and UDP" but rather to go sideways :).
A bit more detailed explanation:
If an application needs a confirmation response for every piece of data it transmits, then TCP is pretty much as fast as it gets (especially if your messages are much smaller than the optimal MTU for your connection). If you need to send periodic data that expires the moment you send it out, then raw UDP is the best choice, for many reasons but not particularly for speed.
Reliability is a more complex question; it is somewhat relative in both cases and always depends on the specific application. For a simple example, if you unplug the internet cable from your router, then good luck reliably delivering anything with TCP. And what is even worse is that if you don't do something about it in your code, your OS will most likely just block your application for a couple of minutes before indicating an error, and in many cases this delay is just not acceptable either.
So the question with conventional network protocols is generally not really about speed or reliability but rather about convenience. It is about getting some features of TCP (automatic congestion control, automatic transmission unit size adjustment, automatic retransmission, basic connection management, ...) while also getting at least some of the important and useful features it misses (message boundaries - the most important one, connection quality monitoring, multiple streams within a connection, etc) and not having to implement it yourself.
From my point of view SCTP now looks like the best universal choice but it is not very popular and the only realistic way to reliably pass it across the Internet of today is still to wrap it inside UDP (probably using sctplib). It is also still a relatively basic and compact solution and for some applications it may still be not sufficient by itself.
As for the more advanced options, in some of our projects we used ZeroMQ and it worked just fine. It is much more of a complete solution, not just a network protocol (under the hood it supports TCP, UDP, a couple of higher-level protocols and some local IPC mechanisms to actually deliver messages). A couple of releases ago its initial developer switched his attention to his new nanomsg and, currently, the newer NNG libraries. These are not as thoroughly developed and tested and are not very popular, but someday that may change. If you don't mind the CPU overhead and some network bandwidth loss, then some of these libraries might work for you. There are some other network-oriented message-exchange libraries available as well.
You should check out MoldUDP, which has been around for decades and is used by Nasdaq's ITCH market data feed. Our messaging system CoralSequencer uses it to implement a reliable multicast event stream from a central process.
Disclaimer: I'm one of the developers of CoralSequencer
The best way to achieve reliability using UDP is to build the reliability into the application program itself (for example, by adding acknowledgment and retransmission mechanisms), as in the sketch below.
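To illustrate the idea, a minimal stop-and-wait sender sketch in C (the peer address, port, timeout and message format are all made up): the message carries a sequence number and is retransmitted until a matching acknowledgment arrives or attempts run out.

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    /* Retransmit if no ACK arrives within 200 ms (arbitrary value). */
    struct timeval tv = { 0, 200 * 1000 };
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    struct sockaddr_in peer;
    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_addr.s_addr = inet_addr("127.0.0.1"); /* placeholder peer */
    peer.sin_port = htons(30002);                  /* placeholder port */

    uint32_t seq = 1; /* sequence number identifying this message */
    char msg[64];
    int len = snprintf(msg, sizeof(msg), "%u:hello", seq);

    for (int attempt = 1; attempt <= 5; attempt++) {
        sendto(fd, msg, len, 0, (struct sockaddr *)&peer, sizeof(peer));

        uint32_t ack;
        if (recv(fd, &ack, sizeof(ack), 0) == (ssize_t)sizeof(ack) &&
            ntohl(ack) == seq) {
            printf("delivered after %d attempt(s)\n", attempt);
            close(fd);
            return 0; /* acknowledged */
        }
        /* Timeout or wrong ACK: fall through and retransmit. */
    }
    fprintf(stderr, "gave up: peer never acknowledged seq %u\n", seq);
    close(fd);
    return 1;
}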