What is the fastest (lowest latency) messaging queue solution for sending a message from host A to host B? - real-time

OK folks, NOT counting special interconnects (InfiniBand), kernel bypass or any other fancy stuff, just plain TCP/IP (TCP/UDP over Ethernet) networking: what is the fastest messaging queue implementation that can deliver a message from host A to host B?
Let's assume 10 Gigabit Ethernet cards connecting both machines, with up-to-date architecture and CPUs. What latency in microseconds are we talking about here for a 1472-byte message (MTU minus IP/UDP headers)?
As @Sachin described very well, what I am looking for is the messaging queue and the latency number for sending a message from A to B as shown below:
Host A <-------TCP-------> Messaging queue (process, route, etc) <-------TCP-------> Host B

If you do not require a broker in between, 0MQ gave us the best performance (you will need to test the numbers on your own platform and use case). If you use a broker in between, ActiveMQ and RabbitMQ both performed in the same range. Using Redis as a messaging server did not hold up for us.
If you are not using a messaging server, options such as Netty, JGroups, etc. might be useful (I am not sure about your programming language).
You could look into reliable UDP as well if you are going with straight socket connectivity.
Hope it helps.

The lower bound would be at least 2 TCP connections and the routing time inside the messaging queue server (meaning the delays associated with these)
Host A <-------TCP-------> Messaging queue (process, route, etc) <-------TCP-------> Host B
Of course, if you build in redundancy, fault tolerance, etc., you will certainly end up well above this lower bound.

It looks like you are talking about a UDP-based MQ, because you mentioned the MTU. For UDP-based MQs this time is usually measured as the time required to publish a message and see it come back on the message bus. So it is a round-trip time, not a one-way time as you described. This can usually be done in less than 6 microseconds, depending of course on your choice of LAN.
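A minimal sketch of how such a round-trip time could be measured, assuming plain UDP sockets on the loopback interface rather than a real message bus. The function names and the echo thread are illustrative, and the numbers it produces reflect the local kernel and Python overhead, not a 10 Gigabit LAN.

```python
import socket
import threading
import time

PAYLOAD_SIZE = 1472  # 1500-byte MTU minus 20-byte IP and 8-byte UDP headers


def echo_datagrams(sock, count):
    """Echo `count` datagrams back to whoever sent them."""
    for _ in range(count):
        data, addr = sock.recvfrom(2048)
        sock.sendto(data, addr)


def measure_udp_rtt(count=100):
    """Return `count` round-trip times in microseconds over loopback UDP."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2.0)  # fail instead of hanging if a datagram is lost

    worker = threading.Thread(
        target=echo_datagrams, args=(server, count), daemon=True
    )
    worker.start()

    payload = b"x" * PAYLOAD_SIZE
    rtts = []
    for _ in range(count):
        start = time.perf_counter_ns()
        client.sendto(payload, server.getsockname())
        echoed, _addr = client.recvfrom(2048)
        rtts.append((time.perf_counter_ns() - start) / 1000.0)
        if echoed != payload:
            raise RuntimeError("echoed payload does not match")
    worker.join()
    client.close()
    server.close()
    return rtts


if __name__ == "__main__":
    samples = sorted(measure_udp_rtt())
    print(f"median round trip: {samples[len(samples) // 2]:.1f} us")
```

To get a one-way figure between two hosts you would need synchronized clocks, which is why round-trip time is the usual measurement.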

How much higher is the packet delivery rate of MQTT compared with CoAP?

I would like to know how the packet delivery rates of MQTT and CoAP compare. I know that TCP is more secure than UDP, so MQTT should have a higher packet delivery rate. I just want to know: if 2000 packets are sent using each protocol separately, what would the approximate percentage be in the two cases?
Please help with an example if possible.
If you dig a little, you will find that both TCP and UDP mainly send IP messages, and some of these messages may be lost. For TCP, retransmission is handled by the TCP protocol without your influence. That works reasonably well (at least in many cases). For CoAP, when you use CON messages, CoAP does the retransmission for you, so there is not much to lose there either.
When it comes to transmissions with more message loss (e.g. bad connectivity), the reliability may also depend on the amount of data. If the data fits into one IP message, the probability that it reaches the destination is higher than the probability of 4 messages all reaching their destination. That is where the difference starts:
TCP requires that all messages reach the destination without gaps (e.g. 1, 2, (drop 3), 4 will not work). CoAP will also deliver messages with gaps. Whether you can benefit from that depends on your application.
I've been testing CoAP over DTLS 1.2 (Connection ID) for almost a year now, using an Android phone and just moving around while sending requests (about 400 bytes) over Wi-Fi and the mobile network. It works very reliably. Current statistics: 2000 requests, 143 retransmissions, 4 lost. Please note: the 4 lost mainly mean "no connection", so you can be sure that TCP will do worse than that, especially when moving around, where new TCP/TLS handshakes are frequently required.
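The arithmetic behind these statements can be sketched as follows. The 95% per-packet success probability is an assumption chosen only for illustration, and treating packet losses as independent is a simplification. The 2000/143/4 figures are the ones reported above.

```python
def all_packets_arrive(per_packet_success, packets):
    """Probability that every one of `packets` independent packets arrives."""
    return per_packet_success ** packets


def rate(part, total):
    """Fraction of `total` represented by `part`."""
    return part / total


if __name__ == "__main__":
    p = 0.95  # assumed per-packet success probability, for illustration only
    print(f"1 packet arrives:      {all_packets_arrive(p, 1):.1%}")
    print(f"all 4 packets arrive:  {all_packets_arrive(p, 4):.1%}")

    # Figures reported in the answer above
    sent, retransmissions, lost = 2000, 143, 4
    print(f"delivery rate:         {rate(sent - lost, sent):.1%}")
    print(f"retransmission rate:   {rate(retransmissions, sent):.2%}")
```

With the reported figures the delivery rate works out to 1996 / 2000 = 99.8%.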
So my conclusion:
If you have a stable network connection, both should work.
If you have a less stable network connection and gaps are not acceptable to your application, both have trouble.
If you have a less stable network connection and gaps are acceptable to your application, then CoAP will do a better job.

Full duplex socket vs. two sockets used, one for read and other for write

I was wondering:
1st question: What are the pros and cons of using one socket (full duplex) vs. two sockets (simplex) per peer, one for reading and the other for writing? Especially in terms of performance and resource utilization.
2nd question: If I choose to use more than one socket per peer, and I both read and write on all of them, will that help me scale out in terms of the number of messages handled?
3rd question: What should help me determine the number of sockets per peer? Network bandwidth? The number of messages in and out?
The questions are separate and not related to each other.
What are the pros and cons of using one socket (full duplex) vs. two sockets, one for reading and the other for writing? Especially in terms of performance and resource utilization.
Pro one socket: resource utilization. Contra one socket: nil. Performance: identical, except that you save on connect and close handshakes if you only use one socket.
If I choose the two-socket approach, would it not be useful to use both of them full duplex, so that it helps me scale out in terms of data flowing in and out?
Now you're comparing apples and oranges. You can't compare one full-duplex socket with two full-duplex sockets. I don't know why you think you might need two inbound and two outbound flows, but you don't. Every protocol I can think of except FTP uses only one.
What impact does network bandwidth have on it?
Nil.
Or what impact does it have on network utilization?
Nil, apart from the connect and close handshakes. But it wastes resources at both ends.
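To illustrate the single full-duplex socket case, here is a small sketch in which both peers send and receive at the same time over one connection. It uses `socket.socketpair()` as a stand-in for a real TCP connection, and the line-based framing and function name are assumptions made for the example.

```python
import socket
import threading


def full_duplex_demo(message_count=1000):
    """Both peers send and receive concurrently over ONE connection."""
    left, right = socket.socketpair()  # one connected, bidirectional pair
    got_left, got_right = [], []

    def sender(sock, tag):
        for i in range(message_count):
            sock.sendall(f"{tag}{i}\n".encode())
        # Half-close: we are done writing, but can still read.
        sock.shutdown(socket.SHUT_WR)

    def receiver(sock, out):
        # Reads lines until the peer half-closes its sending direction.
        with sock.makefile("rb") as reader:
            out.extend(line.strip() for line in reader)

    threads = [
        threading.Thread(target=sender, args=(left, "L")),
        threading.Thread(target=sender, args=(right, "R")),
        threading.Thread(target=receiver, args=(left, got_left)),
        threading.Thread(target=receiver, args=(right, got_right)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    left.close()
    right.close()
    return got_left, got_right


if __name__ == "__main__":
    a, b = full_duplex_demo()
    print(len(a), "messages received on left,", len(b), "on right")
```

Each direction of the connection has its own buffers, so the two flows do not wait for each other, which is why a second socket adds nothing here.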
We've added --full-duplex to iperf 2.0.14, which tests a full-duplex socket. One can compare it with two sockets via the -d or --dualtest option. We've found that "your mileage will vary" and there is no universal answer as to whether the performance is equal or not. In theory, it seems they should be equal but, in practice, maybe not.
-d, --dualtest
Do a bidirectional test simultaneously using two unidirectional sockets
--fq-rate n[kmgKMG]
Set a rate to be used with fair-queueing based socket-level pacing, in bytes or bits per second. Only available on platforms supporting the SO_MAX_PACING_RATE socket option.
(Note: Here the suffixes indicate bytes/sec or bits/sec per use of uppercase or lowercase, respectively)
--full-duplex
run a full duplex test, i.e. traffic in both transmit and receive directions using the same socket
Bob

Sending large files between erlang nodes

I have a setup with two Erlang nodes on the same physical machine, and I want to be able to send large files between the nodes.
From the symptoms I see, it looks like there is only one TCP connection between the nodes, and sending a large binary across it stops all other traffic. Is this the case?
And even more interesting: is there a way of making the VM use several connections between the nodes?
Yeah, you only get 1 connection, according to the manual:
The handshake will continue, but A is informed that B has another
ongoing connection attempt that will be shut down (simultaneous
connect where A's name is greater than B's name, compared literally).
I'm not sure what "big" means in the question, but generally speaking (and IMHO), it might be good to set up a separate TCP port to handle the payloads, and to use the standard Erlang messages only for signaling (negotiating ports, setting up a listener, etc.), for example to announce that a new payload is incoming and to negotiate anything that is needed.
By the way, there's an interesting thread on the same subject, and you might try tuning the net_* variables to see if they help with the issues.
hope it helps!
It is not recommended to send large messages between Erlang nodes; see
http://learnyousomeerlang.com/distribunomicon
Refer to the "bandwidth is infinite" section. I would recommend using something else, like GFS, so that you don't lose the distribution feature of Erlang.

kernel-based (Linux) data relay between two TCP sockets

I wrote a TCP relay server which works like a peer-to-peer router (supernode).
The simplest case is two open sockets with data relayed between them:
clientA <---> server <---> clientB
However, the server has to serve about 2000 such A-B pairs, i.e. 4000 sockets...
There are two well-known data stream relay implementations in userland (based on socketA.recv() --> socketB.send() and socketB.recv() --> socketA.send()):
using the select / poll functions (non-blocking method)
using threads / forks (blocking method)
I used threads, so in the worst case the server creates 2*2000 threads! I had to limit the stack size and it works, but is it the right solution?
Core of my question:
Is there a way to avoid active data relaying between two sockets in userland?
It seems there is a passive way. For example, I can create a file descriptor from each socket, create two pipes and use dup2(), the same method as for stdin/stdout redirection. The two threads are then no longer needed for the data relay and can be finished/closed.
The question is whether the server should ever close the sockets and pipes, and how to detect that a pipe is broken in order to log the fact.
I've also found "socket pairs", but I am not sure whether they fit my purpose.
What solution would you advise to off-load the userland and limit the number of threads?
Some extra explanations:
The server has defined static routing table (eg. ID_A with ID_B - paired identifiers). Client A connects to the server and sends ID_A. Then the server waits for client B. When A and B are paired (both sockets opened) the server starts the data relay.
Clients are simple devices behind symmetric NAT therefore N2N protocol or NAT traversal techniques are too complex for them.
Thanks to Gerhard Rieger I have the hint:
I am aware of two kernel space ways to avoid read/write, recv/send in
user space:
sendfile
splice
Both have restrictions regarding type of file descriptor.
dup2 will not help to do something in kernel, AFAIK.
Man pages: splice(2) vmsplice(2) sendfile(2) tee(2)
Related links:
Understanding sendfile() and splice()
http://blog.superpat.com/2010/06/01/zero-copy-in-linux-with-sendfile-and-splice/
http://yarchive.net/comp/linux/splice.html (Linus)
C, sendfile() and send() difference?
bridging between two file descriptors
Send and Receive a file in socket programming in Linux with C/C++ (GCC/G++)
http://ogris.de/howtos/splice.html
OpenBSD implements SO_SPLICE:
relayd asiabsdcon2013 slides / paper
http://www.manualpages.de/OpenBSD/OpenBSD-5.0/man2/setsockopt.2.html
http://metacpan.org/pod/BSD::Socket::Splice .
Does Linux support something similar, or is writing my own kernel module the only solution?
TCPSP
SP-MOD described here
TCP-Splicer described here
L4/L7 switch
HAProxy
Even for loads as tiny as 2000 concurrent connections, I'd never go with threads. They have the highest stack and switching overhead, simply because it's always more expensive to ensure that you can be interrupted anywhere than when you can only be interrupted at specific places. Just use epoll and splice (if your sockets are TCP, which seems to be the case) and you'll be fine. You can even make epoll work in edge-triggered mode, where you only register your fds once.
If you absolutely want to use threads, use one thread per CPU core to spread the load. But if you need to do this, it means you're playing at speeds where affinity, RAM location relative to each CPU socket, etc. play a significant role, which doesn't seem to be the case in your question. So I'm assuming that a single thread is more than enough in your case.
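A rough sketch of the single-threaded, event-driven approach described above. Note the substitutions: it uses Python's `selectors` module (which picks epoll on Linux) in level-triggered mode, and plain recv()/sendall() in userland instead of splice, so it shows the event loop structure rather than the zero-copy path. The function name, the idle timeout and the "close both sides on EOF" policy are assumptions for illustration.

```python
import selectors
import socket


def relay_until_idle(pairs, idle_timeout=0.2):
    """Relay data between paired sockets from one thread.

    `pairs` is a list of (sock_a, sock_b) tuples. Returns the number of
    bytes relayed once no socket has been readable for `idle_timeout`
    seconds, or once every pair has been closed.
    """
    sel = selectors.DefaultSelector()
    for a, b in pairs:
        # Store the peer as the key's data, so a readable socket
        # immediately tells us where its bytes have to go.
        sel.register(a, selectors.EVENT_READ, data=b)
        sel.register(b, selectors.EVENT_READ, data=a)

    relayed = 0
    while sel.get_map():
        events = sel.select(timeout=idle_timeout)
        if not events:
            break  # nothing happened for a while: stop the sketch
        for key, _mask in events:
            src, dst = key.fileobj, key.data
            if src.fileno() == -1:
                continue  # closed earlier in this batch of events
            chunk = src.recv(65536)
            if chunk:
                # Blocking sendall keeps the sketch short; a real relay
                # would use non-blocking sockets and buffer partial writes.
                dst.sendall(chunk)
                relayed += len(chunk)
            else:
                # One side closed: tear down the whole pair.
                for s in (src, dst):
                    sel.unregister(s)
                    s.close()
    sel.close()
    return relayed


if __name__ == "__main__":
    a_client, a_server = socket.socketpair()
    b_client, b_server = socket.socketpair()
    a_client.sendall(b"ping")
    b_client.sendall(b"pong")
    print(relay_until_idle([(a_server, b_server)]), "bytes relayed")
```

A real server would run the loop forever and register each new pair as client B connects; 4000 registered sockets is well within what one such loop handles.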

What's the difference between streams and datagrams in network programming?

What's the difference between sockets (stream) vs sockets (datagrams)? Why use one over the other?
A long time ago I read a great analogy for explaining the difference between the two. I don't remember where I read it so unfortunately I can't credit the author for the idea, but I've also added a lot of my own knowledge to the core analogy anyway. So here goes:
A stream socket is like a phone call -- one side places the call, the other answers, you say hello to each other (SYN/ACK in TCP), and then you exchange information. Once you are done, you say goodbye (FIN/ACK in TCP). If one side doesn't hear a goodbye, they will usually call the other back since this is an unexpected event; usually the client will reconnect to the server. There is a guarantee that data will not arrive in a different order than you sent it, and there is a reasonable guarantee that data will not be damaged.
A datagram socket is like passing a note in class. Consider the case where you are not directly next to the person you are passing the note to; the note will travel from person to person. It may not reach its destination, and it may be modified by the time it gets there. If you pass two notes to the same person, they may arrive in an order you didn't intend, since the route the notes take through the classroom may not be the same, one person might not pass a note as fast as another, etc.
So you use a stream socket when having information in order and intact is important. File transfer protocols are a good example here. You don't want to download some file with its contents randomly shuffled around and damaged!
You'd use a datagram socket when order is less important than timely delivery (think VoIP or game protocols), when you don't want the higher overhead of a stream (this is why DNS is primarily a datagram protocol, so that servers can respond to many, many requests at once very quickly), or when you don't care too much if the data ever reaches its destination.
To expand on the VoIP/game case, such protocols include their own data-ordering mechanism. But if one packet is damaged or lost, you don't want to wait on the stream protocol (usually TCP) to issue a re-send request -- you need to recover quickly. TCP can take up to some number of minutes to recover, and for realtime protocols like gaming or VoIP even three seconds may be unacceptable! Using a datagram protocol like UDP allows the software to recover from such an event extremely quickly, by simply ignoring the lost data or re-requesting it sooner than TCP would.
VoIP is a good candidate for simply ignoring the lost data -- one party would just hear a short gap, similar to what happens when talking to someone on a cell phone when they have poor reception. Gaming protocols are often a little more complex, but the actions taken will usually be to either ignore the missing data (if subsequently-received data supersedes the data that was lost), re-request the missing data, or request a complete state update to ensure that the client's state is in sync with the server's.
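The "ignore data that has been superseded" strategy can be sketched with an application-level sequence number in front of each datagram payload. The 4-byte header layout and the class name below are made up for this example and do not come from any particular VoIP or game protocol; sequence number wrap-around is deliberately not handled.

```python
import struct

HEADER = struct.Struct("!I")  # hypothetical 4-byte big-endian sequence number


def encode(seq, payload):
    """Prefix a payload with its sequence number."""
    return HEADER.pack(seq) + payload


class LatestStateReceiver:
    """Keeps only the newest update and ignores stale or duplicate ones."""

    def __init__(self):
        self.last_seq = -1
        self.gaps = 0  # sequence numbers skipped so far

    def accept(self, packet):
        """Return the payload if it is newer than anything seen, else None."""
        (seq,) = HEADER.unpack_from(packet)
        if seq <= self.last_seq:
            return None  # arrived late or twice: newer data supersedes it
        self.gaps += seq - self.last_seq - 1
        self.last_seq = seq
        return packet[HEADER.size:]


if __name__ == "__main__":
    rx = LatestStateReceiver()
    # Packet 1 is delayed and arrives after packet 2.
    for pkt in (encode(0, b"pos=0"), encode(2, b"pos=2"), encode(1, b"pos=1")):
        print(rx.accept(pkt))
```

The receiver never waits for the missing packet; it simply carries on with the newest state it has, which is the behaviour a stream socket cannot offer.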
Stream Socket:
Dedicated, end-to-end channel between server and client.
Uses TCP for data transmission.
Reliable and lossless.
Data is received in the same order it was sent.
Recovering lost or corrupted data can take a long time.
Datagram Socket:
No dedicated end-to-end channel between server and client.
Uses UDP for data transmission.
Not 100% reliable; data may be lost.
Data may be received in a different order than it was sent.
Lost or corrupted data is either ignored or recovered quickly by the application.
If this is about network programming, I think sockets are a good place to start.
socket = IP address + port
There are three types of sockets:
stream (TCP: order and delivery guaranteed, no duplication, no length or character boundaries for data, connection-oriented, reliable, concurrency)
datagram (UDP: packet-based, connectionless, datagram size limit, data can be lost or duplicated, order not guaranteed, not reliable)
raw (direct access to lower-layer protocols such as IP and ICMP)
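The "no boundaries" versus "packet-based" distinction in the list above can be seen in a small experiment. This is a sketch that assumes a local stream socket pair and loopback UDP; the function names are illustrative.

```python
import socket


def stream_roundtrip():
    """Two sends on a stream socket arrive as one undivided byte sequence."""
    left, right = socket.socketpair()  # connected stream sockets
    left.sendall(b"ab")
    left.sendall(b"cd")
    left.close()  # closing lets the reader see end-of-stream
    received = b""
    while True:
        chunk = right.recv(1024)  # may return any split of the bytes
        if not chunk:
            break
        received += chunk
    right.close()
    return received


def datagram_roundtrip():
    """Each sendto() on a datagram socket arrives as its own message."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    receiver.settimeout(2.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.sendto(b"ab", receiver.getsockname())
    sender.sendto(b"cd", receiver.getsockname())
    # Every recvfrom() returns exactly one datagram, never a merged one.
    messages = [receiver.recvfrom(1024)[0] for _ in range(2)]
    sender.close()
    receiver.close()
    return messages


if __name__ == "__main__":
    print("stream:  ", stream_roundtrip())
    print("datagram:", datagram_roundtrip())
```

On the stream side the application has to add its own framing (a length prefix or delimiter) if it needs message boundaries. On the datagram side the boundaries come for free, but delivery and order across a real network do not.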
I do not see any strict rule about which socket type has to use which transport protocol. Reliability should also not be misunderstood: UDP usually delivers fine when both ends are active.
Reliability here refers to reliability of delivery: TCP checks sequence numbers, and UDP has no such check. It is a good idea to use a network protocol analyzer such as Wireshark or tcpdump to see exactly what your software is doing, as a way of checking the theory on paper against your work in action.