network realtime audio transmission with latency < 100 ms - web-audio-api

I need a solution to transmit one audio channel (mono), 44.1 kHz, 16-bit, i.e. about 88.2 kB/s, with less than 100 ms latency to a remote location. The application is a remote concert. My software is built on Windows 10 with Max (Cycling '74), Java, Unity and C#. I also want to send data between the applications, especially from Max to Java and Unity. I found ZeroMQ and Apache Kafka as possible frameworks. I would appreciate some hints on which tools could be suitable. As I am not very experienced in network programming, minimizing the implementation effort is also an important concern.

ZeroMQ is capable of sub-millisecond latency on an internal network. However, I would recommend raw UDP sockets: UDP doesn't retransmit lost packets and has very low overhead compared to TCP (which ZeroMQ uses).
You may also need to prioritize this traffic on your network to keep latency down, but with the small amount of data you are sending that may not have a significant effect (it all depends on your specific network). I would start by implementing a UDP socket, and if you then see unacceptable latency, try to optimize the network.
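To make that concrete, here is a minimal sketch of a raw UDP sender pushing 16-bit mono PCM in 10 ms frames. It is written in Go purely to illustrate the pattern (the same idea maps directly onto Java's DatagramSocket or C#'s UdpClient), and the address, port and frame size are placeholders, not part of any existing setup:

    // udpsend.go: stream mono 16-bit PCM over UDP in small frames.
    package main

    import (
        "encoding/binary"
        "log"
        "net"
        "time"
    )

    const samplesPerFrame = 441 // 10 ms of 44.1 kHz mono audio

    func main() {
        // Placeholder address: replace with the remote host running the receiver.
        conn, err := net.Dial("udp", "192.0.2.10:5004")
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()

        frame := make([]byte, 4+samplesPerFrame*2) // 4-byte sequence number + 16-bit PCM payload
        var seq uint32

        ticker := time.NewTicker(10 * time.Millisecond)
        defer ticker.Stop()

        for range ticker.C {
            binary.BigEndian.PutUint32(frame[:4], seq)
            seq++
            // A real sender would copy the latest samples from the audio callback
            // into frame[4:]; here the payload is left as silence.
            if _, err := conn.Write(frame); err != nil {
                log.Println("send:", err)
            }
        }
    }

A matching receiver just reads datagrams in a loop and hands them to the audio output; keeping frames short (5-20 ms) is what keeps the per-packet latency low.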

Related

Should I use RTP or WebRTC for local network audio communication

I have a set of Raspberry Pi Zeros that I would like to use as a home intercom. I initially set them up to send audio to each other using golang with gRPC and bidirectional streaming, which works for short calls, but the lag builds up over time, so I think I need to switch to a real-time protocol like RTP or WebRTC.
Since I already know the IP address of each device, the hardware and supported codecs are the same for all of them, and they are all on the same network, is there any advantage to using WebRTC over plain RTP? My understanding is that WebRTC mainly adds security and connection orchestration like ICE and SDP, which I wouldn't necessarily need. I am trying to minimize resource usage, since these devices are not as powerful as a phone or desktop. If I do use WebRTC, I can do the SDP signaling with gRPC or some other direct delivery method.
Since there are more than 2 devices, I'm also curious about multicast functionality, which seems to be pure-RTP specific, while WebRTC (which uses RTP) doesn't necessarily support multicasting and would require a full mesh of roughly n·(n−1)/2 p2p connections. I'm very unclear/unsure about this point.
Also, does either support mixing audio channels natively, or would that need to be handled in the custom software?
You could use WebRTC, but you'd need to rig a signalling server, and a STUN / TURN server. These can be super simple and low capacity because everything is on a private network, but you still need 'em. The signalling server handles the necessary SDP interchange. Going full WebRTC might be overengineering this. (But of course learning to get WebRTC working can be useful.)
You've built out a golang infrastructure. Seeing as how you're on a private network, you could change up that program to send multicast UDP packets or RTP packets. Then you can rig your listeners to listen to them.
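For the sending side, a minimal sketch of what that change could look like (the multicast group 239.0.0.1:5004 and the frame size are placeholders):

    // Send fixed-size audio frames to a multicast group so every node on the
    // LAN can listen.
    package main

    import (
        "log"
        "net"
        "time"
    )

    func main() {
        group := &net.UDPAddr{IP: net.IPv4(239, 0, 0, 1), Port: 5004}
        conn, err := net.DialUDP("udp4", nil, group)
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()

        frame := make([]byte, 640) // e.g. 20 ms of 16 kHz 16-bit mono; match your codec
        for {
            // Fill frame with the next 20 ms of audio here, then send it.
            if _, err := conn.Write(frame); err != nil {
                log.Println("send:", err)
            }
            time.Sleep(20 * time.Millisecond)
        }
    }

Each listener can then join the same group with net.ListenMulticastUDP and read the frames in a loop.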
No matter what you do, you'll need to deal with the lag. A good way to do it in the packet world: don't build a queue of buffers ready to play. Instead, always put each received packet as the next-to-play packet, even if you have to overwrite a previously received packet. (That is, skip ahead.) You may get a pop once in a while, but with reasonably short packets, under 50ms, it shouldn't affect the user experience significantly. And the lag won't build up.
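A sketch of that skip-ahead idea, as a one-slot playout buffer where the newest packet always replaces whatever is still waiting (type and method names here are made up for illustration):

    package audio

    import "sync"

    // playoutSlot is a one-slot jitter "buffer": the newest received packet
    // always replaces whatever is waiting, so playback can never fall further behind.
    type playoutSlot struct {
        mu   sync.Mutex
        next []byte // next packet to play, or nil if nothing new arrived
    }

    // Put is called from the network receive loop.
    func (s *playoutSlot) Put(pkt []byte) {
        s.mu.Lock()
        s.next = pkt // overwrite any unplayed packet: skip ahead instead of queueing
        s.mu.Unlock()
    }

    // Take is called from the audio output callback once per frame.
    func (s *playoutSlot) Take() []byte {
        s.mu.Lock()
        pkt := s.next
        s.next = nil
        s.mu.Unlock()
        return pkt // nil means nothing new: play silence or repeat the last frame
    }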
The oldtimey phone system ran on a continent-wide 8 kHz synchronous clock, so lag was not an issue. But drift is always a problem when the audio analog-to-digital and digital-to-analog clocks aren't synchronized, which is the case whenever they are on different devices. The slightest drift builds up over time. (RPis don't have fifty-dollar clock parts in them with guaranteed low drift.)
If all your audio sources run at the same sample rate, you can average them to mix them. That should get you started. (If you're using WebRTC in a browser, it will mix multiple sources for you.)
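A sketch of mixing by averaging, assuming every source delivers equal-length frames of 16-bit PCM at the same sample rate:

    package audio

    // mix averages several equal-length frames of 16-bit PCM into one frame.
    // Dividing by the number of sources keeps the result in range, so loud
    // simultaneous sources cannot clip.
    func mix(sources [][]int16) []int16 {
        if len(sources) == 0 {
            return nil
        }
        out := make([]int16, len(sources[0]))
        for i := range out {
            var sum int32
            for _, src := range sources {
                sum += int32(src[i])
            }
            out[i] = int16(sum / int32(len(sources)))
        }
        return out
    }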
Since you are using Go, check out offline-browser-communication. This removes the need for signaling and STUN/TURN: it uses mDNS and pre-generated certificates. It is also being discussed in the WICG Discourse; no idea if/when it will land.
'Lag' is a pretty common problem when doing media over TCP: you are dealing with lots of queues and congestion control. WebRTC (and RTP in general) is great at solving this. You have the following standardized tools to solve it:
RTP packets carry a relative timestamp.
RTP sender reports map that relative timestamp to an NTP (wall-clock) timestamp; use this for sync/timing (see the sketch after this list).
RTP receiver reports give you packet loss and jitter; use these to assess your network health.
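As a sketch of how those pieces fit together: the relative RTP timestamp plus one sender-report mapping is enough to place a packet on the wall clock. Function and parameter names here are illustrative, not part of any library:

    package rtpsync

    import "time"

    // playoutTime maps an RTP timestamp to wall-clock time using one
    // sender-report pair (rtpAtReport, reportTime) and the codec clock rate,
    // e.g. 48000 for Opus or 8000 for PCMU. The simplistic wrap handling is
    // for illustration only.
    func playoutTime(rtpTS, rtpAtReport uint32, reportTime time.Time, clockRate uint32) time.Time {
        delta := int32(rtpTS - rtpAtReport) // signed difference tolerates small wraps
        offset := time.Duration(int64(delta) * int64(time.Second) / int64(clockRate))
        return reportTime.Add(offset)
    }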
Multicast is a fantastic suggestion as well. You reduce the complexity of having to signal all those 1:1 connections, and reduce the amount of bandwidth required. It does make security a little bit more delicate/roll your own though.
With Pion we decoupled all the RTP/RTCP handling into Pion Interceptor, so you don't have to use the full WebRTC stack to get the media-transport features mentioned above.

Calculate Bandwidth for Each Application (TCP/UDP) on Windows?

Is it possible to calculate bandwidth of networking applications (TCP/UDP) by using Win32 API on Windows?
AFAIK, TCP bandwidth calculation can be done using the GetPerTcpConnectionEStats function, but I could not find an equivalent function for UDP.
PsPing seems to work for UDP, so I assume there must be a way to measure the bandwidth of UDP connections. Am I right?
Is there any other programmatic way to gather instantaneous UDP bandwidth usage per connection on Windows?

Selection of software architecture or lib to optimize UDP client on a point-to-point network

My goal is to drop as few UDP datagrams as possible. Shocker, I know, ;-)
Here is my circumstance which is a bit different from the general network server/client optimization questions for which I see a lot of discussion:
I am writing socket code for a process which has one singular goal: grab UDP packets received by my Gigabit Ethernet NIC and get them into application RAM with as high a bandwidth as possible (i.e. minimize packet drops/loss).
The network is point-to-point without any firewalls, switches, routers, etc. - just a single Cat6 cable connecting the UDP datagram generator/server (an embedded system) with my Windows 7 PC, the datagram receiver/client. I can control the transmitted datagrams-per-second via some controls on the datagram generator. The datagrams are sent to the broadcast address (255.255.255.255).
I've successfully achieved about 250-300 Mbit/s (30% of the theoretical 1 GbE bandwidth) without any packets getting dropped or order-scrambled, using lean-and-mean code based on the built-in Winsock2 functions select() and recvfrom(), as outlined in the sample code for those functions on MSDN.
(I've already adjusted the receive buffer to be very large using setsockopt(), and this helped considerably.) But I still want to maximize performance and am eager to hear thoughts from this community on whether I should expect noticeable gains from trying the following:
Asynchronous I/O, such as boost::asio. From what I gather, this library appears to be more for optimizing applications which have to serve a lot of different sockets to different machines. Should I expect much in terms of single-socket UDP receive performance from switching from Winsock to an asynchronous I/O architecture?
Packet size: if I make the effort to change the packet size by modifying the embedded code that generates the packets, is performance more likely to improve with lots of smaller packets or with fewer large/jumbo packets?
Broadcast/multicast/unicast: is one destination address type likely to perform better than others?
Or is 300 Mbit/s about the limit that I should expect for actual throughput on a 1 GbE physical link?
Any other recommendations on low-hanging fruit to improve performance, or expectations about what level of performance is feasible, would also be welcome.
Thanks all!
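For reference, the "lean receive loop with a large receive buffer" pattern described above looks roughly like this. It is sketched in Go rather than Winsock just to keep the examples here in one language; SetReadBuffer is the counterpart of setsockopt(SO_RCVBUF), and the port and buffer sizes are arbitrary:

    package main

    import (
        "log"
        "net"
    )

    func main() {
        conn, err := net.ListenUDP("udp4", &net.UDPAddr{Port: 5004})
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()

        // Ask the OS for a large socket receive buffer so short bursts are
        // absorbed while the application catches up.
        if err := conn.SetReadBuffer(8 << 20); err != nil {
            log.Println("SetReadBuffer:", err)
        }

        buf := make([]byte, 65535)
        for {
            n, _, err := conn.ReadFromUDP(buf)
            if err != nil {
                log.Println("read:", err)
                continue
            }
            _ = buf[:n] // hand the datagram off to a preallocated ring buffer here
        }
    }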

What is the serial communication speed using TCP sockets?

I am communicating using TCP sockets. One computer is using Windows commands, and the other is running on Linux using Python. The two computers are able to communicate, but I'm not sure what the bit rate is. I never set any bit rate. Is there a default bit rate? Can it be changed?
EDIT: It seems that the programs can accommodate a variety of bit rates. For example, 10 Mbps Ethernet or 100 Mbps Ethernet. I thought (wrongly) that the bit rate had to be set, as it does for serial communication over USB. It does not have to be set.
TCP implements the slow-start and congestion-avoidance procedures, by which it probes the capacity of the underlying network and tries to exploit it as much as possible. The process is fairly complex but, bottom line, TCP will try to use all the available bandwidth. The reference standard is IETF RFC 5681: https://www.rfc-editor.org/rfc/rfc5681

Sending a huge amount of real time processed data via UDP to iPhone from a server

I'm implementing a remote application. The server will process and render data in real time as an animation (a series of images, to be precise). Each time an image is rendered, it will be transferred to the receiving iPhone client via UDP.
I have studied some UDP and I am aware of the following:
A UDP datagram has a maximum size of about 65 kB.
However, the iPhone seems to be able to receive UDP packets only up to about 41 kB; it appears unable to receive anything larger.
When sending multiple packets, many of them are dropped; this seems to be caused by the oversized packets overloading UDP processing.
Reducing the packet size reduces the number of dropped packets, but it means more packets have to be sent.
I have never written a real, practical UDP application before, so I need some guidance on efficient UDP communication. In this case, we are talking about transferring rendered images in real time from the server to be displayed on the iPhone.
Compressing the data seems mandatory, but in this question I would like to focus on the UDP part. When implementing UDP applications, what are the best practices for efficient UDP programming when a lot of data must be sent non-stop in real time?
Assuming that you have a very specific and good reason for using UDP, and that you need all your data to arrive (i.e. you can't tolerate any lost data), there are a few things you need to do (this assumes a unicast application; a minimal sketch follows the list):
Add a sequence number to the header of each packet
Ack each packet
Set up a retransmit timer which resends the packet if no ack is received
Track the round-trip time (RTT) so you know how long to set your retransmit timers
Potentially deal with out-of-order data arrival, if that's important to your app
Increase the receive buffer size on the client socket
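Here is a minimal stop-and-wait sketch of the sequence-number-plus-ack idea from the list above; the 4-byte header and the 200 ms timeout are arbitrary choices, not a standard:

    package reliudp

    import (
        "encoding/binary"
        "net"
        "time"
    )

    // sendReliable prefixes a datagram with a 4-byte sequence number and resends
    // it until a matching 4-byte ack arrives. Stop-and-wait is shown only to make
    // the idea concrete; a real sender would pipeline packets and adapt the
    // timeout to the measured RTT.
    func sendReliable(conn *net.UDPConn, seq uint32, payload []byte) error {
        pkt := make([]byte, 4+len(payload))
        binary.BigEndian.PutUint32(pkt, seq)
        copy(pkt[4:], payload)

        ack := make([]byte, 4)
        for {
            if _, err := conn.Write(pkt); err != nil {
                return err
            }
            // Retransmit timer: if no ack within 200 ms, send the packet again.
            conn.SetReadDeadline(time.Now().Add(200 * time.Millisecond))
            n, err := conn.Read(ack)
            if err == nil && n == 4 && binary.BigEndian.Uint32(ack) == seq {
                return nil // acknowledged
            }
        }
    }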
Also, you could be sending so fast that you drop packets internally on the sending machine without them ever getting out of the NIC onto the wire. On certain systems, calling select for writability on the sending socket can help with this, and calling connect on the UDP socket can improve performance, leading to fewer dropped packets.
Basically, if you need guaranteed in-order delivery of your data, then you are going to re-implement TCP on top of UDP. If the only reason you use UDP is latency, you can probably use TCP and disable the Nagle algorithm. If you want packetized data with reliable, low-latency delivery, another possibility is SCTP, also with Nagle disabled; it can additionally provide out-of-order delivery to speed things up even more.
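Disabling Nagle is a one-line socket option in most APIs; in Go, for example (the helper name is made up):

    package lowlatency

    import "net"

    // dialNoDelay opens a TCP connection with the Nagle algorithm disabled
    // (TCP_NODELAY), so small writes go out immediately instead of being
    // coalesced. Go actually disables Nagle by default; the explicit call
    // just makes the intent visible.
    func dialNoDelay(addr string) (*net.TCPConn, error) {
        conn, err := net.Dial("tcp", addr)
        if err != nil {
            return nil, err
        }
        tcpConn := conn.(*net.TCPConn)
        if err := tcpConn.SetNoDelay(true); err != nil {
            conn.Close()
            return nil, err
        }
        return tcpConn, nil
    }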
I would recommend Stevens' "Unix Network Programming", which has a section on advanced UDP and on when it's appropriate to use UDP instead of TCP. As a note, he recommends against using UDP for bulk data transfer, although the reality is that this has become much more common these days for streaming multimedia apps.
Small packets are probably better than large packets :-)