Protocol for sending streaming data over multiple sockets


I am working on designing an API for consuming messages from an application that will generate a very large amount of data; 10+ GB/s is likely, even for smaller clients. I am looking for a protocol that allows me to deliver this data in a way that is easy for clients to consume.
The obvious answer for me is: split up the messages so they are consumable over multiple connections. Each connection would consume a fraction of the overall load.
But if I do this, there are a few things I need to account for:
How does the user know they are falling behind and need to launch more connections?
Twitter says consumers should check timestamps, which could work for us
When they launch a new connection to consume more of the data, how do they specify that this is part of the same consumption session?
We could give the session a name, correlate that with a "direct" amqp queue, and let our queue do the hard work
Is there something very important I am missing?
Probably.
For this reason, I'd much rather use a protocol that already exists.
The protocol would be considered extra awesome if it:
is websocket or streaming HTTP friendly
supports data compression
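As a rough illustration of the timestamp check and the idea of splitting one session across several connections, here is a minimal Python sketch. It assumes every message carries a producer-side timestamp in seconds. The names `LagMonitor` and `connection_for`, the 5-second threshold, and the modulo split are all hypothetical and not part of any existing protocol.

```python
from dataclasses import dataclass


@dataclass
class LagMonitor:
    """Tracks how far a consumer is behind, using message timestamps."""
    max_lag_seconds: float = 5.0   # hypothetical threshold
    last_lag: float = 0.0

    def observe(self, message_ts: float, now: float) -> None:
        # Lag = time of consumption minus the time the producer stamped the message.
        self.last_lag = max(0.0, now - message_ts)

    def needs_more_connections(self) -> bool:
        return self.last_lag > self.max_lag_seconds


def connection_for(message_id: int, connection_count: int) -> int:
    """Assign a message to one of the session's connections (simple modulo split)."""
    return message_id % connection_count
```

A consumer would call `observe` for each message it processes and open another connection for the same session when `needs_more_connections()` returns true.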

The problems you are describing are pretty much the same issues that video streaming has to deal with, which you probably already know. The key HTTP friendly streaming protocols are HLS (Apple), SmoothStreaming (Microsoft), HDS (Adobe) and MPEG-DASH (open protocol, but new).
When considering video streaming, it is also worth understanding whether your streams are more like 'live' streams or 'static' content - the former is generated on the fly and any given part of the live stream may only be available for a set time, while the latter is stored on the server in full and generally any part is available at any time (until the content is removed). How you stream and play back these is subtly different.
It may be that you can simply reuse one of the above video streaming protocols by wrapping your data as if it were video (or maybe it even is video), and implementing your own custom client on the receiving side.
Alternatively, these protocols could provide a good reference point if you wanted to create your own simpler protocol - there are several open source streaming servers you could look to for ideas or even adapt to your needs if that looks like a sensible route:
http://gstreamer.freedesktop.org
http://icecast.org
Video streaming is quite complex as you may already be aware, but if your use cases are simpler you may be able to ignore or remove much of the complexity - for example you may not need seek, multiple format and bit rate streams, accompanying streams (for subtitles etc). Being able to simplify like this might justify the effort to modify one of the above for your needs, if you are not able to use them out of the box.
One final point - video and audio streaming protocols usually have a built in way of dealing with delayed or lost packets. Depending on your application these may not be applicable to you so you should look carefully at this aspect if reusing a video or audio streaming protocol or server. For example, audio clients are typically tolerant of a small amount of packet loss, and will generally discard delayed packets rather than pause the audio (packets received outside the 'jitter buffer' window). If your application cannot tolerate any packet loss, then you will need to look carefully at the underlying solution and protocol to make sure it really meets your needs over all network conditions.
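To make the 'jitter buffer' behaviour concrete, here is a toy Python sketch. It assumes packets carry consecutive sequence numbers starting at 0 and that playback consumes one packet per slot. The `JitterBuffer` class is purely illustrative and is not taken from any of the servers mentioned above.

```python
import heapq


class JitterBuffer:
    """Toy jitter buffer: reorders packets and drops ones that arrive too late."""

    def __init__(self) -> None:
        self._heap = []       # (sequence number, payload), smallest first
        self._next_seq = 0    # next sequence number due for playback
        self.dropped = 0

    def push(self, seq: int, payload: bytes) -> None:
        if seq < self._next_seq:
            # Its playback slot has already passed: discard rather than pause.
            self.dropped += 1
            return
        heapq.heappush(self._heap, (seq, payload))

    def pop(self):
        """Return the payload for the next slot, or None if it never arrived."""
        seq = self._next_seq
        self._next_seq += 1
        # Throw away stale entries (for example duplicates of older packets).
        while self._heap and self._heap[0][0] < seq:
            heapq.heappop(self._heap)
        if self._heap and self._heap[0][0] == seq:
            return heapq.heappop(self._heap)[1]
        return None  # gap: a real player would conceal it, e.g. with silence
```

If your data cannot tolerate the `None` case, this discard-on-late behaviour is exactly the part of an audio or video protocol you would need to replace with retransmission.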

Related

Video streaming application Where to start

For some time now I have been thinking of creating some sort of video streaming application (client and server). After a little searching, I always find applications for streaming and not how to code one.
I know that it should be something like... capture data, pack , send to server and then the server will broadcast to anyone connected...right?
So where should i start...should i study about sockets..should i study more about how to implement UDT or TCP protocol...or those two combined??

Part of the problem you're having in your searches is that you haven't really defined what you're trying to solve. "video streaming application" isn't enough... what are the constraints? Some questions that help narrow down appropriate solutions:
Does the player need to be web-based?
Does the source need to be web-based?
What other platforms need support?
What sort of latency requirements do you have? (Video conferencing style, where quality is less important but low latency is very important... or more traditional streaming where you choose quality and don't care much about latency.)
What's the ratio between the source streams and those playing? Lots of watchers per stream, or lots of streams with few watchers?
At what sort of scale does your whole operation need to be at?
I know that it should be something like... capture data, pack , send to server and then the server will broadcast to anyone connected...right?
Close. Let's break this down a bit. All video streaming is going to have some element of capture, codecs, a container or transport, a server to distribute, and clients to connect to the server and reverse the whole process.
Media Capture
As I hinted at above, how you do this depends on the platform you're on. This is actually where things vary the most. If you're on Windows, there's DirectShow. OSX and Linux have their own capture frameworks. Also remember that you need an audio stream as well, which isn't necessarily handled in the video capture. If you're web-based, you need getUserMedia.
Codecs
It would be incredibly inefficient to send raw uncompressed frames. If it weren't for codecs, video streaming would be impossible as we know it. Each codec works a bit differently, but there are a lot of common techniques.
At a basic level, if you can imagine frames on a filmstrip, each frame isn't much different from the next. For a given shot, there may be motion happening but much of the content in the frame stays very similar. We can save a lot of bandwidth by only sending what's changed. (Realistically speaking, each frame is always a bit different due to the analog nature of the world we're capturing, but codecs can spend very little bandwidth on the things that are almost completely the same, vs. things that are totally different.) When we go to a different shot, the codec sees that the whole frame is different and sends a whole frame. Frames that can stand alone are "I-frames". I-frames are also inserted regularly in the stream, every few seconds. Most video players will only seek to I-frames, because anything that is not an I-frame requires decoding all the frames before it, back to the preceding I-frame. If you've ever tried to hit an exact spot in a movie but the player put you somewhere within a few seconds nearby, that's why this happens. In addition, if some frames were to become corrupted, the stream will correct itself on the next I-frame. (Ever watched a video and a huge chunk of it went green for a few seconds but was fine later? That's why.)
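The idea of full frames plus "only what changed" can be sketched in a few lines of Python. This is a toy model, not how any real codec works: frames are short lists of integers, the `"delta"` label and the `keyframe_interval` parameter are made up for illustration, and there is no motion estimation or lossy compression.

```python
def encode(frames, keyframe_interval=4):
    """Encode frames (equal-length lists of ints) as I-frames and deltas."""
    encoded, previous = [], None
    for index, frame in enumerate(frames):
        if previous is None or index % keyframe_interval == 0:
            encoded.append(("I", list(frame)))          # complete, stand-alone frame
        else:
            # Store only the positions whose value changed since the last frame.
            changes = {i: v for i, (p, v) in enumerate(zip(previous, frame)) if p != v}
            encoded.append(("delta", changes))
        previous = frame
    return encoded


def decode_at(encoded, target):
    """Reconstruct frame `target` by starting from the preceding I-frame."""
    start = target
    while encoded[start][0] != "I":
        start -= 1                                      # walk back to an I-frame
    frame = list(encoded[start][1])
    for _kind, changes in encoded[start + 1:target + 1]:
        for position, value in changes.items():
            frame[position] = value
    return frame
```

`decode_at` shows why seeking to an arbitrary frame is expensive: every delta between the preceding I-frame and the target has to be applied in order.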
Video codecs also use the nature of how we see things to their advantage. Our eyes are far more sensitive to changes in brightness than changes in color. Therefore, the codecs spend more bandwidth on brightness differences in the frame than they do in the color differences. There are also some crafty tricks for smoothing and adding visual noise to make things look more normal rather than blocky.
Audio codecs are also required. While a CD-quality stereo uncompressed audio stream may only take up about 1.4 Mbit/s, that's a lot of bandwidth in internet terms. A lot of streaming video sites use less bandwidth than this for the entire video. Audio codecs, much like video codecs, use some tricks around how we perceive sound to save bandwidth. (For a more detailed explanation, read my post about how MP3 works here: https://sound.stackexchange.com/a/25946/7209)
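The CD-quality figure follows directly from the standard CD audio parameters (44.1 kHz sample rate, 16 bits per sample, two channels). A short Python calculation, with constant names chosen here for illustration:

```python
SAMPLE_RATE_HZ = 44_100   # CD audio samples per second, per channel
BITS_PER_SAMPLE = 16
CHANNELS = 2              # stereo

bits_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS
megabits_per_second = bits_per_second / 1_000_000
megabytes_per_hour = bits_per_second / 8 * 3600 / 1_000_000

print(f"{bits_per_second} bit/s = {megabits_per_second:.2f} Mbit/s")
print(f"{megabytes_per_hour:.0f} MB per hour of uncompressed audio")
```

That works out to 1,411,200 bit/s, or roughly 635 MB for an hour of uncompressed stereo audio.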
Container
The next step is to mux your encoded audio and video streams together in a container format. If you were recording to disk, you might choose something like MKV which supports audio, video, subtitles, and more, all in the same file. WebM is basically a limited version of MKV but is designed to be easily supported by browsers. Or you might choose a format less complicated like MP4 where you are limited in choice of audio and video codecs, but get better player compatibility.
Since you're live streaming, the line between the streaming protocol and the container is often blurred a bit. HLS will require you to make a bunch of video files that stand alone, but your muxer and your codecs need to know how to segment these files in a way that they can be put together again. I think that RTMP takes its cues from FLV, but also has some information about the streams in its exchange with the client. (If you use RTMP, you might read up on it elsewhere... I don't know much about RTMP under the hood.)
Server
Lots of choices here. In the case of WebRTC, the "server" might actually be the web browser doing all the encoding and what not because it can run peer-to-peer. Alternatively, you might have a specialized streaming server running RTMP, or a normal HTTP web server for distributing HLS chunks. Again, what you choose depends on your requirements.
Clients
Clients need to connect to the server, demux the streams, decode the audio and video streams, and play them back. It's the entire process listed above, but in reverse.
So where should i start...
Start by figuring out exactly what you want to do. If you don't know what you want to do, play around with WebRTC. The browsers do all the work, and it requires very little server resources in most cases. This will allow you to stream between a few clients in real time.
To get more advanced, start experimenting with what you already have off-the-shelf. FFmpeg is a great tool that you should absolutely know how to use, and it can be embedded in your solution.
A few things you probably shouldn't do (unless you really want to):
Don't invent your own codec. (The codecs we have today are very good. They have taken a ton of investment and decades of academic research to get to where they are.)
Don't invent your own streaming protocol. (You would have to fight to get it adopted in all the players. We already have a ton of streaming protocols to choose from. Use what's already there.)
should i study about sockets..should i study more about how to implement UDT or TCP protocol...or those two combined??
It would always be helpful for you to know the basics of networking. Yes, learning about UDP and TCP will certainly help you, but since you're not inventing your own streaming protocols, you're not going to get to even choose between them anyway.
I hope this helps you get started. In short, understand all the layers here. Once you have done that, you'll know what to do next, and what to Google for.

RESTful interface for ECG/EEG sensor data in haskell

I'm working on a project in which I want to display biosensor EEG/ECG data measured by a portable device (e.g., a microcontroller with wireless data transmission via WiFi or Bluetooth). For this purpose, I need to interface with the portable device/microcontroller; many or some of these devices seem to use RESTful interfaces, but probably also offer sockets.
One example of a microcontroller with WiFi is the "spark.io", which is based on a Cortex-M3 and an on-board CC3000 wireless controller for WiFi access. The data to be transferred are around 500 to 1000 float values per second, which should arrive at the REST client with as little delay as possible. Probably a non-REST approach like sockets would fit better, but I would still like to test an approach based on a RESTful interface (a tiny argument for this would be that transferring data via a RESTful interface seems very common and has good library support).
Q: The question is, what is the best approach for a performant (in the sense of near-realtime) implementation that interfaces with this via REST interface?
I am sure this problem has been solved before, but I could not quickly find a paper via google scholar or technical/scientific blog post that explains this. The only link I found is on "rest hooks", but I am not sure if this is a good approach. Searching on SE didn't reveal a past question on this.
Side note: My approach would be to implement the interface in Haskell first to test the design and performance of the RESTful interface. Later the working approach should be ported or implemented with Java/Android/spark.io/some other microcontroller.
(Please note this question is entirely about the architecture and not at all about Haskell libraries or anything. If using REST is the stupidest thing, I will accept that as an answer if it is well argued. The question then is also whether microcontroller web interfaces in general, and specifically their APIs, like that of "spark.io", are a stupid idea if they are implemented via REST. Is this the case? If not, what definition of "near real time" justifies that a REST interface is a bad idea and thus other means of communication are better? Like: one sensor read per minute? Or one per second, per 1/10 second, per 1/100 second, per 1/1000 second?)

Okay, let's go through this.
REST is not necessarily a bad idea but it has a lot of features which you may not need. For example, there are REST verbs not just for retrieval, but also updating, deleting, and creating resources. If those functions are important (e.g. you need to send certain control data to the EEG controller) then REST will be nice. If you just want fast access to the stream of data, consider raw TCP instead.
Similarly, REST will package messages into "requests" and their "responses" which come with a bunch of "headers" indicating things like whether the request could be fulfilled, whether it's compressed, etc. These can be great features but may be bloat. You'll probably want to emit enough data on each request so that the ~1kB of headers are a small fraction of it. But given 8-byte floats (doubles), that requires transmitting 500-1000 data points, which you've said will take about one second. Is that our fate -- to always have 1s of latency?
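The trade-off between header overhead and latency can be put into numbers. The following Python sketch uses the rough figures from this answer (about 1 kB of headers per exchange, 8 bytes per sample); both are assumptions, and `header_overhead` is a helper invented for this illustration.

```python
def header_overhead(samples_per_request: int,
                    bytes_per_sample: int = 8,
                    header_bytes: int = 1000) -> float:
    """Fraction of each request/response exchange spent on headers."""
    payload = samples_per_request * bytes_per_sample
    return header_bytes / (header_bytes + payload)


for samples in (1, 10, 100, 1000):
    print(f"{samples:>5} samples/request -> {header_overhead(samples):.1%} headers")
```

With these assumed numbers, one sample per request is almost pure overhead, while batching 1000 samples (about one second of data at the stated rate) brings the headers down to 1000 / (1000 + 8000), roughly 11% of the bytes on the wire.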
REST will allow you to avoid some of that bloat by declaring a Transfer-Encoding: chunked so that the client can operate on individual chunks as they become available. So that's an architectural decision that I think will need to be made.
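For reference, the wire framing of a chunked body in HTTP/1.1 is simple: each chunk is its size in hexadecimal, CRLF, the data, CRLF, and a zero-size chunk ends the body. Here is a minimal Python sketch of that framing. It ignores chunk extensions and trailers, and the function names are made up for this example.

```python
def encode_chunk(data: bytes) -> bytes:
    """Frame one non-empty chunk: size in hex, CRLF, the data, CRLF."""
    return f"{len(data):x}".encode("ascii") + b"\r\n" + data + b"\r\n"


LAST_CHUNK = b"0\r\n\r\n"   # a zero-length chunk terminates the body


def decode_chunks(body: bytes):
    """Yield each chunk's data from a complete chunked body."""
    position = 0
    while True:
        line_end = body.index(b"\r\n", position)
        size = int(body[position:line_end], 16)
        if size == 0:
            return
        start = line_end + 2
        yield body[start:start + size]
        position = start + size + 2   # skip the CRLF that follows the data
```

A sensor server could emit one chunk per batch of samples inside a single long-lived response, so the headers are paid for only once.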
I would definitely get Keep-Alive working as soon as possible, and it would be my chief feature when looking for what library to use on the server. Keep-Alive is a standard HTTP mechanism which avoids tearing down and rebuilding the TCP connection for each HTTP request. If you don't do this then you have some heavy protocol negotiations each time you send a request.
A crucial decision you'll have to make involves whether you want to do HTTP pipelining or not. You can combine HTTP pipelining with longer-lived requests (ones where you don't expect an immediate response) to essentially "send the data when it becomes available" (i.e. send the headers first and let the server push out the data when it's good and ready). This is an alternative to chunked transfers.
If you can work those out, then HTTP is regularly used to send megabytes per second, so your use case fits well within what REST is capable of. In terms of REST/HTTP libraries for Haskell, if you have to somehow program the controller yourself, the big options are wai, yesod, snap, and rest. If you just need an HTTP client there are a few of those too.

Live Streaming Service Development

I am about to develop a service that involves interactive live audio streaming. Interactive in the sense that a moderator can have his stream paused and, upon request, stream audio coming from one of his listeners (during the streaming session).
It's more like a large pipe where the water flowing through can come in from only one of many small pipes connected to it at a time, with a moderator assigned to each stream controlling which pipe is opened. I know nothing about media streaming, and I don't know if a cloud service provides an interactive programmable solution such as this.
I am a programmer and I will be able to program the logic involved in such interaction. The issue is that I am a novice to media streaming and don't have any knowledge of its technologies and the various software used on the server for such a purpose. Are there any books that can introduce one to the technologies employed in media streaming? I am also trying to avoid using Flash.
Clients could be web or mobile. I don't think I will have any problem with integrating with the client system. My issue is implementing the server side.

You are effectively programming a switcher. Basically, you need to be able to switch from one audio stream to the other. With uncompressed PCM, this is very simple. As long as the sample rates and bit depth are equal, cut the audio on any frame (which is sample-accurate) and switch to the other. You can resample audio and apply dithering to convert between different sample rates and bit depths.
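A PCM switch of this kind can be sketched with nothing but the Python standard library. The example assumes both streams are interleaved signed 16-bit PCM with identical sample rate and channel count. `switch_streams` is a hypothetical helper, and a real switcher would work on live buffers rather than complete arrays.

```python
import array


def switch_streams(current: array.array, incoming: array.array,
                   switch_frame: int, channels: int = 2) -> array.array:
    """Play `current` up to `switch_frame`, then continue with `incoming`.

    Both inputs are interleaved PCM with the same sample rate, bit depth,
    and channel count (assumed; only the sample type is checked).
    """
    if current.typecode != incoming.typecode:
        raise ValueError("bit depths differ; convert before switching")
    cut = switch_frame * channels   # a frame holds one sample per channel
    return current[:cut] + incoming[cut:]
```

Because the cut lands on a frame boundary, the channels stay aligned and no decoding state is involved, which is what makes the uncompressed case so simple.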
The complicated part is when lossy codecs get involved. On a similar project, I have gone down the road of trying to stitch streams together, and I can tell you that it is nearly impossible, even with something as simple as MP3. (The bit reservoir makes things difficult.) Plus, it sounds as if you will be supporting a wide variety of devices, meaning you likely won't be able to standardize on a codec anyway. The best thing to do is take multiple streams and decode them at the mix point of your system. Then, you can switch from stream to stream easily with PCM.
At the output of your system, you'll want to re-encode to some lossy codec.
Due to latency, you don't typically want the server doing this switching. The switching should be done at the desk of the person encoding the stream so that way they can cue it accurately. Just write something that does all of the switching and encoding, and use SHOUTcast/Icecast for hosting your streams.

RTP/RTSP start up latency: Would this method help to reduce it, and if yes, why we don't have it

This is probably not the best forum for such a specialized question, but at the moment I don't know of a better one (open to suggestions/recommendations).
I work on a video product which for the last 10+ years has been using a proprietary communications protocol (DCOM-based) to send the video across the network. A while ago we recognized the need to standardize and currently are almost at a point of ripping out all that DCOM baggage and replacing it with a fully compliant RTP/RTSP client/server framework.
One thing we noticed during testing over the last few months is that when we switch the client to use RTP/RTSP, there's a noticeable increase in start-up latency. The problem is that it's not us but RTSP.
BEFORE (DCOM): we would send one DCOM command and before that command even returned back to the client, the server would already be sending video. -- total latency 1 RTT
NOW (RTSP): This is the sequence of commands, each one being a separate network request: DESCRIBE, SETUP, SETUP, PLAY (assuming the session has audio and video) -- total of 4 RTTs.
Works as designed - unfortunately it feels like a step backwards because prior user experience was actually better.
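The difference is easy to quantify. This small Python sketch multiplies the number of sequential round trips by an assumed round-trip time; it ignores server processing time, and the example RTT values are placeholders rather than measurements.

```python
def startup_latency_ms(round_trips: int, rtt_ms: float) -> float:
    """Time spent on signalling before video can flow, ignoring processing."""
    return round_trips * rtt_ms


RTSP_SEQUENCE = ["DESCRIBE", "SETUP", "SETUP", "PLAY"]   # audio + video session

for rtt_ms in (1, 20, 100):   # assumed example RTTs
    old = startup_latency_ms(1, rtt_ms)
    new = startup_latency_ms(len(RTSP_SEQUENCE), rtt_ms)
    print(f"RTT {rtt_ms:>3} ms: DCOM {old:.0f} ms, RTSP {new:.0f} ms")
```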
Can this be improved? If you stay with the standard, short answer is, NO. However, my team fully controls our entire RTP/RTSP stack and I've been thinking we could introduce a new RTSP command (without touching any of existing commands so we are still fully inter-operable) as a solution: DESCRIBE_SETUP_PLAY.
We could send this one command and pass in the types of streams we are interested in (typically, there's only one video and 0..1 audio). The response would include the full SDP text as well as all the port information and, just like before, the server would start streaming instantly without waiting for anything else from the client.
Would this work? any downside that I may not be seeing? I'm curious why this wasn't considered (or was dropped) from official spec, since latency even in local intranet is definitely noticeable.

FYI, it is possible according to the RTSP 1.0 specification:
9.1 Pipelining
A client that supports persistent connections or connectionless mode
MAY "pipeline" its requests (i.e., send multiple requests without
waiting for each response). A server MUST send its responses to those
requests in the same order that the requests were received.
The RTSP 2.0 draft also contains support for pipelining.
However none of the clients/servers I've used implement it AFAIK.

long polling vs streaming for about 1 update/second

is streaming a viable option?
will there be a performance difference on the server end depending on which i choose?
is one better than the other for this case?
I am working on a GWT application with Tomcat running on the server end. To understand my needs, imagine updating the stock prices of several stocks concurrently.

Do you want the process to be client- or server-driven? In other words, do you want to push new data to the clients as soon as it's available, or would you rather that the clients request new data whenever they see fit, even though that might not be once/second? What is the likelihood that the client will be able to stick around to wait for an answer? Even though you expect the events to occur once/second, how long does it take between a request from a client and the return from the server? If it's longer than a second, I'd expect you to lean towards pushing the events to the clients, though the other way around, I'd expect polling to be okay. If the response takes longer than the interval, then you're essentially streaming anyway, since there's a new event ready by the time the client receives the last one, so the client could essentially poll continually and always receive events - in this case, streaming the data would actually be more lightweight, since you're removing the connection/negotiation overhead from the process.
I would suspect server load to be higher for a client-based (pull) subscription than for a streaming configuration, since the client would have to re-negotiate the connection each time, instead of leaving a connection open, but each open connection in a streaming model would require server resources as well. It depends on what the trade-off is between how aggressive your negotiation process is vs. how much memory/processing is required for each open connection. I'm no expert, though, so there may be other factors.
UPDATE: This guy talks about the trade-offs between long-polling and streaming, and he seems to say that with HTTP/1.1, the connection re-negotiation process is trivial, so that's not as much of an issue.

It doesn't really matter. The connection re-negotiation overhead is so slim with HTTP/1.1, you won't notice any significant performance differences one way or another.
The benefits of long-polling are compatibility and reliability - no issues with proxies, ports, detecting disconnects, etc.
The benefits of "true" streaming would potentially be reduced overhead, but as mentioned already, this benefit is much, much less than it's made out to be.
Personally, I find a well-designed comet server to be the best solution for large numbers of updates and/or server-push.

Certainly, if you're looking to push data, streaming would seem to provide better performance, if your server can handle the expected number of continuous connections. But there's another issue that you don't address: Are you internet or intranet? Streaming has been reported to have some problems across proxies, much as you'd expect. So for a general purpose solution, you would probably be better served by long poll - for an intranet, where you understand the network infrastructure, streaming is quite likely a simpler, better performance solution for you.

The StreamHub GWT Comet Adapter was designed exactly for this scenario of streaming stock quotes. Example here: GWT Streaming Stock Quotes. It updates the stock prices of several stocks concurrently. I think the implementation underneath is Comet which is essentially streaming over HTTP.
Edit: It uses a different technique for each browser. To quote the website:
"There are several different underlying techniques used to implement Comet including Hidden iFrame, XMLHttpRequest/Script Long Polling, and embedded plugins such as Flash. The introduction of HTML 5 WebSockets in to future browsers will provide an alternative mechanism for HTTP Streaming. StreamHub uses a "best-fit" approach utilizing the most performant and reliable technique for each browser."

Streaming will be faster because data only crosses the wire one way. With polling, the latency is at least twice as high, since every update needs a request from the client before the response can come back.
Polling is more resilient to network outages since it doesn't rely on a connection being kept open.
I'd go for polling just for the robustness.

For live stock prices I would absolutely keep the connection open, and ensure user alert/reconnection on disconnect.
