I'm trying to plan a learning curve for a nodeJS module that streams all my output sounds to a server. Say I have a nodeJS module that forwards all outgoing sounds and music as packets to the server's port 8000. How can I connect a client's MP3 player to play back the streamed audio from the server? I mean, the buffer that is sent is just raw, messy bits; how do I make the audio player on the client recognize the format, connect to the stream, forward the packets to the player, and so on?
You need to open up a resource (think of it as a file) in the response to the client's request, and transmit chunks of data from your original media resource into it according to the ranges the request asks for. In other words, the request asks for data at some offset (in an additional header field) while downloading resource Z, and you constantly fill that resource with data so that there is always something to serve.
This is a rather complex topic but there are many examples and documentation already available.
Just doing a quick search these came out:
node (socket) live audio stream / broadcast
https://www.npmjs.com/package/audio-stream
By the way, I'm not an expert, but I think that if you want to do audio streaming, MP3 is probably not the right choice, and you may get some benefit from converting it to an intermediate streaming format.
For some time now I have been thinking of creating some sort of video streaming application (client and server). Doing a little searching, I always get applications for streaming, not how to code one.
I know that it should be something like... capture data, pack it, send it to the server, and then the server will broadcast to anyone connected... right?
So where should I start... should I study sockets? Should I study more about how to implement the UDP or TCP protocols... or the two combined?
Part of the problem you're having in your searches is that you haven't really defined what you're trying to solve. "video streaming application" isn't enough... what are the constraints? Some questions that help narrow down appropriate solutions:
Does the player need to be web-based?
Does the source need to be web-based?
What other platforms need support?
What sort of latency requirements do you have? (Video conferencing style, where quality is less important but low latency is very important... or more traditional streaming where you choose quality and don't care much about latency.)
What's the ratio between the source streams and those playing? Lots of watchers per stream, or lots of streams with few watchers?
At what sort of scale does your whole operation need to be at?
I know that it should be something like... capture data, pack it, send it to the server, and then the server will broadcast to anyone connected... right?
Close. Let's break this down a bit. All video streaming is going to have some element of capture, codecs, a container or transport, a server to distribute, and clients to connect to the server and reverse the whole process.
Media Capture
As I hinted at above, how you do this depends on the platform you're on. This is actually where things vary the most. If you're on Windows, there's DirectShow. OSX and Linux have their own capture frameworks. Also remember that you need an audio stream as well, which isn't necessarily handled by the video capture. If you're web-based, you need getUserMedia.
Codecs
It would be incredibly inefficient to send raw uncompressed frames. If it weren't for codecs, video streaming would be impossible as we know it. Each codec works a bit differently, but there are a lot of common techniques.
At a basic level, if you can imagine frames on a filmstrip, each frame isn't much different from the next. For a given shot, there may be motion happening, but much of the content in the frame stays very similar. We can save a lot of bandwidth by only sending what's changed. (Realistically speaking, each frame is always a bit different due to the analog nature of the world we're capturing, but codecs can spend very little bandwidth on the things that are almost completely the same vs. things that are totally different.) When the video cuts to a different shot, the codec sees that the whole frame is different and sends a whole frame. Frames that can stand alone are "I-frames". I-frames are also inserted regularly into the stream, every few seconds. Most video players will only seek to I-frames, because anything that isn't an I-frame requires decoding all the frames back to the preceding I-frame. If you've ever tried to hit an exact spot in a movie but the player put you somewhere a few seconds away, that's why. In addition, if some frames become corrupted, the stream will correct itself at the next I-frame. (Ever watched a video where a huge chunk went green for a few seconds but was fine later? That's why.)
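The I-frame/delta-frame idea can be illustrated with a toy encoder over flat arrays of pixel values. This is only a sketch of the concept; real codecs add motion estimation, transforms, and entropy coding on top, and the packet shapes here are invented for illustration:

```javascript
// Toy illustration of I-frames vs. delta ("P") frames. Not a real codec.
// Every GOP_SIZE-th frame is sent whole; the rest are sparse diffs
// against the previous frame.
const GOP_SIZE = 4;

function encode(frames) {
  const packets = [];
  let prev = null;
  frames.forEach((frame, i) => {
    if (i % GOP_SIZE === 0 || prev === null) {
      packets.push({ type: 'I', data: frame.slice() }); // stand-alone frame
    } else {
      const diffs = [];
      frame.forEach((v, j) => { if (v !== prev[j]) diffs.push([j, v]); });
      packets.push({ type: 'P', diffs });               // only what changed
    }
    prev = frame;
  });
  return packets;
}

function decode(packets) {
  const frames = [];
  let current = null;
  for (const p of packets) {
    if (p.type === 'I') current = p.data.slice();
    else for (const [j, v] of p.diffs) current[j] = v;  // needs prior frames
    frames.push(current.slice());
  }
  return frames;
}
```

Note that decoding can only start at an 'I' packet, which is exactly why players seek to I-frames and why corruption heals at the next one.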
Video codecs also use the nature of how we see things to their advantage. Our eyes are far more sensitive to changes in brightness than changes in color. Therefore, the codecs spend more bandwidth on brightness differences in the frame than they do in the color differences. There are also some crafty tricks for smoothing and adding visual noise to make things look more normal rather than blocky.
Audio codecs are also required. While a CD-quality uncompressed stereo stream may only take up about 1.4 Mbit/s, that's a lot of bandwidth in internet terms; a lot of streaming video sites use less bandwidth than that for the entire video. Audio codecs, much like video codecs, use some tricks around how we perceive sound to save bandwidth. (For a more detailed explanation, read my post about how MP3 works here: https://sound.stackexchange.com/a/25946/7209)
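The 1.4 Mbit figure falls straight out of the PCM parameters, which makes for a quick sanity check (the function here is just illustrative arithmetic):

```javascript
// Where the "1.4 Mbit" figure for uncompressed CD audio comes from:
// bits per second = sample rate × bit depth × channel count.
function pcmBitrate(sampleRateHz, bitDepth, channels) {
  return sampleRateHz * bitDepth * channels;
}

const cdBitrate = pcmBitrate(44100, 16, 2); // 1,411,200 bit/s ≈ 1.4 Mbit/s
// A typical 128 kbit/s MP3 is roughly an 11:1 reduction from this.
```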
Container
The next step is to mux your encoded audio and video streams together in a container format. If you were recording to disk, you might choose something like MKV which supports audio, video, subtitles, and more, all in the same file. WebM is basically a limited version of MKV but is designed to be easily supported by browsers. Or you might choose a format less complicated like MP4 where you are limited in choice of audio and video codecs, but get better player compatibility.
Since you're live streaming, the line between the streaming protocol and the container is often blurred a bit. HLS will require you to make a bunch of video files that stand alone, but your muxer and your codecs need to know how to segment these files in a way that they can be put together again. I think that RTMP takes its cues from FLV, but it also exchanges some information about the streams with the client. (If you use RTMP, you might read up on it elsewhere... I don't know much about RTMP under the hood.)
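To make the HLS idea concrete: a live media playlist is just a text file the player keeps re-fetching, listing the current stand-alone segments. The segment names, durations, and sequence number below are hypothetical:

```
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:142
#EXTINF:6.0,
segment142.ts
#EXTINF:6.0,
segment143.ts
#EXTINF:6.0,
segment144.ts
```

Each segment has to start with the state a decoder needs (e.g. begin on an I-frame), which is why the muxer and the codecs have to cooperate on segmentation.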
Server
Lots of choices here. In the case of WebRTC, the "server" might actually be the web browser doing all the encoding and what not because it can run peer-to-peer. Alternatively, you might have a specialized streaming server running RTMP, or a normal HTTP web server for distributing HLS chunks. Again, what you choose depends on your requirements.
Clients
Clients need to connect to the server, demux the streams, decode the audio and video, and play them back. It's the entire process listed above, but in reverse.
So where should I start...
Start by figuring out exactly what you want to do. If you don't know what you want to do, play around with WebRTC. The browsers do all the work, and it requires very little server resources in most cases. This will allow you to stream between a few clients in real time.
To get more advanced, start experimenting with what you already have off-the-shelf. FFmpeg is a great tool that you should absolutely know how to use, and it can be embedded in your solution.
A few things you probably shouldn't do (unless you really want to):
Don't invent your own codec. (The codecs we have today are very good. They have taken a ton of investment and decades of academic research to get to where they are.)
Don't invent your own streaming protocol. (You would have to fight to get it adopted in all the players. We already have a ton of streaming protocols to choose from. Use what's already there.)
should I study sockets? Should I study more about how to implement the UDP or TCP protocols... or the two combined?
It is always helpful to know the basics of networking. Yes, learning about UDP and TCP will certainly help you, but since you're not inventing your own streaming protocols, you won't even get to choose between them anyway.
I hope this helps you get started. In short, understand all the layers here. Once you have done that, you'll know what to do next, and what to Google for.
I would like to offer a few live broadcasts, as auto-refreshing JPGs, to users whose devices don't support any of the available streaming protocols.
I got the idea to use ffmpeg (or mjpg_streamer) to extract two frames per second from the RTMP live stream, base64-encode them, and load them with JavaScript at half-second intervals, but with 5-50 concurrent streams this is a hard job for the server.
What would be the best way to get two (or more) base64-encoded JPG images per second from multiple RTMP live streams?
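One way to keep the server cost down is to run a single ffmpeg per stream emitting an MJPEG pipe (e.g. something like `ffmpeg -i rtmp://... -r 2 -f image2pipe -c:v mjpeg pipe:1`; treat the exact flags as a starting point, not gospel), split the pipe into individual JPEGs once, and fan the base64 text out to every client. A sketch of the splitting and encoding half in Node, with the function names being my own:

```javascript
// Split a stream of concatenated JPEGs (as produced by ffmpeg's image2pipe
// muxer) into individual frames by scanning for the SOI/EOI markers.
// Good enough for baseline JPEG, where 0xFFD8/0xFFD9 don't appear in
// entropy-coded data; not a general-purpose JPEG parser.
function splitJpegs(buffer) {
  const frames = [];
  let start = -1;
  for (let i = 0; i + 1 < buffer.length; i++) {
    if (buffer[i] === 0xff && buffer[i + 1] === 0xd8) {
      start = i;                                  // SOI: frame begins
    } else if (buffer[i] === 0xff && buffer[i + 1] === 0xd9 && start >= 0) {
      frames.push(buffer.slice(start, i + 2));    // EOI: frame complete
      start = -1;
    }
  }
  return frames;
}

// Wrap one frame as a data URI ready to drop into an <img> src attribute.
function toDataUri(jpegBuffer) {
  return 'data:image/jpeg;base64,' + jpegBuffer.toString('base64');
}
```

Doing this once per stream and pushing the resulting string to all connected clients means the per-client cost is just text fan-out, rather than a decode/encode per viewer.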
I am about to develop a service that involves interactive live audio streaming. Interactive in the sense that a moderator can have his stream paused and, upon request, stream audio coming from one of his listeners during the streaming session.
It's more like a large pipe where the water flowing through can come in from only one of many small pipes connected to it at a time, with a moderator assigned to each stream controlling which pipe is open. I know nothing about media streaming, and I don't know if any cloud service provides an interactive, programmable solution such as this.
I am a programmer, and I will be able to program the logic involved in such an interaction. The issue is that I am a novice at media streaming and don't have any knowledge of its technologies or of the various server-side software used for this purpose. Are there any books that can introduce the technologies employed in media streaming? I am also trying to avoid using Flash.
Clients could be web or mobile. I don't think I will have any problem integrating with the client systems; my issue is implementing the server side.
You are effectively programming a switcher. Basically, you need to be able to switch from one audio stream to the other. With uncompressed PCM, this is very simple. As long as the sample rates and bit depth are equal, cut the audio on any frame (which is sample-accurate) and switch to the other. You can resample audio and apply dithering to convert between different sample rates and bit depths.
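That sample-accurate cut is just an index into the interleaved PCM. A sketch, assuming both streams are already decoded to Int16 interleaved PCM with matching sample rate and channel count (the function name is mine, and the inputs are assumed long enough to span the cut):

```javascript
// Switch from stream A to stream B at an exact frame boundary.
// a, b: Int16Array of interleaved PCM samples; switchFrame: frame index
// of the cut; channels: channel count (2 for stereo).
function switchStreams(a, b, switchFrame, channels) {
  const cut = switchFrame * channels;   // frame index -> interleaved sample index
  const out = new Int16Array(b.length);
  out.set(a.subarray(0, cut));          // moderator's stream up to the cut
  out.set(b.subarray(cut), cut);        // listener's stream from the cut onward
  return out;
}
```

In practice you would also apply a few milliseconds of crossfade around the cut, since the two waveforms won't line up and a hard cut can produce an audible click.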
The complicated part is when lossy codecs get involved. On a similar project, I went down the road of trying to stitch streams together, and I can tell you that it is nearly impossible, even with something as simple as MP3. (The bit reservoir makes things difficult.) Plus, it sounds as if you will be supporting a wide variety of devices, meaning you likely won't be able to standardize on a codec anyway. The best thing to do is take the multiple streams and decode them at the mix point of your system. Then you can switch from stream to stream easily with PCM.
At the output of your system, you'll want to re-encode to some lossy codec.
Due to latency, you don't typically want the server doing this switching. The switching should be done at the desk of the person encoding the stream, so that they can cue it accurately. Just write something that does all of the switching and encoding, and use SHOUTcast/Icecast for hosting your streams.
Can anyone provide any direction or links on how to use the adaptive bitrate feature that DSS says it supports? According to the release notes for v6.0.3:
3GPP Release 6 bit rate adaptation support
I assume that this lets you include multiple video streams in the 3gp file with varying bitrates, and DSS will automatically serve the best stream based on the current bandwidth. At least that's what I hope it does.
I guess I'm not sure what format DSS is expecting to receive the file in. I tried just adding several streams to a 3gp file, which resulted in QuickTime being unable to play it and VLC opening a different window for each stream in the file.
Any direction would be much appreciated.
The adaptive streaming in DSS 6.x takes a dropped-frame approach to reduce overall bandwidth, rather than making dynamic, on-the-fly bitrate adjustments. The result can be unpredictable. DSS drops the frames itself and does not need the video encoded in any special way for this to work.