For some time now I have been thinking of creating some sort of video streaming application (client and server). Doing a little searching, I always find applications for streaming and not how to code one.
I know that it should be something like... capture data, pack it, send it to the server, and then the server will broadcast to anyone connected... right?
So where should I start... should I study sockets... should I study more about how to implement UDP or TCP... or the two combined?
Part of the problem you're having in your searches is that you haven't really defined what you're trying to solve. "video streaming application" isn't enough... what are the constraints? Some questions that help narrow down appropriate solutions:
Does the player need to be web-based?
Does the source need to be web-based?
What other platforms need support?
What sort of latency requirements do you have? (Video conferencing style, where quality is less important but low latency is very important... or more traditional streaming where you choose quality and don't care much about latency.)
What's the ratio between the source streams and those playing? Lots of watchers per stream, or lots of streams with few watchers?
At what sort of scale does your whole operation need to run?
I know that it should be something like... capture data, pack it, send it to the server, and then the server will broadcast to anyone connected... right?
Close. Let's break this down a bit. All video streaming is going to have some element of capture, codecs, a container or transport, a server to distribute, and clients to connect to the server and reverse the whole process.
Media Capture
As I hinted at above, how you do this depends on the platform you're on. This is actually where things vary the most. If you're on Windows, there's DirectShow. OSX and Linux have their own capture frameworks. Also remember that you need an audio stream as well, which isn't necessarily handled in the video capture. If you're web-based, you need getUserMedia.
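If you just want to see the moving parts, here is a minimal capture sketch that leans on ffmpeg rather than a native framework. It assumes Linux with a V4L2 webcam and an ALSA microphone; the device names and output file are placeholders:

```python
import subprocess

# Capture video + audio, encode, and mux in one ffmpeg process.
# Assumes Linux; /dev/video0 (V4L2) and "default" (ALSA) are placeholders.
subprocess.run([
    "ffmpeg",
    "-f", "v4l2", "-i", "/dev/video0",         # video capture device
    "-f", "alsa", "-i", "default",             # audio capture device
    "-c:v", "libx264", "-preset", "veryfast",  # video codec
    "-c:a", "aac", "-b:a", "128k",             # audio codec
    "capture.mp4",                             # container (MP4 mux)
], check=True)
```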
Codecs
It would be incredibly inefficient to send raw, uncompressed frames. If it weren't for codecs, video streaming as we know it would be impossible. Each codec works a bit differently, but there are a lot of common techniques.
At a basic level, if you can imagine frames on a filmstrip, each frame isn't much different from the next. For a given shot, there may be motion happening, but much of the content in the frame stays very similar. We can save a lot of bandwidth by only sending what has changed. (Realistically speaking, each frame is always a bit different due to the analog nature of the world we're capturing, but codecs can spend very little bandwidth on the things that are almost completely the same vs. things that are totally different.) When we go to a different shot, the codec sees that the whole frame is different and sends a whole frame.

Frames that can stand alone are "I-frames". I-frames are also inserted regularly in the stream, every few seconds. Most video players will only seek to I-frames, because anything that isn't an I-frame requires decoding all the frames back to the preceding I-frame. If you've ever tried to hit an exact spot in a movie but the player put you somewhere within a few seconds nearby, that's why this happens. In addition, if some frames become corrupted, the stream will correct itself on the next I-frame. (Ever watched a video where a huge chunk of it went green for a few seconds but was fine later? That's why.)
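You can see this frame structure for yourself with ffprobe. A quick sketch (assumes ffprobe is installed; "input.mp4" is a placeholder):

```python
import json
import subprocess

# Dump the picture type (I/P/B) of every video frame to see the
# regular I-frame cadence described above.
out = subprocess.run(
    ["ffprobe", "-v", "quiet", "-select_streams", "v:0",
     "-show_frames", "-show_entries", "frame=pict_type",
     "-of", "json", "input.mp4"],
    capture_output=True, text=True, check=True,
).stdout

types = [f["pict_type"] for f in json.loads(out)["frames"]]
print("".join(types))  # e.g. "IPPPBPP...IPPPBPP..." with I repeating regularly
print("I-frames:", types.count("I"), "of", len(types), "frames")
```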
Video codecs also use the nature of how we see things to their advantage. Our eyes are far more sensitive to changes in brightness than to changes in color. Therefore, codecs spend more bandwidth on brightness differences in the frame than on color differences. There are also some crafty tricks for smoothing and adding visual noise to make things look more normal rather than blocky.
Audio codecs are also required. While a CD-quality stereo uncompressed audio stream may only take up about 1.4 Mbit/s, that's a lot of bandwidth in internet terms. A lot of streaming video sites use less bandwidth than that for the entire video. Audio codecs, much like video codecs, use tricks based on how we perceive sound to save bandwidth. (For a more detailed explanation, read my post about how MP3 works here: https://sound.stackexchange.com/a/25946/7209)
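That 1.4 Mbit figure isn't magic; it falls straight out of the PCM parameters:

```python
# CD-quality stereo PCM: 44.1 kHz, 16 bits per sample, 2 channels.
bitrate = 44_100 * 16 * 2
print(bitrate)  # 1411200 bits/s, i.e. ~1.4 Mbit/s before any compression
```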
Container
The next step is to mux your encoded audio and video streams together in a container format. If you were recording to disk, you might choose something like MKV, which supports audio, video, subtitles, and more, all in the same file. WebM is basically a limited version of MKV, but is designed to be easily supported by browsers. Or you might choose a less complicated format like MP4, where you are limited in your choice of audio and video codecs but get better player compatibility.
Since you're live streaming, the line between the streaming protocol and the container is often blurred a bit. HLS will require you to make a bunch of video files that stand alone, but your muxer and your codecs need to know how to segment these files in a way that they can be put together again. I think that RTMP takes its cues from FLV, but it also has some information about the streams in its exchange with the client. (If you use RTMP, you might read up on it elsewhere... I don't know much about RTMP under the hood.)
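As an example of that segmenting step, ffmpeg's HLS muxer will cut the output into standalone segments plus the playlist that ties them back together. A sketch (input name and segment length are placeholders):

```python
import subprocess

# Encode and segment into HLS: a stream.m3u8 playlist plus numbered
# .ts segments, each decodable on its own (segments begin on I-frames).
subprocess.run([
    "ffmpeg", "-i", "input.mp4",
    "-c:v", "libx264", "-c:a", "aac",
    "-f", "hls",
    "-hls_time", "6",        # target segment duration in seconds
    "-hls_list_size", "0",   # keep every segment in the playlist
    "stream.m3u8",
], check=True)
```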
Server
Lots of choices here. In the case of WebRTC, the "server" might actually be the web browser doing all the encoding and whatnot, because it can run peer-to-peer. Alternatively, you might have a specialized streaming server running RTMP, or a normal HTTP web server for distributing HLS chunks. Again, what you choose depends on your requirements.
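To make the HLS case concrete: since it's just static files over HTTP, even the Python standard library is enough to play the server role for small-scale testing. This sketch only adds the HLS MIME types; run it from the directory containing your playlist and segments:

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler

# A bare-bones "streaming server" for HLS: plain HTTP file serving,
# with the playlist and segment MIME types registered.
class HLSHandler(SimpleHTTPRequestHandler):
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".m3u8": "application/vnd.apple.mpegurl",
        ".ts": "video/mp2t",
    }

HTTPServer(("", 8080), HLSHandler).serve_forever()
```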
Clients
Clients need to connect to the server, demux the streams, decode the audio and video streams, and play them back. It's the entire process listed above, but in reverse.
So where should I start...
Start by figuring out exactly what you want to do. If you don't know what you want to do, play around with WebRTC. The browsers do all the work, and it requires very little server resources in most cases. This will allow you to stream between a few clients in real time.
To get more advanced, start experimenting with what you already have off-the-shelf. FFmpeg is a great tool that you should absolutely know how to use, and it can be embedded in your solution.
A few things you probably shouldn't do (unless you really want to):
Don't invent your own codec. (The codecs we have today are very good. They have taken a ton of investment and decades of academic research to get to where they are.)
Don't invent your own streaming protocol. (You would have to fight to get it adopted in all the players. We already have a ton of streaming protocols to choose from. Use what's already there.)
should I study sockets... should I study more about how to implement UDP or TCP... or the two combined?
It would always be helpful for you to know the basics of networking. Yes, learning about UDP and TCP will certainly help you, but since you're not inventing your own streaming protocols, you won't even get to choose between them anyway.
I hope this helps you get started. In short, understand all the layers here. Once you have done that, you'll know what to do next, and what to Google for.
I would like to offer a few live broadcasts, as auto-refreshing JPGs, to users whose devices don't support any of the available streaming protocols.
I got the idea to use ffmpeg (or mjpg_streamer) to extract two frames per second from the RTMP live stream, base64 encode them, and load them with JavaScript at half-second intervals, but with 5-50 concurrent streams this is a hard job for the server.
What would be the best way to get two (or more) images per second from multiple RTMP live streams as base64-encoded JPGs?
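For reference, this is roughly the setup I have in mind per stream; the RTMP URL is a placeholder, and this sketch is exactly the part I'm worried won't scale to 50 streams:

```python
import base64
import subprocess

# One ffmpeg process per RTMP stream: pull 2 frames/s as JPEGs on stdout,
# split on the JPEG start/end markers, and base64 encode each frame.
proc = subprocess.Popen(
    ["ffmpeg", "-i", "rtmp://example.com/live/stream1",
     "-vf", "fps=2",                   # two frames per second
     "-f", "image2pipe", "-c:v", "mjpeg", "-"],
    stdout=subprocess.PIPE, stderr=subprocess.DEVNULL,
)

buf = b""
while True:
    chunk = proc.stdout.read(4096)
    if not chunk:
        break
    buf += chunk
    while True:
        start = buf.find(b"\xff\xd8")            # JPEG SOI marker
        end = buf.find(b"\xff\xd9", start + 2)   # JPEG EOI marker
        if start == -1 or end == -1:
            break
        frame_b64 = base64.b64encode(buf[start:end + 2]).decode("ascii")
        buf = buf[end + 2:]
        # hand frame_b64 to whatever serves it to the JavaScript clients
```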
I am about to develop a service that involves interactive live audio streaming. Interactive in the sense that a moderator can have his stream paused and, upon request, stream audio coming from one of his listeners (during the streaming session).
It's more like a large pipe where the water flowing through can come in from only one of many small pipes connected to it at a time, with a moderator assigned to each stream controlling which pipe is opened. I know nothing about media streaming, and I don't know if a cloud service provides an interactive, programmable solution such as this.
I am a programmer, and I will be able to program the logic involved in such an interaction. The issue is that I am a novice at media streaming and don't have any knowledge of its technologies or the various software used on the server for such a purpose. Are there any books that can introduce the technologies employed in media streaming? I am also trying to avoid using Flash.
Clients could be web or mobile. I don't think I will have any problem integrating with the client system. My issue is implementing the server side.
You are effectively programming a switcher. Basically, you need to be able to switch from one audio stream to the other. With uncompressed PCM, this is very simple. As long as the sample rates and bit depths are equal, you can cut the audio on any frame (which is sample-accurate) and switch to the other stream. You can resample audio and apply dithering to convert between different sample rates and bit depths.
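To illustrate how simple the uncompressed case is, here is a sketch of a sample-accurate cut between two raw PCM buffers (16-bit mono assumed; real streams would be fed in incrementally, but the principle is the same):

```python
BYTES_PER_SAMPLE = 2  # 16-bit mono PCM

def switch(stream_a: bytes, stream_b: bytes, cut_sample: int) -> bytes:
    """Output stream_a up to cut_sample, then stream_b from that point on."""
    cut = cut_sample * BYTES_PER_SAMPLE
    # PCM has no inter-frame state, so any sample boundary is a legal
    # cut point; no decoder context is broken by the switch.
    return stream_a[:cut] + stream_b[cut:]

# Example: cut from source A to source B exactly one second in, at 44.1 kHz.
a = bytes(2 * 44_100 * BYTES_PER_SAMPLE)  # two seconds of silence from A
b = bytes(2 * 44_100 * BYTES_PER_SAMPLE)  # two seconds of silence from B
out = switch(a, b, cut_sample=44_100)
```

In practice you would also apply a short crossfade at the cut to avoid an audible click when the two signals don't meet at a zero crossing.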
The complicated part is when lossy codecs get involved. On a similar project, I went down the road of trying to stitch streams together, and I can tell you that it is nearly impossible, even with something as simple as MP3. (The bit reservoir makes things difficult.) Plus, it sounds as if you will be supporting a wide variety of devices, meaning you likely won't be able to standardize on a codec anyway. The best thing to do is take the multiple streams and decode them at the mix point of your system. Then you can switch from stream to stream easily with PCM.
At the output of your system, you'll want to re-encode to some lossy codec.
Due to latency, you don't typically want the server doing this switching. The switching should be done at the desk of the person encoding the stream, so that they can cue it accurately. Just write something that does all of the switching and encoding, and use SHOUTcast/Icecast for hosting your streams.
I need to test the application I developed, which uses HTTP Live Streaming (audio only), and I would like to see how it works in comparison with other similar apps. How can I conclude that an app uses HTTP Live Streaming by using a packet sniffer?
Look for a request for a playlist file with an .m3u8 extension and in the response look for #EXTM3U on the first line. The IETF draft should give you all the information you need to interpret the rest of the traffic.
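If you already have a candidate URL from the sniffer, you can apply the same two checks programmatically. A small sketch (the URL is a placeholder):

```python
from urllib.request import urlopen

# An HLS playlist has an .m3u8 extension and begins with #EXTM3U.
url = "https://example.com/audio/master.m3u8"
body = urlopen(url).read().decode("utf-8", errors="replace")
looks_like_hls = url.endswith(".m3u8") and body.lstrip().startswith("#EXTM3U")
print("Looks like HLS:", looks_like_hls)
```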
Here are some values you might find useful if you want to look at the stream more closely.
I have never tried this tool myself, but Apple released a stream validator; all you need is the playlist URL.
Bear in mind that it's possible to use a number of different streaming technologies, and it's a popular practice for a lot of services to use whichever one makes the most sense at the time. If you don't simulate the correct environment, you might never get an HLS stream at all.
Is it possible to recreate the media file from captured Wireshark logs? Is there any documentation that explains how this needs to be done?
I am doing RTSP-based streaming from my Darwin test server, and I want to compare the quality of the original and the streamed file.
I'm not familiar with Darwin Streaming Server, but generally RTSP is only for establishing the RTP stream. RTP packets normally flow in one direction only (ignoring the ACK packets for TCP).
For comparing the files, I would use one of the tools suggested by the other users.
But to answer your question for Wireshark:
filter your stream for the destination IP by using 'ip.addr eq <destination IP>'
look for the RTP or UDP packets from the RTSP server
if you see UDP packets: right-click on a packet -> 'Decode As' and choose 'RTP' in the Transport tab
choose 'Follow UDP Stream' from the context menu
now you have the whole RTP stream without the RTP headers
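If you'd rather script those steps than click through the GUI, tshark (Wireshark's command-line twin) can do the same "Decode As" and payload extraction. A sketch; the capture file and UDP port are placeholders for your own session:

```python
import subprocess

# Decode a known UDP port as RTP and dump each packet's payload as hex.
result = subprocess.run(
    ["tshark", "-r", "capture.pcap",
     "-d", "udp.port==5004,rtp",   # CLI equivalent of "Decode As: RTP"
     "-Y", "rtp",                  # keep only RTP packets
     "-T", "fields", "-e", "rtp.payload"],
    capture_output=True, text=True, check=True,
)

# Each non-empty line is one packet's payload as colon-separated hex bytes.
payload = b"".join(
    bytes.fromhex(line.replace(":", ""))
    for line in result.stdout.splitlines() if line
)
with open("stream.bin", "wb") as f:
    f.write(payload)
```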
But keep in mind that with H.264 you have packetization, which gives you extra bytes in the displayed stream. You cannot compare this directly with the original file!
Look here, in chapter 5.4, for a further description.
Better use the tools mentioned by the others!
I don't think it is possible the way you hope, as RTSP is a sort of conversation between a client and a server (or servers). To recreate the RTSP session, you would have to recreate all of this two-way traffic; it is not really comparable to opening a file in a video player.
I think you will find it easier to use VLC to stream the rtsp:// link and save it to a file. The stream will be transcoded while saving, so if you need a "true" comparison to the original file, you will want to use a lossless video codec for transcoding, and the output file could be very large.
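If you'd rather script it than use the VLC GUI, ffmpeg can do the same save with an explicitly lossless codec, so the only degradation left to measure is what the streaming itself introduced. A sketch; the rtsp:// URL is a placeholder for your Darwin server's link:

```python
import subprocess

# Save the received RTSP stream with lossless transcoding: FFV1 for video,
# uncompressed PCM for audio. Expect a very large output file.
subprocess.run([
    "ffmpeg", "-i", "rtsp://example.com/test.mp4",
    "-c:v", "ffv1",
    "-c:a", "pcm_s16le",
    "received.mkv",
], check=True)
```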
Using Ostinato, you should be able to replay the capture file and record the stream using VLC.