I'm working on getting a streaming video solution implemented for a client, with iOS devices targeted (mostly iPad).
I have diced up my video files into TS's and I have my accompanying m3u8 file. They are being hosted by a generic web host, and CDN'd by Amazon CloudFront, so on paper speed should be fine.
I am noticing that pretty much no matter what, the iPad is still having substantial buffering problems at pre-determined points (presumably where one segment ends and another begins).
My lowest bitrate for the TS files is 600 kb/s, which seems like it would be plenty low for typical WiFi streaming, but it still stops pretty hard.
I'm trying to figure out what is going wrong... I don't think it's the file hosting... as once it STARTS downloading, it goes fast. I feel like perhaps my m3u8 is somehow incomplete or inadequate...
As a side note, these videos are only 30-35 seconds long, and the media segmenter slices them into 3-4 pieces.
Has anyone seen anything like this?
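One way to test the "incomplete or inadequate m3u8" theory is to check the playlist against a few basic rules of the HLS format: it must start with #EXTM3U, every #EXTINF duration (rounded to the nearest integer) must not exceed #EXT-X-TARGETDURATION, and a finished video should end with #EXT-X-ENDLIST. The sketch below is only a rough sanity check, not a full validator; the function name and the segment file names are made up for illustration.

```python
def check_vod_playlist(text):
    """Return a list of problems found in a VOD media playlist (hypothetical helper)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    problems = []
    if not lines or lines[0] != "#EXTM3U":
        problems.append("first line is not #EXTM3U")

    target = None
    durations = []
    for ln in lines:
        if ln.startswith("#EXT-X-TARGETDURATION:"):
            target = int(ln.split(":", 1)[1])
        elif ln.startswith("#EXTINF:"):
            # Format is "#EXTINF:<duration>,<optional title>"
            durations.append(float(ln.split(":", 1)[1].split(",", 1)[0]))

    if target is None:
        problems.append("missing #EXT-X-TARGETDURATION")
    else:
        for i, d in enumerate(durations):
            if round(d) > target:
                problems.append(f"segment {i} lasts {d}s, longer than target {target}s")

    if "#EXT-X-ENDLIST" not in lines:
        problems.append("missing #EXT-X-ENDLIST (playlist looks live, not finished)")
    return problems


# Made-up playlist for a ~32 second clip cut into three pieces.
SAMPLE = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10.0,
seg0.ts
#EXTINF:10.0,
seg1.ts
#EXTINF:12.5,
seg2.ts
"""

if __name__ == "__main__":
    for problem in check_vod_playlist(SAMPLE):
        print(problem)
```

If the playlist passes these checks, the stall is more likely to come from the segments themselves or from how they are served.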
Related
We are developing an open-source streaming server and are running into some trouble with our implementation of HLS.
We've been able to successfully convert to TS and segment into HLS segments any stream we want to, and it plays back beautifully on most every player... except for the Apple players (iPad, iPhone, Safari, Quicktime). On those, the H264 encoding settings need to be picked very carefully, and even when sticking to Baseline/3.0 some visible glitching can be seen.
The AAC audio, no matter how we encode it (both ffmpeg's aac and the libfdk_aac encoders were tried in nearly all possible configurations), sounds choppy as well. (Again, all these versions play back just fine on non-Apple players.) Changing the encoding settings does yield better results sometimes, but we've not been able to find any combination that will work for every video we've been testing with.
This leads us to conclude that perhaps the Apple-based players require something in the TS stream itself that we're not doing correctly. Is there anything that could cause this kind of behavior? For reference, an HLS test stream produced by our packager/segmenter can be found here: link
We appreciate any feedback!
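One thing worth ruling out at the TS level is packet framing and continuity. Every MPEG-TS packet is 188 bytes and starts with the sync byte 0x47, and the 4-bit continuity counter must increase by one (modulo 16) per PID for packets that carry a payload. The sketch below counts violations of both rules. It is a simplified check under stated assumptions: it ignores the discontinuity indicator and legal duplicate packets, and the helper names are made up for illustration. Whether this is what the Apple players object to in this particular stream is not something the check can confirm.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47
NULL_PID = 0x1FFF


def check_ts(data):
    """Scan MPEG-TS bytes; return (sync_errors, continuity_errors)."""
    sync_errors = 0
    continuity_errors = 0
    last_cc = {}  # PID -> last continuity counter seen
    for off in range(0, len(data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = data[off:off + TS_PACKET_SIZE]
        if pkt[0] != SYNC_BYTE:
            sync_errors += 1
            continue
        pid = ((pkt[1] & 0x1F) << 8) | pkt[2]
        has_payload = bool(pkt[3] & 0x10)
        cc = pkt[3] & 0x0F
        # Null packets and adaptation-only packets do not advance the counter.
        if pid == NULL_PID or not has_payload:
            continue
        if pid in last_cc and cc != (last_cc[pid] + 1) & 0x0F:
            continuity_errors += 1
        last_cc[pid] = cc
    return sync_errors, continuity_errors


def make_packet(pid, cc):
    """Build a dummy payload-only packet (for demonstration only)."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10 | (cc & 0x0F)])
    return header + bytes(TS_PACKET_SIZE - len(header))


if __name__ == "__main__":
    stream = b"".join(make_packet(0x100, i) for i in (0, 1, 3, 4))  # counter skips 2
    print(check_ts(stream))
```

Running the same check on a real segment only requires reading the file in binary mode and passing the bytes to check_ts.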
I'd like to get real-time video from the iPhone to another device (either desktop browser or another iPhone, e.g. point-to-point).
NOTE: It's not one-to-many, just one-to-one at the moment. Audio can be part of the stream or go via a telephone call on the iPhone.
There are four ways I can think of...
1. Capture frames on iPhone, send frames to mediaserver, have mediaserver publish realtime video using host webserver.
2. Capture frames on iPhone, convert to images, send to httpserver, have javascript/AJAX in browser reload images from server as fast as possible.
3. Run httpServer on iPhone, capture 1 second duration movies on iPhone, create M3U8 files on iPhone, have the other user connect directly to httpServer on iPhone for liveStreaming.
4. Capture 1 second duration movies on iPhone, create M3U8 files on iPhone, send to httpServer, have the other user connected to the httpServer for liveStreaming. This is a good answer, has anyone gotten it to work?
Is there a better, more efficient option?
What's the fastest way to get data off the iPhone? Is it ASIHTTPRequest?
Thanks, everyone.
Sending raw frames or individual images will never work well enough for you (because of the amount of data and number of frames). Nor can you reasonably serve anything from the phone (WWAN networks have all sorts of firewalls). You'll need to encode the video, and stream it to a server, most likely over a standard streaming format (RTSP, RTMP). There is an H.264 encoder chip on the iPhone >= 3GS. The problem is that it is not stream oriented. That is, it outputs the metadata required to parse the video last. This leaves you with a few options.
Get the raw data and use FFmpeg to encode on the phone (will use a ton of CPU and battery).
Write your own parser for the H.264/AAC output (very hard)
Record and process in chunks (will add latency equal to the length of the chunks, and drop around 1/4 second of video between each chunk as you start and stop the sessions).
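To see what "outputs the metadata last" means in practice, assume the recorder writes an MP4/QuickTime file. In that container the index needed to parse the samples lives in a top-level `moov` box, and the raw media lives in `mdat`. The sketch below walks the top-level boxes and reports whether `moov` comes after `mdat`, which is the layout that prevents streaming a file while it is still being written. The helper names are made up for illustration, and the demo data is a fake file, not real camera output.

```python
import struct


def top_level_boxes(data):
    """Yield (type, offset, size) for each top-level box in MP4/QuickTime bytes."""
    off = 0
    while off + 8 <= len(data):
        size, kind = struct.unpack(">I4s", data[off:off + 8])
        header = 8
        if size == 1:
            # A 64-bit size follows the type field.
            if off + 16 > len(data):
                break
            size = struct.unpack(">Q", data[off + 8:off + 16])[0]
            header = 16
        elif size == 0:
            # Box extends to the end of the file.
            size = len(data) - off
        if size < header:
            break  # corrupt box, stop scanning
        yield kind.decode("latin-1"), off, size
        off += size


def index_comes_last(data):
    """True if the moov box (the index) appears after the mdat box (the media)."""
    kinds = [kind for kind, _, _ in top_level_boxes(data)]
    return ("moov" in kinds and "mdat" in kinds
            and kinds.index("moov") > kinds.index("mdat"))


def make_box(kind, payload=b""):
    """Build a minimal box (for demonstration only)."""
    return struct.pack(">I4s", 8 + len(payload), kind) + payload


if __name__ == "__main__":
    recorded = make_box(b"ftyp", b"qt  ") + make_box(b"mdat", bytes(32)) + make_box(b"moov")
    print(index_comes_last(recorded))
```

A file with this layout cannot be parsed until the final box has been written, which is why the options above all involve either bypassing the container or waiting for each chunk to finish.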
"Record and process in chunks (will add latency equal to the length of the chunks, and drop around 1/4 second of video between each chunk as you start and stop the sessions)."
I have just written code that does this, and it is quite possible to eliminate the gap by overlapping two AVAssetWriters. Since it uses the hardware encoder, I strongly recommend this approach.
We have similar needs; to be more specific, we want to implement streaming video & audio between an iOS device and a web UI. The goal is to enable high-quality video discussions between participants using these platforms. We did some research on how to implement this:
We decided to use OpenTok and managed to pretty quickly implement a proof-of-concept style video chat between an iPad and a website using the OpenTok getting started guide. There's also a PhoneGap plugin for OpenTok, which is handy for us as we are not doing native iOS.
Liblinphone also seemed to be a potential solution, but we didn't investigate further.
iDoubs also came up, but again, we felt OpenTok was the most promising one for our needs and thus didn't look at iDoubs in more detail.
I can get individual frames from the iPhone's cameras just fine. What I need is a way to package them up with sound for streaming to the server. Sending the files once I have them isn't much of an issue. It's the generation of the files for streaming that I am having problems with. I've been trying to get FFmpeg to work without much luck.
Anyone have any ideas on how I can pull this off? I would like a known working API or instructions on getting FFmpeg to compile properly in an iPhone app.
You could divide your recording into separate files with a length of, say, 10 seconds, then send them separately. If you use AVCaptureSession's beginConfiguration and commitConfiguration methods to batch your output changes, you shouldn't drop any frames between the files. This has many advantages over frame-by-frame upload:
The files can be directly used for HTTP live streaming without any server side processing.
The gaps between data transfers allow the antennas to sleep if the connection is fast enough, saving battery life.
Conversely, if the connection is slow so upload is slower than recording, managing delayed upload of a set of files is much easier than a stream of bytes.
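Even if the uploaded chunks need no processing, something still has to publish a playlist that points at them. A minimal sketch of that step, assuming the chunks arrive in order and are already in a format the player accepts: keep a sliding window of the most recent files and advance #EXT-X-MEDIA-SEQUENCE as old ones drop out. The function name, file names, and the three-segment window are assumptions for illustration.

```python
def live_playlist(segments, window=3, target_duration=10):
    """Build a live HLS media playlist from the most recent segments.

    segments: list of (uri, duration_seconds) for every chunk uploaded so far,
    in recording order. Only the last `window` entries are listed.
    """
    recent = segments[-window:]
    # Sequence number of the first listed segment, counting from 0.
    first_sequence = len(segments) - len(recent)
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{target_duration}",
        f"#EXT-X-MEDIA-SEQUENCE:{first_sequence}",
    ]
    for uri, duration in recent:
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(uri)
    # No #EXT-X-ENDLIST: the player keeps reloading the playlist for new chunks.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    uploaded = [(f"chunk{i}.ts", 10.0) for i in range(5)]
    print(live_playlist(uploaded))
```

When the recording stops, the server would append #EXT-X-ENDLIST once so that players stop polling.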
Using FFMPEG, Live555, JSON
I'm not sure how it works, but if you look at the source files at http://github.com/dropcam/dropcam_for_iphone you can see that they are using a combination of open source projects like FFmpeg, Live555, JSON etc. Using Wireshark to sniff the packets sent from one of the public cameras that can be viewed with the free "Dropcam For Iphone App" from the App Store, I was able to confirm that the iPhone was receiving H264 video via RTP/RTSP/RTCP and even RTMPT, which suggests that some of the stream may be tunneled.
Maybe someone could take a look at the open source files and explain how they got RTSP to work on the iPhone.
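For anyone repeating the packet capture, the fixed 12-byte RTP header is easy to decode by hand, which helps confirm that a UDP payload really is RTP. The sketch below follows the standard RTP header layout; the sample packet is fabricated and has nothing to do with Dropcam's actual traffic. H.264 over RTP normally uses a dynamic payload type in the range 96 to 127.

```python
import struct


def parse_rtp_header(packet):
    """Decode the fixed 12-byte RTP header into a dict."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": first >> 6,
        "padding": bool(first & 0x20),
        "extension": bool(first & 0x10),
        "csrc_count": first & 0x0F,
        "marker": bool(second & 0x80),
        "payload_type": second & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


if __name__ == "__main__":
    # Fabricated packet: version 2, marker set, dynamic payload type 96.
    sample = struct.pack("!BBHII", 0x80, 0x80 | 96, 1234, 90000, 0xDEADBEEF) + b"\x00"
    print(parse_rtp_header(sample))
```

A version field other than 2 is a strong hint that the payload is not RTP at all.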
Thanks for the info, TinC0ils. After digging a little deeper, I've read that they have modified the Axis camera with custom firmware to limit the streaming to just a single 320x240 H264 feed, to better provide consistent video quality over different networks and, as you point out, be less of a draw on the phone's hardware etc. My interest was driven by a desire to use my iPhone to view live video and audio from a couple of IP cameras that I own without the jerkiness of MJPEG or the inherent latency that is involved with "http live streaming". I think Dropcam have done an excellent job with their hardware/software combo, I just don't need any new hardware at the moment.
Oh yeah, I almost forgot the reason for this post: RTSP PROTOCOL DOES WORK ON THE IPHONE!
They are using open source projects to receive the frames and are decoding in software instead of using the hardware decoders. This will work; however, it runs counter to Apple's requirement that you use their HTTP Streaming. It will also require more CPU, so it may not decode video at the desired fps/resolution on older devices, and it may reduce battery life compared to HTTP streaming.
I'm wondering if there are any atomic examples out there for streaming audio FROM the iPhone to a server. I'm not interested in telephony or SIP style solutions, just a simple socket stream to send an audio clip, in .wav format, as it is being recorded. I haven't had much luck with Google or other obvious avenues, although there seem to be many examples of doing this the other way around.
I can't figure out how to register the unregistered account I initially posted with.
Anyway, I'm not really interested in the audio format at present, just the streaming aspect. I want to take the microphone input and stream it from the iPhone to a server. I don't presently care about the transfer rate, as I'll initially just test over a WiFi connection, not 3G. The reason I can't cache it is that I'm interested in trying out some open-source speech recognition software for my undergraduate thesis. Caching and then sending the recording is possible, but then it takes considerably longer to get the voice data to the server. If I can start sending the data as soon as I start recording, the response time improves considerably, because most of the data will already have reached the server by the time I let go of the record button. Furthermore, if I can get this streaming functionality to work from the iPhone, then on the server side I can also start the speech recognizer as soon as the first bit of audio comes through. Again, this should considerably speed up the total time the transaction takes from the user's perspective.
Colin Barrett mentions the phones and phone networks, but these are actually a pretty suboptimal solution for ASR, mainly because they provide no good way to recover from errors; doing so over a VoIP dialogue is a horrible experience. However, the iPhone, and in particular the touch screen, provides a great way to do that, through use of an IME or n-best lists for the other recognition candidates.
If I can figure out the basic architecture for streaming the audio, then I can start thinking about doing FLAC encoding or something similar to reduce the required transfer rate. Maybe even feature extraction, although that limits the later ability to retrain the system with the recordings.
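The basic architecture described here can be sketched independently of the phone: the sender writes each audio buffer to a TCP socket as soon as it is recorded, and the receiver hands every piece of data to the recognizer as it arrives instead of waiting for the whole clip. The sketch below uses Python on both ends purely to illustrate the flow; on the iPhone the sending side would use the platform's own socket API. The chunk size assumes 16 kHz, 16-bit mono PCM, and all names are made up for illustration.

```python
import socket
import threading

# 100 ms of 16 kHz, 16-bit mono PCM (assumed format): 16000 * 2 * 0.1 bytes.
CHUNK = 3200


def send_audio(sock, chunks):
    """Send each chunk as soon as it is available, then signal end of stream."""
    for chunk in chunks:
        sock.sendall(chunk)
    sock.shutdown(socket.SHUT_WR)


def receive_audio(sock, on_audio):
    """Call on_audio(data) for every piece of audio as it arrives; return total bytes."""
    total = 0
    while True:
        data = sock.recv(4096)
        if not data:  # sender closed its side: end of recording
            break
        on_audio(data)  # e.g. feed the recognizer incrementally
        total += len(data)
    return total


def demo():
    """Stream one second of silence through a local socket pair."""
    phone_side, server_side = socket.socketpair()
    fake_recording = (bytes(CHUNK) for _ in range(10))  # stand-in for microphone buffers
    sender = threading.Thread(target=send_audio, args=(phone_side, fake_recording))
    sender.start()
    received = []
    total = receive_audio(server_side, received.append)
    sender.join()
    phone_side.close()
    server_side.close()
    return total


if __name__ == "__main__":
    print(demo())
```

Because the receiver sees data while the recording is still in progress, any later compression step such as FLAC only changes what goes into each chunk, not the structure of the loop.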