Query for optimal pixel format when capturing video on iOS?

The AVFoundation Programming Guide states that the preferred pixel formats when capturing video are:
for iPhone 4: kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange or kCVPixelFormatType_32BGRA
for iPhone 3G: kCVPixelFormatType_422YpCbCr8 or kCVPixelFormatType_32BGRA
(There are no recommendations [yet] for the iPhone 5 or for iPad devices with cameras.)
There is, however, no guidance on how to determine which device the app is currently running on. And what if the preferred pixel format is different on a future device that my app does not know about?
What is the correct, future-proof way to determine the preferred YpCbCr pixel format for any device?

I believe you can just set the video settings to nil and AVFoundation will use the most efficient format. For instance, instead of doing
NSDictionary *videoSettings = [NSDictionary dictionaryWithObject:[NSNumber numberWithInt:kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange] forKey:(id)kCVPixelBufferPixelFormatTypeKey];
videoOutput.videoSettings = videoSettings;
do this:
videoOutput.videoSettings = nil;
You may also try not setting it at all. I know in the past I would just set this to nil unless I needed to capture images in a specific format.
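Edit:
To find out which format AVFoundation chose, inspect a sample buffer delivered to your delegate:
CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(source);
// The pixel format is a FourCC such as '420v' or 'BGRA'; the color space is a separate property.
OSType pixelFormat = CVPixelBufferGetPixelFormatType(imageBuffer);
CGColorSpaceRef cref = CVImageBufferGetColorSpace(imageBuffer);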
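Core Video pixel format types are four-character codes packed into a 32-bit integer, so the value returned by CVPixelBufferGetPixelFormatType is easier to read once unpacked. The following is a minimal sketch in plain C (which also compiles inside an Objective-C file); the helper name fourcc_to_string is hypothetical and not part of any Apple API.

```c
#include <stdint.h>

/* Unpacks a four-character code (most significant byte first) into a
 * NUL-terminated string. `out` must have room for 5 bytes.
 * Hypothetical helper, not an Apple API. */
void fourcc_to_string(uint32_t code, char out[5])
{
    out[0] = (char)((code >> 24) & 0xFF);
    out[1] = (char)((code >> 16) & 0xFF);
    out[2] = (char)((code >> 8) & 0xFF);
    out[3] = (char)(code & 0xFF);
    out[4] = '\0';
}
```

For kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange this yields "420v", and for kCVPixelFormatType_32BGRA it yields "BGRA". Some older format constants are small integers rather than printable codes (kCVPixelFormatType_32ARGB is 32, for example), so log the numeric value as well.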

Related

What is kAudioSessionProperty_InputSources actually good for?

I've tried to fetch the list of available audio input devices on an iPhone by using this code:
CFArrayRef arrayRef;
UInt32 size = sizeof(arrayRef);
OSStatus status = AudioSessionGetProperty(kAudioSessionProperty_InputSources, &size, &arrayRef);
assert(status == noErr);
NSArray *array = (__bridge NSArray *)arrayRef;
The call works and returns without error, but the resulting array is always empty, no matter what hardware I have connected. I've tried two ordinary mobile headsets (an original one from Apple and one from Samsung) and two kinds of USB microphones (an iXY from Rode and an iM2X from Tascam), but the array always stays empty. So I wonder what kinds of input sources would actually be listed by this property. Is it usable at all?
By using a listener callback on the audio routes, I was able to verify that all 4 devices are detected correctly. I was also able to record audio with each of the devices, so they all work properly. I use an iPhone 4S with iOS 6.1.3 (10B329).
The property you are referring to is only for audio input sources in a USB audio accessory attached through the iPad camera connection kit, as mentioned in the AudioSessionServices class reference.
To get a non-empty array, you will need to test with, say, a USB audio workstation that plugs into the iPad camera connection kit.
Here is a link that lists some hardware that uses the iPad camera connection kit.
Connecting USB audio interfaces using the Apple iPad Camera Connection Kit.
Also from the class reference
If there is no audio input source available from the attached accessory, this property’s value is an empty array.
So from the list found in the above link (scroll down to the "List of some compatible devices" subheading), the devices you would be interested in, those that yield a non-empty result, are ones that offer audio input, such as the Alesis iO4, iO2, or iO2 express.
EDIT: There's merit in the answer provided by Shawn Hershey with regard to using a non-deprecated Objective-C alternative. However, you would be most interested in the portType property of the AVAudioSessionPortDescription class (available from iOS 6.0).
Two constants of interest are AVAudioSessionPortLineIn and AVAudioSessionPortUSBAudio. The first is for audio input through the dock connector, which is how the microphones you tested connect.
In iOS 7.0 and later you can query the availableInputs property of the AVAudioSession class. In iOS 6 you can only query the currentRoute property.
I found this Technical Q&A very helpful -
AVAudioSession - microphone selection
I'm very new to audio programming on iPhones so I don't have an answer to the question of what that particular property is good for, but if you want the list of audio inputs, I think this will work:
NSArray * ais = [[AVAudioSession sharedInstance] availableInputs];
This provides an array of AVAudioSessionPortDescription objects.
for (AVAudioSessionPortDescription *pd in ais) {
    NSLog(@"%@", pd.portName);
}

How to get real time video stream from iphone camera and send it to server?

I am using AVCaptureSession to capture video and get real-time frames from the iPhone camera. How can I send them to a server with the frames and sound multiplexed, and how can I use ffmpeg for this task? If anyone has a tutorial or an example for ffmpeg, please share it here.
The way I'm doing it is to implement an AVCaptureSession, which has a delegate with a callback that's run on every frame. That callback sends each frame over the network to the server, which has a custom setup to receive it.
Here's the flow:
http://developer.apple.com/library/ios/#documentation/AudioVideo/Conceptual/AVFoundationPG/Articles/03_MediaCapture.html#//apple_ref/doc/uid/TP40010188-CH5-SW2
And here's some code:
// make input device
NSError *deviceError;
AVCaptureDevice *cameraDevice = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
AVCaptureDeviceInput *inputDevice = [AVCaptureDeviceInput deviceInputWithDevice:cameraDevice error:&deviceError];
// make output device
AVCaptureVideoDataOutput *outputDevice = [[AVCaptureVideoDataOutput alloc] init];
[outputDevice setSampleBufferDelegate:self queue:dispatch_get_main_queue()];
// initialize capture session
AVCaptureSession *captureSession = [[[AVCaptureSession alloc] init] autorelease];
[captureSession addInput:inputDevice];
[captureSession addOutput:outputDevice];
// make preview layer and add so that camera's view is displayed on screen
AVCaptureVideoPreviewLayer *previewLayer = [AVCaptureVideoPreviewLayer layerWithSession:captureSession];
previewLayer.frame = view.bounds;
[view.layer addSublayer:previewLayer];
// go!
[captureSession startRunning];
Then the output device's delegate (here, self) has to implement the callback:
-(void) captureOutput:(AVCaptureOutput*)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection*)connection
{
CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer( sampleBuffer );
CGSize imageSize = CVImageBufferGetEncodedSize( imageBuffer );
// also in the 'mediaSpecific' dict of the sampleBuffer
NSLog( @"frame captured at %.fx%.f", imageSize.width, imageSize.height );
}
Sending raw frames or individual images will never work well enough for you (because of the amount of data and number of frames). Nor can you reasonably serve anything from the phone (WWAN networks have all sorts of firewalls). You'll need to encode the video, and stream it to a server, most likely over a standard streaming format (RTSP, RTMP). There is an H.264 encoder chip on the iPhone >= 3GS. The problem is that it is not stream oriented. That is, it outputs the metadata required to parse the video last. This leaves you with a few options.
1) Get the raw data and use FFmpeg to encode on the phone (will use a ton of CPU and battery).
2) Write your own parser for the H.264/AAC output (very hard).
3) Record and process in chunks (will add latency equal to the length of the chunks, and drop around 1/4 second of video between each chunk as you start and stop the sessions).
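To put a number on "the amount of data" for raw frames, the arithmetic can be sketched in C. The resolution and frame rate used in the note below (640x480 at 30 fps) are illustrative assumptions, not figures from the answer; the function name is hypothetical.

```c
#include <stdint.h>

/* Bytes per second needed to send uncompressed video.
 * bits_per_pixel is 32 for BGRA and 12 for 4:2:0 YpCbCr formats. */
uint64_t raw_video_bytes_per_second(uint32_t width, uint32_t height,
                                    uint32_t bits_per_pixel, uint32_t fps)
{
    uint64_t bits_per_frame = (uint64_t)width * height * bits_per_pixel;
    return bits_per_frame / 8 * fps;
}
```

Under those assumptions, BGRA frames need 36,864,000 bytes per second (roughly 295 Mbit/s) and 4:2:0 frames need 13,824,000 bytes per second (roughly 110 Mbit/s), which is why encoding before sending is the practical route.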
There is a long and a short story to it.
This is the short one:
Go look at https://github.com/OpenWatch/H264-RTSP-Server-iOS
This is a starting point.
You can download it and see how the author extracts the frames. It is a small and simple project.
Then you can look at kickflip, which has a specific function, "encodedFrame", that is called back once an encoded frame arrives. From that point you can do what you want with the frame, such as sending it via a websocket. There is also a fair amount of difficult code available for reading MPEG atoms.
Look here, and here.
Try capturing video using AV Foundation framework. Upload it to your server with HTTP streaming.
Also check out another Stack Overflow post, quoted below.
(The post below was found at this link here)
You most likely already know....
1) How to get compressed frames and audio from iPhone's camera?
You can not do this. The AVFoundation API has prevented this from
every angle. I even tried named pipes, and some other sneaky unix foo.
No such luck. You have no choice but to write it to file. In your
linked post a user suggests setting up the callback to deliver encoded
frames. As far as I am aware this is not possible for H.264 streams.
The capture delegate will deliver images encoded in a specific pixel
format. It is the Movie Writers and AVAssetWriter that do the
encoding.
2) Is encoding uncompressed frames with ffmpeg's API fast enough for
real-time streaming?
Yes it is. However, you will have to use libx264 which gets you into
GPL territory. That is not exactly compatible with the app store.
I would suggest using AVFoundation and AVAssetWriter for efficiency
reasons.

ffmpeg-x264 encode: BGRA to AVFrame (ffmpeg) and vice versa, for iOS?

I am working on video processing on iOS (iPhone/iPod/iPad) using Objective-C. I am using the AVFoundation framework to capture video, and I want to encode and decode the video frames using ffmpeg with libx264. I have compiled the ffmpeg-x264 library for iOS, and I get kCVPixelFormatType_32BGRA frames from AVFoundation.
My problems are:
1. How do I convert kCVPixelFormatType_32BGRA to an AVFrame for encoding with avcodec_encode_video?
2. How do I convert an AVFrame back to kCVPixelFormatType_32BGRA on the decode side, from avcodec_decode_video2?
Please help me get started with the above, or point me to a working tutorial. Thanks in advance.
If you're trying to use FFmpeg, you'll need to capture in kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange rather than kCVPixelFormatType_32BGRA, and then you can put the data into an AVFrame. You'll probably also want to convert what the iOS camera delivers (NV12, a luma plane followed by a plane of interleaved CbCr samples) to YUV420P so that it can be received on devices that aren't running iOS. If you only care about iOS devices, you can skip this conversion and pack the NV12 data straight into the AVFrame.
Since the buffer is already in a YUV format, you can lock it with CVPixelBufferLockBaseAddress, get the luma plane with CVPixelBufferGetBaseAddressOfPlane(buf, 0) and the CbCr plane with CVPixelBufferGetBaseAddressOfPlane(buf, 1), and encode from those addresses.
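The NV12-to-YUV420P step mentioned above amounts to copying the luma plane and splitting the interleaved CbCr plane into separate Cb and Cr planes. Here is a minimal sketch in plain C, assuming even width and height and tightly packed destination planes; the function name is hypothetical. The source pointers and strides would come from CVPixelBufferGetBaseAddressOfPlane and CVPixelBufferGetBytesPerRowOfPlane while the buffer is locked, and the three destination planes correspond to data[0], data[1] and data[2] of a YUV420P AVFrame.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Converts NV12 (Y plane + interleaved CbCr plane) to planar YUV420P.
 * Source strides are in bytes and may include row padding.
 * Destination planes are tightly packed: width bytes per Y row and
 * width/2 bytes per Cb/Cr row. Assumes even width and height. */
void nv12_to_yuv420p(const uint8_t *src_y, size_t src_y_stride,
                     const uint8_t *src_cbcr, size_t src_cbcr_stride,
                     uint8_t *dst_y, uint8_t *dst_cb, uint8_t *dst_cr,
                     size_t width, size_t height)
{
    size_t chroma_w = width / 2;
    size_t chroma_h = height / 2;

    /* Luma: copy row by row, dropping any source padding. */
    for (size_t row = 0; row < height; row++) {
        memcpy(dst_y + row * width, src_y + row * src_y_stride, width);
    }

    /* Chroma: NV12 stores Cb first, then Cr, for each 2x2 block. */
    for (size_t row = 0; row < chroma_h; row++) {
        const uint8_t *src = src_cbcr + row * src_cbcr_stride;
        for (size_t col = 0; col < chroma_w; col++) {
            dst_cb[row * chroma_w + col] = src[2 * col];
            dst_cr[row * chroma_w + col] = src[2 * col + 1];
        }
    }
}
```

If the destination AVFrame uses padded rows, replace the tightly packed row widths with the frame's linesize values.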
Once you decode the image, you'll need to convert the colors from YUV420P to BGRA. If you skipped the NV12-to-YUV420P conversion before encoding, you'll convert from NV12 to BGRA instead.
Hope this helps a bit. You can find the proper color conversion algorithms online.
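As one example of such a conversion, here is a per-pixel sketch in plain C using the common integer approximation of the BT.601 video-range matrix. The choice of BT.601 is an assumption (HD content typically uses BT.709 coefficients), and the function names are hypothetical.

```c
#include <stdint.h>

/* Takes a value scaled by 256, clamps it to 0..255 after scaling down. */
static uint8_t clamp_shift8(int32_t v)
{
    if (v < 0) return 0;
    v >>= 8;
    return v > 255 ? 255 : (uint8_t)v;
}

/* Converts one video-range YCbCr sample to BGRA (assumed BT.601).
 * out[0..3] receive B, G, R, A in that order. */
void ycbcr_to_bgra(uint8_t y, uint8_t cb, uint8_t cr, uint8_t out[4])
{
    int32_t c = (int32_t)y - 16;   /* video-range luma offset */
    int32_t d = (int32_t)cb - 128; /* centered blue difference */
    int32_t e = (int32_t)cr - 128; /* centered red difference */

    out[0] = clamp_shift8(298 * c + 516 * d + 128);           /* B */
    out[1] = clamp_shift8(298 * c - 100 * d - 208 * e + 128); /* G */
    out[2] = clamp_shift8(298 * c + 409 * e + 128);           /* R */
    out[3] = 255;                                             /* A */
}
```

For a 4:2:0 frame, the pixel at (x, y) takes its luma from (x, y) and its chroma from (x / 2, y / 2). A loop like this is fine for checking correctness; for full-rate video a vectorized or GPU path is usually needed.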

iOS Video: More than 4 simultaneous AVAssetReaders possible?

I would like to render multiple H264 mp4 videos on multiple views at the same time. Target is to read about 8 short videos, each at a size of 100x100 pixels and let them display their content on multiple positions on the screen, simultaneously.
Imagine 24 squares on the screen, each showing one video out of pool of 8 videos.
MoviePlayer doesn't work, for it's only showing one fullscreen video. An AVPlayer with multiple AVPlayerLayers is limited, because only the most recently created layer will show its content on screen (according to the documentation and my testing).
So I wrote a short video class and created an instance for every .mp4 file in my package, using AVAssetReader to read its content. On update, every video frame is retrieved, converted to a UIImage and displayed, according to the video's frame rate. Furthermore, these images are cached for fast access when looping.
- (id) initWithAsset:(AVURLAsset*)asset withTrack:(AVAssetTrack*)track
{
self = [super init];
if (self)
{
NSDictionary* settings = [NSDictionary dictionaryWithObjectsAndKeys:[NSNumber numberWithInt:kCVPixelFormatType_32BGRA], (NSString*)kCVPixelBufferPixelFormatTypeKey, nil];
mOutput = [[AVAssetReaderTrackOutput alloc] initWithTrack:track outputSettings:settings];
mReader = [[AVAssetReader alloc] initWithAsset:asset error:nil];
[mReader addOutput:mOutput];
BOOL status = [mReader startReading];
}
return self;
}
- (void) update:(double)elapsed
{
CMSampleBufferRef buffer = [mOutput copyNextSampleBuffer];
if (buffer)
{
UIImage* image = [self imageFromSampleBuffer:buffer];
CFRelease(buffer); // copyNextSampleBuffer returns an owned reference
}
[...]
}
Actually this works pretty well, but only for 4 videos. The fifth one never shows up. First I thought of memory issues, but I tested it on the following devices:
iPhone 3GS
iPhone 4
iPad
iPad 2
I had the same behaviour on each device: 4 videos playing in a smooth loop, no differences.
If it were a memory issue, I would have expected either the iPad 2 to show 5 or 6 videos (due to its better hardware) or the 3GS to show only 1 or crash somewhere.
The simulator shows all videos, though.
Debugging on the device shows, that
BOOL status = [mReader startReading];
returns false for video 5,6,7 and 8.
So, is there some kind of hardware setting (or restriction) that doesn't allow more than 4 simultaneous AVAssetReaders? Because, I can't really explain this behaviour. I don't think that all devices have the exact same amount of video memory.
Yes, iOS has an upper limit on the number of videos that can be decoded at one time. While your approach is good, I don't know of any way to work around this limit and keep that many H.264 decoders active at once. If you are interested, please have a look at my solution to this problem, an Xcode project called Fireworks. Basically, this demo decodes a bunch of alpha channel videos to disk, and then plays each one by mapping a portion of the video file into memory. This approach makes it possible to play more than 4 movies at the same time without using up all the system memory and without running into the hard limit on the number of H.264 decoder objects.
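The idea of mapping only a portion of a decoded file into memory can be sketched with POSIX mmap. This is not the Fireworks project's code: the file layout (fixed-size raw frames stored back to back) and all names are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct {
    void *base;            /* page-aligned address returned by mmap */
    size_t length;         /* mapped length, needed for munmap */
    const uint8_t *pixels; /* first byte of the requested frame */
} FrameMapping;

/* Maps frame `index` of a file that stores fixed-size decoded frames
 * back to back (hypothetical layout). Returns 0 on success, -1 on error. */
int map_frame(int fd, size_t index, size_t frame_bytes, FrameMapping *out)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || frame_bytes == 0) return -1;

    /* mmap offsets must be page aligned, so map from the page boundary
     * at or before the frame and remember the slack. */
    off_t start = (off_t)index * (off_t)frame_bytes;
    off_t aligned = start - (start % page);
    size_t slack = (size_t)(start - aligned);
    size_t length = frame_bytes + slack;

    void *base = mmap(NULL, length, PROT_READ, MAP_PRIVATE, fd, aligned);
    if (base == MAP_FAILED) return -1;

    out->base = base;
    out->length = length;
    out->pixels = (const uint8_t *)base + slack;
    return 0;
}

/* Releases a mapping created by map_frame. */
void unmap_frame(FrameMapping *m)
{
    munmap(m->base, m->length);
    m->base = NULL;
    m->pixels = NULL;
    m->length = 0;
}
```

Because only the frame currently on screen is mapped, resident memory stays close to one frame per video regardless of clip length, and no H.264 decoder is held open during playback.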
Have you tried creating separate AVPlayerItems based on the same AVAsset for each AVPlayerLayer?
Here's my latest iteration of a perfectly smooth-scrolling collection view with real-time video previews (up to 16 at a time):
https://youtu.be/7QlaO7WxjGg
It even uses a cover flow custom layout and "reflection" view that mirrors the video preview perfectly. The source code is here:
http://www.mediafire.com/download/ivecygnlhqxwynr/VideoWallCollectionView.zip

Audio/Voice Visualization

Hey you Objective-C bods.
Does anyone know how I would go about changing (transforming) an image based on the input from the Microphone on the iPhone?
i.e. When a user speaks into the Mic, the image will pulse or skew.
[edit] Anyone have any ideas, I have (what is basically) a voice recording app. I just wanted something to change as the voice input is provided. I've seen it in a sample project, but that wasn't with an UIImage. [/edit]
Thanking you!!
Apple put together some great frameworks for this! The AVFoundation framework and CoreAudio framework will be the most useful to you.
To get audio level information, AVAudioRecorder is useful. Although it is designed for recording, it also provides level data for the microphone. This would be useful for deforming your image based on how loud the user is shouting at their phone ;)
Here is the apple documentation for AVAudioRecorder: AVAudioRecorder Class Reference
A bit more detail:
// You will need an AVAudioRecorder object (created elsewhere with initWithURL:settings:error:)
AVAudioRecorder *myRecorderObject;
// To be able to get levels data from the microphone you need
// to enable metering for your recorder object
[myRecorderObject prepareToRecord];
myRecorderObject.meteringEnabled=YES;
[myRecorderObject record];
// Refresh the meters, then poll the microphone for levels data.
// Values are in decibels: 0 is full scale, -160 is near silence.
[myRecorderObject updateMeters];
float peakPower = [myRecorderObject peakPowerForChannel:0];
float averagePower = [myRecorderObject averagePowerForChannel:0];
If you want to see a great example of how an AVAudioRecorder object can be used to get levels data, check out this tutorial.
As for deforming your image, that would be up to an image library. There are a lot to choose from, including some great ones from Apple. I am not familiar with any of them, though, so that part is for someone else to answer.
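One simple way to drive a "pulse" from the meter readings is to map the decibel value (0 at full scale, down to -160 for silence) onto a 0..1 level and turn that into a scale factor. The sketch below is plain C with hypothetical function names; the linear-in-decibels mapping and the -60 dB floor used in the note are assumptions chosen for simplicity, not something AVAudioRecorder prescribes.

```c
/* Maps a meter reading in decibels onto 0..1.
 * Readings at or below floor_db give 0; 0 dB and above give 1. */
float normalized_level(float decibels, float floor_db)
{
    if (floor_db >= 0.0f) return 0.0f; /* invalid floor, treat as silence */
    if (decibels <= floor_db) return 0.0f;
    if (decibels >= 0.0f) return 1.0f;
    return (decibels - floor_db) / (0.0f - floor_db);
}

/* Turns a 0..1 level into a scale factor between 1 and 1 + max_extra. */
float pulse_scale(float level, float max_extra)
{
    return 1.0f + level * max_extra;
}
```

With a floor of -60 dB, a reading of -30 dB gives a level of 0.5, and pulse_scale(0.5, 0.5) gives 1.25. The result could be applied to an image view with CGAffineTransformMakeScale, refreshed from a timer that calls updateMeters before each read.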
Best of luck!
You may try using the extensible gl-data-visualization-view framework to visualize your sound levels.