AVCaptureVideoPreviewLayer is not visible on the screenshot - swift

I have an application that adds live animations and images on top of the preview view of an AVFoundation camera. A "hardware screenshot" (holding the Side button and Volume Up button) captures everything correctly. However, I need a button in the app that takes the screenshot.
All the usual screenshot methods, such as UIGraphicsGetImageFromCurrentImageContext (or view.drawHierarchy()), produce a black rectangle where the video preview is. All other elements and images appear in the screenshot; only the AVCaptureVideoPreviewLayer is missing.
Can I trigger a "hardware screenshot" programmatically? Is there another solution to this problem?

I was in the same position, and researched two separate solutions to this problem.
1. Set up the ViewController as an AVCaptureVideoDataOutputSampleBufferDelegate and sample the video output to take the screenshot.
2. Set up the ViewController as an AVCapturePhotoCaptureDelegate and capture the photo.
The mechanism for setting up the former is described in this question for example: How to take UIImage of AVCaptureVideoPreviewLayer instead of AVCapturePhotoOutput capture
I implemented both to check if there was any difference in the quality of the image (there wasn't).
If all you need is the camera snapshot, then that's it. But it sounds like you need to draw an additional animation on top. For this, I created a container UIView of the same size as the snapshot, added a UIImageView to it with the snapshot and then drew the animation on top. After that you can use UIGraphicsGetImageFromCurrentImageContext on the container.
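For that compositing step, here is a minimal sketch of one way to do it with UIGraphicsImageRenderer (a modern equivalent of UIGraphicsGetImageFromCurrentImageContext); the names snapshot and overlayView are placeholders for your own snapshot image and animation container:

import UIKit

// Draw the camera snapshot first, then the overlay content on top of it,
// and return the composited image at the snapshot's native size.
func composite(snapshot: UIImage, overlayView: UIView) -> UIImage {
    let renderer = UIGraphicsImageRenderer(size: snapshot.size)
    return renderer.image { _ in
        snapshot.draw(in: CGRect(origin: .zero, size: snapshot.size))
        overlayView.drawHierarchy(in: CGRect(origin: .zero, size: snapshot.size),
                                  afterScreenUpdates: true)
    }
}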
As for which of solutions (1) and (2) to use, if you don't need to support different camera orientations in the app, it probably doesn't matter. However, if you need to switch between front and back camera and support different camera orientations, then you need to know the snapshot orientation to apply the animation in the right place, and getting that right turned out to be a total bear with method (1).
The solution I used:
Make the UIViewController conform to AVCapturePhotoCaptureDelegate
Add photo output to the AVCaptureSession
private let session = AVCaptureSession()
private let photoOutput = AVCapturePhotoOutput()
....
// When configuring the session
if self.session.canAddOutput(self.photoOutput) {
    self.session.addOutput(self.photoOutput)
    self.photoOutput.isHighResolutionCaptureEnabled = true
}
Capture snapshot
let settings = AVCapturePhotoSettings()
let previewPixelType = settings.availablePreviewPhotoPixelFormatTypes.first!
let previewFormat = [
    kCVPixelBufferPixelFormatTypeKey as String: previewPixelType,
    kCVPixelBufferWidthKey as String: 160,
    kCVPixelBufferHeightKey as String: 160
]
settings.previewPhotoFormat = previewFormat
photoOutput.capturePhoto(with: settings, delegate: self)
Rotate or flip the snapshot before doing the rest
func photoOutput(_ output: AVCapturePhotoOutput, didFinishProcessingPhoto photo: AVCapturePhoto, error: Error?) {
    guard error == nil else {
        // Handle the error, then leave the method
        return
    }
    if let dataImage = photo.fileDataRepresentation() {
        print(UIImage(data: dataImage)?.size as Any)
        let dataProvider = CGDataProvider(data: dataImage as CFData)
        let cgImageRef: CGImage! = CGImage(jpegDataProviderSource: dataProvider!, decode: nil, shouldInterpolate: true, intent: .defaultIntent)
        // https://developer.apple.com/documentation/uikit/uiimageorientation?language=objc
        let orientation = UIApplication.shared.statusBarOrientation
        var imageOrientation = UIImage.Orientation.right
        switch orientation {
        case .portrait:
            imageOrientation = self.cameraPosition == .back ? UIImage.Orientation.right : UIImage.Orientation.leftMirrored
        case .landscapeRight:
            imageOrientation = self.cameraPosition == .back ? UIImage.Orientation.up : UIImage.Orientation.downMirrored
        case .portraitUpsideDown:
            imageOrientation = self.cameraPosition == .back ? UIImage.Orientation.left : UIImage.Orientation.rightMirrored
        case .landscapeLeft:
            imageOrientation = self.cameraPosition == .back ? UIImage.Orientation.down : UIImage.Orientation.upMirrored
        case .unknown:
            imageOrientation = self.cameraPosition == .back ? UIImage.Orientation.right : UIImage.Orientation.leftMirrored
        @unknown default:
            imageOrientation = self.cameraPosition == .back ? UIImage.Orientation.right : UIImage.Orientation.leftMirrored
        }
        let image = UIImage.init(cgImage: cgImageRef, scale: 1.0, orientation: imageOrientation)
        // Do whatever you need to do with the image
    } else {
        // Handle error
    }
}
If you need to know the size of the image to position the animations you can use the AVCaptureVideoDataOutputSampleBufferDelegate strategy to detect the size of the buffer once.
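If you go that route, here is a minimal sketch of reading the buffer dimensions once, assuming an AVCaptureVideoDataOutput has been added to the session and the view controller is its sample buffer delegate:

import AVFoundation

var didMeasureBufferSize = false

func captureOutput(_ output: AVCaptureOutput,
                   didOutput sampleBuffer: CMSampleBuffer,
                   from connection: AVCaptureConnection) {
    // Only measure the first frame; the size stays the same for the session.
    guard !didMeasureBufferSize,
          let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    didMeasureBufferSize = true
    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)
    print("Camera buffer size: \(width) x \(height)")
}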

Related

Cropping/Compositing An Image With Vision/CoreImage

I am working with the Vision framework in iOS 13 and am trying to achieve the following tasks:
Take an image (in this case, a CIImage) and locate all faces in the image using Vision.
Crop each face into its own CIImage (I'll call this a "face image").
Filter each face image using a CoreImage filter, such as a blur or comic book effect.
Composite the face image back over the original image, thereby creating effects that only apply to the face.
A better example of this would be the end goal of taking a live camera feed from an AVCaptureSession and blurring every face in the video frame, compositing the blurred faces back over the original image for saving.
I almost have this working, save for the fact that there seems to be a coordinates/translation issue. For example, when I test this code and move my face, the "blurred" section moves in the wrong direction (if I turn my face right, the box goes left; if I look up, the box goes down). While I think this may have something to do with mirroring on the front-facing camera, I can't seem to figure out what I should try next:
func drawFaceBox(bufferImage: CIImage, observations: [VNFaceObservation]) -> CVPixelBuffer? {
    // The filter
    let blur = CIFilter(name: "CICrystallize")
    // The unfiltered image, prepared for filtering
    var filteredImage = bufferImage
    // Find and crop each face
    if !observations.isEmpty {
        for face in observations {
            let faceRect = VNImageRectForNormalizedRect(face.boundingBox, Int(bufferImage.extent.size.width), Int(bufferImage.extent.size.height))
            let croppedFace = bufferImage.cropped(to: faceRect)
            blur?.setValue(croppedFace, forKey: kCIInputImageKey)
            blur?.setValue(10.0, forKey: kCIInputRadiusKey)
            if let blurred = blur?.value(forKey: kCIOutputImageKey) as? CIImage {
                compositorCIFilter?.setValue(blurred, forKey: kCIInputImageKey)
                compositorCIFilter?.setValue(filteredImage, forKey: kCIInputBackgroundImageKey)
                if let output = compositorCIFilter?.value(forKey: kCIOutputImageKey) as? CIImage {
                    filteredImage = output
                }
            }
        }
    }
    // Convert image to CVPixelBuffer and return. This part works fine.
}
Any thoughts on how I can composite the blurred face image(s) back to their original position with accuracy? Or any other approach to only filter part of the original CIImage to avoid this issue altogether/save processing? Thanks!
I believe this issue stems from an orientation problem earlier in the pipeline (specifically, during the output of the sample buffers from the camera, which is where the Vision task was instantiated). I have updated my didOutputSampleBuffer code like so:
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    ...
    // Setup the current device orientation
    let curDeviceOrientation = UIDevice.current.orientation
    // Handle the image property orientation
    //let orientation = self.exifOrientation(from: curDeviceOrientation)
    // Setup the image request handler
    //let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: CGImagePropertyOrientation(rawValue: UInt32(1))!)
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    // Setup the completion handler
    let completion: VNRequestCompletionHandler = { request, error in
        let observations = request.results as! [VNFaceObservation]
        // Draw faces
        DispatchQueue.main.async {
            // HANDLE FACES
            self.drawFaceBoxes(for: observations)
        }
    }
    // Setup the image request
    let request = VNDetectFaceRectanglesRequest(completionHandler: completion)
    // Handle the request
    do {
        try handler.perform([request])
    } catch {
        print(error)
    }
}
As noted, I have commented out the let orientation = ... and the first let handler = ..., which was using the orientation. By removing the reference to the orientation, I seem to have removed any issue with orientation in the Vision calculations.

Implementing AVVideoCompositing causes video rotation problems

I am using Apple's example https://developer.apple.com/library/ios/samplecode/AVCustomEdit/Introduction/Intro.html and have some issues with video transformation.
If a source asset has a preferredTransform other than identity, the output video has incorrectly rotated frames. The problem goes away if AVMutableVideoComposition has no value in its customVideoCompositorClass property and the AVMutableVideoCompositionLayerInstruction's transform is set from asset.preferredTransform. But because I use a custom video compositor adopting the AVVideoCompositing protocol, I can't use the standard video compositing instructions.
How can I pre-transform the input asset tracks before their CVPixelBuffers are passed into the Metal shaders? Or is there another way to fix this?
A fragment of the original code:
func buildCompositionObjectsForPlayback(_ forPlayback: Bool, overwriteExistingObjects: Bool) {
    // Proceed only if the composition objects have not already been created.
    if self.composition != nil && !overwriteExistingObjects { return }
    if self.videoComposition != nil && !overwriteExistingObjects { return }
    guard !clips.isEmpty else { return }
    // Use the naturalSize of the first video track.
    let videoTracks = clips[0].tracks(withMediaType: AVMediaType.video)
    let videoSize = videoTracks[0].naturalSize
    let composition = AVMutableComposition()
    composition.naturalSize = videoSize
    /*
     With transitions:
     Place clips into alternating video & audio tracks in composition, overlapped by transitionDuration.
     Set up the video composition to cycle between "pass through A", "transition from A to B", "pass through B".
     */
    let videoComposition = AVMutableVideoComposition()
    if self.transitionType == TransitionType.diagonalWipe.rawValue {
        videoComposition.customVideoCompositorClass = APLDiagonalWipeCompositor.self
    } else {
        videoComposition.customVideoCompositorClass = APLCrossDissolveCompositor.self
    }
    // Every videoComposition needs these properties to be set:
    videoComposition.frameDuration = CMTimeMake(1, 30) // 30 fps.
    videoComposition.renderSize = videoSize
    buildTransitionComposition(composition, andVideoComposition: videoComposition)
    self.composition = composition
    self.videoComposition = videoComposition
}
UPDATE:
I made a workaround for the transformation like this:
private func makeTransformedPixelBuffer(fromBuffer buffer: CVPixelBuffer, withTransform transform: CGAffineTransform) -> CVPixelBuffer? {
    guard let newBuffer = renderContext?.newPixelBuffer() else {
        return nil
    }
    // Correct transformation example I took from https://stackoverflow.com/questions/29967700/coreimage-coordinate-system
    var preferredTransform = transform
    preferredTransform.b *= -1
    preferredTransform.c *= -1
    var transformedImage = CIImage(cvPixelBuffer: buffer).transformed(by: preferredTransform)
    preferredTransform = CGAffineTransform(translationX: -transformedImage.extent.origin.x, y: -transformedImage.extent.origin.y)
    transformedImage = transformedImage.transformed(by: preferredTransform)
    let filterContext = CIContext(mtlDevice: MTLCreateSystemDefaultDevice()!)
    filterContext.render(transformedImage, to: newBuffer)
    return newBuffer
}
But I am wondering whether there is a more memory-efficient way that avoids creating new pixel buffers.
How can I pre-transform the input asset tracks before their CVPixelBuffers are passed into the Metal shaders?
The best way to achieve maximum performance is to transform your video frame directly in the shader. You just need to add a rotation matrix in your vertex shader.
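A Swift-side sketch of that idea, assuming you already encode each source frame with your own render pipeline; deriving the matrix from the track's preferredTransform and the buffer index used here are assumptions, not part of the AVCustomEdit sample:

import simd
import CoreGraphics
import Metal

// Convert the track's 2D preferredTransform into a 4x4 matrix the vertex
// shader can multiply vertex positions by, so the quad is rotated instead
// of transforming pixels in an extra Core Image pass.
func makeVertexTransform(from t: CGAffineTransform) -> simd_float4x4 {
    return simd_float4x4([
        SIMD4<Float>(Float(t.a),  Float(t.b),  0, 0),
        SIMD4<Float>(Float(t.c),  Float(t.d),  0, 0),
        SIMD4<Float>(0,           0,           1, 0),
        SIMD4<Float>(Float(t.tx), Float(t.ty), 0, 1)
    ])
}

// In the render pass that draws the source frame (encoder is your
// MTLRenderCommandEncoder; buffer index 1 is an arbitrary choice):
// var matrix = makeVertexTransform(from: track.preferredTransform)
// encoder.setVertexBytes(&matrix, length: MemoryLayout<simd_float4x4>.stride, index: 1)
// ...and multiply each vertex position by this matrix in the vertex shader.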

AVCaptureVideoPreviewLayer does not detect objects in two ranges of the screen

I downloaded Apple's sample project about recognizing objects in live capture.
When I tried the app, I noticed that if I place the object to recognize at the top or the bottom of the camera view, the app doesn't recognize it:
In this first image the banana is in the center of the camera view and the app is able to recognize it.
[image: object in center]
In these two images the banana is near the camera view's border and the app is not able to recognize it.
[image: object at top]
[image: object at bottom]
This is how the session and previewLayer are set up:
func setupAVCapture() {
    var deviceInput: AVCaptureDeviceInput!
    // Select a video device, make an input
    let videoDevice = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInWideAngleCamera], mediaType: .video, position: .back).devices.first
    do {
        deviceInput = try AVCaptureDeviceInput(device: videoDevice!)
    } catch {
        print("Could not create video device input: \(error)")
        return
    }
    session.beginConfiguration()
    session.sessionPreset = .vga640x480 // Model image size is smaller.
    // Add a video input
    guard session.canAddInput(deviceInput) else {
        print("Could not add video device input to the session")
        session.commitConfiguration()
        return
    }
    session.addInput(deviceInput)
    if session.canAddOutput(videoDataOutput) {
        session.addOutput(videoDataOutput)
        // Add a video data output
        videoDataOutput.alwaysDiscardsLateVideoFrames = true
        videoDataOutput.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String: Int(kCVPixelFormatType_420YpCbCr8BiPlanarFullRange)]
        videoDataOutput.setSampleBufferDelegate(self, queue: videoDataOutputQueue)
    } else {
        print("Could not add video data output to the session")
        session.commitConfiguration()
        return
    }
    let captureConnection = videoDataOutput.connection(with: .video)
    // Always process the frames
    captureConnection?.isEnabled = true
    do {
        try videoDevice!.lockForConfiguration()
        let dimensions = CMVideoFormatDescriptionGetDimensions((videoDevice?.activeFormat.formatDescription)!)
        bufferSize.width = CGFloat(dimensions.width)
        bufferSize.height = CGFloat(dimensions.height)
        videoDevice!.unlockForConfiguration()
    } catch {
        print(error)
    }
    session.commitConfiguration()
    previewLayer = AVCaptureVideoPreviewLayer(session: session)
    previewLayer.videoGravity = AVLayerVideoGravity.resizeAspectFill
    rootLayer = previewView.layer
    previewLayer.frame = rootLayer.bounds
    rootLayer.addSublayer(previewLayer)
}
You can download the project here.
I am wondering whether this is normal or not.
Is there any solution to fix it?
Does it take a square crop of the frame to process with Core ML, so that the top and bottom regions are not included?
Any hints? Thanks
That's probably because the imageCropAndScaleOption is set to centerCrop.
The Core ML model expects a square image, but the video frames are not square. This can be fixed by setting the imageCropAndScaleOption on the VNCoreMLRequest to .scaleFill, so the whole frame is used (squashed into a square). However, the results may not be as good as with center crop (it depends on how the model was originally trained).
See also VNImageCropAndScaleOption in the Apple docs.
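A minimal sketch of where that option goes, assuming a generated Core ML model class (here called MyObjectDetector as a placeholder):

import Vision
import CoreML

func makeDetectionRequest() throws -> VNCoreMLRequest {
    // "MyObjectDetector" stands in for whatever model class Xcode generated.
    let mlModel = try MyObjectDetector(configuration: MLModelConfiguration()).model
    let visionModel = try VNCoreMLModel(for: mlModel)
    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        // Handle the VNRecognizedObjectObservation results here.
    }
    // Use the whole (non-square) frame instead of cropping to the center square.
    request.imageCropAndScaleOption = .scaleFill
    return request
}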

Why is an iPhone XS getting worse CPU performance when using the camera live than an iPhone 6S Plus?

I'm using live camera output to update a CIImage on an MTKView. My main issue is a large, negative performance difference: the older iPhone gets better CPU performance than the newer one, even though every setting I've come across is the same on both.
This is a lengthy post, but I decided to include these details since they could be important to the cause of this problem. Please let me know what else I can include.
Below, I have my captureOutput function with two debug bools that I can turn on and off while running. I used this to try to determine the cause of my issue.
applyLiveFilter - bool whether or not to manipulate the CIImage with a CIFilter.
updateMetalView - bool whether or not to update the MTKView's CIImage.
// live output from camera
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    /*
     Create CIImage from camera.
     Here I save a few percent of CPU by using a function
     to convert a sampleBuffer to a Metal texture, but
     whether I use this or the commented out code
     (without captureOutputMTLOptions) does not have
     significant impact.
     */
    guard let texture: MTLTexture = convertToMTLTexture(sampleBuffer: sampleBuffer) else {
        return
    }
    var cameraImage: CIImage = CIImage(mtlTexture: texture, options: captureOutputMTLOptions)!
    var transform: CGAffineTransform = .identity
    transform = transform.scaledBy(x: 1, y: -1)
    transform = transform.translatedBy(x: 0, y: -cameraImage.extent.height)
    cameraImage = cameraImage.transformed(by: transform)
    /*
     // old non-Metal way of getting the ciimage from the cvPixelBuffer
     guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
         return
     }
     var cameraImage: CIImage = CIImage(cvPixelBuffer: pixelBuffer)
     */
    var orientation = UIImage.Orientation.right
    if isFrontCamera {
        orientation = UIImage.Orientation.leftMirrored
    }
    // apply filter to camera image
    if debug_applyLiveFilter {
        cameraImage = self.applyFilterAndReturnImage(ciImage: cameraImage, orientation: orientation, currentCameraRes: currentCameraRes!)
    }
    DispatchQueue.main.async {
        if debug_updateMetalView {
            self.MTLCaptureView!.image = cameraImage
        }
    }
}
Below is a chart of results between both phones toggling the different combinations of bools discussed above:
Even with no filters applied and the Metal view's CIImage not updating, the iPhone XS's CPU usage is 2% higher than the iPhone 6S Plus's, which isn't significant overhead, but it makes me suspect that the camera captures differently on the two devices.
My AVCaptureSession's preset is set identically on both phones (AVCaptureSession.Preset.hd1280x720).
The CIImage created from captureOutput has the same size (extent) on both phones.
Are there any AVCaptureDevice settings, including activeFormat properties, that I need to set manually so the two devices behave the same?
The settings I have now are:
if let captureDevice = AVCaptureDevice.default(for: AVMediaType.video) {
    do {
        try captureDevice.lockForConfiguration()
        captureDevice.isSubjectAreaChangeMonitoringEnabled = true
        captureDevice.focusMode = AVCaptureDevice.FocusMode.continuousAutoFocus
        captureDevice.exposureMode = AVCaptureDevice.ExposureMode.continuousAutoExposure
        captureDevice.unlockForConfiguration()
    } catch {
        // Handle errors here
        print("There was an error focusing the device's camera")
    }
}
My MTKView is based on code written by Simon Gladman, with some edits for performance and to scale the render before it is scaled up to the width of the screen using Core Animation, as suggested by Apple.
class MetalImageView: MTKView {
    let colorSpace = CGColorSpaceCreateDeviceRGB()
    var textureCache: CVMetalTextureCache?
    var sourceTexture: MTLTexture!

    lazy var commandQueue: MTLCommandQueue = {
        [unowned self] in
        return self.device!.makeCommandQueue()
    }()!

    lazy var ciContext: CIContext = {
        [unowned self] in
        return CIContext(mtlDevice: self.device!)
    }()

    override init(frame frameRect: CGRect, device: MTLDevice?) {
        super.init(frame: frameRect,
                   device: device ?? MTLCreateSystemDefaultDevice())
        if super.device == nil {
            fatalError("Device doesn't support Metal")
        }
        CVMetalTextureCacheCreate(kCFAllocatorDefault, nil, self.device!, nil, &textureCache)
        framebufferOnly = false
        enableSetNeedsDisplay = true
        isPaused = true
        preferredFramesPerSecond = 30
    }

    required init(coder: NSCoder) {
        fatalError("init(coder:) has not been implemented")
    }

    // The image to display
    var image: CIImage? {
        didSet {
            setNeedsDisplay()
        }
    }

    override func draw(_ rect: CGRect) {
        guard var image = image,
              let targetTexture: MTLTexture = currentDrawable?.texture else {
            return
        }
        let commandBuffer = commandQueue.makeCommandBuffer()
        let customDrawableSize: CGSize = drawableSize
        let bounds = CGRect(origin: CGPoint.zero, size: customDrawableSize)
        let originX = image.extent.origin.x
        let originY = image.extent.origin.y
        let scaleX = customDrawableSize.width / image.extent.width
        let scaleY = customDrawableSize.height / image.extent.height
        let scale = min(scaleX * IVScaleFactor, scaleY * IVScaleFactor)
        image = image
            .transformed(by: CGAffineTransform(translationX: -originX, y: -originY))
            .transformed(by: CGAffineTransform(scaleX: scale, y: scale))
        ciContext.render(image,
                         to: targetTexture,
                         commandBuffer: commandBuffer,
                         bounds: bounds,
                         colorSpace: colorSpace)
        commandBuffer?.present(currentDrawable!)
        commandBuffer?.commit()
    }
}
My AVCaptureSession (captureSession) and AVCaptureVideoDataOutput (videoOutput) are setup below:
func setupCameraAndMic() {
    let backCamera = AVCaptureDevice.default(for: AVMediaType.video)
    var error: NSError?
    var videoInput: AVCaptureDeviceInput!
    do {
        videoInput = try AVCaptureDeviceInput(device: backCamera!)
    } catch let error1 as NSError {
        error = error1
        videoInput = nil
        print(error!.localizedDescription)
    }
    if error == nil &&
        captureSession!.canAddInput(videoInput) {
        guard CVMetalTextureCacheCreate(kCFAllocatorDefault, nil, MetalDevice, nil, &textureCache) == kCVReturnSuccess else {
            print("Error: could not create a texture cache")
            return
        }
        captureSession!.addInput(videoInput)
        setDeviceFrameRateForCurrentFilter(device: backCamera)
        stillImageOutput = AVCapturePhotoOutput()
        if captureSession!.canAddOutput(stillImageOutput!) {
            captureSession!.addOutput(stillImageOutput!)
            let q = DispatchQueue(label: "sample buffer delegate", qos: .default)
            videoOutput.setSampleBufferDelegate(self, queue: q)
            videoOutput.videoSettings = [
                kCVPixelBufferPixelFormatTypeKey as AnyHashable as! String: NSNumber(value: kCVPixelFormatType_32BGRA),
                kCVPixelBufferMetalCompatibilityKey as String: true
            ]
            videoOutput.alwaysDiscardsLateVideoFrames = true
            if captureSession!.canAddOutput(videoOutput) {
                captureSession!.addOutput(videoOutput)
            }
            captureSession!.startRunning()
        }
    }
    setDefaultFocusAndExposure()
}
The video and mic are recorded on two separate streams. Details on the microphone and recording video have been left out since my focus is performance of live camera output.
UPDATE - I have a simplified test project on GitHub that makes it a lot easier to test the problem I'm having: https://github.com/PunchyBass/Live-Filter-test-project
Off the top of my head, you are not comparing like with like. Even though you are running a 2.49 GHz A12 against a 1.85 GHz A9, the differences between the cameras are also huge. Even if you use them with the same parameters, there are several features of the XS's camera that require more CPU resources (dual camera, stabilization, Smart HDR, etc.).
Sorry for the lack of sources; I tried to find metrics on the CPU cost of those features, but I couldn't find any. Unfortunately for your needs, that information isn't relevant for marketing when they are selling it as the best camera ever in a smartphone.
They are selling it as the best processor as well; we don't know what would happen using the XS camera with an A9 processor, it would probably crash, and we will never know...
PS: Are your metrics for the whole processor or for the core in use? For the whole processor you also need to consider other tasks the device may be executing; for a single core, it's 21% of 200% versus 39% of 600%.

Taking a snapshot of UIImageView instead of main View

In my app's view I have a button for taking a snapshot, using the code below:
open func takeScreenshot(_ shouldSave: Bool = true) -> UIImage? {
    var screenshotImage: UIImage?
    let layer = UIApplication.shared.keyWindow!.layer
    let scale = UIScreen.main.scale
    UIGraphicsBeginImageContextWithOptions(layer.frame.size, false, scale)
    guard let context = UIGraphicsGetCurrentContext() else { return nil }
    layer.render(in: context)
    screenshotImage = UIGraphicsGetImageFromCurrentImageContext()
    UIGraphicsEndImageContext()
    if let image = screenshotImage, shouldSave {
        UIImageWriteToSavedPhotosAlbum(image, nil, nil, nil)
    }
    return screenshotImage
}
But the same view also contains a UIImageView, and I want to take a snapshot of that UIImageView instead. How should I change the code? When I take a snapshot with this code, it saves only a black screen with the play/pause button and the snapshot button.
Instead of using
let layer = UIApplication.shared.keyWindow!.layer
just use your image view's layer:
let layer = YourImageView.layer
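Put together, the function could look like this (a minimal sketch; the image view is passed in as a parameter, and an AVCaptureVideoPreviewLayer inside it would still render black, for the reasons discussed in the first answer above):

open func takeScreenshot(of imageView: UIImageView, shouldSave: Bool = true) -> UIImage? {
    let layer = imageView.layer
    let scale = UIScreen.main.scale
    UIGraphicsBeginImageContextWithOptions(layer.frame.size, false, scale)
    guard let context = UIGraphicsGetCurrentContext() else { return nil }
    // Render only the image view's layer tree, not the whole window.
    layer.render(in: context)
    let screenshotImage = UIGraphicsGetImageFromCurrentImageContext()
    UIGraphicsEndImageContext()
    if let image = screenshotImage, shouldSave {
        UIImageWriteToSavedPhotosAlbum(image, nil, nil, nil)
    }
    return screenshotImage
}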