Keep the last minute of video with AVCaptureVideoDataOutput - Swift

I am grabbing frames this way with Swift 3 and AVCaptureSession.
What I want to do is store frames in memory. Easy, right? I just have to keep my pixel buffers in memory, or write them to a file...
But I want to store about one minute of frames and then drop older frames as new ones arrive. When the user hits a button, I want to keep the last minute of frames in memory. I have tried writing a doubly linked list, but I do not think that is good for performance. Is there a better way to do this with AVFoundation tools?
Thanks
let videoOutput = AVCaptureVideoDataOutput()
videoOutput.videoSettings = [kCVPixelBufferPixelFormatTypeKey as NSString: kCVPixelFormatType_32BGRA]
videoOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: "my queue"))
if session.canAddOutput(videoOutput) {
    session.addOutput(videoOutput)
}
// AVCaptureVideoDataOutputSampleBufferDelegate
func captureOutput(_ captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, from connection: AVCaptureConnection!) {
    NSLog("frame")
    let pixelBuffer: CVPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
}
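One possible direction, as a minimal sketch rather than a tested solution: keep a time-bounded ring buffer of deep-copied pixel buffers and evict anything older than the window. The frames must be copied because the capture output recycles a small pool of its own buffers. Keep in mind that a full minute of uncompressed 32BGRA frames at 1080p and 30 fps is roughly 15 GB, so in practice you would shrink or compress the frames, or write rolling segments to disk with AVAssetWriter instead. The names FrameRingBuffer and deepCopy below are purely illustrative.

import AVFoundation
import CoreVideo
import Foundation

// Sketch only: a time-bounded buffer of deep-copied frames. Assumes a non-planar
// pixel format such as 32BGRA, as configured above. Call append(_:) from the capture queue.
final class FrameRingBuffer {
    private var frames: [(time: CMTime, pixelBuffer: CVPixelBuffer)] = []
    private let window: CMTime

    init(seconds: Double) {
        window = CMTime(seconds: seconds, preferredTimescale: 600)
    }

    func append(_ sampleBuffer: CMSampleBuffer) {
        guard let source = CMSampleBufferGetImageBuffer(sampleBuffer),
              let copy = FrameRingBuffer.deepCopy(source) else { return }
        let time = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
        frames.append((time, copy))
        // Drop everything older than `window` seconds before the newest frame.
        let cutoff = CMTimeSubtract(time, window)
        frames.removeAll { CMTimeCompare($0.time, cutoff) < 0 }
    }

    // Copies the pixel data into a buffer we own, so the capture pool buffer is released.
    private static func deepCopy(_ source: CVPixelBuffer) -> CVPixelBuffer? {
        var maybeCopy: CVPixelBuffer?
        CVPixelBufferCreate(nil,
                            CVPixelBufferGetWidth(source),
                            CVPixelBufferGetHeight(source),
                            CVPixelBufferGetPixelFormatType(source),
                            nil,
                            &maybeCopy)
        guard let destination = maybeCopy else { return nil }
        CVPixelBufferLockBaseAddress(source, .readOnly)
        CVPixelBufferLockBaseAddress(destination, [])
        defer {
            CVPixelBufferUnlockBaseAddress(destination, [])
            CVPixelBufferUnlockBaseAddress(source, .readOnly)
        }
        guard let src = CVPixelBufferGetBaseAddress(source),
              let dst = CVPixelBufferGetBaseAddress(destination) else { return nil }
        let srcBytesPerRow = CVPixelBufferGetBytesPerRow(source)
        let dstBytesPerRow = CVPixelBufferGetBytesPerRow(destination)
        // Copy row by row because the two buffers may use different row alignment.
        for row in 0..<CVPixelBufferGetHeight(source) {
            memcpy(dst + row * dstBytesPerRow, src + row * srcBytesPerRow, min(srcBytesPerRow, dstBytesPerRow))
        }
        return destination
    }
}

When the user hits the button, you would snapshot the stored frames and hand the copies to an AVAssetWriter.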

Related

Record camera output before Vision recognises event

My app recognises an event using Vision, and it uses CMSampleBuffer to do so.
After the event, I am already recording the video successfully using AVAssetWriter.
Now I want to capture the full motion, and thus record the 1-2 seconds before the event occurred as well.
I tried pushing the CMSampleBuffer into a ring buffer, but that starves the camera of buffers.
func captureOutput(_ output: AVCaptureOutput,
                   didOutput sampleBuffer: CMSampleBuffer,
                   from connection: AVCaptureConnection) {
    // sends that to detectBall
}

/// Gets called by the camera every time there is a new buffer available
func detectBall(inBuffer buffer: CMSampleBuffer,
                ballDetectionRequest: VNCoreMLRequest,
                orientation: CGImagePropertyOrientation,
                frame: NormalizedPoint,
                updatingRingBuffer: PassthroughSubject<AppEnvironment.AVState.RingBufferItem, Never>) throws {
    // I tried to convert it into a CVPixelBuffer, but that is a shallow copy as well, so it also starves the camera
    let imageBuffer: CVPixelBuffer = CMSampleBufferGetImageBuffer(buffer)!
    /// rotated 90 degrees because of the camera's native landscape orientation
    let visionHandler = VNImageRequestHandler(ciImage: croppedImage, options: [:])
    try visionHandler.perform([ballDetectionRequest])
    if let results = ballDetectionRequest.results as? [VNClassificationObservation] {
        // Filter out classification results with low confidence
        let filteredResults = results.filter { $0.confidence > 0.9 }
        guard let topResult = results.first,
              topResult.confidence > 0.9 else { return }
        // print("it's a: \(topResult.identifier)")
        // print("copy buffer")
        updatingRingBuffer.send(AppEnvironment.AVState.RingBufferItem(
            /// HERE IS THE PROBLEM: AS SOON AS I SEND IT SOMEWHERE ELSE THE CAMERA IS STARVED
            buffer: imageBuffer,
            ball: topResult.identifier == "ball"))
    }
}
How can I continuously store these 1-2 seconds of video without writing them to disk, and then prepend them to the video file?
Thanks!
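One idea, as a sketch only and not the poster's code: the starvation happens because the ring buffer retains CVPixelBuffers that belong to the capture output's small, fixed pool. If you copy the frame data out of the pool, for example as compressed JPEG data via CIContext, the pool buffer is released immediately and one or two seconds of frames stays small in memory. The helper name compactCopy below is hypothetical.

import AVFoundation
import CoreImage

// Sketch: copy the frame out of the capture pool as compressed data instead of
// retaining the camera's own CVPixelBuffer.
let jpegContext = CIContext()

func compactCopy(of sampleBuffer: CMSampleBuffer) -> (time: CMTime, jpeg: Data)? {
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return nil }
    let time = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
    let image = CIImage(cvPixelBuffer: pixelBuffer)
    guard let data = jpegContext.jpegRepresentation(of: image,
                                                    colorSpace: CGColorSpaceCreateDeviceRGB(),
                                                    options: [:]) else { return nil }
    // The camera's pixel buffer is released when this function returns;
    // only the compressed copy lives in the ring buffer.
    return (time, data)
}

When the event fires, the buffered frames can be decoded back into pixel buffers and appended to the asset writer with their original presentation times, before the live frames.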

Recording of a Metal view is slow due to the texture.getBytes function - Swift

I am using this post for recording a custom Metal view, but I am experiencing some issues. When I start recording, I go from 60 fps to ~20 fps on an iPhone 12 Pro Max. After profiling, the function that is slowing everything down is texture.getBytes, since it copies the buffer from the GPU to the CPU.
Another issue, though I am not sure whether it is a consequence of this, is that the video and audio are out of sync. I am not sure whether I should go down the semaphore route to solve this or whether there is another workaround.
In my case, the texture is as big as the screen, since I create it from the camera stream and then process it through a couple of CIFilters. I am not sure if the issue is that it is too big for getBytes to handle in real time.
If I have to define priorities, my #1 priority is to fix the audio/video sync. Any thoughts would be super helpful.
Here is the code:
import AVFoundation
import Metal
import QuartzCore

class MetalVideoRecorder {
    var isRecording = false
    var recordingStartTime = TimeInterval(0)

    private var assetWriter: AVAssetWriter
    private var assetWriterVideoInput: AVAssetWriterInput
    private var assetWriterPixelBufferInput: AVAssetWriterInputPixelBufferAdaptor

    init?(outputURL url: URL, size: CGSize) {
        do {
            assetWriter = try AVAssetWriter(outputURL: url, fileType: AVFileType.m4v)
        } catch {
            return nil
        }

        let outputSettings: [String: Any] = [AVVideoCodecKey: AVVideoCodecType.h264,
                                             AVVideoWidthKey: size.width,
                                             AVVideoHeightKey: size.height]

        assetWriterVideoInput = AVAssetWriterInput(mediaType: AVMediaType.video, outputSettings: outputSettings)
        assetWriterVideoInput.expectsMediaDataInRealTime = true

        let sourcePixelBufferAttributes: [String: Any] = [
            kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA,
            kCVPixelBufferWidthKey as String: size.width,
            kCVPixelBufferHeightKey as String: size.height]

        assetWriterPixelBufferInput = AVAssetWriterInputPixelBufferAdaptor(assetWriterInput: assetWriterVideoInput,
                                                                           sourcePixelBufferAttributes: sourcePixelBufferAttributes)

        assetWriter.add(assetWriterVideoInput)
    }

    func startRecording() {
        assetWriter.startWriting()
        assetWriter.startSession(atSourceTime: CMTime.zero)

        recordingStartTime = CACurrentMediaTime()
        isRecording = true
    }

    func endRecording(_ completionHandler: @escaping () -> ()) {
        isRecording = false

        assetWriterVideoInput.markAsFinished()
        assetWriter.finishWriting(completionHandler: completionHandler)
    }

    func writeFrame(forTexture texture: MTLTexture) {
        if !isRecording {
            return
        }

        while !assetWriterVideoInput.isReadyForMoreMediaData {}

        guard let pixelBufferPool = assetWriterPixelBufferInput.pixelBufferPool else {
            print("Pixel buffer asset writer input did not have a pixel buffer pool available; cannot retrieve frame")
            return
        }

        var maybePixelBuffer: CVPixelBuffer? = nil
        let status = CVPixelBufferPoolCreatePixelBuffer(nil, pixelBufferPool, &maybePixelBuffer)
        if status != kCVReturnSuccess {
            print("Could not get pixel buffer from asset writer input; dropping frame...")
            return
        }

        guard let pixelBuffer = maybePixelBuffer else { return }

        CVPixelBufferLockBaseAddress(pixelBuffer, [])
        let pixelBufferBytes = CVPixelBufferGetBaseAddress(pixelBuffer)!

        // Use the bytes per row value from the pixel buffer since its stride may be rounded up to be 16-byte aligned
        let bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
        let region = MTLRegionMake2D(0, 0, texture.width, texture.height)

        texture.getBytes(pixelBufferBytes, bytesPerRow: bytesPerRow, from: region, mipmapLevel: 0)

        let frameTime = CACurrentMediaTime() - recordingStartTime
        let presentationTime = CMTimeMakeWithSeconds(frameTime, preferredTimescale: 240)
        assetWriterPixelBufferInput.append(pixelBuffer, withPresentationTime: presentationTime)

        CVPixelBufferUnlockBaseAddress(pixelBuffer, [])
    }
}
Unlike OpenGL, Metal doesn't have the concept of a default framebuffer. Instead, it uses a technique called a swap chain. A swap chain is a collection of buffers that are used for displaying frames to the user. Each time an application presents a new frame for display, the first buffer in the swap chain takes the place of the displayed buffer.
When a command queue schedules a command buffer for execution, the drawable tracks all render or write requests on itself in that command buffer. The operating system doesn't present the drawable onscreen until the commands have finished executing. By asking the command buffer to present the drawable, you guarantee that presentation happens after the command queue has scheduled this command buffer. Don't wait for the command buffer to finish executing before registering the drawable's presentation.
The layer reuses a drawable only if it isn't onscreen and there are no strong references to it. Drawables exist within a limited and reusable resource pool, and a drawable may or may not be available when you request one. If none are available, Core Animation blocks your calling thread until a new drawable becomes available, usually at the next display refresh interval.
In your case, the frame recorder keeps a reference to your drawable for too long, which is what causes the frame drops. To avoid this, you should implement a triple-buffering model.
Adding a third dynamic data buffer is the ideal solution when considering processor idle time, memory overhead, and frame latency.
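For reference, a minimal sketch of that triple-buffering pattern using a DispatchSemaphore, assuming a standard Metal draw loop. The names maxFramesInFlight and inFlightSemaphore are illustrative, not from the post above.

import Metal
import QuartzCore
import Dispatch

// Allow the CPU to run at most three frames ahead of the GPU.
let maxFramesInFlight = 3
let inFlightSemaphore = DispatchSemaphore(value: maxFramesInFlight)

func draw(commandQueue: MTLCommandQueue, drawable: CAMetalDrawable) {
    // Block here if the GPU is still working on `maxFramesInFlight` earlier frames.
    inFlightSemaphore.wait()

    guard let commandBuffer = commandQueue.makeCommandBuffer() else {
        inFlightSemaphore.signal()
        return
    }
    // ... encode the render passes, and copy the texture needed for recording here ...

    commandBuffer.addCompletedHandler { _ in
        // Release the slot once the GPU is done with this frame's resources,
        // including the drawable's texture.
        inFlightSemaphore.signal()
    }
    commandBuffer.present(drawable)
    commandBuffer.commit()
}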
I have encountered the same problem, and I'd like to know whether you have solved it.
Here is what I know so far.
Everything is being done on the main thread. You can create another serial queue to do the writing and finishWriting asynchronously.
My iPhone XS Max can record screen-size video at 60 FPS.
You can check this repo; it is a Swift version of Apple's sample that uses AVAssetWriter, and it will show you how to sync your video and audio.
RosyWriter
getBytes might have a performance issue on A14 devices. With the same code running on an iPhone 12 Pro Max, the output video is laggy and unusable.
You can check this.
Developer Forums
I did not fully understand how to implement @HamidYusifli's proposed solution, so I focused on:
Optimizing the rest of the Metal code (I am doing some real-time image processing)
Fixing the out-of-sync video and audio via AVCaptureSynchronizedData
With this new implementation my code still consumes quite a lot of CPU (106% on an iPhone 12 Plus) and runs at ~20 fps, but it feels smooth to the user (there is no out-of-sync).
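For anyone fixing the sync the same way, here is a minimal sketch of the synchronized-data approach with AVCaptureDataOutputSynchronizer, assuming videoDataOutput and audioDataOutput are already added to the same AVCaptureSession. This is a sketch of the API shape, not the poster's actual code.

import AVFoundation

final class SyncedCaptureDelegate: NSObject, AVCaptureDataOutputSynchronizerDelegate {
    let videoDataOutput = AVCaptureVideoDataOutput()
    let audioDataOutput = AVCaptureAudioDataOutput()
    private var synchronizer: AVCaptureDataOutputSynchronizer?
    private let outputQueue = DispatchQueue(label: "synchronized.output.queue")

    // Call after both outputs have been added to the AVCaptureSession.
    func startSynchronizing() {
        synchronizer = AVCaptureDataOutputSynchronizer(dataOutputs: [videoDataOutput, audioDataOutput])
        synchronizer?.setDelegate(self, queue: outputQueue)
    }

    func dataOutputSynchronizer(_ synchronizer: AVCaptureDataOutputSynchronizer,
                                didOutput synchronizedDataCollection: AVCaptureSynchronizedDataCollection) {
        // Video and audio that belong to the same moment arrive in a single callback.
        if let videoData = synchronizedDataCollection.synchronizedData(for: videoDataOutput)
            as? AVCaptureSynchronizedSampleBufferData, !videoData.sampleBufferWasDropped {
            let videoSampleBuffer = videoData.sampleBuffer
            // append videoSampleBuffer to the asset writer's video input here
            _ = videoSampleBuffer
        }
        if let audioData = synchronizedDataCollection.synchronizedData(for: audioDataOutput)
            as? AVCaptureSynchronizedSampleBufferData, !audioData.sampleBufferWasDropped {
            let audioSampleBuffer = audioData.sampleBuffer
            // append audioSampleBuffer to the asset writer's audio input here
            _ = audioSampleBuffer
        }
    }
}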

captureOutput stops being called after switching CIFilters

I use this function to change the CIFilter on my camera preview. It works as it should, but somehow, after switching several filters, captureOutput stops being called and the preview is stuck on the last image captured. It does not return on my "guard let filter" statement. The app does not crash; when I close the camera and reopen it, it works again.
How can I prevent that behaviour?
func captureOutput(captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, fromConnection connection: AVCaptureConnection!) {
    guard let filter = Filters[FilterNames[currentFilter]] else {
        return
    }
    let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
    let cameraImage = CIImage(CVPixelBuffer: pixelBuffer!)
    filter!.setValue(cameraImage, forKey: kCIInputImageKey)
    let filteredImage = UIImage(CIImage: filter!.valueForKey(kCIOutputImageKey) as! CIImage!)
    dispatch_async(dispatch_get_main_queue()) {
        self.imageView.image = filteredImage
    }
}
I guess the system can't keep up with the rendering of the images. UIImageView is not meant to display new images at 30 frames per second while also adding the filtering on top of that.
A much more efficient way would be to render directly into an MTKView. I encourage you to check out the AVCamFilter example project to see how this can be done.
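A rough sketch of that MTKView approach, not taken from AVCamFilter itself: render the filtered CIImage into the view's drawable with a Metal-backed CIContext instead of building a UIImage on the main queue. The class and property names here are illustrative.

import MetalKit
import CoreImage

final class FilteredPreviewView: MTKView {
    private lazy var commandQueue = device!.makeCommandQueue()!
    private lazy var ciContext = CIContext(mtlDevice: device!)

    // The capture callback assigns the filtered CIImage here.
    var image: CIImage?

    override init(frame frameRect: CGRect, device: MTLDevice?) {
        super.init(frame: frameRect, device: device ?? MTLCreateSystemDefaultDevice())
        framebufferOnly = false   // CIContext needs to write into the drawable's texture
    }

    required init(coder: NSCoder) {
        fatalError("init(coder:) has not been implemented")
    }

    override func draw(_ rect: CGRect) {
        guard let image = image,
              let drawable = currentDrawable,
              let commandBuffer = commandQueue.makeCommandBuffer() else { return }

        // Render the filtered CIImage straight into the drawable's texture on the GPU.
        ciContext.render(image,
                         to: drawable.texture,
                         commandBuffer: commandBuffer,
                         bounds: CGRect(origin: .zero, size: drawableSize),
                         colorSpace: CGColorSpaceCreateDeviceRGB())
        commandBuffer.present(drawable)
        commandBuffer.commit()
    }
}

The capture callback then just assigns the filtered CIImage to image, and the view redraws it on the GPU at its own frame rate.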

Detect camera condition in AVCaptureSession - Swift

I am working on a Swift application and I want to take a picture during the video when the camera is not moving, or when the user focuses on something.
I used the AVCaptureVideoDataOutputSampleBufferDelegate captureOutput method, which gives me an image every time after starting the camera, but I want to capture one only when the camera is not moving or is focused.
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    print("didOutput")
    guard let hasImage = CMSampleBufferGetImageBuffer(sampleBuffer) else {
        print("no image")
        return
    }
    let ciimage: CIImage = CIImage(cvPixelBuffer: hasImage)
    DispatchQueue.main.async {
        self.liveCamImage = self.convert(cmage: ciimage)
    }
}
Is there any solution for this?
You can try using the isAdjustingFocus property of your capture device (AVCaptureDevice); when it is false, the focus is stable. See the detailed documentation below.
/**
 @property adjustingFocus
 @abstract
    Indicates whether the receiver is currently performing a focus scan to adjust focus.

 @discussion
    The value of this property is a BOOL indicating whether the receiver's camera focus is being automatically adjusted by means of a focus scan, because its focus mode is AVCaptureFocusModeAutoFocus or AVCaptureFocusModeContinuousAutoFocus. Clients can observe the value of this property to determine whether the camera's focus is stable.

 @seealso lensPosition
 @seealso AVCaptureAutoFocusSystem
 */
open var isAdjustingFocus: Bool { get }
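A minimal sketch of observing that property with KVO, assuming device is the AVCaptureDevice your session uses; the flag mentioned in the comment is hypothetical.

import AVFoundation

// Keep a strong reference to the observation, e.g. as a property of your view controller.
var focusObservation: NSKeyValueObservation?

func observeFocus(on device: AVCaptureDevice) {
    focusObservation = device.observe(\.isAdjustingFocus, options: [.new]) { _, change in
        if change.newValue == false {
            // The focus scan has finished: set a flag (e.g. self.shouldCapture = true)
            // that captureOutput checks before keeping the next sample buffer.
        }
    }
}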

How do you add an overlay while recording a video in Swift?

I am trying to record, and then save, a video in Swift using AVFoundation. This works. I am also trying to add an overlay, such as a text label containing the date, to the video.
For example: the saved video shows not only what the camera sees, but the timestamp as well.
Here is how I am saving the video:
func fileOutput(_ output: AVCaptureFileOutput, didFinishRecordingTo outputFileURL: URL, from connections: [AVCaptureConnection], error: Error?) {
    saveVideo(toURL: movieURL!)
}

private func saveVideo(toURL url: URL) {
    PHPhotoLibrary.shared().performChanges({
        PHAssetChangeRequest.creationRequestForAssetFromVideo(atFileURL: url)
    }) { (success, error) in
        if success {
            print("Video saved to Camera Roll.")
        } else {
            print("Video failed to save.")
        }
    }
}
I have a movieOutput that is an AVCaptureMovieFileOutput. My preview layer does not contain any sublayers. I tried adding the timestamp label's layer to the previewLayer, but this did not succeed.
I have tried Ray Wenderlich's example as well as this Stack Overflow question. Lastly, I also tried this tutorial, all to no avail.
How can I add an overlay to my video so that it appears in the saved video in the camera roll?
Without more information, it sounds like what you are asking for is a WATERMARK, not an overlay.
A watermark is a markup on the video that is saved with the video.
An overlay is generally shown as subviews on the preview layer and is not saved with the video.
Check this out here: https://stackoverflow.com/a/47742108/8272698
import AVFoundation
import CoreImage
import UIKit

func addWatermark(inputURL: URL, outputURL: URL, handler: @escaping (_ exportSession: AVAssetExportSession?) -> Void) {
    let mixComposition = AVMutableComposition()
    let asset = AVAsset(url: inputURL)
    let videoTrack = asset.tracks(withMediaType: AVMediaType.video)[0]
    let timerange = CMTimeRangeMake(kCMTimeZero, asset.duration)

    let compositionVideoTrack: AVMutableCompositionTrack = mixComposition.addMutableTrack(withMediaType: AVMediaType.video, preferredTrackID: CMPersistentTrackID(kCMPersistentTrackID_Invalid))!

    do {
        try compositionVideoTrack.insertTimeRange(timerange, of: videoTrack, at: kCMTimeZero)
        compositionVideoTrack.preferredTransform = videoTrack.preferredTransform
    } catch {
        print(error)
    }

    let watermarkFilter = CIFilter(name: "CISourceOverCompositing")!
    let watermarkImage = CIImage(image: UIImage(named: "waterMark")!)
    let videoComposition = AVVideoComposition(asset: asset) { (filteringRequest) in
        let source = filteringRequest.sourceImage.clampedToExtent()
        watermarkFilter.setValue(source, forKey: "inputBackgroundImage")
        let transform = CGAffineTransform(translationX: filteringRequest.sourceImage.extent.width - (watermarkImage?.extent.width)! - 2, y: 0)
        watermarkFilter.setValue(watermarkImage?.transformed(by: transform), forKey: "inputImage")
        filteringRequest.finish(with: watermarkFilter.outputImage!, context: nil)
    }

    guard let exportSession = AVAssetExportSession(asset: asset, presetName: AVAssetExportPreset640x480) else {
        handler(nil)
        return
    }

    exportSession.outputURL = outputURL
    exportSession.outputFileType = AVFileType.mp4
    exportSession.shouldOptimizeForNetworkUse = true
    exportSession.videoComposition = videoComposition
    exportSession.exportAsynchronously { () -> Void in
        handler(exportSession)
    }
}
And here's how to call the function.
let outputURL = NSURL.fileURL(withPath: "TempPath")
let inputURL = NSURL.fileURL(withPath: "VideoWithWatermarkPath")

addWatermark(inputURL: inputURL, outputURL: outputURL, handler: { (exportSession) in
    guard let session = exportSession else {
        // Error
        return
    }
    switch session.status {
    case .completed:
        guard NSData(contentsOf: outputURL) != nil else {
            // Error
            return
        }
        // Now you can find the video with the watermark in the location outputURL
    default:
        // Error
        break
    }
})
Let me know if this code works for you.
It is in Swift 3, so some changes will be needed.
I am currently using this code in an app of mine; I have not updated it to Swift 5 yet.
I do not have an actual development environment for Swift that can utilize AVFoundation. Thus, I can't provide you with any example code.
To add metadata (date, location, timestamp, watermark, frame rate, etc.) as an overlay to the video while recording, you would have to process the video feed frame by frame, live, while recording. Most likely you would have to store the frames in a buffer and process them before actually recording them.
Now, when it comes to the metadata, there are two types: static and dynamic. For a static type such as a watermark, it should be easy enough, as all the frames get the same thing.
However, for a dynamic metadata type such as a timestamp or GPS location, there are a few things that need to be taken into consideration. It takes computational power and time to process the video frames. Thus, depending on the type of dynamic data and how you obtain it, sometimes the processed value may not be correct. For example, if you get a frame at 1:00:01, you process it and add a timestamp to it. Pretend that it took 2 seconds to process the timestamp. The next frame arrives at 1:00:02, but you cannot process it until 1:00:03 because processing the previous frame took 2 seconds. Thus, depending on how you obtain the timestamp for the new frame, that timestamp value may not be the value you wanted.
For processing dynamic metadata, you should also take hardware lag into consideration. For example, suppose the software is supposed to add live GPS location data to each frame, and there were no lags in development or testing. In real life, however, a user uses the software in an area with a bad connection, and his phone lags while obtaining his GPS location. Some of his lags last as long as 5 seconds. What do you do in that situation? Do you set a timeout for the GPS location and use the last good position? Do you report the error? Do you defer that frame to be processed later when the GPS data becomes available (this may ruin live recording), using an expensive algorithm to try to predict the user's location for that frame?
Besides those considerations, here are some references that I think may help you. I thought the one from medium.com looked pretty good.
https://medium.com/ios-os-x-development/ios-camera-frames-extraction-d2c0f80ed05a
Adding watermark to currently recording video and save with watermark
Render dynamic text onto CVPixelBufferRef while recording video
Adding on to @Kevin Ng, you can do an overlay on video frames with a UIViewController and a UIView.
The UIViewController will have:
a property to work with the video stream:
private var videoSession: AVCaptureSession?
a property to work with the overlay (the UIView subclass):
private var myOverlay: MyUIView { view as! MyUIView }
a property to work with the video output queue:
private let videoOutputQueue = DispatchQueue(label: "outputQueue", qos: .userInteractive)
a method to create the video session
a method to process and display the overlay
The UIView will have the task-specific helper methods needed to act as the overlay. For example, if you are doing hand detection, the overlay class can have helper methods to draw points at coordinates (the view controller detects the coordinates of the hand features, does the necessary coordinate conversions, then passes the coordinates to the UIView class to display as an overlay), as in the sketch below.
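A minimal sketch of such an overlay view, with hypothetical names, assuming the view controller has already converted the detected points into this view's coordinate space:

import UIKit

final class PointsOverlayView: UIView {
    // Points in this view's coordinate space; setting them triggers a redraw.
    var points: [CGPoint] = [] {
        didSet { setNeedsDisplay() }
    }

    override init(frame: CGRect) {
        super.init(frame: frame)
        backgroundColor = .clear          // let the camera preview show through
        isUserInteractionEnabled = false  // touches pass through to the preview
    }

    required init?(coder: NSCoder) { fatalError("init(coder:) has not been implemented") }

    override func draw(_ rect: CGRect) {
        guard let context = UIGraphicsGetCurrentContext() else { return }
        context.setFillColor(UIColor.systemGreen.cgColor)
        for point in points {
            context.fillEllipse(in: CGRect(x: point.x - 4, y: point.y - 4, width: 8, height: 8))
        }
    }
}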