TokBox: 'consumeFrame' crashes when using a modified pixel buffer - swift

I'm trying to modify the pixel buffer from AVFoundation's live video feed and stream it through OpenTok's API. But whenever I feed the modified buffer to OpenTok's consumeFrame, it crashes.
I am doing this so I can apply live video effects (filters, stickers, etc.). I have tried converting CGImage->CVPixelBuffer with several different methods, but nothing works.
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    if !capturing || videoCaptureConsumer == nil {
        return
    }
    guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
        print("Error acquiring sample buffer")
        return
    }
    guard let videoInput = videoInput else {
        print("Capturer does not have a valid input")
        return
    }
    let time = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
    videoFrame.clearPlanes()
    videoFrame.timestamp = time
    let height = UInt32(CVPixelBufferGetHeight(imageBuffer))
    let width = UInt32(CVPixelBufferGetWidth(imageBuffer))
    if width != captureWidth || height != captureHeight {
        updateCaptureFormat(width: width, height: height)
    }
    // This is where I convert CVImageBuffer->CIImage, modify it, turn it into CGImage, then CGImage->CVPixelBuffer
    guard let finalImage = makeBigEyes(imageBuffer) else { return }
    CVPixelBufferLockBaseAddress(finalImage, CVPixelBufferLockFlags(rawValue: 0))
    videoFrame.format?.estimatedCaptureDelay = 10
    videoFrame.orientation = .left
    videoFrame.clearPlanes()
    videoFrame.planes?.addPointer(CVPixelBufferGetBaseAddress(finalImage))
    delegate?.finishPreparingFrame(videoFrame)
    videoCaptureConsumer!.consumeFrame(videoFrame)
    CVPixelBufferUnlockBaseAddress(finalImage, CVPixelBufferLockFlags(rawValue: 0))
}
And here is my CGImage->CVPixelBuffer method:
func buffer(from image: UIImage) -> CVPixelBuffer? {
    let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                 kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
    var pixelBuffer: CVPixelBuffer?
    let status = CVPixelBufferCreate(kCFAllocatorDefault, Int(image.size.width), Int(image.size.height), kCVPixelFormatType_32ARGB, attrs, &pixelBuffer)
    guard status == kCVReturnSuccess else {
        return nil
    }
    CVPixelBufferLockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))
    let pixelData = CVPixelBufferGetBaseAddress(pixelBuffer!)
    let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
    let context = CGContext(data: pixelData, width: Int(image.size.width), height: Int(image.size.height), bitsPerComponent: 8, bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer!), space: rgbColorSpace, bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue)
    context?.translateBy(x: 0, y: image.size.height)
    context?.scaleBy(x: 1.0, y: -1.0)
    UIGraphicsPushContext(context!)
    image.draw(in: CGRect(x: 0, y: 0, width: image.size.width, height: image.size.height))
    UIGraphicsPopContext()
    CVPixelBufferUnlockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))
    return pixelBuffer
}
I get this error on the first frame:
*** Terminating app due to uncaught exception 'NSRangeException', reason: '*** -[NSConcretePointerArray pointerAtIndex:]: attempt to access pointer at index 1 beyond bounds 1'
If you made it this far, thank you for reading. I've been stuck on this issue a while now, so any kind of pointer would be greatly appreciated. Thanks.

Since you are converting the camera's NV12 frame to RGB, you need to set the pixel format to OTPixelFormatARGB on videoFrame.format. The out-of-bounds error is consistent with this: an NV12 frame is expected to carry two planes (Y and CbCr), but after the conversion you only add one base-address pointer, so the consumer's attempt to read plane index 1 fails.
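A minimal sketch of that change, with the caveat that it assumes the question's OTVideoFrame/OTVideoFormat setup and that OpenTok's Swift import renames OTPixelFormatARGB to .ARGB:

```swift
// Sketch only - assumes videoFrame.format is the OTVideoFormat this capturer
// created, and finalImage is the converted ARGB pixel buffer from the question.
// An ARGB frame carries a single interleaved plane, so one base-address
// pointer is enough; NV12 would require two (Y and CbCr).
videoFrame.format?.pixelFormat = .ARGB

videoFrame.clearPlanes()
videoFrame.planes?.addPointer(CVPixelBufferGetBaseAddress(finalImage))
videoCaptureConsumer?.consumeFrame(videoFrame)
```

The key point is that the declared pixel format and the number of plane pointers you add must agree.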

Related

Modifying CVPixelBuffer

I'm using the method below to add drawings to a pixel buffer, then append it to an AVAssetWriterInputPixelBufferAdaptor.
It works on my Mac mini (macOS 12 beta 7), but the drawingHandler draws nothing on my MacBook (macOS 11.5.2).
Is there anything wrong with this code?
#if os(macOS)
import AppKit
#else
import UIKit
#endif
import CoreMedia

extension CMSampleBuffer {
    func pixelBuffer(drawingHandler: ((CGRect) -> Void)? = nil) -> CVPixelBuffer? {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(self) else {
            return nil
        }
        guard let drawingHandler = drawingHandler else {
            return pixelBuffer
        }
        guard CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly) == kCVReturnSuccess else {
            return pixelBuffer
        }
        let data = CVPixelBufferGetBaseAddress(pixelBuffer)
        let width = CVPixelBufferGetWidth(pixelBuffer)
        let height = CVPixelBufferGetHeight(pixelBuffer)
        let bitsPerComponent = 8
        let bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
        let colorSpace = CGColorSpaceCreateDeviceRGB()
        let imageByteOrderInfo = CGImageByteOrderInfo.order32Little
        let imageAlphaInfo = CGImageAlphaInfo.premultipliedFirst
        if let ctx = CGContext(data: data,
                               width: width,
                               height: height,
                               bitsPerComponent: bitsPerComponent,
                               bytesPerRow: bytesPerRow,
                               space: colorSpace,
                               bitmapInfo: imageByteOrderInfo.rawValue | imageAlphaInfo.rawValue)
        {
            // Push
            #if os(macOS)
            let graphCtx = NSGraphicsContext(cgContext: ctx, flipped: false)
            NSGraphicsContext.saveGraphicsState()
            NSGraphicsContext.current = graphCtx
            #else
            UIGraphicsPushContext(ctx)
            #endif
            let rect = CGRect(x: 0, y: 0, width: width, height: height)
            drawingHandler(rect)
            // Pop
            #if os(macOS)
            NSGraphicsContext.restoreGraphicsState()
            #else
            UIGraphicsPopContext()
            #endif
        }
        CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly)
        return pixelBuffer
    }
}
Change the lock flags. Locking the base address with .readOnly tells Core Video you will not modify the pixels, so drawing done under a read-only lock is not guaranteed to be written back to the buffer; lock with no flags (read/write) when you intend to draw into it.
let lockFlags = CVPixelBufferLockFlags(rawValue: 0)
guard CVPixelBufferLockBaseAddress(pixelBuffer, lockFlags) == kCVReturnSuccess else {
    return pixelBuffer
}
// ...
CVPixelBufferUnlockBaseAddress(pixelBuffer, lockFlags)

Use Create ML object detection model in swift

Hello, I have created an object detection model in Create ML and imported it into my Swift project, but I can't figure out how to use it. Basically I'm just looking to give the model an input and then receive an output. I have opened the ML model's prediction tab and found the input and output variables, but I don't know how to implement it in code. I have searched the internet for an answer and found multiple code snippets for running ML models, but I can't get them to work.
This is the ML Model:
ML Model predictions
This is the code I have tried:
let model = TestObjectModel()
guard let modelOutput = try? model.prediction(imagePath: "images_(2)" as! CVPixelBuffer, iouThreshold: 0.5, confidenceThreshold: 0.5) else {
    fatalError("Unexpected runtime error.")
}
print(modelOutput)
When running the code i get this error:
error: Execution was interrupted, reason: EXC_BREAKPOINT (code=1, subcode=0x106c345c0).
The process has been left at the point where it was interrupted, use "thread return -x" to return to the state before expression evaluation.
OK, first of all you have to decide which type of input your model declares. You can see it when you click on your model in the project navigator.
For example:
let mlArray = try? MLMultiArray(shape: [1024], dataType: MLMultiArrayDataType.float32)
mlArray![index] = x // fill your array with data

// Core ML generates an input wrapper class named <ModelName>Input
let input = TestObjectModelInput(input: mlArray!)
do {
    let options = MLPredictionOptions()
    options.usesCPUOnly = true
    // prediction is your output
    let prediction = try model.prediction(input: input, options: options)
} catch {
    fatalError(error.localizedDescription) // Error computing NN outputs error
}
Another example, with an image as the input for your model:
do {
    if let resizedImage = resize(image: image, newSize: CGSize(width: 416, height: 416)),
       let pixelBuffer = resizedImage.toCVPixelBuffer() {
        let prediction = try model.prediction(image: pixelBuffer)
        let value = prediction.output[0].intValue
        print(value)
    }
} catch {
    print("Error while doing predictions: \(error)")
}

func resize(image: UIImage, newSize: CGSize) -> UIImage? {
    UIGraphicsBeginImageContextWithOptions(newSize, false, 0.0)
    image.draw(in: CGRect(x: 0, y: 0, width: newSize.width, height: newSize.height))
    let newImage = UIGraphicsGetImageFromCurrentImageContext()
    UIGraphicsEndImageContext()
    return newImage
}
extension UIImage {
    func toCVPixelBuffer() -> CVPixelBuffer? {
        let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                     kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
        var pixelBuffer: CVPixelBuffer?
        let status = CVPixelBufferCreate(kCFAllocatorDefault, Int(self.size.width), Int(self.size.height), kCVPixelFormatType_32ARGB, attrs, &pixelBuffer)
        guard status == kCVReturnSuccess else {
            return nil
        }
        CVPixelBufferLockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))
        let pixelData = CVPixelBufferGetBaseAddress(pixelBuffer!)
        let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
        let context = CGContext(data: pixelData, width: Int(self.size.width), height: Int(self.size.height), bitsPerComponent: 8, bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer!), space: rgbColorSpace, bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue)
        context?.translateBy(x: 0, y: self.size.height)
        context?.scaleBy(x: 1.0, y: -1.0)
        UIGraphicsPushContext(context!)
        self.draw(in: CGRect(x: 0, y: 0, width: self.size.width, height: self.size.height))
        UIGraphicsPopContext()
        CVPixelBufferUnlockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))
        return pixelBuffer
    }
}

Bitmap to Metal Texture and back - How does the pixelFormat work?

I'm having trouble understanding how the pixelFormat of an MTLTexture relates to the properties of an NSBitmapImageRep.
In particular, I want to use a metal compute kernel (or the built in MPS method) to subtract an image from another one and KEEP the negative values temporarily.
I have a method that creates a MTLTexture from a bitmap with a specified pixelFormat:
func textureFrom(bitmap: NSBitmapImageRep, pixelFormat: MTLPixelFormat) -> MTLTexture? {
    guard !bitmap.isPlanar else {
        return nil
    }
    let region = MTLRegionMake2D(0, 0, bitmap.pixelsWide, bitmap.pixelsHigh)
    let textureDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: pixelFormat,
                                                                     width: bitmap.pixelsWide,
                                                                     height: bitmap.pixelsHigh,
                                                                     mipmapped: false)
    guard let texture = device.makeTexture(descriptor: textureDescriptor),
          let src = bitmap.bitmapData else { return nil }
    texture.replace(region: region, mipmapLevel: 0, withBytes: src, bytesPerRow: bitmap.bytesPerRow)
    return texture
}
Then I use the textures to do some computation (like a subtraction) and when I'm done, I want to get a bitmap back. In the case of textures with a .r8Snorm pixelFormat, I thought I could do:
func bitmapFrom(r8SnormTexture: MTLTexture?) -> NSBitmapImageRep? {
    guard let texture = r8SnormTexture,
          texture.pixelFormat == .r8Snorm else { return nil }
    let bytesPerPixel = 1
    let imageByteCount = Int(texture.width * texture.height * bytesPerPixel)
    let bytesPerRow = texture.width * bytesPerPixel
    var src = [Float](repeating: 0, count: imageByteCount)
    let region = MTLRegionMake2D(0, 0, texture.width, texture.height)
    texture.getBytes(&src, bytesPerRow: bytesPerRow, from: region, mipmapLevel: 0)
    let bitmapInfo = CGBitmapInfo(rawValue: CGImageAlphaInfo.none.rawValue)
    let colorSpace = CGColorSpaceCreateDeviceGray()
    let bitsPerComponent = 8
    let context = CGContext(data: &src, width: texture.width, height: texture.height, bitsPerComponent: bitsPerComponent, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo.rawValue)
    guard let dstImageFilter = context?.makeImage() else {
        return nil
    }
    return NSBitmapImageRep(cgImage: dstImageFilter)
}
But the negative values are not preserved; they are somehow clamped to zero...
Any insight into how Swift goes from bitmap to texture and back would be appreciated.
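One detail that may explain the clamping (my reading, not from the original post): .r8Snorm texels are signed bytes, but the [Float] destination above is filled with raw bytes, and an 8-bit grayscale CGContext is unsigned, so the sign is lost before the bitmap is built. Reading the texels into [Int8] and decoding with Metal's snorm rule keeps the negatives; this hypothetical helper shows just that conversion:

```swift
// Metal's 8-bit snorm decoding: value / 127, with -128 clamped to -1.0,
// so the representable range is exactly [-1.0, 1.0].
func snormToFloat(_ raw: Int8) -> Float {
    max(Float(raw) / 127.0, -1.0)
}

// After copying .r8Snorm texels into an [Int8] via texture.getBytes(...),
// mapping through snormToFloat yields floats with negatives intact:
let texels: [Int8] = [127, 0, -64, -128]
let decoded = texels.map(snormToFloat)
print(decoded)  // negatives survive; first element is 1.0, last is -1.0
```

A CGImage round trip can't carry these values, so if the negatives only need to survive between compute passes, keeping them in the texture (or in a Float buffer like this) and converting to a bitmap only for display is the safer route.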

Is there a faster way to create a CVPixelBuffer from a UIImage in Swift?

Task: record a video in real time with a filter applied to it.
Problem: getting a CVPixelBuffer from a modified UIImage is too slow.
My camera's output is being filtered and sent straight to a UIImageView so that the user can see the effect in real time, even when not recording video or taking a photo. I'd like some way to record this changing UIImage to video, so it doesn't have to work the way it does now. Currently I append CVPixelBuffers to an assetWriter, but since I apply the filter to a UIImage, I have to translate the UIImage back into a buffer. I've tested with and without the UIImage -> buffer step, so I've confirmed that step causes the unacceptable slowdown.
Below is the code inside captureOutput, commented to be clear what is going on, and the method for getting the UIImage buffer:
// this function is called to output the device's camera output in real time
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    if captureOutput {
        // create a CIImage from the buffer
        let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
        let cameraImage = CIImage(cvPixelBuffer: pixelBuffer!)
        // set UIImage to the CIImage
        image = UIImage(ciImage: cameraImage)
        if let ciImage = image?.ciImage {
            // apply the filter to the CIImage
            image = filterCIImage(with: ciImage)
            // make a CGImage and apply the orientation
            image = UIImage(cgImage: (image?.cgImage)!, scale: 1.0, orientation: UIImageOrientation.right)
            // get the format description, dimensions and current sample time
            let formatDescription = CMSampleBufferGetFormatDescription(sampleBuffer)!
            self.currentVideoDimensions = CMVideoFormatDescriptionGetDimensions(formatDescription)
            self.currentSampleTime = CMSampleBufferGetOutputPresentationTimeStamp(sampleBuffer)
            // check if the user toggled video recording
            // and the asset writer is ready
            if videoIsRecording && self.assetWriterPixelBufferInput?.assetWriterInput.isReadyForMoreMediaData == true {
                // get a pixel buffer from the UIImage - SLOW!
                let filteredBuffer = buffer(from: image!)
                // append the buffer to the asset writer
                let success = self.assetWriterPixelBufferInput?.append(filteredBuffer!, withPresentationTime: self.currentSampleTime!)
                if success == false {
                    print("Pixel Buffer failed")
                }
            }
        }
        DispatchQueue.main.async {
            // update the UIImageView with the filtered camera output
            imageView!.image = image
        }
    }
}
// UIImage to buffer method:
func buffer(from image: UIImage) -> CVPixelBuffer? {
    let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                 kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
    var pixelBuffer: CVPixelBuffer?
    let status = CVPixelBufferCreate(kCFAllocatorDefault, Int(image.size.width), Int(image.size.height), kCVPixelFormatType_32ARGB, attrs, &pixelBuffer)
    guard status == kCVReturnSuccess else {
        return nil
    }
    CVPixelBufferLockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))
    let pixelData = CVPixelBufferGetBaseAddress(pixelBuffer!)
    let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
    let context = CGContext(data: pixelData, width: Int(image.size.width), height: Int(image.size.height), bitsPerComponent: 8, bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer!), space: rgbColorSpace, bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue)
    context?.translateBy(x: 0, y: image.size.height)
    context?.scaleBy(x: 1.0, y: -1.0)
    UIGraphicsPushContext(context!)
    image.draw(in: CGRect(x: 0, y: 0, width: image.size.width, height: image.size.height))
    UIGraphicsPopContext()
    CVPixelBufferUnlockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))
    return pixelBuffer
}
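For what it's worth, a common way to avoid this cost (a sketch under assumptions, not the question's code) is to skip UIImage entirely and let Core Image render the filtered CIImage straight into a pixel buffer, ideally one drawn from the adaptor's pool. The helper name here is hypothetical, and the pool is assumed to come from the adaptor (e.g. assetWriterPixelBufferInput?.pixelBufferPool):

```swift
import CoreImage
import CoreVideo

// Create once and reuse; building a CIContext per frame is expensive.
let ciContext = CIContext()

// Hypothetical helper: render a filtered CIImage into a buffer from a pool,
// with no UIImage or CGContext redraw in the middle.
func renderToBuffer(_ filtered: CIImage, pool: CVPixelBufferPool) -> CVPixelBuffer? {
    var buffer: CVPixelBuffer?
    let status = CVPixelBufferPoolCreatePixelBuffer(kCFAllocatorDefault, pool, &buffer)
    guard status == kCVReturnSuccess, let output = buffer else { return nil }
    ciContext.render(filtered, to: output)
    return output
}
```

The rendered buffer can then go to assetWriterPixelBufferInput?.append(_:withPresentationTime:) as before, and the UIImage is only needed for the on-screen preview.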

Is there an easier way to setup a pixel buffer for CoreML? [duplicate]

I am trying to get Apple's sample Core ML Models that were demoed at the 2017 WWDC to function correctly. I am using the GoogLeNet to try and classify images (see the Apple Machine Learning Page). The model takes a CVPixelBuffer as an input. I have an image called imageSample.jpg that I'm using for this demo. My code is below:
var sample = UIImage(named: "imageSample")?.cgImage
let bufferThree = getCVPixelBuffer(sample!)
let model = GoogLeNetPlaces()
guard let output = try? model.prediction(input: GoogLeNetPlacesInput(sceneImage: bufferThree!)) else {
    fatalError("Unexpected runtime error.")
}
print(output.sceneLabel)
I am always getting the unexpected runtime error in the output rather than an image classification. My code to convert the image is below:
func getCVPixelBuffer(_ image: CGImage) -> CVPixelBuffer? {
    let imageWidth = Int(image.width)
    let imageHeight = Int(image.height)
    let attributes: [NSObject: AnyObject] = [
        kCVPixelBufferCGImageCompatibilityKey: true as AnyObject,
        kCVPixelBufferCGBitmapContextCompatibilityKey: true as AnyObject
    ]
    var pxbuffer: CVPixelBuffer? = nil
    CVPixelBufferCreate(kCFAllocatorDefault,
                        imageWidth,
                        imageHeight,
                        kCVPixelFormatType_32ARGB,
                        attributes as CFDictionary?,
                        &pxbuffer)
    if let _pxbuffer = pxbuffer {
        let flags = CVPixelBufferLockFlags(rawValue: 0)
        CVPixelBufferLockBaseAddress(_pxbuffer, flags)
        let pxdata = CVPixelBufferGetBaseAddress(_pxbuffer)
        let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
        let context = CGContext(data: pxdata,
                                width: imageWidth,
                                height: imageHeight,
                                bitsPerComponent: 8,
                                bytesPerRow: CVPixelBufferGetBytesPerRow(_pxbuffer),
                                space: rgbColorSpace,
                                bitmapInfo: CGImageAlphaInfo.premultipliedFirst.rawValue)
        if let _context = context {
            _context.draw(image, in: CGRect(x: 0, y: 0, width: imageWidth, height: imageHeight))
        } else {
            CVPixelBufferUnlockBaseAddress(_pxbuffer, flags)
            return nil
        }
        CVPixelBufferUnlockBaseAddress(_pxbuffer, flags)
        return _pxbuffer
    }
    return nil
}
I got this code from a previous StackOverflow post (last answer there). I recognize that the code may not be correct, but I have no idea how to do this myself. I believe this is the section that contains the error. The model calls for the following type of input: Image<RGB,224,224>
You don't need to do a bunch of image mangling yourself to use a Core ML model with an image — the new Vision framework can do that for you.
import Vision
import CoreML
let model = try VNCoreMLModel(for: MyCoreMLGeneratedModelClass().model)
let request = VNCoreMLRequest(model: model, completionHandler: myResultsMethod)
let handler = VNImageRequestHandler(url: myImageURL)
try? handler.perform([request]) // perform(_:) throws
func myResultsMethod(request: VNRequest, error: Error?) {
    guard let results = request.results as? [VNClassificationObservation]
        else { fatalError("huh") }
    for classification in results {
        print(classification.identifier, // the scene label
              classification.confidence)
    }
}
The WWDC17 session on Vision should have a bit more info — it's tomorrow afternoon.
You can use pure Core ML, but you should resize the image to (224, 224) first:
DispatchQueue.global(qos: .userInitiated).async {
    // Resnet50 expects a 224 x 224 image, so we should resize and crop the source image
    let inputImageSize: CGFloat = 224.0
    let minLen = min(image.size.width, image.size.height)
    let resizedImage = image.resize(to: CGSize(width: inputImageSize * image.size.width / minLen,
                                               height: inputImageSize * image.size.height / minLen))
    let croppedToSquareImage = resizedImage.cropToSquare()
    guard let pixelBuffer = croppedToSquareImage?.pixelBuffer() else {
        fatalError()
    }
    guard let classifierOutput = try? self.classifier.prediction(image: pixelBuffer) else {
        fatalError()
    }
    DispatchQueue.main.async {
        self.title = classifierOutput.classLabel
    }
}
// ...
extension UIImage {
    func resize(to newSize: CGSize) -> UIImage {
        UIGraphicsBeginImageContextWithOptions(CGSize(width: newSize.width, height: newSize.height), true, 1.0)
        self.draw(in: CGRect(x: 0, y: 0, width: newSize.width, height: newSize.height))
        let resizedImage = UIGraphicsGetImageFromCurrentImageContext()!
        UIGraphicsEndImageContext()
        return resizedImage
    }

    func cropToSquare() -> UIImage? {
        guard let cgImage = self.cgImage else {
            return nil
        }
        var imageHeight = self.size.height
        var imageWidth = self.size.width
        if imageHeight > imageWidth {
            imageHeight = imageWidth
        } else {
            imageWidth = imageHeight
        }
        let size = CGSize(width: imageWidth, height: imageHeight)
        let x = ((CGFloat(cgImage.width) - size.width) / 2).rounded()
        let y = ((CGFloat(cgImage.height) - size.height) / 2).rounded()
        let cropRect = CGRect(x: x, y: y, width: size.width, height: size.height)
        if let croppedCgImage = cgImage.cropping(to: cropRect) {
            return UIImage(cgImage: croppedCgImage, scale: 0, orientation: self.imageOrientation)
        }
        return nil
    }
    func pixelBuffer() -> CVPixelBuffer? {
        let width = self.size.width
        let height = self.size.height
        let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                     kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
        var pixelBuffer: CVPixelBuffer?
        let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                         Int(width),
                                         Int(height),
                                         kCVPixelFormatType_32ARGB,
                                         attrs,
                                         &pixelBuffer)
        guard let resultPixelBuffer = pixelBuffer, status == kCVReturnSuccess else {
            return nil
        }
        CVPixelBufferLockBaseAddress(resultPixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
        let pixelData = CVPixelBufferGetBaseAddress(resultPixelBuffer)
        let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
        guard let context = CGContext(data: pixelData,
                                      width: Int(width),
                                      height: Int(height),
                                      bitsPerComponent: 8,
                                      bytesPerRow: CVPixelBufferGetBytesPerRow(resultPixelBuffer),
                                      space: rgbColorSpace,
                                      bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue) else {
            return nil
        }
        context.translateBy(x: 0, y: height)
        context.scaleBy(x: 1.0, y: -1.0)
        UIGraphicsPushContext(context)
        self.draw(in: CGRect(x: 0, y: 0, width: width, height: height))
        UIGraphicsPopContext()
        CVPixelBufferUnlockBaseAddress(resultPixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
        return resultPixelBuffer
    }
}
You can find the expected input image size in the .mlmodel file:
A demo project that uses both pure CoreML and Vision variants you can find here: https://github.com/handsomecode/iOS11-Demos/tree/coreml_vision/CoreML/CoreMLDemo
If the input is a UIImage rather than a URL, and you want to use VNImageRequestHandler, you can use CIImage:
func updateClassifications(for image: UIImage) {
    let orientation = CGImagePropertyOrientation(image.imageOrientation)
    guard let ciImage = CIImage(image: image) else { return }
    let handler = VNImageRequestHandler(ciImage: ciImage, orientation: orientation)
    // then perform your VNCoreMLRequest(s), e.g.: try? handler.perform([request])
}
From Classifying Images with Vision and Core ML