Use Create ML object detection model in swift - swift

Hello, I have created an object detection model in Create ML and imported it into my Swift project, but I can't figure out how to use it. Basically, I'm just looking to give the model an input and receive an output. I have opened the ML model's Predictions tab and found the input and output variables, but I don't know how to use them in code. I have searched the internet for an answer and found multiple code snippets for running ML models, but I can't get them to work.
This is the ML Model:
ML Model predictions
This is the code I have tried:
let model = TestObjectModel()
guard let modelOutput = try? model.prediction(imagePath: "images_(2)" as! CVPixelBuffer, iouThreshold: 0.5, confidenceThreshold: 0.5) else {
fatalError("Unexpected runtime error.")
}
print(modelOutput)
When running the code I get this error:
error: Execution was interrupted, reason: EXC_BREAKPOINT (code=1, subcode=0x106c345c0).
The process has been left at the point where it was interrupted, use "thread return -x" to return to the state before expression evaluation.

OK, first of all you have to check which type of input your model declares. You can see it when you click on your model in the project navigator.
For example:
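// Assumes the model takes an MLMultiArray of 1024 floats; check the model's
// Predictions tab for the real input name and shape.
let model = TestObjectModel()
do {
    let mlArray = try MLMultiArray(shape: [1024], dataType: .float32)
    mlArray[index] = x // fill the array with your data (x must be an NSNumber)
    // The generated input class is normally named <ModelName>Input; the
    // argument label matches the input name shown in the Predictions tab.
    let input = TestObjectModelInput(input: mlArray)
    let options = MLPredictionOptions()
    options.usesCPUOnly = true
    // Use `try`, not `try?`, so that a failure actually reaches the catch block.
    let prediction = try model.prediction(input: input, options: options)
    print(prediction) // `prediction` is your output
} catch {
    fatalError(error.localizedDescription) // e.g. "Error computing NN outputs"
}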
Another example, with an image as the input for your model:
do {
if let resizedImage = resize(image: image, newSize: CGSize(width: 416, height: 416)), let pixelBuffer = resizedImage.toCVPixelBuffer() {
let prediction = try model.prediction(image: pixelBuffer)
let value = prediction.output[0].intValue
print(value)
}
} catch {
print("Error while doing predictions: \(error)")
}
func resize(image: UIImage, newSize: CGSize) -> UIImage? {
UIGraphicsBeginImageContextWithOptions(newSize, false, 0.0)
image.draw(in: CGRect(x: 0, y: 0, width: newSize.width, height: newSize.height))
let newImage = UIGraphicsGetImageFromCurrentImageContext()
UIGraphicsEndImageContext()
return newImage
}
extension UIImage {
func toCVPixelBuffer() -> CVPixelBuffer? {
let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue, kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
var pixelBuffer : CVPixelBuffer?
let status = CVPixelBufferCreate(kCFAllocatorDefault, Int(self.size.width), Int(self.size.height), kCVPixelFormatType_32ARGB, attrs, &pixelBuffer)
guard (status == kCVReturnSuccess) else {
return nil
}
CVPixelBufferLockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))
let pixelData = CVPixelBufferGetBaseAddress(pixelBuffer!)
let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
let context = CGContext(data: pixelData, width: Int(self.size.width), height: Int(self.size.height), bitsPerComponent: 8, bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer!), space: rgbColorSpace, bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue)
context?.translateBy(x: 0, y: self.size.height)
context?.scaleBy(x: 1.0, y: -1.0)
UIGraphicsPushContext(context!)
self.draw(in: CGRect(x: 0, y: 0, width: self.size.width, height: self.size.height))
UIGraphicsPopContext()
CVPixelBufferUnlockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))
return pixelBuffer
}
}
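For the object detection model in the question specifically, the crash most likely comes from `"images_(2)" as! CVPixelBuffer`: a String can never be force-cast to a pixel buffer, so the cast traps at runtime. A minimal sketch of an alternative is to let Vision handle the scaling and pixel buffer conversion. It assumes the generated class is called TestObjectModel as in the question and that Xcode generated the `init(configuration:)` initializer; the function name detectObjects is made up for illustration.

```swift
import CoreML
import UIKit
import Vision

// Hypothetical helper: runs the Create ML object detector on a UIImage via Vision.
// Assumes `TestObjectModel` is the Xcode-generated class from the question.
func detectObjects(in image: UIImage) throws {
    guard let cgImage = image.cgImage else { return }

    let coreMLModel = try TestObjectModel(configuration: MLModelConfiguration()).model
    let visionModel = try VNCoreMLModel(for: coreMLModel)

    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        // Object detectors exported by Create ML are expected to report
        // VNRecognizedObjectObservation results.
        guard let observations = request.results as? [VNRecognizedObjectObservation] else { return }
        for observation in observations {
            let best = observation.labels.first
            print(best?.identifier ?? "?", best?.confidence ?? 0, observation.boundingBox)
        }
    }
    // Vision resizes the image to the model's input size for us.
    request.imageCropAndScaleOption = .scaleFill

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try handler.perform([request])
}
```

It could be called with `if let image = UIImage(named: "images_(2)") { try? detectObjects(in: image) }`. The bounding boxes Vision reports are normalized to the 0...1 range with the origin in the lower-left corner, so they need converting before drawing them in UIKit.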

Related

Setting image quality using UIGraphicsPDFRenderer

I'm trying to add an Aztec barcode to a PDF using UIGraphicsPDFRenderer. My issue is that the result is blurry. I thought the fix was setting interpolationQuality, but it has no effect. Thanks for your help.
let renderer = UIGraphicsPDFRenderer(bounds: CGRect(x: 0, y: 0, width: 70.8661, height: 70.8661))
return renderer.pdfData{ ctx in
ctx.beginPage()
ctx.cgContext.interpolationQuality = .none //Doesn't do anything
let barcode = generateQRCode(from: UUID().description)
barcode.draw(in: CGRect(x: 0, y: 0, width: 70.8661, height: 70.8661))
}
func generateQRCode(from string: String) -> UIImage {
filter.message = Data(string.utf8)
if let outputImage = filter.outputImage {
if let cgimg = context.createCGImage(outputImage, from: outputImage.extent) {
return UIImage(cgImage: cgimg)
}
}
return UIImage()
}

TokBox: 'consumeFrame' crashes when using a modified pixel buffer

I'm trying to modify the pixel buffer from live video feed from AVFoundation to stream through OpenTok's API. But whenever I try to do so and feed it through OpenTok's consumeFrame, it crashes.
I am doing this so I can apply different live video effects (filters, stickers, etc.). I have tried converting CGImage->CVPixelBuffer with different methods, but nothing works.
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
if !capturing || videoCaptureConsumer == nil {
return
}
guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
else {
print("Error acquiring sample buffer")
return
}
guard let videoInput = videoInput
else {
print("Capturer does not have a valid input")
return
}
let time = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
videoFrame.clearPlanes()
videoFrame.timestamp = time
let height = UInt32(CVPixelBufferGetHeight(imageBuffer))
let width = UInt32(CVPixelBufferGetWidth(imageBuffer))
if width != captureWidth || height != captureHeight {
updateCaptureFormat(width: width, height: height)
}
// This is where I convert CVImageBuffer->CIImage, modify it, turn it into CGImage, then CGImage->CVPixelBuffer
guard let finalImage = makeBigEyes(imageBuffer) else { return }
CVPixelBufferLockBaseAddress(finalImage, CVPixelBufferLockFlags(rawValue: 0))
videoFrame.format?.estimatedCaptureDelay = 10
videoFrame.orientation = .left
videoFrame.clearPlanes()
videoFrame.planes?.addPointer(CVPixelBufferGetBaseAddress(finalImage))
delegate?.finishPreparingFrame(videoFrame)
videoCaptureConsumer!.consumeFrame(videoFrame)
CVPixelBufferUnlockBaseAddress(finalImage, CVPixelBufferLockFlags(rawValue: 0))
}
And here is my CGImage->CVPixelBuffer method:
func buffer(from image: UIImage) -> CVPixelBuffer? {
let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue, kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
var pixelBuffer : CVPixelBuffer?
let status = CVPixelBufferCreate(kCFAllocatorDefault, Int(image.size.width), Int(image.size.height), kCVPixelFormatType_32ARGB, attrs, &pixelBuffer)
guard (status == kCVReturnSuccess) else {
return nil
}
CVPixelBufferLockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))
let pixelData = CVPixelBufferGetBaseAddress(pixelBuffer!)
let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
let context = CGContext(data: pixelData, width: Int(image.size.width), height: Int(image.size.height), bitsPerComponent: 8, bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer!), space: rgbColorSpace, bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue)
context?.translateBy(x: 0, y: image.size.height)
context?.scaleBy(x: 1.0, y: -1.0)
UIGraphicsPushContext(context!)
image.draw(in: CGRect(x: 0, y: 0, width: image.size.width, height: image.size.height))
UIGraphicsPopContext()
CVPixelBufferUnlockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))
return pixelBuffer
}
I get this error on the first frame:
*** Terminating app due to uncaught exception 'NSRangeException', reason: '*** -[NSConcretePointerArray pointerAtIndex:]: attempt to access pointer at index 1 beyond bounds 1'
If you made it this far, thank you for reading. I've been stuck on this issue a while now, so any kind of pointer would be greatly appreciated. Thanks.
Since you are converting the camera (NV12) frame to RGB, you need to set pixelFormat to OTPixelFormatARGB on videoFrame.format.

Convert raw image data to jpeg in swift

I am trying to convert the raw image data to jpeg in swift. But unfortunately, the jpeg image created is skewed. Please find below the code used for the conversion.
let rawPtr = (rawImgData as NSData).bytes
let mutableRawPtr = UnsafeMutableRawPointer.init(mutating: rawPtr)
let context = CGContext.init(data: mutableRawPtr,
width: 750,
height: 1334,
bitsPerComponent: 32,
bytesPerRow: (8 * 750)/8,
space: CGColorSpaceCreateDeviceRGB(),
bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue())
let imageRef = CGContext.makeImage(context!)
let imageRep = NSBitmapImageRep(cgImage: imageRef()!)
let finalData = imageRep.representation(using: .jpeg,
properties: [NSBitmapImageRep.PropertyKey.compressionFactor : 0.5])
Here's the converted jpeg image
Any help or pointers would be greatly appreciated. Thanks in advance!
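Finally figured out the answer: use 8 bits per component, take the width, height and bytes per row from the stream attributes instead of hard-coding them, and pass a bitmapInfo that matches the byte order of the data.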
var width:size_t?
var height:size_t?
let bitsPerComponent:size_t = 8
var bytesPerRow:size_t?
var bitmapInfo: UInt32
if #available(OSX 10.12, *) {
bitmapInfo = UInt32(CGImageAlphaInfo.premultipliedFirst.rawValue) | UInt32(CGImageByteOrderInfo.order32Little.rawValue)
} else {
bitmapInfo = UInt32(CGImageAlphaInfo.premultipliedFirst.rawValue)
}
let colorSpace = CGColorSpaceCreateDeviceRGB()
do {
let streamAttributesFuture = self.bitmapStream?.streamAttributes()
streamAttributesFuture?.onQueue(.main, notifyOfCompletion: { (streamAttributesFut) in
})
try streamAttributesFuture?.await()
let dict = streamAttributesFuture?.result
width = (dict?.attributes["width"] as! Int)
height = (dict?.attributes["height"] as! Int)
bytesPerRow = (dict?.attributes["row_size"] as! Int)
} catch let frameError{
print("frame error = \(frameError)")
return
}
let rawPtr = (data as NSData).bytes
let mutableRawPtr = UnsafeMutableRawPointer.init(mutating: rawPtr)
let context = CGContext.init(data: mutableRawPtr, width: width!, height: height!, bitsPerComponent: bitsPerComponent, bytesPerRow: bytesPerRow!, space: colorSpace, bitmapInfo: bitmapInfo)
if context == nil {
return
}
let imageRef = context?.makeImage()
let scaledWidth = Int(Double(width!) / 1.5)
let scaledHeight = Int(Double(height!) / 1.5)
let resizeCtx = CGContext.init(data: nil, width: scaledWidth, height: scaledHeight, bitsPerComponent: bitsPerComponent, bytesPerRow: bytesPerRow!, space: colorSpace, bitmapInfo: bitmapInfo)
// Set the interpolation quality before drawing; it does not affect content that is already drawn.
resizeCtx?.interpolationQuality = .low
resizeCtx?.draw(imageRef!, in: CGRect(x: 0, y: 0, width: scaledWidth, height: scaledHeight))
let resizedImgRef = resizeCtx?.makeImage()
let imageRep = NSBitmapImageRep(cgImage: resizedImgRef!)
let finalData = imageRep.representation(using: .jpeg, properties: [NSBitmapImageRep.PropertyKey.compressionFactor : compression])
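The skew in the question's output is consistent with a wrong row stride: the original call passed bitsPerComponent: 32 and bytesPerRow: (8 * 750)/8, which is 750 bytes per row, whereas the answer above uses 8 bits per component and reads row_size from the stream. As a rough sanity check, the smallest possible stride can be computed as sketched below. This assumes tightly packed 4-channel pixels with no row padding, which real buffers often do not satisfy, so the row size reported by the source remains the better value. The helper name is hypothetical.

```swift
// Hypothetical helper: smallest bytes-per-row for a tightly packed bitmap.
func minimumBytesPerRow(width: Int, bitsPerComponent: Int, componentsPerPixel: Int) -> Int {
    let bitsPerPixel = bitsPerComponent * componentsPerPixel
    // Round up to a whole number of bytes.
    return (width * bitsPerPixel + 7) / 8
}

// 750 pixels wide, 8 bits per component, 4 components (e.g. BGRA).
let rowBytes = minimumBytesPerRow(width: 750, bitsPerComponent: 8, componentsPerPixel: 4)
print(rowBytes) // 3000
```

A stride of 750 bytes for a 750-pixel-wide 32-bit image makes every row start a quarter of the way into the data it should, which tends to show up as a diagonal shear.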
Try this: change imageName to whatever you want and save the file in any directory.
func convertToJPEG(image: UIImage) {
if let jpegImage = UIImageJPEGRepresentation(image, 1.0){
do {
//Convert
let tempDirectoryURL = URL(fileURLWithPath: NSTemporaryDirectory(), isDirectory: true)
let newFileName = imageName + ".JPG" // append(_:) returns Void, so it cannot be interpolated into a string
let targetURL = tempDirectoryURL.appendingPathComponent(newFileName)
try jpegImage.write(to: targetURL, options: [])
}catch {
print (error.localizedDescription)
print("FAILED")
}
}else{
print("FAILED")
}
}

Is there a faster way to create a CVPixelBuffer from a UIImage in Swift?

Task: Record a video in real time with a filter applied to it.
Problem: Getting a CVPixelBuffer from a modified UIImage is too slow.
My camera's output is being filtered and going right to a UIImageView so that the user can see the effect in real time, even when not recording video or taking a photo. I'd like some way to record this changing UIImage to video, so it doesn't need to be like the way I'm doing now. Currently, I'm doing it by appending CVPixelBuffer's to an assetWriter, but since I'm applying a filter to a UIImage, I translate the UIImage back to a buffer. I've tested with and without the UIImage -> buffer, so I've proven that's causing the unacceptable slow down.
Below is the code inside captureOutput, commented to be clear what is going on, and the method for getting the UIImage buffer:
// this function is called to output the device's camera output in realtime
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection){
if(captureOutput){
// create ciImage from buffer
let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
let cameraImage = CIImage(cvPixelBuffer: pixelBuffer!)
// set UIImage to ciImage
image = UIImage(ciImage: cameraImage)
if let ciImage = image?.ciImage {
// apply filter to CIImage
image = filterCIImage(with:ciImage)
// make CGImage and apply orientation
image = UIImage(cgImage: (image?.cgImage)!, scale: 1.0, orientation: UIImageOrientation.right)
// get format description, dimensions and current sample time
let formatDescription = CMSampleBufferGetFormatDescription(sampleBuffer)!
self.currentVideoDimensions = CMVideoFormatDescriptionGetDimensions(formatDescription)
self.currentSampleTime = CMSampleBufferGetOutputPresentationTimeStamp(sampleBuffer)
// check if user toggled video recording
// and asset writer is ready
if(videoIsRecording && self.assetWriterPixelBufferInput?.assetWriterInput.isReadyForMoreMediaData == true){
// get pixel buffer from UIImage - SLOW!
let filteredBuffer = buffer(from: image!)
// append the buffer to the asset writer
let success = self.assetWriterPixelBufferInput?.append(filteredBuffer!, withPresentationTime: self.currentSampleTime!)
if success == false {
print("Pixel Buffer failed")
}
}
}
DispatchQueue.main.async(){
// update UIImageView with filtered camera output
imageView!.image = image
}
}
}
// UIImage to buffer method:
func buffer(from image: UIImage) -> CVPixelBuffer? {
let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue, kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
var pixelBuffer : CVPixelBuffer?
let status = CVPixelBufferCreate(kCFAllocatorDefault, Int(image.size.width), Int(image.size.height), kCVPixelFormatType_32ARGB, attrs, &pixelBuffer)
guard (status == kCVReturnSuccess) else {
return nil
}
CVPixelBufferLockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))
let pixelData = CVPixelBufferGetBaseAddress(pixelBuffer!)
let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
let context = CGContext(data: pixelData, width: Int(image.size.width), height: Int(image.size.height), bitsPerComponent: 8, bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer!), space: rgbColorSpace, bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue)
context?.translateBy(x: 0, y: image.size.height)
context?.scaleBy(x: 1.0, y: -1.0)
UIGraphicsPushContext(context!)
image.draw(in: CGRect(x: 0, y: 0, width: image.size.width, height: image.size.height))
UIGraphicsPopContext()
CVPixelBufferUnlockBaseAddress(pixelBuffer!, CVPixelBufferLockFlags(rawValue: 0))
return pixelBuffer
}

Is there an easier way to setup a pixel buffer for CoreML? [duplicate]

I am trying to get Apple's sample Core ML Models that were demoed at the 2017 WWDC to function correctly. I am using the GoogLeNet to try and classify images (see the Apple Machine Learning Page). The model takes a CVPixelBuffer as an input. I have an image called imageSample.jpg that I'm using for this demo. My code is below:
var sample = UIImage(named: "imageSample")?.cgImage
let bufferThree = getCVPixelBuffer(sample!)
let model = GoogLeNetPlaces()
guard let output = try? model.prediction(input: GoogLeNetPlacesInput.init(sceneImage: bufferThree!)) else {
fatalError("Unexpected runtime error.")
}
print(output.sceneLabel)
I am always getting the unexpected runtime error in the output rather than an image classification. My code to convert the image is below:
func getCVPixelBuffer(_ image: CGImage) -> CVPixelBuffer? {
let imageWidth = Int(image.width)
let imageHeight = Int(image.height)
let attributes : [NSObject:AnyObject] = [
kCVPixelBufferCGImageCompatibilityKey : true as AnyObject,
kCVPixelBufferCGBitmapContextCompatibilityKey : true as AnyObject
]
var pxbuffer: CVPixelBuffer? = nil
CVPixelBufferCreate(kCFAllocatorDefault,
imageWidth,
imageHeight,
kCVPixelFormatType_32ARGB,
attributes as CFDictionary?,
&pxbuffer)
if let _pxbuffer = pxbuffer {
let flags = CVPixelBufferLockFlags(rawValue: 0)
CVPixelBufferLockBaseAddress(_pxbuffer, flags)
let pxdata = CVPixelBufferGetBaseAddress(_pxbuffer)
let rgbColorSpace = CGColorSpaceCreateDeviceRGB();
let context = CGContext(data: pxdata,
width: imageWidth,
height: imageHeight,
bitsPerComponent: 8,
bytesPerRow: CVPixelBufferGetBytesPerRow(_pxbuffer),
space: rgbColorSpace,
bitmapInfo: CGImageAlphaInfo.premultipliedFirst.rawValue)
if let _context = context {
_context.draw(image, in: CGRect.init(x: 0, y: 0, width: imageWidth, height: imageHeight))
}
else {
CVPixelBufferUnlockBaseAddress(_pxbuffer, flags);
return nil
}
CVPixelBufferUnlockBaseAddress(_pxbuffer, flags);
return _pxbuffer;
}
return nil
}
I got this code from a previous StackOverflow post (last answer here). I recognize that the code may not be correct, but I have no idea of how to do this myself. I believe that this is the section that contains the error. The model calls for the following type of input: Image<RGB,224,224>
You don't need to do a bunch of image mangling yourself to use a Core ML model with an image — the new Vision framework can do that for you.
import Vision
import CoreML
let model = try VNCoreMLModel(for: MyCoreMLGeneratedModelClass().model)
let request = VNCoreMLRequest(model: model, completionHandler: myResultsMethod)
let handler = VNImageRequestHandler(url: myImageURL)
try handler.perform([request]) // perform(_:) throws, so it must be called with try
func myResultsMethod(request: VNRequest, error: Error?) {
guard let results = request.results as? [VNClassificationObservation]
else { fatalError("huh") }
for classification in results {
print(classification.identifier, // the scene label
classification.confidence)
}
}
The WWDC17 session on Vision should have a bit more info — it's tomorrow afternoon.
You can also use pure Core ML, but then you have to resize the image to 224 x 224 yourself:
DispatchQueue.global(qos: .userInitiated).async {
// Resnet50 expects an image 224 x 224, so we should resize and crop the source image
let inputImageSize: CGFloat = 224.0
let minLen = min(image.size.width, image.size.height)
let resizedImage = image.resize(to: CGSize(width: inputImageSize * image.size.width / minLen, height: inputImageSize * image.size.height / minLen))
let cropedToSquareImage = resizedImage.cropToSquare()
guard let pixelBuffer = cropedToSquareImage?.pixelBuffer() else {
fatalError()
}
guard let classifierOutput = try? self.classifier.prediction(image: pixelBuffer) else {
fatalError()
}
DispatchQueue.main.async {
self.title = classifierOutput.classLabel
}
}
// ...
extension UIImage {
func resize(to newSize: CGSize) -> UIImage {
UIGraphicsBeginImageContextWithOptions(CGSize(width: newSize.width, height: newSize.height), true, 1.0)
self.draw(in: CGRect(x: 0, y: 0, width: newSize.width, height: newSize.height))
let resizedImage = UIGraphicsGetImageFromCurrentImageContext()!
UIGraphicsEndImageContext()
return resizedImage
}
func cropToSquare() -> UIImage? {
guard let cgImage = self.cgImage else {
return nil
}
var imageHeight = self.size.height
var imageWidth = self.size.width
if imageHeight > imageWidth {
imageHeight = imageWidth
}
else {
imageWidth = imageHeight
}
let size = CGSize(width: imageWidth, height: imageHeight)
let x = ((CGFloat(cgImage.width) - size.width) / 2).rounded()
let y = ((CGFloat(cgImage.height) - size.height) / 2).rounded()
let cropRect = CGRect(x: x, y: y, width: size.width, height: size.height)
if let croppedCgImage = cgImage.cropping(to: cropRect) {
return UIImage(cgImage: croppedCgImage, scale: 0, orientation: self.imageOrientation)
}
return nil
}
func pixelBuffer() -> CVPixelBuffer? {
let width = self.size.width
let height = self.size.height
let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
var pixelBuffer: CVPixelBuffer?
let status = CVPixelBufferCreate(kCFAllocatorDefault,
Int(width),
Int(height),
kCVPixelFormatType_32ARGB,
attrs,
&pixelBuffer)
guard let resultPixelBuffer = pixelBuffer, status == kCVReturnSuccess else {
return nil
}
CVPixelBufferLockBaseAddress(resultPixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
let pixelData = CVPixelBufferGetBaseAddress(resultPixelBuffer)
let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
guard let context = CGContext(data: pixelData,
width: Int(width),
height: Int(height),
bitsPerComponent: 8,
bytesPerRow: CVPixelBufferGetBytesPerRow(resultPixelBuffer),
space: rgbColorSpace,
bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue) else {
return nil
}
context.translateBy(x: 0, y: height)
context.scaleBy(x: 1.0, y: -1.0)
UIGraphicsPushContext(context)
self.draw(in: CGRect(x: 0, y: 0, width: width, height: height))
UIGraphicsPopContext()
CVPixelBufferUnlockBaseAddress(resultPixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
return resultPixelBuffer
}
}
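The resize step in the code above scales the image so that its shorter side becomes 224 while keeping the aspect ratio; cropToSquare then trims the longer side. The arithmetic on its own, as a small sketch with a hypothetical helper name and plain Double values instead of CGSize:

```swift
// Hypothetical helper: scale so that the shorter side equals `shortSide`,
// preserving the aspect ratio (the same formula as in the resize call above).
func aspectFillSize(width: Double, height: Double, shortSide: Double) -> (width: Double, height: Double) {
    let minLen = min(width, height)
    return (shortSide * width / minLen, shortSide * height / minLen)
}

let scaled = aspectFillSize(width: 2000, height: 1000, shortSide: 224)
print(scaled.width, scaled.height) // 448.0 224.0
```

Cropping the 448 x 224 result to its central square then gives the 224 x 224 input the model expects.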
You can find the expected image size for the inputs in the mlmodel file.
A demo project that uses both the pure Core ML and the Vision variants is available here: https://github.com/handsomecode/iOS11-Demos/tree/coreml_vision/CoreML/CoreMLDemo
If the input is a UIImage rather than a URL and you want to use VNImageRequestHandler, you can use a CIImage.
func updateClassifications(for image: UIImage) {
let orientation = CGImagePropertyOrientation(image.imageOrientation)
guard let ciImage = CIImage(image: image) else { return }
let handler = VNImageRequestHandler(ciImage: ciImage, orientation: orientation)
}
From Classifying Images with Vision and Core ML
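Note that `CGImagePropertyOrientation(image.imageOrientation)` is, as far as I know, not a built-in initializer; the snippet above only compiles if the project defines the conversion itself. A sketch of such an extension, mapping each UIKit orientation to the ImageIO case of the same name:

```swift
import ImageIO
import UIKit

// The raw values of UIImage.Orientation and CGImagePropertyOrientation differ,
// so the cases are mapped by name rather than by raw value.
extension CGImagePropertyOrientation {
    init(_ uiOrientation: UIImage.Orientation) {
        switch uiOrientation {
        case .up: self = .up
        case .upMirrored: self = .upMirrored
        case .down: self = .down
        case .downMirrored: self = .downMirrored
        case .left: self = .left
        case .leftMirrored: self = .leftMirrored
        case .right: self = .right
        case .rightMirrored: self = .rightMirrored
        @unknown default: self = .up // fallback chosen for this sketch
        }
    }
}
```

With this in place, the handler created in updateClassifications can be passed to `try handler.perform([request])` in the same way as in the URL-based example.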