Question regarding UIImage -> CVPixelBuffer -> UIImage conversion - swift

I am working on a simple denoising POC in SwiftUI where I want to:
Load an input image
Apply a CoreML model (denoising) to the input image
Display the output image
I have something working based on dozens of code samples I found online. From what I've read, a CoreML model (at least the one I'm using) accepts a CVPixelBuffer and also outputs a CVPixelBuffer. So my idea was to do the following:
Convert the input UIImage into a CVPixelBuffer
Apply the CoreML model to the CVPixelBuffer
Convert the newly created CVPixelBuffer into a UIImage
(Note that I've read that using the Vision framework, one can input a CGImage directly into the model. I'll try this approach as soon as I'm familiar with what I'm trying to achieve here as I think it is a good exercise.)
As a start, I wanted to skip step (2) to focus on the conversion problem. What I tried to achieve in the code below is:
Convert the input UIImage into a CVPixelBuffer
Convert the CVPixelBuffer into a UIImage
I'm not a Swift or an Objective-C developer, so I'm pretty sure that I've made at least a few mistakes. I found this code quite complex and I was wondering if there was a better / simpler way to do the same thing?
func convert(input: UIImage) -> UIImage? {
    // Input CGImage
    guard let cgInput = input.cgImage else {
        return nil
    }
    // Image size
    let width = cgInput.width
    let height = cgInput.height
    let region = CGRect(x: 0, y: 0, width: width, height: height)
    // Attributes needed to create the CVPixelBuffer
    let attributes = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                      kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue]
    // Create the input CVPixelBuffer
    var pbInput: CVPixelBuffer? = nil
    let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                     width,
                                     height,
                                     kCVPixelFormatType_32ARGB,
                                     attributes as CFDictionary,
                                     &pbInput)
    // Sanity check
    if status != kCVReturnSuccess {
        return nil
    }
    // Fill the input CVPixelBuffer with the content of the input CGImage
    CVPixelBufferLockBaseAddress(pbInput!, CVPixelBufferLockFlags(rawValue: 0))
    guard let context = CGContext(data: CVPixelBufferGetBaseAddress(pbInput!),
                                  width: width,
                                  height: height,
                                  bitsPerComponent: cgInput.bitsPerComponent,
                                  // Use the buffer's own row stride, which may
                                  // differ from the CGImage's bytesPerRow
                                  bytesPerRow: CVPixelBufferGetBytesPerRow(pbInput!),
                                  space: cgInput.colorSpace!,
                                  bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue) else {
        // Don't leave the buffer locked on the failure path
        CVPixelBufferUnlockBaseAddress(pbInput!, CVPixelBufferLockFlags(rawValue: 0))
        return nil
    }
    context.draw(cgInput, in: region)
    CVPixelBufferUnlockBaseAddress(pbInput!, CVPixelBufferLockFlags(rawValue: 0))
    // Create the output CGImage
    let ciOutput = CIImage(cvPixelBuffer: pbInput!)
    let temporaryContext = CIContext(options: nil)
    guard let cgOutput = temporaryContext.createCGImage(ciOutput, from: region) else {
        return nil
    }
    // Create and return the output UIImage
    return UIImage(cgImage: cgOutput)
}
When I used this code in my SwiftUI project, the input and output images looked the same, but they were not identical. I think the input image had a color profile (ColorSync profile) associated with it that was lost during the conversion. I assumed I was supposed to use cgInput.colorSpace during the CGContext creation, but it seemed that using CGColorSpace(name: CGColorSpace.sRGB)! worked better. Can somebody please explain that to me?
Thanks for your help.

You can also use CGImage objects with Core ML, but you have to create the MLFeatureValue object by hand and then put it into an MLFeatureProvider to give it to the model. But that only takes care of the model input, not the output.
Another option is to use the code from my CoreMLHelpers repo.
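A minimal sketch of that manual route, assuming a model whose image input is named "image" and expects 256x256 (both hypothetical; check your model's input description). The MLFeatureValue image initializer requires iOS 13 / macOS 10.15:

```swift
import CoreML
import CoreGraphics

// Hypothetical input name and dimensions; adjust to your model.
func denoise(model: MLModel, cgImage: CGImage) throws -> MLFeatureProvider {
    // Core ML scales/converts the CGImage into the pixel buffer for you
    let value = try MLFeatureValue(cgImage: cgImage,
                                   pixelsWide: 256,
                                   pixelsHigh: 256,
                                   pixelFormatType: kCVPixelFormatType_32ARGB,
                                   options: nil)
    let provider = try MLDictionaryFeatureProvider(dictionary: ["image": value])
    // The output is still an MLFeatureProvider; the image feature inside it
    // will be a CVPixelBuffer you then need to convert back yourself.
    return try model.prediction(from: provider)
}
```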

Related

Getting error on image cropping / Converting UIImage to CGImage / Thread 1: EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)

I'm a little new to programming and I'm trying to capture a screenshot then crop it to a specific region.
I was able to come up with the code below, but it's giving me the Thread 1: EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0) error, which usually doesn't tell you much. I tried to isolate snippets of the code to see where the error is coming from, and it looks like it's from the CIImage to CGImage conversion.
I have already tried going from UIImage to CGImage directly, but I get the same error.
This is the screenshot capture code that calls the crop function.
UIGraphicsBeginImageContextWithOptions(view.frame.size, false, 0.0)
view.layer.render(in: UIGraphicsGetCurrentContext()!)
let itemToShare = UIGraphicsGetImageFromCurrentImageContext()!
UIGraphicsEndImageContext()
cropImage(itemToShare)
This is the crop function. The .cropping(to: is commented out because I wanted to make sure the error was not coming from the crop.
func cropImage(_ screenshot: UIImage) -> CGImage {
    let ciImage = CIImage(image: screenshot)
    let crop = CGRect(x: 0,
                      y: 0,
                      width: 50,
                      height: 50)
    let cgImage = (ciImage as! CGImage) //.cropping(to: crop)!
    return cgImage
}
I appreciate all the help as I have been researching this for a few days and all the answers on stack or on Apple dev forums lead the same way.
Daniel
Ok, so after further research I found the error was in trying to create a CGImage without a context.
So, a couple of lines solved it.
First, the UIImage gets converted to a CIImage with
let ciImage = CIImage(image: screenshot)
then the CIImage gets converted to a CGImage with a context
let context = CIContext(options: nil)
let cgImage = context.createCGImage(ciImage!, from: ciImage!.extent)
now, cgImage can be cropped with
let crop = CGRect(x: 0,
                  y: 0,
                  width: 200,
                  height: 200)
let cropedImage = cgImage!.cropping(to: crop)
and finally, the resulting cropped CGImage can be turned into a UIImage and returned by the function
return UIImage(cgImage: cropedImage!)
Apparently you can't always go from a UIImage straight to a CGImage; the cgImage property can be nil (for example, when the UIImage is backed by a CIImage).
Pay attention to sizes and scale, as UIImage works in points while CGImage works in pixels, so you need to do some math to get exact cropping positions for the region you want.
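A sketch of that point-to-pixel math (assuming the crop rect is specified in points, as it would be when taken from screen coordinates):

```swift
import UIKit

// UIImage sizes are in points; CGImage/CIImage work in pixels.
// Multiply by the image's scale factor to get the pixel-space crop rect.
func pixelRect(for pointRect: CGRect, in image: UIImage) -> CGRect {
    let s = image.scale // e.g. 2.0 or 3.0 on a Retina screenshot
    return CGRect(x: pointRect.origin.x * s,
                  y: pointRect.origin.y * s,
                  width: pointRect.size.width * s,
                  height: pointRect.size.height * s)
}
```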

Cocoa CGImage save corrupts image, but NSImageView display is okay

I'm writing a scanner app using ImageCaptureCore, and I'm able to construct the returned image into a CGImage and (via NSImage) display that in an NSImageView. The image is 8-bit grayscale. A screenshot looks like this:
But when I save the CGImage as PNG and open that in Preview, it looks like this:
The code I use to save looks like this:
let url = URL(fileURLWithPath: "/path/tp/image.png")
if let img = self.imageView.image?.cgImage(forProposedRect: nil, context: nil, hints: nil),
   let dest = CGImageDestinationCreateWithURL(url as CFURL, kUTTypePNG, 1, nil)
{
    CGImageDestinationAddImage(dest, img, nil)
    CGImageDestinationFinalize(dest)
}
I also tried saving using NSBitmapImageRep, as both PNG and TIFF, and I get the same results. The code to create the CGImage looks like:
guard
    let data = inBandData.dataBuffer,
    let provider = CGDataProvider(data: data as CFData)
else
{
    debugLog("No scan data returned or unable to create data provider")
    return
}
let colorSpace = CGColorSpaceCreateDeviceGray()
let bmInfo: CGBitmapInfo = [.byteOrder32Little]
if let image = CGImage(width: inBandData.fullImageWidth,
                       height: inBandData.fullImageHeight,
                       bitsPerComponent: inBandData.bitsPerComponent, // 8
                       bitsPerPixel: inBandData.bitsPerPixel, // 8
                       bytesPerRow: inBandData.fullImageWidth, // not inBandData.bytesPerRow, because e.g. bytesPerRow is 830 while image width is 827
                       space: colorSpace,
                       bitmapInfo: bmInfo,
                       provider: provider,
                       decode: nil,
                       shouldInterpolate: true,
                       intent: .defaultIntent)
{
    let size = NSSize(width: inBandData.fullImageWidth, height: inBandData.fullImageHeight)
    self.imageView.image = NSImage(cgImage: image, size: size)
    .
    .
    .
}
It looks like maybe endianness is wrong, and someone’s reordering 32-bit words, but NSImageView displays it correctly. In fact, changing the CGBitmapInfo to [.byteOrder32Big] fixes this problem, but I still don’t get why NSImageView displays things correctly, or why big works when the architecture is little (even though PNG and TIFF may both work in big-endian 32-bit words).
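One hedged explanation (untested against this scanner data): the byteOrder32* flags describe 32-bit pixel formats, so declaring them on an 8-bit-per-pixel grayscale buffer invites the PNG/TIFF encoder to swap bytes within 32-bit words, while the NSImageView display path may simply ignore the flag. For single-byte pixels no byte-order flag should be needed at all:

```swift
import CoreGraphics

// For 8-bit grayscale there is one byte per pixel, so byte order within
// a pixel is meaningless; leave the byte-order bits out of bitmapInfo.
let colorSpace = CGColorSpaceCreateDeviceGray()
let bmInfo = CGBitmapInfo(rawValue: CGImageAlphaInfo.none.rawValue)
```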

create transparent texture in swift

I just need to create a transparent texture (pixels with alpha 0).
func layerTexture() -> MTLTexture {
    let width = Int(self.drawableSize.width)
    let height = Int(self.drawableSize.height)
    let texDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .bgra8Unorm, width: width, height: height, mipmapped: false)
    let temparyTexture = self.device?.makeTexture(descriptor: texDescriptor)
    return temparyTexture!
}
When I open temparyTexture using the preview, it appears to be black. What is missing here?
UPDATE
I just tried to create the texture using a transparent image.
Code:
func layerTexture(imageData: Data) -> MTLTexture {
    let width = Int(self.drawableSize.width)
    let height = Int(self.drawableSize.height)
    let bytesPerRow = width * 4
    let texDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba8Unorm, width: width, height: height, mipmapped: false)
    let temparyTexture = self.device?.makeTexture(descriptor: texDescriptor)
    let region = MTLRegionMake2D(0, 0, width, height)
    imageData.withUnsafeBytes { (u8Ptr: UnsafePointer<UInt8>) in
        let rawPtr = UnsafeRawPointer(u8Ptr)
        temparyTexture?.replace(region: region, mipmapLevel: 0, withBytes: rawPtr, bytesPerRow: bytesPerRow)
    }
    return temparyTexture!
}
The method gets called as follows:
let image = UIImage(named: "layer1.png")!
let imageData = UIImagePNGRepresentation(image)
self.layerTexture(imageData: imageData!)
where layer1.png is a transparent PNG. But even so, it crashes with the message "Thread 1: EXC_BAD_ACCESS (code=1, address=0x107e8c000)" at the point where I try to replace the texture. I believe it's because the image data is compressed and the raw pointer should point to uncompressed data. How can I resolve this?
Am I in correct path or completely in wrong direction? Is there any other alternatives. What I just need is to create transparent texture.
Pre-edit: When you quick-look a transparent texture, it will appear black. I just double-checked with some code I have running stably in production - that is the expected result.
Post-edit: You are correct, you should not be copying PNG or JPEG data to a MTLTexture's contents directly. I would recommend doing something like this:
var pixelBuffer: CVPixelBuffer?
let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
             kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue,
             kCVPixelBufferMetalCompatibilityKey: kCFBooleanTrue]
var status = CVPixelBufferCreate(nil, Int(image.size.width), Int(image.size.height),
                                 kCVPixelFormatType_32BGRA, attrs as CFDictionary,
                                 &pixelBuffer)
assert(status == noErr)
let coreImage = CIImage(image: image)!
let context = CIContext(mtlDevice: MTLCreateSystemDefaultDevice()!)
context.render(coreImage, to: pixelBuffer!)
var textureWrapper: CVMetalTexture?
status = CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
                                                   GPUManager.shared.textureCache, pixelBuffer!, nil, .bgra8Unorm,
                                                   CVPixelBufferGetWidth(pixelBuffer!), CVPixelBufferGetHeight(pixelBuffer!), 0, &textureWrapper)
let texture = CVMetalTextureGetTexture(textureWrapper!)!
// Use texture now for your Metal texture. The texture is now map-bound to the CVPixelBuffer's underlying memory.
The issue you are running into is that it is actually pretty hard to fully grasp how bitmaps work and how they can be laid out differently. Graphics is a very closed field with lots of esoteric terminology, some of which refers to things that take years to grasp, some of which refers to things that are trivial but people just picked a weird word to call them by. My main pointers are:
Get out of UIImage land as early in your code as possible. The best way to avoid overhead and delays when you go into Metal land is to get your images into a GPU-compatible representation as soon as you can.
Once you are outside of UIImage land, always know your channel order (RGBA, BGRA). At any point in code that you are editing, you should have a mental model of what pixel format each CVPixelBuffer / MTLTexture has.
Read up on premultiplied vs non-premultiplied alpha, you may not run into issues with this, but it threw me off repeatedly when I was first learning.
total byte size of a bitmap/pixelbuffer = bytesPerRow * height
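That last rule can be wrapped as a small helper (a sketch; CVPixelBufferGetBytesPerRow already accounts for any row padding):

```swift
import CoreVideo

// Total byte size of a pixel buffer = bytesPerRow * height.
// bytesPerRow may exceed width * bytesPerPixel because of row padding,
// so never compute the size from the width alone.
func totalByteSize(of pixelBuffer: CVPixelBuffer) -> Int {
    return CVPixelBufferGetBytesPerRow(pixelBuffer) * CVPixelBufferGetHeight(pixelBuffer)
}
```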

Why filtering a cropped image is 4x slower than filtering resized image (both have the same dimensions)

I've been trying to wrap my head around this problem with no luck. I have a very simple Swift command-line application which takes one argument - image path to load. It crops the image and filters that image fragment with SepiaTone filter.
It works just fine. It crops the image to 200x200 and filters it with SepiaTone. Now here's the problem that I'm facing - the whole process takes 600ms on my MacBook Air. Now when I RESIZE (instead of cropping) input image to the same dimensions (200x200) it takes 150ms.
Why is that? In both cases I'm filtering an image which is 200x200 in size. I'm using this particular image for testing (5966x3978).
UPDATE:
It's this particular line of code that takes 4x longer when dealing with cropped image:
var ciImage:CIImage = CIImage(cgImage: cgImage)
END OF UPDATE
Code for cropping (200x200):
// Parse args and get the image path
let args = CommandLine.arguments
let inputFile: String = args[Int(CommandLine.argc) - 1]
let inputURL: URL = URL(fileURLWithPath: inputFile)
// Load the image from the path into an NSImage
// and convert the NSImage into a CGImage
guard
    let nsImage = NSImage(contentsOf: inputURL),
    var cgImage = nsImage.cgImage(forProposedRect: nil, context: nil, hints: nil)
else {
    exit(EXIT_FAILURE)
}
// CROP THE IMAGE TO 200x200
// THIS IS THE ONLY BLOCK OF CODE THAT IS DIFFERENT
// IN THOSE TWO EXAMPLES
let rect = CGRect(x: 0, y: 0, width: 200, height: 200)
if let croppedImage = cgImage.cropping(to: rect) {
    cgImage = croppedImage
} else {
    exit(EXIT_FAILURE)
}
// END CROPPING
// Convert the CGImage to a CIImage
var ciImage:CIImage = CIImage(cgImage: cgImage)
// Set up SepiaTone
guard let sepiaFilter = CIFilter(name: "CISepiaTone") else {
    exit(EXIT_FAILURE)
}
sepiaFilter.setValue(ciImage, forKey: kCIInputImageKey)
sepiaFilter.setValue(0.5, forKey: kCIInputIntensityKey)
guard let result = sepiaFilter.outputImage else {
    exit(EXIT_FAILURE)
}
let context: CIContext = CIContext()
// Perform the filtering in a GPU context
guard let output = context.createCGImage(result, from: ciImage.extent) else {
    exit(EXIT_FAILURE)
}
Code for resizing (200x200):
// Parse args and get the image path
let args = CommandLine.arguments
let inputFile: String = args[Int(CommandLine.argc) - 1]
let inputURL: URL = URL(fileURLWithPath: inputFile)
// Load the image from the path into an NSImage
// and convert the NSImage into a CGImage
guard
    let nsImage = NSImage(contentsOf: inputURL),
    var cgImage = nsImage.cgImage(forProposedRect: nil, context: nil, hints: nil)
else {
    exit(EXIT_FAILURE)
}
// RESIZE THE IMAGE TO 200x200
// THIS IS THE ONLY BLOCK OF CODE THAT IS DIFFERENT
// IN THOSE TWO EXAMPLES
guard let CGcontext = CGContext(data: nil,
                                width: 200,
                                height: 200,
                                bitsPerComponent: cgImage.bitsPerComponent,
                                bytesPerRow: cgImage.bytesPerRow,
                                space: cgImage.colorSpace ?? CGColorSpaceCreateDeviceRGB(),
                                bitmapInfo: cgImage.bitmapInfo.rawValue)
else {
    exit(EXIT_FAILURE)
}
CGcontext.draw(cgImage, in: CGRect(x: 0, y: 0, width: 200, height: 200))
if let resizeOutput = CGcontext.makeImage() {
    cgImage = resizeOutput
}
// END RESIZING
// Convert the CGImage to a CIImage
var ciImage:CIImage = CIImage(cgImage: cgImage)
// Set up SepiaTone
guard let sepiaFilter = CIFilter(name: "CISepiaTone") else {
    exit(EXIT_FAILURE)
}
sepiaFilter.setValue(ciImage, forKey: kCIInputImageKey)
sepiaFilter.setValue(0.5, forKey: kCIInputIntensityKey)
guard let result = sepiaFilter.outputImage else {
    exit(EXIT_FAILURE)
}
let context: CIContext = CIContext()
// Perform the filtering in a GPU context
guard let output = context.createCGImage(result, from: ciImage.extent) else {
    exit(EXIT_FAILURE)
}
It's very likely that the cgImage lives in video memory, and when you scale the image it actually uses the hardware to write the image to a new area of memory. When you crop the cgImage, the documentation implies that it is just referencing the original image. The line
var ciImage:CIImage = CIImage(cgImage: cgImage)
must be triggering a read (maybe to main memory?), and in the case of your scaled image it can probably just read the whole buffer contiguously. In the case of the cropped image it may be reading it line by line, and this could account for the difference, but that's just my guess.
It looks like you are doing two very different things. In the "slow" version you are cropping (as in taking a small CGRect of the original image) and in the "fast" version you are resizing (as in reducing the original down to a CGRect).
You can prove this by adding two UIImageViews and adding these lines after each declaration of ciImage:
slowImage.image = UIImage(ciImage: ciImage)
fastImage.image = UIImage(ciImage: ciImage)
Here are two simulator screenshots, with the "slow" image above the "fast" image. The first is with your code where the "slow" CGRect origin is (0,0) and the second is with it adjusted to (2000,2000):
Origin is (0,0)
Origin is (2000,2000)
Knowing this, I can come up with a few things happening on the timing.
I'm including a link to Apple's documentation on the cropping function. It explains that it is doing some CGRect calculations behind the scenes but it doesn't explain how it pulls the pixel bits out of the full-sized CG image - I think that's where the real slow down is.
In the end though, it looks like the timing is due to doing two entirely different things.
CGImage.cropping(to:)
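If the CIImage(cgImage:) read really is the bottleneck, one hedged alternative (not timed here) is to skip CGImage.cropping entirely and crop lazily in Core Image:

```swift
import CoreImage

// Wrap the full-size CGImage once, then crop in Core Image.
// CIImage operations are lazy, so only the cropped region should be
// rendered when createCGImage(_:from:) eventually runs.
let fullImage = CIImage(cgImage: cgImage)
let croppedImage = fullImage.cropped(to: CGRect(x: 0, y: 0, width: 200, height: 200))
sepiaFilter.setValue(croppedImage, forKey: kCIInputImageKey)
```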

UIImagePNGRepresentation(UIImage()) returns nil

Why does UIImagePNGRepresentation(UIImage()) return nil?
I'm trying to create a UIImage() in my test code just to assert that it was correctly passed around.
My comparison method for two UIImage's uses the UIImagePNGRepresentation(), but for some reason, it is returning nil.
Thank you.
UIImagePNGRepresentation() will return nil if the UIImage provided does not contain any data. From the UIKit Documentation:
Return Value
A data object containing the PNG data, or nil if there was a problem generating the data. This function may return nil if the image has no data or if the underlying CGImageRef contains data in an unsupported bitmap format.
When you initialize a UIImage by simply using UIImage(), it creates a UIImage with no data. Although the image isn't nil, it still has no data. And, because the image has no data, UIImagePNGRepresentation() just returns nil.
To fix this, you would have to use UIImage with data. For example:
let imageName: String = "MyImageName.png"
let image = UIImage(named: imageName)!
let rep = UIImagePNGRepresentation(image)
Where imageName is the name of your image, included in your application.
In order to use UIImagePNGRepresentation(image), image must not be nil, and it must also have data.
If you want to check if they have any data, you could use:
if image == nil || image == UIImage() {
    // image is nil, or has no data
} else {
    // image has data
}
The UIImage documentation says
Image objects are immutable, so you cannot change their properties after creation. This means that you generally specify an image’s properties at initialization time or rely on the image’s metadata to provide the property value.
Since you've created the UIImage without providing any image data, the object you've created has no meaning as an image. UIKit and Core Graphics don't appear to allow 0x0 images.
The simplest fix is to create a 1x1 image instead:
UIGraphicsBeginImageContext(CGSize(width: 1, height: 1))
let image = UIGraphicsGetImageFromCurrentImageContext()
UIGraphicsEndImageContext()
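On iOS 10 and later, the same 1x1 image can be made with UIGraphicsImageRenderer, which avoids the begin/end context pair:

```swift
import UIKit

// An empty draw closure yields a blank 1x1 image that still has
// backing bitmap data, so a PNG representation can be generated.
let renderer = UIGraphicsImageRenderer(size: CGSize(width: 1, height: 1))
let image = renderer.image { _ in }
```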
I faced the same issue: I was converting a UIImage to PNG data, but sometimes it returned nil. I fixed it by redrawing the image into a new context when pngData() fails:
func getImagePngData(img: UIImage) -> Data? {
    if let hasData = img.pngData() {
        return hasData
    }
    // Redraw the image into a bitmap context and try again
    UIGraphicsBeginImageContext(img.size)
    img.draw(in: CGRect(x: 0.0, y: 0.0, width: img.size.width,
                        height: img.size.height))
    let resultImage = UIGraphicsGetImageFromCurrentImageContext()
    UIGraphicsEndImageContext()
    return resultImage?.pngData()
}