Getting pixel format from CGImage - swift

I understand bitmap layout and pixel formats pretty well, but I've run into an issue when working with PNG / JPEG images loaded through NSImage: I can't figure out whether what I get is the intended behaviour or a bug.
let nsImage:NSImage = NSImage(byReferencingURL: …)
let cgImage:CGImage = nsImage.CGImageForProposedRect(nil, context: nil, hints: nil)!
let bitmapInfo:CGBitmapInfo = CGImageGetBitmapInfo(cgImage)
Swift.print(bitmapInfo.contains(CGBitmapInfo.ByteOrderDefault)) // True
My kCGBitmapByteOrder32Host is little endian, which implies that the pixel format is also little endian – BGRA in this case. But… png format is big endian by specification, and that's how the bytes are actually arranged in the data – opposite from what bitmap info tells me.
Does anybody know what's going on? Surely the system somehow knows how to deal with this, since PNGs are displayed correctly. Is there a bullet-proof way of detecting the pixel format of a CGImage? A complete demo project is available at GitHub.
P. S. I'm copying raw pixel data via the CFDataGetBytePtr buffer into another library's buffer, which then gets processed and saved. In order to do so, I need to explicitly specify the pixel format. The actual images I'm dealing with (any png / jpeg files that I've checked) display correctly, for example:
But the bitmap info of the same images gives me incorrect endianness information, so the bitmap is handled as BGRA instead of the actual RGBA. When I process it, the result looks like this:
The resulting image shows red and blue swapped. If the RGBA pixel format is specified explicitly, everything works out perfectly, but I need this detection to be automated.
P. P. S. The documentation briefly mentions that CGColorSpace is another important variable that defines pixel format / byte order, but I found no mention of how to get that information out of it.

Some years later, and after testing my findings in production, I can share them with good confidence, though I'm still hoping someone with the theoretical background can explain things better. Good places to refresh your memory:
Wikipedia: RGBA color space – Representation
Apple Lists: Byte Order in CGBitmapContextCreate
Apple Lists: kCGImageAlphaPremultiplied First/Last
Based on that, you can use the following extensions:
public enum PixelFormat
{
case abgr
case argb
case bgra
case rgba
}
extension CGBitmapInfo
{
public static var byteOrder16Host: CGBitmapInfo {
return CFByteOrderGetCurrent() == Int(CFByteOrderLittleEndian.rawValue) ? .byteOrder16Little : .byteOrder16Big
}
public static var byteOrder32Host: CGBitmapInfo {
return CFByteOrderGetCurrent() == Int(CFByteOrderLittleEndian.rawValue) ? .byteOrder32Little : .byteOrder32Big
}
}
extension CGBitmapInfo
{
public var pixelFormat: PixelFormat? {
// AlphaFirst – the alpha channel is next to the red channel, argb and bgra are both alpha first formats.
// AlphaLast – the alpha channel is next to the blue channel, rgba and abgr are both alpha last formats.
// LittleEndian – blue comes before red, bgra and abgr are little endian formats.
// Little endian ordered pixels are BGR (BGRX, XBGR, BGRA, ABGR, BGR).
// BigEndian – red comes before blue, argb and rgba are big endian formats.
// Big endian ordered pixels are RGB (XRGB, RGBX, ARGB, RGBA, RGB).
let alphaInfo: CGImageAlphaInfo? = CGImageAlphaInfo(rawValue: self.rawValue & type(of: self).alphaInfoMask.rawValue)
let alphaFirst: Bool = alphaInfo == .premultipliedFirst || alphaInfo == .first || alphaInfo == .noneSkipFirst
let alphaLast: Bool = alphaInfo == .premultipliedLast || alphaInfo == .last || alphaInfo == .noneSkipLast
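// Compare the masked byte order: a plain contains(.byteOrder32Little) also matches .byteOrder16Big, which shares that bit.
let endianLittle: Bool = self.intersection(.byteOrderMask) == .byteOrder32Little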
// This is slippery… while byte order host returns little endian, default bytes are stored in big endian
// format. Here we just assume if no byte order is given, then simple RGB is used, aka big endian, though…
if alphaFirst && endianLittle {
return .bgra
} else if alphaFirst {
return .argb
} else if alphaLast && endianLittle {
return .abgr
} else if alphaLast {
return .rgba
} else {
return nil
}
}
}
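The "little endian means BGR" rule in the comments above can be checked without Core Graphics. The following is a minimal sketch, assuming a pixel is packed into one 32-bit word with alpha in the most significant byte; the variable names are made up for illustration.

```swift
// Pack 8-bit channels into one 32-bit word, alpha first (ARGB).
let argb: UInt32 = 0xFF_01_02_03 // A = 255, R = 1, G = 2, B = 3

// The bytes as they would sit in memory for each byte order.
let bigEndianBytes = withUnsafeBytes(of: argb.bigEndian) { Array($0) }
let littleEndianBytes = withUnsafeBytes(of: argb.littleEndian) { Array($0) }

print(bigEndianBytes)    // [255, 1, 2, 3]
print(littleEndianBytes) // [3, 2, 1, 255]
```

Read byte by byte, the same alpha-first word is ARGB in big endian and BGRA in little endian, which is the pairing the pixelFormat property returns.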
Note that you should always pay attention to the colour space, since it directly affects how raw pixel data is stored. CGColorSpace(name: CGColorSpace.sRGB) is probably the safest one: it stores colours in plain format. For example, pure red will be stored as (255, 0, 0), while a device colour space will give you something like (235, 73, 53).
To see this in practice, drop the above and the following into a playground. You'll need two one-pixel red images, one with alpha and one without; this and this should work.
import AppKit
import CoreGraphics
extension CFData
{
public var pixelComponents: [UInt8] {
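// Copy at most four bytes (one pixel); the data may be longer because of row padding.
let count: Int = min(4, CFDataGetLength(self))
let buffer: UnsafeMutablePointer<UInt8> = UnsafeMutablePointer.allocate(capacity: 4)
buffer.initialize(repeating: 0, count: 4)
defer { buffer.deallocate() }
CFDataGetBytes(self, CFRange(location: 0, length: count), buffer)
return Array(UnsafeBufferPointer(start: buffer, count: 4))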
}
}
let color: NSColor = .red
Thread.sleep(forTimeInterval: 2)
// Must flip coordinates to capture what we want…
let screen: NSScreen = NSScreen.screens.first(where: { $0.frame.contains(NSEvent.mouseLocation) })!
let rect: CGRect = CGRect(origin: CGPoint(x: NSEvent.mouseLocation.x - 10, y: screen.frame.height - NSEvent.mouseLocation.y), size: CGSize(width: 1, height: 1))
Swift.print("Will capture image with \(rect) frame.")
let screenImage: CGImage = CGWindowListCreateImage(rect, [], kCGNullWindowID, [])!
let urlImageWithAlpha: CGImage = NSImage(byReferencing: URL(fileURLWithPath: "/Users/ianbytchek/Downloads/red-pixel-with-alpha.png")).cgImage(forProposedRect: nil, context: nil, hints: nil)!
let urlImageNoAlpha: CGImage = NSImage(byReferencing: URL(fileURLWithPath: "/Users/ianbytchek/Downloads/red-pixel-no-alpha.png")).cgImage(forProposedRect: nil, context: nil, hints: nil)!
Swift.print(screenImage.colorSpace!, screenImage.bitmapInfo, screenImage.bitmapInfo.pixelFormat!, screenImage.dataProvider!.data!.pixelComponents)
Swift.print(urlImageWithAlpha.colorSpace!, urlImageWithAlpha.bitmapInfo, urlImageWithAlpha.bitmapInfo.pixelFormat!, urlImageWithAlpha.dataProvider!.data!.pixelComponents)
Swift.print(urlImageNoAlpha.colorSpace!, urlImageNoAlpha.bitmapInfo, urlImageNoAlpha.bitmapInfo.pixelFormat!, urlImageNoAlpha.dataProvider!.data!.pixelComponents)
let formats: [CGBitmapInfo.RawValue] = [
CGImageAlphaInfo.premultipliedFirst.rawValue,
CGImageAlphaInfo.noneSkipFirst.rawValue,
CGImageAlphaInfo.premultipliedLast.rawValue,
CGImageAlphaInfo.noneSkipLast.rawValue,
]
for format in formats {
// This "paints" and prints out components in the order they are stored in data.
let context: CGContext = CGContext(data: nil, width: 1, height: 1, bitsPerComponent: 8, bytesPerRow: 32, space: CGColorSpace(name: CGColorSpace.sRGB)!, bitmapInfo: format)!
let components: UnsafeBufferPointer<UInt8> = UnsafeBufferPointer(start: context.data!.assumingMemoryBound(to: UInt8.self), count: 4)
context.setFillColor(red: 1 / 0xFF, green: 2 / 0xFF, blue: 3 / 0xFF, alpha: 1)
context.fill(CGRect(x: 0, y: 0, width: 1, height: 1))
Swift.print(context.colorSpace!, context.bitmapInfo, context.bitmapInfo.pixelFormat!, Array(components))
}
This will output the following. Notice how the screen-captured image differs from the ones loaded from disk.
Will capture image with (285.7734375, 294.5, 1.0, 1.0) frame.
<CGColorSpace 0x7fde4e9103e0> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; iMac) CGBitmapInfo(rawValue: 8194) bgra [27, 13, 252, 255]
<CGColorSpace 0x7fde4d703b20> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; Color LCD) CGBitmapInfo(rawValue: 3) rgba [235, 73, 53, 255]
<CGColorSpace 0x7fde4e915dc0> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; Color LCD) CGBitmapInfo(rawValue: 5) rgba [235, 73, 53, 255]
<CGColorSpace 0x7fde4d60d390> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; sRGB IEC61966-2.1) CGBitmapInfo(rawValue: 2) argb [255, 1, 2, 3]
<CGColorSpace 0x7fde4d60d390> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; sRGB IEC61966-2.1) CGBitmapInfo(rawValue: 6) argb [255, 1, 2, 3]
<CGColorSpace 0x7fde4d60d390> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; sRGB IEC61966-2.1) CGBitmapInfo(rawValue: 1) rgba [1, 2, 3, 255]
<CGColorSpace 0x7fde4d60d390> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; sRGB IEC61966-2.1) CGBitmapInfo(rawValue: 5) rgba [1, 2, 3, 255]
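The raw values printed above can also be decoded by hand. This is a rough sketch that mirrors the pixelFormat logic with plain integers, assuming the usual Core Graphics constants (alpha info in the low five bits, byte order in bits 12 to 14, 32-bit little endian equal to 2 << 12). DecodedFormat and decode are hypothetical names, not part of any framework.

```swift
enum DecodedFormat: String { case abgr, argb, bgra, rgba }

// Assumed Core Graphics constants, written out as plain integers.
let alphaInfoMask: UInt32 = 0x1F
let byteOrderMask: UInt32 = 0x7000
let byteOrder32Little: UInt32 = 2 << 12

func decode(_ rawValue: UInt32) -> DecodedFormat? {
    let alpha = rawValue & alphaInfoMask
    let little = (rawValue & byteOrderMask) == byteOrder32Little
    switch alpha {
    case 2, 4, 6: return little ? .bgra : .argb // premultipliedFirst, first, noneSkipFirst
    case 1, 3, 5: return little ? .abgr : .rgba // premultipliedLast, last, noneSkipLast
    default: return nil                         // none or alphaOnly
    }
}

// The bitmap info raw values from the output above.
let samples: [UInt32] = [8194, 3, 5, 2, 6, 1]
for raw in samples {
    print(raw, decode(raw)?.rawValue ?? "nil")
}
```

For 8194 this gives bgra (8192 for 32-bit little endian plus 2 for premultipliedFirst); the remaining values carry no byte order flag and fall through to argb or rgba, matching the printed formats.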

Could you use NSBitmapFormat?
I wrote a class to source color schemes from images, and that's what I used to determine the bitmap format. Here's a snippet of how I used it:
var averageColorImage: CIImage?
var averageColorImageBitmap: NSBitmapImageRep
//... core image filter code
averageColorImage = filter?.outputImage
averageColorImageBitmap = NSBitmapImageRep(CIImage: averageColorImage!)
let red, green, blue: Int
switch averageColorImageBitmap.bitmapFormat {
case NSBitmapFormat.NSAlphaFirstBitmapFormat:
red = Int(averageColorImageBitmap.bitmapData.advancedBy(1).memory)
green = Int(averageColorImageBitmap.bitmapData.advancedBy(2).memory)
blue = Int(averageColorImageBitmap.bitmapData.advancedBy(3).memory)
default:
red = Int(averageColorImageBitmap.bitmapData.memory)
green = Int(averageColorImageBitmap.bitmapData.advancedBy(1).memory)
blue = Int(averageColorImageBitmap.bitmapData.advancedBy(2).memory)
}

Check out the answer to How to keep NSBitmapImageRep from creating lots of intermediate CGImages?.
The gist is that the NSImage/NSBitmapImageRep implementation automatically handles the input format.
Apple's docs fail to note that the format parameter (for example in CIRenderDestination) specifies the desired output space.
If you want it in a particular format, the docs recommend drawing into that format (example in linked answer).
If you just need particular information, NSBitmapImageRep provides easy access to individual parameters. I could not find a clear and direct route to a CIFormat without setting up cascading manual tests. I assume a way exists somewhere.

Related

memory issue: realtime eye localization

I'm currently working on an iOS application that should be able to detect the eye location. I've created a tflite model which comprises some CNN layers. In order to use the tflite model in Xcode/Swift, I've created a helper class in which the model calculates the output. Whenever I run the predict function once, it works. But apparently the predict function doesn't work in the real-time camera thread.
After about 7 seconds, Xcode throws the following error:
Thread 1: EXC_BAD_ACCESS (code=1, address=0x123424001). This error seems to be caused by looping through the image. Since I need each pixel value, I'm using the solution suggested by Firebase. But apparently this solution is not watertight. Can anybody help to resolve this memory issue?
func creatInputForCNN(resizedImage: UIImage?) -> Data? {
// In this section of the code I loop through the image (150,200,3)
// in order to fetch each pixel value (RGB).
guard let image = resizedImage?.cgImage else { return nil }
guard let context = CGContext(
data: nil,
width: image.width, height: image.height,
bitsPerComponent: 8, bytesPerRow: image.width * 4,
space: CGColorSpaceCreateDeviceRGB(),
bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue
) else { return nil }
context.draw(image, in: CGRect(x: 0, y: 0, width: image.width, height: image.height))
guard let imageData = context.data else { return nil }
let size_w = 150
let size_H = 200
var inputData = Data()
for row in 0 ..< size_H {
for col in 0 ..< size_w {
let offset = 4 * (row * context.width + col)
// (Ignore offset 0, the unused alpha channel)
let red = imageData.load(fromByteOffset: offset+1, as: UInt8.self)
let green = imageData.load(fromByteOffset: offset+2, as: UInt8.self)
let blue = imageData.load(fromByteOffset: offset+3, as: UInt8.self)
// Normalize channel values to [0.0, 1.0]. This requirement varies
// by model. For example, some models might require values to be
// normalized to the range [-1.0, 1.0] instead, and others might
// require fixed-point values or the original bytes.
var normalizedRed: Float32 = Float32(red) / 255
var normalizedGreen: Float32 = Float32(green) / 255
var normalizedBlue: Float32 = Float32(blue) / 255
// Append normalized values to Data object in RGB order.
let elementSize = MemoryLayout.size(ofValue: normalizedRed)
var bytes = [UInt8](repeating: 0, count: elementSize)
memcpy(&bytes, &normalizedRed, elementSize)
inputData.append(&bytes, count: elementSize)
memcpy(&bytes, &normalizedGreen, elementSize)
inputData.append(&bytes, count: elementSize)
memcpy(&bytes, &normalizedBlue, elementSize)
inputData.append(&bytes, count: elementSize)
}
}
return inputData
}
This is the code
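One thing worth ruling out with a loop like this is reading past the end of the buffer, for example when an incoming frame is smaller than the hard-coded 150 × 200. The sketch below is not a confirmed fix for the crash. It only shows the same XRGB to Float32 conversion on a plain byte array with an explicit bounds check; normalizedRGB and its parameters are hypothetical.

```swift
import Foundation

/// Hypothetical helper: converts XRGB bytes to Float32 RGB values in [0, 1].
/// Returns nil when the requested region does not fit inside the buffer.
func normalizedRGB(from pixels: [UInt8], bytesPerRow: Int, width: Int, height: Int) -> Data? {
    guard width > 0, height > 0, bytesPerRow >= width * 4,
          pixels.count >= (height - 1) * bytesPerRow + width * 4 else { return nil }
    var output = Data(capacity: width * height * 3 * MemoryLayout<Float32>.size)
    for row in 0 ..< height {
        for col in 0 ..< width {
            let offset = row * bytesPerRow + col * 4
            // Skip offset + 0, the unused alpha byte, and append R, G, B.
            for channel in 1 ... 3 {
                let value = Float32(pixels[offset + channel]) / 255
                withUnsafeBytes(of: value) { output.append(contentsOf: $0) }
            }
        }
    }
    return output
}

// Two pixels in one row: pure red, then pure green.
let row: [UInt8] = [0, 255, 0, 0, 0, 0, 255, 0]
print(normalizedRGB(from: row, bytesPerRow: 8, width: 2, height: 1)?.count as Any)
print(normalizedRGB(from: row, bytesPerRow: 8, width: 150, height: 200) as Any)
```

The second call asks for more pixels than the buffer holds and returns nil instead of reading invalid memory.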

Espresso exception: "Invalid argument":general shape kernel while loading mlmodel

I converted my mlmodel from tf.keras. The goal is to recognize handwritten text from the image
When I run it using this code:
func performCoreMLImageRecognition(_ image: UIImage) {
let model = try! HTRModel()
// process input image
let scale = image.scaledImage(200)
let sized = scale?.resize(size: CGSize(width: 200, height: 50))
let gray = sized?.rgb2GrayScale()
guard let pixelBuffer = sized?.pixelBufferGray(width: 200, height: 50) else { fatalError("Cannot convert image to pixelBufferGray")}
UIImageWriteToSavedPhotosAlbum(gray! ,
self,
#selector(self.didFinishSavingImage(_:didFinishSavingWithError:contextInfo:)),
nil)
let mlArray = try! MLMultiArray(shape: [1, 1], dataType: MLMultiArrayDataType.float32)
let htrinput = HTRInput(image: pixelBuffer, label: mlArray)
if let prediction = try? model.prediction(input: htrinput) {
print(prediction)
}
}
I get the following error:
[espresso] [Espresso::handle_ex_plan] exception=Espresso exception: "Invalid argument": generic_reshape_kernel: Invalid bottom shape (64 12 1 1 1) for reshape to (768 50 -1 1 1) status=-6
2021-01-21 20:23:50.712585+0900 Guided Camera[7575:1794819] [coreml] Error computing NN outputs -6
2021-01-21 20:23:50.712611+0900 Guided Camera[7575:1794819]
[coreml] Failure in -executePlan:error:.
Here is the model configuration
The model ran perfectly fine. Where am I going wrong? I am not well versed in Swift and need help.
What does this error mean, and how do I resolve it?
Sometimes during the conversion from Keras (or whatever) to Core ML, the converter doesn't understand how to handle certain operations, which results in a model that doesn't work.
In your case, there is a layer that outputs a tensor with shape (64, 12, 1, 1, 1) while there is a reshape layer that expects something that can be reshaped to (768, 50, -1, 1, 1).
You'll need to find out which layer does this reshape and then examine the Core ML model to see why it gets an input tensor that is not the correct size. Just because it works OK in Keras does not mean the conversion to Core ML was flawless.
You can examine the Core ML model with Netron, an open source model viewer.
(Note that 64x12 = 768, so the issue appears to be with the 50 in that tensor.)
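The arithmetic behind that note can be written out. This is a small sketch, assuming the usual convention that a single -1 in a target shape stands for whatever size makes the element counts match; inferredDimension is a hypothetical helper, not a Core ML API.

```swift
/// Returns the size inferred for the single -1 dimension, or nil when the
/// element counts are incompatible. Hypothetical helper for illustration.
func inferredDimension(from source: [Int], to target: [Int]) -> Int? {
    let total = source.reduce(1, *)
    let known = target.filter { $0 != -1 }.reduce(1, *)
    guard target.filter({ $0 == -1 }).count == 1, known > 0, total % known == 0 else { return nil }
    return total / known
}

print(inferredDimension(from: [64, 12, 1, 1, 1], to: [768, 50, -1, 1, 1]) as Any) // nil
print(inferredDimension(from: [64, 12, 1, 1, 1], to: [768, -1, 1, 1, 1]) as Any)  // Optional(1)
```

The bottom tensor holds 64 × 12 = 768 elements, while the target needs a multiple of 768 × 50 = 38400, so no value for -1 can make the reshape valid.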

UIColor(r: g: b:) not working properly [duplicate]

I have a method called drawLine() that draws a line between two points (obviously). It has a start point and an end point that are constantly being moved around, so every time the update() method is called, the line is deleted and drawLine() is called again, creating a new line.
func drawLine(firstPoint: CGPoint, secondPoint: CGPoint) -> SKShapeNode
{
var path = CGPathCreateMutable()
CGPathMoveToPoint(path, nil, firstPoint.x, firstPoint.y)
CGPathAddLineToPoint(path, nil, secondPoint.x, secondPoint.y)
let shape = SKShapeNode()
shape.path = path
let strength = sqrt(pow((firstPoint.x - secondPoint.x), 2) + pow((firstPoint.y - secondPoint.y), 2))
NSLog("\(strength)")
shape.strokeColor = UIColor(red: strength/3, green: 255 - strength/3, blue: 0, alpha: 1)
shape.lineWidth = 2
return shape
}
The line stretches from "firstPoint" to "secondPoint" without any problems, but the rgb value, which is calculated using a variable called "strength" that is proportional to the length of the line, doesn't seem to be working properly. The line is always yellow no matter what, until the "strength" reaches a value of 765 (which also happens to be 255 * 3) which is when it abruptly switches to red. Why aren't I getting a gradual change? Also I tried inputting the values for turquoise (r: 50, g: 214, b: 200) but I only got grey. Why is this? Thanks in advance (:
The RGB values range from 0.0 to 1.0, not 0 to 255.
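A minimal sketch of the conversion, assuming the channel values were computed on a 0 to 255 scale as in the question; unitComponent is a hypothetical helper.

```swift
/// Hypothetical helper: maps a 0...255 channel value to 0.0...1.0,
/// clamping input that falls outside that range.
func unitComponent(_ value: Double) -> Double {
    return min(max(value, 0), 255) / 255
}

let strength = 300.0
let red = unitComponent(strength / 3)         // 100 / 255
let green = unitComponent(255 - strength / 3) // 155 / 255
// These values could then be passed to UIColor(red:green:blue:alpha:).
print(red, green)
```

Assuming out-of-range components are clamped to 1.0, this would also explain the behaviour in the question: red and green both stay at full intensity (yellow) until 255 - strength/3 reaches zero at a strength of 765.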

Trying to create a 32 bpc NSBitmapImageRep, getting hit with errors

I'm trying to create an NSBitmapImageRep object with 32 bits per sample and an alpha channel (128 bits per pixel in total).
My code looks like this:
let renderSize = NSSize(width: 640, height: 360)
let bitmapRep = NSBitmapImageRep(bitmapDataPlanes: nil, pixelsWide: Int(renderSize.width), pixelsHigh: Int(renderSize.height), bitsPerSample: 32, samplesPerPixel: 4, hasAlpha: true, isPlanar: false, colorSpaceName: NSCalibratedRGBColorSpace, bytesPerRow: 16 * Int(renderSize.width), bitsPerPixel: 128)
println(bitmapRep) //prints "nil"
println(16 * Int(renderSize.width)) //prints 10240
println()
//So what does a 'valid' 32bpc TIFF file with an alpha channel look like?
let imgFile = NSImage(named: "32grad") //http://i.peterwunder.de/32grad.tif
let imgRep = imgFile?.representations[0] as NSBitmapImageRep
println(imgRep.bitsPerSample) //prints 32
println(imgRep.bitsPerPixel) //prints 128
println(imgRep.samplesPerPixel) //prints 4
println(imgRep.bytesPerRow) //prints 10240
println(imgRep.bytesPerRow / Int(imgFile!.size.width)) //prints 16
This appears in the console after executing line 2:
2014-12-31 04:49:16.639 PixelTestBed[4413:2775703] Inconsistent set of values to create NSBitmapImageRep
What's going on here? Why can't I manually create an NSBitmapImageRep with the exact same values that TIFF image has?
By the way, I can't upload the image here because imgur would butcher the image's quality. It's a 3,7 MB 32bpc TIFF with an alpha channel, after all.
You are trying to assign 32 bitsPerSample which is out of the range specified in the docs:
bps
The number of bits used to specify one pixel in a single component of
the data. All components are assumed to have the same bits per sample.
bps should be one of these values: 1, 2, 4, 8, 12, or 16.
NSBitmapImageRep Class Reference

How to NSLog pixel RGB from a UIImage?

I just want to:
1) Copy the pixel data.
2) Iterate and Modify each pixel (just show me how to NSLog the ARGB values as 255)
3) Create a UIImage from the new pixel data
I can figure out the gory details if someone can just tell me how to NSLog the RGBA values of a pixel as 255. How do I modify the following code to do this? Be specific, please!
-(UIImage*)modifyPixels:(UIImage*)originalImage
{
NSData* pixelData = (NSData*)CGDataProviderCopyData(CGImageGetDataProvider(originalImage.CGImage));
uint myLength = [pixelData length];
for(int i = 0; i < myLength; i += 4) {
//CHANGE PIXELS HERE
/*
Sidenote: Just show me how to NSLog them
*/
//Example:
//NSLog(@"Alpha 255-Value is: %u", data[i]);
//NSLog(@"Red 255-Value is: %u", data[i+1]);
//NSLog(@"Green 255-Value is: %u", data[i+2]);
//NSLog(@"Blue 255-Value is: %u", data[i+3]);
}
//CREATE NEW UIIMAGE (newImage) HERE
return newImage;
}
Did this direction work for you? I'd get pixel data like this:
UInt32 *pixels = CGBitmapContextGetData( ctx );
#define getRed(p) ((p) & 0x000000FF)
#define getGreen(p) (((p) & 0x0000FF00) >> 8)
#define getBlue(p) (((p) & 0x00FF0000) >> 16)
// display RGB values from the 11th pixel
NSLog(@"Red: %d, Green: %d, Blue: %d", getRed(pixels[10]), getGreen(pixels[10]), getBlue(pixels[10]));
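For comparison, the same mask-and-shift extraction can be sketched in Swift. It assumes the layout the macros above imply, with red in the lowest byte of each 32-bit word; the actual layout depends on how the bitmap context was created. The channels(of:) function is a hypothetical name.

```swift
/// Assumes a 32-bit pixel word whose lowest byte is red, followed by
/// green, blue, and alpha in the highest byte.
func channels(of pixel: UInt32) -> (red: UInt8, green: UInt8, blue: UInt8, alpha: UInt8) {
    return (
        red: UInt8(pixel & 0xFF),
        green: UInt8((pixel >> 8) & 0xFF),
        blue: UInt8((pixel >> 16) & 0xFF),
        alpha: UInt8((pixel >> 24) & 0xFF)
    )
}

let sample = channels(of: 0xFF03_0201)
print(sample.red, sample.green, sample.blue, sample.alpha) // 1 2 3 255
```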
If you'd like to actually see the image, you can use Florent Pillet's NSLogger:
https://github.com/fpillet/NSLogger
The idea is you start the NSLogger client on your desktop, and then in your app you put this up towards the top:
#import "LoggerClient.h"
And in your modifyPixels method you can do something like this:
LogImageData(@"RexOnRoids", // Any identifier to go along with the log
0, // Log level
newImage.size.width, // Image width
newImage.size.height, // Image height
UIImagePNGRepresentation(newImage)); // Image as PNG
Start the client on your desktop, and then run the app on your iphone, and you'll see real images appear in the client. VERY handy for debugging image problems such as flipping, rotating, colors, alpha, etc.