I'm trying to use the new Detect and Track Objects feature of ML Kit on iOS, but I seem to be running into a roadblock with the object detection bounding box.
Using a LEGO figure as an example, the image orientation is converted to always be .up, as per the documentation, yet the bounding box appears to be rotated 90 degrees from the correct dimensions even though the image orientation is correct. Similar behaviour occurs with other objects too, with the box being offset.
let options = VisionObjectDetectorOptions()
options.detectorMode = .singleImage
options.shouldEnableMultipleObjects = false

let objectDetector = Vision.vision().objectDetector(options: options)
let image = VisionImage(image: self.originalImage)

objectDetector.process(image) { detectedObjects, error in
    if let error = error {
        print(error)
        return
    }
    guard let detectedObjects = detectedObjects, !detectedObjects.isEmpty else {
        print("No objects detected")
        return
    }

    let primaryObject = detectedObjects.first
    print(primaryObject as Any)

    guard let objectFrame = primaryObject?.frame else { return }
    print(objectFrame)

    self.imageView.image = self.drawOccurrencesOnImage([objectFrame], self.originalImage)
}
and the function that draws the red box:
private func drawOccurrencesOnImage(_ occurrences: [CGRect], _ image: UIImage) -> UIImage? {
    let imageSize = image.size
    let scale: CGFloat = 0.0  // 0 means use the device's screen scale

    UIGraphicsBeginImageContextWithOptions(imageSize, false, scale)
    image.draw(at: CGPoint.zero)

    let ctx = UIGraphicsGetCurrentContext()
    ctx?.addRects(occurrences)
    ctx?.setStrokeColor(UIColor.red.cgColor)
    ctx?.setLineWidth(20)
    ctx?.strokePath()

    guard let drawnImage = UIGraphicsGetImageFromCurrentImageContext() else {
        return nil
    }
    UIGraphicsEndImageContext()
    return drawnImage
}
The image dimensions, according to image.size, are (3024.0, 4032.0), and the box frame is (1274.0, 569.0, 1299.0, 2023.0). Any insight into this behaviour would be much appreciated.
It turned out I wasn't scaling the image properly, which caused the misalignment. This function ended up fixing my problem:
public func updateImageView(with image: UIImage) {
    let orientation = UIApplication.shared.statusBarOrientation
    var scaledImageWidth: CGFloat = 0.0
    var scaledImageHeight: CGFloat = 0.0

    switch orientation {
    case .portrait, .portraitUpsideDown, .unknown:
        scaledImageWidth = imageView.bounds.size.width
        scaledImageHeight = image.size.height * scaledImageWidth / image.size.width
    case .landscapeLeft, .landscapeRight:
        scaledImageHeight = imageView.bounds.size.height
        scaledImageWidth = image.size.width * scaledImageHeight / image.size.height
    }

    DispatchQueue.global(qos: .userInitiated).async {
        // Scale image while maintaining aspect ratio so it displays better in the UIImageView.
        let scaledImage = image.scaledImage(
            with: CGSize(width: scaledImageWidth, height: scaledImageHeight)
        ) ?? image
        DispatchQueue.main.async {
            self.imageView.image = scaledImage
            self.processImage(scaledImage)
        }
    }
}
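Note that scaledImage(with:) is a helper from Google's ML Kit quickstart sample and isn't shown above. A minimal sketch of such a helper, assuming a plain redraw at the target point size is all that's needed:

extension UIImage {
    // Hypothetical stand-in for the quickstart's scaledImage(with:) helper:
    // redraws the bitmap at the given point size, returning nil on failure.
    func scaledImage(with size: CGSize) -> UIImage? {
        UIGraphicsBeginImageContextWithOptions(size, false, scale)
        defer { UIGraphicsEndImageContext() }
        draw(in: CGRect(origin: .zero, size: size))
        return UIGraphicsGetImageFromCurrentImageContext()
    }
}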
I want to create a full-screen camera app which uses Metal shaders. When I used AVCaptureVideoPreviewLayer in my root view with videoPreviewLayer.videoGravity = .resizeAspectFill, it was genuinely full screen, but since switching my app to Metal I can't resize the output beyond 1170 x 1560 on an iPhone 13 Pro. The only relevant settings I found, according to this manual https://developer.apple.com/documentation/avfoundation/additional_data_capture/avcamfilter_applying_filters_to_a_capture_stream
are:
let width = CVPixelBufferGetWidth(previewPixelBuffer)
let height = CVPixelBufferGetHeight(previewPixelBuffer)
I am absolutely new to Metal, so I may not be asking this the right way, but how can I resize the Metal texture to full-screen size?
// width = 1170
// height = 1560
private func render() {
    guard let pixelBuffer = self.pixelBuffer else { return }

    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)
    var cvTextureOut: CVMetalTexture?
    print(width)
    print(height)

    CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault, textureCache!, pixelBuffer, nil, .bgra8Unorm, width, height, 0, &cvTextureOut)
    guard let cvTexture = cvTextureOut, let inputTexture = CVMetalTextureGetTexture(cvTexture) else {
        fatalError("Failed to create metal textures")
    }

    guard let drawable: CAMetalDrawable = self.currentDrawable else { fatalError("Failed to create drawable") }

    if let commandQueue = commandQueue,
       let commandBuffer = commandQueue.makeCommandBuffer(),
       let computeCommandEncoder = commandBuffer.makeComputeCommandEncoder() {
        computeCommandEncoder.setComputePipelineState(computePipelineState)
        computeCommandEncoder.setTexture(inputTexture, index: 0)
        computeCommandEncoder.setTexture(drawable.texture, index: 1)
        computeCommandEncoder.dispatchThreadgroups(inputTexture.threadGroups(), threadsPerThreadgroup: inputTexture.threadGroupCount())
        computeCommandEncoder.endEncoding()
        commandBuffer.present(drawable)
        commandBuffer.commit()
    }
}
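No answer is recorded here, but note that a compute kernel that copies camera texels 1:1 into the drawable can never cover more than the buffer's 1170 x 1560 pixels. One hedged approach is to scale the camera texture into the full-size drawable, for example with MetalPerformanceShaders, which reproduces the .resizeAspectFill behaviour. A minimal sketch, assuming the layer's framebufferOnly is set to false so the drawable texture is writable (names are illustrative):

import MetalPerformanceShaders

// Hypothetical alternative to the 1:1 compute copy above: scale the camera
// texture so it covers the whole drawable, like .resizeAspectFill.
func encodeAspectFill(source: MTLTexture,
                      drawable: CAMetalDrawable,
                      commandBuffer: MTLCommandBuffer,
                      device: MTLDevice) {
    let dst = drawable.texture
    // Uniform scale that makes the source cover the drawable.
    let scale = max(Double(dst.width) / Double(source.width),
                    Double(dst.height) / Double(source.height))
    // Center the scaled image; whatever overflows the edges is cropped.
    var transform = MPSScaleTransform(
        scaleX: scale,
        scaleY: scale,
        translateX: (Double(dst.width) - Double(source.width) * scale) / 2,
        translateY: (Double(dst.height) - Double(source.height) * scale) / 2)
    let scaler = MPSImageBilinearScale(device: device)
    withUnsafePointer(to: &transform) { ptr in
        scaler.scaleTransform = ptr
        scaler.encode(commandBuffer: commandBuffer,
                      sourceTexture: source,
                      destinationTexture: dst)
    }
}

MPSImageBilinearScale resamples on the GPU, so it can replace the 1:1 copy pass rather than adding work on top of it.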
I'm taking snapshots of a PDFView from PDFKit for streaming (20 times per second), and I use this extension:
extension UIView {
    func asImageBackground(viewLayer: CALayer, viewBounds: CGRect) -> UIImage {
        let renderer = UIGraphicsImageRenderer(bounds: viewBounds)
        return renderer.image { rendererContext in
            viewLayer.render(in: rendererContext.cgContext)
        }
    }
}
But the output UIImage from this extension has a high resolution, which makes it difficult to stream. I can reduce it with this extension:
extension UIImage {
    func resize(_ max_size: CGFloat) -> UIImage {
        // adjust for device pixel density
        let max_size_pixels = max_size / UIScreen.main.scale
        // work out aspect ratio
        let aspectRatio = size.width / size.height
        // variables for storing calculated data
        var width: CGFloat
        var height: CGFloat
        var newImage: UIImage
        if aspectRatio > 1 {
            // landscape
            width = max_size_pixels
            height = max_size_pixels / aspectRatio
        } else {
            // portrait
            height = max_size_pixels
            width = max_size_pixels * aspectRatio
        }
        // create an image renderer of the correct size
        let renderer = UIGraphicsImageRenderer(
            size: CGSize(width: width, height: height),
            format: UIGraphicsImageRendererFormat.default()
        )
        // render the image
        newImage = renderer.image { context in
            self.draw(in: CGRect(x: 0, y: 0, width: width, height: height))
        }
        // return the image
        return newImage
    }
}
but it adds an additional workload, which makes the process even worse. Is there a better way?
Thanks
You can downsample it using ImageIO, which is recommended by Apple:
extension UIImage {
    func downsample(to resolution: CGSize) -> UIImage? {
        let imageSourceOptions = [kCGImageSourceShouldCache: false] as CFDictionary
        guard let data = self.jpegData(compressionQuality: 0.75) as CFData?,
              let imageSource = CGImageSourceCreateWithData(data, imageSourceOptions) else {
            return nil
        }
        let maxDimensionInPixels = Swift.max(resolution.width, resolution.height) * 3
        let downsampleOptions = [
            kCGImageSourceCreateThumbnailFromImageAlways: true,
            kCGImageSourceShouldCacheImmediately: true,
            kCGImageSourceCreateThumbnailWithTransform: true,
            kCGImageSourceThumbnailMaxPixelSize: maxDimensionInPixels
        ] as CFDictionary
        guard let downsampledImage = CGImageSourceCreateThumbnailAtIndex(imageSource, 0, downsampleOptions) else {
            return nil
        }
        return UIImage(cgImage: downsampledImage)
    }
}
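Since the snapshot is re-rendered 20 times per second, it may also be worth avoiding the second pass entirely by rendering the layer at a reduced scale in the first place. A hedged sketch of a one-pass variant (the 0.5 scale factor is an assumption to tune):

extension UIView {
    // Hypothetical one-pass variant of asImageBackground: render the layer
    // directly into a smaller bitmap instead of resizing afterwards.
    func asSmallImageBackground(viewLayer: CALayer, viewBounds: CGRect,
                                scale: CGFloat = 0.5) -> UIImage {
        let format = UIGraphicsImageRendererFormat.default()
        format.scale = scale  // pixels per point; below 1 shrinks the output
        let renderer = UIGraphicsImageRenderer(bounds: viewBounds, format: format)
        return renderer.image { rendererContext in
            viewLayer.render(in: rendererContext.cgContext)
        }
    }
}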
When adding assets for the Graphic Circular complication, there is no option to add an asset for the 45mm version, so the image does not fill the available space.
Result (the image does not fill the space as it is too small):
I have read that I need to use PDF assets for the 40/42mm sizes, but my image is a raster image, so I can't create it as a PDF. I want to scale the image myself and add it as an asset, but there is no slot to drop it into.
What should I do?
The issue is that the image in the asset catalog is smaller than it should be according to the Apple Human Interface Guidelines, which is why it doesn't fill the space. As there's no option to drop in a 45mm version, you need to calculate the size and resize the image yourself.
This article is the solution:
http://www.glimsoft.com/02/18/watchos-complications/
ComplicationController+Ext.swift
extension ComplicationController {

    enum ComplicationImageType {
        case graphicCircularImage
    }

    struct ComplicationImageSizeCollection {
        var size38mm: CGFloat = 0
        let size40mm: CGFloat
        let size41mm: CGFloat
        let size44mm: CGFloat
        let size45mm: CGFloat

        // The following sizes are taken directly from the HIG:
        // https://developer.apple.com/design/human-interface-guidelines/watchos/overview/complications/
        static let graphicCircularImageSizes = ComplicationImageSizeCollection(size40mm: 42, size41mm: 44.5, size44mm: 47, size45mm: 50)

        func sizeForCurrentWatchModel() -> CGFloat {
            let screenHeight = WKInterfaceDevice.current().screenBounds.size.height
            if screenHeight >= 242 {
                // It's the 45mm version.
                return self.size45mm
            } else if screenHeight >= 224 {
                // It's the 44mm version.
                return self.size44mm
            } else if screenHeight >= 215 {
                // It's the 41mm version.
                return self.size41mm
            } else if screenHeight >= 197 {
                // It's the 40mm version.
                return self.size40mm
            } else if screenHeight >= 170 {
                // It's the 38mm version.
                return self.size38mm
            }
            return self.size40mm // Fallback, just in case.
        }

        static func sizes(for type: ComplicationImageType) -> ComplicationImageSizeCollection {
            switch type {
            case .graphicCircularImage: return Self.graphicCircularImageSizes
            }
        }

        static func getImage(for type: ComplicationImageType) -> UIImage {
            let complicationImageSizes = ComplicationImageSizeCollection.sizes(for: type)
            let width = complicationImageSizes.sizeForCurrentWatchModel()
            let size = CGSize(width: width, height: width)

            var filename: String!
            switch type {
            case .graphicCircularImage: filename = "gedenken_graphic_circular_pdf"
            }
            return renderPDFToImage(named: filename, outputSize: size)
        }

        static private func renderPDFToImage(named filename: String, outputSize size: CGSize) -> UIImage {
            // Create a URL for the PDF file.
            let resourceName = filename.replacingOccurrences(of: ".pdf", with: "")
            let path = Bundle.main.path(forResource: resourceName, ofType: "pdf")!
            let url = URL(fileURLWithPath: path)

            guard let document = CGPDFDocument(url as CFURL),
                  let page = document.page(at: 1) else {
                fatalError("We couldn't find the document or the page")
            }

            let originalPageRect = page.getBoxRect(.mediaBox)
            // With the multiplier, we bring the PDF from its original size to the desired output size.
            let multiplier = size.width / originalPageRect.width

            UIGraphicsBeginImageContextWithOptions(size, false, 0)
            let context = UIGraphicsGetCurrentContext()!

            // Translate the context.
            context.translateBy(x: 0, y: originalPageRect.size.height * multiplier)
            // Flip the context vertically because the Core Graphics coordinate system starts from the bottom.
            context.scaleBy(x: multiplier * 1.0, y: -1.0 * multiplier)

            // Draw the PDF page.
            context.drawPDFPage(page)

            let image = UIGraphicsGetImageFromCurrentImageContext()!
            UIGraphicsEndImageContext()
            return image
        }
    }
}
ComplicationController.swift
func createGraphicCircularTemplate() -> CLKComplicationTemplate {
    let template = CLKComplicationTemplateGraphicCircularImage()
    let imageLogoProvider = CLKFullColorImageProvider()
    imageLogoProvider.image = ComplicationImageSizeCollection.getImage(for: .graphicCircularImage)
    template.imageProvider = imageLogoProvider
    return template
}
Essentially, I have the following QR code function that successfully creates a QR code based on a given string. How can I add a square image to the center of the QR code that stays static no matter what string the code represents?
The following is the function I use to generate it:
func generateQRCode(from string: String) -> UIImage? {
    let data = string.data(using: String.Encoding.ascii)
    if let filter = CIFilter(name: "CIQRCodeGenerator") {
        filter.setValue(data, forKey: "inputMessage")
        let transform = CGAffineTransform(scaleX: 3, y: 3)
        if let output = filter.outputImage?.transformed(by: transform) {
            return UIImage(ciImage: output)
        }
    }
    return nil
}
Sample code from one of my apps, only lightly commented.
The size calculations may not be required for your app.
func generateImage(code: String, size pointSize: CGSize, logo: UIImage? = nil) -> UIImage? {
    let pixelScale = UIScreen.main.scale
    let pixelSize = CGSize(width: pointSize.width * pixelScale, height: pointSize.height * pixelScale)

    guard
        let codeData = code.data(using: .isoLatin1),
        let generator = CIFilter(name: "CIQRCodeGenerator")
    else {
        return nil
    }

    generator.setValue(codeData, forKey: "inputMessage")
    // set a higher error-correction level so part of the code can be covered
    generator.setValue("Q", forKey: "inputCorrectionLevel")

    guard let codeImage = generator.outputImage else {
        return nil
    }

    // calculate transform depending on required size
    let transform = CGAffineTransform(
        scaleX: pixelSize.width / codeImage.extent.width,
        y: pixelSize.height / codeImage.extent.height
    )
    let scaledCodeImage = UIImage(ciImage: codeImage.transformed(by: transform), scale: 0, orientation: .up)

    guard let logo = logo else {
        return scaledCodeImage
    }

    // create a drawing buffer
    UIGraphicsBeginImageContextWithOptions(pointSize, false, 0)
    defer {
        UIGraphicsEndImageContext()
    }

    // draw QR code into the buffer
    scaledCodeImage.draw(in: CGRect(origin: .zero, size: pointSize))

    // calculate scale to cover the central 25% of the image,
    // adjusted for the logo's width/height ratio
    let logoScaleFactor: CGFloat = 0.25
    let logoScale = min(
        pointSize.width * logoScaleFactor / logo.size.width,
        pointSize.height * logoScaleFactor / logo.size.height
    )
    // size of the logo
    let logoSize = CGSize(width: logoScale * logo.size.width, height: logoScale * logo.size.height)

    // draw the logo centered over the code
    logo.draw(in: CGRect(
        x: (pointSize.width - logoSize.width) / 2,
        y: (pointSize.height - logoSize.height) / 2,
        width: logoSize.width,
        height: logoSize.height
    ))

    return UIGraphicsGetImageFromCurrentImageContext()!
}
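A hedged usage example (the asset name "logo" is a placeholder):

let qrImage = generateImage(
    code: "https://example.com",
    size: CGSize(width: 200, height: 200),
    logo: UIImage(named: "logo")
)
imageView.image = qrImage

Because the generator is set to correction level "Q", roughly 25% of the code words can be obscured and still scan, which is why covering the central quarter with a logo is safe.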
I have a large image of 1920x1080 pixels. I'm trying to scale it in two different ways:
First, using CIFilter:
func resize(image: UIImage, scale: Float, aspect: Float = 1) -> UIImage? {
    return autoreleasepool(invoking: { [weak self] () -> UIImage? in
        var filter: CIFilter! = CIFilter(name: "CILanczosScaleTransform")!
        filter.setValue(CIImage(image: image), forKey: kCIInputImageKey)
        filter.setValue(NSNumber(value: scale as Float), forKey: kCIInputScaleKey)
        filter.setValue(NSNumber(value: aspect as Float), forKey: kCIInputAspectRatioKey)

        var result: UIImage?
        var cgImage: CGImage? = nil
        if let outputImage = filter.outputImage {
            cgImage = self?.ctx?.createCGImage(outputImage, from: outputImage.extent)
        }
        if let cgImg = cgImage {
            result = self?.convertUIImage(fromCGImage: cgImg)
        }
        if #available(iOS 10.0, *) {
            self?.ctx?.clearCaches()
        }

        cgImage = nil
        filter.setValue(nil, forKey: kCIInputImageKey)
        filter.setValue(nil, forKey: kCIInputScaleKey)
        filter.setValue(nil, forKey: kCIInputAspectRatioKey)
        filter.setDefaults()
        filter = nil
        return result
    })
}
Second, using UIImage:
func scaleImage(scale: CGFloat) -> UIImage? {
    if let cgImage = self.cgImage {
        return UIImage(cgImage: cgImage, scale: scale, orientation: imageOrientation)
    }
    return nil
}
But I realized that the scale factor in the two methods produces conflicting results. For example, with scale set to 2, the first method yields an image of size 3840x2160, while the second yields 960x540.
I'm really confused. Can anyone explain why this happens?
And in the future, when a function has a scale parameter, how do I know whether it will make my image larger or smaller?
A UIImage's scale property describes pixel density, not size, and it can be any value. UIImage(cgImage:scale:orientation:) does not resample the pixels at all: the scale you pass only tells UIKit how many pixels map to one point, so passing 2 halves the reported point size while the bitmap stays 1920x1080. CILanczosScaleTransform, by contrast, actually resamples the bitmap, so scale 2 doubles the pixel dimensions. If you want to redraw the bitmap at a new size, try this:
func scaleImage(image: UIImage, scale: CGFloat) -> UIImage? {
    let size = CGSize(width: image.size.width * scale, height: image.size.height * scale)
    let drawRect = CGRect(origin: .zero, size: size)
    UIGraphicsBeginImageContextWithOptions(size, false, 0)
    image.draw(in: drawRect)
    let newImage = UIGraphicsGetImageFromCurrentImageContext()
    UIGraphicsEndImageContext()
    return newImage
}
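To see the difference concretely, here is an illustrative check (assumes "frame" is a hypothetical 1920x1080 asset loaded at scale 1):

let original = UIImage(named: "frame")!   // hypothetical asset: size (1920, 1080), scale 1

// UIImage(cgImage:scale:orientation:) only reinterprets density:
// same pixels, half the point size.
let reinterpreted = UIImage(cgImage: original.cgImage!, scale: 2, orientation: .up)
print(reinterpreted.size)   // (960.0, 540.0)

// Redrawing resamples the bitmap: twice the point size.
let redrawn = scaleImage(image: original, scale: 2)!
print(redrawn.size)         // (3840.0, 2160.0)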