Applying MPSImageGaussianBlur with depth data - swift

I am trying to create an imitation of the Portrait mode in Apple's native camera.
The problem is that applying the blur effect using CIImage with respect to depth data is too slow for the live preview I want to show to the user.
My code for this task is:
func blur(image: CIImage, mask: CIImage, orientation: UIImageOrientation = .up, blurRadius: CGFloat) -> UIImage? {
    let start = Date()
    let invertedMask = mask.applyingFilter("CIColorInvert")
    let output = image.applyingFilter("CIMaskedVariableBlur", withInputParameters: ["inputMask": invertedMask,
                                                                                    "inputRadius": blurRadius])
    guard let cgImage = context.createCGImage(output, from: image.extent) else {
        return nil
    }
    let end = Date()
    let elapsed = end.timeIntervalSince(start)
    print("took \(elapsed) seconds to apply blur")
    return UIImage(cgImage: cgImage, scale: 1.0, orientation: orientation)
}
I want to apply the blur on the GPU for better performance. For this task, I found this implementation provided by Apple here.
So in Apple's implementation, we have this code snippet:
/** Applies a Gaussian blur with a sigma value of 5.0.
    This is a pre-packaged convolution filter.
*/
class GaussianBlur: CommandBufferEncodable {
    let gaussian: MPSImageGaussianBlur

    required init(device: MTLDevice) {
        gaussian = MPSImageGaussianBlur(device: device,
                                        sigma: 5.0)
    }

    func encode(to commandBuffer: MTLCommandBuffer, sourceTexture: MTLTexture, destinationTexture: MTLTexture) {
        gaussian.encode(commandBuffer: commandBuffer,
                        sourceTexture: sourceTexture,
                        destinationTexture: destinationTexture)
    }
}
How can I apply the depth data to the filtering through the Metal blur version? Or in other words: how can I achieve the first code snippet's functionality with the performance speed of the second code snippet?

For anyone still looking: you need to get the currentDrawable first, inside the draw(in view: MTKView) method, so implement MTKViewDelegate:
func makeBlur() {
    device = MTLCreateSystemDefaultDevice()
    commandQueue = device.makeCommandQueue()
    selfView.mtkView.device = device
    selfView.mtkView.framebufferOnly = false
    selfView.mtkView.delegate = self
    let textureLoader = MTKTextureLoader(device: device)
    if let image = self.backgroundSnapshotImage?.cgImage,
       let texture = try? textureLoader.newTexture(cgImage: image, options: nil) {
        sourceTexture = texture
    }
}

func draw(in view: MTKView) {
    if let currentDrawable = view.currentDrawable,
       let commandBuffer = commandQueue.makeCommandBuffer() {
        let gaussian = MPSImageGaussianBlur(device: device, sigma: 5)
        gaussian.encode(commandBuffer: commandBuffer, sourceTexture: sourceTexture, destinationTexture: currentDrawable.texture)
        commandBuffer.present(currentDrawable)
        commandBuffer.commit()
    }
}
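To actually bring the depth data into the Metal path, one option (a rough sketch, not drop-in code) is to blur the whole frame with MPSImageGaussianBlur and then composite the sharp and blurred textures with CIBlendWithMask through a Metal-backed CIContext, so everything stays on the GPU. The texture, mask and parameter names below are assumptions, not code from the question:
import CoreImage
import Metal
import MetalPerformanceShaders

/// Rough sketch: blur the whole frame with MPS, then blend the sharp and blurred
/// versions by the depth mask with Core Image, all on one command buffer.
/// `depthMask` is assumed to be a CIImage built from the depth data
/// (white = keep sharp, black = fully blurred); `blurredTexture` is a scratch
/// texture with the same size and format as `sharpTexture`.
func encodeDepthBlur(device: MTLDevice,
                     ciContext: CIContext,            // created once, e.g. CIContext(mtlDevice: device)
                     commandBuffer: MTLCommandBuffer,
                     sharpTexture: MTLTexture,
                     blurredTexture: MTLTexture,
                     depthMask: CIImage,
                     destinationTexture: MTLTexture) {
    // 1. Gaussian-blur the whole frame on the GPU.
    let gaussian = MPSImageGaussianBlur(device: device, sigma: 5)
    gaussian.encode(commandBuffer: commandBuffer,
                    sourceTexture: sharpTexture,
                    destinationTexture: blurredTexture)

    // 2. Wrap both textures as CIImages and blend them by the depth mask.
    guard let sharp = CIImage(mtlTexture: sharpTexture, options: nil),
          let blurred = CIImage(mtlTexture: blurredTexture, options: nil) else { return }
    let blended = sharp.applyingFilter("CIBlendWithMask",
                                       parameters: [kCIInputBackgroundImageKey: blurred,
                                                    kCIInputMaskImageKey: depthMask])

    // 3. Render the composite into the destination (e.g. the drawable's texture)
    //    on the same command buffer.
    ciContext.render(blended,
                     to: destinationTexture,
                     commandBuffer: commandBuffer,
                     bounds: blended.extent,
                     colorSpace: CGColorSpaceCreateDeviceRGB())
}
The caller would then present the drawable and commit the command buffer exactly as in draw(in:) above.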

Related

Extract face images from a large number of photos in Photo Library using CIDetector

I am trying to iterate over images in the Photo Library and extract faces using CIDetector. The images are required to keep their original resolutions. To do so, I take the following steps:
1- Getting assets given a date interval (usually more than a year)
func loadAssets(from fromDate: Date, to toDate: Date, completion: @escaping ([PHAsset]) -> Void) {
    fetchQueue.async {
        let authStatus = PHPhotoLibrary.authorizationStatus()
        if authStatus == .authorized || authStatus == .limited {
            let options = PHFetchOptions()
            options.predicate = NSPredicate(format: "creationDate >= %@ && creationDate <= %@", fromDate as CVarArg, toDate as CVarArg)
            options.sortDescriptors = [NSSortDescriptor(key: "creationDate", ascending: false)]
            let result: PHFetchResult = PHAsset.fetchAssets(with: .image, options: options)
            var _assets = [PHAsset]()
            result.enumerateObjects { object, count, stop in
                _assets.append(object)
            }
            completion(_assets)
        } else {
            completion([])
        }
    }
}
where:
let fetchQueue = DispatchQueue.global(qos: .background)
2- Extracting faces
I then extract face images using:
func detectFaces(in image: UIImage, accuracy: String = CIDetectorAccuracyLow, completion: @escaping ([UIImage]) -> Void) {
    faceDetectionQueue.async {
        var faceImages = [UIImage]()
        let outputImageSize: CGFloat = 200.0 / image.scale
        guard let ciImage = CIImage(image: image),
              let faceDetector = CIDetector(ofType: CIDetectorTypeFace, context: nil, options: [CIDetectorAccuracy: accuracy]) else { completion(faceImages); return }
        let faces = faceDetector.features(in: ciImage) // Crash happens here
        let group = DispatchGroup()
        for face in faces {
            group.enter()
            if let face = face as? CIFaceFeature {
                let faceBounds = face.bounds
                let offset: CGFloat = floor(min(faceBounds.width, faceBounds.height) * 0.2)
                let inset = UIEdgeInsets(top: -offset, left: -offset, bottom: -offset, right: -offset)
                let rect = faceBounds.inset(by: inset)
                let croppedFaceImage = ciImage.cropped(to: rect)
                let scaledImage = croppedFaceImage
                    .transformed(by: CGAffineTransform(scaleX: outputImageSize / croppedFaceImage.extent.width,
                                                       y: outputImageSize / croppedFaceImage.extent.height))
                faceImages.append(UIImage(ciImage: scaledImage))
                group.leave()
            } else {
                group.leave()
            }
        }
        group.notify(queue: self.faceDetectionQueue) {
            completion(faceImages)
        }
    }
}
where
private let faceDetectionQueue = DispatchQueue(label: "face detection queue",
                                               qos: DispatchQoS.background,
                                               attributes: [],
                                               autoreleaseFrequency: DispatchQueue.AutoreleaseFrequency.workItem,
                                               target: nil)
I use the following extension to get the image from assets:
extension PHAsset {
    var image: UIImage {
        autoreleasepool {
            let manager = PHImageManager.default()
            let options = PHImageRequestOptions()
            var thumbnail = UIImage()
            let rect = CGRect(x: 0, y: 0, width: pixelWidth, height: pixelHeight)
            options.isSynchronous = true
            options.deliveryMode = .highQualityFormat
            options.resizeMode = .exact
            options.normalizedCropRect = rect
            options.isNetworkAccessAllowed = true
            manager.requestImage(for: self, targetSize: rect.size, contentMode: .aspectFit, options: options, resultHandler: { (result, info) -> Void in
                if let result = result {
                    thumbnail = result
                } else {
                    thumbnail = UIImage()
                }
            })
            return thumbnail
        }
    }
}
The code works fine for a few (usually fewer than 50) assets, but for a larger number of images it crashes at:
let faces = faceDetector.features(in: ciImage) // Crash happens here
I get this error:
validateComputeFunctionArguments:858: failed assertion `Compute Function(ciKernelMain): missing sampler binding at index 0 for [0].'
If I reduce the size of the image fed to detectFaces (e.g. to 400 px), I can analyze a few hundred images (usually fewer than 1000), but as I mentioned, using the asset's image at its original size is a requirement. My guess is that it has something to do with a memory issue when I try to extract faces with CIDetector.
Any idea what this error is about and how I can fix the issue?
I can only guess what could be the issue here, so here are a few ideas:
A CIDetector is an expensive object, so try only to create a single one and re-use it for each image.
Use a single shared CIContext for performing all Core Image operations (more below). Also, pass that to the CIDetector on init. The context manages all resources needed for rendering an image. Sharing it will allow the system to re-use as many resources as possible.
The UIImage(ciImage:) constructor is really tricky. You see, CIImages are basically just recipes for creating an image, not actual bitmaps. They store the instructions for rendering the image. It takes a CIContext to do the actual rendering. When initializing a UIImage with a CIImage, you let UIKit decide how and when to render the image, which, in my experience, caused a lot of issues for other users here on StackOverflow.
Instead, you can use the shared CIContext I mentioned above to render the image first before you make it a UIImage:
let renderedImage = self.ciContext.createCGImage(scaledImage, from: scaledImage.extent).map(UIImage.init)
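Putting those points together, a minimal sketch of what the detection path could look like (the FaceExtractor name and structure are only illustrative, not from the original code):
import CoreImage
import UIKit

// Sketch only: one CIContext and one CIDetector, created once and reused,
// with every face rendered through the shared context before it becomes a UIImage.
final class FaceExtractor {
    private let ciContext = CIContext()
    private lazy var faceDetector = CIDetector(ofType: CIDetectorTypeFace,
                                               context: ciContext,
                                               options: [CIDetectorAccuracy: CIDetectorAccuracyLow])

    func faces(in ciImage: CIImage, outputSize: CGFloat = 200) -> [UIImage] {
        guard let detector = faceDetector else { return [] }
        return detector.features(in: ciImage).compactMap { feature in
            guard let face = feature as? CIFaceFeature else { return nil }
            let cropped = ciImage.cropped(to: face.bounds)
            let scaled = cropped.transformed(by: CGAffineTransform(scaleX: outputSize / cropped.extent.width,
                                                                   y: outputSize / cropped.extent.height))
            // Render through the shared context instead of relying on UIImage(ciImage:).
            guard let cgImage = ciContext.createCGImage(scaled, from: scaled.extent) else { return nil }
            return UIImage(cgImage: cgImage)
        }
    }
}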

How do I position an image correctly in MTKView?

I am trying to implement an image editing view using MTKView and Core Image filters, and I have the basics working and can see the filter applied in real time. However, the image is not positioned correctly in the view; can someone point me in the right direction for what needs to be done to get the image to render correctly? It needs to fit the view and retain its original aspect ratio.
Here is the Metal draw function - and the empty drawableSizeWillChange!? Go figure. It's probably also worth mentioning that the MTKView is a subview of another view in a scroll view and can be resized by the user. It's not clear to me how Metal handles resizing the view, but it seems that doesn't come for free.
I am also trying to call the draw() function from a background thread, and this appears to sort of work: I can see the filter effects as they are applied to the image using a slider. As I understand it, this should be possible.
It also seems that the coordinate space for rendering is the image's coordinate space, so if the image is smaller than the MTKView, then to position the image in the centre the X and Y coordinates will be negative.
When the view is resized, everything goes crazy, with the image suddenly becoming way too big and parts of the background not being cleared.
Also, when resizing the view the image gets stretched rather than redrawing smoothly.
func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {
}

public func draw(in view: MTKView) {
    if let ciImage = self.ciImage {
        if let currentDrawable = view.currentDrawable { // 1
            let commandBuffer = commandQueue.makeCommandBuffer()
            let inputImage = ciImage // 2
            exposureFilter.setValue(inputImage, forKey: kCIInputImageKey)
            exposureFilter.setValue(ev, forKey: kCIInputEVKey)
            context.render(exposureFilter.outputImage!,
                           to: currentDrawable.texture,
                           commandBuffer: commandBuffer,
                           bounds: CGRect(origin: .zero, size: view.drawableSize),
                           colorSpace: colorSpace)
            commandBuffer?.present(currentDrawable)
            commandBuffer?.commit()
        }
    }
}
As you can see the image is on the bottom left
CILanczosScaleTransform should help you out. The issue is that your CIImage, wherever it might come from, is not the same size as the view you are rendering it in.
So what you could opt to do is calculate the scale, and apply it as a filter:
let scaleFilter = CIFilter(name: "CILanczosScaleTransform")
scaleFilter?.setValue(ciImage, forKey: kCIInputImageKey)
scaleFilter?.setValue(scale, forKey: kCIInputScaleKey)
This resolves your scale issue; I currently do not know what the most efficient approach would be to actually reposition the image.
Further reference: https://nshipster.com/image-resizing/
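For the scale value itself, one way to compute it is an aspect-fit ratio between the image extent and the drawable size (a sketch, assuming the same ciImage and view as in the question):
// Sketch: aspect-fit scale between the image extent and the drawable size.
let imageSize = ciImage.extent.size
let drawableSize = view.drawableSize
let scale = min(drawableSize.width / imageSize.width,
                drawableSize.height / imageSize.height)

let scaleFilter = CIFilter(name: "CILanczosScaleTransform")
scaleFilter?.setValue(ciImage, forKey: kCIInputImageKey)
scaleFilter?.setValue(scale, forKey: kCIInputScaleKey)
scaleFilter?.setValue(1.0, forKey: kCIInputAspectRatioKey)
let scaledImage = scaleFilter?.outputImage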
The problem is your call to context.render — you are calling render with bounds: origin .zero. That’s the lower left.
Placing the drawing in the correct spot is up to you. You need to work out where the right bounds origin should be, based on the image dimensions and your drawable size, and render there. If the size is wrong, you also need to apply a scale transform first.
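A sketch of what that could look like, assuming scaledImage is the (unwrapped) aspect-fit-scaled CIImage from above and context, currentDrawable, commandBuffer and colorSpace are the ones from the question. Because the bounds rectangle is given in the image's coordinate space, a negative origin shifts the image towards the centre of the drawable:
// Sketch: centre the already-scaled image inside the drawable.
let drawableSize = view.drawableSize
let xOffset = (drawableSize.width - scaledImage.extent.width) / 2
let yOffset = (drawableSize.height - scaledImage.extent.height) / 2

// The bounds rect is expressed in the image's coordinate space, so offsetting
// the origin negatively moves the image right/up within the drawable.
context.render(scaledImage,
               to: currentDrawable.texture,
               commandBuffer: commandBuffer,
               bounds: CGRect(x: -xOffset,
                              y: -yOffset,
                              width: drawableSize.width,
                              height: drawableSize.height),
               colorSpace: colorSpace)
This is essentially what the ImageMetalView class below does in its display(_:) method.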
Thanks to Tristan Hume's MetalTest2 I now have it working nicely in two synchronised scrollViews. The basics are in the subclass below - the renderer and shaders can be found at Tristan's MetalTest2 project. This class is managed by a viewController and is a subview of the scrollView's documentView. See image of the final result.
//
//  MetalLayerView.swift
//  MetalTest2
//
//  Created by Tristan Hume on 2019-06-19.
//  Copyright © 2019 Tristan Hume. All rights reserved.
//

import Cocoa

// Thanks to https://stackoverflow.com/questions/45375548/resizing-mtkview-scales-old-content-before-redraw
// for the recipe behind this, although I had to add presentsWithTransaction and the wait to make it glitch-free
class ImageMetalView: NSView, CALayerDelegate {
    var renderer: Renderer
    var metalLayer: CAMetalLayer!
    var commandQueue: MTLCommandQueue!
    var sourceTexture: MTLTexture!
    let colorSpace = CGColorSpaceCreateDeviceRGB()
    var context: CIContext!
    var ciMgr: CIManager?
    var showEdits: Bool = false

    var ciImage: CIImage? {
        didSet {
            self.metalLayer.setNeedsDisplay()
        }
    }

    @objc dynamic var fileUrl: URL? {
        didSet {
            if let url = fileUrl {
                self.ciImage = CIImage(contentsOf: url)
            }
        }
    }

    /// Bind to this property from the viewController to receive notifications of changes to CI filter parameters
    @objc dynamic var adjustmentsChanged: Bool = false {
        didSet {
            self.metalLayer.setNeedsDisplay()
        }
    }

    override init(frame: NSRect) {
        let _device = MTLCreateSystemDefaultDevice()!
        renderer = Renderer(pixelFormat: .bgra8Unorm, device: _device)
        self.commandQueue = _device.makeCommandQueue()
        self.context = CIContext()
        self.ciMgr = CIManager(context: self.context)
        super.init(frame: frame)
        self.wantsLayer = true
        self.layerContentsRedrawPolicy = .duringViewResize

        // This property only matters in the case of a rendering glitch, which shouldn't happen anymore
        // The .topLeft version makes glitches less noticeable for normal UIs,
        // while .scaleAxesIndependently matches what MTKView does and makes them very noticeable
        // self.layerContentsPlacement = .topLeft
        self.layerContentsPlacement = .scaleAxesIndependently
    }

    required init(coder: NSCoder) {
        fatalError("init(coder:) has not been implemented")
    }

    override func makeBackingLayer() -> CALayer {
        metalLayer = CAMetalLayer()
        metalLayer.pixelFormat = .bgra8Unorm
        metalLayer.device = renderer.device
        metalLayer.delegate = self

        // If you're using the strategy of .topLeft placement and not presenting with transaction
        // to just make the glitches less visible instead of eliminating them, it can help to make
        // the background color the same as the background of your app, so the glitch artifacts
        // (solid color bands at the edge of the window) are less visible.
        // metalLayer.backgroundColor = CGColor(red: 0.0, green: 0.0, blue: 0.0, alpha: 1.0)

        metalLayer.allowsNextDrawableTimeout = false

        // these properties are crucial to resizing working
        metalLayer.autoresizingMask = CAAutoresizingMask(arrayLiteral: [.layerHeightSizable, .layerWidthSizable])
        metalLayer.needsDisplayOnBoundsChange = true
        metalLayer.presentsWithTransaction = true
        metalLayer.framebufferOnly = false
        return metalLayer
    }

    override func setFrameSize(_ newSize: NSSize) {
        super.setFrameSize(newSize)
        self.size = newSize
        renderer.viewportSize.x = UInt32(newSize.width)
        renderer.viewportSize.y = UInt32(newSize.height)
        // the conversion below is necessary for high DPI drawing
        metalLayer.drawableSize = convertToBacking(newSize)
        self.viewDidChangeBackingProperties()
    }

    var size: CGSize = .zero

    // This will hopefully be called if the window moves between monitors of
    // different DPIs but I haven't tested this part
    override func viewDidChangeBackingProperties() {
        guard let window = self.window else { return }
        // This is necessary to render correctly on retina displays with the topLeft placement policy
        metalLayer.contentsScale = window.backingScaleFactor
    }

    func display(_ layer: CALayer) {
        if let drawable = metalLayer.nextDrawable(),
           let commandBuffer = commandQueue.makeCommandBuffer() {
            let passDescriptor = MTLRenderPassDescriptor()
            let colorAttachment = passDescriptor.colorAttachments[0]!
            colorAttachment.texture = drawable.texture
            colorAttachment.loadAction = .clear
            colorAttachment.storeAction = .store
            colorAttachment.clearColor = MTLClearColor(red: 0, green: 0, blue: 0, alpha: 0)

            if let outputImage = self.ciImage {
                let xscale = self.size.width / outputImage.extent.width
                let yscale = self.size.height / outputImage.extent.height
                let scale = min(xscale, yscale)
                if let scaledImage = self.ciMgr!.scaleTransformFilter(outputImage, scale: scale, aspectRatio: 1),
                   let processed = self.showEdits ? self.ciMgr!.processImage(inputImage: scaledImage) : scaledImage {
                    let x = self.size.width / 2 - processed.extent.width / 2
                    let y = self.size.height / 2 - processed.extent.height / 2
                    context.render(processed,
                                   to: drawable.texture,
                                   commandBuffer: commandBuffer,
                                   bounds: CGRect(x: -x, y: -y, width: self.size.width, height: self.size.height),
                                   colorSpace: colorSpace)
                }
            } else {
                print("Image is nil")
            }

            commandBuffer.commit()
            commandBuffer.waitUntilScheduled()
            drawable.present()
        }
    }
}

Firebase ML kit misaligned bounding box

I'm trying to use the new Detect and Track Objects feature of ML Kit on iOS; however, I seem to be running into a roadblock with the object detection bounding box.
Using a Lego figure as an example, the image orientation is converted to always be .up, as per the documentation, yet the bounding box almost seems to be rotated 90 degrees from the correct dimensions, despite the image orientation being correct. Similar behaviour exists with other objects too, with the box being offset.
let options = VisionObjectDetectorOptions()
options.detectorMode = .singleImage
options.shouldEnableMultipleObjects = false

let objectDetector = Vision.vision().objectDetector(options: options)
let image = VisionImage(image: self.originalImage)

objectDetector.process(image) { detectedObjects, error in
    guard error == nil else {
        print(error)
        return
    }
    guard let detectedObjects = detectedObjects, !detectedObjects.isEmpty else {
        print("No objects detected")
        return
    }
    let primaryObject = detectedObjects.first
    print(primaryObject as Any)
    guard let objectFrame = primaryObject?.frame else { return }
    print(objectFrame)
    self.imageView.image = self.drawOccurrencesOnImage([objectFrame], self.originalImage)
}
and the function that draws the red box:
private func drawOccurrencesOnImage(_ occurrences: [CGRect], _ image: UIImage) -> UIImage? {
    let imageSize = image.size
    let scale: CGFloat = 0.0
    UIGraphicsBeginImageContextWithOptions(imageSize, false, scale)

    image.draw(at: CGPoint.zero)
    let ctx = UIGraphicsGetCurrentContext()
    ctx?.addRects(occurrences)
    ctx?.setStrokeColor(UIColor.red.cgColor)
    ctx?.setLineWidth(20)
    ctx?.strokePath()

    guard let drawnImage = UIGraphicsGetImageFromCurrentImageContext() else {
        return nil
    }
    UIGraphicsEndImageContext()
    return drawnImage
}
The image dimensions, according to image.size, are (3024.0, 4032.0), and the box frame is (1274.0, 569.0, 1299.0, 2023.0). Any insight into this behaviour would be much appreciated.
It turned out I was not scaling the image properly, which caused the misalignment. This function ended up fixing my problems:
public func updateImageView(with image: UIImage) {
    let orientation = UIApplication.shared.statusBarOrientation
    var scaledImageWidth: CGFloat = 0.0
    var scaledImageHeight: CGFloat = 0.0
    switch orientation {
    case .portrait, .portraitUpsideDown, .unknown:
        scaledImageWidth = imageView.bounds.size.width
        scaledImageHeight = image.size.height * scaledImageWidth / image.size.width
    case .landscapeLeft, .landscapeRight:
        // Compute the height first; the width depends on it.
        scaledImageHeight = imageView.bounds.size.height
        scaledImageWidth = image.size.width * scaledImageHeight / image.size.height
    }
    DispatchQueue.global(qos: .userInitiated).async {
        // Scale image while maintaining aspect ratio so it displays better in the UIImageView.
        var scaledImage = image.scaledImage(
            with: CGSize(width: scaledImageWidth, height: scaledImageHeight)
        )
        scaledImage = scaledImage ?? image
        guard let finalImage = scaledImage else { return }
        DispatchQueue.main.async {
            self.imageView.image = finalImage
            self.processImage(finalImage)
        }
    }
}
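If you would rather keep the original image and only transform the detected frame, one alternative (a sketch; the math assumes an aspect-fit imageView and is mine, not from ML Kit) is to map the frame from image coordinates into view coordinates:
import UIKit

// Sketch: convert a frame reported in image (pixel) coordinates into the
// coordinate space of a UIImageView using .scaleAspectFit content mode.
func convertToViewSpace(_ frame: CGRect, imageSize: CGSize, viewSize: CGSize) -> CGRect {
    let scale = min(viewSize.width / imageSize.width,
                    viewSize.height / imageSize.height)
    // Letterboxing offsets introduced by aspect-fit.
    let xOffset = (viewSize.width - imageSize.width * scale) / 2
    let yOffset = (viewSize.height - imageSize.height * scale) / 2
    return CGRect(x: frame.minX * scale + xOffset,
                  y: frame.minY * scale + yOffset,
                  width: frame.width * scale,
                  height: frame.height * scale)
}
With that, the red box could be drawn in a view-sized overlay instead of into a rescaled copy of the image.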

swift - convert UIImage to pure Black&White and detect DataMatrix

I really need help. I'm creating a DataMatrix reader; some of the codes have a plain white background only and cause no problems with AVFoundation, but others have a grey background with shimmer (see image below), and this is driving me crazy.
What I've tried:
1) AVFoundation with its metadata output works perfectly, but only with a white background; there was no success with the shimmering grey.
2) ZXing - I actually can't find any working example for Swift, and their sample from GitHub finds no DataMatrix on grey either (with DataMatrix as the barcode type); I would be thankful for a tutorial or something similar (ZXingObjC for Swift).
3) About 20 libraries from CocoaPods/GitHub - again nothing with the grey background.
4) Then I found that Vision perfectly detects DataMatrix on white from a photo, so I decided to work with this framework and changed the approach: no more catching video output directly, only UIImages, which I then process and scan for DataMatrix using the Vision framework.
And to convert colors I've tried:
CIFilters (CIColorControls, the noir effect) and GPUImage filters (monochrome, luminance, averageLuminance, adaptiveThreshold), playing with their parameters.
In the end I have no solution that works 10 out of 10 times with my DataMatrix stickers. Sometimes it works with GPUImageAverageLuminanceThresholdFilter and GPUImageAdaptiveThresholdFilter, but with only about 20% luck.
And that 20% luck is only in daylight; with electric light comes the shimmer and glitter, I think.
Any advice will be helpful! Maybe there is a nice solution with ZXing for Swift that I can't find. Or maybe there is no need to use Vision and grab frames from AVFoundation, but then how?
I-nigma etc. catch my stickers perfectly from live video, so there should be a way. The Android version of my scanner uses ZXing, and I guess ZXing does the job.
My scanning scheme:
fileprivate func createSession(input: AVCaptureDeviceInput) -> Bool {
    let session = AVCaptureSession()
    if session.canAddInput(input) {
        session.addInput(input)
    } else {
        return false
    }
    let output = AVCaptureVideoDataOutput()
    if session.canAddOutput(output) {
        output.setSampleBufferDelegate(self, queue: DispatchQueue(label: "com.output"))
        output.videoSettings = [String(kCVPixelBufferPixelFormatTypeKey): kCVPixelFormatType_32BGRA]
        output.alwaysDiscardsLateVideoFrames = true
        session.addOutput(output)
    }
    self.videoSession = session
    return true
}
extension ViewController: AVCaptureVideoDataOutputSampleBufferDelegate {

    func convert(cmage: CIImage) -> UIImage {
        let context: CIContext = CIContext(options: nil)
        let cgImage: CGImage = context.createCGImage(cmage, from: cmage.extent)!
        let image: UIImage = UIImage(cgImage: cgImage)
        return image
    }

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        let threshold: Double = 1.0 / 3
        let timeStamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
        currentTime = Double(timeStamp.value) / Double(timeStamp.timescale)
        if (currentTime - lastTime > threshold) {
            if let image = imageFromSampleBuffer(sampleBuffer: sampleBuffer),
               let cgImage = image.cgImage {
                let ciImage = CIImage(cgImage: cgImage)
                // with CIFilter
                // let blackAndWhiteImage = ciImage.applyingFilter("CIColorControls", parameters: [kCIInputContrastKey: 2.5,
                //                                                                                 kCIInputSaturationKey: 0,
                //                                                                                 kCIInputBrightnessKey: 0.5])
                // let imageToScan = convert(cmage: blackAndWhiteImage) // UIImage(ciImage: blackAndWhiteImage)
                // resImage = imageToScan
                // scanBarcode(cgImage: imageToScan.cgImage!)
                let filter = GPUImageAverageLuminanceThresholdFilter()
                filter.thresholdMultiplier = 0.7
                let imageToScan = filter.image(byFilteringImage: image)
                resImage = imageToScan!
                scanBarcode(cgImage: imageToScan!.cgImage!)
            }
        }
    }
    fileprivate func imageFromSampleBuffer(sampleBuffer: CMSampleBuffer) -> UIImage? {
        guard let imgBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
            return nil
        }
        // Lock the base address of the pixel buffer
        CVPixelBufferLockBaseAddress(imgBuffer, CVPixelBufferLockFlags.readOnly)
        // Get the base address of the pixel buffer
        let baseAddress = CVPixelBufferGetBaseAddress(imgBuffer)
        // Get the number of bytes per row for the pixel buffer
        let bytesPerRow = CVPixelBufferGetBytesPerRow(imgBuffer)
        // Get the pixel buffer width and height
        let width = CVPixelBufferGetWidth(imgBuffer)
        let height = CVPixelBufferGetHeight(imgBuffer)
        // Create a device-dependent RGB color space
        let colorSpace = CGColorSpaceCreateDeviceRGB()
        // Create a bitmap graphics context with the sample buffer data
        var bitmapInfo: UInt32 = CGBitmapInfo.byteOrder32Little.rawValue
        bitmapInfo |= CGImageAlphaInfo.premultipliedFirst.rawValue & CGBitmapInfo.alphaInfoMask.rawValue
        // let bitmapInfo: UInt32 = CGBitmapInfo.alphaInfoMask.rawValue
        let context = CGContext(data: baseAddress, width: width, height: height, bitsPerComponent: 8, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo)
        // Create a Quartz image from the pixel data in the bitmap graphics context
        let quartzImage = context?.makeImage()
        // Unlock the pixel buffer
        CVPixelBufferUnlockBaseAddress(imgBuffer, CVPixelBufferLockFlags.readOnly)
        if var image = quartzImage {
            if shouldInvert, let inverted = invertImage(image) {
                image = inverted
            }
            let output = UIImage(cgImage: image)
            return output
        }
        return nil
    }
    fileprivate func scanBarcode(cgImage: CGImage) {
        let barcodeRequest = VNDetectBarcodesRequest(completionHandler: { request, _ in
            self.parseResults(results: request.results)
        })
        let handler = VNImageRequestHandler(cgImage: cgImage, options: [.properties: ""])
        guard let _ = try? handler.perform([barcodeRequest]) else {
            return print("Could not scan")
        }
    }

    fileprivate func parseResults(results: [Any]?) {
        guard let results = results else {
            return print("No results")
        }
        print("GOT results - ", results.count)
        for result in results {
            if let barcode = result as? VNBarcodeObservation {
                if let code = barcode.payloadStringValue {
                    DispatchQueue.main.async {
                        self.videoSession?.stopRunning()
                        self.resultLabel.text = code
                        self.blackWhiteImageView.image = self.resImage // just to check from what image the code was scanned
                    }
                } else {
                    print("No results 2")
                }
            } else {
                print("No results 1")
            }
        }
    }
}
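As an aside, and only as an untested sketch: Vision can also be handed the CVPixelBuffer from the delegate callback directly, which skips the UIImage/CGContext round trip entirely; whether it copes any better with the shimmering grey background would still need to be verified:
import AVFoundation
import Vision

// Sketch: hand the camera frame straight to Vision, no UIImage conversion.
func scanBarcode(in sampleBuffer: CMSampleBuffer) {
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

    let barcodeRequest = VNDetectBarcodesRequest { request, _ in
        for case let barcode as VNBarcodeObservation in request.results ?? [] {
            if let code = barcode.payloadStringValue {
                print("Found code: \(code)")
            }
        }
    }

    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    do {
        try handler.perform([barcodeRequest])
    } catch {
        print("Could not scan: \(error)")
    }
}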

Combine two images with CGAffineTransform

I am using the new Apple Vision API's VNImageTranslationAlignmentObservation to get a CGAffineTransform returned. The idea is that you pass it two images that can be merged together and it returns the CGAffineTransform so that you can do so. I have managed to get the code working so that I get a CGAffineTransform returned, but after a lot of reading I'm at a loss as to how I can merge the two images with that information.
My code is here:
import UIKit
import Vision

class ImageTranslation {
    let referenceImage: CGImage!
    let floatingImage: CGImage!
    let imageTranslationRequest: VNTranslationalImageRegistrationRequest!

    init(referenceImage: CGImage, floatingImage: CGImage) {
        self.referenceImage = referenceImage
        self.floatingImage = floatingImage
        self.imageTranslationRequest = VNTranslationalImageRegistrationRequest(targetedCGImage: floatingImage, completionHandler: nil)
    }

    func handleImageTranslationRequest() -> UIImage {
        var alignmentTransform: CGAffineTransform!
        let vnImage = VNSequenceRequestHandler()
        try? vnImage.perform([imageTranslationRequest], on: referenceImage)
        if let results = imageTranslationRequest.results as? [VNImageTranslationAlignmentObservation] {
            print("Image Transformations found \(results.count)")
            results.forEach { result in
                alignmentTransform = result.alignmentTransform
                print(alignmentTransform)
            }
        }
        return applyTransformation(alignmentTransform)
    }

    private func applyTransformation(_ transform: CGAffineTransform) -> UIImage {
        let image = UIImage(cgImage: referenceImage)
        return image
    }
}
The printed transform I get looks like this: CGAffineTransform(a: 1.0, b: 0.0, c: 0.0, d: 1.0, tx: 672.0, ty: 894.0)
How can I apply this to the two images passed in?
I've been playing with some examples (portrait images only but should work with landscape also) and this has worked:
func mergeImages(first image1: UIImage, second image2: UIImage, transformation: CGAffineTransform) -> UIImage {
    let size = CGSize(width: image1.size.width + image2.size.width - (image2.size.width - transformation.tx),
                      height: image1.size.height + image2.size.height - (image2.size.height - transformation.ty))
    let renderer = UIGraphicsImageRenderer(size: size)
    return renderer.image { context in
        let pointImg2 = CGPoint.zero.applying(transformation)
        image2.draw(at: pointImg2)

        let pointImg1 = CGPoint.zero
        image1.draw(at: pointImg1)
    }
}
Let me know if it isn't working (upload your sample images if you can) and I'll fix it.
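For completeness, a sketch of how this could be wired up to the question's request (referenceImage and floatingImage are the CGImages from the question; the rest is illustrative):
// Sketch: run the registration request, then feed the resulting transform
// into mergeImages(first:second:transformation:).
let request = VNTranslationalImageRegistrationRequest(targetedCGImage: floatingImage, completionHandler: nil)
let handler = VNSequenceRequestHandler()
try? handler.perform([request], on: referenceImage)

if let observation = request.results?.first as? VNImageTranslationAlignmentObservation {
    let merged = mergeImages(first: UIImage(cgImage: referenceImage),
                             second: UIImage(cgImage: floatingImage),
                             transformation: observation.alignmentTransform)
    // `merged` now contains both images composited with the offset Vision reported.
}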