I am using the new Apple Vision API's VNImageTranslationAlignmentObservation to get a CGAffineTransform back. The idea is that you pass in two images that can be merged together, and it returns the CGAffineTransform that lets you do so. I have managed to get the code working so that a CGAffineTransform is returned, but after a lot of reading I'm at a loss as to how I can merge the two images with that information.
My code is here:
import UIKit
import Vision

class ImageTranslation {

    let referenceImage: CGImage!
    let floatingImage: CGImage!
    let imageTranslationRequest: VNTranslationalImageRegistrationRequest!

    init(referenceImage: CGImage, floatingImage: CGImage) {
        self.referenceImage = referenceImage
        self.floatingImage = floatingImage
        self.imageTranslationRequest = VNTranslationalImageRegistrationRequest(targetedCGImage: floatingImage, completionHandler: nil)
    }

    func handleImageTranslationRequest() -> UIImage {
        var alignmentTransform: CGAffineTransform!

        let vnImage = VNSequenceRequestHandler()
        try? vnImage.perform([imageTranslationRequest], on: referenceImage)

        if let results = imageTranslationRequest.results as? [VNImageTranslationAlignmentObservation] {
            print("Image Transformations found \(results.count)")
            results.forEach { result in
                alignmentTransform = result.alignmentTransform
                print(alignmentTransform)
            }
        }

        return applyTransformation(alignmentTransform)
    }

    private func applyTransformation(_ transform: CGAffineTransform) -> UIImage {
        let image = UIImage(cgImage: referenceImage)
        return image
    }
}
The printed transform I get looks like this: CGAffineTransform(a: 1.0, b: 0.0, c: 0.0, d: 1.0, tx: 672.0, ty: 894.0)
How can I apply this to the two images passed in?
I've been playing with some examples (portrait images only, but it should work with landscape as well) and this has worked:
func mergeImages(first image1: UIImage, second image2: UIImage, transformation: CGAffineTransform) -> UIImage {
    // The canvas is image1 extended by the translation Vision reported.
    let size = CGSize(width: image1.size.width + image2.size.width - (image2.size.width - transformation.tx),
                      height: image1.size.height + image2.size.height - (image2.size.height - transformation.ty))
    let renderer = UIGraphicsImageRenderer(size: size)

    return renderer.image { context in
        // Draw the floating image offset by the alignment transform...
        let pointImg2 = CGPoint.zero.applying(transformation)
        image2.draw(at: pointImg2)

        // ...then the reference image at the origin, on top.
        let pointImg1 = CGPoint.zero
        image1.draw(at: pointImg1)
    }
}
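For completeness, a rough sketch of how the two pieces could be wired together (referenceCGImage and floatingCGImage are placeholder names, and the transform values are just the ones printed in the question):
// Hypothetical call site; in practice you would return the alignment transform
// from handleImageTranslationRequest() (or hand it to applyTransformation)
// rather than discarding it, and pass it straight into mergeImages.
let translation = ImageTranslation(referenceImage: referenceCGImage, floatingImage: floatingCGImage)
_ = translation.handleImageTranslationRequest() // prints the alignment transform

let merged = mergeImages(first: UIImage(cgImage: referenceCGImage),
                         second: UIImage(cgImage: floatingCGImage),
                         transformation: CGAffineTransform(translationX: 672.0, y: 894.0))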
Let me know if it isn't working (upload your sample images if you can) and I'll fix it.
I am trying to iterate over images in the Photo Library and extract faces using CIDetector. The images are required to keep their original resolutions. To do so, I take the following steps:
1- Getting assets given a date interval (usually more than a year)
func loadAssets(from fromDate: Date, to toDate: Date, completion: @escaping ([PHAsset]) -> Void) {
    fetchQueue.async {
        let authStatus = PHPhotoLibrary.authorizationStatus()
        if authStatus == .authorized || authStatus == .limited {
            let options = PHFetchOptions()
            options.predicate = NSPredicate(format: "creationDate >= %@ && creationDate <= %@", fromDate as CVarArg, toDate as CVarArg)
            options.sortDescriptors = [NSSortDescriptor(key: "creationDate", ascending: false)]
            let result: PHFetchResult = PHAsset.fetchAssets(with: .image, options: options)
            var _assets = [PHAsset]()
            result.enumerateObjects { object, count, stop in
                _assets.append(object)
            }
            completion(_assets)
        } else {
            completion([])
        }
    }
}
where:
let fetchQueue = DispatchQueue.global(qos: .background)
2- Extracting faces
I then extract face images using:
func detectFaces(in image: UIImage, accuracy: String = CIDetectorAccuracyLow, completion: @escaping ([UIImage]) -> Void) {
    faceDetectionQueue.async {
        var faceImages = [UIImage]()
        let outputImageSize: CGFloat = 200.0 / image.scale
        guard let ciImage = CIImage(image: image),
              let faceDetector = CIDetector(ofType: CIDetectorTypeFace, context: nil, options: [CIDetectorAccuracy: accuracy]) else {
            completion(faceImages)
            return
        }
        let faces = faceDetector.features(in: ciImage) // Crash happens here
        let group = DispatchGroup()
        for face in faces {
            group.enter()
            if let face = face as? CIFaceFeature {
                let faceBounds = face.bounds
                let offset: CGFloat = floor(min(faceBounds.width, faceBounds.height) * 0.2)
                let inset = UIEdgeInsets(top: -offset, left: -offset, bottom: -offset, right: -offset)
                let rect = faceBounds.inset(by: inset)
                let croppedFaceImage = ciImage.cropped(to: rect)
                let scaledImage = croppedFaceImage
                    .transformed(by: CGAffineTransform(scaleX: outputImageSize / croppedFaceImage.extent.width,
                                                       y: outputImageSize / croppedFaceImage.extent.height))
                faceImages.append(UIImage(ciImage: scaledImage))
                group.leave()
            } else {
                group.leave()
            }
        }
        group.notify(queue: self.faceDetectionQueue) {
            completion(faceImages)
        }
    }
}
where
private let faceDetectionQueue = DispatchQueue(label: "face detection queue",
                                               qos: DispatchQoS.background,
                                               attributes: [],
                                               autoreleaseFrequency: DispatchQueue.AutoreleaseFrequency.workItem,
                                               target: nil)
I use the following extension to get the image from assets:
extension PHAsset {
    var image: UIImage {
        autoreleasepool {
            let manager = PHImageManager.default()
            let options = PHImageRequestOptions()
            var thumbnail = UIImage()
            let rect = CGRect(x: 0, y: 0, width: pixelWidth, height: pixelHeight)
            options.isSynchronous = true
            options.deliveryMode = .highQualityFormat
            options.resizeMode = .exact
            options.normalizedCropRect = rect
            options.isNetworkAccessAllowed = true
            manager.requestImage(for: self, targetSize: rect.size, contentMode: .aspectFit, options: options, resultHandler: { (result, info) -> Void in
                if let result = result {
                    thumbnail = result
                } else {
                    thumbnail = UIImage()
                }
            })
            return thumbnail
        }
    }
}
The code works fine for a few (usually fewer than 50) assets, but for a larger number of images it crashes at:
let faces = faceDetector.features(in: ciImage) // Crash happens here
I get this error:
validateComputeFunctionArguments:858: failed assertion `Compute Function(ciKernelMain): missing sampler binding at index 0 for [0].'
If I reduce the size of the image fed to detectFaces(in:), e.g. to 400 px, I can analyze a few hundred images (usually fewer than 1000), but as I mentioned, using the asset's image at its original size is a requirement. My guess is that it has something to do with a memory issue when I try to extract faces with CIDetector.
Any idea what this error is about and how I can fix the issue?
I can only guess what could be the issue here, so here are a few ideas:
A CIDetector is an expensive object, so try only to create a single one and re-use it for each image.
Use a single shared CIContext for performing all Core Image operations (more below). Also, pass that to the CIDetector on init. The context manages all resources needed for rendering an image. Sharing it will allow the system to re-use as many resources as possible.
The UIImage(ciImage:) constructor is really tricky. You see, CIImages are basically just recipes for creating an image, not actual bitmaps; they store the instructions for rendering the image. It takes a CIContext to do the actual rendering. When initializing a UIImage with a CIImage, you let UIKit decide how and when to render the image, which, in my experience, has caused a lot of issues for other users here on Stack Overflow.
Instead, you can use the shared CIContext I mentioned above to render the image first before you make it a UIImage:
let renderedImage = self.ciContext.createCGImage(scaledImage, from: scaledImage.extent).map(UIImage.init)
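Putting these ideas together, a minimal sketch might look like this (the FaceExtractor name is hypothetical, and the question's inset and scaling logic is omitted for brevity):
import CoreImage
import UIKit

// Sketch only: a single detector and a single context, created once and re-used.
final class FaceExtractor {

    private let ciContext = CIContext()
    private lazy var faceDetector = CIDetector(ofType: CIDetectorTypeFace,
                                               context: ciContext,
                                               options: [CIDetectorAccuracy: CIDetectorAccuracyLow])

    func faceImages(in image: UIImage) -> [UIImage] {
        guard let ciImage = CIImage(image: image),
              let detector = faceDetector else { return [] }

        return detector.features(in: ciImage).compactMap { feature in
            guard let face = feature as? CIFaceFeature else { return nil }
            // The question's inset and scaling logic would go here; this just crops to the face bounds.
            let cropped = ciImage.cropped(to: face.bounds)
            // Render with the shared context instead of relying on UIImage(ciImage:).
            guard let cgImage = ciContext.createCGImage(cropped, from: cropped.extent) else { return nil }
            return UIImage(cgImage: cgImage)
        }
    }
}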
Would someone please explain to me why this PDF thumbnail generator I'm attempting to use always returns nil? I'm trying to get a thumbnail to display in a UITableView alongside the filename of the PDF. Unfortunately, of the four or so thumbnail generators I've tried, none of them has returned anything other than nil.
func uploadPDF() {
    let types = UTType.types(tag: "pdf",
                             tagClass: UTTagClass.filenameExtension,
                             conformingTo: nil)
    let documentPickerController = UIDocumentPickerViewController(forOpeningContentTypes: types)
    documentPickerController.delegate = self
    self.present(documentPickerController, animated: true, completion: nil)
}
func documentPicker(_ controller: UIDocumentPickerViewController, didPickDocumentsAt urls: [URL]) {
    for url in urls {
        let thumbnail = thumbnailFromPdf(withUrl: url, pageNumber: 0)
        self.modelController.bidPDFUploadThumbnails.append(thumbnail!)
    }
    tableView.reloadData()
}
func thumbnailFromPdf(withUrl url: URL, pageNumber: Int, width: CGFloat = 240) -> UIImage? {
    guard let pdf = CGPDFDocument(url as CFURL),
          let page = pdf.page(at: pageNumber)
    else {
        return nil
    }

    var pageRect = page.getBoxRect(.mediaBox)
    let pdfScale = width / pageRect.size.width
    pageRect.size = CGSize(width: pageRect.size.width * pdfScale, height: pageRect.size.height * pdfScale)
    pageRect.origin = .zero

    UIGraphicsBeginImageContext(pageRect.size)
    let context = UIGraphicsGetCurrentContext()!

    // White BG
    context.setFillColor(UIColor.white.cgColor)
    context.fill(pageRect)
    context.saveGState()

    // The next 3 lines rotate the page so that it faces the right direction
    context.translateBy(x: 0.0, y: pageRect.size.height)
    context.scaleBy(x: 1.0, y: -1.0)
    context.concatenate(page.getDrawingTransform(.mediaBox, rect: pageRect, rotate: 0, preserveAspectRatio: true))

    context.drawPDFPage(page)
    context.restoreGState()

    let image = UIGraphicsGetImageFromCurrentImageContext()
    UIGraphicsEndImageContext()

    return image
}
Generator source: Thumbnail Generator
A PDF document's pages start at page 1, not 0, because CGPDFDocument is not indexed like an array.
So the simple fix is:
let thumbnail = thumbnailFromPdf(withUrl: url, pageNumber: 1)
and you'll get your thumbnail.
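Alternatively, if you want to keep the zero-based loop in documentPicker(_:didPickDocumentsAt:), you could shift the index inside the helper instead (a sketch of the same idea):
// Inside thumbnailFromPdf: CGPDFDocument pages are 1-based, so translate the zero-based index.
guard let pdf = CGPDFDocument(url as CFURL),
      let page = pdf.page(at: pageNumber + 1)
else {
    return nil
}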
Rather than rendering the page yourself, you can access a page's thumbnail directly via PDFKit, as follows (note that PDFDocument's page(at:) is zero-based):
import PDFKit

func generatePdfThumbnail(of thumbnailSize: CGSize, for documentUrl: URL, atPage pageIndex: Int) -> UIImage? {
    let pdfDocument = PDFDocument(url: documentUrl)
    let pdfDocumentPage = pdfDocument?.page(at: pageIndex)
    return pdfDocumentPage?.thumbnail(of: thumbnailSize, for: PDFDisplayBox.trimBox)
}
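For example, the delegate method from the question could use it like this (the 240 x 320 point size is just an illustrative value):
func documentPicker(_ controller: UIDocumentPickerViewController, didPickDocumentsAt urls: [URL]) {
    for url in urls {
        // PDFKit's page(at:) is zero-based, so 0 is the first page here.
        if let thumbnail = generatePdfThumbnail(of: CGSize(width: 240, height: 320), for: url, atPage: 0) {
            self.modelController.bidPDFUploadThumbnails.append(thumbnail)
        }
    }
    tableView.reloadData()
}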
I really need help. I'm creating a DataMatrix reader. Some of the codes have a plain white background and cause no problems with AVFoundation, but others have a grey, shimmering background (see image below), and this is driving me crazy.
What I've tried:
1) AVFoundation with its metadata output works perfectly, but only with the white background; there was no success with the shimmering grey one.
2) ZXing - I actually can't find any working example for Swift, and their sample from GitHub also finds no DataMatrix on grey (with DataMatrix selected as the code type); I would be thankful for a tutorial or something similar on using zxingObjC from Swift.
3) About 20 libraries from CocoaPods/GitHub - again, nothing with the grey background.
4) Then I found that Vision perfectly detects DataMatrix on white from a photo, so I decided to work with this framework and change the approach: no more processing the video output directly, only UIImages, which I then preprocess and run DataMatrix detection on with the Vision framework.
And to convert the colors I've tried:
CIFilters (CIColorControls, a noir effect) and GPUImage filters (monochrome, luminance, averageLuminance, adaptiveThreshold), playing with the parameters.
In the end I have no solution that works 10 out of 10 times with my DataMatrix stickers. Sometimes it works with GPUImageAverageLuminanceThresholdFilter and GPUImageAdaptiveThresholdFilter, but only about 20% of the time.
And that 20% of luck is only in daylight; under electric light the shimmer and glitter come back, I think.
Any advice will be helpful! Maybe there is a nice solution with ZXing for Swift which I can't find. Or maybe there is no need to use Vision and grab frames from AVFoundation at all, but then how?
I-nigma and similar apps catch my stickers perfectly from live video, so there should be a way. The Android version of my scanner uses ZXing, and I guess ZXing does the job there.
My scanning scheme:
fileprivate func createSession(input: AVCaptureDeviceInput) -> Bool {
    let session = AVCaptureSession()

    if session.canAddInput(input) {
        session.addInput(input)
    } else {
        return false
    }

    let output = AVCaptureVideoDataOutput()
    if session.canAddOutput(output) {
        output.setSampleBufferDelegate(self, queue: DispatchQueue(label: "com.output"))
        output.videoSettings = [String(kCVPixelBufferPixelFormatTypeKey): kCVPixelFormatType_32BGRA]
        output.alwaysDiscardsLateVideoFrames = true
        session.addOutput(output)
    }

    self.videoSession = session
    return true
}
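For context, a sketch of how createSession(input:) might be driven; the camera lookup and startRunning call are assumptions and not part of the question:
// Hypothetical call site for createSession(input:).
func startScanning() {
    guard let camera = AVCaptureDevice.default(.builtInWideAngleCamera, for: .video, position: .back),
          let input = try? AVCaptureDeviceInput(device: camera),
          createSession(input: input) else {
        return
    }
    videoSession?.startRunning()
}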
extension ViewController: AVCaptureVideoDataOutputSampleBufferDelegate {

    func convert(cmage: CIImage) -> UIImage {
        let context: CIContext = CIContext(options: nil)
        let cgImage: CGImage = context.createCGImage(cmage, from: cmage.extent)!
        let image: UIImage = UIImage(cgImage: cgImage)
        return image
    }
    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        let threshold: Double = 1.0 / 3
        let timeStamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
        currentTime = Double(timeStamp.value) / Double(timeStamp.timescale)
        if (currentTime - lastTime > threshold) {
            if let image = imageFromSampleBuffer(sampleBuffer: sampleBuffer),
               let cgImage = image.cgImage {
                let ciImage = CIImage(cgImage: cgImage)

                // with CIFilter
                // let blackAndWhiteImage = ciImage.applyingFilter("CIColorControls", parameters: [kCIInputContrastKey: 2.5,
                //                                                                                 kCIInputSaturationKey: 0,
                //                                                                                 kCIInputBrightnessKey: 0.5])
                // let imageToScan = convert(cmage: blackAndWhiteImage) //UIImage(ciImage: blackAndWhiteImage)
                // resImage = imageToScan
                // scanBarcode(cgImage: imageToScan.cgImage!)

                let filter = GPUImageAverageLuminanceThresholdFilter()
                filter.thresholdMultiplier = 0.7
                let imageToScan = filter.image(byFilteringImage: image)
                resImage = imageToScan!
                scanBarcode(cgImage: imageToScan!.cgImage!)
            }
        }
    }
    fileprivate func imageFromSampleBuffer(sampleBuffer: CMSampleBuffer) -> UIImage? {
        guard let imgBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
            return nil
        }

        // Lock the base address of the pixel buffer
        CVPixelBufferLockBaseAddress(imgBuffer, CVPixelBufferLockFlags.readOnly)

        // Get the base address and the number of bytes per row for the pixel buffer
        let baseAddress = CVPixelBufferGetBaseAddress(imgBuffer)
        let bytesPerRow = CVPixelBufferGetBytesPerRow(imgBuffer)

        // Get the pixel buffer width and height
        let width = CVPixelBufferGetWidth(imgBuffer)
        let height = CVPixelBufferGetHeight(imgBuffer)

        // Create a device-dependent RGB color space
        let colorSpace = CGColorSpaceCreateDeviceRGB()

        // Create a bitmap graphics context with the sample buffer data
        var bitmapInfo: UInt32 = CGBitmapInfo.byteOrder32Little.rawValue
        bitmapInfo |= CGImageAlphaInfo.premultipliedFirst.rawValue & CGBitmapInfo.alphaInfoMask.rawValue
        //let bitmapInfo: UInt32 = CGBitmapInfo.alphaInfoMask.rawValue
        let context = CGContext(data: baseAddress, width: width, height: height, bitsPerComponent: 8, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo)

        // Create a Quartz image from the pixel data in the bitmap graphics context
        let quartzImage = context?.makeImage()

        // Unlock the pixel buffer
        CVPixelBufferUnlockBaseAddress(imgBuffer, CVPixelBufferLockFlags.readOnly)

        if var image = quartzImage {
            if shouldInvert, let inverted = invertImage(image) {
                image = inverted
            }
            let output = UIImage(cgImage: image)
            return output
        }

        return nil
    }
    fileprivate func scanBarcode(cgImage: CGImage) {
        let barcodeRequest = VNDetectBarcodesRequest(completionHandler: { request, _ in
            self.parseResults(results: request.results)
        })
        let handler = VNImageRequestHandler(cgImage: cgImage, options: [.properties: ""])
        guard let _ = try? handler.perform([barcodeRequest]) else {
            return print("Could not scan")
        }
    }
    fileprivate func parseResults(results: [Any]?) {
        guard let results = results else {
            return print("No results")
        }
        print("GOT results - ", results.count)
        for result in results {
            if let barcode = result as? VNBarcodeObservation {
                if let code = barcode.payloadStringValue {
                    DispatchQueue.main.async {
                        self.videoSession?.stopRunning()
                        self.resultLabel.text = code
                        self.blackWhiteImageView.image = self.resImage // just to check from what image code scanned
                    }
                } else {
                    print("No results 2")
                }
            } else {
                print("No results 1")
            }
        }
    }
}
I am trying to create an imitation of the portrait mode in Apple's native camera.
The problem is that applying the blur effect using CIImage with respect to the depth data is too slow for the live preview I want to show to the user.
My code for this mission is:
func blur(image: CIImage, mask: CIImage, orientation: UIImageOrientation = .up, blurRadius: CGFloat) -> UIImage? {
    let start = Date()

    let invertedMask = mask.applyingFilter("CIColorInvert")

    let output = image.applyingFilter("CIMaskedVariableBlur", withInputParameters: ["inputMask": invertedMask,
                                                                                    "inputRadius": blurRadius])

    guard let cgImage = context.createCGImage(output, from: image.extent) else {
        return nil
    }

    let end = Date()
    let elapsed = end.timeIntervalSince1970 - start.timeIntervalSince1970
    print("took \(elapsed) seconds to apply blur")

    return UIImage(cgImage: cgImage, scale: 1.0, orientation: orientation)
}
I want to apply the blur on the GPU for better performance. For this task, I found this implementation provided by Apple here.
So in Apple's implementation, we have this code snippet:
/** Applies a Gaussian blur with a sigma value of 0.5.
    This is a pre-packaged convolution filter.
*/
class GaussianBlur: CommandBufferEncodable {
    let gaussian: MPSImageGaussianBlur

    required init(device: MTLDevice) {
        gaussian = MPSImageGaussianBlur(device: device,
                                        sigma: 5.0)
    }

    func encode(to commandBuffer: MTLCommandBuffer, sourceTexture: MTLTexture, destinationTexture: MTLTexture) {
        gaussian.encode(commandBuffer: commandBuffer,
                        sourceTexture: sourceTexture,
                        destinationTexture: destinationTexture)
    }
}
How can I apply the depth data to the filtering in the Metal blur version? Or in other words - how can I achieve the first code snippet's functionality with the performance of the second code snippet?
For anyone still looking: you need to get the currentDrawable first, inside the draw(in view: MTKView) method. Implement MTKViewDelegate:
func makeBlur() {
    device = MTLCreateSystemDefaultDevice()
    commandQueue = device.makeCommandQueue()

    selfView.mtkView.device = device
    selfView.mtkView.framebufferOnly = false
    selfView.mtkView.delegate = self

    let textureLoader = MTKTextureLoader(device: device)
    if let image = self.backgroundSnapshotImage?.cgImage,
       let texture = try? textureLoader.newTexture(cgImage: image, options: nil) {
        sourceTexture = texture
    }
}

func draw(in view: MTKView) {
    if let currentDrawable = view.currentDrawable,
       let commandBuffer = commandQueue.makeCommandBuffer() {
        let gaussian = MPSImageGaussianBlur(device: device, sigma: 5)
        gaussian.encode(commandBuffer: commandBuffer, sourceTexture: sourceTexture, destinationTexture: currentDrawable.texture)
        commandBuffer.present(currentDrawable)
        commandBuffer.commit()
    }
}
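A minimal sketch of the view controller these two methods might live in (the delegate conformance is the important part; selfView and backgroundSnapshotImage are assumed to exist as in the answer):
import UIKit
import MetalKit

// Hypothetical host; `selfView` (the container holding the MTKView) and
// `backgroundSnapshotImage` are assumed to exist as in the answer above.
class BlurViewController: UIViewController, MTKViewDelegate {

    var device: MTLDevice!
    var commandQueue: MTLCommandQueue!
    var sourceTexture: MTLTexture!

    override func viewDidLoad() {
        super.viewDidLoad()
        makeBlur() // sets up the device, command queue, MTKView delegate and source texture
    }

    // Required by MTKViewDelegate alongside draw(in:); nothing to do here.
    func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) { }

    // makeBlur() and draw(in:) from the answer above go here.
}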
I am trying to write a test that verifies an SKSpriteNode in my scene has the correct texture.
The test looks like this:
let sceneSprite = scene.childNodeWithName("sceneSprite") as SKSpriteNode!
let sprite = SKSpriteNode(imageNamed: expectedSpriteTexture)
sprite.size = sceneSprite.size // 102.4 x 136.533
XCTAssertTrue(sceneSprite.texture!.sameAs(sprite.texture!), "Scene sprite has wrong texture")
The sameAs method for SKTexture is implemented with the following extensions:
extension SKTexture {

    func sameAs(texture: SKTexture) -> Bool {
        return self.image.sameAs(texture.image)
    }

    var image: UIImage {
        let view = SKView(frame: CGRectMake(0, 0, size().width, size().height))
        let scene = SKScene(size: size())
        let sprite = SKSpriteNode(texture: self)
        sprite.position = CGPoint(x: CGRectGetMidX(view.frame), y: CGRectGetMidY(view.frame))
        scene.addChild(sprite)
        view.presentScene(scene)
        return self.imageWithView(view)
    }

    func imageWithView(view: UIView) -> UIImage {
        UIGraphicsBeginImageContextWithOptions(view.bounds.size, view.opaque, 0.0)
        view.drawViewHierarchyInRect(view.bounds, afterScreenUpdates: true)
        let image = UIGraphicsGetImageFromCurrentImageContext()
        UIGraphicsEndImageContext()
        return image
    }
}

extension UIImage {

    func sameAs(image: UIImage) -> Bool {
        let firstData = UIImagePNGRepresentation(self)
        let secondData = UIImagePNGRepresentation(image)
        return firstData.isEqual(secondData)
    }
}
The problem is that sometimes the test passes and sometimes it fails. I have changed the code so it saves the images on failure, and discovered the test fails because, even though the first image is correct, the second image is completely black.
What can be done so the test will pass reliably?
This failure is happening on the simulator for iPad2.
I made the following change, and it seems to make the test pass reliably
func imageWithView(view: UIView) -> UIImage {
    UIGraphicsBeginImageContext(view.bounds.size)
    view.drawViewHierarchyInRect(view.bounds, afterScreenUpdates: false)
    let image = UIGraphicsGetImageFromCurrentImageContext()
    UIGraphicsEndImageContext()
    return image
}
What ideas do people have for why this allows the test to succeed?