Improving saved depth data image from TrueDepth sensor - swift

I'm trying to save depth data from an iPad Pro's Face ID TrueDepth sensor. I took this demo code and added the following code, wired to a simple button:
@IBAction func exportData(_ sender: Any) {
    let ciimage = CIImage(cvPixelBuffer: realDepthData.depthDataMap)
    let depthUIImage = UIImage(ciImage: ciimage)
    let data = depthUIImage.pngData()
    print("data: \(realDepthData.depthDataMap)")
    do {
        let directory = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
        let path = directory.appendingPathComponent("FaceIdData.png")
        try data!.write(to: path)
        let activityViewController = UIActivityViewController(activityItems: [path], applicationActivities: nil)
        activityViewController.popoverPresentationController?.sourceView = exportMeshButton
        present(activityViewController, animated: true, completion: nil)
    } catch {
        print("Unable to save image")
    }
}
realDepthData is a class property that I added and that I update in dataOutputSynchronizer:
func dataOutputSynchronizer(_ synchronizer: AVCaptureDataOutputSynchronizer,
                            didOutput synchronizedDataCollection: AVCaptureSynchronizedDataCollection) {
    ...
    let depthData = syncedDepthData.depthData
    let depthPixelBuffer = depthData.depthDataMap
    self.realDepthData = depthData
    ...
}
I'm able to save the image (greyscale), but I'm losing some depth information, notably in the background, where all objects are fully white. You can see this in the image below: the wall and the second person behind are not correctly appearing (all white). If I'm not mistaken, from what I've seen in the app, I should have more information!
Thanks!

Only 32-bit depth really makes sense here; you can check how much depth information an image actually holds by adjusting its gamma. The .exr and .hdr file formats support 32-bit values, while .png and .jpg are generally 8-bit, so writing the depth map to a PNG clips most of the range and far objects end up uniformly white. You should also consider the channel order when converting.
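For example, here is a minimal sketch (one possible approach, not from the original answer) that converts the depth map to 32-bit float and writes it as a floating-point TIFF via Core Image, which also preserves the range an 8-bit PNG clips. The .Lf (single-channel float) format and grayscale color space are assumptions about what the TIFF encoder accepts, and realDepthData is expected to be populated as in the question:
import AVFoundation
import CoreImage

// Sketch: preserve the full 32-bit depth range instead of squashing it into an 8-bit PNG.
func exportDepthTIFF(_ depthData: AVDepthData, to url: URL) throws {
    // Make sure we're working with 32-bit float depth (the sensor may deliver 16-bit half floats).
    let depth32 = depthData.converting(toDepthDataType: kCVPixelFormatType_DepthFloat32)
    let ciImage = CIImage(cvPixelBuffer: depth32.depthDataMap)

    let context = CIContext()
    try context.writeTIFFRepresentation(of: ciImage,
                                        to: url,
                                        format: .Lf,   // single-channel 32-bit float (assumption)
                                        colorSpace: CGColorSpaceCreateDeviceGray())
}
You could then call this from the export button instead of pngData() and share the resulting file with the same UIActivityViewController.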

Related

ARKit, RealityKit: add AnchorEntities in the world after loading saved ARWorldMap

I am developing a simple AR app, where the user can:
select an object to add to the world from a list of objects
manipulate the object, changing its position, its orientation and its scale
remove the object
save all the objects added and modified as an ARWorldMap
load a previously saved ARWorldMap
I'm having trouble restoring the scene to the way it was when the user loads a previously saved ARWorldMap. In particular, when I re-add the objects, their positions and orientations are completely different than before.
Here are some details about the app. I initially add the objects as AnchorEntities. Doing some research, I found out that ARWorldMap doesn't save AnchorEntities, only ARAnchors, so I created a singleton object (AnchorEntitiesContainer in the following code) that holds a list of all the AnchorEntities added by the user and gets restored when the user wants to load a saved ARWorldMap; a rough sketch of it is shown below.
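The container itself isn't shown in the question; a minimal sketch that matches the calls used below could look like this (the type and method names come from the question, the body is an assumption):
import RealityKit

// Hypothetical reconstruction of the singleton referenced in the question.
final class AnchorEntitiesContainer {
    private static let shared = AnchorEntitiesContainer()
    private var anchorEntities: [AnchorEntity] = []

    static func sharedAnchorEntitiesContainer() -> AnchorEntitiesContainer {
        shared
    }

    func addAnchorEntity(anchorEntity: AnchorEntity) {
        anchorEntities.append(anchorEntity)
    }

    func getAnchorEntities() -> [AnchorEntity] {
        anchorEntities
    }
}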
Here is the initial insertion of objects in the world:
func addObject(objectName: String) {
    let path = Bundle.main.path(forResource: objectName, ofType: "usdz")!
    let url = URL(fileURLWithPath: path)
    if let modelEntity = try? ModelEntity.loadModel(contentsOf: url) {
        modelEntity.name = objectName
        modelEntity.scale = [3.0, 3.0, 3.0]
        let anchor = AnchorEntity(plane: .vertical, minimumBounds: [0.2, 0.2])
        anchor.name = objectName + "_anchor"
        anchor.addChild(modelEntity)
        arView.scene.addAnchor(anchor)
        modelEntity.generateCollisionShapes(recursive: true)
    }
}
Here is the saving of the ARWorldMap and of the list of entities added:
func saveWorldMap() {
    print("Save world map clicked")
    self.arView.session.getCurrentWorldMap { worldMap, _ in
        guard let map = worldMap else {
            self.showAlert(title: "Can't get world map!", message: "Can't get current world map. Retry later.")
            return
        }
        for case let anchorEntity as AnchorEntity in self.arView.scene.anchors {
            AnchorEntitiesContainer.sharedAnchorEntitiesContainer().addAnchorEntity(anchorEntity: anchorEntity)
        }
        do {
            let data = try NSKeyedArchiver.archivedData(withRootObject: map, requiringSecureCoding: true)
            try data.write(to: URL(fileURLWithPath: self.worldMapFilePath), options: [.atomic])
        } catch {
            fatalError("Can't save map: \(error.localizedDescription)")
        }
    }
    showAlert(title: "Save world map", message: "AR World Map successfully saved!")
}
Here is the loading of the ARWorldMap and the re-insertion of the AnchorEntities:
func loadWorldMap() {
    print("Load world map clicked")
    guard let mapData = try? Data(contentsOf: URL(fileURLWithPath: self.worldMapFilePath)),
          let worldMap = try? NSKeyedUnarchiver.unarchivedObject(ofClass: ARWorldMap.self, from: mapData) else {
        return
    }
    let configuration = self.defaultConfiguration
    configuration.initialWorldMap = worldMap
    self.arView.session.run(configuration, options: [.resetTracking, .removeExistingAnchors])
    for anchorEntity in AnchorEntitiesContainer.sharedAnchorEntitiesContainer().getAnchorEntities() {
        self.arView.scene.addAnchor(anchorEntity)
    }
}
The problem is that the position and orientation of the AnchorEntities are completely different than before, so they are not where they should be.
Here are the things I tried:
I tried to save ModelEntities instead of AnchorEntities and repeat the initial insertion when loading the saved ARWorldMap but it didn't give the expected results.
I thought that maybe the problem was that the world origins are different when the ARWorldMap gets loaded, so I tried to restore the previous world origin but I don't know how to get that information and how to work with it.
I noticed that the ARWorldMap has a "center" parameter so I tried to modify the AnchorEntities transform matrix with that information but I never got what I wanted.
So my question is how do I load AnchorEntities into the world in exactly their previous positions and orientations when loading an ARWorldMap?

How to programmatically export 3D mesh as USDZ using ModelIO?

Is it possible to programmatically export a 3D mesh as a .usdz file using the ModelIO and MetalKit frameworks?
Here's the code:
import ARKit
import RealityKit
import MetalKit
import ModelIO
let asset = MDLAsset(bufferAllocator: allocator)
asset.add(mesh)

let filePath = FileManager.default.urls(for: .documentDirectory,
                                        in: .userDomainMask).first!

let usdz: URL = filePath.appendingPathComponent("model.usdz")

do {
    try asset.export(to: usdz)
    let controller = UIActivityViewController(activityItems: [usdz],
                                              applicationActivities: nil)
    controller.popoverPresentationController?.sourceView = sender
    self.present(controller, animated: true, completion: nil)
} catch let error {
    fatalError(error.localizedDescription)
}
When I press the Save button, I get an error.
Andy Jazz's answer is correct, but it needs modification in order to work in a sandboxed SwiftUI app:
First, the SCNScene needs to be rendered in order to export correctly. You can't create a bunch of nodes, stuff them into the scene's root node and call write() and get a correctly rendered usdz. It must first be put on screen in a SwiftUI SceneView, which causes all the assets to load, etc. I suppose you could instantiate a SCNRenderer and call prepare() on the root node, but that has some extra complications.
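If you do want to try the SCNRenderer route instead of putting the scene on screen first, an untested sketch might look like the following (the function name is mine, and the SCNScene is assumed to be the one you intend to export):
import SceneKit
import Metal

// Hypothetical sketch: ask SceneKit to load a scene's resources without putting it on screen.
func prepareForExport(_ scene: SCNScene) -> Bool {
    let renderer = SCNRenderer(device: MTLCreateSystemDefaultDevice(), options: nil)
    renderer.scene = scene
    // prepare(_:shouldAbortBlock:) synchronously loads geometry, textures and shaders for the node tree.
    return renderer.prepare(scene.rootNode, shouldAbortBlock: nil)
}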
Second, the Sandbox prevents a direct export to a URL provided by .fileExporter(). This is because SCNScene.write() works in two steps: it first creates a .usdc export, then zips the resulting files into a single .usdz. The intermediate files don't have the write privileges that the URL provided by .fileExporter() does (assuming you've set the Sandbox "User Selected File" privilege to "Read/Write"), so write() fails when the target directory is outside the Sandbox, even if the target URL itself is writeable.
My solution was to write a custom FileWrapper, which I return if the WriteConfiguration UTType is .usdz:
public class USDZExportFileWrapper: FileWrapper {
    var exportScene: SCNScene

    public init(scene: SCNScene) {
        exportScene = scene
        super.init(regularFileWithContents: Data())
    }

    required init?(coder inCoder: NSCoder) {
        fatalError("init(coder:) has not been implemented")
    }

    override public func write(to url: URL,
                               options: FileWrapper.WritingOptions = [],
                               originalContentsURL: URL?) throws {
        let tempFilePath = NSTemporaryDirectory() + UUID().uuidString + ".usdz"
        let tempURL = URL(fileURLWithPath: tempFilePath)
        exportScene.write(to: tempURL, delegate: nil)
        try FileManager.default.moveItem(at: tempURL, to: url)
    }
}
Usage in a ReferenceFileDocument:
public func fileWrapper(snapshot: Data, configuration: WriteConfiguration) throws -> FileWrapper {
    if configuration.contentType == .usdz {
        return USDZExportFileWrapper(scene: scene)
    }
    return .init(regularFileWithContents: snapshot)
}
8th January 2023
At the moment, iOS developers can still export only .usd, .usda and .usdc files; you can check this using the canExportFileExtension(_:) type method:
let usd = MDLAsset.canExportFileExtension("usd")
let usda = MDLAsset.canExportFileExtension("usda")
let usdc = MDLAsset.canExportFileExtension("usdc")
let usdz = MDLAsset.canExportFileExtension("usdz")
print(usd, usda, usdc, usdz)
It prints:
true true true false
However, you can easily export SceneKit scenes as .usdz files using the instance method write(to:options:delegate:progressHandler:).
let path = FileManager.default.urls(for: .documentDirectory,
                                    in: .userDomainMask)[0]
    .appendingPathComponent("file.usdz")

sceneKitScene.write(to: path,
                    options: nil,
                    delegate: nil,
                    progressHandler: nil)

Vision and ARKit frameworks in Xcode project

I want to create an ARKit app using Xcode. I want it to recognize a generic rectangle without the user pressing a button, and then have that rectangle trigger a certain function.
How can I do this?
You do not need ARKit to recognise rectangles, only Vision.
To recognise generic rectangles, use VNDetectRectanglesRequest, as sketched below.
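A minimal sketch of what that might look like (my own example, not from the original answer; the CVPixelBuffer input is assumed to come from your capture session or from ARFrame.capturedImage):
import Vision
import CoreVideo

// Sketch: detect rectangles in a camera frame with Vision alone (no ARKit needed).
func detectRectangles(in pixelBuffer: CVPixelBuffer) throws {
    let request = VNDetectRectanglesRequest { request, error in
        guard let rectangles = request.results as? [VNRectangleObservation] else { return }
        for rectangle in rectangles {
            // Corner points are normalized (0...1) relative to the image.
            print("Rectangle (confidence \(rectangle.confidence)):",
                  rectangle.topLeft, rectangle.topRight,
                  rectangle.bottomLeft, rectangle.bottomRight)
        }
    }
    request.maximumObservations = 1   // keep only the most prominent rectangle
    request.minimumConfidence = 0.8   // tune to taste

    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try handler.perform([request])
}
Run this on each camera frame and trigger your function when a detection comes back.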
As you rightly wrote, you can use the Vision or Core ML frameworks in your project along with ARKit. You also have to create a pre-trained machine learning model (an .mlmodel file) to classify input data and recognise your generic rectangle.
To create a learning model, use one of the following resources: TensorFlow, Turi, Caffe, or Keras.
With an .mlmodel containing classification tags, Vision requests return observation objects (VNClassificationObservation for classifiers, VNRecognizedObjectObservation for object-detection models) that identify what was found in the captured scene. So, if the image's corresponding tag turns up during recognition in the ARSKView, an ARAnchor can be created (and a SpriteKit/SceneKit object placed onto this ARAnchor).
Here's a code snippet showing how it works:
import UIKit
import ARKit
import Vision
import SpriteKit
.................................................................
// file – ARBridge.swift
class ARBridge {
    static let shared = ARBridge()
    var anchorsToIdentifiers = [ARAnchor: String]()
}
.................................................................
// file – Scene.swift
DispatchQueue.global(qos: .background).async {
    do {
        let model = try VNCoreMLModel(for: Inceptionv3().model)
        let request = VNCoreMLRequest(model: model, completionHandler: { (request, error) in
            DispatchQueue.main.async {
                guard let results = request.results as? [VNClassificationObservation],
                      let result = results.first else {
                    print("No results.")
                    return
                }
                var translation = matrix_identity_float4x4
                translation.columns.3.z = -0.75
                let transform = simd_mul(currentFrame.camera.transform, translation)
                let anchor = ARAnchor(transform: transform)
                ARBridge.shared.anchorsToIdentifiers[anchor] = result.identifier
                sceneView.session.add(anchor: anchor)
            }
        })
        let handler = VNImageRequestHandler(cvPixelBuffer: currentFrame.capturedImage, options: [:])
        try handler.perform([request])
    } catch {
        print(error)
    }
}
.................................................................
// file – ViewController.swift
func view(_ view: ARSKView, nodeFor anchor: ARAnchor) -> SKNode? {
    guard let identifier = ARBridge.shared.anchorsToIdentifiers[anchor] else {
        return nil
    }
    let labelNode = SKLabelNode(text: identifier)
    labelNode.horizontalAlignmentMode = .center
    labelNode.verticalAlignmentMode = .center
    labelNode.fontName = UIFont.boldSystemFont(ofSize: 24).fontName
    return labelNode
}
And you can download two Apple sample projects written by Vision engineers:
Recognizing Objects in Live Capture
Classifying Images with Vision and Core ML
Hope this helps.

Camera feed of dimensions one pixel by one pixel

This is a rather strange request, but I am looking to build an app that has a live camera feed taking up the whole screen. However, instead of displaying the normal resolution, it would all be one color. In particular, I want to take the color of what would normally be the middle pixel on the screen and make that take up the entire screen. It needs to be done live and fast.
I attempted to write a function that saved the capture session output as a UIImage and then got the pixel data from that; however, it proved too slow for real time. Any suggestions?
Assuming you have an AVCaptureSession set up, you need to add an AVCaptureVideoDataOutput and set its sample buffer delegate. The delegate class should implement captureOutput(_:didOutput:from:). Within this function you can access the pixel buffer and sample your centre point. You could do it as below; I've left the actual sampling of the centre point to you (see the sketch after the code).
class MyClass: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {

    func addVideoOutput() {
        // Add video data output.
        if session.canAddOutput(videoDataOutput) {
            videoDataOutput.setSampleBufferDelegate(self, queue: sessionQueue)
            videoDataOutput.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String: Int(kCVPixelFormatType_32BGRA)]
            videoDataOutput.alwaysDiscardsLateVideoFrames = true
            session.addOutput(videoDataOutput)
        }
    }

    // AVCaptureVideoDataOutputSampleBufferDelegate
    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        if let buffer = CMSampleBufferGetImageBuffer(sampleBuffer) {
            process(pixelBuffer: buffer)
        }
    }

    func process(pixelBuffer: CVPixelBuffer) {
        let sourceRowBytes = CVPixelBufferGetBytesPerRow(pixelBuffer)
        let width = CVPixelBufferGetWidth(pixelBuffer)
        let height = CVPixelBufferGetHeight(pixelBuffer)
        let rt = CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
        if rt == kCVReturnSuccess {
            // Do your processing of the pixel data here.
            CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly)
        }
    }

    private let session = AVCaptureSession()
    private let sessionQueue = DispatchQueue(label: "session queue", attributes: [], target: nil) // Communicate with the session
    private let videoDataOutput = AVCaptureVideoDataOutput()
}
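As a hedged follow-up (not part of the original answer), sampling the centre pixel of a 32BGRA buffer could look roughly like this:
import AVFoundation
import UIKit

// Rough sketch: read the centre pixel of a kCVPixelFormatType_32BGRA buffer.
func centerColor(of pixelBuffer: CVPixelBuffer) -> UIColor? {
    guard CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly) == kCVReturnSuccess,
          let base = CVPixelBufferGetBaseAddress(pixelBuffer) else { return nil }
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }

    let bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
    let x = CVPixelBufferGetWidth(pixelBuffer) / 2
    let y = CVPixelBufferGetHeight(pixelBuffer) / 2

    // Each pixel is 4 bytes in B, G, R, A order.
    let pixel = base.advanced(by: y * bytesPerRow + x * 4).assumingMemoryBound(to: UInt8.self)
    return UIColor(red: CGFloat(pixel[2]) / 255.0,
                   green: CGFloat(pixel[1]) / 255.0,
                   blue: CGFloat(pixel[0]) / 255.0,
                   alpha: 1.0)
}
Dispatch the resulting colour to the main queue and use it as the background colour of a full-screen view.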

CIDetector Not Detecting Faces with UIImagePickerController (Swift)

I am trying to take a picture with the camera and then detect the faces in it. But it doesn't work... The results array returns a count of zero. I have tested this code with a picture of somebody from the internet and it returned 1 found face. Here is my code:
// MARK: - UIImagePickerControllerDelegate Methods
func imagePickerController(picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [String : AnyObject]) {
    if let pickedImage = info[UIImagePickerControllerOriginalImage] as? UIImage {
        Idea.CurrentIdea.idea.mockups.append(PFFile(data: UIImageJPEGRepresentation(pickedImage, 0.5)!))

        // Face Detection
        let cid: CIDetector = CIDetector(ofType: CIDetectorTypeFace, context: nil, options: [CIDetectorAccuracy: CIDetectorAccuracyHigh])
        let cii = CIImage(CGImage: pickedImage.CGImage!)
        let results: NSArray = cid.featuresInImage(cii)
        print(results.count)
        for r in results {
            let face: CIFaceFeature = r as! CIFaceFeature
            NSLog("Face found at (%f,%f) of dimensions %fx%f", face.bounds.origin.x, face.bounds.origin.y, face.bounds.width, face.bounds.height)
        }
    }
    dismissViewControllerAnimated(true, completion: nil)
}
Any ideas? Thanks! There isn't much on the web about it recently.
Make sure that your image orientation and the expected image orientation for the detector are the same. See this answer for more detail: https://stackoverflow.com/a/17019107/919790
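For example, one simple way to avoid orientation mismatches (a hedged sketch in current Swift syntax, not from the original answer) is to redraw the picked image so its pixel data is in the .up orientation before handing it to the detector; alternatively, you can pass the CIDetectorImageOrientation option to features(in:options:) with the image's EXIF orientation value.
import UIKit

// Hypothetical helper: normalize a UIImage so its underlying pixels match its display orientation.
func normalizedForDetection(_ image: UIImage) -> UIImage {
    guard image.imageOrientation != .up else { return image }
    let renderer = UIGraphicsImageRenderer(size: image.size)
    return renderer.image { _ in
        // draw(in:) applies the orientation, producing .up-oriented pixel data.
        image.draw(in: CGRect(origin: .zero, size: image.size))
    }
}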