This question was also asked in the Apple Forum, but so far I have not seen any response there.
The question is really this: after finding a point of interest in a frame from an ARSession, how do I convert it into a 3D world coordinate?
How I got the point:
let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage, orientation: .up, options: [:])
let handPoseRequest = VNDetectHumanHandPoseRequest()
....
try handler.perform([handPoseRequest])
Then I need to raycast from the 2D point derived from ARFrame.capturedImage to a 3D world coordinate:
fileprivate func convertVNPointTo3D(_ point: VNRecognizedPoint,
_ session: ARSession,
_ frame: ARFrame,
_ viewSize: CGSize) -> Transform? {
let pointX = (point.x / Double(frame.camera.imageResolution.width))*Double(viewSize.width)
let pointY = (point.y / Double(frame.camera.imageResolution.height))*Double(viewSize.height)
let query = frame.raycastQuery(from: CGPoint(x: pointX, y: pointY), allowing: .estimatedPlane, alignment: .any)
let results = session.raycast(query)
if let first = results.first {
return Transform(matrix: first.worldTransform)
} else {
return nil
}
}
According to the API, I should use a UI point. However, I do not know how capturedImage is converted to UI points, and the calculation I used for the points is not correct.
Thanks.
The issue was the image orientation. In my case, using the iPad back camera in portrait orientation, I needed to use .downMirrored (instead of .up).
let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage, orientation: .downMirrored, options: [:])
Once the orientation is correct, the point values from image recognition can be used DIRECTLY for the raycast.
Related
I am working on a project where I have to keep a green dot in the center of the view at all times, even when the camera is rotated in ARKit. I am using ARSCNView and I have added the node; so far everything is good. Now I know I need to modify the position of the node in
func session(_ session: ARSession, didUpdate frame: ARFrame)
But I have no idea how to do that. I saw an example that was close to what I have, but it does not run as it is supposed to.
func session(_ session: ARSession, didUpdate frame: ARFrame) {
let location = sceneView.center
let hitTest = sceneView.hitTest(location, types: .featurePoint)
if hitTest.isEmpty {
print("No Plane Detected")
return
} else {
let columns = hitTest.first?.worldTransform.columns.3
let position = SCNVector3(x: columns!.x, y: columns!.y, z: columns!.z)
var node = sceneView.scene.rootNode.childNode(withName: "CenterShip", recursively: false) ?? nil
if node == nil {
let scene = SCNScene(named: "art.scnassets/ship.scn")!
node = scene.rootNode.childNode(withName: "ship", recursively: false)
node?.opacity = 0.7
let columns = hitTest.first?.worldTransform.columns.3
node!.name = "CenterShip"
node!.position = SCNVector3(x: columns!.x, y: columns!.y, z: columns!.z)
sceneView.scene.rootNode.addChildNode(node!)
}
let position2 = node?.position
if position == position2! {
return
} else {
//action
let action = SCNAction.move(to: position, duration: 0.1)
node?.runAction(action)
}
}
}
No matter how I rotate the camera, this dot must stay in the middle.
It's not clear exactly what you're trying to do, but I assume it's one of the following:
A) Place the green dot centered in front of the camera at a fixed distance, e.g. always exactly 1 meter in front of the camera.
B) Place the green dot centered in front of the camera at the depth of the nearest detected plane, i.e. using the results of a raycast from the midpoint of the ARSCNView.
I would have assumed A, but your example code is using the (now deprecated) sceneView.hitTest() function, which in this case would give you the depth of whatever is behind the pixel at sceneView.center.
Anyway, here are both:
Fixed Depth Solution
This is pretty straightforward, though there are a few options. The simplest is to make the green dot a child node of the scene's camera node and give it a position with a negative z value, since z increases as a position moves toward the camera.
cameraNode.addChildNode(textNode)
textNode.position = SCNVector3(x: 0, y: 0, z: -1)
As the camera moves, so too will its child nodes. More details in this very thorough answer
Scene Depth Solution
To determine the estimated depth behind a pixel, you should use ARSession.raycast instead of sceneView.hitTest, because the latter is deprecated.
Note that if raycast() (or hitTest()) returns an empty result set (not uncommon given the complexity of the scene estimation going on in ARKit), you won't have a position to update the node with, and thus it might not be directly centered in every frame. Handling this is a bit more complex, as you'd need to decide exactly what you want to do in that case.
The SCNAction is unnecessary and potentially causing problems. These delegate methods run at 60 fps, so simply updating the position directly will produce smooth results.
Adapting and simplifying the code you posted:
func createCenterShipNode() -> SCNNode {
let scene = SCNScene(named: "art.scnassets/ship.scn")!
let node = scene.rootNode.childNode(withName: "ship", recursively: false)
node!.opacity = 0.7
node!.name = "CenterShip"
sceneView.scene.rootNode.addChildNode(node!)
return node!
}
func session(_ session: ARSession, didUpdate frame: ARFrame) {
// Check the docs for what the different raycast query parameters mean, but these
// give you the depth of anything ARKit has detected
guard let query = sceneView.raycastQuery(from: sceneView.center, allowing: .estimatedPlane, alignment: .any) else {
return
}
let results = session.raycast(query)
if let hit = results.first {
let node = sceneView.scene.rootNode.childNode(withName: "CenterShip", recursively: false) ?? createCenterShipNode()
let pos = hit.worldTransform.columns.3
node.simdPosition = simd_float3(pos.x, pos.y, pos.z)
}
}
See also: ARRaycastQuery
One last note: you generally don't want to do scene manipulation within this delegate method. It runs on a different thread than the SceneKit rendering thread, and SceneKit is very thread sensitive. This will likely work fine, but anything beyond adding or moving a node will certainly cause crashes from time to time. You'd ideally want to store the new position, and then update the actual scene contents from within the renderer(_ renderer: SCNSceneRenderer, updateAtTime time: TimeInterval) delegate method.
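The "store the new position, then apply it on the render thread" idea can be sketched roughly as follows. This is only an illustration in plain Swift: `PendingPosition`, `store`, and `take` are hypothetical names, not part of ARKit or SceneKit, and the positions are made-up values.

```swift
import Foundation

// Hypothetical helper: holds the most recent position written by the ARSession
// delegate until the render loop picks it up.
final class PendingPosition {
    private let lock = NSLock()
    private var value: SIMD3<Float>?

    // Call from session(_:didUpdate:) with the raycast hit position.
    func store(_ position: SIMD3<Float>) {
        lock.lock()
        defer { lock.unlock() }
        value = position
    }

    // Call from renderer(_:updateAtTime:); returns nil when nothing new has arrived.
    func take() -> SIMD3<Float>? {
        lock.lock()
        defer { lock.unlock() }
        let latest = value
        value = nil
        return latest
    }
}

let pending = PendingPosition()
pending.store(SIMD3<Float>(0, 0, -1))
pending.store(SIMD3<Float>(0.1, 0, -1.2))   // a newer frame overwrites the older one
let first = pending.take()                  // the newest stored position
let second = pending.take()                 // nil, nothing new since the last take
```

Inside renderer(_:updateAtTime:) you would then do something like `if let p = pending.take() { node.simdPosition = p }`, so the node is only touched on SceneKit's own thread.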
I am trying to place an object above a QR code in Swift. I am able to detect the QR code; however, the location is wrong for the placement of the box. I don't really understand how the placement works. I know how to place objects down on planes. Do I need to relate the detected planes somehow with where it says it detects the QR code? Any information on how this SCNVector columns thing works would be appreciated as well, haha. Also, is CIDetector outdated, and is there a newer method?
Here is a snippet of detecting the QRCode and placing the box:
var discoveredQRCodes = [String]()
func session(_ session: ARSession, didUpdate frame: ARFrame) {
//print("Updated")
if time != 0.5 {
return
}
DispatchQueue.global(qos: .background).async {
let image = CIImage(cvPixelBuffer: frame.capturedImage)
let detector = CIDetector(ofType: CIDetectorTypeQRCode, context: nil, options: nil)
let features = detector!.features(in: image)
for feature in features as! [CIQRCodeFeature] {
if !self.discoveredQRCodes.contains(feature.messageString!) {
self.discoveredQRCodes.append(feature.messageString!)
let url = URL(string: feature.messageString!)
let position = SCNVector3(frame.camera.transform.columns.3.x,
frame.camera.transform.columns.3.y,
frame.camera.transform.columns.3.z)
// add3DModel(fromURL: url!, toPosition: getPositionBasedOnQRCode(frame: frame, position: "df"))
print(position)
print(url)
DispatchQueue.main.async {
let boxNode = SCNNode()
boxNode.geometry = SCNBox(width: 0.04, height: 0.04, length: 0.04, chamferRadius: 0.002)
boxNode.geometry?.firstMaterial?.diffuse.contents = UIColor.green
boxNode.position = position
boxNode.name = "node"
self.arView.scene.rootNode.addChildNode(boxNode)
}
//add3dInstance(fromURL: url!, toPosition: position)
}
}
}
}
Here is an image of the result:
Here is some debug output:
SCNVector3(x: 0.023941405, y: 0.040143043, z: 0.056782123)
First of all, you set the boxNode position to the camera position, which is not what you want.
Secondly, any QR code detector provides 2D bounding box coordinates in image space. To translate 2D coordinates to scene coordinates, you need to find a ray from the camera to the QR code plane.
Please check the code here.
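As a rough illustration of the ray step, here is a minimal ray-plane intersection in plain Swift. It is a sketch under assumptions, not the linked code: `dot` and `intersect` are hypothetical helper names, and the camera and plane values are made up. In a real app the ray origin would be the camera position, which is what frame.camera.transform.columns.3 holds (the translation column of the 4x4 transform), and ARKit's raycast APIs perform the intersection for you.

```swift
// Dot product written out so the sketch needs no imports.
func dot(_ a: SIMD3<Float>, _ b: SIMD3<Float>) -> Float {
    return (a * b).sum()
}

// Returns the point where a ray hits a plane, or nil if the ray is parallel
// to the plane or the plane lies behind the ray origin.
func intersect(rayOrigin: SIMD3<Float>,
               rayDirection: SIMD3<Float>,
               planePoint: SIMD3<Float>,
               planeNormal: SIMD3<Float>) -> SIMD3<Float>? {
    let denominator = dot(rayDirection, planeNormal)
    if abs(denominator) < 1e-6 { return nil }
    let t = dot(planePoint - rayOrigin, planeNormal) / denominator
    if t < 0 { return nil }
    return rayOrigin + rayDirection * t
}

// Made-up scene: camera 1 m above a horizontal plane at y = 0,
// looking forward and down at 45 degrees.
let hit = intersect(rayOrigin: SIMD3<Float>(0, 1, 0),
                    rayDirection: SIMD3<Float>(0, -1, -1),
                    planePoint: SIMD3<Float>(0, 0, 0),
                    planeNormal: SIMD3<Float>(0, 1, 0))

// A ray travelling parallel to the plane never hits it.
let miss = intersect(rayOrigin: SIMD3<Float>(0, 1, 0),
                     rayDirection: SIMD3<Float>(1, 0, 0),
                     planePoint: SIMD3<Float>(0, 0, 0),
                     planeNormal: SIMD3<Float>(0, 1, 0))
```

The box should be placed at the hit point rather than at the ray origin, which is the difference between the intended result and the code in the question.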
I am starting to use ARKit and I have a use case where I want to know the motion from a known position to another one.
So I was wondering if it is possible (like in every tracking solution) to set a known position and orientation as the starting point of the tracking in ARKit?
Regards
There are at least six approaches that allow you to set a starting point for a model. Using no ARAnchors at all in your ARScene is considered a bad AR experience (although Apple's Augmented Reality app template has no ARAnchors in its code).
First approach
This is the approach that Apple engineers propose in the Augmented Reality app template in Xcode. It doesn't use anchoring, so all you need to do is place a model in the air at coordinates like (x: 0, y: 0, z: -0.5); in other words, your model will be 50 cm away from the camera.
override func viewDidLoad() {
super.viewDidLoad()
sceneView.scene = SCNScene(named: "art.scnassets/ship.scn")!
let model = sceneView.scene.rootNode.childNode(withName: "ship",
recursively: true)
model?.position.z = -0.5
sceneView.session.run(ARWorldTrackingConfiguration())
}
Second approach
Second approach is almost the same as the first one, except it uses ARKit's anchor:
guard let sceneView = self.view as? ARSCNView
else { return }
if let currentFrame = sceneView.session.currentFrame {
var translation = matrix_identity_float4x4
translation.columns.3.z = -0.5
let transform = simd_mul(currentFrame.camera.transform, translation)
let anchor = ARAnchor(transform: transform)
sceneView.session.add(anchor: anchor)
}
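To see what the multiplication in the snippet above computes, here is a small sketch in plain Swift that runs without ARKit. `Matrix4` and `multiply` are hypothetical stand-ins for simd_float4x4 and simd_mul, and the camera pose is a made-up value; the point is only that `columns.3` carries the translation and that the offset is applied along the camera's own axes.

```swift
// Hypothetical stand-in for simd_float4x4: four columns, column-major,
// so `columns.3` holds the translation just like in ARKit transforms.
struct Matrix4 {
    var columns: (SIMD4<Float>, SIMD4<Float>, SIMD4<Float>, SIMD4<Float>)

    static let identity = Matrix4(columns: (
        SIMD4<Float>(1, 0, 0, 0),
        SIMD4<Float>(0, 1, 0, 0),
        SIMD4<Float>(0, 0, 1, 0),
        SIMD4<Float>(0, 0, 0, 1)
    ))

    // Matrix-vector product: a weighted sum of the columns.
    func transform(_ v: SIMD4<Float>) -> SIMD4<Float> {
        return columns.0 * v.x + columns.1 * v.y + columns.2 * v.z + columns.3 * v.w
    }
}

// Matrix-matrix product (same role as simd_mul(a, b)): apply `a` to each column of `b`.
func multiply(_ a: Matrix4, _ b: Matrix4) -> Matrix4 {
    return Matrix4(columns: (
        a.transform(b.columns.0),
        a.transform(b.columns.1),
        a.transform(b.columns.2),
        a.transform(b.columns.3)
    ))
}

// Made-up camera pose: rotated 90 degrees about the y axis and located at (1, 2, 3).
// After this rotation the camera's local -z axis points along world -x.
var camera = Matrix4.identity
camera.columns.0 = SIMD4<Float>(0, 0, -1, 0)
camera.columns.2 = SIMD4<Float>(1, 0, 0, 0)
camera.columns.3 = SIMD4<Float>(1, 2, 3, 1)

// Offset of 0.5 m along the camera's local -z axis, as in the snippet above.
var translation = Matrix4.identity
translation.columns.3.z = -0.5

let anchorTransform = multiply(camera, translation)
let anchorPosition = anchorTransform.columns.3
print(anchorPosition)
```

With this pose the anchor ends up half a meter from the camera along world -x, not world -z, because the translation is expressed in the camera's local space before being mapped into world space.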
Third approach
You can also pin a model to a predefined position with an ARAnchor using the third approach, where you need to import the RealityKit module as well:
func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
let model = ModelEntity(mesh: MeshResource.generateSphere(radius: 1.0))
// ARKit's anchor
let anchor = ARAnchor(transform: simd_float4x4(diagonal: [1,1,1,1]))
// RealityKit's anchor based on position of ARAnchor
let anchorEntity = AnchorEntity(anchor: anchor)
anchorEntity.addChild(model)
arView.scene.anchors.append(anchorEntity)
}
Fourth approach
If you have turned on plane detection, you can use ray-casting or hit-testing methods. As a target object you can use a little sphere (located at 0, 0, 0) that will be ray-casted.
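// ARSCNView's raycastQuery(from:allowing:alignment:) returns an optional query
guard let query = arView.raycastQuery(from: screenCenter,
                                      allowing: .estimatedPlane,
                                      alignment: .any) else { return }
let raycast = session.trackedRaycast(query) { results in
    if let result = results.first {
        // ARRaycastResult exposes worldTransform; this assumes `object` is an SCNNode
        object.simdWorldTransform = result.worldTransform
    }
}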
Fifth approach
This approach focuses on saving and sharing ARKit's world maps.
func writeWorldMap(_ worldMap: ARWorldMap, to url: URL) throws {
let data = try NSKeyedArchiver.archivedData(withRootObject: worldMap,
requiringSecureCoding: true)
try data.write(to: url)
}
func loadWorldMap(from url: URL) throws -> ARWorldMap {
let mapData = try Data(contentsOf: url)
guard let worldMap = try NSKeyedUnarchiver.unarchivedObject(ofClass: ARWorldMap.self,
from: mapData)
else {
throw ARError(.invalidWorldMap)
}
return worldMap
}
Sixth approach
ARKit 4.0 introduced a new ARGeoTrackingConfiguration, implemented with the help of the MapKit module, so you can now use predefined GPS data.
func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {
for geoAnchor in anchors.compactMap({ $0 as? ARGeoAnchor }) {
arView.scene.addAnchor(Entity.placemarkEntity(for: geoAnchor))
}
}
Let's say I draw a circle in the middle of my screen, to use it as a target. If I point this circle to a node, how is it possible for ARKit to detect it?
For now I'm using the tap method:
@IBAction func tapHandler(_ sender: UITapGestureRecognizer) {
let viewTouchLocation:CGPoint = sender.location(in: sceneView)
guard let result = sceneView.hitTest(viewTouchLocation, options: nil).first else {
return
}
// ...etc
}
which works really well, but it would be so much better to detect a node just by pointing the camera at it.
let screenRect = UIScreen.main.bounds
let screenWidth = screenRect.size.width
let screenHeight = screenRect.size.height
let location = CGPoint(x:screenWidth/2,y:screenHeight/2)
Then use this location in the hit test instead of the tap location.
I'm trying to put a 3D object on a QR code with ARKit.
For that, I use an AVCaptureDevice to detect the QR code and establish its area, which gives me a CGRect.
Then I run a hitTest on every point of the CGRect to get the average 3D coordinates, like so:
positionGiven = SCNVector3(0, 0, 0)
for column in Int(qrZone.origin.x)...2*Int(qrZone.origin.x + qrZone.width) {
for row in Int(qrZone.origin.y)...2*Int(qrZone.origin.y + qrZone.height) {
for result in sceneView.hitTest(CGPoint(x: CGFloat(column)/2,y:CGFloat(row)/2), types: [.existingPlaneUsingExtent,.featurePoint]) {
positionGiven.x+=result.worldTransform.columns.3.x
positionGiven.y+=result.worldTransform.columns.3.y
positionGiven.z+=result.worldTransform.columns.3.z
cpts += 1
}
}
}
positionGiven.x=positionGiven.x/cpts
positionGiven.y=positionGiven.y/cpts
positionGiven.z=positionGiven.z/cpts
But the hitTest doesn't return any results and freezes the camera, whereas a hitTest triggered by a touch on the screen works.
Do you have any idea why it's not working?
Do you have another idea that could help me achieve what I want to do?
I already thought about 3D translation with CoreMotion, which can give me the tilt of the device, but that seems really tedious.
I also heard about ARWorldAlignmentCamera, which can lock the scene coordinates to match the orientation of the camera, but I don't know how to use it!
Edit: I tried moving my 3D object every time I touch the screen and the hitTest is positive, and it's pretty accurate! I really don't understand why hitTest on an area of pixels doesn't work...
Edit 2: Here is the code of the hitTest that works with 2-5 touches on the screen:
@objc func touch(sender : UITapGestureRecognizer) {
for result in sceneView.hitTest(CGPoint(x: sender.location(in: view).x,y: sender.location(in: view).y), types: [.existingPlaneUsingExtent,.featurePoint]) {
//Pop up message for testing
alert("\(sender.location(in: view))", message: "\(result.worldTransform.columns.3)")
//Moving the 3D Object to the new coordinates
let objectList = sceneView.scene.rootNode.childNodes
for object : SCNNode in objectList {
object.removeFromParentNode()
}
addObject(SCNVector3(result.worldTransform.columns.3.x,result.worldTransform.columns.3.y,result.worldTransform.columns.3.z))
}
}
Edit 3:
I managed to resolve my problem partially.
I take the transform matrix of the camera (session.currentFrame.camera.transform) so that the object is in front of the camera.
Then I apply a translation on (x, y) with the position of the CGRect.
However, I can't translate along the z-axis because I don't have enough information.
I will probably need an estimate of the z coordinate like the one hitTest provides.
Thanks in advance! :)
You could use Apple's Vision API to detect the QR code and place an anchor.
To start detecting QR codes, use:
var qrRequests = [VNRequest]()
var detectedDataAnchor: ARAnchor?
var processing = false
func startQrCodeDetection() {
// Create a Barcode Detection Request
let request = VNDetectBarcodesRequest(completionHandler: self.requestHandler)
// Set it to recognize QR code only
request.symbologies = [.QR]
self.qrRequests = [request]
}
In ARSession's didUpdate Frame
public func session(_ session: ARSession, didUpdate frame: ARFrame) {
    DispatchQueue.global(qos: .userInitiated).async {
        do {
            if self.processing {
                return
            }
            self.processing = true
            // Keep the frame for hitTestQrCode (assumes a `var latestFrame: ARFrame?` property)
            self.latestFrame = frame
            // Create a request handler using the captured image from the ARFrame
            let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage,
                                                            options: [:])
            // Process the request
            try imageRequestHandler.perform(self.qrRequests)
        } catch {
            // Reset the flag so a failed request does not block later frames
            self.processing = false
        }
    }
}
Handle the Vision QR request and trigger the hit test
func requestHandler(request: VNRequest, error: Error?) {
// Get the first result out of the results, if there are any
if let results = request.results, let result = results.first as? VNBarcodeObservation {
guard let payload = result.payloadStringValue else { self.processing = false; return }
// Get the bounding box for the bar code and find the center
var rect = result.boundingBox
// Flip coordinates
rect = rect.applying(CGAffineTransform(scaleX: 1, y: -1))
rect = rect.applying(CGAffineTransform(translationX: 0, y: 1))
// Get center
let center = CGPoint(x: rect.midX, y: rect.midY)
DispatchQueue.main.async {
self.hitTestQrCode(center: center)
self.processing = false
}
} else {
self.processing = false
}
}
func hitTestQrCode(center: CGPoint) {
if let hitTestResults = self.latestFrame?.hitTest(center, types: [.featurePoint] ),
let hitTestResult = hitTestResults.first {
if let detectedDataAnchor = self.detectedDataAnchor,
let node = self.sceneView.node(for: detectedDataAnchor) {
let previousQrPosition = node.position
node.transform = SCNMatrix4(hitTestResult.worldTransform)
} else {
// Create an anchor. The node will be created in delegate methods
self.detectedDataAnchor = ARAnchor(transform: hitTestResult.worldTransform)
self.sceneView.session.add(anchor: self.detectedDataAnchor!)
}
}
}
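The coordinate flip in requestHandler can be checked in isolation. Below is a minimal sketch, assuming a hypothetical helper named `flippedCenter(of:)` and a made-up bounding box; it should give the same center as the two affine transforms above, since scaling y by -1 and then translating by 1 maps y to 1 - y.

```swift
import Foundation

// Vision reports bounding boxes in normalized coordinates with the origin at the
// bottom-left; the frame hit test expects normalized image coordinates with the
// origin at the top-left, so only the y value needs to be mirrored.
func flippedCenter(of boundingBox: CGRect) -> CGPoint {
    return CGPoint(x: boundingBox.midX, y: 1 - boundingBox.midY)
}

// Made-up bounding box covering the upper-middle part of the image in Vision's space.
let box = CGRect(x: 0.25, y: 0.5, width: 0.5, height: 0.25)
let center = flippedCenter(of: box)
print(center)
```

A box whose center sits at y = 0.625 in Vision's space ends up at y = 0.375 after the flip, i.e. the same distance from the top edge as it was from the bottom edge.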
Then handle adding the node when the anchor is added.
func renderer(_ renderer: SCNSceneRenderer, nodeFor anchor: ARAnchor) -> SCNNode? {
// If this is our anchor, create a node
if self.detectedDataAnchor?.identifier == anchor.identifier {
let sphere = SCNSphere(radius: 1.0)
sphere.firstMaterial?.diffuse.contents = UIColor.red
let sphereNode = SCNNode(geometry: sphere)
sphereNode.transform = SCNMatrix4(anchor.transform)
return sphereNode
}
return nil
}
Source