I want to achieve something similar to ARCore's raycast method, which takes an arbitrary ray in world-space coordinates instead of a screen-space point:
List<HitResult> hitTest (float[] origin3, int originOffset, float[] direction3, int directionOffset)
I see that ARKit itself has no method like that, but maybe someone has an idea!
In Apple's RealityKit and ARKit frameworks you can find three main types of raycast methods: ARView raycast, ARSession raycast and Scene raycast (or World raycast). All methods are written in Swift:
This instance method performs a ray cast, where a ray is cast into the scene from the center of the camera through a point in the view, and the results are immediately returned. You can use this type of raycast in ARKit.
func raycast(from point: CGPoint,
             allowing target: ARRaycastQuery.Target,
             alignment: ARRaycastQuery.TargetAlignment) -> [ARRaycastResult]
This instance method performs a convex ray cast against all the geometry in the scene for a ray of a given origin, direction, and length.
func raycast(origin: SIMD3<Float>,
             direction: SIMD3<Float>,
             length: Float,
             query: CollisionCastQueryType,
             mask: CollisionGroup,
             relativeTo: Entity?) -> [CollisionCastHit]
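For the original question (an arbitrary world-space ray rather than a screen point), the origin/direction form above looks like the closest match. As a rough sketch of how it relates to the from/to form of the scene raycast: the end point is the origin plus the normalized direction scaled by the length. The helper name rayEndPoint is made up for illustration and uses only the standard library's SIMD3 type; it is not part of RealityKit.

```swift
// Hypothetical helper: turns an (origin, direction, length) ray into the end
// point expected by a from/to style raycast. Not a RealityKit API.
func rayEndPoint(origin: SIMD3<Float>,
                 direction: SIMD3<Float>,
                 length: Float) -> SIMD3<Float>? {
    // Euclidean length of the direction vector
    let magnitude = (direction * direction).sum().squareRoot()
    // A zero-length direction does not define a ray
    guard magnitude > 0 else { return nil }
    // Normalize the direction, then walk `length` units from the origin
    return origin + (direction / magnitude) * length
}
```

The returned point could then be passed as the `to:` argument, with the original origin as `from:`.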
This instance method repeats a ray-cast query over time to notify you of updated surfaces in the physical environment. You can use this type of raycast in ARKit 3.5.
func trackedRaycast(_ query: ARRaycastQuery,
                    updateHandler: @escaping ([ARRaycastResult]) -> Void) -> ARTrackedRaycast?
This RealityKit instance method also performs a tracked ray cast, but here a ray is cast into the scene from the center of the camera through a point in the view.
func trackedRaycast(from point: CGPoint,
                    allowing target: ARRaycastQuery.Target,
                    alignment: ARRaycastQuery.TargetAlignment,
                    updateHandler: @escaping ([ARRaycastResult]) -> Void) -> ARTrackedRaycast?
Code snippet 01:
import RealityKit
let startPosition: SIMD3<Float> = [3,-2,0]
let endPosition: SIMD3<Float> = [10,7,-5]
let query: CollisionCastQueryType = .all
let mask: CollisionGroup = .all
let raycasts: [CollisionCastHit] = arView.scene.raycast(from: startPosition,
                                                        to: endPosition,
                                                        query: query,
                                                        mask: mask,
                                                        relativeTo: nil)
guard let rayCast: CollisionCastHit = raycasts.first
else { return }
Code snippet 02:
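import ARKit
// Assumes arView is an ARSCNView and object is an SCNNode.
// raycastQuery(from:allowing:alignment:) returns an optional, so unwrap it first.
guard let query = arView.raycastQuery(from: screenCenter,
                                      allowing: .estimatedPlane,
                                      alignment: .any)
else { return }
let raycast = session.trackedRaycast(query) { results in
    if let result = results.first {
        // ARRaycastResult exposes its pose as worldTransform
        object.simdTransform = result.worldTransform
    }
}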
I am working on a project where I have to keep a green dot always in the center of the view, even when we rotate the camera, in ARKit. I am using ARSCNView and I have added the node; so far everything is good. Now I know I need to modify the position of the node in
func session(_ session: ARSession, didUpdate frame: ARFrame)
But I have no idea how to do that. I saw an example that was close to what I have, but it does not run as it is supposed to.
func session(_ session: ARSession, didUpdate frame: ARFrame) {
    let location = sceneView.center
    let hitTest = sceneView.hitTest(location, types: .featurePoint)
    if hitTest.isEmpty {
        print("No Plane Detected")
    } else {
        let columns = hitTest.first?.worldTransform.columns.3
        let position = SCNVector3(x: columns!.x, y: columns!.y, z: columns!.z)
        var node = sceneView.scene.rootNode.childNode(withName: "CenterShip", recursively: false) ?? nil
        if node == nil {
            let scene = SCNScene(named: "art.scnassets/ship.scn")!
            node = scene.rootNode.childNode(withName: "ship", recursively: false)
            node?.opacity = 0.7
            let columns = hitTest.first?.worldTransform.columns.3
            node!.name = "CenterShip"
            node!.position = SCNVector3(x: columns!.x, y: columns!.y, z: columns!.z)
        }
        let position2 = node?.position
        if position == position2! {
        } else {
            let action = SCNAction.move(to: position, duration: 0.1)
        }
    }
}
No matter how I rotate the camera, this dot must stay in the middle.
It's not clear exactly what you're trying to do, but I assume it's one of the following:
A) Place the green dot centered in front of the camera at a fixed distance, e.g. always exactly 1 meter in front of the camera.
B) Place the green dot centered in front of the camera at the depth of the nearest detected plane, i.e. using the results of a raycast from the midpoint of the ARSCNView.
I would have assumed A, but your example code uses the (now deprecated) sceneView.hitTest() function, which in this case would give you the depth of whatever is behind the pixel at sceneView.center.
Anyway, here are both:
Fixed Depth Solution
This is pretty straightforward, though there are a few options. The simplest is to make the green dot a child node of the scene's camera node and give it a position with a negative z value, since z increases as a position moves toward the camera.
textNode.position = SCNVector3(x: 0, y: 0, z: -1)
As the camera moves, so too will its child nodes. More details in this very thorough answer
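If you would rather not re-parent the node, the same fixed-distance placement can be computed in world space from the camera transform. This is only a minimal sketch, assuming the camera looks down its local negative z axis (as described above) and that the four columns come from something like frame.camera.transform.columns. The Columns alias and pointInFront(of:distance:) are hypothetical names, not ARKit API.

```swift
// Same shape as the `columns` tuple of a simd_float4x4 (assumption: you pass
// the camera's world transform columns here).
typealias Columns = (SIMD4<Float>, SIMD4<Float>, SIMD4<Float>, SIMD4<Float>)

// Hypothetical helper: world-space point `distance` meters in front of the camera.
func pointInFront(of camera: Columns, distance: Float) -> SIMD3<Float> {
    // Column 3 holds the camera's world position
    let position = SIMD3<Float>(camera.3.x, camera.3.y, camera.3.z)
    // Column 2 is the camera's local +z axis in world space (pointing backward)
    let backward = SIMD3<Float>(camera.2.x, camera.2.y, camera.2.z)
    // Moving against +z moves forward, matching a child node at z = -distance
    return position - backward * distance
}

let identity: Columns = (SIMD4<Float>(1, 0, 0, 0),
                         SIMD4<Float>(0, 1, 0, 0),
                         SIMD4<Float>(0, 0, 1, 0),
                         SIMD4<Float>(0, 0, 0, 1))
let dotPosition = pointInFront(of: identity, distance: 1)
```

With an identity camera transform this yields the same (0, 0, -1) position as the child-node approach.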
Scene Depth Solution
To determine the estimated depth behind a pixel, you should use ARSession.raycast instead of SceneView.hitTest, because the latter is deprecated.
Note that if the raycast() (or hitTest()) method returns an empty result set (not uncommon given the complexity of the scene estimation going on in ARKit), you won't have a position to update the node with, and thus it might not be directly centered in every frame. Handling this is a bit more complex, as you'd need to decide exactly what you want to do in that case.
The SCNAction is unnecessary and potentially causing problems. These delegate methods run at 60 fps, so simply updating the position directly will produce smooth results.
Adapting and simplifying the code you posted:
func createCenterShipNode() -> SCNNode {
    let scene = SCNScene(named: "art.scnassets/ship.scn")!
    let node = scene.rootNode.childNode(withName: "ship", recursively: false)!
    node.opacity = 0.7
    node.name = "CenterShip"
    return node
}
func session(_ session: ARSession, didUpdate frame: ARFrame) {
    // Check the docs for what the different raycast query parameters mean, but these
    // give you the depth of anything ARKit has detected
    guard let query = sceneView.raycastQuery(from: sceneView.center, allowing: .estimatedPlane, alignment: .any) else {
        return
    }
    let results = session.raycast(query)
    if let hit = results.first {
        let root = sceneView.scene.rootNode
        let node = root.childNode(withName: "CenterShip", recursively: false) ?? createCenterShipNode()
        // A freshly created node is not in the AR scene yet, so attach it once
        if node.parent !== root {
            root.addChildNode(node)
        }
        let pos = hit.worldTransform.columns.3
        node.simdPosition = simd_float3(pos.x, pos.y, pos.z)
    }
}
See also: ARRaycastQuery
One last note: you generally don't want to do scene manipulation within this delegate method. It runs on a different thread than the SceneKit rendering thread, and SceneKit is very thread-sensitive. This will likely work fine, but doing anything beyond adding or moving a node will cause crashes from time to time. You'd ideally want to store the new position and then update the actual scene contents from within the renderer(_ renderer: SCNSceneRenderer, updateAtTime time: TimeInterval) delegate method.
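One way to do that hand-off is a small lock-protected holder that the session delegate writes to and the render delegate reads from. This is only a sketch of the idea; the PendingPosition class and its method names are hypothetical, and NSLock is just one of several possible synchronization choices.

```swift
import Foundation

/// Hypothetical helper: keeps the most recent position written by one thread
/// until another thread consumes it.
final class PendingPosition {
    private let lock = NSLock()
    private var value: SIMD3<Float>?

    /// Intended to be called from session(_:didUpdate:) with the latest hit position.
    func store(_ position: SIMD3<Float>) {
        lock.lock()
        defer { lock.unlock() }
        value = position   // newer positions overwrite older, unconsumed ones
    }

    /// Intended to be called from renderer(_:updateAtTime:).
    /// Returns nil when nothing new has arrived since the last call.
    func take() -> SIMD3<Float>? {
        lock.lock()
        defer { lock.unlock() }
        let latest = value
        value = nil
        return latest
    }
}
```

In the render delegate you would then do something along the lines of `if let p = pending.take() { node.simdPosition = p }`, so the scene graph is only touched on the rendering thread.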
This question was also asked in the Apple Forum, but so far I have not seen any response there.
The question is really: after finding the point of interest in a frame from the ARSession, how do I convert it into a 3D world coordinate?
How I got the point:
let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage, orientation: .up, options: [:])
let handPoseRequest = VNDetectHumanHandPoseRequest()
try handler.perform([handPoseRequest])
Then I need to raycast from the 2D point derived from ARFrame.capturedImage to a 3D world coordinate:
fileprivate func convertVNPointTo3D(_ point: VNRecognizedPoint,
                                    _ session: ARSession,
                                    _ frame: ARFrame,
                                    _ viewSize: CGSize) -> Transform? {
    let pointX = (point.x / Double(frame.camera.imageResolution.width))*Double(viewSize.width)
    let pointY = (point.y / Double(frame.camera.imageResolution.height))*Double(viewSize.height)
    let query = frame.raycastQuery(from: CGPoint(x: pointX, y: pointY), allowing: .estimatedPlane, alignment: .any)
    let results = session.raycast(query)
    if let first = results.first {
        return Transform(matrix: first.worldTransform)
    } else {
        return nil
    }
}
According to the API, I should use a UI point. However, I do not know how capturedImage is converted to UI points. The calculation I used for the points is not correct.
The issue was the image orientation. In my case, using the iPad's back camera in portrait orientation, I needed to use .downMirrored (instead of .up).
let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage, orientation: .downMirrored, options: [:])
Once the orientation is correct, the point values from image recognition can be used DIRECTLY for the raycast.
There are three ways of detecting intersections in the RealityKit framework, but I don't know how to use them in my project.
func raycast(origin: SIMD3<Float>,
             direction: SIMD3<Float>,
             length: Float,
             query: CollisionCastQueryType,
             mask: CollisionGroup,
             relativeTo: Entity?) -> [CollisionCastHit]
func raycast(from: SIMD3<Float>,
             to: SIMD3<Float>,
             query: CollisionCastQueryType,
             mask: CollisionGroup,
             relativeTo: Entity?) -> [CollisionCastHit]
func convexCast(convexShape: ShapeResource,
                fromPosition: SIMD3<Float>,
                fromOrientation: simd_quatf,
                toPosition: SIMD3<Float>,
                toOrientation: simd_quatf,
                query: CollisionCastQueryType,
                mask: CollisionGroup,
                relativeTo: Entity?) -> [CollisionCastHit]
Simple Ray-Casting
If you want to find out how to position a model made in Reality Composer in a RealityKit scene (one that has a detected horizontal plane) using ray-casting, use the following code:
import RealityKit
import ARKit
class ViewController: UIViewController {
    @IBOutlet var arView: ARView!
    let scene = try! Experience.loadScene()
    @IBAction func onTap(_ sender: UITapGestureRecognizer) {
        scene.steelBox!.name = "Parcel"
        let tapLocation: CGPoint = sender.location(in: arView)
        let estimatedPlane: ARRaycastQuery.Target = .estimatedPlane
        let alignment: ARRaycastQuery.TargetAlignment = .horizontal
        let result: [ARRaycastResult] = arView.raycast(from: tapLocation,
                                                       allowing: estimatedPlane,
                                                       alignment: alignment)
        guard let rayCast: ARRaycastResult = result.first
        else { return }
        let anchor = AnchorEntity(world: rayCast.worldTransform)
        // Attach the model to the anchor and add the anchor to the scene
        anchor.addChild(scene.steelBox!)
        arView.scene.anchors.append(anchor)
    }
}
Pay attention to the ARRaycastQuery class: it comes from ARKit, not from RealityKit.
A convex ray-casting method like raycast(from:to:query:mask:relativeTo:) is the operation of sweeping a convex shape along a straight line and stopping at the very first intersection with any collision shape in the scene. The Scene raycast() method performs a hit test against all entities with collision shapes in the scene. Entities without a collision shape are ignored.
You can use the following code to perform a convex-ray-cast from start position to end:
import RealityKit
let startPosition: SIMD3<Float> = [0, 0, 0]
let endPosition: SIMD3<Float> = [5, 5, 5]
let query: CollisionCastQueryType = .all
let mask: CollisionGroup = .all
let raycasts: [CollisionCastHit] = arView.scene.raycast(from: startPosition,
                                                        to: endPosition,
                                                        query: query,
                                                        mask: mask,
                                                        relativeTo: nil)
guard let rayCast: CollisionCastHit = raycasts.first
else { return }
print(rayCast.distance) /* The distance from the ray origin to the hit */
print(rayCast.entity.name) /* The entity's name that was hit */
A CollisionCastHit structure is a hit result of a collision cast and it lives in RealityKit's scene.
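To make the idea of a cast hitting collision shapes and reporting a distance more tangible, here is a toy version that tests a ray against a list of spheres and keeps the nearest hit. It is purely illustrative and is not how RealityKit implements its casts; Sphere, hitDistance and nearestHit are made-up names, and the direction is assumed to be unit length.

```swift
// Toy stand-in for an entity with a spherical collision shape (hypothetical type).
struct Sphere {
    var name: String
    var center: SIMD3<Float>
    var radius: Float
}

/// Distance along a unit-length ray to the first surface point of a sphere, or nil on a miss.
func hitDistance(origin: SIMD3<Float>, unitDirection: SIMD3<Float>, sphere: Sphere) -> Float? {
    let toOrigin = origin - sphere.center
    let b = (toOrigin * unitDirection).sum()
    let c = (toOrigin * toOrigin).sum() - sphere.radius * sphere.radius
    let discriminant = b * b - c
    guard discriminant >= 0 else { return nil }   // the ray's line misses the sphere
    let t = -b - discriminant.squareRoot()        // nearer of the two intersections
    return t >= 0 ? t : nil                       // skip spheres behind or around the origin
}

/// Name and distance of the closest sphere hit by the ray, if any.
func nearestHit(origin: SIMD3<Float>, unitDirection: SIMD3<Float>,
                in spheres: [Sphere]) -> (name: String, distance: Float)? {
    var best: (name: String, distance: Float)?
    for sphere in spheres {
        guard let d = hitDistance(origin: origin, unitDirection: unitDirection, sphere: sphere)
        else { continue }
        if best == nil || d < best!.distance { best = (sphere.name, d) }
    }
    return best
}
```

The returned pair plays roughly the role that `rayCast.entity.name` and `rayCast.distance` play in the snippet above.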
When you use the raycast(from:to:query:mask:relativeTo:) method to measure the distance from the camera to an entity, the ARCamera's orientation doesn't matter; only its position in world coordinates does.
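Put differently, the measured distance is a function of two positions only. A minimal sketch with standard-library SIMD3 values follows; cameraPosition and entityPosition are placeholder inputs (in an app they would presumably come from the camera transform and the entity's world position).

```swift
// Straight-line distance between two world-space positions.
func distance(from a: SIMD3<Float>, to b: SIMD3<Float>) -> Float {
    let delta = b - a
    return (delta * delta).sum().squareRoot()
}

// Placeholder values for illustration only
let cameraPosition: SIMD3<Float> = [0, 0, 0]
let entityPosition: SIMD3<Float> = [3, 4, 0]
let meters = distance(from: cameraPosition, to: entityPosition)
```

No rotation appears anywhere in the computation, which is why turning the camera in place leaves the result unchanged.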
The ARFaceTrackingConfiguration of ARKit places ARFaceAnchor with information about the position and orientation of the face onto the scene. Among others, this anchor has the lookAtPoint property that I'm interested in. I know that this vector is relative to the face. How can I draw a point on the screen for this position, meaning how can I translate this point's coordinates?
The .lookAtPoint instance property is for direction estimation only.
Apple's documentation says that .lookAtPoint is a position in face coordinate space that estimates only the direction of the face's gaze. It's a vector of three scalar values, and it's get-only, not settable:
var lookAtPoint: SIMD3<Float> { get }
In other words, this vector is estimated from two quantities: the .rightEyeTransform and .leftEyeTransform instance properties (which are also get-only):
var rightEyeTransform: simd_float4x4 { get }
var leftEyeTransform: simd_float4x4 { get }
Here's a hypothetical example of how you could use this instance property:
func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
    if let faceAnchor = anchor as? ARFaceAnchor,
       let faceGeometry = node.geometry as? ARSCNFaceGeometry {
        if (faceAnchor.lookAtPoint.x >= 0) { // Looking (+X)
            faceGeometry.firstMaterial?.diffuse.contents = UIImage(named: "redTexture.png")
        } else { // Looking (-X)
            faceGeometry.firstMaterial?.diffuse.contents = UIImage(named: "cyanTexture.png")
        }
        faceGeometry.update(from: faceAnchor.geometry)
        facialExrpession(anchor: faceAnchor)
        DispatchQueue.main.async {
            self.label.text = self.textBoard
        }
    }
}
And here's an image showing axis directions for ARFaceTrackingConfiguration():
Answering your question:
I would say that you can't manage this point's coordinates directly, because it's a get-only property (and it carries only an XYZ orientation, not an XYZ translation).
So if you need both translation and rotation, use the .rightEyeTransform and .leftEyeTransform instance properties instead.
There are two projection methods:
FIRST. In SceneKit/ARKit, use the following instance method to project a point onto the 2D view (for a sceneView instance):
func projectPoint(_ point: SCNVector3) -> SCNVector3
let sceneView = ARSCNView()
sceneView.projectPoint(myPoint)
SECOND. In ARKit, use the following instance method to project a point onto the 2D view (for an arCamera instance):
func projectPoint(_ point: simd_float3,
                  orientation: UIInterfaceOrientation,
                  viewportSize: CGSize) -> CGPoint
// An ARCamera is normally obtained from the current ARFrame rather than created directly
guard let camera = sceneView.session.currentFrame?.camera else { return }
camera.projectPoint(myPoint, orientation: myOrientation, viewportSize: vpSize)
This method helps you project a point from the 3D world coordinate system of the scene to the 2D pixel coordinate system of the renderer.
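Conceptually, such a projection boils down to a perspective divide. The sketch below uses a simplified pinhole model in camera space and ignores the interface orientation and viewport handling that the real methods perform. The Pinhole type and project function are hypothetical, the focal lengths and principal point are assumed to be in pixels, and the conventions (camera looking down -z, +y up, image y growing downward) are assumptions for illustration.

```swift
// Hypothetical pinhole parameters, all in pixels.
struct Pinhole {
    var fx: Float   // focal length along x
    var fy: Float   // focal length along y
    var cx: Float   // principal point x
    var cy: Float   // principal point y
}

/// Projects a camera-space point to pixel coordinates, or nil if it is not in front of the camera.
func project(_ p: SIMD3<Float>, with cam: Pinhole) -> SIMD2<Float>? {
    let depth = -p.z                       // the camera looks down -z
    guard depth > 0 else { return nil }    // behind (or on) the camera plane
    let x = cam.cx + cam.fx * p.x / depth  // perspective divide
    let y = cam.cy - cam.fy * p.y / depth  // image y grows downward
    return SIMD2<Float>(x, y)
}

let cam = Pinhole(fx: 1000, fy: 1000, cx: 960, cy: 720)
let pixel = project([0.5, 0, -1], with: cam)
```

A world-space point would first have to be brought into camera space (via the inverse of the camera transform) before a function like this applies.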
There's also the opposite method (for unprojecting a point):
func unprojectPoint(_ point: SCNVector3) -> SCNVector3
...and ARKit's opposite method (for unprojecting a point):
@nonobjc func unprojectPoint(_ point: CGPoint,
                             ontoPlane planeTransform: simd_float4x4,
                             orientation: UIInterfaceOrientation,
                             viewportSize: CGSize) -> simd_float3?
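Since lookAtPoint is expressed in face coordinate space while the projection methods expect a world-space point, one plausible route to drawing it on screen is to convert it with the face anchor's transform first and then project the result. The sketch below covers only that conversion step. Columns and worldPoint(fromLocal:transform:) are hypothetical names, and in an app the columns would presumably come from faceAnchor.transform.columns.

```swift
// Same shape as the `columns` tuple of a simd_float4x4.
typealias Columns = (SIMD4<Float>, SIMD4<Float>, SIMD4<Float>, SIMD4<Float>)

/// Applies a column-major 4x4 transform to a local point (implicit w = 1).
func worldPoint(fromLocal p: SIMD3<Float>, transform m: Columns) -> SIMD3<Float> {
    // Linear part (rotation/scale) plus the translation stored in column 3
    let w = m.0 * p.x + m.1 * p.y + m.2 * p.z + m.3
    return SIMD3<Float>(w.x, w.y, w.z)
}

// Example: a face anchor translated to (1, 2, 3) with no rotation
let faceTransform: Columns = (SIMD4<Float>(1, 0, 0, 0),
                              SIMD4<Float>(0, 1, 0, 0),
                              SIMD4<Float>(0, 0, 1, 0),
                              SIMD4<Float>(1, 2, 3, 1))
let lookAtWorld = worldPoint(fromLocal: [0, 0, 1], transform: faceTransform)
```

The resulting world-space point is what you would hand to projectPoint to get view coordinates.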
I need to convert a point in the 2d coordinate space of my ARSCNView to a coordinate in 3d space. Basically a ray from the point of view to the touched location (up to a set distance away).
I wanted to use arView.unprojectPoint(vec2d) for that, but the point returned always seems to be located in the center of the view
vec2d is a SCNVector3 created from a 2d coordinate like this
SCNVector3(x, y, 0) // 0 specifies camera near plane
What am I doing wrong? How do I get the desired result?
I think you have at least 2 possible solutions:
Use the hitTest(_:types:) instance method:
This method searches for real-world objects or AR anchors in the captured camera image corresponding to a point in the SceneKit view.
let sceneView = ARSCNView()
func calculateVector(point: CGPoint) -> SCNVector3? {
    let hitTestResults = sceneView.hitTest(point,
                                           types: [.existingPlane])
    if let result = hitTestResults.first {
        return SCNVector3.init(SIMD3(result.worldTransform.columns.3.x,
                                     result.worldTransform.columns.3.y,
                                     result.worldTransform.columns.3.z))
    }
    return nil
}
calculateVector(point: yourPoint)
Use the unprojectPoint(_:ontoPlane:) instance method:
This method returns the projection of a point from the 2D view onto a plane in the 3D world space detected by ARKit.
@nonobjc func unprojectPoint(_ point: CGPoint,
                             ontoPlane planeTransform: simd_float4x4) -> simd_float3?
let point = CGPoint()
var planeTransform = simd_float4x4()
sceneView.unprojectPoint(point,
                         ontoPlane: planeTransform)
Add an empty node in front of the camera at an offset of 'x' cm and make it a child of the camera.
// Add a node in front of the camera just after creating the scene
hitNode = SCNNode()
hitNode!.position = SCNVector3Make(0, 0, -0.25) // 25 cm offset
sceneView.pointOfView?.addChildNode(hitNode!) // make it a child of the camera
func unprojectedPosition(touch: CGPoint) -> SCNVector3 {
    guard let hitNode = self.hitNode else {
        return SCNVector3Zero
    }
    // The projected z of the helper node gives the depth at which to unproject the touch
    let projectedOrigin = sceneView.projectPoint(hitNode.worldPosition)
    let offset = sceneView.unprojectPoint(SCNVector3Make(Float(touch.x), Float(touch.y), projectedOrigin.z))
    return offset
}
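For the ray described in the question, another common approach (an assumption here, not something the answers above spell out) is to unproject the touch point twice, once at depth 0 (near plane) and once at depth 1 (far plane), and then walk a set distance along the line between the two results. The sketch below covers only the vector math; near and far stand in for the two unprojectPoint results, and pointAlongRay is a hypothetical helper name.

```swift
/// Point `distance` units from `near` along the line toward `far`,
/// or nil if the two points coincide and no direction can be derived.
func pointAlongRay(near: SIMD3<Float>, far: SIMD3<Float>, distance: Float) -> SIMD3<Float>? {
    let delta = far - near
    let length = (delta * delta).sum().squareRoot()
    guard length > 0 else { return nil }
    // Unit direction from the near-plane point toward the far-plane point
    let direction = delta / length
    return near + direction * distance
}

// Placeholder inputs standing in for two unprojected points
let touched = pointAlongRay(near: [0, 0, 0], far: [0, 0, -10], distance: 2)
```

Because the direction is derived from two different depths, the result follows the touch location instead of collapsing toward the view center.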
