I'm very new to Xcode, so any and all help would be a godsend. I'm trying to write an app that saves the positional and rotational data from an iPhone at a set interval and saves it to a file. Right now, I'm not sure where to look when it comes to getting that data.
CoreMotion seems to not be enough so I'm using ARKit. I have a sceneView where I can see the origin and the feature points, but again, I'm stuck when it comes to where or even if the camera's position is tracked.
You can retrieve the ARCamera's position and rotation from its transform matrix (simd_float4x4). This data is contained in every ARFrame of a running ARSession (for either the selfie or the rear camera).
let sceneView = ARSCNView(frame: .zero)
sceneView.delegate = self
// currentFrame is nil until the session has delivered its first frame
guard let frame: ARFrame = sceneView.session.currentFrame else { return }
let cameraPosition: simd_float4 = frame.camera.transform.columns.3
let cameraRotation: simd_float3 = frame.camera.eulerAngles
The best place for these lines is an ARSCNViewDelegate method such as renderer(_:didUpdate:for:). Keep in mind that the ARCamera transform values coming from the IMU sensors are already filtered.
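To cover the "set interval" and "save to a file" parts of the question, here is a minimal sketch of a throttled logger. PoseLogger, its CSV layout, and the interval value are hypothetical names and choices made for illustration, not part of ARKit. The commented usage assumes you feed it from ARSessionDelegate's session(_:didUpdate:).

```swift
import Foundation

/// Hypothetical helper: keeps one camera pose per `interval` seconds as a CSV line.
struct PoseLogger {
    let interval: TimeInterval
    private(set) var lines: [String] = ["timestamp,x,y,z,pitch,yaw,roll"]
    private var lastTimestamp: TimeInterval?

    init(interval: TimeInterval) {
        self.interval = interval
    }

    /// Returns true if the sample was stored, false if it arrived too soon.
    @discardableResult
    mutating func record(timestamp: TimeInterval,
                         position: SIMD3<Float>,
                         eulerAngles: SIMD3<Float>) -> Bool {
        if let last = lastTimestamp, timestamp - last < interval {
            return false
        }
        lastTimestamp = timestamp
        let numbers: [Float] = [position.x, position.y, position.z,
                                eulerAngles.x, eulerAngles.y, eulerAngles.z]
        let fields = numbers.map { String(format: "%.4f", $0) }
        let stamp = String(format: "%.3f", timestamp)
        lines.append(stamp + "," + fields.joined(separator: ","))
        return true
    }

    /// Writes all collected lines to the given file URL.
    func write(to url: URL) throws {
        try lines.joined(separator: "\n").write(to: url, atomically: true, encoding: .utf8)
    }
}

// Assumed usage inside an ARSessionDelegate (needs `import ARKit`):
//
// func session(_ session: ARSession, didUpdate frame: ARFrame) {
//     let t = frame.camera.transform.columns.3
//     logger.record(timestamp: frame.timestamp,
//                   position: SIMD3(t.x, t.y, t.z),
//                   eulerAngles: frame.camera.eulerAngles)
// }
```

Throttling on frame.timestamp rather than on a Timer keeps the samples aligned with the frames ARKit actually delivers.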
In ARKit, I am using the height and width of reference images (as entered by me into Xcode in the AR Resource Group) to overlay planes of the same size onto matched images. Regardless of whether I enter accurate reference image dimensions, ARKit accurately overlays the plane onto the real world (i.e., the plane correctly covers the matched image in the ARSCNView).
If I understand correctly, estimatedScaleFactor tells me the difference between the true size of the reference image and the values I entered in the Resource Group.
My question is, if ARKit is able to figure the true size of the object shown in the reference image, when/why would I need to worry about entering accurate height and width values.
(My reference images are public art and accurately measuring them is sometimes difficult.)
Does ARKit have to work harder, or are there scenarios where I would stop getting good results without accurate Reference Image measurements?
ADDITIONAL INFO: As a concrete example, if I was matching movie posters, I would take a photo of the poster, load it into an AR Resource Group, and arbitrarily set the width to something like one meter (allowing Xcode to set the other dimension based on the proportions of the image).
Then, when ARKit matches the image, I would put a plane on it in renderer(_:didAdd:for:)
let plane = SCNPlane(width: referenceImage.physicalSize.width,
height: referenceImage.physicalSize.height)
plane.firstMaterial?.diffuse.contents = UIColor.planeColor
let planeNode = SCNNode(geometry: plane)
planeNode.eulerAngles.x = -.pi / 2
node.addChildNode(planeNode)
This appears to work as desired (the plane convincingly overlays the matched image) in spite of the fact that the dimensions I entered for the reference image are inaccurate. (And yes, estimatedScaleFactor does give a good approximation of how far off my arbitrary dimensions are.)
So, what I am trying to understand is whether this will break down in some scenarios (and when, and what I need to learn to understand why!). If my reference image dimensions are not accurate, will that negatively impact placing planes or other objects onto the node provided by ARKit?
Put another way, if ARKit is correctly understanding the world and reference images without accurate ref image measurements, does that mean I can get away with never entering accurate measurements for ref images?
As official documentation suggests:
The default value of estimatedScaleFactor (a factor between the initial size and the estimated physical size) is 1.0, which means that a version of this image that ARKit recognizes in the physical environment exactly matches its reference image physicalSize.
Otherwise, ARKit automatically corrects the image anchor's transform when estimatedScaleFactor is a value other than 1.0. This adjustment in turn, corrects ARKit's understanding of where the image anchor is located in the physical environment.
var estimatedScaleFactor: CGFloat { get }
For a more precise scale of the 3D model, you need to measure your real-world image. While the AR app is running, ARKit measures the reference image it observes and stores the result in the estimatedScaleFactor property of ARImageAnchor. ARKit thus registers the difference in scale and applies the new scale to the 3D model, so your model becomes bigger or smaller. That is what estimatedScaleFactor is for.
However, there's an automatic methodology:
To accurately recognize the position and orientation of an image in the AR environment, ARKit must know the image's physical size. You provide this information when creating an AR reference image in your Xcode project's asset catalog, or when programmatically creating an ARReferenceImage.
When you want to recognize different-sized versions of a reference image, you set automaticImageScaleEstimationEnabled to true, and in this case, ARKit disregards physicalSize.
var automaticImageScaleEstimationEnabled: Bool { get set }
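As a rough illustration of the two properties above, here is a sketch that turns on automatic scale estimation and derives a corrected size from an image anchor. The group name "AR Resources" and both function names are assumptions, and multiplying physicalSize by estimatedScaleFactor reflects my reading of the documentation quoted above rather than something it states verbatim.

```swift
import ARKit

/// Builds a world-tracking configuration that lets ARKit estimate image scale itself.
/// "AR Resources" is an assumed asset catalog group name.
@available(iOS 13.0, *)
func makeImageDetectionConfiguration() -> ARWorldTrackingConfiguration {
    let config = ARWorldTrackingConfiguration()
    if let images = ARReferenceImage.referenceImages(inGroupNamed: "AR Resources",
                                                     bundle: nil) {
        config.detectionImages = images
    }
    config.automaticImageScaleEstimationEnabled = true
    return config
}

/// Size of the image as ARKit currently estimates it, in meters.
/// Assumption: the factor scales the declared size up or down.
@available(iOS 13.0, *)
func estimatedPhysicalSize(of imageAnchor: ARImageAnchor) -> CGSize {
    let declared = imageAnchor.referenceImage.physicalSize
    let factor = imageAnchor.estimatedScaleFactor
    return CGSize(width: declared.width * factor,
                  height: declared.height * factor)
}
```

If the declared size is only a guess, sizing overlays from a value like this instead of from physicalSize directly may keep them consistent with the rest of the scene.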
Following Apple's Creating an Immersive AR Experience with Audio, I thought it would be interesting to experiment and try to place objects anywhere and not just on a vertical and horizontal plane. Is it at all possible to place an object using touch without plane detection? I understand that plane detection would increase the accuracy of hit tests and ARAnchor detection, so would there be any way where one could perform hit tests on any other location in the scene?
If your AR scene already contains any 3D geometry in a current session you can definitely use hit-testing to place a new model there (a placement based on already contained 3D geometry), or you can use feature points for model's placement (if any).
If there's no 3D geometry at all in your AR scene, or there's an extremely sparse point cloud, what would you apply the hit-testing method to? A hit-test projects a 2D point from screen space onto a 3D surface (remember, detected planes are hidden 3D planes) or onto any appropriate feature point.
So, in AR, plane detection is crucial when a developer is using hit-testing.
func hitTest(_ point: CGPoint,
types: ARHitTestResult.ResultType) -> [ARHitTestResult]
Here you can see all the ARHitTestResult.ResultType available.
But pay attention to this, there's a hitTest method returning SCNHitTestResult:
func hitTest(_ point: CGPoint,
options: [SCNHitTestOption : Any]?) -> [SCNHitTestResult]
Usage:
let touchPosition: CGPoint = gesture.location(in: sceneView)
let hitTestResult = sceneView.hitTest(touchPosition,
types: .existingPlaneUsingExtent)
or:
let hitTestResult = sceneView.hitTest(touchPosition,
types: .featurePoint)
Hit-testing is also widely used in 3D games, though there it serves VR rather than AR.
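On iOS 13 and later the same idea can be expressed with ray-casting, which can target an estimated plane before plane detection has produced any ARPlaneAnchor. The following is only a sketch: the function name and the anchor name "tapPlacement" are made up for illustration.

```swift
import ARKit

/// Casts a ray from a screen point and, if something is hit, adds an anchor there.
/// Returns nil when ARKit has no surface estimate under the touch.
@available(iOS 13.0, *)
@discardableResult
func placeAnchor(at point: CGPoint, in sceneView: ARSCNView) -> ARAnchor? {
    // .estimatedPlane does not require a fully detected plane
    guard let query = sceneView.raycastQuery(from: point,
                                             allowing: .estimatedPlane,
                                             alignment: .any),
          let result = sceneView.session.raycast(query).first
    else { return nil }

    let anchor = ARAnchor(name: "tapPlacement", transform: result.worldTransform)
    sceneView.session.add(anchor: anchor)
    return anchor
}
```

You would call this from a tap gesture handler with gesture.location(in: sceneView), then attach content in renderer(_:didAdd:for:).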
In my Swift / ARKit / SceneKit project, I need to tell if the user's face in front-facing camera is parallel to the camera.
I was able to tell horizontal parallel by comparing the left and right eyes distance (using faceAnchor.leftEyeTransform and the worldPosition property) from the camera.
But I am stuck on vertical parallel. Any ideas, how to achieve that?
Assuming you are using ARFaceTrackingConfiguration in your app, you can actually retrieve the transforms of both the ARFaceAnchor and the camera to determine their orientations. You can get a simd_float4x4 matrix of the head's orientation in world space by using ARFaceAnchor.transform property. Similarly, you can get the transform of the SCNCamera or ARCamera of your scene.
To compare the camera's and face's orientations relative to each other in a SceneKit app (though there are similar functions on the ARKit side of things), I get the world transform for the node that is attached to each of them, let's call them faceNode attached to the ARFaceAnchor and cameraNode representing the ARSCNView.pointOfView. To find the angle between the camera and your face, for example, you could do something like this:
// simdWorldOrientation is a simd_quatf; simdWorldTransform would be a simd_float4x4
let faceOrientation: simd_quatf = faceNode.simdWorldOrientation
let cameraOrientation: simd_quatf = cameraNode.simdWorldOrientation
let deltaOrientation: simd_quatf = faceOrientation.inverse * cameraOrientation
By looking at deltaOrientation.angle and deltaOrientation.axis you can determine the relative angles on each axis between the face and the camera. If you compute deltaOrientation.axis * deltaOrientation.angle, you get a simd_float3 vector giving you a sense of the pitch, yaw and roll (in radians) of the head relative to the camera.
There are a number of ways you can do this using the face anchor and camera transforms, but this simd quaternion method works quite well for me. Hope this helps!
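Here is a small sketch of the quaternion approach described above, wrapped in a function. The function name is hypothetical, and the axis-angle vector is only an approximation of pitch, yaw and roll. Which value corresponds to "face parallel to the camera" is best calibrated empirically on a device.

```swift
import simd

/// Rotation of the camera relative to the face, as an axis-angle vector in radians.
/// x, y and z give a rough sense of pitch, yaw and roll.
func relativeRotationVector(face: simd_quatf, camera: simd_quatf) -> simd_float3 {
    let delta = face.inverse * camera
    // For a (near) zero rotation the axis is undefined, so return zero directly
    guard delta.angle > 1e-5 else { return .zero }
    return delta.axis * delta.angle
}

// Assumed usage with SceneKit nodes:
// let v = relativeRotationVector(face: faceNode.simdWorldOrientation,
//                                camera: cameraNode.simdWorldOrientation)
// print("pitch-like:", v.x, "yaw-like:", v.y, "roll-like:", v.z)
```

For the "vertical parallel" check in the question, you could log the x component while holding your head level and use that reading as your reference value.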
I'm trying to understand and use ARKit. But there is one thing that I cannot fully understand.
Apple said about ARAnchor:
A real-world position and orientation that can be used for placing objects in an AR scene.
But that's not enough. So my questions are:
What is ARAnchor exactly?
What are the differences between anchors and feature points?
Is ARAnchor just part of feature points?
And how does ARKit determines its anchors?
Updated: February 02, 2023.
TL;DR
ARAnchor
ARAnchor is an invisible null-object that holds a 3D model at anchor's position. Think of ARAnchor as a parent transform node of your model that you can translate, rotate and scale like any other SceneKit node. Every 3D model has a pivot point, right? Thus, this pivot point must match a location of an ARAnchor in AR app.
If you're not using anchors in an ARKit or ARCore app (in RealityKit, however, it's impossible not to use anchors because they are an integral part of a scene), your 3D models may drift from where they were placed, and this will dramatically impact the app’s realism and user experience. Thus, anchors are crucial elements of any AR scene.
According to ARKit 2017 documentation:
ARAnchor is a real-world position and orientation that can be used for placing objects in AR Scene. Adding an anchor to the session helps ARKit to optimize world-tracking accuracy in the area around that anchor, so that virtual objects appear to stay in place relative to the real world. If a virtual object moves, remove the corresponding anchor from the old position and add one at the new position.
ARAnchor is the parent class of 10 other anchor types in ARKit, so all those subclasses inherit from ARAnchor. Usually you do not use ARAnchor directly. I must also say that ARAnchor and Feature Points have nothing in common. Feature Points are rather special visual elements for tracking and debugging.
ARAnchor doesn't automatically track a real-world target. When you need automation, you have to use the renderer() or session() instance methods, which you can implement if you conform to the ARSCNViewDelegate or ARSessionDelegate protocol, respectively.
Here's an image with a visual representation of a plane anchor. Keep in mind: by default, you can see neither a detected plane nor its corresponding ARPlaneAnchor. So, if you want to see the anchor in your scene, you may "visualize" it using three thin SCNCylinder primitives. Each cylinder's color represents a particular axis: RGB is XYZ.
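The three-cylinder visualization could look roughly like the sketch below. makeAxesNode and its default sizes are hypothetical. SCNCylinder is built along its local Y axis, so the red and blue cylinders are rotated onto X and Z.

```swift
import SceneKit
import UIKit

/// Builds a small RGB gizmo: red = X, green = Y, blue = Z.
func makeAxesNode(length: CGFloat = 0.1, radius: CGFloat = 0.002) -> SCNNode {
    let root = SCNNode()
    let half = Float(length / 2)

    // (color, rotation that maps the cylinder's Y axis onto the target axis, offset)
    let specs: [(color: UIColor, euler: SCNVector3, offset: SCNVector3)] = [
        (.red,   SCNVector3(0, 0, -Float.pi / 2), SCNVector3(half, 0, 0)),
        (.green, SCNVector3(0, 0, 0),             SCNVector3(0, half, 0)),
        (.blue,  SCNVector3(Float.pi / 2, 0, 0),  SCNVector3(0, 0, half))
    ]

    for spec in specs {
        let cylinder = SCNCylinder(radius: radius, height: length)
        cylinder.firstMaterial?.diffuse.contents = spec.color
        cylinder.firstMaterial?.lightingModel = .constant   // unlit, always visible
        let node = SCNNode(geometry: cylinder)
        node.eulerAngles = spec.euler
        node.position = spec.offset   // shift so each cylinder starts at the origin
        root.addChildNode(node)
    }
    return root
}
```

Adding the result with node.addChildNode(makeAxesNode()) inside renderer(_:didAdd:for:) places the gizmo at the anchor's origin.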
In ARKit you can automatically add ARAnchors to your scene using different scenarios:
ARPlaneAnchor
If the planeDetection instance property is set to horizontal and/or vertical, ARKit is able to add ARPlaneAnchors to the anchors collection of the running session. Enabling planeDetection sometimes considerably increases the time required for the scene understanding stage.
ARImageAnchor (conforms to ARTrackable protocol)
This type of anchor contains information about the transform of a detected image (the anchor is placed at the image's center) in a world-tracking or image-tracking configuration. To activate image tracking, use the detectionImages instance property. In ARKit 2.0 you can track up to 25 images in total, and in ARKit 3.0 / 4.0 up to 100 images. In both cases, however, no more than 4 images can be tracked simultaneously. It was promised that in ARKit 5.0 / 6.0 you would be able to detect and track up to 100 images at a time, but this is still not implemented.
ARBodyAnchor (conforms to ARTrackable protocol)
You can turn on body tracking by running a session based on ARBodyTrackingConfiguration(). You'll get an ARBodyAnchor at the Root Joint of a real performer's skeleton or, in other words, at the pelvis position of a tracked character.
ARFaceAnchor (conforms to ARTrackable protocol)
Face Anchor stores information about the head's topology, pose and facial expression. You can track ARFaceAnchor with the help of the front TrueDepth camera. When a face is detected, the Face Anchor is attached slightly behind the nose, in the center of the face. In ARKit 2.0 you can track just one face, in ARKit 3.0 and higher up to 3 faces simultaneously. However, the number of tracked faces depends on the presence of a TrueDepth sensor and on the processor version: gadgets with a TrueDepth camera can track up to 3 faces, and gadgets with an A12+ chipset but without a TrueDepth camera can also track up to 3 faces.
ARObjectAnchor
This anchor type keeps information about the 6 Degrees of Freedom (position and orientation) of a real-world 3D object detected in a world-tracking session. Remember that you need to specify ARReferenceObject instances for the detectionObjects property of the session config.
AREnvironmentProbeAnchor
Probe Anchor provides environmental lighting information for a specific area of space in a world-tracking session. ARKit's Artificial Intelligence uses it to supply reflective shaders with environmental reflections.
ARParticipantAnchor
This is an indispensable anchor type for multiuser AR experiences. If you want to employ it, set the isCollaborationEnabled property of ARWorldTrackingConfiguration to true. Then import the MultipeerConnectivity framework.
ARMeshAnchor
ARKit and LiDAR subdivide the reconstructed real-world scene surrounding the user into mesh anchors with corresponding polygonal geometry. Mesh anchors constantly update their data as ARKit refines its understanding of the real world. Although ARKit updates a mesh to reflect a change in the physical environment, the mesh's subsequent change is not intended to reflect in real time. Sometimes your reconstructed scene can have up to 30-40 anchors or even more. This is due to the fact that each classified object (wall, chair, door or table) has its own personal anchor. Each ARMeshAnchor stores data about corresponding vertices, one of eight cases of classification, its faces and vertices' normals.
ARGeoAnchor (conforms to ARTrackable protocol)
In ARKit 4.0+ there's a geo anchor (a.k.a. location anchor) that tracks a geographic location using GPS, Apple Maps and additional environment data coming from Apple servers. This type of anchor identifies a specific area in the world that the app can refer to. When a user moves around the scene, the session updates a location anchor’s transform based on coordinates and device’s compass heading of a geo anchor. Look at the list of supported cities.
ARAppClipCodeAnchor (conforms to ARTrackable protocol)
This anchor tracks the position and orientation of App Clip Code in the physical environment in ARKit 4.0+. You can use App Clip Codes to enable users to discover your App Clip in the real world. There are NFC-integrated App Clip Code and scan-only App Clip Code.
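To make the list above more concrete, here is a sketch that enables a few of these features on one world-tracking configuration and then tells the resulting anchors apart in a session delegate. Both function names are hypothetical, and the set of enabled features is just an example.

```swift
import ARKit

/// Enables plane detection, environment probes and (on LiDAR devices) mesh anchors.
@available(iOS 13.4, *)
func makeWorldTrackingConfiguration() -> ARWorldTrackingConfiguration {
    let config = ARWorldTrackingConfiguration()
    config.planeDetection = [.horizontal, .vertical]   // ARPlaneAnchor
    config.environmentTexturing = .automatic           // AREnvironmentProbeAnchor
    if ARWorldTrackingConfiguration.supportsSceneReconstruction(.meshWithClassification) {
        config.sceneReconstruction = .meshWithClassification   // ARMeshAnchor
    }
    return config
}

/// Returns a short label for the concrete ARAnchor subclass.
@available(iOS 13.4, *)
func describe(_ anchor: ARAnchor) -> String {
    switch anchor {
    case let plane as ARPlaneAnchor:
        return plane.alignment == .horizontal ? "horizontal plane" : "vertical plane"
    case let image as ARImageAnchor:
        return "image: \(image.referenceImage.name ?? "unnamed")"
    case is ARObjectAnchor:           return "object"
    case is ARMeshAnchor:             return "mesh"
    case is AREnvironmentProbeAnchor: return "environment probe"
    case is ARParticipantAnchor:      return "participant"
    default:                          return "generic anchor"
    }
}

// Assumed usage in an ARSessionDelegate:
// func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {
//     anchors.forEach { print(describe($0)) }
// }
```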
There are also other regular approaches to create anchors in AR session:
Hit-Testing methods
Tapping on the screen projects a point onto an invisible detected plane, placing an ARAnchor at the location where an imaginary ray intersects this plane. By the way, the ARHitTestResult class and its corresponding hit-testing methods for ARSCNView and ARSKView are deprecated as of iOS 14, so you have to get used to ray-casting.
Ray-Casting methods
If you're using ray-casting, tapping on the screen results in a projected 3D point on an invisible detected plane. But you can also perform Ray-Casting between A and B positions in 3D scene. So, ray-casting can be 2D-to-3D and 3D-to-3D. When using the Tracked Ray-Casting, ARKit can keep refining the ray-cast as it learns more and more about detected surfaces.
Feature Points
Special yellow points, which ARKit automatically generates on high-contrast margins of real-world objects, can give you a place to put an ARAnchor on.
ARCamera's transform
iPhone's or iPad's camera position and orientation simd_float4x4 can be easily used as a place for ARAnchor.
Any arbitrary World Position
Place a custom ARWorldAnchor anywhere in your scene. You can generate ARKit's version of world anchor like AnchorEntity(.world(transform: mtx)) found in RealityKit.
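As an example of the camera-transform approach from the list above, this sketch computes a transform a fixed distance in front of the camera. The function name and the 0.5 m distance are assumptions. The camera looks down its local negative Z axis, hence the negative offset.

```swift
import simd

/// Returns `cameraTransform` moved `distance` meters along the camera's viewing direction.
func transform(inFrontOf cameraTransform: simd_float4x4,
               distance: Float) -> simd_float4x4 {
    var offset = matrix_identity_float4x4
    offset.columns.3.z = -distance          // camera looks down -Z
    return simd_mul(cameraTransform, offset)
}

// Assumed usage with a running ARSession (needs `import ARKit`):
//
// if let frame = session.currentFrame {
//     let t = transform(inFrontOf: frame.camera.transform, distance: 0.5)
//     session.add(anchor: ARAnchor(name: "inFrontOfCamera", transform: t))
// }
```

Because the offset is applied in the camera's local space, the anchor also inherits the camera's orientation at the moment it is created.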
This code snippet shows you how to use an ARPlaneAnchor in a delegate's method: renderer(_:didAdd:for:):
func renderer(_ renderer: SCNSceneRenderer,
             didAdd node: SCNNode,
              for anchor: ARAnchor) {

    guard let planeAnchor = anchor as? ARPlaneAnchor
    else { return }

    let grid = Grid(anchor: planeAnchor)
    node.addChildNode(grid)
}
AnchorEntity
AnchorEntity is alpha and omega in RealityKit. According to RealityKit documentation 2019:
AnchorEntity is an anchor that tethers virtual content to a real-world object in an AR session.
The RealityKit framework and the Reality Composer app were announced at WWDC'19. They have a new class named AnchorEntity. You can use AnchorEntity as the root point of any entity hierarchy, and you must add it to the Scene anchors collection. AnchorEntity automatically tracks a real-world target. In RealityKit and Reality Composer, AnchorEntity is at the top of the hierarchy. This anchor is able to hold a hundred models, and in this case it's more stable than if you use 100 personal anchors, one for each model.
Let's see how it looks in code:
func makeUIView(context: Context) -> ARView {

    let arView = ARView(frame: .zero)
    let modelAnchor = try! Experience.loadModel()
    arView.scene.anchors.append(modelAnchor)
    return arView
}
AnchorEntity has three components:
Anchoring component
Transform component
Synchronization component
To find out the difference between ARAnchor and AnchorEntity look at THIS POST.
Here are ten ways to create an AnchorEntity in RealityKit 2.0 for iOS:
// Fixed position in the AR scene
AnchorEntity(.world(transform: mtx))
// For body tracking (a.k.a. Motion Capture)
AnchorEntity(.body)
// Pinned to the tracking camera
AnchorEntity(.camera)
// For face tracking (Selfie Camera config)
AnchorEntity(.face)
// For image tracking config
AnchorEntity(.image(group: "GroupName", name: "forModel"))
// For object tracking config
AnchorEntity(.object(group: "GroupName", name: "forObject"))
// For plane detection with surface classification
AnchorEntity(.plane([.any], classification: [.seat], minimumBounds: [1, 1]))
// When you use ray-casting
AnchorEntity(raycastResult: myRaycastResult)
// When you use ARAnchor with a given identifier
AnchorEntity(.anchor(identifier: uuid))
// Creates anchor entity on a basis of ARAnchor
AnchorEntity(anchor: arAnchor)
And here are only two AnchorEntity's cases available in RealityKit 2.0 for macOS:
// Fixed world position in VR scene
AnchorEntity(.world(transform: mtx))
// Camera transform
AnchorEntity(.camera)
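For completeness, here is a minimal sketch of building an anchored model in code rather than loading it from a Reality Composer file. addBox(to:) is a hypothetical helper, and the box size and minimum bounds are arbitrary.

```swift
import RealityKit
import UIKit

/// Anchors a 10 cm box to the first horizontal plane of at least 20 x 20 cm.
func addBox(to arView: ARView) {
    let anchor = AnchorEntity(plane: .horizontal, minimumBounds: [0.2, 0.2])

    let box = ModelEntity(mesh: .generateBox(size: 0.1),
                          materials: [SimpleMaterial(color: .blue, isMetallic: false)])
    box.position.y = 0.05            // rest the box on the plane instead of halfway through it

    anchor.addChild(box)
    arView.scene.addAnchor(anchor)   // RealityKit starts looking for a matching plane
}
```

The box stays hidden until RealityKit finds a plane that satisfies the anchor's target.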
Also it’s not superfluous to say that you can use any subclass of ARAnchor for AnchorEntity needs:
var anchor = AnchorEntity()
func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {

    guard let faceAnchor = anchors.first as? ARFaceAnchor
    else { return }

    arView.session.add(anchor: faceAnchor)       // ARKit Session

    self.anchor = AnchorEntity(anchor: faceAnchor)
    anchor.addChild(model)
    arView.scene.anchors.append(self.anchor)     // RealityKit Scene
}
Reality Composer's anchors:
At the moment (February 2022) Reality Composer has just 4 types of AnchorEntities:
// 1a
AnchorEntity(plane: .horizontal)
// 1b
AnchorEntity(plane: .vertical)
// 2
AnchorEntity(.image(group: "GroupName", name: "forModel"))
// 3
AnchorEntity(.face)
// 4
AnchorEntity(.object(group: "GroupName", name: "forObject"))
AR USD Schemas
And of course, I should say a few words about preliminary anchors. There are 3 preliminary anchoring types (July 2022) for those who prefer Python scripting for USDZ models – these are plane, image and face preliminary anchors. Look at this code snippet to find out how to implement a schema pythonically.
def Cube "ImageAnchoredBox" (prepend apiSchemas = ["Preliminary_AnchoringAPI"])
{
    uniform token preliminary:anchoring:type = "image"
    rel preliminary:imageAnchoring:referenceImage = <ImageReference>

    def Preliminary_ReferenceImage "ImageReference"
    {
        uniform asset image = @somePicture.jpg@
        uniform double physicalWidth = 45
    }
}
If you want to know more about AR USD Schemas, read this story on Medium.
Visualizing AnchorEntity
Here's an example of how to visualize anchors in RealityKit (mac version).
import AppKit
import RealityKit

class ViewController: NSViewController {

    @IBOutlet var arView: ARView!
    var model = Entity()
    let anchor = AnchorEntity()

    fileprivate func visualAnchor() -> Entity {

        let colors: [SimpleMaterial.Color] = [.red, .green, .blue]

        for index in 0...2 {

            let box: MeshResource = .generateBox(size: [0.20, 0.005, 0.005])
            let material = UnlitMaterial(color: colors[index])
            let entity = ModelEntity(mesh: box, materials: [material])

            if index == 0 {
                entity.position.x += 0.1
            } else if index == 1 {
                entity.transform = Transform(pitch: 0, yaw: 0, roll: .pi/2)
                entity.position.y += 0.1
            } else if index == 2 {
                entity.transform = Transform(pitch: 0, yaw: -.pi/2, roll: 0)
                entity.position.z += 0.1
            }
            model.scale *= 1.5
            self.model.addChild(entity)
        }
        return self.model
    }

    override func awakeFromNib() {
        anchor.addChild(self.visualAnchor())
        arView.scene.addAnchor(anchor)
    }
}
About ArAnchors in ARCore
At the end of my post, I would like to talk about four types of anchors that are used in ARCore 1.35 and higher. Google's official documentation says the following about anchors: "ArAnchor describes a fixed location and orientation in the real world". ARCore anchors work similarly to ARKit anchors.
Let's take a look at ArAnchors' types:
Local anchors
are stored with the app locally, and valid only for that instance of the app. The user must be physically at the location where they are placing the anchor. Anchor can be attached to Trackable or ARCore Session.
Cloud Anchors
are stored in Google Cloud and may be shared between app instances. The user must be physically at the location where they are placing the anchor. Thanks to the Persistent Cloud Anchors API, you can create a cloud anchor that can be resolved for 1 to 365 days after creation. Cloud Anchors can be resolved by multiple users to establish a common frame of reference across users and their devices.
Geospatial anchors
are based on geodetic latitude, longitude, and altitude, plus Google's Visual Positioning System data, to provide precise location almost anywhere in the world. These anchors may be shared between app instances. The user may place an anchor from a remote location as long as the app is connected to the internet and able to use the VPS.
Terrain anchors
are a subtype of Geospatial anchors that allow you to place AR objects using only latitude and longitude, leveraging information from Google Maps to find the precise altitude above ground.
When anchoring objects in ARCore, make sure that they are close to the anchor you are using. Avoid placing objects farther than 8 meters from the anchor to prevent unexpected rotational movement due to ARCore's updates to world space coordinates. If you need to place an object more than eight meters away from an existing anchor, create a new anchor closer to this position and attach the object to the new anchor.
These Kotlin code snippets show you how to use a Geospatial anchor:
fun configureSession(session: Session) {
    session.configure(
        session.config.apply {
            geospatialMode = Config.GeospatialMode.ENABLED
        }
    )
}
val earth = session?.earth ?: return
if (earth.trackingState != TrackingState.TRACKING) { return }
earthAnchor?.detach()
val altitude = earth.cameraGeospatialPose.altitude - 1
val qx = 0f; val qy = 0f; val qz = 0f; val qw = 1f
earthAnchor = earth.createAnchor(latLng.latitude,
latLng.longitude,
altitude,
qx, qy, qz, qw)