RealityKit and ARKit – What is an AR project looking for when the app starts?

You will understand this question better if you open Xcode, create a new Augmented Reality Project and run that project.
After the project starts running on a device, you will see the image from the rear camera, showing your room.
After 3 or 4 seconds, a cube appears.
My questions are:
What was the app doing before the cube appeared? I suppose the app was looking for tracking points in the scene so that it could anchor the cube, right?
If this is true, what elements is the app looking for?
Suppose I am not satisfied with the point where the cube appeared. Is there any function I can trigger with a tap on the screen, so that tracking searches for new points near the location I tapped?
I know my question is generic, so please, just give me the right direction.

ARKit and RealityKit stages
There are three stages in ARKit and RealityKit when you launch an AR app:
Tracking
Scene Understanding
Rendering
Each stage may considerably increase the time required for model placement (+1...+4 seconds, depending on the device). Let's talk about each stage.
Tracking
This is the initial state of your AR app. Here the iPhone fuses visual data coming from the rear RGB camera at 60 fps with transform data coming from the IMU sensors (accelerometer, gyroscope and compass) at 1000 Hz. Automatically generated Feature Points help ARKit and RealityKit track the surrounding environment and build a tracking map (whether it's World Tracking or, for example, Face Tracking). Feature Points are spontaneously generated on high-contrast edges of real-world objects and textures, in well-lit environments. If you already have a previously saved World Map, it reduces the time needed to place a model in the scene. You may also use an ARCoachingOverlayView for useful visual instructions that guide the user during session initialization and recovery.
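As a rough illustration of the coaching overlay mentioned above, here is a minimal sketch. The view controller name is hypothetical, and the ARView is created in code here, whereas the Xcode template wires it up from a storyboard.

```swift
import UIKit
import ARKit
import RealityKit

// Hypothetical view controller, for illustration only.
final class CoachingViewController: UIViewController {

    // Assumption: the ARView is created in code rather than loaded from a storyboard.
    let arView = ARView(frame: .zero)

    override func viewDidLoad() {
        super.viewDidLoad()

        arView.frame = view.bounds
        arView.autoresizingMask = [.flexibleWidth, .flexibleHeight]
        view.addSubview(arView)

        // The overlay watches the session and shows instructions
        // until the chosen goal (here: a horizontal plane) is reached.
        let coaching = ARCoachingOverlayView()
        coaching.session = arView.session
        coaching.goal = .horizontalPlane
        coaching.activatesAutomatically = true
        coaching.frame = arView.bounds
        coaching.autoresizingMask = [.flexibleWidth, .flexibleHeight]
        arView.addSubview(coaching)
    }
}
```

With activatesAutomatically set to true, the overlay should also reappear when tracking is lost and the session needs to recover.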
Scene Understanding
The second stage can include horizontal and vertical Plane Detection, Ray-Casting (or Hit-Testing) and Light Estimation. If you have activated the Plane Detection feature, it takes some time to detect a plane with a corresponding ARPlaneAnchor (or AnchorEntity(.plane)) that must tether a virtual model (the cube in your case). There's also Advanced Scene Understanding, which allows you to use the Scene Reconstruction feature. You can use scene reconstruction on devices with a LiDAR scanner; it gives you an improved depth channel for compositing elements in a scene and for People Occlusion. You can always enable the Image/Object Detection feature, but keep in mind that it's built on machine learning algorithms that increase a model's placement time in the scene.
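To make the scene understanding options more concrete, the following sketch shows how they might be switched on in a world tracking configuration. The function name is hypothetical; which features you enable depends on your app, and each one adds work to this stage.

```swift
import ARKit
import RealityKit

// Hypothetical helper: runs a world tracking session with explicit
// scene understanding options instead of ARView's automatic setup.
func runWorldTracking(on arView: ARView) {
    // Keep ARView from replacing our configuration with its own.
    arView.automaticallyConfigureSession = false

    let config = ARWorldTrackingConfiguration()
    config.planeDetection = [.horizontal, .vertical]
    config.isLightEstimationEnabled = true

    // Scene Reconstruction is only available on LiDAR devices.
    if ARWorldTrackingConfiguration.supportsSceneReconstruction(.mesh) {
        config.sceneReconstruction = .mesh
    }

    // People Occlusion needs a device that supports this frame semantic.
    if ARWorldTrackingConfiguration.supportsFrameSemantics(.personSegmentationWithDepth) {
        config.frameSemantics.insert(.personSegmentationWithDepth)
    }

    arView.session.run(config)
}
```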
Rendering
The last stage renders the virtual geometry in your scene. Scenes can contain models with shaders and textures, transform or asset animations, dynamics and sound. Surrounding HDR reflections for metallic shaders are calculated by neural modules. ARKit itself can't render an AR scene. For 3D rendering you have to use frameworks such as RealityKit, SceneKit or Metal, which have their own rendering engines.
By default, RealityKit enables high-quality rendering effects like Motion Blur and ray-traced shadows that require additional computational power. Take this into consideration.
Tip
To significantly reduce the time needed to place an object in the AR scene, use a device with a LiDAR scanner, which works at nanosecond speed. If your device has no LiDAR, track only environments where lighting conditions are good, all real-world objects are clearly distinguishable, and their textures are rich and have no repetitive patterns. Also, try not to use polygonal geometry with more than 10K polygons or hi-res textures in your project (a jpeg or png of 1024x1024 is considered normal).
Also, RealityKit has several heavy options enabled by default: Depth Channel Compositing, Motion Blur and Ray-traced Contact Shadows (on A11 and earlier there are Projected Shadows). If you don't need all these features, just disable them, and your app will be much faster.
Practical Solution I
(shadows, motion blur, depth comp, etc. are disabled)
Use the following properties to disable processor intensive effects:
override func viewDidLoad() {
    super.viewDidLoad()
    arView.renderOptions = [.disableDepthOfField,
                            .disableHDR,
                            .disableMotionBlur,
                            .disableFaceOcclusions,
                            .disablePersonOcclusion,
                            .disableGroundingShadows]
    let boxAnchor = try! Experience.loadBox()
    arView.scene.anchors.append(boxAnchor)
}
Practical Solution II
(shadows, depth comp, etc. are enabled by default)
When you use the following code in RealityKit:
override func viewDidLoad() {
    super.viewDidLoad()
    let boxAnchor = try! Experience.loadBox()
    arView.scene.anchors.append(boxAnchor)
}
you get Reality Composer's preconfigured scene, which enables horizontal plane detection and contains an AnchorEntity with the following settings:
AnchorEntity(.plane(.horizontal,
                    classification: .any,
                    minimumBounds: [0.25, 0.25]))
Separating Tracking and Scene Understanding from Model Loading and Rendering
The problem you're having is a time lag that occurs at the moment your app launches. World tracking starts at that moment (first stage), then the app simultaneously tries to detect a horizontal plane (second stage) and to render the cube's metallic shader (third stage). To get rid of this time lag, use this very simple approach: when the app launches, let it track the room first, and then tap the screen to load the model:
override func viewDidLoad() {
    super.viewDidLoad()
    // Load the model only when the user taps, after tracking has settled.
    let tap = UITapGestureRecognizer(target: self,
                                     action: #selector(self.tapped))
    arView.addGestureRecognizer(tap)
}
@objc func tapped(_ sender: UITapGestureRecognizer) {
    let boxAnchor = try! Experience.loadBox()
    arView.scene.anchors.append(boxAnchor)
}
This way you reduce the simultaneous load on the CPU and GPU, so your cube loads faster.
P.S.
Also, as an alternative you can use a loadModelAsync(named:in:) type method that allows you to load a model entity from a file in a bundle asynchronously:
static func loadModelAsync(named name: String,
in bundle: Bundle?) -> LoadRequest<ModelEntity>
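A minimal sketch of how that asynchronous load might be consumed with Combine is shown below. The class name and the asset name passed in are hypothetical, and anchoring the loaded model to a horizontal plane is just one possible choice.

```swift
import Combine
import RealityKit

// Hypothetical loader, for illustration only.
final class ModelLoader {

    // The subscription must be retained until the load finishes.
    private var cancellable: AnyCancellable?

    func loadModel(named name: String, into arView: ARView) {
        cancellable = Entity.loadModelAsync(named: name)
            .sink(receiveCompletion: { completion in
                if case .failure(let error) = completion {
                    print("Model failed to load: \(error)")
                }
            }, receiveValue: { model in
                // Assumption: place the model on the first horizontal plane found.
                let anchor = AnchorEntity(plane: .horizontal)
                anchor.addChild(model)
                arView.scene.addAnchor(anchor)
            })
    }
}
```

Because the load does not block the main thread, tracking and plane detection can keep running while the asset is read from the bundle.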

In the default Experience.rcproject the cube has an AnchoringComponent targeting a horizontal plane. So basically the cube will not be displayed until the ARSession finds a horizontal plane in your scene (for example the floor or a table). Once it finds one, the cube will appear.
If you instead want to create an anchor and set it as the target when catching a tap event, you could perform a raycast. Using the result of the raycast, you can grab the worldTransform and set the cube's AnchoringComponent to that transform:
Something like this:
boxAnchor.anchoring = AnchoringComponent(.world(transform: raycastResult.worldTransform))
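Put together with a tap location, the raycast approach could look roughly like the sketch below. The reanchor helper is a hypothetical name, and the choice of .estimatedPlane with horizontal alignment is an assumption that suits the default cube scene.

```swift
import UIKit
import ARKit
import RealityKit

extension ARView {

    // Hypothetical helper: re-anchors `entity` at the real-world surface
    // under the screen point. Returns false when the raycast hits nothing.
    @discardableResult
    func reanchor(_ entity: HasAnchoring, at point: CGPoint) -> Bool {
        guard let result = raycast(from: point,
                                   allowing: .estimatedPlane,
                                   alignment: .horizontal).first else {
            return false
        }
        entity.anchoring = AnchoringComponent(.world(transform: result.worldTransform))
        return true
    }
}
```

In a tap handler you would call something like arView.reanchor(boxAnchor, at: sender.location(in: arView)), assuming boxAnchor is the anchor loaded from Experience.loadBox().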

Related

ARKit large model follows camera instead of staying stationary

My code looks for a QR Code in the frame received during the session(didUpdate) ARSCNViewDelegate method. I check to see if all four corners and the center of the QR Code are in the same plane with hitTest, and then drop an ARAnchor at the center. I create a SCNReferenceNode for the anchor with a reference to a SceneKit model of a fairly large house (70'w x 30'd x 30'h). I position the house 30 meters in front (z = -30) and 30 meters to the right (x = 30) of the detected QR Code, and it initially appears OK. However, if I try to "walk around" the model, it moves with me, always maintaining a constant distance and offset from my iPad camera. I have tried using my own anchors, the plane anchors created by ARKit, and lots of other ideas, but nothing changes. How can I get it to stay put, like the plane model does in the boilerplate ARKit Xcode project?
It sounds like although you created some new anchors, that you perhaps didn't assign your model to them? So when your model gets loaded and presented, it's being 'tracked' on the gyro. So you get that Pokemon Go effect where regardless of what you do the AR model doesn't change in size.
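One way to make sure the model is actually attached to the anchor is to add it as a child of the node that ARKit creates for that anchor. The sketch below assumes a named anchor (available from iOS 12); the class name, the anchor name "house" and the helper function are all hypothetical.

```swift
import ARKit
import SceneKit

// Hypothetical delegate that parents the house model to its anchor's node.
final class HouseAnchorDelegate: NSObject, ARSCNViewDelegate {

    let houseNode: SCNNode

    init(houseNode: SCNNode) {
        self.houseNode = houseNode
        super.init()
    }

    // ARKit calls this once it has created a node for a newly added anchor.
    func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
        guard anchor.name == "house" else { return }
        // The offset is expressed relative to the anchor, not the camera.
        houseNode.simdPosition = SIMD3<Float>(30, 0, -30)
        node.addChildNode(houseNode)
    }
}

// Hypothetical helper: drops the named anchor at the detected QR Code's transform.
func dropHouseAnchor(in session: ARSession, at transform: simd_float4x4) {
    session.add(anchor: ARAnchor(name: "house", transform: transform))
}
```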

What's the difference between using ARAnchor to insert a node and directly insert a node?

In ARKit, I have found 2 ways of inserting a node after the hitTest
Insert an ARAnchor then create the node in renderer(_ renderer: SCNSceneRenderer, nodeFor anchor: ARAnchor) -> SCNNode?
let anchor = ARAnchor(transform:hit.worldTransform)
sceneView.session.add(anchor:anchor)
Insert the node directly
node.position = SCNVector3(hit.worldTransform.columns.3.x, hit.worldTransform.columns.3.y, hit.worldTransform.columns.3.z)
sceneView.scene.rootNode.addChildNode(node)
Both seem to work for me, but why choose one way over the other?
Update: As of iOS 11.3 (aka "ARKit 1.5"), there is a difference between adding an ARAnchor to the session (and then associating SceneKit content with it through ARSCNViewDelegate callbacks) and just placing content in SceneKit space.
When you add an anchor to the session, you're telling ARKit that a certain point in world space is relevant to your app. ARKit can then do some extra work to make sure that its world coordinate space lines up accurately with the real world, at least in the vicinity of that point.
So, if you're trying to make virtual content appear "attached" to some real-world point of interest, like putting an object on a table or wall, you should see less "drift" due to world-tracking inaccuracy if you give that object an anchor than if you just place it in SceneKit space. And if that object moves from one static position to another, you'll want to remove the original anchor and add one at the new position afterward.
Additionally, in iOS 11.3 you can opt in to "relocalization", a process that helps ARKit resume a session after it gets interrupted (by a phone call, switching apps, etc). The session still works while it's trying to figure out how to map where you were before to where you are now, which might result in the world-space positions of anchors changing once relocalization succeeds.
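Opting in to relocalization is done through a session observer callback. A minimal sketch, with a hypothetical class name, might look like this:

```swift
import ARKit

// Hypothetical session delegate, for illustration only.
final class RelocalizingSessionDelegate: NSObject, ARSessionDelegate {

    // Returning true asks ARKit to try to resume the interrupted session
    // instead of starting over with a fresh world coordinate space.
    func sessionShouldAttemptRelocalization(_ session: ARSession) -> Bool {
        return true
    }

    // While relocalization is in progress, tracking is reported as limited.
    func session(_ session: ARSession, cameraDidChangeTrackingState camera: ARCamera) {
        if case .limited(.relocalizing) = camera.trackingState {
            print("Relocalizing, anchor positions may still change")
        }
    }
}
```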
(On the other hand, if you're just making space invaders that float in the air, perfectly matching world space isn't as important, and thus you won't really see much difference between anchor-based and non-anchor-based positioning.)
See the bit around "Use anchors to improve tracking quality around virtual objects" in Apple's Handling 3D Interaction and UI Controls in Augmented Reality article / sample code.
The rest of this answer remains historically relevant to iOS 11.0-11.2.5 and explains some context, so I'll leave it below...
Consider first the use of ARAnchor without SceneKit.
If you're using ARSKView, you need a way to reference positions / orientations in 3D (real-world) space, because SpriteKit isn't 3D. You need ARAnchor to keep track of positions in 3D so that they can get mapped into 2D.
If you're building your own engine with Metal (or GL, for some strange reason)... that's not a 3D scene description API — it's a GPU programming API — so it doesn't really have a notion of world space. You can use ARAnchor as a bridge between ARKit's notion of world space and whatever you build.
So in some cases you need ARAnchor because that's the only sensible way to refer to 3D positions. (And of course, if you're using plane detection, you need ARPlaneAnchor because ARKit will actually move those relative to scene space as it refines its estimates of where planes are.)
With ARSCNView, SceneKit already has a 3D world coordinate space, and ARKit does all the work of making that space match up to the real-world space ARKit maps out. So, given a float4x4 transform that describes a position (and orientation, etc) in world space, you can either:
Create an ARAnchor, add it to the session, and respond to ARSCNViewDelegate callback to provide SceneKit content for each anchor, which ARKit will add to and position in the scene for you.
Create an SCNNode, set its simdTransform, and add it as a child of the scene's rootNode.
As long as you have a running ARSession, there's no difference between the two approaches — they're equivalent ways to say the same thing. So if you like doing things the SceneKit way, there's nothing wrong with that. (You can even use SCNVector3 and SCNMatrix4 instead of SIMD types if you want, but you'll have to convert back and forth if you're also getting SIMD types from ARKit APIs.)
The one time these approaches differ is when the session is reset. If world tracking fails, you resume an interrupted session, and/or you start a session over again, "world space" may no longer line up with the real world in the same way it did when you placed content in the scene.
In this case, you can have ARKit remove anchors from the session — see the run(_:options:) method and ARSession.RunOptions. (Yes, all of them, because at this point you can't trust any of them to be valid anymore.) If you placed content in the scene using anchors and delegate callbacks, ARKit will nuke all the content. (You get delegate callbacks that it's being removed.) If you placed content with SceneKit API, it stays in the scene (but most likely in the wrong place).
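For reference, a reset that discards all existing anchors could be sketched as follows. The function name is hypothetical, and the plane detection setting is only an example configuration.

```swift
import ARKit

// Hypothetical helper: restarts world tracking and drops every existing anchor.
func restartTracking(_ session: ARSession) {
    let config = ARWorldTrackingConfiguration()
    config.planeDetection = [.horizontal]

    // .resetTracking starts a new world coordinate space;
    // .removeExistingAnchors removes anchors placed in the old one.
    session.run(config, options: [.resetTracking, .removeExistingAnchors])
}
```

Content that was added directly to the SceneKit root node is not touched by these options, so you would have to remove or reposition it yourself.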
So, which to use sort of depends on how you want to handle session failures and interruptions (and outside of that there's no real difference).
SCNVector3 is just "a representation of a three-component vector." SCNVector3 docs.
When using ARAnchor, you have access to a three-component vector, but also you are able "to track the positions and orientations of real or virtual objects relative to the camera" ARAnchor docs. And that's why you use the session to add the anchor instead of using the scene.
See the docs and you can see the difference in terms of the API :)
Hope it helps.

Using the pure Metal API alongside SceneKit or SpriteKit

I have a SKView and a MTKView running in one application and everything is working well so far.
The only thing is that the two views are poorly integrated visually. They are just side by side. I would like to have the pure Metal rendering inside the SKView, moving with some of the SKNodes inside it. It is a kind of fast display inside the SKView.
The Metal side runs quite a lot of computation and rendering. The SKView should provide a nice UI for the heavy calculations, alongside minimal but very fast rendering of the pure Metal part.
I already thought about using SceneKit with an overlay of a SpriteKit scene, because SCNRenderer offers the possibility to render a custom MTLCommandBuffer and MTLRenderPassDescriptor with renderAtTime.
I implemented the following SCNSceneRendererDelegate method and called my own render function, which is preparing the commandBuffer.
func renderer(renderer: SCNSceneRenderer, didRenderScene scene: SCNScene, atTime time: NSTimeInterval) {
    nodeArray.render()
}
After the commandBuffer is ready I call the renderAtTime method of my SCNRenderer. Trial and error showed me that the command buffer must be committed after calling renderAtTime. If I commit it before, the app crashes. If I don't commit it at all, the app freezes.
func bufferFinished(renderer: SCNRenderer, commandBuffer: MTLCommandBuffer, renderPassDescriptor: MTLRenderPassDescriptor) {
    let current = CFAbsoluteTimeGetCurrent()
    renderer.renderAtTime(current, viewport: gameView.bounds, commandBuffer: commandBuffer, passDescriptor: renderPassDescriptor)
    commandBuffer.commit()
}
If I do this the app is running but no additional metal context is shown.
I think the whole thing is kind of complex because of the Metal part.
Is there any simple sample where pure metal is rendered in a SceneKit view or better in a SpriteKit view?
Metal, SpriteKit & SceneKit
SpriteKit and SceneKit work together: you can use SceneKit within SpriteKit and vice versa.
In SpriteKit you can use SK3DNode to render a SceneKit scene. Here is an Example: https://github.com/CloakedEddy/SK3DNode-example/tree/master/SKSCN%20Crossover
In SceneKit you can use the overlaySKScene property of your SCNSceneRenderer which is inherited by your SCNView. You can render your SpriteKit scenes there side by side with the SceneKit scenes.
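Both directions can be sketched in a few lines. The function names, viewport size and label text below are placeholders chosen for illustration.

```swift
import SceneKit
import SpriteKit

// SceneKit inside SpriteKit: an SK3DNode renders an SCNScene as a sprite.
func embed(_ scnScene: SCNScene, in skScene: SKScene) {
    let node3D = SK3DNode(viewportSize: CGSize(width: 200, height: 200))
    node3D.scnScene = scnScene
    node3D.position = CGPoint(x: skScene.size.width / 2,
                              y: skScene.size.height / 2)
    skScene.addChild(node3D)
}

// SpriteKit on top of SceneKit: a 2D scene drawn over the 3D view.
func addOverlay(to scnView: SCNView) {
    let overlay = SKScene(size: scnView.bounds.size)
    let label = SKLabelNode(text: "HUD")
    label.position = CGPoint(x: overlay.size.width / 2, y: 40)
    overlay.addChild(label)
    scnView.overlaySKScene = overlay
}
```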
But how to integrate metal?
Using Metal with SpriteKit can be done with textures. You compute the textures with Metal on the GPU and transfer the image data to an SKMutableTexture via modifyPixelDataWithBlock. This works, but you have to copy the data, which is not nice, and SpriteKit works with OpenGL behind the scenes rather than Metal. So it has to create an OpenGL texture, which is very expensive.
Another issue is that you have to synchronize the GPU data if you are working with large textures, in order to fetch them into CPU memory. Only then can you copy the data to the SKMutableTexture. This is very inefficient, since the data goes to the GPU again, but now through OpenGL.
With SceneKit you can choose whether to have OpenGL or Metal as the underlying GPU framework. And you can modify the texture of a material property directly with Metal: just set the contents property of the SCNMaterialProperty. Or you can directly modify the SCNGeometrySource with Metal if you want to change the geometry of an object.
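As a sketch of the material route, the helper below creates a Metal texture and hands it to a material's diffuse property. The function name, the pixel format and the usage flags are assumptions; the device should be the same one the SceneKit view renders with.

```swift
import Metal
import SceneKit

// Hypothetical helper: returns a material whose diffuse contents is a Metal
// texture that your own compute or render passes can write into.
func makeMetalBackedMaterial(device: MTLDevice, size: Int) -> (material: SCNMaterial, texture: MTLTexture)? {
    let descriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba8Unorm,
                                                              width: size,
                                                              height: size,
                                                              mipmapped: false)
    // Readable by SceneKit's shaders, writable by a compute kernel.
    descriptor.usage = [.shaderRead, .shaderWrite]

    guard let texture = device.makeTexture(descriptor: descriptor) else {
        return nil
    }

    let material = SCNMaterial()
    material.diffuse.contents = texture
    return (material, texture)
}
```

Since the texture stays on the GPU, no copy through CPU memory is needed when Metal updates it.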

How to display a part of a scene in another scene (Scene Kit + Swift)

First, I just want to introduce to you guys my problem, because it is really complex so you need this to understand it properly.
I am trying to do something with Scene Kit and Swift: I want to reproduce what we can see in the TV show Doctor Who, where the Doctor's spaceship is bigger on the inside, as you can see in this video.
Of course the Scene Kit framework doesn't support this kind of unreal dimensions, so we need some sort of hackery to achieve that.
Now let's talk about my idea in plain English
In fact, what we want to do is to display two completely different dimensions at the same place, so I was thinking of:
A first dimension for the inside of the spaceship.
A second dimension for the outside of the spaceship.
Now, let's say that you are outside of the ship, you would be in the outside dimension, and in this outside dimension, my goal would be to display a portion of the inside dimension at the level of the door to give this effect where the camera is outside but where we can clearly see that the inside is bigger :
We would use an equivalent principle from the inside.
Now let's talk about the game logic :
I think that a good way to represent these dimensions would be to use two scenes.
We will call outsideScene the scene for the outside, and insideScene the scene for the inside.
So if we take again the picture, this would give this at the scene level :
To make it look realistic, the view of the inside needs to follow the movements of the outside camera, that's why I think that all the properties of these two cameras will be identical :
On the left is the outsideScene and on the right, the insideScene. I represent the camera field of view in orange.
If the outsideScene camera moves right, the insideScene camera will do exactly the same thing, if the outsideScene camera rotates, the insideScene camera will rotate in the same way... you get the principle.
So, my question is the following: what can I use to mask a certain portion of a certain scene (in this case the yellow zone in the outsideView) with what the camera of another view (the insideView) "sees"?
First, I thought that I could simply get an NSImage from the insideScene and then use it as the texture of a surface in the outsideScene, but the problem is that Scene Kit would compute its perspective, lighting etc., so it would just look like we were displaying something on a screen, and that's not what I want.
There is no super easy way to achieve this in SceneKit.
If your "inside scene" is static and can be baked into a cube map texture you can use shader modifiers and a technique called interior mapping (you can easily find examples on the web).
If you need a live, interactive "inside scene" you can use the same technique but will have to render your scene into a texture first (or render your inside scene and outer scene one after the other with stencils). This can be done by leveraging SCNTechnique (new in Yosemite and iOS 8). On older versions you will have to write some OpenGL code in SCNSceneRenderer delegate methods.
I don't know if it's 'difficult'. As is often the case in iOS, the simplest answer is the simplest answer.
Maybe consider this:
Map a texture onto a cylinder sector prescribed by the geometry of the Tardis cube shape. Make sure the cylinder radius is equal to the focal distance of the camera. Make sure you track the camera to the focal point.
The texture will be distorted because it is a cylinder mapped onto a cube. The actors' nodes in the Tardis will react properly to the camera, but there should be two groups of light sources: one set for inside the Tardis and one for outside the Tardis.

quartz 2d / openGl / cocos2d image distortion in iphone by moving vertices for 2.5d iphone game

We are trying to achieve the following in an iphone game:
Using 2d png files, set-up a scene that seems 3d. As the user moves the device, the individual png files would warp/distort accordingly to give the effect of depth.
example of a scene: an empty room, 5 walls and a chair in the middle. = 6 png files layered.
We have successfully accomplished this using native functions like skew and scale. By applying transformations to the various walls and the chair, as the device is tilted or moved, the walls skew/scale/translate. However, the problem is that since we are using 6 png files, the edges don't meet as we move the device. We need a new solution using a real engine.
Question:
We are thinking that, instead of applying skew/scale transformations, if we were given the freedom to move the vertices of the rectangular images, we could precisely distort the images and keep all the edges 100% aligned.
What is the best framework to do this in the LEAST amount of time? Are we going about this the correct way?
You should be able to achieve this effect (at least in regards to the perspective being applied to the walls) using Core Animation layers and appropriate 3-D transforms.
A good example of constructing a scene like this can be found in the example John Blackburn provides here. He shows how to set up layers to represent the walls in a maze by applying the appropriate rotation and translation to them, then gives the scene perspective by using the trick of altering the m34 component of the CATransform3D for the scene.
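The m34 trick itself is short. In the sketch below, the function names and the eye distance of 500 points are arbitrary choices for illustration; the sign of the wall rotation depends on how your scene is laid out.

```swift
import UIKit

// Gives every sublayer of `container` a shared perspective projection.
// Smaller eye distances produce a stronger perspective effect.
func applyPerspective(to container: CALayer, eyeDistance: CGFloat = 500) {
    var perspective = CATransform3DIdentity
    perspective.m34 = -1.0 / eyeDistance
    container.sublayerTransform = perspective
}

// Builds a wall layer hinged on its left edge and turned about the vertical axis.
func makeSideWall(size: CGSize) -> CALayer {
    let wall = CALayer()
    wall.bounds = CGRect(origin: .zero, size: size)
    wall.anchorPoint = CGPoint(x: 0, y: 0.5)
    wall.transform = CATransform3DMakeRotation(.pi / 2, 0, 1, 0)
    return wall
}
```

Because the perspective lives in the container's sublayerTransform, all walls share one vanishing point, which helps their edges stay aligned as the scene rotates.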
I'm not sure how well your flat chair would look using something like this, but certainly you can get your walls to have a nice perspective to them. Using layers and Core Animation would let you pull off what you want using far less code than implementing this using OpenGL ES.
Altering the camera angle is as simple as rotating the scene in response to shifts in the orientation of the device.
If you're going to the effort of warping textures as they would be warped in a 3D scene, then why not let the graphics hardware do the hard work for you by mapping the textures to 3D polygons, then changing your projection or moving polygons around?
I doubt you could do it faster by restricting yourself to 2D transformations; the hardware is geared up to do 3x3 (well, 4x4 homogeneous) matrix multiplication.
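To show what a 4x4 homogeneous transform does to a vertex, here is a small plain-Swift sketch of a perspective projection. The function names and the simplified projection matrix are illustrative assumptions, not what any particular graphics API uses internally.

```swift
// Multiplies a row-major 4x4 matrix by a 4-component column vector.
func multiply(_ m: [[Double]], _ v: [Double]) -> [Double] {
    return m.map { row -> Double in
        var sum = 0.0
        for (a, b) in zip(row, v) {
            sum += a * b
        }
        return sum
    }
}

// A simplified perspective matrix: it copies z / eyeDistance into w,
// so the later divide by w shrinks points that are farther away.
func perspectiveMatrix(eyeDistance d: Double) -> [[Double]] {
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 1 / d, 0]]
}

// Projects a 3D point onto the image plane using homogeneous coordinates.
func project(x: Double, y: Double, z: Double, eyeDistance: Double) -> (x: Double, y: Double) {
    let p = multiply(perspectiveMatrix(eyeDistance: eyeDistance), [x, y, z, 1])
    return (x: p[0] / p[3], y: p[1] / p[3])
}

let corner = project(x: 2, y: 4, z: 4, eyeDistance: 2)
print(corner) // (x: 1.0, y: 2.0)
```

Moving a wall's four corners through the same matrix keeps shared edges coincident, which is exactly what separate 2D skew and scale transforms cannot guarantee.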