I am new to iPhone app development (but experienced developer in other languages) and I am making an ARKit application in which I need to track an image's position and display a rectangle around this image.
I could do it in C++ with OpenCV and make the appropriate "bridging" classes to be able to call this code from swift. Now, I need to get the images from ARKit and pass them to this function.
How do I suscribe a function which handles the ARFrames from the ARKit scene? I found that I could get some ARFrame from sceneView.session.currentFrame but I did not find how to make a function that would be called for each frame (or each time my function has ended and is ready to receive the next frame).
Also, I discovered the Vision Framework but it seems to only be able to track an element on which the user tapped. Is that right or is there a combination of predefined functions which could be used to this purpose?
You can get each captured frame by the camera with the session(:didUpdate:) method from the ARSessionDelegate protocol. Note that this method might be useful even if you don't use the provided ARFrame; for instance, I use it to "poll" if there's any virtual content added currently visible.
Regarding the tracking image question, I believe you could create an ARAnchor or SCNNode and position it over the image and ARKit would track it
Related
I'm looking to change the field of view for the rendered content in my AR session. Obviously we can't change the raw camera FOV, but I think it should be possible to change the field of view for the rendered SceneKit content.
Changing the camera field of view is trivial in a raw SceneKit SCNCamera... but I don't know how to do this within an ARSCNView.
You might be able to access the pointOfView property of your ARSCNView (and then retrieve the active SCNCamera).
If that doesn't work (ARKit changing the camera property every frame etc.), you can always go the path of writing the code yourself by using ARSession directly with SCNView.
Note that unless you have a 3D scene covering the entire camera stream, changing the FoV of your virtual camera would break the AR registration (alignment).
The developer documentation for ARSession suggests "If you build your own renderer for AR content, you'll need to instantiate and maintain an ARSession object yourself."
This repo does this: https://github.com/hanleyweng/iOS-ARKit-Headset-View
This code will retrieve the camera from an ARSCNView if there is one:
sceneView.scene.rootNode.childNodes.first(where: { $0.camera != nil})
Note that this will return the camera's associated node, which you may need if you want to control its position or angle directly. The SCNCamera itself is stored in the node's camera property.
It's best not to touch the AR camera if you can avoid it as it will mess up the association between the world model and the scene. I occasionally use this technique if I want a control system that can optionally use AR to track device motion and angle, but which doesn't have to translate into real-world coordinates (i.e. VR apps that don't display the camera feed).
Essentially, I'd only do this if you're using AROrientationTrackingConfiguration or similar.
EDIT I should probably mention that ARKit overrides the camera's projectionTransform property, so you probably won't be able to set fieldOfView manually. I've had some success setting xFov and yFov, but since these are deprecated you shouldn't rely on them.
As many other developers, I have plunged myself into Apple's new ARKit technology. It's great.
For a specific project however, I would like to be able to recognise (real-life) images in the scene, to either project something on it (just like Vuforia does with its target images), or to use it to trigger an event in my application.
In my research on how to accomplish this, I stumbled upon the Vision and CoreML frameworks by Apple. This seems promising, although I have not yet been able to wrap my head around it.
As I understand it, I should be able to do exactly what I want by finding rectangles using the Vision framework and feeding those into a CoreML model that simply compares it to the target images that I predefined within the model. It should then be able to spit out which target image it found.
Although this sounds good in my head, I have not yet found a way to do this. How would I go about creating a model like that, and is it even possible at all?
I found this project on Github some weeks ago:
AR Kit Rectangle Detection
I think that is exactly what you are looking for...
As of ARKit 1.5 (coming with IOS 11.3 in the spring of 2018), a feature seems to be implemented directly on top of ARKit that solves this problem.
ARKit will fully support image recognition.
Upon recognition of an image, the 3d coordinates can be retrieved as an anchor, and therefore content can be placed onto them.
Vision's ability to detect images was implemented in ARKit beginning from iOS 11.3+, so since then ARKit has ARImageAnchor subclass that extends ARAnchor parent class and conforms to ARTrackable protocol.
// Classes hierarchy and Protocol conformance...
ObjectiveC.NSObject: NSObjectProtocol
↳ ARKit.ARAnchor: ARAnchorCopying
↳ ARKit.ARImageAnchor: ARTrackable
ARWorldTrackingConfiguration class has a detectionImages instance property that is actually a set of images that ARKit attempts to detect in the user's environment.
open var detectionImages: Set<ARReferenceImage>!
And ARImageTrackingConfiguration class has a trackingImages instance property that is a set as well, it serves the same purpose – ARKit attempts to detect and track it in the user's environment.
open var trackingImages: Set<ARReferenceImage>
So, having a right configuration and an ability to automatically get ARImageAnchor in a ARSession, you can tether any geometry to that anchor.
P.S. If you want to know how to implement image detection feature in your ARKit app please look at this post.
In ARKit, I have found 2 ways of inserting a node after the hitTest
Insert an ARAnchor then create the node in renderer(_ renderer: SCNSceneRenderer, nodeFor anchor: ARAnchor) -> SCNNode?
let anchor = ARAnchor(transform:hit.worldTransform)
sceneView.session.add(anchor:anchor)
Insert the node directly
node.position = SCNVector3(hit.worldTransform.columns.3.x, hit.worldTransform.columns.3.y, hit.worldTransform.columns.3.z)
sceneView.scene.rootNode.addChildNode(node)
Both look to work for me, but why one way or the other?
Update: As of iOS 11.3 (aka "ARKit 1.5"), there is a difference between adding an ARAnchor to the session (and then associating SceneKit content with it through ARSCNViewDelegate callbacks) and just placing content in SceneKit space.
When you add an anchor to the session, you're telling ARKit that a certain point in world space is relevant to your app. ARKit can then do some extra work to make sure that its world coordinate space lines up accurately with the real world, at least in the vicinity of that point.
So, if you're trying to make virtual content appear "attached" to some real-world point of interest, like putting an object on a table or wall, you should see less "drift" due to world-tracking inaccuracy if you give that object an anchor than if you just place it in SceneKit space. And if that object moves from one static position to another, you'll want to remove the original anchor and add one at the new position afterward.
Additionally, in iOS 11.3 you can opt in to "relocalization", a process that helps ARKit resume a session after it gets interrupted (by a phone call, switching apps, etc). The session still works while it's trying to figure out how to map where you were before to where you are now, which might result in the world-space positions of anchors changing once relocalization succeeds.
(On the other hand, if you're just making space invaders that float in the air, perfectly matching world space isn't as important, and thus you won't really see much difference between anchor-based and non-anchor-based positioning.)
See the bit around "Use anchors to improve tracking quality around virtual objects" in Apple's Handling 3D Interaction and UI Controls in Augmented Reality article / sample code.
The rest of this answer remains historically relevant to iOS 11.0-11.2.5 and explains some context, so I'll leave it below...
Consider first the use of ARAnchor without SceneKit.
If you're using ARSKView, you need a way to reference positions / orientations in 3D (real-world) space, because SpriteKit isn't 3D. You need ARAnchor to keep track of positions in 3D so that they can get mapped into 2D.
If you're building your own engine with Metal (or GL, for some strange reason)... that's not a 3D scene description API — it's a GPU programming API — so it doesn't really have a notion of world space. You can use ARAnchor as a bridge between ARKit's notion of world space and whatever you build.
So in some cases you need ARAnchor because that's the only sensible way to refer to 3D positions. (And of course, if you're using plane detection, you need ARPlaneAnchor because ARKit will actually move those relative to scene space as it refined its estimates of where planes are.)
With ARSCNView, SceneKit already has a 3D world coordinate space, and ARKit does all the work of making that space match up to the real-world space ARKit maps out. So, given a float4x4 transform that describes a position (and orientation, etc) in world space, you can either:
Create an ARAnchor, add it to the session, and respond to ARSCNViewDelegate callback to provide SceneKit content for each anchor, which ARKit will add to and position in the scene for you.
Create an SCNNode, set its simdTransform, and add it as a child of the scene's rootNode.
As long as you have a running ARSession, there's no difference between the two approaches — they're equivalent ways to say the same thing. So if you like doing things the SceneKit way, there's nothing wrong with that. (You can even use SCNVector3 and SCNMatrix4 instead of SIMD types if you want, but you'll have to convert back and forth if you're also getting SIMD types from ARKit APIs.)
The one time these approaches differ is when the session is reset. If world tracking fails, you resume an interrupted session, and/or
you start a session over again, "world space" may no longer line up with the real world in the same way it did when you placed content in the scene.
In this case, you can have ARKit remove anchors from the session — see the run(_:options:) method and ARSession.RunOptions. (Yes, all of them, because at this point you can't trust any of them to be valid anymore.) If you placed content in the scene using anchors and delegate callbacks, ARKit will nuke all the content. (You get delegate callbacks that it's being removed.) If you placed content with SceneKit API, it stays in the scene (but most likely in the wrong place).
So, which to use sort of depends on how you want to handle session failures and interruptions (and outside of that there's no real difference).
SCNVector3 is just "a representation of a three-component vector." SCNVector3 docs.
When using ARAnchor, you have access to a three-component vector, but also you are able "to track the positions and orientations of real or virtual objects relative to the camera" ARAnchor docs. And that's why you use the session to add the anchor instead of using the scene.
See the docs and you can see the difference in terms of the API :)
Hope it helps.
First, I just want to introduce to you guys my problem, because it is really complex so you need this to understand it properly.
I am trying to do something with Scene Kit and Swift : I want to reproduce what we can see in the TV Show Doctor Who where the Doctor's spaceship is bigger on the inside, as you can see in this video.
Of course the Scene Kit Framework doesn't support those kind of unreal dimensions so we need to do some sort of hackery to do achieve that.
Now let's talk about my idea in plain english
In fact, what we want to do is to display two completely different dimensions at the same place ; so I was thinking to :
A first dimension for the inside of the spaceship.
A second dimension for the outside of the spaceship.
Now, let's say that you are outside of the ship, you would be in the outside dimension, and in this outside dimension, my goal would be to display a portion of the inside dimension at the level of the door to give this effect where the camera is outside but where we can clearly see that the inside is bigger :
We would use an equivalent principle from the inside.
Now let's talk about the game logic :
I think that a good way to represent these dimensions would be two use two scenes.
We will call outsideScene the scene for the outside, and insideScene the scene for the inside.
So if we take again the picture, this would give this at the scene level :
To make it look realistic, the view of the inside needs to follow the movements of the outside camera, that's why I think that all the properties of these two cameras will be identical :
On the left is the outsideScene and on the right, the insideScene. I represent the camera field of view in orange.
If the outsideScene camera moves right, the insideScene camera will do exactly the same thing, if the outsideScene camera rotates, the insideScene camera will rotate in the same way... you get the principle.
So, my question is the following : what can I use to mask a certain portion of a certain scene (in this case the yellow zone in the outsideView) with what the camera of another view (the insideView) "sees" ?
First, I thought that I could simply get an NSImage from the insideScene and then put it as the texture of a surface in the outsideScene, but the problem would be that Scene Kit would compute it's perspective, lighting etc... so It would just look like we was displaying something on a screen and that's not what I want.
there is no super easy way to achieve this in SceneKit.
If your "inside scene" is static and can be baked into a cube map texture you can use shader modifiers and a technique called interior mapping (you can easily find examples on the web).
If you need a live, interactive "inside scene" you can use the sane technique but will have to render your scene in a texture first (or renderer your inside scene and outer scene one after the other with stencils). This can be done by leveraging SCNTechnique (new in Yosemite and iOS 8). On older versions you will have to write some OpenGL code in SCNSceneRenderer delegate methods.
I don't know if it's 'difficult'. As we have to in iOS , a lot of times the simplest answer ..is the simplest answer.
Maybe consider this:
Map a texture onto a cylinder sector prescribed by the geometry of the Tardis cube shape. Make sure the cylinder radius is equal of the focal point of the camera. Make sure you track the camera to the focal point.
The texture will be distorted because it is a cylinder making onto a cube. The actors' nodes in the Tardis will react properly to the camera but there should be two groups of light sources...One set for the Tardis and one outside the Tardis.
I went through all my resources but I am not getting the scattering effect of an image smoothly. However, I am able to zoom it and I had scattered it but it is not as smooth as I want. I just want to click on a button and it should zoom and the other image should get scatter
I want the image to be scattered as the link given below is it possible in iPhone?
http://www.touchmagix.com/templates/diamond.htm
this is hell lot of graphics. If you intend to achieve this with iPhone sdk, start learning details about CALayer and then try to manipulate the movements of different objects. Which will require lot of coding logic.
My suggestion would be going for Cocos2D and try with the different classes and api's present there. In Cocos2D you can create an image which will act as an actual physical object and you can apply all laws of physics to it. Say if you create a ball, and you push it, the ball will go in the direction of the push and bounce back if it gets hit by any other physical object and you don't require to do any coding for this. Cocos2D takes care of all these things.
Try it.