HOW do Unity World Anchors work on the HoloLens?

I'm currently building a HoloLens application and have a feature in-mind that requires holograms to be dynamically created, placed, and to persist between sessions. Those holograms don't need to be shared between devices.
I've had a nightmare trying to find (working) implementations of and documentation for Unity WorldAnchors, with Azure Spatial Anchors seeming to stomp out most traces of them. Thankfully I've gotten past that and have managed to implement WorldAnchors using the older HoloToolkit, since documentation for WorldAnchors in the newer MRTK also seems to have disappeared.
MY QUESTION (because I am unable to find any docs for it) is how do WorldAnchors work?
I'd hazard a guess that it's based on spatial mapping, which presents a limitation: if you have two identical rooms, or objects that move in the original room, the anchors are going to be lost.
What I'd LIKE to hear is that it's some magical management of transforms, which means my app has an understanding of its change in real-world location between uses even if the app is launched from a different location each time.
Does anybody know the answer or where I might look (beyond the limited Unity and MS Docs for this matter) to find out implementation details?
Thank you.

I'd hazard a guess that it's based on spatial mapping, which presents a limitation: if you have two identical rooms, or objects that move in the original room, the anchors are going to be lost.
We won't divulge the internal implementation details of the World Anchor, but we can state that it is not based on GPS on either HoloLens v1 or HoloLens v2. Currently, the World Anchor uses the data in the spatial map for placement. The key underlying piece is that anchors rely on spatial scanning, and the scanning can use Wi-Fi to improve its speed and accuracy; see these two references: 1 & 2
What I'd LIKE to hear is that it's some magical management of transforms, which means my app has an understanding of its change in real-world location between uses even if the app is launched from a different location each time.
It is certainly possible for two identical rooms with the exact same layout to trick the mapping into thinking they are the same room. We document that here:
https://learn.microsoft.com/en-us/windows/mixed-reality/coordinate-systems#headset-tracks-incorrectly-due-to-identical-spaces-in-an-environment
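Since the original question also asks about persisting holograms between sessions, here is a minimal sketch of the save/load flow using Unity's legacy WorldAnchorStore API (the layer HoloToolkit builds on). The anchor IDs and method structure are illustrative only, not taken from this thread:

    // Minimal sketch: persisting a WorldAnchor between sessions via Unity's
    // legacy WSA API (Unity 2017-2019 era). Identifiers are illustrative.
    using UnityEngine;
    using UnityEngine.XR.WSA;
    using UnityEngine.XR.WSA.Persistence;

    public class AnchorPersistence : MonoBehaviour
    {
        private WorldAnchorStore store;

        private void Start()
        {
            // The store loads asynchronously; cache it once it's ready.
            WorldAnchorStore.GetAsync(s => store = s);
        }

        public void SaveAnchor(GameObject hologram, string id)
        {
            var anchor = hologram.AddComponent<WorldAnchor>();
            if (!store.Save(id, anchor))
                Debug.LogWarning("Anchor save failed: " + id);
        }

        public void LoadAnchor(GameObject hologram, string id)
        {
            // Re-attaches the persisted anchor; the hologram snaps back to
            // its real-world pose once tracking recognizes the anchored space.
            store.Load(id, hologram);
        }
    }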

Where should the user-independent configuration for a multi-app system exist within the registry?

I apologise if this seems like a question with no clear-cut answer (although I'm hoping there is one), but please bear with me...
The company I work for is developing an immersive environment in Unity, which consists of a box-shaped room with multiple projectors casting images onto the walls and ceiling to produce an effect similar to VR, but without the need for a headset. We're using Unity and C# to develop the system, and I've been writing a sort of "platform" that acts as a starting point for the applications we develop for the environment. One of the systems contained within this platform is for screen configuration; this includes the dimensions of the screens and the mapping of projectors to views (i.e. it indicates which projector is responsible for projecting the forward view, etc.)
Now, in order to keep things simple, I'm going to be storing this configuration within the registry; this way, all of the separate applications will share the configuration of the immersive environment. I've implemented this, and everything works as expected. However, as I'm pedantic about things, I just want to make sure I'm using the correct location within the registry for what I'm storing.
At the moment, I'm using "HKEY_LOCAL_MACHINE\System\BlueRoom..." ("Blue Room" is the name of the environment we're developing). I know I'll want to store the configuration within HKLM as opposed to HKCU, as the setup of the Blue Room's screens is the same regardless of the user. However, beyond this I'm not sure whether I should be storing the configuration in "\System\BlueRoom..." or "\Software\BlueRoom...". Are there set guidelines pertaining to this, or is it a matter of preference?
Put it in "\Software\BlueRoom\", seeing as that's what it is: software configuration. "System" is reserved for things that pertain to the functioning of the system itself, e.g. hardware and Windows.
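For illustration, reading that shared configuration back in C# might look like the sketch below; the key path and value name are hypothetical, and note that while reading HKLM works from a normal process, writing to it typically requires elevated privileges:

    // Minimal sketch (C#): reading a shared screen-configuration value from
    // HKEY_LOCAL_MACHINE. "SOFTWARE\BlueRoom" and "ForwardProjector" are
    // hypothetical names, not taken from the actual project.
    using Microsoft.Win32;

    public static class BlueRoomConfig
    {
        private const string KeyPath = @"SOFTWARE\BlueRoom";

        public static int GetForwardProjector(int fallback = 0)
        {
            // Registry.LocalMachine maps to HKLM; OpenSubKey returns null
            // if the key has not been created yet.
            using (RegistryKey key = Registry.LocalMachine.OpenSubKey(KeyPath))
            {
                if (key == null) return fallback;
                return (int)key.GetValue("ForwardProjector", fallback);
            }
        }
    }

Also be aware that a 32-bit process on 64-bit Windows is transparently redirected to the WOW6432Node subtree under HKLM\Software unless it opts out.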

Hololens-SpatialMapping (Unity3D)

I'm currently doing a project with Microsoft's HoloLens. The problem is that the HoloLens has limited memory, so I can only make a spatial mapping of a single room, not of a whole building, because it can't remember the entire building. I had an idea: maybe I can create several meshes and assemble them? But nobody talks about this... Do you think it's possible?
Thanks for reading.
Y.P
Since you don’t have a compass, you could establish some convention to help. For example, you could start the scanning by giving a voice command (and stop it by another one), and decide to only start scanning when you’re facing north, for example. Then it would be easy to know the orientation of each room. What may be harder is to get the angle exactly right. Your head might be off by a few degrees and you may have to work some “magic” (post processing) to correct it.
Or placing QR codes on a wall (printer paper + scotch tape) and using something like Vuforia can help you avoid this orientation problem altogether (you would get the QR code’s orientation which would match that of the wall).
You can also simplify the scanned mesh and convert it to planes. That way you can remember simpler objects instead of the raw spatial mapping mesh. (Search for the SurfaceToPlanes script in the Holographic Academy tutorials).
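If it helps, that plane-extraction step looks roughly like this with HoloToolkit's SpatialMapping utilities; the exact class and event names (SurfaceMeshesToPlanes, MakePlanesComplete) can vary between toolkit versions, so treat this as a sketch:

    // Hedged sketch: converting the raw spatial mapping mesh into a handful
    // of planes (walls, floor, ceiling) that are cheap to store and reload.
    using HoloToolkit.Unity.SpatialMapping;
    using UnityEngine;

    public class RoomToPlanes : MonoBehaviour
    {
        public void Simplify()
        {
            SurfaceMeshesToPlanes.Instance.MakePlanesComplete += OnPlanesDone;
            SurfaceMeshesToPlanes.Instance.MakePlanes();
        }

        private void OnPlanesDone(object source, System.EventArgs args)
        {
            SurfaceMeshesToPlanes.Instance.MakePlanesComplete -= OnPlanesDone;
            Debug.Log("Planes found: " + SurfaceMeshesToPlanes.Instance.ActivePlanes.Count);
        }
    }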
Scanning, the first layer (the HoloLens trying to reason about its environment), is an unstoppable process. There is no API for starting or stopping it, and as far as I know that process also slowly consumes more and more memory. The only things you can do are deleting space (aka deleting holograms) or covering the sensors. But that's at the OS/hardware level, not the app level, which is presumably what you want.
Layer two, which is what you're probably talking about, is starting and stopping the spatial reconstruction process, where that raw spatial data is processed into a low-poly mesh (aka spatial mapping). This process can be started and stopped, for example through Unity's SpatialMappingCollider and SpatialMappingRenderer components, if you use Unity; a minimal sketch follows.
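Something like this, assuming Unity's built-in Windows Mixed Reality support and its freezeUpdates property (the component wiring here is illustrative):

    // Minimal sketch: pausing/resuming spatial reconstruction per room via
    // Unity's built-in components (UnityEngine.XR.WSA, Unity 2017-2019 era).
    using UnityEngine;
    using UnityEngine.XR.WSA;

    public class RoomScanToggle : MonoBehaviour
    {
        [SerializeField] private SpatialMappingRenderer mappingRenderer;
        [SerializeField] private SpatialMappingCollider mappingCollider;

        public void SetScanning(bool scanning)
        {
            // Freezing updates stops new surface data from being baked into
            // the mesh; geometry captured so far stays in the scene.
            mappingRenderer.freezeUpdates = !scanning;
            mappingCollider.freezeUpdates = !scanning;
        }
    }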
Finally, the third layer is extracting objects/segments from that spatial mapping mesh into primitives, like that SurfaceToPlanes. That one you can also fully control in terms of when it runs.
There has been great confusion, especially due to the renaming sprees in the MixedRealityToolkit (overuse of the word Scanning) and Unity (SpatialAnchor to WorldAnchor, etc.), and misleading tutorials using a lot of colloquialisms instead of crisp terminology.
Theory aside: if you want the HoloLens to treat your entire building as one continuous space in terms of the first layer, you're out of luck. It was designed for a living room, and there is a lot of voodoo involved in making it work stably in facilities of 30x30 meters. You probably want to rely on disjoint "islands" with specific detection anchors to identify where you are, or rely on markers and coordinates relative to them.
Cheers

Modeling a Physical Place inside iPhone Application

I need to find a way to model a physical place inside an iPhone application. For example, I want to be able to take images of a restaurant and then use some tool or programming API to model the restaurant as a 3D place, letting the user navigate and explore the place and its rooms.
I have thought about HTML5 inside a web view, but I don't think WebGL is compatible with the iPhone's web view (Safari engine).
Can you please recommend a method, API, Commercial Library or anything to help me achieve this task?
First, you need to be able to display 3D models on the iPhone. One of the most popular 3D engines is Unity3D:
http://unity3d.com/
It is extremely easy to start playing with Unity3D. You even have a free license with limited features:
http://unity3d.com/unity/licenses
Then, you need to reconstruct a 3D model from pictures. This is not a trivial problem, so it helps if you know some computer vision. You can try playing with OpenCV:
http://opencv.willowgarage.com/wiki/
Best regards.
Actually, Nuke from The Foundry has a decent start on the future of creating computer models from images.
Basically, it takes a high-contrast point and tracks it through successive frames. Given hundreds or thousands of tracked points, the next step is to calculate the perspective change between them.
Say two points are a known pixel distance apart at time zero, and some time later they are a different distance apart. This change in distance could mean a bad tracking point; but assuming both points are tracking perfectly, it could instead be caused by the camera moving laterally or rotationally. In real space, a point further away from you shifts differently in perspective than a closer point. This perspective change is a mathematical certainty.
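To make that concrete, here is a toy C# illustration of the relationship, under the simplifying assumptions of a pinhole camera and purely lateral camera motion (all numbers made up): a point's pixel shift between frames is proportional to focal length times camera translation divided by depth, so nearer points shift more.

    // Toy parallax illustration: under pure lateral translation,
    // pixelShift = focalLengthPx * cameraShiftMeters / depthMeters,
    // so depth can be recovered from the observed shift.
    public static class ParallaxDemo
    {
        public static double DepthFromShift(double focalLengthPx, double cameraShiftMeters, double pixelShift)
            => focalLengthPx * cameraShiftMeters / pixelShift;

        public static void Main()
        {
            double f = 1000.0; // focal length in pixels
            double t = 0.10;   // camera moved 10 cm sideways
            // A near point shifts 50 px between frames, a far point only 10 px:
            System.Console.WriteLine(DepthFromShift(f, t, 50.0)); // 2 (meters)
            System.Console.WriteLine(DepthFromShift(f, t, 10.0)); // 10 (meters)
        }
    }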
Initially, such tracking is typically used to re-film a piece of footage to stabilize it. But the analysis the software performs can be saved; the result is often called a point cloud. A cluster of many close points that track very closely together usually means those points are part of one surface, so a model can be built.
But, my friend, we don't yet have the speed and software to do that perfectly. Otherwise all the CG artists out there would have nothing left to model in Maya except fantasy monsters and spaceships that don't exist yet...

Does a free API for an Augmented Reality service exist?

Currently I am trying to create an app for iPhone which is capable of recognizing the objects in an image, such as a car, bus, building, bridge, human, etc., and labeling them with the object name, with the help of the Internet.
Is there any free service which provides a solution to this problem? Object recognition is itself a complex task, requiring digital image processing, neural networks and so on.
Can this be done via an API?
If you want to recognise planar images, the current generation of mobile AR SDKs from Metaio, Qualcomm and Layar will allow you to upload images to match against, and will perform the matching.
If you want to match freely against a set of 3D objects, e.g. a Toyota Prius or the Empire State Building, the same techniques might be applied to sets of images taken at different rotations. However, you might have to restrict yourself to matching a single object, due to limits on how large an image database you can have with the service, or contact those companies for a custom solution; and it may not work very reliably, given that the state of the art is reliable matching against planar images.
If you want to recognize general classes (human, car, building), this is a very difficult problem, and I don't know of any solution fast enough to operate online (which I assume is a requirement, given you want an AR solution; is that a fair assumption?). It's been a few years since I studied computer vision, but at that time the most promising approach to visual classification was the "bag of visual words"; you might try reading up on it.
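For a feel of what "bag of visual words" means in practice, here is a small illustrative sketch of the quantization step: each local descriptor (e.g. from SIFT) is assigned to its nearest codebook word, and the resulting histogram is what a classifier consumes. The descriptors and codebook are assumed to come from elsewhere:

    // Illustrative sketch: build a bag-of-visual-words histogram by assigning
    // each descriptor to its nearest codebook entry (squared Euclidean distance).
    public static class BagOfWords
    {
        public static int[] Histogram(double[][] descriptors, double[][] codebook)
        {
            var hist = new int[codebook.Length];
            foreach (var d in descriptors)
            {
                int best = 0;
                double bestDist = double.MaxValue;
                for (int w = 0; w < codebook.Length; w++)
                {
                    double dist = 0;
                    for (int i = 0; i < d.Length; i++)
                    {
                        double diff = d[i] - codebook[w][i];
                        dist += diff * diff;
                    }
                    if (dist < bestDist) { bestDist = dist; best = w; }
                }
                hist[best]++; // one more occurrence of visual word `best`
            }
            return hist;
        }
    }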
Take a look at Cortexica. Very useful for this sort of thing.
http://www.cortexica.com/
I haven't done work with mobile AR in a while, but the last time I was working on this stuff I was using Layar and starting to investigate Junaio. Those are oriented toward 3D graphics, not simply text labels, so for your use case you may be better served with OpenCV.
Note that Layar (and I believe Junaio too) works like a web app, where you put the content on your own server and give Layar the URL to link to.

OpenGL render state management

I'm currently working on a small iPhone game, and am porting the 3d engine I've started to develop for the Mac to the iPhone. This is all going very well, and all functionality of the Mac engine is now present on the iPhone. The engine was by no means finished, but now at least I have basic resource management, a scene graph and a construction to easily animate and move objects around.
A screenshot of what I have now: http://emle.nl/forumpics/site/planes_grid.png. The little plane is a test object I've made several years ago for a game I was making then. It's not related to the game I'm developing now, but the 3d engine and its facilities are, of course.
Now I've come to the topic of materials: the description of which textures, lights, etc. belong to a renderable object. This means a lot of OpenGL client state and glEnable/glDisable calls for every object. What way would you suggest to minimise these state changes?
Currently I'm sorting by material, since objects with the same material don't need any changes at all. I've created a class called RenderState that caches the current OpenGL state and only applies the members that differ when a different material is selected. Is this a workable solution, or will it grow beyond control as the engine matures and more and more state needs to be cached?
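For reference, the caching idea described above might look roughly like the following; this sketch uses the OpenTK C# binding's GL.Enable/GL.Disable purely for concreteness, and the same pattern applies to raw GLES calls:

    // Sketch of a render-state cache: every capability toggle goes through
    // the cache, which only issues the GL call when the value actually changes.
    using System.Collections.Generic;
    using OpenTK.Graphics.OpenGL;

    public class RenderStateCache
    {
        private readonly Dictionary<EnableCap, bool> _caps = new Dictionary<EnableCap, bool>();

        public void SetCap(EnableCap cap, bool enabled)
        {
            // Skip the driver call if the cached state already matches.
            if (_caps.TryGetValue(cap, out bool current) && current == enabled)
                return;
            if (enabled) GL.Enable(cap); else GL.Disable(cap);
            _caps[cap] = enabled;
        }
    }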
A bit of advice: just write the code you need for your game. Don't spend time writing a generalised rendering engine, because it's more than likely you won't need it. If you end up writing another game, extract the useful bits into an engine at that point. This will be way quicker.
If the number of states in OpenGL ES is as high as in the standard version, it will become difficult to manage at some point.
Also, if you really want to minimize state changes, you may need some kind of state-sorting concept, so that drawables with similar states are rendered together without needing a lot of glEnable/glDisable calls between them. However, this can be difficult to manage even on PC hardware (imagine state-sorting thousands of drawables), and blindly setting the state might actually be cheaper, depending on the OpenGL implementation.
For a comparison, here's the approach taken by OpenSceneGraph:
Basically, every node in the scene graph has its own stateset which stores the material properties, states, etc. The nice thing is that statesets can be shared by multiple nodes. This way, the rendering backend can just sort the drawables with respect to their stateset pointers (not the contents of the statesets!) and render nodes with the same stateset together. This offers a nice trade-off, since the backend is not bothered with managing individual OpenGL states, yet can achieve nearly minimal state changes if the scene graph is built accordingly.
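A rough sketch of that idea with hypothetical StateSet/Drawable types: drawables are sorted by the identity of their shared state object (via a per-instance ID), so state is applied only at group boundaries:

    // Illustrative sketch: sort by stateset identity, not contents, then
    // apply state only when crossing into a different stateset group.
    using System;
    using System.Collections.Generic;

    public class StateSet
    {
        private static int _next;
        public readonly int Id = _next++; // stable identity for sorting
        // material properties, textures, GL flags ...
    }

    public class Drawable
    {
        public StateSet State;            // shared by many drawables
        public void Draw() { /* issue geometry */ }
    }

    public static class Renderer
    {
        public static void Render(List<Drawable> drawables, Action<StateSet> applyState)
        {
            drawables.Sort((a, b) => a.State.Id.CompareTo(b.State.Id));

            StateSet current = null;
            foreach (var d in drawables)
            {
                if (!ReferenceEquals(d.State, current))
                {
                    applyState(d.State); // only at actual state boundaries
                    current = d.State;
                }
                d.Draw();
            }
        }
    }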
What I suggest in your case is that you do a lot of testing before settling on a solution. Whatever you do, I'm sure you will need some kind of abstraction over OpenGL state.