What AR libraries would be useful in order to develop an iOS app for showing information about certain objects? - iphone

I am starting an AR project for a client which involves using AR in order to show information about certain objects. In this project, for example, the user would point the camera at a car. Depending on which part of the car the user is looking at (headlights, windshield) a button would appear. When the user presses that button, an information window would appear on screen, giving the user more information about that certain car part.
The client doesn't wish to place physical markers on the car (QR code / patterns), and so the car parts would have to be detected another way.
I have developed AR apps before, but based on user location and generated markers in the sky. I feel this system wouldn't be entirely relevant for the client's request.
Would anybody be able to point me in the right direction (iOS library) for this sort of project, and whether or not it would be entirely feasible.
Thanks for the input,
Andy.

What you need is a model-based tracker/6DOF object tracker. As you want to track a car, it will certainly be featureless (or you will only get sparse features), so you should look at textureless non planar 3D (object) tracking solutions.
It's pretty much state of the art right now (lot of research, few products/SDK), but using library like OpenCV and with the appropriate literature (see below) you should be able to develop one. You can look at an open-source solution like the ViSP library which has a module for model based tracker but not an official iOS port. for commercial libraries, closest will be AR libraries supporting SLAM or "3D object tracking".
In term of techniques, you have different way to handle this problem, some pointers:
You can use a model-based tracker relying on edge detection + initial CAD model of the object: 3D Textureless Object Detection and Tracking: An Edge-based Approach or Harald Wuest, Folker Wientapper, Didier Stricker Adaptable Model-based Tracking Using Analysis-by-Synthesis Techniques
The 12th International Computer Analysis of Images and Patterns (CAIP), 27-29th August 2007, Vienna, Austria.
You can use a model-based tracker relying on edge detection + (trained) template images
You can use some SLAM techniques combined with a model based tracker.
M. Tamaazousti, V. Gay-Bellile, S. Naudet Collette, S. Bourgeois, M. Dhome Real-Time Accurate Localization in a Partially Known Environment:Application to Augmented Reality on textureless 3D Objects. TrakMark 2011, Basel, Switzerland 26-29/10/2011
if your system will only run indoor, you can look at some RGBD tracker
S. Hinterstoisser, V. Lepetit, S. Ilic, S. Holzer, G. Bradski, K. Konolige, N. Navab
Model Based Training, Detection and Pose Estimation of Texture-Less 3D Objects in Heavily Cluttered Scenes Asian Conference on Computer Vision (ACCV), Korea, Daejeon, November 2012
(access to the software)

It seems you are heading for an interesting topic. However, my concern is the accuracy of what you are trying to do. Location-based AR would be a starting point for your research work. Still the granularity would be less to your problem domain. Since you have worked on the location-based AR application, you might have noticed the accuracy that you can expect would be maximum upto 3 meters. Therefore, that level of accuracy cannot address your problem domain in an advanced way.
However, I have seen prototypes that addresses your problem domain. One good example would be the BMW Augmented Reality Manual. Check this link http://www.youtube.com/watch?v=P9KPJlA5yds
Hence, I never came across a proper Augmented Reality library for iOS or even Android which can address your problem domain in the marker-less AR context.
The information above is only for your knowledge, but not to discourage you in any way.

Related

Generate a real time 3D (mesh) model in Unity using Kinect

I'm currently developing an application with the initial goal of obtaining, in real time, a 3D model of the environment "seen" by a Kinect device. This information would be later on used for projection mapping but that's not an issue, for the moment.
There are a couple of challenges to overcome, namely the fact that the Kinect will be mounted on a mobile platform (robot) and the model generation has to be in real-time (or close to it).
After a long research on this topic, I came up with several possible (?) architectures:
1) Use the depth data obtained from Kinect, convert it into a point cloud (using PCL for this step), then a Mesh and then export it into Unity for further work.
2) Use the depth data obtained from Kinect, convert it into a point cloud (using PCL for this step), export it into Unity and then convert it into a Mesh.
3) Use KinectFusion that already the option of creating a Mesh model, and (somehow) automatically load the Mesh model created into Unity.
4) Use OpenNI+ZDK (+ wrapper) to obtain the depth map and generate the Mesh using Unity.
Quite honestly, I'm kinda lost here, my main issue is the real-time requirement along with being forced to integrate several software components makes this it tricky problem. I don't know which if any of these solutions are viable and the information/tutorials on these issues isn't exactly abundant like the one, for example, for Skeleton tracking .
Any sort of help would be greatly appreciated.
Regards,
Nuno
Sorry, I might not be providing a solution for realtime mesh creation within Unity - but the process discussion here, was interesting enough for me to reply.
In the hard science novel Memories with Maya - there is discussion of exactly such a scenario:
"“Point taken,” he said. “So… Satish showed me a demo of the Quad [Quad=Drone] acquiring real-time depth and texture maps.”
“Nothing new in that,” I said.
“Yeah, but look above us.”
I tilted my head up. The crude shape of the Quad came into view.
“The Quad is here, but you can't see it because the FishEye [Fisheye=Kinect 2] is on it aimed straight ahead.”
“So it's mapping video texture over live geometry? Cool,” I said.
“Yeah, the breakthrough is I can freeze a frame… freeze real life as it were, step out of the scene and study it.”
“All you do is block out the live world with the cross polarizers?”
“Yeah,” he said. “It's a big deal for AYREE to be able to use such data-sets.”
“The resolution has improved,” I said.
“Good observation,” he said. “So has the range sensing. The lens optics have also been upgraded.”
“I noticed that if I turn around I don't see the live feed, just the empty street,” I said.
“Yes, of course,” he replied. “The Quad is facing the other way around. It's why I'm standing in front of you. The whole street, however, is a 3D model done by a standard laser scan taken from the top of that high tower.”
Krish pointed to a building block at the far end of the street. I turned back to the live 3D view again. He walked in front of me.
“This is uber cool. Everyone looks so real.”
“Haha. You should see how cool it is when you're here in person with the Wizer on,” he said. “I'm here watching these real people pass by, only they have a mesh of themselves mapped onto them.”
“Ahhh! Yes.”
“Yeah, it's like they have living paint on them. I feel like reaching out and touching, just to feel the texture.”...
The work that you're thinking of doing in this area, and this use of a live mesh goes far beyond Projection Mapping for events- for sure!
Wishing you the best on the project, and I will be following your updates.
Some of the science behind the story is on www.dirrogate.com if the topic interests you.
Kind Regards.
I would use Kinect Fusion, as it has a sample with the ability to export to .obj, which unity supports. You can automatically save it, and import it to unity to generate a mesh automatically. Especially if you have multiple Kinects, then Microsoft even has a sample to show the basics of Kinect Fusion with multiple Kinects. Also, since Fusion is already pre-written, there is not much code you will have to write.
Here is an example of a mesh from Fusion with one camera:
I do want you to notice how many vertices there are though... This could cause performance problems later on.
Good luck!

Use core image for object detection

I want to detect items in an image (like core image for a face), but the items aren't faces.
The image What can I use to do so?
I have an image with a few items, a car, a person a tree and mailbox. I want to cut the image around each item and create a subimage of each . Now i would have 1 image with a car, 1 with a person, 1 with a mailbox. There may be overlap of other objects, but the predominant feature in each would be the main object.
Thanks
This is a surprisingly complicated topic of ongoing research in the field of Computer Vision. There are many good academic papers written on the topic (heres a nice video) and no publicly available turnkey solutions.
I dont think core image currently supports this kind of functionality nor will it in the near future.
However your best bet is to start by checking out the now well established OpenCV library maintained by Willow Garage for all major operating systems (including iOS and Android). The following link might help you towards what you are looking for:
OpenCV object detection tutorials
Alternatively you could try out augmented reality toolkits designed specifically for tracking known targets. Some good examples are:
Metaio,
Vuforia,
ARLab,
String,
Junaio
EDIT, Nov 2016
Although CoreImage still does not support this, it is somewhat more likely that it may support it in the future. Recent years have seen a dramatic increase in the availability of object detection frameworks that use deep networks to perform object classification and localization.
A good first place to start would be to look at projects that use TensorFlow for Android and iOS.
One such link.
EDIT, Dec 2017
This is now fairly standard across all major mobile and desktop computing platforms (amazing how much changes in only 1 year). Specifically for Apple you can look at CoreML

Modeling a Physical Place inside iPhone Application

I need to find a way to model a physical place inside an iPhone application. For example, I want to be able to take images for a restaurant and then use some tools or programming API to model this resturant as a 3d place and make the user able to navigate and explore the place and rooms.
I have thought about HTML 5 inside a web view but I don't think the WebGL is compatible with iPhone Web View (Safari Engine).
Can you please recommend a method, API, Commercial Library or anything to help me achieve this task?
First, you need to be able to display 3D models for IPhone. One of the most popular 3D engine is Unity3D:
http://unity3d.com/
It is extremely easy to start playing with Unity3D. You even have a free license with limited features:
http://unity3d.com/unity/licenses
Then, you now need to reconstruct a 3D model from pictures. This is not a trivial problem so it is better if you know some computer vision. You can try to play with OpenCV:
http://opencv.willowgarage.com/wiki/
Best regards.
Actually Nuke from the Foundry has a decent start at the future of creating computer models from images.
Basically it takes a high contrast point and tracks it through successive moments. Given hundreds and thousands of tracked points, the next step is to calculate the perspective change between points.
Say two points are a known pixel distance apart at time zero and a certain time period later they are a different distance apart. This change in difference could be a bad tracking point. But assuming that the two points are perfectly tracking, then the distance change could be caused by a camera motion laterally or rotationally. And in real space a point further away from you will have a different perspective then a closer point . This perspective change is a mathematical certainty.
Initially the tracking is typically used to refilm a piece of film to stabilize it. But the process the software uses to analyze the film can be saved , it is often called a point cloud. connection of many close points that track very closely usually are because the points are parts of a surface, so a model can be built.
But my friend, we are barbarians to the speed and software that can do that perfectly yet. Or all the CG Artists out there would not have anything to model in Maya except fantasy monsters and space ships that don't exist yet....

Does a free API for a Augmented reality service exist?

Currently I am trying to create an app for iPhone which is capable of recognizing the objects on an image such as car, bus, building, bridge, human, etc, and label as object name with the help of Internet.
Is there any free service which provide solution to my problem, as object recognition its self a complex algorithm requiring digital image processing, neural networks and all.
Can this can be done via API?
If you want to recognise planar images the current generation of mobile AR SDKs from Metaio, Qualcomm and Layar will allow you to upload images to match against, and perform the matching.
If you want to match freely against a set of 3D objects, e.g. a Toyota Prius or the Empire state, the same techniques might be applied to match against sets of images taken at different rotations, but you might have to choose to match just one object due to limitations on how large an image database you can have with the service, or contact those companies for a custom solution, and it may not work very reliably given the state of the art is to reliably match against planar images.
If you want to recognize general classes (human, car, building), this is a very difficult problem, and I don't know of any solutions anywhere fast enough to operate online (which I assume is a requirement given you want an AR solution - is that a fair assumption?). It's been a few years since I studied CV, but at that time the most promising solution for visual classification was "bag of visual words" approaches - you might try reading up on those.
Take a look at Cortexica. Very useful for this sort of thing.
http://www.cortexica.com/
I haven't done work with mobile AR in a while, but the last time I was working on this stuff I was using Layar and starting to investigate Junaio. Those are oriented toward 3D graphics, not simply text labels, so for your use case you may be better served with OpenCV.
Note that Layar (and I believe Junaio too) works like a web app, where you put the content on your own server and give Layar the URL to link to.

Water simulation with a grid

For a while I've been attempting to simulate flowing water with algorithms I've scavenged from "Real-Time Fluid Dynamics for Games". The trouble is I don't seem to get out water-like behavior with those algorithms.
Myself I guess I'm doing something wrong and that those algorithms aren't all suitable for water-like fluids.
What am I doing wrong with these algorithms? Are these algorithms correct at all?
I have the associated project in bitbucket repository. (requires gletools and newest pyglet to run)
Voxel-based solutions are fine for simulating liquids, and are frequently used in film.
Ron Fedkiw's website gives some academic examples - all of the ones there are based on a grid. That code underpins many of the simulations used by Pixar and ILM.
A good source is also Robert Bridson's Fluid Simulation course notes from SIGGRAPH and his website. He has a book "Fluid Simulation for Computer Graphics" that goes through developing a liquid simulator in detail.
The most specific answer I can give to your question is that Stam's real-time fluids for games is focused on smoke, ie. where there isn't a boundary between the fluid (water), and an external air region. Basically smoke and liquids use the same underlying mechanism, but for liquid you also need to track the position of the liquid surface, and apply appropriate boundary conditions on the surface.
Cem Yuksel presented a fantastic talk about his Wave Particles at SIGGRAPH 2007. They give a very realistic effect for quite a low cost. He was even able to simulate interaction with rigid bodies like boxes and boats. Another interesting aspect is that the boat motion isn't scripted, it's simulated via the propeller's interaction with the fluid.
(source: cemyuksel.com)
At the conference he said he was planning to release the source code, but I haven't seen anything yet. His website contains the full paper and the videos he showed at the conference.
Edit: Just saw your comment about wanting to simulate flowing liquids rather than rippling pools. This wouldn't be suitable for that, but I'll leave it here in case someone else finds it useful.
What type of water are you trying to simulate? Pools of water that ripple, or flowing liquids?
I don't think I've ever seen flowing water ever, except in rendered movies. Rippling water is fairly easy to do, this site usually crops up in this type of question.
Yeah, this type of voxel based solution only really work if your liquid is confined to very discrete and static boundaries.
For simulating flowing liquid, do some investigation into particles. Quite alot of progress has been made recently accelerating them on the GPU, and you can get some stunning results.
Take a look at, http://nzone.com/object/nzone_cascades_home.html as a great example of what can be achieved.