Are there any limitations in Vuforia compared to ARCore and ARKit? - unity3d

I am a beginner in the field of augmented reality, working on applications that create plans of buildings (floor plans, room plans, etc., with accurate measurements) using a smartphone. So I am researching the best AR SDK for this. There are not many articles pitting Vuforia against ARCore and ARKit.
Please suggest the best SDK to use, with the pros and cons of each.

Updated: December 15, 2022.
TL;DR
Foreword
Before answering your question, I would like to point out that any AR framework benefits greatly from having many types of anchors. An abundance of anchor types allows you not only to securely tether 3D models according to a certain scenario, but even to use a real human being as a starting point in calculations (I mean ARBodyAnchor), or an iOS device with a U1 chip (I mean a precise distance to the corresponding ARAnchorEntity). Another indisputable advantage of any framework is the availability of high-quality 32-bit depth data for Scene Reconstruction and occlusion. In fact, almost any new feature of an AR framework is a contribution to the quality of the AR experience.
What is what
Google ARCore allows you to build apps for Android and iOS. With Apple ARKit and RealityKit you can build apps for iOS. Good old PTC Vuforia was designed to create apps for Android, iOS and Universal Windows Platform.
A crucial peculiarity of Vuforia is that it uses ARCore/ARKit technologies (also known as platform hard'n'soft enablers) if the hardware it's running on supports them. Otherwise, Vuforia uses its own AR technology and engine, a software-only solution with no hardware dependencies. In practice, an AR experience created with the Vuforia Engine will attempt to use the most capable technology first and work its way down depending on what's available on the device at runtime.
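The fallback order described above can be pictured as a simple priority check. This is only an illustration in Python: the function name and the returned labels are hypothetical and are not part of the Vuforia SDK, which makes this decision internally.

```python
def pick_tracking_provider(platform, has_arkit=False, has_arcore=False):
    """Hypothetical sketch: prefer the platform enabler, else fall back to software tracking."""
    if platform == "ios" and has_arkit:
        return "ARKit"
    if platform == "android" and has_arcore:
        return "ARCore"
    # No platform enabler available: use the engine's own software-only tracking.
    return "Vuforia software tracking"


for device in [("ios", True, False), ("android", False, True), ("android", False, False)]:
    print(device, "->", pick_tracking_provider(*device))
```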
When developing for Android OEM smartphones, you may encounter an unpleasant issue: devices from different manufacturers need sensor calibration in order to deliver the same AR experience. Luckily, Apple gadgets have no such drawback because all the sensors used there are calibrated under identical conditions.
Let me put first things first.
Google ARCore 1.35
ARCore was released in March 2018. ARCore is based on three fundamental concepts: Motion Tracking, Environmental Understanding and Light Estimation. ARCore allows a supported mobile device to track its position and orientation relative to the world in 6 degrees of freedom (6DoF) using a special technique called Concurrent Odometry and Mapping. COM helps us detect the size and location of horizontal, vertical and angled tracked surfaces. Motion Tracking works robustly thanks to optical data coming from an RGB camera at 60 fps, combined with inertial data coming from the gyroscope and accelerometer at 1000 fps, and depth data coming from a ToF sensor at 60 fps. ARKit, Vuforia and other AR libraries operate in almost the same way.


When you move your phone through the real environment, ARCore tracks the surrounding space to understand where the smartphone is relative to the world coordinates. At the tracking stage, ARCore "sows" so-called feature points. These feature points are visible through the RGB camera, and ARCore uses them to compute the phone's change in location. The visual data must then be combined with measurements from the IMU (Inertial Measurement Unit) to estimate the position and orientation of the ArCamera over time. If a phone isn't equipped with a ToF sensor, ARCore looks for clusters of feature points that appear to lie on horizontal, vertical or angled surfaces and makes these surfaces available to your app as planes (we call this technique Plane Detection). After detection you can use these planes to place 3D objects in your scene. Virtual geometry with assigned shaders is rendered by ARCore's companion, Sceneform, which uses Filament, a real-time Physically Based Rendering (PBR) engine.
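To give a feel for what "a cluster of feature points that appears to lie on a surface" means, here is a minimal Python sketch that fits a plane to a handful of points by least squares. It is not ARCore's actual algorithm (which isn't public in this form). The model y = a*x + b*z + c, the y-up convention and the tolerance are my own assumptions, and this particular model cannot represent vertical planes.

```python
def fit_plane(points):
    """Least-squares fit of y = a*x + b*z + c to (x, y, z) points, with y pointing up."""
    n = len(points)
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    szz = sum(p[2] * p[2] for p in points)
    sxz = sum(p[0] * p[2] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    szy = sum(p[2] * p[1] for p in points)

    # Normal equations m * (a, b, c) = r, solved with Cramer's rule.
    m = [[sxx, sxz, sx], [sxz, szz, sz], [sx, sz, n]]
    r = [sxy, szy, sy]

    def det3(q):
        return (q[0][0] * (q[1][1] * q[2][2] - q[1][2] * q[2][1])
                - q[0][1] * (q[1][0] * q[2][2] - q[1][2] * q[2][0])
                + q[0][2] * (q[1][0] * q[2][1] - q[1][1] * q[2][0]))

    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("points are degenerate (for example collinear)")

    def with_column(col):
        return [[r[i] if j == col else m[i][j] for j in range(3)] for i in range(3)]

    a, b, c = (det3(with_column(k)) / d for k in range(3))
    return a, b, c


def is_horizontal(a, b, tolerance=0.05):
    """A plane is treated as horizontal when its slopes along x and z are near zero."""
    return abs(a) < tolerance and abs(b) < tolerance


# Four feature points on a slightly tilted tabletop about 0.7 m above the origin.
table_points = [(0.0, 0.70, 0.0), (1.0, 0.71, 0.0), (0.0, 0.69, 1.0), (1.0, 0.70, 1.0)]
a, b, c = fit_plane(table_points)
print(is_horizontal(a, b), round(c, 2))
```

A real detector would also reject outliers and grow or merge planes over time; the sketch only shows the fitting step.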
That said, the Sceneform repository has been archived and is no longer actively maintained by Google. The last official release was Sceneform 1.17.1. That may sound strange, but an ARCore team member said "there's no direct replacement for Sceneform library and ARCore developers are free to use any 3D game library with Android AR apps". However, there's an unofficial Sceneform + SceneView fork that continues the archived Sceneform framework (its latest release is Sceneform 1.21).

ARCore's environmental understanding lets you place 3D objects with correct depth occlusion in a way that realistically integrates with the real world. For example, you can place a virtual cup of coffee on a table using Depth hit-testing and ArAnchors.
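Depth occlusion boils down to a per-pixel comparison: a virtual fragment is drawn only where it is closer to the camera than the real surface. Here is a minimal Python sketch of that test. The nested-list depth maps and the function name are assumptions for illustration; a real app does this in a shader on the depth texture.

```python
def visible_mask(real_depth_m, virtual_depth_m):
    """Per pixel: True where a virtual fragment exists and is nearer than the real surface."""
    return [
        [v is not None and v < r for r, v in zip(real_row, virtual_row)]
        for real_row, virtual_row in zip(real_depth_m, virtual_depth_m)
    ]


# Real scene depth in meters, and the depth of a virtual cup (None = no virtual content).
real = [[1.0, 1.0], [0.5, 2.0]]
virtual = [[0.8, None], [0.8, 0.8]]
print(visible_mask(real, virtual))
```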

ARCore can also define lighting parameters of a real environment and provide you with the average intensity and color correction of a given camera image. This data lets you light your virtual scene under the same conditions as the environment around you, considerably increasing the sense of realism.
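As a rough picture of what "average intensity and color correction" could mean, the Python sketch below averages the pixels of a camera image and derives a per-channel tint to apply to virtual content. This is my own simplified assumption, not ARCore's implementation, and the function name is hypothetical.

```python
def estimate_light(pixels):
    """pixels: list of (r, g, b) values in 0..1. Returns (average intensity, per-channel tint)."""
    if not pixels:
        raise ValueError("need at least one pixel")
    n = len(pixels)
    avg = [sum(p[ch] for p in pixels) / n for ch in range(3)]
    intensity = sum(avg) / 3.0
    if intensity == 0.0:
        # A completely black frame carries no color information.
        return 0.0, (1.0, 1.0, 1.0)
    # Tint > 1 means the scene is stronger in that channel than on average.
    tint = tuple(ch / intensity for ch in avg)
    return intensity, tint


# A reddish, fairly dim frame.
intensity, tint = estimate_light([(0.8, 0.4, 0.4), (0.4, 0.2, 0.2)])
print(round(intensity, 3), [round(t, 3) for t in tint])
```

Multiplying a virtual object's color by the tint and scaling by the intensity makes it pick up the same cast as the real scene.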

The current ARCore version offers such significant APIs as the ARCore API, Raw Depth API and Full Depth API, Geospatial API, Lighting Estimation, Terrain Anchor API, Augmented Faces, Augmented Images, Instant Placement, Debugging Tools, 365-day Cloud Anchors, Recording and Playback, and Multiplayer support. The main advantage of ARCore in Android Studio over ARKit in Xcode is the Android Emulator, which lets you run and debug AR apps on a virtual device.

In ARCore 1.31, the Google engineers mapped each shade of gray in the 16-bit depth channel to a distance of 1 mm. Thus, they managed to cover a distance of 65,536 millimeters (2^16). This table presents the difference between Raw Depth API and Full Depth API:
Parameter | Full Depth API (v1.31+) | Raw Depth API (v1.24+) | Full Depth API (v1.18+)
Accuracy | Bad | Good | Bad
Coverage | All pixels | Not all pixels | All pixels
Distance | 0 to 65.5 m | 0.5 to 5.0 m | 0 to 8.2 m
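The distance figures follow from simple arithmetic on the 1 mm per step mapping. Here is a small Python sketch; the function names are mine, and reading the 8.2 m limit as 13 usable bits is my assumption, since it matches 2^13 mm.

```python
def depth16_to_meters(raw_value):
    """Convert one 16-bit depth sample (1 unit = 1 mm) to meters."""
    if not 0 <= raw_value <= 0xFFFF:
        raise ValueError("expected an unsigned 16-bit value")
    return raw_value / 1000.0


def max_range_m(bits):
    """Largest distance representable with the given number of bits at 1 mm per step."""
    return (2 ** bits - 1) / 1000.0


print(depth16_to_meters(1500))      # a sample 1.5 m away
print(round(max_range_m(16), 1))    # full 16-bit range
print(round(max_range_m(13), 1))    # 13-bit range (assumed for the older Full Depth API)
```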
In a sense, ARCore is older than ARKit. Do you remember Project Tango, released in 2014? Roughly speaking, ARCore is just a rewritten Tango SDK. But the wise acquisition of FlyBy Media, Faceshift, MetaIO, Camerai and Vrvana helped Apple not only to catch up but to significantly overtake Google. I suppose that's good for the AR industry.
The latest version of ARCore supports OpenGL ES acceleration, and integrates with Unity, Unreal, and Web applications. At the moment the most powerful and energy efficient chipsets for AR experience on Android platform are MediaTek Dimensity 9000 (4 nm), Snapdragon 8 Gen 1 (4 nm), Exynos 2200 (4 nm) and Google Tensor G2 (5 nm).
ARCore price: FREE.
ARCore pros | ARCore cons
iToF and Depth API support | Poor Google Glass API
Quick Plane Detection | Cloud Anchors hosted online
Long-distance-accuracy | Lack of native rendering engines
ARCore Emulator in Android Studio | Poor developer documentation
High-quality Lighting API | No external camera support
Geospatial anchoring | Quickly drains phone's battery
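Here's an ARCore code snippet written in Kotlin:
import com.google.ar.core.Anchor
import com.google.ar.sceneform.AnchorNode
import com.google.ar.sceneform.math.Vector3
import com.google.ar.sceneform.rendering.Renderable
import com.google.ar.sceneform.ux.ArFragment
import com.google.ar.sceneform.ux.TransformableNode

private fun addNodeToScene(fragment: ArFragment,
                           anchor: Anchor,
                           renderable: Renderable) {
    // Attach a node to the ARCore anchor and add it to the scene
    val anchorNode = AnchorNode(anchor)
    anchorNode.setParent(fragment.arSceneView.scene)
    // A TransformableNode can be moved, rotated and scaled with gestures
    val modelNode = TransformableNode(fragment.transformationSystem)
    modelNode.setParent(anchorNode)
    modelNode.setRenderable(renderable)
    modelNode.localPosition = Vector3(0.0f, 0.0f, -3.0f)
    fragment.arSceneView.scene.addChild(anchorNode)
    modelNode.select()
}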
Platform-specific directions: Android (Kotlin/Java), Android NDK (C) and Unity (AR Foundation).
Apple ARKit 6.0
ARKit was released in June 2017. Like its competitors, ARKit also uses a special technique for tracking, called Visual Inertial Odometry. VIO is used to track the world around your device very accurately, and it is quite similar to COM found in ARCore. ARKit also has three similar fundamental concepts: World Tracking, Scene Understanding (which includes four stages: Plane Detection, Ray-Casting, Light Estimation, Scene Reconstruction), and Rendering. Rendering relies on ARKit's companions: the SceneKit framework, which has been Apple's 3D game engine since 2012; the RealityKit framework, made specially for AR and written in Swift from scratch (released in 2019); and the SpriteKit framework with its 2D engine (since 2013).
VIO fuses RGB sensor data at 60 fps with Core Motion data (IMU) at 1000 fps and LiDAR data. It should also be noted that, due to a very high energy impact (an enormous burden on the CPU and GPU), your iPhone's battery will drain pretty quickly. The same can be said about Android devices.
ARKit has a handful of useful approaches for robust tracking and accurate measurements. Among its arsenal you can find easy-to-use functionality for saving and retrieving ARWorldMaps. A world map is an indispensable "portal" for Persistent and Multiuser AR experiences: it allows you to come back to the same environment, filled with the same 3D content you chose just before your app became inactive. Support for simultaneous front and back camera capture and for collaborative sessions is also great.
There is good news for gamers: up to 6 people can play the same AR game simultaneously, thanks to the MultipeerConnectivity framework. For 3D geometry you can use the brand-new USDZ file format, developed and supported by Pixar. USDZ is a good choice for sophisticated 3D models with multiple PBR shaders, physics, animations and spatial sound. You can also use the following 3D formats for ARKit.
ARKit can also help you perform People and Objects Occlusion (based on segmentation of the alpha and depth channels), LiDAR Scene Reconstruction, Body Motion Capture tracking, Vertical and Horizontal Plane detection, Image detection, 3D Object detection, 3D Object scanning, 4K HDR video capture and RoomPlan scanning powered by ARKit. With the People and Objects Occlusion tool, your AR content realistically passes behind and in front of real-world entities, making AR experiences even more immersive. Realistic reflections that use machine learning algorithms, and Face tracking that lets you track up to 3 faces at a time, are also available.

Using ARKit and iBeacons, you can help an iBeacon-aware application know what room it's in and show the right 3D content chosen for that room. When working with ARKit you should make intensive use of the ARAnchor class and all its subclasses.
To create ARKit 6.0 apps you need macOS Ventura, Xcode 14+ and a device running iOS 16. ARKit is a worthy candidate to marry the Metal framework for GPU acceleration. Don't forget that ARKit tightly integrates with Unity and Unreal. At the moment, the most powerful and energy-efficient chipsets for AR experiences are Apple M2 (5 nm) and A16 Bionic (4 nm).
ARKit price: FREE.
ARKit pros | ARKit cons
LiDAR and Depth API support | No support for AR glasses
Stable 6 DoF World Tracking | No auto-update for ARAnchors
Collaborative Sessions | OS and Chipsets' Restrictions
WorldMaps and iBeacon-awareness | No ARKit Simulator in Xcode
4 rendering technologies | No external camera support
Rich developer documentation | Quickly drains phone's battery
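Here's an ARKit code snippet written in Swift:
import ARKit
import SceneKit

// ARSCNViewDelegate callback: called when ARKit adds a node for a new anchor
func renderer(_ renderer: SCNSceneRenderer,
             didAdd node: SCNNode,
              for anchor: ARAnchor) {
    guard let planeAnchor = anchor as? ARPlaneAnchor else { return }
    let planeNode = tableTop(planeAnchor)
    node.addChildNode(planeNode)
}

// Builds a flat node that matches the extent of the detected plane
func tableTop(_ anchor: ARPlaneAnchor) -> SCNNode {
    let x = CGFloat(anchor.extent.x)
    let z = CGFloat(anchor.extent.z)
    let tableNode = SCNNode()
    tableNode.geometry = SCNPlane(width: x, height: z)
    tableNode.position = SCNVector3(anchor.center.x, 0, anchor.center.z)
    // SCNPlane stands upright by default, so rotate it to lie in the anchor's XZ plane
    tableNode.eulerAngles.x = -.pi / 2
    return tableNode
}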
Apple RealityKit 2.0
RealityKit (and its twin brother RealityFoundation) was introduced at WWDC 2019. It is a high-level framework for iOS and macOS apps. It supports the up-to-date Entity-Component-System paradigm, which lets you implement non-AR and AR experiences more efficiently. Frankly speaking, there is no need to list all the features of RealityKit here, as you can read about them in this SO post.
RealityKit is based on two fundamental entities (nodes): a ModelEntity object, which depends on MeshResource and Materials, and an AnchorEntity object, which depends on Transform and can automatically track its target (unlike ARAnchor in ARKit). In the image below, you'll see which components such entities as Model, Camera, and Light contain.
RealityKit gives you a rich set of tools to work with AR/VR: 3D primitives, PBR materials, occlusion materials, video materials, lights with realistic ray-traced shadows, spatial audio processing, 10 different types of anchors, simplified setup for collaborative sessions, robust physics setup, indispensable built-in ML algorithms and many other features. RealityKit supports the Object Reconstruction API and Scene Reconstruction.
When working with a LiDAR scanner, the main feature of RealityKit and ARKit is the optional .sceneDepth property of type ARDepthData inside each AR frame, which provides you with high-quality depth data and a corresponding confidence map.
let arView = ARView(frame: .zero) // RealityKit View
arView.session.currentFrame?.sceneDepth?.depthMap // 32-bit
arView.session.currentFrame?.sceneDepth?.confidenceMap // 8-bit
Let's see what Apple documentation says about .confidenceMap:
The natural light of the physical environment affects the .depthMap property such that ARKit is less confident about the accuracy of the LiDAR Scanner’s depth measurements for surfaces that are highly reflective, or that have high light absorption. This property measures the accuracy of the scene depth-data by containing an ARConfidenceLevel raw-value for every component in .depthMap.
unowned(unsafe) var confidenceMap: CVPixelBuffer? { get }
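In practice you usually combine the two buffers: keep only the depth samples whose confidence is good enough. The idea is sketched below in Python on plain nested lists (in a real app you would read the two CVPixelBuffers instead). The function name and the list representation are my assumptions; the 0/1/2 values mirror the raw values of ARConfidenceLevel (low, medium, high).

```python
# Raw values of ARConfidenceLevel in ARKit: low = 0, medium = 1, high = 2.
LOW, MEDIUM, HIGH = 0, 1, 2


def filter_depth(depth_map, confidence_map, min_confidence=MEDIUM):
    """Keep depth samples (meters) whose confidence is at least min_confidence, else None."""
    return [
        [d if c >= min_confidence else None for d, c in zip(depth_row, conf_row)]
        for depth_row, conf_row in zip(depth_map, confidence_map)
    ]


depth = [[1.2, 3.4], [0.9, 2.0]]
confidence = [[HIGH, LOW], [MEDIUM, HIGH]]
print(filter_depth(depth, confidence, min_confidence=HIGH))
```

For measurements I would start with the high-confidence samples only and relax the threshold if coverage gets too sparse.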
RealityKit price: FREE.
RealityKit pros | RealityKit cons
Can create AR apps without ARKit | Intensive usage of CPU/GPU
Pixar USD Hydra support | iOS 13+ and macOS 10.15+ only
Suitable for AR/VR projects | Starts lagging on old devices
Robust API for Reality Composer scenes | There's no particle system
Asynchronous asset loading | Lack of Apple documentation
Autoupdating tracking target | No AR glasses support
Here's a RealityKit code snippet written in Swift:
import RealityKit
import UIKit

// Inside a UIViewController subclass that owns an ARView named arView
override func viewDidLoad() {
    super.viewDidLoad()
    // SomeText is the class Xcode generates from a Reality Composer file
    let textAnchor = try! SomeText.loadTextScene()
    let entity: Entity = textAnchor.realityComposer!.children[0]
    var textMC: ModelComponent = entity.children[0].components[ModelComponent.self]!
    // Replace the material and the mesh of the text model
    var material = SimpleMaterial()
    material.baseColor = .color(.yellow)
    textMC.materials[0] = material
    textMC.mesh = .generateText("Hello, RealityKit")
    textAnchor.realityComposer!.children[0].children[0].components.set(textMC)
    arView.scene.anchors.append(textAnchor)
}
Pay attention to RealityKit's satellite, the Reality Composer app that's a part of Xcode. Its simple and intuitive UI is good for prototyping AR scenes. Scenes built in Reality Composer can be packed with dynamics, simple animations and PBR shaders. Reality Composer has a royalty-free library with downloadable 3D assets. You can export your composition as a lightweight .reality file for an AR Quick Look experience. In Reality Composer, you start every project from one of five anchor types (horizontal, vertical, image, face and object), each corresponding to a desired type of tracking.
Another important part of Apple's AR ecosystem is the Reality Converter app. Now, instead of using a command-line conversion tool, you can use Reality Converter's UI. The app makes it easy to view, customize and convert .usdz 3D objects on a Mac. Simply drag and drop supported file formats (such as .obj, .gltf or .usd) and UV-mapped textures. You can even preview your .usdz model under a variety of lighting conditions with the built-in Image-Based Lighting (IBL) option.
PTC Vuforia 10.12
In October 2015 PTC acquired Vuforia from Qualcomm for $65 million. Take into consideration that Qualcomm launched Vuforia in 2010, so Vuforia is the older sister in the AR family. Big sister is watching you, guys! ;)
In November 2016 Unity Technologies and PTC announced a strategic collaboration to simplify AR development. Since then they have worked together on integrating new features of the Vuforia AR platform into the Unity game engine. Vuforia can be used with such development environments as Unity, MS Visual Studio, Apple Xcode and Android Studio. It supports a wide range of smartphones, tablets and AR smart glasses, such as HoloLens and Magic Leap 2.
Vuforia Engine's Visual-Inertial Simultaneous Localization And Mapping, or VISLAM, is an algorithm that implements a markerless AR experience. VISLAM combines the benefits of Visual-Inertial Odometry (VIO) and Simultaneous Localization And Mapping (SLAM).
Vuforia Engine boasts roughly the same principal capabilities that you can find in the latest version of ARKit, but it also has its own tools, such as Model Targets with Deep Learning and External Camera support for iOS, new experimental APIs for ARCore, and support for the industry's latest AR glasses. The main advantage of Vuforia over ARKit and ARCore is that it has a wider list of supported devices and it supports the development of Universal Windows Platform apps for Intel-based Windows devices, including Microsoft Surface and HoloLens. Vuforia has a standalone version and a version baked directly into Unity.
Vuforia Fusion
Vuforia Fusion is a set of technologies designed to provide the best possible AR experience on a wide range of devices. It was designed to solve the problem of fragmentation in AR enabling technologies such as cameras, sensors, chipsets, and software frameworks. With Vuforia Fusion, your app will automatically provide the best experience possible with no extra work required on your end.
Vuforia Fusion has the following functionality:
Advanced Model Targets 360 | recognition powered by AI.
Model Targets with ML | allow you to instantly recognize objects by shape.
Barcode Scanner | an API for reading QR codes and barcodes.
Model Target Runtime 3D Guide Views | for creating guide views in Unity at runtime.
Model Target Web API | generates Model Targets using the Web API.
Image Targets | the easiest way to put AR content on flat objects.
Multi Targets | for objects with flat surfaces and multiple sides.
Cylinder Targets | for placing AR content on objects with cylindrical shapes.
Ground Plane | enables content to be placed on floors and tabletop surfaces.
VuMarks | allow you to identify and add content to series of objects.
Object Targets | for scanning an object.
Static and Adaptive Modes | for stationary and moving objects.
Simulation Play Mode | allows you to "walk through" or around the 3D model.
Vuforia Area Target Creator | enables you to scan and generate new Area Targets.
AR Session Recorder | can record AR experiences on location.
And, of course, Vuforia Engine Area Targets.
Vuforia Engine Area Targets enable developers to use an entire space, be it a factory floor or a retail store, as an AR target. Using a supported device, like the Matterport Pro2 3D camera, developers can create a detailed 3D scan of a desired location. Locations are recommended to be indoors, mostly static, and no larger than 1,000 sqm (around 10,000 sqft). Once the scan produces a 3D model, it can be converted into an Area Target with the Vuforia Area Target Generator. This target can then be brought into Unity.
Occlusion Management is one of the key features for building a realistic AR experience. When you're using Occlusion Management, Vuforia Engine detects and tracks targets, even when they’re partially hidden behind everyday barriers, like your hand. Special occlusion handling allows apps to display graphics as if they appear inside physical objects.
Vuforia API allows for a Static or Adaptive mode. When the real-world model remains stationary, like a large industrial machine, implementing the Static API will use significantly less processing power. This enables a longer lasting and higher performance experience for those models. For objects that won’t be stationary the Adaptive API allows for a continued robust experience.
Vuforia supports Metal acceleration for iOS devices. You can also use Vuforia Samples for your projects. For example, the Vuforia Core Samples library includes various scenes using Vuforia features, including a pre-configured Object Recognition scene that you can use as a reference and starting point for an Object Recognition application.
Here's an AR Foundation code snippet written in C#:
// Assumes arOrigin, placementPose and placementPoseIsValid are fields of the enclosing MonoBehaviour
private void UpdatePlacementPose() {
    // Cast a ray from the center of the screen against the detected planes
    var scrCenter = Camera.main.ViewportToScreenPoint(new Vector3(0.5f, 0.5f));
    var hits = new List<ARRaycastHit>();
    arOrigin.Raycast(scrCenter, hits, TrackableType.Planes);
    placementPoseIsValid = hits.Count > 0;
    if (placementPoseIsValid) {
        placementPose = hits[0].pose;
        // Orient the placed content along the camera's horizontal heading
        var cameraForward = Camera.current.transform.forward;
        var cameraBearing = new Vector3(cameraForward.x, 0,
                                        cameraForward.z).normalized;
        placementPose.rotation = Quaternion.LookRotation(cameraBearing);
    }
}
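The only math in that snippet is the camera bearing: drop the vertical component of the forward vector, normalize what's left, and turn it into a heading. Here is the same arithmetic as a standalone Python sketch. The function names are mine; the yaw convention (0 degrees along +Z, positive toward +X) is the one I assume for Unity's left-handed, Y-up coordinates.

```python
import math


def horizontal_bearing(forward):
    """Drop the vertical (y) component of a forward vector and normalize the rest."""
    x, _, z = forward
    length = math.hypot(x, z)
    if length == 0.0:
        # Unity's .normalized would return a zero vector here; the sketch raises instead.
        raise ValueError("camera is looking straight up or down")
    return (x / length, 0.0, z / length)


def yaw_degrees(bearing):
    """Heading around the y axis: 0 degrees along +z, 90 degrees along +x."""
    return math.degrees(math.atan2(bearing[0], bearing[2]))


bearing = horizontal_bearing((3.0, -2.0, 4.0))
print(bearing, round(yaw_degrees(bearing), 1))
```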
Vuforia SDK Pricing Options:
Free license – you just need to register for a free Development License Key
Basic license ($42/month, billed annually) – For Students
Basic + Cloud license ($99/month) – For Small Businesses
Agency Package (personal price) – 5 short-term licenses
Pro license (personal price) – For All Companies Types
Here are Pros and Cons.
Vuforia pros | Vuforia cons
Supports Android, iOS, UWP | The price is not reasonable
A lot of supported devices | Poor developer documentation
External Camera support | PTC's mixing business with politics
Webcam/Simulator Play Mode | Doesn't support Geo tracking
Cylinder Targets support | Poor potential in Unity
Important
Deprecation Notice: Apple has deprecated Bitcode as of Xcode 14, and no longer supports app submissions containing Bitcode. A future release of the Vuforia Engine SDK will remove Bitcode from the iOS framework.
CONCLUSION:
There are no vital limitations when developing with PTC Vuforia compared to ARCore and ARKit. Vuforia is a great old product that supports a wider list of Apple and Android devices (even those that are not officially supported) as well as several of the latest models of AR glasses.
But in my opinion, ARKit, RoomPlan and the Reality Family toolkit (RealityKit, RealityFoundation, Reality Composer and Reality Converter) have an extra bunch of useful up-to-date features that Vuforia and ARCore only partially have. ARKit, in particular, has better short-distance measurement accuracy within a room than any ARCore-compatible device, without any need for calibration. This is achieved thanks to the Apple LiDAR dToF scanner. ARCore uses iToF cameras with the Raw Depth API. Both iToF and LiDAR allow you to create a high-quality virtual mesh with OcclusionMaterial for real-world surfaces at the scene understanding stage. This mesh is ready for measurement and ready for collision. With iToF and dToF sensors, frameworks instantly detect non-planar surfaces and surfaces with no trackable features at all, such as texture-free white walls in poorly lit rooms.
If you implement iBeacon tools, ARWorldMaps and support for GPS, it will help you boost tracking quality and eliminate many tracking errors accumulated over time. ARKit's tight integration with the Vision and CoreML frameworks also contributes hugely to the robustness of the AR toolset. Integration with Apple Maps allows ARKit to put GPS Location Anchors outdoors with the highest precision possible at the moment.
ARCore, in turn, uses Geospatial anchors that obtain geodata from Google Earth and Street View images, created with the help of Google Trekker.
Vuforia's measurement accuracy depends on which platform you're developing for. Some Vuforia features are built on top of the platform's tracking engine (ARKit or ARCore). Even the popular Vuforia Chalk application uses ARKit's positional tracker.
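For the floor-plan use case in the question, the measuring part is the same whichever SDK gives you the anchors: once you have the room corners as world-space points on the floor, wall lengths and floor area are plain geometry. Here is a minimal Python sketch, assuming the corners are already projected to (x, z) coordinates in meters and listed in order around the room; the function names are hypothetical.

```python
import math


def wall_lengths(corners):
    """Distances between consecutive floor corners, closing the loop back to the first."""
    return [
        math.dist(corners[i], corners[(i + 1) % len(corners)])
        for i in range(len(corners))
    ]


def floor_area(corners):
    """Shoelace formula for a simple polygon given as (x, z) pairs in meters."""
    twice_area = sum(
        x1 * z2 - x2 * z1
        for (x1, z1), (x2, z2) in zip(corners, corners[1:] + corners[:1])
    )
    return abs(twice_area) / 2.0


# A 4 m x 3 m rectangular room.
room = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
print(wall_lengths(room), floor_area(room))
```

The accuracy of the result is only as good as the corner positions, which is where the depth sensor and tracking quality discussed above come in.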

Excellent info. However, I would like to add a few points based on my experience with ARCore and ARKit. With respect to mapping, ARCore can manage larger maps than ARKit. ARCore also tracks more feature points than ARKit.
Another point is that ARKit differentiates between horizontal and vertical surfaces better than ARCore.

ARKit and ARCore are the best options. The libraries are developed by the operating system vendors' own dev communities (Android / Apple devices), so you get the latest updates for the latest technological advancements of the devices, and support as well.
So if you are planning to work in the AR realm for a longer period, you need to stick to these two (just my opinion). I have worked with Vuforia for a very long time. It taught me the basics of AR, and I created a lot of different applications with it. But at a certain level it had barriers; the main one for me was the price, which led to certain restrictions. AR Foundation, ARCore and ARKit, by comparison, are free, more stable, and a bit more flexible too.
You can explore AR Foundation: it's an amazing package by Unity. You just need to code once and it will export to Android and iOS using ARCore and ARKit.
Features of Vuforia: https://library.vuforia.com/getting-started/vuforia-features
Features of AR Foundation: https://unity.com/unity/features/arfoundation

Related

ARCore in unity vs Sceneform features/use cases?

The way I understand it, there are several environments that support ARCore, and Unity and the Sceneform SDK are some of the options.
I was wondering how are they different from each other besides one being in Java and the other being in C#? Why would someone choose one over the other aside from language preference?
Thank you
Sceneform empowers Android developers to work with ARCore without learning 3D graphics and OpenGL. It includes a high-level scene graph API, realistic physically based renderer, an Android Studio plugin for importing, viewing, and building 3D assets, and easy integration into ARCore that makes it straightforward to build AR apps. Visit this video link of Google I/O '18.
Whereas ARCore in Unity uses three key capabilities to integrate virtual content with the real world as seen through your phone's camera:
Motion tracking
Environmental understanding allows the phone to detect the size and location of all types of surfaces: horizontal, vertical and angled surfaces like the ground, a coffee table or walls.
Light estimation allows the phone to estimate the environment's current lighting conditions.
ARCore is Google’s platform for building augmented reality experiences. Using different APIs, ARCore enables your phone to sense its environment, understand the world and interact with information. Some of the APIs are available across Android and iOS to enable shared AR experiences.

Unity Vuforia hand/finger recognition

I want to make an app like the one in this video: Hand Segmentation + Vuforia Augmented Reality
The description of the video says that two SDKs could help: Hand Recognition with Intel Perceptual Computing SDK & Augmented Reality with Qualcomm Vuforia
The problem is that I don't know which feature of Vuforia could support and combine with the Intel Perceptual Computing SDK. Is it Object Recognition, and how? Any detailed instructions would be appreciated.
I do not know this Intel SDK, but I can tell you how this should basically be done. You need to have Vuforia as the 'main' app and let it take over the camera, as done in all of their samples, then take the image (with Unity, it is shown here) and pass it to another service to handle the hand detection. It has nothing to do with which Vuforia feature you want to use.

Is there a Unity plug-in that would allow you to generate a 3d model using your webcam?

I've looked into Metaio which can do Facial 3d reconstructions
video here: https://www.youtube.com/watch?v=Gq_YwW4KSjU
but I'm not looking to do that. I want to simply be able to have the user scan in a small simple object and a 3d model be created from that. I don't need it textured or anything. As far as I can tell Metaio cannot do what I'm looking for, or at least I can't find the documentation for it.
Since you are targeting mobile, you would have to take multiple pictures from different angles and use an approach used in this CSAIL paper.
Steps
For finding the keypoints, I would use FAST, or a method using the Laplacian of Gaussian. Other options include SURF and SIFT.
Once you identify the points, use triangulation to find where the points will be in 3D.
With all of the points, create a point cloud. In Unity, I would recommend doing something similar to this project, which used particle systems as the points.
You now have a 3d reconstruction of the object!
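To make the triangulation step concrete, here is a minimal Python sketch for the simplest special case: two rectified views taken side by side (a stereo pair), where depth follows from the disparity of a matched keypoint. The camera parameters (focal length in pixels, baseline in meters, principal point) and the function name are hypothetical example values; arbitrary viewpoints need full camera poses and a general triangulation, which OpenCV provides.

```python
def triangulate(u_left, u_right, v, focal_px, baseline_m, cx, cy):
    """3D point (meters, left-camera frame) from a keypoint matched in a rectified stereo pair."""
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("a valid match must have positive disparity")
    z = focal_px * baseline_m / disparity   # depth from similar triangles
    x = (u_left - cx) * z / focal_px        # back-project the pixel at that depth
    y = (v - cy) * z / focal_px
    return (x, y, z)


# Matched keypoints as (u_left, u_right, v) pixel coordinates; values are made up.
matches = [(420.0, 380.0, 240.0), (300.0, 280.0, 200.0)]
point_cloud = [
    triangulate(ul, ur, v, focal_px=800.0, baseline_m=0.1, cx=320.0, cy=240.0)
    for ul, ur, v in matches
]
print(point_cloud)
```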
Now, in implementing each of these steps, you could reinvent the wheel, or use C++ native plugins in Unity. This enables you to use OpenCV which has many of these operations already implemented (SURF, SIFT, possibly even some 3D reconstruction classes/methods, which use Stereo Calibration*).
That all being said... the Android Computer Vision Plugin (also apparently called "Starry Night") seems to have these capabilities. However, in version 1.0, only PrimeSense sensors are supported. See the description of the plugin**
Starry Night is an easy to use Unity plugin that provides high-level 3D computer vision processing functions that allow applications to interact with the real world. Version 1.0 provides SLAM (Simultaneous Localization and Mapping) functions which can be used for 3D reconstruction, augmented reality, robot controls, and many other applications. Starry Night will interface to any type of 3D sensor or stereo camera. However, version 1.0 will interface only to a PrimeSense Carmine sensor.
*Note: That tutorial is in MATLAB, but I think the overview section gives a good understanding of stereo calibration
**as of May 12th, 2014

How can I perform facial recognition on iOS?

I've started work on an application for iOS that would recognize faces from a photo or from the iPhone / iPad camera.
Existing solutions like OpenCV and Core Image (in iOS 5.0) provide facial detection within an image, but I can't find a library or example that matches a face with a person.
Does such a means of performing facial recognition, not just detection, exist for iOS?
On iOS 5 you can use CoreImage for that task (the relevant keywords are CIDetector, CIFeature and CIFaceFeature). Check out the SquareCam example app from Apple; it includes face detection. If you're targeting older iOS versions, OpenCV seems to be a good approach.
http://developer.apple.com/library/ios/#samplecode/SquareCam/Introduction/Intro.html
Edit: Argh, sorry. CoreImage can only detect faces but not recognize them. But maybe you can build a solution based on CoreImage...
What OpenCV does is called "face detection." This is different than, but related to, face recognition, which is what you seem to want to do.
Face detection "detects" faces by finding the location of facial features such as the eyes, the mouth, etc. To "recognize" these faces, you then need to compare these features with a database of known faces, for which the features have already been detected.
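The "compare with a database" step can be as simple as a nearest-neighbour search over feature vectors. This minimal Python sketch assumes each face has already been reduced to a fixed-length tuple of numbers by some detector; the feature values, names and the 0.6 distance threshold are made up for illustration.

```python
import math


def recognize(features, known_faces, max_distance=0.6):
    """Return the name whose stored feature vector is nearest, or None if nothing is close enough."""
    best_name, best_dist = None, float("inf")
    for name, stored in known_faces.items():
        dist = math.dist(features, stored)   # Euclidean distance between feature vectors
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None


known_faces = {
    "alice": (0.1, 0.9, 0.3),
    "bob": (0.8, 0.2, 0.5),
}
print(recognize((0.15, 0.85, 0.3), known_faces))
print(recognize((5.0, 5.0, 5.0), known_faces))
```

The hard part in practice is producing features that stay stable across lighting, pose and expression; the matching itself is the easy bit.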
I'm not aware of a face recognition library for iOS, and this is no easy feat. Even Apple's own iPhoto has, in my experience, very low accuracy.
However, if you only want to do face detection, or want to build your own facial recognition algorithm on top of a face detection library, iOS 5 also includes a face detection API. You can find it in the CoreImage framework.

3D Object Joints Accessing in iPhone

I have a 3D object in Maya which has joints for its movements. How can I access the object's joints on iPhone? Kindly guide me so that I can make it possible.
There is no iPhone API for this. You'll have to find a game engine that runs on iPhone and integrates with Maya.
Unity and SIO2 both run on iPhone and cite Maya support, and there are probably others. Whether said support includes inverse joint data, I don't know.
Actually, try asking on https://gamedev.stackexchange.com/. I only just found out about it.