Matching 3d object with camera image for augmented reality

I am working on a simple AR engine and I am having a problem with matching 3d object with a camera image.
For better understanding, I illustrated it with the picture. Points A and B are in 3d space. Points C and D are given on a texture plane. The distance to the plane from the camera is known.
I know how to obtain the coordinates of Anear, Bnear, Afar, Bfar, Cnear, Dnear, Cfar and Dfar.
The problem is how to find points A' and B' in 3D space such that vector d == d', Anear == Cnear and Bnear == Dnear (projecting the 3D points to the screen should give the same coordinates as C and D).
Could anyone please help me with the math here, or at least point me to where to look for the answer?
PS. It seems my problem description is not clear enough, so to put it in other words: I have a pair of points in 3D space and a pair of points on the texture plane (the image from the webcam). I need to place the points in 3D space at the correct distance from the camera, so that after the perspective transformation they overlay the points on the texture plane. The spatial relation between the 3D points needs to be preserved. In the drawing, the visual solution is points A' and B'. The dashed line illustrates the perspective transformation (where they are cast onto the near plane at the same location as points C and D).

So if I understand correctly,
the given points in world space are
A
B
C
D
Also known is the distance d and, implicitly, the camera origin (Camera.position) and its direction (Camera.transform.forward).
We are looking for
A'
B'
As I understand it, you could find the first point A' as the intersection of the line (origin = A, direction = camera forward) and the line (origin = camera.position, direction = Camera.position -> C).
Likewise, the second point B' is the intersection of the line (origin = B, direction = camera.forward) and the line (origin = camera.position, direction = Camera.position -> D).
The Math3d helper class (a community script from the Unity wiki, not part of the engine itself) has functions that help here, e.g.:
//Calculate the intersection point of two lines. Returns true if lines intersect, otherwise false.
//Note that in 3d, two lines do not intersect most of the time. So if the two lines are not in the
//same plane, use ClosestPointsOnTwoLines() instead.
public static bool LineLineIntersection(out Vector3 intersection, Vector3 linePoint1, Vector3 lineVec1, Vector3 linePoint2, Vector3 lineVec2)
{
Vector3 lineVec3 = linePoint2 - linePoint1;
Vector3 crossVec1and2 = Vector3.Cross(lineVec1, lineVec2);
Vector3 crossVec3and2 = Vector3.Cross(lineVec3, lineVec2);
float planarFactor = Vector3.Dot(lineVec3, crossVec1and2);
//is coplanar, and not parallel
if(Mathf.Abs(planarFactor) < 0.0001f && crossVec1and2.sqrMagnitude > 0.0001f)
{
float s = Vector3.Dot(crossVec3and2, crossVec1and2) / crossVec1and2.sqrMagnitude;
intersection = linePoint1 + (lineVec1 * s);
return true;
}
else
{
intersection = Vector3.zero;
return false;
}
}
So you could probably do something like
public static bool TryFindPoints(Vector3 cameraOrigin, Vector3 cameraForward, Vector3 A, Vector3 B, Vector3 C, Vector3 D, out Vector3 AMapped, out Vector3 BMapped)
{
AMapped = default;
BMapped = default;
if(LineLineIntersection(out AMapped, A, cameraForward, cameraOrigin, C - cameraOrigin))
{
if(LineLineIntersection(out BMapped, B, cameraForward, cameraOrigin, D - cameraOrigin))
{
return true;
}
}
return false;
}
and then use it like
if(TryFindPoints(Camera.transform.position, Camera.transform.forward, A, B, C, D, out var aMapped, out var bMapped))
{
// do something with aMapped and bMapped
}
else
{
Debug.Log("It was mathematically impossible to find valid points");
}
Note: Typed on smartphone but I hope the idea gets clear

Given K the camera position and X=A' and Y=B'
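// Vector3.Angle returns degrees, but Mathf.Sin expects radians, so convert.
var angleK = Vector3.Angle(C - K, D - K) * Mathf.Deg2Rad;
var angleB = Vector3.Angle(D - K, A - B) * Mathf.Deg2Rad;
// Law of sines in the triangle K, X, Y: XK / sin(angle at Y) = XY / sin(angle at K)
var XK = Mathf.Sin(angleB) * Vector3.Distance(A, B) / Mathf.Sin(angleK);
var X = K + (C - K).normalized * XK;
var Y = B + X - A;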
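
The same placement can also be found without angles, by solving directly for how far along each view ray the points must sit. The following is a minimal sketch in plain C++ (standard library only, no Unity types), not the method of either answer above. It assumes K is the camera position, C and D are the image-plane points in world space, and A, B are the original 3D points; `Vec3`, `PlacedPair` and `placeOnViewRays` are hypothetical names introduced for illustration. It looks for s and t with K + s(C-K) + (B-A) = K + t(D-K) in the least-squares sense, so the result is exact only when B-A lies in the plane spanned by the two view rays.

```cpp
#include <array>
#include <cmath>
#include <optional>

using Vec3 = std::array<double, 3>;

static Vec3 add(const Vec3& a, const Vec3& b) { return {a[0] + b[0], a[1] + b[1], a[2] + b[2]}; }
static Vec3 sub(const Vec3& a, const Vec3& b) { return {a[0] - b[0], a[1] - b[1], a[2] - b[2]}; }
static Vec3 scale(const Vec3& a, double s) { return {a[0] * s, a[1] * s, a[2] * s}; }
static double dot(const Vec3& a, const Vec3& b) { return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]; }

// Hypothetical result type: the moved points A' and B'.
struct PlacedPair { Vec3 a; Vec3 b; };

// Minimizes |s*u - t*v + d|^2 with u = C-K, v = D-K, d = B-A,
// then returns A' = K + s*u and B' = A' + d (so d is preserved exactly).
std::optional<PlacedPair> placeOnViewRays(const Vec3& K, const Vec3& C, const Vec3& D,
                                          const Vec3& A, const Vec3& B)
{
    const Vec3 u = sub(C, K);
    const Vec3 v = sub(D, K);
    const Vec3 d = sub(B, A);

    const double a = dot(u, u), b = dot(u, v), c = dot(v, v);
    const double e = dot(u, d), f = dot(v, d);

    // Determinant of the 2x2 normal equations; zero when the two rays are parallel.
    const double det = b * b - a * c;
    if (std::fabs(det) <= 1e-12 * a * c) return std::nullopt;

    const double s = (e * c - b * f) / det;
    const double t = (b * e - a * f) / det;
    if (s <= 0.0 || t <= 0.0) return std::nullopt; // solution would be behind the camera

    const Vec3 aMapped = add(K, scale(u, s));
    return PlacedPair{aMapped, add(aMapped, d)};
}
```

If B-A is not coplanar with the two rays, B' will miss the ray through D by the least-squares residual, which matches the "mathematically impossible" case mentioned in the first answer.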

Related

Dragging an object using perspective camera

I'm trying to write a function so when I hold the mouse down I can drag the game object and then I latch it into target.
I'm using a perspective,vertical camera with physical Camera checked and with focal length 35. Also I don't know if this is important but I am dragging the object in the Y and Z axis.
The code I'm using drags the objects too close to the camera. How can I fix this?
private void OnMouseDrag()
{
if (IsLatched)
{
print($"is latched:{IsLatched}");
return;
}
float distance = -Camera.main.transform.position.z + this.transform.position.z;
Ray ray = Camera.main.ScreenPointToRay(Input.mousePosition);
Vector3 rayPoint = ray.GetPoint(distance);
this.transform.position = rayPoint;
print($"{name} transform.position:{transform.position}");
this.gameObject.GetComponent<Rigidbody>().isKinematic = true;
isHeld = true;
}

You compute the distance by subtracting the z coordinates and then take the point at that distance along the click ray. Ray.GetPoint measures distance along the ray's direction, not along the z axis, so the result generally does not lie at the same z coordinate. If you want to keep one component constant, I would rather intersect the ray with an XY plane (a plane of constant z).
Ray ray = Camera.main.ScreenPointToRay(Input.mousePosition);
float Zplane = this.transform.position.z; // example. use any Z from anywhere here.
// Find the distance along the ray (assumes ray.direction.z != 0, i.e. the ray is not parallel to the plane).
float distance = (Zplane - ray.origin.z) / ray.direction.z;
// That is our point.
Vector3 point = ray.origin + ray.direction * distance;
// Z equals Zplane up to rounding error;
// assigning it explicitly removes that error.
point.z = Zplane;
this.transform.position = point;
Could this help? The same approach works for any other plane.
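
To illustrate the "any other plane" case, here is a minimal sketch in plain C++ of a ray intersected with a plane given by a point and a normal. The names `Vec3` and `intersectRayPlane` are hypothetical and chosen for illustration; the z-plane code above is the special case with normal (0, 0, 1).

```cpp
#include <array>
#include <cmath>
#include <optional>

using Vec3 = std::array<double, 3>;

static double dot(const Vec3& a, const Vec3& b) { return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]; }

// Returns the point where the ray hits the plane, or nothing if the ray is
// parallel to the plane or the plane lies behind the ray origin.
std::optional<Vec3> intersectRayPlane(const Vec3& rayOrigin, const Vec3& rayDir,
                                      const Vec3& planePoint, const Vec3& planeNormal)
{
    const double denom = dot(planeNormal, rayDir);
    if (std::fabs(denom) < 1e-9) return std::nullopt; // ray runs parallel to the plane

    const Vec3 toPlane = {planePoint[0] - rayOrigin[0],
                          planePoint[1] - rayOrigin[1],
                          planePoint[2] - rayOrigin[2]};
    const double t = dot(planeNormal, toPlane) / denom; // distance in units of rayDir
    if (t < 0.0) return std::nullopt; // intersection is behind the ray origin

    return Vec3{rayOrigin[0] + rayDir[0] * t,
                rayOrigin[1] + rayDir[1] * t,
                rayOrigin[2] + rayDir[2] * t};
}
```

In Unity itself, `Plane.Raycast` performs the same test and returns the distance along the ray, so the helper does not need to be written by hand there.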

Make ring of vectors "flat" relative to world space

I am trying to simulate liquid conformity in a container. The container is a Unity cylinder and so is the liquid. I track current volume and max volume and use them to determine the coordinates of the center of where the surface should be. When the container is tilted, each vertex in the upper ring of the cylinder should maintain its current local x and z values but have a new local y value that is the same height in the global space as the surface center.
In my closest attempt, the surface is flat relative to the world space but the liquid does not touch the walls of the container.
Vector3 v = verts[i];
Vector3 newV = new Vector3(v.x, globalSurfaceCenter.y, v.z);
verts[i] = transform.InverseTransformPoint(newV);
(I understand that inversing the point after using v.x and v.z changes them, but if I change them after the fact the surface is no longer flat...)
I have tried many different approaches and I always end up at this same point or a stranger one.
Also, I'm not looking for any fundamentally different approach to the problem. It's important that I alter the vertices of a cylinder.
EDIT
Thank you, everyone, for your feedback. It helped me make progress with this problem but I've reached another roadblock. I made my code more presentable and took some screenshots of some results as well as a graph model to help you visualize what's happening and give variable names to refer to.
In the following images, colored cubes are instantiated and given the coordinates of some of the different vectors I am using to get my results.
F(red) A(green) B(blue)
H(green) E(blue)
Graphed Model
NOTE: when I refer to capital A and B, I'm referring to the Vector3's in my code.
The cylinders in the images have the following rotations (left to right):
(0,0,45) (45,45,0) (45,0,20)
As you can see from image 1, F is correct when only one dimension of rotation is applied. When two or more are applied, the surface is flat, but not oriented correctly.
If I adjust the rotation of the cylinder after generating these results, I can get the orientation of the surface to make sense, but the numbers are not what you might expect.
For example: cylinder 3 (on the right side), adjusted to have a surface flat to the world space, would need a rotation of about (42.2, 0, 27.8).
Not sure if that's helpful but it is something that increases my confusion.
My code: (refer to graph model for variable names)
Vector3 v = verts[iter];
Vector3 D = globalSurfaceCenter;
Vector3 E = transform.TransformPoint(new Vector3(v.x, surfaceHeight, v.z));
Vector3 H = new Vector3(gsc.x, E.y, gsc.z);
float a = Vector3.Distance(H, D);
float b = Vector3.Distance(H, E);
float i = (a / b) * a;
Vector3 A = H - D;
Vector3 B = H - E;
Vector3 F = ((A + B)) + ((A + B) * i);
Instantiate(greenPrefab, transform).transform.position = H;
Instantiate(bluePrefab, transform).transform.position = E;
//Instantiate(redPrefab, transform).transform.position = transform.TransformPoint(F);
//Instantiate(greenPrefab, transform).transform.position = transform.TransformPoint(A);
//Instantiate(bluePrefab, transform).transform.position = transform.TransformPoint(B);
Some of the variables in my code and in the graphed model may not be necessary in the end, but my hope is it gives you more to work with.
Bear in mind that I am less than proficient in geometry and math in general. Please use layman's terms. Thank you!
And thanks again for taking the time to help me.

As a first step, we can calculate the normal of the upper cylinder surface in the cylinder's local coordinate system. Given the world transform of your cylinder transform, this is simply:
localNormal = inverse(transform) * (0, 1, 0, 0)
Using this normal and the cylinder height h, we can define the plane of the upper cylinder in normal form as
dot(localNormal, (x, y, z) - (0, h / 2, 0)) = 0
I am assuming that your cylinder is centered around the origin.
Using this, we can calculate the y-coordinate for any x/z pair as
y = h / 2 - (localNormal.x * x + localNormal.z * z) / localNormal.y
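
As a rough illustration of the last formula, the sketch below evaluates it for one rim vertex in plain C++. It assumes `localNormal` is the world up direction already expressed in the cylinder's local space and that the plane passes through (0, h/2, 0) as in the answer; `Vec3` and `surfaceY` are hypothetical names. The normal does not need to be normalized, because its length cancels in the division.

```cpp
#include <array>
#include <cmath>
#include <optional>

using Vec3 = std::array<double, 3>;

// Local y-coordinate on the plane dot(localNormal, (x, y, z) - (0, h/2, 0)) = 0
// for a vertex with local coordinates x and z.
std::optional<double> surfaceY(const Vec3& localNormal, double h, double x, double z)
{
    // With localNormal.y == 0 the plane is vertical in local space
    // (cylinder tilted by 90 degrees), so no single y exists for this x/z.
    if (std::fabs(localNormal[1]) < 1e-9) return std::nullopt;

    return h / 2.0 - (localNormal[0] * x + localNormal[2] * z) / localNormal[1];
}
```

For an upright cylinder the normal is (0, 1, 0) and every vertex gets y = h/2; as the tilt grows, vertices on the side the normal leans toward move down and the opposite side moves up.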

Unity3d: Find which gameObject is in front

I have two gameObjects A and B. They are rotated at 90 degrees, which makes its local y axis face forward.
1st Case
In this case, the local y position of B is ahead of local y position of A
2nd Case
Even though their global position is same as the 1st case, we can observe here that local y position of A is ahead of local y position of B.
I tried using A.transform.localPosition.y and B.transform.localPosition.y to find which is greater, but it doesn't work. What can I do to find which one is in front in these two different cases?

Vector projections are your friend here. Project both positions onto a line and compare their magnitude (or square magnitude, it's faster).
Case 1:
Vector3 a = Vector3.Project(A.position, Vector3.up);
Vector3 b = Vector3.Project(B.position, Vector3.up);
if (a.sqrMagnitude > b.sqrMagnitude)
{
// a is ahead
}
else
{
// b is ahead
}
Case 2: Project both positions onto Vector3.left.
Maybe you can even always simply project the two positions onto one of the two objects' forward vector (A.forward or B.forward assuming they're rotated equally).
Hope this helps.

You could compare Vector3.Dot(A.position, A.forward) and Vector3.Dot(B.position, B.forward) to find the one in front in relation to their forward.
The object with the bigger Dot product is in front, and this works in all rotations, including 3D ones.
You can use the following snippet to test for yourself:
// Assign these values on the Inspector
public Transform a, b;
public float RotationZ;
void Update() {
a.eulerAngles = new Vector3(0, 0, RotationZ);
b.eulerAngles = new Vector3(0, 0, RotationZ);
Debug.DrawRay(a.position, a.right, Color.green);
Debug.DrawRay(b.position, b.right, Color.red);
var DotA = Vector2.Dot(a.position, a.right);
var DotB = Vector2.Dot(b.position, b.right);
if (DotA > DotB) { Debug.Log("A is in front"); }
else { Debug.Log("B is in front"); }
}

Unity function to access the 2D box immediately from the 3D pipeline?

In Unity, say you have a 3D object,
Of course, it's trivial to get the AABB, Unity has direct functions for that,
(You might have to "add up all the bounding boxes of the renderers" in the usual way, no issue.)
So Unity does indeed have a direct function to give you the 3D AABB box instantly, out of the internal mesh/render pipeline every frame.
Now, for the Camera in question, as positioned, that AABB indeed covers a certain 2D bounding box ...
In fact ... is there some sort of built-in direct way to find that orange 2D box in Unity??
Question - does Unity have a function which immediately gives that 2D frustum box from the pipeline?
(Note that to do it manually you just make rays (or use world to screen space as Draco mentions, same) for the 8 points of the AABB; encapsulate those in 2D to make the orange box.)
I don't need a manual solution, I'm asking if the engine gives this somehow from the pipeline every frame?
Is there a call?
(Indeed, it would be even better to have this ...)
My feeling is that one or all of the
occlusion system in particular
the shaders
the renderer
would surely know the orange box, and perhaps even the blue box inside the pipeline, right off the graphics card, just as it knows the AABB for a given mesh.
We know that Unity lets you tap the AABB 3D box instantly every frame for a given mesh. Does Unity in fact give the "2D frustum bound" as shown here?

As far as I am aware, there is no built in for this.
However, finding the extremes yourself is really pretty easy. Getting the mesh's bounding box (the cuboid shown in the screenshot) is just how this is done, you're just doing it in a transformed space.
Loop through all the vertices of the mesh, doing the following:
Transform the point from local to world space (this handles dealing with scale and rotation)
Transform the point from world space to screen space
Determine if the new point's X and Y are above/below the stored min/max values, if so, update the stored min/max with the new value
After looping over all vertices, you'll have 4 values: min-X, min-Y, max-X, and max-Y. Now you can construct your bounding rectangle
You may also wish to perform a Gift Wrapping of the model first and only deal with the resulting convex hull (since points that are not on the convex hull can never lie outside its bounds). If you intend to draw this screen space rectangle while the model moves, scales, or rotates on screen, and so have to recompute the bounding box, then you'll want to do this and cache the result.
Note that this does not work if the model animates (e.g. if your humanoid stands up and does jumping jacks). Solving for the animated case is much more difficult, as you would have to treat every frame of every animation as part of the original mesh for the purposes of the convex hull solving (to ensure that none of your animations ever move a part of the mesh outside the convex hull), increasing the complexity by a power.

3D bounding box
Get given GameObject 3D bounding box's center and size
Compute 8 corners
Transform positions to GUI space (screen space)
Function GUI3dRectWithObject will return the 3D bounding box of given GameObject on screen.
2D bounding box
Iterate through every vertex in a given GameObject
Transform every vertex's position to world space, and transform to GUI space (screen space)
Find 4 corner value: x1, x2, y1, y2
Function GUI2dRectWithObject will return the 2D bounding box of given GameObject on screen.
Code
public static Rect GUI3dRectWithObject(GameObject go)
{
Vector3 cen = go.GetComponent<Renderer>().bounds.center;
Vector3 ext = go.GetComponent<Renderer>().bounds.extents;
Vector2[] extentPoints = new Vector2[8]
{
WorldToGUIPoint(new Vector3(cen.x-ext.x, cen.y-ext.y, cen.z-ext.z)),
WorldToGUIPoint(new Vector3(cen.x+ext.x, cen.y-ext.y, cen.z-ext.z)),
WorldToGUIPoint(new Vector3(cen.x-ext.x, cen.y-ext.y, cen.z+ext.z)),
WorldToGUIPoint(new Vector3(cen.x+ext.x, cen.y-ext.y, cen.z+ext.z)),
WorldToGUIPoint(new Vector3(cen.x-ext.x, cen.y+ext.y, cen.z-ext.z)),
WorldToGUIPoint(new Vector3(cen.x+ext.x, cen.y+ext.y, cen.z-ext.z)),
WorldToGUIPoint(new Vector3(cen.x-ext.x, cen.y+ext.y, cen.z+ext.z)),
WorldToGUIPoint(new Vector3(cen.x+ext.x, cen.y+ext.y, cen.z+ext.z))
};
Vector2 min = extentPoints[0];
Vector2 max = extentPoints[0];
foreach (Vector2 v in extentPoints)
{
min = Vector2.Min(min, v);
max = Vector2.Max(max, v);
}
return new Rect(min.x, min.y, max.x - min.x, max.y - min.y);
}
public static Rect GUI2dRectWithObject(GameObject go)
{
Vector3[] vertices = go.GetComponent<MeshFilter>().mesh.vertices;
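// Start the maxima at the lowest possible value so points with negative screen coordinates are handled too.
float x1 = float.MaxValue, y1 = float.MaxValue, x2 = float.MinValue, y2 = float.MinValue;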
foreach (Vector3 vert in vertices)
{
Vector2 tmp = WorldToGUIPoint(go.transform.TransformPoint(vert));
if (tmp.x < x1) x1 = tmp.x;
if (tmp.x > x2) x2 = tmp.x;
if (tmp.y < y1) y1 = tmp.y;
if (tmp.y > y2) y2 = tmp.y;
}
Rect bbox = new Rect(x1, y1, x2 - x1, y2 - y1);
Debug.Log(bbox);
return bbox;
}
public static Vector2 WorldToGUIPoint(Vector3 world)
{
Vector2 screenPoint = Camera.main.WorldToScreenPoint(world);
screenPoint.y = (float)Screen.height - screenPoint.y;
return screenPoint;
}
Reference: Is there an easy way to get on-screen render size (bounds)?

refer to this
It needs the game object with skinnedMeshRenderer.
Camera camera = GetComponent<Camera>();
SkinnedMeshRenderer skinnedMeshRenderer = target.GetComponent<SkinnedMeshRenderer>();
// Get the real time vertices
Mesh mesh = new Mesh();
skinnedMeshRenderer.BakeMesh(mesh);
Vector3[] vertices = mesh.vertices;
for (int i = 0; i < vertices.Length; i++)
{
// World space
vertices[i] = target.transform.TransformPoint(vertices[i]);
// GUI space
vertices[i] = camera.WorldToScreenPoint(vertices[i]);
vertices[i].y = Screen.height - vertices[i].y;
}
Vector3 min = vertices[0];
Vector3 max = vertices[0];
for (int i = 1; i < vertices.Length; i++)
{
min = Vector3.Min(min, vertices[i]);
max = Vector3.Max(max, vertices[i]);
}
Destroy(mesh);
// Construct a rect of the min and max positions
Rect r = Rect.MinMaxRect(min.x, min.y, max.x, max.y);
GUI.Box(r, "");

picking in 3D with ray-tracing using NinevehGL or OpenGL i-phone

I couldn't find a correct and understandable explanation of 3D picking using ray tracing. Has anyone implemented this algorithm in any language? Please share working code directly: pseudocode cannot be compiled, so it is generally written with parts missing.

What you have is a position in 2D on the screen. The first thing to do is convert that point from pixels to normalized device coordinates — -1 to 1. Then you need to find the line in 3D space that the point represents. For this, you need the transformation matrix/ces that your 3D app uses to create a projection and camera.
Typically you have 3 matrices: projection, view and model. When you specify vertices for an object, they're in "object space". Multiplying by the model matrix gives the vertices in "world space". Multiplying again by the view matrix gives "eye/camera space". Multiplying again by the projection gives "clip space". Clip space has non-linear depth. Adding a Z component to your mouse coordinates puts them in clip space. You can perform the line/object intersection tests in any linear space, so you must at least move the mouse coordinates to eye space, but it's more convenient to perform the intersection tests in world space (or object space depending on your scene graph).
To move the mouse coordinates from clip space to world space, add a Z-component and multiply by the inverse projection matrix and then the inverse camera/view matrix. To create a line, two points along Z will be computed — from and to.
In the following example, I have a list of objects, each with a position and bounding radius. The intersections of course never match perfectly but it works well enough for now. This isn't pseudocode, but it uses my own vector/matrix library. You'll have to substitute your own in places.
vec2f mouse = (vec2f(mousePosition) / vec2f(windowSize)) * 2.0f - 1.0f;
mouse.y = -mouse.y; //origin is top-left and +y mouse is down
mat44 toWorld = (camera.projection * camera.transform).inverse();
//equivalent to camera.transform.inverse() * camera.projection.inverse() but faster
vec4f from = toWorld * vec4f(mouse, -1.0f, 1.0f);
vec4f to = toWorld * vec4f(mouse, 1.0f, 1.0f);
from /= from.w; //perspective divide ("normalize" homogeneous coordinates)
to /= to.w;
int clickedObject = -1;
float minDist = 99999.0f;
for (size_t i = 0; i < objects.size(); ++i)
{
float t1, t2;
vec3f direction = to.xyz() - from.xyz();
if (intersectSphere(from.xyz(), direction, objects[i].position, objects[i].radius, t1, t2))
{
//object i has been clicked. probably best to find the minimum t1 (front-most object)
if (t1 < minDist)
{
minDist = t1;
clickedObject = (int)i;
}
}
}
//clicked object is objects[clickedObject]
Instead of intersectSphere, you could use a bounding box or other implicit geometry, or intersect a mesh's triangles (this may require building a kd-tree for performance reasons).
[EDIT]
Here's an implementation of the line/sphere intersect (based off the link above). It assumes the sphere is at the origin, so instead of passing from.xyz() as p, give from.xyz() - objects[i].position.
//ray at position p with direction d intersects sphere at (0,0,0) with radius r. returns intersection times along ray t1 and t2
bool intersectSphere(const vec3f& p, const vec3f& d, float r, float& t1, float& t2)
{
//http://wiki.cgsociety.org/index.php/Ray_Sphere_Intersection
float A = d.dot(d);
float B = 2.0f * d.dot(p);
float C = p.dot(p) - r * r;
float dis = B * B - 4.0f * A * C;
if (dis < 0.0f)
return false;
float S = sqrt(dis);
t1 = (-B - S) / (2.0f * A);
t2 = (-B + S) / (2.0f * A);
return true;
}

vec4f from = toWorld * vec4f(mouse, -1.0f, 1.0f);
vec4f to = toWorld * vec4f(mouse, 1.0f, 1.0f);
I'm assuming that 'from' is the position of the mouse cursor? If so, why is its z negative one, if we are assuming OpenGL coordinates?
Also, does this mean we assume the depth at this point ranges from -1 to +1, rather than over the depth of our frustum?
