ARKit use Lidar mesh to smooth estimated planes - arkit

I'm trying to use ARKit's mesh scene reconstruction (with lidar) data to improve detected plane/geometry detection.
Right now, when pointing to a surface, ARKit gives me a very rough rectangle (far from actual surface's dimension). It happens almost instantly, but still far from the actual shape.
I'm trying to use this plane info, hit detection, and mesh data, to actually draw a smoothed rectangle around the detected surface. I don't expect full code, but rather just some hints of what to do.
Note: I'm using SceneKit (not RealityKit).
This is what I have so far for visualization:
Basically, I want the blue rectangle to better adjust to the real world shape by using the already available mesh data.

instead of using plane extents, use anchor.geometry

Related

How can I set a Projection Matrix to have a Tibia like projection?

I am beating my head a little bit here for a while but I still could bot find a way to set up a matrix that projects my Unity game in a Tibianeske like manner:
Reading on tutorials on internet I could figure out how a normal orthographic perspective works, but tibia's one is kind of odd.
Digging over webs I found in here a guy (Clint Bellanger) who describes really well how to get the same perspective in blender's render according to him:
Start with a scene in 45 degree isometric. Video game style, where
the camera angle is Blender (60,0,45).
In Blender if you look at Buttons Window -> Scene -> Render Buttons ->
Format, you can set the render aspect ratio. Set AspY to half of
AspX. This is the same as taking regular rendered output and scaling
X by 50%. If you rendered a cube, the top of the cube will be a
perfect square (though at a 45 degree angle).
We can then use Blender nodes to rotate the result 45 degrees. The
output:
Note this started as a cube, so there's a lot of "vertical"
distortion. So you might have to scale meshes to 50% Z before using
this method. Also notice the Edge seems to be applied after the
Aspect, so the edge isn't distorted.
Blend file: http://clintbellanger.net/images/temp/UltimaVII.blend (I'm
a Nodes noob so there might be a smarter setup).
For kicks, here is that tower again. I pulled it into the above
workflow scene and scaled Z by 50%. Click "Re-render this layer" on
the first node to create the composite.
On his method, he used stuff like rescaling the render and changing the scale of models, Im convinced I could get along just with the 4x4matrix in unity(or in any other 3d environment really).
Hope someone more experienced with perks of 3D maths could help me to figure it out. Thank you! =D
What you ask for is a simple parallel projection. The typical orthographic projection is just a special case where the projection rays are perpendicular to the image plane. However, every parallel projection can be represented by an affine shear transformation followed by a standard orthogonal projection.
Im convinced I could get along just with the 4x4matrix in unity(or in any other 3d environment really).
Yes. Using default GL conventions here, all you have to do is to take the standard ortho matrix, post-multiply it by an appropriate shear matrix and use that as the projection matrix.

Restoring the image of a face's plane

I am using an API to analyze faces in Matlab, where I get for each picture a 3X3 rotation matrix of the face's orientation, telling which direction the head is pointing.
I am trying to normalize the image according to that matrix, so that it will be distorted to get the image of the face's plane. This is something like 'undoing' the projection of the face to the camera plane. For example, if the head is directed a little to the left, it will stretch the left side to (more or less) preserve the face's original proportions.
Tried using 'affine2d' and 'projective2d' with 'imwarp', but it didn't achieve that goal
Achieving your goal with simple tools like affine transformations seems impossible to me since a face is hardly a flat surface. An extreme example: Imagine the camera recording a profile view of someone's head. How are you going to reconstruct the missing half of the face?
There have been successful attempts to change the orientation of faces in images and real-time video, but the methods used are quite complex:
[We] propose a gaze correction method that needs just a
single webcam. We apply recent shape deformation techniques
to generate a 3D face model that matches the user’s face. We
then render a gaze-corrected version of this face model and
seamlessly insert it into the original image.
(Giger et al., https://graphics.ethz.ch/Downloads/Publications/Papers/2014/Gig14a/Gig14a.pdf)

Aligning VideoOverlayScreen with Mesh camera

How do you properly place the Video overlay on the back of a meshing camera, so that the mesh generated matches what's seen in the video?
(Using Unity 5.2.1f3)
I think there are two important parts needed in order to make sure the video overlay align with the mesh:
The render camera's projection matrix
You have to make sure the projection matrix of your render camera matches the projection matrix of the physical camera. That requires a customized projection matrix calculated based on Tango color camera's intrinsics value. Here is a snippet of sample code doing that (quoted from tango unity example). After the projection matrix is matched, the image you see will be aligned with the meshes.
Timestampe synchronization.
To be more precise on the rendering, you might want to do a synchronization between the point cloud, color camera, and the pose. To do that, you will need to query a pose based on color camera's update timestamp. Each time you received a point cloud, you need to transform the points to the color camera frame, because the point cloud is received in different timestamp. Then use the transformed point cloud to do the mesh reconstruction. Put it in a matrix equation:
P_color = inverse(ss_T_color) * ss_T_depth* P_depth

Shader-coding: nonlinear projection models

As I understand it, the standard projection model places an imaginary grid in front of the camera, and for each triangle in the scene, determines which 3 pixels its 3 corners project onto. The color is determined for each of these points, and the fragment shader fills in the rest using interpolation.
My question is this: is it possible to gain control over this projection model? For example, create my own custom distorted uv-grid? Or even just supply my own algorithm:
xyPixelPos_for_Vector3( Vector3 v ) {...}
I'm working in Unity3D, so I think that limits me to cG or openGL.
I did once write a GLES2 shader, but I don't remember ever performing any kind of "ray hits quad" type test to resolve the pixel position of a particular 3D point in space.
I'm going to assume that you want to render 3d images based upon 3d primitives that are defined by vertices. This is not the only way to render images with OpenGL but it is the most common. The technique that you describe sounds much more like Ray-Tracing.
How OpenGL Typically Works:
I wouldn't say that OpenGL creates an imaginary grid. Instead, what it does is take the positions of each of your vertices, and converts them into a different space using linear algebra (Matrices).
If you want to start playing around with this, it would be best to do some reading on Matrices, to understand what the graphics card is doing.
You can easily start warping the positions of Vertices by making a vertex shader. However, there is some setup involved. See the Lighthouse tutorials (http://www.lighthouse3d.com/tutorials/glsl-tutorial/hello-world-in-glsl/) to get started with that! You will also want to read their tutorials on lighting (http://www.lighthouse3d.com/tutorials/glsl-tutorial/lighting/), to create a fully functioning vertex shader which includes a lighting model.
Thankfully, once the shader is set up, you can distort your entire scene to your hearts content. Just remember to do your distortions in the right 'space'. World coordinates are much different than eye coordinates!

Open GL - ES 2.0 : Touch detection

Hi Guys I am doing some work on iOS and the work requires use of OpenGL es. So now I have a bunch of squares, cubes and triangles on the screen. Some of these geometries might overlap. Any ideas/ approaches for touch detection?
Regards
To follow up on the answer already given, squares, cubes and triangles are convex shapes so you can perform ray-object intersection quite easily, even directly from the geometry rather than from the mathematical description of the perfect object.
You're going to need to be able to calculate the distance of a point from the plane and the intersection of a ray with the plane. As a simple test you can implement yourself very quickly, for each polygon on the convex shape work out the intersection between the ray and the plane. Then check whether that point is behind all the planes defined by polygons that share an edge with the one you just tested. If so then the hit is on the surface of the object — though you should be careful about coplanar adjoining polygons and rounding errors.
Once you've found a collision you can easily get the length of the ray to the point of collision. The object with the shortest distance is the one that's in front.
If that's fast enough then great, otherwise you'll probably want to look into partitioning the world or breaking objects down to their silhouettes. Convex objects are really simple — consider all the edges that run between one polygon and the next. If only exactly one of those polygons is front facing then the edge is part of the silhouette. All the silhouettes edges together can be projected to a convex 2d shape on the view plane. You can then test touches by performing a 2d point-in-polygon from that.
A further common alternative that eliminates most of the maths is picking. You'd render the scene to an invisible buffer with each object appearing as a solid blob in a suitably unique colour. To test for touch, you'd just do a glReadPixels and inspect the colour.
For the purposes of glu on the iPhone, you can grab SGI's implementation (as used by MESA). I've used its tessellator in a shipping, production project before.
I had that problem in the past. What I have used is an implementation of glu unproject that you can find on google (it uses the inverse of the model view projection matrix and the viewport size). This allows you to map the 2D screen coordinates to a 3D vector into the world. Then, you can use this vector to intersect with your objects and see which one intersects (or comes really close to doing so).
I do hope there are better ways of doing this, so I look forward to other answers as well!
Once you get the inverse-modelview and cast your ray (vector), you still need to know if the ray intersects your geometry. One approach would be to grab the depth (z in view coordinate system) of the object's center and extend (stretch) your vector just that far. Then see if the vector's "head" ends within the volume of your object or not (you need the objects center and e.g. Its radius, if it's a sphere)