Kinect - Calculating Surface Area - distance

I'd like to be able to calculate the surface area of objects seen by the depth camera. Is there an easy way to do this? For example, if the Kinect is seeing a player, I need to calculate how much surface area the player covers.
If no such function exists, I can calculate it by creating small quads with corners at (x,y), (x+1,y), (x,y+1), (x+1,y+1) and taking the z value into account. But I'm not sure how to get the distance in mm or cm between adjacent pixels along the x or y axis.
Thanks
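As far as I know there is no ready-made function for this, but the per-pixel distance can be estimated from the depth value and the depth camera's field of view. Below is a minimal MATLAB sketch of that idea (the 57 x 43 degree FOV is the commonly quoted approximate field of view of the Kinect v1 depth camera; depthMM and playerMask are hypothetical variables holding the depth frame in millimetres and the player's pixel mask):
% Rough surface-area estimate from a depth frame (a sketch, not an SDK call).
fovX = deg2rad(57); fovY = deg2rad(43);   % approximate Kinect v1 depth FOV
[rows, cols] = size(depthMM);             % e.g. a 480x640 depth image in mm

% Width/height in mm covered by one pixel at that pixel's depth:
pixW = 2 * depthMM * tan(fovX/2) / cols;
pixH = 2 * depthMM * tan(fovY/2) / rows;

pixelArea = pixW .* pixH;                 % mm^2 seen by each pixel
areaMM2 = sum(pixelArea(playerMask & depthMM > 0));
areaM2  = areaMM2 / 1e6;                  % convert mm^2 to m^2
Note that this treats every patch as facing the camera; for slanted surfaces you would additionally weight each pixel by its surface normal, estimated from neighbouring depth values.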

Related

Find 3D coordinate with respect to the camera using 2D image coordinates

I need to calculate the X,Y coordinates in the world with respect to the camera using u,v coordinates in the 2D image. I am using an S7 edge camera to send a 720x480 video feed to MATLAB.
What I know: Z, i.e. the depth of the object from the camera, the size of the camera pixels (1.4 um), and the focal length (4.2 mm).
Let's say the image point is at (u,v) = (400,400).
My approach is as follows:
Subtract the pixel value of the center point (240,360) from the (u,v) pixel coordinates of the point in the image. This should give us the pixel coordinates with respect to the camera's optical axis (z axis); the origin is now at the center of the image. So the new coordinates are (160, -40).
Multiply the new u,v pixel values by the pixel size to obtain the distance of the point from the origin in physical units. Let's call it (x,y). We get (x,y) = (0.224, -0.056) in mm.
Use the formulas X = xZ/f and Y = yZ/f to calculate the X,Y coordinates in the real world with respect to the camera's optical axis.
Is my approach correct?
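For reference, the steps above can be written out as a short MATLAB sketch (the depth Z and the principal point are assumed or example values; everything else comes from the question):
pixelSize = 1.4e-3;        % pixel pitch in mm (1.4 um)
f = 4.2;                   % focal length in mm
Z = 1000;                  % example depth in mm, assumed known
cx = 720/2; cy = 480/2;    % principal point assumed at the image centre

u = 400; v = 400;          % image point in pixels
x = (u - cx) * pixelSize;  % offset from the optical axis, in mm
y = (v - cy) * pixelSize;

X = x * Z / f;             % real-world coordinates in the camera frame,
Y = y * Z / f;             % in the same units as Z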
Your approach is going in the right direction, but it would be easier to use a more standardized approach. What is usually done is to use the pinhole camera model, which gives you a transformation between world coordinates [X, Y, Z] and pixel coordinates [x, y]. Take a look at this guide, which describes step by step how to build that transformation.
Basically, you have to define your internal camera matrix to do the transformation:
fx and fy are your focal lengths expressed as pixel distances. You can calculate them from your FOV and the total number of pixels in each direction. Take a look here and here for more info.
u0 and v0 are the principal point (the "piercing point"): since our pixels are not centered at [0, 0], these parameters represent a translation to the center of the image (the intersection of the optical axis with the image plane, given in pixel coordinates).
If you need it, you can also add a skew factor a, which you can use to correct shear effects of your camera. The internal camera matrix will then be:
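A minimal sketch of that matrix and the transformation it gives you (the focal lengths, principal point and example 3D point below are placeholders, not calibrated values):
fx = 3000; fy = 3000;      % focal lengths in pixel units (placeholders)
u0 = 360;  v0 = 240;       % principal ("piercing") point in pixels
a  = 0;                    % skew factor, 0 for most cameras

K = [fx  a  u0;
      0 fy  v0;
      0  0   1];

P  = [0.5; -0.2; 2.0];     % example point in camera coordinates
p  = K * P;                % homogeneous projection
uv = p(1:2) / p(3);        % pixel coordinates [u; v]

Z   = P(3);                % your fixed depth
ray = K \ [uv; 1];         % inverse transformation: back-project the pixel
Pc  = ray * (Z / ray(3));  % recovers the original point P
With a = 0 this reduces exactly to the X = xZ/f, Y = yZ/f formulas in the question, with the focal length expressed in pixels.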
Since your depth is fixed, just fix your Z and continue the transformation without a problem.
Remember: if you want the inverse transformation (camera to world), just invert your camera matrix and be happy!
MATLAB also has a very good guide for this transformation. Take a look.

Understanding depth values in 3D point cloud

I have problems understanding the depth (Z) value in 3D point cloud resulted from 3d sparse reconstruction like this example in MATLAB: http://www.mathworks.com/help/vision/ug/sparse-3-d-reconstruction-from-multiple-views.html
I have attached a picture showing the reconstructed 3D point cloud from the above example. I have put some data tips on the figure so we know the (x,y,z) coordinates of the points. Here are my questions:
1- What does the Z value in the point cloud represent? Is it the distance in millimeters from the camera? If that's the case, it does not make sense based on the picture I attached, since I am sure the distance of the sphere and checkerboard from the camera must be greater than 200 mm.
Or maybe it is measured from some reference point in space? Then what is this reference point, and how can I make a 3D point cloud in which the Z values indicate the distance from the camera?
2- Why are there negative values for Z? What does that mean in terms of distance to the camera?
I would appreciate it if someone could explain.
In this example the world coordinates are defined by the checkerboard: the checkerboard defines the X-Y plane, and the Z-axis points into the checkerboard, as explained in the documentation.
Since your 3D points are above the checkerboard, they have negative Z-coordinates.
Your (x,y,z) coordinates are in world units, which are completely disconnected from metric values (unless you establish a scale between world units and metric units; there are various methods for doing this). So the z value tells you the depth of each point in world coordinates.
If you have the pose of each camera and you transform each point into that camera's coordinate frame (multiply by the camera's rotation and add its translation), you get the points (x', y', z') in camera coordinates. At that point, if z' is negative, the point is behind the camera.
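A minimal sketch of that check in MATLAB (R and t are the camera's rotation matrix and translation vector, e.g. as returned by extrinsics in the Computer Vision Toolbox, which uses the row-vector convention camPoint = worldPoint * R + t; worldPoints is an N-by-3 matrix):
% Transform world points into camera coordinates and test which lie in front of the camera.
camPoints = worldPoints * R + repmat(t, size(worldPoints, 1), 1);
zCam = camPoints(:, 3);    % depth of each point along the optical axis
inFront = zCam > 0;        % points with negative z' are behind the camera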

Create depth map from 3d points

I am given the 3D points of a scene, or a subset of these points comprising one object of the scene. I would like to create a depth image from these points, that is, an image in which each pixel value encodes the distance of the corresponding 3D point to the camera.
I have found the following similar question
http://www.mathworks.in/matlabcentral/newsreader/view_thread/319097
however, the answers there do not help me, since I want to use MATLAB. Getting the image values is not difficult (e.g. simply compute the distance of each 3D point to the camera's origin), but I do not know how to figure out the corresponding locations in the 2D image.
The only approach I can imagine is to project all 3D points onto a plane and bin their positions into discrete rectangles on that plane, then average the depth value in each bin.
I suspect, however, that the result of such a procedure would be a very pixelated image, not very smooth.
How would you go about this problem?
Assuming you've corrected for camera tilt (a simple matrix multiplication if you know the angle), you can probably just follow this example:
X = data(:,1);
Y = data(:,2);
Z = data(:,3);
%// This bit requires you to make some choices, like the start X and Z, end X and Z, and resolution (X and Z) of your desired depth map
[Xi, Zi] = meshgrid(X_start:X_res:X_end, Z_start:Z_res:Z_end);
depth_map = griddata(X, Z, Y, Xi, Zi);
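If you want to look at the interpolated map as an image, a small usage sketch (griddata returns NaN outside the convex hull of your points, which you may want to replace):
depth_map(isnan(depth_map)) = 0;   % or leave as NaN, depending on your needs
imagesc(depth_map)
axis image
colorbar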

How to get real world coordinates (x, y, z) from a distinct object using a Kinect

I have to get the real world coordinates (x, y, z) using Kinect. Actually, I want the x, y, z distance (in meters) from Kinect.
I have to get these coordinates from a unique object (e.g. a little yellow box) in the scenario, colored in a distinct color.
Here you can see an example of the scenario
I want the distance (x, y, z in meters) of the yellow object on the shelf.
Note that a person (skeleton) is not required in the scenario.
First of all, I would like to know if this is possible and simple to do.
I would appreciate it if you could send some links/code that could help me with this task.
You would need to use both the Color Stream and the Depth Stream.
First, using the Color Stream, you would need to collect an array of pixels that match the color you are looking for, and then look up the depth data from the Depth Stream for those pixels to get an average distance from the camera. That gives you the Z.
To get the X and Y you would use the math from this answer.
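A minimal sketch of that math (the 57 x 43 degree field of view is the commonly quoted approximate FOV of the Kinect v1 depth camera; pixelU, pixelV and depthM are hypothetical variables holding the object's average pixel position and its average depth in metres):
width = 640; height = 480;              % depth frame size
fovX = deg2rad(57); fovY = deg2rad(43); % approximate Kinect v1 depth FOV

Z = depthM;                                              % metres from the Kinect
X = (pixelU - width/2)  / (width/2)  * tan(fovX/2) * Z;  % metres left/right of the optical axis
Y = (height/2 - pixelV) / (height/2) * tan(fovY/2) * Z;  % metres above/below it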
You get the Z distance (from the object to the Kinect) from the Position.Z of a specific joint, so there is no problem getting it.
For X and Y, it depends on whether you want the distance from joint to joint or from a joint to the Kinect. You can calculate it yourself: you need the Kinect's angle of view and the distance from it.

Creating a cylinder with axis centered differently

I know MATLAB has a function called cylinder that creates the points of a cylinder given the number of points along the circumference and the radius. What if I don't want a unit cylinder, and also don't want it centered on the default axis (for example, along the z-axis)? What would be the easiest approach to create such a cylinder? Thanks in advance.
The previous answer is fine, but you can get MATLAB to do more of the work for you (because cylinder returns separate x, y, z components, you would need to do a little work yourself to apply the matrix multiplication for the rotation). To have the center of the base of the cylinder at [x0 y0 z0], scaled by [xf yf zf] (use xf = yf unless you want an elliptic cylinder), use:
[x, y, z] = cylinder;                        % unit cylinder along the z-axis
h = mesh(x*xf + x0, y*yf + y0, z*zf + z0);   % scale by [xf yf zf] and translate to [x0 y0 z0]
If you also want to rotate it so it isn't aligned along the z-axis, use rotate. For example, to rotate about the x-axis by 90 degrees, so it's aligned along the y-axis, use:
rotate(h,[1 0 0],90)
Multiply the points by your favourite combination of a scaling matrix, a translation matrix, and a rotation matrix.
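A sketch of that approach with homogeneous 4x4 matrices (the scale factors, rotation angle and translation below are just example values):
[x, y, z] = cylinder;                  % unit cylinder along the z-axis

S = diag([2 2 5 1]);                   % scale: radius 2, height 5
theta = pi/2;                          % rotate 90 degrees about the x-axis
R = [1 0 0 0;
     0 cos(theta) -sin(theta) 0;
     0 sin(theta)  cos(theta) 0;
     0 0 0 1];
T = [1 0 0 3;                          % translate to [3 -1 0]
     0 1 0 -1;
     0 0 1 0;
     0 0 0 1];

pts    = [x(:) y(:) z(:) ones(numel(x), 1)]';   % 4-by-N homogeneous points
newPts = T * R * S * pts;                       % scale, then rotate, then translate

xn = reshape(newPts(1,:), size(x));             % back to the grids mesh/surf expect
yn = reshape(newPts(2,:), size(y));
zn = reshape(newPts(3,:), size(z));
mesh(xn, yn, zn)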