I have given 3d points of a scene or a subset of these points comprising one object of the scene. I would like to create a depth image from these points, that is the pixel value in the image encodes the distance of the corresponding 3d point to the camera.
I have found the following similar question
http://www.mathworks.in/matlabcentral/newsreader/view_thread/319097
however the answers there do not help me, since I want to use MATLAB. To get the image values is not difficult (e.g. simply compute the distance of each 3d point to the camera's origin), however I do not know how to figure out the corresponding locations in the 2d image.
I could only imagine that you project all 3d points on a plane and bin their positions on the plane in discrete, well, rectangles on the plane. Then you could average the depth value for each bin.
I could however imagine that the result of such a procedure would be a very pixelated image, not being very smooth.
How would you go about this problem?
Assuming you've corrected for camera tilt (a simple matrix multiplication if you know the angle), you can probably just follow this example
X = data(:,1);
Y = data(:,1);
Z = data(:,1);
%// This bit requires you to make some choices like the start X and Z, end X and Z and resolution (X and Z) of your desired depth map
[Xi, Zi] = meshgrid(X_start:X_res:X_end, Z_start:Z_res:Z_end);
depth_map = griddata(X,Z,Y,Xi,Zi)
Related
Good day to all! My task is as follows: there is an array of points in space that form an ellipse, and this array must be designed on a 2D image (as if we were getting an image from a camera). My problem is that in some cases, instead of an ellipse, I can get a hyperbola, which is not good. How to avoid such cases? I do the mapping in the matlab as follows:
function [u,v] = point2camProjection(point,f,Oc)
K = [f,0,Oc(1);0,f,Oc(2);0,0,1];
UVW = K*point;
u=UVW(1)/UVW(3);
v=UVW(2)/UVW(3);
end
point is the coordinates of the 3D point, f is the focal length of the camera, and Oc is the optical center of the camera.
I need to calculate the X,Y coordinates in the world with respect to the camera using u,v coordinates in the 2D image. I am using an S7 edge camera to send a 720x480 video feed to MATLAB.
What I know: Z i.e the depth of the object from the camera, size of the camera pixels (1.4um), focal length (4.2mm)
Let's say the image point is at (u,v) = (400,400).
My approach is as follows:
Subtract the pixel value of center point (240,360) from the u,v pixel coordinates of the point in the image. This should give us the pixel coordinates with respect to the camera's optical axis (z axis). The origin is now at the center of the image. So new coordinates are: (160, -40)
Multiply the new u,v pixel values with pixel size to obtain the distance of the point from the origin in physical units. Let's call it (x,y). We get (x,y) = (0.224,-0.056) in mm units.
Use the formula X = xZ/f & Y = yZ/f to calculate X,Y coordinates in the real world with respect to the camera's optical axis.
Is my approach correct?
Your approach is going in the right way, but it would be easier if you use a more standardize approach. What we usually do is use Pinhole Camera Model to give you a transformation between the world coordinates [X, Y, Z] to the pixel [x, y]. Take a look in this guide which describes step-by-step the process of building your transformation.
Basically you have to define you Internal Camera Matrix to do the transformation:
fx and fy are your focal length scaled to use as pixel distance. You can calculate this with your FOV and the total pixel in each direction. Take a look here and here for more info.
u0 and v0 are the piercing point. Since our pixels are not centered in the [0, 0] these parameters represents a translation to the center of the image. (intersection of the optical axis with the image plane provided in pixel coordinates).
If you need, you can also add a the skew factor a, which you can use to correct shear effects of your camera. Then, the Internal Camera Matrix will be:
Since your depth is fixed, just fix your Z and continue the transformation without a problem.
Remember: If you want the inverse transformation (camera to world) just invert you Camera Matrix and be happy!
Matlab has also a very good guide for this transformation. Take a look.
I am wondering if there is a way to build a random 3D surface from only one (top-down) 2D image of this surface. The fact is that the 3D surface needs the z-coordinates (the heights and the depths) and the 2D (top-down) image gives only the x and y coordinates.
I believe that the main problem is that we can't get the real ranges of the dimensions (x,y,z) of the surface from one 2D (top-down) image but we can get some kind of normalized scaling which is not the real one (it's just similar).
For example:
If we have an image with a surface (2D) and we want 3D of this surface (x,y,z) we can have easily the x and the y coordinates from the image. We can't have the real range of the amplitude (z coordinate) in each point of the surface but only the gray-tones scaling. Is there any ideas on how could we take the real sizes of the amplitudes of a surface from one 2D (top-down) image?
Left is a sample of 2D top-down image and Right is a surface which created by the 2D
http://www.sendspace.com/file/9wzx0u
p.s.
I can't post an image because of my reputation, so I uploaded one on sedspace.com.
Read in the image:
A = imread(filename)
Plot the surface plot using the magnitude of the value read in for each x and y from the file:
surf(A)
I am trying to artificially manipulate a 2D image using a rigid 3D transformation (T). Specifically, I have an image and I want to transform using T it to determine the image if captured from a different location.
Here's what I have so far:
The problem reduces to determining the plane-induced homography (Hartley and Zisserman Chapter 13) - without camera calibration matrices this is H = R-t*n'/d.
I am unsure, however, how to define n and d. I know that they help to define the world plane, but I'm not sure how to define them in relation to the first image plane (e.g. the camera plane of the original image).
Please advise! Thanks! K
Not sure what you mean by "first image plane": the camera's?
The vector n and the scalar d define the equation of the plane defining the homography:
n X + d = 0, or, in coordinates, n_x * x + n_y * y + n_z * z + d = 0, for every point X = (x, y, z) belonging to the plane.
There are various ways to estimate the homography. For example, you can map a quadrangle on the plane to a rectangle of known aspect ratio.
Or you can estimate the locations of vanishing points (this comes handy when you have, say, an image of a skyscraper, with nice rows of windows). In this case, if p and q are the homogeneous coordinates of the vanishing points of two orthogonal lines on a plane in the scene, then normal to the plane in camera coordinates is simply given by (p X q) / (|p| |q|)
I have a 3D matrix of data in matlab, but I want to extract an arbitrarily rotated slice of data from that matrix and store it as a 2D matrix, which I can access. Similar to how the slice() function displays data sliced at any angle, except I would also like to be able to view and modify the data as if it were an array.
I have the coordinates of the pivot-point of the plane as well as the angles of rotation (in x, y and z axis), I have also calculated the equation of the plane in the form:
Ax + By + Cz = D
and can extract a 3D matrix containing only the data that fall on that plane, but I don't know how to then convert that into a simple 2D array.
Another way of doing it would be to somehow rotate the source matrix in the opposite direction of the angle of the plane, so as to line up the plane of data with the XY axis, and simply extract that portion of the matrix, but I do not know if rotating a matrix like that is possible.
I hope this hasn't been answered elsewhere, I've been googling it all day, but none of the problems seem to exactly match mine.
Thanks
You can take a look at the code here. I think the function is similar to what you are trying to solve.
The function extracts an arbitrary plane from a volume given the size of the plane, the center point of the plane, and the plane normal, i.e. [A,B,C]. It also outputs the volumetric index and coordinate of each pixel on the plane.
Aha! May have just solved it myself.
To produce the plane equation I rotate a normal vector of (0,0,1) using rotation matrices and then find D. If I also rotate the following vectors:
(1,0,0) //step in the x direction of our 2D array
and
(0,1,0) //step in the y direction of our 2D array
I'll have the gradients that denote how much my coordinates in x,y,z have to change before I step to the next column in my array, or to the next row.
I'll mock this up ASAP and mark it as the answer if it works
EDIT: Ok slight alteration, when I'm rotating my vectors I should also rotate the point in 3D space that represents the xyz coordinates of x=0,y=0,z=0 (although I'm rotating around the centre of the structure, so it's actually -sizex/2,-sizey/2,-sizez/2, where size is the size of the data, and then I simply add size/2 to each coordinate after the rotations to translate it back to where it should be).
Now that I have the gradient change in 3D as I increase the x coordinate of my 2D array and the gradient change as I increase the y coordinate, I can simply loop through all possible x and y coordinates (the resulting array will be 50x50 for a 50x50x50 array, I'm not sure what it will be for irregular sizes, which I'll need to work out eventually) in my 2D array and calculate the resulting 3D coordinates on my plane in the data. My rotated corner value serves as the starting point. Hooray!
Just got to work out a good test for this encompassing all angles and then I'll approve this as an answer