I want to normalize the SIFT descriptors by rotating them so that the horizontal direction is aligned with the dominant gradient orientation of the patch.
I am using the VLFeat library. Is there any way in VLFeat to normalize the SIFT descriptors?
or
what is an effective way of doing this in MATLAB?
I believe the descriptors in VLFeat are already oriented along the dominant gradient direction.
You can see them drawn rotated if you look here: http://www.vlfeat.org/overview/sift.html
[f,d] = vl_sift(I) ;
f is a 4xN matrix of keypoint frames, N being the number of keypoints; each column holds the x position, y position, scale, and orientation of one keypoint. d is a 128xN matrix whose columns are the corresponding 128-dimensional SIFT descriptors.
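As a quick illustration, here is a minimal sketch of running vl_sift and drawing the frames, which already carry the assigned dominant orientation (the image file name is just a placeholder, and VLFeat is assumed to be on the path via vl_setup):

```matlab
% Minimal sketch: detect SIFT frames with VLFeat and draw them; each frame
% is plotted with a tick showing its assigned dominant orientation.
I = single(rgb2gray(imread('image.jpg')));   % vl_sift expects a single-precision grayscale image
[f, d] = vl_sift(I);                         % f: 4xN frames [x; y; scale; orientation], d: 128xN

imshow(uint8(I)); hold on;
h = vl_plotframe(f);                         % circles drawn with their orientation ticks
set(h, 'Color', 'y', 'LineWidth', 2);
```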
If all of your images are rotated upright, it is actually beneficial not to use rotational invariance. See this paper, which assumes a gravity vector: https://dspace.cvut.cz/bitstream/handle/10467/9548/2009-Efficient-representation-of-local-geometry-for-large-scale-object-retrieval.pdf?sequence=1
I have a set of data coordinates in 3D with respect to an origin, say O1, and another set of data that represents the same movements in 2D with respect to an origin, say O2. What transformations do I need to apply to the 3D set so that I can compare the data points (2D vs. 3D) in a 2D frame?
[Figure: setup of the data generation]
If the measurements are at the same times, then the following might work. I have not actually tried it and make no claim that it is the best solution, but it seems doable. Basically, after adjusting the coordinates so that they have the same origin, you want to minimize the following 2-norm:
|| vec(A*x) - vec(P*B*y) ||
where x is the 2xN array of the 2-D locations, A is a 2x2 rotation matrix that rotates x by the unknown angle theta, P is a 2x3 projection matrix that projects a 3x1 vector onto its first two dimensions (viz., just throws out the third value), B is a 3x3 rotation matrix, and y is the 3xN array of the 3-D locations.
The unknowns in the above are theta in the 2-D rotation matrix, and the three rotation angles needed to define B. The four angles that minimize the above should (at least I think) give you the rotation matrices needed to align the coordinate systems.
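A hedged MATLAB sketch of that minimization, using fminsearch over the four angles. It assumes x (2xN) and y (3xN) already share the same origin; the Z-Y-X Euler composition for B and the zero initial guess are just one possible choice:

```matlab
% Sketch: find the angles minimizing || A(theta)*x - P*B*y ||.
% The Frobenius norm of the 2xN residual equals the 2-norm of its vec().
Rz2 = @(t) [cos(t) -sin(t); sin(t) cos(t)];              % 2-D rotation A
Rx  = @(a) [1 0 0; 0 cos(a) -sin(a); 0 sin(a) cos(a)];   % rotations composing B
Ry  = @(b) [cos(b) 0 sin(b); 0 1 0; -sin(b) 0 cos(b)];
Rz  = @(c) [cos(c) -sin(c) 0; sin(c) cos(c) 0; 0 0 1];
P   = [1 0 0; 0 1 0];                                    % drop the third dimension

cost = @(p) norm(Rz2(p(1))*x - P*Rz(p(4))*Ry(p(3))*Rx(p(2))*y, 'fro');
pHat = fminsearch(cost, zeros(4, 1));                    % [theta; three angles of B]
```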
Basically, I have many points of an irregular circle on the ground in the form of x,y,z coordinates (a 200x3 matrix), and I want to fit a best-fit circle to these x,y,z coordinates.
Any help will be greatly appreciated.
I would try using the RANSAC algorithm which finds the parameters of your model (in your case a circle) given noisy data. The algorithm is quite easy to understand and robust against outliers.
The wikipedia article has a Matlab example for fitting a line but it shouldn't be too hard to adapt it to fit a circle.
These slides give a good introduction to the RANSAC algorithm (starting from page 42). They even show examples for fitting a circle.
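A hedged MATLAB sketch of that idea for a 2-D circle; the function name, threshold and iteration count are illustrative and not from any library:

```matlab
function [c, r] = ransacCircle(pts, thresh, nIter)
% RANSAC for a circle: repeatedly fit an exact circle through 3 random
% points and keep the model with the most inliers. pts is n-by-2.
    n = size(pts, 1);
    bestInliers = 0; c = [0 0]; r = 0;
    for k = 1:nIter
        s = pts(randperm(n, 3), :);              % minimal sample: 3 points
        % Exact circle through the sample: x^2 + y^2 + a*x + b*y + d = 0
        A = [s, ones(3, 1)];
        if abs(det(A)) < 1e-12, continue; end    % skip (nearly) collinear samples
        sol = A \ (-sum(s.^2, 2));               % sol = [a; b; d]
        cc  = -sol(1:2)' / 2;                    % candidate center
        rr  = sqrt(sum(cc.^2) - sol(3));         % candidate radius
        % Inliers: points whose distance to the circle is below the threshold
        d = abs(sqrt(sum((pts - cc).^2, 2)) - rr);
        nIn = nnz(d < thresh);
        if nIn > bestInliers
            bestInliers = nIn; c = cc; r = rr;
        end
    end
end
```

For example, [c, r] = ransacCircle(xy, 0.05, 1000) with xy your noisy 2-D points.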
Though this answer is late, I hope it helps others.
To fit a circle to 3D points:
1. Find the centroid of the 3D points (an n-by-3 matrix).
2. Subtract the centroid from the 3D points.
3. Using RANSAC, fit a plane to the 3D points. You can refer here for the function to fit a plane using RANSAC.
4. Apply SVD to the centered 3D points (n-by-3 matrix) and get the V matrix.
5. Generate the axes along the RANSAC plane using the axes from SVD. For example, if the plane normal is along the z-direction, then the cross product between the 1st column of the V matrix and the plane normal gives a vector along the y-direction, and the cross product between that y-vector and the plane normal gives a vector along the x-direction. From these vectors, form the rotation matrix [x_vector y_vector z_vector].
6. Multiply the centroid-subtracted 3D points by the rotation matrix so that the points become parallel to the XY plane.
7. Project the points onto the XY plane by simply dropping the Z coordinate.
8. Fit a circle using a least-squares circle fit.
9. Rotate the center of the circle back using the inverse of the rotation matrix obtained in step 5.
10. Translate the center back to its original location using the centroid.
The circle in 3D then has this center, the same radius as the 2D circle obtained in step 8, and lies in the RANSAC plane obtained in step 3.
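A hedged MATLAB sketch of these steps, with the RANSAC plane fit of step 3 replaced by a plain SVD plane fit for brevity (adequate when outliers are few); the variable names and the algebraic (Kasa) least-squares circle fit used in step 8 are illustrative choices:

```matlab
% P is the n-by-3 matrix of 3-D points lying near a circle.
c  = mean(P, 1);                    % step 1: centroid
Q  = P - c;                         % step 2: center the points
[~, ~, V] = svd(Q, 'econ');         % step 4: columns of V span the plane, V(:,3) ~ plane normal
R  = V;                             % step 5: rotation built from the plane axes
Q2 = (R' * Q')';                    % step 6: express points in the plane frame (Z ~ 0)
xy = Q2(:, 1:2);                    % step 7: drop the Z coordinate

% Step 8: algebraic least-squares circle fit, x^2 + y^2 + a*x + b*y + d = 0
sol = [xy, ones(size(xy, 1), 1)] \ (-sum(xy.^2, 2));
c2d = -sol(1:2)' / 2;               % 2-D center
rad = sqrt(sum(c2d.^2) - sol(3));   % radius

c3d = (R * [c2d, 0]')' + c;         % steps 9-10: rotate the center back and un-center it
n   = V(:, 3)';                     % normal of the circle's plane (the fitted plane)
```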
stereoParameters takes two extrinsic parameters: RotationOfCamera2 and TranslationOfCamera2.
The problem is that the documentation is not very detailed about what RotationOfCamera2 really means; it only says: Rotation of camera 2 relative to camera 1, specified as a 3-by-3 matrix.
What is the coordinate system in this case ?
A rotation matrix can be specified in any coordinate system.
What does it exactly mean "the coordinate system of Camera 1" ? What are its x,y,z axes ?
In other words, if I calculate the Essential Matrix, how can I get the corresponding RotationOfCamera2 and TranslationOfCamera2 from the Essential Matrix ?
RotationOfCamera2 and TranslationOfCamera2 describe the transformation from camera1's coordinates into camera2's coordinates. A camera's coordinate system has its origin at the camera's optical center. Its X and Y-axes are in the image plane, and its Z-axis points out along the optical axis.
Equivalently, the extrinsics of camera 1 are identity rotation and zero translation, while the extrinsics of camera 2 are RotationOfCamera2 and TranslationOfCamera2.
If you have the Essential matrix, you can decompose it into the rotation and a translation. Two things to keep in mind. First, the translation is up to scale, so t will be a unit vector. Second, the rotation matrix will be a transpose of what you get from estimateCameraParameters, because of the difference in the vector-matrix multiplication conventions.
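For reference, a hedged sketch of the textbook SVD decomposition of the Essential matrix (column-vector convention, so keep the transpose caveat above in mind). It yields two rotation candidates and a unit translation; the correct one of the four (R, ±t) combinations is the one that puts triangulated points in front of both cameras:

```matlab
% Classic decomposition of E = [t]_x * R (Hartley & Zisserman).
[U, ~, V] = svd(E);
if det(U) < 0, U = -U; end          % enforce proper rotations
if det(V) < 0, V = -V; end
W  = [0 -1 0; 1 0 0; 0 0 1];

Ra = U * W  * V';                   % rotation candidate 1
Rb = U * W' * V';                   % rotation candidate 2
t  = U(:, 3);                       % translation direction only (unit norm, sign ambiguous)
```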
Out of curiosity, what is it that you are trying to accomplish? Are you working with a single moving camera? Otherwise, why not use the Stereo Camera Calibrator app to calibrate your cameras, and get rotation and translation for free?
Suppose that, for the left camera, the rotation and translation relative to the first checkerboard (or any world reference) are R1 and T1, and that for the right camera, relative to the same checkerboard, they are R2 and T2. Then you can calculate them as follows:
RotationOfCamera2 = R2*R1';
TranslationOfCamera2 = T2 - RotationOfCamera2*T1;
But please note that this calculation is for a single shared checkerboard reference. Internally, MATLAB computes these two parameters from every given pair of checkerboard images and takes the median values as an initial guess; the parameters are later refined by nonlinear optimization, so after the median step they might differ slightly. But if you have just one reference transformation for both cameras, you should use the formula above. Note that, as Dima said, MATLAB's rotation matrix is the transpose of the usual convention, so I wrote the formula the way the literature does, not in MATLAB's style.
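A hedged sketch of that per-pair computation and the median initial guess described above; R1s/T1s and R2s/T2s are illustrative cell arrays holding each camera's per-checkerboard extrinsics (3x3 rotations, 3x1 translations) in the column-vector convention:

```matlab
nViews = numel(R1s);
Rs = zeros(3, 3, nViews);
Ts = zeros(3, nViews);
for i = 1:nViews
    Rs(:, :, i) = R2s{i} * R1s{i}';               % relative rotation for this view
    Ts(:, i)    = T2s{i} - Rs(:, :, i) * T1s{i};  % relative translation for this view
end
R12 = median(Rs, 3);   % element-wise median: only a rough initial guess,
t12 = median(Ts, 2);   % generally not an exact rotation matrix before refinement
```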
I'm trying to make a 3D reconstruction from a set of uncalibrated photographs in MATLAB. I use SIFT to detect feature points and matches between images. I want to make a projective reconstruction first and then update this to a metric one using auto-calibration.
I know how to estimate the 3D points from 2 images by computing the fundamental matrix, camera matrices and triangulation. Now say I have 3 images, a, b and c. I compute the camera matrices and 3D points for images a and b. Now I want to update the structure by adding image c. I estimate the camera matrix by using known 3D points (calculated from a and b) that match with 2D points in image c, since s*x = P*X (where x is the homogeneous image point, P is the 3x4 camera matrix, X is the homogeneous 3D point, and s is a scale/depth factor).
However, when I reconstruct the 3D points between b and c they don't line up with the existing 3D points from a and b. I'm assuming this is because I don't know the correct depth estimates of the points (depicted by s in the formula above).
With the factorization method of Sturm and Triggs I can estimate the depths and find the structure and motion. However in order to do this, all points have to be visible in all views, which is not the case for my images. How can I estimate the depths for points not visible in all views?
This is not a question about MATLAB; it is about an algorithm.
It is not mathematically possible to estimate the position of a 3D point in an image when you don't see an observation of the point in said image.
There are extensions for factorization to work with missing data. However, the field seems to have converged to Bundle Adjustment as the Gold Standard.
An excellent tutorial on how to achieve what you want can be found here; it is the culmination of several years of research into a working application, going from projective reconstruction up to the metric upgrade.
I'm looking at the Core Motion class CMAttitude; it can express the device's orientation as a 3x3 rotation matrix. At the same time I've taken a look at CATransform3D, which encapsulates a view's attitude as well as scaling. CATransform3D is a 4x4 matrix.
I've seen that the OpenGL rotation matrix is 4x4 and is simply the 3x3 matrix padded with 0 0 0 1 in the 4th row and column.
I'm wondering if CMAttitude's rotation matrix is related to CATransform3D's matrix?
Can I use the device's rotation in space obtained via a rotational matrix to transform a UIView using CATransform3D? My intention is to let the user move the phone and apply the same transform to a UIView on the screen.
Bonus question: if they are related, how do I transform a CMAttitude's rotational matrix to CATransform3D?
The gyroscope is used to determine only the orientation of the device in space. There are many ways to parameterize the orientation itself (see the SO(3) group for the theoretical background): quaternions, Euler angles and 3x3 matrices are some of them.
The "embedding" of 3x3 matrix into the 4x4 matrix is not a GL-specific trick. It is a "semi-direct product" of the group of translations (which is isomorphic to all the 3D vectors) and the group of rotations (the SO(3) mentioned above).
To get the CATransform3D matrix from CMAttitude you have to assume some position for your object. If it is zero, then just pad the matrix with 0 0 0 1 as you've said.
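Purely as an illustration of that padding (written in MATLAB like the rest of this thread, since the structure is the same in any language), with R the 3x3 rotation and zero translation; depending on whether your framework multiplies row vectors or column vectors, you may need R transposed in the upper-left block:

```matlab
% Embed a 3x3 rotation R into a 4x4 homogeneous transform with no translation.
T = [R,           zeros(3, 1);
     zeros(1, 3), 1];
```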
This question might be of interest for you: Apple gyroscope sample code