I've been attempting to figure out how to take a homography between two planes and convert it into an projective transform. Matlab does this automatically, but I've been trying to figure out how matlab implements the conversion.
You can look at the source code in toolbox\images\images\maketform.m
At least within the editor you can get to this by hitting F4 on the function name.
A homography is a projective transform that maps lines to lines, keeps cross ratio, but does not keep parallelism or other similarity magnitudes (angles, distances, etc).
A homography can be expressed as a homogeneous 3x3 matrix, and computed in many (really, many) different ways according to your problem.
The most typical one is to determine 4 point correspondences between the two planes and use the Direct Linear Transform (DLT). There are also many implementations of the DLT. If you are familiar with OpenCV, you can easily obtain such homography matrix using cv::findHomography (http://docs.opencv.org/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html?highlight=findhomography#findhomography).
In general, I recommend you to take a look to the "Multiple View Geometry" book from Hartley & Zisserman, which explain in detail the concept of homographies in the context of computer vision.
Related
Currently, I am working at a short project about stereo-vision.
I'm trying to create depth maps of a scenery. For this, I use my phone from to view points and use the following code/workflow provided by Matlab : https://nl.mathworks.com/help/vision/ug/uncalibrated-stereo-image-rectification.html
Following this code I am able to create nice disparity maps, but I want to now the depths (as in meters). For this, I need the baseline, focal length and disparity, as shown here: https://www.researchgate.net/figure/Relationship-between-the-baseline-b-disparity-d-focal-length-f-and-depth-z_fig1_2313285
The focal length and base-line are known, but not the baseline. I determined the estimate of the Fundamental Matrix. Is there a way to get from the Fundamental Matrix to the baseline, or by making some assumptions to get to the Essential Matrix, and from there to the baseline.
I would be thankful for any hint in the right direction!
"The focal length and base-line are known, but not the baseline."
I guess you mean the disparity map is known.
Without a known or estimated calibration matrix, you cannot determine the essential matrix.
(Compare Multi View Geometry of Hartley and Zisserman for details)
With respect to your available data, you cannot compute a metric reconstruction. From the fundamental matrix, you can only extract camera matrices in a canonical form that allow for a projective reconstruction and will not satisfy the true baseline of the setup. A projective reconstruction is a reconstruction that differs from the metric result by an unknown transformation.
Non-trivial techniques could allow to upgrade these reconstructions to an Euclidean reconstruction result. However, the success of these self-calibration techniques strongly depends of the quality of the data. Thus, using images of a calibrated camera is actually the best way to go.
I used this script to reconstruct an image of the Shepp-Logan phantom.
Basically, it just simply used radon to get sinogram and used iradon to transform it back.
However, I found that a very obvious moire pattern can be seen when I adjust contrast. This is even more obvious if I use my CT image dataset.
Can anyone help me to understand this? Thanks!
img = phantom(512)*1000;
views = 576;
angles = [0:180/576:180-180/576];
sino = radon(img,angles);
img_rec = iradon(sino,angles);
imshow(img_rec,[]);
Full image after being adjusted contrast:
Regions with obvious moire pattern:
This may be happening because of some factors:
From the MATLAB documentation, iradon uses 'Ram-Lak' (known as ramp filter) filtering as default and does not use any windowing to de-emphasizes noise in the high frequencies. You stated "This is even more obvious if I use my CT image dataset", that is because there you have real noise in the images. The documentation itself advises to use some windowing:
"Because this filter is sensitive to noise in the projections, one of the filters listed below might be preferable. These filters multiply the Ram-Lak filter by a window that de-emphasizes higher frequencies."
Other inconvenient is related to the projector itself. The built-in functions radon and iradon from MATLAB does not take into account the detector size and the x-ray length which cross the pixel. These functions are just pixel driving methods, i.e., they basically project geometrically the pixels in the detector and interpolate them.
Possible solutions:
There are more sophisticated projectors today as [1] and [2]. As I stated here, I implemented the distance-driven projector for 2D Computed Tomography (CT) and 3D Digital Breast Tomosynthesis (DBT), so feel free to use it for your experiments.
For example, I generated 3600 equally spaced projections of the phantom with the distance-driven method, and reconstructed it with the iradon function using this line code:
slice = iradon(sinogram',rad2deg(geo.theta));
I am new to Digital Image Processing and have to simulate a Fourier Descriptor Program that is Affine Invariant, I want to know the prerequisites required to be able to understand this program, my reference is Digital Image Processing Using MATLAB by Gonzalez, I have seen a question on this site, regarding same program, but not able to understand the program as well as the solution, the question says:
"I am using Gonzalez frdescp function to get Fourier descriptors of a boundary. I use this code, and I get two totally different sets of numbers describing two identical but different in scale shapes.
So what is wrong?"
Can some body help me in knowing the prerequisite to understand this program as well as help me further?
Let me give this a try as I will have to use english and not mathematical notation. First, this is the documentation of the frdescp shown here. frdescp takes one argument which is an n by 2 matrix of numbers. What are these numbers? This requires some understanding of the mathematical foundation of Fourier Descriptors. The assumptions, before computing the Fourier Descriptors, is that you have a contour of the object, and you have some points on that contour. So for example a contour is shown in this picture:
You see that black line in the image? That is where you will pick a list of points going clockwise from the contour. Let's call this vector {(x_1, y_1), (x_2,y_2),... ,(x_n,y_n)}. Now that we have these points we are ready to compute the Fourier descriptors of this contour. The complex Fourier descriptor implemented in this Matlab function requires numbers to be in the complex domain. So you have to convert the numbers in our list to complex numbers, this is easy as you can transform a tuple of real numbers in 2D (x,y) to x + iy in the complex plane. However the matlab the function already does this for you. But now you know what the n by 2 matrix is for, it is just a list of xs and ys on the contour. After you have this, the matlab function takes the discrete Fourier transform and you get the descriptors. The benefit of this descriptor business is that it is invariant under certain geometric transformations such translation, rotation and scaling. I hope this was helpful.
I am trying to use MATLAB to implement a CT (computed tomography) projection operator, A, which I think is also referred as "system matrix" often times.
Basically, for a N x N image M, the projection data, P, can be obtained by multiplication of the project operator to the image:
P = AM
and the backprojection procedure can be performed by multiplying the (conjugate) transpose of the projection operator to the projection data:
M = A'P
Anyone has any idea/example/sample code on how to implement matrix A (for example: Radon transform)? I would really like to start with a small size of matrix, say 8 x 8, or 16 x 16, if possible.
My question really is: how to implement the projection operator, such that by multiplying the operator with an image, I can get the projections, and by multiplying the (conjugate) transpose of the operator with the projections, I can get the original image back.
EDIT:
Particularly, I would like to implement distance-driven projector, in which case beam trajectory (parallel, fan, or etc) would not matter. Very simple example (MATLAB preferred) will be the best for me to start.
You have different examples:
Here there is a Matlab example related to 3d Cone Beam. It can be a good starting point.
Here you also have another operator
Here you have a brief explanation of the Distance-Driven Method. So using the first example and the explanation in this book, you can obtain some ideas.
If not, you can always go to the Distance-Driven operator paper and implement it using the first example.
As far as I'm aware, there are no freely available implementations of the distance-driven projector/backprojector (it is patented). You can, however, code it yourself without too much difficulty.
Start by reading the papers and understanding what the projector is doing. There are only a few key parts that you need:
Projecting pixel boundaries onto an axis.
Projecting detector boundaries onto an axis.
The overlap kernel.
The first two are simple geometry. The overlap kernel is described in good detail (and mostly usable pseudocode) in the papers.
Note that you won't wind up with an actual matrix that does the projection. The system would be too large for all but the tiniest examples. Instead, you should write a function that implements the linear operator corresponding to distance-driven projection.
Although there are already a lot of satisfactory answers, I would like to mention that I have implemented the Distance Driven method for 2D Computed Tomography (CT) and 3D Digital Breast Tomosynthesis (DBT) on MATLAB.
Until now, for 2D CT, these codes are available:
Simple Distance-Driven, base on the original papers [1] and [2],
Branchless Distance-Driven, for acceleration on GPU, based on the papers [3] and [4],
and for 3D DBT:
Simple Distance-Driven, based on the book [5].
Note that:
1 - The code for DBT is strictly for limited angle tomography; however it is straightforward to extend to a full rotation angle.
2 - All the codes are implemented for CPU.
Please, report any issue on the codes so we can keep improving it.
Distance-driven projection is not implemented in stock MATLAB. For forward projection, there is the fanbeam() and radon() command, depending on what geometry you're looking for. I don't consider fanbeam to be very good. It exhibits nonlinear behavior, as of R2013a, see here for details
As for a matching transpose, there is no function for that either for fanbeam or parallel geometry. Note, iradon and ifanbeam are not operator implementations of the matching transpose. However, you might consider using FUNC2MAT. It will let you convert any linear operator from function form to matrix form and then you can transpose freely.
Things are like this:
I have some graphs like the pictures above and I am trying to classify them to different kinds so the shape of a character can be recognized, and here is what I've done:
I apply a 2-D FFT to the graphs, so I can get the spectral analysis of these graphs. And here are some result:
S after 2-D FFT
T after 2-D FFT
I have found that the same letter share the same pattern of magnitude graph after FFT, and I want to use this feature to cluster these letters. But there is a problem: I want the features of interested can be presented in a 2-D plane, i.e in the form of (x,y), but the features here is actually a graph, with about 600*400 element, and I know the only thing I am interested is the shape of the graph(S is a dot in the middle, and T is like a cross). So what can I do to reduce the dimension of the magnitude graph?
I am not sure I am clear about my question here, but thanks in advance.
You can use dimensionality reduction methods such as
k-means clustering
SVM
PCA
MDS
Each of these methods can take 2-dimensional arrays, and work out the best coordinate frame to distinguish / represent etc your letters.
One way to start would be reducing your 240000 dimensional space to a 26-dimensional space using any of these methods.
This would give you an 'amplitude' for each of the possible letters.
But as #jucestain says, a network classifiers are great for letter recognition.