Phase shifting image content after analyzing with riesz transform - matlab

I am working on a project involving video motion magnification algorithms. Currently I am trying to understand phase based motion magnification using a riesz pyramid. My main source of information is this document:
Riesz Pyramids for Fast Phase-Based Video Magnification
\
I have performed the following steps to attempt to reproduce some of the results in the paper:
Decompose an image into multiple scales using the provided matlab code for the riesz pyramid
Generate the images Riesz1 and Riesz2 by convolving one subband of the pyramid with [-0.5, 0, 0.5] and [-0.5, 0, 0.5]' using the approximate riesz transform introduced in the paper.
Determine the dominant local orientation in every pixel of the subband by calculating atan(R2/R1). This calculation is derived from equation 3 in the paper.
Steer the transform to the dominant local orientation and calculate the resulting quadrature pair
Use the quadrature pair to generate a complex number (I + iQ) whose phase I then used to determine the local phase in the specific pixel.
This is the Matlab code I created:
%Generate a circle image
img = zeros(512, 512);
img(:) = 128;
rad = 180;
for i = size(img, 1)/2 - rad : size(img,1)/2 + rad
for j = size(img, 2)/2 - rad : size(img,2)/2 + rad
deltaX = abs(size(img, 1)/2 - i);
deltaY = abs(size(img, 2)/2 - j);
if (sqrt(deltaX^2+deltaY^2) <= rad)
img(i, j) = 255;
end
end
end
%build riesz pyramid
[pyr, pind] = buildNewPyr(img);
%extract band2 from pyramid (no orientation information yet)
I = pyrBand(pyr,pind,3);
%convolve band2 with approximate riesz filter for first quadrature pair
%element
R1 = conv2(I, [0.5, 0, -0.5], 'same');
%convolve band2 with approximate riesz filter (rotated by 90°) for second
%quadrature pair element
R2 = conv2(I, [0.5, 0, -0.5]', 'same');
% show the resulting image containing orientation information!
% imshow(band2_r2, []);
%To extract the phase, we have to steer the pyramid to its dominant local
%orientation. Orientation is calculated as atan(R2/R1)
theta = atan(R2./R1);
theta(isnan(theta) | isinf(theta)) = 0;
%imshow(theta, []);
% create quadrature pair
Q = zeros(size(theta, 1), size(theta, 2));
for i = 1:size(theta, 1)
for j = 1:size(theta, 1)
if theta(i, j) ~= 0
%create rotation matrix
rot_mat = ...
[cos(theta(i, j)), sin(theta(i, j));...
-sin(theta(i, j)) cos(theta(i, j))];
%steer to dominant local orientation(theta) and set Q
resultPair = rot_mat*[R1(i, j), R2(i,j)]';
Q(i,j) = resultPair(1);
end
end
end
% create amplitude and phase image
A = abs(complex(I, Q));
Phi = angle(complex(I, Q));
The generated images look like this:
Now my questions:
When calculating theta using atan(R2/R1) I get a lot of artifacts in the result (see image "dominant orientation"). Is there something obvious I miss here/do wrong?
Assuming what my results are correct thus far. To magnify motion I need to not only be able to determine the local phase, but also to change it. I seem to miss something obvious, but how would I go about that? Do I need to somehow change the phase of the pyramid subband pixels and then collapse the pyramid? If yes, how?
I am (obviously) quite new to this topic and only have a rudimentary understanding of image processing. I would be very thankful for any answer, be it a solution to my problems or just a referral to an other useful source of information.
Sincerely

I have got a functioning implementation of this algorithm. Here are the steps I took to successfully motion-magnify a video using this method.
These steps should be applied to each channel of a video sequence that you (I have tried it for RGB video, you could probably get away with doing it for just luminance, in a YUV video).
Create an image pyramid of each frame. The original paper has a recommended pyramid structure to allow greater magnification values, although it works fairly well with a Laplacian pyramid.
For each pyramid level of each video channel, calculate the Riesz transform (see The Riesz transform and simultaneous representations of phase, energy and orientation in spatial vision for an overview of the transform, and see the paper in the original question for an efficient approximate implementation).
Using the Riesz transform, calculate the local amplitude, orientation and phase for each pixel of each pyramid level of each video frame. The following Matlab code will calculate the local orientation, phase and amplitude of a (double format) image using the approximate Riesz transform:
function [orientation, phase, amplitude] = riesz(image)
[imHeight, imWidth] = size(image);
%approx riesz, from Riesz Pyramids for Fast Phase-Based Video Magnification
dxkernel = zeros(size(image));
dxkernel(1, 2)=-0.5;
dxkernel(1,imWidth) = 0.5;
dykernel = zeros(size(image));
dykernel(2, 1) = -0.5;
dykernel(imHeight, 1) = 0.5;
R1 = ifft2(fft2(image) .* fft2(dxkernel));
R2 = ifft2(fft2(image) .* fft2(dykernel));
orientation = zeros(imHeight, imWidth);
phase = zeros(imHeight, imWidth);
orientation = (atan2(-R2, R1));
phase = ((unwrap(atan2(sqrt(R1.^2 + R2.^2) , image))));
amplitude = sqrt(image.^2 + R1.^2 + R2.^2);
end
For each pyramid level, temporally filter the phase values of each pixel using a bandpass filter set to a frequency appropriate for the motion that you wish to magnify. Note that this removes the DC component from the phase value.
Calculate the amplified phase value by
amplifiedPhase = phase + (requiredGain * filteredPhase);
Use the amplified phase to calculate new pixel values for each pyramid level by
amplifiedSequence = amplitude .* cos(amplifiedPhase);
Collapse the pyramids to generate the new, amplified video channel.
Recombine your amplified channels into a new video frame.
There are some other steps in the original paper to improve noise performance, but the sequence above produces motion amplified video quite nicely.

I have fully implemented the methodology of Riesz Pyramids for fast phase based video motion magnification. I felt the papers did not clearly describe the appropriate steps required to correctly filter phase. It is important to realise that mulitple mathematically correct expressions for phase and orientation may in fact not be suitable using MATLAB's acos(), asin() and atan() functions. Here is my implementation:
% R1, R2 are Riesz transforms of the image I and Q is the Quadrature pair
Q = sqrt((R1.^2) + (R2.^2));
phase = atan2(Q,I);
The phase should be wrapped to be between -pi and +pi, i.e. if phase is greater than +pi, phase = phase - 2*pi, if phase is smaller than -pi,
phase = phase + 2*pi.
amplitude = sqrt((I.^2) + (R1.^2) + (R2.^2));
Furthermore, it is essential that the change in phase between successive frames is filtered, not the phase directly.
phase_diff = phase(t+1) - phase(t);
This quantity "phase_diff" is temporally filtered, and amplified by multiplication with an amplification factor. The filtered, amplified, variation in phase is then used to phase shift the input.
magnified output = amplitude.*cos(phase_diff_filtered_amplified + original phase).

whilst DrMcCleod's answer didn't directly provide a solution, it seems he was on the right track.
The complex representation of the image can be constructed from the input and the quadrature pair with
complexImg = complex(I, Q);
A phase shifted reconstruction of the image can then be generated by multiplying the complex representation with e^(-i*shift) which removes the complex part of the representation and results in the original image plus the introduced phase shift.
reconstructed = complexImg*exp(-sqrt(-1) * shift);
I'll have to experiment a little, but this seems to produce the expected results.
Thanks for the help!

Related

Implementing a filtered backprojection algorithm using the central slice theorem in Matlab

I'm working on a filtered back projection algorithm using the central slice theorem for a homework assignment and while I understand the theory on paper, I've run into an issue implementing it in Matlab. I was provided with a skeleton to follow to do it but there is a step that I think I'm maybe misunderstanding. Here is what I have:
function img = sampleFBP(sino,angs)
% This step is necessary so that frequency information is preserved: we pad
% the sinogram with zeros so that this is ensured.
sino = padarray(sino, floor(size(sino,1)/2), 'both');
% diagDim should be the length of an individual row of the sinogram - don't
% hardcode this!
diagDim = size(sino, 2);
% The 2DFT (2D Fourier transform) of our image will start as a matrix of
% all zeros.
fimg = zeros(diagDim);
% Design your 1-d ramp filter.
rampFilter_1d = abs(linspace(-1, 1, diagDim))';
rowIndex = 1;
for nn = angs
% Each contribution to the image's 2DFT will also begin as all zero.
imContrib = zeros(diagDim);
% Get the current row of the sinogram - use rowIndex.
curRow = sino(rowIndex,:);
% Take the 1D Fourier transform the current row - be careful, as it's
% necessary to perform ifftshift and fftshift as Matlab tends to
% place zero-frequency components of a spectrum at the edges.
fourierCurRow = fftshift(fft(ifftshift(curRow)));
% Place the Fourier-transformed sino row and place it at the center of
% the next image contribution. Add the ramp filter in Fourier domain.
imContrib(floor(diagDim/2), :) = fourierCurRow;
imContrib = imContrib * fft(rampFilter_1d);
% Rotate the current image contribution to be at the correct angle on
% the 2D Fourier-space image.
imContrib = imrotate(imContrib, nn, 'crop');
% Add the current image contribution to the running representation of
% the image in Fourier space!
fimg = fimg + imContrib;
rowIndex = rowIndex + 1;
end
% Finally, just take the inverse 2D Fourier transform of the image! Don't
% forget - you may need an fftshift or ifftshift here.
rcon = fftshift(ifft2(ifftshift(fimg)));
The sinogram I'm inputting is just the output of the radon function on a Shepp-Logan phantom from 0 to 179 degrees. Running the code as it is now gives me a black image. I think I'm missing something in the loop where I add the FTs of rows to the image. From my understanding of the central slice theorem, what I think should be happening is this:
Initialize an array the same size as the what the 2DFT will be (i.e., diagDim x diagDim). This is the Fourier space.
Take a row of the sinogram which corresponds to the line integral information from a single angle and apply a 1D FT to it
According to the Central Slice Theorem, the FT of this line integral is a line through the Fourier domain that passes through the origin at an angle that corresponds to the angle at which the projection was taken. So to emulate that, I take the FT of that line integral and place it in the center row of the diagDim x diagDim matrix I created
Next I take the FT of the 1D ramp filter I created and multiply it with the FT of the line integral. Multiplication in the Fourier domain is equivalent to a convolution in the spatial domain so this convolves the line integral with the filter.
Now I rotate the entire matrix by the angle the projection was taken at. This should give me a diagDim x diagDim matrix with a single line of information passing through the center at an angle. Matlab increases the size of the matrix when it is rotated but since the sinogram was padded at the beginning, no information is lost and the matrices can still be added
If all of these empty matrices with a single line through the center are added up together, it should give me the complete 2D FT of the image. All that needs to be done is take the inverse 2D FT and the original image should be the result.
If the problem I'm running into is something conceptual, I'd be grateful if someone could point out where I messed up. If instead this is a Matlab thing (I'm still kind of new to Matlab), I'd appreciate learning what it is I missed.
The code that you have posted is a pretty good example of filtered backprojection (FBP) and I believe could be useful to people who wanted to learn the basis of FBP. One can use the function iradon(...) in MATLAB (see here) to perform FBP using a variety of filters. In your case of course, the point is to learn the basis of the central slice theorem and so finding a short cut is not the point. I have also learned a lot and refreshed my knowledge through answering to your question!
Now your code has been perfectly commented and describes the steps that need to be taken. There are a couple of subtle [programming] issues that need to be fixed so that the code works just fine.
First, your image representation in Fourier domain may end up having a missing array due to floor(diagDim/2) depending on the size of the sinogram. I would change this to round(diagDim/2) to have complete dataset in fimg. Be aware that this may lead to an error for certain sinogram sizes if not handled correctly. I would encourage you to visualize fimg to understand what that missing array is and why it matters.
Second issue is that your sinogram needs to be transposed to be consistent with your algorithm. Hence an addition of sino = sino'. Again, I do encourage you to try the code without this to see what happens! Note that zero padding must be happened along the views to avoid artifacts due to aliasing. I will demonstrate an example for this in this answer.
Third and most importantly, imContrib is a temporary holder for an array along fimg. Therefore, it must maintain the same size as fimg, so
imContrib = imContrib * fft(rampFilter_1d);
should be replaced with
imContrib(floor(diagDim/2), :) = imContrib(floor(diagDim/2), :)' .* rampFilter_1d;
Note that the Ramp filter is linear in frequency domain (thanks to #Cris Luengo for correcting this error). Therefore, you should drop the fft in fft(rampFilter_1d) as this filter is applied in the frequency domain (remember fft(x) decomposes the domain of x, such as time, space, etc to its frequency content).
Now a complete example to show how it works using the modified Shepp-Logan phantom:
angs = 0:359; % angles of rotation 0, 1, 2... 359
init_img = phantom('Modified Shepp-Logan', 100); % Initial image 2D [100 x 100]
sino = radon(init_img, angs); % Create a sinogram using radon transform
% Here is your function ....
% This step is necessary so that frequency information is preserved: we pad
% the sinogram with zeros so that this is ensured.
sino = padarray(sino, floor(size(sino,1)/2), 'both');
% Rotate the sinogram 90-degree to be compatible with your codes definition of view and radial positions
% dim 1 -> view
% dim 2 -> Radial position
sino = sino';
% diagDim should be the length of an individual row of the sinogram - don't
% hardcode this!
diagDim = size(sino, 2);
% The 2DFT (2D Fourier transform) of our image will start as a matrix of
% all zeros.
fimg = zeros(diagDim);
% Design your 1-d ramp filter.
rampFilter_1d = abs(linspace(-1, 1, diagDim))';
rowIndex = 1;
for nn = angs
% fprintf('rowIndex = %g => nn = %g\n', rowIndex, nn);
% Each contribution to the image's 2DFT will also begin as all zero.
imContrib = zeros(diagDim);
% Get the current row of the sinogram - use rowIndex.
curRow = sino(rowIndex,:);
% Take the 1D Fourier transform the current row - be careful, as it's
% necessary to perform ifftshift and fftshift as Matlab tends to
% place zero-frequency components of a spectrum at the edges.
fourierCurRow = fftshift(fft(ifftshift(curRow)));
% Place the Fourier-transformed sino row and place it at the center of
% the next image contribution. Add the ramp filter in Fourier domain.
imContrib(round(diagDim/2), :) = fourierCurRow;
imContrib(round(diagDim/2), :) = imContrib(round(diagDim/2), :)' .* rampFilter_1d; % <-- NOT fft(rampFilter_1d)
% Rotate the current image contribution to be at the correct angle on
% the 2D Fourier-space image.
imContrib = imrotate(imContrib, nn, 'crop');
% Add the current image contribution to the running representation of
% the image in Fourier space!
fimg = fimg + imContrib;
rowIndex = rowIndex + 1;
end
% Finally, just take the inverse 2D Fourier transform of the image! Don't
% forget - you may need an fftshift or ifftshift here.
rcon = fftshift(ifft2(ifftshift(fimg)));
Note that your image has complex value. So, I use imshow(abs(rcon),[]) to show the image. A couple of helpful images (food for thought) with the final reconstructed image rcon:
And here is the same image if you comment out the zero padding step (i.e. comment out sino = padarray(sino, floor(size(sino,1)/2), 'both');):
Note the different object size in the reconstructed images with and without zero padding. The object shrinks when the sinogram is zero padded since the radial contents are compressed.

Two-dimensional matched filter

I want to implement two dimensional matched filter for blood vessel extraction according to the paper "Detection of Blood Vessels in Retinal Images Using Two-Dimensional Matched Filters" by Chaudhuri et al., IEEE Trans. on Medical Imaging, 1989 (there's a PDF on the author's web site).
A brief discription is that blood vessel's cross-section has a gaussian distribution and therefore I want to use gaussian matched filter to increase SNR. Such a kernel may be mathematically expressed as:
K(x,y) = -exp(-x^2/2*sigma^2) for |x|<3*sigma, |y|<L/2
L here is the length of vessel with fixed orientation. Experimentally sigma=1.5 and L = 7.
My MATLAB code for this part is:
s = 1.5; %sigma
t = -3*s:3*s;
theta=0:15:165; %different rotations
%one dimensional kernel
x = 1/sqrt(6*s)*exp(-t.^2/(2*s.^2));
L=7;
%two dimensional gaussian kernel
x2 = repmat(x,L,1);
Consider the response of this filter for a pixel belonging to the background retina. Assuming the background to have constant intensity with zero mean additive Gaussian white noise, the expected value of the filter output should ideally be zero. The convolution kernel is, therefore, modified by subtracting the mean value of s(t) from the function itself. The mean value of the kernel is determined as: m = Sum(K(x,y))/(number of points).
Thus, the convolutional mask used in this algorithm is given by: K(x, y) = K(x,y) - m.
My MATLAB code:
m = sum(x2(:))/(size(x2,1)*size(x2,2));
x2 = x2-m;
A vessel may be oriented at any angle 0<theta<180 and the matched filter response is maximum when when it is aligned at theta+- 90 (cross-section distribution is gaussian not the vessel itself).
Thus we need to rotate the matched filter 12 times with 15 degree increment.
My MATLAB code is attached here but I don't get a desirable result. Any help is appreciated.
%apply rotated matched filter on image
r = {};
for k = 1:12
x3=imrotate(x2,theta(k),'crop');%figure;imagesc(x3);colormap gray;
r{k}=conv2(img,x3);
end
w=[];h = zeros(584,565);
for i = 1:565
for j = 1:584
for k = 1:32
w= [w ,r{k}(j,i)];
end
h(j,i)=max(abs(w));
w = [];
end
end
%show result
figure('Name','after matched filter');imagesc(h);colormap gray
For rotation I used imrotate which seems more sensible to me but in the paper it is different: suppose p=[x,y] be a discrete point in the kernel. To compute coefficients in the rotated kernel we have [u,v] = p*Rotation_Matrix.
Rotation_Matrix=[cos(theta),sin(theta);-sin(theta),cos(theta)]
And the kernel is:
K(x,y) = -exp(-u^2/2*s^2)
But the new kernel doesn't have a gaussian shape anymore. Using imrotate preserves gaussian shape. So what is the benefit of using Rotation matrix?
Input image is:
Output:
Matched filtering helps increase SNR but background noise is amplified too.
Am I right to use imrotate to rotate the kernel? My main problem is with rotation matrix that why and what is the right code to implement it.
The reason to build the filter from its analytic expression for each rotation, rather than using imrotate, is that the filter extent is not circular, and therefore rotating brings in "new" pixel values and pushes some other pixels out of the kernel. Furthermore, rotating a kernel constructed as here (smooth transition along one direction, step edge along the other dimension) requires different interpolation methods along each dimension, which imrotate cannot do. The resulting rotated kernel will always be wrong.
Both these issues can be easily seen when displaying the kernel you make together with two rotated versions:
This display brings an additional issues to the front: the kernel is not centered on a pixel, causing it to shift the output by half a pixel.
Note also that, when subtracting the mean, it is important that this mean be computed only over the original domain of the filter, and that any zeros used to pad this domain to a rectangular shape remain zero (these should not become negative).
The rotated kernels can be constructed as follows:
m = max(ceil(3*s),(L-1)/2);
[x,y] = meshgrid(-m:m,-m:m); % non-rotated coordinate system, contains (0,0)
t = pi/6; % angle in radian
u = cos(t)*x - sin(t)*y; % rotated coordinate system
v = sin(t)*x + cos(t)*y; % rotated coordinate system
N = (abs(u) <= 3*s) & (abs(v) <= L/2); % domain
k = exp(-u.^2/(2*s.^2)); % kernel
k = k - mean(k(N));
k(~N) = 0; % set kernel outside of domain to 0
This is the result for the three rotations used in the example above (the grey around the edges of the kernel corresponds to the value 0, the black pixels have a negative value):
Another issue is that you use conv2 with the default 'full' output shape, you should be using 'same' here, so that the output of the filter matches the input.
Note that, instead of computing all filter responses, and computing the max afterwards, it is much easier to compute the max as you compute each filter response. All of the above leads to the following code:
img = im2double(rgb2gray(img));
s = 1.5; %sigma
L = 7;
theta = 0:15:165; %different rotations
out = zeros(size(img));
m = max(ceil(3*s),(L-1)/2);
[x,y] = meshgrid(-m:m,-m:m); % non-rotated coordinate system, contains (0,0)
for t = theta
t = t / 180 * pi; % angle in radian
u = cos(t)*x - sin(t)*y; % rotated coordinate system
v = sin(t)*x + cos(t)*y; % rotated coordinate system
N = (abs(u) <= 3*s) & (abs(v) <= L/2); % domain
k = exp(-u.^2/(2*s.^2)); % kernel
k = k - mean(k(N));
k(~N) = 0; % set kernel outside of domain to 0
res = conv2(img,k,'same');
out = max(out,res);
end
out = out/max(out(:)); % force output to be in [0,1] interval that MATLAB likes
imwrite(out,'so_result.png')
I get the following output:

Fourier transform for fiber alignment

I'm working on an application to determine from an image the degree of alignment of a fiber network. I've read several papers on this issue and they basically do this:
Find the 2D discrete Fourier transform (DFT = F(u,v)) of the image (gray, range 0-255)
Find the Fourier Spectrum (FS = abs(F(u,v))) and the Power Spectrum (PS = FS^2)
Convert spectrum to polar coordinates and divide it into 1º intervals.
Calculate number-averaged line intensities (FI) for each interval (theta), that is, the average of all the intensities (pixels) forming "theta" degrees with respect to the horizontal axis.
Transform FI(theta) to cartesian coordinates
Cxy(theta) = [FI*cos(theta), FI*sin(theta)]
Find eigenvalues (lambda1 and lambda2) of the matrix Cxy'*Cxy
Find alignment index as alpha = 1 - lamda2/lambda1
I've implemented this in MATLAB (code below), but I'm not sure whether it is ok since point 3 and 4 are not really clear for me (I'm getting similar results to those of the papers, but not in all cases). For instance, in point 3, "spectrum" is referring to FS or to PS?. And in point 4, how should this average be done? are all the pixels considered? (even though there are more pixels in the diagonal).
rgb = imread('network.tif');%513x513 pixels
im = rgb2gray(rgb);
im = imrotate(im,-90);%since FFT space is rotated 90º
FT = fft2(im) ;
FS = abs(FT); %Fourier spectrum
PS = FS.^2; % Power spectrum
FS = fftshift(FS);
PS = fftshift(PS);
xoffset = (513-1)/2;
yoffset = (513-1)/2;
% Avoid low frequency points
x1 = 5;
y1 = 0;
% Maximum high frequency pixels
x2 = 255;
y2 = 0;
for theta = 0:pi/180:pi
% Transposed rotation matrix
Rt = [cos(theta) sin(theta);
-sin(theta) cos(theta)];
% Find radial lines necessary for improfile
xy1_rot = Rt * [x1; y1] + [xoffset; yoffset];
xy2_rot = Rt * [x2; y2] + [xoffset; yoffset];
plot([xy1_rot(1) xy2_rot(1)], ...
[xy1_rot(2) xy2_rot(2)], ...
'linestyle','none', ...
'marker','o', ...
'color','k');
prof = improfile(F,[xy1_rot(1) xy2_rot(1)],[xy1_rot(2) xy2_rot(2)]);
i = i + 1;
FI(i) = sum(prof(:))/length(prof);
Cxy(i,:) = [FI(i)*cos(theta), FI(i)*sin(theta)];
end
C = Cxy'*Cxy;
[V,D] = eig(C)
lambda2 = D(1,1);
lambda1 = D(2,2);
alpha = 1 - lambda2/lambda1
Figure: A) original image, B) plot of log(P+1), C) polar plot of FI.
My main concern is that when I choose an artificial image perfectly aligned (attached figure), I get alpha = 0.91, and it should be exactly 1.
Any help will be greatly appreciated.
PD: those black dots in the middle plot are just the points used by improfile.
I believe that there are a couple sources of potential error here that are leading to you not getting a perfect alpha value.
Discrete Fourier Transform
You have discrete imaging data which forces you to take a discrete Fourier transform which inevitably (depending on the resolution of the input data) have some accuracy issues.
Binning vs. Sampling Along a Line
The way that you have done the binning is that you literally drew a line (rotated by a particular angle) and sampled the image along that line using improfile. Using improfile performs interpolation of your data along that line introducing yet another potential source of error. The default is nearest neighbor interpolation which in the example shown below can cause multiple "profiles" to all pick up the same points.
This was with a rotation of 1-degree off-vertical when technically you'd want those peaks to only appear for a perfectly vertical line. It is clear to see how this sort of interpolation of the Fourier spectrum can lead to a spread around the "correct" answer.
Data Undersampling
Similar to Nyquist sampling in the Fourier domain, sampling in the spatial domain has some requirements as well.
Imagine for a second that you wanted to use 45-degree bin widths instead of the 1-degree. Your approach would still sample along a thin line and use that sample to represent 45-degrees worth or data. Clearly, this is a gross under-sampling of the data and you can imagine that the result wouldn't be very accurate.
It becomes more and more of an issue the further you get from the center of the image since the data in this "bin" is really pie wedge shaped and you're approximating it with a line.
A Potential Solution
A different approach to binning would be to determine the polar coordinates (r, theta) for all pixel centers in the image. Then to bin the theta components into 1-degree bins. Then sum all of the values that fall into that bin.
This has several advantages:
It removes the undersampling that we talked about and draws samples from the entire "pie wedge" regardless of the sampling angle.
It ensures that each pixel belongs to one and only one angular bin
I have implemented this alternate approach in the code below with some false horizontal line data and am able to achieve an alpha value of 0.988 which I'd say is pretty good given the discrete nature of the data.
% Draw a bunch of horizontal lines
data = zeros(101);
data([5:5:end],:) = 1;
fourier = fftshift(fft2(data));
FS = abs(fourier);
PS = FS.^2;
center = fliplr(size(FS)) / 2;
[xx,yy] = meshgrid(1:size(FS,2), 1:size(FS, 1));
coords = [xx(:), yy(:)];
% De-mean coordinates to center at the middle of the image
coords = bsxfun(#minus, coords, center);
[theta, R] = cart2pol(coords(:,1), coords(:,2));
% Convert to degrees and round them to the nearest degree
degrees = mod(round(rad2deg(theta)), 360);
degreeRange = 0:359;
% Band pass to ignore high and low frequency components;
lowfreq = 5;
highfreq = size(FS,1)/2;
% Now average everything with the same degrees (sum over PS and average by the number of pixels)
for k = degreeRange
ps_integral(k+1) = mean(PS(degrees == k & R > lowfreq & R < highfreq));
fs_integral(k+1) = mean(FS(degrees == k & R > lowfreq & R < highfreq));
end
thetas = deg2rad(degreeRange);
Cxy = [ps_integral.*cos(thetas);
ps_integral.*sin(thetas)]';
C = Cxy' * Cxy;
[V,D] = eig(C);
lambda2 = D(1,1);
lambda1 = D(2,2);
alpha = 1 - lambda2/lambda1;

Matlab Template Matching Using FFT

I am struggling with template matching in the Fourier domain in Matlab. Here are my images (the artist is RamalamaCreatures on DeviantArt):
My aim is to place a bounding box around the ear of the possum, like this example (where I performed template matching using normxcorr2):
Here is the Matlab code I am using:
clear all; close all;
template = rgb2gray(imread('possum_ear.jpg'));
background = rgb2gray(imread('possum.jpg'));
%% calculate padding
bx = size(background, 2);
by = size(background, 1);
tx = size(template, 2); % used for bbox placement
ty = size(template, 1);
%% fft
c = real(ifft2(fft2(background) .* fft2(template, by, bx)));
%% find peak correlation
[max_c, imax] = max(abs(c(:)));
[ypeak, xpeak] = find(c == max(c(:)));
figure; surf(c), shading flat; % plot correlation
%% display best match
hFig = figure;
hAx = axes;
position = [xpeak(1)-tx, ypeak(1)-ty, tx, ty];
imshow(background, 'Parent', hAx);
imrect(hAx, position);
The code is not functioning as intended - it is not identifying the correct region. This is the failed result - the wrong area is boxed:
This is the surface plot of the correlations for the failed match:
Hope you can help! Thanks.
What you're doing in your code is actually not correlation at all. You are using the template and performing convolution with the input image. If you recall from the Fourier Transform, the multiplication of the spectra of two signals is equivalent to the convolution of the two signals in time/spatial domain.
Basically, what you are doing is that you are using the template as a kernel and using that to filter the image. You are then finding the maximum response of this output and that's what is deemed to be where the template is. Where the response is being boxed makes sense because that region is entirely white, and using the template as the kernel with a region that is entirely white will give you a very large response, which is why it most likely identified that area to be the maximum response. Specifically, the region will have a lot of high values (~255 or so), and naturally performing convolution with the template patch and this region will give you a very large output due to the operation being a weighted sum. As such, if you used the template in a dark area of the image, the output would be small - which is false because the template is also consisting of dark pixels.
However, you can certainly use the Fourier Transform to locate where the template is, but I would recommend you use Phase Correlation instead. Basically, instead of computing the multiplication of the two spectra, you compute the cross power spectrum instead. The cross power spectrum R between two signals in the frequency domain is defined as:
Source: Wikipedia
Ga and Gb are the original image and the template in frequency domain, and the * is the conjugate. The o is what is known as the Hadamard product or element-wise product. I'd also like to point out that the division of the numerator and denominator of this fraction is also element-wise. Using the cross power spectrum, if you find the (x,y) location here that produces the absolute maximum response, this is where the template should be located in the background image.
As such, you simply need to change the line of code that computes the "correlation" so that it computes the cross power spectrum instead. However, I'd like to point out something very important. When you perform normxcorr2, the correlation starts right at the top-left corner of the image. The template matching starts at this location and it gets compared with a window that is the size of the template where the top-left corner is the origin. When finding the location of the template match, the location is with respect to the top-left corner of the matched window. Once you compute normxcorr2, you traditionally add the half of the rows and half of the columns of the maximum response to find the centre location.
Because we are more or less doing the same operations for template matching (sliding windows, correlation, etc.) with the FFT / frequency domain, when you finish finding the peak in this correlation array, you must also take this into account. However, your call to imrect to draw a rectangle around where the template matches takes in the top left corner of a bounding box anyway, so there's no need to do the offset here. As such, we're going to modify that code slightly but keep the offset logic in mind when using this code for later if want to find the centre location of the match.
I've modified your code as well to read in the images directly from StackOverflow so that it's reproducible:
clear all; close all;
template = rgb2gray(imread('http://i.stack.imgur.com/6bTzT.jpg'));
background = rgb2gray(imread('http://i.stack.imgur.com/FXEy7.jpg'));
%% calculate padding
bx = size(background, 2);
by = size(background, 1);
tx = size(template, 2); % used for bbox placement
ty = size(template, 1);
%% fft
%c = real(ifft2(fft2(background) .* fft2(template, by, bx)));
%// Change - Compute the cross power spectrum
Ga = fft2(background);
Gb = fft2(template, by, bx);
c = real(ifft2((Ga.*conj(Gb))./abs(Ga.*conj(Gb))));
%% find peak correlation
[max_c, imax] = max(abs(c(:)));
[ypeak, xpeak] = find(c == max(c(:)));
figure; surf(c), shading flat; % plot correlation
%% display best match
hFig = figure;
hAx = axes;
%// New - no need to offset the coordinates anymore
%// xpeak and ypeak are already the top left corner of the matched window
position = [xpeak(1), ypeak(1), tx, ty];
imshow(background, 'Parent', hAx);
imrect(hAx, position);
With that, I get the following image:
I also get the following when showing a surface plot of the cross power spectrum:
There is a clear defined peak where the rest of the output has a very small response. That's actually a property of Phase Correlation and so obviously, the location of the maximum value is clearly defined and this is where the template is located.
Hope this helps!
Just ended up implementing the same with python with similar ideas as #rayryeng's using scipy.fftpack.fftn() / ifftn() functions with the following result on the same target and template images:
import numpy as np
import scipy.fftpack as fp
from skimage.io import imread
from skimage.color import rgb2gray, gray2rgb
import matplotlib.pylab as plt
from skimage.draw import rectangle_perimeter
im = 255*rgb2gray(imread('http://i.stack.imgur.com/FXEy7.jpg')) # target
im_tm = 255*rgb2gray(imread('http://i.stack.imgur.com/6bTzT.jpg')) # template
# FFT
F = fp.fftn(im)
F_tm = fp.fftn(im_tm, shape=im.shape)
# compute the best match location
F_cc = F * np.conj(F_tm)
c = (fp.ifftn(F_cc/np.abs(F_cc))).real
i, j = np.unravel_index(c.argmax(), c.shape)
print(i, j)
# 214 317
# draw rectangle around the best match location
im2 = (gray2rgb(im)).astype(np.uint8)
rr, cc = rectangle_perimeter((i,j), end=(i + im_tm.shape[0], j + im_tm.shape[1]), shape=im.shape)
for x in range(-2,2):
for y in range(-2,2):
im2[rr + x, cc + y] = (255,0,0)
# show the output image
plt.figure(figsize=(10,10))
plt.imshow(im2)
plt.axis('off')
plt.show()
Also, the below animation shows the result obtained while locating a bird's template image inside a set of (target) frames extracted from a video with a flock of birds.
One thing to note: the output is very much dependent on the similarity of the size and shape of the object that is to be matched with the template, if it's quite different from that of the template image, the template may not be matched at all.

Determine the right parameters for deblurring motion blurred images in MATLAB

I'm currently working on program and I need to automatic perform motion un-blurring on a picture. Currently, I make a for loop for LEN and THETA, guessing from LEN 0:50 and THETA from 1:180. There are plenty of motion un-blurred pictures produce in this way - some correct and some are wrong. Now here is my problem: How do I actually determine which set of parameters yields the one most closest to original photo?
I'm thinking of using pixel comparison. Any idea on this?
Here's a pictorial example of what I generated:
http://dl.dropboxusercontent.com/u/81112742/Capture.JPG
If you have access to the original clean image, I would compute the Peak Signal to Noise Ratio (PSNR) for all of the images you have generated, then choose the one with the highest PSNR. Amro posted a very nice post on how to compute this for images and can be found here: https://stackoverflow.com/a/16265510/3250829
However, for self-containment, I'll post the code to do it here. Supposing that your original image is stored in the variable I, and supposing your reconstructed (un-blurred) image is stored in the variable K. Therefore, to calculate the PSNR, you would need to calculate the Mean Squared Error first, then use it to calculate the PSNR. In other words:
mse = mean(mean((im2double(I) - im2double(K)).^2, 1), 2);
psnr = 10 * log10(1 ./ mean(mse,3));
The equations for MSE and PSNR are:
Source: Wikipedia
As such, to use this in your code, your for loops should look something like this:
psnr_max = -realmax;
for LEN = 0 : 50
for THETA = 1 : 180
%// Unblur the image
%//...
%//...
%// Compute PSNR
mse = mean(mean((im2double(I) - im2double(K)).^2, 1), 2);
psnr = 10 * log10(1 ./ mean(mse,3));
if (psnr > psnr_max) %// Get largest PSNR and get the
LEN_final = LEN; %// parameters that made this so
THETA_final = THETA;
psnr_max = psnr;
end
end
end
This loop will go through each pair of LEN and THETA, and LEN_final, THETA_final will be those parameters that gave you the best reconstruction (un-blurring) of the image.