Decomposing an image into two frequency components using DCT? - matlab

I am a beginner in the digital image processing field. I am currently working on a project where I have to decompose an image into two frequency components, namely low and high, using the DCT. I searched a lot on the web and found that MATLAB has a built-in function for the Discrete Cosine Transform, which is used like this:
dct_img = dct2(img);
where img is the input image and dct_img is the resulting DCT of img.
Question
My question is: how can I decompose dct_img into two frequency components, namely the low- and high-frequency components?

As you've mentioned, dct2 and idct2 will do most of the job for you. The question that remains is: what is high-frequency and what is low-frequency content? The coefficients of the two-dimensional transform actually represent two frequencies each (one in the x- and one in the y-direction). The following figure shows the basis for each coefficient in an 8x8 discrete cosine transform:
Therefore, the question of low vs. high can be answered in different ways. A common way, which is also used in JPEG encoding, proceeds diagonally from zero frequency down to the maximum, as shown above. As we can see in the following example, this is mostly motivated by the fact that the content of natural images (in the DCT domain) is largely concentrated in the "top left" corner of "low" frequencies. It is certainly worth looking at the result of dct2 and playing around with the actual choice of your regions for high and low.
In the following I'm dividing the spectrum diagonally and also plotting the DCT coefficients, on a logarithmic scale, because otherwise we would just see one big peak around (1,1). In the example I'm cutting far above half of the coefficients (adjustable with cutoff); we can see that the high-frequency part ("HF") still contains some relevant image information. If you set cutoff to 0 or below, only noise of small amplitude will be left.
%// Load an image
Orig = double(imread('rice.png'));
%// Transform
Orig_T = dct2(Orig);
%// Split between high- and low-frequency in the spectrum (*)
cutoff = round(0.5 * 256);
High_T = fliplr(tril(fliplr(Orig_T), cutoff));
Low_T = Orig_T - High_T;
%// Transform back
High = idct2(High_T);
Low = idct2(Low_T);
%// Plot results
figure, colormap gray
subplot(3,2,1), imagesc(Orig), title('Original'), axis square, colorbar
subplot(3,2,2), imagesc(log(abs(Orig_T))), title('log(DCT(Original))'), axis square, colorbar
subplot(3,2,3), imagesc(log(abs(Low_T))), title('log(DCT(LF))'), axis square, colorbar
subplot(3,2,4), imagesc(log(abs(High_T))), title('log(DCT(HF))'), axis square, colorbar
subplot(3,2,5), imagesc(Low), title('LF'), axis square, colorbar
subplot(3,2,6), imagesc(High), title('HF'), axis square, colorbar
(*) Note on tril: the lower-triangle function operates with respect to the mathematical diagonal from top-left to bottom-right; since I want the other diagonal, I'm flipping left-right before and afterwards.
Also note that these kinds of operations are usually not applied to entire images, but rather to blocks of e.g. 8x8 pixels. Have a look at blockproc and this article.
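For instance, a minimal sketch of such a block-wise split (the 4x4 low-frequency corner per 8x8 block is an arbitrary choice, purely for illustration):
%// Block-wise low/high split with blockproc (sketch; the 4x4 mask is an arbitrary choice)
mask = zeros(8); mask(1:4,1:4) = 1;             %// low-frequency corner of each block
lowFun   = @(bs) idct2(dct2(bs.data) .* mask);  %// DCT -> mask -> inverse DCT per block
Low_blk  = blockproc(Orig, [8 8], lowFun);      %// Orig as loaded above
High_blk = Orig - Low_blk;                      %// complementary high-frequency part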

An easy example:
I2 = dct_img;
I2(9:end,:) = 0;   % keep only the top-left 8x8 block of (low-frequency) coefficients
I2(:,9:end) = 0;
I3 = idct2(I2);
imagesc(I3)
I3 can be seen as the image after a low-pass filter (the low-frequency components), and idct2(dct_img - I2) can be viewed as the high-frequency part.
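A small sketch of that complementary high-frequency part, assuming img was loaded as double (e.g. img = double(imread('rice.png'))):
I_high = idct2(dct_img - I2);   % everything except the retained low-frequency block
figure, imagesc(I_high), colormap gray, title('HF part')
Since the DCT is linear, the two parts add back to the original up to round-off:
max(abs(img(:) - (I3(:) + I_high(:))))   % should be on the order of 1e-12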


Subpixel edge detection for almost vertical edges

I want to detect edges (with sub-pixel accuracy) in images like the one displayed:
The resolution would be around 600 X 1000.
I came across a comment by Mark Ransom here, which mentions edge detection algorithms for vertical edges. I haven't come across any yet. Will such an algorithm be useful in my case (even though the edge isn't strictly a straight line)? It will always be a vertical edge, though. I want it to be accurate to at least 1/100th of a pixel, and I also want access to these sub-pixel coordinate values.
I have tried "Accurate subpixel edge location" by Agustin Trujillo-Pino, but this does not give me a continuous edge.
Are there any other algorithms available? I will be using MATLAB for this.
I have attached another similar image which the algorithm has to work on:
Any inputs will be appreciated.
Thank you.
Edit:
I was wondering if I could do this:
Apply Canny / Sobel in MATLAB and get the edges of this image (note that it won't be a continuous line). Then somehow interpolate these Sobel edges and get the coordinates at subpixel precision. Is that possible?
A simple approach would be to project your image vertically and fit the projected profile with an appropriate function.
Here is a try, with an atan shape:
% Load image
Img = double(imread('bQsu5.png'));
% Project
x = 1:size(Img,2);
y = mean(Img,1);
% Fit
f = fit(x', y', 'a+b*atan((x0-x)/w)', 'Startpoint', [150 50 10 150])
% Display
figure
hold on
plot(x, y);
plot(f);
legend('Projected profile', 'atan fit');
And the result:
I get x_0 = 149.6 pix for your first image.
However, I doubt you will be able to achieve a subpixel accuracy of 1/100th of pixel with those images, for several reasons:
As you can see on the profile, your whites are saturated (grey levels at 255). Because the real atan profile is clipped, the fit is biased. If you have control over the experiments, I suggest you redo them, with a smaller exposure time for instance.
There are not many points on the transition, so there is not much information about where the transition actually is. Typically, your resolution will be the square root of the width of the atan (or whatever shape you prefer). In your case this limits the subpixel resolution to about 1/5th of a pixel, at best.
Finally, your edges are not strictly vertical; they are slightly tilted. If you choose to use this projection method, you should look for a way to correct this tilt before projecting, to increase the accuracy. This won't increase your accuracy by several orders of magnitude, though.
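A rough sketch of one possible tilt correction (assuming, as the fitted profile above suggests, a bright-to-dark transition from left to right): estimate the edge column in each row from the mid-level crossing, fit a line to those columns, and rotate by the resulting angle. The sign of the rotation may need flipping depending on the tilt direction.
% Rough tilt-correction sketch (illustrative only)
Img  = double(imread('bQsu5.png'));
mid  = (max(Img(:)) + min(Img(:))) / 2;          % mid grey level
rows = (1:size(Img,1))';
cols = nan(size(rows));
for r = rows'
    c = find(Img(r,:) < mid, 1, 'first');        % first "dark" column in this row
    if ~isempty(c), cols(r) = c; end
end
valid   = ~isnan(cols);
p       = polyfit(rows(valid), cols(valid), 1);  % edge column as a function of row
tiltDeg = atand(p(1));                           % tilt from vertical, in degrees
ImgCorrected = imrotate(Img, -tiltDeg, 'bilinear', 'crop');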
There is a problem with your image. At the pixel level, it seems like there are four interlaced subimages (odd and even rows and columns). Look at this zoomed area close to the edge.
In order to avoid this artifact, I have just taken the even rows and columns of your image and computed the subpixel edges. Finally, I look for the best-fitting straight line, using the function clsq whose code is on this page:
%load image
url='http://i.stack.imgur.com/bQsu5.png';
image = imread(url);
imageEvenEven = image(1:2:end,1:2:end);
imshow(imageEvenEven, 'InitialMagnification', 'fit');
% subpixel detection
threshold = 25;
edges = subpixelEdges(imageEvenEven, threshold);
visEdges(edges);
% compute fit line
A = [ones(size(edges.x)) edges.x edges.y];
[c n] = clsq(A,2);
y = [1,200];
x = -(n(2)*y+c) / n(1);
hold on;
plot(x,y,'g');
When executing this code, you can see the green line that best approximates all the edge points. The line is given by the equation c + n(1)*x + n(2)*y = 0.
Take into account that this image has been scaled by 1/2 by taking only even rows and columns, so the resulting coordinates must be scaled back accordingly.
Besides, you can try the other three subimages (imageEvenOdd, imageOddEven and imageOddOdd) and combine the four straight lines to obtain the best solution.
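For reference, the remaining three interlaced subimages could be extracted as below (a sketch; subpixelEdges and clsq are the same external functions used above), and the same fit repeated on each before combining the four lines, remembering the factor-of-two coordinate scaling.
% The three remaining interlaced subimages (sketch)
imageEvenOdd = image(1:2:end, 2:2:end);
imageOddEven = image(2:2:end, 1:2:end);
imageOddOdd  = image(2:2:end, 2:2:end);
% Repeat subpixelEdges + clsq on each, then e.g. average the four line parameters.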

abs function for fft2 is not working in MATLAB

I am trying to plot the FFT magnitude of an image using the following code in the command window:
a= imread('lena','png')
figure,imshow(a)
ffta=fft2(a)
fftshift1=fftshift(ffta)
magnitude=abs(fftshift1)
figure,imshow(magnitude),title('magnitude')
However, the figure with the title magnitude shows nothing, even though MATLAB shows that it has computed abs() of the fftshift result. The figure is still empty, and there is no error. Also, why do we need to apply fftshift before taking the magnitude?
The reason why this is probably happening is because of the following things:
When you take the 2D FFT of your image, it produces a double-valued result, even though your image is most likely unsigned 8-bit integer. MATLAB assumes that double-formatted images have their intensities / colours between [0,1]. By calling imshow on the magnitude itself, you will most likely get an entirely white image, because I suspect a good majority of the FFT coefficients are bigger than 1. This is probably the blank figure that you're referring to.
Even if you rescale the magnitude so that it is between [0,1], the DC coefficient will be so large that if you try to display the image, you'll only see a white dot in the middle while every other component will be black.
As a side note, the reason why you are doing fftshift is because by default, MATLAB assumes that the origin of the FFT for 2D is located at the top left corner. Doing fftshift will allow the origin to be in the middle, which is what we would intuitively expect of the 2D FFT.
In order to remedy this situation, I would suggest applying a log transformation to the FFT coefficients so you can visually see the results, and normalizing the log-transformed values so that they fall within [0,1]. Do not actually modify the FFT coefficients themselves; the log transform is only for display. If you intend to do any processing on the spectrum, such as filter design or anything of that sort, you will need the raw, untouched coefficients. As such, the visualization can be done through the following MATLAB code:
imshow(log(1 + magnitude), []);
I'm going to show an example using the code that you have provided, but with another image, as you haven't provided one here. I'm going to use the cameraman.tif image that's part of the MATLAB system path. As such:
a= imread('cameraman.tif');
figure,imshow(a);
ffta=fft2(a);
fftshift1=fftshift(ffta);
magnitude=abs(fftshift1);
figure;
imshow(log(1 + magnitude), []); %// NEW
title('magnitude')
This is what I get:
As you can see, the magnitude is displayed more nicely. Also, the DC coefficient is in the middle of the spectrum thanks to fftshift.
If you want to apply this for colour images, fft2 should still work. It will apply the 2D fft to each colour plane by itself. However, if you want this to work, you'll not only need to take the log transform, but you'll also need to normalize each plane separately. You have to do this because if we tried doing the imshow command we did earlier, it would normalize it so that the greatest value in the spectrum of the colour image gets normalized to 1. This will inevitably produce that same small dot effect that we talked about earlier.
Let's try a colour image that's built-in to MATLAB: onion.png. We will use the same code that you used above, but we need an additional step of normalizing each colour plane by itself. As such:
a = imread('onion.png');
figure,imshow(a);
ffta=fft2(a);
fftshift1=fftshift(ffta);
magnitude=abs(fftshift1);
logMag = log(1 + magnitude); %// New
for c = 1 : size(a,3) %// New - normalize each plane
logMag(:,:,c) = mat2gray(logMag(:,:,c));
end
figure; imshow(logMag); title('magnitude');
Note that I had to loop through each colour plane and use mat2gray to normalize each plane to [0,1]. Also, I had to create a new variable called logMag because I have to modify each colour plane individually, and you can't do this with a single imshow call.
With this, these are the results I get:
What's different with this spectrum is that we are applying the FFT to each colour plane separately, and so you'll see a whole bunch of colour spatters because for each location in this image, we are visualizing a linear combination of components from the red, green and blue channels. For each location, we have a value in between [0,1] for each colour plane, and the combination of these give you a colour at this location. You could say that darker colours are for locations that have a relatively low magnitude for at least one of the colour channels, while locations that are brighter have a relatively high magnitude for at least one of the colour channels.
Hope this helps!
I can't be sure about your version of "lena.png", but if it's a colour RGB picture, you need to convert it to grayscale first, or at least select which RGB plane you want to examine.
I.e., the following works for http://optipng.sourceforge.net/pngtech/img/lena.png (color png):
clear; close all;
a = imread('lena','png');
ag = rgb2gray(a);
ag = im2double(ag);
figure(1);
imshow(ag);
F = fftshift( fft2(ag) ); % also try fft2(ag, N, N) where N < image size. Say N=128.
magnitude=abs(F);
figure(2);
imshow(magnitude);
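If the raw magnitude here again shows up as an essentially blank (white) figure, the same log-scaled display from the first answer can be applied:
imshow(log(1 + magnitude), []);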

Evaluate straightness of an arbitrary contour

I want to get a metric of the straightness of a contour in my binary image (computed reasonably fast). The image looks as follows:
Now, the contours in the red box are the ones which I would preferably like to be removed, since they are not straight. These are the things I have tried; I am currently implementing this in MATLAB.
1. Collect row and column coordinates of each contour and then take the derivative. For straight objects (such as a rectangle), the derivative will be mostly low with a few spikes (at the corners of the rectangle).
Problem: The coordinates collected are not in order, i.e. not in the order in which the contour would be traversed if we imagine it as a path. Therefore, the derivative sometimes gives absurdly high values. Also, the contour is not absolutely straight; it's the output of an edge detection algorithm, so you can imagine that there might be some discontinuity (see the rectangle at the bottom; the human eye can understand that it is a rectangle even though it is not absolutely straight).
2. I thought about polyfit, but again this ordering issue comes up. Since it's a rectangle, I don't know how to apply polyfit to that point set.
Also, I would like to remove contours which are oriented vertically/horizontally. Basically this is a lane detection algorithm, so lanes cannot be absolutely vertical/horizontal.
Any ideas?
You should look into the features of regionprops more. To be fair, I stole the script from this answer, but here it is:
BW = imread('lanes.png');
BW = im2bw(BW);
figure(1),
subplot(1,2,1);
imshow(BW);
cc = bwconncomp(BW);
l = labelmatrix(cc);
a_rp = regionprops(cc,'Area','MajorAxisLength','MinorAxisLength','Orientation','PixelList','Eccentricity');
idx = ([a_rp.Eccentricity] > 0.99 & [a_rp.Area] > 100 & [a_rp.Orientation] < 70 & [a_rp.Orientation] > -90);
BW2 = ismember(l,find(idx));
subplot(1,2,2);
imshow(BW2);
You can mess around with the properties. 'Orientation', 'Eccentricity', and 'Area' are probably the parameters you want to mess with. I also messed with the ratios of the major/minor axis lengths but eccentricity basically does this (eccentricity is a measure of how "circular" an ellipse is). Here's the output:
I actually saw a good video from MATLAB specifically about lane detection using regionprops. I'll try to see if I can find it and link it.
You can segment your image using bwlabel, then work separately on each bwlabel connected object, using find. This should help solve your order problem.
As for a metric, the only thing that comes to mind at the moment is to fit an ellipse to each object and use the a/b (major axis / minor axis) ratio, which is basically the eccentricity, as a parameter. For example, a straight line (even if not perfect) will be fitted by an ellipse with a very large major axis and a very small minor axis, so you could set a ratio threshold of >10, etc. Fitting an ellipse can be done using this FEX submission, for example.
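A minimal sketch of that idea, using the equivalent ellipse that regionprops already computes for each labelled object (the file name and the >10 threshold are illustrative):
BW = im2bw(imread('lanes.png'));            % same image as in the previous answer
L  = bwlabel(BW);                           % label each connected object
rp = regionprops(L, 'MajorAxisLength', 'MinorAxisLength');
ratio = [rp.MajorAxisLength] ./ [rp.MinorAxisLength];   % a/b of the fitted ellipse
BW2 = ismember(L, find(ratio > 10));        % keep only elongated ("straight") objects
imshow(BW2);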

Hough transform in MATLAB without using hough function

I found an implementation of the Hough transform in MATLAB at Rosetta Code, but I'm having trouble understanding it. Also I would like to modify it to show the original image and the reconstructed lines (de-Houghing).
Any help in understanding it and de-Houghing is appreciated. Thanks
Why is the image flipped?
theImage = flipud(theImage);
I can't wrap my head around the norm function. What is its purpose, and can it be avoided?
EDIT: norm here just computes the Euclidean length: sqrt(width^2 + height^2)
rhoLimit = norm([width height]);
Can someone provide an explanation of how/why rho, theta, and houghSpace are calculated?
rho = (-rhoLimit:1:rhoLimit);
theta = (0:thetaSampleFrequency:pi);
numThetas = numel(theta);
houghSpace = zeros(numel(rho),numThetas);
How would I de-Hough the Hough space to recreate the lines?
Calling the function using a 10x10 image of a diagonal line created using the identity (eye) function
theImage = eye(10)
thetaSampleFrequency = 0.1
[rho,theta,houghSpace] = houghTransform(theImage,thetaSampleFrequency)
The actual function
function [rho,theta,houghSpace] = houghTransform(theImage,thetaSampleFrequency)
%Define the hough space
theImage = flipud(theImage);
[width,height] = size(theImage);
rhoLimit = norm([width height]);
rho = (-rhoLimit:1:rhoLimit);
theta = (0:thetaSampleFrequency:pi);
numThetas = numel(theta);
houghSpace = zeros(numel(rho),numThetas);
%Find the "edge" pixels
[xIndicies,yIndicies] = find(theImage);
%Preallocate space for the accumulator array
numEdgePixels = numel(xIndicies);
accumulator = zeros(numEdgePixels,numThetas);
%Preallocate cosine and sine calculations to increase speed. In
%addition to precalculating sine and cosine we are also multiplying
%them by the proper pixel weights such that the rows will be indexed by
%the pixel number and the columns will be indexed by the thetas.
%Example: cosine(3,:) is 2*cosine(0 to pi)
% cosine(:,1) is (0 to width of image)*cosine(0)
cosine = (0:width-1)'*cos(theta); %Matrix Outerproduct
sine = (0:height-1)'*sin(theta); %Matrix Outerproduct
accumulator((1:numEdgePixels),:) = cosine(xIndicies,:) + sine(yIndicies,:);
%Scan over the thetas and bin the rhos
for i = (1:numThetas)
houghSpace(:,i) = hist(accumulator(:,i),rho);
end
pcolor(theta,rho,houghSpace);
shading flat;
title('Hough Transform');
xlabel('Theta (radians)');
ylabel('Rho (pixels)');
colormap('gray');
end
The Hough Transform is a "voting" approach where each image point casts a vote on the existence of a certain line (not a line segment) in an image. The voting is carried out in the parameter space for a line: the polar coordinate representation of normal vectors.
We discretize the parameter space and allow each image point to suggest parameters which would be compatible with a line through the point. Each of your questions can be addressed in terms of how the parameter space is treated in code. Wikipedia has a good article with worked examples that might clarify things (if you are having any conceptual troubles).
For your specific questions:
The image is flipped so that the origin is the bottom-left corner. As far as I can tell, this step is not technically necessary. It does change the outcome somewhat due to discretization issues. The other implementations on Rosetta Code do not flip the image.
rhoLimit holds the maximum radius of an image point in polar coordinates (recall the norm of a vector is its magnitude).
rho and theta are discretizations of the polar coordinate plane according to a sampling rate. houghSpace creates a matrix with an element for each possible combination of the discrete rho/theta values.
The Hough Transform does not specify the lengths of putative lines; the peaks in the voting space just specify the polar coordinates of the normal vector of the line. You can "de-Hough" by selecting the peaks and drawing the corresponding lines, or perhaps by drawing every possible line and using the number of votes as a grayscale weight. It is not possible to re-create the original image from the Hough Transform, just the lines identified by the transform (and your thresholding scheme on the votes).
Following the example from the question produces the following graph. The placement of grid lines and the datatips cursor can be a bit misleading (though the variable values in the datatip are correct). Since this is an image of the parameter space and not the image space, the sampling rate we chose determines the number of bins in each variable. At this sampling rate, the image points are compatible with more than one possible line; in other words, our lines have subpixel resolution, in the sense that they cannot be drawn without overlap in a 10x10 image.
Once we have chosen a peak, such as the one corresponding to the line with normal (rho,theta) = (6.858,0.9), we can draw that line in an image however we choose. Automated peak picking, that is, thresholding to find the highly up-voted lines, is its own problem - you could ask another question about the topic in DSP or about a particular algorithm here.
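A rough sketch of how such a peak could be picked and its line drawn, using the outputs of the function above. Note that in houghTransform the first output of find (the row index minus one) plays the role of x and the column index minus one the role of y, so the exact offsets and flips may need adjusting:
% Sketch: pick the strongest peak and draw the corresponding line rho = x*cos(t) + y*sin(t)
[~, peakIdx]   = max(houghSpace(:));
[iRho, iTheta] = ind2sub(size(houghSpace), peakIdx);
r = rho(iRho);   t = theta(iTheta);
xs = 0:size(theImage,1)-1;                   % the "x" used in the accumulator (rows)
ys = (r - xs*cos(t)) / sin(t);               % assumes sin(t) ~= 0, i.e. not a vertical line
figure, imagesc(flipud(theImage)), colormap gray, hold on
plot(ys + 1, xs + 1, 'r', 'LineWidth', 1);   % back to 1-based pixel coordinates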
For example methods see the code and documentation of MATLAB's houghpeaks and houghlines functions.
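For comparison, a minimal sketch with those built-in functions (Image Processing Toolbox), applied to the same eye(10) test image:
BW = eye(10) > 0;                    % same diagonal-line test image as above
[H, T, R] = hough(BW);               % accumulator, theta (degrees), rho
P = houghpeaks(H, 1);                % strongest peak
lines = houghlines(BW, T, R, P);     % line segments supported by that peak
figure, imshow(BW, 'InitialMagnification', 'fit'), hold on
for k = 1:numel(lines)
    xy = [lines(k).point1; lines(k).point2];
    plot(xy(:,1), xy(:,2), 'g', 'LineWidth', 2);
end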

How to improve image quality in Matlab

I'm building an "Optical Character Recognition" system.
So far the system is capable of identifying licence plates of good quality, without any noise.
What I want at the next level is to be able to identify licence plates of poor quality, degraded for various reasons.
For example, let's look at the following plate:
As you can see, the numbers do not appear clearly, because of light reflections or something else.
My question: how can I improve the image quality, so that when I move to a binary image the numbers will not fade away?
Thanks in advance.
We can try to correct for the lighting effect by fitting a linear plane over the image intensities, which approximates the average level across the image. By subtracting this shading plane from the original image, we can attempt to normalize lighting conditions across the image.
For color RGB images, simply repeat the process on each channel separately, or even apply it in a different colorspace (HSV, L*a*b*, etc...).
Here is a sample implementation:
function img = correctLighting(img, method)
if nargin<2, method='rgb'; end
switch lower(method)
case 'rgb'
%# process R,G,B channels separately
for i=1:size(img,3)
img(:,:,i) = LinearShading( img(:,:,i) );
end
case 'hsv'
%# process intensity component of HSV, then convert back to RGB
HSV = rgb2hsv(img);
HSV(:,:,3) = LinearShading( HSV(:,:,3) );
img = hsv2rgb(HSV);
case 'lab'
%# process luminosity layer of L*a*b*, then convert back to RGB
LAB = applycform(img, makecform('srgb2lab'));
LAB(:,:,1) = LinearShading( LAB(:,:,1) ./ 100 ) * 100;
img = applycform(LAB, makecform('lab2srgb'));
end
end
function I = LinearShading(I)
%# create X-/Y-coordinate values for each pixel
[h,w] = size(I);
[X Y] = meshgrid(1:w,1:h);
%# fit a linear plane over 3D points [X Y Z], Z is the pixel intensities
coeff = [X(:) Y(:) ones(w*h,1)] \ I(:);
%# compute shading plane
shading = coeff(1).*X + coeff(2).*Y + coeff(3);
%# subtract shading from image
I = I - shading;
%# normalize to the entire [0,1] range
I = ( I - min(I(:)) ) ./ range(I(:));
end
Now lets test it on the given image:
img = im2double( imread('http://i.stack.imgur.com/JmHKJ.jpg') );
subplot(411), imshow(img)
subplot(412), imshow( correctLighting(img,'rgb') )
subplot(413), imshow( correctLighting(img,'hsv') )
subplot(414), imshow( correctLighting(img,'lab') )
The difference is subtle, but it might improve the results of further image processing and the OCR task.
EDIT: Here are some results I obtained by applying other contrast-enhancement techniques (IMADJUST, HISTEQ, ADAPTHISTEQ) to the different colorspaces in the same manner as above:
Remember that you have to fine-tune the parameters to fit your image...
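For instance, a minimal sketch applying those three techniques to the V channel of HSV (the same idea works on the L channel of L*a*b* or on each RGB channel):
%# apply imadjust/histeq/adapthisteq to the value channel of HSV (sketch)
img = im2double( imread('http://i.stack.imgur.com/JmHKJ.jpg') );
HSV = rgb2hsv(img);
V   = HSV(:,:,3);
variants = {imadjust(V), histeq(V), adapthisteq(V)};
names    = {'imadjust', 'histeq', 'adapthisteq'};
figure
for k = 1:numel(variants)
    HSV(:,:,3) = variants{k};
    subplot(1,3,k), imshow(hsv2rgb(HSV)), title(names{k});
end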
It looks like your question has been more or less answered already (see d00b's comment); however, here are a few basic image processing tips that might help you here.
First, you could try a simple imadjust. This simply maps the pixel intensities to a "better" range, which often increases the contrast (making it easier to view/read). I have had a lot of success with it in my work. It is easy to use too! I think it's worth a shot.
Also, this looks promising if you simply want a higher resolution image.
Enjoy the "pleasure" of image-processing in MATLAB!
Good luck,
tylerthemiler
P.S. If you are flattening the image to binary though, you are most likely ruining the image to start with, so don't do that if you can avoid it!
As you only want to find digits (of which there are only 10), you can use cross-correlation.
For this you would Fourier transform the picture of the plate. You also Fourier transform a pattern you want to match, e.g. a good representation of a picture of the digit 1. Then you multiply in Fourier space (one transform times the complex conjugate of the other) and inverse Fourier transform the result.
In the final cross-correlation, you will see pronounced peaks, where the pattern overlaps nicely with your image.
You do this 10 times and know where each digit is. Note that you must correct the tilt before you do the cross correlation.
This method has the advantage that you don't have to threshold your image.
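A minimal sketch of that idea; the file names are hypothetical, the images are assumed grayscale, and normxcorr2 is a convenient built-in alternative that also normalizes the result:
%# hypothetical inputs: the (tilt-corrected) plate and a template of the digit '1'
plate    = im2double(imread('plate.png'));        %# assumed grayscale
template = im2double(imread('template_1.png'));   %# assumed grayscale, smaller than plate
%# zero-pad the template to the plate size, then correlate via the FFT
padded = zeros(size(plate));
padded(1:size(template,1), 1:size(template,2)) = template;
C = real(ifft2( fft2(plate) .* conj(fft2(padded)) ));
%# the strongest peak gives the (circular) shift of the best match
[~, idx]   = max(C(:));
[row, col] = ind2sub(size(C), idx);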
There are certainly much more sophisticated algorithms in the literature for reading number plates. One could, for example, use Bayes' theorem to estimate which digit is most likely to occur (this helps a lot if you already have a database of possible numbers).