I am doing a project on histogram-based image retrieval, and I need to compare learning algorithms for a set of images. In MATLAB, I converted an image (256x256 pixels) to HSV, quantized it to 8 levels for H, 3 for S and 3 for V, and created a weighted sum, which is a 256x256 matrix.
I want to use this matrix (for all images in the dataset) to create an ARFF file, and I am stuck at this point. Can anyone explain how this should be done?
If I understood correctly, you took a 256x256 RGB image as input and converted it to a 256x256 matrix where each position is a weighted sum of the HSV values.
However, if you want to extract a color histogram (which, in this case, is the appropriate input for Weka), the output should be a vector where each entry counts how many pixels have a given H, S or V value.
Since you have 8 different values for H (0 to 7), 3 for S (0 to 2) and 3 for V (0 to 2), your histogram vector h should have 8+3+3=14 entries. To compute h, use the following algorithm:
Input: quantized HSV image I
Output: histogram h (0-indexed, 14 entries)
for each pixel p in I:
    h[p.H] = h[p.H] + 1           // increment the count for the H component (entries 0-7)
    h[8 + p.S] = h[8 + p.S] + 1   // increment the count for the S component (entries 8-10)
    h[11 + p.V] = h[11 + p.V] + 1 // increment the count for the V component (entries 11-13)
return h
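In MATLAB, the same histogram can be computed without an explicit loop. Below is a minimal sketch, assuming your quantized channels are stored in integer matrices Hq (values 0-7), Sq (0-2) and Vq (0-2) — these variable names are hypothetical — followed by one way of writing such vectors into an ARFF file for Weka (hist_all is an assumed nImages x 14 matrix of histograms, labels an assumed cell array of class names):

% Minimal sketch: build the 14-bin histogram from the quantized channels
h = zeros(1, 14);
h(1:8)   = histc(Hq(:), 0:7);   % counts for the H component
h(9:11)  = histc(Sq(:), 0:2);   % counts for the S component
h(12:14) = histc(Vq(:), 0:2);   % counts for the V component

% One way to write histogram vectors into an ARFF file for Weka
% (hist_all: assumed nImages x 14 matrix; labels: assumed cell array of class names)
fid = fopen('histograms.arff', 'w');
fprintf(fid, '@relation color_histograms\n');
for k = 1:14
    fprintf(fid, '@attribute bin%d numeric\n', k);
end
fprintf(fid, '@attribute class {class1,class2}\n');  % adjust to your class names
fprintf(fid, '@data\n');
for i = 1:size(hist_all, 1)
    fprintf(fid, '%d,', hist_all(i, :));
    fprintf(fid, '%s\n', labels{i});
end
fclose(fid);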
The function to perform an N-dimensional convolution of arrays A and B in MATLAB is shown below:
C = convn(A,B) % returns the N-dimensional convolution of arrays A and B.
I am interested in a 3-D convolution with a Gaussian filter.
If A is a 3 x 5 x 6 matrix, what do the dimensions of B have to be?
The dimensions of B can be anything you want. There is no set restriction in terms of size. For the Gaussian filter, it can be 1D, 2D or 3D. In 1D, what will happen is that each row gets filtered independently. In 2D, what will happen is that each slice gets filtered independently. Finally, in 3D you will be doing what is expected in 3D convolution. I am assuming you would like a full 3D convolution, not just 1D or 2D.
You may be interested in the output size of convn. Referring to the documentation, given two N-dimensional matrices A and B, the size of each dimension k of the output matrix C is:
size(C,k) = max([size(A,k) + size(B,k) - 1, size(A,k), size(B,k)])
The size(A,k) + size(B,k) - 1 term is straight from convolution theory: the full output length in a dimension is the sum of the two sizes in that dimension minus 1. However, should this value be smaller than either size(A,k) or size(B,k), the output size is bounded below by both, so that either input matrix can fit into the final output.
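As a quick sanity check of this formula, using the 3 x 5 x 6 matrix from the question and an arbitrarily chosen 5 x 5 x 5 filter:
A = rand(3, 5, 6);
B = rand(5, 5, 5);
C = convn(A, B);
size(C)   % returns [7 9 10], i.e. [3+5-1, 5+5-1, 6+5-1]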
To make this easier, you can set the size of the filter guided by the standard deviation of the distribution. I would like to refer you to my previous Stack Overflow post: By which measures should I set the size of my Gaussian filter in MATLAB?
This determines what the output size of a Gaussian filter should be given a standard deviation.
In 2D, the dimensions of the filter are N x N, such that N = ceil(6*sigma + 1) with sigma being the desired standard deviation. Therefore, you would allocate a 3D matrix of size N x N x N with N = ceil(6*sigma + 1);.
Therefore, the code you would want to use to create a 3D Gaussian filter would be something like this:
% Example input
A = rand(3, 5, 6);
sigma = 0.5; % Example
% Find size of Gaussian filter
N = ceil(6*sigma + 1);
% Define a grid of centered coordinates (note that -N/2 : N/2 yields N+1 samples per dimension)
[X, Y, Z] = meshgrid(-N/2 : N/2);
% Compute Gaussian filter - note normalization step
B = exp(-(X.^2 + Y.^2 + Z.^2) / (2.0*sigma^2));
B = B / sum(B(:));
% Convolve
C = convn(A, B);
One final note: if any dimension of the filter you provide is larger than the corresponding dimension of the input matrix A, you will still get an output matrix obeying the size constraints for each dimension given above, but the border elements will be zeroed due to the implicit zero-padding.
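If you would rather have an output of the same size as A, note that convn also accepts a shape flag:
C = convn(A, B, 'same');   % central part of the full convolution, same size as A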
I have an n-channel image and a 100x2 matrix of points (in my case n is 20, but perhaps it is clearer to think of this as a 3-channel image). I need to sample the image at each point and get an n x 100 array of these image points.
I know how to do this with a for loop:
for j = 1:100
samples(j,:) = image(points(j,1),points(j,2),:);
end
How would I vectorize this? I have tried
samples = image(points);
but this gives 200 samples of 20 channels. And if I try
samples = image(points,:);
this gives me 200 samples of 4800 channels. Even
samples = image(points(:,1),points(:,2));
gives me 100 x 100 samples of 20 channels (one for each possible combination of row index and column index).
A concise way to do this is to reshape your image from [nRows, nCols, nChannels] to [nRows*nCols, nChannels]. Then you can convert your points array into linear indices (using sub2ind), which correspond to the row indices of the reshaped image. To grab all channels, simply use the colon operator (:) for the second dimension, which now represents the channels.
% Determine the row index that will correspond to each point after we reshape the image
sz = size(image);
% points(:,1) holds row indices and points(:,2) column indices, matching the loop above
inds = sub2ind(sz([1, 2]), points(:,1), points(:,2));
% Do the reshaping (i.e. flatten the first two dimensions)
reshaped_image = reshape(image, [], size(image, 3));
% Grab the pixels (rows) that we care about for all channels
newimage = reshaped_image(inds, :);
size(newimage)
100 20
Now you have the image sampled at the points you wanted for all channels.
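A quick way to convince yourself that the loop and the vectorized version agree (all data below is made up for the test):
img = rand(50, 60, 20);                            % hypothetical 20-channel image
points = [randi(50, 100, 1), randi(60, 100, 1)];   % 100 random (row, col) points

% Loop version
samples = zeros(100, 20);
for j = 1:100
    samples(j, :) = squeeze(img(points(j,1), points(j,2), :));
end

% Vectorized version
inds = sub2ind([50, 60], points(:,1), points(:,2));
reshaped = reshape(img, [], 20);
newimage = reshaped(inds, :);

isequal(samples, newimage)   % returns 1 (logical true)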
I was wondering how I could extract bit planes of an image for image compression in MATLAB?
Getting individual bit planes is very easy in MATLAB. Use the bitget function.
bitget takes in an array / matrix of an integral type (uint8, uint16, etc.) and it returns an array / matrix of the same size that gives you the bit at a specified position.
For example, supposing that your image was A of size M x N and you wanted the least significant bit, you would do this:
B = bitget(A, 1);
B would be an M x N matrix where each location gives you the least significant bit of the corresponding pixel in the image. You would change the second parameter from 1 (the least significant bit) up to the number of bits the type supports (8 for uint8, 16 for uint16, and so on) to select the bit position you want.
If you wanted all bit planes in a single 3D matrix, that can easily be done in the following way assuming an 8-bit unsigned integer grayscale image stored in A:
B = zeros(size(A, 1), size(A, 2), 8, 'uint8');
for idx = 1 : 8
B(:,:,idx) = bitget(A, idx);
end
This will produce a 3D matrix B of 8 slices where the first slice (B(:,:,1)) denotes the LSB at each pixel location up to the last slice (B(:,:,8)) which denotes the MSB at each pixel location.
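Since your goal is compression, it may also be useful to see that the image can be rebuilt exactly from its bit planes; here is a minimal sketch, assuming B comes from the loop above:
% Reconstruct the original image from its bit planes
A_rec = zeros(size(A), 'uint8');
for idx = 1 : 8
    % each plane contributes its bit weighted by 2^(idx-1)
    A_rec = A_rec + B(:,:,idx) * 2^(idx-1);
end
isequal(A_rec, A)   % returns 1 (logical true)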
Read more about bitget on MathWorks' official documentation on the function: http://www.mathworks.com/help/matlab/ref/bitget.html
I have to apply a Prewitt filter to an image in the frequency domain. Here is the procedure I am following:
1) Convert the NxN image matrix to a 2Nx2N matrix by zero-padding
2) Center the image transform by multiplying the image with (-1)^(x+y)
3) Compute the DFT of the image matrix
4) Create the filter of dimensions 2Nx2N with its center at coordinates (N,N)
5) Multiply the image matrix with the filter matrix
6) Calculate the inverse DFT and extract the real part of the result
7) Decentralize the result by multiplying with (-1)^(x+y)
8) Finally extract the upper-left NxN part of the resultant matrix
My code is below:
mask=[-1,0,1;-1,0,1;-1,0,1]; % Prewitt mask (used by size(mask,1) below)
%read image
signal=imread('cman.pgm');
signal=double(signal);
% image has NxN dimensions
l=size(signal,1);
pad_signal=zeros(2*l,2*l);
pad_signal(1:l,1:l)=signal;
m=size(mask,1);
mask_f=zeros(2*l,2*l);
for i=-1:1
mask_f(l+i,l-1)=-1;
mask_f(l+i,l+1)=1;
end
x=1:2*l;
[x y]=meshgrid(x,x);
% Multiply each pixel f(x,y) with (-1)^(x+y)
pad_signal=pad_signal.*((-1).^(x+y));
mask_f=myDFT(mask_f);
%find the DFT of image
signal_dft=myDFT(pad_signal);
%multiply the filter with image
res=mask_f*signal_dft; % (note: this is a matrix product, not elementwise)
% take the real part of the inverse DFT of the result
res=real(myIDFT(res));
res=res.*((-1).^(x+y));
%extract the upper left NxN portion of the result
res=res(1:l,1:l);
imshow(uint8(res));
The method above is from an image processing book. What I am confused about is: should I be using a 3x3 window, since the Prewitt filter is 3x3, or is my current way of using the filter correct (i.e. placing the filter values at the center of a 2Nx2N filter matrix and setting all other entries to 0)?
If neither is right, how should the filter be formed before being multiplied with the DFT of the image?
Your current way of padding the filter to be the same size as the image is basically correct. We often speak loosely about filtering a length M signal with a length 3 filter, but the implicit assumption is that we are padding both to length M, or maybe length M+3-1.
Some details of your approach complicate things:
1) The multiplication by (-1)^(x+y) just translates the DFT and isn't needed. (See Foundations of Signal Processing, Table 3.7, "Circular shift in frequency", for the 1D case. In that notation, you are letting k_0 be N/2, so the W_N term in the left column just toggles between -1 and 1.) A quick numerical check of this follows the list below.
2) Because the Prewitt filter only has a 3x3 non-zero support, your output only needs to be of size N+2 by N+2. The formula to remember here is length(signal) + length(filter) - 1.
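To illustrate point 1 in 1D: multiplying a signal by (-1)^n before the FFT circularly shifts the spectrum by half its length, which for even lengths is exactly what fftshift does:
x = rand(1, 8);
n = 0:7;
X1 = fft(x .* (-1).^n);   % modulate by (-1)^n before the transform
X2 = fftshift(fft(x));    % circularly shift the spectrum by N/2
max(abs(X1 - X2))         % ~1e-15, i.e. equal up to roundoff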
Here's how I would approach this:
clear
x = im2double(imread('cameraman.tif'));
[M, N] = size(x);

% Prewitt kernel (horizontal gradient)
h = [-1 0 1;
     -1 0 1;
     -1 0 1];

% Pad both the image and the filter to the full linear-convolution size
P = M + size(h,1) - 1;
Q = N + size(h,2) - 1;
xPadded = x;
xPadded(P, Q) = 0;   % indexing past the end grows the array with zeros
hPadded = h;
hPadded(P, Q) = 0;

% Circularly shift the kernel so that its center lands on index (1,1),
% which removes the translation the kernel would otherwise introduce
hShifted = circshift(hPadded, [-1 -1]);

% Pointwise multiplication in the frequency domain, then invert and crop
H = fft2(hShifted);
X = fft2(xPadded);
Y = H .* X;
y = ifft2(Y);
yCropped = y(1:M, 1:N);
imshow(yCropped, []);
Here is how I solved my problem. I first removed steps 2 and 7 from the algorithm. I then centered the transform by swapping the first half of the indices with the second half, in both the horizontal and vertical directions, and undid this swap after calculating the inverse DFT of the resultant matrix. I am not sure why my earlier method does not work, but this one does.
1) Convert the NxN image matrix to a 2Nx2N matrix by zero-padding
2) Compute the DFT of the image matrix
3) Center the transform of the image by swapping the first and second halves of the rows and columns
4) Create the filter of dimensions 2Nx2N with its center at coordinates (N,N)
5) Multiply the image matrix with the filter matrix
6) Calculate the inverse DFT and extract the real part of the result
7) Decentralize the result by reapplying step 3 on the resultant matrix
8) Finally extract the upper-left NxN part of the resultant matrix
These are the modified steps that I followed when applying my filter.
Here is my code (edited/new version)
function res=myFreqConv(signal,mask)
signal=double(signal);
l=size(signal,1);
% pad the image matrix with zeros so that its size becomes 2Nx2N
pad_signal=zeros(2*l,2*l);
pad_signal(1:l,1:l)=signal;
m=size(mask,1);
mask_f=zeros(2*l,2*l);
% Create a mask of 2Nx2N dims where the Prewitt filter values are at
% the center of the mask, i.e. at the indices
% [(N-1,N-1), (N-1,N), (N-1,N+1); (N,N-1), (N,N), (N,N+1); (N+1,N-1), (N+1,N), (N+1,N+1)]
for i=-1:1
mask_f(l+i,l-1)=-1;
mask_f(l+i,l+1)=1;
end
% calculate DFT of mask
mask_f=myDFT(mask_f);
signal_dft=myDFT(pad_signal);
% shifting the image transform to center
indices=cell(1,2);
indices{1}=[2*l/2+1:2*l 1:2*l/2];
indices{2}=[2*l/2+1:2*l 1:2*l/2];
signal_dft=signal_dft(indices{:});
%multiply mask with image
res=mask_f.*signal_dft;
res=real(myIDFT(res));
% shifting the image transform back to original
res=res(indices{:});
res=res(1:l,1:l);
end
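As an aside, the manual half-swapping done with indices{:} above is exactly what MATLAB's fftshift does for even-sized matrices (and since the padded size 2N is always even, applying it twice undoes it), as this small check shows:
A = magic(4);                       % any even-sized matrix
swapped = A([3 4 1 2], [3 4 1 2]);  % manual half-swap, as in the code above
isequal(swapped, fftshift(A))       % returns 1 (logical true)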
I am brand new to MATLAB but am trying to do some image compression code for grayscale images.
Questions
How can I use SVD to trim off low-valued eigenvalues to reconstruct a compressed image?
Work/Attempts so far
My code so far is:
B=imread('images1.jpeg');
B=rgb2gray(B);
doubleB=double(B);
%read the image and store it as matrix B, convert the image to a grayscale
%photo and convert the matrix to a class 'double' for values 0-255
[U,S,V]=svd(doubleB);
This allows me to successfully decompose the image matrix, with the singular values stored in the variable S.
How do I truncate S (which is 167x301, class double)? Let's say that of the 167 singular values I want to keep only the top 100 (or any n, really): how do I do that and reconstruct the compressed image?
Updated code/thoughts
Instead of putting a bunch of code in the comments section, this is the current draft I have. I have been able to successfully create the compressed image by manually changing N, but I would like to do 2 additional things:
1- Show a panel of images for various compressions (i.e., run a loop for N = 5, 10, 25, etc.)
2- Somehow calculate the difference (error) between each image and the original and graph it.
I am horrible with understanding loops and output, but this is what I have tried:
B=imread('images1.jpeg');
B=rgb2gray(B);
doubleB=im2double(B);
%read the image and store it as matrix B, convert the image to a grayscale
%photo and convert the image to a class 'double'
[U,S,V]=svd(doubleB);
C=S;
for N=[5,10,25,50,100]
C(N+1:end,:)=0;
C(:,N+1:end)=0;
D=U*C*V';
%Use singular value decomposition on the image doubleB, create a new matrix
%C (for Compression diagonal) and zero out all entries above N, (which in
%this case is 100). Then construct a new image, D, by using the new
%diagonal matrix C.
imshow(D);
error=C-D;
end
Obviously there are some errors, because I don't get multiple pictures and I don't know how to "graph" the error matrix.
Although this question is old, it has helped me a lot to understand SVD. I have modified the code you have written in your question to make it work.
I believe you might have solved the problem already; however, just for future reference for anyone visiting this page, I am including the complete code here, along with the output images and graph.
Below is the code:
close all
clear all
clc
%reading and converting the image
inImage=imread('fruits.jpg');
inImage=rgb2gray(inImage);
inImageD=double(inImage);
% decomposing the image using singular value decomposition
[U,S,V]=svd(inImageD);
% Using different number of singular values (diagonal of S) to compress and
% reconstruct the image
dispEr = [];
numSVals = [];
for N=5:25:300
% store the singular values in a temporary var
C = S;
% discard the diagonal values not required for compression
C(N+1:end,:)=0;
C(:,N+1:end)=0;
% Construct an Image using the selected singular values
D=U*C*V';
% display and compute error
figure;
buffer = sprintf('Image output using %d singular values', N);
imshow(uint8(D));
title(buffer);
error=sum(sum((inImageD-D).^2));
% store vals for display
dispEr = [dispEr; error];
numSVals = [numSVals; N];
end
% display the error graph
figure;
plot(numSVals, dispEr);
grid on
title('Error in compression');
xlabel('Number of Singular Values used');
ylabel('Error between compressed and original image');
Applying this to the following image:
Gives the following result with only first 5 Singular Values,
with first 30 Singular Values,
and the first 55 Singular Values,
The change in error with increasing number of singular values can be seen in the graph below.
Notice that the graph shows that using approximately the first 200 singular values brings the error down to approximately zero.
Just to start, I assume you're aware that the SVD is really not the best tool to decorrelate the pixels in a single image. But it is good practice.
OK, so we know that B = U*S*V'. And we know S is diagonal, and sorted by magnitude. So by using only the top few values of S, you'll get an approximation of your image. Let's say C=U*S2*V', where S2 is your modified S. The sizes of U and V haven't changed, so the easiest thing to do for now is to zero the elements of S that you don't want to use, and run the reconstruction. (Easiest way to do this: S2=S; S2(N+1:end, :) = 0; S2(:, N+1:end) = 0;).
Now for the compression part. U is full, and so is V, so no matter what happens to S2, your data volume doesn't change. But look at what happens to U*S2. (Plot the image.) If you kept N singular values in S2, then only the first N columns of U*S2 are nonzero. Compression! Except you still have to deal with V. You can't use the same trick after you've already formed (U*S2), since more of U*S2 is nonzero than S2 was by itself. How can we use S2 on both sides? Well, it's diagonal, so use D=sqrt(S2), and now C=U*D*D*V'. So now U*D has only N nonzero columns, and D*V' has only N nonzero rows. Transmit only those quantities, and you can reconstruct C, which is approximately like B.
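Here is a minimal sketch of that idea, assuming B is the (double) image matrix and N the number of singular values to keep, both from the discussion above:
[U, S, V] = svd(B);
S2 = S;  S2(N+1:end, :) = 0;  S2(:, N+1:end) = 0;
D = sqrt(S2);
L = U * D;      % only the first N columns are nonzero
R = D * V';     % only the first N rows are nonzero
% Store or transmit only the nonzero parts:
L = L(:, 1:N);
R = R(1:N, :);
C = L * R;      % reconstruction, approximately equal to B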
For example, here's a 512 x 512 B&W image of Lena:
We compute the SVD of Lena. Choosing the singular values above 1% of the maximum singular value, we are left with just 53 singular values. Reconstructing Lena with these singular values and the corresponding (left and right) singular vectors, we obtain a low-rank approximation of Lena:
Instead of storing 512 * 512 = 262144 values (each taking 8 bits), we can store 2 x (512 x 53) + 53 = 54325 values, which is approximately 20% of the original size. This is one example of how SVD can be used to do lossy image compression.
Here's the MATLAB code:
% open Lena image and convert from uint8 to double
Lena = double(imread('LenaBW.bmp'));
% perform SVD on Lena
[U,S,V] = svd(Lena);
% extract singular values
singvals = diag(S);
% find out where to truncate the U, S, V matrices
indices = find(singvals >= 0.01 * singvals(1));
% reduce SVD matrices
U_red = U(:,indices);
S_red = S(indices,indices);
V_red = V(:,indices);
% construct low-rank approximation of Lena
Lena_red = U_red * S_red * V_red';
% print results to command window
r = num2str(length(indices));
m = num2str(length(singvals));
disp(['Low-rank approximation used ',r,' of ',m,' singular values']);
% save reduced Lena
imwrite(uint8(Lena_red),'Reduced Lena.bmp');
Taking the first n largest eigenvalues and their corresponding eigenvectors may solve your problem. For PCA, multiplying the original data by the eigenvectors with the largest eigenvalues will construct your image as an n x d matrix, where d is the number of eigenvectors kept.
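For completeness, here is a rough sketch of that PCA projection and reconstruction; the file name and the number of kept components d are hypothetical:
X = double(rgb2gray(imread('images1.jpeg')));  % n x m grayscale matrix
d = 50;                                        % number of eigenvectors to keep
mu = mean(X, 1);                               % column means
Xc = X - repmat(mu, size(X, 1), 1);            % center the data
[V, E] = eig(cov(Xc));                         % eigendecomposition of the covariance
[~, order] = sort(diag(E), 'descend');         % sort by decreasing eigenvalue
Vd = V(:, order(1:d));                         % keep the top d eigenvectors
proj = Xc * Vd;                                % n x d projection
Xrec = proj * Vd' + repmat(mu, size(X, 1), 1); % approximate reconstruction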