I'm to take an image, convert it into a set of 3 matrices using imread(), then calculate a truncated-sum approximation to each matrix using N=1,2,3,4,8,16,32,64,128 terms. I have the matrices, but I'm not really sure about that last part and the reading is a bit vague. What do they mean by a truncated-sum approximation?
Update based on a given answer:
I tried the following:
A = double(imread("image.jpg"))/255;
[U1, S1, V1] = svd(A(:,:,1));
[U2, S2, V2] = svd(A(:,:,2));
[U3, S3, V3] = svd(A(:,:,3));
N = 128;
trunc_image = (U1(1:763,1:N)*S1(1:N,1:N)*V1(1:N,1:691))*255;
imwrite(trunc_image, "truncimg.jpg", "jpg");
...but the resulting image looks like this:
When you perform svd on an image I:
[U,S,V] = svd(I,'econ'); %//you get matrices U, S, V
S will be a diagonal matrix, with decreasing singular values along the diagonal.
Approximation by truncating... means that I can reconstruct I' by zeroing out singular values in S:
I_recon = U(:,1:N)*S(1:N,1:N)*V(:,1:N).'; %//Reconstruct by keeping the first N singular values in S.
What happens here is that I_recon is an image reconstructed from the N most significant singular values. The purpose of doing this is so that we can remove less significant contributions to the image I, and represent I with less data.
This is an example of reconstructed images with varying N:
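For the original three-channel question, here is a minimal sketch of how such reconstructions could be generated (the filename and the list of N values are taken from the question; the 'econ' flag and the output filenames are my assumptions):
A = double(imread('image.jpg'))/255;   %// assumed filename from the question
for N = [1 2 3 4 8 16 32 64 128]
    recon = zeros(size(A));
    for c = 1:3
        [U,S,V] = svd(A(:,:,c), 'econ');
        %// keep only the first N singular values/vectors of this channel
        recon(:,:,c) = U(:,1:N) * S(1:N,1:N) * V(:,1:N).';
    end
    imwrite(recon, sprintf('trunc_%03d.jpg', N));   %// hypothetical output names
end
Note that the values stay in [0,1] throughout, so imwrite can save the double image directly.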
Suppose I have a large (but possibly sparse) matrix A, which is K-by-K in dimension. I have another K-by-1 vector, b.
Let Ax=b. If I am only interested in the first n rows, where n < K, of x, then one way of dealing with this in MATLAB is to calculate x=A\b and take the first n elements.
If the dimension K is so large that the entire computation is infeasible, is there any other way to get these elements?
I guess one way would be to rearrange the columns of A and rows of x so that the elements you are interested in occur at the end of x. Then you would reduce [A,b] to row echelon form. Finally, to get the components you are after, you take the lower right hand n x n submatrix of the modified A (let's call it An) and you solve the reduced system An * xn = bn, where xn denotes the subvector of x that you are interested in, and bn denotes the last n rows of b after the row echelon reduction.
I mean, the conversion here to echelon form is still expensive, but you don't need to solve for the rest of the components in x, which can save you time.
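As a rough sketch of that idea (using rref purely for illustration; rref fully reduces the system, so this particular version does not actually save any work, it only shows the rearrangement):
K = 100; n = 10;
A = randn(K,K); b = randn(K,1);
perm = [n+1:K, 1:n];          % move the unknowns of interest to the end
R = rref([A(:,perm), b]);     % reduce the permuted augmented system
xn = R(K-n+1:end, end);       % read off the last n unknowns, i.e. x(1:n)
xfull = A\b;
disp(max(abs(xn - xfull(1:n))))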
Just an idea: You could try to use block matrix inversion: if you partition your matrix as A = [A11, A12; A21, A22], where A11 is n x n, you can compute the blocks of its inverse B = inv(A) = [B11, B12; B21, B22] via block matrix inversion. There are different versions of it; you could use the one where the Schur complement is only of size n x n. I'm not quite sure whether it is possible to avoid any inversion that scales with K, but you could look into it.
Your solution is then x(1:n) = [B11, B12]*b. It saves you from ever computing B21, B22. Still, I'm not sure if it is worth it. Depends on the dimensions I guess.
Here is one version, though this still needs the inverse of A22 which is (K-n)x(K-n):
K = 100;
n = 10;
A = randn(K,K);
b = randn(K,1);
% reference version: full inverse
xfull = inv(A)*b;
% blocks of A
A11 = A(1:n,1:n);
A12 = A(1:n,n+1:K);
A21 = A(n+1:K,1:n);
A22 = A(n+1:K,n+1:K);
% blocks of inverse
A22i = inv(A22); % not sure if this can be avoided
B11 = inv(A11 - A12*A22i*A21);
B12 = -B11*A12*A22i;
% solution
x_n = [B11,B12]*b;
disp(x_n - xfull(1:n))
Edit: Of course, this computes the inverse explicitly, and as such it is probably much slower than just solving the linear system. It could be worth it if you had several vectors b to solve for with a fixed A.
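As a small sketch of that reuse (continuing the variables from the code above): once B11 and B12 are computed, every additional right-hand side costs only one matrix product.
Bmat = randn(K, 50);               % 50 right-hand sides for the same A
Xn = [B11, B12] * Bmat;            % n x 50: first n components of every solution
Xref = A \ Bmat;
disp(max(max(abs(Xn - Xref(1:n,:)))))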
I have a randomly defined H matrix of size 600 x 10. Each element in this matrix H can be represented as H(k,t). I obtained a speech spectrogram S which is 600 x 597. I obtained it using Mel features, so it should be 40 x 611 but then I used a frame stacking concept in which I stacked 15 frames together. Therefore it gave me (40x15) x (611-15+1) which is 600 x 597.
Now I want to obtain an output matrix Y, given by the convolution Y(k,t) = ∑ H(k,τ)S(k,t-τ), where the sum goes from τ=0 to τ=Lh-1 and Lh is the kernel length; in this case Lh would be 10 (the number of columns of H).
I don't know how to obtain Y. Also, my doubt is about how to index into both H and S when computing the convolution. Specifically, for Y(1,1), we have:
Y(1,1) = H(1,0)S(1,1) + H(1,1)S(1,0) + H(1,2)S(1,-1) + H(1,3)S(1,-2) + ...
Now, there is no such thing as negative indices in MATLAB - for example, S(1,-1), S(1,-2), and so on. So, what type of convolution should I use to obtain Y? I tried using conv2 and fftfilt, but I think those will not give me Y, because Y must also be the same size as S.
That's very easy. That's a convolution on a 2D signal applied along only one dimension. If we assume that the variable k accesses the rows and t accesses the columns, you can consider each row of H and S as a separate signal, where each row of S is a 1D signal and each row of H is a convolution kernel.
There are two ways you can approach this problem.
Time domain
If you want to stick with the time domain, the easiest thing would be to loop over each row of the output, compute the convolution of each corresponding pair of rows of S and H, and store the result in that output row. From what I can tell, there is no built-in utility that can convolve along only one dimension of an N-D signal... unless you go into frequency domain stuff, but let's leave that for later.
Something like:
Y = zeros(size(S));
for idx = 1 : size(Y,1)
    Y(idx,:) = conv(S(idx,:), H(idx,:), 'same');
end
For each row of the output, we perform a row-wise convolution of a row of S with the corresponding row of H. I use the 'same' flag because the output should be the same size as a row of S, which is the longer of the two signals.
Frequency domain
You can also perform the same computation in frequency domain. If you know anything about the properties of convolution and the Fourier Transform, you know that convolution in time domain is multiplication in the frequency domain. You take the Fourier Transform of both signals, multiply them element-wise, then take the Inverse Fourier Transform back.
However, you need to keep the following intricacies in mind:
Performing a full convolution means that the final length of the output signal is length(A)+length(B)-1, assuming A and B are 1D signals. Therefore, you need to zero-pad both A and B so that they match this common length; the signals must be the same size for the element-wise multiplication to work.
Once you multiply the signals in the frequency domain and take the inverse, you will see that each row of Y is the full length of the convolution. To ensure that you get an output that is the same size as the input, you need to trim off some points at the beginning and at the end. Specifically, since each kernel (a row of H) has length 10, you have to remove the first 5 and the last 4 points of each signal in the output to match what you get in the for loop code.
Usually after the inverse Fourier Transform, there are some residual complex coefficients due to the nature of the FFT algorithm. It's good practice to use real to remove the complex valued parts of the results.
Putting all of this theory together, this is what the code would look like:
%// Define zero-padded H and S matrices
%// Rows are the same, but columns must be padded to match point #1
H2 = zeros(size(H,1), size(H,2)+size(S,2)-1);
S2 = zeros(size(S,1), size(H,2)+size(S,2)-1);
%// Place H and S at the beginning and leave the rest of the columns zero
H2(:,1:size(H,2)) = H;
S2(:,1:size(S,2)) = S;
%// Perform Fourier Transform on each row separately of padded matrices
Hfft = fft(H2, [], 2);
Sfft = fft(S2, [], 2);
%// Perform convolution
Yfft = Hfft .* Sfft;
%// Take inverse Fourier Transform and convert to real
Y2 = real(ifft(Yfft, [], 2));
%// Trim off unnecessary values
Y2 = Y2(:,size(H,2)/2 + 1 : end - size(H,2)/2 + 1);
Y2 should be the convolved result and should match Y in the previous for loop code.
Comparison between them both
If you actually want to compare them, we can. What we'll need to do first is define H and S. To reconstruct what I did, I generated random values with a known seed:
rng(123);
H = rand(600,10);
S = rand(600,597);
Once we run the above code for both the time domain version and frequency domain version, let's see how they match up in the command prompt. Let's show the first 5 rows and 5 columns:
>> format long g;
>> Y(1:5,1:5)
ans =
1.63740867892464 1.94924208172753 2.38365646354643 2.05455605619097 2.21772526557861
2.04478411247085 2.15915645246324 2.13672842742653 2.07661341840867 2.61567534623066
0.987777477630861 1.3969752201781 2.46239452105228 3.07699790208937 3.04588738611503
1.36555260994797 1.48506871890027 1.69896157726456 1.82433906982894 1.62526864072424
1.52085236885395 2.53506897420001 2.36780282057747 2.22335617436888 3.04025523335182
>> Y2(1:5,1:5)
ans =
1.63740867892464 1.94924208172753 2.38365646354643 2.05455605619097 2.21772526557861
2.04478411247085 2.15915645246324 2.13672842742653 2.07661341840867 2.61567534623066
0.987777477630861 1.3969752201781 2.46239452105228 3.07699790208937 3.04588738611503
1.36555260994797 1.48506871890027 1.69896157726456 1.82433906982894 1.62526864072424
1.52085236885395 2.53506897420001 2.36780282057747 2.22335617436888 3.04025523335182
Looks good to me! As another measure, let's figure out what the largest difference is between one value in Y and a corresponding value in Y2:
>> max(abs(Y(:) - Y2(:)))
ans =
5.32907051820075e-15
That's saying that the max error seen between both outputs is on the order of 10^-15. I'd say that's pretty good.
I am brand new to MATLAB but am trying to do some image compression code for grayscale images.
Questions
How can I use SVD to trim off small singular values and reconstruct a compressed image?
Work/Attempts so far
My code so far is:
B=imread('images1.jpeg');
B=rgb2gray(B);
doubleB=double(B);
%read the image and store it as matrix B, convert the image to a grayscale
%photo and convert the matrix to a class 'double' for values 0-255
[U,S,V]=svd(doubleB);
This allows me to successfully decompose the image matrix, with the singular values stored in variable S.
How do I truncate S (which is 167x301, class double)? Let's say of the 167 singular values I want to take only the top 100 (or any N, really); how do I do that and reconstruct the compressed image?
Updated code/thoughts
Instead of putting a bunch of code in the comments section, this is the current draft I have. I have been able to successfully create the compressed image by manually changing N, but I would like to do 2 additional things:
1- Show a panel of images for various compressions (i.e., run a loop for N = 5, 10, 25, etc.)
2- Somehow calculate the difference (error) between each image and the original and graph it.
I am horrible with understanding loops and output, but this is what I have tried:
B=imread('images1.jpeg');
B=rgb2gray(B);
doubleB=im2double(B);
%read the image and store it as matrix B, convert the image to a grayscale
%photo and convert the image to a class 'double'
[U,S,V]=svd(doubleB);
C=S;
for N=[5,10,25,50,100]
    C(N+1:end,:)=0;
    C(:,N+1:end)=0;
    D=U*C*V';
    %Use singular value decomposition on the image doubleB, create a new matrix
    %C (for Compression diagonal) and zero out all entries above N, (which in
    %this case is 100). Then construct a new image, D, by using the new
    %diagonal matrix C.
    imshow(D);
    error=C-D;
end
Obviously there are some errors, because I don't get multiple pictures, nor do I know how to "graph" the error matrix.
Although this question is old, it has helped me a lot to understand SVD. I have modified the code you have written in your question to make it work.
I believe you might have solved the problem; however, just for future reference for anyone visiting this page, I am including the complete code here with the output images and graph.
Below is the code:
close all
clear all
clc
%reading and converting the image
inImage=imread('fruits.jpg');
inImage=rgb2gray(inImage);
inImageD=double(inImage);
% decomposing the image using singular value decomposition
[U,S,V]=svd(inImageD);
% Using different number of singular values (diagonal of S) to compress and
% reconstruct the image
dispEr = [];
numSVals = [];
for N=5:25:300
    % store the singular values in a temporary var
    C = S;
    % discard the diagonal values not required for compression
    C(N+1:end,:)=0;
    C(:,N+1:end)=0;
    % construct an image using the selected singular values
    D=U*C*V';
    % display and compute error
    figure;
    buffer = sprintf('Image output using %d singular values', N);
    imshow(uint8(D));
    title(buffer);
    % sum of squared differences; 'err' avoids shadowing the built-in error()
    err = sum(sum((inImageD-D).^2));
    % store vals for display
    dispEr = [dispEr; err];
    numSVals = [numSVals; N];
end
% display the error graph (title must come after plot, or plot clears it)
figure;
plot(numSVals, dispEr);
grid on
title('Error in compression');
xlabel('Number of Singular Values used');
ylabel('Error between compressed and original image');
Applying this to the following image:
Gives the following result with only the first 5 singular values,
with the first 30 singular values,
and the first 55 singular values,
The change in error with increasing number of singular values can be seen in the graph below.
Here you can notice that the graph shows that using approximately the first 200 singular values yields an error of approximately zero.
Just to start, I assume you're aware that the SVD is really not the best tool to decorrelate the pixels in a single image. But it is good practice.
OK, so we know that B = U*S*V'. And we know S is diagonal, and sorted by magnitude. So by using only the top few values of S, you'll get an approximation of your image. Let's say C=U*S2*V', where S2 is your modified S. The sizes of U and V haven't changed, so the easiest thing to do for now is to zero the elements of S that you don't want to use, and run the reconstruction. (Easiest way to do this: S2=S; S2(N+1:end, :) = 0; S2(:, N+1:end) = 0;).
Now for the compression part. U is full, and so is V, so no matter what happens to S2, your data volume doesn't change. But look at what happens to U*S2. (Plot the image). If you kept N singular values in S2, then only the first N rows of S2 are nonzero, so only the first N columns of U contribute. Compression! Except you still have to deal with V. You can't use the same trick after you've already done (U*S2), since more of U*S2 is nonzero than S2 was by itself. How can we use S2 on both sides? Well, it's diagonal, so use D=sqrt(S2), and now C=U*D*D*V'. So now U*D has only N nonzero columns, and D*V' has only N nonzero rows. Transmit only those quantities, and you can reconstruct C, which is approximately like B.
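Here is a small sketch of that square-root trick (N, U, S, V as above; just an illustration of which pieces you would actually transmit):
S2 = S;
S2(N+1:end,:) = 0;
S2(:,N+1:end) = 0;
D = sqrt(S2);
L = U*D;              % only the first N columns are nonzero
R = D*V';             % only the first N rows are nonzero
Lc = L(:,1:N);        % transmit/store only these two pieces
Rc = R(1:N,:);
C = Lc*Rc;            % reconstruction: identical to U*S2*V'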
For example, here's a 512 x 512 B&W image of Lena:
We compute the SVD of Lena. Choosing the singular values above 1% of the maximum singular value, we are left with just 53 singular values. Reconstructing Lena with these singular values and the corresponding (left and right) singular vectors, we obtain a low-rank approximation of Lena:
Instead of storing 512 * 512 = 262144 values (each taking 8 bits), we can store 2 x (512 x 53) + 53 = 54325 values, which is approximately 20% of the original size. This is one example of how SVD can be used to do lossy image compression.
Here's the MATLAB code:
% open Lena image and convert from uint8 to double
Lena = double(imread('LenaBW.bmp'));
% perform SVD on Lena
[U,S,V] = svd(Lena);
% extract singular values
singvals = diag(S);
% find out where to truncate the U, S, V matrices
indices = find(singvals >= 0.01 * singvals(1));
% reduce SVD matrices
U_red = U(:,indices);
S_red = S(indices,indices);
V_red = V(:,indices);
% construct low-rank approximation of Lena
Lena_red = U_red * S_red * V_red';
% print results to command window
r = num2str(length(indices));
m = num2str(length(singvals));
disp(['Low-rank approximation used ',r,' of ',m,' singular values']);
% save reduced Lena
imwrite(uint8(Lena_red),'Reduced Lena.bmp');
Taking the first n largest singular values and their corresponding singular vectors may solve your problem. For PCA, multiplying the original data by the first (leading) eigenvectors will construct your image as n x d, where d represents the number of eigenvectors kept.
d=50;
im = imread('H:\matlab\bildanalys\terminator.gif');
M2 = double(im);
[U S V] = svd(M2);
U2 = U(:,1:d);
S2 = S(1:d,1:d);
V2 = V(:,1:d);
compressed=U2*S2*V2';
imwrite(compressed,'H:\matlab\bildanalys\compressedterminator.gif','gif')
S2   % display the retained singular values
the compressed image is 3 times bigger...
I perform SVD on the image, throw away the smaller singular values (although they are quite big), then multiply the matrices back together to get the compressed image. The compressed image is black and white and bigger than the original. Where do I fail?
I'm not sure how you end up with 3 times bigger, as conversions between classes scale in powers of 2. The only explanation I can think of is that the original image, im, was probably a uint8 with 3 color channels. This, upon conversion to grayscale and double, becomes 8/3 ≈ 2.67 times as big. However, it doesn't look like you're going from 3 dims to 1, unless you didn't post that part of the code here.
As for using SVDs to reduce storage, if you re-multiply your matrices, you're going back to a full matrix with the exact same number of elements as you started with and hence you'll get the same sized image.
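For example, here is a minimal sketch of what reducing storage actually means (variable names follow the question; the .mat file and the cast back to uint8 are my assumptions): store the truncated factors rather than their product, and only form the product for display.
d = 50;
[U,S,V] = svd(M2);                  % M2 is the double grayscale image
U2 = U(:,1:d);
s2 = diag(S); s2 = s2(1:d);         % keep singular values as a vector, not a matrix
V2 = V(:,1:d);
save('compressed.mat','U2','s2','V2');    % m*d + d + n*d values instead of m*n
recon = U2*diag(s2)*V2';
imwrite(uint8(recon), 'reconstructed.png');   % cast back to 8 bits for the file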
The original problem is A = USV* with dimensions of (m x n) = (m x m) (m x n) (n x n).
The central idea is to use the thin SVD and replace the image with the decomposition products U, S, and V* after discarding the null space components corresponding to the new zero values in the SVD. For example, a deep space probe may have lots of computational power and little transmission bandwidth. The satellite records an image, decomposes, compresses and transmits the decomposition products, NOT the image. The image is recomposed later.
Compression reduces the rank from p = min( m, n ) to rho. As r.m. notes, the dimensions for the compressed image are the same: (m x n) = (m x rho) (rho x rho) (rho x n).
The original rank p may be in the thousands, with rho being as low as 10. The input matrix has mn elements; the compressed data requires rho( m + n + rho) elements.
Example: m = n = 2048, rho = 10. The image requires 4,194,304 elements; the sum of elements in the compressed decomposition is 41,060, less than 1% of the original data volume.
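A quick sketch checking that count with MATLAB's truncated SVD (svds; the random matrix is just a stand-in for an image):
m = 2048; n = 2048; rho = 10;
A = randn(m,n);                      % stand-in for the image
[U,S,V] = svds(A, rho);              % U: m x rho, S: rho x rho, V: n x rho
stored   = numel(U) + numel(S) + numel(V);    % rho*(m + n + rho) = 41060
original = m*n;                               % 4194304
fprintf('%d of %d elements (%.2f%%)\n', stored, original, 100*stored/original);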
The factor of three increase is a MATLAB issue which r.m. has diagnosed accurately.
You're probably saving redundant columns and rows. After you truncate the singular values diagonal (which you're storing as a 1D array, of course), you should also truncate the columns of U and the rows of V^T to match. Otherwise, you are storing a bunch of values that will be multiplied out to 0 during the reconstruction M = U * S * Vt.
For example, using the Apache Commons library for Java (as I'm less familiar with MatLab, sorry):
// compute your SVD from some input matrix
final SingularValueDecomposition svd = new SingularValueDecomposition(matrix);
// 'size' is the total number of singular values (the smaller dimension of the input)
final int size = Math.min(matrix.getRowDimension(), matrix.getColumnDimension());
// only store necessary matrix portions in memory required to compute compressed matrix
final RealMatrix u = svd.getU().getSubMatrix(0, svd.getU().getRowDimension()-1, 0, (size-compression)-1);
// grab singular values
final RealMatrix s = svd.getS().getSubMatrix(0, (size-compression)-1, 0, (size-compression)-1);
// grab V transpose
final RealMatrix vt = svd.getVT().getSubMatrix(0, (size-compression)-1, 0, svd.getVT().getColumnDimension()-1);
// compute compressed matrix (getSubMatrix already returns RealMatrix, so multiply directly)
return u.multiply(s).multiply(vt);
I am working towards comparing multiple images. I have the image data as column vectors of a matrix called "images." I want to assess the similarity of images by first computing their Euclidean distance. I then want to create a matrix over which I can execute multiple random walks. Right now, my code is as follows:
% clear
% clc
% close all
%
% load tea.mat;
images = Input.X;
M = zeros(size(images,2), size(images,2));
for i = 1:size(images,2)
    for j = 1:size(images,2)
        normImageTemp = sqrt((sum((images(:,i) - images(:,j))./256).^2));
        %Need to accurately select the value of gamma_i
        gamma_i = 1/10;
        M(i,j) = exp(-gamma_i.*normImageTemp);
    end
end
My matrix M, however, ends up having a value of 1 along its main diagonal and zeros elsewhere. I'm expecting "large" values for the first few elements of each row and "small" values for elements with column index > 4. Could someone please explain what is wrong? Any advice is appreciated.
Since you're trying to compute a Euclidean distance, it looks like you have an error in where your parentheses are placed when you compute normImageTemp. You have this:
normImageTemp = sqrt((sum((...)./256).^2));
%# ^--- Note that this parenthesis...
But you actually want to do this:
normImageTemp = sqrt(sum(((...)./256).^2));
%# ^--- ...should be here
In other words, you need to perform the element-wise squaring, then the summation, then the square root. What you are doing now is summing elements first, then squaring and taking the square root of the summation, which essentially cancel each other out (or are actually the equivalent of just taking the absolute value).
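A tiny illustration of the difference, with a toy vector:
v = [3; -4];
wrong = sqrt(sum(v).^2)    %# = abs(3 + (-4)) = 1
right = sqrt(sum(v.^2))    %# = 5, the true Euclidean norm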
Incidentally, you can actually use the function NORM to perform this operation for you, like so:
normImageTemp = norm((images(:, i) - images(:, j))./256);
The results you're getting seem reasonable. Recall the behavior of exp(-x): when x is zero, exp(-x) is 1; when x is large, exp(-x) approaches zero.
Perhaps if you make M(i,j) = normImageTemp; you'd see what you expect to see.
Consider this solution:
I = Input.X;
D = squareform( pdist(I') ); %# Euclidean distance between columns of I
M = exp(-(1/10) * D); %# similarity matrix between columns of I
PDIST and SQUAREFORM are functions from the Statistics Toolbox.
Otherwise consider this equivalent vectorized code (using only built-in functions):
%# we know that: ||u-v||^2 = ||u||^2 + ||v||^2 - 2*u.v
X = sum(I.^2,1);
D = real( sqrt(bsxfun(@plus,X,X') - 2*(I'*I)) );
M = exp(-(1/10) * D);
As was explained in the other answers, D is the distance matrix, while exp(-D) is the similarity matrix (which is why you get ones on the diagonal).
There is an already-implemented function, pdist. If you have a matrix A, you can directly do:
Sim = squareform(pdist(A))
(Note that pdist treats each row as an observation, so if your data points are stored as columns, pass A' instead.)