I've been researching image compression with SVD for school. However, I do not see how using SVD and truncating the number of singular values reduces memory use. The original image is m x n in size, so it uses m x n x pixel-size bytes.
After SVD the resultant matrix is still m x n. Would it not then use the same amount of space?
That's because the rank-k approximation of the image requires you to store (think about saving the image to a file) only the first k left singular vectors, the first k singular values, and the first k right singular vectors. That is k(m + n + 1) numbers instead of m x n, which is a saving whenever k is well below mn/(m + n + 1). When you want to render the image on screen you do, of course, decompress it back to the full m x n size (as with any other kind of compression), but that is only the rendered form, not the stored size of the image.
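To make the storage count concrete, here is a minimal sketch. The sizes m, n, k and the random stand-in image are assumptions chosen for illustration, not values from the question.

```matlab
% Sketch: what gets stored for a rank-k approximation of an m-by-n image.
m = 120; n = 80; k = 10;        % illustrative sizes (assumptions)
A = rand(m, n);                 % random stand-in for a grayscale image

[U, S, V] = svd(A, 'econ');
Uk = U(:, 1:k);                 % m-by-k left singular vectors
sk = diag(S(1:k, 1:k));         % k singular values
Vk = V(:, 1:k);                 % n-by-k right singular vectors

% Only Uk, sk and Vk need to be saved to disk.
storedValues   = numel(Uk) + numel(sk) + numel(Vk);   % k*(m + n + 1)
originalValues = numel(A);                             % m*n

% The full m-by-n matrix is rebuilt only when the image is displayed.
Ak = Uk * diag(sk) * Vk';
```

With these sizes the compressed form holds 2010 numbers against 9600 for the raw matrix; the rebuilt Ak is m x n again, but it never has to be written to the file.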
I have a matrix M of size K x N, where K is 49152 and is the dimension of the problem, and N is 52 and is the number of observations.
I have tried to use [U,S,V]=svd(M), but doing this I run out of memory.
I found another code which uses [U,S,V]=svd(cov(M)) and it works well. My questions are: what is the meaning of using the cov(M) command inside svd, and what is the meaning of the resulting [U,S,V]?
Finding the SVD of the covariance matrix is a method to perform Principal Components Analysis, or PCA for short. I won't get into the mathematical details here, but PCA performs what is known as dimensionality reduction. If you would like a more formal treatment of the subject, you can read my post about it here: What does selecting the largest eigenvalues and eigenvectors in the covariance matrix mean in data analysis?. Simply put, dimensionality reduction projects the data stored in the matrix M onto a lower-dimensional surface with the least amount of projection error.
In this matrix, we are assuming that each column is a feature or a dimension and each row is a data point. I suspect the reason you are using more memory by applying the SVD to the data matrix M itself rather than to the covariance matrix is that you have a large number of data points with a small number of features. The covariance matrix holds the covariance between pairs of features. If M is an m x n matrix, where m is the total number of data points and n is the total number of features, cov(M) gives you an n x n matrix, so you are applying the SVD to a much smaller matrix than M.
As for the meaning of U, S and V, for dimensionality reduction specifically, the columns of V are what are known as the principal components. V is ordered so that the first column is the axis of your data that describes the greatest amount of variability possible. As you go from the second column up to the nth column, you introduce more axes and the variability each one adds decreases. When you reach the nth column, you are describing your data in its entirety without reducing any dimensions. The diagonal values of S denote what is called the variance explained, which respects the same ordering as V. As you progress through the singular values, they tell you how much of the variability in your data is described by each corresponding principal component.
To perform the dimensionality reduction, you can either take U and multiply by S, or take your mean-subtracted data and multiply by V. In other words, supposing X is the matrix M where each column has its mean computed and then subtracted from that column, and U, S and V here come from the SVD of X itself rather than of cov(M), the following relationship holds:
US = XV
To actually perform the final dimensionality reduction, you take either US or XV and retain the first k columns where k is the total amount of dimensions you want to retain. The value of k depends on your application, but many people choose k to be the total number of principal components that explains a certain percentage of your variability in your data.
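A minimal sketch of these steps follows, assuming U, S and V are taken from the SVD of the mean-subtracted data X (the setting in which US = XV holds exactly). The synthetic data and the choice k = 2 are assumptions for illustration only.

```matlab
% Sketch: PCA scores two ways, then keep the first k columns.
M = randn(200, 5) * randn(5, 5);      % synthetic: 200 data points (rows), 5 features (columns)
X = bsxfun(@minus, M, mean(M, 1));    % subtract each column's mean

[U, S, V] = svd(X, 'econ');

scoresUS = U * S;                     % projection via U and S
scoresXV = X * V;                     % projection via the data and V (same result)

% Variance along each principal component, in decreasing order.
varExplained = diag(S).^2 / (size(X, 1) - 1);
fraction = cumsum(varExplained) / sum(varExplained);   % cumulative share of variance

k = 2;                                % number of dimensions to retain (assumed)
reduced = scoresXV(:, 1:k);           % 200-by-k reduced representation
```

One common way to pick k is to take the smallest value for which fraction(k) exceeds a chosen threshold such as 0.95; the threshold itself is application dependent.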
For more information about the link between SVD and PCA, please see this post on Cross Validated: https://stats.stackexchange.com/q/134282/86678
Instead of [U, S, V] = svd(M), which tries to build a matrix U that is 49152 by 49152 (= 18 GB 😱!), do svd(M, 'econ'). That returns the “economy-class” SVD, where U will be 49152 by 52, S is 52 by 52, and V is also 52 by 52.
cov(M) will remove each dimension’s mean and evaluate the inner product, giving you a 52 by 52 covariance matrix. You can implement your own version of cov, called mycov, as
function [C] = mycov(M)
M = bsxfun(@minus, M, mean(M, 1)); % subtract each dimension’s mean over all observations
C = M' * M / (size(M, 1) - 1); % normalize by N - 1, the default normalization of cov
(You can verify this works by looking at mycov(randn(49152, 52)), which should be close to eye(52), since each element of that array is IID-Gaussian.)
There are a lot of magical linear-algebraic properties and relationships between the SVD and EVD (i.e., singular value vs eigenvalue decompositions): because the covariance matrix cov(M) is Hermitian and positive semi-definite, its left- and right-singular vectors are the same, and are in fact also cov(M)’s eigenvectors. Furthermore, cov(M)’s singular values are also its eigenvalues: so svd(cov(M)) is just an expensive way to get eig(cov(M)) 😂, up to ±1 and reordering.
As @rayryeng explains at length, usually people look at svd(M, 'econ') because they want eig(cov(M)) without needing to evaluate cov(M). You generally want to avoid computing cov(M): forming it explicitly squares the condition number of the data, which makes it numerically less stable. I recently wrote an answer that showed, in Python, how to compute eig(cov(M)) using svd(M2, 'econ'), where M2 is the 0-mean version of M, used in the practical application of color-to-grayscale mapping, which might help you get more context.
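The relationship can be checked numerically. The sketch below uses a smaller row count than 49152 so that it runs quickly (an assumption), and the names M2, evFromSvd and evFromEig are illustrative.

```matlab
% Sketch: eigenvalues of cov(M) recovered from the economy SVD of the zero-mean data.
K = 2000; N = 52;                       % K reduced from 49152 for a quick check (assumption)
M  = randn(K, N);                       % rows are treated as observations, as cov does
M2 = bsxfun(@minus, M, mean(M, 1));     % zero-mean version of M

[U, S, V] = svd(M2, 'econ');            % U is K-by-N, S and V are N-by-N

% Squared singular values, scaled by N-1 normalization, are the eigenvalues of cov(M).
evFromSvd = diag(S).^2 / (K - 1);

C = cov(M);
C = (C + C') / 2;                       % enforce exact symmetry before eig
evFromEig = sort(eig(C), 'descend');    % eig returns ascending order; svd is descending
```

The two vectors agree to rounding error, and the columns of V match the eigenvectors of cov(M) up to sign and ordering.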
The title of this post may be a bit confusing. Please allow me to provide a bit of context and then elaborate on what I'm asking. For your reference, the question I'm asking is toward the end and is denoted by bold letters. I provide some code, outlining where I'm currently at in solving the problem, immediately beforehand.
Essentially what I'm trying to do is Kernel Regression, which is usually done using a single test point x and a set of training instances. A reference to this can be found on Wikipedia here. The kernel I'm using is the RBF kernel, a Wikipedia reference for which can be found here.
Anyway, I have some code written in Matlab so that this can be done quickly for a single instance of x, which is 1 x p in size. What I'd like to do is make it so I can estimate for numerous points very quickly, say m x p.
For the sake of avoiding notational mixups, I'll let the training instances be denoted Train (n x p) and the instances I want estimates for be denoted Test (m x p). It also needs to be mentioned that I want to estimate a vector of numbers for each of the m points. For a single point this vector would be 1 x v in size. Now I need it to be m x v. Therefore, Train will also have a matrix of these known values associated with it, called TS (n x v). Lastly, we need a vector of sigmas that is 1 x v in size. This is denoted as Sig.
Here's the code I have so far:
%First, we have to get the matrices to equivalent size so we can subtract Train from Test
tm0 = kron(Test,ones(size(Train,1),1)) - kron(ones(size(Test,1),1),Train); % block i (n rows) holds Test(i,:) minus every row of Train
%Secondly, we take the squared Euclidean norm of each row and scale it by 1/(2*Sig(j)^2) for each element j of Sig
tm3 = exp(-kron(sum((tm0).^2,2),1/2./(Sig.^2)));
Now, at this point tm3 is an (m*n) x v matrix. This is where my question is: I now need to multiply TS' (TS transpose) times each of the n x v-sized segments in tm3 (there are m of these segments), get the diagonal elements of each of these resulting segments (after multiplication one of the m segments will be v x v, so each chunk of diagonal elements will be 1 x v meaning the resulting matrix is m x v) and sum these diagonal elements together to produce an m x 1 sized matrix. Lastly, I will need to divide each entry i in this m x 1 matrix by each of the v elements in the ith row of the diagonal-holding m x v-sized matrix, producing an m x v-sized result matrix.
I hope all of that makes sense. I'm sure there's some kind of trick that can be employed, but I'm just not coming up with it. Any help is greatly appreciated.
Edit 1: I was asked to provide more of an example to help demonstrate what it is that I would like done. The following represent the two matrices I'm talking about, TS and tm3:
As you can see, TS'(TS transpose) is v x n and tm3 is mn x v. In tm3 there are blocks that are of size n x v -- there are m blocks of this size. Notice that the size of TS' is of size v x n. This means that I can multiply TS' by a single block of tm3, which again is of size n x v. This would result in a matrix that is v x v in size. I would like to do this operation -- individually multiplying TS' by each of the n x v-sized blocks of tm3, which would produce m v x v matrices.
From here, though, I would like to obtain the diagonal elements from each of these v x v matrices. So, for a single v x v matrix, denoted using a:
Ultimately, I would like to do this for each of the m v x v matrices, giving me something that looks like the following, where s is the mth v x v matrix:
If I denote this last matrix as Q, which is m x v in size, it is trivial to sum the elements across the rows to produce the m x 1 vector I was looking for. I will refer to this vector as C. However, I would then like to divide each of these m scalar values by the corresponding row of matrix Q, to produce another m x v matrix:
This is the final matrix I'm looking for. Hopefully this helps make it clear what I'm looking for. Thanks for taking the time to read this!
Thought: I'm pretty sure I could accomplish this by converting tm3 to a cell matrix by doing tc1 = mat2cell(tm3,repmat(length(Train),1,m),length(Sig)), and then replicating TS m times in another cell matrix tc2 = mat2cell(TS',length(indirectSigma),repmat(length(Train),1,m))'. Finally, I could do operations like tc3 = cellfun(@(a,b) a*b, tc2,tc1,'UniformOutput',false), which would give me m cells filled with the v x v matrices I was looking for. I could proceed from there. However, I'm not sure how fast these cell operations are. Can anybody comment? I'm afraid they might be slow, so I would prefer operations be performed on normal matrices, which I know to be fast. Thanks!
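One possible way to do this without cells is sketched below, under the assumption that tm3 is laid out as m consecutive blocks of n rows each, as described in the question. It relies on the fact that the kth diagonal element of TS' * block is just the sum over rows of TS(:,k) .* block(:,k), so the v x v products never need to be formed. The sizes, the random stand-in data, and the names T3, W, Q, C and R are illustrative.

```matlab
% Sketch: diagonals of TS' * block_i for all m blocks at once, without cells.
m = 4; n = 6; v = 3;              % small hypothetical sizes
TS  = rand(n, v);                 % stand-in for the known training values
tm3 = rand(m * n, v);             % stand-in for the kernel weights (m blocks of n rows)

T3 = reshape(tm3, n, m, v);                       % T3(j,i,k) = tm3((i-1)*n + j, k)
W  = bsxfun(@times, T3, reshape(TS, n, 1, v));    % TS(j,k) * T3(j,i,k)
Q  = reshape(sum(W, 1), m, v);                    % Q(i,k) = k-th diagonal entry of TS' * block_i

C = sum(Q, 2);                    % m-by-1 row sums
R = bsxfun(@rdivide, C, Q);       % R(i,k) = C(i) / Q(i,k), the m-by-v result described above
```

Everything here is a plain array operation, so it avoids mat2cell and cellfun entirely; whether it is faster in practice depends on the sizes involved.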
Say there is an (m x n x p) matrix, e.g. a color image with R, G and B channels. Each channel value is an 8-bit integer.
But, for an analysis, the three 8-bit values have to be combined to get a 24-bit value and the analysis is done on the (m x n) matrix of 24-bit values.
After the analysis, the matrix has to be decomposed back to three 8-bit channels for displaying the results.
What I am doing right now:
Iterating through all the values in the matrix
Convert each decimal value to binary (using dec2bin)
Combine the three binary values together to get a 24-bit number (using strcat and bin2dec)
Code:
for i=1:m
    for j=1:n
        new_img(i,j) = bin2dec(strcat(...
            sprintf('%.8d',str2double(dec2bin(img(i,j,1)))), ...
            sprintf('%.8d',str2double(dec2bin(img(i,j,2)))), ...
            sprintf('%.8d',str2double(dec2bin(img(i,j,3))))));
    end
end
For the decomposition back to three 8-bits after analysis, the exact reverse process is done, still iterating through (m x n) values.
The problem is huge computation time.
I know that this is not the correct way of doing this. Is there any matrix operation that I can use so that the computation is done quickly?
Although I don't understand why you'd "combine" the rgb planes this way, this'll get you what you're looking for in one command.
img32 = uint32(img); % widen first: shifting uint8 data left would discard the high bits
a = bitshift(img32(:,:,1),16)+...
    bitshift(img32(:,:,2),8)+...
    img32(:,:,3);
Inverting the process requires binary masking in addition to shifting back to the right.
A=zeros(size(img));
A(:,:,1)=bitshift(a,-16);
A(:,:,2)=bitshift(bitand(a,2^16-2^8),-8);
A(:,:,3)=bitand(a,2^8-2^0);
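A quick round-trip check of the packing and unpacking is sketched here on a small synthetic image. The names packed and restored are illustrative, and the channels are widened to uint32 before shifting, since an 8-bit value shifted left by 16 would not fit in 8 bits.

```matlab
% Sketch: pack three 8-bit channels into one 24-bit value and unpack again.
img = uint8(randi([0 255], 4, 5, 3));     % small synthetic RGB image

img32  = uint32(img);                     % widen before shifting left
packed = bitshift(img32(:,:,1), 16) + ...
         bitshift(img32(:,:,2), 8)  + ...
         img32(:,:,3);                    % m-by-n matrix of 24-bit values

mask = uint32(255);                       % keeps the lowest 8 bits
restored = zeros(size(img), 'uint8');
restored(:,:,1) = uint8(bitshift(packed, -16));
restored(:,:,2) = uint8(bitand(bitshift(packed, -8), mask));
restored(:,:,3) = uint8(bitand(packed, mask));
```

Shifting first and masking with 255 afterwards is equivalent to the mask-then-shift form used above; either order recovers the original channels exactly.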
I have an m x m kernel matrix, K, which is, for the sake of simplicity, a linear kernel computed as pdist2(X,X), where X is m x n: the m rows are feature vectors with n dimensions each.
Since n is large, I save computation time by precalculating K for all X.
Later on, I need to swap two of the features in X, say X_1 and X_5.
Can I somehow rearrange K, without having to recompute the entire matrix?
If pv is your permutation vector and J0=pdist2(X,X), then
Y=X(pv,:); J1=pdist2(Y,Y);
should get you the same answer as
J1=J0(pv,pv);
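If you are permuting the columns instead (I couldn't quite tell from your question), then J1 and J0 should be equal, since reordering the coordinates of every vector in the same way does not change the distances between them.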
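Both cases can be checked on a small example. This sketch uses pdist2 as in the question (so it needs the Statistics Toolbox); the matrix sizes and the particular swaps are assumptions for illustration.

```matlab
% Sketch: reorder a precomputed distance matrix instead of recomputing it.
X  = rand(6, 4);                          % 6 feature vectors with 4 dimensions each (synthetic)
J0 = pdist2(X, X);                        % precomputed once

% Case 1: swap rows 1 and 5 of X.
pv = [5 2 3 4 1 6];
J1rows    = J0(pv, pv);                   % reindex the stored matrix
J1rowsRef = pdist2(X(pv, :), X(pv, :));   % full recomputation, for comparison only

% Case 2: swap columns 1 and 4 of X.
pc = [4 2 3 1];
J1cols = pdist2(X(:, pc), X(:, pc));      % distances are unchanged by this
```

Reindexing costs only a copy of the m x m matrix, whereas recomputing costs on the order of m^2 * n operations.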
a process of mine produces 256 binary (logical) matrices, one for each level of a grayscale source image.
Here is the code :
so = imread('bio_sd.bmp');
co = rgb2gray(so);
for l = 1:256
bw = (co == l); % Binary image from level l of original image
be = ordfilt2(bw, 1, ones(3, 3)); % Convolution filter
bl(int16(l)) = {bwlabel(be, 8)}; % Component labelling
end
I obtain a cell array of 256 binary images. Such a binary image contains 1s if the source-image pixel at that location has the same level as the index of the binary image.
ie. the binary image bl{12} contains 1s where the source image has pixels with the level 12.
I'd like to create new image by combining the 256 binary matrices back to a grayscale image.
But I'm very new to MATLAB and I wonder if someone can help me code it :)
PS: I'm using MATLAB R2010a Student Edition.
This whole answer only applies to the original form of the question.
Let's assume you can get all your binary matrices together into a big n-by-m-by-256 matrix binaryimage(x,y,greyvalue). Then you can calculate your final image as
newimage=sum(bsxfun(@times,binaryimage,reshape(0:255,1,1,[])),3)
The magic here is done by bsxfun, which multiplies the 3D (n x m x 256) binaryimage with the 1 x 1 x 256 vector containing the grey values 0...255. This produces a 3D image where for fixed x and y, the vector (y,x,:) contains many zeros and (for the one grey value G where the binary image contained a 1) it contains the value G. So now you only need to sum over this third dimension to get a n x m image.
Update
To test that this works correctly, let's go the other way first:
fullimage=floor(rand(100,200)*256);
figure;imshow(fullimage,[0 255]);
is a random greyscale image. You can calculate the 256 binary matrices like this:
binaryimage=false([size(fullimage) 256]);
for i=1:size(fullimage,1)
for j=1:size(fullimage,2)
binaryimage(i,j,fullimage(i,j)+1)=true;
end
end
We can now apply the solution I gave above
newimage=sum(bsxfun(@times,binaryimage,reshape(0:255,1,1,[])),3);
and verify that it returns the original image:
all(newimage(:)==fullimage(:))
which gives 1 (true) :-).
Update 2
You now mention that your binary images are in a cell array, I assume binimg{1:256}, with each cell containing an n x m binary array. If you can, it probably makes sense to change the code that produces this data to create the 3D binary array I use above: cells are mostly useful if different cells contain data of different types, shapes or sizes.
If there are good reasons to stick with a cell array, you can convert it to a 3D array using
binaryimage = reshape(cell2mat(reshape(binimg,1,256)),n,m,256);
with n and m as used above. The inner reshape is not necessary if you already have size(binimg)==[1 256]. So to sum it up, you need to use your cell array binimg to calculate the 3D matrix binaryimage, which you can then use to calculate the newimage that you are interested in using the code at the very beginning of my answer.
Hope this helps...
What your code does...
I thought it may be best to first go through what the code you posted is actually doing, since there are a couple of inconsistencies. I'll go through each line in your loop:
bw = (co == l);
This simply creates a binary matrix bw with ones where your grayscale image co has a pixel intensity equal to the loop value l. I notice that you loop from 1 to 256, and this strikes me as odd. Typically, images loaded into MATLAB will be an unsigned 8-bit integer type, meaning that the grayscale values will span the range 0 to 255. In such a case, the last binary matrix bw that you compute when l = 256 will always contain all zeroes. Also, you don't do any processing for pixels with a grayscale level of 0. From your subsequent processing, I'm guessing you purposefully want to ignore grayscale values of 0, in which case you probably only need to loop from 1 to 255.
be = ordfilt2(bw, 1, ones(3, 3));
What you are essentially doing here with ORDFILT2 is performing a binary erosion operation. Any values of 1 in bw that have a 0 as one of their 8 neighbors will be set to 0, causing islands of ones to erode (i.e. shrink in size). Small islands of ones will disappear, leaving only the larger clusters of contiguous pixels with the same grayscale level.
bl(int16(l)) = {bwlabel(be, 8)};
Here's where you may be having some misunderstandings. Firstly, the matrices in bl are not logical matrices. In your example, the function BWLABEL will find clusters of 8-connected ones. The first cluster found will have its elements labeled as 1 in the output image, the second cluster found will have its elements labeled as 2, etc. The matrices will therefore contain positive integer values, with 0 representing the background.
Secondly, are you going to use these labeled clusters for anything? There may be further processing you do for which you need to identify separate clusters at a given grayscale intensity level, but with regard to creating a grayscale image from the elements in bl, the specific label value is unnecessary. You only need to identify zero versus non-zero values, so if you aren't using bl for anything else I would suggest that you just save the individual values of be in a cell array and use them to recreate a grayscale image.
Now, onto the answer...
A very simple solution is to concatenate your cell array of images into a 3-D matrix using the function CAT, then use the function MAX to find the indices where the non-zero values occur along the third dimension (which corresponds to the grayscale value from the original image). For a given pixel, if there is no non-zero value found along the third dimension (i.e. it is all zeroes) then we can assume the pixel value should be 0. However, the index for that pixel returned by MAX will default to 1, so you have to use the maximum value as a logical index to set the pixel to 0:
[maxValue,grayImage] = max(cat(3,bl{:}),[],3);
grayImage(~maxValue) = 0;
Note that for the purposes of displaying or saving the image you may want to change the type of the resulting image grayImage to an unsigned 8-bit integer type, like so:
grayImage = uint8(grayImage);
The simplest solution would be to iterate through each of your logical matrices in turn, multiply it by its corresponding weight, and accumulate into an output matrix which will represent your final image.
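A minimal sketch of that loop is shown below. It assumes a cell array binimg in which cell g+1 marks the pixels of grey level g (0 to 255); that indexing convention, the synthetic test image, and the variable names are assumptions, so adjust the weight if your cells are indexed differently.

```matlab
% Sketch: rebuild a grayscale image by accumulating weighted binary masks.
n = 5; m = 7;
original = randi([0 255], n, m);          % synthetic grey image with levels 0..255

% Build the 256 binary masks (assumed convention: cell g+1 holds level g).
binimg = cell(1, 256);
for g = 0:255
    binimg{g + 1} = (original == g);
end

% Multiply each mask by its grey level and accumulate into the output.
newimage = zeros(n, m);
for g = 0:255
    newimage = newimage + g * double(binimg{g + 1});
end
```

Because every pixel is set in exactly one mask, the sum at each location is simply that pixel's grey level; uint8(newimage) gives a type suitable for display or saving.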