How to calculate Shannon entropy of byte bigrams in MATLAB

I have read an image file into an array like this:
A = imread(fileName);
and now I want to calculate its Shannon entropy. The Shannon entropy implementation found in MATLAB is a byte-level entropy analysis, which treats a file as being composed of 256 byte levels:
wentropy(x,'shannon')
But I need to perform a bigram entropy analysis, which would treat the file as consisting of 65536 levels. Could anyone suggest a good method of accomplishing this?

The entropy of a random variable X can be calculated using the following formula:
$$H(X) = -\sum_{x} p(x) \log_2 p(x)$$
where p(x) = Prob(X = x).
Given a set of n observations (x1, x2, ..., xn), you compute P(X = x) for every value x in the range (in your case, 0 to 65535) and then sum across all values. The easiest way to do this is using hist:
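byteLevel = 65536;
% count the observations, one bin per integer value 0..65535
observationHist = hist(double(observations(:)), 0:byteLevel-1);
% convert to a probability
probXVal = observationHist ./ sum(observationHist);
% drop empty bins, since 0*log2(0) evaluates to NaN
probXVal = probXVal(probXVal > 0);
% compute the entropy
entropy = - sum( probXVal .* log2(probXVal) );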
There are several implementations of this on the File Exchange that are worth checking out.
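To apply this to byte bigrams specifically, the observations first have to be built from pairs of bytes. The following is a minimal sketch, assuming overlapping bigrams (each byte paired with its successor) and a hypothetical example byte stream; the variable names are illustrative and not from the original post. It uses accumarray for the counting rather than hist.

```matlab
% Hypothetical example data: a short byte stream (uint8).
bytes = uint8([1 2 1 2 1 2 3 4]);

% Overlapping bigrams (an assumption): pair each byte with its successor
% and map the pair (b1, b2) to a single value in 0..65535.
b = double(bytes(:));
bigrams = b(1:end-1) * 256 + b(2:end);

% Count occurrences of each of the 65536 possible bigrams.
counts = accumarray(bigrams + 1, 1, [65536 1]);

% Convert to probabilities, dropping empty bins so that 0*log2(0)
% does not produce NaN.
p = counts(counts > 0) / sum(counts);

% Shannon entropy in bits per bigram.
H = -sum(p .* log2(p));
```

For a real file, bytes could be read with fread(fid, Inf, 'uint8=>uint8'). Non-overlapping bigrams would instead pair b(1:2:end-1) with b(2:2:end).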
Note: where are you getting that wentropy uses 256 byte levels? I don't see that anywhere in the docs. Remember that in MATLAB the pixels of a color image have 3 channels (R, G, B), with each channel requiring 8 bits (that is, 256 levels per channel).
Also, because each channel is bounded in [0, 256), you could create a mapping from P(R=r,G=g,B=b) to P(X=x) as follows:
% convert to double first: uint8 arithmetic would saturate at 255
data = double(imgData(:,:,1));
data = data + (double(imgData(:,:,2)) * 256);
data = data + (double(imgData(:,:,3)) * 256 * 256);
I believe you can then use data to calculate the total entropy of the image where each channel is independent.

Convert color image with "65536" levels to gray image with "256" levels and consider entropy evaluation.

Related

Sampling an image

I have a 2D image G(m,n).
G is constructed by first acquiring k-space values and then inverse Fourier transforming.
The k-space consist m*n number of complex values.
What is meant by acquiring only 1/q of this amount (of the m*n values)? (q is a positive number.)
In this scheme I will keep only 1/q of the original k-space values.
The other elements of the original k-space will be set to zero/one.
Thank you.
Discarding a Fraction of Least Significant Frequency Components
One method is to use the fft2() function to convert the image to the frequency domain and delete the least significant frequency components based on their magnitudes. To find the least significant values, the sort() function is used with its second output, which returns the corresponding indices. We can then set the entries at the indices of the lowest-magnitude components to zero using matrix indexing. You've pretty much described what has to be done above, but to provide more context:
• 1/q frequency components must remain.
• 1 - (1/q) frequency components must be set to zero/deleted.
%Grabbing a built-in test image%
Image = imread("peppers.png");
%Converting to grayscale if colour%
if size(Image,3) == 3
Image = rgb2gray(Image);
end
%Converting to frequency domain%
Frequency_Domain = fft2(Image);
%Sorting the frequency domain values from greatest to least magnitude%
[Sorted_Coefficients,Sort_Indices] = sort(reshape(abs(Frequency_Domain),[numel(Frequency_Domain) 1]),'descend');
%Evaluating the number of coefficients to delete%
Number_Of_Coefficients = length(Sort_Indices);
q = 40;
Preserved_Fraction = 1/q;
Number_Of_Coefficients_To_Keep = round(Preserved_Fraction*Number_Of_Coefficients);
%Finding out which values to delete based on the indices corresponding to the sorted array%
Delete_Indices = Sort_Indices(Number_Of_Coefficients_To_Keep+1:end);
Frequency_Domain(Delete_Indices) = 0;
%Evaluating how many frequency components were deleted%
Number_Of_Deleted_Frequency_Components = numel(Delete_Indices);
fprintf("Deleted %d frequency coefficients\n",Number_Of_Deleted_Frequency_Components);
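%Converting the image back to the spatial domain%
%Taking the real part, since zeroing one coefficient of a conjugate pair can leave a small imaginary residue%
G = uint8(real(ifft2(Frequency_Domain)));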
subplot(1,2,1); imshow(Image);
title("Original Image");
subplot(1,2,2); imshow(G);
title("Frequency Sampled Image");
disp(1 - (numel(Delete_Indices)/numel(Frequency_Domain)));

Histogram equalization for non-images in MATLAB

I have a vector of values and I want to transform the values so that their histogram is closer to a uniform distribution, using MATLAB. I am aware of histeq in MATLAB, which takes an image as input and assumes the values are in the 0-255 range. I am looking for a more general version of histeq.
You are looking to do a full-scale contrast stretch, correct? If so, this function will work. You can change K to be the largest value in your vector if you are not using 8-bit integers.
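function [result] = myfscs(image)
% Full-scale contrast stretch: linearly map [min, max] of the input to [0, K].
K = 255;
image = double(image); % avoid saturating integer arithmetic
A = min(image(:));
B = max(image(:));
P = K/(B-A); % scale factor (assumes B > A)
L = A*K/(B-A); % offset so that the minimum maps to 0
J = (P .* image - L);
result = uint8(J); % doesn't have to be a uint8 returned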
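Note that a linear stretch rescales the range but leaves the shape of the histogram unchanged. If a flatter histogram is the goal, one alternative (a different technique from the function above, sketched here under the assumption that ties can be ignored) is to replace each value by its normalized rank, which is the empirical CDF evaluated at that value. The input vector is hypothetical.

```matlab
% Hypothetical input: values from a strongly skewed distribution.
x = [0.1 0.2 0.25 0.3 5 40 41 1000];

% Rank of each value (1 = smallest); ties are not treated specially here.
[~, order] = sort(x);
ranks = zeros(size(x));
ranks(order) = 1:numel(x);

% Map ranks to (0, 1]; the resulting values are evenly spread,
% so their histogram is approximately uniform.
u = ranks / numel(x);
```

The result u can then be scaled to any target range, for example K*u for values in (0, K].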

Generate random data given mean and standard deviation in MATLAB?

I have limited data RV for which I can find the mean mu and standard deviation sigma. Now I want to generate more data points keeping the same mu and sigma. How would I go about doing this in MATLAB? I did the following, however when I plot mean of the generated data (mu_2) it does not match mu...
N = 15
R = mean(RV) + std(RV)*randn(N, 1);
mu = mean(RV)*ones(N,1);
mu_2 = mean(R)*ones(N,1);
I think you should use the normrnd(mu,sigma) function.
See the documentation for more details.
Best regards
That looks correct. For such a small sample size, it's unlikely that you'll get a very good match. Try a much bigger value of N.
If you want to force your dataset to a particular mean and stddev, then you could just generate a set of samples, then measure their mean and stddev, and then just adjust by scaling and scalar addition.
For example:
R = randn(N,1);
% Measure
mu_tmp = mean(R);
std_tmp = std(R);
% Normalise and denormalise
R = (R - mu_tmp) / std_tmp;
R = (R * std_desired) + mu_desired;
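As a self-contained check of the snippet above, the following sketch fills in hypothetical target values for mu_desired and std_desired (the original leaves them undefined) and confirms that the sample statistics match the targets up to floating-point error.

```matlab
mu_desired = 10;    % hypothetical target mean
std_desired = 2;    % hypothetical target standard deviation
N = 15;

% Draw standard-normal samples, then standardize them exactly.
R = randn(N, 1);
R = (R - mean(R)) / std(R);

% Rescale so the sample mean and sample std equal the targets.
R = R * std_desired + mu_desired;

% Differences from the targets; both should be at rounding-error level.
mean_err = abs(mean(R) - mu_desired);
std_err = abs(std(R) - std_desired);
```

The samples are no longer independent draws from a normal distribution after this adjustment, since they are constrained to hit the targets exactly.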
You can also generate Gaussian mixtures using the Netlab library (it's free!):
mix=gmm(8,3,'spherical');
[Data, Label]=gmmsamp(mix,1000);
The above generates a data set with 8 dimensions and three centers (spherical) over 1000 observations.

Using SVD to compress an image in MATLAB

I am brand new to MATLAB but am trying to do some image compression code for grayscale images.
Questions
How can I use SVD to trim off low-valued eigenvalues to reconstruct a compressed image?
Work/Attempts so far
My code so far is:
B=imread('images1.jpeg');
B=rgb2gray(B);
doubleB=double(B);
%read the image and store it as matrix B, convert the image to a grayscale
%photo and convert the matrix to a class 'double' for values 0-255
[U,S,V]=svd(doubleB);
This allows me to successfully decompose the image matrix with eigenvalues stored in variable S.
How do I truncate S (which is 167x301, class double)? Let's say of the 167 eigenvalues I want to take only the top 100 (or any n really), how do I do that and reconstruct the compressed image?
Updated code/thoughts
Instead of putting a bunch of code in the comments section, this is the current draft I have. I have been able to successfully create the compressed image by manually changing N, but I would like to do 2 additional things:
1- Show a panel of images for various compressions (i.e., run a loop for N = 5,10,25, etc.)
2- Somehow calculate the difference (error) between each image and the original and graph it.
I am horrible with understanding loops and output, but this is what I have tried:
B=imread('images1.jpeg');
B=rgb2gray(B);
doubleB=im2double(B);%
%read the image and store it as matrix B, convert the image to a grayscale
%photo and convert the image to a class 'double'
[U,S,V]=svd(doubleB);
C=S;
for N=[5,10,25,50,100]
C(N+1:end,:)=0;
C(:,N+1:end)=0;
D=U*C*V';
%Use singular value decomposition on the image doubleB, create a new matrix
%C (for Compression diagonal) and zero out all entries above N, (which in
%this case is 100). Then construct a new image, D, by using the new
%diagonal matrix C.
imshow(D);
error=C-D;
end
Obviously there are some errors because I don't get multiple pictures or know how to "graph" the error matrix
Although this question is old, it has helped me a lot to understand SVD. I have modified the code you have written in your question to make it work.
I believe you might have solved the problem, however just for the future reference for anyone visiting this page, I am including the complete code here with the output images and graph.
Below is the code:
close all
clear all
clc
%reading and converting the image
inImage=imread('fruits.jpg');
inImage=rgb2gray(inImage);
inImageD=double(inImage);
% decomposing the image using singular value decomposition
[U,S,V]=svd(inImageD);
% Using different number of singular values (diagonal of S) to compress and
% reconstruct the image
dispEr = [];
numSVals = [];
for N=5:25:300
% store the singular values in a temporary var
C = S;
% discard the diagonal values not required for compression
C(N+1:end,:)=0;
C(:,N+1:end)=0;
% Construct an Image using the selected singular values
D=U*C*V';
% display and compute error
figure;
buffer = sprintf('Image output using %d singular values', N);
imshow(uint8(D));
title(buffer);
error=sum(sum((inImageD-D).^2));
% store vals for display
dispEr = [dispEr; error];
numSVals = [numSVals; N];
end
% display the error graph
figure;
plot(numSVals, dispEr);
% set the title after plotting, since plot resets the axes title
title('Error in compression');
grid on
xlabel('Number of Singular Values used');
ylabel('Error between compress and original image');
Applying this to the following image:
Gives the following result with only first 5 Singular Values,
with first 30 Singular Values,
and the first 55 Singular Values,
The change in error with increasing number of singular values can be seen in the graph below.
Here you can see from the graph that using approximately the first 200 singular values yields approximately zero error.
Just to start, I assume you're aware that the SVD is really not the best tool to decorrelate the pixels in a single image. But it is good practice.
OK, so we know that B = U*S*V'. And we know S is diagonal, and sorted by magnitude. So by using only the top few values of S, you'll get an approximation of your image. Let's say C=U*S2*V', where S2 is your modified S. The sizes of U and V haven't changed, so the easiest thing to do for now is to zero the elements of S that you don't want to use, and run the reconstruction. (Easiest way to do this: S2=S; S2(N+1:end, :) = 0; S2(:, N+1:end) = 0;).
Now for the compression part. U is full, and so is V, so no matter what happens to S2, your data volume doesn't change. But look at what happens to U*S2. (Plot the image.) If you kept N singular values in S2, then only the first N columns of U*S2 are nonzero. Compression! Except you still have to deal with V. You can't use the same trick after you've already done (U*S2), since more of U*S2 is nonzero than S2 was by itself. How can we use S2 on both sides? Well, it's diagonal, so use D=sqrt(S2), and now C=U*D*D*V'. So now U*D has only N nonzero columns, and D*V' has only N nonzero rows. Transmit only those quantities, and you can reconstruct C, which is approximately equal to B.
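The factor-splitting idea can be sketched as follows. This is a minimal illustration, not the answerer's code: it uses a hypothetical 8-by-8 test matrix in place of an image, and it keeps only the leading N-by-N block of S so that the square root and the products are well defined for non-square images too. The names left and right are illustrative.

```matlab
B = magic(8);               % hypothetical stand-in for a grayscale image
N = 3;                      % number of singular values to keep
[U, S, V] = svd(B);

% Square root of the leading N singular values (diagonal N-by-N matrix).
D = sqrt(S(1:N, 1:N));

% The two factors that would actually be stored or transmitted.
left = U(:, 1:N) * D;       % 8-by-N
right = D * V(:, 1:N)';     % N-by-8

% Reconstruction from the two factors.
C = left * right;

% Reference: the usual rank-N truncated SVD reconstruction.
C_ref = U(:, 1:N) * S(1:N, 1:N) * V(:, 1:N)';
```

Storing left and right takes 2*8*N values here instead of the 64 in B; the saving only becomes meaningful when N is much smaller than the image dimensions.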
For example, here's a 512 x 512 B&W image of Lena:
We compute the SVD of Lena. Choosing the singular values above 1% of the maximum singular value, we are left with just 53 singular values. Reconstructing Lena with these singular values and the corresponding (left and right) singular vectors, we obtain a low-rank approximation of Lena:
Instead of storing 512 * 512 = 262144 values (each taking 8 bits), we can store 2 x (512 x 53) + 53 = 54325 values, which is approximately 20% of the original size. This is one example of how SVD can be used to do lossy image compression.
Here's the MATLAB code:
% open Lena image and convert from uint8 to double
Lena = double(imread('LenaBW.bmp'));
% perform SVD on Lena
[U,S,V] = svd(Lena);
% extract singular values
singvals = diag(S);
% find out where to truncate the U, S, V matrices
indices = find(singvals >= 0.01 * singvals(1));
% reduce SVD matrices
U_red = U(:,indices);
S_red = S(indices,indices);
V_red = V(:,indices);
% construct low-rank approximation of Lena
Lena_red = U_red * S_red * V_red';
% print results to command window
r = num2str(length(indices));
m = num2str(length(singvals));
disp(['Low-rank approximation used ',r,' of ',m,' singular values']);
% save reduced Lena
imwrite(uint8(Lena_red),'Reduced Lena.bmp');
Taking the n largest eigenvalues and their corresponding eigenvectors may solve your problem. For PCA, multiplying the original data by the leading eigenvectors gives an n x d representation of your image, where d is the number of eigenvectors kept.

How to divide a vector into frames in MATLAB?

I am building a voice morphing system using MATLAB and I need to divide the source and target, training and test samples into frames of 128 samples so that I can then apply the DWT to each frame.
So please guide me how to divide the vector into frames?
You can change a vector into a matrix of equally-sized columns/rows (i.e. frames) using the reshape function:
x = rand(128 * 100, 1);
X = reshape(x, 128, 100);
% X is a 128-by-100 matrix; the i-th column of 128 elements
% is addressed by X(:,i)
An alternative to reshape is buffer, if you have the Signal Processing Toolbox available. In your case, simply:
y = buffer(x,128)
The buffer command will also add trailing zeros to the final frame if the number of elements in your original signal (x) is not an integer multiple of 128.
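If the Signal Processing Toolbox is not available, the same zero-padding behaviour can be approximated with reshape alone. The following is a minimal sketch, assuming a column-vector signal and non-overlapping frames; the example signal and variable names are hypothetical.

```matlab
x = (1:300)';                % hypothetical signal; 300 is not a multiple of 128
frameLen = 128;

% Number of frames needed to hold every sample.
numFrames = ceil(numel(x) / frameLen);

% Zero-pad the tail so the length is an exact multiple of frameLen.
xPadded = [x; zeros(numFrames * frameLen - numel(x), 1)];

% Each column of X is one frame of frameLen samples.
X = reshape(xPadded, frameLen, numFrames);
```

The i-th frame is then X(:,i), so a per-frame transform can be applied in a loop over the columns.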