Fast computation of joint histogram of two images [duplicate] - matlab

I have two black and white images and I need to calculate the mutual information.
Image 1 = X
Image 2 = Y
I know that the mutual information can be defined as:
MI = entropy(X) + entropy(Y) - JointEntropy(X,Y)
MATLAB already has built-in functions to calculate the entropy but not to calculate the joint entropy. I guess the true question is: How do I calculate the joint entropy of two images?
Here is an example of the images I'd like to find the joint entropy of:
X =
0 0 0 0 0 0
0 0 1 1 0 0
0 0 1 1 0 0
0 0 0 0 0 0
0 0 0 0 0 0
Y =
0 0 0 0 0 0
0 0 0.38 0.82 0.38 0.04
0 0 0.32 0.82 0.68 0.17
0 0 0.04 0.14 0.11 0
0 0 0 0 0 0

To calculate the joint entropy, you need to calculate the joint histogram between two images. The joint histogram is essentially the same as a normal 1D histogram but the first dimension logs intensities for the first image and the second dimension logs intensities for the second image. This is very similar to what is commonly referred to as a co-occurrence matrix. At location (i,j) in the joint histogram, it tells you how many intensity values we have encountered that have intensity i in the first image and intensity j in the second image.
What is important is that this logs how many times we have seen this pair of intensities at the same corresponding locations. For example, if we have a joint histogram count of (7,3) = 2, this means that when we were scanning both images, when we encountered the intensity of 7, at the same corresponding location in the second image, we encountered the intensity of 3 for a total of 2 times.
Constructing a joint histogram is very simple to do.
First, create a 256 x 256 matrix (assuming your image is unsigned 8-bit integer) and initialize them to all zeroes. Also, you need to make sure that both of your images are the same size (width and height).
Once you do that, take a look at the first pixel of each image, which we will denote as the top left corner. Specifically, take a look at the intensities for the first and second image at this location. The intensity of the first image will serve as the row while the intensity of the second image will serve as the column.
Find this location in the matrix and increment this spot in the matrix by 1.
Repeat this for the rest of the locations in your image.
After you're done, divide all entries by the total number of elements in either image (remember they should be the same size). This will give us the joint probability distribution between both images.
One would be inclined to do this with for loops, but as it is commonly known, for loops are notoriously slow and should be avoided if at all possible. However, you can easily do this in MATLAB in the following way without loops. Let's assume that im1 and im2 are the first and second images you want to compare to. What we can do is convert im1 and im2 into vectors. We can then use accumarray to help us compute the joint histogram. accumarray is one of the most powerful functions in MATLAB. You can think of it as a miniature MapReduce paradigm. Simply put, each data input has a key and an associated value. The goal of accumarray is to bin all of the values that belong to the same key and do some operation on all of these values. In our case, the "key" would be the intensity values, and the values themselves are the value of 1 for every intensity value. We would then want to add up all of the values of 1 that map to the same bin, which is exactly how we'd compute a histogram. The default behaviour for accumarray is to add all of these values. Specifically, the output of accumarray would be an array where each position computes the sum of all values that mapped to that key. For example, the first position would be the summation of all values that mapped to the key of 1, the second position would be the summation of all values that mapped to the key of 2 and so on.
However, for the joint histogram, you want to figure out which values map to the same intensity pair of (i,j), and so the keys here would be a pair of 2D coordinates. As such, any intensities that have an intensity of i in the first image and j in the second image in the same spatial location shared between the two images go to the same key. Therefore in the 2D case, the output of accumarray would be a 2D matrix where each element (i,j) contains the summation of all values that mapped to key (i,j), similar to the 1D case that was mentioned previously which is exactly what we are after.
In other words:
indrow = double(im1(:)) + 1;
indcol = double(im2(:)) + 1; %// Should be the same size as indrow
jointHistogram = accumarray([indrow indcol], 1);
jointProb = jointHistogram / numel(indrow);
With accumarray, the first input are the keys and the second input are the values. A note with accumarray is that if each key has the same value, you can simply assign a constant to the second input, which is what I've done and it's 1. In general, this is an array with the same number of rows as the first input. Also, take special note of the first two lines. There will inevitably be an intensity of 0 in your image, but because MATLAB starts indexing at 1, we need to offset both arrays by 1.
Now that we have the joint histogram, it's really simple to calculate the joint entropy. It is similar to the entropy in 1D, except now we are just summing over the entire joint probability matrix. Bear in mind that it will be very likely that your joint histogram will have many 0 entries. We need to make sure that we skip those or the log2 operation will be undefined. Let's get rid of any zero entries now:
indNoZero = jointHistogram ~= 0;
jointProb1DNoZero = jointProb(indNoZero);
Take notice that I searched the joint histogram instead of the joint probability matrix. This is because the joint histogram consists of whole numbers while the joint probability matrix will lie between 0 and 1. Because of the division, I want to avoid comparing any entries in this matrix with 0 due to numerical roundoff and instability. The above will also convert our joint probability matrix into a stacked 1D vector, which is fine.
As such, the joint entropy can be calculated as:
jointEntropy = -sum(jointProb1DNoZero.*log2(jointProb1DNoZero));
If my understanding of calculating entropy for an image in MATLAB is correct, it should calculate the histogram / probability distribution over 256 bins, so you can certainly use that function here with the joint entropy that was just calculated.
What if we have floating-point data instead?
So far, we have assumed that the images that you have dealt with have intensities that are integer-valued. What if we have floating point data? accumarray assumes that you are trying to index into the output array using integers, but we can still certainly accomplish what we want with this small bump in the road. What you would do is simply assign each floating point value in both images to have a unique ID. You would thus use accumarray with these IDs instead. To facilitate this ID assigning, use unique - specifically the third output from the function. You would take each of the images, put them into unique and make these the indices to be input into accumarray. In other words, do this instead:
[~,~,indrow] = unique(im1(:)); %// Change here
[~,~,indcol] = unique(im2(:)); %// Change here
%// Same code
jointHistogram = accumarray([indrow indcol], 1);
jointProb = jointHistogram / numel(indrow);
indNoZero = jointHistogram ~= 0;
jointProb1DNoZero = jointProb(indNoZero);
jointEntropy = -sum(jointProb1DNoZero.*log2(jointProb1DNoZero));
Note that with indrow and indcol, we are directly assigning the third output of unique to these variables and then using the same joint entropy code that we computed earlier. We also don't have to offset the variables by 1 as we did previously because unique will assign IDs starting at 1.
Aside
You can actually calculate the histograms or probability distributions for each image individually using the joint probability matrix. If you wanted to calculate the histograms / probability distributions for the first image, you would simply accumulate all of the columns for each row. To do it for the second image, you would simply accumulate all of the rows for each column. As such, you can do:
histogramImage1 = sum(jointHistogram, 1);
histogramImage2 = sum(jointHistogram, 2);
After, you can calculate the entropy of both of these by yourself. To double check, make sure you turn both of these into PDFs, then compute the entropy using the standard equation (like above).
How do I finally compute Mutual Information?
To finally compute Mutual Information, you're going to need the entropy of the two images. You can use MATLAB's built-in entropy function, but this assumes that there are 256 unique levels. You probably want to apply this for the case of there being N distinct levels instead of 256, and so you can use what we did above with the joint histogram, then computing the histograms for each image in the aside code above, and then computing the entropy for each image. You would simply repeat the entropy calculation that was used jointly, but apply it to each image individually:
%// Find non-zero elements for first image's histogram
indNoZero = histogramImage1 ~= 0;
%// Extract them out and get the probabilities
prob1NoZero = histogramImage1(indNoZero);
prob1NoZero = prob1NoZero / sum(prob1NoZero);
%// Compute the entropy
entropy1 = -sum(prob1NoZero.*log2(prob1NoZero));
%// Repeat for the second image
indNoZero = histogramImage2 ~= 0;
prob2NoZero = histogramImage2(indNoZero);
prob2NoZero = prob2NoZero / sum(prob2NoZero);
entropy2 = -sum(prob2NoZero.*log2(prob2NoZero));
%// Now compute mutual information
mutualInformation = entropy1 + entropy2 - jointEntropy;
Hope this helps!

Related

Partial fft of multidimensional array

I have a 3D array of dimension (NX,NY,NZ) which represents a variable in physical space, let's say velocities, taken from a simulation in a 3D domain.
1) I want to Fourier-transform only the dimensions X and Z, how should I use the built-in function fft in this case? At some point I want to also get back to the physical space, but only on X, so the same question applies.
2) I read that FFTW uses only 2*N/3 points, should I specify NX and NZ as the number of retained modes or fewer?
3) When using the FFTW package, is there any issue with the coefficient in front of the integral defining the Fourier transformation? Does this package assume that my domain is 2pix2pix2pi?
1°) The function for 2D FFT is fft2, and it will by default apply to the two first dimensions of the array. That is, fft2(velocities) will give you a 3D array with NZ Fourier transforms along dimensions X and Y
In order to do the FFT along other dimensions, you have to manually decompose the 2D FFT as two 1D FFTs. fft will work by default along dimension 1 and produce as many samples as there were in the input. fft(X[],n) does the same, but along dimension n.
Thus, you may compute a 2D FFT of your 3D array, along dimensions X and Z with the command:
my_FFT = fft(fft(velocities),[],3);
2°) There will be as many samples out as samples in.
3°) I believe the normalization by the size of the array is fully applied on the reverse transform, and not at all on the direct transform.
fft([1 0 0 0 0 0])
ans =
1 1 1 1 1 1
To maintain normalization, a coefficient sqrt(NX*NZ) should be applied (multiply when doing FFT, divide when doing an IFFT).

Quadratic optimisation to compute a transformation matrix

I have two 3D meshes and on these meshes exists points that are mapped to each other and points with no mapping. For these points I want to calculate a mapping.
I want to do this by calculating a transformation Matrix for each point of one mesh that is already mapped to a point in the other mesh. Afterwards I want to calculate with an quadratic optimization the best possible mapping for all other points.
f(p) = T_i * p
(I'm using homogeneous coordinates - so every transfomation matrix should look like this:
1 0 0 a_i
0 1 0 b_i
0 0 1 c_i
0 0 0 1
Because of some not existing mathematics knowledge I stuck at this point, could somebody tell what exactly I need to feed matlab with that it gives me back an optimized transformation matrix, preferably with a quadratic optimization.
(and maybe an elegant way to do the transformation matrix calculation - actually I do it with a loop that runs over all points and calcs line by line the a_i, b_i, c_i and transforms them to a new matrix..., is there a function in matlab?)
Thanks in advance.

Replace matrix values with mean values extracted from smaller matrix in MATLAB

Let's say I have a 10 x 10 matrix. What I do then is iterate through the whole matrix with an 3 x 3 matrix (except from the edges to make it easier), and from this 3 x 3 matrix I get the mean/average value of this 3 x 3 space. What I then want to do is to replace the original matrix values, with those new mean/average values.
Can someone please explain to me how I can do that? Some code examples would be appreciated.
What you're trying to do is called convolution. In MATLAB, you can do the following (read up more on convolution and how it's done in MATLAB).
conv2( myMatrix , ones(3)/9 , 'same' );
A little deciphering is in order. myMatrix is the matrix you're working on (the 10x10 one you mentioned). The ones(3)/9 command creates a so-called mask of filter kernel, which is
1/9 1/9 1/9
1/9 1/9 1/9
1/9 1/9 1/9
When you take this mask and move it around, multiplying value-by-value the mask's entries with 3x3 entries of the image and then add the results (dot product, essentially), the you get the average of the 9 values that ended up underneath this mask. So once you've placed this mask over every 3x3 segment of you matrix (image, I imagine) and replaced the value in the middle by the average, you get the result of that command. You're welcome to experiment with it further. The 'same' flag simply means that the matrix you're getting back is the same size as you original one. This is important because, as you've yourself realized, there are multiple ways of dealing with the edges.
To do this, you need to keep the original intact until you have got all the means. This means that if you implement this using a loop, you have to store the averages in different matrix. To get the borders too, the easiest way is to copy the original matrix to the new matrix, although only the borders are needed to be copied.
This average3x3 function copies the input Matrix to AveragedMatrix and then goes through all the elements that are not on any border, calculates the mean of 3x3 space and stores it in the corresponding element of AveragedMatrix.
function [AveragedMatrix] = average3x3(Matrix)
AveragedMatrix = Matrix;
if ((size(Matrix, 1) < 3) || (size(Matrix, 2) < 3))
fprintf('Matrix is too small, minimum matrix size is 3x3.\n');
return
end
for RowIndex = 2:(size(Matrix, 1)-1)
Rows = RowIndex-1:RowIndex+1;
for ColIndex = 2:(size(Matrix, 2)-1)
Columns = ColIndex-1:ColIndex+1;
AveragedMatrix(RowIndex,ColIndex) = mean(mean(Matrix(Rows,Columns)));
end
end
return
To use this function, you can try:
A = randi(10,10);
AveragedA = average3x3(A);

Matlab : 256 binary matrices to one 256 levels grayscale image

a process of mine produces 256 binary (logical) matrices, one for each level of a grayscale source image.
Here is the code :
so = imread('bio_sd.bmp');
co = rgb2gray(so);
for l = 1:256
bw = (co == l); % Binary image from level l of original image
be = ordfilt2(bw, 1, ones(3, 3)); % Convolution filter
bl(int16(l)) = {bwlabel(be, 8)}; % Component labelling
end
I obtain a cell array of 256 binary images. Such a binary image contains 1s if the source-image pixel at that location has the same level as the index of the binary image.
ie. the binary image bl{12} contains 1s where the source image has pixels with the level 12.
I'd like to create new image by combining the 256 binary matrices back to a grayscale image.
But i'm very new to Matlab and i wonder if someone can help me to code it :)
ps : i'm using matlab R2010a student edition.
this whole answer only applies to the original form of the question.
Lets assume you can get all your binary matrices together into a big n-by-m-by-256 matrix binaryimage(x,y,greyvalue). Then you can calculate your final image as
newimage=sum(bsxfun(#times,binaryimage,reshape(0:255,1,1,[])),3)
The magic here is done by bsxfun, which multiplies the 3D (n x m x 256) binaryimage with the 1 x 1 x 256 vector containing the grey values 0...255. This produces a 3D image where for fixed x and y, the vector (y,x,:) contains many zeros and (for the one grey value G where the binary image contained a 1) it contains the value G. So now you only need to sum over this third dimension to get a n x m image.
Update
To test that this works correctly, lets go the other way first:
fullimage=floor(rand(100,200)*256);
figure;imshow(fullimage,[0 255]);
is a random greyscale image. You can calculate the 256 binary matrices like this:
binaryimage=false([size(fullimage) 256]);
for i=1:size(fullimage,1)
for j=1:size(fullimage,2)
binaryimage(i,j,fullimage(i,j)+1)=true;
end
end
We can now apply the solution I gave above
newimage=sum(bsxfun(#times,binaryimage,reshape(0:255,1,1,[])),3);
and verify that I returns the original image:
all(newimage(:)==fullimage(:))
which gives 1 (true) :-).
Update 2
You now mention that your binary images are in a cell array, I assume binimg{1:256}, with each cell containing an n x m binary array. If you can it probably makes sense to change the code that produces this data to create the 3D binary array I use above - cells are mostly usefull if different cells contain data of different types, shapes or sizes.
If there are good reasons to stick with a cell array, you can convert it to a 3D array using
binaryimage = reshape(cell2mat(reshape(binimg,1,256)),n,m,256);
with n and m as used above. The inner reshape is not necessary if you already have size(binimg)==[1 256]. So to sum it up, you need to use your cell array binimg to calculate the 3D matrix binaryimage, which you can then use to calculate the newimage that you are interested in using the code at the very beginning of my answer.
Hope this helps...
What your code does...
I thought it may be best to first go through what the code you posted is actually doing, since there are a couple of inconsistencies. I'll go through each line in your loop:
bw = (co == l);
This simply creates a binary matrix bw with ones where your grayscale image co has a pixel intensity equal to the loop value l. I notice that you loop from 1 to 256, and this strikes me as odd. Typically, images loaded into MATLAB will be an unsigned 8-bit integer type, meaning that the grayscale values will span the range 0 to 255. In such a case, the last binary matrix bw that you compute when l = 256 will always contain all zeroes. Also, you don't do any processing for pixels with a grayscale level of 0. From your subsequent processing, I'm guessing you purposefully want to ignore grayscale values of 0, in which case you probably only need to loop from 1 to 255.
be = ordfilt2(bw, 1, ones(3, 3));
What you are essentially doing here with ORDFILT2 is performing a binary erosion operation. Any values of 1 in bw that have a 0 as one of their 8 neighbors will be set to 0, causing islands of ones to erode (i.e. shrink in size). Small islands of ones will disappear, leaving only the larger clusters of contiguous pixels with the same grayscale level.
bl(int16(l)) = {bwlabel(be, 8)};
Here's where you may be having some misunderstandings. Firstly, the matrices in bl are not logical matrices. In your example, the function BWLABEL will find clusters of 8-connected ones. The first cluster found will have its elements labeled as 1 in the output image, the second cluster found will have its elements labeled as 2, etc. The matrices will therefore contain positive integer values, with 0 representing the background.
Secondly, are you going to use these labeled clusters for anything? There may be further processing you do for which you need to identify separate clusters at a given grayscale intensity level, but with regard to creating a grayscale image from the elements in bl, the specific label value is unnecessary. You only need to identify zero versus non-zero values, so if you aren't using bl for anything else I would suggest that you just save the individual values of be in a cell array and use them to recreate a grayscale image.
Now, onto the answer...
A very simple solution is to concatenate your cell array of images into a 3-D matrix using the function CAT, then use the function MAX to find the indices where the non-zero values occur along the third dimension (which corresponds to the grayscale value from the original image). For a given pixel, if there is no non-zero value found along the third dimension (i.e. it is all zeroes) then we can assume the pixel value should be 0. However, the index for that pixel returned by MAX will default to 1, so you have to use the maximum value as a logical index to set the pixel to 0:
[maxValue,grayImage] = max(cat(3,bl{:}),[],3);
grayImage(~maxValue) = 0;
Note that for the purposes of displaying or saving the image you may want to change the type of the resulting image grayImage to an unsigned 8-bit integer type, like so:
grayImage = uint8(grayImage);
The simplest solution would be to iterate through each of your logical matrices in turn, multiply it by its corresponding weight, and accumulate into an output matrix which will represent your final image.

Uniform distribution of binary values in Matlab

I have a requirement for the generation of a given number N of vectors of given size each consistent of a uniform distribution of 0s and 1s.
This is what I am doing at the moment, but I noticed that the distribution is strongly peaked at half 1s and half 0s, which is no good for what I am doing:
a = randint(1, sizeOfVector, [0 1]);
The unifrnd function looks promising for what I need, but I can't manage to understand how to output a binary vector of that size.
Is it the case that I can use the unifrnd function (and if so how would be appreciated!) or can is there any other more convenient way to obtain such a set of vectors?
Any help appreciated!
Note 1: just to be sure - here's what the requirement actually says:
randomly choose N vectors of given size that are
uniformly distributed over [0;1]
Note 2: I am generating initial configurations for Cellular Automata, that's why I can have only binary values [0;1].
To generate 1 random vector with elements from {0, 1}:
unidrnd(2,sizeOfVector,1)-1
The other variants are similar.
If you want to get uniformly distributed 0's and 1's, you want to use randi. However, from the requirement, I'd think that the vectors can have real values, for which you'd use rand
%# create a such that each row contains a vector
a = randi(1,N,sizeOfVector); %# for uniformly distributed 0's and 1's
%# if you want, say, 60% 1's and 40% 0's, use rand and threshold
a = rand(N,sizeOfVector) > 0.4; %# maybe you need to call double(a) if you don't want logical output
%# if you want a random number of 1's and 0's, use
a = rand(N,sizeOfVector) > rand(1);