Usage of entropy function - matlab

I was trying to find the entropy of a certain probability distribution in MATLAB. For p, I tried doing
E = -sum(p .* log2(p))
and Echeck = entropy(p)
Shouldn't E and Echeck be same?
The matlab help on entropy does say Entropy is defined as -sum(p.*log2(p))
where p contains the histogram counts returned from imhist.But also that entropy converts any class other than logical to uint8 for the histogram count calculation since it is actually trying to calculate the entropy of a grayscale image and hence wants the pixel values to be discrete.
So I guess it's incorrect to use this function for my purpose?
Is there a good alternative?

I used open entropy to check the code, and there is a line:
if ~islogical(I)
I = im2uint8(I);
end
p = imhist(I(:));
which mean that the input is converted to uint8, and then the function computes the entropy of the histogram of the input, and not of the input itself.
That explains the difference.

Related

MatLab:Generate N pseudo-random numbers with a Poisson distribution having mean M and total T where N,M, and T are user defined

I’d like to be able to generate in MatLab a sequence of N pseudo-random numbers with a Poisson distribution having mean M. The sum of the N numbers should be T. N, M, and T are always positive or zero and would be user specified parameters to any function.
Obviously, if T is small relative to N it is likely that there will be problems achieving a total of T. In that case the function could just return the values T and then N-1 zeros or an error code. However, it is highly likely that in most cases T>>N.
I have been trying variations based on the method of generating random numbers with a given distribution provided at http://matlabtricks.com/post-44/generate-random-numbers-with-a-given-distribution and trying various normalizations at each step but have not been successful.
You could try to approximate what you want by using multinomial distribution.
If you use Wikipedia notation, then k=N, n=T and pi=M/T. Poisson distribution has distinctive property of mean equal to variance, but if your parameters are such that pi is small, then mean npi would be quite close to variance npi(1-pi). Sum would be automatically (by property of multinomial) equal of T.
Multinomial sampling in Matlab is done using mnrmd function.
UPDATE
Wrt comment, lets consider N sampled values vi, and write their sum
Sum(i=1...N) vi = T
Lets compute mean value of the left and right side of this equation.
Sum(i=1...N) E(vi) = E(T) = T
On the right side, mean value of constant is constant itself. On the left side we have
Sum(i=1...N) E(vi) = Sum(i=1...N) M = N*M = T
Therefore, M=T/N and pi=M/T=1/N.

Image Parameters (Standard Deviation, Mean and Entropy) of an RGB Image

I couldn't find an answer for RGB image.
How can someone get a value of SD,mean and Entropy of RGB image using MATLAB?
From http://airccse.org/journal/ijdms/papers/4612ijdms05.pdf TABLE3, it seems he got one answer so did he get the average of the RGB values?
Really in need of any help.
After reading the paper, because you are dealing with colour images, you have three channels of information to access. This means that you could alter one of the channels for a colour image and it could still affect the information it's trying to portray. The author wasn't very clear on how they were obtaining just a single value to represent the overall mean and standard deviation. Quite frankly, because this paper was published in a no-name journal, I'm not surprised how they managed to get away with it. If this was attempted to be published in more well known journals (IEEE, ACM, etc.), this would probably be rejected outright due to that very ambiguity.
On how I interpret this procedure, averaging all three channels doesn't make sense because you want to capture the differences over all channels. Doing this averaging will smear that information and those differences get lost. Practically speaking, if you averaged all three channels, should one channel change its intensity by 1, and when you averaged the channels together, the reported average would be so small that it probably would not register as a meaningful difference.
In my opinion, what you should perhaps do is treat the entire RGB image as a 1D signal, then perform the mean, standard deviation and entropy of that image. As such, given an RGB image stored in image_rgb, you can unroll the entire image into a 1D array like so:
image_1D = double(image_rgb(:));
The double casting is important because you want to maintain floating point precision when calculating the mean and standard deviation. The images will probably be of an unsigned integer type, and so this casting must be done to maintain floating point precision. If you don't do this, you may have calculations that get saturated or clamped beyond the limits of that data type and you won't get the right answer. As such, you can calculate the mean, standard deviation and entropy like so:
m = mean(image_1D);
s = std(image_1D);
e = entropy(image_1D);
entropy is a function in MATLAB that calculates the entropy of images so you should be fine here. As noted by #CitizenInsane in his answer, entropy unrolls a grayscale image into a 1D vector and applies the Shannon definition of entropy on this 1D vector. In a similar token, you can do the same thing with a RGB image, but we have already unrolled the signal into a 1D vector anyway, and so the input into entropy will certainly be well suited for the unrolled RGB image.
I have no idea how the author actually did it. But what you could do, is to treat the image as a 1D-array of size WxHx3 and then simply calculate the mean and standard deviation.
Don't know if table 3 is obtain in the same way but at least looking at entropy routine in image toolbox of matlab, RGB values are vectorized to single vector:
I = imread('rgb'); % Read RGB values
I = I(:); % Vectorization of RGB values
p = imhist(I); % Histogram
p(p == 0) = []; % remove zero entries in p
p = p ./ numel(I); % normalize p so that sum(p) is one.
E = -sum(p.*log2(p));

Using matlab to fit data with negative binomial distribution under given p?

How to use matlab to fit data with negative binomial distribution under given p?
Looks like a job for mle in the statistics toolbox. You'll need to express the negative binomial distribution (or the log of it, which will probably be easier) as a function of p and whatever else, and invent some starting parameters to hand in.
If you mean fit as a function of the other parameter R with P fixed, the following shows how to use mle to fix the value of one parameter and estimate the other:
x = nbinrnd(20,.5,1000,1);
params = nbinfit(x) % unconstrained fit
r = mle(x,'pdf',#(x,r)nbinpdf(x,r,.5),'start',23) % constrain P=0.5
% Plot log likelihood as a function of R
rr = linspace(15,25);
yy = zeros(size(rr));
for j=1:length(rr)
yy(j) = sum(log(nbinpdf(x,rr(j),.5)));
end
plot(rr,yy,'-',...
params(1),sum(log(nbinpdf(x,params(1),params(2)))),'o')
legend(sprintf('r=%f,p=.5',r), sprintf('r=%f,p=%f',params),'location','sw')

matlab code for perceptual hashing

I need a matlab code for a perceptual hashing algorithm descried here:
http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html
Basically I want this to remove deatails in an image and only leave the major structure components information.
To do so, I think I need the following steps:
1. Reduce the DCT. Suppose the DCT is 32x32 (), just keep the top-left 8x8. Those represent the lowest frequencies in the picture.
Compute the average value. Like the Average Hash, compute the mean DCT value (using only the 8x8 DCT low-frequency values and excluding the first term since the DC coefficient can be significantly different from the other values and will throw off the average).
Further reduce the DCT. Set the 64 hash bits to 0 or 1 depending on whether each of the 64 DCT values is above or below the average value. The result doesn't tell us the actual low frequencies; it just tells us the very-rough relative scale of the frequencies to the mean. The result will not vary as long as the overall structure of the image remains the same; this can survive gamma and color histogram adjustments without a problem.
reconstruct image after the processing.
Anyone can help on any one of above steps?
I have tried some code that gives some results (in the below link), it is not yet perfect:
https://stackoverflow.com/questions/26748051/extract-low-frequency-from-dct-coeffecients-of-an-image-in-matlab
Try this:
% read image
I = imread('cameraman.tif');
% cosine transform and reduction
d = dct2(I);
d = d(1:8,1:8);
% compute average
a = mean(mean(d));
% set bits, here unclear whether > or >= shall be used
b = d > a;
% maybe convert to string:
string = num2str(b(:)');

How to generate random numbers from cut-off log-normal distribution in Matlab?

The radii r is drawn from a cut-off log-normal distribution, which has a following probability density function:
pdf=((sqrt(2).*exp(-0.5*((log(r/rch)).^2)))./((sqrt(pi.*(sigma_nd.^2))...
.*r).*(erf((log(rmax/rch))./sqrt(2.*(sigma_nd.^2)))-erf((log(rmin/rch))./sqrt(2.*(sigma_nd.^2))))));
rch, sigma_nd, rmax, and rmin are all constants.
I found the explanation from the net, but it seems difficult to find its integral and then take inverse in Matlab.
I checked, but my first instinct is that it looks like log(r/rch) is a truncated normal distribution with limits of log(rmin/rch) and log(rmax/rch). So you can generate the appropriate truncated normal random variable, say y, then r = rch * exp(y).
You can generate truncated normal random variables by generating the untruncated values and replacing those that are outside the limits. Alternatively, you can do it using the CDF, as described by #PengOne, which you can find on the wikipedia page.
I'm (still) not sure that your p.d.f. is completely correct, but what's most important here is the distribution.
If your PDF is continuous, then you can integrate to get a CDF, then find the inverse of the CDF and evaluate that at the random value.
If your PDF is not continuous, then you can get a discrete CDF using cumsum, and use that as your initial Y value in interp(), with the initial X value the same as the values the PDF was sampled at, and asking to interpolate at your array of rand() numbers.
it's not clear what's your variable, but i'm assuming it's r.
the simplest way to do this is, as Chris noted, first get the cdf (note that if r starts at 0, pdf(1) is Nan... change it to 0):
cdf = cumtrapz(pdf);
cdf = cdf / cdf(end);
then spawn a uniform distribution (size_dist indicating the number of elements):
y = rand (size_dist,1);
followed by a method to place distribution along the cdf. Any technique will work, but here is the simplest (albeit inelegant)
x = zeros(size_dist,1);
for i = 1:size_dist
x(i) = find( y(i)<= cdf,1);
end
and finally, returning to the original pdf. Use matlab numerical indexing to reverse course. Note: use r and not pdf:
pdfHist = r(x);
hist (pdfHist,50)
Probably an overkill for your distribution - but you can always write a Metropolis sampler.
On the other hand - implementation is straight forward so you'd have your sampler very quick.