matlab code for perceptual hashing - matlab

I need MATLAB code for the perceptual hashing algorithm described here:
http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html
Basically, I want to use this to remove details from an image and keep only the information about its major structural components.
To do so, I think I need the following steps:
1. Reduce the DCT. Suppose the DCT is 32x32; just keep the top-left 8x8. Those represent the lowest frequencies in the picture.
2. Compute the average value. Like the Average Hash, compute the mean DCT value (using only the 8x8 DCT low-frequency values and excluding the first term since the DC coefficient can be significantly different from the other values and will throw off the average).
3. Further reduce the DCT. Set the 64 hash bits to 0 or 1 depending on whether each of the 64 DCT values is above or below the average value. The result doesn't tell us the actual low frequencies; it just tells us the very-rough relative scale of the frequencies to the mean. The result will not vary as long as the overall structure of the image remains the same; this can survive gamma and color histogram adjustments without a problem.
4. Reconstruct the image after the processing.
Can anyone help with any of the above steps?
I have tried some code that gives some results (see the link below), but it is not yet perfect:
https://stackoverflow.com/questions/26748051/extract-low-frequency-from-dct-coeffecients-of-an-image-in-matlab

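Try this:
% read image and convert to double for the transform
I = double(imread('cameraman.tif'));
% cosine transform and reduction to the 8x8 low-frequency block
d = dct2(I);
d = d(1:8,1:8);
% compute the average, excluding the DC term d(1,1) as the algorithm describes
a = (sum(d(:)) - d(1,1)) / (numel(d) - 1);
% set bits; the description does not say whether > or >= should be used
b = d > a;
% optionally convert to a string of '0' and '1' characters
hashStr = char(b(:)' + '0');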
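The four steps listed in the question can also be sketched end to end, including the reconstruction step that the code above leaves out. This is only a sketch under stated assumptions: the 32x32 input is a synthetic pattern rather than a real resized photo, the DCT matrix D is built by hand so that no toolbox is required (dct2 and idct2 would be the toolbox route), and all variable names are illustrative.

```matlab
% Synthetic 32x32 grayscale image (an assumption; substitute a real image
% that has been converted to double and resized to 32x32).
N = 32;
[xg, yg] = meshgrid(1:N, 1:N);
A = 128 + 60*sin(xg/5) + 40*cos(yg/7);

% Orthonormal DCT-II matrix built by hand: row index is the frequency k,
% column index is the sample n.
[n, k] = meshgrid(0:N-1, 0:N-1);
D = sqrt(2/N) * cos(pi * (2*n + 1) .* k / (2*N));
D(1, :) = sqrt(1/N);

d   = D * A * D';            % 2-D DCT of A
low = d(1:8, 1:8);           % step 1: keep the lowest 8x8 frequencies

% Step 2: mean of the 63 AC coefficients (DC term low(1,1) excluded).
a = (sum(low(:)) - low(1,1)) / (numel(low) - 1);

% Step 3: 64 hash bits as a logical row vector.
bits = low(:)' > a;

% Hash of a brightened copy; the Hamming distance should stay at or near zero.
d2    = D * (A + 20) * D';
low2  = d2(1:8, 1:8);
a2    = (sum(low2(:)) - low2(1,1)) / (numel(low2) - 1);
bits2 = low2(:)' > a2;
hammingDist = sum(bits ~= bits2);

% Step 4: rebuild a detail-free image from the 8x8 block alone.
dLow = zeros(N);
dLow(1:8, 1:8) = low;
rec = D' * dLow * D;         % inverse 2-D DCT of the truncated spectrum
```

Comparing two hashes by their Hamming distance follows the linked article. Reconstructing from the retained coefficients, rather than from the hash bits, is a design choice of this sketch, since the bits alone no longer carry the coefficient magnitudes.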

Related

How to remove bias when downsampling a vector in Matlab

I have a set of vectors containing some arbitrary shape, like a triangle pulse, with a single maximum.
I need to downsample these vectors by an integer factor.
The position of the maxima relative to the length of the vector should stay the same.
The code below shows that when I do this, the downsampling step introduces a bias of -0.0085, where it should be zero on average.
The bias doesn't seem to change much with the number of vectors (I tried between 200 and 800 vectors).
I also tried different resampling functions like downsample and decimate leading to the same results.
datapoints = zeros(1000,800);
for ii = 1:size(datapoints,2)
    datapoints(ii:ii+18,ii) = [1:10,9:-1:1];
end
% downsample each column of the data
datapoints_downsampled = datapoints(1:10:end,:);
[~,maxinds_downsampled] = max(datapoints_downsampled);
[~,maxinds] = max(datapoints);
% bias needs to be zero
bias = mean(maxinds/size(datapoints,1)-maxinds_downsampled/size(datapoints_downsampled,1))
This graph shows that there is a systematic bias that does not depend on the number of vectors.
How to remove this bias? Is there a way to determine its magnitude given only one vector?
Where does it come from?
There are two main issues with the code:
1. Dividing the index by the length of the vector leads to a small bias: if the max is at the first element, then 1/1000 is not the same as 1/100, even though the subsampling preserved the element that contained the maximum. This needs to be corrected for by subtracting 1 before the division, and adding 1/1000 after the division.
2. Subsampling by a factor of 10 leads to a bias as well: since we're determining the integer location only, in 1/10 cases we preserve the location, in 4/10 cases we move the location in one direction, and in 5/10 cases we move the location in the other direction. The solution is to use an odd subsampling factor, or to determine the location of the maximum with sub-sample precision (this requires proper low-pass filtering before subsampling).
The code below is a modification of the code in the OP. It produces a scatter plot of the error versus the location, as well as OP's bias plot. The first plot helps identify issue 2 above. I have made the subsampling factor and the subsampling offset variables; I recommend that you play with these values to understand what is happening. I have also made the location of the maximum random to avoid a sampling bias. Note that I also use N/factor instead of size(datapoints_downsampled,1): the size of the downsampled vector is the wrong value to use if N/factor is not an integer.
N = 1000;
datapoints = zeros(N,800);
for ii = 1:size(datapoints,2)
    % place the pulse at a random location
    datapoints(randi(N-20)+(1:19),ii) = [1:10,9:-1:1];
end
factor = 11;                  % subsampling factor (odd, as recommended above)
offset = round(factor/2);     % index of the first retained sample
datapoints_downsampled = datapoints(offset:factor:end,:);
[~,maxinds_downsampled] = max(datapoints_downsampled,[],1);
[~,maxinds] = max(datapoints,[],1);
% map the downsampled index back to a relative position in the original vector
maxpos_downsampled = (maxinds_downsampled-1)/(N/factor) + offset/N;
maxpos = (maxinds)/N;
subplot(121), scatter(maxpos,maxpos_downsampled-maxpos)   % error vs. location
bias = cumsum(maxpos_downsampled-maxpos)./(1:size(datapoints,2));   % running mean of the error
subplot(122), plot(bias)
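The index-mapping issue can be checked in isolation, without any pulses. The sketch below assumes the settings from the question (N = 1000, factor 10, first retained sample 1) and compares the two ways of turning a downsampled index into a relative position. The variable names are illustrative.

```matlab
N = 1000;         % original vector length
factor = 10;      % subsampling factor used in the question
offset = 1;       % first retained sample, as in datapoints(1:10:end,:)

kept = offset:factor:N;      % original indices that survive subsampling
idx  = 1:numel(kept);        % their indices in the downsampled vector

truePos   = kept / N;                            % relative position before subsampling
naive     = idx / numel(kept);                   % index divided by the new length
corrected = (idx - 1) / (N/factor) + offset/N;   % mapping used in the answer

naiveErr     = max(abs(truePos - naive));        % constant offset of 9/1000
correctedErr = max(abs(truePos - corrected));    % zero up to rounding
```

With the naive mapping every retained sample is off by the same 0.009, which is close to the bias reported in the question. This covers the indexing part only; the rounding effect of an even subsampling factor is a separate contribution.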

Changing the range of FFT

Let's say I have a vector of length 1000. If I take the FFT of this data, MATLAB chooses the k values as 0:1:length(data)-1. How can I change this range to 0:1:length(data)*(an integer)-1, or to any desired range?
See the documentation for fft. The second parameter sets the size of the transform:
x = randn(1,1000);
y = fft(x,512);
However, because 512 is smaller than the length of x, this is equivalent to
y = fft(x(1:512));
That is, the input data is cropped to the right length, rather than using all the input data and computing only part of the output values. (If the requested size is larger than the input, fft pads the input with trailing zeros instead.)
There is no way to compute only part of the output values, as the FFT algorithm is most efficient when computing the full transform.
Alternatives are to simply crop the output (is the computation taking too long?), or to compute the DFT sample by sample (this will be efficient only for a few output samples; for anything more, the computation will take longer than the full FFT).
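Both directions of the question can be illustrated briefly. The sketch below assumes the goal is a range of 0:1:M*length(data)-1 with an integer M, which fft provides through zero-padding; it also computes one output sample directly from the DFT sum. The values M = 4 and k = 37 are arbitrary choices for illustration.

```matlab
x = randn(1,1000);
L = numel(x);

% Longer range: request M*L points, so fft zero-pads x to that length.
M = 4;
Y = fft(x, M*L);             % k runs over 0:M*L-1
X = fft(x);                  % k runs over 0:L-1

% Every M-th sample of the padded transform coincides with the plain FFT.
padErr = max(abs(Y(1:M:end) - X));

% Single output sample k, computed directly from the DFT definition.
k  = 37;
n  = 0:L-1;
Xk = sum(x .* exp(-2i*pi*k*n/L));
binErr = abs(Xk - X(k+1));   % X(k+1) holds frequency index k
```

Zero-padding samples the same spectrum on a finer frequency grid; it does not add information about the signal.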

Kullback-Leibler Divergence between 2 Histograms from an image (MATLAB)

I extracted histograms from images in MATLAB, and now I want to compare them using the KL divergence.
I found this script, but I do not understand how to apply it to my case.
Here is how I compute my histograms (pretty simple!):
[N,X]=hist(I,n);
[N1,X1]=hist(I1,n);
KLDiv(N,N1)
% ans=inf
N is the histogram of my image I.
As you can see, my result is inf.
Can you please tell me how to use the script in my case?
Thanks
You probably want to calculate the histogram of an image using imhist, instead of the columnwise calculation of the histogram:
I1 = rand(10);
I2 = rand(10);
[N1, X1] = imhist(I1, 10); % limit the number of bins to avoid zero values
[N2, X2] = imhist(I2, 10);
KLDiv(N1.', N2.') % convert to row vectors to correspond with the requested format
KLDiv(N1.', N1.') % the divergence of an histogram with itself is indeed zero
Note that I limited the number of bins to make it very likely that each bin has at least one point, because the Kullback-Leibler divergence is not defined if Q(i) is zero and P(i) is not:
The Kullback–Leibler divergence is defined only if Q(i)=0 implies
P(i)=0, for all i (absolute continuity).
Notes
Range of Kullback–Leibler divergence?
Any positive number, zero if (and only if) they are equal: KLD >= 0.
To which base should I take the logarithm? Natural logarithm log or base 2 logarithm log2?
Note that it is just a matter of scaling your results. So in fact, it doesn't matter, but be sure to use the same logarithm if you want to compare your results. Wikipedia suggests the following:
logarithms in these formulae are taken to base 2 if information is
measured in units of bits, or to base e if information is measured in
nats.
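To make the role of empty bins concrete, here is a minimal sketch of the divergence computed by hand from two count vectors. It is a hypothetical stand-in, not the KLDiv script from the question; the counts are made up, and the base-2 logarithm is an arbitrary choice.

```matlab
% Made-up histogram counts (assumption: same bins for both histograms).
p = [10 20 30 40];
q = [25 25 25 25];

% Normalize the counts to probabilities.
P = p / sum(p);
Q = q / sum(q);

% Sum only over bins with P > 0, taking 0*log(0) as 0.
nz = P > 0;
kl = sum(P(nz) .* log2(P(nz) ./ Q(nz)));          % divergence in bits

% A histogram compared with itself gives zero.
klSelf = sum(P(nz) .* log2(P(nz) ./ P(nz)));

% An empty bin in Q where P is non-zero makes the divergence infinite.
q2 = [50 50 0 0];
Q2 = q2 / sum(q2);
klInf = sum(P(nz) .* log2(P(nz) ./ Q2(nz)));
```

The last case reproduces the inf from the question: any bin that is empty in the second histogram but populated in the first is enough to cause it.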

Image Parameters (Standard Deviation, Mean and Entropy) of an RGB Image

I couldn't find an answer for RGB images.
How can I get the standard deviation, mean and entropy of an RGB image using MATLAB?
In Table 3 of http://airccse.org/journal/ijdms/papers/4612ijdms05.pdf, the author seems to report a single value for each, so did he take the average of the RGB values?
Really in need of any help.
After reading the paper, because you are dealing with colour images, you have three channels of information to access. This means that you could alter one of the channels for a colour image and it could still affect the information it's trying to portray. The author wasn't very clear on how they were obtaining just a single value to represent the overall mean and standard deviation. Quite frankly, because this paper was published in a no-name journal, I'm not surprised how they managed to get away with it. If this was attempted to be published in more well known journals (IEEE, ACM, etc.), this would probably be rejected outright due to that very ambiguity.
As I interpret this procedure, averaging all three channels doesn't make sense because you want to capture the differences over all channels. Averaging will smear that information, and those differences get lost. Practically speaking, if one channel changed its intensity by 1, the change in the average over all three channels would be so small that it probably would not register as a meaningful difference.
In my opinion, what you should perhaps do is treat the entire RGB image as a 1D signal, then compute the mean, standard deviation and entropy of that signal. As such, given an RGB image stored in image_rgb, you can unroll the entire image into a 1D array like so:
image_1D = double(image_rgb(:));
The double casting is important because you want floating-point precision when calculating the mean and standard deviation. The images will probably be of an unsigned integer type, and calculations on that type can get saturated or clamped at the limits of the data type, giving the wrong answer. For the entropy, however, pass the original integer data: entropy converts any non-logical input to uint8 with im2uint8, which assumes that double data lies in [0, 1], so a double vector holding values from 0 to 255 would be saturated. As such, you can calculate the mean, standard deviation and entropy like so:
m = mean(image_1D);
s = std(image_1D);
e = entropy(image_rgb(:)); % original integer data, not the double copy
entropy is a function in MATLAB's Image Processing Toolbox that calculates the entropy of images, so you should be fine here. As noted by @CitizenInsane in his answer, entropy unrolls a grayscale image into a 1D vector and applies the Shannon definition of entropy to this 1D vector. By the same token, you can do the same thing with an RGB image by passing in the unrolled vector of all three channels.
I have no idea how the author actually did it. But what you could do, is to treat the image as a 1D-array of size WxHx3 and then simply calculate the mean and standard deviation.
I don't know whether Table 3 was obtained in the same way, but judging by the entropy routine in MATLAB's Image Processing Toolbox, the RGB values are vectorized into a single vector:
I = imread('rgb'); % Read RGB values
I = I(:); % Vectorization of RGB values
p = imhist(I); % Histogram
p(p == 0) = []; % remove zero entries in p
p = p ./ numel(I); % normalize p so that sum(p) is one.
E = -sum(p.*log2(p));
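The pooled statistics discussed in this thread can be put side by side with per-channel values in a toolbox-free sketch. It assumes a small synthetic uint8 image instead of a real file, and it builds the 256-bin histogram with accumarray rather than imhist. Nothing here is claimed to be the paper's method.

```matlab
% Synthetic 4x5 RGB image with uint8 values (an assumption for illustration).
rng(0);
image_rgb = uint8(randi([0 255], 4, 5, 3));

% Pooled statistics over all three channels.
v = double(image_rgb(:));
m = mean(v);
s = std(v);

% Shannon entropy from a 256-bin histogram of the pooled values.
counts = accumarray(v + 1, 1, [256 1]);   % value 0 goes to bin 1
p = counts(counts > 0) / numel(v);        % drop empty bins, normalize
e = -sum(p .* log2(p));

% Per-channel statistics: one column per channel.
pix      = reshape(double(image_rgb), [], 3);
chanMean = mean(pix, 1);
chanStd  = std(pix, 0, 1);
```

Because the channels have equal size, the pooled mean equals the mean of the three channel means, whereas the pooled standard deviation and entropy are not simple averages of the per-channel values.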

image processing - enlarging an image using FFT (Matlab code)

I'm trying to write simple MATLAB code that enlarges an image using the FFT. I tried the known algorithm for image expansion, which computes the Fourier transform of the image, pads it with zeros and computes the inverse Fourier transform of the padded image.
However, the inverse Fourier transform returns an image that contains complex numbers.
Therefore, when I try to show the result using imshow, I get the following warning:
Warning: Displaying real part of complex input.
Do you have any idea what I am doing wrong?
My code:
im = imread('fruit.jpg');
imFFT = fft2(im);
bigger = padarray(imFFT,[10,10]);
imEnlarged = ifft2(bigger);
Thanks!
That's because the FFT returns values corresponding to the discrete (spatial) frequencies from 0 through Fs, where Fs is the (spatial) sampling rate. You need to insert zeros at high frequencies, which are located at the center of the returned FFT, not at its end.
You can use fftshift to shift the high frequencies to the end, pad with zeros, and then shift back with ifftshift (thanks to @Shai for the correction):
bigger = ifftshift(padarray(fftshift(imFFT),[10,10]));
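Also, note that padding with zeros decreases the values in the enlarged image, because ifft2 divides by the number of elements of the padded array. You can correct that using a suitable amplification factor amp, which for a square image would be equal to (1+2*10/length(im))^2 in this case; in general it is the ratio numel(bigger)/numel(imFFT):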
bigger = ifftshift(padarray(fftshift(amp*imFFT),[10,10]));
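You can also pad after the last frequency sample directly (without the fftshift suggested by Luis Mendo). Note, however, that in the unshifted FFT the highest frequencies sit in the middle, so padding with 'post' appends the zeros after the negative frequencies; the padded spectrum is then not conjugate symmetric and the result is complex: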
>> BIG = padarray( amp*imFFT, [20 20], 0, 'post' );
>> big = ifft2( BIG );
If you want a strictly real result, then before you do the IFFT you need to make sure the zero-padded array is exactly conjugate symmetric. Adding the zeros off-center could prevent this required symmetry.
Due to finite numerical precision, you may still end up with a complex IFFT result, but the imaginary components will all be tiny values that are essentially equivalent to zero.
Your FFT library may contain a half-to-real (quarter-size input for 2D) version that enforces symmetry and throws away the almost-zero numerical noise for you.
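The centered zero-padding and the symmetry requirement can be tried on a small example. The sketch below assumes a synthetic odd-sized grayscale array (magic(9)) so that there is no Nyquist bin to split, and it places the spectrum into a larger array by indexing instead of calling padarray, so no toolbox is needed. The pad width of 3 is arbitrary.

```matlab
im  = magic(9);              % synthetic odd-sized "image" (assumption)
pad = 3;                     % zeros added on each side, per dimension
[M, N] = size(im);

% Center the spectrum, then embed it in a larger array of zeros.
F   = fftshift(fft2(im));
big = zeros(M + 2*pad, N + 2*pad);
big(pad+1:pad+M, pad+1:pad+N) = F;

% Compensate for ifft2 dividing by the larger number of elements.
amp = numel(big) / numel(im);

% Undo the centering and transform back.
imEnlarged = ifft2(ifftshift(amp * big));

% The imaginary part is numerical noise only; drop it explicitly.
maxImag    = max(abs(imag(imEnlarged(:))));
imEnlarged = real(imEnlarged);
```

For even-sized images the centered spectrum has a single Nyquist row and column with no mirror partner after padding, so the imaginary part may be small rather than negligible; taking real() is then an approximation rather than a cleanup.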