Image Histogram Comparison - matlab

I am trying to compare the histograms of two RGB images, which contain heads of the same person as well as non-heads, to see how well they correlate. I am doing this because, after scanning with HOG to decide whether each scanning window is a head or not, I want to track the same head through consecutive frames and also remove some clear false positives.
So far I have tried both RGB and HSV histogram comparison, using the Euclidean distance to measure the difference between the histograms. This is the code I wrote:
%RGB histogram comparison
%clear all;
img1 = imread('testImages/samehead_1.png');
img2 = imread('testImages/samehead_2.png');
img1 = rgb2hsv(img1);
img2 = rgb2hsv(img2);
%% calculate number of bins = root(pixels);
[rows, cols, ~] = size(img1); % third output keeps cols from absorbing the 3 channels
no_of_pixels = rows * cols;
%no_of_bins = floor(sqrt(no_of_pixels));
no_of_bins = 256;
%% obtain Histogram for each colour
% -----1st Image---------
rHist1 = imhist(img1(:,:,1), no_of_bins);
gHist1 = imhist(img1(:,:,2), no_of_bins);
bHist1 = imhist(img1(:,:,3), no_of_bins);
hFig = figure;
hold on;
h(1) = stem(1:256, rHist1);
h(2) = stem((1:256) + 1/3, gHist1); % parentheses needed: ':' binds looser than '+'
h(3) = stem((1:256) + 2/3, bHist1);
set(h, 'marker', 'none')
set(h(1), 'color', [1 0 0])
set(h(2), 'color', [0 1 0])
set(h(3), 'color', [0 0 1])
hold off;
% -----2nd Image---------
rHist2 = imhist(img2(:,:,1), no_of_bins);
gHist2 = imhist(img2(:,:,2), no_of_bins);
bHist2 = imhist(img2(:,:,3), no_of_bins);
hFig = figure;
hold on;
h(1) = stem(1:256, rHist2);
h(2) = stem((1:256) + 1/3, gHist2); % parentheses needed: ':' binds looser than '+'
h(3) = stem((1:256) + 2/3, bHist2);
set(h, 'marker', 'none')
set(h(1), 'color', [1 0 0])
set(h(2), 'color', [0 1 0])
set(h(3), 'color', [0 0 1])
%% concatenate values of a histogram in 3D matrix
% -----1st Image---------
M1(:,1) = rHist1;
M1(:,2) = gHist1;
M1(:,3) = bHist1;
% -----2nd Image---------
M2(:,1) = rHist2;
M2(:,2) = gHist2;
M2(:,3) = bHist2;
%% normalise Histogram
% -----1st Image---------
M1 = M1./no_of_pixels;
% -----2nd Image---------
M2 = M2./no_of_pixels;
%% Calculate Euclidean distance between the two histograms
E_distance = sqrt(sum((M2-M1).^2));
E_distance is an array of 3 distances, one each for the red, green and blue histogram differences.
The problem is:
When I compare the histogram of a non-head (e.g. a bag) with that of a head, there is a clear difference in the error. This is acceptable and can help me remove the false positives.
However, when I try to check whether two images are heads of the same person, this technique does not help at all: the head of another person gives a smaller Euclidean distance than the head of the same person.
Can someone tell me whether I am doing this correctly, or give me some guidance on what I should do?
PS: I got the idea of the LAB histogram comparison from this paper (Affinity Measures section): People Looking at each other

Color histogram similarity may be used as a good clue for tracking by detection, but don't count on it to disambiguate all possible matches between people-people and people-non-people.
Looking at your code, there is one thing you can do to improve the comparison: currently, you are working with per-channel histograms. This is not discriminative enough because you do not know when the R-G-B components co-occur (e.g. you know how many times the red channel is in the range 64-96 and how many times the blue is in the range 32-64, but not when these occur simultaneously). To rectify this, you must work with 3D histograms, counting the co-occurrence of colors. For a discretization of 8 bins per channel, your histograms will have 8^3=512 bins.
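A minimal sketch of such a joint histogram, assuming the image channels have already been scaled to [0, 1] (for example with im2double, or as returned by rgb2hsv). The random arrays are stand-ins for real detection windows, and all variable names are illustrative:

```matlab
nBins = 8;                                    % bins per channel, 8^3 = 512 joint bins
imgs  = {rand(40, 30, 3), rand(50, 20, 3)};   % stand-in images with values in [0, 1]
H     = zeros(nBins^3, numel(imgs));          % one joint histogram per column

for k = 1:numel(imgs)
    pix = reshape(imgs{k}, [], 3);            % one row per pixel, one column per channel
    idx = min(floor(pix * nBins), nBins - 1) + 1;   % per-channel bin index in 1..nBins
    lin = sub2ind([nBins nBins nBins], idx(:,1), idx(:,2), idx(:,3));
    % count co-occurring channel values, normalised by this image's own pixel count
    H(:, k) = accumarray(lin, 1, [nBins^3, 1]) / size(pix, 1);
end

d = norm(H(:,1) - H(:,2));                    % a single Euclidean distance
```

Unlike the per-channel version, this yields one distance per image pair instead of three, and each image is normalised by its own number of pixels, so windows of different sizes remain comparable.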
Other suggestions for improvement:
Weighted assignment to neighboring bins according to interpolation weights. This eliminates the discontinuities introduced by bin quantization
Hierarchical splitting of detection window into cells (1 cell, 4 cells, 16 cells, etc), each with its own histogram, where the histograms of different levels and cells are concatenated. This allows catching local color details, like the color of a shirt, or even more local, a shirt pocket/sleeve.
Working with the Earth Mover's Distance (EMD) instead of the Euclidean metric for comparing histograms. This reduces color quantization effects (differences in histograms are weighted by color-space distance instead of equal weights), and allows for some error in the localization of cells within the detection window.
Use other cues for tracking. You'll be surprised how much the similarity between the HoG descriptors of your detections helps disambiguate matches!
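To illustrate why the EMD behaves differently from the Euclidean metric, here is a small sketch for the simplest case only: two normalised 1-D histograms with unit distance between adjacent bins, where the EMD reduces to the sum of absolute differences of the cumulative histograms. The helper name emd1d is made up for this example; a 3-D colour histogram would need a general transport solver, and a circular channel such as hue would need extra care.

```matlab
% 1-D EMD for normalised histograms, assuming unit ground distance between adjacent bins
emd1d = @(h1, h2) sum(abs(cumsum(h1(:) - h2(:))));

a = [1 0 0 0];   % all mass in bin 1
b = [0 1 0 0];   % all mass in bin 2 (a neighbouring colour)
c = [0 0 0 1];   % all mass in bin 4 (a distant colour)

dNear = emd1d(a, b);   % mass moves by one bin
dFar  = emd1d(a, c);   % mass moves by three bins

eNear = norm(a - b);   % the Euclidean distance cannot tell the two cases apart
eFar  = norm(a - c);
```

The EMD grows with how far the mass has to move, whereas the Euclidean distance is the same for both pairs, which is the quantization effect mentioned above.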

Related

Matlab: Force watershed to segment into a specific number of segments

In order to avoid oversegmentation by the watershed algorithm in Matlab, I would like to force the algorithm to segment into a specific number of segments (in the example here, the algorithm segments automatically into 4, and I would like it to segment into 2). Is there a general way to define the allowed number of output segments?
The code that I am currently using:
% Load the image
grayscaleImg = imread('https://i.stack.imgur.com/KyatF.png');
white_in_current_bits = 65535;
% Display the original image
figure;
set(gcf, 'units','normalized','outerposition',[0 0 1 1]);
hold on;
imshow(grayscaleImg);
title('The original image');
% Binarize the image.
binaryImageElement = grayscaleImg < white_in_current_bits;
% Calculate the distance transform
D = -bwdist(~binaryImageElement);
% Find the regional minima of the distance matrix:
mask = imextendedmin(D,2);
%Display the mask on top of the binary image:
figure;
set(gcf, 'units','normalized','outerposition',[0 0 1 1]);
imshowpair(binaryImageElement,mask,'blend');
title('Blend of binary image and the regional minima mask');
%Impose the regional minima on the distance transform:
D2 = imimposemin(D,mask);
%Watershed the distance transform after imposing the regional minima:
Ld2 = watershed(D2);
%Display the binary image with the watershed segmentation lines:
bw3 = binaryImageElement;
bw3(Ld2 == 0) = 0;
figure;
set(gcf, 'units','normalized','outerposition',[0 0 1 1]);
imshow(bw3);
title('Binary image after watershedding');
There is no direct way to specify the number of regions that the watershed will produce. The watershed will always produce one region per local minimum. But you can modify the image to reduce the number of local minima. One approach is the H-minima transform. This function removes all local minima that are less deep than a threshold.
The idea would be to iterate (this might not be fast...) over thresholds until you get the desired number of regions.
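% iterate over h, starting at 0, on the distance transform D directly
for h = 0:0.5:double(max(-D(:)))   % the 0.5 step is a guess; tune it for your data
    Ld2 = watershed(imhmin(D, h));
    % count regions in Ld2 (labels 1..N, 0 marks ridge lines); stop once few enough
    if max(Ld2(:)) <= 2, break; end
end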
I just noticed that you impose minima in D2. You determine these minima using imextendedmin. This means you apply the H minima, find the resulting local minima, then impose those again. You might as well skip this step, and directly apply the H minima transform.

How to detect peaks on gray background with Matlab FastPeakFind?

I am testing the validity of the File Exchange project FastPeakFind with different line widths and backgrounds.
Test 1 is successful and the tool detects peaks for line widths from 1 px to 10 px.
However, Test 2 fails when trying to find peaks on the frame of an object plot, i.e. an object (plot) on a gray background.
The tool works well on a white background.
Code
close all; clear all; clc;
f = figure;
hax = axes(f);
% Comment this out for Test 2
%zeroFigureDecorations(hax);
af = figure('Name', 'Do Not Touch');
x = rand(1,100);
y = rand(1,100);
linewidth=1;
plot(hax, x,y, 'LineWidth', linewidth);
I = getframe(hax);
I = I.cdata;
% https://se.mathworks.com/matlabcentral/fileexchange/37388-fast-2d-peak-finder
p=FastPeakFind(I);
% Input: 344x435x3 uint8
hold(hax, 'on');
plot(hax, p(1:2:end),p(2:2:end),'r+')
hold(hax, 'off');
function zeroFigureDecorations(ax)
axis(ax, 'tight');
set(ax, 'yTickLabel', []);
set(ax, 'xTickLabel', []);
set(ax, 'Ticklength', [0 0]); % http://stackoverflow.com/a/15529630/54964
colormap(ax, 1-gray(1024));
box(ax, 'off');
axis(ax, 'off');
end
The outputs are listed below. Fig. 1 shows that the function detects something along the lines when the background is white, but not at the correct locations.
Linewidth Output
10 166x1 double
1 844x1 double
Table: full axis decoration in Test 1
Linewidth Output
10 []
1 []
Table: no axis decorations, after zeroFigureDecorations(hax) in Test 2
Fig. 1 line as input (See Bla's answer) and its output,
Fig. 2 Output is wrong in Section 2,
Fig. 3 One more example that you cannot apply to function to simple curves,
Fig. 4 Section 3 Output is wrong, since not known how to apply the function on spectrograms
2 Test with bla's example data
f0 = figure;
hax0 = axes(f0);
d=uint16(conv2(reshape(single( 2^14*(rand(1,128*128)>0.9995) ),[128 128]) ,fspecial('gaussian', 10,2),'same')+2^4*rand(128));
imagesc(d, 'Parent', hax0);
I = getframe(hax0);
I = I.cdata;
p=FastPeakFind(I);
hold(hax0, 'on');
plot(hax0, p(1:2:end),p(2:2:end),'r+')
hold(hax0, 'off');
Output is wrong in Fig. 2
3 Testing with spectrograms
f3 = figure;
hax3 = axes(f3);
N = 1024*10;
n = 0:N-1;
w0 = 2*pi/5;
x = sin(w0*n)+10*sin(2*w0*n);
s = spectrogram(x);
spectrogram(x,'yaxis')
p=FastPeakFind(s);
hold on;
plot(p(1:2:end),p(2:2:end),'r+')
Matlab: 2016b
OS: Debian 8.5
You are not using the function correctly.
Your code is this (verbatim):
f = figure;
hax = axes(f);
af = figure('Name', 'Do Not Touch');
x = rand(1,100);
y = rand(1,100);
linewidth=1;
plot(hax, x,y, 'LineWidth', linewidth);
I = getframe(hax);
I = I.cdata;
The matrix I does not contain peaks of the kind the function is designed for. This is what it looks like:
imagesc(I);
Even if all you had were single pixels, that is not what the function expects: its description says that the point spread function of the peaks needs to be larger than some number of pixels, and that the peaks are assumed to be sparse. The function comes with a demonstration on a sample image that works fine.
Also, it's completely unclear what you even mean by peaks here.
EDIT:
Here's an example of how to use the function. First let's select random positions where we "create" the peaks:
I=rand(200)>0.9995;
This makes a binary matrix with only the points larger than 0.9995 selected (or having value 1). At each step you can imagesc(I) to see how I looks.
In real life, a camera will have some intensity in these points so we write:
I=I*100;
This is important, as a peak by definition needs to be a maximum value in its neighborhood. In real life, peaks are mostly not single pixels; they have some "width" or spread (this is also what the function says it deals with):
I=conv2(I,fspecial('gaussian',10,2),'same');
Here, the spread is produced by a Gaussian "point-spread function" of some width.
Let's add some 30% noise (note that after the last step the maximum value of the peaks is no longer 100, because it is spread to other pixels as well):
I=I+0.3*max(I(:))*rand(size(I));
Let's find peaks
p=FastPeakFind(I);
See how it did:
subplot(1,2,1);imagesc(I);
subplot(1,2,2);imagesc(I); hold on
plot(p(1:2:end),p(2:2:end),'r+')
In the function code, the example does what I wrote here in a single line. Note that there is an edg parameter, as this will not work on peaks at the edges of the image. This can be solved by padding the image with zeros, I think...

How to plot this in matlab?

I have following code
xr=randi([1 150],1,20)
z=numel(xr);
N=10; %Window length
gAll=zeros(1,z-N+1);
for n=0:z-N;
x=xr(1+n:N+n)
d=max(x);
m=numel(x);
y=zeros(d,1);
p=zeros(d,d);
for k=1:m-1
y(x(k))=y(x(k))+1;
p(x(k),x(k+1))=p(x(k),x(k+1))+1;
end
p=bsxfun(@rdivide,p,y);
p(isnan(p)) = 0;
j=prod(p(p~=0));
[~,~,idx] = unique(x);
q=prod(hist(idx,1:max(idx))/numel(x));
s=log(j);
l=log(q);
g=s+l
gAll(n+1)=g;
end
plot(gAll)
I want a plot such that, for a threshold line at gAll = -22, the part of the graph above the threshold is red and the part below it is blue, while the graph remains one continuous curve in these two colors. How can I do this?
You can mask areas of the graph by setting values to NaN. I would create a high-resolution interpolant of the data gAll, then create two copies, one with gAll>-22 masked and one with gAll<-22 masked, and plot them on the same axes.
You could make gAll_HR a high resolution version of your vector gAll on x_HR with interp1(x, gAll, x_HR), then replace values with NaN using logical indexing:
gAll_low = gAll_HR;
gAll_low(gAll_HR>=-22) = NaN;
gAll_high = gAll_HR;
gAll_high(gAll_HR<-22) = NaN;
plot(x_HR, gAll_low, 'b-', x_HR, gAll_high, 'r-')
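Put together, the steps above could look like the following sketch. The gAll values here are made-up stand-ins for the output of the loop in the question, and the choice of 2000 interpolation points is arbitrary:

```matlab
gAll = [-20 -25 -21 -23 -19 -24 -26 -18 -22.5 -20 -21];   % stand-in data
thr  = -22;                                   % threshold from the question

x       = 1:numel(gAll);                      % original sample positions
x_HR    = linspace(1, numel(gAll), 2000);     % dense grid so colour changes land near the threshold
gAll_HR = interp1(x, gAll, x_HR);             % linear interpolation, same shape as plot(gAll)

gAll_low  = gAll_HR;  gAll_low(gAll_HR >= thr) = NaN;   % keep only the part below the threshold
gAll_high = gAll_HR;  gAll_high(gAll_HR < thr) = NaN;   % keep only the part at or above it

plot(x_HR, gAll_low, 'b-', x_HR, gAll_high, 'r-');
hold on
plot(x_HR([1 end]), [thr thr], 'k--');        % the threshold line itself
hold off
```

Because plot skips NaN values, each colour is drawn only where its copy is defined. The denser x_HR is, the smaller the visible gap where the curve crosses the threshold.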

Matlab plot in histogram

Assume y is a vector with random numbers following the distribution f(x)=sqrt(4-x^2)/(2*pi). At the moment I use the command hist(y,30). How can I plot the distribution function f(x)=sqrt(4-x^2)/(2*pi) into the same histogram?
Instead of normalizing numerically, you could also do it by finding a theoretical scaling factor as follows.
nbins = 30;
nsamples = numel(y);
binsize = (max(y)-min(y)) / nbins; % width of one bin
hist(y,nbins)
hold on
x1=linspace(min(y),max(y),100);
scalefactor = nsamples * binsize; % area under the histogram
y1=scalefactor * sqrt(4-x1.^2)/(2*pi);
plot(x1,y1)
Update: How it works.
For any dataset that is large enough to give a good approximation to the pdf (call it f(x)), the integral of f(x) over this domain will be approximately unity. However we know that the area under any histogram is precisely equal to the total number of samples times the bin-width.
So a very simple scale factor to bring the pdf into line with the histogram is Ns*Wb, the total number of sample points times the width of the bins.
Let's take an example of another distribution function, the standard normal. To do exactly what you say you want, you do this:
nRand = 10000;
y = randn(1,nRand);
[myHist, bins] = hist(y,30);
pdf = normpdf(bins);
figure, bar(bins, myHist,1); hold on; plot(bins,pdf,'rx-'); hold off;
This is probably NOT what you actually want though. Why? You'll notice that your density function looks like a thin line at the bottom of your histogram plot. This is because a histogram is counts of numbers in bins, while a density function is normalized to integrate to one. If you have hundreds of items in a bin, there is no way that the density function will match that in scale, so you have a scaling or normalization problem. Either you have to normalize the histogram, or plot a scaled distribution function. I prefer to scale the distribution function so that my counts are sensical when I look at the histogram:
normalizedpdf = pdf/sum(pdf)*sum(myHist);
figure, bar(bins, myHist,1); hold on; plot(bins,normalizedpdf,'rx-'); hold off;
Your case is the same, except you'll use the function f(x) you specified instead of the normpdf command.
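As a sketch of what that could look like for the f(x) in the question: since no data is given, the example below draws its own samples by rejection sampling (that sampling step is my assumption, not part of the question), then scales f by the histogram area, i.e. the number of samples times the bin width.

```matlab
f = @(x) sqrt(4 - x.^2) / (2*pi);        % density from the question, defined on [-2, 2]

% Rejection sampling: uniform proposals on [-2, 2], heights up to max(f) = 1/pi
nProp = 50000;
xProp = 4*rand(1, nProp) - 2;
u     = rand(1, nProp) / pi;
y     = xProp(u < f(xProp));             % accepted samples are distributed according to f

[counts, centers] = hist(y, 30);
binWidth = centers(2) - centers(1);
histArea = numel(y) * binWidth;          % area under the count histogram

bar(centers, counts, 1);
hold on
xs = linspace(-2, 2, 200);
plot(xs, histArea * f(xs), 'r-', 'LineWidth', 2);   % density scaled to count units
hold off
```

With your own vector y, only the lines from hist onwards are needed.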
Let me add another example to the mix:
%# some normally distributed random data
data = randn(1e3,1);
%# histogram
numbins = 30;
hist(data, numbins);
h(1) = get(gca,'Children');
set(h(1), 'FaceColor',[.8 .8 1])
%# figure out how to scale the pdf (with area = 1), to the area of the histogram
[bincounts,binpos] = hist(data, numbins);
binwidth = binpos(2) - binpos(1);
histarea = binwidth*sum(bincounts);
%# fit a gaussian
[muhat,sigmahat] = normfit(data);
x = linspace(binpos(1),binpos(end),100);
y = normpdf(x, muhat, sigmahat);
h(2) = line(x, y*histarea, 'Color','b', 'LineWidth',2);
%# kernel estimator
[f,x,u] = ksdensity( data );
h(3) = line(x, f*histarea, 'Color','r', 'LineWidth',2);
legend(h, {'freq hist','fitted Gaussian','kernel estimator'})

How to fit a curve by a series of segmented lines in Matlab?

I have a simple loglog curve as above. Is there a function in Matlab that can fit this curve with segmented lines and show the start and end points of these line segments? I have checked the Curve Fitting Toolbox in Matlab. It seems to fit a curve with either one line or some functions. I do not want to fit the curve with one line only.
If there is no direct function, any alternative that achieves the same goal is fine with me. My goal is to fit the curve with segmented lines and get the locations of the end points of these segments.
First of all, your problem is not called curve fitting. Curve fitting is when you have data, and you find the best function that describes it, in some sense. You, on the other hand, want to create a piecewise linear approximation of your function.
I suggest the following strategy:
Split manually into sections. The section size should depend on the derivative: large derivative -> small section.
Sample the function at the nodes between the sections.
Find a linear interpolation that passes through the points mentioned above.
Here is an example of code that does that. You can see that the red line (interpolation) is very close to the original function, despite the small number of sections. This is due to the adaptive section size.
function fitLogLog()
x = 2:1000;
y = log(log(x));
%# Find section sizes, by using an inverse of the approximation of the derivative
numOfSections = 20;
indexes = round(linspace(1,numel(y),numOfSections));
derivativeApprox = diff(y(indexes));
inverseDerivative = 1./derivativeApprox;
weightOfSection = inverseDerivative/sum(inverseDerivative);
totalRange = max(x(:))-min(x(:));
sectionSize = weightOfSection.* totalRange;
%# The relevant nodes
xNodes = x(1) + [ 0 cumsum(sectionSize)];
yNodes = log(log(xNodes));
figure;plot(x,y);
hold on;
plot (xNodes,yNodes,'r');
scatter (xNodes,yNodes,'r');
legend('log(log(x))','adaptive linear interpolation');
end
Andrey's adaptive solution provides a more accurate overall fit. If what you want is segments of a fixed length, however, then here is something that should work, using a method that also returns a complete set of all the fitted values. Could be vectorized if speed is needed.
Nsamp = 1000; %number of data samples on x-axis
x = [1:Nsamp]; %this is your x-axis
Nlines = 5; %number of lines to fit
fx = exp(-10*x/Nsamp); %generate something like your current data, f(x)
gx = NaN(size(fx)); %this will hold your fitted lines, g(x)
joins = round(linspace(1, Nsamp, Nlines+1)); %define equally spaced breaks along the x-axis
dx = diff(x(joins)); %x-change
df = diff(fx(joins)); %f(x)-change
m = df./dx; %gradient for each section
for i = 1:Nlines
x1 = joins(i); %start point
x2 = joins(i+1); %end point
gx(x1:x2) = fx(x1) + m(i)*(0:dx(i)); %compute line segment
end
subplot(2,1,1)
h(1,:) = plot(x, fx, 'b', x, gx, 'k', joins, gx(joins), 'ro');
title('Normal Plot')
subplot(2,1,2)
h(2,:) = loglog(x, fx, 'b', x, gx, 'k', joins, gx(joins), 'ro');
title('Log Log Plot')
for ip = 1:2
subplot(2,1,ip)
set(h(ip,:), 'LineWidth', 2)
legend('Data', 'Piecewise Linear', 'Location', 'NorthEastOutside')
legend boxoff
end
This is not an exact answer to this question, but since I arrived here based on a search, I'd like to answer the related question of how to create (not fit) a piecewise linear function that is intended to represent the mean (or median, or some other function) of interval data in a scatter plot.
First, a related but more sophisticated alternative using regression, which apparently has some MATLAB code listed on the Wikipedia page, is multivariate adaptive regression splines.
The solution here is to just calculate the mean on overlapping intervals to get points:
function [x, y] = intervalAggregate(Xdata, Ydata, aggFun, intStep, intOverlap)
% intOverlap in [0, 1); 0 for no overlap of intervals, etc.
% intStep this is the size of the interval being aggregated.
minX = min(Xdata);
maxX = max(Xdata);
minY = min(Ydata);
maxY = max(Ydata);
intInc = intOverlap*intStep; %How far we advance each iteration.
if intOverlap <= 0
intInc = intStep;
end
nInt = ceil((maxX-minX)/intInc); %Number of aggregations
parfor i = 1:nInt
xStart = minX + (i-1)*intInc;
xEnd = xStart + intStep;
intervalIndices = find((Xdata >= xStart) & (Xdata <= xEnd));
x(i) = aggFun(Xdata(intervalIndices));
y(i) = aggFun(Ydata(intervalIndices));
end
For instance, to calculate the mean over some paired X and Y data I had handy, using intervals of length 0.1 and intOverlap = 0.333 (see scatter image):
[x,y] = intervalAggregate(Xdat, Ydat, @mean, 0.1, 0.333)
x =
Columns 1 through 8
0.0552 0.0868 0.1170 0.1475 0.1844 0.2173 0.2498 0.2834
Columns 9 through 15
0.3182 0.3561 0.3875 0.4178 0.4494 0.4671 0.4822
y =
Columns 1 through 8
0.9992 0.9983 0.9971 0.9955 0.9927 0.9905 0.9876 0.9846
Columns 9 through 15
0.9803 0.9750 0.9707 0.9653 0.9598 0.9560 0.9537
We see that as x increases, y tends to decrease slightly. From there, it is easy enough to draw line segments and/or perform some other kind of smoothing.
(Note that I did not attempt to vectorize this solution; a much faster version could be written if Xdata is sorted.)