I want to segment the hand from a depth image using depth thresholding. I used the Kinect and Leap dataset from this link:
http://lttm.dei.unipd.it/downloads/gesture/
I tried the two pieces of code below, but the output I get is a totally black image in both cases. The original .png image is:
I selected the depth values from the 1_depth.bin file in the dataset.
Code 1
I = fopen('D:\dsktop\kinect_leap_dataset\acquisitions\P1\G1\1_depth.bin', 'r');
A = fread(I, 480*640, 'uint8=>uint8');
A = reshape(A, 480, 640);
min_row = min(A);
min_col = min(min_row);
for i = 1:480
    for j = 1:640
        if ((A(i,j) > (min_col + 10)) || (A(i,j) == (min_col + 10)))
            A(i,j) = 1;
        else
            A(i,j) = 0;
        end
    end
end
imshow(A)
Code 2
image = imread('D:\dsktop\kinect_leap_dataset\acquisitions\P1\G1\1_depth.png');
I = fopen('D:\dsktop\kinect_leap_dataset\acquisitions\P1\G1\1_depth.bin', 'r');
A = fread(I, 480*640, 'uint8=>uint8');
A = reshape(A, 480, 640);
min_row = min(A);
min_col = min(min_row);
for i = 1:480
    for j = 1:640
        if ((A(i,j) > (min_col + 10)) || (A(i,j) == (min_col + 10)))
            image(i,j) = 1;
        else
            image(i,j) = 0;
        end
    end
end
imshow(image)
The output I am getting is:
Kindly tell me what is wrong with this code and why I am not getting any output.
Your code is not vectorized at all. Here's how to rewrite it in a vectorized fashion, which is both more efficient and more readable. Note also why you see black: after your loop, A is a uint8 image containing only 0s and 1s, and imshow displays uint8 data on a 0-255 scale, so both values appear (almost) black; displaying with imshow(A, []) rescales the range.
I = fopen('D:\dsktop\kinect_leap_dataset\acquisitions\P1\G1\1_depth.bin', 'r');
A = fread(I, 480*640, 'uint8=>uint8');
A = reshape(A, 480, 640);
min_ = min(A(:)); % minimal value across rows and columns
mask = A >= (min_+10); % no loop needed; vectorized comparison
imshow(mask, []);
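If the goal is to keep only the hand itself rather than a background mask, a minimal follow-up sketch (my addition, assuming the hand is the object nearest the sensor, which is what the min-based threshold implies):
hand = A;
hand(mask) = 0;   % zero out the background, keeping the nearest (hand) pixels
imshow(hand, []); % [] rescales the display to the data range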
I'm trying to write some MATLAB code such that, given a monochromatic video, it produces an image in which each pixel equals the minimum value that pixel takes over the course of the video. For example, pixel (200,300) of the output will equal the minimum value that pixel (200,300) takes throughout the video. I have written some code to do this, but it's terribly inefficient. Any comments to improve my code would be appreciated.
hologramVideo = VideoReader('test.mp4');
mkdir('images')
frames = int16(hologramVideo.Duration * hologramVideo.FrameRate);
imageValues = cell(frames, 1);
ii = 1;
while hasFrame(hologramVideo)
    imageValues{ii} = im2uint8(rgb2gray(readFrame(hologramVideo)));
    ii = ii + 1;
end
newImage = zeros(512);
currentMin = 255;
currentVal = 0;
x = 1;
y = 1;
for x = 1:512
    for y = 1:512
        currentMin = 255; % reset for every pixel
        for i = 1:frames
            currentImg = imageValues(i,1,1);
            currentVal = currentImg{1,1}(x,y);
            if currentVal < currentMin
                currentMin = currentVal;
            end
        end
        newImage(x,y) = currentMin;
    end
end
I don't have a file to test with, but the main bottleneck is how you store the images. Rather than storing them in a cell array, you are better off storing them in a 3D array:
imageValues = zeros([Nframes, 512, 512]);
ii = 1;
while hasFrame(hologramVideo)
    imageValues(ii,:,:) = im2uint8(rgb2gray(readFrame(hologramVideo)));
    ii = ii + 1;
end
That would make the remainder of the code very easy and vectorized (i.e. fast):
newImage = squeeze(min(imageValues,[],1));
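A closely related sketch (my variant, under the same assumption of 512x512 grayscale frames) stores the frames along the third dimension instead, which matches MATLAB's column-major layout and keeps each frame contiguous in memory:
hologramVideo = VideoReader('test.mp4');
nFrames = floor(hologramVideo.Duration * hologramVideo.FrameRate);
imageValues = zeros(512, 512, nFrames, 'uint8'); % frames along dim 3
ii = 1;
while hasFrame(hologramVideo)
    imageValues(:,:,ii) = im2uint8(rgb2gray(readFrame(hologramVideo)));
    ii = ii + 1;
end
newImage = min(imageValues, [], 3); % per-pixel minimum over all frames, no squeeze needed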
I'm writing MATLAB code to find interest points in an image using the DoG (difference of Gaussians).
Here is the main.m:
imTest1 = rgb2gray(imread('1.jpg'));
imTest1 = double(imTest1);
sigma = 0.6;
k = 5;
thresh = 3;
[x1,y1,r1] = DoG(k,sigma,thresh,imTest1);
%get the interest points and show it on the image with its scale
figure(1);
imshow(imTest1,[]), hold on, scatter(y1,x1,r1,'r');
And the function DoG is:
function [x,y,r] = DoG(k,sigma,thresh,imTest)
x = []; y = []; r = [];
%suppose 5 levels of Gaussian blur
for i = 1:k
    g{i} = fspecial('gaussian',size(imTest),i*sigma);
end
%so 4 levels of DoG
for i = 1:k-1
    d{i} = imfilter(imTest,g{i+1}-g{i});
end
%compare the current pixel to the 26 surrounding pixels (3x3x3 neighbourhood
%across position and scale); if it is the maximum/minimum, it is an interest point
for i = 2:k-2
    for m = 2:size(imTest,1)-1
        for n = 2:size(imTest,2)-1
            id = 1;
            compare = zeros(1,27);
            for ii = i-1:i+1
                for mm = m-1:m+1
                    for nn = n-1:n+1
                        compare(id) = d{ii}(mm,nn);
                        id = id+1;
                    end
                end
            end
            compare_max = max(compare);
            compare_min = min(compare);
            if (compare_max == d{i}(m,n) || compare_min == d{i}(m,n))
                if (compare_min < -thresh || compare_max > thresh)
                    x = [x;m];
                    y = [y;n];
                    r = [r;abs(d{i}(m,n))];
                end
            end
        end
    end
end
end
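As an aside, the triple-nested extremum search can be cross-checked against a vectorized equivalent, which is also easier to experiment with. A sketch (my addition, assuming the same cell array d of DoG responses and the same thresh; the threshold test here is approximate):
D = cat(3, d{:});                  % H x W x (k-1) stack of DoG responses
locMax = imdilate(D, true(3,3,3)); % max of each 3x3x3 neighbourhood (includes the pixel itself)
locMin = imerode(D, true(3,3,3));  % min of each 3x3x3 neighbourhood
isExt = (D == locMax | D == locMin) & abs(D) > thresh;
isExt(:,:,[1 end]) = false;        % skip first/last scale, as the loops do
isExt([1 end],:,:) = false;        % skip image borders, as the loops do
isExt(:,[1 end],:) = false;
[xs, ys, ss] = ind2sub(size(isExt), find(isExt));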
So there's a Gaussian function, and the sigma I set is 0.6. After running the code, I find the positions are not correct and the scales look almost the same for all interest points. I think my code should work, but the result is wrong. Does anybody know what the problem is?
I am trying to implement a simple pixel-level center-surround image enhancement. The center-surround technique uses statistics between the center pixel of a window and its surrounding neighborhood to decide what enhancement needs to be done. In the code given below, I compare the center pixel with the average of the surrounding pixels, and based on that I switch between two cases to enhance the contrast. The code that I have written is as follows:
im = normalize8(im,1); %to set the range of pixel from 0-255
s1 = floor(K1/2); %K1 is the size of the window for surround
M = 1000; %is a constant value
out1 = padarray(im,[s1,s1],'symmetric');
out1 = CE(out1,s1,M);
out = (out1(s1+1:end-s1,s1+1:end-s1));
out = normalize8(out,0); %to set the range of pixel from 0-1
function [out] = CE(out,s,M)
B = 255;
out1 = out;
for i = s+1 : size(out,1) - s
    for j = s+1 : size(out,2) - s
        temp = out(i-s:i+s,j-s:j+s);
        Yij = out1(i,j);
        Sij = (1/(2*s+1)^2)*sum(sum(temp));
        if (Yij>=Sij)
            Aij = A(Yij-Sij,M);
            out1(i,j) = ((B + Aij)*Yij)/(Aij+Yij);
        else
            Aij = A(Sij-Yij,M);
            out1(i,j) = (Aij*Yij)/(Aij+B-Yij);
        end
    end
end
out = out1;

function [Ax] = A(x,M)
if x == 0
    Ax = M;
else
    Ax = M/x;
end
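In formula form, the per-pixel rule implemented by CE and A above is (with $Y$ the centre pixel, $S$ the window mean, and $B = 255$):

$$
Y' = \begin{cases}
\dfrac{\left(B + A(Y-S)\right) Y}{A(Y-S) + Y}, & Y \ge S \\[1ex]
\dfrac{A(S-Y)\, Y}{A(S-Y) + B - Y}, & Y < S
\end{cases}
\qquad
A(x) = \begin{cases} M, & x = 0 \\ M/x, & x \neq 0 \end{cases}
$$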
The code does the following things:
1) Normalizes the image to the 0-255 range and pads it with additional elements to perform the windowing operation.
2) Calls the function CE.
3) In the function CE, obtains the windowed image (temp).
4) Finds the average of the window (Sij).
5) Compares the center of the window (Yij) with the average value (Sij).
6) Based on the result of the comparison, performs one of the two enhancement operations.
7) Finally sets the range back to 0-1.
I have to run this for multiple window sizes (K1, K2, K3, etc.), and the images are of size 1728*2034. When the window size is set to 100, the time consumed is very high.
Can I use vectorization at some stage to reduce the time for loops?
The profiler result (for window size 21) is as follows:
The profiler result (for window size 100) is as follows:
I have changed the code of my function and rewritten it without the sub-function. The code is as follows:
function [out] = CE(out,s,M)
B = 255;
Aij = zeros(1,2);
out1 = out;
n_factor = (1/(2*s+1)^2);
for i = s+1 : size(out,1) - s
    for j = s+1 : size(out,2) - s
        temp = out(i-s:i+s,j-s:j+s);
        Yij = out1(i,j);
        Sij = n_factor*sum(sum(temp));
        if Yij-Sij == 0
            Aij(1) = M;
            Aij(2) = M;
        else
            Aij(1) = M/(Yij-Sij);
            Aij(2) = M/(Sij-Yij);
        end
        if (Yij>=Sij)
            out1(i,j) = ((B + Aij(1))*Yij)/(Aij(1)+Yij);
        else
            out1(i,j) = (Aij(2)*Yij)/(Aij(2)+B-Yij);
        end
    end
end
out = out1;
There is a slight improvement in speed, from 93 s to 88 s. Suggestions for any other improvements to my code are welcome.
I have tried to incorporate the suggestion to replace the sliding window with a convolution and then vectorize the rest. The code below is my implementation, and I'm not getting the expected result.
function [out_im] = CE_conv(im,s,M)
B = 255;
temp = ones(2*s,2*s);
temp = temp ./ numel(temp);
out1 = conv2(im,temp,'same');
out_im = im;
Aij = im-out1; %same as Yij-Sij
Aij1 = out1-im; %same as Sij-Yij
Mij = Aij;
Mij(Aij>0) = M./Aij(Aij>0);   % if Yij>Sij, Mij = M/(Yij-Sij)
Mij(Aij<0) = M./Aij1(Aij<0);  % if Yij<Sij, Mij = M/(Sij-Yij)
Mij(Aij==0) = M;              % if Yij-Sij == 0, Mij = M
out_im(Aij>=0) = ((B + Mij(Aij>=0)).*im(Aij>=0))./(Mij(Aij>=0)+im(Aij>=0));
out_im(Aij<0) = (Mij(Aij<0).*im(Aij<0))./ (Mij(Aij<0)+B-im(Aij<0));
I am not able to figure out where I'm going wrong.
A detailed explanation of what I'm trying to implement is given in the following paper:
Vonikakis, Vassilios, and Ioannis Andreadis. "Multi-scale image contrast enhancement." In 10th International Conference on Control, Automation, Robotics and Vision (ICARCV 2008), pp. 856-861. IEEE, 2008.
I've tried to see if I could get those times down by processing with colfilt and nlfilter, since both are usually much faster than for-loops for sliding-window image processing.
Both worked fine for relatively small windows. For an image of 2048x2048 pixels and a window of 10x10, the solution with colfilt takes about 5 seconds (on my personal computer). With a window of 21x21 the time jumped to 27 seconds, but that is still a relative improvement on the times shown in the question. Unfortunately I don't have enough memory to run colfilt with windows of 100x100, but the solution with nlfilter works, though it takes about 120 seconds.
Here is the code.
Solution with colfilt:
function outval = enhancematrix(inputmatrix,M,B)
%Inputmatrix is a 2D matrix or column vector, outval is a 1D row vector.
% If inputmatrix is made of integers...
inputmatrix = double(inputmatrix);
%1. Compute S and Y
normFactor = 1 / (size(inputmatrix,1) + 1).^2; %Size of column.
S = normFactor*sum(inputmatrix,1); % Sum over the columns.
Y = inputmatrix(ceil(size(inputmatrix,1)/2),:); % Center row.
% So far we have all S and Y, one value per column.
%2. Compute A(abs(Y-S))
A = Afunc(abs(S-Y),M);
% And all A: one value per column.
%3. The tricky part. If Y(i)-S(i) > 0 do something.
doPositive = (Y > S);
doNegative = ~doPositive;
outval = zeros(1,size(inputmatrix,2));
outval(doPositive) = (B + A(doPositive) .* Y(doPositive)) ./ (A(doPositive) + Y(doPositive));
outval(doNegative) = (A(doNegative) .* Y(doNegative)) ./ (A(doNegative) + B - Y(doNegative));
end
function out = Afunc(x,M)
% Input x is a row vector. Output is another row vector.
out = x;
out(x == 0) = M;
out(x ~= 0) = M./x(x ~= 0);
end
And to call it, simply do:
M = 1000; B = 255; enhancenow = @(x) enhancematrix(x,M,B);
w = 21; % window size
result = colfilt(inputImage,[w w],'sliding',enhancenow);
Solution with nlfilter:
function outval = enhanceimagecontrast(neighbourhood,M,B)
%1. Compute S and Y
normFactor = 1 / (length(neighbourhood) + 1).^2;
S = normFactor*sum(neighbourhood(:));
Y = neighbourhood(ceil(size(neighbourhood,1)/2),ceil(size(neighbourhood,2)/2));
%2. Compute A(abs(Y-S))
test = (Y>=S);
A = Afunc(abs(Y-S),M);
%3. Return outval
if test
    outval = ((B + A) * Y) / (A + Y);
else
    outval = (A * Y) / (A + B - Y);
end

function aval = Afunc(x,M)
if (x == 0)
    aval = M;
else
    aval = M/x;
end
And to call it, simply do:
M = 1000; B = 255; enhancenow = @(x) enhanceimagecontrast(x,M,B);
w = 21; % window size
result = nlfilter(inputImage,[w w], enhancenow);
I didn't spend much time checking that everything is 100% correct, but I did see some nice contrast enhancement (hair looks particularly nice).
This answer is the implementation that was suggested by Peter. I debugged it and am presenting the final working version of the fast implementation.
function [out_im] = CE_conv(im,s,M)
B = 255;
im = ( im - min(im(:)) ) ./ ( max(im(:)) - min(im(:)) )*255;
h = ones(s,s)./(s*s);
out1 = imfilter(im,h,'conv');
out_im = im;
Aij = im-out1; %same as Yij-Sij
Aij1 = out1-im; %same as Sij-Yij
Mij = Aij;
Mij(Aij>0) = M./Aij(Aij>0); % if Yij>Sij Mij = M/(Yij-Sij);
Mij(Aij<0) = M./Aij1(Aij<0); % if Yij<Sij Mij = M/(Sij-Yij);
Mij(Aij==0) = M; % if Yij-Sij == 0 Mij = M;
out_im(Aij>=0) = ((B + Mij(Aij>=0)).*im(Aij>=0))./(Mij(Aij>=0)+im(Aij>=0));
out_im(Aij<0) = (Mij(Aij<0).*im(Aij<0))./ (Mij(Aij<0)+B-im(Aij<0));
out_im = ( out_im - min(out_im(:)) ) ./ ( max(out_im(:)) - min(out_im(:)) );
To call this, use the following code:
I = imread('pout.tif');
w_size = 51;
M = 4000;
output = CE_conv(I(:,:,1),w_size,M);
The output for the 'pout.tif' image is given below.
The execution time for the bigger image (1728x2034) with a 100x100 block size is around 5 seconds with this implementation.
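One further note (my observation, not from the answers above): the loop version padded the image with padarray(...,'symmetric'), while imfilter defaults to zero-padding at the borders. If border behaviour matters, imfilter's 'symmetric' option reproduces the original padding:
out1 = imfilter(im,h,'conv','symmetric'); % mirror-pad borders like padarray(...,'symmetric')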
I have to create an algorithm in MATLAB that, given an image of a hand, can determine the form of the hand from the number of raised fingers and the presence or absence of the thumb. So far, the algorithm is almost complete, but I don't know what else I can do to find the peaks that represent the fingers. We have tried a lot of things, but nothing works. The idea is to find a sudden increase, but as the pixels are never completely aligned, nothing we tried worked. Does anyone have any ideas? Here is the code so far.
The image that it is reading is this one:
To know whether a finger is relevant or not, we already have an idea that might work... but we need to find the fingers first.
clear all
close all
image=imread('mao2.jpg');
YCBCR = rgb2ycbcr(image);
image=YCBCR;
cb = image(:,:,2);
cr = image(:,:,3);
imagek(:,1) = cb(:);
imagek(:,2) = cr(:);
imagek = double(imagek);
[IDX, C] = kmeans(imagek, 2, 'EmptyAction', 'singleton');
s=size(image);
IDX= uint8(IDX);
C2=round(C);
imageNew = zeros(s(1),s(2));
temp = reshape(IDX, [s(1) s(2)]);
for i = 1 : 1 : s(1)
    for j = 1 : 1 : s(2)
        imageNew(i,j,:) = C2(temp(i,j));
    end
end
imageNew=uint8(imageNew);
[m,n]=size(imageNew);
for i = 1:1:m
    for j = 1:1:n
        if(imageNew(i,j)>=127)
            pretobranco(i,j)=0;
        else
            pretobranco(i,j)=1;
        end
    end
end
I2=imfill(pretobranco);
imshow(I2);
imwrite(I2, 'mao1trab.jpg');
[m,n]=size(I2);
B=edge(I2);
figure
imshow(B);
hold on;
stats=regionprops(I2,'BoundingBox');
rect=rectangle('position', [stats(1).BoundingBox(1), stats(1).BoundingBox(2), stats(1).BoundingBox(3), stats(1).BoundingBox(4)], 'EdgeColor', 'r');
stats(1).BoundingBox(1)
stats(1).BoundingBox(2)
stats(1).BoundingBox(3)
stats(1).BoundingBox(4)
figure
Bound = B( stats(1).BoundingBox(2): stats(1).BoundingBox(2)+stats(1).BoundingBox(4)-1, stats(1).BoundingBox(1):stats(1).BoundingBox(1)+stats(1).BoundingBox(3)-1);
imshow(Bound)
y1 = round(stats(1).BoundingBox(2))
y2 = round(stats(1).BoundingBox(2)+stats(1).BoundingBox(4)-1)
x1 = round(stats(1).BoundingBox(1))
x2 = round(stats(1).BoundingBox(1)+stats(1).BoundingBox(3)-1)
% Bounding box contained in the image [M, N].
[M,N] = size(Bound)
vertical=0;
horizontal=0;
if M > N
    vertical = 1 %vertical image
else
    horizontal = 1 %horizontal image
end
%Find thumb
MaoLeft = 0;
MaoRight = 0;
nPixelsBrancos = 0;
if vertical==1
    for i = x1:1:x2
        for j = y1:1:y2
            if I2(j,i) == 1
                nPixelsBrancos = nPixelsBrancos + 1; %Number of pixels of the hand
            end
        end
    end
    for i = x1:1:x1+30
        for j = y1:1:y2
            if I2(j,i) == 1
                MaoLeft = MaoLeft + 1; %Number of hand pixels within the first 30 columns
            end
        end
    end
    for i = x2-30:1:x2
        for j = y1:1:y2
            if I2(j,i) == 1
                MaoRight = MaoRight + 1; %Number of hand pixels within the last 30 columns
            end
        end
    end
    TaxaBrancoLeft = MaoLeft/nPixelsBrancos
    TaxaBrancoRight = MaoRight/nPixelsBrancos
    if TaxaBrancoLeft <= (7/100)
        if TaxaBrancoRight <= (7/100)
            Thumb = 0 %Thumb on both borders is defined as no thumb.
        else
            ThumbEsquerdo = 1 %Thumb on the left
        end
    end
    if TaxaBrancoRight <= (7/100) && TaxaBrancoLeft >= (7/100)
        ThumbDireito = 1 %Thumb on the right
    end
end
if horizontal==1
    for i = x1:1:x2
        for j = y1:1:y2
            if I2(i,j) == 1
                nPixelsBrancos = nPixelsBrancos + 1; %Number of pixels of the hand
            end
        end
    end
    for i = x1:1:x2
        for j = y1:1:y1+30
            if I2(i,j) == 1
                MaoLeft = MaoLeft + 1; %Number of hand pixels within the first 30 columns
            end
        end
    end
    for i = x1:1:x2
        for j = y2-30:1:y2
            if I2(i,j) == 1
                MaoRight = MaoRight + 1; %Number of hand pixels within the last 30 columns
            end
        end
    end
    TaxaBrancoLeft = MaoLeft/nPixelsBrancos
    TaxaBrancoRight = MaoRight/nPixelsBrancos
    if TaxaBrancoLeft <= (7/100)
        if TaxaBrancoRight <= (7/100)
            Thumb = 0 %Thumb on both borders is defined as no thumb.
        else
            ThumbEsquerdo = 1 %Thumb on the left border
        end
    end
    if TaxaBrancoRight <= (7/100) && TaxaBrancoLeft >= (7/100)
        ThumbDireito = 1 %Thumb on the right border
    end
end
figure
imshow(I2);
%centroid detection
Ibw = im2bw(I2);
Ilabel = bwlabel(Ibw);
stat = regionprops(Ilabel,'centroid');
figure
imshow(I2); hold on;
for x = 1: numel(stat)
    plot(stat(x).Centroid(1),stat(x).Centroid(2),'ro');
end
centroid = [stat(x).Centroid(1) stat(x).Centroid(2)] %x and y coordinates of the centroid
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Seemed like an interesting problem, so I gave it a shot. Basically you start with a Sobel filter to find the edges in your image (after slight denoising). Then clean up the resulting lines, use them to separate regions within your binary mask of the hand, use a watershed transform to find the wrist, some distance transforms to find other landmarks, then remove the palm. What you're left with is separate regions for each finger and thumb. You can count those regions easily enough or find which way they are pointing, or whatever you'd like.
imgURL = 'https://encrypted-tbn2.gstatic.com/imgs?q=tbn:ANd9GcRQsqJtlrOnSbJNTnj35Z0uG9BXsecX2AXn1vV0YDKodq-zSuqnnQ';
imgIn=imread(imgURL);
gaussfilt = fspecial('gaussian', 3, .5); % Blur starting image
blurImg = imfilter(double(imgIn(:,:,1)), gaussfilt);
edgeImg = edge(blurImg, 'sobel'); % Use Sobel edge filter to pick out contours of hand + fingers
% Clean up contours
edgeImg = bwmorph(edgeImg, 'close', 1);
edgeImg = bwmorph(edgeImg, 'thin', Inf);
% Clean up rogue spots in corners
edgeImg([2 end-1], 2) = 0;
edgeImg([2 end-1], end-1) = 0;
% Extend lines to edge of image (corrects for 'close' operation above)
edgeImg([1 end],:) = edgeImg([2 end-1],:);
edgeImg(:, [1 end]) = edgeImg(:, [2 end-1]);
% Remove all but the longest line
regs = regionprops(edgeImg, 'Area', 'PixelIdxList');
regs(vertcat(regs.Area) ~= max(vertcat(regs.Area))) = [];
lineImg = false(size(edgeImg, 1), size(edgeImg, 2));
lineImg(regs.PixelIdxList) = 1;
fillImg = edgeImg;
% Close in wrist
if any(fillImg(1,:))
    fillImg(1,:) = 1;
end
if any(fillImg(end,:))
    fillImg(end,:) = 1;
end
if any(fillImg(:,1))
    fillImg(:,1) = 1;
end
if any(fillImg(:,end))
    fillImg(:,end) = 1;
end
fillImg = imfill(fillImg, 'holes');
fillImg([1 end], :) = 0;
fillImg(:, [1 end]) = 0;
fillImg([1 end],:) = fillImg([2 end-1],:);
fillImg(:, [1 end]) = fillImg(:, [2 end-1]);
% Start segmenting out hand + fingers
handBin = fillImg;
% Set lines in above image to 0 to separate closely-spaced fingers
handBin(lineImg) = 0;
% Erode these lines to make fingers a bit more separate
handBin = bwmorph(handBin, 'erode', 1);
% Segment out just hand (remove wrist)
distImg = bwdist(~handBin);                   % distance to background: largest at the palm centre
[cDx, cDy] = find(distImg == max(distImg(:))); % palm centre = deepest point of the hand
midWrist = distImg;
midWrist = max(midWrist(:)) - midWrist;       % invert so the palm becomes a catchment basin
midWrist(distImg == 0) = Inf;                 % keep the background out of the watershed
wristWatershed = watershed(imerode(midWrist, strel('disk', 10)));
whichRegion = wristWatershed(cDx, cDy);       % watershed region containing the palm centre
handBin(wristWatershed ~= whichRegion) = 0;   % discard the wrist/forearm region
regs = regionprops(handBin, 'Area', 'PixelIdxList');
regs(vertcat(regs.Area) ~= max(vertcat(regs.Area))) = [];
handOnly = zeros(size(handBin, 1), size(handBin, 2));
handOnly(regs.PixelIdxList) = 1;
% Find radius of circle around palm centroid that excludes wrist and splits
% fingers into separate regions.
% This is estimated as D = 1/3 * [(Centroid->Fingertip) + 2*(Centroid->Wrist)]
% Find Centroid-> Wrist distance
dist2w = wristWatershed ~= whichRegion;
dist2w = bwdist(dist2w);
distToWrist = dist2w(cDx, cDy);
% Find Centroid-> Fingertip distance
dist2FE = zeros(size(handOnly, 1), size(handOnly, 2));
dist2FE(cDx, cDy) = 1;
dist2FE = bwdist(dist2FE).*handOnly;
distToFingerEnd = max(dist2FE(:));
circRad = mean([distToFingerEnd, distToWrist, distToWrist]); % Estimate circle radius
% Draw circle
X = bsxfun(@plus,(1:size(handOnly, 1))',zeros(1,size(handOnly, 2)));
Y = bsxfun(@plus,(1:size(handOnly, 2)),zeros(size(handOnly, 1),1));
B = sqrt(sum(bsxfun(@minus,cat(3,X,Y),reshape([cDx, cDy],1,1,[])).^2,3))<=circRad;
% Cut out binary mask within circle
handOnly(B) = 0;
% Label separate regions, where each now corresponds to a separate digit
fingerCount = bwlabel(handOnly);
% Display overlay image
figure()
imshow(imgIn)
hold on
overlayImg = imshow(label2rgb(fingerCount, 'jet', 'k'));
set(overlayImg, 'AlphaData', 0.5);
hold off
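Counting the digits from the label image is then straightforward; a small sketch using the fingerCount label matrix from above (the centroid part is just one possible way to go on to estimate pointing direction):
nDigits = max(fingerCount(:))                      % number of separate digit regions
digitStats = regionprops(fingerCount, 'Centroid'); % one centroid per digit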
Results:
http://imgur.com/ySn1fPy
I implemented a method for removing shadows based on invariant color features, from the paper Entropy Minimization for Shadow Removal. My implementation sometimes yields similar computational results, but they are always somewhat off, and my grayscale image is blocky, maybe as a result of incorrectly taking the geometric mean.
Here is an example plot of the information potential for the horse image in the paper, as well as my invariant image. Multiply the x-axis by 3 to get theta (which goes from 0 to 180):
And here is the grayscale image my code outputs for the correct maximum theta (mine is off by 10):
You can see the blockiness that their image doesn't have:
Here is their information potential:
When dividing by the geometric mean, I have tried using NaN and thresholding the image so the smallest possible value is 0.01, but it doesn't seem to change my output.
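For reference, the geometric-mean chromaticity computed in the loop below is

$$
\chi_k = \frac{I_k}{\left(I_R\, I_G\, I_B\right)^{1/3}}, \qquad k \in \{R, G, B\},
$$

after which I take logs and project onto the 2-D plane orthogonal to $(1,1,1)$ to obtain X1 and X2.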
Here is my code:
I = im2double(imread(strname));
[m,n,d] = size(I);
I = max(I, .01);
chrom = zeros(m, n, 3, 'double');
for i = 1:m
    for j = 1:n
%         if ((I(i,j,1)*I(i,j,2)*I(i,j,3)) ~= 0)
        chrom(i,j,1) = I(i,j,1)/((I(i,j,1)*I(i,j,2)*I(i,j,3))^(1/3));
        chrom(i,j,2) = I(i,j,2)/((I(i,j,1)*I(i,j,2)*I(i,j,3))^(1/3));
        chrom(i,j,3) = I(i,j,3)/((I(i,j,1)*I(i,j,2)*I(i,j,3))^(1/3));
%         else
%             chrom(i,j,1) = 1;
%             chrom(i,j,2) = 1;
%             chrom(i,j,3) = 1;
%         end
    end
end
p1 = mat2gray(log(chrom(:,:,1)));
p2 = mat2gray(log(chrom(:,:,2)));
p3 = mat2gray(log(chrom(:,:,3)));
X1 = mat2gray(p1*1/(sqrt(2)) - p2*1/(sqrt(2)));
X2 = mat2gray(p1*1/(sqrt(6)) + p2*1/(sqrt(6)) - p3*2/(sqrt(6)));
maxinf = 0;
maxtheta = 0;
data2 = zeros(1, 61);
for theta = 0:3:180
    M = X1*cos(theta*pi/180) - X2*sin(theta*pi/180);
    s = sqrt(std2(X1)^(2)*cos(theta*pi/180) + std2(X2)^(2)*sin(theta*pi/180));
    s = abs(1.06*s*((m*n)^(-1/5)));
    [m, n] = size(M);
    length = m*n;
    sources = zeros(1, length, 'double');
    count = 1;
    for x = 1:m
        for y = 1:n
            sources(1, count) = M(x, y);
            count = count + 1;
        end
    end
    weights = ones(1, length);
    sigma = 2*s;
    [xc, Ak] = fgt_model(sources, weights, sigma, 10, sqrt(length), 6);
    sum1 = sum(fgt_predict(sources, xc, Ak, sigma, 10));
    sum1 = sum1/sqrt(2*pi*2*s*s);
    data2(theta/3 + 1) = sum1;
    if (sum1 > maxinf)
        maxinf = sum1;
        maxtheta = theta;
    end
end
InvariantImage2 = cos(maxtheta*pi/180)*X1 + sin(maxtheta*pi/180)*X2;
Assume the Fast Gauss Transform is correct.
I don't know whether this makes any difference, as it has been more than a month now, but the blockiness and the different information potential plot are simply caused by the compression of the image you used. You can't expect to get the same results they did using this image, because they used a raw, high-resolution, uncompressed version of it. I have to say I am fairly impressed with your results, especially with implementing the information potential. That thing went over my head a little.
John.