My idea is simple. I am using mexopencv and trying to check whether any object in my current frame matches an image stored in my database. I am using the OpenCV DescriptorMatcher class to train on my images.
Here is the snippet I want to build on. It does one-to-one image matching using mexopencv and can also be extended to an image stream.
function hello
detector = cv.FeatureDetector('ORB');
extractor = cv.DescriptorExtractor('ORB');
matcher = cv.DescriptorMatcher('BruteForce-Hamming');
train = [];
for i=1:3
train(i).img = [];
train(i).points = [];
train(i).features = [];
end;
train(1).img = imread('D:\test\1.jpg');
train(2).img = imread('D:\test\2.png');
train(3).img = imread('D:\test\3.jpg');
for i=1:3
frameImage = train(i).img;
framePoints = detector.detect(frameImage);
frameFeatures = extractor.compute(frameImage , framePoints);
train(i).points = framePoints;
train(i).features = frameFeatures;
end;
for i = 1:3
boxfeatures = train(i).features;
matcher.add(boxfeatures);
end;
matcher.train();
camera = cv.VideoCapture;
pause(3);%Sometimes necessary
window = figure('KeyPressFcn',@(obj,evt)setappdata(obj,'flag',true));
setappdata(window,'flag',false);
while(true)
sceneImage = camera.read;
sceneImage = rgb2gray(sceneImage);
scenePoints = detector.detect(sceneImage);
sceneFeatures = extractor.compute(sceneImage,scenePoints);
m = matcher.match(sceneFeatures);
%{
%Comments in
img_no = m.imgIdx;
img_no = img_no(1);
%I am planning to do this based on the fact that
%on a perfect match imgIdx a 1xN will be filled
%with the index of the training
%example 1,2 or 3
objPoints = train(img_no+1).points;
boxImage = train(img_no+1).img;
ptsScene = cat(1,scenePoints([m.queryIdx]+1).pt);
ptsScene = num2cell(ptsScene,2);
ptsObj = cat(1,objPoints([m.trainIdx]+1).pt);
ptsObj = num2cell(ptsObj,2);
%This is where the problem starts here, assuming the
%above is correct , Matlab yells this at me
%index exceeds matrix dimensions.
end [H,inliers] = cv.findHomography(ptsScene,ptsObj,'Method','Ransac');
m = m(inliers);
imgMatches = cv.drawMatches(sceneImage,scenePoints,boxImage,boxPoints,m,...
'NotDrawSinglePoints',true);
imshow(imgMatches);
%Comment out
%}
flag = getappdata(window,'flag');
if isempty(flag) || flag, break; end
pause(0.0001);
end
The issue is that imgIdx is a 1xN matrix containing the indices of different training images, which is expected. Only on a perfect match is imgIdx completely filled with the index of the matched image. So how do I use this matrix to pick the right image index? Also, in these two lines I get an "index exceeds matrix dimensions" error:
ptsObj = cat(1,objPoints([m.trainIdx]+1).pt);
ptsObj = num2cell(ptsObj,2);
This makes sense: while debugging I saw clearly that the size of m.trainIdx is greater than that of objPoints, i.e. I am accessing points that I should not, hence the error.
There is scant documentation on the use of imgIdx, so I would appreciate help from anybody who knows this subject.
These are the images I used.
Image1
Image2
Image3
1st update after @Amro's response:
With the distance threshold set to 3.6 times the minimum distance, I get the following result.
With the distance threshold set to 1.6 times the minimum distance, I get the following result.
I think it is easier to explain with code, so here it goes :)
%% init
detector = cv.FeatureDetector('ORB');
extractor = cv.DescriptorExtractor('ORB');
matcher = cv.DescriptorMatcher('BruteForce-Hamming');
urls = {
'http://i.imgur.com/8Pz4M9q.jpg?1'
'http://i.imgur.com/1aZj0MI.png?1'
'http://i.imgur.com/pYepuzd.jpg?1'
};
N = numel(urls);
train = struct('img',cell(N,1), 'pts',cell(N,1), 'feat',cell(N,1));
%% training
for i=1:N
% read image
train(i).img = imread(urls{i});
if ~ismatrix(train(i).img)
train(i).img = rgb2gray(train(i).img);
end
% extract keypoints and compute features
train(i).pts = detector.detect(train(i).img);
train(i).feat = extractor.compute(train(i).img, train(i).pts);
% add to training set to match against
matcher.add(train(i).feat);
end
% build index
matcher.train();
%% testing
% let's create a distorted query image from one of the training images
% (rotation+shear transformations)
t = -pi/3; % -60 degrees angle
tform = [cos(t) -sin(t) 0; 0.5*sin(t) cos(t) 0; 0 0 1];
img = imwarp(train(3).img, affine2d(tform)); % try all three images here!
% detect features in query image
pts = detector.detect(img);
feat = extractor.compute(img, pts);
% match against training images
m = matcher.match(feat);
% keep only good matches
%hist([m.distance])
m = m([m.distance] < 3.6*min([m.distance]));
% sort by distances, and keep at most the first/best 200 matches
[~,ord] = sort([m.distance]);
m = m(ord);
m = m(1:min(200,numel(m)));
% naive classification (majority vote)
tabulate([m.imgIdx]) % how many matches each training image received
idx = mode([m.imgIdx]);
% matches with keypoints belonging to chosen training image
mm = m([m.imgIdx] == idx);
% estimate homography (used to locate object in query image)
ptsQuery = num2cell(cat(1, pts([mm.queryIdx]+1).pt), 2);
ptsTrain = num2cell(cat(1, train(idx+1).pts([mm.trainIdx]+1).pt), 2);
[H,inliers] = cv.findHomography(ptsTrain, ptsQuery, 'Method','Ransac');
% show final matches
imgMatches = cv.drawMatches(img, pts, ...
train(idx+1).img, train(idx+1).pts, ...
mm(logical(inliers)), 'NotDrawSinglePoints',true);
% apply the homography to the corner points of the training image
[h,w] = size(train(idx+1).img);
corners = permute([0 0; w 0; w h; 0 h], [3 1 2]);
p = cv.perspectiveTransform(corners, H);
p = permute(p, [2 3 1]);
% show where the training object is located in the query image
opts = {'Color',[0 255 0], 'Thickness',4};
imgMatches = cv.line(imgMatches, p(1,:), p(2,:), opts{:});
imgMatches = cv.line(imgMatches, p(2,:), p(3,:), opts{:});
imgMatches = cv.line(imgMatches, p(3,:), p(4,:), opts{:});
imgMatches = cv.line(imgMatches, p(4,:), p(1,:), opts{:});
imshow(imgMatches)
The result:
Note that since you did not post any test images (in your code you take input from the webcam), I created one by distorting one of the training images and used it as the query image. I am using functions from certain MATLAB toolboxes (imwarp and such), but those are not essential to the demo, and you could replace them with equivalent OpenCV ones.
I must say that this approach is not the most robust one. Consider using other techniques such as the bag-of-words model, which OpenCV already implements.
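As a follow-up to the majority-vote step above, here is a minimal sketch of how imgIdx and trainIdx fit together, using only base MATLAB (no tabulate). The index vectors are made-up values, not real matcher output. The point it illustrates is that trainIdx is only meaningful relative to the training image named by imgIdx, which is why indexing one image's keypoints with all of m.trainIdx can run out of bounds.

```matlab
% Hypothetical match fields (0-based, as mexopencv returns them);
% in real code these would be [m.imgIdx] and [m.trainIdx].
imgIdx   = [2 0 2 2 1 2 0 2];
trainIdx = [5 9 0 3 7 1 4 2];
nTrain   = 3;                      % number of training images (assumed)

% Count how many matches each training image received.
votes = accumarray(imgIdx(:) + 1, 1, [nTrain 1]);

% Majority vote: the training image with the most matches wins.
[~, best] = max(votes);
idx = best - 1;                    % back to the 0-based image index

% Keep only the matches that belong to the winning image, then convert
% their keypoint indices to 1-based for use with train(idx+1).pts.
keep = (imgIdx == idx);
rows = trainIdx(keep) + 1;
```

With these values, image index 2 wins, and rows only contains keypoint indices that belong to that image.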
I'm trying to write an image compression script in MATLAB using a multilevel 3D DWT (color image). Along the way, I want to apply thresholding to the coefficient matrices, using both global and local thresholds.
I would like to use the formula below to calculate my local threshold (written here as it is implemented in my code):
T = sigma * sqrt(2 * log2(N))
where sigma is the variance and N is the number of elements.
Global thresholding works fine, but my problem is that the calculated local threshold is (most often) greater than the maximum band coefficient, so no thresholding is applied.
Everything else works fine and I get a result too, but I suspect the local threshold is miscalculated. Also, the resulting image is larger than the original!
I'd appreciate any help on the correct way to calculate the local threshold, or a pointer to a built-in MATLAB function if one exists.
Here's an example output:
Here's my code:
clear;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%% COMPRESSION %%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% read base image
% dwt 3/5-L on base images
% quantize coeffs (local/global)
% count zero value-ed coeffs
% calculate mse/psnr
% save and show result
% read images
base = imread('circ.jpg');
fam = 'haar'; % wavelet family
lvl = 3; % wavelet depth
% set to 1 to apply global thr
thr_type = 0;
% global threshold value
gthr = 180;
% convert base to grayscale
%base = rgb2gray(base);
% apply dwt on base image
dc = wavedec3(base, lvl, fam);
% extract coeffs
ll_base = dc.dec{1};
lh_base = dc.dec{2};
hl_base = dc.dec{3};
hh_base = dc.dec{4};
ll_var = var(ll_base, 0);
lh_var = var(lh_base, 0);
hl_var = var(hl_base, 0);
hh_var = var(hh_base, 0);
% count number of elements
ll_n = numel(ll_base);
lh_n = numel(lh_base);
hl_n = numel(hl_base);
hh_n = numel(hh_base);
% find local threshold
ll_t = ll_var * (sqrt(2 * log2(ll_n)));
lh_t = lh_var * (sqrt(2 * log2(lh_n)));
hl_t = hl_var * (sqrt(2 * log2(hl_n)));
hh_t = hh_var * (sqrt(2 * log2(hh_n)));
% global
if thr_type == 1
ll_t = gthr; lh_t = gthr; hl_t = gthr; hh_t = gthr;
end
% count zero values in bands
ll_size = size(ll_base);
lh_size = size(lh_base);
hl_size = size(hl_base);
hh_size = size(hh_base);
% count zero values in new band matrices
ll_zeros = sum(ll_base==0,'all');
lh_zeros = sum(lh_base==0,'all');
hl_zeros = sum(hl_base==0,'all');
hh_zeros = sum(hh_base==0,'all');
% initiate new matrices
ll_new = zeros(ll_size);
lh_new = zeros(lh_size);
hl_new = zeros(hl_size);
hh_new = zeros(hh_size);
% apply thresholding on bands
% if new value < thr => 0
% otherwise, keep the previous value
for id=1:ll_size(1)
for idx=1:ll_size(2)
if ll_base(id,idx) < ll_t
ll_new(id,idx) = 0;
else
ll_new(id,idx) = ll_base(id,idx);
end
end
end
for id=1:lh_size(1)
for idx=1:lh_size(2)
if lh_base(id,idx) < lh_t
lh_new(id,idx) = 0;
else
lh_new(id,idx) = lh_base(id,idx);
end
end
end
for id=1:hl_size(1)
for idx=1:hl_size(2)
if hl_base(id,idx) < hl_t
hl_new(id,idx) = 0;
else
hl_new(id,idx) = hl_base(id,idx);
end
end
end
for id=1:hh_size(1)
for idx=1:hh_size(2)
if hh_base(id,idx) < hh_t
hh_new(id,idx) = 0;
else
hh_new(id,idx) = hh_base(id,idx);
end
end
end
% count zeros of the new matrices
ll_new_size = size(ll_new);
lh_new_size = size(lh_new);
hl_new_size = size(hl_new);
hh_new_size = size(hh_new);
% count number of zeros among new values
ll_new_zeros = sum(ll_new==0,'all');
lh_new_zeros = sum(lh_new==0,'all');
hl_new_zeros = sum(hl_new==0,'all');
hh_new_zeros = sum(hh_new==0,'all');
% set new band matrices
dc.dec{1} = ll_new;
dc.dec{2} = lh_new;
dc.dec{3} = hl_new;
dc.dec{4} = hh_new;
% count how many coeff. were thresholded
ll_zeros_diff = ll_new_zeros - ll_zeros;
lh_zeros_diff = lh_zeros - lh_new_zeros;
hl_zeros_diff = hl_zeros - hl_new_zeros;
hh_zeros_diff = hh_zeros - hh_new_zeros;
% show coeff. matrices vs. thresholded version
figure
colormap(gray);
subplot(2,4,1); imagesc(ll_base); title('LL');
subplot(2,4,2); imagesc(lh_base); title('LH');
subplot(2,4,3); imagesc(hl_base); title('HL');
subplot(2,4,4); imagesc(hh_base); title('HH');
subplot(2,4,5); imagesc(ll_new); title({'LL thr';ll_zeros_diff});
subplot(2,4,6); imagesc(lh_new); title({'LH thr';lh_zeros_diff});
subplot(2,4,7); imagesc(hl_new); title({'HL thr';hl_zeros_diff});
subplot(2,4,8); imagesc(hh_new); title({'HH thr';hh_zeros_diff});
% idwt to reconstruct compressed image
cmp = waverec3(dc);
cmp = uint8(cmp);
% calculate mse/psnr
% convert to double first: uint8 subtraction and squaring saturate
D = (double(cmp) - double(base)) .^2;
mse = sum(D(:))/numel(base);
psnr = 10*log10(255*255/mse);
% show images and mse/psnr
figure
subplot(1,2,1);
imshow(base); title("Original"); axis square;
subplot(1,2,2);
imshow(cmp); colormap(gray); axis square;
msg = strcat("MSE: ", num2str(mse), " | PSNR: ", num2str(psnr));
title({"Compressed";msg});
% save image locally
imwrite(cmp, 'compressed.png');
I solved the problem.
The sigma in the local threshold formula is not the variance; it is the standard deviation. I applied these steps:
used std2() (rather than stdfilt()) to find the standard deviation of my coefficient matrices (thanks to @Rotem for pointing this out)
used numel() to count the number of elements in the coefficient matrices
This is a summary of the process for LL; it is the same for the other bands (LH, HL, HH):
[c, s] = wavedec2(image, level, wname); % apply DWT (the level comes before the wavelet name)
ll = appcoeff2(c, s, wname); % find LL
ll_std = std2(ll); % find standard deviation
ll_n = numel(ll); % find number of coeffs in LL
ll_t = ll_std * (sqrt(2 * log2(ll_n))); % local threshold formula
ll_new = ll .* double(ll > ll_t); % thresholding
replace the LL values in c in a for loop
reconstruct by applying IDWT using waverec2
this is a sample output:
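To illustrate why swapping the variance for the standard deviation matters, here is a small sketch with made-up coefficients, using only base MATLAB (std and var instead of std2). The data are hypothetical, and the sketch thresholds on the coefficient magnitude, which is an assumption on my part; the summary above compares the signed value instead.

```matlab
% Hypothetical band coefficients: mostly zeros plus a few large values.
coeffs = [zeros(1,12) 2 -3 50 -60];
n = numel(coeffs);

% Threshold with the standard deviation (as in the answer) ...
t_std = std(coeffs(:)) * sqrt(2 * log2(n));
% ... and with the variance (as in the question).
t_var = var(coeffs(:)) * sqrt(2 * log2(n));

% Hard thresholding on the magnitude: zero everything at or below the threshold.
kept_std = coeffs .* (abs(coeffs) > t_std);
kept_var = coeffs .* (abs(coeffs) > t_var);
```

Here the variance-based threshold lies far above every coefficient magnitude, whereas the standard-deviation-based one falls inside the data range and keeps the largest coefficient.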
I am trying to use this code to generate a Freeman chain code, based on the code in https://www.crisluengo.net/archives/324, but it uses DIPimage. Does anyone have an idea how to bypass the dip_array function?
Code:
clc;
clear all;
Image = rgb2gray(imread('https://upload-icon.s3.us-east-2.amazonaws.com/uploads/icons/png/1606078271536061993-512.png'));
BW = imbinarize(Image);
BW = imfill(BW,'holes');
BW = bwareaopen(BW, 100);
BW = padarray(BW,60,60,'both')
BW = imcomplement(BW);
imshow(BW)
[B,L] = bwboundaries(BW,'noholes');
%%%%%%%https://www.crisluengo.net/archives/324%%%%%
directions = [ 1, 0
1,-1
0,-1
-1,-1
-1, 0
-1, 1
0, 1
1, 1];
indx = find(dip_array(img),1)-1;
sz = imsize(img);
start = [floor(indx/sz(2)),0];
start(2) = indx-(start(1)*sz(2));
cc = []; % The chain code
coord = start; % Coordinates of the current pixel
dir = 1; % The starting direction
while 1
newcoord = coord + directions(dir+1,:);
if all(newcoord>=0) && all(newcoord<sz) ...
&& img(newcoord(1),newcoord(2))
cc = [cc,dir];
coord = newcoord;
dir = mod(dir+2,8);
else
dir = mod(dir-1,8);
end
if all(coord==start) && dir==1 % back to starting situation
break;
end
end
I don't want to translate the whole code, I don't have time right now, but I can give a few pointers:
dip_array(img) extracts the MATLAB array with the pixel values that is inside the dip_image object img. If you use BW as input image here, you can simply remove the call to dip_array: indx = find(BW,1)-1.
imsize(img) returns the sizes of the image img. The MATLAB function size is equivalent (in this particular case).
Dimensions for the dip_image object are different from those for MATLAB arrays: they are indexed as img(x,y), whereas MATLAB arrays are indexed as BW(y,x).
Indices for dip_image objects start at 0, not at 1 as MATLAB arrays do.
These last two points change how you'd compute start. I think it'd be something like this:
indx = find(BW,1);
sz = size(BW);
start = [1,floor((indx-1)/sz(1))+1];
start(1) = indx-((start(2)-1)*sz(1));
But it's easier to use ind2sub (not sure why I did the explicit calculation in the blog post):
indx = find(BW,1);
sz = size(BW);
[row,col] = ind2sub(sz,indx); % ind2sub needs two outputs to return subscripts
start = [row,col];
You also probably want to swap the two columns of directions for the same reason, and change all(newcoord>=0) && all(newcoord<sz) into all(newcoord>0) && all(newcoord<=sz).
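As a quick check of the start-pixel computation discussed above, here is a small sketch on a made-up binary image. It compares ind2sub with the explicit column-major arithmetic; the image contents are hypothetical.

```matlab
% Hypothetical 4-by-5 binary image with two foreground pixels.
BW = false(4,5);
BW(3,2) = true;
BW(2,4) = true;

% Linear index of the first foreground pixel (column-major order).
indx = find(BW,1);
sz = size(BW);

% Subscripts via ind2sub (two outputs are required).
[r,c] = ind2sub(sz, indx);
start = [r,c];

% The same result computed explicitly: MATLAB stores arrays column by
% column, so the divisor is the number of rows, sz(1).
col = floor((indx-1)/sz(1)) + 1;
row = indx - (col-1)*sz(1);
```

Both approaches give the same (row, column) pair, in MATLAB's BW(y,x) order.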
I = imread('Sub1.png');
figure, imshow(I);
I = imcomplement(I);
I = double(I)/255;
I = adapthisteq(I,'clipLimit',0.0003,'Distribution','exponential');
k = 12;
beta = 2;
maxIter = 100;
for i=1:length(beta)
[seg,prob,mu,sigma,it(i)] = ICM(I, k, beta(i), maxIter,5);
pr(i) = prob(end);
hold on;
end
figure, imshow(seg,[]);
and ICM function is defined as
function [segmented_image,prob,mu,sigma,iter] = ICM(image, k, beta, max_iterations, neigh)
[width, height, bands] = size(image);
image = imstack2vectors(image);
segmented_image = init(image,k,1);
clear c;
iter = 0;
seg_old = segmented_image;
while(iter < max_iterations)
[mu, sigma] = stats(image, segmented_image, k);
E1 = energy1(image,mu,sigma,k);
E2 = energy2(segmented_image, beta, width, height, k);
E = E1 + E2;
[p2,~] = min(E2,[],2);
[p1,~] = min(E1,[],2);
[p,segmented_image] = min(E,[],2);
prob(iter+1) = sum(p);
%find mismatch with previous step
[c,~] = find(seg_old~=segmented_image);
mismatch = (numel(c)/numel(segmented_image))*100;
if mismatch<0.1
iter
break;
end
iter = iter + 1;
seg_old = segmented_image;
end
segmented_image = reshape(segmented_image,[width height]);
end
Output of my algorithm is a logical matrix (seg) of size 305-by-305. When I use
imshow(seg,[]);
I am able to display the image. It shows the different components with varying gray values, but bwlabel returns 1. I want to display the connected components. I think bwlabel thresholds the image to 1. unique(seg) returns the values 1 to 10, since the number of classes used in k-means is 10. I used
[label n] = bwlabel(seg);
RGB = label2rgb(label);
figure, imshow(RGB);
I need all the ellipse-like structures which are in between the two squares close to the middle of the image. I don't know the number of classes present in it.
Input image:
Ground truth:
My output:
If you want to split the label image into separate connected components, you need to loop over the classes, label each class's mask separately, and sum the resulting label images to get the output label image.
u = unique(seg(:));
out = zeros(size(seg));
num_objs = 0;
for k = 1: numel(u)
mask = seg==u(k);
[L,N] = bwlabel(mask);
L(mask) = L(mask) + num_objs;
out = out + L;
num_objs = num_objs + N ;
end
mp = jet(num_objs);
figure,imshow(out,mp)
Something like this is produced:
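To see why bwlabel alone reports a single component, here is a tiny sketch on a made-up class map (the matrix is hypothetical). bwlabel treats every nonzero pixel as foreground, so touching classes merge into one object; labeling each class mask separately keeps them apart.

```matlab
% Hypothetical class map with three classes that touch each other.
seg = [1 1 2 1 1;
       1 1 2 1 1;
       3 3 2 3 3];

% Labeling the map directly: all nonzero pixels form one object.
[~, nAll] = bwlabel(seg);

% Labeling each class on its own and accumulating the object count.
u = unique(seg(:));
total = 0;
for k = 1:numel(u)
    [~, N] = bwlabel(seg == u(k));
    total = total + N;
end
```

For this map the direct call finds one object, while the per-class loop finds five (two for class 1, one for class 2, two for class 3).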
I have tried to do everything from scratch. I hope it is of some help.
I have a processing chain that first extracts contours, with parameters tuned by trial and error, I confess. The last image is given at the bottom; with it, you can easily select the connected components and, for example, do a reconstruction by markers using the imreconstruct operator.
clear all;close all;
I = imread('C:\Users\jean-marie.becker\Desktop\imagesJPG10\spinalchord.jpg');
figure,imshow(I);
J = I(:,:,1);% select the first channel (red in MATLAB's RGB order)
J=double(J<50);% I haven't inverted the image
figure, imshow(J);
se = strel('disk',5);
J=J-imopen(J,se);
figure, imshow(J);
J=imopen(J,ones(1,15));% favors long horizontal strokes
figure, imshow(J);
K=imdilate(J,ones(20,1),'same');
% vertically connects horizontal "segments" that are not too far apart
figure, imshow(K);
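The reconstruction by markers mentioned above could look like the following minimal sketch. The mask and marker are made-up toy arrays, not derived from the actual image; in practice the mask would be a binary image such as K and the marker a few seed pixels inside the components you want to keep.

```matlab
% Hypothetical binary mask with two separate components.
mask = logical([1 1 0 0 0;
                1 1 0 1 1;
                0 0 0 1 1]);

% Marker: a single seed pixel inside the first component.
marker = false(size(mask));
marker(1,1) = true;

% Morphological reconstruction grows the marker inside the mask,
% so only the component containing the seed is recovered.
rec = imreconstruct(marker, mask);
```

Only the component that contains the seed survives; the other one is discarded.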
I am doing real-time people detection using a HOG-LBP descriptor, with a sliding-window detector and LibSVM as the classifier. However, after classification I never get multiple detected people; sometimes there is only one detection and sometimes none. I guess I have a problem in my classification step. Here is my classification code:
label = ones(length(featureVector),1);
P = cell2mat(featureVector);
% each row of P' correspond to a window
% classifying each window
[~, predictions] = svmclassify(P', label,model);
% set the threshold for getting multiple detection
% the threshold value is 0.7
get_detect = predictions.*[predictions>0.6];
% rows, columns and values of the nonzero (retained) scores
[r,c,v]= find(get_detect);
%% Creating the bounding box for detection
for ix=1:length(r)
rects{ix}= boxPoint{r(ix)};
end
if (isempty(rects))
rects2=[];
else
rects2 = cv.groupRectangles(rects,3,'EPS',0.35);
end
for i = 1:numel(rects2)
rectangle('Position',[rects2{i}(1),rects2{i}(2),64,128], 'LineWidth',2,'EdgeColor','y');
end
I have posted my whole code here: [HOG with SVM] (sliding window technique for multiple people detection)
I really need help with this. Thanks.
If you have problems with the sliding window, you can use this code:
topLeftRow = 1;
topLeftCol = 1;
[bottomRightCol bottomRightRow d] = size(im);
fcount = 1;
% this for loop scans the entire image and extracts features for each sliding window
for y = topLeftCol:bottomRightCol-wSize(2)
for x = topLeftRow:bottomRightRow-wSize(1)
p1 = [x,y];
p2 = [x+(wSize(1)-1), y+(wSize(2)-1)];
po = [p1; p2];
img = imcut(po,im);
featureVector{fcount} = HOG(double(img));
boxPoint{fcount} = [x,y];
fcount = fcount+1;
x = x+1;
end
end
lebel = ones(length(featureVector),1);
P = cell2mat(featureVector);
% each row of P' correspond to a window
[~, predictions] = svmclassify(P',lebel,model); % classifying each window
[a, indx]= max(predictions);
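Taking max(predictions) yields only the single best window. To keep several detections, one option is to threshold the scores instead. The sketch below uses made-up scores and window corners; the 0.6 threshold and the 64-by-128 window size are taken from the question, and everything else is hypothetical.

```matlab
% Hypothetical classifier scores, one per sliding window.
predictions = [0.10; 0.85; 0.65; -0.30; 0.72];
% Hypothetical top-left corners [x,y] of the corresponding windows.
boxPoint = {[1,1]; [9,1]; [17,1]; [25,1]; [33,1]};

thr = 0.6;                         % score threshold (from the question)
keep = find(predictions > thr);    % indices of all windows above threshold

% Build one [x y width height] rectangle per retained window.
rects = cell(numel(keep), 1);      % preallocated, so it is defined even if empty
for ix = 1:numel(keep)
    rects{ix} = [boxPoint{keep(ix)}, 64, 128];
end
```

The resulting rects cell array can then be passed to a grouping step such as cv.groupRectangles, as in the question.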
I implemented a method for removing shadows based on invariant color features, from the paper Entropy Minimization for Shadow Removal. My implementation sometimes yields similar computational results, but they are always off, and my grayscale image is blocky, maybe because I am taking the geometric mean incorrectly.
Here is an example plot of the information potential for the horse image in the paper, as well as my invariant image. Multiply the x-axis by 3 to get theta (which goes from 0 to 180):
And here is the grayscale image my code outputs for the correct maximum theta (mine is off by 10):
You can see the blockiness that their image doesn't have:
Here is their information potential:
When dividing by the geometric mean, I have tried using NaN and thresholding the image so that the smallest possible value is .01, but it doesn't seem to change my output.
Here is my code:
I = im2double(imread(strname));
[m,n,d] = size(I);
I = max(I, .01);
chrom = zeros(m, n, 3, 'double');
for i = 1:m
for j = 1:n
% if ((I(i,j,1)*I(i,j,2)*I(i,j,3))~= 0)
chrom(i,j, 1) = I(i,j,1)/((I(i,j,1)*I(i,j,2)*I(i,j, 3))^(1/3));
chrom(i,j, 2) = I(i,j,2)/((I(i,j,1)*I(i,j,2)*I(i,j, 3))^(1/3));
chrom(i,j, 3) = I(i,j,3)/((I(i,j,1)*I(i,j,2)*I(i,j, 3))^(1/3));
% else
% chrom(i,j, 1) = 1;
% chrom(i,j, 2) = 1;
% chrom(i,j, 3) = 1;
% end
end
end
p1 = mat2gray(log(chrom(:,:,1)));
p2 = mat2gray(log(chrom(:,:,2)));
p3 = mat2gray(log(chrom(:,:,3)));
X1 = mat2gray(p1*1/(sqrt(2)) - p2*1/(sqrt(2)));
X2 = mat2gray(p1*1/(sqrt(6)) + p2*1/(sqrt(6)) - p3*2/(sqrt(6)));
maxinf = 0;
maxtheta = 0;
data2 = zeros(1, 61);
for theta = 0:3:180
M = X1*cos(theta*pi/180) - X2*sin(theta*pi/180);
s = sqrt(std2(X1)^(2)*cos(theta*pi/180) + std2(X2)^(2)*sin(theta*pi/180));
s = abs(1.06*s*((m*n)^(-1/5)));
[m, n] = size(M);
length = m*n;
sources = zeros(1, length, 'double');
count = 1;
for x=1:m
for y = 1:n
sources(1, count) = M(x , y);
count = count + 1;
end
end
weights = ones(1, length);
sigma = 2*s;
[xc , Ak] = fgt_model(sources , weights , sigma , 10, sqrt(length) , 6 );
sum1 = sum(fgt_predict(sources , xc , Ak , sigma , 10 ));
sum1 = sum1/sqrt(2*pi*2*s*s);
data2(theta/3 + 1) = sum1;
if (sum1 > maxinf)
maxinf = sum1;
maxtheta = theta;
end
end
InvariantImage2 = cos(maxtheta*pi/180)*X1 + sin(maxtheta*pi/180)*X2;
Assume the Fast Gauss Transform is correct.
I don't know whether this still makes any difference, since it has been more than a month, but the blockiness and the different information potential plot are simply caused by compression of the image you used. You can't expect to get the same results with this image as they did, because they used a raw, high-resolution, uncompressed version of it. I have to say I am fairly impressed with your results, especially with implementing the information potential. That part went over my head a little.
John.
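As a side note on the geometric-mean step in the question, the per-pixel loops can be written in vectorized form. This is only a sketch on a made-up 2-by-2 RGB image; it performs the same division by the geometric mean and is not expected to change the blockiness discussed above.

```matlab
% Hypothetical 2-by-2 RGB image in [0,1]; pixel (2,1) is pure gray.
I = cat(3, [0.2 0.4; 0.6 0.8], ...
           [0.1 0.5; 0.6 0.9], ...
           [0.4 0.2; 0.6 0.7]);
I = max(I, 0.01);                  % same lower clamp as in the question

% Per-pixel geometric mean of the three channels.
gm = prod(I, 3) .^ (1/3);

% Divide every channel by the geometric mean (bsxfun expands gm over channels).
chrom = bsxfun(@rdivide, I, gm);
```

By construction, the product of the three chromaticity channels is 1 at every pixel, which is a handy sanity check.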