What is the best way to determine if, and which, logo is present in an image?

I'm currently trying to implement a tool that separates images based on which logo is present in them. I've tried using OpenCV for this; while I did succeed in roughly locating the logo in most cases, the results are highly inaccurate, and the ratio of "good" matches to total matches makes it hard to use effectively for my purposes. The code I'm using right now is below.
import cv2
import numpy as np

def getScore(logo, document, outputDirectory):
    MIN_MATCH_COUNT = 8
    img1 = cv2.imread(logo, cv2.IMREAD_GRAYSCALE)      # queryImage (the logo)
    img2 = cv2.imread(document, cv2.IMREAD_GRAYSCALE)  # trainImage (the document)

    # Initiate SIFT detector
    # (nfeatures, nOctaveLayers, contrastThreshold, edgeThreshold, sigma)
    sift = cv2.SIFT_create(0, 4, 0.04, 10, 1.6)

    # find the keypoints and descriptors with SIFT
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)
    if des1 is None or des2 is None or len(des1) < 2 or len(des2) < 2:
        return 0, 0

    FLANN_INDEX_KDTREE = 1  # KD-tree index is 1 in FLANN (0 is the linear index)
    index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
    search_params = dict(checks=50)
    flann = cv2.FlannBasedMatcher(index_params, search_params)
    matches = flann.knnMatch(des1, des2, k=2)

    # store all the good matches as per Lowe's ratio test
    good = []
    for m, n in matches:
        if m.distance < 0.7 * n.distance:
            good.append(m)

    if len(good) > MIN_MATCH_COUNT:
        src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
        M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
        if M is not None:
            matchesMask = mask.ravel().tolist()
            # project the logo outline into the document image (white, since img2 is grayscale)
            h, w = img1.shape
            pts = np.float32([[0, 0], [0, h - 1], [w - 1, h - 1], [w - 1, 0]]).reshape(-1, 1, 2)
            dst = cv2.perspectiveTransform(pts, M)
            img2 = cv2.polylines(img2, [np.int32(dst)], True, 255, 3, cv2.LINE_AA)
            # cv2.imwrite(outputDirectory + str(random.randint(0, 3000)) + ".jpg", img2)
    else:
        matchesMask = None

    return len(good), len(matches)
I'm looking for a robust way to determine whether a logo is present and, if so, which one, preferably with a way to judge the confidence of the result. (The logos that should be recognized are contained in a directory; I'm not expecting it to find them on the internet.) Is there any way to do this with a tool like OpenCV, which I'm using now, or is there no way to achieve this other than relying on services like Azure Custom Vision?
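Roughly what I have in mind is something like the sketch below, built on the getScore function above (illustrative only: the directory layout, the use of the good-to-total match ratio as a "confidence", and the cutoff are placeholder assumptions, not something I already have working):

import os

def findLogo(document, logoDirectory, minConfidence=0.05):
    # Score the document against every logo file and keep the best match.
    best_name, best_conf = None, 0.0
    for name in os.listdir(logoDirectory):
        good, total = getScore(os.path.join(logoDirectory, name), document, ".")
        confidence = good / total if total else 0.0  # crude confidence: good-to-total ratio
        if confidence > best_conf:
            best_name, best_conf = name, confidence
    if best_conf < minConfidence:
        return None, best_conf  # treat as "no logo recognised"
    return best_name, best_conf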

Related

How to count the number of occurrences of pixel intensities in an image without using a for loop?

I am writing a script for histogram equalisation and I need to work on each RGB plane separately. In the first step I count the number of occurrences of each intensity value in the range 0-255. As far as I know, using for loops makes MATLAB code very slow. This is what I came up with:
org_image = imread('image.jpg');
tot_pixel = size(org_image,1) * size(org_image,2);
R = org_image(:,:,1);
G = org_image(:,:,2);
B = org_image(:,:,3);
[R_val_ocurr,R_unique_val] = histcounts(R);
[G_val_ocurr,G_unique_val] = histcounts(G);
[B_val_ocurr,B_unique_val] = histcounts(B);
Now, to get an array of size 256 in which each index holds the number of pixels with that intensity, what should my next step be? My logic so far:
for i = 0 to 255
    if i is in R_unique_val
        hist[i] = R_val_ocurr[i]
    else
        hist[i] = 0
How to correctly and efficiently write this in MATLAB?
After you have separated the channels, you can use imhist to get the histogram of each channel:
[NumberOfPixelR, intensity] = imhist(R);
[NumberOfPixelG, intensity] = imhist(G);
[NumberOfPixelB, intensity] = imhist(B);
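(As an aside, for comparison only and not in MATLAB: in Python/NumPy the same fixed-length 256-bin count can be obtained with np.bincount; the random array below is just a stand-in for the real image.)

import numpy as np

img = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)  # stand-in uint8 image
hist_R = np.bincount(img[:, :, 0].ravel(), minlength=256)  # 256-entry count for the R plane
hist_G = np.bincount(img[:, :, 1].ravel(), minlength=256)
hist_B = np.bincount(img[:, :, 2].ravel(), minlength=256)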

Normalization of an image dataset after processing

I am trying to normalize images in a dataset after processing them, but the min/max ranges differ from image to image after processing (for example, one image ends up in the range [0.38, 5.26] and another in [-0.44, 3.65]). Normalizing each one to [0, 1] with the usual per-image approach therefore makes the images inconsistent with each other.
imagesPath = '/home/berkanhoke/Datasets/Freiburg/Org/Night/';
outFolderPath = '/home/berkanhoke/Datasets/Freiburg/Maddern/Night';
imageSet = dir(strcat(imagesPath,'*.jpeg'));
imageCount = length(imageSet);
for i = 1:imageCount
    fileName = imageSet(i).name;
    filePath = strcat(imagesPath,fileName);
    img = double(imread(filePath));
    I_old = maddern(img,0.3975);
    I_new = (I_old - min(I_old(:)))/(max(I_old(:)) - min(I_old(:)));
    writePath = strcat(outFolderPath,fileName);
    imwrite(I_new,writePath,'jpeg');
end
The function I use for processing is the following:
function [ ii_image ] = maddern( image, alpha )
ii_image = 0.5 + log(image(:,:,2)+1) ...
           - alpha * log(image(:,:,3)+1) ...
           - (1-alpha) * log(image(:,:,1)+1);
which is based on the paper: http://www.robots.ox.ac.uk/~mobile/Papers/2014ICRA_maddern.pdf
I tried normalizing with respect to the min/max of the whole dataset, but it did not work and I got weird results. How can I normalize the images so that they stay consistent with each other after processing?
The problem is that when you call min or max in MATLAB, it operates along only one dimension at a time. So if you have a 256x256 image, min(image) gives you a 1x256 vector, and when you then divide by this you are doing (256x256)/(1x256), which yields a 256x1 result.
To fix this, you'll want to use min(min(image)) (or, equivalently, min(image(:))).
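(The same distinction, purely as an aside in Python/NumPy terms rather than MATLAB: reducing along one axis versus reducing the whole array.)

import numpy as np

A = np.arange(12).reshape(3, 4)  # stand-in "image"
per_column_min = A.min(axis=0)   # analogue of MATLAB's min(A): one value per column
global_min = A.min()             # analogue of min(min(A)) or min(A(:)): a single scalar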

Get the 5 most similar images

I would like to find the 5 images most similar to an input image.
To do this I thought of using SIFT (via the VLFeat library) and comparing the respective descriptors.
So I use the vl_ubcmatch function to calculate the similarity measure between the images.
This is the code:
path_dir = './img/';
imgs = dir(path_dir);
imgs = imgs(3 : end);
numImgs = size(imgs);
numImgs = numImgs(1);
path1 = './img/car01.jpg';
Ia = imread(path1);
Ia = single(rgb2gray(Ia));
[fa, da] = vl_sift(Ia);
results = struct;
m = 0;
j = 1; % image index (for the loop below)
for img = imgs'
    path = strcat(path_dir, img.name);
    if(strcmp(path1, path) == 0)
        Ib = imread(path);
        Ib = single(rgb2gray(Ib));
        [fb, db] = vl_sift(Ib);
        [matches, scores] = vl_ubcmatch(da, db);
        s = sum(scores);
        [r, c] = size(scores);
        m = s ./ c;
        results(j).measure = m;
        results(j).img = path;
        j = j + 1;
    end
end
As you can see from the code, I thought I would use the mean as a measure of similarity, but the results I get are not satisfactory (for example, it tells me that an input image of a cup is more similar to a tree than to another cup).
In your opinion, is it better to have more matching descriptors with lower similarity, or fewer matching descriptors with greater similarity?
I have 50 images of 5 different categories (cups, trees, people, tables and cars) and, given an image as input, the program will return the 5 most similar images to it and preferably belonging to the same category.
What measurement can I use instead of the mean to get a more precise classification?
Thanks!
According to your code you measure the similarity between image (Ia) and all other images (Ib). Therefore you compare the SIFT descriptors of Ia with those of all Ib's - which gives you a list of feature matches for each image pair (matches) and the Euclidean distance of each feature pair (scores).
Now, using the mean of all scores for an image pair as a measure of similarity is not a very robust approach, because an image pair with only one feature match could (by chance) end up with a better "similarity" than an image pair with many matches - which I guess is not what you want for your task.
Concerning your question: it is always better to have a few meaningful/robust descriptors (of course, the more the better!) than a lot of meaningless ones.
Proposal: why don't you just count the number of inliers (= the number of feature matches for each image pair, numel(matches))?
This should give more inliers between images of the same object than between images of different objects, so taking the pairs with the 5 highest inlier counts should give you the most similar images.
If you just want to distinguish a cup from a tree, this should work. If your classification task gets harder and you need to distinguish, say, different types of trees, SIFT is not the best algorithm to use; a learning approach will give better results... but that depends on your task.
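For illustration only, here is a rough sketch of the inlier-counting idea written with OpenCV in Python rather than VLFeat (the folder layout, the 0.7 ratio-test threshold, and the function name are assumptions, not part of your code):

import os
import cv2

def top5_similar(query_path, image_dir):
    sift = cv2.SIFT_create()
    query = cv2.imread(query_path, cv2.IMREAD_GRAYSCALE)
    _, des_q = sift.detectAndCompute(query, None)
    bf = cv2.BFMatcher()  # brute-force matcher with the default L2 norm
    scores = []
    for name in os.listdir(image_dir):
        path = os.path.join(image_dir, name)
        if path == query_path:
            continue
        other = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        _, des_o = sift.detectAndCompute(other, None)
        if des_o is None or len(des_o) < 2:
            continue
        matches = bf.knnMatch(des_q, des_o, k=2)
        # count ratio-test survivors instead of averaging match distances
        good = sum(1 for m, n in matches if m.distance < 0.7 * n.distance)
        scores.append((good, name))
    return sorted(scores, reverse=True)[:5]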

connected component analysis error

I'm trying to do connected component analysis, but I'm getting errors. I need the vertebral body, but I'm getting some other objects as well.
Image is:
Result is:
im= imread('im.bmp');
figure,imshow(im);
K1=imadjust(im);
figure, imshow(K1), title('After Adjustment Image')
threshold = graythresh(K1);
originalImage = im2bw(K1, threshold);
originalImage = bwareaopen(originalImage,100);
se = strel('disk', 2); % structuring element
closeBW = imclose(originalImage,se);
figure,imshow(closeBW);
CC = bwconncomp(closeBW);
L = labelmatrix(CC);
L2 = bwlabel(K1);
figure, imshow(label2rgb(L));
Segmentation isn't my area, so I'm not sure what the best approach is. Here are a couple heuristic ideas I came up with:
Discard regions that are too big or too small.
It looks like you can expect a certain size from the vertebra.
regionIdxs = unique(L(:));
regionSizes = accumarray(L(:)+1,1);
If we look at regionSizes, we see the region sizes in pixels:
213360
919
887
810
601
695
14551
684
1515
414
749
128
173
26658
The regions you want (rows 2-6) are in the range 500-1000 pixels. We can probably safely discard regions that are <200 or >2000 pixels in size.
goodRegionIdx = (regionSizes>200) & (regionSizes<2000);
regionIdxs = regionIdxs(goodRegionIdx);
regionSizes = regionSizes(goodRegionIdx);
Look at the image moments of the desired regions.
The eigenvalues of the covariance matrix of a distribution characterize its size in its widest direction and its size perpendicular to that direction. We are looking for fat disk-shapes, so we can expect a big eigenvalue and a medium-sized eigenvalue.
[X,Y] = meshgrid(1:size(L,2),1:size(L,1));
for i = 1:length(regionIdxs)
    idx = regionIdxs(i);
    region = L==idx;
    totalmass = sum(region(:));
    Ex(i) = sum( X(1,:).*sum(region,1) ) / totalmass;
    Ey(i) = sum( Y(:,1).*sum(region,2)) / totalmass;
    Exy(i) = sum(sum( X.*Y.*region )) / totalmass;
    Exx(i) = sum(sum( X.*X.*region )) / totalmass;
    Eyy(i) = sum(sum( Y.*Y.*region )) / totalmass;
    Varx(i) = Exx(i) - Ex(i)^2;
    Vary(i) = Eyy(i) - Ey(i)^2;
    Varxy(i) = Exy(i) - Ex(i)*Ey(i);
    Cov = [Varx(i) Varxy(i); Varxy(i) Vary(i)];
    eig(i,:) = eigs(Cov);
end
If we look at the eigenvalues eig:
177.6943 30.8029
142.4484 35.9089
164.6374 26.2081
112.6501 22.7570
138.1674 24.1569
89.8082 58.8964
284.2280 96.9304
83.3226 15.9994
113.3122 33.7410
We are only interested in rows 1-5, which have a largest eigenvalue in the range 100-200 and a second eigenvalue below 50. If we discard the others, we get the following regions:
goodRegionIdx = (eig(:,1)>100) & (eig(:,1)<200) & (eig(:,2)<50);
regionIdxs = regionIdxs(goodRegionIdx);
We can plot the regions by using logical OR |.
finalImage = false(size(L));
for i = 1:length(regionIdxs)
    finalImage = finalImage | (L==regionIdxs(i) );
end
We seem to get one false positive. Looking at the ratio of the eigenvalues, eig(:,1)./eig(:,2), is one idea, but that seems to be a little problematic too.
You could try some sort of outlier detection, like RANSAC, to try to eliminate the regions you don't want, since true vertebrae tend to be spatially aligned along a line or curve.
I'm not sure what else to suggest. You may have to look into more advanced segmentation methods like machine learning if you can't find another way to discriminate the good from the bad. Having a stricter preprocessing method might be one thing to try.
Hope that helps.

Using PCA before classification

I am using PCA to reduce the number of features before training a Random Forest. I first used around 70 principal components out of 125, which accounted for around 99% of the energy (according to the eigenvalues). I got much worse results after training the Random Forest on the transformed features. After that I used all the principal components and got the same results as with 70. This made no sense to me, since it is the same feature space, only in a different basis (the space has only been rotated, so that should not affect the decision boundary).
Does anyone have the idea what may be the problem here?
Here is my code
clc;
clear all;
close all;
load patches_training_256.txt
load patches_testing_256.txt
Xtr = patches_training_256(:,2:end);
Xtr = Xtr';
Ytr = patches_training_256(:,1);
Ytr = Ytr';
Xtest = patches_testing_256(:,2:end);
Xtest = Xtest';
Ytest = patches_testing_256(:,1);
Ytest = Ytest';
data_size = size(Xtr, 2);
feature_size = size(Xtr, 1);
mu = mean(Xtr,2);
sigma = std(Xtr,0,2);
mu_mat = repmat(mu,1,data_size);
sigma_mat = repmat(sigma,1,data_size);
cov = ((Xtr - mu_mat)./sigma_mat) * ((Xtr - mu_mat)./sigma_mat)' / data_size;
[v d] = eig(cov);
%[U S V] = svd(((Xtr - mu_mat)./sigma_mat)');
k = 124;
%Ureduce = U(:,1:k);
%XtrReduce = ((Xtr - mu_mat)./sigma_mat) * Ureduce;
XtrReduce = v'*((Xtr - mu_mat)./sigma_mat);
B = TreeBagger(300, XtrReduce', Ytr', 'Prior', 'Empirical', 'NPrint', 1);
data_size_test = size(Xtest, 2);
mu_test = repmat(mu,1,data_size_test);
sigma_test = repmat(sigma,1,data_size_test);
XtestReduce = v' * ((Xtest - mu_test) ./ sigma_test);
Ypredict = predict(B,XtestReduce');
error = sum(Ytest' ~= (double(cell2mat(Ypredict)) - 48))
Random Forest heavily depends on the choice of basis. It is not a linear model (which would be, up to normalization, rotation invariant); an RF completely changes its behaviour once you "rotate the space". The reason is that it uses decision trees as base classifiers, which analyze each feature completely independently, so it fails to find any linear combination of features. Once you rotate your space, you change the "meaning" of the features. There is nothing wrong with that; tree-based classifiers are simply a rather bad choice to apply after such transformations. Use feature selection methods instead (methods which select valuable features without creating linear combinations of them). In fact, RFs themselves can be used for this task thanks to their internal "feature importance" computation.
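As a rough sketch of that feature-selection idea (in Python with scikit-learn rather than MATLAB, and with placeholder data, so purely illustrative):

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Placeholder data standing in for the real patches (125 features per sample).
X = np.random.rand(500, 125)
y = np.random.randint(0, 2, size=500)

# Fit a forest on the ORIGINAL features and rank them by importance,
# instead of projecting onto principal components first.
rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)
top = np.argsort(rf.feature_importances_)[::-1][:70]  # indices of the 70 most important features
X_reduced = X[:, top]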
There is already a MATLAB function, princomp, which will do PCA for you. I would suggest not falling into numerical-error pitfalls by rolling your own; they have already done it for us. :)