I'm trying to perform object detection with R-CNN on my own dataset, following the tutorial on the MATLAB website. Based on the picture below:
I'm supposed to put image paths in the first column and the bounding box of each object in the following columns. But in each of my images, there is more than one object of each kind. For example there are 20 vehicles in one image. How should I deal with that? Should I create a separate row for each instance of vehicle in an image?
The example found on the website finds the pixel neighbourhood with the largest score and draws a bounding box around that region in the image. Having multiple objects complicates things. There are two approaches you can use to find multiple objects.
Find all bounding boxes with scores that surpass some global threshold.
Find the bounding box with the largest score, then keep those bounding boxes whose scores surpass a percentage of this maximum. The percentage is arbitrary, but from experience and what I have seen in practice, people tend to choose between 80% and 95% of the largest score found in the image. This will of course give you false positives if you submit a query image containing objects the classifier was not trained to detect, so you will have to implement some more post-processing logic on your end.
An alternative approach would be to choose some value k and display the top k bounding boxes associated with the k highest scores. This of course requires that you know the value of k beforehand, and like the second approach it always assumes that at least one object has been found in the image.
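The top-k idea can be sketched in a few lines of MATLAB. This is only a sketch: it assumes `bboxes`, `score` and `label` come from a previous call to `detect`, and the value of `k` is hypothetical.

```matlab
% Sketch only: keep the k highest-scoring detections.
% Assumes bboxes, score and label come from a previous call to detect.
k = 5;                                   % hypothetical choice of k
[~, order] = sort(score, 'descend');     % sort detections by score
keep = order(1 : min(k, numel(order)));  % guard against fewer than k boxes
topBoxes  = bboxes(keep, :);
topScores = score(keep);
topLabels = label(keep);
```

The `min(k, numel(order))` guard simply avoids an indexing error when the detector returns fewer than k boxes.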
In addition to the above logic, the approach that you state where you need to create a separate row for each instance of vehicle in the image is correct. This means that if you have multiple candidates of an object in a single image, you would need to introduce one row per instance while keeping the image filename the same. Therefore, if you had for example 20 vehicles in one image, you would need to create 20 rows in your table where the filename is all the same and you would have a single bounding box specification for each distinct object in that image.
Once you have done this, and assuming that you have already trained the R-CNN detector and want to use it, the original code to detect objects from the website is the following:
% Read test image
testImage = imread('stopSignTest.jpg');
% Detect stop signs
[bboxes, score, label] = detect(rcnn, testImage, 'MiniBatchSize', 128);
% Display the detection results
[score, idx] = max(score);
bbox = bboxes(idx, :);
annotation = sprintf('%s: (Confidence = %f)', label(idx), score);
outputImage = insertObjectAnnotation(testImage, 'rectangle', bbox, annotation);
figure
imshow(outputImage)
This only works for the one object with the highest score. If you wanted to do this for multiple objects, you would use the score output from the detect method and find the locations that satisfy either situation 1 or situation 2.
If you had situation 1, you would modify it to look like the following.
% Read test image
testImage = imread('stopSignTest.jpg');
% Detect stop signs
[bboxes, score, label] = detect(rcnn, testImage, 'MiniBatchSize', 128);
% New - Find those bounding boxes that surpassed a threshold
T = 0.7; % Define threshold here
idx = score >= T;
% Retrieve those scores that surpassed the threshold
s = score(idx);
% Do the same for the labels as well
lbl = label(idx);
bbox = bboxes(idx, :); % This logic doesn't change
% New - Loop through each box and print out its confidence on the image
outputImage = testImage; % Make a copy of the test image to write to
for ii = 1 : size(bbox, 1)
annotation = sprintf('%s: (Confidence = %f)', lbl(ii), s(ii)); % Change
outputImage = insertObjectAnnotation(outputImage, 'rectangle', bbox(ii,:), annotation); % New - Choose the right box
end
figure
imshow(outputImage)
Note that I've kept the original bounding boxes, labels and scores in their original variables, while the subset that surpassed the threshold goes into separate variables, in case you want to cross-reference between the two. If you wanted to accommodate situation 2, the code remains the same as situation 1 with the exception of defining the threshold.
The code from:
% New - Find those bounding boxes that surpassed a threshold
T = 0.7; % Define threshold here
idx = score >= T;
... would now change to:
% New - Find those bounding boxes that surpassed a threshold
perc = 0.85; % 85% of the maximum threshold
T = perc * max(score); % Define threshold here
idx = score >= T;
The end result will be multiple bounding boxes of the detected objects in the image - one annotation per detected object.
I think you actually have to put all of the coordinates for that image as a single entry in your training data table. See this MATLAB tutorial for details. If you load the training data into your MATLAB locally and check the vehicleDataset variable, you will actually see this (sorry, my score is not high enough to include images directly in my answers).
To summarize, in your training data table, make sure you have one unique entry for each image, and put however many bounding boxes into the corresponding category as a matrix, where each row is in the format of [x, y, width, height].
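As a sketch of that layout (the filenames and coordinates below are made up for illustration), the table could be built like this:

```matlab
% Hypothetical example: one row per image, with all boxes for an image
% collected into a single M-by-4 matrix of [x, y, width, height].
imageFilename = {'image1.jpg'; 'image2.jpg'};
vehicle = {[10 20 50 30; 70 25 48 32];   % two vehicles in image1.jpg
           [15 40 60 35]};               % one vehicle in image2.jpg
trainingData = table(imageFilename, vehicle);
```

Each cell in the vehicle column grows by one row per object, so an image with 20 vehicles still occupies a single table row.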
I use bwareaopen to remove small objects. Is there a function to remove the big objects? I'm trying to adapt bwareaopen, but I haven't been successful so far. Thanks
For ref: Here's a link to the help of bwareaopen.
I found an easy way to tackle this problem described here:
"To keep only objects between, say, 30 pixels and 50 pixels in area, you can use the BWAREAOPEN command, like this:"
LB = 30;
UB = 50;
Iout = xor(bwareaopen(I,LB), bwareaopen(I,UB));
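If your MATLAB release is recent enough (bwpropfilt was added to the Image Processing Toolbox around R2014b, if I recall correctly), there is also a one-call alternative to the xor trick:

```matlab
% Keep only connected components whose area lies in [LB, UB].
% Note: this range is inclusive on both ends, whereas the xor trick
% keeps areas in [LB, UB-1] because bwareaopen keeps objects >= P pixels.
LB = 30;
UB = 50;
Iout = bwpropfilt(I, 'Area', [LB, UB]);
```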
Another way, if you don't want to use bwareaopen, is to use regionprops, specifically with the Area and PixelIdxList attributes. Filter out the elements that don't conform to the area range you want, then use the remaining elements to create a new mask. Area captures the total area of each shape, while PixelIdxList captures the column-major linear indices of the locations inside the image that belong to each shape. You would use the Area attribute to perform your filtering, and the PixelIdxList attribute to create a new output image, setting the locations belonging to shapes within the desired area range to true:
% Specify lower and upper bounds
LB = 30;
UB = 50;
% Run regionprops
s = regionprops(I, 'Area', 'PixelIdxList');
% Get all of the areas for each shape
areas = [s.Area];
% Remove elements from output of regionprops
% that are not within the range
s = s(areas >= LB & areas <= UB);
% Get the column-major locations of the shapes
% that have passed the check
idx = {s.PixelIdxList};
idx = cat(1, idx{:});
% Create an output image with the passed shapes
Iout = false(size(I));
Iout(idx) = true;
I have tried to use the code provided in this answer to detect symbols using template matching with the FFT (via fft2)
However, the code only detects one symbol but it doesn't detect all similar symbols.
The code has been adapted from the linked post and is shown below.
template = im2bw(imread('http://www.clipartkid.com/images/543/floor-plan-symb-aT0MYg-clipart.png'));
background = im2bw(imread('http://www.the-house-plans-guide.com/images/blueprints/draw-floor-plan-step-6.png'));
bx = size(background, 2);
by = size(background, 1);
tx = size(template, 2); % used for bbox placement
ty = size(template, 1);
pos=[];
%// Change - Compute the cross power spectrum
Ga = fft2(background);
Gb = fft2(template, by, bx);
c = real(ifft2((Ga.*conj(Gb))./abs(Ga.*conj(Gb))));
%% find peak correlation
[max_c, imax] = max(abs(c(:)));
[ypeak, xpeak] = find(c == max(c(:))); % Added to make code work
if ~isempty(ypeak) || ~isempty(xpeak)
position = [xpeak, ypeak, tx, ty]; % Added - the peak marks the template's top-left corner
pos = position;
plot(xpeak, ypeak, 'x', 'LineWidth', 1, 'Color', 'g');
rectangle('position', position, 'edgecolor', 'b', 'linewidth', 1, 'LineStyle', '-');
end
How may I use the above code to detect multiple symbols as opposed to just one?
Amitay is correct in his assessment. BTW, the code that you took comes from the following post: Matlab Template Matching Using FFT.
The code is only designed to detect one match from the template you specify. If you wish to detect multiple templates, there are various methodologies you can try each with their own advantages and disadvantages:
Use a global threshold and from the cross power spectrum, any values that surpass this threshold deem that there is a match.
Find the largest similarity in the cross power spectrum, and anything that is some distance away from this maximum would be deemed that there is a match. Perhaps a percentage away, or one standard deviation away may work.
Try to make a histogram of the unique values in the cross power spectrum and find the point where there is a clear separation between values that are clearly uncorrelated with the template and values that are correlated. I won't implement this here because it requires examining the histogram of your particular image to find the threshold. Instead, you can try the first two cases and see where that goes.
You will have to loop over multiple matches should they arise, so you'll need to loop over the code that draws the rectangles in the image.
Case #1
The first case is very simple. All you have to do is modify the find statement so that instead of searching for the location with the maximum, simply find locations that exceed the threshold.
Therefore:
%% find peak correlation
thresh = 0.1; % For example
[ypeak, xpeak] = find(c >= thresh);
Case #2
This is very similar to the first case, but instead of finding values that exceed a fixed threshold, determine what the largest similarity value is (already done) and threshold anything above (1 - x)*max_val, where x is a value between 0 and 1 denoting how far below the maximum value still counts as a match. Therefore, if you wanted at most 5% away from the maximum, x = 0.05 and the threshold becomes 0.95*max_val. Similarly for the standard deviation: find it using the std function, making sure you convert the similarity values into a single vector so the value is computed over the entire image. The threshold then becomes max_val - std_val, where std_val is the standard deviation of the similarity values.
Therefore, do something like this for the percentage comparison:
%% find peak correlation
x = 0.05; % For example
[max_c, imax] = max(abs(c(:)));
[ypeak, xpeak] = find(c >= (1-x)*max_c);
... and do this for the standard deviation comparison:
std_dev = std(abs(c(:)));
[max_c, imax] = max(abs(c(:)));
[ypeak, xpeak] = find(c >= (max_c - std_dev));
Once you finally establish this, you'll see that there are multiple matches. It's now a point of drawing all of the detected templates on top of the image. Using the post that you "borrowed" the code from, the code to draw the detected templates can be modified to draw multiple templates.
You can do that below:
%% display best matches
tx = size(template, 2);
ty = size(template, 1);
hFig = figure;
hAx = axes;
imshow(background, 'Parent', hAx);
hold on;
for ii = 1 : numel(xpeak)
position = [xpeak(ii), ypeak(ii), tx, ty]; % Draw match on figure
imrect(hAx, position);
end
I have an image shown below:
I am applying some sort of threshold like in the code. I could separate the blue objects like below:
However, now I have a problem separating these blue objects. I applied watershed (I'm not sure whether I applied it correctly), but it didn't work out, so I need help separating these connected objects.
The code I tried to use is shown below:
RGB=imread('testImage.jpg');
RGB = im2double(RGB);
cform = makecform('srgb2lab', 'AdaptedWhitePoint', whitepoint('D65'));
I = applycform(RGB,cform);
channel1Min = 12.099;
channel1Max = 36.044;
channel2Min = -9.048;
channel2Max = 48.547;
channel3Min = -53.996;
channel3Max = 15.471;
BW = (I(:,:,1) >= channel1Min ) & (I(:,:,1) <= channel1Max) & ...
(I(:,:,2) >= channel2Min ) & (I(:,:,2) <= channel2Max) & ...
(I(:,:,3) >= channel3Min ) & (I(:,:,3) <= channel3Max);
maskedRGBImage = RGB;
maskedRGBImage(repmat(~BW,[1 1 3])) = 0;
figure
imshow(maskedRGBImage)
In general, this type of segmentation is a serious research problem. In your case, you could do pretty well using a combination of morphology operations. These see widespread use in microscopy image processing.
First, clean up BW a bit by removing small blobs and filling holes,
BWopen = imopen(BW, strel('disk', 6));
BWclose = imclose(BWopen, strel('disk', 6));
(you may want to tune the structuring elements a bit, "6" is just a radius that seemed to work on your test image.)
Then you can use aggressive erosion to generate some seeds
seeds = imerode(BWclose, strel('disk', 35));
which you can use for watershed, or just assign each point in BW to its closest seed
labels = bwlabel(seeds);
[D, i] = bwdist(seeds);
closestLabels = labels(i);
originalLabels = BWopen .* closestLabels;
imshow(originalLabels, []);
I would try the following steps:
Convert the image to gray and then to a binary mask.
Apply morphological opening (imopen) to clean small noisy objects.
Apply Connected Component Analysis (CCA) using bwlabel. Each connected component contains at least 1 object.
These blue objects really look like stretched/distorted circles, so I would try the Hough transform to detect circles inside each labeled component. There is a built-in function (imfindcircles) or code available online (Hough transform for circles), depending on your MATLAB version and available toolboxes.
Then, you need to take some decisions regarding the number of objects, N, inside each component (N>=1). I don't know in advance what the best criteria should be, but you could also apply these simple rules:
[i] An object needs to be of a minimum size.
[ii] Overlapping circles correspond to the same object (or not, depending on the overlap amount).
The circle centroids can then serve as seeds to complete the final object segmentation. Of course, if there is only one circle in each component, you just keep it directly as an object.
I didn't check all steps for validity in Matlab, but I quickly checked 1, 2, and 4 and they seemed to be quite promising. I show the result of circle detection for the most difficult component, in the center of the image:
The code I used to create this image is:
close all;clear all;clc;
addpath 'circle_hough'; % add path to code of [Hough transform for circles] link above
im = imread('im.jpg');
img = rgb2gray(im);
mask = img>30; mask = 255*mask; % create a binary mask
figure;imshow(mask)
% filter the image so that only the central part of 6 blue objects remains (for demo purposes only)
o = zeros(size(mask)); o(170:370, 220:320) = 1;
mask = mask.*o;
figure;imshow(mask);
se = strel('disk',3);
mask = imopen(mask,se); % apply morphological opening
figure;imshow(mask);
% check for circles using Hough transform (see also circle_houghdemo.m in [Hough transform for circles] link above)
radii = 15:5:40; % allowed circle radii
h = circle_hough(mask, radii, 'same', 'normalise');
% choose the 10 biggest circles
peaks = circle_houghpeaks(h, radii, 'nhoodxy', 15, 'nhoodr', 21, 'npeaks', 10);
% show result
figure;imshow(im);
for peak = peaks
[x, y] = circlepoints(peak(3));
hold on;plot(x+peak(1), y+peak(2), 'g-');
end
Some spontaneous thoughts. I assume the ultimate goal is to count the blue corpuscles. If so, I would search for a way to determine the total area of those objects and the average area of a corpuscle. As a first step, convert to binary (black and white):
level = graythresh(RGB);
BW = im2bw(RGB,level);
Since morphological operations (open/close) will not preserve the areas, I would work with bwlabel to find the connected components. Looking at the picture I see 12 isolated corpuscles, two connected groups, some truncated corpuscles at the edge and some small noisy fragments. So first remove small objects and those touching edges. Then determine total area (bwarea) and the median size of objects as a proxy for the average area of one corpuscle.
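The steps above could be sketched as follows. This is only a rough outline of the idea, not a tested solution: the noise threshold of 50 pixels is an arbitrary placeholder, and it assumes `RGB` is the original image from the question.

```matlab
% Sketch: estimate the corpuscle count as total area / typical single-object area.
level = graythresh(RGB);
BW = im2bw(RGB, level);
BW = imclearborder(BW);        % drop truncated corpuscles touching the edges
BW = bwareaopen(BW, 50);       % drop small noisy fragments (50 is arbitrary)
stats = regionprops(BW, 'Area');
areas = [stats.Area];
totalArea = sum(areas);
typicalArea = median(areas);   % median is robust to the few connected groups
estimatedCount = round(totalArea / typicalArea);
```

The median works as a proxy for the single-corpuscle area precisely because most components are isolated corpuscles; the two connected groups are outliers it ignores.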
I have a set of shapes in an image I would like to label according to their area. I have used bwboundaries to find them, and regionprops to determine their area. I would like to label them differently based on whether their area is above or below the threshold I have determined.
I've thought about using insertObjectAnnotation, but I'm not sure how to add a condition based on their area into the function?
Assuming TH to be the threshold area and BW to be the binary image, and if you are okay with labeling the shapes with 'L'/'S' text markers at their centers (centroids, to be exact) based on the thresholding, see if this satisfies your needs -
stats = regionprops(BW,'Area');
stats2 = regionprops(BW,'Centroid');
figure,imshow(BW)
for k = 1:numel(stats)
xy = stats2(k).Centroid;
if (stats(k).Area>TH)
text(xy(1),xy(2),'L') %// Large Shape
else
text(xy(1),xy(2),'S') %// Small Shape
end
end
Sample output -
You could use CC = bwconncomp(BW,conn).
To get the number of pixels of every connected compontent you can use:
numPixels = cellfun(@numel, CC.PixelIdxList);
In CC.PixelIdxList you have a list of all found objects and the indices of the pixels belonging to the components. I guess to label your areas you could do something like:
Image = zeros(size(BW)); % Preallocate the label image first
for ind = 1:size(CC.PixelIdxList,2)
Image(CC.PixelIdxList{ind}) = ind;
end