In Caffe py-faster-rcnn, why does "scores" return a large matrix? - neural-network

I am using the py-faster-rcnn demo to build on for my own project with 20 classes.
However, I am trying to obtain the softmax (last-layer) probabilities for my classes.
For example:
# Load the demo image
im_file = os.path.join(cfg.DATA_DIR, 'demo', image_name)
im = cv2.imread(im_file)
# Detect all object classes and regress object bounds
timer = Timer()
timer.tic()
scores, boxes = im_detect(net, im)
timer.toc()
print ('Detection took {:.3f}s for '
       '{:d} object proposals').format(timer.total_time, boxes.shape[0])
# Visualize detections for each class
CONF_THRESH = 0.8
NMS_THRESH = 0.3
for cls_ind, cls in enumerate(CLASSES[1:]):
    cls_ind += 1 # because we skipped background
    cls_boxes = boxes[:, 4*cls_ind:4*(cls_ind + 1)]
    cls_scores = scores[:, cls_ind]
    dets = np.hstack((cls_boxes,
                      cls_scores[:, np.newaxis])).astype(np.float32)
    keep = nms(dets, NMS_THRESH)
    dets = dets[keep, :]
    vis_detections(im, cls, dets, thresh=CONF_THRESH)
print scores
When I print scores, it gives me a very large matrix instead of a 1 x 20 one. I am not sure why; how can I get the final probability matrix?
Thanks

The raw scores the detector outputs include overlapping detections and very low-scoring detections as well. im_detect scores every object proposal against every class, so scores has one row per proposal and one column per class (including the background class), which is why you see a large matrix rather than a 1 x 20 vector.
Note that non-maximum suppression (aka "nms") with NMS_THRESH=0.3 is applied first, and only then does vis_detections display the detections with confidence larger than CONF_THRESH=0.8.
So, if you want to look at the "true" objects, you need to look inside vis_detections and examine only the detections it actually renders on the image.
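Alternatively, here is a minimal sketch (reusing the demo's variables and the same loop structure as above) that collects just the class probabilities of the detections that survive NMS and the confidence threshold, instead of printing the full proposals-by-classes matrix:
kept_probs = {}
for cls_ind, cls in enumerate(CLASSES[1:]):
    cls_ind += 1  # skip background
    cls_boxes = boxes[:, 4*cls_ind:4*(cls_ind + 1)]
    cls_scores = scores[:, cls_ind]
    dets = np.hstack((cls_boxes,
                      cls_scores[:, np.newaxis])).astype(np.float32)
    keep = nms(dets, NMS_THRESH)                 # drop overlapping detections
    dets = dets[keep, :]
    dets = dets[dets[:, -1] >= CONF_THRESH, :]   # drop low-confidence detections
    kept_probs[cls] = dets[:, -1]                # softmax probability per kept box
print(kept_probs)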

Related

How to separate human body from background in an image

I have been trying to separate the human body in an image from the background, but all the methods I have seen don't seem to work very well for me.
I have collected the following images:
The image of the background
The image of the background with the person in it.
Now I want to cut out the person from the background.
I tried subtracting the image of the background from the image with the person using res = cv2.subtract(background, foreground) (I am new to image processing).
Background subtraction methods in OpenCV such as cv2.BackgroundSubtractorMOG2() only work with videos or image sequences, and the contour detection methods I have seen are only for solid shapes.
And grabCut doesn't quite work well for me because I would like to automate the process.
Given the images I have (Image of the background and image of the background with the person in it), is there a method of cutting the person out from the background?
I wouldn't recommend a neural net for this problem. That's a lot of work for something like this where you have a known background. I'll walk through the steps I took to do the background segmentation on this image.
First I shifted into the LAB color space to get some light-resistant channels to work with. I did a simple subtraction of foreground and background and combined the a and b channels.
You can see that there is still significant color change in the background even with a less light-sensitive color channel. This is likely due to the auto white balance on the camera; you can see that some of the background colors change when you step into view.
The next step I took was thresholding this image. The optimal threshold values may not always be the same; you'll have to adjust to a range that works well for your set of photos.
I used OpenCV's findContours function to get the segmentation points of each blob and filtered the available contours by size. I set a size threshold of 15000. For reference, the person in the image had a pixel area of 27551.
Then it's just a matter of cropping out the contour.
This technique works with any good thresholding strategy. If you can improve the consistency of your pictures by turning off auto settings and ensuring good contrast of the person against the wall, then you can use simpler thresholding strategies and still get good results.
Just for fun:
Edit:
I forgot to add in the code I used:
import cv2
import numpy as np
# rescale values
def rescale(img, orig, new):
    img = np.divide(img, orig);
    img = np.multiply(img, new);
    img = img.astype(np.uint8);
    return img;
# get abs(diff) of all hue values
def diff(bg, fg):
    # do both sides
    lh = bg - fg;
    rh = fg - bg;
    # pick minimum # this works because of uint wrapping
    low = np.minimum(lh, rh);
    return low;
# load image
bg = cv2.imread("back.jpg");
fg = cv2.imread("person.jpg");
fg_original = fg.copy();
# blur
bg = cv2.blur(bg,(5,5));
fg = cv2.blur(fg,(5,5));
# convert to lab
bg_lab = cv2.cvtColor(bg, cv2.COLOR_BGR2LAB);
fg_lab = cv2.cvtColor(fg, cv2.COLOR_BGR2LAB);
bl, ba, bb = cv2.split(bg_lab);
fl, fa, fb = cv2.split(fg_lab);
# subtract
d_b = diff(bb, fb);
d_a = diff(ba, fa);
# rescale for contrast
d_b = rescale(d_b, np.max(d_b), 255);
d_a = rescale(d_a, np.max(d_a), 255);
# combine
combined = np.maximum(d_b, d_a);
# threshold
# check your threshold range, this will work for
# this image, but may not work for others
# in general: having a strong contrast with the wall makes this easier
thresh = cv2.inRange(combined, 70, 255);
# opening and closing
kernel = np.ones((3,3), np.uint8);
# closing
thresh = cv2.dilate(thresh, kernel, iterations = 2);
thresh = cv2.erode(thresh, kernel, iterations = 2);
# opening
thresh = cv2.erode(thresh, kernel, iterations = 2);
thresh = cv2.dilate(thresh, kernel, iterations = 3);
# contours
# OpenCV 3.x returns (image, contours, hierarchy); in OpenCV 4.x drop the first output
_, contours, _ = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE);
# filter contours by size
big_cntrs = [];
marked = fg_original.copy();
for contour in contours:
    area = cv2.contourArea(contour);
    if area > 15000:
        print(area);
        big_cntrs.append(contour);
cv2.drawContours(marked, big_cntrs, -1, (0, 255, 0), 3);
# create a mask of the contoured image
mask = np.zeros_like(fb);
mask = cv2.drawContours(mask, big_cntrs, -1, 255, -1);
# erode mask slightly (boundary pixels on wall get color shifted)
mask = cv2.erode(mask, kernel, iterations = 1);
# crop out
out = np.zeros_like(fg_original) # Extract out the object and place into output image
out[mask == 255] = fg_original[mask == 255];
# show
cv2.imshow("combined", combined);
cv2.imshow("thresh", thresh);
cv2.imshow("marked", marked);
# cv2.imshow("masked", mask);
cv2.imshow("out", out);
cv2.waitKey(0);
Since it is very easy to find datasets containing lots of human bodies, I suggest you use neural network segmentation techniques to extract the human body cleanly. Please check this link to see a similar example.
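If you go the neural-network route, here is a rough sketch using torchvision's pre-trained DeepLabV3 model (one possible choice among many, not the model from the linked example; the file name is just the same placeholder used in the code above):
import cv2
import numpy as np
import torch
import torchvision
from torchvision import transforms

# Pre-trained DeepLabV3; class index 15 in its Pascal VOC label set is "person"
model = torchvision.models.segmentation.deeplabv3_resnet50(pretrained=True).eval()

img_bgr = cv2.imread("person.jpg")
img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
inp = preprocess(img_rgb).unsqueeze(0)

with torch.no_grad():
    out = model(inp)["out"][0]                 # (21, H, W) per-class scores
labels = out.argmax(0).byte().cpu().numpy()
person_mask = np.uint8(labels == 15) * 255     # binary mask of the person

cut_out = cv2.bitwise_and(img_bgr, img_bgr, mask=person_mask)
cv2.imshow("person only", cut_out)
cv2.waitKey(0)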

Gravity in accelerometric measurements

I have taken from a data set the x, y and z values of an activity (e.g. walking, running) detected by an accelerometer. Since the collected data also contain the gravity component, I removed it with the following filter in Matlab:
fc = 0.3;
fs = 50;
x = ...;
y = ...;
z = ...;
[but,att] = butter(6,fc/(fs/2));
gx = filter(but,att,x);
gy = filter(but,att,y);
gz = filter(but,att,z);
new_x = x-gx;
new_y = y-gy;
new_z = z-gz;
A = magnitude(new_x,new_y,new_z);
plot(A)
Then I calculated the magnitude and plotted it on a graph.
However, every graph, even after removing gravity, starts with a magnitude of 1 g (9.8 m/s^2). Why? Shouldn't it start at 0, since I removed gravity?
You need to wait for the filter output to ramp up. Include some additional data at the beginning of the file that you don't graph, for this purpose.
How accurate do your calculations need to be? With walking and running the angle of the accelerometer can change, so the orientation of the gravity vector can change throughout the gait cycle. How much of a change in orientation you can expect to see depends on the sensor location and the particular motion you are trying to capture.
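To illustrate the first point, here is a small sketch of the same idea in Python/SciPy (the MATLAB equivalent is simply to plot A(warmup+1:end)); the 10-second warm-up length is only an assumption you would tune to your filter:
import numpy as np
from scipy.signal import butter, lfilter

fs, fc = 50.0, 0.3                       # sampling rate and cut-off, as above
b, a = butter(6, fc / (fs / 2.0))        # low-pass that estimates gravity

def remove_gravity(sig):
    gravity = lfilter(b, a, sig)         # slowly varying gravity estimate
    return sig - gravity                 # gravity-free linear acceleration

# placeholder data: a sensor at rest reads a constant 1 g on one axis
x = np.ones(2000); y = np.zeros(2000); z = np.zeros(2000)
A = np.sqrt(remove_gravity(x)**2 + remove_gravity(y)**2 + remove_gravity(z)**2)

warmup = int(10 * fs)                    # discard the filter's start-up transient
A_plot = A[warmup:]                      # this part starts near 0, as expected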

Image Processing - Dress Segmentation using opencv

I am working on dress feature identification using opencv.
As a first step, I need to segment t-shirt by removing face and hands from the image.
Any suggestion is appreciated.
I suggest the following approach:
Use Adrian Rosebrock's skin detection algorithm for detecting the skin (thank you to Rosa Gronchi for his comment).
Use a region-growing algorithm on the variance map. The initial seed can be calculated using stage 1 (see the attached code for more information).
code:
%stage 1: skin detection - Adrian Rosebrock solution
im = imread(<path to input image>);
hsb = rgb2hsv(im)*255;
skinMask = hsb(:,:,1) > 0 & hsb(:,:,1) < 20;
skinMask = skinMask & (hsb(:,:,2) > 48 & hsb(:,:,2) < 255);
skinMask = skinMask & (hsb(:,:,3) > 80 & hsb(:,:,3) < 255);
skinMask = imclose(skinMask,strel('disk',6));
%stage 2: calculate top, left and right centroid from the different connected
%components of the skin
stats = regionprops(skinMask,'centroid');
topCentroid = stats(1).Centroid;
rightCentroid = stats(1).Centroid;
leftCentroid = stats(1).Centroid;
for x = 1 : length(stats)
    centroid = stats(x).Centroid;
    if topCentroid(2)>centroid(2)
        topCentroid = centroid;
    elseif centroid(1)<leftCentroid(1)
        leftCentroid = centroid;
    elseif centroid(1)>rightCentroid(1)
        rightCentroid = centroid;
    end
end
%first seed - the average of the most left and right centroids.
centralSeed = int16((rightCentroid+leftCentroid)/2);
%second seed - a pixel which is right below the face centroid.
faceSeed = int16(topCentroid);
faceSeed(2) = faceSeed(2)+40;
%stage 3: std filter
varIm = stdfilt(rgb2gray(im));
%stage 4 - region growing on varIm using faceSeed and centralSeed
res1=regiongrowing(varIm,centralSeed(2),centralSeed(1),8);
res2=regiongrowing(varIm,faceSeed(2),faceSeed(1),8);
res = res1|res2;
%noise reduction
res = imclose(res,strel('disk',3));
res = imopen(res,strel('disk',2));
Result after stage 1 (skin detection):
Final result:
Comments:
Stage 1 is calculated using the following algorithm.
The region growing function can be downloaded here.
The solution is not perfect. For example, it may fail if the texture of the shirt is similar to the texture of the background. But I think that it can be a good start.
Another improvement would be to use a better region-growing algorithm, one which doesn't grow into the skinMask region. Also, instead of running the region-growing algorithm twice independently, the second call to region growing can be based on the result of the first one.

Warp images using motion maps generated by opticalFlowLKDoG (Matlab 2015A)

This question is based on modified Matlab code from the online documentation for the optical flow system objects in version 2015a, as it appears for the opticalFlowLK class:
clc; clearvars; close all;
inputVid = VideoReader('viptraffic.avi');
opticFlow = opticalFlowLKDoG('NumFrames',3);
inputVid.currentTime = 2;
k = 1;
while inputVid.currentTime<=2 + 1/inputVid.FrameRate
    frameRGB{k} = readFrame(inputVid);
    frameGray{k} = rgb2gray(frameRGB{k});
    flow{k} = estimateFlow(opticFlow,frameGray{k});
    k = k+1;
end
By looking at flow{2}.Vx and flow{2}.Vy I get the motion maps U and V that describe the motion from frameGray{1} to frameGray{2}.
I want to use flow{2}.Vx and flow{2}.Vy directly on the data in frameGray{1} in order to warp frameGray{1} to appear visually similar to frameGray{2}.
I tried this code:
[x, y] = meshgrid(1:size(frameGray{1},2), 1:size(frameGray{1},1));
frameGray1Warped = interp2(double(frameGray{1}) , x-flow{2}.Vx , y-flow{2}.Vy);
But it doesn't seem to do much at all except degrade the image quality (the objects don't show any real motion towards their locations in frameGray{2}).
I added 3 images showing the 2 original frames followed by frame 1 warped using the motion field to appear similar to frame 2:
It can be seen easily that frame 1 warped to 2 is essentially frame 1 with degraded quality but the cars haven't moved at all. That is - the location of the cars is the same: look at the car closest to the camera with respect to the road separation line near it; it's virtually the same in frame 1 and frame 1 warped to 2, but is quite different in frame 2.

maximum intensity projection matlab with color

Hi all, I have a stack of images of fluorescently labeled particles that are moving through time. The image stack is grayscale.
I computed a maximum intensity projection by taking the maximum of the image stack in the 3rd dimension.
Example:
ImageStack(x,y,N) where N = 31 image frames.
Projection2D = max(ImageStack,[],3)
Now, since the 2D projection image is black and white, I was hoping to assign a color gradient so that I can get a sense of the flow of particles through time. Is there a way that I can overlay this image with color, so that I will know where a particle started, and where it ended up?
Thanks!
You could use the second output of max to get which frame the particular maximum came from. max returns an index matrix which indicates the index of each maximal value, which in your case will be the particular frame in which it occurred. If you use this with the imagesc function, you will be able to plot how the particles move with time. For instance:
% ImageStack(x,y,N), where N = 31 image frames
[Projection2D,FrameInfo] = max(ImageStack,[],3);
imagesc(FrameInfo);
set(gca,'ydir','normal'); % Otherwise the y-axis would be flipped
Alternatively, you can color each image and then sum the bright pixels of all the images together. This way you get mixed colors in overlapping areas, which you would miss using the max function. (Although I like the previous answer more than mine.)
hStep = 1/N;
currentH = 0;
resultImage = uint8(zeros(x,y,3));
% im is the x-by-y-by-N grayscale stack (ImageStack above)
for i = 1 : N
    rgbColor = hsv2rgb([currentH, 1, 0.5]); % hue advances with each frame
    resultImage(:,:,1) = resultImage(:,:,1) + im(:,:,i) * rgbColor(1);
    resultImage(:,:,2) = resultImage(:,:,2) + im(:,:,i) * rgbColor(2);
    resultImage(:,:,3) = resultImage(:,:,3) + im(:,:,i) * rgbColor(3);
    currentH = currentH + hStep;
end