This question is based on modified MATLAB code from the online documentation for the optical flow System objects in R2015a, as it appears for the opticalFlowLK class:
clc; clearvars; close all;
inputVid = VideoReader('viptraffic.avi');
opticFlow = opticalFlowLKDoG('NumFrames',3);
inputVid.currentTime = 2;
k = 1;
while inputVid.currentTime<=2 + 1/inputVid.FrameRate
frameRGB{k} = readFrame(inputVid);
frameGray{k} = rgb2gray(frameRGB{k});
flow{k} = estimateFlow(opticFlow,frameGray{k});
k = k+1;
end
By looking at flow{2}.Vx and flow{2}.Vy I get the motion maps U and V that describe the motion from frameGray{1} to frameGray{2}.
I want to use flow{2}.Vx and flow{2}.Vy directly on the data in frameGray{1} in order to warp frameGray{1} so that it appears visually similar to frameGray{2}.
I tried this code:
[x, y] = meshgrid(1:size(frameGray{1},2), 1:size(frameGray{1},1));
frameGray1Warped = interp2(double(frameGray{1}) , x-flow{2}.Vx , y-flow{2}.Vy);
But it doesn't seem to do much at all except degrade the image quality; the objects don't show any real motion towards their locations in frameGray{2}.
I added three images showing the two original frames, followed by frame 1 warped using the motion field to appear similar to frame 2:
It can easily be seen that frame 1 warped to 2 is essentially frame 1 with degraded quality, but the cars haven't moved at all. That is, the location of the cars is the same: look at the car closest to the camera with respect to the road separation line near it; it is virtually the same in frame 1 and in frame 1 warped to 2, but quite different in frame 2.
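As a quick sanity check, it can help to confirm that the estimated flow field actually has non-trivial magnitude before warping. A minimal sketch (assuming the frameGray and flow variables created above) overlays a subsampled quiver plot on frame 1:
% Sketch: visualize the estimated flow on top of frame 1,
% subsampled so the arrows stay readable.
figure, imshow(frameGray{1}); hold on
stride = 8;                                   % plot every 8th vector
[xq, yq] = meshgrid(1:stride:size(frameGray{1},2), 1:stride:size(frameGray{1},1));
quiver(xq, yq, flow{2}.Vx(1:stride:end,1:stride:end), ...
               flow{2}.Vy(1:stride:end,1:stride:end), 2, 'y');
hold off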
I am working on research about the swimming of fish using video analysis, so I need to be careful with the images (obtained from video frames), with emphasis on the tail.
The images are high-resolution, and the software I am customizing works with binary images, because it is easy to apply mathematical operations to them.
To obtain these binary images I use two methods:
1) Convert the image to grayscale, invert the colors, then convert to black and white, and finally binarize with a threshold; that gives me images like this, with almost no noise. The images sometimes lose a bit of area and are not very accurate around the tail (and I now need more accuracy to determine the amplitude of the tail movements).
image 1
2) I use the following code to crop the border and increase the threshold. This gives me a good image of the edge, but I don't know how to join these points and smooth the image, or how to fit the binary image; the curve fitting app of MATLAB R2012b doesn't give me a good graph, and I don't have access to the MATLAB toolboxes.
s4 = imread('arecorte.bmp');
A = [90 90 1110 550];          % crop rectangle [xmin ymin width height]
s5 = imcrop(s4,A);
E = edge(s5,'canny',0.59);
image2
My question is:
How can I fit the binary image, or join the points and smooth it, without disturbing the tail?
Or how can I use the edge from image 2 to increase the accuracy of image 1?
I will upload an image in the comments that gave me the idea for method 2), because I can't post more links. Please remember that I am working with iterations and I can't work frame by frame.
Note: I am asking this because I am at a dead point and I don't have the resources to pay someone to do it; until now I was able to write the code myself, but I can't get past this final problem alone.
I think you should use connected component labeling, discard the small labels, and then extract the label boundaries to get the pixels of each part.
The code:
clear all
% Read image
I = imread('fish.jpg');
% You don't need this step if you already have a bw image
Ibw = rgb2gray(I);
Ibw(Ibw < 100) = 0;
% Find size of image
[row,col] = size(Ibw);
% Find connected components
CC = bwconncomp(Ibw,8);
% Find area of the components
stats = regionprops(CC,'Area','PixelIdxList');
areas = [stats.Area];
% Sort the areas
[val,index] = sort(areas,'descend');
% Take the two largest components and create a filtered image
IbwFiltered = zeros(row,col);
IbwFiltered(stats(index(1)).PixelIdxList) = 1;
IbwFiltered(stats(index(2)).PixelIdxList) = 1;
imshow(IbwFiltered);
% Find the pixels on the border of the main fish body and the tail
boundaries = bwboundaries(IbwFiltered);
yCoordMainFishBody = boundaries{1}(:,1);
xCoordMainFishBody = boundaries{1}(:,2);
linearCoordMainFishBody = sub2ind([row,col],yCoordMainFishBody,xCoordMainFishBody);
yCoordTailFishBody = boundaries{2}(:,1);
xCoordTailFishBody = boundaries{2}(:,2);
linearCoordTailFishBody = sub2ind([row,col],yCoordTailFishBody,xCoordTailFishBody);
% For visualization, color the boundaries
IFinal = zeros(row,col,3);
IFinalChannel = zeros(row,col);
IFinal(:,:,1) = IFinalChannel;
IFinalChannel(linearCoordMainFishBody) = 255;
IFinal(:,:,2) = IFinalChannel;
IFinalChannel = zeros(row,col);
IFinalChannel(linearCoordTailFishBody) = 125;
IFinal(:,:,3) = IFinalChannel;
imshow(IFinal);
The final image:
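If the extracted boundary then needs smoothing (as asked in the question), one simple option is a moving average over the boundary coordinates. The sketch below uses only base MATLAB, since no toolboxes are available, and assumes the xCoordTailFishBody/yCoordTailFishBody vectors computed above:
% Sketch: smooth the extracted tail boundary with a small circular moving average
w = 9;                                   % window length for the moving average
kernel = ones(w,1)/w;
% pad circularly so the closed boundary has no end effects, then crop back
ypad = [yCoordTailFishBody(end-w+1:end); yCoordTailFishBody; yCoordTailFishBody(1:w)];
xpad = [xCoordTailFishBody(end-w+1:end); xCoordTailFishBody; xCoordTailFishBody(1:w)];
ySmooth = conv(ypad, kernel, 'same');  ySmooth = ySmooth(w+1:end-w);
xSmooth = conv(xpad, kernel, 'same');  xSmooth = xSmooth(w+1:end-w);
figure, imshow(IbwFiltered); hold on
plot(xSmooth, ySmooth, 'g', 'LineWidth', 1.5);   % smoothed tail outline
hold off
The window length w controls how aggressively the outline is smoothed and would need tuning against the real tail shape.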
I am working on dress feature identification using opencv.
As a first step, I need to segment the t-shirt by removing the face and hands from the image.
Any suggestion is appreciated.
I suggest the following approach:
Use Adrian Rosebrock's skin detection algorithm for detecting the skin (thanks to Rosa Gronchi for his comment).
Use a region growing algorithm on the variance map. The initial seeds can be calculated using stage 1 (see the attached code for more information).
code:
%stage 1: skin detection - Adrian Rosebrock solution
im = imread(<path to input image>);
hsb = rgb2hsv(im)*255;
skinMask = hsb(:,:,1) > 0 & hsb(:,:,1) < 20;
skinMask = skinMask & (hsb(:,:,2) > 48 & hsb(:,:,2) < 255);
skinMask = skinMask & (hsb(:,:,3) > 80 & hsb(:,:,3) < 255);
skinMask = imclose(skinMask,strel('disk',6));
%stage 2: calculate top, left and right centroid from the different connected
%components of the skin
stats = regionprops(skinMask,'centroid');
topCentroid = stats(1).Centroid;
rightCentroid = stats(1).Centroid;
leftCentroid = stats(1).Centroid;
for x = 1 : length(stats)
centroid = stats(x).Centroid;
if topCentroid(2)>centroid(2)
topCentroid = centroid;
elseif centroid(1)<leftCentroid(1)
leftCentroid = centroid;
elseif centroid(1)>rightCentroid(1)
rightCentroid = centroid;
end
end
%first seed - the average of the most left and right centroids.
centralSeed = int16((rightCentroid+leftCentroid)/2);
%second seed - a pixel which is right below the face centroid.
faceSeed = int16(topCentroid);
faceSeed(2) = faceSeed(2)+40;
%stage 3: std filter
varIm = stdfilt(rgb2gray(im));
%stage 4 - region growing on varIm using faceSeed and centralSeed
res1=regiongrowing(varIm,centralSeed(2),centralSeed(1),8);
res2=regiongrowing(varIm,faceSeed(2),faceSeed(1),8);
res = res1|res2;
%noise reduction
res = imclose(res,strel('disk',3));
res = imopen(res,strel('disk',2));
Result after stage 1 (skin detection):
Final result:
Comments:
Stage 1 is calculated using the following algorithm.
The region growing function can be downloaded here.
The solution is not perfect. For example, it may fail if the texture of the shirt is similar to the texture of the background. But I think that it can be a good start.
Another improvement that could be made is to use a better region growing algorithm, one which doesn't grow into the skinMask locations. Also, instead of running the region growing algorithm twice independently, the second call of region growing could be based on the result of the first one.
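For example, a minimal sketch of the first idea, assuming the downloaded regiongrowing function stops growing once the intensity difference exceeds its threshold, is to make the variance map artificially large wherever skinMask is set, so the grown region cannot cross skin pixels:
% Sketch: block the region growing from entering the skin mask
varImMasked = varIm;
varImMasked(skinMask) = max(varIm(:)) + 1000;   % skin pixels become 'walls' for the growth
res1 = regiongrowing(varImMasked,centralSeed(2),centralSeed(1),8);
res2 = regiongrowing(varImMasked,faceSeed(2),faceSeed(1),8);
res  = res1|res2;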
I'm working with the Kinect v2 and I have to map the depth information onto the RGB images to process them: in particular, I need to know which pixels in the RGB images are in a certain range of distance (depth) along the Z axis; I'm acquiring all the data with a C# program and saving them as images (RGB) and txt files (depth).
I've followed the instruction from here and here (and I thank them for sharing), but I still have some problems I don't know how to solve.
I have calculated the rotation (R) and translation (T) matrix between the depth sensor and the RGB camera, as well as their intrinsic parameters.
I have created P3D_d (depth pixels in world coordinates related to depth sensor) and P3D_rgb (depth pixels in world coordinates related to rgb camera).
row_num = 424;
col_num = 512;
P3D_d = zeros(row_num,col_num,3);
P3D_rgb = zeros(row_num,col_num,3);
cont = 1;
for row=1:row_num
for col=1:col_num
P3D_d(row,col,1) = (row - cx_d) * depth(row,col) / fx_d;
P3D_d(row,col,2) = (col - cy_d) * depth(row,col) / fy_d;
P3D_d(row,col,3) = depth(row,col);
temp = [P3D_d(row,col,1);P3D_d(row,col,2);P3D_d(row,col,3)];
P3D_rgb(row,col,:) = R*temp+T;
end
end
I have created P2D_rgb_x and P2D_rgb_y.
P2D_rgb_x(:,:,1) = (P3D_rgb(:,:,1)./P3D_rgb(:,:,3))*fx_rgb+cx_rgb;
P2D_rgb_y(:,:,2) = (P3D_rgb(:,:,2)./P3D_rgb(:,:,3))*fy_rgb+cy_rgb;
but now I don't understand how to continue.
Assuming that the calibration parameters are correct, I've tried to click on a defined point in both the depth (coordinates: row_d, col_d) and rgb (coordinates: row_rgb, col_rgb) images, but P2D_rgb_x(row_d, col_d) is totally different from row_rgb, and P2D_rgb_y(row_d, col_d) is totally different from col_rgb.
So, what exactly do P2D_rgb_x and P2D_rgb_y mean? How can I use them to map depth values onto the RGB images, or just to get the depth of a certain RGB pixel?
I'll appreciate any suggestion or help!
PS: I also have a related post on MathWorks at this link.
I am currently trying to figure out how to detect a face that is 5 m away from the source, with its facial features clear enough for the user to see. The code I am working on is shown below.
faceDetector = vision.CascadeObjectDetector();
%Get the input device using image acquisition toolbox,resolution = 640x480 to improve performance
obj =imaq.VideoDevice('winvideo', 1, 'YUY2_640x480','ROI', [1 1 640 480]);
set(obj,'ReturnedColorSpace', 'rgb');
figure('menubar','none','tag','webcam');
while (true)
frame=step(obj);
bbox=step(faceDetector,frame);
boxInserter = vision.ShapeInserter('BorderColor','Custom',...
'CustomBorderColor',[255 255 0]);
videoOut = step(boxInserter, frame,bbox);
imshow(videoOut,'border','tight');
f=findobj('tag','webcam');
if (isempty(f));
[hueChannel,~,~] = rgb2hsv(frame);
% Display the Hue Channel data and draw the bounding box around the face.
figure, imshow(hueChannel), title('Hue channel data');
rectangle('Position',bbox,'EdgeColor','r','LineWidth',1)
hold off
noseDetector = vision.CascadeObjectDetector('Nose');
faceImage = imcrop(frame,bbox);
imshow(faceImage)
noseBBox = step(noseDetector,faceImage);
noseBBox(1:1) = noseBBox(1:1) + bbox(1:1);
videoInfo = info(obj);
ROI=get(obj,'ROI');
VideoSize = [ROI(3) ROI(4)];
videoPlayer = vision.VideoPlayer('Position',[300 300 VideoSize+30]);
tracker = vision.HistogramBasedTracker;
initializeObject(tracker, hueChannel, bbox);
while (1)
% Extract the next video frame
frame = step(obj);
% RGB -> HSV
[hueChannel,~,~] = rgb2hsv(frame);
% Track using the Hue channel data
bbox = step(tracker, hueChannel);
% Insert a bounding box around the object being tracked
videoOut = step(boxInserter, frame, bbox);
%Insert text coordinates
% Display the annotated video frame using the video player object
step(videoPlayer, videoOut);
pause (.2)
end
% Release resources
release(obj);
release(videoPlayer);
close(gcf)
break
end
pause(0.05)
end
release(obj)
% remove video object from memory
delete(handles.vid);
I am trying to work with this code to figure out the distance it can cover when tracking a face. I couldn't figure out which part handles that. Thanks!
Not sure what your question is, but try this example. It uses the KLT algorithm, which, IMHO, is more robust for face tracking than CAMShift. It also uses the webcam interface in base MATLAB, which is very easy.
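For reference, here is a condensed sketch of that KLT-style approach. It assumes the Computer Vision System Toolbox, the USB webcam support package, and that a face is visible in the first frame; it is not a drop-in replacement for the code above:
% Detect the face once, then track corner points inside it with KLT
faceDetector = vision.CascadeObjectDetector();
pointTracker = vision.PointTracker('MaxBidirectionalError', 2);
cam = webcam();                              % webcam interface
videoPlayer = vision.VideoPlayer();
frame = snapshot(cam);
bbox  = step(faceDetector, rgb2gray(frame));
points = detectMinEigenFeatures(rgb2gray(frame), 'ROI', bbox(1,:));
initialize(pointTracker, points.Location, frame);
for k = 1:200
    frame = snapshot(cam);
    [points, isFound] = step(pointTracker, frame);
    frame = insertMarker(frame, points(isFound,:), '+', 'Color', 'white');
    step(videoPlayer, frame);
end
clear cam
release(videoPlayer);
release(pointTracker);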
I am working on video stabilisation (making shaky videos non-shaky) using MATLAB.
One of the steps is to find a smooth camera path given the unstable camera path.
The unstable camera path is what gives the jittering or shake to the video.
I have the camera path specified using the camera position, which is 3D data:
camera path - (cx,cy,cz);
When I plot it in MATLAB, I can visually see the shakiness of the camera motion.
So now I need a least-squares fit to be done on the camera path specified by (cx,cy,cz).
I came across polyfit(), which does fitting for 2-dimensional data.
But what I need is a smooth 3D curve fitted to the shaky curve.
Thanks in advance.
Couldn't you just fit three separate 1d curves for cx(t), cy(t), cz(t)?
BTW: I think what you need is a Kalman filter, not a polynomial fit to the camera path. But I'm not sure if MATLAB has built-in support for that.
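To make that suggestion concrete, here is a rough sketch of a constant-velocity Kalman filter written in plain MATLAB (no toolboxes), saved as smoothPath1D.m and applied to one coordinate at a time. The noise settings are hypothetical and would need tuning:
function smoothed = smoothPath1D(z, q, r)
% z: Nx1 measured coordinate (e.g. cx), q: process noise, r: measurement noise
F = [1 1; 0 1];              % constant-velocity state transition
H = [1 0];                   % only the position is observed
Q = q*eye(2);  R = r;
x = [z(1); 0];               % state: [position; velocity]
P = eye(2);
smoothed = zeros(numel(z),1);
for k = 1:numel(z)
    x = F*x;  P = F*P*F' + Q;                        % predict
    K = P*H' / (H*P*H' + R);                         % Kalman gain
    x = x + K*(z(k) - H*x);  P = (eye(2) - K*H)*P;   % update
    smoothed(k) = x(1);
end
end
Usage would then be something like cxSmooth = smoothPath1D(cx, 1e-4, 1), and likewise for cy and cz.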
Approach using a least-squares fit:
t = (1:0.1:5)';
% model
px = [ 5 2 1 ];
x = polyval(px,t);
py = [ -2 1 1 ];
y = polyval(py,t);
pz = [ 1 20 1 ];
z = polyval(pz,t);
% plot model
figure
plot3(x,y,z)
hold all
% simulate measurement
xMeasured = x+2*(rand(length(x),1)-0.5);
yMeasured = y+2*(rand(length(y),1)-0.5);
zMeasured = z+2*(rand(length(z),1)-0.5);
% plot simulated measurements
plot3(xMeasured, yMeasured, zMeasured,'or')
hold off
grid on
% least squares fit
A = [t.^2, t, t./t];
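% note: t./t is simply a column of ones, i.e. the constant term of the quadratic model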
pxEstimated = A\xMeasured;
pyEstimated = A\yMeasured;
pzEstimated = A\zMeasured;
Let me first express my gratitude to stackoverflow.com, and then my thanks to zellus and nikie, who got me thinking about the problem more. I have now reached a solution which follows zellus's approach and, as nikie pointed out, uses the parameter 't'.
cx, cy, cz are the coordinates in 3D space, and in my case they are all 343x1 doubles.
My final code, which fits the 3D data set, is shown below:
t = linspace(1,343,343)';
load cx.mat;
load cy.mat;
load cz.mat;
plot3(cx, cy, cz,'r'),title('source Camera Path');
hold all
A = [t.^2, t, t./t];
fx = A\cx;
fy = A\cy;
fz = A\cz;
Xev = polyval(fx,t);
Yev = polyval(fy,t);
Zev = polyval(fz,t);
plot3(Xev,Yev,Zev,'+b'),title('Fitting Line');
I look forward to more interesting discussions on StackOverflow with great helpful people.