How to track the center point of the moving feature in the given picture (preferably using MATLAB)?

Please suggest a method or algorithm to track the center point of the feature.
The feature is part of a video. As the video plays, the feature keeps moving around but never leaves a rectangle of the size shown in the figure.
I wish to track the center point over the duration of the video.
*The red point is not part of the image. I have overlaid it to show the center point I wish to track.

A very simple way:
Create an image of the pattern to recognize.
Cross-correlate that pattern with each frame along X and Y.
Take the peaks of the X and Y correlation signals as the position of the pattern.
There is a lot of material on this; start here: http://en.wikipedia.org/wiki/Video_tracking
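The steps above can be sketched in plain Python. This is only an illustration under some assumptions: frames and the pattern are small grayscale images stored as nested lists, the search is a brute-force 2D zero-mean normalized cross-correlation (a common variant of the X/Y correlation described above, not necessarily what the answerer had in mind), and the function name match_template is made up for this example.

```python
def mean(values):
    return sum(values) / len(values)


def match_template(frame, template):
    """Find the template in the frame by zero-mean normalized cross-correlation.

    Returns ((center_row, center_col), score) for the best-scoring offset.
    """
    fh, fw = len(frame), len(frame[0])
    th, tw = len(template), len(template[0])

    # Zero-mean, flattened template and its norm (computed once).
    t_flat = [v for row in template for v in row]
    t_mean = mean(t_flat)
    t_zero = [v - t_mean for v in t_flat]
    t_norm = sum(v * v for v in t_zero) ** 0.5

    best_score, best_pos = float("-inf"), None
    for r in range(fh - th + 1):
        for c in range(fw - tw + 1):
            patch = [frame[r + i][c + j] for i in range(th) for j in range(tw)]
            p_mean = mean(patch)
            p_zero = [v - p_mean for v in patch]
            p_norm = sum(v * v for v in p_zero) ** 0.5
            if p_norm == 0 or t_norm == 0:
                continue  # flat patch or flat template: correlation undefined
            score = sum(a * b for a, b in zip(p_zero, t_zero)) / (p_norm * t_norm)
            if score > best_score:
                best_score, best_pos = score, (r, c)

    if best_pos is None:
        raise ValueError("no textured patch found in the frame")
    r, c = best_pos
    # Convert the top-left corner of the best match to its center point.
    return (r + (th - 1) / 2, c + (tw - 1) / 2), best_score
```

Calling this once per frame gives the track of the center point. Since the feature never leaves a known rectangle, the search can be restricted to that rectangle to keep it fast.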

Try using vision.PointTracker in the Computer Vision System Toolbox.

Related

Restoring the image of a face's plane

I am using an API to analyze faces in MATLAB, where I get for each picture a 3x3 rotation matrix of the face's orientation, telling which direction the head is pointing.
I am trying to normalize the image according to that matrix, so that it will be distorted to get the image of the face's plane. This is something like 'undoing' the projection of the face to the camera plane. For example, if the head is directed a little to the left, it will stretch the left side to (more or less) preserve the face's original proportions.
I tried using 'affine2d' and 'projective2d' with 'imwarp', but that did not achieve the goal.
Achieving your goal with simple tools like affine transformations seems impossible to me since a face is hardly a flat surface. An extreme example: Imagine the camera recording a profile view of someone's head. How are you going to reconstruct the missing half of the face?
There have been successful attempts to change the orientation of faces in images and real-time video, but the methods used are quite complex:
[We] propose a gaze correction method that needs just a
single webcam. We apply recent shape deformation techniques
to generate a 3D face model that matches the user’s face. We
then render a gaze-corrected version of this face model and
seamlessly insert it into the original image.
(Giger et al., https://graphics.ethz.ch/Downloads/Publications/Papers/2014/Gig14a/Gig14a.pdf)

How to detect an arbitrary four-sided polygon in an image and rectify it to a rectangle?

For a TV screen recognition project, I need to clip the TV screen out of an image.
The TV screen is actually a rectangle, but it is obviously distorted in an image taken with a phone camera. My questions are:
How do I detect an arbitrary four-sided polygon (not necessarily a rectangle) in the image?
Once I know the polygon's area in the image, how do I extract that area into a Mat?
After solving question 2, how do I convert the Mat of the four-sided polygon into a rectangular Mat with a fixed width/height ratio?
A code sample for reference would be very helpful.
Thanks for your answers!
If you want to detect the edges of your TV screen, you can use an edge
detector (like Canny) and then use the Hough transform to obtain the lines.
If you then extract the points where those lines intersect (the four screen
corners), you can compute a homography matrix H (3x3) from them. Finally, using
this homography you can "deform" your original image to a reference frame (in
our case the rectangle with a given aspect ratio). The homography is a
transformation from plane to plane, so it's exactly what you need here.
If you're going to use OpenCV (which is always a good choice!),
here are the functions that you could use:
Canny() - finds edges in the image
HoughLines() - detects lines
findHomography() - finds the homography matrix from a set of
correspondences. In your case, you will need to pass the method
as 0 (the regular method that uses all the points).
warpPerspective() - the function that you're going to use to "deform"
the image to a reference frame.
Obviously, you can find similar functions for MATLAB and others...
I hope this helps you.
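To make the homography step concrete, here is a minimal pure-Python sketch of what a function like findHomography() computes when given exactly four correspondences. It is not OpenCV's implementation: it fixes the bottom-right entry of H to 1 and solves the resulting 8x8 linear system, and the names solve_linear, homography_from_4_points and apply_homography are made up for this illustration. The corner coordinates in the usage note are invented as well.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_4_points(src, dst):
    """Return the 3x3 H (with H[2][2] = 1) mapping each src point to its dst point."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence, from u = (h1 . p) / (h3 . p)
        # and v = (h2 . p) / (h3 . p) with p = (x, y, 1).
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map a 2D point through H, including the perspective division."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

For example, with detected screen corners [(120, 80), (520, 110), (500, 400), (100, 360)] and a 640x360 target rectangle [(0, 0), (640, 0), (640, 360), (0, 360)], apply_homography maps each corner onto the matching rectangle corner. Resampling the whole image through H is what warpPerspective() does and is not shown here.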

3D trajectory reconstruction from video (taken by a single camera)

I am currently trying to reconstruct a 3D trajectory of a falling object like a ball or a rock out of a sequence of images taken from an iPhone video.
Where should I start looking? I know I have to calibrate the camera (I think I'll use the MATLAB calibration toolbox by Jean-Yves Bouguet) and then find the vanishing point from the same sequence, but after that I'm really stuck.
Read this: http://www.cs.auckland.ac.nz/courses/compsci773s1c/lectures/773-GG/lectA-773.htm
It explains 3D reconstruction using two cameras. Now for a simple summary, look at the figure from that site:
You only know pr/pl, the image points. By tracing a line from each camera's focal point (Or/Ol) through its image point, you get two lines (Pr/Pl) that both contain the point P. Because you know the two cameras' origins and orientations, you can construct 3D equations for these lines. Their intersection is the 3D point. Voila, it's that simple.
But when you discard one camera (let's say the left one), you only know for sure the line Pr. What's missing is depth. Luckily you know the radius of your ball, and this extra information can give you the missing depth. See the next figure (don't mind my paint skills):
Now you know the depth using the intercept theorem: the ratio of the ball's real radius to its radius in the image equals the ratio of its depth to the focal length.
I see one last issue: the shape of the ball changes when it is projected under an angle (i.e. not perpendicular to your capture plane). However, you do know the angle, so compensation is possible, but I leave that up to you :p
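As a rough sketch of the intercept-theorem step, assuming an ideal pinhole camera with a known focal length in pixels and a known principal point, and ignoring the off-axis shape distortion mentioned above, the ball's position in the camera frame could be computed like this. The function name and all parameter names are made up for the illustration.

```python
def ball_position(u, v, pixel_diameter, real_diameter, focal_px, cx, cy):
    """Estimate the ball center (x, y, z) in the camera frame.

    u, v            pixel coordinates of the ball center in the image
    pixel_diameter  apparent diameter of the ball in pixels
    real_diameter   true diameter of the ball (the result uses the same unit)
    focal_px        focal length in pixels (from calibration)
    cx, cy          principal point in pixels (from calibration)
    """
    # Intercept theorem: real_diameter / z = pixel_diameter / focal_px
    z = focal_px * real_diameter / pixel_diameter
    # Back-project the pixel onto the plane at depth z.
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)
```

With a focal length of 1000 px, a 0.2 m ball that appears 50 px wide is 4 m away. Doing this for every frame gives the trajectory as a sequence of 3D points.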
edit: @ripkars' comment (comment box was too small)
1) ok
2) aha, the correspondence problem :D It is typically solved by correlation analysis or by matching features (in a video, mostly matching followed by tracking). Other methods exist too.
I haven't used the image/vision toolbox myself, but there should definitely be some things in it to help you on the way.
3) = calibration of your cameras. Normally you only need to do this once, when installing the cameras (and again every time you change their relative pose).
4) yes, just put the Longuet-Higgins equation to work, i.e. solve
P = C1 + mu1*R1*K1^(-1)*p1
P = C2 + mu2*R2*K2^(-1)*p2
with
P = 3D point to find
C = camera center (vector)
R = rotation matrix expressing the orientation of the camera in the world frame
K = calibration matrix of the camera (containing the internal parameters of the camera, not to be confused with the external parameters contained in R and C)
p1 and p2 = the image points (in homogeneous pixel coordinates)
mu = parameter expressing the position of P on the projection line from camera center C to P (if I'm correct, R*K^-1*p is a direction vector pointing from C to P)
These are 6 equations in 5 unknowns: mu1, mu2 and the three coordinates of P.
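Because there are more equations than unknowns and real measurements are noisy, the two rays usually do not meet exactly. One common way to handle this (an assumption here, not something stated above) is to solve for mu1 and mu2 in the least-squares sense and take the midpoint of the closest points on the two rays. The sketch below takes the ray directions d1 = R1*K1^(-1)*p1 and d2 = R2*K2^(-1)*p2 as already computed; the function names are made up for the illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def triangulate_rays(c1, d1, c2, d2):
    """Least-squares intersection of the rays c1 + mu1*d1 and c2 + mu2*d2.

    Returns (point, gap): the midpoint between the closest points on the two
    rays, and the distance between those closest points.
    """
    base = [b - a for a, b in zip(c1, c2)]  # C2 - C1
    # Normal equations of min |C1 + mu1*d1 - C2 - mu2*d2|^2 over mu1, mu2.
    a11, a12, a22 = dot(d1, d1), -dot(d1, d2), dot(d2, d2)
    b1, b2 = dot(d1, base), -dot(d2, base)
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("rays are (nearly) parallel")
    mu1 = (b1 * a22 - a12 * b2) / det
    mu2 = (a11 * b2 - a12 * b1) / det
    q1 = [c + mu1 * d for c, d in zip(c1, d1)]  # closest point on ray 1
    q2 = [c + mu2 * d for c, d in zip(c2, d2)]  # closest point on ray 2
    point = [(u + v) / 2 for u, v in zip(q1, q2)]
    gap = sum((u - v) ** 2 for u, v in zip(q1, q2)) ** 0.5
    return point, gap
```

The returned gap is a cheap sanity check: a large value suggests a wrong correspondence or a poor calibration.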
edit: @ripkars' comment (comment box too small once again)
The only computer vision library that comes to mind is OpenCV (http://opencv.willowgarage.com/wiki ). But that's a C library, not MATLAB... I guess Google is your friend ;)
About the calibration: yes, if those two images contain enough information to match some features. If you change the relative pose of the cameras, you'll have to recalibrate, of course.
The choice of the world frame is arbitrary; it only becomes important when you want to analyze the retrieved 3D data afterwards. For example, you could align one of the world planes with the plane of motion, which simplifies the motion equation if you want to fit one.
This world frame is just a reference frame, changeable with a 'change of reference frame transformation' (translation and/or rotation transformation)
Unless you have a stereo camera, you will never be able to know the position for sure, even with a calibrated camera, because you don't know whether the ball is small and close or large and far away.
There are other single-camera methods, based on a series of images with different focus settings. But I doubt that you can control the camera of your cell phone in that way.
Edit(1):
As @GuntherStruyf correctly points out, you can know the position if one of your inputs is the size of the ball.

Is there a way to figure out 3D distance/view angle from a 2D environment using the iPhone/iPad camera?

Maybe I'm asking this too soon in my research, but I'd better know if this is possible sooner than later.
Imagine I have the following square printed on a paper on top of a table:
The table is brown, so it does not match any of the colors in the square. Is there a way for me, using a common iPhone camera (non-stereo view), to figure out the distance and angle from which I'm looking at the square on the table?
In the end, what I'm looking for is to be able to draw a 3D square on top of this one using the camera image, but I'm not sure if I will be able to figure out the distance and position of the object in space using only a 2D image. Any hints are much appreciated.
Short answer: http://weblog.bocoup.com/javascript-augmented-reality
Long answer:
First posterize, then vectorize. With the vectors in hand, you may need some math to derive the perspective from the vectors' positions, and from that the camera position.
Maybe these help:
www.pixastic.com/lib/docs/actions/posterize/
github.com/selead/cl-vectorizer
vectormagic.com/home
autotrace.sourceforge.net
www.scipy.org/PyLab
raphaeljs.com/
technabob.com/blog/2007/12/29/video-games-get-vectorized/
superuser.com/questions/88415/is-there-an-open-source-alternative-to-vector-magic
Oughta be possible. Scan the image for the red/blue/yellow pattern, then do edge detection to figure out how warped the squares are (they'll appear as skewed quadrilaterals, roughly parallelograms, in anything but a straight-on view). Distance would depend on the camera's zoom setting and scan resolution. But basically you'd count how many pixels are visible in each of the squares, run that past the camera's specs, and you should be able to determine a rough distance.

Calculating corresponding pixels

I have a computer vision setup with two cameras. One of these cameras is a time-of-flight camera. It gives me the depth of the scene at every pixel. The other camera is a standard camera giving me a colour image of the scene.
We would like to use the depth information to remove some areas from the colour image. We plan on object, person and hand tracking in the colour image and want to remove far-away background pixels with the help of the time-of-flight camera. It is not yet certain whether the cameras can be aligned in a parallel setup.
We could use OpenCV or MATLAB for the calculations.
I have read a lot about rectification, epipolar geometry, etc., but I still have trouble seeing the steps I have to take to calculate the correspondence for every pixel.
What approach would you use, and which functions can be used? Into which steps would you divide the problem? Is there a tutorial or sample code available somewhere?
Update: We plan on doing an automatic calibration using known markers placed in the scene.
If you want robust correspondences, you should consider SIFT. There are several implementations in MATLAB; I use the Vedaldi-Fulkerson VLFeat library.
If you really need fast performance (and I think you don't), you should think about using OpenCV's SURF detector.
If you have any other questions, do ask. This other answer of mine might be useful.
PS: By correspondences, I'm assuming you want to find the coordinates of the projections of the same 3D point in both your images, i.e. the coordinates (i,j) of a pixel u_A in Image A and of a pixel u_B in Image B which are both projections of the same point in 3D.
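Since one of the two cameras already measures depth, a correspondence in the sense of the PS above can also be computed geometrically, without feature matching. This is a different technique from the SIFT/SURF matching suggested in the answer. The sketch below assumes both cameras are calibrated pinhole cameras, that the depth value is the distance along the optical axis (some time-of-flight cameras report radial distance instead), and that the rotation and translation from the depth camera frame to the colour camera frame are known, for instance from the marker-based calibration mentioned in the question. All names are made up for the illustration.

```python
def depth_pixel_to_colour_pixel(u, v, depth, k_depth, k_colour, rotation, translation):
    """Map a depth-camera pixel with known depth to colour-image coordinates.

    k_depth, k_colour  intrinsics as (fx, fy, cx, cy), in pixels
    rotation           3x3 matrix (nested lists), depth frame -> colour frame
    translation        3-vector, depth frame -> colour frame
    """
    fx, fy, cx, cy = k_depth
    # Back-project the depth pixel to a 3D point in the depth camera frame.
    p = ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)
    # Express that point in the colour camera frame.
    q = [sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
         for i in range(3)]
    if q[2] <= 0:
        raise ValueError("point lies behind the colour camera")
    # Project it into the colour image.
    fx2, fy2, cx2, cy2 = k_colour
    return (fx2 * q[0] / q[2] + cx2, fy2 * q[1] / q[2] + cy2)
```

Applying this to every depth pixel yields a depth value for (a subset of) the colour pixels, which can then be thresholded to remove far-away background.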