I have a computer vision setup with two cameras. One of these cameras is a time-of-flight camera. It gives me the depth of the scene at every pixel. The other camera is a standard camera giving me a colour image of the scene.
We would like to use the depth information to remove some areas from the colour image. We plan on object, person and hand tracking in the colour image and want to remove far-away background pixels with the help of the time-of-flight camera. It is not yet certain whether the cameras can be aligned in a parallel setup.
We could use OpenCV or MATLAB for the calculations.
I have read a lot about rectification, epipolar geometry, etc., but I still have trouble seeing the steps I have to take to calculate the correspondence for every pixel.
What approach would you use, and which functions can be used? Into which steps would you divide the problem? Is there a tutorial or sample code available somewhere?
Update: We plan on doing an automatic calibration using known markers placed in the scene.
If you want robust correspondences, you should consider SIFT. There are several implementations in MATLAB - I use the Vedaldi-Fulkerson VL Feat library.
If you really need fast performance (and I think you don't), you should think about using OpenCV's SURF detector.
If you have any other questions, do ask. This other answer of mine might be useful.
PS: By correspondences, I'm assuming you want to find the coordinates of a projection of the same 3D point on both your images - i.e. the coordinates (i,j) of a pixel u_A in Image A and of a pixel u_B in Image B that are projections of the same point in 3D.
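As a minimal sketch (not taken from the original answer), finding such correspondences with VLFeat's SIFT in MATLAB could look roughly like this; the image file names are placeholders and the VLFeat toolbox is assumed to have been set up with vl_setup:

    % Sparse correspondences via VLFeat SIFT (hypothetical file names).
    Ia = single(rgb2gray(imread('imA.png')));  % VLFeat expects single-precision grayscale
    Ib = single(rgb2gray(imread('imB.png')));

    [fa, da] = vl_sift(Ia);                    % frames [x; y; scale; orientation] and descriptors
    [fb, db] = vl_sift(Ib);

    matches = vl_ubcmatch(da, db);             % descriptor matching with Lowe's ratio test

    uA = fa(1:2, matches(1, :))';              % pixel coordinates of u_A in Image A
    uB = fb(1:2, matches(2, :))';              % corresponding pixel coordinates of u_B in Image B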
I am using an API to analyze faces in MATLAB, where I get for each picture a 3x3 rotation matrix of the face's orientation, telling which direction the head is pointing.
I am trying to normalize the image according to that matrix, so that it will be distorted to get the image of the face's plane. This is something like 'undoing' the projection of the face to the camera plane. For example, if the head is directed a little to the left, it will stretch the left side to (more or less) preserve the face's original proportions.
I tried using 'affine2d' and 'projective2d' with 'imwarp', but that didn't achieve the goal.
Achieving your goal with simple tools like affine transformations seems impossible to me since a face is hardly a flat surface. An extreme example: Imagine the camera recording a profile view of someone's head. How are you going to reconstruct the missing half of the face?
There have been successful attempts to change the orientation of faces in images and real-time video, but the methods used are quite complex:
[We] propose a gaze correction method that needs just a single webcam. We apply recent shape deformation techniques to generate a 3D face model that matches the user's face. We then render a gaze-corrected version of this face model and seamlessly insert it into the original image.
(Giger et al., https://graphics.ethz.ch/Downloads/Publications/Papers/2014/Gig14a/Gig14a.pdf)
Recently I was given a project on multi-view 3D scanning to complete within two weeks, and I have searched through books, journals and websites on 3D reconstruction, including the MathWorks examples. I have written code to track matched points between two images and reconstruct them into a 3D plot. However, despite using the detectSURFFeatures() and extractFeatures() functions, some of the object points are still not tracked. How can I also reconstruct them in my 3D model?
What you are looking for is called "dense reconstruction". The best way to do this is with calibrated cameras. Then you can rectify the images, compute disparity for every pixel (in theory), and then get 3D world coordinates for every pixel. Please check out this Stereo Calibration and Scene Reconstruction example.
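As a rough sketch of that pipeline with the Computer Vision Toolbox (assuming stereoParams has already been obtained, e.g. from the Stereo Camera Calibrator app or estimateCameraParameters, and that left.png/right.png are placeholder colour images):

    % Dense reconstruction sketch: rectify, compute per-pixel disparity, then 3D points.
    I1 = imread('left.png');
    I2 = imread('right.png');

    [J1, J2] = rectifyStereoImages(I1, I2, stereoParams);      % row-align the two views
    disparityMap = disparity(rgb2gray(J1), rgb2gray(J2));      % disparity for every pixel
    xyzPoints = reconstructScene(disparityMap, stereoParams);  % 3D world coordinates per pixel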
The tracking approach you are using is fine but will only get sparse correspondences. The idea is that you would use the best of these to try to determine the difference in camera orientation between the two images. You can then use the camera orientation to get better matches and ultimately to produce a dense match which you can use to produce a depth image.
Tracking every point in an image from frame to frame is hard (it's called scene flow) and you won't achieve it by identifying individual features (such as SURF, ORB, FREAK, SIFT, etc.), because these features are by definition 'special' in that they can be clearly identified between images.
If you have access to MATLAB's Computer Vision Toolbox, you could use its matching functions, for example as sketched below.
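A hedged sketch of that sparse-matching step (assuming I1 and I2 are grayscale images of the same scene; this is illustrative, not the answerer's own code):

    points1 = detectSURFFeatures(I1);
    points2 = detectSURFFeatures(I2);

    [f1, vpts1] = extractFeatures(I1, points1);
    [f2, vpts2] = extractFeatures(I2, points2);

    indexPairs = matchFeatures(f1, f2);
    matched1 = vpts1(indexPairs(:, 1));
    matched2 = vpts2(indexPairs(:, 2));

    % The best matches can then be used to estimate the epipolar geometry between
    % the views (RANSAC rejects outliers), which encodes the relative orientation.
    [F, inlierIdx] = estimateFundamentalMatrix(matched1, matched2, 'Method', 'RANSAC');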
You can start, for example, by checking out this article about disparity and the related MATLAB functions.
In addition, you can read about different matching techniques such as block matching, semi-global block matching and global optimization procedures, just to name a few keywords. But be aware that stereo matching is a huge topic.
I have several images of a pugmark with a lot of irrelevant background region. I cannot use intensity-based algorithms to separate the background from the foreground.
I have tried several methods. One of them is detecting the object in a homogeneous-intensity image,
but this does not work with rough-texture images like:
http://img803.imageshack.us/img803/4654/p1030076b.jpg
http://imageshack.us/a/img802/5982/cub1.jpg
http://imageshack.us/a/img42/6530/cub2.jpg
There could be three possible methods:
1) If I can reduce the roughness of the image and obtain a smoother texture, i.e. a flatter surface.
2) If I could detect the pugmark-like shape in these images by defining a rough pugmark shape in a database and then removing the background, to obtain an image like http://i.imgur.com/W0MFYmQ.png
3) If I could detect the regions with depth and separate them from the background based on the difference in their depths.
Please tell me whether any of these methods would work and, if so, how to implement them.
I have a hunch that this problem could benefit from using polynomial texture maps.
See here: http://www.hpl.hp.com/research/ptm/
You might want to consider top-down information in the process. See, for example, this work.
Looks like you're close enough to the pugmark, so I think that you should be able to detect pugmarks using the Viola-Jones algorithm. Maybe a PCA-like algorithm such as Eigenface would work too; even if you're not trying to recognize a particular pugmark, it can still be used to tell whether or not there is a pugmark in the image.
Have you tried edge detection on your image? I guess it should be possible to fine-tune the Canny edge detector thresholds in order to get rid of the noise (if that's not good enough, low-pass filter your image first), then do shape recognition on what remains (you would then be in the field of geometric feature learning and structural matching). Viola-Jones and possibly a PCA-like algorithm would be my first try, though.
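As an illustrative MATLAB sketch of that Canny suggestion (the file name and threshold values are placeholders to be tuned per image):

    I = rgb2gray(imread('pugmark.jpg'));          % hypothetical file name
    Ismooth = imgaussfilt(I, 2);                  % low-pass filter to suppress texture noise
    edges = edge(Ismooth, 'Canny', [0.05 0.20]);  % [low high] thresholds, tune per image
    imshow(edges);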
Maybe I'm asking this too soon in my research, but I'd better know if this is possible sooner rather than later.
Imagine I have the following square printed on a paper on top of a table:
The table is brown, so it does not match any of the colors in the square. Is there a way for me, from a common iPhone camera (non-stereo view), to figure out the distance and angle from which I'm looking at the square on the table?
In the end what I'm looking for is being able to draw a 3D square on top of this one using the camera image, but I'm not sure if I am going to be able to figure out the distance and position of the object in space using only a 2D image. Any hints are well appreciated.
Short answer: http://weblog.bocoup.com/javascript-augmented-reality
Big answer:
First posterize, then vectorize. With the vectors in hand, you may need to do some math tricks to determine, based on the vectors' positions, the perspective and then the camera position.
Maybe these help:
www.pixastic.com/lib/docs/actions/posterize/
github.com/selead/cl-vectorizer
vectormagic.com/home
autotrace.sourceforge.net
www.scipy.org/PyLab
raphaeljs.com/
technabob.com/blog/2007/12/29/video-games-get-vectorized/
superuser.com/questions/88415/is-there-an-open-source-alternative-to-vector-magic
Oughta be possible. Scan the image for the red/blue/yellow pattern, then do edge detection to figure out how warped the squares are (they'll be parallelograms in anything but straight-on view). Distance would depend on the camera's zoom setting and scan resolution. But basically you'd count how many pixels are visible in each of the squares, run that past the camera's specs and you should be able to determine a rough distance.
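For the "figure out how warped the squares are" part, a hedged MATLAB sketch (the detected corner coordinates and the 10 cm side length are made-up example values; locating the corners in the image is the part you would still have to implement):

    worldCorners = [0 0; 10 0; 10 10; 0 10];          % square corners in cm, on the paper plane
    imageCorners = [102 80; 310 95; 330 290; 95 270]; % hypothetical detected corners in pixels

    % Projective transform describing how the square is warped in the photo.
    tform = fitgeotrans(imageCorners, worldCorners, 'projective');

    % With known camera intrinsics (e.g. from the Camera Calibrator app), the same
    % correspondences would give the camera pose relative to the square:
    % [R, t] = extrinsics(imageCorners, worldCorners, cameraParams);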
I have two images. In one of the images, my eye is in the center position, and in the other image, it is on the left. How do I find out whether my eye is on the left or the right?
I am using MATLAB. Are there any functions for this?
A simple solution is to try to detect the iris using a circular Hough transform.
You can find a lot of material out there. To name a few, these two File Exchange submissions:
Hough Transform for circle detection
Circle Detection via Standard Hough Transform
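A minimal sketch of that idea with the Image Processing Toolbox's imfindcircles (the file name and radius range are assumptions to adapt to your images):

    I = rgb2gray(imread('eye.png'));                    % hypothetical file name
    [centers, radii] = imfindcircles(I, [15 40], ...
        'ObjectPolarity', 'dark', 'Sensitivity', 0.9);  % the iris is darker than the sclera
    imshow(I); viscircles(centers, radii);              % overlay the detected circles

Comparing the detected center's x coordinate against size(I,2)/2 then tells you whether the eye lies in the left or right half of the image.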
This sounds like Eye tracking implemented in MATLAB which is a fairly popular research topic.
If you want a more detailed answer, please answer the following questions:
Do you know the coordinates of your eye in the first image?
What kind of motion is there between the two images? Rotation/translation/scaling/...?
Do you want this to be real-time?
What is the resolution of the images?
Are there going to be more eyes in the image apart from yours?
If you are willing to select the eye in one image you can use template matching to find it in others (for example you can mark it in the first frame of a video and then find it in all other frames).
Look at the normxcorr2 function in MATLAB:
http://www.nd.edu/~hpcc/solaris8_usr_local/src/matlab6.1/help/toolbox/images/normxcorr2.html
This technique is robust to constant illumination change, but will fail if the appearance of the eye changes significantly between the image you took the template from and the image you are searching in.
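A hedged sketch of that template-matching step (the file names and crop rectangle are placeholders):

    ref   = rgb2gray(imread('frame1.png'));        % hypothetical file names
    frame = rgb2gray(imread('frame2.png'));
    eyeTemplate = imcrop(ref, [120 80 40 25]);     % placeholder rectangle around the eye in frame 1

    c = normxcorr2(eyeTemplate, frame);            % normalized cross-correlation over frame 2
    [~, idx] = max(c(:));                          % location of the best match
    [ypeak, xpeak] = ind2sub(size(c), idx);

    % normxcorr2 pads the output, so subtract the template size to get the
    % top-left corner of the match in the original frame.
    yoffset = ypeak - size(eyeTemplate, 1);
    xoffset = xpeak - size(eyeTemplate, 2);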
If you are going to search for the eye in a lot of frames (for example, eye tracking from a webcam), then you should look at stronger techniques such as the Kalman filter or the particle filter (a.k.a. the Condensation filter in computer vision).
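As an illustration only (not part of the original answer), the Computer Vision Toolbox's Kalman filter utilities could smooth and predict the eye position across frames; measuredLocations below stands in for the per-frame matching results and is made-up data:

    measuredLocations = [100 120; 103 121; 107 119; 112 122];   % hypothetical [x y] per frame

    kf = configureKalmanFilter('ConstantVelocity', measuredLocations(1, :), ...
        [25 10], [10 5], 5);      % initial estimate error, motion noise, measurement noise

    for k = 1:size(measuredLocations, 1)
        predictedLocation = predict(kf);                           % prediction for this frame
        trackedLocation   = correct(kf, measuredLocations(k, :));  % update with the measurement
    end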
By using Color Distance Maps, the skin and non-skin areas can be differentiated, and thus the non-skin area contains the iris. From the iris, the whole eye can be detected. Hope it works.
You should also have a look at Eye Ball Detection in MATLAB; they detect the eyes first and then detect the eyeball.