eye position mapping with the screen pixel - matlab

I am currently doing a project called eye controlled cursor using MATLAB.
I have few stages before I extract out the center of the iris (which can be considered as a pupil location). face detetcion - > eye detection -- > iris detection -->And finally i have obtained the center of the iris as show in the figure.
Now, I am trying to map this position (X,Y) to my computer screen pixel (1366 x 768). In most of the journals I have found, they require a reference point such as lips, nose or eye corner. But I am only able to extract the center of iris by doing certain thresholding. How can i map this position (X,Y) to my computer screen pixel (1366 x 768)?

Well you either have to fix the head to a certain position (which isn't very practical) or you will have to adapt to the face position. Depending on your image, you will have to choose points that are always on that image and are easy to detect. If you just have one point (like the nose), you can only adjust for the x/y shift of your head. If you have more points (like the 4 corners of the eye, the nose, maybe the corners of the mouth), you can also extract the 3 rotational values of the head and therefore calculate the direction of sight much better. For a first approach, I guess only the two inner corners of the eye (they are "easy" to detect) will do.
I would also recommend using a calibration sequency. You present the user with a sequence of 4 red points in the corners of the screen and he has to look at them. You can then record the positions of the pupils and interpolate between them.

Related

Is it possible to find the depth of an internal point of an object using stereo images (or any other method)?

I have image of robot with yellow markers as shown
The yellow points shown are the markers. There are two cameras used to view placed at an offset of 90 degrees. The robot bends in between the cameras. The crude schematic of the setup can be referred.
https://i.stack.imgur.com/aVyDq.png
Using the two cameras I am able to get its 3d co-ordinates of the yellow markers. But, I need to find the 3d-co-oridnates of the central point of the robot as shown.
I need to find the 3d position of the red marker points which is inside the cylindrical robot. Firstly, is it even feasible? If yes, what is the method I can use to achieve this?
As a bonus, is there any literature where they find the 3d location of such internal points which I can refer to (I searched, but could not find anything similar to my ask).
I am welcome to a theoretical solution as well(as long as it assures to find the central point within a reasonable error), which I can later translate to code.
If you know the actual dimensions, or at least, shape (e.g. perfect circle) of the white bands, then yes, it is feasible and possible.
You need to do the following steps, which are quite non trivial to do, and I won't do them here:
Optional but extremely suggested: calibrate your camera, and
undistort it.
find the equation of the projection of a 3D circle into a 2D camera, for any given rotation. You can simplify this by assuming the white line will be completely horizontal. You want some function that takes the parameters that make a circle and a rotation.
Find all white bands in the image, segment them, and make them horizontal (rotate them)
Fit points in the corrected white circle to the equation in (1). That should give you the parameters of the circle in 3d (radious, angle), if you wrote the equation right.
Now that you have an analytic equation of the actual circle (equation from 1 with parameters from 3), you can map any point from this circle (e.g. its center) to the image location. Remember to uncorrect for the rotations in step 2.
This requires understanding of curve fitting, some geometric analytical maths, and decent code skills. Not trivial, but this will provide a solution that is highly accurate.
For an inaccurate solution:
Find end points of white circles
Make line connecting endpoints
Chose center as mid point of this line.
This will be inaccurate because: choosing end points will have more error than fitting an equation with all points, ignores cone shape of view of the camera, ignores geometry.
But it may be good enough for what you want.
I have been able to extract the midpoint by fitting an ellipse to the arc visible to the camera. The centroid of the ellipse is the required midpoint.
There will be wrong ellipses as well, which can be ignored. The steps to extract the ellipse were:
Extract the markers
Binarise and skeletonise
Fit ellipse to the arc (found a matlab function for this)
Get the centroid of the ellipse
hsv_img=rgb2hsv(im);
bin=new_hsv_img(:,:,3)>marker_th; %was chosen 0.35
%skeletonise
skel=bwskel(bin);
%use regionprops to get the pixelID list
stats=regionprops(skel,'all');
for i=1:numel(stats)
el = fit_ellipse(stats(i).PixelList(:,1),stats(i).PixelList(:,2));
ellipse_draw(el.a, el.b, -el.phi, el.X0_in, el.Y0_in, 'g');
The link for fit_ellipse function
Link for ellipse_draw function

Measuring objects in a photo taken by calibrated cameras, knowing the size of a reference object in the photo

I am writing a program that captures real time images from a scene by two calibrated cameras (so the internal parameters of the cameras are known to us). Using two view geometry, I can find the essential matrix and use OpenCV or MATLAB to find the relative position and orientation of one camera with respect to another. Having the essential matrix, it is shown in Hartley and Zisserman's Multiple View Geometry that one can reconstruct the scene using triangulation up to scale. Now I want to use a reference length to determine the scale of reconstruction and resolve ambiguity.
I know the height of the front wall and I want to use it for determining the scale of reconstruction to measure other objects and their dimensions or their distance from the center of my first camera. How can it be done in practice?
Thanks in advance.
Edit: To add more information, I have already done linear trianglation (minimizing the algebraic error) but I am not sure if it is any useful because there is still a scale ambiguity that I don't know how to get rid of it. My ultimate goal is to recognize an object (like a Pepsi can) and separate it in a rectangular area (which is going to be written as a separate module by someone else) and then find the distance of each pixel in this rectangular area, i.e. the region of interest, to the camera. Then the distance from the camera to the object will be the minimum of the distances from the camera to the 3D coordinates of the pixels in the region of interest.
Might be a bit late, but at least for someone struggling with the same staff.
As far as I remember it is actually linear problem. You got essential matrix, which gives you rotation matrix and normalized translation vector specifying relative position of cameras. If you followed Hartley and Zissermanm you probably chose one of the cameras as origin of world coordinate system. Meaning all your triangulated points are in normalized distance from this origin. What is important is, that the direction of every triangulated point is correct.
If you have some reference in the scene (lets say height of the wall), then you just have to find this reference (2 points are enough - so opposite ends of the wall) and calculate "normalization coefficient" (sorry for terminology) as
coeff = realWorldDistanceOf2Points / distanceOfTriangulatedPoints
Once you have this coeff, just mulptiply all your triangulated points with it and you got real world points.
Example:
you know that opposite corners of the wall are 5m from each other. you find these corners in both images, triangulate them (lets call triangulated points c1 and c2), calculate their distance in the "normalized" world as ||c1 - c2|| and get the
coeff = 5 / ||c1 - c2||
and you get real 3d world points as triangulatedPoint*coeff.
Maybe easier option is to have both cameras in fixed relative position and calibrate them together by stereoCalibrate openCV/Matlab function (there is actually pretty nice GUI in Matlab for that) - it returns not just intrinsic params, but also extrinsic. But I don't know if this is your case.

Finding the centers of overlapping circles in a low resolution grayscale image

I am currently taking my first steps in the field of computer vision and image processing.
One of the tasks I'm working on is finding the center coordinates of (overlapping and occluded) circles.
Here is a sample image:
Here is another sample image showing two overlapping circles:
Further information about the problem:
Always a monochrome, grayscale image
Rather low resolution images
Radii of the circles are unknown
Number of circles in a given image is unknown
Center of circle is to be determined, preferably with sub-pixel accuracy
Radii do not have to be determined
Relative low overhead of the algorithm is of importance; the processing is supposed to be carried out with real-time camera images
For the first sample image, it is relatively easy to calculate the center of the circle by finding the center of mass. Unfortunately, this is not going to work for the second image.
Things I tried are mainly based on the Circle Hough Transform and the Distance Transform.
The Circle Hough Transform seemed relatively computationally expensive due to the fact that I have no information about the radii and the range of possible radii is large. Furthermore, it seems hard to identify the (appropriate) pixels along the edge because of the low resolution of the image.
As for the Distance Transform, I have trouble identifying the centers of the circles and the fact that the image needs to be binarized implies a certain loss of information.
Now I am looking for viable alternatives to the aforementioned algorithms.
Some more sample images (images like the two samples above are extracted from images like the following):
Just thinking aloud to try and get the ball rolling for you... I would be thinking of a Blob, or Connected Component analysis to separate out your blobs.
Then I would start looking at each blob individually. First thing is to see how square the bounding box is for each blob. If it is pretty square AND the centroid of the blob is central within the square, then you have a single circle. If it is not square, or the centroid is not central, you have more than one circle.
Now I am going to start looking at where the white areas touch the edges of the bounding box for some clues as to where the centres are...

Heat map generator of a floor plan image

I want to generate a heat map image of a floor. I have the following things:
A black & white .png image of the floor
A three column array stored in Matlab.
-- The first two columns indicate the X & Y coordinates of the floorpan image
-- The third coordinate denotes the "temperature" of that particular coordinate
I want to generate a heat map of the floor that will show the "temperature" strength in those coordinates. However, I want to display the heat map on top of the floor plan so that the viewers can see which rooms lead to which "temperatures".
Is there any software that does this job? Can I use Matlab or Python to do this?
Thanks,
Nazmul
One way to do this would be:
1) Load in the floor plan image with Matlab or NumPy/matplotlib.
2) Use some built-in edge detection to locate the edge pixels in the floor plan.
3) Form a big list of (x,y) locations where an edge is found in the floor plan.
4) Plot your heat map
5) Scatterplot the points of the floor plan as an overlay.
It sounds like you know how to do each of these steps individually, so all you'll need to do is look up some stuff on how to overlay plots onto the same axis, which is pretty easy in both Matlab and matplotlib.
If you're unfamiliar, the right commands look at are things like meshgrid and surf, possibly contour and their Python equivalents. I think Matlab has a built-in for Canny edge detection. I believe this was more difficult in Python, but if you use the PIL library, the Mahotas library, the scikits.image library, and a few others tailored for image manipulation, it's not too bad. SciPy may actually have an edge filter by now though, so check there first.
The only sticking point will be if your (x,y) data for the temperature are not going to line up with the (x,y) pixel locations in the image. In that case, you'll have to play around with some x-scale factor and y-scale factor to transform your heat map's coordinates into pixel coordinates first, and then plot the heat map, and then the overlay should work.
This is a fairly low-tech way to do it; I assume you just need a quick and dirty plot to illustrate how something's working. This method does have the advantage that you can change the style of the floorplan points easily, making them larger, thicker, thinner, different colors, or transparent, depending on how you want it to interact with the heat map. However, to do this for real, use GIMP, Inkscape, or Photoshop and overlay the heatmap onto the image after the fact.
I would take a look at using Python with a module called Polygon
Polygon will allow you to draw up the room using geometric shapes and I believe you can just do the borders of a room as an overlay on your black and white image. While I haven't used to a whole lot at this point, I do know that you only need a single (x,y) coordinate pair to be able to "hit test" against the given shape and then use that "hit test" to know the shape who's color you'd want to change.
Ultimately I think polygon would make your like a heck of a lot easier when it comes to creating the room shapes, especially when they aren't nice rectangles =)
A final little note though. Make sure to read through all of the documentation that Jorg has with his project. I haven't used it in the Python 3.x environment yet, but it was a little painstaking to get it up an running in 2.7.
Just my two cents, enjoy!

How to deduce angle an image was rotated through?

I have an image that was rotated to an unknown angle, and I don't have the original image. How I determine the angle of rotation with matlab commands?
I need to rotate the image back with this angle to reach the original image.
As #High Performance Mark mentions in his comment, it is difficult to give an answer when it is unclear how you can recognize that the image is rotated, or what would make you decide that the rotation has properly been corrected.
In other words, you will first have to find a way to determine the rotation angle by analyzing the image with respect to specific features that inform you about a potential rotation. For example, if your image contains a face, you'd do face detection (for which there is plenty of code on the File Exchange and then rotate so that the eyes are up and the mouth down. If your image contains lines that should be vertical and/or horizontal in an un-rotated image, you can apply a Hough-transform to your image and find the most likely angle of rotation using houghpeaks.
Finally, to rotate your image, you can use imrotate.
Without examples or a more detailed description, it's hard to give good advice. But generally, this can be done for some types of images.
For example, suppose the image shows buildings, poles, furniture or something that should have vertical edges. Run an edge detector, then take a Fourier transform. There should be peaks, or some visible pattern in the power spectrum, along the Y axis for an unrotated image. The power spectrum rotates the same way as the image. If you can devise an algorithm to find the spectral features that indicate vertical edges, you can measure its angle w.r.t. the origin (zero frequency). That is the angle of image rotation.
But you will have to distinguish that particular feature from all other image features that show in the power spectrum. Have fun with that - this is the kind of detail that will take most of your creativity and time.