Optical lense distance from an object - raspberry-pi

I am using a Raspberry PI camera and the problem in hand is how to find the best position for it in order to fully see an object.
The object looks like this:
Question is how to find the perfect position given that the camera is placed in the centre of the above image. Perfectly the camera will be able to catch the object only, as the idea is to get the camera as close as possible.

Take a picture with you camera, save it as a JPG, then open it in a viewer that allows you to inspect the EXIF header. If you are lucky you should see the focal length (in mm) and the sensor size. If the latter is missing, you can probably work it out from the sensor's spec sheet (see here to start). From the two quantities you can work out the angle of the field of view (HorizFOV = atan(0.5 * sensor_width / focal_length), VertFOV = atan(0.5 * sensor_height / focal_length). From these angles you can derive an approximate distance from your subject that will keep it fully in view.
Note that these are only approximations. Nonlinear lens distortion will produce a slightly larger effective FOV, especially near the corners.

Related

Restoring the image of a face's plane

I am using an API to analyze faces in Matlab, where I get for each picture a 3X3 rotation matrix of the face's orientation, telling which direction the head is pointing.
I am trying to normalize the image according to that matrix, so that it will be distorted to get the image of the face's plane. This is something like 'undoing' the projection of the face to the camera plane. For example, if the head is directed a little to the left, it will stretch the left side to (more or less) preserve the face's original proportions.
Tried using 'affine2d' and 'projective2d' with 'imwarp', but it didn't achieve that goal
Achieving your goal with simple tools like affine transformations seems impossible to me since a face is hardly a flat surface. An extreme example: Imagine the camera recording a profile view of someone's head. How are you going to reconstruct the missing half of the face?
There have been successful attempts to change the orientation of faces in images and real-time video, but the methods used are quite complex:
[We] propose a gaze correction method that needs just a
single webcam. We apply recent shape deformation techniques
to generate a 3D face model that matches the user’s face. We
then render a gaze-corrected version of this face model and
seamlessly insert it into the original image.
(Giger et al., https://graphics.ethz.ch/Downloads/Publications/Papers/2014/Gig14a/Gig14a.pdf)

Doubts in camera calibration

I am working on a project based on machine vision . Wide angle lens with high resolution pinhole camera is being used.
Working distance : Distance between Camera and object .
The resolution will be nearly 10MP. The image size may be 3656 pixel width and 2740 pixel height.
The project requirements are as mentioned below
My working distance must be nearly 5metres.
The camera needs to be tilted at an angle of 13 degree.
To avoid lens distortion in camera I do camera calibration using OpenCV.
Below mentioned are my doubts pertaining to this camera calibration
Since the working distance is 5 meters, should the camera calibration too be done with the same distance?
Since the camera is tilted by an angle 13deg in the application ,is it necessary to do the calibration too with the camera tilted at respective angle?
My answer is "maybe" to the first question, and "no" to the second.
While it is true that it is not strictly necessary to calibrate with the target at the same or nearby distance as the subject, in practice it is possible only if you have enough depth of field (in particular, if you are focused at infinity), and use a fixed iris.
The reason is the Second Rule of Camera Calibration: "Thou shalt not touch the lens during or after calibration". In particular, you may not refocus nor change the f-stop, because both focusing and iris affect the nonlinear lens distortion and (albeit less so, depending on the lens) the field of view. Of course, you are completely free to change the exposure time, as it does not affect the lens geometry at all.
See also, for general comment, this other answer of mine.
The answer is no to both questions. Camera calibration essentially finds the relationship between the focal length and the pixel plane when assuming the pinhole camera model; and optionally (as you will require due to your wide angle lens), radial distortion. These relationships are independent of the position of the camera in the world.
By the way, I see you tagged this as matlab: I can recommend the Camera Calibration Toolbox for MATLAB as a nice easy way of calibrating cameras. It guides you through process nicely.
That angle of the camera is not a problem, but you do want to calibrate your camera with the calibration target at roughly the working distance from it. In theory, the distance should not matter. In reality, though, you will have greater errors if you calibrate at 1 meter, and then try to measure things 5 meters away.
Also, please check out the CameraCalibrator App that is a part of the Computer Vision System Toolbox for MATLAB.

3D trajectory reconstruction from video (taken by a single camera)

I am currently trying to reconstruct a 3D trajectory of a falling object like a ball or a rock out of a sequence of images taken from an iPhone video.
Where should I start looking? I know I have to calibrate the camera (I think I'll use the matlab calibration toolbox by Jean-Yves Bouguet) and then find the vanishing point from the same sequence, but then I'm really stuck.
read this: http://www.cs.auckland.ac.nz/courses/compsci773s1c/lectures/773-GG/lectA-773.htm
it explains 3d reconstruction using two cameras. Now for a simple summary, look at the figure from that site:
You only know pr/pl, the image points. By tracing a line from their respective focal points Or/Ol you get two lines (Pr/Pl) that both contain the point P. Because you know the 2 cameras origin and orientation, you can construct 3d equations for these lines. Their intersection is thus the 3d point, voila, it's that simple.
But when you discard one camera (let's say the left one), you only know for sure the line Pr. What's missing is depth. Luckily you know the radius of your ball, this extra information can give you the missing depth information. see next figure (don't mind my paint skills):
Now you know the depth using the intercept theorem
I see one last issue: the shape of ball changes when projected under an angle (ie not perpendicular on your capture plane). However you do know the angle, so compensation is possible, but I leave that up to you :p
edit: #ripkars' comment (comment box was too small)
1) ok
2) aha, the correspondence problem :D Typically solved by correlation analysis or matching features (mostly matching followed by tracking in a video). (other methods exist too)
I haven't used the image/vision toolbox myself, but there should definitely be some things to help you on the way.
3) = calibration of your cameras. Normally you should only do this once, when installing the cameras (and every other time you change their relative pose)
4) yes, just put the Longuet-Higgins equation to work, ie: solve
P = C1 + mu1*R1*K1^(-1)*p1
P = C2 + mu2*R2*K2^(-1)*p2
with
P = 3D point to find
C = camera center (vector)
R = rotation matrix expressing the orientation of the first camera in the world frame.
K = calibration matrix of the camera (containing internal parameters of the camera, not to be confused with the external parameters contained by R and C)
p1 and p2 = the image points
mu = parameter expressing the position of P on the projection line from camera center C to P (if i'm correct R*K^-1*p expresses a line equation/vector pointing from C to P)
these are 6 equations containing 5 unknowns: mu1, mu2 and P
edit: #ripkars' comment (comment box too small once again)
The only computer vison library that pops up in my mind is OpenCV (http://opencv.willowgarage.com/wiki ). But that's a C library, not matlab... I guess google is your friend ;)
About the calibration: yes, if those two images contain enough information to match some features. If you change the relative pose of the cameras, you'll have to recalibrate of course.
The choice of the world frame is arbitrary; it only becomes important when you want to analyze the retrieved 3d data afterwards: for example you could align one of the world planes with the plane of motion -> simplified motion equation if you want to fit one.
This world frame is just a reference frame, changeable with a 'change of reference frame transformation' (translation and/or rotation transformation)
Unless you have a stereo camera, you will never be able to know the position for sure, even with calibrated camera. Because you don't know whether the ball is small and close or large and far away.
There are other methods with single camera, based on series of images with different focus. But I doubt that you can control the camera of your cell phone in that way.
Edit(1):
as #GuntherStruyf points out correctly, you can know the position if one of your inputs is the size of the ball.

Not able to calibrate camera view to 3D Model

I am developing an app which uses LK for tracking and POSIT for estimation. I am successful in getting rotation matrix, projection matrix and able to track perfectly but the problem for me is I am not able to translate 3D object properly. The object is not fitting in to the right place where it has to fit.
Will some one help me regarding this?
Check this links, they may provide you some ideas.
http://computer-vision-talks.com/2011/11/pose-estimation-problem/
http://www.morethantechnical.com/2010/11/10/20-lines-ar-in-opencv-wcode/
Now, you must also check whether the intrinsic camera parameters are correct. Even a small error in estimating the field of view can cause troubles when trying to reconstruct 3D space. And from your details, it seems that the problem are bad fov angles (field of view).
You can try to measure them, or feed the half or double value to your algorithm.
There are two conventions for fov: half-angle (from image center to top or left, or from bottom to top, respectively from left to right) Maybe you just mixed them up, using full-angle instead of half, or vice-versa
Maybe you can show us how you build a transformation matrix from R and T components?
Remember, that cv::solvePnP function returns inverse transformation (e.g camera in world) - it finds object pose in 3D space where camera is in (0;0;0). For almost all cases you need inverse it to get correct result: {Rt; -T}

iPhone 3D compass

I am trying to build an app for the iPhone 4 which enables the user to "point" at a hardcoded destination and a dot appears where the destination is located.
First, i use the compass to make a horizontal compass(this will cover the left/right rotation):
// Heading
nowHeading = heading.trueHeading;
// Shift image (horizontal compass)
float shift = bearing - nowHeading;
destinationImage.center = CGPointMake(shift+160, destinationImage.center.y);
I shift the dot 160 pixels because the screen is 320 pixels width. My question is now, how can I expand this code to handle up and down? Meaning that if i point the phone down in the table, the dot wont show.. I have to point (like taking a picture) at the destination in order for it to be drawn on the screen. I've already implemented the accelerator. But i don't know how to unite these components to solve my problem.
Bearing should depend on the field of vision of the camera. For iPhone 4 the horizontal angular view is 47.5 so 320 points/47.5 = xxx points per degree, use that to shift horizontally. You also have to add an adaptive filter to the accelerometers, you can get one from the AccelerometerGraph project from Apple.
You have the rotation in one axis (bearing) you should get the rotation on the other two from the accelerometers. The atan2 of two axis give you the rotation on the third. Go to UIAcceleration and imagine an axis physically piercing the device if that helps and do double xAngle = atan2(acceleration.y, acceleration.z); So once you have the rotation upside down you can repeat what you did for the horizontal with the vertical field of view, eg: 60 for the iPhone.
That is going to be one rough implementation :) but achieving smooth movement is difficult. One thing you can do is use the gyros to get a faster response and correct their signal periodically with the accelerometers. See this talk for the troubles ahead: Sensor Fusion on Android Devices. Here is a website dedicated to the Kalman Filter. If you dare with Quaternions I recommend "Visualizing Quaternions" from Andrew J. Hanson.
It sounds like you are trying to do a style of Augmented Reality. If that. Is the case there are several libraries and sample code suggested here:
Augmented Reality