I am working on a MATLAB project that lets the user detect faces and blur them out.
Built-in functions used:
vision.CascadeObjectDetector
The problem with this function: It only detects frontal faces.
The method I tried: use the imrotate function in a while loop to rotate the image while the angle is less than 360 degrees, incrementing the rotation by 23 degrees each time. I thought that would work.
Cons: it doesn't work, and it changes the spatial resolution of the image.
I have done some experiments in the past and learned that vision.CascadeObjectDetector with the default frontal-face model can tolerate about 15 degrees of in-plane rotation. So I would advise rotating the image by 15 or even 10 degrees at a time, rather than 23.
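For what it's worth, here is a minimal sketch of that rotate-and-detect loop (the file name is a placeholder, and the boxes still need to be mapped back to the original image's coordinates before blurring):

    % Minimal sketch: run the frontal-face detector on rotated copies of the image.
    detector = vision.CascadeObjectDetector();   % default frontal-face model
    img = imread('photo.jpg');                   % placeholder file name
    for angle = 0:15:345
        % 'crop' keeps the output the same size as the input (corners are cut off),
        % which avoids the change in image size you get with the default 'loose' mode.
        rotated = imrotate(img, angle, 'bilinear', 'crop');
        bboxes  = step(detector, rotated);       % one [x y width height] row per face
        if ~isempty(bboxes)
            fprintf('%d face(s) found at %d degrees\n', size(bboxes, 1), angle);
        end
    end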
The problem with training your own detector in this case is the fact that the underlying features (Haar, LBP, and HOG) are not invariant to in-plane rotation. You would have to train multiple detectors, one for each orientation, every 15 degrees or so.
Also, are you detecting faces in still images or in video? If you are looking at a video, then you may want to try tracking the faces. This way, even if you miss a face because somebody's head is tilted, you'll have a chance to detect it later. And once you detect a face, you can track it even if it tilts. Take a look at this example.
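If you do go the video route, here is a rough sketch of the detect-then-track idea using vision.PointTracker (KLT); the video file name is a placeholder, and it assumes at least one face is found in the first frame:

    % Rough detect-then-track sketch (placeholder video file).
    detector = vision.CascadeObjectDetector();
    tracker  = vision.PointTracker('MaxBidirectionalError', 2);
    reader   = VideoReader('faces.mp4');
    frame  = readFrame(reader);
    bbox   = step(detector, frame);
    points = detectMinEigenFeatures(rgb2gray(frame), 'ROI', bbox(1, :));
    initialize(tracker, points.Location, frame);
    while hasFrame(reader)
        frame = readFrame(reader);
        [points, validity] = step(tracker, frame);
        facePoints = points(validity, :);
        % Use the tracked points to update the face region (e.g. for blurring), and
        % re-run the detector every so often to pick up new faces or recover lost ones.
    end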
I am using an API to analyze faces in MATLAB. For each picture I get a 3x3 rotation matrix describing the face's orientation, i.e. which direction the head is pointing.
I am trying to normalize the image according to that matrix, warping it to approximate a fronto-parallel view of the face's plane. This is something like 'undoing' the projection of the face onto the camera plane. For example, if the head is turned a little to the left, the left side would be stretched to (more or less) preserve the face's original proportions.
I tried using affine2d and projective2d with imwarp, but that didn't achieve the goal.
Achieving your goal with simple tools like affine transformations seems impossible to me since a face is hardly a flat surface. An extreme example: Imagine the camera recording a profile view of someone's head. How are you going to reconstruct the missing half of the face?
There have been successful attempts to change the orientation of faces in images and real-time video, but the methods used are quite complex:
[We] propose a gaze correction method that needs just a single webcam. We apply recent shape deformation techniques to generate a 3D face model that matches the user's face. We then render a gaze-corrected version of this face model and seamlessly insert it into the original image.
(Giger et al., https://graphics.ethz.ch/Downloads/Publications/Papers/2014/Gig14a/Gig14a.pdf)
I am working on a machine vision project. A wide-angle lens with a high-resolution pinhole camera is being used.
Working distance: the distance between the camera and the object.
The resolution will be nearly 10 MP; the image size may be 3656 pixels wide by 2740 pixels high.
The project requirements are as mentioned below
My working distance must be nearly 5 metres.
The camera needs to be tilted at an angle of 13 degrees.
To correct for lens distortion, I do camera calibration using OpenCV.
Below are my doubts about this camera calibration:
Since the working distance is 5 metres, should the camera calibration also be done at the same distance?
Since the camera is tilted by 13 degrees in the application, is it necessary to do the calibration with the camera tilted at the same angle?
My answer is "maybe" to the first question, and "no" to the second.
While it is true that calibrating with the target at the same (or a nearby) distance as the subject is not strictly necessary, in practice calibrating at a different distance works only if you have enough depth of field (in particular, if you are focused at infinity) and use a fixed iris.
The reason is the Second Rule of Camera Calibration: "Thou shalt not touch the lens during or after calibration". In particular, you may not refocus nor change the f-stop, because both focusing and iris affect the nonlinear lens distortion and (albeit less so, depending on the lens) the field of view. Of course, you are completely free to change the exposure time, as it does not affect the lens geometry at all.
See also, for general comment, this other answer of mine.
The answer is no to both questions. Assuming the pinhole camera model, camera calibration essentially finds the intrinsic parameters (focal length, principal point) that map viewing directions onto the pixel plane, and optionally (as you will require due to your wide-angle lens) the radial distortion. These relationships are independent of the position of the camera in the world.
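To make that concrete: under the pinhole model the calibration estimates roughly the following, where the intrinsic matrix $K$ (together with the distortion coefficients) stays fixed for a given lens setting, while $R$ and $t$ describe the camera's pose and change whenever you move or tilt the camera:

$$
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \,[\,R \mid t\,] \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix},
\qquad
K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
$$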
By the way, I see you tagged this as matlab: I can recommend the Camera Calibration Toolbox for MATLAB as a nice, easy way of calibrating cameras. It guides you through the process nicely.
That angle of the camera is not a problem, but you do want to calibrate your camera with the calibration target at roughly the working distance from it. In theory, the distance should not matter. In reality, though, you will have greater errors if you calibrate at 1 meter, and then try to measure things 5 meters away.
Also, please check out the CameraCalibrator App that is a part of the Computer Vision System Toolbox for MATLAB.
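As a starting point, here is a minimal checkerboard calibration sketch using the underlying Computer Vision System Toolbox functions (the image folder and square size are placeholders):

    % Minimal checkerboard-calibration sketch (placeholder folder and square size).
    images = imageDatastore('calibrationImages');
    [imagePoints, boardSize] = detectCheckerboardPoints(images.Files);
    squareSize  = 30;   % side of one checkerboard square, in millimetres (placeholder)
    worldPoints = generateCheckerboardPoints(boardSize, squareSize);
    params = estimateCameraParameters(imagePoints, worldPoints);
    % Undistort a production image taken at the 5 m working distance and 13-degree
    % tilt; the same intrinsics apply as long as focus and iris are left untouched.
    undistorted = undistortImage(imread(images.Files{1}), params);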
I'd like to fill the background of my app with animated clouds. I did some research and stumbled upon the Perlin noise algorithm, which seems like a good fit. However, even in my first test it was extremely expensive to generate a 512x512 (2D) cloud map. I tried simplex noise, but that didn't fix it.
According to http://freespace.virgin.net/hugo.elias/models/m_clouds.htm, generating clouds is done by adding several Perlin/simplex noise maps together. That is impossible to do on an iPhone in my app: I need fluid graphics (my optimistic expectation is 60 FPS on an A4).
So my question: Is there a lighter algorithm to generate animated clouds that does not make my frame rate drop too much?
Thanks in advance!
Paul
Unless all you're doing is generating clouds, you'll definitely want them precomputed. Perlin noise can make for nice 2D animations by traversing a set of 3D data, but you could just scroll a 2D image of some noise, or of a fractal like the one generated by the diamond-square algorithm. Either way, you should probably precompute it.
If you want some more variation, I would experiment with putting a noise filter over the precomputed clouds.
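Since the texture only needs to be generated once, you can build it offline with whatever tool you like. Here is a rough sketch (in MATLAB, simply to show the octave-summing idea from the linked page; the size and octave count are arbitrary):

    % Rough offline sketch: sum a few octaves of smoothed random noise into a
    % cloud-like 512x512 texture, save it, and just scroll it at runtime.
    N = 512;
    clouds = zeros(N);
    for octave = 0:4
        amp    = 0.5 ^ octave;                       % persistence of 0.5
        coarse = rand(4 * 2 ^ octave);               % coarse random grid for this octave
        layer  = imresize(coarse, [N N], 'bicubic'); % smooth it up to full resolution
        clouds = clouds + amp * layer;
    end
    clouds = (clouds - min(clouds(:))) / (max(clouds(:)) - min(clouds(:)));  % scale to [0, 1]
    imwrite(clouds, 'clouds.png');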
Pre-generate the clouds and create 2D sprites using Core Animation or otherwise. You can then animate these around cheaply. You may not get 60 fps, but you should get close, depending on how complex the movement is and what other animations are going on at the time. Either way, it's going to be faster than generating the clouds yourself.
Games like FroggyJump for iPhone figure out the rotation of the iPhone. I'm getting confused by the acceleration values. How do I calculate the level of rotation? I suppose I need to consider the case when the iPhone isn't perfectly upright.
Thank you.
I also want to use the new Core Motion framework with "Device Motion" on the iPhone 4 for extra precision. I guess I'll have to use that low-pass filter for the other devices.
It's the yaw.
Having given Froggy Jump a quick go, I think it's likely directly using the accelerometer's x value as the left/right acceleration on the frog. If it is stationary, you can think of an accelerometer as giving you the vector that points upward into space, relative to the local axes. For something like a ball rolling or anything else accelerating due to tilt, you want to use the values directly.
For anything that involves actually knowing angles, you're probably best picking the axis around which you want to detect rotation then using the C function atan2f on the accelerometer values for the other two axes. With just an accelerometer, there are some scenarios in which you can't detect rotation — for example, if the device is flat on a table then an accelerometer can't detect yaw. The general rule is that rotations around the gravity vector can't be detected with an accelerometer alone.
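For example, if the device is held roughly in portrait and you only care about the tilt around the screen normal (the z axis), the tilt angle can be estimated from the other two accelerometer axes as roughly

$$ \theta \approx \operatorname{atan2}(a_x,\, -a_y), $$

which is 0 when the device is upright ($a_x = 0$, $a_y = -1$) and grows as it rolls to one side; the exact signs depend on the axis convention, so treat this as a sketch.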
I have a computer vision setup with two cameras. One of these cameras is a time-of-flight camera; it gives me the depth of the scene at every pixel. The other is a standard camera giving me a colour image of the scene.
We would like to use the depth information to remove some areas from the colour image. We plan on doing object, person and hand tracking in the colour image, and want to remove far-away background pixels with the help of the time-of-flight camera. It is not yet certain whether the cameras can be aligned in a parallel setup.
We could use OpenCV or MATLAB for the calculations.
I have read a lot about rectification, epipolar geometry, etc., but I still have trouble seeing the steps I need to take to calculate the correspondence for every pixel.
What approach would you use, and which functions? Into which steps would you divide the problem? Is there a tutorial or sample code available somewhere?
Update: We plan on doing an automatic calibration using known markers placed in the scene.
If you want robust correspondences, you should consider SIFT. There are several implementations in MATLAB; I use the Vedaldi-Fulkerson VLFeat library.
If you really need fast performance (and I think you don't), you should think about using OpenCV's SURF detector.
If you have any other questions, do ask. This other answer of mine might be useful.
PS: By correspondences, I'm assuming you want to find the coordinates of a projection of the same 3D point on both your images - i.e. the coordinates (i,j) of a pixel u_A in Image A and u_B in Image B which is a projection of the same point in 3D.
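As a concrete starting point, here is a rough sketch of getting sparse SIFT correspondences with VLFeat (it assumes VLFeat is installed and vl_setup has been run; the file names are placeholders, and for the ToF camera you would match against its intensity/amplitude image rather than the raw depth):

    % Rough sketch: sparse SIFT correspondences between the two images with VLFeat.
    Ia = single(rgb2gray(imread('colour.png')));    % colour camera image (placeholder)
    Ib = single(imread('tof_amplitude.png'));       % ToF intensity image (placeholder)
    [fa, da] = vl_sift(Ia);                         % keypoint frames and descriptors, A
    [fb, db] = vl_sift(Ib);                         % keypoint frames and descriptors, B
    matches = vl_ubcmatch(da, db);                  % 2-by-M matrix of index pairs
    uA = fa(1:2, matches(1, :));                    % matched (x, y) coordinates in A
    uB = fb(1:2, matches(2, :));                    % corresponding coordinates in B
    % These sparse matches can then feed a fundamental-matrix estimate or, together
    % with your known markers, a calibration relating the two cameras.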