So I am using MATLAB and I've managed to modify one of their examples so that I can now plot the flow lines as people walk below (the camera is above a door).
I use Lucas-Kanade optical flow and the Computer Vision Toolbox.
The lines are defined like so, and I also define the tracked points. The tracked points include cases where the original points haven't moved; in those cases real(tmp(:)) will be zero and those points will be the same as the originally identified feature points.
vel_Lines = [Y(:) X(:) Y(:)+real(tmp(:)) X(:)+imag(tmp(:))];
allTrackedPoints = [Y(:)+real(tmp(:)) X(:)+imag(tmp(:))];
My question is: how can I get JUST the points which have successfully been tracked over a certain distance? I want to somehow retain only the values for which the change is large enough.
I'm not great with MATLAB's syntax, so I was hoping this would be easy for someone.
I want to get the points that were successfully tracked as part of the motion, then cluster these points to determine how many people there are, and then track these sets of points using a multiple-object tracker.
If your camera is not moving, then background subtraction may work better for you than optical flow. See this example.
You can also use the vision.PeopleDetector object to detect people. See this example.
If you insist on using optical flow, try the Farneback optical flow algorithm, available as of the R2015b release.
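If you do stick with your current Lucas-Kanade setup, the filtering you ask about is just logical indexing on the displacement magnitude. A minimal sketch using the tmp, X, Y and vel_Lines variables from your question (the 2-pixel threshold is an arbitrary value you would tune):

tmpv = tmp(:);  Xv = X(:);  Yv = Y(:);
minDist = 2;                                    % assumed distance threshold in pixels
moved = abs(tmpv) > minDist;                    % logical index: displacement magnitude large enough
movedPoints = [Yv(moved)+real(tmpv(moved)) Xv(moved)+imag(tmpv(moved))];
movedLines  = vel_Lines(moved, :);              % keep only the flow lines of those points

movedPoints can then be fed to a clustering step (e.g. kmeans or clusterdata) before handing the clusters to a multi-object tracker.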
I've been trying to do quadrangle detection and localization for weeks. My goal is to have a robust way of getting the 4 corner points of a quadrangle (rectangle), so I can apply a projective transform to an image and then attach it to the source image. I have tried the classic OpenCV contour method, and also using the Hough transform to find lines and then calculating intersections, but those two methods are unusable when applied to real-life images.
So I turned to CNNs for help, but so far I haven't found anyone trying to use a CNN to solve this seemingly simple problem.
My first attempt was to use state-of-the-art object detection and localization methods to get the quadrangle's bounding box so I could narrow the search for the 4 points, and then use image processing and computer vision methods to refine that search. But after trying YOLOv2 and Faster R-CNN, the prediction accuracy was not ideal.
So I'm wondering whether there is any way to do this end to end, with training and the forward pass all in a single neural network. It also must be able to deal with occlusion reasonably well.
Currently my idea is to remove the FC layers and produce a large activation map with the same width and height as the input layer (e.g. 448x448), then optimize for the 4 most highly activated areas and use argmax to get their positions. But this method only works for a single quadrangle, and it doesn't handle corner occlusions well either.
I'd appreciate it if anyone could provide suggestions. Thanks a lot!
You are absolutely right about the first methods you mentioned. Hough-transform-like methods are old and not very robust on images in the wild. And of course, the computer vision field has turned toward object detection and recognition with the rise of deep learning.
However, a very nice discussion came up recently:
Have we forgotten about Geometry in Computer Vision?
My suggestion would be contour detection followed by Hough-transform-based methods (use state-of-the-art variants) to detect the rectangles you want. As for occlusion, you can set the Hough transform's parameters to be more forgiving about missing edge pixels (a rough sketch follows below).
You can, for example, check the most recent contour detection methods, as in this recent CVPR paper.
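For the edge-plus-Hough route, here is a rough sketch in MATLAB (the same idea maps to cv::Canny and HoughLinesP in OpenCV); the function names are the standard Image Processing Toolbox ones, and every parameter value is a guess to be tuned on your data:

I  = rgb2gray(imread('scene.jpg'));             % assumed input image
BW = edge(I, 'canny');                          % edge map
[H, theta, rho] = hough(BW);
peaks = houghpeaks(H, 8, 'Threshold', 0.3*max(H(:)));
lines = houghlines(BW, theta, rho, peaks, 'FillGap', 20, 'MinLength', 40);
% 'FillGap' is the knob that makes the detector tolerant of missing or occluded
% edge pixels. From here, intersect pairs of roughly perpendicular lines to get
% corner candidates and keep the 4 that form the most plausible quadrilateral.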
I am working on a drone-based video surveillance project in which I am required to implement object tracking. I have tried conventional approaches, but these seem to fail due to the non-static environment.
This is an example of what I would want to achieve, but it uses background subtraction, which is impossible with a non-static camera.
I have also tried feature-based tracking using SURF features, but it fails for smaller objects and is prone to false positives.
What would be the best way to achieve the objective in this scenario?
Edit: An object can be anything within a defined region of interest, usually a person or a vehicle. The idea is that the user draws a bounding box which defines the region of interest, and the drone then has to start tracking whatever is within it.
Tracking local features (like SURF) won't work in your case. Training a classifier (like Boosting with HAAR features) won't work either. Let me explain why.
Your object to track will be contained in a bounding box. Inside this bounding box there could be any object, not necessarily a person, a car, or whatever else you used to train your classifier.
Also, near the object inside the bounding box there will be background clutter that changes as soon as your target object moves, even if the appearance of the object itself doesn't change.
Moreover, the appearance of your object changes (e.g. a person turns or drops their jacket, a vehicle catches a reflection of the sun, etc.), or the object gets (partially or totally) occluded for a while. So tracking local features is very likely to lose the tracked object very soon.
So the first problem is that you must deal with potentially many different objects to track, possibly unknown a priori, and you cannot train a classifier for each one of them.
The second problem is that you must follow an object whose appearance may change, so you need to update your model.
The third problem is that you need some logic that tells you that you lost the tracked object, and you need to detect it again in the scene.
So what to do? Well, you need a good long term tracker.
One of the best (to my knowledge) is Tracking-Learning-Detection (TLD) by Kalal et al. On the dedicated page you can see a lot of example videos showing that it works pretty well with moving cameras, objects that change appearance, etc.
Luckily for us, OpenCV 3.0.0 has an implementation of TLD, and you can find sample code here (there is also a MATLAB + C implementation on the aforementioned site).
The main drawback is that this method can be slow. You can test whether that is an issue for you. If so, you can downsample the video stream, upgrade your hardware, or switch to a faster tracking method, but this depends on your requirements and needs.
Good luck!
The simplest thing to try is frame differencing instead of background subtraction. Subtract the previous frame from the current frame, threshold the difference image to make it binary, and then use some morphology to clean up the noise. With this approach you typically only get the edges of the objects, but often that is enough for tracking.
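A minimal sketch of that pipeline in MATLAB (the threshold, structuring-element sizes and minimum blob area are assumptions to tune on your footage):

prevGray = rgb2gray(prevFrame);                 % previous and current frames assumed given
currGray = rgb2gray(currFrame);
diffImg  = imabsdiff(currGray, prevGray);       % frame difference
bw = diffImg > 25;                              % threshold to a binary mask (assumed value)
bw = imopen(bw, strel('disk', 2));              % morphology: remove speckle noise
bw = imclose(bw, strel('disk', 5));             % bridge broken object edges
bw = bwareaopen(bw, 50);                        % drop tiny blobs
stats = regionprops(bw, 'BoundingBox', 'Centroid');   % candidate regions to track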
You can also try to augment this approach using vision.PointTracker, which implements the KLT (Kanade-Lucas-Tomasi) point tracking algorithm (see the sketch after this answer).
Alternatively, you can try using dense optical flow. See opticalFlowLK, opticalFlowHS, and opticalFlowLKDoG.
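Picking up the vision.PointTracker suggestion, a hedged sketch of its basic use (the Shi-Tomasi detector, the bidirectional-error value and the roi variable, i.e. the user-drawn bounding box, are assumptions):

gray    = rgb2gray(firstFrame);
points  = detectMinEigenFeatures(gray, 'ROI', roi);   % corners inside the user's bounding box
tracker = vision.PointTracker('MaxBidirectionalError', 2);
initialize(tracker, points.Location, firstFrame);
[trackedPoints, validIdx] = step(tracker, nextFrame); % call once per new frame
goodPoints = trackedPoints(validIdx, :);              % only the successfully tracked points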
Recently, I have had to do a multi-view 3D scanning project within two weeks, and I have searched through books, journals and websites on 3D reconstruction, including the MathWorks examples. I have written code to track matched points between two images and reconstruct them into a 3D plot. However, despite using the detectSURFFeatures() and extractFeatures() functions, some of the object points are still not tracked. How can I also reconstruct those points in my 3D model?
What you are looking for is called "dense reconstruction". The best way to do this is with calibrated cameras. Then you can rectify the images, compute disparity for every pixel (in theory), and then get 3D world coordinates for every pixel. Please check out this Stereo Calibration and Scene Reconstruction example.
The tracking approach you are using is fine but will only get sparse correspondences. The idea is that you would use the best of these to try to determine the difference in camera orientation between the two images. You can then use the camera orientation to get better matches and ultimately to produce a dense match which you can use to produce a depth image.
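As a rough sketch of that dense pipeline with the Computer Vision Toolbox (it assumes you already have a stereoParams calibration object, e.g. from the stereoCameraCalibrator app, and the exact disparity function depends on your release):

[J1, J2] = rectifyStereoImages(I1, I2, stereoParams);       % undistort and row-align the pair
disparityMap = disparity(rgb2gray(J1), rgb2gray(J2));       % per-pixel disparity (disparitySGM in newer releases)
xyzPoints = reconstructScene(disparityMap, stereoParams);   % dense 3-D points, one per pixel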
Tracking every point in an image from frame to frame is hard (it's called scene flow), and you won't achieve it by identifying individual features (such as SURF, ORB, FREAK, SIFT, etc.) because these features are by definition 'special', in that they can be clearly identified between images.
If you have access to MATLAB's Computer Vision Toolbox, you could use its matching functions.
You can start, for example, by checking out this article about disparity and the related MATLAB functions.
In addition, you can read about different matching techniques such as block matching, semi-global block matching and global optimization procedures, just to name a few keywords. But be aware that the topic of stereo matching is a huge one.
I am attempting to do some face recognition and hallucination experiments and in order to get the best results, I first need to ensure all the facial images are aligned. I am using several thousand images for experimenting.
I have been scouring the Internet for the past few days and have found many different programs which claim to do this; however, due to MATLAB's poor backwards compatibility, many of them no longer work. I have tried several different programs which fail to run because they call MATLAB functions that have since been removed.
The closest I found uses the SIFT algorithm, with code found here:
http://people.csail.mit.edu/celiu/ECCV2008/
This does help align the images, but unfortunately it also downsamples them, so the result ends up looking quite blurry, which would have a negative effect on any experiments I ran.
Does anyone have any MATLAB code samples, or could you point me in the right direction to code that actually aligns faces in a database?
Any help would be much appreciated.
You can find this recent work on Face Detection, Pose Estimation and Landmark Localization in the Wild. It has a working MATLAB implementation and it is quite a good method.
Once you identify keypoints on all your faces you can morph them into a single reference and work from there.
The easiest way is with PCA and the eigenvectors: find the most representative directions of the X and Y data, which gives you the orientation of the face.
You can find an explanation in this document: PCA Alignment
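A very rough sketch of that idea, assuming you already have a binary mask BW of the face region and the corresponding image I (the rotation convention is also an assumption):

[r, c] = find(BW);                        % pixel coordinates of the face region
coords = [c r] - mean([c r], 1);          % centre the coordinates
[V, ~] = eig(cov(coords));                % eigenvectors of the 2x2 covariance matrix
ang = atan2d(V(2, end), V(1, end));       % angle of the dominant (largest-eigenvalue) axis
aligned = imrotate(I, ang - 90);          % rotate so the face's main axis is vertical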
Do you need to detect the faces first, or are they already cropped? If you need to detect the faces, you can use the vision.CascadeObjectDetector object in the Computer Vision System Toolbox.
To align the faces you can try the imregister function in the Image Processing Toolbox. Alternatively, you can use a feature-based approach. The Computer Vision System Toolbox includes a number of interest point detectors, feature descriptors, and a matchFeatures function to match the descriptors between a pair of images. You can then use the estimateGeometricTransform function to estimate an affine or even a projective transformation between two images. See this example for details.
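A hedged sketch of that feature-based route (the SURF detector, the similarity transform and the file names are assumptions; swap in whatever suits your data):

fixed  = rgb2gray(imread('reference_face.jpg'));   % assumed reference face
moving = rgb2gray(imread('face_0001.jpg'));        % face to be aligned
ptsF = detectSURFFeatures(fixed);
ptsM = detectSURFFeatures(moving);
[featF, validF] = extractFeatures(fixed,  ptsF);
[featM, validM] = extractFeatures(moving, ptsM);
idx = matchFeatures(featM, featF);                 % rows pair up (moving, fixed) points
tform = estimateGeometricTransform(validM(idx(:,1)), validF(idx(:,2)), 'similarity');
aligned = imwarp(moving, tform, 'OutputView', imref2d(size(fixed)));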
I am trying to specify a line across a blood vessel in an image, and then have MATLAB find the edges of the vessel (which lie along that line). The next part will be comparing changes in the distance between these edges over time (so across 1000+ more images).
I have tried the following code to get started:
I = imread('Obj1.tif');
imshow(I,[]);
improfile
I was looking at available methods for detecting the edges from the intensity along the line that gets plotted (tangents, maxima/minima, etc.), but I am not convinced this is the best method. I looked into other tools in MATLAB such as the Canny and Sobel edge detectors, but the examples for all of these only show how to detect edges across the entire image. My coding skills are not sufficient to restrict those algorithms to a single line of the user's choosing. The methods I have looked at on PubMed also seem more complicated than I perhaps need.
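In rough terms, what I am imagining is something like the following (not code I have tested, just the idea; the smoothing window and the two-strongest-gradients rule are placeholders):

I = imread('Obj1.tif');
imshow(I, []);
[cx, cy, c] = improfile;                          % draw the line; c = intensities along it
c = smoothdata(double(c), 'movmean', 5);          % suppress pixel noise
g = gradient(c);                                  % intensity gradient along the line
[~, i1] = max(g);                                 % strongest rising edge
[~, i2] = min(g);                                 % strongest falling edge
edgeIdx = sort([i1 i2]);
widthPx = hypot(cx(edgeIdx(2)) - cx(edgeIdx(1)), cy(edgeIdx(2)) - cy(edgeIdx(1)));   % edge separation in pixels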
Does anybody have any ideas or suggestions from the point that I am currently at?
Thank you