Recognizing multiple occurrences of an object with the SURF descriptor - image-recognition

My goal is to recognize and mark every occurrence of a test object in an image.
I've been working on object recognition using SURF descriptors. Currently, I'm able to extract descriptors from the object image and the scene image and then match them with a Euclidean-distance based nearest-neighbour approach. Then a homography matrix is computed (with RANSAC) to localise and mark the object in the scene image. But I can't figure out how to mark the object if it occurs more than once in the scene.
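For context, the pipeline described above corresponds roughly to the following Python/OpenCV sketch; the file names are placeholders, SURF is assumed to be available from the opencv-contrib-python package, and the ratio-test and RANSAC thresholds are illustrative values rather than anything from the original setup.

import cv2
import numpy as np

# Placeholder image paths; SURF needs the opencv-contrib-python package.
obj = cv2.imread("object.png", cv2.IMREAD_GRAYSCALE)
scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
kp_obj, des_obj = surf.detectAndCompute(obj, None)
kp_scene, des_scene = surf.detectAndCompute(scene, None)

# Euclidean-distance nearest-neighbour matching with a ratio test.
matcher = cv2.BFMatcher(cv2.NORM_L2)
knn = matcher.knnMatch(des_obj, des_scene, k=2)
good = [pair[0] for pair in knn
        if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance]

# Homography with RANSAC, then project the object corners into the scene
# (assumes at least 4 good matches; add checks in real code).
src = np.float32([kp_obj[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp_scene[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

h, w = obj.shape[:2]
corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
outline = cv2.perspectiveTransform(corners, H)  # marks a single instance only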

Related

ARKit – How to detect the colour of a specific feature point in the sceneView?

I would like to get the colour of the detected world object at a specific feature point in the sceneView. For example, I have a feature point detected at (x:10, y:10, z:10).
How do I get the colour of the object/surface at this position?
At the moment it's not possible to get the colour of a real-world object under a feature point using ARKit methods (the way you may have seen in many compositing apps). There's no ARKit method that lets you multiply the alpha of a feature point by the RGB value of the corresponding pixel in the video stream.
.showFeaturePoints is an extended debug option (ARSCNDebugOptions) for an ARSCNView. This option just lets you display the detected 3D feature points in the world.
@available(iOS 11.0, *)
public static let showFeaturePoints: SCNDebugOptions
But I'm sure you can try to apply a CIFilter to the ARKit camera feed containing the feature points.
Feature points in your scene are rendered yellow, so you can use a chroma key effect to extract an alpha channel. Then you need to multiply this alpha by the RGB of the camera frame, and you'll get colour-coded feature points.
Alternatively, you can use a CIDifferenceBlendMode operation from Core Image's compositing operations. You need two sources: one with the feature points and one without them. Then you have to modify the result of the difference operation and assign it to the alpha channel before the multiplication.
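A real implementation would use Core Image in Swift, but the compositing math described above is simple enough to sketch. Here is a rough numpy illustration of the chroma-key idea; the yellow thresholds and the function names are assumptions, not ARKit or Core Image API.

import numpy as np

def yellow_alpha(render_with_points):
    # Crude chroma key: alpha is 1 where a pixel looks yellow (high R and G, low B).
    # render_with_points is an H x W x 3 float array in [0, 1]; thresholds are guesses.
    r, g, b = (render_with_points[..., 0],
               render_with_points[..., 1],
               render_with_points[..., 2])
    return ((r > 0.6) & (g > 0.6) & (b < 0.4)).astype(np.float32)

def colour_under_feature_points(render_with_points, camera_frame):
    # Multiply the chroma-keyed alpha by the camera RGB, as described above.
    alpha = yellow_alpha(render_with_points)[..., None]  # H x W x 1
    return alpha * camera_frame                          # colour only where points are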

How to measure distance and centroid of a moving object with MATLAB stereo computer vision?

Which MATLAB functions or examples should be used to (1) track the distance from a moving object to the stereo (binocular) cameras, and (2) track the centroid (X,Y,Z) of moving objects, ideally in the range of 0.6 m to 6 m from the cameras?
I've used the MATLAB example that uses the PeopleDetector function, but it becomes inaccurate when a person is within 2 m, because it begins clipping heads and legs.
The first thing you need to deal with is how to detect the object of interest (I assume you have already resolved this). There are many approaches for detecting moving objects. If your cameras are in a fixed position, you can work with only one camera and use background subtraction to get the objects that appear in the scene (some info here). If your cameras are moving, I think the best approach is to work with the optical flow of the two cameras (instead of using a previous frame to get the flow map, the stereo pair images are used to get the optical-flow map in each frame).
In MATLAB there is a disparity computation option; this can help you detect the objects in the scene. After that you need to add a stage to extract the objects of interest, for which you can use some thresholds. Once you have the desired objects, put them into a binary mask. On this mask you can use an image-moments extractor (check this and this) to calculate the centroids. If the binary mask looks noisy, you can use some morphological operations to improve the result (watch this).
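The answer above refers to MATLAB functions; purely as an illustration of the same disparity-threshold-moments pipeline, here is a Python/OpenCV sketch. The stereo parameters, the disparity threshold, and the image file names are assumptions.

import cv2
import numpy as np

# Placeholder paths for a rectified grayscale stereo pair.
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Disparity map (StereoSGBM returns fixed-point values scaled by 16).
stereo = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=9)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0

# Crude segmentation: keep pixels closer than some disparity threshold (assumption).
mask = (disparity > 20).astype(np.uint8) * 255

# Morphological opening to clean up noise in the binary mask.
kernel = np.ones((5, 5), np.uint8)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)

# Image moments of the mask give the centroid of the segmented object.
m = cv2.moments(mask, binaryImage=True)
if m["m00"] > 0:
    cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]
    d = disparity[int(cy), int(cx)]
    # Depth follows from Z = f * B / d, given focal length f and baseline B.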

How to match an object within an image to other images using SURF features (in MATLAB)?

My problem is how to match one image to a set of images and to display the matched images. I am using SURF features for feature extraction.
If you have the Computer Vision System Toolbox, take a look at the following examples:
Object Detection In A Cluttered Scene Using Point Feature Matching
Image Search using Point Features
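Those toolbox examples cover this in detail; as a language-neutral illustration of the same idea, here is a small Python/OpenCV sketch that ranks a set of images by how many good SURF matches each shares with the query. The function name, ratio threshold, and the assumption that SURF is available via opencv-contrib-python are all illustrative.

import cv2

def rank_gallery_by_surf_matches(query, gallery, ratio=0.75):
    # query: grayscale numpy array; gallery: list of grayscale numpy arrays.
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    _, query_des = surf.detectAndCompute(query, None)

    scores = []
    for idx, img in enumerate(gallery):
        _, des = surf.detectAndCompute(img, None)
        if des is None or query_des is None:
            scores.append((idx, 0))
            continue
        knn = matcher.knnMatch(query_des, des, k=2)
        good = [pair[0] for pair in knn
                if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance]
        scores.append((idx, len(good)))

    # Gallery images with the most ratio-test survivors are the best matches.
    return sorted(scores, key=lambda s: s[1], reverse=True)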

Matlab: Transparent Object Detection

I'm trying to detect a transparent object (glass bottle) in an image.
The image is taken from a Kinect, so both RGB and depth images are available.
I read in the literature that the boundary of a transparent object has 'unknown depth values', and that I can use this as a boundary condition for detecting the object.
The problem is that I cannot find that information in my depth file, i.e. the depth image only returns either zero or other values, but never 'unknown'.
I assume the Kinect represents 'unknown depth values' as zeros, but this raises another problem:
there are a lot of zeros in the image (i.e. boundaries etc.), so how do I know which zeros belong to the object?
Thanks a lot!
You could try to detect the body of the transparent object rather than the border. The body should return values of whatever is behind it, but those values will be noisier. Take a time-running sample and calculate a running standard deviation. Look for the region of the image that has larger errors than elsewhere. This is simpler if you have access to the raw data (libfreenect). If the data is converted to distance, then the error is a function of distance, so you need to detect regions that are noisier than other regions at that distance, not just regions that are noisier than elsewhere.
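To make the running-deviation idea above concrete, here is a small numpy sketch; the function name and the sigma-based threshold are assumptions to tune, not part of any Kinect API.

import numpy as np

def transparent_object_candidates(depth_stack, num_sigmas=2.0):
    # depth_stack: T x H x W array of depth frames captured over time,
    # with 0 meaning "no reading" (how the Kinect reports unknown depth).
    depth = depth_stack.astype(np.float32)
    depth[depth == 0] = np.nan          # ignore missing readings in the statistics

    std_map = np.nanstd(depth, axis=0)  # per-pixel temporal standard deviation
    valid = np.isfinite(std_map)

    # Pixels that are much noisier than the rest of the image are candidates
    # for the body of the transparent object (threshold is an assumption to tune).
    threshold = np.nanmean(std_map[valid]) + num_sigmas * np.nanstd(std_map[valid])
    return valid & (std_map > threshold)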
I'd recommend you take a look at the following publication:
Object localisation via action recognition.
J. Darby, B. Li, R. Cunningham and N. Costen.
ICPR, 2012.
They were able to detect objects (such as water bottles and glasses), all done in MATLAB.

How to select objects in OpenGL on iPhone without using glUnProject or GL_SELECT?

I have 3 OpenGL objects which are shown at the same time. If the user touches any one of them, then only that OpenGL object should be displayed on screen.
Just use gluUnProject to convert your touch point to a point on your near clipping plane and a point on your far clipping plane. Use the ray between those two points in a ray-triangle intersection algorithm. Figure out which triangle was closest, and whatever object that triangle is part of is your object.
Another approach is to give each object a unique ID color. Then, whenever the user touches the screen, render using your unique ID colors with no lighting, but don't present the render buffer. Now you can just check the color of the pixel where the user touched and compare it against your list of object color IDs. Quick and easy, and it supports up to 16,581,375 unique objects.
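Since gluUnProject gives you a near-plane point and a far-plane point, the ray test itself is standard geometry. Here is a minimal Python sketch of the Möller-Trumbore ray-triangle intersection used for that step (illustrative, not iPhone-specific code; inputs are 3-element numpy arrays).

import numpy as np

def ray_triangle_intersect(origin, direction, v0, v1, v2, eps=1e-8):
    # Möller-Trumbore: returns distance t along the ray, or None if there is no hit.
    e1, e2 = v1 - v0, v2 - v0
    p = np.cross(direction, e2)
    det = np.dot(e1, p)
    if abs(det) < eps:                 # ray is parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    s = origin - v0
    u = np.dot(s, p) * inv_det
    if u < 0.0 or u > 1.0:
        return None
    q = np.cross(s, e1)
    v = np.dot(direction, q) * inv_det
    if v < 0.0 or u + v > 1.0:
        return None
    t = np.dot(e2, q) * inv_det
    return t if t > eps else None

# The object owning the closest hit triangle along the ray between the
# unprojected near-plane and far-plane points is the touched object.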
You would have to iterate over every object in your scene and check for a possible collision between each one and the ray you computed with gluUnProject.
Depending on whether you want to select a face or an object, you could test the ray for collision against bounding volumes (e.g. bounding boxes) of your objects for efficiency.
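As a rough illustration of that bounding-volume shortcut, here is a slab-test sketch for a ray against an axis-aligned bounding box (again illustrative Python rather than iPhone code; the function name is made up).

import numpy as np

def ray_hits_aabb(origin, direction, box_min, box_max, eps=1e-12):
    # Slab test: True if the ray origin + t*direction (t >= 0) intersects the box.
    t_near, t_far = 0.0, np.inf
    for axis in range(3):
        if abs(direction[axis]) < eps:
            # Ray parallel to this slab: it must already lie between the planes.
            if origin[axis] < box_min[axis] or origin[axis] > box_max[axis]:
                return False
        else:
            t1 = (box_min[axis] - origin[axis]) / direction[axis]
            t2 = (box_max[axis] - origin[axis]) / direction[axis]
            t_near = max(t_near, min(t1, t2))
            t_far = min(t_far, max(t1, t2))
            if t_near > t_far:
                return False
    return True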