Difference between binary labeling and multi-labeling in energy minimization via graph cut (stereo 3D)

I am working on graph cuts to create a disparity map from a stereo system, but I still have trouble understanding binary versus multi-labeling. I guess they are both reduced to a binary problem, but I can't see the difference.
Thanks

Related

Can the baseline between two cameras be determined from an uncalibrated rectified image pair?

Currently, I am working on a short project about stereo vision.
I'm trying to create depth maps of a scene. For this, I use my phone from two viewpoints and follow the code/workflow provided by MATLAB: https://nl.mathworks.com/help/vision/ug/uncalibrated-stereo-image-rectification.html
Following this code I am able to create nice disparity maps, but I now want to know the depths (in meters). For this, I need the baseline, focal length and disparity, as shown here: https://www.researchgate.net/figure/Relationship-between-the-baseline-b-disparity-d-focal-length-f-and-depth-z_fig1_2313285
The focal length and base-line are known, but not the baseline. I have determined an estimate of the fundamental matrix. Is there a way to get from the fundamental matrix to the baseline, or, by making some assumptions, to the essential matrix and from there to the baseline?
I would be thankful for any hint in the right direction!
"The focal length and base-line are known, but not the baseline."
I guess you mean the disparity map is known.
Without a known or estimated calibration matrix, you cannot determine the essential matrix.
(See Multiple View Geometry in Computer Vision by Hartley and Zisserman for details.)
With respect to your available data, you cannot compute a metric reconstruction. From the fundamental matrix you can only extract camera matrices in a canonical form; these allow for a projective reconstruction and will not reflect the true baseline of the setup. A projective reconstruction is one that differs from the metric result by an unknown projective transformation.
Non-trivial techniques can upgrade such a reconstruction to a Euclidean one. However, the success of these self-calibration techniques depends strongly on the quality of the data. Thus, using images from a calibrated camera is really the best way to go.
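To illustrate, here is a minimal MATLAB sketch of extracting the canonical camera pair from a fundamental matrix (Hartley & Zisserman, result 9.14); it only gives a projective reconstruction, and F is assumed to be a valid rank-2 3x3 fundamental matrix:
[U, ~, ~] = svd(F);                % e' (epipole in image 2) spans the left null space of F
e2 = U(:, 3);
e2x = [     0   -e2(3)   e2(2);    % skew-symmetric matrix [e']_x
         e2(3)      0   -e2(1);
        -e2(2)   e2(1)      0 ];
P1 = [eye(3), zeros(3, 1)];        % first camera: [I | 0]
P2 = [e2x * F, e2];                % second camera: [[e']_x F | e']
Triangulating matches with P1 and P2 yields a scene that is correct only up to that unknown projective transformation, which is why no true baseline can be read off.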

feature extraction for machine learning

Looking for some advice. I am playing around with an accelerometer, combined with the machine learning app in MATLAB. Clearly there are many ways to extract features from the received data, both in the time and frequency domains. However, I have recently come across time-frequency analysis, specifically using wavelets.
Does anyone have advice on using wavelet analysis for classifying accelerometer (or similar) data, and on the benefits of using it? Would this be a valid way of extracting features? I'm not too sure what sort of data I should be extracting with this method.
Thanks in advance.
A few points to note:
1) You can transform a block of samples (the length should be a power of two; the choice also depends on your sampling frequency) into the wavelet domain and classify that data. (E.g., if you transform 64 accelerometer samples, you also get 64 points in the wavelet domain; see the sketch after this list.)
2) Apart from the time-frequency information, the wavelet transform has a sparsity property
(https://en.wikipedia.org/wiki/Sparse_approximation) that can be useful for your classification model.
3) Also, you can try different wavelet basis functions (mother wavelets)
and figure out which basis is most suitable for your data. You might start with the Haar basis, as it is well suited to capturing the singular behaviour of your data.
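A minimal sketch of point 1), assuming the Wavelet Toolbox is available; the signal 'acc' and the block length of 64 samples are placeholders:
acc = randn(64, 1);                % stand-in for one block of 64 accelerometer samples
[c, l] = wavedec(acc, 3, 'haar');  % 3-level discrete Haar decomposition
features = c;                      % 64 wavelet coefficients, same length as the input block
% Feed 'features' (or e.g. only the approximation coefficients from
% appcoef(c, l, 'haar')) into the Classification Learner app.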

Using MIT HRTF Kemar database on MATLAB

I am a developer and not very familiar with MATLAB beyond the basics. Lately, I read some articles about the KEMAR HRTF database, and I would like to test it in MATLAB to get a clear idea of what it does, then try to implement an Android 3D-audio application using HRTFs.
I looked everywhere for good documentation but couldn't find any (example).
I know I should convolve my input stereo signal with the HRTF, but can anyone explain to me what all the files in the database mean, and which one to use? I'll be grateful.
HRTFs are direction dependent. The database is organized in polar coordinates: the folders correspond to elevation angles, and each file contains the impulse response for a particular azimuth at that elevation (for the left and right channels respectively).
You need to use the impulse responses that correspond to the direction the audio is supposed to come from and convolve your audio data with them (or take the FFT of both, multiply, then take the IFFT); see the sketch below.
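A minimal MATLAB sketch of that convolution step; the file name, and the assumption that a compact-set file stores the left and right impulse responses as the two channels of one .wav, are placeholders to adapt:
[x, fs]   = audioread('input.wav');          % mono source (or one channel of a stereo file)
[hrir, ~] = audioread('elev0/H0e030a.wav');  % hypothetical file: 0 deg elevation, 30 deg azimuth
left  = conv(x(:, 1), hrir(:, 1));           % fold source with the left-ear impulse response
right = conv(x(:, 1), hrir(:, 2));           % fold source with the right-ear impulse response
soundsc([left, right], fs);                  % play the binaural result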
Note that this database is very old. It shouldn't be too hard to get data with better angular resolution (10° resolution in elevation is quite bad).
See http://sofacoustics.org/
http://sofacoustics.org/data/database/ari%20%28artificial%29/ in particular. The data from ARI usually has a resolution of 2.5°.

Classification of X-ray Image using machine learning

How can I classify X-ray image features with a machine learning algorithm, so that when I later test an input by sending an individual's X-ray image features, it tells me whether or not that X-ray is present in the database? I have extracted the features of around 20 images using MATLAB.
If the X-rays you're matching are identical, you don't really need machine learning. Just do a pixel-wise match and check whether the images are, say, 99% identical (to allow for illumination differences in scanning). In MATLAB, you can do this by simply taking the absolute pixel-wise difference of the two images and then counting the number of pixels that differ by more than a pre-defined threshold.
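For example (a minimal sketch; the file names and both thresholds are assumptions to tune for your scans, and the images are assumed to be the same size):
A = im2double(imread('xray1.png'));  % hypothetical stored image
B = im2double(imread('xray2.png'));  % hypothetical query image
d = abs(A - B);                      % absolute pixel-wise difference
frac = nnz(d > 0.05) / numel(d);     % fraction of clearly different pixels
isSame = frac < 0.01;                % the "99% identical" criterion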
If the X-rays are not identical, and you know what features occur repeatedly when the same portion of the body of the same person is X-rayed multiple times, then machine learning would be useful.
It's kind of like face recognition, where you input a face image and the system outputs whether that face is in your dataset. For your problem, the simplest way I can think of is to define a "distance metric" that measures the similarity of two images' features, and set a threshold to judge whether they are the same.
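A minimal sketch of that idea, assuming 'feats' holds one stored feature vector per row, 'query' is the new image's feature row vector, and 'tau' is a hand-picked threshold (vecnorm needs R2017b or newer):
dists = vecnorm(feats - query, 2, 2);  % Euclidean distance to every stored image
[dmin, idx] = min(dists);              % closest database entry
inDatabase = dmin < tau;               % threshold decides presence/absence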

audio pattern matching in matlab

Can someone please give me an idea about this problem in MATLAB?
I have 4 .wav files that contain the chirping of birds. Each .wav file represents a different bird. Given an input .wav file, I need to decide which bird it is. I know I have to compare frequency spectra to get to the solution, but I don't quite know how I should use the spectrogram to help me get there.
P.S. I know what spectrogram does and have plotted quite a few .wav files with it.
There are several methods for pattern recognition problems like the one you are describing.
You can use a frequency analysis like the FFT via the MATLAB function
S = SPECTROGRAM(X,WINDOW,NOVERLAP)
In SPECTROGRAM you need to define the time window of the signal to be analysed via the variable WINDOW. You can use a rectangular window (e.g. WINDOW = [1 1 1 1 1 1 1 ... 1]) with the number of values equal to the desired length. There are many other windows to use: Hanning, Hamming, Blackman. You should use the one that suits your problem best. NOVERLAP is the number of samples by which consecutive windows overlap.
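For instance, a minimal sketch (Signal Processing Toolbox; the file name, window length and overlap are assumptions to tune):
[x, fs] = audioread('bird1.wav');                        % hypothetical recording
x = x(:, 1);                                             % use one channel if the file is stereo
[S, F, T] = spectrogram(x, hamming(256), 128, 512, fs);  % 256-sample Hamming window, 128-sample overlap
imagesc(T, F, 20*log10(abs(S) + eps)); axis xy;          % magnitude in dB over time and frequency
xlabel('Time (s)'); ylabel('Frequency (Hz)');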
Besides this approach, the wavelet transform is also a good technique for your problem. MATLAB also has a good toolbox for discrete and continuous wavelets; see the sketch below.
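A short sketch of the continuous case (Wavelet Toolbox, R2016b or newer; the file name is a placeholder):
[x, fs] = audioread('bird1.wav');  % hypothetical recording
cwt(x(:, 1), fs);                  % plots the magnitude scalogram (default Morse wavelet)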
You might try to solve the problem with Deep Belief Networks.
Here are some articles that might be helpful:
Audio Feature Extraction with Restricted Boltzmann Machines
Unsupervised feature learning for audio classification
To summarize the idea: instead of manually tuning the features, employ RBMs or autoencoders to extract features (bases) that represent the observed audio samples, and then run a learning algorithm on them.
You will need more than 4 audio samples in order to train a DBN, but it is worth trying, as the approach has shown promising results in the past.
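As a lighter-weight variant of the same idea, here is a minimal sketch with MATLAB's autoencoder (Deep Learning Toolbox); 'X' (one training example per column, e.g. spectrogram frames) and the hidden-layer size are assumptions:
autoenc  = trainAutoencoder(X, 50);  % learn 50 hidden units (bases) unsupervised
features = encode(autoenc, X);       % 50-dimensional encoded features per example
% Train any classifier (e.g. fitcknn or fitcsvm) on 'features' and the bird labels.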
This tutorial might also be helpful.
This may prove to be a complicated problem. As a starting point, I advise you to divide each recording into fixed-length frames, e.g. 20 ms with 10 ms overlap, then take the FFT of each frame and extract some maximum-energy frequency values per frame. As a last step, compare the frame frequencies with each other and determine the result by selecting the maximum correlation.
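A minimal sketch of that pipeline (the file name and parameter values are assumptions):
[x, fs] = audioread('bird_query.wav');   % hypothetical input recording
x = x(:, 1);                             % use one channel if the file is stereo
frameLen = round(0.020 * fs);            % 20 ms frames
hop      = round(0.010 * fs);            % 10 ms step, i.e. 10 ms overlap
nFrames  = floor((length(x) - frameLen) / hop) + 1;
peakFreq = zeros(nFrames, 1);
for k = 1:nFrames
    frame = x((k - 1) * hop + (1:frameLen)) .* hamming(frameLen);
    X = abs(fft(frame));
    [~, idx] = max(X(1:floor(frameLen / 2)));  % strongest bin below fs/2
    peakFreq(k) = (idx - 1) * fs / frameLen;   % bin index -> frequency in Hz
end
% Build 'peakFreq' the same way for each of the 4 reference birds, then pick
% the reference whose sequence correlates best with the query (e.g. corrcoef
% on equal-length segments, or compare histograms of the peak frequencies).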