Matlab Image Histogram Analysis: how do I test for an underlying bimodal distribution? - matlab

I am working with image processing in MATLAB. I have two different images whose histogram plots are as shown below.
Image 1:
and
Image 2:
I have multiple images like those and the only distinguishing(separating) features is that some have single peak and others have two peaks.
In other words some can be thresholded (to generate good results) while others cannot. Is there any way I can separate the two images? Are there any functions that do so in MATLAB or any reference code that will help?
The function used is imhist()

If you mean "distinguish" by "separate", then yes: The property you describe is called bimodality, i.e. you have 2 peaks that can be seperated by one threshold. So your question is actually "how do I test for an underlying bimodal distribution?"
One option to do this programmatically is Binning. This is not the most robust method but the easiest. It might work, it might not.
Kernel Smoothing is probably the more robust solution. You basically shift and scale a certain function (e.g. Gaussian) to fit the data. This can be done with histfit in matlab.
There's more solutions for this problem which you can research for yourself since you now know the terms needed. Be aware though that your problem is not a trivial one if you want to do it properly.

Related

Is there an effective way to fit the following two datasets with lsqcurvefit?

I have two complex datasets for which I intend to find a suitable function to fit them. The first dataset is presented as follows:
As you can see, although complicated, it seems that this dataset is a combination of rectangle functions. These data describe the relation of 'Amplitude' of complex numbers with time. The second picture looks like this:
And this relation actually describes the 'Phase' of the above complex numbers with time, it seems that they are also combinations of rectangle functions. At first, I want to use combinations of Fourier cosine and sine series to fit the amplitude and phase using
lsqcurvefit
in MATLAB, but it seems that the provided parameters fail to converge to the correct values. (I have tried a number of options, like adjusting FiniteDifferenceStepSize, FiniteDifferenceType, StepTolerance and so on). Despite many failures, I saw someone said we could use Normal cumulative distribution function (CDF) to fit a step function, and I thought that it might be possible if we use the combinations of parameterized CDF and
y = erfc(x)
to achieve successful fitting. So, could anyone provide any solutions or ways to fit the above two relations? Giving some valuable ideas will also be very helpful to me.
PS: For now I don't care any hidden physics inside these data, and all I want to do is to find a mathematical way to fit the above two relations in MATLAB.
Thanks!

What is the official Matlab way to plot the values of histcounts into a histogram with any normalization option?

Assume that I have an array of counts (ideally returned by histcounts). Is there an official Matlab way to plot such a histogram with all the standard normalization options available?
It seems that the best suggestion I have is to get the counts from histcounts and then plot them with bar. Something like:
edges = linspace(0,bound,nbins);
hist_c = histcounts(X,nbins);
bar(edges(1:nbins-1),hist_c);
unfortunately as far as I know it seems that using bar is really not recommended according to this link. Probably because as its obvious from the code, it seems that it moves a lot of implementation details into user code (like produces edges array manually when only needing nbins or having to know if to use 1:nbins-1 vs 2:nbins).
Furthermore, which I believe is the worst, is that it leave the user to have to implement the normalization options on its own. One may point out that histcounts can do the normalization options for you, however, it can only do them given the data matrix X. If one had an extremely large matrix X, then one would be in trouble because producing the histogram counts of X could be done on the fly (as done in this question) but the other normalization options could not be easily be done on the fly. One practice the user could try to implement each normalization option as described by the equations in the documentation but it seems extremely inefficient to have users implement this by hand. Is there a way to get access to the code that actually performs this normalization?
In reality what my question is going for is, is there an official matlab way to produce histogram having only the histogram counts? In particular hiding all the implementation details of producing the counts, normalization, binning, edges, etc?
The ideal code in my mind should look like this to the user:
histogram_counts = get_hist_count(X)
plot_histogram(histogram_counts,'Normalization',normalization)
and produces the desired histogram plot.
Related question:
https://www.mathworks.com/matlabcentral/answers/332178-how-does-one-plot-a-histogram-from-the-histogram-counts
https://www.mathworks.com/matlabcentral/answers/275278-what-is-the-recommended-practice-for-plotting-the-outputs-of-histcounts
https://www.mathworks.com/matlabcentral/answers/91944-how-can-i-combine-the-options-histc-and-stack-in-a-bar-plot-in-matlab-7-4-r2007a#answer_101295

Multiscale search for HOG+SVM in Matlab

first of all this is my first question here, so I hope I can explain it in a clear way.
My goal is to detect different classes of traffic signs in images. For that purpose I have trained binary SVMs following these steps:
First I got a database of cropped traffic signs like the one in the link. I considered different classes (prohibition, danger, etc), and negative images. All of them were scaled to 40x40 pixels.
http://i.imgur.com/Hm9YyZT.jpg
I trained linear-SVM models for each class (1-vs-all), using HOG as feature. Each image is described with a 1728-dimensional feature. (I append the three feature vectors for all three image planes). I did crossvalidation to set parameter C, and tested on previously unseen 40x40 images, and I got very accurate results (F1 score over 0.9 for all classes). I used libsvm for training and testing.
Now I'd want to detect signs in full road images, sliding a window in different image scales. The problem I'm facing is that I couldn't find any function that can do it for me (as DetectMultiScale in OpenCV), and my solution is very slow and rudimentary (I'm just doing a triple for loop, and for each scale I crop consecutive and overlapping 40x40 images, obtain HOG features and apply svmpredict for each one).
Can someone give me a clue to find a faster way to do it? I thought too about getting the HOG feature vector of the whole input image, and then reorder that vector to a matrix where each row will have the features corresponding to each 40x40 window, but I couldn't find a straightforward way of doing it.
Thanks,
I would suggest using SURF feature detection, however I don't know if this would also be too slow your needs.
See : http://morf.lv/modules.php?name=tutorials&lasit=2 for more information on how to implement and weather it is a viable solution for you.

Dectecting stamp (seals) imprints on digital image with SIFT

I am working on an application that should determine if input image contain a stamp imprint and return its location. For RGB images I am using color segmentation and doing verification (with various shape factors), for grayscale image I thought that SIFT + verification would do the job, but using SIFT would only find those stamps(on input image) that I got in my database.
In ideal case it works really well, as shown on image bellow.
Fig. 1.
http://i.stack.imgur.com/JHkUl.png
The problem occurs when input image contains a stamp that does not exist in database. First thing I did was checking if there would be any matching key points if I compare a similar stamp to the one on input image. In most cases there is no single matching key point and if there is some they rather refer to other parts of input image than a stamp, as shown in Fig. 2.:
Fig. 2.
http://i.stack.imgur.com/coA4l.png
I also tried to find a match between input and circle images as the stamps are circular, but circle image has very few key points, if any.
So I wonder if there is any different approach that will make SIFT a bit more useful in this exact case? I though about creating a matrix with all descriptors and key-points from my database and then looking for nearest euclidean distance between input image and matrix, but it probably wont work as there is a lot of matching key-points(unwanted) across the database (see Fig. 2.).
I'm working with Matlab and tried both VLFeat and D. Lowe SIFT implementations.
Edit:
So I found a way to force SIFT to compute descriptors for user defined points on an image. My test image contained a circle, then the descriptors were computed and matched against input images, including the one under Fig 1 and 2. This process was repeated for scales from 0 to 10. Unfortunately it didn't help too.
This is only a first hint and not a full answer to the SIFT questions.
My impression is that detecting a circle by matching it against an image of a circle via SIFT is not the best approach, especially if the circle you want to detect has some unknown texture inside.
The textbook algorithm for circle detection would be Hough transform, which is mostly used for line detection but does work for any kind of shape which can be described by a low number of parameters (colleagues tell me things get nasty above 3, but a circle just has X,Y and r). There are several implementations in file exchange, the link is just to one example. Hough circle detection requires you to put an upper bound on the radii you want to detect, but this seems ok for your application.
From the examples you provided it looks like you should get quite far if you can detect circles reliably.
Actually I do not think SIFT will be solving this problem. I've been playing around with SIFT for quite some time and my conclusion is that it's really great for identifying identical patterns but not for similar patterns.
Just have a look at the construction of the SIFT feature vector: The descriptor is composed of several histograms of gradients(!). If you have patterns in the database that have very similar blob like structures in the stamps, then you might have a chance. But if this does not hold, then I guess you will not be very lucky.
From my point of view you have kind of solved the problem of finding indentical objects (stamps) and now extend to finding similar objects. This sounds like the same but in my past research I found these problems just related but not too identical.
Do you have any runtime constraints in your application? There might be other approaches but in this case, more input about possible constraints might be useful.
Update regarding constraints:
So your next task might be to detect the unknown stamps, right?
This sounds like a classification task.
In your case I would first try to find a descriptor/representation (or SVM) that classifies images into stamp/no-stamp. In order to evaluate this, set up a data base with ground truth and a reasonable amount of "unknown" stamps and other images like random snapshots from the letters, NOT containing stamps. This will be your test set.
Then try some descriptors/representations to caluclate the distance/similarity between your images to classify your test set into the classes STAMP / NO-STAMP. When you have found a descriptor/distance measure (or SVM) that performs well in classifying, then you could perform a sliding window approach on a letter to find a stamp. The sliding window approach is certainly not a very fast method, but a very easy one.
At least when you have reached this point, you can tune the detection - for example based on interesting point detectors.. but one step after the other...

How to Compare the quality of two images?

I have applied Two different Image Enhancement Algorithm on a particular Image and got two resultant image , Now i want to compare the quality of those two image in order to find the effectiveness of those two Algorithms and find the more appropriate one based on the comparison of Feature vectors of those two images.So what Suitable Feature Vectors should i compare in this Case?
Iam asking in context of comparing the texture features of the images and which feature vector will be more suitable.
I need Mathematical support for verifying the effectiveness of any one algorithm based on the evaluation of Images for example using Constrast and Variance.So are there any more approaches do that?
A better approach would be to do some Noise/Signal ratio by comparing image spectra ?
Slayton is right, you need a metric and a way to measure against it, which can be an academic project in itself. However, i could think of one approach straightaway, not sure if it makes sense to your specific task at hand:
Metric:
The sum of abs( colour difference ) across all pixels. The lower, the more similar the images are.
Method:
For each pixel, get the absolute colour difference (or distance, to be precise) in LAB space between original and processed image and sum that up. Don't ruin your day trying to understand the full wikipedia article and coding that, this has been done before. Try re-using the methods getDistanceLabFrom(Color color) or getDistanceRgbFrom(Color color) from this PHP implementation. It worked like a charm for me when i needed a way to match a color of pixels in a jpg picture - which basically is the same principle.
The theory behind it (as far as my limited understanding goes): It's doing a mathematical abstraction of rgb or (better) lab colour space as a three dimensional room, and then calculate the distance, that's why it works well - and hardly worked for me when looking at a color code from a one-dimensionional perspective.
The usual way is to start with a reference image (a good one), then add some noise on it (in a controlled way).
Then, your algorithm should remove as much as possible from the added noise. The results are easy to compare with a signal-to-noise ration (see wikipedia).
Now, the approach is easy to apply on simple noise models, but if you aim to improve more complex appearance issues, you must devise a way to apply the noise, which is not easy.
Another, quite common way to do it is the one recommended by slayton: take all your colleagues to appreciate the output of your algorithm, then average their impressions.
If you have only the 2 images and no reference (higest quality) image, then you can see my rude solution/bash script there: https://photo.stackexchange.com/questions/75995/how-do-i-compare-two-similar-images-sharpness/117823#117823
It gets the 2 filenames and outputs the higher quality filename. It assumes the content of the images is identical (same source image).
It can be fooled though.