Otsu method (graythresh function in MATLAB) produces a scaled result on which scale? 0:255, 0:max(px intensity), min:max?

Just clarifying a point about the Otsu thresholding method that is not well defined in the documentation or the Wikipedia article. If you apply the Otsu method (in MATLAB, the graythresh function), it returns a threshold value between 0 and 1.
Given 2 hypothetical grayscale images:
dark (with pixel intensities in the range of 0 to 100) and
light (with pixel intensities in the range of 155 to 255)
If I got an Otsu threshold of 0.75 for both the dark and the light image, what grayscale pixel intensity would it map to in each case?
dark -> 75 and light -> 231, i.e. relative to the range of values in each image
dark -> 75 and light -> 191, i.e. relative to the range 0 to the max pixel value
dark -> 191 and light -> 191, i.e. relative to the full range of grayscale pixel values (0-255)?

The accepted answer by @Ratbert makes the incorrect claims that "The correct answer is the first one" and that "graythresh uses the min and max values in the image as boundaries, which is the most logical behavior." rayryeng appears to agree with it, and David Parks appears to have empirically verified it.
The correct answer is given by Anand, which strangely seems to have a negative vote. He explains very convincingly that the "full range of grayscale pixel values" depends on the data type of the input image. As he explains, this is the third option, except for the fact that the dark image could not possibly get a threshold of 0.75.
First let us clarify the difference between the claims for the simplest case, in plain MATLAB, so there is no confusion. For an image with values ranging from min to max, the question poses three possibilities which, when translated into equations, are:
threshold = min + (max-min) * graythresh
threshold = max * graythresh
threshold = 255 * graythresh
Suppose the image consists of just two points, one with an intensity of 0 and the other with 100. This means dark = uint8([0 100]);. A second image is light = dark+155;. When we compute 255*graythresh(dark) we get exactly 49.5. When we compute 255*graythresh(light) we get exactly 204.5. These answers make it patently obvious that the third option is the only possibility.
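For reference, the two-point experiment described above is just a few lines (the numbers in the comments are the values quoted in this answer):
dark  = uint8([0 100]);          % two pixels, intensities 0 and 100
light = dark + 155;              % intensities become 155 and 255
255 * graythresh(dark)           % 49.5
255 * graythresh(light)          % 204.5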
There is one further fine point. If you try 255*graythresh(uint8(1:2)) the answer is 1, and not 1.5. So it appears that if you are using graythresh to threshold an image, you should use image <= 255*graythresh(image) with a less-than-or-equal-to, rather than a plain less-than.
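As a minimal usage sketch (cameraman.tif is just an example uint8 image that ships with the Image Processing Toolbox):
img = imread('cameraman.tif');   % any uint8 grayscale image
T = 255 * graythresh(img);       % threshold on the 0-255 scale
darkPixels = img <= T;           % note the <=, as discussed above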

Your third answer seems most right to me, with the clarification that 'full range of grayscale pixel values' depends on the data type of the input image. For example, for a uint8 image, an Otsu threshold of 0.75 corresponds to around 191. For a uint16 image, this would correspond to 49151.
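A minimal sketch of that mapping, using intmax to get the top of each integer class's range:
level = 0.75;                              % value returned by graythresh
t8  = level * double(intmax('uint8'));     % 191.25 for a uint8 image
t16 = level * double(intmax('uint16'));    % 49151.25 for a uint16 image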

Well, for posterity's sake I did a comparison of the approaches mentioned before. I took a typical image with a full range of grayscale pixel intensities, then took a dark and a light version of the same image and got the graythresh value for each. I applied the Otsu threshold using each of the above mappings.
The light image pretty clearly demonstrates that the algorithm is producing a threshold over the range of the images' pixel intensities. Were we to assume a full range of pixel intensities, the Otsu algorithm would be producing a threshold below any actually present in the image, which doesn't make much sense, at least presuming the existing black background is transparent/irrelevant.
I suppose someone could make an argument for assuming the full range of pixel intensities if you assume the existing black part of the image were relevant darkness. I would certainly welcome comments there.
Amending my words above: when I blacken all but the top half of the light image and take the Otsu threshold again, I get the same threshold value of 0.3020. Were the dark parts of the image relevant to the Otsu threshold being produced, the extra darkness would have affected the value; therefore, Ratbert's answer is empirically demonstrated as correct.

The correct answer is the first one:
dark = 75 and light = 230, relative to the range of values in each image
graythresh uses the min and max values in the image as boundaries, which is the most logical behavior.


Histogram after thresholding

I have an image and I apply thresholding to it to apply a binary mask. I draw the histogram before and after the thresholding process. The histograms look like below.
The second figure, which is after thresholding, doesn't show any peaks. Does that mean my thresholding is wrong? Can anyone please explain these histograms?
Update
Image after thresholding
To summarize Sardar's comment, the horizontal range of your plot is tight. Simply loosen the range a bit so you can see the result better. Doing xlim([-0.5 1.5]); will certainly do that, and we can see that in the last figure of your update. How you interpret the histogram... well, for black and white images, examining the histogram is never meaningful because there are only two possible intensities to examine - 0 and 1. Histograms usually give a glimpse of the contrast of an image. If the histogram is spread out, this usually indicates that the image has high contrast. However, if the histogram is within a small range, this usually means the image has poor contrast.
Remember that the histogram simply counts the occurrence of instances in a data set. In this case, we are counting how many times we see 0 and 1 in the image. Referring to your last plot, this means there are approximately 9000 pixels with intensity 0 and approximately 4000 pixels with intensity 1. This gives absolutely no indication as to the contrast or the spread of the intensities in your image, because there are only two possible intensities seen in the image. As such, to answer your question in such a long-winded way, the answer is that you can't really interpret anything.
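As a minimal sketch of the plotting suggestion above (BW is assumed to be your thresholded binary image):
histogram(double(BW(:)));        % counts of the only two values present, 0 and 1
xlim([-0.5 1.5]);                % loosen the horizontal range so both bars are visible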
The only thing I can possibly suggest is that it tells you the ratio of object pixels to background pixels, which could indicate a measure of quality. Usually when we determine what is an object and what are background pixels, we would expect more background pixels than object pixels in order to discern the object from the background. Therefore, the more black pixels you have, the better it may be. That being said, I can't really say more unless you actually show us what your image looks like after you threshold it.

Depth image from Kinect: Indexed image?

I am somewhat confused about how a depth image from a Kinect v1 should be handled within MATLAB. I am using these mex-files (http://www.mathworks.com/matlabcentral/fileexchange/30242-kinect-matlab) to read my depth images from saved *.oni files. As a result, I get images with a resolution of 640x480 or 320x240. The values in the images range roughly from 0 to 4500.
What type of image am I dealing with here? Reading this http://www.mathworks.de/de/help/matlab/creating_plots/working-with-8-bit-and-16-bit-images.html I would assume it has to be an indexed image, because it is not an RGB image and the values are not linearly scaled. On the other hand, I believe the values in the image are actual distances to the Kinect's focal plane in mm and therefore have a meaning other than an index.
When I want to look at the image using
imshow(depthMap);
I only see black. I have to use something like
imshow(depthMap, [0 9000])
to actually see something. Why exactly is that? What does imshow(depthMap) do with the values?
Do you think it's correct to use depthMap as it is in my algorithms, but look at it using
imshow(depthMap, [0 9000])
?
depthMap is not an indexed image, but every pixel codes the distance from the focal plane in mm, as you correctly believe.
To show such an image using imshow, I suggest using auto-scaling by default, i.e. imshow(depthMap,[]), or a fixed scale (as you're currently doing) if there is a useful meaning to the minimum and maximum. Turn on the colorbar to visualize the correspondence between color and depth, either via the GUI or by calling colorbar().
imshow with no scale argument will scale the color limits to [0 1], unless the image is of an integer class, in which case the color limits are set to the class range ([0 255] for uint8, [0 65535] for uint16). In other words, for a double image with no scale argument, values of 0 and lower are mapped to black, and values of 1 and higher are mapped to white. See also caxis.
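Putting the display suggestions together, a minimal sketch:
imshow(depthMap, []);            % auto-scale the display range to the data's min and max
colorbar;                        % show which gray level corresponds to which depth in mm
imshow(depthMap, [0 9000]);      % or keep the fixed scale from the question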

Garment Cropping from mannequin

I have two images – mannequin with and without garment.
Please refer to the sample images below. Ignore the jewels and footwear on the mannequin, and imagine the second mannequin has only the dress.
I want to extract only the garment from the two images for further processing.
The complexity is that there is a slight displacement in the camera position between the two pictures. Because of this, a simple subtraction to generate the garment mask will not work.
Can anyone tell me how to handle it?
I think I need to register the two images so that I can extract only the garment from the image?
Any references to blogs, articles and code are highly appreciated.
--
Thanks
Idea
This is an idea of how you could do it; I haven't tested it, but my gut tells me it might work. I'm assuming that there will be slight differences in the pose of the mannequin as well as in the camera attitude.
Let the original image be A, and the clothed image be B.
Take the difference D = |A - B| and apply a median filter whose kernel size is proportional to the largest deviation you expect from pose and camera-attitude error: Dmedian = Median(D, kernelsize).
Quantize Dmedian into a binary mask Dmask = Q(Dmedian, threshold) using appropriate threshold values to obtain an approximate mask for the garment (this will be smaller than the garment itself due to the median filter). Reject any shapes in Dmedian that have too small an area by setting their pixels to 0.
Expand the shape(s) in Dmask proportionally to the size of the median kernel into Emask = expand(Dmask, k*kernelsize). Then construct the difference of the masks, Fmask = |Dmask - Emask|, which now contains the areas of pixels where the garment edge is expected to be. For every pixel in Fmask that lies in this area, find the correlation Cxy between A and B using a small neighbourhood, and store the correlations into an image C = 1.0 - Corr(A, B, Fmask, n).
Your final garment mask will be M=C+Dmask.
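A rough, untested MATLAB sketch of the masking steps above; the file names, the 0.1 threshold, the blob-size cutoff and the dilation radius are illustrative assumptions to be tuned:
A = im2double(rgb2gray(imread('mannequin_bare.jpg')));      % hypothetical file names
B = im2double(rgb2gray(imread('mannequin_clothed.jpg')));
D = abs(A - B);                                             % difference image
kernelsize = 15;                                            % ~ largest expected misalignment, in pixels
Dmedian = medfilt2(D, [kernelsize kernelsize]);             % remove thin misalignment lines
Dmask = Dmedian > 0.1;                                      % quantize with a chosen threshold
Dmask = bwareaopen(Dmask, 500);                             % reject shapes with too small an area
Emask = imdilate(Dmask, strel('disk', 2*kernelsize));       % expand proportionally to the kernel size
Fmask = Emask & ~Dmask;                                     % band where the true garment edge is searched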
Explanation
Since your image has nice and continuous swatches of colour, the difference between the two similar images will be thin lines and small gradients where the pose and camera attitude differ. When taking a median filter of the difference image over a sufficiently large kernel, these lines will be removed because they are in a minority of the pixels.
The garment, on the other hand, will (hopefully) differ significantly from the colors in the unclothed version and will generate a bigger difference. Thresholding the difference after the median filter should give you a rough mask of the garment that is undersized due to some of the pixels on the edge being rejected because their median values are too low. You could stop here if the approximation is good enough for you.
By expanding the mask we obtained above, we get a probable region for the "true" edge. The above process has served to narrow our search region for the true edge considerably, and we can apply a more costly correlation search between the images along this edge to find where the garment is. High correlation means no garment and low correlation means garment.
We use the inverted correlation as an alpha value together with the initially smaller mask to obtain an alpha-valued mask of the garment that can be used for extracting it.
Clarification
Expand: What I mean by "expanding the mask" is to find the contour of the mask region and outset/grow/enlarge it to make it larger.
Corr(A,B,Fmask,n): is just an arbitrarily chosen correlation function that gives the correlation between pixels in A and B that are selected by the mask Fmask, using a region of size n. The function returns 1.0 for a perfect match and 0.0 for an anti-match for each pixel tested. A good function is this pseudocode:
foreach px_pos in Fmask where Fmask[px_pos] == 1
    Ap = subregion(A, px_pos, size) - mean(mean(A));
    Bp = subregion(B, px_pos, size) - mean(mean(B));
    Cxy = sum(sum(Ap .* Bp))*sum(sum(Ap .* Bp)) / (sum(sum(Ap.*Ap))*sum(sum(Bp.*Bp)));
    C[px_pos] = 1.0 - Cxy;
end
where subregion selects a region of size size around the pixel with position px_pos.
You can see that if Ap == Bp, then Cxy = 1.
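For what it's worth, a runnable MATLAB version of that pseudocode might look like the sketch below; the function name, the border clipping and the eps guard against division by zero are my additions:
function C = edgeCorrelation(A, B, Fmask, n)
% A, B: double grayscale images of equal size; Fmask: logical edge band; n: odd window size
h = (n - 1) / 2;                                  % half window
C = zeros(size(A));
meanA = mean(A(:));                               % global means, as in the pseudocode
meanB = mean(B(:));
[rows, cols] = find(Fmask);
for k = 1:numel(rows)
    r = rows(k); c = cols(k);
    r1 = max(1, r-h); r2 = min(size(A,1), r+h);   % clip the window at the image borders
    c1 = max(1, c-h); c2 = min(size(A,2), c+h);
    Ap = A(r1:r2, c1:c2) - meanA;
    Bp = B(r1:r2, c1:c2) - meanB;
    Cxy = sum(Ap(:).*Bp(:))^2 / (sum(Ap(:).^2) * sum(Bp(:).^2) + eps);
    C(r, c) = 1.0 - Cxy;                          % low correlation -> likely garment
end
end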

How to remove camera noises in CMOS camera

Herewith I have attached two consecutive frames captured by a CMOS camera with an IR filter. The checkerboard object was stationary while the images were captured, but the two images still differ in nearly 31000 pixels. This could affect my results. What kind of noise is this, and how can I remove it? Please suggest any algorithms or functions that could remove this noise.
Thank you. Sorry for my poor English.
Image 1: http://i45.tinypic.com/2wptqxl.jpg
Image 2: http://i45.tinypic.com/v8knjn.jpg
That noise appears to result from the camera sensor (Bayer to RGB conversion); there is still a checkerboard pattern left in it.
Lossy JPEG compression also contributes a lot to the process. You should first get access to the raw images.
From those particular images, I'd first try using edge-detection filters (Sobel horizontal and vertical) to make a mask that selects between some median / local histogram equalization for the flat areas and some checkerboard-reducing filter for the edges. The point is that probably no single filter can do well both on the JPEG ringing artifacts and on the jagged edges. Then the real question is: what other kinds of images need to be processed?
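As a rough illustration of the edge-mask part of that idea (the 0.2 threshold and the choice of imfilter are my assumptions):
g = im2double(img);                       % img: the grayscale frame to be cleaned
gh = imfilter(g, fspecial('sobel'));      % horizontal-edge response
gv = imfilter(g, fspecial('sobel')');     % vertical-edge response
edgeMask = hypot(gh, gv) > 0.2;           % true near edges, false on flat areas
% apply a checkerboard-reducing filter where edgeMask is true, and e.g. medfilt2 elsewhere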
From the comments: if the corner points are to be made exact, then the solution is more likely to search for features (corner points with subpixel resolution), map one image's set of corners to the other image's set, and search for the best affine transformation matrix that converts one set into the other. With this matrix one can then resample the other image.
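With the Image Processing Toolbox, the transform-and-resample step could be sketched as follows; movingPoints and fixedPoints are assumed to be matched Nx2 corner coordinates obtained from some corner detector:
tform = fitgeotrans(movingPoints, fixedPoints, 'affine');           % best-fit affine transform
img2reg = imwarp(img2, tform, 'OutputView', imref2d(size(img1)));   % resample img2 onto img1's grid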
One can fortunately estimate motion vectors with subpixel resolution without brute-force searching all possible subpixel locations: when calculating a matched filter, one gets local maxima as candidates for exact matches. But this is not all there is. One can calculate a more precise approximation of the peak location by studying the matched-filter outputs in the nearby pixels. For an exact match the output should be symmetric; otherwise the 'energies' of the matched filter are biased towards the second-best location. (A 2nd-degree polynomial fit plus finding its maximum can work.)
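In one dimension, that parabolic refinement is only a few lines; y is assumed to hold the matched-filter responses at consecutive integer offsets:
[~, i] = max(y);                                               % integer peak location
if i > 1 && i < numel(y)
    d = (y(i-1) - y(i+1)) / (2*(y(i-1) - 2*y(i) + y(i+1)));    % vertex of the fitted parabola
    peak = i + d;                                              % sub-pixel peak position
end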
Looking closely at these images, I must agree with @Aki Suihkonen.
In my view, the main noise comes from the JPEG compression, which causes sharp edges to "ring". I'd try a "de-speckle" type of filter on the images and see if it makes a difference. Some information that can help you implement this can be found in this link.
In a more quick-and-dirty fashion, you can apply one of the many standard tools. For example, given that the images are a and b:
(i) Just smooth the images with a Gaussian filter; this can reduce noise differences between the images by an order of magnitude. For example (a and b should be of class double, since conv2 does not accept integer types):
h = fspecial('gaussian',15,2);   % 15x15 Gaussian kernel with sigma = 2
a = conv2(a,h,'same');
b = conv2(b,h,'same');
(ii) Reduce Noise By Adaptive Filtering
a = wiener2(a,[5 5]);
b = wiener2(b,[5 5]);
(iii) Adjust Intensity Values Using Histogram Equalization
a = histeq(a);
b = histeq(b);
(iv) Adjust Intensity Values to a Specified Range
a = imadjust(a,[0 0.2],[0.5 1]);
b = imadjust(b,[0 0.2],[0.5 1]);
If your images are supposed to be black and white but you have captured them in grayscale, there could be differences due to noise.
You can convert the images to black and white by defining a threshold: any pixel with a value less than that threshold should be assigned 0, and anything larger than that threshold should be assigned 1, or whatever the top of your grayscale range is (maybe 255).
Assume your image is I, its gray levels run from 0 to 255, and you choose a threshold of 100:
ind = find(I < 100);
I(ind) = 0;
ind = find(I >= 100);
I(ind) = 255;
Now you have a black-and-white image. Do the same thing for the other image and you should get a very small difference if the camera and the subject have not moved.

Variance and Mean of Image

I am calculating the mean and variance of my original and stego images to compare them.
I am using grayscale BMP images for the comparison.
image=imread("image name")
M = mean(image(:))
V = var((image(:)))
Is this the correct way of calculating the mean and variance in MATLAB? My variance is coming out larger than the mean.
Any help appreciated..
This is indeed the correct way to calculate the mean and variance over all the pixels of your image.
It is not impossible for your variance to be larger than the mean, as the two are defined as follows:
mean = sum(x)/length(x)
variance = sum((x - mean(x)).^2)/(length(x) - 1);
For example, if you generate noise from a standard normal distribution with randn(N,1), you will get N samples, and if you calculate the mean and variance, you will get approximately 0 and 1. So there as well, your variance may well be larger than the mean.
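A quick check of that example:
x = randn(10000,1);     % 10000 samples from a standard normal distribution
mean(x)                 % approximately 0
var(x)                  % approximately 1, i.e. larger than the mean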
Both have a totally different meaning: the mean gives you an idea where your pixels are (i.e. are they white, black, 50% gray, ...). The mean will give you an idea of what pixel color to choose to summarize the color of the complete image. The variance gives you an idea how the pixel values are spread: e.g. if your mean pixel value is 50% gray, are most of the other pixels also 50% gray (small variance) or do you have 50 black pixels and 50 white pixels (large variance)? So you could also view it as a way to get an idea how well the mean summarizes the image (i.e. with zero variance, most of the information is captured by the mean).
edit: For the RMS value (Root Mean Square) of a signal, just do what the definition says. In most cases you want to remove the DC component (i.e. the mean) before calculating the RMS value.
edit 2: What I forgot to mention was: it also makes little sense to compare the numerical value of the variance with the mean from a physical point of view. The mean has the same dimension as your data (in the case of pixels, think of intensity), while the variance has the dimension of your data squared (so intensity^2). The standard deviation (std in MATLAB), which is the square root of the variance, on the other hand has the same dimension as the data, so there you could make some comparisons (whether you should make such comparisons is another question).
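So if you want a number that is directly comparable to the mean, take the standard deviation instead; a one-line sketch, using img here for the image data since image shadows a built-in, as noted below:
S = std(double(img(:)));    % same units as the pixel intensities, unlike the variance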
If you are working with an RGB image (H x W x 3), you have to calculate the mean and variance separately for each channel. In this case the mean pixel will also be a 3-element vector.
for ch = 1:3
M(ch) = mean(reshape(img(:,:,ch),[],1));
V(ch) = var(reshape(img(:,:,ch),[],1));
end
MATLAB has a built-in function called image. Avoid using it as a variable name.