Matlab and OpenCV calculate different image moment m00 for the same image - matlab

For exactly the same image, OpenCV and MATLAB give me different values.
OpenCV code:
Mat img = imread("testImg.png", 0);
Mat img_bw;
threshold(img, img_bw, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);
// findContours modifies its input, so work on a copy
Mat tmp;
img_bw.copyTo(tmp);
vector<vector<Point> > contours;
vector<Vec4i> hierarchy;
findContours(tmp, contours, hierarchy, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_NONE);
// Get the moments
vector<Moments> mu(contours.size());
for (size_t i = 0; i < contours.size(); i++)
{
    mu[i] = moments(contours[i], false);
}
// Display area (m00)
for (size_t i = 0; i < contours.size(); i++)
{
    cout << mu[i].m00 << endl;
    // I also tried
    // cout << contourArea(contours.at(i)) << endl;
    // but the result is the same
}
MATLAB code:
Img = imread('testImg.png');
lvl = graythresh(Img);
bw = im2bw(Img,lvl);
stats = regionprops(bw,'Area');
for k = 1:length(stats)
Area = stats(k).Area; %m00
end
Does anyone have any thoughts on this? How can I unify them? I think they use different methods to find contours.
I uploaded the test image at the link below so that anyone interested can reproduce the procedure.
It is a small 100 by 100 8-bit grayscale image containing only pixel intensities 0 and 255. For simplicity, it has only one blob.
For OpenCV, the area of the contour (image moment m00) is 609.5 (a very odd value).
For MATLAB, the area of the contour (image moment m00) is 763.
Thanks

There are many different definitions of how contours should be extracted from a binary image. For example, the contour can be the polygon that forms the perimeter of the white object in the binary image. If OpenCV used this definition, the areas of its contours would be the same as the areas of the connected components found by MATLAB. But this is not the case. The contour found by the findContours() function is the polygon that connects the centers of neighboring "edge pixels". An edge pixel is a white pixel that has a black neighbor in its N4 neighborhood.
Example: suppose you have an image of 100x100 pixels. Every pixel above the diagonal is black. Every pixel on or below the diagonal is white (a black triangle and a white triangle). The exact separation polygon will have almost 200 vertices at a distance of 1 pixel from each other: (0,0), (1,0), (1,1), (2,1), (2,2), ..., (100,99), (100,100), (0,100). As you can see, this definition is not very good from a practical point of view. The polygon returned by OpenCV will have exactly the 3 vertices needed to define the triangle: (0,0), (99,99), (0,99). Its area is (99 x 99 / 2) pixels. It is not equal to the number of white pixels. It is not even an integer. But this polygon is more practical than the previous one.
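The arithmetic in this example can be checked with a short script. This is only a sketch in plain Python rather than OpenCV, chosen to keep it self-contained; `polygon_area` and `count_white_pixels` are hypothetical helper names, and the shoelace formula stands in for the polygon area that a contour-based m00 reports.

```python
def polygon_area(vertices):
    """Shoelace formula: area of a simple polygon given as (x, y) vertices."""
    twice_area = 0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2


def count_white_pixels(size):
    """Count pixels on or below the diagonal of a size x size image."""
    return sum(1 for row in range(size) for col in range(size) if col <= row)


triangle = [(0, 0), (99, 99), (0, 99)]
contour_area = polygon_area(triangle)   # polygon through the pixel centers
pixel_count = count_white_pixels(100)   # what a pixel-counting region area gives
print(contour_area, pixel_count)
```

The two numbers differ because the polygon runs through pixel centers, so roughly half a pixel is lost along every edge.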
Those are not the only possible definitions for polygon extraction. Many others exist. Some of them (in my opinion) may be better than the one used by OpenCV. But this is the one that was implemented and is used by a lot of people.
Currently there is no effective workaround for your problem. If you want to get exactly the same numbers from MATLAB and OpenCV, you will have to draw the contours found by findContours (filled) on a black image and use the moments() function on that image. I know that the upcoming OpenCV 3 has a function that finds connected components, but I have not tried it myself.
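To see why drawing the filled contour recovers the pixel count, here is a minimal Python sketch. It assumes the fill covers every pixel center inside or on the polygon, and that the polygon is simple with integer vertices. It uses Pick's theorem as a stand-in for actually rasterizing the contour; the answer does not use that theorem, and `covered_pixels` is a hypothetical helper name.

```python
from math import gcd


def covered_pixels(vertices):
    """Lattice points inside or on a simple polygon with integer vertices."""
    n = len(vertices)
    twice_area = 0
    boundary = 0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
        # lattice points on this edge, counting one endpoint
        boundary += gcd(abs(x1 - x0), abs(y1 - y0))
    twice_area = abs(twice_area)
    # Pick's theorem: A = I + B/2 - 1, so I + B = (2A + B + 2) / 2
    return (twice_area + boundary + 2) // 2


print(covered_pixels([(0, 0), (99, 99), (0, 99)]))
```

For the triangle from the example this gives the same count as the number of white pixels, which is why the filled drawing matches a pixel-counting area.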

Related

RGB Depth Alignment [duplicate]

I am trying to align two images, one RGB and one depth, using MATLAB. Please note that I have checked several places for this, like here and here, which require a Kinect device, and here and here, which say that camera parameters are required for calibration. It was also suggested that I use epipolar geometry to match the two images, though I do not know how. The dataset I am referring to is the rgb-d-t face dataset. One example is shown below:
The ground truth, that is, the bounding boxes that specify the face region of interest, is already provided, and I use it to crop the face regions only. The MATLAB code is shown below:
I = imread('1.jpg');
I1 = imcrop(I,[218,198,158,122]);
I2 = imcrop(I,[243,209,140,108]);
figure, subplot(1,2,1),imshow(I1);
subplot(1,2,2),imshow(I2);
The two cropped images, RGB and depth, are shown below:
Is there any way to register/align the images? I took the hint from here, where a basic Sobel operator is applied to both the RGB and depth images to generate an edge map, after which keypoints need to be generated for matching. The edge maps for both images are shown here.
However, they are so noisy that I do not think keypoint matching will work for these images.
Can anybody suggest some algorithms in MATLAB to do this?
prologue
This answer is based on my previous answer:
Does Kinect Infrared View Have an offset with the Kinect Depth View
I manually cropped your input image to separate the color and depth images (my program needs them separated). This could shift the offset by a few pixels. Also, I do not have the real depths (the depth image is only 8-bit because it is stored as grayscale RGB), so the depth accuracy I work with is very poor, see:
So my results are negatively affected by all this. Anyway, here is what you need to do:
determine FOV for both images
Find some measurable feature visible in both images. The bigger it is, the more accurate the result. For example, I chose these:
form a point cloud or mesh
I use the depth image as the reference, so my point cloud is in its FOV. As I do not have distances but 8-bit values instead, I converted them to a distance by multiplying by a constant. I scan the whole depth image and create a point in my point cloud array for every pixel. Then I convert the depth pixel coordinate to the color image FOV and copy its color too. Something like this (in C++):
picture rgb,zed; // your input images
struct pnt3d { float pos[3]; DWORD rgb; pnt3d(){}; pnt3d(pnt3d& a){ *this=a; }; ~pnt3d(){}; pnt3d* operator = (const pnt3d *a) { *this=*a; return this; }; /*pnt3d* operator = (const pnt3d &a) { ...copy... return this; };*/ };
pnt3d **xyz=NULL; int xs,ys,ofsx=0,ofsy=0;
void copy_images()
{
int x,y,x0,y0;
float xx,yy;
pnt3d *p;
for (y=0;y<ys;y++)
for (x=0;x<xs;x++)
{
p=&xyz[y][x];
// copy point from depth image
p->pos[0]=2.000*((float(x)/float(xs))-0.5);
p->pos[1]=2.000*((float(y)/float(ys))-0.5)*(float(ys)/float(xs));
p->pos[2]=10.0*float(DWORD(zed.p[y][x].db[0]))/255.0;
// convert depth image x,y to color image space (FOV correction)
xx=float(x)-(0.5*float(xs));
yy=float(y)-(0.5*float(ys));
xx*=98.0/108.0;
yy*=106.0/119.0;
xx+=0.5*float(rgb.xs);
yy+=0.5*float(rgb.ys);
x0=xx; x0+=ofsx;
y0=yy; y0+=ofsy;
// copy color from rgb image if in range
p->rgb=0x00000000; // black
if ((x0>=0)&&(x0<rgb.xs))
if ((y0>=0)&&(y0<rgb.ys))
p->rgb=rgb2bgr(rgb.p[y0][x0].dd); // OpenGL has reverse RGB order than my image
}
}
where **xyz is my point cloud 2D array allocated to the depth image resolution. picture is my image class for DIP, so here are some relevant members:
xs,ys is the image resolution in pixels
p[ys][xs] is direct pixel access to the image as a union of DWORD dd; BYTE db[4]; so I can access a color as a single 32-bit variable or each color channel separately.
rgb2bgr(DWORD col) just reorders the color channels from RGB to BGR.
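The FOV correction inside copy_images() can be isolated as a small function. The following is a hedged Python sketch of just that coordinate mapping, not of the whole point cloud code. The name `depth_to_color_pixel` and its parameters are made up for illustration. The default scale factors are the 98/108 and 106/119 ratios from the code above, which are specific to this image pair.

```python
def depth_to_color_pixel(x, y, depth_size, color_size,
                         scale=(98.0 / 108.0, 106.0 / 119.0), offset=(0, 0)):
    """Map a depth pixel (x, y) to the matching color pixel, or None if outside."""
    xs, ys = depth_size      # depth image resolution
    cxs, cys = color_size    # color image resolution
    # move the origin to the image center, rescale, move the origin back
    xx = (x - 0.5 * xs) * scale[0] + 0.5 * cxs
    yy = (y - 0.5 * ys) * scale[1] + 0.5 * cys
    # truncate like the C++ float-to-int assignment, then add the camera offset
    x0 = int(xx) + offset[0]
    y0 = int(yy) + offset[1]
    if 0 <= x0 < cxs and 0 <= y0 < cys:
        return x0, y0
    return None


print(depth_to_color_pixel(50, 40, (100, 80), (100, 80), offset=(17, 4)))
```

Returning None for out-of-range pixels corresponds to leaving the point black in the original code.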
render it
I use OpenGL for this so here the code:
glBegin(GL_QUADS);
for (int y0=0,y1=1;y1<ys;y0++,y1++)
for (int x0=0,x1=1;x1<xs;x0++,x1++)
{
float z,z0,z1;
z=xyz[y0][x0].pos[2]; z0=z; z1=z0;
z=xyz[y0][x1].pos[2]; if (z0>z) z0=z; if (z1<z) z1=z;
z=xyz[y1][x0].pos[2]; if (z0>z) z0=z; if (z1<z) z1=z;
z=xyz[y1][x1].pos[2]; if (z0>z) z0=z; if (z1<z) z1=z;
if (z0 <=0.01) continue;
if (z1 >=3.90) continue; // 3.972 for everything above ~3.95 m and 4.000 if nothing is captured at all
if (z1-z0>=0.10) continue;
glColor4ubv((BYTE* )&xyz[y0][x0].rgb);
glVertex3fv((float*)&xyz[y0][x0].pos);
glColor4ubv((BYTE* )&xyz[y0][x1].rgb);
glVertex3fv((float*)&xyz[y0][x1].pos);
glColor4ubv((BYTE* )&xyz[y1][x1].rgb);
glVertex3fv((float*)&xyz[y1][x1].pos);
glColor4ubv((BYTE* )&xyz[y1][x0].rgb);
glVertex3fv((float*)&xyz[y1][x0].pos);
}
glEnd();
You need to add the OpenGL initialization, camera settings, etc., of course. Here is the unaligned result:
align it
As you may have noticed, I added the ofsx,ofsy variables to copy_images(). This is the offset between the cameras. I change them by 1 pixel on arrow keystrokes, then call copy_images and render the result. This way I found the offset manually very quickly:
As you can see, the offset is +17 pixels in the x axis and +4 pixels in the y axis. Here is a side view to better see the depths:
Hope it helps a bit
Well, I have tried doing it after reading lots of blogs. I am still not sure whether I am doing it correctly. Please feel free to comment if something is amiss. For this I used a MathWorks File Exchange submission that can be found here: the ginputc function.
The MATLAB code is as follows:
clc; clear all; close all;
% no of keypoint
N = 7;
I = imread('2.jpg');
I = rgb2gray(I);
[Gx, Gy] = imgradientxy(I, 'Sobel');
[Gmag, ~] = imgradient(Gx, Gy);
figure, imshow(Gmag, [ ]), title('Gradient magnitude')
I = Gmag;
[x,y] = ginputc(N, 'Color' , 'r');
matchedpoint1 = [x y];
J = imread('2.png');
[Gx, Gy] = imgradientxy(J, 'Sobel');
[Gmag, ~] = imgradient(Gx, Gy);
figure, imshow(Gmag, [ ]), title('Gradient magnitude')
J = Gmag;
[x, y] = ginputc(N, 'Color' , 'r');
matchedpoint2 = [x y];
[tform,inlierPtsDistorted,inlierPtsOriginal] = estimateGeometricTransform(matchedpoint2,matchedpoint1,'similarity');
figure; showMatchedFeatures(J,I,inlierPtsOriginal,inlierPtsDistorted);
title('Matched inlier points');
I = imread('2.jpg'); J = imread('2.png');
I = rgb2gray(I);
outputView = imref2d(size(I));
Ir = imwarp(J,tform,'OutputView',outputView);
figure; imshow(Ir, []);
title('Recovered image');
figure,imshowpair(I,J,'diff'),title('Difference with original');
figure,imshowpair(I,Ir,'diff'),title('Difference with restored');
Step 1
I used the Sobel edge detector to extract the edges of both the depth and RGB images and then applied a threshold to get the edge map. I work primarily with the gradient magnitude only. This gives me two images like these:
Step 2
Next I use the ginput or ginputc function to mark keypoints on both images. I establish the correspondence between the points beforehand. I tried using SURF features, but they do not work well on depth images.
Step 3
Use estimateGeometricTransform to get the transformation matrix tform, and then use this matrix to recover the original position of the moved image. The next set of images tells the story.
Granted, I still believe the results can be improved further if the keypoints in both images are selected more judiciously. I also think @Specktre's method is better. I just noticed that my answer uses a different image pair from the one in the question. Both images come from the same dataset, to be found here: vap rgb-d-t dataset.
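To make Step 3 less of a black box, here is a minimal Python sketch of fitting a similarity transform (scale, rotation, translation) to matched point pairs. It is a plain least-squares fit with no outlier rejection, so it is not what estimateGeometricTransform does internally. The names `fit_similarity` and `apply_similarity` are hypothetical. Points are treated as complex numbers so that the transform is simply q = a*p + b.

```python
import cmath


def fit_similarity(src, dst):
    """Least-squares a, b such that dst ~ a * src + b (points as complex numbers)."""
    p = [complex(x, y) for x, y in src]
    q = [complex(x, y) for x, y in dst]
    n = len(p)
    p_mean = sum(p) / n
    q_mean = sum(q) / n
    num = sum((pi - p_mean).conjugate() * (qi - q_mean) for pi, qi in zip(p, q))
    den = sum(abs(pi - p_mean) ** 2 for pi in p)
    a = num / den            # abs(a) is the scale, cmath.phase(a) the rotation
    b = q_mean - a * p_mean  # translation
    return a, b


def apply_similarity(a, b, point):
    """Transform a single (x, y) point."""
    z = a * complex(*point) + b
    return z.real, z.imag


# seven hand-picked pairs would go here; these three are made-up values
a, b = fit_similarity([(0, 0), (10, 0), (0, 10)], [(5, 5), (5, 25), (-15, 5)])
print(abs(a), cmath.phase(a), b)
```

With hand-clicked keypoints every pair influences the fit, so one badly placed point shifts the result. That is one reason more careful keypoint selection helps.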

Quantifying pixels from a list of coordinates

I have a list of coordinates, generated by another program, and I have an image.
I'd like to load those coordinates onto my image (making circular regions of interest (ROIs) with a diameter of 3 pixels) and extract the intensity of those pixels.
I can overlay the coordinates on the image using:
imshow(file);
hold on
scatter(xCoords, yCoords, 'g')
But I cannot extract the intensity.
Can you point me in the right direction?
I am not sure what you mean by a circle with a 3-pixel diameter, since you are on a square grid (as mentioned by Ander Biguri). But you could use fspecial to create a disk filter and then normalize it. Something like this:
r = 1.5; % for diameter = 3
h = fspecial('disk', r);
h = h/h(ceil(r),ceil(r));
You can use it as a mask to get the intensities in the given region of the image.
im = imread(file);
% rows are indexed by y and columns by x; assumes a grayscale image
ROI = im(yCoord-1:yCoord+1, xCoord-1:xCoord+1);
I = double(ROI).*h;
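The same idea without the toolbox call, as a rough Python sketch: average every pixel whose center lies within the given radius of a coordinate. It uses a hard-edged binary disk rather than the fractional weights that fspecial('disk') produces, and `disk_roi_mean` is a hypothetical name. The image is assumed to be a list of rows, indexed as image[y][x].

```python
def disk_roi_mean(image, cx, cy, radius=1.5):
    """Mean intensity of pixels whose centers lie within `radius` of (cx, cy)."""
    height, width = len(image), len(image[0])
    reach = int(radius)  # no pixel center farther than this can be inside the disk
    total, count = 0, 0
    for y in range(cy - reach, cy + reach + 1):
        for x in range(cx - reach, cx + reach + 1):
            inside_image = 0 <= x < width and 0 <= y < height
            inside_disk = (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
            if inside_image and inside_disk:
                total += image[y][x]
                count += 1
    return total / count if count else None


# 5 x 5 test image whose pixel value is row * 5 + column
image = [[row * 5 + col for col in range(5)] for row in range(5)]
print(disk_roi_mean(image, 2, 2))
```

Note that with a radius of 1.5 the diagonal neighbors are inside the disk, so the "circle" is the full 3x3 block. That is the square-grid issue mentioned above.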

Extract black objects from color background

It is easy for human eyes to tell black from other colors. But what about computers?
I printed some color blocks on normal A4 paper. Since three inks compose a color image (cyan, magenta, and yellow), I set the color of the blocks to C=20%, C=30%, C=40%, and C=50%, with the other two colors at 0. That is the first column of my source image. So far, no black (K of CMYK) ink should be printed. After that, I set the color of each dot to K=100%, with the other colors at 0, to print black dots.
You may feel my image is weird and awful. In fact, the image is magnified 30 times, so you can clearly see how the ink cheats our eyes. The color strips hamper my recognition of these black dots (each dot is printed as just one pixel at 800 dpi). Without the color background, I used to blur the image and run a Canny edge detector to extract the edges. However, with the color background, simply converting to grayscale and running an edge detector does not give good results because of the strips. How do my eyes solve such problems?
I decided to check the brightness of the source image. I referred to this article and formula:
brightness = sqrt( 0.299 * R * R + 0.587 * G * G + 0.114 * B * B )
This brightness is closer to human perception, and it works very well on the yellow background because the brightness of yellow is the highest compared with cyan and magenta. But how can I make the cyan and magenta strips as bright as possible? The expected result is that all the strips disappear.
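A quick way to see why yellow behaves best under this formula is to evaluate it for idealized pure colors. This is only a Python sketch; real printed inks at 40% coverage will not be pure RGB primaries, and `perceived_brightness` is a hypothetical name.

```python
from math import sqrt


def perceived_brightness(r, g, b):
    """Weighted RMS brightness from the formula above; channels in 0..255."""
    return sqrt(0.299 * r * r + 0.587 * g * g + 0.114 * b * b)


samples = [("yellow", (255, 255, 0)), ("cyan", (0, 255, 255)), ("magenta", (255, 0, 255))]
for name, rgb in samples:
    print(name, round(perceived_brightness(*rgb), 1))
```

Yellow only loses the blue term, which has the smallest weight. Magenta loses the green term, which has the largest, so it comes out darkest of the three.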
More complicated image:
C=40%, M=40%
C=40%, Y=40%
Y=40%, M=40%
FFT result of C=40%, Y=40% brightness image
Can anyone give me some hints for removing the color strips?
@natan I tried the FFT method you suggested, but I was not lucky enough to get peaks on both the x and y axes. In order to plot the frequency as you did, I resized my image to a square.
I would convert the image to the HSV colour space and then use the Value channel. This basically separates colour and brightness information.
This is the 50% cyan image
Then you can just do a simple threshold to isolate the dots.
I just did this very quickly and I'm sure you could get better results. Maybe find contours in the image and then remove any contours with a small area to filter out any remaining noise.
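A minimal Python sketch of the Value-channel threshold, using the standard library's colorsys module. The threshold of 0.35 and the name `value_mask` are assumptions for illustration, not values tuned on the images from the question.

```python
import colorsys


def value_mask(pixels, threshold=0.35):
    """pixels: rows of (r, g, b) in 0..255. True where the HSV Value is below threshold."""
    mask = []
    for row in pixels:
        mask_row = []
        for r, g, b in row:
            _, _, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            mask_row.append(v < threshold)
        mask.append(mask_row)
    return mask


pixels = [[(0, 255, 255), (30, 30, 30)],
          [(255, 255, 0), (10, 5, 0)]]
print(value_mask(pixels))
```

Since Value is the maximum of the three channels, a saturated cyan, magenta, or yellow pixel keeps a high Value, and only pixels that are dark in every channel fall below the threshold.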
After inspecting the images, I decided that a robust threshold would be simpler than anything else. For example, looking at the C=40%, M=40% photo, I first inverted the intensities so that black (the signal) becomes white, just using
im=(abs(255-im));
We can inspect its RGB histograms using this:
hist(reshape(single(im),[],3),min(single(im(:))):max(single(im(:))));
colormap([1 0 0; 0 1 0; 0 0 1]);
So we see that there is a large contribution at some middle intensity, whereas the "signal", which is now white, is mostly separated at higher values. I then applied a simple threshold as follows:
thr = @(d) (max([min(max(d,[],1)) min(max(d,[],2))])) ;
for n=1:size(im,3)
imt(:,:,n)=im(:,:,n).*uint8(im(:,:,n)>1.1*thr(im(:,:,n)));
end
imt=rgb2gray(imt);
and got rid of objects smaller than some typical area size
min_dot_area=20;
bw=bwareaopen(imt>0,min_dot_area);
imagesc(bw);
colormap(flipud(bone));
Here's the result together with the original image:
This threshold comes from code I wrote that assumed sparse signals in the form of 2-D peaks or blobs on a noisy background. By sparse I mean that there is no pile-up of peaks. In that case, when projecting max(image) on the x or y axis (by max(im,[],1) or max(im,[],2)) you get a good measure of the background. That is because you take the minimal intensity of the max(im) vector.
If you want to look at this differently, you can look at the histogram of the image intensities. The background is supposed to be a normal distribution of some kind around some intensity; the signal should be higher than that intensity, but with a much lower number of occurrences. By finding max(im) along one of the axes (x or y) you discover the maximal noise level.
You'll see that the threshold picks the point in the histogram where there is still some noise above it, but ALL the signal is above it too. That's why I adjusted it to be 1.1*thr. Lastly, there are many fancier ways to obtain a robust threshold; this is a quick and dirty way that in my view is good enough...
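For readers without MATLAB, the thr function can be written out step by step. This is a hedged Python sketch of one channel, stored as a list of rows. The names `background_threshold` and `keep_signal` are made up, and the factor of 1.1 is the one from the text.

```python
def background_threshold(channel):
    """Larger of: smallest row maximum, smallest column maximum."""
    row_maxima = [max(row) for row in channel]
    col_maxima = [max(col) for col in zip(*channel)]
    return max(min(row_maxima), min(col_maxima))


def keep_signal(channel, factor=1.1):
    """Zero out everything at or below factor * background threshold."""
    limit = factor * background_threshold(channel)
    return [[value if value > limit else 0 for value in row] for row in channel]


channel = [[10, 12, 11, 10],
           [11, 200, 12, 10],
           [10, 11, 13, 12],
           [12, 10, 11, 210]]
print(background_threshold(channel))
print(keep_signal(channel))
```

The sparsity assumption matters here: if some row and some column contain no peak at all, their maxima are pure background, and the smallest of those maxima estimates the top of the noise.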
Thanks to everyone for posting an answer! After some searching and experimenting, I also came up with an adaptive method to extract these black dots from the color background. It seems that considering only the brightness cannot solve the problem perfectly. Therefore natan's method, which calculates and analyzes the RGB histogram, is more robust. Unfortunately, I still cannot obtain a robust threshold to extract the black dots in other color samples, because things get more and more unpredictable when we add deeper colors (e.g. Cyan > 60) or mix two colors together (e.g. Cyan = 50, Magenta = 50).
One day, I googled "extract color", and TinEye's color extraction and color thief inspired me. Both of them are very cool applications, and the image processed by the former website is exactly what I want. So I decided to implement something similar on my own. The algorithm I used here is k-means clustering. Some other related keywords to search for are color palette, color quantization, and getting the dominant color.
First I apply a Gaussian filter to smooth the image.
GaussianBlur(img, img, Size(5, 5), 0, 0);
OpenCV has a kmeans function, which saves me a lot of coding time. I modified this code.
// Input data should be float32
Mat samples(img.rows * img.cols, 3, CV_32F);
for (int i = 0; i < img.rows; i++) {
for (int j = 0; j < img.cols; j++) {
for (int z = 0; z < 3; z++) {
samples.at<float>(i + j * img.rows, z) = img.at<Vec3b>(i, j)[z];
}
}
}
// Select the number of clusters
int clusterCount = 4;
Mat labels;
int attempts = 1;
Mat centers;
kmeans(samples, clusterCount, labels, TermCriteria(CV_TERMCRIT_ITER|CV_TERMCRIT_EPS, 10, 0.1), attempts, KMEANS_PP_CENTERS, centers);
// Draw clustered result
Mat cluster(img.size(), img.type());
for (int i = 0; i < img.rows; i++) {
for(int j = 0; j < img.cols; j++) {
int cluster_idx = labels.at<int>(i + j * img.rows, 0);
cluster.at<Vec3b>(i, j)[0] = centers.at<float>(cluster_idx, 0);
cluster.at<Vec3b>(i, j)[1] = centers.at<float>(cluster_idx, 1);
cluster.at<Vec3b>(i, j)[2] = centers.at<float>(cluster_idx, 2);
}
}
imshow("clustered image", cluster);
// Check centers' RGB value
cout << centers;
After clustering, I convert the result to grayscale and find the darkest color which is more likely to be the color of the black dots.
// Find the minimum value
cvtColor(cluster, cluster, CV_RGB2GRAY);
Mat dot = Mat::zeros(img.size(), CV_8UC1);
cluster.copyTo(dot);
int minVal = (int)dot.at<uchar>(dot.rows / 2, dot.cols / 2); // at<>() takes (row, col)
for (int i = 0; i < dot.rows; i += 3) {
for (int j = 0; j < dot.cols; j += 3) {
if ((int)dot.at<uchar>(i, j) < minVal) {
minVal = (int)dot.at<uchar>(i, j);
}
}
}
inRange(dot, minVal - 5 , minVal + 5, dot);
imshow("dot", dot);
Let's test two images.
(clusterCount = 4)
(clusterCount = 5)
One shortcoming of k-means clustering is that one fixed clusterCount cannot be applied to every image. Also, clustering is not very fast for larger images. That issue annoys me a lot. My dirty method for better real-time performance (on iPhone) is to crop 1/16 of the image and cluster the smaller area. Then I compare all the pixels in the original image with each cluster center and pick the pixels that are nearest to the "black" color. I simply calculate the Euclidean distance between two RGB colors.
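The second half of that shortcut, assigning every pixel to its nearest cluster center, can be sketched as follows. This Python sketch assumes the centers have already been computed on the cropped area, for example by kmeans. Treating the center with the smallest channel sum as the "black" one is my own simplification, and all function names are hypothetical.

```python
def squared_distance(c1, c2):
    """Squared Euclidean distance between two RGB triples."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2))


def nearest_center(pixel, centers):
    """Index of the cluster center closest to the pixel."""
    return min(range(len(centers)), key=lambda i: squared_distance(pixel, centers[i]))


def dark_mask(pixels, centers):
    """True where a pixel's nearest center is the darkest center."""
    darkest = min(range(len(centers)), key=lambda i: sum(centers[i]))
    return [[nearest_center(p, centers) == darkest for p in row] for row in pixels]


centers = [(200, 230, 240), (120, 200, 220), (35, 40, 45)]  # made-up cluster centers
pixels = [[(210, 225, 235), (40, 38, 50)],
          [(110, 190, 230), (20, 20, 20)]]
print(dark_mask(pixels, centers))
```

Comparing squared distances avoids the square root and gives the same nearest center.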
A simple method is to just threshold all the pixels. Here is this idea expressed in pseudo code.
for each pixel in image
if brightness < THRESHOLD
pixel = BLACK
else
pixel = WHITE
Or if you're always dealing with cyan, magenta and yellow backgrounds then maybe you might get better results with the criteria
if pixel.r < THRESHOLD and pixel.g < THRESHOLD and pixel.b < THRESHOLD
This method will only give good results for easy images where nothing except the black dots is too dark.
You can experiment with the value of THRESHOLD to find a good value for your images.
I suggest converting to a chroma-based color space, like LCH, and adjusting simultaneous thresholds on lightness and chroma. Here is the result mask for L < 50 & C < 25 for the input image:
It seems you need adaptive thresholds, since different values work best for different areas of the image.
You may also use HSV or HSL as the color space, but they are less perceptually uniform than LCH, which is derived from Lab.
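For reference, the L < 50 & C < 25 test can be sketched in plain Python. The conversion below assumes 8-bit sRGB input and a D65 white point, using the usual rounded sRGB-to-XYZ matrix. `srgb_to_lch` and `is_black_dot` are hypothetical names. The answer does not say which tool produced its mask, so this is only one way to get comparable numbers.

```python
from math import atan2, degrees, hypot


def srgb_to_lch(r, g, b):
    """Convert 8-bit sRGB to CIE LCh(ab), assuming a D65 white point."""
    def to_linear(c):
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = to_linear(r), to_linear(g), to_linear(b)
    # linear sRGB -> XYZ
    x = 0.4124 * rl + 0.3576 * gl + 0.1805 * bl
    y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z = 0.0193 * rl + 0.1192 * gl + 0.9505 * bl

    def f(t):
        if t > 216.0 / 24389.0:
            return t ** (1.0 / 3.0)
        return (24389.0 / 27.0 * t + 16.0) / 116.0

    fx, fy, fz = f(x / 0.9505), f(y / 1.0), f(z / 1.089)
    lightness = 116.0 * fy - 16.0
    a_star = 500.0 * (fx - fy)
    b_star = 200.0 * (fy - fz)
    return lightness, hypot(a_star, b_star), degrees(atan2(b_star, a_star)) % 360.0


def is_black_dot(rgb, max_lightness=50.0, max_chroma=25.0):
    """Apply the simultaneous lightness and chroma thresholds from the answer."""
    lightness, chroma, _ = srgb_to_lch(*rgb)
    return lightness < max_lightness and chroma < max_chroma


print(is_black_dot((30, 30, 30)), is_black_dot((0, 255, 255)), is_black_dot((0, 0, 128)))
```

The chroma test is what rejects pixels that are dark but strongly colored, which a lightness-only threshold would let through.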

Using rectangle in Matlab. Using Sum()

I performed rgb2gray on an image and ran Sobel edge detection on it.
Then I did
faceEdges = faceNoNoise(:,:) > 50; %binary threshold
so it turns the outline of the image (a picture of a face) into black and white. A value of 1 is a white pixel, and 0 is a black pixel. Someone said I could use this,
mouthsquare = rectangle('position',[recX-mouthBoxBuffer, recY-mouthBoxBuffer, recXDiff*2+mouthBoxBuffer/2, recYDiff*2+mouthBoxBuffer/2],... % see the change in coordinates
'edgecolor','r');
numWhite = sum(sum(mouthsquare));
He said to use two sum()'s because they sum over the columns and rows of the pixels contained in the rectangle. numWhite always returns 178 and some decimals.
If you have a 2D matrix M (for example, an image), the way to count how many elements have the value 1 is:
count_1 = sum(M(:)==1)
or
count_1 = sum(reshape(M,1,[])==1)
If the target values are not exactly 1, but within a tolerance of, let's say, +/- 0.02, then one should use:
count_1_pm02 = sum((M(:)>=0.98) & (M(:)<=1.02))
or the equivalent using reshape.
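As far as I can tell, the constant 178-and-some-decimals value comes from summing the graphics handle that rectangle() returns, not any pixels. To count white pixels inside the box, the box has to be used as an index range into the binary image. A minimal Python sketch of that idea, with 0-based indices and a hypothetical `count_white_in_rect` helper that takes the same [x, y, width, height] layout as the 'position' vector:

```python
def count_white_in_rect(mask, x, y, width, height):
    """Count nonzero pixels of a binary mask inside the rectangle [x, y, width, height]."""
    # clamp to the image so negative starts or ends do not wrap around
    rows = mask[max(y, 0):max(y + height, 0)]
    return sum(1 for row in rows
               for value in row[max(x, 0):max(x + width, 0)] if value)


mask = [[0, 1, 1, 0, 0],
        [0, 1, 0, 0, 1],
        [1, 1, 1, 0, 0],
        [0, 0, 0, 1, 1]]
print(count_white_in_rect(mask, 1, 0, 2, 3))
```

In MATLAB terms this corresponds to summing a sub-block of faceEdges selected by row and column ranges, rather than summing the return value of rectangle().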