I have a set of data points (around 20000) with x,y values, and I want to remove the points that are not very close to other points. My approach is to 'digitize' the data, and I think the closest way to implement this in Matlab is a 3D histogram, so that I can remove the points in the low-count bins. I used hist3(), but the problem is that I couldn't get the indices of the points labeled with counts (like the output 'ind' from histc()). The only way I can think of is a nested for loop, which is the last thing I want to try. Is there any way to label the points with their bin index, or any other approach to do this?
Thanks
I feel like I need to clarify.
I have the histogram graph from the data generated by @rayryeng.
Some bins have N=0 or N=1, so I want to remove the data in these bins.
For histc() there is an output form [bincounts,ind] = histc( ), where ind returns the bin number each data point falls into. So I can find the bins whose count is less than or equal to 1, then find the data in those particular bins. Is there anything similar I can do for 2D input?
Thanks Again
hist3 should be able to accomplish this for you. I'm not quite sure where the problem is. You can call hist3 like so:
[N,C] = hist3(X);
This will automatically partition your dataset into a 10 x 10 grid of equally spaced containers. You can override this behaviour by doing:
[N,C] = hist3(X, NBINS);
NBINS is a 2 element array where the first element is the number of bins along the first column of X (the rows of N) and the second element is the number of bins along the second column of X (the columns of N).
N will tell you how many elements fall within each location of the grid and C will give you a 1 x 2 cell array where the first element of the cell array gives you the X co-ordinates of each centre of the bin while the second element of the cell array gives you the Y co-ordinates of each centre of the bin.
To be explicit, if we have a 10 x 10 grid, C will contain a two element cell array where each element is 10 elements long. For each X co-ordinate of a centre found in C{1}, we will have 10 corresponding Y co-ordinates in C{2} that relate to a bin's centre. This means that the first 10 bin centres are located at (C{1}(1), C{2}(1)), (C{1}(1), C{2}(2)), (C{1}(1), C{2}(3)), ..., (C{1}(1), C{2}(10)), then the next 10 bin centres are located at (C{1}(2), C{2}(1)), (C{1}(2), C{2}(2)), (C{1}(2), C{2}(3)), ..., (C{1}(2), C{2}(10)).
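The pairing between counts and centres can be sketched as follows. This is only an illustration: the data X and the seed are hypothetical, and it assumes hist3 (Statistics Toolbox) is available. It relies on N(i,j) being the count of the bin centred at (C{1}(i), C{2}(j)).

```matlab
rng(0);                          % arbitrary seed, only for repeatability
X = rand(200,2);                 % hypothetical data, not the asker's
[N,C] = hist3(X);                % default 10 x 10 grid

% ndgrid lets its first output vary along rows, the same way N does
[cx, cy] = ndgrid(C{1}, C{2});   % cx(i,j) = C{1}(i), cy(i,j) = C{2}(j)

% One row per bin: x centre, y centre, count
binTable = [cx(:) cy(:) N(:)];
```

Each row of binTable then describes one bin, which makes it easy to list the centres of all bins whose count falls below some threshold.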
As a quick example, let's do this on a grid between [0,1] on the x-axis and [0,1] on the y-axis. I'm going to generate 100 2D points. Let's also decompose the image into 10 bins horizontally and 10 bins vertically (as per the default of hist3).
rng(100); %// Set seed for reproducibility
A = rand(100,2);
[N,C] = hist3(A);
disp(N);
celldisp(C);
We thus get:
N =
1 2 0 1 2 0 1 0 1 1
0 1 1 1 1 1 0 0 2 5
0 4 1 1 1 1 1 4 0 1
2 0 3 2 2 1 1 0 2 1
0 0 0 0 1 1 1 0 0 1
1 1 1 2 1 1 0 2 0 1
1 0 2 1 2 0 3 1 1 1
0 1 0 0 0 1 1 0 0 1
1 0 1 2 3 3 0 0 0 2
0 2 1 1 0 1 0 3 0 1
C{1} =
Columns 1 through 7
0.0541 0.1528 0.2516 0.3503 0.4491 0.5478 0.6466
Columns 8 through 10
0.7453 0.8440 0.9428
C{2} =
Columns 1 through 7
0.0513 0.1510 0.2508 0.3505 0.4503 0.5500 0.6498
Columns 8 through 10
0.7495 0.8493 0.9491
This tells us that the first grid cell, located at the top left corner of the grid, has only 1 point logged into it. The next cell after that has 2 points logged in it, and so on. We also have the bin centres for each of the bins shown in C. Remember, we have 10 x 10 possible bin centres. If we want to display our data with the bin locations, this is what we can do:
[X,Y] = meshgrid(C{1},C{2});
plot(A(:,1), A(:,2), 'b*', X(:), Y(:), 'r*');
grid;
We thus get:
The red stars denote the bin centres while the blue stars denote our data points within the grid. Because our origin is on the bottom left corner of our plot, but the origin of the N matrix is at the top left corner (i.e. the first bin that is decomposed is at the top left while in our data it's at the bottom left corner), we need to rotate N by 90 degrees counter-clockwise so that the origins of each of the matrices agree with each other, and also agree with the plot. As such:
Nrot = rot90(N);
disp(Nrot);
Nrot =
1 5 1 1 1 1 1 1 2 1
1 2 0 2 0 0 1 0 0 0
0 0 4 0 0 2 1 0 0 3
1 0 1 1 1 0 3 1 0 0
0 1 1 1 1 1 0 1 3 1
2 1 1 2 1 1 2 0 3 0
1 1 1 2 0 2 1 0 2 1
0 1 1 3 0 1 2 0 1 1
2 1 4 0 0 1 0 1 0 2
1 0 0 2 0 1 1 0 1 0
As you can see from the picture, this agrees with what we see within the (rotated) N matrix as well as the bin centres C. Using N (or Nrot if you get the convention correct), you can now figure out which points to eliminate from your array of points. For any bin that has low membership within N, you would find the points that are closest to the bin centre associated with that grid location in N and remove them.
As an example, supposing that the bin in the first row, second column (of the rotated result) is the one you want to filter out. This corresponds to (C{1}(2), C{2}(10)). We also know that we need to filter out 5 points as they belong to this bin centre. Therefore:
numPointsToRemove = N(2,10); %//or Nrot(1,2);
%// Computes Euclidean distance between this bin centre with every point
dists = sqrt(sum(bsxfun(@minus, A, [C{1}(2) C{2}(10)]).^2, 2));
%// Find the numPointsToRemove closest points to the bin centre and remove
[~,ind] = sort(dists);
A(ind(1:numPointsToRemove),:) = []; %// index rows of A, not columns of ind
We sort our distances in ascending order, then determine the numPointsToRemove closest points to this bin centre. We thus remove them from our data matrix.
If you want to remove those bins that have either a 0 or a 1 for the count, we can find those locations, then run a for loop and filter accordingly. However, any bins that have 0 means that we don't even need to run through and filter anything, because no points were mapped to there! You really need to filter out those values that have just 1 in the bins. In other words:
[rows, cols] = find(N == 1);
for index = 1 : numel(rows)
    row = rows(index);
    col = cols(index);
    %// Computes Euclidean distance between this bin centre with every point
    dists = sqrt(sum(bsxfun(@minus, A, [C{1}(row) C{2}(col)]).^2, 2));
    %// Finds the closest point to the bin centre and removes it
    [~,ind] = min(dists);
    A(ind,:) = [];
end
As you can see, this is essentially the same procedure as above. As we wish to filter out those bins that only have 1 point assigned to them, we just need to find the minimum distance. Remember, we don't need to process any bins that have a count of 0, so we can skip those.
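The question asks for a per-point bin index, like the second output of histc(). A minimal sketch of that idea, assuming MATLAB R2015b or later where histcounts2 is available. The threshold minCount is a hypothetical choice, and histcounts2 may pick different bin edges than hist3, so the counts need not match the N shown above.

```matlab
rng(100);                        % same seed as the example above
A = rand(100,2);

% histcounts2 also returns the bin index of every point in each dimension
[N,~,~,binX,binY] = histcounts2(A(:,1), A(:,2), [10 10]);

% Look up, for each point, the count of the bin it falls into
countPerPoint = N(sub2ind(size(N), binX, binY));

minCount = 2;                    % hypothetical threshold: drop bins with fewer points
Afiltered = A(countPerPoint >= minCount, :);
```

Because every point carries its own bin index, no distance computation or loop over bins is needed, and a point is never removed on behalf of a neighbouring bin.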
Related
I have been having problems identifying the positions of the two maximum values in a 3D matrix (MATLAB). Say I have matrix A output as follows:
A(:,:,1) =
5 3 5
0 1 0
A(:,:,2) =
0 2 0
8 0 8
A(:,:,3) =
3 0 0
0 7 7
A(:,:,4) =
6 6 0
4 0 0
For the first slice, A(:,:,1), I want to identify that the first row has the highest value (A=5). But I need the two index positions, which in this case are 1 and 3. The same goes for the other slices of A.
I have searched through SO, but since I am bad at MATLAB, I couldn't find a way to work this out.
Please help me with this. It would be better if I don't need to use a for loop to get the desired output.
Shot #1 Finding the indices for maximum values across each 3D slice -
%// Reshape A into a 2D matrix
A_2d = reshape(A,[],size(A,3))
%// Find linear indices of maximum numbers for each 3D slice
idx = find(reshape(bsxfun(@eq,A_2d,max(A_2d,[],1)),size(A)))
%// Convert those linear indices to dim1, dim2,dim3 indices and
%// present the final output as a Nx3 array
[dim1_idx,dim2_idx,dim3_idx] = ind2sub(size(A),idx)
out_idx_triplet = [dim1_idx dim2_idx dim3_idx]
Sample run -
>> A
A(:,:,1) =
5 3 5
0 1 0
A(:,:,2) =
0 2 0
8 0 8
A(:,:,3) =
3 0 0
0 7 7
A(:,:,4) =
6 6 0
4 0 0
out_idx_triplet =
1 1 1
1 3 1
2 1 2
2 3 2
2 2 3
2 3 3
1 1 4
1 2 4
out_idx_triplet(:,2) is what you are looking for!
Shot #2 Finding the indices for highest two numbers across each 3D slice -
%// Get size of A
[m,n,r] = size(A)
%// Reshape A into a 2D matrix
A_2d = reshape(A,[],r)
%// Find linear indices of highest two numbers for each 3D slice
[~,sorted_idx] = sort(A_2d,1,'descend')
idx = bsxfun(@plus,sorted_idx(1:2,:),[0:r-1]*m*n)
%// Convert those linear indices to dim1, dim2,dim3 indices
[dim1_idx,dim2_idx,dim3_idx] = ind2sub(size(A),idx(:))
%// Present the final output as a Nx3 array
out_idx_triplet = [dim1_idx dim2_idx dim3_idx]
out_idx_triplet(:,2) is what you are looking for!
The following code gives you the column and row of the respective maximum.
The first step will obtain the maximum of each sub-matrix containing the first and second dimension. Since max works per default with the first dimension, the matrix is reshaped to combine the original first and second dimension.
max_vals = max(reshape(A,size(A,1)*size(A,2),size(A,3)));
max_vals =
5 8 7 6
In the second step, the index of elements equal to the respective max_vals of each sub-matrix is obtained using arrayfun over the third dimension. Since the output of arrayfun are cells, cell2mat is used to transform the output into a matrix. As a last step, the linear index from find is transformed into sub-indices by ind2sub.
[i,j] = ind2sub(size(A(:,:,1)),cell2mat(arrayfun(@(i)find(A(:,:,i)==max_vals(i)),1:size(A,3),'UniformOutput',false)))
i =
1 2 2 1
1 2 2 1
j =
1 1 2 1
3 3 3 2
Hence, the values in j are the ones you want to have.
Suppose I have a matrix, where each cell of this matrix describes a location (e.g. a bin of a histogram) in a two dimensional space. Let's say some of these cells contain a '1' and some a '2', indicating where object number 1 and 2 are located, respectively.
I now want to find those cells that describe the "touching points" between the two objects. How do I do that efficiently?
Here is a naive solution:
[xr, xc] = find(M == 1); X = [xr xc]; % locations of object number 1 (M is the matrix described above)
[yr, yc] = find(M == 2); Y = [yr yc]; % locations of object number 2
distances = pdist2(X,Y,'cityblock');
Locations (x,y) and (u,v) touch iff the respective entry in distances is 1. I believe that should work; however, it does not seem very clever or efficient.
Does anyone have a better solution? :)
Thank you!
Use morphological operations.
Let M be your matrix with zeros (no object) ones and twos indicating the locations of different objects.
M1 = M == 1; % create a logical mask of the first object
M2 = M == 2; % logical mask of second object
dM1 = imdilate( M1, [0 1 0; 1 1 1; 0 1 0] ); % "expand" the mask to the neighboring pixels
[touchesY touchesX] =...
find( dM1 & M2 ); % locations where the expansion of first object overlap with second one
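For readers without the Image Processing Toolbox, the same dilation idea can be sketched with conv2 from base MATLAB. This is an illustrative variant, not the answer's code: the small matrix M is made-up example data, and the sketch marks touching cells of both objects rather than only those of the second one.

```matlab
M = [0 0 2 0 0
     2 2 2 1 1
     2 2 1 1 0
     0 1 1 1 1];                 % example label matrix (assumed data)

M1 = (M == 1);                   % logical mask of the first object
M2 = (M == 2);                   % logical mask of the second object
K  = [0 1 0; 1 1 1; 0 1 0];      % 4-connected neighbourhood

% Convolving a mask with K and thresholding acts like a binary dilation
dM1 = conv2(double(M1), K, 'same') > 0;
dM2 = conv2(double(M2), K, 'same') > 0;

% Cells of either object that have a 4-neighbour in the other object
touch = (dM1 & M2) | (dM2 & M1);
[touchRow, touchCol] = find(touch);
```

Replacing K with ones(3) would also count diagonal neighbours as touching.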
Code
%%// Label matrix
L = [
0 0 2 0 0;
2 2 2 1 1;
2 2 1 1 0
0 1 1 1 1]
[X_row,X_col] = find(L==1);
[Y_row,Y_col] = find(L==2);
X = [X_row X_col];
Y = [Y_row Y_col];
%%// Your code works up to this point to get X and Y
%%// Perform subtractions that are used later on to detect
%%// where Y has any index that touches X
%%// Subtract all Y from all X. This can be done by moving one of them,
%%// in this case Y, into the third dimension and then subtracting
%%// from all X using bsxfun. The output is used to index into Y.
Y_touch = abs(bsxfun(@minus,X,permute(Y,[3 2 1])));
%%// Perform similar subtractions, but this time subtract all X from Y
%%// by putting X into the third dimension. The idea this time is to index
%%// into X.
X_touch = abs(bsxfun(@minus,Y,permute(X,[3 2 1]))); %%// for X too
%%// Find all touching indices for X, which show up as [1 1] in X_touch
%%// (row and column offsets both equal to 1, i.e. diagonal neighbours).
%%// These are detected with the `all` command along the second dimension.
%%// The output is "squeezed" into a 2D matrix using the `squeeze` command,
%%// and the touching indices are the columns that contain any ones.
ind_X = any(squeeze(all(X_touch==1,2)),1)
%%// Similarly for Y
ind_Y = any(squeeze(all(Y_touch==1,2)),1)
%%// Get the touching locations for X and Y
touching_loc = [X(ind_X,:) ; Y(ind_Y,:)]
%%// To verify, let us make the touching indices 10
L(sub2ind(size(L),touching_loc(:,1),touching_loc(:,2)))=10
Output
L =
0 0 2 0 0
2 2 2 1 1
2 2 1 1 0
0 1 1 1 1
L =
0 0 10 0 0
2 10 10 10 1
10 10 10 10 0
0 10 10 1 1
In Matlab I have a matrix where, in a previous stage of my code, a specific element was chosen. Starting from this point of the matrix I would like to find a maximum: not the maximum value among all its surrounding neighbours for a given radius, but the maximum value at a given angle of orientation. Let me explain this with an example:
This is matrix A:
A =
0 1 1 1 0 0 9 1 0
0 2 2 4 3 2 8 1 0
0 2 2 3 3 2 2 1 0
0 1 1 3 2 2 2 1 0
0 8 2 3 3 2 7 2 1
0 1 1 2 3 2 3 2 1
The element chosen in the first stage is the 4 in A(2,4), and the next element should be the maximum value with, for example, a 315 degrees angle of orientation, that is the 7 in A(5,7).
What I've done is, depending on the angle, subdivide matrix A in different quadrants and make a new matrix (an A's submatrix) with only the values of that quadrant.
So, for this example, the submatrix will be A's 4th quadrant:
q_A =
4 3 2 8 1 0
3 3 2 2 1 0
3 2 2 2 1 0
3 3 2 7 2 1
2 3 2 3 2 1
And now, here is my question, how can I extract the 7?
The only thing I've been able to do (and it works) is to find all the values over a threshold value and then calculate how those points are orientated. Then, saving all the values that have a similar orientation to the given one (315 degrees in this example) and finally finding the maximum among them. It works but I guess there could be a much faster and "cleaner" solution.
This is my theory, but I don't have the image processing toolbox to test it. Maybe someone who does can comment?
% make (r,c) the center by padding with zeros on the shorter side
[nr, nc] = size(A);
dr = (r-1) - (nr-r); % rows above (r,c) minus rows below it
dc = (c-1) - (nc-c); % columns left of (r,c) minus columns right of it
if dr > 0
    At = padarray(A, [dr 0], 0, 'post');
else
    At = padarray(A, [-dr 0], 0, 'pre');
end
if dc > 0
    At = padarray(At, [0 dc], 0, 'post');
else
    At = padarray(At, [0 -dc], 0, 'pre');
end
% rotate by your angle (maybe this should be -angle or else 360-angle, I'm not sure)
Ar = imrotate(At, angle, 'nearest', 'loose'); % though I think nearest and loose are defaults
% find the max along the middle row, from the center to the right edge
mid_r = ceil(size(Ar,1)/2);
mid_c = ceil(size(Ar,2)/2);
max(Ar(mid_r, mid_c:end))
Also depending on your array requirements, padding with -inf might be better than 0
The following is a relatively inexpensive solution to the problem, although I found wrapping my head around the matrix coordinate system a real pain, and there is probably room to tidy it up somewhat. It simply traces all matrix entries along a line around the starting point at the supplied angle (all coordinates and angles are of course based on matrix index units):
A = [ 0 1 1 1 0 0 9 1 0
0 2 2 4 3 2 8 1 0
0 2 2 3 3 2 2 1 0
0 1 1 3 2 2 2 1 0
0 8 2 3 3 2 7 2 1
0 1 1 2 3 2 3 2 1 ];
alph = 315;
r = 2;
c = 4;
% generate a line through point (r,c) with angle alph
[nr nc] = size(A);
x = [1:0.1:nc]; % overkill
m = -tand(alph); % alph is in degrees; negated because row indices grow downwards
b = r-m*c;
y = m*x + b;
crd = unique(round([y(:) x(:)]),'rows');
iok = find((crd(:,1)>0) & (crd(:,1)<=nr) & (crd(:,2)>0) & (crd(:,2)<=nc));
crd = crd(iok,:);
indx=sub2ind([nr,nc],crd(:,1),crd(:,2));
% find max and position of max
[val iv]=max(A(indx)) % <-- val is the value of the max
crd(iv,:) % <-- matrix coordinates (row, column) of max value
Result:
val =
7
iv =
5
ans =
5 7
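The line above runs through (r,c) in both directions, so it also looks at cells on the opposite side (135 degrees in this example). A minimal sketch of a one-sided variant that only walks outwards along the requested angle; the step size 0.25 and the variable names are arbitrary choices for illustration:

```matlab
A = [ 0 1 1 1 0 0 9 1 0
      0 2 2 4 3 2 8 1 0
      0 2 2 3 3 2 2 1 0
      0 1 1 3 2 2 2 1 0
      0 8 2 3 3 2 7 2 1
      0 1 1 2 3 2 3 2 1 ];
alph = 315;  r = 2;  c = 4;      % angle in degrees, start at A(r,c)

% Step outwards from (r,c); rows grow downwards, hence the minus sign
t    = (0:0.25:sum(size(A)))';   % step lengths along the ray
rows = round(r - t*sind(alph));
cols = round(c + t*cosd(alph));

% Keep in-range cells, drop duplicates and the starting cell itself
ok  = rows >= 1 & rows <= size(A,1) & cols >= 1 & cols <= size(A,2);
crd = unique([rows(ok) cols(ok)], 'rows', 'stable');
crd = crd(2:end,:);

[val, iv] = max(A(sub2ind(size(A), crd(:,1), crd(:,2))));
pos = crd(iv,:);                 % (row, column) of the maximum on the ray
```

For the example matrix this visits (3,5), (4,6), (5,7) and (6,8), so val is 7 and pos is [5 7].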
Consider a matrix like
A = 0 1 0 1
1 1 0 0
0 0 0 0
1 1 1 1
I would like to calculate the average size of each cluster of 1's. I define a cluster as occurring when two or more 1's are near each other, i.e. next to or above/below. Eg, in this matrix there is a cluster of size 3 in the top left hand corner and a cluster of size 4 in the bottom row.
I need a way to extract this information in a non-visual way because I need to do this many times for different A.
You may want to use bwlabel which isolates the connected components (clusters of 1) in your binary matrix.
A = [0 1 0 1
1 1 0 0
0 0 0 0
1 1 1 1 ];
[L,n] = bwlabel(A,8) % # for a 8-pixel stencil
% # (i.e. hor/vert/diag first neighbors)
or
[L,n] = bwlabel(A,4) % # for 4-pixel stencil
% # (just horizontal & vertical neighbors)
L = 0 1 0 3
1 1 0 0
0 0 0 0
2 2 2 2
Doing so, you obtain a matrix L which labels the n different connected components.
Then you may want to extract some statistics; for instance you may want to histogram the size of the clusters.
cluster_size = hist(L(:),0:n);
cluster_size = cluster_size(2:end); % # histogram of component vs. size
% # (without zeros)
hist(cluster_size) % # histogram of sizes
which tells you that you have one cluster of 1 element, one cluster of 3 and one cluster of 4.
Finally, if you are looking for the average size of the clusters, you can do
mean(cluster_size)
2.6667
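The question defines a cluster as two or more adjacent 1's, so the isolated 1 arguably should not count towards the average. A small sketch of that variant, using accumarray instead of hist to count the labels; it assumes 4-connectivity and that bwlabel (Image Processing Toolbox) is available:

```matlab
A = [0 1 0 1
     1 1 0 0
     0 0 0 0
     1 1 1 1];

[L,n] = bwlabel(A,4);                    % label 4-connected components

% Count how many pixels carry each label 1..n
cluster_size = accumarray(L(L>0), 1);    % n-by-1 vector of sizes

% Keep only clusters made of two or more 1's, as the question defines them
multi    = cluster_size(cluster_size >= 2);
avg_size = mean(multi);
```

For this matrix cluster_size is [3; 4; 1], so avg_size is 3.5 once the single-element component is excluded.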
I have an image in MATLAB:
y = rgb2gray(imread('some_image_file.jpg'));
and I want to do some processing on it:
pic = some_processing(y);
and find the local maxima of the output. That is, all the points in y that are greater than all of their neighbors.
I can't seem to find a MATLAB function to do that nicely. The best I can come up with is:
[dim_y,dim_x]=size(pic);
enlarged_pic=[zeros(1,dim_x+2);
zeros(dim_y,1),pic,zeros(dim_y,1);
zeros(1,dim_x+2)];
% now build a 3D array
% each plane will be the enlarged picture
% moved up,down,left or right,
% to all the diagonals, or not at all
[en_dim_y,en_dim_x]=size(enlarged_pic);
three_d(:,:,1)=enlarged_pic;
three_d(:,:,2)=[enlarged_pic(2:end,:);zeros(1,en_dim_x)];
three_d(:,:,3)=[zeros(1,en_dim_x);enlarged_pic(1:end-1,:)];
three_d(:,:,4)=[zeros(en_dim_y,1),enlarged_pic(:,1:end-1)];
three_d(:,:,5)=[enlarged_pic(:,2:end),zeros(en_dim_y,1)];
three_d(:,:,6)=[pic,zeros(dim_y,2);zeros(2,en_dim_x)];
three_d(:,:,7)=[zeros(2,en_dim_x);pic,zeros(dim_y,2)];
three_d(:,:,8)=[zeros(dim_y,2),pic;zeros(2,en_dim_x)];
three_d(:,:,9)=[zeros(2,en_dim_x);zeros(dim_y,2),pic];
And then see if the maximum along the 3rd dimension appears in the 1st layer (that is: three_d(:,:,1)):
[max_val, max_i] = max(three_d, [], 3);
result = find(max_i == 1);
Is there any more elegant way to do this? This seems like a bit of a kludge.
bw = pic > imdilate(pic, [1 1 1; 1 0 1; 1 1 1]);
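The one-liner above works because the structuring element leaves out the centre, so imdilate returns, for every pixel, the maximum over its 8 neighbours only; a pixel is then a strict local maximum exactly when it exceeds that value. A tiny sketch on made-up data, assuming the Image Processing Toolbox is installed:

```matlab
pic = [1 2 1
       2 5 2
       1 2 1];                           % toy image with a single peak

nhood = [1 1 1; 1 0 1; 1 1 1];           % 8 neighbours, centre excluded
neighbourMax = imdilate(pic, nhood);     % max over the neighbours of each pixel

bw = pic > neighbourMax;                 % true only at strict local maxima
[peakRow, peakCol] = find(bw);
```

Here only the centre pixel exceeds all of its neighbours, so bw is true at (2,2) alone.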
If you have the Image Processing Toolbox, you could use the IMREGIONALMAX function:
BW = imregionalmax(y);
The variable BW will be a logical matrix the same size as y with ones indicating the local maxima and zeroes otherwise.
NOTE: As you point out, IMREGIONALMAX will find maxima that are greater than or equal to their neighbors. If you want to exclude neighboring maxima with the same value (i.e. find maxima that are single pixels), you could use the BWCONNCOMP function. The following should remove points in BW that have any neighbors, leaving only single pixels:
CC = bwconncomp(BW);
for i = 1:CC.NumObjects,
index = CC.PixelIdxList{i};
if (numel(index) > 1),
BW(index) = false;
end
end
Alternatively, you can use nlfilter and supply your own function to be applied to each neighborhood.
This "find strict max" function would simply check if the center of the neighborhood is strictly greater than all the other elements in that neighborhood, which is always 3x3 for this purpose. Therefore:
I = imread('tire.tif');
BW = nlfilter(I, [3 3], @(x) all(x(5) > x([1:4 6:9])) );
imshow(BW)
In addition to imdilate, which is in the Image Processing Toolbox, you can also use ordfilt2.
ordfilt2 sorts values in local neighborhoods and picks the n-th value. (The MathWorks example demonstrates how to implement a max filter.) You can also implement a 3x3 peak finder with ordfilt2 with the following logic:
Define a 3x3 domain that does not include the center pixel (8 pixels).
>> mask = ones(3); mask(5) = 0 % 3x3 max
mask =
1 1 1
1 0 1
1 1 1
Select the largest (8th) value with ordfilt2.
>> B = ordfilt2(A,8,mask)
B =
3 3 3 3 3 4 4 4
3 5 5 5 4 4 4 4
3 5 3 5 4 4 4 4
3 5 5 5 4 6 6 6
3 3 3 3 4 6 4 6
1 1 1 1 4 6 6 6
Compare this output to the center value of each neighborhood (just A):
>> peaks = A > B
peaks =
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0
or, just use the excellent: extrema2.m