I have a matrix A which holds integers in a bounded range (0..255) and I need to build a table mapping a value (0..255) to all the coordinates in the matrix which hold this value.
What is the best way to achieve this? I thought about using containers.Map for the task, but Map doesn't support multiple values per key. I could use lists, but that seems inefficient, as I would have to create a new list on each iteration.
A vectorized solution, which gives the same output as the solution from Mikhail, is to sort all the pixel values in your image using the SORT function, convert the linear indices returned from SORT into subscripted indices using the function IND2SUB, and collect them together into a single cell array using the functions ACCUMARRAY and MAT2CELL:
A = randi([0 255],[5 5],'uint8'); %# A sample matrix
[values,indices] = sort(double(A(:))); %# Sort all the pixel values
[y,x] = ind2sub(size(A),indices); %# Convert linear index to subscript
counts = accumarray(values+1,1,[256 1]); %# Count number of each value
map = mat2cell([y x],counts); %# Create a 256-by-1 cell array
Now, for a given integer value iValue you can get the N-by-2 matrix containing the y (first column) and x (second column) coordinates for the N pixels in the image with that value by doing the following:
key = double(iValue)+1; %# Need to use double to avoid integer saturation
points = map{key}; %# An N-by-2 coordinate matrix
In addition, just in case you're interested, you could also make map a structure array with fields x and y using the function STRUCT:
map = struct('x',mat2cell(x,counts),'y',mat2cell(y,counts));
And you can then access the x and y coordinates for pixels with a value iValue as follows:
key = double(iValue)+1;
x = map(key).x;
y = map(key).y;
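As a quick sanity check of the cell-array version, the sketch below rebuilds the table for a small fixed matrix (the matrix is made up for illustration) and looks the stored coordinates up again to confirm that they really hold the requested value.

```matlab
A = uint8([0 3 3; 255 0 3]);               % small hypothetical test matrix
[values,indices] = sort(double(A(:)));     % sort values, keep linear indices
[y,x] = ind2sub(size(A),indices);          % linear index -> (row, column)
counts = accumarray(values+1,1,[256 1]);   % occurrences of each value 0..255
map = mat2cell([y x],counts);              % 256-by-1 cell array of N-by-2 matrices

points = map{3+1};                         % coordinates of all pixels equal to 3
vals = A(sub2ind(size(A),points(:,1),points(:,2)));  % read the values back
```

Values that never occur end up as empty 0-by-2 entries, so map{key} can be used without first checking whether the value is present.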
What about using a cell array? You can index it with integers. For example:
map = {[1,1;13,56], [], [4,5]};
In this example, the value 0 appears in the matrix at (1,1) and (13,56), the value 1 does not appear at all, and the value 2 appears at (4,5).
Your cell array would have 256 elements (mine has 3), and to access an entry you would simply add 1 to the value.
You could also store indices linearly so the code to fill the table would be:
for ii = 0:255
map{ii+1} = find( A(:)==ii );
end
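If the table stores linear indices like this, ind2sub turns them back into row and column subscripts when needed. A minimal sketch on a made-up 3-by-2 matrix, with the cell array preallocated up front:

```matlab
A = uint8([0 2; 2 0; 1 2]);        % hypothetical 3-by-2 example matrix
map = cell(256,1);                 % preallocate one cell per possible value
for ii = 0:255
    map{ii+1} = find(A(:) == ii);  % linear indices of all elements equal to ii
end
[rows,cols] = ind2sub(size(A), map{2+1});  % subscripts of the elements equal to 2
```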
Well, I wrote the following and it seems to work in reasonable time. I think the thing that does the trick is preallocating the cell arrays based on the histogram for each value:
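[H, W] = size(A);
counts = accumarray(double(A(:)) + 1, 1, [256 1]); % number of occurrences of each value 0..255
AGT = arrayfun(@(avg) {0 cell(1, counts(avg))}, 1:256, 'UniformOutput', false);
for y = 1:H
    for x = 1:W
        idx = double(A(y, x)) + 1; % convert first so that a uint8 value of 255 does not saturate
        count = AGT{idx}{1};
        AGT{idx}{2}{count + 1} = [y x];
        AGT{idx}{1} = count + 1;
    end
end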
Accessing the table is a bit annoying, though. Because of the +1 offset, the coordinates for value 200 are stored at index 201:
AGT{201}{2}{:}
returns all coordinates with value 200.
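One way to make the access less awkward (a suggestion, not part of the code above) is to stack the stored pairs into a single N-by-2 matrix with vertcat. The entry below is a hand-written stand-in with the same {count, {coordinates}} layout as one element of AGT:

```matlab
entry = {3, {[1 1], [4 2], [7 5]}};   % hypothetical table entry: {count, {coordinates}}
coords = vertcat(entry{2}{:});        % 3-by-2 matrix, one [y x] pair per row
```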
Here's my example data.
Ycoordinate = 10;
Xcoordinate = 12;
Zdata = 4;
my3Darray = zeros(Ycoordinate, Xcoordinate, Zdata);
for i = 1:Ycoordinate
for j = 1:Xcoordinate
my3Darray(i,j,:) = uint8(rand(Zdata,1)*64);
end
end
my3Darray = uint8(my3Darray);
As you can see, there are 120 locations (Y:10 * X:12) and each location has 4 uint8 values.
And here are my questions.
I want to find out whether two or more locations have the same Zdata vector (4 uint8 values). How can I do this?
My actual data will be Ycoordinate=7000, Xcoordinate=7000, Zdata = 500.
So it will be an array of around 24GB (7000*7000*500 = 24,500,000,000 bytes).
Is it possible to find identical Zdata vectors in an array this large?
Additionally, my data is actually boolean so it is just 0 or 1 but I don't know how to allocate only "1 bit(not 1 byte)" to my data.
The code below will tell you how many locations have duplicate z-data vectors. The idea is to reshape your data into a 2D matrix where each row holds the z-data vector of one location in the original array. The reshaped matrix will have Xcoordinate*Ycoordinate rows and Zdata columns. Then you can use the unique function to get the unique rows of this reshaped matrix, which essentially removes any duplicate z-data vectors.
You can also replace the nested loop in your code with the following line to directly generate a 3D random matrix:
my3Darray = uint8(rand(Ycoordinate, Xcoordinate, Zdata)*64);
If you want to store boolean data, use logical arrays in MATLAB. Note that a logical array still uses one byte per element, not one bit.
Edit: Follow beaker's comment above to reduce the memory footprint.
Here's the code:
clear
clc
Ycoordinate = 4000;
Xcoordinate = 4000;
Zdata = 63;
my3Darray = uint8(rand(Ycoordinate,Xcoordinate,Zdata)*64);
%reshape data so that each z-column becomes a row
A = reshape(my3Darray,Ycoordinate*Xcoordinate,Zdata);
[A_unique, I, J] = unique(A,'rows'); %get the unique rows of A
duplicate_count = size(A,1) - size(A_unique,1)
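The count alone does not say which locations are duplicated. One possible extension, sketched here on a tiny hand-made matrix rather than the real data, uses the third output of unique together with accumarray to flag every row that has at least one twin:

```matlab
A = uint8([1 2; 3 4; 1 2; 5 6; 3 4; 1 2]);   % hypothetical reshaped data, one z-vector per row
[A_unique, ~, J] = unique(A,'rows');          % J(k) is the row of A_unique that matches A(k,:)
occurrences = accumarray(J(:), 1);            % how often each unique row appears
isDuplicated = occurrences(J) > 1;            % true for every row that has a twin
duplicatedRows = find(isDuplicated);          % row indices into A
```

Because A was produced by reshape, ind2sub([Ycoordinate Xcoordinate], duplicatedRows) would map these row indices back to the original (Y, X) locations.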
The sort() function sorts the elements row/column wise but how to sort the elements absolutely? The result should be another matrix with smallest element in (1,1) , second smallest in (1,2) and so on.
Take some random input
input = rand(5,10);
If you want the output to be a row vector, simply use
sortedRow = sort(input(:)).';
If you want the result to be the same shape as the input, then use
sortedOriginalShape = reshape(sort(input(:)), size(input,2), size(input,1)).';
Note that when maintaining the shape, we must use the reversed size dimensions and then transpose. This is because otherwise the result is column-wise i.e. smallest element in (1,1), next in (2,1) etc, which is the opposite of what you requested.
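A small worked example (the 2-by-3 matrix is made up) shows the effect of the reversed dimensions and the transpose:

```matlab
input = [9 2 7; 4 8 1];                 % hypothetical 2-by-3 example
sortedOriginalShape = reshape(sort(input(:)), size(input,2), size(input,1)).';
% sortedOriginalShape is [1 2 4; 7 8 9]: sorted along rows, same shape as input
```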
You can use the colon operator (:) to turn all elements of an n-by-m matrix into a vector of n*m elements and sort this vector. Then you can use direct assignment or the reshape function to store the elements in matrix form again.
All you need to know is that MATLAB uses column-major ordering to vectorize and iterate over elements:
A = rand(3, 5);
A(:) = sort(A(:));
This preserves column-major ordering. If, as you said, you prefer row-major ordering:
A = rand(3, 5);
A = reshape(sort(A(:)), fliplr(size(A))).';
Note the fliplr to store columnwise with reversed dimension and then the .' operator to transpose again the result.
EDIT
Even though MATLAB uses column-major ordering for storing elements in memory, below are two generic routines for working in row-major order whatever the number of dimensions of your array (i.e. not limited to 2D):
function [vector] = VectorizeWithRowMajorOrdering(array)
%[
axisCount = length(size(array)); % Squeezed size of original array
permutation = fliplr(1:(axisCount + 2)); % +2 ==> Trick to vectorize data in correct order
vector = permute(array, permutation);
vector = vector(:);
%]
end
function [array] = ReshapeFromRowMajorOrdering(vector, siz)
%[
siz = [siz( : ).' 1]; % Fix size if only one dim
array = NaN(siz); % Init
axisCount = length(size(array)); % Squeezed size!
permutation = fliplr(1:(axisCount + 2)); % +2 ==> Trick to vectorize data in correct order
array = reshape(vector, [1 1 fliplr(size(array))]);
array = ipermute(array, permutation);
%]
end
This can be useful when working with data coming from C/C++ (these languages use row-major-ordering). In your case this can be used this way:
A = rand(3, 5);
A = ReshapeFromRowMajorOrdering(sort(A(:)), size(A));
I have a source matrix A(m,n) for which I used "find" and now I have a list of desired indices [y,x].
I also have a 3D matrix with dimensions B(m,n,3).
I want to extract all the elements in B using the result from find.
So if find yields 4 pairs of results, I would like to have a 4x3 matrix with the contents of the Z dimension of B for the resulting indices.
I tried many things but keep failing:
A = rand(480,640);
[y,x] = find(A < 0.5);
o = B(y,x,:);
Requested 39024x39024x3 (34.0GB) array exceeds maximum array size preference.
I am clearly doing something wrong since B has dimensions (640,640,3).
With the way you are trying to index, MATLAB indexes B with every combination of the elements in y and x, resulting in a massive matrix. I've implemented a for loop to do what I think you are asking.
I would also note that, in order to index across B, its first two dimensions need to be the same size as A; otherwise you will not be able to index B past the maximum row or column index in A.
A = rand(480,640);
B = rand(480,640,3);
[y,x] = find(A < 0.5); % y holds the row indices, x the column indices
o = zeros(numel(y),1,3); % y and x are the same length, so either works
for i = 1:numel(y)
    o(i,1,:) = B(y(i),x(i),:);
end
o = reshape(o,numel(y),3);
You can reshape B to a 2D matrix of size [m*n , 3] then use logical indexing to extract elements:
C = reshape(B, [], 3);
o = C(A<0.5, :);
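If only the [y,x] list from find is available rather than the logical mask, the same reshape trick should work with sub2ind. A minimal sketch on made-up 2-by-2 data:

```matlab
A = [0.1 0.9; 0.8 0.2];                              % hypothetical 2-by-2 source matrix
B = cat(3, [1 2; 3 4], [5 6; 7 8], [9 10; 11 12]);   % matching 2-by-2-by-3 array
[y,x] = find(A < 0.5);                               % row and column subscripts
C = reshape(B, [], 3);                               % one row per (y,x) location
o = C(sub2ind(size(A), y, x), :);                    % numel(y)-by-3 result
```

Each row of o then holds the three Z values of B at one of the found positions.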
I've written a function that generates a sparse matrix of size nxd
and puts in each column 2 non-zero values.
function [M] = generateSparse(n,d)
M = sparse(d,n);
sz = size(M);
nnzs = 2;
val = ceil(rand(nnzs,n));
inds = zeros(nnzs,n);
for i=1:n
ind = randperm(d,nnzs);
inds(:,i) = ind;
end
points = (1:n);
nnzInds = zeros(nnzs,n);
for i=1:nnzs
nnzInd = sub2ind(sz, inds(i,:), points);
nnzInds(i,:) = nnzInd;
end
M(nnzInds) = val;
end
However, I'd like to be able to give the function another parameter, num-nnz, which makes it randomly choose num-nnz cells and put 1 in them.
I can't use sprand, as it requires a density, and I need the number of non-zero entries to be independent of the matrix size; a density inherently depends on the matrix size.
I am a bit confused about how to pick the indices and fill them. I did it with a loop, which is extremely costly, and would appreciate help.
EDIT:
Everything has to be sparse. A big enough matrix will crash in memory if I don't do it in a sparse way.
You seem close!
You could pick num_nnz random (unique) integers between 1 and the number of elements in the matrix, then assign the value 1 to the elements at those indices.
To pick the random unique integers, use randperm. To get the number of elements in the matrix use numel.
M = sparse(d, n); % create dxn sparse matrix
num_nnz = 10; % number of non-zero elements
idx = randperm(numel(M), num_nnz); % get unique random indices
M(idx) = 1; % Assign 1 to those indices
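An alternative sketch, assuming it is acceptable to build the matrix in one step instead of assigning into an existing one: convert the linear indices to subscripts with ind2sub and pass them to the sparse constructor. The dimensions below are arbitrary placeholders.

```matlab
d = 1000; n = 2000;                       % hypothetical matrix dimensions
num_nnz = 10;                             % number of non-zero elements
idx = randperm(d*n, num_nnz);             % unique random linear indices
[rows, cols] = ind2sub([d n], idx);       % convert to row/column subscripts
M = sparse(rows, cols, 1, d, n);          % build the d-by-n sparse matrix in one call
```

Since the indices are unique, no two entries land on the same position, so M has exactly num_nnz non-zeros.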
I have sum of 3 cell arrays
A=72x1
B=72x720
C=72x90
resultant=A+B+C
size of resultant=72x64800
Now, when I find the minimum value with its row and column indices, I can locate the row easily, but how can I map the column back to the variables?
For example:
After doing the calculations for A, B and C, I added them all and got a resultant of the form <72x(720x90)>, that is, a matrix of integers of size <72x64800>. Then I found the minimum value of resultant, with its row and column index, using the code below.
[minimumValue,ind]=min(resultant(:));
[row,col]=find(resultant==minimumValue);
Then row was 14 and col was 6840.
Now I can trace row 14 of A, B and C easily, but how can I know which combination of columns of A, B and C resultant column 6480 belongs to?
Instead of using find, use the ind output from the min function. This is the linear index of minimumValue. To convert it to row and column subscripts, you can use ind2sub:
[r,c] = ind2sub(size(resultant),ind);
It is not quite clear what you mean by resultant = A+B+C, since you clearly don't sum them if you get a bigger array (72x64800). On the other hand, this is not a simple concatenation ([A B C]) either, since that would result in a 72x811 array.
However, assuming this is a concatenation you can do the following:
% get the 2nd dimension size of all matrices:
cols = cellfun(@(x) size(x,2),{A,B,C});
% create a vector with the matrix names repeated once per column:
mats = repelem(['A' 'B' 'C'],cols);
% get the relevant matrix for the c column:
mats(c)
so mats(c) will be the matrix with the minimum value.
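On small stand-in matrices (the sizes are chosen only for illustration) the lookup vector looks like this:

```matlab
A = zeros(2,1); B = zeros(2,3); C = zeros(2,2);   % hypothetical small matrices
cols = cellfun(@(x) size(x,2), {A,B,C});          % [1 3 2]
mats = repelem('ABC', cols);                      % 'ABBBCC'
source = mats(5);                                 % column 5 of [A B C] comes from C
```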
EDIT:
From your comment I understand that your code looks something like this:
% arbitrary data:
A = rand(72,1);
B = rand(72,720);
C = rand(72,90);
% initializing:
K = size(B,2);
N = size(C,2);
counter = 1;
resultant = zeros(72,K*N);
% summing:
for k = 1:K
for n = 1:N
resultant(:,counter) = A + B(:,k) + C(:,n);
counter = counter+1;
end
end
% finding the minimum value:
[minimumValue,ind] = min(resultant(:))
and from the start of the answer you know that you can do this:
[r,c] = ind2sub(size(resultant),ind)
to get the row and column of minimumValue in resultant. So, in the same way you can do:
[Ccol,Bcol] = ind2sub([N,K],c)
where Bcol and Ccol are the columns in B and C, respectively, so that:
minimumValue == A(r) + B(r,Bcol) + C(r,Ccol)
To see how this works, imagine that the loop above fills a matrix M with the value of counter, and M has a size of N-by-K. Because we fill M with a linear index, it will be filled in a column-major way, so the row will correspond to the n iterator, and the column will correspond to the k iterator. Now c corresponds to the counter where we got the minimum value, and the row and column of counter in M tell us the columns in B and C, so we can use ind2sub again to get the subscripts of the position of counter. Of course, we don't really need to create M, because the values within it are just the linear indices themselves.
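The mapping can be checked numerically. The sketch below repeats the loop on deliberately small random stand-ins (4 rows, 3 and 2 columns instead of 72, 720 and 90) and confirms that the recovered columns reproduce the minimum; a tolerance is used rather than exact equality as a precaution.

```matlab
A = rand(4,1); B = rand(4,3); C = rand(4,2);   % small hypothetical stand-ins
K = size(B,2); N = size(C,2);
resultant = zeros(4, K*N);
counter = 1;
for k = 1:K
    for n = 1:N
        resultant(:,counter) = A + B(:,k) + C(:,n);
        counter = counter + 1;
    end
end
[minimumValue, ind] = min(resultant(:));
[r, c] = ind2sub(size(resultant), ind);        % row and column in resultant
[Ccol, Bcol] = ind2sub([N K], c);              % columns in C and B
isConsistent = abs(minimumValue - (A(r) + B(r,Bcol) + C(r,Ccol))) < 1e-12;
```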