I have an array of zeros and ones and I need to know if the data is spread out across the columns or concentrated in clumps.
For example:
If I have array x and it has these values:
Column 1 values: 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1
Column 2 values: 1 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 1
Counting the ones shows that both columns contain the same number of them, but the ones are more evenly spread out in column 2 than in column 1.
I am trying to make a score that gives me a high value if the spreading is good and low value if the spreading is bad... any ideas??
Sample of Data:
1 0 0 0 5 0 -2 -3 0 0 1
1 0 0 0 0 0 0 0 0 0 1
2 0 0 0 0 0 0 3 -3 1 0
1 2 3 0 5 0 2 13 4 5 1
1 0 0 0 0 0 -4 34 0 0 1
I think what you're trying to measure is the variance of the distribution of the number of 0s between the 1s, i.e:
f = @(x)std(diff(find(x)))
So for your data:
a = [1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1]
b = [1 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 1]
f(a)
= 8.0498
f(b)
= 2.0736
But I still think you're essentially trying to measure the disorder of the system, which is what I imagine entropy measures, but I don't know how.
Note that this gives a low value if the "spreading" is good and a high value if it is bad (i.e. the opposite of your request).
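If a score that grows with better spreading is wanted, one option (my own assumption, not something from the question) is to pass the standard deviation through a decreasing function such as 1/(1+s). A minimal sketch, where score is a hypothetical name:

```matlab
a = [1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1];
b = [1 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 1];

f = @(x) std(diff(find(x)));    % spread of the gaps between ones (low = even)
score = @(x) 1 ./ (1 + f(x));   % hypothetical rescaling: 1 means perfectly even gaps

sa = score(a);                  % clumped column, lower score
sb = score(b);                  % spread-out column, higher score
```

A vector with fewer than two ones has no gaps at all, so it would need separate handling.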
Also if you want it per column then it becomes a little more complicated:
f = @(x)arrayfun(@(y)std(diff(find(x(:,y)))), 1:size(x,2))
data = [a', b'];
f(data)
WARNING: This method ignores leading and trailing 0s. I don't know if that's a problem or not, but basically f([0; 0; 0; 1; 1; 1; 0; 0; 0]) returns 0 whereas f([1; 0; 0; 1; 0; 1; 0; 0; 0]) returns a positive value, indicating (incorrectly) that the first case is more evenly distributed. One possible fix might be to prepend and append a row of ones to the matrix...
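A rough sketch of that padding idea for a single column, assuming a sentinel 1 is added at each end so the leading and trailing zeros count as gaps (f_padded is a hypothetical name):

```matlab
% Sentinel ones at both ends turn leading/trailing zeros into measurable gaps
f_padded = @(x) std(diff(find([1; x(:); 1])));

clumped = [0; 0; 0; 1; 1; 1; 0; 0; 0];
spread  = [1; 0; 0; 1; 0; 1; 0; 0; 0];

s_clumped = f_padded(clumped);   % gaps are [4 1 1 4], about 1.73
s_spread  = f_padded(spread);    % gaps are [1 3 2 4], about 1.29
```

With the padding, the clumped vector now scores worse (higher) than the spread one. Note that the sentinels themselves add a gap of 1 whenever the data already starts or ends with a 1, so this is only an approximation.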
I think you would need an interval to measure the "spreadness" locally; otherwise sample 1 (named Column 1 in the question) would also appear spread out because of the long gap between its 2nd and 3rd ones.
So, following that theory and assuming input_array to be the input array, you can try this approach -
intv = 10; %// Interval
diff_loc = diff(find(input_array))
spread_factor = sum(diff_loc(diff_loc<=intv)) %// desired output/score
For sample 1, spread_factor gives 4 and for sample 2 it is 23.
Another theory you could employ is to assume an interval such that the distance between consecutive ones must be greater than or equal to that interval, and count how many gaps satisfy it. This theory leads to code like this -
intv = 3; %// Interval
diff_loc = diff(find(input_array))
spread_factor = sum(diff_loc>=intv)
With this new approach - For sample 1, spread_factor is 1 and for sample 2 it is 5.
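To check both variants against the two samples from the question, here is a small self-contained script (variable names such as gaps_a and cnt_b are only for illustration):

```matlab
a = [1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1];   % sample 1
b = [1 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 1];   % sample 2

gaps_a = diff(find(a));               % [1 19 1 1 1]
gaps_b = diff(find(b));               % [3 8 5 3 4]

% First approach: sum of the gaps that are no longer than the interval
sum_a = sum(gaps_a(gaps_a <= 10));    % 4
sum_b = sum(gaps_b(gaps_b <= 10));    % 23

% Second approach: number of gaps that are at least the interval
cnt_a = sum(gaps_a >= 3);             % 1
cnt_b = sum(gaps_b >= 3);             % 5
```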
I have a logical vector in which I would like to iterate over every n-elements. If in any given window at least 50% are 1's, then I change every element to 1, else I keep as is and move to the next window. For example.
n = 4;
input = [0 0 0 1 0 1 1 0 0 0 0 1 0 1 0 1 0 0 0 1];
output = func(input,4);
output = [0 0 0 1 1 1 1 1 0 0 0 1 1 1 1 1 0 0 0 1];
This function is trivial to implement with a loop, but is it possible to write a vectorized implementation using logical indexing? I am trying to build up the intuition for applying this technique.
Here's a one-liner (that works for your input):
func = @(input,n) input | kron(sum(reshape(input ,n,[]))>=n/2,ones(1,n));
Of course, there are cases this doesn't handle, e.g. what if the length of the input is not a multiple of n?
I'm not sure if that's what you meant by vectorization, and I didn't benchmark it against a for loop...
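One way to cope with an input whose length is not a multiple of n is to zero-pad it up to the next multiple, process it, and trim the result. This is only a sketch under an assumption of mine: the last, partial window is still judged against n/2 rather than against half of its actual length.

```matlab
n  = 4;
in = [0 0 0 1 0 1 1 0 0 0 0 1 0 1 0 1 0 0 0 1 1 1];   % 22 elements, not a multiple of 4

len    = numel(in);
padLen = mod(-len, n);                 % zeros needed to complete the last window
padded = [in, zeros(1, padLen)];

blocks = reshape(padded, n, []);       % one window per column
hit    = sum(blocks, 1) >= n/2;        % windows with at least 50% ones
blocks(:, hit) = 1;                    % fill those windows

out = reshape(blocks, 1, []);
out = out(1:len);                      % drop the padding again
```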
Here is one way of doing it. Once understood you can compact it into fewer lines, but I'll detail the intermediate steps for the sake of clarity.
%% The inputs
n = 4;
input = [0 0 0 1 0 1 1 0 0 0 0 1 0 1 0 1 0 0 0 1];
1) Split your input into blocks of size n (note that your final function will have to check that the number of elements in input is an integer multiple of n)
c = reshape(input,n,[]) ;
Gives you a matrix with your blocks organized in columns:
c =
0 0 0 0 0
0 1 0 1 0
0 1 0 0 0
1 0 1 1 1
2) Perform your test condition on each block. For this we'll take advantage of the fact that Matlab's sum function works column-wise:
>> cr = sum(c) >= (n/2)
cr =
0 1 0 1 0
Now you have a logical vector cr containing as many elements as initial blocks. Each value is the result of the test condition over the block. The 0 blocks will be left unchanged, the 1 blocks will be forced to value 1.
3) Force 1 columns/block to value 1:
>> c(:,cr) = 1
c =
0 1 0 1 0
0 1 0 1 0
0 1 0 1 0
1 1 1 1 1
4) Now all that is left is to unfold your matrix. You can do it several ways:
res = c(:) ; %% will give you a column vector
OR
>> res = reshape(c,1,[]) %% will give you a row vector
res =
0 0 0 1 1 1 1 1 0 0 0 1 1 1 1 1 0 0 0 1
I have a column vector x made up of 4 elements. How can I generate all the possible combinations of the values that x can take such that x*x' is less than or equal to a certain value?
Note that the values of x are positive integers.
To be more clear:
The inputs are the number of elements of the column vector x and the threshold; the output is the set of possible combinations of the values of x that satisfy x*x' <= threshold.
Example: the threshold is 4 and x is a 4*1 column vector; the output is x = [0 0 0 0], [0 0 0 1], [1 1 1 1], ...
See if this works for you -
threshold = 4;
A = 0:threshold
A1 = allcomb(A,A,A,A)
%// Or use: A1 = combvec(A,A,A,A).' from Neural Network Toolbox
combs = A1(sum(A1.^2,2)<=threshold,:)
Please note that the code listed above uses allcomb from the MATLAB File Exchange.
Output -
combs =
0 0 0 0
0 0 0 1
0 0 0 2
0 0 1 0
0 0 1 1
0 0 2 0
0 1 0 0
0 1 0 1
0 1 1 0
0 1 1 1
0 2 0 0
1 0 0 0
1 0 0 1
1 0 1 0
1 0 1 1
1 1 0 0
1 1 0 1
1 1 1 0
1 1 1 1
2 0 0 0
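If neither allcomb nor the Neural Network Toolbox is available, the same enumeration can be sketched with the built-in ndgrid. This is an alternative I am swapping in, not the code that produced the output above; it also limits the candidate values to 0:floor(sqrt(threshold)), since any larger entry would exceed the threshold on its own.

```matlab
threshold = 4;
vals = 0:floor(sqrt(threshold));          % larger entries can never satisfy the bound

[a, b, c, d] = ndgrid(vals);              % all 4-tuples of candidate values
A1 = [a(:), b(:), c(:), d(:)];

combs = A1(sum(A1.^2, 2) <= threshold, :);
combs = sortrows(combs);                  % same ordering as the listing above
```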
How can I generate a Matrix with Boolean elements, but the sum of each row is equal to a certain constant number.
Should every row have the same number of ones? If so:
k = 5;
m = 10;
n = 10;
[~, I] = sort(rand(m,n), 2)
M = I <= k
If you don't want the same number of 1s in each row, but rather have a vector that specifies per row how many 1s you want then you need to use bsxfun as well:
K = (1:10)'; %//'
m = 10;
n = 10;
[~, I] = sort(rand(m,n), 2)
M = bsxfun(@ge, K,I)
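The trick works because each row of I is a random permutation of 1:n, so exactly K(r) of its entries are less than or equal to K(r). A quick sanity check along those lines (rowsOk is just an illustrative name):

```matlab
m = 10;
n = 10;
K = (1:m)';                        % wanted number of ones per row

[~, I] = sort(rand(m, n), 2);      % each row of I is a random permutation of 1:n
M = bsxfun(@ge, K, I);             % row r is true where I(r,:) <= K(r)

rowsOk = isequal(sum(M, 2), K);    % true when row r holds exactly K(r) ones
```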
Let's say you want to have 20 columns (n=20) and your vector a contains the number of ones you want in each row:
n=20;
a= [5 6 1 9 4];
X= zeros(numel(a),n);
for k=1:numel(a)
rand_order=randperm(n);
row_entries=[ones(1,a(k)),zeros(1,n-a(k))];
row_entries=row_entries(rand_order);
X(k,:)=row_entries;
end
X=logical(X);
For each row I generate a random permutation rand_order, build a row holding the wanted number of ones padded with zeros, reorder it according to rand_order, and store it; at the end the matrix is converted to logical. Because rand_order is recomputed on every pass through the loop, the ones land in different positions in each row, for example:
1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 0 0
0 0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
1 0 0 1 0 1 1 0 1 0 0 1 1 0 0 0 1 1 0 0
1 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0
Given a matrix where 1 is the current subset
test =
0 0 0 0 0 0
0 0 0 0 0 0
0 0 1 1 0 0
0 0 1 1 0 0
0 0 0 0 0 0
0 0 0 0 0 0
Is there a function or quick method to change the subset to the boundary of the current subset?
Eg. Get this subset from 'test' above
test =
0 0 0 0 0 0
0 1 1 1 1 0
0 1 0 0 1 0
0 1 0 0 1 0
0 1 1 1 1 0
0 0 0 0 0 0
In the end I just want to get the minimum of the cells surrounding a subset of a matrix. Sure, I could loop through and get the minimum of the boundary (cell by cell), but there must be a way to do it with the method I've shown above.
Note the subset WILL be connected, but may not be rectangular. This may be the big catch.
This is a possible subset.... (Would pad this with a NaN border)
test =
0 0 0 0 0 0
0 0 0 0 0 0
0 0 1 1 0 0
0 0 1 1 0 0
0 0 1 1 1 1
0 0 1 1 1 1
Ideas?
The basic steps I'd use are:
Perform a dilation on the shape to get a new area which is the shape plus its boundary
Subtract the original shape from the dilated shape to leave just the boundary
Use the boundary to index your data matrix, then take the minimum.
Dilation
What I want to do here is pass a 3x3 window over each cell and take the maximum value in that window:
[m, n] = size(A); % assuming A is your original shape matrix
APadded = zeros(m + 2, n + 2);
APadded(2:end-1, 2:end-1) = A; % pad A with zeroes on each side
ADilated = zeros(m + 2, n + 2); % this will hold the dilated shape.
for i = 1:m
for j = 1:n
mask = false(size(APadded)); % must be logical to be used as an index
mask(i:i+2, j:j+2) = true; % this places a 3x3 square of true values around (i, j)
ADilated(i + 1, j + 1) = max(APadded(mask));
end
end
Shape subtraction
This is basically a logical AND and a logical NOT to remove the intersection:
ABoundary = ADilated & (~APadded);
At this stage you may want to remove the border we added to do the dilation, since we don't need it any more.
ABoundary = ABoundary(2:end-1, 2:end-1);
Find the minimum data point along the boundary
We can use our logical boundary to index the original data into a vector, then just take the minimum of that vector.
dataMinimum = min(data(ABoundary));
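For reference, the three steps can also be sketched end to end without the explicit loop, using the built-in conv2 for the dilation. That is a substitution on my part, and the data matrix below is made up purely for illustration.

```matlab
A = logical([0 0 0 0 0 0
             0 0 0 0 0 0
             0 0 1 1 0 0
             0 0 1 1 0 0
             0 0 0 0 0 0
             0 0 0 0 0 0]);
data = reshape(1:36, 6, 6);                          % made-up data, data(i,j) = i + 6*(j-1)

dilated  = conv2(double(A), ones(3), 'same') > 0;    % shape plus its 8-connected border
boundary = dilated & ~A;                             % border only
dataMinimum = min(data(boundary));                   % 8, found at row 2, column 2
```

The 'same' option keeps the result the size of A, so no manual padding is needed; cells beyond the matrix edge are simply treated as zeros.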
You should look at this as a morphology problem, not set theory. This can be solved pretty easily with imdilate() (requires the image package). You basically only need to subtract the image from its dilation with a 3x3 matrix of ones.
octave> test = logical ([0 0 0 0 0 0
0 0 0 0 0 0
0 0 1 1 0 0
0 0 1 1 0 0
0 0 1 1 1 1
0 0 1 1 1 1]);
octave> imdilate (test, true (3)) - test
ans =
0 0 0 0 0 0
0 1 1 1 1 0
0 1 0 0 1 0
0 1 0 0 1 1
0 1 0 0 0 0
0 1 0 0 0 0
It does not, however, pad with NaN. If you really want that, you could pad your original matrix with false, do the operation, and then check if there are any true values in the border.
Note that you don't have to use logical() in which case you'll have to use ones() instead of true(). But that takes more memory and has worse performance.
EDIT: since you are trying to do it without using any matlab toolbox, take a look at the source of imdilate() in Octave. For the case of logical matrices (which is your case) it's a simple usage of filter2() which belongs to matlab core. That said, the following one line should work fine and be much faster
octave> (filter2 (true (3), test) > 0) - test
ans =
0 0 0 0 0 0
0 1 1 1 1 0
0 1 0 0 1 0
0 1 0 0 1 1
0 1 0 0 0 0
0 1 0 0 0 0
One possible solution is to take the subset and add it to the original matrix, but ensure that each time you add it, you offset its position by +1 row, -1 row and +1 column, -1 column. The result will then be expanded by one row and column all around the original subset. You then use the original matrix to mask the original subset to zero.
Like this:
test_new = test + ...
[[test(2:end,2:end);zeros(1,size(test,1)-1)],zeros(size(test,1),1)] + ... %move subset up-left
[[zeros(1,size(test,1)-1);test(1:end-1,2:end)],zeros(size(test,1),1)] + ... %move down-left
[zeros(size(test,1),1),[test(2:end,1:end-1);zeros(1,size(test,1)-1)]] + ... %move subset up-right
[zeros(size(test,1),1),[zeros(1,size(test,1)-1);test(1:end-1,1:end-1)]]; %move subset down-right
test_masked = test_new.*~test; %mask with original matrix
result = test_masked;
result(result>1)=1; % ensure that there is only 1's, not 2, 3, etc.
The result for this on your test matrix is:
result =
0 0 0 0 0 0
0 1 1 1 1 0
0 1 0 0 1 0
0 1 0 0 1 1
0 1 0 0 0 0
0 1 0 0 0 0
Edited - it now grabs the corners as well, by moving the subset up and to the left, up and to the right, down then left and down then right.
I expect this would be a very quick way to achieve this - it doesn't have any loops, nor functions - just matrix operations.
I am having some problems with the find function in MATLAB. I have a matrix consisting of zeros and ones (representing the geometry of a structural element), where material is present when the matrix element = 1, and where no material is present when the matrix element = 0. The matrix may have the general form shown below (it will update as the geometry is changed, but that isn't too important).
Geometry = [0 0 0 0 0 0 0 0 0 0;
0 0 1 0 1 0 1 1 0 0;
0 0 1 0 0 0 0 1 0 0;
0 0 1 0 0 0 0 0 0 0;
0 0 0 0 0 0 0 1 0 0;
0 0 0 0 0 0 0 0 0 0;
0 0 1 0 0 0 0 1 0 0;
0 0 1 0 0 0 0 1 0 0;
0 0 1 1 1 1 0 1 0 0;
0 0 0 0 0 0 0 0 0 0;]
I'm trying to find the rows and columns that are not continuously connected (i.e. where the row and column elements are not all equal to 1 between the outer extents of the row or column) and then update them so they are all connected. I.e. the matrix above becomes:
Geometry = [0 0 0 0 0 0 0 0 0 0;
0 0 1 1 1 1 1 1 0 0;
0 0 1 0 0 0 0 1 0 0;
0 0 1 0 0 0 0 1 0 0;
0 0 1 0 0 0 0 1 0 0;
0 0 1 0 0 0 0 1 0 0;
0 0 1 0 0 0 0 1 0 0;
0 0 1 0 0 0 0 1 0 0;
0 0 1 1 1 1 1 1 0 0;
0 0 0 0 0 0 0 0 0 0;]
The problem I am having is that I want to be able to find the indices of the first and last element equal to 1 in each row (and column), which will then be used to update the geometry matrix.
Ideally, I want to represent these in vectors, so going across the columns, find the row number of the first element equal to 1 and store this in a vector called rowfirst.
I.e.:
rowfirst = zeros(1,numcols)
for i = 1:numcols % Going across the columns
rowfirst(i) = find(Geometry(i,1) == 1, 1,'first')
% Store values in vector called rowfirst
end
and then repeat this for the columns and to find the last elements in each row.
For some reason, I can't get the values to store properly in the vector, does anyone have an idea of where I'm going wrong?
Thanks in advance. Please let me know if that isn't clear, as I may not have explained the problem very well.
0) bwmorph(Geometry,'close') does it all in one line. If the holes may be bigger, try bwmorph(Geometry,'close',Inf).
Regarding your attempt:
1) It should be Geometry(i,:) instead of Geometry(i,1).
2) Your real problem here is empty matrices. Actually, what do you want rowfirst(i) to be if there are no 1s in the i'th row?
Ok, I can spot two mistakes:
You should use an array as the first argument of find. So, if you want to find the row number of the first element of each column, then you should use find(Geometry(:, i), 1, 'first').
find returns an empty array if the column contains only zeros. You should handle this case and decide what number you want to put into rowfirst (e.g. you can put -1, to indicate that the corresponding column contains no non-zero elements).
Following the above, you can try this:
for i = 1:numcols
tmp = find(Geometry(:, i), 1, 'first');
if(tmp)
rowfirst(i) = tmp;
else
rowfirst(i) = -1;
end;
end;
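A loop-free variant is also possible, relying on the fact that max returns the index of the first maximum along a dimension. The sketch below uses the Geometry matrix from the question and keeps the -1 convention for empty columns; rowlast and hasOne are names I made up.

```matlab
Geometry = [0 0 0 0 0 0 0 0 0 0;
            0 0 1 0 1 0 1 1 0 0;
            0 0 1 0 0 0 0 1 0 0;
            0 0 1 0 0 0 0 0 0 0;
            0 0 0 0 0 0 0 1 0 0;
            0 0 0 0 0 0 0 0 0 0;
            0 0 1 0 0 0 0 1 0 0;
            0 0 1 0 0 0 0 1 0 0;
            0 0 1 1 1 1 0 1 0 0;
            0 0 0 0 0 0 0 0 0 0];

% First 1 in each column: max returns the index of the first maximum
[hasOne, rowfirst] = max(Geometry, [], 1);
rowfirst(~hasOne) = -1;                        % columns without any 1

% Last 1 in each column: same trick on the vertically flipped matrix
[~, idx] = max(flipud(Geometry), [], 1);
rowlast = size(Geometry, 1) + 1 - idx;
rowlast(~hasOne) = -1;
```

Applying the same code to Geometry' gives the first and last column index for each row.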
I'm pretty sure there's a more efficient way of doing this, but if you replace your call to find with this, it should work ok:
find(Geometry(i,:), 1,'first')
(otherwise you're just looking at the first cell of the ith row. And the == 1 is useless, since find already returns only non-zero elements, and your matrix is binary)
Use the accumarray() function to find the min and max col (row) number.
Imagine finding the last (first) row in each column that contains a NaN.
a = [1 nan nan nan ;
2 2 3 4;
3 nan 3 3;
4 nan 4 4]
This code gets the row indices for the last NaN in each column.
[row,col] = find(isnan(a))
accumarray(col,row,[],@max)
This code gets the row indices for the first NaN in each column.
[row,col] = find(isnan(a))
accumarray(col,row,[],@min)
Swap the row and col variables to scan row-wise instead of column-wise.
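One detail worth spelling out: with [] as the size argument, the output is only as long as the largest column index that actually contains a NaN, and columns without any NaN come back as 0. Passing the size explicitly keeps one entry per column. A small sketch on the matrix above (lastNaN and firstNaN are illustrative names):

```matlab
a = [1 NaN NaN NaN;
     2 2   3   4;
     3 NaN 3   3;
     4 NaN 4   4];

[row, col] = find(isnan(a));
outSize = [size(a, 2) 1];                         % one entry per column of a

lastNaN  = accumarray(col, row, outSize, @max);   % [0; 4; 1; 1]
firstNaN = accumarray(col, row, outSize, @min);   % [0; 1; 1; 1]
```

The leading 0 in both results marks column 1, which contains no NaN.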
This answer was inspired by "Finding value and index of min value in a matrix, grouped by column values".