Count number of elements in a specific range in an array [duplicate] - matlab

I have a huge vector. I have to count values falling within certain ranges.
the ranges are like 0-10, 10-20 etc. I have to count the number of values which fall in certain range.
I did something like this :
for i=1:numel(m1)
if (0<m1(i)<=10)==1
k=k+1;
end
end
Also:
if not(isnan(m1))==1
x=(0<m1<=10);
end
But both the times it gives array which contains all 1s. What wrong am I doing?

You can do something like this (also works for non integers)
k = sum(m1>0 & m1<=10)

You can use logical indexing. Observe:
>> x = randi(40, 1, 10) - 20
x =
-2 17 -12 -9 -14 -14 15 4 2 -14
>> x2 = x(0 < x & x < 10)
x2 =
4 2
>> length(x2)
ans =
2
and the same done in one step:
>> length(x(0 < x & x < 10))
ans =
2

to count the values in a specific range you can use ismember,
if m1 is vector use,
k = sum(ismember(m1,0:10));
If m1 is matrix use k = sum(sum(ismember(m1,0:10)));
for example,
m1=randi(20,[5 5])
9 10 6 10 16
8 9 14 20 6
16 13 14 7 11
16 15 4 12 14
4 16 3 5 18
sum(sum(ismember(m1,1:10)))
12

Why not simply do something like this?
% Random data
m1 = 100*rand(1000,1);
%Count elements between 10 and 20
m2 = m1(m1>10 & m1<=20);
length(m2) %number of elements of m1 between 10 and 20
You can then put things in a loop
% Random data
m1 = 100*rand(1000,1);
nb_elements = zeros(10,1);
for k=1:length(nb_elements)
temp = m1(m1>(10*k-10) & m1<=(10*k));
nb_elements(k) = length(temp);
end
Then nb_elements contains your data with nb_elements(1) for the 0-10 range, nb_elements(2) for the 10-20 range, etc...

Matlab does not know how to evaluate the combined logical expression
(0<m1(i)<=10)
Insted you should use:
for i=1:numel(m1)
if (0<m1(i)) && (m1(i)<=10)
k=k+1;
end
end
And to fasten it up probably something like this:
sum((0<m1) .* (m1<=10))

Or you can create logical arrays and then use element-wise multiplication. Don't know how fast this is though and it might use a lot of memory for large arrays.
Something like this
A(find((A>0.2 .* (A<0.8)) ==1))
Generate values
A= rand(5)
A =
0.414906 0.350930 0.057642 0.650775 0.525488
0.573207 0.763477 0.120935 0.041357 0.900946
0.333857 0.241653 0.421551 0.737704 0.162307
0.517501 0.491623 0.016663 0.016396 0.254099
0.158867 0.098630 0.198298 0.223716 0.136054
Find the intersection where the values > 0.8 and < 0.2. This will give you two logical arrays and the values where A>0.2 and A<0.8 will be =1 after element-wise multiplication.
find((A>0.2 .* (A<0.8)) ==1)
Then apply those indices to A
A(find((A>0.2 .* (A<0.8)) ==1))
ans =
0.41491
0.57321
0.33386
0.51750
0.35093
0.76348
0.24165
0.49162
0.42155
0.65077
0.73770
0.22372
0.52549
0.90095
0.25410

Related

MATLAB - Returning a matrix of sums of elements corresponding to the same kind

Overview
An n×m matrix A and an n×1 vector Date are the inputs of the function S = sumdate(A,Date).
The function returns an n×m vector S such that all rows in S correspond to the sum of the rows of A from the same date.
For example, if
A = [1 2 7 3 7 3 4 1 9
6 4 3 0 -1 2 8 7 5]';
Date = [161012 161223 161223 170222 160801 170222 161012 161012 161012]';
Then I would expect the returned matrix S is
S = [15 9 9 6 7 6 15 15 15;
26 7 7 2 -1 2 26 26 26]';
Because the elements Date(2) and Date(3) are the same, we have
S(2,1) and S(3,1) are both equal to the sum of A(2,1) and A(3,1)
S(2,2) and S(3,2) are both equal to the sum of A(2,2) and A(3,2).
Because the elements Date(1), Date(7), Date(8) and Date(9) are the same, we have
S(1,1), S(7,1), S(8,1), S(9,1) equal the sum of A(1,1), A(7,1), A(8,1), A(9,1)
S(1,2), S(7,2), S(8,2), S(9,2) equal the sum of A(1,2), A(7,2), A(8,2), A(9,2)
The same for S([4,6],1) and S([4,6],2)
As the element Date(5) does not repeat, so S(5,1) = A(5,1) = 7 and S(5,2) = A(5,2) = -1.
The code I have written so far
Here is my try on the code for this task.
function S = sumdate(A,Date)
S = A; %Pre-assign S as a matrix in the same size of A.
Dlist = unique(Date); %Sort out a non-repeating list from Date
for J = 1 : length(Dlist)
loc = (Date == Dlist(J)); %Compute a logical indexing vector for locating the J-th element in Dlist
S(loc,:) = repmat(sum(S(loc,:)),sum(loc),1); %Replace the located rows of S by the sum of them
end
end
I tested it on my computer using A and Date with these attributes:
size(A) = [33055 400];
size(Date) = [33055 1];
length(unique(Date)) = 2645;
It took my PC about 1.25 seconds to perform the task.
This task is performed hundreds of thousands of times in my project, therefore my code is too time-consuming. I think the performance will be boosted up if I can eliminate the for-loop above.
I have found some built-in functions which do special types of sums like accumarray or cumsum, but I still do not have any ideas on how to eliminate the for-loop.
I would appreciate your help.
You can do this with accumarray, but you'll need to generate a set of row and column subscripts into A to do it. Here's how:
[~, ~, index] = unique(Date); % Get indices of unique dates
subs = [repmat(index, size(A, 2), 1) ... % repmat to create row subscript
repelem((1:size(A, 2)).', size(A, 1))]; % repelem to create column subscript
S = accumarray(subs, A(:)); % Reshape A into column vector for accumarray
S = S(index, :); % Use index to expand S to original size of A
S =
15 26
9 7
9 7
6 2
7 -1
6 2
15 26
15 26
15 26
Note #1: This will use more memory than your for loop solution (subs will have twice the number of element as A), but may give you a significant speed-up.
Note #2: If you are using a version of MATLAB older than R2015a, you won't have repelem. Instead you can replace that line using kron (or one of the other solutions here):
kron((1:size(A, 2)).', ones(size(A, 1), 1))

difference of each two elements of a column in the matrix

I have a matrix like this:
fd =
x y z
2 5 10
2 6 10
3 5 11
3 9 11
4 3 11
4 9 12
5 4 12
5 7 13
6 1 13
6 5 13
I have two parts of my problem:
1) I want to calculate the difference of each two elements in a column.
So I tried the following code:
for i= 1:10
n=10-i;
for j=1:n
sdiff1 = diff([fd(i,1); fd(i+j,1)],1,1);
sdiff2 = diff([fd(i,2); fd(i+j,2)],1,1);
sdiff3 = diff([fd(i,3); fd(i+j,3)],1,1);
end
end
I want all the differences such as:
x1-x2, x1-x3, x1-x4....x1-x10
x2-x3, x2-x4.....x2-x10
.
.
.
.
.
x9-x10
same for y and z value differences
Then all the values should stored in sdiff1, sdiff2 and sdiff3
2) what I want next is for same z values, I want to keep the original data points. For different z values, I want to merge those points which are close to each other. By close I mean,
if abs(sdiff3)== 0
keep the original data
for abs(sdiff3) > 1
if abs(sdiff1) < 2 & abs(sdiff2) < 2
then I need mean x, mean y and mean z of the points.
So I tried the whole programme as:
for i= 1:10
n=10-i;
for j=1:n
sdiff1 = diff([fd(i,1); fd(i+j,1)],1,1);
sdiff2 = diff([fd(i,2); fd(i+j,2)],1,1);
sdiff3 = diff([fd(i,3); fd(i+j,3)],1,1);
if (abs(sdiff3(:,1)))> 1
continue
mask1 = (abs(sdiff1(:,1)) < 2) & (abs(sdiff2(:,1)) < 2) & (abs(sdiff3:,1)) > 1);
subs1 = cumsum(~mask1);
xmean1 = accumarray(subs1,fd(:,1),[],#mean);
ymean1 = accumarray(subs1,fd(:,2),[],#mean);
zmean1 = accumarray(subs1,fd(:,3),[],#mean);
fd = [xmean1(subs1) ymean1(subs1) zmean1(subs1)];
end
end
end
My final output should be:
2.5 5 10.5
3.5 9 11.5
5 4 12
5 7 13
6 1 13
where, (1,2,3),(4,6),(5,7,10) points are merged to their mean position (according to the threshold difference <2) whereas 8 and 9th point has their original data.
I am stuck in finding the differences for each two elements of a column and storing them. My code is not giving me the desired output.
Can somebody please help?
Thanks in advance.
This can be greatly simplified using vectorised notation. You can do for instance
fd(:,1) - fd(:,2)
to get the difference between columns 1 and 2 (or equivalently diff(fd(:,[1 2]), 1, 2)). You can make this more elegant/harder to read and debug with pdist but if you only have three columns it's probably more trouble than it's worth.
I suspect your first problem is with the third argument to diff. If you use diff(X, 1, 1) it will do the first order diff in direction 1, which is to say between adjacent rows (downwards). diff(X, 1, 2) will do it between adjacent columns (rightwards), which is what you want. Matlab uses the opposite convention to spreadsheets in that it indexes rows first then columns.
Once you have your diffs you can then test the elements:
thesame = find(sdiff3 < 2); % for example
this will yield a vector of the row indices of sdiff3 where the value is less than 2. Then you can use
fd(thesame,:)
to select the elements of fd at those indexes. To remove matching rows you would do the opposite test
notthesame = find(sdiff > 2);
to find the ones to keep, then extract those into a new array
keepers = fd(notthesame,:);
These won't give you the exact solution but it'll get you on the right track. For the syntax of these commands and lots of examples you can run e.g. doc diff in the command window.

Matching by Matlab

A =[1;2;3;4;5]
B= [10 1 ;11 2;19 5]
I want to get
D = [1 10 ;2 11; 3 -9; 4 -9; 5 19]
That is, if something in A doesn't exist in B(:,2), 2nd column of D should be -9.
If something in A exists in B(:,2), I want to put the corresponding row of 1st column of B in 2nd column of D.
I know how to do it with a mix of ismember and for and if. But I need a more elegant method which doesn't use "for" to speed it up.
Unless I'm missing something (or A is not a vector of indices), this is actually much simpler and doesn't require ismember or find at all, just direct indexing:
D = [A zeros(length(A),1)-9];
D(B(:,2),2) = B(:,1)
which for your example matrices gives
D =
1 10
2 11
3 -9
4 -9
5 19
For general A:
A =[1;2;3;4;6]; % Note change in A
B= [10 1 ;11 2;19 5];
[aux, where] = ismember(A,B(:,2));
b_where = find(where>0);
D = [(1:length(A)).' repmat(-9,length(A),1)];
D(b_where,2) = B(where(b_where),1);
This gives
D = [ 1 10
2 11
3 -9
4 -9
5 -9 ]
I am not sure how efficient this solution is, but it avoids using any loops so at least it would be a place to start:
D = [A -9*ones(size(A))]; % initialize your result
[tf, idx] = ismember(A, B(:,2)); % get indices of matching elements
idx(idx==0) = []; % trim the zeros
D(find(tf),2) = B(idx,1); % set the matching entries in D to the appropriate entries in B
disp(E)
You allocate your matrix ahead of time to save time later on (building up matrices dynamically is really slow in MATLAB). The ismember call is returning the true-false vector tf showing which elements of A correspond to something in B, as well as the associated index of what they correspond to in idx. The problem with idx is that it contains a zero any time ft is zero, which is why we have the line idx(idx==0) = []; to clear out these zeros. Finally, find(tf) is used to get the indices associated with a match on the destination (rows of D in this case), and we set the values at those indices equal to the corresponding value in B that we want.
First create a temporary matrix:
tmp=[B; -9*ones(size(A(~ismember(A,B)))), A(~ismember(A,B))]
tmp =
10 1
11 2
19 5
-9 3
-9 4
Then use find to obtain the first row index with a match to elements of A in the second column of tmp (there is always a match by design).
D=[A, arrayfun(#(x) tmp(find(tmp(:,2)==x,1,'first'),1),A)];
D =
1 10
2 11
3 -9
4 -9
5 19
As an alternative for the second step, you could simply sort tmp based on its second column:
[~,I]=sort(tmp(:,2));
D=[tmp(I,2), tmp(I,1)]
D =
1 10
2 11
3 -9
4 -9
5 19

How to compare array to a number for if statement?

H0 is an array ([1:10]), and H is a single number (5).
How to compare every element in H0 with the single number H?
For example in
if H0>H
do something
else
do another thing
end
MATLAB always does the other thing.
if requires the following statement to evaluate to a scalar true/false. If the statement is an array, the behaviour is equivalent to wrapping it in all(..).
If your comparison results in a logical array, such as
H0 = 1:10;
H = 5;
test = H0>H;
you have two options to pass test through the if-statement:
(1) You can aggregate the output of test, for example you want the if-clause to be executed when any or all of the elements in test are true, e.g.
if any(test)
do something
end
(2) You iterate through the elements of test, and react accordingly
for ii = 1:length(test)
if test(ii)
do something
end
end
Note that it may be possible to vectorize this operation by using the logical vector test as index.
edit
If, as indicated in a comment, you want P(i)=H0(i)^3 if H0(i)<H, and otherwise P(i)=H0(i)^2, you simply write
P = H0 .^ (H0<H + 2)
#Jonas's nice answer at his last line motivated me to come up with a version using logical indexing.
Instead of
for i=1:N
if H0(i)>H
H0(i)=H0(i)^2;
else
H0(i)=H0(i)^3;
end
end
you can do this
P = zeros(size(H0)); % preallocate output
test = H0>H;
P(test) = H0(test).^2; % element-wise operations
% on the elements for which the test is true
P(~test) = H0(~test).^3; % element-wise operations
% on the elements for which the test is false
Note that this is a general solution.
Anyway take a look at this: using ismemeber() function. Frankly not sure how do you expect to compare. Either greater than, smaller , equal or within as a member. So my answer might not be yet satisfying to you. But just giving you an idea anyway.
H0 = [0 2 4 6 8 10 12 14 16 18 20];
H = [10];
ismember(H,H0)
IF (ans = 1) then
// true
else
//false
end
Update Answer
This is super bruteforce method - just use it explain. You are better off with any other answers given here than what I present. Ideally what you need is to rip off greater/lower values into two different vectors with ^3 processing - I assume... :)
H0 = [0 2 4 6 8 10 12 14 16 18 20];
H = [10];
H0(:)
ans =
0
2
4
6
8
10
12
14
16
18
20
Function find returns indices of all values in H0 greater than 10 values in a linear index.
X = find(H0>H)
X =
7
8
9
10
11
Function find returns indices of all values in H0 lower than 10 in a linear index.
Y = find(H0<H)
Y =
1
2
3
4
5
6
If you want you can access each element of H0 to check greater/lower values or you can use the above matrices with indices to rip the values off H0 into two different arrays with the arithmetic operations.
G = zeros(size(X)); // matrix with the size = number of values greater than H
J = zeros(size(Y)); // matrix with the size = number of values lower than H
for i = 1:numel(X)
G(i) = H0(X(i)).^3
end
G(:)
ans =
1728
2744
4096
5832
8000
for i = 1:numel(Y)
J(i) = H0(Y(i)).^2
end
J(:)
ans =
0
4
16
36
64
100

vectorized array creation from a list of start/end indices

I have a two-column matrix M that contains the start/end indices of a bunch of intervals:
startInd EndInd
1 3
6 10
12 12
15 16
How can I generate a vector of all the interval indices:
v = [1 2 3 6 7 8 9 10 12 15 16];
I'm doing the above using loops, but I'm wondering if there's a more elegant vectorized solution?
v = [];
for i=1:size(M,1)
v = [v M(i,1):M(i,2)];
end
Here's a vectorized solution I like to use for this particular problem, using the function cumsum:
v = zeros(1, max(endInd)+1); % An array of zeroes
v(startInd) = 1; % Place 1 at the starts of the intervals
v(endInd+1) = v(endInd+1)-1; % Add -1 one index after the ends of the intervals
v = find(cumsum(v)); % Perform a cumulative sum and find the nonzero entries
cell2mat(arrayfun(#colon,M(:,1)',M(:,2)','UniformOutput',false))
I don't have IMFILL, but on my machine this method is faster than the other suggestions and I think would beat the IMFILL method due to the use of find.
It can be made even faster if M is set up transposed (and we adjust the third and fourth arguments of arrayfun).
There's probably an even better solution I'm somehow not seeing, but here's a version using IMFILL
startInd = [1,6,12,15];
endInd = [3,10,12,16];
%# create a logical vector with starts and ends set to true to prepare for imfill
tf = false(endInd(end),1);
tf([startInd,endInd]) = true;
%# fill at startInd+1 wherever startInd is not equal endInd
tf = imfill(tf,startInd(startInd~=endInd)'+1); %' SO formatting
%# use find to get the indices
v = find(tf)' %' SO formatting
v =
1 2 3 6 7 8 9 10 12 15 16
Very weird solution IMHO creating temporary strings and using EVAL. Can be also one-liner.
tmp = cellstr(strcat(num2str(M(:,1)),{':'},num2str(M(:,2)),{' '}));
v = eval(['[' cell2mat(tmp') ']']);
I know it will probably not work on large matrix. Just for fun.