Sort a vector and count the identical occurrences - matlab

What is a Matlab-efficient way (no loop) to do the following operation: transform an input vector input into an output vector output such as output(i) is the number of integers in input that are less or equal than input(i).
For example:
input = [5 3 3 2 4 4 4]
would give:
output = [7 3 3 1 6 6 6]

First of all, don't use input for a variable name, it's a reserved keyword. I'll use X here instead.
An alternative way to obtain your desired result would be:
[U, V] = meshgrid(1:numel(X), 1:numel(X));
Y = sum(X(U) >= X(V))
and here's a one-liner:
Y = sum(bsxfun(#ge, X, X'))
EDIT:
If X has multiple rows and you want to apply this operation on each row, this is a little bit trickier. Here's what you can do:
[U, V] = meshgrid(1:numel(X), 1:size(X, 2));
V = V + size(X, 2) * idivide(U - 1, size(X, 2));
Xt = X';
Y = reshape(sum(Xt(U) >= Xt(V))', size(Xt))'
Example:
X =
5 3 3 2 4 4 4
3 9 7 7 1 2 2
Y =
7 3 3 1 6 6 6
4 7 6 6 1 3 3

I have found a possible answer:
output = arrayfun(#(x) sum(x>=input),input)
but it doesn't take advantage of vectorization.

Related

How to align vectors with asynchronous time stamp in matlab?

I would like to align and count vectors with different time stamps to count the corresponding bins.
Let's assume I have 3 matrix from [N,edges] = histcounts in the following structure. The first row represents the edges, so the bins. The second row represents the values. I would like to sum all values with the same bin.
A = [0 1 2 3 4 5;
5 5 6 7 8 5]
B = [1 2 3 4 5 6;
2 5 7 8 5 4]
C = [2 3 4 5 6 7 8;
1 2 6 7 4 3 2]
Now I want to sum all the same bins. My final result should be:
result = [0 1 2 3 4 5 6 7 8;
5 7 12 16 ...]
I could loop over all numbers, but I would like to have it fast.
You can use accumarray:
H = [A B C].'; %//' Concatenate the histograms and make them column vectors
V = [unique(H(:,1)) accumarray(H(:,1)+1, H(:,2))].'; %//' Find unique values and accumulate
V =
0 1 2 3 4 5 6 7 8
5 7 12 16 22 17 8 3 2
Note: The H(:,1)+1 is to force the bin values to be positive, otherwise MATLAB will complain. We still use the actual bins in the output V. To avoid this, as #Daniel says in the comments, use the third output of unique (See: https://stackoverflow.com/a/27783568/2732801):
H = [A B C].'; %//' stupid syntax highlighting :/
[U, ~, IU] = unique(H(:,1));
V = [U accumarray(IU, H(:,2))].';
If you're only doing it with 3 variables as you've shown then there likely aren't going to be any performance hits with looping it.
But if you are really averse to the looping idea, then you can do it using arrayfun.
rng = 0:8;
output = arrayfun(#(x)sum([A(2,A(1,:) == x), B(2,B(1,:) == x), C(2,C(1,:) == x)]), rng);
output = cat(1, rng, output);
output =
0 1 2 3 4 5 6 7 8
5 7 12 16 22 17 8 3 2
This can be beneficial for particularly large A, B, and C variables as there is no copying of data.

maximum value column with the condition of another column [duplicate]

I need to find the maximum among values with same labels, in matlab, and I am trying to avoid using for loops.
Specifically, I have an array L of labels and an array V of values, same size. I need to produce an array S which contains, for each value of L, the maximum value of V. An example will explain better:
L = [1,1,1,2,2,2,3,3,3,4,4,4,1,2,3,4]
V = [5,4,3,2,1,2,3,4,5,6,7,8,9,8,7,6]
Then, the values of the output array S will be:
s(1) = 9 (the values V(i) such that L(i) == 1 are: 5,4,3,9 -> max = 9)
s(2) = 8 (the values V(i) such that L(i) == 2 are: 2,1,2,8 -> max = 8)
s(3) = 7 (the values V(i) such that L(i) == 3 are: 3,4,5,7 -> max = 7)
s(4) = 8 (the values V(i) such that L(i) == 4 are: 6,7,8,6 -> max = 8)
this can be trivially implemented by traversing the arrays L and V with a for loop, but in Matlab for loops are slow, so I was looking for a faster solution. Any idea?
This is a standard job for accumarray.
Three cases need to be considered, with increasing generality:
Integer labels.
Integer labels, specify fill value.
Remove gaps; or non-integer labels. General case.
Integer labels
You can just use
S = accumarray(L(:), V(:), [], #max).';
In your example, this gives
>> L = [1 1 1 2 2 2 3 3 3 4 4 4 1 2 3 7];
>> V = [5 4 3 2 1 2 3 4 5 6 7 8 9 8 7 6];
>> S = accumarray(L(:), V(:), [], #max).'
S =
9 8 7 8
Integer labels, specify fill value
If there are gaps between integers in L, the above will give a 0 result for the non-existing labels. If you want to change that fill value (for example to NaN), use a fifth input argument in acccumarray:
S = accumarray(L(:), V(:), [], #max, NaN).';
Example:
>> L = [1 1 1 2 2 2 3 3 3 4 4 4 1 2 3 7]; %// last element changed
>> V = [5 4 3 2 1 2 3 4 5 6 7 8 9 8 7 6]; %// same as in your example
>> S = accumarray(L(:), V(:), [], #max, NaN).'
S =
9 8 7 8 NaN NaN 6
Remove gaps; or non-integer labels. General case
When the gaps between integer labels are large, using a fill value may be inefficient. In that case you may want to get only the meaningful values in S, without fill values, i.e.skip non-existing labels. Also, it may be the case that L doesn't necessarily contain integers.
These two issues are solved by applying unique to the labels before using accumarray:
[~, ~, Li] = unique(L); %// transform L into consecutive integers
S = accumarray(Li(:), V(:), [], #max, NaN).';
Example:
>> L = [1.5 1.5 1.5 2 2 2 3 3 3 4 4 4 1 2 3 7.8]; %// note: non-integer values
>> V = [5 4 3 2 1 2 3 4 5 6 7 8 9 8 7 6 ]; %// same as in your example
>> [~, ~, Li] = unique(L); %// transform L into consecutive integers
>> S = accumarray(Li(:), V(:), [], #max, NaN).'
S =
9 5 8 7 8 6
helper=[L.', V.'];
helper=sortrows(helper,-2);
[~,idx,~]=unique(helper(:,1));
S=helper(idx,2);
What I do is: I join the two arrays as columns. Then I sort them regarding second column with biggest element first. Then I get the idx of the unique Values in L before I return the corresponding Values from V.
The solution from Luis Mendo is faster. But as far as I see his solution doesn't work if there is a zero,negative value or a noninteger inside L:
Luis solution: Elapsed time is 0.722189 seconds.
My solution: Elapsed time is 2.575943 seconds.
I used:
L= ceil(rand(1,500)*10);
V= ceil(rand(1,500)*250);
and ran the code 10000 times.

Matlab: avoid for loops to find the maximum among values with same labels

I need to find the maximum among values with same labels, in matlab, and I am trying to avoid using for loops.
Specifically, I have an array L of labels and an array V of values, same size. I need to produce an array S which contains, for each value of L, the maximum value of V. An example will explain better:
L = [1,1,1,2,2,2,3,3,3,4,4,4,1,2,3,4]
V = [5,4,3,2,1,2,3,4,5,6,7,8,9,8,7,6]
Then, the values of the output array S will be:
s(1) = 9 (the values V(i) such that L(i) == 1 are: 5,4,3,9 -> max = 9)
s(2) = 8 (the values V(i) such that L(i) == 2 are: 2,1,2,8 -> max = 8)
s(3) = 7 (the values V(i) such that L(i) == 3 are: 3,4,5,7 -> max = 7)
s(4) = 8 (the values V(i) such that L(i) == 4 are: 6,7,8,6 -> max = 8)
this can be trivially implemented by traversing the arrays L and V with a for loop, but in Matlab for loops are slow, so I was looking for a faster solution. Any idea?
This is a standard job for accumarray.
Three cases need to be considered, with increasing generality:
Integer labels.
Integer labels, specify fill value.
Remove gaps; or non-integer labels. General case.
Integer labels
You can just use
S = accumarray(L(:), V(:), [], #max).';
In your example, this gives
>> L = [1 1 1 2 2 2 3 3 3 4 4 4 1 2 3 7];
>> V = [5 4 3 2 1 2 3 4 5 6 7 8 9 8 7 6];
>> S = accumarray(L(:), V(:), [], #max).'
S =
9 8 7 8
Integer labels, specify fill value
If there are gaps between integers in L, the above will give a 0 result for the non-existing labels. If you want to change that fill value (for example to NaN), use a fifth input argument in acccumarray:
S = accumarray(L(:), V(:), [], #max, NaN).';
Example:
>> L = [1 1 1 2 2 2 3 3 3 4 4 4 1 2 3 7]; %// last element changed
>> V = [5 4 3 2 1 2 3 4 5 6 7 8 9 8 7 6]; %// same as in your example
>> S = accumarray(L(:), V(:), [], #max, NaN).'
S =
9 8 7 8 NaN NaN 6
Remove gaps; or non-integer labels. General case
When the gaps between integer labels are large, using a fill value may be inefficient. In that case you may want to get only the meaningful values in S, without fill values, i.e.skip non-existing labels. Also, it may be the case that L doesn't necessarily contain integers.
These two issues are solved by applying unique to the labels before using accumarray:
[~, ~, Li] = unique(L); %// transform L into consecutive integers
S = accumarray(Li(:), V(:), [], #max, NaN).';
Example:
>> L = [1.5 1.5 1.5 2 2 2 3 3 3 4 4 4 1 2 3 7.8]; %// note: non-integer values
>> V = [5 4 3 2 1 2 3 4 5 6 7 8 9 8 7 6 ]; %// same as in your example
>> [~, ~, Li] = unique(L); %// transform L into consecutive integers
>> S = accumarray(Li(:), V(:), [], #max, NaN).'
S =
9 5 8 7 8 6
helper=[L.', V.'];
helper=sortrows(helper,-2);
[~,idx,~]=unique(helper(:,1));
S=helper(idx,2);
What I do is: I join the two arrays as columns. Then I sort them regarding second column with biggest element first. Then I get the idx of the unique Values in L before I return the corresponding Values from V.
The solution from Luis Mendo is faster. But as far as I see his solution doesn't work if there is a zero,negative value or a noninteger inside L:
Luis solution: Elapsed time is 0.722189 seconds.
My solution: Elapsed time is 2.575943 seconds.
I used:
L= ceil(rand(1,500)*10);
V= ceil(rand(1,500)*250);
and ran the code 10000 times.

average 3rd column when 1st and 2nd column have same numbers

just lets make it simple, assume that I have a 10x3 matrix in matlab. The numbers in the first two columns in each row represent the x and y (position) and the number in 3rd columns show the corresponding value. For instance, [1 4 12] shows that the value of function in x=1 and y=4 is equal to 12. I also have same x, and y in different rows, and I want to average the values with same x,y. and replace all of them with averaged one.
For example :
A = [1 4 12
1 4 14
1 4 10
1 5 5
1 5 7];
I want to have
B = [1 4 12
1 5 6]
I really appreciate your help
Thanks
Ali
Like this?
A = [1 4 12;1 4 14;1 4 10; 1 5 5;1 5 7];
[x,y] = consolidator(A(:,1:2),A(:,3),#mean);
B = [x,y]
B =
1 4 12
1 5 6
Consolidator is on the File Exchange.
Using built-in functions:
sparsemean = accumarray(A(:,1:2), A(:,3).', [], #mean, 0, true);
[i,j,v] = find(sparsemean);
B = [i.' j.' v.'];
A = [1 4 12;1 4 14;1 4 10; 1 5 5;1 5 7]; %your example data
B = unique(A(:, 1:2), 'rows'); %find the unique xy pairs
C = nan(length(B), 1);
% calculate means
for ii = 1:length(B)
C(ii) = mean(A(A(:, 1) == B(ii, 1) & A(:, 2) == B(ii, 2), 3));
end
C =
12
6
The step inside the for loop uses logical indexing to find the mean of rows that match the current xy pair in the loop.
Use unique to get the unique rows and use the returned indexing array to find the ones that should be averaged and ask accumarray to do the averaging part:
[C,~,J]=unique(A(:,1:2), 'rows');
B=[C, accumarray(J,A(:,3),[],#mean)];
For your example
>> [C,~,J]=unique(A(:,1:2), 'rows')
C =
1 4
1 5
J =
1
1
1
2
2
C contains the unique rows and J shows which rows in the original matrix correspond to the rows in C then
>> accumarray(J,A(:,3),[],#mean)
ans =
12
6
returns the desired averages and
>> B=[C, accumarray(J,A(:,3),[],#mean)]
B =
1 4 12
1 5 6
is the answer.

What does A=[x; y'] in Matlab mean?

I'm learning Matlab and I see a line that I don't understand:
A=[x; y']
What does it mean? ' usually means the transponate but I don't know what ; means in the vector. Can you help me?
The [ ] indicates create a matrix.
The ; indicates that the first vector is on the first line, and that the second one is on the second line.
The ' indicates the transponate.
Exemple :
>> x = [1,2,3,4]
x =
1 2 3 4
>> y = [5;6;7;8]
y =
5
6
7
8
>> y'
ans =
5 6 7 8
>> A = [x;y']
A =
1 2 3 4
5 6 7 8
[x y] means horizontal cat of the vectors, while [x;y] means vertical.
For example (Horizontal cat):
x = [1
2
3];
y = [4
5
6];
[x y] = [1 4
2 5
3 6];
(Vertical cat):
x = [1 2 3];
y = [4 5 6];
[x; y] =
[1 2 3;
4 5 6];
Just to be clear, in MATLAB ' is the complex conjugate transpose. If you want the non-conjugating transpose, you should use .'.
It indicates the end of a row when creating a matrix from other matrices.
For example
X = [1 2];
Y = [3,4]';
A = [X; Y']
gives a matrix
A = [ 1 2 ]
[ 3 4 ]
This is called vertical concatenation which basically means forming a matrix in row by row fashion from other matrices (like the example above). And yes you are right about ' indicating the transpose operator. As another example you could use it to create a transposed vector as follows
Y = [1 2 3 4 5];
X = [1; 2; 3; 4; 5];
Y = Y';
Comparing the above you will see that X is now equal to Y. Hope this helps.
Let set the size of x m*n (m rows and n columns) and the size of y n*p.
Then A is the matrix formed by the vertical concatenation of x and the transpose of y (operator '), and its size is (m+p)*n. The horizontal concatenation is done with comma instead of semi-column.
This notation is a nice shorthand for function vertcat.
See http://www.mathworks.fr/help/techdoc/math/f1-84864.html for more information
The semicolon ' ; ' is used to start a new row.
e.g. x=[1 2 3; 4 5 6; 7 8 9] means
x= 1 2 3
4 5 6
7 8 9
So if u take x=[1 2 3; 4 5 6] and y=[7 8 9]'
then z=[x; y'] means
z= 1 2 3
4 5 6
7 8 9